I find the most effective way to work with a search tool like Lucene is to write XML representations of the content you need to index out to the file system. This is typically done with cron jobs or every time the content "bean" is modified. Lucene is then used to index and search specific nodes in the XML documents. Lucene is document agnostic: it can index any type of document, as long as you supply a parser that extracts its text (parsers are easy to implement). Lucene supports boolean searches, proximity searches, stemming, stop word processing and faceted searches. It is widely supported and actively developed. Two thumbs up!
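To make the "specific nodes" idea concrete, here is a minimal sketch of the parsing step using only the JDK's DOM parser. The node names `title` and `body`, the class name `XmlFieldExtractor` and the sample document are all made up for illustration. The Lucene calls themselves are left out, so treat the resulting name/value pairs as what you would hand to Lucene as document fields.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XmlFieldExtractor {

    // Hypothetical node names; replace with the nodes your own XML uses.
    public static final String[] INDEXED_NODES = {"title", "body"};

    // Parses one XML document and returns the text of each node we want to index.
    // Nodes that are not listed in INDEXED_NODES are ignored.
    public static Map<String, String> extractFields(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));

            Map<String, String> fields = new LinkedHashMap<>();
            for (String name : INDEXED_NODES) {
                NodeList nodes = doc.getElementsByTagName(name);
                if (nodes.getLength() > 0) {
                    // Only the first matching node is used in this sketch.
                    fields.put(name, nodes.item(0).getTextContent().trim());
                }
            }
            return fields;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<article><id>42</id>"
                   + "<title>Intro to Lucene</title>"
                   + "<body>Lucene indexes text.</body></article>";
        // Each entry would become one field of the document given to Lucene.
        System.out.println(extractFields(xml));
    }
}
```

Keeping the extraction separate from the indexing means the same XML files can be re-parsed and re-indexed whenever the cron job runs, without touching the original content beans.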
Built on top of Lucene, Solr has all of its features plus a web-based interface, index replication, and the ability to load balance using existing web technologies.