Terry Rosenbaum wrote:
I assume your indexer only indexes "whole words" (perhaps with stemming) and therefore would not help in evaluating the "contains" or "ends-with" functions, the way a substring index would?
e.g.
If the text were "some words", contains("some words", "wor") would return true. But if you indexed only whole words, you would not find "wor" in the index, so the index would not help in evaluating the contains function.
You can provide Lucene with different Analyzers[1], which determine how the text stream is handled, so you could do what you want. I haven't yet decided on the best way to let the creator of the index specify the Analyzer chain, but I will make it possible to control which Analyzers are used.
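
To make the difference concrete, here is a rough sketch in plain Python. It does not use Lucene at all, and every name in it (word_tokens, ngram_tokens, build_index, contains_candidates) is made up for illustration. It assumes one analysis chain that emits whole words and another that emits overlapping three-character substrings, and shows that only the second can narrow down candidates for contains().

```python
from collections import defaultdict


def word_tokens(text):
    """Whole-word analysis (hypothetical): lowercased, whitespace-separated terms."""
    return text.lower().split()


def ngram_tokens(text, n=3):
    """Substring analysis (hypothetical): every run of n characters in the text."""
    text = text.lower()
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def build_index(docs, tokenizer):
    """Build a simple inverted index: term -> set of document ids."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenizer(text):
            index[term].add(doc_id)
    return index


def contains_candidates(gram_index, docs, needle, n=3):
    """Return ids of documents whose text contains needle (case-insensitive).

    The n-gram index is only used to shortlist documents; each candidate is
    then checked against the real text, because sharing all n-grams does not
    guarantee that they appear contiguously. A needle shorter than n cannot
    use the index, so every document is scanned in that case.
    """
    grams = ngram_tokens(needle, n)
    if grams:
        shortlist = set.intersection(*(gram_index.get(g, set()) for g in grams))
    else:
        shortlist = set(docs)
    return {d for d in shortlist if needle.lower() in docs[d].lower()}


docs = {1: "some words", 2: "other text"}
word_index = build_index(docs, word_tokens)
gram_index = build_index(docs, ngram_tokens)

# The whole-word index has the term "words" but no entry for "wor".
print("words" in word_index, "wor" in word_index)

# The n-gram index can shortlist documents for contains(., "wor").
print(contains_candidates(gram_index, docs, "wor"))
```

The trade-off is index size: the n-gram index stores many more terms per document than the whole-word one, which is why the choice of analysis is best left to whoever creates the index.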
[1] http://tinyurl.com/37gmc
-- Andy Armstrong, Tagish