Re: A question

Hari 22 Apr 2005 11:55:36 -0000

Murray / Xoan,

Murray, you have a valid point :) Can Lucene perform document-based search?

Harry


Xoan wrote:

Hi Murray,

Thanks for the info.

You're right, the only reason to store plain text is to permit searching.

I think your approach is valid for me. I don't know anything about
Lucene, so I have a lot to read and investigate.
Soon I'll come back with more questions ... :)

Regards,

Xoan

2005/4/22, Murray Altheim <[EMAIL PROTECTED]>:

Xoan,

All searches happen this way, but that process of indexing goes
on *before* the user does the search, which is why it seems fast.
I've integrated Lucene into my Xindice collections, with a
listener that notes when a document is created, changed or deleted.
There's an initial cost of indexing the whole collection (if the
database is populated all at once), but otherwise the cost is
incremental and almost unnoticeable.
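
As a rough illustration of the listener idea described above, here is a minimal sketch in Python. It is not Lucene or Xindice code: the `IncrementalIndex` class and its `on_created`, `on_changed` and `on_deleted` methods are hypothetical names invented for this example, standing in for whatever callbacks the database would fire. It only shows why the per-document cost stays small once the index exists.

```python
import re
from collections import defaultdict


class IncrementalIndex:
    """Toy inverted index (term -> set of document ids).

    Hypothetical stand-in for a real indexer; not the Lucene API.
    """

    def __init__(self):
        self.postings = defaultdict(set)   # term -> ids of docs containing it
        self.terms_by_doc = {}             # doc id -> terms it contributed

    def _tokenize(self, text):
        # Lowercase and split on anything that is not a letter or digit.
        return set(re.findall(r"[a-z0-9]+", text.lower()))

    def on_created(self, doc_id, text):
        # Index only the new document; the rest of the index is untouched.
        terms = self._tokenize(text)
        self.terms_by_doc[doc_id] = terms
        for term in terms:
            self.postings[term].add(doc_id)

    def on_deleted(self, doc_id):
        # Remove only the postings this document contributed.
        for term in self.terms_by_doc.pop(doc_id, set()):
            self.postings[term].discard(doc_id)
            if not self.postings[term]:
                del self.postings[term]

    def on_changed(self, doc_id, text):
        # A change is handled as delete followed by re-add.
        self.on_deleted(doc_id)
        self.on_created(doc_id, text)

    def search(self, term):
        # The lookup is cheap because the work was done at indexing time.
        return sorted(self.postings.get(term.lower(), set()))
```

Populating an empty index from an existing collection means calling `on_created` once per document, which is the one-off cost mentioned above; after that, each create, change or delete touches only the terms of that one document.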

Because Lucene uses a model whereby you feed documents to various
indexers depending on their type (so a text document goes to a
different one than an HTML document, which needs a text stripper
to remove the markup), you don't need a separate text document
stored for each HTML document, if the only reason you're doing
that is having the text available for searching. You only create
the text temporarily for the indexer to function, then dump it.
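
To make the "create the text temporarily, then dump it" point concrete, here is a small sketch using only the Python standard library. Again, this is not how Lucene itself is called: `strip_html`, `EXTRACTORS` and `index_document` are hypothetical names for this example, and the plain dict stands in for a real index. The extracted text lives only inside `index_document` and is never stored next to the HTML.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects character data and ignores the tags around it."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        # Note: this naive version also keeps <script>/<style> contents.
        self.parts.append(data)


def strip_html(markup):
    """Return the text of an HTML string with the markup removed."""
    parser = _TextExtractor()
    parser.feed(markup)
    parser.close()
    # Collapse runs of whitespace left behind by the removed tags.
    return " ".join(" ".join(parser.parts).split())


# One extractor per document type (assumed type names, for illustration).
EXTRACTORS = {
    "text/plain": lambda body: body,
    "text/html": strip_html,
}


def index_document(index, doc_id, content_type, body):
    """Add a document's terms to `index` (a dict: term -> set of doc ids)."""
    text = EXTRACTORS[content_type](body)   # temporary plain text
    for term in set(text.lower().split()):
        index.setdefault(term, set()).add(doc_id)
    # `text` is discarded on return; only the index entries remain.


index = {}
index_document(index, "page1", "text/html", "<p>Hello <b>Lucene</b></p>")
index_document(index, "note1", "text/plain", "hello again")
```

After these two calls, looking up `index["hello"]` yields both document ids, while no plain-text copy of `page1` has been kept anywhere.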

Murray
