The short answer is "you can't, easily".

The splitter breaks text into discrete words and prunes it along
the way: it removes "stop" words, words under two characters long,
numbers, and symbols. It returns the remaining words as a list,
without removing duplicates. The current splitter implementation
(as of Zope 2.3.0) is written in C, and it is most effective when
used against English text.

The splitter may also remove semantically desirable symbols which
are part of words, or it may remove words completely. For example,
the splitter will split the word "t-shirt" into "t" and "shirt". It will
then drop "t" (because it's less than two characters), leaving
"shirt". Another example: the splitter will turn the word "C++"
into "C" (after removing symbols). It will then drop "C",
removing the word entirely. If you wish to change this
behavior, you need to delve into the code and replace the splitter.
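
To make the pruning described above concrete, here is a rough Python sketch of it. This is not the Zope splitter (that one is written in C); the function name, the tiny stop-word list, the lowercasing, and the choice to treat digits and symbols as word separators are all assumptions made for illustration only.

```python
import re

# Hypothetical stop-word list for illustration; not the list Zope ships.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "into"}


def split_words(text, min_length=2):
    """Approximate the pruning described above (not Zope's C splitter)."""
    # Assumption: anything that is not a letter (digits, symbols,
    # whitespace) acts as a separator, so numbers and symbols vanish.
    candidates = re.split(r"[^a-z]+", text.lower())
    # Drop empty strings, too-short words, and stop words.
    # Duplicates are deliberately kept.
    return [
        word
        for word in candidates
        if len(word) >= min_length and word not in STOP_WORDS
    ]


print(split_words("t-shirt"))                  # "t" is dropped as too short
print(split_words("C++"))                      # nothing survives
print(split_words("the shirt and the shirt"))  # duplicates are kept
```

Under these assumptions, a string made up only of digits and hyphens would leave nothing behind to index.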

----- Original Message ----- 
From: "Michael R. Bernstein" <[EMAIL PROTECTED]>
To: "Erik Enge" <[EMAIL PROTECTED]>
Sent: Friday, February 23, 2001 9:15 PM
Subject: Re: [Zope-dev] Minor typos/changes to ZCatalog.

> Erik Enge wrote:
> > 
> > On Fri, 23 Feb 2001, Michael R. Bernstein wrote:
> > 
> > > On the subject of numbers, I was wondering how to index
> > > alphanumeric values like ISBN numbers.
> > 
> > Why can't you use FieldIndexes?
> 
> Because I'm actually using a SkinScript to concatenate
> several attributes (Author, Title, id) into one, so that I
> can index them all with a single text index. In that way, I
> reduce the indexing overhead, and it's easy to search
> multiple attributes for a match from a single search box.
> So how do I get the text index to index the alphanumeric
> ISBN values as well?
> Thanks,
> Michael Bernstein.
> _______________________________________________
> Zope-Dev maillist  -  [EMAIL PROTECTED]
> **  No cross posts or HTML encoding!  **
> (Related lists - 
> )
