> > Simon Coles writes:
> >  > We have binary files stored in Zope, for example Word documents (but
> >  > could be any of a variety of document types).
> >  >
> >  > We would like to be able to index and search the contents of these
> >  > files using ZCatalog. So if a Word file contains the word "Fred",
> >  > then any search for "Fred" would include that file in the list of
> >  > documents returned.
> > Someone else already told you that you must create a parameterless
> > method (it need not necessarily be named "PrincipiaSearchSource")
> > that returns the file's content.
> >
> > You may not need to keep the rendered version around but
> > may be able to extract the plain text on demand.
> > I think there is a "word.dll" that provides access to
> > MS Word from applications. Alternatively, you could
> > control Word via COM.

There is a Perl (I know, I know...) script to convert Word DOC
files into HTML. That should work well enough to make the stuff
searchable (I would use doc2html.pl | lynx -d to get a pure ASCII
version, though).
It is probably fast enough to just render on the fly (i.e., upon
indexing).
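
A rough sketch of how the two suggestions could fit together, assuming
a modern Python with the standard subprocess module: a parameterless
method that pipes the stored bytes through an external converter each
time the catalog asks for text. The names extract_text, SearchableFile
and STAND_IN_CMD are made up for illustration and are not part of Zope.
The stand-in command just echoes its input so the sketch runs anywhere;
a real setup would put a wrapper around the doc2html.pl pipeline there.

```python
import subprocess
import sys


def extract_text(data, converter_cmd, timeout=30):
    """Pipe raw document bytes through an external converter command
    and return whatever it writes to stdout, decoded as text."""
    result = subprocess.run(
        converter_cmd,
        input=data,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        timeout=timeout,
        check=True,  # raise if the converter exits with an error
    )
    return result.stdout.decode("utf-8", errors="replace")


class SearchableFile:
    """Hypothetical stand-in for an object holding a binary file."""

    def __init__(self, data, converter_cmd):
        self.data = data
        self.converter_cmd = converter_cmd

    def PrincipiaSearchSource(self):
        # Parameterless, so an index can call it without arguments.
        # The text is rendered on the fly; nothing is kept around.
        try:
            return extract_text(self.data, self.converter_cmd)
        except (OSError, subprocess.SubprocessError):
            # A failed or missing converter means no searchable text,
            # rather than a failed indexing run.
            return ""


# Assumption: a trivial echo command in place of the real converter.
STAND_IN_CMD = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.write(sys.stdin.read())",
]

doc = SearchableFile(b"Fred wrote this memo.", STAND_IN_CMD)
text = doc.PrincipiaSearchSource()
```

Returning an empty string on failure is a design choice for the sketch:
one unreadable document then does not abort indexing of the others, at
the cost of that document silently not showing up in searches.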

HTH,
Jan

_______________________________________________
Zope maillist  -  [EMAIL PROTECTED]
http://lists.zope.org/mailman/listinfo/zope
**   No cross posts or HTML encoding!  **
(Related lists - 
 http://lists.zope.org/mailman/listinfo/zope-announce
 http://lists.zope.org/mailman/listinfo/zope-dev )
