I've been thinking about doing this. I wonder if there are any C filter
libraries that read Word docs. The Word 2000 docs are supposedly non-binary,
so you could probably write a parser of sorts in Python or C/Lex. I used to
write text filters in C and Lex for my previous employer, and one of these
days I will figure out how to extend Python with C and do this.  I'm thinking
about doing this type of thing in order to make PDFs searchable (as well as
IPTC caption data in JPG files).
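
As a rough idea of what the simplest such filter might look like, here is a
sketch in Python that pulls runs of printable ASCII out of arbitrary bytes,
much like the Unix "strings" tool. It is not a Word parser and knows nothing
about the file format; the function name extract_text_runs and the min_len
cutoff are made up for illustration.

```python
import re


def extract_text_runs(data, min_len=4):
    """Return runs of printable ASCII found in a bytes object.

    Crude stand-in for a real document filter: anything shorter than
    min_len characters is treated as noise and dropped.
    """
    # Printable ASCII is the range from space (0x20) through tilde (0x7e).
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]


if __name__ == "__main__":
    sample = b"\x00\x01Python skills\x00\x02ab\x00Zope\xff"
    for run in extract_text_runs(sample):
        print(run)
```

Text stored in a two-byte encoding would slip past this, so a real filter
would need to know more about the format.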

Perhaps in the meantime, one could set up a macro in the Normal.dot template
file that FTPs the doc to Zope on every save and updates properties
containing the full text for the document.  Sort of kludgy, but I assume it
would work if you were familiar with VBA coding and had access to an HTTP
client component.
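
To show the shape of the upload step, here is a sketch in Python rather than
VBA, using HTTP PUT instead of FTP. It only builds the request; sending it is
left to the caller. The function name, the folder URL and the use of basic
authentication are all assumptions for illustration, not something I have
tested against a real Zope server.

```python
import base64
import urllib.parse
import urllib.request


def build_upload_request(base_url, filename, data, user, password):
    """Build (but do not send) an HTTP PUT request for one document."""
    # Basic auth header: base64 of "user:password".
    credentials = ("%s:%s" % (user, password)).encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    # Quote the file name so spaces and similar characters survive in the URL.
    url = "%s/%s" % (base_url.rstrip("/"), urllib.parse.quote(filename))
    return urllib.request.Request(
        url,
        data=data,
        method="PUT",
        headers={
            "Authorization": "Basic " + token,
            "Content-Type": "application/msword",
        },
    )


if __name__ == "__main__":
    # Hypothetical folder URL and credentials.
    req = build_upload_request(
        "http://localhost:8080/cvs", "jane doe.doc", b"...", "user", "secret"
    )
    print(req.get_method(), req.full_url)
    # To actually send it: urllib.request.urlopen(req)
```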

Doing it this way, you would likely have to reindex the catalog manually.
There might be a way to automate that, though...

Sean

=========================
Sean Upton
Senior Programmer/Analyst
SignOnSanDiego.com
The San Diego Union-Tribune
619.718.5241
[EMAIL PROTECTED]
=========================


-----Original Message-----
From: Bowyer, Alex [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, January 02, 2001 2:45 PM
To: '[EMAIL PROTECTED]'
Subject: [Zope] Advice on searching/indexing Word documents?


Our company has a repository of staff CVs (Resumes) as Word Documents and I
am about to embark on creating a new feature for our Zope Intranet to allow
project managers to search those documents for keywords such as particular
skills or projects.

I am thinking about several possibilities such as a skills/CVs database
linked in via ODBC, or some task that converts the Word documents to text
files which can then be searched by Zope (I think Zope can do this, and I
assume it can't search Word format directly?).
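
Once the documents exist as plain text, keyword search comes down to an
inverted index. The sketch below is a toy version in plain Python, not Zope's
own catalog; the names build_index and search and the sample file names are
made up for illustration.

```python
import re
from collections import defaultdict


def build_index(texts):
    """Map each lowercased word to the set of document names containing it."""
    index = defaultdict(set)
    for name, text in texts.items():
        # Keep '+' and '#' so skills such as "c++" or "c#" stay one token.
        for word in re.findall(r"[a-z0-9+#]+", text.lower()):
            index[word].add(name)
    return index


def search(index, keywords):
    """Return sorted names of documents that contain every keyword."""
    hits = [index.get(k.lower(), set()) for k in keywords]
    return sorted(set.intersection(*hits)) if hits else []


if __name__ == "__main__":
    cvs = {
        "alice.txt": "Python and Zope developer",
        "bob.txt": "Java, C++ and Python",
    }
    index = build_index(cvs)
    print(search(index, ["Python", "zope"]))
```

Requiring every keyword to match keeps the results narrow; matching any
keyword would mean taking the union of the sets instead.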

Has anyone ever approached a similar problem? Does anyone have any tips on
how to index/search a load of documents in Zope?

Any tips/suggestions/comments would be most welcome.

Thanks,

Alex

==================================
Alex Bowyer
IT Consultant, Logica Australasia
Tel    : +61 2 9202 8130
Fax    : +61 2 9922 7466
E-mail : [EMAIL PROTECTED]
WWW    : http://www.logica.com.au/
==================================

_______________________________________________
Zope maillist  -  [EMAIL PROTECTED]
http://lists.zope.org/mailman/listinfo/zope
**   No cross posts or HTML encoding!  **
(Related lists - 
 http://lists.zope.org/mailman/listinfo/zope-announce
 http://lists.zope.org/mailman/listinfo/zope-dev )

