I would second anything Bill says about swish-e. We use it as well.

/John

Bill Conlon wrote:

Here's my broken-record solution: swish-e (http://swish-e.org).

It is a very fast and flexible index builder and search engine for just this kind of thing (it will also index PDF, Excel, and Word files).

I use it with both the Perl CGI interface and a shell script from Witango.

One thing I especially like is its handling of meta tags. Instead of using a database to store metadata, you can embed it within the files being indexed, and then have swish-e search on just specific items in the meta tags.

We've built some dynamic content solutions with this technique. One quick example (sorry, it's Perl, but the principle is the same in Witango) is a press release archive. Each release has these tags in its head section:

<head>
<meta name="Category" content="press">
<meta name="date" content="20040111">
<meta name="description" content="First line of press release">
</head>

Then you index, and have a script that gets all the HTML files where Category is "press", and sorts them by Date:

open(SWISH, "$swish -w $query -m $results -p Description Keywords Date -s Date -f $index|")
    or die "Cannot run swish-e: $!";

where:
$swish is the path to the swish-e executable,
$query='Category=press';
$results is the maximum number of hits to return, and
$index is the path to the index.
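
For anyone not working in Perl, here is a rough Python sketch of the same call. It uses only the flags that appear in the line above (-w, -m, -p, -s, -f). The function names and the paths are made up for illustration, so adjust them to your setup.

```python
import subprocess


def build_swish_command(swish, index, query, max_results, properties, sort_by):
    """Return the argument list for a swish-e search.

    Mirrors the Perl line above: -w query, -m max results,
    -p properties to print, -s sort property, -f index file.
    """
    return [
        swish,
        "-w", query,
        "-m", str(max_results),
        "-p", *properties,
        "-s", sort_by,
        "-f", index,
    ]


def run_search(cmd):
    """Run swish-e and return whatever it writes to standard output."""
    done = subprocess.run(cmd, capture_output=True, text=True)
    return done.stdout


# Hypothetical paths, for illustration only.
cmd = build_swish_command(
    swish="/usr/local/bin/swish-e",
    index="/var/index/press.index",
    query="Category=press",
    max_results=20,
    properties=["Description", "Keywords", "Date"],
    sort_by="Date",
)
```

Passing the arguments as a list means no shell is involved, so a query containing spaces or quotes does not need extra escaping.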


We then parse the output and display each row with the date, release title, release description metatag, and a link to the release itself.

If you call it from Witango, you will get a result set, which you can tokenize (there is a swish-e flag to specify the delimiter), and then format the results as usual.
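
The tokenizing step might look roughly like this in Python. This is a sketch under assumptions, not a description of the exact swish-e output: it assumes header lines start with "#", error lines start with "err:", a line holding only "." ends the results, and each hit is one line split by the delimiter you chose. The sample text, the "|" delimiter, and the field order are invented for the example, so check them against what your swish-e actually prints.

```python
def parse_swish_output(text, delimiter, fields):
    """Turn delimited swish-e output into a list of dicts, one per hit."""
    rows = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header/comment lines
        if line.strip() == ".":
            break  # assumed end-of-results marker
        if line.startswith("err:"):
            raise RuntimeError(line)  # assumed error prefix
        # Pair each token with the field name at the same position.
        rows.append(dict(zip(fields, line.split(delimiter))))
    return rows


# Made-up sample output; the field order here is an assumption.
sample = "\n".join([
    "# Number of hits: 2",
    "1000|/press/a.html|Release A|2048|First line of release A|20040111",
    "980|/press/b.html|Release B|1024|First line of release B|20031202",
    ".",
])
fields = ["rank", "path", "title", "size", "description", "date"]
releases = parse_swish_output(sample, "|", fields)

for release in releases:
    print(release["date"], release["title"], release["path"])
```

Each dict then has what the display needs: the date, the release title, the description, and the path to link to.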



I have a project that calls for creating keyword searches of teleconference
transcripts and newsletters.

The files we need to search on will most likely be stored in a directory as
text files and/or XML documents.

I need a witango solution. Apparently there is a PHP solution called PHP dig
that works well.

I probably need something that works like that.

Ideas? Code? Everything welcomed.

Dan


-- 
Dan Stein
Digital Software Solutions
799 Evergreen Circle
Telford PA 18969
Land: 215-799-0192
Mobile: 610-256-2843
Fax 413-410-9682
FMP, WiTango, EDI, SQL 2000
[EMAIL PROTECTED]
www.dss-db.com


"When you are born, you cry and those who love you rejoice. And if you live your life as you should, when you die, you rejoice and those who love you cry."

________________________________________________________________________
TO UNSUBSCRIBE: Go to http://www.witango.com/developer/maillist.taf





Bill Conlon

To the Point
345 California Avenue Suite 2
Palo Alto, CA 94306

office: 650.327.2175
fax:    650.329.8335
mobile: 650.906.9929
e-mail: mailto:[EMAIL PROTECTED]
web:    http://www.tothept.com

