You are hearing that Witango is not a good tool for this task.
I spent a long time trying to build a search engine in Witango and finally gave up. I looked at Swish-e and tried installing the latest Perl and their program. I never got it to work on a Win2k server. Sorry, but the learning curve was too much for me in the time I had to get this together. I spent days on just the installation and configuration and finally gave up. (Sorry Bill, I know you love the program)
I ended up using a program called Alkaline Search Engine, which was very good for a novice like me. I had it up and running in about half a day. My requirements were the ability to weight different aspects of the pages being indexed, control the URL list of what to crawl (how deep, by directory or URL), control the time the robot crawls, set a threshold of changed pages to crawl, return just one URL per site, sort by weight, date, or alpha domain, highlight keywords, etc.
What was nice is that there's no Perl, and there's an HTML front end to watch crawls and stats. All I needed to learn was about 100 tags that are unique to their program.
A word of caution though, no support. They have a forum, but it sometimes takes weeks to get an answer.
Just my 2 cents.
Hmmmm.... Looking at swish-e, it's clear that it's a more powerful approach, but...
If I want to keep life simple and have everything running from witango - even if I won't get all the swish-e bells and whistles, is there a better approach? Or am I hearing that witango isn't really a good tool for this task?
On 10/19/04 8:15 AM, "Bill Conlon" <[EMAIL PROTECTED]> wrote:
Then, maybe spidering the forum content is the easiest way to do this. It will let you do free-form searching. The CGI script included with swish-e will highlight found terms, and let the user apply various restrictions.
The other thing is you don't have to worry about a SQL injection attack.
On Tuesday, October 19, 2004, at 08:06 AM, Roland Dumas wrote:
It's a search function for a forum. Dynamic popups won't work here, because it's a search of subject and content, which would overwhelm a selection menu.
My thoughts were like this:
User's search string = first argument. The search string is parsed (by space and comma) and the articles tossed out. That leaves an array of words within the first argument. The first argument and the remaining substrings comprise all the OR conditions you want.
Option 1: Generate SQL from a <@ROWS> that just appends a series of OR statements to the SELECT command. (easiest to do, but least secure)
Option 2: Write a taf in XML, using the <@ROWS> to create a custom <Criteria> section in a temporary taf that is just the search action, and then call that action with a branch/return. (typos will crash tango server - venturing into deep unknown)
Option 3: Do a For loop for each of the substrings and glue all the resultsets together (slow and painful)
Then, when you've got the amalgamated found set of records with whole string or substrings, figure a way to bubble up to the top the whole strings or items of greater value.
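A rough sketch of the tokenize-then-OR idea above, written in Python rather than Witango just to show the shape of it. The `posts` table, its `subject` and `content` columns, and the stop list are assumptions made up for the example. It binds every term as a parameter instead of appending it to the SELECT text, which is one way around the "least secure" part of Option 1.

```python
import re
import sqlite3

# Assumed stop list; the thread only says the articles get tossed out.
ARTICLES = {"a", "an", "the"}

def tokenize(search):
    """Split on spaces and commas, then drop articles."""
    words = [w for w in re.split(r"[\s,]+", search.strip()) if w]
    return [w for w in words if w.lower() not in ARTICLES]

def build_query(search):
    """Return (sql, params) with the whole string and each word OR-ed together."""
    whole = search.strip()
    if not whole:
        raise ValueError("empty search string")
    terms = [whole] + [w for w in tokenize(whole) if w != whole]
    # One placeholder pair per term; user text never enters the SQL itself.
    # Note: % and _ typed by the user still act as LIKE wildcards here.
    clause = " OR ".join("subject LIKE ? OR content LIKE ?" for _ in terms)
    params = [f"%{t}%" for t in terms for _ in range(2)]
    sql = f"SELECT id FROM posts WHERE {clause} ORDER BY id"
    return sql, params

# Usage against a throwaway in-memory table (hypothetical schema).
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, subject TEXT, content TEXT)")
db.executemany("INSERT INTO posts (subject, content) VALUES (?, ?)", [
    ("1961 Chateau Lafite", "tasting notes"),
    ("Pinot Noir", "nothing relevant"),
    ("Cellar list", "a Lafite from 1982"),
])
sql, params = build_query("1961 Chateau Lafite")
found = [row[0] for row in db.execute(sql, params)]
```

This returns the amalgamated found set in one query, so it stands in for the loop in Option 3 as well. Ordering the hits by relevance is a separate step.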
On 10/19/04 7:13 AM, "Bill Conlon" <[EMAIL PROTECTED]> wrote:
Full text indexing can be expensive if your application does a lot of inserts/updates into columns that are indexed, but things like the winery/varietal shouldn't be a problem if you want to get it out of the db.
If it's a problem for your users, then maybe you can build a selection list from the available choices. Or maybe you need something akin to auto-complete: run a JavaScript keyboard event handler that populates your input field based on the characters typed so far.
On Tuesday, October 19, 2004, at 06:59 AM, John McGowan wrote:
Don't some newer databases have full text indexing now? (I believe MSSQL calls the feature "Full-Text Search".) Wouldn't the best solution be to use a database that supports that type of searching?
If this functionality isn't available to you in your DB then I would suggest you still use swish-e like Bill suggests...
1. create a "dummy site" that will have a unique page for every record in the table that you're looking for.
   www.mysite.com/dbindex/main.taf
2. when you hit main.taf it generates a link to each record in the table you care about
   www.mysite.com/dbindex/main.taf?id=xxxxx
   www.mysite.com/dbindex/main.taf?id=yyyyy
   www.mysite.com/dbindex/main.taf?id=zzzzzz
(if you're familiar with witango this should take you about 5 minutes to accomplish)
3. Tell swish-e to index the site by hitting the initial main.taf url.
4. now when you want to do a full text search of the table, you call swish-e's searching functionality. It will return a list of the matching entries.
   www.mysite.com/dbindex/main.taf?id=aaaaa
   www.mysite.com/dbindex/main.taf?id=bbbbb
   www.mysite.com/dbindex/main.taf?id=cccccc
5. Of course at this point you know that if you strip out the "www.mysite.com/dbindex/main.taf?id=" you will have the part of the URL that you care about, the aaaaa,bbbbb,ccccc, which should be in a ranked order, and now you can do with that information whatever you want.
6. Schedule the re-running of step 3 at some interval that satisfies your need for accuracy vs. performance.
Of course this all assumes you're doing this for 1 particular table. However, if you had more than 1 table you could still do it all by adding a little more code to your main.taf and some more logic to the part that strips the URL to get the important part.
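Step 5 could look something like the sketch below. It assumes the swish-e results have already been reduced to a plain list of URLs in ranked order; how you pull them out of swish-e's output is not shown here, and `ids_from_hits` is a made-up helper name.

```python
# The prefix from step 5; anything after it is the record id.
PREFIX = "www.mysite.com/dbindex/main.taf?id="

def ids_from_hits(hits):
    """Strip the dummy-site prefix from each hit, keeping the ranked order."""
    ids = []
    for url in hits:
        url = url.strip()
        # Tolerate a scheme in front of the host name.
        for scheme in ("http://", "https://"):
            if url.startswith(scheme):
                url = url[len(scheme):]
        # Skip anything that is not one of the generated record pages.
        if url.startswith(PREFIX):
            ids.append(url[len(PREFIX):])
    return ids

ranked_ids = ids_from_hits([
    "www.mysite.com/dbindex/main.taf?id=aaaaa",
    "http://www.mysite.com/dbindex/main.taf?id=bbbbb",
    "www.mysite.com/dbindex/main.taf",
])
```

For more than one table, the same idea works with one prefix per table, or with an extra URL argument that names the table.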
/John
Roland Dumas wrote:
But, we're talking about a search of a database.
On 10/18/04 5:59 PM, "Bill Conlon" <[EMAIL PROTECTED]> wrote:
Roland,
You've heard this from me before on this list. Take a look at swish-e. You could use its built-in spider to index your site, and then use the built-in cgi-script to highlight your results. It's really a great piece of software.
Now if you take the swish-e approach, here's what I would do to solve this.
Dynamically create metatags for the key parameters you want to search:
<meta name="vineyard" content="Chateau Lafite, Chateau, Lafite">
<meta name="varietal" content="Pinot Noir, Pinot, Noir">
etc.
Use witango to tokenize while creating the HTML pages for the various wines.
Then use swish-e's meta name search.
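To make the tokenizing step concrete, here is a small sketch of how the metatag content could be built. It is Python standing in for whatever Witango code generates the pages, and `meta_tag` is a hypothetical helper, not part of Witango or swish-e.

```python
from html import escape

def meta_tag(name, value):
    """Build a metatag whose content is the full value plus each of its words."""
    words = value.split()
    # Only add the individual words when there is more than one.
    parts = [value] + (words if len(words) > 1 else [])
    content = ", ".join(parts)
    # Escape quotes and angle brackets so the attribute stays well formed.
    return f'<meta name="{escape(name)}" content="{escape(content)}">'

vineyard = meta_tag("vineyard", "Chateau Lafite")
varietal = meta_tag("varietal", "Pinot Noir")
```

With the two wines above, this reproduces the example tags exactly.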
On Monday, October 18, 2004, at 05:39 PM, Roland Dumas wrote:
In search engines, when you submit a search string, the search engine first tokenizes it, then searches for each substring separately, and then brings the results together as your found set. So if I search for 1961 Chateau Lafite, I'll get items with 1961, others with Chateau or Chateau Lafite, and on top will be the found records with 1961 Chateau Lafite. (I know, if you put it in quotes, it forces it to find only the whole string. That part is easy.)
They will also rank a find of the full set of terms above ones with one or two terms in the documents.
Questions:
What's the approach with witango that will enable the search of tokenized strings?
Any ideas on how to do a crude ranking, such that the full term comes up on top of the found set?
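One crude way to get that ordering, sketched in Python: score each found record by whether it contains the whole search string, then by how many of the individual words it contains, and sort on that pair. This is only an illustration of the idea, not how any particular search engine ranks; the sample records are made up.

```python
def crude_rank(texts, search):
    """Sort texts so whole-string hits come first, then by number of words matched."""
    phrase = search.lower().strip()
    tokens = phrase.split()

    def score(text):
        t = text.lower()
        whole = 1 if phrase in t else 0                 # full search string present
        hits = sum(1 for tok in tokens if tok in t)     # individual words present
        return (whole, hits)

    return sorted(texts, key=score, reverse=True)

records = [
    "1961 Margaux",
    "Chateau Lafite 1982",
    "1961 Chateau Lafite Rothschild",
    "Pinot Noir",
]
ranked = crude_rank(records, "1961 Chateau Lafite")
```

The record with the full string comes out on top, followed by the one matching two words, then one word, then none.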
________________________________________________________________________ TO UNSUBSCRIBE: Go to http://www.witango.com/developer/maillist.taf
----------------------------------------- Roland Dumas Roberts Information Services 310 W. Bellevue Avenue San Mateo CA 94402 650-347-1373 415-412-9300 (cell) [EMAIL PROTECTED] SMS: http://new.servqual.com/html/sms.tml