Re: Witango-Talk: Search Engine

Robert Garcia Wed, 09 Nov 2005 23:11:28 -0800

And you only need the value of the title and a few meta tags, right?

We have built several solutions in Witango where we hit a URL with <@url> and use regex to parse out values. It works very well. It works in v5 and 5.5, but much better in 5.5, where the regex support is better.


An example.

I have a customer whose bank is, well, I nicknamed it Bank of Bedrock.

My client has several call centers, and I wrote an application that the call centers use to create orders. The bank must create a credit account, and give a credit limit for the rep to proceed.

The bank wanted to have the reps (minimum wage staff, looking for any shortcut) open another browser window, hit the bank site, log in with a special account, fill out the credit app by hand, wait for the response, and then manually enter the credit limit and approved/denied status into my Witango app.


The bank had no means to provide a webservice for this functionality.

So I wrote an app that uses <@url> to hit the login page and acts like a user logging in. I use regex to parse out the user reference generated, so I can use it in the next <@url> to hit the credit app page, posting the form and acting like a user submitting the app.

I then use regex to parse the HTML returned, finding the declined/approved status and the credit line.
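
A rough sketch of the two parsing steps above, written in Python rather than Witango so it can be run standalone. Everything specific here is an assumption for illustration: the field name `userref`, the "Status:" and "Credit line:" markup, and both sample strings are hypothetical stand-ins for whatever the bank pages actually return. The fetches themselves are left out; only the regex side is shown.

```python
import re

# Hypothetical HTML standing in for what the two page fetches return.
LOGIN_RESPONSE = (
    '<form action="/app">'
    '<input type="hidden" name="userref" value="A1B2C3"></form>'
)
RESULT_RESPONSE = "<p>Status: <b>APPROVED</b></p><p>Credit line: $2,500.00</p>"


def parse_user_reference(html):
    """Pull the generated user reference out of the login response."""
    m = re.search(r'name="userref"\s+value="([^"]+)"', html)
    if m is None:
        # Fail loudly: a silent miss would post the next form without a reference.
        raise ValueError("user reference not found; page layout may have changed")
    return m.group(1)


def parse_decision(html):
    """Return (status, credit_line) from the credit app response."""
    status = re.search(r"Status:\s*<b>(APPROVED|DECLINED)</b>", html)
    if status is None:
        raise ValueError("decision not found; page layout may have changed")
    line = re.search(r"Credit line:\s*\$([\d,]+(?:\.\d{2})?)", html)
    # A declined response may carry no credit line at all.
    amount = float(line.group(1).replace(",", "")) if line else None
    return status.group(1), amount


print(parse_user_reference(LOGIN_RESPONSE))
print(parse_decision(RESULT_RESPONSE))
```

Raising when a pattern does not match is a deliberate choice in this sketch: screen scraping breaks whenever the remote page changes, and it is better to find out right away than to store a blank status.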

Regex works VERY well this way, and is very fast. Parsing for title and meta tags would be even simpler.
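
To give an idea of how little is involved for title and meta tags, here is a minimal sketch in Python (not Witango). The sample page and the function names are made up for illustration, and the meta pattern assumes `name=` comes before `content=` with double-quoted values, which real pages will not always honor.

```python
import re

# Hypothetical sample page; in practice this is the HTML returned by the fetch.
SAMPLE_HTML = """<html><head>
<title>Widget Catalog</title>
<meta name="keywords" content="widgets, gadgets">
<META NAME="description" CONTENT="All of our widgets in one place.">
</head><body><p>Hello</p></body></html>"""

TITLE_RE = re.compile(r"<title[^>]*>(.*?)</title>", re.IGNORECASE | re.DOTALL)


def extract_title(html):
    """Return the text of the first <title> element, or None."""
    m = TITLE_RE.search(html)
    return m.group(1).strip() if m else None


def extract_meta(html, name):
    """Return the content of <meta name="..."> for the given name, or None.

    Assumes name= precedes content= and both values are double-quoted.
    """
    pattern = re.compile(
        r'<meta\s+name="%s"\s+content="([^"]*)"' % re.escape(name),
        re.IGNORECASE,
    )
    m = pattern.search(html)
    return m.group(1).strip() if m else None


print(extract_title(SAMPLE_HTML))
print(extract_meta(SAMPLE_HTML, "keywords"))
print(extract_meta(SAMPLE_HTML, "description"))
```

The three values returned map straight onto the title, keywords, and description columns the spider would insert.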

The other method is to use an object, like a COM object, that will hit a URL and turn the resulting HTML into a DOM object. Then you can use code to get the values, without regex.

Here is some simple code, where I swap the HTML of a DOM, but you can access any value in the DOM.


  obj = me.Content.Property("Document")
  obj = OLEObject( obj.invoke("body") )
  obj.Property("innerHTML") = html

Shell.explorer does this, and there are others. There are probably Java tools that parse HTML into a DOM also. In the end, we used regex. Much simpler, and it was very fast.
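
For comparison, a parser-based version without regex might look like the sketch below. It is Python, using the standard library's `html.parser`, not the COM object described above, and it is event-driven rather than a full DOM. The class and function names are hypothetical.

```python
from html.parser import HTMLParser


class HeadExtractor(HTMLParser):
    """Collect the <title> text and named <meta> tags while parsing."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        # HTMLParser hands over tag and attribute names already lowercased.
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            attr = dict(attrs)
            name = attr.get("name")
            if name and attr.get("content") is not None:
                self.meta[name.lower()] = attr["content"]

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def extract_head(html):
    """Return (title, meta_dict) for one HTML document."""
    parser = HeadExtractor()
    parser.feed(html)
    parser.close()
    return parser.title.strip(), parser.meta


page = (
    "<html><head><TITLE>Widget Catalog</TITLE>"
    '<meta name="Keywords" content="widgets, gadgets"></head></html>'
)
print(extract_head(page))
```

This copes with attribute order and quoting differences that trip up a simple pattern, at the cost of a little more code.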


--

Robert Garcia
President - BigHead Technology
VP Application Development - eventpix.com
13653 West Park Dr
Magalia, Ca 95954
ph: 530.645.4040 x222 fax: 530.645.4040
[EMAIL PROTECTED] - [EMAIL PROTECTED]
http://bighead.net/ - http://eventpix.com/

On Nov 9, 2005, at 9:57 PM, Rick Sanders wrote:

Hey John,

---------------------------[snip]------------------------------
Trying to build a search engine spider. I want to grab the html file using the <@URL> tag, then omit everything but the title, keywords, and description and throw it into the database.
---------------------------[snip]------------------------------
I just want to know if it is possible to do this solely with WiTango. Basically, I'm in a battle with Microsoft Content Management Server. See, MCMS doesn't have search capability because Microsoft closed the database. So, I am grabbing the MCMS postings using an XML control (CMS Rapid) and the posting comes out in HTML. I want to take the HTML, parse the data I need out of it, throw it in a database, and query it.
Mondo Search is $10,800.00 the first year, and $1800.00 the second & third year. There's no control to use Coveo with MCMS. So, I want to build a custom search interface with WiTango.
Thanks!

Rick

----- Original Message ----- From: "John McGowan" <[EMAIL PROTECTED]>
To: <[email protected]>
Sent: Wednesday, November 09, 2005 11:39 AM
Subject: Re: Witango-Talk: Search Engine
Rick,

What's your question?

/John

Rick Sanders wrote:
Hey Bill,

Thanks for the link!

But, I'd still love to do this completely in WiTango.

Rick
----- Original Message ----- From: "William M Conlon" <[EMAIL PROTECTED]>
To: <[email protected]>
Sent: Tuesday, November 08, 2005 8:08 PM
Subject: Re: Witango-Talk: Search Engine
I'm a broken record on this, but here goes:
http://www.swish-e.org has a very nice perl spider which will do this for you (well, you'll have to write a perl callback function to INSERT INTO (link, title, keywords, description)).
But the nice thing about this is that it's already integrated with an HTML parser, to pull this out for you.
On Nov 8, 2005, at 4:48 PM, Rick Sanders wrote:
Hey Guys,
Trying to build a search engine spider. I want to grab the html file using the <@URL> tag, then omit everything but the title, keywords, and description and throw it into the database.
I know I can do this with other platforms, but would like to do it with WiTango.
Rick Sanders
President
519-498-7994
www.webenergy-sw.com
Bill

William M. Conlon, P.E., Ph.D.
To the Point
345 California Avenue Suite 2
Palo Alto, CA 94306
   vox:  650.327.2175 (direct)
   fax:  650.329.8335
mobile:  650.906.9929
e-mail:  mailto:[EMAIL PROTECTED]
   web:  http://www.tothept.com
________________________________________________________________________
TO UNSUBSCRIBE: Go to http://www.witango.com/developer/maillist.taf