On Thu, Nov 26, 2009 at 10:01 PM, Robert lzw <[email protected]> wrote:
> Hello folks,
>
> I want to build a SQL database from the data on web pages such as
> this one:
> http://tubic.tju.edu.cn/deg/information.php?ac=DEG10010001
>
> Since the data would be collected from thousands of such links, I want
> to write code to do it automatically. Can anyone suggest how to
> deal with the following tasks?
>
> (1) With the above link, how can I store the corresponding data in the
> SQL database, assuming the database has the same fields (Access
> Number, Gene Name, etc.) as the page?
>
> (2) After processing the above link, how can I open a new link
> automatically, for example,
> http://tubic.tju.edu.cn/deg/information.php?ac=DEG10010002, and do
> the same thing as in step (1)?
>
> (3) How can I repeat steps (1) and (2) for all the pages I want to handle?
>
> Any suggestions and recommendations of frameworks, books and online
> sources for doing it would be highly appreciated.
I'd say Wt is not the right tool here. You'd be better off using something
designed to parse or scrape web pages.

If you want to use C++, I'd use Qt:
- QNetworkAccessManager and QUrl to access and/or download the pages
- QtWebKit from Qt 4.6 (particularly QWebElement, which is new in Qt 4.6:
  http://doc.trolltech.com/4.6-snapshot/qwebelement.html) to parse the data
- QtSql to insert the data into a database
- You can implement a thread pool (using QThread) to parallelize fetching
  and processing.

Other options are Ruby, using ScrAPI or Hpricot to easily parse the web
pages, or Python. There are essentially infinite options.

-- 
Pau Garcia i Quiles
http://www.elpauer.org
(Due to my workload, I may need 10 days to answer)

_______________________________________________
witty-interest mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/witty-interest
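
For the Python route mentioned above, the fetch / parse / insert loop from
steps (1) to (3) could look roughly like the sketch below. It uses only the
standard library (urllib, html.parser, sqlite3). Several things in it are
guesses that would need checking against the real pages: that each page is a
two-column table of label/value cells, that accession numbers are "DEG1001"
followed by a four-digit counter, and the table name "genes" and all helper
names, which are made up for illustration.

```python
import sqlite3
import urllib.request
from html.parser import HTMLParser

BASE_URL = "http://tubic.tju.edu.cn/deg/information.php?ac="


def accession(n):
    # Assumption: accession numbers are "DEG1001" plus a 4-digit counter.
    return f"DEG1001{n:04d}"


class CellCollector(HTMLParser):
    """Collects the text of every <td>/<th> cell, in document order."""

    def __init__(self):
        super().__init__()
        self.cells = []
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag in ("td", "th"):
            self._in_cell = True
            self.cells.append("")

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._in_cell = False

    def handle_data(self, data):
        if self._in_cell:
            self.cells[-1] += data


def parse_fields(html):
    # Assumption: the page is a flat two-column table of (label, value) rows.
    parser = CellCollector()
    parser.feed(html)
    parser.close()
    cells = [" ".join(c.split()) for c in parser.cells]
    return dict(zip(cells[0::2], cells[1::2]))


def open_db(path=":memory:"):
    # Hypothetical schema with just two of the fields named in the question.
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS genes ("
        "access_number TEXT PRIMARY KEY, gene_name TEXT)"
    )
    return conn


def store(conn, record):
    # Step (1): write one page's fields into the database.
    conn.execute(
        "INSERT OR REPLACE INTO genes (access_number, gene_name) VALUES (?, ?)",
        (record.get("Access Number"), record.get("Gene Name")),
    )


def fetch(url):
    # Download one page and return its body as text.
    with urllib.request.urlopen(url, timeout=30) as response:
        return response.read().decode("utf-8", errors="replace")


def scrape(conn, numbers, fetch_page=fetch):
    # Steps (2) and (3): build each URL in turn, then fetch, parse and store.
    for n in numbers:
        html = fetch_page(BASE_URL + accession(n))
        store(conn, parse_fields(html))
    conn.commit()
```

Under those assumptions, scrape(open_db("deg.sqlite"), range(1, 11)) would
walk the first ten accession numbers. The fetch_page parameter is there so
the parsing and storing can be tried on a saved copy of a page before
pointing the loop at the live site.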
