I would try looping through the files, using <@tokenize> to split the data into fields, then writing (appending) each record as tab-separated values to a new file.
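For anyone who wants to prototype the same loop outside Witango, here is a rough sketch in Python. It is not the Witango approach itself: a regular expression stands in for <@tokenize>, and the pattern simply assumes every file follows the layout of the sample message quoted below (subject, author, mailto link, date, body, then the reply form). The function names, the field names, and the 101 to 1100 number range are all made up for illustration.

```python
import csv
import re
from pathlib import Path

# Assumed layout, taken from the sample file in the quoted message:
# <H1>subject</H1> <H2>author</H2> <p><A HREF="mailto:..">..</A></p>
# <h2>date</H2> <p>body</p>. Files that differ will not match.
FIELD_RE = re.compile(
    r"<h1>(?P<subject>.*?)</h1>\s*"
    r"<h2>(?P<author>.*?)</h2>\s*"
    r"<p><a href=\"mailto:(?P<email>.*?)\">.*?</a></p>\s*"
    r"<h2>(?P<date>.*?)</h2>\s*"
    r"<p>(?P<body>.*?)</p>",
    re.IGNORECASE | re.DOTALL,
)
FIELDS = ["subject", "author", "email", "date", "body"]


def parse_message(text):
    """Return a dict of fields, or None if the file does not fit the layout."""
    # Everything from the first <center> onward is the reply form; drop it.
    head = re.split(r"<center>", text, maxsplit=1, flags=re.IGNORECASE)[0]
    match = FIELD_RE.search(head)
    if match is None:
        return None
    # Collapse tabs and line breaks so each record stays on one row.
    return {key: " ".join(value.split()) for key, value in match.groupdict().items()}


def export_messages(src_dir, out_path, first=101, last=1100):
    """Append one tab-separated row per parsable file; return the names skipped."""
    skipped = []
    with open(out_path, "a", newline="", encoding="utf-8") as out:
        writer = csv.DictWriter(out, fieldnames=FIELDS, delimiter="\t")
        for number in range(first, last + 1):
            path = Path(src_dir) / f"{number}.txt"
            if not path.exists():
                continue  # gaps in the numbering are expected
            record = parse_message(path.read_text(encoding="utf-8", errors="replace"))
            if record is None:
                skipped.append(path.name)  # fix these by hand or refine the pattern
            else:
                writer.writerow(record)
    return skipped
```

Calling export_messages("forum", "messages.tsv") would append the rows and hand back the file names that did not match, which is the list to look at when deciding whether to clean up by hand or elaborate on the pattern. Note that the csv module wraps a field in double quotes if it contains a quote character, so the import step has to be told about that.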
Import the new file into Excel to see what it looks like. If there are only a couple of problems, it's faster to clean them up there. If there are a bunch of problems, refine the tokenizing instead. Then take the cleaned-up tab-separated file directly into MySQL (I use Webmin for that purpose).

>How would I go about parsing a file in to clean variables?
>
>about a year ago, I created an unthreaded forum. Not wanting to put it
>on FMP, and witango not ready for Mac, I made it file-based. Each
>submission writes a file with the guts of the message in it and then a
>link is created to a .tml that sucks up the guts as an include into a
>formatted page.
>
>http://www.patricknagel.com/forum/forum.taf
>
>OK, it's ugly, but it worked.
>
>Someone reminded me that when mysql and witango talked, I'd redo that
>as a database app so it can be searched, etc.
>
>Yesterday, I did that, and now have a challenge. Getting all those
>messages into the database. I can copy/paste about 1000 files, which
>will take forever and be boring as hell. Then, I might come up with a
>way to read and parse these messages and auto-submit.
>
>Since the files are serial numbered: 101.txt, 102.txt, etc.
>
>I was thinking of a taf that looped through a number sequence, read
>each file, parsed it, turned the pieces into variables, and then
>submitted to database. The question is how to parse it.
>
>Here is a typical file:
>
><H1>I have two Nagels</H1>
><H2>Roni</H2>
><p><A HREF="mailto:[EMAIL PROTECTED]">[EMAIL PROTECTED]</A></p>
><h2>07/23/03</H2>
><p>I have two Nagel prints with CERT. OF AUTHENTICITY.... but, they
>don't say a name on them.. HOW DO I FIND OUT WHAT THEY ARE WORTH? HOW
>DO I LOOK THEM UP..? THE CERTIFICATE NUMBERS ARE B4288G & B101250..
>I WOULD REALLY APPRECIATE anyone's assistance with this.
>[EMAIL PROTECTED]</p>
><center>
><form name="reply" method="post"
>action="/forum/forum.taf?_function=reply">
> <input type="hidden" name="subject" value="I have two Nagels">
> <input type="hidden" name="author" value="Roni">
> <input type="hidden" name="comment" value="I have two Nagel prints
>with CERT. OF AUTHENTICITY.... but, they don't say a name on them..
>HOW DO I FIND OUT WHAT THEY ARE WORTH? HOW DO I LOOK THEM UP..? THE
>CERTIFICATE NUMBERS ARE B4288G & B101250.. I WOULD REALLY APPRECIATE
>anyone's assistance with this. [EMAIL PROTECTED]">
> <input type="submit" name="Submit" value="Respond to this message">
></form>
></center>
>
>Everything past the first <center> is unnecessary. Got an idea how best
>to do that?
>Thanks for any hints.
>
>RAD
>________________________________________________________________________
>TO UNSUBSCRIBE: Go to http://www.witango.com/maillist.taf

Bill Conlon

To the Point
345 California Avenue Suite 2
Palo Alto, CA 94306

office: 650.327.2175
fax: 650.329.8335
mobile: 650.906.9929
e-mail: mailto:[EMAIL PROTECTED]
web: http://www.tothept.com
