Re: [Wtr-general] OT: Regular Expression help

Alan Ark Thu, 15 Mar 2007 09:01:39 -0800

Hi Paul.


Just a few notes.

 

The “^” at the start of the pattern is an anchor to the beginning of the line. 
Inside square brackets, as in [^"], it negates the character class instead, so 
having a bunch of those probably did not cause your problems.
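
To check what the pattern from the original question does on the sample line, here is a minimal sketch. The constant names and the captures_for helper are made up for illustration, and the sample line is the one from the irb session in this thread. On that five-field line the [^"] classes match fine; the trailing \t after the last group is what stops the match, since the line ends in a newline there rather than a tab.

```ruby
# Sample line as shown in the irb session in this thread.
SAMPLE_LINE = "\"one\"\t\"1\"\t\" 0.1234\"\t\"0\"\t\"4\"\n"

# Pattern from the original question, unchanged (note the trailing \t).
ORIGINAL = /^\"[^"]*"\t\"[^"]*"\t\"[^"]*"\t\"([^"]*)"\t\"([^"]*)"\t/

# Same pattern with only the trailing \t removed.
TRIMMED = /^\"[^"]*"\t\"[^"]*"\t\"[^"]*"\t\"([^"]*)"\t\"([^"]*)"/

# Hypothetical helper: returns the capture groups, or nil when there is no match.
def captures_for(pattern, line)
  match = pattern.match(line)
  match ? match.captures : nil
end

p captures_for(ORIGINAL, SAMPLE_LINE)  # nil, no tab follows the fifth field
p captures_for(TRIMMED, SAMPLE_LINE)   # ["0", "4"]
```

On a real input line with more than five fields, a tab would follow the fifth field, so the original pattern may well match there.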

 

Here’s a regex that I came up with. 

I used the “?” modifier after “.*” to make the match non-greedy, which probably 
would have been the next thing that you ran into.
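
A small sketch of the difference between greedy and non-greedy matching, using the same sample line. The SAMPLE constant and the first_capture helper are names invented for this illustration only.

```ruby
# Sample line in the format from the original question (tab-separated, quoted).
SAMPLE = "\"one\"\t\"1\"\t\" 0.1234\"\t\"0\"\t\"4\"\n"

# Hypothetical helper: returns the first capture group of a pattern, or nil.
def first_capture(pattern, text)
  match = pattern.match(text)
  match && match[1]
end

# Greedy: .* runs as far as it can, so the capture ends at the last quote.
p first_capture(/"(.*)"/, SAMPLE)   # "one\"\t\"1\"\t\" 0.1234\"\t\"0\"\t\"4"

# Non-greedy: .*? stops at the first closing quote it can.
p first_capture(/"(.*?)"/, SAMPLE)  # "one"
```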

 

 

line=~/\".*?\"\t\".*?\"\t\".*?\"\t\"(.*?)\"\t\"(.*?)\"/

 

I tested this quickly (below) and it appears to work:

irb(main):015:0> paul=File.open("C:/temp/paul.txt")
=> #<File:C:/temp/paul.txt>
irb(main):016:0> pop=paul.gets
=> "\"one\"\t\"1\"\t\" 0.1234\"\t\"0\"\t\"4\"\n"
irb(main):017:0> pop=~/\".*?\"\t\".*?\"\t\".*?\"\t\"(.*?)\"\t\"(.*?)\"/
=> 0
irb(main):018:0> $1
=> "0"
irb(main):019:0> $2
=> "4"

 

 

As an aside, what’s wrong with reading the line, then splitting it into an 
array?

I think that would be much more readable than the regex solution.
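
Something along these lines would do the split. This is only a sketch: it assumes the fields are separated by tabs (as the irb output above suggests), and the fields_from helper is a made-up name.

```ruby
# Hypothetical helper: split a tab-separated line into fields and
# strip the surrounding double quotes from each one.
def fields_from(line)
  line.chomp.split("\t").map { |field| field.gsub(/\A"|"\z/, '') }
end

line = "\"one\"\t\"1\"\t\" 0.1234\"\t\"0\"\t\"4\"\n"
fields = fields_from(line)

# Arrays are zero-indexed, so the 4th and 5th values are at 3 and 4.
fourth = fields[3]
fifth  = fields[4]

puts fourth  # 0
puts fifth   # 4
```

The same helper should carry over to lines with more values, since only the index lookups change.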

 

Regards

-Alan

 

   _____  

From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Paul Carvalho
Sent: Thursday, March 15, 2007 9:30 AM
To: [email protected]
Subject: [Wtr-general] OT: Regular Expression help

 

Hi there.  After several hours of looking at this problem, I've decided to ask 
for some help.

Here's the problem:  I have an input text file that has a series of values 
stored like this:
"one"   "1"   " 0.1234"   "0"   "4"
"two"   "3"   "1.3333"   "1"   "0"
...

I want the values in the 4th and 5th quotes on each line.  I originally thought 
about dumping each line 'split' into an array and working with the array, but 
then I thought it might save me time if I could just figure out the regular 
expression to get the right values. 

I've read through an online Regular Expression tutorial, reviewed a few books, 
and downloaded two apps (PowerGREP and Regexile) to help me try and figure this 
out but so far no luck.

Here's the line I started with: 
line =~ /^\"[^"]*"\t\"[^"]*"\t\"[^"]*"\t\"([^"]*)"\t\"([^"]*)"\t/

=> Expect $1 and $2 to hold the values I want (the bracketed groups in the regex)

- I tried switching the \t with \s but no luck
- tried adding and removing extra backslashes around the quotes, but nothing
- tried adding and removing all sorts of other characters but still can't get 
it to work. 

Can anyone help me figure out how to parse these input lines in a quick and 
efficient way?  I wanted to avoid having to rely on arrays, but I'm ready to 
give up and use them right about now.

Please let me know.  Thanks in advance.  Paul C. 

(P.S. the *actual* input file has something like 20 values on each line.  If I 
can figure out the pattern above for the simplified input file, I'm sure I can 
apply it to the larger real input file.)



_______________________________________________
Wtr-general mailing list
[email protected]
http://rubyforge.org/mailman/listinfo/wtr-general
