[plug] Grep not following my regex
Tim Bowden
tim.bowden at westnet.com.au
Tue Sep 15 11:49:20 WST 2009
On Tue, 2009-09-15 at 11:09 +0800, Gregory Orange wrote:
> Tim Bowden wrote:
> > On Tue, 2009-09-15 at 09:38 +0800, Tim Bowden wrote:
> >> On Tue, 2009-09-15 at 09:11 +0800, Tim wrote:
> >>> I have been doing some string extraction for an application I've
> >>> written. I needed to extract a number (with decimal point) from a one
> >>> line string. I had the grep working, but then having moved to another
> >>> computer, it stopped working. I did some debugging on the computer I'd
> >>> moved to, to see why my application no longer worked, and narrowed it
> >>> down to this grep.
> >>> egrep -o '[0-9.]*'
> >>>
> >>> By changing the grep to the following, I managed to get it working again.
> >>> egrep -o '[0-9.]+'
> >>>
> >>> Now I know that the + may make more sense now, by forcing a match, but
> >>> I can't see why the first regex stopped working.
> >>>
> >>> The string it's matching against is:
> >>> You currently have 347.454MB remaining
> >>> and obviously it just pulls out the 347.454.
> >>>
> >>> Other than bash stealing the * from the grep (which I am sure
> >>> shouldn't be happening due to the quotes), can someone let me know why
> >>> the first regex stopped working? (Also, I was moving from an Ubuntu
> >>> system to a Fedora system)
> >>>
> >>> Thanks
> >>>
> >>> Tim
> >> The '.' has special meaning in a regex. Escape it to make it just a
> >> '.'.
> >>
> >> Without looking at your problem in detail, here is a possible regex for
> >> a decimal point number.
> >>
> >> grep -o [0-9]+\.[0-9]+
> >
> > Meh. What a load of horse shit. \ doesn't escape the '.' at all; Perl
> > it ain't. And it should be egrep. Happens to work anyway, so long as
> > your strings are as you say (otherwise it might not be quite what you're
> > after...).
> >
> > <hangs head in shame>
> > Tim Bowden
>
> I don't understand your retraction. \ does escape . in grep. I always
> use egrep but I tried straight grep, just in case:
> grep -o "[0-9][0-9]*\.[0-9][0-9]*"
> grep -o "[0-9][0-9]*.[0-9][0-9]*"
>
> give different results, but substituting grep for egrep doesn't change
> anything - GNU grep 2.5.1 on SLES9 if it's relevant.
>
> Also, my understanding is that the + symbol is extended regex, so .+ is
> not as portable as ..*
>
> Oh hang on... inside the [] . is different. Now I'm confused. Reading
> OP's original regex tho, it looks like it would match on something like
> "..3.4..5.." - not exactly a decimal number. Doesn't work for me tho (:
>
> Cheers,
> Greg.
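Do grep \. testfile
and it will match every char in the file. Typed unquoted like that, the
shell strips the '\' before grep ever sees it, so grep is handed a bare
'.'. Quote the pattern (grep '\.' testfile) and the '\' does escape the
'.', which is why your quoted examples give different results. Using [.]
does indeed treat it as a literal. In any event, grep -o prints every
match on its own line, so there is a problem if there is more than one
number in any input line.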
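
A rough sketch of the quoting difference and the more-than-one-number
case, for anyone who wants to try it. The 'a.b' test string and the
second sample line are made up for illustration; only the first sample
line is from the original post. It assumes a grep that supports -o (GNU
and BSD grep do; it is not in POSIX).

```shell
#!/bin/sh
# Sample inputs: "line" is the string from the original post; "two" is a
# made-up line with two numbers in it.
line='You currently have 347.454MB remaining'
two='Used 12.5MB, you currently have 347.454MB remaining'

# Unquoted: the shell removes the backslash, so grep receives the pattern "."
# and every character of "a.b" matches. tr joins the per-match lines.
unquoted=$(printf 'a.b\n' | grep -o \. | tr -d '\n')

# Quoted: grep receives \. and only the literal dot matches.
quoted=$(printf 'a.b\n' | grep -o '\.' | tr -d '\n')

# A bracket expression gives a literal dot without needing a backslash.
bracket=$(printf 'a.b\n' | grep -o '[.]' | tr -d '\n')

# Pull out the decimal number; head -n 1 keeps only the first match when
# the line holds more than one.
number=$(printf '%s\n' "$line" | grep -oE '[0-9]+\.[0-9]+')
first=$(printf '%s\n' "$two" | grep -oE '[0-9]+\.[0-9]+' | head -n 1)

echo "unquoted=$unquoted quoted=$quoted bracket=$bracket"
echo "number=$number first=$first"
```

Whether the first or the last number on a line is the one you want
depends on the input, so the head -n 1 is only one possible choice.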
Tim Bowden