[plug] Grep not following my regex

Gregory Orange gregory.orange at rpsmetocean.com
Tue Sep 15 11:09:48 WST 2009


Tim Bowden wrote:
> On Tue, 2009-09-15 at 09:38 +0800, Tim Bowden wrote:
>> On Tue, 2009-09-15 at 09:11 +0800, Tim wrote:
>>> I have been doing some string extraction for an application I've
>>> written. I needed to extract a number (with decimal point) from a one
>>> line string. I had the grep working, but then having moved to another
>>> computer, it stopped working. I did some debugging on the computer I'd
>>> moved to, to see why my application no longer worked, and narrowed it
>>> down to this grep.
>>> egrep -o '[0-9.]*'
>>>
>>> By changing the grep to the following, I managed to get it working again.
>>> egrep -o '[0-9.]+'
>>>
>>> Now I know that the + may make more sense now, by forcing a match, but
>>> I can't see why the first regex stopped working.
>>>
>>> The string it's matching against is:
>>> You currently have 347.454MB remaining
>>> and obviously it just pulls out the 347.454.
>>>
>>> Other than bash stealing the * from the grep (which I am sure
>>> shouldn't be happening due to the quotes), can someone let me know why
>>> the first regex stopped working? (Also, I was moving from an Ubuntu
>>> system to a Fedora system)
>>>
>>> Thanks
>>>
>>> Tim
>> The '.' has special meaning in a regex.  Escape it to make it just a
>> '.'.
>>
>> Without looking at your problem in detail, here is a possible regex for
>> a decimal point number.
>>
>> grep -o [0-9]+\.[0-9]+
> 
> Meh.  What a load of horse shit.  \ doesn't escape the '.' at all; Perl
> it ain't.  And it should be egrep.  Happens to work anyway, so long as
> your strings are as you say (otherwise it might not be quite what you're
> after...).
> 
> <hangs head in shame>
> Tim Bowden

I don't understand your retraction. \ does escape . in grep. I always
use egrep but I tried straight grep, just in case:
grep -o "[0-9][0-9]*\.[0-9][0-9]*"
grep -o "[0-9][0-9]*.[0-9][0-9]*"

give different results, and swapping grep for egrep doesn't change
anything. This is GNU grep 2.5.1 on SLES9, if it's relevant.
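
To make that difference concrete, here is a minimal sketch. The sample
string is made up for illustration (it is not the OP's data): it holds
one real decimal plus a token with an 'x' where the dot would be.

```shell
# Hypothetical sample: a real decimal, then 'x' in place of the dot.
sample='347.454 347x454'

# Escaped dot: only a literal '.' is accepted between the digit runs.
escaped=$(printf '%s\n' "$sample" | grep -o '[0-9][0-9]*\.[0-9][0-9]*')

# Unescaped dot: any single character is accepted between the digit runs.
unescaped=$(printf '%s\n' "$sample" | grep -o '[0-9][0-9]*.[0-9][0-9]*')

printf 'escaped:\n%s\n' "$escaped"
printf 'unescaped:\n%s\n' "$unescaped"
```

With the escape only 347.454 should come back; without it 347x454 is
picked up as well.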

Also, my understanding is that + is an extended-regex operator, so .+
(which needs egrep or grep -E) is not as portable as ..* (which works
in basic regex too).
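
A rough sketch of the portable spellings, using the OP's line as input.
It assumes a grep that supports -o, which is itself an extension rather
than part of POSIX grep; the variable names are just for illustration.

```shell
line='You currently have 347.454MB remaining'

# ERE: '+' means "one or more"; needs grep -E (or egrep).
ere=$(printf '%s\n' "$line" | grep -oE '[0-9]+\.[0-9]+')

# BRE: spell "one or more" as the atom followed by the same atom starred.
bre=$(printf '%s\n' "$line" | grep -o '[0-9][0-9]*\.[0-9][0-9]*')

# BRE with an interval expression, another way to say "one or more".
bre_interval=$(printf '%s\n' "$line" | grep -o '[0-9]\{1,\}\.[0-9]\{1,\}')

printf '%s\n' "$ere" "$bre" "$bre_interval"
```

All three should print the same number, so the choice is mostly about
which regex flavour you want to depend on.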

Oh hang on... inside [] the . is different: it is a literal dot there,
no escaping needed. Now I'm confused. Reading the OP's original regex
though, it looks like it would match something like "..3.4..5..",
which is not exactly a decimal number. That regex doesn't work for me
though (:
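
A small sketch of both points, on made-up input. It also shows one thing
that may be behind the * trouble: [0-9.]* is allowed to match zero
characters, and how grep -o treats an empty match could plausibly differ
between versions. That last part is a guess on my side, not something I
have confirmed.

```shell
# Hypothetical sample text, chosen to show what the bracket accepts.
sample='a..3.4..5..b 347x454'

# Inside [...] the dot is a literal dot, so 'x' never matches [0-9.].
loose=$(printf '%s\n' "$sample" | grep -oE '[0-9.]+')

# Digits, one literal dot, digits: closer to "a decimal number".
strict=$(printf '%s\n' "$sample" | grep -oE '[0-9]+\.[0-9]+')

# '*' allows zero characters, so the pattern matches a line with no digits;
# '+' requires at least one character from the bracket.
star_lines=$(printf 'no digits here\n' | grep -cE '[0-9.]*')
plus_lines=$(printf 'no digits here\n' | grep -cE '[0-9.]+' || true)

printf 'loose:\n%s\nstrict:\n%s\n' "$loose" "$strict"
printf 'lines matched by *: %s, by +: %s\n' "$star_lines" "$plus_lines"
```

The loose bracket picks up "..3.4..5.." whole and splits 347x454 into
two digit runs, while the stricter pattern only returns 3.4.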

Cheers,
Greg.


