[plug] Re: trouble with searching for non-ascii characters in a text file
David Buddrige
buddrige at wasp.net.au
Fri May 16 16:05:48 WST 2003
thanks everyone, I'll give these a go. 8-)
regards
David.
Tony Breeds writes:
> On Fri, May 16, 2003 at 01:03:48PM +0800, David Buddrige wrote:
>> Hi all,
>>
>> I have some html files that contain odd [non-ascii] characters here and
>> there. The web-browser displays them as "?" in the html page. My text
>> editor uses another character to represent that particular character. I
>> want to find out how to determine what exact hexadecimal value that
>> character evaluates to, and then how to grep on that hex-value - rather
>> than its ascii equivalent. Is this possible?
>
> You can use "od -x" to work out the hex values, but that output isn't
> really very machine readable.
>
> But I think you'd probably benefit from something like:
> ---
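> sub HEX($) {
>     my $chr = shift;
>     my $val = ord($chr);
>     # Printable ASCII (space through tilde) passes through unchanged.
>     # Note "and", not "or": with "or" the test is true for every value.
>     if ($val >= 32 and $val < 127) {
>         return $chr;
>     } else {
>         #return sprintf("&#%d;", $val);  # HTML
>         return sprintf("0x%x;", $val);   # 'C' hex
>     }
> }
>
> # Rewrite every character of every input line through HEX().
> while (<>) {
>     s/./HEX($&)/eg;
>     print;
> }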
> ---
>
> It's ugly and untested but it should take any non-printing chars and
> replace them with their hex equivalents. You could of course drop the
> else clause and then get rid of them altogether.
>
> Yours Tony
>
> Linux.Conf.AU http://lca2004.linux.org.au/
> Jan 12-17 2004 The Australian Linux Technical Conference!
>
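
For the original question (find the hex value of the odd character, then
grep on that value), here is a rough command-line sketch. It assumes GNU
grep and a POSIX od/tr, and it uses a made-up file name (sample.html) and a
made-up offending byte (0xE9) purely for illustration; substitute your own
file and whatever value od reports.

```shell
# Build a small sample file containing one non-ASCII byte (octal 351 = hex E9).
# sample.html is a hypothetical name used only for this demonstration.
printf 'caf\351 menu\nplain line\n' > sample.html

# Step 1: strip every 7-bit ASCII byte and dump what is left as hex,
# one byte per column. For the sample file this prints the value e9.
LC_ALL=C tr -d '\000-\177' < sample.html | od -An -t x1

# Step 2: grep for that exact byte. printf turns the octal escape into the
# raw byte, LC_ALL=C makes grep compare bytes rather than multibyte
# characters, and -a stops grep from treating the file as binary.
LC_ALL=C grep -an "$(printf '\351')" sample.html
```

printf takes octal escapes, so a hex value from od has to be converted
first (hex E9 is octal 351). If your grep was built with PCRE support,
"grep -P '\xE9'" under LC_ALL=C is another option, but that is an extra
assumption about your system.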