[plug] Using sed to remove \n from a text file.

Thu Jul 11 12:05:27 WST 2002

On Thu, 11 Jul 2002, Buddrige, David wrote:

> Hi all,
> 
> I have a text file which contains records of data from which I want to
> extract particular records.
> 
> For the most part, the records are one line each, with each successive line
> being a new record.
> 
> However, some of the "records" are very long, and in this case, they have
> been continued over to the next line by adding a "-" character at the end of
> the line to indicate that the record continues on the next line.
> 
> As a result, you have something like this:
> 
> Record1, FieldData1, FieldData2, FieldData3
> Record2, FieldData4, FieldData5, FieldData6
> Record3, FieldData7-
> FieldData8, FieldData9
> Record 4, FieldData10, FieldData11, FieldData12
> ....
> 
> and so on.
> 
> What I want to do is use a sed script to remove every occurrence of "-\n"
> and replace it with no characters at all.
> 
> By this means I hope to get those records that are split over multiple
> lines, to all be on one relatively large single line that I can then grep
> through, while still getting the entire record returned from grep.
> 
> To do this, I have written a sed command like this:
> 
> 
> 	sed s/-\n//g myfilename.txt > my_new_filename.txt
> 
> However, this does not seem to do what I want it to. 
> 
> Can anyone see an obvious error in this command?

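To the best of my knowledge sed won't do what you want as written. It reads
its input one line at a time and strips the trailing newline before running
the script, so s/-\n//g never sees a "\n" to match. (Unquoted, the shell also
eats the backslash, so sed actually receives s/-n//g.)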
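
That said, sed can be made to join lines by pulling the next line into the
pattern space with the N command. A minimal sketch, assuming a reasonably
POSIX-like sed; the function name join_continued is made up for illustration,
and the sample data is borrowed from the question:

```shell
#!/bin/sh
# join_continued is a hypothetical helper name, not from the original post.
# Reads records on stdin; any line ending in "-" is joined to the next line.
join_continued() {
	# :a        label to loop back to
	# /-$/N     if the line ends in "-", append the next input line
	# s/-\n//   remove the "-" and the embedded newline
	# ta        if that substitution happened, recheck the joined line
	sed -e :a -e '/-$/N; s/-\n//; ta'
}

# Sample data from the question.
printf '%s\n' \
	'Record1, FieldData1, FieldData2, FieldData3' \
	'Record3, FieldData7-' \
	'FieldData8, FieldData9' \
	'Record 4, FieldData10, FieldData11, FieldData12' |
	join_continued
```

The loop matters when a record spans more than two lines: after each join the
result is tested again for a trailing "-". The output can be piped straight
into grep, which was the original goal.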

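Try
perl -i.bak -0777 -p -e 's/-\n//g' myfilename.txt

This will:
 a. edit myfilename.txt in place, keeping the original as
    myfilename.txt.bak (-i.bak)
 b. read the whole file as a single record (-0777), so the newlines are
    visible to the substitution
 c. remove any/all "-\n"'s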

If the file is large (and reading the whole thing would be a bad thing (tm))
then try:

-----
#!/usr/bin/perl

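while (<>) {
	chomp;			# strip the trailing newline, if there is one
	if (s/-$//) {
		# continuation line: the "-" is now gone, so print
		# without a newline and let the next line follow on
		print;
	} else {
		print "$_\n";
	}
}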
-----

perl script <myfile.txt > my_new_file.txt

HTH

Yours Tony
