[plug] text processing
Sol
sol at autonomon.net
Mon Dec 9 17:39:23 WST 2002
---------- Forwarded Message ----------
Subject: Re: [plug] text processing
Date: Mon, 09 Dec 2002 17:38:18 +0800
From: Sol <sol at autonomon.net>
To: plug at plug.linux.org.au
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<title></title>
</head>
<body>
Thanks for your replies Ryan and Graham,<br>
<br>
I've tried both out with no luck yet. Here's what I did and what went
wrong:<br>
-----------------------------------------------------------------<br>
#First with Ryan's tip<br>
<br>
#!/bin/sh<br>
<br>
DOC_DIR=/home/sol/bswa_publications/DOC/<br>
HTML_DIR=/home/sol/bswa_publications/HTML/<br>
<br>
for FILE in `ls ${DOC_DIR}`<br>
do<br>
wvText ${FILE} | txt2html > ${HTML_DIR}${FILE}<br>
done<br>
<br>
## What I got here was a bunch of files in ../HTML/ with exactly the same
names as in ../DOC/. They contained only the bare minimum of HTML elements
and none of the document text, like this:<br>
<br>
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN"><br>
<HTML><br>
<HEAD><br>
<TITLE></TITLE><br>
<META NAME="generator" CONTENT="txt2html v1.28"><br>
</HEAD><br>
<BODY><br>
<P><br>
Usage: /usr/bin/wvText <word document> <text output file><br>
<br>
</BODY><br>
</HTML><br>
<br>
I've also tried "wvText < ${FILE}", which gives me a different error and still
converts no files, so I guess that's a backward step.<br>
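<br>
The usage message above suggests that wvText wants two arguments, an input document and an output text file, rather than writing to standard output. Also, `ls ${DOC_DIR}` yields bare file names with no directory part, and reusing ${FILE} for the output keeps the .doc extension. A minimal sketch of the loop under those assumptions follows; the function name convert_docs is hypothetical, mktemp is assumed to be available, and txt2html is assumed to read text on standard input as it does in the pipelines in this thread.<br>

```shell
#!/bin/sh
# Sketch only: convert every .doc in one directory to .html in another.
# Assumes wvText takes <word document> <text output file> (per its usage
# message) and that txt2html reads stdin and writes HTML to stdout.

convert_docs() {
    doc_dir=$1
    html_dir=$2
    tmp=$(mktemp) || return 1              # scratch file for wvText's text output

    for doc in "$doc_dir"/*.doc
    do
        [ -e "$doc" ] || continue          # skip if the glob matched nothing
        name=$(basename "$doc" .doc)       # /path/one.doc -> one
        wvText "$doc" "$tmp" &&
            txt2html < "$tmp" > "$html_dir/$name.html"
    done

    rm -f "$tmp"
}
```

It could then be called as: convert_docs /home/sol/bswa_publications/DOC /home/sol/bswa_publications/HTML<br>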
-----------------------------------------------------------------------<br>
<br>
-----------------------------------------------------------------------<br>
#Now Graham's<br>
<br>
#!/bin/ksh<br>
<br>
for $file in $(/home/sol/bswa_publications/DOC/*doc)<br>
do<br>
$file_out=${$file%doc}html<br>
wvText < /home/sol/bswa_publications/DOC.$file | txt2html > /home/sol/bswa_publications/$file_out<br>
done<br>
<br>
##I got the error<br>
DOCtoHTML.ksh: `$file': not a valid identifier<br>
<br>
I don't know any ksh at all so I don't even know which $file is causing the
problem.<br>
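<br>
The "not a valid identifier" error most likely comes from the loop header. In sh, ksh and bash a variable is written without the "$" when it is named as a loop variable or assigned to, and the expansion is ${file%doc}, not ${$file%doc}. Separately, $( ... ) runs a command, so a bare glob inside it would try to execute the first matching file; the glob can be used directly after "in". A glob such as ../docs/*doc expands to names that include the directory part, so the sketch below strips it first. The function name html_name is hypothetical and only illustrates the parameter expansion.<br>

```shell
#!/bin/sh
# Sketch only: derive the output name for one input file.
html_name() {
    file=$1                    # no "$" on the left of an assignment
    base=${file##*/}           # drop leading directories: ../docs/one.doc -> one.doc
    file_out=${base%doc}html   # swap the trailing "doc" for "html": one.doc -> one.html
    printf '%s\n' "$file_out"
}
```

Inside a loop this would be used as: for file in /home/sol/bswa_publications/DOC/*doc; do file_out=$(html_name "$file"); ...; done<br>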
-----------------------------------------------------------<br>
It's interesting to see different solutions for the same simple problem.
Ta.<br>
<br>
sol<br>
<br>
<br>
<br>
<br>
Graham, Alan A. wrote:<br>
<blockquote type="cite"
cite="mid41D499B16A81D4118D2500805F0DD05C084E5AE0 at perm03.woodside.com.au">
<pre wrap="">!#/bin/ksh
# I can't test this cos I'm on an NT box :-(
# I know korn shell, but I understand bash can run ksh scripts
#
for $file in $(ls ../docs/*doc)
do
$file_out=${$file%doc}html
wvText < ../docs.$file | txt2html > ../html/$file_out
done
I can't remember if the output from straight ls looks like ../doc/one.doc or
just one.doc. The code assumes one.doc. And I say again, I can't test this
cos I don't have access to a real OS at this site.
Alan
</pre>
<blockquote type="cite">
<pre wrap="">-----Original Message-----
From: Sol [<a class="moz-txt-link-abbreviated"
href="mailto:SMTP:sol at autonomon.net">SMTP:sol at autonomon.net</a>]
Sent: Monday, 9 December 2002 16:24
To: <a class="moz-txt-link-abbreviated"
href="mailto:plug at plug.linux.org.au">plug at plug.linux.org.au</a>
Subject: [plug] text processing
Hi PLUG,
I have a bunch of M$ Word files in a directory that I want to clean up
and output to another empty directory as HTML. I've been doing everything
the slow way using command line tools, but I'm sure that it can all be
done with a single command. I'm using wvText and txt2html. I want to pipe
all the files in the directory in order through wvText into the empty
directory and then pass everything in that directory through txt2html.
If I've got these files: one.doc, two.doc and three.doc in ../docs/ and
want them to end up as HTML in ../html/ as one.html, two.html and
three.html, how can I do this with a single command?
Thanks,
sol
</pre>
</blockquote>
<pre wrap=""><!---->
</pre>
</blockquote>
<br>
<br>
</body>
</html>