[plug] PDF to TXT
Steve Grasso
steveg at calm.wa.gov.au
Thu Aug 3 12:04:17 WST 2000
Mike,
> Anyone recommend a good (cheap --> free, it's a one-off use) PDF to TXT
> ripper? The ones I have found so far on freshmeat either mean I send the
> PDF to them or I pay US$500 ... (ouch)
I hunted down an open-source PDF-to-HTML ripper a while back.
The site (http://www.ra.informatik.uni-stuttgart.de/~gosho/pdftohtml/) appears
to be unavailable, but I have the source tarball (~250k) if you're
interested.
Blurb from the author:
Pdftohtml v. 0.22 converts Portable Document Format files to HTML. This
release converts text and links. Bold and italic face are preserved. Pdftohtml
v.0.21 extracts all images as JPEG or PNG. Currently "pdf" vector drawings are
not extracted. The current version is tested on Linux and Solaris 2.6.
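Since you're after TXT rather than HTML, the converter's output would still need its tags stripped. A rough sketch of that step in Python, assuming the output is ordinary HTML; the names TextExtractor and html_to_text are my own and have nothing to do with pdftohtml itself:

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects text content, skipping <script> and <style> bodies."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1
        elif tag in ("br", "p", "div"):
            # Treat block-level tags and line breaks as newlines.
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.parts.append(data)


def html_to_text(html):
    """Return the visible text of an HTML string, one non-empty line per row."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    text = "".join(parser.parts)
    # Trim each line and drop the empty ones.
    lines = [line.strip() for line in text.splitlines()]
    return "\n".join(line for line in lines if line)
```

Bold and italic markup is simply dropped, so whatever formatting the converter preserved is lost at this step, which is fine for a one-off text dump.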
Regards,
Steve