[plug] wget query
Nick Bannon
nick at ucc.gu.uwa.edu.au
Fri Sep 3 18:11:22 WST 1999
On Fri, Sep 03, 1999 at 05:01:24PM +0800, Matt Kemner wrote:
> On Fri, 3 Sep 1999, Bret Busby wrote:
> > [wget] is quite useful; except where a path includes extended ASCII
> > characters, such as the tilde.
>
> I've never had a problem with wget and websites containing ~
> Can you let me know (either in private or on the list) what website you
> are trying to download, and what errors wget is giving you?
[...]
This is very good advice.
FWIW, the way I usually use wget is:
wget -m --no-parent <URL>
-m is for mirror, which turns on recursion (among other things);
--no-parent makes it start at that location and work downwards, so it
never climbs into parent directories and tries to download the whole
site (unless I give it the URL of the whole site).
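As a rough illustration of the "start here and only work down" idea, here is a minimal Python sketch. It is not wget's actual code: the function name below_start is made up for this example, and it assumes a simplified rule (same host, and the path must begin with the starting directory) that ignores things like ".." segments.

```python
from urllib.parse import urlsplit


def below_start(start_url: str, candidate_url: str) -> bool:
    """Return True if candidate_url lies at or below the directory of start_url.

    Hypothetical helper: a simplified model of a --no-parent style check,
    not taken from wget itself.
    """
    start = urlsplit(start_url)
    cand = urlsplit(candidate_url)

    # Assumption: links to a different host are never followed.
    if start.hostname != cand.hostname:
        return False

    # Directory part of the starting path, always ending in "/".
    start_path = start.path or "/"
    start_dir = start_path.rsplit("/", 1)[0] + "/"

    return (cand.path or "/").startswith(start_dir)


start = "http://www.ucc.gu.uwa.edu.au/~nick/test/"
print(below_start(start, "http://www.ucc.gu.uwa.edu.au/~nick/test/file1.html"))
print(below_start(start, "http://www.ucc.gu.uwa.edu.au/~nick/"))
```

The first call is below the starting directory and the second is its parent, so only the first would be followed under this model.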
I have given it URLs with a ~ in them plenty of times (i.e. a user home
directory), and it downloads them fine but, yes, it does convert the ~
into %7E .
Hence:
wget -m --no-parent http://www.ucc.gu.uwa.edu.au/~nick/test/
produces this directory tree:
www.ucc.gu.uwa.edu.au/
    %7Enick/
        test/
            index.html
            file1.html
            file2.html
            file3.html
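To make the mapping from URL to local path concrete, here is a small Python sketch. It only models the behaviour described in this message (host name as the top directory, ~ stored as %7E); the function name local_path and the rule of appending index.html to a directory URL are assumptions for illustration, not wget's real logic.

```python
from urllib.parse import urlsplit


def local_path(url: str) -> str:
    """Guess the local file path a mirrored URL would be saved under.

    Hypothetical helper modelling the layout shown above.
    """
    parts = urlsplit(url)

    # The ~ in the URL path shows up on disk in its escaped form.
    path = parts.path.replace("~", "%7E")

    # Assumption: a URL ending in "/" is saved as index.html in that directory.
    if path == "" or path.endswith("/"):
        path += "index.html" if path else "/index.html"

    return parts.hostname + path


print(local_path("http://www.ucc.gu.uwa.edu.au/~nick/test/"))
print(local_path("http://www.ucc.gu.uwa.edu.au/~nick/test/file1.html"))
```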
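The reason is that URLs are tightly defined and can't contain just any
old character. "Special" characters are escaped, or "stuffed", by sending
% followed by the ASCII value of that character as two hex digits. wget
treats ~ (hex 7E) as one of them, hence %7E. Strictly, the older RFC 1738
lists ~ as unsafe while RFC 2396 allows it unescaped, but both forms mean
the same thing. If you needed to send a literal %, you'd have to send %25 .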
For the full details, refer to RFC-2396.
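The escaping rule can be sketched in a few lines of Python. This is only an illustration: the function name escape_path and the SAFE set are assumptions, chosen so that ~ is escaped the way wget does it here (Python's own urllib.parse.quote leaves ~ alone in recent versions). It handles ASCII input only.

```python
from urllib.parse import unquote

# Assumption: characters in this set are passed through untouched;
# everything else (including ~ and %) is escaped.
SAFE = set(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789"
    "-_./"
)


def escape_path(path: str) -> str:
    """Replace each character outside SAFE with % plus its two-digit hex ASCII value."""
    out = []
    for byte in path.encode("ascii"):
        ch = chr(byte)
        if ch in SAFE:
            out.append(ch)
        else:
            out.append("%{:02X}".format(byte))
    return "".join(out)


print(escape_path("/~nick/test/"))           # /%7Enick/test/
print(escape_path("%"))                      # %25
print(unquote(escape_path("/~nick/test/")))  # /~nick/test/
```

The last line uses the standard library's unquote to show that the escaped form decodes back to the original path.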
Nick.
--
Nick Bannon | "I made this letter longer than usual because
nick at it.net.au | I lack the time to make it shorter." - Pascal