[plug] Re: mapping out a website
David Buddrige
buddrige at wasp.net.au
Thu Jun 10 14:04:38 WST 2004
I have been experimenting with the --spider option, but couldn't get it to
work. Here's a transcript of running wget from my ISP shell account:
[buddrige at wasp buddrige]$ wget -r --spider -o test.txt http://www.gnu.org
[buddrige at wasp buddrige]$ cat test.txt
--13:57:48-- http://www.gnu.org/
=> `www.gnu.org/index.html'
Resolving www.gnu.org... done.
Connecting to www.gnu.org[199.232.41.10]:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 12,756 [text/html]
200 OK
www.gnu.org/index.html: No such file or directory
FINISHED --13:57:58--
Downloaded: 0 bytes in 0 files
[buddrige at wasp buddrige]$
I wasn't sure what to search for for this task; I'll try searching for
"web mapping" on Google. I'm also a bit confused by wget's behaviour
here. I was primarily interested in wget because it is [theoretically]
scriptable.
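[Not from the original thread, but a workaround sketch, assuming GNU wget and
standard grep/sort: the transcript above suggests that --spider never saves
the fetched page, so the -r recursion has nothing to parse links from. One
commonly suggested alternative is a plain recursive fetch with --delete-after,
keeping only the log, and then pulling the visited URLs out of that log. The
log format below is copied from the transcript; the philosophy.html lines are
made-up stand-ins for a real run.]

# A real run would look something like:
#   wget -r -l 2 --delete-after -o logfile.txt http://abc.com/
# Here a hypothetical log excerpt stands in for that run's output:
cat > logfile.txt <<'EOF'
--13:57:48--  http://www.gnu.org/
           => `www.gnu.org/index.html'
--13:57:52--  http://www.gnu.org/philosophy/philosophy.html
--13:57:52--  http://www.gnu.org/philosophy/philosophy.html
EOF

# Each fetch starts a "--HH:MM:SS--  URL" line in the log;
# extract the URLs and deduplicate them to get the site map list.
grep -o 'http://[^ ]*' logfile.txt | sort -u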
thanks
David.
Mark O'Shea writes:
> On Thu, 10 Jun 2004, David Buddrige wrote:
>> I have been asked to map out all the pages in a given intranet website. So
>> for example, given website url:
>>
>> http://abc.com/
>>
>> They want a list of every url that can be got at from the links on the
>> initial page, sort of like this:
>>
> Would this work for you?:
> wget -r --spider -o logfile.txt http://abc.com/
>
> Have you tried searching google for website mapping or similar?
>
> Regards,
> --
> Mark O'Shea
> _______________________________________________
> PLUG discussion list: plug at plug.linux.org.au
> http://mail.plug.linux.org.au/cgi-bin/mailman/listinfo/plug
> Committee e-mail: committee at plug.linux.org.au