[plug] mapping out a website

Garry garbuck at westnet.com.au
Thu Jun 10 13:49:18 WST 2004


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

David Buddrige wrote:
| Hi all,
| I have been asked to map out all the pages in a given intranet website.
| So for example, given website url:
| http://abc.com/
| They want a list of every url that can be got at from the links on the
| initial page, sort of like this:
| http://abc.com/
|  http://abc.com/page1.html
|  http://abc.com/page2.html
|     http://abc.com/page2a.html
|     http://abc.com/page2b.html
|     http://abc.com/page2c.html
|  http://abc.com/page3.html
|  http://abc.com/page4.html
|     http://abc.com/page4a.html
|     http://abc.com/page4b.html
|        http://abc.com/page4b1.html
|     http://abc.com/page4c.html
| And so on, mapping out the structure of links in the website.
| It seemed to me that this ought to be something that is scriptable -
| most likely using wget or something... I have been experimenting with
| wget, however I have not been able to determine a way of just getting
| the URLs as opposed to actually downloading the entire page...
| Does anyone know if wget can be used just to map out the tree of URLs
| in a given website, as opposed to fully downloading and mirroring the
| entire website?
| I've been poring over the wget manual, but to no avail... is there a
| similar command that is more appropriate to what I am trying to do?
| thanks heaps guys
| David.
It isn't Linux, but here is something:


http://www.spadixbd.com/elink/
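
If something scriptable on Linux is preferred, the kind of link map asked
about above can also be roughed out with nothing but the Python standard
library. What follows is only a minimal sketch, not a tested tool: the names
LinkParser, fetch, map_site and max_depth are made up for illustration, and
it assumes the site uses plain <a href="..."> links, that only pages on the
starting host matter, and that each URL should be listed once, under the
first page found linking to it. It still has to download each HTML page to
read its links, but it stores nothing on disk.

```python
import sys
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collects the href of every <a> tag in a page (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def fetch(url):
    """Return the HTML of url as text, or "" if it fails or is not HTML."""
    try:
        with urlopen(url, timeout=10) as resp:
            if "html" not in resp.headers.get("Content-Type", ""):
                return ""
            return resp.read().decode("utf-8", errors="replace")
    except (OSError, ValueError):
        return ""


def map_site(start_url, fetch_page=fetch, max_depth=5):
    """Return (depth, url) pairs in the order pages are first reached."""
    host = urlparse(start_url).netloc
    seen = {start_url}
    tree = []

    def visit(url, depth):
        tree.append((depth, url))
        if depth >= max_depth:
            return
        parser = LinkParser()
        parser.feed(fetch_page(url))
        # Claim all new same-host links for this page before descending,
        # so siblings stay at the same level of the tree.
        children = []
        for href in parser.links:
            link = urldefrag(urljoin(url, href)).url  # absolute, no #fragment
            if urlparse(link).netloc == host and link not in seen:
                seen.add(link)
                children.append(link)
        for child in children:
            visit(child, depth + 1)

    visit(start_url, 0)
    return tree


if __name__ == "__main__" and len(sys.argv) > 1:
    # Print the map indented by depth, one URL per line.
    for depth, url in map_site(sys.argv[1]):
        print("    " * depth + url)
```

Saved under a name of your choosing and run with the start URL as its only
argument, it prints one URL per line, indented by link depth. Links that are
generated by JavaScript, or that sit behind a login, will not be seen.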

HTH

Garry

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (GNU/Linux)

iD8DBQFAx/ZevdH9DANniC8RArTIAJ9jSzaTx7LJxe9pf4Bwu72zTz7GIQCfYQIK
9Ds30tyBbh/uvEOI37GU1bU=
=4j9F
-----END PGP SIGNATURE-----


