[plug] mirroring and updating a remote http directory

Marc Wiriadisastra marc.w at smlintl.com.au
Fri Aug 6 11:32:17 WST 2004


I remember a script that could detect altered files by comparing their
date and time stamps.  Could the same concept be used in this situation?
I by no means know how to do it; I'm just putting it forward as a
suggestion for those who do.
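
One way the timestamp idea could be tried, since the quoted message says
the Apache directory listing still shows modification times: read the
times from the listing page itself and compare them with the local
files.  This is only a rough sketch.  The row format (an <a href="...">
link followed by a date like 06-Aug-2004 11:32) is an assumption about
the server's index page, and parse_listing and stale_files are made-up
names.  It also assumes the server's clock and timezone match the
mirror's, and that strptime runs under an English locale for the month
names.

```python
import os
import re
from datetime import datetime

# Assumed listing row:  <a href="NAME">NAME</a>   06-Aug-2004 11:32   2.1M
# Links containing "/" or "?" (subdirectories, column-sort links) are skipped.
ROW = re.compile(
    r'<a href="([^"?/]+)">.*?</a>\s+(\d{2}-[A-Za-z]{3}-\d{4} \d{2}:\d{2})'
)

def parse_listing(html):
    """Map each listed file name to the timestamp shown in the index page."""
    stamps = {}
    for name, stamp in ROW.findall(html):
        stamps[name] = datetime.strptime(stamp, "%d-%b-%Y %H:%M")
    return stamps

def stale_files(stamps, local_dir):
    """Return names that are missing locally or older than the listing says."""
    stale = []
    for name, remote_time in sorted(stamps.items()):
        path = os.path.join(local_dir, name)
        if not os.path.exists(path):
            stale.append(name)
        elif datetime.fromtimestamp(os.path.getmtime(path)) < remote_time:
            stale.append(name)
    return stale
```

The HTML of the index page would be fetched first (with wget or
urllib), passed to parse_listing, and only the names returned by
stale_files would then be downloaded again.  File names that are
URL-encoded in the listing would need decoding, which is left out here.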

> I'm trying to mirror a remote HTTP directory and the files contained
> within it.  Now that's alright, I can do that with wget fine and dandy
> (with -r), but I want to be able to get wget, at set intervals, to update
> any changed contents of that remote HTTP directory.  Now the problem
> is that for some reason the "Last-Modified" header is missing on the
> files, even though the directory listing page generated by the remote
> server (which is Apache) shows the last-modified timestamp.  But I
> don't have control over the remote web server, so I can't change
> anything at that end.
>
> So what happens is that if you get wget to go and look for changes and
> download only changed files, it can't get the timestamp on the files,
> so it downloads every single file again.  According to the wget
> man page, wget is supposed to look at the timestamp and/or the file
> size when figuring out whether a file has been modified since it was
> last retrieved.  But it doesn't seem to pay attention to the
> Content-Length info, even though it does get that.
>
> The reason why it's a problem is that the files contained in the
> directory are a couple of MB each and there are quite a lot of them, so
> I don't want to be re-mirroring possibly 100-150MB each time, when there
> might be only 5MB of changed files.
>
> Anyone got any suggestions of how I could overcome this?  Alternative
> tools etc?
>
> TIA.
>
>
> / Ben
>
>
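
On the Content-Length point in the quoted message: if the server does
send that header, the size check could be done outside wget.  Below is
a minimal sketch, not a tested mirroring tool.  remote_size and
size_changed are hypothetical helper names, and it assumes the server
answers HEAD requests with a Content-Length header.

```python
import os
import urllib.request

def remote_size(url):
    """Return the Content-Length reported for url via HEAD, or None."""
    req = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(req) as resp:
        length = resp.headers.get("Content-Length")
    return int(length) if length is not None else None

def size_changed(path, length):
    """True if the local file is missing, the size is unknown, or sizes differ."""
    if length is None or not os.path.exists(path):
        return True
    return os.path.getsize(path) != length
```

A file would be fetched again only when size_changed(local_path,
remote_size(url)) is true.  The obvious weakness is that a file edited
without changing its length would be missed, so this is probably best
combined with a timestamp check rather than used on its own.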



