OT: wget bug
Joe R. Jah
jjah at cloud.ccsf.cc.ca.us
Sun Jul 19 06:17:17 UTC 2009
On Sat, 18 Jul 2009, Karl Vogel wrote:
> Date: Sat, 18 Jul 2009 19:34:24 -0400 (EDT)
> From: Karl Vogel <vogelke+unix at pobox.com>
> To: freebsd-questions at freebsd.org
> Subject: Re: OT: wget bug
>
> >> On Sat, 18 Jul 2009 09:41:00 -0700 (PDT),
> >> "Joe R. Jah" <jjah at cloud.ccsf.cc.ca.us> said:
>
> J> Do you know of any workaround in wget, or an alternative tool to ONLY
> J> download newer files by http?
>
> "curl" can help for things like this. For example, if you're getting
> just a few files, fetch only the header and check the last-modified date:
>
> me% curl -I http://curl.haxx.se/docs/manual.html
> HTTP/1.1 200 OK
> Proxy-Connection: Keep-Alive
> Connection: Keep-Alive
> Date: Sat, 18 Jul 2009 23:24:24 GMT
> Server: Apache/2.2.3 (Debian) mod_python/3.2.10 Python/2.4.4
> Last-Modified: Mon, 20 Apr 2009 17:46:02 GMT
> ETag: "5d63c-b2c5-1a936a80"
> Accept-Ranges: bytes
> Content-Length: 45765
> Content-Type: text/html; charset=ISO-8859-1
>
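A minimal sketch of scripting that header check, assuming a Bourne-style
shell (the URL is just the example above):

  #!/bin/sh
  # Fetch only the headers (-I), quietly (-s), and pull out Last-Modified.
  url=http://curl.haxx.se/docs/manual.html
  curl -sI "$url" | grep -i '^Last-Modified:'
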
> You can download files only if the remote one is newer than a local copy:
>
> me% curl -z local.html http://remote.server.com/remote.html
>
> Or only download the file if it was updated since Jan 12, 2009:
>
> me% curl -z "Jan 12 2009" http://remote.server.com/remote.html
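A sketch of how -z combines with -R (set the saved file's timestamp from the
server's Last-Modified) and -O (save under the remote name), so that the next
run compares against the server's own modification time; the host and file
name are placeholders:

  me% curl -R -O -z remote.html http://remote.server.com/remote.html
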
>
> Curl tries to use persistent connections for transfers, so put as many
> URLs on the same line as you can if you're looking to mirror a site. I
> don't know how to make curl do something like walking a directory for a
> recursive download.
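A sketch of the multi-URL form, assuming the files share one host so the
connection can be reused; one -O goes with each URL, and the names are
placeholders:

  me% curl -R -O http://remote.server.com/a.html \
           -O http://remote.server.com/b.html
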
>
> You can get the source at http://curl.haxx.se/download.html
Thank you, Karl.  I already have curl installed, but I don't believe it can
get an entire website by just giving it the base URL.
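For the whole-site case, the closest thing appears to be wget's own recursion
plus timestamping, assuming the server sends usable Last-Modified headers
(and setting aside whatever the original -N trouble was); the URL is a
placeholder:

  % wget -r -N -np http://remote.server.com/docs/
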
Regards,
Joe
--
Joe R. Jah  <jjah at cloud.ccsf.cc.ca.us>