|Last modified: 16-01-2013|
The latest version of wget can be downloaded from http://www.christopherlewis.com/WGet/WGetFiles.htm
Here's how to download a list of files and have wget fetch only those that are newer than the local copies.
The -N (timestamping) switch is not entirely reliable: dates can change when files are uploaded to an FTP server, and a file can have changed even though its size stayed the same. Still, I didn't find a way to force wget to overwrite files every time (-r creates a directory tree).
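A minimal sketch of the list-based download (the file name and URLs are made up): -i reads URLs from a text file, and -N only fetches a file when the remote copy is newer than the local one.

```shell
# Hypothetical URL list, one URL per line.
cat > files.txt <<'EOF'
http://www.acme.com/pub/report.pdf
http://www.acme.com/pub/data.zip
EOF

# -i = read URLs from a file, -N = only download if the remote file is newer
wget -N -i files.txt
```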
Failed attempts to overwrite files:
wget -m -np ftp://jdoe:email@example.com/somedir
... where -m = mirror (shorthand for -r -N -l inf --no-remove-listing), and -np = don't ascend to the parent directory
wget -mpk http://www.acme.com
wget -np -I /mysite -m http://localhost
wget -nc -c -N -r -l NUMBER -L http://localhost (note that -nc and -N conflict, so recent wget releases reject this combination)
Should a login/password be required, use --http-user=USER --http-passwd=PASS (newer releases also spell it --http-password).
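For example, against a hypothetical password-protected page (user, password, and URL are placeholders):

```shell
# Send HTTP basic-auth credentials with the request.
wget --http-user=jdoe --http-passwd=secret http://www.acme.com/private/index.html
```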
If downloading files through FTP instead of HTTP, you can use the --passive-ftp option in case the host is running behind a stateless firewall.
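For example, reusing the placeholder FTP host and credentials from above:

```shell
# --passive-ftp makes wget open the data connection itself, which
# usually gets through firewalls/NAT that block active-mode FTP.
wget --passive-ftp ftp://jdoe:email@example.com/somedir/
```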
If a site has a robots.txt and wget fails to grab the site, try the -e robots=off switch. If it still doesn't work, have wget pretend it's a different user agent, e.g. -U "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 4.0)" or -U "Mozilla/3.01Gold (Win95; I)".
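Combining both workarounds in one hypothetical mirror run (URL is a placeholder):

```shell
# Ignore robots.txt and masquerade as MSIE while mirroring the site.
wget -e robots=off -U "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 4.0)" -m -np http://www.acme.com
```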
To walk through a site without actually downloading any file:
--spider don't download anything.
-p, --page-requisites get all images, etc. needed to display HTML page.
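A sketch of a link check against a local server (depth and URL are arbitrary):

```shell
# Recursively walk the site two levels deep without saving anything;
# broken links show up as errors in the output.
wget --spider -r -l 2 http://localhost/
```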
If wget rejects some of these switches, you are using an older release of wget; upgrade to at least release 1.8.x.