wget & shell scripting for automated downloads

My friend Leo came up with an interesting problem. He found a site that hosts a lot of religious videos for free, and he wanted to download them all. The problem is that the site's navigation makes you click three times to reach the download page of each video. With about 400 videos on the site, doing this by hand looked like a huge task.

I knew that wget can download files from web sites. Checking the download links of two or three videos revealed that they are all stored in one folder (http://webserver/folder/) on the web server, with names like 1.avi, 2.avi, 3.avi and so on. After that it took only a few minutes to write a small shell script. Here is the code:

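#!/bin/bash
# Try video numbers 0 through 449. The site has about 400 videos, and
# wget simply reports an error and moves on for numbers that do not exist.
counter=0
while (( counter < 450 ))
do
    echo "Downloading video $counter..."
    wget "http://webserver/folder/$counter.avi"
    let counter++
done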

echo "Done !"
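
A variation on the same idea, as a rough sketch only: instead of calling wget once per file, generate the list of URLs first and hand the whole list to a single wget run. The function name make_url_list and the file name urls.txt are my own placeholders, not something from the original script.

```shell
#!/bin/bash
# Hypothetical helper: print one URL per line for the files
# first.avi through last.avi under base_url.
make_url_list() {
    local base_url=$1 first=$2 last=$3
    local n=$first
    while [ "$n" -le "$last" ]; do
        echo "$base_url/$n.avi"
        n=$((n + 1))
    done
}
```

To use it, run make_url_list http://webserver/folder 1 450 > urls.txt and then wget --input-file=urls.txt. wget reads the URLs from the file and fetches them one after another, so the load on the server stays about the same as with the loop above.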

I saved it as download.sh on my Ubuntu Linux machine and ran it with the command below (the script has to be executable first: chmod +x download.sh).
rajesh@ubuntubox:~$ ./download.sh

The script took its time, downloading the files one after another. We could do parallel downloads, but that would cause trouble for the web site admins, and we are nice people, you know 🙂. Open Source saves the day!
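
If you want to be even nicer to the admins, wget has a few options for that. Below is a minimal sketch, not what we actually ran: the function name polite_fetch and the values (2 seconds, 3 tries) are assumptions picked for illustration. Note that --wait only pauses between retrievals inside one wget run, so the URLs have to be passed together.

```shell
#!/bin/bash
# Hypothetical wrapper: fetch all given URLs in one wget run, politely.
polite_fetch() {
    # --wait=2       pause 2 seconds between retrievals (assumed value)
    # --random-wait  vary that pause so the requests are less regular
    # --continue     resume a partially downloaded file instead of restarting
    # --tries=3      give up on a file after 3 attempts (assumed value)
    wget --wait=2 --random-wait --continue --tries=3 "$@"
}
```

For example: polite_fetch http://webserver/folder/1.avi http://webserver/folder/2.avi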

Side note: if you are using Windows, you could probably use Cygwin to do the same thing. This script is not perfect, but it did the job for us. One can easily use it as a base and develop it further for other tasks.
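
As one example of developing it further, here is a sketch that does not hard-code the upper limit of 450. It keeps going until several downloads in a row fail, which probably means we have run past the last video. The function name download_until_missing and its parameters are hypothetical. It relies on wget returning a non-zero exit status when a download fails, which it does for a server error such as a 404.

```shell
#!/bin/bash
# Hypothetical helper: fetch 1.avi, 2.avi, ... until max_misses
# downloads in a row have failed.
# fetch_cmd is any command that takes a URL and returns non-zero
# on failure; it defaults to wget.
download_until_missing() {
    local base_url=$1 max_misses=$2 fetch_cmd=${3:-wget}
    local n=1 misses=0
    while [ "$misses" -lt "$max_misses" ]; do
        if "$fetch_cmd" "$base_url/$n.avi"; then
            misses=0                  # success resets the failure streak
        else
            misses=$((misses + 1))    # one more failure in a row
        fi
        n=$((n + 1))
    done
    echo "Stopped at $((n - 1)) after $max_misses consecutive failures"
}
```

For example: download_until_missing http://webserver/folder 5. Allowing a few misses in a row, rather than stopping at the first one, keeps a single missing number in the middle from ending the run early.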