Scrape Images with wget

The desire to download all images or video on a page has been around since the beginning of the internet.  Twenty years ago I would accomplish this task with a Python script I downloaded.  I then moved on to browser extensions, then started using PhearJS, a Node.js utility, to scrape images.  All of these solutions are nice, but I wanted to know how I could accomplish this task from the command line.

To scrape images (or files with any specific extensions) from the command line, you can use wget:

wget -nd -H -p -A jpg,jpeg,png,gif -e robots=off

The command above, given the page URL as its final argument, downloads images across hosts (for example, from a CDN or another subdomain) into the directory from which the command is run.  You’ll see the downloaded media as they come down:

Reusing existing connection to
HTTP request sent, awaiting response... 200 OK
Length: 1505 (1.5K) 
Saving to: '1490571194319s.jpg'
1490571194319s.jpg 100%[=====================>] 1.47K --.-KB/s in 0s
2017-03-26 18:33:26 (205 MB/s) - '1490571194319s.jpg' saved [1505/1505]
FINISHED --2017-03-26 18:33:26--
Total wall clock time: 2.7s
Downloaded: 66 files, 412K in 0.2s (2.10 MB/s)
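
If the short flags are hard to remember, the same command can be written with wget's long option names. The sketch below is not from the original workflow: the function name scrape_images and the example URL in the usage note are placeholders chosen for illustration, and the extension list is simply the one used above.

```shell
# scrape_images URL
# Download the images a page references into the current directory.
# Long-form equivalents of the flags in the one-liner:
#   --no-directories      (-nd) save files flat, without recreating remote folders
#   --span-hosts          (-H)  allow downloads from other hosts, such as a CDN
#   --page-requisites     (-p)  fetch the files the page needs to display
#   --accept LIST         (-A)  keep only files whose names end in these extensions
#   --execute robots=off  (-e)  ignore robots.txt for this run
scrape_images() {
  wget \
    --no-directories \
    --span-hosts \
    --page-requisites \
    --accept jpg,jpeg,png,gif \
    --execute robots=off \
    "${1:?usage: scrape_images URL}"
}
```

Running `scrape_images https://example.com/gallery/` from an empty directory should leave the matching files there; the URL is only an example. Changing the list passed to --accept (say, to mp4,webm) is how you would target other file types.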

Everyone loves cURL, which is another awesome resource, but don’t forget about wget, which is arguably easier to use!
