
Using grep and wget to download all hyperlinks to .pdf

I had a site that contained a load of PDFs that I wanted to download. To save me from clicking on each one, I did some googling and found a way to download every file ending in .pdf.

grep -oE 'http://[^[:space:]"]*\.pdf' index.html | xargs wget
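
As an aside, if the page isn't already saved locally, wget's own recursive-accept options can fetch the PDFs in one step; a minimal sketch, where the URL is just a placeholder:

wget -r -l 1 -nd -A pdf http://example.com/index.html

Here -r recurses, -l 1 limits it to one level of links, -nd skips recreating the site's directory structure, and -A pdf keeps only files ending in pdf.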

For an even better approach, you can wrap this in a little shell script that takes the URL as a parameter.

#!/bin/sh
# Download every PDF linked from the page at the given URL.
url="$1"
curl -s "$url" | grep -oE 'http://[^[:space:]"]*\.pdf' | xargs wget
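
To use it, save the script as, say, getpdfs.sh (the filename is just an example), make it executable, and pass it the page URL:

chmod +x getpdfs.sh
./getpdfs.sh http://example.com/index.html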

Thanks to Ubuntu Forums and Google.

28 Jan 2011 | Tags ( bash script download grep Linux pdf wget )
