
Using grep and wget to download all hyperlinks to .pdf

I had a site that contained a load of PDFs I wanted to download. To save myself from clicking on each one, I did some googling and found a way to download every file ending in .pdf.

grep -o 'http://[^[:space:]"]*\.pdf' index.html | xargs wget
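
If the PDFs live on the same host as the page, wget's own recursive mode can do the whole job without grep. A minimal sketch of that alternative (the URL is a placeholder):

wget -r -l 1 -nd -np -A pdf http://example.com/index.html

Here -r follows links, -l 1 limits the recursion to one hop, -A pdf keeps only files ending in .pdf, -nd skips recreating the directory tree, and -np stops wget from climbing into parent directories.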

And for an even better approach, you can make a little shell script that takes the URL as a parameter:

#!/bin/sh
# Fetch the page, pull out every .pdf link, and download each one
url="$1"
curl -s "$url" | grep -o 'http://[^[:space:]"]*\.pdf' | xargs wget
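
A quick usage example, assuming the script is saved as getpdfs.sh (the filename is just an illustration):

chmod +x getpdfs.sh
./getpdfs.sh http://example.com/index.html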

Thanks to the Ubuntu Forums and Google.

28 Jan 2011 | Tags ( bash script download grep Linux pdf wget )
