

I just finished the lesson in Automate the Boring Stuff with Python on webscraping and image downloading, which parses a website for images and downloads them into a folder. My first attempt at that was to download all the model pics from Sports Illustrated's swimsuit site ( ) into separate folders. I simply list all the model names in a list, then iterate through the list of names and plug them into the SI URL like so to get to each gallery (ex: /photos/'.format(name, name)):

    mypath = 'C:/Users/nick/PycharmProjects/untitled/Models/'
    folder_name = os.path.join(mypath, '_'.join(name) + "_gallery")
    soup_elem = page_lect('#content-container img')
    print('Downloading image %s' % image_url)
    image_file = open(os.path.join(folder_name, os.path.basename(image_url)), 'wb')

As you can see, I use BeautifulSoup 4 to search through a requests.get of the webpage. I find each img with soup_elem = page_lect('#content-container img'), which matches any img tags nested inside the #content-container element. Each page/pic has a dynamic (JavaScript?) next-arrow to go to the next pic. My first attempt was to find the HTML tags for that arrow and end the While-Loop iteration of that particular gallery when the (JavaScript) next-arrow was no longer visible on the webpage.

So any help on making a better While-Loop to get all images of a page, whether they be 10 or 27, downloaded would be much appreciated! I hope that was clear and concise enough for you guys.
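Stitched together, the per-gallery code runs roughly like this. This is a sketch, not the exact file: gallery_url stands in for the SI gallery address stripped out above, name is the split model name from the models loop below, and I'm assuming page_lect is the BeautifulSoup object for the fetched page, queried with select():

    import os
    import requests
    import bs4

    # name is the model's name split into words, e.g. ['Gigi', 'Hadid']
    mypath = 'C:/Users/nick/PycharmProjects/untitled/Models/'
    folder_name = os.path.join(mypath, '_'.join(name) + '_gallery')
    os.makedirs(folder_name, exist_ok=True)

    res = requests.get(gallery_url)  # gallery_url: stand-in for the SI gallery address
    res.raise_for_status()
    page_lect = bs4.BeautifulSoup(res.text, 'html.parser')

    # Find every img nested inside the element with id="content-container"
    for img in page_lect.select('#content-container img'):
        image_url = img.get('src')
        print('Downloading image %s' % image_url)
        res = requests.get(image_url)
        res.raise_for_status()
        image_file = open(os.path.join(folder_name, os.path.basename(image_url)), 'wb')
        for chunk in res.iter_content(100000):
            image_file.write(chunk)
        image_file.close()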

EDIT: OH! and this is the function that iterates through the List of Models to download their galleries:

    Models = ['Gigi Hadid', 'Nina Agdal', 'Hannah Ferguson', 'Sara Sampaio', 'Emily Ratajkowski', 'Emily Didonato',
              'Erin Heatherton', 'Solveig Mork', 'Hailey Clauson', 'Rose Bertram', 'Ashley Smith', 'Kelly Rohrbach',
              'Irina Shayk', 'lily Aldridge', 'Chrissy Teigen', 'Jessica Gomes', 'Ronda Rousey', 'Caroline Wozniacki',
              'Samantha Hoopes', 'Chanel Iman', 'Hannah Davis', 'Kate Bock', 'Ariel Meredith', 'Genevieve Morton',
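In full, that driver is just a few lines around the gallery code (a sketch, with a hypothetical download_gallery() standing in for the scraping shown above; the closing print is from the original function):

    def download_galleries(Models):
        for model in Models:
            name = model.split()        # e.g. 'Gigi Hadid' -> ['Gigi', 'Hadid'], used for the folder name
            download_gallery(name)      # hypothetical helper: the scraping code shown above
            print('Done saving pictures for %s' % model)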

Basically, you don't know when to stop looping, so instead you need to figure out an error condition you can use inside the loop to break out, and then loop forever until this error happens. You can use the 404 returned by requests when you try to get a url which does not exist as your exit condition. To do this you need to capture the HTTPError exception thrown by res.raise_for_status() and then use break to exit the loop. The break statement will exit the loop and jump down to the next line after the end of the loop.

You'll also need to import HTTPError from requests at the top of your file before you can use this as is:

    from requests.exceptions import HTTPError
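A minimal sketch of that loop, assuming the picture URLs are numbered by page (page_url is a hypothetical template and download_page() stands in for your existing soup/save logic):

    import requests
    from requests.exceptions import HTTPError

    page = 1
    while True:
        res = requests.get(page_url.format(model, page))  # page_url: hypothetical numbered-page template
        try:
            res.raise_for_status()  # raises HTTPError for a 4xx/5xx response, e.g. the 404 past the last pic
        except HTTPError:
            # This url doesn't exist so we should stop looping
            break
        download_page(res)  # hypothetical: parse the page and save its image as before
        page += 1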
