Unable to fetch all the links from a webpage using requests

python python-3.x web-scraping beautifulsoup re

If you want to do it with requests then please consider to query XHR/Ajax Http requests for imitating Lazy load. See the following picture:

You make queries to the instagram.com server similar to Scrape a JS Lazy load page by Python requests post.

Disclaimer

You might not succeed to complete that task due to some dynamic cookie values or other scraping prevention imposed by Instagram.

python python-3.x web-scraping beautifulsoup re

The Instagram web page uses lazy loading to load the images. You can overcome this in 2 ways:

Use the Instagram API as mentioned in the comments
Use a tool like selenium to load all the images on the page by scrolling to the bottom and then fetch the links

The 1st method is be the better way to do it.

python python-3.x web-scraping beautifulsoup re

I suggest you to use Instagram Graph API, if you are building a commercial product since using instagram public data is required the consent because of GDPR. This API will easy your work but under api limitations such as you can query 30 searches for 7 days per a user token.

If you are building non-commercial tool you have two approaches.

Scrape directly the instagram web page. As mentioned in above answers you can use selenium and automate page interactions since web page uses javascript to generate image urls. The disadvantage of this method is instagram and facebook do anti scraping methods to prevent scraping their data such as wrapping html elements with dynamic generated classes, change xpaths frequently. You might have to spend lot of time to code and fix those things later.
Using third party libraries that are built to scrape instagram data. There are many open source third party libraries in github and instaloader is my favorite. you can download all of hashtag search results using single command. This library not only download images but also data json of the post related to the image. Since there are maintainers to the library, you don't have to worry about later instagram web page changes. I recommend this method in your case.

CodeHunter

Unable to fetch all the links from a webpage using requests

Disclaimer

Recent Posts

How can I color dots in a xy scatterplot according to column value?

How to update a claim in ASP.NET Identity?

What does {0} mean when initializing an object?

Accessing members of items in a JSONArray with Java

How to log SQL statements in Spring Boot?

Powershell Get-WebSite name parameter is ignored

How to detect scroll to bottom of html element

Java synchronized method

How to test controllers with CodeIgniter?

Detect Visual Composer

Matplotlib: Specify format of floats for tick labels

Rails join a list of strings with commas and "and" before the last