
Getting all posts from a blog (wordpress or blogger)


What you're looking for is a sitemap.

First of all, you're writing a bot so it's good manners to check the blog's robots.txt file. And lo and behold, you'll often find a sitemap mentioned there. Here's an example from the Google blog:

    User-agent: Mediapartners-Google
    Disallow:

    User-agent: *
    Disallow: /search
    Allow: /

    Sitemap: http://googleblog.blogspot.com/feeds/posts/default?orderby=UPDATED

In this case, you can fetch the Sitemap URL to get an XML listing of the blog's posts.
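Once you have the XML, pulling the post URLs out of it is short. Here is a minimal sketch, assuming the file follows the sitemaps.org format (`<urlset>` with `<loc>` entries). The function name `urls_from_sitemap` is just made up for illustration. A Blogger feed URL like the one above may well return an Atom feed instead, in which case the tag names would differ.

```python
import xml.etree.ElementTree as ET

# Namespace used by sitemaps.org-style files (assumption: the blog uses this format)
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"

def urls_from_sitemap(xml_text):
    """Return every <loc> value found in a sitemap document."""
    root = ET.fromstring(xml_text)
    # iter() walks the whole tree, so this also picks up the <loc> entries
    # of a sitemap index file (which point at further sitemaps)
    return [loc.text.strip() for loc in root.iter(SITEMAP_NS + "loc") if loc.text]
```

If the result turns out to be a list of further sitemap files rather than posts, you would fetch each of those and run the same function again.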

For WordPress, the same applies, but it's not built in as standard, so not all blogs will have it. Have a look at this plugin, which is the most popular way to create these sitemaps in WordPress. For example, my blog uses it, and you can find the sitemap at /sitemap.xml (the standard location).

In short:

  • Check robots.txt
  • Follow the Sitemap url if it's present
  • Otherwise, check for /sitemap.xml

Also: be a good Internet citizen! If you're going to write a bot, make sure it obeys the robots.txt file (for example, Blogspot explicitly tells you not to crawl /search).
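Python's standard library can do the robots.txt checking for you. Here is a small sketch using `urllib.robotparser`. The function name `make_checker` and the user agent string `my-blog-bot` are placeholders I've invented, so substitute your own.

```python
from urllib.robotparser import RobotFileParser

def make_checker(robots_txt, user_agent="my-blog-bot"):
    """Build a function that says whether a URL may be fetched under robots.txt."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network access happens here
    parser.parse(robots_txt.splitlines())
    return lambda url: parser.can_fetch(user_agent, url)
```

Run every URL through the returned function before requesting it, and skip anything it rejects.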