Managing Tweepy API Search

python twitter tweepy

I originally worked out a solution based on Yuva Raj's suggestion to use additional parameters in GET search/tweets: the max_id parameter, combined with the id of the last tweet returned in each iteration of a loop that also checks for a TweepError.

However, I discovered there is a far simpler way to solve the problem using a tweepy.Cursor (see tweepy Cursor tutorial for more on using Cursor).

The following code fetches the most recent 1000 mentions of 'python'.

    import tweepy

    # assuming twitter_authentication.py contains each of the 4 oauth elements (1 per line)
    from twitter_authentication import API_KEY, API_SECRET, ACCESS_TOKEN, ACCESS_TOKEN_SECRET

    auth = tweepy.OAuthHandler(API_KEY, API_SECRET)
    auth.set_access_token(ACCESS_TOKEN, ACCESS_TOKEN_SECRET)
    api = tweepy.API(auth)

    query = 'python'
    max_tweets = 1000
    searched_tweets = [status for status in tweepy.Cursor(api.search, q=query).items(max_tweets)]

Update: in response to Andre Petre's comment about potential memory consumption issues with tweepy.Cursor, I'll include my original solution, replacing the single statement list comprehension used above to compute searched_tweets with the following:

    searched_tweets = []
    last_id = -1
    while len(searched_tweets) < max_tweets:
        count = max_tweets - len(searched_tweets)
        try:
            new_tweets = api.search(q=query, count=count, max_id=str(last_id - 1))
            if not new_tweets:
                break
            searched_tweets.extend(new_tweets)
            last_id = new_tweets[-1].id
        except tweepy.TweepError as e:
            # depending on TweepError.code, one may want to retry or wait
            # to keep things simple, we will give up on an error
            break
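The max_id paging logic can be tried without network access or credentials. The following is only a sketch: fake_search and FAKE_TWEETS are hypothetical stand-ins for api.search and its results, and collect is an illustrative name, not part of tweepy. It assumes results come back newest first and that max_id is inclusive, which is why one is subtracted from the last id seen.

```python
from types import SimpleNamespace

# Hypothetical stand-in data: 250 fake tweets with ids 250..1, newest first.
FAKE_TWEETS = [SimpleNamespace(id=i, text=f"tweet {i}") for i in range(250, 0, -1)]


def fake_search(q, count=15, max_id=None):
    """Hypothetical stand-in for api.search: at most 100 results per call."""
    matching = [t for t in FAKE_TWEETS if max_id is None or t.id <= max_id]
    return matching[:min(count, 100)]


def collect(search, query, max_tweets):
    """Page backwards through results with max_id until max_tweets are collected."""
    collected = []
    last_id = None
    while len(collected) < max_tweets:
        remaining = max_tweets - len(collected)
        # max_id is treated as inclusive, so subtract 1 to skip the tweet already seen
        max_id = None if last_id is None else last_id - 1
        page = search(q=query, count=remaining, max_id=max_id)
        if not page:
            break  # no older results left
        collected.extend(page)
        last_id = page[-1].id
    return collected


tweets = collect(fake_search, "python", 180)
ids = [t.id for t in tweets]
```

Because each request asks only for the tweets still missing, the loop never collects more than max_tweets, and it stops early when the search runs dry.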


There's a problem in your code. According to the Twitter documentation for GET search/tweets, the count parameter is:

The number of tweets to return per page, up to a maximum of 100. Defaults to 15. This was formerly the "rpp" parameter in the old Search API.

Your code should be:

    import tweepy

    CONSUMER_KEY = '....'
    CONSUMER_SECRET = '....'
    ACCESS_KEY = '....'
    ACCESS_SECRET = '....'

    auth = tweepy.auth.OAuthHandler(CONSUMER_KEY, CONSUMER_SECRET)
    auth.set_access_token(ACCESS_KEY, ACCESS_SECRET)
    api = tweepy.API(auth)

    # count=100 asks for the maximum number of tweets a single request can return
    search_results = api.search(q="hello", count=100)
    for i in search_results:
        # Do whatever you need to print here
        print(i.text)
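Since count is capped at 100 per request, fetching more tweets than that takes several requests. As a rough sketch of the arithmetic (requests_needed is a hypothetical helper, not part of tweepy):

```python
import math

MAX_PER_REQUEST = 100  # documented cap on the count parameter


def requests_needed(total_tweets, per_request=MAX_PER_REQUEST):
    """Estimate how many search requests are needed to collect total_tweets."""
    # Values above the cap are treated as the cap itself
    per_request = min(per_request, MAX_PER_REQUEST)
    return math.ceil(total_tweets / per_request)


full_pages = requests_needed(1000)         # count=100
default_pages = requests_needed(1000, 15)  # default count of 15
```

With count=100, 1000 tweets take 10 requests. With the default of 15 they take 67, which matters when requests are rate limited.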


The other answers are old and the API has changed a lot.

The easy way is with Cursor (see the Cursor tutorial). pages() yields one list of statuses per page. You can limit how many pages it returns: .pages(5) returns at most 5 pages.

    for page in tweepy.Cursor(api.search, q='python', count=100, tweet_mode='extended').pages():
        # process status here
        process_page(page)

Here q is the query, count is how many tweets each request returns (100 is the maximum per request), and tweet_mode='extended' requests the full text (without it, the text is truncated to 140 characters). RTs are still truncated, as confirmed by jaycech3n.
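One way to deal with the truncated RTs is to read the text from the original tweet instead. The sketch below assumes the usual extended-mode layout, where the text is in the full_text attribute and a retweet carries the original tweet in retweeted_status. get_full_text is a hypothetical helper, and the SimpleNamespace objects are stand-ins for tweepy Status objects.

```python
from types import SimpleNamespace


def get_full_text(status):
    """Return the untruncated text of a status, following retweets to the original."""
    # For a retweet, the complete text lives on the original tweet
    source = getattr(status, "retweeted_status", None) or status
    # Fall back to .text when the status was not fetched in extended mode
    return getattr(source, "full_text", None) or getattr(source, "text", "")


# Stand-ins for tweepy Status objects (assumed attribute layout)
plain = SimpleNamespace(full_text="a long tweet fetched in extended mode")
retweet = SimpleNamespace(
    full_text="RT @someone: the start of a long tw…",
    retweeted_status=SimpleNamespace(full_text="the start of a long tweet, in full"),
)
compat = SimpleNamespace(text="a tweet fetched without tweet_mode")

texts = [get_full_text(s) for s in (plain, retweet, compat)]
```

Note that for a retweet this returns the original author's text without the "RT @user:" prefix.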

If you don't want to use tweepy.Cursor, you need to pass max_id yourself to fetch the next chunk.

    last_id = None
    while True:
        result = api.search(q='python', count=100, tweet_mode='extended', max_id=last_id)
        if not result:
            # stop before indexing into an empty result
            break
        process_result(result)
        # we subtract one to not have the same again.
        last_id = result[-1]._json['id'] - 1
