Error while fetching Tweets with Tweepy

python mongodb twitter tweepy

This IncompleteRead error generally occurs when your consumption of incoming tweets starts to fall behind, which makes sense in your case given your long list of terms to track. The approach most people seem to take (myself included) is simply to suppress the error and continue collecting (see the link above).

I can't completely remember if IncompleteRead will close your connection (I think it might, because my personal solution reconnects my stream), but you may consider something like the following (I'm just going to wing it, it probably needs reworking for your situation):

    # from httplib import IncompleteRead  # Python 2
    from http.client import IncompleteRead  # Python 3
    from tweepy import Stream
    ...
    while True:
        try:
            # Connect/reconnect the stream
            stream = Stream(auth, listener)
            # DON'T run this approach async or you'll just create a ton of streams!
            # Pass the terms by keyword; the first positional argument of filter() is `follow`
            stream.filter(track=terms)
        except IncompleteRead:
            # Oh well, reconnect and keep trucking
            continue
        except KeyboardInterrupt:
            # Or however you want to exit this loop
            stream.disconnect()
            break
    ...

Again, I'm just winging it there, but the moral of the story is that the general approach taken here is to suppress the error and continue.
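
If you want to try the suppress-and-reconnect idea without hitting the Twitter API, here is a rough, self-contained sketch. Everything in it is a stand-in: `run_with_reconnect`, `make_flaky_stream`, the retry cap and the delay are hypothetical names and choices of mine, not part of tweepy. In real use, `connect_and_stream` would be whatever builds your stream and calls its blocking filter.

```python
import time
from http.client import IncompleteRead


def run_with_reconnect(connect_and_stream, max_retries=5, delay=0.0):
    """Call connect_and_stream() until it returns, reconnecting on IncompleteRead.

    Returns the number of reconnects that were needed. The retry cap and
    the delay are assumptions added for the sketch, not tweepy behaviour.
    """
    reconnects = 0
    while True:
        try:
            connect_and_stream()
            return reconnects
        except IncompleteRead:
            reconnects += 1
            if reconnects > max_retries:
                # Give up instead of looping forever on a persistent failure
                raise
            time.sleep(delay)


def make_flaky_stream(failures):
    """Hypothetical fake stream: raises IncompleteRead `failures` times, then succeeds."""
    state = {"left": failures}

    def connect_and_stream():
        if state["left"] > 0:
            state["left"] -= 1
            raise IncompleteRead(b"")

    return connect_and_stream


if __name__ == "__main__":
    print(run_with_reconnect(make_flaky_stream(2)))
```

The cap is only there so a permanently broken connection eventually surfaces as an error; drop it if you really do want to retry forever.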

EDIT (10/11/2016): Just a useful tidbit for anyone dealing with very large volumes of tweets - one way to handle this case without losing connection time or tweets would be to drop your incoming tweets into a queuing solution (RabbitMQ, Kafka, etc.) to be ingested/processed by an application reading from that queue.

This moves the bottleneck from the Twitter API to your queue, which should have no problem waiting for you to consume the data.

This is more of a "production" software solution, so if you don't care about losing tweets or reconnecting, the above solution is still perfectly valid.
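
To make the queue idea above a bit more concrete, here is a minimal sketch that uses Python's in-process `queue.Queue` as a stand-in for RabbitMQ or Kafka. The names `tweet_queue`, `enqueue_tweet`, `worker`, `store` and `run_demo` are hypothetical. An in-memory queue also won't survive a crash the way a real broker would, so treat this purely as an illustration of the hand-off pattern.

```python
import queue
import threading

# Stand-in for an external broker (RabbitMQ, Kafka, etc.)
tweet_queue = queue.Queue()
_SENTINEL = object()  # marks the end of the demo input


def enqueue_tweet(raw_tweet):
    """Meant to be called from the stream callback: no processing, just hand off."""
    tweet_queue.put(raw_tweet)


def worker(store):
    """Drain the queue at its own pace; `store` is any callable that persists a tweet."""
    while True:
        item = tweet_queue.get()
        if item is _SENTINEL:
            break
        store(item)


def run_demo(tweets):
    """Push some fake tweets through the queue and return what the worker stored."""
    stored = []
    consumer = threading.Thread(target=worker, args=(stored.append,))
    consumer.start()
    for tweet in tweets:
        enqueue_tweet(tweet)
    tweet_queue.put(_SENTINEL)
    consumer.join()
    return stored


if __name__ == "__main__":
    print(run_demo([{"id": 1, "lang": "nl"}, {"id": 2, "lang": "en"}]))
```

The point of the design is that the stream callback returns almost immediately, so slow work (database inserts, parsing) happens on the consumer side instead of holding up the read from Twitter.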

I had this same problem. It was solved when I removed languages from the filter function, since it's not yet functional, although Twitter says it is. Instead, I keep the language check in on_data(..), as you did.

Also I use the on_status(..) instead of on_data(..) as follows:

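    def on_status(self, status):
        ...
        # status is a tweepy Status object; status._json is the raw tweet as a dict
        # (json.dumps returns a string, which can't be indexed by key)
        tweet = status._json
        if tweet["lang"] == "nl":
            print(tweet["id"])
            # Assumes Tweets is a pymongo collection; pymongo 3+ uses insert_one(),
            # older versions used insert()
            Tweets.insert_one(tweet)
        ...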

Other people reported that twitterStream.filter(track=['word'], languages=['nl']) works, but it didn't for me.

An IncompleteRead error is a symptom of a network-related issue. Where did you run that script? If the host running it is behind a firewall, load balancer, etc., network packets may be dropped for some reason.
