Efficient looping through large JSON files


Well, don't call response.json() over and over and over again unnecessarily.

Instead of

  for observation in response.json()['data']:
      fullGroupName = response.json()['full_name']

do

  data = response.json()
  for observation in data['data']:
      fullGroupName = data['full_name']

After this change, the whole thing takes about 33 seconds on my PC, and pretty much all of that is spent on the requests. You could perhaps speed that up further with parallel requests, if that's OK for the site.
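If you do try parallel requests, here is a minimal sketch of the idea using a thread pool from the standard library; it assumes the pages follow the HFAMDB_<id> URL pattern visible in the profiler output further down, and the ID range and worker count are placeholders, not values from the original script:

  import requests
  from concurrent.futures import ThreadPoolExecutor

  def fetch(data_set_id):
      # Assumed URL pattern; adjust to however the original script builds its URLs.
      url = f'http://dw.euro.who.int/api/v3/data_sets/HFAMDB/HFAMDB_{data_set_id}'
      return requests.get(url).json()  # parse the JSON exactly once

  data_set_ids = range(32, 34)  # placeholder; use the real range of data set IDs
  with ThreadPoolExecutor(max_workers=8) as executor:
      for data in executor.map(fetch, data_set_ids):
          fullGroupName = data['full_name']
          for observation in data['data']:
              pass  # process each observation here

Because the work is almost entirely network I/O, threads are sufficient here; there is no need for multiprocessing.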


Although Stefan Pochmann has already answered your question, I think it's worth mentioning how you could have figured out the problem yourself.

One way would be to use a profiler, for example Python's cProfile, which is included in the standard library.

Assuming that your script is called slow_download.py, you can limit the range in your loop to, for example, range(32, 33) and execute it in the following way:

python3 -m cProfile -s cumtime slow_download.py

The -s cumtime option sorts the calls by cumulative time.

The result would be:

  http://dw.euro.who.int/api/v3/data_sets/HFAMDB/HFAMDB_832
           222056 function calls (219492 primitive calls) in 395.444 seconds

     Ordered by: cumulative time

     ncalls  tottime  percall  cumtime  percall filename:lineno(function)
      122/1    0.005    0.000  395.444  395.444 {built-in method builtins.exec}
          1   49.771   49.771  395.444  395.444 py2.py:1(<module>)
       9010    0.111    0.000  343.904    0.038 models.py:782(json)
       9010    0.078    0.000  332.900    0.037 __init__.py:271(loads)
       9010    0.091    0.000  332.801    0.037 decoder.py:334(decode)
       9010  332.607    0.037  332.607    0.037 decoder.py:345(raw_decode)
        ...

This clearly suggests that the problem lies in json() and the methods it calls: loads() and raw_decode().
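As a side note (not part of the original answer), you can also write the profile to a file and inspect it interactively with pstats, which is likewise in the standard library:

  python3 -m cProfile -o profile.out slow_download.py

and then, in a Python session:

  import pstats

  # Load the saved profile and show the ten most expensive calls by cumulative time.
  stats = pstats.Stats('profile.out')
  stats.sort_stats('cumtime').print_stats(10)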


If the data is really large, dump it into MongoDB and query whatever you want efficiently.
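A minimal sketch of that approach with pymongo, assuming a local MongoDB instance; the database and collection names ('hfamdb', 'observations') are hypothetical, and the example URL is taken from the profiler output above:

  import requests
  from pymongo import MongoClient

  client = MongoClient('mongodb://localhost:27017')   # assumes a local MongoDB instance
  collection = client['hfamdb']['observations']       # hypothetical database/collection names

  data = requests.get('http://dw.euro.who.int/api/v3/data_sets/HFAMDB/HFAMDB_832').json()
  collection.insert_many(data['data'])                # store the parsed observations once

  # Later, query the stored documents instead of re-downloading and re-parsing the JSON.
  for observation in collection.find({}):
      print(observation)

An index on whatever field you filter by (via collection.create_index) keeps those queries fast as the data grows.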