Converting Django QuerySet to pandas DataFrame

python django pandas

import pandas as pdimport datetimefrom myapp.models import BlogPostdf = pd.DataFrame(list(BlogPost.objects.all().values()))df = pd.DataFrame(list(BlogPost.objects.filter(date__gte=datetime.datetime(2012, 5, 1)).values()))# limit which fieldsdf = pd.DataFrame(list(BlogPost.objects.all().values('author', 'date', 'slug')))

The above is how I do the same thing. The most useful addition is specifying which fields you are interested in. If it's only a subset of the available fields you are interested in, then this would give a performance boost I imagine.

python django pandas

Django Pandas solves this rather neatly: https://github.com/chrisdev/django-pandas/

From the README:

class MyModel(models.Model):    full_name = models.CharField(max_length=25)    age = models.IntegerField()    department = models.CharField(max_length=3)    wage = models.FloatField()from django_pandas.io import read_frameqs = MyModel.objects.all()df = read_frame(qs)

python django pandas

Convert the queryset on values_list() will be more memory efficient than on values() directly. Since the method values() returns a queryset of list of dict (key:value pairs), values_list() only returns list of tuple (pure data). It will save about 50% memory, just need to set the column information when you call pd.DataFrame().

Method 1:    queryset = models.xxx.objects.values("A","B","C","D")    df = pd.DataFrame(list(queryset))  ## consumes much memory    #df = pd.DataFrame.from_records(queryset) ## works but no much change on memory usageMethod 2:    queryset = models.xxx.objects.values_list("A","B","C","D")    df = pd.DataFrame(list(queryset), columns=["A","B","C","D"]) ## this will save 50% memory    #df = pd.DataFrame.from_records(queryset, columns=["A","B","C","D"]) ##It does not work. Crashed with datatype is queryset not list.

I tested this on my project with >1 million rows data, the peak memory is reduced from 2G to 1G.

CodeHunter

Converting Django QuerySet to pandas DataFrame

Recent Posts

How can I color dots in a xy scatterplot according to column value?

How to update a claim in ASP.NET Identity?

What does {0} mean when initializing an object?

Accessing members of items in a JSONArray with Java

How to log SQL statements in Spring Boot?

Powershell Get-WebSite name parameter is ignored

How to detect scroll to bottom of html element

Java synchronized method

How to test controllers with CodeIgniter?

Detect Visual Composer

Matplotlib: Specify format of floats for tick labels

Rails join a list of strings with commas and "and" before the last