How to set dtypes by column in pandas DataFrame



I just ran into this, and the pandas issue is still open, so I'm posting my workaround. Assuming df is my DataFrame and dtype is a dict mapping column names to types:

for k, v in dtype.items():
    df[k] = df[k].astype(v)

(Note: use dtype.iteritems() in Python 2.)
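In recent pandas versions the loop is no longer necessary: DataFrame.astype accepts a dict mapping column names to dtypes, so the whole conversion is one call. A minimal sketch (column names and dtypes are illustrative):

```python
import pandas as pd

# DataFrame.astype with a dict converts several columns at once;
# columns not named in the dict keep their existing dtype.
df = pd.DataFrame({'a': [1, 2, 3], 'b': [4.0, 5.0, 6.0]})
df = df.astype({'a': 'float64', 'b': 'int64'})
print(df.dtypes)
# a    float64
# b      int64
# dtype: object
```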

For reference:


You may want to try passing a dictionary of Series objects to the DataFrame constructor - it gives you much more specific control over the creation, and should make it clearer what's going on. A template version (data1 can be an array etc.):

df = pd.DataFrame({'column1': pd.Series(data1, dtype='type1'),
                   'column2': pd.Series(data2, dtype='type2')})

An example with data:

df = pd.DataFrame({'A': pd.Series([1,2,3], dtype='int'),
                   'B': pd.Series([7,8,9], dtype='float')})

print (df)
   A    B
0  1  7.0
1  2  8.0
2  3  9.0

print (df.dtypes)
A      int32
B    float64
dtype: object


As of pandas version 0.24.2 (the current stable release), it is not possible to pass an explicit list of datatypes to the DataFrame constructor, as the docs state:

dtype : dtype, default None
    Data type to force. Only a single dtype is allowed. If None, infer

However, the DataFrame class does have a classmethod that lets you convert a NumPy structured array to a DataFrame, so you can do:

>>> myarray = np.random.randint(0, 5, size=(2, 2))
>>> record = np.array(list(map(tuple, myarray)), dtype=[('a', 'float'), ('b', 'int')])
>>> mydf = pd.DataFrame.from_records(record)
>>> mydf.dtypes
a    float64
b      int64
dtype: object
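With fixed data, the same structured-array route can be sketched end to end (the field names 'x' and 'y' and the values here are arbitrary, just to show each field becoming a typed column):

```python
import numpy as np
import pandas as pd

# Build a structured array with an explicit per-field dtype, then hand it
# to DataFrame.from_records; each field becomes a column with that dtype.
record = np.array([(1.5, 2), (3.0, 4)], dtype=[('x', 'float64'), ('y', 'int64')])
df = pd.DataFrame.from_records(record)
print(df.dtypes)
# x    float64
# y      int64
# dtype: object
```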