Converting a list of tuples to a Pandas series


Using zip and sequence unpacking:

    idx, values = zip(*L)
    a = pd.Series(values, idx)

With duplicate indices, as in your data, dict will not help as duplicate dictionary keys are not permitted: dict will only take the last value for every key supplied.
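To make the point about duplicate keys concrete, here is a minimal sketch. The sample list `L` is made up for illustration and is not the data from the question; the names `d`, `s_from_dict`, and `s` are likewise just placeholders.

```python
import pandas as pd

# Hypothetical sample data with repeated first elements (not from the question).
L = [(0, 0.1), (0, 0.2), (1, 0.3), (1, 0.4), (3, 0.5)]

# dict() keeps only the last value seen for each key.
d = dict(L)
print(d)  # {0: 0.2, 1: 0.4, 3: 0.5}

# A Series built from that dict therefore has 3 rows, not 5.
s_from_dict = pd.Series(d)
print(len(s_from_dict))  # 3

# Unpacking with zip keeps every pair, including repeated index labels.
idx, values = zip(*L)
s = pd.Series(values, index=idx)
print(len(s))  # 5
print(s.loc[1].tolist())  # [0.3, 0.4]
```

Selecting a repeated label with `.loc` returns every row that carries it, which is why both values for label 1 come back.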


Use the DataFrame constructor, call set_index on the first column, then select the second column to get a Series:

    a = pd.DataFrame(array2).set_index(0)[1]
    print(a)

    0
    0    0.071429
    0    0.071429
    1    0.083333
    1    0.333333
    1    0.333333
    1    0.083333
    3    0.058824
    3    0.058824
    Name: 1, dtype: float64

Or create two lists and pass them to the Series constructor:

    idx = [x[0] for x in array2]
    vals = [x[1] for x in array2]
    a = pd.Series(vals, index=idx)
    print(a)

    0    0.071429
    0    0.071429
    1    0.083333
    1    0.333333
    1    0.333333
    1    0.083333
    3    0.058824
    3    0.058824
    dtype: float64


The problem is that when you convert a list of tuples to a dictionary, Python drops all duplicate keys and only uses the last value for each key. This is necessary since each key can only appear once in a dictionary. So you need to use a method that preserves all the records. This will do that:

    df = pd.DataFrame.from_records(array2, columns=['key', 'val'])
    df = df.set_index('key')
    a = df['val']

Example:

    import pandas as pd

    array2 = [
        (0, 0.07142857142857142),
        (0, 0.07142857142857142),
        (1, 0.08333333333333333),
        (1, 0.3333333333333333),
        (1, 0.3333333333333333),
        (1, 0.08333333333333333),
        (3, 0.058823529411764705),
        (3, 0.058823529411764705)
    ]

    df = pd.DataFrame.from_records(array2, columns=['key', 'val'])
    df = df.set_index('key')
    a = df['val']
    print(a)

    # key
    # 0    0.071429
    # 0    0.071429
    # 1    0.083333
    # 1    0.333333
    # 1    0.333333
    # 1    0.083333
    # 3    0.058824
    # 3    0.058824
    # Name: val, dtype: float64
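
As a rough cross-check, the approaches suggested in these answers can be compared side by side. This is only a sketch: the short `array2` below is hypothetical sample data, and the names `a_zip`, `a_df`, and `a_rec` are introduced here for illustration.

```python
import pandas as pd

# Hypothetical sample data with duplicate keys (shorter than the question's list).
array2 = [(0, 0.5), (0, 0.25), (1, 0.125), (1, 0.75), (3, 1.0)]

# Approach 1: zip with sequence unpacking.
idx, values = zip(*array2)
a_zip = pd.Series(values, index=idx)

# Approach 2: DataFrame constructor, then set_index on the first column.
a_df = pd.DataFrame(array2).set_index(0)[1]

# Approach 3: from_records with named columns.
df = pd.DataFrame.from_records(array2, columns=['key', 'val'])
a_rec = df.set_index('key')['val']

# All three keep every row; only the Series and index names differ.
same_values = a_zip.tolist() == a_df.tolist() == a_rec.tolist()
same_index = a_zip.index.tolist() == a_df.index.tolist() == a_rec.index.tolist()
print(same_values, same_index)  # True True
print(a_zip.name, a_df.name, a_rec.name)  # None 1 val
```

The practical difference is naming: the zip version yields an unnamed Series, while the DataFrame routes carry the column label over as the Series name.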