Reading Excel File using Python, how do I get the values of a specific column with indicated column name?
A somewhat late answer, but with pandas, it is possible to get directly a column of an excel file:
import pandasdf = pandas.read_excel('sample.xls')#print the column namesprint df.columns#get the values for a given columnvalues = df['Arm_id'].values#get a data frame with selected columnsFORMAT = ['Arm_id', 'DSPName', 'Pincode']df_selected = df[FORMAT]
Make sure you have installed xlrd and pandas:
pip install pandas xlrd
This is one approach:
from xlrd import open_workbookclass Arm(object): def __init__(self, id, dsp_name, dsp_code, hub_code, pin_code, pptl): self.id = id self.dsp_name = dsp_name self.dsp_code = dsp_code self.hub_code = hub_code self.pin_code = pin_code self.pptl = pptl def __str__(self): return("Arm object:\n" " Arm_id = {0}\n" " DSPName = {1}\n" " DSPCode = {2}\n" " HubCode = {3}\n" " PinCode = {4} \n" " PPTL = {5}" .format(self.id, self.dsp_name, self.dsp_code, self.hub_code, self.pin_code, self.pptl))wb = open_workbook('sample.xls')for sheet in wb.sheets(): number_of_rows = sheet.nrows number_of_columns = sheet.ncols items = [] rows = [] for row in range(1, number_of_rows): values = [] for col in range(number_of_columns): value = (sheet.cell(row,col).value) try: value = str(int(value)) except ValueError: pass finally: values.append(value) item = Arm(*values) items.append(item)for item in items: print item print("Accessing one single value (eg. DSPName): {0}".format(item.dsp_name)) print
You don't have to use a custom class, you can simply take a dict()
. If you use a class however, you can access all values via dot-notation, as you see above.
Here is the output of the script above:
Arm object: Arm_id = 1 DSPName = JaVAS DSPCode = 1 HubCode = AGR PinCode = 282001 PPTL = 1Accessing one single value (eg. DSPName): JaVASArm object: Arm_id = 2 DSPName = JaVAS DSPCode = 1 HubCode = AGR PinCode = 282002 PPTL = 3Accessing one single value (eg. DSPName): JaVASArm object: Arm_id = 3 DSPName = JaVAS DSPCode = 1 HubCode = AGR PinCode = 282003 PPTL = 5Accessing one single value (eg. DSPName): JaVAS
So the key parts are to grab the header ( col_names = s.row(0)
) and when iterating through the rows, to skip the first row which isn't needed for row in range(1, s.nrows)
- done by using range from 1 onwards (not the implicit 0). You then use zip to step through the rows holding 'name' as the header of the column.
from xlrd import open_workbookwb = open_workbook('Book2.xls')values = []for s in wb.sheets(): #print 'Sheet:',s.name for row in range(1, s.nrows): col_names = s.row(0) col_value = [] for name, col in zip(col_names, range(s.ncols)): value = (s.cell(row,col).value) try : value = str(int(value)) except : pass col_value.append((name.value, value)) values.append(col_value)print values