xls to JSON using python3 xlrd
Using the json
package and the OrderedDict
(to preserve key order), I think this gets to what you're expecting, and I've modified slightly so we're not building a string literal, but rather a dict
which contains the data that we can then convert with json.dumps
.
As Ron noted above, your previous attempt was skipping the lines where rows[1]
was not equal to one of your three key values.
This should read every line, appending to the last non-empty key:
def readRows(file, s_index=0): """ file: path to xls file s_index: sheet_index for the xls file returns a dict of OrderedDict of list of OrderedDict which can be parsed to JSON """ d = {"EVALUATION" : OrderedDict()} # this will be the main dict for our JSON object wb = xlrd.open_workbook(file) sheet = wb.sheet_by_index(s_index) # getting the data from the worksheet data = [[sheet.cell_value(r, c) for c in range(sheet.ncols)] for r in range(sheet.nrows)] # fill the dict with data: for _,row in enumerate(data[3:]): if row[1]: # if there's a value, then this is a new categorie element categorie = row[1] d["EVALUATION"][categorie] = [] if categorie: i,e,a = row[3::2][:3] if i or e or a: # as long as there's any data in this row, we write the child element val = OrderedDict([("INDICATEUR", i),("EVALUATION", e),("PROPOSITION D'AMELIORATION", a)]) d["EVALUATION"][categorie].append(val) return d
This returns a dict
which can be easily parsed to json. Screenshot of some output:
import io # for python 2d = readRows(file,0)with io.open('c:\debug\output.json','w',encoding='utf8') as out: out.write(json.dumps(d,indent=2,ensure_ascii=False))
Note: in Python 3, I don't think you need io.open
.
Is pandas not an option? Would add as a comment but don't have the rep.
From Documentation
https://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_excel.htmlhttps://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.to_json.html
df = pandas.read_excel('path_to_file.xls')df.to_json(path_or_buf='output_path.json', orient='table')