
Simple CSV to XML Conversion - Python


One possible solution is to first load the CSV into pandas and then convert it row by row into XML, like so:

    import pandas as pd

    df = pd.read_csv('untitled.txt', sep='|')

With the sample data (assuming '|' as the separator and a header row), the DataFrame looks like this:

              Title                   Type Format  Year Rating  Stars  \
    0  Enemy Behind           War,Thriller    DVD  2003     PG     10
    1  Transformers  Anime,Science Fiction    DVD  1989      R      9

                 Description
    0          Talk about...
    1  A Schientific fiction

Then convert each row to XML with a custom function:

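    def convert_row(row):
        # Fill one <movie> element from the named columns of a DataFrame row.
        return """<movie title="%s">
        <type>%s</type>
        <format>%s</format>
        <year>%s</year>
        <rating>%s</rating>
        <stars>%s</stars>
        <description>%s</description>
    </movie>""" % (
        row.Title, row.Type, row.Format, row.Year, row.Rating, row.Stars, row.Description)

    # axis=1 applies the function to each row; the results are joined into one string.
    print('\n'.join(df.apply(convert_row, axis=1)))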

This gives you a string containing the XML:

    <movie title="Enemy Behind">
        <type>War,Thriller</type>
        <format>DVD</format>
        <year>2003</year>
        <rating>PG</rating>
        <stars>10</stars>
        <description>Talk about...</description>
    </movie>
    <movie title="Transformers">
        <type>Anime,Science Fiction</type>
        <format>DVD</format>
        <year>1989</year>
        <rating>R</rating>
        <stars>9</stars>
        <description>A Schientific fiction</description>
    </movie>

which you can then write to a file or process further.
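
Note that the string formatting above inserts values verbatim, so a title or description containing &, < or a double quote would produce malformed XML. A minimal sketch of an alternative using the standard library's xml.etree.ElementTree, which escapes such characters automatically, might look like this. It is written for Python 3. The root tag "collection", the FIELDS list, the rows_to_xml helper and the second sample row are assumptions made for illustration and are not part of the original data.

```python
import xml.etree.ElementTree as ET

# Child tags in the same order as the columns after the title (assumed layout).
FIELDS = ["type", "format", "year", "rating", "stars", "description"]


def rows_to_xml(rows, root_tag="collection"):
    """Build one XML document from rows shaped like
    (title, type, format, year, rating, stars, description)."""
    root = ET.Element(root_tag)  # hypothetical wrapper element
    for row in rows:
        title, values = row[0], row[1:]
        movie = ET.SubElement(root, "movie", title=str(title))
        for tag, value in zip(FIELDS, values):
            ET.SubElement(movie, tag).text = str(value)
    # encoding="unicode" returns a str instead of bytes
    return ET.tostring(root, encoding="unicode")


rows = [
    ("Enemy Behind", "War,Thriller", "DVD", 2003, "PG", 10, "Talk about..."),
    # Made-up row containing characters that must be escaped in XML.
    ("Tom & Jerry", "Anime", "DVD", 1989, "R", 9, "Cat < mouse"),
]
xml_text = rows_to_xml(rows)
print(xml_text)
```

With the DataFrame from above, passing df.itertuples(index=False) as rows should work as long as the column order matches FIELDS. The resulting string can be written out with an ordinary open(..., "w") call.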

Inspired by this great answer.


Edit: Using the loading method you posted (or rather a version that actually loads the data into a variable):

    import csv

    f = open('movies2.csv')
    csv_f = csv.reader(f)
    data = []
    for row in csv_f:
        data.append(row)
    f.close()

    # Skip the header row.
    print(data[1:])

We get:

[['Enemy Behind', 'War', 'Thriller', 'DVD', '2003', 'PG', '10', 'Talk about...'], ['Transformers', 'Anime', 'Science Fiction', 'DVD', '1989', 'R', '9', 'A Schientific fiction']]

And we can convert to XML with minor modifications:

    def convert_row(row):
        # Same template as before, but filled by position instead of column name.
        return """<movie title="%s">
        <type>%s</type>
        <format>%s</format>
        <year>%s</year>
        <rating>%s</rating>
        <stars>%s</stars>
        <description>%s</description>
    </movie>""" % (row[0], row[1], row[2], row[3], row[4], row[5], row[6])

    print('\n'.join([convert_row(row) for row in data[1:]]))

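The structure is the same, but the values are shifted by one tag: the Type column contains an unquoted comma, so csv.reader splits it into two fields (for example 'War' and 'Thriller'), and the description is dropped because only row[0] to row[6] are used: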

    <movie title="Enemy Behind">
        <type>War</type>
        <format>Thriller</format>
        <year>DVD</year>
        <rating>2003</rating>
        <stars>PG</stars>
        <description>10</description>
    </movie>
    <movie title="Transformers">
        <type>Anime</type>
        <format>Science Fiction</format>
        <year>DVD</year>
        <rating>1989</rating>
        <stars>R</stars>
        <description>9</description>
    </movie>
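
In this output every value sits one tag too far down, because the comma inside the genre field was treated as a column separator. One way to realign the rows before converting them is sketched below. It assumes that the title is always the first field, that exactly five fields follow the genres (format, year, rating, stars, description), and that neither the title nor the description contains a comma. The realign helper and its n_trailing parameter are hypothetical names introduced here.

```python
def realign(row, n_trailing=5):
    """Merge the pieces of the genre column that csv.reader split apart.

    Assumes: row[0] is the title and the last n_trailing fields are
    format, year, rating, stars and description.
    """
    title = row[0]
    genre = ",".join(row[1:-n_trailing])
    return [title, genre] + list(row[-n_trailing:])


# The rows exactly as csv.reader returned them above.
data = [
    ['Enemy Behind', 'War', 'Thriller', 'DVD', '2003', 'PG', '10', 'Talk about...'],
    ['Transformers', 'Anime', 'Science Fiction', 'DVD', '1989', 'R', '9', 'A Schientific fiction'],
]

fixed = [realign(row) for row in data]
print(fixed[0])
# ['Enemy Behind', 'War,Thriller', 'DVD', '2003', 'PG', '10', 'Talk about...']
```

Each realigned row has seven fields, so row[0] to row[6] line up with the tags in convert_row again. If you control the CSV file, quoting the genre field (for example "War,Thriller") is the more robust fix, since csv.reader then keeps it as a single column.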