How to merge two DataFrame columns and apply pandas.to_datetime to it?
You can do everything in the read_csv function:
pd.read_csv('test.csv', parse_dates={'timestamp': ['date','time']}, index_col='timestamp', usecols=['date', 'time', 'o', 'c'])
parse_dates tells the read_csv function to combine the date and time columns into one timestamp column and parse it as a timestamp. (pandas is smart enough to know how to parse a date in various formats.)
index_col sets the timestamp column to be the index.
usecols tells the read_csv function to load only the listed subset of columns.
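For comparison, here is a minimal sketch of the same result done in two steps: read the columns as plain text, then combine them with pd.to_datetime. This may be useful on newer pandas releases, which (as far as I know, from 2.2 onward) deprecate combining columns through parse_dates. The inline CSV text is made-up sample data standing in for test.csv; only the column names date, time, o and c come from the question.

```python
import io
import pandas as pd

# Hypothetical sample data standing in for test.csv
csv_text = """date,time,o,c,vol
2014-01-02,09:30:00,10.0,11.0,100
2014-01-02,09:31:00,11.0,12.0,150
"""

# Load only the columns of interest; date and time stay as strings here
df = pd.read_csv(io.StringIO(csv_text), usecols=["date", "time", "o", "c"])

# Join the two string columns and parse the result as timestamps
df["timestamp"] = pd.to_datetime(df["date"] + " " + df["time"])

# Drop the source columns and use the parsed column as the index
df = df.drop(columns=["date", "time"]).set_index("timestamp")
```

The resulting frame has a DatetimeIndex named timestamp and the columns o and c, which should match what the one-line read_csv call produces.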
As far as loading the data in, I think you've got it. To set the index do this:
st_new = pd.concat([(st.o + st.c) / 2, st.vol], axis=1, ignore_index=True)
st_new.set_index(pd.to_datetime(st.date + " " + st.time), drop=True, inplace=True)
Here's the API documentation for set_index.
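A small runnable sketch of this approach, assuming st already holds the columns date, time, o, c and vol as in the question. The sample values and the column names mid and vol assigned to the result are hypothetical; they are added because ignore_index=True leaves the concatenated columns labelled 0 and 1.

```python
import pandas as pd

# Hypothetical sample frame shaped like the one in the question
st = pd.DataFrame({
    "date": ["2014-01-02", "2014-01-02"],
    "time": ["09:30:00", "09:31:00"],
    "o": [10.0, 11.0],
    "c": [11.0, 12.0],
    "vol": [100, 150],
})

# Average of open and close next to volume; columns come out as 0 and 1
st_new = pd.concat([(st.o + st.c) / 2, st.vol], axis=1, ignore_index=True)

# Give the columns readable names (names chosen here for illustration)
st_new.columns = ["mid", "vol"]

# Build timestamps from the date and time strings and use them as the index
st_new = st_new.set_index(pd.to_datetime(st.date + " " + st.time))
```

Reassigning the result of set_index instead of passing inplace=True is a style choice; both leave st_new indexed by the parsed timestamps.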