Python - Convert to datetime64 format with to_datetime()
I'm trying to convert some date-time data to datetime64 format with pandas.to_datetime(). It is not working, and the type of df['time'] is object. What is wrong?
Please note that I have attached the time file.
My code:

    import pandas as pd
    import numpy as np
    from datetime import datetime

    f = open('time', 'r')
    lines = f.readlines()

    t = []
    for line in lines:
        # Take the last 20 characters of the second field, e.g. 15/Jul/2013:18:25:18
        time = line.split()[1][-20:]
        # Replace the ':' between date and time with a space
        time2 = time[:11] + ' ' + time[12:21]
        t.append(time2)

    df = pd.DataFrame(t)
    df.columns = ['time']
    df['time'] = pd.to_datetime(df['time'])
    print(df['time'])

Output:

    Name: time, Length: 16136, dtype: object

Please find the attached time data file here.
The file time contains invalid data.
For example, line 8323 contains 8322 "5/Jul/2013::8:25:18 0530", which differs from normal lines such as 8321 "15/Jul/2013:18:25:18 +0530".

    8321 "15/Jul/2013:18:25:18 +0530"
    8322 "5/Jul/2013::8:25:18 0530"

For the normal line, time2 becomes 15/Jul/2013 18:25:18, but for the invalid line it becomes "5/Jul/2013::8:25:18.

    15/Jul/2013 18:25:18
    "5/Jul/2013::8:25:18

This causes some lines to be parsed as datetime and others not, so the data is coerced to object (to hold both datetimes and strings).
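One way to locate such lines is to ask pandas to mark unparseable values instead of leaving them as strings. The following is a minimal sketch, assuming a pandas version in which `pd.to_datetime` accepts the `format` and `errors='coerce'` arguments; the two sample strings are the values quoted above, and the variable names `raw`, `parsed` and `bad` are illustrative only.

```python
import pandas as pd

# Two sample values taken from the text: one well-formed, one malformed.
raw = pd.Series(['15/Jul/2013 18:25:18', '"5/Jul/2013::8:25:18'])

# errors='coerce' turns strings that do not match the format into NaT,
# so the result keeps a datetime64 dtype instead of falling back to object.
parsed = pd.to_datetime(raw, format='%d/%b/%Y %H:%M:%S', errors='coerce')

# Rows that failed to parse are NaT; the mask selects the original strings.
bad = raw[parsed.isna()]

print(parsed.dtype)
print(bad)
```

Inspecting `bad` shows which input strings need cleaning, and `bad.index` gives their positions in the original data.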
A small example of the coercion to object:

    >>> pd.Series(pd.to_datetime(['15/Jul/2013 18:25:18', '15/Jul/2013 18:25:18']))
    0   2013-07-15 18:25:18
    1   2013-07-15 18:25:18
    dtype: datetime64[ns]
    >>> pd.Series(pd.to_datetime(['15/Jul/2013 18:25:18', '*5/Jul/2013 18:25:18']))
    0    15/Jul/2013 18:25:18
    1    *5/Jul/2013 18:25:18
    dtype: object

If you take only the first 5 entries from the file (which have the correct date format), you get what you expected.

    ...
    df = pd.DataFrame(t[:5])
    df.columns = ['time']
    df['time'] = pd.to_datetime(df['time'])

The above code yields:

    0   2013-07-15 00:00:12
    1   2013-07-15 00:00:18
    2   2013-07-15 00:00:23
    3   2013-07-15 00:00:27
    4   2013-07-15 00:00:29
    Name: time, dtype: datetime64[ns]

Update
Added a small example to show why the dtype is object rather than datetime.
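Putting the pieces together, a more defensive version of the parsing loop might look like the sketch below. It is not the original poster's code: `parse_log_times` and the `TIMESTAMP` pattern are hypothetical names, the line layout is assumed from the two sample lines quoted above, and the timezone offset is dropped just as in the question's slicing.

```python
import re
import pandas as pd

# Assumed line layout, based on the samples quoted above:
#   8321 "15/Jul/2013:18:25:18 +0530"
TIMESTAMP = re.compile(r'"(\d{1,2}/[A-Za-z]{3}/\d{4}):(\d{2}:\d{2}:\d{2}) ')

def parse_log_times(lines):
    """Return a DataFrame with a datetime64 'time' column; bad lines become NaT."""
    cleaned = []
    for line in lines:
        match = TIMESTAMP.search(line)
        # Keep None for lines that do not match so row positions are preserved.
        cleaned.append(f"{match.group(1)} {match.group(2)}" if match else None)
    times = pd.to_datetime(
        pd.Series(cleaned, dtype="object"),
        format="%d/%b/%Y %H:%M:%S",
        errors="coerce",  # anything that still fails to parse becomes NaT
    )
    return pd.DataFrame({"time": times})

sample = [
    '8321 "15/Jul/2013:18:25:18 +0530"',
    '8322 "5/Jul/2013::8:25:18 0530"',
]
df = parse_log_times(sample)
print(df["time"].dtype)
print(df[df["time"].isna()])
```

For the real file, the list passed in would be the result of `f.readlines()`. Matching the whole timestamp with a regular expression, rather than slicing by fixed character positions, means a malformed line is rejected outright instead of producing a partly shifted string.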