Python - Convert to datetime64 format with to_datetime()
I'm trying to convert date-time data to datetime64 format using pandas.to_datetime(). It is not working: the type of df['time'] is still object. What is wrong?
Please note that I have attached the time file.
My code:

    import pandas as pd
    import numpy as np
    from datetime import datetime

    f = open('time', 'r')
    lines = f.readlines()
    t = []
    for line in lines:
        # Second whitespace-separated field, last 20 characters
        time = line.split()[1][-20:]
        # Replace the ':' between date and time with a space
        time2 = time[:11] + ' ' + time[12:21]
        t.append(time2)

    df = pd.DataFrame(t)
    df.columns = ['time']
    df['time'] = pd.to_datetime(df['time'])
    print(df['time'])

Output (last line):

    Name: time, Length: 16136, dtype: object
Please find the attached time data file here.
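The file time contains invalid data.
For example, line 8323 contains 8322 "5/jul/2013::8:25:18 0530", which differs from normal lines such as 8321 "15/jul/2013:18:25:18 +0530": the day has a single digit, there is a doubled colon before the hour, and the offset lacks its + sign.

    8321 "15/jul/2013:18:25:18 +0530"
    8322 "5/jul/2013::8:25:18 0530"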
For a normal line, time2 becomes 15/jul/2013 18:25:18, but for the invalid line it becomes "5/jul/2013 :8:25:18 (note the stray leading quote):

    15/jul/2013 18:25:18
    "5/jul/2013 :8:25:18
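To see exactly what the slicing produces, here is a minimal sketch that applies the question's slicing to the two sample lines quoted above. The helper name extract_time is hypothetical and introduced only for illustration; the sample lines are assumed to be exactly as shown.

```python
# Two sample lines copied from the text above (the real file is much longer).
lines = [
    '8321 "15/jul/2013:18:25:18 +0530"',
    '8322 "5/jul/2013::8:25:18 0530"',
]


def extract_time(line):
    """Apply the same slicing as the question's loop to a single line."""
    time = line.split()[1][-20:]           # second field, last 20 characters
    return time[:11] + ' ' + time[12:21]   # swap the date/time separator for a space


for line in lines:
    print(repr(extract_time(line)))
# '15/jul/2013 18:25:18'
# '"5/jul/2013 :8:25:18'
```

Because the malformed field is one character shorter than a normal one, the fixed-width slice keeps the opening quote, and the result no longer looks like a date.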
This causes some lines to be parsed as datetime and others not, so the data is coerced to object (to hold both datetime and string values).
    >>> pd.Series(pd.to_datetime(['15/jul/2013 18:25:18', '15/jul/2013 18:25:18']))
    0   2013-07-15 18:25:18
    1   2013-07-15 18:25:18
    dtype: datetime64[ns]
    >>> pd.Series(pd.to_datetime(['15/jul/2013 18:25:18', '*5/jul/2013 18:25:18']))
    0    15/jul/2013 18:25:18
    1    *5/jul/2013 18:25:18
    dtype: object
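One possible way to locate the offending rows is to pass errors='coerce', which turns unparseable entries into NaT. This is a sketch, not necessarily how the original problem was solved. It assumes a pandas version that supports the errors keyword of to_datetime, and the format string is my assumption based on the sample values above.

```python
import pandas as pd

# Same two sample values as in the session above; the second one is malformed.
t = ['15/jul/2013 18:25:18', '*5/jul/2013 18:25:18']
df = pd.DataFrame({'time': t})

# errors='coerce' replaces entries that do not match the format with NaT
# instead of leaving the whole column as strings.
parsed = pd.to_datetime(df['time'], format='%d/%b/%Y %H:%M:%S', errors='coerce')

# Original strings of the rows that failed to parse; inspect these in the file.
bad_rows = df.loc[parsed.isna(), 'time']
print(bad_rows)

# The column now has a datetime64 dtype, with NaT in place of the bad row.
df['time'] = parsed
print(df['time'].dtype)
```

The index of bad_rows points back to the positions in t, which makes it easier to find and fix the corresponding lines in the source file.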
If you take only the first 5 entries from the file (which have the correct date format), you get what you expected.

    ...
    df = pd.DataFrame(t[:5])
    df.columns = ['time']
    df['time'] = pd.to_datetime(df['time'])

The above code yields:

    0   2013-07-15 00:00:12
    1   2013-07-15 00:00:18
    2   2013-07-15 00:00:23
    3   2013-07-15 00:00:27
    4   2013-07-15 00:00:29
    Name: time, dtype: datetime64[ns]
Update
Added a small example to show the cause of the dtype being object rather than datetime.