Create a 350000 column CSV file by merging smaller CSV files
I have 350,000 one-column CSV files, each containing 200 - 2000 numbers printed one under another. The numbers are formatted like this: -1.32% (no quotes). I want to merge the files to create a monster of a CSV file where each file becomes a separate column. The merged file will have 2000 rows maximum (each column may have a different length) and 350,000 columns.
I thought of doing it in MySQL, but there is a 30000-column limit. An awk or sed script would do the job, but I don't know them well and I'm afraid it would take a very long time. I can use a server if the solution requires one. Any suggestions?
This Python script does what you want:
#!/usr/bin/env python2
import sys
import codecs

# Open every file named on the command line; each becomes one column.
fhs = [codecs.open(filename, 'r', 'utf-8') for filename in sys.argv[1:]]
count = len(fhs)
done = [False] * len(fhs)

while count > 0:
    delim = ''
    for i, fh in enumerate(fhs):
        line = '' if done[i] else fh.readline()
        if not line and not done[i]:
            # This file is exhausted; its column stays empty from now on.
            done[i] = True
            count -= 1
        sys.stdout.write(delim)
        delim = ','
        sys.stdout.write(line.rstrip())
    sys.stdout.write('\n')

for fh in fhs:
    fh.close()
Call it with the CSV files you want to merge, and it prints the new file to stdout.
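On Python 3, the same column-wise merge can be sketched with itertools.zip_longest, which reads all files in lockstep and pads the shorter ones with empty cells. The function name merge_columns is mine, not from the answer:

```python
import sys
from itertools import zip_longest

def merge_columns(paths, out):
    # Iterate all files in parallel; exhausted files yield '' cells,
    # so every row has one cell per input file.
    files = [open(p) for p in paths]
    try:
        for row in zip_longest(*files, fillvalue=''):
            out.write(','.join(cell.rstrip('\n') for cell in row) + '\n')
    finally:
        for f in files:
            f.close()

if __name__ == '__main__':
    merge_columns(sys.argv[1:], sys.stdout)
```
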
Note that you can't merge all the files at once; for one thing, you can't pass 350,000 file names as arguments to a process, and secondly, a process can only open 1024 files at once.
So you'll have to do it in several passes, i.e. merge files 1-1000, 1001-2000, etc. Then you should be able to merge the 350 resulting intermediate files at once.
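The batching arithmetic above can be sketched like this (chunks is a hypothetical helper, not part of the answer's script):

```python
def chunks(seq, size):
    # Yield consecutive slices of at most `size` items from seq.
    for i in range(0, len(seq), size):
        yield seq[i:i + size]

# 350,000 input files in batches of 1,000 leaves 350 intermediate
# files -- few enough to merge in a single final pass.
```
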
Or write a wrapper script that uses os.listdir() to get the names of all the files and calls this script several times.
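A minimal sketch of such a two-pass wrapper, written in Python 3 with an inline merge function rather than shelling out to the script above; the names merge_columns, merge_all, and BATCH are my own assumptions:

```python
import os
from itertools import zip_longest

BATCH = 1000  # comfortably below the 1024 open-file limit

def merge_columns(paths, out_path):
    # One merge pass: each input file becomes one column of out_path.
    files = [open(p) for p in paths]
    try:
        with open(out_path, 'w') as out:
            for row in zip_longest(*files, fillvalue=''):
                out.write(','.join(c.rstrip('\n') for c in row) + '\n')
    finally:
        for f in files:
            f.close()

def merge_all(src_dir, work_dir):
    # Pass 1: merge the source files in batches of BATCH columns.
    names = sorted(os.listdir(src_dir))
    parts = []
    for i in range(0, len(names), BATCH):
        part = os.path.join(work_dir, 'part_%06d.csv' % (i // BATCH))
        merge_columns([os.path.join(src_dir, n) for n in names[i:i + BATCH]], part)
        parts.append(part)
    # Pass 2: merge the intermediates into the final wide CSV.
    # Caveat: an intermediate that runs out of rows contributes a single
    # empty cell here, not one per column, so trailing rows may be ragged.
    final = os.path.join(work_dir, 'merged.csv')
    merge_columns(parts, final)
    return final
```
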