Saturday, September 13, 2014

Simple txt2csv python script

Often I have a text files with columns that are separated by spaces or tabs or a mixture of them. These files can be read by a typical spreadsheet program but this usually requires some extra mouse clicks to tell the spreadsheet program how to interpret the text file. Csv files (comma separated files) are often much better recognized by a spreadsheet program. Therefore I have written a simple python function/script to change a text file to csv file where spaces and tabs are replaced by a separator (default set to semicolon). Multiple spaces are combined into a single separator (which is what I typically want). Leading and trailing spaces on a row are removed. It is amazing how little python code is needed for this (thanks to python's nice string handling). The magic python line csvline=s.join(line.split()) in the script below is based on a very useful item on Stack Overflow.


import sys

def txt2csv(fnIn,fnOut,separator=';'):
    fin=open(fnIn,'r')
    if(not fin):
        print "Error opening %s"%(fnIn)
        sys.exit(-1)
    fout=open(fnOut,'w')
    for line in fin.readlines():
        s=separator
        csvline=s.join(line.split())
        fout.write(csvline+'\n')
    fin.close()
    fout.close()

if __name__== "__main__":
    try:
        fnIn=sys.argv[1]
        fnOut=sys.argv[2]
    except:
        print "Usage: python %s fnIn fnOut"%(sys.argv[0])
        
    txt2csv(fnIn,fnOut)