Thursday, November 28, 2013

Spreadsheet-like line plot with filled areas in Python

A nice way to show whether a series of values falls within a range that differs per sample is to draw a line plot with a shaded area indicating that range.

Example:



A line plot is easily made in a typical spreadsheet program. Getting the correct region shaded by combining area and line plots, however, was much too cumbersome for me. So I checked whether matplotlib also supports area plots, and luckily it does: fill_between.

The only difficulty I encountered was that matplotlib's plot functions expect numeric coordinates, i.e. a data series with meaningful x and y values, as in a scatter plot. A typical spreadsheet line plot, however, has text labels for all points on the x-axis (as shown in the example). The easiest solution I could come up with was to simply number the samples in the data series as 0, 1, 2 etc. and then set the labels on the x-axis manually with the xticks function.

If there are too many samples in the data series, the x-axis gets too full and labels start to overlap; most spreadsheet programs then simply drop some labels. The code below does the same. The code seems a bit long, but most lines are actually used for reading in the data from a csv file.
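
To make the expected input concrete, here is a small sketch of the file layout and of the parsing step on its own. The sample values, the name `read_rows` and the choice to check the Mod column against the range are all made up for illustration; only the column order and the `;` delimiter come from the post.

```python
import csv
import io

# Made-up sample data in the layout the plotting function expects:
# strip name; Exp; Mod; min; max (header row first, ';' as delimiter)
SAMPLE = """strip;Exp;Mod;min;max
A1;120.0;110.0;90.0;150.0
A2;200.0;260.0;180.0;240.0
A3;310.0;300.0;280.0;340.0
"""


def read_rows(text):
    """Hypothetical helper: parse the csv text into (name, exp, mod, min, max) tuples."""
    reader = csv.reader(io.StringIO(text), delimiter=';')
    next(reader, None)  # skip the header row
    rows = []
    for name, exp, mod, lo, hi in reader:
        rows.append((name, float(exp), float(mod), float(lo), float(hi)))
    return rows


rows = read_rows(SAMPLE)
# Assumption for this sketch: flag samples whose Mod value lies outside [min, max].
outside = [name for name, exp, mod, lo, hi in rows if not lo <= mod <= hi]
print(outside)
```

In the plot, such a sample would show up as a marker outside the shaded area.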

#!/usr/bin/env python
import csv

from pylab import *


violet=(90.0/255.0,36.0/255.0,90.0/255.0)
red=(1.0,0.0,0.0)
green=(0.0,1.0,0.0)


def plotFancy(fn, label,figNum=None,ymin=0.0,ymax=500.0,maxticks=20):
    """
    plots directly from a csv file (with a header row!!)
    file layout:
    column
    1: strip name
    2: Exp
    3: Mod
    4: min
    5: max
    
    use label to name y-axis
    if figNum is supplied the graph will be plotted in the figure with
    that number (and cleared first)
    ymin and ymax determine the scale on the y-axis (i.e. ylim(ymin,ymax))
    maxticks gives the maximum number of ticks (labels) allowed on the x axis
    """
    xlabel_lst=[]
    y_min_lst=[]
    y_max_lst=[]
    y_mod_lst=[]
    y_exp_lst=[]
    # Python 3: the csv module wants a text-mode file opened with newline=''
    with open(fn,newline='') as f:
        reader=csv.reader(f,delimiter=';')
        next(reader,None)  # skip the header row
        for line_lst in reader:
            xlabel_lst.append(line_lst[0])
            y_exp_lst.append(float(line_lst[1]))
            y_mod_lst.append(float(line_lst[2]))
            y_min_lst.append(float(line_lst[3]))
            y_max_lst.append(float(line_lst[4]))
    # number the samples 0, 1, 2, ... to get numeric x coordinates
    cnt_lst=list(range(len(y_mod_lst)))
    if figNum is None:
        figure()
    else:
        figure(figNum)
        clf()
    fill_between(cnt_lst,y_min_lst,y_max_lst,facecolor=green,alpha=1.0)
    # markers only; the color keyword sets the color, so the format string needs no color code
    plot(cnt_lst,y_mod_lst,'o',color=violet,label="%s mod"%(label),ms=12)
    plot(cnt_lst,y_exp_lst,'v',color=red,label="%s exp"%(label),ms=12)
    ylim(ymin,ymax)
    ylabel("%s"%(label))
    grid(True)
    legend(loc=9)
    if(len(xlabel_lst)>maxticks):
        # too many labels: keep maxticks evenly spaced ones, always including the last
        delta=len(xlabel_lst)/float(maxticks-1.0)
        tick_num_lst=[]
        tick_text_lst=[]
        for i in range(maxticks-1):
            index=int(i*delta)
            tick_num_lst.append(index)
            tick_text_lst.append(xlabel_lst[index])
        tick_num_lst.append(len(xlabel_lst)-1)
        tick_text_lst.append(xlabel_lst[-1])
        xticks(tick_num_lst,tick_text_lst,rotation=90)
    else:
        xticks(arange(len(xlabel_lst)),xlabel_lst,rotation=90)
    xlim(0,len(xlabel_lst))
    # show() comes last: outside interactive mode it blocks, so ticks and limits must be set first
    show()
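
The label-dropping branch can also be looked at in isolation, without any plotting. The sketch below uses a hypothetical helper name, `thin_ticks`, and repeats the same arithmetic as the listing; the guard for `maxticks < 2` is an addition that the original code does not have.

```python
def thin_ticks(labels, maxticks=20):
    """Return (positions, texts) holding at most maxticks tick labels.

    Hypothetical helper that mirrors the tick-selection arithmetic of plotFancy.
    """
    n = len(labels)
    if n <= maxticks:
        # few enough labels: keep them all
        return list(range(n)), list(labels)
    if maxticks < 2:
        # added guard: the spacing below divides by (maxticks - 1)
        raise ValueError("maxticks must be at least 2")
    delta = n / float(maxticks - 1)
    positions = [int(i * delta) for i in range(maxticks - 1)]
    positions.append(n - 1)  # always keep the last label
    return positions, [labels[p] for p in positions]


labels = ["s%d" % i for i in range(10)]
positions, texts = thin_ticks(labels, maxticks=4)
print(positions)  # [0, 3, 6, 9]
print(texts)      # ['s0', 's3', 's6', 's9']
```

The two returned lists are what would be passed to xticks as tick positions and tick texts.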