[Matplotlib-users] Filling in missing samples by interpolating.

Hello,

I've got many 1-D arrays of data which contain occasional NaNs where there
were no samples in that depth bin. Something like this...
array([np.nan,1,2,3,np.nan,5,6,7,8,np.nan,np.nan,11,12,np.nan,np.nan,np.nan])

But much bigger, and I have hundreds of them. Most NaNs are isolated
between two valid values, but they still make my contour plots look
terrible.

Rather than just mask them, I want to interpolate so my plot doesn't have
holes in it where it need not.
I want to change any NaN which is preceded and followed by a valid value to
the average of those two values.
If it has only one valid neighbor, I want to change it to the value of its
neighbor.

Here's a simplified version of my code:

from copy import copy
import numpy as np

sample_array = np.array([np.nan, 1, 2, 3, np.nan, 5, 6, 7, 8,
                         np.nan, np.nan, 11, 12, np.nan, np.nan, np.nan])
# Make a copy so we aren't working on the original
cast = copy(sample_array)
# Now iterate over the copy
for j, sample in enumerate(cast):
    # If this sample is a NaN, let's try to interpolate
    if np.isnan(sample):
        # Get the neighboring values, but make sure we don't index out of
        # bounds
        prev_val = cast[max(j - 1, 0)]
        next_val = cast[min(j + 1, cast.size - 1)]
        print("Trying to fix", prev_val, "->", sample, "<-", next_val)
        # First try an average of the neighbors
        inter_val = 0.5 * (prev_val + next_val)
        if np.isnan(inter_val):
            # There must have been a neighboring NaN, so just use the only
            # valid neighbor (this is still NaN if both neighbors are NaN)
            inter_val = np.nanmax([prev_val, next_val])
        if np.isnan(inter_val):
            print("   No changes made")
        else:
            print("   Fixed to", prev_val, "->", inter_val, "<-", next_val)
            # Now fix the value in the original array
            sample_array[j] = inter_val

After this is run, we have:
sample_array = array([1,1,2,3,4,5,6,7,8,8,11,11,12,12,np.nan,np.nan])

This works, but is very slow for something that will be on the back end of a
web page.
Perhaps something that uses masked arrays and some of the numpy.ma methods?
I keep thinking there must be some much more clever way of doing this.
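
For what it's worth, the rule above can probably be expressed with masked
arrays and no Python loop. The following is only a sketch, not a tested
replacement: fill_isolated_nans is a made-up name, and it assumes 1-D float
input where each NaN should be filled only from its immediate neighbors in
the original data (so runs of three or more NaNs keep a hole in the middle,
just like the loop).

```python
import numpy as np


def fill_isolated_nans(values):
    """Hypothetical helper: fill each NaN from its immediate neighbors."""
    values = np.asarray(values, dtype=float)
    # Pad both ends with NaN so the first and last samples simply have
    # one (missing) neighbor instead of indexing out of bounds.
    padded = np.concatenate(([np.nan], values, [np.nan]))
    # Row 0 holds each sample's previous neighbor, row 1 its next neighbor.
    neighbors = np.ma.masked_invalid(np.vstack((padded[:-2], padded[2:])))
    # The masked mean skips NaN neighbors: two valid neighbors give their
    # average, one valid neighbor gives that value, none stays masked.
    neighbor_mean = neighbors.mean(axis=0)
    # Only touch samples that are NaN and have at least one valid neighbor.
    fixable = np.isnan(values) & ~np.ma.getmaskarray(neighbor_mean)
    filled = values.copy()
    filled[fixable] = neighbor_mean.data[fixable]
    return filled


sample_array = np.array([np.nan, 1, 2, 3, np.nan, 5, 6, 7, 8,
                         np.nan, np.nan, 11, 12, np.nan, np.nan, np.nan])
result = fill_isolated_nans(sample_array)
print(result)
```

For the sample array this should give the same values as the loop. Unlike
the loop, it returns a new array rather than modifying sample_array in place.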

-Ryan