From: Ryan N. <rya...@gm...> - 2009-09-02 18:33:14
Hello,

I've got many 1-D arrays of data which contain occasional NaNs where there weren't any samples at that depth bin. Something like this:

    array([np.nan,1,2,3,np.nan,5,6,7,8,np.nan,np.nan,11,12,np.nan,np.nan,np.nan])

but much bigger, and I have hundreds of them. Most NaNs are isolated between two valid values, but they still make my contour plots look terrible. Rather than just mask them, I want to interpolate so my plot doesn't have holes in it where it need not.

I want to change any NaN which is preceded and followed by a value to the average of those two values. If it only has one valid neighbor, I want to change it to the value of its neighbor.

Here's a simplified version of my code:

    from copy import copy
    import numpy as np

    sample_array = np.array([np.nan,1,2,3,np.nan,5,6,7,8,np.nan,np.nan,11,12,np.nan,np.nan,np.nan])

    # Make a copy so we aren't working on the original
    cast = copy(sample_array)

    # Now iterate over the copy
    for j,sample in enumerate(cast):
        # If this sample is a NaN, let's try to interpolate
        if np.isnan(sample):
            # Get the neighboring values, but make sure we don't index out of bounds
            prev_val = cast[max(j-1,0)]
            next_val = cast[min(j+1,cast.size-1)]
            print "Trying to fix",prev_val,"->",sample,"<-",next_val
            # First try an average of the neighbors
            inter_val = 0.5 * (prev_val + next_val)
            if np.isnan(inter_val):
                # There must have been a neighboring NaN, so just use the only valid neighbor
                inter_val = np.nanmax([prev_val,next_val])
            if np.isnan(inter_val):
                print "   No changes made"
            else:
                print "   Fixed to",prev_val,"->",inter_val,"<-",next_val
                # Now fix the value in the original array
                sample_array[j] = inter_val

After this is run, we have:

    sample_array = array([1,1,2,3,4,5,6,7,8,8,11,11,12,12,np.nan,np.nan])

This works, but is very slow for something that will be on the back end of a web page. Perhaps something that uses masked arrays and some of the numpy.ma methods? I keep thinking there must be some much more clever way of doing this.

-Ryan
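
The rule described above (average of the two neighbors, or the single valid neighbor, always read from the unmodified data) can probably be expressed without a Python loop by comparing the array against shifted copies of itself. The following is only a sketch under that reading of the rule; the function name `fill_isolated_nans` is made up for illustration and is not part of NumPy.

```python
import numpy as np

def fill_isolated_nans(values):
    """Fill each NaN from its immediate neighbors in the original data.

    Hypothetical helper (the name is an assumption, not a NumPy function).
    A NaN with two valid neighbors becomes their average, a NaN with one
    valid neighbor takes that neighbor's value, and a NaN with no valid
    neighbor is left as NaN.
    """
    values = np.asarray(values, dtype=float)

    # Pad both ends with NaN so the first and last samples simply have
    # one "missing" neighbor instead of needing bounds checks.
    padded = np.concatenate(([np.nan], values, [np.nan]))
    prev_val = padded[:-2]   # value to the left of each sample
    next_val = padded[2:]    # value to the right of each sample

    prev_ok = ~np.isnan(prev_val)
    next_ok = ~np.isnan(next_val)

    # Sum of the valid neighbors and how many of them there are (0, 1 or 2).
    total = np.where(prev_ok, prev_val, 0.0) + np.where(next_ok, next_val, 0.0)
    count = prev_ok.astype(int) + next_ok.astype(int)

    # Only touch samples that are NaN and have at least one valid neighbor.
    fixable = np.isnan(values) & (count > 0)

    filled = values.copy()
    filled[fixable] = total[fixable] / count[fixable]
    return filled

sample_array = np.array([np.nan, 1, 2, 3, np.nan, 5, 6, 7, 8,
                         np.nan, np.nan, 11, 12, np.nan, np.nan, np.nan])
fixed = fill_isolated_nans(sample_array)
```

On the sample array this gives the same values as the loop version, including the two trailing NaNs that have no valid neighbor. Because every operation works on whole arrays, the cost per array should be a handful of NumPy passes rather than one Python iteration per sample.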