Check out linregress from the scipy.stats module. I'm not sure whether it
will handle dates directly. Sample script below:

from scipy.stats import pearsonr
from scipy.stats import linregress
from matplotlib import pyplot as plt
import numpy as np

sat = np.array([595,520,715,405,680,490,565])
gpa = np.array([3.4,3.2,3.9,2.3,3.9,2.5,3.5])

fig1 = plt.figure(1)
ax = plt.subplot(1,1,1)

pearson = pearsonr(sat, gpa)

plt.scatter(sat,gpa, label="data")

# Get linear regression parameters
slope, intercept, r_value, p_value, std_err = linregress(sat, gpa)

# Format the chart
plt.xlabel("SAT Scores")
plt.ylabel("GPA")
plt.title("Scatter Plot with Linear Regression Fit\nY=a*X + b\na=%0.4f, b=%0.4f" % (slope, intercept))
plt.grid()

# Create linear regression x values
x_lr = sat

# Create linear regression y values: Y = slope*X + intercept
y_lr = slope * x_lr + intercept

# Print the results (this form works on both Python 2 and Python 3)
print("Pearson correlation coefficient: %s" % pearson[0])
print("Fit xvalues: %s" % str(x_lr))
print("Fit yvalues: %s" % str(y_lr))
print("Fit slope: %s" % slope)
print("Fit intercept: %s" % intercept)

plt.plot(x_lr, y_lr, label="fit")
plt.legend()
plt.show()
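
On the dates question: as far as I know, linregress (like polyfit) wants
plain numbers, so one option is to convert the datetimes to floats first with
matplotlib.dates.date2num and fit against those. A minimal sketch, assuming
that approach; the dates and usage figures are made up for illustration, and
make_plot is just a hypothetical helper name:

```python
from datetime import datetime

import numpy as np
from matplotlib import pyplot as plt
from matplotlib.dates import date2num
from scipy.stats import linregress

# Made-up sample data, for illustration only
dates = [datetime(2010, 1, 1), datetime(2010, 1, 2),
         datetime(2010, 1, 3), datetime(2010, 1, 4)]
usage_mb = np.array([100.0, 110.0, 120.0, 130.0])

# Convert the datetimes to floats (units of days) so they can be fitted
x = date2num(dates)

# Fit usage against the numeric dates; slope is in Mb per day
slope, intercept, r_value, p_value, std_err = linregress(x, usage_mb)
fit_mb = slope * x + intercept


def make_plot():
    """Plot the data and the fitted line with dates on the x axis."""
    fig, ax = plt.subplots()
    ax.plot(dates, usage_mb, "o", label="data")
    ax.plot(dates, fit_mb, "-", label="fit (%.1f Mb/day)" % slope)
    ax.set_xlabel("Date")
    ax.set_ylabel("Mb used")
    ax.legend()
    fig.autofmt_xdate()  # slant the date labels so they don't overlap
    return fig
```

Calling make_plot() and then plt.show() should bring up the graph. Since the
numeric dates are in days, the slope reads as growth per day.
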
> I have some time series of disk usage that I'd like to do a linear
> regression on and plot on a nice graph with Mb used on the y axis and
> date on the x axis.
>
> I tried to use pylab.polyfit(dates, usage) where:
>
> dates = [datetime(x, y, z), datetime(a, b, c), ...]
> usage = [12123234, 2234235235, ...]
>
> ...but polyfit doesn't like the dates.
>
> How should I do this?
>
> Any example of a nice plot and linear regression using matplotlib?
>
