From: John H. <jdh...@ac...> - 2004-09-16 15:17:30
First, a general note about python2.2. It is becoming difficult to maintain adequate support for python2.2. The pyparsing module, on which mathtext relies, currently requires 2.3. Handling dates properly requires the datetime module (or mx.datetime, but I'm not inclined to impose an external dependency). I am inclined to gently drop support for 2.2. By gently, I mean that some features will no longer work (mathtext and dates) but the core should, at least for the near future. How many people would this adversely affect?

The dates module, aside from a bug that I patched yesterday in response to a post by Jim Boyle, has two fundamental problems: no timezone support, and the date range supported by the built-in time functions (the 1970 epoch) is too narrow. Both of these limitations are imposed by trying to support python2.2.

I would like to rewrite the dates module, and the ticker functions for dates, to use the python datetime module. Getting dates, timezones, and daylight saving time right is non-trivial, and I think the cleanest approach is to require python2.3 and datetime. I.e., I would jettison support for epoch times and mx datetimes, as well as the converter stuff. The new plot_date signature would be

    def plot_date(self, d, y, fmt='bo', **kwargs):

where d would be an array of floats (no converter) and the floats would be the number of days since 0001-01-01 (Gregorian calendar). The supported date range would be datetime.min to datetime.max (years 0001 - 9999).

The dates module would provide some helper functions that you could use to build date arrays from datetime and timedelta instances. It would not be too hard to add some helper functions to convert existing epoch, mx, or datetime arrays to the required array of days floats.

Timezones, including timezones other than the local one, would be supported. I.e., if you are a financial guru in California, you could work with Eastern time zone stock quotes or Central time zone pork belly quotes.

Daylight saving time, etc., would be handled by the datetime module.

The datetime module has functions toordinal and fromordinal to convert to an integer number of days since the start of the Gregorian calendar, but not floating point. I.e., hours, minutes, seconds, etc. are lost. My guess is that it is done this way to avoid imprecision in floating point, but I am not sure. I have implemented to_ordinalf and from_ordinalf to do these conversions preserving the hours, etc. They seem to work. I occasionally get rounding error on the order of a couple of microseconds, which I think should be tolerable for the vast majority of cases. If you need microsecond precision, you can use plot and not plot_date in any case.

Below, I'm including some prototype code which does these conversions. If you have interest or experience with dates and timezones, please look over it to see if I'm making any fundamental or conceptual errors. There is also a function drange, which can be used to construct the floating point days arrays plot_date would require.

Any other suggestions for improvement or changes to date handling are welcome. Speak now, or forever hold your peace!

JDH

import sys, datetime
from matplotlib.numerix import arange
from matplotlib.dates import Central, Pacific, Eastern, UTC

HOURS_PER_DAY = 24.
MINUTES_PER_DAY = 60.*HOURS_PER_DAY
SECONDS_PER_DAY = 60.*MINUTES_PER_DAY
MUSECONDS_PER_DAY = 1e6*SECONDS_PER_DAY

#tz = None
tz = Pacific
#tz = UTC

def close_to_dt(d1, d2, epsilon=5):
    'assert that datetimes d1 and d2 are within epsilon microseconds'
    delta = d2-d1
    mus = abs(delta.days*MUSECONDS_PER_DAY + delta.seconds*1e6 +
              delta.microseconds)
    assert(mus<epsilon)

def close_to_ordinalf(o1, o2, epsilon=5):
    'assert that float ordinals o1 and o2 are within epsilon microseconds'
    delta = abs((o2-o1)*MUSECONDS_PER_DAY)
    assert(delta<epsilon)

def to_ordinalf(dt):
    """
    convert datetime to the Gregorian date as UTC float days,
    preserving hours, minutes, seconds and microseconds.
    return value is a float
    """
    if dt.tzinfo is not None:
        delta = dt.tzinfo.utcoffset(dt)
        if delta is not None:
            dt -= delta

    base = dt.toordinal()
    return (base + dt.hour/HOURS_PER_DAY + dt.minute/MINUTES_PER_DAY +
            dt.second/SECONDS_PER_DAY + dt.microsecond/MUSECONDS_PER_DAY
            )

def from_ordinalf(x, tz=None):
    """
    convert Gregorian float of the date, preserving hours, minutes,
    seconds and microseconds. return value is a datetime
    """
    ix = int(x)
    dt = datetime.datetime.fromordinal(ix)
    remainder = x - ix
    hour, remainder = divmod(24*remainder, 1)
    minute, remainder = divmod(60*remainder, 1)
    second, remainder = divmod(60*remainder, 1)
    microsecond = int(1e6*remainder)
    dt = datetime.datetime(
        dt.year, dt.month, dt.day, int(hour), int(minute), int(second),
        microsecond, tzinfo=UTC())
    if tz is not None:
        return dt.astimezone(tz)
    else:
        return dt

def drange(dstart, dend, delta):
    """
    Return a date range as float gregorian ordinals. dstart and dend
    are datetime instances. delta is a datetime.timedelta instance
    """
    step = (delta.days + delta.seconds/SECONDS_PER_DAY +
            delta.microseconds/MUSECONDS_PER_DAY)
    f1 = to_ordinalf(dstart)
    f2 = to_ordinalf(dend)
    return arange(f1, f2, step)

# round trip: datetime -> float days -> datetime
dt = datetime.datetime(1011, 10, 9, 13, 44, 22, 101010, tzinfo=tz)
x = to_ordinalf(dt)
newdt = from_ordinalf(x, tz)
close_to_dt(dt, newdt)

# three days in eight hour steps
date1 = datetime.datetime( 2000, 3, 2, tzinfo=tz)
date2 = datetime.datetime( 2000, 3, 5, tzinfo=tz)
delta = datetime.timedelta(hours=8)
print drange(date1, date2, delta)

# 4am Pacific and 12pm UTC are the same instant
d1 = datetime.datetime( 2000, 3, 2, 4, tzinfo=tz)
d2 = datetime.datetime( 2000, 3, 2, 12, tzinfo=UTC())
o1 = to_ordinalf(d1)
o2 = to_ordinalf(d2)
close_to_ordinalf(o1, o2)

print 'all tests passed'
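
For readers trying the float-days idea on a current interpreter, here is a minimal sketch of the same round trip in Python 3, using only the standard library. It is not the code from this post: the names to_days, from_days, epoch_to_days and EPOCH_ORDINAL are hypothetical, and the epoch helper is only one guess at what the "convert existing epoch arrays" helper mentioned above could look like.

```python
import datetime

UTC = datetime.timezone.utc
SECONDS_PER_DAY = 86400.0
# Proleptic Gregorian ordinal of 1970-01-01, the Unix epoch.
EPOCH_ORDINAL = datetime.date(1970, 1, 1).toordinal()


def to_days(dt):
    """Return dt as float days in UTC, where 0001-01-01 00:00 is 1.0.

    Hypothetical helper; naive datetimes are assumed to be UTC already.
    """
    if dt.tzinfo is not None:
        # Normalize to UTC, then drop tzinfo so the subtraction is naive.
        dt = dt.astimezone(UTC).replace(tzinfo=None)
    midnight = datetime.datetime(dt.year, dt.month, dt.day)
    # timedelta / timedelta gives the fraction of the day as a float.
    return dt.toordinal() + (dt - midnight) / datetime.timedelta(days=1)


def from_days(x, tz=UTC):
    """Inverse of to_days: float days to an aware datetime in tz."""
    whole = int(x)
    base = datetime.datetime.fromordinal(whole).replace(tzinfo=UTC)
    # timedelta rounds the fraction to the nearest microsecond.
    dt = base + datetime.timedelta(days=x - whole)
    return dt.astimezone(tz)


def epoch_to_days(seconds):
    """Convert Unix epoch seconds (UTC) to float days (hypothetical)."""
    return EPOCH_ORDINAL + seconds / SECONDS_PER_DAY
```

A double has 53 bits of mantissa, so for dates around the year 2000 (ordinals near 730000) adjacent representable values are roughly 10 microseconds apart. That is consistent with the microsecond-level rounding error described above, and it is why exact round trips should not be expected.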