From: Steve H. <sh...@zi...> - 2002-09-04 14:35:32
I'd like to schedule a meeting for 11pm EST to discuss the PyYaml approach to putting dates into data files.

Here are my customers:

1) Mike uses the PyYaml library for a calendar application, and he wants certain unquoted strings to get through the parser without choking. He seems to be happy if YAML just gives him the date/time-ish looking fields as strings, and he will do his own logic to make them into date objects.

2) The YAML standards body, mostly represented by Oren and Brian on this issue, wants PyYaml to conform to the YAML spec, mostly to ensure that YAML becomes a solution for language interoperability.

3) Clark uses the PyYaml library for a scheduling application, and he wants certain unquoted strings to be converted automatically to mxDateTime objects in Python.

Here is the solution that I propose:

1) By default, PyYaml will treat 2002-11-02 (unquoted) as a simple string. This should make Mike happy.

2) I will cast a vote for Oren's option #3 from a prior email: "(3) Drop date and time altogether and live with YAML core types providing no solution to this issue." Oren and Brian both seem to like the idea, so the YAML standards body should be happy.

3) Give Clark an easy hook to convert strings to mxDateTime for use in his application. Although the hook API would be part of the PyYaml library, conceptually the hook implementation would operate at a layer above the YAML parser and below Clark's application. This will hopefully keep Clark happy.

Suppose you have this document:

    Project:
        Start date: 2002-09-15
        End date: 2002-10-04

From YAML's perspective, all the datatypes are strings. You might run this YAML through 3 different programs:

SlideShowell in Python -- For putting this YAML data into a slideshow presentation, the thingies are just strings.
YOD in Ruby -- Again, we would just want to treat the dates as strings. The semantic content of the data just doesn't matter.
Clark's project management software -- Clark would want to upgrade the thingies from strings to mxDateTime values, because he wants to do fancy date arithmetic, etc.

This might be one way he does it:

    import re
    import yaml

    def convertDatesToMxDateTime(value):
        # Upgrade date-looking scalars; everything else stays a string.
        # asMxDateTime is Clark's own conversion helper (not shown here).
        if re.match(r"\d\d\d\d-\d\d-\d\d", value):
            return asMxDateTime(value)
        return value

    def getExtendedLoader(data):
        parser = yaml.Parser(data)
        # setScalarHook is the proposed hook API.
        parser.setScalarHook(convertDatesToMxDateTime)
        return parser

    parser = getExtendedLoader(clarksData)
    for doc in parser.load():
        pass  # do whatever

Basically, we keep YAML simple, but we allow Clark a simple way to extend PyYaml to support his own implicit types. This doesn't violate YAML interoperability, though, because the data is still treated as strings at the YAML layer, and all other YAML parsers--even those without the scalar conversion hooks--will parse his files just fine. Strings make a great lowest common denominator data type.

What do you guys think? Let me know about the IRC time. I'm also around earlier in the day, but we have those nagging time zone issues to worry about. ;)

    People:
        Steve:
            wakes up: 8:30am
            goes to bed: 1:00pm
            where: DC
        Ingy:
            wakes up: 11:00am
            goes to bed: 4:00am
            where: Portland
        Clark:
            wakes up: 5:30am
            goes back to bed: 07:30:00.000am
            activity: swimming

Cheers,

Steve
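For anyone who wants to try the layering idea without touching the parser, here is a minimal sketch that runs on plain Python data. It makes several assumptions: the nested dict stands in for what a strings-only YAML parser would return for the Project document, datetime.date stands in for mxDateTime, and the names convert_dates and apply_scalar_hook are hypothetical, not part of PyYaml.

```python
import re
from datetime import date

# Matches a scalar that is exactly YYYY-MM-DD.
DATE_RE = re.compile(r"(\d{4})-(\d{2})-(\d{2})")


def convert_dates(scalar):
    """Hypothetical scalar hook: upgrade date-looking strings, pass others through."""
    match = DATE_RE.fullmatch(scalar)
    if match is None:
        return scalar
    year, month, day = (int(part) for part in match.groups())
    try:
        # datetime.date is a stand-in for mxDateTime (an assumption).
        return date(year, month, day)
    except ValueError:
        # Looks like a date but is not a real one (e.g. month 13): keep the string.
        return scalar


def apply_scalar_hook(node, hook):
    """Walk already-parsed data and run `hook` on every string value."""
    if isinstance(node, dict):
        # Keys are left alone; only values are converted.
        return {key: apply_scalar_hook(value, hook) for key, value in node.items()}
    if isinstance(node, list):
        return [apply_scalar_hook(item, hook) for item in node]
    if isinstance(node, str):
        return hook(node)
    return node


# Stand-in for the parser output: every scalar is still a string.
parsed = {"Project": {"Start date": "2002-09-15", "End date": "2002-10-04"}}

converted = apply_scalar_hook(parsed, convert_dates)
project = converted["Project"]

# Date arithmetic now works at the application layer.
duration_days = (project["End date"] - project["Start date"]).days
print(duration_days)
```

The conversion happens entirely after parsing, so the original data (and any other tool reading the same file) still sees plain strings.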