[libdb-develop] [feature] Date Ranges and Database Manipulations
Status: Inactive
Brought to you by:
morbus
From: Morbus I. <mo...@di...> - 2004-11-29 23:21:42
|
Project: LibDB Version: cvs Component: Code Category: feature requests Priority: normal Assigned to: Anonymous Reported by: Morbus Iff Updated by: Morbus Iff Status: active I've been thinking about this a lot, and I'd like to bounce one possible implementation off your collective heads. In reviewing the database schema, one of the things I don't address is that the FRBR spec mentions that a WEMI could have multiple dates, whereas the model only allows one per Group 1 Entity. The following possible revision allows an unlimited number of dates for any type of record (which follows the FRBR Final Report more closely) - in effect, "dates" become a special type of "annotation". There'd be two new tables: libdb_dates and libdb_dates_types, and they would be the exact same columns and intent as libdb_annotations and libdb_annotations_types (see the database schema for a refresher: http://libdb.com/node/2). Instead of simply an "annotation" column, however, there'd be three content columns: date_start (YYYY-MM-DD), date_end (YYYY-MM-DD) and date_raw (a string). This would allow: * an unlimited number of dates per "thing". * an unlimited number of types of dates. This flexibility allows handling of the "Lord of the Rings", or other fictional date system too. Just define a new "date type" of "Middle-earth Time". Then, when you're assigning a "Middle-earth Time" date to a record, you'd ignore the date_start and date_end (which assume YYYY-MM-DD dates) and use the date_raw instead, filling it with whatever string you need it to be. (The onus of sorting the fictional dates properly then get pushed onto the cataloguer, not us). There are some concerns, however. The first concern I have about this possible implementation is some planned accountability to the system. In the next revision of the database schema, there are plans to include a "date thing was created" and "date thing was modified", such that we can properly show changes in the Drupal "tracker" system (see http://libdb.com/tracker for an example). Since there are database integrity things and are NOT really necessary to the record itself, I'm not sure if they should belong in the libdb_dates table, from a cataloguer/"useful export data" position. It would certainly be easy and "cleaner" to put them in there (and that's probably where I'll lean toward come implementation), but I'm still not sure it's "right". The second concern is the "extra" fields that aren't immediately necessary. In most cases, you really only need that "libdb_manifestation" "12345" has a "date_type" "datereceived1234", and the values of "start", "end", and "raw" - there's really no need for a "referent_table" and "referent_id" as libdb_annotations and libdb_annotations_types require. The one time that I do see a need for them, and why I suggested mimicking libdb_annotations instead of, say, libdb_identifiers, is for a loaning system: "libdb_manifestation" "12345" has a "date_type" "dateILoanedIt2345", values of "start", "end", and "raw", and a "libdb_person", "Kevin Hemenway" - who you actually loaned the manifestation to can be the referent. In retrospect, however, I'm thinking that a "loan" may be better as merely a libdb_relationship, and these libdb_dates would better be represented as special types of identifiers, not annotations (but then, would we need to go overboard with "date I bought", "date I threw away", "date I watched again" relationships?) Morbus Iff Previous comments: ------------------------------------------------------------------------ August 18, 2004 - 09:18 : Morbus Iff There are a few qualms I have with the current data model's date system, and it's all due to implementation, not necessarily bad intent. The intent is innocent: dates should be in the form of YYYY, YYYY-MM, or YYYY-MM-DD. The goal here is to say that a "movie came out in 1997" or, if you know the exact date of theatrical release, "1997-06-02". Same sorta thing with birthdays: you may approximately know that someone was born in "1817" or even in "1817-04", but not the exact moment in time. The almost immediate problem with the above formats, however, is the lack of support for "A.D. 143" or "B.C. 23" (though I hesitate to label that a critical issue.) Technically, "A.D. 143" could be referenced as just "143" (YYY, not YYYY), but I'm not sure the underlying database will support that early of a year - regardless, MySQL would convert a date of "AD 43" ("43") to "1943", which isn't what we want. We'd have to zeropad non-YYYY dates ("0143" and "0043"). This doesn't help us with B.C. dates at all. See also http://dev.mysql.com/doc/mysql/en/DATETIME.html: "The supported range is '1000-01-01 00:00:00' to '9999-12-31 23:59:59'. (``Supported'' means that although earlier values might work, there is no guarantee that they will.)" Irregardless of how early a date is, our current data model expects to receive YYYY-MM-DD and ONLY YYYY-MM-DD - there's no capability for just YYYY or YYYY-MM. After some discussion on a #mysql IRC channel, it appears that the only way to handle ranges (where "1997" is a range of ["1997-01-01" to "1997-12-31"] and "1997-04" is ["1997-04-01" to "1997-04-31"]) is to have two database fields: date_start and date_end. It'd then be necessary in our code to create the ranges ourselves (from a user-entered "1997"). This is the only way to do the "right" thing, with proper database columns, and use of WHERE date/time clauses (WHERE date_start > '2004-04-25' for example). * Is this the "right" thing to do? * What about fictional dates ("Shiresmoon 23")? Affected tables: libdb_work libdb_expression libdb_manifestation libdb_person libdb_corporate_body libdb_events. ------------------------------------------------------------------------ August 18, 2004 - 16:23 : Little Nemo All right, I'll bite. Here's a superfluous oneshot bone-head suggestion. A column in each table to be used as a date modifier that can only be represented as AD OR BC in a pull down menu. WARNING - This would require the user to know the difference between BC and AD. :p I'm not sure what the column should be identified as in the schema however. Perhaps "date_type"? ------------------------------------------------------------------------ August 18, 2004 - 16:36 : Little Nemo Just reread this thread after posting. Using my suggestion would of course alter this: WHERE date_start > '2004-04-25' to this: WHERE date_start > '2004-04-25' AND date_type = 'AD' Oh and insofar as fictional events are concerned, if someone reallly wants them, I'm sure they can convert them to "actual" dates. ------------------------------------------------------------------------ August 24, 2004 - 17:28 : Little Nemo Regarding this post: One can't reasonably assume that a periodical/serial with a publication date of Nov/Dec actually took 2 months to produce, so why not simply assign the publication date as 2004-11-01? The publication couldn't have been produced any earlier right? As for Spring, why not 2004-03-21 (assuming the publisher is in the Northern Hemisphere)? I just don't think altering the code, much less the db, to get the dates "technically right" makes much sense when one can make a reasonable estimation. I think we need to work with what we've got, and what we've got (I think) is good enough. At least for now. And if anyone truly wants to know "fictional dates" I'm sure they can convert them to "normal" calendar dates. LOTR fans I'm looking at you. -- View: http://drupal.org/node/10160 Edit: http://drupal.org/project/comments/add/10160 |