From: SourceForge.net <no...@so...> - 2007-02-11 13:58:51
Bugs item #1656884, was opened at 2007-02-10 20:31
Message generated for change (Comment added) made by a1s
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=746843&aid=1656884&group_id=140566

Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update.

Category: None
Group: None
Status: Closed
Resolution: Fixed
Priority: 5
Private: No
Submitted By: Drew Ferguson (blacktav)
Assigned to: Nobody/Anonymous (nobody)
Summary: Bad dbf date field strings

Initial Comment:
Hi

Thanks for a life-saver project

An application I have (SwissPerfect) stores bad date strings formatted with spaces including incomplete dates (eg year only).

Attached is a diff on utils.py 2.1.0 with a function to parse such bad date strings and fix them

----------------------------------------------------------------------

>Comment By: alexander smishlajev (a1s)
Date: 2007-02-11 15:58

Message:
Logged In: YES
user_id=8719
Originator: NO

oh yes, it does tolerate, but you'll be getting INVALID_VALUE instead of bad dates (and instead of bad numbers too, but INVALID_VALUE can be converted to numeric zero by simple int() or float() call).

if you need, say, the year number from such dates, you'll have to apply either raw access trick or field type substitute.

----------------------------------------------------------------------

Comment By: Drew Ferguson (blacktav)
Date: 2007-02-11 15:51

Message:
Logged In: YES
user_id=701865
Originator: YES

Hey Alex

You are some dude!

dbfpy 2.2 tolerates all bad data out of the box

Great work, very elegant solution too

Drew

----------------------------------------------------------------------

Comment By: alexander smishlajev (a1s)
Date: 2007-02-11 12:01

Message:
Logged In: YES
user_id=8719
Originator: NO

Drew,

i have just released dbfpy-2.2.0. with ignoreErrors=True you should be able to handle things like zero bytes in numeric fields.
as for date fields containing non-date values, i guess the best hack is to do::

    from dbfpy import record

    class DateString(record.DbfCharacterFieldDef):
        typeCode = "D"
        length = 8

    record.registerField(DateString)

before opening your dbf file. then you will get strings instead of datetime objects, and process them in any way you wish.

----------------------------------------------------------------------

Comment By: Drew Ferguson (blacktav)
Date: 2007-02-11 01:10

Message:
Logged In: YES
user_id=701865
Originator: YES

Hi Alex

Wow, quick response! Thanks

Trap for embedded spaces is fine but partial dates (eg year only) continues to barf.

Yes, enforcing a valid (python) date is not good. Even with access to "broken" raw data this is a scary can-of-worms; expecting a valid python date from applications which evidently permit partial dates (by accident or design) isn't good either.

I hit another issue too where a numeric field barfs on int(value) with "null byte in argument for int()". I presume this is bad data too.

I can't see easy long-term fixes here so I guess I'll live with my hacks and wish you the best.

Again, thanks for all your hard work in dbfpy, and allowing my hacks to be trivial to implement.

If you want the broken source data files let me know.

----------------------------------------------------------------------

Comment By: alexander smishlajev (a1s)
Date: 2007-02-10 21:58

Message:
Logged In: YES
user_id=8719
Originator: NO

yet another quick fix: i've added API to access raw DBF data. with current version you can catch conversion errors and then obtain offending values as::

    dbf.header[field_name_or_number].decodeFromRecord(
        rec.rawFromStream(dbf, record_index))

where rec may be DbfRecord class or any valid record object.

----------------------------------------------------------------------

Comment By: alexander smishlajev (a1s)
Date: 2007-02-10 21:03

Message:
Logged In: YES
user_id=8719
Originator: NO

thank you for reporting it.
it's a +1 for the plan to make the Dbf class to optionally ignore all conversion errors, as i said in
https://sourceforge.net/mailarchive/message.php?msg_id=37890707

however, i am not given to "fix" invalid values by replacing missing date parts by an arbitrary number. perhaps returning None instead of invalid dates would be more consistent...

i have checked in a quick fix allowing date strings to have leading spaces instead of zeroes. if that solves your problem, i will release new dbfpy version as soon as possible.

----------------------------------------------------------------------
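The approach discussed in this thread (accept spaces in place of zeroes, return None rather than invent missing date parts, and read the year from the raw string when needed) could be sketched roughly as follows. This is a standalone illustration and not dbfpy code: the names parse_dbf_date and partial_year are hypothetical, and it assumes the raw field value is the usual 8-character YYYYMMDD string, as obtained for example through the field type substitute shown earlier.

```python
import datetime


def _as_text(raw):
    # Hypothetical helper: accept either bytes or str for the raw field value.
    if isinstance(raw, bytes):
        return raw.decode("ascii", "replace")
    return raw


def parse_dbf_date(raw):
    """Parse an 8-character YYYYMMDD string, treating spaces as zeroes.

    Returns a datetime.date, or None if the value is not a complete,
    valid date. Missing parts are never replaced by arbitrary numbers.
    """
    text = _as_text(raw).replace(" ", "0")
    if len(text) != 8:
        return None
    try:
        return datetime.date(int(text[:4]), int(text[4:6]), int(text[6:8]))
    except ValueError:
        # Non-digits, month or day of zero, year zero, and so on.
        return None


def partial_year(raw):
    """Return the year from a possibly incomplete date string, or None."""
    text = _as_text(raw)[:4].strip()
    try:
        return int(text)
    except ValueError:
        return None
```

With these helpers, a year-only value such as "2007    " yields None from parse_dbf_date, while partial_year still recovers 2007 from it.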