|
From: Marc T. <mar...@gm...> - 2009-03-01 20:12:20
|
I'm using ConfigObj to parse Medicare's list of error/comment codes (when a
physician receives an electronic explanation of benefits, it's full of these
codes telling him/her exactly why Medicare isn't paying squat.) The file is
called "Codes.ini", and generally follows INI conventions, except that:
- the first line is a release date - no "=", no nothing. I got around that by
  opening the file first, doing a readlines(), and pop()ing the date off for
  use elsewhere.
- at about 75%, the following appears:
> [188]
> Message=This product/procedure is only covered when used according to FDA
> recommendations.
> EffDate=6/30/2005
> DeactDate=
> Modified=
> Note=
> [189]
> Message='Not otherwise classified' or 'unlisted' procedure code (CPT/HCPCS)
> was billed when there is a specific procedure code for this
> procedure/service
> EffDate=6/30/2005
> DeactDate=
> Modified=
> Note=
(I include number 188 just to show what these entries look like - the trouble
begins with the Message for number 189.) Here's what I get:
> Traceback (most recent call last):
>   File "E:\fsrPy\fsrStuff\mcrCodes.pyw", line 19, in <module>
>     codeFile = mcrCodes()
>   File "E:\fsrPy\fsrStuff\mcrCodes.pyw", line 17, in __init__
>     self.codes = ConfigObj(tmpCodes, raise_errors=True)
>   File "C:\Python25\lib\site-packages\configobj.py", line 1272, in __init__
>     self._load(infile, configspec)
>   File "C:\Python25\lib\site-packages\configobj.py", line 1341, in _load
>     self._parse(infile)
>   File "C:\Python25\lib\site-packages\configobj.py", line 1687, in _parse
>     ParseError, infile, cur_index)
>   File "C:\Python25\lib\site-packages\configobj.py", line 1748, in _handle_error
>     raise error
> configobj.ParseError: Parse error in value at line 5888.
>
There aren't any "real" quotes in this file - is there any way to tell
ConfigObj to ignore any single-quote characters it finds? If not, any clever
ideas for a workaround?
I did think of replacing "'" with "-" or something, but there are plenty of
contractions in this file, and "doesn-t" doesn't look very pretty. On the
other hand, searching for all matched pairs of quotes that appear in the same
data item would require me to... well, to re-invent much of the guts of
ConfigObj itself, no?
If you're interested, the code file is available here:
http://www.cms.hhs.gov/AccesstoDataApplication/Downloads/MedicareRemitEasyPrint.zip
(it's inside the zip file).
Thanks!
--
www.fsrtechnologies.com
|
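One way to see which entries are likely to trip the parser is to scan for
values that open with a quote character and then carry on after the matching
closing quote, as the Message for 189 does. The sketch below is only a
diagnostic under that assumption: it does not use ConfigObj, the helper name
find_suspect_lines is made up for illustration, the sample lines are
abbreviated, and it is written for Python 3 rather than the Python 2.5 shown
in the traceback.

```python
import re

# Hypothetical diagnostic: flag "key=value" lines whose value opens with a
# quote and still has text after the matching closing quote.
_SUSPECT = re.compile(r"""^[^=\[#]+=\s*(['"]).*?\1\s*\S""")


def find_suspect_lines(lines):
    """Return (line_number, text) pairs for lines that look risky."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if _SUSPECT.match(line):
            hits.append((number, line.rstrip("\n")))
    return hits


# Abbreviated sample data modelled on the excerpt in the post.
sample = [
    "[188]\n",
    "Message=This product/procedure is only covered when used as directed.\n",
    "[189]\n",
    "Message='Not otherwise classified' or 'unlisted' procedure code was billed\n",
    "Note=doesn't apply\n",
]

for number, text in find_suspect_lines(sample):
    print(number, text)
```

A contraction such as "doesn't" in the middle of a value is not flagged; only
values that start with a quote are.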
|
From: Marc T. <mar...@gm...> - 2009-03-02 02:22:01
|
On Sun, Mar 1, 2009 at 12:12 PM, Marc Tompkins <mar...@gm...> wrote:
> There aren't any "real" quotes in this file - is there any way to tell
> ConfigObj to ignore any single-quote characters it finds? If not, any
> clever ideas for a workaround? I did think of replacing "'" with "-" or
> something, but there are plenty of contractions in this file, and "doesn-t"
> doesn't look very pretty. On the other hand, searching for all matched
> pairs of quotes that appear in the same data item would require me to...
> well, to re-invent much of the guts of ConfigObj itself, no?
>
I came up with an ugly workaround - actually, it's fine for my purposes, but
typographically wrong - I'm replacing all the apostrophes with backticks.
I'm hoping my users won't notice. Actually, of course, I could do another
replacement just before printing; I might do that later on if I'm bored.
Here's my first pass at the problem:
> class mcrCodes(object):
>
>     def __init__(self, codeFileName="C:/temp/Codes.ini"):
>         with open(codeFileName,'r') as inFile:
>             tmpCodes = inFile.readlines()
>         releaseDate = tmpCodes.pop(0)
>         for index, line in enumerate(tmpCodes):
>             tmpCodes[index] = line.replace("\'", "`")
>         self.codes = ConfigObj(tmpCodes, raise_errors=True,
>                                list_values=False)
>
> codeFile = mcrCodes()
> for code, val in codeFile.codes.iteritems():
>     print code, val
>
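The preprocessing step in the first pass can be pulled out into a small
function, which makes it easy to check on its own. This is a sketch for
Python 3 using only the standard library; prepare_lines is a made-up name,
and the release date in the sample is invented for illustration. The cleaned
list is what would then be handed to ConfigObj, as in the post.

```python
def prepare_lines(lines, apostrophe="'", stand_in="`"):
    """Split off the release-date line and swap apostrophes for a stand-in.

    Returns (release_date, cleaned_lines). The input list is not modified.
    """
    if not lines:
        raise ValueError("expected at least the release-date line")
    release_date = lines[0].strip()
    cleaned = [line.replace(apostrophe, stand_in) for line in lines[1:]]
    return release_date, cleaned


raw = [
    "3/1/2009\n",  # made-up release date, for illustration only
    "[189]\n",
    "Message='Not otherwise classified' procedure code\n",
]
release_date, cleaned = prepare_lines(raw)
print(release_date)
print(cleaned[1], end="")
```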
--
www.fsrtechnologies.com
|
|
From: Marc T. <mar...@gm...> - 2009-03-02 10:50:47
|
On Sun, Mar 1, 2009 at 6:21 PM, Marc Tompkins <mar...@gm...> wrote:
> I came up with an ugly workaround - actually, it's fine for my purposes,
> but typographically wrong - I'm replacing all the apostrophes with
> backticks. I'm hoping my users won't notice. Actually, of course, I could
> do another replacement just before printing; I might do that later on if I'm
> bored.
>
For the sake of completeness, I thought I'd follow up - like most things, it
turned out to be very simple in the end. I wanted this class to look up the
code and return the Message as a text-wrapped paragraph (or list of lines)
anyway, so I just replace "`" with "'" before textwrapping, and everything is
cool and groovy.
Sorry for the waste of bandwidth.
--
www.fsrtechnologies.com
|
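The restore-then-wrap step described above might look roughly like this in
isolation. It is a sketch for Python 3 using only the standard library
textwrap module; format_message and its parameters are names chosen here for
illustration, not part of the class in the thread.

```python
import textwrap


def format_message(code, message, width=70, para=True):
    """Swap backticks back to apostrophes, then wrap the text."""
    text = message.replace("`", "'")
    wrapper = textwrap.TextWrapper(
        width=width,
        initial_indent=code + " - ",
        subsequent_indent="    ",
    )
    # fill() gives one newline-joined paragraph, wrap() a list of lines.
    return wrapper.fill(text) if para else wrapper.wrap(text)


stored = "`Not otherwise classified` or `unlisted` procedure code was billed"
print(format_message("189", stored, width=40))
```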
|
From: Michael F. <fuz...@vo...> - 2009-03-02 15:04:20
|
Marc Tompkins wrote:
> On Sun, Mar 1, 2009 at 6:21 PM, Marc Tompkins <mar...@gm...
> <mailto:mar...@gm...>> wrote:
>
>     I came up with an ugly workaround - actually, it's fine for my
>     purposes, but typographically wrong - I'm replacing all the
>     apostrophes with backticks. I'm hoping my users won't notice.
>     Actually, of course, I could do another replacement just before
>     printing; I might do that later on if I'm bored.
>
> For the sake of completeness, I thought I'd follow up - like most
> things, it turned out to be very simple in the end. I wanted this
> class to look up the code and return the Message as a text-wrapped
> paragraph (or list of lines) anyway, so I just replace "`" with "'"
> before textwrapping, and everything is cool and groovy.
>
> Sorry for the waste of bandwidth.
>
Not a problem. :-)
Michael
> --
> www.fsrtechnologies.com <http://www.fsrtechnologies.com>
> ------------------------------------------------------------------------
>
> ------------------------------------------------------------------------------
> Open Source Business Conference (OSBC), March 24-25, 2009, San Francisco, CA
> -OSBC tackles the biggest issue in open source: Open Sourcing the Enterprise
> -Strategies to boost innovation and cut costs with open source participation
> -Receive a $600 discount off the registration fee with the source code: SFAD
> http://p.sf.net/sfu/XcvMzF8H
> ------------------------------------------------------------------------
>
> _______________________________________________
> Configobj-develop mailing list
> Con...@li...
> https://lists.sourceforge.net/lists/listinfo/configobj-develop
>
--
http://www.ironpythoninaction.com/
|
|
From: Michael F. <fuz...@vo...> - 2009-03-06 20:33:41
|
Marc Tompkins wrote:
> On Sun, Mar 1, 2009 at 12:12 PM, Marc Tompkins
> <mar...@gm... <mailto:mar...@gm...>> wrote:
>
> There aren't any "real" quotes in this file - is there any way to
> tell ConfigObj to ignore any single-quote characters it finds? If
> not, any clever ideas for a workaround? I did think of replacing
> "'" with "-" or something, but there are plenty of contractions in
> this file, and "doesn-t" doesn't look very pretty. On the other
> hand, searching for all matched pairs of quotes that appear in the
> same data item would require me to... well, to re-invent much of
> the guts of ConfigObj itself, no?
>
>
> I came up with an ugly workaround - actually, it's fine for my
> purposes, but typographically wrong - I'm replacing all the
> apostrophes with backticks. I'm hoping my users won't notice.
> Actually, of course, I could do another replacement just before
> printing; I might do that later on if I'm bored.
I'm afraid that you're trying to read a configuration file format that
ConfigObj wasn't designed to work with. A different ConfigObj could have
a pluggable parser with a flexible grammar, but I don't have time to
write that one...
In the meantime your two-phase replace sounds like our only option. You
could replace each apostrophe with a unique character combination and
then use walk() to replace every occurrence again after reading.
Michael Foord
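A rough sketch of that two-phase idea, for Python 3: hide each apostrophe
behind a marker before parsing, then visit every value afterwards and put it
back. The marker string and the helper names are assumptions made for
illustration. The restore callback takes (section, key), which is the shape
ConfigObj's walk() passes to its function, so config.walk(restore) should be
usable on a real ConfigObj; here a plain nested dict and a small stand-in
walker are used so the example runs without the library.

```python
TOKEN = "@@APOS@@"  # hypothetical marker; pick something absent from the file


def protect(lines):
    """Phase 1: hide apostrophes before the lines are parsed."""
    return [line.replace("'", TOKEN) for line in lines]


def restore(section, key):
    """Phase 2: put apostrophes back into one string value."""
    value = section[key]
    if isinstance(value, str):
        section[key] = value.replace(TOKEN, "'")


def walk_dict(section, function):
    """Plain-dict stand-in for walk(): visit every non-section value."""
    for key in list(section):
        if isinstance(section[key], dict):
            walk_dict(section[key], function)
        else:
            function(section, key)


# A nested dict standing in for the parsed result.
parsed = {"189": {"Message": TOKEN + "unlisted" + TOKEN + " code", "Note": ""}}
walk_dict(parsed, restore)
print(parsed["189"]["Message"])
```

Compared with a single stand-in character such as the backtick, a longer
marker is less likely to collide with text that is already in the file.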
--
http://www.ironpythoninaction.com/
|
|
From: Marc T. <mar...@gm...> - 2009-03-06 21:10:04
|
On Sun, Mar 1, 2009 at 7:18 PM, Michael Foord <fuz...@vo...> wrote:
> In the meantime your two-phase replace sounds like our only option. You
> could replace each apostrophe with a unique character combination and
> then use walk() to replace every occurrence again after reading.
>
Since you followed up, I figured I'd go ahead and post my eventual
solution. Could probably be more elegant, but it does what I need it to...
from __future__ import with_statement
import os, zipfile, textwrap
from configobj import ConfigObj

class mcrCodes(object):
    """
    Provides dictionary of Medicare reason codes.
    Methods:
        __init__ - create dictionary
        lookup - return formatted explanation for given code
    Attributes:
        codes - dictionary of codes, explanations, and valid dates
        releaseDate - date working version of Codes.ini was released
    """
    def __init__(self, codeFileName=os.getcwd() + os.sep +
                 "MedicareRemitEasyPrint.zip"):
        """
        If codeFileName is a zip file, it will be searched for a 'Codes.ini'
        file; if not, codeFileName will be accessed directly.
        ConfigObj interprets internal quotes, so we replace apostrophes with
        backquotes before passing file to ConfigObj.
        """
        if zipfile.is_zipfile(codeFileName):
            piz = zipfile.ZipFile(codeFileName)
            for item in piz.infolist():
                if item.filename[-3:] == "ini":
                    inFile = piz.read(item.filename)
                    tmpCodes = inFile.splitlines()
        else:
            with open(codeFileName,'r') as inFile:
                tmpCodes = inFile.readlines()
        self.releaseDate = tmpCodes.pop(0)
        for index, line in enumerate(tmpCodes):
            tmpCodes[index] = line.replace("\'", "`")
        self.codes = ConfigObj(tmpCodes, raise_errors=True,
                               list_values=False)

    def lookup(self, inStr=None, width=70, para=True):
        """
        Return a text-wrapped paragraph or list of lines containing
        explanation text for the code passed in inStr.
            inStr - code to be looked up
            width - line length to wrap to
            para - if True, return paragraph; if False, return list of lines
        """
        wrp = textwrap.TextWrapper(width=width, initial_indent=inStr+" - ",
                                   subsequent_indent=" ")
        try:
            # change backquotes back to apostrophes
            outStr = self.codes[inStr]["Message"].replace("`", "'")
        except KeyError:
            outStr = ("Sorry, no explanatory text is available for this "
                      "code. Make sure your code file is up-to-date.")
        if para:
            return wrp.fill(outStr)
        else:
            return wrp.wrap(outStr)
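Two things are worth keeping in mind if the zip-reading part is reused. In
the class above, tmpCodes is never assigned when the archive contains no
.ini member, so the later pop() fails with a confusing error; and under
Python 3, ZipFile.read() returns bytes, which need decoding before any string
replacement. The sketch below, for Python 3, handles both. The helper name
read_ini_lines is made up, and the latin-1 default is an assumption, since
the thread does not say how Codes.ini is encoded.

```python
import io
import zipfile


def read_ini_lines(zip_source, suffix=".ini", encoding="latin-1"):
    """Return the text lines of the first member whose name ends in suffix.

    encoding is an assumption; adjust it to match the real file.
    """
    with zipfile.ZipFile(zip_source) as archive:
        for info in archive.infolist():
            if info.filename.lower().endswith(suffix):
                # ZipFile.read() returns bytes under Python 3.
                return archive.read(info.filename).decode(encoding).splitlines()
    raise ValueError("no %s member found in archive" % suffix)


# Build a tiny in-memory archive to exercise the helper.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("readme.txt", "not the file we want")
    archive.writestr("Codes.ini", "3/1/2009\n[189]\nMessage=example\n")
buffer.seek(0)
print(read_ini_lines(buffer))
```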
--
www.fsrtechnologies.com
|