
#28 Python2.7: TypeError: unicode argument expected, got 'str'

1.0
open
nobody
None
2021-11-30
2021-11-19
Ankur Sinha
No

Hi Dave,

We're seeing this error in the libNeuroML CI for this pull request:

https://github.com/NeuralEnsemble/libNeuroML/pull/110

The complete trace is:

self = <neuroml.test.test_hdf5_optimized.TestNeuroMLHDF5Optimized testMethod=test_write_load>

    def test_write_load(self):

        # for f in []:
        # for f in ['complete.nml']:
        # for f in ['simplenet.nml','testh5.nml','MediumNet.net.nml','complete.nml']:

        for f in ["simplenet.nml", "MediumNet.net.nml"]:
            file_name = "%s/../examples/test_files/%s" % (self.base_dir, f)

            print("Loading %s" % file_name)

            nml_doc0 = loaders.read_neuroml2_file(file_name, include_includes=True)
            summary0 = nml_doc0.summary()

            print(summary0)

            nml_h5_file = "%s/../examples/tmp/%s__1.h5" % (self.base_dir, f)
            writers.NeuroMLHdf5Writer.write(nml_doc0, nml_h5_file)
            print("Written to: %s" % nml_h5_file)

            nml_doc1 = loaders.read_neuroml2_file(
                nml_h5_file, include_includes=True, optimized=True
            )

            summary1 = nml_doc1.summary().replace(" (optimized)", "")
            print("\n" + summary1)

            compare(summary0, summary1)

            nml_h5_file_2 = "%s/../examples/tmp/%s__2.h5" % (self.base_dir, f)
            writers.NeuroMLHdf5Writer.write(nml_doc1, nml_h5_file_2)
            print("Written to: %s" % nml_h5_file_2)
            # exit()
>           nml_doc2 = loaders.read_neuroml2_file(nml_h5_file_2, include_includes=True)

neuroml/test/test_hdf5_optimized.py:60: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
neuroml/loaders.py:214: in read_neuroml2_file
    optimized=optimized,
neuroml/loaders.py:270: in _read_neuroml2
    nml2_doc = NeuroMLHdf5Loader.load(nml2_file_name_or_string, optimized=optimized)
neuroml/loaders.py:57: in load
    doc = cls.__nml2_doc(src, optimized)
neuroml/loaders.py:87: in __nml2_doc
    currParser.parse(file_name)
neuroml/hdf5/NeuroMLHdf5Parser.py:144: in parse
    self.parse_group(h5file.root.neuroml)
neuroml/hdf5/NeuroMLHdf5Parser.py:188: in parse_group
    self.parse_group(node)
neuroml/hdf5/NeuroMLHdf5Parser.py:179: in parse_group
    self.parse_group(node)
neuroml/hdf5/NeuroMLHdf5Parser.py:170: in parse_group
    self.start_group(g)
neuroml/hdf5/NeuroMLHdf5Parser.py:601: in start_group
    properties=properties,
neuroml/hdf5/NetworkBuilder.py:74: in handle_population
    self.nml_doc.append(component_obj)
neuroml/nml/nml.py:39122: in append
    self.add(element)
neuroml/nml/nml.py:14599: in add
    self.__add(obj, targets[0], force)
neuroml/nml/nml.py:14654: in __add
    obj, member.get_name()
neuroml/nml/nml.py:223: in __str__
    namespacedef_=settings["str_namespacedefs"],
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <neuroml.nml.nml.IafCell object at 0x7f5037247a50>
outfile = <_io.StringIO object at 0x7f5036851950>, level = 0
namespaceprefix_ = '', namespacedef_ = '', name_ = None, pretty_print = True

    def export(
        self,
        outfile,
        level,
        namespaceprefix_="",
        namespacedef_="",
        name_="IafCell",
        pretty_print=True,
    ):
        imported_ns_def_ = GenerateDSNamespaceDefs_.get("IafCell")
        if imported_ns_def_ is not None:
            namespacedef_ = imported_ns_def_
        if pretty_print:
            eol_ = "\n"
        else:
            eol_ = ""
        if self.original_tagname_ is not None and name_ == "IafCell":
            name_ = self.original_tagname_
        if UseCapturedNS_ and self.ns_prefix_:
            namespaceprefix_ = self.ns_prefix_ + ":"
        showIndent(outfile, level, pretty_print)
        outfile.write(
            "<%s%s%s"
            % (
                namespaceprefix_,
                name_,
>               namespacedef_ and " " + namespacedef_ or "",
            )
        )
E       TypeError: unicode argument expected, got 'str'

neuroml/nml/nml.py:45392: TypeError

nml.py was generated on Python 3.10.0 with generateDS.py version 2.40.5. The schema file is here:

https://github.com/NeuralEnsemble/libNeuroML/blob/bbe4cb60d9feed22e7647ee18dc99e15749faab9/neuroml/nml/NeuroML_v2.2.xsd

The generated nml.py file is here:

https://github.com/NeuralEnsemble/libNeuroML/blob/bbe4cb60d9feed22e7647ee18dc99e15749faab9/neuroml/nml/nml.py

The helper_methods and so on are in the same folder there. We use this script to generate nml.py and then format it with black:

https://github.com/NeuralEnsemble/libNeuroML/blob/bbe4cb60d9feed22e7647ee18dc99e15749faab9/neuroml/nml/regenerate-nml.sh

We only get this error on Python 2.7. It does not happen on any of the Python 3 versions.

Cheers,
Ankur

Discussion

  • Dave Kuhlman

    Dave Kuhlman - 2021-11-19

    Ankur,

    Thanks for your help with this.

    What I've done so far:

    • Downloaded NeuroML_v2.2.xsd, helper_methods.py, and
      regenerate-nml.sh.

    • Modified regenerate-nml.sh for my environment: path to
      generateDS.py, name of XML schema file.

    • Ran regenerate-nml.sh -- generated nml.py.

    • Built a virtual environment with Python 2.7.18, then installed
      lxml and six, which are required by generateDS.

    Next -- I need to run the new nml.py against an XML instance
    document so that it generates the reported exception.

    I looked in the libNeuroML Git repo, but could not find an XML instance file.
    Do you have one I could use to generate the error?

    I'll pick this up again on Monday.
    In the meantime, I hope you have an enjoyable weekend.

    Dave

     
  • Dave Kuhlman

    Dave Kuhlman - 2021-11-20

    OK, I've found a number of examples (*.nml) here:
    https://github.com/NeuroML/NeuroML2/tree/master/examples.
    But the ones I have tried so far do not cause an exception, either
    with Python 3.9.7 or with Python 2.7.18. The ones I've tried are:

    • NML2_AnalogSynapsesHH.nml
    • NML2_FullNeuroML.nml
    • NML2_PyNNCells.nml

    For example, it displays exported output when I run this:

    $ python nml.py NML2_FullNeuroML.nml
    

    By the way, I'm running nml.py with Python 2.7.18, but I
    generated nml.py with Python 3.9.7.

    Dave

     
  • Ankur Sinha

    Ankur Sinha - 2021-11-22

    Hi Dave,

    Thanks for that. I'll go through the failing test and see if I can narrow it down. If you're not seeing the error, it could be something else in the library that's using nml.py perhaps. I'll report back once I have something.

    Thanks again,
    Ankur

     
  • Ankur Sinha

    Ankur Sinha - 2021-11-22

    Hi Dave,

    I haven't been able to pinpoint the cause of the error yet, but I did find a workaround that fixes the issue for us:

    https://github.com/NeuralEnsemble/libNeuroML/pull/110/commits/aa025c608a678697d14acaca1cf8d7ed0a0a7fd6

    The error occurs when we use nml.py with other bits, like reading from HDF5 files, so I'm thinking it's caused by some code in our HDF5-related methods where we haven't been careful enough to take both Python 2 and 3 into account.

    So we'll just carry this tweak for the time being. We expect to drop Python 2 support after the next release, so we won't run into this issue after that.

    Please close this issue.

    Cheers,
    Ankur

     
  • Ankur Sinha

    Ankur Sinha - 2021-11-26

    Hi Dave,

    I think I figured out what was going on. In the generated nml.py file, the __str__ method of GeneratedsSuper initialises a new output variable as an object of the StringIO class. At the moment, it simply does from io import StringIO, which works for Python 3 but isn't the right StringIO to use for Python 2. In Python 2, for native (byte) strings we must use StringIO.StringIO, because there io.StringIO accepts only unicode strings. So, I think the generateDS.py code should be tweaked as per this diff:

    diff -r 5ea8905ebd46 generateDS.py
    --- a/generateDS.py     Tue Oct 19 13:15:33 2021 -0700
    +++ b/generateDS.py     Fri Nov 26 14:41:10 2021 +0000
    @@ -7177,7 +7177,10 @@
     #xmldisable#        for n in settings:
     #xmldisable#            if hasattr(self, n):
     #xmldisable#                setattr(settings[n], self[n])
    -#xmldisable#        from io import StringIO
    +#xmldisable#        if sys.version_info.major == 2:
    +#xmldisable#            from StringIO import StringIO
    +#xmldisable#        else:
    +#xmldisable#            from io import StringIO
     #xmldisable#        output = StringIO()
     #xmldisable#        self.export(
     #xmldisable#            output,
    

    I've manually made the change in nml.py for the moment, and all our tests now pass correctly on both Python2 and Python3.

    https://github.com/NeuralEnsemble/libNeuroML/commit/fcdcd56af1ded448696f009d041a3aa2183b81a8
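
The version check in the diff above can also be tried outside generateDS. This is only a minimal sketch under stated assumptions, not the generated code itself: the helper name `export_to_text` is hypothetical and stands in for a generated `export()` call that writes native `str` values to the buffer.

```python
import sys

# Pick the buffer class that accepts the interpreter's native str type.
if sys.version_info[0] == 2:
    from StringIO import StringIO  # Python 2: accepts byte strings
else:
    from io import StringIO  # Python 3: str is already text


def export_to_text(tag):
    """Hypothetical helper: write a tag the way a generated export() might."""
    output = StringIO()
    # "%" formatting of plain literals yields the native str type.
    output.write("<%s>" % tag)
    return output.getvalue()


print(export_to_text("IafCell"))
```

The sketch indexes `sys.version_info[0]` rather than using `.major`; both work on Python 2.7 and Python 3, so this is a stylistic choice and not a correction to the diff.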

    A simple reproducer is:

    from nml import Cell
    c = Cell()
    "{}".format(c)
    

    With the version of nml.py generated by generateDS, it'll throw an error for Python 2 but not for Python 3:

    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "neuroml/nml/nml.py", line 223, in __str__
        namespacedef_=settings["str_namespacedefs"],
      File "neuroml/nml/nml.py", line 42519, in export
        namespacedef_ and " " + namespacedef_ or "",
    TypeError: unicode argument expected, got 'str'
    

    With the tweak, it'll work correctly for both.
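
For anyone without nml.py at hand, the same code path can be imitated with a stand-in class. This is a hedged sketch: `FakeElement`, `PatchedElement`, and `native_buffer_factory` are hypothetical names, and the classes only mimic the `__str__` -> `export` -> `outfile.write` chain seen in the traceback above. The expectation, based on the behaviour reported in this thread, is that `FakeElement` fails on Python 2 only, while `PatchedElement` works on both.

```python
import io
import sys


def native_buffer_factory():
    """Return a buffer that accepts the interpreter's native str type."""
    if sys.version_info[0] == 2:
        from StringIO import StringIO
    else:
        from io import StringIO
    return StringIO()


class FakeElement(object):
    """Hypothetical stand-in for a generated class such as Cell."""

    # Mirrors the unpatched behaviour: always use io.StringIO.
    buffer_factory = io.StringIO

    def export(self, outfile, name_="FakeElement"):
        # On Python 2 this formats to a byte str, which io.StringIO rejects.
        outfile.write("<%s%s/>" % ("", name_))

    def __str__(self):
        output = self.buffer_factory()
        self.export(output)
        return output.getvalue()


class PatchedElement(FakeElement):
    """Same class, but with the version-aware buffer from the patch."""

    buffer_factory = staticmethod(native_buffer_factory)


print("{}".format(PatchedElement()))
```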

    I wasn't aware of this difference between StringIO.StringIO and io.StringIO in Python 2, but this clarified it:

    (.venv27)$ python
    Python 2.7.18 (default, Sep 17 2021, 00:00:00) 
    [GCC 11.2.1 20210728 (Red Hat 11.2.1-1)] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> from StringIO import StringIO
    >>> a = StringIO()
    >>> a.write("a string")
    >>>
    
    (.venv27)$ python
    Python 2.7.18 (default, Sep 17 2021, 00:00:00) 
    [GCC 11.2.1 20210728 (Red Hat 11.2.1-1)] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> from io import StringIO
    >>> a = StringIO()
    >>> a.write("a string")
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: unicode argument expected, got 'str'
    >>> a.write(u"a unicode string")
    16L
    >>> a.getvalue()
    u'a unicode string'
    >>>
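
The same strictness of the `io` classes can be observed on Python 3, where the text/bytes roles are swapped relative to the session above. The following is an illustrative sketch only; `try_write` is a hypothetical helper, and the comments describe Python 3 behaviour, not Python 2.

```python
import io


def try_write(buffer, data):
    """Return True if the buffer accepts data, False if it raises TypeError."""
    try:
        buffer.write(data)
    except TypeError:
        return False
    return True


text = u"a unicode string"
raw = b"a byte string"

results = {
    "StringIO <- text": try_write(io.StringIO(), text),    # accepted
    "StringIO <- bytes": try_write(io.StringIO(), raw),    # rejected on Python 3
    "BytesIO <- bytes": try_write(io.BytesIO(), raw),      # accepted
    "BytesIO <- text": try_write(io.BytesIO(), text),      # rejected on Python 3
}

for label in sorted(results):
    print(label, results[label])
```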
    

    What do you think? Does this look correct now?

    Thanks again for all your help and patience,
    Ankur

     
  • Dave Kuhlman

    Dave Kuhlman - 2021-11-30

    Ankur,

    Your fix looks good to me. I've applied that patch, done some testing, created a new version, and uploaded it to PyPI and SourceForge.net.

    Thanks for your help with this. Please let me know if/when there is another issue.

    Dave

     
