special case sequence with optional elements
Brought to you by:
pabigot
The dwGML schema incorporates a content model that has a sequence containing 99 individual elements each with a minOccurs=0 and maxOccurs=1. Generating a DFA from the content model produces 99+c states, which takes a long time and a lot of unnecessary space. Consider providing an alternative model for sequence content as was done for choice.
http://www.weather.gov/forecasts/xml/OGC_services/
http://www.weather.gov/forecasts/xml/OGC_services/schema/dwGML_WFS_GMLv311.xsd
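For illustration, one possible alternative model for this special case is sketched below. It is not PyXB's implementation. It assumes every particle in the sequence has minOccurs=0 and maxOccurs=1 and that element names within the sequence are distinct. The function name match_optional_sequence and the element names e0..e98 are hypothetical. Instead of building an automaton, it records each element's position in the sequence and checks that the positions seen in the document strictly increase.

```python
def match_optional_sequence(model, document):
    """Validate `document` (a list of element names) against `model`,
    a sequence of distinct element names that are each optional and
    may occur at most once (minOccurs=0, maxOccurs=1).

    Hypothetical helper for illustration only; not part of PyXB.
    """
    # Position of each element name within the sequence.
    position = dict((name, i) for i, name in enumerate(model))
    next_allowed = 0  # lowest model index still permitted
    for name in document:
        index = position.get(name)
        # Reject unknown elements, repeats, and out-of-order elements.
        if index is None or index < next_allowed:
            return False
        next_allowed = index + 1
    return True


# A stand-in for the 99-element sequence described above.
model = ["e%d" % i for i in range(99)]
in_order = match_optional_sequence(model, ["e0", "e5", "e98"])
out_of_order = match_optional_sequence(model, ["e5", "e0"])
```

Under these assumptions, setup is linear in the length of the sequence and each document element is checked in constant time, so no state machine is generated for the group.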
Moving this out as it's a low priority performance enhancement.
As discussed here, I do indeed think that this is the problem. This is relatively high priority for me, as I have been using pyxb at work for almost two months. I've built a few tools (XML <-> PostgreSQL conversions) using pyxb, but always tested with small example XML documents (my bad). When I tried a larger example document, the running time of CreateFromDocument exploded. For me, this is unacceptable because I plan on integrating this function into a command line tool for performing regular imports and exports of XML data.
If assistance might be helpful (other than the testing and feedback I will continue to give anyway), I would be happy to offer it. If this is not corrected in a timely fashion, project deadline constraints would force me to seek another option in this niche (generateDS, or some schema-ignorant method using elementtree), so helping to correct this bug would help me complete my for-pay project on time.
Fixed in trac/33 and merged to next. Complete replacement of the model group portion of the content model. The NFA-to-DFA approach is gone. The resulting system does a better job in less space and significantly faster: 30% on the standard tmsxtvd test, orders of magnitude on documents with large sequences of optional elements.