#33 special case sequence with optional elements

Milestone: PyXB 1.1.2

Status: closed

Owner: Peter A. Bigot

Labels: None

Resolution: fixed

Component: Content model

Priority: major

Version: PyXB 0.5.1

Type: enhancement

Updated: 2010-05-30

Created: 2009-07-17

Creator: Peter A. Bigot

Private: No

The dwGML schema incorporates a content model that has a sequence containing 99 individual elements, each with minOccurs=0 and maxOccurs=1. Generating a DFA from this content model produces 99+c states, which takes a long time and a lot of unnecessary space. Consider providing an alternative model for sequence content, as was done for choice.

http://www.weather.gov/forecasts/xml/OGC_services/

http://www.weather.gov/forecasts/xml/OGC_services/schema/dwGML_WFS_GMLv311.xsd
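To illustrate the cost described above, here is a minimal sketch, not PyXB's actual implementation. It assumes the elements in the sequence have distinct names, and every function name in it (build_dfa, accepts_dfa, accepts_direct) is hypothetical. It builds an explicit DFA for a sequence of optional elements, where the transition table grows quadratically with the number of elements, and compares it with a direct matcher that only tracks a cursor into the declared order.

```python
from typing import Dict, Sequence, Tuple


def build_dfa(names: Sequence[str]) -> Dict[Tuple[int, str], int]:
    """Explicit DFA for a sequence of optional, distinct elements.

    State 0 is the start; state i means "the last element consumed was
    names[i - 1]". Every state is accepting because every element is optional.
    """
    table: Dict[Tuple[int, str], int] = {}
    for state in range(len(names) + 1):
        # From state i, any element at or after position i may come next.
        for j in range(state, len(names)):
            table[(state, names[j])] = j + 1
    return table


def accepts_dfa(table: Dict[Tuple[int, str], int], document: Sequence[str]) -> bool:
    """Run the document's element names through the transition table."""
    state = 0
    for tag in document:
        key = (state, tag)
        if key not in table:
            return False
        state = table[key]
    return True


def accepts_direct(names: Sequence[str], document: Sequence[str]) -> bool:
    """Match without a DFA: advance one cursor through the declared order."""
    position = {name: i for i, name in enumerate(names)}
    cursor = 0
    for tag in document:
        index = position.get(tag)
        # Reject unknown elements, out-of-order elements, and repeats.
        if index is None or index < cursor:
            return False
        cursor = index + 1
    return True


names = [f"e{i}" for i in range(1, 100)]  # 99 optional elements
table = build_dfa(names)
print(len(table))  # number of transitions: n * (n + 1) / 2
print(accepts_dfa(table, ["e1", "e50", "e99"]), accepts_direct(names, ["e1", "e50", "e99"]))
```

In this sketch the direct matcher needs space linear in the number of elements, while the explicit table holds n(n+1)/2 transitions for n elements. Whether a special-cased sequence model in PyXB would look like this is an open question.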

Discussion

Peter A. Bigot - 2009-07-26

status changed from new to accepted

milestone changed from PyXB 0.5.2 to PyXB 1.0.0

Moving this out as it's a low priority performance enhancement.

Peter A. Bigot - 2009-08-08

type changed from defect to enhancement

Peter A. Bigot - 2010-02-19

priority changed from minor to major

milestone changed from PyXB 1.0.0 to PyXB 1.1.2

Aoriste Boutade - 2010-05-18

cc aoriste added

As discussed here, I do indeed think that this is the problem. This is relatively high priority for me as I have been using pyxb at work for almost two months. I've built a few tools (XML <-> PostgreSQL conversions) using pyxb, but always tested with small example XML documents (my bad). When I tried a larger example document, the running time of CreateFromDocument exploded. For me, this is unacceptable because I plan on integrating this function into a command line tool for performing regular imports and exports of XML data.

If assistance might be helpful (other than the testing and feedback I will continue to give anyway), I would be happy to offer it. If this is not corrected in a timely fashion, project deadline constraints would force me to seek another option in this niche (generateDS, or some schema-ignorant method using elementtree), and so helping to correct this bug would help me to complete my for-pay project on time.

Peter A. Bigot - 2010-05-30

status changed from accepted to closed

resolution set to fixed

Fixed in trac/33 and merged to next. Complete replacement of the model group portion of the content model. The NFA-to-DFA approach is gone. The resulting system does a better job in less space and is significantly faster: 30% on the standard tmsxtvd test, and orders of magnitude on documents with large sequences of optional elements.