From: Ian S. <Ian...@ar...> - 2003-03-15 23:12:20
Attachments:
gml.zip
Hi,

As promised, I've been working on the GMLDataSource. I went through various stages of pain and enlightenment during this mission. Current stage is closer to enlightenment.

Stuff I did:
1) Looked at the existing implementation and took what I could.
2) Wrote a coordinate parser. Since this seems to be the logical starting point for building geometries, I made it fast, zero garbage production, and robust.
3) Since it was easiest, built the basic XLink classes.
4) Added functionality for versioning through factories. If we can actually resolve the schema locations, these should contain the version of GML used. If not, default to user settings.
5) Added hooks for schema lookup and parsing.
6) Added geometry factory interfaces and kept JTS classes out of all the interfaces. I know, I know, there are supposed to be some gt2 geometry factories, but for now, whenever I can, I keep this logic separate so as to not let the JTS classes percolate throughout the code.
7) Began work on gml parsing. Because schema parsing is a bigger step, and the gml spec says that schemas are not required, I focused my attention on parsing sans schema.
8) Implemented basic geometry parsing and feature "handling" - though no features are created.

Still to do:
1) XLink evaluation (medium difficulty)
2) xsi:schemaLocation evaluation and download/lookup (annoying, yet not hard)
3) Application schema parsing (big can of worms)
4) Multi-geometry parsing (not hard, just haven't got to it)
5) Feature building using flat features (very close, but may require mutable FeatureType).

Observations:
1) Well-formed GML parsing is 100% possible without schema. Validation is 100% impossible without schema.
2) FeatureType must be expanded in a very well planned out way. Suggestions for inclusion:
   Typed GeometryCollections (restrict membership)
   - this cannot, and should not, be done using the worthless generics in 1.5
   GeometryAttributes
   - gml describes these basic attributes
   Hard and soft "references" for Attributes.
   - to allow for lazy xlinks or real objects.

In response to the March 4th chat:

<snip>
[james] I had to fix it so that lineStringMember and polygonMember were not considered features
[ian] sounds fair
[james] for a number of things, it would be good to actually parse the xsd so we can do the assignment based on substitution types
[rob] This has been a longstanding issue. Parsing the XSD must happen sooner or later.
</snip>

This is not needed, nor required by the spec. Because gml syntax is deterministic, the logic of what a particular XML element is is relatively easy. The general idea is a state machine. The machine starts empty. The root element defines some type of featureCollection, which is now on the stack. The next element MUST be a "property" of the "object" on the stack. For instance, "boundedBy" is a property of a FeatureCollection object. And the aforementioned "lineStringMember" and "polygonMember" are not features, as noted, but 'properties' of specific GeometryCollections. The way you know what's a property and what's an object is by what's on the stack!

Example (the well-worked Cambridge): this is a snippet from cambridge. At this point, the City element named cambridge is on the stack. Assume we have no schema. "-->" denotes my notes.

  --> This MUST be a "property" of the City type
  <cityMember>
    --> This must be an Object which is the "argument" to the cityMember property 'set call'
    <Road>
      --> This translates to Road.setName("M11");
      <gml:name>M11</gml:name>
      --> And this, Road.setLinearGeometry( LineString );
      <linearGeometry>
        <gml:LineString srsName="http://www.opengis.net/gml/srs/epsg.xml#4326">

You see the point.

I've included my current parser, which, as acknowledged, is not complete. If you run the class org.geotools.gml.RootHandler, it will "parse" the Cambridge example. This will produce output telling you what is happening. Let it be known this code is confusing and tricky. Plenty of refactoring/redesign could occur.
Input appreciated, especially by XML/GML experts.

Ian S.
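The stack idea described above can be sketched in a few lines of Java. This is only an illustration, not the attached RootHandler: the class name StackClassifier and its classify method are made up for this note, and it reduces the state machine to its simplest form by assuming that objects and properties strictly alternate with nesting depth (even depth = object, odd depth = property). A real handler would dispatch on the type of the object on top of the stack rather than on depth alone.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical sketch: labels each element as "object" or "property"
// using nothing but the stack of currently open elements (no schema).
class StackClassifier extends DefaultHandler {
    private final Deque<String> stack = new ArrayDeque<>();
    private final List<String> events = new ArrayList<>();

    @Override
    public void startElement(String uri, String local, String qName, Attributes atts) {
        // Assumption: the root is an object, its children are properties,
        // their children are objects again, and so on.
        String kind = (stack.size() % 2 == 0) ? "object" : "property";
        String parent = stack.isEmpty() ? "" : " of " + stack.peek();
        events.add(kind + " " + qName + parent);
        stack.push(qName);
    }

    @Override
    public void endElement(String uri, String local, String qName) {
        stack.pop();
    }

    // Parses the document and returns one line per start tag.
    static List<String> classify(String xml) throws Exception {
        StackClassifier handler = new StackClassifier();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.events;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<City xmlns:gml='http://www.opengis.net/gml'><cityMember><Road>"
                + "<gml:name>M11</gml:name><linearGeometry><gml:LineString>"
                + "<gml:coordinates>0,5 20,10</gml:coordinates>"
                + "</gml:LineString></linearGeometry></Road></cityMember></City>";
        for (String line : classify(xml)) {
            System.out.println(line);
        }
    }
}
```

Run on a cut-down version of the Cambridge snippet, this labels cityMember, gml:name and linearGeometry as properties and City, Road and gml:LineString as objects, which matches the notes in the example above.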
From: Martin D. <mar...@te...> - 2003-03-16 17:33:29
Ian Schneider wrote:
> Typed GeometryCollections (restrict membership)
> - this cannot, and should not, be done using the worthless generics
> in 1.5

It depends on whether the types are based on Class or on some other condition. E.g.:

- Accept only Geometry objects which are instances of Polygon. It is not clear to me at this time why generics couldn't do that. (Note: with generics, you can find out the type allowed in a GeometryCollection using the reflection API.)

- Accept only Geometry objects with, say, the z coordinate set to value 50. In this case, it is true that generics are of no help.

Regards,

Martin.
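The two cases above can be put side by side in a small Java sketch. Everything here is hypothetical: Geom, Poly, Line and TypedCollection are stand-ins invented for this note, not JTS or GeoTools classes, and the code uses later Java features (lambdas, java.util.function.Predicate) for brevity. The type parameter covers the class-based restriction at compile time; because Java generics are erased at run time, the collection also carries a Class token so the allowed member type can be queried and checked on untyped input. The value-based restriction (z equal to 50) has to be a runtime predicate either way.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical stand-ins for geometry classes (not the JTS types).
class Geom {
    final double z;
    Geom(double z) { this.z = z; }
}
class Poly extends Geom { Poly(double z) { super(z); } }
class Line extends Geom { Line(double z) { super(z); } }

// A collection whose membership is restricted by class and by a value rule.
class TypedCollection<T extends Geom> {
    private final Class<T> memberType;        // class-based restriction
    private final Predicate<? super T> rule;  // value-based restriction
    private final List<T> members = new ArrayList<>();

    TypedCollection(Class<T> memberType, Predicate<? super T> rule) {
        this.memberType = memberType;
        this.rule = rule;
    }

    // The allowed member type, available at run time via the class token.
    Class<T> memberType() { return memberType; }

    // Accepts untyped input, as a parser would supply it;
    // returns false if either restriction rejects the geometry.
    boolean add(Geom g) {
        if (!memberType.isInstance(g)) {
            return false;
        }
        T typed = memberType.cast(g);
        if (!rule.test(typed)) {
            return false;
        }
        members.add(typed);
        return true;
    }

    int size() { return members.size(); }
}
```

For example, new TypedCollection<>(Poly.class, p -> p.z == 50.0) accepts a Poly with z = 50 and rejects both a Line and a Poly with z = 10.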
From: Rob H. <rob...@op...> - 2003-03-18 03:59:24
Ian,

> Is everyone busy or just hiding from GML?

Mostly hiding! I have taken a look at your code. Unfortunately, I don't have time today to give it as thorough a review as I should, so pardon me if my questions are ignorant...

Note that I am the original author (along with the other Ian) of the hideous code that currently parses GML. First, let me say that your code looks fantastic; it is a large improvement on the existing parser and I really welcome your participation. The original code was made with the following constraints in mind:

(1) GML parsing must be implemented as a SAX Filter

This is because the primary product that uses this GT2 code is GeoServer, which is a WFS. In a WFS, XML requests can take the following form:

  ...binding information...
  ...OGC filter encoding...
  ...GML...
  ...binding information...

So, GML can be embedded in a larger XML document. This means that there is a real need to have a reusable GML filter, which other parsers can use to parse documents with embedded GML. Although this led to much ugliness, it is imperative that this be the way the GML parser is created...unless we want 2 parsers!

(2) Immutable Feature Types

I agree with your statement: "[Arbitrary] Well-formed GML parsing is 100% possible without schema" only if we allow mutable feature types. Note that mutable feature types are bad.

  Dr. Peter Venkman: I'm fuzzy on the whole good/bad thing. What do you mean "bad"?
  Dr. Egon Spengler: Try to imagine all life as you know it stopping instantaneously and every molecule in your body exploding at the speed of light.
  Dr. Raymond Stantz: Total protonic reversal.
  Dr. Peter Venkman: That's bad. Okay. Alright, important safety tip, thanks Egon.

The feature model (which I also largely wrote) is predicated on immutable feature types and will crumble without this. The whole idea of feature types could be revisited, but this is a potentially big project.
Parsing without a schema is also possible if we parse the document twice (once for the FeatureTypes and once for the Features), but this is not very satisfying. More information on the badness of this is here:

http://www.geotools.org/gt2docs/design.html#corefeature

under sections 5.2.2.2 and 5.2.2.3.

Am I correct in assuming that your approach to parsing works only if we ignore these constraints? If so, we should chat and figure something out, because I like your code and the current parser clearly needs your help. It would be great if you could show up on the IRC tomorrow, or you can Yahoo! IM me at robhranac. I can also give you a call.

One more related point...

XSD Parsing

> [rob] This has been a longstanding issue. Parsing the XSD must happen
> sooner or later.

> This is not needed, nor required by the spec. Because gml syntax is
> deterministic, the logic of what a particular XML element is
> relatively

Not so sure that I agree with this, given the constraints above. In the Cambridge example, you are not parsing certain features because of the fact that you can't parse the XSD. Am I missing something here, or is this supporting the idea that to parse anything but flat features, you need some sort of schema info?

I am sure that there is more to say about this nice work, but I must go celebrate my 1/8th Irishness by drinking myself into a stupor.

Also, may I nominate Mr. Schneider as the next core member, assuming that I am a core member?

Best Regards,

Rob Hranac
slashdot for digital earth: http://digitalearth.org
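Constraint (1) can be illustrated with the standard org.xml.sax.helpers.XMLFilterImpl class. The sketch below is not the existing GeoTools filter. The class names GmlInterceptFilter and EmbeddedGmlDemo are invented for this note, and so are the element names in the sample request. It assumes that embedded GML can be recognised by the namespace http://www.opengis.net/gml: the filter keeps every element in that namespace (and everything nested inside it) for itself, and forwards all other events to whatever handler the enclosing parser installed.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical sketch: intercepts GML subtrees, passes the rest downstream.
class GmlInterceptFilter extends XMLFilterImpl {
    static final String GML_NS = "http://www.opengis.net/gml";
    final List<String> gmlElements = new ArrayList<>();
    private int gmlDepth = 0; // > 0 while inside an intercepted GML subtree

    GmlInterceptFilter(XMLReader parent) {
        super(parent);
    }

    @Override
    public void startElement(String uri, String local, String qName, Attributes atts)
            throws SAXException {
        if (gmlDepth > 0 || GML_NS.equals(uri)) {
            gmlDepth++;
            gmlElements.add(local); // a real filter would build geometries here
        } else {
            super.startElement(uri, local, qName, atts);
        }
    }

    @Override
    public void endElement(String uri, String local, String qName) throws SAXException {
        if (gmlDepth > 0) {
            gmlDepth--;
        } else {
            super.endElement(uri, local, qName);
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        if (gmlDepth == 0) {
            super.characters(ch, start, length);
        }
    }
}

class EmbeddedGmlDemo {
    // Returns two lists: element names seen by the outer handler,
    // then element names intercepted by the GML filter.
    static List<List<String>> split(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        XMLReader reader = factory.newSAXParser().getXMLReader();
        GmlInterceptFilter filter = new GmlInterceptFilter(reader);
        List<String> outer = new ArrayList<>();
        filter.setContentHandler(new DefaultHandler() {
            @Override
            public void startElement(String uri, String local, String qName, Attributes atts) {
                outer.add(local);
            }
        });
        filter.parse(new InputSource(new StringReader(xml)));
        return Arrays.asList(outer, filter.gmlElements);
    }
}
```

The outer handler never sees the GML elements, so a request parser can be written without any knowledge of GML and still have the embedded geometry handled by the same reusable filter.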