From: Andrea A. <and...@ge...> - 2016-01-29 11:44:06
Hi,
I'm looking into improving our ability to parse GML files whose reference to the schema is invalid or unreachable, in other words, not usable.

Right now the parser generates a feature collection by reflecting the schema out of the first feature. This causes issues in some cases: for example, elements missing from the first feature (because they are null) will not be part of the schema, and will then be pruned from every subsequent feature too.

My first idea would be to mimic what we have in the GeoJSON module and allow the client to perform two parses: one that determines the best feature type by scrolling over the GML (see FeatureJSON.readFeatureCollectionSchema), and a second one that can take the target feature type as a hint (see FeatureJSON.setFeatureType). These could be added as new methods to the GML facade class:

    SimpleFeatureType GML.decodeSchema(InputStream in)
    SimpleFeatureCollection GML.decodeFeatureCollection(InputStream in)

Internally, in the first case I'd pass a custom FeatureTypeCache that does not cache anything, allowing each feature to have its own natural feature type; in the second case I'd prime the cache with the target feature type.

However, I was thinking that the same could be done in a single call: since we'd be getting an in-memory feature collection anyway, we can just figure out the best fitting feature type, and then decorate the collection with a retyping one, which is what gets returned. This would be best done using a PullParser, as we could control how the collection gets built (otherwise we end up with a DefaultFeatureCollection that would spam the logs, complaining that the feature type of the last feature added is not the same as the collection's). This would incur a bit of extra cost, though probably negligible compared to the GML parsing itself. At the same time, the PullParser is not as well tested as the non-pull one.
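To make the "best fitting feature type, then retype" idea a bit more concrete, here is a minimal sketch in plain Java. It is not GeoTools code: features are modelled as simple attribute maps, the "feature type" is just an ordered set of attribute names, and the class and method names (SchemaInferenceSketch, inferSchema, retype) are made up for illustration. It only shows the two steps: scan all features to build the union of attributes, then rebuild every feature against that union, filling the gaps with nulls.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only: plain maps stand in for parsed features.
public class SchemaInferenceSketch {

    // Step 1: scan every feature and collect the union of attribute names,
    // in order of first appearance, instead of trusting the first feature.
    static Set<String> inferSchema(List<Map<String, Object>> features) {
        Set<String> schema = new LinkedHashSet<>();
        for (Map<String, Object> feature : features) {
            schema.addAll(feature.keySet());
        }
        return schema;
    }

    // Step 2: rebuild each feature against the target schema, so that
    // attributes absent from a feature show up as explicit nulls.
    static List<Map<String, Object>> retype(
            List<Map<String, Object>> features, Set<String> schema) {
        List<Map<String, Object>> result = new ArrayList<>();
        for (Map<String, Object> feature : features) {
            Map<String, Object> retyped = new LinkedHashMap<>();
            for (String name : schema) {
                retyped.put(name, feature.get(name)); // null when missing
            }
            result.add(retyped);
        }
        return result;
    }

    public static void main(String[] args) {
        // The first feature lacks "population", as if the element were null in the GML.
        Map<String, Object> first = new LinkedHashMap<>();
        first.put("name", "a");
        Map<String, Object> second = new LinkedHashMap<>();
        second.put("name", "b");
        second.put("population", 1200);

        List<Map<String, Object>> parsed = List.of(first, second);
        Set<String> schema = inferSchema(parsed);
        System.out.println(schema);
        System.out.println(retype(parsed, schema));
    }
}
```

With the current "reflect from the first feature" behaviour, the schema here would contain only "name" and the population of the second feature would be dropped; with the union, both features end up with the same two attributes.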
So, we have two solutions: one safe but slower, based on two methods, using the normal parser and doing two scans; the other faster but a little less tested, using the PullParser.

Which avenue would be the preferred one? Mind, I'd also need to backport this little new feature.

Cheers
Andrea

--
Ing. Andrea Aime
@geowolf
Technical Lead
GeoSolutions S.A.S.