Re: [Xbrlapi-developer] BaseContentHandler Error
From: Matthew D. <ro...@gm...> - 2011-06-17 13:16:02
Hi Geoff,

Thanks for the attention to this. I did not know enough about the schema location attribute to look closely for it, but I see it now in comparing the DAR instance to another instance. It is good to know that the schemaRef element is not recognizable to Xerces. I will verify that the information I want is being entered into the data store, but from your response below, it doesn't look like this is going to be a big deal for me.

I will take a deeper dive into the aspect package. Right now, dimensions and aspects look a little intimidating, but they are too good to pass up.

Regards,
Matt

On Thu, Jun 16, 2011 at 8:01 PM, Geoff Shuetrim <ge...@ga...> wrote:

> Matthew
>
> Looking at the example XBRL instance (http://www.sec.gov/Archives/edgar/data/916540/000091654011000018/dar-20110615.xml) the problem is straightforward - there is no use of the schema location attribute at all. In the previous version of the XBRLAPI, this would have thrown a validation error much earlier but now, I am caching the official XBRL schemas in the grammar pool before doing any SAX parsing of XBRL instances. This makes sure that the XBRL specification defined elements are validated using XML schema, even if nothing else is. The instance is clearly expecting the XBRL schemaRef element to be sufficient for the processor to determine what schema to use for validation of the instance but the schemaRef element and its semantics are not understood by the Xerces parser. In a future version of the XBRLAPI, I may include a preprocessing step that would scan documents for such ref elements to build up the grammar pool in advance of XML Schema validation but that is not a high priority for me at the moment. It adds to the processing time and is still not going to lead to full XBRL validation.
>
> The reason to be careful with this is that, without XML schema validation, things like XML Schema default values for elements and attributes are not going to be added to the post-validation infoset. Such defaults are used pretty rarely in XBRL - because their usage is fraught with problems like this one. If you want to be sure, try XQuerying the data store for usage of default attributes and fixed attributes in the fragments that extend the XMLSchemaContent class. That is about all I have to suggest at this stage.
>
> Regards
>
> Geoff Shuetrim
>
>
> On 17 June 2011 08:03, Geoff Shuetrim <ge...@ga...> wrote:
>
>> I have recently made changes that make the XBRLAPI more demanding at the XML Schema validation stage, at least in terms of what it checks and reports. The changes do not alter what is actually being loaded into the data store but they do let you know what was and was not validated using Xerces XML Schema validation.
>>
>> The kind of error being found by you at the schema validation stage has been turning up for me in three different circumstances:
>>
>> 1. When the xsi:schemaLocation attribute is not providing enough information to find all of the schemas required to do schema validation;
>>
>> 2. When more than one schema has the same target namespace - such as occurs in the XBRL 2.1 conformance test suite; and
>>
>> 3. When there is an XML Schema validity issue in the file being parsed.
>>
>> I am not sure what the right step to take regarding 1 is (perhaps it is to do DTS discovery first, find all relevant schemas, and then to do XML schema validation but that seems to be putting the cart before the horse a bit) but for 2, the problem used to arise for me because the XML Schema grammar pool caches the first schema it encounters for the namespace and then continues using it without augmenting that schema grammar with information from other schemas with the same target namespace.
>> I thought I had fixed 2 by locking the grammar pool after adding just the main XBRL XML Schemas but perhaps that was not sufficient. I am guessing I can ignore 3.
>>
>> I will take a look at the example files you provided links to and see if we have a new scenario in which this issue arises. In the meantime, the files should be loading into the data store OK so long as they are XBRL valid. The XBRLAPI is designed to be as robust to things like this as possible, kind of like a web browser is to weird HTML markup.
>>
>> Regards
>>
>> Geoff S
>>
>> On 17 June 2011 06:29, Matthew DeAngelis <ro...@gm...> wrote:
>>
>>> Hi all (and especially Geoff):
>>>
>>> I am running the LoadAllSECFilings example (on both the provided RSS feed and http://www.sec.gov/Archives/edgar/usgaap.rss.xml). While the loader threads are running, I regularly get errors of the form below:
>>>
>>> ERROR BaseContentHandlerImpl.java 128 [error] - :cvc-complex-type.2.4.a: Invalid content was found starting with element 'dar:DecreaseInLongTermPensionLiability'. One of '{"http://www.xbrl.org/2003/instance":item, "http://www.xbrl.org/2003/instance":tuple, "http://www.xbrl.org/2003/instance":context, "http://www.xbrl.org/2003/instance":unit, "http://www.xbrl.org/2003/linkbase":footnoteLink}' is expected.: on line number 479
>>>
>>> From reading the documentation, I gather that this is due to the element not being located in the schema information provided in the instance. However, in the above case (instance document: http://www.sec.gov/Archives/edgar/data/916540/000091654011000018/dar-20110615.xml, schema document: http://www.sec.gov/Archives/edgar/data/916540/000091654011000018/dar-20110615.xsd), the element is defined, not in the standard schema, but in the .xsd file. This definition does not appear to be malformed.
>>>
>>> Since the API is also pulling the .xsd file, and it seems to recognize other non-standard elements, I am not sure why this error is occurring. Some of these errors are on US GAAP elements as well. I am uncomfortable with the idea that one seemingly random element may be missing from every third or fourth XBRL report, so I would like to correct this if possible (before I start manipulating the data).
>>>
>>> What is causing this error, and is there anything I can do about it? If it simply reflects a problem with the reports as written, then I will have to live with it.
>>>
>>>
>>> Regards,
>>> Matt
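
For readers following the thread, here is a rough sketch of the kind of preprocessing scan Geoff mentions: reading an instance for schemaRef elements before any XML Schema validation takes place. This is not XBRLAPI code. The class name SchemaRefScanner, its Result type, and the scan method are hypothetical names made up for illustration, and only the JDK's built-in SAX parser is used. The sketch also reports whether the root element carries xsi:schemaLocation, which is the attribute missing from the DAR instance.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of the XBRLAPI.
final class SchemaRefScanner {

    static final String LINK_NS = "http://www.xbrl.org/2003/linkbase";
    static final String XLINK_NS = "http://www.w3.org/1999/xlink";
    static final String XSI_NS = "http://www.w3.org/2001/XMLSchema-instance";

    /** What the scan found in one instance document. */
    static final class Result {
        /** xlink:href values of every link:schemaRef element, in document order. */
        final List<String> schemaRefs = new ArrayList<>();
        /** Value of xsi:schemaLocation on the root element, or null if absent. */
        String schemaLocation;
    }

    /** Scans the XML text without validating it. */
    static Result scan(String xml) {
        final Result result = new Result();
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            // Namespace awareness is needed to match on namespace URI + local name.
            factory.setNamespaceAware(true);
            factory.newSAXParser().parse(
                new InputSource(new StringReader(xml)),
                new DefaultHandler() {
                    private boolean seenRoot = false;

                    @Override
                    public void startElement(String uri, String localName,
                                             String qName, Attributes atts) {
                        if (!seenRoot) {
                            // Only the root element is checked for xsi:schemaLocation here.
                            seenRoot = true;
                            result.schemaLocation = atts.getValue(XSI_NS, "schemaLocation");
                        }
                        if (LINK_NS.equals(uri) && "schemaRef".equals(localName)) {
                            String href = atts.getValue(XLINK_NS, "href");
                            if (href != null) {
                                result.schemaRefs.add(href);
                            }
                        }
                    }
                });
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not scan instance", e);
        }
        return result;
    }
}
```

The collected href values would still need to be resolved against the URL of the instance, and the schemas they point to loaded, before validation could use them. How that might be wired into a grammar pool is left open here, since the thread does not describe it.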