#1 EXI header does not seem correct

1.0
pending
None
2015-05-05
2015-04-29
dave brown
No

The EXI header I see in the .xml_exi and .xml_exi_schema files when I run EXIficient is 00 80; I expected A0 00 for distinguishing bits of 10, the presence bit set, and a schema ID of 00000. I see this both for my XML file and for the sample notebook.xml. I tried a number of different command-line options without success in getting the header I expect.

BTW - using the gui to encode would hang at 47%; not sure why.

2 Attachments

Discussion

  • Daniel Peintner

    Daniel Peintner - 2015-04-30

    Hi Dave,

    The first byte is 80

    80 = 10 | 0 | 0 0000 // distinguishing bits, options NOT present, version

    The content starts at the second byte.

    The first byte is A0 only IF the optional "EXI options" are present

    A0 = 10 | 1 | 0 0000 // distinguishing bits, options present, version

    You can configure it as you like.
    Command-line option would be "-includeOptions"
    Code: EXIFactory.getEncodingOptions().setOption(EncodingOptions.INCLUDE_OPTIONS);
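The bit layout of the first header byte described above (80 versus A0) can be checked with a few lines of plain Java. This is only an illustrative sketch: the class and method names are hypothetical, it does not use the EXIficient library, and it assumes the stream has no leading "$EXI" cookie.

```java
// Sketch: split the first EXI header byte into its three fields.
// Assumes no "$EXI" cookie precedes the byte. Names are hypothetical.
final class ExiHeaderByte {

    /** Top two bits; binary 10 (decimal 2) marks an EXI stream. */
    static int distinguishingBits(int headerByte) {
        return (headerByte >> 6) & 0x3;
    }

    /** Third bit; true if the optional EXI options follow the header byte. */
    static boolean optionsPresent(int headerByte) {
        return ((headerByte >> 5) & 0x1) == 1;
    }

    /** Low five bits; these carry the version, not a schema ID. */
    static int versionBits(int headerByte) {
        return headerByte & 0x1F;
    }

    static String describe(int headerByte) {
        return String.format("0x%02X: distinguishing=%s optionsPresent=%b version=%s",
                headerByte,
                Integer.toBinaryString(distinguishingBits(headerByte)),
                optionsPresent(headerByte),
                Integer.toBinaryString(versionBits(headerByte)));
    }

    public static void main(String[] args) {
        System.out.println(describe(0x80)); // header without EXI options
        System.out.println(describe(0xA0)); // header with EXI options
    }
}
```

The only difference between the two bytes is the single presence bit (0x20); the low five bits are zero in both cases.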

    "BTW - using the gui to encode would hang at 47%; not sure why."

    It is just a guess, but maybe you are interpreting the progress bar differently.
    The value of "47%" shows the size of the EXI output as a percentage of the original XML. Is this the case in your situation?

    Thanks,

    -- Daniel

     
  • Daniel Peintner

    Daniel Peintner - 2015-04-30
    • status: open --> pending
    • assigned_to: Daniel Peintner
     
  • dave brown

    dave brown - 2015-04-30

    Thanks for the quick response and clarification. I did not realize Frhed displays a byte counter in front of each line of data. I thought I had invoked the -includeOptions CLI option; perhaps something in my syntax is incorrect and it is not seeing the options I listed? I attached the file I was using to invoke EXIficient.

     
  • dave brown

    dave brown - 2015-04-30

    Re GUI and 47%: 47% looks about right for non-schema encoding. I have not been able to get the GUI to use the schema to increase efficiency or to leave out the $EXI prefix in the output, but I can get the presence bit set.

     
  • dave brown

    dave brown - 2015-04-30

    BTW - I am using 0.9.4.

     
  • dave brown

    dave brown - 2015-04-30

    I tried compressing some other .xml files; these also did not set the presence bit. I was surprised to see that some of them had a different Schema ID (the first 2 bytes were no longer 80 00 but 80 40 or 80 50). Then I realized that, as EXIficient seems to be ignoring -includeOptions, it may also be ignoring -includeSchemaID. I suspect an issue with how I am invoking EXIficient.

     
  • dave brown

    dave brown - 2015-05-01

    I think I found my problem: I am using EXIficientDemo, substituting my .xml input in the run-sample.bat file and adding options, but the demo program does not accept additional arguments. So I tried calling EXIficient, but this gives me a class-definition-not-found error. I think I need some additional guidance on how to correctly invoke EXIficient using the CLI.

     
  • Daniel Peintner

    Daniel Peintner - 2015-05-04

    Hi Dave,

    In your script you are running the EXIficientDemo class
    "java -cp .;bin;lib/exificient.jar;lib/xercesImpl.jar;lib/xml-apis.jar
    EXIficientDemo sample-data/Triglocnanswer.xml
    sample-data/Tier2.xsd -encode -xsdSchema -strict -bitpacked
    -includeOptions -includeSchemaId 10"

    EXIficientDemo is just a "demo" for using the library.
    Instead, you have to use the EXIficientCMD class.

    e.g.,
    java -cp .;lib\exificient.jar;lib\xercesImpl.jar;lib\xml-apis.jar com.siemens.ct.exi.cmd.EXIficientCMD -encode -i notebook.xml -o out\notebook.xml.exi

    Hope this helps,

    -- Daniel

    P.S. Note that version 0.9.4 of the command-line interface has a bug with respect to the output file. You need to specify an absolute file name or use a sub-directory like I did in my command (see "-o out\notebook.xml.exi")
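The absolute-path workaround from the P.S. can be sketched in Java by assembling the same command with the output file resolved to an absolute path. The class name, jar names and the -encode, -i and -o switches are taken from the command quoted above; the helper class `ExiCmdLine` is hypothetical, and the sketch only builds and prints the command rather than running it.

```java
import java.io.File;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch: build the EXIficientCMD invocation with an absolute output path.
// ExiCmdLine is a hypothetical helper, not part of EXIficient.
final class ExiCmdLine {

    static List<String> encodeCommand(String inputXml, String outputExi) {
        // File.pathSeparator is ";" on Windows and ":" elsewhere.
        String classpath = String.join(File.pathSeparator,
                ".", "lib/exificient.jar", "lib/xercesImpl.jar", "lib/xml-apis.jar");
        // Workaround for the 0.9.4 output-file bug: pass an absolute file name.
        Path out = Paths.get(outputExi).toAbsolutePath();
        return new ArrayList<>(Arrays.asList(
                "java", "-cp", classpath,
                "com.siemens.ct.exi.cmd.EXIficientCMD",
                "-encode", "-i", inputXml, "-o", out.toString()));
    }

    public static void main(String[] args) {
        // Print the command only; running it requires the jars to be present.
        System.out.println(String.join(" ",
                encodeCommand("notebook.xml", "notebook.xml.exi")));
    }
}
```

The resulting list could be handed to `ProcessBuilder` if the jars are available under `lib/`.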

     
  • dave brown

    dave brown - 2015-05-04

    Thanks - that helped - I can now run EXIficientCMD and the presence bit is set in the header - but it seems to be ignoring my Schema. Is there a specific order the options and schema file need to be put on the command line? I tried a few variants but did not discover the magic sequence.

     
  • dave brown

    dave brown - 2015-05-04

    Looking at the output more carefully, it seems to be using the schema in some fashion: I moved the schema to a different location, and when the path on the command line was not right it gave a file-not-found message. I thought it was not using the schema because the output was larger than the previous demo output with the schema (the one that had the wrong header). I added the -xsdschema option and the output got a lot bigger, almost as if it had been processed with no schema. Maybe that should be in the form of the schema filename.

    More info: I removed the -includeOptions switch (but kept -includeSchemaID) and the size now matches what I expect, except that the options bit is not set and the schema ID is absent. Apparently -includeOptions turns on a bunch of things that I need to turn off; I thought it only set the bit in the header and that I had to use additional include switches to turn on each option.

     
  • dave brown

    dave brown - 2015-05-04

    Here is the command line I am using; I tried it with -includeSchemaID both before the -schema file and after it:

    java -cp .;bin;lib/exificient.jar;lib/xercesImpl.jar;lib/xml-apis.jar com.siemens.ct.exi.cmd.EXIficientCMD -encode -strict -includeSchemaId -schema sample-data/Tier2.xsd -i sample-data/Triglocnanswer.xml -o out/Triglocnanswer.exi

     
  • Daniel Peintner

    Daniel Peintner - 2015-05-05

    Hi Dave,

    "-includeSchemaId" requires "-includeOptions" to work, because "-includeOptions" switches on the EXI options part of the header, which the schemaId is part of.

    Hope this helps,

    -- Daniel
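The dependency between the two switches can be caught before invoking the tool. The sketch below is a hypothetical pre-flight check in plain Java, not something EXIficient provides. It assumes the switch spellings used in the reply above and compares them case-sensitively, which may not match how the CLI itself parses them.

```java
import java.util.Arrays;
import java.util.List;

// Sketch: warn when -includeSchemaId is passed without -includeOptions.
// ExiFlagCheck is a hypothetical helper, not part of EXIficient.
final class ExiFlagCheck {

    /** Returns a warning message, or null if the switch combination is fine. */
    static String check(List<String> args) {
        boolean schemaId = args.contains("-includeSchemaId");
        boolean options = args.contains("-includeOptions");
        if (schemaId && !options) {
            return "-includeSchemaId has no effect without -includeOptions";
        }
        return null;
    }

    public static void main(String[] args) {
        // Switches from the command line posted earlier in the thread.
        List<String> posted = Arrays.asList(
                "-encode", "-strict", "-includeSchemaId",
                "-schema", "sample-data/Tier2.xsd");
        System.out.println(check(posted));
    }
}
```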

     
  • dave brown

    dave brown - 2015-05-05

    When I add the -includeOptions switch the presence bit gets set as it should, but the .exi file grows from 18 to 41 bytes! I would have expected it to grow by the single byte for the Schema ID. It seems to be adding a bunch of stuff that I do not know how to turn off. Or it thinks the schema ID is much bigger... Where does it get the Schema ID? I do not have a schemaID explicitly set in my schema.

     
  • Daniel Peintner

    Daniel Peintner - 2015-05-05

    Hi,

    The EXI spec does not dictate the syntax or semantics of the schemaId. An example schemaId scheme is the use of URIs.
    (see http://www.w3.org/TR/exi/#key-schemaIdOption)

    The CMD line interface uses what gets passed. In your case it might be "sample-data/Tier2.xsd".

    Currently the CMD line interface cannot be triggered to use other schemes than files/uris. The EXIficient library can do so and also allows you to register a schemaID handler.

    If you really need this capability in the CMD line interface as well I ask you to file a feature request.

    -- Daniel
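If the schemaId really is the string "sample-data/Tier2.xsd", as guessed above, a rough size estimate shows why the file grew from 18 to 41 bytes rather than by one byte. The sketch assumes my reading of the EXI string encoding: a length prefix stored as an Unsigned Integer (length plus two for a string not yet in the string table), followed by one Unsigned Integer per code point. It ignores the event codes and padding of the options document, and it is not derived from EXIficient's source. `SchemaIdCost` is a hypothetical name.

```java
// Sketch: estimate how many bytes a schemaId string adds to the EXI header.
// Assumption: EXI string literal = Unsigned Integer length prefix (length + 2)
// followed by one Unsigned Integer per code point.
final class SchemaIdCost {

    /** Bytes used by an EXI Unsigned Integer: 7 value bits per byte. */
    static int unsignedIntBytes(int value) {
        int bytes = 1;
        while ((value >>>= 7) != 0) {
            bytes++;
        }
        return bytes;
    }

    /** Approximate encoded size of a string value, in bytes. */
    static int approxStringBytes(String s) {
        int codePoints = s.codePointCount(0, s.length());
        int lengthPrefix = unsignedIntBytes(codePoints + 2);
        int characters = s.codePoints().map(SchemaIdCost::unsignedIntBytes).sum();
        return lengthPrefix + characters;
    }

    public static void main(String[] args) {
        String schemaId = "sample-data/Tier2.xsd"; // value guessed in the thread
        System.out.println(schemaId + " -> about "
                + approxStringBytes(schemaId) + " bytes");
    }
}
```

Under these assumptions the 21-character path costs about 22 bytes, which accounts for most of the 23-byte growth reported above.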

     
