Optimised JSON
A Java API for optimising JSON text to minimise duplicate strings, and arrays of objects with repetitive keys by mapping the original JSON text to optimised JSON text.
The Optimised JSON (or OJSON) text is itself valid JSON. The API may be placed each side of a JSON exchange in order to optimise the size of JSON payloads, or to optimise JSON text before storing into a database.
Configuration is available to tune the API for specific expected payload structures in order to maximise the reduction in payload sizes.
The OJSON implementation makes use of existing org.json packages and takes the view that a valid JSON text is any JSON value according to the JSON grammar described more formally by the ECMA 404 Standard. Any object, array, number, string, true, false, or null passed to the OJSON API will be considered a complete and valid JSON text.
Quick Start
Getting ready to use OJson takes a couple of simple steps:
- Install Java 8 and add to your PATH / set JAVA_HOME
- Download the OJson distribution (ZIP file)
The distribution comes with the OJson JAR file library (implementing the API) and an example Client application for converting JSON text from the command line. The client may be run using simple scripts provided in the 'bin' directory.
$ unzip ojson-xx.zip
$ cd ojson-xx
Map your JSON file to standard output:
$ bin\ojson input.json
Json has been mapped to the following OJson:
--------------------------------------------
{...}
Map your JSON to an output file:
$ bin\ojson input.json optimised.ojson
Json has been mapped to OJson and saved in the file optimised.ojson
To map OJSON back to JSON, specify the "-u" option:
$ bin\ojson -u optimised.ojson original.json
Using the Java API
Getting ready to use the Java API takes a couple of simple steps. Note that the API is not yet distributed through a public repository (such as Maven Central), though this is planned.
- Unpack the OJson distribution as described under Quick Start
- Copy the entire contents of the ojson-xx/lib directory into your project, or refer to this lib/ directory in your project dependencies. The lib/ directory contains both the OJson JAR and the other JAR files it requires.
Some Java examples follow, illustrating the use of the API to map between plain JSON and Optimised OJSON.
Convert JSON String to OJSON before sending
String plainJsonString = "....";
String optimisedJsonString = OJSon.toOJson(plainJsonString);
Convert OJSON strings back into plain JSON after receiving
String ojsonString = "....";
String plainJsonString = OJson.toJSON(ojsonString);
Convert objects created using org.json packages before sending
If you are using the org.json packages to build up a JSON object model, generate optimised JSON text as follows:
JSONObject plainJsonObject = new JSONObject();
plainJsonObject.put("isActive", true);
...
String optimisedJsonString = OJSon.toOJson(plainJsonObject);
Convert an OJSON string or objects back into plain JSON after receiving
String plainJsonString = OJson.toJSON(optimisedJsonStringReadFromNetwork);
JSONObject plainJsonObject = new JSONObject(plainJsonString);
or
JSONObject optimisedJsonObject = new JSONObject(optimisedJsonStringReadFromNetwork);
String plainJsonString = OJson.toJSON(optimisedJsonObject);
Other Input Formats
The API also accepts Files, Streams and Readers, used in the same way as the above examples. To use a string encoding other than the default "UTF-8", also pass options that specify the alternative encoding (described in the Options section below).
File jsonFile = new File(..)
String ojsonFromFile = OJSon.toOJSon(jsonFile);
InputStream jsonStream = ..
String ojsonFromStream = OJSon.toOJSon(jsonStream);
Reader jsonReader = ..
String ojsonFromReader = OJSon.toOJSon(jsonReader);
If you have a byte array, first wrap in a stream
byte[] myJsonByteArray;
InputStream jsonStream = new ByteArrayInputStream(myJsonByteArray);
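If the bytes are in an encoding other than the default, one option that relies only on standard JDK classes is to decode them into a Reader yourself, since the API accepts Readers. The sketch below shows only that wrapping step. The class and method names are hypothetical and no OJSON call is made.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper, not part of the OJSON API. */
final class ByteArrayInputSketch {
    /** Wraps a byte array in a Reader that decodes it with the given charset. */
    static Reader readerFor(byte[] jsonBytes, Charset charset) {
        InputStream stream = new ByteArrayInputStream(jsonBytes);
        return new InputStreamReader(stream, charset);
    }

    /** Drains a Reader into a String (used here only to show the decoding step). */
    static String readAll(Reader reader) {
        StringBuilder text = new StringBuilder();
        char[] buffer = new char[1024];
        try {
            int n;
            while ((n = reader.read(buffer)) != -1) {
                text.append(buffer, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return text.toString();
    }

    public static void main(String[] args) {
        byte[] bytes = "{\"isActive\":true}".getBytes(StandardCharsets.UTF_16);
        System.out.println(readAll(readerFor(bytes, StandardCharsets.UTF_16)));
    }
}
```

The Reader returned by readerFor could then be handed to the API in place of the raw stream.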
Or to map your JSON to an OJSON string and convert to an org.json object
String ojsonString = OJSon.toOJson(...);
Object value = JSONObject.stringToValue(ojsonString);
The resulting value can be tested for its type
if (value instanceof Number || value instanceof Boolean || value instanceof String || value == JSONObject.NULL) {
// simple type: Integer, Long, Double, Boolean, String or JSONObject.NULL
...
} else if (value instanceof JSONObject) {
...
} else if (value instanceof JSONArray) {
...
}
Using Options to tune specific optimisations
Create an options object
// Using defaults:
OJsonOptions options = OJsonOptions.defaultOptions();
// Using custom settings:
OJsonOptions options = new OJsonOptions(); // or initialise with defaults as above
Specify option settings. These include:
// Set an alternative string encoding (default: platform specific)
options.setEncoding("UTF-16");
// Only arrays with 10 or more elements are optimised (default: 3)
options.setMinimumArrayElements(10);
// Only strings containing 8 or more characters are optimised using string references (default: 5)
options.setMinimumStringLength(8);
// Only optimise arrays of objects if all objects share at least 5 common keys (default: 3)
options.setMinimumMatchingObjectKeys(5);
// Disable the use of string reference optimisations entirely (default: true)
options.setStringReferencesEnabled(false);
// Disable arrays of objects templating optimisations entirely (default: true)
options.setTemplatingEnabled(false);
Passing the options through the OJSON API:
String optimisedJsonString = OJSon.toOJson(jsonString, options);
Optimisations
The following sections describe how these optimisations translate plain JSON into OJSON text.
String references
A string reference consists of a quoted string beginning with @ followed by a number (unique within the document). Examples are @1, @23.
When declaring a new reference for the first occurrence of a particular string, the reference is followed by ':' and the string to be tagged. For example a string value of "previousAddress" can be tagged as:
"@1:previousAddress"
while the following refers to the above string reference:
"@1" : "some value"
when mapped back to JSON will produce:
"previousAddress" : "some value"
String duplication optimisation
This process replaces all duplicate occurrences of strings in both keys and values with string references. An array of addresses may look like:
{
"addresses" : [
{
"@1:address" : { "@2:streetName" : "1 Smith Street", "@3" : "Brunswick" }
},
{
"@1" : { "@2" : "20 Jones Avenue", "@3:suburbName" : "Preston" }
}
]
}
Here @1:address marks the string "address" as reference @1. Subsequent occurrences of the string refer only to @1.
According to the JSON standard, the order of keys within a JSON object is not guaranteed to be preserved, so a receiver may parse Optimised JSON with string references appearing before they are declared. In the above, @3 is an example of a reference appearing before being declared. For this reason forward references are supported by the API.
When mapped to plain JSON this text becomes:
{
"addresses" : [
{
"address" : { "streetName" : "1 Smith Street", "suburbName" : "Brunswick" }
},
{
"address" : { "streetName" : "20 Jones Avenue", "suburbName" : "Preston" }
}
]
}
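One way to cope with forward references is to resolve in two passes: collect every declaration first, then substitute. The sketch below does this over a flat list of strings rather than a real JSON tree. That is a simplification for illustration, and the class and method names are hypothetical rather than part of the OJSON API. Escaped strings (the @e tag) are not handled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical two-pass resolver for "@n" and "@n:text" strings. */
final class ForwardRefSketch {
    // "@" and a number, optionally followed by ":" and the declared text.
    private static final Pattern REF = Pattern.compile("@(\\d+)(?::(.*))?", Pattern.DOTALL);

    /** Resolves every string, allowing a reference to precede its declaration. */
    static List<String> resolveAll(List<String> strings) {
        // Pass 1: collect all declarations, wherever they occur.
        Map<String, String> declared = new HashMap<>();
        for (String s : strings) {
            Matcher m = REF.matcher(s);
            if (m.matches() && m.group(2) != null) {
                declared.put(m.group(1), m.group(2));
            }
        }
        // Pass 2: replace declarations and references with the plain string.
        List<String> plain = new ArrayList<>();
        for (String s : strings) {
            Matcher m = REF.matcher(s);
            if (!m.matches()) {
                plain.add(s); // ordinary string, left untouched
            } else {
                String value = declared.get(m.group(1));
                if (value == null) {
                    throw new IllegalArgumentException("Undeclared reference: " + s);
                }
                plain.add(value);
            }
        }
        return plain;
    }

    public static void main(String[] args) {
        // "@3" is used before "@3:suburbName" declares it.
        System.out.println(resolveAll(Arrays.asList("@3", "Brunswick", "@3:suburbName", "Preston")));
    }
}
```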
Object key duplication optimisation
Objects may be optimised when they are contained within an array of similar objects. Typically each object in the array shares common keys, possibly with some optional keys as well. If the objects have at least a minimum (configurable) number of keys in common, the array of objects can be flattened to become an array of arrays.
The repeated keys are placed in the first row within the array and tagged with @t:. Subsequent rows contain the corresponding object values. The JSON standard requires that the order of array elements is preserved, allowing a simple mapping back to the original object structure. Optional keys are represented with the value @n when missing from a particular object. Explicit null values remain as null.
An example of a plain JSON array of objects:
"people" : [
{ "firstName": "bob", "lastName": "the builder", "age": 20, "title": "Mr" },
{ "firstName": "thomas", "lastName": "the tank", "title": null }
]
maps to OJSON as an array of arrays, removing duplication of keys:
"people" : [
[ "@t:firstName", "lastName", "age", "title" ],
[ "bob", "the builder", 20, "Mr" ],
[ "thomas", "the tank", "@n", null ]
]
Note the @n value in the 'thomas' row, which does not contain a value for the 'age' key.
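The reverse mapping can be sketched with plain Java collections. This is an illustration of the rules described above and not the library's implementation: the class name is hypothetical, a Java null stands in for a JSON null, and string references and escaping are ignored.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical expansion of a templated array of arrays into objects. */
final class TemplateSketch {
    static List<Map<String, Object>> expand(List<List<Object>> rows) {
        List<Object> header = rows.get(0);
        String first = String.valueOf(header.get(0));
        if (!first.startsWith("@t:")) {
            throw new IllegalArgumentException("First row is not a template row");
        }
        // Recover the keys, stripping the "@t:" tag from the first one.
        List<String> keys = new ArrayList<>();
        keys.add(first.substring("@t:".length()));
        for (int i = 1; i < header.size(); i++) {
            keys.add((String) header.get(i));
        }
        // Each following row supplies the values for one object, in key order.
        List<Map<String, Object>> objects = new ArrayList<>();
        for (List<Object> row : rows.subList(1, rows.size())) {
            Map<String, Object> object = new LinkedHashMap<>();
            for (int i = 0; i < keys.size(); i++) {
                Object value = row.get(i);
                if (!"@n".equals(value)) {          // "@n": key absent from this object
                    object.put(keys.get(i), value); // an explicit null is kept
                }
            }
            objects.add(object);
        }
        return objects;
    }

    public static void main(String[] args) {
        List<List<Object>> rows = new ArrayList<>();
        rows.add(java.util.Arrays.<Object>asList("@t:firstName", "lastName", "age", "title"));
        rows.add(java.util.Arrays.<Object>asList("bob", "the builder", 20, "Mr"));
        rows.add(java.util.Arrays.<Object>asList("thomas", "the tank", "@n", null));
        System.out.println(expand(rows));
    }
}
```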
Combined String and Object key optimisations
An example of combining both string reference and template optimisations:
{
"addresses" : [
[ "@t:address", "suburbName", "postcode" ],
[ "1 Smith Street", "@2:Brunswick", 3000 ],
[ "20 Jones Avenue", "@2", 3000 ]
]
}
Mapping Rules
The API provides a mapping from valid JSON to valid JSON so that the resulting OJSON text can be parsed by any JSON API. The optimised format could even be manually created using any standard JSON API simply by including the appropriate tags within strings. Some limitations exist, as described in the following sections.
Handling of unmappable JSON
The object mapping may not succeed, for several reasons that relate to the lack of strict rules defined by the JSON standard. For example:
- the objects within an array are too dissimilar
- an array contains a mix of element types including object and other types.
The API handles this by reverting to plain JSON for the particular structure, while continuing to process the remainder of the JSON text, descending into nested structures to look for further optimisations.
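The kind of check that decides whether an array is templated or left as plain JSON might look like the sketch below. The exact rules the API applies are not spelled out here, so this is only one reading of them: enough elements, every element an object, and enough keys common to all objects. The class, method and parameter names are hypothetical.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical eligibility test for the array-of-objects optimisation. */
final class TemplateEligibilitySketch {
    static boolean canTemplate(List<?> array, int minElements, int minCommonKeys) {
        if (array.size() < minElements) {
            return false; // too small to be worth templating
        }
        Set<Object> common = null;
        for (Object element : array) {
            if (!(element instanceof Map)) {
                return false; // mix of objects and other element types
            }
            Set<?> keys = ((Map<?, ?>) element).keySet();
            if (common == null) {
                common = new HashSet<Object>(keys);
            } else {
                common.retainAll(keys); // keep only keys shared by every object so far
            }
        }
        // Too few shared keys means the objects are too dissimilar.
        return common != null && common.size() >= minCommonKeys;
    }

    public static void main(String[] args) {
        Map<String, Object> a = new HashMap<>();
        a.put("firstName", "bob");
        a.put("lastName", "the builder");
        Map<String, Object> b = new HashMap<>();
        b.put("firstName", "thomas");
        System.out.println(canTemplate(java.util.Arrays.asList(a, b), 2, 1));
    }
}
```

An array that fails the check would simply be written out unchanged, with its elements still examined individually.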
Reserved tags in plain JSON
Some JSON text may contain strings beginning with Optimised JSON tags simply by chance. Their values are escaped in order to be preserved in the Optimised text. The @e tag is used to prefix the plain JSON text in the optimised text, and is always filtered when mapped back to plain JSON. This table lists each tag, an example of plain JSON beginning with that tag, and how the string is escaped in the final OJSON.
tag | plain JSON | optimised JSON |
---|---|---|
@e:xxx | @e:fooo | @e:@e:fooo |
@<number> | @122 | @e:@122 |
@<number>:xxx | @12:foo | @e:@12:foo |
@t:xxx | @t:bah | @e:@t:bah . If a tag key then @t:@t:bah instead |
@n | @n | @e:@n |
Note also that OJSON tags only exist at the beginning of a string. Additionally, only a single tag is recognised in an OJSON string, as tags cannot be nested.
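The escaping rules in the table can be approximated with a single pattern match. This is a sketch under the assumption that the table is exhaustive. The class and method names are hypothetical, and the special case for template keys (@t:@t:bah) is left out.

```java
import java.util.regex.Pattern;

/** Hypothetical escape and unescape helpers for strings that look like OJSON tags. */
final class EscapeSketch {
    // Matches whole strings that start like a tag: @e:..., @t:..., @n, @<number> or @<number>:...
    private static final Pattern RESERVED =
            Pattern.compile("@(e:.*|t:.*|n|\\d+(:.*)?)", Pattern.DOTALL);

    /** Prefixes "@e:" when a plain string would otherwise be read as a tag. */
    static String escape(String plain) {
        return RESERVED.matcher(plain).matches() ? "@e:" + plain : plain;
    }

    /** Removes a leading "@e:" again; only one tag is recognised, so no recursion. */
    static String unescape(String optimised) {
        return optimised.startsWith("@e:") ? optimised.substring("@e:".length()) : optimised;
    }

    public static void main(String[] args) {
        for (String s : new String[] {"@e:fooo", "@122", "@12:foo", "@t:bah", "@n", "plain"}) {
            System.out.println(s + " -> " + escape(s));
        }
    }
}
```

Because a tag only counts at the start of a string, a value such as "mail@12:foo" is left untouched by escape.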
Performance
A performance test is included which uses a simple array of customer data of 100 and 1000 records, each containing the same keys but random values. The test cases deliver between 45% and 49% reduction in the payload size using default settings.
Compared with raw compression using a GZIP algorithm, which converts the text to a binary stream, the 1000 record example was mapped to OJSON in 870 ms with a 49% size reduction, while the GZIP stream compressed the data in 90 ms with an 80% reduction. If there is no requirement to preserve the human-readable, plain text format of the JSON text, then compression is clearly a better choice.
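To repeat the GZIP side of this comparison on your own payloads, a measurement along these lines may be enough. It uses only the standard java.util.zip classes. The class and method names are hypothetical and this is not the performance test shipped with the distribution.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

/** Hypothetical helper for measuring how much GZIP shrinks a JSON text. */
final class GzipSizeSketch {
    /** Returns the GZIP-compressed size in bytes of the UTF-8 encoding of text. */
    static int gzipSize(String text) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
            gzip.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.size(); // the stream is closed here, so the trailer is included
    }

    /** Percentage reduction from originalSize to reducedSize. */
    static double reductionPercent(int originalSize, int reducedSize) {
        return 100.0 * (originalSize - reducedSize) / originalSize;
    }

    public static void main(String[] args) {
        StringBuilder json = new StringBuilder("[");
        for (int i = 0; i < 1000; i++) {
            if (i > 0) json.append(',');
            json.append("{\"firstName\":\"bob\",\"lastName\":\"the builder\",\"age\":").append(i).append('}');
        }
        json.append(']');
        int original = json.toString().getBytes(StandardCharsets.UTF_8).length;
        int zipped = gzipSize(json.toString());
        System.out.printf("%d -> %d bytes (%.1f%% reduction)%n",
                original, zipped, reductionPercent(original, zipped));
    }
}
```

The same reductionPercent calculation can be applied to the length of the OJSON text to compare the two approaches on equal terms.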
Roadmap
Under consideration are a number of more complex optimisations and other functionality. They are not currently defined in detail and not implemented as part of the mapping.
- Cleansing of string references not used in the remainder of the OJSON text
Enable a second pass optimisation to remove un-used string references
options.setStringReferenceCleaningEnabled(true);
Using a trivial example, in the complete OJSON text { @1:key : @2:value }, the @1 and @2 references may be removed as they are never re-used and only add to the resulting payload size.
- Fine tune string references
Allow certain structures to be excluded from string reference optimisation, for example large arrays of strings which are unlikely to contain duplicates.
- Concatenated string references
A limited form of concatenation could be possible in order to join a reference to a string literal. Such values would consist of a string reference concatenation tag followed by a string literal. The tags would only be possible at the start of the string. For example, optimising for the repeated "product code" string in plain JSON:
[ "product code", "product code description" ]
could map to OJSON as:
[ "@1:product code", "@1+: description" ]
- JavaScript module for the API
The API is initially targeted at Java to Java exchanges, for example one microservice calling another over REST / JSON or similar endpoints. There is also a case for optimisations between a front-end application and a back-end API, typically a JavaScript app running in a browser and a microservice API. A JavaScript module providing the same OJSON mapping/unmapping features as the Java API is due to be developed.
Author
Craig Ryan, 2016.