json-schema overview


Release Notes

version 1.0 : 2014-06-29
- added support for user defined types
- added ability to import user defined types from one schema into another

version 0.9.3 : 2014-06-20
- fixed threading issue with date format/validation

version 0.9.2 : 2014-05-12
- fixed commons-lang3 dependency
- added min, max and regex validations for type string

version 0.9.1 : 2014-03-02
- fixed example doc when schema has boolean field

version 0.9 : 2014-01-20
- initial release


Maven Dependency

<dependency>
   <groupId>org.hawksoft</groupId>
   <artifactId>json-schema</artifactId>
   <version>1.0</version>
</dependency>

Overview

json-schema allows you to validate a JSON data document against a given JSON schema document.

The JSON schema document follows the exact same structure as the data document it's validating. However, the schema attribute values are actually validation rules.

For example, the following JSON data attribute:

{ 
    ...
    "fruit" : "banana",
    ...
}

could be validated using the following schema attribute (rules):

{
    ...
    "fruit" : "type=enum;values=apple,orange,banana,pear",
    ...
}

Whereas { "fruit" : "banana" } would pass validation, { "fruit" : "strawberry" } would not, since strawberry is not one of the allowed enumeration values for 'fruit'.
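
To make the rule format concrete, here is a minimal sketch of how an enum rule string could be split into its entries and checked against a value. The class EnumRuleSketch and its methods are hypothetical and only illustrate the idea; they are not part of the json-schema API, and the library's own parser may work differently.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; not part of the json-schema API.
final class EnumRuleSketch {

    // Splits "key=value;key=value" into an ordered map of rule entries.
    static Map<String, String> parseRule(String rule) {
        Map<String, String> entries = new LinkedHashMap<>();
        for (String part : rule.split(";")) {
            int eq = part.indexOf('=');
            if (eq > 0) {
                entries.put(part.substring(0, eq).trim(), part.substring(eq + 1).trim());
            }
        }
        return entries;
    }

    // Returns true if the value is one of the comma-separated 'values' of an enum rule.
    static boolean isAllowed(String rule, String value) {
        Map<String, String> entries = parseRule(rule);
        if (!"enum".equals(entries.get("type")) || !entries.containsKey("values")) {
            throw new IllegalArgumentException("not an enum rule: " + rule);
        }
        List<String> allowed = Arrays.asList(entries.get("values").split(","));
        return allowed.contains(value);
    }

    public static void main(String[] args) {
        String rule = "type=enum;values=apple,orange,banana,pear";
        System.out.println(isAllowed(rule, "banana"));     // true
        System.out.println(isAllowed(rule, "strawberry")); // false
    }
}
```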

Since json-schema uses the Jettison JSONObject internally, it supports all the built-in types that JSONObject supports and adds support for dates in any format.


Required 'document' object

To be validated with json-schema, every JSON data document and every JSON schema document needs a top-level 'document' object, referred to as the header. The header states the type of document ('schema' or 'instance') and allows the correct schema to be matched with a given data document.

This is an example of a schema document header:

{
    "document": {
        "type" : "schema",
        "model": "1312.0",
        "namespace": "org.hawksoft",
        "id": "test",
        "version": "1.0"
    },
    ...

  • 'type' describes the type of document: either 'schema' or 'instance'
  • 'model' is required only in the schema header and is used internally in the validation engine for versioning purposes
  • 'namespace' is a user defined value that allows you to create a grouping (hopefully unique) for your schemas. It can be whatever you like in whatever format you like.
  • 'id' is the name for the schema within the namespace. Likewise, it can be whatever you like.
  • 'version' is the version of the schema

The namespace, id and version must form a unique combination.

Likewise, here is an example of an instance document header:

{
    "document": {
        "type" : "instance",
        "namespace": "org.hawksoft",
        "id": "test",
        "version": "1.0"
    },
    ...

The instance document header is simply a reference to the schema document that should be used for validation.
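
The matching described above can be sketched as a comparison of the three identifying header fields. This is only an illustration under the assumption that headers are available as simple string maps; HeaderMatchSketch and its methods are hypothetical names, not part of json-schema.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; the class and method names are not part of json-schema.
final class HeaderMatchSketch {

    // An instance header refers to a schema header when the 'type' fields are
    // 'instance' and 'schema' and the namespace, id and version all agree.
    static boolean refersTo(Map<String, String> instanceHeader, Map<String, String> schemaHeader) {
        if (!"instance".equals(instanceHeader.get("type"))
                || !"schema".equals(schemaHeader.get("type"))) {
            return false;
        }
        for (String field : new String[] {"namespace", "id", "version"}) {
            String expected = schemaHeader.get(field);
            if (expected == null || !expected.equals(instanceHeader.get(field))) {
                return false;
            }
        }
        return true;
    }

    // Convenience builder for a header map.
    static Map<String, String> header(String type, String namespace, String id, String version) {
        Map<String, String> h = new HashMap<>();
        h.put("type", type);
        h.put("namespace", namespace);
        h.put("id", id);
        h.put("version", version);
        return h;
    }

    public static void main(String[] args) {
        Map<String, String> schema = header("schema", "org.hawksoft", "test", "1.0");
        Map<String, String> instance = header("instance", "org.hawksoft", "test", "1.0");
        System.out.println(refersTo(instance, schema)); // true
    }
}
```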


Validations

As mentioned already, the attribute values in the schema document are actually validation rules. The following entries are mandatory:

  • type specifies the data type of the attribute and can be one of the following values: string, int, long, double, bool (boolean), date, uuid and enum
  • desc is a brief description of the attribute
  • example provides a contextual example of an expected value for the attribute and is used when creating/extracting example data documents from the schema

In addition, depending on the attribute 'type', other validation rules can be specified:

  • string, int, long, double and date types all support 'min' and 'max' validations
  • enum has a mandatory validation 'values' which lists the possible values for the enumeration
  • date has an additional attribute 'format' which allows you to specify a format string that will be used to parse the attribute value as a date; the default is 'yyyy-MM-dd'.
  • string has an additional validation 'regex' that allows you to provide a regular expression that will be used to validate the string data (be sure to double-escape the regex character sets, such as \\d instead of just \d for the digit character set)

Here are some examples:

{
    ....
    "color" : "type=enum;values=red,orange,yellow,green,blue;desc=item color;example=red",
    "weight" : "type=int;min=10;max=1000;desc=item weight in pounds;example=500",
    "functional" : "type=bool;desc=is the item functional?;example=false",
    "serviceDate" : "type=date;desc=date item went into service;example=2014-01-18",
    "inspectionTime" : "type=date;format=hhmm;desc=daily inspection time;example=0700",
    "invoiceId" : "type=string;regex=^[A-E]\\d{8};desc=invoice number;example=C12345678",
    ...
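
The individual checks behind rules like these can be sketched with plain JDK classes. This is a rough illustration, not the library's implementation: the class RuleChecksSketch is hypothetical, the min/max bounds are assumed to be inclusive, and the regex is assumed to be applied with find() rather than a full match.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.regex.Pattern;

// Hypothetical sketch of the individual checks; not the library's implementation.
final class RuleChecksSketch {

    // 'min' and 'max' for an int attribute, treated here as inclusive bounds (an assumption).
    static boolean intInRange(String value, int min, int max) {
        try {
            int n = Integer.parseInt(value);
            return n >= min && n <= max;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    // 'format' for a date attribute. A new SimpleDateFormat is created per call
    // because the class is not safe to share between threads.
    static boolean isDate(String value, String format) {
        SimpleDateFormat parser = new SimpleDateFormat(format);
        parser.setLenient(false);
        try {
            parser.parse(value);
            return true;
        } catch (ParseException e) {
            return false;
        }
    }

    // 'regex' for a string attribute. find() is used, so the pattern must carry
    // its own ^ and $ anchors if the whole value has to match (an assumption).
    static boolean matchesRegex(String value, String regex) {
        return Pattern.compile(regex).matcher(value).find();
    }

    public static void main(String[] args) {
        System.out.println(intInRange("500", 10, 1000));               // true
        System.out.println(isDate("2014-01-18", "yyyy-MM-dd"));        // true
        // In Java source the backslash is escaped once; inside a JSON schema
        // document it is escaped the same way, as "\\d".
        System.out.println(matchesRegex("C12345678", "^[A-E]\\d{8}")); // true
    }
}
```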

User Defined Types

Reusable validation rules, also known as user defined types, can be specified inside the 'document' section and then referenced later in the schema definition as a custom 'type'.

For example, let's say we wanted to define a reusable validation rule for the zip+4 zip code. In the document section we would define something like this using the 'userTypes' object:

"document" : {
    "type" : "schema",
    ...
    "userTypes" : {
        "zip+4" : "type=string;regex=^\\d{5}[-]\\d{4}$;desc=zip code + 4;example=64840-8344"
    }

Then, later in the schema we could reference this new type like so:

"zip" : "type=zip+4"

Notice that we didn't specify the 'desc' and 'example' attributes of the 'zip' validation. Since they are already supplied with the definition of 'zip+4', we were not required to provide them. You can still include them if you like, for example to provide a more specific description or a better example.
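
One way to picture this is as a merge: the entries of the user defined type form the base, and any entries on the referencing rule override them. The sketch below assumes exactly that behavior and handles only a single level of reference; UserTypeSketch and its methods are hypothetical names, not part of json-schema.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch; assumes local entries override those of the user defined type.
final class UserTypeSketch {

    // Splits "key=value;key=value" into an ordered map of rule entries.
    static Map<String, String> parseRule(String rule) {
        Map<String, String> entries = new LinkedHashMap<>();
        for (String part : rule.split(";")) {
            int eq = part.indexOf('=');
            if (eq > 0) {
                entries.put(part.substring(0, eq), part.substring(eq + 1));
            }
        }
        return entries;
    }

    // Expands a rule whose 'type' names a user defined type. Built-in types are
    // returned unchanged. Only one level of reference is resolved.
    static Map<String, String> resolve(String rule, Map<String, String> userTypes) {
        Map<String, String> local = parseRule(rule);
        String base = userTypes.get(local.get("type"));
        if (base == null) {
            return local;
        }
        Map<String, String> resolved = parseRule(base);
        for (Map.Entry<String, String> e : local.entrySet()) {
            if (!"type".equals(e.getKey())) {
                resolved.put(e.getKey(), e.getValue());
            }
        }
        return resolved;
    }

    public static void main(String[] args) {
        Map<String, String> userTypes = new HashMap<>();
        userTypes.put("zip+4",
            "type=string;regex=^\\d{5}[-]\\d{4}$;desc=zip code + 4;example=64840-8344");
        System.out.println(resolve("type=zip+4", userTypes).get("example")); // 64840-8344
    }
}
```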


Optional Attributes

If a given attribute is not required in the instance (data) document, simply add a question mark '?' to the end of the attribute name, like so:

{
    ...
    "birthDate?" : "type=date;min=1980-01-01;desc=birth date;example=1986-03-17",
    ...

If the data document contains the attribute 'birthDate' it will be validated against the given rules. If the attribute is missing, that fact will be ignored and validation will continue.

Unless marked optional, any attribute, object or array is mandatory and validation will fail if it is missing from the data document being validated.
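
The presence check can be sketched as follows. The code assumes a flat schema object given as a map of attribute names to rule strings; OptionalAttributeSketch is a hypothetical name used only for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the presence check for one flat object.
final class OptionalAttributeSketch {

    // Returns the names of mandatory schema attributes that are absent from the data.
    static List<String> missingRequired(Map<String, String> schema, Map<String, ?> data) {
        List<String> missing = new ArrayList<>();
        for (String key : schema.keySet()) {
            boolean optional = key.endsWith("?");
            // The '?' marker exists only in the schema; the data uses the bare name.
            String name = optional ? key.substring(0, key.length() - 1) : key;
            if (!optional && !data.containsKey(name)) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        Map<String, String> schema = new LinkedHashMap<>();
        schema.put("name", "type=string;desc=full name;example=John Doe");
        schema.put("birthDate?", "type=date;min=1980-01-01;desc=birth date;example=1986-03-17");

        Map<String, Object> data = new HashMap<>();
        System.out.println(missingRequired(schema, data)); // [name]
    }
}
```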


Document structure

As mentioned previously, json-schema validates not only individual attributes but also the structure of the JSON document. In other words, when you define the schema document you are also defining the expected (required) structure of the data (instance) document.

For example, the following schema has nested objects and arrays:

{
    "document" : {
        "type" : "schema",
        ...
    },
    "person" : {
        "name" : "type=string;desc=first and last name;example=John Doe",
        "phone" : [
            {
                "type" : "type=enum;values=home,work,mobile;desc=phone type;example=mobile",
                "value" : "type=string;desc=phone number;example=608-555-1212"
            }
        ]
    ...

The following instance document would pass validation (assume the document headers match):

{
    "document" : {
        "type" : "instance",
        ...
    },
    "person" : {
        "name" : "Ralph Emerson",
        "phone" : [
            {
                "type" : "mobile",
                "value" : "414-555-5555" 
            }, {
                "type" : "home",
                "value" : "515-555-5505"
            }
        ]
    ...
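
A structural check of this kind can be pictured as a parallel walk over the schema and the data. The sketch below uses plain Map and List values in place of JSONObject and JSONArray, assumes that the first element of a schema array is the template for every data element, and leaves out optional attributes and the rule checks on leaf values. StructureSketch is a hypothetical name; the library's own traversal may differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of a structural walk; not the library's implementation.
final class StructureSketch {

    // Collects structural problems found while walking schema and data in parallel.
    static List<String> check(Object schema, Object data, String path) {
        List<String> errors = new ArrayList<>();
        if (schema instanceof Map) {
            if (!(data instanceof Map)) {
                errors.add(path + ": expected an object");
                return errors;
            }
            Map<?, ?> dataMap = (Map<?, ?>) data;
            for (Map.Entry<?, ?> e : ((Map<?, ?>) schema).entrySet()) {
                String name = String.valueOf(e.getKey());
                if (!dataMap.containsKey(name)) {
                    errors.add(path + "/" + name + ": missing");
                } else {
                    errors.addAll(check(e.getValue(), dataMap.get(name), path + "/" + name));
                }
            }
        } else if (schema instanceof List) {
            if (!(data instanceof List)) {
                errors.add(path + ": expected an array");
                return errors;
            }
            List<?> templates = (List<?>) schema;
            List<?> elements = (List<?>) data;
            // Assumption: the first schema element describes every data element.
            for (int i = 0; !templates.isEmpty() && i < elements.size(); i++) {
                errors.addAll(check(templates.get(0), elements.get(i), path + "[" + i + "]"));
            }
        }
        // A String schema value is a rule for a leaf attribute; rule checks are omitted here.
        return errors;
    }

    // Builds an ordered map from alternating keys and values.
    static Map<String, Object> obj(Object... keysAndValues) {
        Map<String, Object> map = new LinkedHashMap<>();
        for (int i = 0; i + 1 < keysAndValues.length; i += 2) {
            map.put(String.valueOf(keysAndValues[i]), keysAndValues[i + 1]);
        }
        return map;
    }

    public static void main(String[] args) {
        Object schema = obj("person", obj(
            "name", "type=string;desc=first and last name;example=John Doe",
            "phone", Arrays.asList(obj(
                "type", "type=enum;values=home,work,mobile;desc=phone type;example=mobile",
                "value", "type=string;desc=phone number;example=608-555-1212"))));
        Object data = obj("person", obj(
            "name", "Ralph Emerson",
            "phone", Arrays.asList(obj("type", "mobile"))));
        System.out.println(check(schema, data, "")); // [/person/phone[0]/value: missing]
    }
}
```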

Schema Imports

User defined types from other schemas can be made visible to the current schema by using the 'imports' section inside the 'document' header, like so:

"document" : {
    "type"     : "schema",
    ...
    "imports" : [
        {
            "namespace" : "org.hawksoft",
            "id" : "parent.commons",
            "version" : "0.1"
        }
    ]

One or more schemas can be imported, which makes all of their user defined types visible inside the current schema. Imports are resolved transitively: any schemas that the imported schemas import are also discovered and imported, and so on.

For imported schema discovery to work, a schema provider (such as a JSONSchemaRepository) must be supplied when a JSON data document is being validated. See JSONSchemaValidatorTest#testImports in the source code for an example of this.
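
The transitive discovery of imports can be sketched as a breadth-first walk with a visited set, which also keeps circular imports from looping forever. The sketch assumes each schema is identified by a single string key and that its direct imports are already known; ImportDiscoverySketch and the key format are hypothetical, not part of json-schema.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of transitive import discovery.
final class ImportDiscoverySketch {

    // directImports maps a schema key to the keys it imports directly.
    // Returns every schema reachable through imports, excluding the root itself.
    static Set<String> discover(String root, Map<String, List<String>> directImports) {
        Set<String> seen = new LinkedHashSet<>();
        Deque<String> pending = new ArrayDeque<>();
        pending.add(root);
        while (!pending.isEmpty()) {
            String current = pending.remove();
            List<String> imports = directImports.get(current);
            if (imports == null) {
                continue; // this schema imports nothing
            }
            for (String imported : imports) {
                // seen.add returns false for schemas that were already discovered
                if (!imported.equals(root) && seen.add(imported)) {
                    pending.add(imported);
                }
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        Map<String, List<String>> directImports = new HashMap<>();
        directImports.put("app", Arrays.asList("parent.commons"));
        directImports.put("parent.commons", Arrays.asList("base.types"));
        System.out.println(discover("app", directImports)); // [parent.commons, base.types]
    }
}
```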


Schema and data document example

The linked [schemaDocument] and [instanceDocument] are taken from the JSONSchemaValidator unit tests and provide a more robust example of validation types and document structure.


JSONSchemaValidator API

org.hawksoft.json.JSONSchemaValidator has the following API:

// use this method if you have an InputStream for the data doc and you want the
// validator to look up the schema from the given provider;  a JSONSchemaRepository
// is a valid schema provider
public static void validateDocument(InputStream json, ISchemaProvider schemaProvider)


// use this method if you have a JSONObject data doc and a schema provider, as described above
public static void validateDocument(JSONObject json, ISchemaProvider schemaProvider)


// use this method if you have both JSONObject data and schema documents
public static void validateDocument(JSONObject json, final JSONObject schema)


// use this method if you have a String representation of the data document and you 
// want the validator to look up the schema document
public static void validateDocument(String json, ISchemaProvider schemaProvider)


// use this method if you have String representations for both the data and schema docs
public static void validateDocument(String json, String schema)

All API methods throw a JSONValidationException if there is any problem parsing the data or schema docs or if there is a validation error. The exception message includes a detailed description of the problem.

Consult the Javadocs and take a look at the unit tests for more information and examples.


Schema Repository

json-schema also includes a schema repository for loading and accessing schema documents. The class is named JSONSchemaRepository, and its constructor takes either an absolute path to a directory on the file system or a relative path to a folder/package within the jar/war file where the schema documents are located. The current implementation uses a PathMatchingResourcePatternResolver to locate the resources representing the JSON schema documents, like so:

public JSONSchemaRepository(String repoPath) throws JSONValidationException {
    PathMatchingResourcePatternResolver resolver =
        new PathMatchingResourcePatternResolver();

    try {
        Resource[] resources = resolver.getResources(repoPath + "/*.jsd");

The single method for looking up a schema is:

public JSONObject getSchema(String namespace, String schemaId, String version)

There are also methods that let you extract example data (instance) documents from the schema, which is useful for things like:

  • providing dynamic documentation of an API or service that consumes or produces JSON documents
  • self-validating the schema (extract an example and feed it and the schema back into the JSON validator)

// use this if you want the schema to be looked up
public JSONObject getExample(String namespace, String schemaId, String version)

// use this if you have the schema and simply want to extract an example from it
public JSONObject getExample(JSONObject schema)
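
To give an idea of what such an extraction involves, here is a minimal sketch that walks a schema fragment and replaces each rule string with its 'example' entry. It uses plain Map and List values in place of JSONObject and JSONArray, returns every example as a string without converting it to the attribute's type, and does not handle the 'document' header. ExampleExtractSketch is a hypothetical name; the real getExample may behave differently.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of example extraction; not the library's implementation.
final class ExampleExtractSketch {

    // Returns the text after "example=" in a rule string, or null if there is none.
    static String exampleOf(String rule) {
        for (String part : rule.split(";")) {
            if (part.startsWith("example=")) {
                return part.substring("example=".length());
            }
        }
        return null;
    }

    // Walks a schema fragment and builds a matching example fragment.
    static Object extract(Object schema) {
        if (schema instanceof Map) {
            Map<String, Object> out = new LinkedHashMap<>();
            for (Map.Entry<?, ?> e : ((Map<?, ?>) schema).entrySet()) {
                String key = String.valueOf(e.getKey());
                if (key.endsWith("?")) {
                    key = key.substring(0, key.length() - 1); // the optional marker is schema-only
                }
                out.put(key, extract(e.getValue()));
            }
            return out;
        }
        if (schema instanceof List) {
            List<Object> out = new ArrayList<>();
            for (Object element : (List<?>) schema) {
                out.add(extract(element));
            }
            return out;
        }
        return exampleOf(String.valueOf(schema));
    }

    public static void main(String[] args) {
        Map<String, Object> schema = new LinkedHashMap<>();
        schema.put("color", "type=enum;values=red,orange,yellow,green,blue;desc=item color;example=red");
        schema.put("weight", "type=int;min=10;max=1000;desc=item weight in pounds;example=500");
        System.out.println(extract(schema)); // {color=red, weight=500}
    }
}
```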

Related

Wiki: instanceDocument
Wiki: schemaDocument
