xsd2pgschema Code
Relational database replication tool based on XML Schema
Brought to you by:
yokochi
File | Date | Commit |
---|---|---|
example | 2021-05-19 | [505d93] Ajust name of example scripts |
src | 2024-09-19 | [23d39c] Reimprove message for Not found root table error |
LICENSE | 2017-04-20 | [69a8e2] Release v2.4.3 |
NOTICE | 2024-07-11 | [e33b22] Release v4.4.9 |
README | 2024-09-19 | [23d39c] Reimprove message for Not found root table error |
User Guide.pdf | 2024-07-22 | [e7e685] Revise User Guide to avoid confusion of '--upda... |
pom.xml | 2024-07-11 | [e33b22] Release v4.4.9 |
Changes in xsd2pgschema 4.4.9, 5.3.9 (2024-07-25)
NOTICE: PgSchema model server should be restarted even if you already used 4.4.9 or 5.3.9.
Code refactoring and library updates.
Fixed --table-excl option, which did not work in standalone mode.
[General Discussion - Exception in thread "main" java.lang.StackOverflowError] Fixed StackOverflowError.
[General Discussion - How to use --update properly?] Do not show the default '--update' option in usage to avoid confusion.
Revised User Guide to match the implementation.
Revised DDL description of dismissed local restrictions for reusable simple content.
Retrieve the local restrictions of reusable simple content when in-lining simple content.
Fixed some missing simple content (bypassed via simple bridge or PostgreSQL's VIEW table) when in-lining simple content.
Enabled checking the local restrictions of reusable simple content regardless of in-lining simple content.

Minor fixes on 2024-09-19
Enabled retrieving XSD file from URL in xsdvalidator main class.
[General Discussion - Error: net.sf.xsd2pgschema.PgSchemaException: Not found root table in XML Schema] Improved error message to avoid confusion.

Changes in xsd2pgschema 4.4.8, 5.3.8 (2024-01-29)
Added --table-excl option, which excludes uninteresting or bulky tables (including their children) from data migration, full-text indexing, and JSON conversion.
For users on JDK21 environment, you can find branch https://sourceforge.net/p/xsd2pgschema/code/ci/jdk21/tree/, which includes updated pom.xml and example scripts.
How to compile JDK21 version from the source code (git and mvn are required):
% git clone git://git.code.sf.net/p/xsd2pgschema/code xsd2pgschema-code
% cd xsd2pgschema-code
% git checkout jdk21
% git pull
% mvn clean package
% cp target/xsd2pgschema-*-jar-with-dependencies.jar ./xsd2pgschema.jar
% export JAVA_OPT_X2P="--add-opens java.base/java.lang=ALL-UNNAMED --add-opens java.base/java.math=ALL-UNNAMED --add-opens java.base/java.util=ALL-UNNAMED --add-opens java.base/java.util.concurrent=ALL-UNNAMED --add-opens java.base/java.net=ALL-UNNAMED --add-opens java.base/java.text=ALL-UNNAMED --add-opens java.sql/java.sql=ALL-UNNAMED"
% java $JAVA_OPT_X2P -classpath ./xsd2pgschema.jar (main_class_name) (arguments)

Minor fixes on 2024-02-29
[Ticket #13] Revised User Guide.pdf and added JAR files for Java 14, 17, and 21 environments in v5.3.8 package.

Changes in xsd2pgschema 4.4.7, 5.3.7 (2023-07-10)
[General Discussion - Exception in File Splitter] Fixed null pointer exception while splitting.
Removed serial_key and xpath_key in simple bridge table when it is implemented using PostgreSQL's VIEW.

Changes in xsd2pgschema 4.4.6, 5.3.6 (2023-07-05)
[General Discussion - XSD Validator throws an error, but the file seems to be valid] Fixed fetching incomplete XML Schema files.

Changes in xsd2pgschema 4.4.5, 5.3.5 (2023-06-21)
[General Discussion - primary key not getting created] Added unique key constraint on the root table's primary key, even if the root table is a list holder.

Changes in xsd2pgschema 4.4.4, 5.3.4 (2023-03-03)
Added relational tuple-oriented JSON conversion (--rel-tuple-json option).
New JSON format contains "tags" and "loop" objects for each relation. The "tags" object stores tag names of tuples present in subsequent "loop". The "loop" object contains a list of tuples.
Added support for JSON Schema draft 2020_12 as default. The draft 2020_12 only affects the relational tuple-oriented JSON format.
Fixed missing wildcard content in relational list-oriented JSON (f.k.a. relational-oriented JSON) file.

Minor fixes on 2023-03-22
Performance improvement of differential update and changed the default value of --min-rows-for-index from 2K to 1M.
Revised User Guide.pdf to include performance improvement tips for differential update.

Changes in xsd2pgschema 4.4.3, 5.3.3 (2022-12-19)
Added '--field-deny', '--case-insensitive', '--pg-public-schema', and '--pg-named-schema' options in xml2luceneidx and xml2sphinxds main classes.
The new '--field-deny' option allows dropping a table or a field from full-text indexing.

Changes in xsd2pgschema 4.4.2, 5.3.2 (2022-10-14)
Enabled mapping any XSD duration type (xs:duration, xs:yearMonthDuration, xs:dayTimeDuration) to PostgreSQL interval type.
Added '--pg-map-interval' and '--pg-map-text-interval' options; the latter is provided for backward compatibility.
Added support for xml.bz2 file decompression (usage: --xml-file-ext bz2).
Added support for xml.Z file decompression (usage: --xml-file-ext Z).
For users on JDK17 environment, you can find branch https://sourceforge.net/p/xsd2pgschema/code/ci/jdk17/tree/, which includes updated pom.xml and example scripts.
How to compile JDK17 version from the source code (git and mvn are required):
% git clone git://git.code.sf.net/p/xsd2pgschema/code xsd2pgschema-code
% cd xsd2pgschema-code
% git checkout jdk17
% git pull
% mvn clean package
% cp target/xsd2pgschema-*-jar-with-dependencies.jar ./xsd2pgschema.jar
% export JAVA_OPT_X2P="--add-opens java.base/java.lang=ALL-UNNAMED --add-opens java.base/java.math=ALL-UNNAMED --add-opens java.base/java.util=ALL-UNNAMED --add-opens java.base/java.util.concurrent=ALL-UNNAMED --add-opens java.base/java.net=ALL-UNNAMED --add-opens java.base/java.text=ALL-UNNAMED --add-opens java.sql/java.sql=ALL-UNNAMED"
% java $JAVA_OPT_X2P -classpath ./xsd2pgschema.jar (main_class_name) (arguments)

Changes in xsd2pgschema 4.4.1, 5.3.1 (2022-09-01)
Added support for schema components, xs:openContent and xs:defaultOpenContent.
Added support for XSD built-in data types, xs:precisionDecimal, xs:anySimpleType, and xs:anyAtomicType.
Fixed rearrangement of integer data types during relation merging when signed int 32-bit is selected (--pg-map-integer option).
Revised User Guide.pdf.

Changes in xsd2pgschema 4.4.0, 5.3.0 (2022-08-23)
Added support for schema transformation by xs:override and xs:redefine mechanisms.

Changes in xsd2pgschema 4.3.6, 5.2.6 (2022-08-16)
NOTICE: We recommend that all users update to 4.3.6, 5.2.6, or later.
[Ticket #12-5] Re-fix - xs:complexType|xs:complexContent/@mixed="true", which must generate xs:simpleContent field.

Changes in xsd2pgschema 4.3.5, 5.2.5 (2022-08-15)
NOTICE: The [Ticket #12-3, 5, 6, 7] may affect some users who utilize multiple XML schemata via the xs:import or xs:include mechanisms.
[Withdrawn] Try to use 4.3.6, 5.2.6, or later.
[Ticket #12-5] Bug fix - xs:complexType|xs:complexContent/@mixed="true", which must generate xs:simpleContent field.
[Ticket #12-6] Bug fix - external root element must not be virtual (XsTableType.xs_extern_root type has been added).
[Ticket #12-7] Bug fix - resolve duplication of nested keys and drop parent node name constraints when name collision between foreign key and nested key occurs that indicates self-reference.
[Ticket #12-8] Bug fix - wild cards should not parse existing element/attribute nodes.
[Ticket #12-9] Added '--wild-card-to-latter' option to move wild cards to latter column.

Changes in xsd2pgschema 4.3.4, 5.2.4 (2022-08-12)
NOTICE: The [Ticket #12-3] may affect some users who rely on ancient XML schema (~2012); please compare DDLs generated by different versions before you move to 4.3.4/5.2.4. The change is good for data integrity; otherwise some attribute information will remain lost.
[Withdrawn] Try to use 4.3.6, 5.2.6, or later.
[Ticket #12-2] Added '--show-orphan-table' option equally to main classes associated with generated PostgreSQL DDL with the option.
[Ticket #12-3] Enabled handling of attribute declarations without type specifications (against the W3C rule); xs:string type is assigned instead.
[Ticket #12-4] Bug fixes in copying attributeGroup and modelGroup that were causing data migrations to fail using the xml2pgsql main class.

Changes in xsd2pgschema 4.3.3, 5.2.3 (2022-08-09)
[Withdrawn] Try to use 4.3.6, 5.2.6, or later.
[Ticket #12] Fixed stack overflow exception due to self-reference in XSD.
Raise an error when target XML files are not found.

Changes in xsd2pgschema 4.3.2, 5.2.2 (2022-06-15)
NOTICE: We recommend that all users update to 4.3.2, 5.2.2, or later.
Fixed PgSchema model server performance of the previous release (4.3.1, 5.2.1).
Added support for blocking substitution models into global abstract element by @block attribute.
Improved annotations for xs:list/@itemType and xs:union/@memberTypes.
Fixed NullPointerException when using '--show-orphan-table' option.

Changes in xsd2pgschema 4.3.1, 5.2.1 (2022-06-14)
[Withdrawn] Try to use 4.3.2, 5.2.2, or later.
Added support for global element definitions having @abstract="true" and @substitutionGroup attributes.
Added more strict data model identification on the PgSchema server using the root schema's hash code.

Changes in xsd2pgschema 4.3.0, 5.2.0 (2022-06-07)
[Withdrawn] Try to use 4.3.2, 5.2.2, or later.

Changes in xsd2pgschema 4.2.3, 5.1.3 (2021-12-28)
Updated dependency (Lucene 8.11.1).

Changes in xsd2pgschema 4.2.2, 5.1.2 (2021-09-21)
Fixed to follow symbolic links in given XML directories.

Changes in xsd2pgschema 4.2.1, 5.1.1 (2021-08-05)
Added '--pg-named-schema-translation' option that allows user to choose PostgreSQL named schema instead of the default prefix.

Changes in xsd2pgschema 4.2.0, 5.1.0 (2021-05-12)
[General Discussion - "How to search for xsd files"] Added support for XSD retrieval from relative URL.
Stored retrieved XSDs in local URL-like directory.

Changes in xsd2pgschema 4.1.7, 5.0.3 (2021-03-19)
Added DELETE and FREE query of PgSchema server.

Changes in xsd2pgschema 4.1.6, 5.0.2 (release 2021-03-01)
Fixed JSON Schema mapping of @fractionDigits.
Added '--skip-range-outlier' option to skip content outside of range restrictions, @maxExclusive, @maxInclusive, @minInclusive, and @minExclusive.

Changes in xsd2pgschema 4.1.5, 5.0.1 (release 2021-02-05)
[Ticket #11] Implementation test including bug fixes for unique constraint generation.
Code refactoring with minor fixes on switch expression.

Changes in xsd2pgschema 5.0.0 (release 2020-12-24)
NOTICE: [Ticket #10] JDK14 is required. Users in JDK8/9/10/11/12/13 environments should stay with v4 series.
Dropped support for JSON Schema draft version 4.
Removed XML post editorial functions, --filt-in, --filt-out, and --fill-this options.

Changes in xsd2pgschema 4.1.4 (release 2020-11-30)
Allowed mixed use of prefixes (e.g. "xs" and "xsd" prefixes in a document) for the XML Schema namespace URI, http://www.w3.org/2001/XMLSchema.
Removed reference to http://www.jsonix.org/jsonschemas/w3c/2001/XMLSchema.jsonschema# in JSON Schema.
Added primitive "integer" type in JSON Schema for XSD integer types, instead of "number" type with "multipleOf": 1.
Updated dependency (Lucene 8.7, ANTLR 4.9).

Changes in xsd2pgschema 4.1.3 (release 2020-07-17)
NOTICE: We recommend that all users update to 4.1.3 or later.
Fixed sharding of Sphinx full-text indexing (xml2sphinxds).
Performance improvement on merging stage of full-text indexing.

Changes in xsd2pgschema 4.1.2 (release 2020-07-16)
Performance improvement on data migration/creating indexes.

Changes in xsd2pgschema 4.1.1 (release 2020-06-24)
Fixed exception when valid document path is not set.

Changes in xsd2pgschema 4.1.0 (release 2020-01-24)
NOTICE: We recommend that all users update to 4.1.0 or later.
Fixed duplication of primary key that affects the following versions: 4.0.5, 4.0.6, and 4.0.7.
Fixed SQL translation of document order dependent XPath query using position() and last() functions.
Fixed SQL translation of XPath query including aggregate functions, count(), sum(), local-name(), namespace-uri(), and name().
Fixed SQL translation of XPath query including substring-before(), substring-after(), string(), number(), boolean(), and not() functions.
Added diagnostic hint for XPath query including id(), count(), and sum() functions.
Fixed node identification in XPath function call context.
Updated dependency (ANTLR4 from 4.7 to 4.8).
Fixed SQL translation of XPath query including variable reference.

Changes in xsd2pgschema 4.0.7 (release 2020-01-22)
[Ticket #9] Fixed NullPointerException while schema generation.
Used serial key in the ORDER BY clause if available.
Imposed to use the same version of PgSchema server.

Changes in xsd2pgschema 4.0.6 (release 2020-01-09)
Added xsdvalidator main class to validate XML Schema using W3C's Schema for Schema.
Fixed PostgreSQL DDL annotation for xs:maxExclusive and xs:minExclusive restraints.
Added support for ORDER BY clause derived from xs:key and xs:unique.
Fixed ConcurrentModificationException when serial key or XPath key is enabled.
Added ORDER BY clause for document key of XPath's subject table.

Changes in xsd2pgschema 4.0.5 (release 2019-12-27)
Added a period (.) and jsonb operators as PostgreSQL reserved operators.
Added --type-check option to enable data type/range check while data conversion. The type/range check is disabled by default for performance. Please utilize --type-check option when data migration fails because of parse error. Invalid data will be omitted (null) while data conversion.

Changes in xsd2pgschema 4.0.4 (release 2019-11-18)
Code refactoring for overall performance gains including start-up time.
Performance improvement on XPath evaluation/full-text indexing with the help of PgSchema server.
Fixed date change in XPath evaluation output affected by UTC and local timezone.
Do not throw exception for reporting multiple root nodes.

Changes in xsd2pgschema 4.0.3 (release 2019-11-05)
Fixed nonsense delegated field name to primitive data type.
Enabled inlining simple content when relational model extension was disabled by default.
Enabled inlining simple content with attribute(s) having fixed value(s) when relational model extension was disabled.
Omitted attribute(s) having fixed value(s) are described in PostgreSQL DDL. It occurs only when relational model extension is disabled.
Added ancestor/parent node name constraint checker in JSON Schema conversion.
Fixed missing tables when relational model extension was disabled with inlining simple content.

Changes in xsd2pgschema 4.0.2 (release 2019-10-30)
Added --xml-deny-frag and --json-deny-frag option in xpath2xml and xpath2json main classes, respectively.
Revised PostgreSQL mapping of xs:dateTime, xs:dateTimeStamp, and xs:date (if --pg-map-timestamp is used) that support millisecond precision.
Changed to use UTC timezone by default in XPath evaluation output document, instead of local timezone.

Changes in xsd2pgschema 4.0.1 (release 2019-10-28)
Fixed PgSchemaException while consistency test on PostgreSQL DDL (--test-ddl option).
Added --pg-comment-on option in xsd2pgschema main class to set annotation as PostgreSQL comment.
Fixed missing UNION ALL clause in PostgreSQL views.
Fixed ClassCastException in XPath evaluation.

Changes in xsd2pgschema 4.0.0 (release 2019-10-24)
CAVEAT: PostgreSQL databases generated by v3.x are compatible with read-only actions such as SQL SELECT clause (in your application), XPath evaluation, not with data migration/update actions.
CAVEAT: Use --realize-simple-brdg option for backward compatibility with the v3.x series.
Implemented simple bridge tables using PostgreSQL views by default (v4.x), added --realize-simple-brdg option for backward compatibility (v3.x). This change will contribute to shrinking physical database size without side effects on XPath evaluation.
Fixed PostgreSQL data type assignment on xs:list/xs:union defined under xs:simpleType with restrictions.
Fixed --sync option, which did not delete rows if the XML did not exist.

Changes in xsd2pgschema 3.4.0 (release 2019-10-21)
NOTICE: We recommend that all users update to 3.4.0 or later.
Fixed missing data in administrative xs:complexContent delegated to definition in the other XSD.
Fixed table classification that was the cause of XMLStreamException: Attribute not associated with any element in XPath evaluation.
Defined "nested key as attribute group" in PostgreSQL DDL to escape from the XMLStreamException.
Further SQL optimization for XPath evaluation including PostgreSQL indexing.
Allowed to map xs:date to PostgreSQL TIMESTAMP type (--pg-map-timestamp option, which conforms to the W3C standards).
Otherwise, map xs:date to PostgreSQL DATE type by default (--pg-map-date option).
Added --xml-unqualify-def-ns option in xpath2xml main class to unqualify default target namespace.
Revised example scripts for IntAct database.

Changes in xsd2pgschema 3.3.4 (release 2019-10-08)
Revised implementation of child node name constraint.
Revised restraint information in PostgreSQL DDL.
Improved attribute existence check for nested key as attribute.
Added content validation against xs:list/@itemType or xs:union/@memberTypes.
Added support for @maxOccurs constraint in nodeparser.
Added support for CREATE/DROP of PostgreSQL index on primary key without unique constraint (--create-non-uniq-pkey-index (default), --no-create-non-uniq-pkey-index, and --drop-non-uniq-pkey-index options).
Improved XPath evaluation performance.
Added support for mapping XSD's @fixed value to JSON Schema's "const" property.

Changes in xsd2pgschema 3.3.3 (release 2019-09-27)
NOTICE: We recommend that all users update to 3.3.3 or later.
Fixed missing data in administrative xs:complexContent.
Fixed invalid parent node name constraint causing missing data.

Changes in xsd2pgschema 3.3.2 (release 2019-09-25)
Fixed infinite loop while XPath evaluation in some cases.

Changes in xsd2pgschema 3.3.1 (release 2019-09-19)
Fixed an implementation that avoids virtual duplication of nested keys causing primary key errors.
Added support for JSON Schema Draft 2019-09 (https://tools.ietf.org/html/draft-handrews-json-schema-02).

Changes in xsd2pgschema 3.3.0 (release 2019-09-18)
[Ticket #4-7] Added support for xs:simpleContent/xs:restriction and xs:complexContent/xs:restriction.
[Ticket #4-8] Fixed namespace misrecognition.
[Ticket #4-9] Resolved name collision between nested key and foreign key.
Improved orphan/unnecessary table's information in PostgreSQL DDL.
Improved restraint information in PostgreSQL DDL.
Fixed absolute XPath expression of table utilized for XPath parser.
Fixed misrecognition of tables when in-lining simple content.

Changes in xsd2pgschema 3.2.1 (release 2019-09-06)
Fixed missing primitive data type when default namespace prefix is http://www.w3.org/2001/XMLSchema.
Dropped dead-end tables by default. Use --show-orphan-table option to verify the orphan/dead-end tables.
Revised PostgreSQL DDL, which reports orphan/dead-end tables with their PostgreSQL named schema.
Fixed unnecessary escaped table name in comment of nested key in special case.

Changes in xsd2pgschema 3.2.0 (release 2019-09-05)
[Ticket #8-4] Added --inline-simple-cont option to enable in-lining simple content.
Fixed duplicated tables when multiple XML Schemata were included or imported.
Resolved name collision between foreign key and nested key and their circular dependency.
Fixed invalid attribute name in XPath evaluation result (simple content as attribute case).

Changes in xsd2pgschema 3.1.2 (release 2019-08-20)
Fixed null pointer exception while detecting root element.

Changes in xsd2pgschema 3.1.1 (release 2019-08-08)
[Ticket #8-2] Added table duplication check for root table with backstop.
[Ticket #8-3] Added support for shared simple type attribute and conditional simple type attribute.

Changes in xsd2pgschema 3.1.0 (release 2019-08-06)
NOTICE: We recommend that all users update to 3.1.1 or later.
[Ticket #8-1] Fixed stack overflow due to circular dependency among schema components.
Updated dependency (Apache Lucene from 7.x to 8.x).

Changes in xsd2pgschema 3.0.9 (release 2019-03-07)
Added auto-detection of root element in the primary schema.
Added support for namespace change of field in table.

Changes in xsd2pgschema 3.0.8 (release 2019-02-19)
Fixed typos in usage.
Added a user guide (User Guide.pdf).
Fixed constraint name including PostgreSQL reserved operator.
Fixed case-insensitive name including PostgreSQL reserved operator.
Revised last modified date checker of check sum file.
Added --xml-allow-frag option in xpath2xml main class and --json-allow-frag option in xpath2json main class.

Changes in xsd2pgschema 3.0.7 (release 2018-11-16)
Allowed to follow URL redirection of schema location.
Automatically redirect system messages to stderr if the result is redirected to stdout.
Added --del-invalid-xml option in xmlvalidator main class.
Added support for CREATE/DROP of PostgreSQL index on element (--create-elem-index, --no-create-elem-index (default), --drop-elem-index, and --max-elem-cols-for-index options).
Added support for CREATE/DROP of PostgreSQL index on simple content (--create-simple-cont-index (default), --no-create-simple-cont-index, and --drop-simple-cont-index options).
Set --create-attr-index and --create-simple-cont-index option by default.
Performance improvements of XPath serializer, XPath translation, and XPath evaluation.
Allowed to use simple content as in-place document key when table name is specified by --inplace-doc-key-name (table_name).content format.
Added orphan table report in PostgreSQL DDL.
Added --max-fks-for-simple-cont-index option to escape from error while indexing because huge simple content exceeds PostgreSQL working memory.
Set default number 0 for --max-fks-for-simple-cont-index option.
Changed argument name from --min-rows-for-doc-key-index to --min-rows-for-index because the value is uniformly applied to attribute, element, and simple content PostgreSQL index.
Added --show-orphan-table option in xsd2pgschema main class.
Fixed null name for document key in XPath evaluation result.

Changes in xsd2pgschema 3.0.6 (release 2018-10-25)
Added support for xs:dateTimeStamp, xs:yearMonthDuration, xs:dayTimeDuration.
Dropped support for non-standard data types, bigserial, serial, and bigint.
Allowed to map standard mathematical concept of integer numbers (xs:integer, xs:nonNegativeInteger, xs:nonPositiveInteger, xs:positiveInteger, xs:negativeInteger) to BigInteger in Java, DECIMAL in PostgreSQL (--pg-map-big-integer option).
Allowed to map the integer numbers to signed long 64 bits (--pg-map-long-integer option).
Otherwise, map the integer numbers to signed int 32 bits by default (--pg-map-integer option).
Allowed to map the decimal numbers (xs:decimal) to double precision 64 bits (--pg-map-double-decimal option).
Allowed to map the decimal numbers to single precision 32 bits (--pg-map-float-decimal option).
Otherwise, map the decimal numbers to BigDecimal in Java, DECIMAL in PostgreSQL by default (--pg-map-big-decimal option).
Added xsi:schemaLocation attribute in result of XPath evaluation.
Added xsi:nil attributes in empty element if @nillable="true". Please use --xml-no-nil-elem option in xpath2xml main class to suppress the xsi:nil attribute.
Fixed --xml-insert-doc-key option, which did not work in 3.0.5.
Further performance improvements of XPath evaluation.
Added precision control to value of double or float column.
Revised comments in PostgreSQL DDL with regard to the mapping of integer/decimal numbers.

Changes in xsd2pgschema 3.0.5 (release 2018-10-22)
Added support for CREATE/DROP of PostgreSQL index on attribute like BaseX (--create-attr-index, --no-create-attr-index (default), --drop-attr-index, and --max-attr-cols-for-index options).
Fixed potential bug in XPath parser about XPath expression including any element (xs:any).
Performance improvements in data migration, full-text index, XPath parser, and XPath evaluation by buffer size optimization, SQL optimization, reduction of excess validation, memory caching of frequently accessed status values, etc.

Changes in xsd2pgschema 3.0.4 (release 2018-10-11)
Added support for xs:unique and xs:key (generation of PostgreSQL multi-column UNIQUE constraint including a document key at least).
Added --max-uniq-tuple-size option in xsd2pgschema main class that regulates the generation of PostgreSQL UNIQUE constraint derived from xs:key.
Revised statistics and comments in PostgreSQL DDL.
Fixed PSQLException (The column index is out of range) when relational model extension was disabled (affects 3.0.4 released before Oct 10).
Fixed nested key as attribute was unreachable since 3.0.2 update.
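The numeric mapping options listed under 3.0.6 above can be summarized as a small lookup. The Python sketch below is only an illustration and is not code from xsd2pgschema: the function name pg_type and both dictionaries are hypothetical, and BIGINT, INTEGER, DOUBLE PRECISION, and REAL are assumed here as the standard PostgreSQL types of the stated widths, since the notes name only DECIMAL explicitly.

```python
# Hypothetical summary of the 3.0.6 numeric mapping options.
# Only DECIMAL is named in the release notes; the other PostgreSQL type
# names are assumptions based on the stated bit widths.
INTEGER_OPTIONS = {
    "--pg-map-big-integer": "DECIMAL",    # BigInteger in Java
    "--pg-map-long-integer": "BIGINT",    # signed long 64 bits (assumed name)
    "--pg-map-integer": "INTEGER",        # signed int 32 bits, the default (assumed name)
}
DECIMAL_OPTIONS = {
    "--pg-map-big-decimal": "DECIMAL",             # BigDecimal in Java, the default
    "--pg-map-double-decimal": "DOUBLE PRECISION",  # 64 bits (assumed name)
    "--pg-map-float-decimal": "REAL",               # 32 bits (assumed name)
}
XSD_INTEGER_TYPES = {
    "xs:integer", "xs:nonNegativeInteger", "xs:nonPositiveInteger",
    "xs:positiveInteger", "xs:negativeInteger",
}


def pg_type(xsd_type, options=()):
    """Pick a PostgreSQL type for an XSD numeric type given command-line options."""
    if xsd_type in XSD_INTEGER_TYPES:
        table, default = INTEGER_OPTIONS, "--pg-map-integer"
    elif xsd_type == "xs:decimal":
        table, default = DECIMAL_OPTIONS, "--pg-map-big-decimal"
    else:
        raise ValueError("not covered by this sketch: " + xsd_type)
    # The first matching option wins; fall back to the documented default.
    chosen = next((opt for opt in options if opt in table), default)
    return table[chosen]


print(pg_type("xs:integer"))
print(pg_type("xs:decimal", ["--pg-map-double-decimal"]))
```

Keeping the defaults in one place mirrors how the notes describe them: 32-bit integers and arbitrary-precision decimals unless an option says otherwise.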
Changes in xsd2pgschema 3.0.3 (release 2018-09-14)
Fixed incomplete PostgreSQL data model when multiple XML Schemata were included or imported.
More strict implementation of identity constraint definition (xs:key and xs:keyref).
Stabilized quoted table annotation.
Allowed qualified XPath query evaluation even if public schema has been applied on PostgreSQL.

Changes in xsd2pgschema 3.0.2 (release 2018-09-12)
Set --no-rel option in xml2luceneidx by default because of performance and frequency of use; added --rel option for backward compatibility.
Detected unreachable nested keys, which contributes to performance and database size.
Further FST optimization.

Changes in xsd2pgschema 3.0.1 (release 2018-08-30)
Added support for complex type extension of xs:element defined by primitive data type (e.g. addition of xs:attribute).
Fixed JSON conversion issues that occurred in 2.12.7 and 3.0.0 (withdrawn).
Fixed indent level of simple type attribute and conditional simple type attribute in XPath query evaluation.
Fixed exception to report multiple root nodes and fragments in XPath query evaluation (introduced in 2.12.7, but unexpectedly unused in 3.0.0).

Changes in xsd2pgschema 3.0.0 (release 2018-08-24)
[Self Ticket #7] Added PgSchema server (pgschemaserv main class), which serves serialized PostgreSQL data model to client.
Many main classes support PgSchema server by default (host: localhost, port: 5430).
Added PgSchema server relating options (--no-pgschema-serv, --pgschema-serv-host HOST_NAME, --pgschema-serv-port PORT_NUMBER).
Added example scripts to start server (start_pgschema_serv.sh), stop server (stop_pgschema_serv.sh), and report server status (status_pgschema_serv.sh).
Improved XML Schema analysis performance at start-up utilizing the PgSchema server.

Changes in xsd2pgschema 2.12.7 (release 2018-08-07)
Added --freq option in luceneidx2dic, luceneidx2ftxt, and luceneidx2infix main classes as well as dicmerge4sphinx main class.
Fixed typo in JSON Schema.
Improved XPath query evaluation performance by using PostgreSQL index on document key.
Added exception to report multiple root nodes and fragments in XPath query evaluation.

Changes in xsd2pgschema 2.12.6 (release 2018-08-02)
Moved utility classes into the main line (src/net/sf/xsd2pgschema) except for main classes.
Allowed to compare time stamp between source XML and check sum for performance of differential update by default. To revert the previous default setting, please use --sync-rescue option.

Changes in xsd2pgschema 2.12.5 (release 2018-07-20)
Added support for CREATE/DROP PostgreSQL index on document key (--create-doc-key-index, --no-create-doc-key-index (default), and --drop-doc-key-index options, respectively).
Added --min-rows-for-doc-key-index option to set minimum rows in a table for creation of PostgreSQL index on document key (default=10000).
Enabled to use PostgreSQL index on document key by default in differential update (--sync and --sync-weak options). Improved performance in differential update as a result. To revert the previous default setting, please use --no-create-doc-key-index option.

Changes in xsd2pgschema 2.12.4 (release 2018-07-19)
Allowed in-place document key in case relational model extension was enabled.
Fixed too many open files error in case of huge relations by optimizing file resource.
Added DEFERRABLE INITIALLY DEFERRED to foreign key constraint in PostgreSQL DDL (to escape from duplicate key value violation by default).
Added --sync-rescue option in xml2pgsql main class (to escape duplicate key value violation using previous DDL for backward compatibility).
Fixed too many open files error using regular expression to specify a large number of directories.
Multi-thread reporter of check sum files (chksumstat).

Changes in xsd2pgschema 2.12.3 (release 2018-07-04)
Improved performance in parallel processing by removal of lock objects.
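The time stamp comparison mentioned in the 2.12.6 notes above can be pictured with a short Python sketch. This is a hedged illustration of the general idea only, not the tool's implementation: the function name needs_update, the file layout, and the rule "re-process when the XML is newer than its check sum file" are all assumptions made for the example.

```python
import os


def needs_update(xml_path, chksum_path):
    """Return True when the XML source should be re-processed.

    Assumed rule (hypothetical, for illustration): a missing check sum file
    or an XML file modified after its check sum file triggers an update;
    otherwise the document is skipped without reading its content.
    """
    if not os.path.exists(chksum_path):
        return True
    return os.path.getmtime(xml_path) > os.path.getmtime(chksum_path)
```

Comparing modification times avoids recomputing a check sum for every unchanged document, which is the kind of saving the note attributes to differential update.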
Changes in xsd2pgschema 2.12.2 (release 2018-06-25)
Revised parser for any content (xs:any, xs:anyAttribute) by means of SAX (dropped jsoup dependency).
Fixed issues in processing any content (PostgreSQL data migration, JSON Schema mapping, JSON conversion, XPath->SQL query translation, XPath query evaluation over PostgreSQL).
Added support for @namespace restriction of any content.

Changes in xsd2pgschema 2.12.1 (release 2018-06-14)
Fixed violation of foreign key constraint in differential update using the previous versions: 2.11.0 and 2.12.0 (withdrawn).

Changes in xsd2pgschema 2.12.0 (release 2018-06-13)
Added --schema-ver option in xml2json, xpath2json, xsd2jsonschema main classes ("draft_v7", "draft_v6", "draft_v4", and "latest" are acceptable values).
Added support for JSON Schema draft v7 (latest), v6, and v4.
Fixed issues in XML Schema->JSON Schema conversion.
Fixed issues in XML->JSON conversion.

Changes in xsd2pgschema 2.11.0 (release 2018-06-08)
Added XPath evaluator main class, xpath2json, to retrieve JSON document over PostgreSQL.
Allowed multiple XPath query evaluations, just repeat --xpath-query options.
Rewrote code base using java.nio.Path and java.nio.Files.
Added example script to demonstrate XPath 1.0 query evaluation to JSON over PostgreSQL.

Changes in xsd2pgschema 2.10.4 (release 2018-05-31)
[Ticket #6] Fixed differential update in case of --no-key option.
Fixed string escape function for TSV format.
Fixed issues while XPath->SQL query translation (ArrayIndexOutOfBoundsException, invalid path to simple content, invalid JOIN clause).
Added example script to demonstrate XPath 1.0 query evaluation over PostgreSQL.
Added support for simple type attribute and conditional simple type attribute in XML->JSON conversion.
Added support for simple type attribute and conditional simple type attribute in XPath->SQL query translation.
Fixed issues while XML->JSON conversion (remove nonsense item definition in JSON Schema, case-insensitive, conditional attribute, string escape, avoid collapsing in object-oriented JSON format).
Updated https://github.com/antlr/grammars-v4 module to fix qName and functionName rules.

Changes in xsd2pgschema 2.10.3 (release 2018-05-23)
Fixed ClassCastException in 2.10.2 during setting discarded document key or in-place document key.
Fixed XML Schema->object- and column-oriented JSON Schema conversion in 2.10.2.
[Ticket #6] Added --no-key option to xml2pgcsv and xml2pgtsv for argument's compatibility.
Added statistics about simple type attribute, conditional simple type attribute and nested key pointing simple type attribute.
Set UTF-8 as default client encoding.

Changes in xsd2pgschema 2.10.2 (release 2018-05-22)
Added --fill-default-value option and changed policy not to fill default value.
Changed name of main class from xpathevaluator to xpath2xml, XPath 1.0 query evaluator to XML over PostgreSQL.
Revised XML parse process not to violate unique constraint and to remove invalid nest key.
Fixed missing predicate in some cases while XPath->SQL query translation.
Fixed missing data of simple content in list.
Added support for xs:attribute extended by xs:simpleType (introducing simple content as attribute).
Fixed simple content override with attribute defined by simple type extension (introducing simple content as conditional attribute).
Fixed XML format issues: XML declaration, proper indent, empty element for element that contains attribute only.
[Ticket #6] Added --no-key option to xml2pgsql main class for differential update without primary/foreign key constraints.

Changes in xsd2pgschema 2.10.1 (release 2018-05-16)
Fixed XPath evaluator to use writeEmptyElement for empty element.
Fixed XPath->SQL query translation in case of joining table in the same depth (null pointer exception).
[Ticket #5] Fixed issues caused by --case-insensitive option.
[Ticket #5] Revised implementation of table, field, foreign key, and so forth by separation of canonical name in XML Schema and name in PostgreSQL DDL.
Removed invalid nest key which caused violation of unique constraint during PostgreSQL data migration.

Changes in xsd2pgschema 2.10.0 (release 2018-05-14)
Added XPath evaluator main class, xpathevaluator, to retrieve XML document over PostgreSQL.
Fixed missing data of simple content on virtual table.
Fixed time zone of value of xs:date, xs:gYearMonth, and xs:gYear from local time to UTC.
Fixed declaration of enumerated type if PostgreSQL named schema is enabled.
Added ancestor node checker to remove nonsense nest key.

Changes in xsd2pgschema 2.9.3 (release 2018-05-01)
Added reporter of checksum files (chksumstat main class).

Changes in xsd2pgschema 2.9.2 (release 2018-04-27)
Added support for TSV format as PostgreSQL data migration format (xml2pgtsv, tsv2pgsql main classes).
Fixed SQL translation when the last path was table (xpath2pgsql main class).

Changes in xsd2pgschema 2.9.1 (release 2018-04-26)
Added support for xml.zip file decompression (usage: --xml-file-ext zip).

Changes in xsd2pgschema 2.9.0 (release 2018-04-17)
Added support for PostgreSQL named schema (--pg-named-schema option).

Changes in xsd2pgschema 2.8.5 (release 2018-04-09)
Improved overall XML output performance.
Improved differential update performance in case that relational model extension was disabled.
Appended ETC (estimated time of completion) in progress display.
Added verbose mode in xmlvalidator main class.
Enabled differential XML Schema validation (--sync option in xmlvalidator main class).
Added --well-formed option, which checks only whether a document is well-formed.

Changes in xsd2pgschema 2.8.4 (release 2018-03-30)
Fixed differential update of Sphinx full-text indexing, which did not work in the previous version.
Improved performance in differential update and XML splitter.
Added --doc-key-if-no-inplace option, which appends document key in case that in-place document key does not exist.

Changes in xsd2pgschema 2.8.3 (release 2018-03-29)
Fixed null pointer exception in xmlsplitter.
Fixed differential update of PostgreSQL data migration, which did not work in the previous version.
Added --inplace-doc-key-name option to allow differential update in case that relational model extension was disabled.
Fixed duplication if primary key was not unique during differential update.
Added --lower-case-doc-key and --upper-case-doc-key options.

Changes in xsd2pgschema 2.8.2 (release 2018-03-27)
Appended ON DELETE CASCADE to foreign key constraint for differential update.
Supported PostgreSQL UPSERT command during differential update.
Fixed Lucene full-text indexing in differential update mode.
Improved full-text indexing performance in differential update mode.

Changes in xsd2pgschema 2.8.1 (release 2018-03-23)
Restored order of tables in PostgreSQL DDL, which broke in 2.7.9 and 2.8.0.
Defined prefix of namespace URI, sphinx, in xmlpipe2 document for StAX processing in differential update.
Fixed differential update of xmlpipe2 document.

Changes in xsd2pgschema 2.8.0 (release 2018-03-22)
Enabled differential update for full-text indexing in addition to PostgreSQL data migration.
Added --sync and --sync-weak options in xml2luceneidx and xml2sphinxds main classes.
Updated example scripts using differential update.

Changes in xsd2pgschema 2.7.9 (release 2018-03-13)
Added --sync and --sync-weak options in xml2pgsql, which enable differential update of PostgreSQL.

Changes in xsd2pgschema 2.7.8 (release 2018-02-28)
Fixed consistency test on PostgreSQL DDL in case that relational model extension was disabled.
Fixed data migration issue via xml2pgsql if field was defined as numerical enumeration.
Added --test-ddl option in csv2pgsql, xml2pgcsv, xml2pgsql and xpath2pgsql main classes.
Allowed use of regular expression in --xml option. e.g.
--xml some_dir/[0-9a-z]{2}, --xml some_dir/*, etc.
Added consistency test on column order and column type of PostgreSQL DDL.

Changes in xsd2pgschema 2.7.7 (release 2018-02-08)
Removed JSON arrays whose objects are all either null or empty.
Added xmlvalidator main class, a parallel XML validator against XML Schema.
Fixed exception in case of insufficient arguments.

Changes in xsd2pgschema 2.7.6 (release 2018-01-18)
[Ticket #4-3] Fixed overriding prefix to http://www.w3.org/2001/XMLSchema by schema import or include.
[Ticket #4-4] Fixed overriding schemata which have the same file name.
[Ticket #4-5] Revised implementation of attribute group, model group and foreign key.
[Ticket #4-6] Allowed lazy evaluation of attribute group and model group.

Changes in xsd2pgschema 2.7.5 (release 2018-01-16)
Fixed missing attribute and field in full-text indexing when sharding was enabled.
Excluded decimal data from minimum word length condition while indexing.
Fixed issue in schema generation of XSD without declaration of target namespace.
Fixed list holder detection (decided by maxOccurs, minOccurs of nested key element) in case the element was referenced by 'ref' attribute.
Fixed nested keys remaining in PostgreSQL DDL although relational model extension was disabled.

Changes in xsd2pgschema 2.7.4 (release 2017-12-26)
Disallowed multiple full-text indexing for attribute except for Sphinx multi-valued attribute. The first value in a document of the attribute will be indexed by default.
Added --mva option in xml2sphinxds main class to declare Sphinx multi-valued attribute.

Changes in xsd2pgschema 2.7.3 (release 2017-12-20)
Excluded integer data from minimum word length condition while indexing (--min-word-len option in xml2luceneidx and xml2sphinxds).
[Ticket #4-2] Added --no-cache-xsd option in almost all main classes. If schema location is specified as a URL, the program retrieves the XML schema without caching.
Added --max-field-len option in xml2sphinxds main class to prevent failure while indexing that exceeds the memory limit defined by max_xmlpipe2_field (sphinx.conf).
Imposed hard limit on maximum field length during Sphinx full-text indexing by the --max-field-len option.

Changes in xsd2pgschema 2.7.2 (release 2017-12-05)
[Ticket #4-1] Fixed exception in case of missing root table.
Changed option name from --discard-doc-key to --discarded-doc-key-name.

Changes in xsd2pgschema 2.7.1 (release 2017-12-01)
Allowed multiple --discard-json-doc-key options in xml2json.
Added --discard-doc-key option in almost all main classes.
Added xpathparser main class, an XPath 1.0 parser with XML Schema validation.

Changes in xsd2pgschema 2.7.0 (release 2017-11-22)
Added XPath->SQL translation for wild cards.
Allowed XML Schema without namespace declaration, which caused null pointer exception.
Added Sphinx configuration file in example directory.

Changes in xsd2pgschema 2.6.6 (release 2017-11-16)
Updated to PostgreSQL 10's reserved words.
Fixed issue in handling 'ref' attribute with QName which caused exception during schema analysis.
Fixed getAbsoluteXPathOfTable() which caused an infinite loop in XPath parser.
Fixed xmlsplitter which generated duplicate document units.
Retrieved prefix of namespace URI in PostgreSQL DDL.
Retrieved annotation of nested table in PostgreSQL DDL and JSON Schema, if possible.
Fixed schema statistics in case that relational model extension was turned off.

Changes in xsd2pgschema 2.6.5 (release 2017-10-25)
[Ticket #3] Fixed automatic retrieval of XML Schemata located in the same directory of designated XML Schema.

Changes in xsd2pgschema 2.6.4 (release 2017-10-24)
Names of user keys (document_key, serial_key and xpath_key) are adjustable via --doc-key-name, --ser-key-name and --xpath-key-name options, respectively.

Changes in xsd2pgschema 2.6.3 (release 2017-10-10)
Updated to support Apache Lucene 7.0.0 or later.
Improved parallel processing efficiency using LinkedBlockingQueue.

Changes in xsd2pgschema 2.6.2 (release 2017-09-15)
Removed duplicate plugin in pom.xml.

Changes in xsd2pgschema 2.6.1 (release 2017-09-08)
Implemented all XPath 1.0 function calls.
Fixed database replication error that occurred in 2.6.0 and previous 2.6.1 released before 2017-08-24.
Fixed PostgreSQL DDL mapping issue in case of complex restrictions.

Changes in xsd2pgschema 2.6.0 (release 2017-08-04)
Added XML Schema aware XPath parser, which enabled XPath->SQL translation via xpath2pgsql main class.
Fixed data migration of xs:hexBinary and xs:base64Binary.
Revised XPath parser, which allowed XPath 1.0 except for XPath 1.0 function call.
Updated dependency (commons-text, instead of commons-lang3).
Added --xpath-var option to enable variable reference in XPath->SQL translation.
Implemented boolean functions of XPath 1.0.
Implemented string functions of XPath 1.0.
Implemented number functions of XPath 1.0 (not yet for node set functions).

Changes in xsd2pgschema 2.5.0 (release 2017-07-13)
[Ticket #2] Added xmlsplitter main class to split large XML file.
Implemented ANTLR v4 based XPath parser for xmlsplitter.
Implemented StAX based parser for xmlsplitter.
Added --shard-size option to enable sharding in xmlsplitter.
Fixed XPath parser to allow QNameContext node.
Added example scripts which perform splitting of large XML file, database replication and full-text indexing for UniProtKB/Swiss-Prot database.

Changes in xsd2pgschema 2.4.4 (release 2017-05-11)
Fixed primary/foreign hash key which caused unreachable relational data.
Revised index policy to index all data if the user does not specify a field option; previously, string data were indexed in that case.
Added --attr-string, --attr-integer, --attr-float, --attr-date and --attr-time options to enable type dependent attribute selection in full-text indexing.
Fixed missing data during full-text indexing in case of --no-rel option.
Dropped --no-rel option from xml2sphinxds class.
Revised Sphinx attribute index policy so that the value of the first occurrence in a document is stored, except for integer multi-valued attributes.

Changes in xsd2pgschema 2.4.3 (release 2017-05-08)
Added Java package name for core classes.
Added Maven project (groupId=net.sf.xsd2pgschema, artifactId=xsd2pgschema).
Fixed collapsed JSON in case of multiple JSON conversion.
Dropped --ow-csv, --ow-idx and --ow-ds options; appending mode has been turned off by default (to turn on, please use --append option).
Deployed to the Maven Central Repository.
Fixed fault detection of parent node name list which caused missing data.
Fixed XPath predicate error in list holder.

Changes in xsd2pgschema 2.4.2 (release 2017-04-18)
XML Schema validation has been turned off by default (to turn on, please use --valid option).
Allowed to change DB host (localhost) and port (5432) by --db-host and --db-port options.
Fixed issues around data migration (in case of --no-rel option) that occurred only in previous 2.4.2.
Fixed relational-oriented JSON conversion.

Changes in xsd2pgschema 2.4.1 (release 2017-04-10)
Allowed multiple --xml options to designate multiple sources of XML file or directory.
Fixed fault detection of parent node name list which caused missing data.
Fixed NullPointerException at getLocalName().
Revised console output while processing.
Changed option name from --array-json to --json-array-all.
Changed option name from --smpl-cont-json-key to --simple-cont-json-key.
Fixed missing data of element with qualified name.

Changes in xsd2pgschema 2.4.0 (release 2017-04-06)
Added --max-thrds option to allow parallel processing (default is number of available processors) in xml2pgcsv, xml2pgsql, xml2json, xml2luceneidx and xml2sphinxds.
Added --shard-size option to enable sharding in xml2luceneidx and xml2sphinxds.
Allowed multiple --xml options to designate multiple sources of XML file or directory.

Changes in xsd2pgschema 2.3.0 (release 2017-03-31)
Revised PostgreSQL DDL mapping for xs:dateTime (TIMESTAMP), xs:time (TIME) and xs:explicitTimezone (e.g. WITH TIME ZONE).
Revised time checker based on LocalTime and OffsetTime.
Added normalizer for xs:fractionDigits and xs:whiteSpace restrictions.
Added --no-wild-card option to ignore wild cards for backward compatibility.
Fixed recognition of list table which caused missing data.
Added --update option in xml2pgsql, which enables partial update of document where document key must be predefined.

Changes in xsd2pgschema 2.2.0 (release 2017-03-29)
NOTICE: xsd2pgschema conforms to W3C XML Schema Definition Language 1.1 at last.
Added support for XML Schema 1.1 open content components, also known as wild cards (xs:any, xs:anyAttribute). Contents of the wild card are stored in PostgreSQL XML datatype with column name "any_element" for xs:any and "any_attribute" for xs:anyAttribute; they can be indexed by either Apache Lucene or Sphinx Search, and mapped as XML content in JSON item.
Updated description about statistics of schema in PostgreSQL DDL.
Removed improper enumeration value length limit on XML->JSON conversion.

Changes in xsd2pgschema 2.1.0 (release 2017-03-27)
Internal code refactoring and class name revisions.
Added automatic retrieval of XML Schemata located in the same directory of designated XML Schema.
[Ticket #1] Allowed to include or import XML Schema having neither root element nor administrative elements, but with warning message.
Fixed stack overflow issue when including nested XML Schemata.
Properly copying target namespace and schema location as concatenated form in multiple referencing case.
Added --xml-file-ext-digest option in xml2pgcsv, xml2pgsql, xml2luceneidx and xml2sphinxds classes as well as xml2json class.
Added descriptions about schema modeling options and statistics of schema in PostgreSQL DDL.
Fixed xml2pgsql to commit a transaction per document.
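
The xs:dateTime, xs:time and xs:explicitTimezone mapping listed under 2.3.0 above might look roughly like the following sketch. It is written in Python for brevity and is not the tool's actual code. The function name pg_time_type is hypothetical, and the handling of the "optional" and "prohibited" facet values is an assumption; the changelog only states the base types and gives WITH TIME ZONE as an example.

```python
# Base PostgreSQL types named in the 2.3.0 entry.
_BASE_TYPES = {"xs:dateTime": "TIMESTAMP", "xs:time": "TIME"}


def pg_time_type(xs_type: str, explicit_timezone: str = "optional") -> str:
    """Return a PostgreSQL column type for an XML Schema time type (illustrative only)."""
    base = _BASE_TYPES[xs_type]
    if explicit_timezone == "required":
        # The facet demands a time zone, so keep it in the column type.
        return base + " WITH TIME ZONE"
    if explicit_timezone == "prohibited":
        # Assumption: a prohibited time zone maps to the explicit WITHOUT form.
        return base + " WITHOUT TIME ZONE"
    if explicit_timezone == "optional":
        # Assumption: no facet (or "optional") leaves the bare base type.
        return base
    raise ValueError("unknown xs:explicitTimezone value: " + explicit_timezone)
```

For instance, pg_time_type("xs:dateTime", "required") yields TIMESTAMP WITH TIME ZONE under these assumptions.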

Changes in xsd2pgschema 2.0.0 (release 2017-03-24)
NOTICE: As we adopt strict schema component naming scheme from 2.0.0, PostgreSQL DDL is not compatible between v2.x.x and v1.x.x.
Revised XML->JSON and XML Schema->JSON Schema conversion (inlining content of virtual table and dropped --no-rel option).
Internal code refactoring and class name revisions.
Fixed java.util.ConcurrentModificationException while blocking substitution group.
Added --numeric-idx option in xml2luceneidx to allow storing numeric values in index for backward compatibility.

Changes in xsd2pgschema 2.0.0.rc1 (release 2017-03-17)
NOTICE: We recommend all users avoid using 1.14.0 and 1.15.x.
Fixed critical bug in 1.14.0 and 1.15.x which caused missing data.
Fixed XPath base name.
Adopted strict schema component name matching that sweeps name collision.
Removed nonsense --no-key option except for xsd2pgschema class.
Fixed escaping constraint name in PostgreSQL DDL.
Added parent node name list of nested key in PostgreSQL DDL.
Changed to store numerical data as Lucene index; previously numerical data were stored for range filter.
Allowed including or importing global elements.

Changes in xsd2pgschema 1.15.5 (release 2017-03-15)
Strict primary key assignment.
Fixed virtual duplication of nested key (e.g. IntAct//experimentList/experimentDescription_id is dropped because another IntAct//experimentList/experimentDescriptionList_id is pointing to the same id via IntAct//experimentDescriptionList/experimentDescription_id).
Internal code refactoring and class name revisions.

Changes in xsd2pgschema 1.15.4 (release 2017-03-13)
Fixed invalid primary key assignment.
Added support for xs:all|xs:choice|xs:sequence/@maxOccurs|@minOccurs.
Added table type declarations in PostgreSQL DDL: content holder (content: true), list holder (list: true), bridge table (bridge: true), hub table (hub: true), administrative table (virtual: true), target namespace and schema location, respectively.
Fixed invalid XPath base name.
Fixed extension of xs:simpleContent.
Allowed to set schema location in --xsd option.
Fixed virtual table identification.
Updated table type declarations in PostgreSQL DDL (adding type column).

Changes in xsd2pgschema 1.15.3 (release 2017-03-09)
Fixed java.util.ConcurrentModificationException in case of including or importing namespaces.

Changes in xsd2pgschema 1.15.2 (release 2017-03-08)
Added "xpath_id" column to enable direct access via XPath expression if --xpath-key is set in xsd2pgschema, xml2pgcsv and xml2pgsql. The primary key named "(relation_name)_id" stores hash value calculated from (document_id)(XPath to current node) and the xpath key named "xpath_id" stores hash value calculated from (XPath to current node).
Fixed a potential bug which caused missing data in xml2pgcsv and xml2pgsql in case --ser-key is selected.
Fixed default setting for --ser-key in xml2pgsql.
Fixed invalid XPath base name in case of --xpath-key.
Internal code refactoring for performance improvement.

Changes in xsd2pgschema 1.15.1 (release 2017-03-02)
Added "serial_id" column to keep node order if --ser-key option is set in xsd2pgschema, xml2pgcsv and xml2pgsql.
Added --ser-size option in xml2pgcsv, xml2pgsql and xsd2pgschema classes that allows to choose bit length of serial key from "short" and "int" (default).
Fixed PostgreSQL DDL conversion of XML Schema using unqualified name with error message: No namespace prefix stands for http://www.w3.org/2001/XMLSchema in a document.
Adopted more XPath compatible nomenclature as base name. For example, (document_id)/root_elem/child_elem[3] represents base name of the third child element (child_elem) of root element (root_elem) in document_id.xml.

Changes in xsd2pgschema 1.15.0 (release 2017-02-24)
NOTICE: Hash strategy has been changed to use XPath compatible nomenclature as base name. Hence, primary key, foreign key and nested key values are not compatible with previous ones.
User can select a hash algorithm from MD5, SHA-1 and SHA-256 and a truncation size from 32 bit integer, 64 bit long and default bit size (128 bit for MD5, 160 bit for SHA-1 and 256 bit for SHA-256, respectively).
Added --hash-by and --hash-size options in xml2json, xml2pgcsv, xml2pgsql, xml2luceneidx, xml2sphinxds classes.
Added --hash-size option in xsd2pgschema and xsd2jsonschema classes. --hash-by selects the message digest algorithm from "MD5", "SHA-1" (default) and "SHA-256", and --hash-size selects from "int", "long" (default) and "default". "int" and "long" represent truncation of hash key to unsigned 32bit integer and unsigned 64bit long, respectively, and "default" means no truncation.
Reusing MessageDigest to generate hash key.
Fixed a potential bug in xml2pgcsv and xml2pgsql.
Added truncation of constraint name to be less than 64 characters, which reduces PostgreSQL warning messages.
Retrieved xs:attribute and xs:simpleContent declaration as comment in PostgreSQL DDL.

Changes in xsd2pgschema 1.14.0 (release 2017-02-02)
Internal code refactoring.

Changes in xsd2pgschema 1.13.5 (release 2017-01-31)
Updated IOUtils from 1.4 to 2.5.
Updated CommonsLang from 2.4 to 3.5.
Replaced StringBuffer by StringBuilder.

Changes in xsd2pgschema 1.13.4 (release 2017-01-26)
Reusing DOMParser for XML Schema validation.
Reusing DocumentBuilder for XML parser.

Changes in xsd2pgschema 1.13.3 (release 2016-12-19)
Enabled subset Lucene full-text indexing like Sphinx and introduced --attr, --field, --attr-all and --field-all options in xml2luceneidx.
Enabled Sphinx full-text indexing including all attributes, then introduced --attr-all options like Lucene.
Aligned option names in both Lucene and Sphinx full-text indexing; --attr, --field, --attr-all, --field-all options are common in both xml2luceneidx and xml2sphinxds.
NOTICE: --sph-attr and --sph-field of xml2sphinxds are not valid. Please use --attr or --field instead.
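
To illustrate the --hash-by and --hash-size choices described under 1.15.0 above, here is a minimal sketch of hashing an XPath-like base name and truncating the digest. It uses Python's hashlib in place of Java's MessageDigest and is not the tool's implementation. The function name hash_key, the UTF-8 encoding, keeping the leading digest bytes, big-endian byte order and the hex output for "default" are all assumptions made for illustration.

```python
import hashlib

# Digest names accepted by --hash-by, mapped to hashlib constructors.
_ALGORITHMS = {"MD5": hashlib.md5, "SHA-1": hashlib.sha1, "SHA-256": hashlib.sha256}

# Leading digest bytes kept for each --hash-size value (None means no truncation).
_SIZES = {"int": 4, "long": 8, "default": None}


def hash_key(base_name: str, hash_by: str = "SHA-1", hash_size: str = "long"):
    """Hash a base name; return an unsigned int when truncated, hex text otherwise."""
    # Assumption: the base name is encoded as UTF-8 before hashing.
    digest = _ALGORITHMS[hash_by](base_name.encode("utf-8")).digest()
    n_bytes = _SIZES[hash_size]
    if n_bytes is None:
        # "default": full digest (128, 160 or 256 bits) as hexadecimal text.
        return digest.hex()
    # Assumption: truncation keeps the leading bytes, read as an unsigned
    # big-endian integer (32 bits for "int", 64 bits for "long").
    return int.from_bytes(digest[:n_bytes], "big")
```

A call such as hash_key("doc1/root_elem/child_elem[3]") then gives a value below 2**64, and passing hash_size="default" returns the untruncated digest as hex text.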

Changes in xsd2pgschema 1.13.2 (release 2016-12-14)
Allowed additional retrieval of Sphinx attribute by using --sph-attr option, which helps to reduce size of Sphinx index. You can repeat --sph-attr options to select attributes.
Renamed --field option of xml2sphinxds to --sph-field, which allows selective indexing by adding field in Sphinx xmlpipe2 format; the other fields are excluded from the index. You can repeat --sph-field options to select fields.
xs:ID content is stored as Sphinx attribute by default, unlike 1.11.1 or earlier.
Non xs:ID contents are not stored as Sphinx attribute by default, which helps to reduce size of Sphinx index.
Fixed foreign key constraint for multiple columns and appended "NOT VALID" option by default.

Changes in xsd2pgschema 1.13.1 (release 2016-12-01)
NOTICE: We recommend all users avoid using 1.13.1 released on 2016-11-29.
Speed enhancement by reducing ancestor's node check for non-list holder node.
Added '--valid' to turn on XML Schema validation, though it has been turned on by default.
Fixed NullPointerException in case of invalid argument for xml2json and xsd2jsonschema.
Changed schema source for XML Schema validation from embedded schema location in XML document (/@xsi:schemaLocation) to specified path by --xsd option.
Fixed PostgreSQL DDL mapping error in declaration of foreign key because of remaining XPath's attribute notation: '@'.
Fixed PostgreSQL DDL mapping error in xs:simpleContent and xs:complexContent.

Changes in xsd2pgschema 1.13.0 (release 2016-11-21)
Added xsd2jsonschema main class to map XML Schema to JSON Schema for all object-, column- and relational-oriented JSON formats.
Retrieved global annotation (/xs:schema/xs:annotation).
Changed PostgreSQL mapping for xs:duration from INTERVAL to TEXT because JDBC driver does not support INTERVAL.
Added support for xs:restriction/xs:minLength|xs:maxLength|xs:maxInclusive|xs:maxExclusive|xs:minExclusive|xs:minInclusive|xs:totalDigits|xs:fractionDigits, respectively.
Extracted xs:restriction/xs:whiteSpace|xs:assertions|xs:explicitTimezone/@value, but not utilized for any mapping or migration.
More precise PostgreSQL DDL mapping which depends on xs:restriction above.
Fixed issue that --case-insensitive option was ignored in xml2json.
Fixed missing array in object-oriented JSON format.
Changed default JSON format from object- to column-oriented JSON format and introduced --obj-json option to select object-oriented JSON format.
Fixed false-positive conflict detection again.
Fixed ClassNotFoundException: org.w3c.dom.ElementTraversal.
Passed JSON Schema validation by revision of XML->JSON conversion and XML Schema->JSON Schema mapping.
Fixed PostgreSQL mapping of xs:simpleType/xs:list/@itemType to TEXT.
Fixed PostgreSQL mapping in case xs:simpleType/xs:union/@memberTypes does not consist of built-in data types.

Changes in xsd2pgschema 1.12.0 (release 2016-11-15)
Internal class name revisions.
Added source code annotations using Javadoc style.
Revised xs:hexBinary and xs:base64Binary checker.
Fixed xs:serial and xs:bigserial checker.
Fixed false-positive conflict detection.
NOTICE: We have filled specifications of XML Schema 1.1 by the following extensions.
CAVEAT: Several XML Schema components which might break strict DDL schema, such as wild cards (xs:any, xs:anyAttribute), are neither mapped nor migrated.
Added support for targetNamespace (xs:schema/@targetNamespace, xs:attribute/@targetNamespace, xs:element/@targetNamespace, etc.).
Added support for include (xs:include) or import (xs:import) namespaces.
Fixed issue that /xs:schema/xs:element/@abstract=true element can be selected as concrete root element.
Added support for attributeGroup definitions and default attributes (xs:attributeGroup, xs:schema/@defaultAttributes and xs:complexType/@defaultAttributesApply).
Added support for modelGroup definitions (xs:group).
Fixed issue that modelGroup definitions are not properly extended.
Added support for prohibited attribute (xs:attribute/@use="prohibited").
Added support for element substitution (xs:element/@substitutionGroup).
Added support for element substitution block (xs:element/@block="substitution").
Added support for datatype xs:anyType that is mapped to xs:string in PostgreSQL DDL and ignored xs:alternative definitions.
Retrieved annotation from xs:annotation/xs:appinfo as well as xs:annotation/xs:documentation.
Revised datatype assignment strategy to search for the least common datatype when the datatype definition is in concatenated form (e.g. xs:union/@memberTypes="xs:long xs:int" is mapped to xs:long).
Updated Xerces from 2.7.0 to 2.11.0.
Fixed enumerator name truncation in PostgreSQL DDL.
Removed --json-root-prefix option.
Removed xs:string type from Sphinx attribute except for enumeration.

Changes in xsd2pgschema 1.11.1 (release 2016-11-02)
Added ordinary object-oriented XML->JSON conversion (default) and column-oriented JSON conversion (--col-json).
Added --json-indent-spaces [0-4], --json-key-value-spaces [0-1], --json-no-linefeed and --json-compact (equivalent to setting --json-indent-spaces 0 --json-key-value-spaces 0 --json-no-linefeed) options in xml2json.
Added --json-root-prefix option in xml2json (default="data_").
Added --xml-file-prefix-digest option in xml2json (default="").
Changed option name from --xml-file-digest to --xml-file-ext-digest.
Added --discard-json-doc-key option in xml2json (default="").
Fixed --array-json option issue.

Changes in xsd2pgschema 1.11.0 (release 2016-11-01)
Added xml2json main class to convert XML document to ordinary JSON or relational-oriented JSON (PDBx/mmJSON or BMRB/JSON style) document (--rel-json option is required).
NOTICE: Column-oriented XML->JSON conversion (--col-json) has not been implemented yet.
Added --array-json option in xml2json to convert XML content to JSON array uniformly even if the content consists of single data, except for root node.
Fixed escaping string in JSON (RFC4627).
Revised xs:decimal checker using BigDecimal, instead of Double.
Revised PostgreSQL reserved words extracted from current 9.6.
Revised example script.

Changes in xsd2pgschema 1.10.5 (release 2016-06-15)
Added --filt-in option in xml2pgcsv, xml2pgsql, xml2luceneidx and xml2sphinxds for replication of subset database.
Fixed bugs in processing --filt-out option.
Performance improvement when --field option is set in xml2sphinxds.

Changes in xsd2pgschema 1.10.4 (release 2016-04-11)
NOTICE: Java 8 is now required.
Added --field-annotation option in xsd2pgschema to retrieve field annotations in PostgreSQL DDL.
Fixed missing data of hub table, which consists of primary key and nested keys only.
Fixed duplication of data in case of list holder in list holder.
Updated to support Apache Lucene 6.0.0 or later.
Changed system requirement because the latest Apache Lucene runs on Java 8 or greater.

Changes in xsd2pgschema 1.10.3 (release 2016-04-04)
Fixed true-negative conflict detection raised by 1.10.2.
Fixed DDL conversion of xs:fixed in case of conflict.
Fixed missing data in case of conflict.

Changes in xsd2pgschema 1.10.2 (release 2016-03-14)
Fixed false-positive conflict detection.

Changes in xsd2pgschema 1.10.1 (release 2016-03-08)
Fixed missing data of administrative xs:simpleType content.
Added enumeration checker for simple content.
Added duplication checker for enumerator.
Added xs:pattern checker.
Added xs:fixed checker.

Changes in xsd2pgschema 1.10.0 (release 2016-03-04)
Internal class name revisions.
Added value checker for simple content.
Fixed missing data when root element is defined as simple element which does not use built-in data type.
Fixed redundant data replication which occurred only in 1.9.4.
Passed replication tests of BMRB/XML, PDBML, UniProtKB, BLAST output, IntAct and EMDB header.

Changes in xsd2pgschema 1.9.4 (release 2016-02-26)
NOTICE: We recommend all users avoid using 1.9.4 because of redundant data replication issue, fixed in 1.10.0.
Revised PostgreSQL reserved words extracted from current 9.5.
Dropped "document_url" field formerly used as resource URL of documents from both Lucene and Sphinx index.
Allowed XML documents with default namespace.
Added test with EMDB header (ftp://ftp.ebi.ac.uk/pub/databases/emdb/doc/XML-schemas/Header-schema/current/emdb.xsd).

Changes in xsd2pgschema 1.9.3 (release 2016-02-18)
Added support for built-in derived types (xs:NOTATION, xs:NMTOKEN, xs:NMTOKENS, xs:ID, xs:IDREF, xs:IDREFS, xs:ENTITY and xs:ENTITIES).
Added polite checker for namespace prefix that stands for http://www.w3.org/2001/XMLSchema.
Fixed replication problem of all xs:element which occurred only in 1.9.2.

Changes in xsd2pgschema 1.9.2 (release 2016-02-15)
NOTICE: We recommend all users avoid using 1.9.2, whose critical bug was fixed in 1.9.3.
Allowed non-primitive type assignment for xs:attribute.
Added --ddl option to redirect XSD output in xsd2pgschema.

Changes in xsd2pgschema 1.9.1 (release 2016-02-04)
Fixed bugs in processing of --filt-out and --fill-this options.
Added field selector (--field option) in xml2sphinxds for generating subset dictionary utilized for autosuggestion.
Added dicmerge4sphinx main class to generate trigram database for autosuggestion from Sphinx dictionary.
Fixed a bug in timestamp conversion of Sphinx xmlpipe2 format.
Changed attribute nomenclature of Sphinx data schema from "table_name"."column_name" to "table_name"__"column_name" to prevent syntax collision between SphinxQL and MySQL ("table_name"."column_name").
Added --ds-name option in xml2sphinxds and dsmerge4sphinx.
Added support for generating Sphinx data source configuration file (data_source.conf).
Fixed a bug in xs:hexBinary and xs:base64Binary conversion of Sphinx xmlpipe2 format.

Changes in xsd2pgschema 1.9.0 (release 2015-12-21)
Added support for full-text search using Sphinx.
xml2sphinxds main class exports Sphinx xmlpipe2 data source file (data_source.xml).
Option names are basically compatible with the xml2luceneidx main class.
Added dsmerge4sphinx main class to merge separated data source files (data_source.xml) generated by the xml2sphinxds class.
Changed field names "url" -> "document_url" and "id" -> "document_id" used as pointer to data source in Lucene index.
Added --min-word-len option in xml2luceneidx and xml2sphinxds, which allows filtering out short words.
Added --delimiter option in xml2sphinxds; either tab (default) or comma code is used as delimiter code in non-mva attribute (e.g. string).

Changes in xsd2pgschema 1.8.0 (release 2015-09-30)
Added --case-insensitive option in csv2pgsql, xml2pgcsv, xml2pgsql and xsd2pgschema, which allows lowercase table names and column names.
NOTICE: You should retain --case-insensitive option through replication processes.
Retrieved documentation source annotation in PostgreSQL DDL quoting xs:annotation/xs:documentation/@source.

Changes in xsd2pgschema 1.7.4 (release 2015-09-30)
Fixed NullPointerException when "type" attribute is null.

Changes in xsd2pgschema 1.7.3 (release 2015-08-19)
Added --dbuser option for authentication in csv2pgsql, xml2pgcsv and xml2pgsql.
Revised example script.

Changes in xsd2pgschema 1.7.2 (release 2015-08-03)
NOTICE: We recommend all users update to 1.7.2 or later.
Fixed invalid nested-key assignment in list holder relations, which caused missing data.

Changes in xsd2pgschema 1.7.1 (release 2015-07-10)
Fixed enumeration constraint overriding under merged relations because of name collision.
Revised example script.

Changes in xsd2pgschema 1.7.0 (release 2015-06-19)
Fixed insufficient XML exploring under merged relations because of name collision.
Fixed schema errors because of name collision with PostgreSQL reserved words.
Fixed invalid FOREIGN KEY constraint assignment for not unique case.
Further fixed invalid PRIMARY KEY assignment for not unique case.
Changed message digest algorithm from MD5 to SHA-1.
Fixed potential issues on both replication and indexing, which resolved missing data and useless keys.
Re-tested with BMRB/XML, PDBML, UniProtKB, BLAST output and IntAct.
Revised example scripts in which newer IntAct XML Schema (MIF254.xsd) was used.
Added --no-doc-key option in xsd2pgschema, xml2pgcsv, xml2pgdb and csv2pgdb (revived from 1.3.0).
Fixed translation error on xs:complexContent.
Added support for default property for all xs:attribute and xs:element.
Fixed xs:boolean value check issue.
Fixed enumerator extraction error.

Changes in xsd2pgschema 1.6.0 (release 2015-05-26)
Fixed invalid PRIMARY KEY assignment for not unique case.
Fixed unexpected content in "document_id" column in case of "xml.gz" file extension.
Tested with BLAST output (blast.xsd refer to https://github.com/lindenb/xsd-sandbox).

Changes in xsd2pgschema 1.5.0 (release 2015-04-28)
Added luceneidx2dic, luceneidx2infix and luceneidx2ftxt main classes to prepare dictionary from index for autosuggestion using AnalyzingSuggester, AnalyzingInfixSuggester and FreeTextSuggester, respectively.
Revised option names (filter -> filt-out, post-fill -> fill-this) and syntax ("table_name":"column_name":... -> "table_name"."column_name":...).
Added field selector in luceneidx2dic and luceneidx2ftxt.
Added "id" field as pointer to data source in Lucene index.

Changes in xsd2pgschema 1.4.0 (release 2015-03-12)
Added enumeration check while replication and indexing.
Fixed schema conflicts in merged relations because of name collision.
Re-tested with UniProtKB XML database (uniprot.xsd).
Added "document_id" column as pointer to data source by default.
Enhanced speed by bulk uploading.
Turned off load-dtd-grammar and load-external-dtd features.

Changes in xsd2pgschema 1.3.5 (release 2015-02-23)
Updated to support Apache Lucene 5.0.0 or later.

Changes in xsd2pgschema 1.3.0 (release 2015-02-20)
Fixed translation error on xs:simpleContent.
Anonymous xs:simpleContent has been mapped on "content" field of a relation.
Added enumeration support for xs:attribute.
Stored primary-key, foreign-key and nested-key values at Lucene indexing, if relational model was reconstructed.

Changes in xsd2pgschema 1.2.0 (release 2015-01-20)
Re-packaging (xsd2pgschema-*.tgz).
Apache License V2.0 branding.
Revised error messages.
Enabled StoreTermVectors at Apache Lucene indexing by default.
Added VecTextField.java and VecStringField.java files to support the TermVectors.
Revised example scripts.

Changes in xsd2pgschema 1.1.0 (release 2014-12-09)
Released @ sourceforge.net.
Added support for full-text indexing using Apache Lucene.
Added minimum jar file (xsd2pgschema-min.jar).
Tested with UniProtKB XML database (uniprot.xsd).

Changes in xsd2pgschema 1.0.0 (internal release)
Released only for internal use.
Tested with IntAct (MIF25.xsd), PDBML (pdbx-v40.xsd), BMRB/XML (mmcif_nmr-star.xsd) databases.