Smart Submission Dataset Viewer - Browse Files at SourceForge.net


Name	Modified	Size	Downloads / Week
Documentation	2024-12-16		0
Test and demo files	2024-03-04		0
README.txt	2025-11-03	74.4 kB	0
Smart_Submission_Dataset_Viewer_2025-11-03.zip	2025-11-03	450.1 MB	4
Smart_Submission_Dataset_Viewer_2025-06-06.zip	2025-06-06	236.7 MB	0
Smart_Submission_Dataset_Viewer_CDISC_CORE.pdf	2025-06-06	653.0 kB	0
Smart_Submission_Dataset_Viewer_User_Manual.pdf	2025-06-06	3.7 MB	0
Smart_Submission_Dataset_Viewer_Installation_Instructions.pdf	2025-01-09	194.1 kB	0
Smart_Submission_Dataset_Viewer_2024-12-29.zip	2024-12-31	259.6 MB	0
License.txt	2024-12-06	2.7 kB	0
properties.dat	2024-12-06	197 Bytes	0
CDISC_SDTMIG_validation_rules_v1.1_2020.xml	2020-08-23	1.5 MB	0
PMDA_SDTM_validation_rules_2-0_2019.xml	2020-08-23	1.5 MB	0
FDA_SDTM_validation_rules_1-3_2018.xml	2020-08-23	1.8 MB	0
FDA_SEND_validation_rules_2019.xml	2020-08-23	1.1 MB	0
Totals: 15 Items		956.9 MB	4

This version, November 3, 2025

- Implementation of CDISC CORE (CDISC Open Rules Engine) v.0.13.0

- Extended the CORE Engine with over 2,600 rules generated from the "SDTM Dataset Specializations"

(see e.g. https://www.linkedin.com/groups/12122426/). These rules can be regarded as "data quality rules" and go back to the "Biomedical Concepts"

- "local rules" can now be used in the CORE engine

- better implementation of the progress bar for the CORE engine, now showing the real progress

- many other small improvements

This version, June 6, 2025

- Full implementation of CDISC CORE (CDISC Open Rules Engine) v.0.10.0 for CDISC Dataset-JSON-1.1 files.

- Many small further improvements

This version, December 29, 2024

- Fixed an issue when reading "properties.dat" when this file has more than 10 lines

This version, December 16, 2024

- Improved reading SUPPQUAL and recombining datasets in NDJSON format

- Fixed an issue in XPT2DatasetJSON for XPT files with "special" JSON characters
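
The XPT2DatasetJSON fix above concerns values that contain characters with a special meaning in JSON. As an illustration only (this is not the viewer's actual code, and the class and method names are hypothetical), a minimal sketch of escaping a raw XPT string value according to RFC 8259 before writing it into a Dataset-JSON file could look like this:

```java
final class JsonEscapeSketch {

    /** Escapes a raw value so it can be placed between double quotes in a JSON document. */
    static String escape(String raw) {
        StringBuilder sb = new StringBuilder(raw.length() + 8);
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            switch (c) {
                case '"':  sb.append("\\\""); break;   // double quote
                case '\\': sb.append("\\\\"); break;   // backslash
                case '\n': sb.append("\\n");  break;
                case '\r': sb.append("\\r");  break;
                case '\t': sb.append("\\t");  break;
                case '\b': sb.append("\\b");  break;
                case '\f': sb.append("\\f");  break;
                default:
                    if (c < 0x20) {
                        // remaining control characters must be written as \\uXXXX
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // hypothetical cell value containing quotes and a tab
        System.out.println("\"" + escape("Dose \"high\"\tper protocol") + "\"");
    }
}
```

In practice a JSON library would normally do this escaping; the sketch only shows which characters are affected.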

This version, December 6, 2024

- Full implementation of CDISC Dataset-JSON 1.1 and Dataset-NDJSON

- Generation of Dataset-JSON 1.1 from XPT files added

- Drag-and-drop of files added

- Ability to skip the first dialog asking whether XPT files must be transformed to Dataset-JSON (see file "properties.dat")

- Support for Dataset-XML has been discontinued

- CDISC CORE validation will be added again as soon as CORE has been adapted for Dataset-JSON v.1.1

- Documentation with the adaptions and Dataset-JSON 1.1 (and Dataset-NDJSON) files will be added in the following days

This version, January 17, 2024

Experimental version:

- Dataset-JSON is now the default format

- Option to use metadata from either the define.xml (recommended, more features, but slower) or from the Dataset-JSON file itself (faster)

- CORE validation for Dataset-JSON currently works ONLY when the metadata are taken from the define.xml

- Support for embedded images and other binary data in the submission datasets for Dataset-JSON format (requires define.xml)

- Support for embedded HL7-FHIR resources (as SDTM datapoint) and visualization of them.

- Support for links to external (binary) files such as movies (EMR, etc)

For short demos, see: https://www.linkedin.com/groups/56393/

This version, December 17, 2023

- Further improvements in the CDISC CORE validation features, especially regarding reporting

- Added features that allow starting the Smart Submission Dataset Viewer from within other applications (spawn process)

(see separate file Startup_parameter_Smart_Submission_Dataset_Viewer.txt).

- Possibility to use "Open With" in Windows on a submission file. An example file for this is provided by the file "Smart_Submission_Dataset_Viewer_open_with.bat"

So, after selecting a Dataset-JSON file in Windows, one can use "Open with", and then browse to the application

"Smart_Submission_Dataset_Viewer_open_with.bat". The documentation for this can be found in the file

"Opening_Dataset-JSON_files_from_the_Desktop.pdf".

- Many small improvements to increase efficiency.

This version, November 25, 2023

- Newest CDISC CORE validation implementation, including updated implementation for Dataset-JSON, from "main" branch 2023-11-15

- XQuery validation (ORCS: Open Rules for CDISC Standards) has now been removed completely.

- Many small improvements to the CORE validation features.

This version, June 21, 2023

- Minor update

- Conversion of XPT to Dataset-JSON now according to the newest spec (https://www.cdisc.org/dataset-json)

- Dataset-JSON is now the default (instead of Dataset-XML) when starting the viewer

- Define-XML v.2.1 is now the default (instead of v.2.0) when starting the viewer

- Now includes an early, experimental version of CDISC CORE (for validation of submission files)

This version, February 13, 2023

- Corrected bugs for recombination of the supplemental qualifiers in SUPP-- datasets when the record number needs to be shown in the first column.

- Very first implementation of CDISC CORE validation (see https://www.cdisc.org/core), currently only for Dataset-JSON SDTM files.

This version, October 30, 2022

- Added feature: when a define.xml is loaded (using the "browse" button), its version is checked, and the corresponding radio button is selected when appropriate

- For the feature "View define.xml in browser using the stylesheet", the system checks whether there is a stylesheet reference in the define.xml.

If not, the system asks to provide the stylesheet location. If there is a stylesheet reference in the define.xml,

the user can now still choose to use another stylesheet than the one referenced.

This version, October 23, 2022

- minor update: RELREC lookup (using CTRL-R) now also working for one-one and one-many relations that use --LNKGRP.

This version, October 20, 2022

- added feature to transform SAS Transport (XPT) files to new CDISC Dataset-JSON (https://wiki.cdisc.org/display/ODM2/Dataset-JSON) format "on the fly".

- fixed a bug that when generating Dataset-XML from XPT files, not all the information from the define.xml

was copied over to the updated define.xml (e.g. processing instructions and XML comments).

TODO: Dataset-JSON files do not contain the correct number of "records" yet ("0" is set).

The number of records as provided in the JSON file is however not used by the Smart Submission Dataset Viewer.

TODO: fix problem that RELREC "lookup" does not work when the linking variable is not --SEQ.

This version, September 27, 2022

- fixed a bug that --DY values are not shown as tooltip (option) when DM is not the first dataset in the view, i.e. when also Trial Design datasets are loaded.

This version, September 26, 2022

- fixed a bug that when using Dataset-JSON, "Bring SUPPQUAL data back to original dataset" did not function correctly.

TODO: implement this feature also for the case of CSV files.

This version, April 18, 2022

- small bug fixes for use of the viewer with CSV files

- further support for Dataset-JSON

- Viewer can now display record numbers (as first column) when requested (option)

This version, March 5, 2022

- Re-added validation using "Open Rules for CDISC Standards", but limited to Define-XML 2.0.

Implementation of CDISC CORE (https://www.cdisc.org/core) will probably get the higher priority.

- Dataset-JSON implementation. See https://wiki.cdisc.org/display/ODM2/Dataset-JSON for the draft specification.

Reason is that some people at the FDA have shown interest.

Users who would like to start testing this new format can always obtain sample files. Just mail us at Jozef.Aerts@XML4Pharma.com.

- Corrected a bug that after the generation of Dataset-XML files from XPT files, in some circumstances, the (updated) define.xml is not correctly loaded.

This version, January 23, 2022

- Minor fix: when a variable is found in the Dataset-XML file, and it is not defined in the corresponding Define-XML file,

the software no longer crashes, but provides a warning.

Data points for the undefined variable are then skipped.

- Tooltips on the tabs of the datasets with extended metadata information about the dataset.

This version, October 9, 2021

- New "smart" feature: show value of SUBJID in all other subject-related tables as tooltip on USUBJID

- New feature in menu "Tools - Column": move selected single column from GUI dialog

This version, September 4, 2021

- Fixed a bug that additional properties were not shown on USUBJID for Dataset-XML (worked well for JSON and CSV)

This version, August 5, 2021

- Support for CSV files has been added (still a bit experimental, as there are so many variations of "CSV").

A test set of CSV files from the Metadata Submission Guide 2.0, using Define-XML 2.1, can be found in the folder TestFiles/MetaData_Submission_Guide_2_0_2021_SDTM_CSV

- Support for the upcoming CDISC Dataset-JSON added, but the feature is still disabled.

Those who would like to try Dataset-JSON format out, please contact Jozef Aerts (Jozef.Aerts@XML4Pharma.com) for test files.

These persons should also pick up the source code, and comment out lines 638-641 in source code file GUI.java and then recompile.

This version, April 18, 2021

- Fixed a bug for define-xml 2.1 for dataset definitions that have def:HasNoData=Yes

- added selection for "Show LOINC details (LOINC website)" when right-clicking a cell with a LOINC code.

This version, December 13, 2020

- Better handling of submission files that do not follow the usual SDTM/SEND naming convention, for example "my_dm.xml" instead of "dm.xml"

- "CDISC Library Discrepancy Report" now also displays the IG version and CDISC Controlled Terminology used.

TODO: Define-XML case that IG version and/or CDISC-CT version differs between datasets

- Filtering and sorting: feature added to filter on record number (and ranges). E.g. "1 4 26-30 77". Regulatory request

- TODO: implement ".fdax" format
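
The record-number filter mentioned above accepts single numbers and ranges, e.g. "1 4 26-30 77". A minimal sketch of how such an expression could be expanded into a set of record numbers is shown here. It is an illustration under the assumption that tokens are separated by whitespace and ranges are inclusive; the class and method names are hypothetical and not taken from the viewer's source code.

```java
import java.util.SortedSet;
import java.util.TreeSet;

final class RecordNumberFilterSketch {

    /** Expands an expression such as "1 4 26-30 77" into the selected record numbers. */
    static SortedSet<Integer> parse(String expression) {
        SortedSet<Integer> selected = new TreeSet<>();
        for (String token : expression.trim().split("\\s+")) {
            if (token.isEmpty()) {
                continue; // empty input
            }
            int dash = token.indexOf('-');
            if (dash < 0) {
                selected.add(Integer.parseInt(token));           // single record number
            } else {
                int from = Integer.parseInt(token.substring(0, dash));
                int to = Integer.parseInt(token.substring(dash + 1));
                if (from > to) {
                    throw new IllegalArgumentException("Descending range: " + token);
                }
                for (int n = from; n <= to; n++) {               // inclusive range
                    selected.add(n);
                }
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        System.out.println(parse("1 4 26-30 77"));
    }
}
```

A table row would then be kept when its record number is contained in the returned set.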

This version, November 16, 2020

- CDISC Library authentication mechanism has changed, now requires an API key. Software and XQueries were adapted for this.

The API key is to be provided as parameter value "cdisclibraryapikey" in the "properties.dat" file (placeholder present)

- For CDISC Library access, catching the cases that there is no internet connection, or that the API key is invalid or not present.

This version, August 26, 2020

- Additional support for Define-XML version 2.1

- Updated "Open Rules for CDISC Standards" rule sets with support for Define-XML 2.1 and increased usage of the CDISC Library (will be further extended)

- "Related Records" can now also be found starting from a row in a single (non-RELREC) dataset

- When a "Related Record" is found, a summary table is displayed in a separate window

- Improved highlighting of "Related Records" in all the tables

This version, July 18, 2020

- Major update: support for define.xml 2.1 in viewer as well as for XPT to Dataset-XML conversion

- New features implementing the "CDISC Library API"

- Bug fix for "Bring SUPPQUAL data back to original dataset" when dataset file names are lowercase

- Right-click on cell shows menu for "copy cell contents" and other functionalities when applicable

- "Open Rules for CDISC Standards" extended with rules for "FDA SEND validation rules 2019", "CDISC SDTM validation rules v.1.1 May 2020"

- Moved to BaseX as library for executing XQuery-based "Open Rules for CDISC standards", allowing usage of the "CDISC Library" in XQuery

- Many other small improvements

Known issues:

- In some cases, "Bring SUPPQUAL data back to original dataset" does not work correctly

when Dataset-XML files are zipped.

- "Open Rules for CDISC Standards" does not work yet when the define.xml version is 2.1.

This version, March 26, 2020

- Minor update: extended support for working with LOINC codes. See documentation: Smart_Submission_Dataset_Viewer_LOINC_Support.pdf

This version, February 12, 2020

- Minor update: when the user selects to first convert XPT files to Dataset-XML, the user can select where the XPT files reside.

The default is to retrieve the location of the XPT files from the define.xml itself.

All the generated Dataset-XML files + the automatically adapted define.xml file will then be stored into a single directory.

This version, January 11, 2020

- minor update, many small improvements, especially with regard to CDISC Library implementation

- bug fix: when "supplemental qualifiers" are brought back to the parent domain, in the latter, the tooltip was not correctly displayed for these variables.

- As many browsers do not provide direct XSLT transformation anymore, when define.xml must be displayed in the browser, it is first transformed to HTML,

which is then passed to the browser

- Using Saxon-HE 9-9-1 now for XML parsing instead of Saxon HE 9-4. Adaptions in code made for this

This version, November 5, 2019

- minor update, many small improvements, especially further maximizing user-friendliness

- added new "CDISC Library API" features, please see the file "Smart_Submission_Dataset_Viewer_CDISC_Library.pdf" under "Documentation" for details

This version, September 22, 2019

- minor update, adding more features and improving user-friendliness for the validation part using the "Open Rules for CDISC Standards" (http://xml4pharmaserver.com/OpenRulesForCDISCStandards/index.html).

For details, please see the file "Release_Notes_Version_2019-09-22.pdf"

This version, August 5, 2019

- Minor update, two new features:

* Possibility to execute "Open Rules for CDISC Standards" validation without visualization of the dataset tables: see documentation, file Rules_Validation_without_visualization.pdf

* An algorithm was developed to detect whether a (lab) test needs units for the value, based on the LOINC code and the associated UCUM unit (from LOINC).

This algorithm works much better than the heavily disputed P21 implementation of the rule FDAB012/SD0026.

See documentation: file Smart_Submission_Dataset_Viewer_Units_for_Tests.pdf

This version, July 26, 2019

- Minor update, adding CDISC Library API RESTful web services feature for displaying CDISC variable information from the CDISC Library (requires authentication key).

See the documentation for more details.

- Moved from Jersey 1.18 to Jersey 2.27 libraries for using RESTful web services

This version, December 15, 2018

- Major update, rebranding to "Smart Submission Dataset Viewer"

- At startup, user can choose between using an existing SDTM, SEND or ADaM submission in CDISC standards XML (CDISC define.xml + Dataset-XML standard)

OR to start from a submission in the outdated XPT format (SAS Transport 5) and to convert that into the modern Dataset-XML format

- Support for embedded FHIR-EHR records - see Smart_Dataset-XML_FHIR_EHR_support.pdf

- Optional automated calculation and highlighting of "last non-empty observation before first exposure" record, showing that the variable LOBXFL is essentially unnecessary.

- many minor improvements

This version, February 17, 2018

- minor update, fixing some bugs for case of unexpected errors in the datasets themselves.

- Improved memory management

- Messages from XQuery validation (Open Rules for CDISC Standards) can now also be exported as CSV, e.g. for use in spreadsheets.

This version, October 30, 2017

- minor update: choice between simple RESTful web service for LOINC codes and extended RESTful web service (the latter providing more detailed information)

This version, June 18, 2017

- Fixed a bug that an exception was added to the log file (without any further harm done) when valuelist ItemDefs in the define.xml do not have a "label" in Description/TranslatedText

- some of the menu options are disabled in case of ADaM, as these options do not make sense in the context of ADaM

This version, April 15, 2017

- Fixed feature "Derive and highlight last observation records before first exposure (baseline flag)":

the software will now also highlight records with a darker color (and a warning tooltip) for which the (last) observation is on the same day as the first exposure on treatment,

where one of both dates does not contain a time part, so that it is not 100% sure that the last observation was indeed performed before the first exposure (data quality issue).

Unfortunately, the latter is often the case in clinical studies, as often, only the date is captured, and those working with the data must rely on what is written in the protocol,

without any means to actually check whether the protocol was strictly followed.

- Minor bug fixes, e.g. for unexpected null values

- AGAIN: if you want to have the latest version of the "Open Rules for CDISC Standards" (XQuery rules), please download them from: http://xml4pharmaserver.com/RulesXQuery/index.html

A manual how to use the "Open Rules for CDISC Standards" is also included in this distribution, but is also available from:

http://xml4pharmaserver.com/RulesXQuery/Running_OpenRules_within_SmartDatasetXML.pdf

- TODO: implement optional RESTful web service to automatically update rules for which an improved version has become available.

This version, March 13, 2017

IMPORTANT: This version contains all FDA, PMDA and CDISC rules implemented so far for use with Dataset-XML as XQuery rules.

These implementations are under continuous development. For the latest version of these rule implementations, please visit: http://xml4pharmaserver.com/RulesXQuery/index.html (updated almost daily).

If you want to use the latest version of these rules in the Smart Dataset-XML Viewer, just download the latest XML files from the aforementioned website, and copy them into the directory "Validation_Rules_XQuery".

In future, we will add a functionality that the user can request to update the set of rules automatically, using a RESTful web service, as explained at: http://xml4pharmaserver.com/WebServices/XQueryRules_webservices.html.

New features and bug fixes:

- fixed a bug for validation of --TESTCD and --CAT values

- new feature: Derive and highlight last observation records before first exposure to treatment (baseline flag)

This feature demonstrates why the new --LOBXFL variable in SDTM 1.5 is completely unnecessary (see: http://cdiscguru.blogspot.com/2016/08/why-lobxfl-should-not-be-in-sdtm.html)

- new feature: visualization of any selected validation rule in XQuery with color coding (in window "Validation Rule Selection")

Very many small improvements

This version, December 19, 2016

Further improved filtering capabilities (see document Smart_Dataset-XML_Filtering_Sorting.pdf):

- "stack" of filters

- undo last applied filter (from stack)

- each filter obtains an automatically generated description ("WHERE"-like statement)

- apply single filter to all tables at once

This version, December 3, 2016

Many new filters, like on subject demographics properties (ACTARMCD,AGE,SEX,RFSTDTC,RFXSTDTC,RFENDTC,RFXSTDTC,RFISTDTC) applicable to any table, and very well described in the document "Smart Dataset-XML Viewer: Sorting and Filtering" (file Smart_Dataset-XML_Filtering_Sorting.pdf).

"Undoing" filters, removing all filters.

Bug fixes for minor issues. New source code available.

This version, 15 November 2016

- minor update: the November 6 distribution was lacking the Saxon libraries, causing some (minor) features not to function. The Saxon libraries have now been added.

This version, 6 November 2016

- Many new sorting and filtering features, described in the document "Smart Dataset-XML Viewer: Sorting and Filtering" (file Smart_Dataset-XML_Filtering_Sorting.pdf)

This version, 1 November 2016

- When the user clicks "View Define.xml", a choice becomes available whether to see the define.xml in the browser using the stylesheet, or whether to see the define.xml source in a new interactive dialog. The latter allows reviewers to inspect what is really present in the define.xml, which is important, as the stylesheet filters the data (stylesheets can even be used to manipulate the data!).

- new "smart" feature: show trial arm name (ARM, ACTARM) as tooltip on arm code (ARMCD, ACTARMCD). This shows again that ARM and ACTARM are only necessary in TA, and not in any other datasets

- new "smart" feature (first of a new series): Display ACTARMCD on each USUBJID in every dataset. This feature is especially important for FDA and PMDA reviewers, as it allows them to see immediately to which arm the subject for the current data point belongs, without needing to switch to the corresponding DM record, which is still very easy by using "CTRL-D" anyway.

We will very soon add some more of these features, please let us know which ones you would like to see.

- TODO: in each subjects-related dataset, allow to filter on the actual arm (ACTARMCD)

This version, 8 September 2016

- minor update: records with last observation before exposure, but on same date as first exposure, and without time part get another color and a warning tooltip.

See: http://cdiscguru.blogspot.com/2016/09/lobxfl-follow-up.html

This version, 31 August 2016

- New feature: "derive and highlight last observation records before first exposure"

- New tutorial: "Deriving and displaying "Last Observation before First Exposure Records" with the Smart Dataset-XML Viewer"

This version, 22 April 2016

- "Bring SUPPQUAL back" checkbox disabled when Standard=ADaM

- XQuery rule consolidation: when more than 1 set (e.g. "FDA","CDISC","PMDA", "MyCompany") available, rules from different sets can be combined

- New set of CDISC-ADaM validation rules (far from complete) now available, these will be updated regularly and published separately. Other sets (e.g. PMDA) will be developed, we do however need more 'workforce'

This version, 10 April 2016

- new webservice: check whether a standardized value (--STRESC, CVFARS) is an allowed value for the given test code (--TESTCD). Currently only for EG, RS and CV.

- FDA SDTM rules validation using XQuery - see the separate document "Smart_Dataset-XML_XQuery_validation.pdf".

This version, 19 March 2016

- several small improvements and new features, such as generation of "test name" (--TEST) from "test code" (--TESTCD) from the CodeLists in the define.xml

TODO: case that --TESTCD and --TEST are given as "EnumeratedItem" and connected by the NCI code.

- MedLinePlus LOINC RESTful web service change from http to https

This version, 10 December 2015

- some minor changes in the GUI (to make it even more user-friendly)

- new feature: validate validity of --DY values (under "Options - Validation - Check --DY values")

- In future: even more web services will be added

This version, 23 August 2015

- new feature: automated calculation of "Study Element" (ETCD), "Study Element Name" (ELEMENT) and EPOCH, with display as tooltip on all --DTC cells.

Can be switched on by clicking the button "Options", then selecting the tab "Smart Features" and checking the checkbox "Show Study Element and Epoch on --DTC"

This version, 17 May 2015

- several minor bug fixes, especially concerning web services

- note that not all menus (like "view - show annotated CRF") have been implemented yet

This version, 25 Jan 2015

New features:

- in the table view, a new functionality "Find Record by Record Number" has been added to the "Search" menu. It allows selecting the original record by its record number (ItemGroupDataSeq in Dataset-XML),

even when the table has been resorted

- two new web services have been added:

a) check whether the unit (xxORRESU / original result unit, xxSTRESU / standardized result unit) is a correct unit for the given xxTESTCD (test code).

This feature is currently limited to VS datasets

b) check whether the VSPOS value (vital signs position) is a correct VSPOS for the given VSTESTCD

As soon as CDISC publishes more similar lists, a web service will be developed for it.

This version, 19 Oct 2014

Bug fix: in the (unusual?) case that the ItemData within an ItemGroupData do not come in the same order as the corresponding ItemRef in the ItemGroupDef, the display of the data was incorrect.

This has now been corrected using a new loading algorithm.

This version, 14 Oct 2014

Implementation of "Bring SUPPQUAL data back to original data set".

Furthermore, minor improvements + bug fix for finding parent records of RELREC records (did not function correctly). New tutorial.
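
"Bring SUPPQUAL data back to original data set" means that each supplemental qualifier (QNAM/QVAL) becomes an extra column on its parent record, located through USUBJID plus the IDVAR/IDVARVAL pair. The following is a minimal sketch of that idea, not the viewer's implementation. It assumes that records are held as simple name/value maps, that IDVARVAL is compared as text, and that an empty IDVAR (as is usual for DM) matches on USUBJID alone. All class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

final class SuppqualMergeSketch {

    /** Returns copies of the parent records with matching QNAM/QVAL pairs added as columns. */
    static List<Map<String, String>> merge(List<Map<String, String>> parents,
                                           List<Map<String, String>> suppRecords) {
        List<Map<String, String>> result = new ArrayList<>();
        for (Map<String, String> parent : parents) {
            result.add(new LinkedHashMap<>(parent)); // leave the input untouched
        }
        for (Map<String, String> supp : suppRecords) {
            String idvar = supp.getOrDefault("IDVAR", "");
            String idvarval = supp.getOrDefault("IDVARVAL", "");
            for (Map<String, String> row : result) {
                boolean sameSubject = Objects.equals(supp.get("USUBJID"), row.get("USUBJID"));
                // empty IDVAR: the qualifier applies to the subject-level record
                boolean sameKey = idvar.isEmpty() || idvarval.equals(row.get(idvar));
                if (sameSubject && sameKey) {
                    row.put(supp.get("QNAM"), supp.get("QVAL"));
                }
            }
        }
        return result;
    }

    /** Small helper to build a record from alternating names and values. */
    static Map<String, String> row(String... keyValuePairs) {
        Map<String, String> m = new LinkedHashMap<>();
        for (int i = 0; i + 1 < keyValuePairs.length; i += 2) {
            m.put(keyValuePairs[i], keyValuePairs[i + 1]);
        }
        return m;
    }

    public static void main(String[] args) {
        // made-up example data
        List<Map<String, String>> ae = Arrays.asList(
                row("USUBJID", "S1", "AESEQ", "1"),
                row("USUBJID", "S1", "AESEQ", "2"));
        List<Map<String, String>> suppae = Arrays.asList(
                row("USUBJID", "S1", "IDVAR", "AESEQ", "IDVARVAL", "2",
                    "QNAM", "AETRTEM", "QVAL", "Y"));
        System.out.println(merge(ae, suppae));
    }
}
```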

This version, 1 Oct 2014

Minor update of the "web services" version, "lookup" of test name for given test code will now also work even when no codelist is associated with the --TESTCD variable in the define.xml file (though it should by the SDTM spec).

This version, 27 Sept 2014

Special "web services" version, prototyping using a number of web services - see document "Smart_Dataset-XML_WebServices.pdf"

This version, 27 July 2014

Software suggests when filtering before loading is recommended depending on file size. Threshold file size for this suggestion can be set using the "Properties" button (default 20MB).

This version, 16 July 2014

small enhancement: choice between use of --TESTCD and --CAT (or no filtering at all) for filtering data BEFORE loading, based on associated codelists on --TESTCD and/or --CAT in the define.xml.

Will probably be extremely useful for very large data sets such as LB and QS.

This version, July 9, 2014

--TESTCD filtering possible BEFORE loading the datasets, based on associated CodeList in the define.xml.

New tutorial / manual

This version, June 11, 2014

ADaM numeric dates/datetimes/times can be displayed as ISO-8601. Internally, these remain numbers. This new feature has been well tested on dates and times, but not yet on datetimes. Also see document Display_ADaM_dates_as_ISO8601.pdf.
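
As an illustration of the conversion described above, the sketch below turns numeric dates, times and datetimes into ISO-8601 text for display. It assumes the SAS conventions (dates as days since 1960-01-01, times as seconds since midnight, datetimes as seconds since 1960-01-01T00:00:00) and whole-number values. The class and method names are hypothetical, and this is not the viewer's own code.

```java
import java.time.LocalDate;
import java.time.LocalTime;
import java.time.format.DateTimeFormatter;

final class SasNumericDateSketch {

    // Assumption: numeric values follow the SAS epoch of 1 January 1960
    private static final LocalDate SAS_EPOCH = LocalDate.of(1960, 1, 1);

    /** Days since 1960-01-01 to an ISO-8601 date. */
    static String dateToIso(long days) {
        return SAS_EPOCH.plusDays(days).toString();
    }

    /** Seconds since midnight (0..86399) to an ISO-8601 time. */
    static String timeToIso(long secondsSinceMidnight) {
        return LocalTime.ofSecondOfDay(secondsSinceMidnight)
                .format(DateTimeFormatter.ISO_LOCAL_TIME);
    }

    /** Seconds since 1960-01-01T00:00:00 to an ISO-8601 datetime. */
    static String dateTimeToIso(long seconds) {
        return SAS_EPOCH.atStartOfDay().plusSeconds(seconds)
                .format(DateTimeFormatter.ISO_LOCAL_DATE_TIME);
    }

    public static void main(String[] args) {
        System.out.println(dateToIso(21915));     // 2020-01-01
        System.out.println(timeToIso(3661));      // 01:01:01
        System.out.println(dateTimeToIso(86400)); // 1960-01-02T00:00:00
    }
}
```

Only the displayed text changes in such an approach; the stored value stays numeric, as stated above.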

This version: April 4, 2014

There was a bug in the display for non-European characters (especially Asian). This has been fixed. With many thanks to Dr. Chiba and his colleagues in Japan.

This version: April 3, 2014

- added support for non-US-ASCII characters

This version: April 2, 2014

- As the name of the standard has been changed from SDS-XML (Study-Data-Set-XML) into Dataset-XML, the name of the software and of the files has also been changed.

Please also note that although there are minor changes in the XML-Schema, old SDS-XML files will still be usable, though not all features of the software might work.

We will soon upload a new set of test / demo files that are compliant with the new XML-Schema.

This version: January 15, 2014

- possibility to start the program with parameters, allowing the viewer to be used in combination with other software (see tutorial)

- automated calculation of --DY values from --DTC (and RFSTDTC) as an option, and display as a tooltip on --DTC

- automated lookup of VISIT (name) from VISITNUM when the TV table is loaded (optional), with the visit name being displayed as a tooltip on VISITNUM

- order of the tables: after DM, the trial design tables are loaded before any other tables. After that, the CO and RELREC tables are loaded (when present)

- many smaller improvements, described in the updated tutorial

- new version of the tutorial, describing all current features of the software
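
The automated --DY calculation listed above follows the SDTM study-day convention: there is no day 0, the reference start date (RFSTDTC) itself is day 1, and the day before it is day -1. A minimal sketch of that rule is given below. It assumes both values carry at least a complete ISO-8601 date (partial dates yield no result); the class and method names are hypothetical and not taken from the viewer.

```java
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;
import java.util.OptionalInt;

final class StudyDaySketch {

    /** Derives the study day from a --DTC value and RFSTDTC; empty when a date is incomplete. */
    static OptionalInt studyDay(String dtc, String rfstdtc) {
        if (dtc == null || rfstdtc == null || dtc.length() < 10 || rfstdtc.length() < 10) {
            return OptionalInt.empty(); // partial or missing date: no study day
        }
        // only the date part (YYYY-MM-DD) is used, any time part is ignored
        LocalDate date = LocalDate.parse(dtc.substring(0, 10));
        LocalDate reference = LocalDate.parse(rfstdtc.substring(0, 10));
        long diff = ChronoUnit.DAYS.between(reference, date);
        // on or after the reference date: add 1, so that there is no day 0
        return OptionalInt.of((int) (diff >= 0 ? diff + 1 : diff));
    }

    public static void main(String[] args) {
        System.out.println(studyDay("2014-01-15", "2014-01-15"));       // day 1
        System.out.println(studyDay("2014-01-14", "2014-01-15"));       // day -1
        System.out.println(studyDay("2014-01-16T08:30", "2014-01-15")); // day 2
    }
}
```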

This version: December 9, 2013

- table rows now have an alternating background color (white / gray) for better appearance

- column headers have color yellow

- selected tab has color orange

- Options-Settings menu usage now displays a dialog

- using this dialog, display of an intermediate information message when finding parent record of SUPPQUAL/CO record can be skipped.

- bug fix in RELREC: USUBJID is not always mandatory (e.g. when relationship is between datasets)

- TODO: update/remake tutorial/manual

This version: Friday November 29th, 2013

- bug fix: program hung when selecting option "check age from birthdate and reference start date" when the birthdate is absent (it is optional).

- bug fix: When viewing CO data, the option "Show parent record of CO record" did not work when IDVAR and IDVARVAL are empty, as is usually the case for a comment on the DM domain

- Correction in tutorial/manual: automatic validation is only done on "required" variables, not on "expected" variables. The latter can have null values under specific circumstances which is out of the scope of the viewer

- several smaller improvements

This version: Friday November 22nd, 2013

- major improvement that when one has loaded a set of tables, one can add additional ones without having to reload all the datasets.

This is very well explained in the updated manual

- further improved filtering capabilities

This version: Tuesday November 19th, 2013

- implemented ItemGroupDataSeq as a tooltip on the cell with STUDYID - also see the tutorial

This version: Friday November 15th, 2013

- automated search for the corresponding record in the DM domain

- v.2.0 is now the default for the define.xml version

- When selecting a CO record, automated search for the parent record in other domains

- when selecting a RELID in the RELREC dataset, automated search for the related records in the other domains

- first date of study treatment and last date of study treatment (taken from EX) can be displayed as tooltip on USUBJID in the DM table (option)

- toggling between tables using CTRL-B or using the menu

- keyboard shortcuts for several functionalities

- update of the tutorial

This version: Friday October 18th, 2013

- new features: several ways of exporting to text files (see tutorial)

- tutorial partially updated

- some minor improvements

This version: Monday October 10th, 2013

- new feature: optional check whether Study OID of define.xml matches Study OID of data file

- user can now interrupt the loading process. The datasets that have already been loaded + the current (incomplete) table is then displayed, after the validation is performed.

This version: Friday October 4th, 2013

Many small improvements including

- wrote a draft tutorial

- immediate navigation to parent domain table when using "show parent records of supplemental qualifier" - TODO the same for CO

- better update of the progress bars

- QNAM is regarded as a topic variable (for filtering)

- when having a prior filter available, the user can apply that to newly loaded files, which can save a lot of memory usage, as not the complete dataset will be loaded into memory (only those records that pass the filter). This would also be a preferred way of working for reviewers, e.g. first only load a few datasets, and then create a filter (based on e.g. age, site, ..., lab values, vital sign values ...) and then (re)load all files that are necessary.

This version: Monday September 30, 2013

A few bugs were corrected, e.g. that the selection/sorting in the table disappeared when the "filter" window pops up.

Also the version from Sunday crashed as I forgot to also provide the (empty) "temp" directory in the distribution. The new version will try to generate this directory when it is absent or was deleted.

New features:

- when a filter is applied to all currently loaded datasets, the user is allowed to give a title for this filter, which is then displayed in the frame.

As soon as the filter is removed, this title also disappears. This title can help the reviewer remembering that he/she is working on a subset of data and what this subset is about. Example "all subjects older than 80 years".

- when a filter is applied and a title has been given, and the window with tables is closed, and the user loads (additional) datasets and decides to implement the last used filter to these datasets, then the title of that filter appears on the top of the new window with tables.

This version: Sunday September 29, 2013

New features:

- now also support (again) for define.xml 2.0 - tested with files of the define 2.0 standard distribution

- new filtering features: filtering possible based on subset of subjects chosen from / selected in a table

- when the window with the tables is closed, the filtering is remembered. Upon a next click on the "start" button, the user is asked to apply that latest filtering to all datasets or to load all datasets completely.

This is probably a very interesting feature. For example, the user can load only the DM dataset and then select/filter a number of subjects based on age, sex, site, ...

He/she can subsequently (re)load other datasets and apply that filter. That way, all tables show only the chosen subset of subjects. This not only makes the review easier, but also saves memory ...

Or the user loads only the DM and LB datasets and creates a filter for all subjects with low haemoglobin values. The user can then (re)load other datasets for those subjects only.
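
The two-step workflow described above (select subjects in one dataset, then reload the others for those subjects only) can be sketched roughly as follows. This is only an illustration under stated assumptions: records are represented as simple maps from variable name to value, and the class and method names are hypothetical, not the viewer's actual code.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Predicate;

final class SubjectFilterSketch {

    // Step 1: collect the USUBJID of every DM record that passes the reviewer's condition.
    static Set<String> selectSubjects(List<Map<String, String>> dmRecords,
                                      Predicate<Map<String, String>> condition) {
        Set<String> subjects = new HashSet<>();
        for (Map<String, String> record : dmRecords) {
            if (condition.test(record)) {
                subjects.add(record.get("USUBJID"));
            }
        }
        return subjects;
    }

    // Step 2: while (re)loading another dataset, keep only the records of the selected
    // subjects, so that rejected records are never added to the in-memory table.
    static List<Map<String, String>> loadFiltered(Iterable<Map<String, String>> incoming,
                                                  Set<String> subjects) {
        List<Map<String, String>> kept = new ArrayList<>();
        for (Map<String, String> record : incoming) {
            if (subjects.contains(record.get("USUBJID"))) {
                kept.add(record);
            }
        }
        return kept;
    }
}
```

For example, selectSubjects(dm, r -> Integer.parseInt(r.get("AGE")) > 80) would give the subject set for a filter like "all subjects older than 80 years", which can then be passed to loadFiltered for each further dataset.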

This version: Friday September 27, 2013

New features:

- software has been renamed to "Smart_SDS-XML_Viewer"

- considerably better memory usage: all the LZZT files can now be loaded with a memory usage of less than 384MB. This has been achieved by storing the data in US-ASCII byte arrays instead of Unicode Strings (the Java default).

- added a second progress bar to also display the progress of validation for each dataset, as some minimal validation is ALWAYS done.

- added new functionality to change the order of the tabs in the view. TODO: also allow drag-and-drop to move tabs.

- some options (define.xml 2.0, data caching) have been disabled as they need further investigation.
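
To illustrate the byte array idea from the list above: a value stored as US-ASCII takes one byte per character, whereas a Java String of that time used a UTF-16 char array with two bytes per character. The wrapper class below is a hypothetical sketch, not the viewer's actual data structure.

```java
import java.nio.charset.StandardCharsets;

final class AsciiValue {

    // One byte per character. Characters outside US-ASCII are replaced by '?'
    // during encoding, so this sketch is only suitable for pure ASCII data.
    private final byte[] bytes;

    AsciiValue(String value) {
        this.bytes = value.getBytes(StandardCharsets.US_ASCII);
    }

    // Number of bytes used for the character data itself (array overhead not included).
    int storedBytes() {
        return bytes.length;
    }

    // Decodes back to a String only when the value is actually needed, e.g. for display.
    @Override
    public String toString() {
        return new String(bytes, StandardCharsets.US_ASCII);
    }
}
```

Since Java 9, the JVM itself stores Latin-1-only Strings in one byte per character (compact strings), so the gain from such a wrapper is mainly relevant on older Java versions.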

Next to do: make a tutorial demonstrating the advantages of using SDS-XML and the viewer.

This version: Sunday June 30th, 2013

The SmartSDTMViewer is currently only a prototype for demoing.

It reads SDTM/SEND/ADaM data in CDISC-ODM-XML format (so we do realize "SDTMViewer" is not a good name). It assumes that there is one xml file per dataset (i.e. one XML file is not allowed to contain records from different SDTM domains). The CDISC-ODM-XML file must be of version 1.3 or 1.3.1 and have the "table rows" (1 ItemGroupData per row) in a ClinicalData container element, except for the trial design datasets, where the container element must be ReferenceData.

The software ALWAYS requires a define.xml file that is in agreement with the datasets. Currently there is no good error handling for the case that a loaded SDTM-XML file is not described, or is incorrectly described, in the define.xml file.

Currently, the file name of the dataset must correspond to the value of the "Name" attribute in the ItemGroupDef (as is common practice). The software does not try to read the value of def:leaf elements referenced by def:ArchiveLocationID attributes.

New features:

- the user can choose between a define.xml version 1.0 file and a define.xml version 2.0 file.

- simple support for ADaM: testing of USUBJID in data files against the one from the ADSL.

- lower memory footprint due to "canonicalization" of strings - however not fully optimized yet.

- search functionality (using the menu or CTRL-F).

- better display of progress

- log file generation

- error message display when running out of memory
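
The ADaM check mentioned in the list above (USUBJID in the data files against the ADSL) can be pictured as a simple set lookup. This is a sketch under the assumption that the USUBJID values have already been read from the files. The class and method names are hypothetical.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class AdslSubjectCheck {

    // Returns the distinct USUBJID values of a dataset that do not occur in ADSL,
    // in order of first appearance. An empty result means the dataset passes the check.
    static Set<String> unknownSubjects(Set<String> adslSubjects, List<String> datasetSubjects) {
        Set<String> unknown = new LinkedHashSet<>();
        for (String usubjid : datasetSubjects) {
            if (!adslSubjects.contains(usubjid)) {
                unknown.add(usubjid);
            }
        }
        return unknown;
    }
}
```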

New features 23.6.2013

- >50% less memory usage

- when total file size > 60% of available memory: string interning on values for STUDYID, DOMAIN and all coded values

- when total file size > 80% of available memory: further string interning, also on longer strings

- 30.6: when total file size > 100% of available memory: data caching is automatically switched on
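
The thresholds in the list above amount to a small decision rule, sketched below. How the viewer measures "available memory" is not stated here, so the sketch simply takes it as a parameter (Runtime.getRuntime().maxMemory() would be one possible source, which is an assumption). The class, method and enum names are hypothetical.

```java
final class MemoryStrategyChooser {

    enum Strategy { NO_INTERNING, INTERN_CODED_VALUES, INTERN_LONGER_STRINGS, DATA_CACHING }

    // Picks a strategy from the ratio of total file size to available memory,
    // using the 60% / 80% / 100% thresholds listed above.
    static Strategy choose(long totalFileSizeBytes, long availableMemoryBytes) {
        double ratio = (double) totalFileSizeBytes / availableMemoryBytes;
        if (ratio > 1.0) {
            return Strategy.DATA_CACHING;          // files do not fit: cache unused datasets to disk
        }
        if (ratio > 0.8) {
            return Strategy.INTERN_LONGER_STRINGS; // further interning, also on longer strings
        }
        if (ratio > 0.6) {
            return Strategy.INTERN_CODED_VALUES;   // STUDYID, DOMAIN and all coded values
        }
        return Strategy.NO_INTERNING;
    }
}
```

Interning pays off for these variables because the same few values (a study identifier, a domain code, a controlled term) are repeated in very many records, so one shared String instance can replace many copies.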

The software currently "claims" 500MB of memory at startup, and can claim up to a maximum of 1000MB.

This is currently more than sufficient for loading all the LZZT datasets (2013): when all datasets have been loaded, the memory usage is about 500MB; during loading a peak of 750MB is observed.

In case the software runs out of memory, you can increase the maximum amount of memory the software can obtain by changing the value of the -Xmx parameter in the SmartSDTMViewer.bat file.

For example, if you would like to have 2GB of memory available instead of the default of 1GB, then the line should be changed into:

java -Xms512M -Xmx2048M -cp %CLASSPATH% com.xml4pharma.smartsdtmviewer.gui.GUI

Note, however, that you should not set the -Xmx parameter higher than about 60% of the physical memory you really have available.

For example, if you have 4GB physical memory, you should not set the -Xmx parameter higher than about 2.4GB.
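
If you want to verify which maximum is actually in effect after editing the .bat file, a small stand-alone Java program can report it. The sketch below is not part of the viewer and the class name is hypothetical. It also applies the 60% rule of thumb from above to a given amount of physical memory.

```java
final class HeapCheck {

    // Upper limit for -Xmx according to the 60% rule of thumb, in MB (rounded down).
    static long suggestedMaxHeapMb(long physicalMemoryMb) {
        return physicalMemoryMb * 60 / 100;
    }

    public static void main(String[] args) {
        // maxMemory() reports approximately the value set with -Xmx, in bytes.
        long maxHeapMb = Runtime.getRuntime().maxMemory() / (1024 * 1024);
        System.out.println("Maximum heap available to this JVM: " + maxHeapMb + " MB");
        System.out.println("Suggested -Xmx limit for 4GB physical memory: "
                + suggestedMaxHeapMb(4096) + " MB");
    }
}
```

For 4GB (4096MB) of physical memory the rule gives 2457MB, i.e. the "about 2.4GB" mentioned above.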

There is a user option "use data caching" which is VERY EXPERIMENTAL. It is meant for the case that the user does not have sufficient computer memory available. When a dataset is not in use, it is cached to a file (in the "temp" directory). Of course this slows down execution considerably, as reading from disk is much slower than reading from memory.

Do not use this "feature" with the LZZT files; that is unnecessary. You can try it out (EXPERIMENTAL) on considerably larger datasets.

Please also try this software on similar datasets that you have! I am especially curious about loading times and memory usage for considerably larger datasets. The software should also be able to treat XML subject-related datasets where ItemGroupData comes directly under "ClinicalData", but I haven't tested this yet.

An example ADaM dataset has also been uploaded to the portal.

Please also try out the filtering and sorting features (multiple column sorting). Comments are very welcome.

LAST CHANGES:

- row highlighting is done using TableModel instead of JTable, which is more correct

- if highlighting should be done (e.g. for RELREC parent record), but the row is currently not visible (e.g. filtered out), this is reported (warning message)

- user can choose between version 1.0 and 2.0 for the define.xml file.

TODO:

- File - exit menu

- features that YOU would like to see

- write documentation, make movie ...

Have fun!

Jozef

Source: README.txt, updated 2025-11-03