Smart Dataset-XML Viewer (outdated) - Browse Files at SourceForge.net

Name	Modified	Size
README.txt	2018-06-21	46.7 kB
Smart_Dataset-XML_Viewer_software_2018-06-21.zip	2018-06-21	7.9 MB
Smart_Dataset-XML_FHIR_EHR_support.pdf	2018-06-21	251.6 kB
Files_from_LZZT_Pilot_2013_Dataset-XML_EHR.zip	2018-06-21	8.0 MB
Smart_Dataset_XML_Viewer_Installation_Instructions.pdf	2018-03-04	219.6 kB
Smart_Dataset-XML_Viewer_software_2018-02-17.zip	2018-02-17	44.0 MB
Smart_Dataset-XML_Viewer_software_2017-10-30.zip	2017-10-30	7.8 MB
Smart_Dataset-XML_Adding_own_Validation_Rules.pdf	2017-10-22	448.3 kB
Smart_Dataset-XML_tutorial_2017-06-15.pdf	2017-07-15	5.0 MB
Smart_Dataset-XML_Viewer_software_2017-06-18.zip	2017-06-18	7.8 MB
README_FIRST.txt	2017-04-15	497 Bytes
FDA_SEND_validation_rules.xml	2016-07-11	644.5 kB
FDA_SDTM_validation_rules.xml	2016-07-11	1.0 MB
PMDA_SDTM_Validation_rules.xml	2016-06-04	1.0 MB
Files_from_LZZT_Pilot_2013_LBLOINC_Dataset-XML.zip	2016-04-30	6.4 MB
CDISC_ADaM_validation_rules.xml	2016-04-23	245.0 kB
Smart_Dataset-XML_XQuery_validation.pdf	2016-04-15	528.6 kB
Smart_Dataset-XML_WebServices.pdf	2015-01-25	917.1 kB
XPT2DatasetXML_Software_with_ZIP.zip	2014-05-26	358.0 kB
Files_from_LZZT_Pilot_2013_Dataset-XML_with_errors.zip	2014-04-19	6.2 MB
Files_from_LZZT_Pilot_2013_Dataset-XML_OK.zip	2014-04-19	6.2 MB
Totals: 21 Items		105.0 MB

This version, June 21, 2018

- Minor update, adding a feature to visualize FHIR source records when embedded into the SDTM record. See the document "Smart_Dataset-XML_FHIR_EHR_support.pdf"

This version, February 17, 2018

- Minor update, fixing some bugs for case of unexpected errors in the datasets themselves.

- Improved memory management

- Messages from XQuery validation (Open Rules for CDISC Standards) can now also be exported as CSV, e.g. for use in spreadsheets.

- related records in MS can be searched for in MB (and vice versa - over MBGRPID/MSGRPID) using "Tools - show related records".

This version, October 30, 2017

- minor update: choice between simple RESTful web service for LOINC codes and extended RESTful web service (the latter providing more detailed information)

This version, June 18, 2017

- Fixed a bug where an exception was written to the log file (without any further harm done) when valuelist ItemDefs in the define.xml did not have a "label" in Description/TranslatedText

- some of the menu options are disabled in case of ADaM, as these options do not make sense in the context of ADaM

This version, April 15, 2017

- Fixed feature "Derive and highlight last observation records before first exposure (baseline flag)":

the software now also highlights, in a darker color and with a warning tooltip, records for which the (last) observation is on the same day as the first exposure to treatment

and where one of the two dates does not contain a time part, so that it is not 100% certain that the last observation was indeed performed before the first exposure (a data quality issue).

Unfortunately, this is often the case in clinical studies, as often only the date is captured, and those working with the data must rely on what is written in the protocol,

without any means to actually check whether the protocol was strictly followed.
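
The same-day check described above could be sketched as follows. This is a minimal Python illustration, not the viewer's own (Java) code: the function names and the three return labels are hypothetical, and complete ISO 8601 dates (at least YYYY-MM-DD) are assumed for both values.

```python
def has_time(dtc: str) -> bool:
    """True when an ISO 8601 value carries a time part, e.g. 2013-06-30T08:15."""
    return "T" in dtc


def classify_observation(obs_dtc: str, first_exposure_dtc: str) -> str:
    """Classify one observation relative to the first exposure to treatment.

    Returns "before", "same_day_uncertain" or "not_before" (hypothetical labels).
    Assumes both values contain at least a complete date (YYYY-MM-DD).
    """
    obs_date, exp_date = obs_dtc[:10], first_exposure_dtc[:10]
    if obs_date < exp_date:
        return "before"
    if obs_date > exp_date:
        return "not_before"
    # Same calendar day: the order is only certain when both values have a time part.
    if has_time(obs_dtc) and has_time(first_exposure_dtc):
        return "before" if obs_dtc < first_exposure_dtc else "not_before"
    return "same_day_uncertain"
```

A record classified as "same_day_uncertain" corresponds to the case that gets the darker color and the warning tooltip.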

- Minor bug fixes, e.g. for unexpected null values

- AGAIN: if you want to have the latest version of the "Open Rules for CDISC Standards" (XQuery rules), please download them from: http://xml4pharmaserver.com/RulesXQuery/index.html

A manual on how to use the "Open Rules for CDISC Standards" is included in this distribution, and is also available from:

http://xml4pharmaserver.com/RulesXQuery/Running_OpenRules_within_SmartDatasetXML.pdf

- TODO: implement optional RESTful web service to automatically update rules for which an improved version has become available.

This version, March 13, 2017

IMPORTANT: This version contains all FDA, PMDA and CDISC rules implemented so far for use with Dataset-XML as XQuery rules.

These implementations are under continuous development. For the latest version of these rule implementations, please visit: http://xml4pharmaserver.com/RulesXQuery/index.html (updated almost daily).

If you want to use the latest version of these rules in the Smart Dataset-XML Viewer, just download the latest XML files from the aforementioned website and copy them into the directory "Validation_Rules_XQuery".

In the future, we will add functionality that lets the user request an automatic update of the rule set through a RESTful web service, as explained at: http://xml4pharmaserver.com/WebServices/XQueryRules_webservices.html.

New features and bug fixes:

- fixed a bug for validation of --TESTCD and --CAT values

- new feature: Derive and highlight last observation records before first exposure to treatment (baseline flag)

This feature demonstrates why the new --LOBXFL variable in SDTM 1.5 is completely unnecessary (see: http://cdiscguru.blogspot.com/2016/08/why-lobxfl-should-not-be-in-sdtm.html)

- new feature: visualization of any selected validation rule in XQuery with color coding (in window "Validation Rule Selection")

Very many small improvements

This version, December 19, 2016

Further improved filtering capabilities (see document Smart_Dataset-XML_Filtering_Sorting.pdf):

- "stack" of filters

- undo last applied filter (from stack)

- each filter obtains an automatically generated description ("WHERE"-like statement)

- apply single filter to all tables at once
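
The filter stack listed above can be pictured with a small sketch. This is a hypothetical Python illustration, not the viewer's implementation: the class names, the equality-only filter, and the exact wording of the generated description are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Record = Dict[str, str]


@dataclass
class Filter:
    description: str                      # automatically generated condition text
    predicate: Callable[[Record], bool]   # decides whether a record passes


@dataclass
class FilterStack:
    filters: List[Filter] = field(default_factory=list)

    def push(self, variable: str, value: str) -> None:
        """Add an equality filter and generate its description."""
        self.filters.append(
            Filter(f"{variable} = '{value}'",
                   lambda rec, v=variable, x=value: rec.get(v) == x))

    def undo(self) -> None:
        """Remove the most recently applied filter, if there is one."""
        if self.filters:
            self.filters.pop()

    def apply(self, records: List[Record]) -> List[Record]:
        """Keep only the records that pass every filter on the stack."""
        return [r for r in records if all(f.predicate(r) for f in self.filters)]

    def describe(self) -> str:
        """Return a "WHERE"-like statement for the whole stack."""
        if not self.filters:
            return ""
        return "WHERE " + " AND ".join(f.description for f in self.filters)
```

Applying a single filter to all tables at once then amounts to calling `apply` on the records of each loaded table with the same stack.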

This version, December 3, 2016

Many new filters, such as filters on subject demographics properties (ACTARMCD, AGE, SEX, RFSTDTC, RFXSTDTC, RFENDTC, RFXENDTC, RFISTDTC) that are applicable to any table, described in detail in the document "Smart Dataset-XML Viewer: Sorting and Filtering" (file Smart_Dataset-XML_Filtering_Sorting.pdf).

"Undoing" filters, removing all filters.

Bug fixes for minor issues. New source code available.

This version, 15 November 2016

- minor update: the November 6 distribution was lacking the Saxon libraries, causing some (minor) features not to function. The Saxon libraries have now been added.

This version, 6 November 2016

- Many new sorting and filtering features, described in the document "Smart Dataset-XML Viewer: Sorting and Filtering" (file Smart_Dataset-XML_Filtering_Sorting.pdf)

This version, 1 November 2016

- When the user clicks "View Define.xml", a choice becomes available whether to see the define.xml in the browser using the stylesheet, or whether to see the define.xml source in a new interactive dialog. The latter allows reviewers to inspect what is really present in the define.xml, which is important, as the stylesheet filters the data (stylesheets can even be used to manipulate the data!).

- new "smart" feature: show trial arm name (ARM, ACTARM) as tooltip on arm code (ARMCD, ACTARMCD). This shows again that ARM and ACTARM are only necessary in TA, and not in any other datasets

- new "smart" feature (first of a new series): Display ACTARMCD on each USUBJID in every dataset. This feature is especially important for FDA and PMDA reviewers, as it allows them to see immediately to which arm the subject for the current data point belongs, without needing to switch to the corresponding DM record, which is still very easy by using "CTRL-D" anyway.

We will very soon add some more of these features, please let us know which ones you would like to see.

- TODO: in each subject-related dataset, allow filtering on the actual arm (ACTARMCD)

This version, 8 September 2016

- minor update: records with last observation before exposure, but on same date as first exposure, and without time part get another color and a warning tooltip.

See: http://cdiscguru.blogspot.com/2016/09/lobxfl-follow-up.html

This version, 31 August 2016

- New feature: "derive and highlight last observation records before first exposure"

- New tutorial: "Deriving and displaying "Last Observation before First Exposure Records" with the Smart Dataset-XML Viewer"

This version, 22 April 2016

- "Bring SUPPQUAL back" checkbox disabled when Standard=ADaM

- XQuery rule consolidation: when more than 1 set (e.g. "FDA","CDISC","PMDA", "MyCompany") available, rules from different sets can be combined

- New set of CDISC-ADaM validation rules (far from complete) now available, these will be updated regularly and published separately. Other sets (e.g. PMDA) will be developed, we do however need more 'workforce'

This version, 10 April 2016

- new webservice: check whether a standardized value (--STRESC, CVFARS) is an allowed value for the given test code (--TESTCD). Currently only for EG, RS and CV.

- FDA SDTM rules validation using XQuery - see the separate document "Smart_Dataset-XML_XQuery_validation.pdf".

This version, 19 March 2016

- several small improvements and new features, such as generation of "test name" (--TEST) from "test code" (--TESTCD) from the CodeLists in the define.xml

TODO: case that --TESTCD and --TEST are given as "EnumeratedItem" and connected by the NCI code.

- MedLinePlus LOINC RESTful web service change from http to https

This version, 10 December 2015

- some minor changes in the GUI (to make it even more user-friendly)

- new feature: check the validity of --DY values (under "Options - Validation - Check --DY values")

- In future: even more web services will be added

This version, 23 August 2015

- new feature: automated calculation of "Study Element" (ETCD), "Study Element Name" (ELEMENT) and EPOCH, with display as tooltip on all --DTC cells.

Can be switched on by clicking the button "Options", then selecting the tab "Smart Features" and checking the checkbox "Show Study Element and Epoch on --DTC"

This version, 17 May 2015

- several minor bug fixes, especially concerning web services

- note that not all menus (like "view - show annotated CRF") have been implemented yet

This version, 25 Jan 2015

New features:

- in the table view, a new function "Find Record by Record Number" has been added to the "Search" menu. It allows selecting the original record by its record number (ItemGroupDataSeq in Dataset-XML),

even when the table has been re-sorted

- two new web services have been added:

a) check whether the unit (xxORRESU / original result unit, xxSTRESU / standardized result unit) is a correct unit for the given xxTESTCD (test code).

This feature is currently limited to VS datasets

b) check whether the VSPOS value (vital signs position) is a correct VSPOS for the given VSTESTCD

As soon as CDISC publishes more similar lists, a web service will be developed for it.

This version, 19 Oct 2014

Bug fix: in the (unusual?) case that the ItemData within an ItemGroupData do not come in the same order as the corresponding ItemRef in the ItemGroupDef, the display of the data was incorrect.

This has now been corrected using a new loading algorithm.
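
One order-independent way to load a row is to place each value by its ItemOID rather than by its position in the file. The sketch below illustrates that idea in Python; it is not necessarily the algorithm the viewer uses, and the function name and the plain list/tuple inputs (instead of parsed XML) are assumptions made for the example.

```python
from typing import Dict, List, Optional, Sequence, Tuple


def build_row(item_ref_oids: Sequence[str],
              item_data: Sequence[Tuple[str, str]]) -> List[Optional[str]]:
    """Build one table row in the column order given by the define.xml.

    item_ref_oids: ItemOID values of the ItemRef elements, in ItemGroupDef order.
    item_data:     (ItemOID, Value) pairs of one ItemGroupData, in file order.
    """
    column_index: Dict[str, int] = {oid: i for i, oid in enumerate(item_ref_oids)}
    row: List[Optional[str]] = [None] * len(item_ref_oids)
    for item_oid, value in item_data:
        # A KeyError here signals an ItemOID that the define.xml does not describe.
        row[column_index[item_oid]] = value
    return row
```

Columns for which the ItemGroupData has no ItemData simply stay empty (None).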

This version, 14 Oct 2014

Implementation of "Bring SUPPQUAL data back to original data set".

Furthermore, minor improvements + bug fix for finding parent records of RELREC records (did not function correctly). New tutorial.

This version, 1 Oct 2014

Minor update of the "web services" version, "lookup" of test name for given test code will now also work even when no codelist is associated with the --TESTCD variable in the define.xml file (though it should by the SDTM spec).

This version, 27 Sept 2014

Special "web services" version, prototyping using a number of web services - see document "Smart_Dataset-XML_WebServices.pdf"

This version, 27 July 2014

Software suggests when filtering before loading is recommended depending on file size. Threshold file size for this suggestion can be set using the "Properties" button (default 20MB).

This version, 16 July 2014

small enhancement: choice between use of --TESTCD and --CAT (or no filtering at all) for filtering data BEFORE loading, based on associated codelists on --TESTCD and/or --CAT in the define.xml.

Will probably be extremely useful for very large data sets such as LB and QS.

This version, July 9, 2014

--TESTCD filtering possible BEFORE loading the datasets, based on associated CodeList in the define.xml.

New tutorial / manual

This version, June 11, 2014

ADaM numeric dates/datetimes/times can be displayed as ISO-8601. Internally, these remain numbers. This new feature has been well tested on dates and times, but not yet on datetimes. Also see document Display_ADaM_dates_as_ISO8601.pdf.
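
As an illustration of such a display conversion, the following Python sketch turns numeric values into ISO 8601 text while the numbers themselves stay untouched. It assumes the numbers follow the SAS convention (dates as days since 1960-01-01, datetimes as seconds since 1960-01-01T00:00:00, times as seconds since midnight); the function names are hypothetical and this is not the viewer's code.

```python
from datetime import date, datetime, timedelta

SAS_EPOCH_DATE = date(1960, 1, 1)          # assumed reference date (SAS convention)
SAS_EPOCH_DATETIME = datetime(1960, 1, 1)  # assumed reference datetime


def sas_date_to_iso(days: int) -> str:
    """Numeric date (days since 1960-01-01) -> YYYY-MM-DD."""
    return (SAS_EPOCH_DATE + timedelta(days=days)).isoformat()


def sas_datetime_to_iso(seconds: int) -> str:
    """Numeric datetime (seconds since 1960-01-01T00:00:00) -> YYYY-MM-DDThh:mm:ss."""
    return (SAS_EPOCH_DATETIME + timedelta(seconds=seconds)).isoformat()


def sas_time_to_iso(seconds: int) -> str:
    """Numeric time (seconds since midnight) -> hh:mm:ss."""
    hours, remainder = divmod(seconds, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"
```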

This version: April 4, 2014

There was a bug in the display for non-European characters (especially Asian). This has been fixed. With many thanks to Dr. Chiba and his colleagues in Japan.

This version: April 3, 2014

- added support for non-US-ASCII characters

This version: April 2, 2014

- As the name of the standard has been changed from SDS-XML (Study-Data-Set-XML) into Dataset-XML, the name of the software and of the files has also been changed.

Please also note that although there are minor changes in the XML-Schema, old SDS-XML files will still be usable, though not all features of the software might work.

We will soon upload a new set of test / demo files that are compliant with the new XML-Schema.

This version: January 15, 2014

- possibility to start the program with parameters, allowing the viewer to be used in combination with other software (see tutorial)

- automated calculation of --DY values from --DTC (and RFSTDTC) as an option, and display as a tooltip on --DTC

- automated lookup of VISIT (name) from VISITNUM when the TV table is loaded (optional), with the visit name being displayed as a tooltip on VISITNUM

- order of the tables: after DM, the trial design tables are loaded before any other tables. After that, the CO and RELREC tables are loaded (when present)

- many smaller improvements, described in the updated tutorial

- new version of the tutorial, describing all current features of the software
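
The --DY calculation mentioned in the list above follows the usual SDTM study day convention: there is no day 0, so dates on or after the reference start date get the difference plus 1, and earlier dates get the plain (negative) difference. A minimal Python sketch, with a hypothetical function name and the assumption that partial dates yield no study day:

```python
from datetime import date
from typing import Optional


def study_day(dtc: str, rfstdtc: str) -> Optional[int]:
    """Derive --DY from a --DTC value and the subject's RFSTDTC.

    Only the date part (YYYY-MM-DD) is used; any time part is ignored.
    Returns None when either value lacks a complete date.
    """
    if len(dtc) < 10 or len(rfstdtc) < 10:
        return None  # partial or missing date: no study day can be derived
    delta = (date.fromisoformat(dtc[:10]) - date.fromisoformat(rfstdtc[:10])).days
    # No day 0: the reference start date itself is study day 1.
    return delta + 1 if delta >= 0 else delta
```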

This version: December 9, 2013

- table rows now have an alternating background color (white / gray) for better appearance

- column headers have color yellow

- selected tab has color orange

- Options-Settings menu usage now displays a dialog

- using this dialog, display of an intermediate information message when finding parent record of SUPPQUAL/CO record can be skipped.

- bug fix in RELREC: USUBJID is not always mandatory (e.g. when relationship is between datasets)

- TODO: update/remake tutorial/manual

This version: Friday November 29th, 2013

- bug fix: program hung when selecting option "check age from birthdate and reference start date" when the birthdate is absent (it is optional).

- bug fix: when viewing CO data, the option "Show parent record of CO record" did not work when IDVAR and IDVARVAL are empty, as is usually the case for a comment on the DM domain

- Correction in tutorial/manual: automatic validation is only done on "required" variables, not on "expected" variables. The latter can have null values under specific circumstances which is out of the scope of the viewer

- several smaller improvements

This version: Friday November 22nd, 2013

- major improvement that when one has loaded a set of tables, one can add additional ones without having to reload all the datasets.

This is very well explained in the updated manual

- further improved filtering capabilities

This version: Tuesday November 19th, 2013

- implemented ItemGroupDataSeq as a tooltip on the cell with STUDYID - also see the tutorial

This version: Friday November 15th, 2013

- automated search for the corresponding record in the DM domain

- v.2.0 is now the default for the define.xml version

- When selecting a CO record, automated search for the parent record in other domains

- when selecting a RELID in the RELREC dataset, automated search for the related records in the other domains

- first date of study treatment and last date of study treatment (taken from EX) can be displayed as tooltip on USUBJID in the DM table (option)

- toggling between tables using CTRL-B or using the menu

- keyboard shortcuts for several functionalities

- update of the tutorial

This version: Friday October 18th, 2013

- new features: several ways of exporting to text files (see tutorial)

- tutorial partially updated

- some minor improvements

This version: Monday October 10th, 2013

- new feature: optional check whether Study OID of define.xml matches Study OID of data file

- user can now interrupt the loading process. The datasets that have already been loaded + the current (incomplete) table is then displayed, after the validation is performed.

This version: Friday October 4th, 2013

Many small improvements including

- wrote a draft tutorial

- immediate navigation to parent domain table when using "show parent records of supplemental qualifier" - TODO the same for CO

- better update of the progress bars

- QNAM is regarded as a topic variable (for filtering)

- when having a prior filter available, the user can apply that to newly loaded files, which can save a lot of memory usage, as not the complete dataset will be loaded into memory (only those records that pass the filter). This would also be a preferred way of working for reviewers, e.g. first only load a few datasets, and then create a filter (based on e.g. age, site, ..., lab values, vital sign values ...) and then (re)load all files that are necessary.

This version: Monday September 30, 2013

A few bugs were corrected, e.g. that the selection/sorting in the table disappeared when the "filter" window pops up.

Also the version from Sunday crashed as I forgot to also provide the (empty) "temp" directory in the distribution. The new version will try to generate this directory when it is absent or was deleted.

New features:

- when a filter is applied to all currently loaded datasets, the user is allowed to give a title for this filter, which is then displayed in the frame.

As soon as the filter is removed, this title also disappears. This title can help the reviewer remembering that he/she is working on a subset of data and what this subset is about. Example "all subjects older than 80 years".

- when a filter is applied and a title has been given, and the window with tables is closed, and the user loads (additional) datasets and decides to implement the last used filter to these datasets, then the title of that filter appears on the top of the new window with tables.

This version: Sunday September 29, 2013

New features:

- now also support (again) for define.xml 2.0 - tested with files of the define 2.0 standard distribution

- new filtering features: filtering possible based on subset of subjects chosen from / selected in a table

- when the window with the tables is closed, the filtering is remembered. Upon a next click on the "start" button, the user is asked to apply that latest filtering to all datasets or to load all datasets completely.

This is probably a very interesting feature. For example, the user can only load the DM dataset and then select/filter a number of subjects based on age, sex, site, ...

He/she can subsequently (re)load other datasets and apply that filter. Like that, all tables are for the chosen subset of subjects. This does not only make the review easier, but also saves memory ...

Or the user only loads the DM and LB datasets and creates a filter for all subjects with low haemoglobin values. The user can then (re)load other datasets for those subjects only.
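
The workflow described above (select subjects in DM first, then load other datasets for those subjects only) can be sketched in a few lines of Python. The function names and the dictionary-based records are hypothetical; the point is only that records of unselected subjects are never kept in memory.

```python
from typing import Callable, Dict, Iterable, Iterator, Set

Record = Dict[str, str]


def select_subjects(dm_records: Iterable[Record],
                    predicate: Callable[[Record], bool]) -> Set[str]:
    """Collect the USUBJID values of all DM records that pass the predicate."""
    return {r["USUBJID"] for r in dm_records if predicate(r)}


def load_filtered(records: Iterable[Record], subjects: Set[str]) -> Iterator[Record]:
    """Yield only the records of the selected subjects while reading a dataset."""
    for record in records:
        if record.get("USUBJID") in subjects:
            yield record


# Example: all subjects older than 80 years, then only their LB records.
dm = [{"USUBJID": "01", "AGE": "84"}, {"USUBJID": "02", "AGE": "63"}]
lb = [{"USUBJID": "01", "LBTESTCD": "HGB"},
      {"USUBJID": "02", "LBTESTCD": "HGB"},
      {"USUBJID": "01", "LBTESTCD": "ALT"}]
older_than_80 = select_subjects(dm, lambda r: int(r["AGE"]) > 80)
lb_subset = list(load_filtered(lb, older_than_80))
```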

This version: Friday September 27, 2013

New features:

- software has been renamed to "Smart_SDS-XML_Viewer"

- considerably better memory usage: all the LZZT files can now be loaded with a memory usage of lower than 384MB. This has been achieved by storing the data in US-ASCII byte arrays instead of Unicode Strings (the Java default).

- added a second progress bar to also display the progress of validation for each dataset, as some minimal validation is ALWAYS done.

- added new functionality to change the order of the tabs in the view. TODO: also allow drag-and-drop to move tabs.

- some options (define.xml 2.0, data caching) have been disabled as they need further investigation.

Next to do: make a tutorial demonstrating the advantages of using SDS-XML and the viewer.

This version: Sunday June 30th, 2013

The SmartSDTMViewer is currently only a prototype for demoing.

It reads SDTM/SEND/ADaM data in CDISC-ODM-XML format (so we do realize "SDTMViewer" is not a good name). It assumes that there is one xml file per dataset (i.e. one XML file is not allowed to contain records from different SDTM domains). The CDISC-ODM-XML file must be of version 1.3 or 1.3.1 and have the "table rows" (1 ItemGroupData per row) in a ClinicalData container element, except for the trial design datasets, where the container element must be ReferenceData.

The software ALWAYS requires a define.xml file that is in agreement with the datasets. Currently no good error catching is done for the case that a loaded SDTM-XML file is not or incorrectly described in the define.xml file.

Currently, the file name of the dataset must correspond to the value of the "Name" attribute in the ItemGroupDef (as is common practice). The software does not try to read the value of def:leaf elements referenced by def:ArchiveLocationID attributes.

New features:

- the user can choose between a define.xml version 1.0 file and a define.xml file version 2.0.

- simple support for ADaM: testing of USUBJID in data files against the one from the ADSL.

- lower memory footprint due to "canonicalization" of strings - however not fully optimized yet.

- search functionality (using the menu or CTRL-F).

- better display of progress

- log file generation

- error message display when running out of memory

New features 23.6.2013

- >50% less memory usage

- when total file size > 60% of available memory: string interning on values for STUDYID, DOMAIN and all coded values

- when total file size > 80% of available memory: further string interning, also on longer strings

- 30.6: when total file size > 100% of available memory: data caching is automatically switched on
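
The thresholds listed above can be read as a simple decision rule. The Python sketch below is only an illustration of that rule: the function name and the measure labels are hypothetical, and it assumes the measures are cumulative (each higher threshold adds to the previous ones), which the list suggests but does not state outright.

```python
from typing import List


def memory_measures(total_file_size: int, available_memory: int) -> List[str]:
    """Return the memory-saving measures for a given load (labels are hypothetical).

    Both arguments must be in the same unit, e.g. bytes or MB.
    """
    ratio = total_file_size / available_memory
    measures: List[str] = []
    if ratio > 0.6:
        # STUDYID, DOMAIN and all coded values
        measures.append("intern_coded_values")
    if ratio > 0.8:
        # further interning, also on longer strings
        measures.append("intern_longer_strings")
    if ratio > 1.0:
        # assumption: caching comes on top of the interning measures
        measures.append("data_caching")
    return measures
```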

The software initially "claims" 500MB of memory, and can claim up to a maximum of 1000MB.

This is currently more than sufficient for loading all the LZZT datasets (2013): when all datasets have been loaded, the memory usage is about 500MB, during loading a peak of 750MB is observed.

In case the software runs out of memory, you can increase the maximum amount of memory the software obtains by changing the value of the -Xmx parameter in the SmartSDTMViewer.bat file.

For example, if you would like to have 2GB of memory available instead of the default of 1GB, then the line should be changed into:

java -Xms512M -Xmx2048M -cp %CLASSPATH% com.xml4pharma.smartsdtmviewer.gui.GUI

Note, however, that you should not set the -Xmx parameter higher than about 60% of the physical memory you really have available.

For example, if you have 4GB physical memory, you should not set the -Xmx parameter higher than about 2.4GB.
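
The 60% rule of thumb above is simple arithmetic; a small Python helper (hypothetical names, not part of the distribution) can compute the value and build the corresponding line for the .bat file:

```python
def recommended_max_heap_mb(physical_memory_mb: int, fraction: float = 0.6) -> int:
    """Upper bound for -Xmx in MB, using the ~60% rule of thumb from the text."""
    return int(physical_memory_mb * fraction)


def java_command(max_heap_mb: int, initial_heap_mb: int = 512) -> str:
    """Build the launch line with the given heap settings."""
    return (f"java -Xms{initial_heap_mb}M -Xmx{max_heap_mb}M "
            "-cp %CLASSPATH% com.xml4pharma.smartsdtmviewer.gui.GUI")
```

For a machine with 4GB (4096MB) of physical memory this gives an upper bound of 2457MB, i.e. about 2.4GB.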

There is a user option "use data caching" which is VERY EXPERIMENTAL. It is meant for the case that the user does not have sufficient computer memory available. What it does is that when a dataset is not used, it is cached to file (into the "temp" directory). Of course this slows down execution considerably as reading from disc is much slower than reading from memory.

Do not use this "feature" with the LZZT files, that is unnecessary. You can try it out (EXPERIMENTAL) on considerably larger datasets.

Please also try this software on similar datasets that you have! I am especially curious about loading times and memory usage of considerably larger datasets. The software should also be able to treat XML subject-related datasets where ItemGroupData comes directly under "ClinicalData", but I haven't tested this yet.

An example ADaM dataset has also been uploaded to the portal.

Please also try out the filtering and sorting features (multiple column sorting). Comments are very welcome.

LAST CHANGES:

- row highlighting is done using TableModel instead of JTable - is more correct

- if highlighting should be done (e.g. for RELREC parent record), but the row is currently not visible (e.g. filtered out), this is reported (warning message)

- user can choose between version 1.0 and 2.0 for the define.xml file.

TODO:

- File - exit menu

- features that YOU would like to see

- write documentation, make movie ...

Have fun!

Jozef

Source: README.txt, updated 2018-06-21