I have been reviewing how the rxNorm and Snomed data loads work and also cross checking with other recent discussion threads on the ICD9 errors encountered. I have a couple of thoughts and would like some feedback.
1.) the rxNorm and Snomed load functions write to /tmp. Most security professionals recommend that sites limit the use of writeable directories (food for thought if another alternative is available). The data load logic also takes into account a "version management" approach allowing users to keep track of which versions of these regularly published files are being used by openemr.
2.) ICD 9 codes used to be loaded via a web site page scraper by now are loaded via an sql script (recommended approach most DBAs would take).
3.) ICD 10 code that I am interested in implementing could be simplified taking the sql load approach and forgetting about adding the codes via the UI. Note that if an sql load approach is taken it would make sense to keep meta data identifying the versions of the data loads that are currently in use by openemr.
a.) Is the UI based data load approach taken by rxNorm and Snomed the preferred way users want to administer the database content?
b.) Is there an overview of how these codes used by the users of openemr?
Having an SQL file that is with the release is more reliable and stable, however it does require that someone take responsibility for keeping that file up to date like any other part of the system.
Using the web load model makes it the user's responsibility to update the files, but means that every time the web site changes it breaks the loader, ala ICD9.
If the data changes less that the website then the sql data model is best I think…. but how to know that in advance is a big question.
So … choose your favorite model, I guess they are about even.
www.mi-squared.com / @tonymi2
oemr.org / @OEMR_org
Note: SNOMED is not used anywhere in OpenEMR yet, but it will be required to coding 'Problem Lists' in Meaningful Use Stage 2 (2014) in the USA
The RXNORM, SNOMED and ICD10 are all rather special, because they are all released periodically from a official website in a raw format. This allowed the RXNORM/SNOMED import mechanism in OpenEMR to simply import the codes from the originally released package along with tracking the version/date releases. So, it seems, the most straightforward thing would be to do this for the ICD10 codeset also, thus there is no middleman or required re-packaging of codes. That being said, sometimes the official releases get changed (for example, SNOMED changed their packing naming and package structure, but with minimal changes in the import script, OpenEMR now supports both formats). I'd argue that although the sql load approach may seem like the quickest route, it will take much more resources to continue supporting for future ICD10 release revisions. For the import mechanism, pretty much just need to mimick the SNOMED/RXNORM mechanism with some mods (creating the sql tables and then importing the csv files into the tables).
I think the real decision is how to structure the ICD10 codes on import and whether to keep the hierarchical stuff (might as well).
Also good to start thinking about how to attack this problem in the code_types table. My thoughts are to add an 'active' column and a 'internal' column. The active column will then allow us to place entries for rxnorm,snomed,ICD10 even though they are not in use and the 'internal' column can be used to flag whether the codesets are kept in the codes table or are kept in a another set of tables (this gives some flexibility if users want to simply place some ICD10/SNOMED codes in the codes table).
Another advantage of the UI mechanism for ICD10 is automatic internationalization of the codes. As part of the stored version information, we can include the language of the codeset package, thus would be very straightforward to change to a different language of the ICD10 codeset.
And Phyaura released a bit of code (likely can find it in the Code Review tracker somewhere that does use the SNOMED codes in OpenEMR; it needed some work but it shows that it can be done).
The biggest issue with all of this is getting these mechanisms (such as the active and internal columns of code_types table) supported within the codebase; but there's really no way around it now since MU stage 2 requires SNOMED/ICD10 and practices will still need to use ICD9 for practical purposes at the same time until 2014.
Agreed we'll move along the lines you described above. I have been reviewing the the RxNorm docs at their site http://www.nlm.nih.gov/research/umls/rxnorm/docs/2012/rxnorm_doco_full_2012-2.html and realize that SNOMEDCT is just one of the "vocabularies" that is part of the RxNorm standardized nomenclature for clinical drugs. Do we have any user requirements for how a practice uses the RxNorm data? Are any of these other "vocabularies" that are included in the RxNorm data feed used (or will be used) by practices?
<p>The following source vocabularies are included in RxNorm:</p>
<table border="0" cellspacing="10" cellpadding="0">
<th align="left" scope="col">SAB</th><th align="left" scope="col">Source Name</th>
<td>Gold Standard Drug Database</td>
<td>Medi-Span Master Drug Data Base</td>
<td>Multum MediSource Lexicon</td>
<td>Micromedex RED BOOK</td>
<td>Medical Subject Headings (MeSH)</td>
<td>FDA National Drug Code Directory</td>
<td>FDA Structured Product Labels</td>
<td>FDB MedKnowledge (formerly NDDF Plus)</td>
<td>Veterans Health Administration National Drug File - Reference Terminology</td>
<td>SNOMED Clinical Terms (drug information)</td>
<td>Veterans Health Administration National Drug File</td>
To keep this project moving along, here's a proposed commit in order to get the basic mechanics of the code_types table and settings into the OpenEMR codebase in order to support the DSMIV, ICD10 and SNOMED code sets. Here are the changes:
-Added ct_active column to code_types table
-Only active entries are used.
-Added ct_label column to code_types table
-This will allow changing the labels without modifying the very important key
-Added ct_external column to code_types table
-This will allow assignment of where the codeset is stored
-0 is in the codes table
-1 is in the separate ICD10 tables
-2 is in the separate SNOMED tables (RF1)
-3 is in the separate SNOMED tables (RF2)
-Added ICD10,SNOMED,DSMIV entries to the code_types table (flagged as inactive)
Here's the github branch: http://github.com/bradymiller/openemr/commits/dx-codes_1
To get an idea how this works, can activate the new code types in Administration->Lists->'Code Type'; note the new Active, Label, and External settings. If you make them active, then you will note on the Fee Sheet that they are now options to be searched on the fee sheet. Of course, we still need the codesets etc., but this seems like a a good and safe starting point.
To specifically answer your question, the rxnorm database is not being used yet. Here's some code for Phyaura's eRx solution (still very early in review process) that does seem to use the rxnorm tables (I think):
The hope for the rxnorm stuff is that somebody comes along and does the following:
1. Better integrate the Medications and Prescriptions modules in OpenEMR
2. Integrate RxNorm into the Medication and Prescription modules
I think that for MU 2, will need to be using the rxnorm id's, but Tony probably knows this better.
Supporting multiple code sets won't be nearly as tough as I initially thought. Here's code that now supports DSMIV, SNOMED, and ICD10. Note all we need to do to support ICD10 is to get an import mechanism and then place the ICD10 queries in the placeholders in the custom/code_types.inc.php script. Here's the code: http://github.com/bradymiller/openemr/commits/dx-codes_2
Also added support for the external codes (SNOMED etc.) in the Administration->Service gui. This code has been rebased with the above code and is testing very well; it is potentially ready to commit to sourceforge. If wish, please review and test; detailed notes can be found in the commit comments: http://github.com/bradymiller/openemr/commits/dx-codes_3
To help with testing, instructions for importing SNOMED can be found here:
For now, I think we can now state we fully support SNOMED. For ICD10, all that is left to do is to create the ICD10 import mechanism(simply piggyback the current SNOMED/RXNORM mechanism) and sql storage structure (then when know the storage structure, it will be very straightfoward to place the search query in the ICD10 placeholder in custom/code_types.inc.php).
This codes flexibility will be great for international users and help towards the looming 2014 MU requirements.
I took your dx-codes_3 branch last night and added some additional changes to my ICD10-databaseLoadCode branch. I have a couple of commits (as I have been playing with git a little bit) that added the ICD10 menu item in the admin page and parsed the zip file name. Please note that the ICD10 file name doesn't have a date in the file name syntax so I nailed it down to January 1st for how.
Normally i would only commit when all the code is ready for a "release". Additional code that I have include the table DDL stuff and the actual data load. Might post any questions if I need any advice… Thanks for all the "startup" code for the ICD10 support, I was code reviewing most of those files you changed and was trying to figure out what approach to take (while you were doing it) :-)
I'm planning to commit the above stuff to the sourceforge codebase soon. For the ICD10 importing, it looks like your on the right track. We are always happy to look at code that is not ready for a "release", so please feel free to keep posting it.
Committed above commit to sourceforge (looks very good after extensive testing):
OpenEMR now fully supports SNOMED (and soon to fully support ICD10)
I am in the midst of reworking the code in the interface/code_systems/standard_tables_manage.php. Basically the ICD 10 file set comes in multiple zip files whereas the RxNorm and Snomed stuff comes in one zip file. Another thing is that the existing code tracks revision dates based on the details embedded in the zip file names. The ICD 10 files haven't been named in a similar fashion and the names have morphed across multiple releases. Anyway, I have most of the code done and have been doing some unit testing. I will be retesting all the RxNorm and Snomed loads as well as the new ICD 10 loads. I hope to post to github within the next week or so depending on my availability.
I would like someone to confirm a little test for me. After installing any one of the existing Snomed or RxNorm installations please run this:
select * from standardized_tables_track table
and let me know what you have in there. Thanks.
oops! ignore that last "table" in the code example
data in standardized_tables_track on my system.
'1', '2011-07-08 08:16:08', 'SNOMED', '', '2011-01-31 00:00:00'
'2', '2011-07-08 10:22:01', 'RXNORM', '', '2011-07-05 00:00:00'
Ok, It's been a little bit so I figured I get a status note up here. I have the data load code supporting ICD 9 raw data feeds and that same framework continues to load SNOMED and RXNORM feeds. I ended up leaving the existing loads alone and extending the UI logic to accommodate the ICD loads (both 9, 10 and eventually 11 :-) I hope to have the ICD 10 raw feeds done soon. At that point I am going to upload to github. Then the real work starts of integrating the code data files into the UI, which I haven't spent any time on yet. I was wondering if anyone had the idea developed on the steps needed to be taken (hopefully in the import step) to take the data from the "raw tables" and update the tables that are needed in the UI. Send pointers back here… ETA on the github is within the next week as I want to finalize my testing.
Sounds good. Actually don't think the codebase integration will be that tough; will be easier to work out once the tables are in there. Since this has recently been worked out for the SNOMED tables(see above commit comments for lots of details on this), the most crucial thing will be to populate the ICD10 code in lookup_code_descriptions() and code_set_search() function in custom/code_types.inc.php . Then can turn on ICD10 in Administration->Lists->Code Types and start testing everything (along with grep).
When an end-user "upgrades" to ICD10 I assume that they no longer be using the ICD9 codes. So this initial release doesn't include the actual upgrade processing. This process will involve using the GEM tables (mappings from ICD9 to ICD10) that are included in the raw data loads. This upgrade will undoubtedly encounter the "one-to-many" coding problem for some docs, where one ICD9 code can "go to" more than one ICD10 code. I forget where I read this but approximately 5% of the codes are in this category, so we'll see many openEMR users upgrade to ICD10 without an issues. I will check into that SNOMED integration commit that you mention earlier when I get to that point.
Question on performance: I coded up some timestamps to echo to the page and when I go to install the ICD9 raw data feeds it takes about 6 minutes, but when the page renders the timestamps the actual work is only taking a few seconds which is what I was expecting…. not sure what's up… any pointers on figuring out what's going on? it appears something is taking about 6 minutes to resolve itself then the page processes…. have looked in the main log in /var/log/apache2/error.log but nothing is obvious…
Sounds odd. Hard to say what's going on without seeing the code. Could always throw in a bunch of error_log() debug statements.
The best way to identify a performance problem is to profile code. Use xdebug and wincachegrind or kcachegrind depending on your OS of choice. Google for details.
It's more involved initially, but ultimately a better approach because everything else is pretty much just guess work.
I got to the bottom of that performance problem. The Kcachegrind/xdebug tool combo did the trick. The culprit was the ICD9 load was using a non-indexed column. That load, after the index was created, runs in a few seconds.
I am about to commit the ICD code to the hub but wanted to post a review of the approach taken. The general semantics of the data load user interface remain unchanged. So the user still places the zip file into the contrib/<target unload> directory and clicks the Install button. The ICD messages are a little different as some of the processing is handled differently - read on. The implementation is based on the best of the rxNorm load and the Snomed load functions. I took the LOAD INFILE text file load technique from Snomed and took the table driven technique from the rxNorm as the basis for the ICD load implementation. It uses a metadata table that contains the substitution values that are used in the LOAD INFILE command for each of the zip files being loaded. Remember the ICD 10 implementation contains not only the actual code tables but the mapping tables to aid in the migration to ICD10:
ICD 10 File Types
General Equivalence Mappings
The load produces the following tables:
icd10_pcs_order_codeICD10 Procedure Codes
icd10_dx_order_codeICD10 Diagnostic Codes
icd10_gem_dx_10_9General Equivalence Mappings For Diagnostics from ICD10 to ICD9
icd10_gem_dx_9_10General Equivalence Mappings For Diagnostics from ICD9 to ICD10
icd10_gem_pcs_10_9General Equivalence Mappings For Procedures from ICD10 to ICD9
icd10_gem_pcs_9_10General Equivalence Mappings For Procedures from ICD9 to ICD10
icd10_reimbr_dx_9_10Reimbursement Mappings For Diagnostics from ICD9 to ICD10
icd10_reimbr_pcs_9_10Reimbursement Mappings For Procedures from ICD9 to ICD10
icd9_dx_codeICD9 Diagnostic Codes
icd9_sg_codeICD9 Procedure Codes
Each of these tables is setup with a surrogate key (mySql SERIAL attribute aka auto-increment) using alter statements once the text file has been loaded. This approach is a best practice and will be helpful in any future foreign key mappings that we might want to implement reporting and analytics based on the ICD10 code base.
The framework of how rxNorm and Snomed has not been changed so much but I did make some minor changes about how the version and revision dates are handled in the standard_tables_track table. The existing rxNorm and Snomed load files have some details regarding the release date embedded in the filename, whereas the ICD 9 and 10 files don't really follow the same approach. So when managing the ICD revision dates the code actually unzips the incoming release and uses the same approach as the existing code of storing the revision dates in an array in reverse order. The only difference is that the ICD dates are the actual last update dates on the content files within the zip files versus the date derived from the file name as is used in the rxNorm and Snomed loads.
I am wrapping up the testing and will be posting the load code soon (by weekend's end) that takes the raw data and loads it to the staging tables listed above. I haven't integrated the staging tables into the code base as I wanted to sync up my code to the tree that has had some changes since I took my branch. This will most likely be a tweak to this code or perhaps somewhere else in the code tree that I have yet to discover. Look forward to the code review after the commit. :-)
Looking forward to seeing the code :)
Here is where I pushed the ICD 9 / 10 code
Not sure if this is the way to send it up…
I have not retested the upgrade processing for RxNorm or Snomed, Here are the test cases that I did test:
1.) Install staging tables for ICD 9 with no ICD staging tables pre-existing
2.) Install staging tables for ICD 10 with no ICD staging tables pre-existing
3.) Install staging tables for ICD 10 with ICD 9 staging tables installed
4.) Install staging tables for RxNorm with no RxNorm staging tables installed
5.) Install staging tables for Snomed with no Snomed staging tables pre-existing