One of the first things that should happen with respect to database restructuring is changing tables to InnoDB to allow for transaction support and referential integrity as well as potential performance advantages of using InnoDB.
On my own systems, I did a grep of database.sql and changed all references to MyISAM to InnoDB before creating the database.
Almost all of the tables had no problems with this, the only exception I can remember at the moment is ar_activity which is not valid in InnoDB because it has a composite primary key which includes an auto increment field.
Fortunately InnoDB vs. MyISAM is not an all or nothing proposition. The issue is should we start switching tables over? And if so how aggressively should we go about it?
Maybe as an initial "decry" we should make all new tables be InnoDB rather than MyISAM, if we aren't ready to switch all tables (where possible) in database.sql InnoDB.
It is also fortunately possible to tell MySQL to switch an already running databases tables to InnoDB format with an alter table statement or in a tool such as mysql workbench.
Maybe we should start by converting a few key tables (forms, form_encounter, billing) to InnoDB, and potentially adding foreign key constraint relationships between them.
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
I think we should set database.sql to default to INNODB for all NEW installs.
I think we should provide an optional sql_upgrade script choice/or separate script for updating an existing DB from MyISAM to INNODB. That script should do a mysqldump before the update just to protect the people who fail to backup for themselves.
ar_activity can be left alone for now or we can get ZHH to fix it as part of the process.
-Tony
www.mi-squared.com
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
Shouldn't we force upgraders to also use InnoDB via the normal upgrade script? Otherwise, seems like there will be issues that will arise in the future if code is written assuming InnoDB is being used. Would it make sense to test out a complete conversion(except for the tables we know will break) in both the database.sql and upgrade script also (in upgrade, would only want to Alter tables that are not yet INNODB, so may need to make a function that checks this first).
-brady
www.open-emr.org
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
It probably makes sense to provide an easy mechanism to upgrade existing databases to InnoDB table. However, it won't be a disaster for new code to execute against MyISAM tables.
The main thing that would happen if the code assumes InnoDB when MyISAM is present is the loss of Atomicity in transactions. Begin Transaction, commit and rollback operations would just get ignored. When everything goes "smoothly" the behavior would be the same in both instances. However, when something goes wrong, InnoDB won't make changes to the database, while MyISAM will still make updates.
The syntax for upgrading is simple
ALTERTABLE`openemr`.`addresses`ENGINE=InnoDB;
However, if a table is inconsistent for some reason, it can fail and how to correct that problem is not something a novice user is likely to be able to know how to correct.
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
So, to fully convert, just need to:
1. Modify database.sql script (and fix the incompatible tables and code, if possible)
2. Make an script that does the Alter Table command for all the pertinent (as the UTF8 upgrade will warn that issues may come up and will recommend a backup before this)
-brady
www.open-emr.org
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
The composite primary key issue can be addressed in part by changing the primary key to just the auto increment column and adding a uniqueness constraint on the AI column and the other the behavior will be the same.
Performance will be different, (can't definitely say faster or slower). However as long as there's no table scan I don't think the difference will be significant.
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
Kevin I think the composite key issue will take a bit more study and some additional code. In lbf_data for example a given value of the auto_increment field has multiple occurrences and so cannot by itself constitute a primary key. So a different method will be needed to assign its values.
Rod
www.sunsetsystems.com
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
Maybe an approach for LBF would be to instead of using AI, use generate_id(). Move the responsibility for populating the form_id column to the php code rather than leaving it to the database.
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
More than just changing to InnoDB I have heard a lot of issues with tables itself. So whoever had those complaints should actually speak up now. One concrete issue I have seen mentioned before is the addressbook and users being in the same table.
Shameem
www.zhservices.com
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
It doesn't seem like too much of a challenge.
It looks like there are 2 places in the code, ippf_upgrade.php and interface/forms/lbf.php that do inserts.
Here's the bit in lbf
foreach ($a as $field_id => $value) {
if ($value !== '') {
if ($newid) {
$query = "INSERT INTO lbf_data " .
"( form_id, field_id, field_value ) " .
" VALUES ( '$newid', '$field_id', '$value' )";
if ($verbose) echo "<br />$query\n";
if (!$debug) sqlStatement($query);
}
else {
$query = "INSERT INTO lbf_data " .
"( field_id, field_value ) " .
" VALUES ( '$field_id', '$value' )";
if ($verbose) echo "<br />$query\n";
if (!$debug) $newid = sqlInsert($query);
}
$didone = true;
}
}
So rather than inserting the first row, getting the ID from the DB from Autoinsert, we would just generate the ID first, no need for the if statement here any more and proceed.
Then the next big thing is if we wrap this in a transaction, if one of the insert statements fails for some reason (invalid field data would probably be the most likely cause) we can get atomicity for creating a new set of LBF entries.
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
@Shameem: Users and the LBF's are two real catching points for usability.
My input on LBF tables, user tables, and users in general:
All LBF data being in one three-column table just sucks. Try doing a report off of data gathered in an LBF form and you will see what I mean. What a disaster. I am sure a real guru could pull it off, but compare it to running a report on any other form table.
I think that a non-user table SHOULD be used for addresses, or at least flag the non-users better than having a damn asterisk in front of the last name. The user data fields should also carry flags that allow differences in provider types, and have more classes of users there vs. just the ACL.
Calender users: Three types, providers with appointments for the schedule, lower level providers that provide services on the same encounter such as drawing labs, and clerical/administrative calender users who need to track their own personal schedule and work assignments.
Folks who can authorize (finalize) billing on fee sheets (not authorize encounters…two different subjects!!!)…which should just be an acl check defined in vanillaEMR. Lots of these things should NOT be typical of a "Provider" as defined in the current acl. Typically, providers that are not one-man shops should NOT be considered the next best thing to a superuser. They should have lots of data access, be the only people that can "sign off" to finalize encounters, and have nothing to do with final billing line item authorization status.
Finally, what is up with that addresses table? Would it not be better to use it for all gathered addresses, have a flag for what type of contact uses that particular address (patient, insco, pharmacy, addressbook).
…big tangent….I'll shut up.
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
I don't see what the problem is reporting from lbf_data, and have done it myself. If you have a specific case where you're having difficulty I can probably offer a suggestion or two. But we're straying off topic here, please start a new thread for further discussion in that area, or email me privately if you want to discuss how much my designs suck. Thank you.
Rod
www.sunsetsystems.com
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
The LBF table is implemented in what to database folks is known as first normal form: http://en.wikipedia.org/wiki/First_normal_form
Which is a really good thing.
This is why such a simple table can flexibly handle so many different tasks and doesn't get messy like the users and patient_data tables when "new fields" are needed.
It makes "reporting" more difficult initially, but ultimately it is more flexible too. You just have to learn more SQL beyond just simple select statements. (subqueries and joins are probably the best things to learn about first.)
The simplicity of the table means that despite it's current schema being incompatible with InnoDB it will be very simple to correct.
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
Hey Rod,
I made a mistake on this thread, so I should try to correct it here.
I'm really not saying that anything you have done "sucks", far from it. I really used inappropriate and perhaps insulting language, spoke from the layman outlook, and I sincerely hope you can accept my apology for not approaching the matter from a different viewpoint. It is obviously very clever, efficient, and provides many advantages. I am also trying not to stray off topic, just replying to the statement made by Shameem to outline the "feelings" or contextual issues of db structure and practical use.
I am really speaking from the end user point of view. In OpenEMR, which is provided with configuration options, yet rarely meets the exact needs of a clinic off-the-shelf, end-users must be able to do quite a bit of configuration and customization to use the product effectively. For those, it all comes down to documentation. Obviously, the guy who wrote it, and anyone that is experienced in doing things the"right" way may have no issues dealing with it. On the other hand, a one-paragraph explanation of how the system works (in a configuration manual) and perhaps a short example of how to report on the data in the tables might really do a lot to promote the use of the format.
With the addition of NationNotes to LBF (and other forms), I should hope that the layout based forms become the standard. I can see further improvements for configuration, such as a form sharing database export, a WYSIWYG designer tool, a report designing tool (or at least a set of functions) and perhaps some more table layout options to arrange data input or display in more elegant ways would do a heck of a lot for those "end users".
And yes, if you accept my apology, I would absolutely love to discuss the LBF format, take notes, and maybe get some sort of practical example of reporting against the lbf table.
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
Hi Art, no problem, accepted. Would be great if you can write some documentation and examples, and I'll be happy to answer whatever questions I can.
Kevin, that's interesting. Not sure sequence_no first is what we want though, does it restart at 1 for each new encounter? I so then I guess we could do it that way, but also a new index would be needed on (pid, encounter) to maintain efficiency.
Rod
www.sunsetsystems.com
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
Rod,
I agree with you that re-ordering the primary key will impact performance. I'm not sure what happens with each new encounter. An explanation from Shameem's team (who wrote all of this code) might be helpful.
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
For upgrading existing systems
We can determine the list of tables that need to be updated with a query like this:
SELECT TABLE_NAME, Engine FROM information_schema.TABLES where TABLE_SCHEMA = 'openemr' and ENGINE = 'MyISAM';
Also, another way to upgrade is to do a MySQL dump, then search and replace MyISAM for InnoDB in the dump file, then load the modified dump into a database. That way people can do it without needing to risk their original database at all.
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
One of the first things that should happen with respect to database restructuring is changing tables to InnoDB to allow for transaction support and referential integrity as well as potential performance advantages of using InnoDB.
On my own systems, I did a grep of database.sql and changed all references to MyISAM to InnoDB before creating the database.
Almost all of the tables had no problems with this, the only exception I can remember at the moment is ar_activity which is not valid in InnoDB because it has a composite primary key which includes an auto increment field.
Fortunately InnoDB vs. MyISAM is not an all or nothing proposition. The issue is should we start switching tables over? And if so how aggressively should we go about it?
Maybe as an initial "decry" we should make all new tables be InnoDB rather than MyISAM, if we aren't ready to switch all tables (where possible) in database.sql InnoDB.
It is also fortunately possible to tell MySQL to switch an already running databases tables to InnoDB format with an alter table statement or in a tool such as mysql workbench.
Maybe we should start by converting a few key tables (forms, form_encounter, billing) to InnoDB, and potentially adding foreign key constraint relationships between them.
I am in favor of moving forward.
I think we should set database.sql to default to INNODB for all NEW installs.
I think we should provide an optional sql_upgrade script choice/or separate script for updating an existing DB from MyISAM to INNODB. That script should do a mysqldump before the update just to protect the people who fail to backup for themselves.
ar_activity can be left alone for now or we can get ZHH to fix it as part of the process.
-Tony
www.mi-squared.com
I think a switch to InnoDB for most tables just needs adequate testing.
I like that composite auto increment feature though. lbf_data uses it also. Too bad InnoDB doesn't support it.
Rod
www.sunsetsystems.com
Hi,
Shouldn't we force upgraders to also use InnoDB via the normal upgrade script? Otherwise, seems like there will be issues that will arise in the future if code is written assuming InnoDB is being used. Would it make sense to test out a complete conversion(except for the tables we know will break) in both the database.sql and upgrade script also (in upgrade, would only want to Alter tables that are not yet INNODB, so may need to make a function that checks this first).
-brady
www.open-emr.org
It probably makes sense to provide an easy mechanism to upgrade existing databases to InnoDB table. However, it won't be a disaster for new code to execute against MyISAM tables.
The main thing that would happen if the code assumes InnoDB when MyISAM is present is the loss of Atomicity in transactions. Begin Transaction, commit and rollback operations would just get ignored. When everything goes "smoothly" the behavior would be the same in both instances. However, when something goes wrong, InnoDB won't make changes to the database, while MyISAM will still make updates.
The syntax for upgrading is simple
However, if a table is inconsistent for some reason, it can fail and how to correct that problem is not something a novice user is likely to be able to know how to correct.
Hi,
So we could treat this like the UTF-8 mysql conversion then where it was optional for upgraders:
http://open-emr.org/wiki/index.php/OpenEMR_UTF-8_Upgrade_Howto
So, to fully convert, just need to:
1. Modify database.sql script (and fix the incompatible tables and code, if possible)
2. Make an script that does the Alter Table command for all the pertinent (as the UTF8 upgrade will warn that issues may come up and will recommend a backup before this)
-brady
www.open-emr.org
Just as a reference, at least 75% of all MySQL deployments uses InnoDB for the storage engine. Highly recommended to switch.
Also, I think you run v 5.0.x now as default? Why not upgrade to 5.5 as well and take advantage of performance improvements and bug fixes?
Thanks
Ulf
The composite primary key issue can be addressed in part by changing the primary key to just the auto increment column and adding a uniqueness constraint on the AI column and the other the behavior will be the same.
Performance will be different, (can't definitely say faster or slower). However as long as there's no table scan I don't think the difference will be significant.
Kevin I think the composite key issue will take a bit more study and some additional code. In lbf_data for example a given value of the auto_increment field has multiple occurrences and so cannot by itself constitute a primary key. So a different method will be needed to assign its values.
Rod
www.sunsetsystems.com
Maybe an approach for LBF would be to instead of using AI, use generate_id(). Move the responsibility for populating the form_id column to the php code rather than leaving it to the database.
Yep, something like that.
Rod
www.sunsetsystems.com
More than just changing to InnoDB I have heard a lot of issues with tables itself. So whoever had those complaints should actually speak up now. One concrete issue I have seen mentioned before is the addressbook and users being in the same table.
Shameem
www.zhservices.com
It doesn't seem like too much of a challenge.
It looks like there are 2 places in the code, ippf_upgrade.php and interface/forms/lbf.php that do inserts.
Here's the bit in lbf
So rather than inserting the first row, getting the ID from the DB from Autoinsert, we would just generate the ID first, no need for the if statement here any more and proceed.
Then the next big thing is if we wrap this in a transaction, if one of the insert statements fails for some reason (invalid field data would probably be the most likely cause) we can get atomicity for creating a new set of LBF entries.
Ooops. I actually pasted the ippf_upgrade block instead of the lbf… but it looks similar
Here's LBF's
@Shameem: Users and the LBF's are two real catching points for usability.
My input on LBF tables, user tables, and users in general:
All LBF data being in one three-column table just sucks. Try doing a report off of data gathered in an LBF form and you will see what I mean. What a disaster. I am sure a real guru could pull it off, but compare it to running a report on any other form table.
I think that a non-user table SHOULD be used for addresses, or at least flag the non-users better than having a damn asterisk in front of the last name. The user data fields should also carry flags that allow differences in provider types, and have more classes of users there vs. just the ACL.
Calender users: Three types, providers with appointments for the schedule, lower level providers that provide services on the same encounter such as drawing labs, and clerical/administrative calender users who need to track their own personal schedule and work assignments.
Folks who can authorize (finalize) billing on fee sheets (not authorize encounters…two different subjects!!!)…which should just be an acl check defined in vanillaEMR. Lots of these things should NOT be typical of a "Provider" as defined in the current acl. Typically, providers that are not one-man shops should NOT be considered the next best thing to a superuser. They should have lots of data access, be the only people that can "sign off" to finalize encounters, and have nothing to do with final billing line item authorization status.
Finally, what is up with that addresses table? Would it not be better to use it for all gathered addresses, have a flag for what type of contact uses that particular address (patient, insco, pharmacy, addressbook).
…big tangent….I'll shut up.
I don't see what the problem is reporting from lbf_data, and have done it myself. If you have a specific case where you're having difficulty I can probably offer a suggestion or two. But we're straying off topic here, please start a new thread for further discussion in that area, or email me privately if you want to discuss how much my designs suck. Thank you.
Rod
www.sunsetsystems.com
The LBF table is implemented in what to database folks is known as first normal form:
http://en.wikipedia.org/wiki/First_normal_form
Which is a really good thing.
This is why such a simple table can flexibly handle so many different tasks and doesn't get messy like the users and patient_data tables when "new fields" are needed.
It makes "reporting" more difficult initially, but ultimately it is more flexible too. You just have to learn more SQL beyond just simple select statements. (subqueries and joins are probably the best things to learn about first.)
The simplicity of the table means that despite it's current schema being incompatible with InnoDB it will be very simple to correct.
Hey Rod,
I made a mistake on this thread, so I should try to correct it here.
I'm really not saying that anything you have done "sucks", far from it. I really used inappropriate and perhaps insulting language, spoke from the layman outlook, and I sincerely hope you can accept my apology for not approaching the matter from a different viewpoint. It is obviously very clever, efficient, and provides many advantages. I am also trying not to stray off topic, just replying to the statement made by Shameem to outline the "feelings" or contextual issues of db structure and practical use.
I am really speaking from the end user point of view. In OpenEMR, which is provided with configuration options, yet rarely meets the exact needs of a clinic off-the-shelf, end-users must be able to do quite a bit of configuration and customization to use the product effectively. For those, it all comes down to documentation. Obviously, the guy who wrote it, and anyone that is experienced in doing things the"right" way may have no issues dealing with it. On the other hand, a one-paragraph explanation of how the system works (in a configuration manual) and perhaps a short example of how to report on the data in the tables might really do a lot to promote the use of the format.
With the addition of NationNotes to LBF (and other forms), I should hope that the layout based forms become the standard. I can see further improvements for configuration, such as a form sharing database export, a WYSIWYG designer tool, a report designing tool (or at least a set of functions) and perhaps some more table layout options to arrange data input or display in more elegant ways would do a heck of a lot for those "end users".
And yes, if you accept my apology, I would absolutely love to discuss the LBF format, take notes, and maybe get some sort of practical example of reporting against the lbf table.
Interesting,
Changing LBF to InnoDB without any modifications actually works as is.
Turns out If the auto insert column is the first part of the primary key InnoDB doesn't complain.
Reordering the PK in ar_activity so that sequence_no comes first instead of third in the pk makes it accept InnoDB as the engine.
Hi Art, no problem, accepted. Would be great if you can write some documentation and examples, and I'll be happy to answer whatever questions I can.
Kevin, that's interesting. Not sure sequence_no first is what we want though, does it restart at 1 for each new encounter? I so then I guess we could do it that way, but also a new index would be needed on (pid, encounter) to maintain efficiency.
Rod
www.sunsetsystems.com
Rod,
I agree with you that re-ordering the primary key will impact performance. I'm not sure what happens with each new encounter. An explanation from Shameem's team (who wrote all of this code) might be helpful.
The table "claims" has a similar PK/AI arrangement to ar_activity and doesn't convert out of box.
For upgrading existing systems
We can determine the list of tables that need to be updated with a query like this:
SELECT TABLE_NAME, Engine FROM information_schema.TABLES where TABLE_SCHEMA = 'openemr' and ENGINE = 'MyISAM';
Also of note:
http://dev.mysql.com/doc/refman/5.5/en/alter-table.html
When you specify an ENGINE clause, ALTER TABLE rebuilds the table. This is true even if the table already has the specified storage engine.
Also, another way to upgrade is to do a MySQL dump, then search and replace MyISAM for InnoDB in the dump file, then load the modified dump into a database. That way people can do it without needing to risk their original database at all.
Hi,
Will the information_schema stuff require mysql root access privileges?
Is another option to just document how to use the embedded phpmyadmin gui to do this (looks pretty straighforward):
http://richardcummings.info/convert-myisam-to-innodb-using-phpmyadmin/
-brady
www.open-emr.org