Menu

lang_constants constant_name not unique?

Developers
Kevin Yeh
2014-04-16
2014-04-18
  • Kevin Yeh

    Kevin Yeh - 2014-04-16

    Is there a reason why we don't have a uniqueness index on the constant_name in lang_constants?

    It seems like the behavior of xl would be inconsistent if more than one entry exists for the same constant_name.

     
    • Pieter W

      Pieter W - 2014-04-16

      I have been involved for a long time in Translations for OpenEMR.

      Can you give a real world example for uniqueness? I do not see in translations spreadsheet and connections-software related what need to be changed.

       
      • Kevin Yeh

        Kevin Yeh - 2014-04-16

        I'm proposing something akin to

        ALTER TABLE lang_constants DROP INDEX cons_name , ADD UNIQUE INDEX cons_name (constant_name ASC);

        This is an issue with regards to the database representation of information, and primarily concerns when people add custom/additional translation.

        Adding a uniqueness constraint to the database would provide an additional check to make sure that duplicate constant's don't accidentally get defined when definitions are added using the interface in "Administration > Other > Language".

        If duplicates entries exist, then the results of the query in translations.inc.php

        $sql="SELECT * FROM lang_definitions JOIN lang_constants ON " .
        "lang_definitions.cons_id = lang_constants.cons_id WHERE " .
        "lang_id=? AND constant_name = ? LIMIT 1";

        Could be unpredictable.

        However with regards to the spreadsheet, when just the spreadsheet is loaded, all of the constants "should" be unique. However, if for some reason the same constant had two rows in the spreadsheet (error with the perl script somewhere, manual edits that were done incorrectly...) This extra constraint would lead to an error being generated when the spreadsheet was loaded, rather than the spreadsheet being loaded and translations potentially being unpredictable.

         
        • Rod Roark

          Rod Roark - 2014-04-16

          Be sure to handle the case of upgrading a database where cons_name has non-unique values for some reason. You don't want the upgrade script to fail.

          Rod
          http://www.sunsetsystems.com/

           
          • Kevin Yeh

            Kevin Yeh - 2014-04-16

            Except if there are non-unique constants, then there isn't really a good "automatic" way to resolve conflicts.

             
            • Rod Roark

              Rod Roark - 2014-04-16

              I think you might be able to with a couple of messy queries using subqueries. One to update lang_definitions.cons_id to point to just one of the duplicates, and another to remove the duplicated constants. I'd have to play with it for a bit.

              Then you might be left with some duplicated (lang_id, cons_id) pairs in the lang_definitions table, for which the same problem of possible duplicates exists.

              Nothing is easy. :)

              Rod
              http://www.sunsetsystems.com/

               
              • Kevin Yeh

                Kevin Yeh - 2014-04-16

                Except there may also be conflict resolution needed for when multiple lang_definitions exist that might map to the same constant_name with different definitions. Which definition should win?

                I am certain that appropriate queries are possible, but to make them "one-size fits all" would be difficult.

                 
                • Rod Roark

                  Rod Roark - 2014-04-16

                  It doesn't really matter which one wins, since the translation lookup will not have a predictable result anyway. Throwing either one away will be better than doing nothing.

                  Yes it's kinda messy but I suspect do-able. Main thing is to make sure upgrading is robust.

                  Rod
                  http://www.sunsetsystems.com/

                   
        • Pieter W

          Pieter W - 2014-04-16

          What I understand is that there might be included duplicates in the spreadsheet. As far as I made translations, the only duplicates in a rows are the ones with a " " space at the end of a Constant and so also in Definitions. But translations wise for the used scripts or I can't be of any help. For manual added translations, as long as a sentence/constant is translated there is no need for uniqueness. The translated words represent the Constant and the constant is used for interactions and not the translations. Correct me if I am wrong.

          If by manual you mean to xl(..) translation characters as a part of the software this extra constant will be included or not included if needed due to not yet included in the Constant column.

           
          • Kevin Yeh

            Kevin Yeh - 2014-04-16

            You are wrong.
            Both the constant and the definition are used in the lookup process.

             
            • Pieter W

              Pieter W - 2014-04-16

              You having said this, I am out! Please let's hope other developers will jump in to support the idea of uniqueness. I did translate some different constants but were definitely in luck not to encounter problems due to uniqueness.

               
              • Kevin Yeh

                Kevin Yeh - 2014-04-16

                Pieter, thanks for your interest and desire to help.

                The basic issue in my mind is that the table should have been enforcing uniqueness all along. If all that people are using is just the spreadsheet, this is a non-issue, but if someone is trying to use definitions from additional sources, there may be a problem.

                 
                • Pieter W

                  Pieter W - 2014-04-16

                  The only manual problem that exists than is your additional translations through Administration => Others -> Translations etc....

                  But these new translations have to be done in the software ...php ...txt(?) ....sql files, etc. and after the xl(...) is added give it the translation definition. This is like creating a personal local translation-file.

                  BTW for translation there is used: xl(..) xl t(...) xl a(...) I did not check for completeness, but there seemed to be a reason for this different translation options.

                   
  • Rod Roark

    Rod Roark - 2014-04-16

    I can't think of one. Uniqueness should certainly be enforced one way or another.

    Rod
    http://www.sunsetsystems.com/

     
    • Kevin Yeh

      Kevin Yeh - 2014-04-16

      I'm guessing that the perl scripts work to maintain uniqueness. Changing the index to unique would provide additional "protection" when manual edits are made

       
  • Brady Miller

    Brady Miller - 2014-04-17

    A very important item I forget to mention here; the change to database.sql is actually going to do nothing(it's good to do for clarity sake), since when it brings in the translation set on install, tables will get replaced by:
    https://github.com/openemr/openemr/blob/master/contrib/util/language_translations/currentLanguage_utf8.sql
    So, for testing, remember to change it there.
    -brady

     

    Last edit: Brady Miller 2014-04-17
  • Brady Miller

    Brady Miller - 2014-04-17

    Also,
    While looking at the tables, doesn't it seem odd that constant name is limited to 255 while the definition is a mediumtext?
    -brady

     
    • Kevin Yeh

      Kevin Yeh - 2014-04-17

      VARCHAR(255) or even a longer length varchar will have better indexing/lookup performance than MEDIUMTEXT.

      Validation that no existing constants are longer than 255 would probably be a good thing to do.

       
  • Pieter W

    Pieter W - 2014-04-17

    Indeed this is odd. How many Chars is mediumtext? It could explain this difference in layout of the screens using a translation. Sometimes it is very hard to find a translation with equal or less char than the original text for a translation, but in can't be that a translation is much longer than the original.

     
  • Brady Miller

    Brady Miller - 2014-04-18

    To show an example, here is a entry that is 425 characters long:
    "Alternatively, you may use the search page to upload an electronic remittance (X12 835) file that you have obtained from your payer or clearinghouse. You can do this by clicking the Browse button and selecting the file to upload, and then clicking Search to perform the upload and display the corresponding invoices. In this case the other parameters mentioned above do not apply and will be ignored. Uploading saves the file but does not yet process its contents -- that is done separately as described below."

    Note that it won't break for users that use the standard translations or work in the standard translation spreadsheet and the first 255 characters are enough to make it unique. But will break for users that do local translations since they will not see the entire constant there. Perhaps we should increase the size of the VARCHAR here (agree should not go to a MEDIUMTEXT since then lose index/performance).

    -brady
    OpenEMR

     

Log in to post a comment.