#608 java.lang.IndexOutOfBoundsException during strict tag validation

3.0
closed-fixed
None
5
2013-09-19
2013-07-23
Guido Leenders
No

Call stack:
13281: Error: java.lang.IndexOutOfBoundsException: Index: 1, Size: 1
13281: Error: at java.util.ArrayList.rangeCheck(ArrayList.java:604)
13281: Error: at java.util.ArrayList.get(ArrayList.java:382)
13281: Error: at org.omegat.gui.tagvalidation.TagValidationTool.inspectOmegaTTags(TagValidationTool.java:383)
13281: Error: at org.omegat.gui.tagvalidation.TagValidationTool.checkEntry(TagValidationTool.java:245)
13281: Error: at org.omegat.gui.tagvalidation.TagValidationTool.listInvalidTags(TagValidationTool.java:190)
13281: Error: at org.omegat.Main.validateTagsConsoleMode(Main.java:323)
13281: Error: at org.omegat.Main.runConsoleTranslate(Main.java:282)
13281: Error: at org.omegat.Main.main(Main.java:180)

Discussion

  • Guido Leenders
    2013-07-23

    Analysing. This is caused by the custom tag validation pattern

    <tagValidation_customPattern>(&amp;[0-9])|(\{res:[^}]*\})</tagValidation_customPattern>
    

    and translations such as this one:

    <tu>
      <tuv lang="EN-US">
        <seg>Please choose a different {res:bubs_task_lc} or change {res:bubs_task_lc} status.</seg>
      </tuv>
      <tuv lang="NL-NL" changeid="dedicated @ JEN (JEN)" changedate="20120525T163359Z" creationid="dedicated @ JEN (JEN)" creationdate="20120525T163359Z">
        <seg>Kies alsjeblieft een andere {res:bubs_task_lc} of wijzig processtatus.</seg>
      </tuv>
    </tu>
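A minimal sketch of why this segment pair produces tag lists of different lengths. It assumes the custom pattern is applied as a plain Java regular expression after the XML entity `&amp;` is decoded to `&`; the class `CustomTagExtractor` and the method `extractTags` are hypothetical names for illustration, not OmegaT code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of OmegaT: lists custom-pattern matches in order of appearance.
final class CustomTagExtractor {
    // The pattern from the ticket, with &amp; decoded to & (assumption about how it is loaded).
    static final Pattern CUSTOM_TAG = Pattern.compile("(&[0-9])|(\\{res:[^}]*\\})");

    static List<String> extractTags(String segment) {
        List<String> tags = new ArrayList<>();
        Matcher m = CUSTOM_TAG.matcher(segment);
        while (m.find()) {
            tags.add(m.group());
        }
        return tags;
    }

    public static void main(String[] args) {
        String source = "Please choose a different {res:bubs_task_lc} or change {res:bubs_task_lc} status.";
        String target = "Kies alsjeblieft een andere {res:bubs_task_lc} of wijzig processtatus.";
        System.out.println(extractTags(source)); // the tag occurs twice in the source
        System.out.println(extractTags(target)); // and once in the translation
    }
}
```

If the validator builds its lists this way, the source list has two entries and the translation list has one, which would be consistent with the "Index: 1, Size: 1" in the call stack.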
    
     
  • Guido Leenders
    2013-07-23

    For example, the input:

    After applying the payment schedule '&1' on the {res:bubs_task_lc} unit with ID '&2' there was a difference of &3 between the original invoiceable number of units and the sum of the added {res:bubs_task_lc} units.

    with translation:

    Na toepassen van het betalingsschema '&1' op de {res:bubs_task_lc}unit met ID '&2' was er een verschil van &3 tussen het origineel aantal facturabele units en de som van de toegevoegde units.

    Results in commonTagsSrc being equal to:

    [&1, {res:bubs_task_lc}, &2, &3, {res:bubs_task_lc}]

    but commonTagsLoc to:

    [&1, {res:bubs_task_lc}, &2, &3]
    

    This difference is correct, since the source contains one extra tag. But the loop in TagValidationTool.inspectOmegaTTags (TagValidationTool.java) reads:

                for (int i = 0; i < commonTagsSrc.size(); i++) {
                    String tag = commonTagsLoc.get(i);
    

    So it expects commonTagsLoc to have the same number of tags as the source, and commonTagsLoc.get(i) throws as soon as i reaches commonTagsLoc.size().
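The failure can be reproduced outside OmegaT with the two lists quoted above. This is a simplified stand-alone sketch: only the list names are taken from the ticket, the class `MismatchedLoopDemo` and the method `firstFailingIndex` are hypothetical, and the loop keeps just the unguarded `get(i)` access.

```java
import java.util.Arrays;
import java.util.List;

// Stand-alone reproduction, not OmegaT code.
final class MismatchedLoopDemo {
    // Returns the index at which the unguarded access fails, or -1 if the loop completes.
    static int firstFailingIndex(List<String> commonTagsSrc, List<String> commonTagsLoc) {
        for (int i = 0; i < commonTagsSrc.size(); i++) {
            try {
                commonTagsLoc.get(i); // same access as in the quoted loop
            } catch (IndexOutOfBoundsException e) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        List<String> src = Arrays.asList("&1", "{res:bubs_task_lc}", "&2", "&3", "{res:bubs_task_lc}");
        List<String> loc = Arrays.asList("&1", "{res:bubs_task_lc}", "&2", "&3");
        System.out.println(firstFailingIndex(src, loc)); // prints 4
    }
}
```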

     
  • Guido Leenders
    2013-07-23

    Suggested change, old code:

            for (int i = 0; i < commonTagsSrc.size(); i++) {
                String tag = commonTagsLoc.get(i);
                if (!tag.equals(commonTagsSrc.get(i))) {
                    report.transErrors.put(tag, TagError.ORDER);
                    commonTagsSrc.remove(tag);
                    commonTagsLoc.remove(i);
                    i--;
                }
            }
    

    into

            for (int i = 0; i < commonTagsSrc.size(); i++) {
                String tag;
                boolean err;
                if (i >= commonTagsLoc.size())
                {
                  tag = "Missing source tag in translation: " + commonTagsSrc.get(i);
                  // This should probably be done some other way.
                  // The code below commented with "Check source tags..." does not detect
                  // cases where the translated list of tags contains duplicates.
                  // Also, somehow the tag validation report is not printed.
                  err = true;
                }
                else
                {
                  tag = commonTagsLoc.get(i);
                  err = !tag.equals(commonTagsSrc.get(i));
                  commonTagsLoc.remove(i);
                  i--;
                }
                if (err) {
                    report.transErrors.put(tag, TagError.ORDER);
                    commonTagsSrc.remove(tag);
                }
            }
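As a point of comparison, here is a stand-alone sketch of a bounds-guarded order check. It is not the proposal above and not the fix that was eventually committed: the class `GuardedOrderCheck`, the enum `Problem`, and the `MISSING` category are hypothetical. Unlike the proposal quoted above, which removes the translation entry on every pass, this variant removes entries only when the tags at position i differ, as the old code did.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: names and error categories are assumptions, not OmegaT's API.
final class GuardedOrderCheck {
    enum Problem { ORDER, MISSING }

    static Map<String, Problem> check(List<String> commonTagsSrc, List<String> commonTagsLoc) {
        // Work on copies so the caller's lists are left untouched.
        List<String> src = new ArrayList<>(commonTagsSrc);
        List<String> loc = new ArrayList<>(commonTagsLoc);
        Map<String, Problem> errors = new LinkedHashMap<>();
        for (int i = 0; i < src.size(); i++) {
            if (i >= loc.size()) {
                // The translation has run out of tags: report instead of calling loc.get(i).
                errors.put(src.get(i), Problem.MISSING);
                continue;
            }
            String tag = loc.get(i);
            if (!tag.equals(src.get(i))) {
                errors.put(tag, Problem.ORDER);
                src.remove(tag); // remove(Object): drops the first occurrence, if any
                loc.remove(i);   // remove(int): drops the entry at position i
                i--;             // re-examine the same position
            }
        }
        return errors;
    }

    public static void main(String[] args) {
        List<String> src = Arrays.asList("&1", "{res:bubs_task_lc}", "&2", "&3", "{res:bubs_task_lc}");
        List<String> loc = Arrays.asList("&1", "{res:bubs_task_lc}", "&2", "&3");
        System.out.println(check(src, loc)); // prints {{res:bubs_task_lc}=MISSING}
    }
}
```

The loop terminates because each pass either advances i or shortens the translation list.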
    
     
  • Hi guys. Sorry for the delay.

    The fundamental problem here is that I never dealt properly with the issue of having multiple identical tags. I'm working on a fix now.

    After the fix, any tag that appears a different number of times between the source and target text will be flagged as a "number mismatch". This will happen regardless of the "loose ordering" option. If anyone wants a "loose numbering" option then that will have to be a separate feature, as order and number are separate concerns.
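The "number mismatch" idea described above can be sketched by counting occurrences on each side and flagging every tag whose counts differ. This is only an illustration of the concept under that reading; `TagCountCheck` and its methods are hypothetical names, not the code that was planned or committed.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.Set;
import java.util.TreeSet;

// Illustrative only, not OmegaT code.
final class TagCountCheck {
    static Map<String, Integer> count(List<String> tags) {
        Map<String, Integer> counts = new HashMap<>();
        for (String tag : tags) {
            counts.merge(tag, 1, Integer::sum);
        }
        return counts;
    }

    // Tags that appear a different number of times in source and translation.
    // A tag present on one side only counts as a mismatch too (absent vs. n).
    static Set<String> numberMismatches(List<String> src, List<String> loc) {
        Map<String, Integer> srcCounts = count(src);
        Map<String, Integer> locCounts = count(loc);
        Set<String> all = new TreeSet<>(srcCounts.keySet());
        all.addAll(locCounts.keySet());
        Set<String> mismatched = new TreeSet<>();
        for (String tag : all) {
            if (!Objects.equals(srcCounts.get(tag), locCounts.get(tag))) {
                mismatched.add(tag);
            }
        }
        return mismatched;
    }

    public static void main(String[] args) {
        List<String> src = Arrays.asList("&1", "{res:bubs_task_lc}", "&2", "&3", "{res:bubs_task_lc}");
        List<String> loc = Arrays.asList("&1", "{res:bubs_task_lc}", "&2", "&3");
        System.out.println(numberMismatches(src, loc)); // prints [{res:bubs_task_lc}]
    }
}
```

Because the check works on counts alone, its result does not depend on tag order, which matches the remark that order and number are separate concerns.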

     
  • Didier Briel
    2013-07-26

    • assigned_to: Aaron Madlon-Kay
     
  • Actually, forget that last comment. I have a simpler solution: custom tags will be treated the same as Java MessageFormat tags, which are checked only for missing and extraneous tags (the number of tags is not enforced).

    I'm just writing some much-needed unit tests to nail down the tag validation and repair behavior; I'll commit in a day or two.
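A check for missing and extraneous tags that ignores how often each tag occurs can be sketched with set differences. This is an assumption about what such a check could look like, not the code committed for this ticket; `PresenceOnlyCheck` and its methods are hypothetical names.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Illustrative only, not OmegaT code.
final class PresenceOnlyCheck {
    // Tags found in the source but nowhere in the translation.
    static Set<String> missing(List<String> src, List<String> loc) {
        Set<String> result = new LinkedHashSet<>(src);
        result.removeAll(loc);
        return result;
    }

    // Tags found in the translation but nowhere in the source.
    static Set<String> extraneous(List<String> src, List<String> loc) {
        Set<String> result = new LinkedHashSet<>(loc);
        result.removeAll(src);
        return result;
    }

    public static void main(String[] args) {
        List<String> src = Arrays.asList("&1", "{res:bubs_task_lc}", "&2", "&3", "{res:bubs_task_lc}");
        List<String> loc = Arrays.asList("&1", "{res:bubs_task_lc}", "&2", "&3");
        System.out.println(missing(src, loc));    // prints []
        System.out.println(extraneous(src, loc)); // prints []
    }
}
```

Under this kind of check, the segment from the ticket passes: every source tag appears at least once in the translation, and the translation adds no tags of its own.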

     
  • Didier Briel
    2013-07-29

    • Group: SVN --> 3.0
     
  • Fixed in r5561.

     
  • Didier Briel
    2013-07-31

    • status: open --> open-fixed
     
  • Didier Briel
    2013-07-31

    Fixed in SVN (/trunk).

    Didier

     
  • Didier Briel
    2013-09-19

    • status: open-fixed --> closed-fixed
     
  • Didier Briel
    2013-09-19

    Fixed in the released version 3.0.4 update 2.

    Didier