From: Ian V. <Ian...@he...> - 2013-09-04 22:28:56
|
I grabbed this code from the online source. I am pretty sure it is from HAPI 2.1, although I think I checked 2.0 and found this routine had not changed.

As to a pluggable version: if the proposal that "a single escape character on its own does not constitute an escape sequence and should not be removed from the message" is accepted, I don't need a pluggable version. If, however, single escape characters continue to be removed from messages in such a way that it cannot be determined that this has occurred, then I will need a pluggable version. The trickiest part from my end is that it's gone, and I can't tell. In nearly every other case I have encountered where delimiter characters are not escaped (and I have seen them all), there has been some way to tell (except maybe the field delimiter, that one sucks to figure out).

In general I would recommend that there be test cases around the place (including bad behaviours) that decode then encode a message, segment or field, and compare the input to the output. If they don't match, try to figure out if it is possible to make them match. Some cases may indicate that a bad sender can really stuff things up, and there is no fixing it. These cases may be wise to mention in the doco.

I have a substantial amount of compensation code for various fields and data types to try to help with unescaped delimiters, principally to get the field to look like the data on the sender's application screen, and to pass it on correctly escaped.

We seem to have a LOT of badly behaved vendors, and apparently no power to make them fix it. There are some big names on the list too.

Thanks for your consideration.

Ian

>>> Christian Ohr <chr...@gm...> 05/09/13 2:08 >>>

Yes, proper escaping is an endless source of joy ;-)

What HAPI version are you using, btw? I'll take a look at your code in the next few days if time permits. When modifying existing functionality, we always face the problem of backwards compatibility.
So for the past one or two releases, we have added ways to plug in custom strategies while keeping the default, rather than changing existing behavior. So far, Escape is unfortunately very static, but for 2.2 we can think about making the escaping strategy pluggable just like other things in HAPI.

Thoughts?

cheers
Christian

2013/9/4 Ian Vowles <Ian...@he...>

I have sent mails to the general list about this issue before, and the advice has helped me progress. Then along comes another system that has slightly different behaviour.

In this particular case a system correctly escapes the HL7 delimiters EXCEPT the escape delimiter. This allows it to send field content like this (from an address):

    1 \ 24 Smith \T\ Wesson Road

I was hopeful that, since the single escape on its own didn't form part of an escape sequence, it might be preserved through the parse. This is not the case. The lone backslash is consumed in the process and disappears. I don't know how valid an argument it is to say it should be preserved, but if it isn't, I can't subsequently escape it properly to send to a downstream system.

Given that I had been dealing with HL7 for some time before I found HAPI, I had previously done some work on an encode / unencode routine. My own code couldn't cope with this one either. I decided it was time to be brave and dive into the HAPI code. Somewhere there had to be low-level encode / unencode routines. Up until I looked in the source, I had been creating a new ST object and using its parse and encode methods. Once I looked into the source I found the Escape class.

This updated version of Escape does the following:

- Preserves escape characters that do not form part of an escape sequence
- Permits the exceptional escape sequence case of \X000d\ to work when the escape character has been changed to something other than \
- Adds extra HEX escaped codes \X0D\ and \X0A\ because we see them here occasionally.
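The first behaviour in the list above (a lone escape character survives unescaping) can be illustrated with a much smaller routine than the full class. This is only a sketch under stated assumptions: the class name `LoneEscapeSketch` is hypothetical, it is not HAPI code, it hard-codes the conventional delimiters |^~\& and it ignores the formatting and hexadecimal sequences entirely.

```java
public class LoneEscapeSketch {

    // Assumption: conventional HL7 delimiters. A real parser would take
    // these from the message header instead of hard-coding them.
    private static final String CODES = "FSRET";
    private static final String CHARS = "|^~\\&";

    /**
     * Unescapes \F\ \S\ \R\ \E\ \T\ and copies every other character,
     * including a backslash that is not part of such a sequence, unchanged.
     */
    public static String unescape(String text) {
        StringBuilder out = new StringBuilder(text.length());
        int i = 0;
        while (i < text.length()) {
            char c = text.charAt(i);
            // A delimiter escape is exactly three characters: \ code \
            if (c == '\\' && i + 2 < text.length() && text.charAt(i + 2) == '\\') {
                int idx = CODES.indexOf(text.charAt(i + 1));
                if (idx >= 0) {
                    out.append(CHARS.charAt(idx));
                    i += 3;
                    continue;
                }
            }
            // Ordinary character, or a lone escape character: keep it.
            out.append(c);
            i++;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The address from the example: the lone backslash is kept,
        // while \T\ becomes the sub-component delimiter &.
        System.out.println(unescape("1 \\ 24 Smith \\T\\ Wesson Road"));
    }
}
```

Because the lone backslash is still present after unescaping, a later escape step can turn it into \E\ for the downstream system, which is the whole point of preserving it.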
Test case code is also included at the bottom, including my now infamous "HATER" example :-). Test cases with lots of > < are there because we often do transforms between HL7 and XML, so we often look at these in additional test cases of the XML output produced.

What are my chances of this being adopted? If not, how can I get my version to override the existing one?

Thanks
Ian

----------

/**
The contents of this file are subject to the Mozilla Public License Version 1.1
(the "License"); you may not use this file except in compliance with the License.
You may obtain a copy of the License at http://www.mozilla.org/MPL/
Software distributed under the License is distributed on an "AS IS" basis,
WITHOUT WARRANTY OF ANY KIND, either express or implied. See the License for the
specific language governing rights and limitations under the License.

The Original Code is "Escape.java". Description:
"Handles "escaping" and "unescaping" of text according to the HL7 escape sequence rules
defined in section 2.10 of the standard (version 2.4)"

The Initial Developer of the Original Code is University Health Network. Copyright (C)
2001. All Rights Reserved.

Contributor(s): Mark Lee (Skeva Technologies); Elmar Hinz

Alternatively, the contents of this file may be used under the terms of the
GNU General Public License (the "GPL"), in which case the provisions of the GPL are
applicable instead of those above. If you wish to allow use of your version of this
file only under the terms of the GPL and not to allow others to use your version
of this file under the MPL, indicate your decision by deleting the provisions above
and replace them with the notice and other provisions required by the GPL License.
If you do not delete the provisions above, a recipient may use your version of
this file under either the MPL or the GPL.
*/
package ca.uhn.hl7v2.parser;

/**
 * Handles "escaping" and "unescaping" of text according to the HL7 escape
 * sequence rules defined in section 2.10 of the standard (version 2.4).
 * Currently, escape sequences for multiple character sets are unsupported. The
 * highlighting and locally defined escape sequences are also
 * unsupported.
 * The only hexadecimal escapes supported are X000d, X0D, X0A
 *
 * @author Bryan Tripp
 * @author Mark Lee (Skeva Technologies)
 * @author Elmar Hinz
 * @author Christian Ohr
 */
public class Hl7Escape {

    /** Creates a new instance of Hl7Escape */
    public Hl7Escape() {
    }

    /**
     * @param text string to be escaped
     * @return the escaped string
     * <p>Defaults the escape characters to the conventional values |^~\&
     */
    public static String escape(String text) {
        return escape(text, "|^~\\&");
    }

    /**
     * @param text string to be escaped
     * @param encChars encoding characters to be used in the order
     * <br>Field, Component, Repetition, Escape, Sub-component
     * @return the escaped string
     */
    public static String escape(String text, String encChars) {
        EncLookup esc = getEscapeSequences(encChars);
        int textLength = text.length();

        StringBuilder result = new StringBuilder(textLength);
        for (int i = 0; i < textLength; i++) {
            boolean charReplaced = false;
            char c = text.charAt(i);

            // Only the first six entries are escaped on output: the five
            // delimiters and '\r' (as X000d).
            FORENCCHARS: for (int j = 0; j < 6; j++) {
                if (c == esc.characters[j]) {

                    // Formatting escape sequences such as \.br\ should be left alone.
                    // The escape character sits at index 3 of the encoding characters.
                    if (j == 3) {
                        if (i + 1 < textLength) {
                            // Check for \.br\
                            char nextChar = text.charAt(i + 1);
                            switch (nextChar) {
                            case '.':
                            case 'C':
                            case 'M':
                            case 'X':
                            case 'Z': {
                                int nextEscapeIndex = text.indexOf(esc.characters[j], i + 1);
                                if (nextEscapeIndex > 0) {
                                    result.append(text.substring(i, nextEscapeIndex + 1));
                                    charReplaced = true;
                                    i = nextEscapeIndex;
                                    break FORENCCHARS;
                                }
                                break;
                            }
                            case 'H':
                            case 'N': {
                                // \H\ and \N\ are exactly three characters long
                                if (i + 2 < textLength && text.charAt(i + 2) == esc.characters[j]) {
                                    int nextEscapeIndex = i + 2;
                                    result.append(text.substring(i, nextEscapeIndex + 1));
                                    charReplaced = true;
                                    i = nextEscapeIndex;
                                    break FORENCCHARS;
                                }
                                break;
                            }
                            }
                        }
                    }

                    result.append(esc.encodings[j]);
                    charReplaced = true;
                    break;
                }
            }
            if (!charReplaced) {
                result.append(c);
            }
        }
        return result.toString();
    }

    /**
     * @param text string to be unescaped
     * @return the unescaped string
     * <p>Defaults the escape characters to the conventional values |^~\&
     */
    public static String unescape(String text) {
        return unescape(text, "|^~\\&");
    }

    /**
     * @param text string to be unescaped
     * @param encChars encoding characters to be used in the order
     * <br>Field, Component, Repetition, Escape, Sub-component
     * @return the unescaped string
     */
    public static String unescape(String text, String encChars) {

        // If the escape char isn't found, we don't need to look for escape sequences
        char escapeChar = encChars.charAt(3);
        if (text.indexOf(escapeChar) < 0) {
            return text;
        }

        int textLength = text.length();
        StringBuilder result = new StringBuilder(textLength + 20);
        EncLookup esc = getEscapeSequences(encChars);
        char escape = esc.characters[3];
        int encodingsCount = esc.characters.length;
        int i = 0;
        while (i < textLength) {
            char c = text.charAt(i);
            if (c != escape) {
                result.append(c);
                i++;
            } else {
                boolean foundEncoding = false;

                // Test against the standard encodings
                for (int j = 0; j < encodingsCount; j++) {
                    String encoding = esc.encodings[j];
                    if (text.startsWith(encoding, i)) {
                        result.append(esc.characters[j]);
                        i += encoding.length();
                        foundEncoding = true;
                        break;
                    }
                }

                if (!foundEncoding) {

                    // If we haven't found this, there is one more option. Escape
                    // sequences of \.XXXXX\ are formatting codes. They should be
                    // left intact
                    if (i + 1 < textLength) {
                        char nextChar = text.charAt(i + 1);
                        switch (nextChar) {
                        case '.':
                        case 'C':
                        case 'M':
                        case 'X':
                        case 'Z': {
                            int closingEscape = text.indexOf(escape, i + 1);
                            if (closingEscape > 0) {
                                String substring = text.substring(i, closingEscape + 1);
                                result.append(substring);
                                i += substring.length();
                            } else {
                                // No closing escape, so this is not a sequence:
                                // preserve unescaped escape delimiter
                                result.append(c);
                                i++;
                            }
                            break;
                        }
                        case 'H':
                        case 'N': {
                            int closingEscape = text.indexOf(escape, i + 1);
                            if (closingEscape == i + 2) {
                                String substring = text.substring(i, closingEscape + 1);
                                result.append(substring);
                                i += substring.length();
                            } else {
                                // Not \H\ or \N\: preserve unescaped escape delimiter
                                result.append(c);
                                i++;
                            }
                            break;
                        }
                        default: {
                            // Preserve unescaped escape delimiter
                            result.append(c);
                            i++;
                        }
                        }
                    } else {
                        // Preserve unescaped escape delimiter
                        result.append(c);
                        i++;
                    }
                }
            }
        }
        return result.toString();
    }

    /**
     * Returns the lookup that maps HL7 special characters to their
     * escape sequences.
     *
     * @param encChars encoding characters in the order
     *        Field, Component, Repetition, Escape, Sub-component
     * @return the lookup built for these encoding characters
     */
    private static EncLookup getEscapeSequences(String encChars) {
        return new EncLookup(encChars);
    }

    /**
     * A performance-optimized replacement for using when
     * mapping from HL7 special characters to their respective
     * encodings
     *
     * @author Christian Ohr
     */
    private static class EncLookup {

        char[] characters = new char[8];
        String[] encodings = new String[8];

        EncLookup(String ec) {
            // Indexes 0-4 follow the order of the encoding characters:
            // Field, Component, Repetition, Escape, Sub-component
            characters[0] = ec.charAt(0);
            characters[1] = ec.charAt(1);
            characters[2] = ec.charAt(2);
            characters[3] = ec.charAt(3);
            characters[4] = ec.charAt(4);
            // Indexes 5-7 are the supported hexadecimal escapes
            characters[5] = '\r';
            characters[6] = '\r';
            characters[7] = '\n';

            char escapeChar = ec.charAt(3);
            char[] codes = {'F', 'S', 'R', 'E', 'T'};
            for (int i = 0; i < codes.length; i++) {
                StringBuilder seq = new StringBuilder();
                seq.append(escapeChar);
                seq.append(codes[i]);
                seq.append(escapeChar);
                encodings[i] = seq.toString();
            }
            // Built from the configured escape character rather than a
            // hard-coded "\\X000d\\"
            encodings[5] = escapeChar + "X000d" + escapeChar;
            encodings[6] = escapeChar + "X0D" + escapeChar;
            encodings[7] = escapeChar + "X0A" + escapeChar;
        }
    }
}

-----
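The decode-then-encode comparison recommended in the reply at the top of this thread can be sketched independently of any particular escaping class. Everything here is an assumption made for illustration: `RoundTripCheck` and `findMismatches` are hypothetical names, and the two lambdas in `main` are a stand-in codec that only knows \T\, not HAPI's behaviour.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.UnaryOperator;

public class RoundTripCheck {

    /** Returns every input for which encode(decode(input)) differs from the input. */
    public static List<String> findMismatches(List<String> inputs,
                                              UnaryOperator<String> decode,
                                              UnaryOperator<String> encode) {
        List<String> mismatches = new ArrayList<>();
        for (String input : inputs) {
            String roundTripped = encode.apply(decode.apply(input));
            if (!roundTripped.equals(input)) {
                mismatches.add(input);
            }
        }
        return mismatches;
    }

    public static void main(String[] args) {
        // Stand-in codec that only handles the sub-component escape \T\ <-> &.
        UnaryOperator<String> decode = s -> s.replace("\\T\\", "&");
        UnaryOperator<String> encode = s -> s.replace("&", "\\T\\");

        List<String> inputs = Arrays.asList(
                "Smith \\T\\ Wesson Road",  // correctly escaped by the sender
                "Smith & Wesson Road");     // sender did not escape the delimiter

        // Only the second input fails the round trip.
        System.out.println(findMismatches(inputs, decode, encode));
    }
}
```

A mismatch does not always mean the codec is wrong. As in the second input, it can also mean the sender's data was not escaped properly in the first place, which is the kind of case worth recording in the documentation.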
Test case:

package ca.uhn.hl7v2.parser;

import static org.junit.Assert.assertEquals;

import org.junit.Test;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

/**
 *
 * @author vowlesi
 */
public class SingleBackslashV3Test {

    private static final Logger log = LoggerFactory.getLogger(SingleBackslashV3Test.class);

    /**
     * Test of unescape method, of class Hl7Escape.
     */
    @Test
    public void testUnescapeSingleBackslash() {
        log.debug("unescape with single backslash");

        // A lone backslash is preserved, \T\ becomes &
        String text = "1 \\ 24 Smith \\T\\ Wesson Road";
        String expResult = "1 \\ 24 Smith & Wesson Road";
        String result = Hl7Escape.unescape(text);
        log.debug("Input : " + text);
        log.debug("Result : " + result);
        log.debug("Expected : " + expResult);
        assertEquals(expResult, result);

        // The "HATER" example: \H\ and \N\ are left intact, as is the \X...\ sequence
        text = "\\H\\A\\T\\E\\R\\\\N\\<<\\S\\>>\"\\E\\''\\F\\Special test '\\XFFFFFFFFFFFFFFFFFFFF\\'";
        expResult = "\\H\\A&E~\\N\\<<^>>\"\\''|Special test '\\XFFFFFFFFFFFFFFFFFFFF\\'";
        result = Hl7Escape.unescape(text);
        log.debug("Input : " + text);
        log.debug("Result : " + result);
        log.debug("Expected : " + expResult);
        assertEquals(expResult, result);

        // Same again, but \X000d\ is unescaped to a carriage return
        text = "\\H\\A\\T\\E\\R\\\\N\\<<\\S\\>>\"\\E\\''\\F\\Special test '\\X000d\\'";
        expResult = "\\H\\A&E~\\N\\<<^>>\"\\''|Special test '\r'";
        result = Hl7Escape.unescape(text);
        log.debug("Input : " + text);
        log.debug("Result : " + result);
        log.debug("Expected : " + expResult);
        assertEquals(expResult, result);

        // A run of backslashes contains no escape sequence and is unchanged
        text = "\\\\\\\\\\\\\\\\\\\\";
        expResult = "\\\\\\\\\\\\\\\\\\\\";
        result = Hl7Escape.unescape(text);
        log.debug("Input : " + text);
        log.debug("Result : " + result);
        log.debug("Expected : " + expResult);
        assertEquals(expResult, result);

        // Unescape then escape: the lone backslash comes back as \E\
        text = "Ken\\n\\F\\edy";
        expResult = "Ken\\E\\n\\F\\edy";
        result = Hl7Escape.unescape(text);
        result = Hl7Escape.escape(result);
        log.debug("Input : " + text);
        log.debug("Result : " + result);
        log.debug("Expected : " + expResult);
        assertEquals(expResult, result);
    }
}

********************************************************************************
This email, including any attachments sent with it, is confidential and for the sole use of the intended recipient(s). This confidentiality is not waived or lost, if you receive it and you are not the intended recipient(s), or if it is transmitted/received in error. Any unauthorised use, alteration, disclosure, distribution or review of this email is strictly prohibited. The information contained in this email, including any attachment sent with it, may be subject to a statutory duty of confidentiality if it relates to health service matters. If you are not the intended recipient(s), or if you have received this email in error, you are asked to immediately notify the sender by telephone collect on Australia +61 1800 198 175 or by return email. You should also delete this email, and any copies, from your computer system network and destroy any hard copies produced. If not an intended recipient of this email, you must not copy, distribute or take any action(s) that relies on it; any form of disclosure, modification, distribution and/or publication of this email is also prohibited. Although Queensland Health takes all reasonable steps to ensure this email does not contain malicious software, Queensland Health does not accept responsibility for the consequences if any person's computer inadvertently suffers any disruption to services, loss of information, harm or is infected with a virus, other malicious computer programme or code that may occur as a consequence of receiving this email.
Unless stated otherwise, this email represents only the views of the sender and not the views of the Queensland Government.
**********************************************************************************
_______________________________________________
Hl7api-devel mailing list
Hl7...@li...
https://lists.sourceforge.net/lists/listinfo/hl7api-devel
|