Re: [HAPI-devel] Decode ZPD-3.3 UUencode


Some time ago I noticed a problem with messages we were handling that contained fields with single (unpaired) escape characters.  HAPI consumes them during parsing; I found the code responsible and suggested a change so that they are left alone.  I believe the standard is unclear in its definition of what to do when a single escape character is present but does not form part of a valid escape sequence.  Below is a copy of the change to the Escape object that I suggested at the time (the code may have changed slightly in later versions of HAPI, as the suggestion was made some time ago).  The class comment also says that the hexadecimal escape is not supported, but the code does include handling of \X000d\, and could easily be extended to cover the most common hexadecimal escapes we have seen, \X0D\ and \X0A\.
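
As a rough illustration of the extension mentioned above, the following self-contained sketch expands only the hexadecimal escapes named in this message (\X0D\, \X0A\ and \X000d\) and copies everything else through unchanged, including lone escape characters. The names HexEscapeSketch and expandKnownHex are hypothetical and not part of HAPI; matching the hex digits case-insensitively is an assumption.

```java
import java.util.Locale;
import java.util.Map;

/** Hypothetical helper, not part of HAPI. */
final class HexEscapeSketch {

    // Only the hexadecimal payloads mentioned in this thread are mapped.
    private static final Map<String, Character> KNOWN =
            Map.of("0D", '\r', "0A", '\n', "000D", '\r');

    /**
     * Replaces \X0D\, \X0A\ and \X000d\ (using the given escape character)
     * with CR or LF. Any other text is copied through unchanged.
     */
    static String expandKnownHex(String text, char escape) {
        StringBuilder out = new StringBuilder(text.length());
        int i = 0;
        while (i < text.length()) {
            char c = text.charAt(i);
            if (c == escape && i + 1 < text.length() && text.charAt(i + 1) == 'X') {
                int close = text.indexOf(escape, i + 2);
                if (close > 0) {
                    String payload = text.substring(i + 2, close).toUpperCase(Locale.ROOT);
                    Character decoded = KNOWN.get(payload);
                    if (decoded != null) {
                        out.append(decoded.charValue());
                        i = close + 1;   // skip past the closing escape character
                        continue;
                    }
                }
            }
            // Not a known hex escape: keep the character as it is
            out.append(c);
            i++;
        }
        return out.toString();
    }
}
```

With the default escape character, expandKnownHex("a\\X0D\\\\X0A\\b", '\\') yields "a", CR, LF, "b", while a string such as "1 \\ 24" comes back unchanged.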

Hope this helps

Code Follows
Ian

/**
The contents of this file are subject to the Mozilla Public License Version 1.1 
(the "License"); you may not use this file except in compliance with the License. 
You may obtain a copy of the License at http://www.mozilla.org/MPL/ 
Software distributed under the License is distributed on an "AS IS" basis, 
WITHOUT WARRANTY OF ANY KIND, either express or implied. See the License for the 
specific language governing rights and limitations under the License. 

The Original Code is "Escape.java".  Description: 
"Handles "escaping" and "unescaping" of text according to the HL7 escape sequence rules
defined in section 2.10 of the standard (version 2.4)" 

The Initial Developer of the Original Code is University Health Network. Copyright (C) 
2001.  All Rights Reserved. 

Contributor(s): Mark Lee (Skeva Technologies); Elmar Hinz 

Alternatively, the contents of this file may be used under the terms of the 
GNU General Public License (the "GPL"), in which case the provisions of the GPL are 
applicable instead of those above.  If you wish to allow use of your version of this 
file only under the terms of the GPL and not to allow others to use your version 
of this file under the MPL, indicate your decision by deleting  the provisions above 
and replace  them with the notice and other provisions required by the GPL License.  
If you do not delete the provisions above, a recipient may use your version of 
this file under either the MPL or the GPL. 
 */
package ca.uhn.hl7v2.parser;

import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Handles "escaping" and "unescaping" of text according to the HL7 escape
 * sequence rules defined in section 2.10 of the standard (version 2.4).
 * Currently, escape sequences for multiple character sets are unsupported. The
 * highlighting, hexadecimal, and locally defined escape sequences are also
 * unsupported.
 * 
 * @author Bryan Tripp
 * @author Mark Lee (Skeva Technologies)
 * @author Elmar Hinz
 * @author Christian Ohr
 */
public class EscapeV2 {

    /**
     * limits the size of variousEncChars to 1000, can be overridden by system property.
     */
    private static Map<EncodingCharacters, EncLookup> variousEncChars = Collections.synchronizedMap(new LinkedHashMap<EncodingCharacters, EncLookup>(5, 0.75f, true) {

        private static final long serialVersionUID = 1L;
        final int maxSize = Integer.parseInt(System.getProperty(Escape.class.getName() + ".maxSize", "1000"));

        @Override
        protected boolean removeEldestEntry(Map.Entry<EncodingCharacters, EncLookup> eldest) {
            return this.size() > maxSize;
        }
    });

    /** Creates a new instance of Escape */
    public EscapeV2() {
    }

    /**
     * @param text string to be escaped
     * @param encChars encoding characters to be used
     * @return the escaped string
     */
    public static String escape(String text, EncodingCharacters encChars) {
        EncLookup esc = getEscapeSequences(encChars);
        int textLength = text.length();

        StringBuilder result = new StringBuilder(textLength);
        for (int i = 0; i < textLength; i++) {
            boolean charReplaced = false;
            char c = text.charAt(i);

            FORENCCHARS:
            for (int j = 0; j < 6; j++) {
                if (text.charAt(i) == esc.characters[j]) {

                    // Formatting escape sequences such as \.br\ should be left alone
                    if (j == 4) {

                        if (i + 1 < textLength) {

                            // Check for \.br\
                            char nextChar = text.charAt(i + 1);
                            switch (nextChar) {
                                case '.':
                                case 'C':
                                case 'M':
                                case 'X':
                                case 'Z':
                                {
                                    int nextEscapeIndex = text.indexOf(esc.characters[j], i + 1);
                                    if (nextEscapeIndex > 0) {
                                        result.append(text.substring(i, nextEscapeIndex + 1));
                                        charReplaced = true;
                                        i = nextEscapeIndex;
                                        break FORENCCHARS;
                                    }
                                    break;
                                }
                                case 'H':
                                case 'N':
                                {
                                    // \H\ and \N\ are exactly three characters long; compare against
                                    // the configured escape character rather than a literal backslash
                                    if (i + 2 < textLength && text.charAt(i + 2) == esc.characters[j]) {
                                        int nextEscapeIndex = i + 2;
                                        result.append(text.substring(i, nextEscapeIndex + 1));
                                        charReplaced = true;
                                        i = nextEscapeIndex;
                                        break FORENCCHARS;
                                    }
                                    break;
                                }
                            }

                        }

                    }

                    result.append(esc.encodings[j]);
                    charReplaced = true;
                    break;
                }
            }
            if (!charReplaced) {
                result.append(c);
            }
        }
        return result.toString();
    }

    /**
     * @param text string to be unescaped
     * @param encChars encoding characters to be used
     * @return the unescaped string
     */
    public static String unescape(String text, EncodingCharacters encChars) {

        // If the escape char isn't found, we don't need to look for escape sequences
        char escapeChar = encChars.getEscapeCharacter();
        boolean foundEscapeChar = false;
        for (int i = 0; i < text.length(); i++) {
            if (text.charAt(i) == escapeChar) {
                foundEscapeChar = true;
                break;
            }
        }
        if (!foundEscapeChar) {
            return text;
        }

        int textLength = text.length();
        StringBuilder result = new StringBuilder(textLength + 20);
        EncLookup esc = getEscapeSequences(encChars);
        char escape = esc.characters[4];
        int encodingsCount = esc.characters.length;
        int i = 0;
        while (i < textLength) {
            char c = text.charAt(i);
            if (c != escape) {
                result.append(c);
                i++;
            } else {
                boolean foundEncoding = false;

                // Test against the standard encodings
                for (int j = 0; j < encodingsCount; j++) {
                    String encoding = esc.encodings[j];
                    int encodingLength = encoding.length();
                    if ((i + encodingLength <= textLength) && text.substring(i, i + encodingLength)
                            .equals(encoding)) {
                        result.append(esc.characters[j]);
                        i += encodingLength;
                        foundEncoding = true;
                        break;
                    }
                }

                if (!foundEncoding) {

                    // If we haven't found this, there is one more option. Escape sequences of \.XXXXX\ are
                    // formatting codes. They should be left intact
                    if (i + 1 < textLength) {
                        char nextChar = text.charAt(i + 1);
                        switch (nextChar) {
                            case '.':
                            case 'C':
                            case 'M':
                            case 'X':
                            case 'Z':
                            {
                                int closingEscape = text.indexOf(escape, i + 1);
                                if (closingEscape > 0) {
                                    String substring = text.substring(i, closingEscape + 1);
                                    result.append(substring);
                                    i += substring.length();
                                } else {
                                    i++;
                                }
                                break;
                            }
                            case 'H':
                            case 'N':
                            {
                                int closingEscape = text.indexOf(escape, i + 1);
                                if (closingEscape == i + 2) {
                                    String substring = text.substring(i, closingEscape + 1);
                                    result.append(substring);
                                    i += substring.length();
                                } else {
                                    i++;
                                }
                                break;
                            }
                            default:
                            {
                                // Preserve unescaped escape delimiter
                                result.append(c);
                                i++;
                            }
                        }

                    } else {
                        // Preserve unescaped escape delimiter
                        result.append(c);
                        i++;
                    }
                }

            }
        }
        return result.toString();
    }

    /**
     * Returns the cached EncLookup for the given encoding characters,
     * creating and caching it on first use.
     */
    private static EncLookup getEscapeSequences(EncodingCharacters encChars) {
        EncLookup escapeSequences = variousEncChars.get(encChars);
        if (escapeSequences == null) {
            // this means we haven't got the sequences for these encoding
            // characters yet - let's make them
            escapeSequences = new EncLookup(encChars);
            variousEncChars.put(encChars, escapeSequences);
        }
        return escapeSequences;
    }

    /**
     * A performance-optimized replacement for using when
     * mapping from HL7 special characters to their respective
     * encodings
     *
     * @author Christian Ohr
     */
    private static class EncLookup {

        char[] characters = new char[6];
        String[] encodings = new String[6];

        EncLookup(EncodingCharacters ec) {
            characters[0] = ec.getFieldSeparator();
            characters[1] = ec.getComponentSeparator();
            characters[2] = ec.getSubcomponentSeparator();
            characters[3] = ec.getRepetitionSeparator();
            characters[4] = ec.getEscapeCharacter();
            characters[5] = '\r';
            char[] codes = {'F', 'S', 'T', 'R', 'E'};
            for (int i = 0; i < codes.length; i++) {
                StringBuilder seq = new StringBuilder();
                seq.append(ec.getEscapeCharacter());
                seq.append(codes[i]);
                seq.append(ec.getEscapeCharacter());
                encodings[i] = seq.toString();
            }
//            encodings[5] = "\\X000d\\";
            encodings[5] = ec.getEscapeCharacter() + "X000d" + ec.getEscapeCharacter();
        }
    }
}

Test case code

/*
 * To change this template, choose Tools | Templates
 * and open the template in the editor.
 */
package ca.uhn.hl7v2.parser;

import org.junit.After;
import org.junit.AfterClass;
import org.junit.Before;
import org.junit.BeforeClass;
import org.junit.Test;
import static org.junit.Assert.*;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

/**
 *
 * @author vowlesi
 */
public class SingleBackslashV2Test {

    private static final Logger log = LoggerFactory.getLogger(SingleBackslashV2Test.class);
    private EncodingCharacters encChars = EncodingCharacters.defaultInstance();

    public SingleBackslashV2Test() {
    }

    @BeforeClass
    public static void setUpClass() {
    }

    @AfterClass
    public static void tearDownClass() {
    }

    @Before
    public void setUp() {
    }

    @After
    public void tearDown() {
    }

    /**
     * Test of unescape method, of class Escape.
     */
    @Test
    public void testUnescapeSingleBackslash() {
        log.debug("unescape with single backslash");
        String text = "1 \\ 24 Smith \\T\\ Wesson Road";
        String expResult = "1 \\ 24 Smith & Wesson Road";
        String result = EscapeV2.unescape(text, encChars);
        log.debug(result);
        log.debug(expResult);
        assertEquals(expResult, result);
        text = "\"\\E\\''\\F\\\\H\\A\\T\\E\\R\\\\N\\<<\\S\\>>\"\\E\\''\\F\\Special test '\\XFFFFFFFFFFFFFFFFFFFF\\'";
        expResult = "\"\\\\H\\A&E~\\N\\<<^>>\"\\''|Special test '\\XFFFFFFFFFFFFFFFFFFFF\\'";
        result = EscapeV2.unescape(text, encChars);
        log.debug(result);
        log.debug(expResult);
        assertEquals(expResult, result);
        text = "\"\\E\\''\\F\\\\H\\A\\T\\E\\R\\\\N\\<<\\S\\>>\"\\E\\''\\F\\Special test '\\X000d\\'";
        expResult = "\"\\\\H\\A&E~\\N\\<<^>>\"\\''|Special test '\r\'";
        result = EscapeV2.unescape(text, encChars);
        log.debug(result);
        log.debug(expResult);
        assertEquals(expResult, result);
        text = "\\\\\\\\\\\\\\\\\\\\";
        expResult = "\\\\\\\\\\\\\\\\\\\\";
        result = EscapeV2.unescape(text, encChars);
        log.debug(result);
        log.debug(expResult);
        assertEquals(expResult, result);
        text = "Ken\\n\\F\\edy";
        expResult = "Ken\\E\\n\\F\\edy";
        result = EscapeV2.unescape(text, encChars);
        result = EscapeV2.escape(result, encChars);
        log.debug(result);
        log.debug(expResult);
        assertEquals(expResult, result);
    }
}

>>> g3949 <g3...@ya...> 16/01/14 20:03 >>>
Sorry... I'm wrong...
The problem seems to be in the parser.

When parsing the HL7 text file, the "\" gets lost...

Code:
        Parser p = context.getPipeParser();

        Message msg = iter.next();

        try {
            log.info(p.encode(msg));
        } catch (HL7Exception e2) {
            // TODO Auto-generated catch block
            e2.printStackTrace();
        }

ZPD|1|PDF|14627^20675^begin 644 pdf1.pdfx0Dx0A\M)5!$1BTQ+C,-"B7BX\E_3#0H-"C$

Still needing help...

FP

On Thursday, January 16, 2014 10:02 AM, g3949 <g3...@ya...> wrote:

Hi,
In fact... that's the situation.
In my system, I get UUEncoded and plain PDF documents within the ZPD-3.3 segment.

I still have the problem of replacing the \x0D\\x0A\, because when I fetch the ZPD-3.3 segment using the Terser, the "\" gets lost...

Reading directly from the file --> String WITH "\"
2014-01-16 09:51:17  [INFO ] (Hl7FromFile):39 - begin 644 pdf1.pdf\x0D\\x0A\M)5!$1BTQ+C,-"B7BX\E\_

Getting ZPD-3.3 from the Terser and saving it in a String variable called zpd --> String WITHOUT "\"
2014-01-16 09:56:46  [INFO ] (Hl7FromFile):84 - useing terser: begin 644 pdf1.pdfx0Dx0A\M)5!$1BTQ+C,-"B7BX

Code:
        String zpd = null;
        Terser t = new Terser(msg);
        try {
            zpd = t.get("/.ZPD-3-3");
            log.info("useing terser: " + zpd);
        } catch (HL7Exception e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        }

Why does the Terser cut out the backslashes, and which workarounds are possible?

Thanks a lot!

FP
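
One possible workaround for the question above, sketched under stated assumptions: read the component straight from the raw segment text, so that no unescaping is applied at all. The class RawFieldSketch and method rawComponent are hypothetical names and not part of HAPI. The sketch assumes the default delimiters | and ^, a segment other than MSH, no field repetitions, and that the sender has escaped any delimiter characters inside the payload.

```java
/** Hypothetical helper, not part of HAPI. */
final class RawFieldSketch {

    /**
     * Returns the raw, still-escaped text of the given field and component
     * (both 1-based, as in "ZPD-3-3"), or null if it is not present.
     */
    static String rawComponent(String segment, int field, int component) {
        // fields[0] is the segment name, so field n is at index n
        String[] fields = segment.split("\\|", -1);
        if (field < 1 || field >= fields.length) {
            return null;
        }
        String[] components = fields[field].split("\\^", -1);
        if (component < 1 || component > components.length) {
            return null;
        }
        return components[component - 1];
    }
}
```

For the ZPD line shown in this thread, rawComponent(line, 3, 3) returns the text starting at "begin 644 pdf1.pdf" with the backslashes still in place.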

On Monday, January 13, 2014 4:33 PM, James Agnew <ja...@ja...> wrote:

Hi GGK,

I've never seen anyone use UUEncoding inside an HL7 message (Base64 is the way I've generally seen people solve this problem) but it should be possible.

Your problem is definitely that the first line of a UUEncoded string needs to be in the form
begin <mode> <filename><newline>

You have all of that in your string except the newline. That may be what the string "x0Dx" is representing. You would need to convert that to a newline, but also be careful, since that string could also appear in the UUEncoded text.

James
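
A minimal sketch of the conversion described above, assuming the value still contains the escape tokens with their backslashes (\X0D\ and \X0A\, with the X matched case-insensitively). To stay clear of the ambiguity mentioned, it replaces only the first token, the one that ends the begin line, and leaves the rest of the text alone. BeginLineSketch and restoreHeaderNewline are hypothetical names; line breaks inside the encoded body would need separate handling.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper, not part of HAPI. */
final class BeginLineSketch {

    // Matches an escaped CR LF pair, a lone escaped LF, or a lone escaped CR
    private static final Pattern LINE_BREAK = Pattern.compile(
            "(?:\\\\X0D\\\\)?\\\\X0A\\\\|\\\\X0D\\\\", Pattern.CASE_INSENSITIVE);

    /** Turns the first escaped line break after "begin ..." into a newline. */
    static String restoreHeaderNewline(String value) {
        if (!value.startsWith("begin ")) {
            return value;   // not a uuencode header, leave untouched
        }
        Matcher m = LINE_BREAK.matcher(value);
        if (!m.find()) {
            return value;
        }
        return value.substring(0, m.start()) + "\n" + value.substring(m.end());
    }
}
```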

On Mon, Jan 13, 2014 at 3:14 AM, g3949 <g3...@ya...> wrote:

Now I have the problem of decoding a ZPD-3.3 segment which is UUEncoded.

Example:
ZPD|1|PDF|14627^20675^begin 644 pdf1.pdfx0Dx0A\M)5!$1BTQ+C,-"B7BX\E_3#0H-"C.... end

When decoding the segment, I always get the error:

sun.misc.CEFormatException: UUDecoder: No begin line.

Does anybody have some ideas?

GGK
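
For reference, a minimal uudecode sketch that does not rely on the internal sun.misc classes. It assumes the input already has real line breaks (the begin line on its own line, then the data lines, then "end") and uses the traditional encoding in which each character carries six bits as (c - 32) & 0x3F. UuDecodeSketch is a hypothetical name, and the mode and file name on the begin line are ignored.

```java
import java.io.ByteArrayOutputStream;

/** Hypothetical helper: a small decoder for traditional uuencoded text. */
final class UuDecodeSketch {

    static byte[] decode(String uu) {
        String[] lines = uu.split("\r?\n");
        if (lines.length == 0 || !lines[0].startsWith("begin ")) {
            throw new IllegalArgumentException("No begin line");
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int l = 1; l < lines.length; l++) {
            String line = lines[l];
            if (line.equals("end")) {
                break;
            }
            if (line.isEmpty()) {
                continue;
            }
            // The first character holds the number of data bytes on this line
            int n = (line.charAt(0) - 32) & 0x3f;
            int written = 0;
            for (int pos = 1; written < n; pos += 4) {
                // Four characters carry 24 bits, i.e. up to three bytes
                int triple = (value(line, pos) << 18) | (value(line, pos + 1) << 12)
                        | (value(line, pos + 2) << 6) | value(line, pos + 3);
                for (int shift = 16; shift >= 0 && written < n; shift -= 8) {
                    out.write((triple >> shift) & 0xff);
                    written++;
                }
            }
        }
        return out.toByteArray();
    }

    private static int value(String line, int index) {
        // Trailing spaces may have been stripped; treat missing characters as zero bits
        return index < line.length() ? (line.charAt(index) - 32) & 0x3f : 0;
    }
}
```

As a small check, the input "begin 644 cat.txt", "#0V%T", "`", "end" (one item per line) decodes to the three bytes of "Cat".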

_______________________________________________
Hl7api-devel mailing list
Hl7...@li...
https://lists.sourceforge.net/lists/listinfo/hl7api-devel
