
#25 20+ seconds to parse 37kb message

Milestone: v2.0
Status: open
Owner: nobody
Labels: Parser (14)
Priority: 5
Updated: 2012-01-25
Created: 2012-01-25
Creator: Anonymous
Private: No

Hello,

The pipe parser consistently takes 20+ seconds to (correctly) parse a pipe-formatted message. The message carries a Base64-encoded PDF in an OBX segment with OBX-2 = "RP"; OBX-5 holds the Base64 data and starts with "ADTX^Image^PDF^Base64^JVBERi0xLjQN..."

The pipe parser later re-encodes the parsed message back to almost identical plain text (minus a few trailing pipes, etc.), so there do not seem to be any validation errors.
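To make the "almost identical" comparison concrete, the original and re-encoded text can be compared after dropping trailing field separators from each segment. This is only a minimal Python sketch and is not part of the parser: the helper names `strip_trailing_pipes` and `same_modulo_trailing_pipes` are hypothetical, and it assumes `|` is the field separator and that segments end with a carriage return (newlines are tolerated too).

```python
from typing import List


def strip_trailing_pipes(message: str) -> List[str]:
    """Split a message into segments and drop trailing '|' from each one.

    Hypothetical helper for illustration; assumes '|' is the field separator.
    """
    # Segments normally end with '\r'; normalise '\r\n' and '\n' as well.
    normalised = message.replace("\r\n", "\r").replace("\n", "\r")
    return [seg.rstrip("|") for seg in normalised.split("\r") if seg.strip()]


def same_modulo_trailing_pipes(original: str, reencoded: str) -> bool:
    """True if the two messages differ only in trailing field separators."""
    return strip_trailing_pipes(original) == strip_trailing_pipes(reencoded)
```

If this returns True for the input and the re-encoded output, the round trip lost nothing except empty trailing fields, which supports the reading that the slow parse is not caused by a validation problem.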

This 37kb message is our early test message and later messages could be longer.

Is this expected?
What is the root cause of this slow performance?
Are there workarounds without changing the communicated data that will speed this up?

My only guess is that this is due to a regexp-based parser with an inefficiently written rule.

Thanks!

Discussion

  • Anonymous

    Anonymous - 2012-01-30

    I cannot yet post a file example because I cannot be sure how much of the file needs redacting.

    If anyone can say for sure that their similarly formatted messages do not have such slow performance then I'll know it is message specific and we can try to narrow it down.
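One way to narrow this down without redacting anything would be a synthetic message of the same shape. The Python sketch below is an assumption-laden illustration, not a real message: every field value except the "ADTX^Image^PDF^Base64^" prefix quoted above is a made-up placeholder, the payload is random bytes rather than a PDF, and the function names are hypothetical. It also times a plain string split as a rough baseline for how cheap delimiter splitting of this much text can be.

```python
import base64
import os
import time
from typing import List


def build_synthetic_message(payload_bytes: int = 27_000) -> str:
    """Build a pipe-delimited test message with no real patient data.

    All field values are placeholders and may need adjusting for a
    particular parser's validation rules.
    """
    # Random bytes stand in for the PDF; Base64 expands them by about 4/3,
    # so 27,000 bytes become 36,000 characters.
    payload = base64.b64encode(os.urandom(payload_bytes)).decode("ascii")
    segments = [
        "MSH|^~\\&|SENDER|FACILITY|RECEIVER|FACILITY|20120125120000||ORU^R01|MSG0001|P|2.3",
        "PID|1||000000^^^TEST||DOE^JANE",
        "OBR|1|||REPORT^Synthetic report",
        "OBX|1|RP|REPORT^Synthetic report||ADTX^Image^PDF^Base64^" + payload,
    ]
    return "\r".join(segments) + "\r"


def naive_split(message: str) -> List[List[str]]:
    """Split into segments, then fields: a rough lower bound on parsing cost."""
    return [seg.split("|") for seg in message.split("\r") if seg]


if __name__ == "__main__":
    msg = build_synthetic_message()
    start = time.perf_counter()
    fields = naive_split(msg)
    elapsed = time.perf_counter() - start
    print(f"{len(msg)} characters, {len(fields)} segments, "
          f"naive split took {elapsed:.6f} s")
```

If the pipe parser is just as slow on the string returned by `build_synthetic_message()`, the problem is probably tied to the message shape rather than to our specific data, and the synthetic message could be posted here as a reproducer.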

     

    Last edit: Anonymous 2014-07-31
  • ChadC

    ChadC - 2012-03-04

    Jason, one of the other members has submitted some code to fix this performance issue. However, we haven't incorporated it back into the main trunk of the repository yet. If you check out the root of the SVN repository and look under the 'optimizations' folder, you can get his version of the project, which fixes this issue. It sounds like there were some pretty major changes, so it probably needs to be tested pretty thoroughly.

     