Re: [Quickfix-developers] FIX.IntConvertor.convert() throwing exception
From: Rick L. <ric...@gm...> - 2008-06-24 18:15:08
John Haldi discovered that IntConvertor.convert() is a static method -- I have 2 threads that run concurrently in the following manner:

Thread A (producer):
-----------
1) takes in raw compressed FAST data from the CME, converts it to a FIX string
2) takes FIX string and creates a QuickFix.Message object, passing the string into the constructor
3) checks the MsgSeqNum of this message ( message.getHeader().getInt(34) )
   3a) if MsgSeqNum is the next one it expects, it hands it off to the consumer (Thread B)
   3b) if MsgSeqNum is /not/ the next one, it creates a request to obtain the missed packets (this is on UDP so unreliable)

Thread B (consumer):
----------
1) listens for Thread A to add another QuickFix.Message to a shared Queue
2) Processes the message's fields

So I'm wondering if the two red portions are causing these issues, because the low-level IntConvertor.convert() function is static. Even though the same message object will NEVER be accessed by more than one thread, if the same helper function is then I could see this causing a problem.... I don't see any shared/static member /variables/ used by these methods, so I don't know how they could be interfering with each other -- but I thought I'd add this bit of information.

Thanks,
Rick

John Haldi wrote:
> Sorry it wasn't helpful. In looking at the source code for QF, I see
> that the FIX.IntConvertor.convert function is indeed declared as a
> shared method, but that and a buck will get you a cup of coffee. From
> what you say there is no possibility of another thread calling the
> convert function concurrently, so I'm somewhat at a loss as to what
> could cause the function to fail. Its pretty straightforward code in
> that function, so if something is wrong it should throw the exception
> constantly. I'm still suspicious that something in the QF library
> could be calling this function concurrently, but I have no clue where
> to begin guessing...
>
> jh
>
> ------------------------------------------------------------------------
> *From:* Rick Lane [mailto:ric...@gm...]
> *Sent:* Tuesday, June 24, 2008 11:39 AM
> *To:* John Haldi
> *Subject:* Re: [Quickfix-developers] FIX.IntConvertor.convert() throwing exception
>
> John,
>
> Yes, my first thought was a threading issue as well, however I don't
> believe it is one -- and if it is, it's not as straightforward as your
> example below. Here's why:
>
> I am not establishing a FIX session -- I am simply using the QuickFix
> library as a "utility" library to parse incoming FIX messages. The
> CME (which I forget -- is this what you work with as well?) provides
> /market data/ in FAST format ("FIX Adapted for STreaming" or something
> like that) which is compressed data that I decode into a string.
> Rather than re-invent the wheel and parse the string representation of
> the FIX message myself, I simply create a QuickFix.Message object
> passing this string into the constructor (along with a
> DataDictionary). Then I can use the QuickFix functions like getGroup,
> getIntField, etc..., and it does all the parsing legwork for me.
>
> Also, there is only one thread listening to market data.
>
> Now, the /order routing/ portion of my server /does/ use a true
> QuickFix "session" -- however for testing purposes, I'm not even
> instantiating this, I'm /only/ listening to the Market Data side of
> things....
>
> Thanks again for your time!
>
> Rick
>
> John Haldi wrote:
>> Rick,
>>
>> Is it possible that somehow the group and/or the field in question is
>> getting overwritten by a concurrent call on a different thread. My
>> thinking is as follows: If you have a threadSocketInitiator/Acceptor
>> working, perhaps every now and then two messages with this repeating
>> group are coming in at the exact same time on two different threads,
>> and that there is a helper function of some sort going on under the
>> hood and one message is stomping on the other message - i.e.
>> maybe the helper function is using a shared variable/class when it should
>> be using an instance variable/class.
>>
>> The scenario I'm thinking of goes something like this:
>>
>> Thread #1 gets message with 10 group elements
>> Thread #1 calls getGroup - getGroup stores group related info in a variable
>> Thread #1 processes the first 5 of 10 group elements
>> Thread #2 gets message with 3 group elements
>> Thread #2 calls getGroup - getGroup stores group related info in the same variable
>> Thread #1 now tries to access 6-10 of the group elements but they point to disposed memory
>> Thread #1 throws a really nasty exception
>>
>> If we allow for something like the above as a possibility, it would
>> explain 1) the seemingly intermittent nature of the problem, and 2)
>> why you can't recreate it in a debugger.
>>
>> Its just a thought...
>>
>> John
>>
>> ------------------------------------------------------------------------
>> *From:* qui...@li... [mailto:qui...@li...] *On Behalf Of* Rick Lane
>> *Sent:* Tuesday, June 24, 2008 11:13 AM
>> *To:* qui...@li...
>> *Subject:* Re: [Quickfix-developers] FIX.IntConvertor.convert() throwing exception
>>
>> What's interesting is that I take the message text from the message
>> that causes the exception and then basically recreate the
>> QuickFix.Message object with this string in a separate application,
>> and make the same calls, and I don't get the exception.
>>
>> So it seems pretty obvious to me this isn't a message-formatting
>> issue -- does this shed any light onto what might be the problem?
>> Not sure why my Exception text is so vague ("External component has
>> thrown an exception.").
>>
>> Rick Lane wrote:
>>> Greetings,
>>>
>>> I have finally tracked down a bug that has been giving me problems
>>> for some time. I'm getting an exception thrown in the
>>> FIX.IntConvertor.convert(string) function, and I can't seem to
>>> figure out why.
>>> It always happens at the same place: in an
>>> Incremental Refresh message, I extract the NoMDEntries group. I
>>> then try to extract the "price level" of the update (int field
>>> 1023), and here is where I get the exception. Here is the stack trace:
>>>
>>> External component has thrown an exception.
>>>    at _CxxThrowException(Void* , _s__ThrowInfo* )
>>>    at FIX.IntConvertor.convert(basic_string<char\,std::char_traits<char>\,std::allocator<char> >* value)
>>>    at QuickFix.Group.getField(IntField field)
>>>    at MDPDataServer.MDPMarketDataProvider.ProcessMarketDataIncrementalRefresh
>>>
>>> and here's the line of code that's causing the exception:
>>>
>>> // message is a QuickFix.Message object constructed from the string below
>>> int numEntries = message.getInt(268);
>>> for (uint i = 0; i < numEntries; i++) {
>>>     QuickFix44.MarketDataIncrementalRefresh.NoMDEntries group =
>>>         new QuickFix44.MarketDataIncrementalRefresh.NoMDEntries();
>>>     message.getGroup(i + 1, group);
>>>     int priceLevel = group.getField(new IntField(1023)).getValue(); // exception occurs here
>>>     ...
>>>
>>> What's strange is that I process millions of market data messages
>>> every day and this only happens maybe 2 or 3 times a week -- my
>>> first thought was that this was a FAST decoding issue (when I'm
>>> building the text representation of the FIX message before QuickFix
>>> is even used), but at such a low probability of occurrence, I can't
>>> imagine this is a decoding issue.
>>>
>>> Here is the message that is throwing the exception; I've highlighted
>>> the 1023 entries, and they all look fine to me -- any thoughts?
>>> (also, to make it more readable/email friendly, I removed the
>>> stop-bits and replaced them with the | character).
>>>
>>> Thanks,
>>> Rick
>>>
>>> 8=FIX.4.2 | 9=1961 | 35=X | 34=1304872 | 49=CME |
>>> 52=20080624115930866 | 75=20080624 | 268=22 | 279=1 | *1023=2* |
>>> 269=0 | 270=45 | 271=85 | 273=115930000 | 336=2 | 276=K | 22=8 |
>>> 48=801005 | 83=13568 | 279=1 | *1023=1* | 269=0 | 270=94 | 271=293 |
>>> 273=115930000 | 336=2 | 276=K | 22=8 | 48=801101 | 83=38117 | 279=1
>>> | *1023=1* | 269=0 | 270=112.5 | 271=293 | 273=115930000 | 336=2 |
>>> 276=K | 22=8 | 48=801109 | 83=35245 | 279=1 | *1023=2* | 269=1 |
>>> 270=9551 | 271=1743 | 273=115930000 | 336=2 | 276=K | 22=8 |
>>> 48=803001 | 83=231922 | 279=1 | *1023=1* | 269=1 | 270=9631 |
>>> 271=1134 | 273=115930000 | 336=2 | 276=K | 22=8 | 48=803900 |
>>> 83=278737 | 279=1 | *1023=2* | 269=1 | 270=9631.5 | 271=12656 |
>>> 273=115930000 | 336=2 | 276=K | 22=8 | 48=803900 | 83=278738 | 279=1
>>> | *1023=1* | 269=1 | 270=9536 | 271=1175 | 273=115930000 | 336=2 |
>>> 276=K | 22=8 | 48=806001 | 83=204449 | 279=1 | *1023=2* | 269=1 |
>>> 270=9536.5 | 271=13774 | 273=115930000 | 336=2 | 276=K | 22=8 |
>>> 48=806001 | 83=204450 | 279=1 | *1023=1* | 269=1 | 270=9612 |
>>> 271=332 | 273=115930000 | 336=2 | 276=K | 22=8 | 48=806901 |
>>> 83=422681 | 279=1 | *1023=2* | 269=1 | 270=9612.5 | 271=17576 |
>>> 273=115930000 | 336=2 | 276=K | 22=8 | 48=806901 | 83=422682 | 279=1
>>> | *1023=2* | 269=1 | 270=9592 | 271=30035 | 273=115930000 | 336=2 |
>>> 276=K | 22=8 | 48=809901 | 83=312614 | 279=1 | *1023=2* | 269=0 |
>>> 270=17 | 271=47 | 273=115930000 | 336=2 | 276=K | 22=8 | 48=800915 |
>>> 83=20839 | 279=1 | *1023=1* | 269=0 | 270=43 | 271=58 |
>>> 273=115930000 | 336=2 | 276=K | 22=8 | 48=801105 | 83=10961 | 279=2
>>> | *1023=1* | 269=1 | 270=44.5 | 271=12 | 273=115930000 | 336=2 |
>>> 276=K | 22=8 | 48=801105 | 83=10962 | 279=1 | *1023=1* | 269=1 |
>>> 270=45 | 271=12 | 273=115930000 | 336=2 | 276=K | 22=8 | 48=801105 |
>>> 83=10963 | 279=0 | *1023=2* | 269=1 | 270=45.5 | 271=216 |
>>> 273=115930000 | 336=2 | 276=K | 22=8 | 48=801105
>>> | 83=10964 | 279=1
>>> | *1023=1* | 269=0 | 270=60 | 271=58 | 273=115930000 | 336=2 | 276=K
>>> | 22=8 | 48=801113 | 83=9462 | 279=1 | *1023=2* | 269=0 | 270=-4 |
>>> 271=24 | 273=115930000 | 336=2 | 276=K | 22=8 | 48=801208 | 83=3856
>>> | 279=2 | *1023=2* | 269=1 | 270=9495.5 | 271=49 | 273=115930000 |
>>> 336=2 | 276=K | 22=8 | 48=803201 | 83=8643 | 279=0 | *1023=2* |
>>> 269=1 | 270=9496 | 271=93 | 273=115930000 | 336=2 | 276=K | 22=8 |
>>> 48=803201 | 83=8644 | 279=1 | *1023=1* | 269=0 | 270=81 | 271=967 |
>>> 273=115930000 | 336=2 | 276=K | 22=8 | 48=800208 | 83=76335 | 279=1
>>> | *1023=2* | 269=0 | 270=80.5 | 271=409 | 273=115930000 | 336=2 |
>>> 276=K | 22=8 | 48=800208 | 83=76336 | 1128=8 | 10=233 |
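
One way to double-check the "all the 1023 entries look fine" observation from the original post without involving QuickFIX at all is to scan the raw text for that tag and try a strict integer parse on every value. The following is only a minimal C++ sketch under stated assumptions: the message uses the | separator from the email rendering (not the real delimiter), the * highlighting has been stripped first, and every function name here (trim, parseStrictInt, valuesForTag, countUnparsable) is a hypothetical helper, not part of the QuickFIX API and not a claim about how FIX.IntConvertor.convert works internally.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// All names below are hypothetical helpers for illustration only;
// none of them belong to the QuickFIX API.

// Strip leading and trailing whitespace from one field.
std::string trim(const std::string& s) {
    const char* ws = " \t\r\n";
    const std::size_t begin = s.find_first_not_of(ws);
    if (begin == std::string::npos) return std::string();
    const std::size_t end = s.find_last_not_of(ws);
    return s.substr(begin, end - begin + 1);
}

// Return true (and store the value) only if the whole string is a base-10
// int: no empty input, no leading whitespace, no trailing characters,
// no overflow.
bool parseStrictInt(const std::string& text, int& out) {
    if (text.empty()) return false;
    if (std::isspace(static_cast<unsigned char>(text[0]))) return false;
    try {
        std::size_t used = 0;
        const int value = std::stoi(text, &used);
        if (used != text.size()) return false;  // trailing junk such as "2x"
        out = value;
        return true;
    } catch (const std::exception&) {
        return false;  // std::invalid_argument or std::out_of_range
    }
}

// Collect the raw value of every occurrence of `tag` in a message whose
// fields are separated by `delim` (assumed to be '|' as in the email).
std::vector<std::string> valuesForTag(const std::string& msg, int tag,
                                      char delim = '|') {
    assert(tag > 0);  // FIX tag numbers are positive
    std::vector<std::string> values;
    const std::string prefix = std::to_string(tag) + "=";
    std::size_t start = 0;
    while (start <= msg.size()) {
        std::size_t end = msg.find(delim, start);
        if (end == std::string::npos) end = msg.size();
        const std::string field = trim(msg.substr(start, end - start));
        if (field.compare(0, prefix.size(), prefix) == 0) {
            values.push_back(field.substr(prefix.size()));
        }
        start = end + 1;
    }
    return values;
}

// Count how many values of `tag` fail the strict integer parse.
std::size_t countUnparsable(const std::string& msg, int tag) {
    std::size_t bad = 0;
    for (const std::string& value : valuesForTag(msg, tag)) {
        int ignored = 0;
        if (!parseStrictInt(value, ignored)) ++bad;
    }
    return bad;
}
```

Calling countUnparsable(text, 1023) on the logged text of a failing message gives the number of price-level values that are not clean integers. A result of zero would point away from a formatting problem in the text itself, which is consistent with the message replaying without error in a separate application.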