Re: [Quickfix-developers] FIX.IntConvertor.convert() throwing exception
From: Rick L. <ric...@gm...> - 2008-06-30 14:19:47
Hello all,

Just wondering if anyone had any luck discovering any issues with the IntConvertor class with respect to concurrency issues - or if there were any additional thoughts on the matter. For the life of me I can't figure out what else could be causing this "/External component has thrown an exception/" exception.

Thanks again,
Rick

Rick Lane wrote:
> John Haldi discovered that IntConvertor.convert() is a static method
> -- I have 2 threads that run concurrently in the following manner:
>
> Thread A (producer):
> -----------
> 1) takes in raw compressed FAST data from the CME, converts it to a
>    FIX string
> 2) takes FIX string and creates a QuickFix.Message object, passing the
>    string into the constructor
> 3) checks the MsgSeqNum of this message ( message.getHeader().getInt(34) )
>    3a) if MsgSeqNum is the next one it expects, it hands it off to
>        the consumer (Thread B)
>    3b) if MsgSeqNum is /not/ the next one, it creates a request to
>        obtain the missed packets (this is on UDP so unreliable)
>
> Thread B (consumer):
> ----------
> 1) listens for Thread A to add another QuickFix.Message to a shared Queue
> 2) Processes the message's fields
>
> So I'm wondering if the two red portions are causing these issues,
> because the low-level IntConvertor.convert() function is static. Even
> though the same message object will NEVER be accessed by more than one
> thread, if the same helper function is then I could see this causing a
> problem....
>
> I don't see any shared/static member /variables/ used by these
> methods, so I don't know how they could be interfering with each other
> -- but I thought I'd add this bit of information.
>
> Thanks,
> Rick
>
> John Haldi wrote:
>> Sorry it wasn't helpful. In looking at the source code for QF, I see
>> that the FIX.IntConvertor.convert function is indeed declared as a
>> shared method, but that and a buck will get you a cup of coffee.
>> From what you say there is no possibility of another thread calling
>> the convert function concurrently, so I'm somewhat at a loss as to
>> what could cause the function to fail. It's pretty straightforward
>> code in that function, so if something is wrong it should throw the
>> exception constantly. I'm still suspicious that something in the QF
>> library could be calling this function concurrently, but I have no
>> clue where to begin guessing...
>>
>> jh
>>
>> ------------------------------------------------------------------------
>> *From:* Rick Lane [mailto:ric...@gm...]
>> *Sent:* Tuesday, June 24, 2008 11:39 AM
>> *To:* John Haldi
>> *Subject:* Re: [Quickfix-developers] FIX.IntConvertor.convert() throwing exception
>>
>> John,
>>
>> Yes, my first thought was a threading issue as well, however I don't
>> believe it is one -- and if it is, it's not as straightforward as
>> your example below. Here's why:
>>
>> I am not establishing a FIX session -- I am simply using the QuickFix
>> library as a "utility" library to parse incoming FIX messages. The
>> CME (which I forget -- is this what you work with as well?) provides
>> /market data/ in FAST format ("FIX Adapted for STreaming" or
>> something like that) which is compressed data that I decode into a
>> string. Rather than re-invent the wheel and parse the string
>> representation of the FIX message myself, I simply create a
>> QuickFix.Message object passing this string into the constructor
>> (along with a DataDictionary). Then I can use the QuickFix functions
>> like getGroup, getIntField, etc..., and it does all the parsing
>> legwork for me.
>>
>> Also, there is only one thread listening to market data.
>>
>> Now, the /order routing/ portion of my server /does/ use a true
>> QuickFix "session" -- however for testing purposes, I'm not even
>> instantiating this, I'm /only/ listening to the Market Data side of
>> things....
>>
>> Thanks again for your time!
>>
>> Rick
>>
>> John Haldi wrote:
>>> Rick,
>>>
>>> Is it possible that somehow the group and/or the field in question
>>> is getting overwritten by a concurrent call on a different thread?
>>> My thinking is as follows: If you have a
>>> ThreadedSocketInitiator/Acceptor working, perhaps every now and then
>>> two messages with this repeating group are coming in at the exact
>>> same time on two different threads, and that there is a helper
>>> function of some sort going on under the hood and one message is
>>> stomping on the other message - i.e. maybe the helper function is
>>> using a shared variable/class when it should be using an instance
>>> variable/class.
>>>
>>> The scenario I'm thinking of goes something like this:
>>>
>>> Thread #1 gets message with 10 group elements
>>> Thread #1 calls getGroup - getGroup stores group related info in a
>>>   variable
>>> Thread #1 processes the first 5 of 10 group elements
>>> Thread #2 gets message with 3 group elements
>>> Thread #2 calls getGroup - getGroup stores group related info in the
>>>   same variable
>>> Thread #1 now tries to access 6-10 of the group elements but they
>>>   point to disposed memory
>>> Thread #1 throws a really nasty exception
>>>
>>> If we allow for something like the above as a possibility, it would
>>> explain 1) the seemingly intermittent nature of the problem, and 2)
>>> why you can't recreate it in a debugger.
>>>
>>> It's just a thought...
>>>
>>> John
>>>
>>>
>>> ------------------------------------------------------------------------
>>> *From:* qui...@li... [mailto:qui...@li...] *On Behalf Of* Rick Lane
>>> *Sent:* Tuesday, June 24, 2008 11:13 AM
>>> *To:* qui...@li...
>>> *Subject:* Re: [Quickfix-developers] FIX.IntConvertor.convert() throwing exception
>>>
>>> What's interesting is that I take the message text from the message
>>> that causes the exception and then basically recreate the
>>> QuickFix.Message object with this string in a separate application,
>>> and make the same calls, and I don't get the exception.
>>>
>>> So it seems pretty obvious to me this isn't a message-formatting
>>> issue -- does this shed any light onto what might be the problem?
>>> Not sure why my Exception text is so vague ("External component has
>>> thrown an exception.").
>>>
>>> Rick Lane wrote:
>>>> Greetings,
>>>>
>>>> I have finally tracked down a bug that has been giving me problems
>>>> for some time. I'm getting an exception thrown in the
>>>> FIX.IntConvertor.convert(string) function, and I can't seem to
>>>> figure out why. It always happens at the same place: in an
>>>> Incremental Refresh message, I extract the NoMDEntries group. I
>>>> then try to extract the "price level" of the update (int field
>>>> 1023), and here is where I get the exception. Here is the stack trace:
>>>>
>>>> External component has thrown an exception.
>>>>    at _CxxThrowException(Void* , _s__ThrowInfo* )
>>>>    at FIX.IntConvertor.convert(basic_string<char\,std::char_traits<char>\,std::allocator<char> >* value)
>>>>    at QuickFix.Group.getField(IntField field)
>>>>    at MDPDataServer.MDPMarketDataProvider.ProcessMarketDataIncrementalRefresh
>>>>
>>>> and here's the line of code that's causing the exception:
>>>>
>>>> // message is a QuickFix.Message object constructed from the string below
>>>> int numEntries = message.getInt(268);
>>>> for (uint i = 0; i < numEntries; i++) {
>>>>     QuickFix44.MarketDataIncrementalRefresh.NoMDEntries group =
>>>>         new QuickFix44.MarketDataIncrementalRefresh.NoMDEntries();
>>>>     message.getGroup(i + 1, group);
>>>>     int priceLevel = group.getField(new IntField(1023)).getValue(); // exception occurs here
>>>>     ...
>>>>
>>>> What's strange is that I process millions of market data messages
>>>> every day and this only happens maybe 2 or 3 times a week -- my
>>>> first thought was that this was a FAST decoding issue (when I'm
>>>> building the text representation of the FIX message before QuickFix
>>>> is even used), but at such a low probability of occurrence, I can't
>>>> imagine this is a decoding issue.
>>>>
>>>> Here is the message that is throwing the exception; I've
>>>> highlighted the 1023 entries, and they all look fine to me -- any
>>>> thoughts? (also, to make it more readable/email friendly, I
>>>> removed the stop-bits and replaced them with the | character).
>>>>
>>>> Thanks,
>>>> Rick
>>>>
>>>> 8=FIX.4.2 | 9=1961 | 35=X | 34=1304872 | 49=CME |
>>>> 52=20080624115930866 | 75=20080624 | 268=22 | 279=1 | *1023=2* |
>>>> 269=0 | 270=45 | 271=85 | 273=115930000 | 336=2 | 276=K | 22=8 |
>>>> 48=801005 | 83=13568 | 279=1 | *1023=1* | 269=0 | 270=94 | 271=293
>>>> | 273=115930000 | 336=2 | 276=K | 22=8 | 48=801101 | 83=38117 |
>>>> 279=1 | *1023=1* | 269=0 | 270=112.5 | 271=293 | 273=115930000 |
>>>> 336=2 | 276=K | 22=8 | 48=801109 | 83=35245 | 279=1 | *1023=2* |
>>>> 269=1 | 270=9551 | 271=1743 | 273=115930000 | 336=2 | 276=K | 22=8
>>>> | 48=803001 | 83=231922 | 279=1 | *1023=1* | 269=1 | 270=9631 |
>>>> 271=1134 | 273=115930000 | 336=2 | 276=K | 22=8 | 48=803900 |
>>>> 83=278737 | 279=1 | *1023=2* | 269=1 | 270=9631.5 | 271=12656 |
>>>> 273=115930000 | 336=2 | 276=K | 22=8 | 48=803900 | 83=278738 |
>>>> 279=1 | *1023=1* | 269=1 | 270=9536 | 271=1175 | 273=115930000 |
>>>> 336=2 | 276=K | 22=8 | 48=806001 | 83=204449 | 279=1 | *1023=2* |
>>>> 269=1 | 270=9536.5 | 271=13774 | 273=115930000 | 336=2 | 276=K |
>>>> 22=8 | 48=806001 | 83=204450 | 279=1 | *1023=1* | 269=1 | 270=9612
>>>> | 271=332 | 273=115930000 | 336=2 | 276=K | 22=8 | 48=806901 |
>>>> 83=422681 | 279=1 | *1023=2* | 269=1 | 270=9612.5 | 271=17576 |
>>>> 273=115930000 | 336=2 | 276=K | 22=8 | 48=806901 | 83=422682 |
>>>> 279=1 | *1023=2* | 269=1 | 270=9592 | 271=30035 | 273=115930000 |
>>>> 336=2 | 276=K | 22=8 | 48=809901 | 83=312614 | 279=1 | *1023=2* |
>>>> 269=0 | 270=17 | 271=47 | 273=115930000 | 336=2 | 276=K | 22=8 |
>>>> 48=800915 | 83=20839 | 279=1 | *1023=1* | 269=0 | 270=43 | 271=58 |
>>>> 273=115930000 | 336=2 | 276=K | 22=8 | 48=801105 | 83=10961 | 279=2
>>>> | *1023=1* | 269=1 | 270=44.5 | 271=12 | 273=115930000 | 336=2 |
>>>> 276=K | 22=8 | 48=801105 | 83=10962 | 279=1 | *1023=1* | 269=1 |
>>>> 270=45 | 271=12 | 273=115930000 | 336=2 | 276=K | 22=8 | 48=801105
>>>> | 83=10963 | 279=0 | *1023=2* | 269=1 | 270=45.5 | 271=216 |
>>>> 273=115930000 |
>>>> 336=2 | 276=K | 22=8 | 48=801105 | 83=10964 | 279=1
>>>> | *1023=1* | 269=0 | 270=60 | 271=58 | 273=115930000 | 336=2 |
>>>> 276=K | 22=8 | 48=801113 | 83=9462 | 279=1 | *1023=2* | 269=0 |
>>>> 270=-4 | 271=24 | 273=115930000 | 336=2 | 276=K | 22=8 | 48=801208
>>>> | 83=3856 | 279=2 | *1023=2* | 269=1 | 270=9495.5 | 271=49 |
>>>> 273=115930000 | 336=2 | 276=K | 22=8 | 48=803201 | 83=8643 | 279=0
>>>> | *1023=2* | 269=1 | 270=9496 | 271=93 | 273=115930000 | 336=2 |
>>>> 276=K | 22=8 | 48=803201 | 83=8644 | 279=1 | *1023=1* | 269=0 |
>>>> 270=81 | 271=967 | 273=115930000 | 336=2 | 276=K | 22=8 | 48=800208
>>>> | 83=76335 | 279=1 | *1023=2* | 269=0 | 270=80.5 | 271=409 |
>>>> 273=115930000 | 336=2 | 276=K | 22=8 | 48=800208 | 83=76336 |
>>>> 1128=8 | 10=233 |
>>>>
>>> No virus found in this incoming message.
>>> Checked by AVG.
>>> Version: 8.0.100 / Virus Database: 270.4.1/1516 - Release Date: 6/24/2008 7:53 AM
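
One way to double-check the claim that the 1023 values "all look fine", independently of QuickFix, is to re-parse the pipe-delimited dump and run every 1023 value through a strict integer check. The following is only a rough sketch in Python, not the QuickFix implementation. The function names, the GROUP_TAGS set, and the shortened SAMPLE string are hypothetical and introduced for illustration. It assumes the ">" quote markers have already been stripped from the dump, that each group entry starts with tag 279 (as it does in the quoted message), and that a valid integer is an optional minus sign followed by digits.

```python
import re

# Tags seen inside each NoMDEntries entry of the quoted message. This set is
# copied from that one message, not taken from a FIX data dictionary.
GROUP_TAGS = {279, 1023, 269, 270, 271, 273, 336, 276, 22, 48, 83}

_INT_RE = re.compile(r"-?[0-9]+")


def strict_int(value):
    """Convert a field value to int, accepting only an optional '-' and digits."""
    if not _INT_RE.fullmatch(value):
        raise ValueError(f"not a valid integer field value: {value!r}")
    return int(value)


def parse_fix_dump(text, sep="|"):
    """Split a 'tag=value | tag=value' dump into ordered (tag, value) pairs.

    The '*' highlight markers around the 1023 fields are stripped. A chunk
    without a numeric tag raises ValueError.
    """
    pairs = []
    for chunk in text.split(sep):
        chunk = chunk.strip().strip("*").strip()
        if not chunk:
            continue
        tag, _, value = chunk.partition("=")
        pairs.append((int(tag), value))
    return pairs


def split_entries(pairs, count_tag=268, delimiter_tag=279):
    """Return (declared entry count, list of per-entry {tag: value} dicts)."""
    declared = None
    entries = []
    for tag, value in pairs:
        if tag == count_tag:
            declared = strict_int(value)
        elif tag == delimiter_tag:
            entries.append({tag: value})   # tag 279 opens a new entry
        elif entries and tag in GROUP_TAGS:
            entries[-1][tag] = value       # field belongs to the current entry
        # Header and trailer fields (8, 35, 1128, 10, ...) are ignored.
    return declared, entries


def check_price_levels(text, level_tag=1023):
    """Strictly convert the price level (tag 1023) of every entry."""
    declared, entries = split_entries(parse_fix_dump(text))
    levels = [strict_int(entry[level_tag]) for entry in entries]
    return declared, levels


# Shortened, hypothetical sample in the same layout as the quoted message.
SAMPLE = ("8=FIX.4.2 | 35=X | 268=2 | 279=1 | *1023=2* | 269=0 | 270=45 | "
          "279=1 | *1023=1* | 269=0 | 270=94 | 1128=8 | 10=233 |")

if __name__ == "__main__":
    declared, levels = check_price_levels(SAMPLE)
    print(declared, levels)  # prints: 2 [2, 1]
```

If the full dump passes this kind of check, with the declared count in 268 matching the number of entries and every 1023 value converting cleanly, that would fit the observation that rebuilding the message in a separate application does not reproduce the exception. It would not by itself explain where the exception comes from.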