Re: [Quickfix-developers] FIX.IntConvertor.convert() throwingexception
From: <or...@qu...> - 2008-06-30 14:47:51
A static method shouldn't make a difference if there are no static variables, as each thread has its own stack. I'm really curious what this can be as well. Are you able to reproduce this in a testing environment?

--oren

> -------- Original Message --------
> Subject: Re: [Quickfix-developers] FIX.IntConvertor.convert()
> throwingexception
> From: Rick Lane <ric...@gm...>
> Date: Mon, June 30, 2008 9:19 am
> To: qui...@li...
>
> QuickFIX Documentation: http://www.quickfixengine.org/quickfix/doc/html/index.html
> QuickFIX Support: http://www.quickfixengine.org/services.html
>
> Hello all, just wondering if anyone had any luck discovering any issues
> with the IntConvertor class with respect to concurrency issues - or if
> there were any additional thoughts on the matter. For the life of me I
> can't figure out what else could be causing this "/External component
> has thrown an exception/" exception.
> Thanks again,
> Rick
> Rick Lane wrote:
> > John Haldi discovered that IntConvertor.convert() is a static method
> > -- I have 2 threads that run concurrently in the following manner:
> >
> > Thread A (producer):
> > -----------
> > 1) takes in raw compressed FAST data from the CME, converts it to a
> > FIX string
> > 2) takes FIX string and creates a QuickFix.Message object, passing the
> > string into the constructor
> > 3) checks the MsgSeqNum of this message ( message.getHeader().getInt(34) )
> > 3a) if MsgSeqNum is the next one it expects, it hands it off to
> > the consumer (Thread B)
> > 3b) if MsgSeqNum is /not/ the next one, it creates a request to
> > obtain the missed packets (this is on UDP so unreliable)
> >
> > Thread B (consumer):
> > ----------
> > 1) listens for Thread A to add another QuickFix.Message to a shared Queue
> > 2) Processes the message's fields
> >
> > So I'm wondering if the two red portions are causing these issues,
> > because the low-level IntConvertor.convert() function is static.
> > Even though the same message object will NEVER be accessed by more than one
> > thread, if the same helper function is then I could see this causing a
> > problem....
> >
> > I don't see any shared/static member /variables/ used by these
> > methods, so I don't know how they could be interfering with each other
> > -- but I thought I'd add this bit of information.
> >
> > Thanks,
> > Rick
> >
> > John Haldi wrote:
> >> Sorry it wasn't helpful. In looking at the source code for QF, I see
> >> that the FIX.IntConvertor.convert function is indeed declared as a
> >> shared method, but that and a buck will get you a cup of coffee.
> >> From what you say there is no possibility of another thread calling
> >> the convert function concurrently, so I'm somewhat at a loss as to
> >> what could cause the function to fail. It's pretty straightforward
> >> code in that function, so if something is wrong it should throw the
> >> exception constantly. I'm still suspicious that something in the QF
> >> library could be calling this function concurrently, but I have no
> >> clue where to begin guessing...
> >>
> >> jh
> >>
> >> ------------------------------------------------------------------------
> >> *From:* Rick Lane [mailto:ric...@gm...]
> >> *Sent:* Tuesday, June 24, 2008 11:39 AM
> >> *To:* John Haldi
> >> *Subject:* Re: [Quickfix-developers] FIX.IntConvertor.convert()
> >> throwingexception
> >>
> >> John,
> >>
> >> Yes, my first thought was a threading issue as well, however I don't
> >> believe it is one -- and if it is, it's not as straightforward as
> >> your example below. Here's why:
> >>
> >> I am not establishing a FIX session -- I am simply using the QuickFix
> >> library as a "utility" library to parse incoming FIX messages. The
> >> CME (which I forget -- is this what you work with as well?) provides
> >> /market data/ in FAST format ("FIX Adapted for STreaming" or
> >> something like that) which is compressed data that I decode into a
> >> string.
> >> Rather than re-invent the wheel and parse the string
> >> representation of the FIX message myself, I simply create a
> >> QuickFix.Message object passing this string into the constructor
> >> (along with a DataDictionary). Then I can use the QuickFix functions
> >> like getGroup, getIntField, etc..., and it does all the parsing
> >> legwork for me.
> >>
> >> Also, there is only one thread listening to market data.
> >>
> >> Now, the /order routing/ portion of my server /does/ use a true
> >> QuickFix "session" -- however for testing purposes, I'm not even
> >> instantiating this, I'm /only/ listening to the Market Data side of
> >> things....
> >>
> >> Thanks again for your time!
> >>
> >> Rick
> >>
> >> John Haldi wrote:
> >>> Rick,
> >>>
> >>> Is it possible that somehow the group and/or the field in question
> >>> is getting overwritten by a concurrent call on a different thread?
> >>> My thinking is as follows: If you have a
> >>> threadSocketInitiator/Acceptor working, perhaps every now and then
> >>> two messages with this repeating group are coming in at the exact
> >>> same time on two different threads, and that there is a helper
> >>> function of some sort going on under the hood and one message is
> >>> stomping on the other message - i.e. maybe the helper function is
> >>> using a shared variable/class when it should be using an instance
> >>> variable/class.
> >>>
> >>> The scenario I'm thinking of goes something like this:
> >>>
> >>> Thread #1 gets message with 10 group elements
> >>> Thread #1 calls getGroup - getGroup stores group related info in a
> >>> variable
> >>> Thread #1 processes the first 5 of 10 group elements
> >>> Thread #2 gets message with 3 group elements
> >>> Thread #2 calls getGroup - getGroup stores group related info in the
> >>> same variable
> >>> Thread #1 now tries to access 6-10 of the group elements but they
> >>> point to disposed memory
> >>> Thread #1 throws a really nasty exception
> >>>
> >>> If we allow for something like the above as a possibility, it would
> >>> explain 1) the seemingly intermittent nature of the problem, and 2)
> >>> why you can't recreate it in a debugger.
> >>>
> >>> It's just a thought...
> >>>
> >>> John
> >>>
> >>>
> >>> ------------------------------------------------------------------------
> >>> *From:* qui...@li...
> >>> [mailto:qui...@li...] *On
> >>> Behalf Of *Rick Lane
> >>> *Sent:* Tuesday, June 24, 2008 11:13 AM
> >>> *To:* qui...@li...
> >>> *Subject:* Re: [Quickfix-developers] FIX.IntConvertor.convert()
> >>> throwingexception
> >>>
> >>> What's interesting is that I take the message text from the message
> >>> that causes the exception and then basically recreate the
> >>> QuickFix.Message object with this string in a separate application,
> >>> and make the same calls, and I don't get the exception.
> >>>
> >>> So it seems pretty obvious to me this isn't a message-formatting
> >>> issue -- does this shed any light onto what might be the problem?
> >>> Not sure why my Exception text is so vague ("External component has
> >>> thrown an exception.").
> >>>
> >>> Rick Lane wrote:
> >>>> Greetings,
> >>>>
> >>>> I have finally tracked down a bug that has been giving me problems
> >>>> for some time. I'm getting an exception thrown in the
> >>>> FIX.IntConvertor.convert(string) function, and I can't seem to
> >>>> figure out why.
> >>>> It always happens at the same place: in an
> >>>> Incremental Refresh message, I extract the NoMDEntries group. I
> >>>> then try to extract the "price level" of the update (int field
> >>>> 1023), and here is where I get the exception. Here is the stack trace:
> >>>>
> >>>> External component has thrown an exception.
> >>>>    at _CxxThrowException(Void* , _s__ThrowInfo* )
> >>>>    at FIX.IntConvertor.convert(basic_string<char\,std::char_traits<char>\,std::allocator<char> >* value)
> >>>>    at QuickFix.Group.getField(IntField field)
> >>>>    at MDPDataServer.MDPMarketDataProvider.ProcessMarketDataIncrementalRefresh
> >>>>
> >>>> and here's the line of code that's causing the exception:
> >>>>
> >>>> // message is a QuickFix.Message object constructed from the string below
> >>>> int numEntries = message.getInt(268);
> >>>> for (uint i = 0; i < numEntries; i++) {
> >>>>     QuickFix44.MarketDataIncrementalRefresh.NoMDEntries group =
> >>>>         new QuickFix44.MarketDataIncrementalRefresh.NoMDEntries();
> >>>>     message.getGroup(i + 1, group);
> >>>>     int priceLevel = group.getField(new IntField(1023)).getValue(); // exception occurs here
> >>>>     ...
> >>>>
> >>>> What's strange is that I process millions of market data messages
> >>>> every day and this only happens maybe 2 or 3 times a week -- my
> >>>> first thought was that this was a FAST decoding issue (when I'm
> >>>> building the text representation of the FIX message before QuickFix
> >>>> is even used), but at such a low probability of occurrence, I can't
> >>>> imagine this is a decoding issue.
> >>>>
> >>>> Here is the message that is throwing the exception; I've
> >>>> highlighted the 1023 entries, and they all look fine to me -- any
> >>>> thoughts? (also, to make it more readable/email friendly, I
> >>>> removed the stop-bits and replaced them with the | character).
> >>>>
> >>>> Thanks,
> >>>> Rick
> >>>>
> >>>> 8=FIX.4.2 | 9=1961 | 35=X | 34=1304872 | 49=CME |
> >>>> 52=20080624115930866 | 75=20080624 | 268=22 | 279=1 | *1023=2* |
> >>>> 269=0 | 270=45 | 271=85 | 273=115930000 | 336=2 | 276=K | 22=8 |
> >>>> 48=801005 | 83=13568 | 279=1 | *1023=1* | 269=0 | 270=94 | 271=293
> >>>> | 273=115930000 | 336=2 | 276=K | 22=8 | 48=801101 | 83=38117 |
> >>>> 279=1 | *1023=1* | 269=0 | 270=112.5 | 271=293 | 273=115930000 |
> >>>> 336=2 | 276=K | 22=8 | 48=801109 | 83=35245 | 279=1 | *1023=2* |
> >>>> 269=1 | 270=9551 | 271=1743 | 273=115930000 | 336=2 | 276=K | 22=8
> >>>> | 48=803001 | 83=231922 | 279=1 | *1023=1* | 269=1 | 270=9631 |
> >>>> 271=1134 | 273=115930000 | 336=2 | 276=K | 22=8 | 48=803900 |
> >>>> 83=278737 | 279=1 | *1023=2* | 269=1 | 270=9631.5 | 271=12656 |
> >>>> 273=115930000 | 336=2 | 276=K | 22=8 | 48=803900 | 83=278738 |
> >>>> 279=1 | *1023=1* | 269=1 | 270=9536 | 271=1175 | 273=115930000 |
> >>>> 336=2 | 276=K | 22=8 | 48=806001 | 83=204449 | 279=1 | *1023=2* |
> >>>> 269=1 | 270=9536.5 | 271=13774 | 273=115930000 | 336=2 | 276=K |
> >>>> 22=8 | 48=806001 | 83=204450 | 279=1 | *1023=1* | 269=1 | 270=9612
> >>>> | 271=332 | 273=115930000 | 336=2 | 276=K | 22=8 | 48=806901 |
> >>>> 83=422681 | 279=1 | *1023=2* | 269=1 | 270=9612.5 | 271=17576 |
> >>>> 273=115930000 | 336=2 | 276=K | 22=8 | 48=806901 | 83=422682 |
> >>>> 279=1 | *1023=2* | 269=1 | 270=9592 | 271=30035 | 273=115930000 |
> >>>> 336=2 | 276=K | 22=8 | 48=809901 | 83=312614 | 279=1 | *1023=2* |
> >>>> 269=0 | 270=17 | 271=47 | 273=115930000 | 336=2 | 276=K | 22=8 |
> >>>> 48=800915 | 83=20839 | 279=1 | *1023=1* | 269=0 | 270=43 | 271=58 |
> >>>> 273=115930000 | 336=2 | 276=K | 22=8 | 48=801105 | 83=10961 | 279=2
> >>>> | *1023=1* | 269=1 | 270=44.5 | 271=12 | 273=115930000 | 336=2 |
> >>>> 276=K | 22=8 | 48=801105 | 83=10962 | 279=1 | *1023=1* | 269=1 |
> >>>> 270=45 | 271=12 | 273=115930000 | 336=2 | 276=K | 22=8 | 48=801105
> >>>> | 83=10963 | 279=0
> >>>> | *1023=2* | 269=1 | 270=45.5 | 271=216 |
> >>>> 273=115930000 | 336=2 | 276=K | 22=8 | 48=801105 | 83=10964 | 279=1
> >>>> | *1023=1* | 269=0 | 270=60 | 271=58 | 273=115930000 | 336=2 |
> >>>> 276=K | 22=8 | 48=801113 | 83=9462 | 279=1 | *1023=2* | 269=0 |
> >>>> 270=-4 | 271=24 | 273=115930000 | 336=2 | 276=K | 22=8 | 48=801208
> >>>> | 83=3856 | 279=2 | *1023=2* | 269=1 | 270=9495.5 | 271=49 |
> >>>> 273=115930000 | 336=2 | 276=K | 22=8 | 48=803201 | 83=8643 | 279=0
> >>>> | *1023=2* | 269=1 | 270=9496 | 271=93 | 273=115930000 | 336=2 |
> >>>> 276=K | 22=8 | 48=803201 | 83=8644 | 279=1 | *1023=1* | 269=0 |
> >>>> 270=81 | 271=967 | 273=115930000 | 336=2 | 276=K | 22=8 | 48=800208
> >>>> | 83=76335 | 279=1 | *1023=2* | 269=0 | 270=80.5 | 271=409 |
> >>>> 273=115930000 | 336=2 | 276=K | 22=8 | 48=800208 | 83=76336 |
> >>>> 1128=8 | 10=233 |
> >>>>
> >>>>
> >>>>
> >>> No virus found in this incoming message.
> >>> Checked by AVG.
> >>> Version: 8.0.100 / Virus Database: 270.4.1/1516 - Release Date:
> >>> 6/24/2008 7:53 AM
> >>>
> >> No virus found in this incoming message.
> >> Checked by AVG.
> >> Version: 8.0.100 / Virus Database: 270.4.1/1516 - Release Date:
> >> 6/24/2008 7:53 AM
> >>
> -------------------------------------------------------------------------
> Check out the new SourceForge.net Marketplace.
> It's the best place to buy or sell services for
> just about anything Open Source.
> http://sourceforge.net/services/buy/index.php
> _______________________________________________
> Quickfix-developers mailing list
> Qui...@li...
> https://lists.sourceforge.net/lists/listinfo/quickfix-developers
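
Oren's point, that a static method is safe to call from several threads as long as it touches no static or otherwise shared variables, can be illustrated with a small C++ sketch. Everything in it is hypothetical: `IntConvertorSketch`, `count_mismatches` and `run_concurrent_conversions` are names made up for this illustration, and the parsing logic is a stand-in, not the QuickFIX source. It only shows that when each call keeps its working state in local variables, two threads calling the same static function at the same time do not disturb each other.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <thread>

// Hypothetical stand-in for a static string-to-int converter.
// NOT the QuickFIX implementation; it only mirrors the idea of a static
// method whose entire working state lives in local variables.
struct IntConvertorSketch {
    static int convert(const std::string& value) {
        if (value.empty()) throw std::invalid_argument("empty field");
        std::size_t pos = 0;
        bool negative = false;
        if (value[0] == '-') { negative = true; pos = 1; }
        if (pos == value.size()) throw std::invalid_argument("no digits");
        int result = 0;  // local: one copy per call, on the caller's stack
        for (; pos < value.size(); ++pos) {
            const char c = value[pos];
            if (c < '0' || c > '9') {
                throw std::invalid_argument("not an integer: " + value);
            }
            result = result * 10 + (c - '0');  // sketch only: no overflow check
        }
        return negative ? -result : result;
    }
};

// Converts `text` repeatedly and counts results that differ from `expected`.
static int count_mismatches(const std::string& text, int expected, int iterations) {
    int mismatches = 0;
    for (int i = 0; i < iterations; ++i) {
        if (IntConvertorSketch::convert(text) != expected) ++mismatches;
    }
    return mismatches;
}

// Runs two threads that call the same static function concurrently on
// different inputs and returns the total number of wrong results.
int run_concurrent_conversions(int iterations) {
    assert(iterations >= 0);
    int mismatches_a = 0;
    int mismatches_b = 0;
    // Each thread writes only its own counter; join() makes the writes visible.
    std::thread a([&] { mismatches_a = count_mismatches("1304872", 1304872, iterations); });
    std::thread b([&] { mismatches_b = count_mismatches("-4", -4, iterations); });
    a.join();
    b.join();
    return mismatches_a + mismatches_b;
}
```

Calling `run_concurrent_conversions` from a `main` function with a large iteration count should report zero mismatches, because the two threads share code but no data. If the converter kept its intermediate result in a static variable instead, the same test could fail intermittently, which is the kind of interference John's scenario describes. That would need to be confirmed against the actual library source rather than assumed from this sketch.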