|
From: Florian L. <fl...@un...> - 2020-12-07 14:34:32
|
Hi Chris Sorry for the late reply and thanks for taking an interest in this issue. The connection doesn't break in quiet times, but in times where there is lots of incoming traffic, while at the same time I try to send out a heartbeat message after the configured interval. Since I'm only a message receiver, I never send out my own application messages and therefore Quickfix/j produces the heartbeat messages regularly. But somehow sending out a (heartbeat) messages via proxy while simultaneously receiving incoming (application) messages seems to break the connection. I verified with the counter party that they never receive my heartbeat in such a case and I stop receiving their messages as well. This behavior doesn't happen when we connect directly without proxy, but it happens with the proxy in my development environment as well as with the proxy at the client's production environment (where a direct connection is not possible). The end of the log basically looks like this: 2020-11-25 17:12:24.448 INFO 19036 --- [ssage Processor] c.u.fix_adapter.fix.FixApp : fromApp: 8=FIXT.1.19=31235=634=819143=Y49=<REPLACED>52=20201125-16:12:14.78156=<REPLACED>115=<REPLACED>122=20201125-07:00:10.209142=MADRID15=EUR22=423=Tag23-720127=2500000028=N44=97.78248=ES037015200354=255=ALRSRU 0 01/11/202060=20201125-07:00:07.103235=MATURITY854=0199=1104=E2529=12530=32531=-126.22532=310=136 2020-11-25 17:12:24.448 INFO 19036 --- [ssage Processor] c.u.fix_adapter.fix.FixApp : fromApp: 8=FIXT.1.19=32735=634=819243=Y49=<REPLACED>52=20201125-16:12:14.78156=<REPLACED>115=<REPLACED>122=20201125-07:00:10.210142=MADRID15=EUR22=423=Tag23-720227=220000028=N44=98.45248=ES036179601654=255=ALRSRU 0 01/11/202060=20201125-07:00:07.103235=MATURITY236=0.058488653854=0199=1104=E2529=12530=32531=-130.32532=310=116 2020-11-25 17:12:24.448 INFO 19036 --- [ssage Processor] c.u.fix_adapter.fix.FixApp : fromApp: 8=FIXT.1.19=32735=634=819343=Y49=<REPLACED>52=20201125-16:12:14.78156=<REPLACED>115=<REPLACED>122=20201125-07:00:10.211142=MADRID15=EUR22=423=Tag23-720327=500000028=N44=100.1548=ES030524800954=255=ALRSRU 0 01/11/202060=20201125-07:00:07.103235=MATURITY236=0.124030151854=0199=1104=E2529=12530=32531=-124.82532=110=067 2020-11-25 17:12:24.456 INFO 19036 --- [ssage Processor] c.u.fix_adapter.fix.FixApp : fromApp: 8=FIXT.1.19=32735=634=819443=Y49=<REPLACED>52=20201125-16:12:14.78156=<REPLACED>115=<REPLACED>122=20201125-07:00:10.211142=MADRID15=EUR22=423=Tag23-720427=550000028=N44=99.75248=ES037015102154=255=ALRSRU 0 01/11/202060=20201125-07:00:07.103235=MATURITY236=0.013007278854=0199=1104=E2529=12530=32531=-126.52532=310=100 2020-11-25 17:12:24.456 INFO 19036 --- [ssage Processor] c.u.fix_adapter.fix.FixApp : fromApp: 8=FIXT.1.19=32635=634=819543=Y49=<REPLACED>52=20201125-16:12:14.78156=<REPLACED>115=<REPLACED>122=20201125-07:00:10.213142=MADRID15=EUR22=423=Tag23-720527=60000028=N44=96.99748=ES037798400254=255=ALRSRU 0 01/11/202060=20201125-07:00:07.103235=MATURITY236=0.103649457854=0199=1104=E2529=12530=32531=-126.82532=310=093 2020-11-25 17:12:24.456 INFO 19036 --- [ssage Processor] c.u.fix_adapter.fix.FixApp : fromApp: 8=FIXT.1.19=31235=634=819643=Y49=<REPLACED>52=20201125-16:12:14.78156=<REPLACED>115=<REPLACED>122=20201125-07:00:10.213142=MADRID15=EUR22=423=Tag23-720627=2500000028=N44=99.68148=ES037798800354=255=ALRSRU 0 01/11/202060=20201125-07:00:07.103235=MATURITY854=0199=1104=E2529=12530=32531=-124.12532=310=162 2020-11-25 17:12:24.456 INFO 19036 --- [ssage Processor] c.u.fix_adapter.fix.FixApp : fromApp: 8=FIXT.1.19=35835=634=819743=Y49=<REPLACED>52=20201125-16:12:14.78156=<REPLACED>115=<REPLACED>122=20201125-07:00:10.213142=MADRID15=EUR22=423=Tag23-720727=200000028=N44=98.91448=XS105765983854=255=ALRSRU 0 01/11/202060=20201125-07:00:07.103218=0.31235=MATURITY236=3.895172789699=XS1057659838761=4854=0199=1104=E2529=12530=32531=332.92532=110=221 2020-11-25 17:12:24.456 INFO 19036 --- [ssage Processor] c.u.fix_adapter.fix.FixApp : fromApp: 8=FIXT.1.19=35735=634=819843=Y49=<REPLACED>52=20201125-16:12:14.78156=<REPLACED>115=<REPLACED>122=20201125-07:00:10.213142=MADRID15=EUR22=423=Tag23-720827=870000028=N44=104.81748=ES031497023954=255=ALRSRU 0 01/11/202060=20201125-07:00:07.103218=93.0235=MATURITY236=0.04773846699=DE0001141687761=4854=0199=1104=E2529=12530=32531=20.32532=310=064 2020-11-25 17:12:24.456 INFO 19036 --- [ssage Processor] c.u.fix_adapter.fix.FixApp : fromApp: 8=FIXT.1.19=34635=634=819943=Y49=<REPLACED>52=20201125-16:12:14.78156=<REPLACED>115=<REPLACED>122=20201125-07:00:10.213142=MADRID15=EUR22=423=Tag23-720927=118526428=N44=103.59348=EU000A1U982954=255=ALUFP 6.45 03/15/202960=20201125-07:00:07.103218=30.48235=MATURITY699=DE0001141703761=4854=0199=1104=E2529=12530=32531=-42.32532=310=065 2020-11-25 17:12:24.480 INFO 19036 --- [ssage Processor] c.u.fix_adapter.fix.FixApp : toAdmin: 8=FIXT.1.19=6635=034=5049=<REPLACED>52=20201125-16:12:24.48056=<REPLACED>10=191 2020-11-25 17:12:24.622 INFO 19036 --- [ scheduling-1] c.unitek.fix_adapter.ScheduledTasks : Saved [400] IOI messages. I get 1000s of incoming messages that I printed out for debugging purposes as "fromApp" log statements....and then there is the first heartbeat that I send out, logged as "toAdmin", after the configured interval (e.g. 30s after logging on). After that the incoming messages immediately stop, as if the outgoing heartbeat message caused a disconnect of the connection, but in a way that Quickfix/j doesn't even realize that something is wrong because no "recover attempt" is done which I usually see when the counter party is e.g. really offline for a time, then Quickfix/j usually tries to reconnect until the connection can be re-established. But in case of this proxy-disconnect, the application is then in a state that Quickfix/j seems not to be able to get out of again. But if I send out a heartbeat message in a time where there are no incoming messages simultaneously, then the heartbeat goes through just fine and doesn't cause a disconnect, so only when both incoming and outgoing messages occur together, this seems to be an issue, as if the proxy were a one-way-street that shuts down if there is traffic from both sides at the same time. The proxy settings used in my test environment initiator.cfg file are as follows: ProxyHost=194.191.110.6 ProxyPassword=test ProxyPort=3128 ProxyType=http ProxyUser=test ProxyVersion=1.1 There are also config parameters ProxyDomain and ProxyWorkstation available in Quickfix/j, but I didn't find out what they are used for. I implemented my own workaround now be initiating a logout/logon after not receiving an incoming message for 3 minutes, but that only works because we know that we basically get a constant stream of messages, so when we receive nothing for 3 minutes something must be wrong, but that's more a workaround then a solution, same as setting heartbeat interval to 100'000 to avoid getting into this situation in the first place. Regards, Florian PS: The newer answer in this mail-thread from Andrew Marlow seems to belong to another topic altogether... On 02.12.2020 15:28, Christoph John wrote: > Hi, > > can you share some message and event logs? > Are you saying that the connection only breaks when a Heartbeat > *needs* to be sent, i.e. when there is always traffic the connection > works, but in quiet times the connection will break? > > Cheers, > Chris. > > > On 24.11.20 09:57, Florian Leu wrote: >> QuickFIX/J Documentation: http://www.quickfixj.org/documentation/ >> QuickFIX/J Support: http://www.quickfixj.org/support/ >> >> >> Hi everyone >> >> I have a weird issue that my quickfix initiator works fine when >> connecting directly to the message supplier, but as soon as we >> connect via an HTTP proxy, it works for the first n seconds and then >> the messages stop arriving. >> >> I've now managed to trace the problem to the fact that those n >> seconds are exactly what is configured for the heartbeat interval, >> i.e. if I set this heartbeat interval higher, then it works longer >> and if I set it e.g. to 5, then it already stops after 5 seconds. >> >> At first I assumed the counter-party just closes the connection if >> they don't receive the heartbeat on time, but then I learned that >> after a missing heartbeat message first a test request is sent out >> and only after that a connection would be closed. So it seems to me >> that actually the sending out of the heartbeat from the Quickfix/j >> library (version 2.2.0) via proxy somehow seems to block the incoming >> traffic, as though the proxy was a one way street. Since the logon >> message exchange works, I know that communicating both ways actually >> works, but it seems that sending out a heartbeat while simultaneously >> processing an incoming stream of messages seems to somehow cut of >> this incoming stream for good. I can literally see in my logs that >> the incoming messages are processed via the fromApp method exactly up >> to the point that my first heartbeat is sent out, so logging this >> heartbeat via the toAdmin method is always the last thing logged >> before the communication stops altogether....no new incoming or >> outgoing messages are logged until the software is restarted. >> >> Has anyone come across a similar issue? And is there a solution for >> it? For the moment we think about just disabling the heartbeat >> messages altogether (that seems to work for now), but that it more of >> a hack workaround in my opinion and not an actual solution. >> >> Thanks for your support and regards, >> Florian >> >> >> >> _______________________________________________ >> Quickfixj-users mailing list >> Qui...@li... >> https://lists.sourceforge.net/lists/listinfo/quickfixj-users > > -- > Christoph John > Software Engineering > T +49 241 557080-28 > chr...@ma... > > MACD GmbH > Oppenhoffallee 103 > 52066 Aachen, Germany > www.macd.com > > Amtsgericht Aachen: HRB 8151 > Ust.-Id: DE 813021663 > Geschäftsführer: George Macdonald |