From: Vipin C. <vip...@gm...> - 2020-08-27 12:24:55
Hi,
I have some detailed updates on this.
When this issue occurs, the first client LOGON hangs (I don't know why).
The event log shows the following state:
Accepting session FIX.4.4: *****Server->****client from /clientIP: 60868
Acceptor heartbeat set to 60 seconds
After this, the logon process hangs somewhere and no LOGON message is
sent to the client.
When the client does not get a response to its LOGON message, it disconnects
and retries.
=> This disconnect is not detected on our side, and when the new LOGON message
arrives, at *AcceptorIoHandler.java* line 69
*qfSession.hasResponder()* returns true.
*But* when I checked *qfSession.getRemoteAddress()*, it returned
null. I also tried qfSession.logout(), which does not do anything.
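The inconsistent state described above (a responder is still registered, but no remote address can be resolved) could be flagged with a small check like the following. This is only a sketch: `ResponderView` is a hypothetical stand-in that mirrors the two `Session` methods named in this message, so that the example compiles without the QuickFIX/J jar, and the address used in `main` is a made-up documentation address.

```java
/** Hypothetical stand-in mirroring the two Session calls named above. */
interface ResponderView {
    boolean hasResponder();
    String getRemoteAddress(); // may be null, as observed in this thread
}

final class StaleResponderCheck {
    private StaleResponderCheck() {}

    /** True when a responder is still registered but no remote address is known. */
    static boolean looksStale(ResponderView session) {
        return session.hasResponder() && session.getRemoteAddress() == null;
    }

    /** Small factory so the check can be tried without a real session. */
    static ResponderView view(final boolean hasResponder, final String remoteAddress) {
        return new ResponderView() {
            @Override public boolean hasResponder() { return hasResponder; }
            @Override public String getRemoteAddress() { return remoteAddress; }
        };
    }

    public static void main(String[] args) {
        // The state reported in this message: responder present, address null.
        System.out.println("stale? " + looksStale(view(true, null)));
        // A healthy connection: responder present and address known (example address).
        System.out.println("stale? " + looksStale(view(true, "/192.0.2.10:60868")));
    }
}
```

Logging this condition just before the "Multiple logons" rejection would at least record how often the acceptor is holding on to a connection that is already gone.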
In the normal LOGON flow, the event log shows the following state:
Accepting session FIX.4.4: *****Server->****client from /clientIP: 60868
Acceptor heartbeat set to 60 seconds
Logon contains ResetSeqNumFlag=Y,resetting sequence numbers to 1
Received logon
Responding to Logon request
and then I can see the logon message in our outgoing log.
I am not able to replicate this issue on my local machine, so I don't know
where it hangs during the first logon :(
Does anyone have a solution to this?
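One way to find out where the logon hangs on the affected machine would be to capture a thread dump while the problem is occurring and look at what the session and I/O threads are blocked on. The sketch below uses only the JDK's `java.lang.management` API; the class and method names are made up for illustration, and running `jstack <pid>` against the process gives comparable output without any code changes.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

final class ThreadDumpSketch {
    private ThreadDumpSketch() {}

    /** Returns a text dump of all live threads with their state and stack frames. */
    static String dumpAllThreads() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        StringBuilder out = new StringBuilder();
        // false/false: skip monitor and synchronizer details, which are optional JVM features.
        for (ThreadInfo info : mx.dumpAllThreads(false, false)) {
            if (info == null) {
                continue; // defensive: skip entries for threads that are gone
            }
            out.append('"').append(info.getThreadName()).append('"')
               .append(" state=").append(info.getThreadState());
            if (info.getLockName() != null) {
                out.append(" waiting on ").append(info.getLockName());
            }
            if (info.getLockOwnerName() != null) {
                out.append(" held by \"").append(info.getLockOwnerName()).append('"');
            }
            out.append('\n');
            for (StackTraceElement frame : info.getStackTrace()) {
                out.append("    at ").append(frame).append('\n');
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(dumpAllThreads());
    }
}
```

A thread that stays in BLOCKED or WAITING state across several dumps taken a few seconds apart, with logon handling in its stack, would be the first place to look.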
Thanks
Vipin
On Thu, Aug 6, 2020 at 9:41 PM Christoph John <chr...@ma...>
wrote:
> Hi,
>
> when all clients are affected at the same time it could still be a network
> issue, right? :)
> As said, QFJ does not handle the TCP connection stuff by itself but uses
> MINA which is a stable and mature library. Of course, it still could have
> bugs. But your description rather sounds like all clients' connections get
> closed and go into TIME_WAIT or CLOSE_WAIT state.
>
> https://superuser.com/questions/173535/what-are-close-wait-and-time-wait-states
>
> Did you check "netstat" or similar tool when the connection problem occurs?
>
> If you were to implement the logic that you proposed, that would mean that
> someone could do a simple attack against your server which would end not
> only the new malicious session but also the one that is rightfully
> connected. Are you sure you want to go that way? ;)
>
> Cheers,
> Chris.
>
> On 06.08.20 13:27, Vipin Chaudhary wrote:
>
> Hi guys.
>
> I am still facing this issue, and now the frequency is very high. I have
> also not been able to detect any networking issue.
>
> One more point: it happens with all clients at the same time, and it does
> not resolve without restarting the application.
> Another strange thing: if I don't restart, clients that were disconnected
> when the issue occurred also hit this issue when they try to connect at
> their session time.
>
> So now I strongly suspect a QuickFIX/J bug.
>
> In my opinion, QuickFIX/J should close the old session if it detects that
> the old session is still running when it receives a new logon. To achieve
> this, I am thinking of editing *AcceptorIoHandler.java*.
>
> Class : AcceptorIoHandler.java
> Line no 69
> if (qfSession.hasResponder()) {
>     // Session is already bound to another connection
>     sessionLog.onErrorEvent("Multiple logons/connections for this session are not allowed");
>     protocolSession.closeNow();
>     *//TODO Close old session here*
>     return;
> }
>
> The Session class has several methods that could close the session:
>
> 1. qfSession.close();
> 2. qfSession.disconnect("Closing Old session", true);
> 3. qfSession.logout("Closing old session to accommodate new
> session");
> 4. qfSession.generateLogout();
> 5. qfSession.reset();
>
> Which of the above methods should I use for this purpose?
>
> Thanks
> Vipin
>
>
> On Mon, May 4, 2020 at 3:47 PM Christoph John <chr...@ma...>
> wrote:
>
>> Hi,
>>
>> if VPN rekeying is not the problem, then it may be differing MTU sizes or
>> asymmetric routing. Of course, it could also be another problem; I only
>> listed what the problems mostly were in our case.
>> But I think if you describe the problem to your and your counterparty's
>> network team (since you only seem to have this problem with some of your
>> sessions) they should be able to debug it. Maybe your team needs to do some
>> tcp dumps around the time of the problem.
>>
>> It would be nice if you could reply to the user group (not me privately)
>> if you find something out. This could help other users as well.
>>
>> Cheers,
>> Chris.
>>
>>
>> On 04.05.20 08:28, Vipin Chaudhary wrote:
>>
>> Hi Christoph,
>>
>> I double-checked with the network team and found that no initiator uses a
>> VPN to connect to us.
>>
>> In that case, can you help me with what I should ask the network team to
>> do to fix the network connections?
>>
>> Thanks
>> Vipin Chaudhary
>>
>> On Thu, Apr 30, 2020 at 3:49 PM Christoph John <chr...@ma...>
>> wrote:
>>
>>> The only solution is to fix the network connection. Everything else is
>>> only a workaround.
>>> You could try to increase socket timeouts on both sides of the
>>> connection. Maybe it helps (depends on the cause of the problem) but as
>>> said this will only work around the problem.
>>>
>>> Cheers,
>>> Chris.
>>>
>>> On 30.04.20 12:11, Vipin Chaudhary wrote:
>>>
>>> Hi Christoph,
>>>
>>> Thanks for your input.
>>>
>>> We have multiple sessions, and this problem does not happen with all
>>> sessions simultaneously. Mostly we see this problem with one session and
>>> rarely with other sessions.
>>>
>>> As far as I know, the initiators connect to us directly over the internet
>>> (with SSL). I will double-check this with the network team.
>>>
>>> Meanwhile, can you think of any solution to this problem?
>>>
>>> Thanks
>>> Vipin Chaudhary
>>>
>>> On Thu, Apr 30, 2020 at 3:31 PM Christoph John <chr...@ma...>
>>> wrote:
>>>
>>>> Addition: if VPN rekeying is the problem you will probably see this
>>>> message every hour (or whatever the rekey interval is)
>>>>
>>>> Chris.
>>>>
>>>> On 30.04.20 12:00, Christoph John wrote:
>>>>
>>>> Hi,
>>>>
>>>> did you change QFJ version or why do you think it is QFJ related? Do
>>>> you only have one FIX session? If not, do all sessions show this behaviour?
>>>> (apart from that, the TCP/IP stuff is done via the MINA framework and
>>>> not within QFJ itself)
>>>>
>>>> This message appears when the initiator side of the connection
>>>> considers the connection broken and tries to reconnect. But the acceptor
>>>> still considers it a vital connection (probably until the connection
>>>> timeout kicks in).
>>>>
>>>> From my experience this happens mostly on VPN connections via internet.
>>>> Reasons for this were one of:
>>>> - different MTU sizes on both sides of the connection or on a
>>>> router/firewall in between
>>>> - asymmetric routing
>>>> - differing VPN parameters leading to different rekeying behaviour on
>>>> both ends of the connection.
>>>>
>>>> Hope that helps,
>>>> Chris.
>>>>
>>>>
>>>> On 30.04.20 05:51, Vipin Chaudhary wrote:
>>>>
>>>>
>>>> Hi Team,
>>>>
>>>> We are facing a strange issue with QuickFIX/J.
>>>>
>>>> We are the session acceptor. Sometimes when an initiator disconnects,
>>>> QuickFIX/J does not recognize the disconnection, so when the client logs
>>>> on the next time it says
>>>> "Multiple logons/connections for this session are not allowed",
>>>> although in reality the client is disconnected.
>>>> Earlier this was very rare, happening once in a while, but nowadays it
>>>> happens about once a week.
>>>> QuickFIX/J is not able to recover from this, and we need to restart our
>>>> application.
>>>>
>>>> *Has anyone seen or fixed this?*
>>>>
>>>> Thanks
>>>> Vipin Chaudhary
>>>>
>>>>
>>>>
>>>> --
>>>> Christoph John
>>>> Software Engineering
>>>> T +49 241 557...@ma...
>>>>
>>>> MACD GmbH
>>>> Oppenhoffallee 103
>>>> 52066 Aachen, Germany
>>>> www.macd.com
>>>>
>>>> Amtsgericht Aachen: HRB 8151
>>>> Ust.-Id: DE 813021663
>>>> Geschäftsführer: George Macdonald
>>>>
>>>>