|
From: <tom...@up...> - 2018-12-06 08:49:32
Attachments:
smime.p7s
FIX.4.2-SenderCompID-TargetCompID.SESSION
|
Hello dear readers,
This is a situation that I've encountered twice now, at two different
clients of ours. Their application server, which runs our FIX module based
on the QuickFIX/J engine (1.5.3), crashes and leaves the QuickFIX/J engine
in a corrupt state. The first time, a virtualized server crashed (for a
reason the hosting provider never told me), and the second time was due to
a UPS that did not signal the server to shut down properly when the battery
started running out. Our configuration has FileStorePath=filestore in both
instances. Our instances are buy side (i.e. initiators).
On inspection of the QuickFIX/J state files, I'm seeing that they are
corrupted. I can fix this by removing all state files and restarting our
module, after which all is fine again. The first time happened about a year
ago, the second just earlier today. All state files were empty (.body,
.header, .senderseqnums, .targetseqnums), except for the .session file (see
attachment). You'll see it's filled entirely with ASCII NUL characters. A
valid one, as I've noticed, has something like <NUL><ETX><some number> as
contents.
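The symptom described above (a .session file consisting only of NUL bytes) can be detected before the engine starts. The following is a minimal sketch of such a check, not part of QuickFIX/J: the class name SessionFileCheck and its methods are hypothetical, and what to do with a flagged file (for example, moving the state files aside) is left to the operator.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical pre-start check; this is NOT a QuickFIX/J class. */
final class SessionFileCheck {

    /** True when the buffer is non-empty and every byte is 0x00 (ASCII NUL). */
    static boolean isAllNul(byte[] data) {
        if (data.length == 0) {
            return false; // an empty file is a different symptom than a NUL-filled one
        }
        for (byte b : data) {
            if (b != 0) {
                return false;
            }
        }
        return true;
    }

    /** Reads the whole file (state files of this kind are small) and applies the check. */
    static boolean isAllNul(Path file) {
        try {
            return isAllNul(Files.readAllBytes(file));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Usage: java SessionFileCheck path/to/file.session [more files] */
    public static void main(String[] args) {
        for (String name : args) {
            boolean suspicious = isAllNul(Paths.get(name));
            System.out.println(name + (suspicious ? ": only NUL bytes, likely corrupt" : ": not NUL-filled"));
        }
    }
}
```

Running it against the files in the FileStorePath directory before starting the initiator would flag the state seen in the attachment without relying on the engine's own exception.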
The exception I'm getting in my module's log file looks like this:
quickfix.ConfigError: error during session initialization
    at quickfix.mina.initiator.AbstractSocketInitiator.createSessions(AbstractSocketInitiator.java:169)
    at quickfix.mina.initiator.AbstractSocketInitiator.createSessionInitiators(AbstractSocketInitiator.java:84)
    at quickfix.SocketInitiator.initialize(SocketInitiator.java:86)
    at quickfix.SocketInitiator.start(SocketInitiator.java:64)
    at <INTERNALS-STRIPPED>
    at <INTERNALS-STRIPPED>
    at <INTERNALS-STRIPPED>
    at java.lang.Thread.run(Unknown Source)
Caused by: java.lang.RuntimeException: java.io.IOException: invalid UTC timestamp value:
    at quickfix.FileStoreFactory.create(FileStoreFactory.java:80)
    at quickfix.Session.<init>(Session.java:467)
    at quickfix.DefaultSessionFactory.create(DefaultSessionFactory.java:185)
    at quickfix.mina.SessionConnector.createSession(SessionConnector.java:140)
    at quickfix.mina.initiator.AbstractSocketInitiator.createSessions(AbstractSocketInitiator.java:163)
    ... 7 more
Caused by: java.io.IOException: invalid UTC timestamp value:
    at quickfix.FileStore.initializeSessionCreateTime(FileStore.java:137)
    at quickfix.FileStore.initializeCache(FileStore.java:123)
    at quickfix.FileStore.initialize(FileStore.java:116)
    at quickfix.FileStore.<init>(FileStore.java:101)
    at quickfix.FileStoreFactory.create(FileStoreFactory.java:78)
    ... 11 more
I remember this is the same exception I got the first time one of our
clients' servers crashed.
Of course in an ideal world a server should never crash, but alas it does.
It would appear that the QuickFIX/J engine's state isn't properly flushed to
disk? Or maybe this happens because the engine's state cannot be written
atomically? I do not know, as I've never had the time to delve into
QuickFIX/J internals and see how this situation can arise.
Perhaps it has been reported before and fixed in a later version; I've not
had time to delve into that either. Perhaps this is something that simply
can't be fixed due to the way filesystems work? Or perhaps it can?
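On the question of whether state can be written atomically: a common general technique is to write the new contents to a temporary file, force them to disk, and then rename the temporary file over the target. The sketch below illustrates that technique only; it is not how QuickFIX/J's FileStore works, and the class AtomicStateWriter, its methods, and the demo file name are hypothetical. It also does not sync the parent directory, which some filesystems need for the rename itself to survive a power loss.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.StandardOpenOption;

/** Hypothetical illustration of write-to-temp, force, then rename. NOT QuickFIX/J code. */
final class AtomicStateWriter {

    static void writeAtomically(Path target, byte[] data) {
        // Temp file in the same directory, so the rename stays on one filesystem.
        Path tmp = target.resolveSibling(target.getFileName() + ".tmp");
        try {
            try (FileChannel ch = FileChannel.open(tmp,
                    StandardOpenOption.CREATE,
                    StandardOpenOption.WRITE,
                    StandardOpenOption.TRUNCATE_EXISTING)) {
                ByteBuffer buf = ByteBuffer.wrap(data);
                while (buf.hasRemaining()) {
                    ch.write(buf);
                }
                ch.force(true); // ask the OS to flush data and metadata to the device
            }
            // Whether an existing target is replaced is implementation-specific
            // according to the Files.move documentation; POSIX rename replaces it.
            Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Writes two values in a temp directory and returns what is on disk afterwards. */
    static String demo() {
        try {
            Path dir = Files.createTempDirectory("state-demo");
            Path target = dir.resolve("demo.senderseqnums");
            writeAtomically(target, "41".getBytes(StandardCharsets.US_ASCII));
            writeAtomically(target, "42".getBytes(StandardCharsets.US_ASCII));
            String result = new String(Files.readAllBytes(target), StandardCharsets.US_ASCII);
            Files.delete(target);
            Files.delete(dir);
            return result;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

With this pattern a reader sees either the old file or the new one, never a half-written mix, at the cost of a forced flush on every update.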
Besides this, I'm pretty happy with QuickFIX/J 1.5.3 and I wish to thank all
contributors to QuickFIX/J!
Kind regards,
--
Tom Tempelaere
Upsilon SA
|
|
From: Christoph J. <chr...@ma...> - 2018-12-06 09:26:10
|
Hi Tom,
IMHO you can always run into this kind of problem (data corruption) when the
power fails. You could try turning on synchronous writes, but this of course
will slow down the message processing. The option to turn this on is
FileStoreSync=Y. I just discovered that it is not part of the documentation.
:-/ Will add it.
Hope that helps a bit.
Cheers,
Chris.
--
Christoph John
Software Engineering
T +49 241 557080-28
chr...@ma...
MACD GmbH
Oppenhoffallee 103
52066 Aachen, Germany
www.macd.com
Aachen District Court: HRB 8151
VAT ID: DE 813021663
Managing Director: George Macdonald
|
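To make the suggestion concrete, the sketch below holds an example settings text with FileStoreSync=Y next to the existing FileStorePath=filestore, plus a tiny check that the option is present in the [DEFAULT] section. This is an illustration under assumptions: SettingsSketch and its parser are hypothetical and much simpler than QuickFIX/J's own settings handling, and the CompID values are placeholders taken from the attachment's file name.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper for illustration; NOT QuickFIX/J's settings parser. */
final class SettingsSketch {

    /** Example settings text; only FileStorePath and FileStoreSync come from the thread. */
    static final String EXAMPLE = String.join("\n",
            "[DEFAULT]",
            "ConnectionType=initiator",
            "FileStorePath=filestore",
            "FileStoreSync=Y",
            "",
            "[SESSION]",
            "BeginString=FIX.4.2",
            "SenderCompID=SenderCompID",
            "TargetCompID=TargetCompID");

    /** Collects key=value pairs under the named [section]; repeated sections are merged. */
    static Map<String, String> section(String text, String name) {
        Map<String, String> result = new LinkedHashMap<>();
        boolean inSection = false;
        for (String raw : text.split("\n")) {
            String line = raw.trim();
            if (line.startsWith("[") && line.endsWith("]")) {
                inSection = line.substring(1, line.length() - 1).equalsIgnoreCase(name);
            } else if (inSection && !line.isEmpty() && !line.startsWith("#")) {
                int eq = line.indexOf('=');
                if (eq > 0) {
                    result.put(line.substring(0, eq).trim(), line.substring(eq + 1).trim());
                }
            }
        }
        return result;
    }

    /** True when the [DEFAULT] section sets FileStoreSync=Y. */
    static boolean syncEnabled(String text) {
        return "Y".equalsIgnoreCase(section(text, "DEFAULT").get("FileStoreSync"));
    }

    public static void main(String[] args) {
        System.out.println("FileStoreSync enabled: " + syncEnabled(EXAMPLE));
    }
}
```

Placing the option in [DEFAULT] applies it to every session defined in the file; a deployment check like syncEnabled could guard against the setting being dropped during a configuration change.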
|
From: <tom...@up...> - 2018-12-06 09:51:41
Attachments:
smime.p7s
|
Hi Christoph,
Thanks for the hint :). I'll try this one. Our clients are not high-volume
traders, so a minor performance drop should still be OK.
Kind regards,
--
Tom Tempelaere
Upsilon SA
|
|
From: <tom...@up...> - 2018-12-06 11:09:36
Attachments:
smime.p7s
|
Hi Christoph,
I DuckDuckGo'd FileStoreSync and see in the JavaDoc for 1.6.4 the message
"it's safer to sync, but it's also much slower (100x slower in some cases)".
I'm wondering by how much time things are slowed down now. Are we talking
about milliseconds per FIX message on current servers? Of course it all
depends on the file subsystem (hard drive vs SSD), I know ;-)
Kind regards,
--
Tom Tempelaere
Upsilon SA
|
|
From: Christoph J. <chr...@ma...> - 2018-12-06 11:15:43
|
Hi Tom,
to be honest: I don't know. That message has been in the docs since 2006. :)
So I assume that with a recent server and an SSD the performance hit should
be substantially lower. I never measured it, but would assume some
milliseconds.
Regards, :)
Chris.
|
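Since nobody in the thread has measured the slowdown, a rough timing on the target hardware is the easiest way to get a number. The sketch below compares small appends through java.io.RandomAccessFile opened in "rw" mode against "rwd" mode, where each write also forces the file content to the storage device. It assumes that synced store writes behave roughly like "rwd"; that is an assumption about the mechanism, not something confirmed in this thread. SyncWriteTiming, the 128-byte record size, and the write count are arbitrary illustrative choices.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.util.Arrays;

/** Hypothetical micro-timing sketch; NOT a QuickFIX/J benchmark. */
final class SyncWriteTiming {

    /**
     * Appends `count` 128-byte records to a temp file opened with the given
     * RandomAccessFile mode ("rw" or "rwd") and returns the elapsed nanoseconds.
     */
    static long timeWrites(String mode, int count) {
        byte[] record = new byte[128];
        Arrays.fill(record, (byte) 'x'); // filler payload standing in for a FIX message
        try {
            File file = File.createTempFile("sync-timing", ".body");
            try (RandomAccessFile raf = new RandomAccessFile(file, mode)) {
                long start = System.nanoTime();
                for (int i = 0; i < count; i++) {
                    raf.write(record);
                }
                return System.nanoTime() - start;
            } finally {
                file.delete();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int count = 1000;
        // Results depend heavily on disk, filesystem, and write caches.
        System.out.printf("rw : %.1f us per write%n", timeWrites("rw", count) / 1000.0 / count);
        System.out.printf("rwd: %.1f us per write%n", timeWrites("rwd", count) / 1000.0 / count);
    }
}
```

A single run is noisy, so it is worth repeating the measurement a few times, and on the same volume that FileStorePath points to, before drawing conclusions about per-message cost.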
|
From: <tom...@up...> - 2018-12-06 12:15:37
Attachments:
smime.p7s
|
Hi Christoph,
OK, I'm going to try it out anyway. I don't expect much of a slowdown, and
according to the docs it should be safer to do anyway :D.
Thank you for your help!
Kind regards,
--
Tom Tempelaere
Upsilon SA
|