Good morning everyone,
I'm having a problem with the routine that creates batches in the sym_outgoing_batch table.
Sporadically this routine stops working, so no batches are created and consequently nothing is sent to the nodes.
When I run a select on the sym_data table the changes are there, but nothing appears in sym_data_event or sym_outgoing_batch.
After the service is restarted, the batches are created and sent normally.
I cannot understand why this routine stops.
Here's my properties file.
Last edit: Luiz Manoel Maia de Oliveira 2017-12-06
Try running the following to see if this helps. It sounds like the routing job is getting stuck or not running at all. Data gaps are computed during this routing process and if they are corrupted it might cause what you are seeing. You can delete these and it will recalculate them.
delete from sym_data_gap
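Before deleting the gaps, it may help to confirm that changes really are captured but not yet routed. The following is a minimal sketch, not an official SymmetricDS tool: it uses Python's sqlite3 with an in-memory stand-in schema (only the data_id, table_name, and batch_id columns) so that it runs on its own. The SELECT is the part that should carry over to the real server database, possibly with adjustments for your platform.

```python
import sqlite3

# Stand-in schema with only the columns this check needs
# (an assumption for illustration, not the full SymmetricDS DDL).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sym_data (data_id INTEGER PRIMARY KEY, table_name TEXT);
    CREATE TABLE sym_data_event (data_id INTEGER, batch_id INTEGER);
    INSERT INTO sym_data VALUES (1, 'customer'), (2, 'customer'), (3, 'orders');
    INSERT INTO sym_data_event VALUES (1, 100);
""")

# Captured changes that have no matching row in sym_data_event.
UNROUTED_SQL = """
    SELECT d.data_id, d.table_name
    FROM sym_data d
    LEFT JOIN sym_data_event e ON e.data_id = d.data_id
    WHERE e.data_id IS NULL
    ORDER BY d.data_id
"""

def unrouted_rows(connection):
    """Return captured changes that routing has not yet assigned to a batch."""
    return connection.execute(UNROUTED_SQL).fetchall()

print(unrouted_rows(conn))
```

If this list keeps growing while the service is running, routing is probably not processing new data.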
Josh,
I did what you suggested.
I'll evaluate it and then give you feedback.
Thanks!
Hi Josh,
could you help me understand why a lot of batches in the sym_outgoing_batch table sometimes stay in RT status?
When this happens, the sync does not work and the user is left stuck in a loop.
I noticed that it loops during the initial load. At the end of the load, SymmetricDS seems to discard it and redo it, so it never finishes.
Last edit: Luiz Manoel Maia de Oliveira 2017-12-21
Josh,
after doing some checks in the DB, I realized that this happens when something in the DB is locked.
Josh,
Do you have any other ideas? The problem still happens sporadically.
Even after deleting from sym_data_gap, nothing appears in sym_outgoing_batch.
Luiz,
RT means that a batch is in "Routing" status. Routing is the process where SymmetricDS reads through sym_data and puts the data into batches (i.e., creates the sym_data_event and sym_outgoing_batch rows).
If a table is getting locked in your DB during routing, that would explain why the routing gets stuck. Some questions:
Does this happen at a certain time of day?
What DB platform are you running?
I would try to isolate which DB tables are locked for you, and where the contention is. We had some users report contention between the Symmetric routing and purge jobs (around midnight by default). In that case, you can use cron scheduling to make sure those jobs do not run at the same time.
Mark
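To see which nodes have batches sitting in RT, a count per node can be useful. This is a rough sketch under stated assumptions: the sqlite3 in-memory table below is a cut-down stand-in for sym_outgoing_batch (batch_id, node_id, and status only), and the node ids and rows are made up for illustration. Only the GROUP BY query is meant to be reused against the real database.

```python
import sqlite3

# Cut-down stand-in for sym_outgoing_batch (illustrative rows, not real data).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sym_outgoing_batch (batch_id INTEGER, node_id TEXT, status TEXT);
    INSERT INTO sym_outgoing_batch VALUES
        (1, '001', 'OK'), (2, '001', 'RT'), (3, '001', 'RT'), (4, '002', 'OK');
""")

def batches_in_status(connection, status):
    """Return {node_id: count} for batches currently in the given status."""
    cursor = connection.execute(
        "SELECT node_id, COUNT(*) FROM sym_outgoing_batch "
        "WHERE status = ? GROUP BY node_id ORDER BY node_id",
        (status,),
    )
    return dict(cursor.fetchall())

print(batches_in_status(conn, "RT"))
```

Running the same count a few minutes apart shows whether the RT batches for a node are moving on or staying stuck.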
Mark,
I am monitoring the database daily, and the strangest thing is that the events are sporadic.
There is no specific time of occurrence, and whenever the problem shows up I check whether anything is locked in the DB.
In the last event (yesterday) there was no lock in the DB and the other nodes were working perfectly; only one node was not working.
So any change that should have gone to this node did not reach sym_outgoing_batch. I changed some data for this node; it went to sym_data, but no batch was created in sym_outgoing_batch.
I cleaned sym_data_gap and even then nothing happened. After I restarted SymmetricDS, it picked up the changes and created the batches in sym_outgoing_batch.
My server runs Informix and the nodes run SQLite (Android).
You can see my properties file attached to the first post.
Hmm.. It does sound like something got locked up in the DB. Can you share a log file from when it happened?
Mark,
the log file is attached.
The service stopped and I tried everything: I cleaned sym_data_gap and nothing happened.
Only when I restarted the service were the batches created in sym_outgoing_batch.
The log is hinting at problems with data gaps. Try setting this in your properties file:
routing.detect.invalid.gaps=true
I thought we had fixed problems with gaps, but enabling this property will cause it to fix any data gap problems.
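If the properties file is managed by a script rather than edited by hand, the change could be applied along these lines. This is only a sketch: set_property is a hypothetical helper, not part of SymmetricDS, and it assumes simple key=value lines with # or ! comments rather than the full Java properties syntax. The sample content is made up.

```python
def set_property(text, key, value):
    """Return properties text with key set to value.

    Replaces an existing uncommented key=value line, or appends a new one.
    """
    out, found = [], False
    for line in text.splitlines():
        stripped = line.strip()
        is_comment = stripped.startswith(("#", "!"))
        if not is_comment and stripped.split("=", 1)[0].strip() == key:
            out.append(f"{key}={value}")
            found = True
        else:
            out.append(line)
    if not found:
        out.append(f"{key}={value}")
    return "\n".join(out) + "\n"

# Made-up sample file content for illustration.
sample = "engine.name=server\n#routing.detect.invalid.gaps=false\n"
print(set_property(sample, "routing.detect.invalid.gaps", "true"))
```

Commented-out entries are left untouched, so an old disabled line does not hide the new setting.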
OK Eric, I will try.
Thanks.