Hi ,
Just discovery about JAILER this week and start test it (v4.0.14) with IBM Informix 11.50 .
I'm still learning how it works and like to do some suggestions and correction :
Suggestion: on jailer.xml for Informix map field. Is missing the binserial -> bigint .
If not included occur an error during the export , create table (2 serials fields on the same table).
Question: Where I found any documentation about the scripts (scripts folder) and if I can create them for Informix?
My objective is check/test the possibility for developer instances try active the compression over jailer_* tables and minimize the disk space consumption and have better performance.
Suggestion: I observe they use transaction during the export and there no FKs on jailer_ tables, this is really needded? Is there some way to disable this transaction and alter the jailer_ tables to RAW type (~= not logged on Oracle) where they will not have suffer control of transactions , lowering the overheading and speedup the process. (sinaxe : create RAW table)
Question: I was included the the createtableprefix , but don't have any effect. Why?
Informix have a SQL Limit Suffix too, the "FIRST #" keyword , what work like the TOP of MS SQL : select first 10 * from ....
I include it into the XML, but don't see it begin used at the database.
Question: How I know if they will if the configuration was "take" and will be used ?
This the configuration what I trying work....
<beanclass="net.sf.jailer.Configuration"><propertyname="urlPattern"value="jdbc:informix.*"/><propertyname="varcharLengthLimit"value="255"/><propertyname="typeReplacement"><map><entrykey="datetime"value="datetime year to fraction"/><entrykey="serial"value="int"/><entrykey="serial8"value="int8"/><entrykey="bigserial"value="bigint"/></map></property><propertyname="sqlLimitSuffix"value="FIRST %s"/><propertyname="sessionTemporaryTableManager"><beanclass="net.sf.jailer.database.DefaultTemporaryTableManager"><propertyname="createTablePrefix"value="CREATE RAW TABLE "/></bean></property></bean>
Regards
Cesar
Last edit: Cesar Martins 2013-05-13
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
About use the RAW tables, after few tests I able to use them.
I don't notice before use the createTablePrefix (sessionTemporaryTable) needed select the "session" option during the export :P
Anyway , I need to complete all XML otherwise I got null pointer error, sharing the XML bellow where when use the session option during export the tables are created as RAW.
Because of large row size of the table jailer_entity they grow a lot I include the parameters "extent size + next size" to better performance and avoid create a lot of extents...
<!-- for Informix --><beanclass="net.sf.jailer.Configuration"><propertyname="urlPattern"value="jdbc:informix.*"/><propertyname="varcharLengthLimit"value="255"/><propertyname="typeReplacement"><map><entrykey="datetime"value="datetime year to fraction"/><entrykey="serial"value="int"/><entrykey="serial8"value="int8"/><entrykey="bigserial"value="bigint"/></map></property><propertyname="sqlLimitSuffix"value="FIRST %s"/><propertyname="sessionTemporaryTableManager"><beanclass="net.sf.jailer.database.DefaultTemporaryTableManager"><propertyname="dmlTableReferencePrefix"value=""/><propertyname="createTablePrefix"value="CREATE RAW TABLE "/><propertyname="createTableSuffix"value=" extent size 51200 next size 25000"/><propertyname="createIndexPrefix"value="CREATE INDEX "/><propertyname="createIndexSuffix"value=""/><propertyname="indexTablePrefix"value=""/><propertyname="dropTablePrefix"value="DROP TABLE "/></bean></property></bean>
About the large row size of the table jailer_entitly , they Jailer include all columns of the table where we are exporting (and not only the PK fields) and setting NULL for the columns where aren't PK . There is someway to avoid this? this is terrible for performance!
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
thank you very much for completing the configuration. I will include it into the next release (Except the "sqlLimitSuffix"-property, which will not work this way as it is expected to be a suffix of a select statement. It's used by the data browser to limit the number of rows shown in a table browser. Without it, the data browser queries for all rows but only fetches the first ones)
The jailer_entity table has columns for each PK-column of all tables in the data model. (PK-columns of the same type are mapped onto the same jailer_entity column). If you have composite PKs, it might be worth trying to set the property "minimize-UPK" to "true:
<!-- If minimize-UPK is true, the UPK don't preserve order. This minimizesthe size of the UPK.--><beanname="minimize-UPK"class="java.lang.Boolean"><!--set value="true" if you use MySQL and get the following exception:"java.sql.SQLException: Too many key parts specified; max 16 partsallowed" --><constructor-argvalue="true"/></bean>
HTH,
Ralf
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
Hi Ralf ,
Thanks for your answer.
I'm still learning how work with Jailer and I like a lot so far. Congratulations for this excellent work!
About the sqlLimitSuffix , now I understand where they apply, but since the Informix have it why not keep it? (informix syntax: select first 10 * from xyz)
About the RAW table I use the sessionTemporaryTableManager because I don't discovery other way to force use it , but RAW tables aren't effective temporary tables they are non log tables. Honestly I don't try use temporary tables because on Informix temporary table are restricted to their session and how Jailer open multiple sessions I suppose they will not work since each session will not seen the temporary table of other session (sintaxe: create temp table xyz (....) with no log , the others DDL don't change).
I try open a feature request ticket but I think is restrict to you only.
So, here is my suggestion :
Include a option to use the "truncate table xyz" when cancel an export.
I have situations here , where when I try export only 1 month of our orders the jailer_* tables grown +1GB and if I cancel... take few minutes deleting it (or rollbacking).
Since the tables are dropped /created each export, why not just truncate them?
(at least when used the session export option).
The unique restriction is should not have anyone accessing the tables.
This will work with mostly of databases, not only informix.
About the UPK columns, I already try this option and appear not have effect , but I do not check with proper care , so I will test again .
Just a note, I suspect the jailer is including all columns of the table exported, not only the PKs... I will test and return here with this details.
Regards
Cesar
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
The RAW table can be an alternative for non session tables, if you want to implement or just keep commented into the jailer.xml as alternative option for "session" options...
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
When I export 5 tables where for each table PKs have only 1 field (serial datatype) and a relation between them by this PKs , this is how the jailer_entity are created ... there a lot of unused columns there.
2013-05-14 22:35:32,989 [jailer-main] INFO - CREATE RAW TABLE JAILER_ENTITY_T ( r_entitygraph INTEGER NOT NULL, PK0 char(15) , PK1 int , PK2 int , PK3 char(20) , PK4 char(6) , PK5 date , PK6 smallint , PK7 smallint , PK8 smallint , PK9 int , PK10 char(8) , PK11 char(8) , PK12 date , PK13 int , PK14 datetime year to fraction , PK15 char(18) , PK16 char(13) , PK17 bigint , PK18 decimal(22) , PK19 date , PK20 int , PK21 char(12) , PK22 varchar(60) , PK23 varchar(38) , PK24 varchar(38) , PK25 decimal(1) , PK26 int , PK27 date , birthday INTEGER NOT NULL, type VARCHAR(120) NOT NULL, PRE_PK0 char(15) , PRE_PK1 int , PRE_PK2 int , PRE_PK3 char(20) , PRE_PK4 char(6) , PRE_PK5 date , PRE_PK6 smallint , PRE_PK7 smallint , PRE_PK8 smallint , PRE_PK9 int , PRE_PK10 char(8) , PRE_PK11 char(8) , PRE_PK12 date , PRE_PK13 int , PRE_PK14 datetime year to fraction , PRE_PK15 char(18) , PRE_PK16 char(13) , PRE_PK17 bigint , PRE_PK18 decimal(22) , PRE_PK19 date , PRE_PK20 int , PRE_PK21 char(12) , PRE_PK22 varchar(60) , PRE_PK23 varchar(38) , PRE_PK24 varchar(38) , PRE_PK25 decimal(1) , PRE_PK26 int , PRE_PK27 date , PRE_TYPE VARCHAR(120), orig_birthday INTEGER, association INTEGER ) extent size 51200 next size 25000
as consequence , this occur too :
2013-05-14 22:40:52,954 [jailer-main] INFO - CREATE INDEX jlr_enty_upk_T1 ON JAILER_ENTITY_T (r_entitygraph , PK0, PK1, PK2, PK3, PK4, PK5, PK6, PK7, PK8, PK9, PK10, PK11, PK12, PK13, PK14, PK15, PK16, PK17, PK18, PK19, PK20, PK21, PK22, PK23, PK24, PK25, PK26, PK27, type)
2013-05-14 22:40:52,954 [jailer-main] ERROR - Error executing statement
java.sql.SQLException: The total size of the index is too large or too many parts in index.
When this error occur , the jailer recreate the tables and create one index of each column PK* .
Changing the minimize-UPK the same error above occur but they create only the first 15 indexes..
If help, I can send to you the sql.log of both situations (with minimize-upk enabled and disabled).
Question: There is some way to be receive an email when any answer are posted here ?
I don't receive nothing...
Regards
Cesar
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
the columns of jailer_entity table are a superset of the PK columns of all tables in the data model (not only the tables you export rows from). You might have at least one table with many PK columns?! If you like, you can send me the data model (table.csv and column.csv in the data model folder)
Regards,
Ralf
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
hmm... I don't understand the concept/reason for that (all PKs in one table).
I use the "Analyse Database" to get my model and become a lot of tables with PK (and a lot without PKs, is a very old system) ... since all tables are centralized at one "schema" . At informix "world" isn't common the concept of schema like oracle, is very common have all tables over just one owner.
I pretty sure will have some tables where the PK have up to 10 columns...
(this is a very old system)
I don't know if affect some concept/logic or how hard is to implement, but I believe will be a great optimization (execution) and smooth use only the PKs involved on the export .
Since this data model is from the system company, is there some way to send to you in private ?
Regards
Cesar
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
Couldn't you remove all tables from jailer's data model, that are not relevant for your subsetting task? Within the "DataModel->Data Model Editor", select all tables you want to remove (shift/ctrl for selecting multiple items) and press "delete". All associations of these tables will be automatically removed too.
HTH,
Ralf
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
Yes, I can do that.
The problem is, if later I need some table what I removed before and I execute the analyse again, they will include allll tables again... :(
Here is other feature request where could be a workaround for me and easy do implement :
Include a option to filter table name on "Analyse Database" (LIKE statement or regular expression)
Last edit: Cesar Martins 2013-05-15
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
there is a file "exclude-tables.csv" in the data model folder. List all tables that you don't want to be part of your model here, one table per line.
Use exactly the same table names as in the first column of the "tables.csv" file.
HTH,
Ralf
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
I forgot .
I discovery how configure to use the scripts , where I include a script to update statistics .
I include the avoid left join option because for personal experience (I'm DBA), ansi SQL can restrict the database to choose the best way to execute a SQL sometimes.
But honestly, I on my tests I don't see any significant gain so far with this option enabled. But I prefer keep it .
<!-- for Informix --><beanclass="net.sf.jailer.Configuration"><propertyname="urlPattern"value="jdbc:informix.*"/><propertyname="varcharLengthLimit"value="255"/><propertyname="typeReplacement"><map><entrykey="datetime"value="datetime year to fraction"/><entrykey="serial"value="int"/><entrykey="serial8"value="int8"/><entrykey="bigserial"value="bigint"/></map></property><propertyname="statisticRenovator"><beanclass="net.sf.jailer.database.SqlScriptBasedStatisticRenovator"><constructor-argvalue="script/informix/update_statistics.sql"/></bean></property><propertyname="avoidLeftJoin"value="true"/><propertyname="sqlLimitSuffix"value="FIRST %s"/><propertyname="sessionTemporaryTableManager"><beanclass="net.sf.jailer.database.DefaultTemporaryTableManager"><propertyname="dmlTableReferencePrefix"value=""/><propertyname="createTablePrefix"value="CREATE RAW TABLE "/><propertyname="createTableSuffix"value=" extent size 51200 next size 25000"/><propertyname="createIndexPrefix"value="CREATE INDEX "/><propertyname="createIndexSuffix"value=""/><propertyname="indexTablePrefix"value=""/><propertyname="dropTablePrefix"value="DROP TABLE "/></bean></property></bean>
where the update_statiscs.sql contains :
update statistics high for table ${JAILER_ENTITY};
update statistics high for table ${JAILER_GRAPH};
update statistics high for table ${JAILER_DEPENDENCY};
update statistics high for table ${JAILER_SET};
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
First, just to tell, I remove the "extra" tables from my data model and the columns over jailer_entity lower considerable.
Now my suggestion:
Looking the SQLs produced by Jailer I believe could be a great optimization during the collect rows at export process use the MERGE statement, which is supported by Informix, Oracle , DB2, MSSQL at least.
This command able you to get the result of a SELECT and "apply" over a table with choice to insert or update/delete the records what not match or match.
I already improve the performance for a lot of programs and batch process using it with Informix and Oracle , is very useful statement.
Think about...
Insert into JAILER_ENTITY_T (r_entitygraph, PK1, PK3, birthday, type)
Select distinct 1450032645 as GRAPH_ID
, A.cod AS PK1
, A.sequen AS PK3
, 3 AS BIRTHDAY
, 'x_ccc' AS TYPE
From x_ccc A
, x_bbb B
, JAILER_ENTITY_T E
Where (
E.r_entitygraph = 1450032645
and E.birthday = 2
and E.type = 'x_bbb'
and E.PK0 is null
and E.PK1 is null
and E.PK2 is null
and E.PK3 is null
and E.PK4 is null
and E.PK5 is null
and E.PK6 is null
and E.PK7 = B.cod
and E.PK8 is null
and E.PK9 is null
and E.PK10 is null
and E.PK11 is null
and E.PK12 is null
)
and ((A.cod = B.cod))
AND NOT EXISTS (
select *
from JAILER_ENTITY_T DuplicateExists
where r_entitygraph = 1450032645
AND DuplicateExists.type = 'x_ccc'
and DuplicateExists.PK0 is null
and DuplicateExists.PK1 = A.cod
and DuplicateExists.PK2 is null
and DuplicateExists.PK3 = A.sequen
and DuplicateExists.PK4 is null
and DuplicateExists.PK5 is null
and DuplicateExists.PK6 is null
and DuplicateExists.PK7 is null
and DuplicateExists.PK8 is null
and DuplicateExists.PK9 is null
and DuplicateExists.PK10 is null
and DuplicateExists.PK11 is null
and DuplicateExists.PK12 is null
)
if I understand right the select above, this should be the merge statement similar to run where probably will get better performance (much better).
Maybe have some adjust to do , because this SQL is very tricky.
MERGE INTO JAILER_ENTITY_T as jt
USING (
Select distinct 1450032645 as GRAPH_ID
, A.cod AS PK1
, A.sequen AS PK3
, 3 AS BIRTHDAY
, 'x_ccc' AS TYPE
From x_ccc A
, x_bbb B
, JAILER_ENTITY_T E
Where (
E.r_entitygraph = 1450032645
and E.birthday = 2
and E.type = 'x_bbb'
and E.PK0 is null
and E.PK1 is null
and E.PK2 is null
and E.PK3 is null
and E.PK4 is null
and E.PK5 is null
and E.PK6 is null
and E.PK7 = B.cod
and E.PK8 is null
and E.PK9 is null
and E.PK10 is null
and E.PK11 is null
and E.PK12 is null
)
and ((A.cod = B.cod))
) as tab
ON jt.r_entitygraph = tab.graph_id
and type = 'x_ccc'
and e.birthday = 2
WHEN NOT MATCHED INSERT (r_entitygraph, PK1, PK3, birthday, type) values (tab.graph_id, tab.pk1, tab.pk3, tab.birthday, tab.type)
;
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
Hi ,
Just discovery about JAILER this week and start test it (v4.0.14) with IBM Informix 11.50 .
I'm still learning how it works and like to do some suggestions and correction :
If not included occur an error during the export , create table (2 serials fields on the same table).
My objective is check/test the possibility for developer instances try active the compression over jailer_* tables and minimize the disk space consumption and have better performance.
I include it into the XML, but don't see it begin used at the database.
This the configuration what I trying work....
Regards
Cesar
Last edit: Cesar Martins 2013-05-13
Hi ,
About use the RAW tables, after few tests I able to use them.
I don't notice before use the createTablePrefix (sessionTemporaryTable) needed select the "session" option during the export :P
Anyway , I need to complete all XML otherwise I got null pointer error, sharing the XML bellow where when use the session option during export the tables are created as RAW.
Because of large row size of the table jailer_entity they grow a lot I include the parameters "extent size + next size" to better performance and avoid create a lot of extents...
About the large row size of the table jailer_entitly , they Jailer include all columns of the table where we are exporting (and not only the PK fields) and setting NULL for the columns where aren't PK . There is someway to avoid this? this is terrible for performance!
Hi,
thank you very much for completing the configuration. I will include it into the next release (Except the "sqlLimitSuffix"-property, which will not work this way as it is expected to be a suffix of a select statement. It's used by the data browser to limit the number of rows shown in a table browser. Without it, the data browser queries for all rows but only fetches the first ones)
The jailer_entity table has columns for each PK-column of all tables in the data model. (PK-columns of the same type are mapped onto the same jailer_entity column). If you have composite PKs, it might be worth trying to set the property "minimize-UPK" to "true:
HTH,
Ralf
Hi Ralf ,
Thanks for your answer.
I'm still learning how work with Jailer and I like a lot so far. Congratulations for this excellent work!
About the sqlLimitSuffix , now I understand where they apply, but since the Informix have it why not keep it? (informix syntax: select first 10 * from xyz)
About the RAW table I use the sessionTemporaryTableManager because I don't discovery other way to force use it , but RAW tables aren't effective temporary tables they are non log tables. Honestly I don't try use temporary tables because on Informix temporary table are restricted to their session and how Jailer open multiple sessions I suppose they will not work since each session will not seen the temporary table of other session (sintaxe: create temp table xyz (....) with no log , the others DDL don't change).
I try open a feature request ticket but I think is restrict to you only.
So, here is my suggestion :
Include a option to use the "truncate table xyz" when cancel an export.
I have situations here , where when I try export only 1 month of our orders the jailer_* tables grown +1GB and if I cancel... take few minutes deleting it (or rollbacking).
Since the tables are dropped /created each export, why not just truncate them?
(at least when used the session export option).
The unique restriction is should not have anyone accessing the tables.
This will work with mostly of databases, not only informix.
About the UPK columns, I already try this option and appear not have effect , but I do not check with proper care , so I will test again .
Just a note, I suspect the jailer is including all columns of the table exported, not only the PKs... I will test and return here with this details.
Regards
Cesar
Temporary tables are restricted to their session on most DBMS. Jailer opens a single session if it uses the "Session" working table scope.
It's a good idea not to delete from temporary tables when an export is canceled. I will implement this feature in the next release.
Ok , in this case the correct configuration on Informix to use the Temporary table is bellow:
I tested successfully this configuration where when I look at the session on the database I see the temporary tables created :
The RAW table can be an alternative for non session tables, if you want to implement or just keep commented into the jailer.xml as alternative option for "session" options...
Hi Ralf,
When I export 5 tables where for each table PKs have only 1 field (serial datatype) and a relation between them by this PKs , this is how the jailer_entity are created ... there a lot of unused columns there.
as consequence , this occur too :
When this error occur , the jailer recreate the tables and create one index of each column PK* .
Changing the minimize-UPK the same error above occur but they create only the first 15 indexes..
If help, I can send to you the sql.log of both situations (with minimize-upk enabled and disabled).
Question: There is some way to be receive an email when any answer are posted here ?
I don't receive nothing...
Regards
Cesar
Hi,
the columns of jailer_entity table are a superset of the PK columns of all tables in the data model (not only the tables you export rows from). You might have at least one table with many PK columns?! If you like, you can send me the data model (table.csv and column.csv in the data model folder)
Regards,
Ralf
hmm... I don't understand the concept/reason for that (all PKs in one table).
I use the "Analyse Database" to get my model and become a lot of tables with PK (and a lot without PKs, is a very old system) ... since all tables are centralized at one "schema" . At informix "world" isn't common the concept of schema like oracle, is very common have all tables over just one owner.
I pretty sure will have some tables where the PK have up to 10 columns...
(this is a very old system)
I don't know if affect some concept/logic or how hard is to implement, but I believe will be a great optimization (execution) and smooth use only the PKs involved on the export .
Since this data model is from the system company, is there some way to send to you in private ?
Regards
Cesar
Couldn't you remove all tables from jailer's data model, that are not relevant for your subsetting task? Within the "DataModel->Data Model Editor", select all tables you want to remove (shift/ctrl for selecting multiple items) and press "delete". All associations of these tables will be automatically removed too.
HTH,
Ralf
Yes, I can do that.
The problem is, if later I need some table what I removed before and I execute the analyse again, they will include allll tables again... :(
Here is other feature request where could be a workaround for me and easy do implement :
Include a option to filter table name on "Analyse Database" (LIKE statement or regular expression)
Last edit: Cesar Martins 2013-05-15
Hi,
there is a file "exclude-tables.csv" in the data model folder. List all tables that you don't want to be part of your model here, one table per line.
Use exactly the same table names as in the first column of the "tables.csv" file.
HTH,
Ralf
I forgot .
I discovery how configure to use the scripts , where I include a script to update statistics .
I include the avoid left join option because for personal experience (I'm DBA), ansi SQL can restrict the database to choose the best way to execute a SQL sometimes.
But honestly, I on my tests I don't see any significant gain so far with this option enabled. But I prefer keep it .
where the update_statiscs.sql contains :
Thanks for the update-statistics script. I will include it into the next release too.
Hi Ralf,
First, just to tell, I remove the "extra" tables from my data model and the columns over jailer_entity lower considerable.
Now my suggestion:
Looking the SQLs produced by Jailer I believe could be a great optimization during the collect rows at export process use the MERGE statement, which is supported by Informix, Oracle , DB2, MSSQL at least.
This command able you to get the result of a SELECT and "apply" over a table with choice to insert or update/delete the records what not match or match.
I already improve the performance for a lot of programs and batch process using it with Informix and Oracle , is very useful statement.
Think about...
if I understand right the select above, this should be the merge statement similar to run where probably will get better performance (much better).
Maybe have some adjust to do , because this SQL is very tricky.
Hi Cesar,
thank you very much for your suggestion. I've opened a feature request for it.
Regards,
Ralf