Hi Richard,
How are you? I read through the forum and it seems nobody has had a 'challenge' like this before: I map my I/O classes to the database using XML. For all type classes (n:1 relation) I use lazy loading (retrieveAutomatic = false), and it works fine.
However, I have one main table with five detail-tables for which I use "retrieveAutomatic = true". The class mapping the main table uses variables of the type CPersistentCollection to hold the detail tables. The performance is quite bad - how can I improve that? Do I have to split the model? Do I have to implement lazy loading for collections (and if yes, how is this done?)?
Many thanks for your help,
David
Hi David,
Can I assume that for each of the detail tables there are no further associations that are processed with retrieveautomatic="lazy" or retrieveautomatic="true"?
If there are, performance will slow down quite a bit. I'll explain why.
When you retrieve the main object the framework does a single SQL select to get the main record (1 round trip) with joins for all the 1-1 associations.
It then performs a retrieve for each of the 1:n associations (5 round trips). If the association is lazy loaded then that will be the end of it - the framework will build up the proxy objects and populate the collections. 6 trips in total.
If, on the other hand, you have retrieveAutomatic=true then the behaviour is a little different. For each of the 1:N associations a dataset is retrieved with as many records as there will be objects in the collection, and for each record in the dataset a full object is populated. If that object has its own associations to process, the framework performs another query to retrieve them, and this continues down the object tree until the full object is retrieved.
If one of your 1:N associations has 100 objects in it then we could be talking about 1+5+100 round trips. If this is the case, you can run your program in debug and watch the debug output window. You should see hundreds of SQL queries being executed (you could also look at the performance counters).
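To make the arithmetic above concrete, here is a rough sketch that counts round trips for one main object. It does not use the framework at all: EstimateRoundTrips is a hypothetical helper, and it assumes only one extra query per child object, so deeper object trees would cost more than it reports.

```vb
Imports System

Module RoundTripEstimate
    ' Hypothetical helper, not part of the framework. It applies the counting
    ' described above: 1 select for the main record, 1 retrieve per 1:n
    ' association, and (assumption) 1 further query per child object when the
    ' children have associations of their own.
    Function EstimateRoundTrips(collectionSizes As Integer(), childrenHaveAssociations As Boolean) As Integer
        Dim trips As Integer = 1              ' select for the main record
        trips += collectionSizes.Length       ' one retrieve per 1:n association
        If childrenHaveAssociations Then
            For Each size As Integer In collectionSizes
                trips += size                 ' one further query per child object
            Next
        End If
        Return trips
    End Function

    Sub Main()
        ' Five 1:n associations, one holding 100 objects, the others empty.
        Dim sizes As Integer() = {100, 0, 0, 0, 0}
        Console.WriteLine(EstimateRoundTrips(sizes, False)) ' prints 6
        Console.WriteLine(EstimateRoundTrips(sizes, True))  ' prints 106
    End Sub
End Module
```

The second figure is the 1+5+100 case; the first is the lazy-loaded case with 6 trips in total.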
The round tripping to the database is by far the slowest aspect of the system, so you might want to rethink your use of the autoretrieve. It's pretty easy to make an object fully load itself if it is only a proxy version when you access it and this delays the performance hit until it is absolutely necessary.
I hope that helps your understanding a little, and if you need more help please feel free to ask.
- Richard.
Hi Richard,
many thanks for the quick answer.
You assumed right: the model is quite complex with further 1:n and 1:1 relations and the output window shows hundreds of statements of this kind: "Cache - getting IOObject object from cache. Key...1) System.Int64: 460".
You wrote: "It's pretty easy to make an object fully load itself if it is only a proxy version when you access it and this delays the performance hit until it is absolutely necessary."
Using CPersistentObject.Find, is an object automatically loaded as a proxy if "retrieveAutomatic = false" and proxy fields are defined in the XML?
Could you give me a hint: how can I load a collection as proxy objects? (So far, I have implemented lazy loading for single objects inherited from CPersistentObject.)
Thanks and regards, David
Hi David,
Where you see "Cache - getting <classname> object from cache" in the log, you can be sure that the operation is quick: it is a cache hit, and the object retrieval is performed in memory.
The slow ones are the ones where the SQL had to be generated and executed. The generation of the SQL itself is very quick (especially if it's a repeated operation) but the execution of the SQL and the population of the dataset is slow.
The .Find() method will retrieve a complete copy of an object, and will populate all associations based on the rules you have defined. RetrieveAutomatic="true" ensures that the target object/collection is populated with fully loaded objects, while RetrieveAutomatic="false" will populate the target object/collection with proxy objects (of the toClass type).
Proxy flags come into use when using the various retrieveCriteria to shorten the time it takes to retrieve multiple objects from the database.
To load a collection as proxy objects simply mark the 1:n association as retrieveAutomatic="lazy". What you will need to do in your classes is change the get accessor for non-proxy properties. Check if the object is a proxy object or not, and if it is do a retrieve/find on itself to load the complete object before setting a return value from the property. You might also want to check the set accessor and decide what to do when setting values for non-proxy properties on a proxy copy of an object.
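The accessor change described above could look roughly like the following. This is a self-contained stand-in, not framework code: the Loan class, the _isProxy flag, LoadFully and LoadCount are all hypothetical names used for illustration, and a real class would do a retrieve/find on itself where LoadFully sets a dummy value.

```vb
Imports System

' Stand-in class for illustration only; it does not use the framework.
Public Class Loan
    Private _isProxy As Boolean = True   ' hypothetical flag: only proxy fields are loaded
    Private _name As String              ' proxy field, available immediately
    Private _amount As Decimal           ' non-proxy field, loaded on demand
    Public LoadCount As Integer = 0      ' counts full loads, for the demo only

    Public Sub New(name As String)
        _name = name
    End Sub

    Public ReadOnly Property Name() As String
        Get
            Return _name                 ' proxy property: no load needed
        End Get
    End Property

    Public Property Amount() As Decimal
        Get
            ' Non-proxy property: load the complete object first
            If _isProxy Then LoadFully()
            Return _amount
        End Get
        Set(value As Decimal)
            ' One possible choice: load fully before accepting the new value
            If _isProxy Then LoadFully()
            _amount = value
        End Set
    End Property

    Private Sub LoadFully()
        ' A real class would do a retrieve/find on itself here (assumption).
        _amount = 1000D
        _isProxy = False
        LoadCount += 1
    End Sub
End Class

Module Demo
    Sub Main()
        Dim l As New Loan("Bridge loan")
        Console.WriteLine(l.Name)        ' reads a proxy field only
        Console.WriteLine(l.LoadCount)   ' prints 0
        Dim amount As Decimal = l.Amount ' first access triggers the full load
        Console.WriteLine(l.LoadCount)   ' prints 1
    End Sub
End Module
```

Reading Name never triggers a load, while the first access to Amount does, so the cost is only paid for objects whose non-proxy data is actually used.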
- Richard.
Hi Richard,
If I set retrieveAutomatic to "lazy" I get an InvalidCastException: "Cast from string 'lazy' to type boolean is not valid". I guess setting retrieveAutomatic to lazy means setting it to false, or do I maybe have an old version?
When I set retrieveAutomatic to false, the CPersistentCollection objects holding the detail data are empty - the debugger shows a count of 0 objects. I set the "name" field of the table to proxy="true" but the objects are not loaded using cObject.Find(cObject). I initialize the CPersistentCollection as follows:
"coll = New CPersistentCollection"
"coll.ContainerObject = Me"
In the XML file the line of the table description looks like this: "<attribute name="coll" />" and the relation description is the following:
<association fromClass="1Table"
             toClass="nTable"
             target="coll"
             cardinality="oneToMany"
             retrieveAutomatic="false"
             saveAutomatic="false"
             deleteAutomatic="false"
             inverse="false">
    <entry fromAttribute="ID" toAttribute="N_ID"/>
</association>
I cannot see why this doesn't work. Is there anything missing in my code?
Greets, David
Hi David,
It looks like you are using an older version. I would recommend either updating via a release package, or getting the latest code from CVS.
CVS contains a few bug fixes since the 2.0 release and is probably the best option.
Hi Richard,
I've made some progress but it still takes me 12 minutes to load a medium complex data model. The solution seems to be to lazy load (as I read in one of your comments) a CPersistentCollection object using CMultiRetrieveCriteria. The help is not very useful to me - could you provide a code example for this so I can see which objects to use in which context?
Many thanks, David
Just to give me a feel for where your performance loss is, can you tell me:
Are you trying to load the entire object space into memory in one go?
How many objects are you actually retrieving from the database? (You can check the cache size when the load is complete.)
How many SQL statements actually get executed?
How complex is the object model (levels of inheritance, circular associations, etc.)? If you have a UML model (obviously without any incriminating names/attributes/etc.) that would be a real help.
I'll put together an example once I get a better idea of what you are doing. And I'll see what else I can do in terms of improving performance after that.
Just for your reference, some time ago I made a change to speed up performance, and it worked really well in some situations, but in others it made things 10x worse. I had to undo those changes and I want to be very careful that my next "improvement" doesn't actually downgrade the overall responsiveness of the framework.
Hi Richard,
The problem occurs if I retrieve the whole model with one find() - this takes about 12 minutes. The debug output, saved to a txt file, is 8.5 MB and contains about 10 SQL statements. The first SQL statement is huge and returns 9600 rows.
The application is mainly a calculator and needs to have all the input data at once in order to do statistics.
I have one main table, called Project with 7 1:n associations to Loans, Revenues, Expenses etc. Most of those 7 detail tables have 1:1 associations to different type tables. 3 of those detail tables have together 8 associations to a structure called Function. The Function table has one 1:1 association to VariableFunction and one 1:n association to FixFunctions. The table VariableFunction has 2 more detail tables and one type table. The circular associations in the database are not represented in the XML. Wherever possible I built referential integrity using only numeric keys.
I built lazy loading for every 1:1 association (mainly the 18 type tables) as shown in your documentation example, but in the huge first SQL statement it seems as if these type tables are loaded with the model anyway.
However, when I set retrieveAutomatic=false for the 7 detail tables, the first SQL statement only loads the Project table. Lazy loading for the 1:1 associations of the Project table works perfectly. But how can I lazy load the 1:n associations of the Project table?
I hope my explanations are not too weird ;-)
Thanks and Greets, David
A little theory first.
When the first statement is built it makes a join that includes the keys of all the tables in the other associations. This is done so that when I process a record I can easily know which other objects need to be instantiated against the main object.
If you turn off association automatic retrieval then this doesn't need to be done and the generated SQL doesn't need any joins.
Now, how to lazy-load the associations while leaving retrieve automatic off?
Try implementing something like the following:
Public Class Project
    Private _loans As CPersistentCollection
    Public Property Loans() As CPersistentCollection
        Get
            If _loans Is Nothing Then
                'Create the collection before filling it
                _loans = New CPersistentCollection
                Dim l As New Loan
                Dim rc As New CRetrieveCriteria
                rc.ClassMap = l.getClassMap()
                rc.WhereCondition.addSelectEqualTo("ProjectID", Me.ProjectID)
                Dim c As CCursor = rc.perform()
                'Load one Loan per row returned for this project
                While Not c.EOF And c.hasElements
                    l = New Loan
                    c.loadObject(l)
                    _loans.Add(l)
                    c.nextCursor()
                End While
                'Optionally add event handlers
                AddHandler _loans.ObjectAdded, AddressOf MyHandler
            End If
            Return _loans
        End Get
        Set(ByVal Value As CPersistentCollection)
            _loans = Value
        End Set
    End Property
End Class
Excuse any typos, errors, etc as I've just typed in the code from memory. In any case it should give you an idea of what needs to happen.
Also, you can obviously inherit CPersistentCollection and use a strongly typed collection if you want to.
I hope that helps a bit
- Richard.
Richard, many thanks for your help! The data loading now takes 2.5 seconds. May I suggest adding a code example like yours to the manual?
Many thanks again,
Greets David
Public ReadOnly Property Details() As CPersistentCollection
    Get
        If cDetails Is Nothing Then
            cDetails = New CPersistentCollection
            cDetails.ContainerObject = Me
            Dim dd As New IODetailData
            Dim rc As CRetrieveCriteria = New CRetrieveCriteria
            rc.ClassMap = dd.getClassMap()
            rc.WhereCondition.addSelectEqualTo("DET_WHO_ID", Me.WholeDataID)
            Dim c As CCursor = rc.perform()
            While Not c.EOF And c.hasElements
                dd = New IODetailData
                c.loadObject(dd)
                cDetails.Add(dd)
                c.nextCursor()
            End While
            'Optionally add event handlers
            'AddHandler cDetails.ObjectAdded, AddressOf MyHandler
        End If
        Return cDetails
    End Get
End Property
I'm glad it's all performing better now :-)
For me documentation is a bit of a problem. I _know_ that I should do more, but I just never seem to make it a priority. I guess I'd much rather be doing some coding than doing some writing.