From: Ford, K. <kf...@co...> - 2007-04-16 16:51:27
Dear All,

I wanted to share something I worked through this morning. It concerns the time it takes for the page to render when viewing an item in a collection.

Last week I started implementing VRA Core 4, a rather involved and complex XML metadata format, with Fez. As the project progressed, viewing an item in the collection that employs VRA Core 4 became slower and slower: 52 seconds for the page to render (I have Fez 1.3 on my workstation, a Pentium D 3.4 GHz with 2 GB RAM running Win XP; Fedora 2.1.1 and MySQL 5.x are also on my machine). A record with a very simple DC record renders in 2 seconds.

Trawling through the code to see precisely where the time sink was (my hunch was that it had to do with handling the VRA Core XSD and XML datastream), I discovered multiple calls to the record->checkExists function from view2.php and class.record.php (within the getXmlDisplayId function). Placing checks throughout the code, I noted that every call to checkExists took about 16 seconds to return a result.

So, I added a variable to the RecordGeneral class to store the result of checkExists, which is still called once at the top of view2.php (the variable holds the result after that). It still takes 16 or so seconds to receive the result from the first (and only) call to checkExists, but the page finishes rendering 3 seconds after that call, in 19 seconds.

Clearly, this is not (necessarily) a typical example. As I said, an object with a simple DC record renders much, much quicker. Nevertheless, it seems that the multiple calls to checkExists are unnecessary and hurt performance (by more than 30 seconds in this example).

Also, provided there is a PID (and the record exists), a RecordObject is created twice in view2.php.
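
The caching idea described above can be sketched roughly as follows. This is a minimal illustration, not the actual Fez code: the class name `CachedRecord`, the `$lookup` callable, and the `$lookupCount` counter are all hypothetical stand-ins, and the slow Fedora call is replaced by a stub. It is written for current PHP (the `callable` type hint needs PHP 5.4 or later).

```php
<?php
// Hypothetical stand-in for Fez's RecordGeneral class; not the real implementation.
class CachedRecord
{
    private $pid;
    private $lookup;          // callable that performs the slow existence check
    private $exists = null;   // null means "not checked yet"
    public $lookupCount = 0;  // how many times the slow check actually ran

    public function __construct($pid, callable $lookup)
    {
        $this->pid = $pid;
        $this->lookup = $lookup;
    }

    // Runs the slow check on the first call only; later calls reuse the stored result.
    public function checkExists()
    {
        if ($this->exists === null) {
            $this->lookupCount++;
            $this->exists = (bool) call_user_func($this->lookup, $this->pid);
        }
        return $this->exists;
    }
}

// Stub in place of the real Fedora request (assumption for illustration only).
$record = new CachedRecord('demo:1', function ($pid) {
    return $pid === 'demo:1';
});

$first  = $record->checkExists();  // performs the lookup
$second = $record->checkExists();  // served from the stored result

echo $record->lookupCount, "\n";   // prints 1
```

The saving only applies when the same record instance is reused for every check during the request; creating a second object, as view2.php reportedly does, would repeat the lookup.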

Hope this may help,

Kevin

------------------------------------------------
Kevin Ford
Digital Services Specialist
Columbia College Chicago Library
624 S. Michigan Avenue
Chicago, IL 60605
Tel: 312 344 8568
Email: kf...@co...
From: Ford, K. <kf...@co...> - 2007-04-16 17:24:48
Dear All,

On a lark, to see whether I could speed up checkExists further, I changed the objectExists function in class.fedora_api.php.

I modified the SOAP call from getObjectXML to getDatastream.

I commented out:

    $parms = array('pid' => $pid);
    $result = Fedora_API::openSoapCall('getObjectXML', $parms, false);

And inserted:

    // Shaking things up - try to only get the DC part of the datastream -
    // might be faster and, last I checked, FOXML requires a DC datastream,
    // and Fedora will create it if it doesn't exist
    $parms = array('pid' => $pid, 'dsID' => 'DC');
    $result = Fedora_API::openSoapCall('getDatastream', $parms, false);

Page now renders in 4 seconds. I will report, of course, if I've broken anything. So far though, nothing.

Warmly,

Kevin
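
One way to check whether a change like the one in this message actually helps is to time both variants of the existence check in isolation. The sketch below is illustrative only: `timeCheck` and the two stub closures are hypothetical, and the stubs merely sleep to imitate a slow and a fast call rather than talking to Fedora.

```php
<?php
/**
 * Run $check once for $pid and return array(result, elapsed seconds).
 */
function timeCheck(callable $check, $pid)
{
    $start = microtime(true);
    $result = call_user_func($check, $pid);
    return array($result, microtime(true) - $start);
}

// Stubs standing in for the two Fedora requests (assumptions, not real calls).
// usleep() only simulates the cost difference between the two approaches.
$viaWholeObject = function ($pid) {
    usleep(20000); // pretend that fetching the whole object XML is slow
    return true;
};
$viaDcOnly = function ($pid) {
    usleep(1000);  // pretend that fetching only the DC datastream is fast
    return true;
};

list($existsA, $secondsA) = timeCheck($viaWholeObject, 'demo:1');
list($existsB, $secondsB) = timeCheck($viaDcOnly, 'demo:1');

printf("whole object: %.3fs, DC only: %.3fs\n", $secondsA, $secondsB);
```

In a real test the closures would wrap the two versions of objectExists, and both should be run against the same object several times, since a single measurement can be skewed by caching or server load.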
From: Lynette R. <el...@cs...> - 2007-04-16 19:06:40
In a separate Fedora project, we discovered that using versioning slowed down the retrieval process, especially if the control group for the versioned datastream is X (internally managed XML). If checkExists is taking 16 seconds to complete and you have updated the object many times with versioning on, you might want to weigh the advantages of versioning against the performance gain from turning versioning off. You can also get a performance boost by using control group M (managed content) for XML content instead of X (internally managed XML).

For any who may not be familiar with how Fedora stores datastreams, the reason this happens is that the object FOXML holds the metadata for each datastream version. In the case of internally managed XML, the object FOXML also holds the XML value of each datastream version. The object FOXML can grow quite large if there have been lots of updates to an internally managed XML datastream.

Lynette
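
A back-of-the-envelope model of the growth described above, assuming each version of a control group X datastream adds a full copy of its XML plus a fixed amount of per-version metadata to the FOXML, while a control group M datastream adds only the metadata. The function name and all byte counts are illustrative assumptions, not measurements from any Fedora installation.

```php
<?php
/**
 * Estimate the size of an object's FOXML in bytes.
 *
 * $inline = true  models control group X (XML content kept inside the FOXML).
 * $inline = false models control group M (only per-version metadata kept in the FOXML).
 */
function estimateFoxmlBytes($baseBytes, $versions, $metadataBytes, $contentBytes, $inline)
{
    $perVersion = $metadataBytes + ($inline ? $contentBytes : 0);
    return $baseBytes + $versions * $perVersion;
}

// Illustrative numbers: a 15 KB metadata record saved 200 times.
$inlineSize  = estimateFoxmlBytes(2000, 200, 300, 15000, true);
$managedSize = estimateFoxmlBytes(2000, 200, 300, 15000, false);

printf("X: %d bytes, M: %d bytes\n", $inlineSize, $managedSize);
// prints "X: 3062000 bytes, M: 62000 bytes"
```

Under these assumptions the inline object is about fifty times larger, which is consistent with an existence check that has to fetch the whole object XML slowing down as edits accumulate.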
From: Ford, K. <kf...@co...> - 2007-04-17 15:05:49
Thanks, Lynette, for the reminder. I know that with versioning on the Fedora object continues to grow with each update, but I completely forgot about that aspect since one doesn't readily see the growth (I was looking at the Fedora object through the web interface, not the raw FOXML that can be accessed through the fedora-admin tool). And, indeed, I had been reworking the one record repeatedly to get Fez to do what I want with it. I'll have to consider the merits of versioning, at least for some datastreams.

This morning, I removed all the changes I had made to Fez, deleted the Fedora object, recreated it, and then imported it into Fez. The page renders quickly, in less than 4 seconds. I then reimplemented my changes to Fez and, with the Fedora object still unmodified, the time difference is negligible (2-3 tenths of a second). (Nevertheless, for the time being, I like the idea of looking only for the DC record in the checkExists function because it still seems more efficient to me, but I would welcome any thoughts on the matter from other Fez users.)

Warmly,

Kevin
From: Christiaan K. <c.k...@li...> - 2007-04-17 22:46:36
Hi Kevin

Yes, I have also noticed that objects with very large version trails can perform a lot slower. I'm testing Fedora 2.2 at the moment (and finding some undocumented API changes). 2.2 now has the ability to turn versioning off per datastream (as well as general performance improvements), so either way performance should improve with Fez and Fedora 2.2.

There may be a way we can improve checkExists if that seems to be the bottleneck you are hitting, e.g. make it use a different (faster) Fedora API call to do the checking.

Cheers,
Christiaan

--
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Christiaan Kortekaas
Senior Library Systems Programmer
Library Technology Service
The University of Queensland, Australia QLD 4072
Telephone : (+61) (7) 3346 4337
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~