Menu

Performance

2007-04-24
2013-06-04
  • chris gilbert

    chris gilbert - 2007-04-24

    On our site, publishing times have suddenly started getting worse and worse - even small multipart articles are now taking over a minute to publish, and our gatekeepers are fed up with frequent deadlocks (caused by publishing several articles in different windows) and publishing that seems to take hours.

    I have been looking into this and tracked the performance hit to updates on dnm_object_1_granted_context table executed in the change_context procedure. As far as I can tell, there is little that can be done to speed up the sql (though having said that I can't entirely get my head around the way that the start with...connect by...prior syntax is applied here).

    However, the number of entries added to the dnm tables is pretty liberal at the moment, as is the number of context updates. It is possible to completely avoid many calls to change_context, and I am fairly sure that dnm entries are only really required for objects that have a permission context or have permissions granted directly - ie objects that have any kind of access control presence.

    I have been experimenting on my test instance, and have got publish times on a three section multipart article down from 72 seconds to 6.5 seconds. Tests so far show no adverse effects.

    Changes are:

    PermissionManager - in setContext method, only set context if it has changed - the persistence layer doesn't always realise when an entry hasn't been modified, because of entries made by triggers.

    Triggers - currently entries are added to dnm_object_1_granted_context, and change_context is procedure called, for every acs_object created. Changed to only add entries when context is set or when permission added.

    Permission check sql in permissions.pdl - small change to queries required for some permission checks - eg on site nodes (objects with no context/privileges have 0 as their context in dnm tables - 0 is an object with all privileges granted to site-wide-admin). The change has had no impact on the duration of permission checks - fortunately - the whole point of the dnm tables is to ensure permission checks are quick.

    delete all existing entries from dnm_object_1_granted_context where the associated object_context entry has null context and there are no entries in acs_permissions (that's 145000 out of 173000 entries on my test instance!!) 

    Minor additional changes

    ContentItem - don't automatically set context to the content section on all items - it never remains as that, and it seems to be an unneccesary overhead

    Article text - don't set article as parent of text - the relationship is already recorded in text_pages, and setting parent means adding an unneccessary object_context with associated overhead.

    Get rid of surplus acs_objects with associated object_context and dnm entries - lifecycles and notifications can optionally be deleted, I have added config parameter to specify whether you want them deleted when complete.

    I am going through checking for any horrible errors I may have introduced, and will eventually submit a patch when I am confident that everything is working. I have only made changes to Oracle versions of the triggers - I am not sure about writing triggers in postgres sql. Without these changes there are only minor benefits for pg users - the change to PermissionManager speeds things up a litle bit.

    Chris.......

    If anyone is interested in trying out these changes, I can send an early patch

     
    • pboy

      pboy - 2007-07-09

      Chris,

      just a question: I suppose your patch r 1586 (SF 1707913) in trunk is for the described changes?

      If it is:

        I have only made changes to Oracle versions of the triggers - I am not sure about writing triggers
        PermissionManager speeds things up a litle bit.

      We are quite interested in performance issues for postgresql and it would be fine to benefit from your findings, too. As I understand, it is just a matter of writing the trigger? No other modifications required for postgresql? May be we could write the triggers. Do you have a test scenario at hand to ensure the patch will work correctly?

      And what about:

      | delete all existing entries from dnm_object_1_granted_context where the associated object_context
      | entry has null context and there are no entries in acs_permissions (that's 145000 out of 173000 entries
      | on my test instance!!)

      is deletion part of the upgrade script of just your experimental procedure?

      Peter

       
      • chris gilbert

        chris gilbert - 2007-07-16

        Hi Peter,

        Yes - sorry, I didn't comment the sourceforge patch entry when I merged the second diff file. The first file merged at revision 1583 added config parameters to allow users to stop historical records being retained for things like sent notifications etc. The second file merged at revision 1586 was mainly the trigger changes, and the third which I don't think I merged in revision 1586 includes changes to prevent unneccessary context inserts and updates.

        I have just got back from holiday, so will give you some more information later this week - I would stress that I think the huge benefits we have seen here are mainly because of the large amount of access control we have here. However I would hope the same trigger changes would provide at least some noticeable performance improvement on an average postgres site.

        Chris.....

         
    • chris gilbert

      chris gilbert - 2007-07-25

      Hi Peter,

      I'll start with the easy question. The deletion of surplus object_1_granted_context entries is included in the upgrade script - in sql/ccm-core/default/upgrade/6.5.4-6.5.5/delete-object_1_granted_context_entries.sql (I put it in the default directory as I think the syntax and data model is the same for Oracle & Postgres)

      This is called from sql/ccm-core/upgrade/oracle-se-6.5.4-6.5.5.sql which drops the original triggers, runs the existing but heavily edited trigger creation script and then deletes the entries.

      The equivalent postgres file sql/ccm-core/upgrade/postgres-6.5.4-6.5.5.sql is empty.

      To create something similar for Postgres,  you would need to edit sql/ccm-core/postgres/kernel/triggers-dnm_context.sql and then populate sql/ccm-core/upgrade/postgres-6.5.4-6.5.5.sql with something similar to the oracle version.

      Note the optional deletions in my later comment on the Sourceforge patch page. These are not in an upgrade script as they are very much optional – if you choose to set configuration so that lifecycles and notifications are removed when complete, then you may choose to run these statements in order to remove the historical entries. In hindsight, I could have written a program invoked from the command line to delete these cleanly, but in the meantime the sql is there.

      The testing I carried out involved checking the following access control scenarios

      1.    Direct permission granted prior to upgrade still works following upgrade
      2.    Indirect permission granted prior to upgrade still works following upgrade
      3.    Direct permission granted after upgrade works correctly
      4.    Indirect permission granted after upgrade works correctly

      In each case, permission checks covered permission filtering of data collections and specific permission checks on an object.

      Some examples:

      One form of permission filtering is used by the item expiry notification to decide which subset of alert recipients should be informed – filtering is carried out on the draft folder containing the item to check if user has edit privilege. To test, create 1 folder (let’s call it test 1) at root level and 2 more folders – one at root level (test 2) and one below that one (test 3).

      Create 2 users, user 1 and user 2. Specify custom permissions on folders test 1 and test 2. Grant user 1 edit permission on test 1 and grant user 2 edit permission on test 2 (ensure 3 retains default permissions so it is inheriting from test 2)

      Now ensure both users are in the alert recipient role and create an item in test 1 and another item in test 3.

      Publish the items with an expiry date 1 hour in the future and set the notify in advance to 1 hour. Leave for about half an hour to allow the background processes to notice the expiry (or check the phases table – look to see if the latest phases with listener com.arsdigita.cms.lifecycle.NotifyLifecycleListener  have been flagged as has_begun)

      Once the NotifyLifecycleListener runs, there should be one entry in messages for each item with details of the expired item and for each message there should be an entry in nt_requests specifying the appropriate user.

      Because of the inherent complexity of expiry notifications, run this test before upgrading the code to ensure that it works in your existing setup.

      Another type of filtering is optionally used on object lists on a navigation page (though I’m not sure that you use the ccm-ldn-navigation package). To use this, in your navigation jsp page, apply this scriptlet line to an object list

      ((com.arsdigita.london.navigation.ui.object.SimpleObjectList) itemList).getDefinition().setCheckPermissions(true);
           
      Disable caching by adding directly below imports

      <jsp:scriptlet>
             DispatcherHelper.cacheDisable(response);
         </jsp:scriptlet>

      Now the list should only include objects for which the current user has read access, either directly or inherited. (for content items the permission is ALWAYS inherited - you can check direct permissions by defining additional object lists in the jsp page - eg a portal workspace list)

      To excercise the permission check on a single object, task availablity is a good test for inherited permission. With the 2 users above granted edit privilege on folders 1 and 2, they should be able to edit items in folders 1 and 3 provided the authoring stage of the workflow is enabled and the authoring task is unlocked.

      For direct permission check I added permission to access the ds sitenode directly in the database - inserting a new entry in acs_permissions granting admin privilege on the ds sitenode object (object_id 29 on our site) to a specified user and checking that it was accessible

      Hope this gives you a start – let me know if you need more information about the actual trigger changes

      Chris…..

       
    • Alan_permeance

      Alan_permeance - 2009-08-05

      Hi all,

      We are currently having the same deadlocking issues. It occurs in all of our environments regardless of the size of the DB. Its even happening on the latest fedora trunk.

      We've noticed that when republishing two different items that do not have any related items all is ok. However as soon as one item has references to another (related link, metadata etc) it deadlocks every time.

      I'm not convinced that the deadlock is related to content Items as such. Being to think that maybe the persistence layer handling of associations is not fully concurrent?

      Has any one out there experience this, have any suggestions?

      Cheers
      Al

       
      • shawn lane (powers of ten)

        We have noticed the deadlocks. Database performance drops off, then you look in the database and clear the deadlocks. We have never got to the bottom of the actual cause, but we do notice it when people are editing and publishing.  I suppose in normal terms the software should not allow deadlocks, so it is  a real bug.. our workaround is to check the database of deadlocks and delete these when they are found.

         
    • pboy

      pboy - 2009-08-07

      Hm, we seem to have the same or a similiar problem here. From time to time the system is very very slow and it is practically impossible to do any work in content-center. The public site shows a very slow performance as well. The prblem vanishes after some time.

      We are using postgresql, so the patch provided by Chris doesn't affect us.  What db are you using?

      Peter

       
      • shawn lane (powers of ten)

        We are using Postgres 8. something. It can happen when people try to load very large files too.

         
    • Alan_permeance

      Alan_permeance - 2009-08-10

      We've fixed our deadlocking issues (thanks Tim Carpenter).

      Oracle issues a SHARE lock on child tables when updating parents when there is no index on the FK column. If two transactions have a SHARE lock they will deadlock when DML starts to happen.

      Once we ensured that all Foreign keys had a single column index the dead locks went away.
      We haven't tested this on Postgres.

      We have updated the oracle-se-create.sql and postgres-create.sql to ensure the indexes are created during a load-bundle and we commit them soonish.

      Reference:
      http://download.oracle.com/docs/cd/B14117_01/appdev.101/b10795/adfns_sq.htm#1024837
      http://www.oracle-base.com/articles/9i/SQLNewFeatures9i.php#ShareLocksOnUnindexedFKs

       
  • pboy

    pboy - 2009-09-28

    This issue seems not to be resolved yet.

    Shawnlane: Please could you describe how you monitor deadlocks in your postgresql environment? Perhaps we could prepare a workaround?

    Alan: Could you test your solutions? Are they ready to be commited to trunk?

    Regards
    Peter

     
  • shawn lane (powers of ten)

    To look for deadlocked process in Postgres:

    It looks as though the deadlock processes now return to the normal 'idle in transaction' state after 30 minutes. If this doesn't happen you can use the following commands to manually kill the process:-

    Log on to the appropriate machine and run this command:

    grep deadlock /var/log/postgresql.log

    (you may need to be root,)

    NB: If the file is not there - do
    grep deadlock /var/log/messages

    The ouput shows the process id. You can check if this process still exists with

    ps waux | grep &lt;process-id&gt;

    The deadlock can be killed with:

    sudo kill &lt;process-id&gt;

     

Log in to post a comment.

Want the latest updates on software, tech news, and AI?
Get latest updates about software, tech news, and AI from SourceForge directly in your inbox once a month.