
#4199 [HHQ-4068] Contention on _schedules object in ScheduleThread

Bug
open
None
7
2012-10-09
2010-06-21
No

http://jira.hyperic.com/browse/HHQ-4068


In ScheduleThread, the resource schedules are kept in a private synchronized Map, _schedules. During metric collection the thread synchronizes on this Map, which is the correct behavior since the values of the Map are being iterated over.



The problem lies in the schedule() and unschedule() methods in ScheduleThread, as those also require access to the _schedules object. These methods are called directly from the server, so slow metric collection or a hung ScheduleThread can cause them to block, potentially hanging connections from the server. Agent connections do time out after 60 seconds, so these are eventually freed; however, the commands are lost since the server will not retry them.



The proposed fix here is to create separate Lists for schedule and unschedule requests. The schedule() and unschedule() APIs will simply add to these Lists, which will later be processed in the run() method, just prior to running metric collection. This proposal is still under review and I'll open up a CR once an implementation is in place.
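
A minimal sketch of the proposal above, not the actual Hyperic code. The class name ScheduleThreadSketch, the Request type, the pendingLock object, and the applyPending() and scheduledCount() helpers are all hypothetical, and the schedule is reduced to a resource id mapped to an interval. The point is only that schedule() and unschedule() take a short-lived lock on the pending Lists and never wait on _schedules.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for ScheduleThread; names and types are assumptions.
final class ScheduleThreadSketch {

    /** Hypothetical request type: a resource id plus a collection interval. */
    static final class Request {
        final String resourceId;
        final long intervalMs;

        Request(String resourceId, long intervalMs) {
            this.resourceId = resourceId;
            this.intervalMs = intervalMs;
        }
    }

    // Stand-in for the real _schedules Map; still locked during collection.
    private final Map<String, Long> schedules = new HashMap<>();

    // Separate lock guarding only the pending Lists, held very briefly.
    private final Object pendingLock = new Object();
    private List<Request> pendingSchedules = new ArrayList<>();
    private List<String> pendingUnschedules = new ArrayList<>();

    /** Called from the server: queues the request and returns immediately. */
    public void schedule(String resourceId, long intervalMs) {
        synchronized (pendingLock) {
            pendingSchedules.add(new Request(resourceId, intervalMs));
        }
    }

    /** Called from the server: queues the request and returns immediately. */
    public void unschedule(String resourceId) {
        synchronized (pendingLock) {
            pendingUnschedules.add(resourceId);
        }
    }

    /** Would be called from run(), just before metric collection. */
    void applyPending() {
        List<Request> toAdd;
        List<String> toRemove;
        // Swap the Lists out under the short lock so callers are not held up.
        synchronized (pendingLock) {
            toAdd = pendingSchedules;
            toRemove = pendingUnschedules;
            pendingSchedules = new ArrayList<>();
            pendingUnschedules = new ArrayList<>();
        }
        // Only the collector thread takes this lock for any length of time.
        synchronized (schedules) {
            // Assumption: unschedules are applied before schedules.
            for (String id : toRemove) {
                schedules.remove(id);
            }
            for (Request r : toAdd) {
                schedules.put(r.resourceId, r.intervalMs);
            }
        }
    }

    /** Helper for inspection: number of resources currently scheduled. */
    int scheduledCount() {
        synchronized (schedules) {
            return schedules.size();
        }
    }
}
```

One thing to settle in review: with two separate Lists, the relative order of a schedule and an unschedule for the same resource within one batch is lost. The sketch applies unschedules first, which is an assumption; a single ordered queue of requests would preserve the order if that matters.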
