Re: [Embedlets-dev] Embedlet event topic scheme
From: Gregg G. W. <gr...@sk...> - 2003-03-18 16:38:10
>Anyway, when either Gregg or Chris gets a chance could one of you please
>explain a little more about the philosophy behind event topics again? Also, a
>short description of one or two in-use systems that use a scheme like this would be
>very helpful.
Using a topic based event system allows subscribers and publishers to be
unaware of each other's presence. As events are published, there can be
zero or more subscribers, including subscribers whose job is to distribute
events into external systems. The topic is important because it lets you
authorize access to parts of the system separately from the data that is flowing.
In our 'Edge Broker', we use a topic scheme that maps the data items flowing into
the system onto a tree such as:
app.data.gh.0
app.data.gh.2
app.data.gh.3
app.data.rm.0
app.data.rm.6
app.data.rm.2
app.data.rm.3
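To make the decoupling concrete, here is a minimal sketch of what a topic based
broker interface might look like. The EventBroker and TopicSubscriber names are
hypothetical, not our actual Edge Broker API; it just shows that publishers hand
the broker a topic and payload while subscribers register interest in a topic
prefix, with neither side knowing about the other.

import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical topic based broker; publishers and subscribers only share topic strings.
interface TopicSubscriber {
    void eventPublished( String topic, Object payload );
}

class EventBroker {
    private final Map<String,List<TopicSubscriber>> subs = new ConcurrentHashMap<>();

    public void subscribe( String topicPrefix, TopicSubscriber s ) {
        subs.computeIfAbsent( topicPrefix, k -> new CopyOnWriteArrayList<>() ).add( s );
    }

    public void publish( String topic, Object payload ) {
        // Deliver to every subscriber whose prefix matches; zero matches is fine too.
        for( Map.Entry<String,List<TopicSubscriber>> e : subs.entrySet() ) {
            if( topic.startsWith( e.getKey() ) ) {
                for( TopicSubscriber s : e.getValue() ) {
                    s.eventPublished( topic, payload );
                }
            }
        }
    }
}

A publisher then just calls broker.publish( "app.data.gh.0", report ) with no
knowledge of who, if anyone, is listening.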
Our pub/sub system uses user level authentication to control publishing of
topical events so that only the correct modules in the system can inject the
appropriate data. You can find the Broker via Jini and get an RMI interface
for publishing from remote code. When you get the remote interface stub, you
have to provide your credentials. The credentials stay on the server and
confine the user to their authorized limits.
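In outline, the remote publishing path looks roughly like the following. The
EventBrokerFactory and EventBrokerPublisher interfaces, and the myCredentials
and report variables, are hypothetical stand-ins for our actual RMI interface
and data; exception handling is omitted, as in the Jini example further down.

import java.rmi.Remote;
import java.rmi.RemoteException;
import net.jini.core.lookup.ServiceItem;
import net.jini.core.lookup.ServiceTemplate;
import net.jini.lookup.ServiceDiscoveryManager;

// Hypothetical remote interfaces; stand-ins for the Broker's actual RMI API.
interface EventBrokerFactory extends Remote {
    EventBrokerPublisher getPublisher( Object credentials ) throws RemoteException;
}
interface EventBrokerPublisher extends Remote {
    void publish( String topic, Object payload ) throws RemoteException;
}

// Locate the Broker through Jini, then publish through the returned stub.
ServiceDiscoveryManager sdm = new ServiceDiscoveryManager( null, null );
ServiceTemplate tmpl = new ServiceTemplate( null,
    new Class[] { EventBrokerFactory.class }, null );
ServiceItem si = sdm.lookup( tmpl, null, 30 * 1000 );            // wait up to 30s for a match
EventBrokerFactory broker = (EventBrokerFactory)si.service;
EventBrokerPublisher pub = broker.getPublisher( myCredentials ); // credentials stay server side
pub.publish( "app.data.gh.0", report );                          // only allowed on authorized topics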
The data queue manager and the data delivery manager use topic based events to
tell each other what to do.
app.recover.<deliveryManager>.<key>
app.<QueueManager>.<key>.store.enq
app.<QueueManager>.<key>.delete.enq
app.<QueueManager>.<key>.defer.enq
app.<QueueManager>.<key>.ignore.enq
app.<DeliveryManager>.<key>.store.ack
app.<DeliveryManager>.<key>.store.nak
etc...
This makes it easy to watch what is happening to data as it flows: you can add
secondary subscribers to count or track events, perform secondary processing,
etc.
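For example, a counting subscriber only needs to register for the topics it
cares about. Here is a minimal sketch, reusing the hypothetical TopicSubscriber
interface from the sketch above, of a counter that tallies store acknowledgements
without the queue or delivery managers ever knowing it exists:

import java.util.concurrent.atomic.AtomicLong;

// Hypothetical secondary subscriber: counts store.ack events as they flow by.
class AckCounter implements TopicSubscriber {
    private final AtomicLong acks = new AtomicLong();

    public void eventPublished( String topic, Object payload ) {
        if( topic.endsWith( ".store.ack" ) ) {
            acks.incrementAndGet();
        }
    }

    public long count() { return acks.get(); }
}

// broker.subscribe( "app.", new AckCounter() );  // neither manager has to change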
<ramblings>
In a future version of the code, I can trivially distribute the system across
multiple JVMs just by forwarding events to remote systems. I have a design
drawn up that uses JavaSpaces to do this and Jini to locate the appropriate
pieces of the system. Currently, though, we are handling our largest site of
~1500 devices on midrange machines, so I don't yet need to distribute for CPU
load. We get 1500 hourly reports plus another 1000 alarms and/or audits per
hour, and 750 of the devices publish additional reports every 2-5 minutes.
That seems like pretty reasonable performance for string based topics, and we
have done a lot of tuning over time by using ^\ under Linux to get thread
dumps under load and looking for places where code is camped out on the heap
monitor. In a loaded event based system, of course, queuing theory is ever
present, and you quickly rediscover that the system is only as fast as the
slowest event in it. I/O based slowness can be mitigated with threading to
some degree. But as soon as a monitor must be touched by every thread, you
have to make sure that monitor is fast...
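To illustrate that last point, here is a small generic sketch (not our actual
code) of keeping a monitor that every worker thread must touch as cheap as
possible: do only the bookkeeping under the lock and push the slow I/O outside it.

// Sketch: every worker records a completion, but only the counter update
// happens under the shared monitor; the slow logging I/O does not.
class CompletionLog {
    private final Object lock = new Object();
    private long completed = 0;

    void recordCompletion( String topic ) {
        long snapshot;
        synchronized( lock ) {           // touched by every thread: keep it tiny
            completed++;
            snapshot = completed;
        }
        writeLogLine( topic, snapshot ); // slow I/O stays outside the monitor
    }

    private void writeLogLine( String topic, long n ) {
        System.out.println( n + " events completed, last topic " + topic );
    }
}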
<jiniDigression>
The Jini ServiceDiscoveryManager (SDM) tracks and caches the existence of
objects, and thus allows you to just ask it for an instance of an object
matching a ServiceTemplate; when that call returns, you have such an object.
You can put timeouts on the 'lookups' and take other steps when a timeout
occurs, if needed (very complex recoveries involving hardware switching).
The SDM does all the hard parts. If you distribute the workers across multiple
JVMs, then you can just do:
import net.jini.core.lookup.ServiceItem;
import net.jini.core.lookup.ServiceTemplate;
import net.jini.lookup.LookupCache;
import net.jini.lookup.ServiceDiscoveryManager;

// Don't need any special DiscoveryManagement or LeaseRenewalManager
ServiceDiscoveryManager sdm = new ServiceDiscoveryManager( null, null );
ServiceTemplate myTemplate = getTemplateToUse();
// The cache tracks matching services as they appear and disappear
LookupCache lc = sdm.createLookupCache( myTemplate, null, null );
while( workToDo ) {
    WorkItem wi = getNextWorkItem();
    ServiceItem si = lc.lookup( null );         // any cached match will do
    WorkHandler wh = (WorkHandler)si.service;   // the remote worker's proxy
    wh.handleWorkItem( wi );
}
This is the simple model... Some exception handling is needed in there, along
with some retries to handle partial failure, roughly as sketched below.
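A minimal version of that failure handling might look like the following; the
three-attempt policy and the deferWorkItem() call are just one plausible
approach for the sketch, not what our production code does.

import java.rmi.RemoteException;

// Retry each work item a few times, discarding a cached proxy when a
// remote call fails and letting the cache supply a different instance.
while( workToDo ) {
    WorkItem wi = getNextWorkItem();
    boolean done = false;
    for( int attempt = 0; attempt < 3 && !done; attempt++ ) {
        ServiceItem si = lc.lookup( null );
        if( si == null ) {                  // nothing in the cache yet
            try { Thread.sleep( 1000 ); } catch( InterruptedException ie ) { }
            continue;
        }
        try {
            ((WorkHandler)si.service).handleWorkItem( wi );
            done = true;
        } catch( RemoteException re ) {
            lc.discard( si.service );       // drop the failed proxy from the cache
        }
    }
    if( !done ) {
        deferWorkItem( wi );                // hypothetical: put the item back for later
    }
}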
</jiniDigression>
</ramblings>
-----
gr...@cy... (Cyte Technologies Inc)