From: Marty K. <mrk...@co...> - 2010-04-20 14:41:21
Benjamin Franksen wrote:
> Hello Marty
>
> first thank you for answering in such detail. It greatly helped me to
> understand the issues better. I agree that the new way to compose records
> probably saves a great amount of memory that was previously wasted because
> it was simply not used. So, maybe, I should relax and hope that the
> additional overhead is balanced by these savings. And yes, availability of
> an efficient Java VM is probably the limiting factor for embedded systems,
> rather than memory footprint. (However, less memory footprint also often
> means less work for the garbage collector; this _can_ be a decisive factor
> w.r.t. runtime, depending on the implementation).
>
> I also understand better now why you think that fixed record or structure
> types are too restrictive. It is certainly correct that one size does not
> fit all, and that it should not only be possible but must be simple, easy,
> and reliable to create new compositions from existing components. Indeed,
> this has been my sole credo for a long time: compositionality rules.
>
> I still wonder whether we can have the cake and eat it, too. That is, reap
> the benefits of (quasi-)statically defined structures (aka types), while
> making the process of creating, composing, and instantiating them so simple
> and predictable that control system engineers can and will routinely
> perform these tasks. This needs a lot more thinking on my part, so I'll
> leave it alone for the moment.
>
> There is one remaining point where I think pvData as it stands is more
> complex than necessary:
>
> On Monday, 19 April 2010, Marty Kraimer wrote:
>
>> Benjamin Franksen wrote:
>>
>>> And, maybe, even the spec
>>> should be re-considered. I think it would be enough if client code gets
>>> the notification for the single field that changed. Propagating the
>>> event upwards seems to be unnecessary, IMO.
>>> If client code subscribes
>>> to a whole sub-structure, then this subscription can be propagated to
>>> the (sub-) fields *downward* at registration time, and similarly if the
>>> client subscribes for a subset of fields.
>>>
>> Not sure where this goes. Sounds like it will be even harder for the
>> client.
>>
>
> It gets much simpler, see below.
>
>> The client now uses pvData.pvCopy and pvData.monitor to do all
>> the work.
>>
>
> Yes, and I very much like the general idea.
>
>> Look at them with what you are thinking. Are they easier or
>> harder to implement efficiently?
>>
>
> I have taken a cursory look at the code. It looks very complicated. I'd
> rather program a solution myself than try to understand this code.
>

I can understand this feeling. But let me make a couple of comments:

The first example is asyn. I wrote asynManager. If you want to be scared
of code, look at this. I spent LOTS of time getting it to work. But it
works and provides a very useful service.

I agree that some of the pvData implementation is complex. In fact, over
time it is becoming less rather than more complex. Where possible I try
to implement complex code in single source files instead of sharing
complexity between different source modules. In addition, once you see
how recursion is implemented in one place, it is easier to understand it
in other places.

Can it be made better? Certainly. Every time I add something new and a
bug appears in old code, I try hard to simplify the old code rather than
just implement some hack. I am learning while implementing!!

For example, I am somewhat horrified by how complex pvCopy is. If you are
looking at the main tree and not the branch, adding the ability to
specify monitor options on a per-field basis made pvCopy and pvMonitor
even more complex.
I also have a method pvCopy.createPVRequest(String request) that allows a
client to make a request like

    PVStructure PVCopyFactory.createPVRequest("value,alarm,timeStamp");

to create a pvRequest structure to pass to channel.createChannelGet and
other channel.createXXX methods. But createPVRequest also allows access
to complex record structures. The request string can become quite
complex, but most clients will not need this.

When I added the new monitor support, the old implementation could not be
easily fixed. The new version (not yet committed) is smaller and easier
to understand even though it does more.

One last comment. javaIOC implements portDriver, which does what
asynDriver does (not yet usable since there are no low level drivers).
But it does not have the equivalent of asynManager. Instead the
interfaces from asynDriver were rearranged. portDriver is much easier to
understand, including the implementation, than asynDriver. Thus I am
learning!!

> Let me explain how I see the problem and how I would go about a solution.
> Maybe it is not that much different from what you did.
>
> Before starting to comment on this, I ask you to read until the end.
>
> A structure is more or less a (multi-branched) tree. Scalar fields are the
> leaves and structure fields are the nodes (I am deliberately simplifying by
> ignoring arrays for the moment). Immediate children of a node are (by
> definition) uniquely identified by a name (a string). So we can uniquely
> address each field in a tree by a list of names, namely the labels
> encountered on the path from the root to the field.
>
> We are given a specification of a partial one-to-one map from the fields of
> an existing tree R (for Record) onto another tree U (for User).
> The task is
> to (1) construct U according to the given mapping and (2) provide methods
> to efficiently propagate changes from one side of the mapping to the other,
> that is, from U to R and vice versa (remember that the mapping is
> one-to-one, where defined, by definition). Thus, the resulting tree U
> should serve as a proxy for a subset of the fields of the original tree R.
> (Really 'access proxy' is a much better name for these things
> than 'PVCopy'.)
>

I think the current implementation is pretty efficient. Also note that it
keeps change and overrun bitSets. These provide two important features:

1) It limits what has to be sent over the network. Only fields that have
   changed are sent.
2) It tells the client what has changed and which fields have been
   modified more than once since the last time data was sent.

About the name. PVCopy IS descriptive. Unless the client requests shared
data, an actual copy is made. If the data is modified while data is being
sent, data integrity is guaranteed. Note that a single send may involve
multiple network packets (think of big arrays).

channelAccess works together with the synchronize methods of PVField to
allow:

a) multiple messages to be sent in a single network packet.
b) a single message to span multiple network packets.

This gives efficient support for small messages while also supporting big
arrays. But it does introduce synchronization issues.

> Although the specification of the map is given as a string, I'll skip over
> the problem of parsing it (which is a separate one and has standard
> solutions) and assume that the request is already in tree form. Now, this
> tree has the same form as the tree U we want to construct, only that the
> nodes and leaves do not contain fields, but instead paths (remember that
> paths are lists of strings that identify fields in the target tree R).
> Thus, to create U, we map over the given specification tree, replacing each
> path with a new field of the same type as the field of the tree R that the
> path addresses. Since fields are merely interfaces, the field implementation
> we use determines how changes get propagated. For put (to fields of U) we
> can decide whether the new value (a) gets propagated immediately to R or
> (b) is cached and gets propagated only on demand. For get, we can (a) choose
> to add a listener to the corresponding field of R in order to be notified
> of changes and on get merely return the cached value, or (b) poll the field
> of R and return the value with or without caching it. All these
> possibilities can be factored into suitable implementations of the field
> interface. (If the programming language gives us a way to do so, the
> process of creating a proxy should be parameterized with the field
> implementation(s), so we can customize the proxy's behaviour.)
>
> I have left out some details, but this is the overall picture, as I see it.
> Note that no mention is made of
>
> * field indexes
> * structure sizes
> * getParent method
> * bit fields
>
> and removing them from the general field interface would leave (beside the
> reflection stuff) only put, get, and addListener.
>
> AFAICS the purpose of the items I listed above is to optimize the process of
> finding out which of the fields of a structure U have changed -- from
> linear in the number of fields to (nearly) constant. While this is
> certainly a worthwhile goal, and may even be necessary for acceptable
> performance in certain cases, it has a cost, too.
> Apart from the fact that
> it complicates the code, the way it is implemented today adds overhead to
> _all_ fields of _all_ structures, while it is needed only for the fields of
> access proxies ('PVCopy') (and then only if they have many fields -- I
> would expect the majority of requests to be for not more than a handful of
> fields, in which case it could well be that a traversal of the fields gets
> amortized by simpler field implementations, especially in case the access
> pattern is mostly periodic; all this should ideally be measured, not
> guessed.)
>

What you describe above is much like what is provided. We could delay
creating the fields required for client access until the first client
connects, but should we delete the fields when there are no more clients
connected?

I agree that most requests will be for some subset of
value,alarm,timeStamp,display,control. However, the application engineer
responsible for the ioc may access other fields while diagnosing
problems. It is also possible to attach to a complete record. Think of
bored operators.

> I think if there is demand for this optimization, the additional methods
> should be placed in derived (extended) or even independent interfaces. For
> instance, we could add a method getProxyField to PVField (and/or similar
> for derived interfaces) that returns an interface to the additional methods
> or null if they are not implemented.
>

Could be.

Another possible optimization is to not propagate postPut up the tree if
no client is attached to this field via a higher level structure. For
example, the input structure will normally not have any clients looking
at anything in it. Thus when raw values are read there is no reason to
have postPut do anything.

But there is some saying about premature optimization :-)
But pvData, etc. already violates this saying because so much of it is
designed for efficient cpu and network performance (perhaps at the
expense of memory usage) :-(

Keep looking!!

Thanks,
Marty
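
The postPut optimization mentioned above (skip upward propagation when no
client is watching the field through a higher level structure) could look
roughly like the Java sketch below. It is only an illustration under
assumed names: SketchField, watchersAtOrAbove, and this postPut are
hypothetical and are not the pvData PVField interface. Subscriptions are
counted downward at registration time, so a put on an unwatched branch
returns after a single integer comparison.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical field node for illustration only; not the pvData PVField API.
class SketchField {
    final String name;
    final SketchField parent;
    final List<SketchField> children = new ArrayList<>();
    final List<Consumer<SketchField>> listeners = new ArrayList<>();
    // Number of listeners registered on this node or on any of its ancestors.
    int watchersAtOrAbove;

    SketchField(String name, SketchField parent) {
        this.name = name;
        this.parent = parent;
        if (parent != null) {
            parent.children.add(this);
            // A new child is already covered by subscriptions made higher up.
            watchersAtOrAbove = parent.watchersAtOrAbove;
        }
    }

    // Registering a listener marks this node and every descendant as watched.
    void addListener(Consumer<SketchField> listener) {
        listeners.add(listener);
        markSubtree();
    }

    private void markSubtree() {
        watchersAtOrAbove++;
        for (SketchField child : children) {
            child.markSubtree();
        }
    }

    // Called after a value change. Returns how many nodes were visited on the
    // way up, so the saving can be observed: 0 means nothing was propagated.
    int postPut() {
        if (watchersAtOrAbove == 0) {
            return 0; // nobody watches this field or any enclosing structure
        }
        int visited = 0;
        for (SketchField f = this; f != null; f = f.parent) {
            visited++;
            for (Consumer<SketchField> listener : f.listeners) {
                listener.accept(this); // report the field that changed
            }
            if (f.watchersAtOrAbove == f.listeners.size()) {
                break; // every remaining watcher was at this level; stop here
            }
        }
        return visited;
    }
}

class PostPutSketch {
    public static void main(String[] args) {
        SketchField record = new SketchField("record", null);
        SketchField input = new SketchField("input", record);
        SketchField rawValue = new SketchField("rawValue", input);
        SketchField value = new SketchField("value", record);

        value.addListener(f -> System.out.println("changed: " + f.name));

        System.out.println(rawValue.postPut()); // 0: input branch is unwatched
        System.out.println(value.postPut());    // 1: stops at the watched field
    }
}
```

The cost moves to registration time, which walks the subscribed subtree
once. Whether that trade is worthwhile would have to be measured rather
than guessed.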