From: <ovi...@jb...> - 2006-05-24 21:27:45
I recently fixed some inter-release compatibility problems, so I had to deal with the version compatibility code as we currently have it in the repository. This sent me back to this thread to check the validity of some assumptions we're making, which sparked the following IM discussion with Tim. I am attaching it to the thread for completeness.

The main question debated is whether we should send versioning information with each invocation, or rely on the fact that our system mostly holds stateful conversations (it actually holds only stateful conversations now), so that the state id contained in each invocation is sufficient for the server to determine the version of the client and act accordingly, without an explicit versioning byte.

It's a long discussion, so don't feel compelled to follow it. The transcript is here because it contains some valuable ideas. We didn't reach a firm conclusion; the tentative one is that we need to build the two equivalent systems and perform statistical analysis on them to determine which one performs better. Which we won't do anytime soon :)

---- Start IM Conversation Transcript -----------------------------------------
flanker707: Tim, I've been thinking about the versioning discussion we had yesterday, and I also read the discussion thread, and I am not convinced yet that the solution we're using now is the best one. Do you want to spend some time and discuss this a little bit?
fox_tim_l: ok
flanker707: so, the current argument is that we're sending a versioning byte with each invocation, because the server doesn't know where that invocation comes from, and versioning processing has to take place very early in the process (when the invocation is unmarshalled)
flanker707: did I get the gist of the thread?
fox_tim_l: yes kind of
[...]
flanker707: considering that my interpretation of the forum thread is correct
flanker707: the major argument for sending versioning info on each invocation is that there is a certain moment in time when the invocation processing logic on the server doesn't know where that invocation comes from, so it needs to take some sort of insurance: "let's make sure I understand the language of my counterpart, even if I don't know yet who it is"
flanker707: is this correct?
fox_tim_l: yes
flanker707: alright
flanker707: but later in processing, the identity of the counterpart client has to become apparent, otherwise the server won't know where to route the invocation to
flanker707: is that correct?
fox_tim_l: it doesn't need to know what client sent the invocation in order to route it
fox_tim_l: it just needs to know the object id and method id
fox_tim_l: which specifies the api call
[...]
flanker707: but, again, if there is no way of knowing where to send the invocation that arrives on the server (the "object id" and the "method id", as you mentioned), the server will have no way of knowing where to send the invocation to.
flanker707: So there's an "object id"
flanker707: which is, probably, the connection ID
fox_tim_l: no it's not
flanker707: then what is it?
fox_tim_l: it's the object id of the object in the aop dispatcher, and the methodhash
flanker707: ok, not the connection ID, but the "advised" id
flanker707: which is either connection id, session id, ... etc
fox_tim_l: yes i guess so
flanker707: I am talking about JMS connection endpoint id, JMS session endpoint id
flanker707: so
flanker707: each invocation HAS an ID
flanker707: or a way of identifying the target
flanker707: otherwise it'll be an unroutable invocation
flanker707: right?
fox_tim_l: well not quite
flanker707: ok
flanker707: ...
flanker707: waiting for you to continue
flanker707: and explain "not quite"
fox_tim_l: the method id might refer to a method that doesn't exist (from an earlier version) so the version byte is needed to work out how to "translate" the method call into a call on the new api
fox_tim_l: so it can work with old clients
flanker707: oh, that's fine
flanker707: I didn't say that
flanker707: I am talking one level of abstraction above that
flanker707: I am not concerned yet how I will solve the incompatibility cases
flanker707: what I want to agree on
flanker707: is that, one way or another, each invocation contains information that uniquely identifies its target
flanker707: (agree so far?)
fox_tim_l: yes
flanker707: alright
flanker707: so it becomes possible to think of a system
flanker707: (client - server system)
flanker707: in which interaction flows something like this:
flanker707: 1. Client sends a handshake to the server: I am "Monica" :) and my version is 1.0
flanker707: 2. The server takes note: ummm, yes, Monica started to talk to me, so anything that comes from her will be in the "1.0" language
flanker707: (so let's write this down: Monica - 1.0)
flanker707: 3. The client starts sending invocations (Monica, I want a connection on the server)
flanker707: 4. The server sees "Monica" on the wire, looks in its tables, and says: ah, ok, everything that follows after "Monica" is in "1.0", so let's be careful to understand that correctly
flanker707: 5. The server says: ah, Monica wants a connection (in 1.0-speak), let's create a connection for her
flanker707: 6. Server returns a connection to Monica, with the id "4 8 15 16 23 42"
flanker707: 7. Monica wants a session so it sends to the server: "4 8 15 16 23 42" wants a session
flanker707: 8.
Server looks up "4 8 15 16 23 42" in its tables, and figures out that "4 8 15 16 23 42" belongs to Monica, who speaks 1.0, so everything that follows after "4 8 15 16 23 42" must be in 1.0
flanker707: and so on
fox_tim_l: there is an assumption here that objects are only used by one client
flanker707: I don't quite understand that. So what if the objects are used by more than one client?
fox_tim_l: if an object is used by more than one client how can it "belong" to Monica?
fox_tim_l: it belongs to both clients
fox_tim_l: so you can't determine the identity of the caller
flanker707: Give me an example of such an object
fox_tim_l: in the current way of structuring the API, none
fox_tim_l: but in the general case of remote RPC, sure
fox_tim_l: in the future we may want to expose the serverpeer for instance
fox_tim_l: over RPC
flanker707: Tim ...
fox_tim_l: so this works in a very specific case
fox_tim_l: imagine a remote ejb
fox_tim_l: if this stuff is going to be abstracted out at some point, then it needs to be more general
fox_tim_l: and not limited to very specific domains like JMS
flanker707: from my experience, my worst results start to show up when I start thinking along these lines: oh, this works well in this case, and this is all I need so far, but then why don't I complicate my life and make it work for a more general case (which will never come to life, in most cases)
fox_tim_l: actually there's another point here too
flanker707: ok, what's the other point?
fox_tim_l: let me remember it :)
flanker707: sure
fox_tim_l: we don't want to be prevented from designing the server how we want because it doesn't work with the versioning system.
flanker707: huh?
flanker707: blank
flanker707: (my mind drew a blank here :) )
fox_tim_l: what i'm saying is that doing versioning by ownership puts constraints on how we design the API
fox_tim_l: it will only work as long as the API is designed in such a way that objects aren't shared
flanker707: no, that's not true
fox_tim_l: i think it's dangerous to limit ourselves in such a way
flanker707: there are only two invariants
flanker707: that we need to maintain across releases
flanker707: and those invariants are:
flanker707: each conversation starts (from the client side) with "My identity - My version"
flanker707: and then each invocation (from the client side) is structured as follows: "my target identity - anything else"
flanker707: and that's it
flanker707: anything else is literally "anything else"
flanker707: the server takes decisions on where to route that invocation (if it's an invocation)
flanker707: and if the versioning is correctly maintained
flanker707: that invocation will always be sent to the object with the right API
flanker707: my whole point in the discussion is that for our case, the conversation is always stateful (the client has at any time state maintained on its behalf on the server, and it always sends some id to look up that state on the server), so there's no point in sending version state over the wire every time. It's redundant.
fox_tim_l: correct, in the specific case you mention
fox_tim_l: but in the future we might want non stateful operations
[...]
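
The handshake flow that flanker707 walks through in steps 1 to 8, together with his two invariants, could be sketched roughly as below. This is only an illustration, not code from the JBoss Messaging repository: the class name VersionRegistry, its methods, and the use of plain integers as state ids are all assumptions made for the example.

```python
class VersionRegistry:
    """Hypothetical server-side tables: client -> version, state id -> owning client."""

    def __init__(self):
        self._client_version = {}  # client identity -> protocol version
        self._owner = {}           # state id -> client identity that owns it
        self._next_id = 1

    def handshake(self, client, version):
        # Steps 1-2: the client announces "my identity - my version" once.
        self._client_version[client] = version

    def create_state(self, parent):
        # Steps 3-7: 'parent' is either a client identity or an existing state id.
        # The new state id (connection, session, ...) inherits the parent's owner.
        owner = self._owner.get(parent, parent)
        if owner not in self._client_version:
            raise KeyError(f"no handshake seen for {parent!r}")
        state_id = self._next_id
        self._next_id += 1
        self._owner[state_id] = owner
        return state_id

    def version_for(self, target):
        # Step 8: resolve "my target identity" to its owner, then to a version.
        owner = self._owner.get(target, target)
        return self._client_version[owner]


registry = VersionRegistry()
registry.handshake("Monica", "1.0")
connection_id = registry.create_state("Monica")
session_id = registry.create_state(connection_id)
print(registry.version_for(session_id))  # prints 1.0
```

Note that the sketch stores exactly one owner per state id, which is the single-owner assumption fox_tim_l objects to: a server-side object shared by several clients would have no unambiguous entry in the owner table.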
fox_tim_l: your solution will work with them, but not with stateless
fox_tim_l: but i want to support both
fox_tim_l: actually here is a good example
fox_tim_l: i have thought about this before
flanker707: ok
flanker707: listening
fox_tim_l: when we send a message or an ack to the server
fox_tim_l: this is one of the invocations that need to be fast
fox_tim_l: as fast as possible
fox_tim_l: so when we send a message
fox_tim_l: the minimum amount of information we need is:
fox_tim_l: the message itself, and the destination id
fox_tim_l: currently, when we send a message we send that, but we also send the object id of the connectiondelegate and the method id on the connection delegate
fox_tim_l: this is actually redundant
fox_tim_l: and takes up valuable space on the wire
fox_tim_l: but because we are putting everything through "remote object calls"
fox_tim_l: we do it this way
fox_tim_l: but structuring the api as a set of remote objects is only one way of doing it, and is probably not the best in some situations (like this)
flanker707: yes, but then if you only send the message id
fox_tim_l: so, if there is no "remote object", then in your system you cannot work out the version
fox_tim_l: hence to avoid limiting ourselves to a straight object id
fox_tim_l: i prefer to send the version every time
fox_tim_l: just more flexible
flanker707: well
flanker707: you're saying it's more flexible
fox_tim_l: sometimes a "C style" api is better
flanker707: I am saying it's redundant
flanker707: and the conclusion is
flanker707: that both solutions are possible
flanker707: and you cannot say that one is better than another from a performance point of view
flanker707: unless you implemented them both
flanker707: and then you measure them
fox_tim_l: so how would your solution work in the example i just described?
fox_tim_l: i don't get it
flanker707: you'd just prefix the invocation with an "object id" that will help the server figure out the version.
fox_tim_l: well that's what the version byte is for!!
flanker707: yes, but that version byte is NOT needed in any other case, where the object id HAS to be there anyway
flanker707: so you're loading each invocation with an extra byte
fox_tim_l: but it's much easier to do it consistently the same way for all invocations
flanker707: for a limited number of "future" cases
fox_tim_l: instead of having to do it one way for some and another way for others
flanker707: while you could just add an extra "object id" info for that specific future case
fox_tim_l: depending on how the api is built. yuck
flanker707: no
flanker707: you don't have to do it one way or another
flanker707: you do it the same way for all
flanker707: if you want, I can even apply math to it
fox_tim_l: so you add an id to all invocations?
flanker707: the id IS in all invocations
flanker707: I don't have to add it
flanker707: It IS already there
fox_tim_l: no it's not in the one i just described
flanker707: yes
flanker707: but that's a "future" case that we don't even know we'll implement or not
flanker707: doesn't exist yet
fox_tim_l: again, limiting
flanker707: you're designing by exception
flanker707: no
flanker707: it's not limiting
fox_tim_l: actually this is something we should really consider doing
flanker707: in that hypothetical future case of yours
flanker707: for which I don't see any need, but that's me
flanker707: you can always send "object id" instead of version
flanker707: and you're consistent with everything else
flanker707: let's do some math:
flanker707: Today, each invocation contains the object id
fox_tim_l: i guess my philosophy is "keep it simple", applying a single id is nice and simple, we also have less code, don't have to maintain lookups of ownership on the server etc
fox_tim_l: also this is ultimately flexible
flanker707: yes
flanker707: object id - anything else is ultimately flexible
fox_tim_l: so why not go for the simplest and most flexible solution?
flanker707: it works now
flanker707: and it will work in the future
flanker707: and it's the most efficient by the way
flanker707: because I don't have to send anything redundant over the wire
fox_tim_l: ok consider this case:
fox_tim_l: in version 1 of the system the api is as it currently is
fox_tim_l: in version 2 we change everything to a c style api
fox_tim_l: because we decided to rewrite it all in C
fox_tim_l: but we still want it to work for all clients
fox_tim_l: i suppose my point is handling of versioning should not impose rules on how we design the server
fox_tim_l: since it may well cause us big problems in the future that we can't see now
fox_tim_l: but we won't be able to do anything about
fox_tim_l: since the clients will already be out there
flanker707: how is sending "object id - 10011010101011010....10010101" imposing rules on how you design the server?
flanker707: that byte blob can be anything
fox_tim_l: that's what i currently send
flanker707: no
fox_tim_l: change object id to version id
flanker707: you send version id - something with object id embedded -
flanker707: version id is redundant
fox_tim_l: object id is limiting you to objects
flanker707: because the server had enough information to figure out version id
flanker707: oh well, call it "id"
flanker707: and not object "id"
fox_tim_l: exactly
fox_tim_l: call it id
flanker707: id of "something" on the server that helps the server to figure out the version
fox_tim_l: then you are proposing exactly what we already have
flanker707: NOT the version
flanker707: but an id
fox_tim_l: this is the core point:
fox_tim_l: if you are just using your id to figure out the version (an extra lookup) then why not just send the version in the first place?
flanker707: because a lookup is cheaper than shoving bytes over the wire
fox_tim_l: no - you're sending an id over the wire in both cases
fox_tim_l: object id - 10011010101011010....10010101
fox_tim_l: we're just calling it version id now
flanker707: no
fox_tim_l: still just one id
flanker707: version id is a byte
flanker707: it can only have 256 values
flanker707: the "object id" or "id" that I am talking about is generic
flanker707: you can have an infinity of values
fox_tim_l: aha!
fox_tim_l: so can the byte?
fox_tim_l: actually
fox_tim_l: if you read the thread
fox_tim_l: this problem is addresses
fox_tim_l: addressed
fox_tim_l: what happens is this:
fox_tim_l: if the byte has values from 0-254
fox_tim_l: then that is your version
fox_tim_l: if the byte has value 255, then you look in the next byte
fox_tim_l: so in most cases you just use one byte
fox_tim_l: in the very unlikely event we have more than 255 versions of jboss messaging then we use the 2nd byte too
fox_tim_l: but this isn't going to happen probably
flanker707: yes, but you diverted the discussion
fox_tim_l: i discussed this with scott on the forum
flanker707: It's not whether the version can have 256 distinct values or not
flanker707: it's that you don't need to send any *version* information at all over the wire
flanker707: since you are already sending object ids
fox_tim_l: again, you're assuming a stateful api
flanker707: but WE HAVE a stateful API
fox_tim_l: currently
fox_tim_l: but there are good cases for some operations being not stateful as already described
flanker707: you could only come up with one case (message acknowledgment), and we didn't even discuss the real need for that in detail, and even if we do need it, the case can very well be accommodated by the current model
fox_tim_l: in summary, your solution works as long as we impose constraints (stateful etc) on the API
flanker707: no
flanker707: you didn't understand
fox_tim_l: my argument is that i don't think that is flexible enough
flanker707:
the solution works in any case
flanker707: if it needs to
flanker707: and by the way
flanker707: we don't have such needs
flanker707: right now
flanker707: it's a very well defined api
flanker707: stateful is fine
flanker707: because this is JMS
fox_tim_l: again i refer to send a message
flanker707: and it's simpler to keep it that way and not over engineer it with hypothetical exotic cases
fox_tim_l: this doesn't need to be stateful
fox_tim_l: it's unperformant as such
flanker707: you cannot say "unperformant" without coming with numbers showing the difference
fox_tim_l: ok, less performant
flanker707: no
flanker707: you cannot say that either
fox_tim_l: more bytes = slower
flanker707: until you have both systems compared
flanker707: yes, exactly :)
flanker707: now we're sending (version + object id)
flanker707: what I was proposing was to send (object id)
fox_tim_l: there are more bytes with stateful therefore it's slower
flanker707: so in my case we're sending fewer bytes
flanker707: so hence, according to your logic
flanker707: my case is faster
flanker707: but I am not saying that
flanker707: because I haven't measured it
fox_tim_l: no, i am talking about sending just version
fox_tim_l: no object id
flanker707: We are already sending version + object id
fox_tim_l: there is no object
fox_tim_l: it is a c style api
flanker707: I said this probably three or four times already
flanker707: with each invocation
flanker707: we are currently sending version + object id
flanker707: and I can prove that
flanker707: by printing out byte dumps of each invocation type
fox_tim_l: sorry we are talking at cross purposes
flanker707: let's re-align then
fox_tim_l: i was talking about how the minimum amount of information to send when sending a message is:
fox_tim_l: message, destination id, version id
flanker707: yes, in this very specific case
fox_tim_l: you want to send message, destination id, object id, method id
fox_tim_l: which is more stuff
fox_tim_l: quod erat demonstrandum
flanker707:
with your example, you still need to send method id
fox_tim_l: no
flanker707: ok,
flanker707: so how do you know it's sending the message and not acknowledging the message
flanker707: or just "please take a look at this message and tell me if you like its color"?
fox_tim_l: this is just a byte, not a string
flanker707: what is just a byte?
fox_tim_l: if you look in the marshaller
fox_tim_l: it adds a byte to represent the operation
flanker707: that's perfect
flanker707: but it needs to be there
fox_tim_l: if you use an aop object invocation then it sends a method hash
flanker707: in your case and in my case
flanker707: the cases are identical
flanker707: "something" to say what to do with this message
flanker707: either it's a byte
fox_tim_l: the string is much longer than the byte
fox_tim_l: more stuff
flanker707: or a 3000 character string
flanker707: yes
flanker707: agreed
flanker707: but I didn't say I will send a string either
flanker707: what I said
flanker707: was that in my case
flanker707: what goes over the wire is (state_id, operation_code, message)
fox_tim_l: so you're going to rewrite aop?
fox_tim_l: to not use method hashes?
flanker707: I am not talking about rewriting aop. You're again leaving the abstraction level we agreed on beforehand
flanker707: I don't care about AOP, C, or any other implementation
flanker707: what I am comparing right now
flanker707: is what you send over the wire in both cases
flanker707: in the "send version over the wire" case
flanker707: you send (version, operation_code, message, destination_id)
flanker707: in the "stateful" case
flanker707: you send (state_id, operation_code, message)
flanker707: so
fox_tim_l: ok
flanker707: version = 1 byte, operation_code = 1 byte, destination_id = 4 bytes
fox_tim_l: ok
flanker707: state_id = 4 bytes, operation_code = 1 byte
flanker707: you send 6 bytes
flanker707: I send 5
fox_tim_l: huh?
flanker707: how is this more stuff?
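
The byte counts compared above, and the escaped version byte fox_tim_l described earlier, can be checked with a small sketch. It is not the actual JBoss Messaging marshaller: the function names are hypothetical, the field order inside a frame is arbitrary, and the meaning of the second version byte (taken here as "version minus 255") is an assumption, since the conversation only says to "look in the next byte". Field widths follow the figures quoted in the chat.

```python
import struct


def encode_version(version):
    # One byte for versions 0-254; 255 is an escape meaning "look in the next byte".
    # ASSUMPTION: the second byte carries (version - 255).
    if 0 <= version <= 254:
        return bytes([version])
    if 255 <= version <= 255 + 255:
        return bytes([255, version - 255])
    raise ValueError("version out of range for this two-byte sketch")


def decode_version(data):
    # Returns (version, number of bytes consumed).
    if data[0] < 255:
        return data[0], 1
    return 255 + data[1], 2


def versioned_frame(version, op_code, destination_id, message):
    # "Send version over the wire" case: version, operation_code, destination_id, message.
    return encode_version(version) + struct.pack(">BI", op_code, destination_id) + message


def stateful_frame(state_id, op_code, message, destination_id=None):
    # "Stateful" case: state_id, operation_code, message.
    # destination_id is only added when the state id does not imply a destination.
    header = struct.pack(">IB", state_id, op_code)
    if destination_id is not None:
        header += struct.pack(">I", destination_id)
    return header + message


# Header overhead only, so the message body is left empty.
print(len(versioned_frame(1, 7, 42, b"")))                  # prints 6
print(len(stateful_frame(99, 7, b"")))                      # prints 5
print(len(stateful_frame(99, 7, b"", destination_id=42)))   # prints 9
```

The third figure corresponds to the case where the state id alone does not determine the destination, which costs the stateful layout four extra bytes. Which layout is cheaper overall therefore depends on how often each kind of invocation is sent, and that is a measurement these few lines do not attempt.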
fox_tim_l: you need to send destination id too
fox_tim_l: you send 3 more bytes
flanker707: state_id supposedly uniquely defines a server-side "object" that is unequivocally related to the destination
flanker707: the "producer id" in the JMS case
flanker707: so no
flanker707: no destination id
fox_tim_l: that doesn't work since you can have anonymous producers
flanker707: yes
flanker707: in that specific case
flanker707: you need to send an extra 4 bytes
fox_tim_l: it needs to work in all cases
flanker707: that's fine
flanker707: but statistically speaking, if you add them all together
flanker707: i think (again, I say I think, this needs statistical analysis) - it's better to send very little stuff a lot of times, and then just from time to time send a lot of stuff
fox_tim_l: anyway, we don't have server side producer objects any more
flanker707: than sending all the time more stuff than is necessary
flanker707: yes
flanker707: and that's a mistake :)
[...]
flanker707: I will post this discussion on the versioning forum
flanker707: for reference
flanker707: we cannot draw any definitive conclusion
flanker707: until we build two equivalent systems
flanker707: and perform statistical analysis on them
flanker707: until then
flanker707: everything we said is speculation
fox_tim_l: i believe there are valid conclusions that can be reached with no experimentation
flanker707: that's true
flanker707: but this is not one of them
flanker707: since there are a lot of side effect issues we didn't even consider
flanker707: the conclusion is that
flanker707: we won't change anything in the implementation anytime soon
flanker707: but alternative ways of implementing it are possible
fox_tim_l: sure
flanker707: and it's quite a tall order to reject a priori an alternate implementation
flanker707: just based on hunches
flanker707: and "but in the future we may want it to also be able to make coffee"-type of stuff
fox_tim_l: remember bill gates said we would never need more than 640K of
RAM?
flanker707: oh sure ... but also people were building "personal airplanes" in the fifties
fox_tim_l: in ten years time when we're programming in "super aop language" which has no objects and the new server has to work with old jboss messaging 1.1 clients ? ;)
[...]
---- End IM Conversation Transcript -----------------------------------------
View the original post : http://www.jboss.com/index.html?module=bb&op=viewtopic&p=3946337#3946337
Reply to the post : http://www.jboss.com/index.html?module=bb&op=posting&mode=reply&p=3946337