From: Xuân B. <xua...@ba...> - 2008-04-05 10:59:58
|
Conal Tuohy wrote: > On Tue, 2008-03-11 at 22:36 +1300, Xuân Baldauf wrote: > >> Conal Tuohy wrote: >> >>> On Tue, 2008-03-11 at 16:39 +1300, Xuân Baldauf wrote: >>> >>> >>> >>>> If I remember correctly, I think >>>> http://tm4j.cvs.sourceforge.net/tm4j/tm4j/src/org/tm4j/topicmap/tmdm/tm4j1/TopicMapObjectImpl.java?revision=1.5&amp;view=markup#l_126 is the reason for equalsByID(). This provider is a bridge between XML Topic Maps (in the style of TM4J 0.9.x) and TMDM (in the style of TM4J2). TMDM in general, and the TM4J2-implementation of TMDM in particular, does not require a TopicMapObject to have exactly one ID. Consequently, getID() is not defined, strictly speaking. If getID() is called anyways, some compatibility magic auto-creates an ID if no ID cannot be found or somehow inferred. However, this is a kludge (as a read-only method has a state-changing behaviour), and should be avoided, especially if the only purpose of calling getID() is comparing whether two TopicMapObjects are equal in some sens >>>> e (which >>>> should be read-only-process in turn). You do not want to have an OutOfMemoryError just because you called getID(). >>>> >>>> >>> Why not just use the equals() method then, Xuan? >>> >>> >> Personally, I do not care about the name. However, care should be >> taken because we have multiple notions of equality in topic maps and >> in TM4J (like: Are two Topic objects, which are merged, equal? Are two >> Topic objects, which are not yet merged, but should be merged, >> equal?). >> >>> Isn't that the purpose of equalsByID()? >>> >>> >> Maybe, maybe not. The purpose of equalsByID() is "If I would take the >> IDs of the two topic map items involved and compare them, would that >> comparison yield true?". >> > > What does "equals by ID" mean in the case of TMDM topic maps in which > objects may have multiple ids? Would the comparison require that the two > topic map items had at least one item identifier in common? > I have been digging into this (again). When looking at the call sites where equalsByID() is used apart from being used internally to a backend, there is only http://tm4j.cvs.sourceforge.net/tm4j/tm4j/src/org/tm4j/topicmap/utils/TopicMapWalker.java?r1=1.24&r2=1.25 . In this case, the walk() method first walks over all topic objects and then over all association objects. Over all topic objects? No. It does not walk over topic objects which are in "slave" mode when being merged. That needs a little bit more elaboration. In TM4J1, topic objects just exist initially. However, when two topic objects are merged, they still exist independently as two distinct Java objects. However (unless specified explicitly by the user), after merging, the two topic objects are somehow connected internally and behave like one merged topic. However, in one respect they cannot behave like one merged topic, and this is regarding the pointers to them, the pointers are still not equal. So if somebody is encountering both pointers (i.e. by walking over the list of topic objects of a topic map object or by other means), the topic will be walked over twice, which should not happen, because two topics are merged into one. Now, in TM4J1, the "somehow connected internally" between two or more topic objects is implemented by electing one master topic object, all the other ones being slave topic objects. For example, in the memory backend, the slave topic objects have the "m_baseTopic" field set to point to the master topic object, while the master topic object has the "m_mergedTopics" field set to an ArrayList of slave topic objects. The topic objects in the slave mode and topic objects in the master mode (should) behave identical, however, apparently, they differ with respect to the method getID(). Thus, the expression "t.getID().equals(t.getBaseTopic().getID())" (which was scattered all over TM4J1 before I changed this a little bit) evaluates only to "true" if "t" is actually the master mode topic object of a merged topic. (Topic objects which were not merged with any other topic object are in master mode by default.) Because there is only one master mode topic object per merged topic, using this expression, it was tried to ensure that all topics, being merged or not, are walked over only once. What the author of this expression presumably meant, is "is this topic object in master mode?", and thus the author should have refactored all occurrences of this expression to "t.isInMasterMode()" (or "t.isBaseTopic()" when retaining the nomenclature). However, he did not, and refactoring "t.getID().equals(t.getBaseTopic().getID())" into "t.equalsByID(t.getBaseTopic())" was one step into this direction. For the way TMDM topic maps are implemented in TM4J2, there is no notion at all of a master mode or a slave mode a topic object can be in. In the TMDM implementation, topic objects are either always in slave mode (then they are instance of class BasicTopic, see http://tm4j.cvs.sourceforge.net/tm4j/tm4j/src/org/tm4j/topicmap/tmdm/basic/ ), or they are always in master mode (then they are currently instance of class MergedTopic, see http://tm4j.cvs.sourceforge.net/tm4j/tm4j/src/org/tm4j/topicmap/tmdm/merged/ ). There is a distinction of where you apply the changes (in BasicTopic) and where you see the consequences (in MergedTopic). Why this is so, is a different, well, topic (but I'm happy to explain that). For what is needed here, both BasicTopic and MergedTopic are wrapped by org.tm4j.topicmap.tmdm.tm4j1.TopicImpl to an TM4J1-compatible interface, and the method tm.getTopicsIterator() called by method TopicMapWalker.walk() always yields (in the TMDM wrapping layer case) a list of these TopicImpl objects, where merged topic objects do not appear twice. Thus, "t.equalsByID(t.getBaseTopic())" needs to yield "true" in every case when it is called at this call site, and, indeed, it does, as TopicImpl.getBaseTopic() always returns "this" (see http://tm4j.cvs.sourceforge.net/tm4j/tm4j/src/org/tm4j/topicmap/tmdm/tm4j1/TopicImpl.java?revision=1.10&view=markup#l_553 ). > >> Whether this definition happens to be equivalent or not to other >> definitions of equality: I do not know, and this is out of the scope >> of equalsByID(), and this is because I chose "equalsByID()" as name >> and not "equals()". >> > > OK. > > I think it would be good to actually determine the existing semantics of > equals() (in the existing providers), and if this is compatible with the > requirements of equalsByID() (or can easily be made so), to use the name > "equals" rather than "equalsByID". > Having the analysis of above in mind, I'd suggest doing the refactoring into the direction of Topic.isInMasterMode() (or Topic.isBaseTopic() respectively). Then we can get rid of equalsByID() in the interface. (However, we can go even further by defining TopicMap.getMasterModeTopicsIterator(), that is, this method's contract guarantees that no slave topics (i.e. "double topics") are returned. This is what the TMDM wrapper backend can provide (and other backends can provide, too, by using the legacy method TopicMap.getTopicsIterator() and filtering out any slave topic object), while Topic.isInMasterMode() is not meaningful at all for the TMDM wrapper backend.) > >>> Or is there some difference in semantics between equals() and >>> equalsByID()? >>> >>> >> Depends on the intended semantics of equals() and, as they do not seem >> to be defined currently (like equals() is not overriden), a difference >> between equals() and equalsByID() is also not defined. >> >> ciao, >> Xuân. >> >> P.S.: If you happen to be in Auckland in the next months, then just >> drop me a note, then we can meet in person. :-) >> > > I don't get up there much, but you never know; I might well be. When are > you planning to be in Auckland? > From between now and the next 3 month (I'm already there). ciao, Xuân. |