Re: [TM4J-users] [Tm4j-developers] TopicMapObject.equalsByID - is it really needed?

Conal Tuohy wrote:
> Hi Xuan
>
> I can see that you have defined a new set of interfaces for the new 
> data model. The point I'm trying to make is that I think you should 
> have done that WITHOUT ALSO making (any) change to the existing 
> interfaces for the old data model.
>
> You say you've only made this one change, and it's true it's not in 
> itself a big change. I would be quite happy for this method to exist 
> (though I'd prefer it to be called equals() as I said earlier), but 
> more importantly, the fact that you added the method at all is 
> indicative of a design approach you have taken which I think is 
> mistaken, and which has actually caused me some problems.
>
> You agreed with me that the data models are incompatible, yet your 
> change to the TopicMapObject interface is precisely for the purpose of 
> establishing (limited) compatibility. This still doesn't make sense to 
> me: it seems to me that if the 2 models are really incompatible, then 
> there is no need at all to change the old interface. So I think 
> there's a contradiction in what you have said.
>
> It appears to me that the change to the interface was to make it 
> convenient to use common code to process both XTM 1 and XTM 2.
If you mean the equalsByID() change, then, no, this change was 
independent of the XTM 2 reading support.
> Specifically, I'm referring to the revisions to 
> org.tm4j.topicmap.utils in which you introduced some XTM 2 support:
>
> http://tm4j.cvs.sourceforge.net/tm4j/tm4j/src/org/tm4j/topicmap/utils/XTMParser.java?r1=1.19&r2=1.20
> http://tm4j.cvs.sourceforge.net/tm4j/tm4j/src/org/tm4j/topicmap/utils/XTMBuilder.java?r1=1.72&r2=1.73
>
> These classes used to parse XTM 1 and build an XTM1-flavoured topic 
> map graph.
>
> If I understand it correctly, it still builds a topic map conforming 
> to the old model (i.e. not the TMDM), but you have changed it so that 
> it will parse documents which are either:
>
> 1) XTM 1;
> 2) a subset of (?) XTM 2; or
> 3) some kind of hybrid of XTM 1 and 2;
>
> Dealing with those in order:
>
> 1) The XTM 1 support is currently broken in HEAD because of a change 
> to the handling of xtm:member/xtm:topicRef (there's even a comment in 
> the code to this effect, saying that it causes problems but "is 
> correct" - something I don't understand at all).
Well, that's a bug. Maybe there is a testcase available?
>
> 2) I don't understand how you are planning to finish this; how will 
> you handle the other discrepancies between the 2 data models? How would 
> you handle typed names for example?
>
> 3) I don't think this is desirable at all. IMHO the parser/builder would be 
> much improved by adding validation to ensure that the markup conformed 
> to the correct schema. Allowing for a mixture of the 2 is a step away 
> from that.
>
> In any case I don't think it's a good idea to mix up the 2 markup 
> languages (XTM 1 and 2) in the same parser. In my opinion you should 
> create a distinct XTM2Parser, and it should use an XTM2Builder to 
> build a TMDM graph.
Yes, I agree one should build a nice XTM2 parser because the current 
parser actually parses a subset of the union of the XTM1 and XTM2 
languages. However, building a separate XTM2 parser is a considerable 
effort, and the current XTM2 support is a hack (that is, it works for 
what it was needed for, it was written relatively quickly using existing 
infrastructure, but it also allows more than should be allowed). 
Allowing more than it should is not really a problem, as there are 
external XML validators available which can reject everything which is a 
real superset of either XTM1 or XTM2. Breaking correct code is a 
problem, though (but one which can be solved).
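
As a rough illustration of that external validation step, here is a minimal Java sketch using the JDK's built-in DTD validation. The class name DtdPreCheck and the tiny inline DTD are made up for this example (they are not TM4J code and not the real XTM DTD); a real setup would point the parser at the official XTM 1 DTD or an XTM 2 schema instead.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

/** Hypothetical pre-check: validate a document before handing it to the lenient parser. */
public class DtdPreCheck {
    /** Toy DTD for illustration only; NOT the real XTM DTD. */
    public static final String TOY_DTD =
        "<!DOCTYPE topicMap ["
        + "<!ELEMENT topicMap (topic*)>"
        + "<!ELEMENT topic EMPTY>"
        + "<!ATTLIST topic id ID #REQUIRED>"
        + "]>";

    /** Returns true only if the document is well-formed and valid against its DTD. */
    public static boolean isValid(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true); // turn on DTD validation
            DocumentBuilder builder = factory.newDocumentBuilder();
            // By default validation errors are only reported; rethrow them so they fail the check.
            builder.setErrorHandler(new ErrorHandler() {
                public void warning(SAXParseException e) { /* ignore warnings */ }
                public void error(SAXParseException e) throws SAXException { throw e; }
                public void fatalError(SAXParseException e) throws SAXException { throw e; }
            });
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (Exception e) {
            return false; // not well-formed, not valid, or parser setup failed
        }
    }
}
```

With such a check in front, the lenient parser would only ever see documents that already passed the stricter test.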

If someone finds time to write a clean XTM2 parser - great. But XTM2 is 
not that far away from XTM1, so to support typed names, an XTM1+2 
parser could peek and poke through the wrapper for this special case. 
That is not the best design; however, given that XTM1 and XTM2 are fixed 
standards and there is only a small list of differences between them, it 
is probably considerably more economical to make the current XTM1 parser 
XTM2-aware than to build a new XTM2 parser from scratch.
> At the moment we have a parser/builder which does not fully handle 
> either XTM 1 or 2, which is regrettable.
>
>>  Additionally, I considered (and still consider) this API "public, but  
>>  extendable" (although I do not intend any further changes).  
>>     
>
> Well - my perspective is different. My view is that the old interface 
> is adequate for XTM 1 and should not in general be changed (certainly 
> not without consensus). Just as the XTM 1 standard still exists and 
> remains stable these last several years, so the interface should 
> remain stable as well. Naturally XTM 2 is entitled to exist and to 
> formally "supersede" XTM 1, but it does not actually modify XTM 1, and 
> neither, IMHO, should TMDM support in TM4J2 require changes to the 
> XTM1 parts of TM4J.
It should not, but TM4J1 (and the Java language itself, too) are not 
so modular that TM4J2 support can just be "merged in" like you can 
"merge in" a topic map. TMDM support in TM4J2 is not intended to be a 
wholly separate TM engine; it should leverage all the existing TM4J1 
applications, else I would have made the TMDM support in TM4J2 a 
separate open source project right from the start. So if two systems are 
to connect, I think it is reasonable to build bridges at both ends (i.e. 
the XTM1 parts at one end and the TMDM support at the other end) where 
it makes the most sense.
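
For the sake of discussion, here is a minimal Java sketch of what a bridge for ID-based comparison could look like if it were placed in a helper class instead of in the interface itself, so that external implementations of the interface would not need to change. All names here (IdBridge, LegacyObject, TmdmWrapper) are hypothetical stand-ins, not the actual TM4J types, and I am assuming that every object exposes its ID through a getID() method.

```java
/** Hypothetical sketch; none of these types are real TM4J classes. */
public class IdBridge {
    /** Stand-in for the legacy interface, which stays unchanged. */
    public interface LegacyObject {
        String getID(); // assumed accessor for the object's ID
    }

    /** Stand-in for a TMDM-side object that wraps a legacy object. */
    public static final class TmdmWrapper implements LegacyObject {
        private final LegacyObject delegate;

        public TmdmWrapper(LegacyObject delegate) {
            this.delegate = delegate;
        }

        public String getID() {
            return delegate.getID(); // the wrapper reports the wrapped object's ID
        }
    }

    /** Compares two objects by ID only, regardless of which backend they come from. */
    public static boolean equalsByID(LegacyObject a, LegacyObject b) {
        if (a == null || b == null) {
            return false;
        }
        String id = a.getID();
        return id != null && id.equals(b.getID());
    }
}
```

The trade-off is that a static helper cannot be overridden by a backend that knows a faster way to compare, which is one argument for having the method on the interface instead.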
>
>>  But even  
>>  though I do know now that there is external software implementing the  
>>  API, I still think an approach like it is done with the Linux kernel is  
>>  appropriate: 
>>
>>     1. What is in the kernel gets refactored appropriately by those who 
>>        do the refactoring.
>>     
> yes
>>     2. What is outside but implements an inside interface has to be 
>>        updated by the outside maintainers in order to mirror the kernel 
>>        development.
>>     
> yes
>>     3. Development freeze (=stopping changing APIs) over extended periods 
>>        of time is unacceptable.
>>     
> Here I disagree. I believe that certain APIs should indeed be frozen 
> over extended periods.
>
> These are the APIs that correspond closely to e.g. the standard markup 
> languages and data models.
>
> At the very least, when these public APIs are changed they MUST be 
> announced.
I agree. As I said, I had a closed-world assumption (I expected that the 
only recipients of such an announcement would be the 3 existing 
backends, and in this case it was easier and faster to just change them).
>>     4. Who wants to be really conservative and not go the way of being 
>>        up-to-date (both regarding breaking code and regarding reaping the 
>>        fruits of the development) may stick with an old version.
>>     
> I believe that being conservative has its place. I am not at all 
> opposed to adding new APIs, but changing existing APIs should not 
> be done lightly IMHO.
>
> There is nothing stopping you from adding new APIs, and adding XTM 2 
> support - the only problem is that you have disrupted the XTM 1 
> support in the process. I am currently working from an old revision 
> because of bugs in XTM 1 support in the current version. This is why 
> agreement of the other committers to adding XTM 2 support was 
> conditional on that change being made in a branch, and kept separate 
> from the XTM 1-related code. Particularly for people who are running 
> TM4J in production, it's important to be able to 
> fix bugs, while otherwise retaining stability. Our project at NZETC is 
> heavily reliant on XTM 1, with little or no prospect of adopting XTM 
> 2. For us, XTM 2 development is significantly less important (at least 
> in the short term) than retaining stability of the XTM 1-compatible 
> code base.
>
> So this is why I believe we must now create a branch (based on a 
> revision prior to the introduction of XTM 2 code). We can call the 
> branch "TM4J_1" and we can keep all XTM 2 features out of it.
Now, for what it is worth, I have created such a branch long ago, and I 
have called it "TM4J_1_x", very similar to what you propose. However, I 
created this branch deliberately _after_ introducing the XTM2 reading 
code (which, however, only reads XTM2 files into the XTM1 data model), 
because I felt it would benefit TM4J1 users by allowing them to remain 
longer on TM4J1 before switching to TM4J2 while the rest of the world 
starts emitting XTM2 documents, similar to the .odt support in 
OpenOffice 1.1.5 (.odt is the file format of OpenOffice 2). Of course, I 
was not aware of that XTM1 topicRef bug. If I fix this bug (both in the 
trunk and in the "TM4J_1_x" branch), do you think this "TM4J_1_x" branch 
would reflect your intended "TM4J_1" branch closely enough? If not, please 
feel free to create another branch from a revision before the 
introduction of the XTM2 reading code.
> This branch can then be developed until the release of TM4J 1.0. In 
> the meantime, Xuan, you can continue your work on XTM 2 in the trunk.
:-)
>
>>  Additionally, specific to TM4J, there is low to virtually no activity on  
>>  the project in the last 12 months (at least when I exclude commits by  
>>  myself), so even for major (API) changes, there is not really any reply  
>>  expected. Thus, it is quite unreasonable to ask and wait for no reply.  
>>  Thus, I think, if there is any progress to be expected at all, the  
>>  burden of disagreeing (like writing e-mails and debating pros and cons)  
>>  should be on those being otherwise passive (i.e. by default every change  
>>  is allowed unless challenged, not by default every change is denied,  
>>  unless allowed). 
>>     
> I disagree strongly with this. Disruptive changes to the code should 
> be discussed in public, or at least proposed publicly, so that they 
> CAN be discussed. I think there are great advantages to following such 
> a collaborative process.
>
> Anyway ... consider your revision "challenged". I am not just going to 
> revert your changes but I do expect that we attempt to reach a 
> consensus now.
>
> I want to come to an arrangement in which the TM4J_1 branch is 
> established purely for XTM 1.0, and the trunk is used for developing 
> XTM 2.0 support in a way which does not mix up XTM 1 and XTM 2. As I 
> explained earlier, my primary interest is in XTM 1.0 (because of my 
> other software which produces XTM 1), and hence I don't think I'll be 
> able to do much on the XTM 2 branch, but all the same I still have an 
> opinion on how it should be done, and I don't want to be left out of 
> the loop :-)
I think, with your branching suggestion, we have pretty much reached such 
an agreement (or are at least very near to it, as I still think that it 
sometimes makes sense to change the TM4J1 legacy code within TM4J2 to 
integrate with the TMDM backend). What do you think?

(It is a bit of a shame that the code is not (and presumably never 
has been) in a state where all the testcases succeed. If it were, we 
could switch to test-driven development where nearly every change is 
okay unless a testcase fails.)
>
>>  I'd like to ask you to make your API implementation publicly writable  
>>  (e.g. import it into the TM4J project under some open source licenses),  
>>  then bad surprises like a need to adapt to new versions can be avoided. 
>>     
>
> Yeah - I will be contributing the sqlprovider code when it is 
> substantially more complete. At present it is still missing quite a 
> few bits - a number of TopicMapUtils methods, and most of the indexing 
> interface. It's probably still a couple of months off, at least.
Maybe you want to share it anyways, even if it is not complete. :-) This 
is okay in the CVS HEAD (not in releases, though), and it would give 
other people the opportunity to do some parts of the work needed.
>
> -- 
> Conal Tuohy
> New Zealand Electronic Text Centre
> www.nzetc.org <http://www.nzetc.org>