Re: [Tm4j-developers] TopicMapObject.equalsByID - is it really needed?

Hi Xuan

I can see that you have defined a new set of interfaces for the new data
model. The point I'm trying to make is that I think you should have done
that WITHOUT ALSO making (any) change to the existing interfaces for the
old data model. 

You say you've only made this one change, and it's true it's not in
itself a big change. I would be quite happy for this method to exist
(though I'd prefer it to be called equals() as I said earlier), but more
importantly, the fact that you added the method at all is indicative of
a design approach you have taken which I think is mistaken, and which
has actually caused me some problems.

You agreed with me that the data models are incompatible, yet your
change to the TopicMapObject interface is precisely for the purpose of
establishing (limited) compatibility. This still doesn't make sense to
me: it seems to me that if the 2 models are really incompatible, then
there is no need at all to change the old interface. So I think there's
a contradiction in what you have said.
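
As a rough sketch of the alternative I have in mind: a by-ID comparison
can live in a helper beside the old interface, so that code implementing
the interface outside the project keeps compiling. Everything named here
(LegacyTopicMapObject, ExternalTopic, sameId) is a hypothetical stand-in,
not the real TM4J types, and I am only assuming that this is roughly what
equalsByID is meant to do.

```java
final class InterfaceStabilitySketch {

    // Hypothetical stand-in for the old, frozen interface.
    // The point of the sketch is that no method is ever added here.
    interface LegacyTopicMapObject {
        String getID();
    }

    // An implementation maintained outside the project. It keeps compiling
    // because the interface above never gains a new abstract method.
    static final class ExternalTopic implements LegacyTopicMapObject {
        private final String id;

        ExternalTopic(String id) {
            this.id = id;
        }

        @Override
        public String getID() {
            return id;
        }
    }

    // The new behaviour sits beside the interface instead of inside it.
    // Assumption: "equal by ID" means both objects report the same non-null ID.
    static boolean sameId(LegacyTopicMapObject a, LegacyTopicMapObject b) {
        if (a == null || b == null) {
            return false;
        }
        String id = a.getID();
        return id != null && id.equals(b.getID());
    }

    public static void main(String[] args) {
        LegacyTopicMapObject first = new ExternalTopic("t1");
        LegacyTopicMapObject second = new ExternalTopic("t1");
        LegacyTopicMapObject other = new ExternalTopic("t2");
        System.out.println(sameId(first, second));
        System.out.println(sameId(first, other));
    }
}
```

The trade-off is that callers write sameId(a, b) rather than
a.equalsByID(b), but no existing implementation has to change.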

It appears to me that the change to the interface was to make it
convenient to use common code to process both XTM 1 and XTM 2.
Specifically, I'm referring to the revisions to org.tm4j.topicmap.utils
in which you introduced some XTM 2 support:

http://tm4j.cvs.sourceforge.net/tm4j/tm4j/src/org/tm4j/topicmap/utils/XTMParser.java?r1=1.19&r2=1.20
http://tm4j.cvs.sourceforge.net/tm4j/tm4j/src/org/tm4j/topicmap/utils/XTMBuilder.java?r1=1.72&r2=1.73

These classes used to parse XTM 1 and build an XTM1-flavoured topic map
graph.

If I understand it correctly, they still build a topic map conforming to
the old model (i.e. not the TMDM), but you have changed them so that they
will parse documents which are either:

1) XTM 1;
2) a subset of (?) XTM 2; or
3) some kind of hybrid of XTM 1 and 2;

Dealing with those in order:

1) The XTM 1 support is currently broken in HEAD because of a change to
the handling of xtm:member/xtm:topicRef (there's even a comment in the
code to this effect, saying that it causes problems but "is correct" -
something I don't understand at all).

2) I don't understand how you are planning to finish this; how will you
handle the other discrepancies between the 2 data models? How would you
handle typed names for example?

3) I don't think this is desirable at all. IMHO the parser/builder would be
much improved by adding validation to ensure that the markup conformed
to the correct schema. Allowing for a mixture of the 2 is a step away
from that. 

In any case I don't think it's a good idea to mix up the 2 markup
languages (XTM 1 and 2) in the same parser. In my opinion you should
create a distinct XTM2Parser, and it should use an XTM2Builder to build
a TMDM graph. At the moment we have a parser/builder which does not
fully handle either XTM 1 or 2, which is regrettable.
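
To make the separation concrete, here is a minimal sketch of how a front
end might decide which parser to hand a document to, and refuse hybrids
outright. It uses only the JDK's StAX API and none of the TM4J classes;
the class and method names (XtmVersionSniffer, detect) are hypothetical.
The two namespace URIs are the ones given in the XTM 1.0 and XTM 2.0
specifications. Note that this only looks at element namespaces, so it
is not a substitute for validating against the schema.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

final class XtmVersionSniffer {

    static final String XTM1_NS = "http://www.topicmaps.org/xtm/1.0/";
    static final String XTM2_NS = "http://www.topicmaps.org/xtm/";

    enum XtmVersion { XTM_1, XTM_2 }

    // Maps a namespace URI to an XTM version, or null for anything else.
    private static XtmVersion versionOf(String namespaceUri) {
        if (XTM1_NS.equals(namespaceUri)) {
            return XtmVersion.XTM_1;
        }
        if (XTM2_NS.equals(namespaceUri)) {
            return XtmVersion.XTM_2;
        }
        return null;
    }

    // Returns the version given by the root element's namespace, and throws
    // IllegalArgumentException if the document is not XTM or mixes versions.
    static XtmVersion detect(String xml) {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, Boolean.FALSE);
        XtmVersion root = null;
        try {
            XMLStreamReader reader =
                    factory.createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                if (reader.next() != XMLStreamConstants.START_ELEMENT) {
                    continue;
                }
                XtmVersion seen = versionOf(reader.getNamespaceURI());
                if (root == null) {
                    if (seen == null) {
                        throw new IllegalArgumentException(
                                "Root element is not in an XTM namespace");
                    }
                    root = seen;
                } else if (seen != null && seen != root) {
                    throw new IllegalArgumentException(
                            "Document mixes XTM 1 and XTM 2 elements");
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Document is not well-formed XML", e);
        }
        if (root == null) {
            throw new IllegalArgumentException("No root element found");
        }
        return root;
    }

    public static void main(String[] args) {
        String xtm1 = "<topicMap xmlns=\"" + XTM1_NS + "\"><topic id=\"t1\"/></topicMap>";
        String xtm2 = "<topicMap xmlns=\"" + XTM2_NS + "\" version=\"2.0\"/>";
        System.out.println(detect(xtm1));
        System.out.println(detect(xtm2));
    }
}
```

A caller would switch on the result: XTM_1 goes to the existing
XTMParser/XTMBuilder pair, XTM_2 to the separate XTM2Parser/XTM2Builder,
and neither parser ever has to know about the other's markup.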

> Additionally, I considered (and still consider) this API "public, but 
> extendable" (although I do not intend any further changes). 

Well - my perspective is different. My view is that the old interface is
adequate for XTM 1 and should not in general be changed (certainly not
without consensus). Just as the XTM 1 standard still exists and remains
stable these last several years, so the interface should remain stable
as well. Naturally XTM 2 is entitled to exist and to formally
"supersede" XTM 1, but it does not actually modify XTM 1, and neither,
IMHO, should TMDM support in TM4J2 require changes to the XTM1 parts of
TM4J.

> But even 
> though I do know now that there is external software implementing the 
> API, I still think an approach like it is done with the Linux kernel is 
> appropriate:
> 
>    1. What is in the kernel gets refactored appropriately by those who
>       do the refactoring.

yes

>    2. What is outside but implements an inside interface has to be
>       updated by the outside maintainers in order to mirror the kernel
>       development.

yes

>    3. Development freeze (=stopping changing APIs) over extended periods
>       of time is unacceptable.

Here I disagree. I believe that certain APIs should indeed be frozen
over extended periods. 

These are the APIs that correspond closely to e.g. the standard markup
languages and data models.

At the very least, when these public APIs are changed they MUST be
announced.

>    4. Who wants to be really conservative and not go the way of being
>       up-to-date (both regarding breaking code and regarding reaping the
>       fruits of the development) may stick with an old version.

I believe that being conservative has its place. I am not at all opposed
to adding new APIs, but changing existing APIs should not be done
lightly IMHO.

There is nothing stopping you from adding new APIs, and adding XTM 2
support - the only problem is that you have disrupted the XTM 1 support
in the process. I am currently working from an old revision because of
bugs in XTM 1 support in the current version. This is why agreement of
the other committers to adding XTM 2 support was conditional on that
change being made in a branch, and kept separate from the XTM 1-related
code. Particularly for people who are running TM4J in production, it's
important to be able to fix bugs while otherwise retaining stability.
Our project at NZETC is heavily reliant on XTM 1,
with little or no prospect of adopting XTM 2. For us, XTM 2 development
is significantly less important (at least in the short term) than
retaining stability of the XTM 1-compatible code base.

So this is why I believe we must now create a branch (based on a
revision prior to the introduction of XTM 2 code). We can call the
branch "TM4J_1" and we can keep all XTM 2 features out of it. This
branch can then be developed until the release of TM4J 1.0. In the
meantime, Xuan, you can continue your work on XTM 2 in the trunk. 

> Additionally, specific to TM4J, there is low to virtually no activity on 
> the project in the last 12 months (at least when I exclude commits by 
> myself), so even for major (API) changes, there is not really any reply 
> expected. Thus, it is quite unreasonable to ask and wait for no reply. 
> Thus, I think, if there is any progress to be expected at all, the 
> burden of disagreeing (like writing e-mails and debating pros and cons) 
> should be on those being otherwise passive (i.e. by default every change 
> is allowed unless challenged, not by default every change is denied, 
> unless allowed).

I disagree strongly with this. Disruptive changes to the code should be
discussed in public, or at least proposed publicly, so that they CAN be
discussed. I think there are great advantages to following such a
collaborative process. 

Anyway ... consider your revision "challenged". I am not just going to
revert your changes but I do expect that we attempt to reach a
consensus now.

I want to come to an arrangement in which the TM4J_1 branch is
established purely for XTM 1.0, and the trunk is used for developing XTM
2.0 support in a way which does not mix up XTM 1 and XTM 2. As I
explained earlier, my primary interest is in XTM 1.0 (because of my
other software which produces XTM 1), and hence I don't think I'll be
able to do much on the XTM 2 branch, but all the same I still have an
opinion on how it should be done, and I don't want to be left out of the
loop :-)

> I'd like to ask you to make your API implementation publicly writable 
> (e.g. import it into the TM4J project under some open source licenses), 
> then bad surprises like a need to adapt to new versions can be avoided.

Yeah - I will be contributing the sqlprovider code when it is
substantially more complete. At present it is still missing quite a few
bits - a number of TopicMapUtils methods, and most of the indexing
interface. It's probably still a couple of months off, at least.

-- 
Conal Tuohy
New Zealand Electronic Text Centre
www.nzetc.org