From: Kal A. <ka...@te...> - 2001-08-08 09:00:18
Hi all,

You will probably have seen the recent posting from Florian Haas about work he has done on implementing the TM4J API on top of a DOM representation of an XTM document. The approach he has taken is interesting in a number of respects:

1) The use of XPath queries to provide the topic map indexes
2) The maintenance of an underlying data structure (the DOM) which much more closely represents the source XTM file being processed.

My feeling is that were this package to be developed, it would provide a robust foundation for building a topic map editing package that can do some of the useful editor-type things that the existing in-memory package cannot do. For example, the DOM implementation could be made to preserve element ordering, and so preserve topic map elements on a "round-trip" editing cycle... this is something which would be a big advantage for manual editing. Also, having an integration with an XPath processor would enable an editor to construct arbitrary queries quite easily.

Florian and I have discussed how this package relates to the existing in-memory implementation and I think that this package has strengths that make it ideally suited for building an editing environment.

As a new project team, we don't really have a "process" for introducing new packages into the project. Perhaps at some stage in the future we should formalise the process... however, right now I am interested in hearing what the views of the other members of this list are. The main question is "Does this sound like something that should be a part of TM4J?"... the second question is "Is it something that you feel you could / would help out with?"

I look forward to hearing your thoughts.

[The email discussion between Florian and myself is copied at the end of this email].

Cheers,

Kal

-----------------------------------
Kal Ahmed
XML and Topic Maps Consultant
e: ka...@te...
p: +44 (0)7968 529 531
w: www.techquila.com
-----------------------------------

> Hi Kal,
>
> | Firstly, thanks for telling me about this development!
>
> No sweat at all. You're very welcome. :-)
>
> | I like the use of the XPath evaluations to provide the indexing.
>
> Well, I do think it's a nice approach, although I'm not too happy with the way I've implemented it so far -- it's actually quite quick and dirty. I think I could get this to run a lot more elegantly if I evaluated the XPath expression by some other means than using the static methods in the XPathAPI class. But for now, it's all "go with what works".

Always the best way ;-)

> | I suppose that theoretically, this DOM implementation could be layered over any
> | persistent storage mechanism that provides a DOM interface, right?
>
> Honestly: when I started on this I was merely playing with TM4J and something like "implementing this using the DOM would be nice". Just the let's-see-if-this-can-be-done type of thing, so no real worries if there is any real applicability in life. :-) But what you are saying certainly makes sense.

It would be interesting to see if the DOM implementation can play with one of the XML databases that provides a DOM interface. The Ozone database which TM4J uses to provide a persistent backend is also used by another project which layers XML content management on the database - perhaps that might be an interesting starting point...

> | I also think that if you can work out a way to make application of the topic naming constraint and topic
> | merging in general work without modifying the DOM tree [...]
>
> Not quite sure what you mean by "without modifying the DOM tree". Say I have a topic that's already in the tree, and the tree already represents a consistent TM. Now I want to add a new topic, which has the same base name in the same scope as that existing topic.
> Classic case of TNC-based merge: REMOVE existing topic, ADD merged topic. That's modifying the tree, right? I take nodes out and I put nodes in. How should I do this without modifying the tree? Please clarify.

What you describe is what should "logically" happen to the topic map; however, an application should be free to "physically" implement that in any way it sees fit. For example, if topic A merges with topic B, TM4J does not remove either topic; instead it makes B a "merged topic" of A. All topics know their list of merged topics and all merged topics know which topic they are merged with. This means that any topic, when asked for its characteristics (such as its names or occurrences), can actually return a collection containing the values of all of the topics it is merged with. So logically, it looks to an application as if A and B were merged; physically, they are separate, and using the API it is possible to get at A and B and modify them separately, and even to modify them in such a way that they "unmerge" and go back to being separate topics again. This kind of functionality would be incredibly useful in an editing environment where a merge may happen because the user enters a name string which happens to be used by another topic. And from an editor's point of view, I think it would be nice to have a file round-trip all <topic> elements regardless of whether they merge when processed or not...

> | [...] because the application would then be able to guarantee to maintain the
> | ordering of the XTM elements (something which the com.techquila.topicmap.memory
> | package cannot do).
>
> And com.techquila.topicmap.dom can't yet, either. For example, currently, if I want to add a scope to a base name that already contains a base name string, the DOM implementation simply appends the new <scope> AFTER the existing <baseNameString>, which makes the whole TM no longer valid XTM. Needless to say, this has to get fixed. And it will.
> :-)

We all have to start somewhere! :-)

> | So my gut feeling is that this implementation is most especially useful for editing
> | apps - does that sound right to you?
>
> Sounds good! Again, this is a classic let's-see-if-this-works venture. Compare this to a paragraph above; you'll see a pattern emerging here. :-) To be frank, this whole thing was like, OK, I'll give it a shot, as far as applicability is concerned -- that's where Kal comes in. And you did! :-) Kind of a naive approach, I know.
>
> | In principle, I would have nothing at all against including this as part of
> | the TM4J Project if we can make a clear distinction between what the dom
> | package is intended to do and what the memory package is intended to do.
>
> Well, I guess that distinction is easily made. Firstly, I suppose (gut feeling) that the DOM implementation trades perhaps enhanced flexibility for a major overhead. Just take a look at the object hierarchy. I would assume that the in-memory implementation is lightning fast and extremely lightweight by comparison. Secondly, for the time being, the DOM implementation is at best pre-experimental and far from even half-way complete. Its worst down side is, I guess, the lack of a parser. Anyway, everything is really under major construction, as I'm sure you've noticed.
>
> Now if you want a formal "yes, I'd like to join the development effort", you have it. But please be advised, as I already wrote, that my time is extremely limited in the next three weeks, so I doubt that I'll be making any progress during that time. After that, I can get things going again.

That would be cool. I would like to copy this message over to the TM4J Developer's list, and propose it as a new sub-project for TM4J. Is that OK with you? (I'm making it sound like we have some formal process... but we don't; I would just like it to be discussed in that forum too).
I'm in favour of adding this in as a foundation for building an editing environment that I think would provide useful features such as element-level round-tripping and arbitrary XPath expression evaluation. If the others agree (there are currently only two of us active as developers), then I would be happy to add you in to the development team and you can then upload your code to CVS whenever you feel ready. How does that sound?

Kal
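
For anyone following the merging discussion in the copied exchange, here is a minimal Java sketch of the bookkeeping described there: when topic A merges with topic B, neither is removed, B is recorded as a "merged topic" of A, and either topic reports the combined names when asked. `SketchTopic` and all of its method names are hypothetical and are not the TM4J API; only names are modelled, and chained merges (a topic that already has merged topics being merged into another) are deliberately left out.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical stand-in for a topic; NOT the TM4J Topic interface. */
final class SketchTopic {
    private final String id;
    private final List<String> ownNames = new ArrayList<>();
    // Topics merged into this one; insertion order is kept for stable output.
    private final Set<SketchTopic> mergedTopics = new LinkedHashSet<>();
    // The topic this one is merged with, or null if it stands alone.
    private SketchTopic mergedWith;

    SketchTopic(String id) {
        this.id = id;
    }

    String getId() {
        return id;
    }

    void addName(String name) {
        ownNames.add(name);
    }

    /** Records other as a merged topic of this one; neither topic is removed. */
    void addMergedTopic(SketchTopic other) {
        mergedTopics.add(other);
        other.mergedWith = this;
    }

    /** "Unmerges" other, so both go back to being separate topics. */
    void removeMergedTopic(SketchTopic other) {
        if (mergedTopics.remove(other)) {
            other.mergedWith = null;
        }
    }

    /** The topic that holds the merge list for this topic's group. */
    SketchTopic getBaseTopic() {
        return mergedWith == null ? this : mergedWith;
    }

    /** Logical view: the names of every topic in the merge group. */
    List<String> getNames() {
        SketchTopic base = getBaseTopic();
        List<String> all = new ArrayList<>(base.ownNames);
        for (SketchTopic t : base.mergedTopics) {
            all.addAll(t.ownNames);
        }
        return all;
    }

    /** Physical view: only the names stored on this topic itself. */
    List<String> getOwnNames() {
        return new ArrayList<>(ownNames);
    }
}
```

With two topics `a` and `b`, calling `a.addMergedTopic(b)` makes `getNames()` on either one return the names of both, while `getOwnNames()` still shows what each topic holds physically. Calling `a.removeMergedTopic(b)` separates them again, which is the property that matters for an editor that wants to round-trip every <topic> element.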