From: Victor M. <vi...@ou...> - 2007-10-19 13:37:38
Hi Tony:

Sorry for being slow to respond. I have been out sick and am trying to catch up. Also, it looks like I forgot to include the mailing list in my last response, so I apologize to those monitoring the list for the disconnect. I am including the list in this response.

You wrote:

> Though the aXSL page doesn't mention Java until near the end
> and, for example, SAX and DOM are/have APIs that have been
> implemented for more than just Java.

The intent has always been that aXSL could support other languages, as you mention for SAX and DOM. The distribution files include the word "java" in them, to theoretically distinguish them from other versions. In fact, I would love to experiment with CORBA and the like to try to get interoperability between modules written in different languages. (That is a big topic beyond the scope of this thread; I mention it only as an insight into my wish list.)

I would love to have a C# or C++ distribution of aXSL. There are several impediments at the moment:

1. I don't have the bandwidth to do it. In addition to aXSL, I am busy with FOray, a full-time job, family, etc. I have sought but been unable to find corporate support for either aXSL or FOray, so I devote less time to them than I would like. I am also not a very experienced C-ish programmer, so I am probably the wrong person to do that particular task, efficiently at least. I will say this: if you "get" what aXSL is about and wish to champion a C-ish version of it, I think that would be extremely useful, and I would help in any way I could within the constraints mentioned above.

2. A C-ish reference application would seem to be needed to at least show the general utility of the thing. That will have to come from someone besides me.

3. aXSL addresses a bigger issue than SAX or DOM. In fact, most of the feedback I get about aXSL is that it is too big a topic/problem for a general interface to be effective. I disagree with that conclusion, but there is a bigger mass of work that needs to be converted.

4. aXSL is still a moving target.
The API itself is still mostly experimental, and this makes maintenance difficult, not just for implementations, but also for conversion to other languages. This will change as the API becomes more stable, but it would be a concern at the moment.

> > So I agree that all of the implementations intersect at the
> > output. But for those that use the same inter-module API,
> > there is potential at least for testing at those points as well.
>
> Though at present that's only FOray.

Right. I suppose that every general API ever created started with exactly one implementation. My attitude is that others are welcome to join if they want to. If not, the benefits to FOray of having this extra layer of modularization are still worth it. Each FOray module is independent of the others. You cannot believe how much garbage and bad design I was able to clean up just because, when you have to expose something in a general API, it looks stupid. If FOray remains forever the one and only implementation of aXSL, the effort was still extremely well justified for me, for those benefits alone.

> I'm not trying to impugn the worth of having an API for XSL
> processing, but for the foreseeable future, aXSL is most
> useful to xmlroff if there is something useful behind those
> two words "Testing suites" on the aXSL home page.

I understand. This means two things in my mind: 1) unit testing, and 2) area-tree (black-box) testing. Unifying unit tests across language platforms is not feasible at the moment, but I do think there are places where black-box testing of XML output could be quite platform-agnostic. I think that FOP's adoption of such a strategy was one of the key things that revived it. If their scheme were made into a more general solution, it would be very useful for what we are talking about here. I started a DTD for area-tree representation for aXSL at one point, and that would be a necessary prerequisite for this task.
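To make the black-box idea concrete, here is a minimal sketch of what a platform-agnostic area-tree comparison might look like. The element and attribute names below are hypothetical, not the actual aXSL DTD (which is only in a started state), and the set of "insignificant" attributes is purely an assumption for illustration:

```python
# Sketch: black-box comparison of two area-tree XML documents.
# Element and attribute names are hypothetical, not the aXSL DTD.
import xml.etree.ElementTree as ET

# Attributes whose differences we treat as insignificant (assumed names).
IGNORED_ATTRS = {"generated-by", "timestamp"}

def normalize(elem):
    """Recursively reduce an element to a comparable tuple."""
    attrs = tuple(sorted(
        (k, v) for k, v in elem.attrib.items() if k not in IGNORED_ATTRS
    ))
    children = tuple(normalize(child) for child in elem)
    text = (elem.text or "").strip()
    return (elem.tag, attrs, text, children)

def area_trees_match(xml_a, xml_b):
    """True if the two area-tree documents are equivalent."""
    return normalize(ET.fromstring(xml_a)) == normalize(ET.fromstring(xml_b))

benchmark = """<area-tree timestamp="2007-10-19">
  <page width="595000" height="842000">
    <block x="56692" y="56692">Hello</block>
  </page>
</area-tree>"""

# Same layout, different timestamp: should match.
rerun = benchmark.replace("2007-10-19", "2007-10-20")
# A block shifted vertically: should not match.
shifted = benchmark.replace('y="56692"', 'y="60000"')
```

A benchmark area-tree file per test document could then be stored with the suite, and a unit test would simply assert that the current run's area tree matches the benchmark, regardless of which language the formatter is written in.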
But aXSL could publish such a DTD, create benchmark area-tree files for each of the NIST FO files, and build some tools to do the diffs and plug into unit-testing frameworks. Now, if that is of interest to you, I can get excited about spending some time to make that happen. That would be extremely valuable.

> > I suspect that I am not understanding something important about the
> > way you are using the NIST tests and the DTD, and the links you have
> > kindly provided don't enlighten me. What is the general approach to
> > your xmlroff testing? Are you parsing the two PDFs and (logically)
> > comparing the output? If so, that is very cool.
>
> My approach is that eyeballs are better than 'diff' for
> recognising significant versus insignificant differences.

It is certainly true that diff will be well-nigh worthless when comparing PDF files made by different products. But in general, the problems that I want the aXSL testing to find are not in the step that gets from the area tree to the PDF, but in the process of creating the area tree. You may be interested to know that one of the FOP developers created a scheme that compares the output of two PDF files on a pixel-by-pixel basis (IIRC), a reasonable approximation of the eyeball approach that you mention.

> I don't currently use the NIST tests because there's so many
> of them that I don't have the bandwidth to verify all the
> results. Years ago, I used to run the NIST tests in a cron
> job every night and ignore the results in my email every
> morning. There is an xmlroff ticket to cut down the NIST
> tests so we can start again with testing a subset of the 2,700+.

That is the general problem with the eyeball approach: time. Also, I don't trust my own eyes to detect everything, even if I spend the time.

> The tests that I do run are FO files that I wrote and the
> DocBook testdocs collection, since DocBook formatting is an
> important goal for xmlroff users.
> The script that runs the tests is generated from XML that
> conforms to the DTD.
>
> The PDF or PostScript output of a test run is compared (using
> 'diff') against the result from a previous run, and if there
> are differences, the pages are rasterised, and each page is
> compared against the corresponding page from the result of
> the previous run. For pages that have differences, a
> "stereo" PNG is made that has the red channel from one
> version and the blue channel from the other so the
> differences are more visible.
>
> It then becomes a judgement call as to whether the
> differences are significant or not, particularly when, if
> there are differences, the differences may have been caused
> by changes to something unrelated to the implementation of
> the FO or property that is the subject of the test.
>
> The summary of the test results is recorded in another XML
> file that also conforms to the DTD.
>
> Most of the time, the comparisons are between the results
> from two builds of xmlroff, but I used the system several
> years ago to compare xmlroff output against sample PDFs
> provided with the samples for the XSL bake-off (I forget
> exactly what it was called) at XML 2003.

Cool. Thanks for the explanation. That is more useful than I thought.

> > Nevertheless, I think your original point was suggesting that we
> > needed to take full advantage of the standard tools that you
> > mentioned. That might be
>
> My original comment was really just to point out that the DTD
> and the NIST tests already exist, though the best that I
> could say about the DTD is that it is workable, since I'm not
> going to pretend that it's in any way ideal.

Well, if you are interested in pushing forward with some things that would be useful to both xmlroff and FOray, I'll be glad to help. This is a topic of interest to me, and I have started down this path several times.
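For anyone curious about the "stereo" trick described above: given two rasterised versions of the same page, put one version's intensity in the red channel and the other's in the blue channel. Pixels where the pages agree come out with matching red and blue values, while pixels that differ show up strongly red- or blue-tinted. A minimal sketch of that channel merge, using toy grayscale grids in place of real PNG data (a real implementation would read and write image files with an imaging library):

```python
# Sketch of the "stereo" comparison image: red channel from one
# rasterised page, blue channel from the other. Toy grayscale grids
# (lists of rows of 0-255 ints) stand in for real PNG data.

def stereo_merge(page_a, page_b):
    """Combine two same-sized grayscale pages into one RGB image:
    page_a drives the red channel, page_b the blue channel."""
    if len(page_a) != len(page_b) or len(page_a[0]) != len(page_b[0]):
        raise ValueError("pages must be rasterised at the same size")
    return [
        [(a, 0, b) for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(page_a, page_b)
    ]

# Two 2x3 "pages": identical except one dark pixel has moved.
page_v1 = [[255, 0, 255],
           [255, 255, 255]]
page_v2 = [[255, 255, 255],
           [255, 0, 255]]

stereo = stereo_merge(page_v1, page_v2)
# Where the pages agree, red == blue; where they differ, the
# pixel is visibly red-heavy or blue-heavy.
```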
Knowing that it is of general interest to others would remove one of my impediments to spending more time on it.

Thanks again for your interest and comments. They have been very helpful.

Victor Mote