From: Paul V. <pa...@vi...> - 2004-08-16 16:12:03

[ took conversation here because subject concerns all docwriters ]

Hi Lester,

>> Is there any way to get DocBook to start a new page in the printed
>> versions. The heading on the bottom of a page, and text on the next
>> page looks very mickey mouse.

> That's not DocBook, but a limitation of Apache FOP, which doesn't
> honour keep-with-next directives (except in tables). Yes, it's
> annoying. Maybe we can work around it by including a PDF
> postprocessor in the build process, but this will take a lot of time
> to investigate and test - which I don't have at the moment. But I'll
> see if I can hand-edit the PDF to get them right. Will get back to
> you on this.

Hmmm, no luck :-(

It turns out that once a PDF file is produced, the text-structural
information is lost. (Every bit of text only has an absolute position
on the page.) So you can do all kinds of things: deleting entire
pages, moving them around, changing the margins, adding text and
images, deleting text, etc. etc... but you can NOT repaginate in the
sense that you insert a hard page break before a widowed header to
reunite the header with the section text. This goes for Adobe, but
also for any postprocessors out there.

So the only option (at least as long as we use Apache FOP) seems to
be: wrap every header in a single-cell table together with the text
that follows it. This requires editing our transformation stylesheets;
don't know how complicated it will be, but I'll have a look at it
(somewhere in the next weeks).

Greetings,
Paul Vinkenoog
From: Philippe M. <mak...@a6...> - 2004-08-16 18:07:47

>> So the only option (at least as long as we use Apache FOP) seems to
>> be: wrap every header in a single-cell table together with the text
>> that follows it. This requires editing our transformation stylesheets;
>> don't know how complicated it will be, but I'll have a look at it
>> (somewhere in the next weeks).

It seems that it is possible: on my Linux machine I use db2pdf for my
technical docs, and I get page breaks, so it might be possible with
FOP too?

Perhaps this will help:
<http://cscisl.dce.harvard.edu/lecture_notes/20040722/links.html>

And this:

How to insert a pagebreak into docbook output

The answer's been that you need to use a processing instruction
instead of an element (so also need to write support in your XSLT
customization layer for generating a pagebreak in the FO output for
each instance of that PI). The suggestion from Paul Reavis on the
debian-sgml list was to use <?dbfo break-before="page"?>. Something
like that would work -- there's existing (undocumented) support in the
stylesheets (dbfo-attribute template in fo/pi.xsl and dbhtml-attribute
template in html/pi.xsl) for processing PIs that start with "dbfo" and
"dbhtml" and extracting the pseudo-attributes/values from them.

As far as the rationale for the DTD not providing an element to force
pagebreaks in rendered output, it's just consistent with the fact that
the DTD by design provides markup only for modeling structure and
content, not presentation. If a pagebreak is strictly a processing
thing and has no significance to structure or content -- just
something that's specific to print delivery, not to HTML or online
help, etc. -- then not really appropriate to have an element for it.

--
Philippe Makowski
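To make the "pseudo-attributes" idea concrete, here is a minimal Python sketch of how name/value pairs could be pulled out of <?dbfo ...?> processing instructions in a DocBook source. It is only an illustration of the concept, not how the stylesheets' dbfo-attribute template is implemented; the function name dbfo_instructions and the regular expression are hypothetical, and break-before is simply the pseudo-attribute suggested above.

```python
import re
from xml.dom import minidom

# Matches name="value" or name='value' pairs inside the PI's data string.
PSEUDO_ATTR = re.compile(r'''([\w.-]+)\s*=\s*(?:"([^"]*)"|'([^']*)')''')


def dbfo_instructions(xml_text):
    """Return one dict of pseudo-attributes per <?dbfo ...?> PI, in document order.

    Hypothetical helper for illustration only.
    """
    doc = minidom.parseString(xml_text)
    found = []

    def walk(node):
        if (node.nodeType == node.PROCESSING_INSTRUCTION_NODE
                and node.target == "dbfo"):
            pairs = {}
            for match in PSEUDO_ATTR.finditer(node.data):
                # Group 2 holds a double-quoted value, group 3 a single-quoted one.
                value = match.group(2) if match.group(2) is not None else match.group(3)
                pairs[match.group(1)] = value
            found.append(pairs)
        for child in node.childNodes:
            walk(child)

    walk(doc)
    return found
```

A customization layer would still have to turn such a pair into a break in the FO output; this sketch only shows the extraction step.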
From: Paul V. <pa...@vi...> - 2004-08-17 12:09:04

Hi Philippe,

> on my linux machine, I use db2pdf for my technical doc, and I have
> page-break so it might be possible with FOP too ??

Yes, using page-break as such is no problem. The problem is: where to
insert it? Because you don't know where it's needed until the PDF is
produced and you find widowed headers. And you *only* want to place it
where it's needed or you'll get unwanted pagebreaks.

So (until FOP implements keep-with-next, or until we find a better
alternative) I guess the best solution is:

  repeat
    build the PDF;
    find first occurrence of a widowed header;
    if found, insert a pagebreak before that header;
  until no widowed headers remain.

This is not exactly elegant, but given the rate at which we publish
PDFs it's certainly feasible. After all, we only have to run this
cycle on the docs we publish; not on all the versions we build
in-between for ourselves or for comments by others.

Next question: how to insert the pagebreak?

We don't have support for a <?dbfo break...> instruction so we'd have
to implement that first; only then could we include such a PI in our
docs. This has two drawbacks:

- spending time to implement something that is essentially a hack;
- the pagebreaks will be in our XML sources; if the document is
  edited, chances are that some of them will have to be removed, but
  this could easily be forgotten. Even if not, I think it pollutes
  the source.

The better solution is therefore to insert the pagebreaks directly
into the .fo file. Open it in a text or XML editor, use the search
function to find the right spot, and change e.g.:

  <fo:block font-size="12pt" ...>Using a GUI client</fo:block>

to

  <fo:block break-before="page" font-size="12pt" ...>Using a GUI client</fo:block>

Of course it's still a hack, but now it takes less time and we keep
the sources clean.

So you run the cycle like this:

  "build pdf";
  repeat
    find first widowed header in PDF file;
    find that spot in the .fo and add break-before attribute;
    "build fo2pdf";
  until no more widows found.

BTW, the .fo is in manual/build/docs. As soon as you issue a
"build fo" or "build pdf", your manually inserted pagebreaks will be
lost!

If someone has a PDF ready to publish but finds this too complicated,
I don't mind doing it myself.

> As far as the rationale for the DTD not providing an element to
> force pagebreaks in rendered output, it's just consistent with the
> fact that the DTD by design provides markup only for modeling
> structure and content, not presentation.

That's entirely correct; pagebreaks don't belong in the DocBook DTD.

Greetings,
Paul Vinkenoog
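The "find that spot in the .fo and add break-before attribute" step can also be scripted. What follows is a minimal Python sketch under stated assumptions, not part of the build: the function name add_page_break is hypothetical, and it assumes a header can be identified by the exact text of its fo:block. Finding the widowed header in the PDF remains a manual step.

```python
import xml.etree.ElementTree as ET
from typing import Tuple

FO_NS = "http://www.w3.org/1999/XSL/Format"
BLOCK = "{%s}block" % FO_NS

# Keep the conventional "fo:" prefix when the tree is written back out.
ET.register_namespace("fo", FO_NS)


def add_page_break(fo_xml: str, heading_text: str) -> Tuple[str, int]:
    """Add break-before="page" to fo:block elements whose text is heading_text.

    Returns the modified FO text and the number of blocks changed.
    Hypothetical helper; matching on exact text is an assumption.
    """
    root = ET.fromstring(fo_xml)

    def walk(elem) -> int:
        text = "".join(elem.itertext()).strip()
        if elem.tag == BLOCK and text == heading_text:
            elem.set("break-before", "page")
            # Stop here: nested blocks with the same text would repeat the break.
            return 1
        return sum(walk(child) for child in elem)

    changed = walk(root)
    return ET.tostring(root, encoding="unicode"), changed
```

The function works on the text of the .fo file; the result would be written back before running "build fo2pdf". Every block with that exact text gets the attribute, so it is worth checking the returned count, and note that ElementTree discards XML comments when parsing.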
From: Lester C. <le...@ls...> - 2004-08-17 14:52:47

Paul Vinkenoog wrote:

> The better solution is therefore to insert the pagebreaks directly
> into the .fo file. Open it in a text or XML editor, use the search
> function to find the right spot, and change e.g.:
>
>   <fo:block font-size="12pt" ...>Using a GUI client</fo:block>
>
> to
>
>   <fo:block break-before="page" font-size="12pt" ...>Using a GUI client</fo:block>
>
> Of course it's still a hack, but now it takes less time and we keep
> the sources clean.

That will do me for the time being ;)

As for this not being part of DocBook DTD, on one hand I can
understand the reasoning, BUT a header should never be the last thing
on a page. If that rule was added, things would probably come out much
better?

--
Lester Caine
-----------------------------
L.S.Caine Electronic Services
From: Paul V. <pa...@vi...> - 2004-09-09 23:17:00

Hi Lester,

(replying to an "old" one here)

>> <fo:block break-before="page" font-size="12pt" ...>
>> Using a GUI client</fo:block>
>>
>> Of course it's still a hack, but now it takes less time and we keep
>> the sources clean.

> That will do me for the time being ;)
>
> As for this not being part of DocBook DTD, on one hand I can
> understand the reasoning, BUT a header should never be the last
> thing on a page.

That's right, but such a rule can never be in the DocBook DTD because
DocBook does not (and should not) know about pages or page length. So
it's the processing software that must take care of this.

Now, producing PDF from DocBook is a two-phase process:

- First, the Saxon processor takes the DocBook XML sources and - using
  the transformation stylesheets - outputs a Formatting Objects (FO)
  file. FO files are well-formed XML; you can view and edit them with
  an XML editor. The FO thus produced does contain keep-with-next
  attributes for headers and some other elements.

- Then, we call Apache FOP to pick up the FO and transform it into PDF
  (no stylesheets involved here). And that's where the problem lies:
  Apache FOP hasn't fully implemented keep-with-next support yet, so
  headers may wind up on the bottom of a page.

It will probably take some time before Apache FOP supports
keep-with-next. This was posted a couple of weeks ago on the fop-user
list (note: we use version 0.20.5):

  "Any patches you submit to 0.20.5 are unlikely to be committed to
  the code base. Any work you do on FOP 1.0 dev will be highly
  welcome. However, as you've already noticed the development branch
  is not yet up to 0.20.5 so we need to do a lot more work on core
  features before it is ready for "tweaking"

  The reason for this situation is that FOP 1.0 dev is not just an
  evolution of FOP 0.20.5, but was a ground up re-write. This was
  necessary because the structure of the code in 0.20.5 did not allow
  for key features such as keep-* properties."

Bottom line: we'll have to hand-tweak the FO files (add
'break-before="page"' attributes) for some time to come.

Greetings,
Paul Vinkenoog
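Since the FO file already carries keep-with-next attributes on headers, those blocks are the ones worth checking in the PDF. As a rough aid, the following Python sketch lists every fo:block that has such an attribute. The function name blocks_with_keep_with_next is hypothetical, and the sketch assumes the attributes appear either as keep-with-next or in a component form such as keep-with-next.within-column; it does not tell you which headers actually end up widowed.

```python
import xml.etree.ElementTree as ET

FO_NS = "http://www.w3.org/1999/XSL/Format"


def blocks_with_keep_with_next(fo_xml):
    """List (text, keep attributes) for each fo:block carrying keep-with-next.

    Hypothetical helper: these are the blocks FOP 0.20.5 may separate from
    the text that follows, so they are candidates for a manual break.
    """
    root = ET.fromstring(fo_xml)
    hits = []
    for block in root.iter("{%s}block" % FO_NS):
        keeps = {name: value for name, value in block.attrib.items()
                 if name.startswith("keep-with-next")}
        if keeps:
            # Collapse whitespace so multi-line headers print on one line.
            text = " ".join("".join(block.itertext()).split())
            hits.append((text, keeps))
    return hits
```

The resulting list is just a checklist of header texts to look for at page bottoms; the break-before edit itself still has to be made in the .fo file.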
From: Lester C. <le...@ls...> - 2004-08-16 16:51:19

Paul Vinkenoog wrote:

> So the only option (at least as long as we use Apache FOP) seems to
> be: wrap every header in a single-cell table together with the text
> that follows it. This requires editing our transformation stylesheets;
> don't know how complicated it will be, but I'll have a look at it
> (somewhere in the next weeks).

Seems to be a bit of a major bobo ;)
It is the only thing that two of my customers both picked up and
commented on. Just a simple 'page break' would solve the problem, and
hopefully something that will come. Just need to find just where to
post a bug report :)

I'll pull the file set together to a single 'book' with 'chapters' for
each OS and probably a separate book for all the Eclipse stuff. That
does not need to be part of the basic stuff, but it will be nice to
have it all documented for my own reference.

--
Lester Caine
-----------------------------
L.S.Caine Electronic Services
From: Paul V. <pa...@vi...> - 2004-08-17 12:35:40

Hi Lester,

>> So the only option (at least as long as we use Apache FOP) seems to
>> be: wrap every header in a single-cell table together with the text
>> that follows it. This requires editing our transformation
>> stylesheets; don't know how complicated it will be, but I'll have a
>> look at it (somewhere in the next weeks).

Hm, this approach has its drawbacks: it could lead to lots of
*unwanted* pagebreaks, namely if header + all of the following text
don't fit on the remainder of the page. But I've thought of something
else (see next post).

> Seems to be a bit of a major bobo ;)
> It is the only thing that two of my customers both picked up and
> commented on. Just a simple 'page break' would solve the problem,
> and hopefully something that will come. Just need to find just where
> to post a bug report :)

It's not a bug - it's an unimplemented feature ;-)

The people at Apache FOP are aware of this and full keep-with-next
support will be implemented... one day. See the Standard Compliance
page at

  http://xml.apache.org/fop/compliance.html

more specifically:

  http://xml.apache.org/fop/compliance.html#fo-property-keepsbreaks

> I'll pull the file set together to a single 'book' with 'chapters'
> for each OS and probably a separate book for all the Eclipse
> stuff. That does not need to be part of the basic stuff, but it will
> be nice to have it all documented for my own reference.

Yes, the Eclipse stuff - but also the logo stuff - shouldn't be in
there. But the Eclipse info is very useful; could you make it an
<article> so we can add it to the "Documentation for Firebird
docwriters" book? That would be its rightful place.

As for the logo stuff - is it ready to be published yet? I thought
there needed to be some official OK first (from admin group and/or
FFmembers).

Greetings,
Paul Vinkenoog