From: SourceForge.net <no...@so...> - 2013-01-15 09:14:57
Bugs item #3600058, was opened at 2013-01-09 01:20
Message generated for change (Comment added) made by dkf
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=110894&aid=3600058&group_id=10894

Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update.

Category: 55. Other Tools
Group: current: 8.6.0
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Twylite (twylite)
>Assigned to: Donal K. Fellows (dkf)
Summary: Doctools nroff/groff output not supported by tcltk-man2html

Initial Comment:
Tcl's tools/tcltk-man2html is used to generate HTML documentation from nroff/groff sources on multiple platforms. It supports a subset of nroff that is used by the man pages in the Tcl core and bundled packages.

Doctools, as used for documentation in Tcllib, produces nroff output that is not supported by tcltk-man2html. This means that the Tcl core utilities for HTML doc generation cannot also generate Tcllib documentation in the same style.

It would be desirable to extend tcltk-man2html to support doctools output. There are a number of issues, and complete support will entail changes to tcltk-man2html and fixes to doctools.

PATCH

For all issues below with proposed fixes, the fix has been implemented and can be found on the branch bug-(thisbug)-td.

BACKGROUND

Tcllib nroff documentation is generated using 'tclsh sak.tcl doc nroff' which places the .n files in doc/nroff. The files are then moved to be directly under doc/ so that they can be found by tcltk-man2html, which is invoked as 'tclsh tcltk-man2html.tcl --tcl --pkgdir=..\..\ --verbose=1'

The tcltk-man2html source was modified to provide more detailed output in some cases.

ISSUE #1: Section name with no description crashes
-----
scanning page C:/User/Tcl_BUILD/tcl-8.6.0/tcllib/doc/pkg/doc/coro_auto.n
coro_auto: NAME: output-name: bad section name: coroutine::auto -
can't read "head": no such variable
can't read "head": no such variable
    while executing
"man-puts "$head — $tail""
    (procedure "output-name" line 8)
-----
Proposed fix: Adjust the regex in output-name to not require a space after the dash (consume 1 optional whitespace).

ISSUE #2: Doctools escapes single quote at start of line
-----
scanning page C:/User/Tcl_BUILD/tcl-8.6.0/tcllib/doc/pkg/doc/csv.n
csv: process-text: uncaught backslash:
IN {Takes a \fImatrix\fR object following the API specified for the struct::matrix package and returns a string in CSV format containing these values. The separator character can be defined by the caller, but this is optional. The default is ",". The quoting character can be defined by the caller, but this is optional. The default is \'"'. Each row of the matrix is considered a record, these are separated by newlines in the result. The elements of each record are formatted as usual (via \fB::csv::join\fR).}
OUT {Takes a <I>matrix</I> object following the API specified for the struct::matrix package and returns a string in CSV format containing these values. The separator character can be defined by the caller, but this is optional. The default is ",". The quoting character can be defined by the caller, but this is optional. The default is \'"'. Each row of the matrix is considered a record, these are separated by newlines in the result. The elements of each record are formatted as usual (via \fB::csv::join\fR).}
-----
csv.man contains:
-----
Takes a list of values and returns a string in CSV format containing these values. The separator character can be defined by the caller, but this is optional. The default is ",". The quoting character can be defined by the caller, but this is optional. The default is '"'.
-----
csv.n contains:
-----
Takes a \fImatrix\fR object following the API specified for the struct::matrix package and returns a string in CSV format containing these values. The separator character can be defined by the caller, but this is optional. The default is ",". The quoting character can be defined by the caller, but this is optional. The default is \'"'. Each row of the matrix is considered a record, these are separated by newlines in the result. The elements of each record are formatted as usual (via \fB::csv::join\fR).
-----
Proposed fix: add pair ( {\'} "'" ) to charmap in process-text.

ISSUE #3: Doctools may generate redundant font changes, which are ignored but result in an unhandled backslash error
-----
scanning page C:/User/Tcl_BUILD/tcl-8.6.0/tcllib/doc/pkg/doc/docidx.n
docidx: process-text: impotent font change: If not, the list of per-object search paths is searched. For each directory in the list the package checks if that directory contains a file "\fIidx.\fIfoo\fR\fR". If yes, then that file is taken as the implementation.
docidx: process-text: uncaught backslash:
IN {If not, the list of per-object search paths is searched. For each directory in the list the package checks if that directory contains a file "\fIidx.\fIfoo\fR\fR". If yes, then that file is taken as the implementation.}
OUT {If not, the list of per-object search paths is searched. For each directory in the list the package checks if that directory contains a file "<I>idx.foo</I>\fR". If yes, then that file is taken as the implementation.}
-----
Proposed fix: doctools is arguably generating a redundant \fR, but we shouldn't be crashing out because of it. The font handling logic could be rewritten to give every span of text terminated by a \\fx or \\f(xy its own open and close fonts, so we can generically handle nested and redundant cases.

ISSUE #4: oops: Copyright (c)
There are a number of these due to the variety of copyright statement styles in Tcllib.
Proposed fix: Support common styles "Copyright (c) YEAR, ..." and "Copyright (c) YEAR,YEAR,YEAR,... ...", and possibly "YEAR-YEAR,YEAR,...". Less common styles should be handled by warning (as currently) and fixing the source in Tcllib.

ISSUE #5: Crash from treating string as list
-----
scanning page C:/User/Tcl_BUILD/tcl-8.6.0/tcllib/doc/pkg/doc/snitfaq.n
list element in quotes followed by "INFO"" instead of space
list element in quotes followed by "INFO"" instead of space
    while executing
"llength $rest"
    (procedure "make-manpage-section" line 100)
    invoked from within
"make-manpage-section $html $arg"
    (procedure "make-man-pages" line 35)
    invoked from within
"make-man-pages $webdir [list $tcltkdir/{$appdir}/doc/*.1 "$tcltkdesc Applications" UserCmd "The interpreters which implement $cmd esc."] [plus-base ..."
    ("try" body line 94)
-----
Proposed fix: In make-manpage-section the .SS case does a check {[llength $rest] == 0}, but rest is a string (not necessarily a valid list). Check should be {$rest eq {}}.

ISSUE #6: unrecognised format directive in .CS block
-----
scanning page C:/User/Tcl_BUILD/tcl-8.6.0/tcllib/doc/pkg/doc/docidx_lang_intro.n
docidx_lang_intro: make-manpage-section: unrecognized format directive: ...
docidx_lang_intro: make-manpage-section: unrecognized format directive: ...
docidx_lang_intro: ADVANCED STRUCTURE: output-directive: unexpected .CS format:
[<B>include FILE</B>] [<B>vset VAR VALUE</B>] [index_begin GROUPTITLE TITLE] [index_end]
docidx_lang_intro: ADVANCED STRUCTURE: output-directive: unexpected .CE
-----
docidx_lang_intro.n contains:
-----
.CS
[\fBinclude FILE\fR]
[\fBvset VAR VALUE\fR]
[index_begin GROUPTITLE TITLE]
...
[index_end]
.CE
-----
Proposed fix: Various other pages have example lines starting with a period (some are widget names, e.g. .text in snitfaq.n, .plot in statistics.n). This is (to my knowledge) invalid output generated by doctools, and should be fixed in doctools.
As a workaround the handling of an unrecognised directive should be adjusted to treat the line as text if in a .CS block (a common case). Outside a .CS block the current behaviour (ignore line) should be maintained. The reason for the workaround in the .CS case is that an unrecognised directive in the middle of text will cause the line buffer to contain ".CS text text .CE" instead of ".CS text .CE" which causes the .CS/.CE output processing to fail.

ISSUE #7: Empty .CS block is unsupported
While processing pki.n:
-----
pki: EXAMPLES: output-directive: unexpected .CS format:
.CE .CS
pki: EXAMPLES: output-directive: unexpected .CE
-----
pki.n contains:
-----
.SH EXAMPLES
.CS
.CE
.CS
.CE
-----
Proposed fix: This occurs because the empty lines cause the line buffer to contain ".CS .CE" instead of ".CS text .CE", which causes the .CS/.CE output processing to fail. Permit empty code section by having the .CE handler force the line buffer to flush, even if it is empty.

ISSUE #8: Unsupported .TP \fB...\fR
-----
scanning page C:/User/Tcl_BUILD/tcl-8.6.0/tcllib/doc/pkg/doc/struct_list.n
struct_list: make-manpage-section: ignoring .TP after .TP
struct_list: make-manpage-section: ignoring .TP after .TP
-----
struct_list.n contains:
-----
.TP
\fB...\fR
.TP
\fBi\fR
Application of the command to the result of the last call and the \fBi\fR'th element of the list.
.TP
\fB...\fR
.TP
\fBend\fR
Application of the command to the result of the last call and the last element of the list. The result of this call is returned as the result of the subcommand.
-----
Proposed fix: None. This appears to be invalid markup generated by doctools, and should be fixed there.

ISSUE #9: Spaces in package names
-----
scanning page C:/User/Tcl_BUILD/tcl-8.6.0/tcllib/doc/pkg/doc/graph1.n
graph1: NAME: output-name: name has a space: {struct::graph v1}
from: struct::graph v1 - Create and manipulate directed graph objects
...
scanning page C:/User/Tcl_BUILD/tcl-8.6.0/tcllib/doc/pkg/doc/matrix1.n
matrix1: NAME: output-name: name has a space: {struct::matrix v1}
from: struct::matrix v1 - Create and manipulate matrix objects
...
scanning page C:/User/Tcl_BUILD/tcl-8.6.0/tcllib/doc/pkg/doc/struct_tree1.n
struct_tree1: NAME: output-name: name has a space: {struct::tree v1}
from: struct::tree v1 - Create and manipulate tree objects
-----
Proposed fix: Unknown.

----------------------------------------------------------------------

>Comment By: Donal K. Fellows (dkf)
Date: 2013-01-15 01:14

Message:
Going through these issues:

#1: Strictly not a bug in Tcl, as the line after .SH NAME *must* consist of “blah \- blah blah” (with some scope for flexibility in the “blah”s) or the external tool mkdirhier won't work. This isn't our restriction! OTOH, there's no reason for us to fail to accept it in our own conversion code.

#2: Reasonable.

#3: Still trying to wrap my head around the change here, but if we're moving to a state-maintaining approach for conversions within a paragraph, we can move to supporting \fP properly (it should mean “previous font” and not “roman font” so the current hack higher up is wrong).

#4: The important thing is to get the first and especially last year. It might be worth using a two-stage match for this rather than trying to do everything in the one RE.

#5: Reasonable, but recommend using [string trim $rest]eq{} just in case.

#6: This is a doctools bug, but if we can still make this a warning it would be a help.

#7: Why are there empty examples? That would seem to me to be the real problem here. (Again, happy to support if we can spit out a warning.)

#8: That's just wrong. Doctools bug. The argument to .TP (as opposed to the following line) should be the indent level to use. Either generate .IP (with "quoted" leading material) or do .TP right.

#9: That's also incorrect output by doctools; NAME sections have very restricted formats (not because of anything done by Tcl, but because of the ways manpages are processed by other parts of the Unix ecosystem) and the version number doesn't belong there.
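
For reference, the two-stage match suggested under #4 could be sketched roughly as follows. This is only an illustration, not code from tcltk-man2html: the proc name copyright-span and the sample copyright lines are made up, and it assumes the years directly follow "Copyright (c)". Stage one isolates the run of years and separators; stage two picks the four-digit years out of that run and keeps the first and last.

```tcl
# Hypothetical helper for illustration; not taken from tcltk-man2html.
proc copyright-span {line} {
    # Stage 1: capture the run of digits, commas, dashes and spaces
    # that follows "Copyright (c)".
    if {![regexp {Copyright \(c\) ([-0-9, ]+)} $line -> yearText]} {
        return {}
    }
    # Stage 2: extract every four-digit year from that run.
    set years [regexp -all -inline {[0-9]{4}} $yearText]
    if {[llength $years] == 0} {
        return {}
    }
    # Report the first and last year as listed in the statement.
    return [list [lindex $years 0] [lindex $years end]]
}

# Made-up sample lines covering the three styles named in ISSUE #4.
puts [copyright-span {Copyright (c) 2003, Some Author}]
puts [copyright-span {Copyright (c) 2003,2005,2009 Some Author}]
puts [copyright-span {Copyright (c) 2003-2005,2009 Some Author}]
```

Lines that do not match at stage one return an empty result, which leaves room for the existing warning path to handle the less common styles.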
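
To illustrate the point behind #5: a list command such as llength has to parse its argument as a list, and free text with unbalanced quoting does not parse, whereas a string comparison never parses at all. The snippet below is a minimal sketch under assumed input; the sample text and the proc name is-blank are hypothetical and are not taken from snitfaq.n or tcltk-man2html.

```tcl
# Hypothetical input: free text with unbalanced quoting is not a valid Tcl list.
set rest {SNIT "INFO"" METHODS}

# llength must parse the string as a list first, and that parse fails here.
set failed [catch {llength $rest} err]
puts "llength failed: $failed"

# A string comparison never parses the text as a list, so it is safe on
# arbitrary input. Trimming first also treats whitespace-only text as empty.
proc is-blank {text} {
    expr {[string trim $text] eq {}}
}

puts "blank? [is-blank $rest]"
puts "blank? [is-blank {   }]"
```

The trimmed comparison is slightly more forgiving than {$rest eq {}}, since it also accepts an argument that consists only of whitespace.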