From: Jarret "Jax" R. <jar...@pr...> - 2023-10-29 07:11:07
Dear docutils developers,

I'm sending you a draft on how to handle literal tab characters inside
reStructuredText documents.

The Problem
===========

Let's start with two examples. From here on I'll denote the tab character
with its escape sequence ``\t`` (in case it gets replaced by spaces).

1. Tabs in Makefile listings/literal blocks
-------------------------------------------

Currently it is impossible for the *output* of a literal block to contain
tabs. This breaks some listings, e.g. a Makefile:

>This is a Makefile::
>
>   all:
>   \techo Hello

This gets rendered as something like

>This is a Makefile:
><pre>
>all:
>     echo Hello
></pre>

which is not a valid Makefile, as the rules require a tab and not 5 spaces.
This behaviour is also contrary to my expectation that a literal block
leaves its content untouched.

2. CSV with tabs
----------------

A common delimiter for a CSV table (or should I say a CTV table?) is the
tab character. Currently, the tab character is only supported for included
CSV tables but not for inline ones. The following does not work:

>.. csv-table:: good delimiter
>   :delim: tab
>
>   some\tcsv\tdata

The problem is that all tab characters get replaced *prior* to parsing,
namely in string2lines().

An attempt at a solution
========================

In an attempt to fix this, I toyed around with not replacing tab characters
before parsing but handling them during parsing. The parts concerned with
indentation parsing are the methods trim_left() and get_indented() of
StringList. Adjusting them lets things "just work(tm)".

This is a draft, so some details and corner cases must still be sorted out.
The above-mentioned functions need the `tab_width` (currently at the
default value of 8). Since I don't have deep insight into the architecture
of the parser, what would be the best way to pass that information to them?
Obviously we should somehow use the docutils.conf option tab_width.
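
To make the idea concrete, here is a minimal, standalone sketch of the two
approaches. It does not use or reproduce the docutils code: the names
``expand_before_parsing``, ``indent_width`` and ``trim_left`` are
hypothetical stand-ins, and turning a tab that straddles the cut into
spaces is only one possible choice for that corner case.

```python
import csv

TAB_WIDTH = 8  # the default tab_width mentioned above


def expand_before_parsing(text, tab_width=TAB_WIDTH):
    """Hypothetical stand-in for up-front expansion: no tab survives."""
    return [line.expandtabs(tab_width).rstrip() for line in text.splitlines()]


def indent_width(line, tab_width=TAB_WIDTH):
    """Return (columns, characters) covered by the leading whitespace."""
    col = 0
    for i, ch in enumerate(line):
        if ch == " ":
            col += 1
        elif ch == "\t":
            col += tab_width - col % tab_width  # jump to the next tab stop
        else:
            return col, i
    return col, len(line)


def trim_left(line, length, tab_width=TAB_WIDTH):
    """Remove `length` columns of indentation; later tabs stay untouched.

    Assumption: a tab that straddles the cut is replaced by the spaces
    that remain to the right of the cut.
    """
    col = 0
    for i, ch in enumerate(line):
        if col >= length:
            return " " * (col - length) + line[i:]
        if ch == " ":
            col += 1
        elif ch == "\t":
            col += tab_width - col % tab_width
        else:
            return line[i:]  # less indentation than requested
    return " " * max(col - length, 0)  # whitespace-only line


makefile = "   all:\n   \techo Hello"

# Expanding first and then removing the 3-column indent leaves spaces only.
expanded = [line[3:] for line in expand_before_parsing(makefile)]
# Measuring the indent in columns instead keeps the tab of the recipe line.
preserved = [trim_left(line, 3) for line in makefile.splitlines()]

# The same expansion turns a tab-delimited row into a single field.
row_raw = next(csv.reader(["some\tcsv\tdata"], delimiter="\t"))
row_expanded = next(
    csv.reader(expand_before_parsing("some\tcsv\tdata"), delimiter="\t")
)
```

With the Makefile example, ``expanded`` ends up with five spaces in front of
``echo Hello``, while ``preserved`` still starts that line with ``\t``.
Likewise ``row_raw`` has three fields and ``row_expanded`` only one.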

Here is some rambling about what I (don't) know:

* The StringList methods are called by their StateMachineWS counterparts,
  which also do not have access to that information. However, we could
  make it a property of StateMachineWS.
* The StateMachineWS subclasses RSTStateMachine and NestedStateMachine do
  have access to the document and hence to the tab_width setting.
* We could also put the information into the state. In case we ever want
  to change the tab_width during parsing, e.g. inside a directive, this
  might be the better place? However, I don't know how long-lived/isolated
  these states are; changing the tab_width inside a directive should
  affect neither the next directive nor the rest of the document.

Regards
Jax