On Debian Sid, it appears that 'maxima-index.lisp' encodes wrong lengths.
We can check it in a maxima session:
entering '? ?' gives a paragraph that ends in the middle of a sentence.
This can be observed for other description requests.
I found this issue by entering '? build_info' in wxMaxima:
I got an endless loop instead of a description (it appeared
that the cut was around a special character).
Actually, for these two examples at least, the lengths in the 'maxima-index.lisp'
distributed in Debian and the ones distributed in the upstream source are the same.
So I am afraid that this is not a Debian-specific issue.
This issue is reported in Debian bugreport #1131495.
Here is a replacement for read-info-text in cl-info.lisp that appears to work for me with gcl. "? build_info" isn't truncated.
This needs more work, but seems ok. Of course, this assumes the info files are utf-8 encoded. We need to make sure all the info files are utf-8 encoded.
@rtoy Thanks for working on it, but I dunno, reworking read-info-text looks like a bridge too far for me. If there is some desire to make Unicode stuff work for GCL, maybe you can invest the same time and energy in GCL. Or maybe not, as you wish, either way is A-OK by me.
@rtoy @jgmbenoit @villate The Interwebs claim that makeinfo has a command line flag --no-utf-8. If someone wants to package Maxima with GCL, that flag could be enabled. The examples in the Texinfo documentation are already using ASCII art pretty printing (i.e., using the old ASCII art pretty printer instead of the current Unicode-enabled pretty printer). There might be a few Unicode characters here and there, but not many, I believe.
This is, of course, ignoring languages which use Unicode characters for diacritics. That's just not going to work for GCL. If someone feels strongly about it, maybe they can work on GCL. I agree it's undesirable but under the circumstances I don't believe it is the Maxima project's responsibility to fix it.
Are you going to leave gcl broken then? FWIW, I have offered to help Camm with unicode, but he wants strings to be utf-8 and I don't see how that can possibly work when you can do arbitrary writes to anywhere in the string. Maybe I lack imagination.
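To make the concern about arbitrary writes concrete, here is a small Python sketch. It models a string stored as UTF-8 bytes; this is only a hypothetical illustration of variable-width storage, not GCL's actual or planned design, and the helper names are made up for the example.

```python
# Illustration only: a "string" held as UTF-8 bytes (hypothetical model,
# not GCL's implementation).
buf = bytearray("caf? au lait".encode("utf-8"))

def char_offset_to_byte_offset(data, index):
    """Walk the buffer to find the byte where character `index` starts.

    This is O(n) because characters have different widths.
    Assumes the buffer holds well-formed UTF-8.
    """
    pos = 0
    for _ in range(index):
        lead = data[pos]
        if lead < 0x80:
            pos += 1          # 1-byte (ASCII) character
        elif lead >> 5 == 0b110:
            pos += 2          # 2-byte sequence
        elif lead >> 4 == 0b1110:
            pos += 3          # 3-byte sequence
        else:
            pos += 4          # 4-byte sequence
    return pos

def set_char(data, index, ch):
    """Overwrite character `index`; the buffer may grow or shrink."""
    start = char_offset_to_byte_offset(data, index)
    end = char_offset_to_byte_offset(data, index + 1)
    data[start:end] = ch.encode("utf-8")

before = len(buf)
set_char(buf, 3, "é")         # replace the 1-byte '?' with a 2-byte character
after = len(buf)
```

Writing a wider character shifts every later byte, and finding the write position needs a scan, which is the cost a fixed-width representation avoids.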
About what Maxima should do for GCL, it makes sense to me to enable --no-utf-8 for makeinfo (and also inspect the Texinfo files to replace any stray Unicode characters) so that GCL can handle the resulting output.
About GCL's Unicode support, my advice is just go ahead and try to do it the way Camm wants to. On the face of it, it seems like a fixed width will be simpler to work with, but maybe it will work out, and you can always back up and try again if it doesn't.
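Inspecting the Texinfo or info files for stray Unicode characters could be done with something like the following Python sketch. The function name and the sample text are made up for illustration; it only reports positions and does not attempt any replacement.

```python
def find_non_ascii(text):
    """Return (line_number, column, character) for each non-ASCII character."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ord(ch) > 127:
                hits.append((lineno, col, ch))
    return hits

# Made-up sample with curly quotes of the kind newer texinfo may emit.
sample = "Carlson's RJ integral\nsee \u2018build_info\u2019 for details\n"
for lineno, col, ch in find_non_ascii(sample):
    print(f"line {lineno}, column {col}: U+{ord(ch):04X}")
```

For real files, the text would be read with an explicit encoding, e.g. `open(path, encoding="utf-8").read()`, where `path` is whichever file is being checked.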
I'm ok with this as long as it's the default. I guess all the translations can use latin1 encoding. Well, maybe not for the Japanese translation, but I don't know if that builds anymore. I don't normally build any of the translations.
If gcl uses utf-8 strings, I'm not likely to help. It seems overly complicated and prone to bugs. Or maybe I'm just not clever enough to know how to do this efficiently and transparently.
Anyway, that's a gcl problem, not a maxima problem (yet?).
I play with the latest unstable packages of gcl27 and maxima. The current texinfo version in unstable is 7.3-2.
build_index.pl (and update_examples) are anachronisms.
Regardless of how we fix this bug report, I think the best way forward
would be to re-write this perl script in lisp and generate the index
file at run-time, not build-time.
BTW, it is an embarrassment that we use perl and not lisp to do this
simple text stuff. There may have been a time when that made sense, but
that is long gone.
I think many years ago Rupert Swarbrick (?) started on a lisp replacement for build_index.pl. Never finished. I'm not sure how that would work if a Lisp doesn't have unicode support, but it would probably be easier today since we have pregexp support.
I have thought about using m4 to do update_examples, so that every time the manual is generated, the examples are too. But I'm not sure if that's a good idea because it would probably really slow down generation of the docs. Plus, someone would have to check the manual to confirm that all the examples were converted correctly.
Also, people complained that my changes to generate grad results at startup took too long (a second or two extra?). Generating the index at start-up would probably take even longer.
But maybe we can get AI to convert build_index.pl and update_examples to Lisp? I would be ok with that.
Whether or not the index generator is Lisp or Perl is beside the point for the purpose of resolving the bug report -- the problem is that GCL doesn't understand multi-byte characters. @l_butler, with all due respect, can you please open a separate ticket to pursue the reimplementation of build_index.pl, should you choose to take up that topic.
One possibility is to generate a maxima-index.lisp data file per Lisp implementation beside the utf-8 one. For example, maxima-index-gcl.lisp can be generated for the GCL implementation.
Generating maxima-index-gcl.lisp, without utf-8 characters, would not solve the problem. As Raymond has explained, texinfo introduces utf-8 characters that were not present in the Maxima documents sources. I did get correct info files with GCL using an older version of texinfo that didn't add extra utf-8 characters.
But generating maxima-index-gcl.lisp with lengths as counted by gcl would solve the issue if the info file contains non-ASCII characters.
I agree with Robert, and I'm still puzzled by the fact that Maxima 5.49 + GCL works fine for me but fails for Raymond and Jerome.
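The idea of storing lengths "as counted by gcl" can be illustrated with a short Python sketch. It assumes (this is an assumption, not something verified against build_index.pl) that the index stores lengths counted in characters, while a Lisp without multi-byte support treats every byte as one character. The entry text is a stand-in modelled on the build_info example.

```python
# Stand-in entry; the curly quotes mimic what newer texinfo emits for @code.
entry = "\u2018maxima_frontend_version\u2019 accordingly."
data = entry.encode("utf-8")

char_len = len(entry)   # length counted in characters
byte_len = len(data)    # length seen by a reader where one byte = one character

# A byte-per-character reader that trusts the character count stops early.
truncated = data[:char_len].decode("utf-8", errors="replace")

# The same reader, given the byte count instead, gets the whole entry.
complete = data[:byte_len].decode("utf-8")

print(char_len, byte_len)
print(repr(truncated))
```

Each curly quote takes three bytes in UTF-8, so the byte count exceeds the character count and the character-based length cuts the text short, which matches the truncation symptom described in the report.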
Figured it out. The difference is texinfo. IIRC, you use 6.8. I was using 7.3. With 6.8, "? build_info" is not truncated. The last line is "'maxima_frontend_version' accordingly.", which is correct. Presumably, somewhere between 6.8 and 7.3, texinfo switched to using the left backquote character (non-ASCII) for @code in info files.
Other non-ASCII characters are around, as in the description page of carlson_rj. In the current Sid, echo '? carlson_rj' | maxima gives
Maxima 5.49.0 https://maxima.sourceforge.io
using Lisp GNU Common Lisp (GCL) GCL 2.7.1 git tag Version_2_7_2pre13
Distributed under the GNU Public License. See the file COPYING.
Dedicated to the memory of William Schelter.
The function bug_report() provides bug reporting information.
(%i1)
-- Function: carlson_rj (<x>, <y>, <z>, <p>)
Carlson's RJ integral is defined by
(%o1) true
In this case I think it is a bug of the m4 macros that Raymond introduced to parse mathematical equations. The original expression in the manual source was an ASCII only equation. I think the m4 macro should use only ASCII for the 2d representation of the equation in the info manual.
Note that the info files also contain author names and paper titles with UTF-8 characters.
Ah. I think I regenerated the examples using update_examples and it used unicode characters for the integral signs. We need to modify update_examples not to use unicode. Or just bite the bullet and add a utf-8 decoder for gcl so we can read the file.
I certainly prefer this approach because it's localized to just fixing read-info-text for gcl instead of forcing somewhat arbitrary conditions on what the user manual can use. No one is going to remember and we'll end up debugging this again, and again, and again.
@jgmbenoit I notice that Debian bugreport #1131495 is marked "closed". From your point of view, is there any further action needed on the part of the Maxima project?
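For reference, the utf-8 decoder approach mentioned above for read-info-text might look roughly like this Python sketch. It is not the Lisp code posted in this thread. It assumes the index gives a byte offset and a length in characters, and the function name and sample text are hypothetical.

```python
import io

def read_utf8_chars(stream, byte_offset, n_chars):
    """Read n_chars UTF-8 characters from a binary stream, starting at byte_offset.

    Assumes well-formed UTF-8 and an offset that lands on a character boundary.
    """
    stream.seek(byte_offset)
    out = bytearray()
    for _ in range(n_chars):
        first = stream.read(1)
        if not first:
            break                      # end of file reached early
        lead = first[0]
        if lead < 0x80:
            extra = 0                  # ASCII
        elif lead >> 5 == 0b110:
            extra = 1                  # 2-byte sequence
        elif lead >> 4 == 0b1110:
            extra = 2                  # 3-byte sequence
        else:
            extra = 3                  # 4-byte sequence
        out += first + stream.read(extra)
    return out.decode("utf-8")

# Made-up info-like text held in memory instead of a real info file.
text = "intro\n\u2018build_info\u2019 returns a structure.\n"
stream = io.BytesIO(text.encode("utf-8"))
start = len("intro\n".encode("utf-8"))
print(read_utf8_chars(stream, start, 12))
```

The point of the design is that the character count from the index stays valid; only the reader changes, by consuming as many bytes as each character needs.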