Originally created by: *anonymous
Originally created by: pkx1...@gmail.com
Originally owned by: truer...@gmail.com
Add support for unicode filenames on Windows
On Windows, LilyPond can't handle
unicode filenames.
The patch replaces main(), and
hooks filename related functions.
This converts between UTF-16
unicode (Windows) and UTF-8 unicode
(LilyPond, libguile etc.).
LilyPond can handle unicode filenames
for *.ly, *.mid, *.ps.
*.pdf is not supported yet.
Ghostscript-8.70 that is included
in the binary distribution of LilyPond
can't handle unicode filenames.
Supporting *.pdf requires Ghostscript-9.10 or later.
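For illustration only, and not taken from the patch itself: the UTF-8 to UTF-16 direction of the conversion described above could be sketched in portable C++ roughly as follows. The function name `utf8_to_utf16` is hypothetical. On Windows the resulting string would be handed to a wide-character function such as _wopen().

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical helper (not from the patch): decode UTF-8 and re-encode as UTF-16.
// Throws std::invalid_argument on malformed input.
std::u16string utf8_to_utf16(const std::string &in)
{
  // Smallest code point that legitimately needs 0, 1, 2 or 3 continuation bytes.
  static const std::uint32_t min_cp[] = {0x0, 0x80, 0x800, 0x10000};
  std::u16string out;
  std::size_t i = 0;
  while (i < in.size())
    {
      const unsigned char lead = static_cast<unsigned char>(in[i]);
      std::uint32_t cp = 0;
      std::size_t extra = 0;
      if (lead < 0x80) { cp = lead; extra = 0; }
      else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
      else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
      else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
      else throw std::invalid_argument("bad UTF-8 lead byte");

      if (extra > in.size() - i - 1)
        throw std::invalid_argument("truncated UTF-8 sequence");
      for (std::size_t k = 1; k <= extra; ++k)
        {
          const unsigned char c = static_cast<unsigned char>(in[i + k]);
          if ((c & 0xC0) != 0x80)
            throw std::invalid_argument("bad UTF-8 continuation byte");
          cp = (cp << 6) | (c & 0x3F);
        }
      i += extra + 1;

      // Reject overlong forms, surrogates and values beyond U+10FFFF.
      if (cp < min_cp[extra] || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        throw std::invalid_argument("invalid code point");

      if (cp < 0x10000)
        out.push_back(static_cast<char16_t>(cp));
      else
        {
          // Code points above the BMP become a surrogate pair.
          cp -= 0x10000;
          out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
          out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
    }
  return out;
}
```

On Windows itself, the Win32 function MultiByteToWideChar() with CP_UTF8 does the same job; the hand-written version is shown only so that the steps are visible.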
Originally posted by: pkx1...@gmail.com
(No comment was entered for this change.)
Owner: pkx1...@gmail.com
Originally posted by: pkx1...@gmail.com
Patchy the autobot says: passes tests. includes a full make doc
Labels: -Patch-new Patch-review
Originally posted by: dak@gnu.org
Ok, I've been stalling on this patch for about a week because I don't have a good idea how to address it properly. So let's start with the worst first: at the current point of time, I don't see it making sense, and in addition I will not be able to spend significant amount of time addressing the problems arising in connection with it for about a month, so I'd like a moratorium on it for about that time.
The main problem is that the next step needed for GUILEv2 migration (which nobody is really interested in working on apart from myself) is to get the coding mess sorted out. GUILEv1 is basically working with byte streams in its strings and string ports and files. GUILEv2 is coding system aware, uses either Latin-1 or UTF-32 as its string internals, uses UTF-8 for string ports (necessitating copies and conversions rather than being able to work in-place) and converts back and forth all the time.
This matters because LilyPond has a lexer working on a UTF-8 coded byte stream, data is liberally bounced around between files, strings, and string ports, and the respective position pointers are not distinguished.
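To illustrate why byte positions and character positions cannot be used interchangeably, here is a minimal sketch in C++. It is not LilyPond or GUILE code, and the function name is hypothetical. It only shows that an offset into a UTF-8 byte stream and an index into the decoded characters diverge as soon as a non-ASCII character appears.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: count the Unicode code points that start within the
// first `byte_offset` bytes of a UTF-8 string. Continuation bytes have the
// bit pattern 10xxxxxx and do not start a new character.
std::size_t char_index_from_byte_offset(const std::string &utf8,
                                        std::size_t byte_offset)
{
  std::size_t chars = 0;
  for (std::size_t i = 0; i < byte_offset && i < utf8.size(); ++i)
    if ((static_cast<unsigned char>(utf8[i]) & 0xC0) != 0x80)
      ++chars;
  return chars;
}
```

For pure ASCII input both numbers agree, which is why mixing them up can go unnoticed until accented or non-Latin text is processed.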
It does not help that GUILEv2 changes its ways to pick particular encodings basically every 3 or 4 "stable" versions in the 2.0 series, and 2.1 is different again. So we need to carefully refactor the code in order to have a chance of it working in both GUILEv1 and GUILEv2 (and at all in GUILEv2).
This is an ugly and ungrateful task that has been at the front of my this-is-really-up-next-so-let's-procrastinate-instead queue the last month or so. Letting a patch like this in right now would make the situation worse.
Viewed realistically, this patch also needs Ghostscript updates in GUB to make any sense. There is no reason _those_ cannot go ahead first: having a buffer of a few developer releases where we might discover and sort out possible problems with newer versions of GhostScript on our crosscompiled distributions seems sensible.
Personally, I don't really like the look of this patch at all as it is very specifically useful only for one platform, and apart from that I'd very much want to avoid seeing any wide character functions in LilyPond code: that's rarely exercised library code of often dubious reliability or availability, and most programmers are not overly comfortable utilizing it. I'd very much prefer that we'd find a better encapsulated solution, probably by utilizing the encoding support of GUILEv2. Which again suggests that a moratorium on this patch makes sense, but that one would not be "until David gets encodings sorted out so that we can compile and test on GUILEv2" but rather on "until we are able to ditch GUILEv1 altogether". And given the current enthusiasm for working on the GUILEv2 migration and the likelihood of necessary bug fixes in GUILE itself becoming available as we discover problems, the latter is likely quite longer than a month away.
The problem this patch tries to address is not a recent problem, and the workaround ("don't use special Unicode characters in filenames") is obvious. So the urgency seems limited. I very much agree that LilyPond should not balk at file and directory names containing accented characters on all platforms. But I think we will be doing ourselves a favor to postpone addressing this problem until we have moved LilyPond to GUILEv2, and then look for the best-matching solution within _this_ framework. Otherwise we complicate the coding issue work for GUILEv2 migration and that's something we can afford less than prolonging the current filename encoding situation.
Originally posted by: truer...@gmail.com
Thank you for reviewing my patch.
Exactly.
It is not a recent problem, and the workaround is obvious.
I'd like to wait while watching GUILEv2 migration.
Now, I read current GUILE's Git HEAD sources.
For the moment, it doesn't support the unicode filenames on Windows.
E.g. the Scheme procedure ``open-file'' uses the open() system call.
http://git.savannah.gnu.org/cgit/guile.git/tree/libguile/fports.c#n403
http://git.savannah.gnu.org/cgit/guile.git/tree/libguile/_scm.h#n167
On Windows, open() can't handle unicode filenames.
Instead, _wopen() is necessary.
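As a rough sketch of what this means in practice (assumptions: the helper name `open_utf8` is hypothetical, and this is neither GUILE nor LilyPond code), a wrapper that accepts a UTF-8 filename could convert it with MultiByteToWideChar() and call _wopen() on Windows, and fall through to plain open() elsewhere:

```cpp
#include <fcntl.h>
#include <sys/stat.h>
#include <string>
#ifdef _WIN32
#include <windows.h>
#include <io.h>
#else
#include <unistd.h>
#endif

// Hypothetical helper: open a file whose name is given as UTF-8.
// Returns a file descriptor, or -1 on failure.
int open_utf8(const std::string &path, int flags, int mode = 0666)
{
#ifdef _WIN32
  // First call asks for the required length (including the terminating NUL).
  int len = MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS,
                                path.c_str(), -1, nullptr, 0);
  if (len <= 0)
    return -1;
  std::wstring wide(static_cast<std::size_t>(len), L'\0');
  if (MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS,
                          path.c_str(), -1, &wide[0], len) <= 0)
    return -1;
  // The Windows CRT only accepts _S_IREAD and _S_IWRITE as permission bits.
  return _wopen(wide.c_str(), flags, mode & (_S_IREAD | _S_IWRITE));
#else
  // POSIX systems treat the name as a byte string, so UTF-8 passes through.
  return open(path.c_str(), flags, mode);
#endif
}
```

The same pattern would apply to unlink() and _wunlink(). Only the non-Windows branch is exercised when this is built on a POSIX system.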
So, I propose another solution that does not depend on the GUILE.
It's the method to process all open() and unlink() etc. by C++, not by GUILE.
Currently, LilyPond opens *.ly and *.mid files from C++.
It opens *.ps files from GUILE.
It invokes GhostScript from C++ (GLib).
It deletes *.ps files from GUILE.
So how about the following approach?
Instead of using GUILE's ``open-file'' and ``delete-file'' procedures etc.,
use C++ callback functions like ``ly:spawn''.
http://git.savannah.gnu.org/gitweb/?p=lilypond.git;a=blob;f=lily/general-scheme.cc;h=068fb27aad255427f4c6239bfa8c370242985a19;hb=HEAD#l639
In addition, I'm trying to upgrade GUB's ghostscript to 9.15.
https://github.com/trueroad/gub/tree/ghostscript-9.15
It is not complete yet, but it should be finished after some more work.
Originally posted by: dak@gnu.org
> I'd like to wait while watching GUILEv2 migration.
Well, everyone does.
> Now, I read current GUILE's Git HEAD sources.
> For the moment, it doesn't support the unicode filenames on Windows.
> On Windows, open() can't handle unicode filenames.
> So, I propose another solution that does not depend on the GUILE.
> It's the method to process all open() and unlink() etc. by C++, not by GUILE.
Well, I propose exactly the opposite solution (or actually "strategy"): process all open() and unlink() etc. by GUILE, not C++.
In GUILEv2, strings are Unicode strings. When the respective Scheme functions for open/unlink etc do not process their arguments as Unicode strings, that is a bug. This bug needs to be reported and get fixed rather than worked around.
It is a problem that every application using GUILE will encounter. Fixing it in GUILE is the only way that makes sense. If the matter were of utmost urgency, we could create a patch for the Windows version of GUILE we compile in GUB and use that patch until a fix appears in GUILE.
I consider it likely that if one reports the GUILE problem now, particularly if accompanied by proposals of how to solve it, a version of GUILE where this has been fixed will be available by the time we are ready for GUILEv2 migration.
Even if not, we can still patch up GUILEv2 in GUB ourselves. So I think that the right solution should be using GUILE for everything.
Originally posted by: pkx1...@gmail.com
Looks like this needs work.
Labels: -Patch-review Patch-needs_work
Originally posted by: truer...@gmail.com
OK.
I understand your "strategy".
More than a year ago, the GUILE developers discussed Unicode support on Windows.
However, the discussion does not seem to have reached a conclusion.
http://lists.gnu.org/archive/html/guile-devel/2014-02/msg00073.html
I will continue trying to upgrade GUB's ghostscript to 9.15.
Originally posted by: truer...@gmail.com
I've succeeded in upgrading GUB's ghostscript to 9.15 in this branch.
https://github.com/trueroad/gub/tree/ghostscript-9.15
GUB's ``make lilypond'' now succeeds with ghostscript-9.15.
All lilypond installers have been built.
On Windows, this ghostscript can handle unicode filenames.
Originally posted by: pkx1...@gmail.com
(No comment was entered for this change.)
Owner: ---
Originally posted by: truer...@gmail.com
(No comment was entered for this change.)
Owner: truer...@gmail.com
Closed as duplicate of issue #2173.