Working with TextPad, I've grown accustomed to a toggle that lets me see non-printing characters such as space, tab and newline.
Now I'm working with the Khmer language, in which there is extensive use of the zero-width space[1] to mark word boundaries (Khmer being one of those scripts that does not visibly separate words from each other). So I'd like to be able to switch back and forth between show and hide invisibles when editing text, to see my word boundaries.[2]
Is this in the program? I haven't located it if it is.
And if it's not, please regard this as a feature request.
Thanks,
Roger Sperberg
[1] Unicode character U+200B
[2] With the Khmer keyboard, pressing the spacebar inserts a ZWSP, so you can get two or three without realizing it. Or, more problematically, put one where it doesn't belong inadvertently and invisibly. (In case you're wondering, you press Shift+spacebar to get a normal space.)
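To make the request concrete, here is a minimal Python sketch of what a "show invisibles" toggle could do to the displayed text. The marker glyphs and the names `MARKERS` and `show_invisibles` are illustrative assumptions, not anything XCE or TextPad actually uses.

```python
# Hypothetical marker table: the glyphs are illustrative choices only.
MARKERS = {
    " ": "\u00b7",        # middle dot for an ordinary space
    "\t": "\u2192",       # rightwards arrow for a tab
    "\n": "\u00b6\n",     # pilcrow, then keep the real line break
    "\u200b": "[ZWSP]",   # zero-width space, U+200B
}


def show_invisibles(text: str) -> str:
    """Return a display copy of text with non-printing characters marked."""
    return "".join(MARKERS.get(ch, ch) for ch in text)


# Two Khmer letters (U+1780, U+1781) separated by a ZWSP, then a normal space.
sample = "\u1780\u200b\u1781 x"
print(show_invisibles(sample))
```

Only the display copy changes; the underlying text keeps its real characters, so toggling the view off is just a matter of showing the original string again.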
Visible white space is available from version 1.0.8.8. I hope ZWSPs will work better, too, but I haven't tested this yet.
Tabs and spaces are clearly shown.
Newlines aren't.
Neither is the zero-width space (at least not in the fonts I've tried, but mightn't this be independent of the font? -- or: shouldn't it be?).
I heard from someone who is expert in this that XML officially considers only three characters whitespace (that is, space, tab and newline). Other non-printing characters such as the thin-space and, presumably, the ZWSP are not officially whitespace. They're just "invisibles".
As you work on this, you may want to provide for either (show whitespace/show ALL nonprinting characters) or let the user choose which of the different characters should be shown by something on-screen. Of course, a function like normalize-space() would treat these characters differently, and hence your users may prefer this differentiation.
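As a rough illustration of that difference, here is a small Python sketch of normalize-space()-style behaviour. It assumes the whitespace set from the XML specification, which lists space, tab, carriage return and line feed (so "newline" covers two characters); the function name `normalize_space` is just a stand-in for the XPath function, not a library call.

```python
import re

# XML whitespace per the XML spec's S production: space, tab, CR, LF.
XML_WS = " \t\r\n"
_WS_RUN = re.compile(r"[ \t\r\n]+")


def normalize_space(text: str) -> str:
    """Collapse runs of XML whitespace to one space and trim both ends."""
    return _WS_RUN.sub(" ", text).strip(XML_WS)


# The ZWSP (U+200B) and thin space (U+2009) are left alone,
# because neither is in the XML whitespace set.
print(repr(normalize_space("  a \t\n b\u200b  ")))
print(repr(normalize_space("a\u2009\u2009b")))
```

So a view that shows only whitespace and a view that shows all non-printing characters really would mark different things.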
Roger
I can enter a ZWSP in XCE (using the Khmer locale keyboard). I can select the text with the ZWSP and copy and paste it into XCE or into another program such as Word with no issues.
However, if I take text from Word containing a ZWSP (including the text that originated in XCE), copy it and paste it into XCE, the ZWSP is lost.
This is true of 1.0.8.8 (as well as 1.0.8.7).
Roger
I need to find out if there is a newline display option. The visible whitespace option can be set to invisible, always visible, or visible outside indentation. (The checkbutton in Tools>Options>White space visible toggles between the first and last settings.)
<I can enter a ZWSP in XCE (using the Khmer locale keyboard). I can select the text with the ZWSP and copy and paste it into XCE or into another program such as Word with no issues. However, if I take text from Word containing a ZWSP (including the text that originated in XCE), copy it and paste it into XCE, the ZWSP is lost.>
Do you know if it's possible to copy a ZWSP from XCE and paste it back (in a different place) into the document? I think it is at least possible that the ZWSP is lost not when it's pasted back into XCE, but when it's copied from Word.
All XCE does is fetch the contents of the clipboard from Windows as a Unicode string and insert it into the document (if tags are locked it will also convert < > & into the relevant entity references).
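The paste step described above could be sketched roughly like this in Python. This is not XCE's actual code: the function name `prepare_paste` and the `tags_locked` parameter are hypothetical, and the clipboard text is simply passed in as a string rather than fetched from Windows.

```python
def prepare_paste(clipboard_text: str, tags_locked: bool) -> str:
    """Return the text to insert into the document for a paste.

    When tags are locked, markup characters are turned into entity
    references; every other character passes through unchanged.
    """
    if not tags_locked:
        return clipboard_text
    # Replace "&" first so the entities added below are not escaped again.
    return (
        clipboard_text.replace("&", "&amp;")
        .replace("<", "&lt;")
        .replace(">", "&gt;")
    )


print(prepare_paste("a<b>&\u200b", tags_locked=True))
```

Nothing in a step like this touches U+200B, which is why the loss seems more likely to happen in what ends up on the clipboard than in the insertion itself.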
I must install the Khmer support on my system so I can test this for myself.
Best,
Gerald
In XCE, if I copy text with a ZWSP and paste it into another spot in the same file, the ZWSP is preserved.
So, yes, it's something to do with the way it comes from Word, I suppose.
Note though that if I copy text with a ZWSP in Word, I can paste it into Word, into NotePad, into XML NotePad, and into the Stackz flashcard app and the ZWSP is preserved.
So it's not like Word is dropping this character. This is the only circumstance I've observed where it's not preserved in a program that is otherwise properly handling the Unicode (some other text editors would display the Unicode but didn't paste the ZWSP either -- sorry, I've already uninstalled them after the brief tests so I can't even identify them).
I opened several apps and copied and pasted among them. This is what I learned:
When I copy text with a ZWSP from XCE, I don't have any problems pasting it into Word, Stackz, NotePad, OpenOffice Writer, XML NotePad, Excel.
When I copy text with a ZWSP in any of these apps, I don't have any problems when I paste it into XCE: Stackz, NotePad, OpenOffice Writer, XML NotePad, Excel.
When I copy text with a ZWSP in these apps, the ZWSP disappears when pasted into XCE: Word.
Hm-m. Word is the only application with which this problem shows up. It also happens to be one of only two applications (OpenOffice Writer being the other) that shows the ZWSP; in Word it toggles on and off with the other non-printing characters. OO Writer displays a gray, overlapping space between words where the ZWSP appears, even with non-printing characters hidden; i.e., the toggle doesn't affect the ZWSP at all.
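Since the ZWSP can't be seen in most of these apps, a check like the following Python sketch is one way to confirm what a paste dropped. It assumes you can get both the copied and the pasted text into strings (for example by saving each to a file); the name `lost_characters` is made up for illustration.

```python
from collections import Counter


def lost_characters(copied: str, pasted: str) -> dict:
    """Report characters present in copied but missing from pasted.

    Keys are code points such as "U+200B"; values are how many were lost.
    """
    missing = Counter(copied) - Counter(pasted)  # keeps positive counts only
    return {f"U+{ord(ch):04X}": count for ch, count in missing.items()}


# Two Khmer letters with a ZWSP after each, pasted back without the ZWSPs.
print(lost_characters("\u1780\u200b\u1781\u200b", "\u1780\u1781"))
```

An empty result means every character survived the round trip.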
Hope this helps.
Roger