From: Mikhael G. <mi...@ho...> - 2004-05-16 19:00:22
|
I have several questions for Joe. These issues are not urgent and may be
covered after releasing 3.1 (or should the numbering be x.y.z, 3.0.1?).

1. The documentation of the ctags package installed on my system says that
several editors, including vim and emacs, use the tags system for syntax
highlighting. The project's home page is:

  http://ctags.sourceforge.net/

It would be nice to reuse the work of this project and automatically
support quality highlighting for 33 programming languages. Can you please
comment on this?

The other issues assume you want to roll your own syntax highlighting rules.

2. The Perl syntax file is a very good start, but it fails in a lot
of non-trivial situations. Are you willing to maintain it yourself?
If you wish, I may maintain this file in cvs.

I am also thinking about adding *.diff highlighting for patches/diffs.

3. Please correct me, but as I understand it, the configured colors are
currently used in both color-enabled terms (like TERM=xterm) and
color-disabled terms (like TERM=vt100). I would like to have 2 definition
sets, for color and mono terminals, as is done in some other
applications, say, mutt.

For example, here is the current color definition of mail.jsf:

  =Idle
  =Quot1 green
  =Quot2 cyan
  =Sign magenta

I may wish to use these non-colors instead:

  =Idle
  =Quot1 underline
  =Quot2 bold
  =Sign inverse

One possible way to solve this is to allow definitions like:

  =Idle
  =Quot1 color green, mono underline
  =Quot2 color cyan, mono bold
  =Sign magenta inverse  # this is used for both color and mono terms

Another solution is to separate colors from syntax rules and to let
users supply their colors without changing the syntax file.

What do you think?

Regards,
Mikhael.
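The combined color/mono definition proposed in point 3 can be sketched as a small parser. This is only an illustration of the proposed syntax, not anything JOE implements: the function names `parse_color_def` and `attrs_for` are hypothetical, and the rule that unprefixed attributes apply to both terminal types is taken from the "=Sign magenta inverse" example.

```python
def parse_color_def(line):
    """Parse one proposed '=Name ...' line.

    Returns (name, color_attrs, mono_attrs). Attributes without a
    'color' or 'mono' prefix are shared by both sets (assumption taken
    from the '=Sign magenta inverse' example).
    """
    body = line.split("#", 1)[0].strip()      # drop a trailing comment
    if not body.startswith("=") or len(body) == 1:
        raise ValueError("not a color definition: %r" % line)
    fields = body[1:].split(None, 1)          # name, then the rest
    name = fields[0]
    rest = fields[1] if len(fields) > 1 else ""
    color, mono, shared = [], [], []
    for part in rest.split(","):
        words = part.split()
        if words and words[0] == "color":
            color += words[1:]
        elif words and words[0] == "mono":
            mono += words[1:]
        else:
            shared += words
    return name, shared + color, shared + mono


def attrs_for(line, has_color):
    """Pick the attribute list that applies to the current terminal."""
    name, color, mono = parse_color_def(line)
    return name, (color if has_color else mono)


if __name__ == "__main__":
    for text in ("=Idle",
                 "=Quot1 color green, mono underline",
                 "=Sign magenta inverse  # both"):
        print(attrs_for(text, has_color=False))
```

Whether the terminal "has color" would come from terminfo or from a user override; that decision is left outside this sketch.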
From: <ja...@av...> - 2004-05-16 20:05:18
|
Mikhael Goikhman <mi...@ho...> wrote:

>1. The documentation of the ctags package installed on my system says that
>several editors, including vim and emacs, use the tags system for syntax
>highlighting. The project's home page is:
>
>  http://ctags.sourceforge.net/
>
>It would be nice to reuse the work of this project and automatically
>support quality highlighting for 33 programming languages. Can you please
>comment on this?

How do emacs and vim use the tags file for highlighting? Or do they borrow
the etags lexer for their own parsers? 33 languages is actually not that
many: VIM supports 388.

It would be nice to reuse VIM (or jEdit or emacs) syntax files, but doing so
is a huge amount of work. VIM syntax files are regular expressions intermixed
with VIM extension language code. If the extension language stuff could be
trimmed, it should be possible to compile the regular expressions into my DFA
format. But you need a pretty good regex compiler (VIM's), which is still a
lot of work.

jEdit might be an easier source: the syntax files are XML, but there seem to
be a lot of built-in rules (it's not purely regex based).

>The other issues assume you want to roll your own syntax highlighting rules.

>2. The Perl syntax file is a very good start, but it fails in a lot
>of non-trivial situations.

Just to warn you: perl is a nearly hopeless language to highlight (although
yes, JOE's highlighter could be much better). For example, without a more
powerful parser, there is no way to deal with
s<any-char>find<any-char>replace<any-char>, which is perfectly legal perl. I
don't think it needs a pushdown automaton, but it does need self-modifying
syntax. Also there's a combinatorial explosion for things like
s<find>{replace}.

Shell is another difficult language: there is no way to parse:

cat <<EOF
this is
"weird
text
EOF

>Are you willing to maintain it yourself?
>If you wish, I may maintain this file in cvs.

If you would like to improve it, please do. It needs to recognize commands
(first word on line) so that keywords are not highlighted in non-command
contexts, plus it needs a new context for all of the commands which have
regex after them.

>I am also thinking about adding *.diff highlighting for patches/diffs.

That would be cool.

>3. Please correct me, but as I understand it, the configured colors are
>currently used in both color-enabled terms (like TERM=xterm) and
>color-disabled terms (like TERM=vt100). I would like to have 2 definition
>sets, for color and mono terminals, as is done in some other
>applications, say, mutt.

That's a good idea: I'll add it at some point. The only argument against it
is that it is very rare that somebody is using an xterm with no color
support (though it happens), but quite likely that they have an xterm with
color support, but the terminfo database says it doesn't have it.

>Another solution is to separate colors from syntax rules and to let
>users supply their colors without changing the syntax file.

I was going to eventually do this. Perhaps specify color-syntax element
bindings in the joerc file or even in a menu within joe.
From: Mikhael G. <mi...@ho...> - 2004-05-17 00:08:59
|
[Sorry for the large message.]

On 16 May 2004 16:07:09 -0400, ja...@av... wrote:
>
> Mikhael Goikhman <mi...@ho...> wrote:
>
> >1. The documentation of the ctags package installed on my system says that
> >several editors, including vim and emacs, use the tags system for syntax
> >highlighting. The project's home page is:
> >
> >  http://ctags.sourceforge.net/
> >
> >It would be nice to reuse the work of this project and automatically
> >support quality highlighting for 33 programming languages. Can you please
> >comment on this?
>
> How do emacs and vim use the tags file for highlighting?

Unfortunately I can't say anything useful here, there is some poor info
in the "Adding tag file support to a software tool" link on that page.
I hoped you (or someone else) knew more about this process.

> >2. The Perl syntax file is a very good start, but it fails in a lot
> >of non-trivial situations.
>
> Just to warn you: perl is a nearly hopeless language to highlight

You don't need to warn me: I have been a native Perl speaker for many years
and I am fully aware of the power and complexity of its syntax. This is why
I wanted to explore the possibility of reusing the work of others first.

> (although yes, JOE's highlighter could be much better).

At least a lot of keywords and built-in function names are missing.
Some constructions like pod (inline documentation) are missing too.

> For example, without a more powerful parser, there is no way to deal
> with s<any-char>find<any-char>replace<any-char>, which is perfectly
> legal perl. I don't think it needs a pushdown automaton, but it does
> need self-modifying syntax. Also there's a combinatorial explosion for
> things like s<find>{replace}.

Yes, regexps together with arbitrary quotes like q[it's string], which I use
all the time, totally bury my highlighting. :-)

I should say that vim is not smart about s@string@replace@ either.

It would be nice if you enhanced the lexer to have a stack (sorry if you
already have this, I haven't learned the details yet), or maybe more than
one stack to be Turing-complete.

> Shell is another difficult language: there is no way to parse:
>
> cat <<EOF
> this is
> "weird
> text
> EOF

This should be handled by a stack, I guess, just push and wait for
"^EOF$" (or "\nEOF\n" if there are no regexps).

> >Are you willing to maintain it yourself?
> >If you wish, I may maintain this file in cvs.
>
> If you would like to improve it, please do. It needs to recognize commands
> (first word on line) so that keywords are not highlighted in non-command
> contexts

I don't think there is such a term as a Perl command. :)
There are at least these types of keywords:

1) built-in functions, which may appear at almost any place
2) keywords like "return", "last", "use", which may only start a statement
3) declaration keywords like "my", "our", "local", which may start a
   statement or appear in the middle:
     while (my $a = <>) { $b = (my $c = 1); }
4) block keywords: some precede the block ("if", "for", "while"),
   some come between the blocks ("else", "elsif", "continue")
5) the same keywords ("for", "unless", "while") may be modifiers, i.e.
   appear in the middle of the statement without any block
... more keyword types ...

And any statement itself may follow ";" or a curly bracket: { statement }.
"Commands" should definitely _not_ be taken to be the first word on the line.

> plus it needs a new context for all of the commands which have
> regex after them.

I think the regexp part should be handled just like any other string.
There are mostly 2 groups of such syntax constructions (they are not
commands).

1) 'string', "string", `string`, <string>, /string/, m{string},
   q[string], qq(string), qw<string>, qx:string:, qr@string@
2) s/string/string/, tr{string}{string}, y[string](string)

There are only minor differences between "string" in all of these
constructions, so I think they should be handled similarly with a stack
(or maybe you could just add a simple general-purpose register for the
end-quote).

I think I will search the CPAN module Syntax::Highlight::Perl for ideas.

> >3. Please correct me, but as I understand it, the configured colors are
> >currently used in both color-enabled terms (like TERM=xterm) and
> >color-disabled terms (like TERM=vt100). I would like to have 2 definition
> >sets, for color and mono terminals, as is done in some other
> >applications, say, mutt.
>
> That's a good idea: I'll add it at some point. The only argument against it
> is that it is very rare that somebody is using an xterm with no color
> support (though it happens), but quite likely that they have an xterm with
> color support, but the terminfo database says it doesn't have it.

I use TERM=vt100 mostly because I dislike the TERM=xterm behaviour of
separate screens. I want to see my text after I exit joe or less.

I also don't much like colors, besides the ones I defined for xterm:
underline (soft cyan), bold (soft yellow) and reverse (gray background).
I mean, I like that all my terminal apps (less, mutt, joe) look the same
using this white-cyan-yellow on dark-gray background color scheme.
I wonder whether I am the only one who doesn't like the motley and often
hard-to-read colors in my xterm (I disable colorful "ls" as well).

Regards,
Mikhael.
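The "general-purpose register for the end-quote" idea from the message above can be sketched as follows. This is a simplified illustration, not JOE's lexer and not a full Perl tokenizer: the function names are hypothetical, whitespace between the two parts of s{...} {...} is not handled, and a backslash is assumed to escape the next character.

```python
# Bracketing delimiters close with their mirror image and may nest;
# any other delimiter closes with the same character.
PAIRS = {"(": ")", "[": "]", "{": "}", "<": ">"}


def scan_quoted(text, start):
    """text[start] is the opening delimiter.

    The closing delimiter is remembered in a single variable (the
    'end-quote register'). Returns the index just past the closing
    delimiter, or -1 if the construct is unterminated.
    """
    opener = text[start]
    closer = PAIRS.get(opener, opener)    # the end-quote register
    depth = 1
    i = start + 1
    while i < len(text):
        ch = text[i]
        if ch == "\\":                    # skip an escaped character
            i += 2
            continue
        if ch == closer:                  # tested first so q@...@ works
            depth -= 1
            if depth == 0:
                return i + 1
        elif ch == opener:                # only reachable for brackets
            depth += 1
        i += 1
    return -1


def scan_two_part(text, start):
    """Scan s/a/b/, tr{a}{b}, y[a](b) style constructs (group 2)."""
    end1 = scan_quoted(text, start)
    if end1 == -1:
        return -1
    if text[start] in PAIRS:
        # Bracketed form: the second part brings its own opener.
        return scan_quoted(text, end1) if end1 < len(text) else -1
    # Same-character form: the middle delimiter opens the second part.
    return scan_quoted(text, end1 - 1)


if __name__ == "__main__":
    print(scan_quoted("q[it's string] + 1", 1))
    print(scan_two_part("s@string@replace@", 1))
    print(scan_two_part("y[string](string)", 1))
```

The only state carried across characters is the closing delimiter and a nesting depth, which is why a single extra variable may be enough for these constructs.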
From: <ja...@av...> - 2004-05-17 00:52:05
|
Mikhael Goikhman <mi...@ho...> wrote:
>> How do emacs and vim use the tags file for highlighting?

>Unfortunately I can't say anything useful here, there is some poor info
>in the "Adding tag file support to a software tool" link on that page.

Can you point to exactly where in the documentation you saw this?

>At least a lot of keywords and built-in function names are missing.
>Some constructions like pod (inline documentation) are missing too.

Yeah, yeah.. Java has something like this too. I was hoping more people
would contribute syntax files, but it hasn't happened yet. The most help
I've gotten thus far was, surprisingly, for fortran.jsf (and you'd be
surprised at how difficult fortran is to highlight properly).

>It would be nice if you enhanced the lexer to have a stack (sorry if you
>already have this, I haven't learned the details yet), or maybe more than
>one stack to be Turing-complete.

I think maybe it just needs one extra variable (a string), and not a stack.
But most of the work is switching from 'int' to a typedef to hold the state.

>> That's a good idea: I'll add it at some point. The only argument against it
>> is that it is very rare that somebody is using an xterm with no color
>> support (though it happens), but quite likely that they have an xterm with
>> color support, but the terminfo database says it doesn't have it.

>I use TERM=vt100 mostly because I dislike the TERM=xterm behaviour of
>separate screens. I want to see my text after I exit joe or less.

I hate this too: I added the '-notite' option to joerc (currently enabled by
default), which causes JOE to not send the "ti" and "te" termcap sequences
which cause this horrible behavior :-)
From: Mikhael G. <mi...@ho...> - 2004-05-17 02:21:24
|
On 16 May 2004 20:54:17 -0400, ja...@av... wrote:
>
> Mikhael Goikhman <mi...@ho...> wrote:
> >> How do emacs and vim use the tags file for highlighting?
>
> >Unfortunately I can't say anything useful here, there is some poor info
> >in the "Adding tag file support to a software tool" link on that page.
>
> Can you point to exactly where in the documentation you saw this?

I saw in the ctags man page that emacs, vim and other editors use ctags. A
quick search in Google finds many online man pages, for example:

  http://www.polarhome.com/ctags/ctags.html

Then I wondered why there is not a single "highlight" word in the ctags
documentation, and I think I get it now. This package is used to find the
location of some language object (for example, a function), so if the cursor
is on a play_game("tetris") function call, you may easily jump to the
function definition. It is not used for highlighting. I am sorry for the
confusion. :)

Still, it is a pity there is no free software package to generate
highlighting information for all known languages.

Now, since I need these highlighting capabilities for my own FS program,
I did a little research, and I see there is a poor tool for me (but it is
useless for joe):

  enscript -W html --output out.html --pretty-print in.pm

Regards,
Mikhael.
From: Brian C. <B.C...@po...> - 2004-05-17 09:22:08
|
On Mon, May 17, 2004 at 12:08:53AM +0000, Mikhael Goikhman wrote:
> > Shell is another difficult language: there is no way to parse:
> >
> > cat <<EOF
> > this is
> > "weird
> > text
> > EOF
>
> This should be handled by a stack, I guess, just push and wait for
> "^EOF$" (or "\nEOF\n" if there are no regexps).

Strictly speaking, a stack is not needed here, just a single match buffer
(unless this construct is nestable, which I don't think it is). Ditto for
s@pattern@replace@

A stack would be useful for things like:
- C balancing: given a '{', find me the matching '}'
- XML balancing: given a <tag>, find the matching </tag>

Although even then, you don't need an explicit stack, just a counter which
records how deep the stack would have been at that point.

I would probably argue that really you should be able to enter the BNF
grammar of a language and have it properly parsed; unfortunately, I don't
think that Perl or sh are parseable in that way :-(

Perhaps you should be able to bolt on an existing external parser. Is there
a function in libperl you can call which takes a stream of text and
demarcates it into a stream of tokens? That would be cool, although I expect
you couldn't capture the internal parser state at an intermediate point,
which means you'd always have to reparse the whole file from the top.

Regards,

Brian.
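The two techniques mentioned in the message above (a single match buffer for here-documents, and a depth counter for brace balancing) can be sketched like this. It is a rough illustration under simplifying assumptions: the function names are hypothetical, only one here-document per line is recognised, "<<-" and quoted delimiters are ignored, and "<<" inside strings or comments is not excluded. The counter approach covers a single bracket type only.

```python
import re

# Assumption: a here-document starts with '<<' followed by a bare word.
HEREDOC_START = re.compile(r"<<\s*(\w+)")


def classify_lines(lines):
    """Label each line 'code' or 'heredoc' using one match buffer."""
    pending = None                  # the single match buffer
    result = []
    for line in lines:
        if pending is not None:
            if line == pending:     # terminator line ends the body
                pending = None
                result.append(("code", line))
            else:
                result.append(("heredoc", line))
            continue
        result.append(("code", line))
        match = HEREDOC_START.search(line)
        if match:
            pending = match.group(1)
    return result


def find_matching(text, start, opener="{", closer="}"):
    """Return the index of the closer matching text[start], or -1.

    A counter records how deep an explicit stack would have been.
    """
    if text[start] != opener:
        raise ValueError("text[start] must be the opening character")
    depth = 0
    for i in range(start, len(text)):
        if text[i] == opener:
            depth += 1
        elif text[i] == closer:
            depth -= 1
            if depth == 0:
                return i
    return -1


if __name__ == "__main__":
    script = ["cat <<EOF", "this is", '"weird', "text", "EOF", "echo done"]
    for kind, line in classify_lines(script):
        print(kind, line)
    print(find_matching("a { b { c } d } e", 2))
```

Because the unbalanced quote on the '"weird' line falls inside the here-document body, it never reaches the ordinary string rules.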
From: Mikhael G. <mi...@ho...> - 2004-05-17 16:16:22
|
On 17 May 2004 09:21:31 +0000, Brian Candler wrote:
>
> On Mon, May 17, 2004 at 12:08:53AM +0000, Mikhael Goikhman wrote:
> > > Shell is another difficult language: there is no way to parse:
> >
> > This should be handled by a stack, I guess, just push and wait for
> > "^EOF$" (or "\nEOF\n" if there are no regexps).
>
> Strictly speaking, a stack is not needed here, just a single match buffer

What I meant in my original message is that it would be nice to have a stack
(for other purposes). Then this particular problem would be solved too.

> Perhaps you should be able to bolt on an existing external parser. Is there
> a function in libperl you can call which takes a stream of text and
> demarcates it into a stream of tokens? That would be cool, although I expect
> you couldn't capture the internal parser state at an intermediate point,
> which means you'd always have to reparse the whole file from the top.

Parsing the whole file on every key press sounds so inefficient. :)

There is some information in the "Editor Support for Debugging" section
of "man perldebug", or "perldoc perldebug" on your system. Google gives:

  http://perl.active-venture.com/pod/perldebug.html#Editor-Support-for-Debugging

It seems there are some tools to help with parsing Perl. I am not sure yet
how useful they are, and I am a bit pessimistic about this approach.

Regards,
Mikhael.
From: Tom M. <to...@ho...> - 2004-05-22 16:45:52
|
> I am also thinking about adding *.diff highlighting for patches/diffs.

I spent some time working on this a couple weeks ago. I think I left off
with unified and normal diffs working well. I didn't get context diffs
working.

I'll clean this up and submit it this weekend. Hopefully that will be in
time for the 3.1 release.

-- 
"...if the church put in half the time on covetousness that it does on lust,
this would be a better world." -- Garrison Keillor, "Lake Wobegon Days"
From: Tom M. <to...@ho...> - 2004-05-26 03:48:42
Attachments:
diff.jsf
|
On Sat, May 22, 2004 at 09:44:51AM -0700, Tom Marshall wrote:
> > I am also thinking about adding *.diff highlighting for patches/diffs.
>
> I spent some time working on this a couple weeks ago. I think I left off
> with unified and normal diffs working well. I didn't get context diffs
> working.
>
> I'll clean this up and submit it this weekend. Hopefully that will be in
> time for the 3.1 release.

I haven't had a chance to complete this. My attempt at unified diff
highlighting is attached. Feel free to use and modify.

-- 
A design for digital copy protection is like a design for a perpetual
motion machine: It may be interesting to look at, but you know from the
start it is impossible to build.
From: Mikhael G. <mi...@ho...> - 2004-05-26 04:25:54
|
On 25 May 2004 20:48:02 -0700, Tom Marshall wrote:
>
> On Sat, May 22, 2004 at 09:44:51AM -0700, Tom Marshall wrote:
> > > I am also thinking about adding *.diff highlighting for patches/diffs.
> >
> > I spent some time working on this a couple weeks ago. I think I left off
> > with unified and normal diffs working well. I didn't get context diffs
> > working.
> >
> > I'll clean this up and submit it this weekend. Hopefully that will be in
> > time for the 3.1 release.
>
> I haven't had a chance to complete this. My attempt at unified diff
> highlighting is attached. Feel free to use and modify.

Very good, thank you.

Of course I will redefine the colors for my mono TERM (and it does not look
very good on a color TERM either, with my rgb:00/00/30 background). But this
is a general problem of all syntax files. I am waiting for a solution from
Joe here (possibly two separate files for each language: one for parser
rules that I may leave system wide, and one for color values that I may
redefine in the home dir; plus support for a mono TERM color set).

Hopefully this makes sense.

Regards,
Mikhael.
From: <ja...@av...> - 2004-05-26 13:58:01
|
Do you have an updated XML highlighter? I'd like to include it in version
3.1. I have this one, but I think you were going to work on it some more:

# Define no. sync lines
# You can say:
# -200     means 200 lines
# -        means always start parsing from beginning of file when we lose sync
#          if nothing is specified, the default is -50
-

# Define colors
#
# bold inverse blink dim underline
# white cyan magenta blue yellow green red black
# bg_white bg_cyan bg_magenta bg_blue bg_yellow bg_green bg_red bg_black

# The underlines are here right now just because I want to distinguish which
# bits have been coloured (say) CdataStart, CdataBody, CdataEnd. And that's
# because I think it may be useful to make that distinction for some people.

=Idle
=Tag blue
=Error red bold
=EntityRef magenta
=Decl cyan
=CommentStart green underline
=CommentBody green
=CommentEnd green underline
=PIStart cyan bold underline
=PIBody cyan bold
=PIEnd cyan bold underline
=CdataStart blue bold underline
=CdataBody blue bold
=CdataEnd blue bold underline

:start Idle
    *           start
    "<"         open_tag        recolor=-1
    ">"         error           noeat recolor=-1
    "&"         entity          recolor=-1

:error Error
    *           start

:open_tag Tag
    *           tag
    "?"         pi_start        recolor=-2
    "!"         decl            recolor=-2 buffer

:tag Tag
    *           tag
    "<"         error           noeat recolor=-1
    "&"         entity_attr     recolor=-1
    ">"         start

:decl Decl
    *           decl            strings
    "!--"       comment_start   recolor=-5
    "![CDATA["  cdata_start     recolor=-10
done
    "<"         decl_nest
    ">"         start

# We allow one level of <...> nesting within declarations
:decl_nest Decl
    *           decl_nest
    ">"         decl

:comment_start CommentStart
    *           comment         noeat

:comment CommentBody
    *           comment
    "-"         comment2

:comment2 CommentBody
    *           comment
    "-"         comment3

:comment3 CommentBody
    *           comment_error   noeat recolor=-3
    ">"         comment_end     recolor=-3

:comment_end CommentEnd
    *           start           noeat recolor=-1

# For compatibility, the string "--" (double-hyphen) MUST NOT occur within
# comments. [http://www.w3.org/TR/REC-xml/ section 2.5]
:comment_error Error
    *           comment
    "-"         comment_error
    ">"         comment_end     recolor=-3

:pi_start PIStart
    *           pi              noeat recolor=-1

:pi PIBody
    *           pi
    "?"         pi2

:pi2 PIBody
    *           pi
    ">"         pi_end          recolor=-2

:pi_end PIEnd
    *           start           noeat recolor=-1

:cdata_start CdataStart
    *           cdata           noeat

:cdata CdataBody
    *           cdata
    "]"         cdata2

:cdata2 CdataBody
    *           cdata
    "]"         cdata3

:cdata3 CdataBody
    *           cdata
    ">"         cdata_end       recolor=-3

:cdata_end CdataEnd
    *           start           noeat recolor=-1

:entity EntityRef
    *           entity
    ";"         start

:entity_attr EntityRef
    *           entity_attr
    ";"         tag