Re: [Flex-help] Please help me about linguistic typology
flex is a tool for generating scanners
From: Hugh S. <hg...@dm...> - 2013-06-16 13:12:56
On Sun, 16 Jun 2013, Daniel Janzon wrote:

> Hi!
>
> Flex is not the right tool for this. It's way too complicated (and
> powerful). Your francophone friend should have mentioned the tool sed.
> Sed will be your friend.

You may also find awk will do the job. More recent options are Perl,
Python and Ruby. Some of this is a matter of personal taste. sed is a
good choice for many reasons, though I have found it difficult to debug
at times.

> Save a copy of your lex document (no joke!) and experiment with the
> following commands, which I haven't verified to work:

That should be "tex" rather than "lex" document, I think. It is that
document you wish to process, and preserve in case things go wrong.

> sed -i 's/\(.*[^ \t]\)[ \t]*$/\1/' your-file.lex # Remove whitespace at end of line

It is the -e option that takes an expression. See tutorials here:
http://www.dblab.ece.ntua.gr/~george/sed/

This ("Unix text processing") might be helpful:
http://oreilly.com/openbook/utp/

> sed -i 's/\t/    /g' your-file.lex # Replace tabs with four spaces
> sed -i 's/\([,.;:!?]\)[ \t]*/\1 /g' your-file.lex # Put a single whitespace after punctuation marks
> sed -i 's/\([.?!][ \t]*[a-zA-Z]\)/\U\1/g' your-file.lex # Fix forgotten upper case
>
> I didn't test run the above commands but they will hopefully get you
> on track.
>
> This is not a sed mailing list so do not expect an extended discussion
> about sed here.
>
> All the best,
> Daniel

HTH
Hugh

> ________________________________________
> From: t ly [tl...@li...]
> Sent: Sunday, June 16, 2013 8:32 AM
> To: Fle...@li...
> Subject: [Flex-help] Please help me about linguistic typology
>
> Hello everybody,
>
> On the advice of a friend, I have started learning to use LaTeX
> instead of MS Word for my writing. He said that it would take me a
> little time at the beginning, but that it would save me a lot of time
> later. I am beginning to find that he's right :)
>
> Today I tried to convert some of my work-in-progress documents from
> Word to LaTeX, but I have a lot of useless whitespace in the converted
> documents. On a francophone forum of Ubuntu users, one member
> suggested that I use Lex, a much more powerful text processing tool,
> to deal with it; but nobody knows exactly how to do the following
> tasks:
>
> 1) Remove all useless whitespace (spaces and tabs) at the end of the line.
> 2) Replace all tabs with a fixed number of spaces, 4 for example.
> 3) Put a single whitespace after punctuation marks [,;:!?].
> 4) Restore the forgotten upper case (transform the letters found
>    after punctuation marks [.!?] into capital letters).
>
> So, I wonder if you'd be kind enough to give me some lines of code
> which could do that. Thank you in advance for any help you can provide!
> ------------------------------------------------------------------------------
> This SF.net email is sponsored by Windows:
>
> Build for Windows Store.
>
> http://p.sf.net/sfu/windows-dev2dev
> _______________________________________________
> Flex-help mailing list
> Fle...@li...
> https://lists.sourceforge.net/lists/listinfo/flex-help
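
For anyone who would rather avoid sed, the four tasks in this thread can also be sketched in Python, one of the alternatives mentioned above. This is only a minimal sketch, not a tested solution for real LaTeX sources: the names tidy_line and tidy_text are made up for illustration, and the rules are deliberately naive. In particular, the punctuation rule only fires when a letter follows, so "3.14" is left alone, but "e.g." or a URL would still be split, and a sentence that starts on a new line is not capitalised because each line is handled on its own.

```python
import re

# Punctuation followed by optional blanks and then a letter.
# [^\W\d_] matches any Unicode letter, so accented French letters count.
PUNCT_SPACING = re.compile(r"([,.;:!?])[ \t]*(?=[^\W\d_])")
# Sentence-ending punctuation, one space, then the letter to capitalise.
SENTENCE_START = re.compile(r"([.!?] )([^\W\d_])")
# Spaces and tabs at the very end of a line.
TRAILING_WS = re.compile(r"[ \t]+$")


def tidy_line(line: str, tab_width: int = 4) -> str:
    """Apply the four clean-up steps to a single line (hypothetical helper)."""
    # Task 3: exactly one space after punctuation that precedes a letter.
    line = PUNCT_SPACING.sub(r"\1 ", line)
    # Task 2: replace each remaining tab with a fixed number of spaces.
    line = line.replace("\t", " " * tab_width)
    # Task 4: capitalise the letter that follows '.', '!' or '?'.
    line = SENTENCE_START.sub(lambda m: m.group(1) + m.group(2).upper(), line)
    # Task 1: drop whitespace at the end of the line (done last on purpose).
    return TRAILING_WS.sub("", line)


def tidy_text(text: str, tab_width: int = 4) -> str:
    """Apply tidy_line to every line, keeping the line breaks as they were."""
    return "\n".join(tidy_line(line, tab_width) for line in text.split("\n"))


if __name__ == "__main__":
    sample = "hello,world.this is\ta test!really?  \t"
    print(repr(tidy_line(sample)))
```

As with the sed commands, keep a copy of the original document and run this on the copy first. Trailing whitespace is removed last so that the earlier substitutions cannot leave a stray space at the end of a line.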