|
From: Łukasz S. <luk...@ie...> - 2010-03-28 19:40:19
|
Hello.
grep from MSYS includes CR as part of a pattern read from a file with
the -f option. This seems to be a rather obvious bug, doesn't it? Is
there any magic trick to overcome it?
--
Have a nice day,
Łukasz Stelmach
|
|
From: Earnie <ea...@us...> - 2010-03-29 12:31:15
|
Łukasz Stelmach wrote:
> Hello.
>
> grep from MSYS includes CR as a part of pattern read from a file with a
> -f option. This seems to be rather obvious bug, doesn't it? Is there any
> magic trick to overcome it?
>
cat foo | tr -d "\r" > bar && mv bar foo
If the file contains CR then that is a character in the file, and MSYS
provides a pseudo-UNIX environment which reads the CR as a character.
CRLF line endings and MSYS do not mix well. They never have, and MSYS
has been around for more years than I remember.
--
Earnie -- http://www.for-my-kids.com
|
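As an illustration of the workaround above, here is a minimal sketch that strips the CRs into a cleaned copy of the pattern file and then points grep -f at that copy. The file names (patterns.txt, patterns.unix, input.txt) and the scratch directory are hypothetical, and mktemp -d is assumed to be available.

```shell
# Scratch directory and sample files; all names here are hypothetical.
work=$(mktemp -d)
printf 'foo\r\nbar\r\n' > "$work/patterns.txt"   # CRLF-terminated patterns
printf 'foo\nbaz\n' > "$work/input.txt"          # LF-terminated input

# Delete every CR byte, writing a cleaned copy of the pattern file.
tr -d '\r' < "$work/patterns.txt" > "$work/patterns.unix"

# grep now sees the patterns "foo" and "bar" without a trailing CR.
matches=$(grep -f "$work/patterns.unix" "$work/input.txt")
echo "$matches"
```

Note that tr -d '\r' removes every CR in the file, not only those at line ends, which is normally what is wanted for a pattern file.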
|
From: Keith M. <kei...@us...> - 2010-03-29 18:26:40
|
On Sunday 28 March 2010 20:26:50 Łukasz Stelmach wrote:
> grep from MSYS includes CR as a part of pattern read from a file
> with a -f option. This seems to be rather obvious bug, doesn't it?
No, not really. All MSYS tools are designed to process Unix format
files; IIRC, it was a conscious design decision that any CR present
in any input file would be considered significant.
> Is there any magic trick to overcome it?
Use dos2unix, (or the d2u filter), to convert your pattern file to
the proper format.
--
Regards,
Keith.
|
|
From: Erwin W. <wat...@xs...> - 2010-03-30 07:23:16
|
On 29-03-10 13:43, Keith Marshall wrote:
>> Is there any magic trick to overcome it?
>>
>
> Use dos2unix, (or the d2u filter), to convert your pattern file to
> the proper format.
>
>
You can get the latest version of dos2unix at
http://www.xs4all.nl/~waterlan/dos2unix.html
Erwin
|
|
From: Keith M. <kei...@us...> - 2010-03-30 20:59:12
|
On Tuesday 30 March 2010 08:23:07 Erwin Waterlander wrote:
> > Use dos2unix, (or the d2u filter), to convert your pattern file
> > to the proper format.
> >
> >
>
> You can get the latest version of dos2unix at
> http://www.xs4all.nl/~waterlan/dos2unix.html
What's wrong with the version we provide ourselves? (You will find
it in the `MSYS cygutils' package). You will also find no-nonsense
script implementations of `d2u' (and its complementary `u2d' filter)
in the base package set from MSYS-1.0.11 onwards.
--
Regards,
Keith.
|
|
From: Erwin W. <wat...@xs...> - 2010-03-31 08:44:47
|
On 30-03-10 22:59, Keith Marshall wrote:
> What's wrong with the version we provide ourselves? (You will find
> it in the `MSYS cygutils' package). You will also find no-nonsense
> script implementations of `d2u' (and its complementary `u2d' filter)
> in the base package set from MSYS-1.0.11 onwards.
>
>
There is nothing wrong. It was just a tip. It's nice to have the same
command-line options as on Unix/Linux. Other alternatives are listed
as well.
|
|
From: Keith M. <kei...@us...> - 2010-03-31 21:20:14
|
On Wednesday 31 March 2010 09:44:38 Erwin Waterlander wrote:
> It's nice to have the same command-line options [for dos2unix]
> as on Unix/Linux.
AFAIK there is no standard for dos2unix, or unix2dos, on Unix; POSIX
certainly does not specify either. In fact, it is doubtful if many
commercial Unixes provide such tools. Solaris/SunOS does, but its
implementation is completely incompatible, in terms of invocation
syntax, with the variant common on GNU/Linux, which in turn appears
to be quite different from the Cygwin implementation. HP-UX has a
similar tool, called dos2ux, but just how similar, I do not know.
There is one tool, which is guaranteed to be both ubiquitous and
consistently portable, for converting between MS-Windows CRLF and
Unix LF line endings -- the venerable `awk'; this is precisely what
the `d2u' and `u2d' filters in recent MSYS base use to achieve the
conversion.
--
Regards,
Keith.
|
|
From: Charles W. <cwi...@us...> - 2010-03-31 22:23:18
|
On 3/31/2010 5:19 PM, Keith Marshall wrote:
> There is one tool, which is guaranteed to be both ubiquitous and
> consistently portable, for converting between MS-Windows CRLF and
> Unix LF line endings -- the venerable `awk'; this is precisely what
> the `d2u' and 'u2d' filters in recent MSYS base use to achieve the
> conversion.
Um, Keith -- stop saying that.
The script-based filters are no longer provided in "recent" MSYS.
AFAICT, they were removed in 1.0.12. Instead, the
cygutils-dos2unix-msys and cygutils-dos2unix-mingw32 packages
provide all of the desired functionality, in both msys-aware and
"normal" win32 modes. Even filtering.
Search your mail archives for an (offlist) message titled:
"lpr and cygutils" back in August, 2009. Also, see:
http://article.gmane.org/gmane.comp.gnu.mingw.devel/3452
Updated packages available (wave 6)
> MSYS cygutils also provides implementations of dos2unix and unix2dos
> (u2d, d2u) that supplant the u2d and d2u shell scripts installed by
> msysCORE, as well as the dos2unix and unix2dos executables provided by
> mingw-utils-0.3. These new versions are, obviously, MSYS-dependent.
--
Chuck
|
|
From: Keith M. <kei...@us...> - 2010-04-01 20:36:06
|
On Wednesday 31 March 2010 23:22:41 Charles Wilson wrote:
> On 3/31/2010 5:19 PM, Keith Marshall wrote:
> > There is one tool, which is guaranteed to be both ubiquitous and
> > consistently portable, for converting between MS-Windows CRLF
> > and Unix LF line endings -- the venerable `awk'; this is
> > precisely what the `d2u' and 'u2d' filters in recent MSYS base
> > use to achieve the conversion.
>
> Um, Keith -- stop saying that.
>
> The script-based filters are no longer provided in "recent" MSYS.
> AFAICT, they were removed in 1.0.12. Instead, the
> cygutils-dos2unix-msys and cygutils-dos2unix-mingw32 packages
> provide all of the desired functionality, in both msys-aware and
> "normal" win32 modes. Even filtering.
Sorry; perhaps I didn't fully appreciate the significance of...
> Search your mail archives for an (offlist) message titled:
> "lpr and cygutils" back in August, 2009. Also, see:
>
> http://article.gmane.org/gmane.comp.gnu.mingw.devel/3452
> Updated packages available (wave 6)
>
> > MSYS cygutils also provides implementations of dos2unix and
> > unix2dos (u2d, d2u) that supplant the u2d and d2u shell scripts
> > installed by msysCORE, as well as the dos2unix and unix2dos
> > executables provided by mingw-utils-0.3. These new versions
> > are, obviously, MSYS-dependent.
I had thought that the basic scripts were to be supplanted only for
those users who CHOSE to install the cygutils add-on package, and
that those users who didn't adopt this choice would still have the
capability to filter MS-DOS style line endings using the scripts.
Personally, I can't see what benefit might accrue from the extra
bloat of a (somewhat non-portable) .exe, when a ubiquitous filter
program achieves the desired effect, in a more universally portable
manner; the effect of dos2unix is as simple[1] as:
  $ awk '{sub("\r$",""); print}' dosfile > unixfile
(or the completely equivalent):
  $ awk '{sub("\r$",""); printf "%s\n", $0}' dosfile > unixfile
while unix2dos yields the equivalent of:
  $ awk '{sub("\r$",""); printf "%s\r\n", $0}' unixfile > dosfile
and IMO, the not-so-standardised options provided by dos2unix and
unix2dos, (in the absence of standardisation), are little more than
window dressing.
[1] Of course, this simple implementation depends on having an awk
which doesn't have CRLF line endings mangled behind its back by the
platform runtime library's I/O subsystem, (a typically Microsoft
(mis)feature which, thankfully, MSYS' awk does not exhibit).
--
Regards,
Keith.
|
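The awk one-liners in the message above can be wrapped as stdin-to-stdout filters. What follows is a minimal sketch, assuming a POSIX shell and an awk that passes CR bytes through untouched; the function bodies are illustrative and are not claimed to be the scripts that MSYS actually shipped.

```shell
# d2u: filter stdin to stdout, dropping one trailing CR per line (CRLF -> LF).
d2u() {
    awk '{ sub("\r$", ""); print }'
}

# u2d: filter stdin to stdout, ending every line with CRLF.
# Any existing trailing CR is removed first, so CRLF input is not doubled.
u2d() {
    awk '{ sub("\r$", ""); printf "%s\r\n", $0 }'
}
```

Typical use would be `d2u < dosfile > unixfile` or as a stage in a pipeline. Both functions read only standard input, which matches the filter-only role described for the d2u and u2d names in this thread.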
|
From: Charles W. <cwi...@us...> - 2010-04-02 03:24:49
|
On 4/1/2010 9:53 AM, Keith Marshall wrote:
> On Wednesday 31 March 2010 23:22:41 Charles Wilson wrote:
>> Search your mail archives for an (offlist) message titled:
>> "lpr and cygutils" back in August, 2009. Also, see:
>>
>> http://article.gmane.org/gmane.comp.gnu.mingw.devel/3452
>> Updated packages available (wave 6)
> I had thought that the basic scripts were to be supplanted only for
> those users who CHOSE to install the cygutils add-on package, and
> that those users who didn't adopt this choice would still have the
> capability to filter MS-DOS style line endings using the scripts.
Well...sure. Anybody can copy that line into a file, name it "d2u",
and they're off to the races...
> Personally, I can't see what benefit might accrue from the extra
> bloat of a (somewhat non-portable) .exe, when a ubiquitous filter
> program achieves the desired effect, in a more universally portable
> manner; the effect of dos2unix is as simple[1] as:
>
> $ awk '{sub("\r$",""); print}' dosfile > unixfile
> and IMO, the not-so-standardised options provided by dos2unix and
> unix2dos, (in the absence of standardisation), are little more than
> window dressing.
Well, let's be honest: nobody uses the options anyway. They either
use it as a filter, or as a simple command ('d2u *.txt').
It's that latter use -- operating on multiple source files, and
replacing in-place -- that the scripts don't support, without
additional complexity (which msys's implementations never had, IIRC).
> [1] Of course, this simple implementation depends on having an awk
> which doesn't have CRLF line endings mangled behind its back by the
> platform runtime library's I/O subsystem, (a typically Microsoft
> (mis)feature which, thankfully, MSYS' awk does not exhibit).
But this is the real reason. There HAVE been cases in the past, on
both cygwin and msys, where '>' redirection operated in text mode,
or -- for cygwin -- depended on the mount type of the destination
drive.
Happily those days are in the past -- and I *hope* they stay in the
past. But pragmatically, I'd rather not bet on it. By using an
actual application, you can *guarantee* that setmode(binary) happens
on the target file, and that the app is *completely* in control of
every byte that gets written to the output.
--
Chuck
|
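To make the "additional complexity" of the multi-file, in-place case concrete, here is a rough sketch of a shell wrapper around the same awk filter. The name d2u_inplace is hypothetical (it is not an MSYS or cygutils command), and mktemp is assumed to be available and to accept a template path.

```shell
# d2u_inplace FILE...: convert each named file from CRLF to LF in place.
# Hypothetical helper for illustration only.
d2u_inplace() {
    for d2u_file in "$@"; do
        # Write to a temporary file next to the target, then rename over it,
        # so a failed conversion never leaves a truncated original behind.
        d2u_tmp=$(mktemp "${d2u_file}.XXXXXX") || return 1
        if awk '{ sub("\r$", ""); print }' "$d2u_file" > "$d2u_tmp"; then
            mv "$d2u_tmp" "$d2u_file" || return 1
        else
            rm -f "$d2u_tmp"
            return 1
        fi
    done
}
```

This still relies on the shell's '>' redirection leaving bytes untouched, so it does not provide the binary-mode guarantee described in the message above; it also replaces the file rather than rewriting it, so permissions and ownership may differ from the original.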
|
From: Keith M. <kei...@us...> - 2010-04-02 08:25:21
|
On Friday 02 April 2010 04:24:37 Charles Wilson wrote:
> > ... the effect of dos2unix is as simple[1] as:
> > $ awk '{sub("\r$",""); print}' dosfile > unixfile
> >
> > and IMO, the not-so-standardised options provided by dos2unix
> > and unix2dos, (in the absence of standardisation), are little
> > more than window dressing.
>
> Well, let's be honest: nobody uses the options anyway. They either
> use it as a filter, or as a simple command ('d2u *.txt').
>
> It's that latter use -- operating on multiple source files, and
> replacing in-place -- that the scripts don't support, without
> additional complexity (which msys's implementations never had,
> IIRC).
No, they didn't. I realise that I'm probably in a minority group
here, but personally I've never found that capability to be useful,
(mostly because I have to work on commercial Unixes which don't
support it anyway, and with their platform specific implementations
it could be downright dangerous to anticipate it; e.g. on Solaris
IIRC, dos2unix requires exactly two mandatory file name arguments,
the first of which is the input file and the second the *output*;
if your `dos2unix *.txt' just happened to match exactly two files
there, then the second is toast). However, I do appreciate that
there are many who will expect, and even rely on this capability
being supported.
When Cesar and I originally discussed the inclusion of the scripts,
we understood these two distinct usage cases. We felt that the two
scripts, which we would call d2u and u2d were sufficiently minimal
to be incorporated into an MSYS Base distribution, and that these
names would be RESERVED for a FILTER ONLY mode of operation; those
who wanted the multi-file in-place conversion could download the
mingw-utils, (as the providing package was at the time), and install
the dos2unix.exe and unix2dos.exe applications, (which should NOT
supplant d2u and u2d). Seems like, with cygutils, that original
intent has now been superseded.
> > [1] Of course, this simple implementation depends on having an
> > awk which doesn't have CRLF line endings mangled behind its back
> > by the platform runtime library's I/O subsystem, (a typically
> > Microsoft (mis)feature which, thankfully, MSYS' awk does not
> > exhibit).
>
> But this is the real reason. There HAVE been cases in the past,
> on both cygwin and msys, where '>' redirection operated in text
> mode, or -- for cygwin -- depended on the mount type of the
> destination drive. Happily those days are in the past -- and I
> *hope* they stay in the past.
Amen.
> But pragmatically, I'd rather not bet on it.
I don't recall ever having encountered such a problem with MSYS; of
course, your experience may differ.
> By using an actual application, you can *guarantee* that
> setmode(binary) happens on the target file, and that the app is
> *completely* in control of every byte that gets written to the
> output.
Sure.
--
Regards,
Keith.
|
|
From: Łukasz S. <luk...@ie...> - 2010-04-04 00:01:42
|
Earnie <ea...@us...> writes:
> Łukasz Stelmach wrote:
>> Hello.
>>
>> grep from MSYS includes CR as a part of pattern read from a file with a
>> -f option. This seems to be rather obvious bug, doesn't it? Is there any
>> magic trick to overcome it?
>>
>
> cat foo | tr -d "\r" > bar && mv bar foo
>
> If the file contains CR then that is a character in the file and MSYS
> provides a pseudo UNIX environment which reads the CR as a character.
> CRLF line endings and MSYS do not mix well. Never has and MSYS has been
> around for more years than I remember.
Thanks. I was sure I had read somewhere that the mingw C runtime/libc
(I am not very familiar with these bits under Windows) supports the
"\n" <-> "\r\n" conversion automatically. That's why grep's behaviour
seemed odd to me. But if that is just the way things work, it's OK.
--
Have a nice day,
Łukasz Stelmach
|
|
From: Tor L. <tm...@ik...> - 2010-04-04 04:42:50
|
> Thanks. I was sure I've read somewhere that mingw C runtime/libc
But this thread is about the *MSYS* grep. By definition it doesn't use
the "mingw C runtime" (i.e., a Microsoft C library).
--tml
|
|
From: Łukasz S. <luk...@ie...> - 2010-04-09 21:00:01
|
Tor Lillqvist <tm...@ik...> writes:
>> Thanks. I was sure I've read somewhere that mingw C runtime/libc
>
> But this thread is about the *MSYS* grep. By definition it doesn't use
> the "mingw C runtime" (i.e., a Microsoft C library).
I'm sorry for the mistake. I don't use Windows very often; I found
MSYS on the mingw webpage, and that is probably the source of my false
assumption about their deeper connection. Thanks for the information.
--
Have a nice day,
Łukasz Stelmach
|