From: Jeff H. <Je...@Je...> - 2020-07-01 03:19:29
|
<html> <head> <meta http-equiv="content-type" content="text/html; charset=UTF-8"> </head> <body> <p>I have added the CONTEXT <i>number</i> option. This reports not only each matching record but also <i>number</i> records before and after it. BEFORE and AFTER are also added, to get context records in one direction only. An optional SEPARATOR sets off the groups of records. It defaults to "--".</p> <p> <blockquote type="cite"><tt>/** regex<br> <br> >>--<b>REGEX</b>--+--------------------------+--<i>regex_string</i>-(1)---><<br> +-(--| <i>options_string</i> |--)-+<br> <br> <b>options_string</b>:<br> +----------------------------+<br> |--v-+------------------------+-+--|<br> +-NUMBERS----------------+ (2)<br> +-BEFORE-+-<i>0</i>------+------+ (3)<br> | +-<i>number</i>-+ |<br> +-AFTER-+-<i>0</i>------+-------+ (3)<br> | +-<i>number</i>-+ |<br> +-CONTEXT-+-<i>0</i>------+-----+ (4)<br> | +-<i>number</i>-+ |<br> +-NOSEPARATOR------------+<br> +-SEPARATOR-+- -- ----+--+<br> | +-<i>DString</i>-+ |<br> <br> Records matching the RegEx are put out on primary output.<br> Records not matching are put out on secondary, if connected, or discarded.<br> <br> (1) string is a Java RegEx expression. A null string passes all records.<br> (2) lines are prefaced with line number, 10 characters, right justified<br> (3) number of records put out before (BEFORE) or after (AFTER) a matching record<br> (4) number of records put out before and after a matching record<br> <br> */<br> </tt></blockquote> <br> Functionally, this brings it almost up to GNU GREP 3.4 (minus all of its file input options).</p> <p>A few things for discussion:</p> <ul> <li>Would it ever help to put matched records out on the tertiary stream without the context records? Unmatched ones are already on the secondary.</li> <li>GREP has an option to just put out the COUNT of matched records. Do we have any use for this? (REGEX string | COUNT LINES does the same.)<br> </li> <li>Possibly the regex_string should be a delimited string. 
This is because a potential REGEX_CHANGE stage would have two delimited strings.</li> <li>Should this be named GREP? Which term would be more familiar to our users? Both, with one an alias of the other?</li> </ul> (Oops, just spotted a bug: BEFORE, etc., without a number works only if it is the last option. Something to fix in the morning.)<br> <p>Jeff<br> </p> </body> </html> |
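
For readers unfamiliar with grep-style context output, here is a minimal Java sketch of the BEFORE/AFTER/SEPARATOR behaviour described above. It is not the stage's actual code: the class name ContextGrep and the method grep are hypothetical, it works on an in-memory list rather than pipeline streams, and it assumes a record matches when the RegEx is found anywhere in it (Matcher.find) and that the separator is written only when some context was requested, as GNU grep does. CONTEXT n would correspond to passing n for both before and after.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical illustration only; not the REGEX stage implementation.
public class ContextGrep {

    // Returns the matching records plus up to `before` records ahead of each
    // match and up to `after` records behind it. Non-adjacent groups are set
    // off by `separator`; pass null for the NOSEPARATOR behaviour.
    public static List<String> grep(List<String> records, String regex,
                                    int before, int after, String separator) {
        Pattern pattern = Pattern.compile(regex);
        List<String> out = new ArrayList<>();
        boolean wantContext = before > 0 || after > 0; // assumption: as in GNU grep
        int lastEmitted = -1; // index of the last record already written
        int afterUntil = -1;  // trailing context runs through this index

        for (int i = 0; i < records.size(); i++) {
            if (pattern.matcher(records.get(i)).find()) {
                // Leading context, never repeating records already written.
                int start = Math.max(i - before, lastEmitted + 1);
                // A gap since the previous group calls for a separator.
                if (wantContext && separator != null
                        && lastEmitted >= 0 && start > lastEmitted + 1) {
                    out.add(separator);
                }
                for (int j = start; j <= i; j++) {
                    out.add(records.get(j));
                }
                lastEmitted = i;
                afterUntil = i + after;
            } else if (i <= afterUntil) {
                // Trailing context of the most recent match.
                out.add(records.get(i));
                lastEmitted = i;
            }
        }
        return out;
    }
}
```

With the records a, b, X1, c, d, e, f, X2, g, the call ContextGrep.grep(records, "X", 1, 1, "--") returns b, X1, c, --, f, X2, g. An empty regex string matches every record, which agrees with note (1) in the syntax diagram.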