From: Colin <net...@im...> - 2020-08-02 05:13:21
I've been away for a while and am just catching up on some old emails.

If I read the diagram correctly, I could say BEFORE <n> AFTER <m> to get the same effect as CONTEXT. Is that a fair statement, or am I misunderstanding the CONTEXT option?

To your questions, here are my 2 cents worth, if I am not too late to chime in. . . .

* Would it ever help to put matched records out on the tertiary stream without the context records? Unmatched ones are already on the secondary.

  Interesting idea. At the moment I don't think I have an opinion either way. As the number of streams increases, does it increase the complexity and hurt the readability of a PIPE?

* GREP has an option to just put out the COUNT of matched records. Do we have any use for this? ( REGEX string | COUNT LINES does the same.)

  Would there be a huge performance benefit to providing this functionality? If not, then my vote leans towards "no". REGEX string | COUNT LINES should suffice.

* Possibly the regex_string should be a delimited string. This because a potential REGEX_CHANGE stage would have two delimited strings.

  I would have thought that regex_string would have to be a delimited string. Wouldn't you need to handle the case of a blank in the middle of your regex_string? E.g.

     regex a b c
  vs
     regex / a b c/     find all strings (space)a(space)b(space)c

* Should this be named GREP? Which term would be more or less familiar to our users? Both? with one an alias of the other?

  Personally, my preference is for REGEX_MATCH and REGEX_CHANGE, since that more closely matches (pun intended) how the matching takes place than GREP does. Having one the alias of the other works too.

Just my thoughts from the rookie. :-)

Cheers
Colin K

On 2020-06-30 20:19, Jeff Hennick wrote:
>
> I have added the CONTEXT /number/ option. This reports not only the
> matching record, but some before and after it also. Also added are
> BEFORE and AFTER to get contextual records in one direction.
> There is an optional SEPARATOR to set off the groups of records. It
> defaults to "--".
>
>> /** regex
>>
>> >>--*REGEX*--+--------------------------+--/regex_string/-(1)---><
>>              +-(--| /options_string/ |--)-+
>>
>> *options_string*:
>>    +----------------------------+
>> |--v-+------------------------+-+--|
>>      +-NUMBERS----------------+      (2)
>>      +-BEFORE-+-/0/------+------+    (3)
>>      |        +-/number/-+      |
>>      +-AFTER-+-/0/------+-------+    (3)
>>      |       +-/number/-+       |
>>      +-CONTEXT-+-/0/------+-----+    (4)
>>      |         +-/number/-+     |
>>      +-NOSEPARATOR------------+
>>      +-SEPARATOR-+- -- ----+--+
>>      |           +-/DString/-+ |
>>
>> Records matching the RegEx are put out on primary output.
>> Records not matching are put out on secondary, if connected, or
>> discarded.
>>
>> (1) string is a Java RegEx expression. A null string passes all records.
>> (2) lines are prefaced with line number, 10 characters, right justified
>> (3) number of records put out before (BEFORE) or after (AFTER) a
>>     matching record
>> (4) number of records put out before and after a matching record
>>
>> */
>
> This brings it up, functionally, almost to GNU GREP 3.4 (minus all of
> its file input options).
>
> A few things for discussion:
>
>   * Would it ever help to put matched records out on the tertiary
>     stream without the context records? Unmatched ones are already on
>     the secondary.
>   * GREP has an option to just put out the COUNT of matched records.
>     Do we have any use for this? ( REGEX string | COUNT LINES does
>     the same.)
>   * Possibly the regex_string should be a delimited string. This
>     because a potential REGEX_CHANGE stage would have two delimited
>     strings.
>   * Should this be named GREP? Which term would be more or less
>     familiar to our users? Both? with one an alias of the other?
>
> (Oops, just spotted a bug: BEFORE, etc, without a number works if it
> is the last option, but not otherwise. Something for the morning fix.)
>
> Jeff
>
>
>
> _______________________________________________
> netrexx-pipelines mailing list
> net...@li...
> https://lists.sourceforge.net/lists/listinfo/netrexx-pipelines
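
For anyone following the thread, here is a minimal Java sketch of how the BEFORE / AFTER / CONTEXT options discussed above could behave. It is not the actual REGEX stage code. The class and method names (ContextMatch, select, context) are hypothetical, and it assumes grep-like semantics: a record matches if the pattern is found anywhere in it, CONTEXT n is treated as BEFORE n AFTER n, and the separator is written only between groups that are not adjacent in the input.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical illustration only; not the NetRexx Pipelines REGEX stage.
class ContextMatch {

    // Returns each matching record together with up to `before` records
    // ahead of it and up to `after` records behind it. Groups that are not
    // adjacent in the input are set off by `separator` (null = NOSEPARATOR).
    static List<String> select(List<String> records, String regex,
                               int before, int after, String separator) {
        Pattern pattern = Pattern.compile(regex);
        boolean[] keep = new boolean[records.size()];

        // Pass 1: mark every match and its surrounding context window.
        for (int i = 0; i < records.size(); i++) {
            if (pattern.matcher(records.get(i)).find()) {
                int lo = Math.max(0, i - before);
                int hi = Math.min(records.size() - 1, i + after);
                for (int j = lo; j <= hi; j++) {
                    keep[j] = true;
                }
            }
        }

        // Pass 2: emit marked records, adding a separator at each gap.
        List<String> out = new ArrayList<>();
        int last = -1; // index of the last record emitted, -1 if none yet
        for (int i = 0; i < keep.length; i++) {
            if (!keep[i]) {
                continue;
            }
            if (separator != null && last >= 0 && i > last + 1) {
                out.add(separator);
            }
            out.add(records.get(i));
            last = i;
        }
        return out;
    }

    // Assumption: CONTEXT n is shorthand for BEFORE n AFTER n.
    static List<String> context(List<String> records, String regex,
                                int n, String separator) {
        return select(records, regex, n, n, separator);
    }

    public static void main(String[] args) {
        List<String> in = List.of("one", "two", "x a b c", "four",
                                  "five", "six", "seven a b c", "eight");
        // The pattern starts with a blank, which is the case a delimited
        // regex_string such as / a b c/ would be needed for.
        select(in, " a b c", 1, 1, "--").forEach(System.out::println);
    }
}
```

Under these assumptions, the main method prints two groups of three records each (the match plus one record either side), with "--" between them. An empty pattern is found in every record, which lines up with note (1) that a null string passes all records.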