I have the following problem.
pattern: (<.*?>){1,} or (<[^>]*>){1,}
text: <x1><x2><x3><x4>
I expect Matcher.groups() to return a String array of length 4 containing the values <x1> through <x4>.
What's wrong? Please help me.
jr-ramses@gmx.net
Hello, jr-ramses!
Your understanding of groups doesn't seem to be quite right.
An expression can have several groups (hence the plural in 'groups()'), but after a successful match each group holds either a single value or no value at all.
For example, the regex "(.).(.)" has two groups. If we match it against "abc", these groups capture "a" and "c" respectively, so 'groups()' would return {"a","c"}.
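To make the two-group example concrete, here is a minimal sketch. It assumes the standard java.util.regex package as a stand-in, not the library discussed in this thread, so there is no 'groups()' call; the class GroupsDemo and the helper captureGroups are hypothetical names that collect group(1)..group(n) by hand.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; java.util.regex is used as a stand-in (an assumption).
class GroupsDemo {
    // Returns the value of every capturing group (1..groupCount) after a
    // full match, or null if the text does not match the regex at all.
    static String[] captureGroups(String regex, String text) {
        Matcher m = Pattern.compile(regex).matcher(text);
        if (!m.matches()) {
            return null;
        }
        String[] values = new String[m.groupCount()];
        for (int i = 1; i <= m.groupCount(); i++) {
            values[i - 1] = m.group(i); // one value per group, never a list
        }
        return values;
    }

    public static void main(String[] args) {
        // Two groups in the pattern, so two values: [a, c]
        System.out.println(Arrays.toString(captureGroups("(.).(.)", "abc")));
    }
}
```

The array length is fixed by the number of groups in the pattern, not by the length of the text.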
When a group is quantified (as in your case), it just remembers the last captured substring. For example, if we match the regex "(.)+" against "abcde", the matcher would pass through the following states in succession:
- [(a)]bcde
- [a(b)]cde
- [ab(c)]de
- [abc(d)]e
- [abcd(e)]
- report a match, match="abcde", group1="e"
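The trace above can be checked with a short sketch. Again this assumes java.util.regex as a stand-in for the library in question; the class QuantifiedGroupDemo and the method lastCapture are hypothetical names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; java.util.regex is used as a stand-in (an assumption).
class QuantifiedGroupDemo {
    // Matches the whole text and returns { entire match, value of group 1 },
    // or null if the text does not match.
    static String[] lastCapture(String regex, String text) {
        Matcher m = Pattern.compile(regex).matcher(text);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(0), m.group(1) };
    }

    public static void main(String[] args) {
        String[] r = lastCapture("(.)+", "abcde");
        // The group was entered five times but keeps only the last capture.
        System.out.println("match=" + r[0] + ", group1=" + r[1]); // match=abcde, group1=e
    }
}
```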
In your case:
- remove the quantifier from the expression (use "(<.*?>)" or "(<[^>]*>)")
- iterate through all the occurrences of the pattern (using the 'find()' method) and collect the value of the first group from each.
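The two steps above can be sketched as follows. This is only an illustration under the assumption that java.util.regex behaves like the library in question for 'find()' and 'group(1)'; the class FindAllDemo and the method collectFirstGroup are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; java.util.regex is used as a stand-in (an assumption).
class FindAllDemo {
    // Scans the text for successive occurrences of the pattern and
    // collects the value of group 1 from each occurrence.
    static List<String> collectFirstGroup(String regex, String text) {
        Matcher m = Pattern.compile(regex).matcher(text);
        List<String> values = new ArrayList<>();
        while (m.find()) {
            values.add(m.group(1));
        }
        return values;
    }

    public static void main(String[] args) {
        // The quantifier is removed; find() does the repetition instead.
        List<String> tags = collectFirstGroup("(<[^>]*>)", "<x1><x2><x3><x4>");
        System.out.println(tags); // [<x1>, <x2>, <x3>, <x4>]
    }
}
```

Each call to find() yields one occurrence, so the list grows with the text, which is what the quantified group could not do.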
Regards