I'm trying to extract the text nodes of a region while excluding the nodes that fall within a specific subregion. For example, in:
<text id=42 lang="English"> <s>An easy example.</s><s> Another <i>very</i> easy example.</s> <s><b>O</b>nly the <b>ea</b>siest ex<b>a</b>mples!</s></text>
I'd like to extract all the words except those inside "<i>...</i>" elements. I tried an approach based on the "diff" command:
A = /region[text];
B = /region[i];
C = diff A B;
but "diff" is a set operation, so it doesn't work as I expected: it returns the same set as A.
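To illustrate what I think is happening and what I actually want, here is a small Python sketch (not CQP). It assumes that each match is a (start, end) pair of token positions and that "diff" compares whole pairs. The positions are made up from my own tokenisation of the example above (14 tokens, with "very" at position 5), and the function names are hypothetical.

```python
def range_diff(a, b):
    """Set difference on whole (start, end) ranges, as I assume "diff" does."""
    return sorted(set(a) - set(b))


def subtract_regions(a, b):
    """Remove every position covered by b from the ranges in a, splitting them."""
    excluded = set()
    for start, end in b:
        excluded.update(range(start, end + 1))

    result = []
    for start, end in a:
        run_start = None  # start of the current run of kept positions
        for pos in range(start, end + 1):
            if pos in excluded:
                if run_start is not None:
                    result.append((run_start, pos - 1))
                    run_start = None
            elif run_start is None:
                run_start = pos
        if run_start is not None:
            result.append((run_start, end))
    return result


# Assumed token positions: the <text> region spans 0..13, <i>very</i> is token 5.
A = [(0, 13)]
B = [(5, 5)]

print(range_diff(A, B))        # [(0, 13)]
print(subtract_regions(A, B))  # [(0, 4), (6, 13)]
```

The first result is what I get (A unchanged, since no range in B is identical to a range in A); the second is what I am after.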
Could anyone tell me whether what I'd like to do is possible and, if so, how I could do it?
Thanks in advance.