Context areas

Oleg Osepyants

It is often necessary to apply an Rx statement not to the entire text but only to certain parts of it. A typical example is removing line breaks (<br />) inside paragraphs (that is, between <p> and </p>) of an HTML document. For such cases, the Rx language lets you set a context area for any statement. The statement's scope can be either all context areas found in the text or, conversely, all text outside those areas.

To restrict a statement to the text WITHIN the context areas, append "INSIDE OF <ContextArea>" to the end of the statement, where <ContextArea> is a string (usually a regular expression) that defines the context area. To make the scope everything outside the context areas, use "OUTSIDE OF <ContextArea>" instead.
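For readers who think in a general-purpose language, the scoping rule described above can be approximated in Python. This is only a sketch of the semantics as stated on this page, not how Rx is implemented; the function name `replace_in_context` and its parameters are hypothetical.

```python
import re

def replace_in_context(text, target, replacement, area, inside=True):
    """Replace the literal `target` only inside (or only outside) the
    parts of `text` matched by the regular expression `area`.

    Hypothetical helper that mimics INSIDE OF / OUTSIDE OF; not part of Rx.
    """
    out = []
    pos = 0
    for m in re.finditer(area, text):
        outside_chunk = text[pos:m.start()]  # text before this context area
        inside_chunk = m.group(0)            # the context area itself
        if inside:
            inside_chunk = inside_chunk.replace(target, replacement)
        else:
            outside_chunk = outside_chunk.replace(target, replacement)
        out.append(outside_chunk)
        out.append(inside_chunk)
        pos = m.end()
    tail = text[pos:]                        # text after the last context area
    if not inside:
        tail = tail.replace(target, replacement)
    out.append(tail)
    return "".join(out)

sample = "a-b [c-d] e-f"
print(replace_in_context(sample, "-", "+", r"\[.*?\]", inside=True))   # a-b [c+d] e-f
print(replace_in_context(sample, "-", "+", r"\[.*?\]", inside=False))  # a+b [c-d] e+f
```

In this sketch, a target that straddles the boundary of a context area is not replaced in either mode. Whether Rx behaves the same way is not stated here.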

Examples

Example 1

Replace every <br /> inside the paragraphs of an HTML document with a space:

REPLACE P"<br />" TO P" " INSIDE OF R"(?<=<p>).+?(?=</p>)"

Original text

<html><head><title>Example document</title></head>
<body>
  <h1>Example document.<br />Just a few lines of text.</h1>
  <p>The first paragraph of<br />text with line break.</p>
  <p>The next paragraph<br />of text with<br />two line breaks.</p>
</body>
</html>

The text after applying the statement above

<html><head><title>Example document</title></head>
<body>
  <h1>Example document.<br />Just a few lines of text.</h1>
  <p>The first paragraph of text with line break.</p>
  <p>The next paragraph of text with two line breaks.</p>
</body>
</html>
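As a point of comparison, the effect of Example 1 can be reproduced in Python with `re.sub` and a callback. This is a sketch under the assumption that the context-area pattern behaves like a Python regular expression; the variable names are illustrative, and the sample text is a shortened version of the document above.

```python
import re

html = (
    "<h1>Example document.<br />Just a few lines of text.</h1>\n"
    "<p>The first paragraph of<br />text with line break.</p>\n"
    "<p>The next paragraph<br />of text with<br />two line breaks.</p>\n"
)

# Same pattern as the context area in Example 1: the text between <p> and </p>.
area = re.compile(r"(?<=<p>).+?(?=</p>)")

def strip_breaks(match):
    # Only the matched paragraph body is rewritten; everything else is untouched.
    return match.group(0).replace("<br />", " ")

result = area.sub(strip_breaks, html)
print(result)
```

In Python, `.` does not match a newline unless `re.DOTALL` is set, so this pattern only covers paragraphs that sit on a single line. How Rx treats newlines in a context area is not covered on this page.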

Example 2

Now the reverse: apply the same replacement everywhere outside the paragraphs.

REPLACE P"<br />" TO P" " OUTSIDE OF R"<p>.*?</p>"

Original text

<html><head><title>Example document</title></head>
<body>
  <h1>Example document.<br />Just a few lines of text.</h1>
  <p>The first paragraph of<br />text with line break.</p>
  <p>The next paragraph<br />of text with<br />two line breaks.</p>
</body>
</html>

The text after applying the statement above

<html><head><title>Example document</title></head>
<body>
  <h1>Example document. Just a few lines of text.</h1>
  <p>The first paragraph of<br />text with line break.</p>
  <p>The next paragraph<br />of text with<br />two line breaks.</p>
</body>
</html>
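The OUTSIDE OF behavior of Example 2 can be imitated in Python by splitting the text on the context areas. This is a sketch, assuming the context-area pattern behaves like a Python regular expression; it is not a description of how Rx works internally, and the sample text is a shortened version of the document above.

```python
import re

html = (
    "<h1>Example document.<br />Just a few lines of text.</h1>\n"
    "<p>The first paragraph of<br />text with line break.</p>\n"
    "<p>The next paragraph<br />of text with<br />two line breaks.</p>\n"
)

# A single capturing group makes re.split keep the matched context areas:
# even indexes hold text outside the areas, odd indexes hold the areas.
parts = re.split(r"(<p>.*?</p>)", html)

# Rewrite only the pieces that lie outside the context areas.
for i in range(0, len(parts), 2):
    parts[i] = parts[i].replace("<br />", " ")

result = "".join(parts)
print(result)
```

Splitting with a capturing group keeps the paragraphs verbatim, so the line breaks inside them survive while the one in the heading is replaced.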

Related

Wiki: PREFIX statement
Wiki: Rx text transformation script language
