Notepad++ Deutsch Forum: Search and replace with regexp in Notepad++ (original title: Suchen und Ersetzen mit Regexp in Notepad++)

Jens Habermann - 2015-05-15

Hello,

I have 2 XML files here in which I need to reformat one tag.

From
<Klassifikation>1001</Klassifikation>
<Klassifikation>100118</Klassifikation>
<Klassifikation>10011901</Klassifikation>

I need to get
<Klassifikation>10.01</Klassifikation>
<Klassifikation>10.01.18</Klassifikation>
<Klassifikation>10.01.19.01</Klassifikation>

The number of digits is always even (4, 6, 8, 10 or 12). I can find all of them with a search, but inserting the dot does not work at all. Does anyone have an idea?

Jens

Last edit: Jens Habermann 2015-05-15

Andreas Jonsson - 2015-05-15

Search for: <Klassifikation>(\d\d)(\d\d)</Klassifikation>
Replace with: <Klassifikation>\1\.\2</Klassifikation>

Search for: <Klassifikation>(\d\d)(\d\d)(\d\d)</Klassifikation>
Replace with: <Klassifikation>\1\.\2.\3</Klassifikation>

And so on.

Not sure if it can be done all at once with regular expressions.
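As a rough illustration of the "and so on" above, here is a minimal Python sketch that builds one search/replace pair per length (2 to 6 two-digit groups, i.e. 4 to 12 digits). It uses Python's re module rather than Notepad++'s regex engine, so the replacement syntax (\g<1> instead of \1) differs; the function name dot_pairs_fixed_length is made up for this example.

```python
import re

def dot_pairs_fixed_length(text: str, pairs: int) -> str:
    """Insert dots into <Klassifikation> values made of exactly `pairs` two-digit groups."""
    # Search side, e.g. <Klassifikation>(\d\d)(\d\d)(\d\d)</Klassifikation> for pairs == 3.
    pattern = "<Klassifikation>" + r"(\d\d)" * pairs + "</Klassifikation>"
    # Replace side, e.g. \g<1>.\g<2>.\g<3>; the dot is a plain literal here.
    joined = ".".join(f"\\g<{i}>" for i in range(1, pairs + 1))
    replacement = "<Klassifikation>" + joined + "</Klassifikation>"
    return re.sub(pattern, replacement, text)

def dot_all_lengths(text: str) -> str:
    """Run one search/replace per length: 4, 6, 8, 10 and 12 digits."""
    for pairs in range(2, 7):
        text = dot_pairs_fixed_length(text, pairs)
    return text

sample = (
    "<Klassifikation>1001</Klassifikation>\n"
    "<Klassifikation>100118</Klassifikation>\n"
    "<Klassifikation>10011901</Klassifikation>\n"
)
result = dot_all_lengths(sample)
print(result)
```

Each pattern only matches values of exactly its own length, because the closing tag must follow the last group directly, so the order of the five replacements does not matter.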

Last edit: Andreas Jonsson 2015-05-15

Jens Habermann - 2015-05-19

That's it, except for one backslash too many after \1:

Replace with: <Klassifikation>\1.\2.\3</Klassifikation>

Great. Thanks!

Jens

THEVENOT Guy - 2015-05-19

Hi Jens and Andreas,

Jens, although I don't understand German at all, I could easily guess what you need, as you explained it clearly !!

First, I tried to find a single general regex, with lookarounds and the special \K syntax, but I couldn't build a working one :-((

Then, I decided to split the problem into 2 smaller ones :

Firstly, try to match any block of text, of the form <Klassifikation>.....</Klassifikation>, with blocks of two-digits, ONLY, between the two tags.

Secondly, try to match two digits, necessarily followed by a digit, ONLY IF the previous search was matched

The first regex can be easily written as <(Klassifikation)>(\d\d)+</\1>$

and we replace that whole block with itself, followed by a specific marker character, which doesn't exist in your file

So, the Replace zone will contain $0@, assuming, for instance, that no @ character exists, yet

You'll note that this regex can't match the same line twice, because it requires the closing > character to be at the very end of the line, and after the first replacement the line ends with @ instead.

The second regex is even simpler: \d\d(?=\d), which we will replace by $0. ( the matched two digits, followed by a dot )

Reminder : In both replacements, the syntax $0 represents the entire text matched by the regex !

OK, now, we just have to perform this second regex, ONLY IF a @ character is present at the end of the line

To do so, we just have to modify, a bit, the lookahead : \d\d(?=\d.+@$)

Finally, when all the blocks of two digits, except the last one, are followed by a dot, we must delete the @ character that we used as a mark, at the end of the line. It's child's play ! Just search for @ and replace by NOTHING.

Therefore, the complete search regex, built with three alternatives, becomes :

<(Klassifikation)>(\d\d)+</\1>$|\d\d(?=\d.+@$)|(@)

For the replacement, we'll use conditional replacements. If you're not acquainted with them, here is, below, a fast summary :

A conditional replacement is of the general form (?n ... : ... ), where n is the number of a searched group

If the group n is DEFINED, all the characters after ?n till the colon are rewritten

If the group n is NOT defined, all the characters after the colon till the ending round parenthesis, are rewritten

For example, the replacement ABC(?4ijk:pqr)XYZ would produce ABCijkXYZ, if the search group 4 is matched and would give the string ABCpqrXYZ, if the group 4 is NOT matched
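For readers who want to experiment outside Notepad++, the behaviour described above can be imitated in Python. Python's re module has no (?n ... : ... ) replacement syntax, so this sketch emulates it with a replacement function; the search pattern (a)(b)(c)(d)? and the helper name conditional are made up purely for the illustration.

```python
import re

def conditional(match: re.Match, group: int, if_defined: str, if_undefined: str) -> str:
    """Emulate the replacement (?n if_defined : if_undefined) for group n."""
    # A group that did not take part in the match is reported as None.
    return if_defined if match.group(group) is not None else if_undefined

# Hypothetical search pattern in which group 4 is optional.
PATTERN = re.compile(r"(a)(b)(c)(d)?")

def replacement(match: re.Match) -> str:
    # Equivalent of the replacement ABC(?4ijk:pqr)XYZ
    return "ABC" + conditional(match, 4, "ijk", "pqr") + "XYZ"

with_group_4 = PATTERN.sub(replacement, "abcd")
without_group_4 = PATTERN.sub(replacement, "abc")
print(with_group_4)
print(without_group_4)
```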

Then, our replace regex may be written (?3:$0(?1@:.)) and can be understood as the two overlapped conditions, below :

IF group 3 ( the @ character ) is MATCHED
    it's DELETED
ELSE
    we rewrite the MATCHED string ( $0 )
    IF group 1 ( the word Klassifikation ) is MATCHED ( due to ALTERNATIVE 1 )
        we add the @ character
    ELSE ( ALTERNATIVE 2 )
        we add a DOT character
    ENDIF
ENDIF

To sum up :

SEARCH :

<(Klassifikation)>(\d\d)+</\1>$|\d\d(?=\d.+@$)|(@)

REPLACE : (?3:$0(?1@:.))

Select the regular expression search mode

Uncheck the . matches newline option, if necessary

Go back to the very beginning of your document

Click TWICE on the Replace All button

The first S/R adds a @ character, at the end of all the concerned lines

The second S/R adds a dot after every two-digit block except the last one and, finally, deletes the @ character

With that regex :

The opening tag <Klassifikation> may begin, after column 1

Any sequence of two-digits, between the two tags, will be modified

Any extra click, on the Replace All button, after the second one, has NO effect, luckily :-)
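The two clicks on Replace All can be simulated in Python as a sanity check. This is only a sketch under assumptions: it uses Python's re module instead of Notepad++'s regex engine, it emulates the conditional replacement (?3:$0(?1@:.)) with a function, and it assumes, as above, that the text contains no @ character beforehand. The names conditional_replace and replace_all are invented for the example.

```python
import re

# Same three alternatives as the SEARCH regex; MULTILINE makes $ match at each line end.
SEARCH = re.compile(
    r"<(Klassifikation)>(\d\d)+</\1>$|\d\d(?=\d.+@$)|(@)",
    re.MULTILINE,
)

def conditional_replace(match: re.Match) -> str:
    """Emulate the replacement (?3:$0(?1@:.))."""
    if match.group(3) is not None:
        return ""                      # alternative 3: delete the @ mark
    if match.group(1) is not None:
        return match.group(0) + "@"    # alternative 1: mark the line
    return match.group(0) + "."        # alternative 2: add a dot

def replace_all(text: str) -> str:
    """One click on Replace All: a single left-to-right pass over the text."""
    return SEARCH.sub(conditional_replace, text)

xml = (
    "  <Klassifikation>1001</Klassifikation>\n"
    "  <Klassifikation>100118</Klassifikation>\n"
    "  <Klassifikation>10011901</Klassifikation>\n"
)
first_click = replace_all(xml)           # adds @ at the end of each concerned line
second_click = replace_all(first_click)  # adds the dots and removes the @
third_click = replace_all(second_click)  # changes nothing
print(second_click)
```

The sample lines are indented on purpose, to show that the opening tag does not have to start in column 1.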

Best Regards,

guy038

Last edit: THEVENOT Guy 2015-05-19

Rufus V. Smith - 2015-05-21

Are they always two digit pairs? I have a simple solution:

Search: ([>.])(\d\d)(\d)
Replace: \1\2.\3

Click replace all 3 times and done.
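A small Python sketch of this approach, repeating the replacement until nothing changes and counting the passes. It assumes Python's re.sub behaves like one click on Replace All (a single left-to-right pass over non-overlapping matches); the function name replace_all_until_stable is invented for the example.

```python
import re

def replace_all_until_stable(text: str, max_passes: int = 10):
    """Repeat the search/replace until the text stops changing.

    Returns the final text and the number of passes that changed something.
    """
    passes = 0
    while passes < max_passes:
        new_text = re.sub(r"([>.])(\d\d)(\d)", r"\1\2.\3", text)
        if new_text == text:
            break
        text = new_text
        passes += 1
    return text, passes

eight_digits, passes_for_8 = replace_all_until_stable(
    "<Klassifikation>10011901</Klassifikation>"
)
twelve_digits, passes_for_12 = replace_all_until_stable(
    "<Klassifikation>100119012345</Klassifikation>"
)
print(eight_digits, passes_for_8)
print(twelve_digits, passes_for_12)
```

Each pass inserts only one dot per value, because the next match needs the dot that was just written. In this simulation 8 digits need 3 passes, while a 12-digit value needs 5.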

THEVENOT Guy - 2015-05-23

Hi Rufus,

Oh, yes, your regex is really simple, compared to mine ! However, your regex would add a dot after ANY block of two digits that follows a > or a dot and is followed by another digit, and not only between the two tags <Klassifikation>......</Klassifikation>

Indeed, I tried to find a strict regex, based on Jens's post. But, in everyday practice, I would have used a simpler regex, like yours, after selecting the concerned text and ticking the In selection option of the Replace dialog, for instance :-)

BTW, your regex can, even, be shortened !

SEARCH ([>.]\d\d)(\d)

REPLACE \1.\2
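A quick Python check of the two variants, as a sketch only: it uses Python's re module rather than Notepad++, and the <Preis> tag is a made-up example of other numeric content in the same file.

```python
import re

ORIGINAL = (r"([>.])(\d\d)(\d)", r"\1\2.\3")   # Rufus's search / replace
SHORTENED = (r"([>.]\d\d)(\d)", r"\1.\2")      # shortened search / replace

def one_pass(rule, text: str) -> str:
    """One click on Replace All with the given (search, replace) pair."""
    search, replace = rule
    return re.sub(search, replace, text)

# <Preis> is a hypothetical second tag, only there to show the side effect.
sample = (
    "<Klassifikation>100118</Klassifikation>\n"
    "<Preis>1234</Preis>\n"
)
after_original = one_pass(ORIGINAL, sample)
after_shortened = one_pass(SHORTENED, sample)
print(after_shortened)
```

Both variants give the same text after one pass, and both also put a dot into the hypothetical <Preis> value, which is why restricting the replacement to a selection is advisable when the file holds other digit runs after a > character.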

Cheers,

guy038

Last edit: THEVENOT Guy 2015-05-23
