How can I extract URLs from text or HTML files, preceded by a string, to a new file with one URL per row?
Is the predefined string to be found in the text or are you adding it?
Assuming the former, you need a regular expression to extract URLs. The following is a crude attempt (uncheck ". matches newline"):
You may need to tailor this to meet specific needs.
Now make a backup of your original file in case you need it later.
In regular expression mode, search for the URL regex and replace with
\r\n§<Your predefined string>$0\r\n
If processing a Unix (LF-only) file, remove both \r.
So now your original file has lines that start with § and contain URLs, and other lines that don't.
Search for ^[^§].*?\R
Replace all with nothing
And now get rid of all the § markers.
Note: replace § with anything easy to type and not in your text.
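If a script is an option, the same result can be sketched outside the editor. The Python below is only an illustration under stated assumptions: the URL pattern is a hypothetical crude one (not the regex from the reply above), and the names extract_urls, extract_to_file and prefix are made up for this example. It finds every URL, puts the predefined string in front of it, and writes one URL per row.

```python
import re

# Hypothetical crude URL pattern (an assumption, not the regex from the reply):
# "http://" or "https://" followed by characters that are not whitespace,
# quotes, or angle brackets. Tailor it to your own files.
URL_RE = re.compile(r"""https?://[^\s"'<>]+""")


def extract_urls(text, prefix=""):
    """Return one string per URL found in text, each preceded by prefix."""
    return [prefix + url for url in URL_RE.findall(text)]


def extract_to_file(src_path, dst_path, prefix=""):
    """Read src_path and write the prefixed URLs to dst_path, one per row.

    Returns the number of URLs written. The source file is left untouched,
    so no backup step is needed here.
    """
    with open(src_path, encoding="utf-8", errors="replace") as src:
        rows = extract_urls(src.read(), prefix)
    with open(dst_path, "w", encoding="utf-8") as dst:
        for row in rows:
            dst.write(row + "\n")
    return len(rows)
```

For example, extract_to_file("input.html", "urls.txt", prefix="URL: ") would write the prefixed URLs to urls.txt; both file names are placeholders. Because the pattern is crude, punctuation that directly follows a URL (such as a closing period) may be captured along with it.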