Extract URLs preceded by a predefined string

  • Visibler

    Visibler - 2014-02-15

    How can I extract URLs that are preceded by a given string from text or HTML files, and write them to a new file with one URL per line?

    Best Regards

  • cchris

    cchris - 2014-02-16

    Is the predefined string to be found in the text or are you adding it?
    Assuming the former, you need a regular expression to extract URLs. The following is a crude attempt (uncheck ". matches newline"):

    You may need to tailor this to meet specific needs.

    Now make a backup of your original file, if you need it later.

    In regular expression mode, search for the URL regex and replace with
    \r\n§<your predefined string>$0\r\n

    If processing a Unix (LF-only) file rather than a Windows one, remove both \r.

    So now your original file has lines that start with § and contain URLs, and other lines that don't.

    Search for ^[^§].*?\R
    Replace all with nothing
    And now get rid of all the § markers.

    Note: replace § with anything easy to type and not in your text.
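
    If a script is an option instead of the editor steps above, the same idea can be sketched in a few lines of Python. This is only a rough illustration: the URL pattern below is a crude stand-in (the regex from this post is not reproduced here), and the names extract_urls, write_urls, and URL_PATTERN are made up for the example. It assumes the predefined string sits immediately before each URL in the text.

```python
import re

# Crude URL pattern. This is an assumption for illustration, not the
# regex referred to in the post; tailor it to your own data.
URL_PATTERN = r"https?://[^\s\"'<>]+"


def extract_urls(text, prefix):
    """Return every URL in `text` that directly follows `prefix`."""
    # re.escape lets the prefix contain regex metacharacters safely.
    pattern = re.compile(re.escape(prefix) + "(" + URL_PATTERN + ")")
    return pattern.findall(text)


def write_urls(in_path, out_path, prefix):
    """Read `in_path`, write matching URLs to `out_path`, one per line."""
    with open(in_path, encoding="utf-8", errors="replace") as src:
        urls = extract_urls(src.read(), prefix)
    with open(out_path, "w", encoding="utf-8") as dst:
        dst.writelines(url + "\n" for url in urls)
    return len(urls)
```

    For example, write_urls("page.html", "urls.txt", 'href="') would collect the URLs that follow href=" in a hypothetical page.html. The original file is only read, never modified, so no backup is needed with this approach.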