
Following only <a> tags

Anonymous
2014-12-15
2014-12-17
  • Anonymous

    Anonymous - 2014-12-15

    Hello,

    PHPCrawl is very good at what it does, but I couldn't find an option (or a place to put a suitable regex) that makes it follow the 'href' attribute of '<a>' anchor tags only. Is there a way to do this?

    Big thank you to anyone who responds :)

     

    Last edit: Anonymous 2014-12-17
  • Anonymous

    Anonymous - 2014-12-17

    Hi!

    Did you try enableAggressiveLinkSearch(false) and setLinkExtractionTags(array("href"))?

     
  • Anonymous

    Anonymous - 2014-12-17

    Hello,

    I have tried exactly that, but 'href' itself is not a tag; it is an attribute of a tag. PHPCrawl does everything right in the sense that it follows every 'href' it can find in the document (including <link href="..."> and others). All I want is for it to follow <a href="..."> only ;)

     

    Last edit: Anonymous 2014-12-17
  • Anonymous

    Anonymous - 2014-12-17

    Ah OK, I see.

    Unfortunately, this can't be configured without modifying the PHPCrawl code itself.

    But feel free to add this to the list of feature requests, for example as a new method "addLinkExtractionTags" (once the existing method "setLinkExtractionTags" has been renamed to "setLinkExtractionAttributes").

     

    Last edit: Anonymous 2014-12-17
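
For anyone landing on this thread: the tag-versus-attribute distinction discussed above can be illustrated outside PHPCrawl with PHP's built-in DOM extension. This is only a minimal sketch, not a PHPCrawl feature. The helper name extractAnchorHrefs and the sample HTML are made up for illustration, and the code does not change which links PHPCrawl itself follows.

```php
<?php
// Hypothetical helper (not part of PHPCrawl): returns the href values
// of <a> elements only, ignoring href attributes on other tags such as <link>.
function extractAnchorHrefs(string $html): array
{
    $doc = new DOMDocument();

    // Real-world HTML is often malformed; collect libxml warnings
    // internally instead of emitting them, then restore the old setting.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $hrefs = [];
    foreach ($doc->getElementsByTagName('a') as $anchor) {
        // Skip anchors without an href (e.g. named anchors).
        if ($anchor->hasAttribute('href')) {
            $hrefs[] = $anchor->getAttribute('href');
        }
    }
    return $hrefs;
}

// Sample document: one <link href>, two <a href>, one <a> without href.
$html = '<html><head><link href="style.css" rel="stylesheet"></head>'
      . '<body><a href="/page1">One</a><a name="top">No href</a>'
      . '<a href="/page2">Two</a></body></html>';

print_r(extractAnchorHrefs($html));
```

Selecting by element name first and reading the attribute second is what separates this from an attribute-based search, which would also pick up the stylesheet URL in the <link> tag.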
