
Find links which are built with JavaScript

Help
Anonymous
2015-07-29
2015-11-04
  • Anonymous

    Anonymous - 2015-07-29

    Hi,

    I've been using PHPCrawl for a couple of months and I really like it!

    But I have one question. I'm crawling pages that contain a select box which redirects you to another page when an option is chosen. The pages behind these options aren't recognized by PHPCrawl.

    Example:
    I'm on this page: www.example.com/Shop/Product/sft001/SFT001

    There is a selectbox with this information/javascript:

    <select name="selectedOptions[]" class="articleOption" onchange="if (this.value) window.location.href=this.value">
    <option value="SFT001">SFT001</option>
    <option value="/Shop/Product/sft001/SFT002">SFT002</option>
    <option value="/Shop/Product/sft001/SFT003">SFT003</option>
    <option value="/Shop/Product/sft001/SFT004">SFT004</option>
    <option value="/Shop/Product/sft001/SFT005">SFT005</option>
    <option value="/Shop/Product/sft001/SFT006">SFT006</option>
    <option value="/Shop/Product/sft001/SFT007">SFT007</option>
    </select>

    How can I make PHPCrawl aware that it should also visit:
    www.example.com/Shop/Product/sft001/SFT002
    www.example.com/Shop/Product/sft001/SFT003
    www.example.com/Shop/Product/sft001/SFT004
    ?

    I'm thinking about using this:
    $crawler->setLinkExtractionTags(array("href","src","url","location","codebase","background","data","profile","action","open","value"));

    But I'm afraid this will ruin the performance, because the value attribute can be used in many, many places.
    For example, it's also used in the header of the website to show the languages and currencies.

    Is this the way to go?

    Thanks in advance for your reply!

     

    Last edit: Anonymous 2015-07-29
  • Anonymous

    Anonymous - 2015-08-05

    Maybe you can use a regex to achieve this.
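A minimal sketch of what such a regex approach might look like, assuming the markup from the first post. The function name `extractOptionPaths` is hypothetical and not part of PHPCrawl; it only shows how the site-relative option values could be pulled out of the page source, and a regex like this is fragile against markup variations such as single-quoted attributes.

```php
<?php
// Hypothetical helper, not part of PHPCrawl: returns the <option> values
// that start with "/" (site-relative paths), in document order, de-duplicated.
function extractOptionPaths(string $html): array
{
    // Capture only double-quoted value attributes beginning with "/".
    // Plain values such as "SFT001" are skipped on purpose.
    preg_match_all('#<option\b[^>]*\bvalue="(/[^"]*)"#i', $html, $matches);

    return array_values(array_unique($matches[1]));
}

// Usage with a shortened version of the select box from the question.
$html = '<select name="selectedOptions[]" class="articleOption">'
      . '<option value="SFT001">SFT001</option>'
      . '<option value="/Shop/Product/sft001/SFT002">SFT002</option>'
      . '<option value="/Shop/Product/sft001/SFT003">SFT003</option>'
      . '</select>';

foreach (extractOptionPaths($html) as $path) {
    echo 'www.example.com' . $path . PHP_EOL;
}
```

The extracted paths would still have to be handed to the crawler somehow, which is why extending the link extraction tags is probably the simpler route.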

     
  • Uwe Hunfeld

    Uwe Hunfeld - 2015-08-29

    Hi!

    Yes, your posted solution should work fine:

    $crawler->setLinkExtractionTags(array("href","src","url","location","codebase","background","data","profile","action","open","value"));

    ... as you may have figured out in the meantime ;)
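For context, a rough sketch of how that call might sit in a complete crawler setup. The include path and the follow-rule pattern are assumptions for illustration, not taken from this thread. The follow rule is one possible way to address the performance worry from the first post: with "value" in the tag list, many non-URL values may be picked up as link candidates, and a follow rule should keep the crawler from requesting anything outside the product pages.

```php
<?php
// Assumption: PHPCrawl is unpacked so that this path is valid.
include("libs/PHPCrawler.class.php");

class ShopCrawler extends PHPCrawler
{
    // Called by PHPCrawl for every document it received.
    function handleDocumentInfo($DocInfo)
    {
        echo $DocInfo->url . PHP_EOL;
    }
}

$crawler = new ShopCrawler();
$crawler->setURL("www.example.com/Shop/Product/sft001/SFT001");

// Only receive the content of HTML documents.
$crawler->addContentTypeReceiveRule("#text/html#");

// Default-style attribute list plus "value", as posted above.
$crawler->setLinkExtractionTags(array("href", "src", "url", "location", "codebase",
    "background", "data", "profile", "action", "open", "value"));

// Assumed pattern: only follow URLs below /Shop/Product/ on this host.
$crawler->addURLFollowRule('#^https?://www\.example\.com/Shop/Product/#i');

$crawler->go();
```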

     

    Last edit: Uwe Hunfeld 2015-08-29
  • Anonymous

    Anonymous - 2015-11-04

    Thanks for your reply. This works, and it looks like it didn't have any big impact on performance!

     

    Last edit: Anonymous 2016-01-30
