Anonymous - 2019-02-15
I'm having trouble creating the correct regex for this situation:
The crawler should ignore URLs like:
http://mysite.com/Category/ProductA/Search
http://mysite.com/Category/ProductB/Search
http://mysite.com/Category/ProductB/Search
But it should crawl this URL:
http://mysite.com/Search
I tried this:
$crawler->addURLFilterRule("/Category\/.\/Search/");
and
$crawler->addURLFilterRule("#/Category\/.\/Search/#");
But the first 2 URLs are still crawled.
What would be the correct regex to prevent this?
Thanks in advance!
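You're close. The dot matches any single character except line breaks, so on its own it only covers one character between the two slashes. You need to match the whole product name, so add a + to match one or more characters. Try this: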
(\/Category\/.+\/Search)
By the way, I haven't tried it with the crawler itself because I don't have that set up, but here is my example:
https://www.regexpal.com/?fam=107703
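If it helps, here is a minimal sketch for checking the patterns locally with plain preg_match, without running the crawler. The helper name matchesFilter is made up for this illustration and is not part of the crawler library; the sketch assumes the filter rule is applied as an ordinary PCRE pattern against the URL.

```php
<?php
// Hypothetical helper for illustration only; not part of the crawler library.
// Returns true when the PCRE pattern matches somewhere in the URL.
function matchesFilter(string $pattern, string $url): bool
{
    return preg_match($pattern, $url) === 1;
}

// URLs taken from the question above.
$urls = [
    "http://mysite.com/Category/ProductA/Search",
    "http://mysite.com/Category/ProductB/Search",
    "http://mysite.com/Search",
];

// First attempt: "." allows exactly one character between the slashes.
$singleChar = "/Category\/.\/Search/";
// Suggested fix: ".+" allows one or more characters between the slashes.
$oneOrMore = "/\/Category\/.+\/Search/";

foreach ($urls as $url) {
    printf(
        "%s\n  single char: %s, one or more: %s\n",
        $url,
        matchesFilter($singleChar, $url) ? "match" : "no match",
        matchesFilter($oneOrMore, $url) ? "match" : "no match"
    );
}
```

The single-character pattern matches none of the three URLs, while the ".+" version matches the two Category URLs and leaves http://mysite.com/Search alone.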
Thanks for your reply! I'll give it a try and let you know!
Hi,
thanks to your hint I was able to get this command working: $crawler->addURLFilterRule("/\/Category\/.+\/Search/");
Thanks for your help!