Re: [Treebase-devel] Treebase Dev problems

On Nov 15, 2011, at 8:50 AM, Mattison Ward wrote:

> The Nagios monitoring system queries treebase-dev every few minutes to
> make sure it is up using this query:
> 
> http://treebase-dev.nescent.org/treebase-web/search/studySearch.html?query=prism.publicationName=Nature&format=null&recordSchema=null
> 
> 
> It might be unrelated, but I saw a fair amount of activity from search
> engines in the web server logs.
> 
> I can set up a robots.txt file to keep search engines from crawling
> the dev and staging sites.
> 
> Would it make sense to keep search engines from crawling any sections
> of the production site?

Hi Mattison:

Search engines are already blocked from crawling production:

http://www.treebase.org/robots.txt

... though I don't find one on stage or dev:

http://treebase-stage.nescent.org/robots.txt
http://treebase-dev.nescent.org/robots.txt

So yes, please add a robots.txt to stage and dev -- in fact, it should block *everything* on those two sites because we don't want them to compete with production.
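
For reference, a block-everything robots.txt is just a "User-agent: *" line followed by "Disallow: /". Here is a minimal sketch for checking that such a file really shuts out crawlers, using Python's standard urllib.robotparser. The helper name is_crawlable and the "Googlebot" agent string are illustrative assumptions, not anything deployed on our servers.

```python
from urllib.robotparser import RobotFileParser

# The two-line robots.txt that tells every crawler to stay off the whole site.
BLOCK_ALL = "User-agent: *\nDisallow: /\n"


def is_crawlable(robots_txt, url, agent="Googlebot"):
    """Return True if `agent` may fetch `url` under the given robots.txt text.

    Hypothetical helper for illustration; it parses the text locally and
    does not contact any server.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    dev_url = "http://treebase-dev.nescent.org/treebase-web/search/studySearch.html"
    print(is_crawlable(BLOCK_ALL, dev_url))
```

Well-behaved crawlers honor this file, but it is only advisory, so it would not stop a misbehaving client.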

So perhaps this has nothing to do with Carl's R bindings. That would be great news if true. 

Do you have logs that show the IP addresses of the clients responsible for taking down dev?
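
If it helps, something like the following could tally requests per client from the web server access log. This is only a sketch: it assumes the log is in Common or Combined Log Format, where the client address is the first whitespace-separated field, and the function name top_clients and the sample lines (which use documentation-only IP addresses) are made up for illustration.

```python
from collections import Counter


def top_clients(log_lines, n=10):
    """Count requests per client, assuming the client IP is the first field.

    Returns the `n` busiest clients as (address, request_count) pairs,
    ordered from most to fewest requests.
    """
    counts = Counter()
    for line in log_lines:
        fields = line.split(None, 1)
        if fields:  # skip blank lines
            counts[fields[0]] += 1
    return counts.most_common(n)


if __name__ == "__main__":
    # Hypothetical sample lines, not taken from the real treebase-dev logs.
    sample = [
        '192.0.2.10 - - [15/Nov/2011:08:00:01 -0500] "GET /treebase-web/ HTTP/1.1" 200 512',
        '198.51.100.7 - - [15/Nov/2011:08:00:02 -0500] "GET /robots.txt HTTP/1.1" 404 0',
        '192.0.2.10 - - [15/Nov/2011:08:00:03 -0500] "GET /treebase-web/search/ HTTP/1.1" 200 900',
    ]
    for address, hits in top_clients(sample):
        print(address, hits)
```

To run it against a real log, pass an open file object instead of the sample list.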

bp