Should the bot from Yandex, the Russian search engine, be blocked from spidering your pages? I believe it should. I did some serious testing using server logs and honeypots, and this bot currently does not respect robots.txt files. Worse still, it puts such a load on the server that it must be contained.
I initially had this code (amongst others) in the .htaccess file:
but because it’s a persistent little critter I now have this as the first line:
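A rule of this kind might look like the following. It is only a sketch, not the original line: it assumes Apache with mod_rewrite enabled and a bot that sends "Yandex" somewhere in its User-Agent header.

```apache
# Assumption: mod_rewrite is available and .htaccess overrides are allowed.
RewriteEngine On

# Match any User-Agent containing "Yandex", case-insensitively.
RewriteCond %{HTTP_USER_AGENT} Yandex [NC]

# Answer every matching request with 403 Forbidden and stop processing.
RewriteRule ^ - [F]
```

Putting the rule at the very top of the file means the request is refused before any later rewrite rules are evaluated, which keeps the cost per blocked hit low.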
If it bothers me any more, I shall fight back and start a campaign against it. One of our team got so fed up that he did this:
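Something along these lines would do it. Again this is a sketch under assumptions, not his actual rule: it assumes mod_rewrite, the same User-Agent match as before, and a placeholder target URL that you would swap for the video of your choice.

```apache
# Assumption: mod_rewrite is available and .htaccess overrides are allowed.
RewriteEngine On

# Match any User-Agent containing "Yandex", case-insensitively.
RewriteCond %{HTTP_USER_AGENT} Yandex [NC]

# Send the bot elsewhere with a temporary redirect.
# The target below is a hypothetical placeholder, not a real address.
RewriteRule ^ https://example.com/never-gonna-give-you-up [R=302,L]
```

A redirect costs the server about as little as a 403, since no page is ever generated for the bot.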
Now the Yandex bot gets RickRolled on every visit. Imagine half a million sites doing this…