Korbinian Bachl wrote:
-----Original Message-----
How would I detect a crawler? By the user agent string.
This is not cloaking. Nor is it a sneaky redirect! In fact,
there is no redirect for the crawler at all.
The only difference is this: if there is a link in document
/my/page and a crawler follows the link, the page gets displayed.
However, if a regular visitor (not a crawler) follows the
link, he is redirected to /my/page[24] (for example).
It's either a) or b). Neither Google nor any other crawler will
see both of those.
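
A minimal sketch of the behaviour described above, assuming a simple
substring match on the user agent string. The names is_crawler, respond
and CRAWLER_TOKENS are hypothetical and not taken from any framework; the
redirect target simply copies the /my/page[24] form used in the example.

```python
from typing import Optional, Tuple

# Assumed list of substrings that identify a crawler; purely illustrative.
CRAWLER_TOKENS = ("googlebot", "bingbot", "slurp")


def is_crawler(user_agent: str) -> bool:
    """Guess whether the request comes from a crawler, by user agent only."""
    ua = (user_agent or "").lower()
    return any(token in ua for token in CRAWLER_TOKENS)


def respond(path: str, user_agent: str, page_version: int) -> Tuple[int, Optional[str]]:
    """Return (status code, redirect target or None) for a followed link."""
    if is_crawler(user_agent):
        # a) crawler: the page is displayed directly, no redirect
        return 200, None
    # b) regular visitor: one extra redirect to the versioned URL
    return 302, f"{path}[{page_version}]"
```

The content served is the same in both branches; only the extra redirect
depends on the user agent.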
The idea is to hide as much session-related stuff from
Google as possible.
Ah! - here lies the problem. You assume a crawler arrives, announces that it
is a crawler, and then indexes. In reality it goes more like this:
-> search engine wants to index foo.com/bar
-> spider goes to foo.com/bar, with user agent "Googlebot" and a known
google.com IP
-> data from spider is saved by google.com
-> some time goes by
-> 2nd spider goes to foo.com/bar, with user agent "IE 6.0" (or any other
possible browser) and an unknown IP; however, this is also a spider
-> data from spider 2 is saved by google.com
-> as the results from spiders 1 and 2 are not the same, the procedure is
repeated - the result is the same: the behaviour for "Googlebot" is not the
same as for "IE 6.0"
-> site is marked as cloaked, no longer visited, and banned from the index
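
The comparison in the steps above can be sketched roughly as follows. This
is only an illustration of the idea of requesting the same URL under two
user agents and comparing the answers; it is not how any search engine is
known to implement it. The names responses_differ, example_site and
uniform_site are hypothetical, and the handlers stand in for a real site.

```python
from typing import Callable, Optional, Tuple

Response = Tuple[int, Optional[str]]  # (status code, redirect target or None)
Handler = Callable[[str, str], Response]


def responses_differ(handler: Handler, path: str,
                     declared_ua: str, disguised_ua: str) -> bool:
    """True if the site answers differently to the two user agents."""
    first = handler(path, declared_ua)    # spider 1: says it is a crawler
    second = handler(path, disguised_ua)  # spider 2: pretends to be a browser
    return first != second


def example_site(path: str, user_agent: str) -> Response:
    """Stand-in site that redirects everyone except a declared Googlebot."""
    if "googlebot" in user_agent.lower():
        return 200, None
    return 302, path + "[24]"


def uniform_site(path: str, user_agent: str) -> Response:
    """Stand-in site that ignores the user agent entirely."""
    return 200, None
```

With these stand-ins, responses_differ(example_site, "/bar", "Googlebot",
"IE 6.0") is true, while the same call with uniform_site is false.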
This behavior is known - spiders rarely say that they are spiders, often
faking user agents and IPs to detect fraud. That you're not committing any
fraud doesn't matter to Google - they see your behaviour and act according
to their guidelines.
Note that even having JavaScript redirects for different user agents
can be treated as cloaking (I refer to the action Google took against bmw.de
and some other very big companies some months ago).
But this is not cloaking. Every time you get exactly the _same_ content!
Cloaking is providing different content for the same URL depending on the
nature of the visitor. That is far from what's going on here.
We always serve the same content.
Only if the user agent is not Google do you get one more redirect.
Anyway, Google doesn't ban a site automatically. Every suspect is passed
to human reviewers, who decide whether it is cloaking or not.
Regards,
Korbinian