It's written in the document. Suppressing the redirect to the hybrid URL
for crawlers would be configurable.
-Matej
Korbinian Bachl wrote:
I thought this was global for Wicket 2.0, as the whole URL cycle changes?
-----Original Message-----
From: Igor Vaynberg [mailto:[EMAIL PROTECTED]
Sent: Saturday, November 4, 2006 16:58
To: [email protected]
Subject: Re: AW: AW: URL scheme for Wicket 2.0
what you guys are forgetting is that this is optional, and
people who care about spiders just don't use this feature. i
think the majority of wicket users are not building sites
with public-facing content, so it's not an issue for us.
-igor
On 11/4/06, Korbinian Bachl <[EMAIL PROTECTED]> wrote:
-----Original Message-----
How would I detect a crawler? By the user agent string.
This is not cloaking. Nor is it a sneaky redirect! In fact, there is
no redirect for a crawler at all.
The only difference is this:
a) If there is a link to /my/page in the document and a crawler
follows the link, the page gets displayed.
b) However, if a regular visitor (not a crawler) follows the link, he
is redirected to /my/page[24] (for example).
It's either a) or b). Neither Google nor any other crawler will see
both of those.
The idea is to hide as much session-related stuff from Google as
possible.
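The a)/b) behaviour described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the class name, the method names, the crawler token list and the `suppressForCrawlers` flag are all hypothetical and are not Wicket API.

```java
import java.util.Locale;

// Hypothetical sketch; none of these names come from Wicket.
class CrawlerRedirectSketch {

    // Assumed substrings that identify a crawler by its user agent string.
    static final String[] CRAWLER_TOKENS = {"googlebot", "slurp", "msnbot"};

    // Returns true if the user agent string contains a known crawler token.
    static boolean isCrawler(String userAgent) {
        if (userAgent == null) {
            return false;
        }
        String ua = userAgent.toLowerCase(Locale.ROOT);
        for (String token : CRAWLER_TOKENS) {
            if (ua.contains(token)) {
                return true;
            }
        }
        return false;
    }

    // Returns the URL the client ends up on after following a link to `path`.
    static String resolveTarget(String path, int pageId, String userAgent,
                                boolean suppressForCrawlers) {
        if (suppressForCrawlers && isCrawler(userAgent)) {
            return path; // case a): crawler, no redirect
        }
        return path + "[" + pageId + "]"; // case b): redirect to the hybrid URL
    }

    public static void main(String[] args) {
        System.out.println(resolveTarget("/my/page", 24, "Googlebot/2.1", true));
        System.out.println(resolveTarget("/my/page", 24,
                "Mozilla/4.0 (compatible; MSIE 6.0)", true));
    }
}
```

With the flag switched off, every client is redirected to the hybrid URL, which matches the "configurable" suppression mentioned at the top of the thread.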
Ah! Here lies the problem. You think a crawler comes along, says it
is a crawler, and then indexes. In reality it will go something like this:
-> search engine wants to index foo.com/bar
-> spider goes to foo.com/bar, with user agent "Google Bot" and a
   known google.com IP
-> data from the spider is saved by google.com
-> some time goes by
-> 2nd spider goes to foo.com/bar, with user agent "IE 6.0" (or any
   other possible browser) and an unknown IP; however, this is also a
   spider
-> data from spider 2 is saved by google.com
-> as the results from spiders 1 and 2 are not the same, the procedure
   is run again; the result is the same: the behaviour for "Googlebot"
   is not the same as for "IE 6.0"
-> site is marked as cloaked, no longer visited, and banned from the
   index
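The cross-check described in the steps above can be sketched as a small offline simulation. It assumes, as the steps do, that a search engine compares what a declared spider and a disguised spider receive for the same URL; the class, the method names and the two user agent strings are hypothetical, and nothing here reflects how any real search engine is implemented.

```java
import java.util.Locale;
import java.util.function.BiFunction;

// Hypothetical simulation of the two-spider comparison; no network access.
class CloakingCheckSketch {

    // A site that treats declared crawlers differently (the hybrid URL scheme).
    static String userAgentSensitiveSite(String path, String userAgent) {
        boolean declaredCrawler =
                userAgent.toLowerCase(Locale.ROOT).contains("googlebot");
        return declaredCrawler ? path : path + "[24]";
    }

    // A site that ignores the user agent and always redirects.
    static String uniformSite(String path, String userAgent) {
        return path + "[24]";
    }

    // Visits `path` once as a declared spider and once disguised as a
    // browser, and reports whether the two results differ.
    static boolean looksCloaked(BiFunction<String, String, String> site,
                                String path) {
        String declared = site.apply(path, "Googlebot/2.1");
        String disguised = site.apply(path, "Mozilla/4.0 (compatible; MSIE 6.0)");
        return !declared.equals(disguised);
    }

    public static void main(String[] args) {
        System.out.println(looksCloaked(
                CloakingCheckSketch::userAgentSensitiveSite, "/bar"));
        System.out.println(looksCloaked(
                CloakingCheckSketch::uniformSite, "/bar"));
    }
}
```

Under this assumption, only the site that branches on the user agent string is flagged, even though it serves the same page content in both cases.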
This behavior is known: spiders rarely say that they are spiders,
often faking user agents and IPs to detect fraud. That you're not
committing any fraud doesn't matter to Google; they see your behaviour
and react according to their guidelines.
Note that even JavaScript redirects for different user agents can be
treated as cloaking (I refer to the action Google took against bmw.de
and some other very big companies some months ago).
Regards,
Korbinian