Re: AW: AW: AW: URL scheme for Wicket 2.0

Matej Knopp Sat, 04 Nov 2006 08:16:52 -0800

It's written in the document. Suppressing the redirect to the hybrid URL for crawlers would be configurable.


-Matej


Korbinian Bachl wrote:

I thought this was global for Wicket 2.0, as the whole URL cycle changes???
-----Original Message-----
From: Igor Vaynberg [mailto:[EMAIL PROTECTED]]
Sent: Saturday, 4 November 2006 16:58
To: [email protected]
Subject: Re: AW: AW: URL scheme for Wicket 2.0
what you guys are forgetting is that this is optional, and people who care about spiders just don't use this feature. i think the majority of wicket users are not building sites with public facing content, so it's not an issue for us.
-igor


On 11/4/06, Korbinian Bachl <[EMAIL PROTECTED]> wrote:
-----Original Message-----

How would I detect a crawler? By the user agent string.

This is not cloaking. Nor is it a sneaky redirect! In fact, there is no redirect for the crawler at all.

The only difference is that if there is a link in the document to /my/page and a crawler follows the link, the page gets displayed. However, if a regular visitor (not a crawler) follows the link, he is redirected to /my/page[24] (for example).

It's either a) or b). Google (or any other crawler) won't see both of those.

The idea is to hide as much session-related stuff from Google as possible.
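
The decision described above could look roughly like the following minimal Java sketch. It is not Wicket code: the class name, the method names, the list of user-agent markers and the `suppressForCrawlers` flag are all hypothetical, and 24 is just the example number from the message.

```java
import java.util.Locale;

public class HybridUrlSketch {
    // Hypothetical user-agent fragments treated as crawlers (assumption, not a real list).
    private static final String[] CRAWLER_MARKERS = {"googlebot", "slurp", "msnbot"};

    /** Guesses whether the client is a crawler from its user agent string alone. */
    public static boolean isCrawler(String userAgent) {
        if (userAgent == null) {
            return false;
        }
        String ua = userAgent.toLowerCase(Locale.ROOT);
        for (String marker : CRAWLER_MARKERS) {
            if (ua.contains(marker)) {
                return true;
            }
        }
        return false;
    }

    /** Returns the URL the client ends up on after following a link to path. */
    public static String resolve(String path, String userAgent, int pageId,
                                 boolean suppressForCrawlers) {
        if (suppressForCrawlers && isCrawler(userAgent)) {
            return path; // page is displayed in place, no redirect
        }
        return path + "[" + pageId + "]"; // regular visitor is redirected to the hybrid URL
    }

    public static void main(String[] args) {
        System.out.println(resolve("/my/page", "Googlebot/2.1", 24, true));
        // prints /my/page
        System.out.println(resolve("/my/page", "Mozilla/4.0 (compatible; MSIE 6.0)", 24, true));
        // prints /my/page[24]
    }
}
```

With `suppressForCrawlers` set to false, every client gets the same redirect, which is the configurable behaviour mentioned at the top of the thread.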
Ah! - here lies the problem. You think a crawler is coming, saying it is a crawler and then indexing. In reality it will go similar to this:

-> search engine wants to index foo.com/bar
-> spider goes to foo.com/bar, having user agent "Google Bot" and a known google.com IP
-> data from the spider is saved by google.com
-> some time goes by
-> 2nd spider goes to foo.com/bar, having user agent "IE 6.0" (or any other possible browser) and an unknown IP; however, this is also a spider
-> data from spider 2 is saved by google.com
-> as the results from spider 1 and 2 are not the same, the procedure is repeated - the result is the same: the behaviour in the case of "Googlebot" is not the same as in the case of "IE 6.0"
-> site is marked as cloaked, not visited anymore and banned from the index
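
The comparison in the steps above can be sketched in Java as follows. This only illustrates the sequence as described in this message and is not a statement of how any search engine is actually implemented; the class name, `looksCloaked`, and the two sample sites are hypothetical.

```java
import java.util.function.Function;

public class CloakingCheckSketch {
    /**
     * Requests the same URL twice, once with a declared crawler user agent and
     * once disguised as a browser, and reports whether the answers differ.
     * The site is modelled as a function from user agent to the resulting URL.
     */
    public static boolean looksCloaked(Function<String, String> site,
                                       String declaredAgent, String disguisedAgent) {
        String first = site.apply(declaredAgent);
        String second = site.apply(disguisedAgent);
        return !first.equals(second);
    }

    public static void main(String[] args) {
        String bot = "Googlebot/2.1";
        String browser = "Mozilla/4.0 (compatible; MSIE 6.0)";

        // Hypothetical site that skips the redirect only for a declared crawler.
        Function<String, String> agentSensitive =
                ua -> ua.contains("Googlebot") ? "/bar" : "/bar[24]";
        // Hypothetical site that treats every client the same way.
        Function<String, String> uniform = ua -> "/bar[24]";

        System.out.println(looksCloaked(agentSensitive, bot, browser)); // prints true
        System.out.println(looksCloaked(uniform, bot, browser));        // prints false
    }
}
```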

This behaviour is known: spiders rarely say that they are spiders, and often fake user agents and IPs to detect fraud. That you're not committing any fraud doesn't matter to Google; they see your behaviour and react according to their guidelines. Just look: even JavaScript redirects for different user agents can be treated as cloaking (I refer to the action Google took against bmw.de and some other very big companies some months ago).

Regards,

Korbinian
