I thought this was global for Wicket 2.0, since the whole URL cycle changes?

 

> -----Original Message-----
> From: Igor Vaynberg [mailto:[EMAIL PROTECTED]
> Sent: Saturday, 4 November 2006 16:58
> To: [email protected]
> Subject: Re: AW: AW: URL scheme for Wicket 2.0
> 
> what you guys are forgetting is that this is optional, and
> people who care about spiders just don't use this feature. i
> think the majority of wicket users are not building sites
> with public-facing content, so it's not an issue for us.
> 
> -igor
> 
> 
> On 11/4/06, Korbinian Bachl <[EMAIL PROTECTED]> wrote:
> >
> >
> >
> > > -----Original Message-----
> > >
> > > How would I detect a crawler? By the user agent string.
> > > This is not cloaking. Nor is it a sneaky redirect! In fact, there is
> > > no redirect for the crawler at all.
> > >
> > > The only difference is that if there is a link in document /my/page
> > > and a crawler follows the link, the page gets displayed.
> > >
> > > However, if a regular visitor (not a crawler) follows the link, he
> > > is redirected to /my/page[24] (for example).
> > >
> > > It's either a) or b). Neither Google nor any other crawler will see
> > > both of those.
> > >
> > > The idea is to hide as much session-related stuff from Google as
> > > possible.
> > >
> >
> > Ah! - here lies the problem. You think a crawler comes along, says it
> > is a crawler, and then indexes. In reality it will go something like this:
> >
> > -> search engine wants to index foo.com/bar
> > -> spider goes to foo.com/bar, with user agent "Google Bot" and a known
> >    google.com IP
> > -> data from the spider is saved by google.com
> > -> some time goes by
> > -> 2nd spider goes to foo.com/bar, with user agent "IE 6.0" (or any other
> >    possible browser) and an unknown IP; however, this is also a spider
> > -> data from spider 2 is saved by google.com
> > -> as the results from spiders 1 and 2 are not the same, the procedure is
> >    rerun - the result is the same: the behaviour for "Googlebot" is
> >    not the same as for "IE 6.0"
> > -> site is marked as cloaked, not visited anymore and banned from the
> >    index
> >
> > This behaviour is known - spiders rarely say that they are spiders,
> > often faking user agents and IPs to detect fraud. That you're not
> > committing any fraud doesn't matter to Google - they see your behaviour
> > and react according to their guidelines.
> > Just look: even JavaScript redirects for different user agents can be
> > treated as cloaking (I refer to the action Google took against bmw.de
> > and some other very big companies some months ago).
> >
> > Regards,
> >
> > Korbinian
> >
> >
> >
> 
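
For reference, the check Korbinian describes above can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the class CloakingCheck, the methods respond and looksCloaked, and the "200 ..." / "302 ..." response strings are all made up for this example. It is not Wicket API, and it is not a description of how any search engine works internally. It only shows why branching on the user agent string is detectable by anyone who requests the same URL twice with different user agents.

```java
import java.util.Locale;
import java.util.function.Function;

class CloakingCheck {

    // Hypothetical server behaviour from the proposal: a client whose user
    // agent claims to be a crawler gets the page directly, everyone else is
    // redirected to a versioned URL such as /my/page[24].
    static String respond(String path, String userAgent) {
        boolean claimsToBeCrawler =
                userAgent.toLowerCase(Locale.ROOT).contains("googlebot");
        if (claimsToBeCrawler) {
            return "200 " + path;
        }
        return "302 " + path + "[24]";
    }

    // The two-visit comparison: request the same URL once with a crawler
    // user agent and once with a browser user agent, then compare results.
    static boolean looksCloaked(Function<String, String> server,
                                String crawlerAgent, String browserAgent) {
        String asCrawler = server.apply(crawlerAgent);
        String asBrowser = server.apply(browserAgent);
        return !asCrawler.equals(asBrowser);
    }

    public static void main(String[] args) {
        Function<String, String> site = ua -> respond("/my/page", ua);
        boolean flagged = looksCloaked(
                site, "Googlebot/2.1", "Mozilla/4.0 (compatible; MSIE 6.0)");
        System.out.println("flagged as cloaking: " + flagged);
    }
}
```

A server that ignores the user agent returns the same response for both visits and would not be flagged by this comparison.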
