On Apr 12, 2006, at 11:03 AM, Nathan Hamblen wrote:
> Yes, exactly! There may be other reasons to defer the session creation,
> but it will not fix this problem of Google finding perfectly accessible
> bookmarkable pages with a jsessionid appended to them, unless your
> robots.txt is able to exclude everything but bookmarkable links. Could
> we do that? It might be the best of all worlds.

> Until then, though (and I don't think that needs to go into 1.2),
> turning off rewriting is a perfectly good "workaround." It's not like
> other web app technologies have a better solution to this problem. They
> either won't give cookieless people a session or they abandon that
> session when the user browses back into static-URL land. (Or they bind
> to the IP? That's nuts, though.)

Heh, no, other technologies *do* have a better solution, which is simply not to create sessions until they are needed. They still append session ids, but only when necessary. So if a (cookie-less) user goes to a news site and reads 10 articles, he still doesn't have a session. Now he logs into his account. When he goes back to read more articles, the session id gets appended from then on. This does not affect search engines in any way.
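Here's roughly what I mean in plain Servlet API terms (the class names and URLs are made up for illustration): the session is only created at login, and encodeURL() only appends a jsessionid once a session actually exists.

    import java.io.IOException;
    import java.io.PrintWriter;
    import javax.servlet.http.*;

    public class ArticleServlet extends HttpServlet {
        protected void doGet(HttpServletRequest req, HttpServletResponse resp)
                throws IOException {
            // getSession(false) never creates a session as a side effect,
            // so cookie-less readers keep getting clean URLs
            HttpSession session = req.getSession(false);
            resp.setContentType("text/html");
            PrintWriter out = resp.getWriter();
            out.println(session == null ? "Hello, guest"
                    : "Hello, " + session.getAttribute("user"));
            // encodeURL() only appends ;jsessionid=... when a session exists
            // and the container hasn't confirmed cookie support yet
            out.println("<a href=\"" + resp.encodeURL("/article?id=42")
                    + "\">next article</a>");
        }
    }

    // in its own file:
    public class LoginServlet extends HttpServlet {
        protected void doPost(HttpServletRequest req, HttpServletResponse resp)
                throws IOException {
            // the one deliberate session creation point
            HttpSession session = req.getSession(true);
            session.setAttribute("user", req.getParameter("username"));
            resp.sendRedirect(resp.encodeRedirectURL("/articles"));
        }
    }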

When I brought this up originally, I understood that forms would require sessions. I'm okay with that for now. There are some sites out there that don't have site-wide forms, and even if they do, deferring session creation is still a good first step toward a fix. I think you mentioned before that one workaround for this form issue (which is usually just a site-wide search form) is possibly to use a separate technology, like a standalone servlet or something, just for that one form.
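Something like this, say (hypothetical names again): the search form submits to a plain servlet that never calls getSession(), so it can't mint a jsessionid for anyone.

    import java.io.IOException;
    import java.net.URLEncoder;
    import javax.servlet.http.*;

    public class SearchServlet extends HttpServlet {
        protected void doGet(HttpServletRequest req, HttpServletResponse resp)
                throws IOException {
            String q = req.getParameter("q");
            // no getSession() anywhere; just bounce to a bookmarkable
            // results URL that spiders and cookie-less users see alike
            resp.sendRedirect("/search/results?q="
                    + URLEncoder.encode(q == null ? "" : q, "UTF-8"));
        }
    }

The form itself is just <form action="/search" method="get">, with no Wicket involved.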

By the way, if you want to see a *perfect* implementation, look at Amazon. They generate sessions for every user except for search engine spiders. If you look at Google search results that include Amazon links, you'll see a simplified URL with no session id, but if you click on the link, Amazon redirects you to the same URL with session information.

Example:
results page:
http://www.google.com/search?hl=en&lr=&client=safari&rls=en&q=site%3Aamazon.com+j2ee+book&btnG=Search

first result:
http://www.amazon.com/exec/obidos/tg/detail/-/1861004656?v=glance

after session is added:
http://www.amazon.com/gp/product/1861004656/103-2816815-3928604?v=glance&n=283155

Notice they also set a cookie, so if you go back to Google and click on another link, they add the *same* session id back to the URL again (if the cookie is found). They also somehow know to assign a new session if you share the link - probably by tying the user's IP address to the session. That's normally very bad if you only use session ids in the URL, but when the cookie is the primary source, I guess it works well enough.
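You could fake that spider special-casing with a servlet filter. This is only my guess at the mechanism: the bot patterns and the wrapper here are assumptions, not Amazon's actual code.

    import java.io.IOException;
    import java.util.regex.Pattern;
    import javax.servlet.*;
    import javax.servlet.http.*;

    public class SpiderAwareFilter implements Filter {
        // crude User-Agent sniffing; real lists are longer and change often
        private static final Pattern BOT_UA =
                Pattern.compile("googlebot|slurp|msnbot", Pattern.CASE_INSENSITIVE);

        public void doFilter(ServletRequest request, ServletResponse response,
                             FilterChain chain) throws IOException, ServletException {
            HttpServletRequest req = (HttpServletRequest) request;
            final HttpServletResponse resp = (HttpServletResponse) response;
            String ua = req.getHeader("User-Agent");
            if (ua != null && BOT_UA.matcher(ua).find()) {
                // spiders get a response wrapper that disables URL rewriting,
                // so the URLs they index never carry a jsessionid
                chain.doFilter(req, new HttpServletResponseWrapper(resp) {
                    public String encodeURL(String url) { return url; }
                    public String encodeRedirectURL(String url) { return url; }
                });
            } else {
                chain.doFilter(req, resp);
            }
        }

        public void init(FilterConfig config) {}
        public void destroy() {}
    }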

> If we're able to generate (at runtime?) a robots.txt that guides a
> search engine through bookmarkable pages that don't open a session, we'd
> have a far better implementation than everyone else.

This is a cool idea, but it's something that can be, and has been, done manually since forever. Sites should already do this with any technology, not just Wicket. For example, you don't want search engine spiders getting redirected to your login page or trying to add items to their shopping carts =).
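And if anyone does want to generate it at runtime, it's only a few lines (the paths are invented for the example):

    import java.io.IOException;
    import java.io.PrintWriter;
    import javax.servlet.http.*;

    public class RobotsTxtServlet extends HttpServlet {
        protected void doGet(HttpServletRequest req, HttpServletResponse resp)
                throws IOException {
            resp.setContentType("text/plain");
            PrintWriter out = resp.getWriter();
            out.println("User-agent: *");
            // keep spiders out of the stateful parts of the app;
            // bookmarkable mounts stay crawlable by default
            out.println("Disallow: /app/");
        }
    }

Map it to /robots.txt in web.xml, and the Disallow list could be built from whatever the application knows about its non-bookmarkable mounts.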

With all that said, I don't think any of us expect this issue to be fixed in 1.2. The purpose of the discussion was to shed light on an issue that most of the core developers probably don't encounter.


