On Apr 12, 2006, at 11:03 AM, Nathan Hamblen wrote:
> Yes, exactly! There may be other reasons to defer the session creation,
> but it will not fix this problem of Google finding perfectly accessible
> bookmarkable pages with a jsessionid appended to them, unless your
> robots.txt is able to exclude everything but bookmarkable links. Could
> we do that? It might be the best of all worlds.
>
> Until then, though (and I don't think that needs to go into 1.2),
> turning off rewriting is a perfectly good "workaround." It's not like
> other web app technologies have a better solution to this problem. They
> either won't give cookieless people a session or they abandon that
> session when the user browses back into static-URL land. (Or they bind
> to the IP? That's nuts though.)
Heh, no, other technologies *do* have a better solution, which is
simply to not create sessions unless they are needed. They still
append session ids, but only when necessary. So if a (cookie-less)
user goes to a news site and reads 10 articles, he still doesn't have
a session. Now he logs into his account. When he goes back to read
more articles, the session id should be appended. This does not
affect search engines in any way.
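
To put that "only when necessary" behaviour in plain servlet terms (just a
sketch, not Wicket code; ArticleServlet and the /article path are made up):

import java.io.IOException;
import javax.servlet.ServletException;
import javax.servlet.http.*;

// Sketch: pages that don't need state never create a session, so anonymous
// readers and spiders never see a jsessionid in their links.
public class ArticleServlet extends HttpServlet {
    protected void doGet(HttpServletRequest req, HttpServletResponse resp)
            throws ServletException, IOException {
        // getSession(false) returns null instead of creating a session.
        HttpSession session = req.getSession(false);

        String nextArticle = "/article?id=" + req.getParameter("id");
        if (session != null) {
            // Only once a session exists (e.g. after login) does encodeURL
            // append ;jsessionid=... for cookie-less clients; with no session
            // it leaves the URL untouched anyway.
            nextArticle = resp.encodeURL(nextArticle);
        }
        resp.setContentType("text/html");
        resp.getWriter().println("<a href=\"" + nextArticle + "\">next article</a>");
    }
}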
When I brought this up originally, I understood that forms would
require sessions. I'm okay with that for now. There are some sites
out there that don't have site-wide forms. Even if they do,
deferring session creation is still a good first step to coming up
with a fix. I think you mentioned before that one workaround to this
form issue (which is usually just a site-wide search form) is to use a
separate technology, such as a standalone servlet, just for that one
form.
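
Something like that standalone servlet could be as small as this (again just
a sketch; SearchServlet, the /search mapping and the "q" parameter are
hypothetical):

import java.io.IOException;
import javax.servlet.ServletException;
import javax.servlet.http.*;

// Hypothetical servlet for a site-wide search form, mapped outside the
// Wicket application so submitting the form never creates a session.
public class SearchServlet extends HttpServlet {
    protected void doGet(HttpServletRequest req, HttpServletResponse resp)
            throws ServletException, IOException {
        String query = req.getParameter("q");
        // Deliberately never call req.getSession(): the results page stays
        // stateless and its URL stays clean for spiders and bookmarks.
        resp.setContentType("text/html");
        resp.getWriter().println("Results for: " + (query == null ? "" : query));
    }
}

The form itself is then just a plain HTML GET form pointing at /search.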
By the way, if you want to see a *perfect* implementation, look at
Amazon. They generate sessions for every user except for search
engine spiders. If you look at Google search results that include
Amazon links, you'll see a simplified URL with no session id, but if
you click on the link, Amazon will redirect you to the same URL with
session information.
Example:
results page: http://www.google.com/search?hl=en&lr=&client=safari&rls=en&q=site%3Aamazon.com+j2ee+book&btnG=Search
first result: http://www.amazon.com/exec/obidos/tg/detail/-/1861004656?v=glance
after session is added: http://www.amazon.com/gp/product/1861004656/103-2816815-3928604?v=glance&n=283155
Notice they also set a cookie, so if you go back to Google and
click on another link, they add the *same* session id back to the URL
again (if the cookie is found). They also somehow know to assign a
new session if you share the link - probably by tying the user's IP
address to the session. That's normally very bad if you only use
session ids in the URL, but when the cookie is the primary source, I
guess it works well enough.
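
I can only guess at how Amazon handles the spider part, but a crude
approximation would be a filter that flags known crawlers by User-Agent so
the rest of the app can skip session creation and the session-id redirect
for them (the bot list here is incomplete and purely illustrative):

import java.io.IOException;
import javax.servlet.*;
import javax.servlet.http.HttpServletRequest;

// Rough sketch of "no session for spiders": flag requests whose User-Agent
// looks like a known crawler so downstream code can avoid getSession(true)
// and the redirect that adds session info to the URL.
public class SpiderAwareFilter implements Filter {
    private static final String[] BOTS = { "Googlebot", "Slurp", "msnbot" };

    public void doFilter(ServletRequest request, ServletResponse response,
                         FilterChain chain) throws IOException, ServletException {
        HttpServletRequest req = (HttpServletRequest) request;
        String ua = req.getHeader("User-Agent");
        boolean spider = false;
        if (ua != null) {
            for (int i = 0; i < BOTS.length; i++) {
                if (ua.indexOf(BOTS[i]) != -1) {
                    spider = true;
                    break;
                }
            }
        }
        request.setAttribute("isSpider", Boolean.valueOf(spider));
        chain.doFilter(request, response);
    }

    public void init(FilterConfig config) {}
    public void destroy() {}
}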
> If we're able to generate (at runtime?) a robots.txt that guides a
> search engine through bookmarkable pages that don't open a session,
> we'd have a far better implementation than everyone else.
This is a cool idea, but it is something that can be, and has been,
done manually forever. Sites should already do this with any
technology, not just Wicket. For example, you don't want search
engine spiders getting redirected to your login page or trying to add
items to their shopping carts =).
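
For completeness, the runtime-generated robots.txt from the quoted idea
could be nothing more than a tiny servlet mapped to /robots.txt (the /app/
prefix below is a placeholder for whatever part of the site opens sessions,
not a real mount point):

import java.io.IOException;
import java.io.PrintWriter;
import javax.servlet.ServletException;
import javax.servlet.http.*;

// Hypothetical servlet mapped to /robots.txt so the rules can be generated
// at runtime from whatever the application knows about its own URL space.
public class RobotsTxtServlet extends HttpServlet {
    protected void doGet(HttpServletRequest req, HttpServletResponse resp)
            throws ServletException, IOException {
        resp.setContentType("text/plain");
        PrintWriter out = resp.getWriter();
        out.println("User-agent: *");
        // Keep spiders out of the stateful part of the site (forms, carts,
        // anything that opens a session)...
        out.println("Disallow: /app/");
        // ...while bookmarkable pages outside /app/ stay crawlable.
    }
}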
With all that said, I don't think any of us expect that this issue
will be fixed in 1.2. The purpose of the discussion was to bring to
light an issue that most of the core developers probably don't
encounter.