On 1/9/07, Johan Compagner <[EMAIL PROTECTED]> wrote:
>
> > And if you were on version 10 and rolled back to 6, you would never
> > be able to get 7 back.
>
> This is how it should be (and was) in the first place. I applied a
> little patch yesterday that actually deletes those higher versions
> (together with a patch that only writes a version when that version
> hadn't been written before yet).


No, see the reply from Martijn (and mine again); this was a bug in the
previous system that the current implementation fixed!


Ok, that case wasn't clear to me. That totally sucks though, as if I
were to roll back that change, I'd also have to roll back the change
that prevents a write when the version already exists. I would hate
that, as it is really inefficient. An option would be that if you roll
back from 10 to 5 and then have a version change on 5, the next
version would still be 11 somehow.
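To make that option concrete, here is a rough Java sketch (purely illustrative, not actual Wicket code; all names are invented) of a version counter that stays monotonic across rollbacks, so rolling back from 10 to 5 never reuses the numbers 6..10:

```java
// Hypothetical sketch: keep version numbering monotonic across rollbacks,
// so that versions written after a rollback never collide with old ones.
public class VersionCounter {
    private int highestVersion; // highest version ever written
    private int currentVersion; // version the user currently sees

    public VersionCounter(int start) {
        this.highestVersion = start;
        this.currentVersion = start;
    }

    /** Roll back to an older version without invalidating newer ones. */
    public void rollbackTo(int version) {
        if (version > currentVersion) {
            throw new IllegalArgumentException("can only roll back");
        }
        currentVersion = version;
    }

    /** A new change after a rollback gets a fresh, never-used number. */
    public int nextVersion() {
        currentVersion = ++highestVersion;
        return currentVersion;
    }

    public int getCurrentVersion() {
        return currentVersion;
    }

    public static void main(String[] args) {
        VersionCounter c = new VersionCounter(10);
        c.rollbackTo(5);
        // the next change gets version 11, so versions 6..10 stay reachable
        System.out.println(c.nextVersion()); // prints 11
    }
}
```

The trade-off is that the store must then tolerate gaps in the version sequence for a page.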

> When we rollback, we should invalidate newer versions of the page
> relative to the version that is rolled back to anyway. I think this
> was always the idea, and was how the page version manager was
> implemented.


No, we shouldn't do that.
A browser may be able to go forward again even in situations other
than the current one. ALL pages should stay reachable so we are
bulletproof in all situations.

So, pushing the back button to a page and then pushing refresh is the
only case, right? How much does that suck.

> Partially it is just fixing something that is broken now. We could
> decide to write full versions rather than one full page and a series
> of changes. However, as this is something we do for every request
> (though we might do it async, it is still for every request), it
> makes sense to optimize as best we can. From testing on my machine,
> a typical save of a page with one or two versions costs between 5 and
> 20 milliseconds (which obviously increases as more versions get added
> to the page; on my system here it is easily up to 50 milliseconds
> after 10 clicks, and that is on a warmed-up system where the
> introspection caches are filled properly; the first saves are up to
> 500 milliseconds!). Writing just the changes would be 1 - 2
> milliseconds. On a system under heavy load, I believe this can make
> quite a difference. Note that it's not only processing time, but also
> the space it takes in the FS (minor) and the time it occupies the
> filesystem.
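The size difference behind those numbers is easy to reproduce. A minimal sketch (not Wicket code; `FullPage` and `ChangeSet` are invented stand-ins) comparing the serialized size of a full page snapshot against a small change set:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Illustrative only: compare serialized sizes of a full "page" snapshot
// versus a small change set, to show why writing deltas is cheaper.
public class SaveCostSketch {

    static byte[] serialize(Serializable o) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(o);
        }
        return bos.toByteArray();
    }

    /** Stand-in for a full component tree. */
    static class FullPage implements Serializable {
        byte[] state = new byte[20_000];
    }

    /** Stand-in for a single recorded change. */
    static class ChangeSet implements Serializable {
        String componentPath = "tabs:panel:label";
        String newValue = "selected";
    }

    public static void main(String[] args) throws IOException {
        int full = serialize(new FullPage()).length;
        int delta = serialize(new ChangeSet()).length;
        System.out.println("full=" + full + " bytes, delta=" + delta + " bytes");
    }
}
```

The delta is a small fraction of the full snapshot, which translates directly into less serialization CPU time and less filesystem traffic per request.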
>
> > And do remember that most clustering solutions do exactly the same,
> > but then for the complete session.
> > We do it only for changed pages!
>
> The fact that other solutions are not optimized doesn't mean we don't
> have to be:


I get a déjà-vu of the discussion on the Wicket IRC channel yesterday.
Don't try to optimize upfront, only when there is really a problem.
And I don't believe there is a problem currently; I won't believe it
until I see it with my own eyes as a hotspot in YourKit.

Well, I do see it with my own eyes. I have many saves that take about
50 millis (pages with around 10 versions, which isn't uncommon in our
system, which uses a lot of tabs and other component replacements).
If that test on whether the version was already saved isn't done, most
pages get 2 saves per request (though I hope this might be a fixable
issue, it's something I'm seeing now), meaning it costs 100 millis.
That's a second for 10 requests. And that's *a lot* if you ask me, as
it not only uses the processor, but also the file system during that
period (which of course may be shared with static resources,
databases, etc). Furthermore, about 3/4 (!!!) or more of the saves
aren't necessary at all. One optimization that would bring this down
drastically is to leave my check in place for new pages/first
versions, and always override subsequent versions. But even then,
every time you click a (normal) link that doesn't result in a version
change, you would still save on every request. And that is just a
blunt waste of resources.
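The "don't re-save an unchanged version" check being discussed can be sketched in a few lines (hypothetical names, not the actual Wicket store API): track the last version written and skip the disk write entirely when a plain link click produces no new version.

```java
// Illustrative sketch of skipping redundant saves: only hit the disk
// when the page version actually changed since the last write.
public class DirtyTrackingStore {
    private int savedVersion = -1; // last version written to disk
    private int writes = 0;        // counts real disk writes

    /** Save only if this version was not written before. */
    public void saveIfChanged(int pageVersion, byte[] data) {
        if (pageVersion == savedVersion) {
            return; // no version change: skip the filesystem hit
        }
        savedVersion = pageVersion;
        writes++;   // stand-in for the actual write to the FileStore
    }

    public int getWrites() {
        return writes;
    }

    public static void main(String[] args) {
        DirtyTrackingStore store = new DirtyTrackingStore();
        store.saveIfChanged(0, new byte[0]); // first save: writes
        store.saveIfChanged(0, new byte[0]); // same version: skipped
        store.saveIfChanged(1, new byte[0]); // new version: writes
        System.out.println(store.getWrites()); // prints 2
    }
}
```

If, as estimated above, three quarters of the saves hit an unchanged version, a check like this removes three quarters of the filesystem traffic at the cost of one comparison.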

But I agree with you that constantly saving a page with full version
info is a waste.
Making the system so much more complex to try to gain some speed is,
in my eyes, also a waste though.

If we can keep it simple, all the better. But imo this is such a
central thing that it has to be optimized as well as we can. I
expected the second level cache session store to have a positive
overall impact on the scalability of Wicket, but in its current state
I'm not so sure, as the hit on processor and FS usage could be higher
than the gain in memory.

So what I still think we should do is set the max page versions to 1,
so that only one version per page can be kept.
Then when the back button is hit once, nothing is read back in; only
when it is hit twice does something need to be read in.
And we hold much less in memory and save much less to disk. It's a
simple optimization that only has a win/win side.
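The "one version in memory" idea can be sketched as follows (illustrative names, not Wicket API): the page keeps at most its single previous state, so one back-button press is served from memory and only the second press falls through to the disk store.

```java
// Illustrative sketch: keep only the latest undo step with the page, so
// a single back-button press never touches the disk.
public class SingleUndoPage {
    private String currentState = "v0";
    private String previousState = null; // at most one version kept

    public void applyChange(String newState) {
        previousState = currentState;
        currentState = newState;
    }

    /** One step back is free; further steps must come from disk. */
    public String back() {
        if (previousState != null) {
            String restored = previousState;
            previousState = null; // the single in-memory slot is spent
            currentState = restored;
            return restored;
        }
        return loadFromDisk();
    }

    /** Stand-in for a read from the FileStore. */
    private String loadFromDisk() {
        return "loaded-from-disk";
    }

    public String getCurrentState() {
        return currentState;
    }
}
```

This keeps the session footprint at roughly two page states per page while still making the common single-back case a pure in-memory operation.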

If I understand correctly, that's a smart trick. Though imo it is more
a hack/partial solution than a robust one. What we really should fix
with number-one priority is the fact that versions and pages are
stored in different places now. I think they should be handled by one
store somehow.

> I think we should only do this when it is really shown to be a
> bottleneck!
>
> Typically I would go for the premature-optimization argument. But in
> this case the penalty is pretty obvious and easy to measure.


Then I first want to see it as a hotspot in YourKit.
I will set something up.

Sounds good. Be sure to test with a production app :)

> > The solution I propose and like is because of 2 things; the first
> > one is minor:
> > 1> we save much less, just the page at that time, not the complete
> > version history.
> > 2> the page in memory is much smaller, again without the complete
> > history (because that is already on disk).
> >
> > Especially the last point is nice to have. The drawback is that
> > when the back button is used, the page will always come from disk,
> > which is now not the case because most of the time the page can
> > reconstruct its previous version itself in memory.
>
> That doesn't have to be the case, as we don't have to write
> immediately. We could have an overflow mechanism with a combined rule
> so that it doesn't write until we have, say, 5 versions, as long as
> there is enough available memory. I'm starting to think whether it
> wouldn't be


But where is the page being kept if it is not the active page?
When a page is not the active page anymore, it has to be saved.
So you can't say you only have to cache it.
And 5 page versions of the active page is a waste. That eats session
space that is not really needed, because it is already cached.

That would/could just depend on the caching engine if we were to go
that far. If you have enough free memory, keep it there; why not?
That's the idea behind most caching engines: only overflow when
needed.

But ok, a concrete thing we can do here is keep 5 (or so) versions of
the same page in memory, and when you recognize that a new page is put
there, you flush them (maybe except for the last).
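A minimal sketch of that buffering idea (names invented for illustration, not actual Wicket code): buffer up to N versions of the active page in memory, and flush them to the backing store only when a different page becomes active or the buffer overflows.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch: buffer change sets of the active page in memory
// and flush to the store only on page switch or buffer overflow.
public class VersionBuffer {
    private final int capacity;
    private final List<String> buffered = new ArrayList<>();
    private final List<String> flushed = new ArrayList<>();
    private int activePageId = -1;

    public VersionBuffer(int capacity) {
        this.capacity = capacity;
    }

    public void addVersion(int pageId, String changeSet) {
        if (pageId != activePageId) {
            flush();              // a new page became active
            activePageId = pageId;
        }
        buffered.add(changeSet);
        if (buffered.size() > capacity) {
            flush();              // too many versions buffered
        }
    }

    private void flush() {
        flushed.addAll(buffered); // stand-in for writing to the FileStore
        buffered.clear();
    }

    public int bufferedCount() {
        return buffered.size();
    }

    public int flushedCount() {
        return flushed.size();
    }
}
```

With this shape, a series of Ajax updates on one page causes zero disk writes until the user navigates away, at which point the whole batch goes out in one write.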

We should strive to reduce session usage, and that means holding only
the active page and as low a number of versions in the version manager
as possible.

We have to find the best balance between all resources, not just
memory. If you sacrifice too much processor and FS for the sake of
memory, your scalability will suffer just the same (much like the
wrong assumption that client-side state saving is always better for
scalability than storing state in the session).

Of course we could extract the version manager completely from the
page and make it a single session-level thing for all pages, one that
has change sets for pages and also saves those to the FileStore of the
SecondLevelPageMap (so the VersionManager is another kind of second
level cache, or we merge those 2).

That would be an option.
So we would only save a page when it falls out of the PageMap (not
active anymore), and for the rest the "global" version manager would
save the change sets.
And the page wouldn't hold a hard (non-transient) reference to it
anymore.

Something like that sounds good to me yeah.

> better to use an actual cache manager for this - like EHCache - so
> that we don't have to write this ourselves and have something that is
> easy to configure and such.

I don't see the real value in this at all.
It only complicates things, and we get much too much garbage again for
what is now a few lines of code.
And it won't solve anything, because a Page and its versions currently
form a single object, and with a cache manager that would still be a
single object.

You're probably right.

Eelco
