Re: [OT] pagemap entries and versioning

Jonathan Locke Fri, 02 Feb 2007 18:11:02 -0800


yeah, i agree that it's best to focus on making our serialization more
efficient as a first priority, then see if there's anything else worth
doing.  i'm not 100% convinced that it's actually impossible to separate
out the versioned components, because we could record some kind of
virtual object pointer to substitute for the reference, which could be
fixed up later (basically implement a very custom serialization mechanism
that allows object references to span files).  but at that point, i think
we'd be writing our own highly wicket-specific serialization code from
scratch, and it seems like just doing efficient serialization with
something like xstream as a starting point might be good enough to
improve things dramatically.
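
a minimal sketch of the "virtual object pointer" idea, using java's standard
writeReplace()/readResolve() serialization hooks.  the names
VirtualPointerSketch, Shared, Pointer, Holder and the in-memory STORE map are
all hypothetical stand-ins (none of this is wicket api), and a real version
would keep the replaced object in a separate file rather than in a static map:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

public class VirtualPointerSketch {
    // Hypothetical stand-in for "another file": objects kept outside the stream.
    static final Map<Integer, Shared> STORE = new HashMap<>();

    // An object we want to keep out of the main stream (assumed example type).
    static class Shared implements Serializable {
        final int id;
        final String payload;
        Shared(int id, String payload) { this.id = id; this.payload = payload; }

        // Invoked by ObjectOutputStream: write a small pointer instead of this object.
        private Object writeReplace() {
            STORE.put(id, this);
            return new Pointer(id);
        }
    }

    // The "virtual object pointer" that actually lands in the stream.
    static class Pointer implements Serializable {
        final int id;
        Pointer(int id) { this.id = id; }

        // Invoked by ObjectInputStream: fix the pointer up to the real object.
        private Object readResolve() { return STORE.get(id); }
    }

    // Something that references the shared object, like a page referencing a component.
    static class Holder implements Serializable {
        final Shared shared;
        Holder(Shared shared) { this.shared = shared; }
    }

    static byte[] write(Object o) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(o);
        }
        return bytes.toByteArray();
    }

    static Object read(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Shared shared = new Shared(7, "versioned component state");
        Holder copy = (Holder) read(write(new Holder(shared)));
        // The pointer was resolved back to the very same instance.
        System.out.println(copy.shared == shared); // prints "true"
    }
}
```

the stream for Holder contains only the Pointer, so the Shared payload never
gets written into it.  this only shows the fix-up mechanism; it says nothing
about whether such a split is workable for pages and components.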


obviously we should be able to make a binary serialization of wicket
components radically more efficient than the default java version,
particularly in terms of size.  one off-the-cuff idea of how to do this
is to maintain, separately, a complete map of the classes and fields that
the serializer serializes.  then, using that map, we should be able to
get rid of object headers, and if we're willing to say we don't care
about serialization version compatibility (an incompatible class or
field change would cause the system to drop all your pagemaps), we could
even drop field names and types and just get down to the raw data.  and
i'd bet there's actually not so much of that.  probably a tiny fraction
of the size.  since this doesn't affect clustering, the only issue would
be keeping a persistent version of this class/field map so that restarts
would only dump your backbutton data if it really needs to be dumped.
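
a rough sketch of the class/field map idea, under heavy assumptions: the
names FieldMapSketch, Entity, register, write and read are made up for
illustration, only int and long fields are handled, and the map lives in
memory instead of being persisted.  it writes a one-byte class id followed
by raw field values, with no headers, field names or types:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.lang.reflect.Constructor;
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class FieldMapSketch {
    // Hypothetical payload class, standing in for a component or model.
    static class Entity implements Serializable {
        long id = 42L;
        int version = 3;
    }

    // The separately kept map: list index = class id, plus a fixed field order per class.
    static final List<Class<?>> CLASSES = new ArrayList<>();
    static final Map<Class<?>, List<Field>> FIELDS = new HashMap<>();

    static void register(Class<?> c) {
        if (FIELDS.containsKey(c)) return; // already registered
        List<Field> fields = new ArrayList<>();
        for (Field f : c.getDeclaredFields()) {
            if (Modifier.isStatic(f.getModifiers()) || f.isSynthetic()) continue;
            f.setAccessible(true);
            fields.add(f);
        }
        // Sort by name so the order does not depend on reflection order.
        fields.sort(Comparator.comparing(Field::getName));
        CLASSES.add(c);
        FIELDS.put(c, fields);
    }

    static byte[] write(Object o) throws Exception {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        out.writeByte(CLASSES.indexOf(o.getClass())); // class id only, no header
        for (Field f : FIELDS.get(o.getClass())) {
            if (f.getType() == long.class) out.writeLong(f.getLong(o));
            else if (f.getType() == int.class) out.writeInt(f.getInt(o));
            else throw new IOException("unsupported field type: " + f);
        }
        out.flush();
        return bytes.toByteArray();
    }

    static Object read(byte[] data) throws Exception {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
        Class<?> c = CLASSES.get(in.readByte());
        // This sketch needs a no-arg constructor; built-in serialization does not.
        Constructor<?> ctor = c.getDeclaredConstructor();
        ctor.setAccessible(true);
        Object o = ctor.newInstance();
        for (Field f : FIELDS.get(c)) {
            if (f.getType() == long.class) f.setLong(o, in.readLong());
            else if (f.getType() == int.class) f.setInt(o, in.readInt());
            else throw new IOException("unsupported field type: " + f);
        }
        return o;
    }

    static int javaSerializedSize(Object o) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(o);
        }
        return bytes.size();
    }

    public static void main(String[] args) throws Exception {
        register(Entity.class);
        Entity e = new Entity();
        e.id = 99L;
        byte[] compact = write(e);
        Entity copy = (Entity) read(compact);
        System.out.println("compact bytes: " + compact.length); // 1 + 8 + 4 = 13
        System.out.println("default java bytes: " + javaSerializedSize(e));
        System.out.println("round trip id: " + copy.id);
    }
}
```

the data is only readable against the exact same map, which is the
compatibility trade-off described above: any change to a registered class
or its fields would mean throwing the stored data away.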

i'm actually very curious how xstream works, assuming it's really
compatible with java serialization.  i mean, how do they create objects
to initialize with field data without actually constructing them with
new?  i always figured that was unique to the built-in serialization.
maybe xstream is not completely compatible that way?
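
for reference, a small sketch that shows the built-in behavior in question:
java deserialization does not run the constructor of a Serializable class
(it only runs the no-arg constructor of the nearest non-serializable
superclass, here Object).  NoConstructorSketch and Widget are hypothetical
names, and the sketch says nothing about what xstream itself does:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class NoConstructorSketch {
    // Counts how often Widget's constructor actually runs.
    static int constructorCalls = 0;

    static class Widget implements Serializable {
        int value;
        Widget(int value) {
            constructorCalls++;
            this.value = value;
        }
    }

    // Serialize to a byte array and read the object back.
    static Widget roundTrip(Widget w) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(w);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (Widget) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Widget copy = roundTrip(new Widget(5));
        // The copy has its field data, but the constructor ran only for the original.
        System.out.println(copy.value + " " + constructorCalls); // prints "5 1"
    }
}
```

any library that wants to match this has to allocate instances without
calling a constructor, or else require a usable constructor on every class.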


Johan Compagner wrote:
> 
> On 2/3/07, Jonathan Locke <[EMAIL PROTECTED]> wrote:
>>
>>
>>
>> of course.  right you are.  you could fix parenting.  but anon classes
>> that reference the
> 
> 
> this would be a problem in 2.0!
> components need to always keep a reference to the parent.
> Else the developer can't say reattach() to the component, because then
> the component can't add itself again to the parent. So really getting
> rid of it is hard.
> 
> It is not just anon classes. i can do this:
> 
> MyTabPanel
> 
> Component child1 = new MyChild1(MyTabPanel.this,"child");
> Component child2 = new MyChild2(MyTabPanel.this,"child");
> 
> and keep the child as a reference
> then later on the other tab must be shown
> 
> child1.attach();
> 
> This will cause child2 to be removed (and put in the undo map),
> but the page still has the reference to the child itself (so it can
> say attach() again)
> 
> So separating changes is just not possible.
> 
> 
>> page... that would be a killer for this whole idea.  shucks, it was too
>> good to be true...
>>
>> so you are suggesting getting rid of versioning entirely when the second
>> level cache is running and just saving the whole page each time?
>> (sounded like it)
> 
> 
> We already do that now.
> We save  all page versions to disk: pageid:pageversion
> 
> so my plan was to only have 1 extra version (besides the page itself)
> in the undo manager so that the page in the session is pretty light.
> And one backbutton is very quick (no disk read) and that is the most used
> behavior i guess.
> So it looks like the best trade off
> 
> We should really focus on getting the serialization as quick as possible
> and the resulting size as small as possible
> 
> johan
> 
> 
> Johan Compagner wrote:
>> >
>> > just one thing.
>> > It is not possible to really separate changes from their page,
>> > because changes (mostly components) always have their parent (so you
>> > can re-attach them)
>> >
>> > And we have no idea which components or models also have references
>> > to the page themselves. (anon classes)
>> >
>> > so i don't see any real value in separating the changes from a page.
>> > Because what would you do with that?
>> > save only changes? But then you will save the page anyway.
>> > The page itself will be smaller of course. But with the second level
>> > cache, if we implement it right,
>> > we only really have to keep 1 page version (maybe not even that, but
>> > that is what i would do)
>> >
>> > johan
>> >
>> >
>> > On 2/2/07, Jonathan Locke <[EMAIL PROTECTED]> wrote:
>> >>
>> >>
>> >>
>> >> before johan complains, i just realized there's a flaw in my little
>> >> plan.  you still have to undo changes to pages that are not
>> >> reconstructed by custom IPageMapEntry implementations because the page
>> >> is in the state of the highest ordinal in the page map because that
>> >> was the last one accessed.  even so, a little more logic here should
>> >> correct for that.
>> >>
>> >>
>> >> Jonathan Locke wrote:
>> >> >
>> >> >
>> >> > A part of this whole discussion of serializing page map entries is
>> >> > also the current open bug that Eelco submitted that we should make
>> >> > page versions separate from pages.  This came up over at Diva
>> >> > espresso a few minutes ago when Eelco and I were chatting and I had
>> >> > an interesting and very elegant little idea that could sort this out
>> >> > quite nicely.  Not only would it definitely make things more
>> >> > efficient, it would also be a more elegant solution that fixes an
>> >> > unreported bug.
>> >> >
>> >> > first, i think that IPageMapEntry.getNumericId is really more like
>> >> > getOrdinal since page ids will always increment in a page map.
>> >> > but regardless, to implement a page version based on another page
>> >> > in the page map (which might also be a page version), we can use a
>> >> > simple little container implementing IPageMapEntry to hold the
>> >> > base pagemap entry id and the changes to apply to that page.
>> >> > this container might be just an anonymous class, but let's give it
>> >> > a name here for clarity:
>> >> >
>> >> > class PageVersion implements IPageMapEntry
>> >> > {
>> >> >    // Identifier of page map entry to apply these changes to
>> >> >    // If we're using ordinals, we don't need this field at all
>> >> >    // because the page this version is based on will be our own
>> >> >    // ordinal - 1.
>> >> >    int basePageMapEntryNumericId;
>> >> >
>> >> >    // Don't recall the exact base class name for change entries
>> >> >    // in the versioning code, but this is the list of changes to apply
>> >> >    List<Change> changes;
>> >> >
>> >> >    Page getPage()
>> >> >    {
>> >> >       // Get previous page (possibly recursing)
>> >> >       final Page page = pageMap.get(basePageMapEntryNumericId);
>> >> >
>> >> >       // Apply changes to page
>> >> >       page.applyChanges(changes);
>> >> >       return page;
>> >> >    }
>> >> > }
>> >> >
>> >> > this should work nicely and actually this fixes an existing bug,
>> >> > because right now if you provide a custom implementation of
>> >> > IPageMapEntry to reconstruct a page, that page is probably not
>> >> > versionable.  in this case it would be, because the recursion would
>> >> > bottom out and reconstruct that page.
>> >> >
>> >> >
>> >> > igor.vaynberg wrote:
>> >> >>
>> >> >> another idea to optimize serialization state from jon and i
>> >> >>
>> >> >> allow an easy way to override model serialization
>> >> >>
>> >> >> a simple example:
>> >> >>
>> >> >> class EntityModel extends LoadableDetachableModel {
>> >> >>    private long id;
>> >> >>    //standard junk
>> >> >> }
>> >> >>
>> >> >> now when this is serialized you get a bunch of junk like the class
>> >> >> header, etc, but all you really care about is that single long id
>> >> >> field. so what if you have
>> >> >>
>> >> >> interface ModelSerializationCodec {
>> >> >>    boolean supportsModel(Class<? extends IModel> modelClass);
>> >> >>    void writeModel(ObjectOutputStream s, IModel model) throws IOException;
>> >> >>    IModel readModel(ObjectInputStream s) throws IOException;
>> >> >> }
>> >> >>
>> >> >> class ModelSerializationCodecRegistry {
>> >> >>    // id 0 means "no codec" (see the codecId!=0 check below)
>> >> >>    private ModelSerializationCodec[] codecs = new ModelSerializationCodec[255];
>> >> >>    void registerCodec(ModelSerializationCodec codec) {...}
>> >> >>    ModelSerializationCodec codecForId(byte id) {...}
>> >> >>    byte codecIdForClass(Class<? extends IModel> modelClass) {...}
>> >> >> }
>> >> >>
>> >> >> Component.writeObject(ObjectOutputStream oos) {
>> >> >> // or instead of overriding component.writeobject we can put model
>> >> >> // into a model holder object that has this logic
>> >> >> // or even wrap the model in the model holder conditionally when the
>> >> >> // model is set, only if there is a codec
>> >> >>
>> >> >> // i suppose this can also be in a special outputstream we use so it
>> >> >> // doesnt have to be in the component
>> >> >> // we already have a few of these
>> >> >>   ...
>> >> >> // write out model
>> >> >> byte codecId=registry.codecIdForClass(model.getClass());
>> >> >> oos.writeByte(codecId);
>> >> >>
>> >> >> if (codecId!=0) {
>> >> >>    registry.codecForId(codecId).writeModel(oos, model);
>> >> >> }
>> >> >> }
>> >> >>
>> >> >> class EntityModelCodec implements ModelSerializationCodec {
>> >> >>    // assumes EntityModel.id is accessible to the codec
>> >> >>    boolean supportsModel(Class<? extends IModel> c) {
>> >> >>       return EntityModel.class.equals(c);
>> >> >>    }
>> >> >>    void writeModel(ObjectOutputStream s, IModel model) throws IOException {
>> >> >>       s.writeLong(((EntityModel)model).id);
>> >> >>    }
>> >> >>    IModel readModel(ObjectInputStream ois) throws IOException {
>> >> >>       EntityModel em=new EntityModel();
>> >> >>       em.id=ois.readLong();
>> >> >>       return em;
>> >> >>    }
>> >> >> }
>> >> >>
>> >> >> so now instead of serializing the entire model instance you can
>> >> >> just write out the fields you need, and because this ability is
>> >> >> outside the model you can write out simple primitives. the downside
>> >> >> is that for any model that doesnt have a codec the size of output
>> >> >> takes an extra hit - which is the codec id byte. the question is how
>> >> >> many bytes are in the extra junk like the class header - this ratio
>> >> >> will determine if this is generally worth doing.
>> >> >>
>> >> >> the only trick here is to keep the codecid byte consistent across
>> >> >> the cluster. we can probably use initializers to do that for
>> >> >> components that come in jars, just add a registerModelCodecs to the
>> >> >> initializer and also allow the same from application.init()
>> >> >>
>> >> >> does this make sense? what do you guys think?
>> >> >>
>> >> >> -igor
>> >> >>
>> >> >>
>> >> >
>> >> >
>> >>
>> >> --
>> >> View this message in context:
>> >>
>> http://www.nabble.com/optimizing-serialized-state%3A-model-serialization-codecs-idea-tf3140882.html#a8775862
>> >> Sent from the Wicket - Dev mailing list archive at Nabble.com.
>> >>
>> >>
>> >
>> >
>>
>> --
>> View this message in context:
>> http://www.nabble.com/optimizing-serialized-state%3A-model-serialization-codecs-idea-tf3140882.html#a8778010
>> Sent from the Wicket - Dev mailing list archive at Nabble.com.
>>
>>
> 
> 

-- 
View this message in context: 
http://www.nabble.com/optimizing-serialized-state%3A-model-serialization-codecs-idea-tf3140882.html#a8778788
Sent from the Wicket - Dev mailing list archive at Nabble.com.
