On Sun, Sep 20, 2009 at 6:10 AM, Martin Aspeli <optilude+li...@gmail.com> wrote:
> I'm working on a package (plone.app.textfield) that's meant to solve the
> common use case in Plone whereby we have a rich text field that's got a
> "raw" value (a unicode string) with a MIME type (e.g. text/html or
> text/structured), which is transformed to HTML on output (even HTML
> input is transformed, because we do some markup tidying and stripping).
> We're trying to implement this as efficiently as possible for the most
> common use case: the raw value is read infrequently (on the edit screen,
> basically); transformed value is read frequently (every time the content
> item is viewed).
> The approach we've taken is to store the mime type, the raw input and
> the transformed output values in a RichTextValue object that's not
> IPersistent (it derives from 'object' only) but knows its parent, via a
> __parent__ pointer. This avoids a separate _p_jar so e.g. the object
> isn't loaded or cached separately. We use the _p_changed protocol to
> notify the parent when the value is changed.
> Furthermore, we store the raw value in a blob, since it can be
> relatively big and is read/written infrequently. For performance, we
> store the transformed output HTML on the object in a regular string, on
> the assumption that it's almost always read when the object is used
> (i.e. on its view).
So the raw value is a separate database object, since it is using a
blob, which is a separate database object. That contradicts what you
say 2 paragraphs up.
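For reference, my reading of the non-blob part of that arrangement is roughly the sketch below. It is only an illustration: Content is a made-up stand-in for the persistent content item (a plain class here, so the snippet runs without ZODB), and the attribute names on RichTextValue are guesses, not the actual plone.app.textfield code.

```python
class Content(object):
    """Hypothetical stand-in for a persistent content item."""
    _p_changed = False


class RichTextValue(object):
    """Plain (non-persistent) value stored inside its parent's record."""

    def __init__(self, parent, raw, mime_type, output):
        self.__parent__ = parent
        self._raw = raw
        self.mime_type = mime_type
        self.output = output

    @property
    def raw(self):
        return self._raw

    @raw.setter
    def raw(self, value):
        self._raw = value
        # The value has no _p_jar of its own, so it flags the parent
        # as changed and relies on the parent's record being rewritten.
        self.__parent__._p_changed = True


item = Content()
item.text = RichTextValue(item, u"<p>hi</p>", "text/html", u"<p>hi</p>")
item.text.raw = u"<p>bye</p>"
```

Everything on such a value is pickled as part of the parent's record; anything held in a blob is not.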
> I have a couple of questions about the performance and behaviour of this:
> 1) When a value is extracted from the request in the edit widget, it
> is used to construct a RichTextValue, which is passed back for
> validation etc. This means writing to a blob since setting the 'raw'
> attribute writes to a blob. It is possible that validation will fail and
> so the object will never be persisted (set onto its parent object). Is
> it bad for performance to write a "temporary" blob like this?
> 2) When a value is edited, we currently create a new RichTextValue
> object and replace the old one with it. Hence, we get a new blob. Would
> it be better to re-use the same blob and write a new value to it?
All other things being equal, it is better to update an existing
object rather than creating a new one. Objects consume index space,
and getting rid of unused objects via GC is far more expensive than
getting rid of old revisions via packing. Also, keeping the same
object makes auditing changes sane, because you can use the history
> 3) If the parent object is copied (e.g. via manage_cut/manage_paste in
> Zope 2), will the blob be copied as well, or is this something we need to
> implement e.g. with event handlers for IObjectCopiedEvent? Bear in mind
> that the blob is an attribute of a simple class (RichTextValue) which in
> turn is an attribute of a persistent content object. The RichTextValue
> has a __parent__ pointer to the content object.
It all depends on how the copy/paste is done. If it's via database
export and import, I believe it will work without extra effort. I
suggest writing a test. :)
I suspect a blob isn't buying you anything here. In fact, I suspect a
simple persistent object with a string value will serve you better.
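A minimal sketch of what I mean, with made-up names (RawText and update are not existing APIs). The raw text lives in a small persistent object, which gets its own database record and is only loaded when its attributes are touched, and an edit updates that object in place instead of replacing it. The import fallback is there only so the snippet runs where the persistent package is not installed.

```python
try:
    from persistent import Persistent
except ImportError:  # fallback so the sketch runs without ZODB installed
    Persistent = object


class RawText(Persistent):
    """Hypothetical holder for the raw text, stored as its own record."""

    def __init__(self, value=u""):
        self.value = value


class RichTextValue(object):
    """Plain value; 'output' is stored in the parent's record."""

    def __init__(self, raw, mime_type, output):
        self._raw = RawText(raw)
        self.mime_type = mime_type
        self.output = output

    @property
    def raw(self):
        return self._raw.value

    def update(self, raw, output):
        # Reuse the existing RawText object rather than creating a new one.
        self._raw.value = raw
        # 'output' lives in the parent's record, so the caller still has
        # to mark the parent as changed for this part to be saved.
        self.output = output


value = RichTextValue(u"some *text*", "text/structured", u"<p>some text</p>")
value.update(u"other *text*", u"<p>other text</p>")
```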
> 4) Speaking of that __parent__ pointer: if the content object is
> copied, is that going to point to the old instance?
Again, that depends on your copy algorithm.
> If so, we presumably
> have to fix the reference up in an event handler? Is there a better way
> to make an object that doesn't have its own _p_jar but doesn't need a
> reference to the parent? I suppose this isn't really any different from
> a folder structure in Zope 3 where each child has a __parent__ pointer
> to its parent. When the parent is copied, what happens to those
> __parent__ pointers?
I believe the copying algorithm is aware of them and handles them properly.
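To illustrate the general idea only: the snippet below uses copy.deepcopy from the standard library, not Zope's copy machinery, and Folder and Child are made-up classes. When a whole object graph is copied in one pass, a back-reference that points inside the graph ends up pointing at the copy, not at the original.

```python
import copy


class Child(object):
    def __init__(self, parent):
        self.__parent__ = parent


class Folder(object):
    def __init__(self):
        self.child = Child(self)


original = Folder()
clone = copy.deepcopy(original)

# The copied child points at the copied folder, not the original one.
points_at_clone = clone.child.__parent__ is clone
```

Whether your copy/paste code behaves the same way is exactly what a test should check.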