On 18.08.13 18:34, Jim Fulton wrote:
On Sun, Aug 18, 2013 at 12:17 PM, Christian Tismer <tis...@stackless.com> wrote:
We get a medication prescription database in a certain serialized format
which is standard in Germany for all pharmacy support companies.

This database comes in ~25 files == tables in a zip file every two weeks.
The DB is actually a structured set of SQL tables with references et al.
So you get an entire database snapshot every 2 weeks?

I actually did not want to change the design and simply created the table
structure that they have, using ZODB, with tables as btrees that contain
tuples for the records, so this is basically the SQL model, mimicked in
OK.  I don't see what advantage you hope to get from ZODB.

I want its flexibility. I need Python and ZODB to transform the data tables
before I understand them. I use Python to stress, probe and validate both my
implementation and their data structures before I trust them and maybe
(painfully) turn the result into an SQL DB. Maybe not at all, depending on
what I learn from playing with ZODB.

Have you ever tried to "play" with an SQL DB?
This is very painful and boring to set up and get right.
I only do that after I have studied the data with Python.
In this case, simply looking at huge pickled dicts did not scale, because
there was too much data. That was the reason to dive into ZODB. With success.
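
A minimal sketch of the "tables as btrees that contain tuples" layout, kept
in memory only (no ZODB connection or transaction is shown). The record
layout, the column names and the sample rows are hypothetical; the thread
does not give the real schema. OOBTree comes from the BTrees package that
ships with ZODB, and the dict fallback is only there so the sketch also runs
where ZODB is not installed.

```python
from collections import namedtuple

try:
    # OOBTree maps object keys to object values and is persistable in ZODB.
    from BTrees.OOBTree import OOBTree as Table
except ImportError:
    # Assumption: a plain dict stands in when BTrees is not available.
    Table = dict

# Hypothetical record layout; the real columns are not given in the thread.
Product = namedtuple("Product", ["product_id", "name", "price_cents"])


def load_table(rows):
    """Build one 'table': primary key -> record tuple."""
    table = Table()
    for row in rows:
        record = Product(*row)
        table[record.product_id] = record
    return table


# Made-up sample rows, for illustration only.
products = load_table([
    (1001, "Example A", 1250),
    (1002, "Example B", 480),
])

# Ad-hoc inspection in plain Python, no SQL schema needed.
cheap = [r.name for r in products.values() if r.price_cents < 1000]
```

This is the kind of "playing" meant above: the table is just a mapping, so
any Python expression can be used to inspect or validate it.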

What is tedious is the fact that the database gets incremental updates all
the time: changed prices, packaging info, etc.
Are these just data updates? Or schema updates too?

At first I was told that there are data updates only. Then, thanks to the
validation I run during parsing, I found out that there were structural
schema changes as well. Some were just relaxed or tightened constraints, but
there were three major changes lately that affected whole tables by
inserting and removing columns.
The whole catastrophe, so to speak.
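
One way such structural changes could be caught while parsing is to compare
the column list of each table against the one from the previous delivery.
This is only a sketch under assumptions: the function name and the column
names are made up, and the actual validation used here is not shown in the
thread.

```python
def diff_columns(old_columns, new_columns):
    """Report columns added to or removed from a table between two deliveries.

    Both arguments are lists of column names in file order.
    """
    old_set, new_set = set(old_columns), set(new_columns)
    return {
        "added": [c for c in new_columns if c not in old_set],
        "removed": [c for c in old_columns if c not in new_set],
    }


# Hypothetical column lists for one table in two consecutive deliveries.
old = ["product_id", "name", "price"]
new = ["product_id", "name", "pack_size", "price_net"]
changes = diff_columns(old, new)
```

An empty result for every table would mean a pure data update; anything else
is a schema change that needs attention before the new data is loaded.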

As always, when the customer swears "this will never happen", you should be
prepared to implement exactly that impossible case. :-)

We need to cope with millions of recipes that come from certain dates
and therefore need to query different versions of the database.
I don't understand this. What's a "recipe"?  Why do you need to
consider old versions of the database?

Not recipes, but prescriptions. (Unfortunately these are the same word in
German.) We get millions of these every month and have to use the right data
from the DB version which was active at the time the prescription was issued.
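
Picking the DB version for a prescription boils down to "latest snapshot
whose date is not after the issue date". A minimal sketch with the standard
bisect module; the function name and the snapshot dates are hypothetical.

```python
from bisect import bisect_right
from datetime import date


def active_snapshot(snapshot_dates, issued_on):
    """Return the latest snapshot date that is <= issued_on.

    snapshot_dates must be sorted in ascending order.
    """
    index = bisect_right(snapshot_dates, issued_on)
    if index == 0:
        raise LookupError("no snapshot was active on %s" % issued_on)
    return snapshot_dates[index - 1]


# Made-up delivery dates, roughly two weeks apart.
snapshots = [date(2013, 7, 1), date(2013, 7, 15), date(2013, 8, 1)]
version = active_snapshot(snapshots, date(2013, 7, 20))
```

A prescription issued on the delivery date itself is resolved against that
delivery; whether that matches the real business rule is an assumption.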

That made me want to create a "time machine" interface to the DB without the
need to have several GB of that crap as slightly different variations of
basically the same stuff.
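
The "time machine" idea can be sketched as a table that records a new entry
for a key only when its value actually changes between snapshots, so
unchanged records are stored once. This is an illustration under
assumptions, not the implementation used here: the class, its method names
and the integer snapshot numbers are hypothetical, plain dicts stand in for
BTrees, and deleted records are not handled.

```python
from bisect import bisect_right


class VersionedTable:
    """Keep per-key history, storing a value only when it changes."""

    def __init__(self):
        # key -> (ascending list of versions, list of values at those versions)
        self._history = {}

    def put(self, key, version, value):
        """Record value for key as of version; return True if stored."""
        versions, values = self._history.setdefault(key, ([], []))
        if versions and version <= versions[-1]:
            raise ValueError("versions must be loaded in increasing order")
        if values and values[-1] == value:
            return False  # unchanged since the last stored version
        versions.append(version)
        values.append(value)
        return True

    def get(self, key, version):
        """Return the value of key as it was at the given version."""
        versions, values = self._history[key]
        index = bisect_right(versions, version)
        if index == 0:
            raise KeyError((key, version))
        return values[index - 1]


# Hypothetical record loaded from three consecutive snapshots.
table = VersionedTable()
stored = [
    table.put("P1", 1, ("Example A", 450)),
    table.put("P1", 2, ("Example A", 450)),  # same data, nothing stored
    table.put("P1", 3, ("Example A", 475)),
]
as_of_2 = table.get("P1", 2)
as_of_3 = table.get("P1", 3)
```

With mostly stable data, the storage cost then grows with the number of
changes rather than with the number of snapshots.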

I made some promising experiments today with column BTrees.
ZODB is performing well with 100 million buckets!

cheers - Chris

Christian Tismer             :^)   <mailto:tis...@stackless.com>
Software Consulting          :     Have a break! Take a ride on Python's
Karl-Liebknecht-Str. 121     :    *Starship* http://starship.python.net/
14482 Potsdam                :     PGP key -> http://pgp.uni-mainz.de
phone +49 173 24 18 776  fax +49 (30) 700143-0023
PGP 0x57F3BF04       9064 F4E1 D754 C2FF 1619  305B C09C 5A3B 57F3 BF04
      whom do you want to sponsor today?   http://www.stackless.com/

For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
