On Wed, Apr 18, 2012 at 12:16 AM, Roan Kattouw <[email protected]>
 wrote:

> On Tue, Apr 17, 2012 at 5:37 PM, Martijn Hoekstra
> <[email protected]> wrote:
> > On Tue, Apr 17, 2012 at 10:51 PM, Krinkle <[email protected]> wrote:
> >> On Apr 17, 2012, at 9:05 AM, Thomas Gries wrote:
> >>
> >>>My questions:
> >>> 1. Is there a policy or convention that more than one new table should
> >>> be avoided in extensions?
> >>> 2. Are two or more new tables tolerated?
> >>
> >> If it is required, then sure, it's tolerated. Some of the extensions
> >> currently deployed on Wikipedia have lots more tables even.
> >>
> >> Of course it goes without saying that if you can optimize the number
> >> of tables without sacrificing performance, then by all means: go for it.
> >>
> >> If you could merge the tables and make it still perform well with the
> >> right database indexes, why not :)
> >>
> >> On the other hand, if it means the table will be significantly larger,
> >> then it may be better to keep them separate. For example, I'd say it's
> >> better to have two tables (say, 'group' and 'item', where item.it_group
> >> refers to group.gr_id), so that you don't have to repeat all information
> >> about the group in each item-row, and if the group has to change, there
> >> is no need to change all item-rows.
> >>
> >> -- Krinkle
> >>
> >
> > Am I reading this right as a suggestion and encouragement of database
> > denormalisation in extensions?
> >
> Ignore what Krinkle said. We DO NOT encourage denormalization, except
> where necessary for performance reasons.
>
> Your extension should have a sanely designed database schema which is
> normalized insofar as it makes sense. Don't feel bad about creating
> too many or too few tables, just try to design the schema the sanest
> way you can.


Who said anything about denormalization[1]? Maybe I'm missing something
here, but I think we're saying the same thing.

What I meant (and thought I made clear) was that one should put a little
bit of thinking into the database design, using as many or as few tables as
it needs to work well. Preferably the design avoids duplicating information
by splitting it into separate logical tables (such as the 'group' / 'item'
example I mentioned, which is quite common in MediaWiki and in pretty much
any other major SQL-backed web application). Maybe my description of the
"merge" was a bit too vague, but let me elaborate on what I meant.
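
Before elaborating, here is roughly what the 'group' / 'item' split looks
like as a schema. This is only a minimal sketch using Python's built-in
sqlite3 module, not MediaWiki's own schema tooling. Only the table names and
item.it_group / group.gr_id come from the example above; the gr_name, it_id
and it_title columns and the helper functions are made up for illustration.

```python
import sqlite3


def build_group_item_db():
    """Create an in-memory database with separate 'group' and 'item' tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- "group" is an SQL keyword, so the table name has to be quoted.
        CREATE TABLE "group" (
            gr_id   INTEGER PRIMARY KEY,
            gr_name TEXT NOT NULL          -- hypothetical column
        );
        CREATE TABLE item (
            it_id    INTEGER PRIMARY KEY,  -- hypothetical column
            it_group INTEGER NOT NULL REFERENCES "group"(gr_id),
            it_title TEXT NOT NULL         -- hypothetical column
        );
    """)
    conn.execute('INSERT INTO "group" (gr_id, gr_name) VALUES (1, ?)', ("fruit",))
    conn.executemany(
        "INSERT INTO item (it_group, it_title) VALUES (1, ?)",
        [("apple",), ("pear",)],
    )
    return conn


def rename_group(conn, gr_id, new_name):
    """Change the group in one place; returns the number of rows updated."""
    cur = conn.execute(
        'UPDATE "group" SET gr_name = ? WHERE gr_id = ?', (new_name, gr_id)
    )
    return cur.rowcount


def items_with_group(conn):
    """Join each item to its group, so group data is never stored per item."""
    return conn.execute(
        'SELECT it_title, gr_name FROM item '
        'JOIN "group" ON it_group = gr_id ORDER BY it_id'
    ).fetchall()
```

Renaming the group touches exactly one row in "group", however many item
rows point at it, and the join still shows the new name for every item.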

I wanted to add to the discussion that creating separate tables is not
inherently good or bad in itself. Sometimes it makes sense to use fewer
tables, sometimes it makes sense to use more. In the mail cited above I
mentioned a group/item relation where it is best to keep them in separate
logical tables. Here is an example where not splitting things up might make
sense: a system for managing lists with items of a certain type (where the
types are variable). Then it may make more sense to have a single table for
the list items (with a column indicating the item type), a table for lists,
and a table for types. So, only one table for items, with a column to
indicate the item type, rather than a separate item table for each item
type. Again, it depends on the situation and on how variable "variable" is.
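
The list example could be sketched along these lines, again with Python's
built-in sqlite3 module. All table, column, and function names here
(item_list, item_type, list_item, and so on) are hypothetical and chosen
only for this illustration. The point is that there are three tables in
total, and a new item type is a new row rather than a new table.

```python
import sqlite3


def build_list_db():
    """One table for lists, one for types, and a single table for all items."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE item_list (
            ls_id   INTEGER PRIMARY KEY,
            ls_name TEXT NOT NULL
        );
        CREATE TABLE item_type (
            ty_id   INTEGER PRIMARY KEY,
            ty_name TEXT NOT NULL UNIQUE
        );
        CREATE TABLE list_item (
            li_id    INTEGER PRIMARY KEY,
            li_list  INTEGER NOT NULL REFERENCES item_list(ls_id),
            li_type  INTEGER NOT NULL REFERENCES item_type(ty_id),
            li_value TEXT NOT NULL
        );
        -- Index so that "items of type X in list Y" stays cheap to look up.
        CREATE INDEX li_list_type ON list_item (li_list, li_type);
    """)
    return conn


def add_type(conn, name):
    """Adding a type is a plain INSERT; the schema does not change."""
    return conn.execute(
        "INSERT INTO item_type (ty_name) VALUES (?)", (name,)
    ).lastrowid


def add_list(conn, name):
    return conn.execute(
        "INSERT INTO item_list (ls_name) VALUES (?)", (name,)
    ).lastrowid


def add_item(conn, list_id, type_id, value):
    return conn.execute(
        "INSERT INTO list_item (li_list, li_type, li_value) VALUES (?, ?, ?)",
        (list_id, type_id, value),
    ).lastrowid


def items_of_type(conn, list_id, type_id):
    """Return the values of all items of one type in one list, oldest first."""
    rows = conn.execute(
        "SELECT li_value FROM list_item "
        "WHERE li_list = ? AND li_type = ? ORDER BY li_id",
        (list_id, type_id),
    ).fetchall()
    return [value for (value,) in rows]
```

If the item types turned out to need very different columns, this single
list_item table would get awkward, which is the "how variable is variable"
caveat.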

-- Krinkle

[1] https://en.wikipedia.org/wiki/Denormalization
_______________________________________________
Wikitech-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
