mkroetzsch added a subscriber: mkroetzsch.
mkroetzsch added a comment.

Structurally, this would work, but it seems like a very general solution with a 
lot of overhead. I am not sure that this pattern works well in PHP, where the 
cost of creating additional objects is high. I also wonder whether it really is good 
to manage all those (very different!) types of "derived" information in a 
uniform way. The examples given belong to very different objects and are based 
on very different inputs (some things requiring external data sources, some 
not). I see little reason to architecturally unify things that are 
conceptually and technically so different. The motivation given for 
choosing this solution starts from the premise that one has to find a single 
solution that works for all cases, including some "edge cases". Without this 
assumption, one would be free to solve the different problems individually, 
using what is best for each, instead of being forced to go for some least 
common denominator.

To pick just one example, consider the (article) URL of a SiteLink. To create 
it, one needs to have access to the content of the sites table. In WDTK, we 
encapsulate the sites table in an object (called Sites). To find out the URL of 
a SiteLink, one has to call a method of the Sites object (called 
getArticleUrl() or something), which takes a SiteLink as an input. This design 
is simple and efficient, uses no additional memory for storing new objects or 
values, and places the responsibility for computing this information in one 
clear location (the Sites table). In a single-site setting (like you have in PHP), there is 
only one sites table, and you can access it statically, so the caller does not 
even need to have a reference to a Sites object as in WDTK. I therefore don't 
see any benefit in creating a role object for this simple task. It's just more 
indirection, without any convenience for the software developer or any gain in 
performance.
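
To make the design concrete, here is a minimal sketch, written in Python only 
for brevity. It is not the WDTK code: the class layout, the method name 
get_article_url, the "$1" placeholder in the URL pattern, and the sample data 
are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Optional
from urllib.parse import quote


@dataclass(frozen=True)
class SiteLink:
    """Plain value object; no derived data is stored on it."""
    site_key: str
    page_title: str


class Sites:
    """Hypothetical wrapper around the contents of the sites table."""

    def __init__(self, url_patterns: Dict[str, str]) -> None:
        # Maps a site key to a URL pattern with the placeholder "$1"
        # (an assumed convention for this sketch).
        self._url_patterns = url_patterns

    def get_article_url(self, site_link: SiteLink) -> Optional[str]:
        """Compute the article URL on demand; return None for unknown sites."""
        pattern = self._url_patterns.get(site_link.site_key)
        if pattern is None:
            return None
        # Spaces become underscores, then the title is percent-encoded.
        title = quote(site_link.page_title.replace(" ", "_"), safe="")
        return pattern.replace("$1", title)


# Sample data, assumed for the example.
sites = Sites({"enwiki": "https://en.wikipedia.org/wiki/$1"})
link = SiteLink("enwiki", "Douglas Adams")
print(sites.get_article_url(link))
```

The SiteLink stays a plain value and the URL is computed only when someone 
asks for it, so no extra object is created per link. In a single-site setting 
the sites instance could be one shared, statically accessible object.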

The situation is similar in several of the other cases mentioned. In the end, 
this is a choice for PHP development, which won't affect my work, so I'll leave 
it to you, but it seems you are making your life harder than necessary by going 
for a complicated solution instead of several simple solutions. There will only 
ever be a small number of different kinds of "derived data", and there is 
hardly any place where such data is used without needing to understand its 
meaning (serialization being the main exception).

For JSON exports of such data, I don't think this approach would make sense 
(but I suppose this is not intended here). There, one would simply use optional 
keys for including additional data. Likewise for RDF, where one would not want 
to introduce additional "role" objects that create another layer of indirection 
to access derived values (of course, RDF has a tradition of dealing with 
derived values, and nobody expects special structures there). I suppose that 
this proposal has nothing to do with JSON or RDF, but I mention it just to be 
sure we are on the same page.
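
As an illustration of what "optional keys" could look like in a JSON export, 
here is a small Python sketch. The function name and the key names ("site", 
"title", "url") are assumptions chosen for the example, not a statement about 
any existing export format.

```python
import json
from typing import Any, Dict, Optional


def site_link_to_json(site_key: str, page_title: str,
                      url: Optional[str] = None) -> Dict[str, Any]:
    """Serialize a site link; the derived URL is an optional extra key."""
    record: Dict[str, Any] = {"site": site_key, "title": page_title}
    if url is not None:
        # Derived data sits next to the primary data, with no wrapper object.
        record["url"] = url
    return record


plain = site_link_to_json("enwiki", "Douglas Adams")
enriched = site_link_to_json(
    "enwiki", "Douglas Adams",
    url="https://en.wikipedia.org/wiki/Douglas_Adams",
)
print(json.dumps(plain))
print(json.dumps(enriched))
```

A consumer that does not care about derived data can simply ignore the extra 
key, and one that does can check for its presence.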

The term "data model" is a bit over-used in our context -- maybe it would make 
sense to indicate in this bug report that it is specific to the object model 
used in the PHP implementation, and has no implications for other 
implementations or export formats.


TASK DETAIL
  https://phabricator.wikimedia.org/T118860

To: mkroetzsch
Cc: mkroetzsch, adrianheine, hoo, thiemowmde, aude, Jonas, JanZerebecki, 
JeroenDeDauw, Aklapper, StudiesWorld, daniel, Wikidata-bugs, Mbch331



