On 01/07/14 22:33, David Cuenca wrote:
Markus, could your algorithm work together with human direction? Like,
if we entered which properties are common for a class, and then a user
creates an instance of that class, would the algorithm be able to sort
those properties based on how often they appear on the database?

My algorithm is all about *detecting* "which properties are common for a class". If you want this to be entered by humans instead, that's fine too, but then you don't need an algorithm. Sorting a list of properties by how often they appear in the database is easy to do. My algorithm does not do this, though, because the most frequently used property is usually not the most interesting one. For instance, many classes are related to Freebase IDs, but you don't want this to be the first suggestion you get. I want the things that are "special" for the instances of a class as compared to the rest of the data, not the things that are most common overall.
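
To illustrate the difference between "most common" and "special for a class", here is a minimal sketch. It is not the actual algorithm discussed in this thread, whose details are not given here. It assumes a simple ratio (the share of class instances using a property, divided by the share of all items using it) as a stand-in for "special". The function names and the toy data are hypothetical, and it assumes every class instance is also in the overall list.

```python
from collections import Counter

def rank_by_frequency(class_items):
    """Rank properties by how many instances of the class use them."""
    counts = Counter(p for props in class_items for p in set(props))
    return [p for p, _ in counts.most_common()]

def rank_by_lift(class_items, all_items):
    """Rank properties by how much more common they are within the class
    than in the data overall. Illustrative stand-in only; assumes
    class_items is a subset of all_items."""
    in_class = Counter(p for props in class_items for p in set(props))
    overall = Counter(p for props in all_items for p in set(props))

    def lift(prop):
        share_in_class = in_class[prop] / len(class_items)
        share_overall = overall[prop] / len(all_items)
        return share_in_class / share_overall

    return sorted(in_class, key=lift, reverse=True)

# Toy data (invented for illustration): each item is a list of its properties.
buildings = [
    ["Freebase ID", "coordinate location", "architect"],
    ["Freebase ID", "coordinate location", "architect"],
    ["Freebase ID", "coordinate location"],
    ["Freebase ID"],
]
others = [
    ["Freebase ID", "coordinate location"],
    ["Freebase ID", "coordinate location"],
    ["Freebase ID"],
    ["Freebase ID"],
    ["Freebase ID"],
    ["Freebase ID"],
]
everything = buildings + others

print(rank_by_frequency(buildings))
# ['Freebase ID', 'coordinate location', 'architect']
print(rank_by_lift(buildings, everything))
# ['architect', 'coordinate location', 'Freebase ID']
```

In this toy data "Freebase ID" is on every item, so it tops the frequency ranking but carries no information about the class, while "architect" is used less often but almost only on buildings. A plain ratio like this is very sensitive to properties that are used only a handful of times.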

Cheers,

Markus


Thanks,
Micru


On Tue, Jul 1, 2014 at 10:23 PM, Markus Krötzsch <[email protected]> wrote:

    On 01/07/14 22:14, Markus Krötzsch wrote:
    ...


        (2) "Grade I listed building"
        
        http://tools.wmflabs.org/wikidata-exports/miga/?classes#_cat=Classes/Id=Q15700818


        Related properties: English Heritage list number, masts, Minor
        Planet Center observatory code, home port, coordinate location,
        OS grid reference, mother house, architect, manager/director,
        Emporis ID, MusicBrainz place ID, country, architectural style,
        visitors per year, Commons category, Structurae ID (structure),
        officially opened by, floors above ground, inspired by,
        religious order, number of platforms, street, owned by, diocese

        These are computed fully automatically from the data, with no
        manual filtering or user input. But don't get me wrong -- great
        work! Brilliant to have such a thing integrated into the UI. In
        any case, my algorithm for computing the related properties is
        certainly very different from theirs; I am sure it also has its
        glitches.


    P.S. One weakness of my algorithm you can already see: it has
    trouble estimating the relevance of very rare properties, such as
    "Minor Planet Center observatory code" above. A single wrong
    annotation may then lead to wrong suggestions. Also, it seems from
    my list under (2) that some Grade I listed buildings are ships. This
    seems to be an error that is amplified by the fact that property
    "masts" is used only 11 times in the dataset I evaluated (last
    week's data). I guess the new property suggester rather errs on the
    other side, being tricked into suggesting very frequent properties
    even in places that don't need them.

    -- Markus



    _________________________________________________
    Wikidata-l mailing list
    [email protected]
    https://lists.wikimedia.org/mailman/listinfo/wikidata-l




--
Etiamsi omnes, ego non


