On 02.08.2016 22:28, Yuri Astrakhan wrote:
Is there a way we could have more than just the number of language
links? E.g., the number of incoming links from other Wikipedia pages?

One could add other data to the store, but depending on what you want, this may be more work. You ask about links from "Wikipedia pages". If you really mean this (and not Wikidata items), then it would be a lot of work, since one would have to update the RDF whenever any Wikipedia page changes. I guess we do not have the infrastructure for doing this in a live update mode. Also note that the number of these links is different in each language, so one would have to store many numbers.

Overall, this link count would really be (meta)data about Wikipedia pages and their relations, and not so much about Wikidata. I think you could get such Wikipedia-specific data from DBpedia, but I am not sure how well their live endpoint keeps track of this data (since it is tricky). Maybe an offline solution that combines RDF dumps is the most practical approach for now if you really need this data.

Even storing the number of incoming links (properties) from other Wikidata items would actually be tricky. Currently, the RDF data about each item only depends on the content of this item's Wikidata page. The number of inlinks depends on other Wikidata pages, and therefore it is much more work to keep it up to date when there are edits.
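
To illustrate the offline route, here is a minimal sketch in Python of counting inlinks between items from a dump. It assumes N-Triples input with one triple per line, item IRIs of the form http://www.wikidata.org/entity/Q..., and only direct item-to-item triples; the function name count_inlinks and the sample lines are made up for illustration and are not part of any existing tool.

```python
import re
from collections import Counter

# Assumption: items are identified by IRIs of this shape.
ENTITY = re.compile(r"<http://www\.wikidata\.org/entity/(Q\d+)>")


def count_inlinks(ntriples_lines):
    """Count, per item, the triples in which another item points to it."""
    counts = Counter()
    for line in ntriples_lines:
        # Split into subject, predicate, and the rest ("object .").
        parts = line.strip().split(None, 2)
        if len(parts) < 3:
            continue
        subject = ENTITY.fullmatch(parts[0])
        target = ENTITY.match(parts[2])  # literals and other IRIs do not match
        if subject and target and subject.group(1) != target.group(1):
            counts[target.group(1)] += 1
    return counts


# Hypothetical sample data, only to show the input shape.
sample = [
    "<http://www.wikidata.org/entity/Q1> <http://www.wikidata.org/prop/direct/P31> <http://www.wikidata.org/entity/Q2> .",
    "<http://www.wikidata.org/entity/Q3> <http://www.wikidata.org/prop/direct/P279> <http://www.wikidata.org/entity/Q2> .",
    '<http://www.wikidata.org/entity/Q2> <http://www.w3.org/2000/01/rdf-schema#label> "two"@en .',
    "<http://www.wikidata.org/entity/Q2> <http://www.wikidata.org/prop/direct/P31> <http://www.wikidata.org/entity/Q2> .",
]
print(count_inlinks(sample))
```

Note that the count for one item is only known after all other items have been read. That is the reason why it is hard to keep such a number up to date on every edit.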

Markus




On Aug 2, 2016 10:41 PM, "Markus Kroetzsch"
<markus.kroetz...@tu-dresden.de> wrote:

    On 02.08.2016 20:59, Daniel Kinzler wrote:

        On 02.08.2016 at 20:19, Markus Kroetzsch wrote:

            Oh, there is a little misunderstanding here. I have not
            suggested to create a property "number of sitelinks in this
            document". What I propose instead is to create a property
            "number of sitelinks for the document associated with this
            entity". The domain of this suggested property is entity.
            The advantage of this proposal over the thing that you
            understood is that it makes queries much simpler, since you
            usually want to sort items by this value, not documents. One
            could also have a property for number of sitelinks per
            document, but I don't think it has such a clear use case.


        "number of sitelinks for the document associated with this
        entity" strikes me as semantically odd, which was the point of
        my earlier mail. I'd much rather have "number of sitelinks in
        this document". You are right that the primary use would be to
        "rank" items, and that it would be more convenient to have the
        count associated directly with the item (the entity), but I
        fear it will lead to a blurring of the line between information
        about the entity, and information about the document. That is
        already a common point of confusion, and I'd rather keep that
        separation very clear. I also don't think that one level of
        indirection would be horribly complicated.

        To me it's just natural to include the sitelink info on the same
        level as we provide a timestamp or revision id: for the document.


    I just proposed the simple and straightforward way to solve the
    practical problem at hand. It leads to shorter, more readable
    queries that execute faster. (I don't claim originality for this; it
    is the obvious solution to the problem and most people would arrive
    at exactly the same conclusion).

    Your concern is based on the assumption that there is some kind of
    psychological effect that a particular RDF encoding would have on
    users. I don't think that there is any such effect. Our users will
    not confuse the city of Paris with an RDF document just because of
    some data in the RDF store.

    Markus

    --
    Prof. Dr. Markus Kroetzsch
    Knowledge-Based Systems Group
    Faculty of Computer Science
    TU Dresden
    +49 351 463 38486
    https://iccl.inf.tu-dresden.de/web/KBS/en

    _______________________________________________
    Wikidata mailing list
    Wikidata@lists.wikimedia.org
    https://lists.wikimedia.org/mailman/listinfo/wikidata


