I think we should not underestimate what this could evolve into. We have thrived because our readers find us "good enough" for looking up facts, not because we are the ultimate source.

And the software learns by reading: it can (and already has) read Wikipedia, Wikidata etc., represent our data in its own syntax, and present it in a way that is not a direct copy. Perhaps the data will lag a bit behind the actual content of Wikipedia, but so what - good enough?

Anders

On 2022-12-19 at 14:26, Gnangarra wrote:

AI simply can't discriminate between good research and faked research; for any outcome it must provide all of its sources, whether they are from Wikipedia, Wikidata, WikiCommons, WikiSource or some other place. Otherwise it will answer yes to someone asking if the world is flat, because it'll seek out that answer and find all the nonsense that has been produced.

On Mon, 19 Dec 2022 at 06:02, Erik Moeller <eloque...@gmail.com> wrote:

    On Sun, Dec 11, 2022 at 5:55 AM Anders Wennersten
    <m...@anderswennersten.se> wrote:
    > ChatGPT is now making headlines more or less every day, and I
    > perceive them to be trying to position themselves as the "next"
    > Google.

    I suspect OpenAI will continue to focus on generative applications
    (images, code, text for purposes such as copywriting, eventually
    music/video) and won't attempt to compete with Google directly, but
    we'll see. Currently GPT-3.5 (which ChatGPT is based on) is very prone
    to generating nonsensical answers, citations to works that don't
    exist, etc. But it is pretty cool if you keep its limitations in
    mind--for example, it's quite good at bootstrapping small scripts in
    various programming languages (with mistakes and idiosyncrasies).

    Google has one of the largest AI research programs on the planet, they
    just are extremely conservative about letting anyone try their models
    (due to reputational concerns, e.g., that generative AI will spit out
    racist output within about 30 seconds of people poking its
    guardrails). This blog post from September is instructive about the
    direction they're taking with what's called retrieval-augmented
    generation; see the paper linked from the post for details:

    https://www.deepmind.com/blog/building-safer-dialogue-agents (DeepMind
    is part of Google)

    That is likely to yield significantly more accurate answers than what
    ChatGPT is doing, and is difficult to replicate for folks like OpenAI
    without being dependent on the search APIs of big search companies.
    It's worth noting that Google has also started to incorporate language
    model tooling into how it's presenting search results (e.g.,
    summarizing or highlighting different parts of a website to make the
    result snippet more useful).

    A retrieval-augmented approach that leverages Wikidata could IMO be
    quite powerful and could be a useful research program for Wikimedia to
    pursue, be it independently or in partnership with others. The
    resulting technology should of course be fully open source.
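
    To make that concrete, here is a minimal sketch (in Python) of the
    shape such a pipeline could take. generate_sparql() is a hypothetical
    stand-in for the language-model step; the retrieval step uses the
    public Wikidata Query Service endpoint:

    import requests

    WDQS_ENDPOINT = "https://query.wikidata.org/sparql"

    def generate_sparql(question):
        # Hypothetical: a language model (fine-tuned or few-shot
        # prompted) translates the natural-language question into a
        # SPARQL query over Wikidata.
        raise NotImplementedError

    def answer(question):
        query = generate_sparql(question)
        # Retrieval step: run the generated query against live Wikidata
        # rather than relying on whatever the model memorized during
        # training.
        response = requests.get(
            WDQS_ENDPOINT,
            params={"query": query, "format": "json"},
            headers={"User-Agent": "wikidata-rag-sketch/0.1"},
            timeout=60,
        )
        response.raise_for_status()
        # Each returned binding can be cited back to the items and
        # statements it came from.
        return response.json()["results"]["bindings"]

    The hard (and interesting) part is of course the generation step;
    the retrieval and citation parts are comparatively mechanical.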

    Querying Wikidata via SPARQL is currently still a bit of wizardry (and
    the query builder is extremely limited). To pick a completely random
    example not at all inspired by current events, if I wanted to see a
    list of journalists with Mastodon accounts & a picture, I currently
    have to do this:

    SELECT DISTINCT ?personLabel ?mastodonName ?pic
    WHERE {
      ?person wdt:P4033 ?mastodonName ;
        wdt:P106 ?occupation .
      OPTIONAL { ?person wdt:P18 ?pic . }
      ?occupation wdt:P279* wd:Q1930187 .
      SERVICE wikibase:label {
        bd:serviceParam wikibase:language "en"
      }
    }
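
    As an aside, the query above can also be run programmatically; a
    minimal Python sketch against the same public query service
    endpoint, with the query stored in a string:

    import requests

    QUERY = """
    SELECT DISTINCT ?personLabel ?mastodonName ?pic
    WHERE {
      ?person wdt:P4033 ?mastodonName ;
        wdt:P106 ?occupation .
      OPTIONAL { ?person wdt:P18 ?pic . }
      ?occupation wdt:P279* wd:Q1930187 .
      SERVICE wikibase:label {
        bd:serviceParam wikibase:language "en"
      }
    }
    """

    response = requests.get(
        "https://query.wikidata.org/sparql",
        params={"query": QUERY, "format": "json"},
        headers={"User-Agent": "sparql-example/0.1"},
        timeout=60,
    )
    response.raise_for_status()
    for row in response.json()["results"]["bindings"]:
        # ?pic is OPTIONAL, so it may be missing from a binding.
        print(row["personLabel"]["value"],
              row["mastodonName"]["value"],
              row.get("pic", {}).get("value", "(no picture)"))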

    Make a small mistake (a missing curly brace) and you'll get a red
    error message. Forget the * after wdt:P279? You get a different
    result set, in ways that are difficult to spot or reason about.

    Why can't I type "list of journalists with their picture and Mastodon
    account" as a natural language query? (You can try it in ChatGPT and
    it'll get you started, but it'll generate nonsense P/Q numbers.) If
    such queries could be produced reliably, it could be a very useful
    tool for readers as well.
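
    One piece of the puzzle that does not need a model at all: the P and
    Q identifiers ChatGPT tends to invent can instead be resolved through
    Wikidata's own wbsearchentities API, so that only the query structure
    has to come from the model. A minimal Python sketch (lookup_id() and
    the User-Agent string are my own naming; the API itself is real):

    import requests

    WD_API = "https://www.wikidata.org/w/api.php"

    def lookup_id(term, entity_type="item"):
        # Resolve a label such as "journalist" or "Mastodon address" to
        # its Q/P identifier via wbsearchentities, instead of letting
        # the language model guess one from memory.
        response = requests.get(
            WD_API,
            params={
                "action": "wbsearchentities",
                "search": term,
                "language": "en",
                "type": entity_type,  # "item" for Q-ids, "property" for P-ids
                "format": "json",
            },
            headers={"User-Agent": "wikidata-id-lookup-sketch/0.1"},
            timeout=30,
        )
        response.raise_for_status()
        results = response.json()["search"]
        return results[0]["id"] if results else None

    # e.g. lookup_id("journalist") should give Q1930187, and
    # lookup_id("Mastodon address", "property") should give P4033,
    # i.e. the identifiers used in the query above.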

    Warmly,
    Erik

--
Boodarwun
Gnangarra
'ngany dabakarn koorliny arn boodjera dardoon ngalang Nyungar koortaboodjar'

_______________________________________________
Wikimedia-l mailing list -- wikimedia-l@lists.wikimedia.org, guidelines at: 
https://meta.wikimedia.org/wiki/Mailing_lists/Guidelines and 
https://meta.wikimedia.org/wiki/Wikimedia-l
Public archives at 
https://lists.wikimedia.org/hyperkitty/list/wikimedia-l@lists.wikimedia.org/message/HKAPIBPSXAETLTFQFQDPDCSGCFWDCXAQ/
To unsubscribe send an email to wikimedia-l-le...@lists.wikimedia.org
