Of relevance to this conversation: https://www.wired.com/story/large-language-models-artificial-intelligence/
On Fri, Dec 30, 2022 at 9:32 AM Neurodivergent Netizen <idoh.idreamofhor...@gmail.com> wrote:

> One concern I have is that “oldbies” like myself have all seen bots basically decay after whoever is maintaining them goes inactive. Of course, this could be mostly rectified by having the AI be open source. This leaves the “people” aspect; that is, not only does the AI need to be maintained, but interest needs to be maintained as well.
>
> From,
> I dream of horses
> She/her
>
> On Dec 30, 2022, at 8:53 AM, Victoria Coleman <vstavridoucole...@gmail.com> wrote:
>
> Anne,
>
> Interestingly enough, what these large companies have to spend a ton of money on is creating and moderating content. In other words, people. Passionate volunteers in large numbers are what the movement has in abundance. Imagine the power of combining the talents and passion of our community members with the advances offered by AI today. I was struck recently during a visit to NVIDIA by how language models have changed. Back in my day, we would have to build one language model per domain and then load it onto the device, a computer or a phone, to use. Now they have one massive combined language model in a data center full of their GPUs, which is there so long as you are connected. My sense is that within the guardrails offered by our volunteer community, we could use AI to force multiply their efforts and make knowledge even more accessible than it is today, both for those who create and record knowledge and for those who consume it. In the case of ChatGPT, our volunteers could use supervised learning, for example, to narrow down the mistakes the bot makes, which should be far fewer than in the OpenAI version, since the Wikipedia version would be trained on good, clean Wikipedia content which is constantly reviewed by the community.
>
> Best regards,
>
> Victoria Coleman
>
> On Dec 30, 2022, at 12:21 AM, Risker <risker...@gmail.com> wrote:
>
> Given what we already know about AI-like projects (think Siri, Alexa, etc.), they're the result of work done by organizations utilizing resources hundreds of times greater than the resources within the entire Wikimedia movement, and they're not all that good if we're being honest. They're entirely dependent on existing resources. We have seen time and again how easily they can be led astray; ChatGPT is just the most recent example. It is full of misinformation. Other efforts have resulted in the AI becoming radicalized. Again, it's all about what sources the AI project uses in developing its responses, and those underlying sources are generally completely unknown to the person asking for the information.
>
> Ironically, our volunteers have created software that learns pretty effectively (ORES, several anti-vandalism "bots"). The tough part is ensuring that there is continued, long-term support for these volunteer-led efforts, and the ability to make them effective on projects using other languages. We've had bots making translations of formulaic articles from one language to another for years; again, they depend on volunteers who can maintain and support those bots, and ensure continued quality of translation.
>
> AI development is tough. It is monumentally expensive. Big players have invested billions of USD trying to develop working AI, with some of the most talented programmers and developers in the world, and they're barely scratching the surface. I don't see this as a priority for the Wikimedia movement, which achieves considerably higher quality with volunteers following a fairly simple rule set that the volunteers themselves develop based on tried and tested knowledge.
> Let's let those with lots of money keep working to develop something that is useful, and then we can start seeing if it can become feasible for our use.
>
> I envision the AI industry being similar to the computer hardware industry. My first computer cost about the same (in 2022 dollars) as the four computers and all their peripherals that I have within my reach as I write this, and had less than 1% of the computing power of each of them.[1] The cost will go down once the technology gets better and more stable.
>
> Risker/Anne
>
> [1] Comparison of 1990 to 2022 dollars.
>
> On Fri, 30 Dec 2022 at 01:40, Yaroslav Blanter <ymb...@gmail.com> wrote:
>
>> Hi,
>>
>> Just to remark that it superficially looks like a great tool for small-language Wikipedias (for which the translation tool is typically not available). One can train the tool in some less common language using the dictionary and some texts, and then let it fill the project with thousands of articles. (As an aside, in fact, one probably can train it on soon-to-be-extinct languages and save them until the moment there is any interest in revival, but nobody seems to be interested.) However, there is a high potential for abuse, as I can imagine people not speaking the language running the tool and creating thousands of substandard articles; we have seen this done manually, and I would be very cautious about allowing this.
>>
>> Best
>> Yaroslav
>>
>> On Fri, Dec 30, 2022 at 4:57 AM Raymond Leonard <raymond.f.leonard...@gmail.com> wrote:
>>
>>> As a friend wrote on a Slack thread about the topic, "ChatGPT can produce results that appear stunningly intelligent, and there are things that I’ve seen that really leave me scratching my head: “how on Earth did it DO that?!?” But it’s important to remember that it isn’t actually intelligent. It’s not “thinking.” It’s more of a glorified version of autosuggest.
>>> When it apologizes, it’s not really apologizing; it’s just finding text that fits the self-description it was fed and that looks related to what you fed it."
>>>
>>> The person initiating the thread had asked ChatGPT "What are the 5 biggest intentional communities on each continent?" (As an aside, this was as challenging as the question that led to Wikidata, "What are the ten largest cities in the world that have women mayors?") One of the answers ChatGPT gave for Europe was "Ikaria (Greece)". As near as I can determine, there is no intentional community of any size in Ikaria. However, the Icarians <https://en.wikipedia.org/wiki/Icarians> were a 19th-century intentional community in the US founded by French expatriates. It was named after a utopian novel, *Voyage en Icarie*, that was written by Étienne Cabet. He chose the Greek island of Icaria as the setting of his utopian vision. Interesting that ChatGPT may have conflated these.
>>>
>>> It seems that given a prompt, ChatGPT shuffles & regurgitates facts. Just as a card dealer sometimes deals a good hand, ChatGPT sometimes seems to make sense, but I think at present it really is "a glorified version of autosuggest."
>>>
>>> Yours
>>> Peaceray
>>>
>>> On Thu, Dec 29, 2022 at 6:39 PM Gnangarra <gnanga...@gmail.com> wrote:
>>>
>>>> I think the simplest answer is yes, it's an artificial writer, but it's not intelligence as the name implies; rather, it is just a piece of software that gives answers according to the methodology of that software. It's garbage in, garbage out: it can never be better than the programmers behind the machine.
>>>>
>>>> On Fri, 30 Dec 2022 at 09:56, Victoria Coleman <vstavridoucole...@gmail.com> wrote:
>>>>
>>>>> Thank you Ziko and Steven for the thoughtful responses.
>>>>>
>>>>> My sense is that for a class of readers, having a generative UI that returns an answer rather than an article would be useful.
>>>>> It would probably put Quora out of business. :-)
>>>>>
>>>>> If the models are not open source, this indeed would require developing our own models. For that kind of investment, we would probably want to have more application areas. Translation is one that Ziko already pointed out, but summarization is another. These kinds of information retrieval queries would effectively index into specific parts of an article rather than returning the whole thing.
>>>>>
>>>>> Wikipedia, as we all know, is not perfect, but it’s about the best you can get with the thousands of editors and reviewers doing quality control. If a bot were exclusively trained on Wikipedia, my guess is that the falsehood generation would be as minimal as it can get. Garbage in, garbage out applies to all these models: good stuff in, good stuff out. I guess the falsehoods can also come when no material exists in the model. So instead of making stuff up, they could default to “I don’t know the answer to that”. Or in our case, we could add the topic to the list of article suggestions to editors…
>>>>>
>>>>> I know I am almost daydreaming here, but I can’t help but think that all the recent advances in AI could create significantly broader free knowledge pathways for every human being. And I don’t see us getting after them aggressively enough…
>>>>>
>>>>> Best regards,
>>>>>
>>>>> Victoria Coleman
>>>>>
>>>>> On Dec 29, 2022, at 5:17 PM, Steven Walling <steven.wall...@gmail.com> wrote:
>>>>>
>>>>> On Thu, Dec 29, 2022 at 4:09 PM Victoria Coleman <vstavridoucole...@gmail.com> wrote:
>>>>>
>>>>>> Hi everyone. I have seen some of the reactions to the narratives generated by ChatGPT. There is an obvious question (to me at least) as to whether a Wikipedia chatbot would be a legitimate UI for some users.
>>>>>> To that end, I would have hoped that it would have been developed by the WMF, but the Foundation has historically massively underinvested in AI. That said, and assuming that GPT open-source licensing is compatible with the movement's norms, should the WMF include that UI in the product?
>>>>>
>>>>> This is a cool idea, but what would the goals of developing a Wikipedia-specific generative AI be? IMO it would be nice to have a natural language search right in Wikipedia that could return factual answers, not just links to our (often too long) articles.
>>>>>
>>>>> OpenAI models aren’t open source, btw. Some of the products are free to use right now, but their business model is to charge for API use etc., so including it directly in Wikipedia is pretty much a non-starter.
>>>>>
>>>>>> My other question is around the corpus that OpenAI is using to train the bot. It is creating very fluid narratives that are massively false in many cases. Are they training on Wikipedia? Something else?
>>>>>
>>>>> They’re almost certainly using Wikipedia. The answer from ChatGPT is:
>>>>>
>>>>> “ChatGPT is a chatbot model developed by OpenAI. It was trained on a dataset of human-generated text, including data from a variety of sources such as books, articles, and websites. It is possible that some of the data used to train ChatGPT may have come from Wikipedia, as Wikipedia is a widely-used source of information and is likely to be included in many datasets of human-generated text.”
>>>>>
>>>>>> And to my earlier question, if GPT were to be trained on Wikipedia exclusively, would that help abate the false narratives?
>>>>>
>>>>> Who knows, but we would have to develop our own models to test this idea.
>>>>>
>>>>>> This is a significant matter for the community, and seeing us step up to it would be very encouraging.
>>>>>>
>>>>>> Best regards,
>>>>>>
>>>>>> Victoria Coleman
>>>>>> _______________________________________________
>>>>>> Wikimedia-l mailing list -- wikimedia-l@lists.wikimedia.org, guidelines at: https://meta.wikimedia.org/wiki/Mailing_lists/Guidelines and https://meta.wikimedia.org/wiki/Wikimedia-l
>>>>>> Public archives at https://lists.wikimedia.org/hyperkitty/list/wikimedia-l@lists.wikimedia.org/message/CYPO3PEMM4FIWPNL6MRTORHZXVTS2VNN/
>>>>>> To unsubscribe send an email to wikimedia-l-le...@lists.wikimedia.org
>>>>
>>>> --
>>>> Boodarwun
>>>> Gnangarra
>>>> 'ngany dabakarn koorliny arn boodjera dardoon ngalang Nyungar koortaboodjar'
_______________________________________________
Wikimedia-l mailing list -- wikimedia-l@lists.wikimedia.org, guidelines at: https://meta.wikimedia.org/wiki/Mailing_lists/Guidelines and https://meta.wikimedia.org/wiki/Wikimedia-l
Public archives at https://lists.wikimedia.org/hyperkitty/list/wikimedia-l@lists.wikimedia.org/message/2L5EPGDMHEEOGWMMXI6VF7UUQ7CNBC6V/
To unsubscribe send an email to wikimedia-l-le...@lists.wikimedia.org