Thomas,
I also wanted to briefly indicate how non-trivial that some of these technical topics are; for example algorithmically determining which interpretation hypotheses are correct for sentences or whether one or more constituent elements of sentences are best interpreted in ways not yet specified in a growing, dynamic lexicon. The matter relates to language learning. There is the matter of encountering new lexemes, lexemes with zero senses thus far in the lexicon, and then there is the matter of encountering new senses of lexemes previously encountered. My earlier comment was that software systems could signal machine-utilizable crowdsourced lexicon services, in the case of certain events, so that users could utilize data to prioritize collaborative work. I also theorize, as others do, that a viable concept of sequencing work with respect to building natural language understanding systems and lexicons is entering data in the order of reading level, from infancy to adult reading level. Building machine-utilizable crowdsourced lexicon software with rich, structured metadata and with extensible storage slots for definitions in multiple knowledge representation formats is a difficult task; one that makes possible other difficult tasks utilizing such lexicons. Thank you for the enjoyable brainstorming session and for indicating the state of the art with regard to projects underway. I am interested in any of your thoughts, opinions and ideas with respect to the future of machine-utilizable crowdsourced lexicons. Best regards, Adam ________________________________ From: Wiki-research-l <wiki-research-l-boun...@lists.wikimedia.org> on behalf of Adam Sobieski <adamsobie...@hotmail.com> Sent: Thursday, May 31, 2018 4:26:46 PM To: Research into Wikimedia content and communities Subject: Re: [Wiki-research-l] Machine-utilizable Crowdsourced Lexicons Thomas, Thank you for the exciting information with regard to the future of Wikidata lexemes. 
With bulk upload and update capabilities, we might anticipate alignments and uploads from projects on the scale of FrameNet, PropBank, VerbNet and WordNet. With regard to crowdsourced lexicons containing machine-utilizable definitions, we can consider a feature where, as software using the definition APIs finds that there are not yet definitions for particular lexemes, counters are accumulated so that users can observe which lexemes' definitions are in popular demand. This could be a means of prioritizing which lexemes to rigorously define. We might envision natural language understanding, including semantic interpretation, of children's books in upcoming years.

Best regards,
Adam

________________________________
From: Wiki-research-l <wiki-research-l-boun...@lists.wikimedia.org> on behalf of Thomas Pellissier Tanon <tho...@pellissier-tanon.fr>
Sent: Thursday, May 31, 2018 6:25:56 AM
To: Research into Wikimedia content and communities
Subject: Re: [Wiki-research-l] Machine-utilizable Crowdsourced Lexicons

> In addition to Web-based user interfaces for content editing, machine
> lexicons could support bulk APIs including those based on XML-RPC and
> SPARUL.

This is what is planned for Wikidata lexemes. There is already a REST API. Example: https://www.wikidata.org/wiki/Special:EntityData/L42.json

We are currently working on an RDF output of the lexemes content using Lemon/Ontolex [1]. It is planned to import this RDF representation into https://query.wikidata.org in order to be able to execute SPARQL queries on it.

Cheers,

Thomas

[1] https://mediawiki.org/wiki/Extension:WikibaseLexeme/RDF_mapping

On Thu, May 31, 2018 at 05:22, Adam Sobieski <adamsobie...@hotmail.com> wrote:

> Micru,
> Finn,
>
> Thank you for the hyperlinks to the pertinent projects.
>
> I'm thinking that machine lexicon services could include URL-addressable:
> (1) headwords and lemmas, (2) conjugations and declensions, and (3)
> specific senses or definitions.
> Each conjugation or declension could have its own URL-addressable
> definitions. Machine-utilizable definitions are envisioned as existing in
> a number of machine-utilizable knowledge representation formats.
>
> In addition to Web-based user interfaces for content editing, machine
> lexicons could support bulk APIs including those based on XML-RPC and
> SPARUL. With regard to the use of SPARQL and SPARUL, there may already
> exist a suitable ontology. Some lexical ontologies include: Lemon
> (https://www.w3.org/2016/05/ontolex/), LexInfo (http://www.lexinfo.net/),
> LIR (http://mayor2.dia.fi.upm.es/oeg-upm/index.php/en/technologies/63-lir/),
> LMM (http://ontologydesignpatterns.org/wiki/Ontology:LMM), semiotics.owl
> (http://www.ontologydesignpatterns.org/cp/owl/semiotics.owl), and Senso
> Comune (http://www.sensocomune.it/). It should be possible to extend
> existing ontologies to include machine-utilizable definitions in a number
> of knowledge representation formats.
>
> I'm thinking about topics in knowledge representation with regard to the
> formal semantics of nouns, verbs, adjectives, adverbs, pronouns,
> prepositions and conjunctions, and about how automated reasoners could
> make use of machine-utilizable definitions to obtain and compare semantic
> interpretations as software systems parse natural language.
>
> Best regards,
> Adam
>
> _______________________________________________
> Wiki-research-l mailing list
> Wiki-research-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wiki-research-l

_______________________________________________
Wiki-research-l mailing list
Wiki-research-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
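As a concrete companion to the thread, URL-addressable lexemes along the lines of the REST endpoint Thomas mentioned (https://www.wikidata.org/wiki/Special:EntityData/L42.json) might be consumed as sketched below. The embedded JSON excerpt is hand-written and abbreviated for illustration; the field names (`entities`, `lemmas`, `forms`, `representations`, `senses`) follow the Wikibase lexeme JSON as I understand it, and the lemma and form values are assumptions, not live data.

```python
import json

def lexeme_url(lexeme_id: str) -> str:
    """Build the per-lexeme JSON URL for the REST endpoint cited above."""
    return f"https://www.wikidata.org/wiki/Special:EntityData/{lexeme_id}.json"

# Hand-written, abbreviated excerpt of the JSON shape such an endpoint
# returns (illustrative values; a live document carries more fields).
SAMPLE = json.loads("""
{"entities": {"L42": {
  "type": "lexeme",
  "lemmas": {"en": {"language": "en", "value": "answer"}},
  "forms": [
    {"id": "L42-F2",
     "representations": {"en": {"language": "en", "value": "answers"}}}
  ],
  "senses": []
}}}
""")

def lemma(doc: dict, lexeme_id: str, lang: str = "en") -> str:
    """Extract one language's lemma string from an entity document."""
    return doc["entities"][lexeme_id]["lemmas"][lang]["value"]

def form_representations(doc: dict, lexeme_id: str, lang: str = "en"):
    """List the written forms (conjugations/declensions) for one language."""
    return [f["representations"][lang]["value"]
            for f in doc["entities"][lexeme_id]["forms"]]
```

This shows how headwords, individual forms, and (eventually) senses each become addressable data items that downstream natural language understanding software could fetch on demand; a real client would retrieve the document over HTTP from `lexeme_url(...)` rather than use an inline sample.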