I'd really like to see this evolve into a Wikiproject, if one doesn't already exist. What's the next step? I started a sister conversation about this on the GLAM-US mailing list, here: http://lists.wikimedia.org/pipermail/glam-us/2013-May/000157.html. There was a suggestion about starting a page on meta, and then inviting stakeholders to come join a discussion. I'm sure there have already been efforts in this direction, and it would be nice to have a place to consolidate information.

Chris

On Sun, May 5, 2013 at 5:15 AM, Michael Hale <[email protected]> wrote:
> I think you need to start redesigns by considering scenarios with specific
> examples. What are the tasks that I want to do that I can't currently do?
> What are the tasks that I do so much that they should be easier? Can you
> provide an example of some DBpedia queries that are awkward due to the
> current ontology and would be improved by COSMO? Things like that typically
> change gradually on wiki-projects as opposed to starting over. Here are the
> beginnings of a visual query interface for Wikidata:
> http://toolserver.org/~magnus/ts2/wdq/ Also, the English Wikipedia currently
> has about a million categories. Some of them are used for infrastructure
> purposes, but the vast majority of them are content categories. The original
> issue was that not all categories that contain people have subcategories for
> males and females. With Wikidata we will be able to say, "The vast majority
> of these articles have a statement describing their sex, so we can place a
> filter for that property on this category page." Then we can find a balance
> between the organic nature of the category system without having to worry
> about categories like "American male Democratic politicians". We could just
> have a politicians category, and the system would be smart enough to know
> that it should put gender, nationality, and party filters on that page. Then
> eventually we might have enough structured data to eliminate the category
> system. That is how I imagine the progression at least.
>
> ________________________________
> From: [email protected]
> To: [email protected]
> Date: Sat, 4 May 2013 19:25:25 -0400
>
> Subject: Re: [Wikidata-l] Question about wikipedia categories.
>
> If one is interested in a functional “category” system, it would be very
> helpful to have a good logic-based ontology as the backbone.
>
> I haven’t looked recently, but when I inquired about the ontology used by
> DBpedia a year ago, I was referred to “dbpedia-ontology.owl”, an ontology in
> the format of the “semantic web” ontology format OWL. The OWL format is
> excellent for simple purposes, but the dbpedia-ontology.owl (at that time)
> was not well-structured (being very polite). I did inquire as to who was
> maintaining the ontology, and had a hard time figuring out how to help bring
> it up to professional standards. But it was like punching jello, nothing to
> grasp onto. I gave up, having other useful things to do with my time.
>
> Perhaps it is time now, with more experience in hand, to rethink the
> category system starting with basics. This is not as hard as it sounds.
> It may require some changes where there is ambiguity or logical
> inconsistency, but mostly it only necessary to link the Wikipedia categories
> to an ontology based on a well-structured and logically sound foundation
> ontology (also referred to as an “upper ontology”), that supplies the basic
> categories and relations. Such an ontology can provide the basic concepts,
> whose labels can be translated into any terminology that any local user
> wants to use. There are several well-structured foundation ontologies,
> based on over twenty years of research, but the one I suggest is the one I
> am most familiar with (which I created over the past seven years), called
> COSMO. The files at http://micra.com/COSMO will provide the ontology itself
> (“COSMO.owl”, in OWL) and papers describing the basic principles. COSMO
> is structured to be a “primitives-based foundation ontology”, containing all
> of the “semantic primitives” needed to describe anything one wants to talk
> about. All other categories are structured as logical combinations of the
> basic elements.
> Its inventory of primitives is probably incomplete, but is
> able to describe everything I have been concerned with for years (7000
> categories and 800 relations thus far) can always be supplemented as
> required for new fields. With an OWL ontology, queries can be executed by
> any of several logic-based utilities. Making the query system easy for
> those who prefer not to build SPARQL queries (including myself) would
> require some programming, but that is a miniscule effort compared to what
> has already been put into the DBPedia database. Tools such as “Protégé”
> make it easy to work with an OWL ontology, and there is a web site where an
> OWL ontology can be developed collaboratively.
>
> I will be willing to put some effort into this and assist anyone who wants
> to used the COSMO ontology for this project. If those who are in charge of
> maintaining the ontology (is anyone?) would like to discuss this at greater
> length, send me an email or telephone me. All those who are interested in
> this topic may also feel free to contact me, or to discuss this thread on
> the list. I suggest the thread title “Foundation Ontology”.
>
> Pat
>
> Patrick Cassidy
> MICRA Inc.
> [email protected]
> 908-561-3416
>
> From: [email protected]
> [mailto:[email protected]] On Behalf Of Michael Hale
> Sent: Saturday, May 04, 2013 2:57 AM
> To: Discussion list for the Wikidata project.
> Subject: Re: [Wikidata-l] Question about wikipedia categories.
>
> I think it's important to consider the distinction between a category system
> and semantic queries. I think it's very likely that DBpedia and Wikidata
> will converge over time and develop a simple enough query interface that
> causes fewer people to use the category system because we will be able to
> automatically generate relevant queries related to a given article. DBpedia
> currently has a lot more data, but Wikidata is important for many editing
> scenarios.
> Also, in the future I think there will be a lot of content
> scenarios where it is natural to start by putting data into Wikidata and
> then including it in articles instead of just extracting information from
> articles. If you are familiar with query languages you can get comfortable
> with the DBpedia SPARQL examples in a few minutes, but for a typical reader
> that just wants to go from an article about a person to a list of similar
> people it is hard to beat scrolling down and just clicking on a category. I
> did a test query on DBpedia to plot all sports cars by their engine sizes,
> and I think for the types of things it enables you to do it is totally worth
> the learning curve. That being said, I think the category system has a lot
> of potential for better browsing scenarios as opposed to queries. I've been
> making a tool that mixes the article view data with the category system. You
> can see a video of the basic idea here and a screenshot of football league
> popularity split by language.
> http://en.wikipedia.org/wiki/User:Wakebrdkid/Popular_category_browsing I'm
> currently multiplying the Chinese traffic by 30 to try and account for Baidu
> Baike.
>
>> Date: Sat, 4 May 2013 08:14:54 +0200
>> From: [email protected]
>> To: [email protected]
>> Subject: Re: [Wikidata-l] Question about wikipedia categories.
>>
>> Wondering exactly the same thing - my frustrations with categories
>> began about three years ago and it seems I am surprised monthly by
>> severe limitations to this outdated apparatus. I am a heavy category
>> user, but I would love to be able to kick it out the door in favour of
>> a more structured method. As far as I can tell, there is very little
>> synchronisation among language Wikipedias of category trees, and being
>> able to apply a central structure to all Wikipedias through Wikidata
>> sounds like a great idea, and one which would not disturb the current
>> category trees we already have, but supplement them.
>> As I see it, some
>> category structures are OK, but when categories get big, people split
>> them in non-standard ways, causing problems like this recent
>> media-hype regarding female novellists. I think that it's great this
>> is in the news in this way, because I am sure that most Wikipedia
>> readers never knew we had categories, and this is a great introduction
>> to them, as well as an invitation to edit Wikipedia.
>>
>> 2013/5/4, Chris Maloney <[email protected]>:
>> > I am just curious if there has ever been discussion about the
>> > potential for reimplementing / replacing the category system in
>> > Wikipedia with semantic tagging in WikiData. It seem to me that the
>> > recent kerfuffle with regards to "American women writers" would not
>> > have happened if the pages were tagged with simple RDF assertions
>> > instead of these convoluted categories. I know, of course, that it
>> > would be a huge undertaking, but I just don't see how the category
>> > system can continue to scale (I'm amazed it has scaled as well as it
>> > has already, of course).
>> >
>> > I am trying to learn more about wikidata, and have perused the various
>> > infos and FAQs for the last two hours, and can't find any discussion
>> > of this particular issue.
>> >
>> > -- Chris
>> >
>> > _______________________________________________
>> > Wikidata-l mailing list
>> > [email protected]
>> > https://lists.wikimedia.org/mailman/listinfo/wikidata-l
