I'd like to restate the initial question.

Why did Wikidata choose ShEx instead of other approaches?

From this very detailed comparison
http://book.validatingrdf.com/bookHtml013.html (thank you Andra!) I could
see arguments in both directions. I'm curious to know what swayed the
Wikidata software team, as my group is currently grappling with the same
decision.

On Thu, May 30, 2019 at 7:55 AM Peter F. Patel-Schneider <[email protected]> wrote:

> The history of ShEx is quite complex.
>
> I don't think that one can say that there were complete and conforming
> implementations of ShEx in 2017 because the main ShEx specification,
> http://shex.io/shex-semantics-20170713/ was ill-founded.  I pointed this out
> in https://lists.w3.org/Archives/Public/public-shex/2018Mar/0008.html
>
> There were several quite different semantics proposed for ShEx somewhat
> earlier, all with significant problems.
>
> peter
>
> On 5/30/19 12:34 AM, Andra Waagmeester wrote:
> > I really don't see the issue here. SHACL, like ShEx, is a language to express
> > data shapes. I adopted ShEx in a Wikidata context in 2016, when ShEx was
> > demonstrated at the SWAT4HCLS conference [1] in Amsterdam, where it was
> > covered in both a tutorial and a hackathon topic. At that conference, I was
> > convinced that ShEx is helpful in maintaining quality in Wikidata. ShEx
> > offers not only the means to validate data shapes in Wikidata, but it also
> > provides a way to document how primary data is expressed in Wikidata. In
> > 2016 I joined the ShEx community group [2]. Since then I have been actively
> > using ShEx to define shapes in various projects on Wikidata (e.g. Gene Wiki
> > and WikiCite). It is not that this happened in secrecy. On the contrary, it
> > was discussed at both Wikimedia [3,4] and non-Wikimedia events [5,6,7].
> >
> > It is also not the case that SHACL has not been discussed in this context;
> > on the contrary, I have very good memories of a workshop where both were
> > debated (see page 24 ;) ) [8]
> >
> > IMHO the statement that we all should adhere to one standard, simply
> > because it is a standard, is not a valid argument. Imagine having to
> > dictate that we all should speak English because it is the standard
> > language. In every single talk that I have given since 2016, proponents of
> > SHACL have been very vocal in asking the same question over and over again,
> > "why not SHACL?", where the discussion never went beyond "You should
> > because it is a standard". It is also a bit disingenuous to suggest we all
> > should adhere to SHACL because it is the standard, while in the same
> > sentence calling it a "Recommendation".
> >
> > Although initially I was open to SHACL as well (I use both Mac and Linux,
> > so why not open up to different alternatives in data shapes), (some)
> > arguments for me to prefer ShEx over SHACL are:
> > 1. Already in 2017 there were different (open) implementations. At the time
> > SHACL didn't have much tooling to choose from, other than one JavaScript
> > implementation and a proprietary software package.
> > 2. ShEx has a more intuitive way of describing shapes, which is the compact
> > syntax (ShExC). SHACL seems to have adopted a compact syntax as well, but
> > only yesterday [9].
> > 3. The culture in the Shape Expressions community group aligns well with
> > the culture in Wikidata.
> > 4. I don't want to be shackled to one standard (pun intended). I assume the
> > name was chosen with a shackle in mind, which puts constraints at the core
> > of the language. Wikidata already has different methods in place to deal
> > with constraints and constraint violations. In the context of Wikidata,
> > ShEx should specifically not be intended to impose constraints; on the
> > contrary, it allows expressing disagreement or variants of different
> > shapes, whether they conflict or not. This fits well with the NPOV
> > concept. Symbols do matter.
> >
> > For a less personal comparison, I refer to the "Validating RDF Data" book,
> > which describes both ShEx and SHACL and has a specific chapter on how they
> > compare and differ [10]
> >
> > Up until now, I have been using ShEx in repositories outside the Wikidata
> > ecosystem (e.g. GitHub), but I am really excited about the release of this
> > extension. I am curious about how the wiki extension will influence the
> > maintenance of schemas. Schemas are currently often expressed as static
> > images, while in practice the schemas are as fluid as the underlying data
> > itself. Being able to document these changes dynamically (the wiki way)
> > can be very interesting. One specific expectation I have is that it might
> > make it easier to write federated SPARQL queries. Currently, when writing
> > these federated queries we often have to rely on either a set of example
> > queries or a one-time schema description, which makes it hard to write
> > those queries because schemas change constantly. Writing federated SPARQL
> > queries now really is a process of "slot machine" querying, where one has
> > to explore the underlying schema query by query. With a wiki in place and
> > a community maintaining these ever-changing schemas, I expect better
> > documentation.
> >
> > The data shape community, instead of adhering to one language, should
> > really be proud to have produced two very helpful languages. ShEx and SHACL
> > are similar but do have differences, so both have merit to exist, and I
> > wish we could steer away from this ShEx vs SHACL feud. It really isn't
> > helping the cause, i.e. being able to express schemas in a formal language.
> > Honestly, this feud really reminds me of the famous Monty Python sketch,
> > "The machine that says Bing". Let us focus on the patient and not on the
> > "Bing".
> >
> > Just my 2ct.
> >
> > [1] http://www.swat4ls.org/workshops/amsterdam2016/
> > [2] https://www.w3.org/community/shex/
> > [3] https://www.wikidata.org/wiki/Wikidata:WikidataCon_2017/Submissions/Using_Shape_Expressions_for_data_quality_and_consistency_in_Wikidata
> > [4] https://meta.wikimedia.org/wiki/WikiCite_2017/Program
> > [5] https://figshare.com/articles/Using_Shape_Expressions_ShEx_to_model_validate_and_curate_Wikidata/4766002
> > [6] https://2017.semantics.cc/satellite-events/linked-data-quality-assessment-and-improvement-academia-industry
> > [7] http://swib.org/swib18/programme.html
> > [8] https://upload.wikimedia.org/wikipedia/commons/d/d6/WikiCite_2017_report.pdf
> > [9] https://lists.w3.org/Archives/Public/public-shacl/2019May/0012.html
> > [10] http://book.validatingrdf.com/
> >
> > On Wed, May 29, 2019 at 10:05 PM Antoine Zimmermann
> > <[email protected] <mailto:[email protected]>> wrote:
> >
> >     Hello,
> >
> >
> >     Could you explain why the non-standard ShEx has been chosen rather than
> >     the W3C Recommendation SHACL?
> >
> >     I would assume that if one has several options for bringing a
> >     functionality to something that largely promotes interoperability (like
> >     Wikidata), the default choice should be a standard, and /only if/ one
> >     has a carefully crafted argumentation to reject it, one would opt for
> >     something else.
> >
> >     For those who may not know, the W3C RDF Data Shapes Working Group worked
> >     between 2014 and 2017 on defining a standard for describing data shapes
> >     in RDF. ShEx existed already and was a candidate for standardisation.
> >     Eventually, another standard emerged, Shapes Constraint Language (SHACL,
> >     see https://www.w3.org/TR/shacl/).
> >
> >     Disclaimer: I did not contribute to either SHACL or ShEx, and I do not
> >     know them enough to judge which one is better.
> >
> >
> >     Best,
> >     --AZ
> >
> >
> >     On 19/05/2019 15:32, Léa Lacroix wrote:
> >     > Hello all,
> >     >
> >     > After several months of development and testing together with the
> >     > WikiProject ShEx
> >     > <https://www.wikidata.org/wiki/Wikidata:WikiProject_ShEx>, Shape
> >     > Expressions are about to be enabled on Wikidata.
> >     >
> >     > *First of all, what are Shape Expressions?*
> >     >
> >     > ShEx (Q29377880) <https://www.wikidata.org/wiki/Q29377880> is a concise,
> >     > formal modeling and validation language for RDF structures. Shape
> >     > Expressions can be used to define shapes within the RDF graph. In the
> >     > case of Wikidata, this would be sets of properties, qualifiers and
> >     > references that describe the domain being modeled.
> >     >
> >     > See also:
> >     >
> >     >   * a short video about ShEx
> >     >     <https://www.youtube.com/watch?v=AR75KhEoRKg> made by community
> >     >     members during the Wikimedia hackathon 2019
> >     >   * introduction to ShEx <http://shex.io/shex-primer/>
> >     >   * more details about the language <http://shex.io/shex-semantics/>
> >     >
> >     > *What can it be used for?*
> >     >
> >     > On Wikidata, the main goal of Shape Expressions would be to describe
> >     > what the basic structure of an item would be. For example, for a human,
> >     > we probably want to have a date of birth, a place of birth, and many
> >     > other important statements. But we would also like to make sure that if
> >     > a statement with the property “children” exists, the value(s) of this
> >     > property should be humans as well. Schemas will describe in detail what
> >     > is expected in the structure of items, statements and values of these
> >     > statements.
> >     >
> >     > Once Schemas are created for various types of items, it is possible to
> >     > test some existing items against the Schema, and highlight possible
> >     > errors or lack of information. Subsets of the Wikidata graph can be
> >     > tested to see whether or not they conform to a specific shape through
> >     > the use of validation tools. Therefore, Schemas will be very useful to
> >     > help editors improve the data quality. We imagine this to be
> >     > especially useful for wiki projects to more easily discuss and ensure
> >     > the modeling of items in their domain. In the spirit of Wikidata not
> >     > restricting the world, Shape Expressions are a tool to highlight, not
> >     > prevent, errors.
> >     >
> >     > On top of this, one could imagine other uses of Schemas in the future,
> >     > for example building a tool that would suggest, when creating a new
> >     > item, what the basic structure for this item would be, and help
> >     > add statements or values. A bit like this existing tool, Cradle
> >     > <https://tools.wmflabs.org/wikidata-todo/cradle/#/>, which is currently
> >     > not based on ShEx.
> >     >
> >     > *What is going to change on Wikidata?*
> >     >
> >     >   * A new extension will be added to Wikidata: EntitySchema
> >     >     <https://www.mediawiki.org/wiki/Extension:EntitySchema>, defining
> >     >     the Schema namespace and its behavior as well as special pages
> >     >     related to it.
> >     >   * A new entity type, EntitySchema, will be enabled to store Shape
> >     >     Expressions. Schemas will be identified with the letter E.
> >     >   * The Schemas will have multilingual labels, descriptions and aliases
> >     >     (quite similar to the termbox on Items), and the schema text, which
> >     >     one can fill in with a syntax called ShEx Compact Syntax (ShExC)
> >     >     <http://shex.io/shex-semantics/#shexc>. You can see an example here
> >     >     <https://wikidata-shex.wmflabs.org/wiki/EntitySchema:E2>.
> >     >   * The external tool shex-simple
> >     >     <https://tools.wmflabs.org/shex-simple/wikidata/packages/shex-webapp/doc/shex-simple.html?schemaURL=https%3A%2F%2Fwikidata-shex.wmflabs.org%2Fwiki%2FSpecial%3AEntitySchemaText%2FE2>
> >     >     is directly linked from the Schema pages in order to check entities
> >     >     of your choice against the schema.
> >     >
> >     > *When is this happening?*
> >     >
> >     > Schemas will be enabled on test.wikidata.org
> >     > <http://test.wikidata.org> on May 21st and on wikidata.org
> >     > <http://wikidata.org> on May 28th. After this release, they will be
> >     > integrated into the regular maintenance just like the rest of Wikidata’s
> >     > features.
> >     >
> >     > *How can you help?*
> >     >
> >     >   * Before the release, you can try to edit or create Shape Expressions
> >     >     on our test system <https://wikidata-shex.wmflabs.org/wiki/Main_Page>
> >     >   * If you find any issue or feature you’d like to have, feel free to
> >     >     create a new task on Phabricator with the tag |shape-expressions|
> >     >   * Once Schemas are enabled, you can discuss them on your favorite
> >     >     wikiprojects: for example, what types of items would you like to model?
> >     >   * You can also get more information about how to create a Schema
> >     >     <https://www.wikidata.org/wiki/Wikidata:WikiProject_ShEx/How_to_get_started%3F>
> >     >
> >     > *See also: *
> >     >
> >     >   * Main Phabricator board
> >     >     <https://phabricator.wikimedia.org/tag/shape_expressions/>
> >     >   * Technical documentation of the extension
> >     >     <https://meta.wikimedia.org/wiki/Extension:EntitySchema>
> >     >   * To enhance the interface, you can use this user script
> >     >     <https://www.wikidata.org/wiki/User:Zvpunry/EntitySchemaHighlighter.js>
> >     >     to highlight items and properties in the schema code and turn the
> >     >     IDs into links
> >     >
> >     > If you have any questions, feel free to reach out to me. Cheers,
> >     >
> >     > --
> >     > Léa Lacroix
> >     > Project Manager Community Communication for Wikidata
> >     >
> >     > Wikimedia Deutschland e.V.
> >     > Tempelhofer Ufer 23-24
> >     > 10963 Berlin
> >     > www.wikimedia.de <http://www.wikimedia.de>
> >     >
> >     > Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.
> >     >
> >     > Registered in the register of associations of the Amtsgericht
> >     > Berlin-Charlottenburg under number 23855 Nz. Recognized as charitable
> >     > by the Finanzamt für Körperschaften I Berlin, tax number 27/029/42207.
> >     >
> >     >
> >     > _______________________________________________
> >     > Wikidata mailing list
> >     > [email protected] <mailto:[email protected]>
> >     > https://lists.wikimedia.org/mailman/listinfo/wikidata
