Re: Input needed from RDF group on JSON-LD skolemization

David Booth Mon, 01 Jul 2013 07:08:23 -0700


On 07/01/2013 05:41 AM, Markus Lanthaler wrote:

On Sunday, June 30, 2013 10:45 PM, David Booth wrote:

On 06/30/2013 10:25 AM, Pat Hayes wrote:


On Jun 27, 2013, at 10:19 PM, David Booth wrote:

[Copying public archive www-archive.w3.org for lack of a better
option]

PROBLEM SUMMARY

GOAL: Any two JSON-LD-compliant parsers should produce the exact
same RDF triples when parsing the same JSON-LD document, except for
blank node labels and (possibly) datatype conversions.

CURRENT PROBLEM: JSON-LD  is intended to be a concrete RDF syntax,
but the JSON-LD data model has some extensions to the RDF data
model, and this causes some non-determinism and/or important
information loss when interpreting JSON-LD as RDF.


Wait. There are two issues getting muddled here. Yes, there can be
information loss in JSON-LD ==> RDF. No, it does not follow that the
mapping is nondeterministic or ambiguous. So information loss does
not compromise the GOAL as stated.


True.  I thought it would be obvious that information loss is
undesirable (since otherwise we could just map to the empty graph), but
to clarify: the goal is to have a deterministic mapping *with* minimum
information loss.


How do you define "loss"?

In this case, when interpreting JSON-LD as RDF, information loss means discarding triples that have blank nodes as predicates, because RDF does not allow blank nodes as predicates.

The data is obviously in the JSON(-LD).

Yes -- represented using blank nodes that, when converted to the RDF model, would appear in the predicate position.

You seem
to suggest that we remove a mechanism that allows data to be mapped to
something close enough to RDF that some RDF systems already support it.

Yes, I am suggesting that the feature of permitting blank nodes in that JSON-LD position -- the position that would result in blank predicates in the RDF model -- be removed. Users would be required to use URIs in those cases instead of blank nodes.

IMO, that's also information loss.

I would consider it a feature loss rather than information loss, since a user could still represent the information. The user would just have to use a URI instead of a blank node. Is there an important use case for permitting blank nodes in the JSON-LD position that would map them to blank node predicates in RDF? If so, what is it?


David

David


Pat .

At present, the results of JSON-LD-compliant parsing of a JSON-LD
document to produce a set of RDF triples is non-deterministic
because JSON-LD allows blank node predicates and RDF does not.


That is a non sequitur. There is a perfectly deterministic algorithm
to map JSON-LD into RDF, with information loss. Option (a) below, for
example.

The JSON-LD specification currently suggests three potential
solutions but does not mandate one of them: (a) discard triples
that contain blank node predicates; (b) retain triples that contain
blank node predicates; or (c) skolemize blank nodes that are used
in the predicate position.
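
A minimal sketch of why options (a) and (b) are each deterministic once one of them is fixed. It assumes triples are plain (subject, predicate, object) string tuples and that blank nodes are written as "_:label"; the function name and the policy names are hypothetical and are not taken from the JSON-LD spec. Option (c) is left out because it additionally needs a URI prefix.

```python
from typing import Iterable, List, Tuple

# Assumption: a triple is three strings; blank nodes look like "_:b0".
Triple = Tuple[str, str, str]


def filter_triples(triples: Iterable[Triple], policy: str) -> List[Triple]:
    """Apply one fixed policy to triples whose predicate is a blank node.

    policy "discard" corresponds to option (a), "retain" to option (b).
    """
    if policy not in ("discard", "retain"):
        raise ValueError(f"unknown policy: {policy!r}")

    result = []
    for subject, predicate, obj in triples:
        has_bnode_predicate = predicate.startswith("_:")
        if has_bnode_predicate and policy == "discard":
            # Option (a): the triple is dropped (information loss).
            continue
        # Option (b), or an ordinary triple: keep it unchanged.
        result.append((subject, predicate, obj))
    return result
```

With the policy fixed, the same input always yields the same output; the results differ between parsers only when each parser is free to pick its own policy.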


RANGE OF POTENTIAL SOLUTIONS

1. Change JSON-LD to prohibit JSON-LD blank nodes in positions
where the RDF interpretation of JSON-LD would cause them to be
mapped to illegal RDF blank nodes.

Pros: Easy enough spec change.

Cons: Loss of JSON-LD functionality?  (Is there an important use
case for having blank nodes in predicate positions in JSON-LD?)

My comments: This seems to me like the best available option.
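
A rough sketch of the kind of check option 1 would imply, assuming the JSON-LD document is already parsed into Python dicts and lists. The function name is hypothetical, and the check is deliberately simplified: it only looks at inline contexts and literal "_:" keys, and ignores "@vocab", remote contexts, and compact IRIs.

```python
from typing import Any, List


def find_bnode_properties(doc: Any) -> List[str]:
    """Return terms/keys that would map to blank node predicates."""
    found: List[str] = []

    def check_context(value: Any) -> None:
        contexts = value if isinstance(value, list) else [value]
        for ctx in contexts:
            if not isinstance(ctx, dict):
                continue  # remote context reference; not handled here
            for term, definition in ctx.items():
                if term.startswith("@"):
                    continue  # keywords are out of scope for this sketch
                iri = definition.get("@id") if isinstance(definition, dict) else definition
                if isinstance(iri, str) and iri.startswith("_:"):
                    found.append(term)

    def walk(node: Any) -> None:
        if isinstance(node, list):
            for item in node:
                walk(item)
        elif isinstance(node, dict):
            for key, value in node.items():
                if key == "@context":
                    check_context(value)
                    continue
                if key.startswith("_:"):
                    found.append(key)  # blank node used directly as a property
                walk(value)

    walk(doc)
    return found
```

Under option 1, a document for which this returns a non-empty list would be rejected rather than silently losing triples.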


How is that different from the current situation? Instead of mapping
predicates to bnode identifiers people won't map them at all then. The
resulting RDF is the same.

2. Change RDF to permit blank nodes as predicates.

Pros: Avoids information loss.

Cons: Not possible in the current RDF working group, because the
charter explicitly lists it as out of scope:
http://www.w3.org/2011/01/rdf-wg-charter "Some features are
explicitly out of scope for the Working Group . . . Removing
current restrictions in the RDF model (e.g., . . . blank nodes as
predicates)"

My comments: To my mind, this would have been a second-best option
if it were available.


3. Change the JSON-LD-to-RDF-model mapping to specify that illegal
triples are discarded.

Pros: Easy change to the JSON-LD spec.

Cons: Significant information loss when interpreting JSON-LD as
RDF.

My comments: Not acceptable, due to the information loss.


Again, the same result IMO.

4. Require skolemization of bnodes that appear in the predicate
position.   (Note that if skolemization of a bnode is performed,
it must be performed uniformly on all instances of that bnode that
arise from that JSON-LD document.)  RDF-standards-based
round-trippable skolemization would permit round-tripping of the
skolemized bnodes back to the original JSON-LD even if the return
trip is performed by a different party.

Pros: Avoids information loss.

Cons: (a) More complex than other options; (b) To avoid possible
URI clashes, the skolemizer would need a user-specific URI prefix
as a parameter, such as
http://example.com/.well-known/genid/alice/

My comments: Complex, but acceptable.
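
A minimal sketch of option 4, assuming triples are (subject, predicate, object) string tuples with blank nodes written as "_:label". The function names are hypothetical; the prefix is passed in by the caller, as in the example URI above. A bnode that occurs in a predicate position is replaced everywhere it occurs, and the prefix makes the return trip possible.

```python
from typing import Iterable, List, Tuple

Triple = Tuple[str, str, str]


def skolemize_predicate_bnodes(triples: Iterable[Triple], prefix: str) -> List[Triple]:
    """Replace every bnode that is used as a predicate, in all positions."""
    triples = list(triples)
    # Only bnodes that appear at least once as a predicate are skolemized.
    targets = {p for _, p, _ in triples if p.startswith("_:")}

    def skolem(term: str) -> str:
        return prefix + term[2:] if term in targets else term

    return [(skolem(s), skolem(p), skolem(o)) for s, p, o in triples]


def unskolemize(triples: Iterable[Triple], prefix: str) -> List[Triple]:
    """Turn URIs under the given prefix back into blank node labels."""

    def restore(term: str) -> str:
        return "_:" + term[len(prefix):] if term.startswith(prefix) else term

    return [(restore(s), restore(p), restore(o)) for s, p, o in triples]
```

The sketch assumes that no ordinary URI in the data already starts with the chosen prefix; otherwise the return trip would wrongly turn it into a bnode.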


Yes, skolemization might be necessary... but not on the JSON-LD side. It is
the consumer that has to skolemize if it can't accept the data otherwise.

Are there other options or pros/cons that I did not list?  Which
options would be preferable, acceptable or not acceptable to you?

I suggest adopting #1, but also adding a note to the JSON-LD spec
that recommends that parsers offer an *option* (disabled by
default) to retain triples with a blank node predicate.


That's a contradiction. You can't prohibit blank-node-predicates at the
syntax level and provide a flag to allow it. All such documents would be
invalid.  I think what you mean here is that they are discarded by default
when converting to RDF but retained if that option is set, right? If so,
then this is again exactly where we are now, except that the filtering
would be done by the consumer of the toRDF result instead of in the
toRDF algorithm.


--
Markus Lanthaler
@markuslanthaler
