JSON-LD Community Group Telecon

Minutes for 2013-02-05

Agenda
http://lists.w3.org/Archives/Public/public-linked-json/2013Feb/0003.html
Topics
  1. New Alternate Algorithms Review
  2. RDF Algorithms Section
  3. ISSUE-217: Disallow BNode identifier as Graph Name
  4. JSON-LD 1.0 Final Community Group Specification
Resolutions
  1. Adopt the 'purpose' and 'general solution' language in Dave Longley's (alternate2.html) specification.
Chair
Manu Sporny
Scribe
Niklas Lindström
Present
Manu Sporny, Niklas Lindström, Gregg Kellogg, Dave Longley, Paul Kuykendall, Markus Lanthaler, David I. Lehn
Audio Log
audio.ogg
Manu Sporny: Any additions to the Agenda?
Niklas Lindström is scribing.
Niklas Lindström: We might want to discuss the reply and suggested change to Eric's recent mail regarding when a default graph is turned into a named graph. [scribe assist by Manu Sporny]
Niklas Lindström: If we have time, I'd like to describe some potential future needs when working with National Library of Sweden stuff. We're officially using JSON-LD there now, maybe some framing-like issues and @rev like stuff. [scribe assist by Manu Sporny]
Gregg Kellogg: I may do a CG proposal for @ordered [scribe assist by Manu Sporny]
Gregg Kellogg: I'm probably going to do a community document regarding @list, like @ordered.

Topic: New Alternate Algorithms Review

Dave Longley: I have worked on merging the current and alternate texts for the algorithms, e.g. including the lookup table (inverse context) for term selection, also added examples and a visual description
… my goal is also to have an implementation (the one used on the playground) that implements this new spec text (alternate2)
… I've also detected things that were missing, like keyword aliasing
… we have heard of at least one processor that didn't work when impl. the described algorithms
… I've left out the 2 or 3 controversial issues that we have left (like relative IRIs)
… I've also added some sections to describe the general problem (or "purpose" as gregg suggested) in paragraph form, for people who do their own algorithms
… it's difficult to wrap your mind around how all the algorithms work together; so I've attempted to address that as well, e.g. by using the notion of a "subalgorithm"
… not yet updated: the flattening and node mapping algorithms
… also the mapping to rdf concepts may need more review
… in summary: we have a working implementation of this, and it should be noted that it didn't really need that many updates
Gregg Kellogg: I like the context processing; it's the term selection that seems problematic
Dave Longley: re context processing: I reordered to put it first
… we tailored the API to do async processing
… but it may be better to retrieve all the contexts beforehand, and then do the processing
… this is also much more beneficial for our payswarm work
… caching processed contexts
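One way to picture the "retrieve all the contexts beforehand" idea is the following minimal Python sketch. It is an illustration only, not the API discussed on the call: the function names, the `load` callback, and the example URLs are all assumptions, and it does not follow remote contexts that themselves reference further remote contexts.

```python
from typing import Any, Callable, Dict, Set


def collect_context_urls(node: Any, found: Set[str]) -> Set[str]:
    """Walk a JSON-LD document and gather every remote @context reference."""
    if isinstance(node, list):
        for item in node:
            collect_context_urls(item, found)
    elif isinstance(node, dict):
        ctx = node.get("@context")
        # A context may be a URL string, an inline dict, or a list of both.
        for entry in ctx if isinstance(ctx, list) else [ctx]:
            if isinstance(entry, str):
                found.add(entry)
        for key, value in node.items():
            if key != "@context":
                collect_context_urls(value, found)
    return found


def preload_contexts(doc: Any, load: Callable[[str], dict],
                     cache: Dict[str, dict]) -> Dict[str, dict]:
    """Fetch each referenced context once; later processing needs no I/O."""
    for url in sorted(collect_context_urls(doc, set())):
        if url not in cache:
            cache[url] = load(url)
    return cache


# Usage with a stand-in loader (no network access; URLs are placeholders).
fetched = []


def fake_load(url: str) -> dict:
    fetched.append(url)
    return {"@context": {}}


doc = {
    "@context": ["http://example.org/a", {"name": "http://schema.org/name"}],
    "knows": {"@context": "http://example.org/b", "name": "x"},
}
cache = preload_contexts(doc, fake_load, {})
```

With all contexts in the cache, the transformation step itself could run synchronously, which is the trade-off raised in the discussion that follows.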
Niklas Lindström: Thanks for the great summary and work - sounds awesome. [scribe assist by Manu Sporny]
Niklas Lindström: It's interesting about the pre-loading the context stuff - it dawned on me a while ago, but didn't have the time to digest the idea. [scribe assist by Manu Sporny]
Niklas Lindström: Things we've talked about regarding asynchronous processing in general would be affected if you needed all the contexts beforehand. I wonder if there is something more to that idea that would affect the API as well. We've discussed async vs. sync approaches - maybe the API needs to be modified... maybe the transformation step is purely functional? [scribe assist by Manu Sporny]
Dave Longley: I'm not sure we can eliminate that, since someone might need to do something async during the processing
Dave Longley: I don't know if we can eliminate async entirely - there may be ways to make it simpler, but I don't think we can remove some of the stuff from the API. If anybody wanted to do anything that we don't think of in this group, we might cripple them. [scribe assist by Manu Sporny]
Paul Kuykendall: I'm in the group implementing the C# impl. I'd like to note that this new layout of the spec looks very helpful.
… e.g. putting context processing first, and the explanatory additions to each algorithm are valuable
Manu Sporny: so, the question is if we want to move forward with one of the three spec alternatives we have before us
Markus Lanthaler: I think we should include the prose of this directly in the spec. We could agree on that and then discuss the algorithms separately.
Manu Sporny: sounds good
Gregg Kellogg: I see value in being able to take some sample data, and walk through the algorithm step by step to see what's going on
… there is something on term selection which seems to intentionally be similar to term ranking
… and what's the relation to inverse contexts
Dave Longley: so the inverse context gives a lookup from IRI to possible terms, and term selection goes through the alternatives; first building the container and then going through if there's a language, etc.
… when doing compaction: get info for property IRI, then match values which apply; and then term selection looks for specificity to select the proper one
… think of the new term selection algorithm as similar to markus' querying of inverse context
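The relationship between the inverse context and term selection described here can be sketched roughly as below. This is a heavily simplified illustration, not the algorithm in alternate2.html: the shape of the active context, the `"@none"` placeholder, the tie-breaking order, and both function names are assumptions made for the example.

```python
from collections import defaultdict
from typing import Dict, Optional


def build_inverse_context(active_context: Dict[str, dict]) -> dict:
    """Map IRI -> container -> list of (term, type, language) candidates."""
    inverse: dict = defaultdict(lambda: defaultdict(list))
    # Shorter terms first, then alphabetical, so ties resolve predictably.
    for term in sorted(active_context, key=lambda t: (len(t), t)):
        definition = active_context[term]
        container = definition.get("@container", "@none")
        inverse[definition["@id"]][container].append(
            (term, definition.get("@type"), definition.get("@language")))
    return inverse


def select_term(inverse: dict, iri: str, container: str = "@none",
                value_type: Optional[str] = None,
                language: Optional[str] = None) -> Optional[str]:
    """Pick the most specific term for an IRI, falling back to generic ones."""
    by_container = inverse.get(iri, {})
    # Try the requested container first, then terms without a container.
    for candidate_container in dict.fromkeys((container, "@none")):
        candidates = by_container.get(candidate_container, [])
        # Prefer an exact type or language match over an unconstrained term.
        for term, term_type, term_lang in candidates:
            if value_type is not None and term_type == value_type:
                return term
            if language is not None and term_lang == language:
                return term
        for term, term_type, term_lang in candidates:
            if term_type is None and term_lang is None:
                return term
    return None  # caller would fall back to a compact or full IRI


FOAF = "http://xmlns.com/foaf/0.1/"
ctx = {
    "name": {"@id": FOAF + "name"},
    "name_en": {"@id": FOAF + "name", "@language": "en"},
    "homepage": {"@id": FOAF + "homepage", "@type": "@id"},
}
inverse = build_inverse_context(ctx)
english = select_term(inverse, FOAF + "name", language="en")
german = select_term(inverse, FOAF + "name", language="de")
link = select_term(inverse, FOAF + "homepage", value_type="@id")
```

Here the English value selects the language-specific term, while the German value falls back to the unconstrained one.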
Gregg Kellogg: what might help is a picture or table to illustrate this
Dave Longley: yes, a table would be helpful, and show with arrows what is selected
Dave Longley: i wanted to be clear that we're not going to modify the data, therefore I used the notion of a shallow copy
Gregg Kellogg: I think we need to move forward with this, and dave's rewrite addresses our major issues with complexity. Compaction is still very complicated, but I think this is the path to go
Dave Longley: there are also places where we explain over again local processing steps which we could probably explain the gist of and define them (and then link to them)
Gregg Kellogg: like a micro-algorithm section, sounds good
Manu Sporny: my high level read-over gives me the same impression as gregg; the purpose and direction of this is where we want to go
… the things fit together much better now
… and the algorithm work has been very thorough
… so now it's much easier to get an overview
Markus Lanthaler: was the error stuff removal a conscious decision?
Dave Longley: I wanted to get away from a lot of MUST and SHOULD language
… so I combined markus' and gregg's error description
… but we should probably add technical (API) error text back
… we should combine the MUST/SHOULD with that
Gregg Kellogg: re. MUST text, if we use that, and we're duplicating normative text that should exist in the normative grammar, we should look for something better than repeating that
… using an error code seemed incongruous with an algorithm which is much more mathematical in nature
… it'd be better with a constant with a title
… e.g. a "list-of-list error" (could be a tref)
… I prefer something less prescriptive than "raise an error"
Markus Lanthaler: ups... local IRI
… but we need to be explicit about what is an exceptional error, and leave to the API to define what that is
Gregg Kellogg: there is a circular dependency issue in letting the algorithms refer to the API, we need something separate, and let the API also refer to that
… the algorithms should exist without the API
Manu Sporny: so both could refer to the lookup table, defined in prose
Gregg Kellogg: yes, and it could also be used to index back to the normative text describing this
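The idea of a shared table of titled error constants, referenced by both the algorithms and the API, might look something like this Python sketch. The class names, the helper function, and the exact wording of the title are hypothetical; only the "list-of-list error" example comes from the discussion.

```python
from enum import Enum


class JsonLdErrorCode(Enum):
    """Named error conditions; each value doubles as a human-readable title."""
    LIST_OF_LISTS = "list of lists"


class JsonLdError(Exception):
    """Raised by an algorithm step; the API layer decides how to surface it."""

    def __init__(self, code: JsonLdErrorCode, detail: str = "") -> None:
        super().__init__(f"{code.value}: {detail}" if detail else code.value)
        self.code = code


def check_list_items(items: list) -> list:
    """Algorithm-side check that refers only to the shared constant."""
    for item in items:
        if isinstance(item, dict) and "@list" in item:
            raise JsonLdError(JsonLdErrorCode.LIST_OF_LISTS,
                              "a list may not contain another list")
    return items
```

In this arrangement the algorithm text depends only on the constant table, and an API definition can map each constant to whatever error mechanism it prefers.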
Markus Lanthaler: I don't see how the constants are coupling the algorithms with the API
Manu Sporny: let's take this part back to the list
… can we do a proposal on the high level text, and next week propose on the algorithms?
Gregg Kellogg: I would like to come back also to the RDF algorithms
dave+manu: also include the feature definition language?
PROPOSAL: Adopt the 'purpose' and 'general solution' language in Dave Longley's (alternate2.html) specification.
Manu Sporny: +1
Gregg Kellogg: +1
Dave Longley: +1
Niklas Lindström: +1
Markus Lanthaler: +1
Paul Kuykendall: +1
RESOLUTION: Adopt the 'purpose' and 'general solution' language in Dave Longley's (alternate2.html) specification.
Manu Sporny: Markus to review the algorithms; next week we'll handle whether or not we want to include Dave Longley's algorithm rewrites.

Topic: RDF Algorithms Section

Gregg Kellogg: there have been some issues regarding aligning with the RDF concepts, we need to determine the status of that
… also, to add explicit examples
Manu Sporny: yes, would be good (using turtle)
Markus Lanthaler: does it require the input to be expanded+flattened?
Gregg Kellogg: they're based upon expanded; there may be some recursion issue, but I'll look at whether it would be simplified by flattening
… complexity on the order of turtle parsing
Manu Sporny: it might be easier to explain without recursion
… looping over flattened input is probably easier to explain
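To illustrate why looping over flattened input avoids recursion, here is a small Python sketch that emits N-Triples-style lines (which are also valid Turtle). It is not the spec's RDF conversion algorithm: it assumes expanded and flattened input, string-valued `@value` entries, and no `@list` or named graphs, and the function names are made up for the example.

```python
from typing import Iterator, List

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def format_node(identifier: str) -> str:
    """Blank node labels pass through; IRIs get angle brackets."""
    return identifier if identifier.startswith("_:") else f"<{identifier}>"


def format_literal(value: dict) -> str:
    """Serialize a value object, assuming @value is already a string."""
    text = value["@value"].replace("\\", "\\\\").replace('"', '\\"')
    if "@language" in value:
        return f'"{text}"@{value["@language"]}'
    if "@type" in value:
        return f'"{text}"^^<{value["@type"]}>'
    return f'"{text}"'


def flattened_to_ntriples(nodes: List[dict]) -> Iterator[str]:
    """One flat pass: every node is top-level, so no recursion is needed."""
    for node in nodes:
        subject = format_node(node["@id"])
        for prop, values in node.items():
            if prop == "@id":
                continue
            if prop == "@type":
                for type_iri in values:
                    yield f"{subject} <{RDF_TYPE}> <{type_iri}> ."
                continue
            for value in values:
                if "@id" in value:
                    obj = format_node(value["@id"])
                else:
                    obj = format_literal(value)
                yield f"{subject} {format_node(prop)} {obj} ."


nodes = [{
    "@id": "http://example.org/a",
    "@type": ["http://xmlns.com/foaf/0.1/Person"],
    "http://xmlns.com/foaf/0.1/name": [{"@value": "Ann", "@language": "en"}],
    "http://xmlns.com/foaf/0.1/knows": [{"@id": "_:b0"}],
}]
lines = list(flattened_to_ntriples(nodes))
```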

Topic: ISSUE-217: Disallow BNode identifier as Graph Name

Manu Sporny: about using blank node identifiers as a graph name. We raised this with the RDF group. Their response is that graph names can only be IRIs.
… this is problematic when doing graph normalization. When you have two graphs, without bnode names, you have to generate a name
… we can't use a hash of the content to name the graph, we could use fragment IDs, but we'd be specifying something new. Basically, if we invent a new mechanism, we're just re-inventing bnode identifiers.
Dave Longley: if you have two graphs without id but same values, you'd have to assume they're the same graph, which is not correct.
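The problem with content-derived names can be shown with a short Python sketch. The `urn:sha256:` naming scheme and the `_:g` label prefix are invented for this illustration and are not part of any spec text discussed here.

```python
import hashlib
import json
from itertools import count
from typing import Iterator


def name_by_content_hash(graph: list) -> str:
    """Derive a graph name purely from the graph's content (hypothetical scheme)."""
    canonical = json.dumps(graph, sort_keys=True)
    return "urn:sha256:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def name_by_fresh_label(counter: Iterator[int]) -> str:
    """Issue a document-local, blank-node-like label for an unnamed graph."""
    return f"_:g{next(counter)}"


# Two distinct, unnamed graphs that happen to contain the same statements.
graph_a = [{"@id": "http://example.org/s",
            "http://example.org/p": [{"@value": "v"}]}]
graph_b = [{"@id": "http://example.org/s",
            "http://example.org/p": [{"@value": "v"}]}]

hash_names = {name_by_content_hash(g) for g in (graph_a, graph_b)}
counter = count()
label_names = {name_by_fresh_label(counter) for _ in (graph_a, graph_b)}
```

The hash-based approach yields a single name for both graphs, merging them, whereas locally issued labels keep them apart, which is effectively what blank node identifiers already provide.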
Gregg Kellogg: what if we say that if a graph occurs without an @id, it's a default graph?
Gregg Kellogg: according to RDF concepts, you can't have two graphs that don't have names
… when turning that into RDF, you cannot.
Manu Sporny: Well, if you named them with "blank graph names" you could. RDF Concepts states that you can't have anonymous graph names that are local to the document, which is a mistake.
Gregg Kellogg: fragment identifiers do that
Gregg Kellogg: you can't process the same document twice and get the same bnode out
Gregg Kellogg: we're setting ourselves up for problems if we diverge from RDF
Niklas Lindström: I agree with Gregg in principle - it'll just cause more problems if we diverge from RDF WG. [scribe assist by Manu Sporny]
Niklas Lindström: We support bnode names for properties, right? [scribe assist by Manu Sporny]
Manu Sporny: Yep. [scribe assist by Manu Sporny]
Niklas Lindström: Terms that don't have explicit @id of @null are dropped? [scribe assist by Manu Sporny]
Markus Lanthaler: yes. [scribe assist by Manu Sporny]
Niklas Lindström: We support blank nodes for properties, but not graphs? Syntax for @id supports bnode @ids, maybe we should do a SHOULD NOT support bnode IDs for properties and graphs? [scribe assist by Manu Sporny]
Niklas Lindström: The reason to have two blank node identifiers is to say that there are two graphs that are not named. [scribe assist by Manu Sporny]
Manu Sporny: yes, the problem is that the RDF model doesn't allow two different graphs to exist without having names, which is dumb because they allow two different nodes to exist without names. Seems like a completely arbitrary decision.
Manu Sporny: the reason for disallowing this seems more political than logical - no consensus to do anything, so don't do anything. This has a real-world consequence in that it will break the RDF Dataset Normalization Algorithm.
Gregg Kellogg: if the name must be an IRI, there is no issue. What we need to do is note that it's a violation if it's a bnode id.
Niklas Lindström: I haven't read RDF concepts in detail about this recently, one thing that strikes me as odd is that you never in any part of RDF Concepts, expect the IRI or bnode to be "different", apart from lists. [scribe assist by Manu Sporny]
Manu Sporny: the current RDF 1.1 concepts spec doesn't say that node and graph are disjoint
Manu Sporny: you can have two blank nodes that refer to each other. You cannot do that with two "blank" graphs. Why?
Gregg Kellogg: are we really bound to the RDF data model and WG? I think we are.
Manu Sporny: Gregg and I disagree here. We have done as much alignment as possible. There are minute differences where JSON-LD is explicitly more lax and accommodating. E.g. bnodeid's for properties.
… and up to last week support for bnode id's for graph ids
Dave Longley: it doesn't help with normalization though, which is tied to quads, we need to be able to use /something/ in the graph position. We've been using a blank-node like identifier.
… I think we need to say that if you're gonna use @graph other than as default; you need an @id
Discussion around the effect of bnode ids for graphs: they won't match across runs since those ids aren't stable... though the identifiers will be internally stable (to the document or quad-store).
Markus Lanthaler: are you normalizing datasets or graphs?
Gregg Kellogg: Datasets
Markus Lanthaler: but the algorithm is called graph normalization?
Manu Sporny: Datasets didn't exist when we wrote the first version of the spec.
Dave Longley: you normalize to quads
Gregg Kellogg: this is an issue for the RDF WG
Manu Sporny: yes. But it's important to understand that code we're deploying in two weeks uses bnode ids for graphs. If the normalization algorithm changes that's a problem
Dave Longley: It could work for payswarm if we disallow it; we can adapt
Manu Sporny: It'll be hard to convince the RDF WG that the RDF Concepts model is broken.
PROPOSAL: Disallow blank node identifiers for graph names.
Manu Sporny: -1 (I think if we do this, we align with the RDF data model, which is broken - no reason to disallow blank-node-like identifiers for graph names)
Gregg Kellogg: +1
Dave Longley: a sad -0
Markus Lanthaler: +0.5
David I. Lehn: -0
Niklas Lindström: +0.1
Manu Sporny: Is there anything that would get more consensus than this?
Markus Lanthaler: Yeah, what's in the spec right now.
Markus Lanthaler: if we don't support it in the data model we have to throw an error
Gregg Kellogg: by sticking with SHOULD, you allow for usage to evolve which could affect future RDF
Dave Longley: do we have feedback from RDF WG on SHOULD NOT vs MUST NOT?
Manu Sporny: not really...
Gregg Kellogg: my guess is they'd grudgingly go along with SHOULD NOT, but it would further convince them that JSON-LD is deviating from RDF unnecessarily.
Markus Lanthaler: this is the current spec text: "Each named graph is a pair consisting of an IRI or blank node identifier (the graph name) and a JSON-LD graph. Whenever possible, the graph name should be an IRI."
Gregg Kellogg: what would it mean to use the bnode-id as a subject in a description, and use the same bnode-id for the graph?
… this should be brought up to the working group now.
… it *may* result in a retreat of a MUST NOT
Niklas Lindström: We may want to discuss this with the Provenance WG. [scribe assist by Manu Sporny]
PROPOSAL: Graph names SHOULD use IRIs. The JSON-LD Data model supports identifiers for graphs that are IRIs and identifiers that look like blank node identifiers, but instead identify graphs. The RDF Conversion algorithm SHOULD generate an error when a non-absolute IRI is detected when converting to RDF.
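As a rough sketch of what the check in this proposal might amount to during RDF conversion, consider the Python below. The function names and the use of `ValueError` are assumptions for illustration, and the "has a scheme" test is only an approximation of "absolute IRI".

```python
from urllib.parse import urlsplit


def is_absolute_iri(name: str) -> bool:
    """Approximate test: blank-node-like labels and scheme-less names fail."""
    if name.startswith("_:"):
        return False
    return bool(urlsplit(name).scheme)


def graph_name_for_rdf(name: str) -> str:
    """Return the name unchanged, or signal an error during RDF conversion."""
    if not is_absolute_iri(name):
        raise ValueError(f"graph name is not an absolute IRI: {name!r}")
    return name
```

Under this reading, a name such as `_:g0` stays valid inside the JSON-LD data model but is rejected at the point of conversion to RDF.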
Markus Lanthaler: Counter Proposal: Keep the current spec text: "Each named graph is a pair consisting of an IRI or blank node identifier (the graph name) and a JSON-LD graph. Whenever possible, the graph name should be an IRI."
Gregg Kellogg: with no change, we need an issue marker for bnode ids as graph ids
Markus Lanthaler: That's already in the spec: "In contrast to the RDF data model as defined in [RDF-CONCEPTS], JSON-LD allows blank nodes as property labels and graph names. This feature is controversial in the RDF WG and may be removed in the future."
Manu Sporny: we can leave it open if we mark it as at risk (we can still go to LC)
… we'll bring it up again in the RDF WG

Topic: JSON-LD 1.0 Final Community Group Specification

Manu Sporny: Should we publish the FCGS specification?
Niklas Lindström: We still have outstanding issues, why now?
Manu Sporny: Because we want to get the Intellectual Property agreements in place while RDF WG is reviewing a semi-finalized specification.
Gregg Kellogg: I think we should wait a week, then try again.
JSON-LD 1.0 Community Group agrees to wait a week.