Linked Data in JSON Telecon

Minutes for 2011-7-26

Manu Sporny
Alfonso Martin, Dave Lehn, Manu Sporny, Danny Ayers, Gregg Kellogg, Nathan Rixham, Ted Thibodeau Jr.
Note: Manu Sporny is scribing.
Alfonso Martin: Going to lurk for the call today, looking at JSON-LD for a DBpedia project I'm working on.

Topic: JSON-LD Playground

Dave Lehn: Here is a link to the playground -
Manu Sporny: Something that we put together after the last telecon. It's an implementation of a JSON-LD/JSON-SD processor in JavaScript. Playground allows you to type in any JSON-SD compatible markup - examples at top for Person, Place, Event, Library, Recipe, etc. You can click tabs for viewing different forms at bottom of the page.
Danny Ayers: ooh, pretty syntax highlighting
Manu Sporny: You can view compacted form, which is what most webdevs are going to use, expanded form, which lets you see how the JSON expands to IRIs, normalized form for viewing the canonical form of the data. We believe that the normalization algorithm works for all degenerate cases, as far as we can tell. Framed form tab - pick the library example - framed form allows you to take an arbitrary graph and force it into a certain JSON structure. Working with graphs is hard - framed form allows you to query by example. When clicking on "Library" example and "Framed" tab, graph is on left, frame is on right - output is below. Allows webdevs to just use graphs like they do any other JSON data. Last tab is TURTLE tab - view the data as RDF/TURTLE.
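To make the relationship between the compacted and expanded forms concrete, here is a minimal Python sketch. It is a toy under stated assumptions, not the playground's implementation: the expand() helper and the sample document are hypothetical, and the real processing rules cover far more than key replacement.

```python
# Toy illustration only: expand() and the sample document are hypothetical
# and do not implement the actual JSON-LD processing rules.

def expand(doc):
    """Replace short keys with the IRIs that the document's @context maps them to."""
    context = doc.get("@context", {})
    expanded = {}
    for key, value in doc.items():
        if key == "@context":
            continue  # the context is consumed, not copied into the output
        # Keys without a mapping are left unchanged in this sketch.
        expanded[context.get(key, key)] = value
    return expanded


compacted = {
    "@context": {
        "name": "http://xmlns.com/foaf/0.1/name",
        "homepage": "http://xmlns.com/foaf/0.1/homepage",
    },
    "name": "Jane Doe",
    "homepage": "http://example.org/jane",
}

expanded = expand(compacted)
print(expanded)
```

The compacted document is what a web developer would write by hand; the expanded one shows the same data with every mapped key spelled out as a full IRI.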
Danny Ayers: manu, typical uses for Normalized?
Dave Lehn: digital signatures is one use
Manu Sporny: Several use cases - graph equality comparison, graph diffing, generating hashes for graphs, and generating digital signatures for graphs.
Danny Ayers: oh yeah, of course - thanks manu
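The hashing and equality use cases can be sketched in a few lines of Python. This is a simplification under a loud assumption: it only handles graphs without blank nodes, whereas a real normalization algorithm must also assign canonical labels to blank nodes. The graph_hash() helper and the example IRIs are hypothetical.

```python
import hashlib


def graph_hash(triples):
    """Hash a collection of (subject, predicate, object) string triples.

    Assumption: no blank nodes. Sorting the triples makes the result
    independent of the order in which they were written down.
    """
    canonical = "\n".join(" ".join(t) for t in sorted(set(triples)))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


graph_a = [
    ("http://example.org/a", "http://example.org/knows", "http://example.org/b"),
    ("http://example.org/b", "http://example.org/knows", "http://example.org/a"),
]
graph_b = list(reversed(graph_a))  # same statements, different order

print(graph_hash(graph_a) == graph_hash(graph_b))
```

Two serializations of the same graph produce the same digest, so the digest can serve for equality checks, and it is also the value one would sign for a digital signature.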
Manu Sporny: All source for website and JSON-LD playground is available here

Topic: Finalizing definition of Linked Data

Dave Lehn: Here is a link to the latest requirements -
Gregg Kellogg: We pretty much have agreement on section 3.1 - item #4
Gregg Kellogg: I think we could have agreement on the ISSUE between MAY vs. SHOULD - it should be SHOULD.
Manu Sporny: Let's go through each item to see what folks think about it. This decision may be made easier based on the discussion we've had on the mailing list.
Nathan Rixham: what is the difference between Structured and Linked Data?
Manu Sporny: I think for item 4, SHOULD for LD MAY for SD
Manu Sporny: Basically, there was a disagreement on mailing list on what Linked Data was. Kingsley and Glenn had concerns that we were calling things "Linked Data" when they were not Linked Data. For example, if a graph has a blank node in it, it shouldn't be called Linked Data. However, what some need are Structured Data - folks need blank nodes at times. Linked Data SHOULD use IRIs, and SHOULD NOT use blank nodes. Structured Data MAY use blank nodes. We're trying to come up with a good definition for Linked Data. JSON-SD /can/ do Linked Data - but it can also do Structured Data - which is what most people seem to want.
Nathan Rixham: I find it debatable that Linked Data and Structured Data are different. Structured Data is just data with structure - most types of data. Not going to split hairs over a name I suppose.
Manu Sporny: We've been splitting hairs for the last 3 weeks. Kingsley and Glenn feel that there has been this big mis-use of Linked Data - it was mistakenly tied to a technology (RDF and SPARQL) when it should not have been, and Linked Data also should not contain blank nodes. I believe that is Kingsley and Glenn's position.
Ted Thibodeau Jr.: I think that JSON-SD is just JSON. Because of that, I don't really want to go into JSON-SD vs JSON vs. JSON-LD... the distinction is between JSON and JSON-LD.
Manu Sporny: I don't follow.
Ted Thibodeau Jr.: JSON is, by its nature, structured data. Saying JSON-SD is different from JSON will confuse people.
Manu Sporny: Maybe we need another term?
Ted Thibodeau Jr.: No, we don't. JSON that is not JSON-LD is just JSON.
Danny Ayers: I'm inclined to agree that JSON-SD = JSON
Nathan Rixham: I agree with the basic principle of Kingsley, that Linked Data is data that is linked, and shouldn't be tied to RDF/SPARQL meme - but "structured data" seems like a misuse and a waste
Gregg Kellogg: Personally, I'd rather stick with the JSON-LD name. Can't please everyone all the time.
Nathan Rixham: JSON-LD is fine
Nathan Rixham: even RDF doesn't have to have links in it.. Linked is the point of why you're using JSON-LD, doesn't mean to say all JSON-LD must be linked - I suggest call it JSON-LD and stick to it
Ted Thibodeau Jr.: JSON is by its nature, structured data - JSON-SD is redundant
Manu Sporny: Right... but we have all of these processing rules that sit between JSON and JSON-LD. What are we calling this thing, then? We can't call it JSON-LD, and it's more narrow than just JSON. That was the issue we started off with - Kingsley and Glenn insist that we can't call something that contains blank nodes - Linked Data.
Ted Thibodeau Jr.: What is the distinction between traditional JSON and JSON-SD in your coining?
Manu Sporny: JSON-SD is JSON plus a context. JSON-SD has processing rules to turn it into Linked Data and/or RDF, regular JSON doesn't.
Danny Ayers: manu suggests JSON-SD = JSON + context + processing rules
Danny Ayers: how does a processor know some JSON is JSON-LD? (something like mime type?)
Manu Sporny: A processor will know that something is JSON-SD/JSON-LD because there will be a @context declared.
Danny Ayers: aha
Ted Thibodeau Jr.: Yes, but what is the difference between JSON and JSON-SD.
Manu Sporny: JSON-SD has a @context. JSON does not. JSON, in and of itself, cannot be transformed to Linked Data.
Ted Thibodeau Jr.: The @context is external?
Manu Sporny: The @context can be external, or it may be declared in-line. Any time that you see @context as a key - it's no longer plain JSON - it's JSON-SD - there is extra meaning you can extract from that JSON.
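The detection rule described here (an @context key means the document is no longer plain JSON) can be sketched as follows. The has_context() helper is hypothetical, and treating only the top-level object is an assumption made for brevity.

```python
import json


def has_context(text):
    """Return True if the top-level JSON object carries an "@context" key.

    Assumption: only the top-level object is inspected in this sketch.
    """
    doc = json.loads(text)
    return isinstance(doc, dict) and "@context" in doc


plain = '{"name": "Jane Doe"}'
with_context = '{"@context": {"name": "http://xmlns.com/foaf/0.1/name"}, "name": "Jane Doe"}'

print(has_context(plain), has_context(with_context))
```

Both inputs are valid JSON; only the second gives a processor something to map keys to IRIs with.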
Ted Thibodeau Jr.: Is @context new? Has it always been used with JSON?
Manu Sporny: @context has nothing to do with JSON. It's just another key - it's never existed in JSON.
Ted Thibodeau Jr.: In this structuring, there is JSON w/o @context, there is JSON-SD, which adds @context and then there is JSON-LD, which only uses IRIs.
Manu Sporny: Yes. Unfortunately, the nuance is so small that people are wondering why we are making the distinction. Kingsley and Glenn feel strongly about this.
Ted Thibodeau Jr.: I feel strongly about it as well. We should not say blank nodes are OK in "Linked Data".
Gregg Kellogg: JSON-SD is JSON-LD + BNodes. Having to call it JSON-LD, because it's not fully linked is like saying you can't have a "functional" language with any non-functional features.
Nathan Rixham: there is no reason to artificially limit JSON-LD such that people can't mix in basic unlinked-JSON; if I want only half of my data to be exposed as linked data, and the rest as boring old unlinked data (say debugging info), then that should be fine, surely? - all of this JSON/JSON-SD/JSON-LD is too complicated, JSON-LD should be a layer above JSON to allow Linked Data
Nathan Rixham: I don't see a reason to limit JSON-LD. This is getting complicated - JSON, JSON-SD and JSON-LD. This just adds a layer on top of JSON - we don't need three levels.
Ted Thibodeau Jr.: (JSON)+(JSON-LD) is not the same as (JSON-LD)
Manu Sporny: We're not limiting JSON-LD, we're just talking about definitions - what is Linked Data (JSON-LD)... what is JSON-SD? We're not talking about limiting the spec - it's going to support all of the variations that we're talking about. Kingsley, Ted and Glenn's issue is that we need to be very clear about the definitions before writing the spec.
Ted Thibodeau Jr.: The big difference between JSON and JSON-LD is IRIs... it is a limit, it is a subset with a lot more application, a lot easier working environment. If you know that everything that matters to you has an IRI - then you don't have to do a great deal of work. JSON-LD docs are going to use IRIs in particular places - that's a win. Regular JSON may use this stuff, but fewer are going to be able to work with it as easily as JSON-LD. Making a subset of JSON, which is what JSON-LD is, makes working with this data easier. You can still use JSON inside JSON-LD, but don't call that document JSON-LD.
Gregg Kellogg: Except we need processing rules for intersection of JSON and JSON-LD
Manu Sporny: The technical concern is that we need to be able to specify how you handle JSON-LD mixed in with a regular JSON document.
Ted Thibodeau Jr.: The issue is with blended documents - that's hard to do.
Manu Sporny: I don't think we're trying to take it that far. I'm certainly not interested in mixing arbitrary JSON w/ JSON-LD. We just want to make sure that unlabeled nodes are supported.
Nathan Rixham: JSON-LD looks like a SUPERset not a subset to me; also, if it's a specific subset that's only "linked data" then why not just RDF/JSON and forget this all together?
Nathan Rixham: I thought the idea was to augment JSON such that people can expose some bits as linked data.
Danny Ayers: hmm, JSON-LD is JSON from which at least one IRI-based triple can be extracted from, so the rest of the JSON is MUST IGNORE or ERROR?
Nathan Rixham: danja, I'd thought of it slightly differently, take a chunk of JSON, throw it through a JSON-LD parser, and see what triples come out
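The "throw it through a parser and see what triples come out" idea can be illustrated with a toy extractor. Everything here is an assumption for illustration: the to_triples() helper, the use of an "@id" key to name the subject, the "_:b0" label for an unlabeled node, and in particular the choice to silently ignore keys that the context does not map, which is one of the two options raised above and was not decided on the call.

```python
def to_triples(doc):
    """Extract (subject, predicate, object) triples from a flat JSON object.

    Assumptions: "@id" names the subject (hypothetical keyword), a missing
    "@id" yields an unlabeled node, and unmapped keys are ignored.
    """
    context = doc.get("@context", {})
    subject = doc.get("@id", "_:b0")
    triples = []
    for key, value in doc.items():
        if key.startswith("@"):
            continue  # keywords carry no data of their own
        if key not in context:
            continue  # plain JSON key: no IRI, so no triple
        triples.append((subject, context[key], value))
    return triples


doc = {
    "@context": {"name": "http://xmlns.com/foaf/0.1/name"},
    "@id": "http://example.org/jane",
    "name": "Jane Doe",
    "debug": "served from cache",  # unlinked data mixed into the document
}

for triple in to_triples(doc):
    print(triple)
```

Only the "name" key yields a triple; the "debug" key stays ordinary JSON that a consumer can still read but that contributes nothing to the graph.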
Ted Thibodeau Jr.: It's when you try to step outside of the bounds of Linked Data that this becomes an issue.
Manu Sporny: I'm trying to understand your point. We want Linked Data - but there are cases where it's not a good thing to require an IRI for everything. There are demonstrated use cases where we don't want IRIs used for everything.
Ted Thibodeau Jr.: Like what?
Manu Sporny: Digital signatures... the list of payees in a financial contract... the name of a person that made something coupled with their home page... We don't want to give an IRI for everything under the sun.
Ted Thibodeau Jr.: Why not?
Gregg Kellogg: LD is not just IRI, but _dereferenceable_ IRI. What do you get when you dereference the signature of a document?
Nathan Rixham: define: "unlabeled node"
Manu Sporny: Nathan effectively - unlabeled node == blank node. We were using unlabeled node to not use RDF terminology (and all of the baggage that brings into the discussion)
Nathan Rixham: they're JSON-LD use cases, not linked data use cases, just as you get blank nodes in RDF, or anonymous objects pretty much everywhere
Ted Thibodeau Jr.: A financial system like that should be vetted for proper operation.
Ted Thibodeau Jr.: I don't think those are Linked Data use cases. If you want to use blank nodes, great. If you use blank nodes, you are not using Linked Data. It's a mixed use case - it's not a Linked Data use case. Since it's a mixed use case, it should be treated as such. You can't do everything with everything.
Manu Sporny: I don't understand where you're going with this. I don't understand what changes are going to be made to the spec based on your line of argumentation. I don't know what you're proposing.
Ted Thibodeau Jr.: I'm proposing that we start with clear-er diagrams of what these things are. This discussion is going in circles, both here and on the mailing list.
Manu Sporny: Wait, no. JSON-LD is a subset of JSON-SD which is a subset of JSON. How does this impact the spec? That's what I'm concerned about.
Ted Thibodeau Jr.: I don't know. What I see is ongoing confusion about the same issues.
Manu Sporny: I thought we had those resolved on the mailing list. What you're saying is different from my read on the mailing list. I thought the group was in agreement on how we should define this and how we should go forward? Your statements are unique and new to me.
Ted Thibodeau Jr.: Your statements at the beginning were confusing.
Manu Sporny: Have you looked at the requirements document to see if you agree with the Linked Data definition?
Danny Ayers: extreme case, can JSON-LD contain parser-ignored comments?
Dave Lehn: Here's the URL we're looking at -
Ted Thibodeau Jr.: The "according to Wikipedia" bit - wikipedia isn't a standards thing.
Manu Sporny: I'm talking about section 3.1
Manu Sporny: We agreed to the change to #4
Ted Thibodeau Jr.: The definition of Linked Data in section 3.1 looks good.
Manu Sporny: That's good - that means we have a definition of Linked Data that works for everybody. So, what is the additional item that we want to add to that section to make unlabeled nodes work? That's the only item that we need to discuss. What we did on the mailing list is that we called that "Structured Data" and said that the one difference is that Structured Data can specify unlabeled node identifiers.
Danny Ayers: SHOULD in the Linked Data definition still allows for "Structured Data", that may be good enough. It's not it MUST be an IRI, so there is wiggle room.
Manu Sporny: Yes, but we wanted language that was stronger. To tell people that it is okay to have unlabeled nodes.
Danny Ayers: It's not really a big issue, but it would be nice to say something in the spec.
Manu Sporny: Is the use case that we would like to support acceptable to you? We need to be able to express unlabeled nodes. If we make people create IRIs for everything, we're going to end up with some nasty data out there.
Ted Thibodeau Jr.: Do we have use cases any place?
Manu Sporny: No, not for JSON-LD. We have the PaySwarm use cases, but not JSON-LD use cases.
Ted Thibodeau Jr.: We should have the use cases justifying the requirements and features in the spec written down somewhere.
Manu Sporny: Right.
Ted Thibodeau Jr.: I personally hate blank nodes.
Manu Sporny: Have you ever needed blank nodes?
Ted Thibodeau Jr.: I have never found a use case where blank nodes were required.
Manu Sporny: We're not saying they're required - we're saying that they'll make implementers' lives easier.
Danny Ayers: would personally just ditch the SD notion, it doesn't really seem necessary, but not a big deal either way
Gregg Kellogg: part of the problem is the _dereferenceable_ nature of URIs in LD. BNode use-cases that could be URIs shouldn't be dereferenceable.
Manu Sporny: Maybe we should talk about the requirements.

Topic: Discuss JSON-SD Requirements

Dave Lehn: The URL for the requirements - section 3.2 - JSON-LD -
Danny Ayers: I bet there are still a lot of FOAF people out there that are still bnodes - and I'd prefer not to mint URIs for them
A JSON-LD document must be able to express a linked data graph.
Manu Sporny: Any objections to that statement?
Nathan Rixham: does that quantify to every linked data graph? some of them or?
Danny Ayers: That is confusing
Gregg Kellogg: _must_ in 3.2.1 means that 3.1.4 is MUST, not SHOULD
Manu Sporny: Then how about "A JSON-SD document MUST be able to express a linked data graph." ? JSON-SD can also express a mixture of Linked Data and Non-Linked Data (aka. Structured Data). Specifically, the ability to express unlabeled nodes (aka. blank nodes). Any disagreement with that statement?
Gregg Kellogg: 3.1.4 A subject MUST be labeled with an IRI.
Nathan Rixham: manu, so everything that is a valid linked data graph, must be expressible in JSON-*D? "Linked Data" is defined as "should, should, may" - so then "1 1 1" would be valid.
Gregg Kellogg: 3.1.4 is SHOULD, as it may be a literal.
Ted Thibodeau Jr.: Still digesting the IRI comment from Gregg.
Nathan Rixham: and must be able to be expressed in JSON-*D
Gregg Kellogg: No, no subject literals.
3.1.4 A subject SHOULD be labeled with an IRI?
Manu Sporny: Are we good with SHOULD?
Danny Ayers: could live with MUST for 3.1.4 as long as there was an informative note to say that any bnodes are not considered part of the linked data graph (i.e. they're ok)
Manu Sporny: Yes, that's interesting. So, would that mean that, for a graph that has references to blank nodes, you remove any reference to a blank node, and what's left behind is a Linked Data graph?
Danny Ayers: ok but not in the LD graph
Nathan Rixham: or skolemize it *lol*
Danny Ayers: right :)
Manu Sporny: I'm fine w/ that.
Ted Thibodeau Jr.: Subjects and properties are always de-referenceable. You can build a graph with bnodes, but that's not Linked Data.
Manu Sporny: Yes, and to get to Linked Data, one can remove bnode refs, skolemize or do a variety of other things.
Ted Thibodeau Jr.: Yes, I think that's a good direction. We have three nested circles - JSON (biggest), JSON-SD inside (JSON), JSON-LD inside JSON-SD. We need a diagram - we can say it explicitly.
Gregg Kellogg: That leads to a disjoint graph if you have a bnode linking two IRI-based identifiers.
Danny Ayers: - only for definition purposes, in practice you might want to keep the lot in your store
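The two options mentioned above, removing blank node references or skolemizing them, can be sketched as follows. This is a toy under stated assumptions: terms are plain strings, a "_:" prefix marks a blank node, and the helper names and the skolem base IRI are hypothetical.

```python
def is_bnode(term):
    """Assumption: blank nodes are written with a "_:" prefix."""
    return term.startswith("_:")


def drop_bnodes(triples):
    """Keep only triples whose subject and object are not blank nodes."""
    return [t for t in triples if not (is_bnode(t[0]) or is_bnode(t[2]))]


def skolemize(triples, base="http://example.org/genid/"):
    """Replace each blank node with an IRI minted under a hypothetical base."""
    def fix(term):
        return base + term[2:] if is_bnode(term) else term
    return [(fix(s), p, fix(o)) for s, p, o in triples]


graph = [
    ("http://example.org/alice", "http://example.org/knows", "_:b0"),
    ("_:b0", "http://example.org/knows", "http://example.org/bob"),
    ("http://example.org/alice", "http://example.org/name", "Alice"),
]

print(drop_bnodes(graph))
print(skolemize(graph))
```

In this example, dropping the blank node leaves only the name triple, so alice and bob are no longer connected, which is the disjoint graph concern raised above. Skolemizing keeps the path between them at the cost of minting an IRI.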
Nathan Rixham: at the minute.. this is just RDF in JSON tbh (sorry)
Gregg Kellogg: Since we're defining LD, I'm fine with MUST. Perhaps we need a "Structured Data" definition too.
Manu Sporny: Ok, so we're making two changes to the Linked Data part of the spec. Changes to 3.1.4 (MAY->MUST) and 3.1.6 (SHOULD->MUST) - that is our definition of Linked Data that we're using for the spec.
Nathan Rixham: so now "Linked Data" is a ground RDF graph, yeah?
Gregg Kellogg: MacTed, can you provide a picture to include?
Danny Ayers: good man
Nathan Rixham: Linked Data == Ground RDF Graph, (contains no bnodes) --- Structured Data == RDF Graph (contains bnodes too) --- JSON == well... json... the host serialization language
Danny Ayers: webr3, that sounds reasonable
Nathan Rixham: that's how it's been defined in the call so far
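The three-way split summarized above can be written as a small classifier. This is only a sketch of the definitions as stated on the call: the classify() helper is hypothetical, terms are plain strings, and a "_:" prefix is assumed to mark a blank node.

```python
def is_bnode(term):
    """Assumption: blank nodes are written with a "_:" prefix."""
    return term.startswith("_:")


def classify(triples):
    """Label extracted triples using the definitions from the call.

    No triples at all: just JSON. Any blank node: Structured Data.
    Otherwise the graph is ground: Linked Data.
    """
    if not triples:
        return "JSON"
    if any(is_bnode(s) or is_bnode(o) for s, _, o in triples):
        return "Structured Data"
    return "Linked Data"


ground = [("http://example.org/a", "http://example.org/knows", "http://example.org/b")]
with_bnode = ground + [("_:b0", "http://example.org/knows", "http://example.org/a")]

print(classify([]), classify(ground), classify(with_bnode))
```

The nesting matches the three circles described earlier: every Linked Data graph also passes as Structured Data once the blank node check is relaxed, and both are serialized as JSON.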
Danny Ayers: See the first principle in
Danny Ayers: is Use URIs as names for things
Manu Sporny: I think we're in more agreement now than in the beginning of the call. I think we're close - we have the definition of Linked Data. We just need a way to talk about the requirements, use cases, etc. Can folks create use cases and post them to the mailing list? One use case per person over the next two weeks? Any volunteers?
Gregg Kellogg: We need to create a Use Cases document.
Nathan Rixham: I will send a couple
Gregg Kellogg: I'll create a couple
Danny Ayers: will try and get at least one together
Manu Sporny: We'll put 4 use cases together.
Manu Sporny: Great, this will help ground the discussion even further. That's the call - have a great week!