Linked Data in JSON Telecon

Minutes for 2011-07-04

Agenda
http://lists.w3.org/Archives/Public/public-linked-json/2011Jul/0001.html
Chair
Manu Sporny
Scribe
Thomas Steiner
Present
Manu Sporny, Gregg Kellogg, Bradley Allen, Thomas Steiner, Michael Johnson, Dave Longley, Niklas Lindström
Audio Log
audio.ogg
Note: Thomas Steiner is scribing.
Manu Sporny: Does anyone have objections to swapping agenda items #3 and #1? We would do introductions first, then talk about the current state of the spec and implementations, and then go into attempting to define Linked Data more formally.
Michael Johnson: +1 change agenda

Topic: Introductions

Manu Sporny: involved in RDFa standardization and PaySwarm, which was the reason for creating JSON-LD
Manu Sporny: JSON-LD was influenced by Mark Birbeck's RDFj and Ian Davis of Talis' RDF/JSON work.
Gregg Kellogg: consultant, working in media area (music, video, etc), responsible for most of the RDF parsers in Ruby
Bradley Allen: working on extending APIs for giving our developers access to metadata, our company is one of the largest publishers of scientific data in the world.
Bradley Allen: started from XML underpinning, moving more & more to json
Thomas Steiner: Coming from REST corner, went to Semantic Web/Linked data corner - developing RDFa browser extensions.
Thomas Steiner: Thought it would be great if we could not only get metadata out of the pages, but also publish data in JSON in a more Webby way
Thomas Steiner: Also in the RDF and RDFa Working Groups at W3C. I feel there is a need for serialization... I feel unsure exactly what I want out of this - full glory of JSON, full glory of RDF, something in the middle. Should be something that supports triples at the least.
Michael Johnson: I work at Digital Bazaar on the PaySwarm stuff and the Universal Payment standard initiative
We need something like JSON-LD for PaySwarm
Dave Longley: I'm the CTO of Digital Bazaar
Dave Longley: also interested in getting JSON-LD working for PaySwarm.

Topic: State of the Specs and Implementations

Manu Sporny: there are two specs for JSON-LD right now, advanced and basic
Manu Sporny: do we want this separation at all? That's actively being discussed in the community.
Manu Sporny: most complete implementation is the one Dave Longley is working on, could you take us through that, Dave?
Dave Longley: JSON-LD JavaScript: https://github.com/digitalbazaar/forge
Dave Longley: JSON-LD C++: https://github.com/digitalbazaar/monarch/blob/jsonld/cpp/data/json/JsonLd.cpp
Dave Longley: worked on the JavaScript implementation, which is pretty complete.
Dave Longley: and the c++ implementation
Dave Longley: implementations include expanding objects (CURIE expansion)
Dave Longley: implementations include compacting (CURIE compaction and remapping)
Dave Longley: implementations also include normalization & framing
Dave Longley: corner cases difficult to implement
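A minimal sketch of the CURIE expansion and compaction steps mentioned above, assuming a hypothetical prefix map. The function names and the PREFIXES table are illustrative assumptions and are not taken from the forge or monarch code; nested objects and the remapping step are not handled.

```python
# Hypothetical prefix map; the prefixes chosen here are illustrative only.
PREFIXES = {
    "foaf": "http://xmlns.com/foaf/0.1/",
    "dc": "http://purl.org/dc/terms/",
}


def expand_curie(value, prefixes=PREFIXES):
    """Turn 'prefix:suffix' into a full IRI when the prefix is known."""
    prefix, sep, suffix = value.partition(":")
    if sep and prefix in prefixes:
        return prefixes[prefix] + suffix
    return value  # unknown prefix or already absolute: leave untouched


def compact_iri(iri, prefixes=PREFIXES):
    """Turn a full IRI into 'prefix:suffix' using the longest matching namespace."""
    best = None
    for prefix, namespace in prefixes.items():
        if iri.startswith(namespace):
            if best is None or len(namespace) > len(prefixes[best]):
                best = prefix
    if best is None:
        return iri  # no known namespace matches
    return best + ":" + iri[len(prefixes[best]):]


def expand_keys(obj, prefixes=PREFIXES):
    """Expand the keys of a flat JSON object (nesting is ignored in this sketch)."""
    return {expand_curie(key, prefixes): value for key, value in obj.items()}


def compact_keys(obj, prefixes=PREFIXES):
    """Compact the keys of a flat JSON object (nesting is ignored in this sketch)."""
    return {compact_iri(key, prefixes): value for key, value in obj.items()}
```

With this table, compacting the result of expanding a flat object gives back the original keys, which is the round trip the two operations are meant to support.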
Dave Longley: framing feature not yet spec'ed, but used internally to build simple representations of graphs - think simple JSON objects. Kind of like Projections in RDFa. Developers just want to work with simple objects directly, not necessarily graph APIs.
Dave Longley: frame acts as a filter to the graph dataset - just give me what I want to work with. Just give me events, or people, etc.
Dave Longley: We have a set of tests for compaction/expansion/normalization: https://github.com/digitalbazaar/forge/tree/master/tests/jsonld
Dave Longley: allows for working in a more natural way. You can see an example of a frame in the test suite
Dave Longley: https://github.com/digitalbazaar/forge/blob/main/tests/jsonld/frame-0001-in.json
Dave Longley: https://github.com/digitalbazaar/forge/blob/main/tests/jsonld/frame-0001-frame.json
Dave Longley: https://github.com/digitalbazaar/forge/blob/main/tests/jsonld/frame-0001-out.json
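A rough sketch of the "frame as a filter" idea described above, since framing was not yet specified at the time of the call. Everything here is an assumption made for illustration: the "id", "type" and "ref" keys, the frame() signature, and the sample data are hypothetical and do not reflect the frame format used in the test suite.

```python
# Hypothetical flat graph: a list of node objects that refer to each other by id.
GRAPH = [
    {
        "id": "http://example.org/events/1",
        "type": "Event",
        "name": "Telecon",
        "chair": {"ref": "http://example.org/people/alice"},
    },
    {"id": "http://example.org/people/alice", "type": "Person", "name": "Alice"},
]


def frame(nodes, wanted_type, properties):
    """Return simple objects for nodes of wanted_type.

    Only the listed properties are kept, and a value of the form
    {"ref": <id>} is replaced by a copy of the referenced node
    (one level deep only).
    """
    by_id = {node["id"]: node for node in nodes if "id" in node}
    result = []
    for node in nodes:
        if node.get("type") != wanted_type:
            continue  # the frame filters out everything else
        shaped = {"id": node.get("id"), "type": wanted_type}
        for prop in properties:
            if prop not in node:
                continue
            value = node[prop]
            if isinstance(value, dict) and value.get("ref") in by_id:
                value = dict(by_id[value["ref"]])  # embed the referenced node
            shaped[prop] = value
        result.append(shaped)
    return result


events = frame(GRAPH, "Event", ["name", "chair"])
```

The caller asks only for events and gets back plain nested objects, without touching a graph API.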
Dave Longley: that's it for JavaScript, C++ has the same set of features - it's just a port of the JavaScript code.
Dave Longley: A python port in the works, Mike's working on that.
Michael Johnson: I took the JavaScript code and was able to port it over pretty quickly. It took 3 days and the Python version passes all of the tests, except for the evil normalization test.
Dave Longley: :)
Michael Johnson: JSON-LD Python: https://github.com/digitalbazaar/payswarm-python
Michael Johnson: we need to update the public implementation, but we'll release the code when we upgrade the PaySwarm Authority reference implementation. The Python JSON-LD code will eventually get its own repository. I've also been writing a unit test framework for the Python implementation.
Gregg Kellogg: I have a complete parser working, the serializer is working except for the normalization stuff.
Gregg Kellogg: https://github.com/gkellogg/json-ld
Bradley Allen: Right now I haven't updated my JSON-LD implementation to the latest spec - I've taken elements of the spec and applied it to specific purposes for Linked Data.
Bradley Allen: Using the idiom itself to do encoding - using that w/ MongoDB to be able to build a low-complexity query RDF store in that framework.
Bradley Allen: using it together with mongo db to build an RDF store
some more JSON-LD implementations can be found by googling
Manu Sporny: Henri Bergius has a JSON-LD implementation for VIE - part of the IKS semantic wiki project in the EU.
Manu Sporny: spec is fairly up-to-date as far as the syntaxes are concerned, barring any major changes to the data model.
Manu Sporny: two things going on in parallel
Manu Sporny: We believe that normalization for the non-ridiculous cases can be dealt with easily, the ridiculous cases are NP, but Dave is working on solutions for those.

Topic: Formal Definition of Linked Data

Manu Sporny: some folks are saying that RDF is too complex and that we need to simplify the linked JSON stuff
Manu Sporny: gregg has put a doc together
Manu Sporny: http://json-ld.org/requirements/latest/
Manu Sporny: let's go thru Gregg's doc and do a straw poll to see if we have consensus on some of the items.
Gregg Kellogg: seemed like email convos were going round in circles
Gregg Kellogg: thought might make sense to structure the discussion
Gregg Kellogg: what is linked data? We should get a good formal definition before we go too far.
Gregg Kellogg: bnodes made it seem too close to RDF
Gregg Kellogg: thought we could talk about it in the form of graphs
Gregg Kellogg: if we know what linked data means, what does it imply for JSON?
Gregg Kellogg: what is the model we're looking at?
Gregg Kellogg: is it a graph? or is it a json-centric model? or is it something else?
Gregg Kellogg: "What is Linked Data?" seems to be the crux of the confusion on the mailing list.
Manu Sporny: let's go thru each item
Michael Johnson: +1
Manu Sporny: Linked Data is used to express relationships between entities expressed as subject-predicate-object, or entity-attribute-value.
Manu Sporny: (was suggesting to go thru points one by one)
Dave Longley: not everything has to be expressed as a set of triples
Dave Longley: that was gkellogg
Gregg Kellogg: linked data used to represent directed graphs
Gregg Kellogg: destination nodes can be called subject/object, vertexes predicates
Manu Sporny: Linked Data is used to represent a directed graph, and within the context of Linked Data, the graph can be represented as connections between different nodes, nodes are subjects and objects, links are properties. Nodes may have identifiers that are URIs allowing them to be externally addressed.
Bradley Allen: would be clarifying if we scrub the rdf roots
Bradley Allen: need to be crisp about the terminology
Manu Sporny: what you find over time is the need to use words that define concepts, (over)loaded with decades of usage
Manu Sporny: an example would be blank nodes
Manu Sporny: { "name": "Bradley Allen" }
Manu Sporny: the immediate thing you get out of that example is the requirement for something like a blank node. That JSON object doesn't have an identifier, and there is a discussion going on about whether we need identifiers all the time
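One way to picture the blank-node requirement is a sketch that gives every JSON object without an identifier a locally generated label. The "id" key, the label_nodes() name, and the "_:b0" label style (borrowed from the blank-node convention in Turtle and N-Triples) are assumptions for illustration; such labels have no meaning outside the document.

```python
import itertools


def label_nodes(objects, id_key="id"):
    """Pair each JSON object with a label.

    Objects that carry an identifier keep it; the rest receive a
    document-local placeholder label such as '_:b0'.
    """
    counter = itertools.count()
    labeled = []
    for obj in objects:
        if id_key in obj:
            labeled.append((obj[id_key], obj))
        else:
            labeled.append(("_:b%d" % next(counter), obj))
    return labeled


pairs = label_nodes([
    {"name": "Bradley Allen"},  # no identifier: becomes an unlabeled node
    {"id": "http://example.org/people/alice", "name": "Alice"},
])
```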
Manu Sporny: we've been through this simplification game before
Thomas Steiner: linked data is imho the concept of providing links where no links used to exist before
Thomas Steiner: quoting http
Manu Sporny: not sure if we can create a spec on top of that - the definition is too loose.
Dave Longley: they seem to end up being the same definitions
Manu Sporny: let's strawpoll with what Gregg said.
Manu Sporny: Linked Data is used to represent a directed graph, and within the context of Linked Data, the graph can be represented as connections between different nodes, nodes are subjects and objects, links are properties. Nodes may have identifiers that are URIs allowing them to be externally addressed.
Gregg Kellogg: +1
Manu Sporny: +1
Michael Johnson: +1
Dave Longley: +1
Thomas Steiner: +1
Bradley Allen: +1
Niklas Lindström: +1 :)
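The definition polled above can be sketched as a plain adjacency structure, assuming that nodes and properties are both identified by IRI strings. The example.org IRIs and the helper names are hypothetical; this only illustrates the data model, not any JSON-LD syntax.

```python
ALICE = "http://example.org/people/alice"
BOB = "http://example.org/people/bob"
CAROL = "http://example.org/people/carol"
KNOWS = "http://xmlns.com/foaf/0.1/knows"

# Each node maps to its outgoing connections as (property, target) pairs.
GRAPH = {
    ALICE: [(KNOWS, BOB)],
    BOB: [(KNOWS, CAROL)],
}


def follow(graph, node, prop):
    """Return the nodes reached from `node` along connections labeled `prop`."""
    return [target for (label, target) in graph.get(node, []) if label == prop]


def all_nodes(graph):
    """Collect every node, including those that only appear as targets."""
    found = set(graph)
    for connections in graph.values():
        for _label, target in connections:
            found.add(target)
    return found
```

Because the nodes are addressed by IRI, a consumer holding only the string for Alice can look her up and follow the connection to Bob.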
Manu Sporny: This is good! I thought that we may have a fundamental disagreement on the core of Linked Data, but it doesn't seem like we do.
Manu Sporny: we still need to make sure those on the list agree, but this is a good start at consensus.
Manu Sporny: Next item: A subject is a non-terminal node in a directed graph.
Bradley Allen: the language needs cleansing, but the sense is +1able
Manu Sporny: i could see someone misunderstand this with the "non-terminal" part
Bradley Allen: what's really being said here is that a subject is a node with a link degree != 0
Manu Sporny: A subject is any node in a directed graph with at least one outgoing link.
Manu Sporny: +1
Dave Longley: +1
Bradley Allen: +1
Thomas Steiner: +1
Gregg Kellogg: +1
Michael Johnson: +1
Niklas Lindström: +1
Manu Sporny: A subject may be given a unique identifier represented using a URI.
shall we combine this with the previous definition?
Manu Sporny: A subject may be labeled with an IRI.
Gregg Kellogg: +1
Dave Longley: +1
Manu Sporny: +1
Bradley Allen: what that does is clarify how to treat blank nodes
Thomas Steiner: +1
Bradley Allen: +1
Michael Johnson: +1
Niklas Lindström: Since I can't dial in, I'm missing a lot, but aren't we just (re-) defining the graph model underpinning RDF here?
Manu Sporny: more or less :)
Manu Sporny: but we're doing it w/o talking about RDF at all, which may have some benefits.
Bradley Allen: yep
Bradley Allen: +1
Niklas Lindström: +1 anyway (my SIP connection is borked somehow; probably the company net)
Manu Sporny: now we're entering into shaky ground with blank nodes, we should skip this and leave it until the end.
Manu Sporny: proposed change here
Manu Sporny: ok to use property instead of predicate?
Manu Sporny: A property describes an edge of the directed graph relating two entities.
Thomas Steiner: +1 to move to property
Manu Sporny: A property describes an edge of the directed graph relating two nodes.
Manu Sporny: A property is an edge of the directed graph relating two nodes.
Manu Sporny: A property is an edge of the directed graph.
Gregg Kellogg: +1
Dave Longley: +1
Manu Sporny: +1
Thomas Steiner: +1
Niklas Lindström: +1
Michael Johnson: +1
Bradley Allen: +1
Bradley Allen: that takes us to 8, a property may be labeled with an IRI
Manu Sporny: A property SHOULD be labeled with an IRI.
Discussion around SHOULD or MAY language and consistency.
Manu Sporny: we should use spec terms, so let's make it SHOULD, as we strongly suggest people identifying properties with IRIs
Bradley Allen: how does this align to JSON usage of property names
Gregg Kellogg: this is where we start to confuse syntax with Linked Data.
Gregg Kellogg: json-ld must provide a way to do this, but that's why we have context - it's higher up the stack.
Gregg Kellogg: we didn't make this statement for subjects, as oftentimes there are no proper IRIs defined for subjects
Gregg Kellogg: a property relationship is a best practice
Manu Sporny: A property SHOULD be labeled with an IRI almost always, there are very few use cases that make sense where you don't label a property with an IRI. There are more use cases where you don't label a node with an IRI.
Gregg Kellogg: that's the motivation behind SHOULD for property vs. MAY for subject
Manu Sporny: +1
Gregg Kellogg: +1
Bradley Allen: +1
Thomas Steiner: +1
Michael Johnson: +1
Dave Longley: +1
Niklas Lindström: +1
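The split between plain JSON property names and IRI-labeled properties can be sketched with a context that maps short names to IRIs. The CONTEXT table, the label_properties() helper, and the crude "://" test for an absolute IRI are assumptions for illustration and are not the JSON-LD context algorithm. Unmapped names are reported rather than rejected, in line with SHOULD rather than MUST.

```python
# Hypothetical context: short JSON property names mapped to IRIs.
CONTEXT = {
    "name": "http://xmlns.com/foaf/0.1/name",
    "homepage": "http://xmlns.com/foaf/0.1/homepage",
}


def label_properties(obj, context):
    """Relabel the keys of a flat JSON object with IRIs where possible.

    Returns the relabeled object plus a sorted list of property names
    that could not be tied to an IRI.
    """
    labeled = {}
    unlabeled = []
    for key, value in obj.items():
        if key in context:
            labeled[context[key]] = value
        elif "://" in key:
            labeled[key] = value  # crude check: already looks like an IRI
        else:
            labeled[key] = value  # allowed, but flagged for the caller
            unlabeled.append(key)
    return labeled, sorted(unlabeled)


labeled, missing = label_properties(
    {"name": "Alice", "nickname": "Al"},
    CONTEXT,
)
```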
Manu Sporny: We're at the top of the hour, is everyone okay to continue on for about 15 more minutes?
The group decides to extend the call by 15 minutes
Manu Sporny: An object is a node in a directed graph with at least one incoming link.
Dave Longley: we're mixing "link" and "edge", should we pick one?
Thomas Steiner: let's use edge
Dave Longley: +1 edge
Michael Johnson: +1 edge
link is overloaded
Manu Sporny: +1 edge
Bradley Allen: +1
Gregg Kellogg: +1
Manu Sporny: An object is a node in a directed graph with at least one incoming edge.
Thomas Steiner: +1
Manu Sporny: +1
Bradley Allen: +1
Dave Longley: +1
Michael Johnson: +1 for object
Gregg Kellogg: +1
Niklas Lindström: What is a link; the use of a URI to label an edge?
Manu Sporny: no - link is confusing - we are using the word 'edge' instead.
Manu Sporny: link is too overloaded... a graph is 'nodes' and 'edges'
Niklas Lindström: +1 for that (but the question is still valid ;) )
Niklas Lindström: +1 for object
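The two definitions agreed so far (a subject has at least one outgoing edge, an object has at least one incoming edge) can be checked mechanically on a list of edges. The tuple layout and the example.org IRIs are assumptions made for this sketch.

```python
A = "http://example.org/a"
B = "http://example.org/b"
C = "http://example.org/c"
KNOWS = "http://xmlns.com/foaf/0.1/knows"

# Each edge is (source node, property, target node).
EDGES = [
    (A, KNOWS, B),
    (B, KNOWS, C),
]


def subjects(edges):
    """Nodes with at least one outgoing edge."""
    return {source for source, _prop, _target in edges}


def objects(edges):
    """Nodes with at least one incoming edge."""
    return {target for _source, _prop, target in edges}


# A node can play both roles: B has an incoming and an outgoing edge.
both = subjects(EDGES) & objects(EDGES)
```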
Manu Sporny: we could strike 10 and 11
Manu Sporny: i don't think datatyping needs to be part of Linked Data
Bradley Allen: what you can say, though, is that the object can be labeled with an IRI
Gregg Kellogg: if an object is a scalar value such as a date, can it have a label then?
Bradley Allen: for a given json object you could draw the graph in a serialization-neutral way
Bradley Allen: that would allow us to frame the discussion
Manu Sporny: An object may be labeled with an IRI or a literal?
Manu Sporny: An object may be labeled with an IRI or a scalar value?
Manu Sporny: could we say "an object may be labeled with an IRI or a literal"?
Niklas Lindström: literals are said to denote themselves (I interpret that kind of like "they are their names")
Gregg Kellogg: we need an abstract definition of an object first
Bradley Allen: we have that in 9
Gregg Kellogg: an object may be labeled with an IRI
Dave Longley: An object MUST be labeled with an IRI *or* a literal?
Gregg Kellogg: An object MAY be labeled with an IRI *or* a literal?
Gregg Kellogg: An object MAY be labeled with an IRI *or* a literal *or* a 'whatever'?
Gregg Kellogg: An object MAY be labeled with an IRI *or* a literal *or* be unlabeled?
Niklas Lindström: I suggest we base the underlying model on RDF concepts as currently defined. These matters are deep. Consider the ongoing debates of literals in subject position, etc.
Manu Sporny: An object MAY be labeled with an IRI or a literal. A Node may be unlabeled.
Manu Sporny: A Node may be unlabeled.
Manu Sporny: +1
Gregg Kellogg: +1
Dave Longley: +1
Michael Johnson: +1
Bradley Allen: +1
Thomas Steiner: +1 (weak)
Manu Sporny: (+1 for "A node may be unlabeled")
Niklas Lindström: +1 for node may be unlabeled
Manu Sporny: An object MAY be labeled with an IRI or a literal.
Niklas Lindström: .. or BE a literal?
Manu Sporny: We are running out of time for today's call. We've been very productive today, this is good! There is a large group of us that agree on some fundamentals. Thanks everyone for participating in the conversation.
Bradley Allen: thanks to Gregg for compiling this requirements document. Can you put it on github?
Manu Sporny: it's already on github, all of this stuff is
Michael Johnson: thanks all!
Manu Sporny: Thanks to Tom Steiner for scribing. Have a good week, everyone.