JSON-LD Community Group Telecon

Minutes for 2011-11-15

Chair: Gregg Kellogg
Scribe: Markus Lanthaler
Present: Gregg Kellogg, Markus Lanthaler, David I. Lehn, Niklas Lindström
Audio Log
Markus Lanthaler is scribing.

Topic: ISSUE-35: JSON Vocabulary / Data Round-tripping

Gregg Kellogg: Issues with data representation in JavaScript - issues with representing numbers.
Gregg Kellogg: https://github.com/json-ld/json-ld.org/issues/35
Markus Lanthaler: The problem was with the lexical representation of types. Maybe we should resolve on something reasonable based on Turtle syntax?
Markus Lanthaler: http://www.w3.org/TR/turtle/#abbrev
... Decimal floating point double/fixed precision numbers may be written directly and correspond to the XML Schema Datatype xsd:double in both syntax and datatype IRI.
... Decimal floating point arbitrary precision numbers may be written directly and correspond to the XML Schema Datatype xsd:decimal in both syntax and datatype IRI.
David I. Lehn: Isn't the problem here that your implementation may serialize as "1" instead of "1.0"?
Niklas Lindström: If you want to keep precision, you should explicitly coerce to a type.
Niklas Lindström: for example, coercing like this - "age": "xsd:double" - but how would you do these?
Niklas Lindström: "age": 33
Gregg Kellogg: 33.0e1
Niklas Lindström: "33.0"^^xsd:double
Gregg Kellogg: 3.3e1
Gregg Kellogg: {"foo": 3.}
Niklas Lindström: "3"^^xsd:integer
Gregg Kellogg: The problem is that JSON can be ambiguous, and that affects round-tripping. Any coercion that is in place trumps the format. In the absence of coercion, if you have a document that expresses a number, how do you round-trip it? For example, how would "3." be interpreted above?
Gregg Kellogg: Depends on parser...
Niklas Lindström: If you want to preserve the exact lexical representation, you have to do this (which is fine):
Niklas Lindström: "foo": {"@literal": "3.", "@datatype": "xsd:decimal"}
Niklas Lindström: :foo 3
Niklas Lindström: If you would like to have explicit RDF output, you would have to use the above form.
Niklas Lindström: foo: 3
Niklas Lindström: foo: 3.0
Niklas Lindström: Most JSON parsers would give you just the integer, but it would depend on the programming language. We could have some rule in place that says that everything should be normalized.
According to http://jsonlint.com/ that's translated to foo: 3
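The parser dependence discussed above can be seen in a minimal sketch. It assumes Python's standard `json` module as the example parser, which was not part of the discussion. Other environments behave differently (a JavaScript engine, for instance, has a single number type), which is the point being made.

```python
import json

def round_trip(text):
    """Parse a JSON document and serialize it again."""
    return json.dumps(json.loads(text))

# Trailing zeros and exponent notation do not survive the round trip.
print(round_trip('{"foo": 1.10}'))    # {"foo": 1.1}
print(round_trip('{"foo": 3.30e1}'))  # {"foo": 33.0}

# This parser keeps integers and floats apart, so 3 and 3.0 stay distinct.
print(type(json.loads('3')).__name__, type(json.loads('3.0')).__name__)  # int float

# "3." is not a valid JSON number, and this parser rejects it outright.
try:
    json.loads('{"foo": 3.}')
except ValueError as err:
    print("rejected:", err)
```

A document that must keep "1.10" or "3." exactly therefore has to carry the value as a string with an explicit datatype, as in the @literal form above.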
Markus Lanthaler: I think that if we made JSON-LD work like Turtle for automatic typing, it would solve our problems. Right now, we always coerce to a double.
Gregg Kellogg: Do you think this can be done unambiguously? Do you think that we can unambiguously decide between xsd:double and xsd:decimal?
Markus Lanthaler: We currently automatically type to integer, double or boolean.
... Why don't we distinguish between double and decimal?
Niklas Lindström: denormalized JSON | normalized JSON or Turtle short form | explicitly typed Turtle
Niklas Lindström: 3.3e10 | 3.3e1 | "3.3e1"^^xsd:double
Niklas Lindström: 1.10 | 1.1 | "1.1"^^xsd:decimal
Niklas Lindström: 1.0 | 1 | "1"^^xsd:integer
Niklas Lindström: 3.30e1 | 3.3e1 | "3.3e1"^^xsd:double
Markus Lanthaler: 3.3e10 | 3.3e10 | "3.3e10"^^xsd:double
... 3.31e1 | 3.31e1 | "3.31e1"^^xsd:double
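The Turtle-style distinction raised here (an exponent means xsd:double, a decimal point without an exponent means xsd:decimal, otherwise xsd:integer) can be sketched as follows. This is an illustration only: `turtle_datatype` is a hypothetical helper, it looks at the number as written in the JSON text, and it does not attempt the value normalization shown in the table (it does not fold 1.0 to 1).

```python
import re

# JSON number grammar, capturing the optional fraction and exponent parts.
JSON_NUMBER = re.compile(r'-?(?:0|[1-9][0-9]*)(\.[0-9]+)?([eE][+-]?[0-9]+)?')

def turtle_datatype(lexical):
    """Return the datatype Turtle's shorthand rules would give the token."""
    match = JSON_NUMBER.fullmatch(lexical)
    if match is None:
        raise ValueError("not a JSON number: %r" % lexical)
    fraction, exponent = match.groups()
    if exponent is not None:
        return "xsd:double"
    if fraction is not None:
        return "xsd:decimal"
    return "xsd:integer"

for token in ["33", "1.10", "3.3e1", "3.30e1"]:
    print(token, "->", turtle_datatype(token))
```

Note that this only works on the raw text. Once a generic JSON parser has turned the token into a native number, the difference between 1.10 and 1.1e0 is already gone.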
Gregg Kellogg: I suggest that we say how to coerce into integer, double and boolean - and include a warning that if round-tripping is expected, an explicit coercion rule should be included.
David I. Lehn: I'm not sure what we're discussing here! The differences in JSON parsers and serializers are going to cause pain any way you look at it unless you use explicit typing. You are not going to know what the intended value is going to be - you can only support this if you are going to explicitly type things.
Niklas Lindström: Two reasons for having automatic typing - one for normalization, two for conversion to RDF.
Gregg Kellogg: This needs to be in the syntax spec, because that's what authors are going to use. To have a fixed lexical representation, you should have an object with a literal and a datatype.
Niklas Lindström: Is decimal a subset of the value space of double?
Gregg Kellogg: yes, it is.
David I. Lehn: Yes.
Niklas Lindström: If we do normalization on the value, and it has a decimal point, it's always a "double". If it doesn't have a decimal point, then it's an integer. If you are fine with this, then you can use explicit JSON values... if not, you should specify a datatype like xsd:decimal.
Gregg Kellogg: This is what the spec already says. Can we leave it as is for now and re-open the issue if we need to in the future?
Niklas Lindström: Yes, we're looking into whether there is any "baseline" in JSON or if we have to specify it all.
PROPOSAL: No change to specification, current language on auto-coercion is fine
Niklas Lindström: +1
Gregg Kellogg: +1
David I. Lehn: +1
Markus Lanthaler: +1
RESOLUTION: No change to specification, current language on auto-coercion is fine
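The auto-coercion rule left in place by this resolution can be sketched roughly as below. `auto_type` is a hypothetical helper, not spec text. It assumes the value has already been parsed into Python types, and the lexical form it returns for doubles is simply Python's `repr`, not a canonical xsd:double form.

```python
def auto_type(value):
    """Map a parsed JSON value to a (lexical form, datatype) pair."""
    # An explicit object keeps its lexical form untouched.
    if isinstance(value, dict) and "@literal" in value:
        return value["@literal"], value.get("@datatype")
    # bool must be tested before int: in Python, bool is a subclass of int.
    if isinstance(value, bool):
        return ("true" if value else "false"), "xsd:boolean"
    if isinstance(value, int):
        return str(value), "xsd:integer"
    if isinstance(value, float):
        return repr(value), "xsd:double"
    # Strings and other values are outside this sketch.
    raise TypeError("no automatic typing for %r" % (value,))

print(auto_type(33))     # ('33', 'xsd:integer')
print(auto_type(3.3e1))  # ('33.0', 'xsd:double')
print(auto_type(True))   # ('true', 'xsd:boolean')
print(auto_type({"@literal": "3.", "@datatype": "xsd:decimal"}))  # ('3.', 'xsd:decimal')
```

Only the last call round-trips the original text, which is why the explicit form is recommended when the lexical representation matters.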

Topic: ISSUE-40: Merge @coerce with @context

Gregg Kellogg: https://github.com/json-ld/json-ld.org/issues/40
Gregg Kellogg: We had agreed to change @coerce, for example, change to this:
Gregg Kellogg: "@context": {
Gregg Kellogg: "title": "http://purl.org/dc/terms/title",
Gregg Kellogg: "description": "http://purl.org/dc/terms/description",
Gregg Kellogg: "identifier": {"http://purl.org/dc/terms/identifier": "xsd:string"},
Gregg Kellogg: "publisher": {"http://purl.org/dc/terms/publisher": "@iri"},
Gregg Kellogg: "created": {"http://purl.org/dc/terms/created": "xsd:dateTime"},
Gregg Kellogg: "authorList": {"http://purl.org/ontology/bibo/authorList": ["@list", "@iri"]}
Gregg Kellogg: }
Gregg Kellogg: The key is the IRI and the value is the type. An alternative is:
Gregg Kellogg: "authorList": {"http://purl.org/ontology/bibo/authorList": { "@list": "@iri" }}
Gregg Kellogg: "created": { "@iri" : "http://purl.org/dc/terms/created", "@coerce": "xsd:dateTime"}
Gregg Kellogg: Alternative: the value is an object; this says @list entries are of type @iri. I like Markus' form because it's more consistent.
Niklas Lindström: Really prefer the form where you use "@iri" as the key - it's concise.
Niklas Lindström: https://raw.github.com/rinfo/rdl/1c8c6d2/packages/java/rinfo-service/src/main/resources/json-ld/context.json
Niklas Lindström: https://raw.github.com/gist/1340408/context-vocab-array-combined-iri-coerce.json
Gregg Kellogg: @vocab would have to be specified prior to being used in the context (in an outer context). That is, if we use terms in the @context, the active context must be used to expand the terms.
Niklas Lindström: You have to look up the keys in the active context while you're parsing @context.
Gregg Kellogg: The point is when the active context is modified. I think it should be modified before the currently processed context has been fully processed.
Niklas Lindström: Regardless if we merge coerce and prefix definitions or not it can't be processed in one pass
Niklas Lindström: We should discuss: 1) if we merge @coerce into term definitions 2) if @list is in array or object form? 3) How do you parse a list of contexts, how is @vocab handled when in a list?
PROPOSAL: Move coercion rules into the term definitions section of @context
Niklas Lindström: +1
Gregg Kellogg: +1
David I. Lehn: (I'm afraid i haven't put enough thought into this to vote)
Markus Lanthaler: +1
RESOLUTION: Move coercion rules into the term definitions section of @context
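To compare the candidate syntaxes, here is a minimal sketch that reads three of the shapes floated above into one common (IRI, coercion) pair. `parse_term_definition` and its return shape are hypothetical and only meant for illustration. No syntax had been settled at this point.

```python
def parse_term_definition(definition):
    """Return (iri, coercion) for a single term definition."""
    if isinstance(definition, str):
        # "title": "http://purl.org/dc/terms/title"
        return definition, None
    if "@iri" in definition:
        # "created": {"@iri": "...", "@coerce": "xsd:dateTime"}
        return definition["@iri"], definition.get("@coerce")
    if len(definition) != 1:
        raise ValueError("expected exactly one IRI key: %r" % (definition,))
    # "created": {"http://purl.org/dc/terms/created": "xsd:dateTime"}
    iri, coercion = next(iter(definition.items()))
    return iri, coercion

context = {
    "title": "http://purl.org/dc/terms/title",
    "publisher": {"http://purl.org/dc/terms/publisher": "@iri"},
    "created": {"@iri": "http://purl.org/dc/terms/created", "@coerce": "xsd:dateTime"},
}
for term, definition in context.items():
    print(term, parse_term_definition(definition))
```

The single-key form is shorter to write, while the "@iri" form leaves room for further keys on the same term.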
Gregg Kellogg: @vocab would have to come before it's used for stream-based parsers - it wouldn't be used to expand keys in @context.
Niklas Lindström: That also applies to expansion in coerce - SAX-based processors are going to have issues in any case - it's always two-pass.
Gregg Kellogg: Current processing requires us to know the entire @context before processing it.
Niklas Lindström: I found it useful to have many contexts... especially for using groups of terms.
Niklas Lindström: also useful for keeping memory usage low - by processing in chunks. Any given chunk needs to be processed in its entirety.
Gregg Kellogg: Looking at your list of contexts in the example - an array of contexts. At parse time, there is an active context, and parsing the first item in the array would update the active context. Any terms, @vocab or @base defined in an array item would take effect after that item is processed.
Niklas Lindström: I implemented it so that first @vocab is taken to expand values, then term definitions are parsed
Gregg Kellogg: That would mean processing a context would be a multi-pass operation. The first step is to extract any @vocab definition, the second step is to process all prefix to URI definitions using the active context, the third step is to process coercion/datatype mapping.
Niklas Lindström: https://github.com/rinfo/rdl/blob/develop/packages/java/rinfo-base/src/main/groovy/se/lagrummet/rinfo/base/rdf/jsonld/JSONLDContext.groovy
Markus Lanthaler: I don't like the use of @vocab/@base in expanding values in @context
PROPOSAL: Parsing @context is a multi-pass process. First pass sets the term mappings, second pass resolves the @datatypes.
Gregg Kellogg: +1
Niklas Lindström: +1
David I. Lehn: +1
Markus Lanthaler: +1
Niklas Lindström: What about? { "foaf": "foaf:foo"}
{ "a" : "b:c", "b" : "a:c" }
Niklas Lindström: There is still a problem, see above - we don't need to process @vocab in @context.
Gregg Kellogg: First pass, unless "a" and "b" are already defined - they are IRIs.
Niklas Lindström: Datatypes are not an issue, but if we remove @vocab and wanted a shortened form - we would have to define something to look in the key position. It would be fairly complicated...
ACTION: Discuss that prefixes can't be used for expanding URIs within the same context, unless they're part of @datatype portion.
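The two-pass proposal, together with the restriction recorded in the action above, might look like the following sketch. It assumes the "@iri"/"@datatype" object form for term definitions, which was only one of the options under discussion. The names `expand` and `process_context`, and the internal dict layout, are hypothetical.

```python
def expand(value, context):
    """Expand prefix:suffix against context; leave anything else as is."""
    prefix, sep, suffix = value.partition(":")
    if sep and prefix in context:
        return context[prefix]["iri"] + suffix
    return value

def process_context(active, local):
    """Two passes: term mappings first, datatypes second."""
    result = dict(active)
    # Pass 1: map each term to an IRI, expanding only against the
    # context that was active before this one was read.
    for term, definition in local.items():
        iri = definition["@iri"] if isinstance(definition, dict) else definition
        result[term] = {"iri": expand(iri, active), "datatype": None}
    # Pass 2: resolve datatypes, which may use terms set in pass 1.
    for term, definition in local.items():
        if isinstance(definition, dict) and "@datatype" in definition:
            result[term]["datatype"] = expand(definition["@datatype"], result)
    return result

ctx = process_context({}, {
    "xsd": "http://www.w3.org/2001/XMLSchema#",
    "created": {"@iri": "http://purl.org/dc/terms/created", "@datatype": "xsd:dateTime"},
})
print(ctx["created"]["datatype"])  # http://www.w3.org/2001/XMLSchema#dateTime

# The mutually referring example: neither prefix is known in pass 1,
# so both values are kept as they are.
print(process_context({}, {"a": "b:c", "b": "a:c"})["a"]["iri"])  # b:c
```

Because pass 1 never looks at terms from the same context, the order of keys inside a context does not matter and the circular case cannot loop.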
Niklas Lindström: @vocab is a useful tool for context writers - everything that is not defined as a term is resolved against @vocab. We could have @vocab work within the context - you could declare lots of terms more easily.
PROPOSAL: @vocab is resolved prior to term URI expansion within a @context.
Niklas Lindström: +1
Gregg Kellogg: +1
Markus Lanthaler: -1
David I. Lehn: +0
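One possible reading of this proposal, sketched under assumptions: @vocab is pulled out of the context first, and any term value without a colon is then treated as a name relative to it. `apply_vocab` is a hypothetical helper, and the FOAF namespace is used purely as an example vocabulary.

```python
def apply_vocab(local):
    """Resolve @vocab first, then expand bare term values against it."""
    vocab = local.get("@vocab")
    terms = {}
    for term, value in local.items():
        if term == "@vocab":
            continue
        # A value with no colon is treated as a name relative to @vocab.
        if vocab is not None and ":" not in value:
            value = vocab + value
        terms[term] = value
    return terms

terms = apply_vocab({
    "@vocab": "http://xmlns.com/foaf/0.1/",
    "homepage": "homepage",
    "title": "http://purl.org/dc/terms/title",
})
print(terms["homepage"])  # http://xmlns.com/foaf/0.1/homepage
print(terms["title"])     # http://purl.org/dc/terms/title
```

The extra step of extracting @vocab before anything else is what makes this approach multi-pass.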
Niklas Lindström: This is a possibility, right? @context: [{foaf: …, dc: …}, {"title": "dc:title", "homepage": "foaf:homepage"}]
Gregg Kellogg: We shouldn't split "prefix" and "term" - let's not over-complicate anything.
ACTION: Define prefixes required for expansion in context definitions prior to use.
Gregg Kellogg: If we are doing it that way we could also go back to single-pass processing (for datatype expansion)
Gregg Kellogg: this removes the need to do 2-pass @context processing.
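The "define prefixes prior to use" approach can be sketched as a single pass over an array of contexts. This is an illustration, not an agreed algorithm: `process_context_list` is a hypothetical name, and the namespace IRIs for foaf and dc are filled in here only so the example runs.

```python
def expand(value, active):
    """Expand prefix:suffix against the active context, if the prefix is known."""
    prefix, sep, suffix = value.partition(":")
    if sep and prefix in active:
        return active[prefix] + suffix
    return value

def process_context_list(contexts):
    """Fold an array of contexts into one active context, in order."""
    active = {}
    for local in contexts:
        # Expand against what earlier items defined, then commit the
        # whole item at once so it cannot refer to itself.
        expanded = {term: expand(iri, active) for term, iri in local.items()}
        active.update(expanded)
    return active

active = process_context_list([
    {"foaf": "http://xmlns.com/foaf/0.1/", "dc": "http://purl.org/dc/terms/"},
    {"title": "dc:title", "homepage": "foaf:homepage"},
])
print(active["title"])     # http://purl.org/dc/terms/title
print(active["homepage"])  # http://xmlns.com/foaf/0.1/homepage
```

Each array item is read once, and the active context only changes between items, which matches the single-pass goal stated above.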
Gregg Kellogg: Ok, so something like this:
Gregg Kellogg: "created": { "@iri" : "http://purl.org/dc/terms/created", "@coerce": "xsd:dateTime"}
Markus Lanthaler: Other options:
Gregg Kellogg: "created": { "@iri" : "http://purl.org/dc/terms/created", "@type": "xsd:dateTime"}
Gregg Kellogg: "created": { "@iri": "dc:created", "@datatype": "xsd:dateTime"}
Gregg Kellogg: I think "@iri" makes sense - it's consistent.
Niklas Lindström: I favor the compact form - I spend most of my time writing contexts. If you're not using prefixes, it's completely unreadable because they're gigantic.
Niklas Lindström: What about if we do this?
"created": [ "http://purl.org/dc/terms/created", { "@type": "xsd:dateTime"} ]
"created": [ "http://purl.org/dc/terms/created", { "@coerce": "xsd:dateTime"} ]
Gregg Kellogg: This makes it more readable, right? "created": {"dc:created": "xsd:dateTime"}
"created": {"http://purl.org/dc/terms/created": "http://www.w3.org/2001/XMLSchema#dateTime"},
Markus Lanthaler: Could we push this back to the mailing list? This isn't something I'd lose that much sleep over - but I would like to think about this for a while. I'm also concerned about markup like this:
Niklas Lindström: "created": {"@iri": "dc:created", "@datatype": "xsd:dateTime", "@array": "@list"}
Niklas Lindström: "created": {"@iri": "dc:created", "@datatype": "xsd:dateTime", "@list": true}
Niklas Lindström: "created": {"@iri": "dc:created", "@datatype": "xsd:dateTime", "@rev": true, "@set": true}
Niklas Lindström: The @set might be more interesting - seeing some usability issues w/ my developers.
Niklas Lindström: I'll try these changes in my implementation. [scribe assist by Niklas Lindström]
Gregg Kellogg: We'll keep working on ISSUE-40 via the mailing list.