JSON-LD Community Group Telecon

Minutes for 2011-11-01

Chair: Manu Sporny
Scribe: Thomas Steiner
Present: Thomas Steiner, Manu Sporny, Markus Lanthaler, Gregg Kellogg, Niklas Lindström, David I. Lehn
Audio Log
Thomas Steiner is scribing.

Topic: ISSUE-37: Clarify prefix expansion

Manu Sporny: https://github.com/json-ld/json-ld.org/issues/37
Manu Sporny: This has been mostly discussed via e-mail
Manu Sporny: You split on a colon and attempt to expand the first item returned based on entries in the @context.
Markus Lanthaler: nothing else needs to be added to the spec?
Manu Sporny: only thing we need to do: make sure we add some text to the spec that details this
Gregg Kellogg: I'll take that action
ACTION: Gregg to add language to the JSON-LD spec, clarifying prefix expansion.
Manu Sporny: Anything else we need to clarify?
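
A minimal sketch of the expansion rule described above: split on the first colon and try to expand the first part against the @context. The function name `expand_term` and the sample context entries are illustrative assumptions, not text from the spec.

```python
def expand_term(value, context):
    """Expand a term or prefix:suffix value using a context mapping.

    `context` is assumed to be a plain dict of term/prefix -> IRI.
    """
    # Split on the first colon only; with no colon, sep and suffix are empty.
    prefix, sep, suffix = value.partition(":")
    if prefix in context:
        # A bare term expands to its IRI; a prefix gets the suffix appended.
        return context[prefix] + suffix
    # Nothing matched in the context, so the value is left untouched.
    return value


context = {
    "foaf": "http://xmlns.com/foaf/0.1/",
    "name": "http://xmlns.com/foaf/0.1/name",
}
print(expand_term("foaf:homepage", context))
print(expand_term("name", context))
print(expand_term("http://example.org/x", context))
```

Because only entries present in the context are expanded, a value such as "http://example.org/x" passes through unchanged unless someone defines "http" as a prefix.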

Topic: ISSUE-38: Prefix location clarification

Manu Sporny: https://github.com/json-ld/json-ld.org/issues/38
Markus Lanthaler: most of this has been addressed, but needs to be in the syntax doc
Manu Sporny: slight bit of miscommunication on the list
Manu Sporny: not clear in the current spec where prefixes are allowed
Manu Sporny: answer is: anywhere there is an IRI
Markus Lanthaler: except for in the top-level of the @context
Manu Sporny: correct
Manu Sporny: not allowed in the top level of the @context
Gregg Kellogg: Having CURIEs in the @context is the topic of current discussion
Manu Sporny: Gregg, are you ok with that?
Gregg Kellogg: we use curies in the @coerce section
Gregg Kellogg: in the prefixes, IRIs must be spelled out
Gregg Kellogg: Niklas might have different ideas
Niklas Lindström: no, I agree
Niklas Lindström: I would be open to allow curies everywhere in @context
Manu Sporny: the only objection would be that we don't want too many ways for people to hang themselves
Manu Sporny: we don't want to assume that someone loads the entire doc and processes it. we rather expect people to treat it as a stream in some cases.
Manu Sporny: prefixes everywhere make it more complicated w/o adding great benefits
Niklas Lindström: I understand, although, w/ coerce I feel it might not be possible to "stream-parse" it
Niklas Lindström: I feel it's too complex to parse at once
Manu Sporny: the processors would end up storing the key value pairs
Manu Sporny: in the worst case you'd have to process the entire doc
Manu Sporny: we don't want to lose that simplicity right now
Niklas Lindström: will people use the context to create full RDF?
Manu Sporny: we don't know - good question
Manu Sporny: maybe not full RDF, but they will process it in some way that is stream based
Manu Sporny: maybe they are extracting some data that requires IRIs
Niklas Lindström: I can see that, we need to weigh that vs. the option to use IRIs on the right hand side
Niklas Lindström: it's quite difficult in certain implementations
Manu Sporny: you could still stream it if you had control over the publishing-side
Niklas Lindström: you could put the unresolvable things in a queue... wouldn't know if things are unresolvable unless you have some heuristics
Niklas Lindström: if you put the definitions of the prefixes first, implying you have control over the order, then resolving against already resolved prefixes wouldn't be a problem
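
The queueing idea could be sketched roughly as follows. This is a hypothetical illustration only: the function `resolve_prefixes`, the "//" heuristic for spotting absolute IRIs, and the sample definitions are all assumptions, and the group did not adopt CURIEs on the right-hand side in this discussion.

```python
from collections import deque


def resolve_prefixes(definitions):
    """Resolve prefix definitions whose values may use prefix:suffix form.

    `definitions` is an ordered list of (name, value) pairs, in the order a
    streaming parser would see them. Returns (resolved, still_pending).
    """
    resolved = {}
    pending = deque()

    def try_resolve(name, value):
        head, sep, tail = value.partition(":")
        if tail.startswith("//"):
            # Heuristic (assumption): "scheme://..." is an absolute IRI.
            resolved[name] = value
            return True
        if sep and head in resolved:
            resolved[name] = resolved[head] + tail
            return True
        return False

    for name, value in definitions:
        if not try_resolve(name, value):
            # Prefix not known yet: park the entry and retry later.
            pending.append((name, value))
            continue
        # A new prefix is known, so retry queued entries until nothing changes.
        progress = True
        while progress and pending:
            progress = False
            for _ in range(len(pending)):
                queued = pending.popleft()
                if try_resolve(*queued):
                    progress = True
                else:
                    pending.append(queued)
    return resolved, list(pending)


# The CURIE arrives before its prefix definition, so it waits in the queue.
resolved, pending = resolve_prefixes([
    ("date", "dc:date"),
    ("dc", "http://purl.org/dc/terms/"),
])
print(resolved["date"], pending)
```

If the publisher puts the prefix definitions first, the queue stays empty and every entry resolves as soon as it is read.
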
Manu Sporny: is it useful to have CURIEs on the right hand side then?
Manu Sporny: typically prefix definitions use completely different URIs
Manu Sporny: same for schema.org
Gregg Kellogg: If we do "prefix": {"@iri": "…", "@coerce": "xsd:date"}
Niklas Lindström: I use like four or five different vocabs in my use case
Niklas Lindström: I would like to make use of more compact prefixes
Manu Sporny: is that a very strong need?
Niklas Lindström: I could do without it
Niklas Lindström: one thing: showed context to a colleague w/ little RDF knowledge
Niklas Lindström: used the json view plugin
Niklas Lindström: made content navigation very nice
Manu Sporny: would that be a counter argument?
Niklas Lindström: the real counter argument is that it's less complex
Niklas Lindström: I would like to discuss it further
Niklas Lindström: it would make things for me very much more compact
Manu Sporny: let's put back to the mailing list for now
Manu Sporny: I could live w/ either
Manu Sporny: pro: makes readability better
Manu Sporny: con: doesn't enable any new technical use cases
Manu Sporny: that's a semi weak counter argument, but that's the most convincing one for me.
Gregg Kellogg: I can see the usefulness
Gregg Kellogg: an alternative way to do it would be to allow the absence of an IRI and have the prefix to be inferred
Gregg Kellogg: given we can have multiple contexts, we could have the IRIs be inferred
Niklas Lindström: hey, that's really interesting
Niklas Lindström: using vocab for that is quite useful
Manu Sporny: general consensus is: it's interesting - don't put it in yet, discuss further on mailing list.
Manu Sporny: we can always leave it out in the first version, and add it later - it's a forwards-compatible change
Manu Sporny: no time pressure to decide on this
Manu Sporny: we can just let people do implementations, and wait
Niklas Lindström: I agree
Manu Sporny: let's move on, then
Niklas Lindström: we might have to rediscuss where CURIEs are allowed
Manu Sporny: looking back at the issue-38
Manu Sporny: people have to be careful not to expand common prefixes like ftp:, or http:
Gregg Kellogg: If a prefix is defined and the key/value is expanded, it is also determined to be an IRI
Manu Sporny: if there is a term specified, we check it in the prefix map
Manu Sporny: doing anything more than that complicates the rules I think
Manu Sporny: any feelings?
Niklas Lindström: I used to have troubles, but I think now it is a good way to go
Niklas Lindström: one question regarding terms vs. prefixes
Manu Sporny: they are the same thing
Manu Sporny: if there is a colon, you take the bit before the colon and expand
Markus Lanthaler: http://json-ld.org/spec/latest/json-ld-api/#iri-expansion is now clear when terms/prefixes are expanded
Manu Sporny: you have to say how you expand
Niklas Lindström: TermOrCURIEorIRI
Gregg Kellogg: I think in my recent revision I defined prefixes and terms as just "terms"
Gregg Kellogg: when you always divide on colon, and take the first part of that - you just try expanding that and that's all you have to do.
Manu Sporny: ok, that makes perfect sense
Manu Sporny: anything that simplifies is great
Niklas Lindström: is this TermOrCURIEorIRI rule also implied for @coerce: @iri? I believe it is - yes it is.
Manu Sporny: Niklas wanted to allow prefixes in @coerce
Markus Lanthaler: by the data section I meant the main content of the document
Manu Sporny: whenever the property is supposed to be an IRI, you do term processing
Markus Lanthaler: do we say somewhere that terms can't contain colons?
Gregg Kellogg: I think we say that they are NCNAMEs
Manu Sporny: not sure
Manu Sporny: that's a corner case
Niklas Lindström: the way this works has to be aligned with how people should use @base
Niklas Lindström: './whatever' instead of 'whatever' - otherwise, you could accidentally expand what was meant to be a relative IRI as a term.
ACTION: Gregg to define term (and other 'terms') more formally within syntax spec.
Manu Sporny: I don't know how I feel about that. if you're using base, you can shoot yourself in the foot
Niklas Lindström: or something "./curie:like"
Manu Sporny: wondering if @base should be expanded first, no, that doesn't work... you have to do term processing first
Niklas Lindström: colons are allowed in segments of IRIs
Niklas Lindström: if you use @base, you have to know what you're doing
Manu Sporny: might be a best practice to use dot slash
ACTION: Gregg to add as best practice, relative IRIs should begin with "./" or "#" or "/"
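
The pitfall behind this best practice can be illustrated with a small sketch, assuming term processing runs first and @base resolution is only a fallback. The function name `expand_value`, the sample context, and the base URL are assumptions made for illustration.

```python
from urllib.parse import urljoin


def expand_value(value, context, base):
    """Try term/prefix expansion first, then fall back to @base resolution."""
    head, sep, tail = value.partition(":")
    if head in context:
        return context[head] + tail
    # No matching term or prefix: resolve as a relative IRI against the base.
    return urljoin(base, value)


context = {"name": "http://xmlns.com/foaf/0.1/name"}
base = "http://example.com/docs/"

# "name" collides with a term, so it is NOT treated as a relative IRI.
print(expand_value("name", context, base))
# "./name" cannot match a term, so it resolves against the base as intended.
print(expand_value("./name", context, base))
```

Values starting with "./", "#" or "/" can never equal a term defined in the context, which is why the prefix keeps relative IRIs out of term processing.
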

Topic: ISSUE-35: JSON Vocabulary / Data Round-tripping

Manu Sporny: https://github.com/json-ld/json-ld.org/issues/35
Manu Sporny: Markus suggested that we create a JSON-LD vocabulary
Manu Sporny: the reason presented would be that we could do roundtripping
Manu Sporny: we do this so that the data roundtrips well from serializing to deserializing
Manu Sporny: for converting to native types
Manu Sporny: doubles will almost always be lossy because of the way JSON parsers are implemented
Manu Sporny: the double that you publish will be different than the double in C/C++
Manu Sporny: has to do with what happens with the number when you process the JSON doc
Niklas Lindström: I think I follow
Niklas Lindström: if you use coerce rules, then the context should ensure that
Manu Sporny: the translation is not guaranteed to happen safely
Manu Sporny: even if we create a JSON vocab, it doesn't address the issue of data round-tripping
Markus Lanthaler: but it moves it to the implementations - away from the spec
Markus Lanthaler: a boolean could be 1, and 0
Markus Lanthaler: I don't expect JSON developers to look at the xml XSD schema
Manu Sporny: let's assume we publish jsonld:number, no one else besides us is gonna use it
Manu Sporny: we would create a parallel data space to xsd:integer and xsd:double and xsd:decimal
Manu Sporny: that the range of a type is bigger, does not mean we can't use it
Manu Sporny: everything that JSON can express fits in XSD value space
Manu Sporny: Markus, do any of these arguments convince you?
Markus Lanthaler: two double values might be different when compared in a triple store
Markus Lanthaler: if you retrieve a JSON-LD doc from somewhere, then store it, then retrieve another one, then re-retrieve the previously saved one, then the values might not be equal
Manu Sporny: I disagree
Niklas Lindström: when we convert to JSON, we try to use the most convenient JSON view
Niklas Lindström: is that the problem?
Manu Sporny: correct
Manu Sporny: there are 2 parallel issues: one is losing the lexical space, the other is native language data representation for doubles
David I. Lehn: handling of native doubles in JSON is just going to be potentially lossy due to JSON implementation details. not much we can do about that. handling strings coerced as xsd:double would be an alternative if you require more strict behavior.
Manu Sporny: ISO specifies how to convert doubles, I believe
David I. Lehn: I can't remember why we picked %1.6e. I think I might have picked that for no particular reason. I forget though. :)
Manu Sporny: none of the other data types are lossy, except for doubles
Manu Sporny: if we follow the native ISO spec, then we are fine because it specifies string representation for a double value - I think.
Manu Sporny: If ISO doesn't specify it, we could use the native JSON representation and add a warning to the spec that the conversion may be lossy
Niklas Lindström: wanted to agree that this is excellent handling of the problem
Niklas Lindström: if they want exactness, they can use the string representation w/ @datatype
David I. Lehn: people are going to get bitten by this so documentation is our only hope. I see people doing things like using native doubles and signing a serialized stream and it fails between implementations.
Markus Lanthaler: when someone creates a json-ld doc, the double doesn't have to be in ISO format
Manu Sporny: they can use whatever native json way
Markus Lanthaler: if I want to compare, I have to normalize first
Niklas Lindström: .. So, something like this? .. real json numbers are passed into and out of json accompanied by @datatype (or @coerce) *and* expressed lexically according to the JSON-LD datatype lexical canonicalization
Manu Sporny: that's correct
Manu Sporny: also, you should never do double equality comparisons - recipe for disaster
Manu Sporny: if you want to be on the safe side, use strings w/ @datatype
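
A small sketch of why native doubles may not round-trip, using the "%1.6e" pattern mentioned earlier in the discussion. The sample value is invented, and the "@literal" key used to hold the lexical form is an assumption about the syntax; only @datatype is named in these minutes.

```python
import json

# A publisher emits a native JSON number.
published = '{"price": 0.1234567891}'
value = json.loads(published)["price"]

# Canonicalizing with "%1.6e" keeps only six digits after the decimal point.
canonical = "%1.6e" % value
print(canonical)
print(float(canonical) == value)  # the original digits are gone

# Keeping the value as a string with an explicit datatype preserves the
# lexical form exactly ("@literal" is an assumed key name).
typed = json.loads(
    '{"price": {"@literal": "0.1234567891", "@datatype": "xsd:double"}}'
)
print(typed["price"]["@literal"])
```

The string form survives any number of parse and serialize cycles unchanged, which matters for use cases such as signing a serialized document.
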
Manu Sporny: in the spec, does it say decimal or double?
Markus Lanthaler: it says "number"
Markus Lanthaler: http://www.ietf.org/rfc/rfc4627
Manu Sporny: xsd:double is the only thing we can do automatic typing on
Manu Sporny: the machine-level representation will almost always be a double
Manu Sporny: anything that has a dot or an 'e' (exponent) in it, will be a double in json-ld when you normalize
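
The typing rule just described could look roughly like this. The function name `infer_number_datatype` is hypothetical, and mapping the remaining numbers to xsd:integer is an assumption that the discussion does not state explicitly.

```python
def infer_number_datatype(token):
    """Guess an XSD datatype from the lexical form of a JSON number."""
    # A dot or an exponent marker means the number is treated as a double.
    if "." in token or "e" in token.lower():
        return "xsd:double"
    # Assumption: every other JSON number is treated as an integer.
    return "xsd:integer"


for token in ["1.0", "5e3", "2E-2", "42"]:
    print(token, infer_number_datatype(token))
```

The check works on the lexical token rather than the parsed value, since some JSON parsers collapse "1.0" and "1" into the same native number.
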
Niklas Lindström: http://www.w3.org/TR/turtle/#abbrev
Niklas Lindström: to sum up my last bit: 1) ensure we know what xsd datatype (different) json numbers are automatically represented as, 2) define a canonical lexical representation for each xsd number type
Niklas Lindström: #2 will be used both for automatically interpreted json numbers, and for ones explicitly cast by either @datatype or @coerce