JSON-LD Community Group Telecon

Minutes for 2012-10-02

  1. ISSUE-160: Specify property-generator round-tripping algorithm
  2. Microdata to JSON-LD conversion
  3. JSON-LD in Drupal 8
Chair: Manu Sporny
Scribe: Niklas Lindström
Present: Niklas Lindström, Manu Sporny, Markus Lanthaler, Lin Clark, Gregg Kellogg, David I. Lehn, Stéphane Corlosquet
Audio Log
Niklas Lindström is scribing.
Manu Sporny: Two additions to the agenda: talk about the connect/graphify mechanism
… and the microdata/json to json-ld topic
… we'll add those to the end of the agenda

Topic: ISSUE-160: Specify property-generator round-tripping algorithm

Manu Sporny: had a good discussion with Lin about this recently. We may have a new angle/proposal for this.
… the big problem with this issue is that it creates new data
… when expanding, and re-compacting, it's not clear how to get back to the original data
… the idea that came up was that we could tag the generated data to make clear where it came from
… thus recompaction knows how to *re*compact
… I like this mostly because there's linear time complexity
… downsides are: we're now generating instructions to the processor
… and it is not a general solution in the sense that non-tagged multiple occurrences will not be compacted even if there is a property generator expression for a term in the context
… so the question is: do we want this kind of tagged data?
… If we decide that we want to take this path, we can apply this to the language maps as well (and other forms of syntactic sugar in the future)
Markus Lanthaler: Do we want to include such metadata in the expanded output? Some systems may publish expanded data. If the metadata isn't there, property generators won't work for users who want to use them in their compaction contexts.
Markus Lanthaler: I'm also unsure about the use cases Drupal has. Can type coercion work for that?
Lin Clark: I checked with the community. We cannot rely on gzip-ing for reducing bandwidth
Lin Clark: Regarding type coercions: we don't know if those will have different types.
… The people defining the fields aren't necessarily developers. So this is beyond our control.
Markus Lanthaler: but you could generate a type based on the field names automatically..
Lin Clark: We want to expose the type that people say things are.
Manu Sporny: to clarify; the issue is that two different sites have different kinds of properties that share a common schema.org property?
Markus Lanthaler: This shows how using different types would work: https://bitly.com/U0Qo87
Niklas Lindström: This is kinda like the RDFa vocabulary expansion feature. [scribe assist by Manu Sporny]
Niklas Lindström: I think that people will need this feature in JSON-LD - the property generator stuff. In all of these cases, the important property is the first property in the array. If we include this, we could specify something along those lines. Property generators aren't a general feature, they're a feature for those who cannot use entailment. It's a dirty solution w/ a practical use case. [scribe assist by Manu Sporny]
Niklas Lindström: I think we should clearly mark the metadata; metadata in expanded form would be okay with me if we were clear about the issues. [scribe assist by Manu Sporny]
Gregg Kellogg: my thought was to use the spirit of entailment, in order to define the property generator array to be a list of properties for which the stated term has a subproperty relationship
… so when we expand, we only use the single id of that property
… if we just stick with a single IRI for the @id, but include a property generator array as well
… and we could add a flag that turns on expansion to those properties as well
… I'd rather do that than add pragma data to the expanded output
Gregg Kellogg: on compaction, the extra properties in the generator expression could be removed; unless they are also defined elsewhere..
Manu Sporny: and we could have rules for treating the presence of an IRI in both a generator expression and a regular term as an error
Lin Clark: one of our cases is the ability to expand and then compact again
… although going back to compacted form is more theoretical for us
Lin Clark: a number of the suggestions proposed will probably work for us
Markus Lanthaler: I meant instead of having "term": { "@id": ["http://example.org/vocab#term1", "http://example.org/vocab#term2" ]} in the context
Markus Lanthaler: we could have "term": { "@id": "http://example.org/vocab#term1", "@expandAlsoTo": "http://example.org/vocab#term2" }
David I. Lehn: hmm...
Markus Lanthaler: in compaction we would then just use @id, i.e., expandAlsoTo is completely ignored in compaction
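
A minimal sketch of the "@expandAlsoTo" idea floated above, assuming a simplified context in which every term maps to an object with "@id" and an optional "@expandAlsoTo" (a string or a list). "@expandAlsoTo" is only a keyword proposed in this discussion, not part of any JSON-LD specification, and the function names expand_properties and compact_properties are hypothetical.

```python
def expand_properties(doc, context):
    """Expand term keys to full IRIs; a term with "@expandAlsoTo"
    repeats its value under each additional IRI (assumed behaviour)."""
    expanded = {}
    for key, value in doc.items():
        definition = context.get(key)
        if definition is None:
            expanded[key] = value  # not a known term: keep the key unchanged
            continue
        also = definition.get("@expandAlsoTo", [])
        if isinstance(also, str):
            also = [also]
        for iri in [definition["@id"], *also]:
            expanded[iri] = value
    return expanded


def compact_properties(expanded, context):
    """Compact using "@id" only; "@expandAlsoTo" is ignored, so the
    additional IRIs remain as full IRIs in the output."""
    iri_to_term = {d["@id"]: term for term, d in context.items()}
    return {iri_to_term.get(iri, iri): value for iri, value in expanded.items()}


context = {
    "term": {
        "@id": "http://example.org/vocab#term1",
        "@expandAlsoTo": "http://example.org/vocab#term2",
    }
}
expanded = expand_properties({"term": "foo"}, context)
compacted = compact_properties(expanded, context)
```

Under these assumptions the compacted result still carries the value under http://example.org/vocab#term2, so expanding and re-compacting does not return the original document.
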
Gregg Kellogg: instead of direct expansion, we can output a subProperty assertion for the property
Markus Lanthaler: yeah, that would work for us. we will always have at least one unique IRI for the term [scribe assist by Lin Clark]
Gregg Kellogg: in my proposal, that'd result in term1 rdfs:subPropertyOf term2
Manu Sporny: {"@context": {"compactor": {"@id": "http://bar.com/baz", "@expandAlsoTo": ["http://schema.org/title", "http://example.com/a"]}}, "http://schema.org/title": "foo"}
Markus Lanthaler: @expandAlsoTo is ignored in compaction
Gregg Kellogg: {"@id": "http://bar.com/baz", "rdfs:subPropertyOf": "http://schema.org/title"}
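
A minimal sketch of the subproperty-assertion alternative, assuming the extra properties are listed under the "@expandAlsoTo" key proposed earlier in this discussion (the key and the helper name subproperty_assertions are hypothetical). Instead of repeating values, it emits one rdfs:subPropertyOf statement per additional IRI.

```python
def subproperty_assertions(context):
    """Return one {"@id", "rdfs:subPropertyOf"} node per extra IRI
    listed for a term, leaving the document values untouched."""
    assertions = []
    for definition in context.values():
        also = definition.get("@expandAlsoTo", [])
        if isinstance(also, str):
            also = [also]
        for parent in also:
            assertions.append(
                {"@id": definition["@id"], "rdfs:subPropertyOf": parent}
            )
    return assertions


context = {
    "compactor": {
        "@id": "http://bar.com/baz",
        "@expandAlsoTo": ["http://schema.org/title", "http://example.com/a"],
    }
}
assertions = subproperty_assertions(context)
```

A consumer would then have to reason over these assertions to see the value under http://schema.org/title.
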
Niklas Lindström: what are the consumer demands here?
Manu Sporny: do the consumers expect to see schema.org/title in the expanded form here?
… or is the need just for semantic annotation
Gregg Kellogg: not really reinventing. It's more like what we do in RDFa.
Manu Sporny: the downside is that you need to reason on the graph.
Lin Clark: we want to communicate with multiple different consumers using different vocabularies
… and also e.g. content staging...
… we want to use JSON-LD instead of RDF but with the full IRIs
Gregg Kellogg: so then we need to expand this
Markus Lanthaler: do we need to be able to undo that?
Gregg Kellogg: if we don't perform generator expansion we don't need to undo anything
… if you perform expansion you shouldn't expect to be able to undo it
Niklas Lindström: I may have gotten a clear picture from what Gregg said... we have to consider the entire use case... when do we need to throw away the generated data? If we have property generators, on expansion, do they always generate the extra data (or do you need a flag to do that?) [scribe assist by Manu Sporny]
Niklas Lindström: When you use compaction on expanded data, do you throw away something? Any full IRI that is used only in a property generator? Other alternative is to have the pragma directive. [scribe assist by Manu Sporny]
Niklas Lindström: Actually see if all of the statements are the same? That is, use the heavily computation-intensive mechanism. [scribe assist by Manu Sporny]
Manu Sporny: if we do have a flag, we have the roundtripping issue
Manu Sporny: i think we need to support roundtripping
… I don't want that to be computationally expensive. If we have pragmas we can do it cheaply.
Manu Sporny: throwing away IRIs within property generators is ambiguous because the data may not have come from expansion.
Markus Lanthaler: that's *the* question, I would say, Niklas
Gregg Kellogg: when expanding, if we have property generators, bnodes *must* have node IDs since the expanded properties link to the *same* bnode
… and thus we can compact and compare values
Manu Sporny: assigning bnodes requires graph normalization
Manu Sporny: a Drupal site might export and re-import via expanded form
Markus Lanthaler: but that system can ignore data not relevant for it
Lin Clark: the direction of the api is unclear for me
… if the idea is that you will round-trip, we want to be able to use it
Niklas Lindström: {ctx: {term: {id: [a, b]}}, term: {label: 'unknown'}} expands to [{a: {@id: "_:genid-1"}, b: {@id: "_:genid-1"}}, {@id: "_:genid-1", label: 'unknown'}]
Markus Lanthaler: { "@context": { "property": "http://schema.org/foo" } }
Manu Sporny: What I'm proposing is something like this - {"@value": "foo", "@processor": "drop-when-compacting"}, not that terrible, I don't think. Problem is that re-compacting with a different context could lose data.
Markus Lanthaler: compacting with processor pragmas and a new context seems problematic
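
A minimal sketch of the pragma approach, assuming that the first IRI of a property generator keeps the plain value and every other IRI gets a value tagged with "@processor": "drop-when-compacting". The tag comes from the example above; the choice of the first IRI as the untagged one, the generators mapping, and both function names are assumptions made for illustration.

```python
DROP = "drop-when-compacting"


def expand_with_pragmas(doc, generators):
    """generators maps a term to its list of IRIs. Values generated for
    all but the first IRI are tagged so compaction can drop them."""
    expanded = {}
    for term, value in doc.items():
        iris = generators.get(term)
        if iris is None:
            expanded.setdefault(term, []).append({"@value": value})
            continue
        first, *rest = iris
        expanded.setdefault(first, []).append({"@value": value})
        for iri in rest:
            expanded.setdefault(iri, []).append(
                {"@value": value, "@processor": DROP}
            )
    return expanded


def compact_with_pragmas(expanded, generators):
    """Single pass: drop tagged values, then map first IRIs back to terms."""
    iri_to_term = {iris[0]: term for term, iris in generators.items()}
    compacted = {}
    for iri, values in expanded.items():
        kept = [v["@value"] for v in values if v.get("@processor") != DROP]
        if not kept:
            continue
        compacted[iri_to_term.get(iri, iri)] = kept[0] if len(kept) == 1 else kept
    return compacted


generators = {"title": ["http://bar.com/baz", "http://schema.org/title"]}
expanded = expand_with_pragmas({"title": "foo"}, generators)
round_trip = compact_with_pragmas(expanded, generators)
other_context = compact_with_pragmas(expanded, {})
```

With the original generators the round trip restores {"title": "foo"} in one pass over the data. Compacting with a different context (here an empty one) drops the tagged http://schema.org/title value entirely, which is the data-loss concern raised above.
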
Manu Sporny: can we create a bnode mechanism that is easily identified?
Gregg Kellogg: when expanding a property generator term, the values are repeated for all IRIs associated with the generator. If a value is a bnode without @id, an @id must be generated for it so that the same bnode can be referenced more than once.
… when we compact, upon checking a term match for a term using property generators, we must compare the values to see if we have the same value. If all are the same, use the property generator term.
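
A minimal sketch of the value-comparison approach described above, assuming a flat document and a generators mapping from term to list of IRIs. The "_:genid-N" label format follows the example given earlier in the call; the function names and the generators structure are hypothetical.

```python
import itertools


def expand(doc, generators, counter=None):
    """Repeat each generator term's value under all of its IRIs. A node
    value without "@id" gets a generated blank node identifier so every
    copy refers to the same bnode."""
    if counter is None:
        counter = itertools.count(1)
    expanded = {}
    for key, value in doc.items():
        iris = generators.get(key, [key])
        if isinstance(value, dict) and "@id" not in value:
            value = {"@id": f"_:genid-{next(counter)}", **value}
        for iri in iris:
            expanded[iri] = value
    return expanded


def compact(expanded, generators):
    """Use the generator term only if every one of its IRIs is present
    and all of them carry the same value; otherwise leave the IRIs as-is."""
    compacted = dict(expanded)
    for term, iris in generators.items():
        values = [expanded.get(iri) for iri in iris]
        if values[0] is not None and all(v == values[0] for v in values):
            for iri in iris:
                compacted.pop(iri, None)
            compacted[term] = values[0]
    return compacted


generators = {"term": ["http://example.org/a", "http://example.org/b"]}
expanded = expand({"term": {"label": "unknown"}}, generators)
compacted = compact(expanded, generators)
unchanged = compact({"http://example.org/a": "x", "http://example.org/b": "y"}, generators)
```

In this sketch the generated "@id" survives compaction, and identical values that were never produced by a generator would also be folded into the term, which is the ambiguity raised earlier in the discussion.
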
Niklas Lindström: Can we just say that we use the node definition for the first item in the array, and references for the other items in the array. [scribe assist by Manu Sporny]
Markus Lanthaler: if you get an expanded document where IRIs are used in non-normalized node definitions, would a property generator matching algorithm miss something?
Gregg Kellogg: I'll write down a proposal
Manu Sporny: so backing up, do we still want to support property generators? Lin and I think they're important.
Niklas Lindström: +1 with some faith
Manu Sporny: we have two approaches, both *may* destroy data in edge cases upon compaction
Markus Lanthaler: still, do we really have to eliminate the data?
… it would make it much, much simpler
Gregg Kellogg: then we wouldn't roundtrip
Manu Sporny: Okay, so we have consensus that we want to support property generators and we want to support round-tripping. There are concerns about injecting pragma directives into expanded form.

Topic: Microdata to JSON-LD conversion

Gregg Kellogg: we've discussed off-list about the possibility to take the json generated by a microdata parser and treat it as json-ld
… the crux is that md-json contains a "properties" term which means nothing; so we'd need to "fold that upwards"
… we'd need something like a "properties": {@container: "@fold"} to be able to do this.
Gregg Kellogg: Example of microdata+json: http://tinyurl.com/9n45rs5
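
A minimal sketch of what "folding upwards" could mean for microdata JSON, assuming the usual item shape with "type", optional "id", and a "properties" object whose values are arrays. The sample item is made up for illustration, fold_properties is a hypothetical helper, and "@fold" is only a proposal from this discussion.

```python
def fold_properties(item):
    """Return a copy of a microdata JSON structure in which the entries
    of every "properties" object are moved up into the enclosing item."""
    if isinstance(item, list):
        return [fold_properties(v) for v in item]
    if not isinstance(item, dict):
        return item
    folded = {}
    for key, value in item.items():
        if key == "properties" and isinstance(value, dict):
            # Lift each property into the parent, folding nested items too.
            for name, inner in value.items():
                folded[name] = fold_properties(inner)
        else:
            folded[key] = fold_properties(value)
    return folded


item = {
    "type": ["http://schema.org/Person"],
    "properties": {
        "name": ["Jane"],
        "address": [
            {
                "type": ["http://schema.org/PostalAddress"],
                "properties": {"addressLocality": ["Oslo"]},
            }
        ],
    },
}
folded = fold_properties(item)
```

The sketch does not handle a property whose own name is "type", "id", or "properties"; a real mechanism would need a rule for such collisions.
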
Niklas Lindström: We have seen this requirement before, the ability to keep processing if a particular key is found, but has no semantic meaning.
Manu Sporny: the further we delay 1.0 to accommodate these new features, the higher the risk is that we lose current adopters of the api. I'm very concerned about this, I'd rather that we add features like this in extension specs and see what the market uptake for it is.
Gregg Kellogg: I'm concerned about that as well.
Manu Sporny: We could just add an API method - .fromMicrodata() - it's not declarative (which is bad), but it could be done in parallel to JSON-LD 1.0 going to REC. [scribe assist by Manu Sporny]

Topic: JSON-LD in Drupal 8

Markus Lanthaler: Just saw Lin's post about integrating JSON-LD into Drupal 8: http://groups.drupal.org/node/258778
Niklas Lindström: Great!
Stéphane Corlosquet: yeah, we even managed to get a bit of funding to help with the integration of JSON-LD in Drupal; hopefully that will speed things up.