JSON-LD - Linked Data Expression in JSON

1. Introduction
- 1.1 How to Read this Document
- 1.2 Contributing
2. Design Goals and Rationale
3. Markup Examples
4. Markup of RDF Concepts
5. Advanced Features
- 5.1 Automatic Typing
- 5.2 Type Coercion
6. The JSON-LD Processing Algorithm
7. Sequence
8. Best Practices
- 8.1 JavaScript
- 8.2 Schema-less Databases
9. Advanced Concepts
A. The Default Context
B. Acknowledgements
C. References
- C.1 Normative references
- C.2 Informative references

1. Introduction

JSON-LD is designed as a light-weight syntax that can be used to express Linked Data. It is primarily intended to be a way to express Linked Data in JavaScript and other Web-based programming environments. It is also useful when building interoperable Web Services and when storing Linked Data in JSON-based document storage engines. It is practical and designed to be as simple as possible, utilizing the large number of JSON parsers and existing code that is in use today. It is designed to be able to express key-value pairs, RDF data, Microformats data, and Microdata. That is, it supports every major Web-based structured data model in use today. It does not require anyone to change their JSON; instead, it lets them add meaning by supplying context in a way that is out-of-band. The syntax is designed not to disturb already deployed systems running on JSON, but to provide a smooth migration path from JSON to JSON with added semantics. Finally, the format is intended to be fast to parse and fast to generate, to be compatible with both stream-based and document-based processing, and to require a very small memory footprint in order to operate.

JSON, as specified in [RFC4627], is a simple language for representing objects on the web. Linked Data is a way of describing content across different documents, or web resources. Web resources are described using IRIs, and typically are dereferenceable entities that may be used to find more information, creating a "web of knowledge". JSON-LD is intended to be a simple publishing method for expressing Linked Data in JSON.

1.1 How to Read this Document

This document is a detailed specification for a serialization of Linked Data in JSON. The document is primarily intended for the following audiences:

Web developers that want to understand the design decisions and language syntax for JSON-LD.
Software developers that want to encode Microformats, RDFa, or Microdata in a way that is cross-language compatible via JSON.
Software developers that want to write processors for JSON-LD.

To understand this specification you must first be familiar with JSON, which is detailed in [RFC4627], and RDF, as described in [RDF-CONCEPTS].

1.2 Contributing

There are a number of ways that one may participate in the development of this specification:

All comments and discussion take place on the public mailing list: [email protected]
Specification bugs and issues should be reported in the issue tracker.
Source code for the specification can be found on GitHub.
The #json-ld IRC channel is available for real-time discussion on irc.freenode.net.

2. Design Goals and Rationale

The following section outlines the design goals and rationale behind the JSON-LD markup language.

2.1 Goals

A number of design considerations were explored during the creation of this markup language:

Simplicity: Developers don't need to know RDF in order to use the basic functionality provided by JSON-LD.
Compatibility: The JSON-LD markup should be 100% compatible with JSON.
Expressiveness: All major RDF concepts must be expressible via the JSON-LD syntax.
Terseness: The JSON-LD syntax must be very terse and human readable.
Zero Edits, most of the time: JSON-LD provides a mechanism that allows developers to specify context in a way that is out-of-band. This allows organizations that have already deployed large JSON-based infrastructure to add meaning to their JSON in a way that is not disruptive to their day-to-day operations and is transparent to their current customers. At times, mapping JSON to RDF can become difficult - in these instances, rather than having JSON-LD support esoteric markup, we chose not to support the use case and support a simplified syntax instead. So, while we strive for Zero Edits, it was not always possible without adding great complexity to the language.
Streaming: The format supports both document-based and stream-based processing.

2.2 Map Terms to IRIs

An Internationalized Resource Identifier (IRI), as described in [RFC3987], is a mechanism for representing unique identifiers on the web. In Linked Data, IRIs (or URI references) are commonly used for describing entities and properties.

Establishing a mechanism to map JSON values to IRIs will help in the mapping of JSON objects to RDF. This does not mean that JSON-LD must be restrictive in declaring a set of terms, rather, experimentation and innovation should be supported as part of the core design of JSON-LD. There are, however, a number of very small design criteria that can ensure that developers will generate good RDF data that will create value for the greater semantic web community and JSON/REST-based Web Services community.

We will be using the following JSON object as the example for this section:

{
  "a": "Person",
  "name": "Manu Sporny",
  "homepage": "http://manu.sporny.org/",
  "avatar": "http://twitter.com/account/profile_image/manusporny"
}

2.3 The JSON-LD Context

A context is used to allow developers to use aliases for IRIs. The semantic web, just like the document-based web, uses IRIs for unambiguous identification. The idea is that these terms mean something, which you will eventually want to query. A context allows the expression of a number of terms which map directly to IRIs. For example, the term name may map directly to the IRI http://xmlns.com/foaf/0.1/name. This allows JSON-LD documents to be constructed using common JSON syntax of using simple name/value pairs.

To reduce the number of different terms that must be defined, JSON-LD also allows terms to be used to expand Compact URIs (CURIEs). The semantic web specifies this via Vocabulary Documents, in which a prefix is associated with a document, and a suffix is used to create an IRI based on this vocabulary. For example, the IRI http://xmlns.com/foaf/0.1/ specifies a Vocabulary Document, and name is a term in that vocabulary. Join the two items together and you have an unambiguous identifier for a vocabulary term. The Compact URI Expression, or short-form, is foaf:name and the expanded-form is http://xmlns.com/foaf/0.1/name. This vocabulary term identifies the name of something, for example, a person's name.

Developers and machines would be able to use this IRI (plugging it directly into a web browser, for instance) to go to the term and get a definition of what the term means, much like we can use WordNet today to see the definition of words in the English language. Machines need the same sort of dictionary of terms, and URIs provide a way to ensure that these terms are unambiguous.

The context provides a collection of vocabulary terms that can be used for a JSON object.

2.4 Unambiguous Identifiers for JSON

If a set of terms, like Person, name, and homepage, are defined in a context, and that context is used to resolve the names in JSON objects, machines could automatically expand the terms to something meaningful and unambiguous, like this:

{
  "http://www.w3.org/1999/02/22-rdf-syntax-ns#type": "http://xmlns.com/foaf/0.1/Person",
  "http://xmlns.com/foaf/0.1/name": "Manu Sporny",
  "http://xmlns.com/foaf/0.1/homepage": "http://manu.sporny.org/",
  "http://rdfs.org/sioc/ns#avatar": "http://twitter.com/account/profile_image/manusporny"
}

Doing this would mean that JSON would start to become unambiguously machine-readable, play well with the semantic web, and basic markup wouldn't be that much more complex than basic JSON markup. A win, all around.
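The expansion described in this section can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the normative algorithm: the function name expand_terms and the CONTEXT dictionary are hypothetical, and the IRIs chosen for a and avatar simply follow the expanded example in this section.

```python
# Hypothetical context for this sketch: each short term maps to a full IRI.
CONTEXT = {
    "a": "http://www.w3.org/1999/02/22-rdf-syntax-ns#type",
    "Person": "http://xmlns.com/foaf/0.1/Person",
    "name": "http://xmlns.com/foaf/0.1/name",
    "homepage": "http://xmlns.com/foaf/0.1/homepage",
    "avatar": "http://rdfs.org/sioc/ns#avatar",
}


def expand_terms(obj, context):
    """Return a copy of obj with every known term in key position replaced by its IRI."""
    expanded = {}
    for key, value in obj.items():
        # The value of the type key "a" names a class, so it is looked up as well.
        if key == "a":
            value = context.get(value, value)
        expanded[context.get(key, key)] = value
    return expanded


person = {
    "a": "Person",
    "name": "Manu Sporny",
    "homepage": "http://manu.sporny.org/",
    "avatar": "http://twitter.com/account/profile_image/manusporny",
}
expanded = expand_terms(person, CONTEXT)
```

Keys that are not present in the context are left untouched in this sketch, so plain JSON passes through unchanged.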

2.5 Mashing Up Vocabularies

Developers would also benefit by allowing other vocabularies to be used automatically with their JSON API. There are over 200 Vocabulary Documents that are available for use on the Web today. Some of these vocabularies are:

RDF - for describing information about objects on the semantic web.
RDFS - for expressing things like labels and comments.
XSD - for specifying basic types like strings, integers, dates and times.
Dublin Core - for describing creative works.
FOAF - for describing social networks.
Calendar - for specifying events.
SIOC - for describing discussions on blogs and websites.
CCrel - for describing Creative Commons and other types of licenses.
GEO - for describing geographic location.
VCard - for describing organizations and people.
DOAP - for describing projects.

Since these vocabularies are very popular, they are pre-defined in something called the default context, which is a set of vocabulary prefixes that are pre-loaded in all JSON-LD processors. The contents of the default context are provided later in this document. Using the default context allows developers to express data unambiguously, like so:

{
  "rdf:type": "foaf:Person",
  "foaf:name": "Manu Sporny",
  "foaf:homepage": "http://manu.sporny.org/",
  "sioc:avatar": "http://twitter.com/account/profile_image/manusporny"
}

Developers can also specify their own Vocabulary documents by modifying the active context in-line using the @context keyword, like so:

{
  "@context": { "myvocab": "http://example.org/myvocab#" },
  "a": "foaf:Person",
  "foaf:name": "Manu Sporny",
  "foaf:homepage": "http://manu.sporny.org/",
  "sioc:avatar": "http://twitter.com/account/profile_image/manusporny",
  "myvocab:personality": "friendly"
}

The @context keyword is used to change how the JSON-LD processor evaluates key-value pairs. In this case, it was used to map one string ('myvocab') to another string, which is interpreted as an IRI. In the example above, the myvocab string is replaced with "http://example.org/myvocab#" when it is detected, so "myvocab:personality" would expand to "http://example.org/myvocab#personality".

This mechanism is a short-hand for RDF, called a CURIE, and provides developers an unambiguous way to map any JSON value to RDF.
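As a rough sketch of how an in-line @context might extend a set of pre-defined prefixes, consider the following Python fragment. The names DEFAULT_CONTEXT and expand_curie are hypothetical, and DEFAULT_CONTEXT holds only an assumed two-entry subset of the default context.

```python
# Hypothetical subset of the default context (prefix -> vocabulary IRI).
DEFAULT_CONTEXT = {
    "foaf": "http://xmlns.com/foaf/0.1/",
    "sioc": "http://rdfs.org/sioc/ns#",
}


def expand_curie(value, context):
    """Expand prefix:suffix when the prefix is known; otherwise return value unchanged."""
    prefix, sep, suffix = value.partition(":")
    if sep and prefix in context:
        return context[prefix] + suffix
    return value


doc = {
    "@context": {"myvocab": "http://example.org/myvocab#"},
    "foaf:name": "Manu Sporny",
    "myvocab:personality": "friendly",
}

# The in-line @context extends (and may override) the pre-defined prefixes.
active = {**DEFAULT_CONTEXT, **doc["@context"]}
expanded = {expand_curie(k, active): v for k, v in doc.items() if k != "@context"}
```

Because a full IRI such as http://manu.sporny.org/ has no known prefix before its first colon, it is returned as-is by this sketch.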

2.6 An Example of a Context

JSON-LD strives to ensure that developers don't have to change the JSON that is going into and being returned from their Web applications. A JSON-LD aware Web Service may define a known context. For example, the following default context could apply to all incoming Web Service calls previously accepting only JSON data:

{
  "@context": 
  {
    "@vocab": "http://example.org/default-vocab#",
    "@base": "http://example.org/baseurl/",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
    "dc": "http://purl.org/dc/terms/",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "sioc": "http://rdfs.org/sioc/ns#",
    "cc": "http://creativecommons.org/ns#",
    "geo": "http://www.w3.org/2003/01/geo/wgs84_pos#",
    "vcard": "http://www.w3.org/2006/vcard/ns#",
    "cal": "http://www.w3.org/2002/12/cal/ical#",
    "doap": "http://usefulinc.com/ns/doap#",
    "Person": "http://xmlns.com/foaf/0.1/Person",
    "name": "http://xmlns.com/foaf/0.1/name",
    "homepage": "http://xmlns.com/foaf/0.1/homepage",
    "@coerce": 
    {
      "xsd:anyURI": ["rdf:type", "rdf:rest", "foaf:homepage", "foaf:member"],
      "xsd:integer": "foaf:age"
    }
  }
}

The @vocab string is a special keyword that states that any term that doesn't resolve to a term or a prefix should be appended to the @vocab IRI. This is done to ensure that terms can be transformed to an IRI at all times.

The @base string is a special keyword that states that any relative IRI must be appended to the string specified by @base.

The @coerce keyword is used to specify type coercion rules for the data. For each key in the map, the key is the type to be coerced to and the value is the vocabulary term to be coerced. Type coercion for the key xsd:anyURI asserts that all vocabulary terms listed should undergo coercion to an IRI, including @base processing for relative IRIs and CURIE processing for compact URI Expressions such as foaf:homepage.

3. Markup Examples

The JSON-LD markup examples below demonstrate how JSON-LD can be used to express semantic data marked up in other languages such as RDFa, Microformats, and Microdata. These sections are merely provided as proof that JSON-LD is very flexible in what it can express across different Linked Data approaches.

3.1 RDFa

The following example describes three people with their respective names and homepages.

<div prefix="foaf: http://xmlns.com/foaf/0.1/">
   <ul>
      <li typeof="foaf:Person">
        <a rel="foaf:homepage" href="http://example.com/bob/" property="foaf:name" >Bob</a>
      </li>
      <li typeof="foaf:Person">
        <a rel="foaf:homepage" href="http://example.com/eve/" property="foaf:name" >Eve</a>
      </li>
      <li typeof="foaf:Person">
        <a rel="foaf:homepage" href="http://example.com/manu/" property="foaf:name" >Manu</a>
      </li>
   </ul>
</div>

An example JSON-LD representation is shown below. There are other ways to mark up this information such that the context is not repeated.

[
 {
   "@": "_:bnode1",
   "a": "foaf:Person",
   "foaf:homepage": "http://example.com/bob/",
   "foaf:name": "Bob"
 },
 {
   "@": "_:bnode2",
   "a": "foaf:Person",
   "foaf:homepage": "http://example.com/eve/",
   "foaf:name": "Eve"
 },
 {
   "@": "_:bnode3",
   "a": "foaf:Person",
   "foaf:homepage": "http://example.com/manu/",
   "foaf:name": "Manu"
 }
]

3.2 Microformats

The following example uses a simple Microformats hCard example to express how the Microformat is represented in JSON-LD.

<div class="vcard">
 <a class="url fn" href="http://tantek.com/">Tantek Çelik</a>
</div>

The representation of the hCard expresses the Microformat terms in the context and uses them directly for the url and fn properties. Also note that the Microformat to JSON-LD processor has generated the proper URL type for http://tantek.com.

{
  "@context": 
  {
    "vcard": "http://microformats.org/profile/hcard#vcard",
    "url": "http://microformats.org/profile/hcard#url",
    "fn": "http://microformats.org/profile/hcard#fn",
    "@coerce": { "xsd:anyURI": "url" }
  },
  "@": "_:bnode1",
  "a": "vcard",
  "url": "http://tantek.com/",
  "fn": "Tantek Çelik"
}

3.3 Microdata

The Microdata example below expresses book information as a Microdata Work item.

<dl itemscope
    itemtype="http://purl.org/vocab/frbr/core#Work"
    itemid="http://purl.oreilly.com/works/45U8QJGZSQKDH8N">
 <dt>Title</dt>
 <dd><cite itemprop="http://purl.org/dc/terms/title">Just a Geek</cite></dd>
 <dt>By</dt>
 <dd><span itemprop="http://purl.org/dc/terms/creator">Wil Wheaton</span></dd>
 <dt>Format</dt>
 <dd itemprop="http://purl.org/vocab/frbr/core#realization"
     itemscope
     itemtype="http://purl.org/vocab/frbr/core#Expression"
     itemid="http://purl.oreilly.com/products/9780596007683.BOOK">
  <link itemprop="http://purl.org/dc/terms/type" href="http://purl.oreilly.com/product-types/BOOK">
  Print
 </dd>
 <dd itemprop="http://purl.org/vocab/frbr/core#realization"
     itemscope
     itemtype="http://purl.org/vocab/frbr/core#Expression"
     itemid="http://purl.oreilly.com/products/9780596802189.EBOOK">
  <link itemprop="http://purl.org/dc/terms/type" href="http://purl.oreilly.com/product-types/EBOOK">
  Ebook
 </dd>
</dl>

Note that the JSON-LD representation of the Microdata information stays true to the desires of the Microdata community to avoid contexts and instead refer to items by their full IRI.

[
  {
    "@": "http://purl.oreilly.com/works/45U8QJGZSQKDH8N",
    "a": "http://purl.org/vocab/frbr/core#Work",
    "http://purl.org/dc/terms/title": "Just a Geek",
    "http://purl.org/dc/terms/creator": "Wil Wheaton",
    "http://purl.org/vocab/frbr/core#realization": 
      ["http://purl.oreilly.com/products/9780596007683.BOOK", "http://purl.oreilly.com/products/9780596802189.EBOOK"]
  },
  {
    "@": "http://purl.oreilly.com/products/9780596007683.BOOK",
    "a": "http://purl.org/vocab/frbr/core#Expression",
    "http://purl.org/dc/terms/type": "http://purl.oreilly.com/product-types/BOOK"
  },
  {
    "@": "http://purl.oreilly.com/products/9780596802189.EBOOK",
    "a": "http://purl.org/vocab/frbr/core#Expression",
    "http://purl.org/dc/terms/type": "http://purl.oreilly.com/product-types/EBOOK"
  }
]

4. Markup of RDF Concepts

JSON-LD is designed to ensure that most Linked Data concepts can be marked up in a way that is simple to understand and author by Web developers. In many cases, JavaScript objects can become Linked Data with the simple addition of a context. Since RDF is also an important sub-community of the Linked Data movement, it is important that all RDF concepts are well-represented in this specification. This section details how each RDF concept can be expressed in JSON-LD.

4.1 IRIs

Expressing IRIs is fundamental to Linked Data, as that is how most subjects and many objects are identified. IRIs can be expressed in a variety of different ways in JSON-LD.

In general, an IRI is generated if it is in the key position in an associative array. There are special rules for processing keys in @context and when dealing with keys that start with the @ character.
An IRI is generated for the value specified using @, if it is a string.
An IRI is generated for the value specified using a.
An IRI is generated for the value specified using the @iri keyword.
An IRI is generated when there are @coerce rules in effect for xsd:anyURI for a particular vocabulary term.

An example of IRI generation for a key outside of a @context:

{
...
  "http://xmlns.com/foaf/0.1/name": "Manu Sporny",
...
}

In the example above, the key http://xmlns.com/foaf/0.1/name is interpreted as an IRI, as opposed to being interpreted as a string.

Term expansion occurs for IRIs if a term is defined within the active context:

{
  "@context": {"name": "http://xmlns.com/foaf/0.1/name"},
...
  "name": "Manu Sporny",
...
}

CURIE expansion also occurs for keys in JSON-LD:

{
...
  "foaf:name": "Manu Sporny",
...
}

foaf:name above will automatically expand out to the IRI http://xmlns.com/foaf/0.1/name.

An IRI is generated when a value is associated with a key using the @iri keyword:

{
...
  "foaf:homepage": { "@iri": "http://manu.sporny.org" }
...
}

If type coercion rules are specified in the @context for a particular vocabulary term, an IRI is generated:

{
  "@context": 
  { 
    "@coerce": 
    {
      "xsd:anyURI": "foaf:homepage"
    } 
  },
...
  "foaf:homepage": "http://manu.sporny.org",
...
}

4.2 Identifying the Subject

A subject is declared using the @ key. The subject is the first piece of information needed by the JSON-LD processor in order to create the (subject, property, object) tuple, also known as a triple.

{
...
  "@": "http://example.org/people#joebob",
...
}

The example above would set the subject to the IRI http://example.org/people#joebob.

4.3 Specifying the Type

The type of a particular subject can be specified using the a key. Specifying the type in this way will generate a triple of the form (subject, type, type-url).

{
...
  "@": "http://example.org/people#joebob",
  "a": "http://xmlns.com/foaf/0.1/Person",
...
}

The example above would generate the following triple (in N-Triples notation):

<http://example.org/people#joebob> 
   <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
      <http://xmlns.com/foaf/0.1/Person> .

4.4 Plain Literals

Regular text strings are called plain literals in RDF and are easily expressed using regular JSON strings.

{
...
  "foaf:name": "Mark Birbeck",
...
}

4.5 Language Specification in Plain Literals

JSON-LD makes an assumption that plain literals with associated language encoding information are not very common when used in JavaScript and Web Services. Thus, it takes a little more effort to express plain literals in a specified language.

{
...
  "foaf:name": 
  {
    "@literal": "花澄",
    "@language": "ja"
  }
...
}

The example above would generate a plain literal for 花澄 and associate the ja language tag with the triple that is generated. Languages must be expressed in [BCP47] format.

4.6 Typed Literals

A typed literal is indicated by attaching an IRI to the end of a plain literal, and this IRI indicates the literal's datatype. Literals may be typed in JSON-LD in three ways:

By utilizing the @coerce keyword.
By utilizing the expanded form for specifying objects.
By using a native JSON datatype.

The first example uses the @coerce keyword to express a typed literal:

{
  "@context": 
  { 
    "@coerce": 
    {
      "xsd:dateTime": "dc:modified"
    }
  },
...
  "dc:modified": "2010-05-29T14:17:39+02:00",
...
}

The second example uses the expanded form for specifying objects:

{
...
  "dc:modified": 
  {
    "@literal": "2010-05-29T14:17:39+02:00",
    "@datatype": "xsd:dateTime"
  }
...
}

Both examples above would generate an object with the literal value of 2010-05-29T14:17:39+02:00 and the datatype of http://www.w3.org/2001/XMLSchema#dateTime.
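The expanded form for literals can be turned into an N-Triples object term with very little code. The sketch below is illustrative only: the function name literal_to_ntriples and the PREFIXES table are hypothetical, string escaping is omitted, and only the @literal, @language and @datatype keys are considered.

```python
# Hypothetical prefix table used to expand datatype CURIEs in this sketch.
PREFIXES = {"xsd": "http://www.w3.org/2001/XMLSchema#"}


def literal_to_ntriples(value, prefixes):
    """Render a plain string or an expanded-form literal as an N-Triples object term."""
    if isinstance(value, str):
        return f'"{value}"'  # plain literal
    text = value["@literal"]
    if "@language" in value:
        return f'"{text}"@{value["@language"]}'  # language-tagged plain literal
    if "@datatype" in value:
        prefix, _, suffix = value["@datatype"].partition(":")
        return f'"{text}"^^<{prefixes[prefix]}{suffix}>'  # typed literal
    return f'"{text}"'


name = literal_to_ntriples({"@literal": "花澄", "@language": "ja"}, PREFIXES)
modified = literal_to_ntriples(
    {"@literal": "2010-05-29T14:17:39+02:00", "@datatype": "xsd:dateTime"}, PREFIXES
)
```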

4.7 Multiple Objects for a Single Property

A JSON-LD author can express multiple triples in a compact way by using arrays. If a subject has multiple values for the same property, the author may express the values of that property as an array.

{
...
  "@": "http://example.org/people#joebob",
  "foaf:nick": ["joe", "bob", "jaybee"],
...
}

The markup shown above would generate the following triples:

<http://example.org/people#joebob> 
   <http://xmlns.com/foaf/0.1/nick>
      "joe" .
<http://example.org/people#joebob> 
   <http://xmlns.com/foaf/0.1/nick>
      "bob" .
<http://example.org/people#joebob> 
   <http://xmlns.com/foaf/0.1/nick>
      "jaybee" .

4.8 Multiple Typed Literals for a Single Property

Multiple typed literals may also be expressed using the expanded form for objects:

{
...
  "@": "http://example.org/articles/8",
  "dc:modified": 
  [
    {
      "@literal": "2010-05-29T14:17:39+02:00",
      "@datatype": "xsd:dateTime"
    },
    {
      "@literal": "2010-05-30T09:21:28-04:00",
      "@datatype": "xsd:dateTime"
    }
  ]
...
}

The markup shown above would generate the following triples:

<http://example.org/articles/8> 
   <http://purl.org/dc/terms/modified>
      "2010-05-29T14:17:39+02:00"^^<http://www.w3.org/2001/XMLSchema#dateTime> .
<http://example.org/articles/8> 
   <http://purl.org/dc/terms/modified>
      "2010-05-30T09:21:28-04:00"^^<http://www.w3.org/2001/XMLSchema#dateTime> .

4.9 Blank Nodes

At times, it becomes necessary to be able to express information without being able to specify the subject. Typically, this is where blank nodes come into play. In JSON-LD, blank node identifiers are automatically created if a subject is not specified using the @ keyword. However, authors may name blank nodes by using the special _ CURIE prefix.

{
...
  "@": "_:foo",
...
}

The example above would set the subject to _:foo, which can then be used later on in the JSON-LD markup to refer back to the named blank node.

5. Advanced Features

JSON-LD has a number of features that provide functionality above and beyond the core functionality provided by RDF. The following sections outline the features that are specific to JSON-LD.

5.1 Automatic Typing

JSON is capable of expressing typed information such as doubles, integers, and boolean values. As demonstrated below, JSON-LD utilizes that information to create typed literals:

{
...
  // The following two values are automatically converted to a type of xsd:double
  // and both values are equivalent to each other.
  "measure:cups": 5.3,
  "measure:cups": 5.3e0,
  // The following value is automatically converted to a type of xsd:double as well
  "space:astronomicUnits": 6.5e73,
  // The following value should never be converted to a language-native type
  "measure:stones": { "@literal": "4.8", "@datatype": "xsd:decimal" },
  // This value is automatically converted to having a type of xsd:integer
  "chem:protons": 12,
  // This value is automatically converted to having a type of xsd:boolean
  "sensor:active": true,
...
}

In a number of modern programming languages, including JavaScript (ECMA-262), there is no distinction between xsd:decimal and xsd:double values. That is, the number 5.3 and the number 5.3e0 are treated as if they were the same. When converting from JSON-LD to a language-native format and back, datatype information is lost in a number of these languages. Thus, one could say that 5.3 is an xsd:decimal and 5.3e0 is an xsd:double in JSON-LD, but when both values are converted to a language-native format the datatype difference between the two is lost because the machine-level representation will almost always be a double. Implementers should be aware of this potential round-tripping issue between xsd:decimal and xsd:double. Specifically, objects with a datatype of xsd:decimal must not be converted to a language-native type.
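The mapping from native JSON values to datatypes, and the round-tripping issue, can be illustrated with a short Python sketch. The helper name native_datatype is hypothetical, and the mapping shown (booleans to xsd:boolean, integers to xsd:integer, other numbers to xsd:double) is an assumption drawn from the example in this section.

```python
import json

XSD = "http://www.w3.org/2001/XMLSchema#"


def native_datatype(value):
    """Return the XSD datatype IRI implied by a native JSON value, or None for strings."""
    # bool is checked first because bool is a subclass of int in Python.
    if isinstance(value, bool):
        return XSD + "boolean"
    if isinstance(value, int):
        return XSD + "integer"
    if isinstance(value, float):
        return XSD + "double"
    return None


doc = json.loads('{"measure:cups": 5.3, "chem:protons": 12, "sensor:active": true}')
datatypes = {key: native_datatype(value) for key, value in doc.items()}

# Once parsed, 5.3 and 5.3e0 are the same float, so the lexical distinction is gone.
same_after_parsing = json.loads("5.3") == json.loads("5.3e0")
```

A value that must remain an xsd:decimal would therefore have to stay in the expanded string form rather than be parsed into a native number.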

5.2 Type Coercion

JSON-LD supports the coercion of types to ensure that the zero-edit goal of JSON-LD can be accomplished. Type coercion allows someone deploying JSON-LD to coerce any incoming or outgoing types to the proper RDF type based on a mapping of type IRIs to RDF types. Using type coercion, one may convert simple JSON data to properly typed RDF data.

The example below demonstrates how a JSON-LD author can coerce values to plain literals, typed literals and IRIs.

{
  "@context": 
  {  
     "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
     "xsd": "http://www.w3.org/2001/XMLSchema#",
     "name": "http://xmlns.com/foaf/0.1/name",
     "age": "http://xmlns.com/foaf/0.1/age",
     "homepage": "http://xmlns.com/foaf/0.1/homepage",
     "@coerce":
     {
        "xsd:integer": "age",
        "xsd:anyURI": "homepage"
     }
  },
  "name": "John Smith",
  "age": "41",
  "homepage": "http://example.org/home/"
}

The example above would generate the following triples:

_:bnode1
   <http://xmlns.com/foaf/0.1/name>
      "John Smith" .
_:bnode1
   <http://xmlns.com/foaf/0.1/age>
      "41"^^<http://www.w3.org/2001/XMLSchema#integer> .
_:bnode1
   <http://xmlns.com/foaf/0.1/homepage>
      <http://example.org/home/> .
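One way to apply @coerce rules when generating the object of a triple is sketched below in Python. The function name coerce_value is hypothetical, literal escaping is omitted, and treating xsd:anyURI as "emit an IRI" is an assumption based on the description of @coerce in this document.

```python
# Context taken from the example in this section.
context = {
    "xsd": "http://www.w3.org/2001/XMLSchema#",
    "name": "http://xmlns.com/foaf/0.1/name",
    "age": "http://xmlns.com/foaf/0.1/age",
    "homepage": "http://xmlns.com/foaf/0.1/homepage",
    "@coerce": {"xsd:integer": "age", "xsd:anyURI": "homepage"},
}


def coerce_value(term, value, context):
    """Render value as an N-Triples object, honouring any @coerce rule for term."""
    for datatype, terms in context.get("@coerce", {}).items():
        # A rule may list a single term or an array of terms.
        terms = [terms] if isinstance(terms, str) else terms
        if term in terms:
            if datatype == "xsd:anyURI":
                return f"<{value}>"  # coerced to an IRI
            prefix, _, suffix = datatype.partition(":")
            return f'"{value}"^^<{context[prefix]}{suffix}>'  # typed literal
    return f'"{value}"'  # no rule applies: plain literal


age = coerce_value("age", "41", context)
homepage = coerce_value("homepage", "http://example.org/home/", context)
name = coerce_value("name", "John Smith", context)
```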

6. The JSON-LD Processing Algorithm

The JSON-LD Processing Model describes processing rules for extracting RDF from a JSON-LD document. Note that many uses of JSON-LD may not require generation of RDF.

The processing algorithm described in this section is provided in order to demonstrate how one might implement a JSON-LD processor. Conformant implementations are only required to produce the same type and number of triples during the output process and are not required to implement the algorithm exactly as described.

The Processing Algorithm is a work in progress.

6.1 Overview

This section is non-normative.

JSON-LD is intended to have an easy to parse grammar that closely models existing practice in using JSON for describing object representations. This allows the use of existing libraries for parsing JSON in a document-oriented fashion, or can allow for stream-based parsing similar to SAX.

As with other grammars used for describing linked data, a key concept is that of a resource. Resources may be of three basic types: IRIs, for describing externally named entities; BNodes, resources for which an external name does not exist or is not known; and Literals, which describe terminal entities such as strings, dates and other representations having a lexical representation, possibly including an explicit language or datatype.

Data described with JSON-LD may be considered to be the representation of a graph made up of subject and object resources related via a predicate resource. However, specific implementations may choose to operate on the document as a normal JSON description of objects having attributes.

6.2 Processing Algorithm Terms

default context: a context that is specified to the JSON-LD processing algorithm before processing begins.
default graph: the destination graph for all triples generated by JSON-LD markup.
active subject: the currently active subject that the processor should use when generating triples.
active property: the currently active property that the processor should use when generating triples.
active object: the currently active object that the processor should use when generating triples.
active context: a context that is used to resolve CURIEs while the processing algorithm is running. The active context is the context contained within the processor state.
local context: a context that is specified at the JSON associative-array level, specified via the @context keyword.
processor state: the processor state, which includes the active context, active subject, and active property. The processor state is managed as a stack with elements from the previous processor state copied into a new processor state when entering a new associative array.

6.3 Processing Tokens and Keywords

@context: Used to set the local context.
@base: Used to set the base IRI for all object IRIs affected by the active context.
@profile: A reference to a remote context description used to set the local context.
@vocab: Used to set the base IRI for all property IRIs affected by the active context.
@coerce: Used to specify type coercion rules.
@literal: Used to specify a literal value.
@iri: Used to specify an IRI value.
@language: Used to specify the language for a literal.
@datatype: Used to specify the datatype for a literal.
:: The separator for CURIEs when used in JSON keys or JSON values.
@: Sets the active subjects.
a: Used to set the rdf:type of the active subjects. This token may be considered syntactic sugar for rdf:type.

Use @source instead of @?

Use @type instead of a? Note that both are just syntactic sugar for rdf:type.

6.4 Context

Processing of JSON-LD is managed recursively using a process described in Sequence. During processing, each rule is applied using information provided by the active context. Processing begins by pushing a new processor state onto the processor state stack and initializing the active context with the default context. If a local context is encountered, information from the local context is merged into the active context.

Should the document URL be used as the default for @base in the default context?

The active context is used for expanding keys and values of an associative array (or elements of a list (see List Processing)).

A local context is identified within an associative array having a key of @context with an associative array value. When processing a local context, special rules apply:

The key @base must have a value of a simple string with the lexical form of IRI and is saved in the active context to perform term mapping as described in IRI Processing.
The key @vocab must have a value of a simple string with the lexical form of IRI and is saved in the active context to perform term mapping as described in IRI Processing.
The key @coerce must have a value of an associative array. Processing of the associative array is described below.
Otherwise, the key must have the lexical form of NCName and must have the value of a simple string with the lexical form of IRI. Merge each key-value pair into the active context, overwriting any duplicate values.

A local context may also be loaded from an external document using the @profile key as described in Vocabulary Profiles.
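The interaction between the processor state stack and local contexts might look roughly like the Python sketch below. Everything here is an assumption made for illustration: the function name push_state, the dictionary layout of a state, and the decision to skip validation of NCName and IRI forms. The @coerce key is deliberately left to the separate merge step in section 6.4.1.

```python
import copy


def push_state(stack, local_context=None):
    """Copy the current processor state, merge in a local context, and push the result."""
    if stack:
        state = copy.deepcopy(stack[-1])
    else:
        state = {"active_context": {}, "active_subject": None, "active_property": None}
    for key, value in (local_context or {}).items():
        if key == "@coerce":
            continue  # merged by a separate step (section 6.4.1), omitted here
        state["active_context"][key] = value  # later definitions overwrite earlier ones
    stack.append(state)
    return state


stack = []
push_state(stack, {"foaf": "http://xmlns.com/foaf/0.1/", "@base": "http://example.org/baseurl/"})
push_state(stack, {"foaf": "http://example.org/other-foaf#"})
inner_foaf = stack[-1]["active_context"]["foaf"]
stack.pop()  # leaving the nested associative array restores the outer state
outer_foaf = stack[-1]["active_context"]["foaf"]
```

Copying the state on entry means that a redefinition inside a nested associative array does not leak back to its parent.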

6.4.1 Coerce

Map each key-value pair in the local context's @coerce mapping into the active context's @coerce mapping, overwriting any duplicate values in the active context's @coerce mapping. Each value in the @coerce mapping is either a single CURIE or an array of CURIEs. When merging with an existing mapping in the active context, map all CURIE values to array form and replace with the union of the value from the local context and the value of the active context. If the result is an array with a single CURIE, the processor may represent this as a string value.
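A minimal Python sketch of this merge step follows. The function name merge_coerce is hypothetical, and collapsing a single-element array back to a string is the optional behaviour the text allows, chosen here only for illustration.

```python
def merge_coerce(active, local):
    """Merge a local @coerce mapping into a copy of the active @coerce mapping."""

    def as_list(value):
        # Normalise a single CURIE or an array of CURIEs to array form.
        return [value] if isinstance(value, str) else list(value)

    merged = dict(active)
    for datatype, terms in local.items():
        union = as_list(merged.get(datatype, []))
        for term in as_list(terms):
            if term not in union:
                union.append(term)
        # A one-element result may be represented as a plain string.
        merged[datatype] = union[0] if len(union) == 1 else union
    return merged


active = {"xsd:anyURI": "foaf:homepage"}
local = {"xsd:anyURI": ["foaf:homepage", "foaf:member"], "xsd:integer": "foaf:age"}
merged = merge_coerce(active, local)
```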

6.5 Chaining

Object chaining is a JSON-LD feature that allows an author to use the definition of JSON-LD objects as property values. This is a commonly used mechanism for creating a parent-child relationship between objects.

The example shows two objects related by a property of the first object:

{
...
  "foaf:name": "Manu Sporny",
  "foaf:knows": {
    "a": "foaf:Person",
    "foaf:name": "Gregg Kellogg"
  }
...
}

An object definition may be used anywhere a value is legal in JSON-LD.

6.6 IRI Processing

Keys and some values are evaluated to produce an IRI. This section defines an algorithm for transforming a value representing an IRI into an actual IRI.

IRIs may be represented as an explicit string, as a CURIE, or as a value relative to @base or @vocab.

CURIEs are defined more formally in [RDFA-CORE] section 6 "CURIE Syntax Definition". Generally, a CURIE is composed of a prefix and a suffix separated by a ':'. In JSON-LD, the prefix may be the empty string, denoting the default prefix.

The procedure for generating an IRI is:

Split the value into a prefix and suffix from the first occurrence of ':'.
If the prefix is a '_', generate a named BNode using the suffix as the name.
If the active context contains a mapping for prefix, generate an IRI by prepending the mapped prefix to the (possibly empty) suffix. Note that an empty suffix and no suffix (meaning the value contains no ':' string at all) are treated equivalently.
If the IRI being processed is for a property (i.e., a key value in an associative array, or a value in a @coerce mapping) and the active context has a @vocab mapping, join the mapped value to the suffix using the method described in [RFC3987].
If the IRI being processed is for a subject or object (i.e., not a property) and the active context has a @base mapping, join the mapped value to the suffix using the method described in [RFC3987].
Otherwise, use the value directly as an IRI.
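The procedure above might be sketched as follows. The function name processIri and the shape of its return value are hypothetical. Two assumptions are made: the @vocab and @base steps are applied only to values that contain no ':', and the standard URL constructor is used as an approximation of the reference resolution described in [RFC3987].

```javascript
// Sketch of IRI Processing. activeContext is a plain object; isProperty is
// true for keys of an associative array and for values in a @coerce mapping.
function processIri(value, activeContext, isProperty) {
  // Split at the first ':'; with no ':' the whole value acts as the prefix
  // and the suffix is empty.
  var idx = value.indexOf(':');
  var hasColon = idx !== -1;
  var prefix = hasColon ? value.slice(0, idx) : value;
  var suffix = hasColon ? value.slice(idx + 1) : '';

  // A '_' prefix denotes a named BNode.
  if (hasColon && prefix === '_') {
    return { bnode: suffix };
  }

  // A prefix mapped in the active context is prepended to the suffix.
  var isMapped = prefix.charAt(0) !== '@' &&
    Object.prototype.hasOwnProperty.call(activeContext, prefix) &&
    typeof activeContext[prefix] === 'string';
  if (isMapped) {
    return { iri: activeContext[prefix] + suffix };
  }

  // Assumption: relative values are those without a ':'. Properties resolve
  // against @vocab, subjects and objects against @base.
  var baseKey = isProperty ? '@vocab' : '@base';
  if (!hasColon && typeof activeContext[baseKey] === 'string') {
    return { iri: new URL(value, activeContext[baseKey]).href };
  }

  // Otherwise the value is used directly as an IRI.
  return { iri: value };
}

var ctx = {
  foaf: 'http://xmlns.com/foaf/0.1/',
  '@vocab': 'http://xmlns.com/foaf/0.1/'
};
var expandedCurie = processIri('foaf:name', ctx, true);
var expandedTerm = processIri('knows', ctx, true);
var blankNode = processIri('_:bnode1', ctx, false);
var absolute = processIri('http://example.org/people#john', ctx, false);
```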

7. Sequence

The algorithm below is designed for in-memory implementations with random access to associative array elements. For a description of a streaming implementation, see Appendix B.

A conforming JSON-LD processor must implement a processing algorithm that results in the same default graph that the following algorithm generates:

Create a new processor state with the active context set to the default context and active subject and active property initialized to NULL.
If an associative array is detected, perform the following steps:
1. If the associative array has a @context key, process the local context as described in Context.
2. If the associative array has an @iri key, set the active object by performing IRI Processing on the associated value. Generate a triple representing the active subject, the active property and the active object. Return the active object to the calling location.
3. If the associative array has a @literal key, set the active object to a literal value as follows:
  - as a typed literal if the associative array contains a @datatype key after performing IRI Processing on the specified @datatype.
  - otherwise, as a plain literal. If the associative array contains a @language key, use its value to set the language of the plain literal.
  Generate a triple representing the active subject, the active property and the active object. Return the active object to the calling location.
4. If the associative array has a @ key:
  1. If the value is a string, set the active object to the result of performing IRI Processing. Generate a triple representing the active subject, the active property and the active object. Set the active subject to the active object.
  2. Create a new processor state using copies of the active context, active subject and active property and process the value starting at Step 2, set the active subject to the result and proceed using the previous processor state.
5. If the associative array does not have a @ key, set the active object to newly generated blank node identifier. Generate a triple representing the active subject, the active property and the active object. Set the active subject to the active object.
6. For each key in the associative array that has not already been processed, perform the following steps:
  1. If the key is a, set the active property to rdf:type.
  2. Otherwise, set the active property to the result of performing IRI Processing on the key.
  3. Create a new processor state using copies of the active context, active subject and active property and process the value starting at Step 2, then proceed using the previous processor state.
7. Return the active object to the calling location.
If a regular array is detected, process each value in the array by doing the following returning the result of processing the last value in the array:
1. If the value is a regular array, generate an RDF List by linking each element of the list using rdf:first and rdf:rest, terminating the list with rdf:nil using the following sequence:
  1. If the list has no element, generate a triple using the active subject, active property and rdf:nil.
  2. Otherwise, generate a triple using the active subject, active property and a newly generated BNode identified as first bnode.
  3. For each element in the list:
    1. Create a processor state using the active context, first bnode as the active subject, and rdf:first as the active property.
    2. Unless this is the last element in the list, generate a new BNode identified as rest bnode, otherwise use rdf:nil.
    3. Generate a new triple using first bnode, rdf:rest and rest bnode.
    4. Set first bnode to rest bnode.
2. Otherwise, create a new processor state using copies of the active context, active subject and active property and process the value starting at Step 2, then proceed using the previous processor state.
If a string is detected, generate a triple using the active subject, active property and a plain literal value created from the string.
If a number is detected, generate a typed literal using a string representation of the value with datatype set to either xsd:integer or xsd:double, depending on whether the value contains a fractional and/or an exponential component. Generate a triple using the active subject, active property and the generated typed literal.
Otherwise, if true or false is detected, generate a triple using the active subject, active property and a typed literal value created from the string representation of the value with datatype set to xsd:boolean.
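Two parts of the algorithm above, literal generation and RDF List generation, can be sketched as shown below. This is an illustration under several assumptions: the names toLiteral and listTriples are hypothetical, list elements are limited to strings, numbers and booleans, blank node labels such as _:l0 are invented, and Number.isInteger stands in for inspecting the lexical form of a number, which is lost once JSON has been parsed.

```javascript
var RDF = 'http://www.w3.org/1999/02/22-rdf-syntax-ns#';
var XSD = 'http://www.w3.org/2001/XMLSchema#';

// Converts a JSON string, number or boolean into a literal description.
function toLiteral(value) {
  if (typeof value === 'string') {
    return { literal: value };
  }
  if (typeof value === 'boolean') {
    return { literal: String(value), datatype: XSD + 'boolean' };
  }
  if (typeof value === 'number') {
    var type = Number.isInteger(value) ? 'integer' : 'double';
    return { literal: String(value), datatype: XSD + type };
  }
  throw new TypeError('Unsupported value in this sketch');
}

// Generates [subject, property, object] triples for a list of literals.
function listTriples(subject, property, items) {
  var triples = [];
  if (items.length === 0) {
    triples.push([subject, property, RDF + 'nil']);
    return triples;
  }
  var counter = 0;
  var first = '_:l' + counter++;
  triples.push([subject, property, first]);
  items.forEach(function (item, index) {
    triples.push([first, RDF + 'first', toLiteral(item)]);
    // The last element is linked to rdf:nil, all others to a new BNode.
    var isLast = index === items.length - 1;
    var rest = isLast ? RDF + 'nil' : '_:l' + counter++;
    triples.push([first, RDF + 'rest', rest]);
    first = rest;
  });
  return triples;
}

var listExample = listTriples('_:s', 'http://example.org/vocab#items', ['a', 42]);
```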

8. Best Practices

The nature of Web programming allows one to use basic technologies, such as JSON-LD, across a variety of systems and environments. This section attempts to describe some of those environments and the way in which JSON-LD can be integrated in order to help alleviate certain development headaches.

8.1 JavaScript

JSON-LD is expected to be used widely in JavaScript environments. However, features such as the expanded form for object values mean that using JSON-LD directly in JavaScript may be cumbersome without a middleware layer, such as a simple library that converts JSON-LD markup before JavaScript uses it. JSON-LD is arguably a good fit for the RDF API, which enables a variety of RDF-based Web Applications, but some developers do not want to require that level of functionality just to use JSON-LD. The group is still discussing the best way to proceed, so input on how JSON-LD could be more easily used in JavaScript environments would be very much appreciated.

8.2 Schema-less Databases

Databases such as CouchDB and MongoDB allow the creation of schema-less data stores. RDF is a type of schema-less data model and thus lends itself to databases such as CouchDB and MongoDB. Both of these databases can use JSON-LD as their storage format. The group needs feedback from CouchDB and MongoDB experts regarding the usefulness of JSON-LD in those environments.

MongoDB does not allow the '.' character to be used in key names. This prevents developers from storing IRIs as keys, which also prevents storage of the data in normalized form. While this issue can be avoided by using CURIEs for key values, it is not known if this mechanism is enough to allow JSON-LD to be used in MongoDB in a way that is useful to developers.

9. Advanced Concepts

There are a few advanced concepts where it is not clear whether or not the JSON-LD specification is going to support the complexity necessary to support each concept. The entire section on Advanced Concepts should be considered as discussion points; it is merely a list of possibilities where all of the benefits and drawbacks have not been explored.

9.1 Vocabulary Profiles

One of the more powerful features of RDFa 1.1 Core is the ability to specify a collection of prefixes and terms that can be re-used by a processor to simplify markup. JSON-LD provides a similar mechanism called Vocabulary Profiles, which is the inclusion of a context external to the JSON-LD document.

The example below demonstrates how one may specify an external Vocabulary Profile. Assume the following profile exists at this imaginary URL: http://example.org/profiles/contacts.

{
  "@context": 
  {
     "xsd": "http://www.w3.org/2001/XMLSchema#",
     "name": "http://xmlns.com/foaf/0.1/name",
     "age": "http://xmlns.com/foaf/0.1/age",
     "homepage": "http://xmlns.com/foaf/0.1/homepage",
     "#types":
     {
        "age": "xsd:integer",
        "homepage": "xsd:anyURI"
     }
  }
}

The profile listed above can be used in the following way:

{
  "@profile": "http://example.org/profiles/contacts",
  "name": "John Smith",
  "age": "41",
  "homepage": "http://example.org/home/"
}

The example above would generate the following triples:

_:bnode1
   <http://xmlns.com/foaf/0.1/name>
      "John Smith" .
_:bnode1
   <http://xmlns.com/foaf/0.1/age>
      "41"^^<http://www.w3.org/2001/XMLSchema#integer> .
_:bnode1
   <http://xmlns.com/foaf/0.1/homepage>
      <http://example.org/home/> .
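One way a processor might resolve a profile is sketched below. The function resolveProfile and the in-memory profiles table are hypothetical stand-ins for retrieving the external document, and letting a document's own @context take precedence over the profile is an assumption that this draft does not settle.

```javascript
// Replaces the @profile key with the context found in the referenced
// profile. profiles maps profile URLs to already-parsed profile documents.
function resolveProfile(doc, profiles) {
  var url = doc['@profile'];
  if (typeof url !== 'string') {
    return doc;
  }
  var profile = profiles[url];
  if (!profile || typeof profile['@context'] !== 'object') {
    throw new Error('Unknown profile: ' + url);
  }
  var resolved = {
    // Assumption: local context entries override profile entries.
    '@context': Object.assign({}, profile['@context'], doc['@context'])
  };
  Object.keys(doc).forEach(function (key) {
    if (key !== '@profile' && key !== '@context') {
      resolved[key] = doc[key];
    }
  });
  return resolved;
}

var profiles = {
  'http://example.org/profiles/contacts': {
    '@context': {
      name: 'http://xmlns.com/foaf/0.1/name',
      age: 'http://xmlns.com/foaf/0.1/age'
    }
  }
};
var resolvedDoc = resolveProfile(
  { '@profile': 'http://example.org/profiles/contacts', name: 'John Smith' },
  profiles
);
```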

9.2 Disjoint Graphs

When serializing an RDF graph that contains two or more sections of the graph which are entirely disjoint, one must use an array to express the graph as two graphs. This may not be acceptable to some authors, who would rather express the information as one graph. Since, by definition, disjoint graphs require there to be two top-level objects, JSON-LD utilizes a mechanism that allows disjoint graphs to be expressed using a single graph.

Assume the following RDF graph:

<http://example.org/people#john> 
   <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
      <http://xmlns.com/foaf/0.1/Person> .
<http://example.org/people#jane> 
   <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
      <http://xmlns.com/foaf/0.1/Person> .

Since the two subjects are entirely disjoint from one another, it is impossible to express the RDF graph above using a single JSON-LD associative array.

In JSON-LD, one can use the subject to express disjoint graphs as a single graph:

{
  "@": 
  [
    {
      "@": "http://example.org/people#john",
      "a": "foaf:Person"
    },
    {
      "@": "http://example.org/people#jane",
      "a": "foaf:Person"
    }
  ]
}

A disjoint graph could also be expressed like so:

[
  {
    "@": "http://example.org/people#john",
    "a": "foaf:Person"
  },
  {
    "@": "http://example.org/people#jane",
    "a": "foaf:Person"
  }
]

9.3 The JSON-LD API

This API provides a clean mechanism that enables developers to convert JSON-LD data into a format that is easier to work with in various programming languages.

[NoInterfaceObject]
interface JSONLDProcessor {
    object toProjection (in DOMString jsonld, in object? template, in DOMString? subject, in optional JSONLDParserCallback? callback);
    Graph  toGraph (in DOMString jsonld, in optional JSONLDParserCallback? callback);
};

9.3.1 Methods

toGraph

Parses JSON-LD and transforms the data into a Graph, which is compatible with the RDF Interfaces API specification [RDF-INTERFACES]. This method will return null if there are any errors, or if the RDF Interfaces API is not available for use.

Parameter	Type	Nullable	Optional	Description
jsonld	`DOMString`	✘	✘	The JSON-LD string to parse into the RDFGraph.
callback	`JSONLDParserCallback`	✔	✔	A callback that is called whenever a processing error occurs on the given JSON-LD string.

No exceptions.

Return type: Graph

toProjection

Parses JSON-LD text into an RDF API Projection object as specified by the RDF API specification [RDF-API]. If there are any errors, null is returned.

Parameter	Type	Nullable	Optional	Description
jsonld	`DOMString`	✘	✘	The JSON-LD string to parse into the Projection.
template	`object`	✔	✘	The Projection template to use when building the Projection.
subject	`DOMString`	✔	✘	The subject to use when building the Projection.
callback	`JSONLDParserCallback`	✔	✔	A callback that is called whenever a processing error occurs on the given JSON-LD string.

No exceptions.

Return type: object

The JSONLDParserCallback is called whenever a processing error occurs on input data.

[NoInterfaceObject, Callback]
interface JSONLDParserCallback {
    void error (in DOMString error);
};

9.3.2 Methods

error

This callback is invoked whenever an error occurs during processing.

Parameter	Type	Nullable	Optional	Description
error	`DOMString`	✘	✘	A descriptive error string returned by the processor.

No exceptions.

Return type: void

The following example demonstrates how to convert JSON-LD to a projection that is directly usable in a programming environment:

// retrieve JSON-LD from a Web Service
var jsonldString = fetchPerson();

// This map, usually defined once per script, defines how to map incoming 
// JSON-LD to JavaScript objects
var myTemplate = { "http://xmlns.com/foaf/0.1/name" : "name",
                   "http://xmlns.com/foaf/0.1/age" : "age",
                   "http://xmlns.com/foaf/0.1/homepage" : "homepage" };

// Map the JSON-LD to a language-native object
var person = jsonld.toProjection(jsonldString, myTemplate);

// Use the language-native object
alert(person.name + " is " + person.age + " years old. " +
      "Their homepage is: " + person.homepage);
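To make the role of the template more concrete, the sketch below shows the key renaming that a template expresses. It is not the RDF API Projection mechanism: applyTemplate is a hypothetical helper, and its input is assumed to be a single object whose keys are already full IRIs.

```javascript
// Copies each property named in the template from an IRI-keyed object to a
// language-native object that uses the short names given by the template.
function applyTemplate(expanded, template) {
  var out = {};
  Object.keys(template).forEach(function (iri) {
    if (Object.prototype.hasOwnProperty.call(expanded, iri)) {
      out[template[iri]] = expanded[iri];
    }
  });
  return out;
}

var projected = applyTemplate(
  {
    'http://xmlns.com/foaf/0.1/name': 'John Smith',
    'http://xmlns.com/foaf/0.1/age': '41'
  },
  {
    'http://xmlns.com/foaf/0.1/name': 'name',
    'http://xmlns.com/foaf/0.1/age': 'age',
    'http://xmlns.com/foaf/0.1/homepage': 'homepage'
  }
);
```

Properties that the template names but the data lacks, such as homepage here, are simply absent from the result.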

A JSON-LD Serializer is also available to map a language-native object to JSON-LD.

[NoInterfaceObject]
interface JSONLDSerializer {
    DOMString normalize (in object obj);
};

9.3.3 Methods

normalize

Serializes a language-native object into a normalized JSON-LD string. Normalization is important when performing things like equality comparison and digital signature creation and verification.

Parameter	Type	Nullable	Optional	Description
obj	`object`	✘	✘	An associative array of key-value pairs that should be converted to a JSON-LD string. It is assumed that a map already exists for the data.

No exceptions.

Return type: DOMString

The Normalization Algorithm

This algorithm is very rough, untested, and probably contains many bugs. Use at your own risk. It will change in the coming months.

The JSON-LD normalization algorithm is as follows:

Remove the @context key and preserve it as the transformation map while running this algorithm.
For each key
1. If the key is a CURIE, expand the CURIE to an IRI using the transformation map.
For each value
1. If the value should be type coerced per the transformation map, ensure that it is transformed to the new value.
2. If the value is a CURIE, expand the CURIE to an IRI using the transformation map.
3. If the value is a typed literal and the type is a CURIE, expand it to an IRI using the transformation map.
4. When generating the final value, use expanded object value form to store all IRIs, typed literals and plain literals with language information.
Output each sorted key-value pair without any extraneous whitespace. If the value is an associative array, perform this algorithm, starting at step #1, recursively on the sub-tree. There should be no nesting in the outputted JSON data. That is, the top-most element should be an array. Each item in the array contains a single subject with a corresponding array of properties in UTF-8 sort order. Any related objects that are complex objects themselves should be given a top-level object in the top-level array.

Note that normalizing named blank nodes is impossible at present since one would have to specify a blank node naming algorithm. For the time being, you cannot normalize graphs that contain named blank nodes. However, normalizing graphs that contain non-named blank nodes is supported.

var myObj = { "@context" : { 
                "xsd" : "http://www.w3.org/2001/XMLSchema#",
                "name" : "http://xmlns.com/foaf/0.1/name",
                "age" : "http://xmlns.com/foaf/0.1/age",
                "homepage" : "http://xmlns.com/foaf/0.1/homepage",
                "@coerce": {
                   "xsd:nonNegativeInteger": "age",
                   "xsd:anyURI": "homepage"
                }
              },
              "name" : "Joe Jackson",
              "age" : "42",
              "homepage" : "http://example.org/people/joe" };

// Map the language-native object to JSON-LD
var jsonldText = jsonld.normalize(myObj);

After the code in the example above has executed, the jsonldText value will be (line-breaks added for readability):

[{"http://xmlns.com/foaf/0.1/age":{"@datatype":"http://www.w3.org/2001/XMLSchema#nonNegativeInteger","@literal":"42"},
"http://xmlns.com/foaf/0.1/homepage":{"@iri":"http://example.org/people/joe"},
"http://xmlns.com/foaf/0.1/name":"Joe Jackson"}]
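For a single, flat object such as the one above, the normalization steps might be sketched as follows. The function names normalizeFlat and expandCurie are hypothetical, nested objects and subjects are not handled, and treating xsd:anyURI coercion as producing the @iri form is inferred from the example output rather than stated by the algorithm.

```javascript
// Expands a CURIE or term using the transformation map; values with an
// unknown prefix are returned unchanged.
function expandCurie(value, context) {
  var idx = value.indexOf(':');
  var prefix = idx === -1 ? value : value.slice(0, idx);
  var suffix = idx === -1 ? '' : value.slice(idx + 1);
  return typeof context[prefix] === 'string' ? context[prefix] + suffix : value;
}

function normalizeFlat(obj) {
  var context = obj['@context'] || {};
  var coerce = context['@coerce'] || {};

  // Invert the @coerce mapping (datatype to properties) for lookup by key.
  var typeOf = {};
  Object.keys(coerce).forEach(function (type) {
    [].concat(coerce[type]).forEach(function (prop) {
      typeOf[prop] = type;
    });
  });

  var expanded = {};
  Object.keys(obj).forEach(function (key) {
    if (key === '@context') {
      return;
    }
    var value = obj[key];
    var type = typeOf[key];
    if (type === 'xsd:anyURI') {
      value = { '@iri': expandCurie(String(value), context) };
    } else if (type) {
      value = { '@datatype': expandCurie(type, context), '@literal': String(value) };
    }
    expanded[expandCurie(key, context)] = value;
  });

  // Emit the keys in sorted order, without extraneous whitespace.
  var sorted = {};
  Object.keys(expanded).sort().forEach(function (key) {
    sorted[key] = expanded[key];
  });
  return JSON.stringify([sorted]);
}

var normalizedText = normalizeFlat({
  '@context': {
    xsd: 'http://www.w3.org/2001/XMLSchema#',
    name: 'http://xmlns.com/foaf/0.1/name',
    age: 'http://xmlns.com/foaf/0.1/age',
    homepage: 'http://xmlns.com/foaf/0.1/homepage',
    '@coerce': { 'xsd:nonNegativeInteger': 'age', 'xsd:anyURI': 'homepage' }
  },
  name: 'Joe Jackson',
  age: '42',
  homepage: 'http://example.org/people/joe'
});
```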

When normalizing xsd:double values, implementers must ensure that the normalized value is a string. In order to generate the string from a double value, output equivalent to the printf("%1.6e", value) function in C must be used where "%1.6e" is the string formatter and value is the value to be converted.

To convert a double value in JavaScript, implementers can use the following snippet of code:

// the variable 'value' below is the JavaScript native double value that is to be converted
(value).toExponential(6).replace(/(e(?:\+|-))([0-9])$/, '$10$2')
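The same conversion can be wrapped in a function, as in the sketch below. The name formatDouble is hypothetical, and a replacer function is used in place of the replacement string purely for readability. An exponent that already has two or more digits is left unchanged.

```javascript
// Formats a double like printf("%1.6e", value) in C by padding a
// single-digit exponent with a leading zero.
function formatDouble(value) {
  return value.toExponential(6).replace(/e([+-])([0-9])$/, function (match, sign, digit) {
    return 'e' + sign + '0' + digit;
  });
}

var one = formatDouble(1);         // "1.000000e+00"
var half = formatDouble(-0.5);     // "-5.000000e-01"
var large = formatDouble(1e100);   // "1.000000e+100"
```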

When data needs to be normalized, JSON-LD authors should not use values that are going to undergo automatic conversion. This is due to the lossy nature of xsd:double values.

Round-tripping data can be problematic if we mix and match @coerce rules with JSON-native datatypes, like integers. Consider the following code example:

var myObj = { "@context" : { 
                "number" : "http://example.com/vocab#number",
                "@coerce": {
                   "xsd:nonNegativeInteger": "number"
                }
              },
              "number" : 42 };

// Map the language-native object to JSON-LD
var jsonldText = jsonld.normalize(myObj);

// Convert the normalized object back to a JavaScript object
var myObj2 = jsonld.parse(jsonldText);

At this point, myObj2 and myObj will have different values for the "number" key: myObj will hold the number 42, while myObj2 will hold the string "42". This type of round-tripping error can surprise developers. The group is considering whether a "coerce validation" phase during parsing and normalization would be a good idea, since it would prevent round-tripping issues like the one described above.
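One possible shape for such a check is sketched below. It is purely illustrative: the function findCoerceMismatches is hypothetical, and the rule that a coerced property must hold a string value is an assumption rather than anything this specification defines.

```javascript
// Returns the names of properties that have a @coerce rule but hold a
// JSON-native value (number, boolean, ...) instead of a string.
function findCoerceMismatches(obj) {
  var coerce = (obj['@context'] || {})['@coerce'] || {};
  var mismatches = [];
  Object.keys(coerce).forEach(function (type) {
    [].concat(coerce[type]).forEach(function (prop) {
      var present = Object.prototype.hasOwnProperty.call(obj, prop);
      if (present && typeof obj[prop] !== 'string') {
        mismatches.push(prop);
      }
    });
  });
  return mismatches;
}

var problems = findCoerceMismatches({
  '@context': {
    number: 'http://example.com/vocab#number',
    '@coerce': { 'xsd:nonNegativeInteger': 'number' }
  },
  number: 42
});
```

A processor could report these property names through its error callback before normalizing, so that the author can decide whether to store the value as a string.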

JSON-LD - Linked Data Expression in JSON

A Context-based JSON Serialization for Linked Data

Unofficial Draft 15 June 2011
