Re: Updated data model specification document from Dave Longley on 2016-05-28 (public-credentials@w3.org from May 2016)

From: Dave Longley <dlongley@digitalbazaar.com>
Date: Fri, 27 May 2016 21:41:12 -0400
To: public-credentials@w3.org
Cc: Steven Rowat <steven_rowat@sunshine.net>
Message-ID: <5748F738.8070007@digitalbazaar.com>

On 05/27/2016 11:31 AM, Steven Rowat wrote:
> ...
> 
> Second, which I also remember from my first reading and have just gone
> back to the document to check: I'm uncomfortable with the fact that 'id'
> is not differentiated in its different usages. Couldn't it be given a
> different name like "claimid"?  I found this disorienting -- if the 'id'
> for a claim is specific for claims, then shouldn't that be indicated?
> 
> And not just in reading the data description; I think this is a
> potential hotspot for confusion in writing the code. For example, to me
> the second example seems ripe for confusion:
> 
> Example 2: A simple claim
> 
> {
>   "id": "http://example.gov/credentials/3732",
>   "type": ["Credential", "ProofOfAgeCredential"],
>   "issuer": "https://dmv.example.gov",
>   "issued": "2010-01-01",
>   "claim": {
>     "id": "did:ebfeb1f712ebc6f1c276e12ec21",
>     "ageOver": 21
>   }
> }
> 
> Perhaps this could be easier to code and parse, instead, if it was:
> 
> Example 2: A simple claim
> 
> {
>   "id": "http://example.gov/credentials/3732",
>   "type": ["Credential", "ProofOfAgeCredential"],
>   "issuer": "https://dmv.example.gov",
>   "issued": "2010-01-01",
>   "claim": {
>     "claimid": "did:ebfeb1f712ebc6f1c276e12ec21",
>     "ageOver": 21
>   }
> }

The "id" property is a bit special. To explain why, I'll try to lay out
my view of the data model.

The data model can be understood as a graph. It can be thought of as a
collection of nodes that are connected to other nodes via "properties".

Each of these relations in the graph can be modeled as a statement with
a subject, a property, and an object. A subject may have many
properties. The objects that are linked to it via these properties may
themselves be the subjects of other relations or they may be literal
values like numbers or strings.

Looking at the JSON above, every time you see properties nested inside
curly braces ("{" and "}"), you're looking at a verbose expression of
a subject and some of its relations. Subjects are identified via a
special property, "id". The value for "id" may be a URL that globally
identifies the subject. Every other property defines a relation between
the subject and an object. The object, again, could be a literal value
or some other subject with relations of its own.
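
As a rough illustration of that reading, here is a minimal Python sketch
that flattens the quoted Example 2 into subject/property/object
statements. The function name to_triples is made up for this example, and
the sketch assumes every curly-braces node carries an "id"; it is not part
of the specification.

```python
def to_triples(node):
    """Flatten a nested JSON node into (subject, property, object) triples.

    Assumption for this sketch: every dict has an "id" property.
    """
    triples = []
    subject = node["id"]
    for prop, value in node.items():
        if prop == "id":
            continue  # "id" names the subject; it is not a relation
        values = value if isinstance(value, list) else [value]
        for item in values:
            if isinstance(item, dict):
                # Nested subject: link to it by its id, then collect
                # the relations it has of its own.
                triples.append((subject, prop, item["id"]))
                triples.extend(to_triples(item))
            else:
                # Literal value or short string form of an identifier.
                triples.append((subject, prop, item))
    return triples


credential = {
    "id": "http://example.gov/credentials/3732",
    "type": ["Credential", "ProofOfAgeCredential"],
    "issuer": "https://dmv.example.gov",
    "issued": "2010-01-01",
    "claim": {
        "id": "did:ebfeb1f712ebc6f1c276e12ec21",
        "ageOver": 21,
    },
}

for triple in to_triples(credential):
    print(triple)
```

Each printed tuple corresponds to one statement in the graph; the nested
"claim" object contributes both the link from the credential and the
"ageOver" statement about the identity.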

If you want to express a simple identity, you may do this:

{
  "id": "<the identity's ID>",
  "name": "Steven Rowat"
}

That's essentially a claim about some identity that has this English
meaning:

The entity identified by <the identity's ID> has the name "Steven Rowat".

However, you don't know any information about the claim itself, like who
made it, when it was made, etc. To wrap up this information, we put it
all into a "TBD Credential". This "TBD Credential" contains relations
that identify who a claim is about, who made the claim, and when the
claim was made.

More specifically, the property "issuer" relates the credential to the
entity that made the claim. The term "issuer" comes from the idea
that making a set of claims about an entity can be described as "issuing
a credential". The property "claim" relates the credential to the entity
that the claim is about. In English:

The credential <the credential's ID> was issued by <the issuer's ID>.
The credential contains a claim about <the identity's ID>.

In JSON:

{
  "id": "<the credential's ID>",
  "issuer": "<the issuer's ID>",
  "claim": "<the identity's ID>"
}

The problem here is that this document doesn't tell us any attributes
of the identity we're making claims about. So we haven't really
claimed anything yet. We need to actually list some properties
(relations) about the entity itself. We do that by expressing it via the
curly braces form instead of as a simple string with its identifier. Of
course, we need to keep its identifier around, so we express it using
the "id" property:

{
  "id": "<the credential's ID>",
  "issuer": "<the issuer's ID>",
  "claim": {
    "id": "<the identity's ID>",
    "name": "Steven Rowat"
  }
}

Thus, if a subject is expressed using the curly braces form, we know we
can always find its identifier via the "id" property.
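
In code, that rule might look like the following Python sketch. The
helper name subject_id is hypothetical; the point is only that one lookup
can serve the short string form and the curly braces form alike, whatever
kind of subject it is.

```python
def subject_id(value):
    """Return the identifier of a subject in either form.

    A plain string is taken to be the identifier itself; a dict is
    expected to carry it under "id".
    """
    if isinstance(value, str):
        return value
    return value["id"]


short_form = {
    "id": "<the credential's ID>",
    "issuer": "<the issuer's ID>",
    "claim": "<the identity's ID>",
}

long_form = {
    "id": "<the credential's ID>",
    "issuer": "<the issuer's ID>",
    "claim": {
        "id": "<the identity's ID>",
        "name": "Steven Rowat",
    },
}

# Both documents point at the same identity.
same_identity = subject_id(short_form["claim"]) == subject_id(long_form["claim"])
print(same_identity)
```

If the key differed per kind of subject (say "claimid" for claims), a
helper like this would presumably need to know which kind of node it was
handed before it could find the identifier.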

You can also see that now the document contains an additional statement.
In English, we've got:

The credential <the credential's ID> was issued by <the issuer's ID>.
The credential contains a claim about <the identity's ID>.
The entity identified by <the identity's ID> has the name "Steven Rowat".

We can take this further and make more complex claims like:

{
  "id": "<the credential's ID>",
  "issuer": "<the issuer's ID>",
  "claim": {
    "id": "<the identity's ID>",
    "name": "Steven Rowat",
    "favoriteColor": {
      "id": "<the color blue's ID>",
      "name": "Blue"
    }
  }
}

Here we've added another node to the graph, a color. In English we've added:

The entity identified by <the identity's ID> has a favorite color <the
color blue's ID>.
The color <the color blue's ID> has the name "Blue".

Using this method, we can add more relations -- and in some cases, we
can avoid repeating ourselves by referring to objects using the short
string form of their identifier:

{
  "id": "<the credential's ID>",
  "issuer": "<the issuer's ID>",
  "claim": {
    "id": "<the identity's ID>",
    "name": "Steven Rowat",
    "favoriteColor": {
      "id": "<the color blue's ID>",
      "name": "Blue"
    },
    "eyeColor": "<the color blue's ID>"
  }
}

This works because, in the data model, the properties "favoriteColor"
and "eyeColor" both refer to the same entity.
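
To make that concrete, here is a small Python sketch that indexes every
curly-braces node by its "id" and then looks up the short string form.
The names index_nodes and resolve are invented for this example, and the
sketch assumes that any string matching a known "id" is a reference; plain
JSON alone cannot tell an identifier from an ordinary string.

```python
def index_nodes(node, index=None):
    """Collect every dict that has an "id" into a lookup table."""
    index = {} if index is None else index
    if isinstance(node, dict):
        if "id" in node:
            index[node["id"]] = node
        for value in node.values():
            index_nodes(value, index)
    elif isinstance(node, list):
        for item in node:
            index_nodes(item, index)
    return index


def resolve(value, index):
    """Return the full node for a short-form reference, if one is known."""
    if isinstance(value, str) and value in index:
        return index[value]
    return value


credential = {
    "id": "<the credential's ID>",
    "issuer": "<the issuer's ID>",
    "claim": {
        "id": "<the identity's ID>",
        "name": "Steven Rowat",
        "favoriteColor": {
            "id": "<the color blue's ID>",
            "name": "Blue",
        },
        "eyeColor": "<the color blue's ID>",
    },
}

index = index_nodes(credential)
eye_color = resolve(credential["claim"]["eyeColor"], index)
print(eye_color["name"])
```

Here "eyeColor" resolves to the very same node that "favoriteColor"
spells out in full, so the color's name only has to be written once.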

Furthermore, if you're using JSON-LD to represent "TBD Credentials", you
can add a special "@context" property that will map each property to a
global identifier (a URL). That allows you to distinguish the relation
at a global scale, and, if you are using a Linked Data vocabulary, you
can dereference the property's URL to find out more information about
it, such as the relation's definition and where you're likely to see it
appear.
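
The mapping idea can be imitated with a toy Python sketch. This is not
the JSON-LD expansion algorithm, only a term lookup: the function name
expand_terms is hypothetical, the example.org URLs are placeholders, and
unmapped properties are simply left as they are.

```python
def expand_terms(document):
    """Replace each property name with the URL its "@context" maps it to.

    Toy illustration only: "id" is left alone and unmapped properties
    are kept unchanged.
    """
    context = document.get("@context", {})

    def expand(value):
        if isinstance(value, list):
            return [expand(item) for item in value]
        if not isinstance(value, dict):
            return value
        expanded = {}
        for prop, inner in value.items():
            if prop == "@context":
                continue  # the mapping itself is not a relation
            key = prop if prop == "id" else context.get(prop, prop)
            expanded[key] = expand(inner)
        return expanded

    return expand(document)


credential = {
    "@context": {
        "name": "http://schema.org/name",
        # Placeholder URLs, assumed for this example only.
        "issuer": "https://example.org/vocab#issuer",
        "claim": "https://example.org/vocab#claim",
    },
    "id": "<the credential's ID>",
    "issuer": "<the issuer's ID>",
    "claim": {
        "id": "<the identity's ID>",
        "name": "Steven Rowat",
    },
}

expanded = expand_terms(credential)
for prop in expanded:
    print(prop)
```

After this step each relation is named by a URL rather than a short local
term, which is what lets two parties agree on what "name" means.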

An example of such a Linked Data vocabulary, put together by Google,
Yahoo, Microsoft, and Yandex, can be found here:

http://schema.org/

You can use one or more vocabularies like this one in the same "TBD
Credential" to make verifiable claims. This approach allows people to
create decentralized vocabularies and to reuse, mix, and match them based
on their personal or industry needs. It also helps ensure
that the semantics for a particular property are clear at global
scale -- so those who are interested in verifying claims can know their
exact meaning.


-- 
Dave Longley
CTO
Digital Bazaar, Inc.
Received on Saturday, 28 May 2016 01:41:40 UTC