Re: internationalization issues from James M Snell on 2015-10-24 (public-socialweb@w3.org from October 2015)

From: James M Snell <jasnell@gmail.com>
Date: Fri, 23 Oct 2015 17:35:27 -0700
To: Ben <ben@thatmustbe.me>
Cc: Owen Shepherd <owen.shepherd@e43.eu>, elf Pavlik <perpetual-tripper@wwelves.org>, Harry Halpin <hhalpin@w3.org>, Sandro Hawke <sandro@w3.org>, Social Web Working Group <public-socialweb@w3.org>, Richard Ishida <ishida@w3.org>
Message-ID: <CABP7RbevBFnLrUv+Q3acW9nSMpBY00oznWA9t3oDXDKboMydSg@mail.gmail.com>
On Fri, Oct 23, 2015 at 2:01 PM, Ben <ben@thatmustbe.me> wrote:
> I have been working on ways to get AS2 compatible output from my site,
> and I don't think I can generate a valid context for the document as
> various terms I use on the site (some from microformats-2, some from
> IWC's wiki, and some my own vendor prefixes).
>

I'm not following. What do you mean by "generate a valid context for
the document"? Can you provide an example of the data that you're
working with?

> The other problem has been, as others have mentioned, that the various
> ways to represent a key in JSON-LD (aliasing being a primary one)
> means you cannot just ignore the context as I have been told.  If you
> just ignore context or only process part of it, you can easily end up
> with values that appear to have different keys when they are not.
>

Not quite accurate. The facts that (a) AS 2.0 requires a minimal
normative JSON-LD @context that can be extended but not overridden
and (b) JSON-LD compact form is required together mean that for
everything defined as part of the *core* vocabulary in AS 2.0, you can
rely on a consistent JSON serialization without worrying about the
various complexities of the @context and JSON-LD processing. It's only
when you get into the extensions that you need to worry about @context
processing, and that is largely nothing more than building up a map of
prefix to term mappings.

This means, for instance, that "displayName" will *always* be
"displayName", "object" will always be "object". It is invalid for an
implementation to make any modification to the @context that would
cause any of these expected field names to be altered.

Let's look at an example:

{
  "@context": "http://www.w3.org/ns/activitystreams",
  "@type": "Note",
  "displayName": "This is the title",
  "content": "This is the content"
}

The fact that JSON-LD allows aliasing in the @context does *not*
change how this particular AS 2.0 object can be rendered. In other
words, while the following example is perfectly valid JSON-LD and
would expand out to the exact same RDF model as the above example, it
is **NOT** valid AS 2.0:

{
  "@context": [
    "http://www.w3.org/ns/activitystreams",
    {
      "foo": "as:displayName",
      "bar": "as:content"
    }
  ],
  "foo": "This is the title",
  "bar": "This is the content"
}

If an AS 2.0 implementation receives the latter document, it would be
perfectly within its rights to reject it as being invalid. It simply
does not matter that it is valid JSON-LD.

The same thing goes for the values of properties like @type. It is not
valid for an implementation to override the mapping for the key terms
like "Note". When you see the simple string "Note" as the value of
@type, it MUST **always** map to
http://www.w3.org/ns/activitystreams#Note. Anything different from
that would not be a valid Activity Streams 2.0 document.
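
To make the rule concrete, here is a rough sketch of how a receiver
might check an inline @context for attempts to alias or redefine core
terms. CORE_TERMS is a small illustrative subset rather than the full
vocabulary, the function name is made up, and the sketch assumes the
"as" prefix is bound to the Activity Streams namespace, as in the
example above:

```python
AS_NS = "http://www.w3.org/ns/activitystreams#"

# Illustrative subset only; a real check would cover the whole core vocabulary.
CORE_TERMS = {"displayName", "content", "object", "Note"}


def aliases_core_term(context):
    """Return True if an inline context redefines or aliases a core term."""
    entries = context if isinstance(context, list) else [context]
    for entry in entries:
        if not isinstance(entry, dict):
            # A string is a reference to an external context; not inspected here.
            continue
        for term, value in entry.items():
            if term in CORE_TERMS:
                # e.g. {"Note": "http://example.org/SomethingElse"}
                return True
            if isinstance(value, str):
                # Assumption: "as:" is the prefix for the AS 2.0 namespace.
                iri = AS_NS + value[3:] if value.startswith("as:") else value
                if iri.startswith(AS_NS) and iri[len(AS_NS):] in CORE_TERMS:
                    # e.g. {"foo": "as:displayName"}
                    return True
    return False


invalid_context = [
    "http://www.w3.org/ns/activitystreams",
    {"foo": "as:displayName", "bar": "as:content"},
]
print(aliases_core_term(invalid_context))
```

An implementation could use a check like this to reject the document
outright, or simply ignore the offending entries.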

The following, however, is an example of a valid use of the JSON-LD @context:

{
  "@context": [
    "http://www.w3.org/ns/activitystreams",
    {
      "foo": "http://example.org/foo#",
      "bar": "foo:bar",
      "Thing": "foo:Thing"
    }
  ],
  "@type": "Thing",
  "displayName": "This is the title",
  "content": "This is the content",
  "bar": "this is an extension"
}

In this case, an implementation that does absolutely no JSON-LD
processing can look at this and easily determine that extensions are
being used. They don't have to perform JSON-LD expansion on the
@context in order to determine that. The "Thing" is just an
unrecognized term. We know for a fact that it cannot be just an alias
for one of the core terms in the vocabulary because the spec forbids
implementations from aliasing any of those core terms. The non-JSON-LD
aware implementation can also immediately determine that "bar" is an
extension property because it's not part of the core vocabulary and
cannot possibly be an alias. The non-JSON-LD aware implementation can
look for the specific properties it is aware of and be assured that
those are not magically mapped to some other meaning.
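
As a minimal sketch of that "no JSON-LD processing" approach, the
following separates the properties a consumer knows about from
everything else without ever looking inside the @context. The sets
CORE_PROPERTIES and CORE_TYPES are small hypothetical subsets chosen
for this example, and the function names are made up:

```python
# Hypothetical subsets of the core vocabulary, for illustration only.
CORE_PROPERTIES = {"@context", "@id", "@type", "displayName", "content", "object"}
CORE_TYPES = {"Note"}


def split_properties(doc):
    """Partition a parsed object into (core, extension) properties by key."""
    core = {k: v for k, v in doc.items() if k in CORE_PROPERTIES}
    extensions = {k: v for k, v in doc.items() if k not in CORE_PROPERTIES}
    return core, extensions


def uses_extension_type(doc):
    """True if any @type value is not a recognized core type name."""
    types = doc.get("@type", [])
    if isinstance(types, str):
        types = [types]
    return any(t not in CORE_TYPES for t in types)


doc = {
    "@context": [
        "http://www.w3.org/ns/activitystreams",
        {"foo": "http://example.org/foo#", "bar": "foo:bar", "Thing": "foo:Thing"},
    ],
    "@type": "Thing",
    "displayName": "This is the title",
    "content": "This is the content",
    "bar": "this is an extension",
}

core, extensions = split_properties(doc)
print(sorted(extensions))        # keys the consumer does not recognize
print(uses_extension_type(doc))  # whether @type names something outside the core
```

The consumer reads what it needs from the core properties and is free
to ignore the rest.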

Now, if an implementation wishes to pay attention to the JSON-LD
@context but does not want to implement the full JSON-LD processing
model, then the minimal amount of processing that is required is
building a map of prefix values to terms. It can be a bit complicated
due to the fact that the @context can be a reference to an external
document that needs to be fetched before it is processed, but
essentially what you're working with is an array of simple maps.
Initialize that with a mapping of the core terms. Iterate through each
additional context document. If a term attempts to remap one of the
core terms, either ignore it or throw an error. You'll end up with a
single map whose keys are aliases for a URI value. Then, when you look
at the remaining JSON in the document, field names are the keys in
that map, and the value of certain fields might be keys in that map
(fields like @type for instance). If the key is in the map, you use
the associated URI value; if it's not, you use the fallback "blank
node" value.
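
The steps above can be sketched roughly as follows. This is an
illustration under several assumptions, not a full JSON-LD
implementation: CORE_MAP is a tiny hypothetical subset of the core
terms, external context documents are skipped rather than fetched,
prefixes must be declared before they are used, and the "_:" form
used for the blank node fallback is just one way to represent it:

```python
AS_NS = "http://www.w3.org/ns/activitystreams#"

# Hypothetical subset; a real implementation would load the full core context.
CORE_MAP = {
    "Note": AS_NS + "Note",
    "displayName": AS_NS + "displayName",
    "content": AS_NS + "content",
}


def expand(value, term_map):
    """Expand 'prefix:suffix' using prefixes already in the map."""
    prefix, sep, suffix = value.partition(":")
    if sep and not suffix.startswith("//") and prefix in term_map:
        return term_map[prefix] + suffix
    return value  # already an absolute URI, or an unknown prefix


def build_term_map(context, strict=False):
    """Build a single term -> URI map, starting from the core terms."""
    term_map = dict(CORE_MAP)
    entries = context if isinstance(context, list) else [context]
    for entry in entries:
        if not isinstance(entry, dict):
            continue  # external context reference; fetching is out of scope here
        for term, value in entry.items():
            if term.startswith("@") or not isinstance(value, str):
                continue  # keywords and expanded term definitions are skipped
            if term in CORE_MAP:
                if strict:
                    raise ValueError("context remaps core term: " + term)
                continue  # otherwise ignore the attempted remapping
            term_map[term] = expand(value, term_map)
    return term_map


def resolve(name, term_map):
    """Look up a field name or @type value, falling back to a blank node."""
    return term_map.get(name, "_:" + name)


context = [
    "http://www.w3.org/ns/activitystreams",
    {"foo": "http://example.org/foo#", "bar": "foo:bar", "Thing": "foo:Thing"},
]
term_map = build_term_map(context)
print(resolve("Thing", term_map))
print(resolve("displayName", term_map))
print(resolve("unknown", term_map))
```

Whether to ignore a remapping or raise an error is left to the
caller through the strict flag, mirroring the choice described above.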

But here's the thing: everything in the previous paragraph is optional
and only comes into play when you're using extensions to the core
vocabulary. Implementations can easily work around this through
consistent use of extensions. In other words, extensions can be based
on their own normative JSON-LD @contexts that always render
consistently. The fact that AS 2.0 uses exactly the same must-ignore
extension model as Atom means that AS 2.0 is not doing anything
special, unusual, new or different here. You pay attention to the
things you care about and ignore everything else.

Establishing the default language context using the @context property
is currently the ONLY thing for which you would absolutely need to
rely on the @context, because JSON-LD does not provide any other way
for that to happen (which is something I'd like to see fixed in the
next version of JSON-LD).
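
Even that does not require a JSON-LD processor. A minimal sketch,
assuming the default language is declared with "@language" in an
inline context object (external context documents are not fetched
here, and the function name is made up):

```python
def default_language(doc):
    """Return the @language declared in an inline @context, or None."""
    context = doc.get("@context")
    entries = context if isinstance(context, list) else [context]
    language = None
    for entry in entries:
        if isinstance(entry, dict) and "@language" in entry:
            # Later context entries take precedence over earlier ones.
            language = entry["@language"]
    return language


doc = {
    "@context": [
        "http://www.w3.org/ns/activitystreams",
        {"@language": "en"},
    ],
    "@type": "Note",
    "displayName": "This is the title",
}
print(default_language(doc))
```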

The bottom line to all this is simple: it is possible (I know because
I've done it many times) to use AS 2.0 without being forced to process
the JSON-LD @context. There are benefits to doing that additional
processing *but it is not required*.

> One way forward I would suggest is that we drop conformity to JSON-LD
> from the spec.  So long as it can be converted in a non-lossy way to
> JSON-LD, it should be an easy task for those wishing to use it as
> such.  This would free us of any limitations of the JSON-LD spec.
>
> On Fri, Oct 23, 2015 at 3:35 PM, Owen Shepherd <owen.shepherd@e43.eu> wrote:
>> On Fri, Oct 23, 2015 at 9:41 AM elf Pavlik <perpetual-tripper@wwelves.org>
>> wrote:
>>>
>>> On 10/22/2015 07:46 PM, Harry Halpin wrote:
>>> > However, off the top of my head there's no reason why we can't just say
>>> > in AS2.0 that we can look at the @language in @context but if there's no
>>> > @context, just look at a "language" tag. That's typically how I've seen
>>> > it in the wild and a lot more intuitive than @context and @language for
>>> > non-JSON-LD parsers (i.e. the majority of parsers).
>>> IMO all software supporting AS2.0 MUST have support for some basic
>>> processing of JSON-LD context. Otherwise it will have *no support* for
>>> vocabulary terms outside of very limited AS2.0 Vocabulary. For example
>>> no support for email addresses, birthdays or basic data about spoken
>>> languages in social profiles etc.
>>
>>
>> If you mean "support for basic processing of JSON-LD context" that's anything
>> greater than "reading the @language property", then -1000.
>>
>> If you say "You need to implement this huge JSON-LD algorithm", most
>> programmers will rightfully say "screw this" and do something else. If you
>> say "You need to go read the JSON-LD spec", then most programmers will say
>> "screw this".
>>
>> Mandating JSON-LD support is the way to oblivion. I thought ActivityStreams
>> was supposed to be a *profile* of JSON-LD?
>>
>> As far as I'm concerned, any document which requires context processing more
>> advanced than reading certain fixed properties (e.g. @language) should not
>> be considered ActivityStreams. If you want to use full JSON-LD, go right
>> ahead - call it application/ld+json. If you want to interoperate with
>> ActivityStreams implementations, use "application/activity+json", which
>> implies a default context (the ActivityStreams context, which may be a
>> "living vocabulary"*) with a fixed, user-friendly normalization.
>>
>> The goal of JSON-LD, after all, was to let you use linked data tools on
>> mostly arbitrary JSON, right? Why can our spec not be in the "nice, mostly
>> arbitrary JSON" subset, without any JSON-LD weirdness?
>>
>> For graph-structured data (like a social graph!), a graph format (RDF/Linked
>> Data) is great, but most apps don't need it and most programmers aren't
>> familiar with it.
>>
>> If the group gives up on the "plain JSON" serialization format, you might as
>> well just encode everything in Turtle. You'll attract just as many users and
>> avoid all the what-JSON-format bikeshedding.
>>
>>     Owen
>>
>> * Expanding the default context may cause some interoperability issues
>> between old and new versions of the spec (e.g. a canonical URI changing from
>> http://foo/bar#baz to CURIE bar:baz), but in reality they should be small
>> and can probably be worked around in a reasonably easy manner
Received on Saturday, 24 October 2015 00:36:18 UTC