Re: extension model of RSS/Atom (ISSUE-16 discussion) from James M Snell on 2015-04-07 (public-socialweb@w3.org from April 2015)

From: James M Snell <jasnell@gmail.com>
Date: Tue, 7 Apr 2015 11:46:13 -0700
To: Erik Wilde <dret@berkeley.edu>
Cc: "public-socialweb@w3.org" <public-socialweb@w3.org>
Message-ID: <CABP7Rbft+q_vnDwE3bKbP9w6CmF2Fc-OKjpSJTyFzHcbO-x32Q@mail.gmail.com>

Several key points:

The RELAX NG schema in RFC 4287 is *non-normative*.

RFC 4287, Section 1.3: "Some sections of this specification are
illustrated with fragments of a non-normative RELAX NG Compact schema
[RELAX-NG].  However, the text of this specification provides the
definition of conformance.  A complete schema appears in Appendix B."

RFC 4287, Appendix B: "This appendix is informative."

The normative definition of "foreign markup" is given in text in
Sections 6.2 and 6.3.

6.2:

"The Atom namespace is reserved for future forward-compatible
revisions of Atom.  Future versions of this specification could add
new elements and attributes to the Atom markup vocabulary.  Software
written to conform to this version of the specification will not be
able to process such markup correctly and, in fact, will not be able
to distinguish it from markup error.  For the purposes of this
discussion, unrecognized markup from the Atom vocabulary will be
considered "foreign markup"."

6.3:

"Atom Processors that encounter foreign markup in a location that is
legal according to this specification MUST NOT stop processing or
signal an error.  It might be the case that the Atom Processor is able
to process the foreign markup correctly and does so.  Otherwise, such
markup is termed "unknown foreign markup".

When unknown foreign markup is encountered as a child of atom:entry,
atom:feed, or a Person construct, Atom Processors MAY bypass the
markup and any textual content and MUST NOT change their behavior as a
result of the markup's presence.

When unknown foreign markup is encountered in a Text Construct or
atom:content element, software SHOULD ignore the markup and process
any text content of foreign elements as though the surrounding markup
were not present."

The Atom spec does go on to describe "simple" vs. "structured"
extensions (Section 6.4).

What this says is that extensions can appear throughout an Atom
document but must be ignored when not supported. Atom does distinguish
between simple and structured extensions, but that distinction is
largely a parsing concern, not a processing concern. The only
*processing* requirement for extensions is the must-ignore rule.
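
To make the must-ignore rule concrete, here is a minimal sketch of an
Atom consumer in Python. It is only an illustration under stated
assumptions: the set of "known" elements, the function name read_entry,
and the sample entry with its ex: namespace are all hypothetical, and
the text handling is one possible reading of the Section 6.3 guidance,
not a reference implementation.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"

# Hypothetical subset of atom:entry children this processor understands.
KNOWN_ENTRY_CHILDREN = {"id", "title", "updated", "summary"}


def read_entry(xml_text):
    """Return the known fields of an atom:entry, bypassing foreign markup."""
    entry = ET.fromstring(xml_text)
    fields = {}
    for child in entry:
        # ElementTree spells qualified names as "{namespace}local".
        if not child.tag.startswith("{" + ATOM_NS + "}"):
            continue  # markup from another namespace: bypass, never raise
        local = child.tag[len(ATOM_NS) + 2:]
        if local not in KNOWN_ENTRY_CHILDREN:
            continue  # unrecognized Atom-namespace markup is foreign too
        # itertext() keeps the text of nested foreign elements while
        # dropping the surrounding markup.
        fields[local] = "".join(child.itertext())
    return fields


# Illustrative entry; ex:rating and futureElement are made-up extensions.
SAMPLE = """<entry xmlns="http://www.w3.org/2005/Atom"
       xmlns:ex="http://example.org/ext">
  <id>urn:example:1</id>
  <title>Hello <ex:em>world</ex:em></title>
  <updated>2015-04-07T18:46:13Z</updated>
  <ex:rating>5</ex:rating>
  <futureElement>ignored</futureElement>
</entry>"""

print(read_entry(SAMPLE))
```

Note that nothing here depends on whether an extension is "simple" or
"structured"; the consumer skips it either way.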

The current Activity Streams 2.0 spec says this:

"In Activity Streams 2.0, an "extension" is any property not defined
by the Activity Vocabulary. Consuming implementations that encounter
unfamiliar extensions must not stop processing or signal an error and
must continue processing the items as if those properties were not
present. Support for specific extensions can vary across
implementations and no normative processing model for extensions is
provided.

While consuming implementations are not required to use the standard
JSON-LD Processing Algorithms [JSON-LD-API], it is important to note
that the algorithms, as currently defined, will silently ignore any
property that is not defined in a JSON-LD @context. Implementations
that publish Activity Streams 2.0 documents that contain extension
properties should provide a @context definition of those extensions."

I would argue that the processing semantics here are identical to
those used in Atom: if there's an extension you do not understand,
ignore it. JSON and JSON-LD handle the structural concerns, so there's
no need for AS2 to delve into the "simple" vs. "structured" question.
AS2 also provides a non-normative RDF model for anyone who wants to do
something at a higher level.
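
The same rule on the AS2 side could be sketched like this in Python,
using plain JSON parsing rather than the JSON-LD algorithms. The
SUPPORTED set, the consume function, and the "ex:mood" extension
property are assumptions made up for illustration, and the sketch only
filters top-level properties.

```python
import json

# Hypothetical subset of properties this consumer chooses to handle.
SUPPORTED = {"@context", "actor", "object", "published"}


def consume(doc_text):
    """Parse an AS2 document and keep only the properties we understand.

    Unfamiliar properties are dropped silently; they never cause an error.
    """
    doc = json.loads(doc_text)
    return {key: value for key, value in doc.items() if key in SUPPORTED}


# Illustrative document; "ex:mood" is a made-up extension property whose
# prefix is declared in the @context, as the AS2 text recommends.
SAMPLE = json.dumps({
    "@context": [
        "http://www.w3.org/ns/activitystreams",
        {"ex": "http://example.org/ext#"},
    ],
    "actor": "http://example.org/alice",
    "object": "http://example.org/notes/1",
    "published": "2015-04-07T18:46:13Z",
    "ex:mood": "curious",
})

print(sorted(consume(SAMPLE)))
```

A consumer that does understand "ex:mood" would simply add it to its
own supported set; nothing else about the processing changes.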

On Tue, Apr 7, 2015 at 11:06 AM, Erik Wilde <dret@berkeley.edu> wrote:
> hello.
>
> circling back to ISSUE-16 and james' claim that RSS/Atom had simple
> mustIgnore rules and AS2 is doing that already by simply and silently
> relying on JSON's generic extensibility capabilities. this is not really how
> it is.
>
> Atom actually was careful in defining specific extensibility both on the
> schema as well as on the processing level. The schema clearly defines where
> extensions should be expected. Atom uses RELAX NG for that which has pretty
> good capabilities. XSD would have been a different option. but either way,
> Atom *does* explicitly define how producers are allowed to add extensions,
> and how consumers should expect extensions.
>
> at the processing model level, http://tools.ietf.org/html/rfc4287#section-6
> defines how those extensions are to be processed, and what's expected as
> behavior from implementations. Atom even defines a lightweight "infoset" by
> defining atom markup and foreign markup and how they should be treated.
>
> to me, those two components (the well-defined extension model defined in
> RELAX NG and the section on how to process those extensions) are exactly
> what we are missing.
>
> as a historical note: Atom put quite a bit of effort into this because RSS
> was not all that well-defined (whatever version of RSS you're talking
> about), and one lesson of that was that implementations were behaving in
> unpredictable and sometimes unfortunate ways. RSS at that time was a huge
> ecosystem, and i think it would be good to look at the lessons learned back
> then, and make sure we're not making the same mistakes again.
>
> cheers,
>
> dret.
>
> --
> erik wilde | mailto:dret@berkeley.edu  -  tel:+1-510-2061079 |
>            | UC Berkeley  -  School of Information (ISchool) |
>            | http://dret.net/netdret http://twitter.com/dret |
>
Received on Tuesday, 7 April 2015 18:47:05 UTC