Re: Feedback on Internet Media Types and the Web

From: Eric J. Bowman <eric@bisonsystems.net>
Date: Sat, 15 Jan 2011 00:34:05 -0700
To: Larry Masinter <masinter@adobe.com>
Cc: "www-tag@w3.org" <www-tag@w3.org>
Message-Id: <20110115003405.0508fcde.eric@bisonsystems.net>

You're right, Larry, in that nothing prevented RFC 4627 from defining
+json from the get-go, or prevents it being changed now.  But that
won't solve the registry failure this omission revealed, which I'd like
to see prevented from recurring as a matter of course.  Codifying the
existing +xml-analogous consensus would establish the categorization of
media types by meta-language suffix that the community expects from the
registry -- thereby providing Ned with the tools he's asked for to
maintain it, by forcing the issue of changing RFC 4627 and preventing
similar situations in the future.

Larry Masinter wrote:

>
> If they're not willing to do that work, then perhaps it isn't
> important that the +json MIME types actually be defined...
>

I can't chalk this up to the unwillingness of anyone to do the work
nobody ever told them needed doing.  If Ned had spent the past five
years requiring JSON-based applicants to use +json and pointing out that
RFC 4627 needed changing first, that work would've been done by now.

This advice to applicants is common both on and off the ietf-types list,
but Ned hasn't endorsed it, resulting in +json getting dropped from
applications as the path of least resistance to approval -- *that's* the
problem because it's the opposite of how the registry best serves its
architectural function (besides the fact that everyone assumes that's
how it already works).

Ned tells us he'd like to require +json, but feels his hands are tied
by RFC 4288, so the cause is more likely a lack of motivation than a
lack of willingness, and I think a lack of awareness is the result.
Regardless of cause, the effect is detrimental to the relevance of the
registry the architecture is dependent upon, and needs to be fixed in
the general case to prevent, say, YAML from making the same mistake
when it moves through the registration process, and so on into the
future.  I'm not blaming Ned, I'm advocating we not leave him any more
such messes to clean up after.

Each approval of a JSON type sans +json represents one less group
willing to do the work of changing RFC 4627, and one more registry entry
that can't be searched for by generic parsing model.  Or one less
registry entry, and one more +json not-a-media-type released into the
wild, where nobody can search for it unless its token is known.  The
+json issue is only an example of the registry failure which results
from leaving suffixes undefined -- changing RFC 4627 doesn't fix the
underlying problem.
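
To illustrate what "searching by generic parsing model" could look like,
here is a minimal sketch in Python.  It assumes the +xml-style reading of
suffixes discussed in this thread; the function names, the SAMPLE list and
the application/example types are hypothetical and not taken from any
registry tooling.

```python
from collections import defaultdict

def split_suffix(media_type):
    """Split a media type into (type, subtype, suffix).

    The suffix is None when the subtype carries no '+' marker.
    Parameters such as '; charset=utf-8' are ignored.
    """
    essence = media_type.split(";", 1)[0].strip().lower()
    top, _, subtype = essence.partition("/")
    base, plus, suffix = subtype.rpartition("+")
    if not plus:
        return top, subtype, None
    return top, base, suffix

def group_by_suffix(media_types):
    """Group media type names by their suffix (None for unsuffixed types)."""
    groups = defaultdict(list)
    for name in media_types:
        groups[split_suffix(name)[2]].append(name)
    return dict(groups)

# Illustrative entries only; the "example" types are made up.
SAMPLE = [
    "application/xhtml+xml",
    "application/atom+xml",
    "application/json",
    "application/example+json",  # hypothetical JSON-based type with a suffix
    "application/example",       # hypothetical JSON-based type without one
]

if __name__ == "__main__":
    for suffix, names in group_by_suffix(SAMPLE).items():
        print(suffix, names)
```

The unsuffixed hypothetical type lands in the None group, next to every
other type whose underlying syntax can't be inferred from its name, which
is the search problem described above.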

Ideally, when dealing with registries in general, this sort of pattern
should be identified early, so that over time fewer entries need to be
updated to obsolete old tokens in favor of new tokens.
This has happened with several prominent media types, with problematic
results.  So we should be proactive in keeping this from happening to
every proposed media type based on a non-XML meta-language, by fixing
the general registry failure +json illustrates before it's compounded.

Especially because it's a problem for which a consensus solution
already exists; solving it now will keep the suffix from travelling
down the same path to irrelevancy that text/* has in practice...

>
> The problem is that using something like "+json" without having
> a definition for what it means doesn't really help anyone except
> if you think that a MIME type is a fashion accessory and not a 
> protocol element with a definite meaning. 
> 

Exactly the point I've been making, to so much consternation, on rest-
discuss for a year and a half now.  One rebuttal was, "Where in AWWW
does it say that?"  So I see this issue as a relevant example of the
sort of thing a revised AWWW could clarify (especially since I pointed
to your draft as proof that media type omission was a known AWWW bug).
Hopefully, AWWW is updated to clarify the new standardization of
suffix usage, where it discusses proper media type creation and use,
which is why I think the issue belongs here -- these are orthogonal
concerns.

> 
> There's nobody home but us. So there's no one else other than the
> JSON-mime-type using community to define what "+json" means.
>

It's no more the place of the JSON folks to codify that MIME suffixes
mean what everyone now thinks they mean, than it is for the XML folks.
Such an effort would be open to charges of changing RFC 4288 to reflect
the needs of +json or +xml rather than the needs of +anything.  As Ned
points out, others have a different opinion, like +vendor -- those
folks need to have their say about generic suffix syntax, in a forum
outside those of +xml-ish suffix instantiations.

I believe it would make all the difference in the world if the JSON-
media-type-using community were told they need to define +json, instead
of being told they need to drop +json (like they are now) and try again.
Fix the process, and the definition of +json will duly follow.  Or in
this case, create the process.

>
> ...and perhaps they're just "mime-type-like" labels because there's
> nothing in particular you would or should do if you get a text/frob
> +json MIME body. Right? At least with XML there was some expectation
> of generic processing either with CSS or with XSLT.
> 

I'm not a JSON user (yet, perhaps), but I believe there is a generic
processing model as with XML.  I may not know the context or semantics
of a JSON document (same with XML), but I can still tell by looking at
it what's a real vs. integer number, vs. a string, boolean, array or
object.  As with XML, the document may be seen as a collection of
name-value pairs; typing is inherent and nonextensible in JSON, while
in XML it is an extensible option.  JSON is a subset of YAML, which
allows more complex typing in its generic processing model.
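
As a rough illustration of that generic processing model, here is a small
Python sketch that labels the values of a JSON document without knowing
anything about what the document means.  The function names and the labels
are my own choices for illustration, and the integer/real split follows
how Python's json module reads numbers, not anything RFC 4627 mandates.

```python
import json

def describe(value):
    """Return a generic type label for a parsed JSON value."""
    if isinstance(value, bool):  # test bool first: bool is a subclass of int
        return "boolean"
    if value is None:
        return "null"
    if isinstance(value, int):
        return "integer"
    if isinstance(value, float):
        return "real"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        return "array"
    if isinstance(value, dict):
        return "object"
    raise TypeError("not a JSON value: %r" % (value,))

def outline(text):
    """Parse JSON text and label each top-level member of an object."""
    doc = json.loads(text)
    if isinstance(doc, dict):
        return {name: describe(member) for name, member in doc.items()}
    return describe(doc)

if __name__ == "__main__":
    print(outline('{"id": 7, "ratio": 0.5, "tags": ["a"], "ok": true}'))
```

Nothing here depends on the media type the document was served with; the
labels come from the syntax alone.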

I think JSON and YAML are exactly the sort of languages the suffix
registry should target, as they may be extended to have semantic
meaning expressed as a data-type family identified by a media type in
exactly the same way as XML is extended to have HTML semantics using
application/xhtml+xml.  One way to put the suffix consensus which has
formed based on +xml usage is that the syntax parsing is invariant.
XSLT is definitely an argument in favor of XML, but I also believe such
useful tools will evolve for JSON... a JSON parser works with all JSON,
but only with JSON (eval() is looser, since it accepts any JavaScript)
-- XML parsers work with all well-formed XML, but only with well-formed
XML.
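
A minimal Python sketch of "the syntax parsing is invariant": the suffix
alone picks the generic parser, and the rest of the media type is left to
whoever understands its semantics.  This assumes the +xml-style reading of
suffixes, which the standards don't yet define for +json; it also treats
a bare subtype such as json or xml as its own suffix, which is my
assumption.  The GENERIC_PARSERS table, parse_generic and the
application/example+json type are hypothetical.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical mapping from suffix to a generic syntax-level parser.
GENERIC_PARSERS = {
    "json": json.loads,
    "xml": ET.fromstring,
}

def parse_generic(media_type, body):
    """Parse body with the generic parser implied by the media type suffix."""
    essence = media_type.split(";", 1)[0].strip().lower()
    subtype = essence.partition("/")[2]
    _, plus, suffix = subtype.rpartition("+")
    if not plus:
        # Assumption: application/json and application/xml name their own syntax.
        suffix = subtype
    parser = GENERIC_PARSERS.get(suffix)
    if parser is None:
        raise ValueError("no generic parser known for %r" % media_type)
    return parser(body)

if __name__ == "__main__":
    print(parse_generic("application/example+json", '{"a": 1}'))
    root = parse_generic("application/xhtml+xml",
                         '<html xmlns="http://www.w3.org/1999/xhtml"/>')
    print(root.tag)
```

A type with no suffix and no entry in the table simply can't be handled
this way, however JSON-like its body happens to be.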

>
> Are text/something+json bodies guaranteed to be in UTF8? Which version
> of JSON is used?  Is there any internal framing? 
> 

I don't think we need to solve those problems, so much as require those
problems to be solved, in the definition of the suffix registration
process.  As to text/ vs. application/, this thread is an interesting
read:

http://www.ietf.org/mail-archive/web/ietf-types/current/msg00988.html

It's also on-topic to the discussion of YAML registration in this post.
(Hi Julian, thanks for asking my same questions about registering XBEL).

>
> Defining +xml and what guarantees can be assumed by a +xml MIME
> type was very difficult. Without any guarantees (for a conforming
> implementation), then the adding extra protocol for +json isn't
> helpful, since it's meaningless except as a "gee we're cool"
> indicator.
>

I agree, these guarantees should be required for registering a suffix.
In 20/20 hindsight, defining +json should have been a requirement for
JSON, and in 20/20 foresight +yaml should be a requirement for YAML;
but they can't be until the mechanism is changed to require such
languages to define suffixes in their media type registrations,
addressing exactly the conformance concerns you raise.

Surely RFC 3023 was harder because it came first, and surely the
lessons learned would apply to changing RFC 4627, particularly if that
were the guidance of a revised RFC 4288?  The result could be a bunch
of meaningless-but-cool +json types coming in from the wild, which may
be what some of those creators were thinking when they decided to
forgo registration.

>
> I'm willing to put something in the document about this situation,
> but would like to get agreement on the nature of the problem,
> e.g., something like:
> 
> "People want to use suffix-type registration, in analogy to
> the +xml definition, for things like types based on ZIP or
> JSON, using +zip or +json respectively, but they're not willing to
> do the work to define what those suffixes mean."
> 
> and then go into some discussion of +json and +zip examples?
> 

Consensus already exists on what +anything means, based purely on
assumption, so it's time to point out the need to bring the standards
in line with that consensus (since it's common sense, with technical
merit and +xml as precedent).  I'd rather focus on the general problem
of bringing the standards in line with reality, where +suffix is
concerned, than address the +json situation, which I believe resulted
more from that deficiency than from laziness.  ;-)  Although laziness
is the mother of invention...

"People assume +suffix to be analogous to +xml's usage, but in reality
it is undefined in the standards and mostly discouraged in practice.
The inclusion of a suffix registry standardizing this assumed consensus
could prevent the registry failure evidenced by +json, based on the
unmet community expectation (arising from the success of +xml) that
suffixes categorize media types by common structural language."

Knowing what we now know, it would be shortsighted not to require that
+yaml be defined along the lines of RFC 3023 +xml.  Since there's no
mechanism for such a requirement, approving application/yaml without
+yaml would compound the problems arising from the existing registry
failure.  We can't point to existing YAML-based media types (assuming
we could search for any) and divine unwillingness to do the work of
creating +yaml, if RFC 4288 never told them that suffixes are used to
denote common structural languages shared by myriad media types, or
required the registration of generic-syntax languages to define and
register suffixes.

Nor can we fault Douglas Crockford for not including +json in RFC 4627
(nor the authors of RFC 4288 for not foreseeing the current state of
affairs).  But we can determine what recommendations he should have
been following, which would have led to the desired outcome where the
developer community is concerned: a number of +json types in the
registry growing at a rate that increases as the rate of +xml
registrations decreases, reflective of the times.  Those
recommendations could then be followed by both YAML and JSON,
separately from this effort but consistently with RFC 3023 -- by design
instead of assumption.

-Eric
Received on Saturday, 15 January 2011 07:34:44 GMT