RE: IETF RFC format <-> W3C pubrules from Leif Halvard Silli on 2012-05-01 (spec-prod@w3.org from April to June 2012)

From: Leif Halvard Silli <xn--mlform-iua@xn--mlform-iua.no>
Date: Tue, 1 May 2012 02:02:40 +0200
To: Larry Masinter <masinter@adobe.com>
Cc: "Tab Atkins Jr." <jackalmage@gmail.com>, Marcos Caceres <w3c@marcosc.com>, Doug Schepers <schepers@w3.org>, "spec-prod@w3.org" <spec-prod@w3.org>
Message-ID: <20120501020240533675.b3659743@xn--mlform-iua.no>
Larry Masinter, Sun, 29 Apr 2012 21:47:18 -0700:

> Almost all of the problems are "solvable" in some abstract sense.  
> The problem is that some of the potential solutions require 
> development and deployment of new infrastructure -- email archives, 
> for example.

What do e-mail archives have to do with RFCs? Is the intention to send 
RFCs as e-mail?

> Every new required feature adds additional compatibility 
> considerations.  And the issues when there are many legacy systems 
> are not abstract. There's lots we could theoretically do but which 
> would cause untold and unnecessary disruption because it doesn't work 
> with what's deployed. Surely you understand that.

There are text encoding problems with the current RFC system too.

First, there are the problems related to describing characters rather 
than typing them directly. This makes the RFCs difficult to understand, 
for instance when the text describes a non-ASCII letter instead of 
rendering it. Then there are descriptions such as 'below I type an 
'a', but you should consider it to be 'exotic letter x''.

Second, you do have encoding problems now as well. Take RFC2557, which 
contains at least one non-ASCII letter - É. 

* <http://www.rfc-editor.org/rfc/rfc2557.txt>
  includes the É because the file is served as ISO-8859-1.
* <http://tools.ietf.org/html/rfc2557#section-9.1> is served, via 
  HTTP, with the charset label 'latin-1'. That is an invalid
  label, so the page is only correct in Web browsers
  that default to Windows-1252.
* <http://www.rfc-editor.org/rfc/pdfrfc/rfc2557.txt.pdf>
  includes the É, for some reason.
* <http://datatracker.ietf.org/doc/rfc2557/?include_text=1>
  is served with the label UTF-8, but the É letter is still lost.

It seems going for UTF-8 everywhere would be simpler.
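
To make the mismatch concrete, here is a small Python sketch (my own 
illustration, not anything the RFC Editor or the IETF tools run). It 
only assumes that the letter is stored as the single ISO-8859-1 byte 
0xC9, and shows what readers with different charset assumptions get:

```python
# Illustration only: how the bytes for "É" fare under different charset assumptions.
text = "É"

latin1_bytes = text.encode("iso-8859-1")  # one byte: 0xC9
utf8_bytes = text.encode("utf-8")         # two bytes: 0xC3 0x89

# ISO-8859-1 bytes read as Windows-1252 still give É (both map 0xC9 to É).
as_cp1252 = latin1_bytes.decode("cp1252")

# The lone byte 0xC9 is not valid UTF-8, so a UTF-8 reader loses the letter
# and shows the replacement character U+FFFD instead.
as_utf8 = latin1_bytes.decode("utf-8", errors="replace")

# UTF-8 bytes read as ISO-8859-1 become two characters instead of one.
mojibake = utf8_bytes.decode("iso-8859-1")

print(repr(as_cp1252), repr(as_utf8), repr(mojibake))
```

If bytes and label were UTF-8 at every step, none of these branches 
would be needed.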

>  " Would it not be possible, in the RFC format, to start with a 
> allowing a  subset (that covers more than US-ASCII ...) of UNICODE?"
> 
> Yes, of course. And no one is opposed to doing so. The trick is 
> "which subset?".  To do that, you need some idea of what the 
> requirements are. 

If one requires non-ASCII, or whatever, to be both described - like 
now - and directly typed, then there would not need to be a 
restriction. I think e-mail should use UTF-8 as well. But to keep 
documents readable in case not every e-mail program supports UTF-8, 
one could require the plain text version to e.g. represent letters 
as numeric character references as well ...
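
As a rough sketch of what such a fallback could look like, the Python 
below turns every non-ASCII letter into a decimal character reference 
and leaves ASCII untouched. The function name is made up for this 
example; the only real API used is the standard 'xmlcharrefreplace' 
error handler of str.encode:

```python
def to_ascii_with_ncrs(text: str) -> str:
    """Return an ASCII-only copy of text, with each non-ASCII
    character written as a decimal numeric character reference."""
    return text.encode("ascii", errors="xmlcharrefreplace").decode("ascii")


# Hypothetical sample line; ASCII passes through unchanged.
sample = "The letter É is U+00C9"
print(to_ascii_with_ncrs(sample))
```

A UTF-8 capable reader would get the directly typed version; this form 
would only be the plain text fallback.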

When it comes to the font problem that you mentioned to Tab, it is 
possible to embed fonts via CSS. You could even regulate which letters 
are allowed based on what the embedded font covers. And an RFC which 
contains non-ASCII - or whatever - could come with a warning about what 
the requirements for reading it are.

> On Accessibility: Accessibility is not best measured by "average ease 
> of use".  Making a document more accessible (in the sense of making 
> it easier to understand) to most of the population, but also making 
> it less accessible (in the sense of making it IMPOSSIBLE to 
> understand) to a smaller subset may not improve accessibility, if you 
> gate is "What percentage of people can access the information". So 
> sure, adding linking to the beginning of a document letting them skip 
> the introductory info or mandatory or useless boilerplate might help 
> someone. But leaving material necessary to determine conformance 
> _only_ within non-accessible tables or figures seems like it 
> unnecessarily narrows the scope of users who can interpret the 
> document.

If 'figures' means 'illustrations' of various sorts, then HTML has 
various ways in which one can embed or link to e.g. a text version of 
the illustration:

* There is the @longdesc attribute, which takes a URL
  which, if it is a data URI, could contain a small document
* Instead of @longdesc, one can of course do the same with
  an anchor element.
* There is the <details> element, which allows you to do
  <details><summary><img src='the-illustration' /></summary>
  ASCII illustration or description goes here </details>
* There is the <object data=image> element, which in text
  browsers will render its fallback rather than the image
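
For the data URI variant, a minimal Python sketch of how a short text 
description could be packed into the @longdesc value. The helper names 
and the sample file name are invented for the example, and whether a 
given browser or tool actually follows a data URI in @longdesc is 
something I have not verified:

```python
from html import escape
from urllib.parse import quote, unquote


def text_data_uri(description: str) -> str:
    """Pack a plain-text description into a data: URI (percent-encoded UTF-8)."""
    return "data:text/plain;charset=utf-8," + quote(description, safe="")


def img_with_longdesc(src: str, alt: str, description: str) -> str:
    """Build an <img> tag whose longdesc carries the description inline."""
    return '<img src="{}" alt="{}" longdesc="{}">'.format(
        escape(src, quote=True),
        escape(alt, quote=True),
        text_data_uri(description),
    )


# Hypothetical illustration and description.
tag = img_with_longdesc("figure-1.png", "Message flow", "A -> B")
print(tag)

# The description can be recovered from the URI again.
recovered = unquote(text_data_uri("A -> B").split(",", 1)[1])
```

The same URI could just as well go into the href of an anchor element 
next to the image.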

When it comes to tables, there are many text browsers that 
support them, especially if you include the @border attribute ... E.g. 
the W3M browser - which is included in Emacs - does support them. OK: 
Lynx does not support tables unless you are - I think - quite rigid 
about how you make them. I forget the exact requirements, but it uses 
empty lines as 'borders', I think.

Are text browsers a goal?

Text browsers are an interesting topic, and one I looked into quite a 
bit during HTML5 ... It would be fun to help define a text 
browser-compatible subformat of HTML that could work - if that is what 
you are looking for. I'd be happy to help work out something like 
that.

Leif H Silli
Received on Tuesday, 1 May 2012 00:03:15 UTC