Re: Rewording Algorithms and Script Types

Jim Jewett writes:

> ... algorithms ... needlessly complex ... Some of them are phrased to
> avoid an early exit.
> 
> To make this more concrete ... the method for determining a script's
> type seems to read backwards:
> 
> http://www.whatwg.org/specs/web-apps/current-work/#running0

I can see what you mean about step 1 of that algorithm seeming to read
backwards (and how that may be a problem), but I don't see in what way
this is "to avoid an early exit"; both the original and your rewording
of it have the form 'if, otherwise-if, otherwise', without any exiting
in any of them.

Please could you give a few more examples of algorithms which suffer
from this generic 'early exit' problem -- that should make it clearer
what you're concerned about.

> The only way I could understand it was to reword it mentally, and then
> it looked incomplete.

I agree; the case where there is neither a type nor a language
attribute isn't specified.  I think that could be fixed by changing the
first paragraph to:

  If the script element has a type attribute and its value is the empty
  string, or it has no type attribute and either it has no language
  attribute or it has a language attribute and the language attribute's
  value is the empty string, then let the script's type for this script
  element be "text/javascript".
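
To check that the condition covers every case, here it is as a rough
Python predicate.  This is only a sketch: the function name, the use of a
dict for the element's attributes (a missing key meaning the attribute is
absent) and the key names "type" and "language" are all my assumptions
for illustration, not anything the spec defines.

```python
def uses_default_type(attrs):
    """Return True when the paragraph's condition for the default holds.

    attrs is a hypothetical dict of attribute name -> value; a missing
    key means the script element does not have that attribute.
    """
    has_type = "type" in attrs
    has_language = "language" in attrs
    return (
        # type attribute present, with the empty string as its value
        (has_type and attrs["type"] == "")
        # no type attribute, and the language attribute absent or empty
        or (not has_type and (not has_language or attrs["language"] == ""))
    )
```

The case that was previously unspecified, an element with neither
attribute, is the empty dict, for which the predicate is true.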

> I would reword it to similarly to whichever of the following is
> intended:
> 
> If the script element has a type attribute, this determines the
> Script's Type.
>     If the type attribute's value is the empty string, then the
> Script's Type is "text/javascript".
>     Otherwise, the Script's Type is the value of the type attribute.
> 
> Otherwise, if the script element has a language attribute, this will
> determine the Script's Type.
>     If the value is the empty string, then the Script's Type is
> "text/javascript".
>     Otherwise, the Script's Type is "text/" + the value of the
> language attribute.
> 
> Otherwise (if the script has neither a type attribute nor a language
> attribute):
>     ?  Assume "text/javascript" or raise an Error?

Going by the spirit of the spec, and the general aim of HTML5
user-agents being compatible with the web as it currently exists, I'm
sure the intent in this situation is for user-agents to treat the
script as JavaScript, and the spec should be fixed to say so.

Overall, your suggested wording has a couple of disadvantages compared
with the wording currently in the spec:

* There are three separate places which can result in text/javascript
  being used as a default, rather than having all the conditions that
  can result in this being grouped together.

* Your conditions are nested, as indicated with indentation in your
  message.  Indeed, it is _only_ the indentation which shows the first
  "otherwise" applies to the nested subcondition about type being the
  empty string whereas the "otherwise" on the very next line refers to
  the top-level condition on having a type attribute at all.  As such,
  it would be nearly impossible to unambiguously interpret the algorithm
  without the indentation, for instance if it were being read out loud
  to you.
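
Both points show up if the nested wording is transcribed fairly literally
into Python.  This is a sketch under assumptions of mine: the function
name and the dict of attributes (missing key meaning no such attribute)
are invented for illustration, and the final branch assumes the
"text/javascript" default, which your wording leaves as an open
question.

```python
def script_type_nested(attrs):
    """Nested form of the algorithm, following the quoted wording.

    attrs is a hypothetical dict of attribute name -> value; a missing
    key means the script element does not have that attribute.
    """
    if "type" in attrs:
        if attrs["type"] == "":
            return "text/javascript"  # default, first place
        else:
            return attrs["type"]
    elif "language" in attrs:
        if attrs["language"] == "":
            return "text/javascript"  # default, second place
        else:
            return "text/" + attrs["language"]
    else:
        # Neither attribute: assumed default, third place.
        return "text/javascript"
```

The default string appears three times, and only the indentation tells
you which "if" each "else" belongs to.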

Note I'm not claiming that these disadvantages make your text worse than
the spec's current wording!  I'm merely pointing them out as things to
be considered when deciding which wording to use, and in the hope we can
come up with even better wording.  Hmmm, how about:

  Let <var>type</var> be "text/javascript".

  If the script element has a type attribute whose value is not the
  empty string then let <var>type</var> be the value of that attribute.

  Otherwise, if the script element has no type attribute (not even one
  whose value is the empty string) but does have a language attribute
  whose value is not the empty string then let <var>type</var> be the
  concatenation of the string "text/" and the value of the language
  attribute.

  Let the script's type for this script element be <var>type</var>.
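
For comparison, that wording might be sketched in Python roughly as
follows.  Again the function name and the dict representation of the
element's attributes (a missing key meaning the attribute is absent) are
assumptions of mine for illustration only.

```python
def script_type(attrs):
    """Return the script's type for a script element.

    attrs is a hypothetical dict of attribute name -> value; a missing
    key means the script element does not have that attribute.
    """
    # Start from the default, so it is stated in exactly one place.
    type_ = "text/javascript"

    if attrs.get("type"):
        # A type attribute whose value is not the empty string.
        type_ = attrs["type"]
    elif "type" not in attrs and attrs.get("language"):
        # No type attribute at all (not even an empty one), but a
        # language attribute whose value is not the empty string.
        type_ = "text/" + attrs["language"]

    return type_
```

Here the default is written once and the conditions are a flat
if/otherwise-if chain, so nothing depends on indentation to be read
correctly.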

Further, the context of this algorithm suggests it's only necessary for
user-agent implementers to read it, not authors (which I think would be
a good thing -- authors wanting to write valid HTML5 don't need to be
aware of things like the language attribute, only included for backwards
compatibility).  But outside the algorithm type is merely defined as:

  The language of the script may be given by the type attribute. If the
  attribute is present, its value must be a valid MIME type, optionally
  with parameters. [RFC2046]

So an author knows that it is valid to omit type, but not what that
signifies.  It would be good to make this explicit, without requiring
the reader to work through the algorithm, for example by replacing the
above with:

  The language of the script may be given by the type attribute, which
  specifies a MIME type.  The default language is ECMAScript [with a
  link to the 'Scripting languages' subsection].  For a script written
  in any other language its MIME type must be specified with the type
  attribute.

[If you reply to this mail about only either the generic issue of
algorithms _or_ specifically script types then please remember to edit
t'other one out of the Subject: header.]

Smylers

Received on Tuesday, 7 August 2007 16:55:38 UTC