HTML/XML Task Force Minutes 11 Jan 2011

[ Thank you, Noah, for scribing. ]



                                   - DRAFT -

                              HTML/XML Task Force

Meeting 3, 11 Jan 2011


   See also: IRC log


   Present
           Norm, John, Yves, Michael Champion, Michael Kay, Noah, Henri

   Regrets
           James, Anne

   Chair
           Norm Walsh

   Scribe
           Noah Mendelsohn, NM


     * Topics

         1. Administrivia
         2. Use case #3, islands of HTML5-marked prose
         3. Use case #4, HTML document with islands of XML

     * Summary of Action Items



  Administrivia

   NW: Next call will be in a week, on 18 January. Any regrets?


   topics: Use cases

   <hsivonen> Use case email was

   NW: I've been somewhat out of touch, but have seen at least two
   interesting email threads: 1) xml in feeds and 2) how to detect html5

   JC: XML or HTML?

   NW: Well, some thread subjects said XML

   <hsivonen> we covered use cases 1 and 2. We didn't cover 3 and 4

  Use case #3, islands of HTML5-marked prose

   From the email description of the use case:

   3. I have an XML document and I want to embed islands of human prose
   marked up with HTML5 in it because I want to be able to extract
   those sections for use in, for example, documentation.

   JC: In that environment, we don't have an HTML5 DOM, I think, so we don't
   have to deal with inconsistent DOMs

   NW: Yes, mainly XML tools for this case.

   JC: What limitations are there on HTML5? E.g., I know about noscript.

   NW: (missed something about semantics) I was thinking about things like
   HTML5 rules that automatically add namespaces to SVG, and that won't
   happen in an XML toolchain.

   JC: The XHTML5 elements mean the same as their like-named counterparts in
   HTML5, with the exception of NOSCRIPT

   HS: Yes, and also ISINDEX

   NW: Why?

   HS: Those are both sort of parser-managed things on the HTML side. ISINDEX
   is sort of a parser macro: it expands into other elements, and it is
   invalid in HTML5. NOSCRIPT depends on the context.

   JC: How it's parsed depends on whether you have scripting.

   NW: Thanks, good to know. Sounds like it's safe to set aside ISINDEX. Less
   sure about NOSCRIPT, but likely at worst a minor problem.

  Use case #4, HTML document with islands of XML

   From the use case email:

   4. I have an HTML5 document and I want to embed islands of XML in it
   because I want to be able to write JavaScript and CSS to manipulate
   those elements, for example, in the browser.

   NW: The HTML5 parser won't do the same thing as XML would if the element
   names are in the HTML5 language.
   ... I believe that the only workaround is to put the XML in a <SCRIPT>
   element, which gives you the XML in an escaped node.

   MK: Or download the XML separately.

   HS: The text node will have the text unescaped.

   NW: Oh, OK, yes. If serialized then escaped, but in the node it's not.

   NM: The XML need not be for manipulation only in Javascript/CSS, you may
   also or instead want to manipulate it in XML (or HTML) tools at the
   server, or conceivably elsewhere on a client.
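
   A minimal sketch of the <SCRIPT> data-block workaround as seen from a
   server-side tool, assuming Python's standard html.parser and
   xml.etree.ElementTree modules. The page content, the class name
   XmlIslandExtractor and the choice of type="application/xml" are
   illustrative assumptions, not something agreed on the call. It shows that
   the script text arrives verbatim (character references are not decoded
   by the HTML parser) and can then be handed to an XML parser.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET


class XmlIslandExtractor(HTMLParser):
    """Collects the raw text of <script type="application/xml"> blocks."""

    def __init__(self):
        super().__init__()
        self.islands = []    # one raw string per data block
        self._chunks = None  # a list while inside a matching block, else None

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/xml":
            self._chunks = []

    def handle_data(self, data):
        # The parser may deliver script text in several pieces, so accumulate.
        if self._chunks is not None:
            self._chunks.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._chunks is not None:
            self.islands.append("".join(self._chunks))
            self._chunks = None


# Hypothetical page; the XML island is illustrative only.
PAGE = """<html><body>
<script type="application/xml"><doc><item>1 &lt; 2</item></doc></script>
</body></html>"""

extractor = XmlIslandExtractor()
extractor.feed(PAGE)
extractor.close()

raw = extractor.islands[0]   # script text exactly as written in the page
root = ET.fromstring(raw)    # the XML parser, not the HTML parser, decodes &lt;
item_text = root.find("item").text
```

   Here the HTML side only transports the text; all XML rules (namespaces,
   well-formedness, entity handling) are applied by the XML parser.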

   HS: The script element trick works for all languages, so XML is being
   treated as a special case.

   NM: Yes, and there are arguments pro and con as to whether that makes
   sense. HTML and XML have a long history together, and this task force is
   focused on exploring synergies.

   JC: Just use XHTML?

   NM: Yes, but we always get back to the huge install base that runs best
   with text/html

   <darobin> "The script element allows authors to include dynamic script and
   data blocks in their documents. The element does not represent content for
   the user."

   NW: I find the uniformity of treatment of all languages by NOSCRIPT to be

   <Norm> I'm not sure I went so far as to say that I found it appealing, but

   NM: So, I'm a little troubled by the fact that <SCRIPT> tags have mandated
   processing in the case there's a script there. What if the script is media
   type application/xml?

   JC: Not troubled by that. You'll use something like application/xslt+xml
   if you want your XML interpreted as (in this example) an XSLT script.
   ... Historically, media type is what to do with it, not what it is.

   NM: I strongly disagree with that.

   JC: Oh, I mean in HTML

   NM: Specifically on the SCRIPT tag

   NM: I'd prefer to associate the processing rules with the spec for the
   SCRIPT tag

   JC: What does the HTML5 spec say?

   HS: I agree with Noah that in principle there's an architectural issue; in
   practice the set of languages supported in browsers is small and slowly
   growing. So far none in XML. If necessary, any such new XML scripting
   language could get a more specific type.

   Speaking for myself: OK, maybe the HTML5 spec should say what Henri just

   JC: XSLT?

   HS: They don't support it in <SCRIPT>, and it doesn't make much sense to
   do so.

   JS: I understand this isn't likely to happen, but not sure why it wouldn't
   make sense.

   HS: Script processing starts when end tag </script> is parsed, and you
   only have a partial DOM. Seems not to make sense to do XSLT then. Hmm, but
   a DEFER script could make sense I guess.
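
   The partial-DOM point can be illustrated with a streaming parser. This is
   only an analogy, assuming Python's html.parser rather than a browser; the
   class name PartialDomProbe and the sample page are hypothetical. At the
   moment the </script> end tag is seen, elements that follow the script
   have not been encountered yet.

```python
from html.parser import HTMLParser


class PartialDomProbe(HTMLParser):
    """Records which elements have been opened when </script> is reached."""

    def __init__(self):
        super().__init__()
        self.seen = []        # every start tag, in document order
        self.snapshot = None  # copy of `seen` taken at the first </script>

    def handle_starttag(self, tag, attrs):
        self.seen.append(tag)

    def handle_endtag(self, tag):
        if tag == "script" and self.snapshot is None:
            self.snapshot = list(self.seen)


# Hypothetical page: one element before the script, one after it.
PAGE = "<html><body><h1>before</h1><script>run()</script><p>after</p></body></html>"

probe = PartialDomProbe()
probe.feed(PAGE)
probe.close()

visible_at_script_end = probe.snapshot
visible_at_document_end = probe.seen
```

   A transformation started at the snapshot point would not see the trailing
   <p>; one started after the whole document is parsed (the DEFER case) would.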

   JC: Could run multiple successively.

   MK: Some of my points have been partly covered. There are a lot of
   potential XSLT processing scenarios, many of which can't be captured by
   <script type="..xslt type..">
   ... E.g. when to run, what the input is, whether there's more than one
   script, parameters, etc.
   ... Relying on one attribute seems insufficiently extensible. Henri
   reinforces that when he says "won't happen in next year, therefore
   uninteresting". Seems the wrong way to architect. We should look further
   into the future, to when Javascript seems as old fashioned as COBOL. The
   world is dynamic.

   JC: Propose we add embedded XSLT as another use case.

   NW: +1

   NM: Too bad we're leaving this behind so quickly. The purpose of our group
   is to maximize HTML/XML synergies, and for >certain< purposes XSLT is a
   terrific language for HTML scripting

   HS: There is some implementation in the runtimes for giving the HTML DOM
   as input to XSLT processing (scribe isn't sure he got this right)
   ... The XSLT program can be put in a script element, with bootstrapping
   Javascript that compiles the XSLT program and chooses an input tree to
   give to that program.
   ... You can put the output in the DOM.

   <Zakim> noah, you wanted to talk about circularity

   MK: Yes, we've seen the folks at ETH Zurich do just that, using two
   <SCRIPT> elements, one javascript and one XQuery. The former looks for and
   runs the latter.

   HS: The set of programming languages supported natively by browsers has
   always been "1" across multiple browsers, that is Javascript. Internet
   Explorer has for years also supported VBScript. There are also good
   accessibility(?) APIs that allow languages to be plugged in.
   ... Gecko allows some extensibility, but for various reasons only for
   local content.
   ... Anyway, the trend is toward focus on Javascript only, and viewing that
   as a compiler target for other languages. That said, there is precedent
   for having other languages.
   ... You cannot ever use type="text/vbscript" for data, because there
   exists a browser that would attempt to execute it.
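
   A rough sketch of that constraint: a type value is only usable for data if
   no deployed browser executes it. The browser profiles and the sets of
   executed types below are hypothetical placeholders chosen for
   illustration, not a survey of real browser behaviour.

```python
# Hypothetical profiles: script types each browser would try to execute.
# An empty string stands for a <script> element with no type attribute.
EXECUTED_TYPES = {
    "js_only_browser": {"", "text/javascript", "application/javascript"},
    "browser_with_vbscript": {
        "",
        "text/javascript",
        "application/javascript",
        "text/vbscript",
    },
}


def safe_as_data(script_type: str) -> bool:
    """True if no profiled browser would execute a script of this type."""
    normalized = script_type.strip().lower()
    return all(normalized not in executed for executed in EXECUTED_TYPES.values())


xml_is_safe = safe_as_data("application/xml")
vbscript_is_safe = safe_as_data("text/vbscript")
```

   The check is only as good as the list of profiles, and it says nothing
   about types a browser might start executing in the future.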

   BINGO! That's why I don't much like using the <SCRIPT> tag for data.

   NW: That is astonishingly unsatisfying. It would make much more sense to
   add a new <DATA> element, without the risk that IE would later decide that
   type="application/fribble" would launch missiles.

   HS: The reason it's called SCRIPT and not DATA is that there are only a
   handful of elements that don't try to parse their content.
   ... If we introduce something called <DATA>, it would be incompatible with
   the installed base of browsers.

   What about <script type="xxxx" mode="NORUN">?

   HS: So, the pattern is formalized in HTML5. An alternative is using
   <STYLE>. Another is <XMP>, but that's not hidden by default.

   NW: Yeah, I forgot the compatibility problem.

   <Zakim> noah, you wanted to ask about NORUN attribute

   <hsivonen> existing browsers wouldn't honor NORUN

   <jcowan> Announcement: I'm working on a MicroXML parser/DOM called
   MicroLark (homage to Tim's Lark parser from the early days of XML)

   NM: I think a new attribute would have fewer problems BUT: I admit that it
   would be at best eliminating future problems, and then only rarely. The
   advantage would be architectural robustness. It appeals to me
   intellectually, but I suspect that even if built it would be used only

   NW: We are ADJOURNED.

Summary of Action Items

   [End of minutes]


    Minutes formatted by David Booth's scribe.perl version 1.135 (CVS
    $Date: 2011/01/12 21:56:20 $)



Received on Wednesday, 12 January 2011 21:59:45 UTC