Xinclude 17-Jul7-2000 draft comments (R.A.O'K) from Richard A. O'Keefe on 2000-08-01 (www-xml-xinclude-comments@w3.org from July 2000)

From: Richard A. O'Keefe <ok@atlas.otago.ac.nz>
Date: Tue, 1 Aug 2000 15:57:36 +1200 (NZST)
To: www-xml-xinclude-comments@w3.org
Message-Id: <200008010357.PAA26661@atlas.otago.ac.nz>
		Some comments on
	XInclude 1.0 Working Draft 17-July-2000

Section 1.2
  paragraph 1.
    s/complimentary/complementary/
  paragraph 4.
    s/referencecs/references/
Section 3.1
  paragraph 2.
    Appears to require that the prefix of an inclusion href attribute
    MUST be xinclude: as opposed to some prefix that maps to the right
    namespace, whatever that may be.

    xmlinclude: would be a better prefix to use than xinclude:
    here and elsewhere in this document.
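
    To illustrate the first point: a namespace-aware parser identifies
    an attribute by its namespace name, not by its prefix, so two
    documents that bind different prefixes to the same namespace look
    identical after parsing.  A minimal Python sketch follows.  The
    namespace URI, the element names, and the helper href_of are
    placeholders invented for this illustration; they are not taken
    from the draft.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace name (an assumption); substitute the draft's real one.
NS = "http://example.org/xinclude"

# Same namespace, two different prefixes.
doc_a = f'<doc xmlns:xinclude="{NS}"><part xinclude:href="a.xml"/></doc>'
doc_b = f'<doc xmlns:inc="{NS}"><part inc:href="a.xml"/></doc>'


def href_of(xml_text):
    """Return the namespaced href attribute of the <part> child."""
    part = ET.fromstring(xml_text).find("part")
    # ElementTree expands prefixed names to "{namespace}local" form.
    return part.get(f"{{{NS}}}href")


print(href_of(doc_a), href_of(doc_b))
```

    Both calls find the attribute, which is why requiring the literal
    prefix xinclude: would be stricter than namespace processing needs.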
  paragraph 4.
    s/it's/its/
    The example is important, but rather confusing.

Section 3.2
  paragraph 1.
    We are told that "The value of the xinclude:href is a URI reference".
    There are two main problems with this.

    a) There is no requirement that the URI "designate a data object".
       While it's reasonably clear what a URI with a file:, ftp:, http:,
       or nntp: scheme means, it is not clear what a telnet: or mailto:
       URI would mean in this context.
       
    b) The examples show fragment identifiers.  But RFC 1808 says that
       a "fragment identifier is not part of a URL" and RFC 2396 section
       4.1 also says it "is not part of a URI".  So
	"#xpointer(x/myinclude[1])" is not a URI,
	"common.xml#xptr(a/b)" is not a URI (although it contains one),
	"source.xml#xpointer(string-range....)" is not a URI.

    What, *exactly*, is the value of an {Xinclude}:href attribute
    supposed to be?  If a fragment identifier appears, does it *have*
    to be an Xpointer fragment, or can it be something else?
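
    The distinction in (b) can be made concrete with Python's
    urllib.parse.urldefrag, which separates a URI reference into the
    URI proper and the fragment identifier.  This is only a sketch of
    the RFC 2396 reading; it uses two of the draft's example values
    and makes no claim about what the draft intends.

```python
from urllib.parse import urldefrag

# Two of the href values quoted from the draft's examples.
references = [
    "#xpointer(x/myinclude[1])",
    "common.xml#xptr(a/b)",
]

for ref in references:
    result = urldefrag(ref)
    # result.url is the URI part (empty for a same-document reference);
    # result.fragment is the part after '#', which is not part of the URI.
    print(repr(result.url), repr(result.fragment))
```

    The first reference has an empty URI part, so the whole value is
    a fragment identifier applied to the current document.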

  paragraph 9 (just before Xinclude-45-fail-text).
    s/are specified/is specified/

Section 3.2.1
  paragraphs 4, 8 conflict
    Paragraph 4 says an include element _may_ reference itself as text.
    Paragraph 8 says an include element may _not_ reference itself or
    an ancestor.  Presumably they should read
    [para4]
	An include element with parse="text" may reference itself or
	any of its ancestors.
    [para8]
	An include element with parse="xml" or omitted may not
	reference itself or any of its ancestors.

Section 3.3.2
  paragraph 1 slightly garbled and seems inconsistent with
  section 3 paragraph 2.
    Section 3 paragraph 2 says
	Well-formed XML entities that do not have defined infosets
	(e.g. an external entity with multiple top-level elements)
	are outside the scope of this specification
    Section 3.3.2 paragraph 1 says
	An include element might identify a subresource that
	contains more than a single information element.
	[Which presumably includes a subresource that is an entire
	 external entity with multiple top-level elements.]
	In this case these information items replace the information
	item representing [was "the" missing here?] include element
	in the order in which they appear in the included document.

Section 3.3.4
  paragraph 3
    s/set of information item/set of information items/
  paragraph 4
    s/ranges are/ranges is/ (to agree with "a set")

Section 4
  paragraph 3 conflicts with DTD fragment.
    The text says 'A value of "text" ...' but the DTD fragment has
    'xinclude:parse (xml|parse) "xml"'.

  Missing attribute.
    When a document is fetched using HTTP, it may have an encoding
    value in the HTTP header.  When a document that is fetched by
    that or any other means is an XML document, it may (but need not)
    contain an <?xml?> declaration specifying an encoding.  But if
    a document is fetched by nfs:, afs:, file:, or ftp:, and does not
    contain an <?xml ... encoding='...'?> declaration or is to be
    included as text, what encoding does it use?

    There is a clear need for
      xinclude:encoding
	The value of this attribute is an EncName as defined in
	XML 1.0 spec., section 4.3.3, rule [81], specifying how
	the resource is to be translated.
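
    One way a processor might use such an attribute is sketched below
    in Python.  The order of precedence (HTTP charset first, then the
    XML declaration, then the proposed attribute) is purely an
    assumption made for illustration, as are the function and
    parameter names; the draft defines none of this.

```python
import re
from typing import Optional

# Finds encoding="..." in a leading XML declaration.  This simple
# pattern only works when the declaration is in an ASCII-compatible
# encoding.
_XML_DECL_ENCODING = re.compile(
    rb'^<\?xml[^>]*?\sencoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)


def pick_encoding(data: bytes,
                  parse: str = "xml",
                  http_charset: Optional[str] = None,
                  attr_encoding: Optional[str] = None) -> Optional[str]:
    """Choose an encoding name, or None if nothing specifies one."""
    if http_charset:                      # assumed to take priority
        return http_charset
    if parse == "xml":                    # a declaration only counts for XML
        match = _XML_DECL_ENCODING.match(data)
        if match:
            return match.group(1).decode("ascii")
    return attr_encoding                  # the proposed xinclude:encoding


print(pick_encoding(b"plain text", parse="text", attr_encoding="ISO-8859-1"))
```

    Without the attribute, the last case has nothing to fall back on,
    which is the gap described above.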

  Optionality.
    Does an element have to have an xinclude:parse attribute as well,
    or is it enough for it to have an xinclude:href attribute to be
    an inclusion?

Section 5.2
  paragraph 3.
    s/must )/must)/
    s/may )/may)/

Section C.2
  box 1
    is too wide for Netscape to print it correctly on A4 paper.
    Can the xinclude:href="..." line be broken somehow?

Section C.3
  box 1
    is too wide for Netscape to print it correctly on A4 paper.
    Can the <example><example-body> line be broken somehow?
    Presumably something like
	<example><example-body xinclude:href="data.xml"
          xinclude:parse="text"></example-body></example>
    is meant.


General observation.

  Inclusion sounds like a simple problem, but this seems like a
  cumbersome and somewhat confusing way to solve it.

  I note that it has a number of limitations:
    - the combined document is not validated.
    - the included material must be well-formed.

  It would be interesting to know why a simpler scheme using processing
  instructions has not been adopted.  (Note that <?xml?> processing
  instructions already affect parsing, so there is precedent.)
	
  "<?xml-include" (S "type" Eq Type)? (S "encoding" Eq Enc)?
                  (S ExternalID | S "href" Eq URI)"?>
  Where Type is "(xml|cdata)" or '(xml|cdata)'
  and Enc is "EncName" or 'EncName'
  and EncName and ExternalID come from the XML 1.0 spec.

  If href appears in an <?xml-include?> PI, the text to be included is
  located as in the present draft (whatever that method is).
  If an ExternalID appears, the resource to be included is the external
  entity thus identified.

  If type="cdata", the characters will be treated as character data
  and not parsed.  If type="xml" or omitted, the characters will be
  parsed as if they had appeared literally in the place of the PI.

  This would allow
	+--start.inc----------------+
	|<html><head><title>        |
	+---------------------------+

	+--body.inc-----------------+
	|</title></head><body>      |
	+---------------------------+

	+--end.inc------------------+
	|</body></html>             |
	+---------------------------+

	+--sample.xml--------------------------------+
	|<?xml-include href="start.inc"?>An example  |
	|<?xml-include href="body.inc"?>             |
	|<p>A tiny example PIs can handle</p>        |
	|<?xml-include href="end.inc"?>              |
	+--------------------------------------------+

  This would straightforwardly map to
	<html><head><title>An example
	</title></head><body>
	<p>A tiny example PIs can handle</p>
	</body></html>
  The resources included are not, and have no particular reason to be,
  well-formed XML.  What matters is that the combined document is
  well-formed XML.  Not only that, it can be validated.
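
  A rough Python sketch of this textual substitution is given here.
  It is a simplification under stated assumptions: only the href form
  and the optional type pseudo-attribute are handled (no encoding, no
  ExternalID), resources come from an in-memory table rather than
  being fetched, included text is not rescanned for further PIs, and
  body.inc is taken to be "</title></head><body>" so that the head
  element is closed.  The names RESOURCES, PI, and expand are
  invented for this illustration.

```python
import re
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Hypothetical stand-in for fetching a resource by its href.
RESOURCES = {
    "start.inc": "<html><head><title>",
    "body.inc": "</title></head><body>",
    "end.inc": "</body></html>",
}

# Simplified form of the proposed PI: optional type, then href.
PI = re.compile(
    r'<\?xml-include'
    r'(?:\s+type\s*=\s*(?P<q1>["\'])(?P<type>xml|cdata)(?P=q1))?'
    r'\s+href\s*=\s*(?P<q2>["\'])(?P<href>.*?)(?P=q2)\s*\?>'
)


def expand(text, resources):
    """Replace each xml-include PI with the text of the named resource."""
    def substitute(match):
        included = resources[match.group("href")]
        if match.group("type") == "cdata":
            return escape(included)   # treat as character data
        return included               # treat as literal markup
    return PI.sub(substitute, text)


SAMPLE = (
    '<?xml-include href="start.inc"?>An example\n'
    '<?xml-include href="body.inc"?>\n'
    '<p>A tiny example PIs can handle</p>\n'
    '<?xml-include href="end.inc"?>\n'
)

combined = expand(SAMPLE, RESOURCES)
root = ET.fromstring(combined)        # parses only if well-formed
print(root.tag, root.findtext("body/p"))
```

  None of the three fragments parses on its own, yet the combined
  text does, and it could then be handed to a validating parser.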

  The XML Information Set and similar models would have no difficulty
  with this either:  it would be as if the processing instructions had
  never existed.  Yes, that would be a problem for XML editors, but
  - making life easy for editors has made infosets and DOM difficult
    for everything else, and
  - it is a solvable problem.  Place a PI node
	(PI "xml-include" "begin <rest of PI>")
    just before the inclusion and a second PI node
	(PI "xml-include" "end <rest of PI>")
    just after the inclusion, and an XML editor can then recover the
    original structure from the infoset.
Received on Monday, 31 July 2000 23:58:00 UTC