- From: MURATA Makoto <murata.makoto@fujixerox.co.jp>
- Date: Tue, 30 Nov 1999 19:48:36 +0900
- To: Chris Lilley <chris@w3.org>
- Cc: Dan Connolly <connolly@w3.org>, timbl@w3.org, simonstl@simonstl.com, ietf-xml-mime@imc.org, Tsmith@parc.xerox.com, xsl-editors@w3.org, masinter@parc.xerox.com
Chris Lilley wrote:

> Yes, agreed.
>
> > We have two choices. One is to use text/xml or application/xml even for
> > external parsed entities. The other is to use application/xml-epe
> > only for those external parsed entities which are not XML documents. I think
> > that the latter is a complicated rule.
>
> The former also has complications, since it means that application/xml
> is "sometimes but not always, well-formed xml". Since the terms valid
> xml and well-formed xml are defined, but there is no defined term for
> "stuff that is not well-formed", this is a problem. I think that this is
> a significant complication.

Well, I do not think this is complicated. text/xml or application/xml
means either an external parsed entity or a document entity. This is
simple.

> Whereas for the latter option, it is simple. Is the epe itself a
> well-formed document (this is easy to check mechanically). If yes, label
> it as application/xml. If no, label it as application/xml-epe (or whatever
> term is chosen). This seems a simple, readily understood, and
> machine-processable rule.

Suppose that you make an XML document which references an external
parsed entity. You are very likely to give recipients the URI of that
document but not that of the external parsed entity. The external
parsed entity will thus be fetched only by XML processors during
parsing. The fact that it is labelled as text/xml or application/xml
does not cause any problems.

But the URI of the external parsed entity may become disclosed, and some
program (e.g., a WWW robot) may fetch it as a MIME entity. This program
does not know whether this MIME entity is an XML document or an external
parsed entity. If it parses as XML, it is an XML document. Even if it
does not, it may or may not be an external parsed entity. Is this a
problem? I do not see any problems.

> > However, I have assumed that this issue is not very important since
> > we should anyway avoid external parsed entities at all in the Internet.
>
> (Out of curiosity - why? In the context of HTTP/1.1 keep alive - it's
> not very expensive to fetch an epe once. If the epe is shared between
> two or more documents, there is a net win even with HTTP/1.0)

Because different processors emit different outputs. I personally think
that on the Internet, we should never use (1) default values declared in
external DTD subsets and external parameter entities, and (2) external
parsed entities.

> > If external parsed entities are used, different parsers emit different
> > results. (See "5. Conformance" of the XML recommendation
> > http://www.w3.org/TR/REC-xml#sec-conformance)
> >
> > >For maximum reliability in interoperating between different XML processors,
> > >applications which use non-validating processors should not rely on any
> > >behaviors not required of such processors.
>
> Well, there is a move to define a category of "full infoset" parsers -
> non validating, but which fetch epe's and external DTD subsets - which
> deals with this problem.

I am not aware of such a move, and I have been a member of the XML
Syntax WG. I am aware of a move for a so-called "trivial subset". But I
do not know what will happen.

> Regardless, it is legal now to use epes, and thus, a rule needs to be
> established for labelling them; and the rule needs to cover all legal
> cases, not just some frequently occurring ones.

I think that the current rule satisfies these criteria. The only caveats
are that (1) to know if an XML MIME entity is an XML document, you have
to parse it, and (2) even if an XML MIME entity does not parse as an XML
document, you are not sure if it is an external parsed entity or not.

I do not think they are problems. As for (1), you have to parse it
anyway, since the MIME header may be wrong. As for (2), is there any
requirement to distinguish text which may become an external parsed
entity from text which cannot? To me, what does not parse as an XML
document is useless.
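The two caveats above can be illustrated with a minimal sketch in Python, assuming the standard-library expat binding. The function names and the sample byte strings are hypothetical and are not part of the original message. The check only answers "does this parse as an XML document?"; as noted in (2), a negative answer does not establish that the entity is an external parsed entity.

```python
import xml.parsers.expat


def parses_as_document(data: bytes) -> bool:
    """Return True if data is well-formed as an XML document entity."""
    parser = xml.parsers.expat.ParserCreate()
    try:
        # The second argument marks this as the final chunk of input.
        parser.Parse(data, True)
    except xml.parsers.expat.ExpatError:
        return False
    return True


def describe(data: bytes) -> str:
    """Hypothetical helper: report what a recipient can conclude by parsing."""
    if parses_as_document(data):
        return "XML document"
    # Caveat (2): this could be an external parsed entity, or just junk.
    return "not an XML document (may or may not be an external parsed entity)"


if __name__ == "__main__":
    samples = [
        b"<doc><p>hello</p></doc>",  # single root element
        b"<p>one</p><p>two</p>",     # two top-level elements
        b"some text <b>bold</b>",    # character data outside any element
    ]
    for sample in samples:
        print(sample, "->", describe(sample))
```

The second and third samples could legally serve as external parsed entities, yet the parser alone cannot tell them apart from arbitrary text that merely fails to parse, which is the point made in (2).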
By the way, if there is a strong reason for introducing a specialized
media type for external parsed entities, we also need another media type
for external *parameter* entities.

Cheers,

Makoto

Fuji Xerox Information Systems
Tel: +81-44-812-7230 Fax: +81-44-812-7231
E-mail: murata.makoto@fujixerox.co.jp
Received on Tuesday, 30 November 1999 05:46:57 UTC