Re: Notes on implementing non-local URI references and resource documents

Erik Dahlström wrote:
>> 2) There is no indication in the specification of what should happen
>> when the HTTP response for a non-local reference URI is a 404 with an
>> image/svg+xml (or other XML type) entity body.  Gecko's implementation
>> treats this (and in fact any HTTP response that makes it out of the HTTP
>> implementation with a status code that is not 2xx) as equivalent to a
>> non-XML response.
> 
> In Tiny 1.2 it's defined to be an 'invalid IRI', and the handling of such IRIs is defined per element. Usually it means the element that had the reference isn't rendered.

Can you point me to that definition?

Note that there is an asymmetry here, especially given the way the SVG 
specification defines things (which is not quite the way Gecko 
implements them).  Consider http://www.example.com/foo.svg, which returns 
an HTTP 404 with a body of type image/svg+xml.  If I point my browser at 
this URL directly, I expect this SVG to render.

Now say this SVG has a fill="url(#bar)" on some element.  I would surely 
expect that fill to happen.

What if it has fill="url(foo.svg#bar)", though?  Per SVG 1.1 that's 
suddenly a non-local reference, right?  Should that fill be rendered? 
In Gecko, we decide on local vs non-local reference based on comparing 
the fully resolved URI against the URI of the document, so we treat this 
as a local reference and render it.  Of course this has other effects, 
like making fill="url(#bar)" a non-local reference if the baseURI of the 
node is different from that of the document.  I'm not sure what the best 
path forward here is; it doesn't help that the URI RFCs have 
flip-flopped on how this should work a few times and that various 
content depends on different parts of the flip-flops...
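
To make the two readings concrete, here is a minimal sketch in Python.  It is not Gecko's code: the function names and the example base URI are made up for illustration, and "syntactic" is only my shorthand for the reading where anything other than a bare "#fragment" counts as non-local.

```python
from urllib.parse import urldefrag, urljoin


def is_local_syntactic(reference: str) -> bool:
    """Syntactic reading: only a bare fragment such as '#bar' is local."""
    return reference.startswith("#")


def is_local_resolved(reference: str, base_uri: str, document_uri: str) -> bool:
    """Resolved-URI reading: resolve the reference against the node's base
    URI, then compare it with the document URI, ignoring fragments."""
    resolved = urljoin(base_uri, reference)
    return urldefrag(resolved).url == urldefrag(document_uri).url


doc = "http://www.example.com/foo.svg"
# Hypothetical base URI for a node whose base differs from the document URI.
other_base = "http://www.example.com/dir/"

same_base_fragment = is_local_resolved("#bar", doc, doc)
same_base_full = is_local_resolved("foo.svg#bar", doc, doc)
other_base_fragment = is_local_resolved("#bar", other_base, doc)
```

Under the resolved-URI comparison, "foo.svg#bar" counts as local when the base URI is the document URI, while "#bar" stops being local once the node's base URI differs.  The syntactic check gives the opposite answer in both of those cases.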

My point is that the situation is complicated enough that things should 
really be clearly spelled out here instead of just assuming stuff.

>> 3) There is no indication in the specification that resource documents
>> (which are not thus named anywhere that I can find) are to only be
>> loaded once per primary document.  I assumed that this is the desired
>> behavior based on SVG Tiny 1.2 2008-09-15 draft, section 14.1.6 [2].
> 
> Are you referring to [7]:
> "The conceptual model is that each resource document is loaded only once; if the same resource document is referenced multiple times directly or indirectly by the same primary document, that resource document is only retrieved and processed one time."

Yes.
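
One way to picture that conceptual model is a per-primary-document cache keyed on the URI with the fragment stripped.  This is only a sketch under that assumption; the class name and the fetch callback are hypothetical and do not correspond to anything in the spec or in Gecko.

```python
from typing import Callable, Dict, List
from urllib.parse import urldefrag


class ResourceDocumentCache:
    """Loads each resource document at most once for one primary document."""

    def __init__(self, fetch: Callable[[str], str]) -> None:
        self._fetch = fetch                  # hypothetical loader callback
        self._documents: Dict[str, str] = {}

    def get(self, reference_uri: str) -> str:
        # '#a' and '#b' in the same file name the same resource document,
        # so the fragment is dropped before the lookup.
        key = urldefrag(reference_uri).url
        if key not in self._documents:
            self._documents[key] = self._fetch(key)
        return self._documents[key]


fetch_log: List[str] = []


def fake_fetch(uri: str) -> str:
    """Stand-in for a network load; records every retrieval."""
    fetch_log.append(uri)
    return "<svg/>"


cache = ResourceDocumentCache(fake_fetch)
cache.get("http://www.example.com/res.svg#gradient")
cache.get("http://www.example.com/res.svg#mask")
cache.get("http://www.example.com/other.svg#clip")
```

After these three lookups, fetch_log holds two entries, one per distinct resource document.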

> This might need some more work, since I don't think this is fully accurate for e.g. <animation>

I was speaking only of resource documents referenced via the various 
places that reference a node.  Note that <animation> doesn't create a 
resource document, per this text in the section you cite:

   If an SVG document as a whole is referenced for inclusion by a parent
   document, such as using the HTML 'object' or SVG 'animation' elements,
   then that document itself shall also be a primary document.

Later on, this section says:

   The term "resource document" refers to a complete, self-contained SVG
   document which has at least one of its elements referenced as a
   resource by a primary document.

>> 4) There is no indication in the specification as to what size should be
>> used for the canvas of the resource documents. Perhaps this never
>> affects rendering and thus doesn't matter; I can't tell for sure that
>> this is the case, especially where foreignObject is involved.  Gecko
>> currently sets the canvas size to 0x0.
> 
> Do you mean while loading and processing the resource document

Yes.

> or when rendering the resource document as part of the main document?

Also yes.  Note that the resource document per se is never rendered as 
"part of the main document".  Particular subtrees from it can be used as 
fills, masks, clip-paths, <use> targets, and so forth.

> The svg spec should define what viewport to use for the latter case

It doesn't.

> or in some cases there may not be one (e.g. for some <use> constructs).

Yeah, <use> is not a problem since it creates layout objects in the 
primary layout tree.  It's the fills, masks, clip-paths, etc. that worry me.

>> 5) To reduce attack surface, Gecko does not execute scripts in resource
>> documents at the present time.  We're not convinced that we want to
>> change this at any point in the foreseeable future.  We do plan to
>> implement non-Turing-complete declarative animation in resource documents.
> 
> It probably makes sense to apply some form of access control in a future spec, similar to what's being done to <iframe> elements in HTML5 with @sandbox.

Perhaps.  Fundamentally, the resource documents are weird enough 
compared to what already exists (for example, no parent, but not 
actually a top-level document) that I have no confidence that Gecko's 
existing security checks are good enough.  So while we will certainly 
look into enabling script in these, it will require some fairly 
extensive code auditing.  I'm not sure the use cases justify it, honestly.

>> 6) To reduce the risk of inadvertent information leakage, Gecko
>> currently does not allow linking to resource documents across origins.
>> The security check performed is the same as that for XMLHttpRequest (so
>> does not take document.domain into account).  We do plan to make this
>> check subject to Access-Control [3] to allow a server to export SVG
>> resource documents for use by other sites.
> 
> Is this information leakage any different from someone using e.g <iframe>?

Yes, since the resource document affects the rendering (and at least in 
the case of <use> in quite measurable-from-script ways) of elements in 
the primary document.
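
For reference, the kind of cross-origin check described in point 6 can be sketched as a plain scheme/host/port comparison that never consults document.domain.  This is an illustration only, not Gecko's implementation; the function names and the default-port table are assumptions.

```python
from typing import Optional, Tuple
from urllib.parse import urlsplit

# Assumed default ports, so that an explicit ':80' matches an implicit one.
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(uri: str) -> Tuple[str, Optional[str], Optional[int]]:
    """Return the (scheme, host, port) triple used for the comparison."""
    parts = urlsplit(uri)
    scheme = parts.scheme.lower()
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(scheme)
    return (scheme, parts.hostname, port)


def may_load_resource_document(primary_uri: str, resource_uri: str) -> bool:
    """Allow the load only when both URIs share the same origin triple."""
    return origin(primary_uri) == origin(resource_uri)


same_site = may_load_resource_document(
    "http://www.example.com/page.svg", "http://www.example.com:80/res.svg#g")
other_host = may_load_resource_document(
    "http://www.example.com/page.svg", "http://cdn.example.com/res.svg#g")
```

Because document.domain plays no part here, two hosts under example.com stay distinct origins even if both pages set document.domain to "example.com".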

> Opera currently allows cross-domain references in svg, but applies restrictions on trying to access the DOM of those documents, similar to how <iframe> is handled.

I don't believe the SVG specification has any way of accessing the DOM 
of resource documents to start with, so it sounds like we're not talking 
about the same thing here.

>> 7) It's not clear to me from section 5.6 [4] whether event handler
>> attributes on elements in a subtree pointed to by a <use> should be
>> cloned onto the instance tree, but it's certainly what Gecko does right
>> now. 
> 
> Sort of, but there is one crucial difference:
> 
> "If event attributes are assigned to referenced elements, then the actual target for the event will be the SVGElementInstance object within the "instance tree" corresponding to the given referenced element."
> 
> Have SVGElementInstance:s been implemented in gecko?

No.  We actually create Elements (using cloneNode).

>> It's further unclear which script context said event handlers
>> should execute in; currently in Gecko they execute in the context of the
>> primary document. 
> 
> I think the SVGElementInstance:s are in the primary document, while the actual referenced elements are in a resource document. Opera doesn't allow access to the actual nodes if the resource and primary documents have different domains.

For rendering purposes, this is quite clear.  I think it needs to be 
clarified for script context purposes.

>> This means that a <use> can effectively import
>> scripts from the referenced document into the primary document scope.
> 
> Hmm...do you mean for the case when a <use> references a tree that contains a <script>?

No, I mean the case when a <use> references something with an 
onmousemove="..." and then you move your mouse.  If, as you suggest, 
this event handler is evaluated in the context of the primary document, 
then we have script injection.

> I don't think scripts are imported to the primary document by use, they're executed only in the document context that they're in.

For <script>, I agree (though I don't recall the spec making this clear 
anywhere; in Gecko it's a consequence of cloned scripts not 
executing).

>> While we're enforcing same-origin restrictions this is not a problem,
>> but when we move to Access-Control we will continue to enforce a hard
>> same-origin restriction on <use> to mitigate this attack scenario.
> 
> That depends on whether <use> imports <script> into the primary document or not, no?

No; see above.

> Agreed, the "must not contain scripting" thing is bogus, I've read that more as an authoring requirement rather than as a UA requirement.

Given the context it certainly sounds like a UA requirement to me. 
Everything else in this section seems to be one.

-Boris

Received on Tuesday, 14 October 2008 12:01:36 UTC