Thoughts on font linking and embedding from Maciej Stachowiak on 2011-02-16 (public-webfonts-wg@w3.org from February 2011)

From: Maciej Stachowiak <mjs@apple.com>
Date: Wed, 16 Feb 2011 11:28:23 -0800
To: public-webfonts-wg@w3.org
Message-id: <B221366F-747D-495B-B7A3-9531951560BE@apple.com>

Hello Fonts WG,

I'm writing to clarify Apple's thoughts on fonts, same-origin restrictions, and related issues. While this is not necessarily an official statement on behalf of Apple, I have vetted these remarks with the relevant stakeholders at Apple, and we are in general agreement on the rough outlines. Also, I'd like to apologize on behalf of Apple for not providing comments earlier, during the Last Call Period.

Overall, I believe our position is very close to that expressed by Opera, but I wanted to elaborate it in my own words. Specific points I will cover:

(1) History of cross-origin resource linking and embedding on the Web
(2) The need for a general "anti-hotlinking" mechanism
(3) Hotlinking prevention is useful, but not a very strong license enforcement mechanism
(4) Fitting web fonts into the standard model
(5) Cross-origin access restrictions should be tied to embedding mechanisms, not data formats
(6) Pros and cons of different approaches to restricting cross-site font embedding

== (1) History of cross-origin resource linking and embedding on the Web ==

Historically, the Open Web Platform's mechanisms for resource access have evolved in a somewhat haphazard way. However, over time, these mechanisms have settled into a clear three-way division:

A) Linking.

A resource "links" to another when it has a reference that can be traversed by the user to view the resource in a separate context. The HTML <a> element is a canonical example of a linking mechanism. Historically, linking mechanisms have always been allowed cross-origin without restriction. Many believe that free linking is an important property of the Web and should be preserved.

B) Embedding.

A resource "embeds" another when it incorporates it into its own display, but without full read access to the contents. The HTML <img> element is a canonical example of an embedding mechanism. Historically, embedding mechanisms have been allowed cross-origin without restriction, and APIs have been carefully designed to avoid undue information disclosure. Cross-origin embedding is sometimes useful, but there are also often valid reasons to prevent it, for example to prevent a third party from using an image without credit while consuming undue amounts of the host's bandwidth. Some believe that the Web platform would be better if cross-origin embedding had never been allowed by default, but due to compatibility constraints, this is not possible to change for existing embedding mechanisms.

C) Reading.

A resource "reads" another when it has direct access to its contents, as full text or a binary representation. The XMLHttpRequest object is a canonical example of a reading mechanism. Historically, cross-origin reading was not allowed at all, since it creates security risks. Over the past few years, a protocol called CORS (Cross-Origin Resource Sharing) was developed to allow servers to selectively opt in to cross-origin reading of their resources.

In this three-way division, @font-face would be considered an embedding mechanism.
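
As an illustration only, the three-way division can be written down as a small lookup table. The Python sketch below restates the model described above; the names Mechanism, EXAMPLES, and cross_origin_default are invented for this example and do not correspond to any browser API.

```python
from enum import Enum


class Mechanism(Enum):
    LINK = "link"    # user traverses to the resource in a separate context
    EMBED = "embed"  # resource is incorporated into the display, no read access
    READ = "read"    # full access to the contents as text or binary


# Canonical examples named in the text above (illustrative, not exhaustive).
EXAMPLES = {
    "<a>": Mechanism.LINK,
    "<img>": Mechanism.EMBED,
    "@font-face": Mechanism.EMBED,
    "XMLHttpRequest": Mechanism.READ,
}


def cross_origin_default(mechanism: Mechanism) -> bool:
    """Return True if cross-origin use is allowed without any server opt-in."""
    # Linking and embedding have historically been allowed cross-origin;
    # reading is denied unless the server opts in (for example via CORS).
    return mechanism in (Mechanism.LINK, Mechanism.EMBED)


for name, mechanism in EXAMPLES.items():
    print(name, mechanism.value, cross_origin_default(mechanism))
```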

== (2) The need for a general "anti-hotlinking" mechanism ==

The practice of directly embedding a resource from another origin is called "hotlinking". There are many valid uses for it; for example, content distribution networks (CDNs) depend on this capability. But there are also valid reasons to restrict it. Here are some use cases:

A) Alice hosts an image, my-cute-kitten.jpg, on her hosted site. She is happy to let people download it, but has a limited monthly bandwidth quota. Another site, content-farm.com, directly embeds the image. Due to powerful SEO, the page embedding the image is visited many times. The loads all hit Alice's server, putting her over her bandwidth quota and therefore taking her site down for the month. Alice is sad. She wishes she could prevent image hotlinking like this.

B) Happy Fun Social Network has an account control web page, using cookie-based authentication. There are many buttons on this page that, if clicked by the user, will cause potentially regrettable side effects. evil.com embeds this page using the <iframe> element, and applies a technique called "clickjacking" to trick users into clicking on account settings. The unofficial X-Frame-Options header exists to address this, but it is relatively inflexible.

It seems to me there is a clear need for a mechanism to prevent hotlinking of all kinds of resources. I think Opera's proposed "From-Origin" header is an excellent way to address the problem. (I would prefer a more explicit name, such as "Embed-Only-From-Origin" or something, but bikeshedding the name is not the key issue).
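
To make the idea concrete, here is a rough Python sketch of the check a browser might perform before embedding a resource. The header syntax is an assumption made for illustration: the value is taken to be either the keyword "same" or a comma-separated list of allowed origins, and an absent header means no restriction. The function name may_embed and the origin alice.example are hypothetical, and origins are compared as exact strings.

```python
from typing import Optional


def may_embed(embedder_origin: str,
              resource_origin: str,
              from_origin: Optional[str]) -> bool:
    """Decide whether a page at embedder_origin may embed the resource.

    from_origin is the value of the (assumed) From-Origin response header,
    or None if the server did not send one.
    """
    # Same-origin embedding is always fine.
    if embedder_origin == resource_origin:
        return True
    # No header: fall back to the historical default, embedding is allowed.
    if from_origin is None:
        return True
    allowed = [item.strip() for item in from_origin.split(",") if item.strip()]
    # "same" is never a real origin, so it matches no cross-origin embedder.
    return embedder_origin in allowed


alice = "http://alice.example"
farm = "http://content-farm.com"
print(may_embed(farm, alice, None))    # True: no restriction sent
print(may_embed(farm, alice, "same"))  # False: hotlink refused
print(may_embed(alice, alice, "same")) # True: Alice's own pages still work
```

Under these assumptions, Alice's use case is handled by sending the header with her image, while CDNs that want to be embedded everywhere simply omit it.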

== (3) Hotlinking prevention is useful, but not a very strong license enforcement mechanism ==

Besides protecting the hosting site's bandwidth, same-origin embedding restrictions are proposed as a lightweight aid to following font licenses that limit use to a single site. It is reasonable that font vendors may want this, but I think we need to keep this threat in perspective. Most sites wanting to make unauthorized use of a font will simply download it and host it from their own site. Same-origin restrictions will do nothing to prevent that. Hotlinking seems like a less likely scenario for license violation.

== (4) Fitting web fonts into the standard model ==

Although the three-way distinction among linking, embedding and reading can be confusing and fuzzy (as outlined by Robert O'Callahan), it is nevertheless the model for the existing Web platform. Setting different rules for different kinds of resources is confusing to authors, and makes security analysis more difficult. The security model of the Web is already way too complicated without odd corner-case rules.

Therefore, instead of creating a different model for Web fonts, where embedding is forbidden by default and tied to read permission, it would be better for the Web platform as a whole to follow the standard model: linking and embedding are allowed by default, reading is forbidden by default, but servers can opt in. Adding From-Origin/Embed-Only-From-Origin to the Web platform should satisfy the needs to limit bandwidth abuse and to simplify license compliance. Implementing this mechanism across all resource types and all font formats is something that Apple could make a priority if there is consensus for this approach (though I cannot make any specific commitment about future releases).

== (5) Cross-origin access restrictions should be tied to embedding mechanisms, not data formats ==

Regardless of the embedding rules for fonts, and whether CORS or a new Embed-Only-From-Origin/From-Origin header is used to control them, these rules should be tied to the relevant embedding API, namely @font-face, not to a specific font file format. There are a few key reasons for this:

A) It is a layering violation for a data format to place constraints on how embedding mechanisms can use it. The embedding mechanism itself should define the restrictions.

B) It is confusing and undesirable for everyone if different font formats have different embedding rules. It would be silly if WOFF fonts and OpenType fonts had different embedding rules.

C) It may not be practical to implement same-origin embedding restrictions with a CORS exception for WOFF fonts only, and not other font formats (e.g. TrueType, OpenType, SVG). This has so far not affected Firefox or IE, since those browsers actually implement same-origin restrictions for all font formats in the most recent versions. However, if restricting all font formats is the desired end-game, that is how it should be specced. The restrictions should attach to @font-face, not to the WOFF format.
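
One way to picture the layering argument: the check lives in the code path for @font-face and never inspects the font format. The following is a minimal Python sketch with hypothetical names (FontRequest, font_face_may_load) and made-up origins. Its deliberately simplified same-origin-or-allow-list rule stands in for whichever policy is eventually chosen.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class FontRequest:
    page_origin: str   # origin of the page using @font-face
    font_origin: str   # origin serving the font file
    font_format: str   # "woff", "opentype", "truetype", "svg", ...


def font_face_may_load(req: FontRequest, allowed: FrozenSet[str]) -> bool:
    """Embedding rule attached to @font-face; font_format is never consulted."""
    return req.page_origin == req.font_origin or req.page_origin in allowed


formats = ["woff", "opentype", "truetype", "svg"]
results = {
    fmt: font_face_may_load(
        FontRequest("http://blog.example", "http://fonts.example", fmt),
        allowed=frozenset(),
    )
    for fmt in formats
}
print(results)  # every format gets the same answer
```

Because the format is not an input to the decision, WOFF and OpenType fonts cannot end up with different embedding rules.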

== (6) Pros and cons of different approaches to restricting cross-site font embedding ==

I believe that making fonts consistent with the rest of the Web platform, and adding an anti-hotlinking mechanism that applies to the whole Web platform, has a number of advantages:

A) Consistency and clarity for authors.
B) Applies to all font formats, not just WOFF, which is presumably more desirable for font vendors.
C) Appears to be something that all browsers would be on board to implement, which seems better than a more restrictive rule that only a subset of browsers would implement.
D) Better fit with the architecture of the Web.

I hope these remarks clarify our position.

Regards,
Maciej

Received on Wednesday, 16 February 2011 19:29:32 UTC