enc: URL scheme from Kornel Lesiński on 2012-06-13 (public-html@w3.org from June 2012)

From: Kornel Lesiński <kornel@geekhood.net>
Date: Thu, 14 Jun 2012 00:45:10 +0100
To: public-html@w3.org
Cc: ietf-http-wg@w3.org
Message-ID: <op.wfu79kggte2ec8@aimac.local>

I'd like to propose an enhancement of the http+aes scheme (in WHATWG
draft), and a generalization of the ClearKey Encrypted Media proposal.

This URL scheme is for providing encryption and integrity guarantees on
top of other protocols.
It is intended to enable secure hosting of content on untrusted servers,
e.g. storing private photos, commercial videos or shared JavaScript
libraries on 3rd party HTTP content delivery networks.

----

The scheme uses attribute-value pairs to define the type of encryption
and the digests used, and includes the absolute URI of the resource to be
fetched.

enc:<parameters>,<absoluteURI>

User agents are expected to fetch the resource using the (sub)protocol
specified (following all redirects) and then decrypt it and/or verify its
integrity according to the specified attribute(s).

For a start "sha1", "sha256" and "aes-ctr-key" are defined.

##Syntax

> (intended to be similar to data: URI)

encurl := "enc:" parameter *( ";" parameter ) "," absoluteURI
parameter := attribute "=" value
value := 1*pchar
attribute := optional_attr | required_attr
optional_attr := 1*unreserved "?"
required_attr := 1*unreserved

"`absoluteURI`", "`pchar`" and "`unreserved`" are the corresponding tokens
from [RFC2396].

Order of attributes is significant.
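
As a rough illustration of the grammar above, here is a minimal parsing
sketch in Python. The names `Parameter` and `parse_enc_url` are made up
for this example, and it assumes that the first "," ends the parameter
list (i.e. values do not contain an unescaped ","), which the grammar
itself does not spell out.

```python
from typing import List, NamedTuple, Tuple


class Parameter(NamedTuple):
    name: str       # attribute name without the trailing "?"
    value: str      # raw value exactly as written in the URL
    optional: bool  # True when the attribute was written as "name?"


def parse_enc_url(url: str) -> Tuple[List[Parameter], str]:
    """Split an enc: URL into its ordered parameters and the embedded URI."""
    if not url.startswith("enc:"):
        raise ValueError("not an enc: URL")
    # Assumption: the first "," separates the parameters from the URI.
    param_text, sep, target = url[len("enc:"):].partition(",")
    if not sep or not target:
        raise ValueError("missing ',' or absolute URI")
    params = []
    for chunk in param_text.split(";"):
        # Split on the first "=" only: base64 values may end in "=".
        attribute, eq, value = chunk.partition("=")
        optional = attribute.endswith("?")
        name = attribute[:-1] if optional else attribute
        if not eq or not name or not value:
            raise ValueError("malformed parameter: %r" % chunk)
        params.append(Parameter(name, value, optional))
    # The list keeps URL order, because order of attributes is significant.
    return params, target
```

For example, `parse_enc_url("enc:x=y,http://example.com")` yields one
required parameter `x` with value `y`, plus the embedded URI.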

##Attributes

A UA MUST NOT fetch a resource from an `enc:` URL that has a required
attribute the UA does not support.

A UA MAY fetch the resource if it does not support an optional attribute.

A UA MUST NOT use the resource if the checksum specified in a supported
optional attribute does not match the fetched data (e.g. if the UA is able
to determine that the resource doesn't match a `sha9999` digest, then it
should reject the resource).
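
The three rules above could be sketched roughly as follows. This is only
an illustration: `SUPPORTED`, `may_fetch` and `attributes_to_enforce` are
hypothetical names, and attributes are passed as the raw tokens from the
URL (with a trailing "?" marking an optional one).

```python
# Attributes this hypothetical UA implements.
SUPPORTED = {"sha1", "sha256", "aes-ctr-key"}


def split_attribute(attribute):
    """Return (name, optional) for a raw attribute token such as 'sha1?'."""
    optional = attribute.endswith("?")
    return (attribute[:-1] if optional else attribute), optional


def may_fetch(attributes, supported=SUPPORTED):
    """False if any required attribute is unsupported, True otherwise."""
    for attribute in attributes:
        name, optional = split_attribute(attribute)
        if name not in supported and not optional:
            return False
    return True


def attributes_to_enforce(attributes, supported=SUPPORTED):
    """Supported attributes must be honoured even when marked optional."""
    return [name for name, _ in map(split_attribute, attributes)
            if name in supported]
```

With these definitions, `may_fetch(["sha1", "hashtree?"])` is true because
the unsupported `hashtree` is optional, while
`may_fetch(["sha1", "hashtree"])` is false.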

###The `sha1` and `sha256` attributes

These attributes contain SHA-1 or SHA-256 digests of the content or the
ciphertext. If the attribute is specified before an encryption attribute,
it's a digest of unencrypted content. If it's specified after an
encryption attribute, it's a digest of the ciphertext.
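
To make the ordering rule concrete, here is a small sketch that labels
each digest attribute by what it covers. The function name
`digest_targets` and the two attribute sets are assumptions made for the
example; only the attributes defined in this proposal are listed.

```python
ENCRYPTION_ATTRIBUTES = {"aes-ctr-key"}
DIGEST_ATTRIBUTES = {"sha1", "sha256"}


def digest_targets(names):
    """Map each digest attribute, in URL order, to what it is a digest of."""
    seen_encryption = False
    targets = []
    for name in names:
        if name in ENCRYPTION_ATTRIBUTES:
            seen_encryption = True
        elif name in DIGEST_ATTRIBUTES:
            # Before any encryption attribute: digest of unencrypted content.
            # After one: digest of the ciphertext as fetched.
            covers = "ciphertext" if seen_encryption else "plaintext"
            targets.append((name, covers))
    return targets
```

For `["sha1", "aes-ctr-key", "sha256"]` this gives `sha1` over the
plaintext and `sha256` over the ciphertext. With no encryption attribute
at all, the fetched bytes are the content, so every digest is labelled
"plaintext".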

> (If there's a good reason to only hash before/after encryption, then it
> should be defined as that instead, and then order of attributes could be
> meaningless)

The UA must fetch the entire resource and verify that it matches the given
hash. If it doesn't match, then the UA MUST NOT use the resource and must
act as if the resource could not be obtained due to a network error.

value := sha_base64 | sha_hex

The `sha_base64` is a string base64-encoded as described in Section 6.8 of
[RFC2045]. `sha_hex` is a string of 40 (`sha1`) or 64 (`sha256`)
hexadecimal characters (case-insensitive).

enc:sha1=Ck1VqNd45QIvq3AZd8XYQLvEhtA=,http://example.com/Hello%20World
enc:sha1=0a4d55a8d778e5022fab701977c5d840bbc486d0,http://example.com/Hello%20World
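
A minimal sketch of how a UA might check such a value, accepting either
encoding. The helper names `decode_digest` and `digest_matches` are
invented here, and the sketch assumes the value has already been
extracted from the URL. The digest in the two examples above is the SHA-1
of the 11-byte string `Hello World`, which is used below as sample
content.

```python
import base64
import binascii
import hashlib
import hmac

DIGEST_SIZES = {"sha1": 20, "sha256": 32}  # digest length in bytes


def decode_digest(name, value):
    """Decode a sha_hex or sha_base64 value into raw digest bytes."""
    size = DIGEST_SIZES[name]
    if len(value) == size * 2:
        try:
            return bytes.fromhex(value)  # accepts upper and lower case
        except ValueError:
            pass  # not hex after all, fall through to base64
    try:
        raw = base64.b64decode(value, validate=True)
    except binascii.Error:
        raise ValueError("digest is neither hex nor base64")
    if len(raw) != size:
        raise ValueError("digest has the wrong length for " + name)
    return raw


def digest_matches(name, value, body):
    """True if `body` hashes to the digest given in the attribute value."""
    expected = decode_digest(name, value)
    actual = hashlib.new(name, body).digest()
    return hmac.compare_digest(expected, actual)
```

Both `digest_matches("sha1", "Ck1VqNd45QIvq3AZd8XYQLvEhtA=", b"Hello World")`
and the same call with the 40-character hex form return true. A UA would
treat a false result as a network error.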

Other specifications may define digest methods that are suitable for
partial requests or streaming of content (e.g. hash trees).

enc:sha1=ff4e04b67591f413dc713d8e9b7697c352c022f1;hashtree?=http%3A%2F%2Fexample.com%2FHello%2520World.index,http://example.com/video.ogv

####Security and privacy considerations

This attribute enables user agents to avoid fetching resources if they
already have a cached resource matching the digest. An attacker could use
such an implementation to test whether users have certain known files
cached (e.g. an image shown to users logged in to a particular website).

Attackers could also use this to obtain the contents of cached files whose
checksum they know (e.g. an attacker may have seen a digital signature of
a secret document and attempt to retrieve the document by including its
digest in a bogus `enc:` URL).

User agents may want to limit sharing of cached files to files with
`Cache-Control: public` or avoid sharing cached files across origins.

###The `aes-ctr-key` attribute

value := aes128_base64 | aes192_base64 | aes256_base64

This key is provided in the form of 16, 24, or 32 bytes base64-encoded as
described in Section 6.8 of [RFC2045].

The message body must be decrypted by applying the AES-CTR algorithm using
the key specified, and using a zero nonce.

If the base64-decoded value does not consist of exactly 16, 24, or 32
bytes, then the user agent must act as if the resource could not be
obtained due to a network error, and may report the problem to the user.

enc:aes-ctr-key=ZGFtb3dtb3dkYW1vd21vdw==,http://example.com
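
The key-length check could look roughly like this. `decode_aes_key` is a
hypothetical helper, and returning `None` stands in for "act as if the
resource could not be obtained". Actual AES-CTR decryption needs a
cryptography library and is not shown.

```python
import base64
import binascii

VALID_KEY_LENGTHS = (16, 24, 32)  # AES-128, AES-192, AES-256


def decode_aes_key(value):
    """Return the raw key bytes, or None if the value is unusable."""
    try:
        key = base64.b64decode(value, validate=True)
    except binascii.Error:
        return None  # not valid base64
    if len(key) not in VALID_KEY_LENGTHS:
        return None  # wrong size for any AES variant
    return key
```

The value in the example URL above decodes to a 16-byte key, so it would
select AES-128.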

####Security considerations

URLs using this attribute contain sensitive information (the key used to
decrypt the referenced content) and as such should be handled with care,
e.g. only sent over TLS-encrypted connections, and only sent to users who
are authorized to access the encrypted content.

User agents are encouraged to not show the full `enc:` URLs in user
interface elements where the URL is displayed, as it could be used to
obscure the domain name.

This attribute enables the content of a particular resource to be
encrypted. If the protocol used to fetch the resource is not itself
encrypted, it may leak private information through metadata, e.g.
information held in HTTP headers. The length and name of the resource may
still be visible. The rate at which the data is transmitted is also
unobscured. If this scheme is used to obscure private information, it is
important to consider how these side channels might leak information.

Each resource encrypted in this fashion must use a fresh key. Otherwise,
because the nonce is always zero, resources sharing a key are encrypted
with the same keystream, and an attacker can combine their ciphertexts to
recover information about the plaintexts of all those resources.
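
The danger of key reuse can be illustrated without an AES library. In CTR
mode the ciphertext is the plaintext XORed with a keystream, and with a
fixed zero nonce the same key always produces the same keystream. The
sketch below uses random bytes as a stand-in for that keystream (an
assumption made to keep the example self-contained); the sample
plaintexts are invented.

```python
import os


def xor_bytes(a, b):
    """XOR two byte strings, truncating to the shorter one."""
    return bytes(x ^ y for x, y in zip(a, b))


keystream = os.urandom(32)  # stand-in for AES-CTR output under one key

plain1 = b"private photo bytes #1"
plain2 = b"commercial video data!"

cipher1 = xor_bytes(plain1, keystream)
cipher2 = xor_bytes(plain2, keystream)

# The shared keystream cancels out: the attacker learns plain1 XOR plain2
# from the two ciphertexts alone, without ever seeing the key.
leak = xor_bytes(cipher1, cipher2)

# If one plaintext is known or guessable, the other follows directly.
recovered = xor_bytes(leak, plain1)
```

Here `recovered` equals `plain2`, which is why a key must never be shared
between two resources under this attribute.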

The encryption does not guarantee integrity. An attacker will be able to
truncate or corrupt the resource unless a cryptographically strong
checksum is used as well.

> (Encryption without integrity is sufficient if you just want to protect
> against the CDN being hacked/accidentally leaking all their files,
> rather than malicious CDNs.)

##Origin

For the purposes of the Same-Origin Policy, the origin of the URL embedded
in the `enc:` URL should be used.

`enc:x=y,http://example.com` and `http://example.com` are same origin.

`enc:x=y,http://example.com` and `enc:x=y,https://example.com` are not
same origin.
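
A small sketch of that comparison, assuming an origin is the usual
(scheme, host, port) triple. The `origin` function and the `DEFAULT_PORTS`
table are illustrative, and the embedded URL is assumed to start after
the first ",".

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url):
    """Return (scheme, host, port), unwrapping an enc: URL first."""
    if url.lower().startswith("enc:"):
        # Assumption: the embedded absolute URI follows the first ",".
        _, sep, url = url.partition(",")
        if not sep:
            raise ValueError("enc: URL without an embedded URI")
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    port = parts.port or DEFAULT_PORTS.get(scheme)
    return (scheme, parts.hostname, port)
```

Under this sketch `origin("enc:x=y,http://example.com")` equals
`origin("http://example.com")`, while the `https` variant differs in both
scheme and default port.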

--
regards, Kornel Lesiński

Received on Wednesday, 13 June 2012 23:45:44 UTC