Re: [precis] Canonicalization of IRIs in security contexts

A note: email is even worse. People quite commonly use email as usernames
with third-parties like banks or shopping sites (and it is a really handy
practice: I don't forget that my username is mark@macchiato.com). But there
are no identity criteria for the local-part, so the bank doesn't know
whether Mark@macchiato.com is the same person as mark@macchiato.com; let
alone whether mark@macchiato.com (with fullwidth ASCII) is.
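
A minimal sketch of what an identity criterion for such addresses could look like, assuming (hypothetically) that a site chose to compare addresses after NFKC normalization and case folding. Nothing requires a mail provider to treat the local-part this way; the function name and the policy are illustrative only. The fullwidth spelling is built from code points so that it survives transport.

```python
import unicodedata


def address_key(address: str) -> str:
    """Hypothetical comparison key: NFKC-normalize, then case-fold."""
    return unicodedata.normalize("NFKC", address).casefold()


plain = "mark@macchiato.com"
capitalized = "Mark@macchiato.com"
# Shift the ASCII letters of "mark" into the fullwidth block (U+FF41..U+FF5A).
fullwidth = "".join(chr(ord(c) + 0xFEE0) for c in "mark") + "@macchiato.com"

print(plain == capitalized)  # False
print(plain == fullwidth)    # False
print(address_key(plain) == address_key(capitalized) == address_key(fullwidth))  # True
```

Without an agreed rule like this, the three strings stay distinct under a plain string comparison, which is the situation the bank is in.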

Mark

*— Il meglio è l’inimico del bene —*


On Thu, Nov 4, 2010 at 06:30, Mark Davis ☕ <mark@macchiato.com> wrote:

> A couple more to add to the list:
>
>    - Not just the scheme and host, but each of the components could be
>    canonicalized on the server (eg case-folded), so that would need to be
>    specified.
>    - The query is especially nasty, because some servers treat the values
>    as a series of bytes, and others as a series of characters.
>    - I don't know if it is a concern, but because there is no termination
>    criterion for a URL, in plaintext two processes can get different results.
>    For example, http://unicode.org/cldr/utility/list-unicodeset.jsp?a=a‰&g=gc.
>    Some email clients will stop before the ‰; others continue through the
>    final period.
>
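
A small illustration of the bytes-versus-characters point about queries, using Python's standard urllib.parse. The sample value "caf%E9" is made up for this sketch; it is one percent-encoded byte that is valid Latin-1 but not valid UTF-8.

```python
from urllib.parse import unquote, unquote_to_bytes

raw_value = "caf%E9"  # hypothetical query value

# A server that treats the value as a series of bytes sees this:
as_bytes = unquote_to_bytes(raw_value)

# Servers that treat it as characters must pick an encoding, and may disagree:
as_latin1 = unquote(raw_value, encoding="latin-1")
as_utf8 = unquote(raw_value)  # defaults: encoding="utf-8", errors="replace"

print(as_bytes)        # b'caf\xe9'
print(as_latin1)       # café
print(repr(as_utf8))   # 'caf\ufffd'
```

Three consumers of the same query string end up with three different values, so any canonical form would have to say which interpretation applies.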
>
> Mark
>
> *— Il meglio è l’inimico del bene —*
>
>
>
> On Thu, Nov 4, 2010 at 04:12, "Martin J. Dürst" <duerst@it.aoyama.ac.jp> wrote:
>
>> Hello Dave,
>>
>> I'm re-forwarding your mail to the IRI WG list, just in case somebody has
>> some comments.
>>
>> Regards,   Martin.
>>
>>
>> On 2010/11/04 1:41, Dave Thaler wrote:
>>
>>> -----Original Message-----
>>>> From: precis-bounces@ietf.org [mailto:precis-bounces@ietf.org] On
>>>> Behalf Of
>>>> "Martin J. Dürst"
>>>> Sent: Monday, November 01, 2010 10:53 PM
>>>> To: precis@ietf.org
>>>> Cc: Yaron Goland
>>>> Subject: Re: [precis] Canonicalization of IRIs in security contexts
>>>>
>>>> Yaron sent the mail below to the IRI WG about half a year ago. My guess
>>>> is that
>>>> nobody has yet looked at it because it's simply overwhelming.
>>>>
>>>> It just occurred to me that this mail might contain some material that
>>>> is in one
>>>> way or another relevant to precis, so I'm forwarding it for future
>>>> reference.
>>>>
>>>> Regards,   Martin.
>>>>
>>>
>>> I plan to start an I-D on this topic in the near future (after Beijing),
>>> so would
>>> welcome any pointers or other contributions.
>>>
>>> -Dave
>>>
>>>>
>>>> On 2010/04/06 8:32, Yaron Goland wrote:
>>>>
>>>>> Of late I've been worrying about the use of URIs/IRIs in security
>>>>> contexts. So I
>>>>>
>>>> wrote up a paper that explores some of the issues and have included it
>>>> below. I
>>>> shared this paper with Ted Hardie, Larry Masinter and Dave Thaler. We
>>>> were
>>>> mostly discussing who should actually own worrying about this problem.
>>>> Ted
>>>> suggested that NewPrep (assuming it gets created as a WG) should own
>>>> this.
>>>> Larry just asked that we move this discussion to the IRI mailing list as
>>>> the IRI WG
>>>> is now worrying about security considerations. So here is the paper.
>>>>
>>>>>
>>>>> Thoughts?
>>>>>
>>>>>                                  Thanks,
>>>>>
>>>>>                                                  Yaron
>>>>>
>>>>> Secure Comparison of URIs and IRIs in security token environments
>>>>>
>>>>> Current purpose of this document
>>>>> The purpose of this paper is to show that a problem exists with URI
>>>>> canonicalization in the context of security token environments and
>>>>> that this problem needs to be resolved.
>>>>
>>>>>
>>>>> This paper does not contain nor attempt to contain an exhaustive
>>>>> collection of
>>>>>
>>>> URI canonicalization issues. Rather it contains what is hoped to be a
>>>> sufficiently
>>>> large collection of canonicalization issues to motivate the need for a
>>>> solution.
>>>>
>>>>> Problem Description
>>>>> This paper looks at issues related to using URIs in secure ways in
>>>>> security token
>>>>>
>>>> based access control systems. Examples of such systems include WS-*,
>>>> SAML-P
>>>> and OAuth WRAP. In such systems a variety of participants in the
>>>> security
>>>> infrastructure are identified by URIs. For example, requesters of
>>>> security tokens
>>>> are sometimes identified with URIs. The issuers of security tokens and
>>>> the
>>>> relying parties who are intended to consume security tokens are
>>>> frequently
>>>> identified by URIs. Claims in security tokens often have their types
>>>> defined using
>>>> URIs and the values of the claims can also be URIs.
>>>>
>>>>>
>>>>> The most common operation on URIs in a security token context is a
>>>>> straightforward
>>>>>
>>>> comparison. For example, a relying party is consuming a security
>>>> token.
>>>> The relying party will want to look up the name of the issuer of the
>>>> security
>>>> token, which can be a URI, in their local database and find the keying
>>>> material
>>>> associated with that issuer. The relying party will then use the keying
>>>> material to
>>>> validate that the security token is valid. This pattern requires a
>>>> simple
>>>> comparison of the submitted URIs with recorded URIs.
>>>>
>>>>>
>>>>> As outlined in the rest of this document there are a number of
>>>>> decisions that a
>>>>>
>>>> canonicalizer can make when canonicalizing URIs for comparison purposes.
>>>> For
>>>> example, some URI canonicalizers will strip out fragments so that
>>>> http://example.com/foo#1234 and http://example.com/foo will be treated
>>>> as
>>>> equal. Similar treatment is also provided for userinfo, e.g.
>>>> http://joe:password@example.com/foo will be treated the same as
>>>> http://example.com/foo. And all of this is before even beginning to
>>>> think
>>>> through Unicode issues such as how to deal with case insensitive
>>>> environments.
>>>>
>>>>>
>>>>> The reason these inconsistencies matter is that they open up potential
>>>>> security
>>>>>
>>>> holes. For example, the Foo corporation has paid money to the
>>>> example.com
>>>> corporation for access to the stuff service. The Foo corporation allows
>>>> its
>>>> employees to create accounts on the stuff service. So user Joe
>>>> could get
>>>> the account http://example.com/stuff/FooCorp/joe and the user Jane
>>>> could get
>>>> http://example.com/stuff/FooCorp/Jane. It turns out, however, that Foo
>>>> Corp's
>>>> canonicalizer honors fragments for comparison purposes. So Jack, who is
>>>> a
>>>> malicious employee of Foo Corp, asks to create an account at
>>>> example.com
>>>> with the name joe#stuff. Foo Corp's URI logic checks its records for
>>>> accounts it
>>>> has created with stuff and sees that there is no account with the name
>>>> joe#stuff
>>>> so, in its records, it associates the account joe#stuff with Jack and
>>>> will only issue
>>>> tokens good for use with http://example.com/stuff/FooCorp/joe#stuff to
>>>> Jack.
>>>>
>>>>>
>>>>> Jack, the attacker, goes to the security token service at Foo Corp and
>>>>> asks for
>>>>>
>>>> a security token good for http://example.com/stuff/FooCorp/joe#stuff.
>>>> FooCorp is happy to issue the token since Jack is the legitimate owner
>>>> (in Foo
>>>> Corp's eyes) of the joe#stuff account. Jack then submits the security
>>>> token in a
>>>> request to http://example.com/stuff/FooCorp/joe.
>>>>
>>>>>
>>>>> But example.com uses a URI canonicalizer, that for the purposes of
>>>>> checking
>>>>>
>>>> equality, ignores fragments. So when example.com looks in the security
>>>> token
>>>> to see if the requester has permission from Foo Corp to access the given
>>>> account it successfully matches the URI in the security token,
>>>> http://example.com/stuff/FooCorp/joe#stuff with the request-URI
>>>> http://example.com/stuff/FooCorp/joe.
>>>>
>>>>>
>>>>> Leveraging the inconsistencies in the canonicalizers used by Foo Corp
>>>>> and
>>>>>
>>>> example.com, Jack is able to successfully launch an elevation of
>>>> privilege attack.
>>>>
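
The Foo Corp / example.com mismatch described above can be sketched in a few lines. The two key functions are hypothetical stand-ins for the two canonicalizers; only urldefrag is a real standard-library call.

```python
from urllib.parse import urldefrag


def issuer_key(uri: str) -> str:
    """Stand-in for Foo Corp's canonicalizer: fragments are significant."""
    return uri


def relying_party_key(uri: str) -> str:
    """Stand-in for example.com's canonicalizer: fragments are ignored."""
    return urldefrag(uri).url


joe = "http://example.com/stuff/FooCorp/joe"
jack = "http://example.com/stuff/FooCorp/joe#stuff"

# Foo Corp sees two different accounts, so it issues Jack a token for his own.
issuer_sees_distinct = issuer_key(joe) != issuer_key(jack)
# example.com sees one account, so Jack's token matches Joe's request-URI.
relying_party_sees_same = relying_party_key(joe) == relying_party_key(jack)

print(issuer_sees_distinct, relying_party_sees_same)  # True True
```

Either rule is defensible on its own; the hole comes from the two parties applying different rules to the same identifier.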
>>>>> What's up with the colors and the weird SCUXXX identifiers?
>>>>> I track requirements using unique identifiers. So each requirement gets
>>>>> an
>>>>>
>>>> identifier of the form SCUXXX where XXX are three alphabetic letters.
>>>> There is no
>>>> meaning to each identifier. I just generate them as I need them. I use a
>>>> dedicated style for the requirements both to highlight them and also to
>>>> make it
>>>> easy to generate a table of them automatically at the end of the doc.
>>>>
>>>>> Relative URIs
>>>>> Is it possible to have meaningful URI comparisons involving relative
>>>>> URIs or do
>>>>>
>>>> we require that all URIs are fully qualified before being submitted to
>>>> the
>>>> canonicalization algorithm?
>>>>
>>>>>
>>>>>
>>>>> SCUAAA - A secure URI canonicalization profile MUST define if it allows
>>>>>
>>>> relative URIs.
>>>>
>>>>>
>>>>> Hostname or URI resolution
>>>>> Some systems (specifically Java) used to follow the rule that if two
>>>>> host names
>>>>>
>>>> resolved to the same IP then the host names were considered equal. But
>>>> with
>>>> the introduction of virtual hosting and dynamic IP addresses this method
>>>> of
>>>> comparison cannot be relied upon.
>>>>
>>>>>
>>>>> In addition a comparison mechanism which relies on the ability to
>>>>> resolve
>>>>>
>>>> identifiers like host names to other identifiers like IP addresses
>>>> inherently leaks
>>>> information about security decisions to outsiders since these kinds of
>>>> queries are
>>>> often publicly viewable (e.g. someone could track DNS traffic and from
>>>> that
>>>> determine who an entity was likely getting security tokens from or being
>>>> asked
>>>> to generate security tokens to). So are there security issues in
>>>> requiring name
>>>> resolution as part of the canonicalization algorithm?
>>>>
>>>>>
>>>>> And, if a canonicalization algorithm does require some kind of network
>>>>> access
>>>>>
>>>> to work, how does it function in network restricted or offline contexts?
>>>>
>>>>>
>>>>>
>>>>> SCUAAB - A secure URI canonicalization profile MUST define if it
>>>>> requires
>>>>>
>>>> network access in order to canonicalize a URI.
>>>>
>>>>>
>>>>>
>>>>> SCUAAS - A secure URI canonicalization profile MUST define if it compares
>>>>> host
>>>>>
>>>> name values to host name values or if it requires the host name to first
>>>> be
>>>> resolved to an IP address or some other underlying identifier as part of
>>>> the
>>>> canonicalization process.
>>>>
>>>>>
>>>>> Fragment components
>>>>> Some URI formats include fragment identifiers. These are typically
>>>>> handles to
>>>>>
>>>> locations within a resource and are used for local reference. A classic
>>>> example is
>>>> the use of fragments in HTTP URLs where a URL of the form
>>>> http://foo.com/blah.html#ick means "retrieve the resource
>>>> http://foo.com/blah.html and, once it has arrived locally, find the HTML
>>>> anchor
>>>> named 'ick' and display that."
>>>>
>>>>>
>>>>> So, for example, when a user clicks on the link
>>>>> http://foo.com/blah.html#baz a
>>>>>
>>>> browser will check its cache by doing a URI comparison for
>>>> http://foo.com/blah.html and if the resource is present in the cache a
>>>> match is
>>>> declared.
>>>>
>>>>>
>>>>>
>>>>> SCUAAC - A secure URI canonicalization profile MUST define how URI
>>>>>
>>>> fragments are to be treated as part of the canonicalization process.
>>>>
>>>>>
>>>>> Query components
>>>>> Similar to fragments, there is the question of whether
>>>>> http://foo.com/blah and http://foo.com/blah? (the latter with an empty
>>>>> query) are equal or different.
>>>>
>>>>>
>>>>>
>>>>> SCUAAR - A secure URI canonicalization profile MUST define how query
>>>>>
>>>> components of URIs are to be treated as part of the canonicalization
>>>> process.
>>>>
>>>>>
>>>>> But what about the values in a query component? Should
>>>>>
>>>> http://foo.com/blah?ick=bick&foo=bar be considered equal to
>>>> http://foo.com/blah?foo=bar&ick=bick?
>>>>
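
One possible answer to the re-ordering question, sketched under the assumption that a profile chose to sort query arguments by name and then value. The function name sort_query is hypothetical, and sorting is only one of several policies a profile could pick.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def sort_query(uri: str) -> str:
    """Hypothetical policy: order query arguments by (name, value)."""
    parts = urlsplit(uri)
    pairs = sorted(parse_qsl(parts.query, keep_blank_values=True))
    return urlunsplit(parts._replace(query=urlencode(pairs)))


a = "http://foo.com/blah?ick=bick&foo=bar"
b = "http://foo.com/blah?foo=bar&ick=bick"

print(sort_query(a))                    # http://foo.com/blah?foo=bar&ick=bick
print(sort_query(a) == sort_query(b))   # True
```

Note the side effects: parse_qsl decodes and urlencode re-encodes each value, and sorting changes the relative order of repeated names, which some servers treat as significant. A profile that allows re-ordering would have to address both.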
>>>>>
>>>>>
>>>>> SCUAAY - A secure URI canonicalization profile MUST define if it will
>>>>> allow for
>>>>>
>>>> the re-ordering of query argument values and if so, how.
>>>>
>>>>>
>>>>> URI Scheme names
>>>>> RFC 3986 defines URI schemes as being case insensitive and in section
>>>>> 6.2.2.1
>>>>>
>>>> specifies that scheme names should be normalized to lower case
>>>> characters. But
>>>> separately it specifies that percent-encoded characters should be
>>>> normalized to
>>>> upper case characters. Do we want this inconsistency?
>>>>
>>>>>
>>>>>
>>>>> SCUAAF - A secure URI canonicalization profile MUST define how URI
>>>>> scheme names are to be normalized (e.g. to upper or lower case?)
>>>>>
>>>>> Host names
>>>>>
>>>>> SCUAAM - A secure URI canonicalization profile MUST define how URI
>>>>> host names are to be normalized (e.g. to upper or lower case
>>>>> characters?)
>>>>>
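
A rough sketch of the case rules discussed in the two sections above: scheme and host to lower case, hex digits in percent-encoded triplets to upper case. The helper names are made up for illustration, and the sketch assumes the authority carries no userinfo (which is case sensitive and would be wrongly lower-cased here).

```python
import re
from urllib.parse import urlsplit, urlunsplit


def upper_hex(text: str) -> str:
    """Upper-case the hex digits of every percent-encoded triplet."""
    return re.sub(r"%[0-9a-fA-F]{2}", lambda m: m.group(0).upper(), text)


def normalize_case(uri: str) -> str:
    """Hypothetical case normalization; assumes no userinfo in the authority."""
    parts = urlsplit(uri)
    return urlunsplit((
        parts.scheme.lower(),
        upper_hex(parts.netloc.lower()),
        upper_hex(parts.path),
        upper_hex(parts.query),
        upper_hex(parts.fragment),
    ))


print(normalize_case("HTTP://Example.COM/a%2fb%C3%a9"))
# http://example.com/a%2Fb%C3%A9
```

The path itself is left alone apart from the triplets, since whether path case matters depends on the server.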
>>>>> Userinfo
>>>>> RFC 3986 defines the userinfo production that allows arbitrary data
>>>>> about the
>>>>>
>>>> user of the URI to be placed before @ signs in URIs. For example:
>>>> http://joe:jane:jack:yo@example.com/bar has the value
>>>> "joe:jane:jack:yo" as its
>>>> userinfo. When canonicalizing a URI in a security context, should the
>>>> userinfo
>>>> be left in? Some URI comparison services, for example, treat
>>>> http://joe:ick@example.com and http://example.com as being equal.
>>>>
>>>>>
>>>>>
>>>>> SCUABD - A secure URI canonicalization profile MUST specify what is to
>>>>>
>>>> happen to any userinfo included in a URI during the canonicalization
>>>> process.
>>>>
>>>>>
>>>>> IPv6 Host Names
>>>>> IPv6 names have a wide variety of alternate but semantically identical
>>>>>
>>>> syntaxes.
>>>>
>>>>>
>>>>>
>>>>> SCUAAK - A secure URI canonicalization profile MUST define how IPv6
>>>>>
>>>> addresses are canonicalized to a standard format.
>>>>
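
To illustrate the alternate-syntax problem, here is a small sketch that maps several spellings of one IPv6 address to a single text form using Python's standard ipaddress module. Using the module's compressed form as the canonical one is an assumption of this sketch, not something the paper specifies.

```python
import ipaddress


def canonical_ipv6(text: str) -> str:
    """Assumed canonical form: the ipaddress module's compressed spelling."""
    return ipaddress.IPv6Address(text).compressed


forms = [
    "2001:DB8:0:0:0:0:0:1",
    "2001:db8::1",
    "2001:0db8:0000:0000:0000:0000:0000:0001",
]

print({canonical_ipv6(form) for form in forms})  # {'2001:db8::1'}
```

A canonicalizer that compared these three spellings as plain strings would treat them as three different hosts.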
>>>>>
>>>>> IPv4 Host Names
>>>>> The BNF for URIs is ambiguous when it comes to distinguishing IPv4
>>>>> addresses
>>>>>
>>>> from registered names. RFC 3986 tries to resolve this ambiguity by
>>>> arguing that
>>>> when processing a host name if it matches the IPv4 production
>>>> IPv4address then
>>>> it is an IPv4 address otherwise it is a reg-name. But this solution
>>>> seems on its
>>>> face unsatisfying as it is likely to be confusing to normal users. Can
>>>> we really
>>>> expect a normal user when dealing with a security context to fully grasp
>>>> that
>>>> 12.12.12.12 will be treated as an IPv4 address and not as a DNS host
>>>> name?
>>>> Maybe IPv4 addresses should just be banned from canonicalization because
>>>> of
>>>> the confusion they can cause? Or perhaps domain names that look like
>>>> IPv4
>>>> addresses should be banned? This is similar in spirit to the homograph
>>>> problem in
>>>> Unicode.
>>>>
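
The "if it matches IPv4address it is an address, otherwise it is a reg-name" rule can be approximated as below. This sketch leans on Python's ipaddress parser as a stand-in for the RFC 3986 IPv4address production, which is an assumption; the two are close for dotted-decimal input but are not guaranteed identical in every edge case.

```python
import ipaddress


def classify_host(host: str) -> str:
    """Approximate the RFC 3986 rule: IPv4 literal first, else reg-name."""
    try:
        ipaddress.IPv4Address(host)
    except ipaddress.AddressValueError:
        return "reg-name"
    return "IPv4address"


print(classify_host("12.12.12.12"))       # IPv4address
print(classify_host("12.12.12.256"))      # reg-name
print(classify_host("12.12.12.example"))  # reg-name
```

The second case is the confusing one for a user: a string that looks like an address but has an out-of-range octet silently becomes a registered name.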
>>>>>
>>>>>
>>>>> SCUABD - A secure URI canonicalization profile MUST specify how it
>>>>> handles
>>>>>
>>>> IPv4 addresses and the ambiguities of IPv4 versus reg-names.
>>>>
>>>>>
>>>>> DNS versus non-DNS names
>>>>> RFC 3986 explicitly allows for the idea that host names might not be
>>>>> DNS
>>>>>
>>>> names (or IP addresses). But no mechanism is provided to explicitly
>>>> indicate
>>>> when a host name is not a DNS name. This can lead to potential security
>>>> issues if
>>>> the sender of a URI thinks they are referring to a non-DNS name while
>>>> the
>>>> receiver of the URI believes that the host name is a DNS Name.
>>>>
>>>>>
>>>>>
>>>>> SCUAAT - A secure URI canonicalization profile MUST define if
>>>>> non-DNS/IP
>>>>>
>>>> names are allowed as host names.
>>>>
>>>>>
>>>>> Punycode versus non-ASCII Host name characters
>>>>> RFC 3986 in section 3.2.2 specifically allows for the use of URL
>>>>> encoded UTF-8 characters in the host name, in addition to the use of
>>>>> IDNA names. This creates an
>>>> ambiguity for
>>>> canonicalization since it isn't clear if all host names that involve
>>>> international
>>>> characters should be canonicalized to IDNA names or perhaps IDNA names
>>>> and
>>>> host names with international characters are considered mutually
>>>> exclusive?
>>>>
>>>>>
>>>>>
>>>>> SCUAAU - A secure URI canonicalization profile MUST define the
>>>>>
>>>> canonicalization relationship of host names with internationalized
>>>> characters
>>>> and IDNA names.
>>>>
>>>>>
>>>>> Path Segment Normalization
>>>>> RFC 3986 supports the use of path segment values such as ./ or ../ for
>>>>> relative
>>>>>
>>>> URLs. Strictly speaking including such path segment values in a fully
>>>> qualified URI
>>>> is syntactically illegal but RFC 3986 nevertheless defines an algorithm
>>>> to remove
>>>> them (see section 5.2.4 of RFC 3986).
>>>>
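
For reference, a sketch of the dot-segment removal procedure that RFC 3986 describes in its section 5.2.4 ("Remove Dot Segments"). This is a transcription written for illustration rather than a vetted implementation; the two sample paths are the examples given in that section.

```python
def remove_dot_segments(path: str) -> str:
    """Remove "." and ".." segments, following RFC 3986 section 5.2.4."""
    output = []
    while path:
        if path.startswith("../"):
            path = path[3:]
        elif path.startswith("./"):
            path = path[2:]
        elif path.startswith("/./"):
            path = path[2:]          # "/./" becomes "/"
        elif path == "/.":
            path = "/"
        elif path.startswith("/../"):
            path = path[3:]          # "/../" becomes "/"
            if output:
                output.pop()         # drop the previous segment
        elif path == "/..":
            path = "/"
            if output:
                output.pop()
        elif path in (".", ".."):
            path = ""
        else:
            # Move the first segment (with its leading "/", if any) to output.
            start = 1 if path.startswith("/") else 0
            end = path.find("/", start)
            if end == -1:
                end = len(path)
            output.append(path[:end])
            path = path[end:]
    return "".join(output)


print(remove_dot_segments("/a/b/c/./../../g"))   # /a/g
print(remove_dot_segments("mid/content=5/../6"))  # mid/6
```

Whether a secure profile should run this step or reject such URIs outright is exactly the open question in this section.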
>>>>>
>>>>>
>>>>> SCUAAP - A secure URI canonicalization profile MUST define if "." or
>>>>> ".."
>>>>>
>>>> characters are allowed as relative references in fully qualified URIs
>>>> and if so how
>>>> they are to be canonicalized.
>>>>
>>>>>
>>>>> Percent Encoding
>>>>>
>>>>> SCUAAY - A secure URI canonicalization profile MUST define how to
>>>>>
>>>> canonicalize percent encoded characters that are not going to be
>>>> unencoded.
>>>>
>>>>>
>>>>> RFC 3986 actually specifies that alphabetic characters in percent
>>>>> encoding
>>>>>
>>>> (which are required to be in US-ASCII) should be canonicalized to upper
>>>> case,
>>>> which is inconsistent with how host names and scheme names are treated.
>>>>
>>>>>
>>>>>
>>>>> SCUAAZ - A secure URI canonicalization profile MUST define if
>>>>> characters that
>>>>>
>>>> are percent encoded but do not require percent encoding should be
>>>> decoded as
>>>> part of the canonicalization process.
>>>>
>>>>>
>>>>> The previous, btw, assumes that we can even tell when a character
>>>>> didn't need
>>>>>
>>>> encoding. For example, a delimiter character like "/" often needs
>>>> encoding so if
>>>> we see one encoded, especially in a scheme we don't explicitly support,
>>>> it's
>>>> ambiguous if it was unnecessarily encoded. On the other hand if we see
>>>> the
>>>> letter "a" encoded it's highly unlikely that was unnecessary. But is it
>>>> guaranteed
>>>> that it is unnecessary? Section 2.3 of RFC 3986 defines a set of
>>>> characters it
>>>> argues should be decoded but is that decoding required in the
>>>> canonicalization
>>>> process?
>>>>
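
A sketch of one conservative decoding policy: decode a percent-encoded triplet only when it stands for an unreserved character (the set listed in section 2.3 of RFC 3986), and otherwise leave it encoded with upper-case hex. The function name and the choice of policy are assumptions for illustration.

```python
import re
import string

# Unreserved characters per RFC 3986 section 2.3.
UNRESERVED = set(string.ascii_letters + string.digits + "-._~")


def decode_unreserved(text: str) -> str:
    """Decode only triplets that encode unreserved characters."""
    def replace(match: re.Match) -> str:
        char = chr(int(match.group(1), 16))
        if char in UNRESERVED:
            return char
        return match.group(0).upper()  # keep encoded, normalize hex case
    return re.sub(r"%([0-9A-Fa-f]{2})", replace, text)


print(decode_unreserved("/%61%2fb%7Euser"))  # /a%2Fb~user
```

The encoded "/" stays encoded, which sidesteps the ambiguity about delimiters raised above, at the cost of never decoding anything outside the unreserved set.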
>>>>>
>>>>>
>>>>> SCUABA - A secure URI canonicalization profile MUST define when, if
>>>>> ever, it
>>>>>
>>>> requires percent encoded characters to be decoded.
>>>>
>>>>>
>>>>> Unicode
>>>>>
>>>>> SCUABF - I need a stiff drink before I even begin to think about this
>>>>> section.
>>>>>
>>>> But http://unicode.org/reports/tr36/ makes for some motivational
>>>> reading. Or
>>>> for those with a more visual bent -
>>>> http://www.casabasecurity.com/files/Chris_Weber_Character%20Transformations%20v1.7_IUC33.pdf.
>>>>
>>>>>
>>>>> Transcription
>>>>> One of the key goals of the URI design was to enable human
>>>>> transcription of
>>>>>
>>>> URIs. But is this a goal for canonicalization in a secure context?
>>>> Should secure
>>>> canonicalization just worry about having an easy to generate machine
>>>> readable
>>>> format or is there a requirement that the output of the canonicalization
>>>> be
>>>> transcribable?
>>>>
>>>>>
>>>>>
>>>>> SCUABD - A secure URI canonicalization profile MUST define if
>>>>> transcription of
>>>>>
>>>> the canonicalized URIs it produces is a goal.
>>>>
>>>>>
>>>>> Handling unrecognized schemes
>>>>> Is it ever safe for a canonicalizer to canonicalize an unrecognized
>>>>> URI/IRI
>>>>>
>>>> scheme type? For example, a new URI scheme type IPPY might have a
>>>> default
>>>> port of X. Therefore IPPY://foo.com:X and IPPY://foo.com should be
>>>> treated as
>>>> equivalent since X is the default port for the IPPY scheme. But a
>>>> canonicalizer
>>>> that doesn't know the IPPY scheme also will not know its default port
>>>> and so
>>>> cannot safely canonicalize a URI with an unrecognized scheme. Similar
>>>> issues
>>>> apply when dealing with default hosts. A canonicalizer dealing with a
>>>> file URL
>>>> that didn't know that localhost is a reserved host value and equivalent
>>>> to an
>>>> empty host couldn't canonicalize in a reasonable way.
>>>>
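
The default-port problem can be sketched as follows. The DEFAULT_PORTS table and the decision to refuse unknown schemes are assumptions of this example; a real profile might instead pass unknown schemes through untouched.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical table of the schemes this canonicalizer knows about.
DEFAULT_PORTS = {"http": 80, "https": 443}


def strip_default_port(uri: str) -> str:
    """Drop an explicit default port; refuse schemes we do not recognize."""
    parts = urlsplit(uri)
    default = DEFAULT_PORTS.get(parts.scheme)
    if default is None:
        raise ValueError(f"unrecognized scheme: {parts.scheme!r}")
    if parts.port == default:
        # Remove the trailing ":port" from the authority.
        parts = parts._replace(netloc=parts.netloc.rsplit(":", 1)[0])
    return urlunsplit(parts)


print(strip_default_port("http://foo.com:80/"))     # http://foo.com/
print(strip_default_port("https://foo.com:8443/"))  # https://foo.com:8443/
```

Calling strip_default_port on an "ippy://" URI raises ValueError here, because without knowing the scheme's default port the function cannot tell whether the port is redundant.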
>>>>>
>>>>>
>>>>> SCUABC - A secure URI canonicalization profile MUST specify if the
>>>>>
>>>> canonicalizer is allowed to canonicalize unrecognized URI schemes and if
>>>> so,
>>>> how.
>>>>
>>>>>
>>>>> Handling unrecognized IP address types
>>>>> RFC 3986 introduces an extension point to enable future changes to the
>>>>> IP address format using the
>>>>>
>>>> IPvFuture production. But can a canonicalizer safely deal with an IP
>>>> syntax it
>>>> doesn't explicitly recognize? The example of IPv6 which has many forms
>>>> with the
>>>> same semantic content is instructive as a canonicalizer that encountered
>>>> an IPv6
>>>> address but didn't recognize such addresses could not perform necessary
>>>> canonicalization.
>>>>
>>>>>
>>>>>
>>>>> SCUABE - A secure URI canonicalization profile MUST specify if the
>>>>>
>>>> canonicalizer is allowed to canonicalize unrecognized IP address formats
>>>> and if
>>>> so, how.
>>>>
>>>>>
>>>>> Handling syntactically illegal URIs
>>>>> What happens if a URI that is submitted for canonicalization is
>>>>> syntactically
>>>>>
>>>> illegal? Do we try to canonicalize around the errors or just reject the
>>>> URI
>>>> altogether? This all assumes that the canonicalization profile even
>>>> requires
>>>> detecting if the URI is syntactically legal in the first place.
>>>>
>>>>>
>>>>>
>>>>> SCUABD - A secure URI canonicalization profile MUST specify how it
>>>>> handles
>>>>>
>>>> URIs that are syntactically illegal.
>>>>
>>>>>
>>>>> Which canonicalization profile is being used?
>>>>> Can we really have a single canonicalization profile or do we need
>>>>> multiple
>>>>>
>>>> ones? At a minimum I would imagine that we would have one profile for
>>>> environments that treat URIs in a case sensitive manner and another for
>>>> URIs in
>>>> a case insensitive manner.
>>>>
>>>>>
>>>>>
>>>>> SCUABH - A secure URI canonicalization profile MUST specify how many
>>>>>
>>>> different canonicalization profiles it supports.
>>>>
>>>>>
>>>>> And if there is more than one canonicalization profile doesn't this
>>>>> place
>>>>>
>>>> requirements on security token formats and protocols that use the
>>>> canonicalization mechanism to explicitly define which profile they
>>>> expect will be
>>>> used with a particular URI?
>>>>
>>>>>
>>>>>
>>>>> SCUABI - A secure URI canonicalization profile MUST specify what
>>>>>
>>>> requirements, if any, it places on formats or protocols that leverage
>>>> the profile.
>>>>
>>>>>
>>>>> Proposed Requirements
>>>>> This is where the actual URI canonicalization profile(s) would go.
>>>>> Q&A
>>>>> This is where we would answer questions about the tradeoffs and design
>>>>>
>>>> choices about the canonicalization profile(s).
>>>>
>>>>> Appendix
>>>>> General Requirements
>>>>>
>>>>> SCUAAA - A secure URI canonicalization profile MUST define if it allows
>>>>>
>>>> relative URIs.
>>>>
>>>>>
>>>>> SCUAAB - A secure URI canonicalization profile MUST define if it
>>>>> requires
>>>>>
>>>> network access in order to canonicalize a URI.
>>>>
>>>>>
>>>>> SCUAAS - A secure URI canonicalization profile MUST define if it compares
>>>>> host
>>>>>
>>>> name values to host name values or if it requires the host name to first
>>>> be
>>>> resolved to an IP address or some other underlying identifier as part of
>>>> the
>>>> canonicalization process.
>>>>
>>>>>
>>>>> SCUAAC - A secure URI canonicalization profile MUST define how URI
>>>>>
>>>> fragments are to be treated as part of the canonicalization process.
>>>>
>>>>>
>>>>> SCUAAR - A secure URI canonicalization profile MUST define how query
>>>>>
>>>> components of URIs are to be treated as part of the canonicalization
>>>> process.
>>>>
>>>>>
>>>>> SCUAAY - A secure URI canonicalization profile MUST define if it will
>>>>> allow for
>>>>>
>>>> the re-ordering of query argument values and if so, how.
>>>>
>>>>>
>>>>> SCUAAF - A secure URI canonicalization profile MUST define how URI
>>>>> scheme names are to be normalized (e.g. to upper or lower case?)
>>>>>
>>>>> SCUAAM - A secure URI canonicalization profile MUST define how URI
>>>>> host names are to be normalized (e.g. to upper or lower case
>>>>> characters?)
>>>>>
>>>>> SCUABD - A secure URI canonicalization profile MUST specify what is to
>>>>>
>>>> happen to any userinfo included in a URI during the canonicalization
>>>> process.
>>>>
>>>>>
>>>>> SCUAAK - A secure URI canonicalization profile MUST define how IPv6
>>>>>
>>>> addresses are canonicalized to a standard format.
>>>>
>>>>>
>>>>> SCUABD - A secure URI canonicalization profile MUST specify how it
>>>>> handles
>>>>>
>>>> IPv4 addresses and the ambiguities of IPv4 versus reg-names.
>>>>
>>>>>
>>>>> SCUAAT - A secure URI canonicalization profile MUST define if
>>>>> non-DNS/IP
>>>>>
>>>> names are allowed as host names.
>>>>
>>>>>
>>>>> SCUAAU - A secure URI canonicalization profile MUST define the
>>>>>
>>>> canonicalization relationship of host names with internationalized
>>>> characters
>>>> and IDNA names.
>>>>
>>>>>
>>>>> SCUAAP - A secure URI canonicalization profile MUST define if "." or
>>>>> ".."
>>>>>
>>>> characters are allowed as relative references in fully qualified URIs
>>>> and if so how
>>>> they are to be canonicalized.
>>>>
>>>>>
>>>>> SCUAAY - A secure URI canonicalization profile MUST define how to
>>>>>
>>>> canonicalize percent encoded characters that are not going to be
>>>> unencoded.
>>>>
>>>>>
>>>>> SCUAAZ - A secure URI canonicalization profile MUST define if
>>>>> characters that
>>>>>
>>>> are percent encoded but do not require percent encoding should be
>>>> decoded as
>>>> part of the canonicalization process.
>>>>
>>>>>
>>>>> SCUABA - A secure URI canonicalization profile MUST define when, if
>>>>> ever, it
>>>>>
>>>> requires percent encoded characters to be decoded.
>>>>
>>>>>
>>>>> SCUABD - A secure URI canonicalization profile MUST define if
>>>>> transcription of
>>>>>
>>>> the canonicalized URIs it produces is a goal.
>>>>
>>>>>
>>>>> SCUABC - A secure URI canonicalization profile MUST specify if the
>>>>>
>>>> canonicalizer is allowed to canonicalize unrecognized URI schemes and if
>>>> so,
>>>> how.
>>>>
>>>>>
>>>>> SCUABE - A secure URI canonicalization profile MUST specify if the
>>>>>
>>>> canonicalizer is allowed to canonicalize unrecognized IP address formats
>>>> and if
>>>> so, how.
>>>>
>>>>>
>>>>> SCUABD - A secure URI canonicalization profile MUST specify how it
>>>>> handles
>>>>>
>>>> URIs that are syntactically illegal.
>>>>
>>>>>
>>>>> SCUABH - A secure URI canonicalization profile MUST specify how many
>>>>>
>>>> different canonicalization profiles it supports.
>>>>
>>>>>
>>>>> SCUABI - A secure URI canonicalization profile MUST specify what
>>>>>
>>>> requirements, if any, it places on formats or protocols that leverage
>>>> the profile.
>>>>
>>>>>
>>>>> Implementation Requirements
>>>>> Open Issues
>>>>>
>>>>> SCUABF - I need a stiff drink before I even begin to think about this
>>>>> section.
>>>>>
>>>> But http://unicode.org/reports/tr36/ makes for some motivational
>>>> reading. Or
>>>> for those with a more visual bent -
>>>> http://www.casabasecurity.com/files/Chris_Weber_Character%20Transformations%20v1.7_IUC33.pdf.
>>>>
>>>>>
>>>>> Last Used ID
>>>>> SCUABI
>>>>>
>>>>>
>>>>>
>>>>>
>>>> --
>>>>
>>>> #-# Martin J. Dürst, Professor, Aoyama Gakuin University
>>>> #-# http://www.sw.it.aoyama.ac.jp   mailto:duerst@it.aoyama.ac.jp
>>>> _______________________________________________
>>>> precis mailing list
>>>> precis@ietf.org
>>>> https://www.ietf.org/mailman/listinfo/precis
>>>>
>>>
>>>
>>>
>> --
>> #-# Martin J. Dürst, Professor, Aoyama Gakuin University
>> #-# http://www.sw.it.aoyama.ac.jp   mailto:duerst@it.aoyama.ac.jp
>>
>>
>

Received on Thursday, 4 November 2010 13:49:13 UTC