Network Working Group                                        P. Stickler
Internet-Draft                                                       NRC
Expires: October 12, 2002                                 April 13, 2002

The 'val:' URI Scheme for Denoting and Describing Datatype Values
draft-pstickler-val-01

Status of this Memo

This document is an Internet-Draft and is in full conformance with all provisions of Section 10 of RFC2026.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt.

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.

This Internet-Draft will expire on October 12, 2002.

Copyright Notice

Copyright (C) The Internet Society (2002). All Rights Reserved.

Abstract

This document describes the 'val:' Uniform Resource Identifier (URI) scheme for denoting and describing datatype values which are represented by a pairing of the datatype and a lexical form.




1. Overview

XML Schema [5] defines a simple datatype as consisting of a value space (a set of values), a lexical space (a set of lexical forms, or strings), and an N:1 mapping from members of the lexical space to members of the value space. In the context of a given datatype, each lexical form (member of the lexical space) therefore denotes one and only one value (member of the value space), so the pairing of a datatype and a lexical form unambiguously identifies a specific value of that datatype.
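The N:1 mapping can be illustrated with a minimal Python sketch. It assumes that Python's built-in int() is an acceptable stand-in for the lexical-to-value mapping of the XML Schema integer datatype; the function name lexical_to_value is hypothetical and is not defined by this document or by XML Schema.

```python
# Illustration only: Python's int() stands in for the XML Schema integer
# lexical-to-value mapping. int() also accepts some strings (such as "1_5")
# that are not valid XML Schema integer lexical forms.
XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"


def lexical_to_value(datatype: str, lexical_form: str) -> int:
    """Map a lexical form to its value for the one datatype this sketch knows."""
    if datatype != XSD_INTEGER:
        raise ValueError(f"unsupported datatype: {datatype}")
    return int(lexical_form)


# Several lexical forms (N) map to a single value (1).
forms = ["15", "015", "+15"]
values = {lexical_to_value(XSD_INTEGER, form) for form in forms}
print(values)  # {15}
```

Three distinct members of the lexical space collapse to one member of the value space, which is why a datatype paired with any one of them identifies the same value.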

The 'val:' URI scheme is intended to provide a simple but consistent means of pairing a lexical form with a datatype URI, so that the resulting URI denotes the datatype value represented by that pairing.

Examples:

   val:(http://www.w3.org/2001/XMLSchema%23integer)15
   val:(http://www.w3.org/2001/XMLSchema%23lang)fi
   val:(http://www.w3.org/2001/XMLSchema%23NMTOKENS)foo%20bar%20bas
   val:(http://dodo.xyz.net/x-token)foo
   val:(voc://john.doe@widgets.org/xyzcoord)28%20191%2011
   val:(uuid:f81d4fae-7dec-11d0-a765-00a0c91e6bf6)392f291a093

These examples are provided for illustrative purposes only and do not necessarily constitute actual URIs. See the BNF definition below for an explicit definition of 'val:' URI syntax.

'val:' URIs are particularly useful for applications such as RDF [4], where one may wish to make statements about such value resources. For example, language-specific labels can be assigned to members of an enumeration:

   <rdf:Description rdf:about="val:(http://www.w3.org/2001/XMLSchema%23lang)fi">
      <rdfs:label xml:lang="en">Finnish</rdfs:label>
      <rdfs:label xml:lang="fi">Suomi</rdfs:label>
      <rdfs:label xml:lang="sv">Finska</rdfs:label>
   </rdf:Description>

There is an additional benefit in RDF: if canonical lexical forms are used, all occurrences of a given datatype value represented by a 'val:' URI in an RDF graph share a single graph node. Compared to other literal datatyping schemes, this both compresses the graph considerably and facilitates queries or inference based on equality of datatype values.
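The effect of canonical lexical forms on node sharing can be sketched as follows. The sketch models graph nodes simply as a Python set of URI strings, and it assumes that str(int(...)) yields a suitable canonical form for integers (no leading zeros, no "+" sign). The helper name canonical_integer is hypothetical.

```python
# Illustration only: graph nodes are modelled as a set of URI strings, so
# two statements share a node exactly when their 'val:' URIs are identical.
PREFIX = "val:(http://www.w3.org/2001/XMLSchema%23integer)"


def canonical_integer(lexical_form: str) -> str:
    """Assumed canonical form: no leading zeros and no '+' sign."""
    return str(int(lexical_form))


raw_forms = ["15", "015", "+15"]

# Without canonicalization, each lexical form yields its own node.
nodes_raw = {PREFIX + form for form in raw_forms}

# With canonicalization, all three forms collapse to one shared node.
nodes_canonical = {PREFIX + canonical_integer(form) for form in raw_forms}

print(len(nodes_raw))        # 3
print(len(nodes_canonical))  # 1
```

With a single shared node, testing two values for equality reduces to comparing URI strings.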




2. BNF for the 'val:' URI Scheme

This is a BNF-like description of the 'val:' Uniform Resource Identifier syntax, using the conventions of RFC 822 [2], except that "|" designates alternatives and brackets [] enclose optional or repeated elements. Briefly, literals are quoted with "", optional elements are enclosed in [brackets], and an element may be preceded by <n>* to designate n or more repetitions of that element; n defaults to 0.

This BNF description adopts sub-definitions defined in RFC 2396 "Uniform Resource Identifiers (URI): Generic Syntax" [1] which are not repeated here.


val-URI       = "val:" "(" datatypeURI ")" lexicalForm
datatypeURI   = absoluteURI [ "%23" fragment ]
lexicalForm   = 1*uric

absoluteURI   = <as defined by RFC 2396>
fragment      = <as defined by RFC 2396>
uric          = <as defined by RFC 2396>

Any occurrences of the characters '(', ')', or '%' in either the absoluteURI or fragment segments must be URI-escaped, and the '#' that separates the fragment from the absoluteURI is written as "%23". For example, the datatype URI "foo:abc(xyz)#def%23bar" with lexical form "zzz" would be encoded as "val:(foo:abc%28xyz%29%23def%2523bar)zzz".
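The escaping rule can be sketched in Python as shown here. This is a minimal illustration under stated assumptions, not a normative encoder: the function names are hypothetical, the datatype URI is not validated against RFC 2396, and the lexical form is percent-encoded with urllib.parse.quote so that only characters in the RFC 2396 "reserved" and "unreserved" sets are left as-is.

```python
from urllib.parse import quote

# RFC 2396 reserved characters and marks; letters, digits, and "_.-~" are
# always left unescaped by quote().
_LEXICAL_SAFE = ";/?:@&=+$,-_.!~*'()"


def escape_datatype_part(text: str) -> str:
    """Escape '%', '(' and ')' in an absoluteURI or fragment segment."""
    # '%' is escaped first so the escapes added for '(' and ')' are not
    # themselves re-escaped.
    return text.replace("%", "%25").replace("(", "%28").replace(")", "%29")


def make_val_uri(datatype_uri: str, lexical_form: str) -> str:
    """Build a 'val:' URI from a datatype URI and a lexical form."""
    if not lexical_form:
        raise ValueError("lexicalForm must contain at least one character")
    base, sep, fragment = datatype_uri.partition("#")
    encoded = escape_datatype_part(base)
    if sep:
        # The '#' separating the fragment is written as "%23".
        encoded += "%23" + escape_datatype_part(fragment)
    return "val:(" + encoded + ")" + quote(lexical_form, safe=_LEXICAL_SAFE)


print(make_val_uri("foo:abc(xyz)#def%23bar", "zzz"))
# val:(foo:abc%28xyz%29%23def%2523bar)zzz
print(make_val_uri("http://www.w3.org/2001/XMLSchema#NMTOKENS", "foo bar bas"))
# val:(http://www.w3.org/2001/XMLSchema%23NMTOKENS)foo%20bar%20bas
```

Because a literal '%' is always escaped as "%25", the sequence "%23" can appear in the encoded datatype URI only as the fragment separator.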

Note that the URI scheme prefix "val:" is considered to be a valid URI denoting this URI scheme, though it is not itself a valid URI according to this URI scheme.
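A minimal Python sketch of a parser for the grammar above is given here for illustration. It assumes that the first ")" terminates the datatype URI (parentheses inside it being escaped) and that the first "%23" is the fragment separator. The function names are hypothetical, and no full RFC 2396 validation is attempted.

```python
import re
from urllib.parse import unquote

# Only the three escapes required inside the datatype URI are reversed here.
_ESCAPED = re.compile(r"%(25|28|29)")


def _unescape_datatype_part(text: str) -> str:
    """Reverse the escaping of '%', '(' and ')' in a single pass."""
    return _ESCAPED.sub(lambda m: chr(int(m.group(1), 16)), text)


def parse_val_uri(uri: str) -> tuple[str, str]:
    """Split a 'val:' URI into (datatype URI, lexical form)."""
    if not uri.startswith("val:("):
        raise ValueError("expected 'val:(' at the start of the URI")
    close = uri.find(")", len("val:("))
    if close == -1:
        raise ValueError("missing ')' after the datatype URI")
    encoded_datatype = uri[len("val:("):close]
    lexical = uri[close + 1:]
    if not encoded_datatype or not lexical:
        raise ValueError("datatype URI and lexical form must both be present")
    base, sep, fragment = encoded_datatype.partition("%23")
    datatype = _unescape_datatype_part(base)
    if sep:
        datatype += "#" + _unescape_datatype_part(fragment)
    return datatype, unquote(lexical)


print(parse_val_uri("val:(http://www.w3.org/2001/XMLSchema%23integer)15"))
# ('http://www.w3.org/2001/XMLSchema#integer', '15')

# The bare scheme prefix is rejected, since it is not itself a 'val:' URI
# under this grammar.
try:
    parse_val_uri("val:")
except ValueError as error:
    print("rejected:", error)
```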




3. Security Considerations

This document raises no known security issues.




References

[1] Berners-Lee, T., Fielding, R. and L. Masinter, "Uniform Resource Identifiers (URI): Generic Syntax", RFC 2396, August 1998.
[2] Crocker, D., "Standard for the Format of ARPA Internet Text Messages", RFC 822, August 1982.
[3] International Organization for Standardization, "ISO/IEC 11578:1996 Information technology -- Open Systems Interconnection -- Remote Procedure Call", August 2001.
[4] Lassila, O. and R. Swick, "Resource Description Framework (RDF) Model and Syntax Specification", February 1999.
[5] "XML Schema ...".



Author's Address

  Patrick Stickler
  Nokia Research Center
  Visiokatu 1
  Tampere 33720
  FI
  EMail: patrick.stickler@nokia.com



Full Copyright Statement

Acknowledgement