draft-daviel-kaegi-html-geo-tag-07
INTERNET-DRAFT
March 2005 (Expires October 2005)
Geographic registration of HTML documents
Status of this Memo
This document is an Internet-Draft and is in full conformance with
all provisions of Section 10 of RFC2026.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as Internet-
Drafts.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.
Abstract
This memo describes a method of registering HTML documents with a
specific geographic location by means of embedded META tags. The
content of the META tags gives the geographic position of the
resource described by the HTML document in terms of Latitude,
Longitude, and optionally Elevation in a simple, machine-readable
manner. This information may be used for automated resource discovery
by means of an HTML indexing agent or search engine.
1. Introduction
Many resources described by HTML documents on the World-Wide-Web are
associated with a particular place on the Earth's surface. While
resource discovery on the Web has thus far focussed on document title
and open-text keyword searching, in these cases it may be beneficial
to facilitate geographic searching. Examples of this kind of resource
include pages describing restaurants, shipwrecks, retail stores, etc.
Consumers may use this information in order to select the closest
facility, and in order to navigate towards a resource by road, on
Daviel,Kaegi [Page 1]
foot or by other means.
This draft describes a method of adding static location data to
legacy HTML documents using a construct that is familiar to many HTML
authors. It is intended to be concise, unambiguous, simple to use and
compatible with existing editing tools. The intended use is to
provide location data to Web robots that typically revisit pages
every few weeks.
It is anticipated that in many cases this location data will be added
manually by persons unfamiliar with GIS terminology or metadata
standards. For this reason a minimal data set with few options is
preferred over a more complex and extensible one.
The method described in this draft is not intended to preempt
existing or future metadata encapsulation schemes which may better
serve the needs of a particular community, such as geographic
information systems (GIS).
2. Coordinate Systems
Resource positions on the Earth's surface should be expressed as
signed decimal numbers: degrees of Latitude, positive North, and
degrees of Longitude, positive East.
Where the precision of the coordinates is such that the datum used is
significant, typically more precise than one kilometre distance,
positions should be converted to the WGS 84 datum [3]. Elevations, if
given, should be in metres above datum. Positions given by a GPS set
[4] with datum set to "WGS 84" will in most cases be adequate, with an
accuracy of the order of 15 metres in horizontal position and 25
metres in elevation.
It should be noted that elevations referred to the WGS 84 geoid will
in some areas differ appreciably from those measured with respect to
a local datum, which in coastal regions may be Mean High Water
Springs, Mean Sea Level, Higher High Water or a similar reference
level, and will differ substantially from height above "ground
level". Use of elevation is
not recommended unless its value may be reliably determined.
3. Implementation
HTML markup should be added to the document in the form of a META
statement. This should be placed in the document head in accordance
with the HTML 4 specification [1]. There are three GEO identifiers:
The identifier "geo.position" is used for Latitude, Longitude and
optionally Elevation data.
The identifier "geo.region" is used for the country subdivision code
from ISO 3166-2 [10].
The optional identifier "geo.placename" is used for a free text
representation of the position, for example "city, province" or
"town, county, state".
For resources within the United States and Canada, the "geo.region"
identifier as given by ISO 3166-2 is typically constructed from the
2-character country code [5] as used in Internet domain names, and
the common 2-character State/Province codes [8][9], joined with a
hyphen, for example "CA-BC" for British Columbia, Canada.
Where the official subdivision code is unknown, the 2-character
country code alone may be used in "geo.region", for example "DE" for
Germany.
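As an illustration only, the shape of a "geo.region" value can be
checked with a regular expression that mirrors the "georegion"
production in the Formal Syntax section. The function name
split_region is hypothetical, and the check is purely syntactic: it
does not confirm that a code is actually assigned in ISO 3166.

```python
import re
from typing import Optional, Tuple

# Two upper-case letters, optionally followed by "-" or "_" and a
# subdivision of one to three letters or two digits.
_GEOREGION = re.compile(r"([A-Z]{2})(?:[-_]([A-Z]{1,3}|[0-9]{2}))?")


def split_region(value: str) -> Optional[Tuple[str, Optional[str]]]:
    """Return (country, subdivision), or None if the value is malformed."""
    match = _GEOREGION.fullmatch(value.strip())
    if match is None:
        return None
    return match.group(1), match.group(2)
```

With this sketch, "CA-BC" yields the pair ("CA", "BC") and "DE" yields
("DE", None).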
The "geo.placename" identifier should not be used for indexing
purposes, due to possible ambiguities in naming convention, language,
word ordering and placename duplicates. It may be used for
descriptive purposes.
If the resource described is localized to a country or region, but
not to a single point, the "geo.region" identifier may be used alone
without a corresponding "geo.position" identifier.
It is the intention of this draft to provide a means to associate a
single point with an HTML document. Some consideration should be
given to the choice of location when describing a resource, given
that positioning mechanisms may provide an accuracy of the order of
ten metres in horizontal position. For instance, when describing a
retail store or small business, it may be more meaningful to give the
position of the street entrance rather than the centroid of the
property.
Although the HTML specification [1] states that the name field is in
general case-sensitive, these GEO tags should be recognized by
compliant agents regardless of case. Coordinates should be ordered
(Latitude ; Longitude) as in RFC 2426 [7] and RFC 2445 [6] (the vCard
and iCalendar specifications). If elevation is given, coordinates
should be
ordered (Latitude ; Longitude ; Elevation). (This is at variance
with common GIS practice, but better matches the intended audience of
this Draft.)
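A minimal sketch of how an indexing agent might collect the GEO tags
regardless of case is given below. It uses the HTMLParser class from
the Python standard library; the names GeoMetaParser, extract_geo and
split_position are hypothetical and are not defined by this draft.

```python
from html.parser import HTMLParser
from typing import Dict, List

GEO_NAMES = {"geo.position", "geo.region", "geo.placename"}


class GeoMetaParser(HTMLParser):
    """Collects the CONTENT of GEO META elements, ignoring case of NAME."""

    def __init__(self) -> None:
        super().__init__()
        self.geo: Dict[str, List[str]] = {}

    def handle_starttag(self, tag, attrs):
        # HTMLParser already lower-cases tag and attribute names;
        # the value of NAME has to be folded to lower case here.
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = (attributes.get("name") or "").lower()
        content = attributes.get("content")
        if name in GEO_NAMES and content is not None:
            self.geo.setdefault(name, []).append(content)


def extract_geo(html_text: str) -> Dict[str, List[str]]:
    parser = GeoMetaParser()
    parser.feed(html_text)
    parser.close()
    return parser.geo


def split_position(content: str) -> List[str]:
    """Split into [latitude, longitude] or [latitude, longitude, elevation]."""
    return [field.strip() for field in content.split(";")]
```

Values are kept in lists because a document may carry several
"geo.placename" tags in different languages.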
The Metadata Profile "http://geotags.com/geo" may be used as defined
in [1] to define the geo tag properties.
4. Examples
A "geo.position" value of "48.54;-123.84;115" describes a resource
115 metres above datum at position 48.54 degrees North, 123.84
degrees West, while a value of "-10;60" describes a resource at
position 10 degrees South, 60 degrees East.
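A "geo.region" value of "CA-ON" with a "geo.placename" value of
"London, Ontario" describes a resource in London, Ontario, Canada,
while a "geo.region" value of "GB" with a "geo.placename" value of
"London, England" describes a resource in London, England (Great
Britain).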
The HTML attributes "lang" and "dir" may be used to define the language
and directionality for the "geo.placename" identifier as defined in
[1], for instance lang="fr" for a placename given in French.
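The following sketch shows one way a publishing tool might render
such META elements, including an optional "lang" attribute for
"geo.placename". The helper name geo_meta is hypothetical; the
attribute values are escaped with html.escape from the Python
standard library so that quotation marks in a placename cannot break
the markup.

```python
from html import escape
from typing import Optional


def geo_meta(name: str, content: str, lang: Optional[str] = None) -> str:
    """Render one META element; quote=True also escapes quotation marks."""
    lang_attr = f' lang="{escape(lang, quote=True)}"' if lang else ""
    return (
        f'<meta name="{escape(name, quote=True)}"{lang_attr} '
        f'content="{escape(content, quote=True)}">'
    )


# Separate illustrations, not a description of a single resource.
print(geo_meta("geo.position", "-10;60"))
print(geo_meta("geo.region", "CA-ON"))
print(geo_meta("geo.placename", "London, Ontario", lang="en"))
```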
5. Semantics
Values for latitude and longitude shall be expressed as decimal
fractions of degrees. Whole degrees of latitude shall be represented
by a decimal number ranging from 0 through 90. Whole degrees of
longitude shall be represented by a decimal number ranging from 0
through 180. When a decimal fraction of a degree is specified, it
shall be separated from the whole number of degrees by a decimal
point (the period character, "."). Decimal fractions of a degree
should be expressed to the precision available, with trailing zeroes
being used as placeholders if required. A decimal point is optional
where the position is given only to whole degrees. Some effort should
be made to preserve the apparent precision when converting from
another datum or representation, for example 41 degrees 13 minutes
should be represented as 41.22 and not 41.21666, while 41 degrees 13
minutes 11 seconds may be represented as 41.2197.
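The conversion can be sketched as follows. The choice of two decimal
places for a value given to minutes and four for a value given to
seconds is an assumption of this sketch, chosen because one minute is
about 0.017 degree and one second about 0.0003 degree; it reproduces
the two figures quoted above. The function name and its "negative"
parameter are hypothetical.

```python
from typing import Optional


def dms_to_decimal(degrees: int, minutes: int,
                   seconds: Optional[int] = None,
                   negative: bool = False) -> str:
    """Convert degrees, minutes and optional seconds to a decimal string."""
    value = degrees + minutes / 60.0
    places = 2  # resolution of one minute
    if seconds is not None:
        value += seconds / 3600.0
        places = 4  # resolution of one second
    text = f"{value:.{places}f}"
    # South and West are written with a minus sign; zero carries no sign.
    return "-" + text if negative and float(text) != 0.0 else text
```

For example, dms_to_decimal(41, 13) gives "41.22" and
dms_to_decimal(41, 13, 11) gives "41.2197".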
Latitudes north of the Equator MAY be specified by a plus sign (+),
or by the absence of a minus sign (-), preceding the digits
designating degrees. Latitudes south of the Equator MUST be
designated by a
minus sign (-) preceding the digits designating degrees. Latitudes
on the Equator MUST be designated by a latitude value of 0.
Longitudes east of the prime meridian shall be specified by a plus
sign (+), or by the absence of a minus sign (-), preceding the digits
designating degrees. Longitudes west of the prime meridian MUST be
designated by a minus sign (-) preceding the digits designating
degrees. Longitudes on the prime meridian MUST be designated by a
longitude value of 0. A point on the 180th meridian shall be taken
as 180 degrees West, and shall include a minus sign.
Any spatial address with a latitude of +90 (90) or -90 degrees will
specify a position at the True North or True South Poles,
respectively. The component for longitude may have any legal value.
The vertical coordinate (Elevation) must be expressed in metres
above the WGS 84 datum. Points having zero elevation must not have a
negative sign.
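The sign rules of this section can be sketched as a small formatting
routine. It is an illustration under stated assumptions, not a
normative algorithm: the name format_position is hypothetical, inputs
are taken as strings and handled with the Python Decimal type so that
the digits supplied by the author are preserved, and positive values
are written without a plus sign.

```python
from decimal import Decimal
from typing import Optional


def _clean(value: Decimal) -> str:
    # A zero value is written without a sign; digits are kept as given.
    if value == 0:
        value = value.copy_abs()
    return format(value, "f")


def format_position(lat: str, lon: str, elev: Optional[str] = None) -> str:
    latitude, longitude = Decimal(lat), Decimal(lon)
    if abs(latitude) > 90 or abs(longitude) > 180:
        raise ValueError("latitude or longitude out of range")
    if longitude == 180:
        longitude = -longitude  # the 180th meridian is written as 180 West
    fields = [_clean(latitude), _clean(longitude)]
    if elev is not None:
        fields.append(_clean(Decimal(elev)))
    return ";".join(fields)
```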
5.1 Interpretation
Whitespace within a position value shall be ignored.
An interpreting agent shall internally mark position values either
valid or invalid. If a position is marked invalid, it shall not be
used to index or qualify the containing document.
A position having a Latitude greater than 90 degrees, or less than
-90 degrees, shall be marked invalid.
A position having a Longitude greater than 180 degrees, or less than
-180 degrees, shall be marked invalid.
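The whitespace and range rules above might be applied by an
interpreting agent along the following lines. This is a sketch only:
the name parse_position is hypothetical, None stands for "marked
invalid", and the number pattern is an assumption that requires at
least one digit in each field.

```python
import re
from typing import Optional, Tuple

_NUMBER = re.compile(r"[+-]?(?:[0-9]+(?:\.[0-9]*)?|\.[0-9]+)")

Position = Tuple[float, float, Optional[float]]


def parse_position(content: str) -> Optional[Position]:
    """Return (latitude, longitude, elevation) or None if invalid."""
    # Whitespace within a position value is ignored.
    fields = re.sub(r"\s+", "", content).split(";")
    if len(fields) not in (2, 3):
        return None
    if not all(_NUMBER.fullmatch(field) for field in fields):
        return None
    latitude, longitude = float(fields[0]), float(fields[1])
    elevation = float(fields[2]) if len(fields) == 3 else None
    if not -90.0 <= latitude <= 90.0:
        return None
    if not -180.0 <= longitude <= 180.0:
        return None
    return latitude, longitude, elevation
```

A position for which None is returned would not be used to index or
qualify the containing document.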
Where a value is given for geo.region, and the latitude and longitude
values given for geo.position fall outside the recognized boundaries
of this region, the position may be marked invalid. For example, if a
region of "US" is given for a location in the US mainland, the
position may be marked invalid if the Latitude is negative or the
Longitude is positive.
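A very coarse form of this cross-check is sketched below, using only
the hemisphere test from the example above. The table
EXPECTED_HEMISPHERE and the function name are hypothetical, and the
single entry covers the US mainland only; some US territories lie in
other hemispheres, which is one reason the check is optional.

```python
from typing import Dict, Optional, Tuple

# Hypothetical table: expected (latitude sign, longitude sign) for the
# mainland of a country. +1 means North or East, -1 means South or West.
EXPECTED_HEMISPHERE: Dict[str, Tuple[int, int]] = {
    "US": (+1, -1),  # the US-mainland example from the text
}


def plausible_for_region(region: str, lat: float, lon: float) -> Optional[bool]:
    """True or False where the table has an entry, None where it has none."""
    country = region[:2].upper()
    expected = EXPECTED_HEMISPHERE.get(country)
    if expected is None:
        return None
    lat_sign, lon_sign = expected
    return lat * lat_sign >= 0 and lon * lon_sign >= 0
```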
No formal reliance shall be placed on the precision implicit in
position data. It is likely that few content providers are qualified
to determine reliable precision or accuracy data, and that many will
use position data from other sources which do not state the datum.
6. Formal Syntax
DIGIT = %x30-39 ; 0-9
PLUS = %x2B ; +
MINUS = %x2D ; -
DECIMAL = %x2E ; .
SEMI = %x3B ; ;
CRLF = %x0D.0A ; return, linefeed
SP = %x20 ; space
HTAB = %x09 ; tab
WSP = SP / HTAB ;
LWSP = *(WSP / CRLF WSP) ; linear whitespace
UCASE = %x41-5A ; A-Z
HYPHEN = %x2D ; -
USCORE = %x5F ; _
country = 2UCASE ; 2-letter code from ISO3166
region = 1*3UCASE / 2DIGIT ; region code from ISO3166-2
TEXT =
placename = 1*TEXT
delimiter = SEMI
latitude = [ MINUS / PLUS ] 0*2DIGIT [ DECIMAL *DIGIT]
longitude = [ MINUS / PLUS ] 0*3DIGIT [ DECIMAL *DIGIT]
elevation = [ MINUS / PLUS ] 0*DIGIT [ DECIMAL *DIGIT]
position = latitude delimiter longitude [ delimiter elevation ]
georegion = country [ ( HYPHEN / USCORE ) region ]
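HTML syntax: each identifier is carried in a META element whose NAME
attribute is "geo.position", "geo.region" or "geo.placename" and whose
CONTENT attribute is the corresponding position, georegion or
placename value.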
7. Applicability
As stated in the introduction, certain HTML documents may be
associated with a geographic position, while other documents are not.
For proper use of the GEO tags as described in this draft, the
resource described in an HTML document should be associated with a
particular geographic location for the lifetime of the document. The
tags may thus be properly used to describe an object fixed on the
surface of the earth (or more properly, fixed in position relative to
the surface of the earth) such as a retail store, a mountain peak or
a railway station. They may not be used to describe a non-localized,
moving, or intangible object such as a multinational company, river,
aircraft or mathematical theory.
The geographic position given is associated with the resource
described by the HTML document, not with the physical location of the
document [2], or the location of the company responsible for
publishing or hosting the document. Thus, in some cases the country
code used in "geo.region" may differ from the country code forming
part of the host address in the document URL.
Since the position given is associated with the content of the
document, not the author, publishing and document conversion tools
should not cache position data or store it in a template.
In cases where the object being described is an area, such as a lake
or a building, the position of the object should not in general be
given to greater precision than the width of the object. If desired,
features within the object may be described in another page and their
position given with greater precision. In the case of an object such
as a place of business, where only one page exists, the position of
the entrance may be given rather than the position of the centroid.
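As a rough worked example of the precision guideline, the number of
decimal places appropriate to an object of a given width may be
estimated as below. The figure of 111,000 metres per degree is an
approximation assumed for this sketch; it holds for latitude and for
longitude near the Equator, and a degree of longitude is shorter at
higher latitudes. The function name is hypothetical.

```python
import math

METRES_PER_DEGREE = 111_000.0  # rough figure, an assumption of this sketch


def max_decimal_places(width_m: float) -> int:
    """Largest number of decimal places whose last digit spans at least width_m."""
    if width_m <= 0:
        raise ValueError("width must be positive")
    places = math.floor(math.log10(METRES_PER_DEGREE / width_m))
    return max(0, places)
```

Under these assumptions a lake one kilometre across would be given to
two decimal places, and a building ten metres wide to four.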
8. Security Considerations
This draft raises no security issues.
The intended use of GEO metadata as described in this draft raises no
privacy issues beyond those associated with normal use of the Web.
Concern for privacy requires that personal information, such as a
private address or location, not be published without the consent of
the subject, and that due care be taken in the design of access
control mechanisms when such personal information is present on an
Internet-connected data storage system. It is axiomatic that
information including location data published on a public Web page is
public, and that location-based queries may suggest the present or
future location of the person making them in the same manner that
text queries may suggest personal interests or plans.
It is suggested that publishing tools clearly indicate when
potentially sensitive metadata that is normally not visible, such as
position, author's name or address, is published to a public area.
Use of GEO metadata in an incorrect manner or in a
manner other than that described may raise privacy issues. For
instance, a publishing system that incorrectly places the author's
location on every page, and a mobile device which transmits its
current location, both raise potential privacy issues. An example
of such a mobile device is an embedded diagnostic system in an
automobile. Automatic inclusion of position data may lead to the
user's location being determined remotely. In such a case, the device
should be equipped with appropriate encryption and access controls to
ensure the privacy of the user. Specification of such access
controls is outside the scope of this draft.
9. Internationalization considerations
The "geo.placename" tag content is free text, and should obey the
internationalization rules of HTML 4. The "lang" and "dir" attributes
may be used to specify the language and directionality of the content.
Multiple instances of "geo.placename" may be used with different
"lang" attributes.
Geo.placename content is coded using the character set of the
containing document.
Geo.position and geo.region tag content should use US-ASCII or UTF-8.
10. References
[1] Raggett, Le Hors, Jacobs, "HTML 4.01 Specification",
http://www.w3.org/TR/html4/ , W3C, December 1999
[2] Davis et al., "A Means for Expressing Location Information in
the Domain Name System", RFC 1876, January 1996
http://www.ietf.org/rfc/rfc1876.txt
[3] United States Department of Defense; DoD WGS-1984 - Its
Definition and Relationships with Local Geodetic Systems;
Washington, D.C.; 1985; Report AD-A188 815 DMA; 6127; 7-R-
138-R; CV, KV;
[4] ARINC Research Corporation, "Navstar GPS Space Segment /
Navigation User Interfaces", IRN-200C-002, September 1997
[5] International Organization For Standardization / Organisation
Internationale De Normalisation (ISO), "Standard ISO
3166-1:1997: Codes for the Representation of Names of
Countries and their subdivisions -- Part 1: Country codes",
1997.
[6] Dawson & Stenerson, Internet Calendaring and Scheduling Core
Object Specification (iCalendar), RFC 2445, November 1998
http://www.ietf.org/rfc/rfc2445.txt
[7] Dawson & Howes, vCard MIME Directory Profile, RFC 2426,
September 1998
http://www.ietf.org/rfc/rfc2426.txt
[8] United States Postal Service, Official Abbreviations -
States and Possessions,
http://www.usps.gov/ncsc/lookups/abbr_state.txt
[9] Canada Postal Guide, Province and Territory Symbols
http://www.canadapost.ca/tools/pg/manual/b03-e.asp
[10] International Organization For Standardization / Organisation
Internationale De Normalisation (ISO), "Standard ISO
3166-2:1998: Codes for the Representation of Names of Countries
and their subdivisions -- Part 2: Country subdivision code",
1998.
11. Acknowledgments
Rohan Mahy and Patrik Fältström of Cisco Systems, for semantics.
12. Authors' Addresses
Andrew Daviel, BSc.
Vancouver Webpages, Box 357
185-9040 Blundell Rd
Richmond BC
V6Y 1K3
Canada
Tel. (604)-377-4796
Fax. (604)-270-8285
andrew@vancouver-webpages.com
Felix A. Kaegi
Dipl.Informatik Ing. ETH (MSc.)
Friedensgasse 51
CH-4056 Basel
SWITZERLAND
+41 61 383 10 01
+41 79 625 27 41
skype felix_kaegi
felix.kaegi@gmail.com
13. Full Copyright Statement
Copyright (C) The Internet Society (2005). All Rights Reserved.
This document and translations of it may be copied and furnished to
others, and derivative works that comment on or otherwise explain it
or assist in its implementation may be prepared, copied, published
and distributed, in whole or in part, without restriction of any
kind, provided that the above copyright notice and this paragraph are
included on all such copies and derivative works. However, this
document itself may not be modified in any way, such as by removing
the copyright notice or references to the Internet Society or other
Internet organizations, except as needed for the purpose of
developing Internet standards in which case the procedures for
copyrights defined in the Internet Standards process must be
followed, or as required to translate it into languages other than
English.
The limited permissions granted above are perpetual and will not be
revoked by the Internet Society or its successors or assigns.
This document and the information contained herein is provided on an
"AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.