Re: pfps-04 (why the thread is germane to pfps-04)

From: Graham Klyne <GK-lists@ninebynine.org>
Date: Tue, 29 Jul 2003 16:33:25 +0100
Message-Id: <5.1.0.14.2.20030729162932.00b9c680@127.0.0.1>
To: fmanola@mitre.org
Cc: pat hayes <phayes@ihmc.us>, Martin Duerst <duerst@w3.org>, "Peter F. Patel-Schneider" <pfps@research.bell-labs.com>, www-rdf-comments@w3.org, w3c-i18n-ig@w3.org, msm@w3.org

Frank,

You make a fair point, except that CHARMOD does not seem to offer a clear 
definition of "octet".

The best I could find is:

[[
3.1.6 Units of storage

Computer storage and communication rely on units of physical storage and 
information interchange, such as bits and bytes (also known as octets, as 
nowadays the word bytes is generally considered to mean 8-bit bytes). A 
frequent error in specifications and implementations is the equating of 
characters with units of physical storage. The mapping between characters 
and such units of storage is actually quite complex, and is discussed in 
the next section, 3.2 Digital Encoding of Characters.

[S] [I] Specifications and software MUST NOT assume a one-to-one 
relationship between characters and units of physical storage.
]]
-- http://www.w3.org/TR/charmod/#sec-Storage
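
To make that MUST NOT concrete, here is a rough Python sketch that counts the units of storage needed for a few single characters. The sample characters and the helper name storage_units are my own choices for illustration, not anything taken from CHARMOD.

```python
# Illustration only: the sample characters below are arbitrary choices.
SAMPLES = ["\u0041", "\u00e9", "\u20ac", "\U0001D11E"]


def storage_units(text):
    """Return (characters, UTF-8 octets, UTF-16 16-bit code units) for text."""
    return (
        len(text),                           # Python 3 counts code points
        len(text.encode("utf-8")),           # octets in the UTF-8 encoding
        len(text.encode("utf-16-be")) // 2,  # 16-bit units in UTF-16
    )


for sample in SAMPLES:
    chars, utf8_octets, utf16_units = storage_units(sample)
    print(f"U+{ord(sample):04X}: {chars} character, "
          f"{utf8_octets} UTF-8 octet(s), {utf16_units} UTF-16 unit(s)")
```

Each sample is one character, yet the number of octets (and even of 16-bit units) varies with the character and the encoding, which is the mismatch the quoted text warns about.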

Anyway, I didn't mean to get into a protracted debate here; I just wanted to 
try to reduce the number of imponderables.

#g
--

At 11:21 29/07/03 -0400, Frank Manola wrote:
>It seems to me that, given the context of this discussion, we might try
>to stick to the terminology (and distinctions made between the terms) in
>CHARMOD if we can.  At least that's a single document...
>
>--Frank
>
>Graham Klyne wrote:
> >
> > At 00:46 29/07/03 -0500, pat hayes wrote:
> >
> > >>Are 'binary octets' different from 'octets'?
> > >
> > >I have absolutely no idea. :-)
> >
> > Noticing that we're bandying about the term 'octet', apparently without
> > understanding what it means, I thought I'd dig over some definitions...
> >
> > I see an octet as a sequence of 8 bits, where a bit is one of {0,1}.  Octet
> > instances are often described by a number in the range 0..255, using the
> > usual relationship between binary numbers and bits, subject to agreeing
> > whether the most significant or the least significant bit comes first.  In
> > either case, the relationship is 1:1.
> >
> > The UTF-8 spec avoids the bit ordering issue by simply talking about "high
> > order" to "low order" bits, which establishes a single direct relationship
> > between the individual bits and the numbers 0..255.
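
A minimal Python sketch of the bits-to-number relationship described above, under either bit-ordering convention. The function names and the msb_first parameter are hypothetical, chosen only for this illustration.

```python
def octet_from_bits(bits, msb_first=True):
    """Map a sequence of 8 bits (each 0 or 1) to a number in 0..255."""
    if len(bits) != 8 or any(b not in (0, 1) for b in bits):
        raise ValueError("an octet is exactly 8 bits, each 0 or 1")
    # Normalise to most-significant-first before accumulating the value.
    ordered = list(bits) if msb_first else list(reversed(bits))
    value = 0
    for bit in ordered:
        value = (value << 1) | bit
    return value


def bits_from_octet(value, msb_first=True):
    """Map a number in 0..255 back to its list of 8 bits."""
    if not 0 <= value <= 255:
        raise ValueError("octet values lie in the range 0..255")
    bits = [(value >> shift) & 1 for shift in range(7, -1, -1)]
    return bits if msb_first else bits[::-1]


pattern = [0, 1, 0, 0, 0, 0, 0, 1]
print(octet_from_bits(pattern, msb_first=True))   # read high-order bit first
print(octet_from_bits(pattern, msb_first=False))  # read low-order bit first
```

The same bit pattern yields a different number under each convention, but once a convention is fixed, every one of the 256 bit patterns maps to exactly one number and back again.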
> >
> > [[
> > In UTF-8, characters are encoded using sequences of 1 to 6 octets. The only
> > octet of a "sequence" of one has the higher-order bit set to 0, the
> > remaining 7 bits being used to encode the character value. In a sequence of
> > n octets, n>1, the initial octet has the n higher-order bits set to 1,
> > followed by a bit set to 0. The remaining bit(s) of that octet contain bits
> > from the value of the character to be encoded. The following octet(s) all
> > have the higher-order bit set to 1 and the following bit set to 0, leaving
> > 6 bits in each to contain bits from the character to be encoded.
> > ]]
> > -- http://www.rfc-editor.org/rfc/rfc2279.txt
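
The bit layout in that passage can be sketched in Python as follows. This is my own reading of the quoted rule, not code from the RFC, and the function name is hypothetical. Python's built-in codec only covers values up to U+10FFFF, so the 5- and 6-octet forms cannot be cross-checked against it.

```python
def utf8_encode_rfc2279(char_value):
    """Encode one character value as a list of octets, per the quoted rule."""
    if not 0 <= char_value <= 0x7FFFFFFF:
        raise ValueError("value does not fit in a sequence of 1 to 6 octets")
    if char_value < 0x80:
        # Sequence of one: high-order bit 0, remaining 7 bits hold the value.
        return [char_value]
    # Pick the smallest n in 2..6 whose payload can hold the value. The
    # initial octet spends n bits on ones plus one bit on the zero, leaving
    # 7 - n bits; each following octet carries 6 bits.
    for n in range(2, 7):
        payload_bits = (7 - n) + 6 * (n - 1)
        if char_value < (1 << payload_bits):
            break
    octets = []
    remaining = char_value
    for _ in range(n - 1):
        octets.append(0x80 | (remaining & 0x3F))  # following octet: 10xxxxxx
        remaining >>= 6
    lead_marker = (0xFF << (8 - n)) & 0xFF        # n ones, then zeros
    octets.append(lead_marker | remaining)        # initial octet
    octets.reverse()                              # built low-order first
    return octets


for value in (0x41, 0xE9, 0x20AC, 0x1D11E):
    encoded = utf8_encode_rfc2279(value)
    print(f"U+{value:04X} ->", " ".join(f"{octet:02X}" for octet in encoded))
```

Printing the octets as hexadecimal numerals follows the presentation the UTF-8 spec itself uses.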
> >
> > The UTF-8 spec generally presents octet values as hexadecimal numerals.
> >
> > Dan Connolly offers a slightly different form of definition:
> > [[
> > octet
> >      an element of the set {0, 1, 2, ..., 255}
> > ]]
> > -- http://www.w3.org/MarkUp/html-spec/charset-harmful.html
> >
> > Some others:
> >
> > [[
> > octet: A byte of eight binary digits usually operated upon as an entity.
> > ]]
> > -- http://glossary.its.bldrdoc.gov/fs-1037/dir-025/_3631.htm
> > -- http://www.atis.org/tg2k/_octet.html
> >
> > [[
> > Definition for: octet
> >
> > Eight bits. Octet is sometimes used instead of the term byte to avoid
> > confusion, because not all computer systems use bytes that are eight
> > bits long.
> > ]]
> > -- http://www.computeruser.com/resources/dictionary/definition.html?lookup=3442
> >
> > A Google search for "octet definition" turns up plenty more.
> >
> > Looking for definitions of "binary octet" doesn't show up anything
> > especially useful, but the pattern of its use suggests one of two things:
> > (a) octet values represented as 8 bits (as opposed to, say, a number)
> > (b) octets used to encode binary data (as opposed to textual data).
> >
> > Anyway, returning to the original question (Are 'binary octets' different
> > from 'octets'?), I think the answer is:  not for any meaningful purpose as
> > far as RDF is concerned.
> >
> > #g
> >
> > -------------------
> > Graham Klyne
> > <GK@NineByNine.org>
> > PGP: 0FAA 69FF C083 000B A2E9  A131 01B9 1C7A DBCA CB5E
>
>--
>Frank Manola                   The MITRE Corporation
>202 Burlington Road, MS A345   Bedford, MA 01730-1420
>mailto:fmanola@mitre.org       voice: 781-271-8147   FAX: 781-271-875

-------------------
Graham Klyne
<GK@NineByNine.org>
PGP: 0FAA 69FF C083 000B A2E9  A131 01B9 1C7A DBCA CB5E
Received on Tuesday, 29 July 2003 11:55:53 GMT
