
Re: pfps-04 (why the thread is germane to pfps-04)

From: Frank Manola <fmanola@mitre.org>
Date: Tue, 29 Jul 2003 11:21:35 -0400
Message-ID: <3F2690FF.E52C3FC4@mitre.org>
To: Graham Klyne <GK-lists@ninebynine.org>
CC: pat hayes <phayes@ihmc.us>, Martin Duerst <duerst@w3.org>, "Peter F. Patel-Schneider" <pfps@research.bell-labs.com>, www-rdf-comments@w3.org, w3c-i18n-ig@w3.org, msm@w3.org

It seems to me that, given the context of this discussion, we might try
to stick to the terminology (and distinctions made between the terms) in
CHARMOD if we can.  At least that's a single document...

--Frank

Graham Klyne wrote:
> 
> At 00:46 29/07/03 -0500, pat hayes wrote:
> 
> >>Are 'binary octets' different from 'octets'?
> >
> >I have absolutely no idea. :-)
> 
> Noticing that we're bandying about this term 'octets', apparently without
> understanding what they are, I thought I'd dig over some definitions...
> 
> I see an octet as a sequence of 8 bits, where a bit is one of {0,1}.  Octet
> instances are often described by a number in the range 0..255, with the
> common relationship between binary numbers and bits, subject to agreeing
> most significant first or least significant first.  In either case, the
> relationship is 1:1.
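
To make the 1:1 claim above concrete, here is a minimal sketch in Python. The function names `octet_to_bits` and `bits_to_octet` and the `msb_first` flag are illustrative assumptions, not taken from any spec. It shows that once a bit order is agreed, every number in 0..255 maps to exactly one sequence of 8 bits and back.

```python
def octet_to_bits(value, msb_first=True):
    """Return the 8 bits of an octet value (0..255) as a list of 0/1 ints."""
    if not 0 <= value <= 255:
        raise ValueError("an octet value must be in the range 0..255")
    # Extract bits least significant first, then reverse if MSB-first is wanted.
    bits = [(value >> i) & 1 for i in range(8)]
    return bits[::-1] if msb_first else bits


def bits_to_octet(bits, msb_first=True):
    """Inverse of octet_to_bits: rebuild the number from a sequence of 8 bits."""
    if len(bits) != 8 or any(b not in (0, 1) for b in bits):
        raise ValueError("expected exactly 8 bits, each 0 or 1")
    ordered = list(bits) if msb_first else list(bits)[::-1]
    value = 0
    for b in ordered:
        value = (value << 1) | b
    return value


# Round-tripping every octet value under both conventions: the mapping is 1:1.
for order in (True, False):
    assert all(bits_to_octet(octet_to_bits(v, order), order) == v
               for v in range(256))
```

Either convention works; the only requirement is that both ends agree on which one is in use.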
> 
> The UTF-8 spec avoids the bit ordering issue by simply talking about "high
> order" to "low order" bits, which establishes a single direct relationship
> between the individual bits and the numbers 0..255.
> 
> [[
> In UTF-8, characters are encoded using sequences of 1 to 6 octets. The only
> octet of a "sequence" of one has the higher-order bit set to 0, the
> remaining 7 bits being used to encode the character value. In a sequence of
> n octets, n>1, the initial octet has the n higher-order bits set to 1,
> followed by a bit set to 0. The remaining bit(s) of that octet contain bits
> from the value of the character to be encoded. The following octet(s) all
> have the higher-order bit set to 1 and the following bit set to 0, leaving
> 6 bits in each to contain bits from the character to be encoded.
> ]]
> -- http://www.rfc-editor.org/rfc/rfc2279.txt
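
The rules quoted above can be sketched as a small encoder. This is only an illustration of the RFC 2279 text, with a made-up function name (`utf8_encode_rfc2279`), and it does not reject surrogate code points. Note that the later RFC 3629 restricts UTF-8 to sequences of at most 4 octets, so the 5- and 6-octet forms below are historical.

```python
def utf8_encode_rfc2279(code_point):
    """Encode one character value as 1 to 6 octets, per the quoted rules."""
    if not 0 <= code_point <= 0x7FFFFFFF:
        raise ValueError("character value out of the 31-bit range")
    # Sequence of one: higher-order bit 0, remaining 7 bits hold the value.
    if code_point < 0x80:
        return bytes([code_point])
    # A sequence of n octets carries (7 - n) + 6 * (n - 1) = 5n + 1 value bits.
    n = 2
    while code_point >= 1 << (5 * n + 1):
        n += 1
    octets = []
    value = code_point
    # Following octets: bits "10" followed by 6 bits of the value.
    for _ in range(n - 1):
        octets.append(0x80 | (value & 0x3F))
        value >>= 6
    # Initial octet: n higher-order bits set to 1, then a 0, then value bits.
    lead_marker = (0xFF << (8 - n)) & 0xFF
    octets.append(lead_marker | value)
    return bytes(reversed(octets))


# The result is a sequence of octets; printing in hex matches the spec's style.
print(utf8_encode_rfc2279(0x20AC).hex())
```

For character values that modern UTF-8 still allows, the output agrees with Python's built-in `str.encode("utf-8")`.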
> 
> The UTF-8 spec generally presents octet values as hexadecimal numerals.
> 
> Dan Connolly offers a slightly different form of definition:
> [[
> octet
>      an element of the set {0, 1, 2, ..., 255}
> ]]
> http://www.w3.org/MarkUp/html-spec/charset-harmful.html
> 
> Some others:
> 
> [[
> octet: A byte of eight binary digits usually operated upon as an entity.
> ]]
> -- http://glossary.its.bldrdoc.gov/fs-1037/dir-025/_3631.htm
> -- http://www.atis.org/tg2k/_octet.html
> 
> [[
> Definition for: octet
> 
> Eight bits. Octet is sometimes used instead of the term byte to avoid
> confusion, because not all computer systems use bytes that are eight bits long.
> ]]
> -- http://www.computeruser.com/resources/dictionary/definition.html?lookup=3442
> 
> Google for "octet definition" shows up plenty more.
> 
> Looking for definitions of "binary octet" doesn't show up anything
> especially useful, but the pattern of its use suggests one of two things:
> (a) octet values represented as 8 bits (as opposed to, say, a number)
> (b) octets used to encode binary data (as opposed to textual data).
> 
> Anyway, returning to the original question (Are 'binary octets' different
> from 'octets'?), I think the answer is:  not for any meaningful purpose as
> far as RDF is concerned.
> 
> #g
> 
> -------------------
> Graham Klyne
> <GK@NineByNine.org>
> PGP: 0FAA 69FF C083 000B A2E9  A131 01B9 1C7A DBCA CB5E

-- 
Frank Manola                   The MITRE Corporation
202 Burlington Road, MS A345   Bedford, MA 01730-1420
mailto:fmanola@mitre.org       voice: 781-271-8147   FAX: 781-271-875
Received on Tuesday, 29 July 2003 11:29:21 GMT
