- From: Graham Klyne <GK-lists@ninebynine.org>
- Date: Tue, 29 Jul 2003 09:50:58 +0100
- To: pat hayes <phayes@ihmc.us>, Martin Duerst <duerst@w3.org>
- Cc: "Peter F. Patel-Schneider" <pfps@research.bell-labs.com>, www-rdf-comments@w3.org, w3c-i18n-ig@w3.org, msm@w3.org
At 00:46 29/07/03 -0500, pat hayes wrote:
>>Are 'binary octets' different from 'octets'?
>
>I have absolutely no idea. :-)
Noticing that we're bandying about this term 'octets', apparently without
understanding what they are, I thought I'd dig up some definitions...
I see an octet as a sequence of 8 bits, where a bit is one of {0,1}. Octet
instances are often described by a number in the range 0..255, using the
usual relationship between binary numerals and bits, provided we agree
whether the most significant or the least significant bit comes first. In
either case, the relationship is 1:1.
The UTF-8 spec avoids the bit ordering issue by simply talking about "high
order" to "low order" bits, which establishes a single direct relationship
between the individual bits and the numbers 0..255.
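To make the 1:1 relationship between octets-as-bits and octets-as-numbers concrete, here is a minimal sketch in Python. The function names and the msb_first flag are my own, chosen purely for illustration; they do not come from any spec.

```python
def octet_to_bits(value, msb_first=True):
    """Return the 8 bits of an octet given as a number in 0..255."""
    if not 0 <= value <= 255:
        raise ValueError("an octet is a number in the range 0..255")
    # Collect bits starting from the least significant one.
    bits = [(value >> i) & 1 for i in range(8)]
    # Reverse if the caller wants the most significant bit first.
    return bits[::-1] if msb_first else bits


def bits_to_octet(bits, msb_first=True):
    """Inverse of octet_to_bits, for the same bit-ordering convention."""
    if len(bits) != 8 or any(b not in (0, 1) for b in bits):
        raise ValueError("expected exactly 8 bits, each 0 or 1")
    ordered = bits if msb_first else bits[::-1]
    value = 0
    for b in ordered:
        value = (value << 1) | b
    return value


print(octet_to_bits(0x41))                   # most significant bit first
print(octet_to_bits(0x41, msb_first=False))  # least significant bit first
```

Whichever ordering is agreed, each of the 256 numbers maps to a distinct 8-bit sequence and back again, which is all the 1:1 claim amounts to.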
[[
In UTF-8, characters are encoded using sequences of 1 to 6 octets. The only
octet of a "sequence" of one has the higher-order bit set to 0, the
remaining 7 bits being used to encode the character value. In a sequence of
n octets, n>1, the initial octet has the n higher-order bits set to 1,
followed by a bit set to 0. The remaining bit(s) of that octet contain bits
from the value of the character to be encoded. The following octet(s) all
have the higher-order bit set to 1 and the following bit set to 0, leaving
6 bits in each to contain bits from the character to be encoded.
]]
-- http://www.rfc-editor.org/rfc/rfc2279.txt
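The encoding rule quoted above can be sketched directly in Python. This is only an illustration of the quoted paragraph, not a conforming UTF-8 encoder: the function name is hypothetical, and it follows the 1-to-6 octet scheme of RFC 2279 as quoted (the later RFC 3629 restricts UTF-8 to at most 4 octets).

```python
def utf8_encode_rfc2279(code_point):
    """Encode one character value as 1 to 6 octets, per the quoted rule."""
    if not 0 <= code_point <= 0x7FFFFFFF:
        raise ValueError("character value must fit in 31 bits")
    # Sequence of one: high-order bit 0, remaining 7 bits hold the value.
    if code_point < 0x80:
        return bytes([code_point])
    # A sequence of n octets carries (7 - n) bits in the initial octet
    # plus 6 bits in each of the n - 1 following octets.
    for n in range(2, 7):
        if code_point < (1 << ((7 - n) + 6 * (n - 1))):
            break
    # Following octets: high-order bit 1, next bit 0, then 6 value bits.
    following = []
    for _ in range(n - 1):
        following.append(0x80 | (code_point & 0x3F))
        code_point >>= 6
    # Initial octet: n high-order bits set to 1, followed by a 0 bit,
    # then whatever value bits remain.
    lead = ((0xFF << (8 - n)) & 0xFF) | code_point
    return bytes([lead] + following[::-1])


for cp in (0x41, 0xE9, 0x20AC, 0x10348):
    print(hex(cp), utf8_encode_rfc2279(cp).hex())
```

Note that the result is just a sequence of numbers in 0..255; nothing in the rule depends on how those octets are themselves written down or transmitted.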
The UTF-8 spec generally presents octet values as hexadecimal numerals.
Dan Connolly offers a slightly different form of definition:
[[
octet
an element of the set {0, 1, 2, ..., 255}
]]
http://www.w3.org/MarkUp/html-spec/charset-harmful.html
Some others:
[[
octet: A byte of eight binary digits usually operated upon as an entity.
]]
-- http://glossary.its.bldrdoc.gov/fs-1037/dir-025/_3631.htm
-- http://www.atis.org/tg2k/_octet.html
[[
Definition for: octet
Eight bits. Octet is sometimes used instead of the term byte to avoid
confusion, because not all computer systems use bytes that are eight bits long.
]]
-- http://www.computeruser.com/resources/dictionary/definition.html?lookup=3442
Googling for "octet definition" turns up plenty more.
Looking for definitions of "binary octet" doesn't turn up anything
especially useful, but the pattern of its use suggests one of two things:
(a) octet values represented as 8 bits (as opposed to, say, a number)
(b) octets used to encode binary data (as opposed to textual data).
Anyway, returning to the original question (Are 'binary octets' different
from 'octets'?), I think the answer is: not for any meaningful purpose as
far as RDF is concerned.
#g
-------------------
Graham Klyne
<GK@NineByNine.org>
PGP: 0FAA 69FF C083 000B A2E9 A131 01B9 1C7A DBCA CB5E
Received on Tuesday, 29 July 2003 09:43:43 UTC