- From: <drewangel@adelphia.net>
- Date: Mon, 20 Sep 2004 7:05:33 -0400
- To: public-webarch-comments@w3.org <public-webarch-comments@w3.org>
- CC: <drewangel@adelphia.net>,<w3c@drewangel.com>
Cyberspace itself should be considered analogous to the set
of all sets.
Sunday 2004'262.6405 September 19th= PASSED DEADLINE TO
COMMENT <=2004 Sept16
This subject is too complex for a full response in the
short period I had (saw the notice about 2 weeks ago), but
the following contains my general thoughts, which
are that Cyberspace should be considered as a practical
model of the Universe, although, as with most models, its
form is extremely different from the object modeled. My
normal interests are more in the direction of the HTML
standards, but I definitely do think the Internet
addressing scheme is too small: it should allow for
numerically distant categorizations that might be humanly
discernible.
Basically the system should be able to log every single
human thought, every idea, every photo, every frame of
every video. It is not inconceivable that some people will
think that even every pixel of every frame, and every
sample of every audio clip (i.e. ~50k samples per second)
should be addressable through Cyberspace.
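
To give a rough sense of scale for per-sample and per-pixel addressing, here is a small Python sketch. The figure of 50,000 samples per second comes from the text above; the one-hour clip length, the 640x480 frame size, and the rate of 30 frames per second are assumptions chosen only for illustration.

```python
# Rough count of addressable items in one hour of audio and video.
SAMPLES_PER_SECOND = 50_000      # "~50k samples per second" from the text
SECONDS_PER_HOUR = 3_600

# Assumed values, for illustration only.
FRAME_WIDTH, FRAME_HEIGHT = 640, 480
FRAMES_PER_SECOND = 30

audio_samples = SAMPLES_PER_SECOND * SECONDS_PER_HOUR
video_frames = FRAMES_PER_SECOND * SECONDS_PER_HOUR
video_pixels = video_frames * FRAME_WIDTH * FRAME_HEIGHT

print(f"audio samples per hour: {audio_samples:,}")
print(f"video frames per hour:  {video_frames:,}")
print(f"video pixels per hour:  {video_pixels:,}")
```

Under these assumptions a single hour of video already needs tens of billions of pixel addresses, more than the whole of the current 32-bit address space.
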
That's a much larger view than is normally taken, but THE
SYSTEM should have MORE THAN ENOUGH ROOM to grow in the
foreseeable future, and be able to be handled as subsets,
some of which will be compatible with the current system.
The world's a big place: it is not going to get smaller.
Cyberspace, itself, therefore, should perhaps be considered
analogous to the set of all sets.
USING FLOATING POINT URI ADDRESSING
In particular, I think the address space should immediately
be extended to a larger space "between the dots." Each
field could be allowed to be 6 or 8 digits without any
other gigantic changes, as an interim fix.
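
As a rough check on the interim fix, the sketch below compares the size of the current dotted scheme with widened fields. It assumes four fields, as in today's dotted addresses, and assumes each widened field holds a plain decimal number of 6 or 8 digits. The function name is hypothetical.

```python
def dotted_address_count(fields: int, values_per_field: int) -> int:
    """Number of distinct addresses with `fields` independent fields."""
    return values_per_field ** fields

current = dotted_address_count(4, 256)         # 0-255 in each of four fields
six_digit = dotted_address_count(4, 10 ** 6)   # 000000-999999 per field
eight_digit = dotted_address_count(4, 10 ** 8)

print(f"current (4 x 0-255):  {current:,}")
print(f"4 fields of 6 digits: {six_digit:.1e}")
print(f"4 fields of 8 digits: {eight_digit:.1e}")
```
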
HOWEVER,
if the dot (= decimal point) is not the delimiter used,
then floating point numbers can be used, basically solving
all troubles with respect to address space size. That would
allow machine-limited infinitesimal expansions on the right
side of integer addresses, and enable expansions on the
left as well. The obvious hindrance is the use of decimal
points. The clear choice is to choose an address delimiter
with a new glyph, without the conflicts of the DOT.
A good choice might be binary value 00000100 = ascii code
04 (commonly represented as a diamond glyph).
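
A minimal sketch of how such an address might be parsed, assuming the fields are separated by the ASCII 04 control character and each field is a decimal number with an optional fractional part. The function name and the sample address are hypothetical. `Decimal` is used instead of binary floating point so that a field such as 0.001 is kept exactly as written.

```python
from decimal import Decimal

DELIMITER = "\x04"  # ASCII code 04, the delimiter suggested above

def parse_address(text: str) -> list[Decimal]:
    """Split an address on the delimiter and read each field as a decimal."""
    return [Decimal(field) for field in text.split(DELIMITER)]

# Hypothetical address: four fields, two with fractional expansions.
sample = DELIMITER.join(["192", "168.25", "0.001", "7"])
fields = parse_address(sample)
print(fields)
```

Because the dot no longer separates fields, it is free to act as a decimal point inside a field.
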
The expedient of the decimal point combined with fixed
point integer values for internet addresses was determined
even before CRT screens were common in computing. It is
time to grow up.
That there would be many unused addresses, as indeed of
course there are now, is true. But in the future things
like DNA samples and other test results will be part of
the general database, which means that numerous information
items could refer to the very same item. There could even
be identical reports about the same sample, referring to
potentially an identical source, or not. And that could all
come from a single drop of spit or a single hair, of which
there could be numerous similar items in a single
scientific study or legal proceeding.
The future must contain a virtually infinite URL/URI
address space, perhaps on the order of 10^50 items,
supposedly more than the number of atoms in the Observed
Universe. But it is important, in my view, that the
cyberspace model should be able to CONTAIN the universe,
logically, even if it is never fully filled with actual
atoms. The very idea of SPACE indicates distance between
particles, and in any cyberspace there should be room for
plenty of space between addresses. A real problem of
viewpoint is that regardless of the large but limited number
of atoms, or even particles, that figure is a one
dimensional parameter. It neglects the true nature of the
universe, where all those particles change relative to one
another every instant, and where the true measure is units
of action. That makes the number of particles simply the
basic unit measure of the action of the Universe. Thus when
figures like 10^43 are taken as markers for the
most of something, the crucial element of the flow of time
as a series of actions is neglected. Cyberspace, as a
space, must, for the future, be seen more as a model of the
universe than merely a static collection of discrete
samples of any particular size.
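
For a sense of what an address space of 10^50 items means in machine terms, the sketch below computes how many bits a fixed-width integer address would need. The 10^50 figure is from the text; the comparison widths of 32 and 128 bits are added here only as familiar reference points.

```python
TARGET = 10 ** 50   # "on the order of 10^50 items"

# Smallest number of bits b such that 2**b >= TARGET.
bits_needed = (TARGET - 1).bit_length()

print(f"bits needed for 10^50 addresses: {bits_needed}")
for width in (32, 128):
    enough = 2 ** width >= TARGET
    print(f"{width}-bit addresses suffice: {enough}")
```
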
"The World Wide Web is an information space of interrelated
resources. This information space is the basis of, and is
shared by, a number of information systems. Within each of
these systems, people and software retrieve, create,
display, analyze, relate, and reason about
resources." {http://W3.org}
OTHER SOMEWHAT UNRELATED COMMENTS written earlier.
There are a number of problems involved with the practical
use of such a system. When it is assumed that authors should
be writing their documents online in realtime, then there
is a danger that incomplete documents may be observed,
producing problems of misleading interpretations, copyright
and idea thefts, and what one might call the glass
bathhouse effect, of making authors feel they are
improperly exposed during their creative processes.
For others, that seems no problem, because there is
authoring software that can automatically convert
references from relative links to URI forms. But then the
author has lost a certain amount of control over the work.
For example, many such software tools also automatically
"pretty print" the work in progress into a burdened format
before posting the files to the web via FTP.
In my view automatic pretty printing destroys much of the
utility of HTML. It is a "mark up language" not a symbolic
coding language.
Pretty printing generally adds white space, either spaces
or tabs, for the most part. Paragraphs were developed as a
style during centuries of writing.
However, what is often lost by too many automated
formatting tools is the original paragraph design of the
text, hidden by <SPANS>, <FONT SIZES FACES> and what not.
The invention of CSS (style sheets) seems to have failed to
make type specifications simpler and less intrusive in
actual documents, and instead merely invented some new
professions, perhaps called that of "stylist."
If one program worked perfectly, everyone would use it, if
it were conveniently inexpensive. The StarOffice/OpenOffice.org
office products were headed in the right direction for a
while, but seem to have gotten snafued. There is no
product that allows tight production of decent HTML text,
and does page formatting also in frameset contexts. It is
insane to have to go back to a "text editor" to fix broken
links, and "detail" pages, (yes "detail" like in car
washing).
If a document is properly formatted, the inclusion of html
markup <TAGS> merely makes the document more readable. When
the document is burdened by artificial symbols for common
textual symbols, and excessively repetitious font
declarations, the underlying text may be lost entirely to
examination by ordinary human readers, and search engines
as well. The text is no longer "marked up," but is instead,
altered!
For example the "&nbsp;" non-breaking space token is one of
the most obnoxious devices ever invented.
It definitely could have been designed as an encapsulated
tag, such as <nb=#> where the number sign "#" is a
parameter specifying the number of spaces from 0 to any
number. When the number of spaces should be 1, then no
argument would be needed, and thus this tag, <nb>,
typically would be 2 characters shorter than "&nbsp;".
For advanced browsers, even fractional and negative
(typeover) spaces might be practical, for example to make
accents, math symbols, and strike-through print
(although the <strike> tag works fine for most uses, such
as legal documents, there are times when one might want
double strike-throughs using equal signs, or maybe
xxxxx's), and even drop shadows might be feasible.
The <nb=#> might need to be augmented by another tag,
<sp=#>, to allow similar spacing without the non-breaking
feature (that is, one might say, "with wrap around"). These
tags could be used much as "tab stops" are used in
other document production tools.
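
A minimal sketch of how the proposed <nb=#> tag might be handled, assuming a preprocessor that rewrites it into existing HTML before a browser sees the page. The tag is the proposal from the text, not part of any HTML standard, and the function name is hypothetical. Only non-negative whole-number counts are handled; fractional and negative spacing are left out, and so is <sp=#>, because runs of ordinary spaces collapse when HTML is rendered and have no equally simple replacement.

```python
import re

# Matches <nb> and <nb=3>; the count is optional and defaults to 1.
NB_TAG = re.compile(r"<nb(?:=(\d+))?>")

def expand_nb_tags(text: str) -> str:
    """Rewrite the proposed <nb=#> tag into runs of &nbsp; entities."""
    def replace(match: re.Match) -> str:
        count = int(match.group(1) or 1)
        return "&nbsp;" * count
    return NB_TAG.sub(replace, text)

print(expand_nb_tags("Dr.<nb>Angel"))
print(expand_nb_tags("a<nb=3>b"))
```
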
The use of "&nbsp;" ("non-breaking space"), which prevented
some search engines from finding the words following, was
evidently a subversion of the basic intention of making
documents more easily accessible, and certainly makes
reading html more difficult in many instances. It is
virtually impossible to attribute these errors of design
merely to poor judgement, when there were better and easily
defined alternatives.
Similarly the use of "&quot;" for double quote marks was
an egregious offence against all writers of English, and
possibly other languages. To compare or be forced to search
for two items such as "egregious offences" with
"&quot;egregious offences&quot;" is obviously never going
to work properly every time, if only because most people
won't even think of it, and yet that is what many users of
the World Wide Web are faced with, if only they knew.
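
One way a search tool could avoid this mismatch is to decode character entities before comparing text. A minimal sketch, using `html.unescape` from the Python standard library; the function name and the sample strings are made up for illustration.

```python
import html

def normalize_for_search(markup_text: str) -> str:
    """Decode entities and treat non-breaking spaces as ordinary spaces."""
    decoded = html.unescape(markup_text)      # &quot; -> ", &nbsp; -> U+00A0
    return decoded.replace("\u00a0", " ")

stored = "&quot;egregious&nbsp;offences&quot;"
query = '"egregious offences"'

print(stored == query)                         # raw comparison
print(normalize_for_search(stored) == query)   # comparison after decoding
```
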
In many situations the most common method of identifying a
"title" has long been to enclose it in quotes if it is an
"article" or "short story," and to italicize it if it is the
title of a book. Perhaps the developers of the Web did not
have to attend grammar school, being enrolled in University
by age 12, but presuming it was technical difficulties, we
can say that regardless of cause, some wrong answers have
been published.
Another problem is the explicitly necessary parameter
arguments requiring instantaneous access to the internet
(such as the definition statements used for variants of
HTML and XHTML). This may be essential for some purposes,
but it presupposes 1) that internet access is always
available, and 2) that anyone using such documents should
be able to be tracked down by their necessity to link to
the W3.org for specifications.
There has always been a problem between computer
programming and general computer document creation. The
tendency exists for programmers to presume that end users
should be exposed to the same kinds of messages needed
during development. The only area where this does not seem
to be true is in the gaming industry. It certainly is a
problem in internet activities.
grumpy@DrewAngel.com
sending ~ Mon 2004'263.16501 @03:58 PDT September 20th
Pacific DayLight Time
Received on Wednesday, 22 September 2004 02:37:43 UTC