A Proposal for Hypervideo Management Language (HVML)

From: Ted Williams (tedwi@starband.net)
Date: Wed, Jun 06 2001

    From: "Ted Williams" <tedwi@starband.net>
    To: <www-tv@w3.org>
    Cc: <tedwi@starband.net>
    Date: Wed, 6 Jun 2001 09:26:59 -0700
    Message-ID: <HEECJFKLOIEFPAAIBDHECEIFCBAA.tedwi@starband.net>
    Subject: A Proposal for Hypervideo Management Language (HVML)
    
    
    Hello,
    
    Let me start by apologizing to the members of this list, particularly if
    this post is off-topic.  This is going to be quite a lengthy email.  Please
    indulge me while I give some background information and then discuss my
    proposal.
    
    Background
    
    My name is Ted Williams and I have been an entrepreneurial engineer in the
    television industry in one form or another since 1984.  I have developed
    software for everything from signal processing devices to digital video
    recorders to DVEs to machine control and automation systems.  Presently I am
    running a startup company I founded called Sultan Media Systems (SMS) to
    build very advanced media production and distribution technologies around
    the concepts presented here.
    
    In 1993 I attended a conference on multimedia and took a good look at what
    multimedia systems were doing, and from there extrapolated what was going on
    under the hood.  This also happened to be the time at which the entire
    television industry was abuzz about the “information superhighway” and the
    “500 channel” television networks that were thought to be just around the
    corner.  At the time everybody, including the likes of Bill Gates et al.,
    thought that the information infrastructures would be a natural outgrowth of
    cable television.  Of course the Internet took off and the rest is history
    you already know.
    
    What I extrapolated were two important insights: 1) Multimedia systems were
    simply duplicating what television production systems already did, albeit at
    view-time and with a consumer-centric user interface.  The act of
    compositing and presenting the different forms of media (video, music,
    graphics, text, etc.) could have used the same processes and plumbing as
    your average professional television production system, downscaled and at a
    somewhat lower quality level.  2) The digital video that these multimedia
    systems and the promised “information superhighway” utilized could be
    processed like any form of data, could be made intelligent, and could be
    interlinked with itself, other “titles” of its kind, or completely
    independent forms of digital information or software.
    
    Then, of course, the Internet usurped the cable-based information
    superhighway, and interest in fully exploiting digital video’s ability to be
    processed, extended, and super-distributed via digital communications
    infrastructures waned.  However, the time has come to revisit the whole
    thing.  This is due to, in a phrase, broadband Internet.
    
    The Overall Idea
    
    Imagine the television industry of the near future.  Nearly all television
    material is online and available on-demand soon after it is posted, and what
    is still transmitted via terrestrial broadcast from the major networks and
    affiliates has rich linkages to auxiliary information residing on the
    Internet, private video servers of broadband service providers, and to DVD
    or DVD-ROM based companion information specifically authored or developed to
    serve as an after-market “upgrade” to the on-air broadcast (like electronic
    hypermedia and software versions of the companion books always offered for
    sale on PBS).  The viewer interfaces are both aesthetically pleasing and
    functional, being designed by production artists and composed of
    “smart” special effects that have a view-time behavior associated with them.
    Imagine a web of temporal links, some intra-production such as links from a
    table of contents to individual stories in a magazine format show, and some
    inter-production linking to related, historical, sponsor supplied, or just
    about any other pertinent information you can think of.  Imagine all of this
    being arranged ad hoc in a growing, traversable, navigable, Web-like
    structure evolving in much the same way as the World Wide Web, but with the
    emphasis on full-motion video, images, surround sound, and temporal and
    contextual interactivity.  Of course, traditional Internet information can
    be made to fit right in, and the viewer’s system can actually composite that
    information into the on-line or broadcast production seamlessly, just as
    easily as a post-production system keying titles over video.  A television
    industry like this is not as far off as some of those reading this post may
    be thinking.
    
    All of the required technology to make this happen can be culled from the
    television and video game industries.  The temporal management, compositing,
    digital effects, and some of the graphics technology can be found inside all
    kinds of professional post-production gear, and the more advanced real-time
    graphics technology with the ability to treat full-motion video as a texture
    can be found in some higher-end 3D graphics accelerators now being sold into
    the PC gaming market as well as game consoles.  Broadband communications
    infrastructures are coming on-line at a rapid pace, and it appears (to me
    anyway) that something like Moore’s Law also applies to bandwidth.  What is
    needed is a “glue” to bind all of this together.  This, finally, is where
    Hypervideo Management Language (HVML) fits into the big picture.
    
    Hypervideo Management Language
    
    For the sake of brevity, I will only touch on some of the more important
    points of HVML here.  It is my hope that those interested will jump in and
    we can discuss it in greater depth.
    
    HVML is a stream-aware, time-driven procedural language intended to augment
    markup languages and HTML-derived streaming languages such as SMIL and
    ATVEF.  It is intended to be to those languages what Java and JavaScript are
    to HTML.  HVML may either control the clock of the production or be
    controlled by that clock, with programmatic execution and flow being driven
    by the containing television production.
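
    To make that clock relationship concrete, here is a rough sketch, in
    Python and with every name hypothetical, of how a view-time system might
    model a production clock that is either advanced by HVML itself or slaved
    to the containing stream.  This is only an illustration of the idea, not
    part of any specification.

        # Illustrative sketch: a production clock that HVML either advances
        # itself or follows, depending on who owns time.
        class ProductionClock:
            def __init__(self, frame_rate=30):
                self.frame_rate = frame_rate
                self.current_frame = 0
                self.slaved = False   # True when the containing production drives time

            def tick(self):
                """Advance one frame when HVML itself owns the clock."""
                if not self.slaved:
                    self.current_frame += 1

            def sync_to_stream(self, stream_frame):
                """Follow the clock recovered from the containing television stream."""
                self.slaved = True
                self.current_frame = stream_frame

            def now(self):
                """Current production time, in seconds."""
                return self.current_frame / self.frame_rate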
    
    HVML may be embedded within a digital television stream, and a subset of the
    language may be embedded into the VBI of a terrestrial broadcast NTSC
    signal.  When embedded in the VBI, a technique known as Cyclic Procedure
    Streaming is used so that “late tuners” are guaranteed to receive any
    programmatic procedures required by the body of the HVML production.
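
    The idea behind Cyclic Procedure Streaming is simply to repeat the
    procedure modules over and over within the available VBI data capacity,
    much like a data carousel, so that a receiver tuning in mid-program still
    collects every procedure before it is needed.  Purely as an illustration
    (in Python, with hypothetical names, and ignoring the actual VBI line
    encoding), the insertion side might look something like this:

        from itertools import cycle

        BYTES_PER_FIELD = 32   # hypothetical payload available per vertical interval

        def cyclic_procedure_stream(procedures):
            """Endlessly cycle over the named procedures, yielding one small
            chunk per field; a late tuner keeps collecting chunks until it has
            seen every (name, offset) pair at least once."""
            for name, body in cycle(procedures.items()):
                data = body.encode("utf-8")
                for offset in range(0, len(data), BYTES_PER_FIELD):
                    yield (name, offset, data[offset:offset + BYTES_PER_FIELD])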
    
    SMPTE/EBU timecode is an intrinsic data type, and may have all of the
    standard mathematical operators applied to it.  It also may be freely
    intermixed in mathematical expressions with other intrinsic data types of
    the language, such as integers and floating-point values.  For example,
    1:30:00 / 1.25 and 1.25 / 1:30:00 are both valid expressions.
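
    To illustrate the intent (and only that; the exact semantics, drop-frame
    handling, and what a scalar divided by a timecode should actually mean
    are exactly the sort of questions an open forum would settle), a timecode
    can be modeled as a frame count that mixes with scalars in ordinary
    arithmetic.  A Python sketch with hypothetical names:

        class Timecode:
            def __init__(self, hours=0, minutes=0, seconds=0, frames=0, fps=30):
                self.fps = fps
                self.total_frames = ((hours * 60 + minutes) * 60 + seconds) * fps + frames

            def __truediv__(self, other):
                if isinstance(other, Timecode):
                    return self.total_frames / other.total_frames        # ratio of durations
                tc = Timecode(fps=self.fps)
                tc.total_frames = int(round(self.total_frames / other))  # scaled duration
                return tc

            def __rtruediv__(self, other):
                # One possible reading: a scalar divided by the duration in seconds.
                return other / (self.total_frames / self.fps)

            def __repr__(self):
                f = self.total_frames
                h, f = divmod(f, 3600 * self.fps)
                m, f = divmod(f, 60 * self.fps)
                s, f = divmod(f, self.fps)
                return f"{h}:{m:02}:{s:02}:{f:02}"

        # Reading 1:30:00 as one minute, thirty seconds, zero frames:
        print(Timecode(minutes=1, seconds=30) / 1.25)   # -> 0:01:12:00
        print(1.25 / Timecode(minutes=1, seconds=30))   # -> ~0.0139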
    
    HVML supports the standard program control structures, and adds some more
    that are specifically intended as the means to add intelligence
    to hypervideo productions.  The AT structure controls when in time the
    contained block of code will execute.  The ANIMATE structure allows a block
    of code to execute iteratively during the vertical blanking interval,
    feeding back into the program the current time of the animation loop
    relative to the start of the animation, as determined by the production’s
    clock.
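
    As a purely illustrative model of how a view-time system might service AT
    and ANIMATE blocks (Python again, every name hypothetical):

        class Runtime:
            def __init__(self, fps=30):
                self.fps = fps
                self.frame = 0
                self.at_blocks = []     # (trigger_frame, block): run once at that time
                self.animations = []    # (start_frame, block): run every vertical interval

            def at(self, trigger_frame, block):
                self.at_blocks.append((trigger_frame, block))

            def animate(self, block):
                self.animations.append((self.frame, block))

            def vertical_interval(self):
                """Called by the display system once per vertical blanking interval."""
                pending = []
                for trigger, block in self.at_blocks:
                    if self.frame >= trigger:
                        block()                             # AT: execute once its time arrives
                    else:
                        pending.append((trigger, block))
                self.at_blocks = pending
                for start, block in self.animations:
                    block((self.frame - start) / self.fps)  # ANIMATE: feed back loop time
                self.frame += 1

        # e.g. an "AT 0:01:00" block at 30 fps would simply be at(1800, some_block).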
    
    The BRANCH, LOOP, and FORK structures embody the temporal web of the
    hypervideo production, causing a change in time, context, or the navigation
    to a new production.  A branch takes the viewer to a new place in time and
    context, a loop brings the viewer back in time and context, and a fork
    splits playback into multiple streams.  The multiple streams created by FORK
    may run in separate windows, or be re-composited by the view-time system
    back into a single image so seamlessly that the viewer has no idea it has
    happened.  In the case of branching to a new time or production with the
    BRANCH structure, a return disposition may be specified.  There may be no
    return; the clock may return to the next frame after the origination point
    on completion of the link opportunity or other factor; it may return
    time-adjusted, such that if playback branches from A at 5:00 to B and
    remains in B for 15:00, it returns to A at time 20:00; or it may return to
    an absolute time.  These control structures are augmented with mechanisms
    for specifying the persistence of programmatic or intrinsic browser states
    across branch boundaries.
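
    For illustration, the return dispositions could be modeled as follows
    (hypothetical names; times are in seconds of production A’s clock):

        NO_RETURN, NEXT_FRAME, TIME_ADJUSTED, ABSOLUTE = range(4)

        def return_time(disposition, branch_out_at, time_spent_away,
                        absolute_target=None, fps=30):
            """Where the clock lands back in the originating production; None = no return."""
            if disposition == NO_RETURN:
                return None
            if disposition == NEXT_FRAME:
                return branch_out_at + 1.0 / fps       # frame after the origination point
            if disposition == TIME_ADJUSTED:
                return branch_out_at + time_spent_away
            if disposition == ABSOLUTE:
                return absolute_target

        # The example from the text: branch out of A at 5:00, spend 15:00 in B.
        print(return_time(TIME_ADJUSTED, 5 * 60, 15 * 60))   # -> 1200.0 seconds, i.e. 20:00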
    
    An extensible set of basic viewer interface primitives provides for the
    construction of the viewer interface as well as for the attachment of HVML
    code to events happening on those controls.  These basic controls are
    required to have a generic look-and-feel implemented on the receiving end,
    but are most useful when they are used merely as intelligent containers for
    other elements which may give them a customized look.  Controls may be
    filled with text, bitmaps, moving video, or anything else which supplies an
    on-screen appearance.
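
    As a very rough sketch of the container idea (Python, hypothetical
    names), a basic control carries a generic appearance, can instead be
    filled with arbitrary visual content, and lets HVML code attach itself to
    the control’s events:

        class Control:
            def __init__(self, name):
                self.name = name
                self.content = None   # text, bitmap, moving video: anything drawable
                self.handlers = {}    # event name -> attached HVML behaviors

            def fill(self, content):
                """Give the control a customized look in place of the generic one."""
                self.content = content

            def on(self, event, handler):
                """Attach code to an event happening on this control."""
                self.handlers.setdefault(event, []).append(handler)

            def fire(self, event):
                for handler in self.handlers.get(event, []):
                    handler(self)

        # e.g. a "more info" button filled with a moving video thumbnail:
        button = Control("more_info")
        button.fill("video:related_story")
        button.on("select", lambda ctl: print("branching from", ctl.name))
        button.fire("select")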
    
    In addition to the basic set of primitives, “smart special effects” may be
    used.  These are somewhat standard-looking special effects, such as a title
    super, key, DVE move, etc., to which a view-time behavior, itself expressed
    in HVML, is attached.  This allows the television post-producer to create
    visually complex and dynamic viewer interfaces very easily, and places
    title “look and feel” issues into the hands of the production artist.
    
    An HVML production may also be self-editing in nature.  Among other more
    artistic uses, this allows a program to have variable ratings levels or
    dynamically variable detail level or “information bandwidth.”  A production
    may also edit itself based on factors such as programmatic states maintained
    by the browser or delivery device, or by past viewer actions and
    interactions.
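
    Purely as an illustration of the self-editing idea (the real mechanism
    would of course be expressed in HVML; every name below is made up), a
    production might assemble its own running order at view time:

        def build_playlist(segments, max_rating, detail_level, viewer_state):
            """segments: list of dicts with 'id', 'rating', 'detail' and 'topic' keys."""
            playlist = []
            for seg in segments:
                if seg["rating"] > max_rating:
                    continue      # ratings-level edit
                if seg["detail"] > detail_level:
                    continue      # "information bandwidth" edit
                if seg["topic"] in viewer_state.get("skipped_topics", set()):
                    continue      # edit based on past viewer interactions
                playlist.append(seg["id"])
            return playlist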
    
    The attached scenarios document discusses a small set of examples of what
    all of the above will enable a production artist to deliver to the consumer,
    from the consumer’s point of view.
    
    The Proposal
    
    At long last, I come to the point.  I propose that the HVML language be
    fully designed and specified in an open forum.  Further, I propose that the
    W3C oversee this forum and moderate development to maintain adherence to
    an agreed-upon standard.  It is my belief that the time to start is now, and
    that we can take full advantage of the lag time we are going to see in
    widespread broadband deployment, so that when broadband is extremely
    common, we will be ready with what I believe is the “killer app” for that
    bandwidth.
    
    Conclusion
    
    I invite anybody and everybody on this list to jump in and share their
    opinions and ideas on all of this.  It will be a complex task to create a
    system such as this, so the more people who beat on it the better.
    
    Thank you all for your time.  I look forward to hearing from you.
    
    Regards,
    Ted Williams
    Founder
    Sultan Media Systems, Inc.