
[html5] self-sufficient documents and data URI scheme

From: Andrew Fedoniouk <news@terrainformatica.com>
Date: Wed, 04 Mar 2009 12:18:52 -0800
Message-ID: <49AEE22C.90103@terrainformatica.com>
To: HTML WG <public-html@w3.org>
As we know, the data URI scheme [1] is already used in the wild.
In principle it allows us to define self-sufficient HTML documents -
documents that are serialized as a single file. Such documents may include
markup, styling and all resources, such as images, in that one file.

But the data URI scheme has problems that make it inconvenient to use.
When a document uses multiple instances of the same image, the data bits of
that image are inserted multiple times. Another problem arises when the
document needs to be split into multiple components - with data URIs that is
not trivial to do. There are also other (technical) problems with the data URI
scheme: e.g. such long URIs are poorly suited to serving as keys in hash
tables and maps.
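
To put a number on the duplication problem, here is a minimal Python sketch.
The helper to_data_uri and the 1024-byte payload are made up for illustration
(the payload is not a real PNG); the only point is that every use of the
image repeats the whole encoded payload.

```python
import base64


def to_data_uri(payload: bytes, mime: str = "image/png") -> str:
    """Encode raw bytes as a base64 data URI."""
    encoded = base64.b64encode(payload).decode("ascii")
    return f"data:{mime};base64,{encoded}"


# Hypothetical stand-in for the bytes of an image file (not a real PNG).
image_bytes = bytes(range(256)) * 4

uri = to_data_uri(image_bytes)

# A document that shows the same image three times carries the payload
# three times, because each <img> needs its own copy of the URI.
document = "".join(f'<img src="{uri}" />\n' for _ in range(3))

single_len = len(uri)
embedded_copies = document.count(uri)
print(single_len, embedded_copies, len(document))
```

The document grows linearly with the number of uses, and the key a UA would
use for caching is the whole URI, which here is over a thousand characters.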

It would be nice if it were possible to define [image] data inside the
document in the same way as <style> sections.
Technically this could be done with <script type="text/base64"> blocks:

      <script type="text/base64" data-type="image/png" cid="images/red-dot.png">...base64 data...</script>
      <script type="text/base64" data-type="image/png" cid="images/blue-dot.png">...base64 data...</script>
      <style>
           li { list-style-image: url(images/red-dot.png); }
      </style>
      <img src="images/blue-dot.png" />

From the UA implementation perspective, supporting this is very simple:
before issuing a download request for a resource, the UA should first look
for a block whose CID matches the requested URL, such as
<script type="text/base64" cid="images/red-dot.png">, and if one is found,
simply use its data.
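
The lookup described above could be sketched roughly as follows in Python.
This is only an illustration under assumptions, not how any UA does it: the
class EmbeddedResourceParser, the function resolve and the stub fake_fetch
are hypothetical names, the CID is matched as an exact string against the
requested URL, and the payload is placeholder bytes rather than a real PNG.

```python
import base64
from html.parser import HTMLParser


class EmbeddedResourceParser(HTMLParser):
    """Collects <script type="text/base64" cid="..."> blocks by CID."""

    def __init__(self):
        super().__init__()
        self.resources = {}    # cid -> (mime type, decoded bytes)
        self._current = None   # (cid, mime) while inside a matching block
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if (tag == "script" and attrs.get("type") == "text/base64"
                and attrs.get("cid")):
            mime = attrs.get("data-type") or "application/octet-stream"
            self._current = (attrs["cid"], mime)
            self._chunks = []

    def handle_data(self, data):
        if self._current is not None:
            self._chunks.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._current is not None:
            cid, mime = self._current
            # Drop whitespace and line breaks before decoding.
            text = "".join("".join(self._chunks).split())
            self.resources[cid] = (mime, base64.b64decode(text))
            self._current = None


def resolve(url, resources, fetch):
    """Prefer an embedded block; fall back to a download request."""
    if url in resources:
        return resources[url]
    return fetch(url)


# Placeholder payload, assumed for the example (not a real PNG).
payload = base64.b64encode(b"not a real PNG").decode("ascii")
html = (
    '<script type="text/base64" data-type="image/png" '
    f'cid="images/red-dot.png">{payload}</script>'
    '<img src="images/red-dot.png" />'
)

parser = EmbeddedResourceParser()
parser.feed(html)
parser.close()

network_requests = []


def fake_fetch(url):
    """Stub for the network layer; records what would be downloaded."""
    network_requests.append(url)
    return ("application/octet-stream", b"")


hit = resolve("images/red-dot.png", parser.resources, fake_fetch)
miss = resolve("images/green-dot.png", parser.resources, fake_fetch)
```

Only the URL without a matching block reaches the stubbed network layer. A
real UA would also have to decide how relative URLs are compared with CIDs;
the sketch sidesteps that by requiring an exact match.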

Such self-contained documents are convenient when HTML is used as a document
format for files kept in local storage for personal use. They can also be
useful in e-mails.
In principle it also makes sense to use such documents for publishing on the
Web, e.g. for articles that have some illustration images. Such images are
unique to the document and are really part of the content rather than
separate standalone entities. The <script type="text/base64"> approach is
backward compatible too - if a UA does not recognize a
<script type="text/base64"> block, it will send a request to the server, and
the server may supply the data separately.

Any comments, objections?

[1] http://en.wikipedia.org/wiki/Data_URI_scheme

Andrew Fedoniouk.

Received on Wednesday, 4 March 2009 20:19:39 UTC
