Re: A single, structured SVG doc - with offsets from Cameron McCormack on 2012-02-17 (public-svgopentype@w3.org from February 2012)

From: Cameron McCormack <cam@mcc.id.au>
Date: Fri, 17 Feb 2012 15:23:01 +1100
To: Sairus Patel <sppatel@adobe.com>
CC: "public-svgopentype@w3.org" <public-svgopentype@w3.org>
Message-ID: <4F3DD625.6000508@mcc.id.au>

Hi Sairus,

Sairus Patel:
> It seems we keep going back and forth between the following models:
>
> 1. Each glyph definition is a self-contained SVG document. (Sharing:
> none. Subsetting: easy. Getting to a glyph: easy.)
>
> 2. All glyph definitions are in a single SVG document. (Sharing:
> possible. Subsetting: complex. Getting to a glyph: requires parsing
> entire document up to the glyph.)
>
> We keep going back and forth because there are compelling advantages
> to each model.
>
>
> I'd like to explore a different kind of model wherein all glyph
> definitions are still in a single SVG document, but that document has
> a simple structure, and some other OT table ('SVGO' - SVG Offsets,
> perhaps) provides key offsets into that structure, in the same way
> that 'loca' provides offsets into 'glyf' ('loca' is essentially an
> optimization only). The structure would be something like:
>
> header
> shared elements
> element with id="g0"
> element with id="g1"
> ...
> element with id="g<numGlyphs-1>"
> footer
>
> This structure could enable easy glyph access and simplify
> subsetting, while allowing for shared elements. Even the shared
> elements section could be comprised of a list of elements each with a
> specific id (e.g. id="d<int>"), to aid subsettability. Not allowing
> glyphs to be nested within each other, for example, is not a limit on
> graphical expressiveness.

It depends what you mean by "easy glyph access" here.  If you mean that 
it is easy to seek to the right spot in the file where the markup for 
the given glyph begins, that is true, but you will not be able to 
reliably do anything useful with the characters from the markup without 
having parsed the document up to that point anyway.  This is because XML 
has a small amount of state that gets built up as you parse (basically 
just namespace declarations).  But also, you won't be able to just take 
that substring of the XML document pointed to by the SVGO table and 
render that because style sheets in the document could very well need to 
know what other elements came earlier on in the document.
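
The namespace point can be illustrated with a minimal sketch, using only the Python standard library. The document below is a made-up example, not a proposed format: the ids "d0", "g0" and "g1" follow the naming suggested above, and the helper `slice_glyph` is hypothetical, standing in for whatever an offsets table would hand back. The substring for one glyph is cut out exactly as an offset and length would delimit it, and it fails to parse on its own because the `xlink` prefix was declared earlier in the document.

```python
import xml.etree.ElementTree as ET

# Hypothetical single-document font: the xlink prefix is declared once, on the root.
DOC = (
    '<svg xmlns="http://www.w3.org/2000/svg" '
    'xmlns:xlink="http://www.w3.org/1999/xlink">'
    '<defs><path id="d0" d="M0 0h10v10z"/></defs>'
    '<g id="g0"><use xlink:href="#d0"/></g>'
    '<g id="g1"><use xlink:href="#d0"/></g>'
    '</svg>'
)


def slice_glyph(doc, glyph_id):
    """Return the raw substring for one glyph, as an offset/length pair would."""
    start = doc.index('<g id="%s">' % glyph_id)
    end = doc.index('</g>', start) + len('</g>')
    return doc[start:end]


def parses_standalone(fragment):
    """True if the fragment is a well-formed, namespace-valid XML document by itself."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


fragment = slice_glyph(DOC, "g1")
standalone_ok = parses_standalone(fragment)  # the xlink prefix is unbound here
whole_ok = parses_standalone(DOC)            # the full document parses fine
```

Even if the fragment were patched up with the missing declarations, the style sheet issue would remain: selectors can depend on elements that came earlier in the document.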

> Note that there is still nothing font-like about this SVG document --
> it contains no concept of glyphs, or Unicode characters, or
> ligatures. This is one of the goals of this whole effort: to use
> SVG's graphic, color and animation facilities only, and to have them
> be used in OpenType's font model.

I know that you are very concerned about the risk of the glyphs in 
SVG-in-OpenType fonts specifying metrics that conflict with the tables 
in the font, but I don't think this is really going to be an issue.  We 
define exactly how the SVG markup is to be interpreted.  We could easily 
use <glyph> elements if we wanted to and not even talk about any 
attributes that might exist in the SVG Fonts format.  I think it is 
pretty unlikely that implementors will choose to honour any metric-looking 
attributes that exist within the SVG markup where we've defined them not 
to have any meaning.

> I'm not involved enough in xml circles to know whether access to
> elements without parsing all the preceding data is an area which has
> been investigated much. Certainly this issue isn't specific to
> fonts.

It's problematic, at least with plain XML.  (Certain binary 
serializations of the XML infoset might help, but I don't think we want 
to go down that road.)

> Also, I don't know enough about how an SVG renderer works to know
> whether providing the glyph offsets-and-lengths to it would indeed
> allow it to access glyphs quicker.

Right, so I think this wouldn't help in practice because of the issues 
to do with requiring all of the preceding part of the document to be 
parsed and available.

> However, imposing a simple structure to the SVG document, whether or
> not we have a separate offsets table, feels like the right thing to
> do.

My opinion is that we shouldn't impose any additional structure on the 
SVG document unless necessary, so that we don't need to have a big mode 
difference for SVG engines that are processing font documents versus all 
other documents.

> Any and all thoughts are welcome.

Regarding subsetting, (I think) I did point out in a previous mail that 
it is not trivial to do so when you have all glyphs in a single 
document, but it is definitely not impossible.  The simplicity of having 
a single stream of bytes that get parsed as XML and loaded as a document 
is appealing to me.
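
As a rough sketch of why single-document subsetting is not trivial but still doable, the Python below keeps a chosen set of glyphs plus whatever shared elements they reference. It rests on several assumptions that are not part of any proposal here: glyphs are elements with ids like "g1", shared elements live in `<defs>`, and references are made only through `xlink:href`. A real subsetter would also have to follow `url(#...)` references, style sheets, and so on. The function name `subset` and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
HREF = "{http://www.w3.org/1999/xlink}href"

# Hypothetical font document: g0 and g2 share d0, g1 uses d1.
DOC = (
    '<svg xmlns="http://www.w3.org/2000/svg" '
    'xmlns:xlink="http://www.w3.org/1999/xlink">'
    '<defs>'
    '<path id="d0" d="M0 0h10v10z"/>'
    '<path id="d1" d="M0 0h5v5z"/>'
    '</defs>'
    '<g id="g0"><use xlink:href="#d0"/></g>'
    '<g id="g1"><use xlink:href="#d1"/></g>'
    '<g id="g2"><use xlink:href="#d0"/></g>'
    '</svg>'
)


def subset(doc, keep_glyph_ids):
    """Parse the whole document, then drop every id'd element not needed."""
    root = ET.fromstring(doc)
    by_id = {el.get("id"): el for el in root.iter() if el.get("id")}

    # Follow xlink:href references transitively from the kept glyphs.
    needed = set()
    stack = list(keep_glyph_ids)
    while stack:
        cur = stack.pop()
        if cur in needed or cur not in by_id:
            continue
        needed.add(cur)
        for el in by_id[cur].iter():
            ref = el.get(HREF, "")
            if ref.startswith("#"):
                stack.append(ref[1:])

    # Remove unneeded glyphs (children of the root) and shared elements
    # (children of <defs>); elements without an id are left alone.
    for parent in [root] + root.findall("{%s}defs" % SVG_NS):
        for child in list(parent):
            cid = child.get("id")
            if cid is not None and cid not in needed:
                parent.remove(child)
    return root


subset_root = subset(DOC, {"g1"})
remaining = sorted(el.get("id") for el in subset_root.iter() if el.get("id"))
```

Note that the whole document is parsed before anything is removed, which matches the single-stream model: the subsetter works on the loaded document rather than on byte ranges.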

> (I heard Mozilla had an intern that was exploring implementing some
> of this. Could s/he comment, if on this list?)

That work is going on in 
https://bugzilla.mozilla.org/show_bug.cgi?id=719286.

Thanks,

Cameron
Received on Friday, 17 February 2012 04:23:44 UTC