RE: A single, structured SVG doc - with offsets

Cameron, and others,



Very well then. It looks like the “natural” way for SVG would be a single document, period.



There is the fact that an SVG parser will need to parse the entire document up to the glyph definition itself. And there is the fact that subsetting may be quite complex. But both of these fall out of the SVG model. It’s what comes with choosing SVG as the glyph technology to introduce color and animation into OpenType.



It’s different from the TrueType way of organizing glyph data -- which in turn is different from the CFF way. Each model comes with its own pros and cons, as well as its own history and culture. In SVG culture, it’s acceptable to indicate that “the community can work out its own best practices” for organizing the document, whereas the structure of CFF was defined precisely from the get-go, and hasn’t changed since then.



I can accept this. I suspect we aren’t totally done with this topic, but I’m satisfied right now that this is the right decision. Onward! Thanks for these valuable discussions.



Sairus





-----Original Message-----
From: Cameron McCormack [mailto:cam@mcc.id.au]
Sent: Thursday, February 16, 2012 8:23 PM
To: Sairus Patel
Cc: public-svgopentype@w3.org
Subject: Re: A single, structured SVG doc - with offsets



Hi Sairus,



Sairus Patel:

> It seems we keep going back and forth between the following models:
>
> 1. Each glyph definition is a self-contained SVG document. (Sharing:
> none. Subsetting: easy. Getting to a glyph: easy.)
>
> 2. All glyph definitions are in a single SVG document. (Sharing:
> possible. Subsetting: complex. Getting to a glyph: requires parsing
> entire document up to the glyph.)
>
> We keep going back and forth because there are compelling advantages
> to each model.
>
> I'd like to explore a different kind of model wherein all glyph
> definitions are still in a single SVG document, but that document has
> a simple structure, and some other OT table ('SVGO' - SVG Offsets,
> perhaps) provides key offsets into that structure, in the same way
> that 'loca' provides offsets into 'glyf' ('loca' is essentially an
> optimization only). The structure would be something like:
>
>   header
>   shared elements
>   element with id="g0"
>   element with id="g1"
>   ...
>   element with id="g<numGlyphs-1>"
>   footer
>
> This structure could enable easy glyph access and simplify subsetting,
> while allowing for shared elements. Even the shared elements section
> could be comprised of a list of elements each with a specific id (e.g.
> id="d<int>"), to aid subsettability. Not allowing glyphs to be nested
> within each other, for example, is not a limit on graphical
> expressiveness.



It depends what you mean by "easy glyph access" here.  If you mean that it is easy to seek to the right spot in the file where the markup for the given glyph begins, that is true, but you will not be able to reliably do anything useful with the characters from the markup without having parsed the document up to that point anyway.  This is because XML has a small amount of state that gets built up as you parse (basically just namespace declarations).  But also, you won't be able to just take that substring of the XML document pointed to by the SVGO table and render that because style sheets in the document could very well need to know what other elements came earlier on in the document.
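To illustrate the namespace point, here is a minimal sketch, not an actual font or any real table layout. The tiny document, the ids "d0", "g0" and "g1", and the idea of deriving an offset and length as an 'SVGO'-style entry might provide them are all assumptions made for the example. It shows that the substring for one glyph cannot be parsed on its own, because the xlink prefix it uses was declared earlier in the document, while the same element is available once the whole document has been parsed.

```python
import xml.etree.ElementTree as ET

# Hypothetical single-document font: one shared shape and two glyphs.
doc = (
    '<svg xmlns="http://www.w3.org/2000/svg" '
    'xmlns:xlink="http://www.w3.org/1999/xlink">'
    '<defs><path id="d0" d="M0 0h10v10z"/></defs>'
    '<g id="g0"><use xlink:href="#d0"/></g>'
    '<g id="g1"><use xlink:href="#d0" fill="red"/></g>'
    '</svg>'
)

# Stand-in for an offsets-table lookup (assumption): where glyph 1 starts and ends.
start = doc.index('<g id="g1">')
end = doc.index('</g>', start) + len('</g>')
fragment = doc[start:end]


def parses_alone(text):
    """Return True if the text is well-formed, namespace-aware XML by itself."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# The fragment uses the xlink prefix, which is only bound on the root element,
# so parsing the substring in isolation fails.
fragment_ok = parses_alone(fragment)

# Parsing the whole document resolves the prefix and yields a usable element.
root = ET.fromstring(doc)
glyph = next(el for el in root.iter('{http://www.w3.org/2000/svg}g')
             if el.get('id') == 'g1')
href = glyph[0].get('{http://www.w3.org/1999/xlink}href')
```

The sketch covers only namespace state. The style sheet issue mentioned above is a separate reason the preceding content is needed, and it is not modelled here.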



> Note that there is still nothing font-like about this SVG document --
> it contains no concept of glyphs, or Unicode characters, or ligatures.
> This is one of the goals of this whole effort: to use SVG's graphic,
> color and animation facilities only, and to have them be used in
> OpenType's font model.



I know that you are very concerned about the risk of the glyphs in SVG-in-OpenType fonts specifying metrics that conflict with the tables in the font, but I don't think this is really going to be an issue. We define exactly how the SVG markup is to be interpreted. We could easily use <glyph> elements if we wanted to and not even talk about any attributes that might exist in the SVG Fonts format. I think it is pretty unlikely that implementors will choose to honour any metric-looking attributes that exist within the SVG markup where we've defined them not to have any meaning.



> I'm not involved enough in xml circles to know whether access to
> elements without parsing all the preceding data is an area which has
> been investigated much. Certainly this issue isn't specific to fonts.



It's problematic, at least with plain XML.  (Certain binary serializations of the XML infoset might help, but I don't think we want to go down that road.)



> Also, I don't know enough about how an SVG renderer works to know
> whether providing the glyph offsets-and-lengths to it would indeed
> allow it to access glyphs quicker.



Right, so I think this wouldn't help in practice because of the issues to do with requiring all of the preceding part of the document to be parsed and available.



> However, imposing a simple structure to the SVG document, whether or
> not we have a separate offsets table, feels like the right thing to
> do.



My opinion is that we shouldn't impose any additional structure on the SVG document unless necessary, so that we don't need to have a big mode difference for SVG engines that are processing font documents versus all other documents.



> Any and all thoughts are welcome.



Regarding subsetting, (I think) I did point out in a previous mail that it is not trivial to do so when you have all glyphs in a single document, but it is definitely not impossible. The simplicity of having a single stream of bytes that gets parsed as XML and loaded as a document is appealing to me.
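As a rough sketch of why single-document subsetting is possible but not trivial, the following keeps the requested glyph elements plus whatever they reach through xlink:href, and drops other id-bearing elements. The function name subset, the sample document and its ids are hypothetical. It deliberately ignores references made from style sheets or through url(#...) values in attributes, which a real subsetter would also have to follow.

```python
import xml.etree.ElementTree as ET

XLINK_HREF = '{http://www.w3.org/1999/xlink}href'


def subset(svg_text, keep_glyph_ids):
    """Keep the given glyph ids and everything they reference via xlink:href."""
    root = ET.fromstring(svg_text)
    by_id = {el.get('id'): el for el in root.iter() if el.get('id')}

    # Walk references transitively, starting from the glyphs to keep.
    needed = set()
    stack = [gid for gid in keep_glyph_ids if gid in by_id]
    while stack:
        current = stack.pop()
        if current in needed:
            continue
        needed.add(current)
        for el in by_id[current].iter():
            target = el.get(XLINK_HREF, '')
            if target.startswith('#') and target[1:] in by_id:
                stack.append(target[1:])

    # Remove id-bearing elements that are neither needed nor contain
    # a needed element. Elements without an id are left in place.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.get('id') is None:
                continue
            if not any(d.get('id') in needed for d in child.iter()):
                parent.remove(child)

    return ET.tostring(root, encoding='unicode')


# Hypothetical input: two shared shapes, two glyphs.
doc = (
    '<svg xmlns="http://www.w3.org/2000/svg" '
    'xmlns:xlink="http://www.w3.org/1999/xlink">'
    '<defs><path id="d0" d="M0 0h10v10z"/><path id="d1" d="M0 0h5v5z"/></defs>'
    '<g id="g0"><use xlink:href="#d1"/></g>'
    '<g id="g1"><use xlink:href="#d0"/></g>'
    '</svg>'
)

result = subset(doc, ['g1'])
kept_ids = sorted(el.get('id') for el in ET.fromstring(result).iter()
                  if el.get('id'))
```

Keeping only "g1" here retains the shared shape it uses and drops the other glyph and the shape only that glyph used. Note that the whole document still has to be parsed before anything can be removed.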



> (I heard Mozilla had an intern that was exploring implementing some of
> this. Could s/he comment, if on this list?)



That work is going on in https://bugzilla.mozilla.org/show_bug.cgi?id=719286.




Thanks,



Cameron

Received on Wednesday, 22 February 2012 23:35:14 UTC