Re: SVG Font criticism

Raph Levien wrote:
> 
> Dear SVG working group,
> 
>    I just had the chance to review Chapter 13 of the Aug 12 SVG spec
> (Fonts). I am concerned that a completely new and somewhat rough-edged
> new capability was slipped into the spec for its "Last Call."

It's not completely new. It is based on an existing W3C Recommendation,
CSS2, and the existing SVG path and symbol syntax. But I thank you for
taking the time to review it. This chapter came about in response to
criticisms of earlier SVG drafts.

>  My
> understanding of Last Call is that it was a time period to consolidate
> technical issues for which there had already been discussion. 

The point of last call is to engage other W3C Working groups to ensure
that we are using their specs correctly, and also to ensure that the
spec is accessible and adequately internationalised.

That is why the font chapter is in there for last call: otherwise there
would be a big hole in the internationalisation area, and in other
areas, as you have pointed out in the past.

> This is
> obviously not the case with the font changes. I believe these problems
> derive directly from the W3C's policy of holding working group
> discussions confidential.

W3C WG discussions are indeed confidential, so you were not privy to
them and cannot know whether this was discussed and, if so, how much and
by whom. However, now that a draft with these changes has been made
public, I welcome you to the wider discussion.

>    Now with the obligatory W3C process bashing out of the way, I'll go
> on to a technical criticism.

Thanks. Technical criticism, particularly incorporating concrete
suggestions for improvement, is what this list is for.

>    I think specifying a font directly in the SVG file is a good thing.

I would hope so. After all, we plan to avoid any "font shame" this way.

> Obviously, fonts specified in this way are far more likely to work
> correctly across different platforms. However, I have a number of
> criticisms of the font format itself.
> 
>    First and foremost, the lack of hinting represents a significant
> step backwards in functionality, especially compared to existing
> unencumbered solutions. 

If you have a technical proposal on how hinting can work in a
resolution-independent way for glyphs which can be arbitrarily rotated
and skewed and thus will not line up with an axis-aligned pixel grid, I
would be very interested to see it. Existing hinting mechanisms tend to
assume particular pixel sizes for the rendering, and do not work with
rotated characters. Which is not to say that it's impossible, of course,
just that we don't have a technical proposal before us to consider.

> I also would like to call attention to this
> language in the spec:
> 
>    SVG fonts contain unhinted font outlines. Because of this, on many
>    implementations there will be limitations regarding the quality and
>    legibility of text in small font sizes. For increased quality and
>    legibility in small font sizes, content creators may want to use an
>    alternate font technology, such as fonts that ship with operating
>    systems or an alternate web font format.
> 
>    I apologize for using technical language, but this is sheer
> bullshit. Using an "alternate font technology" brings up the same
> conformance issues as earlier drafts, and invites an interoperability
> nightmare. 

I think you should spend a little time looking at the CSS2 spec before
accusing it of being "sheer bullshit".

For example, I see nothing particularly wrong with 

font-family: "Some Obscure Font", "Some widespread font", "My SVG font",
sans-serif

"My SVG font" would be in SVG and the others would be in whatever
alternative font technology, such as TrueType, Type 1, whatever, the
implementation was able to process. The font-family declaration doesn't
mandate the format to be used.
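
As a rough illustration of that fallback behaviour, here is a minimal
Python sketch. The `pick_font` helper and the `available` table are
hypothetical, not anything from the CSS2 or SVG specs; the point is only
that the renderer walks the list in order and the format of each family
is its own business.

```python
def pick_font(families, available):
    """Return the first family, in preference order, that the renderer has."""
    for name in families:
        if name in available:
            return name
    return None


# The declaration from the text above, as an ordered preference list.
families = ["Some Obscure Font", "Some widespread font", "My SVG font", "sans-serif"]

# Hypothetical renderer: neither of the first two families is installed,
# but the SVG font embedded in the document is always present.
available = {"My SVG font": "svg", "sans-serif": "truetype"}

chosen = pick_font(families, available)
print(chosen, "in format", available[chosen])
```

A renderer that did have "Some widespread font" installed as, say, a
hinted Type 1 font would pick that instead and never touch the SVG font.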

> More honest language for this draft would be:
> 
>    SVG fonts contain unhinted font outlines. Because of this, on many
>    implementations there will be limitations regarding the quality and
>    legibility of text in small font sizes. There is no way to create
>    conformant SVG files using higher quality hinted fonts. For
>    increased quality and legibility in small font sizes, content
>    creators may want to use an alternate format other than SVG.

This is factually inaccurate. It is possible for a renderer to use high
quality hinted fonts, if it has any available.

Perhaps you could mention a suitable other graphics format which might
be used? Which does vector graphics, includes fonts and includes
hinting? It would be interesting to see what it does in this area,
assuming it has a specification available.

Or did you mean "an alternative font format other than SVG"? That would
make sense, but it would also be pretty much what the SVG spec says,
which is what you seem to be objecting to.

> Is the goal to completely specify a file format for
> best-practice quality graphics, or to provide a half-working kludge
> that really requires proprietary enhancements to deliver its full
> potential?

The former.

>    The Adobe charstring format dates back to 1985, and has acquired a
> large body of knowledge and font design tools. Over its evolution, it
> has acquired the features needed for rendering a wide range of
> scripts, including high quality CJK font rendering. In addition, there
> are no known intellectual property constraints, and many excellent
> free tools exist to parse, generate, and manipulate charstrings.
> 
>    It is most ironic that a free software developer is pushing Adobe
> technology on a specification, 

Yes, it is somewhat ironic. It's still potentially a good solution, if
the IPR issues can be declared non-existent by someone in a position to
authoritatively so declare them and if the idea of having two,
completely different ways to specify a Bezier curve, one in XML and one
not in XML, can be justified for an XML specification, and if
implementors are positive in response to the proposal.

> but that is exactly what I'm doing
> here. The <glyph> element should be allowed to include hex or base64
> encoded type2 charstring data (the difference is basically a 50%
> difference in uncompressed file size, either way probably much more
> compact than svg path syntax).

Thanks. It's good to see an actual concrete proposal here.
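
For reference, the size difference between the two encodings can be
checked with a short Python sketch. The 300 bytes below are arbitrary
filler, not a real Type 2 charstring; only the ratio matters.

```python
import base64
import binascii

# 300 arbitrary bytes standing in for charstring data (an assumption for
# illustration; this is not a valid Type 2 charstring).
charstring = bytes(range(256)) + bytes(44)

hex_len = len(binascii.hexlify(charstring))   # 2 characters per byte
b64_len = len(base64.b64encode(charstring))   # 4 characters per 3 bytes

print(hex_len, b64_len, hex_len / b64_len)
```

Hex takes two characters per byte and base64 four characters per three
bytes, so hex comes out 50% larger before any compression is applied.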

So, you are suggesting that all conformant implementations should be
required to parse and display hinted Type2 glyphs. What sort of hit
would that be on code size and implementation complexity, would you
estimate? For example, does Java2 provide any help here? Is there code
to do this as part of common OSes that could be called to help with
this, or would it require implementation from scratch? Lastly, what
would the rendering look like?

I think the major use of SVG fonts will be for larger point sizes, and
having seen some text converted to curves and displayed, with correct
antialiasing and correct gamma control, I think the results can look
very good. I have seen far worse font rendering in commercial systems,
such as the poor Type 1 rendering in most X implementations. It's also
possible to list an SVG font as a fallback, to be used only if a
particular renderer is unable to locate the higher-preference font
families, which would be in whatever font format the particular platform
supported.

>    My other major criticisms have to do with i18n. Specifying glyphs
> in terms of unicode sequences and using longest-match semantics for
> choosing ligatures makes sense, but is obviously inadequate for
> complex scripts. 

Yes, in the limit, any technology is inadequate for complex scripts.
There is always one other script, like Mayan or Rongo-Rongo, that needs
special rules.

On the other hand, the 80-20 rule can give substantial benefit for
moderate cost. Take a look at the OpenType spec, then take a look at the
Arial Unicode font and the Tahoma font and the Lucida Sans Unicode font
and see which OpenType features they have: Arabic and Han.

The SVG font spec deals with what those industry fonts can do -
ligatures, Arabic contextual forms, Unihan disambiguation, bidi,
vertical text. That's a fair chunk of functionality. Yes, support for
Indic scripts is not there. But it covers a good bunch of needs and
aligns well with practical, real-world industry attempts to hit the
middle ground between "just English" and "everything possible".
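
The longest-match ligature selection mentioned in the quoted text can be
sketched in a few lines of Python. This is an illustration under my own
assumptions, not the spec's algorithm text: the `select_glyphs` function
and the glyph names are hypothetical, and the table maps Unicode
sequences to glyph names the way glyph elements map sequences to
outlines.

```python
def select_glyphs(text, glyphs):
    """Greedy longest match: at each position, use the glyph whose
    Unicode sequence is the longest one matching the upcoming text."""
    longest = max(len(seq) for seq in glyphs)
    out = []
    i = 0
    while i < len(text):
        for n in range(min(longest, len(text) - i), 0, -1):
            seq = text[i:i + n]
            if seq in glyphs:
                out.append(glyphs[seq])
                i += n
                break
        else:
            # No glyph covers this character: emit a missing-glyph marker.
            out.append(".notdef")
            i += 1
    return out


# Hypothetical font: single-character glyphs plus two ligatures.
glyphs = {ch: ch for ch in "oficel"}
glyphs.update({"fi": "fi_lig", "ffi": "ffi_lig"})

print(select_glyphs("office", glyphs))
print(select_glyphs("file", glyphs))
```

In "office" the three-character sequence "ffi" wins over the shorter
"fi", which is the whole point of longest-match semantics.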

> This is not to say that complex scripts are
> impossible, just that they will generally be rendered a character at a
> time using the altglyph property, explicit x,y positioning, etc. This
> criticism also applies to placement of diacritical marks.

Yes, it's a good criticism and there is clearly room for incremental
improvement here in future versions. We should say more about diacritics,
although the impact of the W3C character normalisation model helps us
out here a lot.

>    The major consequence of this is that in many scripts, SVG text
> will basically be uneditable, as well as hugely expanded in file size.

In many *scripts*, yes. All the ones that current browsers don't even
attempt to display. But remember the 80:20 rule and apply it to
populations of Web users. For the majority of text on the Web today,
existing either as text or rendered into little GIFs, SVG will deal with
it.

Text in all western and eastern European languages, in Greek, in Hebrew,
in Arabic, in Japanese and Chinese and Korean, in Cyrillic scripts, in a
bunch of other scripts such as Native American scripts (Cree, Navaho,
etc.): all of that will be editable and not expanded in file size.

That's a significant market.

That's a significant capability for making graphically rich, well
internationalised illustrations, and it takes the industry enough on
from current capabilities while not trying to take it too far at once.

The alternative to doing this is that text in any language other than
English requires converting the text to curves, thus losing all
editability and searchability. The SVG font capability allows the
outlines to be stored, but also allows the text to be stored and to
remain editable.

>    Incidentally, "internationalization issues need to be addressed"
> doesn't exactly sound like Last Call language to me either.

I agree, and since a significant number of internationalisation issues
have been addressed, that language, which dates from the first public
draft, should be updated.

>    I think "isolated" is more usual terminology than "standard" as a
> value for the "arabic" attribute. 

Yes. Isolated is the term I have generally heard used most often and is
the term used in Daniels and Bright[1] which is as close to definitive
as a general work can be. OpenType uses the term standard.

> and Calling this attribute "arabic"
> may not be wise - Syriac (in the Unicode pipeline) has exactly the
> same structure of contextual forms.

Yes. Manchu has the same structure and Mongolian has a similar structure.
Again, "arabic" is what OpenType calls this feature. Suggestions for a
better name which is more inclusive, while still being readily
understandable, are welcome.

>    I do not understand the value in tagging the locale for glyphs in
> the han range. To mix different CJK languages, it makes the most sense
> (to me) to simply use different fonts, subsetted as necessary.

"Use different fonts" is one approach. Equally, most anything can be
done using separate fonts. But recall that the majority of the
glyphs are actually the same, and can be shared; one only needs to
indicate the ones which are different.
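
A minimal Python sketch of that sharing idea, with hypothetical names
throughout (`glyph_for`, the glyph identifiers and the language tags are
all assumptions for illustration): one shared table covers most
characters, and only the glyphs that differ by locale get an override.

```python
def glyph_for(char, lang, shared, overrides):
    """Use a locale-specific glyph if one is listed, else the shared one."""
    return overrides.get((char, lang), shared[char])


# Hypothetical font: two Han characters share one glyph each across
# locales, and one of them also has a Japanese-specific variant.
shared = {"\u4E00": "han_4E00", "\u76F4": "han_76F4"}
overrides = {("\u76F4", "ja"): "han_76F4_ja"}

print(glyph_for("\u76F4", "ja", shared, overrides))
print(glyph_for("\u76F4", "zh", shared, overrides))
print(glyph_for("\u4E00", "ja", shared, overrides))
```

With separate fonts per language, every shared glyph would have to be
duplicated in each subset instead.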

> Similar issues exist for the Arabic and Cyrillic locale variations,
> but only the "han" attribute exists.
> 
>    The exact interpretations of some tags are quite badly
> underspecified. One can easily imagine that the "arabic" contextual
> forms are to be interpreted according to the Unicode rules, but
> nowhere is this explicitly stated. 

OK, thanks. Yes, that was the intention, and I couldn't imagine anyone
drawing any other conclusion, but I agree that more explicit language
would be helpful.
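
To make the intended interpretation concrete, here is a deliberately
simplified Python sketch of choosing a contextual form from joining
behaviour. It assumes only three joining types (dual-joining,
right-joining, non-joining), covers four letters, and ignores
transparent marks and the other cases the full Unicode rules handle. The
function name and the form names are illustrative, not spec text.

```python
# Simplified joining types: D = dual-joining, R = right-joining
# (joins only to the preceding letter). Anything else is non-joining.
JOINING = {
    "\u0643": "D",  # ARABIC LETTER KAF
    "\u062A": "D",  # ARABIC LETTER TEH
    "\u0628": "D",  # ARABIC LETTER BEH
    "\u0627": "R",  # ARABIC LETTER ALEF
}


def contextual_forms(word):
    """Return one form name per character, in logical (reading) order."""
    types = [JOINING.get(ch, "U") for ch in word]
    forms = []
    for i, t in enumerate(types):
        # Joins to the previous letter if that letter can join forward.
        joins_prev = i > 0 and types[i - 1] == "D" and t in "DR"
        # Joins to the next letter only if this one is dual-joining.
        joins_next = i + 1 < len(types) and t == "D" and types[i + 1] in "DR"
        if joins_prev and joins_next:
            forms.append("medial")
        elif joins_prev:
            forms.append("terminal")
        elif joins_next:
            forms.append("initial")
        else:
            forms.append("isolated")
    return forms


# KAF, TEH, ALEF, BEH in logical order.
print(contextual_forms("\u0643\u062A\u0627\u0628"))
```

The final BEH comes out isolated because the ALEF before it does not
join forward.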

> Do other sets of Unicode rules also
> apply, for reordering in complex scripts, for example? For composition
> of Korean Hangul? This needs to be very explicitly stated.

Composition is what happens to characters. I would expect a Korean font
to supply glyphs for each precomposed hangul used. Of course, the
use/symbol feature can be used to good effect with composite glyphs such
as this (and also to break Han glyphs into radical strokes which are
re-used, for example).

>    Unicodes should be specified in hex, not decimal, as this is the
> standard for interchange, and because Unicode ranges are generally
> aligned at power-of-16 boundaries.

I agree completely.
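
A quick Python sketch shows why. The block boundaries below are the
standard Unicode ones; the `describe` helper is just for illustration.

```python
def describe(start, end):
    """Format a code point range in hex (U+ notation) and in decimal."""
    return ("U+%04X-U+%04X" % (start, end), "%d-%d" % (start, end))


blocks = {
    "Hebrew": (0x0590, 0x05FF),
    "Arabic": (0x0600, 0x06FF),
    "Hangul Syllables": (0xAC00, 0xD7AF),
}

for name, (start, end) in blocks.items():
    hex_range, dec_range = describe(start, end)
    print(name, hex_range, dec_range)
```

The Arabic block reads as U+0600-U+06FF in hex but as 1536-1791 in
decimal, where the block structure is no longer visible.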

>    I will have more criticisms of the SVG spec later, but wanted to
> fire these off now due to the limited time frame of the "Last Call"
> period.

(It's four weeks.) Then, having integrated all the feedback, we will be
in a position to go to Proposed Recommendation.

Thanks for your technical feedback, which was appreciated. Although most
of the issues you raise had in fact already been discussed internally,
it's good to see you bringing the same issues up and coming to similar
conclusions.

Regarding the suggestion to make the parsing and rendering of Adobe Type
2 glyphs mandatory for all SVG implementations, I would be interested to
hear arguments in favour and against this, particularly from
implementors. Implementations are already allowed to use this font
format, along with any other one they find convenient, for already
existing installed fonts.


[1] Daniels, Peter T; Bright, William "The World's Writing Systems",
Oxford University Press, 1996. Hardback, 922pp, illustrated, includes
index. ISBN 0-19-507993-0

--
Chris

Received on Saturday, 14 August 1999 20:22:23 UTC