Re: [author-guide] Character Entity References Chart

Lachlan Hunt wrote:
>   I've begun to create the character entity reference chart designed for 
> the HTML5 authoring guide.  It is a very rough draft at the moment, but 
> it contains all the entity references, along with their characters and 
> equivalent numerical character references.
> 
> http://dev.w3.org/html5/html-author/charref

This is to clarify my current plans for the character chart and to 
specify exactly what I need assistance with.  For an outline of the 
planned features, see the bug report I filed for this.

http://www.w3.org/Bugs/Public/show_bug.cgi?id=5852

The HTML is generated from this XML file, which includes a lot of the 
character metadata.

http://www.w3.org/2003/entities/2007xml/unicode.xml

I've already included some of that data, such as the Unicode Block names 
and categories in the HTML.  The way I envision this working is that we 
have several different types of category filters, such as:

* Unicode Block
   http://unicode.org/Public/UNIDATA/Blocks.txt
* General Categories
   http://unicode.org/Public/UNIDATA/UCD.html#General_Category_Values
* The historical set groupings based on which DTD they were originally
   defined in (e.g. predefined, xhtml1-lat1, html5-uppercase, etc.).
   While these aren't entirely relevant to HTML5, they may provide a
   somewhat useful categorisation.
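
As a rough illustration of the General Category filter, here is a minimal 
sketch in Python (chosen only for illustration; the chart itself is 
HTML generated from the XML file).  It assumes the category comes from 
Python's bundled copy of the Unicode Character Database rather than from 
unicode.xml, and the function name group_by_general_category is 
hypothetical.  Block names are not available in the standard library, so 
they would have to be read from Blocks.txt separately.

```python
import unicodedata
from collections import defaultdict

def group_by_general_category(chars):
    """Map each General Category code (e.g. 'Lu', 'Sm') to its characters."""
    groups = defaultdict(list)
    for ch in chars:
        # unicodedata.category returns the two-letter General Category code.
        groups[unicodedata.category(ch)].append(ch)
    return dict(groups)

# A handful of sample characters; the real chart would pass every
# character that has a named reference.
sample = ["A", "\u00e9", "+", "\u00a0"]
groups = group_by_general_category(sample)
```

Selecting a category in the UI would then amount to showing only the 
characters in groups[code] for the chosen code.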

I can also include any other metadata from the Unicode Character 
Database, if it will be useful for authors in some way.

http://unicode.org/Public/UNIDATA/UCD.html

These categories need to be presented in some sort of list that allows 
the user to select them as a way to filter the characters.  This could 
be a sidebar, or a set of dropdown <select> lists, or a series of 
checkboxes, or anything else.  We just need to find one that is usable, 
functional and well designed.

There needs to be a search feature that can search on any character 
metadata.  Ideally it should be dynamic: as the user types part of a 
character's name, a code point, etc., the list of characters should be 
filtered to just those that match.
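
The matching logic described above could look something like the 
following sketch.  It is written in Python only to show the idea (the 
real feature would run in the browser), it covers just two kinds of 
metadata (character name and code point), and the function names 
matches and filter_characters are hypothetical.

```python
import unicodedata

def matches(ch, query):
    """Return True if query matches the character's name or code point."""
    q = query.strip().upper()
    if not q:
        return True  # an empty query leaves the list unfiltered
    name = unicodedata.name(ch, "")
    codepoint = f"U+{ord(ch):04X}"
    # Accept the code point with or without the "U+" prefix.
    hex_digits = q[2:] if q.startswith("U+") else q
    return q in name or codepoint == "U+" + hex_digits.zfill(4)

def filter_characters(chars, query):
    """Keep only the characters that match the query, preserving order."""
    return [ch for ch in chars if matches(ch, query)]

chars = ["\u200b", "\u00a0", "A"]
by_name = filter_characters(chars, "space")
by_codepoint = filter_characters(chars, "U+200B")
```

Re-running filter_characters on every keystroke gives the dynamic 
behaviour; a partial name such as "space" matches both ZERO WIDTH SPACE 
and NO-BREAK SPACE.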

Then, for each character, it needs to present:
* Character name
* All of the named and numerical character references
* An image of the glyph
* The character, as rendered by the browser
* Maybe a link to a page with more info about the character
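
To make the per-character data concrete, here is a small Python sketch 
that assembles the textual parts of such a record.  It is only an 
illustration under some assumptions: the named references are taken from 
Python's html.entities.html5 table, which reflects the published HTML5 
list and may differ from the draft list used for the chart, and the 
function name describe is hypothetical.  Glyph images and links are not 
covered.

```python
import unicodedata
from html.entities import html5

def describe(ch):
    """Collect the name and character references for one character."""
    # html5 maps reference names (e.g. 'amp;') to their characters; keep
    # only the forms that end with a semicolon.
    named = sorted(
        "&" + name
        for name, value in html5.items()
        if value == ch and name.endswith(";")
    )
    return {
        "name": unicodedata.name(ch, "(unnamed)"),
        "codepoint": f"U+{ord(ch):04X}",
        "named": named,
        "decimal": f"&#{ord(ch)};",   # decimal numeric character reference
        "hex": f"&#x{ord(ch):X};",    # hexadecimal numeric character reference
    }

info = describe("\u200b")
```

For U+200B this yields the decimal reference &#8203; and the hexadecimal 
reference &#x200B;, alongside whatever named references the table lists 
for that character.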

I can handle all the coding requirements; I just need help coming up 
with the user interface design.  So a simple Photoshop mockup of the 
page would be most useful.  Don't feel too constrained by the initial 
layout I made.  Although I would like the design to follow a few simple 
guidelines:

* Users will be interested in looking for characters, so the design
   needs to make the glyphs stand out.
* Don't clutter the page with too much secondary data.  Once the user
   has found the character they want, they can then obtain more details
   about it.  (This is why the current layout only shows the numeric
   character references on hover with the mouse.)
* Don't make it look ugly.  Pick a nice looking colour scheme and
   layout.
* An interactive layout would be most useful for the screen, but an
   alternative, non-interactive layout that shows all characters and
   metadata would be most useful for printing.

The following are the known issues with the current layout that need to 
be avoided:

* Some characters have up to 6 named character references, and some are
   really long.  These sometimes overflow the boxes in the current
   layout.
   - U+200B is one of the worst with these 5 long names:
     &ZeroWidthSpace; &NegativeVeryThinSpace; &NegativeThinSpace;
     &NegativeMediumSpace; &NegativeThickSpace;
* If the character name is to be made visible by default, the layout
   needs to handle some very long names.
* Depending on your system, the glyphs for many characters are missing
   or incorrect due to font issues.  (This is why we need glyph images.)
* Various browser bugs

-- 
Lachlan Hunt - Opera Software
http://lachy.id.au/
http://www.opera.com/

Received on Monday, 21 July 2008 12:44:35 UTC