Although handwriting is still used extensively in many domains, it has not, in general, made any widespread transition to digital form. Why, then, is there a significant need for a digital ink standard?
One significant trend influencing the exploration of digital ink is the rapidly growing demand for mobile computing. Small devices requiring some input mechanism have proliferated over the last five years. Also, many companies are exploring new interactive applications for existing devices, such as cell phones and pagers. Many of these applications will require a means for the user to input information. These small devices and associated applications differ from traditional computing platforms because they are too small to accommodate a keyboard. Other input modalities, including pen input, are being used increasingly as a substitute for keyboards.
In the last five years, another trend has been the increased use of computing platforms for informal communications. Ten years ago, most writing on computers was destined for a formal report or semi-formal e-mail. Social expectations generally demanded that they be typed. With the expanding use of the Internet for informal communications, ranging from e-mails about groceries to online chatting among teens, there is now a large amount of communications traffic leaving behind former social requirements on form.
At the same time that social expectations on form are becoming more relaxed, other social expectations are forcing people away from intrusive technologies. I.T. devices are becoming more mobile and ubiquitous, pushing into meetings and other settings where social considerations are important. In these settings, the use of keyboards is often socially unacceptable. Potential substitutes, such as voice and speech recognition, are even more impractical than keyboards. Additionally, the very low cognitive load required for taking handwritten notes is attractive in settings where the user's attention must be directed elsewhere.
These two trends, toward smaller devices and more informal use, are likely to drive the increased use of pen input and, in many cases, direct ink representations.
Another completely independent area of development that may fuel additional digital ink applications is the expansion of computer use in countries where ideographic writing is prevalent. Markets, such as Asia, are growing rapidly so there is demand for direct handwritten input of ideographic characters.
Other applications may become more popular as the technology becomes more ubiquitous. Currently, the increasing demand for collaboration among widely distributed personnel, as in distance education, motivates shared whiteboard applications. Some applications, including direct capture of freeform drawings, illustrations, and sketches, are inconvenient or impossible to use with a traditional keyboard and mouse interface. Even applications such as markup and annotation are difficult to use without a stylus.
So why is another standard necessary?
Several factors are important for the digital ink standard. The standard should:
As proposed by the (now defunct) Slate Corporation, Jot is a proprietary format that avoids any abstract characterization of ink.
The Scalable Vector Graphics standard (SVG) is an extremely rich language for describing vector images. This specification is easy to use when describing a page of ink data and how it should be rendered. However, it does not have any means for describing device characteristics and appears to be fairly heavyweight, both in viewer implementation and in data bandwidth.
The Vector Markup Language (VML) supported by some Microsoft products is similar to SVG and shares similar limitations. VML is not a standard.
Standards, such as SMIL and MPEG-7, do not actually address data representation. Instead, they provide for high-level descriptions of presentation scripts and searchable semantic features, respectively.
Image formats, such as JPEG and GIF, are relatively inefficient representations for ink from a size perspective. More importantly, image formats do not preserve the vector information required for many handwriting applications.
There is no existing standard with all the capabilities important for a digital ink standard. Nor does there appear to be any standard that could be readily extended to become a comprehensive digital ink standard. However, several standards should be considered while developing a new digital ink standard because they are complementary. SVG, MPEG-7, and SMIL are directly relevant to digital ink. The new standard should address how these existing and emerging standards should be used in relation to it. Additionally, ITU T.150 and UNIPEN, as well as other proprietary representations, should be considered from the point of view of feature coverage and transcription into the new representation.
Attributes have been organized into seven different levels of abstraction,
from low to high: Device Level, Point Level, Trace Level, Screen
Context Level, Derived Level, Packing Level, and Meta Level. Device level information
describes the digitizer. Point level information describes an individual
ink point. Trace level information describes a contiguous pen trace. Screen
Context (CaptureUI) level information describes the user interface when the ink
was collected. Derived level information corresponds to features that can
be derived from other lower-level features. Packing level information relates
to transmission and access issues. Finally, meta-level information corresponds
to data that typically conveys some semantics about the ink (for example,
who wrote the ink).
Attribute classes |
---|
|
The following tables list the attributes needed for various categories
of applications:
Command and Control |
---|
|
Handwriting Recognition |
---|
|
Authentication |
---|
|
Communication |
---|
|
Multimodal |
---|
|
Document Management |
---|
|
The opportunities for using ink as a communications medium are severely limited by the small number of users who are capable of exchanging ink data. There are a modest number of ink applications, each using a proprietary ink representation. Consequently, the audience for any given ink document that someone may author is exceedingly small, so the motivation for a user to invest in an ink-enabled device or application is limited.
With the establishment of a non-proprietary ink standard, it will be possible for device and application developers to support a common ink representation, in place of or in addition to their own proprietary representation. This would expand the audience for any particular user from the currently installed base of one device or application to the combined installed base of all devices and applications that have implemented the standard.
Intel intends to use this version of InkXML in an InkChat application to demonstrate the utility of InkXML. All of the code for this application will be made available via a royalty-free license to any ink application developers.
Currently, IBM and Motorola are using InkXML internally.
The following steps for the design and standardization of the next version of InkXML are recommended:
The structure of an ink-enabled system depends on constraints such as memory, processing, communication link speed, and application requirements. These constraints determine which primitive elements are created at each stage of the system.
In some instances, the pen hardware creates some of the InkXML primitive elements. However, if the pen hardware has small amounts of memory, it may present ink information in a proprietary serial format to the driver.
The driver may present primitive InkXML elements to the programming API (Application Programming Interface). However, if the driver also has limited memory, there might not be enough memory to create some of the primitive elements, such as chunks. In this case, other layers in the system might add these at a later time. The driver typically does not create any events other than time stamps.
The event handler and the programming API (in conjunction with the ink log generator) may add more information to the primitive elements, such as chunks and events.
The front-end application and SDK (Software Development
Kit) library make further modifications, such as grouping ink traces
into chunks or adding additional event tags.
An author may make comments next to the button, switch, and other device events to suggest how they might be interpreted by applications. For example, the switch on the end opposite the tip on a pen is typically mapped in the application as an erase event. However, the button should be recorded as a button event and not an erase event.
One additional item should be considered for channel descriptions, specifically for the force channel. It might be desirable to allow a non-linear transfer function to be identified, such as transfer="log" or transfer="sigmoid", to allow the encoding to represent a function of the channel value, rather than the direct value. Alternatively, the channel units might specify "log-newtons" or "tanh-dynes".
ScreenContext information is also essential to support interactive ink-enabled applications such as instant messaging and teleconferencing, where ink might be collected and displayed on at least two different devices. For example, consider a game of tic-tac-toe where one player is using a PDA and the other player is using a tablet PC. Clearly, the size of the board (the drawing canvas) on each device is going to be different, and the "X"/"O" marks made with the pen on one device will need to be transformed (scaled) for proper rendering on the other device.
id | A unique identifier for this ScreenContext |
---|---|
device | An input device is identified |
Canvas
id | A unique identifier for this canvas |
---|---|
extent | The left, top, right, and bottom coordinates of the canvas. Specified as four coordinates, x1, y1, x2, and y2, where (x1,y1) is the top-left corner and (x2,y2) is the lower-right corner. |
Mapping
id | A unique identifier for this mapping |
---|---|
transform | The standard 2x3 matrix representation for basic transformations; default is an identity matrix. |
View
id | A unique identifier for this view |
---|---|
bounding_path | The sequence of the vertices of a polygon representing the outline of the viewable area in the canvas. |
This group believes that ScreenContext as a separate (optional) block can be referenced by traces or chunks, using the id field. In this case, a default behavior for determining the scope of a given ScreenContext should be specified.
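The tic-tac-toe scenario above amounts to mapping points from one canvas extent onto another with a 2x3 transform. The following is a minimal sketch, not part of InkXML: it assumes the matrix is laid out as [[a, b, tx], [c, d, ty]] (the text does not fix the element order), and the function names and device sizes are hypothetical.

```python
from typing import List, Tuple

Extent = Tuple[float, float, float, float]  # (x1, y1, x2, y2): top-left, lower-right
Matrix = Tuple[Tuple[float, float, float], Tuple[float, float, float]]
Point = Tuple[float, float]


def canvas_to_canvas(src: Extent, dst: Extent) -> Matrix:
    """Build a 2x3 matrix that scales and translates the src extent onto dst."""
    sx1, sy1, sx2, sy2 = src
    dx1, dy1, dx2, dy2 = dst
    scale_x = (dx2 - dx1) / (sx2 - sx1)
    scale_y = (dy2 - dy1) / (sy2 - sy1)
    # Assumed layout: row 0 produces x', row 1 produces y'.
    return ((scale_x, 0.0, dx1 - scale_x * sx1),
            (0.0, scale_y, dy1 - scale_y * sy1))


def apply_transform(m: Matrix, points: List[Point]) -> List[Point]:
    """Apply the 2x3 matrix to each (x, y) point."""
    return [(m[0][0] * x + m[0][1] * y + m[0][2],
             m[1][0] * x + m[1][1] * y + m[1][2]) for x, y in points]


# Hypothetical canvases: a 160x160 PDA board and an 800x800 tablet board.
pda_canvas = (0, 0, 160, 160)
tablet_canvas = (0, 0, 800, 800)
pda_to_tablet = canvas_to_canvas(pda_canvas, tablet_canvas)

mark_on_pda = [(40, 40), (80, 80)]
print(apply_transform(pda_to_tablet, mark_on_pda))  # [(200.0, 200.0), (400.0, 400.0)]
```

An identity matrix, the stated default, corresponds to two canvases with the same extent.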
The simplest form of encoding specifies the x- and y-coordinates of each sample point. For compactness, it may be desirable to specify absolute coordinates only for the first point in the trace and use delta-x and delta-y values to encode subsequent points. Some devices record acceleration rather than absolute or relative position; some provide additional data that may be encoded in the trace, including z-coordinates or pressure or the state of side switches or buttons.
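The absolute-then-delta scheme described above can be sketched in a few lines. This is only an illustration with hypothetical function names; it covers x and y and ignores additional channels such as pressure or switches.

```python
from typing import List, Tuple

Point = Tuple[int, int]


def delta_encode(points: List[Point]) -> List[Point]:
    """Keep the first point absolute; store each later point as (delta-x, delta-y)."""
    if not points:
        return []
    encoded = [points[0]]
    for (prev_x, prev_y), (x, y) in zip(points, points[1:]):
        encoded.append((x - prev_x, y - prev_y))
    return encoded


def delta_decode(encoded: List[Point]) -> List[Point]:
    """Rebuild absolute coordinates by accumulating the deltas."""
    decoded: List[Point] = []
    x = y = 0
    for index, (a, b) in enumerate(encoded):
        if index == 0:
            x, y = a, b          # first entry is absolute
        else:
            x, y = x + a, y + b  # later entries are offsets
        decoded.append((x, y))
    return decoded


points = [(1125, 18432), (1148, 18475), (1178, 18510)]
encoded = delta_encode(points)
print(encoded)  # [(1125, 18432), (23, 43), (30, 35)]
```

The deltas are much smaller numbers than the absolute coordinates, which is where the compactness comes from.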
Within an InkXML file, traces are encoded using two tags. The <traceFormat> tag specifies the encoding format for each sample of the recorded traces, while <trace> tags represent the actual trace data.
For each channel, there is a <channel> element with the attributes name, type, default, and wildcard. These give the channel name, the encoding type (Boolean, decimal, or integer), the default value, and the interpretation of the wildcard character. The name attribute is required and specifies the channel described in the DeviceSpec (or one of the default channels described in the DeviceSpec section) to which this position in the traceFormat corresponds. The other attributes are optional. If they are omitted, the default type is decimal, the default value is zero, and the default wildcard interpretation is "lastValue" for required channels and "defaultValue" for optional channels.
A required channel must contain a value for each point. For example, x- and y-coordinates are likely to be required. Some channels may be recorded on an intermittent basis because their state changes infrequently; for example, the state of a pen switch is not likely to change often. In order to prevent the repeated recording of static channel values, these channels can be specified as optional channels. Required channels appear first in the <trace> followed by optional channels, if there are any. The optional channel values may be completely omitted, and a new point started immediately. In this case, it is assumed that all optional channels have been reported with wildcards. If optional channel values are reported, the optional group is preceded by a colon and ended with a semicolon. Optional channels are represented in order between the colon and semicolon. The list may be terminated early with the semicolon, and the unreported optional channels are interpreted with wildcards.
If there is no optionalChannels element, then there are no optional channels. In this case, the colon and semicolon delimiters are still allowed.
Required channels may be reported as explicit values, differences, or second differences. The default is explicit. There are prefix symbols that indicate the interpretation. The exclamation point indicates an explicit value, a single quote indicates a single difference, and a double quote prefix indicates a second difference. If there is no prefix, then the channel value is interpreted as explicit, difference, or second difference based on the last prefix for the channel.
A second difference encoding must be preceded by a single difference encoding, which in turn must be preceded by an explicit encoding.
Optional channels are always encoded explicitly, and prefixes are not allowed.
Both required and optional channels may be encoded with a wildcard character *.
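The wildcard character means that the channel takes either its default value or its previous value, according to the channel's wildcard interpretation. For an explicitly encoded channel, the previous value is simply repeated. For a channel encoded with differences, the channel continues integrating the previous velocity and acceleration values.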
Booleans are encoded as "T" or "F".
With this trace example, assume the traceFormat is:

    <traceFormat>
      <requiredChannels>
        <channel name="X" type="decimal" wildcard="lastValue"/>
        <channel name="Y" type="decimal" wildcard="lastValue"/>
      </requiredChannels>
      <optionalChannels>
        <channel name="S1" type="boolean" default="F" wildcard="lastValue"/>
        <channel name="S2" type="boolean" default="F" wildcard="lastValue"/>
      </optionalChannels>
    </traceFormat>

Then, this trace:

    <trace id = "4525BCD"> 1125 18432'23'43"7"-8 3-5+7 -3+6+2+6 8+3+6:T;+2+4:*T;+3+6+3-6:FF; </trace>

The trace is interpreted as follows:
Trace | X | Y | vx | vy | S1 | S2 | Comments |
---|---|---|---|---|---|---|---|
1125 18432 | 1125 | 18432 | ? | ? | F | F | //switch default values |
'23'43 | 1148 | 18475 | 23 | 43 | F | F | //velocity values |
"7"-8 | 1178 | 18510 | 30 | 35 | F | F | //acceleration Values |
3-5 | 1211 | 18540 | 33 | 30 | F | F | //implicit acceleration; whitespace token separator |
+7 -3 | 1251 | 18567 | 40 | 27 | F | F | //optional whitespace |
+6+2 | 1297 | 18596 | 46 | 29 | F | F | // |
+6 8 | 1349 | 18633 | 52 | 37 | F | F | //space instead of + |
+3+6:T; | 1404 | 18676 | 55 | 43 | T | F | //an optional value |
+2+4:*T; | 1461 | 18723 | 57 | 47 | T | T | //wildcard |
+3+6 | 1521 | 18776 | 60 | 53 | T | T | //optional keep last |
+3-6:FF; | 1584 | 18823 | 63 | 47 | F | F | //optionals |
One would not typically see both a "+" and a space used as a separator in the same trace or document, but it is legal.
An InkXML generator might also include additional whitespace formatting for clarity. The following trace specification is identical in meaning to the more compact version shown above:

    <trace id = "4525BCD"> 1125 18432 '23 '43 "7 "-8 3 -5 7 -3 6 2 6 8 3 6 :T; 2 4 : *T; 3 6 3-6 :F F; </trace>

In addition, alphabetic characters may be used to encode small negative and positive values. These may be substituted anywhere for an integer value between -25 and +25.

    <trace id="4525BCD"> 1125 18432'W'43"G"hCeGcFBFHCF:T;BD:*T;CFCf:FF; </trace>

Note that the true and false values for the side switches use symbols that are also used to encode numbers. However, they are unambiguous because of their location.
3.3.2.1 Grammar
The following notation is used to represent grammars:
Notation | Meaning |
---|---|
| | The vertical bar means logical "OR". |
_ | The underbar means explicit whitespace (all other spaces are for legibility only). |
[ ] | Empty brackets mean optional whitespace. |
( ) | Empty parentheses mean mandatory whitespace. |
(a | b | c) | Exactly one of a or b or c must occur. |
(a | b | c)+ | One or more of the options in ( ) must occur. |
(a | b | c)N | Exactly N of the options in ( ) must occur. |
(a | b | c)+N | One to N of the options in ( ) must occur. |
[a | b | c] | Zero or one of the options in [ ] may occur. |
[a | b | c]+ | Any number of the options in [ ] may occur. |
[a | b | c]+M | Zero to M of the options in [] may occur. |
Italics are symbols. | |
Non-italic bold are literals. |
The following is a draft grammar for the encoding scheme
using the above notation:
Grammar Rules | Description |
---|---|
digit::= (0..9) | Any single digit zero through nine |
sign::= [ + | - ] | A plus or minus sign |
integer::= [sign] (digit)+ | Leading zeros OK |
decimal::= [sign] (digit)+.[digit]+ | Mandatory leading digit, mandatory decimal point, leading zeros OK |
code::= (a..y | A..Z | *) | Single character code |
point::= (requiredPart)[optionalPart] | |
requiredPart::= (requiredValue)N | Exactly N requiredValues |
optionalPart::= : [optionalValue]M ; | Required colon…up to M optionalValues, then a required semicolon |
requiredValue::= [ ][qualifier] (value)[ ] | |
optionalValue::= [ ](value)[ ] | |
qualifier::= ( ! | ' | " ) | An exclamation point, single quote, or double quote |
value::= (integer | decimal | code) | |
token::= (requiredValue | optionalValue | : | ; ) | |
trace::= <trace ...> [[ ]point[ ]]++ </trace> |
Whitespace is optional before and after "requiredValue" and "optionalValue" tokens (unless it is required to separate two adjacent positive integer or decimal token values without + signs).
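The encoding rules and draft grammar above can be exercised with a small decoder. The following is a sketch under stated assumptions rather than a reference implementation: it handles two required channels (X, Y) and two Boolean optional channels with "lastValue" wildcards, as in the example traceFormat; the letter mapping (A..Y = +1..+25, a..y = -1..-25) is inferred from the example trace, so Z is rejected; wildcards in required channels are not handled; and all function names are hypothetical.

```python
import re

# One token: an optionally qualified value, or a ':' / ';' delimiter.
TOKEN = re.compile(r"""\s*(?:(?P<q>[!'"])?(?P<v>[+-]?\d+(?:\.\d*)?|[A-Za-z*])|(?P<d>[:;]))""")


def tokenize(body):
    """Split a trace body into (qualifier, value, delimiter) tuples."""
    body = body.strip()
    pos, tokens = 0, []
    while pos < len(body):
        match = TOKEN.match(body, pos)
        if match is None:
            raise ValueError(f"unexpected character at offset {pos}")
        tokens.append((match.group("q"), match.group("v"), match.group("d")))
        pos = match.end()
    return tokens


def to_number(text):
    """Convert an integer, decimal, or single-letter code to a number."""
    if len(text) == 1 and text.isalpha():
        if "A" <= text <= "Y":                 # assumed: A..Y = +1..+25
            return ord(text) - ord("A") + 1
        if "a" <= text <= "y":                 # assumed: a..y = -1..-25
            return -(ord(text) - ord("a") + 1)
        raise ValueError(f"no assumed mapping for code {text!r}")
    return float(text) if "." in text else int(text)


def decode_trace(body, num_required=2, optional_defaults=(False, False)):
    """Return a list of (X, Y, S1, S2) tuples for the example traceFormat."""
    tokens = tokenize(body)
    mode = ["!"] * num_required      # "!" explicit, "'" difference, '"' second difference
    value = [0] * num_required
    velocity = [0] * num_required
    switches = list(optional_defaults)
    points, i = [], 0
    while i < len(tokens):
        for ch in range(num_required):
            qualifier, text, _ = tokens[i]
            if text is None or text == "*":
                raise ValueError("expected a required value")
            i += 1
            if qualifier:
                mode[ch] = qualifier         # a prefix stays in force for the channel
            number = to_number(text)
            if mode[ch] == "!":
                value[ch] = number
            elif mode[ch] == "'":
                velocity[ch] = number
                value[ch] += velocity[ch]
            else:
                velocity[ch] += number       # the number is an acceleration
                value[ch] += velocity[ch]
        if i < len(tokens) and tokens[i][2] == ":":
            i += 1
            slot = 0
            while tokens[i][2] != ";":
                if tokens[i][1] != "*":      # "*" keeps the last value
                    switches[slot] = tokens[i][1] == "T"
                slot += 1
                i += 1
            i += 1                           # skip the ';'
        points.append((*value, *switches))
    return points


compact = """1125 18432'23'43"7"-8 3-5+7 -3+6+2+6 8+3+6:T;+2+4:*T;+3+6+3-6:FF;"""
lettered = """1125 18432'W'43"G"hCeGcFBFHCF:T;BD:*T;CFCf:FF;"""
points = decode_trace(compact)
print(points[-1])                        # (1584, 18823, False, False)
print(points == decode_trace(lettered))  # True
```

Keeping a per-channel mode mirrors the rule that an unprefixed value is interpreted according to the last prefix seen for that channel.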
On the other hand, if traces are not allowed to contain events, an event which takes place in the middle of a trace (for example, if the user pushes a side switch while writing) cannot be recorded in its order of occurrence.
Since many sources of digital ink are temporal, many digital ink records will have significant time information. The "current" or "cumulative" time may be expressed in several ways, depending on what is available at the time of capture. The most explicit expression of time is the use of the startTime attribute in any element. This is not an ideal solution and should be considered more carefully by the working group.
id | A unique identifier for this event |
---|---|
type | Event type (e.g., left-button press, change pen) |
value | The optional value specific to the event type (e.g., red) |
timestamp | The optional time at which the event occurred |
Chunks are a means to group, access, and reference groups of traces and allow applications to refer to and manipulate a group of traces as a single entity. Chunks facilitate I/O and streaming by marking a group of traces to be referenced as an entity. Chunks are a building block for semantic groupings and increase the speed of parsers and searches, since every trace inside a group does not need to be examined.
Chunking is the process of creating and inserting chunk tags in the primitive file. Any method, criterion, or algorithm may be used to create chunks. Each chunk tag contains one or more trace events. Chunking can be done at any layer or stage of capture or analysis.
Chunking should avoid incorrect rendering. Therefore, traces that reuse the coordinate space should not occur in the same chunk.
If a non-trace event occurs during the streaming of trace
events, place the event between chunk tags. This is done by ending the current
chunk tag, stating the non-trace event, and beginning a new chunk tag.
id | A unique identifier for this chunk |
---|---|
numTraces | The number of traces in a chunk |
numBytes | The number of bytes to the end of the chunk tag |
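The rule for non-trace events (end the current chunk, state the event, begin a new chunk) can be sketched as a simple pass over a stream. This is an illustration only: the ("trace" | "event", payload) item representation and the function name are hypothetical, and the grouping criterion here is simply "consecutive traces".

```python
from typing import List, Tuple

Item = Tuple[str, str]  # (kind, payload), where kind is "trace" or "event"


def chunk_stream(items: List[Item]) -> List[tuple]:
    """Group consecutive traces into chunks and place events between chunks."""
    output: List[tuple] = []
    current: List[str] = []
    for kind, payload in items:
        if kind == "trace":
            current.append(payload)
        else:
            if current:                       # end the current chunk
                output.append(("chunk", current))
                current = []
            output.append(("event", payload))  # state the non-trace event
    if current:                               # close the final chunk
        output.append(("chunk", current))
    return output


stream = [("trace", "t1"), ("trace", "t2"), ("event", "button-press"), ("trace", "t3")]
for kind, content in chunk_stream(stream):
    if kind == "chunk":
        print(kind, "numTraces =", len(content), content)
    else:
        print(kind, content)
```

Each emitted chunk contains at least one trace, and the length of its trace list is the value that would be recorded in numTraces.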
The goal of this section is to provide examples of how the following commonly used applications reference primitive elements:
Within a page, traces may be tagged and grouped to form larger semantic units for the purposes of searching and indexing, such as "keyword," "to-do," and "message." The same group of traces may be labeled with multiple tags and applied to overlapping or nested trace groups. For example, within the handwritten ink for a message with the text "Call Jane at 1 p.m.", the word "Jane" may also be tagged as a keyword. XML document structure lends itself to the creation of nested tag structures. Overlapping tags can be handled by introducing separate tags, which reference the same set of traces. As a general rule, if two groupings of common traces have containment semantics, such as a sentence or a paragraph, it would be appropriate to use nested tags. Otherwise, the use of separate tags is preferable.
Pages can also be assigned one or more tags, which apply to all of the traces on the page. Pages are accumulated into ink documents, which may contain pages of different sizes. Pages or traces from many ink documents may be composed arbitrarily to form new ink documents, which contain either referencing links back to the original pages or actual copies of the page data. Ink documents are accumulated into archives, which may be either shared (for example, a single archive containing all the documents for an entire department) or private (for example, Gary's ink documents).
Some devices, such as the CrossPad, SmartPad, and ThinkPad TransNote, create both a paper copy and the digital ink. In such cases, the ink document storage system may restrict the modification of ink documents to the physical modifications that can be performed to the paper copy (if synchronization of the two is important). If the paper contains a background against which the ink is written such as a form, the system must also be able to handle problems of registration such as the alignment of the handwritten ink with the fixed background image.
Retrieved documents may be exchanged with other InkXML applications; for example, traces in a signed ink document can be submitted for signature verification, traces can be sent via email or instant messaging, or ink documents can be annotated or marked up with ink, text, or audio.
In the context of pen command and control, InkXML can be used to represent both the input of a gesture or command and, perhaps, prototypes for gestures. Details of the action to be performed will be indicated by application-specific elements.
The pen input device can also be used to control a cursor, in the same way as a mouse.
An InkXML forms processing application captures user input in the form of stroke data formatted using InkXML. The captured data is sent to a handwriting recognition engine, and the resulting transcription is presented to the user (or a separate validator) for possible correction. This validation process can be made more efficient and accurate through the use of confidence information returned by the recognizer. Also, since it is often necessary to be able to recreate the appearance of the form as it was filled out, the InkXML stroke data may be archived along with the validated text.
In many cases, field data can be collected using form elements that do not require transcription, such as check boxes. In other cases, with a graphical format being both more compact and more easily understood, a textual representation may not be the best way to convey information. Diagrams and charts have been used to capture data on paper forms, and their use will continue when electronic forms are enabled for pen input. In these situations, the InkXML stroke data serves not only as an intermediate format, but also as the final format for the captured data.
One advantage of electronic forms over their paper counterparts is that they make it easier for multiple parties at separate locations to fill out a particular form instance or for a single individual to fill out a particular form instance at multiple locations or over an extended period of time. In these cases, where the form is presented and its data captured on devices with differing capabilities, multiple InkXML screen contexts may be composed when combining the ink data from the different form input sessions. A session is the time between when a user begins working with the system and when he or she stops working.
An ink-enabled forms processing application should integrate with existing and emerging standards for electronic forms. One such standard is XForms, which is discussed in Section 6.5.
The first improvement comes from the use of XML complex types to group related annotations. For instance, a data type "writerInfoType" is defined with elements such as <hand>, <sex>, <age>, <style>, <skill>, and <country>, which generally describe a given writer. Other complex types will be defined to group information describing the source of a set of ink images, information describing user interface elements affecting the ink capture (for example, guidelines as watermarks) and information describing the nature and structure of the data.
A second improvement will be realized through a more sophisticated labeling scheme. Labels can generally be of two main types: "machine"-type (for example, an interpretation generated by a recognition engine) and "human"-type (for example, a truth value assigned by the writer of the ink). Each label also has a (probability) score associated with it, which indicates the likelihood that the handwritten ink matches that particular label. Thus, a basic label might look like this:

    <label type="machine" source="NeuroScript_v1" score="0.85">hello</label>

An unlimited number of labels can be associated with a given ink image.
The third improvement will be the result of leveraging other industry standards. In particular, activities within the W3C Voice Browser Working Group have resulted in a specification of a grammar format [2] and work has been initiated on a lexicon format. Both grammars and lexicons are necessary elements for a handwriting recognition schema. For instance, a machine-generated label can have as an attribute the URI of the grammar used to generate the label.
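Returning to the labeling scheme, an application that receives several labels for one ink image will typically want the best-scoring label of a given type. The sketch below is illustrative only: the `<labels>` wrapper element, the second and third labels, and the function name are assumptions, and only the first label comes from the text above.

```python
import xml.etree.ElementTree as ET

# Hypothetical container holding several labels for one ink image.
xml_text = """<labels>
  <label type="machine" source="NeuroScript_v1" score="0.85">hello</label>
  <label type="machine" source="NeuroScript_v1" score="0.10">hallo</label>
  <label type="human" source="writer" score="1.0">hello</label>
</labels>"""


def best_label(root, label_type):
    """Return (text, score) of the highest-scoring label of the given type, or None."""
    candidates = [el for el in root.iter("label") if el.get("type") == label_type]
    if not candidates:
        return None
    top = max(candidates, key=lambda el: float(el.get("score", "0")))
    return top.text, float(top.get("score", "0"))


root = ET.fromstring(xml_text)
print(best_label(root, "machine"))  # ('hello', 0.85)
print(best_label(root, "human"))    # ('hello', 1.0)
```

Comparing the best machine label against a human label, when one exists, is one way such scores could support validation.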
[2] W3C Voice Browser Working Group. Speech Recognition Grammar Specification for the Speech Interface Framework. January 2001. This can be found on the Web at http://www.w3.org/TR/grammar-spec.
The most efficient way of representing raw ink is to use stroke (or trace) information. Storing bitmaps or any other graphical formats will either waste memory or lose relevant information.
To enrich the experience of using raw ink to communicate, one needs to capture information about the brush type (to preserve calligraphic effects), pen color, and pen width, as well as the x/y coordinates.
For a typical ink mail or ink messaging type of application, traces are grouped into messages of arbitrary size. Consider the following possibilities and requirements:
A digital signature is a binary code produced by a device to represent a person. This type of signature is an electronic version of a wax seal. Some methods of producing a digital signature include using a password, a PIN, a computer key, an electronic signature, an electronic key, a magnetic strip card or smart card, or other physical token. Typically, a digital signature is generated with a private key known only by the user.
On June 30, 2000, President Clinton signed the "Electronic Signatures in Global and National Commerce Act" (E-Sign Act), which gives digital signatures the same legal status as handwritten signatures. The act is technology-neutral, stating that an electronic signature is whatever is agreed upon by two parties. It can be an "electronic sound, symbol or process, attached to or logically associated with a contract or other record and executed or adopted by a person with the intent to sign the record." It could be anything from a PIN to a digital certificate accompanying an electronic signature to verify the identity of the signer.
Because an electronic signature is the electronic binding of an individual's identity to a contract, assurances must be made that the electronic signature is authentic (generated by the individual) and that the electronic contract will not be altered after signing. The establishment of standards for public key infrastructure (PKI) will assist in providing these assurances.
In spite of the fact that users have increasing access to these resources, identifying and managing them efficiently is becoming more difficult because of the sheer volume of the information. The question of identifying and managing content is not restricted to just database retrieval applications such as digital libraries, but also extends to other areas such as broadcast channel selection, multimedia editing, and multimedia directory services.
Digital ink can be used for representing content, retrieving existing content, or both. Examples of representing content include creating handwritten notes, filling out predefined forms using an electronic pen, and annotating other content. Examples of using digital ink to retrieve existing content include presenting handwritten queries to a retrieval system.
The MPEG-7 standard, formally called "Multimedia Content Description Interface," provides a rich set of standardized tools to describe multimedia content. Both human users and automatic systems that process audiovisual information are within the scope of MPEG-7, which is described in Section 6.1.
There is still an ongoing debate about the pros and cons of binary XML. Some developers within the XML community express concerns that the original XML design goals will be compromised with a binary encoding (for example, being readable by people) and suggest that there is a lack of real-world test profiles documenting the space- and/or time-saving benefits [1]. Others say that the processing inefficiencies stemming from the document size will preclude widespread adoption of XML. Although deriving a generally useful encoding (one that is most effective on a large variety of documents) is considered a difficult problem, some research and development work in this area is available. The XMill system [2], developed at the AT&T Labs, is based on a user-controllable transformation that exposes document redundancy followed by a standard text compression operation. For instance, users can specify a "pre-compress" transformation that converts strings corresponding to numeric values into their binary integer representation.
One drawback of XMill discussed by Cheney [3] is that it precludes the incremental processing of the XML document. Cheney proposed an online encoder called ESAX that compresses better and faster than XMill. ESAX leverages the work of a general SAX parser by using an encoding scheme where element start tags, end tags, attribute names, and events are represented by single byte code-words. The encoder and decoder both maintain a table of known symbols. The encoder informs the decoder whenever a new symbol is encountered.
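The shared symbol-table idea described above (single-byte code-words, with new symbols announced to the decoder) can be illustrated with a toy encoder. This is not ESAX and does not follow its actual byte format; the reserved code 0, the length-prefixed announcement, and the function names are all assumptions made for the sketch.

```python
NEW_SYMBOL = 0  # assumed reserved code-word that announces a new symbol


def encode_symbols(symbols):
    """Replace each known symbol by a one-byte code; announce new symbols inline."""
    table = {}
    out = bytearray()
    for symbol in symbols:
        if symbol in table:
            out.append(table[symbol])
            continue
        code = len(table) + 1
        raw = symbol.encode("utf-8")
        if code > 255 or len(raw) > 255:
            raise ValueError("toy format supports at most 255 short symbols")
        table[symbol] = code
        # The announcement itself counts as the first occurrence of the symbol.
        out += bytes([NEW_SYMBOL, len(raw)]) + raw
    return bytes(out)


def decode_symbols(data):
    """Rebuild the symbol sequence, growing the table in step with the encoder."""
    table = {}
    out = []
    i = 0
    while i < len(data):
        code = data[i]
        i += 1
        if code == NEW_SYMBOL:
            length = data[i]
            i += 1
            name = data[i:i + length].decode("utf-8")
            i += length
            table[len(table) + 1] = name
            out.append(name)
        else:
            out.append(table[code])
    return out


tags = ["<trace>", "</trace>", "<trace>", "</trace>"]
encoded = encode_symbols(tags)
print(len("".join(tags)), len(encoded))  # 30 21
print(decode_symbols(encoded) == tags)   # True
```

Because both sides add symbols in the same order, no table needs to be transmitted ahead of the document, which is what allows incremental processing.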
Similar to ESAX, the WAP Binary XML (WBXML) Encoding format specifies how XML elements and attribute tags are to be tokenized with respect to a symbol table (the Code Space) [4]. The actual code-words corresponding to the different tags are not defined within the WBXML specification because they are specific to a given document-type. For instance, the Wireless Markup Language (WML) Specification lists the WBXML-based codes that represent the different WML tags [5]. Binary formats, such as WML, are considered useful because applications deal with pre-defined element tokens directly, so there is no overhead for building an on-the-fly dictionary.
Character data content is not compressed in the WBXML format. It is transmitted as inline strings or as a reference to an entry in a string table, which is included at the beginning of the document. The Millau system [6] proposes an extension to WBXML where character data is transmitted on a separate stream (the Content Stream), so a standard text compression algorithm can be used. A special token inside the main data stream (the Structure Stream) indicates the presence of compressed character data.
However, for a typical InkXML document, the ink data (not the character data) must be considered when designing an efficient binary encoding format. There are two distinct modes for coding digital ink: raster scanning and curve tracing. Facsimile coding algorithms belong to the first mode and exploit the correlation between consecutive scan lines. Chain Coding (CC), which belongs to the second mode, represents the pen trajectory as a sequence of transitions between successive points in a regular lattice. Curve tracing algorithms are known to achieve higher coding efficiency if the total trace length is not too long. Furthermore, the use of a raster-based technique implies the loss of all time-dependent information.
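As an illustration of chain coding, the following Python sketch encodes a pen trajectory as 8-direction lattice transitions. The direction numbering and the way larger jumps between samples are filled in with unit steps are assumptions of this sketch; they are not taken from any particular standard.

```python
from typing import List, Tuple

Point = Tuple[int, int]

# Assumed numbering: code 0 is +x, then counter-clockwise in 45-degree steps.
DIRECTIONS: List[Point] = [(1, 0), (1, 1), (0, 1), (-1, 1),
                           (-1, 0), (-1, -1), (0, -1), (1, -1)]
CODE_OF = {step: code for code, step in enumerate(DIRECTIONS)}


def sign(v: int) -> int:
    return (v > 0) - (v < 0)


def chain_encode(points: List[Point]) -> Tuple[Point, List[int]]:
    """Return the start point and the direction codes of the lattice path."""
    if not points:
        raise ValueError("empty trace")
    x, y = points[0]
    codes: List[int] = []
    for tx, ty in points[1:]:
        # Walk one lattice step at a time until the next sample is reached.
        while (x, y) != (tx, ty):
            step = (sign(tx - x), sign(ty - y))
            codes.append(CODE_OF[step])
            x, y = x + step[0], y + step[1]
    return points[0], codes


def chain_decode(start: Point, codes: List[int]) -> List[Point]:
    """Rebuild every lattice point visited by the chain."""
    x, y = start
    path = [start]
    for code in codes:
        dx, dy = DIRECTIONS[code]
        x, y = x + dx, y + dy
        path.append((x, y))
    return path
```

Each transition needs only three bits, and the order of the codes preserves the direction of pen travel, which a raster representation cannot do.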
In summary, this working group believes that a binary encoding of InkXML (say, InkXMLb) can be built from the experiences summarized above. In particular, this group suggests that a WBXML-compliant code space be designed for the tags, attribute names, and attribute values defined in the InkXML Schema. Compressed ink traces can be included as opaque binary data. The specific algorithm used to compress the ink will be indicated using an ID field. Based on the ITU-T T.150 recommendation [7], two standardized algorithms will be offered: one lossy and one lossless. Recommendation of these standardized trace encoding algorithms does not preclude the use of other proprietary ones that might offer higher efficiency, as long as such offerings operate on traces and adhere to the InkXMLb Code Space specification.
[2] H. Liefke and D. Suciu. "XMill: An efficient compressor for XML data." Proceedings of the 2000 ACM SIGMOD International Conference on the Management of Data. 2000. See also www.research.att.com/sw/tools/xmill.
[3] J. Cheney. Compressing XML with Multiplexed Hierarchical PPM Models. www.cs.cornell.edu/People/jcheney/xmlppm/paper/paper.html.
[4] Wireless Application Protocol Forum. Binary XML Content Format Specification. Version 1.3, May 2000. Available at www.wapforum.org/what/technical.htm.
[5] Wireless Application Protocol Forum. Wireless Markup Language Specification. Version 1.3, February 2000. Available at www.wapforum.org/what/technical.htm.
[6] M. Girardot and N. Sundaresan. "Millau: An encoding format for efficient representation and exchange of XML over the Web." Proceedings of the Ninth International World Wide Web Conference (WWW9). Amsterdam, The Netherlands. May 2000.
[7] International Telecommunication Union. T-150 Telewriting Terminal Equipment. 1993. Available at www.itu.int/itudoc/itu-t/rec/t/t150.html.
It is clear that the working group should seek recognition from the MPEG group of InkXML as the preferred format for electronic ink, while the MPEG group provides feedback for the refinement of the InkSegmentDS.
The SVG statement <path d="M 100 100 L 300 100 L 200 300" style="stroke: blue; stroke-width:0.1"/> describes a triangle. The Move command (M) sets the starting point at (100,100). The first Lineto (L) command draws a line from the starting point to the second point (300,100), and the second draws a line from the second point to a third point (200,300). The line color is defined as blue, with a line width of 0.1. The line segments form an open path. A capital "M" or "L" means absolute position, and a lowercase "m" or "l" means relative position.
An InkXML trace first gives an absolute position followed by deltas, each of which gives a position relative to the preceding point, starting from the first absolute position. The SVG representation preserves this structure.
<trace color="0 0 255" brushShape="SQUARE" brushSize="3"> 234 122 12 2 12 14 -3 0 15 </trace>
<path d="M 234 122 l 2 12 -3 0" stroke="blue" stroke-width="0.1" stroke-linecap="square"/>
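The conversion from the trace to the SVG path shown above can be sketched in Python. The trace carries nine values while the path uses only six, so this sketch assumes that each sample has three channels (x, y, and a third channel that is not drawn); that reading, and the function name trace_to_svg_path, are assumptions rather than part of the InkXML definition.

```python
from typing import List


def trace_to_svg_path(trace_text: str, n_channels: int = 3) -> str:
    """Build an SVG path 'd' string from an absolute-then-deltas trace.

    Assumes the first two channels of every sample are x and y; any
    further channels are ignored.
    """
    values: List[int] = [int(v) for v in trace_text.split()]
    if n_channels < 2 or not values or len(values) % n_channels != 0:
        raise ValueError("trace length does not match the channel count")
    samples = [values[i:i + n_channels]
               for i in range(0, len(values), n_channels)]
    # The first sample is absolute and maps to the Move command.
    d = f"M {samples[0][0]} {samples[0][1]}"
    # The remaining samples are deltas and map to one relative lineto.
    deltas = " ".join(f"{s[0]} {s[1]}" for s in samples[1:])
    if deltas:
        d += f" l {deltas}"
    return d


print(trace_to_svg_path("234 122 12 2 12 14 -3 0 15"))
# prints "M 234 122 l 2 12 -3 0"
```

Because SVG's lowercase "l" is already relative to the current point, the deltas can be copied without accumulation.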
This proprietary format allowed for a very accurate description of electronic ink. Furthermore, Jot was "lightweight" because of its binary representation. InkXML, on the other hand, is an open format that supports a wide variety of ink properties. Because its binary mode is an optional layer, it still appeals to application developers who object to a binary encoding of ink.
Finally, Jot does not support any abstract characterization of the ink in the way that will be possible through InkXML application-specific schemas.
InkXML will be able to provide the same rich annotation possible with the current UNIPEN format by means of an application-specific file definition (see Section 4.4, Handwriting Recognition). InkXML is an improvement over UNIPEN because it replaces UNIPEN's flat attribute organization with a record-like structure, supports a more sophisticated labeling scheme, and leverages other standards.
This group envisions several points of intersection between InkXML and XForms. Although the working group is aware of many others (such as ink signature capture for authentication), this draft addresses two points: (1) the use of ink to collect alphanumeric data in an XForms processor and (2) the collection of ink as ink instance data in an XForms model.
In such a system, the ink capture module would collect the users' input ink as InkXML data, which would be passed to the handwriting recognizer along with any syntactic constraint information provided by the form model and any additional constraints added by the ink-centric presentation. Then, the returned results and confidence scores would be presented to the user for verification. When the form is completed or suspended, the instance data would be submitted with the ink stored optionally in an InkXML format for archival purposes.
Structurally, the recognizer may be a module within the XForms processor, or the recognizer could be a service that the processor invokes remotely. Recognition may be performed immediately, as in the previous scenario. This facilitates the presentation of dynamic user interfaces supported by XForms. On the other hand, the recognition also might be done offline if recognition services are not available to the client.
Ink instance data may require the introduction of ink-specific static facets, such as the dimensions of the input area, the number of strokes in the input, or the duration of the input.
The group should also work to address any ink-specific concerns with respect to XForms' suspend/resume feature. For example, when a form is suspended and later resumed on a device with different display or capture characteristics, the screen contexts for the two devices must be combined appropriately. Also, if a form is suspended by one writer and resumed by another, it may be useful to preserve the writer information (who wrote this ink?) along with the ink data for the purposes of customizing the handwriting recognition process.
<ref src="file" type="MIME-Type/Subtype" region="r" begin="3s" ...>
This option requires the existence of an appropriate MIME content type/subtype registered with IANA. None of the currently registered types/subtypes appears to fit the needs of electronic ink; however, "video" might be the closest.
In addition to the desirability of having an InkXML MIME type, there are some potential benefits to having a special SMIL tag instead of using ref. For instance, such a tag could have an attribute that allows the control of the duration of the animated ink in a non-standard way, such as over-writing the time information already encoded within the ink file.
[2] L. Hardman. A SMIL Tutorial. 1998. Available at http://www.cwi.nl/~media/SMIL/Tutorial.
VML, like SVG, is heavyweight and does not have sufficient means to represent a multitude of handwriting applications. VML is not a standard.
The following is the VML polyline template:
<polyline points="0 0 10 10 20 0" id=null href=null target=null class=null title=null alt=null style='visibility: visible' opacity="1.0" chromakey="null" stroke="true" strokecolor="blue" strokeweight="1" fill="true" fillcolor="white" print="true" coordsize="1000,1000" coordorigin="0 0" />
An InkXML trace first gives an absolute position followed by deltas, each of which gives a position relative to the preceding point, starting from the first absolute position. The first VML example represents the InkXML trace as a path that starts from an absolute position. The second example defines a polyline as a sequence of absolute points.
<trace color="0 0 255" brushShape="SQUARE" brushSize="3"> 234 122 12 2 12 14 -3 0 15 </trace>
<path d="M 234 122 l 2 12 -3 0" stroke="blue" stroke-width="0.1" stroke-linecap="square"/>
<polyline points="234 122 236 134 233 134" style='visibility: visible' opacity="1.0" chromakey="null" stroke="true" strokecolor="blue" strokeweight="0.1" fill="false" print="true" coordsize="1000,1000" coordorigin="0 0" />
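Unlike the path form, the polyline needs absolute points, so the deltas have to be accumulated. The Python sketch below shows one way to do this. It assumes three channels per sample (x, y, and a third channel that is not drawn), which is an interpretation of the nine-value trace above rather than something the text states; the function name is hypothetical.

```python
from typing import List, Tuple


def trace_to_polyline_points(trace_text: str, n_channels: int = 3) -> str:
    """Accumulate an absolute-then-deltas trace into absolute x y pairs."""
    values: List[int] = [int(v) for v in trace_text.split()]
    if n_channels < 2 or not values or len(values) % n_channels != 0:
        raise ValueError("trace length does not match the channel count")
    # The first sample is the absolute starting position.
    x, y = values[0], values[1]
    points: List[Tuple[int, int]] = [(x, y)]
    # Every later sample is a delta from the previous point.
    for i in range(n_channels, len(values), n_channels):
        x += values[i]
        y += values[i + 1]
        points.append((x, y))
    return " ".join(f"{px} {py}" for px, py in points)


print(trace_to_polyline_points("234 122 12 2 12 14 -3 0 15"))
# prints "234 122 236 134 233 134"
```

The result matches the points attribute of the polyline example.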
The two methods for encoding the handwritten ink that are included in the ITU-T.150 recommendation are:
Speech recognition systems make use of contextual information in the form of grammars. A grammar allows the specification of words and patterns of words that the recognizer should expect. Recently, the W3C Voice Browser Working Group suggested an XML-based syntax for representing BNF-like grammars [1] (www.w3.org/TR/grammar-spec).
The current W3C grammar specification includes a mode attribute with "speech" or "dtmf" as possible values. The mode attribute indicates how to interpret tokens contained by the grammar. For instance, speech tokens are expected to detect speech audio that sounds like the token. One possible way to increase awareness of the important role of grammars in handwriting recognition systems is to suggest a third value for the mode attribute, namely "hwr".
Then, APIs for handwriting recognition engines could be standardized to take two inputs: ink in InkXML format and a grammar in W3C format. Thus, it is desirable that members of this committee be invited to participate in the W3C activities related to the grammar specification.
<xsd:schema xmlns:xsd="http://www.w3.org/2000/10/XMLSchema">
  <xsd:annotation>
    <xsd:documentation xml:lang="en">
      Primitive file InkXml Schema
    </xsd:documentation>
  </xsd:annotation>
  <xsd:element name="inkxml" type="inkxmlType"/>
  <!-- Main document type declaration -->
  <xsd:complexType name="inkxmlType">
    <xsd:sequence>
      <xsd:element name="deviceInfo" minOccurs="0" maxOccurs="1" type="deviceInfoType"/>
      <xsd:element name="channelList" type="channelNameListType"/>
      <xsd:element name="screenContext" minOccurs="0" maxOccurs="unbounded" type="screenContextType"/>
      <xsd:choice maxOccurs="unbounded">
        <xsd:element name="trace" minOccurs="0" maxOccurs="unbounded" type="traceType"/>
        <xsd:element name="chunk" minOccurs="0" maxOccurs="unbounded" type="chunkType"/>
      </xsd:choice>
    </xsd:sequence>
  </xsd:complexType>
  <!-- Information about the transducer device -->
  <xsd:complexType name="deviceInfoType">
    <xsd:sequence>
      <xsd:element name="sampleRate" minOccurs="0" maxOccurs="1" type="xsd:decimal"/>
      <xsd:element name="sampleMode" minOccurs="0" maxOccurs="1" type="samplingModeType"/>
      <xsd:element name="channelInfo" minOccurs="0" maxOccurs="1" type="channelsInfoType"/>
    </xsd:sequence>
    <xsd:attribute name="manufacturer" type="xsd:string"/>
    <xsd:attribute name="model" type="xsd:string"/>
  </xsd:complexType>
  <!-- Name of known sampling modes -->
  <xsd:simpleType name="samplingModeType">
    <xsd:restriction base="xsd:string">
      <xsd:enumeration value="UNIFORM"/>
      <xsd:enumeration value="NONUNIFORM"/>
    </xsd:restriction>
  </xsd:simpleType>
  <!-- Name of data channels a device can be capable of reporting -->
  <xsd:simpleType name="channelNameType">
    <xsd:restriction base="xsd:string">
      <xsd:enumeration value="X"/>
      <xsd:enumeration value="Y"/>
      <xsd:enumeration value="F"/>
      <xsd:enumeration value="U"/>
      <xsd:enumeration value="V"/>
    </xsd:restriction>
  </xsd:simpleType>
  <!-- List of data channels that do appear in the file -->
  <xsd:simpleType name="channelNameListType">
    <xsd:list itemType="channelNameType"/>
  </xsd:simpleType>
  <!-- Name of event types a device can be capable of reporting -->
  <xsd:simpleType name="eventNameType">
    <xsd:restriction base="xsd:string">
      <xsd:enumeration value="time"/>
      <xsd:enumeration value="penChange"/>
      <xsd:enumeration value="switchButton"/>
      <xsd:enumeration value="modeChange"/>
    </xsd:restriction>
  </xsd:simpleType>
  <!-- Information on channel(s) characteristics -->
  <xsd:complexType name="channelsInfoType">
    <xsd:sequence>
      <xsd:element name="channel" minOccurs="1" maxOccurs="unbounded">
        <xsd:complexType>
          <xsd:sequence>
            <xsd:element name="range" type="xsd:decimal"/>
            <xsd:element name="resolution" type="xsd:integer"/>
            <xsd:element name="accuracy" type="xsd:decimal"/>
            <xsd:element name="eventList">
              <xsd:simpleType>
                <xsd:list itemType="eventNameType"/>
              </xsd:simpleType>
            </xsd:element>
          </xsd:sequence>
          <xsd:attribute name="chName" type="channelNameType" use="required"/>
          <xsd:attribute name="type">
            <xsd:simpleType>
              <xsd:restriction base="xsd:string">
                <xsd:enumeration value="BOOLEAN"/>
                <xsd:enumeration value="INTEGER"/>
                <xsd:enumeration value="DECIMAL"/>
              </xsd:restriction>
            </xsd:simpleType>
          </xsd:attribute>
        </xsd:complexType>
      </xsd:element>
    </xsd:sequence>
  </xsd:complexType>
  <!-- The possible states of the pen for a given ink trace -->
  <xsd:simpleType name="inkStateType">
    <xsd:restriction base="xsd:string">
      <xsd:enumeration value="penUp"/>
      <xsd:enumeration value="penDown"/>
      <xsd:enumeration value="continue"/>
    </xsd:restriction>
  </xsd:simpleType>
  <!-- The possible colors of a "penDown" ink trace -->
  <xsd:simpleType name="inkColorType">
    <xsd:restriction base="inkColorChList">
      <xsd:length value="3"/>
    </xsd:restriction>
  </xsd:simpleType>
  <xsd:simpleType name="inkColorChList">
    <xsd:list itemType="inkColorCh"/>
  </xsd:simpleType>
  <xsd:simpleType name="inkColorCh">
    <xsd:restriction base="xsd:nonNegativeInteger">
      <xsd:maxInclusive value="255"/>
    </xsd:restriction>
  </xsd:simpleType>
  <!-- Attributes associated with ink traces -->
  <xsd:attributeGroup name="traceAttributesType">
    <xsd:attribute name="type" type="inkStateType"/>
    <xsd:attribute name="color" type="inkColorType"/>
    <xsd:attribute name="brushShape">
      <xsd:simpleType>
        <xsd:restriction base="xsd:string">
          <xsd:enumeration value="DISC"/>
          <xsd:enumeration value="SQUARE"/>
        </xsd:restriction>
      </xsd:simpleType>
    </xsd:attribute>
    <xsd:attribute name="brushSize" type="xsd:decimal"/>
    <xsd:attribute name="screenContextRef" type="xsd:string"/>
  </xsd:attributeGroup>
  <!-- Simple list of point coordinates -->
  <xsd:simpleType name="basePtCoordListType">
    <xsd:list itemType="xsd:integer"/>
  </xsd:simpleType>
  <!-- Holder of ink point coordinate sequence -->
  <xsd:simpleType name="ptCoordListType">
    <xsd:restriction base="basePtCoordListType">
      <xsd:minLength value="2"/>
    </xsd:restriction>
  </xsd:simpleType>
  <!-- Main ink trace definition. Reader must use 'channelList' -->
  <!-- field in order to properly parse this list. -->
  <xsd:complexType name="traceType">
    <xsd:simpleContent>
      <xsd:extension base="ptCoordListType">
        <xsd:attribute name="id" type="xsd:ID"/>
        <xsd:attributeGroup ref="traceAttributesType"/>
      </xsd:extension>
    </xsd:simpleContent>
  </xsd:complexType>
  <!-- Main ink chunk definition -->
  <xsd:complexType name="chunkType">
    <xsd:sequence>
      <xsd:element name="trace" minOccurs="1" maxOccurs="unbounded" type="traceType"/>
    </xsd:sequence>
    <xsd:attribute name="id" type="xsd:ID"/>
  </xsd:complexType>
  <!-- Main screenContext type declaration -->
  <xsd:complexType name="screenContextType">
    <xsd:sequence>
      <xsd:element name="canvas">
        <xsd:complexType>
          <xsd:sequence>
            <xsd:element name="x1" type="xsd:nonNegativeInteger"/>
            <xsd:element name="y1" type="xsd:nonNegativeInteger"/>
            <xsd:element name="x2" type="xsd:nonNegativeInteger"/>
            <xsd:element name="y2" type="xsd:nonNegativeInteger"/>
          </xsd:sequence>
          <xsd:attribute name="id" type="xsd:ID"/>
        </xsd:complexType>
      </xsd:element>
      <xsd:element name="mapping" minOccurs="0" maxOccurs="1">
        <xsd:complexType>
          <xsd:sequence>
            <xsd:element name="t00" type="xsd:integer"/>
            <xsd:element name="t01" type="xsd:integer"/>
            <xsd:element name="t10" type="xsd:integer"/>
            <xsd:element name="t11" type="xsd:integer"/>
            <xsd:element name="t20" type="xsd:integer"/>
            <xsd:element name="t21" type="xsd:integer"/>
          </xsd:sequence>
          <xsd:attribute name="id" type="xsd:ID"/>
        </xsd:complexType>
      </xsd:element>
    </xsd:sequence>
    <xsd:attribute name="id" type="xsd:ID" use="required"/>
  </xsd:complexType>
</xsd:schema>
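The mapping element in the schema above carries six integers, t00 through t21, which can be read as the entries of the 2 x 3 transform from digitizer coordinates to canvas coordinates that the glossary describes. The schema does not state how the entries are laid out, so the following Python sketch assumes a row-vector convention in which t00, t01, t10, and t11 form the linear part and t20 and t21 the translation. The function name apply_mapping is hypothetical.

```python
from typing import Sequence, Tuple


def apply_mapping(point: Tuple[int, int], t: Sequence[int]) -> Tuple[int, int]:
    """Map a digitizer point to canvas coordinates.

    t holds (t00, t01, t10, t11, t20, t21). Assumed layout:
        x' = t00 * x + t10 * y + t20
        y' = t01 * x + t11 * y + t21
    """
    if len(t) != 6:
        raise ValueError("a mapping needs exactly six entries")
    t00, t01, t10, t11, t20, t21 = t
    x, y = point
    return (t00 * x + t10 * y + t20, t01 * x + t11 * y + t21)


# Scale by 2 and shift by (10, -5) under the assumed layout.
print(apply_mapping((3, 4), (2, 0, 0, 2, 10, -5)))
# prints "(16, 3)"
```

If a different layout is intended, only the two expressions in the return statement need to change.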
<?xml version="1.0" encoding="UTF-8"?>
<!-- Edited with XML Spy v3.5 NT (http://www.xmlspy.com) by Giovanni Seni (Motorola HIL) -->
<inkxml xmlns:xsi="http://www.w3.org/2000/10/XMLSchema-instance" xsi:noNamespaceSchemaLocation="H:\inkxml.xsd">
  <deviceInfo manufacturer="Wacom" model="UD-0608-R">
    <sampleRate>200</sampleRate>
  </deviceInfo>
  <channelList>X Y</channelList>
  <screenContext id="s01">
    <canvas id="w01">
      <x1>0</x1>
      <y1>0</y1>
      <x2>750</x2>
      <y2>250</y2>
    </canvas>
  </screenContext>
  <chunk id="c01">
    <trace id="t01"> 232 -94 232 -95 232 -96 232 -97 232 -98 232 -99 232 -101 232 -102 232 -103 232 -105 232 -106 232 -108 232 -109 232 -110 232 -112 232 -113 232 -114 232 -116 232 -117 232 -118 232 -119 232 -120 232 -121 232 -122 232 -123 232 -124 232 -125 232 -126 232 -127 232 -129 232 -130 232 -132 232 -133 233 -134 233 -135 233 -136 233 -137 233 -138 233 -139 233 -140 233 -141 233 -140 233 -139 </trace>
    <trace id="t02">201 -151 202 -151 203 -151 204 -152 206 -152 208 -152 211 -152 214 -152 216 -152 219 -152 221 -152 223 -152 225 -152 227 -152 229 -152 230 -152 231 -152 233 -152 234 -153 236 -153 237 -153 239 -153 240 -153 241 -153 242 -153 243 -153 244 -153 245 -153 246 -153 247 -153 248 -153 250 -153 251 -153 253 -153 254 -153 255 -153 257 -153 258 -153 260 -153 261 -153 263 -152 264 -152 265 -152 266 -152 266 -151 </trace>
  </chunk>
  <trace id="t03" screenContextRef="w01"> 203 -90 204 -90 205 -90 207 -90 209 -91 211 -91 214 -91 217 -91 220 -91 223 -91 226 -91 229 -91 231 -91 234 -91 236 -91 238 -91 240 -91 242 -91 243 -91 244 -91 246 -91 247 -91 249 -91 251 -91 252 -91 254 -91 255 -91 256 -91 258 -91 259 -91 260 -91 261 -91 260 -91 </trace>
</inkxml>
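The schema comment notes that a reader must use the channelList field to parse a trace properly. The Python sketch below shows one way a reader might do that with the standard library: it groups the flat value list of each trace into samples keyed by channel name. It does not interpret the values as absolute positions or deltas, and it keys traces by their id attribute, which is an assumption of this sketch since the schema makes id optional.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List


def read_traces(xml_text: str) -> Dict[str, List[Dict[str, int]]]:
    """Return {trace id: [sample, ...]} with each sample keyed by channel."""
    root = ET.fromstring(xml_text)
    channels = (root.findtext("channelList") or "").split()
    n = len(channels)
    if n == 0:
        raise ValueError("channelList is missing or empty")
    traces: Dict[str, List[Dict[str, int]]] = {}
    # iter() also finds traces nested inside chunk elements.
    for trace in root.iter("trace"):
        values = [int(v) for v in (trace.text or "").split()]
        if len(values) % n != 0:
            raise ValueError("trace length is not a multiple of channel count")
        traces[trace.get("id")] = [
            dict(zip(channels, values[i:i + n]))
            for i in range(0, len(values), n)
        ]
    return traces


SAMPLE = """<inkxml>
  <channelList>X Y</channelList>
  <chunk id="c01"><trace id="t01"> 232 -94 232 -95 </trace></chunk>
  <trace id="t03"> 203 -90 204 -90 </trace>
</inkxml>"""

for trace_id, samples in read_traces(SAMPLE).items():
    print(trace_id, samples)
```

With a channelList of "X Y F", the same function would produce three-key samples without any other change.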
<xsd:schema xmlns:xsd="http://www.w3.org/2000/10/XMLSchema">
  <xsd:annotation>
    <xsd:documentation xml:lang="en">
      Application-specific file for UNIPEN-like data files
      Uses inkxml Schema
    </xsd:documentation>
  </xsd:annotation>
  <!-- For access to primitive file element definitions -->
  <xsd:include schemaLocation="H:\inkxml.xsd"/>
  <xsd:element name="inkxmlUNIPEN" type="inkxmlUNIPENtype"/>
  <!-- Main document type declaration -->
  <xsd:complexType name="inkxmlUNIPENtype">
    <xsd:sequence>
      <xsd:element name="dataBlock" type="dataBlockType" minOccurs="1" maxOccurs="unbounded"/>
    </xsd:sequence>
  </xsd:complexType>
  <!-- A collection of images that share the same source -->
  <xsd:complexType name="dataBlockType">
    <xsd:sequence>
      <xsd:element name="dataBlockInfo" minOccurs="0" maxOccurs="1" type="dataBlockInfoType"/>
      <xsd:element name="writerBlock" minOccurs="1" maxOccurs="unbounded" type="writerBlockType"/>
    </xsd:sequence>
    <xsd:attribute name="id" type="xsd:ID"/>
    <xsd:attribute name="hierarchy" type="xsd:string" use="required"/>
  </xsd:complexType>
  <!-- Information describing the source of this data block -->
  <xsd:complexType name="dataBlockInfoType">
    <xsd:sequence>
      <xsd:element name="source" type="xsd:string" minOccurs="1" maxOccurs="1"/>
      <xsd:element name="date" type="xsd:date" minOccurs="1" maxOccurs="1"/>
    </xsd:sequence>
  </xsd:complexType>
  <!-- A collection of images written by the same writer -->
  <xsd:complexType name="writerBlockType">
    <xsd:sequence>
      <xsd:element name="writerInfo" type="writerInfoType"/>
      <xsd:element name="writerImage" minOccurs="1" maxOccurs="unbounded">
        <xsd:complexType>
          <xsd:sequence>
            <xsd:element name="label" type="labelType" minOccurs="0" maxOccurs="unbounded"/>
            <xsd:choice maxOccurs="unbounded">
              <xsd:element name="traceRef" type="traceRefType" minOccurs="0" maxOccurs="unbounded"/>
              <xsd:element name="trace" minOccurs="0" maxOccurs="unbounded" type="traceType"/>
              <xsd:element name="chunkRef" type="traceRefType" minOccurs="0" maxOccurs="unbounded"/>
              <xsd:element name="chunk" minOccurs="0" maxOccurs="unbounded" type="chunkType"/>
            </xsd:choice>
          </xsd:sequence>
        </xsd:complexType>
      </xsd:element>
    </xsd:sequence>
    <xsd:attribute name="id" type="xsd:ID"/>
  </xsd:complexType>
  <!-- Information about a given writer -->
  <xsd:complexType name="writerInfoType">
    <xsd:sequence>
      <xsd:element name="hand">
        <xsd:simpleType>
          <xsd:restriction base="xsd:string">
            <xsd:enumeration value="L"/>
            <xsd:enumeration value="R"/>
          </xsd:restriction>
        </xsd:simpleType>
      </xsd:element>
      <xsd:element name="sex">
        <xsd:simpleType>
          <xsd:restriction base="xsd:string">
            <xsd:enumeration value="M"/>
            <xsd:enumeration value="F"/>
          </xsd:restriction>
        </xsd:simpleType>
      </xsd:element>
      <xsd:element name="country" type="xsd:string"/>
      <xsd:element name="age" type="xsd:integer"/>
      <xsd:element name="skill">
        <xsd:simpleType>
          <xsd:restriction base="xsd:string">
            <xsd:enumeration value="bad"/>
            <xsd:enumeration value="ok"/>
            <xsd:enumeration value="good"/>
          </xsd:restriction>
        </xsd:simpleType>
      </xsd:element>
      <xsd:element name="style">
        <xsd:simpleType>
          <xsd:restriction base="xsd:string">
            <xsd:enumeration value="print"/>
            <xsd:enumeration value="cursive"/>
            <xsd:enumeration value="mixed"/>
          </xsd:restriction>
        </xsd:simpleType>
      </xsd:element>
    </xsd:sequence>
    <xsd:attribute name="id" type="xsd:ID"/>
  </xsd:complexType>
  <!-- Information on an interpretation of some image -->
  <xsd:complexType name="labelType">
    <xsd:simpleContent>
      <xsd:extension base="xsd:string">
        <xsd:attribute name="id" type="xsd:ID"/>
        <xsd:attribute name="source" type="xsd:string"/>
        <xsd:attribute name="type">
          <xsd:simpleType>
            <xsd:restriction base="xsd:string">
              <xsd:enumeration value="machine"/>
              <xsd:enumeration value="human"/>
            </xsd:restriction>
          </xsd:simpleType>
        </xsd:attribute>
        <xsd:attribute name="score">
          <xsd:simpleType>
            <xsd:restriction base="xsd:float">
              <xsd:minInclusive value="0"/>
              <xsd:maxInclusive value="1"/>
            </xsd:restriction>
          </xsd:simpleType>
        </xsd:attribute>
      </xsd:extension>
    </xsd:simpleContent>
  </xsd:complexType>
  <!-- The actual ink -->
  <xsd:complexType name="traceRefType">
    <xsd:attribute name="uri" type="xsd:string"/>
  </xsd:complexType>
</xsd:schema>
<?xml version="1.0" encoding="UTF-8"?>
<!-- Edited with XML Spy v3.5 NT (http://www.xmlspy.com) by Giovanni Seni (Motorola HIL) -->
<inkxmlUNIPEN xmlns:xsi="http://www.w3.org/2000/10/XMLSchema-instance" xsi:noNamespaceSchemaLocation="file://localhost/H:/inkxmlUNIPEN.xsd" xmlns="file://localhost/H:/inkxml.xsd">
  <dataBlock id="db1" hierarchy="CHARACTER">
    <dataBlockInfo>
      <source>Motorola HIL Palo Alto</source>
      <date>2000-06-22</date>
    </dataBlockInfo>
    <writerBlock>
      <writerInfo>
        <hand>L</hand>
        <sex>M</sex>
        <country>UK</country>
        <age>17</age>
        <skill>good</skill>
        <style>mixed</style>
      </writerInfo>
      <writerImage>
        <label id="l01" source="me" score="1.0">I</label>
        <traceRef uri="primitiveSample.xml#t01"/>
        <traceRef uri="primitiveSample.xml#t02"/>
        <traceRef uri="primitiveSample.xml#t03"/>
      </writerImage>
      <writerImage>
        <label id="l02" source="you" score="1.0">I</label>
        <chunk id="c01">
          <trace id="t01"> 232 -94 232 -95 232 -96 232 -97 232 -98 232 -99 232 -101 232 -102 232 -103 232 -105 232 -106 232 -108 232 -109 232 -110 232 -112 232 -113 232 -114 232 -116 232 -117 232 -118 232 -119 232 -120 232 -121 232 -122 232 -123 232 -124 232 -125 232 -126 232 -127 232 -129 232 -130 232 -132 232 -133 233 -134 233 -135 233 -136 233 -137 233 -138 233 -139 233 -140 233 -141 233 -140 233 -139 </trace>
          <trace id="t02"> 201 -151 202 -151 203 -151 204 -152 206 -152 208 -152 211 -152 214 -152 216 -152 219 -152 221 -152 223 -152 225 -152 227 -152 229 -152 230 -152 231 -152 233 -152 234 -153 236 -153 237 -153 239 -153 240 -153 241 -153 242 -153 243 -153 244 -153 245 -153 246 -153 247 -153 248 -153 250 -153 251 -153 253 -153 254 -153 255 -153 257 -153 258 -153 260 -153 261 -153 263 -152 264 -152 265 -152 266 -152 266 -151 </trace>
        </chunk>
      </writerImage>
    </writerBlock>
  </dataBlock>
</inkxmlUNIPEN>
Annotation: Elements in an InkXML file that describe meta-data, or semantic information, about the traces themselves (See ink annotation)
Application-specific elements: Provide higher-level description of the digital ink captured in the primitive elements
Attribute (XML): Additional value associated with an XML element, such as ID, TIME, NAME, or VALUE. These appear in XML elements with the syntax name="value"
Bandwidth: Maximum frequency at which a digitizer can accurately track and report pen coordinates (or other channels). Bandwidth may be much lower than the sample rate.
Binary ink: Any file format for digital ink encoded as a sequence of bits but not consisting of a sequence of printable characters (text). After compression, binary ink is typically archived or transmitted.
Bounding box: A minimal-sized rectangle that encloses a group of traces
Canvas: Widget or window in a graphical user interface where ink is drawn during ink capture
Capture: Digitally recording physical measurements of handwriting, typically using a stylus
Chunks: A group of pen traces
Compression: The coding of data to save storage space or transmission time
Content: Actual data represented by an element in the XML document
Device: See digitizer
Digital ink: An electronic representation of the pen movement, pressure, and other characteristics of handwritten input using a digitizing device
Digital pen: A passive stylus containing no electronic components or an active stylus containing electronic components
Digitizer: A hardware device capable of sensing the position of the digital pen tip (a.k.a. tablet). The digital pen can be a passive stylus containing no electronic components or an active stylus containing electronic components.
Electronic ink: See digital ink
Element: The basic construct in XML. An element begins with a "<element-type-name [attributes] >" tag and ends with a "</element-type-name>" tag. The intervening data is considered the element's content. An element without any content may also be written as <element-type-name [attributes] />.
Events: An action, either human or machine generated; for example, page turn, pen up, or ink color change
Force: The pressure applied to a writing implement, typically measured in grams, ounces, or newtons
Gesture: Collection of ink traces that indicate a certain action to be performed
Ideographic: A written language in which symbols represent words, rather than characters, such as Kanji (a.k.a. pictographic)
Ink: See digital ink
Ink annotation: A handwritten note or markup referencing (by proximity) another visible writing or printed matter
Ink archive: A collection of ink documents
Ink attribute: A basic named value for an ink trace, such as color and width
Ink document: A collection of one or more pages containing ink traces
Ink label: A descriptive or identifying word or phrase accompanying some ink traces
Ink point: An element in the stream of data recorded by a real-time digitizer of handwriting; for example, a tuple <x, y, pressure, tilt>
Ink-enabled system: A system capable of recording digital ink data
Instant messaging: Communication application allowing people to know the presence information from other parties and to participate in a near real-time chat session
Mapping: Transformation used to map from digitizer coordinates to canvas coordinates (See transform)
Page boundaries: The division of handwriting events by the page for which they are intended
Primitive elements: Set of rudimentary elements sufficient for all basic ink applications
Primitive file: Contains the raw output of the digitizer in temporal order
Recognition grammar: Specification of words and patterns of words that a recognizer should expect when processing input ink
Resolution: The minimal change or difference in a measurement (coordinate, force, tilt) that a digitizer reports
RMS noise: "Root mean square" noise, a measure of the actual ability of a digitizer to resolve position. Some digitizers report at high resolution, but have a lower effective resolution due to noise. RMS noise is usually linked to bandwidth, with noise increasing at higher bandwidths.
Sample rate: The frequency at which a digitizer reports coordinate (or other) information. Sample rate is not always directly related to the bandwidth.
ScreenContext: ScreenContext is one of the primitive elements of the InkXML. It is used to reflect the characteristics of the display area and the correspondence between the display area and the ink-capturing device.
Semantic: A contextual interpretation of handwriting, such as character, word, sentence, and paragraph
Session: The span of time from a user beginning an interaction to ending the interaction with the system; also, the data gathered during this span of time
Streaming: Continuously sending handwriting events over a communication channel
Stroke: Ink resulting from an elementary pen movement, such as bounded by two consecutive velocity extrema. A sequence of strokes constitutes a trace.
Tags: Description of a semantic component of an XML language
Temporally sequential: Items that occur next to each other in time
Tilt angle: The angle of the pen with respect to the writing surface, which is usually measured as angles of the projection onto x and y vertical planes
Trace: A complete pen-down movement bounded by two pen-up movements or a complete pen-up movement. A sequence of traces accumulates to meaningful units, such as characters and words.
Transform: A linear function applied to a point in order to stretch, rotate, and skew from one coordinate space into another, which is usually expressed in 2D as a 2 x 3 matrix. (See mapping)
Verification: Confirmation that a presented signature is the same as the one on file (a.k.a. one-to-one matching)
View: The portion of the canvas visible during ink capture
Copyright ©2002 IBM, Intel, Motorola, International Unipen Foundation. All Rights Reserved.