[css3-images] [css3-background] Image/media fragments and cropping from Philippe Verdy on 2012-11-08 (www-style@w3.org from November 2012)

From: Philippe Verdy <verdy_p@wanadoo.fr>
Date: Thu, 8 Nov 2012 23:55:25 +0100
To: www-style@w3.org
Message-ID: <CAGa7JC0iW8OJ-_v4Wgzk2fRgB7+Xd9vA6tRUk9Z=6BOVf2faPg@mail.gmail.com>
I note that the [css3-background] module is not consistent with the
current definition of the [css3-images] module with regard to image
fragments, and more generally with the wish to select a part of a
resource to be used as a source of images to render (possibly
animated if the resource is a video).

For example, the [css3-images] module uses a very unfriendly fragment
identifier with a fixed keyword to specify cropping parameters
(#xywh=x,y,w,h). Not only is this keyword ugly, it also prohibits
using a resource containing multiple parts (not just one image or one
video), for example a ZIP or JAR archive, or even an SVG document,
each of which has its own way to select a part of its content and to
specify the associated content-type.

For this reason, the syntax for url(resource-url#resource-fragment)
should remain fixed so that the #resource-fragment remains ONLY
interpreted according to the content-type of the resource at the given
resource-url.

For example, if it is a ZIP or JAR archive, the fragment would be used
to specify an internal relative path *within* that resource, and
nothing else. If the resource-fragment cannot be found within the
resource at resource-url, this should behave in UAs as if the resource
was not found (like HTTP 404 Not Found). If the resource-fragment has
a syntax error, or lacks some information needed to process it and
return a selected fragment according to the content-type of the
resource at resource-url, the handler for the media type of that
resource may behave like a web server returning other 4xx errors
(including permission denied, or an incorrect internal path in the
resource-fragment specified by the client) or 5xx errors (internal
error of the content-type handler). In that case the returned
content-type of the fragment should not be an image type (it may
be a text type that the UA may choose to render, for example to
display an error message, or text/html if the UA can render it to
build an image itself).
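
A minimal Python sketch of this behaviour for a ZIP container, using
only the standard zipfile module. The function name, the
(status, payload) return shape and the exact mapping of failures to
status codes are assumptions made for illustration; they are not
defined by any CSS module.

```python
import io
import zipfile


def resolve_zip_fragment(archive_bytes, fragment):
    """Resolve a fragment as an internal path inside a ZIP resource.

    Returns a hypothetical (status, payload) pair that mimics HTTP codes.
    """
    if not fragment:
        # Fragment lacks the information needed to select anything.
        return 400, b"empty fragment"
    try:
        with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
            try:
                return 200, archive.read(fragment)
            except KeyError:
                # The internal path does not exist: behave like 404.
                return 404, b"fragment not found in archive"
    except zipfile.BadZipFile:
        # Assumption: a handler that cannot read the container
        # reports an internal (5xx-like) error.
        return 500, b"resource is not a readable ZIP archive"


def make_demo_archive():
    """Build a tiny in-memory archive holding one member."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("icons/image1", b"fake image bytes")
    return buffer.getvalue()
```

The point of the sketch is that the fragment is interpreted only by the
handler of the container type, never by CSS itself.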

The content-type handler of the URL will then generally not perform
any cropping of the returned image. Such cropping should only occur in
the CSS3 UAs, using a much better syntax.

Also, some resources won't return by themselves the correct
content-type needed to select the content-type handler that will parse
the #resource-fragment specifier after parsing the document retrieved
from resource-url. For this reason I suggest augmenting the syntax of
"url(resource-url)" into "url(resource-url content-type)" to allow
overriding the content-type.

The complete syntax of a URL will then be
"url(resource-url#resource-fragment content-type)", with the
resource-fragment part ALWAYS parsed by the handler of the specified
content-type! Some content-types may support selections other than
just an internal path, using their own syntax (for example to select a
frame number in a video, or rendering quality parameters, color
conversions and hints, or to select one or more ranges (between
timestamps), or to control the playing speed, or to give the
URL of some external data containing a playing and rendering script,
possibly also controlling the layout).
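
As a rough illustration, the argument of such a url() could be split
as follows in Python. The function names are hypothetical, and the
sketch assumes the bare "resource-url#resource-fragment content-type"
form written above, with the content-type separated by white space.

```python
def split_url_argument(argument):
    """Split 'resource-url#resource-fragment content-type' into parts.

    Returns (resource_url, fragment_or_None, content_type_or_None).
    """
    parts = argument.strip().split(None, 1)
    if not parts:
        raise ValueError("empty url() argument")
    uri = parts[0]
    content_type = parts[1].strip() if len(parts) > 1 else None
    resource_url, separator, fragment = uri.partition("#")
    return resource_url, (fragment if separator else None), content_type


def effective_content_type(served_type, override):
    """The override written in the url(), if any, wins over the server."""
    return override or served_type
```

The fragment is returned untouched: only the handler selected by the
effective content-type is expected to interpret it.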

Most image formats (and even videos) don't have internal cropping
parameters; they have their own intrinsic size, and that is why we
need support for cropping directly in the CSS UA, even if it
is absent from the content-type handler. CSS just needs to make sure
that the content-type handler will return a resource with the expected
content-type (if not, the UAs may display an alternate icon or other
indication that the media is not in one of the supported content-types
for images, i.e. it cannot generate a rectangular frame). If the
content-type handler returns some image/* or video/* type, the UAs
will query the properties of the returned content-type to see if it
defines an intrinsic size (width, height).

Now comes the support of cropping: the CSS3 UA should be able
to perform the cropping itself (but if needed it can first query the
content-type handler, through a standard imaging API, to see whether
the handler can perform the cropping and so save processing).
If the content-type handler does not support cropping, the
CSS3 UA should perform it by using its own "chained" renderer, which
will take as input the url() of the resource (augmented as described
above to support the selection of the content-type handler) and the
cropping parameters. This will use a new, and MUCH BETTER, LESS UGLY,
specifier:

  <image-url> = <css-uri> <content-metadata>* <image-transform>*

where (at the lexical level):

  <css-uri> = "url(" <WS>* (
      <uri>
    | <SQUOTE> <uri> <SQUOTE>
    | <DQUOTE> <uri> <DQUOTE>
    ) <WS>* ")"

  <uri> = (
      <resource-uri> [ <fragment-specifier> ]
    | <fragment-specifier>
    )

  <fragment-specifier> =  "#" <resource-fragment>

and where (at the syntax level):

  <content-metadata> = <content-type> | <content-encoding>
                     | <content-language>   -- etc.
  <content-type>        = "type("        <Content-Type-Value>        ")"
  <content-encoding> = "encoding(" <Content-Encoding-Value>  ")"
  <content-language> = "lang("        <Content-Language-Value> ")"

  <image-transform> = <image-flip> | <image-resize> | <image-rotate>
                    | <image-crop>

  <image-flip> = "flip-x" | "flip-y"
  <image-resize> = "size(" <image-resize-mode>? <image-size>?
                   <image-resize-hint>? ")"
  <image-rotate> = "rotate(" <degrees> ")"
  <image-crop> = "crop(" ( <image-crop-x> [ <image-crop-y> ]
                         | <image-crop-y> ) ")"

  <image-resize-mode> = "auto" | "cover" | "contain"   -- default is "auto"
  <image-size> =  ( <length> | <percentage> ){1,2}
  <image-resize-hint> =

  <image-crop-x> = (
      "left" ( <percentage> | <length> )
        [ ( "right" | "width" ) ( <percentage> | <length> ) ]
    |
      "right" ( <percentage> | <length> )
        [ "width" ( <percentage> | <length> ) ]
    |
      "center" ( <percentage> | <length> )
    )
  <image-crop-y> = (
      "top" ( <percentage> | <length> )
        [ ( "bottom" | "height" ) ( <percentage> | <length> ) ]
    |
      "bottom" ( <percentage> | <length> )
        [ "height" ( <percentage> | <length> ) ]
    |
      "center" ( <percentage> | <length> )
    )
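
To show that the part following the <css-uri> is easy to tokenize,
here is a small Python sketch that splits a list of specifiers into
metadata and transforms. It only checks the specifier names and the
"metadata first, transforms after" ordering of <image-url>; it does
not validate the arguments, and the function name is hypothetical.

```python
import re

# A specifier is either a bare keyword or name(arguments).
TOKEN = re.compile(r"\s*([a-z][a-z-]*)(?:\(([^()]*)\))?")
METADATA = {"type", "encoding", "lang"}
TRANSFORMS = {"flip-x", "flip-y", "size", "rotate", "crop"}


def parse_specifiers(text):
    """Return (metadata, transforms) as lists of (name, [arguments])."""
    metadata, transforms = [], []
    text = text.rstrip()
    position = 0
    while position < len(text):
        match = TOKEN.match(text, position)
        if match is None:
            raise ValueError(f"unexpected input at offset {position}")
        name, arguments = match.group(1), match.group(2)
        item = (name, arguments.split() if arguments is not None else [])
        if name in METADATA:
            if transforms:
                raise ValueError("metadata must precede transforms")
            metadata.append(item)
        elif name in TRANSFORMS:
            transforms.append(item)
        else:
            raise ValueError(f"unknown specifier: {name}")
        position = match.end()
    return metadata, transforms
```

The transforms come back in source order, which is the order in which
they are meant to be executed.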

Note that transforms are executed in the specified order. I propose
here the support for:

1. mirroring the source frames, with keywords like "flip-x" and
"flip-y" (specifying both would generate a 180 degree rotation, so
you just need one value);

2. resizing the source frames (zooming effect) to the indicated size
(if the source has no intrinsic size, this instructs the media-type
handler which size to render the media with); if only one <length> or
<percentage> value is specified in <image-size>, the second value is
computed according to the <image-resize-mode>. "auto" for the first
or second value means that the desired width and height will be
adjusted in order to preserve the aspect ratio of the source frame,
preserving all the area of its content (without performing any
cropping or adding transparent border bands at this step);

3. rotating the source frames, with a numeric parameter in degrees
(positive for clockwise), or using a basic keyword like "left" and
"right" for 90 degrees clockwise or anti-clockwise, or "updown" for
180 degrees: note that this rotation may extend the resulting size
of the rectangle (adding 4 transparent triangular corners around the
transformed image to preserve its content);

4. finally cropping it by a specified amount from the borders, and/or
by specifying a final width/height, like with background-position in
[CSS3-background]. Percentages are relative to the source frame
(possibly already resized by previous operations).
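
Two of these steps involve a little arithmetic, sketched here in
Python. The first function computes the enlarged frame after a
rotation. The second resolves the "left ... width ..." form of a crop
along one axis; its interpretation (an offset from the left border
followed by the width kept, remaining width when omitted) is my
assumption, as are the function names and the restriction to px and
percentage values.

```python
import math


def rotated_frame_size(width, height, degrees):
    """Size of the axis-aligned rectangle enclosing the rotated frame."""
    theta = math.radians(degrees)
    cosine, sine = abs(math.cos(theta)), abs(math.sin(theta))
    return width * cosine + height * sine, width * sine + height * cosine


def to_pixels(value, frame_extent):
    """Resolve a 'Npx' or 'N%' string against the frame extent."""
    if value.endswith("%"):
        return frame_extent * float(value[:-1]) / 100.0
    if value.endswith("px"):
        return float(value[:-2])
    raise ValueError(f"unsupported unit in {value!r}")


def crop_left_width(frame_width, left, width=None):
    """Return (start, extent) for 'left <left> [width <width>]'."""
    start = to_pixels(left, frame_width)
    if width is None:
        extent = frame_width - start
    else:
        extent = to_pixels(width, frame_width)
    return start, extent
```

For a 100 by 50 frame rotated by 90 degrees, the first function gives
a frame of about 50 by 100, with no transparent corners needed.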

As much as possible, these 4 operations (executed in that order) should
be performed directly by the source media-type renderer to save
processing, but if it does not provide such support, the UA will
implement them itself. It is assumed however that media-type renderers
for images that have no intrinsic size (e.g. some SVG images) will
support at least the image-resize operation, or will be able to return
a "default" size (otherwise the UAs won't be able to render them).

Note that operations 1 and 2 (flipping and resizing) can (and should)
easily be combined into a single operation. I don't suggest mixing the
order of operations 1 to 4, notably because operation 4 (cropping)
does not fit easily in a single linear transform matrix (whereas
operations 1 to 3, including rotation, can be combined into one). If
the UA wishes to support such mixes, it will need to use 2 transform
matrices (a first matrix before cropping, and a second one after
cropping).
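
The matrix argument can be checked with a few lines of Python using
plain 2x2 matrices. I assume here that "flip-x" negates the x
coordinate and "flip-y" the y coordinate, and that coordinates are
screen coordinates with y pointing down, so that a positive angle
appears clockwise. The helper names are hypothetical.

```python
import math

FLIP_X = [[-1.0, 0.0], [0.0, 1.0]]
FLIP_Y = [[1.0, 0.0], [0.0, -1.0]]


def scale(sx, sy):
    return [[sx, 0.0], [0.0, sy]]


def rotate(degrees):
    theta = math.radians(degrees)
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]


def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def compose(operations):
    """Single matrix equivalent to applying the operations in order."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    for matrix in operations:
        result = matmul(matrix, result)  # later operations multiply on the left
    return result


def apply(matrix, point):
    x, y = point
    return (matrix[0][0] * x + matrix[0][1] * y,
            matrix[1][0] * x + matrix[1][1] * y)
```

Composing FLIP_X and FLIP_Y gives the same matrix as rotate(180), up
to rounding. Cropping has no such matrix, which is why it stays a
separate step.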

If needed in a future specification, it could be possible to create
<image> specifiers using combinators and transforms in variable order,
but this should use a more complete specification of image expressions
using combinators and multiple transforms (plus additional transforms
like color transformation, non-linear transforms, projections,
clipping, morphing, masking and lighting effects...). But this should
be done in accordance with other graphics specifications, including
for 3D.

But what is important for now is to be able to use the same single
image resource, downloaded in a single operation and cached once
(unless it is a "live" video or a non-cacheable image, for which each
new occurrence of the resource in the loaded document should trigger
a separate download and a separate caching of the same external
resource referenced in the same document instance). Image fragments
can then easily be extracted from it using common constructors in an
imaging/framing API, and these images can also be stored consistently
in container documents (from which an extraction is possible using
standard fragment identifiers supported by each content-type
handler).
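
A minimal Python sketch of this "download once, derive many" idea.
The class, the fetch callable and the download counter are
hypothetical stand-ins for whatever a UA uses internally.

```python
class ResourceCache:
    """Cache resources by URL without the fragment."""

    def __init__(self, fetch):
        self._fetch = fetch      # callable: resource URL -> bytes
        self._store = {}
        self.downloads = 0

    def get(self, resource_url, cacheable=True):
        if cacheable and resource_url in self._store:
            return self._store[resource_url]
        # Non-cacheable resources (e.g. live video) are fetched each time.
        self.downloads += 1
        data = self._fetch(resource_url)
        if cacheable:
            self._store[resource_url] = data
        return data


def derive(cache, url_with_fragment, cacheable=True):
    """Return (resource bytes, fragment) for one derived image."""
    resource_url, _, fragment = url_with_fragment.partition("#")
    return cache.get(resource_url, cacheable), fragment
```

Two derived images that differ only by their fragment then share a
single download of the container.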

The purpose of my proposal here is effectively to allow creating
multiple image resources derived from the same external resource
loaded once, using the same API and the same consistent syntax, and to
make sure that these derived images will be usable in ALL modules of
CSS3 (for now the image module itself, but also the border and
background modules, as well as in SVG primitive modules).

It also allows overriding some resource metadata in order to select
and instruct a content-type handler that will render the correct
image (including the content-language, which may be important for its
conditional internal styling, and the content encoding for some
container resources for which there is an ambiguity between multiple
content-types for selected content fragments stored in them, or
different possible interpretations for their effective processing, or
simply because of the transport).

For now the specification of images that uses the functional notation
"image()" with a specific reinterpretation of the fragment identifier
is definitely inconsistent, and NOT viable (it is also unsupported in
most browsers, too experimental and in fact bogus, so the fragment is
not interpreted the way it was experimentally specified).


For me a better (but LATER) specification will need to define a
newer functional notation only for more advanced notations allowing
the creation of derived images from multiple sources with combinators,
and it will use an "image()" notation **without** the bogus and
clearly insufficient "#xywh=x,y,w,h" reinterpretation of fragment
identifiers in the URL. Such combinators will use this <image-url>
specification (or similar) as internal atomic elements, meaning that
the following will be a valid image:

  image( url(http://example.com/resources.z#icons/image1
             lang(fr;*=0.8) type(binary/x-zip))
         type(image/png) flip-y size(contain 100px 100px)
         crop(left 10px width 10px center 50%) )

just like this proposed version (that does not need the superfluous
"image()" surrounding functor, for just combining a single image
source that has just been selected by document fragment, mirrored,
resized, rotated and cropped).

A future specification could use several url() (each one interpreted
as an image, or instructed to be interpreted as an image source with an
"image()" functor) to be combined or manipulated with a more advanced
language directly in the CSS scripting syntax. The alternative is
to create such a combination in an SVG document (handled by its
DOM API or created in XML syntax). I think however that my proposal
can work correctly in the CSS API using only basic properties of
existing objects (which in most cases will simplify the applications
using CSS via its API).

In the example above, note the presence of the lang() specifier
**within** the url() specifier, which is then expanded to specify the
Accept-Language for the HTTP request to be performed on the
example.com server. Likewise, the type(binary/x-zip) specifier within
the url() allows overriding the content-type returned by the
server, which allows selecting the correct media-type handler for .zip
files: it means that the fragment named "icons/image1" will be
located in the ZIP archive by the ZIP content-type handler, which will
select it. Then the second type(image/png) specifier that appears
outside the url() indicates how to parse the resource extracted from
the ZIP archive.
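
The two roles of type() in this example can be sketched in Python as
a small chain. The handler registry, the function names and the
check of the 8-byte PNG signature are illustrative assumptions: the
inner type selects the container handler, and the outer type states
what the extracted fragment is expected to be.

```python
import io
import zipfile

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def zip_handler(data, fragment):
    """Container handler: the fragment is a path inside the archive."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        return archive.read(fragment)


# Hypothetical registry of content-type handlers known to the UA.
HANDLERS = {"binary/x-zip": zip_handler}


def load_fragment(data, served_type, fragment, inner_type=None, outer_type=None):
    """Extract a fragment, honouring the type() overrides."""
    # type() inside url() overrides the content-type sent by the server.
    handler = HANDLERS[inner_type or served_type]
    extracted = handler(data, fragment)
    # type() outside url() declares how to read the extracted part.
    if outer_type == "image/png" and not extracted.startswith(PNG_SIGNATURE):
        raise ValueError("extracted fragment is not a PNG image")
    return extracted, outer_type
```

Without the inner override, a server answering with a generic type
would leave the UA with no handler able to interpret the fragment.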

When metadata specifiers are indicated *outside* the url(), they state
what we expect to get on output: if needed, the resource already
loaded will be reinterpreted by another content handler which will
refilter it. When they are *within* the url(), they indicate how to
parse the loaded resource itself, not the selected and extracted
fragment: the fragment specifier appended to the URL is just exposed
on output of the loaded resource so that the content-type handler will
interpret it. Its role then disappears from the output of the url()
functor (except that the resulting image object may still expose a
view of it as a basic string property of the url property of the
returned image object, and a basic string property of the same url
property for the fragment identifier that was used by the content-type
handler, just like it will expose a basic string property for the
content-type attached to the input resource that was fed to the
content-type handler).

Thanks.

-- Philippe Verdy
Received on Thursday, 8 November 2012 22:56:15 UTC