Re: Searching for text in an SVG image?

From: Helder Magalhães <helder.magalhaes@gmail.com>
Date: Tue, 9 Jun 2009 12:47:17 +0100
Message-ID: <2a1ddf8a0906090447r4462b7f8wf87b249264517a8c@mail.gmail.com>
To: "DuCharme, Bob" <BDuCharme@innodata-isogen.com>
Cc: www-svg@w3.org
Hi Bob,


First of all, note that the www-svg mailing list "is for technical
discussion on Scalable Vector Graphics (SVG) and its specifications"
[1]. For general SVG support there are more appropriate mailing lists
such as svg-developers [2]. ;-)


> I know that when I see "hello" in an SVG file, it may have been put there
> with a single text element, but it may have been put there with 5 text
> elements, each placing a single letter in the image. The latter would
> obviously be more difficult to search for.

Well, I'd say that placing 5 text elements would be semantically
incorrect, as they would no longer be conceptually seen as a whole
word or sentence; also, when placing glyphs separately, the only way
(I can currently think of) to try linking them would be through
position heuristics after rendering (or based on text coordinates,
once all transformations, character dimensions, etc. have been taken
into account), which is basically reverse engineering to guess what the author
meant... :-|
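
To make the coordinate-based heuristic above a bit more concrete, here is
a minimal Python sketch. It is only an illustration under loud
assumptions: every "text" element carries plain numeric x/y attributes,
no transforms are involved, and the function name, sample data and the
two thresholds (y_tolerance, max_gap) are all hypothetical and would
need tuning per font size. It groups elements that share a baseline and
joins neighbours whose x positions are close together.

```python
import xml.etree.ElementTree as ET

SVG_NS = "{http://www.w3.org/2000/svg}"

# Hypothetical input: "hello" placed one glyph per <text> element.
SAMPLE = """<svg xmlns="http://www.w3.org/2000/svg">
  <text x="10" y="20">h</text><text x="20" y="20">e</text>
  <text x="30" y="20">l</text><text x="40" y="20">l</text>
  <text x="50" y="20">o</text>
  <text x="200" y="20">x</text>
  <text x="10" y="60">world</text>
</svg>"""


def merge_glyph_runs(svg_source, y_tolerance=1.0, max_gap=12.0):
    """Guess words from separately placed <text> elements.

    Assumes numeric x/y attributes and ignores transforms, tspans with
    their own positions, and real glyph widths.
    """
    root = ET.fromstring(svg_source)
    glyphs = []
    for el in root.iter(SVG_NS + "text"):
        content = "".join(el.itertext()).strip()
        if not content:
            continue
        try:
            x = float(el.get("x", "0"))
            y = float(el.get("y", "0"))
        except ValueError:
            continue  # e.g. a list of x values; out of scope here
        glyphs.append((y, x, content))

    # Step 1: cluster elements whose baselines are (almost) the same.
    lines = []
    for glyph in sorted(glyphs):
        if lines and abs(glyph[0] - lines[-1][0][0]) <= y_tolerance:
            lines[-1].append(glyph)
        else:
            lines.append([glyph])

    # Step 2: within a line, join neighbours that start close together.
    runs = []
    for line in lines:
        line.sort(key=lambda g: g[1])
        current, prev_x = line[0][2], line[0][1]
        for _, x, content in line[1:]:
            if x - prev_x <= max_gap:
                current += content
            else:
                runs.append(current)
                current = content
            prev_x = x
        runs.append(current)
    return runs


print(merge_glyph_runs(SAMPLE))
```

The gap is measured between the start positions of consecutive
elements, which is only reasonable when each element holds a single
glyph; a real implementation would need actual glyph metrics.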

Note that SVG has several interesting text layout features such as
alignment properties and text on a path [3] which should help towards
precise glyph placement in order to achieve the desired result. I take
the opportunity to suggest a couple of interesting articles on "SVG
and Typography" [4] [5]. ;-)


> Has anyone heard of a SVG
> programming library that makes such searches easier?

No. I'm aware that some SVG implementations, such as Batik (Squiggle)
[6], implement text search functionality (in Squiggle, through the
"Edit" menu, choosing "Find..."), but I doubt any of them has
implemented the heuristics described above, which you seem to be
seeking.
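
For comparison, a plain per-element search is easy to sketch in Python.
This is not how Batik implements its "Find..." feature (I haven't
checked its source); it is only an assumed, simplified equivalent, with
a hypothetical function name and sample data, that shows why text split
into one element per glyph is not found.

```python
import xml.etree.ElementTree as ET

SVG_NS = "{http://www.w3.org/2000/svg}"

# Hypothetical samples: the same word as one element and as five.
WHOLE = """<svg xmlns="http://www.w3.org/2000/svg">
  <text x="10" y="20">Hello <tspan>world</tspan></text>
</svg>"""

SPLIT = """<svg xmlns="http://www.w3.org/2000/svg">
  <text x="10" y="20">h</text><text x="20" y="20">e</text>
  <text x="30" y="20">l</text><text x="40" y="20">l</text>
  <text x="50" y="20">o</text>
</svg>"""


def find_text(svg_source, needle):
    """Return the content of each <text> element containing needle.

    The match is case-insensitive and looks at one element at a time,
    including any nested tspan content.
    """
    root = ET.fromstring(svg_source)
    hits = []
    for el in root.iter(SVG_NS + "text"):
        content = "".join(el.itertext())
        if needle.lower() in content.lower():
            hits.append(content)
    return hits


print(find_text(WHOLE, "hello"))
print(find_text(SPLIT, "hello"))
```

The second call finds nothing, since no single element contains the
whole word.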

If you really need to go in that direction (for example, if you don't
control the generated SVG input and cannot change the SVG generation,
either by changing authoring habits or by changing the SVG
output of some tool) then you may want to take a look at the "Machine
Accessibility" [7] section of the SVGIG wiki, with focus on the "XSLT
File" subsection, which can be used as a starting point. :-)


[below in the original message]
> Disclaimer:
[...]

I'd suggest not posting email disclaimers into mailing lists (the
TortoiseSVN mailing list etiquette [8] has a "Note about e-Mail
disclaimers"). I'm not sure what the specific guidelines are for
this particular mailing list, but this is a general suggestion. ;-)


> thanks,
> Bob

Hope this helps,
 Helder


[1] http://lists.w3.org/Archives/Public/www-svg/
[2] http://tech.groups.yahoo.com/group/svg-developers/
[3] http://www.w3.org/TR/SVG/text.html
[4] http://www.xml.com/pub/a/2004/04/07/svgtype.html
[5] http://www.xml.com/pub/a/2004/05/12/svg.html
[6] http://xmlgraphics.apache.org/batik/
[7] http://www.w3.org/Graphics/SVG/IG/wiki/Accessibility_Activity#Machine_Accessibility
[8] http://tortoisesvn.tigris.org/list_etiquette.html
Received on Tuesday, 9 June 2009 11:47:54 GMT