RE: Anyone tried to extract images from Word 2003 XML before?

Part of the solution is also on sharexml:

http://www.sharexml.com/x/get?k=6loeTebWcH1Z

But that lacks context, so it is harder to understand.



Regards



*From:* Geert Josten [mailto:geert.josten@dayon.nl]
*Sent:* Tuesday, 26 March 2013 7:27
*To:* mozer
*CC:* XProc Dev
*Subject:* RE: Anyone tried to extract images from Word 2003 XML before?



Hi Xmlizer,



I found an answer myself. It is reasonably trivial using Calabash. Just
have an XSLT write the binData to the secondary output, rename the root
element to c:data, and write that using p:store/@cx:decode=true.
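
For readers who want to see what the decode step amounts to, here is a minimal sketch in Python rather than XProc. It is not the Calabash pipeline and not the code from the repository linked below; it only mirrors the same data flow: find each w:binData, strip the line breaks from the base64 text, decode it, and write the bytes to a file named after the last segment of w:name. The function name extract_images and the output directory are hypothetical; the namespace is the one Word 2003 XML uses for the w: prefix.

```python
import base64
import xml.etree.ElementTree as ET
from pathlib import Path

# Namespace bound to the "w:" prefix in Word 2003 XML (WordprocessingML 2003).
W_NS = "http://schemas.microsoft.com/office/word/2003/wordml"


def extract_images(wordml_text, out_dir):
    """Decode every w:binData element and write it as a binary file.

    Hypothetical helper for illustration: returns the list of written paths.
    """
    root = ET.fromstring(wordml_text)
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    written = []
    for bin_data in root.iter(f"{{{W_NS}}}binData"):
        # w:name looks like "wordml://03000002.png"; keep the last segment.
        name = bin_data.get(f"{{{W_NS}}}name", "")
        file_name = name.rsplit("/", 1)[-1]
        if not file_name:
            continue  # no usable name, skip rather than guess one

        # The base64 payload is usually wrapped over many lines;
        # drop all whitespace before decoding.
        payload = "".join((bin_data.text or "").split())
        target = out / file_name
        target.write_bytes(base64.b64decode(payload))
        written.append(target)
    return written
```

Assuming the Word document was saved as a file called document.xml (a made-up name), extract_images(Path("document.xml").read_text(encoding="utf-8"), "images") would write each embedded picture into an images directory.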



The XProc part is in here:
https://github.com/grtjn/xproc-ebook-conv/blob/master/src/nl/grtjn/xproc/ebook/input-adapters/input-adapters.xpl (store-extra-files)



The XSLT part should be here:
https://github.com/grtjn/xproc-ebook-conv/blob/master/src/nl/grtjn/xproc/ebook/input-adapters/wordml/get-main-matter.xsl



Good luck!



Geert



*From:* mozer [mailto:xmlizer@gmail.com]
*Sent:* Saturday, 23 March 2013 19:17
*To:* Geert Josten
*CC:* XProc Dev
*Subject:* Re: Anyone tried to extract images from Word 2003 XML before?



Did you get any answer on that topic?



Xmlizer



2011/11/3 Geert Josten <geert.josten@dayon.nl>

Hi,

I would like to extract images from Word 2003 XML. They are base64 encoded, if
I am not mistaken. They are easy to single out, but how can I have them
written as proper binary files to disk using XProc?

<w:pict>
<w:binData w:name="wordml://03000002.png"
xml:space="preserve">iVBORw0KGgoAAAANSUhEUgAABYMAAANmCAIAAAAnyQPfAAAAAXNSR0IArs4c6QAAAAlwSFlzAAAS
dAAAEnQB3mYfeAAA/7VJREFUeF7snQdgVFXW+CehJvQAihAp0hEFBRFhNWBBmshGLCgi6EIEERcF
pPj9xV0EF3VRRA2wNvQTP0tEBUF0BVSKihRFSpAq0hOKkABp//PeTV4mkylvat7M/J7jMHlzyzm/
c9/M3PPOPTcmvtlIW8CPGJv2X+mj+GzJd/W/7E4VvXQ8r1osbMVJB9qpov/tey8sGuNYpUSXTvq3
k8qViEqiEg0X9WZIUPKEq7LO9HGmTGGzMaW0KaRTqvuS54sY2knnxBjFmB2s6AG+C6Mo4/pM2HE0
FTdUElDJ8yU09UCllNyFAjsfxa5HmZfwndkwQIRNjF1Hezi3T4nh7WagO4Nf8tpwHGdFfzuF7xyN
MYhKXcrOJCt1adoNCeefUJqyzmyuPnVcjYZSw7sIvhOizpV29clYYqS5Gt4uyNsJbHp4243fUoRL
XMGuLsHiQiUvzRIfhi4Qu2Tv8iPfKTYnhEt8/LgaOs4UdvaFUsKorq4ZuyHhOOxLfIs5fkm46bCU
mV192Lj4FlTfVM5Ht5vh7WnAlvxCKUHa8TvGSd9Fo8QVfBfn1deJc/glBHZlH6++skt+dxVfk870
cTHs3cEvksXVwPT4lV38ceP8E8q5UQr1cGGU4BAuNTRdfaaV/Byx/xHiilaRwM4R8JWtMyz5keEM
vh9f2c6+Ie0uB7PD29U3v5tJhfMPNUt9Zbv6YnT8dnD+cevmCrczqbMPS1cXWImydoVcfUa7+t5w
94Xi+Lnp8TPF8dPaFQy7Tp3pUeozuES7rr4SvP7KdvNbp+R1VuqXkytDufr4dgHf/mMxHF9fXnGb
IXZsOCqAzBCAAAQgAAEIQAACEIAABCAAAQiEF4FL212tHjFOYyIKck5e2rRKQX5+QUGB7iONUS8K
j5iYrXtzY8rFu9SZmIhSXrCS/rNilOqVixspuuvPS2ek3zdYiv2Qrjx0zoXSZXXhUSUmwv5a8eSB
....

Kind regards,
Geert

Received on Tuesday, 26 March 2013 06:31:01 UTC