With Wordbee you can translate XML files that contain good old html code.
A simple example
Let us look at two ways of embedding html inside an XML document:
<?xml version="1.0" encoding="UTF-8" ?>
<book>
<chapter>
<block title="Introduction"><![CDATA[
<p>Welcome to our book</p>
<p><a href="#" title="click me">Continue reading & browsing</a></p>
]]></block>
</chapter>
</book>
The <block> node contains html code. Wrapping it in CDATA markers (the opening <![CDATA[ and the closing ]]>) means that the html can be inserted directly, as is. Let us call this the "CDATA method".
Alternatively, your XML files may contain encoded html. This is the "Encoded html method":
<?xml version="1.0" encoding="UTF-8" ?>
<book>
<chapter>
<block title="Introduction">
&lt;p&gt;Welcome to our book&lt;/p&gt;
&lt;p&gt;&lt;a href="#" title="click me"&gt;Continue reading &amp; browsing&lt;/a&gt;&lt;/p&gt;
</block>
</chapter>
</book>
Here all reserved XML characters <, > and & are replaced by &lt;, &gt; and &amp; respectively. That looks less readable but is actually the more common and proper way of doing things... After all, XML was invented by IT people ;-)
By the way, a real XML file would likely contain more than one <block> node, but I wanted to keep it simple here.
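If you want to convince yourself that both methods carry exactly the same html, here is a minimal Python sketch using the standard library XML parser. It is only an illustration and has nothing to do with how Wordbee reads files internally; the variable names and the helper block_html are my own, and the encoded sample assumes that <, > and & are written as &lt;, &gt; and &amp;.

```python
import xml.etree.ElementTree as ET

# CDATA method: the html sits inside the block as is.
CDATA_XML = """<book>
<chapter>
<block title="Introduction"><![CDATA[
<p>Welcome to our book</p>
<p><a href="#" title="click me">Continue reading & browsing</a></p>
]]></block>
</chapter>
</book>"""

# Encoded html method: reserved XML characters are written as entities.
ENCODED_XML = """<book>
<chapter>
<block title="Introduction">
&lt;p&gt;Welcome to our book&lt;/p&gt;
&lt;p&gt;&lt;a href="#" title="click me"&gt;Continue reading &amp; browsing&lt;/a&gt;&lt;/p&gt;
</block>
</chapter>
</book>"""


def block_html(xml_text):
    """Return the html stored in the first <block> node (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    return root.find(".//block").text.strip()


cdata_html = block_html(CDATA_XML)
encoded_html = block_html(ENCODED_XML)

# Both documents yield the same html once the XML parser has done its job.
print(cdata_html == encoded_html)
print(encoded_html)
```

In other words, the two methods differ only in how the html is written down in the file, not in what a parser hands back.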
Configuring Wordbee - Set up once, use forever
All that is left to do is to tell Wordbee which content requires translation. Click the "Settings" button in the top navigation menu, then "XML" in the list of document formats. Finally, click the link to create a new configuration:
- //block tells the system that all <block> nodes need to be translated. Since the content is html, I ticked that option to the right. Last but not least, tick "Encoded" if the content follows the Encoded html method. Untick the option for the CDATA method.
- //block/@title will extract the block titles <block title="xyz"> for translation. This one is just plain text. Important: Attribute rules like this should be put before the rule for extracting the node contents.
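To get a feel for what the two rules above select, here is a rough Python sketch with the standard library. It is an approximation for illustration only, not Wordbee's extraction logic. The second block ("Summary") is a made-up addition to the sample. Since ElementTree supports only a subset of XPath and cannot select attributes directly, the titles are read with get() instead of //block/@title.

```python
import xml.etree.ElementTree as ET

# Sample file; the second <block> is hypothetical and only there
# to show that the rules apply to every matching node.
SAMPLE = """<book>
<chapter>
<block title="Introduction"><![CDATA[<p>Welcome to our book</p>]]></block>
<block title="Summary"><![CDATA[<p>Thanks for reading</p>]]></block>
</chapter>
</book>"""

root = ET.fromstring(SAMPLE)

# Roughly what //block matches: every <block> node at any depth.
blocks = root.findall(".//block")

# Roughly what //block/@title matches: the title attribute of each block.
titles = [block.get("title") for block in blocks]

# The node contents, i.e. the html that needs translating.
bodies = [block.text for block in blocks]

for title, body in zip(titles, bodies):
    print(title, "->", body)
```

The titles come out as plain text and the bodies as html, which is why only the //block rule has the html option ticked.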
Translating a file
Advanced topics for experts
- The XML and the embedded html must use the same character encoding (utf-8, windows-1252...). You cannot mix different encodings, such as a utf-8 XML file embedding big5 html.
- The embedded html can be fragments as in the example above or a complete html page with headers, body element, javascript...
- If your html is xhtml compliant you can also insert the code without using CDATA and without encoding it. In that case you would untick the "Encoded" option in the configuration. You would then also need to declare all html entity references (such as &nbsp;) in the XML document.
- In the XML configuration page you can select an html configuration. Create your own if you need to do things such as excluding certain html texts from translation, choosing whether to extract Javascript texts, and much more.
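The point above about xhtml inserted without CDATA and without encoding deserves a small illustration. The sketch below, again plain Python and not anything Wordbee-specific, assumes a sample block that uses &nbsp; and shows that a standard XML parser rejects the file unless the entity is declared in the document. The names DOCTYPE, BODY and parses are my own.

```python
import xml.etree.ElementTree as ET

# Internal DTD subset declaring the html entity used below.
DOCTYPE = """<!DOCTYPE book [
  <!ENTITY nbsp "&#160;">
]>
"""

# xhtml embedded directly: no CDATA, no encoding.
BODY = """<book>
<chapter>
<block title="Introduction">
<p>Welcome&nbsp;to our book</p>
</block>
</chapter>
</book>"""


def parses(xml_text):
    """Return True if the text is accepted by the XML parser."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


declared_ok = parses(DOCTYPE + BODY)   # entity declared in the document
undeclared_ok = parses(BODY)           # entity used but never declared

# With the declaration in place, <p> is a real XML node inside <block>.
paragraph = ET.fromstring(DOCTYPE + BODY).find(".//block/p")

print(declared_ok, undeclared_ok)
print(repr(paragraph.text))
```

Note that in this variant the html tags become real XML nodes, so the file is only valid if the embedded html is well-formed xhtml.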
Do not hesitate to contact us if you need advice with your XMLs!



