Making zippy HTML (or other zippy output)
Have you ever wished that DITA-OT could return your files as a zip, instead of thousands of little files? Are you willing to experiment with a small DITA-OT plugin, or tweak one you've already got? If no – that's great! Have a good day! Otherwise – read on to learn about a new feature in DITA-OT 2.5!
A self-serving post about a self-serving feature?
This post is really a sort of tutorial on how to make use of a new feature in DITA-OT 2.5. Specifically, an under-the-covers feature that allows a plugin to easily tweak where any output files are generated. It's a function I needed over and over, got tired of working around, and submitted as a new feature to DITA-OT 2.5.
Before version 2.5 (going all the way back to DITA-OT 1.0), the toolkit always generated result files directly in the specified output directory. Basically, you set output.dir to indicate where files should go, and we generate them there rather than generating everything inside the temp/ directory and then copying. Generally this is good. Creating stuff inside temp/ just so that we can copy it someplace else is a waste of time.
But.
It also makes some things difficult, like post-processing.
I don't like to run DITA-OT processes inside the output directory. I worry about touching files already in output.dir. Unlike a temp/ directory, I don't know what else is already in an output directory, so I'm in danger of modifying (or corrupting!) something unrelated to my build. If my process fails, the output directory is left full of files in an unknown state. These things are all bad.
DITA-OT already sets up a tidy little place in temp/ to do a series of things. We do most of our processing there, generating and cleaning up a lot of files. If I want to do that, plus one extra thing at the end, I should be able to use that same directory – just like I'd use that same directory to do anything in the middle of the process.
Using temp/ in that way used to be hard. Now it's easy.
Self-serving? Sure. This makes my life easier. But I hope it also makes your lives easier.
A new parameter is born
DITA-OT 2.5 defines a new internal parameter for use by plugin developers. The new parameter is called temp.output.dir.name. Here's the general idea:
- You set temp.output.dir.name as part of the initialization for your custom transform type. It should be a directory name (a relative directory - probably just one word). If you're not familiar with adding an initialization step to the start of a transform type, I'd recommend reading about how and why in the Happy HTML tutorial.
- Next, the everybody-must-use-this build-init target will set up a property (dita.output.dir) that combines your value with the usual temporary directory name. For example, setting the new parameter to zippy will get you a property like /path/to/temp12345/zippy/. DITA-OT 2.5 places all output files into the directory specified by dita.output.dir.
- If you didn't initialize the temp.output.dir.name parameter? No need to worry: dita.output.dir is set to your specified output directory. So, you're still good, and nothing has changed.
Don't set dita.output.dir directly. Trying to override it will Mess Things Up. Just keep specifying the output directory like you always have.
For example, I set up my new "zipme" transform type so that it sets temp.output.dir.name to zippy, and then runs the normal dita2html5 target. By doing that, everything that would normally go in the output directory is now in temp12345/zippy/. Initializing that one parameter has redirected all of my output to a clean spot for post-processing – without the need for an alternate temporary directory or the need to mess about in the output directory.
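To make the fallback logic concrete, here is a minimal Ant sketch of how a property like dita.output.dir could be derived. It is an illustration only, not the toolkit's actual build-init code; the project name, the target name, and the stand-in values for dita.temp.dir and output.dir are assumptions made up for this example.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Illustrative sketch only: NOT the real build-init target. -->
<project name="output-dir-sketch" default="show-output-dir">
  <target name="show-output-dir">
    <!-- Stand-in values (assumptions); DITA-OT normally supplies these. -->
    <property name="dita.temp.dir" location="temp12345"/>
    <property name="output.dir" location="out"/>
    <!-- If temp.output.dir.name is set, build inside the temp directory;
         otherwise fall back to the regular output directory. -->
    <condition property="dita.output.dir"
               value="${dita.temp.dir}${file.separator}${temp.output.dir.name}"
               else="${output.dir}">
      <isset property="temp.output.dir.name"/>
    </condition>
    <echo message="dita.output.dir = ${dita.output.dir}"/>
  </target>
</project>
```

Running this sketch with -Dtemp.output.dir.name=zippy should echo a path that ends in the zippy subdirectory of temp12345; running it without that property should echo the plain output directory instead.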
Could you do this another way? Absolutely. But I found it was surprisingly common to do [normal processing] followed by [one little post process step like zipping]. I asked other DITA-OT contributors and found out I was not alone.
Given how often I and others need to do this, it should be easy. And I generally feel that to make things clean, elegant, and most importantly not silly, we should be able to do those operations in the same temporary directory where we do everything else. That's … sort of what the directory is there for, isn't it?
Sample: zipping HTML5
With this parameter, if all I want is to use the default HTML5 build, my plugin needs just 3 files.
- plugin.xml: the file every plugin needs to declare itself.
- A simple XML file that says where to find any Ant code added by the plugin (I've named it conductor.xml).
- The build file with my Ant code. To create a new transform type that returns zipped HTML, I need three targets.
  - dita2html5zip, which runs an initialization target, followed by the normal HTML5 target, followed by one target to zip the result.
  - html5zip.init, my initialization target that just sets the new temp.output.dir.name property.
  - ziphtml5, a target that zips up the result files from dita.output.dir and writes the zip directly to the output directory.
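As a rough sketch of the first two files, assuming the long-standing dita.conductor.transtype.check and dita.conductor.target.relative extension points, they might look something like this. The plugin ID is borrowed from the download link later in this post; the build file name build_html5zip.xml is a placeholder chosen for this example, and the real plugin may declare things differently.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- plugin.xml (sketch) -->
<plugin id="org.metadita.html5zip">
  <!-- Register "html5zip" as a valid transform type. -->
  <feature extension="dita.conductor.transtype.check" value="html5zip"/>
  <!-- Point DITA-OT at the wrapper file that imports the Ant code. -->
  <feature extension="dita.conductor.target.relative" file="conductor.xml"/>
</plugin>
```

The wrapper file then only needs to import the build file. The root element name is arbitrary in this pattern, and build_html5zip.xml is again a placeholder:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- conductor.xml (sketch) -->
<dummy>
  <import file="build_html5zip.xml"/>
</dummy>
```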
Please do it for me now
Ok, but first I want to show you how it's done.
The following build file does the bare minimum, meaning it builds HTML5 – exactly the same HTML5 you'd get today, using whatever other properties you've set – and returns a zip file. Because it's the bare minimum, this doesn't let you control the zip file name. It always returns ThisIsFun.zip.
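<?xml version="1.0" encoding="UTF-8"?>
<project xmlns:dita="http://dita-ot.sourceforge.net" name="dita2html5zip">
  <!-- Run the initialization target, then the normal HTML5 build, then the zip step. -->
  <target name="dita2html5zip" depends="html5zip.init,dita2html5,ziphtml5"/>
  <!-- Redirect all output into a subdirectory of the temporary directory. -->
  <target name="html5zip.init">
    <property name="temp.output.dir.name" value="internal_zip_dir"/>
  </target>
  <!-- Zip everything in dita.output.dir and write the zip to the real output directory. -->
  <target name="ziphtml5">
    <zip destfile="${output.dir}${file.separator}ThisIsFun.zip" basedir="${dita.output.dir}"/>
  </target>
</project>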
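Once the plugin is installed, the new transform type is requested like any other. As a hedged sketch, a small wrapper build could call the toolkit's main build.xml with transtype set to html5zip. The dita.dir location, the map path, and the output path are placeholders for this example, and it assumes Ant is started with the toolkit's libraries already on the classpath.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical wrapper build; all paths below are placeholders. -->
<project name="run-html5zip" default="build-zip">
  <!-- Assumption: point this at your own DITA-OT 2.5 installation. -->
  <property name="dita.dir" location="/path/to/dita-ot-2.5"/>
  <target name="build-zip">
    <ant antfile="${dita.dir}${file.separator}build.xml">
      <!-- Placeholder input map and output directory. -->
      <property name="args.input" value="/path/to/docs/sample.ditamap"/>
      <property name="output.dir" value="/path/to/out"/>
      <!-- DITA-OT runs the target named "dita2" plus this value,
           which here is dita2html5zip. -->
      <property name="transtype" value="html5zip"/>
    </ant>
  </target>
</project>
```

With the bare-minimum build file above, the only thing that should land in the output directory is ThisIsFun.zip.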
Uhhh that's not a very good zip name
Well yeah. But I wanted to illustrate how simple this could be.
If I want to use the map name by default – or maybe allow somebody to customize the zip name with a build parameter – I could make the zip target a bit more complicated.
<target name="ziphtml5">
  <condition property="html5.zipname" value="${dita.map.filename.root}.zip">
    <and>
      <isset property="dita.map.filename.root"/>
      <not><isset property="html5.zipname"/></not>
    </and>
  </condition>
  <zip destfile="${output.dir}${file.separator}${html5.zipname}" basedir="${dita.output.dir}"/>
</target>
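One case the target above doesn't handle: if neither property is set, Ant leaves the literal text ${html5.zipname} in the file name. Because Ant properties are immutable once set, a plain property task placed after the condition can act as a last-resort default. The following is only a sketch of that idea; the fallback file name matches the default mentioned later in this post, but the downloadable plugin's real code may differ.

```xml
<target name="ziphtml5">
  <!-- Use the map name unless the caller already supplied html5.zipname. -->
  <condition property="html5.zipname" value="${dita.map.filename.root}.zip">
    <and>
      <isset property="dita.map.filename.root"/>
      <not><isset property="html5.zipname"/></not>
    </and>
  </condition>
  <!-- Last resort: this only takes effect if html5.zipname is still unset,
       because Ant never overwrites an existing property. -->
  <property name="html5.zipname" value="please.set.html5.zipname.property.zip"/>
  <zip destfile="${output.dir}${file.separator}${html5.zipname}" basedir="${dita.output.dir}"/>
</target>
```

The order matters here: the fallback property has to come after the condition, otherwise it would always win.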
Just gimme the plugin already
Here's a zip: http://metadita.org/toolkit/org.metadita.html5zip.zip.
Full disclosure: it's a bit more complicated than what I showed above. Specifically:
- The zip name defaults to the map name, as in the previous section. It also falls back to the topic name if the input is a topic.
- The zip task sets up a new parameter _map.dir.within.temp.zipdir. As you might expect, this is the directory of the map within the new temporary output directory. That is – if there are topics referenced above the map directory, those don't go in my zip, because I want the zip to start at the map level. It's like setting generate.copy-outer=1 for your build.
- This plugin defines its own template and Ant extension point depend.html5.postprocess.before.zip. The extension lets you use the html5zip transform type together with your own post-processing. It works just like preprocessing extension points in the rest of DITA-OT, by adding a dependency that will always run as part of the "html5zip" transform type. Anything you add here will run after the normal HTML5 build but before the results are zipped; this means you can add to or post-process everything in dita.output.dir before the zip is created.
- Rather than ThisIsFun.zip, if it can't figure out an input map or topic name, it defaults to the zip name please.set.html5.zipname.property.zip.
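To illustrate the extension point, here is a hedged sketch of how a second plugin might hook in a step that runs before the zip is created. The plugin ID com.example.html5zip.extras, the target name example.add-build-note, and the file it writes are all hypothetical; only the extension point name comes from the plugin described above, and the sketch assumes it is used the same way as the toolkit's preprocessing extension points.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- plugin.xml for a hypothetical add-on plugin -->
<plugin id="com.example.html5zip.extras">
  <require plugin="org.metadita.html5zip"/>
  <!-- Wrapper file that imports the build file holding the target below. -->
  <feature extension="dita.conductor.target.relative" file="conductor.xml"/>
  <!-- Ask the html5zip transform type to run this target before zipping. -->
  <feature extension="depend.html5.postprocess.before.zip" value="example.add-build-note"/>
</plugin>
```

The target itself would live in the add-on's build file and work against dita.output.dir, so anything it writes ends up inside the zip:

```xml
<!-- Hypothetical post-processing step: drop a note next to the HTML5 files. -->
<target name="example.add-build-note">
  <echo file="${dita.output.dir}${file.separator}BUILD-NOTE.txt">Packaged by the html5zip transform type.</echo>
</target>
```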
Adapting the plugin to other default output formats
The most obvious candidate for this sort of extension is eclipsehelp, which generates a whole mess of files from a complicated string of targets. But that's not needed, thanks to the new parameter that lets DITA-OT generate an Eclipse JAR file. (Why, yes, the implementation of that Eclipse JAR feature does use exactly the process I'm writing about here. And yes, this is a plug for yet another new DITA-OT 2.5 feature.)
For any other output format, you could just take the plugin above and tweak it to return a zip of your xhtml, troff, or even PDF. For example, the following Ant code sets up an xhtmlzip transform type that creates and zips XHTML instead of HTML5.
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns:dita="http://dita-ot.sourceforge.net" name="dita2xhtmlzip">
  <target name="dita2xhtmlzip" depends="xhtmlzip.init,dita2xhtml,zipxhtml"/>
  <target name="xhtmlzip.init">
    <property name="temp.output.dir.name" value="internal_zip_dir"/>
  </target>
  <target name="zipxhtml">
    <zip destfile="${output.dir}${file.separator}ThisIsFun.zip" basedir="${dita.output.dir}"/>
  </target>
</project>
Why yes, I really did just do a search/replace and change html5 to xhtml.
Why yes, it really was that simple.
Adapting the plugin to custom output formats
This could be a bit trickier.
If your custom transform type doesn't actually create any output files of its own, then you can use the same process as above: just initialize the temp.output.dir.name parameter at the start and add a new zip target at the end.
If you do generate any output files, it may take a few days of work to update your plugin. Here's the process:
- For every Ant target that uses output.dir to generate or copy a file, change output.dir to dita.output.dir. A nice search/replace tool works wonders here.
- Wait a few days, just to draw things out. I suggest reading a book.
- As above, initialize the temp.output.dir.name parameter at the start of your transform type and add a new zip target at the end.
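As an illustration of the first step, here is a hypothetical custom target before and after the change. The target names, the fileset, and the example.extras.dir property are invented for this example; the only real change is swapping output.dir for dita.output.dir.

```xml
<!-- Before: copies extra files straight into the final output directory. -->
<target name="example.copy-extras.before">
  <copy todir="${output.dir}">
    <fileset dir="${example.extras.dir}" includes="**/*.css"/>
  </copy>
</target>

<!-- After: copies into dita.output.dir instead. When temp.output.dir.name is
     set, that is the clean spot inside the temp directory; when it isn't,
     dita.output.dir is the regular output directory, so nothing changes. -->
<target name="example.copy-extras">
  <copy todir="${dita.output.dir}">
    <fileset dir="${example.extras.dir}" includes="**/*.css"/>
  </copy>
</target>
```

Because dita.output.dir falls back to the usual output directory, the updated target should keep working in builds that never set the new parameter.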
For my own purposes, I had a number of custom transform types that did little more than initialize several parameters, run a normal XHTML or HTML5 build, and maybe generate a couple extra output files. Updating those to use this new feature took just a few minutes, and I no longer have to worry about any new output files that DITA-OT generates in the future.
Summary
- There's a new feature in DITA-OT 2.5 that simplifies post-processing of content.
- That processing can be (and often is) as simple as "turn this big set of files into one zip".
- The plugin linked above is meant as a tutorial or sample - if you need to zip up output, take it and tweak the transform type as needed.
- If you just want the current HTML5 output, but need it zipped … use the plugin! It works [with DITA-OT 2.5 or later]! Yay!
- If you have your own transform types, and need to add post-processing or zipping, the new temp.output.dir.name parameter is here to make your life easier – and it really shouldn't take you long to work it in. (Just remember you'll need to be on DITA-OT 2.5 for it to work.)
Good luck, and happy plugin-ing!