XProc 3.0 - Home

XProc is an XML-based programming language for processing documents in pipelines: chaining conversions and other steps together to achieve the desired results. The current version is 3.0.

XProc has been around, in its 1.0 version, since 2010. All information about this older version can be found here.

The following are important sources of information about XProc 3.0:

The XProc 3.0 specification is maintained by Achim Berndzen, Gerrit Imsieke, Erik Siegel and Norman Tovey-Walsh.

XProc in a nutshell

Now why and when would this be useful? In the physical world, pipelining and working in specialized steps is not unusual. Take for instance an oil refinery: it takes crude oil as its input and, through a series of steps and intermediate products, produces petrol/gasoline, kerosene, diesel, etc. Just one look shows that refineries take the word "pipeline" very literally...

A classic example from the IT world is, of course, the UNIX pipeline. Some command produces output and we process it further (with, for instance, grep, tail or head) to get the information needed. The character used for chaining steps, |, is even called the "pipe" character!

So why would we do this in the world of information and document processing? One of the main reasons is that data is often not in the format we need it to be. Some examples:

For straight transformation of XML data there are languages available, like XSLT and XQuery. But more often than not, tasks are more complex than a single transformation can handle: chaining, splitting and merging come into play. Surrounding the transformations you need housekeeping, such as deciding where to read from or write to, inspecting directories and zip files, and writing logs. Also, from a software engineering point of view, it is often desirable to work in smaller steps to get more legible and more maintainable code. This is where XProc comes in: a single executable language to express all of this.
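
To make the idea of chaining concrete, here is a minimal sketch of what an XProc 3.0 pipeline with two consecutive XSLT transformations could look like. The stylesheet names first.xsl and second.xsl are hypothetical placeholders, not files that come with XProc; a real pipeline would typically add the housekeeping steps mentioned above.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0">

  <!-- The document to process arrives on the primary input port -->
  <p:input port="source"/>
  <!-- The output of the last step is emitted on the primary output port -->
  <p:output port="result"/>

  <!-- Step 1: first transformation (first.xsl is a hypothetical stylesheet) -->
  <p:xslt>
    <p:with-input port="stylesheet" href="first.xsl"/>
  </p:xslt>

  <!-- Step 2: implicitly receives the result of step 1 as its source
       (second.xsl is again a hypothetical stylesheet) -->
  <p:xslt>
    <p:with-input port="stylesheet" href="second.xsl"/>
  </p:xslt>

</p:declare-step>
```

Steps that follow one another are connected implicitly: the primary output of one step flows into the primary input of the next, much like the | character does in a UNIX pipeline.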