XProc: An XML Pipelining Language
An XML Pipeline specifies a sequence of operations to be performed on one or more XML documents, producing one or more XML documents as output. Steps in the pipeline may read or write non-XML resources as well.
XProc is currently a W3C Working Draft. Despite the involvement of Norm Walsh, whom I greatly respect, I wonder whether we actually need another programming language with the worst possible syntax.
We need it because without it we have no interoperable way of passing around XML documents for processing. I can send you an Ant script, except maybe you don’t have Ant or even Java installed. I can send you a Makefile, except maybe you don’t have make or are on a platform that doesn’t have make. I can send you … well, you get the idea. Small. Declarative. Simple. Interoperable. That’s the plan.
As for “the worst possible syntax”, I’m afraid that’s likely to be a matter of opinion. But it’s not impossible to imagine that an alternate, compact syntax for XProc might be devised.
But if I send someone an XProc script, I assume they have an XProc processor in place … I have two questions. The first is whether XML pipeline processing is something that can be (and should be) described in a general-purpose language, such as Python, Java, or Ruby, or whether it’s a good candidate for a DSL. I’m a DSL fan, so I would probably agree that a DSL is the right choice. The second concerns the “worst possible syntax” remark: I did not intend to criticize the specific XML language, but rather the use of XML in the first place.
You’re right about assuming they have an XProc processor. I hope that XProc processors quickly become as ubiquitous as XSLT processors. One of my strongest design goals is to keep it small and simple so as to maximize that possibility. I hope we see implementations in Python, Java, and Ruby (and C and PL/I :-) so that it’s easy to use regardless of your platform of choice.
As for XML, well, I have the polar opposite view. I think an XML pipeline language that wasn’t expressed in XML would be (almost) entirely pointless.
Why? I know you’re using RELAX, and I assume you’re using RNC as well. I believe the issues are very similar. Maybe it’s just a question of priorities: whether to do the compact syntax first and then do the XML as an interoperable and easily parseable interchange format, or vice versa?