+------------------------------------------------------------------------------+ | CREATING XSLT BACKENDS | +------------------------------------------------------------------------------+ Author(s): Sterling Hughes Introduction ------------------------------------------------------------------------------- Truth be told, at this point in time there are about a zillion and two different XSLT libraries, each with their own unique merits and faults. If you provide a Sablotron extension, people will clamor for a Xalan extension, if you provide a Xalan extension people will clamor for a libxslt extension. In order to be as user friendly as possible, we try and provide the most options to the user. At the same time we must try to keep a level of consistency, so the user does not need to remember 15 different syntaxes, etc. for each XSLT extension, and when switching from XSLT backends, no changes in the PHP code should be necessary (akin to the concept of a database independent api, but with XSLT libraries). At the same time, you'll also notice that in some cases extensions seem to duplicate each others functionality. All extensions need code for calling user-defined handlers, omitting debug messages, etc. In the interests of laziness, we must also try to make these as minimal as possible. Therefore, I've created a processor independent api for XSLT, aka, the XSLT extension (but doesn't "A processor independent API for XSLT" sound cooler?). It defines a set of functions which every XSLT backend must provide, as well as a syntax which those functions must adhere to. Furthermore, the underlying code, provides a "library" if you will, of code that is relevant to all XSLT extensions. The API ------------------------------------------------------------------------------- Every extension must define the following functions: - xslt_sax_create() - xslt_sax_set_scheme_handlers() - xslt_sax_set_sax_handlers() - xslt_sax_set_error_handler() - xslt_sax_set_log() - xslt_sax_process() - xslt_error() - xslt_errno() - xslt_sax_free() These functions are common or implementable with every single XSLT library that I've come across so far (at least every C library) and should there for be defined by the extension. resource xslt_sax_create(void) The XSLT create function allocates a new XSLT processor and returns a resource pointer to the XSLT processor. It also handles any initialization that the processor requires. void xslt_sax_set_scheme_handlers(resource processor, array handlers) Registers the scheme handlers for the document (aka XPath handlers), given a XSLT processor resource (allocated by xslt_sax_create()) and an array in the following format: array( "get_all" => function, "open" => function, "get" => function, "put" => function, "close" => function ) Where function is either a function name or an array in the following format: array(&$obj, "method") Note: You do not need to handle the array(&$obj, "method") syntax by yourself as this is handled in the call_xslt_function() library function (and more specifically, Zend's call_user_function_ex() function. Note: The given array does not need to contain all of the different scheme handler elements (although it can), but it only needs to conform to the "handler" => "function" format described above. Each of the individual scheme handler functions called are in the formats below: string get_all(resource processor, string scheme, string rest) resource open(resource processor, string scheme, string rest) int get(resource processor, resource fp, string &data) int put(resource processor, resource fp, string data) void close(resource processor, resource fp) void xslt_sax_set_sax_handlers(resource processor, array handlers) Registers the SAX handlers for the document, given a XSLT processor resource (allocated by xslt_sax_create()) and an array in the following format: array( "document" => array(document_start_function, document_end_function), "element" => array(element_start_function, element_end_function), "namespace" => array(namespace_start_function, namespace_end_function), "comment" => function, "pi" => function, "character" => function ) Where the functions follow the syntax described for the scheme handler functions. Each of the individual SAX handler functions are in the format below: void start_doc(resource processor) void end_doc(resource processor) void start_element(resource processor, string name, array attributes) void end_element(resource processor, string name) void start_namespace(resource processor, string prefix, string uri) void end_namespace(resource processor, string prefix) void comment(resource processor, string contents) void pi(resource processor, string target, string contents) void characters(resource processor, string contents) void xslt_sax_set_error_handler(resource processor, function error_handler) This function sets the user defined error handler to be called when a processing or any other type of error occurs. It is given a XSLT processor resource (allocated by xslt_sax_create()) and an error function of the same syntax described for the scheme handler function. The user defined error handler as the following syntax: void error(resource processor, int level, int error, array info) void xslt_sax_set_log(resource processor, string logname) Sets the XSLT log to record log information (processor messages, not errors). Its given a XSLT processor (allocated by xslt_sax_create()) and a string containing the name of the log file. If the string is "php://stderr" then the logging should go to standard error (stderr). Also the default place to send log messages is standard error (if no log file is set). mixed xslt_sax_process(resource processor, string xml, string xsl[, string result[, array arguments[, array parameters]]]) This function performs the magic, it takes the user's data, performs the transformation and depending on the context either saves the result to a file or returns the data to the user. To understand the way the xslt_sax_process() function works, you must first understand the concept of "argument buffers". Argument buffers are equivalent to the concept of symlinks on a Unix system, take the following example: $xml, "/_xsl" => $xsl); $xh = xslt_sax_create(); $data = xslt_sax_process($xh, "arg:/_xml", "arg:/_xsl", NULL, $args); xslt_sax_free($xh); print( "The results of the transformation were\n" ); print( "
\n
\n
" ); print( $data ); print( "
\n
\n
" ); ?> See what was done? The argument buffer was declared ($args) and the different arguments were defined. Then when the xslt_sax_process() function was called instead of giving the XML filename and XSLT filename we instead gave "arguments", which correspond to the XML and XSLT data in the argument buffers. This concept is a bit foreign to some people, however, I find it the best way to handle processing xsl data. If you're still having trouble with this, try playing around with the sablotron backend a bit, you should be able to catch on pretty quickly. In order to use argument buffers, the XSLT extension provides a couple of easy to use API functions, you can use them as follows: { zval **arguments_zp; char **arguments_cp; xslt_args *arguments; char *types[] = { "file", "data" }; /* Fetch the arguments from the user into a zval ** */ /* Translate the zval array into a character array */ xslt_make_array(&arguments_cp, arguments_zp); /* Translate the character array into an xslt_arg * structure */ arguments = xslt_parse_arguments(arguments_cp); /* Print out the resulting xslt_arg * structure */ php_printf("XML type: %s\n", types[arguments->xml.type]); php_printf("XML data: %s\n", arguments->xml.ptr); PUTS("\n"); php_printf("XSLT type: %s\n", types[arguments->xsl.type]); php_printf("XSLT data: %s\n", arguments->xsl.ptr); PUTS("\n"); php_printf("Result type: %s\n", types[arguments->result.type]); php_printf("Result data: %s\n", arguments->result.ptr); PUTS("\n"); } You can also test the "type" field by using the XSLT_IS_FILE and XSLT_IS_DATA constants. Anyway back to the syntax of the xslt_process() function. The first argument to the xslt_process() function is a resource pointer to the XSLT processor to be used. The second argument is either an "argument buffer" pointing to the XML data or the name of a file containing the XML data. The third argument is either an argument buffer pointing to the XSLT data or a file containing the XSLT data. The fourth argument is optional, it either contains the name of the file to place the results of the transformation into, NULL or "arg:/_result", in the latter 2 cases, the results of the transformation will be returned. The fifth optional argument is the "argument buffer" itself, it is an associative PHP array of "argument_name" => "value" pairs, or NULL, if no arguments are to be passed. The final optional argument is a set of parameters to pass to the XSLT stylesheet. The parameter argument is an associative array of "parameter_name" => "value" pairs. string xslt_error(resource processor) The xslt_error() function returns the last error that occured, given a XSLT processor resource (allocated by xslt_sax_create()). int xslt_errno(resource processor) The xslt_errno() function returns the last error number that occured given a XSLT processor resource (allocated by xslt_sax_create()). void xslt_sax_free(resource processor) The xslt_free() function free's the given XSLT processor resource (allocated by xslt_sax_create()). Config.m4 ------------------------------------------------------------------------------- The XSLT extension's "magic" really occurs in the config.m4 file. Here you must add a couple of things in order for your backend to be enabled. Its a bit too complex to describe (but easy to implement and understand). Take a look at config.m4 (which is well commented) to see what is necessary. Makefile.in ------------------------------------------------------------------------------- Simply add the source files for your backend to the LTLIBRARY_SOURCES variable and you're all set with this file. Conclusion ------------------------------------------------------------------------------- Nobody's perfect, I'm sure I've made some mistakes while thinking this whole thing through and I would be glad to hear from any of you who think I'm a colossal moron and think you have a better way to do it. Please e-mail at sterling@php.net, this extension will only get better with feedback. With that said, the concepts here may take a little bit of time to sink in, I know I've written a whole lot. My suggestion to you, if you're planning on writing an XSLT backend is simply to go off and implement, taking the api section as a guide and making sure you match that api as closely as possible.