Optimising Paloose Performance
Caching the Sitemap
So how do we improve this situation? After much testing an interesting conclusion
surfaced: caching the sitemap in Paloose, while it undoubtedly provides some speed
increase, was not the most effective way of improving things.
I rewrote a substantial part of Paloose so that it cached the sitemap as a precompiled
version. The sitemap parsers were changed to provide a code generation function that
produces an equivalent PHP representation of the sitemap XML. For example, the sitemap
above would be compiled to the following PHP (or something very similar):
<?php
class CachedSitemap {
    private $gRequestParameters = array('url'=>'index.html','resource'=>'index.html',);
    private $gParameters = array();
    function __construct()
    {
    }
    public function run( $inURL, $inQueryString, $inInternalRequest )
    {
        global $gVariableStack;
        global $gSitemapStack;
        $sitemap = new Sitemap_3858cff740af348b8a52174be329505d();
        $gSitemapStack->push( $sitemap );
        $sitemap->run( $inURL, $inQueryString, $inInternalRequest );
        $gSitemapStack->pop();
    }
}

require_once( '/var/www/html/simpleSite/../paloose-cached/lib/generation/FileGenerator.php' );
require_once( '/var/www/html/simpleSite/../paloose-cached/lib/transforming/TRAXTransformer.php' );
require_once( '/var/www/html/simpleSite/../paloose-cached/lib/transforming/I18nTransformer.php' );
require_once( '/var/www/html/simpleSite/../paloose-cached/lib/serialization/HTMLSerializer.php' );
require_once( '/var/www/html/simpleSite/../paloose-cached/lib/serialization/XHTMLSerializer.php' );
require_once( '/var/www/html/simpleSite/../paloose-cached/lib/serialization/TextSerializer.php' );
require_once( '/var/www/html/simpleSite/../paloose-cached/lib/serialization/XMLSerializer.php' );
require_once( '/var/www/html/simpleSite/../paloose-cached/lib/reading/ResourceReader.php' );
require_once( '/var/www/html/simpleSite/../paloose-cached/lib/matching/WildcardURIMatcher.php' );
require_once( '/var/www/html/simpleSite/../paloose-cached/lib/matching/RegexpURIMatcher.php' );
require_once( '/var/www/html/simpleSite/../paloose-cached/lib/selection/BrowserSelector.php' );

class Sitemap_3858cff740af348b8a52174be329505d {
    private $gOutputStream;
    function __construct()
    {
        $this->gOutputStream = new OutputStream( OutputStream::STANDARD_OUTPUT );
    }
    public function run( $inURL, $inQueryString, $inInternalRequest ) {
        global $gVariableStack;
        try { // Pipelines parse
            { // Pipeline parse
                if ( ( $matchArray = Match::match( 'WildcardURIMatcher', $inURL, '**.html' ) ) != NULL ) {
                    $gVariableStack->push( $matchArray );
                    $dom = GeneratorPipeElement::generate( 'FileGenerator', 'context://content/{1}.xml',
                        'xml-content', $matchArray, $this->gRequestParameters );
                    $this->gParameters = new Parameter();
                    $this->gParameters->setParameterList( $this->gRequestParameters );
                    $this->gParameters->setParameter( 'page', '{1}' );
                    $dom = TransformerPipeElement::transform( 'TRAXTransformer',
                        'context://resources/transforms/page2html.xsl',
                        $dom, $label, $matchArray, $this->gParameters, $gVariableStack );
                    $this->gParameters = new Parameter();
                    $this->gParameters->setParameterList( $this->gRequestParameters );
                    $dom = TransformerPipeElement::transform( 'TRAXTransformer',
                        'context://resources/transforms/stripNamespaces.xsl',
                        $dom, $label, $matchArray, $this->gParameters, $gVariableStack );
                    $this->gParameters = new Parameter();
                    $dom = SerializerPipeElement::serialize( 'HTMLSerializer',
                        $dom, $label, $matchArray, $this->gParameters, $this->gOutputStream );
                    $gVariableStack->pop();
                }
            } // Pipeline parse
        } catch ( ExitException $e ) { // Pipelines parse
            throw new ExitException();
        } catch( UserException $e ) {
            // handle error pipeline
        } catch( RunException $e ) {
            // handle error pipeline
            $dom = GeneratorPipeElement::generate( 'FileGenerator', 'context://content/error.xml',
                '', $matchArray, $this->gRequestParameters );
            $this->gParameters = new Parameter();
            $this->gParameters->setParameterList( $this->gRequestParameters );
            $dom = TransformerPipeElement::transform( 'TRAXTransformer',
                'context://resources/transforms/error2html.xsl',
                $dom, $label, $matchArray, $this->gParameters, $gVariableStack );
            $this->gParameters = new Parameter();
            $dom = SerializerPipeElement::serialize( 'HTMLSerializer',
                $dom, $label, $matchArray, $this->gParameters, $this->gOutputStream );
        }
    }
}
?>
Not the prettiest code, but compiled code is not designed to be. Running this cached base
system gave the following results:
It was clear that exploring other avenues would be more fruitful. After a couple of weeks
of trials I concluded that looking at the design of Paloose and its PHP was the best way
to continue. Changing the XML, XSLT and sitemap of a site really did not have as much
effect as I wanted.
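To give a feel for the code generation step used in the caching experiment, here is a minimal sketch of how a parser might walk a sitemap fragment and emit PHP source for one match block. It is not Paloose's actual generator. The function names compileSitemap, compileMatch and compileComponent, the emitted helper calls (matchWildcard, generate, transform, serialize), the sample sitemap fragment and the namespace URI are all assumptions made purely for illustration.

```php
<?php
// Hypothetical sketch only: none of these names come from Paloose itself.
// Assumed namespace URI for the sitemap elements.
const SITEMAP_NS = 'http://apache.org/cocoon/sitemap/1.0';

// Emit one line of PHP source for a single pipeline component.
function compileComponent( DOMElement $inNode ): string
{
    // var_export() produces a safely quoted PHP string literal.
    $src = var_export( $inNode->getAttribute( 'src' ), true );
    switch ( $inNode->localName ) {
        case 'generate':
            return "\$dom = generate( $src, \$matchArray );";
        case 'transform':
            return "\$dom = transform( $src, \$dom, \$matchArray );";
        case 'serialize':
            $type = var_export( $inNode->getAttribute( 'type' ), true );
            return "serialize( $type, \$dom );";
        default:
            throw new InvalidArgumentException( "Unsupported element: {$inNode->localName}" );
    }
}

// Emit an if-block for one match element and the components inside it.
function compileMatch( DOMElement $inMatch ): string
{
    $pattern = var_export( $inMatch->getAttribute( 'pattern' ), true );
    $lines = array( "if ( ( \$matchArray = matchWildcard( \$inURL, $pattern ) ) !== null ) {" );
    foreach ( $inMatch->childNodes as $child ) {
        if ( $child instanceof DOMElement ) { // skip whitespace text nodes
            $lines[] = '    ' . compileComponent( $child );
        }
    }
    $lines[] = '}';
    return implode( "\n", $lines );
}

// Parse the sitemap XML and return the generated PHP source as a string.
function compileSitemap( string $inXML ): string
{
    $doc = new DOMDocument();
    if ( !$doc->loadXML( $inXML ) ) {
        throw new RuntimeException( 'Sitemap XML could not be parsed' );
    }
    $blocks = array();
    foreach ( $doc->getElementsByTagNameNS( SITEMAP_NS, 'match' ) as $match ) {
        $blocks[] = compileMatch( $match );
    }
    return implode( "\n", $blocks );
}

// Assumed sample sitemap fragment, loosely modelled on the listing above.
$sampleXML = <<<'XML'
<map:sitemap xmlns:map="http://apache.org/cocoon/sitemap/1.0">
  <map:pipelines>
    <map:pipeline>
      <map:match pattern="**.html">
        <map:generate src="context://content/{1}.xml"/>
        <map:transform src="context://resources/transforms/page2html.xsl"/>
        <map:serialize type="html"/>
      </map:match>
    </map:pipeline>
  </map:pipelines>
</map:sitemap>
XML;

echo compileSitemap( $sampleXML ), "\n";
```

The generated text would then be written to a cache file and included on later requests instead of re-parsing the XML. A real generator would also have to handle parameters, selectors, error pipelines and nested sitemaps, which this sketch leaves out.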
The next page describes these changes and their implications.
Copyright 2006 – 2023 Hugh Field-Richards. All Rights Reserved.