Zend_Cache_Frontend_Page & Google Analytics cookies

Last week I was optimizing the caching strategy for one of my projects and I was under the false impression that caching was accelerating my website.

At that time I was using Zend_Cache_Backend_File as my backend and cache files were created. My goal was to boost the performance by implementing the Zend_Cache_Backend_Memcached backend. Again everything seemed to be working fine because my Memcached logging displayed activity of some sort.

I was stunned to hear that all of those cache files were in fact created, but never read. So there I was, optimizing everything, while in reality I was only creating extra overhead. So I started my quest and found the solution (sorry for the spoiler).

Page caching

We use Zend_Cache_Frontend_Page as the frontend of our caching mechanism because it is so simple to use:

$cache = Zend_Cache::factory('Page',
    'File',
    $frontendOptions,
    $backendOptions);
 
$cache->start();

The start method just triggers the caching mechanism and doesn’t need any more attention. It just takes care of itself. The output buffering stores the output in the caching backend. Caching keys are used to identify cached records. Via save() and load() we can interact with the caching backend.

The frontend options

In the previous example I didn’t specify the frontend options, but actually they are essential to this blog post. That’s why I dedicate a chapter to it.

$frontendOptions = array(
    'lifetime' => 86400,
    'default_options' => array(
        'cache' => false
        ),
    'regexps' => array(
        '^/(nl|fr|en)(.*)$' => array('cache' => true,
            'cache_with_get_variables' => true,
            'cache_with_session_variables' => true,
            'cache_with_cookie_variables' => true,
            'make_id_with_get_variables' => true,
            'make_id_with_session_variables' => true,
            'make_id_with_cookie_variables' => false
            )
        ),
        'debug_header' => false
    );

The example above shows the options that are passed in the $frontendOptions array. The cache_with_… option keys are quite important because they make sure that caching stays enabled when cookies, session data or GET variables are present. Sometimes you want caching to be disabled when there’s dynamic content involved. We decided to continue caching when there are cookies.

The make_id_with_… option keys don’t decide if caching is enabled or disabled, but they decide how dynamic content is stored. The next chapter explains how Zend_Cache_Frontend_Page stores its data.
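How the two families of keys interact can be modelled in a few lines. This is a simplified sketch of the behaviour as I read it, not the framework source: partialId() is a hypothetical name, and the real method also has to pick the right superglobal.

```php
<?php
/**
 * Simplified stand-in for the per-superglobal decision (hypothetical helper).
 *
 * @param array $vars       values of one superglobal, e.g. $_COOKIE
 * @param bool  $cacheWith  the cache_with_..._variables option
 * @param bool  $makeIdWith the make_id_with_..._variables option
 * @return string|false     id fragment, or false when the page must not be cached
 */
function partialId(array $vars, $cacheWith, $makeIdWith)
{
    if ($cacheWith) {
        // Caching is allowed; the values only end up in the id when asked for.
        return $makeIdWith ? serialize($vars) : '';
    }
    // Caching with these variables is not allowed: any value disables caching.
    return count($vars) > 0 ? false : '';
}

$cookies = array('lang' => 'nl');

var_dump(partialId($cookies, true, true));   // serialized cookies become part of the id
var_dump(partialId($cookies, true, false));  // empty fragment: cached, cookies ignored
var_dump(partialId($cookies, false, false)); // false: the page is not cached at all
```

So cache_with_… answers "may this page be cached at all?", while make_id_with_… answers "do these values make it a different page?".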

Behind the scenes

To understand how Zend Framework works with page caching and how I ran into my problem, we should dig into the framework code.

Data storage

As mentioned, the start method takes care of everything and below you’ll find the source code of that method.

public function start($id = false, $doNotDie = false)
{
    $this->_cancel = false;
    $lastMatchingRegexp = null;
    foreach ($this->_specificOptions['regexps'] as $regexp => $conf) {
        if (preg_match("`$regexp`", $_SERVER['REQUEST_URI'])) {
            $lastMatchingRegexp = $regexp;
        }
    }
    $this->_activeOptions = $this->_specificOptions['default_options'];
    if ($lastMatchingRegexp !== null) {
        $conf = $this->_specificOptions['regexps'][$lastMatchingRegexp];
        foreach ($conf as $key=>$value) {
            $this->_activeOptions[$key] = $value;
        }
    }
    if (!($this->_activeOptions['cache'])) {
        return false;
    }
    if (!$id) {
        $id = $this->_makeId();
        if (!$id) {
            return false;
        }
    }
    $array = $this->load($id);
    if ($array !== false) {
        $data = $array['data'];
        $headers = $array['headers'];
        if (!headers_sent()) {
            foreach ($headers as $key=>$headerCouple) {
                $name = $headerCouple[0];
                $value = $headerCouple[1];
                header("$name: $value");
            }
        }
        if ($this->_specificOptions['debug_header']) {
            echo 'DEBUG HEADER : This is a cached page !';
        }
        echo $data;
        if ($doNotDie) {
            return true;
        }
        die();
    }
    ob_start(array($this, '_flush'));
    ob_implicit_flush(false);
    return false;
}
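The first half of start() decides which options apply to the current request. That part can be tried in isolation with the sketch below. resolveOptions() and the second regexp are hypothetical and only there for illustration; the loop itself follows the code above.

```php
<?php
// Hypothetical helper mirroring the first half of start(): the last matching
// regexp wins and its settings are laid over the default options.
function resolveOptions(array $defaults, array $regexps, $requestUri)
{
    $lastMatch = null;
    foreach ($regexps as $regexp => $conf) {
        if (preg_match("`$regexp`", $requestUri)) {
            $lastMatch = $regexp;
        }
    }
    $active = $defaults;
    if ($lastMatch !== null) {
        foreach ($regexps[$lastMatch] as $key => $value) {
            $active[$key] = $value;
        }
    }
    return $active;
}

$defaults = array('cache' => false);
$regexps  = array(
    '^/(nl|fr|en)(.*)$' => array('cache' => true),
    '^/nl/admin'        => array('cache' => false), // made-up extra rule
);

var_dump(resolveOptions($defaults, $regexps, '/nl/news'));  // cache => true
var_dump(resolveOptions($defaults, $regexps, '/nl/admin')); // cache => false, the later rule wins
var_dump(resolveOptions($defaults, $regexps, '/contact'));  // cache => false, defaults only
```

Because only the last match counts, the order of the entries in the regexps array matters in start() as well.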

A couple of things that are worth mentioning about this method:

  • Via $this->_makeId(); an identifier is created and the identifier is the storage key
  • The $this->load($id); method loads the cached data using the key which was returned by the makeId method
  • Via ob_start(array($this, '_flush')); output buffering is started. The _flush() method is a callback method that is triggered when the output buffer is flushed.

public function _flush($data)
{
    if ($this->_cancel) {
        return $data;
    }
    $contentType = null;
    $storedHeaders = array();
    $headersList = headers_list();
    foreach($this->_specificOptions['memorize_headers'] as $key=>$headerName) {
        foreach ($headersList as $headerSent) {
            $tmp = split(':', $headerSent);
            $headerSentName = trim(array_shift($tmp));
            if (strtolower($headerName) == strtolower($headerSentName)) {
                $headerSentValue = trim(implode(':', $tmp));
                $storedHeaders[] = array($headerSentName, $headerSentValue);
            }
        }
    }
    $array = array(
        'data' => $data,
        'headers' => $storedHeaders
    );
    $this->save($array, null, $this->_activeOptions['tags'], $this->_activeOptions['specific_lifetime'], $this->_activeOptions['priority']);
    return $data;
}

The _flush() method we were talking about does most of the storage: when the output buffer is flushed (which happens at the end of the script execution) the save method is called and this one also triggers the saving mechanism of the backend.
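The output buffering trick can be tried without the framework. Below is a minimal sketch, assuming a plain array as a stand-in for the backend and a made-up cache key; it only illustrates the callback mechanism and is not the framework code.

```php
<?php
// Plain array standing in for a cache backend (assumption for this sketch).
$backend = array();
$id = md5('/nl/news'); // made-up cache key

// PHP hands the buffered output to this callback when the buffer is flushed.
$store = function ($data) use (&$backend, $id) {
    $backend[$id] = array('data' => $data, 'headers' => array());
    return $data; // the returned string is what gets sent to the client
};

ob_start($store);
echo '<h1>Hello</h1>';
ob_end_flush(); // at the end of a script PHP flushes open buffers by itself

var_dump($backend[$id]['data']); // the page output, now also kept in the "backend"
```

The page never has to call a save method itself: whatever it echoes passes through the callback on its way out.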

The actual problem

After some intense debugging I noticed that the reason why caching went all wrong was because every page hit resulted in a new cache key. Result: 0% cache hits !

Making ids

To figure out why the previous key isn’t re-used, we should take a look at the _makeId() method that is responsible for creating those ids:

protected function _makeId()
{
    $tmp = $_SERVER['REQUEST_URI'];
    $array = explode('?', $tmp, 2);
    $tmp = $array[0];
    foreach (array('Get', 'Post', 'Session', 'Files', 'Cookie') as $arrayName) {
        $tmp2 = $this->_makePartialId($arrayName, $this->_activeOptions['cache_with_' . strtolower($arrayName) . '_variables'], $this->_activeOptions['make_id_with_' . strtolower($arrayName) . '_variables']);
        if ($tmp2===false) {
            return false;
        }
        $tmp = $tmp . $tmp2;
    }
    return md5($tmp);
}

Did you notice that foreach loop? That’s the key to everything, because it walks through the different superglobals. For each of them, the options decide whether its values become part of the id, and that includes the cookie values. Of course, if a cookie value changes while cookies are part of the id, a new id is generated. And that is exactly what was happening on my system.

What does Google have to do with it?

To summarize: the values of cookies can be used to create cache keys. If cookie values change, our caching key changes as well and we have a cache miss.

So I started looking at my cookies and noticed that there were some cookies which had a different value each time I reloaded the page. After some research, I finally figured out that these cookies were created by my Google Analytics implementation. The reference page shows a definition of the cookies set by Google Analytics. Some of those cookies contain timestamps that change on every page hit.

The solution

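The solution is actually simple: set the make_id_with_cookie_variables key in the frontendOptions to false, while leaving cache_with_cookie_variables set to true, and you’re safe! Pages are then still cached when cookies are present, but the cookie values no longer end up in the cache key.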
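The effect of that one option is easy to reproduce. The sketch below is a cookie-only model of the id generation: makePageId() is a hypothetical name, the cookie values are made up, and the real method also looks at GET, POST, session and file variables.

```php
<?php
// Hypothetical, cookie-only model of the id generation.
function makePageId($requestUri, array $cookies, $makeIdWithCookies)
{
    $parts = explode('?', $requestUri, 2);
    $fragment = $makeIdWithCookies ? serialize($cookies) : '';
    return md5($parts[0] . $fragment);
}

// Made-up cookie values: only the trailing timestamp differs between two hits.
$firstHit  = array('__utma' => '1.2.3.1300000000');
$secondHit = array('__utma' => '1.2.3.1300000060');

// Cookies in the id: every hit gets its own key, so nothing is ever read back.
var_dump(makePageId('/nl/news', $firstHit, true) === makePageId('/nl/news', $secondHit, true));   // bool(false)

// Cookies left out of the id: both hits share one key.
var_dump(makePageId('/nl/news', $firstHit, false) === makePageId('/nl/news', $secondHit, false)); // bool(true)
```

The trade-off is that every visitor now gets the same cached page regardless of their cookies, so this only fits pages that don’t depend on cookie values.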

The final summary

  1. Caching is based on identifiers
  2. Rules can be defined to allow caching based on cookie values
  3. A new cookie value means new caching identifier
  4. A new caching identifier means a cache miss
  5. On a cache miss, the page is stored so that the next request with the same identifier is a hit
  6. Google Analytics creates cookies that have dynamic values
  7. By disabling make_id_with_cookie_variables, we can solve the problem

6 Comments

  • I just ran into this same issue, I had determined after many hours of debugging what the cause was, but thanks to reading this excellent post now have a fix without having to spend hours researching for it. Thank You.

  • Hi, I created my own frontend with a different name than Page. In the bootstrap I use the factory method:

    $cache = Zend_Cache::factory('Gran_Cache_Frontend_PageGA',
                 'File',
                 $frontendOptions,
                 $backendOptions,
                 true
                 );

    And in the class Gran_Cache_Frontend_PageGA (the same as the original Page frontend) I modified the method _makePartialId in the Cookie case:

            case 'Cookie':
               if (isset($_COOKIE)) {
                  $my_cookie = $_COOKIE;
                  foreach ($my_cookie as $key => $val) {
                     // remove google analytics cookie
                     if (false !== strpos($key, "__")) {
                        unset($my_cookie[$key]);
                     }
                  }
                  $var = $my_cookie;
               } else {
                   $var = null;
               }
               break;
    

    Sorry for my english …

  • Great post, I read this a while ago and, since then . I was wondering… can I translate your post into Portuguese, with a link to your original post, of course?

  • Sure thing. Go right ahead.

  • [...] A blog identified this problem and proposes a solution that I adapted: create a custom Frontend_Page class that ignores the Google cookies. This class inherits from Zend_Cache_Frontend_Cache and only redefines the _makePartialId() method, which loops over the different variables to create a unique identifier. It is enough to drop the cookies that start with a double underscore (__utmz, __utma, etc.) so that they are not taken into account when the unique identifier is created: [...]

  • Awesome post and useful.
    Thanks for share!
    Diego
    Mar del Plata, Argentina
