Why my Statamic static cache hit 2.3GB (and how I fixed it)
When the Statamic static cache becomes the problem
Statamic's static cache should make a site faster and cheaper to serve, not be the thing that brings it down. On a site I was looking at recently, the cache had quietly grown to 2.3GB. PHP ran out of memory and the site went offline.
As a stop-gap I bumped the PHP memory limit from 128MB to 512MB, which got the site back online, but the root cause was still there.
Every time someone hit save in the CP, Statamic had to unserialise the cache to invalidate stale entries. The bigger the cache, the more memory that step needs. At 2.3GB the work was enough to blow past the 128MB limit, and in this instance the site briefly went offline.
What was actually in there
I wrote a quick tinker script to look at every URL Statamic had cached, group them by their query-string pattern, and count them:
php artisan tinker --execute='
// Every URL currently held in the static cache
$urls = Statamic\Facades\StaticCache::driver()->getUrls();
$total = $urls->count();

// Reduce each URL to its sorted list of query parameter names, then count each pattern
$byQueryParams = $urls->map(function ($u) {
    $qs = parse_url($u, PHP_URL_QUERY);
    if (!$qs) return "(no query string)";
    parse_str($qs, $parts);
    $keys = array_keys($parts);
    sort($keys);
    return implode(",", $keys) ?: "(no query string)";
})->countBy()->sortDesc();

echo "Total cached URLs: $total\n\n";
echo "Top 30 query-string patterns:\n";
$byQueryParams->take(30)->each(fn($count, $pattern) => printf("%8d %s\n", $count, $pattern));
'
The result was eye-opening:
Total cached URLs: 26168
Top 30 query-string patterns:
21196 gad_campaignid,gad_source,gclid
1835 _hsenc
1217 (no query string)
906 _gl,gclid
287 gad_campaignid,gad_source,wbraid
244 gad_campaignid,gad_source,gbraid,gclid
84 page
75 fbclid
43 utm_source
41 _honeypot,_token
38 _hsenc,utm_campaign,utm_medium,utm_source
35 gtm_latency
31 mdrv
26 q
Of the 26,168 cached URLs, only around 1,200 had no query string at all. The other 25,000 or so were the same handful of pages cached again and again under every tracking-parameter combination Google Ads, HubSpot, Meta and GTM threw at the URL. The actual page count on the site is far smaller than either number: the bulk of even those 1,200 are paginated archives, taxonomy pages, feeds and similar routes.
From the cache's point of view, /landing-page?gclid=abc and /landing-page?gclid=xyz are two different pages. Two different files on disk. Two different entries to track. Multiply that by every paid ad click and the cache fills up with noise.
The fix: disallowed_query_strings
Statamic already ships a config option for exactly this. In config/statamic/static_caching.php, set disallowed_query_strings with every tracking parameter you want the cache to ignore:
'disallowed_query_strings' => [
    // Google Ads
    'gclid', 'gad_campaignid', 'gad_source', 'gbraid', 'wbraid',
    // HubSpot
    '_hsenc', '_hsmi',
    // UTM
    'utm_source', 'utm_medium', 'utm_campaign', 'utm_content', 'utm_term', 'utm_id',
    // Facebook
    'fbclid',
    // Google Analytics cross-domain linker + GTM
    '_gl', 'gtm_latency', 'mdrv',
],
When a request comes in with any of those parameters, Statamic strips them before checking the cache. /landing-page?gclid=abc and /landing-page now both resolve to the same cache entry. The page is cached once.
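To make the idea concrete, here is a minimal sketch of what "strip the parameters before the lookup" means. It is not Statamic's internal code: normaliseForCache() is a hypothetical helper written for this post, and the short $disallowed list is just an example subset of the config above.

```php
<?php
// Illustrative sketch only, NOT Statamic's implementation.
// normaliseForCache() is a hypothetical helper name.

/**
 * Return the path plus whatever query parameters survive
 * after the disallowed ones are removed.
 */
function normaliseForCache(string $url, array $disallowed): string
{
    $path = parse_url($url, PHP_URL_PATH);
    if (!is_string($path) || $path === '') {
        $path = '/';
    }

    $query = parse_url($url, PHP_URL_QUERY);
    if (!is_string($query) || $query === '') {
        return $path;
    }

    parse_str($query, $params);

    // Drop every parameter whose name is on the disallowed list.
    $kept = array_diff_key($params, array_flip($disallowed));

    return $kept ? $path . '?' . http_build_query($kept) : $path;
}

// Example subset of the tracking parameters from the config.
$disallowed = ['gclid', 'gad_source', 'gad_campaignid', 'fbclid'];

echo normaliseForCache('/landing-page?gclid=abc', $disallowed), "\n";
echo normaliseForCache('/landing-page?gclid=xyz&gad_source=1', $disallowed), "\n";
echo normaliseForCache('/blog?page=2&fbclid=123', $disallowed), "\n";
```

The first two calls both print /landing-page, so they would share one cache entry. The third prints /blog?page=2: pagination is not on the list, so it keeps its own entry. Only the lookup key changes; the URL the visitor requested is left alone.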
Will this break my UTM tracking?
No. This is the first thing every marketing person asks and it is worth being clear about. disallowed_query_strings only changes how Statamic looks up the page in its cache.
The URL itself still arrives at the browser with every parameter intact. Google Analytics, GTM, HubSpot, Plausible, Meta Pixel and anything else reading the URL on page load still sees gclid, utm_source and the rest. Campaign attribution carries on exactly as before.
The difference is that the page is now served from cache instead of being regenerated for every variation of the URL, so the visitor sees it faster.
After clearing the cache and letting it rebuild, the URL count came down to 387. What is left is real pages plus the paginated archives, taxonomy and feed routes that legitimately need their own entries.
An hour later, on a site that gets meaningful traffic, the same tinker script gave me this:
Total cached URLs: 494
Top 30 query-string patterns:
419 (no query string)
73 page
1 shem
1 _honeypot,_token
Almost everything is either a clean URL or pagination. The two stragglers are noise from form submissions and a typo in a URL someone shared. No tracking parameters in sight. The cache is doing its job and nothing more.
I would expect that 494 to keep climbing over the next few days as more URLs get hit and cached. The natural steady state for this site is probably around the original 1,217 clean URLs or a bit higher, depending on how much of the site visitors actually reach.
That is fine. The fix was never about shrinking the real page cache. It was about stopping tracking parameters from inflating it. What I never expect to see again is 26,168.
The disk size dropped too
If your site is using CACHE_STORE=file, you can measure the cache on disk directly. On this site it lives at storage/framework/cache/data/:
# Before
ploi@website:~/website.com$ du -sh storage/framework/cache/data/
2.3G storage/framework/cache/data/
# After
ploi@website:~/website.com$ du -sh storage/framework/cache/data/
57M storage/framework/cache/data/
2.3GB down to 57MB. Roughly a 40x shrink on disk, which lines up with the 65x reduction in URL count once you allow for per-entry overhead.
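For anyone who wants to check the ratios, this is the arithmetic using only figures quoted in this post. One assumption: du's 2.3G is treated as 2.3 x 1024 MB. The URL ratio depends on which post-fix count you compare against, so both are shown.

```php
<?php
// Figures taken from this post.
$diskBeforeMb     = 2.3 * 1024; // du reported 2.3G; assumes 1G = 1024M
$diskAfterMb      = 57;         // du reported 57M
$urlsBefore       = 26168;      // cached URLs before the fix
$urlsAfterRebuild = 387;        // right after clearing and rebuilding
$urlsAnHourLater  = 494;        // an hour later

$diskRatio       = $diskBeforeMb / $diskAfterMb;
$urlRatioRebuild = $urlsBefore / $urlsAfterRebuild;
$urlRatioLater   = $urlsBefore / $urlsAnHourLater;

printf("Disk: about %.0fx smaller\n", $diskRatio);
printf("URLs: about %.0fx fewer right after the rebuild\n", $urlRatioRebuild);
printf("URLs: about %.0fx fewer an hour later\n", $urlRatioLater);
```

Disk usage shrinks by a smaller factor than the URL count, which is what you would expect if the remaining entries carry some fixed overhead of their own.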
If your cache is in gigabytes, it is almost certainly caching things that should never have been cached in the first place.
A bonus: the Control Panel got faster
The original symptom was worse than slow saves. PHP ran out of memory and the site went offline. Bumping the memory limit got it back up, but with the cache trimmed down, saves in the CP also got noticeably quicker. The slow saves that had been showing up occasionally just stopped happening.
How to check your own site
If you run static caching in Statamic, run the tinker script above. It tells you exactly which query-string patterns are filling your cache. If the top entries are tracking parameters, add them to disallowed_query_strings and clear the cache.
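The tinker script groups URLs by their combination of parameters. When building the disallowed_query_strings list, it can also help to count each parameter name on its own. The sketch below does that in plain PHP; countQueryParamNames() is a hypothetical helper, not a Statamic function, and the sample URLs are made up for illustration.

```php
<?php
// Hypothetical helper, not part of Statamic: counts how many URLs
// contain each individual query parameter name.

function countQueryParamNames(iterable $urls): array
{
    $counts = [];

    foreach ($urls as $url) {
        $query = parse_url($url, PHP_URL_QUERY);
        if (!is_string($query) || $query === '') {
            continue; // clean URL, nothing to count
        }

        parse_str($query, $params);

        foreach (array_keys($params) as $name) {
            $counts[$name] = ($counts[$name] ?? 0) + 1;
        }
    }

    arsort($counts); // most frequent parameter first

    return $counts;
}

// Made-up sample data for illustration.
$sample = [
    '/landing-page?gclid=abc&gad_source=1',
    '/landing-page?gclid=xyz',
    '/blog?page=2',
    '/about',
];

$counts = countQueryParamNames($sample);

foreach ($counts as $name => $count) {
    printf("%8d %s\n", $count, $name);
}
```

In a tinker session you could pass it the same collection the script above uses, Statamic\Facades\StaticCache::driver()->getUrls(), since the function accepts any iterable of URL strings. Anything near the top that is clearly a tracking parameter is a candidate for the list; leave parameters like page that genuinely change the content.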
This is worth checking even if your cache size looks fine right now. The pattern only really shows up once you start running paid traffic to the site, and by then memory pressure, slow saves and disk usage are already creeping up in the background.
This is the kind of thing I keep an eye on as part of website maintenance on the Statamic sites I look after. Static caching is one of those features that works invisibly until it suddenly does not, and the fix is usually a few lines of config rather than a server upgrade.