


Enormous growth of the cache_store table


IP.Calendar generates mini calendar caches and stores them in the cache_store table.

Because there is no year limit on these caches, they are written to the database without any check, and the table can grow without bound.

If bots follow calendar links, for example from the year 1900 to 2850, every one of those caches is written to cache_store. This can fill the database with gigabytes of data and get the hosting account suspended.

In my database this table was 120 MB and contained caches for 1928, 2089 and so on.
I tried the year 2760, and of course it was added to the database.

There must be a limit, such as a "starting year" and an "ending year", for producing the mini calendar cache.
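
A minimal sketch of the kind of limit requested above, assuming a window around the current year. The function name `shouldCacheMiniCalendar` and the window sizes are hypothetical and are not part of IP.Calendar:

```php
<?php
// Hypothetical helper (not an IP.Calendar API): decide whether a mini
// calendar for $year should be written to cache_store at all.
function shouldCacheMiniCalendar(int $year, int $currentYear, int $yearsBack = 1, int $yearsAhead = 2): bool
{
    // Only years inside the window [current - back, current + ahead] are cached.
    return $year >= $currentYear - $yearsBack
        && $year <= $currentYear + $yearsAhead;
}

var_dump(shouldCacheMiniCalendar(2012, 2012)); // bool(true)
var_dump(shouldCacheMiniCalendar(2760, 2012)); // bool(false)
```

Months outside the window could still be rendered on request; they would simply not be stored.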

Status: Fixed
Version: 3.3.1
Fixed In: 3.3.2


17 Comments

This really isn't a bug, and I'm tempted to mark it as such... however, I don't think the addition of hardcoded min and max entry points is a bad idea, so I'll look into it. The fact is, we probably don't need to cache "January 1981" or "December 2329".
Isn't a bug??

It's extremely easy to write a malicious script that simulates clicks on the calendar's "next month" button until the year 3500.
Just think about the size of the cache_store table...

I know you guys at IPS have a powerful server with lots of space, capable of handling a 3 GB database, but not all users of IP.Board have one.
They could write a script to submit 3500 posts to your site and use up far more storage space than a simple mini-calendar cache too. ;) Scripts designed to attack a site would get much farther with approaches like that than they would triggering basic cache builds.
For reference, this came up in the past and those should be getting cleared out by the scheduler http://community.inv...l-cache-growth/

They could write a script to submit 3500 posts .


True, but...
1. In the ACP, the admin can set a maximum post size.
2. Posts are visible at once, caches are not.
3. The admin can use anti-flood settings for posting, but not for cache building.

OK, no problem. If you are so sure that you are right, dismiss all this, but I was taught something else when it comes to security on the net.
Ryan Ashbrook
Jul 13 2012 05:20 PM
I think what's happening here is overzealous Search Engines. They are following the "Next Month" "Previous Month" links, which can lead them all the way back to 1969 and all the way up to 2039.

Every month they hit sets a cache for the two mini calendars. Hit them enough, and the cache table just grows out of control. I'm starting to see this on my site as well.

We're seeing tickets about this. I wrote the first one off as a fluke as one had the Calendar application disabled, so I figured it was due to the previous bug Andy mentioned, and the task wasn't running to clear them out post-upgrade. One site I have now has a cache_store table with 33,000+ rows. And it's causing the Cache Management page to throw an Out of Memory error.

Anyone experiencing this, I would advise you disallow Guests from viewing the Calendar, or at least set it up so Search Engines cannot crawl the Calendar via robots.txt.
Just had my server admin clear the minical cache manually - Showing rows 0 - 29 (58,076 total, Query took 0.0326 sec). Something has to be done about this. Cache Management wouldn't even last long before blowing its brains out trying to clear the minical cache. Eventually, Cache Management wouldn't work at all, so I had to get my server admin to clear it manually.
Marcher Technologies
Sep 05 2012 12:42 AM
I had to clear this out myself recently...
Decided to put in my own hard limit:
if( !IN_DEV )
{
    $this->cache->setCache( 'minical_' . $month . '_' . $year . '_' . $this->registry->output->skin['_skincacheid'], $_cal, array( 'array' => 1 ) );
}
changed to:
$the_time = IPSTime::unixstamp_to_human(IPSTime::getTimestamp());
if( !IN_DEV && $year==$the_time['year'])
{
    $this->cache->setCache( 'minical_' . $month . '_' . $year . '_' . $this->registry->output->skin['_skincacheid'], $_cal, array( 'array' => 1 ) );
}
in public_calendar_calendar_view->getMiniCalendar... can I politely ask *why* this is stored by skin, month, and year?
The above is a stopgap that does not solve the underlying problem. There is zero reason for a row per month, year, and skin. Store the year's minicals in one row as an array, or at the very least store by month with all skins in an array and a hard limit. The more skins one has visible, the worse this instantly becomes, crawlers or not (mine is fully blocked in robots.txt, it is a demo site, and I still hit this).
In the short or the long run, this wreaks havoc on any site, regardless of robots config. The task will run out of memory simply trying to clear one week's build-up under light usage, much less on a busy site. That 500 limit in clearing is too high, which leaves a catch-22: lower it, and the table may never get fully cleared; leave it, and nothing gets cleared at all.
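
The "one row per month, with all skins in an array" idea from this comment could look roughly like the sketch below. It uses a plain PHP array as a stand-in for cache_store; `miniCalRowKey` and `storeMiniCal` are hypothetical names, not IP.Board functions:

```php
<?php
// Hypothetical in-memory stand-in for cache_store: one entry per month/year,
// with each skin/language variant nested inside that entry.
function miniCalRowKey(int $month, int $year): string
{
    return 'minical_' . $month . '_' . $year;
}

function storeMiniCal(array &$store, int $month, int $year, string $skinId, string $langId, string $html): void
{
    // Variants share the month's row instead of each getting a row of their own.
    $store[miniCalRowKey($month, $year)][$skinId . '_' . $langId] = array(
        'html'  => $html,
        'built' => time(),
    );
}

$store = array();
storeMiniCal($store, 9, 2012, '1', '1', '<table>...</table>');
storeMiniCal($store, 9, 2012, '2', '1', '<table>...</table>');
echo count($store), "\n";                   // 1 row for September 2012
echo count($store['minical_9_2012']), "\n"; // 2 skin/language variants in it
```

The trade-off is that each row gets larger and has to be read and rewritten as a whole whenever one variant changes.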
Updating Fixed In to: 0

For reference this has come up on a recent website too. 25,000 rows dating back to 1966, and through to 2035. Manage Applications & Modules, and Cache Management take a long time to render in the browser due to this.
Ryan Ashbrook
Sep 06 2012 07:42 PM
I investigated this a little further and it's not really a robots issue (though I suspect that is why minical caches for months in 1969 are being generated, so my suggestion still stands). The issue (well, part of it) is that we are including the day in the minical cache key.

$month = $month ? $month : $this->chosen_date['month'];
$year = $year ? $year : $this->chosen_date['year'];
$day = strftime( '%d', time() + $this->lang->getTimeOffset() );

$_key = 'minical_' . $month . '_' . $year . '_' . $day . '_' . $this->registry->output->skin['_skincacheid'] . '_' . $this->member->language_id;

Basically, we're taking today's date, and including the number of the day in the minical cache key. Because of this, a minical cache can be created for a single month 30 (or 28/29/31) times over. Couple that with the fact we store per skin and language, and this can multiply very quickly. When caches are generated for mini calendars across the 1969 to 2032 range, we're at an easy 22,000. A new cache is generated every day due to the key not matching. I suspect the reason this is done is to highlight the correct day in the mini calendar.
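
As a rough check of that figure, here is a back-of-the-envelope sketch. The function name is hypothetical, and the 30 days per month and the single skin and language are simplifying assumptions:

```php
<?php
// Hypothetical estimate of how many distinct minical cache keys can exist
// when the key includes month, year, day of month, skin and language.
function miniCalKeyCount(int $firstYear, int $lastYear, int $skins, int $languages, int $daysPerMonth): int
{
    $months = ($lastYear - $firstYear + 1) * 12;
    return $months * $daysPerMonth * $skins * $languages;
}

echo miniCalKeyCount(1969, 2032, 1, 1, 30), "\n"; // 23040 with the day in the key
echo miniCalKeyCount(1969, 2032, 1, 1, 1), "\n";  // 768 with the day left out
```

Each additional skin or language multiplies both numbers again.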

I made a few changes locally, and while it doesn't exactly fix the issue, it helps work around it. The caches can still grow, though it won't get anywhere near the ten thousands range unless you have an extreme amount of skins and languages installed.

Find this in /admin/applications_addon/ips/calendar/modules_public/calendar/view.php (starts about Line 312):

public function getMiniCalendar( $month=0, $year=0 )
{
    $month = $month ? $month : $this->chosen_date['month'];
    $year = $year ? $year : $this->chosen_date['year'];
    $day = strftime( '%d', time() + $this->lang->getTimeOffset() );

    $_key = 'minical_' . $month . '_' . $year . '_' . $day . '_' . $this->registry->output->skin['_skincacheid'] . '_' . $this->member->language_id;

    /* One cache per month, year and skin */
    $_cal = IN_DEV ? '' : $this->cache->getCache( $_key );

    /* If cache wasn't built today, rebuild. We use a diff skin template for today vs other days. */
    if( !$_cal OR !is_array($_cal) OR !$_cal['built'] OR $month.$year.$day != strftime( '%Y%m%d', $_cal['built'] ) )

And change it all to this:

public function getMiniCalendar( $month=0, $year=0 )
{
    $month = $month ? $month : $this->chosen_date['month'];
    $year = $year ? $year : $this->chosen_date['year'];
    //$day = strftime( '%d', time() + $this->lang->getTimeOffset() );

    $_key = 'minical_' . $month . '_' . $year . '_' . $this->registry->output->skin['_skincacheid'] . '_' . $this->member->language_id;

    /* One cache per month, year and skin */
    $_cal = IN_DEV ? '' : $this->cache->getCache( $_key );

    /* If cache wasn't built today, rebuild. We use a diff skin template for today vs other days. */
    if( !$_cal OR !is_array($_cal) OR !$_cal['built'] OR $month.$year != strftime( '%Y%m', $_cal['built'] ) )

The only downside to doing this is you lose the Day highlighting in the mini calendar.


Marcher: We cache per skin and per language because the raw HTML and language strings are stored as well. If we didn't do that, then language strings (for example) would just be cached in the language of whichever member was viewing when the cache was stored. We need to store per language so the right strings are used. The same goes for skins, to ensure the right template is used.
Marcher Technologies
Sep 06 2012 07:58 PM
I get why it's cached by language and skin, trust me :smile:
Knew that day bit looked extremely fishy.
That *helps*, quite a bit, but I still cannot help thinking about the fact that a row is added per skin set, and having multiple skin sets available is fairly common. My question is more along these lines: if we have an array, can we *use* it to add the skin sets/langs to the cache as they are used (obviously reading that result back instead), resulting in one row for x month of x year (again, realistically it needs a cut-off anyhow)?
Also, before I forget, in testing, those caches stick through an upgrade... so a template/lang change in this area will not be reflected.
Marcher Technologies
Sep 06 2012 08:27 PM
>.< That, appears to have knock-on consequences as well.... ok, if the month is *this* month, the current day is in bold right?
The current day would be wrong after the first cache trigger on the following day, and every day thereafter. Getting into the next month, the bolded 'day' wouldn't even be on the right minical.
It apparently needs to be rebuilt by day for the current month, but not stored by it... a slight tweak to the above code:
public function getMiniCalendar( $month=0, $year=0 )
{
    $month = $month ? $month : $this->chosen_date['month'];
    $year = $year ? $year : $this->chosen_date['year'];
    $day = strftime( '%d', time() + $this->lang->getTimeOffset() );
    $_key = 'minical_' . $month . '_' . $year . '_' . $this->registry->output->skin['_skincacheid'] . '_' . $this->member->language_id;
    /* One cache per month, year and skin */
    $_cal = IN_DEV ? '' : $this->cache->getCache( $_key );
    /* If cache wasn't built today, rebuild. We do use a diff skin template for today vs other days, bolded this day yknow. */
    if( !$_cal OR !is_array($_cal) OR !$_cal['built'] OR ($month.$year != strftime( '%Y%m', $_cal['built'] ) OR ($month.$year.$day != strftime( '%Y%m%d', $_cal['built'] ) AND $month.$year == strftime( '%Y%m', time() + $this->lang->getTimeOffset() ) ) ) )

Edit, did not see your addendum about day highlighting :rofl: >_<
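
The rule described in this comment (rebuild by day where "today" can be highlighted, but do not store by day) can be sketched as a standalone check. This is only an illustration under stated assumptions: `miniCalIsStale` is a hypothetical name, timestamps are treated as UTC via `gmdate()`, and the caller is assumed to have applied any language time offset already. Both sides of each comparison are formatted year first, so like is compared with like. `gmdate()` is used rather than `strftime()`, which is deprecated as of PHP 8.1.

```php
<?php
// Hypothetical staleness check (not IP.Calendar code).
// $builtTs is when the cached calendar was built, $nowTs is the current time.
function miniCalIsStale(int $month, int $year, int $builtTs, int $nowTs): bool
{
    // Built today: any "today" highlight in it is still correct.
    if (gmdate('Ymd', $builtTs) === gmdate('Ymd', $nowTs)) {
        return false;
    }

    $viewed = sprintf('%04d%02d', $year, $month);

    // Built on an earlier day. The highlight only matters if the viewed month
    // is the current month (it needs today's highlight) or the month the cache
    // was built in (it may carry an outdated highlight).
    return $viewed === gmdate('Ym', $nowTs) || $viewed === gmdate('Ym', $builtTs);
}

$built = gmmktime(12, 0, 0, 9, 6, 2012);
$now   = gmmktime(12, 0, 0, 9, 7, 2012);
var_dump(miniCalIsStale(9, 2012, $built, $now)); // bool(true): current month, built yesterday
var_dump(miniCalIsStale(6, 1858, $built, $now)); // bool(false): no highlight involved
```

With a check like this, the day can stay out of the cache key, so months far in the past or future are built once and reused.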
For what it's worth, it's not limited to 1969, as I'm seeing 1858!

a:2:{s:4:"html";s:3841:"<div class='mini_cal_wrap
ipsBox'>
<div class='ipsBox_container'>
<h3 class='ipsType_subtitle'>June 1858</h3>
<table class='mini_cal vcalendar'>
<tr>
Marcher Technologies
Sep 07 2012 01:17 AM
Re: bolded 'today', not worth the resources.
Re: the above... yeah, that would be a result of how it is stored/retrieved. The calendar is not limited to the Unix epoch, only the crawlers/internal jumps are.
I suggest Ryan's fix, coupled with a template change to handle the highlighting of the date.
Updating Fixed In to: 3.3.2
Updating Status to: Fixed
