IP.Board 3.3 Dev Update: Performance Enhancements

Posted by Matt, in 3.3.x 29 January 2012 · 7,813 views

During each release cycle we take some time out to assess performance and look for ways to improve it. We're also in the unique position of having first-hand experience hosting tens of thousands of IP.Board installations on our own hosting network.

We also work closely with our clients who constantly give us feedback on how IP.Board is performing and let us know about any areas that need further examination.

All of this data is very useful when it comes to profiling and testing IP.Board and making performance improvements for the next major version.

In this blog entry, I'd like to discuss some of the improvements we've made for IP.Board 3.3.

Topic Markers
IP.Board has had a centralised, database-driven topic marking system since 3.0. As IP.Board is only part of the suite, we wrote the system to be extensible and flexible so that our own apps and apps written by others can use it without maintaining their own tracking databases.

We wrote the system to use two tables. The first can be considered a 'deep storage' table. It contains permanent tracking data in the format of one row per member per parent, which means that if you had 200 forums, each member would take up 200 rows.
The second table can be considered the 'active' table. When a member is loaded from the database and no 'active' row is found, the markers are pulled from deep storage and written in a serialised form to the 'active' table.
When the member is no longer active, the data is removed from the 'active' table and written back to the 'deep storage' table ready for the next time they visit.

In theory, this is the perfect solution. You only have to read and write to a smaller table which should make the system more efficient. However, we discovered that trying to keep the tables synchronised when you have a very busy site negated the benefit. The sheer number of SQL inserts and deletes often caused bottlenecks affecting the whole board.

Another downside was that all the marking data had to be loaded when the member was loaded. This could be up to 200k of marking data - most of which wouldn't be needed. If the member was viewing a topic, they wouldn't need marking data for Blog, for example.

We've tweaked the system to remove this SQL bottleneck. We've removed the 'active' table and simply write to the main tracker table. Now that we don't have tables to synchronise, we can write back to the one row that needs updating rather than periodically updating all 200 rows.
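As a rough illustration of the single-table approach, here is a minimal sketch in Python with SQLite. It is not IP.Board's actual code or schema: the table name `item_markers`, its columns and the `mark_read` helper are all hypothetical, chosen only to show one row per member per parent being updated in place.

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")

# Hypothetical schema (an assumption, not IP.Board's real table):
# one row per member per parent, e.g. one row per forum.
conn.execute(
    """
    CREATE TABLE item_markers (
        member_id INTEGER NOT NULL,
        app       TEXT    NOT NULL,
        parent_id INTEGER NOT NULL,
        last_read INTEGER NOT NULL,
        PRIMARY KEY (member_id, app, parent_id)
    )
    """
)


def mark_read(conn, member_id, app, parent_id, when=None):
    """Write back only the single row that changed.

    There is no second 'active' table to keep in sync; the primary key
    makes INSERT OR REPLACE overwrite the existing row for this parent.
    """
    when = int(time.time()) if when is None else when
    conn.execute(
        "INSERT OR REPLACE INTO item_markers "
        "(member_id, app, parent_id, last_read) VALUES (?, ?, ?, ?)",
        (member_id, app, parent_id, when),
    )
    conn.commit()
```

Calling `mark_read(conn, 1, "forums", 5)` twice leaves a single row for that member and forum, with only the timestamp changed.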

Furthermore, we've removed the need to load all markers at once. A new function in 'coreExtensions.php' dictates which markers to load. You can still load them all, as this may be more efficient (as is the case on the board index when you have a lot of sidebar hooks).

If you choose not to load the marker data when the member is initialised, you can use the new built-in JOIN methods to fetch the marking data along with your dataset.
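The idea of fetching marker data alongside the dataset can be sketched as follows. This is a simplified illustration in Python with SQLite, not the IP.Board JOIN API: the `topics` and `item_markers` tables, their columns and the `topics_with_markers` function are assumptions, and real marker data is likely more granular than one timestamp per forum.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical tables and sample rows, for illustration only.
conn.executescript(
    """
    CREATE TABLE topics (
        topic_id  INTEGER PRIMARY KEY,
        forum_id  INTEGER NOT NULL,
        title     TEXT    NOT NULL,
        last_post INTEGER NOT NULL
    );
    CREATE TABLE item_markers (
        member_id INTEGER NOT NULL,
        app       TEXT    NOT NULL,
        parent_id INTEGER NOT NULL,
        last_read INTEGER NOT NULL,
        PRIMARY KEY (member_id, app, parent_id)
    );
    INSERT INTO topics VALUES (1, 10, 'Welcome', 100);
    INSERT INTO topics VALUES (2, 10, 'Rules', 300);
    INSERT INTO topics VALUES (3, 20, 'Off topic', 200);
    INSERT INTO item_markers VALUES (7, 'forums', 10, 250);
    """
)


def topics_with_markers(conn, member_id, forum_id):
    """Fetch topics and their read state in one query.

    The LEFT JOIN pulls in only the marker row for this member and this
    forum, so no marker data has to be preloaded with the member.
    """
    rows = conn.execute(
        """
        SELECT t.topic_id,
               t.title,
               t.last_post > COALESCE(m.last_read, 0) AS unread
        FROM topics AS t
        LEFT JOIN item_markers AS m
               ON m.member_id = ?
              AND m.app = 'forums'
              AND m.parent_id = t.forum_id
        WHERE t.forum_id = ?
        ORDER BY t.topic_id
        """,
        (member_id, forum_id),
    ).fetchall()
    return [(topic_id, title, bool(unread)) for topic_id, title, unread in rows]
```

A member with no marker row simply sees every topic as unread, because the `COALESCE` falls back to a last-read time of zero.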

In testing, this has dramatically reduced write overhead and cut the memory footprint required per page view by up to 150k.

We're testing this out right now on our company forums and many people have already commented that 3.3 is seriously faster.

Post Table Access
The largest table in your database is almost certainly the post table. We have clients with millions of rows in this one table alone. It makes good sense to keep reads to a minimum where possible.

In older versions of IP.Board, we had different views such as 'threaded'. These were removed in 3.2 as these legacy views were rarely used and not really applicable in a modern context. However, some of the older code remained, which meant that the post table was being queried twice per topic view: once to fetch a list of post IDs and again to fetch the data.

We've rewritten this bit of code to use a new API and now we only query the table once. This alone will drop read access to your post table by almost 50% in normal daily use. This is a significant change.
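To make the difference concrete, here is a small Python and SQLite sketch of the two access patterns. The `posts` table and both function names are hypothetical and this is not the new IP.Board API; it only shows how one paginated query can return the same rows as an ID lookup followed by a data fetch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical, heavily simplified post table with seven sample posts.
conn.execute(
    "CREATE TABLE posts ("
    "pid INTEGER PRIMARY KEY, topic_id INTEGER, author TEXT, body TEXT)"
)
conn.executemany(
    "INSERT INTO posts (topic_id, author, body) VALUES (?, ?, ?)",
    [(1, "user%d" % (i % 3), "post %d" % i) for i in range(1, 8)],
)


def page_two_queries(conn, topic_id, limit, offset):
    """Older pattern: one query for the post IDs, a second for the data."""
    ids = [
        row[0]
        for row in conn.execute(
            "SELECT pid FROM posts WHERE topic_id = ? "
            "ORDER BY pid LIMIT ? OFFSET ?",
            (topic_id, limit, offset),
        )
    ]
    if not ids:
        return []
    placeholders = ",".join("?" * len(ids))
    return conn.execute(
        "SELECT pid, author, body FROM posts "
        "WHERE pid IN (%s) ORDER BY pid" % placeholders,
        ids,
    ).fetchall()


def page_one_query(conn, topic_id, limit, offset):
    """Single pass: select the page of posts and their data together."""
    return conn.execute(
        "SELECT pid, author, body FROM posts WHERE topic_id = ? "
        "ORDER BY pid LIMIT ? OFFSET ?",
        (topic_id, limit, offset),
    ).fetchall()
```

Both functions return identical pages, but the second touches the post table once per topic view instead of twice.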

Today's Top Posters
This fairly innocuous feature is accessible via a link in the board footer and on most boards doesn't get a great deal of traffic. However, we've found that clients with larger boards notice a significant slowdown when this feature is used, which can cause another SQL bottleneck.

This is because the query is fairly complex due to the flexible permissions IP.Board offers. The query causes the creation of a temporary table to sort the data, which isn't desirable for larger boards.

We've added a new caching table which caches recent post IDs. This makes the feature much quicker (by over a second in SQL terms in testing) and, as an added plus, it doesn't have to query the post table to generate the list, which again saves read access on that large table.
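One way such a caching table could work is sketched below in Python with SQLite. The entry only says that recent post IDs are cached; the table name `recent_post_cache`, the extra `author_id` and `post_date` columns, and all three helper functions are assumptions made for illustration.

```python
import sqlite3

DAY = 86400  # seconds in a day

conn = sqlite3.connect(":memory:")

# Hypothetical cache table: small, indexed by date, pruned regularly.
conn.execute(
    "CREATE TABLE recent_post_cache ("
    "pid INTEGER PRIMARY KEY, "
    "author_id INTEGER NOT NULL, "
    "post_date INTEGER NOT NULL)"
)
conn.execute("CREATE INDEX idx_cache_date ON recent_post_cache (post_date)")


def record_post(conn, pid, author_id, post_date):
    """Called when a post is made: add a small row to the cache."""
    conn.execute(
        "INSERT INTO recent_post_cache (pid, author_id, post_date) "
        "VALUES (?, ?, ?)",
        (pid, author_id, post_date),
    )


def prune(conn, now):
    """Drop cache rows older than a day so the table stays small."""
    conn.execute(
        "DELETE FROM recent_post_cache WHERE post_date < ?", (now - DAY,)
    )


def todays_top_posters(conn, now, limit=10):
    """Count posts per author from the cache, never reading the post table."""
    return conn.execute(
        "SELECT author_id, COUNT(*) AS n FROM recent_post_cache "
        "WHERE post_date >= ? GROUP BY author_id "
        "ORDER BY n DESC, author_id LIMIT ?",
        (now - DAY, limit),
    ).fetchall()
```

Because the cache only ever holds about a day of rows, the grouping and sorting work on a small table rather than on millions of posts.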

Conclusion
There are many other, smaller changes in addition to those listed here. Some of these changes may seem trivial but they quickly stack up. It only takes one or two slow queries to bring a site to a crawl while SQL catches up with queued queries. These changes will make a significant difference to everyone, but especially those working with large databases. Your IP.Board will be faster, consume less memory and be more SQL efficient. Those are changes we can all appreciate!




True.

3.3 is shaping out very well.
Great work on the improvements.

Don't hesitate to tap me if you need a proofreader for your articles, though. ;)
niiice :)

Is it already implemented here?
Awesome.. 3.3 is going to offer a great performance boost to larger sites.
curious would this be any help to performance or in portability to IPB or IPC?
Sound like my biggest problem has been fixed.... :)

Good Job! :)

curious would this be any help to performance or in portability to IPB or IPC?


I'm confused how splitting the markup language from the raw post would help at all. How would you determine line feeds and what not? Seems like too much trouble.
Excellent. Looking forward to upgrading.
Nice to see 3.3 on these Boards. :)

niiice :) Is it already implemented here?


Yes, it is. :)
Really can't wait for this update. The performance stuff is great.
action-reaction
Jan 30 2012 03:13 PM
The " in the source code must be replaced by '.

The " in the source code must be replaced by '.


Optimizations like this give you virtually no real-world benefit. On paper, ' is better than " because the PHP engine won't try to parse variables inside the text. In the real world, this is not where your bottlenecks come from, so it makes more sense to focus on areas of the code that are utilizing more resources.
Looking forward to 3.3 and utilizing it on a large site :)
Very nice. I like. :)
Glad to hear this.
I Like this..
Good Job :)!
I honestly cannot wait until 3.3 is out, it's probably the most excited I've been for an IPB update since the start.
