Magic and Math

Earlier today a number of updates to 10C rolled out to fix a handful of bugs and introduce new functionality in both the API and the Social site. While the features are certainly nice, one of the most rewarding aspects of today's release is seeing the average CPU load on the web server drop from a pretty steady 1.1 to just 0.28. Memory usage is down, as are response times. Today's update was the next push to use the system's MySQL database engine more like I would use SQL Server.

I've gone on record saying that SQL Server is the best relational database engine on the planet, and I stand by this assessment. It's incredibly fast, powerful, and comes with a slew of excellent tools that I find remarkably reliable. Licensing and other aspects can be quite complex at times, but the technology itself is rock solid. For a while, I actually had 10Cv5 running completely off a SQL Server on Linux installation during the year-long alpha period, but opted to change the database engine to something a little more common in the open source world.

MySQL is an interesting tool in that it powers a lot of the Internet, is relatively easy to set up, and doesn't cost a dime. I've been using this database engine for 15 years and have seen it develop quite a lot in that time. One area where it has seriously lagged behind SQL Server is the performance of stored procedures and triggers. These programmable objects were slow and easy to break in the MySQL 5.x versions, which meant that a lot of people -- myself included -- would write their web applications to use a series of ad hoc queries created in the API layer and passed to the database when needed. The more recent 8.0 release of MySQL has completely changed this paradigm. The database engine is faster. Memory is better used. Indexes remain in memory longer. All in all, I feel that 8.0 is the best release of MySQL there has ever been.

Since the beginning of v5's planning, I wanted to make more use of the features and functions that are found in a traditional relational database management system. Objects like triggers, views, and stored procedures are excellent tools to enforce consistency in a database at the database layer, freeing up the API layer to be more of a communications bus than anything else. Triggers were relatively easy to set up on the tables and they're providing some of the much-needed functionality to keep people's data safe and organized.
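As a rough illustration of the kind of consistency rule a trigger can enforce at the database layer, here is a minimal sketch; the table and column names are hypothetical, not taken from the actual 10C schema:

```sql
-- Minimal sketch: keep a last-modified timestamp accurate in the
-- database itself, so the API layer never has to remember to set it.
-- Table and column names here are illustrative only.
CREATE TRIGGER posts_before_update
BEFORE UPDATE ON posts
FOR EACH ROW
SET NEW.updated_at = CURRENT_TIMESTAMP;
```

With a rule like this living in the database, every client that writes to the table gets the same behaviour for free.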

One area that I've stayed away from until recently has been stored procedures. In SQL Server these are wonderfully powerful objects that allow for a series of complex operations to take place completely in the database before a set of data is returned. In the MySQL 5.x versions, a stored procedure could be used for a complex task but was often slower than simply sending an ad hoc query and processing the results in the API middle layer. With version 8.0, this resistance is no longer necessary. Stored procedures in MySQL 8 rock. They're incredibly fast, reliable, and require less memory and fewer disk reads on average than sending a raw query string to the database. The change has been night and day, and it's reflected in the load times across the myriad of sites and services powered by 10C.

Being able to use stored procedures with 10C and not battle with a performance hit is going to completely change how I write this application and how people who choose to self-host administer the software. If MySQL continues to see the sort of improvements witnessed between 5.7 and 8.0, then other relational database engines will need to see similar changes in order to keep up.

WordPress DB Tuning

Last weekend a regular client of mine asked me to take a look at their website and try to improve its load times. Like a lot of places on the web, it was powered by WordPress and had a number of caching plugins installed in an effort to reduce the amount of time visitors have to wait before they can read and move about the page. Unfortunately, despite their best efforts, they could not get everything to finish loading in less than 3 seconds during the two busy periods of the day. After a couple of minutes of research, the problem was pretty apparent.

First a little background. The site is a mobile game review site based out of Vancouver, with readers aged anywhere from 13 to 30 … give or take. Amazon's EC2 platform is used, with the database being hosted on a dedicated c4.xlarge instance with two t2.large web servers reading from this database. There are typically 200,000 visitors a day to the site and most traffic happens around lunchtime during the week and between 4:00pm and 10:00pm every day. This is not anything too crazy, though the EC2 instances are much larger than the ones I use for 10Centuries1. Because of all the caching plugins in use, the main site itself usually loads in about 1.5 seconds. The problem came down to a pair of custom-made plugins that were not particularly efficient … and a default database configuration.

Tuning a MySQL 5.7 Database for WordPress

Note: This is not for the faint of heart. If you're not comfortable getting into the nitty-gritty of configuration files and occasionally seeing that MySQL refuses to restart because of something that's not quite right in my.cnf, then you do not want to do this. You can either continue to run stock, or you can hire someone to do this for you.

Note 2: The settings that I'll outline here worked for my customer's use case. Do not simply copy/paste and expect the same results. Tuning requires planning.

The two custom plugins that were causing problems had inefficient queries that needed some optimization and a bit of a rewrite to avoid using a FILESORT, but this was just a small part of the performance issues. The big one was a buffer pool that was simply too small. A c4.xlarge instance has 7.5GB of memory, but most of it was sitting unused. The WordPress database was about 2GB in total, so it wouldn't be an issue to have it all in memory ready to be read. I set the buffer pool to 3GB.

innodb_buffer_pool_size = 3G

I then set up a tmpfs partition in memory for MySQL to use for temporary tables, and ensured the InnoDB database engine was properly configured to use it:

innodb_temp_data_file_path = ../../mysqltmp/ibtmp1:128M:autoextend:max:3G
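For reference, a tmpfs mount like the one described might be created with an /etc/fstab entry along these lines; the mount point and ownership here are assumptions based on the relative path in the setting above (with a default datadir of /var/lib/mysql, ../../mysqltmp resolves to /var/mysqltmp):

```
# Hypothetical /etc/fstab entry: an in-memory filesystem for MySQL's
# temporary tablespace. Path, size, and ownership are assumptions.
tmpfs  /var/mysqltmp  tmpfs  rw,uid=mysql,gid=mysql,size=3G,mode=0750  0 0
```

After a `mount /var/mysqltmp` and a MySQL restart, temporary tables that spill out of memory land on RAM-backed storage instead of disk.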

I then went through and ensured some of the other InnoDB settings were set for maximum efficiency. Because ACID compliance is not 100% critical with this website, I could get away with using different log flushing times.

innodb_flush_method = O_DIRECT
innodb_flush_log_at_trx_commit = 0
sync_binlog = 0
innodb_buffer_pool_instances = 8
innodb_thread_concurrency = 8
query_cache_size = 0
query_cache_type = OFF

With this done, I restarted MySQL. Page load times dropped from a "good" 1.5 seconds mid-day to 0.2 seconds, and 0.5 seconds during peaks. Not too bad, if you ask me.

There are a lot of InnoDB settings that can be configured, tweaked, and explored. While the default configuration is "good enough" to get people going with a WordPress site, it's really important that some research and time be spent getting the database performance up to something respectable. This isn't something that plugins alone can fix.

  1. which, I will admit, does not run WordPress.

Who Will Use SQL Server on Linux?

This past weekend I was having a discussion on IRC with someone about the role of SQL Server on Linux, and who might actually use the database engine over many of the more traditional, open alternatives like PostgreSQL and MySQL. The argument essentially boiled down to this: my opponent said that "nobody in their right mind who wants to use SQL Server would migrate from Windows to Linux", and I said that "SQL Server on Linux isn't for people who already have systems built to use SQL Server installations on Windows, but for people who haven't even started their projects yet". Both of us are likely correct to a certain degree, and both of us are likely over-generalising by using terms like "everyone", "most", and "developers".

I've been using Microsoft's SQL Server on an Ubuntu Server installation since it was first made available this past summer, and I've got to admit that it's a solid little engine. SQL Server has long been my favourite database, and I tend to use others primarily because my servers are all Linux-based. MySQL is the go-to, but does it have to be for smaller projects?

I don't think so.

So with this in mind, I plan on proving to my weekend opponent that SQL Server on Linux is a viable solution by migrating 10Centuries from MariaDB 10.1 to SQL Server 2017. This won't be a simple drop-in replacement, as the two systems do things differently enough that it makes zero sense to hammer a square peg into a round hole. Instead, I plan on examining the existing system and designing a proper replacement that'll play to SQL Server's strengths and finally move all of the data rules from the API layer to the database, where it actually makes sense.

10Centuries is very much a relational system, and it should be perfectly reliable on SQL Server Web despite all the licensing limitations that Microsoft has in place for that edition of the engine. If there's any change at all, it will likely be that 10C is faster after the database change. There will certainly be some performance reports after the fact. If I can drop API response time down by 500ms without spending a small fortune on more hardware (or software licenses), I'll be very, very happy.

Umm … Why?

A little over ten years ago I had an MCDBA certification from Microsoft that revolved around SQL Server 2005. Those skills are now well over a decade out of date, and I'd really like to get back into using this more powerful tool. By using SQL Server on Linux, I can gain experience with the system and also begin to understand the various things that the tool can and cannot do. I'm years away from becoming an expert, but this should be a good first step down that road.

SQL Server can certainly be a great tool for people who prefer to use Linux and want to build or update their software. Just because it may not fit every use case doesn't mean it can't fill others.

Master of None

Over the last year or so I've been investing a great deal of time to improve my knowledge of both Microsoft SQL Server and MySQL. One is the greatest database engine that's ever existed1, and the other is certainly adequate for many tasks with some caveats. As with many things in life, the more we learn about a specific subject, the less we feel we know about that subject. The vast absence of knowledge stretching before me has me wondering whether I'll ever truly know either of these tools well enough to master them.

The chasm by Kevin Carden

In just 11 months I've read 8 SQL Server and 4 MySQL books that have really opened my eyes to how these tools have evolved since I was last working with databases in any meaningful way. Sure, I've used these tools a lot over the last decade, but I've never really had to deep-dive into the subjects in order to provide better support for the engines. With some recent issues at the day job, I feel there is no other option than to absorb as much knowledge as possible and take on the role of DBA, as there hasn't been one inside the organization for quite some time. What I've come to learn through this activity is that a lot of what I've previously been taught as "best practice" is now anything but, and it has me second-guessing a lot of the software I've built over the years. This has me wondering if any of the other "best practices" that I've been following are outdated, outmoded, or just plain wrong.

For almost ten years I set aside software development and database work in order to earn a paycheck here in Japan, where my personality type is generally not welcome within a typical Japanese company. A great deal has changed in that time. The question I need to answer now is how to apply this newfound knowledge without essentially scrapping the work that's been done up to now on both personal and professional projects.

  1. in my opinion, anyways

When Search Doesn't Find Anything

There's a serious problem on this site, and it's starting to get really, really annoying. One of the reasons I built my own CMS was so that I could always find my content with just a few keystrokes. While this is typically true if I query the database directly, searches on this and other 10Centuries websites cannot find information with the same level of granularity. I have over 100,000 items stored in the database, ranging from collections of "moments" no more than a few words long all the way up to some of the 3,000-word essays that are posted on here. This means that the existing search algorithms may work fine for sites with less content, but they fail miserably when confronted with over a million words spread across a myriad of content types.

Looking for a Needle in a Haystack

Search is not an easy problem to solve. I've worked at it on here for the better part of a year, trying to find an algorithm that is both fast and accurate. The system that is currently in place was, at one time, an excellent tool that allowed me to find just about anything in less than a second. While the return is still under two seconds, the accuracy has gone down the tubes as the amount of content has continued to grow. What's the solution?

Looking at the larger sites that have terabytes of information stored in their databases, I can see how many of them have overcome the problems faced while trying to scale and cope with the incredible influx of data; they threw ever larger servers at the problem and implemented their own NoSQL database solutions.

This isn't the way I want to go just yet, though.

Weighing Search Results

One option I've played around with involves simple weighting. All search algorithms work with a complex array of weighting mathematics, but I have found through some simple trial and error that assigning numeric values to the results that come back can greatly enhance their accuracy and speed. Here's an example of what I mean:

Suppose I search for something like "bright yellow"1. Using the standard database search queries, I get a single result back: a Tweet that I posted regarding "Splashback Jack", a man who works in another office at the building where I spend most of my daylight hours. With full-text search enabled, I get over 25 results back, spread out evenly across blog posts and Tweets, but the entry above, the one I am actually looking for, is nowhere in sight. The algorithm that I've been testing would resolve this.

Essentially, the search would be split into multiple queries, with results coming back and being scored based on the order of the words entered into the search box, the date of the content, the type of content requested, and how many of the words (if multiple words are entered into the search box) are found together or simply in the same piece of content. With the numerical values I've assigned each component search, a "perfect match" would have a score of 55 and be shown at the very top of the list. Then, like any other properly sorted list, items will be displayed based on their relevance, date, and data type.
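A hedged sketch of what one of those component-query unions might look like follows; the weights, table, and column names below are invented for illustration and are not the actual values used on 10C:

```sql
-- Illustrative only: each component query contributes a weight, and the
-- summed score decides the sort order. Real weights and schema differ.
SELECT content_id,
       SUM(score) AS relevance
  FROM ( SELECT content_id, 30 AS score        -- exact phrase match
           FROM content
          WHERE body LIKE '%bright yellow%'
          UNION ALL
         SELECT content_id, 15                 -- both words, any order
           FROM content
          WHERE body LIKE '%bright%'
            AND body LIKE '%yellow%'
          UNION ALL
         SELECT content_id, 10                 -- recent content bonus
           FROM content
          WHERE body LIKE '%yellow%'
            AND published_at > NOW() - INTERVAL 1 YEAR
       ) AS components
 GROUP BY content_id
 ORDER BY relevance DESC;
```

A row that matches every component accumulates the maximum score and rises to the top of the list, which is exactly the "perfect match" behaviour described above.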

The extra time required to perform this search averages roughly 0.004 seconds per word, which makes this a relatively cheap tweak at this point. It will be implemented this weekend, should I have a few hours to do some more intensive testing.

As I mentioned above, search is by no means as simple as doing a query on the database with a single call. People much smarter than I have struggled with the problem of relevance for decades and would be able to poke holes in the simple little algorithm that will soon be implemented on here. That said, one of the biggest goals I had when building Noteworthy and later 10Centuries was to provide a simple way for people to find the content I had written without resorting to Google, Bing, or any other search engine, as these massive resources have lost their ability to provide really useful answers to the questions we have.

Hopefully this little tweak will be worth the effort.

MySQL Server at 100% CPU? Check Your Date.

Over the last few days I've noticed that a few of my servers have been using a solid 20% more CPU than normal, and it's something that I watch very closely to ensure my clients' websites are fast and responsive. This problem wasn't happening on any of my web servers, though. Only the MySQL servers were under strain. What the heck could be causing this problem? Believe it or not, it was the fault of ntpd.

Here's a quick solution to the problem:

$ /etc/init.d/ntpd stop

$ date -s "$(date)"

$ /etc/init.d/ntpd start

After doing this the MySQL server load dropped from an average of 0.44 to 0.12, and MySQL CPU consumption dropped from 95% to 4%. Much better! The problem seems to stem from the leap second that occurred last weekend. MySQL is not a fan of this extra second and would run into issues when working with date information. Sorts, diffs, and inserts were all affected by this issue.

This little problem is something I'll have to keep in mind the next time we go through a leap second.

Optimizing a t1.micro MySQL Installation

Slow computers are a big source of frustration for many of us, but slow databases are even worse. When it comes to modern piles of data, I expect recordsets to be returned darn near instantaneously. When they're not, I look for ways to optimize the system to ensure data is returned before my finger fully comes off the "Enter" key. But how can we tune databases when hardware is severely restricted? It's very simple.

Disclaimer: This is not an absolutely perfect solution for everybody's MySQL database needs, particularly those on Amazon's EC2 system, but it has worked for me several times in the past for MyISAM-only MySQL databases. Discussing the virtues of a MyISAM or InnoDB system is a topic for another day.

One problem that I have seen many times over the last two years is that Amazon's t1.micro instances are just not cut out for a lot of serious work. Anything that will require a large amount of disk IO is going to be severely constrained, and this will often result in an instance that is almost always maxed out. It's not uncommon to see one of these servers sitting at 100% CPU utilization and a server load in excess of 0.85 when "nothing is going on". There are a number of reasons this happens, but one thing we can do to reduce the stress on the system is to modify the my.cnf file.

This is how I typically write the basic my.cnf file on a t1.micro Instance running CentOS:
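The original listing did not survive in this copy of the post, so what follows is only a sketch of the kind of configuration being described; every value here is an assumption, sized for a t1.micro (roughly 613 MB of RAM) running MyISAM-only databases:

```
# Sketch only -- not the original listing. Values are assumptions for a
# MyISAM-only MySQL install on a t1.micro (~613 MB RAM).
[mysqld]
skip-innodb                  # MyISAM-only, so skip InnoDB's buffers entirely
key_buffer_size     = 64M    # MyISAM index cache: the main tuning knob here
table_open_cache    = 256
sort_buffer_size    = 1M
read_buffer_size    = 1M
query_cache_type    = 1
query_cache_size    = 16M
max_connections     = 50     # keep per-connection buffers from exhausting RAM
tmp_table_size      = 16M
max_heap_table_size = 16M
```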

Again, this is for a basic system without any real database optimization1. This happy little setup allows a little bit of wiggle room for the OS to do things, while also giving MySQL just enough information about what it should and shouldn't do. After restarting MySQL, I typically see the CPU and server load cut in half. The difference is really night and day.

When it comes to hosting databases, the more RAM we can throw at it the better. CPU is important, but not as much as it used to be. Disk I/O is painfully important, but something we can't truly control on EC2 (even when running the database on EBS volumes2).

This Is a t1.Micro Server ...

… and this is a t1.micro server running WordPress:

CentOS Top | WordPress CPU Usage

Any questions?

There are undoubtedly a billion questions, like "How the heck can MySQL be consuming 100% CPU?" or "how many people are connecting to the server right now?" or "How the heck can so many processes be consuming so much CPU on Amazon's smallest virtual server?". These are all questions that I've been struggling with while trying to determine just what the heck has changed between WordPress 3.1.0 and 3.1.1.

I've been using one of Amazon's Free Tier servers to host this blog and, for the most part, it's been working out pretty well. Amazon allows new subscribers to use a "free" t1.micro server for 750 hours with limited amounts of space and bandwidth in order to test drive their services. As I manage the Amazon servers for my employer, I've come to know what one can expect from various server sizes. That said, there are sometimes some problems.

In this case, I had recently updated WordPress from 3.1.0 to 3.1.1. There were some hiccups with two plugins, so I disabled W3 Total Cache, disabled and re-enabled all of the plugins, restarted Apache, flushed the cached files, and re-enabled W3 Total Cache. Everything was working wonderfully … until I hit "Publish" to send out a new post.

The Twitter plugins sent out the notification that a new post was available just fine. The cached pages were created just fine. The notifications of an update were about to be sent to the various servers when MySQL started to stutter under an increased load. Apache was also busy as heck, occasionally running with as many as 50 threads. Had one of my previous posts suddenly gone viral and W3 wasn't kicking in? Was Memcached or some other package that has been installed for months all of a sudden sucking up all the resources and interfering with the server?

From what I can tell … it was none of these things. Instead, it's likely that something has changed inside WordPress that resulted in this increased load and I'll have to investigate further.

Fun? Wow!

Full Table Be Gone!

Earlier today I received a desperate email from a friend who has recently inherited the job of "database guy" at work. Unfortunately for him, he has little interest in being in charge of the database and even less chance of having the duty assigned to someone else. Unfortunately for me, this means that I'll be his go-to guy whenever he runs into a problem that needs a little more knowledge than he can find in the first 30 seconds on Google.

The problem he presented to me was one I haven't seen in years. One of the company's reports has been failing for a few weeks and, after managing to find the error logs, he sees the cause of the problem: Cannot Insert records into Table X. Table is Full.

Sounds like a hard disk is full, right? Well, he checked that out and found that there were several hundred gigabytes available for a database that barely hits 100 MB. This is where his desperate email comes in.

Hey Jason,

I'm in a bit of a pickle at the office. This is the message in the error log:

Cannot Insert records into table xxxxxxxx. Table is full.

Any ideas?

Lots of ideas … but this is probably the result of someone doing something silly. In this case, the original developer of the database decided it would be a good idea to make certain key tables in the database use the Heap storage engine. In other words, these tables were being stored completely in RAM. If the server ever lost power, the data in these tables would be lost, and there would (probably) be no way to recover it.

There are two solutions to this problem. One is the right way, and the other is not always the right way.

The safe way:

In MySQL: ALTER TABLE xxxxxxxx ENGINE = MyISAM; (InnoDB works just as well)

The not-so-safe way:

In MySQL: SET GLOBAL max_heap_table_size=268435456;

In my.cnf add: max_heap_table_size=268435456

The first method changes the table from being stored exclusively in RAM to being stored on the disk. For most situations, there is no need to keep a table exclusively in RAM. The second increases the maximum size of a memory table. By default it's set at 16 Megabytes; the number above is 256 Megabytes.

I gave him both options, with a very simple explanation about what each would do, and let him make the call about which would be better. Hopefully he chooses the former….

Note: This colorful little anecdote is not intended to be a definitive solution for the "table is full" error in MySQL. If you have a specific question about how to resolve this problem, feel free to leave a comment or send me a message through the contact form. There is no warranty on any technical solution I offer on this site.

Proximity Alarms

Am I the only one who finds the proximity of the Copy CREATE Statement command to the Drop Table command a little unnerving?

MySQL Query Browser - Drop Table