MySQL is a complex system that requires many tools to repair, diagnose, and optimize it. Fortunately for admins, MySQL has attracted a vibrant community of developers who have put out high-quality open source tools to help with the complexity, performance, and health of MySQL systems, most of which are available for free.
The following 10 open source tools are valuable resources for anyone using MySQL, from a stand-alone instance to a multiple-node environment. The list has been compiled with variety in mind. You will find tools to help back up MySQL data, increase performance, guard against data drift, and log pertinent troubleshooting data when problems arise.
There are several reasons why you should consider these tools instead of creating your own in-house tools. First, thanks to their wide use, they’re mature and field-tested. Second, because they are free and open source, they have benefited from the knowledge and experience of the MySQL community.
Many of these tools are actively maintained by Percona, a provider of MySQL and other open source database solutions, so they continue to improve and adapt with the evolving MySQL industry. Note that two of the tools—mycheckpoint and oak-security-audit—are no longer maintained. However, you may still find them useful.
Keep in mind that there are many more tools worthy of your attention. I have chosen to emphasize free and open source, and to err on the side of usefulness and usability. Also note that all are Unix command-line programs but one, in large part because MySQL is more widely deployed and developed on Unix systems.
Now, let’s meet these 10 essential MySQL admin tools.
Essential MySQL admin tool: pt-query-digest
Nothing frustrates like slow MySQL performance. All too often, faster hardware is thrown at the problem, a solution that works only if hardware is in fact to blame. More often than not, poor performance can be attributed to slowly executing queries that are blocking other queries, creating a ripple effect of slow response times. Because it’s a lot cheaper to optimize queries than to upgrade hardware, the logical first step in MySQL optimization is query log analysis.
Database administrators should analyze query logs frequently, depending on the volatility of the environment. And if you’ve never performed query log analysis, it’s time to start, even if you are relying on third-party software, which is often assumed to be optimized when, in fact, it is not.
A great query log analyzer is pt-query-digest. Co-written by Baron Schwartz and myself, it is actively developed, fully documented, and thoroughly tested. MySQL distributions include the query log analyzer mysqldumpslow, but the tool is outdated, poorly documented, and untested.
pt-query-digest analyzes query logs and generates reports with aggregated, statistical information about execution times and other metrics. Since query logs usually contain thousands, if not millions, of queries, query log analysis requires a tool.
pt-query-digest can help you find the queries that take the longest time to execute as compared to other queries. Optimizing these slow queries will make MySQL run faster by reducing the greatest delays. The real art of query optimization is more nuanced, but the basic goal is the same: find slow queries, optimize them, and increase query response times.
The tool is easy to use; executing pt-query-digest slow-query.log will print the slowest queries in slow-query.log. The tool includes support for “query reviews,” for reporting queries you have not yet seen or approved, making frequent log analyses quick and efficient.
You can download pt-query-digest from Percona. It is also available as part of the Percona Toolkit.
Essential MySQL admin tool: mydumper
Being able to generate data dumps quickly is vital for backups and server cloning. Unfortunately, mysqldump, which ships with MySQL distributions, is single-threaded and, thus, too slow for data-intensive jobs. Thankfully, the modern replacement, mydumper, uses multiple threads, making it many times faster than mysqldump.
Also known as MySQL Data Dumper, this tool does not manage backup sets, differentials, or other parts of a complete backup plan. It just dumps data from MySQL as quickly as possible, enabling you to complete backups under tight time constraints, such as overnight, while employees are offline, or to perform backups more frequently than you would with mysqldump.
One technical point to know about mydumper is that it locks tables, so it is not the ideal tool for performing backups during operating hours. Then again, professional data recovery costs hundreds of dollars per hour, and you always get a bill even if the data isn’t recoverable. mydumper is free and well worth exploring for even basic backups.
mydumper also comes in handy when cloning servers. Other tools perform complete hard drive duplications, but when all you need is MySQL data, mydumper is the fastest way to get it. Servers provisioned in a cloud are particularly suited to cloning using mydumper. Just dump your MySQL data from an existing server and copy it to the new instance.
Cloning is worthwhile for creating slave servers, benchmarking, and profiling, but nowhere is it more vital than in testing and development. Being able to spin up a replica for quick testing before going live is essential for dynamic MySQL environments. With mydumper, you can quickly create a server that is nearly identical to your production server, enabling your test results to better mimic production results.
You can download mydumper from GitHub.
Essential MySQL admin tool: XtraBackup
If your databases are in use every day, all day, giving you no “overnight” during which tables can be locked for backup, XtraBackup is your solution. This free open source tool performs nonblocking backups, allowing backups to be run on transactional systems while the applications remain fully available.
XtraBackup also offers incremental backups, allowing you to back up only the data that has changed since the last full backup. Adding incremental backups to your backup process is powerful, given the reduced performance hit of these tremendously smaller backups.
A project that grew up around XtraBackup makes managing a full backup plan even easier: XtraBackup Manager. This tool adds advanced features like rotating backups with groups and backup set expiring. Together, XtraBackup and XtraBackup Manager are a formidable and free backup solution.
You can download XtraBackup from Percona. XtraBackup Manager can be downloaded from the Google Code Archive and from GitHub.
Essential MySQL admin tool: tcprstat
tcprstat is probably the most esoteric tool on this list. tcprstat monitors TCP requests and prints statistics about low-level response times. When you become familiar with the response time way of thinking about performance, the payoff of tcprstat is significant.
The principle is elaborated in the book Optimizing Oracle Performance by Cary Millsap and Jeff Holt, and it applies equally well to MySQL. The basic idea is that a service, in this case MySQL, accepts a request (query), fulfills that request (execution time), and responds with results (result set). The service’s response time is the time span between receiving a request and sending a response. The shorter the response time, the more requests can be served in the same amount of time.
Parallel processing and other low-level factors play a significant part here, but the simplified upshot is that there are 28,800 seconds in an eight-hour workday, so reducing response times by just four-tenths of a second (from 0.5 to 0.1 second) results in 230,400 more requests served each day. tcprstat helps you achieve this.
I have only enough space in this article to pique your curiosity, so I’ll finish this tool’s introduction by telling you the first step toward getting started with MySQL response time optimization: read Optimizing Oracle Performance. Then start using tcprstat.
You can download tcprstat from GitHub.
Essential MySQL admin tool: pt-table-checksum
“Data drift” is a significant problem for dynamic MySQL environments. This problem, wherein slave data becomes out of sync with the master, is often caused by writing data to a slave or executing certain nondeterministic queries on the master. What’s worse is that the data differences may go unnoticed until they become crippling. Enter pt-table-checksum, a tool that performs the complex, sensitive calculations necessary to verify the data in two or more tables is identical.
pt-table-checksum works with both stand-alone servers and servers in a replication hierarchy, where the tool’s greatest value is easily seen. Verifying table data between a master and a slave must account for replication consistency. Because changes to the master are replicating to slaves with some amount of time delay (“lag”), simply reading data from the servers is an unreliable way to verify consistency, given that the data is constantly changing and incomplete until fully replicated. Locking tables and waiting for all data to replicate would allow consistent reads, but to do so would mean effectively halting the servers. pt-table-checksum allows you to perform nonblocking, consistent checksums of master and slave data.
Apart from replication consistency, there are other problems with verifying data. Table size is one of them. The MySQL command CHECKSUM TABLE is sufficient for small tables, but large tables require “chunking” to avoid long locks or overloading CPU or memory resources with checksum calculations.
Chunking solves a second problem: the need for regular data-consistency checks. While data drift can be a one-time occurrence, often it is recurring. pt-table-checksum is designed to continuously check tables, vetting certain chunks one run and other chunks the next run until eventually the whole table has been checked. The ongoing nature of this process helps ensure that recurring drift is corrected.
You can download pt-table-checksum from Percona. It is also available as part of the Percona Toolkit.
Essential MySQL admin tool: pt-stalk
Problems have a way of waiting until you’re not looking or at home sleeping to occur, and diagnosing them after the fact is sometimes impossible without data about the state of MySQL and the server at the time of the problem. The natural inclination is to write your own script to wait for or detect a problem and then start logging extra data because, after all, no one knows your system better than you. The problem is, you know your system when it’s working, and if you knew the kinds of problems the system would have, you would simply fix them rather than try to capture and analyze them.
Thankfully, those who specialize in knowing when MySQL is not working, and in fixing the problems, have written a tool called pt-stalk. pt-stalk waits for certain conditions to become true, then collects data to help you diagnose the problem. You can also use pt-stalk to collect data on demand without waiting for a trigger to occur.
pt-stalk gathers not only the standard information that MySQL can report about itself but a lot more data that you might not have thought to include: lsof, strace, tcpdump, and so on. Thus, if you end up having to consult with a professional who specializes in fixing MySQL problems, you will have all the data that they need.
pt-stalk is configurable, so can be used for almost any problem. The one requirement is a definable condition to establish a trigger. If multiple conditions signal the problem, then you may also need to consult with a professional for a more extensive review of your MySQL environment because problems can appear in MySQL even though the underlying cause is elsewhere.
pt-stalk can be used proactively, too. For example, if you know that there should never be more than 50 active MySQL connections at a time, then you could proactively monitor this value, making pt-stalk helpful both for problems that you know and problems that you have not yet seen.
You can download pt-stalk from Percona. It is also available as part of the Percona Toolkit.
Essential MySQL admin tool: mycheckpoint
You don’t always want to wait for something to go wrong before addressing a problem, and dashboards provide an essential way for you to monitor your MySQL environment for potential problems before they arise.
There are many free and commercial monitoring applications for MySQL, some MySQL-specific and others generic with MySQL plug-ins or templates. mycheckpoint is notable because it is free, open source, MySQL-specific, and full-featured. The author, Shlomi Noach, has discontinued development of mycheckpoint, but the source code and binaries are still freely available.
Like many monitoring solutions, mycheckpoint is web-based. Here’s an example chart:
mycheckpoint can be configured to monitor both MySQL and server metrics, like InnoDB buffer pool flushes, temporary tables created, operating system load, memory usage, and more. If you don’t like charts, mycheckpoint can also generate human-readable reports.
As with pt-stalk, alert conditions can be defined with email notifications, though no additional troubleshooting data will be collected. Another useful feature is mycheckpoint’s ability to monitor MySQL variables to detect changes that can lead to problems, or signal that someone has modified MySQL when they shouldn’t have.