High Performance MySQL
Proven Strategies for Operating at Scale
sufficient
reading path: overview → analysis → narration
overview
Overview
High Performance MySQL: Proven Strategies for Operating at Scale (4th edition, 2022) by Baron Schwartz, Peter Zaitsev, Vadim Tkachenko, Jeremy Zawodny, Arjen Lentz, and Derek J. Balling is the canonical reference for anyone running MySQL in production. Six authors — most of them founders or early engineers of Percona — pool their experience from operating some of the largest MySQL installations on the planet.
This is not a beginner's SQL book. It is a deep, opinionated, measurement-first guide to how MySQL actually works under load: InnoDB's MVCC and locking model, B-tree and covering indexes, the query optimizer and EXPLAIN, replication topologies, backup and recovery, and the various paths to scaling beyond a single server.
------|-------|--------------| | 1. Foundations | MySQL architecture, history, benchmarking, profiling, schema design | Logical architecture, Concurrency models, Transactions, Configuration, Schema | | 2. Indexing & Query Optimization | B-Tree indexes, hash indexes, EXPLAIN, the optimizer, query rewriting | Indexing strategies, Query performance, EXPLAIN tour, optimizer hints | | 3. Server Performance | Hardware, OS, locking, transactions, InnoDB internals | Hardware sizing, OS tuning, Locks, Transactions, Multiversion Concurrency | | 4. Replication & High Availability | Binary logs, replication topologies, failover, GTIDs, semisync | Replication configuration, HA architectures, Group Replication, Binlog servers | | 5. Scaling & Operations | Sharding, MySQL Cluster, Galera, Vitess, backup & recovery, cloud | Sharding patterns, MySQL Cluster, Galera, Backup & Recovery, MySQL in the cloud |
The book's recurring message: measure first, optimize second, and never trust intuition over a benchmark.
Key Takeaways
-
MySQL's strength is its simplicity — but its internals are subtle. The default storage engine is InnoDB; understanding its MVCC, lock manager, redo log, and change buffer is non-negotiable for serious work.
-
Indexes are not free. Each secondary index speeds reads but slows writes and consumes buffer pool space. The art is matching indexes to actual query patterns — not to the schema.
-
EXPLAIN is the most underused tool in MySQL. Reading the optimizer's plan reveals join order, index usage, and the difference between a fast query and a full table scan. The fourth edition adds the JSON-format EXPLAIN and EXPLAIN ANALYZE.
-
The optimizer is a cost-based search. It estimates row counts, weighs access paths, and chooses what it believes is cheapest. When its estimates are wrong, queries are slow. Good indexes fix this; hints are a last resort.
-
Transactions and locking trade off against throughput. InnoDB uses row-level locks with MVCC so readers don't block writers. But deadlock, gap locks, and next-key locks can still surprise you under high concurrency.
-
Replication is for availability and read scale, not for write scale. A single leader is still a single point for writes. Semi-synchronous replication and group replication close some gaps; sharding closes the rest.
-
Backups are not backups until you have restored from one. The book devotes an entire chapter to backup tools, PITR, and verification — because the worst time to discover a backup is broken is during an incident.
-
Scaling MySQL means sharding, eventually. MySQL Cluster, Galera, ProxySQL, and Vitess are real tools that solve real problems. None of them are magic.
-
Configuration is a starting point, not a destination. Settings like
innodb_buffer_pool_sizematter, but defaults from MySQL 8.0 are usually good. Tuning without measurement is just guessing. -
Hardware and the operating system matter more than people think. SSDs changed the game for I/O-bound workloads; NUMA, fsync, and the page cache continue to shape performance.
Who Should Read
| Reader Type | Why | |---|---| | Database administrators | The day-to-day reference for InnoDB tuning, replication, and recovery | | Backend engineers | Understand why your queries are slow and what to do about it | | SREs and platform engineers | Backup strategies, HA topologies, failover, capacity planning | | Performance engineers | Benchmarking, profiling, EXPLAIN-driven optimization | | Engineering managers | Speak the language of the team keeping the database alive | | Anyone designing a schema | Indexing, data types, normalization decisions |
Who Should Skip
- Beginners with no SQL experience — read a SQL primer first
- Readers wanting a tutorial on
SELECTsyntax — this is internals, not SQL - Anyone using only managed services (RDS, Aurora) at default settings — you'll get more from the AWS or Cloud SQL docs
- Those looking for a single read of "what's new in MySQL 8" — the book is broader than release notes
Why This Book Matters
For nearly twenty years, High Performance MySQL has been the book that DBAs leave open on their desk. The fourth edition matters because the MySQL it describes is fundamentally different from the MySQL of 2012 or 2004: InnoDB is the only engine that matters, replication has matured (GTIDs, multi-source, group replication), and the operational landscape has shifted (cloud, Vitess, ProxySQL, orchestration, container-native deployments).
The authors are not journalists. Schwartz, Zaitsev, and Tkachenko built Percona into the world's leading MySQL consultancy. Zawodny ran MySQL at Yahoo. Lentz was a long-time MySQL community manager. Balling is a Percona engineer. They have been paged at 3 a.m. for the problems they describe — and the patterns they share are patterns they have seen at scale.
If you operate MySQL in production, this is the most valuable book on your shelf.
Related Books
| Book | Author | Connection | |------|--------|------------| | Database Internals | Alex Petrov | Deeper dive into storage engine internals (B-trees, LSM) | | Designing Data-Intensive Applications | Martin Kleppmann | Distributed systems framing that complements the MySQL specifics | | SQL Antipatterns | Bill Karwin | Schema design mistakes and how to avoid them | | Use The Index, Luke! | Markus Winand | A focused, web-friendly companion on indexing | | MySQL Cookbook | Paul DuBois | Recipes and patterns, lighter than HPMYSQL | | Site Reliability Engineering | Beyer et al. | Operational mindset for running databases at scale |
Final Verdict
High Performance MySQL remains the authoritative MySQL reference. The fourth edition is comprehensive, current, and grounded in operational reality. It is dense — 950 pages is not a casual read — but every chapter pays back the time. If you are a working MySQL professional, buy it. If you are a backend engineer who wants to understand why the database is slow, borrow it.
Rating: 9.5/10 — The definitive MySQL book. Required reading for DBAs, SREs, and serious backend engineers.
content map
MySQL Architecture at a Glance
flowchart TB
subgraph Client["Client Layer"]
C1["Connection Manager<br/>(threads, auth)"]
C2["SQL Interface<br/>(parser, optimizer)"]
end
subgraph Server["MySQL Server"]
P["Parser & Preprocessor"]
O["Query Optimizer<br/>(cost-based)"]
E["Execution Engine"]
end
subgraph Engines["Storage Engines"]
I["InnoDB<br/>(default since 5.5)"]
M["MyISAM<br/>(legacy)"]
N["NDB / Cluster"]
X["Memory, CSV, Archive..."]
end
subgraph FS["Filesystem & Logs"]
R["Redo Log (InnoDB)"]
B["Binary Log (server)"]
D["Data files (.ibd)"]
end
C1 --> P --> O --> E
E --> I
E --> M
E --> N
E --> X
I --> R
I --> D
E --> B
MySQL is a pluggable storage engine architecture. The server layer handles connection management, parsing, optimization, and execution. Each storage engine owns its on-disk format, locking, and indexing. InnoDB is the only engine that should be on your radar for OLTP.
InnoDB Internals: MVCC and Locking
sequenceDiagram
participant T1 as Transaction T1
participant T2 as Transaction T2
participant IB as InnoDB Buffer Pool
participant R as Redo Log
T1->>IB: BEGIN
T1->>IB: UPDATE row id=1 SET v=10
IB->>R: Append redo entry
T2->>IB: SELECT v FROM t WHERE id=1
IB-->>T2: Returns v=8 (snapshot, undo log)
T1->>IB: COMMIT
T1->>R: Flush & mark commit
T2->>IB: New snapshot after T1 commits
IB-->>T2: Returns v=10
InnoDB implements Multi-Version Concurrency Control: writers create new row versions in the clustered index, readers see a consistent snapshot built from the undo log. Locks are mostly row-level, with gap locks and next-key locks preventing phantoms at the cost of occasional deadlocks.
Key consequences:
REPEATABLE READ(InnoDB's default) gives a true snapshot, not the "no phantoms" of the SQL standard.- Long-running transactions bloat the undo log and slow purges.
- Hot rows cause lock contention that no amount of indexing fixes.
Indexing Strategies
B-Tree Indexes
B-tree indexes are the bread and butter of MySQL. They are ordered,
support range scans, and can serve ORDER BY and GROUP BY without
a sort.
flowchart LR
A["Root"] --> B["Internal node"]
A --> C["Internal node"]
B --> L1["Leaf 1<br/>id=1..100"]
B --> L2["Leaf 2<br/>id=101..200"]
C --> L3["Leaf 3<br/>id=201..300"]
C --> L4["Leaf 4<br/>id=301..400"]
L1 -.->|"linked"| L2 -.->|"linked"| L3 -.->|"linked"| L4
Composite, Covering, and Prefix Indexes
| Index Type | Best For | Trade-off |
|---|---|---|
| Single-column | Equality lookups | Limited selectivity |
| Composite (leftmost) | Range + sort, multi-column filters | Order matters; cannot skip columns |
| Covering | Queries whose columns are all in the index | Larger indexes, slower writes |
| Prefix | VARCHAR/TEXT columns | Cannot be used for range or sort |
A covering index is the most powerful optimization in MySQL: when
the index contains every column the query needs, InnoDB never touches
the table. EXPLAIN shows this as Extra: Using index.
Hash and Fulltext
MEMORYengine supports hash indexes, but you should not useMEMORYfor anything serious in 2026.- MySQL 8.0 added functional indexes (
CREATE INDEX ... ON ((LOWER(name)))) and descending indexes forORDER BY ... DESC.
Reading EXPLAIN
EXPLAIN FORMAT=JSON
SELECT u.id, u.name, COUNT(*) AS n
FROM users u
JOIN orders o ON o.user_id = u.id
WHERE u.country = 'US' AND o.created_at > '2024-01-01'
GROUP BY u.id, u.name
ORDER BY n DESC
LIMIT 10;
What to look for in EXPLAIN:
| Column | Watch For |
|---|---|
| type | system / const / eq_ref / ref / range / index / ALL |
| key | The index actually used (vs. possible_keys) |
| rows | Estimated row count; huge gaps from reality signal bad stats |
| Extra | Using filesort, Using temporary, Using index, Impossible WHERE |
| filtered | Percentage of rows that pass the WHERE |
The fourth edition dedicates an entire chapter to walking through
EXPLAIN output and using OPTIMIZER_TRACE to see why the optimizer
chose a particular plan.
The Query Optimizer
flowchart TB
Q["Query"] --> P["Parse & Resolve"]
P --> R["Rewrite<br/>(subqueries, views)"]
R --> O["Optimize<br/>(cost-based search)"]
O --> QP["Query Plan"]
QP --> E["Execute"]
O -.->|"uses"| S["Statistics<br/>(cardinality, histograms)"]
O -.->|"uses"| C["Cost model<br/>(disk, CPU, memory)"]
The optimizer is not magic. It does a cost-based search: it estimates row counts from index statistics, considers access paths (full scan, range, ref, join order), and picks the cheapest plan.
When it is wrong:
- Bad stats → run
ANALYZE TABLEto refresh histograms (MySQL 8+) - Skewed data → use query hints or rewrite the query
- Missing index → add the right covering index
- Optimizer limitations → hints (
/*+ INDEX(...) */) orSTRAIGHT_JOIN
Transactions and Isolation
| Isolation | Dirty Read | Non-Repeatable Read | Phantom | InnoDB Default | |---|---|---|---|---| | READ UNCOMMITTED | yes | yes | yes | no | | READ COMMITTED | no | yes | yes | no | | REPEATABLE READ | no | no | yes (mostly) | yes | | SERIALIZABLE | no | no | no | no |
InnoDB's REPEATABLE READ uses gap locks to prevent most phantoms.
It is "snapshot isolation" in practice — your transaction sees the
data as it was when the first read happened, and reads are consistent
within the transaction.
Pitfalls the book covers in detail:
- Deadlocks — pick a consistent lock order; keep transactions
short; use
SELECT ... FOR UPDATEdeliberately - Lock waits — monitor
information_schema.innodb_trxandperformance_schema.data_locks - Long transactions — fill the undo log, block purges, slow replication
Replication
flowchart LR
subgraph Source["Source (Primary)"]
M1["Master"]
end
subgraph Replicas["Replicas"]
R1["Replica 1<br/>(read scaling)"]
R2["Replica 2<br/>(analytics)"]
R3["Replica 3<br/>(disaster recovery)"]
end
M1 -->|"binary log"| R1
M1 -->|"binary log"| R2
M1 -->|"binary log"| R3
R1 -->|"semi-sync ack"| M1
Replication modes the book covers:
- Asynchronous (default): primary doesn't wait for replicas
- Semi-synchronous: primary waits for at least one replica to acknowledge receipt
- Group Replication (MySQL 8+): Paxos-inspired, single-primary or multi-primary with built-in conflict detection
- Galera / Percona XtraDB Cluster: virtually synchronous, cert- based replication
Replication is for read scale and availability, not write scale. For write scale, you need sharding.
Backup and Recovery
| Method | Granularity | Locking | Speed | Use Case |
|---|---|---|---|---|
| mysqldump | Logical, full | Various | Slow | Small DBs, schema migrations |
| mydumper | Logical, parallel | Minimal | Fast | Large DBs, logical backups |
| Percona XtraBackup | Physical, online | None | Fast | Hot backups of InnoDB |
| Filesystem snapshot (LVM/ZFS) | Physical | Brief FS freeze | Fast | Large DBs, full snapshots |
| Binary log + PITR | Time | None | Continuous | Replay to a point in time |
The book's advice: the only backup that exists is one you have restored from. Test restores. Verify checksums. Store backups off- site. Document the time-to-restore.
Scaling Beyond a Single Server
flowchart TB
subgraph Strategies["Scaling Strategies"]
V["Vertical<br/>(bigger server)"]
R["Read replicas<br/>(read scale)"]
S["Sharding<br/>(write scale)"]
end
subgraph Tools["Tools & Systems"]
P["ProxySQL"]
G["Galera / PXC"]
C["MySQL Cluster (NDB)"]
VT["Vitess"]
end
V --> R --> S
S --> P
S --> G
S --> C
S --> VT
| Approach | Solves | Trade-off | |---|---|---| | Vertical scaling | Most early bottlenecks | Hardware limits, single point of failure | | Read replicas | Read throughput, some HA | Replication lag, write contention on primary | | Vitess / ProxySQL | Sharding, connection pooling | Operational complexity, query restrictions | | MySQL Cluster (NDB) | Write scale + HA | Very different SQL surface, in-memory focus | | Galera / PXC | Multi-master with sync | Conflict detection, schema migration pain | | Group Replication | HA with built-in quorum | Throughput limits, network requirements |
The fourth edition adds a chapter on MySQL in the cloud (RDS, Aurora, Cloud SQL, PlanetScale) and the new realities of running MySQL on Kubernetes.
Key Lessons
- Measure before you optimize. Use
sysbench,mysqlslap, or production traffic replay. The bottleneck is rarely where you think. - Indexes are the most leveraged optimization. A well-chosen covering index beats a faster server.
- InnoDB is the default — and the right default. Master its MVCC model, its redo log, and its lock types.
- EXPLAIN is the diagnostic tool. If you cannot read EXPLAIN, you cannot reason about query performance.
- Configuration has high-leverage knobs (
innodb_buffer_pool_size,innodb_log_file_size,innodb_flush_log_at_trx_commit) but defaults are usually fine. Change with a benchmark. - Replication is a fact of life. Choose the right topology for your RTO/RPO and test failover regularly.
- Backups are a discipline. Automate, verify, restore.
- Sharding is the last resort, not the first. Most applications can run on a single primary for years with the right schema and indexes.
Practical Applications
For the Backend Engineer
- Always check the EXPLAIN of a slow query before reaching for a cache
- Use composite indexes in the same order as your
WHEREandORDER BYcolumns - Wrap related writes in a single transaction; keep it short
- Profile with
performance_schemaandsysschema, not just slow query log
For the DBA
- Set
innodb_buffer_pool_sizeto 50-75% of server memory - Use
innodb_log_file_sizelarge enough to avoid checkpointing storms (often 2-4 GB on busy servers) - Monitor replication lag, lock waits, and buffer pool hit rate
- Rotate
binlog_expire_logs_secondsaggressively (default 30 days in MySQL 8) to avoid filling disk
For the SRE
- Practice failover — make it routine, not a fire drill
- Keep three copies of every backup, on two media, in one geographic region (3-2-1 rule)
- Alert on replication lag, disk space, and
Threads_connected - Treat schema migrations like production deploys: reversible, staged, monitored
For the Architect
- Choose the replication topology that matches your RPO/RTO
- Plan the sharding key before you need it (user_id, tenant_id, region)
- Use Vitess or ProxySQL as the routing layer, not the application
- Consider managed services (Aurora, PlanetScale, RDS) for operational leverage — at the cost of some control
analysis
Strengths
- The canonical MySQL reference. No other book covers MySQL performance with this depth, currency, and operational grounding. The fourth edition (2022) reflects MySQL 8.0, not the MySQL of 2012.
- Six authors who have actually paged at 3 a.m. Schwartz, Zaitsev, Tkachenko, Zawodny, Lentz, and Balling are the people behind Percona and a generation of MySQL operations. The advice is not theoretical.
- Measurement-first philosophy. The book hammers the same point over and over: measure, then optimize. This is the single most valuable habit it teaches.
- Comprehensive coverage. Architecture, indexing, transactions, replication, backup, scaling, hardware, the cloud — the book covers the full operational life of a MySQL deployment.
- Up-to-date on the modern stack. Group Replication, GTIDs,
Vitess, ProxySQL, MySQL Shell,
sysschema,EXPLAIN ANALYZEin MySQL 8.0.18+, histograms — the 4th edition reads as a current book, not a rehash. - Practical command-line examples. Configuration snippets, SQL queries, shell commands, and monitoring queries are abundant and realistic.
- Excellent chapter on schema design. Data types, normalization, and the trade-offs between them are often glossed over; HPMYSQL treats them as a first-class topic.
- InnoDB internals done right. MVCC, undo logs, redo logs, change buffering, doublewrite — the level of detail is enough to reason about, not enough to write a storage engine from scratch.
Weaknesses
- Heavy at 950 pages. This is not a book you read cover to cover. It is a reference you dip into. Some chapters (replication, backup) feel padded with option lists.
- Some content overlaps with official docs. The MySQL Reference Manual and the Percona blog cover a surprising amount of the same material. The book is more cohesive, but you are paying for curation as much as content.
- Galera / Cluster coverage is shallow. MySQL Cluster (NDB) and Galera each deserve their own book; the treatment here is a survey.
- No exercises or problems. Like most O'Reilly titles, this is prose, not a textbook. You learn by doing it on a real system.
- Code examples are in MySQL 5.7 / 8.0 idiom. If you are stuck on a legacy 5.5 or 5.6 system (still common in 2026), some features won't apply.
- Print and PDF formatting is dense. Tables, snippets, and diagrams are packed. Reading on a phone is painful.
Criticism
The "Too MySQL-Specific" Critique
The book's title is honest — this is about MySQL. If you are deciding between MySQL, PostgreSQL, and MongoDB, you want Kleppmann's DDIA or Petrov's Database Internals first. HPMYSQL is for after you have already chosen MySQL.
The "Oracle/Percona Lens" Critique
Six authors, all from Percona or former Yahoo. There is little coverage of MariaDB's storage engine innovations, or of alternative forks (Drizzle, MyRocks). The book is, fairly, opinionated — but those opinions are weighted toward Oracle's MySQL and Percona's tooling.
The "Configuration Drift" Risk
Configuration advice ages fast. Recommendations from the 3rd edition (2012) are sometimes still quoted as current; some are wrong for MySQL 8. Read the 4th edition specifically.
The "Cloud and Managed" Gap
The book has a chapter on the cloud, but the operational advice is written for self-managed MySQL. If you run Aurora, PlanetScale, or RDS, much of the tuning chapter does not apply — those are someone else's defaults.
Scientific Grounding
| Concept | Source | Where in MySQL | |---|---|---| | B-Tree indexes | Bayer & McCreight (1971) | All storage engines | | ARIES recovery | Mohan et al. (1992) | InnoDB redo log + undo log | | MVCC | Reed (1978); Bernstein & Goodman (1983) | InnoDB, NDB | | Two-phase commit | Gray (1978) | Distributed transactions, XA | | Paxos | Lamport (1998) | Group Replication (MySQL 8) | | Paxos / cert-based | Galera (codership) | Galera / PXC | | LSM-trees | O'Neil et al. (1996) | MyRocks (not MySQL default) | | Read-copy update (RCU) | McKenney (2001) | Some lock-free paths |
Historical Context
The first edition (2004) appeared when MySQL was still a niche
database, MyISAM was the default, and replication meant "two
servers and CHANGE MASTER TO". Through four editions, the book
has tracked MySQL's transformation:
- 2nd edition (2008): InnoDB rises; MyISAM is on its way out
- 3rd edition (2012): SSDs, partitioning, semi-sync replication
- 4th edition (2022): MySQL 8, Group Replication, cloud-native,
Vitess,
EXPLAIN ANALYZE
In a sense, the four editions of HPMYSQL are a history of MySQL itself, told by people who helped build it.
Comparison
| Book | Author | Focus | Compared to HPMYSQL | |---|---|---|---| | High Performance MySQL (4th) | Schwartz et al. | MySQL performance, ops, scaling | The reference for MySQL | | Database Internals | Petrov | Storage engines, distributed systems | Broader; less MySQL-specific | | Designing Data-Intensive Applications | Kleppmann | Distributed data principles | Higher-level; not MySQL-specific | | SQL Antipatterns | Karwin | Schema and query mistakes | Lighter; portable across SQL | | Use The Index, Luke! | Winand | Indexing deep dive | Focused, free online; less operational | | MySQL Cookbook | DuBois | Recipes and patterns | More examples; less internals | | Effective MySQL | SqlMaestro | Backup/replication specifics | Narrower scope |
Final Assessment
| Dimension | Rating | Notes | |---|---|---| | Depth | 10/10 | The most comprehensive MySQL book | | Currency | 9/10 | 4th edition (2022) covers MySQL 8.0 well | | Operational Utility | 10/10 | Written by operators, for operators | | Readability | 7/10 | Dense; this is a reference, not a beach read | | Coverage of Alternatives | 5/10 | Barely mentions MariaDB, MyRocks, etc. | | Lasting Value | 9/10 | Principles age well; specific config ages faster | | Overall | 9.5/10 | The definitive MySQL book. Buy it. |
narration
Introduction
Welcome to BookAtlas. Today: High Performance MySQL: Proven Strategies for Operating at Scale. Fourth edition, 2022. Six authors, including the founders of Percona. 950 pages. The book that has sat on every MySQL DBA's desk for nearly twenty years.
This is the book you read when the database is on fire, or when you want to make sure it does not catch fire in the first place.
Why This Book Exists
Engineer: MySQL has a reputation as the "simple" database. It is the default in LAMP stacks. It is the database you reach for when you want something to just work.
Skeptic: And that is the problem, right? The defaults take you 80% of the way. The last 20% is where you find yourself in a war room at 3 a.m.
Engineer: Exactly. Most MySQL books stop at CREATE TABLE and
basic SELECT. This book starts where they end. Indexing strategy,
InnoDB internals, EXPLAIN plans, the optimizer's cost model,
replication topologies, backup verification, sharding. The kind of
stuff that separates a developer from an operator.
InnoDB: The Engine That Matters
Engineer: Let's start with InnoDB. It is the default storage engine since MySQL 5.5, and for good reason. It is the only engine in MySQL that gives you transactions, foreign keys, MVCC, and crash-safe replication. Everything else is legacy.
Skeptic: MVCC is the magic word, isn't it? Multi-version concurrency control.
Engineer: Right. InnoDB keeps multiple versions of each row in the clustered index. Writers create new versions; readers see a consistent snapshot built from the undo log. This is why readers don't block writers and vice versa. It is also why long-running transactions bloat the undo log and slow down purges.
flowchart LR
W["Writer"] -->|"creates new row version"| DB["Clustered Index"]
R["Reader"] -->|"reads consistent snapshot"| UL["Undo Log"]
DB -.->|"old versions live in"| UL
Skeptic: So the trade-off is memory. Long transactions = lots of undo.
Engineer: Exactly. And the book hammers this: keep transactions short. Don't open one in your application server and wait for a user to click a button.
Indexing: The Highest-Leverage Optimization
Skeptic: Most of my slow queries come down to missing indexes. Is that a coincidence?
Engineer: It is not a coincidence. Indexing is the single most leveraged optimization in MySQL. A well-chosen composite or covering index can turn a 10-second query into a 10-millisecond query. A new server, with the same indexes, will not.
Skeptic: Walk me through the "covering index" idea.
Engineer: Imagine a phone book. You want to find everyone named "Schwartz" and you only need their last name and city. A regular index finds the page; the table is the alphabetical entries with phone numbers and addresses you don't need. A covering index includes only the columns you actually want. The query never touches the table at all.
Skeptic: So bigger index, faster query.
Engineer: Bigger index, faster query, slower writes. Every
index is overhead on every INSERT, UPDATE, DELETE. The art is
matching indexes to actual query patterns, not to the schema.
Reading EXPLAIN
Engineer: The second most important skill is reading EXPLAIN.
Skeptic: Just put EXPLAIN in front of a query?
Engineer: In MySQL 8, use EXPLAIN ANALYZE — it actually runs
the query and shows you the real numbers, not just estimates.
Or EXPLAIN FORMAT=JSON for a structured plan.
Skeptic: And you read this to know what the optimizer is going to do?
Engineer: Exactly. The output tells you the join order, which indexes it picked (or didn't), how many rows it thinks it will scan, whether it needs to sort or use a temp table. If the optimizer's row estimates are wildly wrong — say it thinks a query will return 10 rows but actually returns 10 million — that is almost always a missing or broken index.
Skeptic: And if the optimizer is just wrong?
Engineer: Hints. SELECT /*+ INDEX(t users_name_idx) */ ....
But the book is clear: hints are a last resort, not a first
option. If you need a hint, ask why the optimizer made the wrong
choice. Usually, the answer is "the schema or the stats are wrong."
Replication and High Availability
Skeptic: Replication in MySQL has always had a reputation for being fragile. Is that still true?
Engineer: It is more accurate to say it has more knobs now. Asynchronous replication (the default) is fast but lossy. Semi- synchronous replication waits for at least one replica to acknowledge. Group Replication uses Paxos to keep a quorum in sync. Galera is virtually synchronous, cert-based.
flowchart LR
M["Primary"] -->|"binary log"| R1["Replica"]
M -->|"binary log"| R2["Reporting Replica"]
M -->|"binary log"| R3["DR Replica"]
R1 -->|"semi-sync ack"| M
Skeptic: And the failover story?
Engineer: Better than it was. With GTIDs and orchestrators like Orchestrator, MHA, or the cloud-native solutions, failover is mostly routine — if you have rehearsed it. The book is clear: failover that you have not practiced is a fire drill. Failover that you have practiced is a deploy.
Backup: The Discipline Most Teams Skip
Engineer: Here is the most underrated chapter: backup and recovery.
Skeptic: Backups are boring. Everyone knows you need them.
Engineer: Everyone knows they need them. Very few teams have actually restored from one. The book covers the tools — mysqldump, mydumper, Percona XtraBackup, filesystem snapshots, binary log PITR — and then spends half the chapter on verification.
Skeptic: "The only backup that exists is one you have restored from."
Engineer: That is literally the line. The book is opinionated about this. 3-2-1 backup strategy (three copies, two media, one off-site). Automated restores. Checksum verification. Documented RTO and RPO.
Sharding: The Last Resort
Skeptic: When do you shard?
Engineer: When you have to. Most applications run on a single primary for years with the right schema, indexes, and read replicas. Sharding adds operational complexity that you do not want to pay for unless you have to.
Skeptic: And when you have to?
Engineer: Vitess. ProxySQL. MySQL Cluster (NDB). Galera. Each solves a different problem. Vitess is the modern default for sharded MySQL — it sits between your application and your shards, routing queries and managing topology. The book covers all of these, with realistic advice on when each makes sense.
The Verdict
Engineer: High Performance MySQL is not a book you read once. It is a reference you return to. The 950 pages are deliberately comprehensive — you will not need every chapter on day one, but the chapter you need at 2 a.m. will save your on-call rotation.
Skeptic: It is also not a beginner book. If you have never written a JOIN, this is not where you start.
Engineer: Correct. Read a SQL primer first. Then read Karwin's SQL Antipatterns. Then read this. By the time you finish, you will be the person your team pings when MySQL is misbehaving.
Skeptic: Final rating?
Engineer: 9.5 out of 10. The definitive MySQL book. The fact that it is on its fourth edition, written by the people who built Percona, and still current in 2026 is a testament to both the book's quality and the longevity of MySQL itself.
This has been a BookAtlas narration of High Performance MySQL by Baron Schwartz, Peter Zaitsev, Vadim Tkachenko, Jeremy Zawodny, Arjen Lentz, and Derek J. Balling. Thanks for listening.