High Performance MySQL

Name: High Performance MySQL
ISBN: 9781492080510

Proven Strategies for Operating at Scale

Baron Schwartz, Peter Zaitsev, Vadim Tkachenko, Jeremy Zawodny, Arjen Lentz, Derek J. Balling · 2022 · 950pp

sufficient

reading path: overview → analysis → narration

overview

Overview

High Performance MySQL: Proven Strategies for Operating at Scale (4th edition, 2022) by Baron Schwartz, Peter Zaitsev, Vadim Tkachenko, Jeremy Zawodny, Arjen Lentz, and Derek J. Balling is the canonical reference for anyone running MySQL in production. Six authors — most of them founders or early engineers of Percona — pool their experience from operating some of the largest MySQL installations on the planet.

This is not a beginner's SQL book. It is a deep, opinionated, measurement-first guide to how MySQL actually works under load: InnoDB's MVCC and locking model, B-tree and covering indexes, the query optimizer and EXPLAIN, replication topologies, backup and recovery, and the various paths to scaling beyond a single server.

------|-------|--------------| | 1. Foundations | MySQL architecture, history, benchmarking, profiling, schema design | Logical architecture, Concurrency models, Transactions, Configuration, Schema | | 2. Indexing & Query Optimization | B-Tree indexes, hash indexes, EXPLAIN, the optimizer, query rewriting | Indexing strategies, Query performance, EXPLAIN tour, optimizer hints | | 3. Server Performance | Hardware, OS, locking, transactions, InnoDB internals | Hardware sizing, OS tuning, Locks, Transactions, Multiversion Concurrency | | 4. Replication & High Availability | Binary logs, replication topologies, failover, GTIDs, semisync | Replication configuration, HA architectures, Group Replication, Binlog servers | | 5. Scaling & Operations | Sharding, MySQL Cluster, Galera, Vitess, backup & recovery, cloud | Sharding patterns, MySQL Cluster, Galera, Backup & Recovery, MySQL in the cloud |

The book's recurring message: measure first, optimize second, and never trust intuition over a benchmark.

Key Takeaways

MySQL's strength is its simplicity — but its internals are subtle. The default storage engine is InnoDB; understanding its MVCC, lock manager, redo log, and change buffer is non-negotiable for serious work.
Indexes are not free. Each secondary index speeds reads but slows writes and consumes buffer pool space. The art is matching indexes to actual query patterns — not to the schema.
EXPLAIN is the most underused tool in MySQL. Reading the optimizer's plan reveals join order, index usage, and the difference between a fast query and a full table scan. The fourth edition adds the JSON-format EXPLAIN and EXPLAIN ANALYZE.
The optimizer is a cost-based search. It estimates row counts, weighs access paths, and chooses what it believes is cheapest. When its estimates are wrong, queries are slow. Good indexes fix this; hints are a last resort.
Transactions and locking trade off against throughput. InnoDB uses row-level locks with MVCC so readers don't block writers. But deadlock, gap locks, and next-key locks can still surprise you under high concurrency.
Replication is for availability and read scale, not for write scale. A single leader is still a single point for writes. Semi-synchronous replication and group replication close some gaps; sharding closes the rest.
Backups are not backups until you have restored from one. The book devotes an entire chapter to backup tools, PITR, and verification — because the worst time to discover a backup is broken is during an incident.
Scaling MySQL means sharding, eventually. MySQL Cluster, Galera, ProxySQL, and Vitess are real tools that solve real problems. None of them are magic.
Configuration is a starting point, not a destination. Settings like innodb_buffer_pool_size matter, but defaults from MySQL 8.0 are usually good. Tuning without measurement is just guessing.
Hardware and the operating system matter more than people think. SSDs changed the game for I/O-bound workloads; NUMA, fsync, and the page cache continue to shape performance.

Who Should Read

| Reader Type | Why | |---|---| | Database administrators | The day-to-day reference for InnoDB tuning, replication, and recovery | | Backend engineers | Understand why your queries are slow and what to do about it | | SREs and platform engineers | Backup strategies, HA topologies, failover, capacity planning | | Performance engineers | Benchmarking, profiling, EXPLAIN-driven optimization | | Engineering managers | Speak the language of the team keeping the database alive | | Anyone designing a schema | Indexing, data types, normalization decisions |

Who Should Skip

Beginners with no SQL experience — read a SQL primer first
Readers wanting a tutorial on SELECT syntax — this is internals, not SQL
Anyone using only managed services (RDS, Aurora) at default settings — you'll get more from the AWS or Cloud SQL docs
Those looking for a single read of "what's new in MySQL 8" — the book is broader than release notes

Why This Book Matters

For nearly twenty years, High Performance MySQL has been the book that DBAs leave open on their desk. The fourth edition matters because the MySQL it describes is fundamentally different from the MySQL of 2012 or 2004: InnoDB is the only engine that matters, replication has matured (GTIDs, multi-source, group replication), and the operational landscape has shifted (cloud, Vitess, ProxySQL, orchestration, container-native deployments).

The authors are not journalists. Schwartz, Zaitsev, and Tkachenko built Percona into the world's leading MySQL consultancy. Zawodny ran MySQL at Yahoo. Lentz was a long-time MySQL community manager. Balling is a Percona engineer. They have been paged at 3 a.m. for the problems they describe — and the patterns they share are patterns they have seen at scale.

If you operate MySQL in production, this is the most valuable book on your shelf.

| Book | Author | Connection | |------|--------|------------| | Database Internals | Alex Petrov | Deeper dive into storage engine internals (B-trees, LSM) | | Designing Data-Intensive Applications | Martin Kleppmann | Distributed systems framing that complements the MySQL specifics | | SQL Antipatterns | Bill Karwin | Schema design mistakes and how to avoid them | | Use The Index, Luke! | Markus Winand | A focused, web-friendly companion on indexing | | MySQL Cookbook | Paul DuBois | Recipes and patterns, lighter than HPMYSQL | | Site Reliability Engineering | Beyer et al. | Operational mindset for running databases at scale |

Final Verdict

High Performance MySQL remains the authoritative MySQL reference. The fourth edition is comprehensive, current, and grounded in operational reality. It is dense — 950 pages is not a casual read — but every chapter pays back the time. If you are a working MySQL professional, buy it. If you are a backend engineer who wants to understand why the database is slow, borrow it.

Rating: 9.5/10 — The definitive MySQL book. Required reading for DBAs, SREs, and serious backend engineers.

content map

MySQL Architecture at a Glance

flowchart TB
    subgraph Client["Client Layer"]
        C1["Connection Manager<br/>(threads, auth)"]
        C2["SQL Interface<br/>(parser, optimizer)"]
    end

    subgraph Server["MySQL Server"]
        P["Parser & Preprocessor"]
        O["Query Optimizer<br/>(cost-based)"]
        E["Execution Engine"]
    end

    subgraph Engines["Storage Engines"]
        I["InnoDB<br/>(default since 5.5)"]
        M["MyISAM<br/>(legacy)"]
        N["NDB / Cluster"]
        X["Memory, CSV, Archive..."]
    end

    subgraph FS["Filesystem & Logs"]
        R["Redo Log (InnoDB)"]
        B["Binary Log (server)"]
        D["Data files (.ibd)"]
    end

    C1 --> P --> O --> E
    E --> I
    E --> M
    E --> N
    E --> X
    I --> R
    I --> D
    E --> B

MySQL is a pluggable storage engine architecture. The server layer handles connection management, parsing, optimization, and execution. Each storage engine owns its on-disk format, locking, and indexing. InnoDB is the only engine that should be on your radar for OLTP.

InnoDB Internals: MVCC and Locking

sequenceDiagram
    participant T1 as Transaction T1
    participant T2 as Transaction T2
    participant IB as InnoDB Buffer Pool
    participant R as Redo Log

    T1->>IB: BEGIN
    T1->>IB: UPDATE row id=1 SET v=10
    IB->>R: Append redo entry
    T2->>IB: SELECT v FROM t WHERE id=1
    IB-->>T2: Returns v=8 (snapshot, undo log)
    T1->>IB: COMMIT
    T1->>R: Flush & mark commit
    T2->>IB: New snapshot after T1 commits
    IB-->>T2: Returns v=10

InnoDB implements Multi-Version Concurrency Control: writers create new row versions in the clustered index, readers see a consistent snapshot built from the undo log. Locks are mostly row-level, with gap locks and next-key locks preventing phantoms at the cost of occasional deadlocks.

Key consequences:

REPEATABLE READ (InnoDB's default) gives a true snapshot, not the "no phantoms" of the SQL standard.
Long-running transactions bloat the undo log and slow purges.
Hot rows cause lock contention that no amount of indexing fixes.

Indexing Strategies

B-Tree Indexes

B-tree indexes are the bread and butter of MySQL. They are ordered, support range scans, and can serve ORDER BY and GROUP BY without a sort.

flowchart LR
    A["Root"] --> B["Internal node"]
    A --> C["Internal node"]
    B --> L1["Leaf 1<br/>id=1..100"]
    B --> L2["Leaf 2<br/>id=101..200"]
    C --> L3["Leaf 3<br/>id=201..300"]
    C --> L4["Leaf 4<br/>id=301..400"]
    L1 -.->|"linked"| L2 -.->|"linked"| L3 -.->|"linked"| L4

Composite, Covering, and Prefix Indexes

| Index Type | Best For | Trade-off | |---|---|---| | Single-column | Equality lookups | Limited selectivity | | Composite (leftmost) | Range + sort, multi-column filters | Order matters; cannot skip columns | | Covering | Queries whose columns are all in the index | Larger indexes, slower writes | | Prefix | VARCHAR/TEXT columns | Cannot be used for range or sort |

A covering index is the most powerful optimization in MySQL: when the index contains every column the query needs, InnoDB never touches the table. EXPLAIN shows this as Extra: Using index.

Hash and Fulltext

MEMORY engine supports hash indexes, but you should not use MEMORY for anything serious in 2026.
MySQL 8.0 added functional indexes (CREATE INDEX ... ON ((LOWER(name)))) and descending indexes for ORDER BY ... DESC.

Reading EXPLAIN

EXPLAIN FORMAT=JSON
SELECT u.id, u.name, COUNT(*) AS n
FROM users u
JOIN orders o ON o.user_id = u.id
WHERE u.country = 'US' AND o.created_at > '2024-01-01'
GROUP BY u.id, u.name
ORDER BY n DESC
LIMIT 10;

What to look for in EXPLAIN:

| Column | Watch For | |---|---| | type | system / const / eq_ref / ref / range / index / ALL | | key | The index actually used (vs. possible_keys) | | rows | Estimated row count; huge gaps from reality signal bad stats | | Extra | Using filesort, Using temporary, Using index, Impossible WHERE | | filtered | Percentage of rows that pass the WHERE |

The fourth edition dedicates an entire chapter to walking through EXPLAIN output and using OPTIMIZER_TRACE to see why the optimizer chose a particular plan.

The Query Optimizer

flowchart TB
    Q["Query"] --> P["Parse & Resolve"]
    P --> R["Rewrite<br/>(subqueries, views)"]
    R --> O["Optimize<br/>(cost-based search)"]
    O --> QP["Query Plan"]
    QP --> E["Execute"]
    O -.->|"uses"| S["Statistics<br/>(cardinality, histograms)"]
    O -.->|"uses"| C["Cost model<br/>(disk, CPU, memory)"]

The optimizer is not magic. It does a cost-based search: it estimates row counts from index statistics, considers access paths (full scan, range, ref, join order), and picks the cheapest plan.

When it is wrong:

Bad stats → run ANALYZE TABLE to refresh histograms (MySQL 8+)
Skewed data → use query hints or rewrite the query
Missing index → add the right covering index
Optimizer limitations → hints (/*+ INDEX(...) */) or STRAIGHT_JOIN

Transactions and Isolation

| Isolation | Dirty Read | Non-Repeatable Read | Phantom | InnoDB Default | |---|---|---|---|---| | READ UNCOMMITTED | yes | yes | yes | no | | READ COMMITTED | no | yes | yes | no | | REPEATABLE READ | no | no | yes (mostly) | yes | | SERIALIZABLE | no | no | no | no |

InnoDB's REPEATABLE READ uses gap locks to prevent most phantoms. It is "snapshot isolation" in practice — your transaction sees the data as it was when the first read happened, and reads are consistent within the transaction.

Pitfalls the book covers in detail:

Deadlocks — pick a consistent lock order; keep transactions short; use SELECT ... FOR UPDATE deliberately
Lock waits — monitor information_schema.innodb_trx and performance_schema.data_locks
Long transactions — fill the undo log, block purges, slow replication

Replication

flowchart LR
    subgraph Source["Source (Primary)"]
        M1["Master"]
    end
    subgraph Replicas["Replicas"]
        R1["Replica 1<br/>(read scaling)"]
        R2["Replica 2<br/>(analytics)"]
        R3["Replica 3<br/>(disaster recovery)"]
    end
    M1 -->|"binary log"| R1
    M1 -->|"binary log"| R2
    M1 -->|"binary log"| R3
    R1 -->|"semi-sync ack"| M1

Replication modes the book covers:

Asynchronous (default): primary doesn't wait for replicas
Semi-synchronous: primary waits for at least one replica to acknowledge receipt
Group Replication (MySQL 8+): Paxos-inspired, single-primary or multi-primary with built-in conflict detection
Galera / Percona XtraDB Cluster: virtually synchronous, cert- based replication

Replication is for read scale and availability, not write scale. For write scale, you need sharding.

Backup and Recovery

| Method | Granularity | Locking | Speed | Use Case | |---|---|---|---|---| | mysqldump | Logical, full | Various | Slow | Small DBs, schema migrations | | mydumper | Logical, parallel | Minimal | Fast | Large DBs, logical backups | | Percona XtraBackup | Physical, online | None | Fast | Hot backups of InnoDB | | Filesystem snapshot (LVM/ZFS) | Physical | Brief FS freeze | Fast | Large DBs, full snapshots | | Binary log + PITR | Time | None | Continuous | Replay to a point in time |

The book's advice: the only backup that exists is one you have restored from. Test restores. Verify checksums. Store backups off- site. Document the time-to-restore.

Scaling Beyond a Single Server

flowchart TB
    subgraph Strategies["Scaling Strategies"]
        V["Vertical<br/>(bigger server)"]
        R["Read replicas<br/>(read scale)"]
        S["Sharding<br/>(write scale)"]
    end

    subgraph Tools["Tools & Systems"]
        P["ProxySQL"]
        G["Galera / PXC"]
        C["MySQL Cluster (NDB)"]
        VT["Vitess"]
    end

    V --> R --> S
    S --> P
    S --> G
    S --> C
    S --> VT

| Approach | Solves | Trade-off | |---|---|---| | Vertical scaling | Most early bottlenecks | Hardware limits, single point of failure | | Read replicas | Read throughput, some HA | Replication lag, write contention on primary | | Vitess / ProxySQL | Sharding, connection pooling | Operational complexity, query restrictions | | MySQL Cluster (NDB) | Write scale + HA | Very different SQL surface, in-memory focus | | Galera / PXC | Multi-master with sync | Conflict detection, schema migration pain | | Group Replication | HA with built-in quorum | Throughput limits, network requirements |

The fourth edition adds a chapter on MySQL in the cloud (RDS, Aurora, Cloud SQL, PlanetScale) and the new realities of running MySQL on Kubernetes.

Key Lessons

Measure before you optimize. Use sysbench, mysqlslap, or production traffic replay. The bottleneck is rarely where you think.
Indexes are the most leveraged optimization. A well-chosen covering index beats a faster server.
InnoDB is the default — and the right default. Master its MVCC model, its redo log, and its lock types.
EXPLAIN is the diagnostic tool. If you cannot read EXPLAIN, you cannot reason about query performance.
Configuration has high-leverage knobs (innodb_buffer_pool_size, innodb_log_file_size, innodb_flush_log_at_trx_commit) but defaults are usually fine. Change with a benchmark.
Replication is a fact of life. Choose the right topology for your RTO/RPO and test failover regularly.
Backups are a discipline. Automate, verify, restore.
Sharding is the last resort, not the first. Most applications can run on a single primary for years with the right schema and indexes.

Practical Applications

For the Backend Engineer

Always check the EXPLAIN of a slow query before reaching for a cache
Use composite indexes in the same order as your WHERE and ORDER BY columns
Wrap related writes in a single transaction; keep it short
Profile with performance_schema and sys schema, not just slow query log

For the DBA

Set innodb_buffer_pool_size to 50-75% of server memory
Use innodb_log_file_size large enough to avoid checkpointing storms (often 2-4 GB on busy servers)
Monitor replication lag, lock waits, and buffer pool hit rate
Rotate binlog_expire_logs_seconds aggressively (default 30 days in MySQL 8) to avoid filling disk

For the SRE

Practice failover — make it routine, not a fire drill
Keep three copies of every backup, on two media, in one geographic region (3-2-1 rule)
Alert on replication lag, disk space, and Threads_connected
Treat schema migrations like production deploys: reversible, staged, monitored

For the Architect

Choose the replication topology that matches your RPO/RTO
Plan the sharding key before you need it (user_id, tenant_id, region)
Use Vitess or ProxySQL as the routing layer, not the application
Consider managed services (Aurora, PlanetScale, RDS) for operational leverage — at the cost of some control

analysis

Strengths

The canonical MySQL reference. No other book covers MySQL performance with this depth, currency, and operational grounding. The fourth edition (2022) reflects MySQL 8.0, not the MySQL of 2012.
Six authors who have actually paged at 3 a.m. Schwartz, Zaitsev, Tkachenko, Zawodny, Lentz, and Balling are the people behind Percona and a generation of MySQL operations. The advice is not theoretical.
Measurement-first philosophy. The book hammers the same point over and over: measure, then optimize. This is the single most valuable habit it teaches.
Comprehensive coverage. Architecture, indexing, transactions, replication, backup, scaling, hardware, the cloud — the book covers the full operational life of a MySQL deployment.
Up-to-date on the modern stack. Group Replication, GTIDs, Vitess, ProxySQL, MySQL Shell, sys schema, EXPLAIN ANALYZE in MySQL 8.0.18+, histograms — the 4th edition reads as a current book, not a rehash.
Practical command-line examples. Configuration snippets, SQL queries, shell commands, and monitoring queries are abundant and realistic.
Excellent chapter on schema design. Data types, normalization, and the trade-offs between them are often glossed over; HPMYSQL treats them as a first-class topic.
InnoDB internals done right. MVCC, undo logs, redo logs, change buffering, doublewrite — the level of detail is enough to reason about, not enough to write a storage engine from scratch.

Weaknesses

Heavy at 950 pages. This is not a book you read cover to cover. It is a reference you dip into. Some chapters (replication, backup) feel padded with option lists.
Some content overlaps with official docs. The MySQL Reference Manual and the Percona blog cover a surprising amount of the same material. The book is more cohesive, but you are paying for curation as much as content.
Galera / Cluster coverage is shallow. MySQL Cluster (NDB) and Galera each deserve their own book; the treatment here is a survey.
No exercises or problems. Like most O'Reilly titles, this is prose, not a textbook. You learn by doing it on a real system.
Code examples are in MySQL 5.7 / 8.0 idiom. If you are stuck on a legacy 5.5 or 5.6 system (still common in 2026), some features won't apply.
Print and PDF formatting is dense. Tables, snippets, and diagrams are packed. Reading on a phone is painful.

Criticism

The "Too MySQL-Specific" Critique

The book's title is honest — this is about MySQL. If you are deciding between MySQL, PostgreSQL, and MongoDB, you want Kleppmann's DDIA or Petrov's Database Internals first. HPMYSQL is for after you have already chosen MySQL.

The "Oracle/Percona Lens" Critique

Six authors, all from Percona or former Yahoo. There is little coverage of MariaDB's storage engine innovations, or of alternative forks (Drizzle, MyRocks). The book is, fairly, opinionated — but those opinions are weighted toward Oracle's MySQL and Percona's tooling.

The "Configuration Drift" Risk

Configuration advice ages fast. Recommendations from the 3rd edition (2012) are sometimes still quoted as current; some are wrong for MySQL 8. Read the 4th edition specifically.

The "Cloud and Managed" Gap

The book has a chapter on the cloud, but the operational advice is written for self-managed MySQL. If you run Aurora, PlanetScale, or RDS, much of the tuning chapter does not apply — those are someone else's defaults.

Scientific Grounding

| Concept | Source | Where in MySQL | |---|---|---| | B-Tree indexes | Bayer & McCreight (1971) | All storage engines | | ARIES recovery | Mohan et al. (1992) | InnoDB redo log + undo log | | MVCC | Reed (1978); Bernstein & Goodman (1983) | InnoDB, NDB | | Two-phase commit | Gray (1978) | Distributed transactions, XA | | Paxos | Lamport (1998) | Group Replication (MySQL 8) | | Paxos / cert-based | Galera (codership) | Galera / PXC | | LSM-trees | O'Neil et al. (1996) | MyRocks (not MySQL default) | | Read-copy update (RCU) | McKenney (2001) | Some lock-free paths |

Historical Context

The first edition (2004) appeared when MySQL was still a niche database, MyISAM was the default, and replication meant "two servers and CHANGE MASTER TO". Through four editions, the book has tracked MySQL's transformation:

2nd edition (2008): InnoDB rises; MyISAM is on its way out
3rd edition (2012): SSDs, partitioning, semi-sync replication
4th edition (2022): MySQL 8, Group Replication, cloud-native, Vitess, EXPLAIN ANALYZE

In a sense, the four editions of HPMYSQL are a history of MySQL itself, told by people who helped build it.

Comparison

| Book | Author | Focus | Compared to HPMYSQL | |---|---|---|---| | High Performance MySQL (4th) | Schwartz et al. | MySQL performance, ops, scaling | The reference for MySQL | | Database Internals | Petrov | Storage engines, distributed systems | Broader; less MySQL-specific | | Designing Data-Intensive Applications | Kleppmann | Distributed data principles | Higher-level; not MySQL-specific | | SQL Antipatterns | Karwin | Schema and query mistakes | Lighter; portable across SQL | | Use The Index, Luke! | Winand | Indexing deep dive | Focused, free online; less operational | | MySQL Cookbook | DuBois | Recipes and patterns | More examples; less internals | | Effective MySQL | SqlMaestro | Backup/replication specifics | Narrower scope |

Final Assessment

| Dimension | Rating | Notes | |---|---|---| | Depth | 10/10 | The most comprehensive MySQL book | | Currency | 9/10 | 4th edition (2022) covers MySQL 8.0 well | | Operational Utility | 10/10 | Written by operators, for operators | | Readability | 7/10 | Dense; this is a reference, not a beach read | | Coverage of Alternatives | 5/10 | Barely mentions MariaDB, MyRocks, etc. | | Lasting Value | 9/10 | Principles age well; specific config ages faster | | Overall | 9.5/10 | The definitive MySQL book. Buy it. |

narration

Introduction

Welcome to BookAtlas. Today: High Performance MySQL: Proven Strategies for Operating at Scale. Fourth edition, 2022. Six authors, including the founders of Percona. 950 pages. The book that has sat on every MySQL DBA's desk for nearly twenty years.

This is the book you read when the database is on fire, or when you want to make sure it does not catch fire in the first place.

Why This Book Exists

Engineer: MySQL has a reputation as the "simple" database. It is the default in LAMP stacks. It is the database you reach for when you want something to just work.

Skeptic: And that is the problem, right? The defaults take you 80% of the way. The last 20% is where you find yourself in a war room at 3 a.m.

Engineer: Exactly. Most MySQL books stop at CREATE TABLE and basic SELECT. This book starts where they end. Indexing strategy, InnoDB internals, EXPLAIN plans, the optimizer's cost model, replication topologies, backup verification, sharding. The kind of stuff that separates a developer from an operator.

InnoDB: The Engine That Matters

Engineer: Let's start with InnoDB. It is the default storage engine since MySQL 5.5, and for good reason. It is the only engine in MySQL that gives you transactions, foreign keys, MVCC, and crash-safe replication. Everything else is legacy.

Skeptic: MVCC is the magic word, isn't it? Multi-version concurrency control.

Engineer: Right. InnoDB keeps multiple versions of each row in the clustered index. Writers create new versions; readers see a consistent snapshot built from the undo log. This is why readers don't block writers and vice versa. It is also why long-running transactions bloat the undo log and slow down purges.

flowchart LR
    W["Writer"] -->|"creates new row version"| DB["Clustered Index"]
    R["Reader"] -->|"reads consistent snapshot"| UL["Undo Log"]
    DB -.->|"old versions live in"| UL

Skeptic: So the trade-off is memory. Long transactions = lots of undo.

Engineer: Exactly. And the book hammers this: keep transactions short. Don't open one in your application server and wait for a user to click a button.

Indexing: The Highest-Leverage Optimization

Skeptic: Most of my slow queries come down to missing indexes. Is that a coincidence?

Engineer: It is not a coincidence. Indexing is the single most leveraged optimization in MySQL. A well-chosen composite or covering index can turn a 10-second query into a 10-millisecond query. A new server, with the same indexes, will not.

Skeptic: Walk me through the "covering index" idea.

Engineer: Imagine a phone book. You want to find everyone named "Schwartz" and you only need their last name and city. A regular index finds the page; the table is the alphabetical entries with phone numbers and addresses you don't need. A covering index includes only the columns you actually want. The query never touches the table at all.

Skeptic: So bigger index, faster query.

Engineer: Bigger index, faster query, slower writes. Every index is overhead on every INSERT, UPDATE, DELETE. The art is matching indexes to actual query patterns, not to the schema.

Reading EXPLAIN

Engineer: The second most important skill is reading EXPLAIN.

Skeptic: Just put EXPLAIN in front of a query?

Engineer: In MySQL 8, use EXPLAIN ANALYZE — it actually runs the query and shows you the real numbers, not just estimates. Or EXPLAIN FORMAT=JSON for a structured plan.

Skeptic: And you read this to know what the optimizer is going to do?

Engineer: Exactly. The output tells you the join order, which indexes it picked (or didn't), how many rows it thinks it will scan, whether it needs to sort or use a temp table. If the optimizer's row estimates are wildly wrong — say it thinks a query will return 10 rows but actually returns 10 million — that is almost always a missing or broken index.

Skeptic: And if the optimizer is just wrong?

Engineer: Hints. SELECT /*+ INDEX(t users_name_idx) */ .... But the book is clear: hints are a last resort, not a first option. If you need a hint, ask why the optimizer made the wrong choice. Usually, the answer is "the schema or the stats are wrong."

Replication and High Availability

Skeptic: Replication in MySQL has always had a reputation for being fragile. Is that still true?

Engineer: It is more accurate to say it has more knobs now. Asynchronous replication (the default) is fast but lossy. Semi- synchronous replication waits for at least one replica to acknowledge. Group Replication uses Paxos to keep a quorum in sync. Galera is virtually synchronous, cert-based.

flowchart LR
    M["Primary"] -->|"binary log"| R1["Replica"]
    M -->|"binary log"| R2["Reporting Replica"]
    M -->|"binary log"| R3["DR Replica"]
    R1 -->|"semi-sync ack"| M

Skeptic: And the failover story?

Engineer: Better than it was. With GTIDs and orchestrators like Orchestrator, MHA, or the cloud-native solutions, failover is mostly routine — if you have rehearsed it. The book is clear: failover that you have not practiced is a fire drill. Failover that you have practiced is a deploy.

Backup: The Discipline Most Teams Skip

Engineer: Here is the most underrated chapter: backup and recovery.

Skeptic: Backups are boring. Everyone knows you need them.

Engineer: Everyone knows they need them. Very few teams have actually restored from one. The book covers the tools — mysqldump, mydumper, Percona XtraBackup, filesystem snapshots, binary log PITR — and then spends half the chapter on verification.

Skeptic: "The only backup that exists is one you have restored from."

Engineer: That is literally the line. The book is opinionated about this. 3-2-1 backup strategy (three copies, two media, one off-site). Automated restores. Checksum verification. Documented RTO and RPO.

Sharding: The Last Resort

Skeptic: When do you shard?

Engineer: When you have to. Most applications run on a single primary for years with the right schema, indexes, and read replicas. Sharding adds operational complexity that you do not want to pay for unless you have to.

Skeptic: And when you have to?

Engineer: Vitess. ProxySQL. MySQL Cluster (NDB). Galera. Each solves a different problem. Vitess is the modern default for sharded MySQL — it sits between your application and your shards, routing queries and managing topology. The book covers all of these, with realistic advice on when each makes sense.

The Verdict

Engineer: High Performance MySQL is not a book you read once. It is a reference you return to. The 950 pages are deliberately comprehensive — you will not need every chapter on day one, but the chapter you need at 2 a.m. will save your on-call rotation.

Skeptic: It is also not a beginner book. If you have never written a JOIN, this is not where you start.

Engineer: Correct. Read a SQL primer first. Then read Karwin's SQL Antipatterns. Then read this. By the time you finish, you will be the person your team pings when MySQL is misbehaving.

Skeptic: Final rating?

Engineer: 9.5 out of 10. The definitive MySQL book. The fact that it is on its fourth edition, written by the people who built Percona, and still current in 2026 is a testament to both the book's quality and the longevity of MySQL itself.

This has been a BookAtlas narration of High Performance MySQL by Baron Schwartz, Peter Zaitsev, Vadim Tkachenko, Jeremy Zawodny, Arjen Lentz, and Derek J. Balling. Thanks for listening.

Overview

Key Takeaways

Who Should Read

Who Should Skip

Why This Book Matters

Related Books

Final Verdict

MySQL Architecture at a Glance

InnoDB Internals: MVCC and Locking

Indexing Strategies

B-Tree Indexes

Composite, Covering, and Prefix Indexes

Hash and Fulltext

Reading EXPLAIN

The Query Optimizer

Transactions and Isolation

Replication

Backup and Recovery

Scaling Beyond a Single Server

Key Lessons

Practical Applications

For the Backend Engineer

For the DBA

For the SRE

For the Architect

Strengths

Weaknesses

Criticism

The "Too MySQL-Specific" Critique

The "Oracle/Percona Lens" Critique

The "Configuration Drift" Risk

The "Cloud and Managed" Gap

Scientific Grounding

Historical Context

Comparison

Final Assessment

Introduction

Why This Book Exists

InnoDB: The Engine That Matters

Indexing: The Highest-Leverage Optimization

Reading EXPLAIN

Replication and High Availability

Backup: The Discipline Most Teams Skip

Sharding: The Last Resort

The Verdict