benchANT Homepage

Cassandra 5 Enters the benchANT Ranking

YCSB CRUD Benchmark Results: Cassandra 5 vs. Cassandra 4 Across Five Scaling Sizes

benchANT maintains an open database performance ranking that provides fully reproducible benchmark results for databases running on IaaS infrastructure across major cloud providers. The ranking is designed to give engineering teams, architects, and decision-makers an objective, data-driven foundation for comparing database technologies under standardized conditions.

benchANT regularly includes new databases and cloud providers in the ranking. If you are a database vendor or cloud provider and want to get listed, reach out to info@benchant.com. If you are missing a specific database or cloud provider and would like to see it added, feel free to reach out as well: community requests directly influence what gets benchmarked next. The ranking covers three distinct workloads, but in this blog post we focus exclusively on the CRUD workload based on YCSB (Yahoo! Cloud Serving Benchmark), which measures a general-purpose mix of read and write operations and is the most commonly referenced workload in the ranking.

Until now, the ranking already included Apache Cassandra 4.0.0 benchmarked across five different scaling sizes, ranging from a single-node deployment (xsmall) all the way up to a 9-node cluster (xlarge). These v4 results have served as a stable, long-term baseline that other databases and configurations are frequently compared against.

Some months ago, we published a LinkedIn community poll to find out which open source database our community was most interested in seeing benchmarked next. The result was clear: Apache Cassandra 5 was the top choice. With the benchmarks now complete and validated, Apache Cassandra 5.0.6@AWS is officially included in the benchANT open database performance ranking.

Cassandra 5 — A Performance-Oriented Major Release

Apache Cassandra 5 has been generally available for roughly 10 months and represents the most significant major release of the project in years. Unlike incremental point releases, Cassandra 5 introduces deep, architectural changes that target the core performance characteristics of the database. The Cassandra community and key contributors have highlighted several areas of improvement, including enhanced concurrency handling, storage engine and compaction optimizations, more efficient internal coordination mechanisms as well as a substantially expanded set of configuration options.

These improvements are not merely theoretical. In their official announcements and technical blog posts, the Cassandra project team emphasized that v5 was designed from the ground up to deliver measurable throughput and latency gains, particularly under high-concurrency and large-scale workloads. The new configuration capabilities also mean that operators now have significantly more levers to pull when tuning Cassandra for specific use cases, something that directly affects real-world production performance.

For further technical details, refer to the official project resources:

Benchmark Methodology

All of the YCSB CRUD benchmarks follow the standard benchANT methodology applied to the ranking, ensuring full reproducibility and fair comparison across database versions. The workload used is the YCSB General Purpose profile, consisting of a balanced 50/50 mix of read and insert operations. Each test run processes 54 million operations against a pre-loaded dataset of 5 million records, providing a robust statistical basis for throughput and latency measurements.
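As a rough illustration, the workload parameters described above can be written down as a YCSB properties fragment. This is a minimal sketch, not the actual benchANT configuration: the numbers come from the text above, the property names are the standard YCSB core workload settings, and everything the text does not state (request distribution, thread counts, the workload class) is deliberately left out. The helper name to_properties is hypothetical. The authoritative parameters are the ones published on GitHub.

```python
# Sketch only: values are taken from the methodology text above,
# the authoritative benchANT parameters are published on GitHub.
WORKLOAD = {
    "recordcount": 5_000_000,       # pre-loaded dataset size
    "operationcount": 54_000_000,   # operations per test run
    "readproportion": 0.5,          # 50% reads
    "insertproportion": 0.5,        # 50% inserts
    "updateproportion": 0.0,        # set explicitly, YCSB defaults are non-zero
    "scanproportion": 0.0,
}


def to_properties(params):
    """Render the parameters as key=value lines (hypothetical helper)."""
    return "\n".join(f"{key}={value}" for key, value in params.items())


if __name__ == "__main__":
    print(to_properties(WORKLOAD))
```

The fragment is partial on purpose: a complete YCSB run needs further settings that are not described in this post.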

The infrastructure is hosted entirely on AWS EC2 in the eu-central-1 region, using GP2 storage volumes with 100 GB capacity and instance families from the m5 series (m5.large through m5.2xlarge). The benchmarks span five scaling sizes: xsmall (1 node), small (1 node, larger instance), medium (3 nodes), large (3 nodes, higher concurrency), and xlarge (9 nodes). All sizes are tested with identical workload parameters and infrastructure specifications across both Cassandra versions. A detailed description of the benchmark parameters is available on GitHub together with the raw results.

One notable difference between the two versions is the Java runtime: Cassandra 4.0.0 was executed on Java 8, while Cassandra 5.0.6 runs on Java 17. This is consistent with each version's supported runtime and reflects the upgrade path that production users would follow.

The benchmark methodology is fully documented and designed for reproducibility, ensuring that results can be validated independently and compared across database versions.

What We Benchmarked: Vanilla and Basic Tuning

For this initial inclusion in the ranking, we benchmarked Cassandra 5 in two configurations: (i) The vanilla (default) configuration uses the out-of-the-box settings that ship with Cassandra 5.0.6. This represents the baseline experience for anyone deploying the new version without additional tuning. (ii) The basic tuning configuration is based on well-known best practices from the Cassandra community. This includes commonly recommended adjustments to memory allocation, compaction strategies, and concurrency settings, i.e., changes that most experienced Cassandra operators would apply as a starting point in production deployments.

It is important to note that these tuning efforts represent only a first step. There is room for deeper, workload-specific optimization. Cassandra 5 has an expanded configuration surface, which means that advanced users and Cassandra experts can likely extract even more performance by fine-tuning parameters such as chunk cache sizing, memtable configurations, and flush/compaction concurrency. We deliberately kept the tuning conservative to establish a fair, broadly relevant baseline.

Have a better tuned Cassandra 5 configuration?

We'd love to benchmark it. Submit your configuration to info@benchant.com and we'll run it under identical conditions and include the results in the ranking, fully attributed and free of charge.

CRUD Ranking — Top 10@AWS

To understand where Cassandra 5 lands in the broader ecosystem, it is helpful to look at the overall CRUD ranking for AWS. The chart below shows the Top 10 database configurations across all scaling sizes, providing context for how Cassandra 5 (both vanilla and tuned) compares not only to Cassandra 4, but also to other popular databases in the ranking.

CRUD Ranking — Top 10 @ AWS

The updated ranking now includes both Cassandra 5 vanilla and Cassandra 5 tuned entries, allowing a direct visual comparison within the broader AWS CRUD context. This positioning helps illustrate the competitive landscape and shows how Cassandra 5 stacks up against the field.

Cassandra v4 vs. v5 Performance Comparison

One of the most common use cases for the benchANT ranking is comparing different versions of the same database, holding all variables constant (workload, infrastructure, scaling size) and changing only the software version. This makes any performance difference directly attributable to the version upgrade itself, providing a clean, data-driven basis for upgrade decisions.

The table below summarizes the consolidated throughput results (operations per second) across all five scaling sizes for Cassandra 4, Cassandra 5 vanilla, and Cassandra 5 tuned:

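| Scaling Size (Nodes) | Cassandra 4 | Cassandra 5 Vanilla | Cassandra 5 Tuned | v5 Tuned vs v4 |
|----------------------|-------------|---------------------|-------------------|----------------|
| xsmall (1)           | 12,312      | 15,451              | 16,387            | +33%           |
| small (1)            | 20,871      | 34,840              | 40,358            | +93%           |
| medium (3)           | 25,254      | 26,362              | 29,089            | +15%           |
| large (3)            | 62,163      | 61,508              | 68,317            | +10%           |
| xlarge (9)           | 139,171     | 150,323             | 171,527           | +23%           |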

The headline numbers tell a compelling story: with basic tuning, Cassandra 5 delivers throughput improvements at every scaling size, with gains ranging from +10% at the large scale up to +93% at the single-node small configuration. At the largest cluster size (xlarge, 9 nodes), the tuned configuration pushes throughput to over 171,000 ops/sec, a +23% improvement over the Cassandra 4 baseline. These are not marginal gains; they represent a meaningful generational performance uplift.
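The percentage column can be recomputed from the throughput figures. The following is a small sketch using the values from the table; the names THROUGHPUT and relative_gain_pct are only illustrative.

```python
# Throughput in ops/sec from the table: (Cassandra 4, Cassandra 5 tuned).
THROUGHPUT = {
    "xsmall": (12_312, 16_387),
    "small": (20_871, 40_358),
    "medium": (25_254, 29_089),
    "large": (62_163, 68_317),
    "xlarge": (139_171, 171_527),
}


def relative_gain_pct(baseline, candidate):
    """Relative throughput gain of candidate over baseline, in percent."""
    return (candidate - baseline) / baseline * 100


if __name__ == "__main__":
    for size, (v4, v5_tuned) in THROUGHPUT.items():
        print(f"{size}: {relative_gain_pct(v4, v5_tuned):+.0f}%")
```

Rounded to whole percentages, this reproduces the "v5 Tuned vs v4" column.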

Benchmark Results by Scaling Size

The following sections present the detailed benchmark results for each scaling size, comparing Cassandra 4 vanilla, Cassandra 5 vanilla, and Cassandra 5 tuned side by side. Rather than diving into every metric at a granular level, we focus on the high-level observations and the most notable trends.

xsmall (1 Node)

xsmall Benchmark Results

At the smallest scale, running on a single node, Cassandra 5 already shows a clear improvement. The vanilla Cassandra 5 configuration delivers approximately 25% higher throughput than Cassandra 4, and with basic tuning applied, the gap widens to a full +33%. Beyond raw throughput, the tail latency profile is also significantly more stable and predictable, suggesting that the internal coordination and memory management improvements in Cassandra 5 have a particularly strong effect in resource-constrained, single-node environments. This is encouraging for smaller deployments and development/testing scenarios.

small (1 Node, Bigger Instance, Higher Concurrency)

small Benchmark Results

The small configuration is still a single-node deployment, but runs on a larger instance (double the size of xsmall) and with double the client thread concurrency. At the small scaling size, the performance gains observed in xsmall carry forward consistently. The throughput improvements remain solid, and the tuned configuration begins to show a stronger and more visible impact on both throughput and the shape of the latency distribution. The gap between vanilla and tuned Cassandra 5 widens slightly at this scale, indicating that even basic tuning starts to pay meaningful dividends as the workload pressure increases.

medium (3 Nodes)

medium Benchmark Results

The medium configuration marks the transition to a distributed cluster setup with 3 nodes and a replication factor of 3. Client concurrency remains the same as in the small configuration, so the workload does not scale linearly with the number of nodes. Hence, throughput is not expected to achieve a 3x increase compared to small. The distributed setup introduces coordination overhead (replication, consistency guarantees), which is the expected trade-off for higher availability and fault tolerance.

That said, the medium configuration (3 nodes) is where the structural improvements introduced in Cassandra 5 become especially visible. Throughput rises from 25,254 ops/sec (Cassandra 4) to 29,089 ops/sec (Cassandra 5 tuned), a solid +15% gain. However, the most striking improvement at this scale is in tail latency. The Read P95 latency drops dramatically from 26 ms on Cassandra 4 to just 7 ms on Cassandra 5, for both the vanilla and the tuned configuration. That is a reduction of about 73%, representing a qualitative shift in latency behavior rather than a minor incremental improvement. For production systems where tail latency directly impacts user experience and SLA compliance, this kind of improvement is transformative.
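For readers who want to check the latency figure, the reduction follows directly from the two P95 values quoted above. A minimal sketch (the function name is illustrative):

```python
def reduction_pct(before_ms, after_ms):
    """Relative latency reduction in percent."""
    return (before_ms - after_ms) / before_ms * 100


if __name__ == "__main__":
    # Read P95 at medium scale: 26 ms (Cassandra 4) vs. 7 ms (Cassandra 5).
    print(f"{reduction_pct(26, 7):.1f}% lower Read P95")
```

With 26 ms and 7 ms as inputs, the reduction works out to roughly 73%.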

large (3 Nodes, Bigger Instances, Higher Concurrency)

large Benchmark Results

Under higher concurrency conditions (large), vanilla Cassandra 5 performs roughly on par with Cassandra 4 (61,508 vs. 62,163 ops/sec, a difference of about 1%). This is noteworthy because it shows that even without tuning, the new version does not meaningfully regress under heavier load, a common concern with major version upgrades. Once the tuned configuration is applied, however, Cassandra 5 clearly pulls ahead, exceeding the Cassandra 4 throughput baseline with a +10% improvement. The latency profile under load also tightens considerably, with fewer outlier spikes compared to Cassandra 4.

xlarge (9 Nodes)

xlarge Benchmark Results

At the largest cluster configuration (9 nodes), Cassandra 5 tuned delivers approximately 171,500 ops/sec, a +23% improvement over Cassandra 4, which delivered 139,171 ops/sec. Even the vanilla Cassandra 5 configuration at this scale achieves over 150,000 ops/sec, an 8% gain without any manual tuning. At this cluster size, scaling efficiency improves noticeably, and latency remains predictable and stable even under sustained high-throughput workloads. The combination of the internal improvements shipped with Cassandra 5 and basic configuration tuning produces results that scale well with additional nodes, suggesting that the performance ceiling for optimized configurations may be significantly higher still.
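One way to put a number on scaling efficiency is to compare the throughput ratio between large (3 nodes) and xlarge (9 nodes) with the node ratio. The sketch below is a rough indicator under a strong assumption: it treats node count as the only difference between the two sizes, although instance sizes and client concurrency may differ as well. The function name scaling_efficiency is illustrative.

```python
def scaling_efficiency(base_ops, base_nodes, scaled_ops, scaled_nodes):
    """Throughput speedup divided by node-count increase (1.0 = linear)."""
    return (scaled_ops / base_ops) / (scaled_nodes / base_nodes)


# (large ops/sec, xlarge ops/sec) from the consolidated results table.
RESULTS = {
    "Cassandra 4": (62_163, 139_171),
    "Cassandra 5 vanilla": (61_508, 150_323),
    "Cassandra 5 tuned": (68_317, 171_527),
}

if __name__ == "__main__":
    for name, (large, xlarge) in RESULTS.items():
        eff = scaling_efficiency(large, 3, xlarge, 9)
        print(f"{name}: {eff:.1%} of linear scaling")
```

Under this simplification, Cassandra 4 reaches about 75% of linear scaling from large to xlarge, while both Cassandra 5 configurations land above 80%.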

Key Takeaways

Looking across all five scaling sizes, several clear patterns emerge from the benchmark data:

Consistent throughput gains: With basic tuning, Cassandra 5 improves throughput at every scaling size compared to Cassandra 4. The vanilla configuration does so at every size except large, where it lands within about 1% of Cassandra 4. These are not isolated improvements at one specific cluster size; the gains are structural and pervasive across the board.

Gains amplify with cluster size: The largest absolute throughput improvement appears at the largest cluster, with the xlarge configuration gaining over 32,000 additional ops/sec for the tuned configuration compared to Cassandra 4.

Tail latency improvements outshine average latency: While average latency also improves, the most dramatic and impactful change is in tail latency (P95 and above). The P95 read latency for the medium configuration improves by about 73% (from 26 ms to 7 ms). This is a standout example, and similar trends are visible at other scales.

Basic tuning unlocks meaningful additional headroom: The gap between vanilla and tuned Cassandra 5 demonstrates that even conservative, best-practice tuning produces tangible benefits. This suggests that operators who invest time in deeper, workload-specific optimization can expect even larger gains.
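Two of these takeaways can be checked against the consolidated throughput numbers: the absolute gain of Cassandra 5 tuned over Cassandra 4, and the headroom that basic tuning adds on top of vanilla Cassandra 5. The following sketch uses the published throughput values; the variable and function names are illustrative.

```python
# ops/sec per scaling size: (Cassandra 4, Cassandra 5 vanilla, Cassandra 5 tuned)
RESULTS = {
    "xsmall": (12_312, 15_451, 16_387),
    "small": (20_871, 34_840, 40_358),
    "medium": (25_254, 26_362, 29_089),
    "large": (62_163, 61_508, 68_317),
    "xlarge": (139_171, 150_323, 171_527),
}


def absolute_gain(v4, tuned):
    """Additional ops/sec of Cassandra 5 tuned over Cassandra 4."""
    return tuned - v4


def tuning_headroom_pct(vanilla, tuned):
    """Extra throughput from basic tuning relative to vanilla, in percent."""
    return (tuned - vanilla) / vanilla * 100


if __name__ == "__main__":
    for size, (v4, vanilla, tuned) in RESULTS.items():
        print(
            f"{size}: +{absolute_gain(v4, tuned):,} ops/sec vs v4, "
            f"tuning adds {tuning_headroom_pct(vanilla, tuned):.1f}% over vanilla"
        )
```

On these numbers, basic tuning adds between roughly 6% (xsmall) and 16% (small) on top of the vanilla configuration, and the absolute gain over Cassandra 4 is largest at xlarge.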

Taken together, these results point to a structural generational improvement in Cassandra 5, rather than a marginal version bump. For teams already running Cassandra 4 in production, these numbers provide a strong, data-backed case for planning a version upgrade.

Community Call to Action: Submit Your Tuned Configuration

For this release, we benchmarked Cassandra 5 in two configurations: the default (vanilla) setup and a basic tuned configuration following community best practices. As we noted throughout this post, there is significant room for deeper optimization and we believe the Cassandra community is best positioned to push these results further.

If you are a Cassandra expert, DBA, or performance engineer with a tuned Cassandra 5 configuration that you believe outperforms our results under the same YCSB workload conditions, we want to hear from you.

Submit your tuned Cassandra 5 configuration to benchANT.

We will benchmark it under identical conditions and publish the results in our open database performance ranking, with full attribution and completely free of charge. Help the community find the optimal Cassandra 5 configuration.

This is an open invitation to the entire Cassandra community. Whether you have fine-tuned a specific compaction strategy, optimized JVM garbage collection parameters, or discovered a combination of settings that works exceptionally well for mixed read/write workloads, your contribution can help others make better, more informed decisions about their Cassandra deployments.

Reach out to us directly or submit your configuration through the benchANT platform. Let's find out how fast Cassandra 5 can really go.