Database Benchmark Suites Update 2026-01
In 2024, we published our previous compendium on database benchmarking suites, a structured overview of benchmark suites across workload domains. It categorizes established suites and describes their origin, scope, and methodological characteristics.
Since then, the benchmarking ecosystem has evolved. New benchmark suites have emerged to address distributed SQL systems, hybrid transactional/analytical workloads, real-time ingestion patterns, and globally distributed NoSQL deployments.
This article summarizes benchmark suites that became relevant over the last two years (2025–2026). Like the original article, it focuses on their origin, technical scope, workload characteristics, and applicability. The extensions covered here have also been incorporated into the original article, which serves as a continuously growing benchmark suite compendium and is updated as the benchmarking ecosystem evolves.
Table of Contents
- Introduction to Database Benchmarking Suites
- OLTP Database Benchmark Suites
- OLAP Database Benchmark Suites
- HTAP Database Benchmarking Suites
- NoSQL Database Benchmark Suites
- Time-Series Database Benchmark Suites
- Getting Involved
Introduction to Database Benchmarking Suites
Database benchmark suites provide structured methodologies to evaluate the performance characteristics of database management systems (DBMS). A benchmark suite typically defines:
- A data schema representing a specific workload class
- Data generation rules and scaling parameters
- A controlled mix of operations (reads, writes, analytical queries, etc.)
- Concurrency models
- Execution and measurement procedures
Most benchmark suites rely on synthetic datasets designed to approximate real-world usage patterns. Some frameworks additionally support replaying trace-based workloads derived from production systems.
During execution, benchmark frameworks capture performance indicators such as:
- transaction throughput
- query latency (including percentile distributions)
- concurrency behavior
- resource utilization
- scalability trends
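As a concrete illustration of these indicators, the sketch below (plain Python over synthetic timings, not tied to any specific framework) derives throughput and tail-latency percentiles from raw per-operation measurements:

```python
import statistics

def summarize(latencies_ms, duration_s):
    """Derive basic benchmark indicators from raw per-operation latencies."""
    ordered = sorted(latencies_ms)

    def pct(p):
        # Nearest-rank percentile using integer arithmetic to avoid float drift
        return ordered[min(len(ordered) - 1, p * len(ordered) // 100)]

    return {
        "throughput_ops_s": len(ordered) / duration_s,
        "mean_ms": statistics.mean(ordered),
        "p95_ms": pct(95),
        "p99_ms": pct(99),
    }

# 1000 synthetic latencies (1..1000 ms) collected over a 10-second run
stats = summarize(list(range(1, 1001)), duration_s=10.0)
```

The percentile distribution matters because averages hide exactly the tail behavior (p99, p99.9) that dominates user-perceived latency.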
These measurements enable systematic evaluation of database systems under controlled conditions. For a detailed discussion of benchmarking methodology and execution best practices, refer to our guide on database benchmarking.
Beyond raw performance numbers, benchmarking suites serve multiple purposes:
- Performance Validation: They verify whether a database system can sustain expected workloads under defined constraints.
- Comparative Evaluation: They allow side-by-side assessment of alternative systems using consistent workload definitions, as demonstrated in our database ranking.
- Scalability Analysis: They help quantify how systems behave as data volume and concurrency increase.
- Configuration Optimization: They reveal bottlenecks and resource contention, supporting tuning and infrastructure planning.
- Strategic Architecture Decisions: They provide empirical data to guide long-term infrastructure and technology choices.
Because database architectures continue to evolve, particularly toward distributed, hybrid, and real-time systems, benchmark suites need to evolve accordingly.
In the following sections, we categorize benchmark suites according to the workload types they generate. The origin of some projects may not always be formally documented and the referenced repositories may represent one of multiple available implementations.
OLTP Database Benchmark Suites
OLTP (Online Transaction Processing) workloads represent the transactional backbone of many operational systems. These workloads power applications that require consistent, low-latency data modifications under high concurrency, such as e-commerce platforms, financial systems, reservation engines, and SaaS backends.
Unlike analytical systems, OLTP environments prioritize predictable response times, correctness under concurrent access, and strict transactional guarantees over large-scale scan efficiency.
Typical characteristics of OLTP workloads include:
- High transaction rates with short execution paths
- Low-latency response requirements
- Frequent inserts, updates, and deletes
- Small, index-driven data access patterns
- Strong consistency and isolation guarantees
- High levels of concurrent client activity
In distributed or cloud-native architectures, additional factors such as replication topology, coordination overhead, and cross-node latency significantly influence observed performance.
Key Metrics in OLTP Benchmarking
Transaction Throughput (TPS)
Measures the number of successfully committed transactions per unit of time.
Response Time / Latency Distribution
Evaluates average latency as well as percentile metrics (e.g., p95, p99, p99.9) to capture tail behavior.
Concurrency Scalability
Assesses how performance evolves as the number of concurrent clients increases.
Consistency & Replication Impact
Measures the cost of synchronous replication, quorum writes/reads, and cross-node commit protocols.
Resource Efficiency
Evaluates CPU, memory, and I/O utilization under sustained transactional load.
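These metrics can be prototyped with a minimal concurrent load generator. The sketch below uses Python threads, with a short sleep standing in for real transaction work (an illustrative driver skeleton, not a production tool):

```python
import threading
import time
import random

def run_client(stop, latencies, lock):
    """One client issuing short 'transactions' until told to stop."""
    while not stop.is_set():
        start = time.perf_counter()
        time.sleep(random.uniform(0.0005, 0.002))  # placeholder for real work
        elapsed_ms = (time.perf_counter() - start) * 1000
        with lock:
            latencies.append(elapsed_ms)

def measure_tps(clients=8, duration_s=1.0):
    """Run N concurrent clients and report throughput plus p99 latency."""
    stop, lock, latencies = threading.Event(), threading.Lock(), []
    threads = [threading.Thread(target=run_client, args=(stop, latencies, lock))
               for _ in range(clients)]
    for t in threads:
        t.start()
    time.sleep(duration_s)
    stop.set()
    for t in threads:
        t.join()
    latencies.sort()
    p99 = latencies[int(0.99 * len(latencies))]
    return len(latencies) / duration_s, p99

tps, p99_ms = measure_tps()
```

A real OLTP driver would replace the sleep with database calls and add warm-up, ramp-up, and coordinated measurement windows.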
pgdistbench
Origin: Xata.io – GitHub Repository
Purpose: pgdistbench is a distributed PostgreSQL benchmarking tool designed to run standardized benchmarks at scale in Kubernetes environments. It provides orchestration and repeatability for benchmark execution across one or more PostgreSQL instances, including distributed load generation via multiple benchmark driver replicas.
Real-world Use-Cases: Suitable for benchmarking Kubernetes-native PostgreSQL deployments (e.g., CloudNativePG), pre-existing PostgreSQL clusters, and Postgres-based platforms where the benchmark setup itself (deployment orchestration, runner scaling, and cluster-level stress behavior) is part of the evaluation.
Key Features: Supports multiple benchmark workloads including TPC-C (OLTP), TPC-H (OLAP), and CH-BenCHmark (mixed OLTP/OLAP). Uses a two-component architecture: a bench driver (HTTP server that executes the benchmark against the system under test) and a k8s runner (CLI that deploys and configures systems under test, coordinates multiple bench driver replicas, and manages benchmark lifecycle phases such as prepare/run/cleanup). Supports scenario configuration via YAML or KCL (KusionStack Configuration Language) to enable validated and reusable benchmark definitions. Includes Kubernetes-focused stress modes that can exercise cluster behavior with dynamic PostgreSQL instance lifecycle management.
benchANT Integration: TBD
Swingbench
Origin: Developed and maintained by Dominic Giles – Project Page
Purpose: Swingbench is a load generator and set of utilities for benchmarking database systems by generating transactional workloads and capturing response-time and throughput behavior. It supports both GUI and command-line execution and includes utilities to chart transaction rates and response-time distributions.
Real-world Use-Cases: Commonly used to validate and stress-test database features and operational scenarios such as Oracle Real Application Clusters (RAC), standby databases, online maintenance operations (e.g., table rebuilds), and backup/recovery workflows. It is also used for migration validation, performance regression testing, and hardware sizing under controlled load.
Key Features: Ships with multiple built-in benchmarks including OrderEntry, SalesHistory, StressTest, JSON, MovieStream, and “TPC-like” workloads (TPC-DS Like, TPC-H Like). OrderEntry (based on Oracle’s OE schema) is designed for continuous execution and introduces heavy contention on a small set of tables to stress interconnect and memory; it includes both JDBC and PL/SQL variants. SalesHistory (based on Oracle’s SH schema) targets complex read-only queries on large tables and supports scaling (e.g., 1GB to 1TB). The framework is Java-based (cross-platform) and provides an API for implementing custom benchmark workloads.
benchANT Integration: TBD
OLAP Database Benchmark Suites
OLAP (Online Analytical Processing) workloads focus on complex analytical query execution over large datasets. These systems are typically optimized for read-heavy operations, large scans, aggregations, joins, and multi-dimensional analysis that support reporting, business intelligence, and decision-support systems.
Unlike transactional systems, OLAP environments prioritize query execution efficiency over write latency and often rely on columnar storage, vectorized execution, distributed query planners, or massively parallel processing (MPP) architectures.
Typical characteristics of OLAP workloads include:
- Large table scans and aggregation-heavy queries
- Multi-table joins across wide schemas
- Complex filtering and grouping operations
- Batch-oriented or interactive analytical queries
- Concurrency across multiple analytical users
- Scale-out execution across distributed nodes
Because analytical systems frequently operate on multi-gigabyte to terabyte-scale datasets, benchmarking must evaluate both raw execution performance and scaling behavior under increasing data volume and concurrency.
Key Metrics in OLAP Benchmarking
Query Latency
Measures execution time for analytical queries, including average and percentile distributions.
Throughput Under Concurrency
Evaluates how many analytical queries can be processed simultaneously without significant performance degradation.
Scalability with Data Volume
Assesses how performance evolves as dataset size increases (e.g., scale factors).
Scan Efficiency & I/O Behavior
Measures performance during large sequential scans and data-intensive operations.
Join & Aggregation Performance
Evaluates optimizer effectiveness and execution efficiency for multi-join, aggregation-heavy workloads.
Resource Utilization & Parallel Efficiency
Analyzes CPU, memory, and network usage in distributed analytical environments.
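The basic measurement loop is the same regardless of engine: load data at a given scale factor, run the query set, record latencies. The toy sketch below uses sqlite3 purely as a stand-in engine (a real OLAP benchmark would target a columnar or MPP system; schema and query are invented for illustration):

```python
import sqlite3
import time
import random

# Build a small synthetic fact table (sqlite3 as a stand-in engine)
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE sales (region TEXT, amount REAL)")
rows = [(random.choice(["EU", "US", "APAC"]), random.random() * 100)
        for _ in range(100_000)]
con.executemany("INSERT INTO sales VALUES (?, ?)", rows)

# Time an aggregation-heavy query, the core OLAP access pattern
query = "SELECT region, COUNT(*), AVG(amount) FROM sales GROUP BY region"
start = time.perf_counter()
result = con.execute(query).fetchall()
latency_ms = (time.perf_counter() - start) * 1000
```

Scaling the row count (the "scale factor") and re-running the same query set is how benchmarks such as TPC-H characterize behavior under growing data volume.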
analytics_benchmark
Origin: Community-driven project – GitHub Repository
Purpose: analytics_benchmark provides a standardized framework for evaluating Online Analytical Processing (OLAP) workloads across multiple database engines. It is designed to enable fair and reproducible performance comparisons using representative analytical queries and datasets.
Real-world Use-Cases: Suitable for evaluating open-source analytical database systems that can be deployed on-premises or containerized using Docker. Particularly relevant for organizations with data sovereignty requirements, strict compliance constraints, cost predictability considerations, or a preference for infrastructure-agnostic and self-managed analytical platforms.
Key Features: Focuses on open-source, Docker-deployable analytical databases. Enables reproducible benchmark execution across different engines using consistent query workloads and datasets. Designed for portability across bare metal, virtual machines, Kubernetes clusters, and private cloud environments. Emphasizes standardized OLAP query patterns to support comparable analytical performance evaluation.
benchANT Integration: TBD
RTABench
Origin: Timescale – GitHub Repository
Purpose: RTABench (Real-Time Analytics Benchmark) is designed to evaluate database systems under real-time analytics workloads inside applications. Unlike traditional OLAP benchmarks that rely on a single denormalized table and large full-table scans, RTABench focuses on normalized schemas, selective queries, and incremental materialized views that reflect application-driven analytics patterns.
Real-world Use-Cases: Suitable for evaluating databases used for real-time analytics in application backends, such as e-commerce platforms tracking orders and shipments, systems requiring selective object-level queries, or environments combining high ingest rates with low-latency analytical queries.
Key Features: Built on the ClickBench framework but introduces a new normalized dataset and query set modeling customers, products, orders, order items, and order events. Includes ~171 million events along with realistic entity volumes (e.g., customers, products, and orders). Defines 33 queries across four categories: raw event queries (aggregation over time), selective filtering (object- and time-based lookups), multi-table joins, and pre-aggregated queries using incremental materialized views. Evaluates systems across general-purpose, real-time analytics, and batch analytics database categories to analyze trade-offs between ingest performance, join efficiency, and query latency in mixed real-time scenarios.
benchANT Integration: TBD
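The incremental materialized views exercised by RTABench's pre-aggregated query category follow a simple principle: maintain aggregates as events arrive instead of rescanning history on every query. A minimal sketch (hypothetical event shape, not RTABench's actual schema):

```python
from collections import defaultdict

class IncrementalView:
    """Maintain per-product order counts and revenue incrementally,
    rather than rescanning all events for each analytical query."""

    def __init__(self):
        self.counts = defaultdict(int)
        self.revenue = defaultdict(float)

    def apply(self, event):
        # Called once per ingested event; O(1) maintenance cost
        self.counts[event["product"]] += 1
        self.revenue[event["product"]] += event["amount"]

    def query(self, product):
        # O(1) lookup instead of a full scan over the event history
        return self.counts[product], self.revenue[product]

view = IncrementalView()
for e in [{"product": "a", "amount": 10.0},
          {"product": "b", "amount": 5.0},
          {"product": "a", "amount": 2.5}]:
    view.apply(e)
```

The trade-off a benchmark like RTABench measures is exactly this: maintenance cost on the ingest path versus latency savings on the query path.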
SQLStorm
Origin: Academic project (Technical University of Munich) – GitHub Repository | Paper
Purpose: SQLStorm is an LLM-generated analytical benchmark suite designed to evaluate SQL query engines using automatically generated query workloads. In version 1.0, queries are generated using GPT-4o-mini across multiple established datasets, introducing syntactic and structural diversity beyond manually curated benchmark sets.
Real-world Use-Cases: Suitable for evaluating parser robustness, query optimizer behavior, execution engine stability, and operator coverage in analytical database systems. Particularly relevant for assessing how systems handle diverse, potentially AI-generated SQL queries that may reflect emerging real-world usage patterns.
Key Features: SQLStorm v1.0 generates queries across datasets including StackOverflow, TPC-H, TPC-DS, and JOB. The benchmark follows a structured pipeline: (1) LLM-based query generation using dataset-aware prompts, (2) automated rewriting and normalization (e.g., deduplication, syntax cleanup, date normalization), (3) cross-system compatibility validation by testing parseability on multiple systems (e.g., PostgreSQL, Umbra, DuckDB), and (4) query selection based on defined criteria such as parseability and executability on baseline dataset sizes. Execution is typically orchestrated using the OLAPBench framework, which automates dataset preparation and benchmark execution across multiple systems. The benchmark enforces execution limits (e.g., per-query timeouts, global runtime caps) and includes tooling for analyzing query complexity, operator diversity, and structural characteristics.
benchANT Integration: TBD
Benchto
Origin: Trino Project – GitHub Repository
Purpose: Benchto is a macro-benchmarking framework designed for evaluating distributed SQL engines in clustered environments. It provides structured benchmark definitions and emphasizes repeatable execution, visibility into cluster behavior, and persistent storage of benchmark results.
Real-world Use-Cases: Primarily used for benchmarking distributed query engines such as Trino (and historically Hadoop SQL engines). Suitable for performance regression testing, comparative evaluation across cluster configurations, and detailed performance diagnostics in multi-node deployments.
Key Features: Benchto consists of two main components: benchto-service, which stores benchmark results in a relational database (PostgreSQL), exposes a REST API, and provides a web interface for result visualization; and benchto-driver, a standalone Java application that loads benchmark descriptors and executes them against the system under test. The driver can collect cluster-level metrics (CPU, memory, network usage) and integrates with Graphite and Grafana for monitoring and annotation. It supports advanced query profiling through Java Flight Recorder (JFR), async-profiler, and Linux perf, enabling low-level execution analysis. Designed for clustered environments, Benchto emphasizes reproducibility, observability, and structured performance evaluation.
benchANT Integration: TBD
Firebolt Benchmarks
Origin: Firebolt – GitHub Repository
Purpose: The Firebolt Benchmarks repository provides the FireScale benchmark along with benchmarking clients and published result sets. It enables comparative evaluation of cloud data warehouse systems using a defined analytical workload and reproducible execution tooling.
Real-world Use-Cases: Suitable for evaluating cloud data warehouse performance across vendors such as Firebolt, Snowflake, Redshift, BigQuery, and others. Particularly relevant for benchmarking analytical query latency, concurrency behavior, and workload scaling under controlled execution patterns.
Key Features: Includes the FireScale benchmark workload, DDL definitions, and query sets for multiple vendors. Provides two benchmark clients: a Python client for sequential “power” runs and low-throughput concurrency testing (<100 QPS), and a Node.js (Grafana K6) client for high-concurrency scenarios with hundreds or thousands of queries per second. Supports multi-vendor credential configuration via a unified credentials file and programmatic data ingestion using setup scripts. Enables both reproducible benchmark execution and publication of comparative result sets.
benchANT Integration: TBD
RedBench
Origin: UTN Data Systems – GitHub Repository
Purpose: RedBench is a benchmark suite of analytical SQL workloads intended to evaluate workload-driven optimizations in analytical database systems. It focuses on recurring query patterns observed in production, which are not well captured by classic analytical benchmarks that emphasize per-query diversity.
Real-world Use-Cases: Suitable for assessing techniques such as workload-aware query optimization, plan caching, workload-informed or learned cardinality estimation, join-ordering strategies, and other optimizations that exploit repetition across query workloads rather than treating each query as independent.
Key Features: Provides 30 analytical SQL workloads that mimic production query behavior from Redset (a dataset of query metadata published by Amazon Redshift) by sampling and adapting queries derived from the Join Ordering Benchmark (JOB) and the Cardinality Estimation Benchmark (CEB) on the IMDb schema. Workloads are constructed by clustering Redset users into 10 groups based on query repetitiveness, sampling representative users per group, and reverse-engineering workloads by matching join counts and scanned table sets. The resulting workloads are available under
workloads/, with generation details documented in DETAILS.md and summary plots in figures/.
benchANT Integration: TBD
HTAP Database Benchmarking Suites
HTAP (Hybrid Transactional and Analytical Processing) workloads combine transactional (OLTP) and analytical (OLAP) operations within the same database system. Unlike traditional architectures that separate operational and analytical systems, HTAP systems execute mixed workloads over a shared dataset and often within a unified storage and execution engine.
HTAP benchmarks evaluate how databases behave when short, latency-sensitive transactions and longer-running analytical queries are executed concurrently. The primary objective is not only to measure raw performance, but to quantify workload interference, resource contention, data freshness, and overall system stability under sustained mixed load.
Typical characteristics of HTAP workloads include:
- Concurrent execution of transactional and analytical queries
- Shared storage or execution layers across workload types
- Continuous data ingestion with near real-time analytical visibility
- Mixed read/write patterns with varying query complexity
- Resource contention across CPU, memory, network, and I/O
- Freshness requirements for analytical queries over recently updated data
Because HTAP systems remove traditional ETL boundaries, benchmarking must explicitly measure trade-offs between transactional throughput and analytical performance.
Key Metrics in HTAP Benchmarking
Transactional Throughput (TP)
Measures committed transactions per unit of time while analytical queries are running.
Analytical Query Latency (AP)
Evaluates execution time of analytical queries under concurrent transactional load.
Workload Interference Impact
Quantifies performance degradation caused by concurrent workload classes.
Data Freshness / Visibility Delay
Measures how quickly analytical queries reflect newly committed transactional updates.
Resource Isolation Efficiency
Assesses how effectively CPU, memory, and I/O resources are balanced across workload types.
Scalability Under Mixed Load
Examines performance behavior as data volume, concurrency, and workload intensity increase simultaneously.
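Data freshness can be made concrete with a toy replication model: committed rows become visible to analytical reads only after an apply delay, and the benchmark measures how long visibility takes. This is a simulation sketch under assumed parameters (the 50 ms apply delay is arbitrary), not any specific system's behavior:

```python
import time
from collections import deque

class LaggedReplica:
    """Toy model: committed rows become visible to analytical
    queries only after a fixed apply delay."""

    def __init__(self, apply_delay_s):
        self.delay = apply_delay_s
        self.pending = deque()   # (visible_at, row)
        self.visible = []

    def commit(self, row):
        self.pending.append((time.monotonic() + self.delay, row))

    def analytical_read(self):
        # Promote rows whose apply delay has elapsed, then serve the rest
        now = time.monotonic()
        while self.pending and self.pending[0][0] <= now:
            self.visible.append(self.pending.popleft()[1])
        return list(self.visible)

replica = LaggedReplica(apply_delay_s=0.05)
replica.commit({"id": 1})
t0 = time.monotonic()
while {"id": 1} not in replica.analytical_read():
    time.sleep(0.005)
freshness_s = time.monotonic() - t0   # observed visibility delay
```

HTAP benchmarks such as HyBench report this visibility delay as a first-class metric alongside TP throughput and AP latency.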
Web3Bench
Origin: Community project – GitHub Repository
Purpose: Web3Bench is a benchmark suite designed for evaluating database systems in Web3 and decentralized application scenarios. It provides a unified workload model that combines transactional and analytical tasks, reflecting hybrid access patterns commonly observed in blockchain-based systems.
Real-world Use-Cases: Suitable for benchmarking databases used in decentralized applications, blockchain analytics platforms, and systems that must handle both high-frequency transactional updates (e.g., token transfers) and analytical queries over blockchain state.
Key Features: Based on a simplified Ethereum-inspired data model including blocks, transactions, smart contracts, and token transfers. Derived from a 20GB real Ethereum dataset and extended through a proprietary data generator to support scalable dataset sizes. Workloads are categorized by latency requirements (real-time, online serving, and batch processing) and include explicit latency limits for real-time and online queries, as well as throughput measurement for batch tasks. A specialized benchmark driver measures both latency and throughput across mixed workload scenarios.
benchANT Integration: TBD
HyBench-2024
Origin: Academic project (Technical University of Munich) – GitHub Repository | Paper
Purpose: HyBench is a benchmark suite designed to evaluate HTAP (Hybrid Transactional and Analytical Processing) database systems under tightly coupled mixed workloads. It models concurrent transactional (TP) and analytical (AP) execution and explicitly measures workload interference, analytical freshness, and hybrid performance trade-offs.
Real-world Use-Cases: Suitable for evaluating operational analytics platforms where transactional updates and analytical queries must operate on the same dataset with minimal delay. Representative scenarios include financial systems, e-commerce platforms with live dashboards, and data platforms requiring continuous ingestion with near real-time analytical visibility.
Key Features: HyBench defines distinct workload components for TP, AP, and XP (cross-phase evaluation). The benchmark measures:
- Transactional throughput (TP)
- Analytical performance (AP)
- Mixed-workload behavior under concurrent execution
- Freshness metrics, quantifying how quickly analytical queries reflect recent transactional updates
It supports configurable scale factors (e.g., 1x, 10x, 100x), adjustable TP/AP concurrency levels, workload composition ratios, and independent JDBC configurations for transactional and analytical clients. The execution workflow includes data generation, loading, index building, and coordinated mixed-workload runs. Implemented in Java (Java 17+, Maven-based), HyBench emphasizes reproducible configuration through structured parameter files and clearly documented execution procedures to ensure methodological consistency across systems.
benchANT Integration: TBD
NoSQL Database Benchmark Suites
NoSQL benchmarking focuses on distributed and horizontally scalable database systems such as document stores, key-value stores, wide-column databases, and search engines. Unlike traditional relational systems, these architectures are often optimized for elastic scaling, tunable consistency, geo-distribution, and workload-specific data models.
NoSQL workloads frequently operate under:
- Multi-region deployments
- Asynchronous or tunable replication
- Partition-tolerant architectures
- Eventual or configurable consistency models
- High write throughput and low-latency read requirements
- Heterogeneous access patterns (point lookups, range scans, secondary index queries, search queries)
Because these systems trade strict relational guarantees for distribution and scale, benchmarking must account not only for raw performance but also for replication topology, consistency configuration, and failure behavior.
Key Metrics in NoSQL Benchmarking
Request Latency (including tail latency)
Measures average and percentile latency (e.g., p95, p99, p99.9) under different consistency levels and replication settings.
Throughput
Evaluates sustained operations per second across varying client concurrency levels and cluster sizes.
Replication & Consistency Impact
Quantifies the performance cost of synchronous replication, quorum reads/writes, and cross-region coordination.
Partition Tolerance & Failure Behavior
Assesses system behavior during node failures, network partitions, and rebalancing operations.
Cross-Region Performance
Measures latency and throughput when clients operate across geographic regions.
Scalability Characteristics
Examines linearity of performance as nodes are added and datasets grow.
These dimensions are critical for evaluating globally distributed SaaS platforms, content delivery systems, financial transaction services, and other applications that prioritize availability and horizontal scale.
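Quorum behavior in particular lends itself to simple modeling: a quorum-k operation completes when the k-th fastest replica responds, i.e. the k-th order statistic of the per-replica latencies. The sketch below uses invented latency distributions to show why quorum latency inherits replica tail behavior:

```python
import random

def quorum_latency(replica_latencies_ms, quorum):
    """A quorum operation completes once the quorum-th fastest
    replica has responded: the k-th order statistic."""
    return sorted(replica_latencies_ms)[quorum - 1]

def simulate(n_replicas=3, quorum=2, trials=10_000):
    """Monte Carlo estimate of median and p99 quorum latency."""
    samples = []
    for _ in range(trials):
        # Per-replica latency: 1 ms base plus an exponential tail (assumed)
        lats = [1.0 + random.expovariate(1 / 2.0) for _ in range(n_replicas)]
        samples.append(quorum_latency(lats, quorum))
    samples.sort()
    return samples[len(samples) // 2], samples[int(0.99 * len(samples))]

median_ms, p99_ms = simulate()
```

Raising the quorum (e.g. from 2-of-3 to 3-of-3) shifts the operation onto the slowest replica, which is why benchmarks vary consistency settings explicitly.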
Tectonic
Origin: Academic research project – “Tectonic: Bridging Synthetic and Real-World Workloads for Key-Value Benchmarking,” TPCTC 2025. PDF
Purpose: Tectonic is a key-value workload generator designed to model workload behavior that more closely resembles real production systems. Unlike traditional benchmarks that rely on static and synthetic access patterns, Tectonic captures temporal changes, structural characteristics, and evolving access distributions over time.
Real-world Use-Cases: Suitable for evaluating distributed key-value stores and NoSQL systems in environments where workload characteristics shift dynamically — such as e-commerce platforms, streaming services, or globally distributed systems with time-dependent traffic patterns.
Key Features: Supports a broad range of key-value operations including inserts, updates, read-modify-writes, point queries, range queries, and deletions. Workloads are defined declaratively via JSON configuration files that allow modeling of key distributions, data sortedness, and phased workload transitions. Tectonic enables dynamic changes in workload composition during execution, making it possible to simulate realistic production behavior rather than fixed synthetic patterns. The implementation emphasizes performance efficiency and low overhead, enabling high-throughput workload generation for large-scale experiments.
benchANT Integration: TBD
Global NoSQL Benchmark
Origin: Community project – GitHub Repository
Purpose: Designed to benchmark globally distributed NoSQL databases under realistic multi-region workloads.
Real-world Use-Cases: Applicable for SaaS platforms and globally deployed applications requiring low-latency access across regions.
Key Features: Evaluates cross-region latency, replication overhead, and scalability under geographically distributed deployments.
benchANT Integration: TBD
Latte
Origin: Independent open-source project – GitHub Repository
Purpose: Latte is a high-performance benchmarking tool designed primarily for Apache Cassandra–compatible database systems. It aims to provide both execution efficiency and maximum workload flexibility by combining a low-level runtime implementation with a programmable workload definition model.
Real-world Use-Cases: Suitable for benchmarking Apache Cassandra (3.x, 4.x), DataStax Enterprise, and ScyllaDB clusters. Particularly relevant for exploratory benchmarking, performance regression testing, workload experimentation, and evaluating large clusters using a minimal number of client nodes.
Key Features: Latte is implemented in Rust and uses the native Cassandra driver from Scylla. It employs a fully asynchronous, thread-per-core execution engine capable of issuing thousands of requests per second per thread. Compared to Java-based benchmarking tools, Latte emphasizes:
- High CPU efficiency on the client side
- Low memory footprint
- Minimal operating system overhead
- No garbage collection pauses
- No client warm-up phase required
- No coordinated omission in latency measurement
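The last point deserves a brief illustration: coordinated omission occurs when latency is measured from the actual send time rather than the intended schedule, so a stalled client silently hides queueing delay. A minimal sketch of the corrected measurement (synthetic numbers, not Latte's implementation):

```python
def corrected_latencies(intended_interval_ms, service_times_ms):
    """Measure latency from each request's *intended* start time.
    When one slow request delays the next send, the naive measurement
    hides the queueing delay; the corrected one includes it."""
    naive, corrected = [], []
    clock = 0.0
    for i, svc in enumerate(service_times_ms):
        intended = i * intended_interval_ms   # fixed-rate schedule
        send = max(clock, intended)           # client can't send while blocked
        finish = send + svc
        naive.append(finish - send)           # service time only
        corrected.append(finish - intended)   # includes queueing delay
        clock = finish
    return naive, corrected

# One 100 ms stall against a 10 ms target interval
naive, corrected = corrected_latencies(10.0, [1.0, 100.0, 1.0, 1.0])
```

The naive view reports a single slow request; the corrected view shows that the requests queued behind it also experienced high latency, which is what a user would observe.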
Instead of relying solely on static configuration files, Latte embeds a full scripting language (Rune) for workload definition. This enables:
- Custom query logic
- Conditional and iterative workload generation
- Programmable data generation
- Complex multi-query scenarios
- Workload parameterization
Additional features include:
- Asynchronous and prepared queries
- Rate and concurrency limiting
- Accurate throughput and latency measurement with statistical error margins
- JSON report export
- Side-by-side comparison of benchmark runs
- Statistical significance analysis with autocorrelation correction
Current limitations include early-stage maturity and a relatively small built-in data generation library.
benchANT Integration: TBD
OpenSearch Benchmark
Origin: OpenSearch Project – GitHub Repository
Purpose: OpenSearch Benchmark is a performance evaluation framework designed specifically for OpenSearch clusters. It enables structured and reproducible benchmarking of indexing and search workloads, supporting both comparative performance analysis and regression testing across OpenSearch versions.
Real-world Use-Cases: Suitable for evaluating search and log analytics deployments, observability platforms, security analytics systems, and document search engines built on OpenSearch. It is commonly used for performance regression testing, capacity planning, infrastructure validation, and tuning cluster configurations under realistic query and indexing workloads.
Key Features: OpenSearch Benchmark supports:
- Automated provisioning and teardown of OpenSearch clusters
- Execution of predefined and custom benchmark workloads
- Management of datasets and benchmark specifications across OpenSearch versions
- Performance result recording and comparison
- Integration of telemetry devices to capture low-level system metrics
- Reproducible benchmarking workflows
The framework allows users to define custom workloads tailored to specific indexing patterns, query mixes, and concurrency levels. Telemetry components can collect system-level metrics to help identify bottlenecks and performance regressions. Emphasis is placed on methodological reproducibility to ensure consistent and comparable results across benchmark runs.
benchANT Integration: TBD
Time-Series Database Benchmark Suites
Time-series benchmarking focuses on database systems optimized for append-heavy, time-ordered data such as telemetry streams, financial tick data, IoT sensor measurements, observability metrics, and event logs. Unlike generic OLTP or OLAP workloads, time-series systems must efficiently handle continuous ingestion while simultaneously serving time-windowed analytical queries.
These workloads are typically characterized by:
- High write rates with append-only or near-append-only patterns
- Time-based partitioning or chunking strategies
- Frequent range queries over recent and historical intervals
- Aggregations over sliding windows (e.g., last 5 minutes, last 24 hours)
- Retention policies and downsampling mechanisms
- Compression strategies optimized for ordered timestamped data
In addition to raw performance, time-series benchmarks often evaluate how well systems balance ingest throughput, query responsiveness, storage efficiency, and retention management.
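A windowed aggregation, the canonical time-series query shape, can be sketched in a few lines. The example below computes tumbling-window averages over synthetic (timestamp, value) events; real engines implement the same idea with time-based partitioning and pre-aggregated rollups:

```python
from collections import defaultdict

def tumbling_window_avg(events, window_s):
    """Group (timestamp, value) events into fixed windows and average each.
    Window start = timestamp rounded down to a multiple of window_s."""
    buckets = defaultdict(list)
    for ts, value in events:
        buckets[int(ts // window_s) * window_s].append(value)
    return {start: sum(vals) / len(vals)
            for start, vals in sorted(buckets.items())}

# Four events spanning two one-minute windows
events = [(0.5, 10.0), (30.2, 20.0), (61.0, 40.0), (89.9, 60.0)]
per_minute = tumbling_window_avg(events, window_s=60)
```

Benchmarks then vary the window size, the queried interval, and the share of recent versus historical data to probe partitioning and rollup efficiency.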
Key Metrics in Time-Series Benchmarking
Ingest Throughput
Measures sustained write rates (events per second or rows per second) under continuous load.
Query Latency over Time Windows
Evaluates response times for range queries, windowed aggregations, and filtered lookups across different temporal intervals.
Retention & Downsampling Behavior
Assesses the efficiency of data retention policies, compaction, and materialized rollups.
Compression Efficiency
Quantifies storage footprint reduction for time-ordered datasets.
Concurrent Ingest & Query Performance
Measures how well the system maintains low-latency queries while ingestion is ongoing.
Scalability Across Time and Volume
Examines how performance evolves as dataset size and retention horizon increase.
nano
Origin: KX Systems – GitHub Repository
Purpose: nano is a low-level benchmark utility designed to measure raw CPU, memory, and storage I/O performance from the perspective of a kdb+ (q) process. Rather than simulating a full application workload, nano evaluates the fundamental system capabilities that underpin time-series and tick database performance.
Real-world Use-Cases: Suitable for validating infrastructure readiness before executing higher-level time-series or tick-database benchmarks. Commonly used in financial market data environments, telemetry systems, and high-frequency data platforms where kdb+ is deployed. It is particularly useful for diagnosing storage bottlenecks, verifying operating system configuration, and stress-testing shared or distributed storage systems.
Key Features: nano performs a series of system-level tests directly within kdb+/q, including:
- Sequential and random reads and writes
- Data aggregation operations (e.g., sum, sort)
- Memory mapping (mmap) performance
- File creation and ingestion
- Filesystem metadata operations (e.g., file open)
It can execute benchmarks using:
- A single kdb+ process
- Multiple worker processes with aggregated results
- Local storage or shared/distributed file systems
Throughput and latency are measured natively within the q runtime. The tool also supports testing with compressed data and multi-node client execution to evaluate read/write rates against shared storage targets.
Although nano does not simulate a multi-column tick database workload, it provides insight into the maximum achievable I/O and compute performance of kdb+ on a given system. It is particularly effective for identifying OS-level, filesystem, and storage configuration limitations prior to conducting higher-level benchmarking.
benchANT Integration: TBD
Getting Involved
This update complements our previous compendium on database benchmarking suites and extends it with benchmark suites that gained relevance in 2025–2026 across OLTP, OLAP, HTAP, NoSQL, and time-series workload domains.
The benchmarking ecosystem continues to evolve alongside database architectures. Distributed systems, hybrid workloads, real-time analytics, and global deployments introduce new methodological challenges that require updated and well-defined evaluation frameworks.
If you are aware of a benchmark suite that is missing from this overview — or if a workload domain deserves further classification — we encourage you to reach out and contribute to the discussion.
Please send suggestions or references to info@benchant.com
We welcome benchmark suites with transparent methodology, reproducible workloads, and clearly defined workload models.
For discussions regarding structured, comparable, and independent benchmarking approaches, feel free to get in touch.