Back to blog
C++23 C++ Order Book Performance

Using std::flat_map for Order Book Price Levels in C++23

std::flat_map landed in C++23 and it's a significant win for price-keyed order book structures. Here's a practical comparison against std::map and a hand-rolled sorted vector approach.

One of the more useful additions in C++23 is std::flat_map - a sorted associative container backed by a contiguous array rather than a red-black tree. For order book price levels, this is directly relevant.

The order book access pattern

An exchange order book has a small number of active price levels (typically 10-50 per side for liquid instruments, though some books are sparser). The access pattern is:

  • Insert/delete: rare and latency-tolerant - happens when a new price appears or all orders at a level are filled
  • Lookup by price: moderate frequency, when a new order arrives and we need to find its level
  • Iteration in price order: the hot path - calculating mid-price, VWAP, displaying the book, or computing order imbalance all walk the book

std::map (red-black tree) is terrible for iteration because price levels scatter across heap allocations. Walking the tree in order triggers a cache miss per node.

std::flat_map

#include <flat_map>

struct Level {
    double price;
    double qty;
    int order_count;
};

// Bid side: descending order (highest bid first)
std::flat_map<double, Level, std::greater<double>> bids;

// Ask side: ascending order (lowest ask first)  
std::flat_map<double, Level> asks;

flat_map stores keys and values in two separate std::vectors, sorted by key. Lookups are O(log n) via binary search on the key vector. Iteration is a sequential scan of contiguous memory.

For a 50-level book, binary search is ~6 comparisons - fast. And the tight iteration loop is cache-friendly.

Comparison

I benchmarked three approaches on a synthetic order book workload: 40 price levels, 70% iteration (VWAP calculation), 20% point lookup, 10% insert/delete. Compiler: clang-18 with -O3 -march=native.

std::map:          iterate 380ns, lookup 45ns, insert 180ns
sorted vector:     iterate  28ns, lookup 38ns, insert 320ns
std::flat_map:     iterate  31ns, lookup 35ns, insert 290ns

flat_map matches the hand-rolled sorted vector in iteration performance (both are sequential array scans) and beats both map variants on lookup thanks to prefetch-friendly binary search on the contiguous key array.

The insert cost is higher than std::map because insertion into a sorted array requires shifting elements - for a 40-level book that’s at most 40 moves of 8 bytes each, which is still a couple of SIMD operations. Acceptable.

One complication: floating-point price keys

Using double as a map key is dangerous - FP equality is approximate. The standard solution is to represent prices as integers: multiply by the tick size denominator and use int64_t.

// 0.01 tick size → store in units of 0.01
// £42.53 → 4253
using PriceTick = int64_t;
std::flat_map<PriceTick, Level, std::greater<PriceTick>> bids;

This is exact, sortable, and avoids the comparison footguns.

When not to use flat_map

If your book has high insert/delete churn - e.g., a synthetic book that gets rebuilt on every update rather than incrementally maintained - the O(n) insert cost hurts. A std::map or a skip list is better there.

But for a well-maintained incremental book where inserts are rare relative to reads, flat_map is a clean upgrade from std::map with essentially no code changes required.

The flat_map book structure feeds the WebSocket streaming code I wrote about in the previous post.