ConvexPi

Anatomy of a Real Crypto Order Book

Baseline. Tags: crypto, microstructure, order-book, slippage

Trading isn't free, and the "price" on a screen isn't the price you get. This post dissects a real recorded BTC-USD limit order book (Coinbase, ~600 snapshots over ~10 minutes — the same data that powers the ConvexPi Arena) to measure the three things that actually determine your cost: the spread, the depth, and the slippage of walking the book. See how matching works for the mechanics.

In [1]:
%matplotlib inline
import json, urllib.request
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

URL = "https://raw.githubusercontent.com/convexpi/arena/f75088c55d58c8693fec8e85c9ef4a8927ab2f2e/data/btcusd_book.jsonl"
with urllib.request.urlopen(URL) as resp:   # close the connection once the body is read
    raw = resp.read().decode()
frames = [json.loads(line) for line in raw.splitlines() if line.strip()]
print(f"{len(frames)} order-book snapshots; each has bids 'b' and asks 'a' = [[price, size], ...]")
600 order-book snapshots; each has bids 'b' and asks 'a' = [[price, size], ...]
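Everything that follows treats `f["b"][0]` as the best bid and `f["a"][0]` as the best ask, so it is worth confirming that ordering before trusting the numbers. The helper below is a minimal sketch of such a check; `check_frame` is a hypothetical name, the toy snapshots are made up, and the assumption that bids are sorted highest-first and asks lowest-first is inferred from how the notebook indexes the data rather than from any documented schema.

```python
def check_frame(frame):
    """Return a list of problems found in one snapshot (empty list = looks sane)."""
    bids, asks = frame.get("b", []), frame.get("a", [])
    if not bids or not asks:
        return ["one side of the book is empty"]
    problems = []
    bid_prices = [p for p, _ in bids]
    ask_prices = [p for p, _ in asks]
    if bid_prices != sorted(bid_prices, reverse=True):
        problems.append("bids are not sorted best (highest) first")
    if ask_prices != sorted(ask_prices):
        problems.append("asks are not sorted best (lowest) first")
    if bid_prices[0] >= ask_prices[0]:
        problems.append("book is crossed or locked (best bid >= best ask)")
    if any(q <= 0 for _, q in bids + asks):
        problems.append("non-positive size at some level")
    return problems

# Toy snapshots in the same shape as the recorded data (values are made up).
toy = {"t": 0, "b": [[100.0, 0.5], [99.9, 1.0]], "a": [[100.1, 0.4], [100.2, 2.0]]}
crossed = {"t": 1, "b": [[100.2, 0.5]], "a": [[100.1, 0.4]]}
print(check_frame(toy))
print(check_frame(crossed))
```

On the real data, something like `[f["t"] for f in frames if check_frame(f)]` would list the timestamps of any suspicious snapshots.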

1. The spread

The best bid and best ask bracket the mid; the gap between them is the spread, the cost you pay immediately to buy at the ask and sell back at the bid. Coinbase BTC is famously tight, so we measure the spread in basis points of the mid (1 bp = 0.01%).

In [2]:
rows = []
for f in frames:
    bb, ba = f["b"][0][0], f["a"][0][0]
    mid = (bb + ba) / 2
    rows.append({"t": f["t"], "mid": mid, "spread": ba - bb, "spread_bps": (ba - bb) / mid * 1e4})
df = pd.DataFrame(rows)
df["time"] = pd.to_datetime(df["t"], unit="ms")
print(f"mid price : ${df['mid'].mean():,.2f}")
print(f"spread    : ${df['spread'].mean():.3f}  ({df['spread_bps'].mean():.2f} bps avg, "
      f"{df['spread_bps'].median():.2f} bps median)")

fig, ax = plt.subplots(2, 1, figsize=(10, 5), sharex=True)
ax[0].plot(df["time"], df["mid"], lw=1); ax[0].set_title("BTC-USD mid price")
ax[1].plot(df["time"], df["spread_bps"], lw=1, color="indianred"); ax[1].set_title("Spread (bps)")
plt.tight_layout(); plt.show()
mid price : $61,502.15
spread    : $0.058  (0.01 bps avg, 0.00 bps median)
[Figure: two stacked panels sharing a time axis, "BTC-USD mid price" (top) and "Spread (bps)" (bottom).]
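To make the units concrete, here is a minimal sketch of the spread-to-bps conversion on a single quote. The function name `spread_bps` and the bid/ask values are illustrative assumptions, picked to sit near the averages printed above; they are not values read from the recorded book.

```python
def spread_bps(best_bid, best_ask):
    """Quoted spread as basis points of the mid (1 bp = 0.01%)."""
    mid = (best_bid + best_ask) / 2
    return (best_ask - best_bid) / mid * 1e4

# Hypothetical quote, close to the averages printed above.
bid, ask = 61_502.12, 61_502.18
bps = spread_bps(bid, ask)
print(f"spread = ${ask - bid:.2f} = {bps:.4f} bps")

# A buyer who lifts the ask pays half the spread relative to the mid.
notional = 10_000
print(f"half-spread cost on a ${notional:,} buy: ${notional * bps / 2 / 1e4:.4f}")
```

A spread of a few cents on a price above $60,000 is on the order of a hundredth of a basis point, which is why the two-decimal bps figures in the cell output round to 0.01 and 0.00.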

2. Depth — the shape of the book

The spread only tells you the cost of a tiny trade. For real size, what matters is depth: how much is resting at each price level. For each of the first 15 price levels on each side, we average two things across all snapshots: the cumulative size up to that level and the level's distance from the mid (in bps).

In [3]:
LEVELS = 15
def cum_depth(side_key, sign):
    # cumulative size and its avg price-offset (bps) for the first LEVELS levels, averaged over frames
    sizes = np.zeros(LEVELS); offs = np.zeros(LEVELS); cnt = np.zeros(LEVELS)
    for f in frames:
        mid = (f["b"][0][0] + f["a"][0][0]) / 2
        cum = 0.0
        for i, (price, qty) in enumerate(f[side_key][:LEVELS]):
            cum += qty
            sizes[i] += cum
            offs[i] += sign * (price - mid) / mid * 1e4
            cnt[i] += 1
    return offs / cnt, sizes / cnt
bid_off, bid_cum = cum_depth("b", -1)
ask_off, ask_cum = cum_depth("a", +1)

fig, ax = plt.subplots(figsize=(9, 3.5))
ax.step(-bid_off, bid_cum, where="mid", color="seagreen", label="bids (cumulative)")
ax.step(ask_off, ask_cum, where="mid", color="indianred", label="asks (cumulative)")
ax.axvline(0, color="grey", lw=0.5)
ax.set_xlabel("distance from mid (bps)"); ax.set_ylabel("cumulative size (BTC)")
ax.set_title("Average book depth"); ax.legend(); plt.tight_layout(); plt.show()
print(f"avg size in top {LEVELS} ask levels: {ask_cum[-1]:.2f} BTC  (~${ask_cum[-1]*df['mid'].mean():,.0f})")
[Figure: "Average book depth", cumulative size (BTC) against distance from mid (bps), with bids to the left of the mid and asks to the right.]
avg size in top 15 ask levels: 1.34 BTC  (~$82,445)
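The cell above groups depth by level index. A complementary view, sketched here under stated assumptions, asks how much size sits within a fixed distance of the mid in a single snapshot. `depth_within_bps` is a hypothetical helper, the toy book is made up, and the early `break` relies on levels being sorted best-first.

```python
def depth_within_bps(levels, mid, max_bps):
    """Total size resting within max_bps of the mid on one side of the book."""
    total = 0.0
    for price, qty in levels:
        if abs(price - mid) / mid * 1e4 > max_bps:
            break  # levels assumed sorted best-first, so the rest are farther away
        total += qty
    return total

# Toy snapshot (made-up values) in the same shape as the recorded frames.
toy = {"b": [[99.99, 0.3], [99.96, 0.7], [99.80, 2.0]],
       "a": [[100.01, 0.2], [100.04, 0.5], [100.30, 1.5]]}
mid = (toy["b"][0][0] + toy["a"][0][0]) / 2

for band in (2, 5, 25):
    bid_d = depth_within_bps(toy["b"], mid, band)
    ask_d = depth_within_bps(toy["a"], mid, band)
    print(f"within {band:>2} bps: bids {bid_d:.1f} BTC, asks {ask_d:.1f} BTC")
```

Averaging this quantity over `frames` for a few bands would give a depth curve indexed by distance rather than by level, which is easier to compare across venues with different tick sizes.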

3. Slippage — what a market order really costs

A market buy walks the ask ladder: it fills the cheapest offers first, then more expensive ones. The volume-weighted price you pay drifts above the mid — that's slippage, and it grows with order size. This is the cost the Arena makes you feel when you send a market order into the real book.

In [4]:
def slippage_buy_bps(frame, size_btc):
    """Slippage (bps above mid) of a market buy of size_btc; NaN if the book can't fill it."""
    mid = (frame["b"][0][0] + frame["a"][0][0]) / 2
    remaining, cost, filled = size_btc, 0.0, 0.0
    for price, qty in frame["a"]:           # asks assumed sorted cheapest first
        take = min(remaining, qty)
        cost += take * price
        filled += take
        remaining -= take
        if remaining <= 1e-12:
            break
    if filled < size_btc * 0.999:           # book too thin to fill this size
        return np.nan
    return (cost / filled / mid - 1) * 1e4  # volume-weighted fill price vs mid, in bps

sizes = [0.05, 0.1, 0.25, 0.5, 1, 2, 3, 5]
# one row per size, one column per snapshot; NaN marks snapshots too thin to fill
all_slip = np.array([[slippage_buy_bps(f, s) for f in frames] for s in sizes])
filled_ok = ~np.isnan(all_slip)
fillable = filled_ok.mean(axis=1)
# average over fillable snapshots only; stay NaN (without a warning) when none are fillable
slip = [row[ok].mean() if ok.any() else np.nan for row, ok in zip(all_slip, filled_ok)]

fig, ax = plt.subplots(figsize=(9, 3.5))
ax.plot(sizes, slip, "o-"); ax.set_xlabel("market-buy size (BTC)"); ax.set_ylabel("avg slippage (bps)")
ax.set_title("Slippage vs order size"); plt.tight_layout(); plt.show()
for s, sl, fr in zip(sizes, slip, fillable):
    print(f"  buy {s:>4} BTC  (~${s*df['mid'].mean():>10,.0f})  ->  {sl:5.2f} bps slippage   [fillable in {fr:.0%} of snapshots]")
[Figure: "Slippage vs order size", average slippage (bps) against market-buy size (BTC).]
  buy 0.05 BTC  (~$     3,075)  ->   0.03 bps slippage   [fillable in 100% of snapshots]
  buy  0.1 BTC  (~$     6,150)  ->   0.05 bps slippage   [fillable in 100% of snapshots]
  buy 0.25 BTC  (~$    15,376)  ->   0.12 bps slippage   [fillable in 100% of snapshots]
  buy  0.5 BTC  (~$    30,751)  ->   0.23 bps slippage   [fillable in 100% of snapshots]
  buy    1 BTC  (~$    61,502)  ->   0.42 bps slippage   [fillable in 97% of snapshots]
  buy    2 BTC  (~$   123,004)  ->   0.40 bps slippage   [fillable in 24% of snapshots]
  buy    3 BTC  (~$   184,506)  ->   0.27 bps slippage   [fillable in 3% of snapshots]
  buy    5 BTC  (~$   307,511)  ->    nan bps slippage   [fillable in 0% of snapshots]
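The same walk applies to a market sell, which consumes bids and fills below the mid. The sketch below generalises the idea to both sides on a toy snapshot. `walk_book` is a hypothetical helper and the prices are made up; unlike `slippage_buy_bps`, it reports partial fills instead of returning NaN, which is a design choice of this example rather than how the notebook or the Arena handles them.

```python
def walk_book(frame, size, side):
    """Fill a market order against one snapshot.

    side="buy" consumes asks, side="sell" consumes bids.
    Returns (filled_qty, vwap, slippage_bps); vwap and slippage are None if nothing fills.
    Slippage is signed so that a positive number is always a cost to the taker.
    """
    mid = (frame["b"][0][0] + frame["a"][0][0]) / 2
    levels = frame["a"] if side == "buy" else frame["b"]
    remaining, cost = size, 0.0
    for price, qty in levels:
        take = min(remaining, qty)
        cost += take * price
        remaining -= take
        if remaining <= 1e-12:
            break
    filled = size - remaining
    if filled <= 0:
        return 0.0, None, None
    vwap = cost / filled
    sign = 1 if side == "buy" else -1
    return filled, vwap, sign * (vwap / mid - 1) * 1e4

# Toy snapshot (made-up values): 3 BTC resting on each side.
toy = {"b": [[99.9, 1.0], [99.8, 2.0]], "a": [[100.1, 1.0], [100.3, 2.0]]}

for side, size in [("buy", 2.0), ("sell", 2.0), ("buy", 5.0)]:
    filled, vwap, slip = walk_book(toy, size, side)
    print(f"{side:>4} {size} BTC: filled {filled:.1f}, VWAP {vwap:.3f}, slippage {slip:.2f} bps")
```

In the last case only 3 of the 5 BTC fill, so the slippage figure describes the filled part alone; the unfilled remainder is a separate cost that a single average does not show.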

Takeaways

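  1. The spread is the floor, not the cost. In this sample Coinbase's BTC-USD spread averages about $0.06, roughly 0.01 bps of the mid, but that price only holds for the size resting at the best quote.
  2. Depth is finite. The top 15 ask levels hold only about 1.3 BTC on average (~$82,000); a larger order walks up the ladder.
  3. Slippage grows with size until the order exhausts the recorded depth: a 1 BTC buy fills in 97% of snapshots, a 2 BTC buy in 24%, and a 5 BTC buy in none. The lower averages at 2 and 3 BTC are a selection effect, because they only include the unusually deep snapshots that could fill the order. Running out of visible depth is the real reason big orders are split, worked, or routed.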
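In the slippage table above, the average falls at 2 and 3 BTC while the fillable share collapses. One plausible reading is a selection effect: the average is taken only over snapshots deep enough to fill the order. The sketch below reproduces that pattern on made-up books; `buy_slippage_bps`, the `thin` and `deep` ladders, and the 3:1 mix are all illustrative assumptions, not the recorded data.

```python
import math

def buy_slippage_bps(asks, mid, size):
    """Slippage in bps for a full fill, or NaN when the ladder is too thin."""
    remaining, cost = size, 0.0
    for price, qty in asks:
        take = min(remaining, qty)
        cost += take * price
        remaining -= take
        if remaining <= 1e-12:
            return (cost / size / mid - 1) * 1e4
    return math.nan

mid = 100.0
thin = [[100.01, 0.5], [100.05, 0.7]]   # 1.2 BTC in total
deep = [[100.01, 2.0], [100.02, 3.0]]   # 5.0 BTC in total
books = [thin, thin, thin, deep]        # mostly thin snapshots, one deep one

results = {}
for size in (1.0, 2.0):
    slips = [buy_slippage_bps(b, mid, size) for b in books]
    filled = [s for s in slips if not math.isnan(s)]
    cond_mean = sum(filled) / len(filled) if filled else math.nan
    fill_rate = len(filled) / len(slips)
    results[size] = (cond_mean, fill_rate)
    print(f"size {size}: avg slippage over fillable = {cond_mean:.2f} bps, fillable = {fill_rate:.0%}")
```

Here the conditional average drops when the size doubles, purely because the thin books stop contributing. Reading the slippage column next to the fillable column avoids mistaking that drop for cheaper execution.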

This is exactly what the Arena's real-order-book mode makes tangible: your market orders pay this slippage, and a market maker earns it. Build an agent in the market-making lesson.

Make it yours: open in Colab, then File → Save a copy in Drive (or in GitHub) to get your own editable copy.
