Anatomy of a Real Crypto Order Book
Trading isn't free, and the "price" on a screen isn't the price you get. This post dissects a real recorded BTC-USD limit order book (Coinbase, ~600 snapshots over ~10 minutes — the same data that powers the ConvexPi Arena) to measure the three things that actually determine your cost: the spread, the depth, and the slippage of walking the book. See how matching works for the mechanics.
%matplotlib inline
import json, urllib.request
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
URL = "https://raw.githubusercontent.com/convexpi/arena/f75088c55d58c8693fec8e85c9ef4a8927ab2f2e/data/btcusd_book.jsonl"
with urllib.request.urlopen(URL) as resp:  # close the connection once the body is read
    raw = resp.read().decode()
frames = [json.loads(line) for line in raw.splitlines() if line.strip()]
print(f"{len(frames)} order-book snapshots; each has bids 'b' and asks 'a' = [[price, size], ...]")
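Everything below treats `f["b"][0]` as the best bid and `f["a"][0]` as the best ask. That holds only if bids are sorted from the highest price down and asks from the lowest price up. A minimal sketch of a check for that assumption, run on a made-up snapshot (the helper name `check_frame` and the sample numbers are illustrative, not part of the recorded data or any Arena API):

```python
def check_frame(frame):
    """Return a list of problems found in one snapshot (empty list = looks sane)."""
    bids, asks = frame.get("b", []), frame.get("a", [])
    if not bids or not asks:
        return ["one side of the book is empty"]
    problems = []
    bid_prices = [p for p, _ in bids]
    ask_prices = [p for p, _ in asks]
    if bid_prices != sorted(bid_prices, reverse=True):
        problems.append("bids are not sorted best (highest) first")
    if ask_prices != sorted(ask_prices):
        problems.append("asks are not sorted best (lowest) first")
    if bid_prices[0] >= ask_prices[0]:
        problems.append("book is crossed or locked: best bid >= best ask")
    if any(q <= 0 for _, q in bids + asks):
        problems.append("non-positive size at some level")
    return problems

# Hypothetical snapshot in the same shape as the recorded data.
sample = {"t": 0, "b": [[99.9, 1.0], [99.8, 2.0]], "a": [[100.1, 0.5], [100.2, 1.5]]}
print(check_frame(sample))  # -> []
```

On the real data, something like `[i for i, f in enumerate(frames) if check_frame(f)]` would list the indices of any snapshots worth inspecting before trusting the statistics.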
1. The spread
The best bid and best ask bracket the mid; the gap is the spread — what you immediately pay to cross from buyer to seller. Coinbase BTC is famously tight, so we look at it in basis points.
rows = []
for f in frames:
    bb, ba = f["b"][0][0], f["a"][0][0]  # best bid, best ask
    mid = (bb + ba) / 2
    rows.append({"t": f["t"], "mid": mid, "spread": ba - bb, "spread_bps": (ba - bb) / mid * 1e4})
df = pd.DataFrame(rows)
df["time"] = pd.to_datetime(df["t"], unit="ms")
print(f"mid price : ${df['mid'].mean():,.2f}")
print(f"spread : ${df['spread'].mean():.3f} ({df['spread_bps'].mean():.2f} bps avg, "
      f"{df['spread_bps'].median():.2f} bps median)")
fig, ax = plt.subplots(2, 1, figsize=(10, 5), sharex=True)
ax[0].plot(df["time"], df["mid"], lw=1); ax[0].set_title("BTC-USD mid price")
ax[1].plot(df["time"], df["spread_bps"], lw=1, color="indianred"); ax[1].set_title("Spread (bps)")
plt.tight_layout(); plt.show()
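To make the unit concrete, here is the same arithmetic on a single made-up quote (the numbers and the helper name `quote_stats` are illustrative, not taken from the recorded data). It also reports the spread per side, since one market order crosses only half of the spread relative to the mid; a round trip pays all of it.

```python
def quote_stats(best_bid, best_ask):
    """Mid and spread for one top-of-book quote, with the spread also in basis points."""
    mid = (best_bid + best_ask) / 2
    spread = best_ask - best_bid
    spread_bps = spread / mid * 1e4  # 1 bp = 0.01% of the mid
    return {
        "mid": mid,
        "spread": spread,
        "spread_bps": spread_bps,
        "half_spread_bps": spread_bps / 2,  # cost vs. mid of crossing once
    }

# Made-up quote, chosen for round numbers.
q = quote_stats(99_995.0, 100_005.0)
print(f"mid ${q['mid']:,.2f}  spread ${q['spread']:.2f}  "
      f"= {q['spread_bps']:.2f} bps ({q['half_spread_bps']:.2f} bps per side)")
```

A $10 gap on a $100,000 mid is 1 bp, so a tiny market buy at the ask pays about 0.5 bp over the mid.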
2. Depth: the shape of the book
The spread only tells you the cost of a tiny trade. For real size, what matters is depth: how much is resting at each price level. For each of the first 15 levels on each side, we average two things across all snapshots: the cumulative size available down to that level, and that level's distance from the mid (in bps).
LEVELS = 15
def cum_depth(side_key, sign):
    # cumulative size and its avg price-offset (bps) for the first LEVELS levels, averaged over frames
    sizes = np.zeros(LEVELS)
    offs = np.zeros(LEVELS)
    cnt = np.zeros(LEVELS)
    for f in frames:
        mid = (f["b"][0][0] + f["a"][0][0]) / 2
        cum = 0.0
        for i, (price, qty) in enumerate(f[side_key][:LEVELS]):
            cum += qty
            sizes[i] += cum
            offs[i] += sign * (price - mid) / mid * 1e4
            cnt[i] += 1
    return offs / cnt, sizes / cnt
bid_off, bid_cum = cum_depth("b", -1)
ask_off, ask_cum = cum_depth("a", +1)
fig, ax = plt.subplots(figsize=(9, 3.5))
ax.step(-bid_off, bid_cum, where="mid", color="seagreen", label="bids (cumulative)")
ax.step(ask_off, ask_cum, where="mid", color="indianred", label="asks (cumulative)")
ax.axvline(0, color="grey", lw=0.5)
ax.set_xlabel("distance from mid (bps)"); ax.set_ylabel("cumulative size (BTC)")
ax.set_title("Average book depth"); ax.legend(); plt.tight_layout(); plt.show()
print(f"avg size in top {LEVELS} ask levels: {ask_cum[-1]:.2f} BTC (~${ask_cum[-1]*df['mid'].mean():,.0f})")
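The cell above groups by level index, so the bps offsets are averages rather than fixed bands. A complementary view fixes the band instead and asks how much size rests within it. The sketch below assumes levels are sorted best-first; `depth_within_bps` and the sample snapshot are hypothetical, not part of the original notebook.

```python
def depth_within_bps(frame, side_key, max_bps):
    """Total size resting on one side within max_bps of the mid, for one snapshot."""
    mid = (frame["b"][0][0] + frame["a"][0][0]) / 2
    total = 0.0
    for price, qty in frame[side_key]:
        if abs(price - mid) / mid * 1e4 > max_bps:
            break  # levels are sorted best-first, so the rest are farther out
        total += qty
    return total

# Made-up snapshot; the mid is 100.0, so 0.1 in price is 10 bps.
sample = {
    "b": [[99.9, 1.0], [99.8, 2.0], [99.5, 4.0]],
    "a": [[100.1, 0.5], [100.2, 1.5], [100.6, 3.0]],
}
for band in (15, 25, 100):
    bid_depth = depth_within_bps(sample, "b", band)
    ask_depth = depth_within_bps(sample, "a", band)
    print(f"within {band:>3} bps: bids {bid_depth:.1f}, asks {ask_depth:.1f}")
```

Averaging `depth_within_bps(f, "a", 5)` over `frames` would give the typical ask size available within 5 bps, which maps more directly onto "how big an order fits inside a given cost".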
3. Slippage: what a market order really costs
A market buy walks the ask ladder: it fills the cheapest offers first, then more expensive ones. The volume-weighted price you pay drifts above the mid — that's slippage, and it grows with order size. This is the cost the Arena makes you feel when you send a market order into the real book.
def slippage_buy_bps(frame, size_btc):
    """Slippage vs. mid (bps) of a market buy of size_btc; NaN if the book can't fill it."""
    mid = (frame["b"][0][0] + frame["a"][0][0]) / 2
    remaining, cost, filled = size_btc, 0.0, 0.0
    for price, qty in frame["a"]:  # asks, cheapest first
        take = min(remaining, qty)
        cost += take * price
        filled += take
        remaining -= take
        if remaining <= 1e-12:
            break
    if filled < size_btc * 0.999:  # book too thin to fill this size
        return np.nan
    return (cost / filled / mid - 1) * 1e4
sizes = [0.05, 0.1, 0.25, 0.5, 1, 2, 3, 5]
slip = [np.nanmean([slippage_buy_bps(f, s) for f in frames]) for s in sizes]
fillable = [np.mean([not np.isnan(slippage_buy_bps(f, s)) for f in frames]) for s in sizes]
fig, ax = plt.subplots(figsize=(9, 3.5))
ax.plot(sizes, slip, "o-"); ax.set_xlabel("market-buy size (BTC)"); ax.set_ylabel("avg slippage (bps)")
ax.set_title("Slippage vs order size"); plt.tight_layout(); plt.show()
for s, sl, fr in zip(sizes, slip, fillable):
    print(f" buy {s:>4} BTC (~${s*df['mid'].mean():>10,.0f}) -> {sl:5.2f} bps slippage [fillable in {fr:.0%} of snapshots]")
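Because slippage here is measured against the mid, it already includes half the spread; only the remainder comes from walking deeper into the ladder. The sketch below separates the two and handles sells as well as buys. It is an illustrative variant, not the notebook's function: `market_order_cost` is a hypothetical name, the snapshot is made up, and unlike `slippage_buy_bps` it returns `None` unless the order fills completely.

```python
def market_order_cost(frame, size, side="buy"):
    """Walk one side of a snapshot.

    Returns (vwap, slippage_bps, half_spread_bps), or None if the visible
    book cannot fill the whole order.
    """
    best_bid, best_ask = frame["b"][0][0], frame["a"][0][0]
    mid = (best_bid + best_ask) / 2
    levels = frame["a"] if side == "buy" else frame["b"]
    remaining, cost = size, 0.0
    for price, qty in levels:
        take = min(remaining, qty)
        cost += take * price
        remaining -= take
        if remaining <= 1e-12:
            break
    if remaining > 1e-12:
        return None  # visible book too thin for this size
    vwap = cost / size
    sign = 1 if side == "buy" else -1  # buyers pay above mid, sellers receive below it
    slippage_bps = sign * (vwap / mid - 1) * 1e4
    half_spread_bps = (best_ask - best_bid) / 2 / mid * 1e4
    return vwap, slippage_bps, half_spread_bps

# Made-up snapshot: mid 100.0, spread 0.2 (20 bps), so the half-spread is 10 bps.
sample = {"b": [[99.9, 0.5], [99.8, 2.0]], "a": [[100.1, 0.5], [100.2, 1.5]]}
for side in ("buy", "sell"):
    vwap, slip, half = market_order_cost(sample, 1.0, side)
    print(f"{side:>4} 1.0: vwap {vwap:.3f}, slippage {slip:.1f} bps "
          f"= {half:.1f} half-spread + {slip - half:.1f} from walking deeper")
```

In this toy book a 1.0 order on either side exhausts the first level and takes the rest one level deeper, so 10 of the 15 bps are spread and 5 bps are depth.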
Takeaways
- The spread is the floor, not the cost. Coinbase's BTC spread is ~1 bp — but that only covers a dust-sized trade.
- Depth is finite. The top-of-book holds only a few BTC; a larger order walks up the ladder.
- Slippage grows with size and explodes once you exhaust visible depth — the real reason big orders are split, worked, or routed.
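As a rough illustration of why splitting helps, the toy comparison below sends one large buy versus four smaller ones against a made-up ask ladder. It rests on a strong assumption, flagged in the code: the book refills to the same ladder between child orders and the mid does not move. Real books replenish only partly and prices drift, so treat the sliced figure as a best case rather than a forecast. The helper `vwap_buy` and all numbers are hypothetical.

```python
def vwap_buy(asks, size):
    """Volume-weighted fill price for a market buy against an ask ladder, or None if too thin."""
    remaining, cost = size, 0.0
    for price, qty in asks:
        take = min(remaining, qty)
        cost += take * price
        remaining -= take
        if remaining <= 1e-12:
            return cost / size
    return None

# Made-up ask ladder (cheapest first); the mid is assumed to be 100.0.
asks = [[100.1, 1.0], [100.2, 1.0], [100.4, 1.0], [100.8, 1.0]]
mid = 100.0

one_shot = vwap_buy(asks, 4.0)

# ASSUMPTION: the ladder fully refills between child orders and the mid stays put.
sliced = sum(vwap_buy(asks, 1.0) for _ in range(4)) / 4

print(f"one 4 BTC order  : {(one_shot / mid - 1) * 1e4:.1f} bps over mid")
print(f"four 1 BTC slices: {(sliced / mid - 1) * 1e4:.1f} bps over mid")
```

The single order pays the average of all four levels, while each slice pays only the best ask. How much of that gap survives in practice depends on how quickly the real book replenishes, which the recorded snapshots can be used to estimate.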
This is exactly what the Arena's real-order-book mode makes tangible: your market orders pay this slippage, and a market maker earns it. Build an agent in the market-making lesson.

