Problem
Morpher offers 2,000+ synthetic markets. Every one of those markets needs a price, in real time, from a credible source. The price drives trade execution, position valuation, and the liquidation engine. If the price feed lags or drops, trades get filled at stale prices and somebody loses money.
We pull from 8 disparate market-data sources covering equities, commodities, crypto, forex, indices, and a handful of unique custom markets. Each source has its own protocol, its own latency profile, its own reliability story, and its own outage pattern. Some are WebSocket. Some are pull. Some say "subscribe to this firehose and good luck."
The pipeline had to do three things at once: fan all of that into a normalized stream the trading engine could consume; keep 100% retention so we could replay any historical second of any market; and keep end-to-end latency under 500 milliseconds even when the market got loud.
Constraints
Throughput baseline of 1k to 2k messages per second on a quiet trading day. Peak of 10k messages per second when something newsworthy hit a major asset class.
100% retention. Not a sample. Not a downsampled aggregate. Every tick, kept forever, queryable.
End-to-end latency under 500 milliseconds, measured from source emit to trading-engine consumption. A measurement sketch follows these constraints.
Operationally simple enough that a team of seven engineers across two product teams could keep it running while also shipping product.
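For concreteness, here is a minimal sketch of how that end-to-end number can be measured, assuming every normalized message carries the source's emit timestamp. The field name and the threshold handling are illustrative, not the production code.

```python
import time

# Assumes the adapter stamps `source_emit_ts` (epoch milliseconds) on every
# normalized message, and that source and consumer clocks are NTP-synced.
def e2e_latency_ms(message: dict, budget_ms: int = 500) -> float:
    """Latency from source emit to consumption by this process, in milliseconds."""
    latency = time.time() * 1000 - message["source_emit_ts"]
    if latency > budget_ms:
        # In production this would feed a metric and an alert, not a print.
        print(f"latency budget exceeded: {latency:.0f}ms > {budget_ms}ms")
    return latency
```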
Architecture
A per-source adapter layer normalized 8 source protocols into one internal message format. Each adapter was a small, well-tested unit responsible for one source and one source only. New sources slotted in by writing a new adapter; downstream code did not change.
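A rough sketch of that normalization contract, assuming a Python codebase; the class and field names are my illustration, not the production schema.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Callable

@dataclass(frozen=True)
class Tick:
    """The single internal message format every downstream consumer sees."""
    market_id: str        # internal market identifier
    price: float
    source: str           # which of the 8 upstream feeds produced the tick
    source_emit_ts: int   # epoch ms, stamped by the source or on receipt
    received_ts: int      # epoch ms, stamped by the adapter

class SourceAdapter(ABC):
    """One adapter per source. Source-specific quirks stop at this boundary."""

    @abstractmethod
    async def run(self, publish: Callable[[Tick], None]) -> None:
        """Connect to the source and call publish() for every normalized Tick.

        Reconnects, protocol details, and symbol mapping stay inside the adapter.
        """
```

New sources slot in as one more subclass; everything downstream of publish() stays untouched.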
RabbitMQ was the message bus. It handled fan-out from adapters to the trading engine, the historical writer, and the cache writer. It also handled backpressure when one consumer slowed down without tipping the others over.
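The fan-out topology, sketched with pika; the exchange and queue names are illustrative, not the real ones.

```python
import json
import pika

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()

# One fanout exchange: adapters publish each tick once, and every bound
# queue (one per consumer) receives its own copy.
channel.exchange_declare(exchange="ticks", exchange_type="fanout", durable=True)
for queue in ("trading-engine", "historical-writer", "cache-writer"):
    channel.queue_declare(queue=queue, durable=True)
    channel.queue_bind(queue=queue, exchange="ticks")

def publish(tick: dict) -> None:
    # Fanout ignores the routing key; one publish reaches all three queues.
    channel.basic_publish(exchange="ticks", routing_key="", body=json.dumps(tick))
```

Because each consumer owns its own queue, a slow historical writer backs up only its own queue while the trading engine and cache writer keep draining, and each consumer bounds its in-flight work with basic_qos(prefetch_count=...).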
Redis was the hot-path price cache. The trading engine reads from Redis, not from the bus directly, which decouples trade-execution latency from any individual consumer's behavior.
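In rough terms, with redis-py and an illustrative key scheme, the split between the cache writer and the trading engine's read path looks like this:

```python
import json
import redis

r = redis.Redis(host="localhost", port=6379)

# Cache writer: consumes the bus and keeps only the latest price per market.
def write_latest(tick: dict) -> None:
    r.set(f"price:{tick['market_id']}", json.dumps(tick))

# Trading-engine hot path: a single key lookup, independent of bus load.
def read_latest(market_id: str) -> dict | None:
    raw = r.get(f"price:{market_id}")
    return json.loads(raw) if raw else None
```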
S3 plus Athena was the cold storage and replay layer. Every tick lands in S3 in a partitioned, columnar layout. Athena gives us SQL over the entire history. That is how we reconstruct any incident, any market state, at any timestamp.
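The shape of that layout, sketched with boto3; the bucket, table, column names, partition scheme, and the example query are illustrative, not the exact production setup.

```python
import boto3

athena = boto3.client("athena")

# Ticks land under keys shaped like
#   s3://ticks-archive/market=BTC/dt=2021-05-19/hour=13/part-0001.parquet
# so a replay query prunes to one market-hour of Parquet instead of
# scanning the entire history.
query = """
SELECT source_emit_ts, price, source
FROM ticks
WHERE market = 'BTC'
  AND dt = '2021-05-19'
  AND hour = '13'
ORDER BY source_emit_ts
"""

athena.start_query_execution(
    QueryString=query,
    QueryExecutionContext={"Database": "market_data"},
    ResultConfiguration={"OutputLocation": "s3://ticks-athena-results/"},
)
```

Partitioning by market and time is what keeps these queries cheap: a replay touches one slice of the archive, not all of it.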
PostgreSQL holds derived state. Open positions, market state, anything that needs transactional guarantees and joins.
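Illustratively, with psycopg2 and made-up table names, the transactional side looks something like this:

```python
import psycopg2

conn = psycopg2.connect("dbname=trading")  # illustrative DSN

def close_position(position_id: int, exit_price: float) -> None:
    # Derived state sits behind real transactions: the position row and its
    # event record either both change or neither does.
    with conn:
        with conn.cursor() as cur:
            cur.execute(
                "UPDATE positions SET status = 'closed', exit_price = %s WHERE id = %s",
                (exit_price, position_id),
            )
            cur.execute(
                "INSERT INTO position_events (position_id, event) VALUES (%s, 'closed')",
                (position_id,),
            )
```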
My role
Architecture for the pipeline. The adapter framework and the normalization contract. Retention design and the partitioning scheme that makes Athena queries cheap. Most of the operational tuning that took the system from "works on a quiet day" to "holds at 10k peak."
Outcome
The pipeline holds the sub-500ms latency target in production. Peak load at 10k messages per second flows without queue buildup or fan-out lag. 100% retention has been validated multiple times when we replayed market state during incident postmortems and during platform-wide reconciliations.
The S3-plus-Athena layer became more valuable than I expected. When a position dispute came in months after the fact, we could replay the exact tick stream that the trading engine saw at that timestamp and resolve the dispute conclusively. That is hard to do with a system that drops ticks.
Tradeoffs
RabbitMQ rather than Kafka. At the time, RabbitMQ gave us simpler operational behavior and the fan-out semantics we needed without the cluster complexity Kafka asks for. If I were starting clean today, Kafka is the obvious choice for the same problem, especially because Kafka's log-replay model is closer to what S3 plus Athena gives us anyway. But ripping out a working bus to replace it with the textbook answer is rarely a productive use of a CTO's time.
S3 plus Athena trades query speed for cheap storage. Athena queries are not interactive. They are minutes, not milliseconds. That is fine for replay and for postmortems. It would be wrong if we needed a hot historical query path for product features. We did not, so the choice held.
A per-source adapter layer is more code than a single generic ingest. We accepted that because the alternative was leaking source-specific quirks into the trading engine, which would have been a much larger problem.
The lesson I keep: pick boring tools for the hot path, optimize the contract between layers, and do not let any one source's weirdness leak into the system.