
Game Backend Deep Dive – Mortal Kombat X & Injustice 2

From Dynamic to Fixed Latency: Switching from variable 5-to-20-frame input latency to a fixed 3-frame delay was the core player experience “win”, because predictability matters as much as raw responsiveness in a fighting game.
Rollback Performance is Multiplicative: Every optimization applied to simulated frames pays off up to 8x compared to the render frame alone, demanding a complete rethink of the performance budget.
Determinism Enables Everything: Bit-for-bit determinism isn't just a correctness requirement. It is, per Stallone, what makes offline desync debugging, match replay, and the entire rollback model possible.
Benchmark Against Reality, Not Worst Case: NetherRealm spent months optimizing for a scenario that almost never occurs in practice; real beta telemetry and an understanding of human input cadence corrected the target entirely.
Particles and Desyncs Need Dedicated Systems: Naive re-simulation of non-deterministic particle effects produces broken visuals at scale, and rollback-induced desyncs require first-class tooling to catch and fix quickly.
Michael Stallone, now Director of Technology at NetherRealm Studios and then Lead Software Engineer on the engine team, delivered a GDC 2017 presentation detailing how NetherRealm switched the network model of Mortal Kombat X and Injustice 2 from lockstep to rollback in a live patch (explanation of each technique here), an effort of roughly seven to eight man-years of engineering.
Indirectly, this highlights best practices that any game studio, big or small, can apply to their multiplayer to improve its online architecture and player experience. Let's roll back to the beginning.
From Unpredictable to Fixed: The Player Experience Case for Rollback
The reason NetherRealm switched wasn't purely technical: players were unhappy.
Under lockstep, input latency was dynamic. It fluctuated between 5 and 20 frames depending on network conditions at any given moment. For a fighting game like Mortal Kombat, that's a serious problem. Players execute combos with a specific button cadence and expect it to work the same way every time. When the delay shifts mid-fight, muscle memory stops working. The cadence breaks.
Critically for the feedback loop to game developers, players weren't describing this as "high latency". They just described it as the game feeling wrong.
The switch to rollback delivered three fixed frames of input latency, supporting up to ten total frames of network latency (333 milliseconds of round-trip time) before the game pauses.
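The relationship between the fixed delay and the pause threshold can be sketched in a few lines. This is an illustrative model, not NetherRealm's actual code; the constant names and function shapes are assumptions:

```python
# Minimal model of fixed-delay input scheduling. A local input pressed on
# frame F takes effect on frame F + INPUT_DELAY, and the simulation stalls
# once it outruns the last confirmed remote frame by more than the
# tolerated network window.
INPUT_DELAY = 3        # fixed local input latency, in frames
MAX_UNCONFIRMED = 10   # total frames of network latency tolerated before pausing

def schedule_input(current_frame: int) -> int:
    """Frame on which an input pressed right now takes effect."""
    return current_frame + INPUT_DELAY

def should_pause(current_frame: int, last_confirmed_remote_frame: int) -> bool:
    """Pause the simulation when we get too far ahead of the remote peer."""
    return current_frame - last_confirmed_remote_frame > MAX_UNCONFIRMED
```

At 30 frames per second of simulation, ten frames corresponds to the 333 ms round-trip figure from the talk; the delay itself never changes, which is the whole point.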
Predictability, not just low latency, is what fighting game players need. A fixed 5-frame delay beats a dynamic 3-to-15-frame delay every time.
The Real Cost of Rollback is Multiplicative
Before rollback, NetherRealm was idling at 9 to 10 milliseconds per frame. After the initial rollback implementation, that number jumped to 30 milliseconds. Nearly double their 16.66 ms budget. The console generation jump had given them plenty of headroom to be careless with CPU resources, and they had to pay for it.
The reason the cost is so steep is structural. In a rollback system, every simulated frame runs the same game logic as the render frame, just with fewer systems active.
Optimize something that runs across all eight simulated frames and you get up to 8x the savings. Optimize only the render frame and you get one frame of savings. The entire performance mindset has to shift toward the simulated tick.
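The arithmetic behind that shift is simple but worth making explicit. A hypothetical worked example, assuming the seven-frame rollback window from the talk:

```python
# With a 7-frame rollback window, one render frame can trigger up to
# 8 simulation ticks: 7 re-simulated frames plus the new frame.
ROLLBACK_WINDOW = 7

def worst_case_frame_cost_ms(tick_cost_ms: float) -> float:
    """Worst-case per-frame cost when every frame triggers a full rollback."""
    return tick_cost_ms * (ROLLBACK_WINDOW + 1)

def savings_per_render_frame_ms(tick_savings_ms: float) -> float:
    """A saving in simulated-tick code is multiplied by the same factor."""
    return tick_savings_ms * (ROLLBACK_WINDOW + 1)
```

Shaving half a millisecond off code that runs on simulated ticks is worth up to 4 ms per rendered frame, while the same saving in render-only code is worth exactly 0.5 ms.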
This multiplicative pressure is also worth considering in the broader context of tickrate. Moving from 20 Hz to 60 Hz, for example, reduces input latency but also multiplies simulation cycles per second, which directly increases server CPU usage and network egress costs.
There is likely a middle ground between the ideal tickrate and the cost-effective one, and the right answer will depend on how latency-sensitive your game actually is and the size of your infrastructure budget. NetherRealm's team started at 30 ms and got to a shippable 13 ms idle through disciplined, layered optimization, and it took the better part of several man-years.
Determinism as the Foundation
Everything in NetherRealm's rollback model depends on one thing: the game plays out identically on both machines, every single time.
The vast majority of Mortal Kombat X is bit-for-bit deterministic. Every floating-point operation runs in the same order on every machine. Thousands of fencepost checks validate that both clients are in sync at various points in the tick.
Any divergence is a desync, and that's by design. Because if you can guarantee determinism, a lot of things become possible: you can roll back, you can replay matches offline for debugging, you can reproduce desyncs with a single development kit, and you can detect problems before they reach players.
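The fencepost idea can be sketched as follows. This is a hedged illustration, not NetherRealm's implementation: because determinism is bit-for-bit, the sketch hashes the raw bytes of each float rather than rounded values:

```python
import hashlib
import struct

def fencepost_hash(*state_values: float) -> str:
    """Hash a slice of deterministic game state. Bit-for-bit determinism
    means we hash the exact float bytes; even a 1-ulp divergence changes
    the digest."""
    h = hashlib.sha1()
    for v in state_values:
        h.update(struct.pack("<d", v))  # little-endian IEEE 754 double
    return h.hexdigest()

def check_fencepost(local_hash: str, remote_hash: str) -> bool:
    """Both clients compute the same fencepost at the same tick; any
    mismatch is flagged as a desync."""
    return local_hash == remote_hash
```

In practice such checks would be sprinkled at many points in the tick (the talk mentions thousands), so a desync can be localized to the system that first diverged.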
Determinism isn't glamorous. But it is the foundation that everything else in the game’s experience is built on.
Serialization is the Tentpole
"Serialization is the tentpole of this," Stallone said early in the talk, and the entire implementation bore that out.
The rollback framework is a ring buffer sized to the rollback window: one entry per frame, seven entries for seven rollback frames.
Every object containing mutable state needs to be saveable and restorable.
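A minimal sketch of that ring buffer, under the assumption (stated later in the talk) that state is saved as flat byte blobs that can be copied in and out memcpy-style; the class and method names are illustrative:

```python
class SnapshotRing:
    """Ring buffer of per-frame state snapshots, one slot per frame in
    the rollback window. Slots are reused modulo the window size, so old
    frames silently fall off the end."""

    def __init__(self, window: int = 7):
        self.window = window
        self.slots = [None] * window   # flat byte blobs, no pointer fix-up
        self.frames = [-1] * window    # which frame each slot holds

    def save(self, frame: int, state: bytes) -> None:
        i = frame % self.window
        self.slots[i] = state
        self.frames[i] = frame

    def restore(self, frame: int) -> bytes:
        i = frame % self.window
        if self.frames[i] != frame:
            raise KeyError(f"frame {frame} has been overwritten")
        return self.slots[i]
```

Sizing the buffer to exactly the rollback window keeps memory bounded, at the cost that a frame older than the window can never be restored, which is what makes the buffer-exhaustion rollbacks discussed later possible.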
Object creation and destruction across rollback boundaries is especially tricky (done naively, you end up constantly creating and destroying objects).
NetherRealm solved this with a system called “Recreatables”: hash the object and the context around it, and if the hash matches on the way back through the creation edge, reuse the existing object. No recreation, no re-simulation. This is particularly valuable for sounds and particles, which are non-deterministic and will produce different results if you destroy, recreate, and re-simulate them.
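A hypothetical sketch of the Recreatables idea; the key composition and factory interface are my assumptions, not the shipped system:

```python
# Live objects keyed by a hash of their creation context. When a rollback
# re-runs frames and hits the same creation edge with a matching key, the
# existing object is reused instead of being destroyed and recreated.
_live_recreatables: dict = {}

def creation_key(kind: str, frame: int, x: float, y: float):
    """Hash of the object plus the context around its creation edge."""
    return (kind, frame, round(x, 3), round(y, 3))

def get_or_create(kind: str, frame: int, x: float, y: float, factory):
    key = creation_key(kind, frame, x, y)
    obj = _live_recreatables.get(key)
    if obj is None:            # first pass through this creation edge
        obj = factory()
        _live_recreatables[key] = obj
    return obj                 # re-simulated pass: reuse, don't re-simulate
```

For a non-deterministic sound or particle effect, the reused object keeps playing exactly as it was, which is precisely the behavior a destroy-and-recreate cycle would break.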
On the restore side, parallelizing the work dropped the cost from 2.7 milliseconds single-threaded to 1.3 milliseconds for twice as much data.
The practical lessons: avoid shared ownership of mutable resources, prefer unique pointers for serialization, and lean on memcopy and buffer swaps instead of dynamic pointer fix-up.
The investment in serialization infrastructure alone was roughly one to two man-years.
Particle Systems Require a Dedicated Strategy
Particle systems were the single largest source of performance spikes. The naive approach (i.e., re-simulating every particle on every simulated frame) was far too expensive. And it often didn't produce correct results either, because many particle simulations are non-deterministic: re-simulate them eight times and you get eight different outcomes.
To solve this, NetherRealm built a four-mode re-simulation system.
Most particles ran in predictive mode: simulate once on the confirmed frame and once on the render frame. This produced predictable playback without running on every intermediate frame. Backing this up, a 100 MB runtime cache meant particles never needed to spawn at runtime.
The deeper solution was the “Predictive Particle Re-simulation System” (PPRS), which hashes particle state with deliberately loose constraints. It continuously asks: is the player roughly in the same position? Did they do roughly the same thing in the gameplay script? If so, reuse the cached particle object entirely. Don't simulate it, don't serialize it.
As Stallone noted, "That PPRS thing really sort of saved our bacon."
The PPRS is a useful template beyond just particles. Any system that doesn't need to be perfectly correct is a candidate: loose hashing plus lightweight serialization can get you to "close enough" without full re-simulation cost. The constraint is firm: you cannot do this for anything your game relies on for correctness, only for visual effects and audio.
Benchmarking the Wrong Thing & the Human Factor
For months, NetherRealm measured performance against a scenario where every single frame triggered a seven-frame rollback. That is theoretically possible. Yet it almost never actually happens in real life.
A human button-press cadence is roughly six presses per second, not 60. Seven-frame rollbacks happen only on bad connections, and only when your opponent presses a button at that exact moment. "Turns out this is incredibly rare," Stallone noted. QA picked up the game and told the engineering team it was fantastic, even as the team believed their performance numbers were broken.
Their benchmark was wrong. The real-world test corrected it.
This points to something broader about how latency is understood. Most conversations focus on the network, but the network is only part of the picture. Research cited in this rollback netcode guide breaks down where latency actually comes from: roughly 55% is attributable to the players themselves (human reaction time, input processing, display lag), and only about 31% comes from the network. Optimizing purely for worst-case network scenarios misses the larger contributor to perceived latency.
That said, the 31% that comes from the network is the part you can most directly control through infrastructure. Aether Studios addressed this head-on with Rivals of Aether 2 by deploying game servers as close as possible to players, keeping network latency minimal by default. The result was a measurably strong online experience, as detailed in Edgegap's breakdown of how Rivals of Aether 2 built its now EVO-approved online experience. Minimize the 31% through smart infrastructure, then focus your design energy on the 55%.
Speculative Saves Cut Rollback Count by 30%
Saving only the confirmed frame is the minimum viable approach.
While it is correct, since you can always roll back to a confirmed state, Stallone points out it's suboptimal: because you're not saving intermediate frames, you'll roll back further than necessary even when your speculative prediction for remote input turns out to be right.
NetherRealm added a speculative save system that conditionally saves additional frames when CPU budget allows: save the confirmed frame (mandatory); save the simulation midpoint if you have budget, biased toward the confirmed frame, since the further back a frame is, the more likely its remote input has already been confirmed; and save at the end of the frame if you still have time.
The thresholds are tweakable without a patch, allowing tuning in the live environment. The result was a 30% reduction in total rollback count, primarily by eliminating buffer-exhaustion rollbacks: cases where the confirmed frame was about to fall off the end of the ring buffer, forcing a maximum-depth rollback even though no input had actually diverged.
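The priority order can be sketched as a small budget-driven heuristic. The cost numbers and the bias formula here are my assumptions for illustration; only the ordering (confirmed frame mandatory, then biased midpoint, then end of frame) comes from the talk:

```python
# Hedged sketch of speculative saving: always save the confirmed frame,
# then spend remaining CPU budget on extra save points in priority order.
def frames_to_save(confirmed: int, current: int, budget_ms: float,
                   midpoint_cost_ms: float = 0.4,
                   endpoint_cost_ms: float = 0.4) -> list:
    saves = [confirmed]                           # mandatory, always correct
    # Midpoint biased toward the confirmed end: older frames are more
    # likely to already have confirmed remote input.
    mid = confirmed + (current - confirmed) // 3
    if budget_ms >= midpoint_cost_ms and mid != confirmed:
        saves.append(mid)
        budget_ms -= midpoint_cost_ms
    if budget_ms >= endpoint_cost_ms and current not in saves:
        saves.append(current)                     # end-of-frame save, lowest priority
    return saves
```

Because the costs and thresholds are plain data, they can be tuned in the live environment without shipping a patch, which matches how NetherRealm described operating the system.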
The lesson is that correctness and performance optimization are not the same target. The minimal correct solution and the well-optimized one are very different implementations.
Desync Tooling is a First-Class Concern
Switching to rollback made the game more complex, and that complexity came with more desyncs.
Specifically, not running procedural systems on simulated frames was a major cause: the game code had been quietly relying on IK, physics, and other procedural outputs that were now only running once per tick.
NetherRealm's response was to build desync tooling into the fabric of development.
Because the game is fully deterministic, any match can be replayed offline from just the recorded pad input with the same network cadence, producing the exact same match every time. Developers could inject new desync fenceposts, replay the match as many times as needed, and zero in on the root cause with a single kit. "It was a complete lifesaver," Stallone said. No two-machine setup, no live session required to reproduce the problem.
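The replay workflow reduces to a very simple loop once determinism holds. A sketch, with a stand-in `step` function rather than any real simulation:

```python
# Offline desync reproduction: feed the recorded pad inputs back through
# a deterministic step function. With determinism guaranteed, every
# replay of the same input log yields the identical final state.
def replay(step_fn, initial_state, recorded_inputs):
    state = initial_state
    for frame_inputs in recorded_inputs:
        state = step_fn(state, frame_inputs)
    return state

def step(state, inputs):
    """Stand-in deterministic simulation tick (illustrative only)."""
    return (state * 31 + hash(inputs)) & 0xFFFFFFFF
```

Developers can then add new fencepost checks inside `step_fn` and re-run the same match repeatedly on one kit until the first point of divergence is found.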
They added a utility letting any developer pull all live desyncs from the past day or week with a single click. Nightly soak testing ran online matches on every kit in the building overnight, with desyncs logged to an internal server. A lot of problems got caught before they ever reached players. The final desync rate shipped at under 1.1%, and Stallone noted he was fairly confident the real figure was below 0.01%.
—
This article is based on and cites the original GDC presentation by Michael Stallone, published on YouTube. All rights in the original content are owned by their respective owners.
Written by
the Edgegap Team
