TL;DR: Maintaining clusters in thousands of locations, and asking game clients to ping every one of them and report back, is neither technically nor financially viable.
A short history of matchmaking
Multiplayer games need, well, multiple players for them to happen (duh)! How those players find each other is what a matchmaker is all about. In the early days, matchmaking meant sharing your IP address with friends so you could connect to each other (Hello Doom!). Soon dedicated servers appeared, and games would let you choose from a list of available servers, sometimes showing the latency between you and each server. In the early 2000s, the first matchmakers that automated server selection were created. From there, next-gen matchmakers started to incorporate rules to pool players together based on the game’s context, e.g. players would play against others with a similar character level, the same kind of cars, etc.
There is a balance between applying rules to get a perfect match and the time you will have to wait for another player of your rank. Ask Riot’s League of Legends diamond players, who are used to waiting north of 15 minutes to get an opponent of their rank. One streamer waited for 5 hours and still hadn’t found anyone to play with. Matchmaking delays are caused not only by game policy rules (and a lack of opponents) but also by the number of data centers the matchmaker can leverage.
The typical flow of a matchmaker is the following:
-This assumes the studio uses a centralized matchmaker. Some older systems use one matchmaker per data center, which makes things even worse.
-The game client asks the matchmaker for a list of available data centers
-The game client sends a basic ICMP ping to each of them
-The game client sends a request to the matchmaker for a new game, taking a “ticket” and reporting back the latency it measured to each data center
-The matchmaker puts the ticket in a queue along with a timestamp
-From there, various rules can be created by the game designers, but those typically involve matching players within a certain region
-Once pooled, the players for a given match are allocated a game server that is already running on standby
-Players can play together
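The steps above can be sketched in a few lines of Python. This is only an illustration of the general flow, not any real matchmaker's code: the `Ticket` and `Matchmaker` classes, the data center names, and the server addresses are all made up for the example, and the only "rule" applied is pooling players by their lowest-latency data center.

```python
import time
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Ticket:
    """A player's request for a match (hypothetical structure)."""
    player_id: str
    latencies_ms: dict  # data center name -> latency measured by the client
    created_at: float = field(default_factory=time.time)

    def best_data_center(self):
        # The data center this client measured as closest.
        return min(self.latencies_ms, key=self.latencies_ms.get)


class Matchmaker:
    def __init__(self, standby_servers, match_size=2):
        # standby_servers: data center name -> list of pre-started servers
        self.standby = {dc: list(s) for dc, s in standby_servers.items()}
        self.match_size = match_size
        self.queue = []

    def data_centers(self):
        # The list the game client asks for before pinging.
        return sorted(self.standby)

    def submit(self, ticket):
        # The ticket joins the queue with its timestamp.
        self.queue.append(ticket)

    def run_once(self):
        # Pool tickets by best data center, oldest ticket first.
        pools = defaultdict(list)
        for ticket in sorted(self.queue, key=lambda t: t.created_at):
            pools[ticket.best_data_center()].append(ticket)

        matches = []
        for dc, tickets in pools.items():
            # A match needs enough players and a standby server in that DC.
            while len(tickets) >= self.match_size and self.standby.get(dc):
                group = tickets[:self.match_size]
                del tickets[:self.match_size]
                server = self.standby[dc].pop(0)
                for t in group:
                    self.queue.remove(t)
                matches.append((server, [t.player_id for t in group]))
        return matches


mm = Matchmaker({"us-east": ["10.0.0.1:7777"], "eu-west": ["10.0.1.1:7777"]})
mm.submit(Ticket("alice", {"us-east": 20, "eu-west": 95}, created_at=1.0))
mm.submit(Ticket("bob", {"us-east": 35, "eu-west": 110}, created_at=2.0))
mm.submit(Ticket("carol", {"us-east": 120, "eu-west": 25}, created_at=3.0))
matches = mm.run_once()
```

Note that even in this toy version a standby server has to exist in every data center before anyone asks for it, and carol stays in the queue until another player shows up near eu-west.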
Why should we change?
Every matchmaker is different, and there are other steps, especially around encryption and such. But this list represents the general idea, and it has been like that for over 15 years. This model works well when you have a limited number of data centers. It does, however, have some caveats:
-You need to warm up (pre-start) instances, therefore you need clusters of running game servers in every location (and teams of people to nurse them). You incur a cost for a service that is not even used. These clusters are called “fleets”.
-Game clients need to initially ping each data center. This means that latency is looked at from a self-centric perspective and not at the level of the match as a whole. Some matchmakers will “add” the latencies of the players, but at this stage this is merely looking at the sum of latencies instead of overall experience & fairness.
-Matching time increases, since latency now has to be added as a rule when matching players, instead of focusing on game-centric mechanisms.
-The decision is made up front and cannot be changed once the match has started. If network conditions change, nothing can be done about it.
-Today’s solutions only look at latency, nothing else. Other elements have to be taken into account, like the time of day, previous experiences, the player’s context, and many more.
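To make the "sum of latencies" caveat concrete, here is a small sketch comparing two ways of picking a data center for a two-player match. The function names, the latency numbers, and the 80 ms playability cap are all assumptions chosen for illustration; real fairness rules would be game-specific.

```python
def pick_by_sum(candidates):
    """Pick the data center with the lowest total latency across players."""
    return min(candidates, key=lambda dc: sum(candidates[dc]))


def pick_by_fairness(candidates, max_latency_ms=80):
    """Pick the data center with the smallest latency gap between players.

    Data centers where any player exceeds max_latency_ms (an assumed cap)
    are ignored when at least one playable option exists. Ties on the gap
    are broken by the lower total latency.
    """
    playable = {dc: lat for dc, lat in candidates.items()
                if max(lat) <= max_latency_ms}
    pool = playable or candidates
    return min(pool, key=lambda dc: (max(pool[dc]) - min(pool[dc]),
                                     sum(pool[dc])))


# Latency in ms of each of the two players to each candidate data center.
candidates = {
    "dc-a": [5, 75],   # total 80, but one player has a 70 ms advantage
    "dc-b": [45, 50],  # total 95, both players are on nearly equal footing
}

by_sum = pick_by_sum(candidates)
by_fairness = pick_by_fairness(candidates)
```

With these numbers the sum-based rule picks dc-a, where one player sits almost on top of the server, while the fairness-based rule picks dc-b even though its total latency is higher.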
When this model was initially put together, studios and publishers were buying hardware themselves and hosting machines and network gear in a handful of centralized data centers: a simple, centralized, controlled environment. A few years later, cloud providers started to offer something similar, but without the hassle of managing the infrastructure. They made it easy to have a data center on the other side of the globe without having to invest too much. This was still highly centralized (AWS today offers 22 regions you can deploy to around the world), and came with fast backbones and multiple points of presence to get players quickly into those centralized DCs. Regardless of the speed of those backbones, players still have to get from their house to those DCs over networks and fiber. Cloud providers argue that they cover metropolitan areas in North America with sub-20 ms latency, but how true will this remain as people move away from large cities due to Covid-19 and working from home becomes the norm? If those statements were true, would lag still be the number 1 problem from a gamer’s perspective?
Edge Computing to supplement the public cloud
Looking at what’s coming, a new type of infrastructure is emerging called Edge Computing. Instead of using large server farms, providers are building smaller data centers, closer to users. For example, instead of building a handful of large DCs in the US, they spread a bunch across each state. This process is accelerating as mobile service providers look at those edge nodes to strengthen their 5G networks, and are starting to deploy one at the base of each cell antenna.
This trend is seen around the world. Even public cloud providers realize this could be a threat to their business and have started to partner with carriers to deploy smaller DCs in those networks.
The network can be optimized and a faster path can be found, but nobody can bend the laws of physics. Light in fiber travels at roughly two-thirds of the speed of light in a vacuum, and you will never have a direct fiber between every pair of points on the planet. This has nothing to do with technology; it is common sense. The only remaining solution is to get closer to users.
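A quick back-of-the-envelope calculation shows what physics alone costs. The sketch below assumes a refractive index of about 1.47, a typical figure for silica fiber, and a perfectly straight cable with no routers in between, so real numbers will always be worse. The function name and the distances are picked for illustration.

```python
SPEED_OF_LIGHT_KM_S = 299_792.458  # speed of light in a vacuum
FIBER_REFRACTIVE_INDEX = 1.47      # assumed typical value for silica fiber


def min_round_trip_ms(distance_km):
    """Best-case round-trip time over a straight fiber of this length."""
    speed_in_fiber_km_s = SPEED_OF_LIGHT_KM_S / FIBER_REFRACTIVE_INDEX
    return 2 * distance_km / speed_in_fiber_km_s * 1000


edge_node_rtt = min_round_trip_ms(100)     # a nearby edge location
cross_country_rtt = min_round_trip_ms(4000)  # roughly coast to coast in the US
```

Under these assumptions a server 100 km away costs about 1 ms of round trip before any equipment is involved, while one 4,000 km away costs about 39 ms, and no amount of network tuning can get below that.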
Back to matchmaking. Today’s architecture may work for a handful of data centers, maybe 50 or 60. But the infrastructure market is clearly heading toward thousands of data centers spread around the world, so how will the current model scale? How can you leverage this new capability?
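A rough estimate gives a feel for how client-side pinging grows with the number of locations. Every figure below is an assumption picked for illustration (3 pings per location, 10 locations probed in parallel, 0.2 s between pings, 100,000 players queuing), and the helper names are hypothetical.

```python
import math


def total_probes(players, locations, pings_per_location=3):
    """Number of pings sent if every player probes every location."""
    return players * locations * pings_per_location


def client_probe_seconds(locations, pings_per_location=3,
                         interval_s=0.2, parallelism=10):
    """Time one client spends probing before it can even request a match."""
    batches = math.ceil(locations / parallelism)
    return batches * pings_per_location * interval_s


probes_today = total_probes(100_000, 50)
probes_edge = total_probes(100_000, 5_000)
wait_today = client_probe_seconds(50)
wait_edge = client_probe_seconds(5_000)
```

With these made-up numbers, going from 50 to 5,000 locations takes the probe traffic from 15 million to 1.5 billion pings, and the time each client spends measuring from about 3 seconds to about 5 minutes.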
What should you do?
You can leave things as they are: “today’s service is good enough”, “lag is not a priority”, “my matchmaker already takes latency into account”… The reality is, if you don’t innovate, others will. Leveraging the new capabilities of this evolving infrastructure requires new methods and processes. Today’s matchmakers will have to change, well beyond adding a few DCs. The sheer number of new games launched daily makes it hard for studios and publishers to compete, and the number of players for a given game gets smaller as they spread across many other titles. You will not be able to use thousands of locations by having a cluster in each, relying on client-side code, and hoping for the best. Networks have already been tuned, and the gains left there are negligible compared to what edge computing allows studios to do.
Upcoming infrastructures are complex, each match of your game is different, and nobody controls every network your players will come from. You need a solution that can adapt quickly in real time, learn through AI from what worked and what didn’t, and optimize every match as if it were tailor-made for your players.
If you are serious about your game and its future, reach out and we’ll help improve your players’ experience. Contact us at email@example.com and we will make sure your players have the best experience possible using our cutting-edge technology.