
KRAFTON's Migration to Container-Based Orchestration for PUBG: Battlegrounds - Insights for Multiplayer Game Developers
Infrastructure Modernization Objective: KRAFTON aimed to eliminate operational bottlenecks and improve scalability for hundreds of thousands of concurrent players across global regions by moving from legacy EC2-based game servers to modern container-based game server orchestration, essentially building an internal equivalent of a platform like Edgegap.
Agones Scaling Limitations: KRAFTON's deployment of Agones, the open-source game server orchestration platform, suffered from 15-minute bootstrapping delays during peak scaling events. KRAFTON's engineering team invested heavily in custom solutions to reduce this to 3-4 minutes through container registry proxies and Karpenter adoption. By contrast, developers can use a fully managed solution like Edgegap, which boots game servers from a cold start in 3 seconds on average.
Operational Efficiency Benefits: Containerization reduced environment provisioning to under 5 minutes and enabled self-service capabilities. Teams gained autonomous access to testing infrastructure without DevOps intervention.
Hidden Resource Impact: The modernization journey consumed years of specialized engineering effort that could have been directed toward gameplay improvements. Most studios cannot afford this level of infrastructure investment while maintaining competitive development cycles.
Managed Platform Alternative: Fully managed solutions deliver equivalent, if not better, performance and scaling capabilities without internal complexity. By using a platform like Edgegap, studios get enterprise-grade infrastructure through simple integration rather than multi-year implementation projects.
In this article, we'll cover the key insights for multiplayer game developers from the migration of PlayerUnknown's Battlegrounds from legacy EC2-based servers to container-based game server orchestration.
The talk was given in 2024 by JungHun Kim, then DevOps Team Lead at KRAFTON, alongside Minsuk Kim and Minwook Chun. It highlights their multi-year journey transforming PUBG: Battlegrounds' infrastructure from legacy EC2-based servers to a fully containerized Kubernetes ecosystem.
Their experience offers valuable insights for game developers, though it also reveals the hidden complexities and costs of self-managing such sophisticated infrastructure.
From Legacy Architecture to Modern Containerization
PUBG: Battlegrounds, one of Steam's top three games by current players at the time, operated on a two-component architecture.
The lobby served as the entry point for matchmaking, store operations, and customization, while session servers handled the core 100-player battle royale gameplay across distributed regions worldwide.
JungHun Kim, DevOps Team Lead at KRAFTON, explained their initial challenge: the QA environment creation workflow required 20 minutes to one hour for DevOps teams to provision each new testing environment. This bottleneck severely impacted development velocity and team productivity.
The modernization journey began in November 2018 with session servers, followed by lobby servers in October 2019, and culminated with the adoption of ARM-based servers in June 2023. Each phase addressed specific pain points while introducing new complexities.
KRAFTON's first breakthrough came from recognizing that traditional EC2-based QA environments were resource-intensive and difficult to share. Each environment required numerous AWS services: EC2 instances, CodeDeploy, CloudFront, Elastic Load Balancing, DynamoDB, ElastiCache, OpenSearch, Kinesis, Data Firehose, SQS, VPC, Auto Scaling, Route 53, IAM, and S3.
The solution involved containerizing these services within Amazon EKS, creating shared and dedicated resource categories. This architectural shift reduced QA environment creation time from 20-60 minutes to under 5 minutes. Teams gained self-service capabilities through a web UI accessible to designers, developers, QA engineers, and product managers.
Production Migration and Service Mesh Complexity
Moving production workloads proved far more complex than QA environments.
Unlike QA environments, production systems cannot have their databases flushed, so they require graceful migration strategies and comprehensive rollback plans. The team implemented sophisticated service discovery mechanisms using clusters to synchronize IP addresses, service names, and location data.
However, traffic balancing issues emerged during the migration. Connection pooling created uneven load distribution across services, forcing the team to adopt Istio service mesh for dynamic traffic management, enhanced security, and improved observability.
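The imbalance caused by connection pooling is easy to reproduce in a toy model. The sketch below is an illustration under assumed numbers, not KRAFTON's actual setup: the pod names, connection counts, and request counts are all hypothetical. It compares connections that stay pinned to the backend they were opened against with balancing applied to every request, which is the kind of behavior a service mesh proxy can provide.

```python
from collections import Counter
from itertools import cycle


def pooled_distribution(backends_at_connect, requests_per_conn, num_conns):
    """Each pooled connection stays pinned to the backend chosen when it opened."""
    picker = cycle(backends_at_connect)
    counts = Counter()
    for _ in range(num_conns):
        counts[next(picker)] += requests_per_conn
    return counts


def per_request_distribution(backends_now, total_requests):
    """Every request is balanced individually across the current backends."""
    picker = cycle(backends_now)
    counts = Counter()
    for _ in range(total_requests):
        counts[next(picker)] += 1
    return counts


# Hypothetical scenario: 4 pooled connections were opened while only 2 pods
# existed, then the service scaled out to 4 pods.
old_pods = ["pod-a", "pod-b"]
all_pods = ["pod-a", "pod-b", "pod-c", "pod-d"]

pooled = pooled_distribution(old_pods, requests_per_conn=100, num_conns=4)
balanced = per_request_distribution(all_pods, total_requests=400)

for pod in all_pods:
    print(f"{pod}: pooled={pooled[pod]:3d}  per-request={balanced[pod]:3d}")
```

In this model the two new pods receive no traffic at all through the pooled connections, while per-request balancing spreads the same 400 requests evenly.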
Session Server Orchestration: Agones and its Limitations
Session servers presented unique challenges compared to stateless lobby services. Each session server runs Unreal Engine dedicated servers, maintaining game state without persistent storage. KRAFTON needed to scale to hundreds of thousands of concurrent sessions while maintaining consistent response times and cost efficiency.
They evaluated Agones, an open-source multiplayer game server orchestration platform built on Kubernetes. Agones manages game servers like Kubernetes deployments, with one GameServer pod per game instance. Fleets manage groups of GameServers, while FleetAutoscalers handle capacity management. The architecture enabled sophisticated bin-packing across shared Kubernetes clusters with multiple environments coexisting within the same cluster using namespaces.
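To make the capacity-management idea concrete, here is a minimal sketch of a buffer-style scaling rule, loosely modeled on the concept of keeping a fixed number of ready servers ahead of demand. It is not Agones code: the function names, parameters, and the assumed packing density of 8 game servers per node are hypothetical and chosen only for illustration.

```python
import math


def desired_replicas(allocated, buffer_size, min_replicas, max_replicas):
    """Keep `buffer_size` ready servers on top of those already running matches,
    clamped to the configured fleet limits."""
    target = allocated + buffer_size
    return max(min_replicas, min(target, max_replicas))


def nodes_needed(replicas, servers_per_node):
    """Nodes required if game servers are packed densely onto each node."""
    return math.ceil(replicas / servers_per_node)


# Hypothetical fleet: 120 matches in progress, 10 spare servers wanted.
fleet_size = desired_replicas(allocated=120, buffer_size=10,
                              min_replicas=20, max_replicas=500)
print(fleet_size)                                     # 130
print(nodes_needed(fleet_size, servers_per_node=8))   # 17
```

The clamp matters at both ends: with no matches running the fleet still holds the minimum of 20 servers, and a spike past the maximum is capped at 500 rather than scaling without bound.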
While KRAFTON achieved impressive technical results, its Agones setup had a critical flaw according to JungHun Kim: a 15-minute server bootstrapping time that became a bottleneck during scaling events. Breaking down the timeline revealed multiple compounding delays: instance provisioning (1-3 minutes), instance bootstrapping (2-3 minutes), and pod provisioning (5-10 minutes). These delays created scaling challenges during peak usage periods when players needed immediate server availability.
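Because these stages run one after another, their durations add up. The short calculation below uses only the ranges quoted above and shows how they combine into the end-to-end figure. Treating the stages as strictly sequential is an assumption made for this sketch.

```python
# Stage durations in minutes as (low, high), taken from the ranges quoted above.
stages = {
    "instance provisioning": (1, 3),
    "instance bootstrapping": (2, 3),
    "pod provisioning": (5, 10),
}

# Assumption: stages are strictly sequential, so the bounds simply add.
best = sum(low for low, _ in stages.values())
worst = sum(high for _, high in stages.values())
print(f"end-to-end: {best}-{worst} minutes")  # end-to-end: 8-16 minutes

# Which stage dominates the worst case?
slowest = max(stages, key=lambda name: stages[name][1])
print(f"largest contributor: {slowest}")  # largest contributor: pod provisioning
```

The 8-16 minute range is consistent with the roughly 15-minute figure, and pod provisioning is the largest single contributor.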
KRAFTON's solutions required substantial engineering investment. They adopted Karpenter to reduce EC2 launch delays and implemented a container registry proxy with Harbor, S3 caching, and CloudFront distribution. These optimizations reduced bootstrapping from 15+ minutes to 3-4 minutes but demanded deep Kubernetes expertise and ongoing maintenance that most studios cannot sustain.
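The general idea behind a caching registry proxy can be sketched as a pull-through cache: the first request for an image layer goes to the origin, and every later request is served locally. The code below is a simplified in-memory illustration, not Harbor's API or KRAFTON's implementation. The class, the `slow_upstream` stand-in, and the layer digests are all hypothetical.

```python
class PullThroughCache:
    """Serve layers from a local cache, fetching from upstream only on a miss."""

    def __init__(self, fetch_upstream):
        self._fetch_upstream = fetch_upstream
        self._layers = {}
        self.hits = 0
        self.misses = 0

    def get_layer(self, digest):
        if digest in self._layers:
            self.hits += 1
            return self._layers[digest]
        self.misses += 1
        data = self._fetch_upstream(digest)
        self._layers[digest] = data
        return data


def slow_upstream(digest):
    # Hypothetical stand-in for a slow pull from the origin registry.
    return f"layer-bytes-for-{digest}".encode()


cache = PullThroughCache(slow_upstream)
image_layers = ["sha256:aaa", "sha256:bbb", "sha256:ccc"]  # made-up digests

# 50 newly launched nodes each pull the same 3-layer game server image.
for _node in range(50):
    for digest in image_layers:
        cache.get_layer(digest)

print(cache.misses, cache.hits)  # 3 147
```

During a scale-out many nodes pull the same image at once, so in this model only 3 of 150 layer requests reach the origin.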
The Hidden Costs of Self-Management
KRAFTON's technical achievements came at considerable cost and complexity.
Managing Karpenter as an open-source solution required dedicated expertise that most studios lack. The team needed specialized knowledge across multiple domains: Kubernetes networking, Istio service mesh configuration, game server management, container registry optimization, and multi-architecture build systems.
KRAFTON's dedicated DevOps team could manage this complexity, but smaller studios face significant resource constraints, and few can dedicate enough people to mastering these technologies while staying focused on game development.
The infrastructure journey consumed years of engineering effort that could have been directed toward gameplay improvements, new features, or additional content. This represents a significant opportunity cost for game development studios.
A Simpler Path Forward
Rather than replicating KRAFTON's complex infrastructure journey, game developers should consider fully managed solutions like Edgegap's game server orchestration platform. Edgegap provides instant server deployment with 3-second boot times from zero, eliminating the 15-minute delays that plagued KRAFTON's Agones implementation.
The platform auto-scales to 14 million concurrent users within 60 minutes, far exceeding most game requirements. Most importantly, it removes the need for dedicated DevOps teams to manage Kubernetes, container registries, service meshes, and scaling algorithms.
Studios can focus their engineering resources on game development rather than infrastructure management, while still achieving the performance and scale benefits that KRAFTON worked years to implement.
Does PUBG Have Dedicated Servers?
Yes, PUBG: Battlegrounds operates entirely on dedicated servers for both lobby and session components. Unlike peer-to-peer networking, dedicated servers provide authoritative game state management, anti-cheat capabilities, and consistent performance regardless of individual player connections.
The session servers run Unreal Engine dedicated server instances, each handling one 100-player match. These stateful servers manage all game logic, physics calculations, and player interactions without relying on persistent storage between matches. Lobby servers use stateless .NET microservices connected to managed storage backends, enabling horizontal scaling and fault tolerance.
Where Are PUBG Servers Located?
PUBG operates a globally distributed server infrastructure with session servers deployed across multiple AWS regions. The lobby services are concentrated in us-east-1, which serves as a central hub for user management and matchmaking operations.
Session servers deploy dynamically based on player demand and geographic distribution. This approach minimizes latency by placing game servers closer to player populations, improving the competitive experience in a game where milliseconds matter. The geographic distribution strategy balances cost efficiency with performance requirements while maintaining centralized lobby operations.
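One simple way to reason about latency-aware placement is to pick the region with the lowest average round-trip time for the players in a match. The sketch below is a hypothetical illustration, not PUBG's placement logic: the latency figures are invented, and apart from us-east-1 the regions are example AWS region names, not ones the talk confirms.

```python
from statistics import mean


def best_region(latency_ms_by_region):
    """Return the region with the lowest mean round-trip time for the group."""
    return min(latency_ms_by_region,
               key=lambda region: mean(latency_ms_by_region[region]))


# Hypothetical round-trip times (ms) from four players to each candidate region.
match_latencies = {
    "us-east-1": [180, 175, 190, 40],
    "ap-northeast-2": [35, 30, 45, 210],
    "eu-central-1": [250, 240, 260, 110],
}

print(best_region(match_latencies))  # ap-northeast-2
```

A mean favors the majority of players. A real placement policy might instead cap the worst-case latency or weigh server cost per region, which this sketch deliberately leaves out.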
AWS still requires game developers to purchase locations individually to achieve global coverage. Game developers who need a modern alternative can instead use Edgegap's orchestration platform, which taps into the world's largest, and first, regionless network. This lets multiplayer studios deploy their game servers to all 615+ locations worldwide at a single price, which enables Edgegap to reduce latency by 58% on average and deliver sub-50ms latency to 78% of a game's player base.
Conclusion
Game developers should carefully evaluate whether building and managing complex Kubernetes-based infrastructure aligns with their core competencies and resource availability.
While KRAFTON achieved impressive results, its multi-year journey consumed significant engineering effort that smaller studios cannot afford to divert from core game development. Most game studios lack the extensive resources and dedicated DevOps expertise that KRAFTON needed to build and maintain complex Kubernetes infrastructure.
Fully managed platforms like Edgegap deliver the same performance benefits (instant scaling, global distribution, and low latency) without requiring specialized infrastructure teams or years of implementation work. Studios can achieve KRAFTON-level infrastructure capabilities through simple integration while focusing their talented developers on creating exceptional gaming experiences rather than managing container orchestration systems.
---
This article is based on and cites the original article by KRAFTON published here. All rights in the original content are owned by their respective owners.
Written by
the Edgegap Team
