How to survive a game release - a guest post by Stefan Marx
The release of a new game is always an exciting and often very stressful affair - for fans and developers alike. Fans have waited years for a new title in their favorite series and can't wait to explore and discuss every new feature, every level, and every corner of the game. The pressure on the development teams before such an event is correspondingly high. The production of large AAA titles such as EA DICE's famous Battlefield series can be compared to that of elaborate Hollywood films: for years, hundreds of people pour all their creativity and passion, and often a budget of hundreds of millions of dollars, into the development of a game. Naturally, developers and publishers want to ensure that everything works flawlessly in the crucial first hours after launch, when huge numbers of players flock to the servers.
The right foundation
For a game to run smoothly, it needs a number of important backend services. For one, players need to be able to authenticate, reset their passwords and manage their subscriptions and inventory. If a player wants to join a game, matchmaking has to work: thousands of users have to be assigned to a match on a free server at the same time, and if no suitable one exists, a new match has to be created. In any case, it must be ensured that players of a similar level and skill are brought together, so that the game remains exciting and fun for everyone until the end.
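To make the matchmaking idea concrete, here is a minimal sketch of skill-based grouping: sort the waiting players by rating and only fill a match with players whose ratings lie close together. The match size, skill spread and class names are illustrative assumptions, not DICE's actual matchmaking logic.

```python
from dataclasses import dataclass

MATCH_SIZE = 64          # players per match (illustrative, Battlefield-style)
MAX_SKILL_SPREAD = 200   # only group players whose ratings are this close together

@dataclass
class Player:
    name: str
    skill: int           # e.g. an Elo-style rating

def form_matches(queue):
    """Group waiting players into matches of similar skill.

    Players are sorted by rating so neighbours have comparable skill.
    A match is emitted only when enough close-rated players are available;
    everyone else stays in the queue for the next pass (or until a new
    server is spun up for them).
    """
    matches, group = [], []
    for player in sorted(queue, key=lambda p: p.skill):
        if group and player.skill - group[0].skill > MAX_SKILL_SPREAD:
            group = []                 # spread too large, start a fresh group
        group.append(player)
        if len(group) == MATCH_SIZE:
            matches.append(group)
            group = []
    return matches
```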
Statistics are constantly being created and updated during play. In Battlefield 5, around 10,000 counters run in the background per user and track almost everything: How long does a player sit in a vehicle? How often and how fast do they shoot? What do they hit, and where? From the collected data, numerous rankings are created, both global and local, for the most diverse achievements and abilities - because the Battlefield players' talents are as varied as the players themselves.
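As a toy illustration of such per-player bookkeeping, the sketch below keeps a set of counters per player and derives a ranking from one of them. The counter names and the ranking metric are invented for the example; the real telemetry is far more extensive.

```python
from collections import Counter, defaultdict

# One Counter per player; keys such as "shots_fired" are purely illustrative,
# not the real telemetry names used by DICE.
player_stats = defaultdict(Counter)

def record_event(player_id, counter, amount=1):
    """Increment one of the many per-player counters."""
    player_stats[player_id][counter] += amount

def leaderboard(counter, top=10):
    """Build a ranking for a single counter, e.g. 'shots_fired'."""
    ranked = sorted(player_stats.items(),
                    key=lambda item: item[1][counter],
                    reverse=True)
    return [(player, stats[counter]) for player, stats in ranked[:top]]

# Example usage
record_event("alice", "shots_fired", 12)
record_event("alice", "seconds_in_vehicle", 95)
record_event("bob", "shots_fired", 30)
print(leaderboard("shots_fired"))   # [('bob', 30), ('alice', 12)]
```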
The author
Stefan Marx is Director Platform Strategy for the EMEA region at the cloud monitoring provider Datadog. Marx has been working in IT development and consulting for over 20 years. In recent years he has worked with various architectures and technologies such as Java Enterprise Systems and specialized web applications. His main focus is the planning, construction and operation of applications with a view to the requirements and problems behind specific IT projects.
Social features also play a role for the Battlefield developers: they give connected users the ability to keep an eye on their friends' gaming activity so they can join them on the same server. Various internal services that are not aimed directly at players must also function smoothly. For example, the level design teams are responsible for ensuring that a map is not completely symmetrical, because that would look unnatural; nevertheless, both sides must be given the same opportunities. For this purpose, heatmaps are created that show where players gather during a round, where there are good spots for an ambush, and whether players are finding routes that were not intended by development and design.
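Conceptually, such a heatmap is just a 2D histogram over sampled player positions. The following sketch shows the idea with NumPy; the map size, grid resolution and hotspot threshold are made-up values.

```python
import numpy as np

MAP_SIZE = 2048   # illustrative map dimensions in world units
BINS = 64         # coarse 64 x 64 grid over the map

def build_heatmap(positions):
    """Count how often players were seen in each cell of a coarse grid."""
    xs = [p[0] for p in positions]
    ys = [p[1] for p in positions]
    heatmap, _, _ = np.histogram2d(xs, ys, bins=BINS,
                                   range=[[0, MAP_SIZE], [0, MAP_SIZE]])
    return heatmap

# Hotspots (good ambush positions or unintended shortcuts) show up as cells
# with unusually high counts; random data stands in for real telemetry here.
positions = np.random.randint(0, MAP_SIZE, size=(10_000, 2))
hotspots = np.argwhere(build_heatmap(positions) > 5)
```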
Many of these processes run automatically, which is an enormous help for developers. Nevertheless, all of them must be constantly monitored, both during development and after the game's release. It is important not only to have people on the team who are familiar with the required monitoring tools, but also to make sure that everything is understandable and comprehensible for everyone. Many of these processes are essential to the functioning of the game or to the gaming experience, so a failure must be recognized or prevented at an early stage.
Trial and error
To ensure that everything goes according to plan at the crucial moment, careful testing must be carried out ahead of the market launch. Before a game is finally released, there are pre-launch events: a closed alpha and an open beta test phase. During these phases, of course, the complete game is not made available, only a few selected features and game modes. The developers let users play their new game so they can observe how features perform, what is happening on the backends and whether players behave as expected. To participate in the closed alpha, players must be explicitly invited. With a few hundred thousand participants, the alpha phase is the smaller of the two and primarily serves to verify that all game functions do what they are supposed to. The beta, on the other hand, is specifically designed for stress testing and is much larger. Not quite as many players flock to the servers as at launch, but still enough to check whether the backends can withstand the coming onslaught.
The beta phase is often more stressful than the actual launch, because during this time all aspects of the game are put under close scrutiny. The developers look specifically for issues that affect the game's performance and the player experience. This includes technical aspects such as server stability, latency and the aforementioned matchmaking, as well as the gameplay itself: weapon balancing, game progression and other aspects that need to be tuned to make the game as much fun as possible. During the test phases, systems are also deliberately taken offline and problems are created intentionally in order to test how well the game can be operated under failure conditions - an approach known as chaos engineering. The developers also try to have metrics in place as early as possible: measurements that map properties of the software to numerical values, creating the comparison and evaluation options that can be essential for tracking down errors.
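In its simplest form, chaos engineering can mean wrapping a backend call so that it occasionally fails or responds slowly, then checking that the rest of the system copes. The sketch below illustrates the principle; the failure rates, latencies and function names are invented and not taken from DICE's tooling.

```python
import functools
import random
import time

def chaos(failure_rate=0.05, max_extra_latency=2.0):
    """Decorator that randomly injects failures and latency into a call.

    Intended only for test environments (e.g. the beta), to verify that
    callers retry, time out and degrade gracefully when a backend misbehaves.
    """
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if random.random() < failure_rate:
                raise ConnectionError("chaos: simulated backend outage")
            time.sleep(random.uniform(0, max_extra_latency))  # simulated slowdown
            return func(*args, **kwargs)
        return wrapper
    return decorator

@chaos(failure_rate=0.1)
def fetch_inventory(player_id):
    # Placeholder for the real backend call.
    return {"player": player_id, "items": []}
```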
Vigilance around the clock
To diagnose and overcome possible problems in a timely manner, a log management platform is required that lets developers view the automatically kept record of all actions of the processes on a system and compare it with others. The tools for collecting and searching logs used to be very rudimentary: when responding to a problem, engineers tried to find a server that held more information. This method was cumbersome and time-consuming, and it led most people to avoid logs outside of troubleshooting. As a result, few people in the company knew how to access and use log data.
There is a great advantage in uniting metrics and logs in one monitoring platform: it allows deeper insights into the performance of a game and makes it possible to diagnose problems that need more detail to troubleshoot. Thanks to their modern tooling, the DICE developers were able to search billions of log entries during live operation and quickly find the information relevant to them. This let them troubleshoot proactively rather than just reactively, finding bugs long before they caused problems for players. Developers also began cleaning up the now far more accessible log files and enriching them with contextual data and fields that make a huge difference in troubleshooting. All of this has meant that logs are now often the first source for assessing game performance, especially when millions of players are involved. The troubleshooting workflow has become more efficient, and problems that used to take weeks to track down can now be identified and resolved in a matter of hours. If all of this already happens in the beta phase, nothing stands in the way of a technically uneventful launch.
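Enriching logs with contextual fields usually means emitting structured entries (for example, one JSON object per line) instead of free text, so the log platform can filter on fields such as a match ID or region. A minimal, standard-library-only sketch might look like this; the field names are assumptions for illustration.

```python
import json
import logging
import sys

class JsonFormatter(logging.Formatter):
    """Render each record as one JSON line so a log platform can index its fields."""
    def format(self, record):
        entry = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Contextual fields passed via `extra=` end up as attributes on the record.
        for key in ("match_id", "region", "server"):
            if hasattr(record, key):
                entry[key] = getattr(record, key)
        return json.dumps(entry)

handler = logging.StreamHandler(sys.stdout)
handler.setFormatter(JsonFormatter())
log = logging.getLogger("matchmaking")
log.addHandler(handler)
log.setLevel(logging.INFO)

log.info("match created",
         extra={"match_id": "m-4711", "region": "eu-west", "server": "gs-042"})
```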
Stress tests for developers
Production is nearing its end, a final release date is set, and the marketing team is doing its best to spread this date everywhere and stir up the fans' excitement, which will peak on launch day. With AAA games like Battlefield 5, traffic at release jumps from zero to the highest value the game will ever experience - and that can be up to 30 million users. It is therefore the task of the software teams to ensure that each of these 30 million users gets onto a game server as quickly as possible and that the game runs smoothly for everyone.
People play excessively in the weeks right after launch; then the number of players slowly declines again. The load on the servers decreases, which is of course good for the development team's stress level. However, they are then confronted with a new problem: to cope with the high load of the first few weeks, server capacity was expanded - just to be on the safe side - so that more is available than is ultimately needed. The number of unused servers that still generate costs keeps rising. Of course, there is also the option of displaying an error message as soon as the servers are full, or of putting players in a queue - a stopgap that satisfies neither players nor developers.
When it comes to running an immense number of game servers for several weeks as cheaply as possible, the cloud is a fantastic option. Amazon, Google and co. cannot provide the complete number of servers required on the spot, but still enough to reduce costs significantly. EA DICE developed special software for operating its game servers early on, which spins game servers all over the world up and down depending on the capacity currently required.
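At its core, such scaling software is a control loop: compare current demand with running capacity and ask the cloud provider for more or fewer instances. The sketch below shows only that decision logic; the player counts, headroom factor and the injected start/stop callbacks are hypothetical, not EA DICE's actual software.

```python
import math

PLAYERS_PER_SERVER = 64   # illustrative Battlefield-style match size
HEADROOM = 1.2            # keep roughly 20% spare capacity for sudden spikes

def desired_server_count(players_online, players_in_queue):
    """How many game servers a region should be running right now."""
    demand = players_online + players_in_queue
    return max(1, math.ceil(demand * HEADROOM / PLAYERS_PER_SERVER))

def reconcile(region, running_servers, players_online, players_in_queue,
              start_servers, stop_idle_servers):
    """One pass of the scale-up/scale-down loop.

    `start_servers` and `stop_idle_servers` are hypothetical callbacks that
    would talk to the cloud provider's API; injecting them keeps the
    decision logic testable.
    """
    target = desired_server_count(players_online, players_in_queue)
    if target > running_servers:
        start_servers(region, target - running_servers)
    elif target < running_servers:
        stop_idle_servers(region, running_servers - target)  # only stop servers with no active match
    return target

# Example: 100,000 players online and 20,000 waiting in one region.
reconcile("eu-west", running_servers=1500,
          players_online=100_000, players_in_queue=20_000,
          start_servers=lambda r, n: print(f"{r}: start {n} servers"),
          stop_idle_servers=lambda r, n: print(f"{r}: stop {n} servers"))
```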
When several million players want to play at a game's release, huge numbers of servers have to be operated and monitored. During the test phases, stress tests were used to determine how many game servers can run on the same host. In particular, frame rates and server response times are monitored closely. If these drop, the game can start to stutter and player actions may have no effect - for example, when an already wounded opponent is shot but nothing happens. This is often because the packets containing the necessary information about the hit never arrived at the server: the server was overloaded and had to "throw away" packets. This can be incredibly frustrating for players.
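A simplified picture of what such a stress test watches: every simulation tick of a game server has a fixed time budget, and once ticks start exceeding it, the host is running more servers than it can handle and packets get dropped. The tick rate, thresholds and helper callbacks in this sketch are illustrative assumptions.

```python
import random
import time

TICK_RATE = 60                     # illustrative simulation ticks per second
TICK_BUDGET = 1.0 / TICK_RATE      # ~16.7 ms per tick

def run_server_loop(simulate_tick, report_metric, max_ticks=1_000):
    """Run the simulation loop and count ticks that blow their time budget."""
    overloaded_ticks = 0
    for _ in range(max_ticks):
        start = time.perf_counter()
        simulate_tick()                                  # process input, physics, hit detection ...
        elapsed = time.perf_counter() - start
        report_metric("tick_duration_seconds", elapsed)  # feed the monitoring system
        if elapsed > TICK_BUDGET:
            overloaded_ticks += 1                        # host too busy for this many servers
    return overloaded_ticks

# Stand-in tick that occasionally takes too long, as on an overloaded host.
slow = run_server_loop(lambda: time.sleep(random.choice([0.001, 0.02])),
                       lambda name, value: None, max_ticks=200)
print(f"{slow} of 200 ticks exceeded the {TICK_BUDGET * 1000:.1f} ms budget")
```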
The key to success
So how do you survive a game release? Primarily through good preparation, of course, because in the crucial 48 hours there is not much time to react to problems. Running pre-launch events and stress tests, designing systems to be as resilient as possible, and having more server capacity available than is ultimately needed are decisive for a smooth launch. It must also be ensured in advance that all the metrics required to monitor everything precisely are available - it is always better to collect too much data than too little.