From Maximilian Knaack:
When should you start testing a game? It is a question often asked, and often answered with vague statements like “as early as possible”.
While in theory this sounds great, in reality, it does not help much.
In this article we will go over the first testers of Chonky – From Breakfast to Domination, what the challenges were and what we got out of it.
Too early to test
Chonky was created through a slow and quite standardized process, from paper prototypes to a small map in Unreal Engine where basic movement could be tested.
Then, one after the other, features that gave the player more and more abilities were added.
Soon, family members and friends would ask if they could play around for a bit and give us some feedback.
This was of course no dedicated user testing, but it provided the first real input from people outside of the project team.
Sadly, that input did not hold much value.
A game at this stage simply does not have enough character for any big underlying problems to be identified. Most of the feedback was centered around visuals or systems that were not final anyway.
Family members and friends for the most part are not ideal for giving feedback, as they shy away from telling uncomfortable truths.
Unfortunately, these are the most important parts of feedback.
The vertical slice
After we created more sophisticated iterations of our game, we had to make a decision. We had a due date (as this was a student project at the time) and we could either deliver a finished product or a vertical slice of a game that we would continue to work on.
For Chonky, we chose the latter option.
Fixed due dates like that are an unwanted blessing. They help to focus the attention on the most important things, while cutting away at anything less vital.
We finished our vertical slice more or less on time, even though it required us to crunch.
For those who are unfamiliar with that term, it simply means focusing all your efforts and working overtime to finish a game.
As with any productive crunch time, the output during those days was staggering. This can be all too alluring for project leaders like myself, but we have to keep in mind that every crunch is followed by downtime, during which output drops quite substantially.
Finally, we arrived at our first dedicated user tests, lasting an entire week.
No good user testing can happen without a feedback form.
As annoying as it is to create (and, from your testers’ point of view, to fill out), it is vital for the standardization of the feedback.
Our feedback form went through two iterations.
The first was created to the best of our knowledge, and the second after we learned where that knowledge was lacking.
Even though it makes the analysis of your findings that much harder when you change your feedback form halfway through testing, it will be worth it.
Better to have half of your findings be more useful and accurate than to have all of them neatly combined.
The bigger picture
Of course, there are many things you have to look out for when testing with different groups of people.
For example, our results varied wildly between university students that came out of interest and high school students that were asked to play our game during one of their classes.
We also found some ways to make our life harder than it needed to be.
Not-ToDo’s when user testing
Never just listen to your first few testers and think their opinions will be shared by later testers as well.
Our first testers, for example, wanted other controls for walking. Thinking that this opinion would be shared by later testers as well, we quickly integrated those new controls.
Well, after these first testers, nobody ever asked for those controls again.
One way we actually lowered the quality of our test results was by updating the game on a daily basis, trying to incorporate the findings of the day before.
This sounded great in theory and gave us a nice chart that showed how the opinion about our game improved over the course of the testing week, but it made the results of the earlier tests pretty much useless, because each day's results described a different build.
Last but not least, the aforementioned different types of testers were also an issue. Different groups had different interests, which should have been looked at separately. We, of course, mixed all feedback forms together, and even though that gave us nice overall graphs, the details varied wildly and did not allow for a coherent picture.
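As a minimal sketch of what looking at the groups separately could look like, assume each feedback form is reduced to a tester group, the build that was played, and a rating from 1 to 5. The field layout, group names and numbers below are invented for illustration and are not data from the Chonky tests.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical feedback records: (tester_group, build, rating from 1 to 5).
# All values are made up for illustration only.
responses = [
    ("university", "day1", 4),
    ("university", "day1", 5),
    ("university", "day2", 4),
    ("high_school", "day1", 2),
    ("high_school", "day2", 3),
    ("high_school", "day2", 2),
]


def mean_by(records, key):
    """Average the ratings of all records that share the same key."""
    buckets = defaultdict(list)
    for record in records:
        buckets[key(record)].append(record[2])
    return {k: mean(v) for k, v in buckets.items()}


# One pooled number hides how differently the groups rated the game.
overall = mean(r[2] for r in responses)

# Splitting by group, and by group and build, keeps those differences visible.
by_group = mean_by(responses, key=lambda r: r[0])
by_group_and_build = mean_by(responses, key=lambda r: (r[0], r[1]))

print(f"overall: {overall:.2f}")
for group, rating in by_group.items():
    print(f"{group}: {rating:.2f}")
```

With these made-up numbers the pooled average lands in the middle, while the two groups sit far apart. Keeping the build in the key also means results from different daily builds are not silently merged.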
Cleaning, clearing, categorizing
If you find yourself in a situation similar to ours, try to get the best out of it. Make as many charts and diagrams as you can, try to find as many connections between preferences as you can, and find the gold crumbs beneath the rubble.
What at first was a big pile of random feedback and information slowly turned out to harbour some important and interesting findings.
Testing at the stage of a vertical slice has enormous value. It gives the opportunity for feedback early enough that the game can still be changed fundamentally. For us, some of the most important findings were:
- The movement of our characters felt off
- An alternative way of healing was needed
- We needed some sort of adjustable difficulty
- We needed to calculate our defense stats differently
- Our damage and item system was too complex
Results like these are extremely valuable for developing a game. Things like the character movement can be tinkered with at a later point. Similarly, difficulty settings, as well as different methods for healing, can always be added. But changing our stat calculations or our damage and item system would have broken our entire game and most likely delayed our release had we discovered these problems at a later point.
Not all feedback has value
We also realized that not all feedback holds value.
For example, the group of testers that asked for auto health regeneration was also primarily interested in faster-paced games like online shooters. Therefore, they were not likely to buy our game anyway, and tailoring it towards them would not be beneficial to us.
We had to understand why certain feedback was given and stop blindly adding everything that was requested (like that useless alternative movement that nobody wanted afterwards).
- Test early but not before your game has a clear identity
- Do not change your game while testing
- Understand why feedback is given and don’t just blindly follow it