In software development there is a pitfall known as premature optimization: trying to make code more efficient before its speed has even been proven to be a problem. I have been careful not to fall into this trap and I think I have done a reasonably good job of it. So good, in fact, that this past week I noticed the game was running under 60 fps. Next thing I knew I was waist deep in a code profiler, the debugger, and a paper notepad filled with ideas. In the end I came out victorious with three beasts slain at my feet and Kelley wearing a tightly corseted gown, gratefully clinging to my muscular arm.
The first of the three, the lighting system, was a tricky one. The lighting is made up of thousands of tiny squares called subtiles. Each square is 4×4 pixels, which means there are 16 of them in a single tile. So even at a resolution of only 1280×1024 there are 81,920 subtiles on screen. That’s a lot! I have some awesome algorithms that calculate the lighting at blazing speeds, but when it came to rendering the subtiles, there wasn’t much left that I could optimize.
After an arduous journey to the top of a mountain, I sat in meditation for 3 days until the solution was given to me in a vision. A great many of the tiles have a single solid value: either pure black or pure white. So I now represent the light for such a tile as one solid tile, and only break it down into subtiles when the tile needs multiple values. After a day of rewriting a large portion of the lighting system to handle this concept, I was greeted with a constant 60 fps, up from a lowly ~40 fps. The first beast was slain and on to the next I galumphed!
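For the curious, here is a rough sketch of the solid-tile idea in Python. It is not the game's actual code: the names `collapse_tile` and `count_draws`, the 0 to 255 light range, and the idea of counting one draw per quad are all assumptions made for illustration.

```python
SUBTILES_PER_TILE = 16  # 16 subtiles per tile, as described in the post


def collapse_tile(subtile_light):
    """Return one light value if every subtile matches, else the full list."""
    first = subtile_light[0]
    if all(value == first for value in subtile_light):
        return first            # solid tile: a single value is enough
    return list(subtile_light)  # mixed tile: keep the per-subtile values


def count_draws(tiles):
    """Count quads to render: 1 per solid tile, 16 per mixed tile."""
    return sum(1 if isinstance(tile, int) else len(tile) for tile in tiles)


# Hypothetical light values: 0 = pure black, 255 = pure white.
dark = collapse_tile([0] * SUBTILES_PER_TILE)
lit = collapse_tile([255] * SUBTILES_PER_TILE)
edge = collapse_tile([0] * 8 + [128] * 4 + [255] * 4)

print(count_draws([dark, lit, edge]))  # 1 + 1 + 16 = 18, instead of 48
```

The win depends entirely on how many tiles are solid. A screen full of light edges would collapse very little.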
The second beast was one I had been avoiding for a while for no real reason: tile blending. Tile blending is a cool concept I came up with a while back that helps smooth the transition from one tile type to another.
Each tile type has a priority. When a tile is drawn, it checks its four direct neighbors, and if it has a higher priority than a neighbor it draws slightly over that neighbor. The problem was that I was comparing the tile priorities every frame, which dragged down the frame rate.
Fortunately a profound solution was not required this time. I traded memory for CPU. Now every tile caches whether it has priority over each neighbor, using a bit-flag enum stored in a single byte. Since I only have a fraction of the world loaded at a time, the extra memory is not such a bad thing. And on I went to the final beast.
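A minimal sketch of that cache, written in Python rather than the game's own code. The `BlendOver` enum, the tile names, and their priorities are hypothetical. The only parts taken from the post are the four direct neighbors and the one-byte flag.

```python
from enum import IntFlag


class BlendOver(IntFlag):
    """Which neighbors this tile draws over; only 4 bits of a byte are used."""
    NONE = 0
    UP = 1
    RIGHT = 2
    DOWN = 4
    LEFT = 8


# Hypothetical tile types and priorities, for illustration only.
PRIORITY = {"sky": 0, "dirt": 1, "stone": 2}

# (dx, dy) offsets for the four direct neighbors; y grows downward.
OFFSETS = {
    BlendOver.UP: (0, -1),
    BlendOver.RIGHT: (1, 0),
    BlendOver.DOWN: (0, 1),
    BlendOver.LEFT: (-1, 0),
}


def compute_blend_flags(grid, x, y):
    """Compare priorities once and pack the four results into one flag value."""
    flags = BlendOver.NONE
    mine = PRIORITY[grid[y][x]]
    for flag, (dx, dy) in OFFSETS.items():
        nx, ny = x + dx, y + dy
        in_bounds = 0 <= ny < len(grid) and 0 <= nx < len(grid[ny])
        if in_bounds and mine > PRIORITY[grid[ny][nx]]:
            flags |= flag
    return flags


grid = [
    ["sky", "sky", "sky"],
    ["dirt", "stone", "dirt"],
    ["stone", "stone", "stone"],
]

# Computed when the tile or a neighbor changes, then reused every frame.
cache = compute_blend_flags(grid, 1, 1)

# The draw loop only tests bits instead of comparing priorities.
print(bool(cache & BlendOver.UP))    # True: stone draws over the sky above
print(bool(cache & BlendOver.DOWN))  # False: same tile type below
```

The catch with any cache like this is invalidation: whenever a tile changes, its own flags and those of its four neighbors have to be recomputed.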
The last of the beasts was the world generation. We’ve been building onto the world generation for a while now (and it’s still in progress), and it had started to get way too slow. It was taking nearly 2 minutes to generate just a small world. So I unsheathed my mighty brain powers and did what I do best: code!
I quickly found that this beast was no ordinary beast. It was a hydra with three heads that I had to cut through. Before getting into the gory details, I’ll give a brief overview of how the world generation works. First the world terrain is created, and everything is either dirt or sky. Then the “world” biome is placed down, which is a pass over the entire world that does things like adding in other tile types such as silver and gold. After this, all of the other biomes are laid out and each performs its own actions to reshape its allotted area.
The first issue, which still remains, is that the first step, the terrain generation, takes about 98% of the processing time. The way I am generating the terrain, through the Accidental Noise Library (ANL), is just very slow. There are likely optimizations I can make; I just need to keep hacking.
Another issue that really started slowing things down was the way we were placing all of the tiles during the world biome pass. I would randomly pick an area to place the tiles in and then generate a noisy circle for it, once again going through ANL. Doing this a few hundred times was just too much. I realized that I did not really need to generate a new noisy circle every time. No one would notice if we reused the same ones, especially if we stretched and flipped them. Thus the concept of “stamps” was born. Stamps can be created either from noise generation as before or from an image.
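Here is roughly what stamping could look like, as a Python sketch under some loud assumptions. The jittered circle below is only a stand-in for the ANL-generated shape, and every name here (`make_noisy_circle_stamp`, `flip_horizontal`, `stretch`, `apply_stamp`) is made up for the example.

```python
import random


def make_noisy_circle_stamp(radius, rng):
    """Build a ragged circular mask once (stand-in for the expensive noise call)."""
    size = 2 * radius + 1
    return [
        [(x - radius) ** 2 + (y - radius) ** 2 <= (radius - rng.random()) ** 2
         for x in range(size)]
        for y in range(size)
    ]


def flip_horizontal(stamp):
    """Mirror the stamp left to right."""
    return [row[::-1] for row in stamp]


def stretch(stamp, x_scale, y_scale):
    """Nearest-neighbor stretch by whole-number factors."""
    out = []
    for row in stamp:
        wide = [cell for cell in row for _ in range(x_scale)]
        out.extend(list(wide) for _ in range(y_scale))
    return out


def apply_stamp(world, stamp, left, top, tile):
    """Write `tile` wherever the stamp is set, clipping at the world edges."""
    for dy, row in enumerate(stamp):
        for dx, filled in enumerate(row):
            x, y = left + dx, top + dy
            if filled and 0 <= y < len(world) and 0 <= x < len(world[0]):
                world[y][x] = tile


rng = random.Random(42)
world = [["dirt"] * 40 for _ in range(20)]

# The expensive shapes are generated once...
stamps = [make_noisy_circle_stamp(2, rng), make_noisy_circle_stamp(3, rng)]

# ...and then reused many times with cheap variations.
for _ in range(10):
    stamp = rng.choice(stamps)
    if rng.random() < 0.5:
        stamp = flip_horizontal(stamp)
    stamp = stretch(stamp, rng.randint(1, 2), rng.randint(1, 2))
    apply_stamp(world, stamp, rng.randint(0, 26), rng.randint(0, 6), "gold")
```

Flipping and stretching are just list shuffles, so each placement costs next to nothing compared with asking the noise library for a fresh shape.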
The final issue was with a feature/tool I haven’t mentioned yet: creating an image out of the generated world. I put this together so it would be easier for us to iterate on the world generation. But there was a problem with it… It was taking 30 seconds. It was worth it, but still, ouch! I quickly verified that the part I suspected was indeed the culprit. Reading through all of the tiles and converting each one into a color to save to the image was taking 99% of the time. This was written in Python, so I moved just this part into C++. It went from 30 seconds to roughly 1 second. Victory was mine!
Needless to say, I’m covered in the bloody remains of ones and zeros.