This post is about how I improved the graphical features in a simple HTML5 game. I did so by, among other things, adding a 2.5D perspective. But more importantly it’s about how I optimized the performance of said graphics. If you’ve ever written a 2.5D game engine, you probably know exactly how to do this. If not, read on!
Last year, I entered the Seven Day Roguelike challenge with Golden Krone Hotel, a game about killing vampires with sunlight. It was well received, despite being completely 2D and having lo-fi pixel art and no animation. What can I say? The roguelike community doesn’t expect much when it comes to visuals.
This year, I decided to turn Golden Krone Hotel into a polished, commercial game. To appeal to a wider audience (one not content with ASCII graphics), I felt that the following features were needed:
- 2.5D perspective. Game objects that can obscure other objects behind them, producing an illusion of depth. Tall entities (e.g. big monsters) to emphasize the depth.
- Smooth movement between tiles
- Animated torches and spells
- Previously seen tiles, which appear desaturated (common in roguelikes but I didn’t have it originally)
- Increased visibility radius and fullscreen display
In 2014, this was no problem. That version was very easy to render. I only had 121 tiles on screen and they only needed to be rendered when the player moved.
The 2015 version was much more complex and slooow. I now had to render over 3 times as many tiles. I also needed to do it 60 times per second. To make matters worse, I was relying on some costly composite modes. Even on my gaming desktop, the performance wasn’t great and I knew it’d be much worse on older computers.
First let’s talk about how the new features work.
The primary thing required for 2.5D is to render your objects in order from back to front (top of screen to bottom of screen). Not a big deal, right? Then things get dicey. You can’t just draw all of your floor/wall tiles and then draw all of your monsters on top. That wouldn’t allow walls to be in front of monsters, defeating the purpose of 2.5D. Instead you must group objects (walls, monsters, spells) into rows and draw the rows in order.
It sounds easy on paper. When a monster moves to a tile, you assign them to the new tile, but you also draw them with a corresponding (x,y) offset. For example, moving 1 tile to the right produces a (-1,0) offset. Decaying the offsets to 0 automatically slides the monster around.
- The player’s offset needs to be added to the camera itself, so the camera isn’t jerky.
- In the naive approach, monsters going around corners appear to take diagonal shortcuts. To fix that, you’ll need arrays of offsets that decay one at a time.
- When you add in the 2.5D perspective, you can no longer draw a monster in the row where it ends up. When I tried it initially, monsters moving vertically would get hidden behind floor tiles. Woops! The solution is find the maximum Y coordinate a monster reaches in an animation and draw in that row instead.
The root of all evil
I went as long as possible without obsessing over performance, but at some point I realized I simply can’t ship a game this slow. Tuning the performance of my game was a slow process filled with confusion, frustration, and self doubt. But I’m happy with the end result and I learned some things along the way:
- Always determine where you need optimizations. It’s too easy to focus on the wrong parts, the parts that will never help you. See: Amdahl’s Law
- Always verify that your optimization worked. It feels good to come up with some clever trick and move on, but sometimes you’ve actually made the program slower.
- Remember that optimizations are tradeoffs. You lose debuggability, readability, and valuable development time. You damn well better get something good in return.
- Small mistakes (like forgetting to swap out some test code), can totally murder you. This happened to be me several times. It’s hard to detect without profiling because the program still behaves correctly.
To summarize: Never make assumptions. Profile, profile, profile.
The quickest way to profile is to use in-browser tools like Chrome’s profile tab. However, it can be difficult to interpret the results. When that fails you, you can get a better picture by writing your own profiling code. Performance.now() is pretty great for this.
Onto the optimizations:
Separate rendering into two buckets: animation and turn
Objects that are animated need to be rendered every frame. Objects that only change when the player takes a turn, such as the minimap, only need to be updated once per turn. Maybe the distinction is completely obvious, but it took me a while to conceptualize it and then hunt down all the bits that were being rendered too frequently. This separation is the single most important performance enhancement that I made.
Tiles don’t change during turns, but they do move every frame while the player’s position is being animated. The solution is to drawn them onto a buffer (i.e. an off-DOM canvas element) and just move the buffer around as needed. Besides the main buffer, there also needs to be buffers dedicated to previously seen tiles (one per level). And as mentioned earlier, both of these need to be split up into rows.
There are only two hard things in Computer Science: cache invalidation and naming things. — Phil Karlton
Over the years, I’ve noticed that caching is used absolutely everywhere to make things fast. Unfortunately, caching can be complex and can cause all sorts of weird behavior. Computer thrashing? Web app not functioning properly? App icons missing? Maybe blame caching. It’s an interesting thought experiment: if computers and networks were infinitely fast, computer programs wouldn’t just be faster; they’d also be much simpler to write and more reliable.
So it goes without saying this solution was a bit painful.
Scale the canvas instead of images drawn onto the canvas
Well, I sure do feel stupid about this one. Golden Krone Hotel uses small tiles, 16×16 pixels, which are typically scaled up 300%. The smart way to accomplish the scaling is to draw everything at 1X and then scale the canvas through CSS. The dumb way (what I did at first) is to scale while drawing to the canvas. This is actually much more complex. It resulted in some nasty bugs whenever the player resized the window because now I had all these improperly sized canvas elements sitting around.
Why did I do it the hard way to start? Well, for decades there has been no good way to tell browsers to use nearest neighbor scaling on pixel art; this is absolutely hilarious considering both how widespread pixel art games are and how dead simple the nearest neighbor algorithm actually is.
So I used canvas property imageSmoothingEnabled = false to get crisp pixels on canvas draws. Now, I’ve decided to switch over to the CSS property image-rendering: pixelated. Of course, It still doesn’t work in all browsers and it only showed up in Chrome this year, but that’s good enough for me.
In the new version, I added neat looking fog of war, but it was quite slow. I tried to optimize by making fog buffers. I was ripping my hair out trying to figure out a way to update a buffer without redrawing the whole thing. Fog tiles overlapped, so for each piece of fog removed, I had to redraw all its neighbors.
I finally got it working, but then realized the buffers actually made things worse, since each buffer was the size of an entire level and the fog was animated as well. The buffers were so huge that they would make my entire app crash on startup. So I cut fog out entirely and I don’t miss it. Now I get these creepy vampire eyes instead.
No rendering in steady state
When the player idles for a while, the only objects still animating are torches (at 10 frames per second). Someone suggested that rendering only at 10fps during those times would save on laptop battery life. It’s a great idea, so I did it.
I’d seen some indications that using WebGL might improve performance. I decided to use PIXI.js to utilize WebGL. The cool thing about PIXI is it auto detects WebGL support and falls back to canvas rendering when not available. The requirement to run a local web server was a bit of a pain though. I’m no stranger to running local servers, but I really appreciate the simplicity of popping open an index.html file and having things just work.
I also spent several days freaking out over how PIXI and NW would interact before realizing it would work fine out of the box.
Golden Krone Hotel has a performance issue that most roguelikes don’t have: dynamic lighting. Torches, spells, monsters, and rays of sunlight each produce their own light. Each light source can visit up to 400 tiles while distributing its influence. This occurs every turn and when the player rests, I have to simulate 50 turns at once. So in order to appear snappy, the game has to perform a few 100,000 tile visits/calculations in a split second. Achieving that seemed hopeless.
Failed solution #1: precalculate the influences of each light on each tile and then add them together. Sounds nice, but when you do the math it’s roughly the same number of calculations.
Failed solution #2: don’t simulate light sources that the player can’t see. Doesn’t really help because the player can see light from torches as far as 20 tiles away. If you have to calculate everything in a 20 tile radius, well, that’s nearly the whole map.
Failed solution #3: memoizing distance calculations. I thought 3 exponential calls would be slower than performing an array or object property lookup. I was wrong.
After many weeks of handwringing, the solution popped into my head: only calculate the differences between turns! I keep track of all current light sources, all previous ones, and the differences. If a light is missing from the previous list, I add its influence to tiles. If it’s missing from the current list, I subtract. I have to reset lighting whenever a door is opened, but beyond that it’s pretty simple code. I couldn’t believe how effective this change was. It took the average rest time from 1s to 100ms.
I made significant improvements in the performance of Golden Krone Hotel, but if I’m being honest here, some of the approaches just didn’t pan out. WebGL rendering and tile caching in particular resulted in little or no improvement. Tile caching appeared to be a winner early on, but that was before other changes superceded it.
One of my inspirations for shooting for a commercial release was Lost Decade Games, makers of A Wizard’s Lizard. These two guys have built a reputation on creating a surprisingly successful HTML5 game. They even have a regular podcast, which I would listen to every week. I’d smile and then cringe when I heard the cheesy opening which included a reference to the “arcane arts of HTML5.” One day I turned on the podcast and the opening which they had been using for 100+ episodes had changed. No mention of HTML5. They were switching to Unity because of performance reasons. Sad face.
If you enjoyed this post, why not check out Golden Krone Hotel on Steam Greenlight?