Some time ago, I started a crazy experiment trying to make real-time lighting work on the PICO-8, a cute little fantasy console with limited horsepower to spare.
What we can only assume is an Egyptian merry-go-round.
This went quite well, so I started an even crazier endeavor in which I try to explain all the principles behind it in a series of articles. This is part three.
Part 1 and part 2 dealt with getting the basics of our effect working — making a blending routine fast enough to be workable and figuring out how to apply said routine to make a smoothly lit circle.
Where we ended up last time.
The result is impressive as a programming feat, but a bit lackluster to look at. Somehow, despite the warm tones of the palette, the end result has all the warmth and hospitality of an office fluorescent light.
The feeling we are going for is definitely not that. We’re on a torch-lit romp through an ancient tomb, not in a cubicle simulator, so we have to find a way to breathe some life into this thing.
The thing about torches is that their light always seems to be on the move — flickering, sputtering, changing moment to moment. As it stands, our effect doesn’t do anything of the sort. In fact, it’s quite the opposite — if the character stops moving, not a single pixel on the screen will change, making the whole thing feel artificial.
Another shortcoming of the current implementation is the obviousness of the edges between the light levels. When you know what to look for, you can actually count the distinct bands, especially when the character is on the move.
Any lighting system is basically a magic trick. We’re trying to convince the viewer to believe something that isn’t true — namely, that the world behind the screen is real. And like with any magic trick, once the structure behind how it’s done becomes too obvious, we have already lost.
If we want to fix that, we have to confuse the eyes of the viewer so that the edges are no longer visible. Increasing the number of light levels is not an option due to performance issues, but there is a tried-and-true approach used by people and algorithms alike whenever the color space is too small for comfort: dithering.
Undithered vs dithered image. Notice how the bands on the neck disappear in the dithered one. (image from English Wikipedia)
Unfortunately, the standard approach to dithering is not going to mesh well with our algorithm, since it depends heavily on having long horizontal stretches of the same lighting level. The whole idea behind dithering is breaking these long stretches up, using quickly changing patterns of pixels to hide the seams.
Fortunately for us, there is a different type of dithering that we can use — temporal dithering.
Like our algorithm, biology also has its limitations. To put it frankly — our eyes are pretty crap. When provided with input that changes too fast for the cones and rods to handle, our wetware throws in the towel, averages everything out and calls it a day.
This means that if we make the location of the edges change frame-to-frame, our eyes and brains won’t be able to keep up. The differences between frames will get smooshed out and the exact location of the edges will be hard to pin down.
Since the other major problem we had was our light being too static, adding a bit of movement is a win-win. Changing up where the light reaches each frame sounds a lot like what a flickering torch does.
One of the simplest ways to make the edges move each frame is scaling. This would mimic what happens with a real flame, with its brightness changing as it burns unevenly.
We can achieve that pretty easily by picking a random factor each frame and multiplying all entries in the light_rng table (whose purpose is defining where the edges are) by that factor.
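In code, that could look something like this. This is a minimal sketch, not the original cart's code — the base_rng copy and the 0.8–1.2 factor range are my assumptions:

```lua
-- keep an untouched copy of the squared radii so the
-- scaling doesn't compound frame over frame
base_rng = {}
for lv = 1, #light_rng do
  base_rng[lv] = light_rng[lv]
end

-- called once per frame: jitter every edge uniformly
function flicker()
  local factor = 0.8 + rnd(0.4) -- range assumed
  for lv = 1, #base_rng do
    light_rng[lv] = base_rng[lv] * factor
  end
end
```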
Scaling uniformly and blinking incessantly
This looks a little better in the liveliness department, but the edges are unfortunately still visible — they just appear to be moving. In addition, the flickering is kind of annoying in the long run. The jumps between frames are too stark, while still not achieving the goal of obscuring the location of the edges.
What if we try to introduce a bit of spatial dithering back into the equation? We can’t break up the horizontal segments, but we’re free to break up the vertical ones. Instead of scaling the whole light, we can scale each line separately. This will also make the brightness of the frames change less on average, avoiding the harsh blinking effect.
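A sketch of the per-line variant, assuming light_rng holds squared radii as before. The jitter range and the loop shape are illustrative, not the article's actual implementation:

```lua
for y = -radius, radius do
  -- a fresh factor for every scanline instead of one
  -- per frame; the 0.85..1.15 range is a guess
  local factor = 0.85 + rnd(0.3)
  for lv = 1, #light_rng do
    local r2 = light_rng[lv] * factor - y * y
    if r2 > 0 then
      local w = sqrt(r2)
      -- ...hand the segment [cx - w, cx + w] at level
      -- lv over to the blending routine...
    end
  end
end
```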
The final approach and a single frame of the scaling affecting the light itself.
That gets rid of both of our problems. The bands are much less pronounced, and we kept the lively feeling of torchlight without any annoying flashes.
We have managed to turn the artificial-feeling ringed circle into a fuzzy, warm glow. This helps a lot, but there are still a few ways we might make it better.
The way we calculate lighting is basically correct, given one important assumption: the world has to be perfectly flat. Unfortunately, the real world we’re all used to is annoyingly three-dimensional with all the consequences that entails. The biggest one is that surfaces can be angled differently with respect to the light source, so differing amounts of light get reflected.
True 3D engines take this into account and that angle is one of the major factors deciding how bright a surface is. The classic “SNES RPG” perspective I used might not be 3D, but it still features two different kinds of surface: floors (which already look great) and vertical walls (which feel kind of fake right now).
I’m talking about these pesky things.
A saner person might have gone for a fully top-down perspective that solves this problem like an ostrich would: by not having any vertical surfaces visible. I am apparently not a sane person, so I have two ideas for giving more depth to the walls:

- darkening walls based on where the light source is relative to them
- having the walls cast proper shadows

Shadows are a big topic and are coming in part 4, so let’s stick with the first idea for now.
To be entirely correct in our lighting implementation, we’d have to calculate the angle between the ray of light and the wall it is hitting for each pixel we draw. We might try doing that, and PICO-8 would probably dutifully render the first frame just in time for the fourth part of this article.
Instead, we’re going to focus on bang for buck. We should pick whatever the biggest problem is, and as far as I’m concerned, it’s the fact that the walls are still lit when the light is behind them. Fixing that would go a long way towards making it believable (I actually called it “verisimilitude” first, but then figured any word I can’t type without a spellchecker must be too pretentious).
First, we have to figure out which surfaces are affected. This is done by simply marking all wall fronts in the tileset with a special flag.
Since all visible walls are facing the same way (that’s the way our perspective works), it’s relatively easy to figure out when we’re behind them once we know where they are. Whenever the y coordinate of our character becomes smaller than the y coordinate of the base of the wall, we’re behind.
We basically have a master’s degree in palette effects now, so using one for darkening the walls when you go behind them seems obvious. Suddenly switching from fully lit to dark once you cross a magical barrier would be too jarring, so we will do it gradually, going one level darker for every 2 pixels behind.
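The two steps above — tagging wall fronts with a flag, then darkening by depth — can be sketched in PICO-8 Lua roughly like this. The flag number, the level cap, and the function names are my assumptions, not the original cart's code:

```lua
wall_flag = 0  -- sprite flag marking wall fronts (assumed)
max_dark = 5   -- darkest palette level (assumed)

-- true if the map tile at (tx, ty) is a wall front,
-- using pico-8's sprite flag query
function is_wall_front(tx, ty)
  return fget(mget(tx, ty), wall_flag)
end

-- how many levels to darken a wall whose base sits at
-- wall_base_y, given the character's y coordinate:
-- zero when in front, one level per 2 pixels behind
function wall_darkness(char_y, wall_base_y)
  local depth = wall_base_y - char_y
  if depth <= 0 then return 0 end
  return min(flr(depth / 2), max_dark)
end
```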
Dynamic darkening of vertical walls — with and without
That’s much better, and the lighting filter still gets applied on top of anything we do, completing the illusion.
The next thing we’d like to tackle is adding shadows, but there is one problem: if we keep the light radius at levels usable for gameplay, rendering takes nearly 100% of the PICO-8 CPU, and shadows certainly aren’t going to appear for free.
This means we have to go back and optimize what we already have. Profiling a bit (using the low-tech approach of strategically applied comments and looking at stat(1)) shows that the biggest performance hog is fl_light(), the function divvying up horizontal lines into lit segments. I spent a lot of time whipping it into shape, and I’d like to show you two of the tricks I used — one very general and usable everywhere, and one very unusual, specific to Lua.
As it turns out, one of the sources of the function’s slowness is the sqrt() function. Calculating a square root is a tad more involved than, say, multiplication — as anybody who has tried calculating √13 by hand can attest (try it, I’ll wait).
Whenever we encounter a slow function, our optimization reflexes tell us one thing — precompute! Our only regret is that we can’t really precompute the square root of every possible number, as infinite time and memory is not a thing yet.
Fortunately, we don’t really need to know the square root of everything. The only formula we use it in is √(light_rng[lv] - y²). Both y and light_rng[lv] are integers — meaning we’ll never have to deal with fractions at all. The value will also never be larger than the largest number in light_rng, which is the square of the maximum radius of our light.
If we assume that radius to be 64 (half the size of a PICO-8 screen), our precomputed table only has to have 4096 entries (64²). That lets us replace the costly sqrt(x) with a much faster _sqrt[x] table lookup.
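Building that table is a one-time cost at startup. A minimal sketch, assuming the radius never exceeds 64 pixels:

```lua
-- precompute integer square roots once, so the hot loop
-- never calls sqrt(); arguments stay in 0..64^2
_sqrt = {}
for x = 0, 64 * 64 do
  _sqrt[x] = sqrt(x)
end

-- in the line-drawing loop, this...
--   local w = sqrt(light_rng[lv] - y * y)
-- ...becomes a plain table lookup:
--   local w = _sqrt[light_rng[lv] - y * y]
```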
Which piece of code executes faster: a = 3 or b = 4? Unless we believe that bigger numbers take more time to store in memory (spoiler alert: not how it works), we might say that these are the same. However, the real answer in most programming languages is “depends on what a and b are”.
As for Lua, it is a big fan of tables. It’s so fond of them, it uses them for everything: not only as a single structure to represent both arrays and objects, but also as a way to avoid having to actually implement globals.
Every time you try something innocent like global = 12, Lua mentally translates it into MY_MAGICAL_GLOBAL_HIDEYHOLE["global"] = 12. What we see as global variables, Lua sees as entries in a giant global table. Every access to a global bears the tax of hash table indexing.
Local variables have no such problem, implemented efficiently as locations on a special stack. This means that the intuitively insane, cargo-cult optimization of storing a global variable in a local one is actually a thing. That’s good for me as a mad PICO-8 scientist, as I get to speed up the code and look crazy in one fell swoop.
The code we use for figuring out the light uses a few globals. One of them, the fills table, stores drawing functions accessed multiple times each line. It turns out that copying that table to a local variable saves us 5% of CPU each frame, basically for free.
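The trick itself is a one-liner. A hypothetical sketch — fills is the real table name from the article, but the surrounding function body is simplified for illustration:

```lua
function fl_light(...)
  -- one global (hash table) lookup here, instead of one
  -- per line drawn below; the local lives in a fast
  -- stack slot from now on
  local fills = fills
  for y = y0, y1 do
    -- every access now hits the local, not the global table
    local fill = fills[level]
    -- ...draw the segment with fill()...
  end
end
```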
We’ve got the lighting effect to be both appealing and reasonably fast, which is a good excuse to call it a day and consider part 3 complete.
This part was a little fluffier than the previous ones, but fear not — the fourth and last part will be all about vector math, efficient polygon rendering and using both in conjunction with some elbow grease to get the last thing missing from our effect: realistic shadows.
Until then, may every light in your life feel warm and inviting.
Part 1 | Part 2 | Part 3 | Part 4 | Play the game