Now before you get out pitchforks and tell me that all the managed code stuff I talked about in the previous article is the cause of it, I want to make two quick points.
- Much of that memory would be reclaimed the minute another program asked for it, since it was a lot of garbage the garbage collected was waiting for next collection on.
- The mistakes I made would have been just as slow in C++, RAII would have reclaimed the memory right away (good), but all the allocations I made would have been just as slow (bad).
A month ago when I was just starting the map generator, I was seeing poor performance moving around. I figured it had to be in my game engine somewhere, maybe in the other actors movement. It turned out to be a part of the GUI that was walking every single tile in my map to draw, even if off screen. Since it was in my GUI loop, just the act of walking all the tiles didn't scale to 200x200 sized maps.
To get back at the story at hand, I figured the way I was generating maps was too slow, that I'd need to rip it out and start over. I made the decision to measure my performance and see what was slow. I'd like to tangent to point out the CLR Profiler and EQATEC Profiler, which are free and awesome. So, it turned out that the subroutine that placed chests and monsters was taking 55% of the entire generation time, and using a crazy amount of memory.
The way the chest and monster placement works for my cave generator is simple: from the generated map, get a clear point that isn't too close to the stairs and use it. Repeat for number of chests and monsters. The problem was in the get a clear point. My naive algorithms walked the entire map and created a list of free points. Then it randomized it, and started grabbing from the front of the list until it found one wasn't too close. I figured it'd be more effective than randomly generating points and checking them in maps with a small percentage of empty tiles. You might see the problem here already, I'm walked all the tiles to painstakingly generate a list, randomize it, and then throw it away when done. When you do this 20+ times per level, then 50 levels in a row, it adds up.
Once I cached the list, the generation time dropped in half, and so did memory. I found another similar issue in the core map generator, see here for details. I'm tracking one more down in the stitch map generator right now.
The "lesson" from this story, if there is one, is that I would have never guessed that GetClearPoint was the culprit. A lot of people claim they want fast code, and might try to optimize where they think it's going to be slow, and do nothing but add bugs and obscure coding constructs. Write the simplest code first (with reason, no bubble sorts), then measure and see what's slowing you down. Once you know that, you can whip out the black magic optimizations there.
1 comment:
So true... :)
Post a Comment