Monday, February 8, 2010

libtcod-net 1.5.0rc1-2 released (fixes reported TCODRandom issue)

The issue reported here has been debugged and fixed. The short answer for those not interested in the details is that I was misusing a libtcod API, and the API reacted to that misuse by corrupting its internal state. Get the newest libtcod-net here.

For those who are more technically minded, here's the story.

Between libtcod 1.5.0b2 and 1.5.0rc1, the C API for creating a TCOD_Random changed. Previously, one could simply call TCOD_random_new(). This release added two different RNG implementations, so that function now takes a parameter. The way to get the "default" RNG is to call TCOD_random_get_instance, which I happily did. I also added a new TCODRandom constructor that takes the enum for those who care, but that's beside the point.

This worked, and seemed to be all I needed to do to fix libtcod-net's TCODRandom for the new API. However, there was an underlying issue. Sometimes in magecrawl, monsters and players would get into long stretches (4000+ turns) of continual misses.

After some detective work, the issue was found. TCODRandom, like all libtcod-net wrappers that handle unmanaged resources, implements IDisposable. The idea is that you, or the runtime, will call Dispose() on it to free the allocated memory. The problem is that the new API came with a note saying you aren't supposed to call TCOD_random_delete on the default RNG. Doing so freed the memory for the global instance but did not null out the pointer to it. Due to the implementation of the RNG, it would happily keep using the garbage memory, giving random-looking answers most of the time. Sometimes, however, part of the garbage pointed to a segment of zeroed-out memory, and I'd see a long stretch of zeros.
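
To make the failure mode concrete, here is a minimal C sketch of the pattern described above. It is not libtcod's code: every name in it (toy_rng, toy_get_instance, toy_delete_buggy) is hypothetical, and instead of really calling free() it only marks the storage as released, so the demonstration itself stays free of undefined behavior.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the library's RNG state; not libtcod's real struct. */
typedef struct {
    unsigned state;
    bool released;   /* models "this memory was handed back to the allocator" */
} toy_rng;

static toy_rng default_storage = { 12345u, false };
static toy_rng *default_instance = NULL;   /* the library's global pointer */

/* Lazily set up and return the shared default generator. */
toy_rng *toy_get_instance(void) {
    if (default_instance == NULL) {
        default_storage.released = false;
        default_instance = &default_storage;
    }
    return default_instance;
}

/* Buggy delete: releases the memory but leaves the global pointer untouched. */
void toy_delete_buggy(toy_rng *rng) {
    assert(rng != NULL);
    rng->released = true;   /* real code would call free(rng) here */
}

/* True when the library would hand out a pointer to released memory. */
bool toy_default_is_dangling(void) {
    return default_instance != NULL && default_instance->released;
}
```

After toy_delete_buggy(toy_get_instance()), the next toy_get_instance() call still returns the same pointer, and toy_default_is_dangling() reports true. In a real heap that pointer would refer to whatever the allocator put there next, which matches the mostly-random, occasionally-all-zero output I was seeing.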

The solution was twofold. I updated libtcod-net to follow the API's rules and not delete the RNG when a TCODRandom is backed by the default instance. jice updated libtcod so that it no longer corrupts itself internally if you happen to do this. One of the joys of maintaining libtcod-net is that sometimes you get to track down memory corruption issues, even when you're writing C#. That is the fun of interfacing with unmanaged code.
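
Both halves of the fix can be sketched in C as well. This is an illustration under assumptions, not the actual libtcod or libtcod-net source: the sketch_ names are made up, the owns_handle flag is one plausible way for a wrapper to remember whether it may delete its handle, and nulling the global pointer on delete is only my guess at one way a library could protect itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical RNG state, standing in for the library's real one. */
typedef struct { unsigned state; } sketch_rng;

static sketch_rng *sketch_default = NULL;   /* library-owned global instance */

sketch_rng *sketch_new(unsigned seed) {
    sketch_rng *rng = malloc(sizeof *rng);
    if (rng != NULL) rng->state = seed;
    return rng;
}

sketch_rng *sketch_get_instance(void) {
    if (sketch_default == NULL) sketch_default = sketch_new(12345u);
    return sketch_default;
}

/* Library-side guard (assumed): if the caller deletes the default instance,
   forget the pointer so the next get_instance call rebuilds it. */
void sketch_delete(sketch_rng *rng) {
    if (rng == sketch_default) sketch_default = NULL;
    free(rng);
}

/* Wrapper-side fix: remember whether this wrapper owns its handle. */
typedef struct {
    sketch_rng *handle;
    bool owns_handle;
} sketch_wrapper;

sketch_wrapper sketch_wrap_default(void) {
    sketch_wrapper w = { sketch_get_instance(), false };  /* borrowed */
    return w;
}

sketch_wrapper sketch_wrap_new(unsigned seed) {
    sketch_wrapper w = { sketch_new(seed), true };        /* owned */
    return w;
}

/* Analogue of Dispose(): frees only owned handles, and is safe to call twice.
   Returns true if memory was actually released. */
bool sketch_dispose(sketch_wrapper *w) {
    assert(w != NULL);
    bool freed = false;
    if (w->owns_handle && w->handle != NULL) {
        sketch_delete(w->handle);
        freed = true;
    }
    w->handle = NULL;
    w->owns_handle = false;
    return freed;
}
```

Disposing a wrapper made with sketch_wrap_default() leaves the shared instance alive, while one made with sketch_wrap_new() frees its own generator exactly once. The guard in sketch_delete is the belt-and-braces part: even if a caller breaks the rule, the library starts over with a fresh default instead of reading freed memory.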
