This is a short list of brilliant ideas that have fascinated me for some time now, but I doubt I will find any chance to use them in foreseeable future. These are not mine (sadly) but have been virtually haunting my mind since I learned them.
What I like in them is a combination of simplicity and out-of-the-box thinking that is characteristic to elegant inventions. It’s also cool that the core of these ideas can be expressed in one sentence, but it fundamentally changes the way you look at the problem. I decided to spread them not because I think you might use them right away but for the sheer joy of writing about beautiful ideas.
Using DNS as a load balancer
I first heard about it on this presentation by Jason Hoffman of Joyent at RailsConf Europe 2007 (this topic is covered on the slides from 128 up but go ahead and read all of it if you’re interested in big-time scaling). Jason was talking about many interesting things but this one idea struck me as one of the brightest I heard in a long time.
Basically, if you run a Web application accessed worldwide, you can leverage DNS servers as a first-level load balancers by setting up different IPs for different areas of the world. You can make users in Europe use your European server, US users go to the US server and so on. Not only you get load balancing, but your users’ experience is a little bit better because the server is closer to them.
Of course, there are more possibilities and you don’t have to go by geography at all. There is also similar technique known as Round robin DNS.
In the old days (say MS-DOS or Win 3.1 days), the virus could employ many techniques to hide from anti-virus software, but their code was rather constant. Once the anti-virus was in control of the computer, it could easily find the virus by searching for a known sequence of bytes. Of course, there was a lot of mutations of each virus, but most of them were done by I-don’t-know-how-this-works-but-let-me-insert-my-nickname-here kids to impress their friends.
Then the self-modifying viruses came. The basic idea is that on the machine code level there are many different ways to achieve the same result. For example, instead of loading the constant 12345 into a register, you might zero the register then
add 12345 to it, you could
push 12345 on stack then
pop it into the register, you can load 9999 then
add 2346 (or use any other arithmetic operation). If you want to modify some variable in memory, you can load it into different registers or you can set up the stack pointer so the next
pop will read this.
When you get the idea, the possibilities of obfuscating the code are endless. This method allows you to generate a new mutation of your virus that does exactly the same thing but not even one instruction is the same.
There are even more advanced techniques, like reordering the blocks of code, which requires you to track all the jump instructions in your code, but once you do that, you can rearrange the blocks of code in virtually every way. And this is no black magic, the operating systems and linker programs have been doing it for years now.
I can’t find the original article, where I first read an analysis of actual metamorphic virus and a very good explanation of these techniques, but if you want to learn more, these links are good place to start: http://vx.netlux.org/lib/vmd01.html and http://migeel.sk/blog/2007/08/02/advanced-self-modifying-code/.
Of course, this all leaves you with one fundamental question: how do the anti-virus programs deal with this?
Memory pool with a free list
Another rather low-level idea that allows fast finding of free slots in an array of preallocated objects of the same size. What’s really clever is that free slots contain pointers to the next free slot thus creating a linked list of slots. This way you need only one additional pointer (to the first free slot).
Each time you need to allocate a slot, you grab what’s under this pointer and update it to the address of next free block (found at the beginning of this block). Deallocation mirrors this process by storing the current pointer at the beginning of deallocated block then setting the pointer to the address of this block. Thus allocation and deallocation work in constant time O(1).
This is probably the most easily employable technique of the three I described, so whenever you find yourself building some kind of cache or pool manager (of database connections or worker threads etc.) remember the term “free list”. Learn more from Wikipedia articles: Memory pool, Free list. This algorithm is used by Ruby’s virtual machine: Garbage Collection and the Ruby Heap (starting at slide 13).