Wednesday, February 27, 2013

Hello, bits

Bit arrays are a great tool for efficiently implementing sets, but bit twiddling operations can be a little unwieldy. After searching for an existing method to list all the bits set in an integer, nothing turned up.  This is a pretty important operation for a set, so I just implemented it myself with a bit counting algorithm as a starting point.

How many?

Counting one bits is easy with Brian Kernighan's bit counting algorithm (in C, of course):
int countBits(int n) {
    int count = 0;
    while(n != 0) {
        n &= (n-1);
        count++;
    }
    return count;
}

Why does it work?

The algorithm takes an input n, and "peels" bits off one-by-one until none are left. Line 4 clears the lowest order one bit by masking with n-1. The shape of this mask is easier to see in binary:
When n = 0b1000, the mask n-1 = 0b0111
When n = 0b101000, the mask n-1 = 0b100111

So n-1 produces a mask with all the same high-order bits set.  The lowest bit from n is not set, but all bits below this one will be set.  Bitwise AND-ing n and (n-1) together produces a new value with all the same bits set except the lowest one.  Once all bits are cleared, n will be zero and the loop terminates.

Getting bit indices

Counting bits isn't what we're after, so lets get the actual bits.
int printBits(int n) {
    int count = 0;
    int next_n, index;
    while(n != 0) {
        next_n = n & (n-1);
        index = __builtin_ctz(n - next_n);
        n = next_n;
        printf("Hello, %i\n", index);
    }
    return count;
}
It's easy to get the value of the bit we just cleared---subtract the values before and after the mask!  This leaves us with a number containing only one bit set: the one we just masked out.  The __builtin_ctz function is a GCC builtin that gives us the index of this bit instead of the corresponding power of two.

Hello, bits

Printing the bits isn't all that useful, so here's a minimal C++ biterator that lets you walk over the one bits in an integer.
class biterator {
private:
    int _value;
    int _next;

public:
    biterator(int value) {
        _value = value;
        _next = value & (value-1);
    }

    biterator begin() {
        return biterator(_value);
    }

    biterator end() {
        return biterator(0);
    }

    int operator*() {
        return __builtin_ctz(_value - _next);
    }

    void operator++() {
        _value = _next;
        _next = _value & (_value - 1);
    }

    bool operator!=(biterator b) {
        return _value != b._value;
    }
};

// The obligatory example code
int main() {
    // C++11 foreach:
    for(int i: biterator(0xfa)) {
        printf("Hello, %i\n", i);
    }
    return 0;
}

Acknowledgements

Thanks to +Dan Barowy for piquing my interest with this problem, and translating my algorithmic ramblings to something actually runnable.