Haxin Mainframes

A blog about stuff I do, find interesting, or want to blab about.

Python String Bit Iterator

My most recent project has been to implement a BitTorrent client (or maybe just part of one) and a DHT client! Really, the BitTorrent client is just there to support the DHT client. Anyhow, I found that in my DHT implementation I repeatedly needed to take a string of bytes and iterate through it bit by bit, accomplishing some task along the way.

Now, in Python 3 there are all kinds of cool tools to do this kind of thing, but I am working in Python 2.7. To accomplish my goal I used yield to create a generator that iterates over a Python string bit by bit.

# Returns an iterator that will iterate bit by bit over a string!
def string_bit_iterator(str_to_iterate):
    bitmask = 128  # 1 << 7, i.e. 0b10000000
    cur_char_index = 0

    while cur_char_index < len(str_to_iterate):
        # Test the current bit of the current byte
        if bitmask & ord(str_to_iterate[cur_char_index]):
            yield 1
        else:
            yield 0

        # Move the mask one bit to the right; once it falls off the
        # low end, reset it and advance to the next byte
        bitmask = bitmask >> 1
        if bitmask == 0:
            bitmask = 128  # 1 << 7, i.e. 0b10000000
            cur_char_index = cur_char_index + 1

This is pretty self-explanatory, I hope. It starts at the first byte in the string with a bitmask that has the most significant bit set, then shifts the mask right one bit at a time, resetting it and moving on to the next byte once the mask reaches zero.

Here are some of the tests I wrote for this:
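For comparison, here is a more compact sketch of the same idea. It is not the version from my DHT code; the name bits_of is made up for this example, and it assumes the input is a byte string (a plain str on Python 2.7, bytes on Python 3). Wrapping the input in bytearray gives integers directly, so no ord() call is needed.

```python
def bits_of(data):
    """Yield the bits of a byte string, most significant bit first."""
    # bytearray yields integers on both Python 2.7 and Python 3
    for byte in bytearray(data):
        # Walk from bit 7 (value 128) down to bit 0 (value 1)
        for shift in range(7, -1, -1):
            yield (byte >> shift) & 1


print(list(bits_of(b'\xaa')))  # [1, 0, 1, 0, 1, 0, 1, 0]
```

Shifting the byte instead of the mask removes the need to track a character index or reset the mask by hand.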

bit_iter = string_bit_iterator('\xff\xff\xff\xff\xff\xff\xff\xff')
for b in bit_iter:
    assert(b == 1)

count = 0
last = 0
bit_iter = string_bit_iterator('\xaa')  # 0xAA is 0b10101010
for b in bit_iter:
    assert(b != last)
    last = b
    count = count + 1
assert(count == 8)

peer = new_node([0xAA,0xAA])
for b in peer:
    assert(b != last)
    last = b

This was used for iterating over the bits in an info hash or 20-byte peer ID in my DHT implementation (which is on my GitHub).

Hopefully this takes care of some bitmasking/shifting for someone else.
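As a rough illustration of that kind of use, the sketch below counts how many leading bits two 20-byte IDs share. This is an assumed example, not code from my DHT implementation: the names iter_bits and shared_prefix_bits and the sample IDs are hypothetical, and iter_bits is a compact stand-in for string_bit_iterator so the snippet runs on its own.

```python
def iter_bits(data):
    # Hypothetical stand-in for string_bit_iterator; takes a byte string
    for byte in bytearray(data):
        for shift in range(7, -1, -1):
            yield (byte >> shift) & 1


def shared_prefix_bits(id_a, id_b):
    """Count the leading bits that two equal-length IDs have in common."""
    count = 0
    for bit_a, bit_b in zip(iter_bits(id_a), iter_bits(id_b)):
        if bit_a != bit_b:
            break
        count += 1
    return count


# Made-up 20-byte IDs: identical first byte, then 0xAA vs 0xAB,
# which differ only in their lowest bit
node_id = b'\xaa' * 20
other_id = b'\xaa' + b'\xab' + b'\x00' * 18
print(shared_prefix_bits(node_id, other_id))  # 15
```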