Hello, I have just started learning Python.

Please can you tell me if there is a way to read in a file one bit or byte at a time (i.e. in binary).

Thank you!

If you have Python 3, then this is easy:

with open(filename, "rb") as fh:
    while True:
        b = fh.read(1)
        if not b:  # an empty result means end of file
            break
        #...do something with b

Thank you, however I'm using Python 2.6.1... is there a different way to do it using that?

Maybe try this; it's pretty much the same principle as in 3.0:

fh = open('myfile.txt', 'rb')
while True:
    try:
        fh.read(1)
    except:
        pass

To be more specific, I want to read in a bit at a time from a file, and add each individual bit to a list. Can you help with this? (sorry, I am a complete beginner!)

Here's how to add items to a list:

>>> bits_list = []
>>> bits_list.append('0')
>>> bits_list.append('1')
>>> bits_list.append('1')
>>> bits_list
['0', '1', '1']
>>>

Yeah, that's what I thought. But this is the code I have - it does not seem to work, just prints out a blank line in the shell then RESTART. And then the whole shell/IDLE seemed to crash when I closed them.

The file I'm trying to read one bit at a time from is an encrypted jpeg. This is the code I have:

encrypted = []

fh = open('encjpgfile', 'rb')

while True:
    try:
        bit = fh.read(1)
        #print bit
        encrypted.append(bit)
    except:
        pass

print encrypted

sys.exit(0)

Yes, well, I forgot that read() does not raise StopIteration when it is used this way. At the end of the file it simply returns an empty string, so the except clause never runs and we're stuck in an infinite loop.

You can add a check right after the read, something like: if bit == '': break
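For reference, here is a minimal sketch of the loop with that end-of-file check in place. The function name read_bytes_one_at_a_time and the throwaway temp file are only there for illustration; they are not from the original script. It tests "not chunk" rather than comparing with '' so the same code should behave on both Python 2.6 and Python 3.

```python
import os
import tempfile


def read_bytes_one_at_a_time(path):
    """Return a list holding each byte of the file as a length-1 string."""
    chunks = []
    fh = open(path, 'rb')
    try:
        while True:
            chunk = fh.read(1)   # read() returns an empty string at end of file
            if not chunk:        # '' on Python 2, b'' on Python 3
                break
            chunks.append(chunk)
    finally:
        fh.close()
    return chunks


# Self-check with a two-byte throwaway file (made-up data, not a real JPEG).
fd, demo_path = tempfile.mkstemp()
os.write(fd, b'\x0e\xc4')
os.close(fd)
encrypted = read_bytes_one_at_a_time(demo_path)
os.remove(demo_path)
print(encrypted)
```

The try/finally just makes sure the file is closed even if something goes wrong inside the loop.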

That works great, thank you! The only thing is that when I print out the 'encrypted' list, it's full of characters like '\x0e', '\xc4', etc., not the 1s and 0s I want. How do I get it to show those instead?

Sorry for all the questions, very limited knowledge I have, you've been very helpful though.

I suggest this

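from binascii import hexlify

# build a lookup table from each hex digit to its 4-bit pattern
L = ["00", "01", "10", "11"]
LL = [u+v for u in L for v in L]  # "0000" through "1111"
s = "0123456789abcdef"
D = dict((s[i], LL[i]) for i in range(16))
# read the whole file, hex-encode it, and translate digit by digit (Python 2)
jpeg = open('encjpgfile', 'rb').read()
bits = ''.join(D[x] for x in hexlify(jpeg))
print(bits)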

Try something along that line:

# can be used for Python25/26

def int2bin(num, bits):
    """
    returns the binary of integer num, using bits
    number of digits, will pad with leading zeroes
    """
    bs = ''
    for x in range(0, bits):
        if 2**x == 2**x & num:
            bs = '1' + bs
        else:
            bs = '0' + bs
    return bs


image_file = "red.jpg"  # a testfile

try:
    # read all data into a string
    data = open(image_file, 'rb').read()
except IOError:
    print "Image file %s not found" % image_file
    raise SystemExit

bit_list = []
# create a list of zero-padded 8-bit strings
for ch in data:
    # take the int value of ch and convert to 8 bit strings
    bits8 = int2bin(ord(ch), 8)
    bit_list.append(bits8)

print(bit_list)

"""
my result -->
['11111111', '11011000', '11111111', '11100000', '00000000', ... ]
"""

Sorry about the baddy!

Yes, unfortunately Python's file handling wasn't intended for bit-by-bit reading of a file; the read function takes an optional parameter for the number of bytes to read. Since we passed 1, that told Python to read the file byte by byte. Each of the characters that you see is a single byte of the image data.

Equipped with the knowledge that a byte consists of 8 bits, you could consider it a personal challenge to come up with a way to take each of those bytes and translate it into the corresponding bits.
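If you get stuck on that challenge, here is one possible approach, sketched with shifts and masks. The helper names byte_to_bits and bytes_to_bits are made up for this example. It produces integer 0s and 1s (most significant bit first) rather than strings, and it goes through bytearray so that iterating gives integers on both Python 2.6 and Python 3.

```python
def byte_to_bits(value):
    """Return the 8 bits of an integer 0..255 as a list of ints, MSB first."""
    # Shift the wanted bit down to position 0, then mask off everything else.
    return [(value >> shift) & 1 for shift in range(7, -1, -1)]


def bytes_to_bits(data):
    """Flatten a byte string into one list of 0/1 ints."""
    bits = []
    for value in bytearray(data):   # bytearray yields one int per byte
        bits.extend(byte_to_bits(value))
    return bits


print(byte_to_bits(0xc4))
print(bytes_to_bits(b'\x0e\xc4'))
```

Each byte contributes exactly eight entries, so a file of N bytes gives a list of 8*N bits.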

\xc4 means hexadecimal C4, a.k.a. 0xC4

>>> int('c4',16)
196
>>> int('0xc4',16)
196
>>>

In fact, Python 2.6 and 3.0 both have a built-in function bin() that converts an integer directly to a binary string; for example, bin(196) gives '0b11000100'.
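A short sketch of how bin() could be used here. Note that bin() drops leading zeros and adds a '0b' prefix, so you need to strip and pad the result, or use format() with the '08b' format spec instead. The helper name to_bits8 is just for this example.

```python
def to_bits8(value):
    """Return an integer 0..255 as an 8-character string of 0s and 1s."""
    return format(value, '08b')   # 'b' = binary, '08' = pad with zeros to width 8


value = ord(b'\xc4')              # the byte '\xc4' as an integer, 196

print(bin(value))                 # binary with the '0b' prefix
print(bin(value)[2:].zfill(8))    # prefix stripped, padded to 8 digits
print(to_bits8(value))            # same result via format()
print(to_bits8(0x0e))             # leading zeros are kept
```

Both routes give the same 8-digit string; format() is a bit tidier because it handles the padding for you.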

this is all v. helpful, thank you!

Just one more question - if I have modified each of the 8-bit elements in the array and therefore created a new array of bit patterns, how can I then print the data from such an array (as binary data) to a file?

Let's say you have a list of your 8-bit elements called new_bytes. If you have already converted them back into '\x00' form (one-character strings), you can simply write them to a file like this:

fh = open( 'my_new_file.jpg', 'wb' )
for new_byte in new_bytes:
    fh.write(new_byte)
fh.close()

However, if you still need to convert each binary string back into a byte, note that hex() only accepts an integer, so parse the string first: int('10011010', 2) gives 154 (hex 0x9a), and in Python 2 chr(154) then gives the one-character string '\x9a'. (The prefix 0b denotes binary, just as 0x denotes hex.)
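Putting that together, here is a rough sketch of going from a list of 8-bit strings back to a binary file. The name bit_strings_to_bytes, the sample values, and the temp file are all made up for illustration. It uses bytearray (available from Python 2.6) so the same code should also run on Python 3; if your Python 2 build complains about writing a bytearray, writing str(payload) instead is an assumption worth trying.

```python
import os
import tempfile


def bit_strings_to_bytes(bit_strings):
    """Convert strings such as '10011010' into a bytearray, one byte each."""
    return bytearray(int(bits, 2) for bits in bit_strings)


new_bits = ['11111111', '11011000', '10011010']   # illustrative values only
payload = bit_strings_to_bytes(new_bits)

# Write to a throwaway file in binary mode, then read it back to check.
fd, out_path = tempfile.mkstemp()
os.close(fd)
with open(out_path, 'wb') as fh:
    fh.write(payload)
with open(out_path, 'rb') as fh:
    written = fh.read()
os.remove(out_path)

print(len(written))
```

The important parts are int(bits, 2) to parse each string and the 'wb' mode so that nothing gets translated on the way out.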
