Hello, I have just started learning Python.
Please can you tell me if there is a way to read in a file one bit or byte at a time (i.e. in binary).
Thank you!
If you have Python 3, then this is easy:
with open(filename, "rb") as fh:
    b = True
    while b:
        b = fh.read(1)
        #...do something with b
Thank you, however I'm using Python 2.6.1... is there a different way to do it using that?
Maybe try this... pretty much the same principle as 3.0:
fh = open('myfile.txt', 'rb')
while True:
    try:
        fh.read(1)
    except:
        pass
To be more specific, I want to read in a bit at a time from a file, and add each individual bit to a list. Can you help with this? (sorry, I am a complete beginner!)
Here's how to add items to a list:
>>> bits_list = []
>>> bits_list.append('0')
>>> bits_list.append('1')
>>> bits_list.append('1')
>>> bits_list
['0', '1', '1']
>>>
Yeah, that's what I thought. But this is the code I have - it does not seem to work, just prints out a blank line in the shell then RESTART. And then the whole shell/IDLE seemed to crash when I closed them.
The file I'm trying to read one bit at a time from is an encrypted jpeg. This is the code I have:
encrypted = []
fh = open('encjpgfile', 'rb')
while True:
    try:
        bit = fh.read(1)
        #print bit
        encrypted.append(bit)
    except:
        pass
print encrypted
sys.exit(0)
Yes, well, I forgot that read() does not raise an exception at the end of the file when it is used this way; it simply returns an empty string. So the except clause is never triggered and we're stuck in an infinite loop.
You can do something like, if bit == '': break
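As a rough sketch of that fix: the loop below stops as soon as read(1) returns an empty string. The function name read_bytes_one_at_a_time is made up for illustration, and the code is written so that it should run on Python 2.6 as well as Python 3.

```python
def read_bytes_one_at_a_time(path):
    """Return a list holding each byte of the file as a 1-byte string."""
    chunks = []
    fh = open(path, 'rb')
    try:
        while True:
            byte = fh.read(1)
            if not byte:
                # an empty string ('' on Python 2, b'' on Python 3)
                # means we have reached the end of the file
                break
            chunks.append(byte)
    finally:
        fh.close()
    return chunks
```

Calling read_bytes_one_at_a_time('encjpgfile') would give the same kind of list as 'encrypted' in the post above, without hanging at the end of the file.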
that works great, thank you! Only thing is, when I print out the 'encrypted' list, it's full of characters like '\x0e', '\xc4' etc etc... not the 1s and 0s I want, how do I get it to do this?
Sorry for all the questions, very limited knowledge I have, you've been very helpful though.
I suggest this
from binascii import hexlify
L = ["00", "01", "10", "11"]
LL = [u+v for u in L for v in L]
s = "0123456789abcdef"
D = dict((s[i], LL[i]) for i in range(16))
jpeg = open('encjpgfile', 'rb').read()
bits = ''.join(D[x] for x in hexlify(jpeg))
print(bits)
Try something along that line:
# can be used for Python25/26
def int2bin(num, bits):
    """
    returns the binary of integer num, using bits
    number of digits, will pad with leading zeroes
    """
    bs = ''
    for x in range(0, bits):
        if 2**x == 2**x & num:
            bs = '1' + bs
        else:
            bs = '0' + bs
    return bs
image_file = "red.jpg" # a testfile
try:
    # read all data into a string
    data = open(image_file, 'rb').read()
except IOError:
    print "Image file %s not found" % image_file
    raise SystemExit
bit_list = []
# create a list of padded bits
for ch in data:
    # take the int value of ch and convert to 8 bit strings
    bits8 = int2bin(ord(ch), 8)
    bit_list.append(bits8)
print(bit_list)
"""
my result -->
['11111111', '11011000', '11111111', '11100000', '00000000', ... ]
"""
Sorry about the baddy!
Yes, unfortunately Python's file handling wasn't intended for bit-by-bit reading of a file; the read function takes an optional parameter for the number of bytes to read. Since we used '1', that told Python to read the file byte-by-byte. Each of the characters that you see is a single byte of the image data.
Equipped with the knowledge that a byte consists of 8 bits, you could consider it a personal challenge to come up with a way to take each of those bytes and translate it into the corresponding bits.
\xc4 means hexadecimal C4, a.k.a. 0xC4
>>> int('c4',16)
196
>>> int('0xc4',16)
196
>>>
In fact, Python 2.6 and 3.0 both have a built-in function bin() that translates an integer directly to binary: bin(196) gives '0b11000100'.
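Here is one possible way to do the byte-to-bits translation, as a sketch only. It uses bytearray() so that looping over the data yields integers on both Python 2.6 and Python 3, and format(value, '08b') to get a zero-padded 8-digit binary string. The names bytes_to_bit_strings and bytes_to_bits are invented for this example.

```python
def bytes_to_bit_strings(data):
    """Turn raw file data into a list of 8-character '0'/'1' strings."""
    # bytearray() yields each byte as an int between 0 and 255
    return [format(value, '08b') for value in bytearray(data)]


def bytes_to_bits(data):
    """Turn raw file data into one flat list of single bits (ints 0 or 1)."""
    bits = []
    for bit_string in bytes_to_bit_strings(data):
        for ch in bit_string:
            bits.append(int(ch))
    return bits


sample = b'\x0e\xc4'
print(bytes_to_bit_strings(sample))   # ['00001110', '11000100']
print(bytes_to_bits(sample))
```

The first function keeps the bits grouped per byte, which is easier to convert back later; the second gives the "one bit per list item" form asked for earlier in the thread.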
this is all v. helpful, thank you!
Just one more question - if I have modified each of the 8-bit elements in the array and therefore created a new array of bit patterns, how can I then print the data from such an array (as binary data) to a file?
Let's say you have a list of your 8-bit binary elements called new_bytes. If you already have them back in '\x00' form, you could simply write them to a file like this:
fh = open( 'my_new_file.jpg', 'wb' )
for new_byte in new_bytes:
    fh.write(new_byte)
fh.close()
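However, if you still need to convert the binary strings back into bytes first, you can use something like chr(int('10011010', 2)), since int() with a base of 2 accepts a string of bits. Note that hex() expects an integer rather than a string, so it would be hex(0b10011010), which gives '0x9a'.
(NOTE: the notation 0b denotes binary, just as 0x denotes hex.)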
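Putting the conversion and the write together might look something like the sketch below. It assumes your modified data is a list of 8-character bit strings such as '11011000'; the function names and the output filename are only placeholders. It uses bytearray so the same code should behave on Python 2.6 and Python 3.

```python
def bit_strings_to_bytes(bit_strings):
    """Convert a list like ['11111111', '11011000'] into raw byte data."""
    # int(bits, 2) parses a string of 0s and 1s as a base-2 number
    return bytearray(int(bits, 2) for bits in bit_strings)


def write_bit_strings(path, bit_strings):
    """Write the bytes described by bit_strings to a binary file."""
    fh = open(path, 'wb')
    try:
        fh.write(bit_strings_to_bytes(bit_strings))
    finally:
        fh.close()
```

For example, write_bit_strings('my_new_file.jpg', new_bits) would write one byte per element of new_bits, where new_bits is your list of modified bit patterns.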