How do I test a byte string in Python? I want to manually convert (no libraries or functions) a UTF-8 string into UTF-16.
My basic approach is to read some number of UTF-8 bytes from the stream, convert them into code points, then convert those code points into UTF-16 bytes. I want to code this myself, but I don't understand how to test the actual byte sequence.
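To make the second half of that plan concrete, here is a rough sketch of how I imagine turning one code point into UTF-16 code units. The function name `codepoint_to_utf16_units` is my own invention, and I am assuming the code point is already available as a plain integer; byte order (big or little endian) is left out entirely.

```python
def codepoint_to_utf16_units(cp):
    """Return the UTF-16 code units (as ints) for one code point.

    Hypothetical helper, not a library function.
    """
    # Surrogates and values above U+10FFFF are not valid code points to encode.
    if 0xD800 <= cp <= 0xDFFF or cp > 0x10FFFF:
        raise ValueError("cannot encode code point: 0x%X" % cp)
    # Code points in the Basic Multilingual Plane fit in a single 16-bit unit.
    if cp < 0x10000:
        return [cp]
    # Everything else is split into a high and a low surrogate, 10 bits each.
    v = cp - 0x10000
    return [0xD800 | (v >> 10), 0xDC00 | (v & 0x3FF)]
```

For example, I would expect U+20AC to stay a single unit and U+1F600 to become the pair 0xD83D, 0xDE00.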
Let's say I use the following code to ensure I have a UTF-8 encoding (from Evan Jones' Scratch Pad: http://evanjones.ca/python-utf8.html):
s = "hello normal string"
u = unicode( s, "utf-8" )
backToBytes = u.encode( "utf-8" )
Now, I need to test the lead byte of the sequence for each character in "backToBytes", right? Is there a function that does this? Any help would be appreciated.
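To show what I mean by testing the lead byte, this is roughly what I think the check would look like if I wrote it by hand. The names `utf8_sequence_length` and `utf8_to_codepoints` are made up for illustration. I am assuming that wrapping the byte string in `bytearray` gives me integer byte values, and this sketch does not reject overlong encodings or encoded surrogates, so it is not a complete validator.

```python
def utf8_sequence_length(lead):
    """Guess the sequence length from the high bits of a lead byte (an int)."""
    if lead < 0x80:                 # 0xxxxxxx: single-byte (ASCII)
        return 1
    if (lead & 0xE0) == 0xC0:       # 110xxxxx: two-byte sequence
        return 2
    if (lead & 0xF0) == 0xE0:       # 1110xxxx: three-byte sequence
        return 3
    if (lead & 0xF8) == 0xF0:       # 11110xxx: four-byte sequence
        return 4
    # 10xxxxxx is a continuation byte; 11111xxx never appears in UTF-8.
    raise ValueError("not a UTF-8 lead byte: 0x%02X" % lead)


def utf8_to_codepoints(data):
    """Decode a UTF-8 byte string into a list of integer code points."""
    buf = bytearray(data)           # iterating a bytearray yields ints
    codepoints = []
    i = 0
    while i < len(buf):
        n = utf8_sequence_length(buf[i])
        if i + n > len(buf):
            raise ValueError("truncated sequence at offset %d" % i)
        if n == 1:
            cp = buf[i]
        else:
            # Keep only the payload bits of the lead byte (5, 4 or 3 bits).
            cp = buf[i] & (0x7F >> n)
            for b in buf[i + 1:i + n]:
                if (b & 0xC0) != 0x80:
                    raise ValueError("expected a continuation byte")
                # Each continuation byte contributes its low 6 bits.
                cp = (cp << 6) | (b & 0x3F)
        codepoints.append(cp)
        i += n
    return codepoints
```

With this I would call something like `utf8_to_codepoints(backToBytes)` and then feed each resulting code point into the UTF-16 step.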