Member Avatar for herrschteiner

Good day,

I am attempting to write a Python script that has a given hexadecimal value, and a given list of strings.

The script randomly samples the list and pulls out 3, 4-string lists. The strings are joined, converted to integers, converted to hexadecimal values, added together and the sum is compared to the given hexadecimal value.

#HexValue:
value = "%X" %  0x2e1087c4

# Character Set:
charset = ['01', '02', '03', '04', '05', '06', '07', '08', '09', '0b', '0c', '0e', '0f', '10', '11', '12', '13', '14', '15', '16', '17', '18', '19', '1a', '1b', '1c', '1d', '1e', '1f', '20', '21', '22', '23', '24', '25', '26', '27', '28', '29', '2a', '2b', '2c', '2d', '2e', '30', '31', '32', '33', '34', '35', '36', '37', '38', '39', '3b', '3c', '3d', '3e', '41', '42', '43', '44', '45', '46', '47', '48', '49', '4a', '4b', '4c', '4d', '4e', '4f', '50', '51', '52', '53', '54', '55', '56', '57', '58', '59', '5a', '5b', '5c', '5d', '5e', '5f', '60', '61', '62', '63', '64', '65', '66', '67', '68', '69', '6a', '6b', '6c', '6d', '6e', '6f', '70', '71', '72', '73', '74', '75', '76', '77', '78', '79', '7a', '7b', '7c', '7d', '7e', '7f']

# Randomly Sample Three, Four-String Lists:
ran1 = random.sample(charset,4)
ran2 = random.sample(charset,4)
ran3 = random.sample(charset,4)

# Join Lists Into Strings:
str1 = ''.join(ran1)
str2 = ''.join(ran2)
str3 = ''.join(ran3)

# Convert Each String To Integer:
int1 = int(str1, 16)
int2 = int(str2, 16)
int3 = int(str3, 16)

# Convert Each Integer to Hexadecimal:
hex1 = "%X" % int1
hex2 = "%X" % int2
hex3 = "%X" % int3

#Add Hexadecimal Values And Compare Equality to HexValue:
while True:
    hex1 + hex2 + hex3 == value
print hex1, "+", hex2, "+", hex3, "=", value

I'm running the script in Python 2.6.4 on Windows. This is the first Python script that I've ever written from start to finish.

I've read on these forums that I may be running into something akin to the Travelling Salesman Problem.

My questions are these:

- Should I be using random.sample, which I've read samples without replacement, or should I be using scipy.maxentropy.maxentutils.sample_wr, which samples with replacement?

- As I am a complete newb, should I give up on this script, and try to find another way to find three hex numbers whose sum is a given hex number?

The script seems to be doing exactly what you describe it is. What's the problem?

If you want to be sure that you only sample the same three values once, then you will have to keep track of the threesome somewhere, as each individual value could be used in another threesome. I don't know how many possible 4 digit permutations, randomly associated in triples there are, but it is a lot and probably would require more computing power than you have. You might want to consider testing for one group of the three at a time. That is, break the value you are testing against into it's component parts and find each one individually. Even better would be testing for the one-twelfth, each hex, individually. That should reduce the time from months to minutes. Finally, note the the "while True" loop never ends.

There are 121*120*119*118 = 203889840 choices for str1, although I don't understand why the digits in str1 need to be different.

Since hex numbers '87' and 'c4' are not even in your charset, it could take you a very long time to match. :)

You could use this little test program ...

import random

# search for this upper case HexValue string:
# use hex digit pairs that are in the charset
value = "%X" %  0x2e1077
print value  # test upper case hex

# Character Set:
charset = [
'01', '02', '03', '04', '05', '06', '07', '08', '09', '0b', '0c', 
'0e', '0f', '10', '11', '12', '13', '14', '15', '16', '17', '18', 
'19', '1a', '1b', '1c', '1d', '1e', '1f', '20', '21', '22', '23', 
'24', '25', '26', '27', '28', '29', '2a', '2b', '2c', '2d', '2e', 
'30', '31', '32', '33', '34', '35', '36', '37', '38', '39', '3b', 
'3c', '3d', '3e', '41', '42', '43', '44', '45', '46', '47', '48', 
'49', '4a', '4b', '4c', '4d', '4e', '4f', '50', '51', '52', '53', 
'54', '55', '56', '57', '58', '59', '5a', '5b', '5c', '5d', '5e', 
'5f', '60', '61', '62', '63', '64', '65', '66', '67', '68', '69', 
'6a', '6b', '6c', '6d', '6e', '6f', '70', '71', '72', '73', '74', 
'75', '76', '77', '78', '79', '7a', '7b', '7c', '7d', '7e', '7f']

n = 0
while True:
    ran1 = random.sample(charset, 3)
    str1 = ''.join(ran1).upper()
    #print(str1)  # test
    if str1 == value:
        break
    n += 1

print("It took %d samples to find %s" % (n, value))

With 4 * 2 hex digits this brute force approach can take a long time, so I sampled only 3 pairs.

Member Avatar for herrschteiner

After mulling over; the logic I'm trying to get this script to perform, the Python books I have, many searches, and your replies (thank you all for your replies):

@The_Kernel - The way I wrote the script initially is bad, and wrong. As woooee points out, the while loop will never end, mainly because all my script is doing is randomly sampling, joining, etc Once, and then comparing equality to the given value (I think).

I've modified the script like so:
- Define the initial variable as a base16 integer (for convenience - strips out most of the conversion statements)
- Randomly sample, with replacement, 3, 4-string lists from the character set
- Convert the 3 lists to 3 base16 integers
- Add the 3 integers and compare their sum to the initial variable

Here is the script now:

import scipy.maxentropy.maxentutils

#Initial Base16 Integer Value of Hexadecimal Value:
value1 = 772835268  #2E1087C4

# Character Set:
charset = [
'01', '02', '03', '04', '05', '06', '07', '08', '09', '0b', '0c', 
'0e', '0f', '10', '11', '12', '13', '14', '15', '16', '17', '18', 
'19', '1a', '1b', '1c', '1d', '1e', '1f', '20', '21', '22', '23', 
'24', '25', '26', '27', '28', '29', '2a', '2b', '2c', '2d', '2e', 
'30', '31', '32', '33', '34', '35', '36', '37', '38', '39', '3b', 
'3c', '3d', '3e', '41', '42', '43', '44', '45', '46', '47', '48', 
'49', '4a', '4b', '4c', '4d', '4e', '4f', '50', '51', '52', '53', 
'54', '55', '56', '57', '58', '59', '5a', '5b', '5c', '5d', '5e', 
'5f', '60', '61', '62', '63', '64', '65', '66', '67', '68', '69', 
'6a', '6b', '6c', '6d', '6e', '6f', '70', '71', '72', '73', '74', 
'75', '76', '77', '78', '79', '7a', '7b', '7c', '7d', '7e', '7f']

n = 0
while True:
# Randomly Sample Three, Four-String Lists With Replacement:
    ran1 = scipy.maxentropy.maxentutils.sample_wr(charset,4)
    ran2 = scipy.maxentropy.maxentutils.sample_wr(charset,4)
    ran3 = scipy.maxentropy.maxentutils.sample_wr(charset,4)

# Convert Each List To Base16 Integer:
    int1 = int(''.join(ran1), 16)
    int2 = int(''.join(ran2), 16)
    int3 = int(''.join(ran3), 16)

# Add the Three Integers and Compare Equality to Initial Base16 Value:
    if int1 + int2 + int3 == value1:
	    break
    n+= 1

print("It took %d samples to find %s" % (n, value1))
print int1, "+", int2, "+", int3, "=", value1

Thank you @vegaseat. This is going to be my "I don't know Python very well" question. Do the n = 0, n+= 1 statements continue the loop until the three correct values are found?

Knowing that this will take possibly forever to run, I just want to know that the logic of the script is sound. The script (as I previously wrote it) had been running for days, and I'm not sure if my script is bad, or if it's good, and just takes an inordinate amount of time to run.

Thank you all again.

For testing, cut the list down to 10 or 20 hex's and enter a value within that range. You are testing, so you want results in a reasonable amount of time. Also, note that some values are still missing. The following code will print the missing values. It can also be modified to generate the list instead of keying it manually.

charset = [
'01', '02', '03', '04', '05', '06', '07', '08', '09', '0b', '0c', 
'0e', '0f', '10', '11', '12', '13', '14', '15', '16', '17', '18', 
'19', '1a', '1b', '1c', '1d', '1e', '1f', '20', '21', '22', '23', 
'24', '25', '26', '27', '28', '29', '2a', '2b', '2c', '2d', '2e', 
'30', '31', '32', '33', '34', '35', '36', '37', '38', '39', '3b', 
'3c', '3d', '3e', '41', '42', '43', '44', '45', '46', '47', '48', 
'49', '4a', '4b', '4c', '4d', '4e', '4f', '50', '51', '52', '53', 
'54', '55', '56', '57', '58', '59', '5a', '5b', '5c', '5d', '5e', 
'5f', '60', '61', '62', '63', '64', '65', '66', '67', '68', '69', 
'6a', '6b', '6c', '6d', '6e', '6f', '70', '71', '72', '73', '74', 
'75', '76', '77', '78', '79', '7a', '7b', '7c', '7d', '7e', '7f']

k_range = [ str(x) for x in range(0, 10) ]
k_range.extend(['a', 'b', 'c', 'd', 'e', 'f'])
print k_range
for j in range(0, 8):
    for k in k_range:
        x = str(j) + k
        if x not in charset:
            print x
Member Avatar for herrschteiner

For testing, cut the list down to 10 or 20 hex's and enter a value within that range. You are testing, so you want results in a reasonable amount of time. Also, note that some values are still missing. The following code will print the missing values. It can also be modified to generate the list instead of keying it manually.

charset = [
'01', '02', '03', '04', '05', '06', '07', '08', '09', '0b', '0c', 
'0e', '0f', '10', '11', '12', '13', '14', '15', '16', '17', '18', 
'19', '1a', '1b', '1c', '1d', '1e', '1f', '20', '21', '22', '23', 
'24', '25', '26', '27', '28', '29', '2a', '2b', '2c', '2d', '2e', 
'30', '31', '32', '33', '34', '35', '36', '37', '38', '39', '3b', 
'3c', '3d', '3e', '41', '42', '43', '44', '45', '46', '47', '48', 
'49', '4a', '4b', '4c', '4d', '4e', '4f', '50', '51', '52', '53', 
'54', '55', '56', '57', '58', '59', '5a', '5b', '5c', '5d', '5e', 
'5f', '60', '61', '62', '63', '64', '65', '66', '67', '68', '69', 
'6a', '6b', '6c', '6d', '6e', '6f', '70', '71', '72', '73', '74', 
'75', '76', '77', '78', '79', '7a', '7b', '7c', '7d', '7e', '7f']

k_range = [ str(x) for x in range(0, 10) ]
k_range.extend(['a', 'b', 'c', 'd', 'e', 'f'])
print k_range
for j in range(0, 8):
    for k in k_range:
        x = str(j) + k
        if x not in charset:
            print x

I thank you for your reply sir. I think I've made unclear exactly the task I'm trying to accomplish.

The character set has no missing values.

After the script randomly samples and joins three, separate string object lists, it then converts each string object list into an integer.

The integers are then tested to see if their sum is equivalent to the originally defined integer.

In my latest script, I'm using base16 integers, because I can easily convert them to hex with something like Windows calc.exe or KCalc once the script is done (if ever).
(i.e. the integer 772835268 = the hex number 2E1087C4)

The task is:

Given a hexadecimal number, and an incomplete set of hexadecimal characters, find three hexadecimal numbers, whose sum is the given hex number. This is a reverse arithmetic brute force script.

The script is running, but will take a really long time to finish. That is, even if I've written the script correctly.

Thanks again

PS - thank you for the time you spent on the missing characters code you posted. I apologize for not being clearer earlier.

A few remarks, because I think the brute force approach is definitely not the good one. First you are not working with base 16 digits, but with base 256 digits because each of your characters has 2 hexadecimal digits. Second, how would you do in base 10 and an incomplete set of digits to find 3 numbers with a given sum ? This is an incomplete addition. Write it on paper, and will see that you won't use a brute force approach.
When you add 3 digits, there is a carry which can only be 0, 1, 2. The same holds in base 256. Since your result has only 4 digits(256), you are left with 3**3 = 27 possibilities for these carry numbers. For each of this 27 possibilities, you only need to find (4 times) 3 digits which sum is a given number, which is a very simple problem.
So you don't need to go through the hundreds of million cases of the brute force approach.

Member Avatar for herrschteiner

A few remarks, because I think the brute force approach is definitely not the good one. First you are not working with base 16 digits, but with base 256 digits because each of your characters has 2 hexadecimal digits. Second, how would you do in base 10 and an incomplete set of digits to find 3 numbers with a given sum ? This is an incomplete addition. Write it on paper, and will see that you won't use a brute force approach.
When you add 3 digits, there is a carry which can only be 0, 1, 2. The same holds in base 256. Since your result has only 4 digits(256), you are left with 3**3 = 27 possibilities for these carry numbers. For each of this 27 possibilities, you only need to find (4 times) 3 digits which sum is a given number, which is a very simple problem.
So you don't need to go through the hundreds of million cases of the brute force approach.

Wow. You just tilted my brain. But I think I see what you are saying sir.

I was hung up on the base16 (0-9 A-F) definition of hexadecimal. But if I translate each of the two-hex-digits in the list to their decimal equivalents, and start my own "brute forcing" on paper, I think I can come up with an answer.

I will try this now. And yes sir, I am a mathematical neanderthal.

I thank you very much.

Be a part of the DaniWeb community

We're a friendly, industry-focused community of developers, IT pros, digital marketers, and technology enthusiasts meeting, networking, learning, and sharing knowledge.