The_Kernel 33 Light Poster

Just on first look your main while loop in the scan function isn't right.

Try changing it to:

for i in range(times):
    # nlist1 is assumed to hold the four octets as ints
    host = ".".join(str(octet) for octet in nlist1)

    while threading.activeCount() >= threads:
        time.sleep(.1)

    Scanner(host, self.port).start()
    # Python has no ++ operator, so add 1 explicitly and
    # carry into the next octet only when one wraps around to 0
    nlist1[3] = (nlist1[3] + 1) % 256
    if nlist1[3] == 0:
        nlist1[2] = (nlist1[2] + 1) % 256
        if nlist1[2] == 0:
            nlist1[1] = (nlist1[1] + 1) % 256
            if nlist1[1] == 0:
                nlist1[0] = (nlist1[0] + 1) % 256
The_Kernel 33 Light Poster

sum is actually a built-in function, so you can easily change it to a single line: sum(range(1, n+1, 2))
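
As a quick sanity check, the one-liner can be compared against an explicit loop. This is only a sketch: n = 9 is an example value I picked, and print is written with parentheses so the snippet runs on both Python 2 and 3.

```python
n = 9  # example upper bound (an assumption, not from the original thread)

# Explicit loop: add up every odd number from 1 to n
total = 0
for number in range(1, n + 1, 2):
    total += number

# Built-in version: the same sum in a single expression
one_liner = sum(range(1, n + 1, 2))

print(total == one_liner)
```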

The_Kernel 33 Light Poster

without you posting your code it's hard to know that <hint, hint>

The_Kernel 33 Light Poster

I don't think you can just dump a python array into the dll's function. Try initializing graydata like this:

graydata_type = ctypes.c_ubyte * (w * h)
graydata = graydata_type()
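
If the pixel values already sit in a Python list, one way to get them into the buffer is to unpack them into the array constructor. A minimal sketch follows; w, h and pixels are made-up stand-ins, and how the DLL function expects to receive the buffer is not something I can tell from the thread.

```python
import ctypes

w, h = 4, 3  # example image size (assumed values)
pixels = [x % 256 for x in range(w * h)]  # stand-in for the grayscale data

# A ctypes array type is created by multiplying an element type by a length
graydata_type = ctypes.c_ubyte * (w * h)

# Passing the values as positional arguments initializes the buffer
graydata = graydata_type(*pixels)

print(len(graydata))   # number of elements in the buffer
print(list(graydata))  # contents copied back into a Python list
```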
The_Kernel 33 Light Poster

A class variable is shared across instances of the class, while an instance variable is unique to that instance of the class. Consider the following example:

class Foo:
    name = "FOO"
    
    def __init__(self, value):
        self.value = value
        
        
mine = Foo("bar")
mine_2 = Foo("taz")

print mine.value, mine.name
print mine_2.value, mine_2.name

Foo.name = "BAR"

print mine.value, mine.name
print mine_2.value, mine_2.name

#output is:
# bar FOO
# taz FOO
# bar BAR
# taz BAR
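
One related gotcha, shown as a small sketch using the same Foo class: assigning to the name through an instance does not change the class variable, it creates an instance attribute that hides it for that one object. The print calls use parentheses so this runs on Python 2 and 3.

```python
class Foo(object):
    name = "FOO"  # class variable, shared by all instances

    def __init__(self, value):
        self.value = value  # instance variable, one per object


mine = Foo("bar")
mine_2 = Foo("taz")

# Assigning through an instance creates a new instance attribute
# that hides the class variable for that one object only
mine.name = "MINE"

print(mine.name)    # looked up on the instance
print(mine_2.name)  # still falls back to the class
print(Foo.name)
```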
The_Kernel 33 Light Poster

Use the "in" keyword.

# will return a positive for words like "strunk"
found = False
for word in ["trunk", "branches"] :
    if word in words[1]:
        found = True
if not found:
    print "processing this"

Wouldn't it make sense to do this the opposite way? i.e.

if words[1] in ["trunk", "branches"]:
    found = True
else:
    found = False
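
To make the difference between the two tests concrete, here is a small sketch. The words list is a hypothetical input of mine; only words[1] matters.

```python
words = ["svn", "strunk"]  # hypothetical input; words[1] is the name being tested

# Substring test: "trunk" is found inside "strunk"
substring_match = any(word in words[1] for word in ["trunk", "branches"])

# Membership test: words[1] must equal one of the list entries exactly
exact_match = words[1] in ["trunk", "branches"]

print(substring_match)
print(exact_match)
```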
The_Kernel 33 Light Poster

The script seems to be doing exactly what you describe it is. What's the problem?

The_Kernel 33 Light Poster

Try:

regex = re.compile('"([^"]+)"')
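
A quick way to see what that pattern captures, using a made-up test string (the sample text is my own, not from the original question):

```python
import re

# Capture one or more non-quote characters between a pair of double quotes
regex = re.compile('"([^"]+)"')

sample = 'say "hello" and then "good bye"'  # made-up test string
matches = regex.findall(sample)

print(matches)
```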
The_Kernel 33 Light Poster

It's failing because you're trying to subtract an int (Bet) from a string (money). You need to convert the string you read in from the money file into an int. You can use the built-in function int(string) to do that.
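
Roughly like this. The values are placeholders, since I don't know what the money file actually contains:

```python
money_text = "100\n"  # stands in for the line read from the money file
bet = 25              # example bet (assumed value)

# int() ignores surrounding whitespace, including the trailing newline
money = int(money_text)

money = money - bet
print(money)
```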

The_Kernel 33 Light Poster

Here's what I'd do. By converting the sorted list into a dictionary you'll end up with the highest value for each key in the list

Whoops, looking back on the original post this isn't correct. I was finding the overlapping items.

The basic approach is the same though: sort lists, convert to dictionaries, iterate through keys and values of first dictionary and compare to second's.

list1 = [('1201', '3'), ('1101', '4'), ('1101', '2'), ('1101', '3'), ('4472', '3'), ('4472', '1'), ('4472', '2'), ('4472', '0'), ('5419', '2')]
list2 = [('1101', '3'), ('4472', '5'), ('5419', '3'), ('453', '3')]

store1 = dict(sorted(list1))
store2 = dict(sorted(list2))

for key, value in list(store1.items()):  # iterate over a copy, since the loop deletes keys
    if (key in store2) and (store2[key] > value):
        del store1[key]
        
print store1.items()
# output is: [('1201', '3'), ('1101', '4')]
The_Kernel 33 Light Poster

Here's what I'd do. By converting the sorted list into a dictionary you'll end up with the highest value for each key in the list

list1 = [('1201', '3'), ('1101', '4'), ('1101', '2'), ('1101', '3'), ('4472', '3'), ('4472', '1'), ('4472', '2'), ('4472', '0'), ('5419', '2')]
list2 = [('1101', '3'), ('4472', '5'), ('5419', '3'), ('453', '3')]

store1 = dict(sorted(list1))
store2 = dict(sorted(list2))

for key, value in list(store1.items()):  # iterate over a copy, since the loop deletes keys
    if key not in store2:
        del store1[key]
    elif value < store2[key]:
        store1[key] = store2[key]
        
print store1
# output is: "{'4472': '5', '5419': '3', '1101': '4'}"
The_Kernel 33 Light Poster

Short answer: you don't do this. Usually when you want to get values from a thread you pass it a queue, and have your main thread get values from the same queue.

A couple of other things: you're calling join(), which blocks until the thread is finished, so the while loop directly after it is unnecessary. Also, don't use "str" as a variable name, since it's already the name of a built-in.

import threading
import Queue

def stringFunction(value, out_queue):
    my_str = "This is string no. " + value
    out_queue.put(my_str)

my_queue = Queue.Queue()
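# Pass the function and its arguments separately; calling it here would run it in the main thread
thread1 = threading.Thread(target=stringFunction, args=("one", my_queue))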
thread1.start()
thread1.join()

func_value = my_queue.get()
print func_value
The_Kernel 33 Light Poster

Just to add on, the carriage return character '\r' actually moves the cursor to the start of the current line. So in this case you could just print the number (without a newline), wait a couple seconds, print a carriage return then print the next number.
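
Something along these lines. It's a sketch only: countdown and its parameters are names I made up, and the stream argument is there just so the output can be redirected.

```python
import sys
import time


def countdown(start, delay=1.0, stream=sys.stdout):
    """Overwrite the same console line with each number in turn."""
    for number in range(start, 0, -1):
        # '\r' returns the cursor to column 0; no newline is written
        stream.write("\r%d " % number)
        stream.flush()  # force the text out even though the line is unfinished
        time.sleep(delay)
    stream.write("\n")


countdown(3, delay=0.1)
```

The trailing space after the number wipes a leftover digit when the count drops from two digits to one.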

The_Kernel 33 Light Poster

I'd have a separate thread that looks for changes to the file. When it sees a change the thread should send a signal. You can then have your treeview listen for that signal and call a handler to update what's displayed. See http://zetcode.com/tutorials/pygtktutorial/signals/ for more info on creating your own signals.

The_Kernel 33 Light Poster

I know there are modules out there for sorted dictionaries, but out of curiosity why do you want to do this?

The_Kernel 33 Light Poster

Always use absolute path+file names.

filePath = "Dataset/parameter feature vectors"
for fname in os.listdir(filePath):
    complete_name = os.path.join(filePath, fname)
    data_str = open(complete_name).read()
    index = data_str.find("female")
    if index != -1:
        females.append(index)
        print fname
    else:
        print "append the ones that aren't female to a males"

I don't think this is doing what bol0gna wants actually. In the original post bol0gna wants to sort on the filenames, while your code is actually searching the content of each file. Here's a version that works on the filename:

import os

females = []
males = []
filePath = "Dataset/parameter feature vectors"
for fname in os.listdir(filePath):
    # Test 'female' first, because 'male' is also a substring of 'female'
    if 'female' in fname:
        females.append(fname)
    elif 'male' in fname:
        males.append(fname)
The_Kernel 33 Light Poster

hi
what version of python are you using, because i noticed your print statement is different. Maybe that's the reason. I tried it on python 2.6 and it seems to work :)

The reason this is happening is that "input" in python 3 works the same as "raw_input" in python 2.x, which takes the input and puts it into a string. In python 2.x the "input" function actually treats the user's input as literal python commands. So when you input "2, 3" to "input" it sees it not as a string but as you defining a tuple, which is why the unpacking works.
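
If you want the "2, 3" style of input to keep working on Python 3, you have to split and convert the string yourself. A minimal sketch, where parse_pair is a helper name I made up:

```python
def parse_pair(text):
    """Turn a string such as "2, 3" into a tuple of two ints."""
    first, second = [int(part) for part in text.split(",")]
    return first, second


# In a real script the text would come from input() (raw_input() on Python 2)
a, b = parse_pair("2, 3")
print(a + b)
```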

The_Kernel 33 Light Poster

It looks like there are a couple of problems in your play_choice function:

def play_choice():
    play_ch = raw_input #You're setting "play_ch" to the actual raw_input function. To get the return value from the function it should be "play_ch = raw_input()"
    while play_ch != 1 and play_ch != 2 and play_ch != 3:
        print 'Invalid choice'
        play_ch = ( input( 'Enter a valid choice:' ) ) # Should be using "raw_input" function here.
    return play_ch
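
For reference, here is one way the function could look with both problems fixed. Treat it as a sketch: the read parameter is my own addition so the function can be tried without a keyboard, and I'm assuming the valid choices are 1, 2 and 3 as in your loop condition. On Python 2 pass raw_input as read, since the default input there evaluates what the user types.

```python
def play_choice(read=input):
    """Keep asking until the user enters 1, 2 or 3; return it as an int."""
    # The reader returns a string, so compare against strings
    choice = read("Enter your choice (1-3): ").strip()
    while choice not in ("1", "2", "3"):
        print("Invalid choice")
        choice = read("Enter a valid choice: ").strip()
    return int(choice)


# Scripted demo: two bad answers followed by a good one
scripted = iter(["7", "x", "2"])
picked = play_choice(read=lambda prompt: next(scripted))
print(picked)
```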
The_Kernel 33 Light Poster

Here's how I would approach this:

(1) for each entry in L2 find all the matching sub-lists from L1. You can use a list comprehension or a for loop for this.
(2) Add all the sub-lists that matched into a single list
(3) Reduce the list containing all the sub-lists to just the unique entries.

To get you started here's the code for the first step using a for loop:

L1 = [[1,2,3],[4,5,6],[7,8,9],[10,11,12]]
L2 = [1,2,3,4,11]

for item in L2:
    # entry will contain all the sub lists from L1 that contain the item
    entry = []
    for sub_list in L1:
        if item in sub_list:
            entry.append(sub_list)
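
If you get stuck, here is one possible way to finish all three steps. It's just a sketch of the approach described above, not the only way to do it; matched and unique are names I picked.

```python
L1 = [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]]
L2 = [1, 2, 3, 4, 11]

# Steps 1 and 2: collect every sub-list that contains an item from L2
matched = []
for item in L2:
    for sub_list in L1:
        if item in sub_list:
            matched.append(sub_list)

# Step 3: keep only the first occurrence of each sub-list
unique = []
for sub_list in matched:
    if sub_list not in unique:
        unique.append(sub_list)

print(unique)
```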
The_Kernel 33 Light Poster

Yeah, looks pretty nice for just throwing it together. I tried running it on py3 but it crashes for some reason. Maybe something is different from the version you are running. I'll definitely check out the queue module though.

Yeah, I'm running 2.5. To make it work in py3 just change "Queue" to "queue" on line 3, "Queue.Queue()" to "queue.Queue()" on line 31, and the "print" commands on lines 24 and 27 need to be functions (just wrap the args in parentheses).

The_Kernel 33 Light Poster

Not sure why your version is running slowly. Here's a quick and dirty version I just came up with using threads and a queue. It runs very quickly for me (less than a second I'd guess).

import socket
import threading
import Queue
import time

class Worker(threading.Thread):
    def __init__(self, port_queue):
        threading.Thread.__init__(self)
        self.port_queue = port_queue
        
    def run(self):
        while True:
            item = self.port_queue.get()
            
            if item is False:
                break
            
            address, port = item
            s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

            try:
                s.connect((address, port))
            except socket.error:
                print "%d: down" % port
                continue
            
            print "%d: up" % port
            s.close()

if __name__ == "__main__":
    port_queue = Queue.Queue()
    num_threads = 4
    
    # Create worker threads and start them
    for i in range(0, num_threads):
        Worker(port_queue).start()

    # Add all the ports to scan to the queue
    for i in range(1, 201):
        port_queue.put(('localhost', i))

    # Finally add False once for each thread so that they will stop
    for i in range(0, num_threads):
        port_queue.put(False)

    # Just sit here doing nothing until nothing is left in the queue
    while port_queue.qsize() != 0:
        time.sleep(.1)
The_Kernel 33 Light Poster

You can use the builtin function setattr to do this. See http://docs.python.org/library/functions.html#setattr.
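
A short illustration, with a made-up class and made-up attribute names:

```python
class Settings(object):
    """Empty container class used only for this illustration."""
    pass


config = Settings()
options = {"host": "localhost", "port": 8080}  # made-up attribute names and values

# setattr(obj, name, value) is equivalent to obj.<name> = value,
# but the attribute name can be a string chosen at run time
for attr_name, attr_value in options.items():
    setattr(config, attr_name, attr_value)

print(config.host)
print(config.port)
```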

The_Kernel 33 Light Poster

Opening it unbuffered means that when you call the file's write function the data will be written to disk immediately. This means that you could potentially be accessing the hard drive more frequently. If you are calling flush after each write already though then there won't be any difference there.

You can also open a file in line buffer mode by changing the '0' to a '1'. As the name implies this will buffer writing until you've written an entire line. Depending on what you're doing that might be a good middle ground.
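
Here is a small sketch of line buffering in action. The temporary file is only there for the demo; it reads the file back through a second handle while the first is still open to show that the finished line has already been written out.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "log.txt")  # throwaway file for the demo

# A buffering argument of 1 selects line buffering for text files
out = open(path, "w", 1)
out.write("first line\n")  # the newline triggers a flush

# Read the file back through a second handle while `out` is still open
with open(path) as check:
    seen = check.read()

out.close()
print(repr(seen))
```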

SoulMazer commented: Continued, quality support. +1
The_Kernel 33 Light Poster

You can also solve this problem by opening the file as unbuffered: "open('filename', 'w', 0)"

The_Kernel 33 Light Poster

Well, here's how I'd change your code. Let me know if this isn't what you're after or if it doesn't work:

from __future__ import with_statement

with open ('C:\\Documents and Settings\\Desktop\\file2.txt') as fil:
    f = fil.readlines()
    result = []
    for line in f:
        line = line.split()
        if len(line) >= 3:
            i = float(line[2])
            if 90 <= i < 100:
                line = ' '.join(line)
                result.append(line)
            
with open('C:\\Documents and Settings\\Desktop\\j3.txt','w') as resultfile:
    resultfile.write('\n'.join(result))
The_Kernel 33 Light Poster

The problem is that you're just assigning "line" to "result" each time, hence "result" will only contain the last line you want to print. What you want to do is make "result" a list, then append the matching line to it. Then when you want to write it all to a file you can just do outfile.write('\n'.join(result))

The_Kernel 33 Light Poster

If you're getting an IndexError it means the location you're referencing in the list doesn't exist. You assume each line has at least three columns, but I'd bet there's an empty line in there that's screwing you up. It's easy to fix in any case: just check that the length of the list is at least three. Here's your code, plus the fix:

from __future__ import with_statement

with open ('C:\\Documents and Settings\\Desktop\\file2.txt') as fil:
    f = fil.readlines()
    for line in f:
        line = line.split()
        if len(line) >= 3:
            i = float(line[2])
            if 90 <= i < 100:  # range() doesn't accept floats, so compare directly
                print line
The_Kernel 33 Light Poster

The problem is that you're creating a raw string, which isn't doing what you think it will. Instead of allowing you to create a hex value attached to the '\x' escape, it's instead creating a string that contains the characters exactly as '\' + 'x' + 'f' + '4'.

Here's a version that will do what you want (I hope :) ):

import re 

def convert_to_unicode(match_obj):
    raw_chr_val = match_obj.group(1)
    return unichr(int(raw_chr_val, 16))

def func(mylist):
    for e in mylist:
        e = re.sub(' ?&# x([0-9a-f]*);', convert_to_unicode, e) 
        print e

mylist = ['12 angry men', 'Rash &# xf4;mon']
func(mylist)
vegaseat commented: nice solution +12
The_Kernel 33 Light Poster

I think your problem is that in the call to re.sub you're replacing the match with a raw string. Try removing the 'r' prefacing the replacement string.
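
You can see the difference between the two literal forms without involving re.sub at all. A tiny sketch (the u prefix is there so it behaves the same on Python 2 and 3):

```python
raw = r"\xf4"     # raw string: backslash, 'x', 'f', '4' kept as four characters
cooked = u"\xf4"  # normal literal: the escape becomes the single character U+00F4

print(len(raw))
print(len(cooked))
print(list(raw))
```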

The_Kernel 33 Light Poster

Won't the process of opening and closing the file cause it to get overwritten, instead of the line being appended?

Not if you open it in append mode. That means that anything previously in the file will stay there, which might not be desirable on the first write (since you presumably want the file to contain only data from the current run). To fix this, just keep track of whether you've written to the file before: if you have, open it in append mode; otherwise open it in write mode.
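
A rough sketch of that bookkeeping. The write_line helper, the already_written set and the temporary directory are all my own inventions for the example; the .bed names just follow your description.

```python
import os
import tempfile

out_dir = tempfile.mkdtemp()  # throwaway directory for the demo
already_written = set()       # names of files written during this run


def write_line(name, line):
    """Write `line` to <name>.bed, truncating the file only on first use."""
    mode = "a" if name in already_written else "w"
    with open(os.path.join(out_dir, name + ".bed"), mode) as handle:
        handle.write(line + "\n")
    already_written.add(name)


write_line("AluSp", "row 1")
write_line("AluGp", "row 2")
write_line("AluSp", "row 3")  # appended, because AluSp was written before

with open(os.path.join(out_dir, "AluSp.bed")) as handle:
    print(handle.read())
```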

The_Kernel 33 Light Poster

Ok, here is what I am doing:

First my code scans a large data file. In column 10, the genetic data is listed by "name", and there are about 110 different names total. Whenever it gets to a new name, it stores it in a list, so I have something like:

Name_List = ['AluSp', 'AluGp', 'AluSx' ... 'ZZcta' ]

For each name in the name list, I want to store a file of the same name, eg:

AluSp.bed, AluGp.bed, AluSx.bed...ZZcta.bed

The program will rescan the data file, and for each name, that line gets written into the appropriate file.

Because the raw data file is so large, I can't just open up one file for a name in the list, scan the whole data file, append entries for only the first name (AluSp.bed) and then close. Doing so would require a 110 scans of a 50 million line file. What I am doing is scanning the raw data file once, while all the 110 name files remain open, and for each line in the data file, that line gets written to the appropriate name file.

Is that clear?

The code is in place, so if you'd prefer to look at it, I can post it.

There's no need to keep all the files open at the same time though. Instead of having a list of 100 file handles replace it with a list of the filenames, then each time you want to write to one of them just open the file, write …

The_Kernel 33 Light Poster

No.

Care to elaborate?

The_Kernel 33 Light Poster

I don't know if you'll be able to do exactly what you're talking about. However, once you have the sorted list of keys you could always do the following (to start iterating at the letter 'g' in the example):

my_dict = { 'z': 1,  'a': 2,  'g': 3 }
sorted_keys = sorted(my_dict.keys())

index = sorted_keys.index('g')

for key in sorted_keys[index:]:
    print '%s = %s' % (key, my_dict[key])
The_Kernel 33 Light Poster

Basically you just want to put the keys from the dictionary into a list and sort that list. Then iterate through the sorted list of keys, like this:

my_dict = { 'z': 1,  'a': 2,  'g': 3 }
sorted_keys = sorted(my_dict.keys())

for key in sorted_keys:
    print '%s = %s' % (key, my_dict[key])

running this results in:

a = 2
g = 3
z = 1
The_Kernel 33 Light Poster

To split up the line from your first post here's what I'd do (this will also strip out the quotes from the beginning and end of each entry):

data = """'Corn For Grain', 'Irrigated', '1970', 'Colorado', 'Chaffee', '8', '10', '15', '11199199', '1', '', '100 acres', '75 bushel', '7500 bushel', '', ''"""

data_split = [entry.strip(" '") for entry in data.split(',')]
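
One caveat with splitting on commas: it breaks if an entry itself contains a comma. If that can happen in your data, the csv module is an alternative (a different technique from the split above). A sketch with a shortened, made-up sample line:

```python
import csv

# Shortened sample in the same format; the first entry contains a comma
data = "'Corn, For Grain', 'Irrigated', '1970', ''"

# quotechar="'" treats single quotes as field quotes;
# skipinitialspace=True drops the blank after each comma
reader = csv.reader([data], quotechar="'", skipinitialspace=True)
data_split = next(reader)

print(data_split)
```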

EDIT:
Looking at your code, why do you have a nested for loop? It seems to have no purpose, and may actually cause you problems.

The_Kernel 33 Light Poster

Seems like a good case for a regular expression:

import re

test_input = "AND Category 07|Spec 01|ABC 01 AND Category 07|Spec 02|XYZ 02 AND Category 07|Spec 03|PQR 03 "
test_output = re.sub(r'\|[A-Z]{3} [0-9]{2}', '', test_input)
# test_output is now: "AND Category 07|Spec 01 AND Category 07|Spec 02 AND Category 07|Spec 03 "
sneekula commented: very good +8
The_Kernel 33 Light Poster

There's always a more clever way :-)

start = 10
end = 20

open(outfile, 'w').writelines(open(infile).readlines()[start:end])
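
For big files, a less clever but gentler variant is itertools.islice, which avoids reading the whole input into memory and closes both files explicitly. This is a sketch using throwaway files created just for the demo; the file names and contents are made up.

```python
import itertools
import os
import tempfile

work_dir = tempfile.mkdtemp()  # throwaway directory for the demo
infile = os.path.join(work_dir, "in.txt")
outfile = os.path.join(work_dir, "out.txt")

# Create a 30-line sample input file
with open(infile, "w") as src:
    src.writelines("line %d\n" % i for i in range(30))

start = 10
end = 20

with open(infile) as src, open(outfile, "w") as dst:
    # islice yields lines start..end-1 without loading the whole file
    dst.writelines(itertools.islice(src, start, end))

with open(outfile) as result:
    copied = result.readlines()
print(len(copied))
```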
vegaseat commented: clever indeed, thanks +12