hughesadam_87 54 Junior Poster

Good to hear. Perhaps mark this thread as solved? If you have further pandas issues, I'd recommend posting to the pandas mailing list, which is very active and extremely helpful.

hughesadam_87 54 Junior Poster

Your best bet is to load all the data files into a pandas DataFrame. This is the de facto structure for handling labeled array data in Python, and pandas also has a ton of utilities for data manipulation, especially for timestamps.

Post an example of one of your files, maybe the first 30 lines including the header, and I can help you get it into a pandas DataFrame.

http://pandas.pydata.org/

hughesadam_87 54 Junior Poster

Can you elaborate on this? Your question isn't clear to me.

hughesadam_87 54 Junior Poster

I noticed a few things.

First, there's no need to put the entire code into a function (e.g. main). This is fine, but not mandatory. If you do, there's a special line you should add at the bottom of your code so that main() won't be executed when this program is imported from another program. It will only be executed if you run the program directly (i.e. python thisprogram.py).

Change the bare main() call at the bottom to:

if __name__=='__main__':
    main()

Note that the match between the names main() and '__main__' is just a coincidence. If your main() function were called main_insane(), it would still be:

if __name__=='__main__':
    main_insane()
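To make the guard concrete, here is a minimal sketch in Python 3 syntax. The body of main() is a made-up stand-in, not your actual program:

```python
def main():
    """Hypothetical stand-in for the program's real work."""
    return "running as a script"


# This condition is true only when the file is executed directly
# (python thisprogram.py), not when it is imported from another module.
if __name__ == "__main__":
    print(main())
```

If another file does import thisprogram, main() is still available to call explicitly, but nothing runs on import.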

Second, why do you have these variables otc and cto?

Third, I would raise an error if the user doesn't enter 1 or 2. Something like:

if choice == 1:
    ....
elif choice == 2:
    ....
else:
    raise ValueError('Please enter 1 or 2!')

This will let the user know they screwed up.
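Here is a rough sketch of that check as a complete function, written in Python 3 syntax. I'm assuming the program is a temperature converter where choice 1 means Celsius to Fahrenheit and choice 2 means the reverse; the function name convert and that menu mapping are my guesses, not taken from your code:

```python
def convert(choice, temperature):
    """Convert a temperature based on an assumed menu: 1 = C to F, 2 = F to C."""
    if choice == 1:
        return temperature * 9.0 / 5.0 + 32.0
    elif choice == 2:
        return (temperature - 32.0) * 5.0 / 9.0
    else:
        # Anything other than 1 or 2 is rejected with a built-in exception
        raise ValueError("Please enter 1 or 2!")
```

ValueError is a built-in exception, so nothing extra has to be defined or imported for the else branch to work.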

hughesadam_87 54 Junior Poster

I think it's time for me to start using Python3

hughesadam_87 54 Junior Poster

Also, Python 2 is going to truncate (9/5) to 1, since both operands are integers. Get in the habit of using floats: (9.0/5.0)
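For comparison, a small sketch of how the two kinds of division behave in Python 3, where / always returns a float and // does the integer (floor) division. The variable names are just for illustration:

```python
# Floor division drops the fractional part (what Python 2's 9/5 does for ints)
floor_result = 9 // 5

# Using floats gives the true quotient in both Python 2 and Python 3
float_result = 9.0 / 5.0

print(floor_result)  # 1
print(float_result)  # 1.8
```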

hughesadam_87 54 Junior Poster

Bump on snippsat's reply...

That is almost certainly the way to do it if you ask me.

hughesadam_87 54 Junior Poster

You can download a new Python version and run it separately in Ubuntu by putting the new executable's directory on your path. Don't uninstall your system version; this will mess up the OS (yes, Python is integral to Ubuntu). By default, your system's Python executable is installed in /usr/bin/.

So let's say you install python 3.1 to a folder /usr/local/Python3.1/ for example...

Then you want to open up your bashrc file (in a terminal: gedit ~/.bashrc; no sudo is needed for a file in your own home directory).

You can then add the new Python directory to your path like so:

py3path='/usr/local/Python3.1/bin'
PATH=$py3path:$PATH
export PATH

Save the file (but don't close it in case you made a typo). Open a new terminal to refresh the changes and then type python. Your new version should be accessed now. You can still keep a link to your old build if you want, and my preferred way to do this is to add an alias in the bashrc file. Like this:

alias python1='/usr/bin/python'

Now when you type "python1" in your terminal, your old version should work.
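To double-check which interpreter a given command actually starts, you can ask Python itself. A small sketch using only the standard sys module (the helper name describe_interpreter is made up for this example):

```python
import sys


def describe_interpreter():
    """Return the path and version of the interpreter running this code."""
    major, minor, micro = sys.version_info[:3]
    return "{} (Python {}.{}.{})".format(sys.executable, major, minor, micro)


print(describe_interpreter())
```

Run it once with each command you set up; the printed path should show whether the new install or the one in /usr/bin was picked up.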

I am spoiled; I use the Enthought Python Distribution, so it's extremely easy for me to update packages. When you install packages from source, though, I think it should be straightforward to tell setup.py which Python you want to install to. That is something someone else can probably help with.

Hope this is helpful; trust me, I learned this the hard way.

hughesadam_87 54 Junior Poster

You need to post an attempted effort before anyone will help you with homework.

hughesadam_87 54 Junior Poster

Hmm, so the way you defined alllists, you did not make a list of lists; rather, you simply made a single list by adding all the items from your smaller lists together. To make a list of lists, you'd use this syntax:

alllists=[classlistA, classlistB, classlistC ]

Then I'd change the following line:

for i in range(classnumber):
       classname = raw_input("Please type the classes you have taken <CODE> <COURSE-NUMBER>: ")
       allLists.remove(classname)

To

for i in range(classnumber):
       classname = raw_input("Please type the classes you have taken <CODE> <COURSE-NUMBER>: ")
       for classlist in alllists:
           if classname in classlist:
               classlist.remove(classname)
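
To illustrate the difference between adding lists together and nesting them, here is a small sketch in Python 3 syntax with shortened, made-up list contents:

```python
class_list_a = ["CMSC 201", "CMSC 202"]
class_list_b = ["MATH 151"]

# Adding lists builds one new flat list of strings
flat = class_list_a + class_list_b

# Square brackets build a list whose items are the original lists themselves
nested = [class_list_a, class_list_b]

# Removing through the nested structure changes the original list too
for class_list in nested:
    if "CMSC 201" in class_list:
        class_list.remove("CMSC 201")

print(class_list_a)  # ['CMSC 202']
print(flat)          # ['CMSC 201', 'CMSC 202', 'MATH 151']
```

The flat list is a separate copy, so it still contains the removed class, while the nested one always reflects the current state of each class list.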

hughesadam_87 54 Junior Poster

There are numerous solutions to this problem, but perhaps the most straightforward is to put all your items in a list of lists. Or a tuple of lists, or a dictionary of lists, etc.; there are many more advanced storage options in Python.

Here's a list of lists:

def main(): 
    print "This program returns a list of classes you still have to take: "
    classlistA = ["CMSC 201", "CMSC 202", "CMSC 203", "CMSC 304", "CMSC 313",
                 "CMSC 331", "CMSC 341", "CMSC 345", "CMSC 411", "CMSC 421",
                 "CMSC 441"]
    classlistB = ["MATH 151", "MATH 152", "MATH 221"]
    classlistC = ["STAT 334" ,"STAT 451"]

    all_lists=[ classlistA, classlistB, classlistC]

    classnumber = int(raw_input("How many classes have you taken so far?: "))

    for i in range(classnumber):
        classname = raw_input("Please type the classes you have taken <CODE> <COURSE-NUMBER>: ")
        found = False
        for classlist in all_lists:
            try:
                classlist.remove(classname)
                found = True
            except ValueError:
                # list.remove raises ValueError when the class is not in this list
                pass
        if not found:
            print 'cannot remove that, not found'
    
    
    print "part A requirements"
    for x in classlistA:
        print "You still have to take", x


main()

What I did was put your 3 class lists in a new list called all_lists. I also added exception handling for the situation where you try to remove an element that is not there.

Again, there are many more elegant solutions.
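As one example of an alternative, here is a sketch of the dictionary-of-lists idea in Python 3 syntax. The function name remaining_classes, the part labels, and the shortened lists are made up for illustration:

```python
def remaining_classes(requirements, taken):
    """Return a new dict with every class in `taken` removed from each list."""
    taken = set(taken)
    return {
        part: [name for name in classes if name not in taken]
        for part, classes in requirements.items()
    }


requirements = {
    "A": ["CMSC 201", "CMSC 202"],
    "B": ["MATH 151", "MATH 152"],
    "C": ["STAT 334"],
}

# "BIOL 101" is not in any list and is simply ignored
left = remaining_classes(requirements, ["CMSC 201", "MATH 152", "BIOL 101"])
print(left)
```

This version builds new lists instead of modifying the originals, so no exception handling is needed for classes that are not found.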

hughesadam_87 54 Junior Poster

Same with me. Post a sample of the data file so we know how your data is structured, then try to explain more clearly exactly what you want to do, and we can suggest better code.

hughesadam_87 54 Junior Poster

I feel like this is too complicated for what you want to do. Can you clarify what you want to do, so we can suggest a simpler class structure for it?

hughesadam_87 54 Junior Poster

I'm not so great at explaining all the nuts and bolts of classes, and probably have some bad habits, but I've fixed your code so that it at least works:

class OffSwitch:
    def __init__(self):
        self.keycheck='off'

    def turnOff(self, key):
        print key, 'hi'
        if key==self.keycheck:
            # SystemExit must be raised; naming it on its own line does nothing
            raise SystemExit
        else:
            print('will continue')


switchword=str(raw_input('Enter a word: ') )

foo=OffSwitch()
foo.turnOff(switchword)

First, I changed your class declaration to class OffSwitch: instead of OffSwitch(object). I don't know why you need to put the object there; maybe there's a good reason (in Python 2 it makes the class a new-style class), but it's not necessary for this example.

Second, I wrapped your switchword input in str() so that it is definitely a string. raw_input() already returns a string, so your syntax may be fine, but this is how I do it.

Lastly, and importantly, the code works when you initialize the class before calling its methods. I made a class instance by using:

foo=OffSwitch()

Now the variable foo is the instance of your class. Basically, foo is an object just waiting around for you to call the class methods on it. I'm no expert, but I think you always need to first instantiate a class like this before you run its methods. To run methods, the syntax is:

foo.turnOff(switchword)
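
Here is the same instantiate-then-call pattern as a small Python 3 sketch. I've swapped the exit for a method that returns True or False so it is easy to try out; the names should_stop and switch are made up for this example:

```python
class OffSwitch:
    def __init__(self, keyword="off"):
        # The word that should trigger the switch
        self.keycheck = keyword

    def should_stop(self, key):
        """Return True when the given word matches the stored keyword."""
        return key == self.keycheck


# Create an instance first, then call methods on that instance
switch = OffSwitch()
print(switch.should_stop("off"))  # True
print(switch.should_stop("on"))   # False
```

The calling code can then decide what to do with the answer, for example raise SystemExit when should_stop returns True.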

hughesadam_87 54 Junior Poster

If you are using bash in the Linux terminal, it is simple to remove everything in the current directory whose name contains "td" (directories included) with:

rm -r *td*

First you may want to use ls -d *td* to make sure there are no other directories or files that would be removed in this process by mistake.

I know this is a python question, but if you didn't know how to do it in bash, it is quite simple.
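For a Python version, a minimal sketch using the standard glob, os and shutil modules could look like the following. The function name remove_matching_dirs and its parameters are made up here, and unlike rm -r *td* it only touches directories, not plain files:

```python
import glob
import os
import shutil


def remove_matching_dirs(parent, pattern="*td*"):
    """Delete every directory directly under `parent` whose name matches `pattern`."""
    removed = []
    for path in glob.glob(os.path.join(parent, pattern)):
        if os.path.isdir(path):
            shutil.rmtree(path)  # removes the directory and everything in it
            removed.append(path)
    return sorted(removed)
```

As with the bash version, it is worth printing the result of glob.glob first to see what would be deleted, since shutil.rmtree cannot be undone.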

hughesadam_87 54 Junior Poster

This is what I have:

import re

infile = open ('file', 'r')
outfile = open('output', 'w')
column = 31

for line in infile:
     if not re.match('#', line):     


          line = line.strip()
          sline = line.split()
          outfile.write(sline[column] + '\n')

infile.close()
outfile.close()

It seems to now tell me that there is an IndexError: list index out of range for the line outfile.write(sline[column] + '\n')

Does your file have at least 32 columns? From the image you posted, it seems like it only has 5 or so. With column = 31 you are asking for the 32nd field, and if you pick a column outside the range of your list you will get an error like that.
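One way to avoid the crash is to check the number of fields before indexing. A rough sketch in Python 3 syntax; the function name extract_column is made up, and it works on any iterable of lines, including an open file:

```python
def extract_column(lines, column):
    """Collect field number `column` (0-based) from each non-comment line."""
    values = []
    for line in lines:
        if line.startswith("#"):
            continue  # skip comment lines
        fields = line.split()
        # Only index when the line really has enough fields
        if column < len(fields):
            values.append(fields[column])
    return values


sample = ["# header", "a b c", "d e", ""]
print(extract_column(sample, 2))  # ['c']
```

Lines that are too short are silently skipped here; you might prefer to print a warning instead so you notice malformed rows.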

hughesadam_87 54 Junior Poster

Thanks for the help.

I won't need to retrieve the original data. Python will go in through the same file and just grab that column every time.

I tried your code and it tells me that there is an error in the "for line in file:" line...telling me that there is TypeError: iteration over non-sequence.

Copy and paste what you have. When you put in your file name, did you surround it with quotes, i.e. "myfile" vs myfile? Also check the loop line: it should iterate over infile (the file you opened), not file, which is the name of a built-in type in Python 2.

hughesadam_87 54 Junior Poster

There are several ways. The easiest way is to use a python list. The more complicated way is to use a dictionary. The advantage of the dictionary is that if later down the road, you need to retrieve some of the original information (for example, the entire line that corresponded to the entry of interest), it is more accessible.

For the simple case of just checking a column:

import re

infile = open ('your_file_name', 'r')
outfile = open('output_file_name', 'w')
column = ??? (PUT YOUR COLUMN HERE)

for line in infile:
     if not re.match('#', line):     


          line = line.strip()
          sline = line.split()
          outfile.write(sline[column] + '\n')

infile.close()
outfile.close()

Notice that I chose sline[column] to signify which column I want. In principle, this will do it, but as I mentioned before, if you later need to retrieve corollary information from your original data, it is harder to retrieve unless you use dictionaries.

If you find that you have to do this type of manipulation often, I have a ton of tools/ideas which can help, as this line of work is exactly what I've been doing all summer (except with bio data).
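To give an idea of the dictionary approach, here is a sketch in Python 3 syntax that keeps each original line reachable through its column value. The function name index_by_column and the sample data are made up for illustration:

```python
def index_by_column(lines, column):
    """Map each value found in `column` (0-based) to the full lines containing it."""
    index = {}
    for line in lines:
        if line.startswith("#"):
            continue  # skip comment lines
        fields = line.split()
        if column < len(fields):
            # Several lines may share a value, so each key holds a list of lines
            index.setdefault(fields[column], []).append(line.rstrip("\n"))
    return index


sample = ["# comment\n", "x 1 a\n", "y 2 b\n", "z 1 c\n"]
lookup = index_by_column(sample, 1)
print(lookup["1"])  # ['x 1 a', 'z 1 c']
```

The keys give you the column of interest, and the stored lines let you get back to the rest of the row later.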