Hey everyone,

Here is an outline of my problem:

I have working code that defines a class, GeneDict, which reads data from a special type of file and stores it as a dictionary. It takes in millions of lines of biological data and stores chromosomes as keys. Each key maps not to a single value but to a list of values. That part works fine, and I have defined a bunch of methods that act on the dictionary (getting its length, for example).
What I want to do now is define a second dictionary; let's call it FamilyDict. FamilyDict will have the same keys as GeneDict, but fewer values. So I want to write a subclass that inherits from GeneDict, takes in all the keys, filters out the values I don't need to keep, and appends the remaining values to the new FamilyDict. I want to use a subclass so that all of the pre-written methods work on both GeneDict and FamilyDict.

Below I will post the working GeneDict. It is not crucial that you understand everything about this dictionary, just know that it works:

import re

class GeneDict:
	'''Class to build a dictionary with chromosomes as keys, with several genes as values, each gene/value being a freaking list of information.'''

	def __init__(self, file=None):  # file is supplied later, at instantiation
		'''Reads in a file like rep_element.bed, stores chromosomes as keys, and all other info as values'''
		self.dictionary = {}
		infile = open(file)
		for line in infile:
			if not re.match("#", line):   # If the line isn't a header
				line = line.strip()
				sline = line.split()

				if sline[5] not in self.dictionary:
					self.dictionary[sline[5]] = []    # key is added

				# RepeatingElement is defined elsewhere in the script
				value = RepeatingElement(int(sline[0]), int(sline[1]), int(sline[2]),
					int(sline[3]), int(sline[4]), sline[5],
					int(sline[6]), int(sline[7]), sline[8],
					sline[9], sline[10], sline[11], sline[12],
					sline[13], int(sline[14]),
					int(sline[15]), sline[16])

				self.dictionary[sline[5]].append(value)

Now, here is my attempt at the new dictionary:

class FamilyDict(GeneDict):
	def __init__(self, file=None):
		GeneDict.__init__(self, file=None)
		self.Family_dict= {}
		infile = open(file)
		for key in dictionary.keys():
			self.Family_dict.append(key)

The terminal is already complaining, and all I've tried to do is copy over the keys. In the end, I need to tell FamilyDict to take a certain bunch of elements from GeneDict's values. I plan to do it with an expression match, something like:

if str(element.repFamily) == str(family):
"""Check to see if read matches desired family""" etc...

But just for now, can you guys see where my subclass has already gone wrong?

It would help if you explained what was going wrong. Without that info all I can say is, where is the dictionary coming from? You're iterating over its keys but I don't see it getting passed in from anywhere...

Sorry. The code runs without errors as long as nothing is instantiated; however, when I try this:

moo = FamilyDict('rep_small.bed')
output_contains = moo.__contains__('chr1')

The code raises this error:

Traceback (most recent call last):
  File "new_repUCSC.py", line 249, in <module>
    moo = FamilyDict('rep_small.bed')
  File "new_repUCSC.py", line 126, in __init__
    GeneDict.__init__(self, file=None)
  File "new_repUCSC.py", line 104, in __init__
    infile = open(file)
TypeError: coercing to Unicode: need string or buffer, NoneType found

Perhaps my instantiation is incorrect? I don't know. Does this look familiar?

I see it now. The problem is here:

class FamilyDict(GeneDict):
	def __init__(self, file=None):
		# You're sending None to the __init__ function
		GeneDict.__init__(self, file=None)
		self.Family_dict= {}
		infile = open(file)
		for key in dictionary.keys():
			self.Family_dict.append(key)

When you call GeneDict.__init__ you are passing file=None. In a function call, file=None is a keyword argument that explicitly passes None; it is not a default value the way it is in a function definition, so the file name given to FamilyDict never reaches GeneDict. What you should be doing is simply:

GeneDict.__init__(self, file)

This still works if someone creates an instance of FamilyDict without providing a file parameter: file takes its default value of None, and that None is passed on to GeneDict. So there is no reason to set it to None again!
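To put the fix in context, here is a minimal sketch of how the subclass could look once the argument is forwarded. It rests on several assumptions: the two-column input, the plain-string values standing in for RepeatingElement, the path and family parameter names, and the choice to store the filtered data in self.dictionary are all made up for illustration, and it uses Python 3 style print(). It also sidesteps two other problems in the posted subclass: a bare dictionary is not defined inside __init__ (the parent's data lives on self.dictionary), and a dict has no append method, so keys have to be set by assignment.

```python
import os
import tempfile


class GeneDict(object):
    """Simplified stand-in: maps column 0 (chromosome) to a list of column 1 values."""

    def __init__(self, path):
        self.dictionary = {}
        with open(path) as infile:
            for line in infile:
                if line.startswith("#"):  # skip header lines
                    continue
                chrom, family = line.split()
                self.dictionary.setdefault(chrom, []).append(family)

    def __len__(self):
        return len(self.dictionary)


class FamilyDict(GeneDict):
    """Keeps every key from GeneDict but only the values equal to `family`."""

    def __init__(self, path, family):
        # Forward the caller's argument instead of overwriting it with None.
        GeneDict.__init__(self, path)
        # The parent stored its data on self, so read it from self.dictionary.
        self.dictionary = {
            chrom: [value for value in values if value == family]
            for chrom, values in self.dictionary.items()
        }


# Hypothetical demo data written to a temporary file.
with tempfile.NamedTemporaryFile("w", suffix=".bed", delete=False) as tmp:
    tmp.write("#chrom family\nchr1 Alu\nchr1 L1\nchr2 L1\n")
demo_path = tmp.name

genes = GeneDict(demo_path)
alu = FamilyDict(demo_path, "Alu")
os.remove(demo_path)

print(len(genes), len(alu))
print(alu.dictionary)
```

Because the filtered data replaces self.dictionary, inherited methods such as __len__ work on a FamilyDict unchanged. If you would rather keep a separate Family_dict attribute as in your attempt, the inherited methods would still be looking at the unfiltered dictionary.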

This raises the question however; why allow for an optional parameter if it's going to break your code? If somebody were to create an instance of FamilyDict (or GeneDict) without a file parameter, the same thing would happen. You should force the file parameter to be present and not allow it to be optional. Otherwise, you'll need to check if file: in the GeneDict code before opening it.
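If you do keep the parameter optional, the check could look something like the sketch below. The helper read_data_lines and its error message are hypothetical, not part of the posted code; the point is only to fail early with a clear message rather than letting open(None) raise a confusing TypeError.

```python
def read_data_lines(path=None):
    """Hypothetical helper: return the non-header lines of the file at `path`."""
    if not path:
        # Fail early with a clear message instead of letting open(None) fail.
        raise ValueError("a file path is required")
    with open(path) as infile:
        return [line.strip() for line in infile if not line.startswith("#")]


try:
    read_data_lines()  # no path given, so the guard triggers
except ValueError as err:
    message = str(err)
    print(message)
```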

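As a side note, using file for a parameter name is a bad idea. It is not a reserved word, but file is the name of a built-in type in Python 2, and using it as a parameter name shadows that built-in inside the function.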

I see. Most of these programs use an options parser, so the restriction is enforced there: the program won't begin if the user doesn't specify an infile. But had I not been using one, what would I have to do to my code to eliminate the optional file? Would I remove file=None and replace it with file=infile or something?

what would I have to do to my code to eliminate the optional file?

You would simply remove the =None. Whatever you specify after the equals sign in your function definition is the default value that the parameter takes on. So if that parameter is not specified when calling the function, it takes on the default value (in this case None). Let me demonstrate:

>>> def funcA(a, b='Default'):
...     print a, b
...     
>>> def funcB(a, b):
...     print a, b
...     
>>> funcA('Hi', 'Blue')
Hi Blue
>>> funcB('Hi', 'GReen')
Hi GReen
>>> funcA('Hi')
Hi Default
>>> funcB('Hi')
Traceback (most recent call last):
  File "<input>", line 1, in <module>
TypeError: funcB() takes exactly 2 arguments (1 given)
>>>

Hope that clears it up a bit.

Thanks for all your help.
