Hi folks,
I am new to Python, and I would be grateful if someone could
point out the mistake in my program. I have a huge text
file in a format like this:
AAAAAGACTCGAGTGCGCGGA 0
AAAAAGATAAGCTAATTAAGCTACTGG 0
AAAAAGATAAGCTAATTAAGCTACTGGGTT 1
AAAAAGGGGGCTCACAGGGGAGGGGTAT 1
AAAAAGGTCGCCTGACGGCTGC 0
Each line is a DNA sequence followed by a number. I need to skip
the lines whose number is 0 and write all the other lines (without
the number) to a new text file in FASTA format. This is the program I
wrote for that:
import sys

seq1 = []
list1 = []
lister = []
listers = []
listers1 = []
a = []
d = []
i = 0
j = 0
# read the sequences and their numbers into two parallel lists
file1 = open(sys.argv[1], 'r')
for line in file1:
    if not line.startswith('\n'):
        seq1 = line.split()
        if len(seq1) == 0:
            continue
        a = seq1[0]
        list1.append(a)
        d = seq1[1]
        lister.append(d)
# sort the line indices by whether the number is 0 or not
b = len(lister)
for j in range(0, b):
    if lister[j] == 0:
        listers.append(j)
    else:
        listers1.append(j)
# write the kept sequences in FASTA format
resultsfile = open("sequences1.txt", 'w')
for i in listers1:
    resultsfile.write('\n>seq' + str(i) + '\n' + list1[i] + '\n')
But this isn't working. I am not able to find the bug in this. I would
be thankful if someone could point it out. Thanks in advance!
Cheers!
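
One likely cause, going only by the code as posted: line.split() returns strings, so lister holds '0' and '1', and the test lister[j] == 0 compares a string with an integer. That is never true, so every index ends up in listers1 and every sequence gets written. Below is a minimal sketch of one way to do the filtering in a single pass. The function names to_fasta and write_fasta are made up for illustration, the default output name is taken from the program above, and it assumes each data line is "sequence number" separated by whitespace. It numbers records the same way as the original (position among the non-blank lines), but drops the blank line the original writes before each header.

```python
def to_fasta(lines):
    """Yield one FASTA record per line whose trailing number is not 0."""
    index = 0  # position among the non-blank data lines, as in the original
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or incomplete lines
        sequence, flag = fields[0], fields[1]
        # split() gives strings, so compare with '0' rather than the integer 0
        if flag != '0':
            yield '>seq%d\n%s\n' % (index, sequence)
        index += 1


def write_fasta(in_path, out_path='sequences1.txt'):
    """Read in_path line by line and write the kept records to out_path."""
    # the with-statement closes both files, so the output is flushed to disk
    with open(in_path) as infile, open(out_path, 'w') as outfile:
        outfile.writelines(to_fasta(infile))
```

It could be called as write_fasta(sys.argv[1]) after an import sys. Because the file is read line by line and nothing is stored in lists, this should also cope with a huge input file.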