Hi folks,

I am a newbie to Python, and I would be grateful if someone could
point out the mistake in my program. Basically, I have a huge text
file similar to the format below:

AAAAAGACTCGAGTGCGCGGA 0
AAAAAGATAAGCTAATTAAGCTACTGG 0
AAAAAGATAAGCTAATTAAGCTACTGGGTT 1
AAAAAGGGGGCTCACAGGGGAGGGGTAT 1
AAAAAGGTCGCCTGACGGCTGC 0

The text is nothing but DNA sequences, each with a number next to
it. What I have to do is ignore the lines that have 0 in them
and print all other lines (excluding the number) to a new text file
(in a particular format called FASTA). This is the program I
wrote for that:

seq1 = []
list1 = []
lister = []
listers = []
listers1 = []
a = []
d = []
i = 0
j = 0

file1 = open(sys.argv[1], 'r')
for line in file1:
   if not line.startswith('\n'):
       seq1 = line.split()
       if len(seq1) == 0:
           continue

       a = seq1[0]
   	list1.append(a)

   	d = seq1[1]
   	lister.append(d)


b = len(lister)
for j in range(0, b):
   if lister[j] == 0:
       listers.append(j)
   else:
       listers1.append(j)


resultsfile = open("sequences1.txt", 'w')
for i in listers1:
   resultsfile.write('\n>seq' + str(i) + '\n' + list1[i] + '\n')

But this isn't working. I am not able to find the bug in this. I would
be thankful if someone could point it out. Thanks in advance!

Cheers!

There is an indentation error: the lines after a = seq1[0] mix tabs with spaces. Also, put in some test prints to see what you get.
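As a minimal sketch of such a test print (using one sample line from your post, not your whole file), printing the flag column shows that split() hands back strings. That suggests a test like lister[j] == 0 can never be true, so every line would end up in listers1:

```python
# Sample record copied from the question; split() always returns strings.
line = "AAAAAGACTCGAGTGCGCGGA 0\n"
seq, flag = line.split()

print(repr(flag))    # shows the flag is the string '0', not the number 0
print(flag == 0)     # False: a str never compares equal to an int
print(flag == '0')   # True: compare against the string instead
```

Comparing with '0' (or converting with int(flag) first) should make the filter behave as intended.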

See if this helps make your code shorter and clearer.
Ask if something is unclear.

l = []
for line in open('dna.txt'):
    fields = line.split()
    # skip blank lines; keep only the sequence when the flag is '1'
    if len(fields) == 2 and fields[1] == '1':
        l.append(fields[0])
print(l)

'''Out
['AAAAAGATAAGCTAATTAAGCTACTGGGTT', 'AAAAAGGGGGCTCACAGGGGAGGGGTAT']
'''

You can also strip() each record (to remove the trailing '\n') and skip it when it ends with a zero, using endswith('0').
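Here is a rough sketch of that strip()/endswith() idea, carried through to the FASTA output the question asks for. The function name filter_to_fasta, the file names in the usage comment, and the >seqN header numbering are only placeholders I picked for illustration, and it assumes the flag is always a single 0 or 1 at the end of the line:

```python
def filter_to_fasta(in_path, out_path):
    """Copy every record not flagged 0 to out_path in FASTA form.

    Returns the number of sequences written. Assumes each non-blank
    line is '<sequence> <flag>' with a flag of 0 or 1.
    """
    count = 0
    with open(in_path) as infile, open(out_path, 'w') as outfile:
        for record in infile:
            record = record.strip()            # drop '\n' and stray spaces
            if not record or record.endswith('0'):
                continue                       # skip blank and zero-flagged lines
            seq = record.split()[0]            # keep the sequence, drop the flag
            count += 1
            # hypothetical header scheme: >seq1, >seq2, ...
            outfile.write('>seq%d\n%s\n' % (count, seq))
    return count

# Example call (file names are placeholders):
# filter_to_fasta('dna.txt', 'sequences1.txt')
```

Opening both files in a with statement closes them when the loop ends, so the output is flushed to disk without an explicit close().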
