Remove [] in list

Question

-ordi- 6 Junior Poster in Training

14 Years Ago

Hi,

list = [line.split() for line in open(file) if line is not None]

and the output is:

[linux@localhost ~]$
[[], ['text'], ['text', 'text', 'text', 'text'], ['text', 'text', 'text', 'text']]

How can I remove the [] entries (None types, or whatever they are)?

python

All 16 Replies

TrustyTony 888 ex-Moderator

14 Years Ago

Are you trying to do something like this? (I named your file sources.lst.)

# -*- coding: utf-8 -*-

# Normalize whitespace in every non-blank line; the set drops duplicates.
with open('sources.lst') as source_file:
    mylist = set(' '.join(line.split())
                 for line in source_file
                 if line.strip())

print('\n'.join(sorted(mylist)))


-ordi- commented: strip() -> what I needed, thanks! +0
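
A minimal sketch of why strip() helps here, assuming Python 3 and using an in-memory list as a stand-in for the file (the names lines, unfiltered and filtered are only illustrative). A blank line read from a file is the string "\n", never None, so the original test lets it through and split() turns it into []:

```python
# Stand-in for the lines of a file; a blank line is "\n", never None.
lines = ["\n", "text\n", "text text text text\n", "text text text text\n"]

# 'line is not None' is always true here, so the blank line survives
# and splits into an empty list.
unfiltered = [line.split() for line in lines if line is not None]

# line.strip() is an empty (falsy) string for blank lines, so they are skipped.
filtered = [line.split() for line in lines if line.strip()]

print(unfiltered)
print(filtered)
```

The same condition works unchanged when iterating over an open file instead of a list.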

TrustyTony 888 ex-Moderator

14 Years Ago

First point:
Do not use list and file as variable names; they shadow Python's built-ins.

Second:
Your expression

'.save' or 'c++' in filename

is always true: '.save' is a non-empty string, so it is not a false value, and the part after or is never evaluated. The value is therefore

>>> '.save' or False
'.save'
>>> not('.save')
False
>>>

You probably mean to do:

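import os

directory = 'd:/test'
filelist = os.listdir(directory)

# Join with the directory; a bare name would be resolved against the current directory.
files = [os.path.join(directory, filename) for filename in filelist
         if not any(part in filename for part in ('.save', 'c++'))]

print('\n'.join(files))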
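
To make the difference concrete, here is a small sketch (Python 3; the file name notes.txt is a made-up example) comparing the original expression with two ways of testing each part against the file name:

```python
filename = "notes.txt"  # hypothetical name containing neither '.save' nor 'c++'

# 'in' binds tighter than 'or', so this parses as '.save' or ('c++' in filename).
# A non-empty string is truthy, so 'or' returns '.save' without looking further.
buggy = '.save' or 'c++' in filename

# Test each part against the file name separately.
explicit = '.save' in filename or 'c++' in filename
with_any = any(part in filename for part in ('.save', 'c++'))

print(repr(buggy), explicit, with_any)
```

Because buggy is always the string '.save', not(buggy) is always False, which is why a list comprehension filtered with it comes out empty.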


vegaseat 1,735 DaniWeb's Hypocrite

14 Years Ago

At this point it might be best for you to at least study the basics of the Python language.

TrustyTony 888 ex-Moderator

14 Years Ago

"I am familiar with C++.
Python, unfortunately, is too confusing."

Those are just the first steps; it gets easier afterwards. Ask vegaseat.

I analyzed your code, and the situation is actually different, since you have an and condition. The problem is that you open the file to read its lines even when it is a directory, in the part

for line in open(file).readlines()

Let me show you some Python magic that gets rid of those dups (a cleaner variation of my earlier code). We do not check beforehand; we just skip over directories:

for fn in ('sources.lst', '/'):
    try:
        # Opening a directory such as '/' raises IOError, which we skip.
        nodups = set(line for line in open(fn) if line.strip())
    except IOError:
        pass
    else:
        print(''.join(nodups))
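
The same "try it and skip on failure" idea can be sketched as a self-contained Python 3 example. The helper name unique_lines and the temporary sample file are assumptions made for illustration; in Python 3, IOError is an alias of OSError, so the sketch catches OSError:

```python
import os
import tempfile

def unique_lines(path):
    """Return the set of non-blank lines in path, or None if it cannot be opened."""
    try:
        with open(path) as handle:
            return set(line for line in handle if line.strip())
    except OSError:  # in Python 3, IOError is an alias of OSError
        return None

with tempfile.TemporaryDirectory() as tmpdir:
    sample = os.path.join(tmpdir, "sources.lst")  # hypothetical sample file
    with open(sample, "w") as handle:
        handle.write("deb a\ndeb a\n\ndeb b\n")
    from_file = unique_lines(sample)
    from_dir = unique_lines(tmpdir)  # a directory: open() fails, so None

print(sorted(from_file), from_dir)
```

Note that a set does not preserve the original order of the lines.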

snippsat 661 Master Poster · Answer 1 · 2010-08-11T02:18:02+00:00

So should we guess what the input file looks like?


-ordi- 6 Junior Poster in Training · Answer 2 · 2010-08-11T02:50:27+00:00

deb http://archive.canonical.com/ lucid partner
deb http://archive.canonical.com/ lucid partner

deb http://archive.canonical.com/ lucid partnerO

and code:

# -*- coding: utf-8 -*-
 
import os


files = ['/home/timo/' + file for file in os.listdir('/home/timo/') if not('.save' or 'c++' in file)] 

files.append('/home/timo/sources.list')

dublicate = []
for file in files:
  list = [line.split() for line in open(file) if line is not None]
  if list:
   list.sort()
   last = list[-1]
   for i in range(len(list) - 2, - 1, - 1):
     if last == list[i]:
       dublicate.append(list[i])
       print dublicate[0]
     else:
       last = list[i]
  print list

-ordi- 6 Junior Poster in Training · Answer 3 · 2010-08-11T13:48:29+00:00

OK, another small problem.

list = os.listdir('/home/timo/')

files = ['/home/timo/' + file for file in list if not('.save' or 'c++' in file)] 

print files

Output:

[]

-ordi- 6 Junior Poster in Training · Answer 4 · 2010-08-11T17:09:39+00:00

filelist = os.listdir('/home/timo/')

files = [os.path.realpath(filename) for filename in filelist
         if not(any(part in filename for part in ('.save', 'c++')))]

print '\n'.join(files)


dublicate = []
for file in files:
  list = [line for line in open(file).readlines() if line.strip() and os.path.isfile(file)]
  if list:
   last = list[-1]
   for i in range(len(list) - 2, - 1, - 1):
     if last == list[i]:
       dublicate.append(list[i])
       print '\n'.join(sorted(dublicate))
     else:
       last = list[i]

Traceback (most recent call last):
  File "Dup.py", line 23, in <module>
    list = [line for line in open(file).readlines() if line.strip() and os.path.isfile(file)]
IOError: [Errno 21] Is a directory: '/home/timo/.macromedia'

TrustyTony 888 ex-Moderator Team Colleague Featured Poster · Answer 5 · 2010-08-11T21:15:49+00:00

What vegaseat means is that you are making a variation of exactly the same mistake I explained. Also, line.strip() is a false value only when the line is empty or contains nothing but whitespace (including '\n'). Do some interactive practice at the command line.
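
The traceback above comes from open(file) running before the os.path.isfile(file) test in the comprehension's condition is ever reached. A minimal Python 3 sketch of doing the check first (the helper name non_blank_lines and the temporary sample file are illustrative assumptions):

```python
import os
import tempfile

def non_blank_lines(path):
    """Return the non-blank lines of path, or [] if path is not a regular file."""
    if not os.path.isfile(path):  # check first, so open() never sees a directory
        return []
    with open(path) as handle:
        return [line for line in handle if line.strip()]

with tempfile.TemporaryDirectory() as tmpdir:
    sample = os.path.join(tmpdir, "a.list")  # hypothetical sample file
    with open(sample, "w") as handle:
        handle.write("tere tere\n\n   \ntere\n")
    lines_from_file = non_blank_lines(sample)
    lines_from_dir = non_blank_lines(tmpdir)

print(lines_from_file, lines_from_dir)
```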

-ordi- 6 Junior Poster in Training · Answer 6 · 2010-08-12T02:02:29+00:00

I am familiar with C++.
Python, unfortunately, is too confusing.


-ordi- 6 Junior Poster in Training · Answer 7 · 2010-08-12T02:37:24+00:00

Ok, it works now!

[timo@localhost Python]$ python Dup.py
Duplicate: deb http://archive.canonical.com/ lucid partner in sources.list
Duplicate: tere tere tere tere in a.list

First file:

tere tere tere tere
tere tere tere tere

Second file:

deb http://archive.canonical.com/ lucid partner
deb http://archive.canonical.com/ lucid partner

deb http://archive.canonical.com/ lucid partnerO

# -*- coding: utf-8 -*-
import os

path = "/home/timo/Python/Proov/"  # insert the path to the directory of interest
dirList = os.listdir(path)
sources = []

dup = False
for filename in dirList:
    if os.path.isfile(path + filename):
        for line in open(path + filename):
            split = line.split()
            if split and not(split[0] == '#'):  # skip the line if it is empty or a comment
                for i in split[3:]:
                    src = split[0] + ' ' + split[1] + ' ' + split[2] + ' ' + i
                    if src in sources:
                        print('Duplicate: ' + src + ' in ' + filename)
                        dup = True
                    else:
                        sources.append(src)
if not(dup):
    print('No duplicates found')
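
For comparison, a hedged Python 3 sketch of the same duplicate check written as a function that takes lines and returns the duplicates instead of printing them. The function name find_duplicates is made up for illustration, and a set replaces the list for the membership test; the skip rule (empty line, or first token equal to '#') is kept as in the script above:

```python
def find_duplicates(lines):
    """Return the entries that occur more than once, in order of detection."""
    seen = set()
    duplicates = []
    for line in lines:
        fields = line.split()
        if not fields or fields[0] == '#':  # skip blank lines and '#' comments
            continue
        for component in fields[3:]:
            entry = ' '.join(fields[:3] + [component])
            if entry in seen:
                duplicates.append(entry)
            else:
                seen.add(entry)
    return duplicates

sample = [
    "deb http://archive.canonical.com/ lucid partner\n",
    "deb http://archive.canonical.com/ lucid partner\n",
    "\n",
    "deb http://archive.canonical.com/ lucid partnerO\n",
]
print(find_duplicates(sample))
```

Returning a list makes the check easy to reuse per file: call it once for each file's lines and print the result together with the file name.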

TrustyTony 888 ex-Moderator Team Colleague Featured Poster · Answer 8 · 2010-08-12T02:44:04+00:00

OK, but it is maybe not so good to use split as a variable name either. It is not as bad as list or file, since split only exists as a string method, but it can still make the code less readable and more confusing. Maybe rename it to splitedline?

-ordi- 6 Junior Poster in Training · Answer 9 · 2010-08-12T02:55:21+00:00

"OK, but it is maybe not so good to use split as a variable name either. It is not as bad as list or file, since split only exists as a string method, but it can still make the code less readable and more confusing. Maybe rename it to splitedline?"

My problem is that I'd like to use a list comprehension :P

Is it wise to use one here?

TrustyTony 888 ex-Moderator Team Colleague Featured Poster · Answer 10 · 2010-08-12T04:46:48+00:00

I did the removal in my post, but announcing the removal takes a little more thought, because that code only removes the dups silently. It also does not keep the original order of the lines. My first solution for dups was forgiving about whitespace, because it did ' '.join(line.split()). You could use the remove_adjacent function I posted to StackOverflow:

def remove_adjacent(nums):
    # The appended sentinel (not nums[-1]) never equals the last item, so it is kept.
    return [a for a, b in zip(nums, nums[1:] + [not nums[-1]]) if a != b]

To use this, you would first need to unify the lines (normalize their whitespace) and sort them.
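
The unify, sort, then drop-adjacent steps can be sketched as follows in Python 3. Note that this sketch swaps in itertools.groupby for the zip-based function quoted above (one difference: it also accepts an empty list), and the helper name normalize and the sample lines are assumptions for illustration:

```python
from itertools import groupby

def normalize(line):
    """Collapse runs of whitespace so lines differing only in spacing compare equal."""
    return ' '.join(line.split())

def remove_adjacent(items):
    """Keep one item from each run of equal neighbours (works on an empty list too)."""
    return [key for key, _group in groupby(items)]

lines = ["deb  b\n", "deb a\n", "\n", "deb a \n", "deb b\n"]
unified = sorted(normalize(line) for line in lines if line.strip())
print(remove_adjacent(unified))
```

Sorting is what brings equal lines next to each other; without it, only duplicates that already happen to be adjacent are removed.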

vegaseat 1,735 DaniWeb's Hypocrite Team Colleague · Answer 11 · 2010-08-12T19:05:18+00:00

I am familiar with C++.
Python, unfortunately, is too confusing.

The truth is that neither C++ nor Python lets you get away with gibberish code.

-ordi- 6 Junior Poster in Training · Answer 12 · 2010-08-12T21:38:04+00:00

The truth is that neither C++ nor Python lets you get away with gibberish code.

Are you disparaging me?

OK, in C++ there is one simple option:

for (int i = 0; i < 100; i++) {}

And Python:

for i, a in enumerate(['a', 'b', 'c']):

for a in ['a', 'b', 'c']:

>>> knights = {'gallahad': 'the pure', 'robin': 'the brave'}
>>> for k, v in knights.iteritems():

>>> questions = ['name', 'quest', 'favorite color']
>>> answers = ['lancelot', 'the holy grail', 'blue']
>>> for q, a in zip(questions, answers):
...     print 'What is your {0}?  It is {1}.'.format(q, a)

xrange and range

There are many of them, and I do not know when each one is the right thing to use (dict, list, set, etc.)

Sorry :(
