Hi,
I'm trying to use grep on a file from within a Python script. I was thinking of something like this:
variable = 123
s = subprocess.Popen("grep" + "-w "variable" Data.txt")
How do I use a variable from within the call to grep?
Cheers
Andy
You would use string formatting to insert the value of the variable in the string you're using to execute grep like so:
variable = 123
s = subprocess.Popen("grep -w %s Data.txt" % variable)
Here's the documentation on string formatting.
That would raise an error, since the last three items are arguments and, without shell=True, Popen on Unix treats the whole string as the name of the program to run.
Try:
s = subprocess.Popen(("grep", "-w", "%s" % variable, "Data.txt"), stdout=subprocess.PIPE)
output = s.communicate()[0]
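For anyone reading along, here is a minimal sketch of the same idea wrapped in functions. The names build_grep_args and grep_word are made up for illustration, and it assumes Python 3 with grep available on the PATH. Each argument is its own list item, so no shell quoting is needed.

```python
import subprocess


def build_grep_args(pattern, path):
    """Return the argument list for `grep -w <pattern> <path>`."""
    # str() lets a number such as 123 be used as the pattern.
    return ["grep", "-w", str(pattern), path]


def grep_word(pattern, path):
    """Run grep and return whatever it wrote to standard output, as text."""
    proc = subprocess.Popen(
        build_grep_args(pattern, path),
        stdout=subprocess.PIPE,
        universal_newlines=True,  # decode bytes to str
    )
    output, _ = proc.communicate()
    return output


if __name__ == "__main__":
    variable = 123
    # Prints the matching lines; output is empty if nothing matches.
    print(grep_word(variable, "Data.txt"))
```

Because the pattern is passed as a separate list element, spaces or shell metacharacters in it are handed to grep unchanged.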
Hi,
Cheers, the solution worked! I'm now having another problem. I'm basically trying to iterate through a directory full of text files, match their filenames against a list in another text file (Data_Labels.txt), and get each file's corresponding genre.
Data_Labels.txt has the following entries:
///////////////////////////
11: Resume or CV: N/A: N/A: N/A: N/A: N/A: Resume or CV;
12: Resume or CV: N/A: N/A: N/A: N/A: N/A: Resume or CV;
13: Resume or CV: N/A: N/A: N/A: N/A: N/A: Resume or CV;
14: Resume or CV: N/A: N/A: N/A: N/A: N/A: Resume or CV;
15: Resume or CV: N/A: N/A: N/A: N/A: N/A: Resume or CV;
16: Resume or CV: N/A: N/A: N/A: N/A: N/A: Resume or CV;
17: Speech Transcript: N/A: N/A: N/A: N/A: N/A: Letter;
18: Speech Transcript: N/A: N/A: N/A: N/A: N/A: Minutes;
...
...
5323: Story Book: N/A: N/A: N/A: N/A: N/A: Story Book;
/////////////////////
etc. and so on.........
The text files I'm going through are named "11_+_resume.txt", "18_+_Robs-speech.txt", "342_+_JasonsCV.txt" and so on.
My code so far looks like this:
import os
import subprocess
datalabels = open('Data_Labels.txt')
#iterate through directory of files
#split on underscore in its filename and get file number
for fname in os.listdir(os.getcwd()):
    fname = fname.split('_')
    fnum = fname[0]
    #match filenum with Data_Labels.txt and get its genre,
    #using grep to identify the line with the current file number
    #and its related genre
    p = subprocess.Popen(("grep", "-w", "%s" % fnum, "Data_Labels.txt"), stdout=subprocess.PIPE)
    #get the matched line from Data_Labels.txt, split on ":", strip any
    #whitespace in its genre and print the file number and corresponding
    #genre to screen
    lbl = p.communicate()[0]
    genrelabl = lbl.split(':')[1]
    genrelabl = genrelabl.strip(' ')
    print fnum, genrelabl
When I run it, it runs fine until it encounters a file with a space in the filename, such as "4444_+_card (131).txt", and then it terminates with the following error:
.....
.....
4917 List
4940 Contract
4395 Card
4729 Minutes
5455 Fictional Piece
4959 List
5397 Legal Appeal Proposal or Order
4444 Card
Traceback (most recent call last):
File "arff.py", line 18, in ?
genrelabl = lbl.split(':')[1]
IndexError: list index out of range
How can the index be out of range when I'm only using the first part of the filename before the underscore? The problem occurs whenever the program encounters a file with a space in the filename. What am I doing wrong?
Because you're splitting on a colon when none exists in the string (for example, when grep finds no match and returns an empty string), which produces a list of length 1. You're then trying to access the second element of that list (index 0 is the first, index 1 is the second), which raises an IndexError.
You should probably do some sanity checking before indexing the list, such as if len(split_item) > 1:, to avoid things like this.
EDIT: Please note, the failure occurs when you're splitting the results of grep and finding the genre, not when looking at the filename.
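A rough sketch of that sanity check, assuming Python 3 and the "number: genre: ..." layout shown in Data_Labels.txt above. The function name extract_genre is hypothetical; it returns None when the line has no second field instead of raising an IndexError.

```python
def extract_genre(line):
    """Return the genre field of a 'number: genre: ...' line, or None."""
    fields = line.split(":")
    # An empty or colon-free line splits into a single-element list,
    # so check the length before indexing the second field.
    if len(fields) > 1:
        return fields[1].strip()
    return None


if __name__ == "__main__":
    print(extract_genre("17: Speech Transcript: N/A: N/A: N/A: N/A: N/A: Letter;"))
    print(extract_genre(""))  # grep found nothing for this file number
```

The caller can then skip or report any file number for which the result is None.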