I'm writing a program that searches through a list of medium-sized text files for a particular keyword string. The string may appear more than once in a file. The files are generally formatted for readability, so they have hard returns at the 80-column mark within paragraphs. As I have it right now, the program slurps the entire file and then runs a global search on my keyword string like so:
$brcount++ while $data =~ /(file for bankrup)|(file for chapter)/gi;
I then print out this count, along with a bunch of header information. What I would like to do is print out, say, 1-3 lines before and after each occurrence, so that I have an output file I can inspect manually.
I've tried to hack in the lines to the regex string, but since it's a global search, it of course matches each occurrence 5 or 6 times:
while ($data=~ /(?:[^\n]*\n){1,3}(file for bankrup)(?:[^\n]*\n){1,3}|(?:[^\n]*\n){1,3}(file for chapter)(?:[^\n]*\n){1,3}/gi) {
print $data;
}
Is there an elegant way to do this? Alternatively, is there another way of going about this by reading the file in line by line? I've thought about appending two lines together, performing the phrase match, storing the line numbers, and then going back in and pulling the additional lines using the stored line numbers, but this seemed like a huge hack. I would imagine it would probably run faster that way, though.
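For illustration, here is a rough sketch of the "store the line numbers, then pull the surrounding lines" idea, applied to the slurped data rather than to a line-by-line read. It is only one possible approach, not a definitive answer. The helper name `context_blocks` and the sample text are hypothetical, and the pattern uses `\s+` between words on the assumption that the phrase may be split by one of the hard returns.

```perl
use strict;
use warnings;

# Hypothetical helper: returns one block of context lines per match.
# $data    - the slurped file contents
# $context - number of lines to show before and after each match
sub context_blocks {
    my ($data, $context) = @_;
    my @lines = split /\n/, $data;
    my @blocks;

    # \s+ lets the phrase span a hard line break (an assumption about the data).
    while ($data =~ /file\s+for\s+(?:bankrup|chapter)/gi) {
        # Save the match offsets before doing anything else.
        my ($start, $end) = ($-[0], $+[0]);

        # Count newlines before the start and end of the match to get
        # the 0-based indices of the first and last matched lines.
        my $before_start = substr($data, 0, $start);
        my $before_end   = substr($data, 0, $end);
        my $first = ($before_start =~ tr/\n//);
        my $last  = ($before_end   =~ tr/\n//);

        # Clamp the context window to the bounds of the file.
        my $from = $first - $context;
        $from = 0 if $from < 0;
        my $to = $last + $context;
        $to = $#lines if $to > $#lines;

        push @blocks, join("\n", @lines[$from .. $to]);
    }
    return @blocks;
}

# Made-up sample text with the phrase split across a hard return.
my $sample = join "\n",
    'First line of the report.',
    'Second line of the report.',
    'The company announced that it plans to file for',
    'bankruptcy protection early next month.',
    'Fifth line of the report.',
    'Sixth line of the report.';

my @blocks = context_blocks($sample, 1);
print scalar(@blocks), " match(es)\n";
print "$_\n----\n" for @blocks;
```

Each match is reported once because the pattern itself stays small and the context is pulled from the line array afterwards. Matches that sit close together will produce overlapping blocks; merging those is left out of this sketch.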