I have a script that takes a bunch of different .txt files and plugs their contents into a master "templates.txt" list. It replaces each [BRACKETED] tag in the master template list with lines from the data file of the same name. Here's my code:
import re, sys

class Template:
    def __init__(self, text):
        self.text = text
        self.count = 0
        self.tags = []
        self.sentences = set()

    def __str__(self):
        return "%s:%s:%s" % (self.count, ";".join(self.tags), self.text)

FOLDER = "/projects/Python/"

# Read one template per line.
with open(FOLDER + "templates.txt") as myfile:
    templates = [Template(e.strip()) for e in myfile]

# Find the [TAGS] in each template and load TAG.txt for each one.
tagdict = {}
for (i, template) in enumerate(templates):
    tags = re.findall(r'\[[^\]]+\]', template.text)
    template.count = len(tags)
    template.tags = tags
    for tag in tags:
        tag = tag[1:-1]
        with open(FOLDER + tag + ".txt") as tagfile:
            tagdict[tag] = [e.strip() for e in tagfile]

# Fill each template, one sentence per entry of the shortest tag list.
for template in templates:
    lengths = []
    for tag in template.tags:
        l = len(tagdict[tag[1:-1]])
        if l > 0: lengths.append(l)
    mintag = min(lengths)
    for i in range(mintag):
        sentence = template.text
        for tag in template.tags:
            sentence = sentence.replace(tag, tagdict[tag[1:-1]].pop(0))
        template.sentences.add(sentence)
    for sentence in sorted(template.sentences):
        print(sentence)

templates.txt:
[HOTEL] and [RESTAURANT].
[NAME] lives on [STREET].

HOTEL.txt:
Best Western
Holiday Inn

RESTAURANT.txt:
Denny's
Applebee's
IHOP
Black Angus

Right now it only gives me two outputs:
Best Western and Denny's.
Holiday Inn and Applebee's.
because it stops at the length of the shortest list. How can I make it use the longest list instead (in this case RESTAURANT) and have HOTEL loop back around to its start? I tried using maxtag = max(lengths) instead of min(lengths), but that doesn't work. Can anyone help?
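
To make the goal concrete, here is a minimal sketch of the behaviour I'm after, with the file contents hard-coded as lists instead of read from disk. `fill_template` is a hypothetical helper name, not part of my script. It indexes each list modulo its length rather than calling `pop(0)`, on the assumption that the `max(lengths)` attempt fails because `pop(0)` runs out of items in the shorter list.

```python
import re

def fill_template(text, tagdict):
    """Fill every [TAG] in text, wrapping shorter lists around to the longest one."""
    tags = re.findall(r'\[[^\]]+\]', text)
    lists = [tagdict.get(tag[1:-1], []) for tag in tags]
    # Nothing to do if there are no tags or any tag has no values to cycle through.
    if not tags or not all(lists):
        return []
    longest = max(len(values) for values in lists)
    sentences = []
    for i in range(longest):
        sentence = text
        for tag, values in zip(tags, lists):
            # i % len(values) wraps back to the start of a short list.
            sentence = sentence.replace(tag, values[i % len(values)])
        sentences.append(sentence)
    return sentences

# Hard-coded stand-ins for HOTEL.txt and RESTAURANT.txt.
tagdict = {
    "HOTEL": ["Best Western", "Holiday Inn"],
    "RESTAURANT": ["Denny's", "Applebee's", "IHOP", "Black Angus"],
}
for sentence in fill_template("[HOTEL] and [RESTAURANT].", tagdict):
    print(sentence)
```

With the sample data this should give four sentences, with HOTEL repeating: Best Western and Denny's, Holiday Inn and Applebee's, Best Western and IHOP, Holiday Inn and Black Angus. Unlike my script, this leaves the lists in `tagdict` untouched, so the same tag could be reused by a later template.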