Hi, I'm new to Python and to this forum, and I'm trying to write a program that splits an HTML text file into its components.
The HTML file looks something like this:
"
Hello World #title
Today is a Friday.
The weekend is coming.
Lets have fun. #summary
1923 #date
John Doe # Name
"
I'd like the output to look like:
data = {'title': 'Hello World', 'summary': ['Today is a Friday.', 'The weekend is coming.', 'Lets have fun.'], 'date': 1923, 'Name': 'John Doe'}
My current code is:
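from collections import defaultdict

def parse(filename):
    # every missing key starts out as an empty list
    data = defaultdict(list)
    with open(filename, 'r') as f:
        for line in f:
            line = line.strip()  # drop the trailing newline so a blank line equals ''
            if line != '':
                data['title'].append(line)
            elif line == '':
                pass  # stuck here: how do I switch to the next key?
    return data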
I'm having trouble with the next step. When the function reaches an empty line, it should switch to data['summary'].append(line); at the next empty line, to data['date'].append(line); and at the empty line after that, to data['name'].append(line). Is there any way, once an empty line has been read, to tell the function to move on to the next line?
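To make the idea concrete, here is a rough sketch of the behaviour I'm after. It is not a finished solution. It assumes the sections in the file are separated by blank lines and that the #title, #summary, etc. labels in the sample are only my annotations, not text in the file. The function name parse_sections, the keys tuple, and the sample list are made up for illustration.

```python
from collections import defaultdict

def parse_sections(lines, keys=("title", "summary", "date", "name")):
    """Group lines under successive keys, advancing to the next key at a blank line."""
    data = defaultdict(list)
    index = 0              # position of the current key in `keys`
    seen_content = False   # True once the current section has at least one line
    for raw in lines:
        line = raw.strip()  # remove the trailing newline and surrounding spaces
        if line:
            data[keys[index]].append(line)
            seen_content = True
        elif seen_content and index < len(keys) - 1:
            # blank line after some content: move on to the next key
            index += 1
            seen_content = False
    return dict(data)

# Hypothetical sample, assuming blank lines between the sections
sample = [
    "Hello World\n", "\n",
    "Today is a Friday.\n", "The weekend is coming.\n", "Lets have fun.\n", "\n",
    "1923\n", "\n",
    "John Doe\n",
]
result = parse_sections(sample)
print(result)
```

Keeping an index into a list of keys means the loop never has to "read ahead": a blank line only changes which key the following lines are appended to. Every value comes back as a list of strings here, so the date would still need converting with int() if a number is wanted.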
Another question: for the summary, is it possible to join all the lines together so that it looks like
'summary': 'Today is a Friday. The weekend is coming. Lets have fun.'
instead of
'summary': ['Today is a Friday.','The weekend is coming.','Lets have fun.'] ?
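From what I can tell, str.join might be what I need here. A small sketch, assuming the summary has already been collected as a list of strings (the variable names are just for illustration):

```python
# Hypothetical list of summary lines, as collected by the parser
summary_lines = ['Today is a Friday.', 'The weekend is coming.', 'Lets have fun.']

# Join the lines into one string, separated by single spaces
summary = ' '.join(summary_lines)

data = {'summary': summary}
print(data)
```

The separator string before .join decides what goes between the pieces, so ' ' gives one space between sentences.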
Any help will be greatly appreciated!