Hi, I'm new to Python and have been given the task of reading a user-supplied text file that is tab-delimited and contains 4 columns on each line: Authors, Year, Title and Journal.
So far I am only able to open the file, and I don't know how to begin parsing the data.
The recommended way of storing the data is to use the following three lists (which I initialized as):
authorsList = []
journalsList = []
papersList = []
In papersList, each paper's entry holds its title, the year it was published, the indices of its author(s) in authorsList, and the index of its journal in journalsList; this way the name of each journal and author is stored in only one place.
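To make the layout concrete, here is a small sketch of how I understand the three lists are meant to fit together. The names, title and year are made up purely for illustration, and the order of the fields inside each paper entry is my own assumption:

```python
# Hypothetical sample data, invented only to show the shape of the lists.
authorsList = ["Smith, J.", "Doe, A."]
journalsList = ["Journal of Examples"]

# Assumed entry layout: [title, year, list of author indices, journal index]
papersList = [
    ["A Sample Title", 2001, [0, 1], 0],
]

# Looking a paper's details back up through the stored indices.
title, year, author_indices, journal_index = papersList[0]
authors_of_first = []
for i in author_indices:
    authors_of_first.append(authorsList[i])
journal_of_first = journalsList[journal_index]
```

With this layout, an author or journal that appears in many papers is written out once and every paper just refers to its position.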
What I have learned so far in Python: basic I/O, loops and conditions, defining functions, and a little exception handling. I've been searching Google, but a lot of the answers to similar questions use the csv module and regular expressions, which I tried to learn on my own but couldn't follow the code that was suggested. Is there a way to do it without the csv and re modules?
I was thinking of doing something like this:
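for line in openfile:
    # split each tab-delimited line into its four columns
    a, b, c, d = line.rstrip("\n").split("\t")
    authorsList.append(a)
    papersList.append((b, c))  # append() takes one argument, so pair the two values
    journalsList.append(d)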
but I don't think that is right at all.
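One direction that seems to match the index idea, using only split and plain lists, might look like the sketch below. This is a rough attempt under assumptions, not a definitive solution: get_index and parse_papers are names I made up, the sample lines are invented, and I am assuming that multiple authors in the Authors column are separated by ";" (the author_sep parameter), which may not match the real file.

```python
def get_index(items, value):
    """Return the position of value in items, appending it first if it is new."""
    if value not in items:
        items.append(value)
    return items.index(value)


def parse_papers(lines, author_sep=";"):
    """Build the three lists from tab-delimited lines.

    author_sep is an assumption: the separator between multiple authors
    is not known, so it should be adjusted to match the actual file.
    """
    authorsList = []
    journalsList = []
    papersList = []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        authors, year, title, journal = line.split("\t")
        author_indices = []
        for name in authors.split(author_sep):
            author_indices.append(get_index(authorsList, name.strip()))
        journal_index = get_index(journalsList, journal.strip())
        # The year is kept as a string here; int(year) would convert it.
        papersList.append([title, year, author_indices, journal_index])
    return authorsList, journalsList, papersList


# Made-up sample lines standing in for the opened file.
sample = [
    "Smith, J.; Doe, A.\t2001\tFirst Title\tJournal A\n",
    "Doe, A.\t2003\tSecond Title\tJournal A\n",
]
authors, journals, papers = parse_papers(sample)
```

Since looping over an open file also yields one line at a time, the opened file object could presumably be passed in place of sample. A line that does not have exactly four columns would raise a ValueError at the unpacking step, which could be caught with try/except.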
Any suggestions or tips?
Thanks for your time and consideration.