Hi all,
I would appreciate your expertise and advice on a problem I have run into while trying to parse the contents of a .csv file.
Here is the scenario:
1) I have a CSV file with entries such as the following:
ProjCat,RefNum,ProjTitle,MemberName,ProjDeadline,ProjGrade --> Header
I,0001,"Medical Research in XXX Field,2007","Gary,Susan",20.05.07,80
R,0023,Grid Computing in today era,"Henry Williams,Tulali Mark",04-May-07,--NA--
MP,0100,"Thinking in Logical Way, How to do it?","Williams,Harly Dimitry",10.02.07,NA
1,0114,"Computational Research for Biological Science, How to?",Alalaa,15-Mar-06,
2) I have to parse the contents of this file into, preferably, a list of dictionaries.
So the expected output list would be something like this:
outputList = [
{'ProjCat': 'I', 'RefNum': '0001', 'ProjTitle': 'Medical Research in XXX Field,2007', 'MemberName': 'Gary,Susan', 'ProjDeadline': '20.05.07', 'ProjGrade': '80'},
{'ProjCat': 'R', 'RefNum': '0023', 'ProjTitle': 'Grid Computing in today era', 'MemberName': 'Henry Williams,Tulali Mark', 'ProjDeadline': '04-May-07', 'ProjGrade': '--NA--'},
{'ProjCat': 'MP', 'RefNum': '0100', 'ProjTitle': 'Thinking in Logical Way, How to do it?', 'MemberName': 'Williams,Harly Dimitry', 'ProjDeadline': '10.02.07', 'ProjGrade': 'NA'},
{'ProjCat': '1', 'RefNum': '0114', 'ProjTitle': 'Computational Research for Biological Science, How to?', 'MemberName': 'Alalaa', 'ProjDeadline': '15-Mar-06', 'ProjGrade': ''}
]
3) Now, I have a problem reading the file line by line, because the CSV file may contain string fields that themselves contain commas (such as the 'ProjTitle' and 'MemberName' fields).
What I currently have in hand is each line as a string.
If I just use the 'split' method of str, it gives me a misleading result; for example, "Medical Research in XXX Field,2007" is split into ['"Medical Research in XXX Field', '2007"'], which is not what I want.
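To make the problem concrete, here is a small snippet that reproduces it on the first data row (the variable names `line` and `fields` are just for illustration):

```python
# First data row from the sample file above, held as a plain string.
line = 'I,0001,"Medical Research in XXX Field,2007","Gary,Susan",20.05.07,80'

# Naive split on every comma, including the commas inside the quoted fields.
fields = line.split(',')

print(len(fields))  # 8 pieces instead of the expected 6
print(fields)
```

Both the title and the member names get broken apart, and the double quotes stay attached to the pieces.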
Is there any other way to split the fields correctly? Using a regular expression, perhaps? Is there a good approach for solving this?
4) Is it possible for the value of a certain key in a dictionary to be left empty (as in the value for the 'ProjGrade' key of the last entry in the outputList above)?
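To show what I mean, the last entry would look something like the sketch below. Using an empty string for the missing grade is only my assumption; None might be another way to mark it.

```python
# Last row of the sample file as a dictionary.
# The empty string standing in for the missing grade is an assumption.
entry = {
    'ProjCat': '1',
    'RefNum': '0114',
    'ProjTitle': 'Computational Research for Biological Science, How to?',
    'MemberName': 'Alalaa',
    'ProjDeadline': '15-Mar-06',
    'ProjGrade': '',
}

# The key is still present even though its value is empty.
print('ProjGrade' in entry)
print(repr(entry['ProjGrade']))
```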
Any suggestions would be welcome.
Thanks in advance
Shige