Hey guys,
This question is more about the organization of Python modules than the actual code inside them.
In my project, we have several scientists performing similar analyses on several different data sets. For each project the science changes, but the analysis often requires almost identical data handling. For example, one project may require us to read a large tab-delimited data file, store all of the values from one column, and then compare them with the values from a column in a different file. The protocol is usually to build dictionaries with lists as values, which seems to be the best way to ensure no information is lost during comparisons. Sometimes I just compare one list with another. Sometimes I compare a list with the keys in a dictionary. Sometimes I compare one dictionary's keys to another's.
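To make the pattern concrete, here is a minimal sketch of the kind of thing our modules keep re-implementing. The function names (column_to_dict, compare_keys), the column indices, and the sample data are all made up for illustration; our real files and modules look different in the details.

```python
import csv
import io
from collections import defaultdict


def column_to_dict(handle, key_col, value_col, delimiter="\t"):
    """Map each entry in key_col to the list of value_col entries seen with it."""
    table = defaultdict(list)
    for row in csv.reader(handle, delimiter=delimiter):
        if len(row) <= max(key_col, value_col):
            continue  # skip blank or short rows
        # Appending to a list keeps every value, even for repeated keys.
        table[row[key_col]].append(row[value_col])
    return dict(table)


def compare_keys(left, right):
    """Return (shared, only_left, only_right) as sorted lists.

    Works for lists and dicts alike, since set() of a dict yields its keys.
    """
    left_keys, right_keys = set(left), set(right)
    return (
        sorted(left_keys & right_keys),
        sorted(left_keys - right_keys),
        sorted(right_keys - left_keys),
    )


# Made-up data standing in for two tab-delimited files.
file_a = io.StringIO("gene1\t0.5\ngene2\t1.2\ngene1\t0.7\n")
file_b = io.StringIO("gene2\thigh\ngene3\tlow\n")

dict_a = column_to_dict(file_a, key_col=0, value_col=1)
dict_b = column_to_dict(file_b, key_col=0, value_col=1)
shared, only_a, only_b = compare_keys(dict_a, dict_b)
print(shared, only_a, only_b)
```

The same compare_keys call covers the three cases I mentioned (list vs. list, list vs. dictionary keys, dictionary keys vs. dictionary keys), which is part of why it feels like this should live in one shared place.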
My question is: do you know of any packages that could help streamline this process? We now have several nearly identical modules floating around that do almost the same thing, and it is getting tedious to keep them all organized. Are there any packages available that are used for this type of comparison, namely column comparison between data files? If I could start using the same package for all of this analysis, it would really clear up some of the clutter in the research group.
Advice is greatly appreciated.