Dear friends,
I have a set of rows like the following (the filename is a string, the values are floats, and the "notimportant" column is self-explanatory):
filenameA, value1_1, value2_1, value3, value4, notimportant
filenameA, value1_2, value2_2, value3, value4, notimportant
filenameA, value1_3, value2_3, value3, value4, notimportant
filenameA, value1_5, value2_5, value3, value4, notimportant
filenameA, value1_7, value2_7, value3, value4, notimportant
...
filenameB, value1_1, value2_1, value3, value4, notimportant
filenameB, value1_5, value2_5, value3, value4, notimportant
filenameB, value1_7, value2_7, value3, value4, notimportant
...
filenameC, value1_1, value2_1, value3, value4, notimportant
filenameC, value1_7, value2_7, value3, value4, notimportant
filenameC, value1_9, value2_9, value3, value4, notimportant
From this huge list (I would also appreciate suggestions on how to store this information temporarily), I need to find every "value1, value2" pair that occurs at least 3 times in the list, and then retrieve the full rows for each such pair.
So, in the example above, my ideal output would be:
filenameA, value1_1, value2_1, value3, value4, notimportant
filenameB, value1_1, value2_1, value3, value4, notimportant
filenameC, value1_1, value2_1, value3, value4, notimportant
filenameA, value1_7, value2_7, value3, value4, notimportant
filenameB, value1_7, value2_7, value3, value4, notimportant
filenameC, value1_7, value2_7, value3, value4, notimportant
(not
filenameA, value1_5, value2_5, value3, value4, notimportant
and
filenameB, value1_5, value2_5, value3, value4, notimportant
because "value1_5, value2_5" is not repeated AT LEAST 3 times.)
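To make the selection rule concrete, here is a minimal sketch in Python of one possible approach, not a definitive solution. It assumes the rows are available as comma-separated text lines, that the second and third fields are value1 and value2, and that "repeated" means the pair appears in at least 3 rows regardless of filename. The function name rows_with_repeated_pair and the min_count parameter are made up for illustration. The rows are stored temporarily in a dictionary keyed by the (value1, value2) pair.

```python
from collections import defaultdict


def rows_with_repeated_pair(lines, min_count=3):
    """Return the full rows whose (value1, value2) pair occurs in at
    least min_count rows. Hypothetical helper, for illustration only."""
    # Temporary storage: (value1, value2) -> list of full rows.
    groups = defaultdict(list)
    for line in lines:
        fields = [field.strip() for field in line.split(",")]
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        # Assumption: the pair is compared as text. Use
        # (float(fields[1]), float(fields[2])) instead if values such
        # as "1.0" and "1.00" should count as the same number.
        key = (fields[1], fields[2])
        groups[key].append(line.strip())

    # Keep only the groups that are large enough. Dictionaries preserve
    # insertion order (Python 3.7+), so groups come out in the order in
    # which each pair was first seen.
    result = []
    for rows in groups.values():
        if len(rows) >= min_count:
            result.extend(rows)
    return result
```

With the example rows above, this would keep the three value1_1 rows and the three value1_7 rows and drop the value1_5 rows, since that pair occurs only twice. If the data lives in a real CSV file with quoted fields, the standard csv module would be a safer way to split each line than str.split.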
How would you suggest that I proceed?
Thanks a lot,
Gianluca