Hey all,
I ran a script today that digested a 304 MB input file consisting of about 10 million lines with six columns apiece, and it died with an indexing error. While troubleshooting, I copied only the first 1,000,000 lines into a new data file, ran the script against that, and it finished without a problem. Google searches say that Python arrays are limited only by the RAM of the machine.
My question is: how can I get a sense of the limitations of my machine? The machine I'm running the code on has several 64-bit processors and 8 GB of RAM. Is there an exact way (say, a built-in command?) to test whether a data file will be too large, without requiring that I actually run the code on it and wait for it to error out? Secondly, what would you recommend I do to avoid this kind of problem in the future? Lastly, is there a smart way to amend the code so that if it fails, it will tell me exactly which line it failed on, so I get a sense of how far it got before crashing?
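To make the first and last questions concrete, here is roughly what I have in mind. I don't know whether this is the idiomatic way to do it, and the file name and the column handling below are just placeholders rather than my actual code:

import os

DATA_FILE = "data.txt"  # placeholder; the real file is ~304 MB

# For the first question: is an up-front sanity check like this sensible,
# e.g. comparing the file size against the memory I know the machine has?
print("Input file is %d bytes" % os.path.getsize(DATA_FILE))

# For the last question: keep track of the line number while reading, so
# any failure can be reported along with the exact line it happened on.
line_number = 0
try:
    with open(DATA_FILE) as f:
        for line_number, line in enumerate(f, start=1):
            columns = line.split()
            # ... the processing that produced the indexing error goes here ...
except Exception as err:
    print("Failed at line %d: %s" % (line_number, err))
    raise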
Thanks