Hello,
I have one big XML file (600 MB to 850 MB) named in the format "cells_yyyymmdd_hhmi.xml". Every day I will get a new file with a new date, so there should be a general way to read it and cut it.
For example, the file from 7th January is cells_20140107_154016.xml.
The goal is to split it into smaller parts with a shell script and then run an operation on each part. It would be great if anyone could suggest how to check the file size and, if the file is too big, make 4 parts instead of 3.
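For the size check and the changing file name, something along these lines is roughly what I have in mind. It is only a sketch: the function names parts_for_size and latest_cells_file are made up, and the 700 MB threshold is an assumption that would need tuning.

```shell
#!/bin/sh

# Hypothetical helper: print 4 if FILE is larger than LIMIT bytes, else 3.
# The default limit (700 MB) is an assumed threshold, not a measured one.
parts_for_size() {
    file=$1
    limit_bytes=${2:-734003200}              # 700 * 1024 * 1024
    size_bytes=$(( $(wc -c < "$file") ))     # arithmetic strips any padding from wc
    if [ "$size_bytes" -gt "$limit_bytes" ]; then
        echo 4
    else
        echo 3
    fi
}

# Hypothetical helper: print the newest cells_*.xml file in DIR.
# Relies on the yyyymmdd_hhmi stamp in the name sorting in date order.
latest_cells_file() {
    dir=${1:-.}
    latest=
    for f in "$dir"/cells_*.xml; do
        [ -e "$f" ] || continue              # the glob matched nothing
        latest=$f                            # globs expand sorted, so the last match wins
    done
    [ -n "$latest" ] && printf '%s\n' "$latest"
}
```

With these, `file=$(latest_cells_file .)` followed by `parts=$(parts_for_size "$file")` would pick today's file and the number of parts without hard-coding the date.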
What I have done so far:
head -n 1125000 cells_20140107_154016.xml > PART1.xml
echo "</details></cells>" >> PART1.xml
echo "<cells><details>" > PART2.xml
sed -n '1125001,2250000p' cells_20140107_154016.xml >> PART2.xml
echo "</details></cells>" >> PART2.xml
echo "<cells><details>" > PART3.xml
sed -n '2250001,3480000p' cells_20140107_154016.xml >> PART3.xml
The main task is to make this general, so that it works for any day's file.
Expected output:
head -n 1125000 filename.xml > PART1.xml
echo "</details></cells>" >> PART1.xml
echo "<cells><details>" > PART2.xml
sed -n '1125001,2250000p' filename.xml >> PART2.xml
echo "</details></cells>" >> PART2.xml
echo "<cells><details>" > PART3.xml
sed -n '2250001,3480000p' filename.xml >> PART3.xml
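Put more generally, I imagine the fixed line numbers would be computed from the file's line count. Below is a minimal sketch of that idea. The function name split_xml is hypothetical, and it assumes (as my commands do) that the file starts with <cells><details>, ends with </details></cells>, and that every cut falls between two complete records.

```shell
#!/bin/sh

# Hypothetical function: split_xml FILE PARTS
# Writes PART1.xml .. PART<PARTS>.xml into the current directory.
split_xml() {
    file=$1
    parts=$2
    total=$(( $(wc -l < "$file") ))
    per_part=$(( (total + parts - 1) / parts ))   # lines per part, rounded up
    i=1
    while [ "$i" -le "$parts" ]; do
        first=$(( (i - 1) * per_part + 1 ))
        last=$(( i * per_part ))
        {
            # Every part except the first needs the opening tags added.
            if [ "$i" -gt 1 ]; then
                echo "<cells><details>"
            fi
            if [ "$i" -lt "$parts" ]; then
                # Print this part's line range, then stop reading the file.
                sed -n "${first},${last}p;${last}q" "$file"
                # Every part except the last needs the closing tags added.
                echo "</details></cells>"
            else
                # Last part: print through end of file, which already
                # contains the original closing tags.
                sed -n "${first},\$p" "$file"
            fi
        } > "PART$i.xml"
        i=$(( i + 1 ))
    done
}
```

For example, `split_xml cells_20140107_154016.xml 4` would produce PART1.xml to PART4.xml. Cutting purely by line count can still split a record in half if records span several lines, so the cut points may need adjusting to the real record boundaries.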
I hope I am clear.
Thanks in advance for your time and input.