
Splitting a large file using PHP

A while ago, I was presented with a somewhat unusual problem. I had an automatically generated log file that had grown to 1.2 GB. (Never let your log files get this massive, kids.) A file that size is very annoying to read: most programs need to load the entire file into RAM before they can display it. Trying alternative software was an exercise in futility too; nothing seemed able to read it fast enough (or at all, in many cases), thanks to the same RAM problem. So I struck upon the idea of splitting the file into chunks that would actually be readable. As with most of my bulk processing needs, I once again turned to PHP. Here’s the solution that I came up with:

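 $file = new SplFileObject("test_dump.sql");
 $count = 0;
 $lines = "";
 while (!$file->eof()) {
  // fgets() keeps each line's own line ending, so nothing extra is appended
  $lines .= $file->fgets();
  $count++;
  if ($count % 100 == 0) {
   // write the accumulated 100 lines to an incrementally named file
   $output = new SplFileObject('dump'.$count.'.txt', 'w');
   $output->fwrite($lines);
   // reset lines to be blank
   $lines = "";
  }
 }
 // write any leftover lines (fewer than 100) to a final file
 if ($lines !== "") {
  $output = new SplFileObject('dump'.$count.'.txt', 'w');
  $output->fwrite($lines);
 }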

The script makes use of SplFileObject, an object-oriented class for handling files. Its useful advantage over file_get_contents(), which loads the whole file into memory at once, is that it can read a file line by line (the older fopen() and fgets() functions can do the same, but through a procedural interface). This allows the script to load (in this example) 100 lines into a variable, and then dump them into an incrementally named file, which can be easily read. It’s not the most sophisticated solution, but it’s fairly fast, and it gets around RAM limitations.
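For anyone who wants to reuse the idea, here is a minimal sketch of the same line-by-line splitting wrapped in a function. The function name splitFileByLines(), its parameters, and the output naming scheme (prefix plus chunk number) are my own assumptions for illustration, not part of the script above. It also differs in one respect: instead of collecting 100 lines in a variable first, it writes each line straight to the current chunk file, so only one line is held in memory at a time.

```php
<?php
// Hypothetical helper (not from the original script): splits $inputPath into
// numbered chunk files holding at most $linesPerChunk lines each, and returns
// the number of chunk files written.
function splitFileByLines(string $inputPath, string $outputPrefix, int $linesPerChunk = 100): int
{
    if ($linesPerChunk < 1) {
        throw new InvalidArgumentException('linesPerChunk must be at least 1');
    }

    $input = new SplFileObject($inputPath, 'r');
    $chunkIndex = 0;   // number of chunk files opened so far
    $lineCount = 0;    // lines written to the current chunk
    $output = null;    // current chunk file, opened only when there is a line to write

    while (!$input->eof()) {
        $line = $input->fgets();
        if ($line === '') {
            // An empty string means nothing was left to read; a blank line in
            // the file would still carry its line ending, so it is not skipped.
            continue;
        }
        if ($output === null) {
            $chunkIndex++;
            $output = new SplFileObject($outputPrefix . $chunkIndex . '.txt', 'w');
        }
        $output->fwrite($line);
        if (++$lineCount === $linesPerChunk) {
            $output = null;   // dropping the reference closes the chunk file
            $lineCount = 0;
        }
    }

    return $chunkIndex;
}

// Example call (assumed file names):
// $chunks = splitFileByLines('test_dump.sql', 'dump', 100);
```

Numbering the chunks 1, 2, 3 rather than by line count is just a design choice; it keeps the file names sequential even when the last chunk is short.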