Securely Renaming a Massive Amount of Files with PHP – Part 2

Part One: Reading a folder structure recursively.
Part Two: Renaming the files (uniquely) and storing new and old file paths.
Part Three: Copy the files into a sequenced folder system.

This tutorial is a continuation of Securely Renaming a Massive Amount of Files with PHP. In Part 1, we covered:

  • how to set up PHP locally
  • how to call our PHP script in command line
  • how to recursively read a directory

In Part 2 we will store our original file names in an array, along with each file’s new, uniquely hashed name. Let’s get started!

Part Two: Renaming the files (uniquely) and storing new and old file paths.

Storing the original file names is a simple one-liner that we put inside the else statement.

} else {
    self::$file_names[] = $file_path;
}
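
To see how a static array accumulates paths across recursive calls, here is a minimal sketch that needs no real files. It assumes a nested array standing in for the folder tree; the PathCollector class and its collect method are hypothetical names used only for illustration and are not part of this tutorial's script.

```php
<?php
// Hypothetical stand-in for the tutorial's class: the nested array
// plays the role of a folder tree, so no real files are required.
class PathCollector {
    public static $file_names = array();

    public static function collect(array $tree, $dir) {
        foreach ($tree as $name => $node) {
            $file_path = $dir . "\\" . $name;
            if (is_array($node)) {
                // A nested array is treated as a sub-folder: recurse into it.
                self::collect($node, $file_path);
            } else {
                // Anything else is treated as a file: append its path.
                self::$file_names[] = $file_path;
            }
        }
    }
}

$tree = array(
    'a.txt' => true,
    'sub'   => array('b.txt' => true),
);

PathCollector::collect($tree, 'C:\\demo');
print_r(PathCollector::$file_names);
```

Because the property is static, every level of the recursion appends to the same array, so the result is one flat list of paths.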

Now it’s time to generate a hashed name for each of the files. We will use the hash as the array key, so we can detect whether there is a duplicate. The following code generates the md5 hash of the file path and checks whether it is already stored in the array; if it is, the code adds a counter to the end of the filename and rehashes until the hash is unique.

} else {
    //echo $file_path . " FILE \n";
    $hash = md5($file_path);
    if (! array_key_exists( $hash, self::$file_names) ) {
        self::$file_names[$hash] = $file_path;
    } else {
        $key = 1;
        $new_hash = $hash;
        
        while( array_key_exists( $new_hash, self::$file_names) ) {
            $new_file = self::new_file_name( $file, $key );
            $new_hash = md5( $new_file );
            $key++;
        }
        self::$file_names[$new_hash] = $file_path;
    }
}

You will notice I have another function being called in the while loop – self::new_file_name. It separates the extension from the filename, adds the $key counter, and sends it back to the while loop.

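public function new_file_name($file, $key) {
    // Insert the counter between the base name and the extension.
    $ext = self::get_ext($file);
    $file_no_ext = self::remove_ext($file);
    return $file_no_ext . $key . $ext;
}

public function get_ext( $file ) {
    // Return the extension with its leading dot, or '' if there is none.
    $ext = pathinfo($file, PATHINFO_EXTENSION);
    return $ext === '' ? '' : "." . $ext;
}

public function remove_ext( $file ) {
    // Keep the whole name when it contains no dot.
    $pos = strrpos($file, '.');
    return $pos === false ? $file : substr($file, 0, $pos);
}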
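
To try the renaming loop on its own, here is a small sketch under a few assumptions. The demo_ functions are standalone copies of the helpers so the example runs without the class, the $taken array is made-up data, and file names rather than full paths are hashed to keep the example short.

```php
<?php
// Standalone copies of the helpers so this sketch runs by itself.
function demo_get_ext($file) {
    return "." . pathinfo($file, PATHINFO_EXTENSION);
}

function demo_remove_ext($file) {
    return substr($file, 0, strrpos($file, '.'));
}

function demo_new_file_name($file, $key) {
    return demo_remove_ext($file) . $key . demo_get_ext($file);
}

// Hypothetical state: two hashes are already taken.
$taken = array(
    md5('report.pdf')  => 'C:\\docs\\report.pdf',
    md5('report1.pdf') => 'C:\\docs\\old\\report.pdf',
);

$file = 'report.pdf';
$key = 1;
$new_file = $file;
$new_hash = md5($file);

// Keep appending a larger counter until the hash is not in use.
while (array_key_exists($new_hash, $taken)) {
    $new_file = demo_new_file_name($file, $key);
    $new_hash = md5($new_file);
    $key++;
}

echo $new_file . "\n"; // report2.pdf
```

The counter goes in front of the last extension only, so a name such as archive.tar.gz with a counter of 2 becomes archive.tar2.gz.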

So, here is the fully functional code. It includes the line print_r(self::$file_names); so the full array is printed once the directory scan completes.

<?php

/* Arguments for this script:
    -c  : Current Directory
    -n  : New Directory
    -l  : Log file name
*/
$args = getopt('c:n:l:');

define ('CURRENT_DIR', $args['c'] );
define ('NEW_DIR', $args['n'] );
define ('LOG_FILE', $args['l'] );

$hasher = new DirectoryHasher();

class DirectoryHasher {

    public static $file_names = array();
    
    public static $log_array = array();
    
    public static $copied = 0;

    public function __construct() {

        self::read_directories( CURRENT_DIR );
        print_r(self::$file_names);
    }


    public function read_directories( $dir ) {
        $files = scandir( $dir );

        foreach($files as $file ) {
            $file_path = $dir . DIRECTORY_SEPARATOR . $file;

            if( substr($file, 0, 1) !="." ) {
                if( is_dir( $file_path ) )  {
                    self::read_directories( $file_path );           
                } else {
                    $hash = md5($file_path);
                    if (! array_key_exists( $hash, self::$file_names) ) {
                        self::$file_names[$hash] = $file_path;
                    } else {
                        $key = 1;
                        $new_hash = $hash;
                        while( array_key_exists( $new_hash, self::$file_names) ) {
                            $new_file = self::new_file_name( $file, $key );
                            $new_hash = md5( $new_file );
                            $key++;
                        }
                        self::$file_names[$new_hash] = $file_path;
                    }
                }
            }
        }
    }
    
    
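    public function new_file_name($file, $key) {
        // Insert the counter between the base name and the extension.
        $ext = self::get_ext($file);
        $file_no_ext = self::remove_ext($file);
        return $file_no_ext . $key . $ext;
    }
    
    public function get_ext( $file ) {
        // Return the extension with its leading dot, or '' if there is none.
        $ext = pathinfo($file, PATHINFO_EXTENSION);
        return $ext === '' ? '' : "." . $ext;
    }
    
    public function remove_ext( $file ) {
        // Keep the whole name when it contains no dot.
        $pos = strrpos($file, '.');
        return $pos === false ? $file : substr($file, 0, $pos);
    }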
}
?>
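
The script above reads -c, -n and -l with getopt but does not check that they were actually supplied. One possible guard is sketched below. The missing_options function is a hypothetical helper that is not part of the script, and the $args array simulates a getopt result instead of reading the real command line.

```php
<?php
// Hypothetical helper (not part of the tutorial's class): returns the
// option letters that are absent from the array getopt() produced.
function missing_options($args, array $required) {
    // getopt() returns false on failure; treat that as "nothing supplied".
    if (!is_array($args)) {
        $args = array();
    }
    $missing = array();
    foreach ($required as $option) {
        if (!isset($args[$option]) || $args[$option] === '') {
            $missing[] = $option;
        }
    }
    return $missing;
}

// Simulated getopt('c:n:l:') result for a call that forgot -l.
$args = array('c' => 'C:\\source', 'n' => 'C:\\target');
$missing = missing_options($args, array('c', 'n', 'l'));

if (count($missing) > 0) {
    echo "Missing option(s): -" . implode(', -', $missing) . "\n";
}
```

In the real script, a check like this could run right after getopt and stop with exit(1) before the define calls, so the constants are never set from missing values.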

In Part 3 we will copy the files into the new directory and write the .csv file that records the data currently held in our $file_names array.
