Securely Renaming a Massive Amount of Files with PHP – The Non-Destructive Route

A recent project’s scope included renaming (literally) tens of thousands of documents for security purposes – it’s not very secure to have files named “November-2011-Finances.doc” hosted online.

This tutorial will explain how to use PHP locally from the command line (without a web server package like XAMPP, MAMP, or WAMP). Using one of these three packages is fine, but we decided to go the system route!

I will break this tutorial down into multiple parts:

Part One: Reading a folder structure recursively.
Part Two: Renaming the files (uniquely) and storing new and old file paths.
Part Three: Copy the files into a sequenced folder system. ( Coming Soon )

Note: I’ll save you the hassle I had with the “guess and check” method of installing PHP locally. Here are some great tutorials on how to install PHP locally on Mac and Windows.

Part One: Reading a folder structure recursively.

Note: I am using a class for this, simply because I was having issues with functions not completing before other commands ran.

First, we need to be able to call our script.php file. Use your favourite text editor, and create your PHP file with a simple script like:

<?php
    echo "Hello World";
?>

Next, open the command line on your computer (Terminal on Mac OS X, Command Prompt on Windows) and enter the following line:

php C:\path\to\your\script.php

In your command line, you should see “Hello World”. This is a good sign that you are calling your file correctly.

The Command

This is the final command we will be using on the command line:

php E:\your-folder\script.php -c E:\files-to-copy -n E:\folder-for-copied-files -l E:\your\logfile

This runs the PHP file located at E:\your-folder\script.php and accepts the following parameters:

  • -c : current directory (the files to copy)
  • -n : new file directory (where you want the files to go)
  • -l : the path and name of your log file (we will write a CSV spreadsheet file with the new and old names)

The second task is accepting the parameters we pass through command line:

<?php
// getopt reads the options passed on the command line; a colon (:) after
// a letter means that option requires a value. The values are returned in an array
$args = getopt('c:n:l:');

// We now define constants to store these values (you don't have to use constants, I just do in this example)
define ('CURRENT_DIR', $args['c'] );
define ('NEW_DIR', $args['n'] );
define ('LOG_FILE', $args['l'] );
?>
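
The snippet above assumes all three options are always passed. If one is left off, getopt() simply omits it from the array, and the define() calls end up with empty values. As a minimal sketch of one way to catch that early, the hypothetical helper below (find_missing_options() is my own name, not part of the original script) reports which required options are absent:

```php
<?php
// Hypothetical helper, not part of the original script: returns the list of
// required options (as "-c", "-n", ...) that are missing from getopt()'s result.
function find_missing_options( array $args, array $required ) {
    $missing = array();

    foreach ( $required as $name ) {
        // getopt() leaves out options that were not passed, and returns an
        // array instead of a string when the same option is repeated
        if ( !isset( $args[$name] ) || !is_string( $args[$name] ) || $args[$name] === '' ) {
            $missing[] = '-' . $name;
        }
    }

    return $missing;
}
```

In the script, this could be called as find_missing_options( getopt('c:n:l:') ?: array(), array('c', 'n', 'l') ) before the define() lines, printing the returned list and stopping if it is not empty.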

The third task is setting up a loop to go through all files and dig into subfolders as we come across them. Here is the foreach loop, wrapped in a recursive function, that we will use to read through the subfolders.

<?php
// getopt reads the options passed on the command line; a colon (:) after a letter means that option requires a value
$args = getopt('c:n:l:');

// We now define constants to store these values (you don't have to use constants, I just do in this example)
define ('CURRENT_DIR', $args['c'] );
define ('NEW_DIR', $args['n'] );
define ('LOG_FILE', $args['l'] );

// Initiate our Class
new DirectoryHasher();

// to store in a variable use: $hasher = new DirectoryHasher();

class DirectoryHasher {
    
    // We will store the original files
    public static $file_names = array();
    
    // We will store our log file data in this array
    public static $log_array = array();
    
    // a variable used to keep track of how many files copied
    public static $copied = 0;

    public function __construct() {

        $this->read_directories( CURRENT_DIR );
    }

    public function read_directories( $dir ) {
        // scandir reads a directory and returns an array of file/folder names
        $files = scandir( $dir );

        // Now that we have a list of all the files, we will check if each of these is a folder or a file
        foreach($files as $file ) {
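            // $file_path will store the entire path ($file is "filename.doc" where $file_path is "C:\folder\filename.doc")
            // DIRECTORY_SEPARATOR is "\" on Windows and "/" on Mac, so the script works on both
            $file_path = $dir . DIRECTORY_SEPARATOR . $file;

            // ignore "." and ".." (the current and parent folder entries that scandir returns);
            // note that this check also skips any hidden file or folder whose name starts with a dot
            if( substr($file, 0, 1) !== "." ) {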
                if( is_dir( $file_path ) )  {
                    // If the array item is a directory, send the directory back through this function
                    $this->read_directories( $file_path );
                } else {
                    // The array item is a file, display the filename
                    echo $file_path . " FILE \n";         
                }
            }
        }
    }
}
?>

That’s it! We instantiate our script’s class, and its constructor passes the path of the directory we want to read to read_directories(). The function starts in that folder and displays every file it finds. When it comes across a folder, it recursively passes that folder back through read_directories(). Folders can be nested to any depth, and this function will find every file inside the parent directory (apart from files and folders whose names start with a dot, which the check above skips)!
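
For comparison, PHP also ships with built-in SPL iterators that can do the same walk without hand-written recursion. This is a different technique from the class above, shown only as a rough sketch: list_files_recursively() is a hypothetical name of my own, and unlike the scandir version it returns the paths in an array and does not skip hidden files.

```php
<?php
// Hypothetical helper, not part of the original script: returns the path of
// every file below $dir, using SPL iterators instead of manual recursion.
function list_files_recursively( $dir ) {
    $paths = array();

    // SKIP_DOTS drops the "." and ".." entries; hidden files are still returned
    $iterator = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator( $dir, FilesystemIterator::SKIP_DOTS )
    );

    foreach ( $iterator as $file ) {
        // Each $file is an SplFileInfo object
        if ( $file->isFile() ) {
            $paths[] = $file->getPathname();
        }
    }

    // The order the iterator returns entries in is not guaranteed, so sort it
    sort( $paths );

    return $paths;
}
```

Calling list_files_recursively( CURRENT_DIR ) and echoing each entry would give output similar to the class above, which may be handy as a cross-check on a small test folder.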

In part 2 of this tutorial, we will store the files we come across, and create new unique names for them. I hope you have enjoyed so far!
