
How To Find and Delete Duplicate Files in Google Drive

by Khadka's Coding Lounge, February 20th, 2023

Too Long; Didn't Read

Nibesh Khadka explains how to use Google Apps Script to delete duplicate files. This blog is meant for both coders and non-coders, so, please do humor the over-explanation. You can set up a time-based trigger to automate this task.


You might be running an organization where files are created automatically or by other employees and collaborators. Have you been troubled by the way Google Drive allows several files with the same name to sit in one folder?


Then this blog is for you. You’ll get a script to delete the duplicate files and also be able to automate the process.


Hi, this is Nibesh Khadka. I am a freelancer who helps clients with jobs related to Google Workspace products and Google Apps Script. This blog is meant for both coders and non-coders, so please do humor the over-explanation.

Intro

First of all, you have to understand that "duplicate" here means files with the same name and the same size. The script won't check the contents of the files. So, we'll write a script that will:

  1. Target a folder.
  2. Check the contents of the folder.
  3. Compare the names and sizes of the files with each other.
  4. Consider two files duplicates if both the name and the size match.
  5. Remove the duplicate files, if any exist.
  6. Run from a trigger to automate this task.
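
Before touching Drive, the comparison in steps 3 to 5 can be tried on plain data. This is only a sketch: splitDuplicates and the sample records are made up for illustration, and it uses a Set where the script below uses a simple list.

```javascript
// Hypothetical helper: splits plain {name, size} records into files to keep
// and files that repeat an earlier name-and-size pair.
function splitDuplicates(files) {
  const seen = new Set();
  const keep = [];
  const duplicates = [];
  for (const file of files) {
    // JSON.stringify gives an unambiguous key for the (name, size) pair.
    const key = JSON.stringify([file.name, file.size]);
    if (seen.has(key)) {
      duplicates.push(file);
    } else {
      seen.add(key);
      keep.push(file);
    }
  }
  return { keep, duplicates };
}

// Made-up sample data.
const sample = [
  { name: "report.pdf", size: 1200 },
  { name: "report.pdf", size: 1200 }, // same name and size: duplicate
  { name: "report.pdf", size: 900 },  // same name, different size: kept
  { name: "notes.txt", size: 1200 },
];
const result = splitDuplicates(sample);
console.log(result.duplicates.length); // 1
```

As in the real script, the first file seen with a given name and size is the one that stays.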

Apps Script

Let's first go to your Apps Script project home and create a new script. Empty the Code.gs file, then add the code below.


// Add id of the folder to check for duplicate
const FOLDER_ID = "";

/**
 * Looks for duplicate files (same name and size) in the designated folder
 * and moves them to the trash.
 */
function removeDuplicateFile() {
  let folder = DriveApp.getFolderById(FOLDER_ID);

  let files = folder.getFiles();

  let fileList = [];

  // if the folder has no files, there is nothing to do
  if (!files.hasNext()) {
    return;
  }

  // else
  while (files.hasNext()) {
    let file = files.next(),
      name = file.getName(),
      size = file.getSize();

    // the first file with a given name and size is kept; later matches are trashed
    if (isDuplicateFile(fileList, name, size)) {
      file.setTrashed(true);
    } else {
      fileList.push([name, size]);
    }
  }
}

/**
 * Helper for removeDuplicateFile.
 * Checks whether the given list already holds a file with the same name and size.
 * @param {Array} lst list of [name, size] pairs seen so far
 * @param {String} name
 * @param {Number} size
 * @returns {Boolean} true if a matching pair is already in the list
 */
function isDuplicateFile(lst, name, size) {
  for (let i = 0; i < lst.length; i++) {
    if (lst[i][0] === name && lst[i][1] === size) return true;
  }
  return false;
}


/**
 * Delete all the triggers if there are any
 */
var deleteTrigger = () => {
  let triggersCollection = ScriptApp.getProjectTriggers();
  if (triggersCollection.length <= 0) {
    console.log(`No triggers to delete`);
  } else {
    triggersCollection.forEach((trigger) => ScriptApp.deleteTrigger(trigger));
  }
  return;
};

/**
 * Function to run from a trigger: deletes the project's existing triggers,
 * if there are any, and then removes the duplicate files.
 */
function removeDuplicateFileTrigger() {
  // First Delete existing triggers
  deleteTrigger();

  // now remove duplicate files 
  removeDuplicateFile();
}



removeDuplicateFileTrigger() deletes every installable trigger in the project before it calls removeDuplicateFile(), so that triggers don't accumulate and cause errors. If you don't want to automate the task, run the removeDuplicateFile() function in the script instead.


You can find your folder's ID in the URL when you have the folder open in Google Drive, as highlighted in the image below.


Get Folder ID

Non-coders: Look at the second line of the code → const FOLDER_ID = "". Paste the folder ID from the URL between the quotes.
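
If you would rather not pick the ID out of the URL by eye, a small helper can do it. This is a sketch that assumes the ID is the part of the URL right after "/folders/"; extractFolderId is a made-up name and the ID in the example is a placeholder, not a real folder.

```javascript
// Hypothetical helper: returns the folder ID from a Drive folder URL,
// or null when the URL has no "/folders/<id>" part.
function extractFolderId(url) {
  // Assumes IDs consist of letters, digits, underscores, and hyphens.
  const match = /\/folders\/([A-Za-z0-9_-]+)/.exec(url);
  return match ? match[1] : null;
}

// The ID below is a placeholder for illustration.
const exampleUrl =
  "https://drive.google.com/drive/folders/abc123_XYZ-example?usp=sharing";
console.log(extractFolderId(exampleUrl)); // abc123_XYZ-example
```

The value it returns is what goes between the quotes of FOLDER_ID.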

Setup Trigger

You can now set up a time-based trigger to automate this task.

Setup Trigger


Now, follow the steps indicated in the image above.

  1. In your script, press the clock button on the left panel.
  2. In step 2, remember to select the removeDuplicateFileTrigger function.
  3. In step 3, pick the time-driven event source.
  4. In step 4, choose how often the trigger should run: hourly, weekly, monthly, and so on.
  5. Choose the time. The options you see depend on your selection in step 4.
  6. Hit Save and you're done.
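
The same kind of trigger can also be created from code instead of the editor's UI. The sketch below is an alternative under a few assumptions, not a replacement for the steps above: createHourlyTrigger and HANDLER are made-up names, the hourly interval is only an example, and the trigger points at removeDuplicateFile() directly because removeDuplicateFileTrigger() deletes the project's triggers each time it runs.

```javascript
// Name of the function the trigger should call (defined earlier in this post).
const HANDLER = "removeDuplicateFile";

// Hypothetical helper: removes any existing trigger for HANDLER, then creates
// a single hourly one. Run it once from the Apps Script editor.
function createHourlyTrigger() {
  // Delete only the triggers that call HANDLER, so they don't pile up.
  ScriptApp.getProjectTriggers()
    .filter((trigger) => trigger.getHandlerFunction() === HANDLER)
    .forEach((trigger) => ScriptApp.deleteTrigger(trigger));

  // Time-based trigger that fires every hour.
  ScriptApp.newTrigger(HANDLER).timeBased().everyHours(1).create();
}
```

Because it filters by handler name, running it again leaves one trigger for removeDuplicateFile() and does not touch triggers for other functions.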



Warning: This script does not take into account Google Apps Script's limits on triggers and script run time. So, it probably won't work for very large and old folders with many files.
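
One way to soften the run-time problem is to stop early when a time budget is used up. The sketch below shows only that idea: processWithinBudget and TIME_BUDGET_MS are made-up names, and the five-minute figure is an assumption for illustration, so check the current Apps Script quotas for your account.

```javascript
// Assumed budget for illustration; check the current Apps Script quotas.
const TIME_BUDGET_MS = 5 * 60 * 1000;

// Hypothetical helper: calls handle(item) for each item of an iterator that
// has hasNext() and next() (the shape of a Drive file iterator) until the
// items run out or the budget is spent. Returns true if all were processed.
function processWithinBudget(iterator, handle, budgetMs = TIME_BUDGET_MS, now = Date.now) {
  const start = now();
  while (iterator.hasNext()) {
    if (now() - start >= budgetMs) {
      return false; // out of time; remaining items are left for a later run
    }
    handle(iterator.next());
  }
  return true;
}
```

In the script above, the iterator would be folder.getFiles() and handle would hold the duplicate check. Picking up where a run stopped would also need the list of names and sizes seen so far, which this sketch does not store.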

Reference

Inspired by this StackOverflow answer.

Thank You for Your Time

I make Google Add-Ons and can also write Google Apps Scripts for you. If you need my services let me know.