Most of us have, at some point, ended up with a pile of disorganized files on our machines. It happens. One minute you're extracting a large zip file; the next, its contents are scattered across the directory, mixed in with your important files, and you're left manually sorting what needs to go where. It's a real pain. To ease this process, we're going to delve into file management with Python the smart way. Work smart, not hard.

Let's begin. We'll be using Python 3.6 or greater: `pathlib` has shipped with Python since 3.4, and the f-strings we use later need 3.6. Assuming you've got Python up and running already, we're going to take a walk with the `os` module and a few others we will introduce along the way. These all come with Python, so there's no need to install anything else to follow along.

Creating random files

Create a directory to work with. Call it `ManageFiles`. Inside this folder, create another folder, `RandomFiles`. Your directory structure should now look like this:

```
ManageFiles/
|
|_RandomFiles/
```

We're going to create random files to play with in the `RandomFiles` directory. Create a file `create_random_files.py` inside the `ManageFiles` directory. You now have this:

```
ManageFiles/
|
|_ create_random_files.py
|_RandomFiles/
```

Done? Now type in the following code; we'll get into its details in a moment.

```python
import os
from pathlib import Path
import random

list_of_extensions = ['.rst', '.txt', '.md', '.docx', '.odt', '.html', '.ppt', '.doc']

# get into the RandomFiles directory
os.chdir('./RandomFiles')

for item in list_of_extensions:
    # create 20 random files for each file extension
    for num in range(20):
        # let the file begin with a random number between 1 to 50
        file_name = random.randint(1, 50)
        file_to_create = str(file_name) + item
        Path(file_to_create).touch()
```

As of Python 3.4, we've got `pathlib`, our little magic box. We also import Python's `random` module for creating random numbers. Hold on to that thought; we'll cover it when we get to the line that uses it.
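If `touch()` and `randint()` are new to you, here is a minimal sketch of the two calls on their own. It is not part of `create_random_files.py`: the temporary directory and the variable names are assumptions made purely for illustration, so nothing in your real folders is touched.

```python
import random
import tempfile
from pathlib import Path

# Work in a throwaway directory so no real files are affected.
with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)

    # touch() creates an empty file. Touching a name that already
    # exists only updates its timestamp, so repeats collapse into one file.
    for _ in range(20):
        number = random.randint(1, 50)  # both 1 and 50 can come up
        (folder / f"{number}.txt").touch()

    created = sorted(p.name for p in folder.iterdir())
    print(len(created), "files created:", created)
```

Because the numbers are random, the count printed will vary from run to run, and it can be lower than 20 whenever a number repeats.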
First off, we create a list of file extensions from which we will build our random files. Feel free to add to it. Next up, we change into the `RandomFiles` directory. Then comes our loop, so here goes.

We are simply saying: take each item in `list_of_extensions` and do the following to it. Let's take `.txt`, for instance. We get into another loop where, for this `.txt`, we do something 20 times.

Remember our import of `random`? We use it to pick a random number between 1 and 50 for our file name. In short, this little loop saves us, the less creative lot (don't worry, I'm part of this crew), the time of naming random files. We simply create a file, say `23.txt` or `14.txt`, with a number that falls within our range of 1 to 50, twenty times over. If the same number comes up twice, `touch()` just touches the existing file again, so you may end up with slightly fewer than 20 files per extension. This is just to create a mess large enough to be painful to sort by hand. The same process is repeated for the other extensions.

Next? Run this in your terminal, from inside `ManageFiles`:

```
python create_random_files.py
```

Congratulations! We now have a mess of a directory. Now to clean it up.

In the same location as `create_random_files.py`, create a file `clean_up.py` and type in the code below.

Method 1:

```python
import os
import shutil
import glob

# get into the RandomFiles directory
os.chdir('./RandomFiles')

# get the list of files in the directory RandomFiles
files_to_group = []
for random_file in os.listdir('.'):
    files_to_group.append(random_file)

# get all the file extensions present
file_extensions = []
for our_file in files_to_group:
    file_extensions.append(os.path.splitext(our_file)[1])

print(set(file_extensions))

file_types = set(file_extensions)

# note: `type` shadows Python's built-in here; harmless in this short script
for type in file_types:
    # strip the leading period so the folder is not hidden
    new_directory = type.replace(".", "")
    # create directory with given name
    os.mkdir(new_directory)

    for fname in glob.glob(f'*.{type[1:]}'):
        shutil.move(fname, new_directory)
```

For this, we import two new libraries: `shutil` and `glob`. `shutil` will help us move our files, while `glob` will help find the files to classify.
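To see the two libraries at work in isolation, here is a minimal sketch. It assumes a throwaway temporary directory and three made-up file names (`1.txt`, `2.txt`, `3.md`); none of this is part of `clean_up.py`.

```python
import glob
import os
import shutil
import tempfile

# Build a throwaway folder holding three hypothetical files.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("1.txt", "2.txt", "3.md"):
        open(os.path.join(tmp, name), "w").close()

    # The folder that the .txt files should end up in.
    target = os.path.join(tmp, "txt")
    os.mkdir(target)

    # glob.glob returns the paths that match a wildcard pattern...
    matches = glob.glob(os.path.join(tmp, "*.txt"))

    # ...and shutil.move relocates each match into the target folder.
    for path in matches:
        shutil.move(path, target)

    moved = sorted(os.listdir(target))
    left_behind = sorted(n for n in os.listdir(tmp) if n != "txt")
    print(moved)
    print(left_behind)
```

Running it should print `['1.txt', '2.txt']` followed by `['3.md']`: only the files matching the pattern are moved, and everything else stays put.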
Just like before, this will all become clear as we get to the lines that use them.

First off, we get a list of all the files in the directory. Here, we assume we have no clue what files are in there. So rather than listing every extension by hand and handling each one with `if` statements or a `switch`, we want the program to look through the directory and work it out for us. What if the directory held dozens of extensions, or log files? Would you do this manually?

Once we have the list of all the files in the folder, we enter another loop to get the file extension of each one. Notice how we use:

```python
os.path.splitext(our_file)[1]
```

At this point, the `our_file` variable looks something like `5.docx`. When we split it, we get the tuple `('5', '.docx')`. We then take index `[1]` from it, which gives us `.docx`, since `5` is at index `[0]`.

So we now have a list of every file extension present in the folder, repeats included. To remove the repeats, we make a set, which keeps only the unique items from the list. In our case, if an extension such as `.docx` appeared in the list over and over, the set would hold just one of it.

```python
file_types = set(file_extensions)  # create a set and assign it to a variable
```

Remember, each entry in our set of file types still has the leading `.`. If we created a folder with exactly that name, we would end up with hidden folders, and that is something we do not want. So, as we loop over the set, we create a directory named after the extension, only this time we replace the `.` in the name with an empty string.

```python
new_directory = type.replace(".", "")  # our directory would now be called 'docx'
```

We still need the extension, `.docx`, to move the files:

```python
for fname in glob.glob(f'*.{type[1:]}'):
```

This simply means: take any file that ends with the file extension `.docx`. Notice the spacing in `f'*.{type[1:]}'`: there is none.
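The steps so far can be tried on a plain list of names, without touching the disk. This is only a sketch: the file names are made up, and `fnmatch` stands in for `glob` here because it applies the same shell-style wildcard rules to strings instead of to files on disk.

```python
import fnmatch
import os

# Hypothetical file names standing in for the result of os.listdir('.').
names = ["5.docx", "34.docx", "7.txt", "12.md", "7.md"]

# splitext returns a (root, extension) pair; index 1 is the extension.
extensions = [os.path.splitext(name)[1] for name in names]

# A set keeps a single copy of each extension.
file_types = set(extensions)

for ext in sorted(file_types):
    folder = ext.replace(".", "")  # '.docx' -> 'docx'
    pattern = f"*.{ext[1:]}"       # '.docx' -> '*.docx'
    # Collect the names this wildcard pattern would pick up.
    matching = [n for n in names if fnmatch.fnmatchcase(n, pattern)]
    print(folder, pattern, matching)
```

Each line printed pairs a folder name with its pattern and the files that would be moved into it, which is a handy way to check the grouping before moving anything for real.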
The wildcard `*` means a file can be named anything, provided it ends in `.docx`. Since the pattern already contains the period, we only need what comes after it in the extension string. That's why we use `[1:]`, which means take everything after the first character: drop the `.` and keep `docx`.

What next? Move any file with this extension into the directory named after it.

```python
shutil.move(fname, new_directory)
```

Because we loop over a set, each directory is created exactly once, so no duplicates can be made. In short, we will not have one folder to store `5.docx` and another to store `34.docx`, and so on. Once a directory has been made, all the files with that extension move there. That's it!

Method 2

Alternatively, you can use a list comprehension and a generator expression. These are compact ways of building a collection in one line.

```python
import os
import shutil
import glob

# get into the RandomFiles directory
os.chdir('./RandomFiles')

# take every file from the directory and add to a list for all files
all_files = [x for x in os.listdir('.')]

# make a set for the extensions present in the directory
file_types = set((os.path.splitext(f)[1] for f in all_files))

for ftype in file_types:
    new_directory = ftype.replace(".", '')
    os.mkdir(new_directory)

    for fname in glob.glob(f'*.{ftype[1:]}'):
        shutil.move(fname, new_directory)
```

Both of these will work. You've now got all your files sorted by extension.

```
ManageFiles/
|
|_clean_up.py
|_create_random_files.py
|_RandomFiles/
    |_doc
    |_docx
    |_html
    |_md
    |_odt
    |_ppt
    |_rst
    |_txt
```

Woosh! That was a lot. We did save some time, though.

Any questions? Feel free to reach out. That's it for now. Stick around as we take it up a notch next week.

For the code on this, check TheGreenCode.

As always: @codes_green

Previously published at https://thegreencodes.com/file-management-with-python-ck02cpxu30010fqs1stv451gx