Majority Voting Approach to Ransomware Detection: File Name Analysis

Too Long; Didn't Read

In this paper, researchers propose a new majority voting approach to ransomware detection.

Authors:

(1) Simon R. Davies, School of Computing, Edinburgh Napier University, Edinburgh, UK;

(2) Richard Macfarlane, School of Computing, Edinburgh Napier University, Edinburgh, UK;

(3) William J. Buchanan, School of Computing, Edinburgh Napier University, Edinburgh, UK.

3.2. File Name Analysis

This collection of tests is performed on the string value of the name of the file being written. It is well known from crypto-ransomware attacks that, in the majority of cases, the affected file names are modified in addition to the file contents being encrypted, for example by adding an extra extension or changing the original file’s name. This set of tests focuses on identifying this change and again leverages the content of the NapierOne dataset.


File Name Entropy Test. This test calculates the Shannon [69] entropy value of the entire file name, including any extensions that it may have. In normal operation, users tend to use lower-entropy strings when naming their files. An analysis of the original file names used to populate the NapierOne dataset shows that the average Shannon entropy of a file name is below six bits. This calculated value also proves to be language-independent [49]. In many cases, when ransomware alters the contents of a file, it also alters the name of the file. Common ransomware file name manipulations are the addition of random strings to the name or its extension, which increases the entropy of the affected file’s name.


The test was then applied to all the files within the test dataset, using their original file names. For the files generated by the execution of ransomware, the file name generated by the ransomware was used. For a file under test, if the calculated entropy value of the entire file name string is under six bits, the test passed and the file was considered benign; otherwise, the test failed and the file was considered malicious.
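The pass/fail rule above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the section does not state how the entropy is normalised, so the sketch assumes per-character Shannon entropy computed from character frequencies. The names `shannon_entropy`, `file_name_entropy_test`, and `ENTROPY_THRESHOLD_BITS` are hypothetical.

```python
import math
from collections import Counter

# Threshold quoted in the text; kept as a parameter because the exact
# entropy calculation used by the authors is an assumption here.
ENTROPY_THRESHOLD_BITS = 6.0


def shannon_entropy(text: str) -> float:
    """Shannon entropy in bits per character, from character frequencies."""
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def file_name_entropy_test(file_name: str,
                           threshold: float = ENTROPY_THRESHOLD_BITS) -> bool:
    """Return True (pass, benign) if the whole name's entropy is under the threshold."""
    return shannon_entropy(file_name) < threshold


if __name__ == "__main__":
    for name in ("holiday_photo.jpg", "report.docx.x7Kq2ZpLm9"):
        print(name, round(shannon_entropy(name), 3), file_name_entropy_test(name))
```

Under this per-character definition, a name needs at least 64 distinct characters to reach six bits, so the authors' exact calculation may differ from the one assumed here; for that reason the threshold is left as a parameter.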


Known File Name Extension Test. As mentioned above, when ransomware encrypts a file it often changes the extension of the affected file or appends an extra one. Sometimes the text of this new extension relates to the name of the ransomware, but often the extension is a random string of between three and 50 characters in length. In normal operation, it is very rare that a file’s extension is not a well-known value, as well-known applications typically generate files with well-known extensions. This test checks that the extension of the file being written is one of the common extensions [18, 10, 27, 48]. This test uses the collated list of known extensions, created by the authors, which is also used in the Magic Number Test described in Section 3.1. If the file extension is present in the list, then it is considered to be well-known. If the file name contains multiple extensions, then this test is applied to the last extension.


The test was then applied to all the files within the test dataset. For a file under test, if the file’s extension is well-known, the test passed and the file was considered benign; otherwise, the test failed and the file was considered malicious.
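The lookup described above can be sketched as follows. The authors' collated extension list is not reproduced in this section, so `KNOWN_EXTENSIONS` below is a small hypothetical stand-in, and the function names are likewise illustrative. Case-insensitive matching and failing files that have no extension are assumptions rather than details taken from the text.

```python
from pathlib import PurePath

# Hypothetical, abbreviated stand-in for the authors' collated list.
KNOWN_EXTENSIONS = frozenset({"pdf", "docx", "xlsx", "jpg", "png", "txt", "zip"})


def last_extension(file_name: str) -> str:
    """Return the final extension, without the dot, lower-cased ('' if none)."""
    return PurePath(file_name).suffix.lstrip(".").lower()


def known_extension_test(file_name: str,
                         known: frozenset = KNOWN_EXTENSIONS) -> bool:
    """Return True (pass, benign) if the last extension is in the known list."""
    return last_extension(file_name) in known


if __name__ == "__main__":
    for name in ("invoice.pdf", "invoice.pdf.x7k2q", "Photo.JPG"):
        print(name, known_extension_test(name))
```

Only the last extension is checked, so a file such as `invoice.pdf.x7k2q` fails even though it still contains a well-known extension earlier in its name.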


File Name Extension Entropy Test. This test calculates the Shannon [69] entropy value of the file name’s extension. If the file has multiple extensions, then the entropy of the entire extension chain is calculated. Often crypto-ransomware will append an extra extension to a file that it has encrypted. This extension can be a text string relating to the ransomware strain, but more recently it has been a random string of between three and 50 characters. When analysing the entropy value of all the extensions in the list of well-known extensions it was found that they all had a Shannon entropy value of below six bits.


The test was then applied to all the files within the test dataset. For a file under test, if the calculated entropy value of the file’s extension, or extensions, is below six bits, the test passed and the file was considered benign; otherwise, the test failed and the file was considered malicious.
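A sketch of the extension-chain variant follows, under stated assumptions: the chain is taken to be everything from the first dot onward (as returned by `PurePath.suffixes`), the dots are included in the string being measured, and entropy is per-character Shannon entropy from character frequencies. None of these details are specified in the text, and the names `extension_chain` and `extension_entropy_test` are hypothetical.

```python
import math
from collections import Counter
from pathlib import PurePath


def shannon_entropy(text: str) -> float:
    """Shannon entropy in bits per character, from character frequencies."""
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def extension_chain(file_name: str) -> str:
    """Join all extensions, e.g. 'a.pdf.x7k2q' -> '.pdf.x7k2q' (assumed definition)."""
    return "".join(PurePath(file_name).suffixes)


def extension_entropy_test(file_name: str, threshold: float = 6.0) -> bool:
    """Return True (pass, benign) if the extension chain's entropy is under the threshold."""
    return shannon_entropy(extension_chain(file_name)) < threshold


if __name__ == "__main__":
    for name in ("invoice.pdf", "invoice.pdf.x7Kq2ZpLm9"):
        chain = extension_chain(name)
        print(name, chain, round(shannon_entropy(chain), 3),
              extension_entropy_test(name))
```

The threshold defaults to the six bits quoted in the text but is exposed as a parameter, since the result depends on how the entropy is normalised.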


This paper is available on arXiv under the CC BY 4.0 DEED license.