
Can Blockchain Technology Change Plagiarism Detection in Academia?

by Ellie Richards, December 4th, 2022

Too Long; Didn't Read

A blockchain is a digital ledger in which each entry is immutable. Once an entry is added to a blockchain, it is linked to the hashes of the previous entries in the ledger, meaning that it cannot be changed without the change being detected. The current approach to plagiarism detection is generally accurate and reliable, allowing universities to process large amounts of text data while making few errors in plagiarism assessments. As an illustration, imagine a situation in which an academic text is added as an entry in a blockchain database, including a timestamp and information about its authors.



With the blockchain craze showing no signs of stopping, we are beginning to see more and more proposals for applying this technology in academic settings. In fact, universities have already used blockchain to change or replace several important processes. Some academic institutions in the US, for example, have decided to issue non-fungible tokens (NFTs) to confirm the authenticity of their diplomas and online certificates.


But what about academic writing? Can blockchain truly change the way that students approach writing and submitting their assignments? In this article, we intend to find out how blockchain technology could change plagiarism detection in modern academia.


What Is the Mainstream Approach to Detecting Plagiarism?


At the moment, plagiarism detection is usually treated as an application of natural language processing (NLP). Put simply, NLP involves using algorithms and software tools to parse and analyze text written by humans.


Typically, an NLP algorithm divides texts into words, sentences, and paragraphs. Using built-in dictionaries and linguistic metadata, such an algorithm can also assign meanings to words and identify the key semantic components of phrases and sentences. If needed, automatic tools can also identify unique linguistic ‘fingerprints’ including commonly appearing words or phrases, the overall tone of a particular document, or complex semantic networks.
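The first steps described above can be sketched in a few lines of Python. This is only a minimal illustration that assumes naive, regex-based splitting rules; the function names are hypothetical, and real plagiarism checkers rely on far more sophisticated linguistic models.

```python
import re
from collections import Counter


def split_sentences(text):
    # Naive assumption: a sentence ends at '.', '!' or '?' followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def split_words(text):
    # Lowercase the text and keep runs of letters, digits and apostrophes.
    return re.findall(r"[a-z0-9']+", text.lower())


def top_words(text, n=3):
    # A very crude linguistic 'fingerprint': the n most frequent words.
    return Counter(split_words(text)).most_common(n)
```

Calling top_words on a whole essay returns its most common words with their counts, which is one of the simplest possible 'fingerprints' a tool could compare across documents.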


From there, a plagiarism detection tool scans other documents available in its pre-loaded database. In general, these databases include the academic works of other students, empirical articles, and other material that could be referenced in a typical essay or dissertation. The databases used by plagiarism detection software are constantly updated by ‘crawler’ algorithms regularly searching for new relevant content.


After that, plagiarism detection tools usually output a numeric rating (typically a percentage) indicating the originality of a given work. The software can also highlight specific phrases or sentences that are borrowed verbatim from other works without a proper citation. Overall, the current approach to plagiarism detection is generally accurate and reliable, allowing universities to process large amounts of text data while making few errors in plagiarism assessments.
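To make the idea of a percentage rating more concrete, here is a rough sketch. It assumes that overlap is measured as the share of three-word sequences in a submission that also appear in a source text, and that originality is simply 100 minus that share. Commercial tools do not publish their exact formulas, so both the metric and the function names below are our own illustrative assumptions.

```python
import re


def words(text):
    return re.findall(r"[a-z0-9']+", text.lower())


def shingles(text, n=3):
    # Every run of n consecutive words, stored as a set of tuples.
    w = words(text)
    return {tuple(w[i:i + n]) for i in range(len(w) - n + 1)}


def overlap_percent(submission, source, n=3):
    # Share of the submission's word sequences that also occur in the source.
    sub = shingles(submission, n)
    if not sub:
        return 0.0
    return 100.0 * len(sub & shingles(source, n)) / len(sub)


def verbatim_sentences(submission, source):
    # Sentences of the submission that appear word for word in the source.
    source_norm = " " + " ".join(words(source)) + " "
    hits = []
    for sentence in re.split(r"(?<=[.!?])\s+", submission):
        norm = " ".join(words(sentence))
        if norm and f" {norm} " in source_norm:
            hits.append(sentence.strip())
    return hits
```

Under these assumptions, an originality score would be 100 - overlap_percent(submission, source), and verbatim_sentences returns the sentences a tool might highlight.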


So why would universities want to change to a blockchain-powered algorithm? For one, it can be pretty difficult to detect ‘non-verbatim’ plagiarism. If a student is adept at rewriting or finding synonyms, they can sometimes fool several existing plagiarism detectors (although this is not a given). Furthermore, plagiarism checkers can produce ‘false positives’, flagging original work as copied. Finally, without a valid timestamp, it can be impossible to determine who copied from whom, providing a possible ‘loophole’ for those accused of plagiarism.
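The ‘non-verbatim’ problem is easy to reproduce with a toy detector. The sketch below assumes a checker that only compares three-word sequences; the two sentences are invented examples, and real tools are considerably harder to fool than this.

```python
import re


def trigrams(text):
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(tokens[i:i + 3]) for i in range(len(tokens) - 2)}


def overlap_percent(submission, source):
    # Share of the submission's three-word sequences found in the source.
    sub = trigrams(submission)
    if not sub:
        return 0.0
    return 100.0 * len(sub & trigrams(source)) / len(sub)


original = "The results show that the method is accurate and reliable."
# Same claim, same structure, but most content words swapped for synonyms.
paraphrase = "The findings indicate that the approach is precise and dependable."

verbatim_score = overlap_percent(original, original)
paraphrase_score = overlap_percent(paraphrase, original)
```

Here the verbatim copy scores 100 percent while the synonym-swapped version shares no three-word sequence with the original, even though the idea is clearly borrowed.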


In addition, the results of plagiarism checks can be difficult to authenticate. What if a database used for plagiarism detection is outdated or inaccurate? Academic texts change all the time through resubmissions and revisions. Or what if an automatic web ‘crawler’ adds erroneous or irrelevant text data? This would significantly reduce the accuracy of plagiarism checks and, possibly, increase the number of ‘false positives’ reported by plagiarism assessment tools.


How Can Blockchain Be Integrated Into Plagiarism Detection?


To answer this question, let’s briefly define what a blockchain actually is. In essence, a blockchain is a digital ledger in which each entry is effectively immutable. Once an entry is added to a blockchain database, it stores a hash of the previous entry in the ledger, which in turn depends on every entry before it. Changing an old entry would therefore invalidate all the entries that follow, meaning that it cannot be altered without the change being noticed.
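A toy hash chain shows why tampering is detectable. This is a minimal sketch under simplifying assumptions: it has no network, consensus mechanism, or digital signatures, all of which real blockchains depend on, and the block layout and function names are hypothetical.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder 'previous hash' for the first block


def block_hash(index, data, prev_hash):
    payload = json.dumps(
        {"index": index, "data": data, "prev_hash": prev_hash}, sort_keys=True
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_block(chain, data):
    # Each new block records the hash of the block before it.
    prev_hash = chain[-1]["hash"] if chain else GENESIS_HASH
    index = len(chain)
    block = {
        "index": index,
        "data": data,
        "prev_hash": prev_hash,
        "hash": block_hash(index, data, prev_hash),
    }
    chain.append(block)
    return block


def is_valid(chain):
    # Recompute every hash and check that each block points at its predecessor.
    prev_hash = GENESIS_HASH
    for i, block in enumerate(chain):
        if block["index"] != i or block["prev_hash"] != prev_hash:
            return False
        if block["hash"] != block_hash(i, block["data"], block["prev_hash"]):
            return False
        prev_hash = block["hash"]
    return True
```

If someone edits the data of an early block, is_valid returns False. Even if they also recompute that block's own hash, the next block still points at the old hash, so the chain remains invalid unless every later block is rewritten as well.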


Blockchain, therefore, does not really offer an alternative to NLP. For example, this technology cannot be used to improve the accuracy of detecting ‘non-verbatim’ plagiarism. However, in our opinion, blockchain can offer a valuable alternative to the traditional databases used by plagiarism assessment suites. How? Let’s consider some brief examples.


For instance, imagine a situation in which an academic text is added as an entry in a blockchain database including a timestamp and authorship information. From that point on, it cannot be deleted or modified in any way. Its mere existence on a blockchain ledger authenticates both the time of its submission and its metadata.
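As a rough illustration, such an entry could look like the record below. The sketch assumes that the ledger stores a SHA-256 fingerprint of the text rather than the full text; this is our own design assumption to keep entries small, and the field and function names are hypothetical.

```python
import hashlib
from datetime import datetime, timezone


def fingerprint(text):
    # Collapse whitespace so that reformatting alone does not change the hash.
    normalized = " ".join(text.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def make_entry(text, authors, submitted_at=None):
    # The timestamp defaults to the current time in UTC.
    if submitted_at is None:
        submitted_at = datetime.now(timezone.utc)
    return {
        "text_hash": fingerprint(text),
        "authors": list(authors),
        "timestamp": submitted_at.isoformat(),
    }


def matches_entry(text, entry):
    # True only if the text is identical (up to whitespace) to what was recorded.
    return fingerprint(text) == entry["text_hash"]
```

Anyone holding the original text can later call matches_entry to show that it is exactly what was recorded at the stated time, while any edit to the text, however small, produces a different fingerprint.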


This means that blockchain eliminates the timestamp ‘loophole’ we have described above. In a blockchain-powered system, determining who copied from whom is easy, as all texts are provided with authentic timestamps. In other words, putting academic texts on a blockchain allows plagiarism detection algorithms to make credible claims about the accuracy of their results, all without direct human supervision.
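With trustworthy timestamps, the question of who came first reduces to a simple lookup. The sketch below assumes ledger entries shaped as dictionaries with text_hash, authors, and an ISO 8601 timestamp field; that layout is hypothetical. It also only handles exact matches, so finding partially overlapping passages would still require the NLP techniques described earlier.

```python
from datetime import datetime


def earliest_submission(entries, text_hash):
    # Among entries recording the same content, return the oldest one.
    matching = [e for e in entries if e["text_hash"] == text_hash]
    if not matching:
        return None
    return min(matching, key=lambda e: datetime.fromisoformat(e["timestamp"]))


# Invented example data: two students registered the same text a month apart.
ledger = [
    {"text_hash": "abc", "authors": ["Student B"], "timestamp": "2022-12-04T10:00:00+00:00"},
    {"text_hash": "abc", "authors": ["Student A"], "timestamp": "2022-11-01T09:30:00+00:00"},
    {"text_hash": "xyz", "authors": ["Student C"], "timestamp": "2022-10-15T08:00:00+00:00"},
]

first = earliest_submission(ledger, "abc")
```

In this invented example, first is Student A's entry, which suggests that Student B's later submission is the copy.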


Blockchains, in theory, also eliminate the issue of version control we have mentioned in the previous section. If all valid versions of an empirical article or a textbook are stored on a blockchain, there is no need for ‘crawlers’ to continuously expand their databases. Instead, all plagiarism checks would run on authentic and up-to-date data, greatly reducing the number of ‘false positives’ and positively contributing to the academic integrity of various universities.


While blockchain technology offers meaningful advantages to universities, there are also many unknowns surrounding its implementation. For one, to our knowledge, no one so far has actually used blockchain ledgers to store large amounts of academic text data. Moreover, we are not aware of any existing plagiarism checker that is linked to a blockchain-powered database. As a result, it is difficult to forecast the costs that could be incurred by using blockchain in plagiarism detection.


Blockchain also remains a largely unregulated space that is not closely tied to any legal provisions. Consequently, large-scale adoption of blockchain by respectable academic institutions may face significant resistance from those distrustful of this technology. It is also unclear how exactly blockchain will change within the next five years, presenting another difficulty for academics wishing to experiment with the use of these digital ledgers.


Blockchain is not a panacea for the issues in contemporary plagiarism detection. The usage of decentralized digital ledgers, for example, does nothing to eliminate ‘non-verbatim’ plagiarism. Nonetheless, blockchain technology offers some interesting solutions to the problems of timestamping and database authentication. We hope that universities take notice and make a real effort to make blockchain more accessible for regularly performed academic tasks.