TurnitinBot and Why You Should Block It

Published October 13, 2012

If you've been to university recently you'll be familiar with Turnitin. It's a service that supposedly detects plagiarism in students' submissions. Students submit their paper to Turnitin, and Turnitin generates a report telling you it's x% plagiarised and contains sentences, paragraphs, or even chapters that look suspiciously similar to other papers in its database. It's also woefully stupid; when I did my Master's thesis (on ragdoll physics), Turnitin decided I'd 2% plagiarised from some biochemistry papers, and completely failed to detect any similarity with other papers on ragdoll physics.

It's now trawling the internet, similarly to Google, to build up its database.

The reason this is bad is as follows:

Turnitin is a for-profit organisation and generates money from indexing your stuff, with no permission given by yourself, and at no possible benefit to yourself. It doesn't reimburse you, it doesn't cite you, it just presents your work as being something it has an inalienable right to use for whatever it wishes. Whilst it is supposedly against plagiarism, it has no qualms violating your intellectual property and then making money off it.

Here's how you can block it:

robots.txt:

User-agent: TurnitinBot
Disallow: /
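
If you want to check that those two lines do what you expect before deploying them, here is a minimal sketch using Python's standard urllib.robotparser. The allowed() helper and the example.com URL are my own illustrative names, not anything Turnitin publishes. Bear in mind that robots.txt is purely advisory: it only works if the crawler chooses to honour it.

```python
from urllib.robotparser import RobotFileParser

# The same two rules as above, held in a string for testing.
ROBOTS_TXT = """\
User-agent: TurnitinBot
Disallow: /
"""


def allowed(user_agent, url, robots_txt=ROBOTS_TXT):
    """Return True if robots_txt permits user_agent to fetch url.

    Hypothetical helper for illustration only.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # example.com is a placeholder; substitute a URL from your own site.
    for agent in ("TurnitinBot", "Googlebot"):
        print(agent, allowed(agent, "https://example.com/some/post"))
```

With these rules, TurnitinBot should be refused everywhere while crawlers that aren't named remain allowed.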

.htaccess:

# mod_rewrite must be switched on for the rules below to take effect
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} TurnitinBot [NC]
RewriteRule .* - [F]
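
If your site isn't served by Apache, the same idea (match the User-Agent header case-insensitively and answer with 403 Forbidden) can be sketched as a small piece of WSGI middleware in Python. This is only a sketch under my own assumptions: is_blocked() and block_bots() are hypothetical names, and you would need to wire the wrapper into whatever framework you actually use.

```python
import re

# Case-insensitive match, mirroring the [NC] flag in the .htaccess rule.
BLOCKED_AGENTS = re.compile(r"TurnitinBot", re.IGNORECASE)


def is_blocked(user_agent):
    """Return True if the User-Agent string matches a blocked crawler."""
    return bool(user_agent) and BLOCKED_AGENTS.search(user_agent) is not None


def block_bots(app):
    """Wrap a WSGI app so that blocked crawlers receive 403 Forbidden."""
    def wrapper(environ, start_response):
        if is_blocked(environ.get("HTTP_USER_AGENT", "")):
            start_response("403 Forbidden", [("Content-Type", "text/plain")])
            return [b"Forbidden\n"]
        # Everyone else is passed through to the real application.
        return app(environ, start_response)
    return wrapper
```

Unlike robots.txt, this doesn't rely on the bot's good manners, although a crawler that lies about its User-Agent would still get through.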

Talk is cheap

Thanks for the info on TurnitinBot. I saw this bot hit my site for the first time tonight and was wondering what it could be.

– 10:52:04 9th November 2013

This bot is a big bandwidth hog if you have a blog with a tag cloud and lots of content. I found it had hit just about all the tags on my site already. I haven't checked my other sites' log files yet, but blocking this bot is a must for all webmasters.

– 11:46:32 7th May 2014

TurnitinBot visited my site as well, but it was blocked automatically because I use Incapsula :)
TurnitinBot/3.0 (http://www.turnitin.com/robot/crawlerinfo.html)

– 10:36:14 22nd May 2014

You're upset that Turnitin gave you poor results, so you're attempting to get revenge by making the argument that TurnitinBot is unethically profiting off people's content? This is ironic, because I came upon your site through Google (Googlebot is allowed to crawl you? hmm..) and I clicked on an ad at the top of the search results, meaning Google is profiting off your ironically irrational blog post. Your argument is inconsistent.

– 05:31:55 2nd March 2018

@salty:

I don't think you quite understand the arrangement. I am happy for my content to be indexed by Google and I do not consider that they are violating my IP by doing so. The ad you clicked actually generates profit for me (and for Google). Turnitin uses my work and my bandwidth to generate profit for itself, with no consent implied on my part. That Turnitin was spectacularly useless the last time I experienced it is a note of interest and not really relevant to my main point, which has nothing to do with revenge. There is no inconsistency here - please read the post again if you are confused - it is quite clear.

And Turnitin did not give me a 'poor' result, because I have never been an end user of Turnitin. My university was. It was of no consequence to me that it didn't perform well.

– 21:53:24 7th March 2018
