When we write for websites, we create unique content. That makes it valuable, and it also makes it prone to being stolen or spun.

Unfortunately, there are still rogue publishers who make a quick buck by stealing content from other sites.

Article spinning and content theft are both hard to detect, but there are some tools that will help you protect your investment in the content you publish.

Free Duplicate Content Checkers

Duplicate content checkers scour the web to find copies (or near-copies) of your content. This is the first step in taking action and forcing the thief to remove the content from their site.

  • Copyscape is perhaps the most well-known free duplicate content checker. It’s quick and easy to check a single page: simply paste in the URL and Copyscape scans the web for copies. I performed a test scan using my People Per Hour blog from November last year; Copyscape immediately found a 63% match in a blog comment. (The website owner kindly removed the comment within the hour.)
  • WebConfs.com offers a free Similar Page Checker. However, this scans content in a slightly different way: you need to input the original URL and the URL of the copy so that it can compare the two. It doesn’t scan the web for duplicates, so it isn’t anywhere near as powerful as Copyscape.
  • Plagium checks for duplicate content in a similar way to Copyscape. However, it has a few nice features that are different: it can scan news sites and social networks, for example. You can also adjust the matching algorithm to make it stricter or more relaxed. Plagium is free but donations are accepted.
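To give a rough feel for what "near-copy" matching means, here is a minimal sketch in Python that compares two pieces of text, much as a two-URL comparison tool might. It is an illustration only: the function names, the three-word "shingle" size and the use of Jaccard similarity are my own assumptions, not how Copyscape, WebConfs or Plagium actually work.

```python
import re


def shingles(text: str, size: int = 3) -> set[tuple[str, ...]]:
    """Return the set of overlapping word n-grams ("shingles") in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        # Very short text: treat the whole thing as a single shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def similarity(original: str, suspect: str, size: int = 3) -> float:
    """Jaccard similarity of the two shingle sets, from 0.0 to 1.0."""
    a, b = shingles(original, size), shingles(suspect, size)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


if __name__ == "__main__":
    mine = "The quick brown fox jumps over the lazy dog"
    theirs = "The quick brown fox leaps over the lazy dog"
    print(f"{similarity(mine, theirs):.0%} match")
```

Changing a single word breaks every shingle that contains it, so lightly spun copies still score well above unrelated text but below exact copies. A smaller `size` makes the comparison more relaxed and a larger one makes it stricter.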

Tip: Copyscape can scan pasted text for web duplicates, and it can scan your entire site using a Batch Search. For this, you’ll need to purchase Premium Credits: one credit per page. Copyscape credits are very reasonably priced, with 100 credits costing $5.00. (Note that your credits will expire if they’re not used within a year.)

Ongoing Duplicate Content Monitors

If manual checking is troublesome or impractical, some services will monitor your site and periodically flag up likely copies of your content.

Unlike manual content checks, these services often charge a fee – but not always.

  • Google Alerts is a good, yet basic, content monitoring service. Set up an alert for a unique sentence in a blog post (in quotes); Google will then email you if that sentence is picked up in search. This isn’t practical for large sites, but it does have its uses.
  • Copyscape’s Copysentry service offers continuous protection against stolen web content. Copysentry Standard is a weekly monitor that costs $4.95 per month for 10 pages, with each extra page costing $0.25/month. Copysentry Professional monitors pages daily and costs four times as much. If you publish one blog a day, the cost of Copysentry can quickly build up. However, if you’ve invested in a content marketing campaign, this might be a price worth paying.
  • If Copyscape isn’t right for you, Attributor offers similar services with more features. However, there are no prices on the site and I’ve never used it myself. It could be an option if you need more than Copyscape provides.
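To see how quickly the Copysentry bill builds up, here is a small sketch in Python that estimates the monthly cost from the prices quoted above. The function name is hypothetical, the prices may have changed since writing, and I'm assuming that "four times as much" for Professional applies to the whole bill (base fee and extra pages alike).

```python
def copysentry_monthly_cost(pages: int, professional: bool = False) -> float:
    """Estimate the monthly Copysentry fee in US dollars for `pages` pages."""
    if pages < 1:
        raise ValueError("pages must be at least 1")
    base_fee = 4.95        # Standard plan, covers the first 10 pages
    included_pages = 10
    extra_page_fee = 0.25  # per page per month beyond the first 10
    cost = base_fee + max(0, pages - included_pages) * extra_page_fee
    if professional:
        # Assumption: the Professional multiplier applies to the whole bill.
        cost *= 4
    return round(cost, 2)


if __name__ == "__main__":
    for pages in (10, 40, 365):
        print(pages, copysentry_monthly_cost(pages))
```

Under these assumptions, a year of daily blogging (365 pages) works out at $93.70 per month on Standard.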

Someone Has Stolen My Content! What Now?

If you identify content that has been stolen, you then have to decide how to deal with the issue.

There’s a simple process you can go through to get content removed.

  1. Email the publisher. Some businesses employ freelance bloggers and never bother to check that the blogs they supply are original. This is perhaps understandable, since checking content is a time-consuming process. Sometimes a simple email is all that’s needed to get the content taken down amicably. (If you can’t find their email address, run a Whois query on their domain.)
  2. If your emails go unanswered, file a DMCA takedown notice at DMCA.com. It doesn’t matter if you and/or the thief are outside the US: all that matters is that the stolen content is hosted on a server in the US. (Not sure if the offending site’s host is US-based? Use WhoIsHostingThis to look it up.) You can also report the copy to Google: the Google DMCA Dashboard is designed to make DMCA simple and straightforward, and it’s free. Filing a report is quick and easy, and if Google finds that the content has been stolen, it will remove the duplicate from its index.
  3. If the content is on a forum, it may be easier to consider this fair use and move on. Assuming there’s a link next to the content, it’s probably innocent. The only real exception is where articles have been reposted in full, particularly if the whole lot has been copied and pasted without a link. Emailing the forum administrator is usually the best route for removal. If that doesn’t work, add a reply to the post pointing out the original source.

Before you pursue someone and accuse them of stealing your content, make absolutely sure that it hasn’t been reused legitimately, whether under fair use or under a licence you granted yourself (for example, you may have accidentally applied a Creative Commons licence in your RSS feed).

Preventing Content Theft

In all honesty, an automatic monitoring service is the best way to ensure your content isn’t being stolen. However, if you want to be proactive, there are a couple of other things you can try.

Tynt is a free content protection service. Once you embed a small snippet of JavaScript in your website’s header, every chunk of content that’s copied and pasted will have a link back to your website appended. (Try it now: copy and paste this paragraph into another application.)

Of course, people can just delete the link if they want to, but Tynt at least gives you a chance of being credited for your blog posts.
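The real work happens in the visitor’s browser, in JavaScript, but the idea is simple enough to sketch. The Python function below only illustrates the text transformation: its name, the "Read more:" wording and the minimum-length threshold are all assumptions of mine, not Tynt’s actual behaviour.

```python
def append_attribution(copied_text: str, page_url: str, min_words: int = 5) -> str:
    """Return copied_text with a source link appended, as a copy handler might."""
    if len(copied_text.split()) < min_words:
        # Hypothetical rule: leave very short selections alone.
        return copied_text
    return f"{copied_text}\n\nRead more: {page_url}"


if __name__ == "__main__":
    snippet = "Duplicate content checkers scour the web to find copies of your content."
    print(append_attribution(snippet, "https://example.com/my-post"))
```

Because the link is just extra text on the clipboard, anyone pasting the content can still remove it by hand.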

Tynt also offers a number of other handy features. The links it adds to your content are unique, so you can track the keywords people are using, the number of visits and where your content is being used. Tynt can also append follow links for social services like Twitter. You can append a Creative Commons licence to your link and see which social networks people are using to share your content.

Tynt is completely free and worth installing.

SEOMoz Pro also has a duplicate content monitor as part of its overall SEO toolkit. You’ll find this on the Crawl Diagnostics page of your Pro dashboard; use the drop-down list to show Duplicate Page Content. Note that this scans your own site for duplicated pages; it does not scan externally. As such, it’s mainly for SEO rather than detecting plagiarism.

I found that SEOMoz Pro picked up a lot of duplicate content on my own website, simply because we had tag indexing turned on in Yoast’s WordPress plugin. Don’t panic if your first crawl results are high. You can easily deal with the duplicates on your own site using this Yoast guide.

Any Duplicate Content Tips?

Have you used any other duplicate content checkers, monitors or tools? Let me know in the comments – I’d love to hear about them.

Claire Broadley is a freelance technical blogger for Red Robot Media. Hire Claire to write your business blog or technical user guides.