The Copyscape Premium API allows you to seamlessly integrate Copyscape Premium into your internal systems. This lets you automatically check the originality of new content as it enters your workflow, or perform large-scale checks for plagiarism of offline content.
For WordPress users, our plugin allows you to use your API credentials to seamlessly integrate Copyscape Premium into your WordPress workflow. If you have created a private index, the API also lets you add content to your private index or check new content against it.
Technically speaking, the API (Application Programming Interface) allows your developers to write scripts on your server that query the Copyscape Premium service and receive results in JSON, XML or HTML format.
The API costs the same as the web interface. Each search costs 3c for the first 200 words plus 1c per additional 100 words or part thereof.
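The pricing rule above can be sketched as a small calculation. This is only an illustration of the stated formula; the function name `search_cost_cents` is hypothetical, and it assumes that a search of 200 words or fewer is billed the base 3c.

```python
def search_cost_cents(word_count: int) -> int:
    """Estimate the cost of one search, in cents, from the stated pricing."""
    if word_count < 0:
        raise ValueError("word_count must be non-negative")
    base_cents = 3  # covers the first 200 words
    extra_words = max(0, word_count - 200)
    # 1c per additional 100 words "or part thereof": round up with
    # integer ceiling division.
    extra_blocks = -(-extra_words // 100)
    return base_cents + extra_blocks


if __name__ == "__main__":
    for words in (150, 200, 201, 300, 301):
        print(words, "words:", search_cost_cents(words), "cents")
```

Working in whole cents avoids floating-point rounding when many searches are summed.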
To use the API, please sign up for Copyscape Premium.
To begin using the API, you may write a simple script in a server-side language such as PHP, Java or ASP that runs on your server and queries the API. The API cannot be accessed by JavaScript/Ajax running within a web page, since web browsers do not allow cross-domain Ajax requests.
The following sections provide the information you need to use the Copyscape Premium API:
Sample Code - Some examples which demonstrate how to use the API.
Initial Checks and Full Comparisons - Explains the two levels of checks available in the API.
URL Search Request - Explains how to check for copies of a web page via the API.
Text Search Request - Explains how to check for copies of some text via the API.
JSON/XML Search Response - Describes JSON and XML responses that you receive from API search requests.
HTML Search Response - Describes HTML responses that you receive from API search requests.
URL Add to Private Index Request - Explains how to add the content from a URL to your private index.
Text Add to Private Index Request - Explains how to add some text to your private index.
Add to Private Index Response - Describes responses from API requests that add to your private index.
Delete from Private Index Request and Response - Explains how to delete content from your private index.
Check Balance Request and Response - Describes how to check your account balance.
Sample code for accessing the Copyscape API is available for download in PHP, Python, Ruby, Java, Perl, ColdFusion and these flavors of ASP.NET: C#, Visual Basic, C# (Razor syntax), VB (Razor syntax).
Copyscape can check for matching content in two ways: an Initial Check and a Full Comparison.
Initial Check - For users who prioritize speed, the Initial Check is a fast and efficient way to search for matching content based on content indexes. The Initial Check returns an indicative text "snippet" containing a short selection of matching text from each found page. The length of this snippet, given in the minwordsmatched field of the JSON/XML search response, provides an indication of which of the found pages contain the largest quantities of matching text. However, the Initial Check does not provide an exact number of matching words for each found page.
Full Comparison - For users who don't mind longer search times, the Copyscape API can also perform an additional, more thorough check called a Full Comparison. In a Full Comparison, Copyscape attempts to fetch the complete text of a page found in the Initial Check, and compares it against the text that was submitted. For example, a Full Comparison is performed every time you click on a Copyscape result in the web interface. To perform Full Comparisons through the API, set the c parameter in the API search requests below to a value between 1 and 10. This controls the maximum number of Full Comparisons to perform, prioritizing the results with the longest snippet lengths from the Initial Check. If a Full Comparison is successfully performed for a page, additional fields such as wordsmatched are included in the JSON/XML search response for that page. Note that a Full Comparison may fail if the found page is no longer available, or uses a non-HTML format that Copyscape cannot read.
To check for copies of a web page via the Copyscape API, send an HTTP GET request to either of these URLs:
http://www.copyscape.com/api/
https://www.copyscape.com/api/
Parameters are specified on the URL (using ? and &) as follows:
Parameter | Explanation | Value | Required? | Default |
u | Your username | [your username] | Yes | - |
k | Your API key | [your API key] | Yes | - |
o | API operation | csearch (or psearch or cpsearch if you create a private index) | Yes | - |
q | Source URL | [urlencoded URL] | Yes | - |
c | Full comparisons | 0 to 10 | No | 0 |
f | Response format | json or xml or html | No | xml |
i | Ignore sites | [comma-delimited domains to ignore] | No | - |
l | Spend limit | [value in dollars, e.g. 0.50] | No | - |
x | Example test | 1 or omitted | No | - |
API operation (o): Use csearch to search against the public Internet or psearch to search against your private index. You can also use cpsearch to search against both the Internet and your private index, for the cost of two searches.
Source URL (q): As per the HTTP specification, this must be urlencoded. For example, ? should be replaced by %3F and & replaced by %26. Most languages provide a built-in function for urlencoding - see examples in PHP, Java and ASP.
Full comparisons (c): Set to a value between 1 and 10 to request a full comparison (with an exact count of matching words) between the query text and the top (one to ten) results found. Note that full comparisons may add a delay of a few seconds (see Initial Checks and Full Comparisons).
Ignore sites (i): Subdomains are also omitted from the results. For example, if set to site1.com,site2.com then www.site1.com and blog.site2.com would also be ignored. Ignore sites listed in your user settings are always applied.
Spend limit (l): If this parameter is omitted, the limit per search set in your User Settings is applied.
Example test (x): If set to 1, the API will search the Internet for copies of http://www.copyscape.com/example.html and you will not be charged.
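As a rough sketch of the request described above, the following Python builds a URL search request using only the standard library. The function name `build_url_search` and its keyword arguments are hypothetical; only the single-letter parameter names and the endpoint come from the tables above. The sketch requests JSON explicitly, since the API default is XML.

```python
from urllib.parse import urlencode

API_ENDPOINT = "https://www.copyscape.com/api/"


def build_url_search(username, api_key, source_url, operation="csearch",
                     full_comparisons=0, response_format="json",
                     ignore_sites=None, spend_limit=None, example=False):
    """Return the GET URL for a URL search (hypothetical helper)."""
    if not 0 <= full_comparisons <= 10:
        raise ValueError("full_comparisons must be between 0 and 10")
    params = {
        "u": username,
        "k": api_key,
        "o": operation,          # csearch, psearch or cpsearch
        "q": source_url,         # urlencode() escapes ? and & for us
        "c": full_comparisons,
        "f": response_format,    # json, xml or html
    }
    if ignore_sites:
        params["i"] = ",".join(ignore_sites)   # comma-delimited domains
    if spend_limit is not None:
        params["l"] = f"{spend_limit:.2f}"     # dollars, e.g. 0.50
    if example:
        params["x"] = 1                        # free example test
    return API_ENDPOINT + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_url_search("myuser", "mykey",
                           "http://example.com/page?a=1&b=2",
                           full_comparisons=3))
```

The resulting URL can then be fetched with any HTTP client, for example `urllib.request.urlopen(url)`. Credentials shown here are placeholders.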
To check for copies of some text via the Copyscape API, send an HTTP POST request to either of these URLs:
http://www.copyscape.com/api/
https://www.copyscape.com/api/
The text to be searched and other parameters can be specified in one of two ways:
The parameters are as follows:
Parameter | Explanation | Value | Required? | Default |
u | Your username | [your username] | Yes | - |
k | Your API key | [your API key] | Yes | - |
o | API operation | csearch (or psearch or cpsearch if you create a private index) | Yes | - |
e | Text encoding | [encoding name] | Yes | - |
t | Text to be searched | [the text] | Yes | - |
c | Full comparisons | 0 to 10 | No | 0 |
f | Response format | json or xml or html | No | xml |
i | Ignore sites | [comma-delimited domains to ignore] | No | - |
l | Spend limit | [value in dollars, e.g. 0.50] | No | - |
x | Example test | 1 or omitted | No | - |
API operation (o): Use csearch to search against the public Internet or psearch to search against your private index. You can also use cpsearch to search against both the Internet and your private index, for the cost of two searches.
Text encoding (e): Use an IANA name, such as UTF-8 (Unicode), ISO-8859-1 (Latin-1) or WINDOWS-1251 (Cyrillic).
Text to be searched (t): If you are using the Raw POST method, as described above, the raw text should be supplied in the POST payload without a parameter name and without any urlencoding.
Full comparisons (c): Set to a value between 1 and 10 to request a full comparison (with an exact count of matching words) between the query text and the top (one to ten) results found. Note that full comparisons may add a delay of a few seconds (see Initial Checks and Full Comparisons).
Ignore sites (i): Subdomains are also omitted from the results. For example, if set to site1.com,site2.com then www.site1.com and blog.site2.com would also be ignored. Ignore sites listed in your user settings are always applied.
Spend limit (l): If this parameter is omitted, the limit per search set in your User Settings is applied.
Example test (x): If set to 1, the API will search the Internet for copies of the text on http://www.copyscape.com/example.html and you will not be charged.
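A minimal sketch of a text search request follows. It assumes that one of the accepted ways of supplying the parameters is a standard form-encoded POST body in which the text is sent as the named, urlencoded `t` parameter; that is an inference from the contrast with the Raw POST method, not a confirmed detail. The helper name `build_text_search` is hypothetical, and the sketch only constructs the request without sending it.

```python
from urllib.parse import urlencode
from urllib.request import Request

API_ENDPOINT = "https://www.copyscape.com/api/"


def build_text_search(username, api_key, text, encoding="UTF-8",
                      operation="csearch", full_comparisons=0,
                      response_format="json"):
    """Build (but do not send) a form-encoded POST text search request."""
    fields = {
        "u": username,
        "k": api_key,
        "o": operation,
        "e": encoding,           # IANA encoding name, e.g. UTF-8
        "t": text,               # the text to be searched
        "c": full_comparisons,
        "f": response_format,
    }
    # Percent-encode the text using the same encoding declared in "e".
    # Assumption: the IANA name is also a codec name Python recognizes.
    body = urlencode(fields, encoding=encoding).encode("ascii")
    return Request(
        API_ENDPOINT,
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


if __name__ == "__main__":
    request = build_text_search("myuser", "mykey",
                                "When in the Course of human events")
    print(request.get_method(), request.full_url)
    print(request.data.decode("ascii"))
```

To send the request, pass the returned object to `urllib.request.urlopen`. Keeping the `e` parameter and the actual byte encoding in step matters, since a mismatch would garble any non-ASCII text.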
For searches with JSON responses, the API returns a JSON object with the fields listed below. For searches with XML responses, the API returns a <response> XML element with the subelements listed below, UTF-8 encoded.
Name | Explanation | Present? | Example |
query | URL searched | If a URL search | http://mydomain.com/page.html |
error | Reason for API request failure | If request failed | No credits remaining |
querywords | Number of words checked | If succeeded | 583 |
cost | Cost in dollars of this search | If succeeded | 0.09 |
count | Number of results found | If succeeded | 6 |
allwordsmatched | Number of source words matched | If succeeded and c>=3 and o is not cpsearch | 387 |
allpercentmatched | Percentage of source words matched | If succeeded and c>=3 and o is not cpsearch | 56 |
alltextmatched | Full extract of source text matched | If succeeded and c>=3 and o is not cpsearch | When in the Course of human events... |
allviewurl | URL for viewing found results | If succeeded and o is csearch | http://view.copyscape.com/search/a1b2c3d4e5 |
The query value may differ from the original URL you supplied if there was a frameset or redirection.
The allwordsmatched, allpercentmatched and alltextmatched values are based on full comparisons performed between the source text and the top (up to 10) results found. They summarize the portion of the source text that was matched in any of these full comparisons (see Initial Checks and Full Comparisons). They are present if the c parameter is 3 or more, and the search was not performed against both the Internet and your private index simultaneously.
The allviewurl value can be used to display the list of results in an iframe or window. If used, the contents of this page must be displayed in full, without modification. Please note that these links are temporary, and are valid for no longer than a few weeks, after which they expire.
For successful searches with JSON responses, the top-level object contains a result array of zero or more objects, each describing one result that was found. For successful searches with XML responses, the top-level <response> element contains zero or more <result> subelements, each describing one result that was found. In either case, each result has the fields below:
Name | Explanation | Present? | Example |
index | Position in results | Yes | 1 |
url | URL of found page or source URL of page in private index | Yes | http://www.law.indiana.edu/uslawdocs/declaration.html |
handle | Handle of found article | If private index | SIA_1_4487334_3978624 |
id | ID of found article | If private index | MY_ARTICLE_123 |
articlewords | Number of words in found article | If private index | 639 |
added | When the found article was added (GMT) | If private index | 2024-11-21 06:04:28 |
title | Title of the found web page or article in your private index | Yes | Declaration of Independence |
textsnippet | Text snippet showing some of the matching text | Yes | ... separate and equal station to which ... |
htmlsnippet | HTML version of snippet for display in web pages | Yes | <font color="#777777">...</font> <font color="#000000">separate and equal station</font> |
minwordsmatched | Minimum number of words matching | Yes | 96 |
viewurl | URL for viewing found page | If Internet result | http://view.copyscape.com/compare/a1b2c3d4e5/1 |
The minwordsmatched value is an approximate and relative measure of the amount of matching content found for each result. For an exact count of matching words in the top results, use the c API parameter to request full comparisons (see Initial Checks and Full Comparisons).
The viewurl value can be used to display the found page, with the matching content highlighted, in an iframe or window. If used, the contents of this page must be displayed in full, without modification.
If a full comparison was performed for a result, it may also contain:
Name | Explanation | Present? | Example |
urlwords | Number of words in found page | If Internet page retrieved OK | 950 |
wordsmatched | Exact number of words matching | If page retrieved OK | 133 |
percentmatched | Percentage of submitted content matched on page | If page retrieved OK | 13 |
textmatched | Matching text in full | If page retrieved OK | When in the Course of human events... |
urlerror | Error retrieving URL | If Internet page not retrieved | The document could not be retrieved - error code 404 |
Please note that additional fields may be added in future, so your JSON or XML parser must safely ignore any that are not recognized.
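The sketch below shows one way a client might read a JSON search response while tolerating unrecognized fields. The sample payload is invented for illustration from the example values in the tables above (including a made-up `futurefield`), and it is an assumption that numeric fields may arrive either as JSON numbers or as strings. The names `CopyscapeError` and `summarize_results` are hypothetical.

```python
import json

# Illustrative payload only, assembled from the example values above.
SAMPLE_RESPONSE = """
{
  "query": "http://mydomain.com/page.html",
  "querywords": 583,
  "count": 2,
  "futurefield": "not recognized, so it is ignored",
  "result": [
    {"index": 1,
     "url": "http://www.law.indiana.edu/uslawdocs/declaration.html",
     "title": "Declaration of Independence",
     "minwordsmatched": 96, "wordsmatched": 133, "percentmatched": 13},
    {"index": 2, "url": "http://example.com/copy.html",
     "title": "Example copy", "minwordsmatched": "40",
     "urlerror": "The document could not be retrieved - error code 404"}
  ]
}
"""


class CopyscapeError(Exception):
    """Raised when the response carries an error field."""


def summarize_results(raw):
    """Return one summary dict per result, reading only known fields."""
    data = json.loads(raw)
    if "error" in data:
        raise CopyscapeError(data["error"])
    summaries = []
    for item in data.get("result", []):
        exact = item.get("wordsmatched")  # present only after a full comparison
        words = exact if exact is not None else item["minwordsmatched"]
        summaries.append({
            "url": item["url"],
            "title": item.get("title", ""),
            "words": int(words),          # int() accepts numbers and strings
            "exact": exact is not None,
        })
    return summaries


if __name__ == "__main__":
    for row in summarize_results(SAMPLE_RESPONSE):
        kind = "exact" if row["exact"] else "minimum"
        print(f'{row["words"]:>5} words ({kind}): {row["url"]}')
```

Looking fields up by name, rather than iterating over everything in the object, is what lets the parser ignore fields added later.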
For HTML responses, the API returns UTF-8 encoded content with minimal HTML formatting.
If the search request succeeded, the title of the HTML page contains the URL queried (if appropriate) and the number of results found. The body of the page includes a series of paragraphs, one for each result, for example:
Declaration of Independence : Indiana Law
... for opposing with manly firmness his invasions on the rights of the people. ... For transporting us beyond Seas to be tried for pretended offences: ... He has plundered our seas, ravaged our Coasts, burnt our towns, and destroyed the ... He has excited domestic insurrections amongst us, and has endeavoured to bring on ... the merciless Indian Savages, whose known rule of warfare, ... by their legislature to extend an unwarrantable jurisdiction over us. ... which, would inevitably interrupt our connections and correspondence. ... by the Authority of the good People of these Colonies, solemnly publish and declare, ... http://www.law.indiana.edu/uslawdocs/declaration.html
If the API request failed, the HTML response will contain some red text describing the error.
The HTML format may change in the future, so you should not rely on its structure. The HTML also contains less information than the JSON and XML formats, and excludes full comparisons. To show more information or ensure consistent formatting, please use the JSON or XML response format and build your own HTML.
This API operation requires a private index to be created for your account.
To add the content from a URL to your private index, send an HTTP GET request to either of these URLs:
http://www.copyscape.com/api/
https://www.copyscape.com/api/
Parameters are specified on the URL (using ? and &) as follows:
Parameter | Explanation | Value | Required? | Default |
u | Your username | [your username] | Yes | - |
k | Your API key | [your API key] | Yes | - |
o | API operation | pindexadd | Yes | - |
q | Source URL | [urlencoded URL] | Yes | - |
i | Article ID | [ID for private index] | No | [none] |
f | Response format | json or xml or html | No | xml |
Source URL (q) and Article ID (i): These parameters must be urlencoded. For example, ? should be replaced by %3F, & by %26 and space by + or %20. Most languages provide a built-in function for urlencoding - see examples in PHP, Java and ASP.
The title of the article in your private index is taken from the web page at the URL provided. The request returns a response confirming if the operation was successful.
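The urlencoding rules for q and i can be illustrated with a short sketch. The helper name `build_index_add_url` is hypothetical; the parameter letters and the pindexadd operation are taken from the table above.

```python
from urllib.parse import urlencode

API_ENDPOINT = "https://www.copyscape.com/api/"


def build_index_add_url(username, api_key, source_url,
                        article_id=None, response_format="json"):
    """Return the GET URL that adds a web page to the private index."""
    params = {
        "u": username,
        "k": api_key,
        "o": "pindexadd",
        "q": source_url,
    }
    if article_id is not None:
        params["i"] = article_id   # optional ID for the private index
    params["f"] = response_format
    # urlencode() turns ? into %3F, & into %26 and spaces into +.
    return API_ENDPOINT + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_index_add_url("myuser", "mykey",
                              "http://mydomain.com/page.html?id=7&v=2",
                              article_id="Article 123"))
```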
This API operation requires a private index to be created for your account.
To add some text to your private index, send an HTTP POST request to either of these URLs:
http://www.copyscape.com/api/
https://www.copyscape.com/api/
The text to be added and other parameters can be specified in one of two ways:
The parameters are as follows:
Parameter | Explanation | Value | Required? | Default |
u | Your username | [your username] | Yes | - |
k | Your API key | [your API key] | Yes | - |
o | API operation | pindexadd | Yes | - |
e | Text encoding | [encoding name] | Yes | - |
t | Text to be added | [the text] | Yes | - |
a | Article title | [title for private index] | No | [none] |
i | Article ID | [ID for private index] | No | [none] |
f | Response format | json or xml or html | No | xml |
Text encoding (e): Use an IANA name, such as UTF-8 (Unicode), ISO-8859-1 (Latin-1) or WINDOWS-1251 (Cyrillic).
Text to be added (t): If you are using the Raw POST method, as described above, the raw text should be supplied in the POST payload without a parameter name and without any urlencoding.
The request returns a response confirming if the operation was successful.
For JSON responses, the API returns a JSON object with the fields listed below. For XML responses, the API returns a <response> XML element with the subelements listed below, UTF-8 encoded.
Name | Explanation | Present? | Example |
url | URL whose content was added | If succeeded and URL specified | http://mydomain.com/page.html |
words | Number of words in content added | If succeeded | 472 |
handle | Reference for article created | If succeeded | SIA_1_4487334_3978624 |
id | Article ID from API request | If succeeded and ID provided | Article 123 |
title | Title of article added | If succeeded and title available | Declaration of Independence |
cost | Cost of operation in dollars | If succeeded | 0.01 |
error | Explanation of problem | If failed | A private index has not been created |
The handle returned may contain up to 32 ASCII characters and can be used to delete the article in the future.
The title is taken from the web page, if a URL was specified. Otherwise it contains the title provided in the API request.
Please note that additional fields may be added in future, so your JSON or XML parser must safely ignore any that are not recognized.
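For the XML format, a parser along the following lines would read the documented subelements and skip anything else. The sample document is made up from the example values in the table (plus an invented `futurefield`), and the function name `parse_index_add` is hypothetical. It assumes each field is a direct child of the `<response>` element.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

# Illustrative response only, built from the example values above.
SAMPLE_XML = """<response>
  <url>http://mydomain.com/page.html</url>
  <words>472</words>
  <handle>SIA_1_4487334_3978624</handle>
  <id>Article 123</id>
  <title>Declaration of Independence</title>
  <cost>0.01</cost>
  <futurefield>ignored</futurefield>
</response>"""


def parse_index_add(xml_text):
    """Extract the known fields from an add-to-private-index response."""
    root = ET.fromstring(xml_text)
    error = root.findtext("error")
    if error is not None:
        raise RuntimeError(error)
    return {
        "handle": root.findtext("handle"),       # keep this for later deletion
        "words": int(root.findtext("words")),
        "cost": Decimal(root.findtext("cost")),  # dollars, kept exact
        "id": root.findtext("id"),               # None if no ID was provided
        "title": root.findtext("title"),         # None if no title available
        "url": root.findtext("url"),             # None for text submissions
    }


if __name__ == "__main__":
    info = parse_index_add(SAMPLE_XML)
    print(info["handle"], info["words"], info["cost"])
```

Storing the returned handle alongside your own article record makes the later delete request straightforward.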
For HTML responses, the API will return a message confirming whether the content was added successfully.
This API operation requires a private index to be created for your account.
To delete an item of content from your private index, send an HTTP GET request to either of these URLs:
http://www.copyscape.com/api/
https://www.copyscape.com/api/
Parameters are specified on the URL (using ? and &) as follows:
Parameter | Explanation | Value | Required? | Default |
u | Your username | [your username] | Yes | - |
k | Your API key | [your API key] | Yes | - |
o | API operation | pindexdel | Yes | - |
h | Handle | Handle of article | Yes | - |
f | Response format | json or xml or html | No | xml |
For JSON responses, the API returns a JSON object with the fields listed below. For XML responses, the API returns a <response> XML element with the subelements listed below, UTF-8 encoded.
Name | Explanation | Present? | Example |
handle | Handle of deleted article | If succeeded | SIA_1_4487334_3978624 |
id | ID of deleted article | If succeeded | MY_ARTICLE_123 |
error | Explanation of problem | If failed | An article with this handle could not be found |
For HTML responses, a textual description of the result of the request is returned as basic HTML.
There is no charge for deleting articles from your private index.
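A delete request and its JSON response might be handled as in this sketch. Both helper names are hypothetical, and the sample responses in the usage section are invented from the example values in the table above.

```python
import json
from urllib.parse import urlencode

API_ENDPOINT = "https://www.copyscape.com/api/"


def build_index_delete_url(username, api_key, handle, response_format="json"):
    """Return the GET URL that deletes one article from the private index."""
    params = {"u": username, "k": api_key, "o": "pindexdel",
              "h": handle, "f": response_format}
    return API_ENDPOINT + "?" + urlencode(params)


def deleted_handle(raw_json):
    """Return the handle of the deleted article, or raise on failure."""
    data = json.loads(raw_json)
    if "error" in data:
        raise LookupError(data["error"])
    return data["handle"]


if __name__ == "__main__":
    print(build_index_delete_url("myuser", "mykey", "SIA_1_4487334_3978624"))
    print(deleted_handle('{"handle": "SIA_1_4487334_3978624", '
                         '"id": "MY_ARTICLE_123"}'))
```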
To check how much credit you have remaining, send an HTTP GET request to either of these URLs:
http://www.copyscape.com/api/
https://www.copyscape.com/api/
Parameters are specified on the URL (using ? and &) as follows:
Parameter name | Explanation | Value | Required? |
u | Your username | [your username] | Yes |
k | Your API key | [your API key] | Yes |
o | Name of operation | balance | Yes |
f | Response format | json or xml or html | No - xml by default |
For JSON responses, the API returns a JSON object with the fields listed below. For XML responses, the API returns a <remaining> XML element with the subelements listed below, UTF-8 encoded.
Name | Explanation | Example |
value | Monetary value of your remaining credit in dollars | 999.50 |
today | Number of Internet searches remaining today | 9990 |
For backwards compatibility, the response also contains a total field, but this no longer has a valid meaning. In the past, Copyscape Premium had a fixed price per search, so it was possible to know how many searches were available for a given balance. With per-word pricing, this is not possible.
For HTML responses, a textual description of your balance is returned as basic HTML.
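A balance response in JSON could be read as sketched below. The sample payloads in the usage section are invented from the example values above, the function name `parse_balance` is hypothetical, and it is an assumption that the fields may arrive as either numbers or strings. The legacy total field is deliberately not read.

```python
import json
from decimal import Decimal


def parse_balance(raw_json):
    """Return (remaining credit in dollars, Internet searches left today)."""
    # parse_float=Decimal keeps a numeric value such as 999.50 exact.
    data = json.loads(raw_json, parse_float=Decimal)
    value = Decimal(str(data["value"]))   # also handles a string value
    today = int(data["today"])
    return value, today


if __name__ == "__main__":
    credit, searches_today = parse_balance(
        '{"value": "999.50", "total": 0, "today": 9990}')
    print(f"${credit} remaining, {searches_today} Internet searches left today")
```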
If you have any questions or problems regarding the API, please contact us.