Read the fine print: IP Statistics

Lies, damn lies and statistics. This Kat's first appearance on this blog was in 2007 when she queried the statement, "It has been said that 80% of the information found in patents cannot be found anywhere else." It was an oft-used number with no citation. We never really found one.

The basic problem hasn't gone away. IP statistics are regularly thrown about without sufficient caveats or context and, in some cases, are so poorly calculated they should never leave the back of the envelope. Counterfeiting and piracy statistics used early in copyright debates are an excellent example (more here).

Things are improving, but there is a long way to go. Even the Daily Mail finds misleading stats and graphs funny.

All is not lost. The PatStat (IP Statistics for Decision Makers) annual conference discusses these challenges. The UK IPO has produced a handy guide on the use of patent data, which merits a post by itself. (IPKat discussions on patent stats here and here.)

The folks at CIGI Waterloo have just published a paper on cybercrime stats, "Global Cyberspace is Safer than You Think: Real Trends in Cybercrime", by Eric Jardine. Eric examines recent stats in cybercrime and seeks to normalise them. In this case, normalisation means adjusting absolute figures, whether totals (e.g. 1,000 attacks per year) or growth rates (e.g. 50% more attacks in 2014 than in 2013), for the growth of the internet.

To illustrate normalisation, consider the following, totally made-up example: 30% more thefts of iPhone 6 in 2015!!! Without context, rather shocking. However, the iPhone 6 was introduced only in September 2014. There are vastly more iPhone 6s in circulation in 2015, so the risk of theft per phone may actually have decreased. Accounting for the growth in the number of iPhone 6s would normalise these figures.
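The arithmetic behind that example can be sketched in a few lines of Python. Every number here is invented purely for illustration (the theft counts, the number of phones in use and the helper name `rate_per_thousand` are all assumptions, not figures from any report); the point is only that a rising total can sit alongside a falling rate.

```python
def rate_per_thousand(incidents: int, population: int) -> float:
    """Incidents per 1,000 units of the population at risk."""
    if population <= 0:
        raise ValueError("population must be positive")
    return 1000 * incidents / population


# Made-up figures, chosen only to illustrate the arithmetic.
thefts = {2014: 10_000, 2015: 13_000}               # 30% more thefts in absolute terms
phones_in_use = {2014: 1_000_000, 2015: 2_000_000}  # assumed: number of phones doubles

absolute_change = thefts[2015] / thefts[2014] - 1
rate_2014 = rate_per_thousand(thefts[2014], phones_in_use[2014])
rate_2015 = rate_per_thousand(thefts[2015], phones_in_use[2015])
normalised_change = rate_2015 / rate_2014 - 1

print(f"Absolute change in thefts: {absolute_change:+.0%}")
print(f"Thefts per 1,000 phones: {rate_2014:.1f} -> {rate_2015:.1f}")
print(f"Normalised change in theft rate: {normalised_change:+.0%}")
```

With these invented inputs the headline figure rises by 30% while the rate per 1,000 phones falls from 10.0 to 6.5, a drop of 35%. Different assumptions about the number of phones in use would of course give a different picture, which is exactly why the denominator needs to be stated.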

Eric investigated 13 absolute figures on cybercrime and found his normalised stats show a much less scary situation:

in 6 cases, the absolute figures showed the situation getting worse whereas his normalised figures showed the situation actually improving
in 6 cases, both the absolute and normalised figures show the situation getting better, but the normalised figures show improvement happening sooner and faster
in 1 case, both the absolute and normalised figures show the situation getting worse, but the normalised figures show this deterioration happening more slowly

So, good news! Cyberspace is safer than you think.

Tips for reading stats:

Check citations. Citations are like PDO (protected designation of origin). Know where your calculations come from.
Check context. Read the fine print. Most stats come with caveats.
Check methods. Read the ingredients. Your stats should have good data and processes.
Check with an expert. When in doubt, ask your friendly statistician or economist.

Examples of egregious manipulation of data this Kat has seen in her career:

exploiting Excel's rounding to display 9.45% as 9.5% (0.05% can make a huge difference)
'massaging' data so three firms had a market majority
a protest described as '100 people' by protestors, 'approximately 50' by police, and '20' by the organisation being protested
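The rounding example can be made concrete with a short Python sketch. It assumes the round-half-up behaviour commonly seen when a spreadsheet cell is formatted to one decimal place, and it invents a 9.5% cut-off purely for illustration; the helper name `displayed` and the `threshold` value are hypothetical and not taken from any real case.

```python
from decimal import Decimal, ROUND_HALF_UP


def displayed(value: str, places: str) -> Decimal:
    """Round half up to the given precision, e.g. places='0.1' for one decimal."""
    return Decimal(value).quantize(Decimal(places), rounding=ROUND_HALF_UP)


share = "9.45"              # the underlying figure, in percent
threshold = Decimal("9.5")  # hypothetical cut-off, for illustration only

one_dp = displayed(share, "0.1")   # what a one-decimal display would show
two_dp = displayed(share, "0.01")  # the figure at its original precision

print(one_dp, one_dp >= threshold)
print(two_dp, two_dp >= threshold)
```

Shown to one decimal place the figure appears to meet the cut-off; at its original precision it does not. `Decimal` is used rather than binary floats so that 9.45 is held exactly and the rounding step is explicit.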

And remember, 78% of stats are made up on the spot, the other 29% are drivel. Check out Full Fact and the BBC's More or Less for good explanations and debunking of stats.

10 comments:

THE US anon, Tuesday 21 July 2015 at 13:06:00 GMT+1
An excellent post.

Here in the U.S., even fallacious data that has been shown to be of no merit continues to be "refreshed" and used over and over again - most times by an incestuous academia with an agenda unto themselves.

And it is not just the academics here. Even our executive branch is guilty of using bogus data to further its agenda. There has been an extensively detailed request by Ron Katznelson officially lodged with the Executive Branch to hew to our laws and restate its "anti-Tr011" manifesto that was riddled with spurious data.

The "official" reply is now well past its original due date, and I remain eager to see just how our Executive Branch is going to spin its answer to the meticulously documented "made up facts" that riddle its Policy Paper.
Anonymous, Tuesday 21 July 2015 at 16:43:00 GMT+1
Does this post have anything to do with the OHIM press release stating that "The manufacture and distribution of fake clothes, shoes and accessories (like ties, scarves, belts and gloves) takes over €26 billion every year from legitimate EU businesses"?

From a skim of the press-release the relevant study seems riddled with ill-supported statistics, suppositions, and assertions. My personal favourite being the assertion that "(all) producers and sellers of fakes do not pay tax, social contributions and VAT" and therefore we can assume that all of the tax they would have paid is a loss to the European economy of €8bn.
Anonymous, Wednesday 22 July 2015 at 10:50:00 GMT+1
If you read the actual report and not just the press release, you will get a more nuanced picture. The authors qualify their numbers, for example, on p. 10 the report states that to the extent counterfeits penetrate the legitimate sales channels, the tax losses calculated overestimate the real impact. Press releases are by necessity simplified (sometimes over-simplified). Better read the report.
THE US anon, Wednesday 22 July 2015 at 11:38:00 GMT+1
Reading the report is always sage advice.

However, in this day and age of "soundbyte journalism," it is often the lack of reading - and the critical thinking that should follow - that "carries the day."

One only has to see the (lack of) true dialogue even (especially) on leading patent blogs (certain other ones in mind) to see that a soundbyte repeated often enough is meant to gain traction where a full read (and understanding) would not stand for that view to be reasonably put forward.
Nicola Searle, Wednesday 22 July 2015 at 17:24:00 GMT+1
Aha! This is precisely the debate I had in mind. There is a big disconnect between 'reality' (whatever that is), the measurement of the reality, the description and analysis of the measurement, and finally the reporting of said measurements.

The OHIM report, mentioned by Anon 16:43, has a lot of caveats which are not reflected in the press release. But how do you get a sufficiently caveated stat in a press release? There is no room for footnotes as anon 10:50 notes. The same is often the case with speeches and newsbytes.

The challenge is when stats, good or bad, gain currency and are repeated without caveats and without the "critical thinking" US anon refers to. Surely we can do better than the 2007 situation of throwing around the 80% patent figure.

And all of this is before we get to thinking about how things are measured and analysed...
THE US anon, Thursday 23 July 2015 at 12:00:00 GMT+1
Nicola,

Sadly, that big disconnect is often so on purpose. Especially on patent blogs (of a certain US variety).

As I have often observed, propaganda exists because it does in fact work. Repeat an outright falsehood (or even more devious, a half-truth/full lie) often enough and some people will "think" it to be true. This type of forum is especially pliant because there is no way to force people NOT to engage in such petty dissembling, and the battle for the bully pulpit becomes one of NOT listening to any other point made in the online "discussions." Instead, it becomes a battle of flooding without regard to what others say - and as noted here, without regard to reading and critically thinking about the matter.
Anonymous, Thursday 23 July 2015 at 12:51:00 GMT+1
I've read the OHIM report and I think the authors' qualifications are extremely poor. To me they appear to be nothing more than a justification for creating the highest possible estimate of the damage to the European economy resulting from counterfeiting. For example, the report states

"...some amount of direct and indirect taxes is levied on these (counterfeit) products, and so the net reduction in government revenue may be smaller than the gross effect calculated here. Unfortunately, data currently available do not allow for calculation of these net effects with any degree of accuracy."

So because the authors can't calculate an estimate of the tax that is paid on counterfeit goods with any accuracy they have assumed it is zero. Why not 50% or any other value?

Similarly, they have assumed that precisely zero of the jobs lost in the sectors that produce authentic goods are replaced by jobs related to counterfeit goods that are sold in their place. No justification is provided for this assumption.

This doesn't seem to be a case of simplified headlines twisting a more nuanced report. Rather it seems to be a case of a report being created with the intention of producing such headlines.
Nicola Searle, Friday 24 July 2015 at 10:47:00 GMT+1
US-anon

You're absolutely right that propaganda works. But I don't think a single one of us is immune to confirmation bias. We inherently reject statistics and information that conflict with our existing beliefs and accept those that confirm them. Which is why I think it's the responsibility of the IP community to look beyond headlines and, as you say, think critically.

Also, I haven't figured out which blogs you're referring to!

Anon 12:51

That is a tricky one. In making assumptions like this, you're damned if you do, damned if you don't. Assuming the majority of criminal activity does not pay taxes is reasonable, but any precise amount will be arbitrary until there is better evidence. One way around this is to do a scenario analysis which gives you a range of figures. That allows people to pick which one suits their beliefs best.

Having looked at the report in a bit more detail, there is some room for improvement in the econometric model. It is extremely difficult to have sufficient data to ascribe a difference between predicted and actual sales to counterfeiting. The caveats for this part of the analysis are later in the report.

Communicating all of this is a challenge.

Keep the questions and comments coming as I'd like to continue analysing these figures!
Anonymous, Monday 27 July 2015 at 12:38:00 GMT+1
For more made up statistics and bias see:-

https://oami.europa.eu/ohimportal/en/web/observatory/quantification-of-ipr-infringement

Anon 12:38
Nicola Searle, Tuesday 28 July 2015 at 10:54:00 GMT+1
Hi Anon 12:38, could you elaborate a bit more as to what you mean by 'made up' and 'bias'? (To keep the 'constructive' in constructive criticism.)
