
Wednesday, 3 February 2016

Trends in IP Data

You can't touch IP; most litigation devotes significant energy to even defining a right; and, by definition, any particular right is unique. So, how do you measure IP? Data on IP is scarce; inaccessible registries, unregistered rights and privately held information don't help. Yet recent trends in IP data suggest progress is being made. In particular, trends in national offices are promising.

"Sh!t ton is my favourite unit of measurement."
Bill Murray Parody Twitter Account
Pallas cat looking angry, by Tambako the Jaguar

Economics is data-obsessed. I’ve discussed before economists’ predilection for all things quantitative (numbers), but it’s worth emphasising why – objectivity. Anecdotes and observations are susceptible to the subjective views of the observer. Shark attacks are a good example, as our fear, horror stories and media coverage can lead us to vastly overestimate the actual risk. Quantitative data provides objective information on shark attacks to balance our subjective view. However, the objectivity of quantitative data is only relative, and data collection involves oft-forgotten subjectivity (even deciding what to measure is subjective). Nonetheless, I've never met a standard deviation that didn't do it for me.

The relative dearth of data in IP impacts both policy and business.  The lack of good data means that policy may be based on subjective views not in line with empirical evidence. Poor data on the value of IP, litigation risk and the importance of IP protection may mean that firms’ IP strategies are not optimised.  For example, the fear of patent litigation, even if such fear is unwarranted, increases demand for litigation insurance, and consequently increases insurance premiums.  Good data benefits everyone.

The trendy word in datasets these days is “granular.” Granular, however, does not mean organic grains stuck between teeth, but “the size in which data fields are subdivided.” In short, it means detailed. The kinds of datasets becoming available now offer an unprecedented level of detail. <Merpel, were she an economist, would swoon at this point.>

Creating a database is challenging work. In some cases, historical data, key to understanding trends, may be stored on legacy systems or in hard copy only. Records may be poorly maintained, whether through error or deliberately (managing data costs money), and ownership of data may be unclear. Data must be “cleaned,” which is not some organised crime euphemism, but quality control. Missing data, or an errant line of code, can mess up a database. And once you’ve done all the work of extracting the right data, you may have to go through the same exercise next year.
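For the curious, here is a rough idea of what “cleaning” can look like in practice. This is only a minimal sketch in Python, not any office's actual process: the register extract, its column names (`app_no`, `filing_date`, `applicant`, `status`) and the two date formats are all invented for illustration. It drops duplicate records, sets aside rows with a missing or impossible filing date, and tidies up stray spaces and inconsistent capitalisation.

```python
import csv
import io
from datetime import datetime

# Hypothetical extract from a trade mark register; the column names and
# values are invented for illustration only.
RAW = """app_no,filing_date,applicant,status
TM001,2015-03-01,Acme Ltd,registered
TM002,01/04/2015,  Beta GmbH ,pending
TM002,01/04/2015,  Beta GmbH ,pending
TM003,,Gamma Inc,registered
TM004,2015-13-40,Delta SA,REGISTERED
"""

# Assumed date formats; a real register may use others.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y")


def parse_date(text):
    """Return an ISO date string, or None if the value is missing or invalid."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None


def clean(raw_csv):
    """Split rows into cleaned records and rejected rows with a reason."""
    seen = set()
    cleaned, rejected = [], []
    for row in csv.DictReader(io.StringIO(raw_csv)):
        app_no = row["app_no"].strip()
        # Keep only the first occurrence of each application number.
        if app_no in seen:
            rejected.append((app_no, "duplicate"))
            continue
        seen.add(app_no)
        date = parse_date(row["filing_date"])
        if date is None:
            rejected.append((app_no, "missing or invalid filing date"))
            continue
        cleaned.append({
            "app_no": app_no,
            "filing_date": date,
            "applicant": row["applicant"].strip(),
            "status": row["status"].strip().lower(),
        })
    return cleaned, rejected


cleaned, rejected = clean(RAW)
```

Keeping the rejected rows, with a reason attached, rather than silently discarding them is a deliberate choice here: what gets thrown away during cleaning is itself one of those oft-forgotten subjective decisions.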

Thankfully, IP offices are making data more accessible. While the increasing availability of IP registries for search purposes is encouraging, one-off cases provide little insight. Comprehensive research requires complete databases rather than registry searches. It is in these complete databases that we are seeing great improvements.
IP God?
British Museum Egypt
by Einsamer Schützer

I could write multiple posts on patent data, but suffice to say that it is much more widely available, from both offices and commercial databases, than data on other rights. The USPTO has recently published more patent application data, which the Written Description blog has covered. IP Australia’s forthcoming 2016 version of its IP Government Open Data (IPGOD) has broader coverage than previous releases and includes attorney information, abstracts, transactions and process milestones.

Trade marks
The USPTO has led the trend in publishing trade mark data. They've created good, easily accessible databases (accessibility here meaning comprehensive, high-quality, historical data in a format that can be easily ported into a variety of software). Hot on the heels of the USPTO are the UK IPO and IP Australia. IP Australia has particularly exciting plans. According to its Chief Economist, Ben Mitra-Kahn: “We are working with WIPO, USPTO, UK IPO, IPONZ and OHIM to create a global TM database with the Universities of Swinburne and Melbourne which will also be a first. Next steps are tools to work and look into the data. A lot of cool stuff.”

OHIM has plans to make its data more accessible, while WIPO, in some cases, is restricted from sharing its data as it is often third-party. If readers could alert me to other offices making more data available, please do. Most offices seem to be sticking to online-searchable, but not downloadable, records systems and statistics releases (e.g. analysis of IP trends, often published as PDFs).

Design rights
Alas, poor design rights. They are the least loved of the registered IP rights, so feel some pity. Like trade marks, they’re heavily dependent on images, which makes large-scale analysis challenging. However, improved image analysis techniques suggest we may get a lot more out of image data than we do at present. Unofficially, I hear offices are planning to publish more design data.

The unregistered nature of copyright suggests we won’t be seeing much by way of copyright data boons from IP offices. However, the digital era creates opportunities for copyright-related data.  For example, Google makes its copyright removal requests available for download.  Digital copyright exchange-type initiatives and Orphan Works schemes are also ripe for creating valuable data.

I have very high hopes for the future of IP data. Increased digitisation means that more data is available in formats that can be relatively easily mined. So, keeping in mind you are what you measure, Merpel would like you to know her vital stats: Hairball average: 1.1 per month, Grooming sessions: 3 per day (summer), 5 per day (winter), CIQ (Cat I.Q.): 17/20 (but she was distracted by a mouse when taking the test).

And remember, “Measurement is like laundry. It piles up the longer you wait to do it.” - Amber Naslund


Anonymous said...

Thank you for bringing this problem up in a blog post!

When data on patenting activity became available as files, rather than as manual copying on location from patent office files, economists initially became very happy and excited: something to dig their teeth into. However, it seems that the whole field ended up frustrating them -- all that beautiful data and no way to reach sensible conclusions. The reason is that the data is useless without context, and that context was individual to the applicants and their respective commercial and legal situation. Furthermore, some decisions were confidential, and e.g. licence agreements are not necessarily required by law to be registered in a publicly available register. A small window is opened by court cases, but arbitration is putting a firm lid on what really went down.

The only way to approach an understanding of the patenting field and development is to look at individual business cases. For existing companies it is unthinkable to be so candid as to give access to minutes of meetings in which decisions have been made or to confidential market reports. Where do you find decisions on paying off the competition in order that it does not manufacture generics while the original patents are dying out?

For this reason, it is only really possible to determine what really happened in historical files from now defunct companies. One such study is highly recommendable: Stathis Arapostathis and Graeme Gooday: 'Patently Contestable. Electrical Technologies and Inventor Identities on Trial in Britain', The MIT Press 2013.

An older but very fundamental study is C.T. Taylor and Z.A. Silberston: 'The Economic Impact of the Patent System. A Study of the British Experience', Cambridge University Press 1973. It has not been bettered, and it has approached the matter both from a statistical point of view and also by means of interviews with key persons. A quote from a footnote (p. 305): "For details of the part played by E.M.I. we have relied mainly on S.J. Preston, 'The Birth of a High Definition Television System', a paper read to the Television Society on 24 October 1952; and on conversations with the author and his successor as E.M.I.'s Patent Manager, Mr. A.B. Logan." To give some perspective, the development and commercial decisions under discussion were made in the 1930s, i.e. at a safe distance in time.

An example of the sad results of the modern, numbers-based approach is Adam B. Jaffe and Manuel Trajtenberg: 'Patents, Citations & Innovations. A Window on the Knowledge Economy', The MIT Press 2002. Despite very full documentation, for instance their complete data set made available in the form of an enclosed CD-ROM, in my view the window is frosted. The book's major contribution is the discussion of the approaches that had been used at the time. Taylor and Silberston have been forgotten. Other recent literature does mention T&S, but usually merely as an acknowledgement of its existence, not in order to be inspired by its approach.

The situation may be different in the fields of trademark, design, and copyright, because so much more takes place in the public sphere promoted by advertising.

George Brock-Nannestad

THE US anon said...

Thank you GB-N for a cognitive and engaging post!

Nicola said...

Thank you, George, for a very helpful contribution to the discussion. I shall have to revisit Taylor & Silberston! As economic analysis has come a long way since the 70s (mostly in improved computer capacity for statistical analysis and the modelling of theory), older papers often get ignored. However, the fundamentals of patent registries really haven't changed.

I'm quite confident in the ability of patent data to measure knowledge flows and areas of patent activity (unsurprisingly), but other applications should be taken with a grain of salt. As you've noted, a lot of good information isn't in the public domain. However, no data is perfect, and some of this information will never be in the public domain. These imperfections shouldn't stop us from analysing, but should be highlighted.

And it remains to be seen as to data in other IP fields...
