usage data now exist that would have been prohibitively expensive to collect in the analog age. Software logs routinely record not only sales of books downloaded from centralized repositories like Amazon, but also whether and when a particular book was read, how quickly it was read, and so forth. Similar analyses could be done on e-magazines and blogs, where it is now possible to measure time spent on a particular article or blog post and the click-through rates of particular hypertext links. In the context of streaming video, YouTube and Netflix collect data on user behavior, including repeat consumption and the location and time of consumption. All of this information, if routinely collected by private and public entities and systematically organized, would be invaluable to the study of copyright in the digital age, as well as other aspects of the digital economy. Of course, proper use of these data will require taking steps to protect the privacy of consumers.
On the other hand, collecting such microdata for research remains a considerable challenge. Perhaps the biggest obstacle is that data about the creation, consumption, and distribution of digital media increasingly reside in the hands of private entities whose incentives diverge from those of researchers. Even if such data were available, researchers usually cannot run experiments directly, and the pseudo-experimental research designs they must construct instead place additional demands on the data. Finally, the problem of “free” goods is particularly salient in the digital domain. E-magazines and blogs are often free to read, free applications for smartphones abound, and free music and video are widely available. In such cases, it is hard to place a dollar value on the goods, which compounds the difficulty of estimating consumer or producer surplus in these industries. This section highlights the practical and conceptual challenges inherent in the collection of digital copyright-related data and their use in carefully designed research.
Incentives of Data Owners
Data collection can be costly. Firms and industries have some motivation to collect such information in the pursuit of profit maximization and industry-focused advocacy. To out-compete rivals, they will want to keep some information proprietary, but in some cases they will be open to selectively sharing data that helps their industry in policy advocacy. They might also design studies and surveys to shape the perceptions of the public or of political elites in ways that favor their policy agenda. The home recording controversy described earlier is a good example. What private data holders do not have at present is an incentive to act in concert to share data with researchers whose results they do not control.
These challenges will undoubtedly persist as the Internet and