BIG DATA & INTELLIGENT TRANSPORTATION SYSTEMS John W. Bagby * Abstract: Transportation is a quintessential application domain for big data and the associated predictions from big data analytics. Almost every domain where big data is deployed shares most policy concerns with Intelligent Transportation Systems (ITS) including privacy, security, intellectual property, and the attraction of capital to fund innovation and deployment. This article explores the unique ITS domain-specific prediction that informs both decision-making and control derived from big data analytics and how these matters will be constrained by public policy. Such policy intervention is most likely through litigation, legislation, regulation, and standards development. Policy intervention should be anticipated (1) where big data predictions and the resulting control systems prove to be unreliable or (2) when injustice is perceived to profoundly impact life, liberty or property interests or have disparate impact on demographic groups. A risk-benefit approach is deployed here to propose an inter-disciplinary techno-policy research agenda. I. INTRODUCTION Big data is all the rage; hopefulness inspires widespread belief that it is essential to solving future complex and societal problems. However, there is also widespread dread it will actually cause more societal problems than it solves. Proponents primarily argue the former and seldom admit or address much of the latter. Proponents claim that: (1) big data is essential to resolve intractable engineering design and healthcare problems, 1 (2) big data might inform predictive analytics essential to modern counter-terrorism 2 and improve the effectiveness & efficiency of law enforcement, 3 (3) big data * Professor of Information Sciences and Technology, the Pennsylvania State University. 1 Tien James M., Overview of Big Data: A US Perspective, 44 THE BRIDGE 12-19 (Nat’l Acad. 
Press) accessible at: http://www.nae.edu/File.aspx?id=128774 2 Bulk Collection of Signals Intelligence: Technical Options, Comm. on Responding to Sect. 5(d) of PDD-28: The Feasibility of Software to Provide Alternatives to Bulk Signals Intelligence Collection; Comp.Sci.& Telecomm.Bd.; Div. on Eng.&Phys.Sci. Nat'l.Res.Counc (2015) accessible at: http://cryptome.org/2015/01/nap-bulk-sigint.pdf 3 Hoofnagle, Chris Jay., Big Brother's Little Helpers: How ChoicePoint and Other Commercial Data Brokers Collect and Package Your Data for Law Enforcement, 29 NCJ INT'L L. & COM. REG. 595 (2003) (predicting separate business model for private sector data brokers providing big data assistance to law enforcement); Cavoukian, Ann, & Jeff Jonas. Privacy by design in the age of big data. Information and Privacy Commissioner of 2 BIG DATA & ITS April 17, 2015 Ind.Univ./Va.Tech. Big Data Colloquium promises to resolve the long-felt, final attainment of the elusive, frictionless transaction cost-free free market, 4 and (4) big data might enable prediction, using various tools of analytics, of relationships and outcomes across most fields of endeavor and academic domains. 5 This article addresses how big data is essential to enable intelligent transportation systems (ITS), see Figure I. Figure I: Big Data Enablement of ITS: a Generalized Architecture Ontario, Canada, 2012. (arguing false positives resulting from law enforcement use of big data falsely accuse the innocent); Tene, Omer & Jules Polonetsky, Privacy in the age of big data: a time for big decisions, 64 STAN. L. REV. ONLINE 63 (2012) (arguing big data practicality must balance against individual rights) and Skolnick, Jerome H., JUSTICE WITHOUT TRIAL: LAW ENFORCEMENT IN DEMOCRATIC SOCIETY, (Quid pro books, 2011) (arguing big data use to predict criminality risks damage and punishment without due process). 4 See generally, Whinston, Andrew B., Soon-Yong Choi, Dale O. Stahl & Dale O. 
Stahl, THE ECONOMICS OF ELECTRONIC COMMERCE, (1997 MacMillan London).
5 See generally, Mayer-Schönberger, V., & Cukier, K. (2013). Big data: A revolution that will transform how we live, work, and think. Houghton Mifflin Harcourt.

A. Provisional Definition: Big Data

What is big data? Provisional definitions abound, but many devotees use some form of the following, which is narrower in scope than the formulation of big data now emerging in the business data analytics field: big data is “an accumulation of data that is too large and complex for processing by traditional database management tools.” 6 This restrictive conceptualization is probably too limiting because advances in data processing are foreseeable and should be anticipated. The feasibility of using big data, once considered a technical frontier beyond reach, 7 is steadily being resolved. Many big data definitions are problematic in limiting their scope, particularly as software tools, storage and processing capacity, and bandwidth advance (roughly) according to Moore’s law. 8 Moore’s law and evolving business practices suggest that four phenomena propel big data analytics: 1st seemingly ever-decreasing data storage costs, 2nd seemingly ever-increasing processing capabilities, 3rd demand for analysis of big data, and 4th apparent 9 usefulness of predictions made thereon.

For the purposes of applying big data in the fields of intelligent transportation systems (ITS), this article is more permissive, providing a far less doctrinaire conceptualization. This is accomplished by not attempting to limit big data to the frontiers of data science. This is a more hopeful and practical definition, acknowledging that actual practice is far more forgiving and already calls itself big data analytics.
6 Merriam-Webster.com accessible at: http://www.merriamwebster.com/dictionary/big%20data
7 Tien James M., Overview of Big Data: A US Perspective, 44 THE BRIDGE 12, 14 (Nat’l Acad. Press) accessible at: http://www.nae.edu/File.aspx?id=128774
8 Moore, Gordon E., Cramming more components onto integrated circuits, ELECTRONICS 4. (April 19, 1965) reprinted in 86 PROCEED. IEEE 82-5 (Jan.1998) accessible at: http://www.cs.utexas.edu/~fussell/courses/cs352h/papers/moore.pdf
9 John W. Bagby, Using an Industrial Organization (I/O) Lens to Enhance Predictive Analytics: Disentangling Emerging Relationships in the Electronic Surveillance Supply Chain, LEGAL AND ETHICAL ISSUES IN PREDICTIVE DATA ANALYTICS, Virginia Tech University, June 20, 2014.

B. Provisional Definition: Intelligent Transportation Systems (ITS)

It is the same story for ITS, a broad patchwork of “cool” technologies that are rather poorly integrated without very heavy reliance on software 10 and big data. ITS holds great future promise to solve engineering and societal problems: public safety, environmentalism, efficiency, infrastructure costs, mobility, convenience, national security and law enforcement.
Most readers can quickly identify some complex embodiments of ITS technologies, such as Google’s autonomous/smart car, as well as some ITS component technologies. Many of the following well-known constituents rely essentially on various forms of telematics: 11
• lane departure warning systems
• collision warning & avoidance systems
• adaptive cruise controls
• automated toll collection (EZPass)
• congestion pricing of road use
• variable (warning sign) messaging
• dynamic traffic control
• driver assist to driverless highways
• mobile 911 location referencing
• freeway traffic management systems
• automated traffic enforcement (red light cameras)
• commercial vehicle monitoring & control
• public transit monitoring & control
• navigation, GPS, “infotainment”
• mass surveillance
• automated risk management (real-time insurance underwriting)
• weigh-in-motion for commercial fleet operations
• weather condition notification & routing remediation
• parking management
• hazardous cargoes & environmental control
• driver monitoring
  o Commercial operator regulatory compliance
  o Operator real-time risk-based underwriting (Progressive “Snap-Shot”)
• forensic enablement - data capture:
  o ex post cockpit recording (on-board black boxes)
  o real-time remote transmission (telematics)
• emergency condition resolution (OnStar).

10 Ramsey, Mike, Fears Push Car Makers Deep into Silicon Valley, WALL ST.J. (3.26.15) accessible at: http://www.wsj.com/articles/ford-mercedes-set-up-shop-in-siliconvalley-1427475558 (arguing automaker fears of missing smart car revolution invade silicon valley now that software accounts for 10 - 25% of new passenger vehicle value).
11 Duri, Sastry, Gruteser, M., Liu, X., Moskowitz, P., Perez, R., Singh, M., & Tang, J. M., Framework For Security And Privacy in Automotive Telematics, PROCEED. 2ND INT’L WORKSHOP ON MOBILE COMMERCE (ACM, 2002). “Telematics” is an interdisciplinary field focused on long-distance transmission of computer information. Telematics systems are composed of sensor arrays, communications and control technologies that permit remote monitoring and management of vehicles. These are applicable to spacecraft, aircraft (including drones), highway surface transportation (passenger vehicles, fleets), rail traffic, and waterborne vessels, among others. See generally Buxton, W., Integrating the periphery and context: A new taxonomy of telematics, 95 PROCEED. GRAPHICS INTERFACE-- (1995).

Autonomous, full-automation, and artificially intelligent instantiations of ITS vehicles are rife in popular culture. Examples include the artificial intelligence (AI) controlled Johnny Cab in “Total Recall,” 12 the voice-controlled autonomous vehicles in “Demolition Man,” 13 or the faster and safer autonomous highway vehicles hypothesized in “I, Robot.” 14

C. Provisional Definition: Big Data & ITS

ITS might be THE paradigm application of big data analytics methodologies and components. As the short list above indicates, ITS is a huge umbrella of component technologies and applications that promise improvements in public safety, environmentalism, efficiency, infrastructure costs, mobility, convenience, national security and law enforcement. Three challenging design problems are presented. First, there is the design of ITS technologies that collect, archive and make available the big data for use in ITS applications. Second, there is the design of the ITS applications, which enable and depend directly on analytical and control system designs that deliver on the promises of efficiency, safety and information services. Third, the integrated design of these two systems, ITS technologies and ITS
12 Total Recall, Prod. Buzz Feitshans, Dir. Paul Verhoeven, Perf. Arnold Schwarzenegger, Sharon Stone, TriStar Pictures, 1990.
13 Demolition Man, Prod. Joel Silver, Dir. Marco Brambilla, Perf.
Sylvester Stallone, Wesley Snipes & Sandra Bullock, Warner Bros., 1993.
14 I, Robot, Prod. Laurence Mark, Dir. Alex Proyas, Perf. Will Smith & Bridget Moynahan, 20th Century Fox, 2004.

applications, must be reliably integrated. Designs for ITS system integrations have only recently been successful in demonstrations, so few are widely deployed. ITS system integrations must be continually updated and tested given the risks of failure. The integration of ITS technology designs with the ITS applications designs is the design area posing the most critical policy problems.

Most of the initial ITS technology designs were conceptually described in the 1990s 15 and many were slowly rolled out in limited demonstration forms by the year 2000. This was mandated and facilitated by various federal laws, largely through component or affiliated agencies of the U.S. Department of Transportation (DoT). 16 Many proponents in the ITS community predicted very near term deployment of turn-key systems early in the 21st century. 17 This nearly irrepressible optimism 18 continues today, despite progress beyond the piecemeal component deployments urged under the Intelligent Vehicle Initiative (IVI) of the late 1990s. 19 However, experience clearly demonstrates that these predictions, many dating back to the mid-1990s, were over-optimistic. Left unanswered are practical questions of attracting capital and robust, well-tested technologies into manufacturing, assembly of critical mass, and development of rigorous standards-compliant interconnectivity in critical safety hardware and supporting software and communications systems.
15 THE NATIONAL ITS ARCHITECTURE: A FRAMEWORK FOR INTEGRATED TRANSPORTATION INTO THE 21ST CENTURY, Office of the Assistant Secretary for Research and Technology, Intelligent Transportation Systems Joint Program Office, Department of Transportation accessible at: http://itsarch.iteris.com/itsarch/documents/physical/physical.pdf
16 E.g., Norman Y. Mineta Research and Special Programs Improvement Act, PUB. L. 108-426 (108th Cong.) accessible at: http://www.rita.dot.gov/laws_and_regulations/public_law_108_426.html
17 See e.g., Qu, Zhihua, Cooperative control of dynamical systems: applications to autonomous vehicles, (SPRINGER SCI. & BUS. MEDIA, 2009) (arguing current feasibility and huge promise for autonomous vehicles in most fields of endeavor).
18 Rogers, Christina, Google Sees Self-Driving Car on Road Within Five Years, WALL ST.J. Jan.13, 2015 (arguing that Google executive sees no regulatory hurdles to Google’s fully automated autonomous vehicle) accessible at: http://www.wsj.com/articles/googlesees-self-drive-car-on-road-within-five-years-1421267677
19 Advanced Public Transportation Systems: The State of the Art Update 2000, U.S. Department of Transportation, Federal Transit Administration, (2000). 1998 Transportation Equity Act for the 21st Century (TEA-21), Pub. L. 105-178 (June 9, 1998) accessible at: http://www.fhwa.dot.gov/tea21/tea21.pdf

Furthermore, actual user acceptance, rather than initial consumer interest, must also be proven. ITS uses component technologies well-known to the ITS communities.
These include (1) sensor networks, (2) location and proximity referencing, (3) data repositories (e.g., cloud, on-board vehicle memory), (4) telecommunications connections (wireless and landline, short range and longer range), (5) analytics capacity for assessment using statistical, actuarial and network science methodologies, (6) accretion of huge new databases supplementing existing data, and (7) control systems for in-vehicle and roadbed actuation. This simple list conceals the vast complexity of ITS systems, somewhat more diagrammatically revealed in the U.S. DoT’s depiction of the ITS general architecture in Figure II. 20

Figure II: National ITS Architecture - High Level Architecture Diagram
Source: Office of the Assistant Secretary for Research and Technology, Intelligent Transportation Systems Joint Program Office, Department of Transportation

20 Office of the Assistant Secretary for Research and Technology, Intelligent Transportation Systems Joint Program Office, Department of Transportation, accessible at: http://www.standards.its.dot.gov/LearnAboutStandards/NationalITSArchitecture

Unlike many applications of big data, ITS big data implementation involves more than prediction. Direct control and regulatory enforcement are also major objectives. HOV lane accessibility or closures, barriers blocking on/off-ramps or special lanes (HOV), ramp metering, variable messaging signage, and automated traffic ticketing enforcement are but a few existing examples. Autonomous vehicles may become “cars that drive themselves,” raising significant risk management questions when hardware and/or software failures are provable with forensic-quality evidence. Indeed, big data in the ITS application domain may pose some unique risks, long hypothesized, 21 but now emerging in “enhanced” form.
22 The following sections draw on technical, policy and assessment literatures that cross these domains to enable an analysis of how policy enablement, reaction and management might emerge as ITS big data technologies are deployed and assessed. Section II explores the generalized types of promised or speculative benefits expected for particular ITS technologies and the generalizations about classes of benefits and applications. Section III discusses ITS litigation risks, which serve as a proxy for risk generally. Section IV hypothesizes how ITS deployment will shift, attenuate or magnify such risks. This article concludes with tentative findings summarizing how the risk-benefit approach deployed here can inform an inter-disciplinary techno-policy research agenda consistent with ITS goals as rationalized by public policy realities.

II. BIG DATA & ITS’S LITIGATION & POLICY RISKS

Big data challenges the status quo, a condition that attenuates some legal risks while accentuating others. However, the curmudgeon’s views are predictable here: big data generally is merely a redux of the so-called “cyberlaw revolution” of the late 1990s; furthermore, ITS is nothing revolutionary because the history of automotive development is a constant
21 See generally, Bagby, John W. & Gary L. Gittings, Litigation Risk Management for Intelligent Transportation Systems (Part One), ITS-Quarterly, Vol.VII, No. 2 (Spring-Summer 1999), Bagby, John W. & Gary L. Gittings, Litigation Risk Management for Intelligent Transportation Systems (Part Two), ITS-Quarterly, Vol.VII, No. 3 (Fall 1999) and Bagby, John W. & Gary L. Gittings, Litigation Risk Management for Intelligent Transportation Systems (Part Three), ITS-Quarterly, Vol.VIII, No. 1 (Winter 2000).
22 Staff Report, Tracking & Hacking: Security & Privacy Gaps Put American Drivers at Risk, Senate Office of Edward J.
Markey (D-Mass.), Feb.2015, accessible at: http://www.markey.senate.gov/imo/media/doc/2015-02-06_MarkeyReportTracking_Hacking_CarSecurity%202.pdf

adjustment to serial deployments of evolutionary technologies. In both cases there is nothing revolutionary or so completely new that existing principles cannot be adapted in a straightforward manner.

The development of Cyberlaw over the past two decades has caused some useful awakening by practicing lawyers, policy makers and the laity. There is some general consensus that Cyberlaw is composed of: (1) changes to legal, regulatory and governing procedures, (2) transactional practice, (3) intellectual property (IP), (4) privacy, (5) security and (cyber) crimes, (6) various regulatory matters (e.g., antitrust, employment, financial regulation, telecommunications), (7) technology transfer, and (8) myriad consumer protections. As scholars predict and analyze emerging policy on big data generally and on ITS big data in particular, it seems likely that these policies may closely track the development of cyberlaw. But is cyberlaw a distinct, new field or should it be confined to an evolution from existing law? 23 The next two sections address this question as the “law of the horse” problem, which suggests a universal method for addressing policy application in new fields of endeavor (like big data and ITS).

A. Big Data & ITS: Just Another Law of the Horse?

The evolutionary approach posits that ITS Big Data will never become an independent field. 24 This is reminiscent of Karl Llewellyn’s denigration of developing any new specialized field, likening it to the “law of the horse.” 25 Under this theory it is intellectually irresponsible to build new “fields of study” in narrow areas like the law of the horse (e.g., horse theft, horse financing and sale, horse injury torts, regulation of jockeys). Llewellyn argued that the resulting laws are amateurish, inadequate and ineffective.
23 Easterbrook, Frank H., Cyberspace and the Law of the Horse, 1996 U. CHI. LEGAL F. 207. (arguing against any rush to treat cyberlaw differently than “traditional space,” reacting to cyber-libertarians’ demand for cyberspace exceptionalism by urging resistance to creating special exceptions for cyberspace).
24 See generally, Bagby, John W. (special issue ed.), Forward to the Special Issue on Cyberlaw, 39 AM.BUS.L.J. 521 (Summer 2002) (arguing sufficient significance of cyberlaw to require special treatment to attract scholarly investment). Portions of this section are adapted from this special issue forward.
25 Karl N. Llewellyn, Across Sales on Horseback, 52 Harv. L. Rev. 725 (1939); Karl N. Llewellyn, The First Struggle to Unhorse Sales, 52 Harv. L. Rev. 873 (1939) (defending the newly created UCC as no more than a standardization of terms based on existing commercial practice).

Llewellyn’s reluctance to develop commercial law as a new field was an important pre-condition to the UCC’s inherent flexibility. The UCC’s apparent success is largely reliant on the experience of professionals who already reduce transactions costs by developing customized commercial relations. The UCC’s standardized gap filler provisions are mere supplementation, reliant primarily on customization by innovative commercial transaction designers. Llewellyn’s UCC approach has clearly succeeded in enabling transaction innovation through evolutionary approaches that use general laws based on durable fundamental principles. This approach has lasting value when compared to continental civil law directives that are excessively detailed and thereby inflexible.

So for the emerging field of ITS big data, should new policy promote revolutionary cyberlaw because of Internet exceptionalism, or is this just another “law of the horse”?
Are ITS and big data revolutionary new fields that necessitate merely “idiosyncratic transactions?” Are big data and ITS best promoted by policy amateurs who design narrow rules for a new realm before that field can effectively develop its own useful experience? 26 Of course, dilettantes descend into most new fields and policy should always embrace generality and flexibility. 27 How can the undesirable consequences that accompany a rush to legislate cyberlaw, big data law or ITS special exceptions be avoided?

B. Applying Samuelson’s Evolutionary Model to Big Data & ITS Policy

The evolutionary approach is more measured, pragmatic and careful. Legal traditionalists likely prefer evolution to revolution because it is consistent with the efficiency of the common law, 28 and this supports stare decisis. Thus, coherence is preserved and reinforced when applying existing public policy through traditional law to any new set of technologies.

26 See Bagby, John W. (special issue ed.), Forward to the Special Issue on Cyberlaw, 39 AM.BUS.L.J. 521, 525 (Summer 2002) (proposing cyberlaw exceptionalism to the extent it attracts academic research investment, fresh perspective and fresh talent unencumbered by insider self-interest).
27 See Tonry, Michael & Norval Morris, Retirement of Sheldon Messinger, 80 CAL. L. REV. 310 (1992).
28 See, Richard A. Posner, ECONOMIC ANALYSIS OF LAW (5th ed. Aspen Law & Business, 1998), but see Todd J. Zywicki, The Rise and Fall of Efficiency in the Common Law: A Supply-Side Analysis, 97 NW. U. L. REV. 1551 (2003). See also Bagby, John W, Common Law Development of the Duty of Information Security in Financial Privacy Rights, FOURTH ANNUAL FORUM ON FINANCIAL INFORMATION SYSTEMS AND CYBERSECURITY: A PUBLIC POLICY PERSPECTIVE, Smith School of Business, Univ. Maryland, May 23, 2007.
Berkeley Professor Pamela Samuelson’s approach predicted that cyberlaw would be of sufficient magnitude to require special treatment. First, in her approach to cyberlaw exceptionalism, the policy analyst should reassess first principles. Second, a minimalist approach should be taken in all new policymaking, adhering to the principle of simplicity. Finally, whenever possible, new policy should remain technology neutral. 29

Applying these principles from Llewellyn, Easterbrook and Samuelson to big data and ITS, policymakers would repeatedly wrestle with the choice between evolutionary and revolutionary approaches. It is foreseeable that many big data and ITS constituents would put political pressure on policymakers to adopt at least some aspects of a revolutionary approach. For example, those with vested interests in massive, quick deployment of their own products, but who seek to elude any risk of product or service liability, would likely urge revolutionary tort reform exceptions that would “pre-empt” state tort law. Federal statutory solutions (DoT regulations eliminating liability seem unlikely) might be viewed as the silver bullet against liability risks posed by plaintiffs’ lawyers. By contrast, on the user side, one might expect civil libertarians to argue that insecure big data held in insecure cloud repositories poses excessive privacy and personal security risks, e.g., stalking, impersonation. Such policy pressures could be expected to result in costly cybersecurity regulations, specific rights of action for liability, innovation-chilling design standards and extensive audit-enabling recordkeeping. In both cases the twin promises of big data and ITS to enhance societal goals would be thwarted by most revolutionary approaches.

C.
Policy Risks for Big Data & ITS are Policy Venue Dependent 30

29 Samuelson, Pamela, Five Challenges for Regulating the Global Information Society, REGULATING THE GLOBAL INFORMATION SOCIETY (Chris Marsden ed., 2000).
30 Portions of this subsection III.C. are adapted in part from Bagby, John W., Illuminating the Elusive Cyber-Infrastructure Policy Resolution: the Industrial Organization Lens, No. ALSB2013_0102 presentation-Academy of Legal Studies in Business, Aug. 8, 2013, Boston MA.

Policy risks for Big Data and ITS span the cyberlaw fields of IP, privacy, security and various consumer protective regulatory programs. In this section the cybersecurity aspects are highlighted. Identifying and assessing litigation and policy risks for other fields of big data and ITS should follow this model.

Traditional security law, security law in the general cyber-infrastructure context and security law applicable to big data enablement of ITS are decidedly sectoral and not omnibus. The sectoral approach means there are provisions of law, regulation and the common law that impinge on security concerns, but these are neither applicable broadly across fields of law nor broadly across industries or economic sectors. 31 That is, security law in the U.S. closely resembles the sectoral nature of U.S. privacy law:

The U.S. has no comprehensive privacy (/security) protection policy. Privacy (/security) laws are narrowly drawn to particular industry sectors, which can be called a sectoral approach to privacy (/security) regulation. Regulation of privacy (/security) generally arises in the U.S. after there is considerable experience with privacy (/security) abuses, an approach consistent with liberty, laissez-faire economics and common law precedents as the major approach to law making. As a result, U.S.
privacy (/security) law is a hodgepodge, patchwork of sectoral protections, narrowly construed and derived from constitutional, statutory and regulatory provisions of international, federal and state law. 32 (compare/contrast emphasis added)

Omnibus approaches are much more comprehensive; they mandate strong rights, thereby imposing strong duties on most industries and on many government activities. Strong omnibus regulation is often politically infeasible. Cyber-infrastructure security suffers because legal requirements are not pervasive across industry and government sectors. The traditional law of security is also a hodgepodge, patchwork derived from various fields of law, and security law is also based on constitutional, statutory and regulatory provisions of international, federal and state law. Security laws generally arise ex post, following crisis or galvanized political will derived from mounting evidence of abuses. Traditional sources include criminal law, tort law, contract, and malpractice.

31 See also Strauss, J., & Rogerson, K., Policies for online privacy in the United States and the European Union, 19 TELEMATICS & INFORMATICS 173 (2002).
32 Bagby, John W., The Public Policy Environment of the Privacy-Security Conundrum/Complement, pp.195-213 Ch. XII in Sangin Park (ed.), STRATEGIES AND POLICIES IN DIGITAL CONVERGENCE (2007 Idea Group Ref., Hershey PA).

Privacy laws and security laws are linked in two fundamental ways: 1st as a trade-off 33 and 2nd as a complement. 34 Big data is intimately involved with both privacy and security law while ITS will be connected in a somewhat more tenuous way. Sectoral laws impact security generally, cyber-infrastructure in particular and big data in ITS.
35 These constrain activities in particular industries, including several bellwether sectors such as the federal regulation of healthcare, finance, intellectual property, federal administrative law, education, veterans affairs, deceptive trade practices, 36 and children’s protection. The states are also active, primarily in cyber-infrastructure protection against identity theft, with security breach notification (disclosure) requirements, spyware and data disposal provisions.

1. Layered Policy Mechanisms for Cyber-Infrastructure Security

Cyber-infrastructure security policy emanates from one or more of several layers; the optimal source depends on constraints imposed by political considerations as well as the predicted effectiveness of each in isolation and the system effectiveness of the combined set of controls. First, despite the prospect for market failure, market discipline most certainly provides at least some useful pressure to invest in security. 37 One subset of market discipline is industry best practices. These evince weak-form, de facto standardization (e.g., mimicking behavior) that functions best as a form of information sharing. Another component of market discipline is derived from the employment market for cyber-infrastructure security professionals who would service the big data and ITS industries. Security professionals
33 National security and criminal law are two closely connected examples of this tension: strong privacy law arguably leads to weak collective security.
34 Strong personal security relies on strong privacy practices.
35 Shaw, Thomas J. (ed.), INFORMATION SECURITY AND PRIVACY, (Am.Bar Assn. 2010).
36 See generally Bagby, John W, Common Law Development of the Duty of Information Security in Financial Privacy Rights, FOURTH ANNUAL FORUM ON FINANCIAL INFORMATION SYSTEMS AND CYBERSECURITY: A PUBLIC POLICY PERSPECTIVE, Smith School of Business, Univ.
Maryland, May 23, 2007 accessible at: http://faculty.ist.psu.edu/bagby/Pubs/CommonLawEfficiencyCustodyDutyInfoSecurity1.pdf
37 Hahn, Robert W. & Anne Layne‐Farrar, The Law and Economics of Software Security 30 HARV. J. L. & PUB. POLICY 284 (2007) (arguing market forces can work to incentivize security investment, diverse software security problems suggest varying remediation approaches and that traditional criminal law is rather ineffective to deter cybercrime).

share skill sets, some preparation (e.g., education, degrees from accredited institutions), and credentialing. 38 These factors arguably contribute to some uniformity among industry best practices. Professionalism in other professions has emanated from licensing statutes, malpractice litigation, and best practices.

Second, de jure standards drive very significant security undertakings. For U.S. federal agencies, the Federal Information Security Management Act 39 (FISMA) is influential in creating an IT security compliance framework for both civilian and Department of Defense (DoD) agencies. In the private sector, there is a widening choice of de jure IT security standards from the National Institute of Standards and Technology (NIST), 40 the Control Objectives for Information and Related Technology (COBIT) developed for investment securities disclosure and the financial services industry by the Information Systems Audit and Control Association (ISACA), 41 and the standards of the International Organization for Standardization (ISO). 42 The U.S. DoT is the coordinative body for ITS standardization, but no such single point of regulatory contact likely exists currently, nor is one likely to emerge, for big data security. The effective penetration of alternative security standards varies considerably.
43

38 A common security credential is the Certified Information Systems Security Professional (CISSP) issued by the International Information Systems Security Certification Consortium, Inc., (ISC)², a global, not-for-profit that provides education and certification in IT security. Dozens of competing certification authorities exist throughout the world.
39 FISMA is the Title III component of the E-Government Act of 2002, H. R. 2458, Pub.L. 107-347, 116 Stat. 2899; codified at 44 U.S.C. §3541, et seq. (establishes federal Chief Information Officer in the Office of Management and Budget (OMB); delegates authority to the National Institute of Standards and Technology (NIST) and the National Security Agency (NSA) to issue Federal Information Processing Standards (FIPS) applicable to federal agencies and some federal contractors).
40 The NIST 800-series adapts the FIPS to private-sector government contractors, accessible at: http://csrc.nist.gov/publications/PubsFIPS.html
41 COBIT standards are accessible by subscription at: http://www.isaca.org/Knowledge-Center/cobit/Pages/COBIT-Online.aspx
42 IT security standards in the 27000 series, Information Security Management Systems, are issued jointly by the ISO and the International Electrotechnical Commission (IEC). For example, ISO/IEC 27001: 2005 Information technology – Security techniques – Information security management systems – Requirements (2005) holds promise as an important model. Standards from the International Organization for Standardization (ISO) are generally accessible only on a fee-basis at http://www.iso.org
43 For example, FIPS penetration in U.S. federal agencies is strong while penetration into state governments is less so. COBIT is widely used by private-sector firms that are publicly-traded in the U.S. Penetration of ISO standards in the EU is substantial, but lags considerably in the U.S. One explanation may be that mandatory compliance with ISO standards is much stronger in the EU than in the U.S. One major exception of world-wide conformity to ISO standards is the isotainer developed by Malcolm P. McLean of U.S.-based Sea-Land Corp. Containerized freight must be compliant with ISO 668 - Series 1 freight containers -- Classification, dimensions and ratings (1995). See also Container Handbook (Gesamtverband der Deutschen Versicherungswirtschaft e.V. - GDV) 2009, accessible at: www.containerhandbuch.de/

Third, constitutional provisions, statutes and administrative regulations are likely to provide increasingly specific security incentives for both big data and ITS. The U.S. Constitution provides several structural security provisions; U.S. and several state statutes mandate specific and largely sectoral security requirements. Administrative regulations, most at the federal level, also mandate security. Fourth, importantly for the development of more comprehensive security regimes, case law interpreting privacy and security law presumably applicable to big data and ITS, 44 at both state and federal levels, is refining security duties in various sectors. 45

44 See e.g., White, Anthony E., The Recognition of a Negligence Cause of Action for Victims of Identity Theft: Someone Stole My Identity, Now Who Is Going to Pay for It, 88 MARQ. L. REV. 847 (2004-2005).
45 See e.g., Bagby, John W, Common Law Development of the Duty of Information Security in Financial Privacy Rights, FOURTH ANNUAL FORUM ON FINANCIAL INFORMATION SYSTEMS AND CYBERSECURITY: A PUBLIC POLICY PERSPECTIVE, Smith School of Business, Univ. Maryland, May 23, 2007 accessible at: http://faculty.ist.psu.edu/bagby/Pubs/CommonLawEfficiencyCustodyDutyInfoSecurity1.pdf

2. Regulatory Tools

Various mechanisms can incentivize security investment. Some are appropriate from almost any source. For example, professionalism of cyber-infrastructure personnel working in big data or ITS could be incentivized, as it is for other professions, by licensing standards, malpractice litigation duties and regulatory requirements, any of which could be mandated by either state or federal law as supplemented by professional NGO associations. Appropriate flexibility in cyber-infrastructure security practices associated with big data and ITS is accommodated when standards are produced by expert sources, as are the audit standards for professional self-regulation. Disclosure is becoming an effective incentive to cyber-infrastructure security investment for big data and ITS. Consider that forty-six states 46 and at least one federal statute 47 require disclosure of security intrusions from breaches of PII databases that comprise part of big data. Disclosure notices generally must be delivered directly to potentially impacted parties under security breach notification legislation. Breach notices are hard to keep secret; most are publicized broadly, supplementing the victims’ pressure with other market disciplines. Eventually these have resulted in pressures for more legislation. 48 Some such legislation requires further implementation by regulatory agencies. 49 Several less onerous security incentives might be supplied by direct regulations. Some such incentives are included as part of consent decree settlements with regulators, and some are beginning to appear in many de jure standards. For example, mandatory security management regimes require contingency planning addressing a wide range of activities that are components of security risk management.
The range of activities is considerable and is best customized to the size of the entity, the line of the entity’s business activities, and the particular risks that are most likely. Risk assessment regimes are often implemented as a master risk-benefit analysis. At a minimum, such regimes require (1) initial then ongoing threat assessment, (2) iterative response planning, and (3) risk control, retention, sharing, and transfer. Security audits by internal auditors as well as periodic certified audits by independent experts or other audit authorities are an increasingly frequent security mechanism. These engagements rely on identification of technical and administrative controls and then require their testing. Cyber-infrastructure security regimes are well-represented in these emerging audit practices.

46 For a fairly current, comprehensive listing see Security Breach Legislation 2011, The National Conference of State Legislatures, accessible at: http://www.ncsl.org/default.aspx?tabid=22295 (state-by-state description) http://www.ncsl.org/default.aspx?tabid=13489 (table)
47 Health Information Technology for Economic and Clinical Health (HITECH) Act applies a federal security breach notification requirement for protected healthcare information (PHI) governed under the Health Insurance Portability and Accountability Act (HIPAA). HITECH is a component of the American Recovery and Reinvestment Act of 2009 (ARRA), Pub.L.111-5.
48 PrivacyRights.org is one of several compilers of privacy breach notices, see: http://www.privacyrights.org
49 See e.g., Breach Notification for Unsecured Protected Health Information, 74 Fed.Reg.42740 (August 24, 2009) codified as 45 C.F.R. 160 et seq. (2010).

3. Central Role of Standardization in Managing Big Data Risks for ITS

Consistent with the philosophy of migrating standards from a particular inflexible design configuration to more flexible performance standards, cyber-infrastructure security regulations on big data and ITS are unlikely to specify particular technical protections expected to enhance security. There has been very considerable push-back against this rules-based standardization or mandatory-design standards approach to establishing security methods. 50 The success of “K Street” lobbyists in opposing a strict rules-based approach requiring particular design or performance standards may be the most cogent explanation for the current fragmented state of security incentives, whether generally or as applicable to big data and ITS. For example, while encryption is an obvious remedy for insecure IT systems, few statutes or regulations directly require encryption. 51 Figure III depicts how standardization could improve ITS quality, avert regulatory pressures and lower litigation risks.

Figure III: Source: Office of the Assistant Secretary for Research and Technology, Intelligent Transportation Systems Joint Program Office, Department of Transportation

50 See, Letter from R. Bruce Josten, U.S. Chamber of Commerce, to Members of U.S. Senate (July 31, 2012) (voicing strong opposition to S.3414 Cybersecurity Act of 2012). But see, Letter from Sen. John D. (Jay) Rockefeller IV, Chairman, Senate Committee On Commerce, Science, & Transportation, to Fortune 500 CEOs (Sept. 19, 2012) (requesting views on cybersecurity practices and failed regulation). An interesting conflict is now noted in the divergence of attitude between the industry-wide lobbying powerhouse, the U.S. Chamber of Commerce, that generally took a hard line against any mandatory cyber-infrastructure regulations, and supportive responses of Fortune 500 CEOs to Senator Jay Rockefeller’s direct solicitation of support for cyber-infrastructure security regulations.
51 See e.g., Cal. Civ. Code §§ 56.06, 1785.11.2, 1798.29, 1798.82. California’s S.B.1386 does not require encryption but exempts incidents from disclosure if the data lost in a breach is encrypted. Id at §2.

4. Intellectual Property: Database Acquisition, Transfer & Use 52

Big data, business analytics and data science generally run on data, facts, collections of artistic expression and the like. Business models for traditional data vendors like Lexis-Nexis, Westlaw, Moody’s, and eBay share a need for a reasonable cash-flow to fund investment in acquiring data, and for converting analog data to digital formats. Legal protection for such databases incentivizes their creation and maintenance, without which, only publicly-financed libraries and research universities might systematically accumulate such collections. Databases assume many forms, such as collections of copyright protected works, data in numeric and natural language form and many other forms of materials. Databases are traditionally arranged in systematic and methodical ways to enable retrieval and access. Traditional databases are composed of raw facts that are meaningful after arrangement in logical ways to enable meaningful analysis. Users enabled to aggregate particular aspects of such data may observe meaningful relationships. Most useful databases must be managed to maintain usefulness with constant updates and periodical purges of stale entries. Databases include literary works, artistic works, texts, sounds, images, numbers, facts, production and shipping information, transactions, financial data, geographic information and quickly increasing volumes of personally identifiable information (PII). Databases, stored on networked servers and computers, are traditionally accessed according to defined criteria. By contrast, modern big data expands this to explore relationships and hypothesize associations.
Both traditional databases and big data are manipulated to create reports which are transmitted using telecommunications increasingly over the public Internet, although secure data mining occurs in closed, proprietary Intranets. Database technology continues to advance into distributed processing systems that may be composed of informally linked separate repositories. Some data collections are so huge they are called data warehouses. Relational databases may reveal associations, permitting talented domain experts to identify new relationships using data mining that encourages decision-making on new and potentially valuable factors: new associations are discovered, new sequences are uncovered and new data classifications or clusters are constructed. Big data analysis enables better forecasts. 53

52 Portions of this subsection II.C.4. are adapted in part from Bagby, John W., Trade Secrets and Database Protection, Ch. 5 in: CYBERLAW HANDBOOK FOR ECOMMERCE, (©2003, West Pub. Co. (Cengage) Mason OH; ISBN: 0-324-26028-8).

a. Database Protections Under U.S. IP Law: Copyright

In the U.S., databases receive weak protection as a form of intellectual property (IP) under one or more of several legal rights regimes: tort, trade secret, copyright and contract. These theories are longstanding and well-established. For example, the tort theories have been used to protect databases from unauthorized access, use or resale by an infringing outsider: misappropriation, trespass, conversion and trade secrets. Other legal theories are supportive: breach of contract, database license violations, and breach of employment or confidentiality agreement. Weak database protection is often illustrated under copyright law. Copyright protects only the form of expression and not the underlying ideas: hence the idea-expression dichotomy.
However, copyright may protect a database as a compilation when it includes a unique selection and arrangement of facts. The compilation results from selecting, coordinating, arranging or organizing existing materials. Infringement occurs when the accused work copies the selection and arrangement of a protected compilation. Independently created but similar selections and arrangements are unlikely to infringe. Copyright acknowledges sufficient originality and creativity in making non-obvious choices from among numerous choices. 54 Most of the weakness in copyright protection of databases was settled in Feist. 55 Rural Telephone’s white page telephone directory was a regulatory requirement. Name and address information, provided by telephone customers, was listed alphabetically by surname for the directory. Feist was not liable for copyright infringement in the copying of Rural’s directory to publish a competing directory. The Supreme Court found insufficient “authorial” originality in a telephone company white pages listing of customers to deserve protection as a compilation.

53 Winn, The Emerging Law of Commercial Transactions in Electronic Consumer Data, 56 Bus.Law. 22?, 235 (Nov.2000).
54 Matthew Bender & Co. v. Hyperlaw, Inc., 158 F.3d 674 (2d Cir. 1998).
55 Feist Publications, Inc. v. Rural Telephone Service Co., Inc., 499 U.S. 340 (1991).

b. Database Protections Under U.S. IP Law: Trade Secrecy

Database protection under U.S. law is much more common using technical security measures and contract restrictions. Technical methods include a variety of electronic, security, programming and physical safeguards that lock up the data. For example, encryption of the data on a CD-ROM would impede infringement. Similarly, it would be very time consuming to acquire all the data served from a website if the site’s search engine responds to queries with only piecemeal answers.
The NADA and Kelley Blue Book vehicle pricing guides deploy this scheme to prevent wholesale harvesting of “free data” by competitors. To the extent that technical security measures are effective to restrict access, meter use or generally impede comprehensive extraction of large portions of an owner’s data, such methods may be sufficient. While technical progress can overcome such security controls, they may become more effective when reinforced with contractual restrictions, such as end-user license agreements (EULA) configured as click-wrap access barriers. EULA terms of use and restrictions on decompiling, reverse engineering and reselling data are all important contractual restrictions impeding database misappropriation. Database owners consider their data to be a form of IP, following a licensing regime for their content rather than outright sales or assignments. Licensing is most appropriate for database collections of copyrighted works, video, music, images, etc. By contrast, licensing factual databases requires physical and contractual controls elevating the database into trade secrecy. EULA terms generally limit the licensee’s use, resale, and manipulation of the database and restrict online access except upon assent to a EULA. ProCD, Inc. v. Zeidenberg 56 supports this “click-wrap” regime. The Uniform Computer Information Transactions Act (UCITA) was a general failure in establishing a uniform statutory licensing regime for information in databases.

56 ProCD, Inc. v. Zeidenberg, 86 F.3d 1447 (7th Cir. 1996).

c. Database Protections Under U.S. IP Law: Sui Generis Schemes

Political pressures following Feist from the database management industry, from media giants and from foreign nations signaled possible sui generis database protections. Indeed, the EU urged the WIPO to add sui generis database protections to multi-lateral trade negotiations following the EU’s March 1996 passage of the EU Directive on the Legal Protection of Databases. 57 Congress has repeatedly failed to comply; controversial database protection bills have died without floor votes. Why should a sui generis form of IP be created to cover databases? How would this encourage more deliberate big data practices, such as negotiated access and cooperative database maintenance? Concerned scientists have argued that strong database IP rights could impede scientific research without sufficient offsetting public benefit. New or extended IP rights would surely remove at least some valuable data from the public domain. How can the benefits of new or expanded IP rights offset their costs, particularly given they are highly speculative? Furthermore, many costs are understated or overlooked outright by IP rights advocates, such as the systematic underestimation of infringement enforcement and compliance costs. 58

d. Database Protections Under U.S. IP Law: Constructing a Balanced Approach

Perhaps a more thorough understanding of the rich diversity in types of data and databases would inform the deliberate creation of sui generis database rights. More precise targeting only to industry and government sectors deserving of property rights incentives seems prudent. This targeting could confine the inevitable harmful externalities of an abrupt introduction of a new property rights regime on scientific inquiry, small business and innovation incentives. For example, the nature of the data, the source of incentive or funding for its collection, innovation in database architecture and functionality, the most likely uses of the data, compatibility of technical security with the known, useful business models, bearing of enforcement costs and the public interest are all potentially important factors. Without this basic understanding, it is highly likely that new sui generis database rights will not help society as much as their proponents argue.

57 OJ 1996 L77/20.
58 Congressman Kastenmeier argues that the burden of proof that new IP rights are needed should be high. Any new, sui generis IP rights should satisfy these criteria: (1) the new IP right will fit harmoniously with other IP regimes; (2) the new IP rights can be defined in a reasonably clear and satisfactory manner; (3) the proposed new IP rights must be based on an honest and rigorous cost-benefit analysis; and (4) the new rights must clearly enrich and enhance the public domain.

e. Database Protections Under U.S. IP Law: Role of IP Rights 59

Data, in various forms, drive both basic and applied research in most fields. Data collection and use require many costly steps, from conceptualization to maintenance. Academia is a useful exemplar to emerging big data advocates because there exists an ideal of maintaining public-domain access and a strong belief that society benefits from findings based on such data. Proposals to tighten the IP controls over databases appear antithetical to longstanding academic values. Academic researchers must often pay to access large databases. Biotechnology firms license out the annotations that enrich the genome projects, financial economists need comprehensive securities trading statistics, and effective scholarship in public policy, law, and taxation requires access to online subscription services. Ever-increasing privacy rights, confidentiality agreements, and national security restrictions are curtailing open access to many other important databases. Furthermore, public research universities, squeezed for cash flow from state appropriations, seek to enhance royalty income by following suit: licensing their databases for use by outside parties. The issues raised are neither simple nor straightforward. Under U.S. law, ideas are as “free as the air” unless they are embodied in a patented invention or kept confidential as trade secrets.
However, it is a fundamental tenet of academic scholarship that information is a public good. U.S. copyright law protects only the selection and arrangement of the data in a database, not the data itself. For broader protection, the courts have required recourse to trade-secret laws employing physical and contractual controls. Sui generis database protection raises public-policy issues that reveal the depth of the conflict between open access and proprietary values. This is not really new territory. Since the 19th century, Congress has created several forms of sui generis intellectual property rights. Many of these forms are familiar to the university community: design patents (ornamental designs, 1842); plant patents (asexual reproduction, 1930); plant varieties (sexual reproduction, 1970); and semiconductor chips (maskworks, 1984). Occasionally, the courts and state legislatures have also developed new subject matter or otherwise expanded intellectual property rights, as has happened with software, business methods, plant species, and even life forms. In each case, proponents have argued that stronger property rights encourage investment that would not otherwise occur, and that such investment benefits society more than it costs.

59 This sub-section is adapted from Bagby, John W. Outlook: Who Owns the Data? RESEARCH PENN STATE, Vol.26, No.1 (Jan. 2003) accessible at: http://faculty.ist.psu.edu/bagby/Pubs/WhoOwnsDataRPSJan03.htm

Today various sectors of the Internet “content” industry are lobbying heavily for expanded IP protections. They argue that pirates can too easily “ride free” on the “sweat of the brow” of database creators. Content providers believe that weak U.S. protections encourage wholesale infringement, and narrow the possibilities for profit using various business models reliant primarily on the sale of data.
Critics argue that sui generis database rights would impose substantial new costs on research, and that they are particularly inappropriate where the database results from publicly funded work. Indeed, the National Research Council (NRC) strongly cautions that sui generis database protections could retard scientific research. Furthermore, the NRC argues that current trade-secret controls and licensing restrictions afford sufficient protection. 60 The significance of this debate is magnified by controversy over expanding the definition of a database. Traditional definitions are narrow; they limit databases to organized collections of numerical observations taken from tightly controlled experiments. By contrast, big data purists contemplate broader definitions, including structured and unstructured content of nearly any type of information or work. Sui generis proposals would broadly define a database to include “any physical or digital collection of information or works arranged in a systematic or methodical way for retrieval or access by manual or electronic means.” Under this view, databases would include literary and artistic works, texts, sounds, images, numbers, facts, statistics, production or shipping information, transactions, financial data, health information, geographic information, and private personal data derived from almost any source. Recent technological developments enhance the value of databases considerably, raising the stakes.
Consider the impact of peer-to-peer file sharing, automated data harvesting by Internet “bots,” electronic agents posing as human users, and the capability for near-instantaneous aggregation and association of data from physically separate or independent databases, resulting in data “mining” and data “warehousing.”

60 A QUESTION OF BALANCE: PRIVATE RIGHTS AND THE PUBLIC INTEREST IN SCIENTIFIC AND TECHNICAL DATABASES, Committee for a Study Promoting Access to Scientific and Technical Data for the Public Interest, National Research Council (1999).

Thomas Jefferson, an accomplished inventor and, as Secretary of State, the first federal administrator of patent rights, remains an important influence on intellectual property rights in the U.S. Jefferson insisted that society should not suffer the “embarrassment” of granting a monopoly in intellectual property rights unless society's benefits are clear-cut and substantial. Modern Jeffersonians expand this reasoning to argue that the burden of proof must be kept very high on proposals to expand IP rights. They argue that new rights must: (1) fit harmoniously with existing intellectual property protections; (2) be defined in a reasonably clear and satisfactory manner; (3) be based on an honest cost-benefit analysis; and (4) clearly enrich and enhance the public domain. The ultimate question is whether the benefits of new or expanded IP regimes will offset their costs. Many projections of such benefits are highly speculative and too often the costs are overlooked. For example, the costs of privacy intrusion, infringement enforcement and compliance are systematically underestimated in public policy debates, largely to the benefit of IP professionals. A more thorough understanding of the diverse types of databases arising under evolving and advancing big data techniques seems essential before any new form of IP is created.
Database rights may subvert scholarship if they are not more precisely targeted. Abrupt introduction of new IP rights could negatively impact not only scientific inquiry but also small business creation and incentives for other types of innovation. Academics must participate in the big data IP rights debate by developing some consensus on the information policy issues raised. It seems fundamental, first, to distinguish clearly between various types of databases according to such factors as the nature of the data, the source of funding for data collection and maintenance, the degree of innovation in database architecture and functionality, the likely uses for the data, and other considerations. Without a common basic understanding and some sense of others’ experience with database protections, big data IP rights will miss the mark. Table I illustrates the available IP regimes historically and currently useful in protection of databases in the U.S.

Table I: Database Protection Laws

Type of Database Law: Tort: Misappropriation
Strengths: Protects data even if technical security measures or contractual restrictions are not used
Weaknesses: Protects only “hot news” or other elements not part of copyright law protections

Type of Database Law: Tort: Trade Secrets
Strengths: Most common in U.S. Requires technical security measures or contractual restrictions
Weaknesses: Rights are lost when information is disclosed into the public domain

Type of Database Law: Tort: Trespass to Chattels
Strengths: Protects information stored in owner’s personal property, such as computer or network storage
Weaknesses: Trespass theory could be expanded so far that interactivity would be severely impeded in network settings

Type of Database Law: Copyright
Strengths: Strong prohibition against copying, rewards creativity in selection and arrangement
Weaknesses: Feist limits protection to compilations made with creativity in selection and arrangement; underlying ideas and facts are not copyrightable

Type of Database Law: Contract
Strengths: Common protection in the U.S. Enforceable to prevent misuse, disclosure, copying or reverse engineering
Weaknesses: Requires privity, not applicable unless user assents to restrictions; form and conscious understanding of assent required remains unclear

Type of Database Law: Sui Generis
Strengths: Stronger protection than under nearly any other theory; validates the “sweat of the brow theory”
Weaknesses: Constitutionality suspect; could impede research and narrow the public domain; offers excessive rewards for some uncreative work

III. RISK SHIFTING UNDER ITS BIG DATA

ITS allegedly enables societal resolution of a wish list of horribles: environmental catastrophe, recurring and extraordinary infrastructure costs, public safety costs, unattainable law enforcement improvements and national security surveillance accuracy. Few ITS systems are possible without big data. Therefore, the promised benefits of ITS are attainable mostly with big data applications that do not impose negative externalities 61 beyond their promise of benefits. This section acknowledges that (1) ITS-Big Data will have unintended consequences, (2) ITS-Big Data proponents may promise illusive benefits and (3) implementations by the states and local governments, federal government, NGOs, and third party private-sector service providers will sometimes fall short of the perfection promised. Indeed, side-effects are likely to be manifest as both positive externalities (socially beneficial side effects) and negative externalities (side effects imposing social costs).
Big data’s foreseeable and unforeseen impacts will likely be addressed by various well-known risk shifting techniques, although it can be expected that novel risk management (identification, assessment, control, shifting) techniques will be developed and deployed. 62 Longstanding legal and regulatory concepts will most certainly apply to ITS-Big Data including those bedrock risk shifting problems such as insurability, insurable interest, insurance market development, statutes of repose, statutes of limitation, causation, foreseeability, joinder, class action, mass tort, hold harmless, acknowledgements, release, cognovit, software/systems/database licensing, and public acceptance. Figure IV illustrates risk transfer points in the generalized ITS architecture.

Figure IV: Risk Shifting under Generalized ITS Architecture. Source: Office of the Assistant Secretary for Research and Technology, Intelligent Transportation Systems Joint Program Office, Department of Transportation

61 See generally Coase, Ronald H., The Problem of Social Cost, 3 J. L. & ECON. 1-44 (Oct. 1960) accessible at Stable URL: http://www.jstor.org/stable/724810 (arguing negative externalities should be viewed only as a property rights problem that is resolved only with private contracting by parties that “will negotiate voluntary agreements that lead to the socially optimal resource allocation and output mix regardless of how the property rights are assigned.”).
62 See generally, ISO/DIS 31000 (2009). Risk management – Principles and guidelines on implementation. International Organization for Standardization.

IV. IMPLICATIONS OF PUBLIC POLICY IMPACTS ON ITS BIG DATA

In the regulatory and private sector developmental history of ITS, public safety has assumed a super-ordinate position from among all the ITS goals of public safety, environmentalism, efficiency, infrastructure costs, mobility, convenience, national security and law enforcement. It seems clear that big data feeds all these ITS goals, but in somewhat different ways and to different extents. Therefore, this section will address how public policy application to ITS systems can choose either (1) weak, omnibus or general application to all ITS systems or (2) the more likely result of predictable U.S.-style policy formulation resulting in a more sectoral approach to regulation, with particular sectoral results for the major big data regulatory themes of privacy and security. Thus, this section acknowledges that it may be more useful to decompose ITS into particular ITS applications susceptible to this sectoral approach, reserving omnibus approaches to the common elements of ITS technologies (e.g., wireless telecommunications, sensor arrays, data centers located in the cloud).

A. Omnibus vs. Sectoral Approaches: Regulation of ITS & Big Data 63

The accumulation and use of big data for [ITS] analytics and ultimately prediction suggests strong merit in contrasting the different approaches to privacy rights [and cyber security duties] under the American experience [as compared to] that regime largely dominant in European nations. U.S. privacy rights are generally described as sectoral; typically regulation is imposed ex post, only after disputes take shape and public policy intervenes in response to the need to settle disputes among competing forces. By contrast, the post-WWII European experience is omnibus; regulation is pervasive over economic sectors ex ante. This difference illustrates a cultural clash over liberty [America] and paternalism [Europe].
Two important values are at play in the American sectoral approach: 64 (1) individual autonomy and (2) personal experience. This means that privacy is compared with ownership over data observed, the meaning given that data, and the intangible intellectual property rights to use this knowledge or sell it as a product or service. The controversy over privacy [or security] regulation is essentially a struggle to resolve these often conflicting claims. It occurs when society finds it appropriate to intervene by setting policies [balancing the] benefit[s for] individuals and society. For example, this conflict is settled in favor of individual privacy interests when restrictions are placed on data collection or use, such as the well-known financial and health care information privacy restrictions. By contrast, the balance can also favor the rights to use and own observations and perceptions; examples include broadening law enforcement investigatory powers and the protectability of corporate information and data as trade secrets.

63 Subsection A is adapted from text accompanying note 21 in John W. Bagby, Using an Industrial Organization (I/O) Lens to Enhance Predictive Analytics: Disentangling Emerging Relationships in the Electronic Surveillance Supply Chain, LEGAL AND ETHICAL ISSUES IN PREDICTIVE DATA ANALYTICS, Virginia Tech University, June 20, 2014.
64 Some components of this section are adapted from a white paper by Bagby, John W., Regulation of Private Data Management: a Supply Chain Analysis, (Oct. 2013) which is further adapted from excerpts found in Bagby, John W., Privacy, Chapter 13 in: ECOMMERCE LAW, (©2003; West Publishing Co. Mason OH,) and Bagby, John W., Managing Information Rights to Develop the Energy Informatics Field (August 4, 2012) available at SSRN: http://ssrn.com/abstract=2564257 or http://dx.doi.org/10.2139/ssrn.2564257 originally presented as developmental (working) paper No. ALSB2012_0182, Academy of Legal Studies in Business, Kansas City MO Aug. 2012 (developing generalized data architecture susceptible to public policy analysis, regulation and incentives).

B. Discussion: Big Data Risk Profiles for Particular ITS Applications

Table II reveals the results of analysis of the risk shifting likely for particular ITS applications reliant on big data.

Table II: Analysis of Results – Regulation of Risk Addressed by ITS Big Data
Accessible at: http://faculty.ist.psu.edu/bagby/Pubs/ITS-BigDataTableII.pdf

V. DISCUSSION AND NEXT STEPS

A. Specific Policy Risks for ITS-Big Data

Many fields of law are implicated by ITS and big data separately. However, the fusion of ITS and big data produces even more concentrated regulatory and liability interests among affected parties. Many policy issues are simple extensions or analogical extensions from adjacent fields, while others arise de novo given the intensity and divergence generated by the fusion of ITS and big data. First, decision-making based on analysis of big data is generally associative rather than causative. 65 According to one author:

Big data analytics probably work best when they provide interesting or promising “leads” for further analysis, measurement and evidence gathering. Thus, it is easy to predict big growth for predictions based on big data analytics for decision-making unconstrained by: (1) the traditional professional and scientific inquiry standards for creation of new “knowledge,” or (2) any principles of regulatory fairness, or (3) any competent and careful satisfaction of due process in litigation.
Furthermore, as prediction accuracy of big data analytics is believed to improve, it is predictable that pressures will mount to continually improve these analytic processes and apply them in more realms. 66

Science fiction writers, privacy zealots and big data theorists appear to uniformly recognize that big data holds strong potential for unjust outcomes. Indeed, due process in criminal prosecutions requires proof beyond a reasonable doubt, a standard closer to causation. This is repeatedly argued to make any decisions impacting life, liberty or property based on big data associations useful only at the investigation stage, not at findings of guilt or liability. When traditional due process interests in life, liberty, or property are implicated as direct stakes, current big data techniques run huge injustice risks, largely from a constitutional perspective.

Civil law’s preponderance of the evidence stands in a middle ground between direct causation and circumstantial sufficiency. Some associative reasoning based on circumstantial evidence is noted. Furthermore, regulatory rulemaking and standardization permit much more associative reasoning to comprise causation findings. Law enforcement leads are generally predicated on associative causation, thus using big data implications solely as “leads.” While “leads” have some impact on life, liberty or property, these impacts are usually weak, and considerable, robust legal process must follow to uncover more causative relationships between the evidence adduced and the conclusions that prosecutors attempt to prove with direct and some circumstantial evidence. Key to the direct action taken on big data in the law enforcement scenario is that limited law enforcement resources are allocated based on the strength of the “leads” evident from big data. This means that other, somewhat “less promising” leads are allocated fewer enforcement resources. At the margin, the role of big data is to optimize limited law enforcement resources. Political reconciliation based on conviction records, overturned convictions and law enforcement embarrassment are the major disciplines on use of big data “leads,” “hunches,” and suspicions.

65 See generally, Bagby, John W., Book Review, 14 J. L.ECON.& POLICY – (2008) of Harcourt, Bernard E., AGAINST PREDICTION: PROFILING, POLICING, AND PUNISHING IN AN ACTUARIAL AGE, (2006).
66 See generally, Bagby, John W., Using an Industrial Organization (I/O) Lens to Enhance Predictive Analytics: Disentangling Emerging Relationships in the Electronic Surveillance Supply Chain, LEGAL AND ETHICAL ISSUES IN PREDICTIVE DATA ANALYTICS, Virginia Tech University, June 20, 2014, accessible at: http://faculty.ist.psu.edu/bagby/Pubs/BagbyWorkingPaperPredictiveAnalytics.pdf

Similarly, national security investigations and actions seldom rely on direct causation evidence sufficient to prove guilt beyond a reasonable doubt. Counter-terrorism, national security and other intelligence community (IC) decisions are based almost solely on associative findings. These are weak from a due process standpoint, but appear to satisfy the prevailing IC and black operations decision-making standards when sufficiently “probable” using “confidence” assessments of experienced IC analysts and well-trained field agents. Again, as in law enforcement, projections are made to investigate further. However, spy movies abound suggesting that agents with a “license to kill” may take direct action impacting due process rights based on big data analysis. Big data plays a bigger role in such decision-making.