Time to curb the data brokers


Much of modern life depends on data stored and sold by private companies. Credit: Nikada/Getty

Data Cartels: The Companies That Control and Monopolize Our Information. Sarah Lamdan. Stanford Univ. Press (2022)

Information-technology platforms are ubiquitous in contemporary life. From friendships and genealogy to civic engagement, commercial transactions and charitable works, people rely on networked technologies that enable dialogue, authentication, payment and more. It is difficult to remember our daily lives before smartphones; those who have never known otherwise find those days impossible to imagine.

Sarah Lamdan’s Data Cartels is the latest in the genre of books that critically analyses such platforms, identifies bias, inequities and resultant harms, and asks why these for-profit companies are allowed to operate as they do. Lamdan focuses her attention on information and data brokers. These are relatively unknown, unlike social-media and search-engine companies (the respective targets of the 2018 books Antisocial Media by Siva Vaidhyanathan and Algorithms of Oppression by Safiya Umoja Noble).

Lamdan, a law scholar and librarian, concentrates particularly on RELX and Thomson Reuters, with some ancillary attention to Bloomberg and other commercial publishers. She unpacks their activities in unrelenting detail, examining their products and services across the arenas of data brokering, academic research, legal information, financial data and the news. Lamdan argues that these “perfectly legal” activities harm individuals and society, and erode democracy.

That this is a story of legal and regulatory failure is clear from the first, and evidenced throughout. The US government’s documented divestment from public media brings into particularly sharp relief how current practices differ from those of the previous century, which regulated radio and television. These days, for example, only around 1% of media organization NPR’s operating funds are contributed directly by the federal government. Congress has failed to enact laws that would constrain companies’ data practices, and the courts have repeatedly weakened protections. Individuals have been left to fend for themselves. Even when they can document mistreatment, there is limited accountability for firms. Data brokers have successfully asserted a free-speech defence in the face of legal action seeking redress for the dissemination of inaccurate information.

Lamdan’s solution is a legal and regulatory regime that treats information and data as public resources and that provides public digital infrastructures. In her vision of a functional information ecosystem, private companies would still operate, but public infrastructure would deliver essential information without the sacrifice of personal privacy. This kind of library-like public digital infrastructure would require considerable resources. Lamdan does not offer any cost estimates. But extrapolating from the more than $470-million annual budget of the US National Library of Medicine, which provides fairly robust public access to health information, it is quickly apparent that the price tag would be multiple billions.

The US track record with legal and financial information is, however, abysmal. Currently, “when the government provides information to the public, it does so with outdated, insufficient online tools and platforms”, Lamdan writes. These pale in comparison to paywalled corporate services. Any researcher using the government platforms PACER or EDGAR to find court records or financial data, respectively, will be frustrated with their limited information search, retrieval and management tools. As Lamdan notes, federal, state and local governments make up a substantial customer base for many of the data brokers’ services.

Incremental reform

Unfortunately, as Lamdan observes, the kind of comprehensive and coordinated legal and regulatory reform that would be required to tackle the problems she has laid out — addressing copyright complexities, re-establishing previously strong antitrust doctrines and closing loopholes in constitutional law — is unlikely. As such, she also identifies a number of incremental changes that would be useful. These include treating data brokers as information fiduciaries and — when governments contract for their services — as state actors bound by constitutional obligations.

The book is replete with observations about how things should be and ideas for what the government could do. But the reader is left wondering why so many bills are introduced but not passed, and so many petitions for investigations filed but not acted on. If multiple attempts have failed, what are the barriers to success and what would it take to overcome them? Little attention is paid to these questions.

One also wonders why the discussion is almost entirely US-centric. RELX, based in London, and Thomson Reuters, headquartered in Toronto, Canada, are global companies used by researchers around the world. RELX’s information-retrieval system LexisNexis and Thomson Reuters’s research service Westlaw publish laws and related legal documents from many countries. Lamdan mentions in passing that European antitrust legislation focuses on fairness, whereas US laws concentrate on preventing economic harm to consumers. However, there is no vision of how the United States might pursue multilateral action or what might be the role of international organizations in responding to the problems outlined in the book. Is it even possible for the United States to act on its own?

A broader look beyond governmental solutions would have been welcome, too. In addition to the international Free Access to Law Movement, which Lamdan highlights, there are organizations pursuing guidelines and standards (such as the US National Information Standards Organization’s Consensus Principles on Users’ Digital Privacy in Library, Publisher, and Software-Provider Systems) and cross-industry collaborations (one is the Vendor & Library Community of Practice, sponsored by the Intellectual Freedom Committee of the American Library Association).

The Institute of Museum and Library Services has awarded numerous grants on this topic. These include support for a project called Library Values & Privacy in the US National Digital Strategies: Field Guides, Convenings, and Conversations, and for a national forum on web privacy and analytics. There are enough library-led initiatives for the 2022 Electronic Resources and Libraries conference to have featured half a dozen in its keynote panel, including my own Licensing Privacy Project, funded by the Andrew W. Mellon Foundation in New York City. Philanthropic efforts such as Research4Life and INASP — both of which help researchers in low- and middle-income countries to access scholarly literature — as well as emerging shareholder and student activism, should not be overlooked.

Having been involved in efforts to raise awareness of the impacts of data brokers over the past decade, I appreciate Lamdan’s hopeful stance that it is not too late to reverse course and create a better world. Her rhetoric is powerful, her writing colourful and her critique vigorous. For those unfamiliar with the vast literature on challenges in the data and information arenas, this book is a useful compilation. Unfortunately, although Lamdan’s vision of what could and should be might inspire, her call to action falls short of offering a clear path forward.

Competing Interests

The author declares no competing interests.