The power of data: ethics, politics, and public interest | LSE Event

Summary

The LSE event titled "The Power of Data: Ethics, Politics, and Public Interest" convened experts Chris Wiggins, Allison Powell, and Erin Young to discuss the intricate dynamics of data in our contemporary society. The event scrutinized the ethical, societal, and political implications of data-powered technologies. It emphasized the challenges of current data governance, the socio-economic disparities perpetuated by AI, and the significant role of ethical principles in navigating these issues. Panelists highlighted the escalating influence of data in decision-making and its potential hazards if not managed with a commitment to public interest and inclusivity.

Highlights

Chris Wiggins discussed his experience with teaching data ethics, emphasizing the balance between optimism and realism. 📚
Allison Powell highlighted the importance of storytelling in data, emphasizing narrative's power to shape perceptions and policy. 📖
Erin Young discussed corporate governance and emphasized responsible use of data at the board level. 🔍
Panelists discussed the current challenges and inequalities in AI fields, touching on gender disparities. 👩‍🔬
The event addressed the importance of public participation in data governance to prevent data colonialism. 🌍

Key Takeaways

Data ethics is crucial for maintaining a fair society. Ethical guidelines help balance technological advancements with moral considerations. ⚖️
The role of data in politics and public interest can amplify inequalities if unchecked. Oversight is key. 🔍
Technological power lies with a few companies, requiring stronger regulatory frameworks to ensure public interest. 🚦
Regulation often follows technological crises, highlighting a reactive rather than proactive approach. 🏛️
Participants emphasized a multi-faceted view on ethics, incorporating justice, fairness, and informed consent in technology. 💡

Overview

The session kicked off with Chris Wiggins sharing insights from a class on data and ethics that he taught at Columbia University, which resulted in a book co-authored with the historian Matthew Jones. His perspective blended realism about the challenges of data-driven algorithms with a hopeful outlook for shaping better futures. He drew on historical context to help make sense of today's ethical challenges in technology.

Allison Powell delved into the narrative aspect of data use and governance. She reiterated that how we tell stories about data influences public comprehension and policy-making. Through her work, she encourages debates over definitions to better understand what constitutes data and technology's impact on society while pushing for creative methods, like community involvement, to reshape governance.

Erin Young contributed from a corporate governance angle, underscoring the need for high ethical standards in businesses concerning data use. Young emphasized that regulation needs to catch up with the fast-evolving tech landscape, advocating for accountable and transparent practices to genuinely support economic and social growth. Discussions also covered structural inequalities within the tech industry, particularly around gender and investment disparities.

Chapters

00:00 - 02:30: Event Introduction and Speaker Introduction The chapter introduces the event at the London School of Economics and Political Science on the power of data. The chair, Seeta Peña Gangadharan, associate professor and deputy head of the department of media and communications at LSE, welcomes attendees to this hybrid event, which is co-sponsored by the Data Science Institute and the Department of Media and Communications. After brief housekeeping notes, she introduces the speakers, Chris Wiggins, Allison Powell, and Erin Young, and explains the format: three presentations, a panel discussion, and audience questions.
02:30 - 19:30: Chris Wiggins - Data and Ethics Chris Wiggins, associate professor of applied mathematics at Columbia University and chief data scientist at the New York Times, describes the class and book he developed with the historian Matthew Jones. He covers the history of statistics, digital computation, and artificial intelligence, the rise and decline of corporate data ethics between 2017 and 2023, and a view of solutions as a game among corporate power, state power, and people power.
19:30 - 33:00: Allison Powell - Telling Stories with Data Allison Powell, associate professor at LSE, asks what stories are told about data, what influence they have, and what else is possible. She describes her data walking project, the JUST AI project hosted at the Ada Lovelace Institute, and a data stall at a community festival in South London.
33:00 - 45:15: Erin Young - Governing Data and Technological Systems Erin Young, head of innovation and technology policy at the Institute of Directors, speaks from a corporate governance angle, emphasizing responsible use of data at the board level and the need for regulation to catch up with the fast-evolving tech landscape.
45:15 - 83:00: Panel Discussion and Audience Q&A The speakers come back together for a panel discussion, after which the audience, both in person and online, can ask questions. Panelists discuss current challenges and inequalities in AI fields, touching on gender disparities.

The power of data: ethics, politics, and public interest | LSE Event Transcription

00:00 - 00:30 Good evening everyone and welcome to the London School of Economics and Political Science for this hybrid event. My name is Seeta Peña Gangadharan and I am associate professor and deputy head of department in media and communications at the LSE. We are here to talk about the power of data. And I'm very pleased to welcome Chris Wiggins, Allison Powell, and Erin
00:30 - 01:00 Young to both our online and in-person audience today. This event is co-sponsored by the Data Science Institute and the Department of Media and Communications. Before I have the privilege of describing the format for tonight's event and introducing our speakers, a few housekeeping items. I'll keep them brief. Uh first, the school has a code
01:00 - 01:30 of practice on free speech. So, let's champion that code of practice tonight. Second, in the event of a fire, follow the exit signs and the fire assembly point for this building is outside in the plaza. And now to very brief introductions. Chris Wiggins is an associate professor of applied mathematics at Columbia University and
01:30 - 02:00 the chief data scientist at the New York Times. Allison Powell is associate professor in media and communications at LSE and directs the master's of science stream in data and society. Erin Young is head of innovation and technology policy at the Institute of Directors. The format of the event will be as follows. We will hear from Chris as he presents followed by Allison and
02:00 - 02:30 Erin, before coming back together to discuss as a panel. Then there will be an opportunity for all of you uh as an audience in person and online to ask questions to our speakers. This event is being recorded and will hopefully be made available as a podcast subject to no technical difficulties. Please, can I remind you at this moment
02:30 - 03:00 to put your phones on silent to minimize any disruptions? And now I'm delighted to give the floor to Chris Wiggins. Thank you very much. Uh it's a real pleasure to be here. Thank you so much to uh our hosts and for inviting me to be here as well. Tonight I thought I'd share with you some of my own thinking about data and ethics. My thinking about data and ethics was very
03:00 - 03:30 much informed by a project I was engaged in from 2017 until present. Uh namely a class at Columbia and a resulting book. So I thought I would share with you a bit about how we thought about data, power, and ethics in this class that I was teaching at Columbia. Uh it resulted in a book uh on the far right co-authored with a historian, Matthew Jones. I'm not a historian, I'm merely a fan of history. And these three images sort of capture what we um accomplished in those years 2017 until
03:30 - 04:00 uh 2023 when we were teaching the class together, which was to deal head-on with concerns that students had about the internet, which is symbolized by young people in the form of a flaming dumpster fire. Uh, and then to unite that with optimism about the future, which for my age group is symbolized by this cartoon character, Buzz Lightyear, who's from a movie from a previous millennium who had irrational exuberance. And we wanted students and also the uh readers of this book to
04:00 - 04:30 understand head-on the challenges of a world in which our personal, political, and professional realities are being shaped by data-empowered algorithms, and yet to have some optimism and hope for uh how we can shape that future together. So, I thought I'd share with you a bit about that in maybe 20 minutes. And if you'd like to know more, there's a book which you can read about there and of course um a GitHub repository. Like any civilized class, the primary materials are available to you online.
04:30 - 05:00 Um so by way of nice to meet you, my own thinking about data has been shaped by this book as well as a book co-authored with computer scientists and, as mentioned, my own engagements with the New York Times leading a team that develops and deploys machine learning, my engagements at Columbia teaching uh as well as doing research in machine learning, and then a nonprofit that I co-founded in 2010 in New York City to try to get students to find out about the New York City startup community. The book itself resulted in um three
05:00 - 05:30 parts, three sort of stories about data and our world. And as I said, the book itself was really shaped by interacting with Columbia undergraduates who are a pretty demanding bunch. We started the class in 2017 thinking we would teach a class on the history of data science and the students really pushed us to think more about data and its impact on society. So part one really opens up thinking about statistics as you might have encountered it uh as a field of mathematical analysis. Part two is about
05:30 - 06:00 how computers change the way we make sense of the world through data. Part three is about our present condition and about power and solutions. We opened up by talking about the stakes. And when we started the class in 2017, it was immediately before a blossoming of literature about challenges involving data-empowered algorithms. Our colleague Cathy O'Neil wrote Weapons of Math Destruction right around the time that the class debuted. Safiya Noble wrote this great book about um search algorithms and more
06:00 - 06:30 generally the way algorithms reinforce uh societal inequalities. Virginia Eubanks had this great book about how algorithms can empower um automating inequality. Shoshana Zuboff's own book about surveillance capitalism and about commercial interest and how they exacerbate some concerns uh with a particular US lens. Ruha Benjamin's great book Race After Technology uh captured a lot of the societal and political concerns around race, uh again with a particular US lens. Another
06:30 - 07:00 scholar captured it well in the title of her class, Lisa Nakamura's class, the internet is a trash fire. Uh which as she says, she teaches a class called the internet as a trash fire. And as she says, she doesn't have to explain to anyone what that means. We started out the class talking about a subject for which there's abundant literature, which is mathematical statistics and its own history. Most people learn statistics without learning its history. It's quite a rich and well-covered history. So, uh, I won't spend too much time on
07:00 - 07:30 it other than to, uh, advertise that we started out in the last days of the Victorian Empire at the end of the 19th century, uh, and lean heavily on scholarship by other scholars, including Stephen Jay Gould's The Mismeasure of Man. We introduced students to the ways that people had tried to make sense of humanity by quantifying whatever they could, sometimes very much influenced by their own prior conceptions about how humans themselves are arranged. Uh and we introduced them to the birth of correlation and regression and eugenics.
07:30 - 08:00 Three words brought to us by Sir Francis Galton uh all part of the same program. Stephen Jay Gould does a great job capturing what we wanted to do by having a history class to understand the present day. History is very useful for making the present strange when we find history that's far enough away that we can look at other people's innovations and see them with distance and say, "Well, clearly those people had a wrong conception of how to make sense of the world," and yet a history that's close enough to present day that it makes our own present world strange. Or as Stephen
08:00 - 08:30 Jay Gould wrote, "Shall we believe that science is different today? By what right, other than our own biases, can we identify their prejudice and yet hold that science now operates independently of culture and class?" So that was part of the history in part one of the class uh to make the present day strange for our students and readers. Part two opens up with the birth of digital computation, data at war, because digital computation is really born during World War II. So we introduce students to what an incredible breakpoint it was to be able to make sense of data on an automated
08:30 - 09:00 machine so that the laborious hand calculations can be made cheap and as we would say today at scale. That process involves the birth of what we now call the military-industrial complex and also labor. And as soon as labor becomes part of making sense of data, it is gendered. And so we um again leaning on the great scholarship by authors such as uh Janet Abbate and her book Recoding Gender, we show how immediately as soon as people saw that data was a project to be managed, people decided which part
09:00 - 09:30 was going to be men's work and which part was going to be women's work, including this work in the creation of the Colossus at Bletchley Park here in 1943. Once we've introduced students to uh how data changed when it became a concern for labor and for computation, we introduced them to the birth of artificial intelligence. You may think that the birth of artificial intelligence was something like chatbots or data-empowered algorithms, which is our present universe. So we want students to see how most of the
09:30 - 10:00 history of artificial intelligence as a term had no data in it and in fact data was the low road. Uh the term artificial intelligence was coined by this mathematician John McCarthy who as he put it later um here in London. Actually this screenshot is from a debate about artificial intelligence in 1973 when people were already sick and tired of hearing about artificial intelligence. He's on record as saying, "I invented the term artificial intelligence when we were trying to get money for a summer study." And it's a useful story to remind students because they often feel like when they hear artificial intelligence people are just
10:00 - 10:30 trying to get money. And it's useful to tell them that it in fact always has been that way. This is the actual study. These seven uh white guys hung out in the summer of 1956 at Dartmouth. On the far right is Claude Shannon, now immortalized for young people by the word Claude, the chatbot of Anthropic. Uh but in there also is the infamous Minsky and John McCarthy uh and several others from a variety of walks of life. It's clear when you look back on what they proposed in 1955 for the 1956 study that they had
10:30 - 11:00 no idea what they were talking about. So they wrote a long list of things that they planned to study. One of them, neuron nets, turned out to be the thing that won. But we didn't really know that until 2012 or so. And the other fields went on to create effectively seven countries within the continent of computer science. But at the time uh the world was young and they did not know how artificial intelligence would be crafted. Today artificial intelligence has been crafted as a
11:00 - 11:30 problem in which machine learning is the interesting technical nugget. But when you experience artificial intelligence in the palm of your hand, the artificial intelligence is only one part of a large technical system and in fact of a large sociotechnical system of which you are an important element. So in order to tell students the story of machine learning, we also often tell them the story about why and how data were captured for several decades before we could get to the point that people would realize that artificial intelligence was actually not about how we think we
11:30 - 12:00 think, but it was actually about the low road of capturing sufficient data for computers to be able to reproduce to us a simulation of how we think we think. Okay. And by the time we get to the present day, we talk a bit about the battle for data ethics, which was a great story in technology, I would say 2017 through 2023 or so, uh, when a number of the leading companies were interested in positioning themselves as defenders of ethics. Um, ethics of course is a drift word, right? It's a term that means different things in
12:00 - 12:30 different communities, in different decades or at different companies. Uh or as the law professor Phil Alston put it in 2018 on the stage of the AI Now Institute in New York City. He said, "What I always see in the AI literature these days is ethics. I want to strangle ethics." Now, Professor Alston did not mean that he wanted systems not to be ethical. What he meant was the danger of a drift word is that the word will be captured. And in fact, ethics was largely captured by private companies who defined it in ways that um suited
12:30 - 13:00 their own interests by around 2023. Ethics for many people in the academy already had a prehistory around fairness and privacy. Privacy had already been um investigated and interrogated well by scholars such as Latanya Sweeney who pointed out that simply removing the name from a table was not enough to make a data set anonymous. As she pointed out, if I have one uh voter list that has everyone's name, zip code, birth date, and sex, and
13:00 - 13:30 I have another data set, which I can get my hands on, which is medical records, that has the zip, birth date, sex, and no name, I can simply take those three fields and make them a composite join key that will allow me to uh as it is often said, mail the governor of Massachusetts his own health records. It's often said that Latanya Sweeney did that when she was a graduate student. I emailed her, she said that it was actually not true, which is too bad because it's a great story. But the point is that we may think that privacy is merely removing a name from a data
13:30 - 14:00 set, but it is worse than that. Fairness is often the way we think about ethics um in industry and academia. And a lot of work was done to advance that, including by Julia Angwin and co-workers at ProPublica to point out how black-box algorithms were being used in the criminal justice system in the United States by private companies in ways that are absolutely um not available for us to inspect. We don't know the source code, the algorithms, the underlying logics of these algorithms. She and
14:00 - 14:30 co-workers did a great job uh reverse engineering as best as possible and even more so putting the source code on GitHub which launched a small technical field of fairness within ethics. But we try to introduce students to the idea that uh privacy and fairness are really a small part of a larger conception of ethics. We introduce them to um a particular conception of ethics organized around principles that are in tension. If you were raised in the academic tradition in the States, you would have encountered this under a
14:30 - 15:00 forgotten book called the Belmont Report. You can tell it's a forgotten book because if you go to Amazon and try to buy it, it says on the bottom forgotten books. But this conception of ethics is that ethics is not a checklist or it's not just about privacy. It's actually about principles that are designed to be in tension. So that we must hold those principles in tension and adjudicate difficult decisions about informed consent, harms, and bias as well um in concert. Many of the things that we tell students are about why you would want to do ethics. Uh for people who work at
15:00 - 15:30 these tech companies, there's a concern around moral injury. Meaning the feeling that you might do something that gets you promoted and then after you've done it, even if you got a raise, you might realize that there was something bad about that thing that you have done. And if you followed the literature by many people who have graduated from these tech companies and then started nonprofits thereafter, many people in fact do graduate from these companies and then do realize that there is a sense of moral injury, a sensation often associated with um first responders or
15:30 - 16:00 doctors or members of the military. But it's also a phenomenon now among tech workers who while they are working in tech form collective action in various ways or might be whistleblowers and try to defend themselves against that sense of moral injury. Employees themselves form in this way a form of private regulation. They don't regulate in the sense of laws but they are private ordering um of their own companies and of society in a way that has been analyzed by legal scholars. Ethicists within companies may
16:00 - 16:30 be technologists who try to design process. So for example, there's this great paper by uh Meg Mitchell and Timnit Gebru, later famous for getting fired from Google um for their concerns about large language models, Deb Raji who's currently a graduate student in Berkeley, and other co-workers trying to think about how the process of launching a product can be augmented so that in addition to reviewing QA and technical assessments, we can review its ethical impact. As I said, tech ethics and data ethics was uh flourishing 2017 through 2023.
16:30 - 17:00 It'll be a question for historians of technology to ask why by 2023 as AI boomed, tech firms were laying off their ethicists. Uh and that is why it fits well into a history book. With that, um we close the book and the class by trying to give students some form of optimism and thinking about solutions. Thinking about solutions really means thinking about power. So we try to get students to think about not only their role as students who will go on to work in tech companies but as consumers who
17:00 - 17:30 produce data that is the lifeblood of these companies. This form of power which I like to call people power sits in a three-player unstable game among corporate power and state power. Often, particularly in the States, people think, "Oh, well, there ought to be a law." And if we see something that's a concern for consumer protection, we think that some centralized legal entity will somehow take care of it. But the law and state power more generally, we want to remind people, is only one player within this unstable game.
17:30 - 18:00 Companies themselves effectively regulate each other or might advance, say, privacy as a human right as Apple has been saying for 10 years, in part because it's a value add to the consumer. Consumers provide the power not only to their democratically elected officials but also to the company in their money, their data, their talent. Having uh elucidated that dynamic of power, what do we hope people get out of the book? Well, we learned an arc from the students. As I said in
18:00 - 18:30 2017, we thought we were going to teach a class on the history of data science. We ended up seeing that there's much more to data, particularly data around truth and power. Our feeling was that there's important material that at present is not being taught either to the future statisticians or to the future senators and CEOs. How data happened and also how data sits within a nexus of power and not only power but truth. Data comes with it a rhetorical valence that somehow data is given this extra truthiness because it's perceived
18:30 - 19:00 as objective and part of the scientific program. And we want to illustrate for students and for readers of the book all of the subjective design choices that professionals who do data know go into all of their analyses as well as the resulting products. Uh and with that I'll close and hand over the floor to Professor Powell. Thank you very much. [Applause]
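The composite join key described in the talk (zip code, birth date, and sex shared between a named voter list and a medical data set with no names) can be sketched in a few lines of Python. This is a minimal sketch on invented toy records, not Latanya Sweeney's actual method or data; the field names, the `composite_key` and `reidentify` helpers, and every record below are assumptions made purely for illustration.

```python
# Toy illustration of re-identification through a composite join key.
# Every record below is invented for illustration; none is real data.

voter_list = [
    {"name": "Alice Example", "zip": "02138", "birth_date": "1950-01-01", "sex": "F"},
    {"name": "Bob Sample", "zip": "02139", "birth_date": "1962-07-15", "sex": "M"},
    {"name": "Carol Placeholder", "zip": "02139", "birth_date": "1962-07-15", "sex": "F"},
]

# "Anonymous" records: the name column has been removed, nothing else.
medical_records = [
    {"zip": "02139", "birth_date": "1962-07-15", "sex": "M", "diagnosis": "condition A"},
    {"zip": "02138", "birth_date": "1950-01-01", "sex": "F", "diagnosis": "condition B"},
    {"zip": "02140", "birth_date": "1980-03-03", "sex": "F", "diagnosis": "condition C"},
]

# The three fields that both data sets share.
KEY_FIELDS = ("zip", "birth_date", "sex")


def composite_key(record):
    """Combine the shared fields into a single tuple used as a join key."""
    return tuple(record[field] for field in KEY_FIELDS)


def reidentify(named_records, anonymous_records):
    """Return (name, diagnosis) pairs where the key matches exactly one name."""
    names_by_key = {}
    for record in named_records:
        names_by_key.setdefault(composite_key(record), []).append(record["name"])

    matches = []
    for record in anonymous_records:
        candidates = names_by_key.get(composite_key(record), [])
        if len(candidates) == 1:  # a unique match attaches a name to the record
            matches.append((candidates[0], record["diagnosis"]))
    return matches


linked = reidentify(voter_list, medical_records)
for name, diagnosis in linked:
    print(f"{name} -> {diagnosis}")
```

In this toy data, two of the three name-free records match exactly one voter and so acquire a name, while the third has no counterpart in the voter list. The point mirrors the one made in the talk: removing the name column alone does not make a data set anonymous when other fields can be combined into a key.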
19:00 - 19:30 Thank you, Chris. Thank you everyone for being here. Um, so I run a program that's called data and society. So I'm picking up where Chris left off, but I'm also going to take our discussion in a little bit of a different direction because my program sits inside a department of media and communication. So when I'm thinking about data, I'm not only
19:30 - 20:00 thinking about how data becomes a resource for narrating our social world, but I'm also thinking about how we tell stories about data. So today I'm going to focus on three questions. What stories are told about data? What influence do these have? And what else is possible? And when I say what stories are told about data, I'm beginning where Chris left off because Chris told us
20:00 - 20:30 about how when you collect data, when you perform your statistical analysis, when in the history of statistics you are collecting information about a people, to describe those people, to describe the average of those people for perhaps as if an average were something to represent those people you are already telling a story. So that is story number one that is being told with
20:30 - 21:00 data. We're also telling a second story which is a story about data's truth or truthiness as Chris pointed out. So I wrote a book also which I'm not going to talk about at great length but I do want to tell you a little bit about it because it described the consequences of a related story which is the story about governing with technology and that story was like the story about data being
21:00 - 21:30 truthy that governing with technology governing with data was better and it's better because it's cheaper, it moves faster, you can describe more things with data. Uh you can make certain kinds of processes move more quickly or as Chris said at scale. But there's a catch here. And that catch is that when you tell that kind of story about governing, for example, a city
21:30 - 22:00 with technology, you're also telling a story about which kinds of experiences are worthy and good and which are not. So, the spoiler of the book, so now you don't have to read it, is that stories about being true, objective, and easier to action than other forms of evidence used to govern can limit the ways that people are given space to participate in citizenship. So in the rest of my talk,
22:00 - 22:30 I'm going to talk about things that I have done in my own research practice to try to create other kinds of stories and other kinds of narrating about data and sometimes with data. And in contrast to many talks that I give, this talk is completely full of pictures of people doing stuff. Um, and if you're really watching carefully, you might see yourself. Uh, you might see uh famous scholars. Uh, you might see some of my
22:30 - 23:00 friends, my neighbors, uh, some of my collaborators. And that's really intentional because I think it's very important when we're talking about technology to remember that technology is people and technical systems also, as Chris has already pointed out, are in fact sociotechnical systems. So when we tell stories with data and about data, we're also doing things together as people. So I'm going to start with
23:00 - 23:30 something I've been doing in my classes for such a long time that this slide has two pictures of two classrooms that are more than 10 years apart. And this is a project that is called data walking. And I started it as an educational project. Um but data walking has been used all around the world as part of public consultation on developing technology in urban regions and as a kind of
23:30 - 24:00 educational extension project. So if we were going to do this now I would tell you that we were going to go for a walk and you would have to go on a walk with some people that you're sitting next to and you would each have to play a really specific role on that walk. You would have to either figure out how to navigate, you would have to take notes, you would have to uh take photographs, or you would have to draw a map. Now, what is our project on this walk? Well, we're looking for data. Oh, no. We have
24:00 - 24:30 a problem now because we don't know what data is. So, this is where we begin. When we're doing things together as people and telling stories about data, first we have to have a debate about a definition. Um and sometimes you invent a definition like artificial intelligence. Um but in fact debating shared definitions struggling over meaning is a really big part of decision making about what matters in the world.
24:30 - 25:00 So when we debate in our walks about what is data that changes what we're looking for on the walk and what kinds of things are actually collected. So if we were doing this in this room, you would all come back and I would make you all synthesize your data and I would say tell me what kind of response can you make to a problem that you found out in the world based on your data. Um and in uh in 2012 I made people make things out
25:00 - 25:30 of cardboard. So that's why they're sitting on the floor. But by 2025 I had uh I had sort of innovated and so they were making things out of Lego. Um which you know was a lot easier to work with. Uh, I got a complaint from a colleague actually after I ran one of these workshops that the classroom was full of glitter. Um, so Lego is certainly tidier. Um, but playing roles, walking around and then having to pay attention to how the decision of what
25:30 - 26:00 you decided was data makes data matter. And it makes data matter because you've decided what it applies to. And so this is a really basic beginning point that starts with defining things that we care about together and also making sure that we have enough space to debate with each other about what those definitions are. So one of the things that I have been thinking about um when I scaled my
26:00 - 26:30 research up from the classroom to the policymaking space was how to use this struggle over definitions in an area where there seemed to be a lot at stake. So from 2019 until 2023 I was the director of the JUST AI project uh which was hosted at the Ada Lovelace Institute. It was commissioned by the Arts and Humanities Research Council as a strategic intervention intended to increase interdisciplinary connections
26:30 - 27:00 in what was then a really nascent field of data and AI ethics. So in 2019 we were in the middle of this kind of like data ethics being a thing. Uh we also knew that AI ethics was going to be a thing and the AHRC who was the funder, the Nuffield Foundation who supported the Ada Lovelace Institute, the Ada Lovelace Institute and my team composed of Imre Bárd uh and Louise Hickman and myself were like, "Okay, so how do we intervene in something that is still emerging?" So
27:00 - 27:30 we did a very complicated um oop sorry I have old slides. We did a very complicated mapping exercise. We tried to figure out who was publishing what in data and AI ethics up until that point. And then we looked at the places where things hadn't settled yet, where the definition wasn't clear. And we started to move the conversation to places where ideas were in flux or contradictory. And we used a lot of
27:30 - 28:00 incredibly creative methods. Uh we commissioned a film that you'll see a screenshot of. Above our heads are all of the words that we are trying to think about in relation to data and AI ethics. Uh we convened hundreds of people in online events. It was COVID so we had to set up very weird uh jury-rigged um network connections as you can see here. And we commissioned uh a fellowship program so that we could focus uh our resources on investing in people
28:00 - 28:30 investigating the relationship between AI and racial justice. And what I think was really interesting about this project is we did not directly shape policy. What we did was introduce new ideas into a conversation and create space for them. The ideas and the themes of our project which included refusal, sustainability, and justice are now present in conversations in ways that
28:30 - 29:00 they weren't before. We also commissioned science fiction writing uh which is how I met Erin as it happens because Erin's work became the basis of a science fiction uh um short story that we then debated in an online salon to make sure that people became invested themselves in the questions that have unfolded around data and AI ethics. And it turns out that people have a lot to say about data and AI ethics, not just experts. If you ask
29:00 - 29:30 them, they will have many, many things to contribute. These photographs are taken in 2023 at a community festival near where I live in East Walworth in South London. And I had been spending time talking with Michael Little and Richard Galpin who had been working in this community to increase community trust and connectivity as a way of increasing the health and resilience of the community. And Michael and I started talking about data. And I said, I
29:30 - 30:00 wonder, Michael, if the people in the community would like to talk about data as something that they could connect with and connect using. And he said, "I don't know if the community is interested in data. Why don't we find out?" So, at the community festival, we made a data stall and it was just down the street from the pub stall that my friend Matt made and it was just up the green from the stuffed animal sales
30:00 - 30:30 stall that our kids made. And we thought for sure no one would come to the data stall because they would probably be at the pub stall. But it turns out that many many people wanted to talk to us about what data was important in their community. We heard about ideas like collecting together the information that everyone in a tower block might need about how much electricity is being used so that they could buy electricity together rather than having prepaid meters. We heard about how much more
30:30 - 31:00 information they wanted from the council, how they needed granular data about what was happening in their area and not just at the borough or high government level. So what we learned here was that if there are mechanisms for people, all kinds of people, what we sometimes refer to as ordinary people, as if some of us are extraordinary people, um if you let people participate in structuring the norms and rules around things that matter like technology and information control, they
31:00 - 31:30 can and they will use those opportunities. So when we were organizing this panel, Sitta said, "I'd really like if you can talk about what to do now so that we don't just have the idea that the data world is a dumpster fire." So I've decided to think a little bit about what to do because there's a lot at stake, not only data or data governance that's being struggled over. It's this idea of AI now that's often thrown out as a
31:30 - 32:00 vague term of power. And as Chris's talk has shown, the dynamics of financial and cultural investment in particular ways of thinking about data and technology can leverage vague language, things like data ethics, for the benefit of certain actors and not others. How often do you see data ethics framed from East Walworth compared to how often you see data ethics framed from Google? So I don't need to tell you that the
32:00 - 32:30 current state of things is not a very good way to support flourishing. And so when I give these talks I'm often aware of the fact I'm talking about small groups of people in particular situations and they might not therefore seem world changing. So why focus on defining data together or expanding the field of data ethics or collaborating with people in tower blocks in inner London? But I think these practices are a bit like digging a new garden. Before you start, it seems
32:30 - 33:00 impossible to have anything grow there that wasn't there before. But once it's in place, you can't imagine the place looking anything otherwise. So I think we need to pay attention to which stories are told by whom and when, develop our individual and collective capabilities and make some breathing space around technology so that we can grow something new. Thank
33:00 - 33:30 you. [Applause] Hello. Lovely to be here. Thank you so much for the invitation. Um, I'm going to talk about something quite different and pick up on some different threads from Chris's talk, namely about how do we govern data, um,
33:30 - 34:00 and sociotechnical systems more broadly. So, uh, I've recently joined the Institute of Directors as the head of tech policy as of a few months ago. Uh, and if you don't know us, I wanted to give you some background. So we're a nonparty political organization that was founded in 1903. Uh we have about 20,000 members who are board members, directors from across industries. So everyone from CEOs of large corporations to startup
34:00 - 34:30 entrepreneurs. The IOD was granted a royal charter in 1906, which is when we began influencing parliament in the interests of British business. Um and the charter also tasked the institute with and I will quote promoting for the public benefit high levels of skill, knowledge, professional competence and integrity on the part of directors. Um and these are our headquarters in
34:30 - 35:00 Pall Mall. So as such we work a lot on corporate governance. Um there are various definitions of corporate governance given that the term governance like ethics like AI is contested and malleable but essentially it's a system by which companies are directed and controlled properly. So of course Musk did not say this. At the IOD we advocate for meaningful interpretations of good and
35:00 - 35:30 responsible governance in business. So it's not just about what you might think, compliance and risk management. It's about for example committing directors to ethical standards. It's about promoting tech and data literacy across board members. It's about diversity on boards, protecting whistleblowers. So to this end the IOD launched, it was before I joined, the Code of Conduct for Directors last year. Um it's a tool which helps business leaders make better
35:30 - 36:00 decisions essentially. So it provides them with a framework to help them build and maintain the trust of not just their stakeholders but the wider public in their business activities. So the code is structured around key principles. Now I know I haven't spoken about data yet. What's the link to data here? So you might notice that some of these principles are often seen in AI and data governance principles which have emerged globally
36:00 - 36:30 over the last few years. So transparency, accountability, fairness, we see ethical standards again, trust. When we're talking about corporate governance today, we need to be inherently thinking about digital tech, AI, data governance to harness the opportunities as well as manage the risks. And of course coming back to this point on language we're always acknowledging that not just governance but terms like AI and ethics are also contested and malleable meaning different things in theory and in
36:30 - 37:00 practice to different people. So with the launch of the UK's growth-focused AI Opportunities Action Plan earlier this year um and its largely uncritical focus in part on driving cross-economy AI adoption in the UK, I wanted to better understand our members' views on AI broadly defined. So we surveyed members in March of this year um and we found that many respondents
37:00 - 37:30 indicated a lack of trust in AI technologies, tools and systems. And this is obviously a huge issue when corporate governance is about building trust. They share concerns around the accuracy and reliability of AI systems, ethical, societal, environmental concerns. And most interestingly, there was a massive push back against the kind of AI hype versus tangible business value and application. And even when we asked
37:30 - 38:00 members specifically about the benefits of AI, although some in early adoption mentioned time savings, administrative efficiencies, others took the opportunity to express critical concerns about widespread AI hype. And it got me thinking, it makes complete sense that this kind of sweeping generalized AI debate often playing out among the upper echelons of the tech sector is so different to what's happening on the ground with adoption
38:00 - 38:30 and governance of AI across both the private and the public sectors in the UK and globally. And I also was thinking perhaps this is kind of pushing us towards this trough of disillusionment that we see in Gartner's hype cycle. you know, it's not entirely surprising um that the frustration and disillusionment in the sentiment of some mainstream business in the UK um given that this is the kind of sensationalized dystopian coverage from
38:30 - 39:00 certain media outlets, not mentioning the Daily Star in particular, these headlines are obviously completely ridiculous. Um but they're very pernicious because they take attention away from the actual issues. the big issues that we really need to be thinking about and talking about when we're discussing, developing, governing, adopting AI systems, issues like the underrepresentation of women in data science
39:00 - 39:30 and AI. So, there's lots of issues I could have flagged here. Um, but I'm drawing specifically on research that Allison mentioned already that I led previously at the Alan Turing Institute exploring how the underrepresentation of women in data science and AI as well as in the financing of the systems shapes tech innovation and other data-driven systems. um we found evidence of persistent structural inequality in the data science and AI fields with the
39:30 - 40:00 career trajectories of data and AI professionals differentiated by gender. So um you'll see here women are much more likely than men to occupy jobs associated with less status and lower pay. So for example, data preparation. Um and then men were much more likely to hold the kind of more prestigious frontier roles in engineering and machine learning. And so this is really about the people
40:00 - 40:30 making the subjective design choices behind the development of data-driven systems. decisions which of course reflect their preferences, their values and decisions which shape and are then encoded into technologies well before they are launched into the wild for adoption by for example businesses who are members of the IOD. We also did some research looking at the financing of data-driven systems, namely venture capital, and we found that over the last decade, female-founded startups
40:30 - 41:00 won just 2% of VC funding deals involving AI companies in the UK. We also found that average capital raised per deal by a female-founded AI startup is six times lower than the average deal capital raised by an all-male founder team. And then within AI enterprise software specifically where investment is really booming, all-female teams raised only half a percent of
41:00 - 41:30 capital. This isn't just about economic and social equity, but it's really about innovation. It's about creating systems that actually serve populations that will contribute to real growth, not just economic, but otherwise as well. We've seen the results of these kind of harmful feedback loops. You know, this garbage in, garbage out, bias in, bias out. The thing is, it's not necessarily about an intention to harm, but it's
41:30 - 42:00 caused by the fundamental structures of data systems. It brings me back to it's no wonder that businesses in the UK have massive trust issues when it comes to data-driven technologies. So today is the day. I forgot my poppy unfortunately, but it marks the 80th anniversary um since the end of World War II. Um so I wanted to mention that in sharp contrast to the data workforce we see today, at the advent of electronic computing during the Second World War,
42:00 - 42:30 Um as Chris discussed, uh software programming was largely considered women's work. And actually the first computers in inverted commas were young women. The majority of workers at Bletchley Park, as you saw in Chris's slide, were women. But as computer programming became professionalized, the gender composition of the industry shifted uh kind of marginalizing the work of female technical experts. So in other words, as money and influence entered the fields, um then more men began to
42:30 - 43:00 work in the fields. And this pattern seems to be replicating has replicated itself again in data science and AI today. So in contrast um on the right of the slide we can see who is in the room today making key choices about the development and the governance of data-driven technologies. Basically who holds the most power and it's not the average UK SME, small business. It's not
43:00 - 43:30 the government and it's not the public. So bringing it back to our work at the IOD where we make sure in the policy team that mainstream British business companies of all sizes have a voice in the policy debates around data and AI governance and through initiatives like the code of conduct which I mentioned at the top of my talk. We support directors and boards in providing effective oversight, including of their own data systems and the data-driven systems they're building so that their
43:30 - 44:00 organizations can operate responsibly, not only sustaining the trust of their stakeholders, their shareholders, but also towards the benefit of the economy and society at large. Thank you very much. [Applause] Thank you so much for your presentations and discussion. We will now open the floor
44:00 - 44:30 to questions from the audience uh both here and online. If you could please type your questions into the Q&A box, we will try to answer as many as possible. Please also include your name and affiliation. For those of you that are here in the room, please raise your hand and I'll take questions in batches of three. And as you here in person and online think about what your um
44:30 - 45:00 questions might be, I just wanted to get us going to see if Chris you had any if we could sort of interact with the two presentations that followed you. um Allison talking about sort of participation fair and equal participation in the discourse on technology in the kind of sociotechnical systems that we want. So you were focused on participation and you Erin were focused on representational
45:00 - 45:30 parity in the tech sector as one way to um achieve the type of uh sociotechnical systems we'd like in the face of unequal power. And so Chris, I'm just wondering if you had an example from recent times uh that you think is the most compelling example of how this sort of people power challenges the state and private power
45:30 - 46:00 that you spoke about in the book and that you would like to see um result in solutions beyond just tech solutionism. So yeah, could you share with us just one or two examples? Are they similar to Allison and Erin's or do they differ? Well, so at Columbia, I'm an engineering professor. So I teach young people who go on to careers in technology. So I'm
46:00 - 46:30 particularly attuned to uh cases where they've gone into companies and then I talk to them later about their experience as technologists. Uh so I'm I'm particularly interested in this internal experience of people who become software developers, data scientists, product managers within companies. So within within the vast sphere of people power, I'm particularly interested in the people who are working in these companies. So within that form of
46:30 - 47:00 private ordering to quote the article from the legal scholar, there's all sorts of ways in which people talk to each other. And and perhaps this is what you meant by saying technology is people. I mean they are people building the technologies and and distilling from what they think are their standards down to lines of code and ultimately product decisions. The ways that they push back include collective action small and large, demanding that the launch cycle includes not only QA and quality
47:00 - 47:30 assurance but some sort of understanding or at least thinking through the potential impact on people and absent of that sometimes after that uh alerting the wider world to the consequences of these technologies. Sometimes this happens after the people have left the company. There's plenty of great books by people who are alumni of these companies and then try to help us all understand how these companies work. Sometimes it happens from whistleblowers, leakers and planters who go to the press. I'm a big fan of
47:30 - 48:00 investigative journalism. uh so I do think that's uh one way that uh people work together the inside and the outside is to uh make it possible for people to discuss with the part of civic society that is the press to make sure that people have thought through potential damages seen unseen present and future and technology. Those are my favorite, the external visibility. Sure, we'll be able to touch upon some of those themes. Can I get um questions from the audience?
48:00 - 48:30 We'll go there and then behind and in the middle right there. Yeah. Just developing on this uh point about the general public as this um shepherd for better practices around AI. Um I was at an uh conference at the Bennett Institute in Cambridge yesterday. similar discussions arose and there's this concern around the general public is affixed between either AI alarmism or AI hype and there's this
48:30 - 49:00 concern around literacy that informs the ability to kind of hold practices accountable and push back. So in what ways do you think a data-centric view can contribute to uh increasing literacy and solidarity? Well, I think you make a good point about the dichotomy between irrational exuberance and the trough of despair to quote the diagram from Dr. Young and I think that's a very useful um
49:00 - 49:30 diagram. We go into a technology often with a view of what it can do. Sometimes that view is in fact argued for by the people who developed or sold that technology to us. And often as we interact with a piece of technology, we find ourselves in that trough as she pointed out of realizing that it actually doesn't do everything it was sold to do. The best place surely is the far right hand side in which we have rational exuberance of what it can and cannot do which prevents
49:30 - 50:00 us from either inflated hype or doomerism uh that will have too much of a power. I mean I might also add that this dichotomy is itself a story right so it is a story saying there's sort of two main kinds of people the ones who are getting on board or the ones who are getting in the way and you know I think all three of us are really interested in presenting something
50:00 - 50:30 that's a little bit more nuanced and therefore helping decision makers policy makers and governance structures understand that people are not, you know, don't necessarily, it's not that there's a huge gap in technical education about AI, but there might be a gap in the social conversation that would permit other things to become apparent rather than just on board or in the way.
50:30 - 51:00 Uh, thank you. It's been an absolutely fantastic talk. Uh if I could just jump wildly ahead to a kind of gigantic question. Um which obviously we find ourselves in a very challenging world politically since the new administration took power in America. Um hopes for enlightened regulation from the federal government are kind of shattered it
51:00 - 51:30 seems at the moment. There seems to be increasing rivalry between America and China. You know all this. So in the context of where we are today, what would you say as a way a strategy for attempting to regulate AI in some way so that it benefits as many people as possible rather than the tech elite? Sorry, I know it's a gigantic question. Where do you start? But your thoughts would be greatly appreciated.
51:30 - 52:00 I would say the one way to think about it is to expand the definition of regulation. So particularly in the States, people have a conception that there's one form of regulation which is central federal laws. Uh and of course regulation even legal regulation is not just federal laws but state laws, municipalities. Individual municipalities in the States for example have um you know banned facial recognition or something like that. And then of course regulation takes place in other jurisdictions. GDPR is a good example. United States
52:00 - 52:30 companies that wish to be active in Europe had to decide, do they want to make a separate infrastructure for Europe or did they want everything to be GDPR compliant? Most companies decided to simply make one um effectively better business intelligence infrastructure that made it possible to operate in Europe. Moreover, um as the legal scholar Larry Lessig pointed out at the end of a previous millennium, regulation is not just about laws. Regulation is also norms, laws, and architecture.
52:30 - 53:00 Norms, laws, markets, and architecture, as he put it, that laws are just part of the regulation. There's also our own technological solutions, which is architecture. Uh there's also markets, whether people choose to spend their money, for example, on an allegedly self-driving car. If an entire country decides they're not going to buy that car anymore, it can have an order one impact on the stock and the future of that car, for example. And our own norms, right? We have to decide is it okay to use a large language model to write our friend's eulogy? That's sort of up to us in our
53:00 - 53:30 norms. Uh we were talking earlier today about the various uh meanings of norms as well. All of those form a type of regulation irrespective of whether or not one particular country at the level of the whole state is bringing about regulation that um enforces consumer protection. Just to add to that, we sometimes see this kind of um innovation versus regulation that the two can't coexist and I think on top of that we've historically seen that Europe is seen as
53:30 - 54:00 the regulator and US is seen as the innovator and I think once again it's nuance that we need that regulation isn't one thing and innovation isn't another thing I think we're seeing um both in Europe and the UK kind of more exploration around this. So for example the um AI Act in Europe there's now a lot of conversations around well how do we actually um kind of implement this in practice is it you know be a bit more flexible in the
54:00 - 54:30 approach likewise in the UK um generally we're taking a kind of more sector-based approach compared to Europe that takes a more principles-based approach but again um we have for example cooperation across regulators with the uh Digital Regulation Cooperation Forum where regulators can discuss the best approaches the most flexible nuanced approaches on kind of a case sectoral industry basis. So I think it's really keeping this nuance in
54:30 - 55:00 the discussion that will make the difference. Um, yeah, maybe it sounds a bit obvious almost, but um, Allison, you mentioned about 2019 being in the middle of data ethics being a thing. Um, and I guess I'm kind of wondering what sort of
55:00 - 55:30 I guess narratives or understanding you all have for um, sort of why organizations at that particular time in those in the course of those few years kind of uh, felt the need to kind of concurrently produce these sets of principles um, etc. Like was it a change in language or was there a real kind of um, cultural shift and kind of what explanations you might have for that? I'll start if that's okay, Chris. Yeah, we were we were discussing this a little
55:30 - 56:00 bit earlier. Um, from my view, having watched this unfold, uh, there was a strong social critique that was beginning to emerge around the observable kinds of harms that were resulting from the design choices made around certain kinds of systems. So those systems included automated um uh algorithms for enforcement, the COMPAS system for sentencing, uh facial
56:00 - 56:30 recognition systems which were widely understood to misrecognize certain groups of people which caused disproportionate levels of harm. So this was something that was well documented, well discussed, certainly part of uh you know the scholarship of people uh sitting in this room and this was starting to have an impact. this was starting to cause people within the technology industry to be concerned about how their products might be received or whether Chris's outcome of
56:30 - 57:00 people stopping to you know consume as many of these products because of a perception of widespread harm um or you know a malfunctioning in fact uh based on these different kinds of design systems. So part of what occurred was a sort of attempt by companies to capture the energy and channel it into something that was less disruptive to their bottom line and their business model. And that of course meant for technology companies like Google to
57:00 - 57:30 build ethics boards uh to respond to uh external and very importantly internal critique of the designs of certain kinds of systems and the recognition of the harm that those systems were causing. And what I actually originally had a slide that said, you know, data ethics is a problem question mark that had some news uh coverage of the time. Um some of which was about the Google ethics board which was disbanded almost immediately. Um because it was very clearly
57:30 - 58:00 performative and they also hadn't very carefully uh considered who was actually participating on that ethics board. So this was I think a form of you know a very dynamic form of what some people might call capture but you might also call a kind of unstable terrain of struggle. So there's a terrain of struggle that's identifying real problems with a sociotechnical system and then you know power and benefit uh seek to preempt that critique before it becomes too damaging to absorb it and
58:00 - 58:30 then several years later you know it can be allowed to fall away. also because the money has moved somewhere else, in this case to massive investment in AI startups. Great. I think that actually uh segues nicely into an online question from Kito Shilango, former MSc Data and Society student
58:30 - 59:00 um and current Mozilla Foundation senior tech policy fellow who asks on the topic of power and data. I'm interested in the panelists' perspectives on resistance particularly how to incorporate it in policy discussion. For Allison, do you have thoughts about how to scale resistance movements to data practices from a local to national and global level? For Erin, do businesses consider resistance as a form of
59:00 - 59:30 governance, which nicely links back to some of what you were saying about municipalities, for example, bans on facial recognition. So, yeah, great question. Yeah, such a fantastic question and it's really wonderful to hear from you Kito uh who's doing spectacular work um themselves. Um so how to scale resistance? Well, first of all, I wonder about scaling because sometimes when we talk about scaling, we assume that scaling has to work in the way that scaling works in uh the hyper
59:30 - 60:00 capitalist uh businesses where you take one kind of tiny pilot idea and then you just repeat it many times with the same sort of principles and with the same expectations. And I don't think that this is how these kinds of dynamics uh operate. So when I was um spending time in East Walworth, we were thinking about how to make a kind of rhizomatic network of people who were interested and active already but whose interests and
60:00 - 60:30 activities might be quite different from each other to allow them to speak with each other to sort of do some good practice sharing. So this is not like scaling as in repeating and trying to do exactly the same thing. It's much more thinking about change as a kind of um as a sort of emergent uh collaboration or federation of people with different interests who could potentially share different tactics. Um so that is one model if you're thinking about getting from the very very small to the slightly
60:30 - 61:00 larger. The other way is to think about different um scales of institutions. So um when I was also talking about East Walworth I was talking about the borough. So in London we have uh you know a great number of boroughs. I can never remember if it's 27 or 29 or 32. So these are local governments and they are an intermediary state uh between the very small and the very large. In other places you have different kinds of municipal governments uh state level
61:00 - 61:30 governments. Uh you often have things like neighborhood associations. These can link together again in this kind of federated manner to share knowledge and practice which also allows for ideas to move uh across different kinds of um institutions uh and allows you to have your change not have to be repetitive or be the same every time, but for your change to be different and therefore specific to whatever context you need it to be working in.
61:30 - 62:00 A very good question. Um, and it's something I'd like to ask members about. My sense is that, you know, we're very early on in kind of adoption and governance in mainstream business when we're thinking about particularly generative AI, I would say, and kind of broader data-driven systems. And so for um the directors I speak to and the organizations they represent particularly those which are in sectors and industries which are kind of far from digital-first
62:00 - 62:30 industries they're at the very initial stages of asking questions about these systems what are the most valuable use cases then how do we govern that what are the risks so I think my answer is it's a little bit early to tell because we still are at kind of very early experimental stage. It'll be interesting to see how that unfolds. Raises a lot of questions about um ex post and ex ante, right? Um that I think is probably a topic we could spend
62:30 - 63:00 a lot of time talking about. Yeah. Also, what I mean what I just sort of started thinking about Erin while you were talking was I mean many years ago if you'd asked me this question, I would have said well you could make different stuff. you could, you know, open-source models for different kinds of like development practices and you could have that, you know, design-driven innovation also provide different kinds of alternatives. And I wanted to bring that back in because I feel like either I got really cynical uh or we stopped talking
63:00 - 63:30 about actually building different kinds of technical alternatives which meant that we stopped perhaps having an imagination that we could build technology in a different manner that wasn't in the manner uh in which your dumpster fire got started. I don't know if you had thoughts on that. I mean, I'm in favor of technologists thinking creatively about the consequences of what they design. Certainly, um I think design is only one element of again the
63:30 - 64:00 sociotechnical system. There's many forces at play. There's abundant capital that drives people to uh to design certain things irrespective of the design hopes of the individual technologist. So that fits nicely with another online question that we have from Aisha Mahal. What's the panel's view on the term data colonialism whereby data has become an exploitable resource used to deepen power asymmetries between global north and global south? Where does ethics and
64:00 - 64:30 public interest sit within this dynamic? And what could be done? Right? So maybe it's not a design solution in that case. I mean my conception of ethics definitely touches on at least two aspects of data colonialism. Justice which is often you know interpreted as fairness. Are we having an equal attribution of the benefits to different people and the other is respect for persons which is often interpreted as informed consent
64:30 - 65:00 because often people who are in a position to have their data extracted are not the people who are really consenting to the way those data are going to be used eventually. So certainly I think the framework of ethics as a set of principles speaks directly to data colonialism under that definition. As far as what is to be done, I'll turn to my panelists, particularly my policy-minded panelist.
65:00 - 65:30 So I'm gonna weigh in because our online questioners are really erudite and they're like really, really clever. So like we're all like struggling away going, okay, how do I address data colonialism? So okay, listen. Um the current models for both design and for the kind of design of the industrial strategy behind data-driven systems are that they depend on constant increase in input. So this
65:30 - 66:00 is also uh you know how current machine learning systems are designed you in order for the thing to scale in the sense that you know you were talking about um scalable u business models you have to keep adding more stuff. So if you have to keep adding more stuff and we and there are sort of finite sources on the existing internet which is what most uh many of the sort of conventional large language models are trained on then you need new sources of data and information. If you are creating a
66:00 - 66:30 business that's trying to expand into new markets you also need material that's written in different languages. So this certainly starts making uh you know information produced in global south contexts or in you know all around the globe uh into things that look attractive provided the cost is low enough to acquire them. So I'm going to jump straight to the solution to this dynamic because this is a dynamic that is colonial you know in terms of how it is how it is oriented but it does not
66:30 - 67:00 have to reproduce or repeat the colonial violence. Um, and I think sometimes when I hear the phrase data colonialism, I get a bit worried because it implies a bit that a kind of structure of an industry that's depending on uh gaining new resources uh is inevitably going to produce a kind of colonial violence. So, one way you could you could preempt that would be to make those new sources of material, which in this case is data,
67:00 - 67:30 information, things people say, whatever's on the internet, expensive. And I think this is one way that you could reverse the dynamic. You could try to create a different sort of market. A market in which this notion that data is abundant and therefore cheap, and therefore you're always looking for more of it, is turned into something that's a material, something that needs to be uh worked with carefully. Uh I talk about this a little bit at the end of my book. There was also a moment
67:30 - 68:00 there where I was able to write about what if we used data minimization strategies, you know, what if we assumed the data was going to be expensive and not cheap, and therefore people's participation in systems of governance would be equally expensive if they were participating, for example, in a citizens' assembly or if they were contributing uh data to a data-driven public consultation.
68:00 - 68:30 Yeah, we have a question here in the front. You mentioned markets as an arena of, you know, regulation and public interest, and clearly uh holdings, mass holdings of data grant an immense amount of power in markets. But do you think our competition regulation actually has developed concepts and tools that address that power granted by data in markets?
68:30 - 69:00 Um sorry, can you... has competition regulation helped um create more fairness? No, no. I mean, I think basically holding data gives enormous power in markets. Does our regulation of markets actually have tools and concepts that match that question? So for example, if you
69:00 - 69:30 read, anyways, if you look at people arguing for AI as a differentiator in company development, one of the arguments for AI is that it's a self-perpetuating flywheel, that the companies that have more data create more value. They use that value to gather more data and then they have a runaway advantage and then they win. So I think what you're getting at is does the existing, in the states we would say maybe antitrust, regulation adequately reach out to and
69:30 - 70:00 and is aware of and defend against the possibility that one company takes market advantage because they've gathered so much data that they have it as their data moat. Um I kind of want to give a US centric answer which is it depends on who's running the FTC meaning um there's actually a debate that's been going on in the last decade in particular in the states around what antitrust should do in in the United States. Should it be for example just about price? So there's
70:00 - 70:30 a conception that the only type of market advantage or monopoly power that matters is the company um raising the price unfairly to the market. Uh and then there's a counternarrative that it's about power. And so one of the arguments that's happening uh right now, and it's been advanced by a legal scholar named Lina Khan, who to my surprise is still at the FTC, uh is that, you know, the role of antitrust should be around power and that we should regulate companies that are able to capture the market not because
70:30 - 71:00 they're charging people more because they have monopoly power but because they have such dominance that they must be broken up. And that debate is being um played out in court cases week by week, particularly in the last few weeks, Google and Meta uh on the receiving end. So I think it's absolutely possible, but it requires people to think about antitrust as having an impact not just on price but about consumer protection and about
71:00 - 71:30 enforcing competition. But you're right that uh certainly when I started going to venture capital events in 2010 or so that was part of the narrative is that AI and data are going to be its own moat that the moat will not be a better product. The moat will be that you will gather so much data that no other company can compete with you because you have the most valuable data that no other company can afford to gather. Yeah. to add an interesting I think mainly US-based but this kind of interesting little tech movement where a
71:30 - 72:00 lot of VC-backed tech startups, and some of the venture capital firms as well, are pushing back against this, um arguing that no, it doesn't adequately um challenge that. I think something else I was thinking about, I think it really depends on where you're looking in the kind of AI data value chain or in the tech stack. So um for example, if we look at the share of the cloud market in Europe and the UK, we obviously are hugely reliant on, I'd say, three or
72:00 - 72:30 four US companies. So there I don't think it's you know perhaps not intervening in the right ways in other parts of the stack maybe differently. Right. I'll just add one quick thing that might be worth considering is um also not just the consumers but the workers and right the extent to which for example platforms have a monopoly on the labor market and whether we've used enough tools in the toolbox to actually address that. We had a question in the back and then
72:30 - 73:00 I'll come here and then over there we have if actually if we could batch these questions so you can pose your question you can go next and you can go third and we'll ask and fourth if we can squeeze it in and we'll try and get a round up of so I I'm afraid I have a very boring question about data cleansing and data governance. So perhaps for for Dr. Young and and as well as the others. I mean the two words and I appreciate how loaded this is that for me interestingly were prominent by their absence in the
73:00 - 73:30 discussion very very interesting discussion were fact and truth. um appreciate how loaded this is and appreciate at the macro level um you know kind of big data modeling data you know sales and so forth is maybe less relevant but thinking you know companies using AI using um advanced learning about more internal decision making and anything larger than a simple local business then immediately starts to get
73:30 - 74:00 lost for lack of clean truthful full factual data and this is a problem which potentially could break businesses in in many cases it it has to some extent or it has put very large companies at risk. So I'm I'm wondering how much of is there an awareness uh are you seeing the question of data governance come up perhaps in parallel with this very interesting you know alarm about um you know what AI can
74:00 - 74:30 really deliver because of course if we don't have the data quality then then how to use it to make decisions. really question. I think this is I see this as fitting um closely alongside digital tech AI literacy. Um what I'm noticing when I speak with boards for example is they many don't have the kind of basic whatever that might mean AI literacy data literacy um
74:30 - 75:00 to even kind of know which questions to ask regarding data governance: what does our data look like, how do we begin organizing this data. So again, I think this comes back to, it's almost kind of a layer before thinking about this: how do we make sure that um SMEs, for example, boards of non-digital-first companies, understand this to the degree that it's not just strategic but it's also responsible.
75:00 - 75:30 I'm going to batch the questions now, if we could just take all of the questions at once, and then I'll give an opportunity for all of you to respond. So we're here. Hi everyone. Um so we've heard a few times tonight about the power of, you know, a smallish number of companies kind of influencing a lot the kind of future of AI and controlling lots of data and the kind of data moats, as you were talking about. Um, however, you know, typically when it comes to new technologies, you know, you're talking about terms like kind of creative
75:30 - 76:00 destruction as having both kind of pros and cons. And when it comes to the typical value exchange between market efficiencies and equality, do we feel that um that there's more of a tipping point this time because so few companies are kind of controlling and leading the way so much that you know public bodies and you know typical um public authorities are kind of are playing catchup not wanting to hold the market back but also having to play catchup when it comes to regulation after the effect.
76:00 - 76:30 Thank you so much. It's been a fascinating uh session. um with all your knowledge and experience directly and through people you've worked with um what's your personal like behavior in terms of protecting your data in terms
76:30 - 77:00 of social media? How do you live online? Hi um Dr. uh Powell and Dr. Wiggins. Uh you you both uh used the trash fire metaphor and I think you were perhaps joking slightly when you said you didn't need to uh explain that any further, but I'd appreciate if you did if you could talk about some of the the concrete harms that you think are happening now from unethical use of data. I know
77:00 - 77:30 there's been quite a bit of discussion of um accountability and bias in decision-making, but um is there anything else? I'm not left that worried. Maybe I'm being blasé. Great. Okay. So, not that worried. Can you explain the internet trash fire? Second question in reverse. Um your own personal data hygiene habits.
77:30 - 78:00 And uh third question, which is this: can you respond to this problem, this common problem of regulators having to catch up with creative destruction. So again, a US perspective is yes, they play catch-up, and I think that's the way regulators often design. It's often said in the states that we shouldn't regulate technology before a problem occurs. I'm not saying that I personally advocate that, but it's certainly something that you hear. And
78:00 - 78:30 in fact, often in the states, it's some extremely cataclysmically bad thing that happens before there's any sort of regulatory response, including financial crisis of 1929, or the incredibly deeply unethical research results that created the Belmont report, which I really can't go into without crying. It's bad. Um, so yes, it's often the case that at least in the states, regulation is after a cataclysmically bad event. As far as how I live my social life, not on
78:30 - 79:00 X, have never used Facebook. Always gave me the creeps, or Snapchat or any of those other things. I use DuckDuckGo and incognito mode. And I prefer to interact with LLMs using Duck.ai, which is a privacy-preserving way of doing so, and I particularly like the open-weight choices rather than sending all of my information via an API to a private company. Um, several other ways that I avoid exchanging data with the internet.
79:00 - 79:30 Oh, are you avoiding the concrete harms, Chris? Correct. I mean, there's so many trash fires. I would rather lean on the shoulders of scholars. I mean, like, uh Algorithms of Oppression is a great one. I can still remember meeting Safiya Noble at Data & Society before the book existed, and I was like, oh, search, what's wrong with search? And she explained to me what happened when you look for, you know, pictures of white girls and pictures of black girls. When I met Latanya Sweeney and she told me the story about googling her own name and then sitting
79:30 - 80:00 next to her co-author, and the co-author googled his name, and they came up with radically different results, right? For Latanya Sweeney it was arrest records, right, which, at that point Latanya Sweeney was already doing fine as a professor at Harvard. But um I don't actually know where to start, other than I encourage you to check out some of the books. Virginia Eubanks's book Automating Inequality is a deep ethnography where she interviews people, and as one of them says to her, be careful, because what they're doing to us now they'll do to you later, right? So she interviews people from low socioeconomic systems who are having
80:00 - 80:30 their benefits decided algorithmically. Or you can look at the past 100 days in the United States, of people being fired at scale, including people who are in charge of, you know, uh dealing with viruses, or my colleagues whose grants were terminated because, you know, they had trans-regulation of genetic regulatory networks, but "trans" was a bad regular expression. The examples are too
80:30 - 81:00 numerous; we would be here a long time. Uh but I encourage you to check out the slide with the excellent work by um a variety of women scholars, 2017 through 2019, for a good start, or probably many of the papers by my colleagues on the panel. I can work backwards. So actually, a presentation I used to give at the Turing was the few slides I showed at the end. I ended with a lot more examples of um gender bias in AI and data systems. And one um which always
81:00 - 81:30 really struck me, which thankfully is now no longer the case, and which my colleague in the front row actually sent to me, was a translation system where um you were translating from a language that didn't have gender into a language with gender. And um so for example, if you wanted to translate "they clean" and "they make money", it would
81:30 - 82:00 translate as she cleans and he makes money. Now that's been changed, but much more recently we've seen um image generation systems replicating this again. If you ask for an image of a doctor um for example versus an image of a nurse. So um yeah, just one of the kind of harms and risks and the plethora um that we see uh personal behavior. I grew up um with the advent of YouTube and MySpace actually and um this was
82:00 - 82:30 long before I didn't we didn't have any conception of what this might mean harms of putting data online. So I'm not sure what happened to all my data on MySpace hopefully deleted. Um now I I am much more careful obviously but it does you know I do think about that sometimes. Um thirdly the question on kind of regulators playing catchup. Um I think this is I mean of course regulation and
82:30 - 83:00 law lags behind tech capabilities. Um I think it's a question of how much behind and how agile can this be. Um Ofcom recently have um been loading up on tech talent. So they've been hiring a lot more um people into the regulator who understand not just from a technical perspective but a social perspective. Um and so I think initiatives like this will kind of up the agility of regulators and perhaps make this not so
83:00 - 83:30 much of a problem going forward. So, I've noticed that it's 7:58 p.m. and I know that these events run strictly to time. So, I will um acknowledge that my colleagues have mentioned so many of the harms. I will however just point out that if one is reflecting on the lack of harm, that might mean that one is in a category which is traditionally privileged and therefore less likely to uh experience harm uh at any given time,
83:30 - 84:00 but that we all move across these categories. And as political situations change, we may find ourselves moving from a category in which things do not feel risky or consequential into categories where they do feel much more risky and consequential as I think Chris has narrated already from the US. Um I think Erin has answered the question about monopoly so well. So I will leave you with my extremely incoherent online life uh which is attempting to respond to a sense that
84:00 - 84:30 things have become indeed more risky than perhaps they were before and perhaps I did have more information about myself uh on the internet uh prior to learning more about these kinds of systems the amount of data that's ingested and the very complex kinds of predictions um and matchings that occur and I will make just one final observation because I know Sitta a little bit. Uh and I also know that she is very careful about her online life
84:30 - 85:00 and just leave you with the thought that the people who are thinking really carefully and deeply about data ethics and governance are trying not to leave too many traces online. And with that, it's been a great pleasure to have you. Um I think all of you uh have enjoyed this panel, which has left us with a lot to think about. Uh thank you all very much for coming to this event, and please now join me in a final
85:00 - 85:30 round of applause for our [Applause]