Dungeons and Data - discussing data, AI, and the law

What can happen to your research data once you have shared it with the public? Can artificial intelligence (AI) tools be trained on your research data or on data that are publicly available? What do European and UK regulations say? What are the benefits and the risks AI poses to open and reproducible science?  

If you are intrigued by these questions, come to this open event with representatives of academia, technology and the law. It will be an afternoon of discussions based on real-life examples, followed by a surprise at the end for the brave library rats: a visit to the second-largest library in the UK, with treasures hidden for centuries! 

Register here 

 

Logo for Dungeons and Data event, showing a circular seal with a drawing of a wooden dungeon door in a stone arch, surrounded by keys, polyhedral dice and other data visualisations such as networks and matrices

This RROx event will discuss the legal implications that using publicly available data to train artificial intelligence (AI) models has for the right of attribution and for protection against misuse. 
Event details 

  • Location: Weston Library Lecture Theatre - Broad Street, Oxford OX1 3BG 
  • Date: 7 November 
  • Timing: Panel talks and discussions from 13:30-16:00, then a small group of RROx members (determined by lottery on the day) will go on a 30-minute guided tour of the Bodleian Library. There will also be a social gathering afterward at a nearby pub. 
  • Format: Five speakers will talk and discuss between them and with the audience (see abstracts and bios below). Be prepared to ask questions! 
  • Audience: This event is designed for researchers, academics, and technology and law experts, whether or not you are affiliated with the University of Oxford. Booking is essential. 
  • Sign up here 

Talk abstracts and speaker bios: 

 

FAIR and 'AI': a community perspective 

With the growth of technologies such as ChatGPT and other large language models (LLMs), what is the research data management community talking about? As leaders of a project collaborating at a global level within communities that foster FAIR principles and good data stewardship, we have spent the last decade advocating for more transparent research. What are the concerns about the possible consequences of our FAIR journey in light of 'AI'? This presentation will highlight the issues being discussed in these communities, to provide a concrete foundation for the more expert voices on this new technology later in the programme. 

Speaker: Dr Allyson Lister is the FAIRsharing Content & Community Lead at the University of Oxford. With a background in FAIR, data standardisation, ontologies, semantic data and integration, she is responsible for FAIRsharing content, as well as for collaborations with users and outreach across all research domains. Allyson has recently completed an EOSC Future / RDA Domain Ambassadorship (for standards, databases and policies), and is a co-chair of two RDA working groups. 

 

What are the most common misunderstandings or questions about compliance? 

The talk will look at some of the key data protection considerations that are relevant to researchers who are processing personal data for their studies. It will also explore some of the most common misunderstandings that arise when researchers are thinking about how the data protection regulations might relate to their research. 

Speaker: Richard Duszanskyj is a Senior Information Compliance Officer in the Information Compliance Team (ICT), based in the Assurance Directorate at the University of Oxford. The ICT offers support and guidance on the regulatory requirements relating to data protection and information rights. 

Richard has an extensive background in providing advice and guidance on data protection, with significant experience addressing compliance issues within the research context. He also has a keen interest in the governance challenges presented by the increasing proliferation of AI and is currently focused on exploring mitigations for this new area of risk.  

 

The future of human participants data ethics 

The talk will explore how AI changes, and will continue to change, the ethics of using data on human participants. It will discuss how key concepts of data ethics, such as consent, anonymity, and harm reduction, are in the process of being subverted and how researchers can respond. 

Speaker: Ignacio Cofone is Professor of Law and Regulation of AI at Oxford, working jointly at the Faculty of Law and the Institute for Ethics in AI, and a Fellow of Reuben College. His research examines how the law can and should adapt to AI-driven social and economic changes, with a focus on data protection and anti-discrimination. 

 

Licenses and open models for machine learning 

This talk will discuss the efforts to make Free and Open Source Software principles apply meaningfully to the machine learning space, drawing analogies with other challenging areas such as open hardware licensing and the various ‘ethical source’ movements that have been promoted in recent years. 

Speaker: Rowan Wilson joined Oxford in 2001, working for the Oxford Text Archive and the Humanities Computing Unit, and has spent the twenty-odd years since supporting research at Oxford in various guises. From 2003 to 2013, Rowan was the Licensing Lead for 'OSS Watch', UK Higher Education's Free and Open Source Software Advisory Service. Rowan is now Head of Research Computing and Support Services within IT Services at Oxford. 

 

Science in the age of AI 

The team will be presenting on our May 2024 report, Science in the age of AI, with a focus on 1) the barriers researchers face in accessing high-quality data and 2) the reproducibility of AI-based research. If interested, we recommend reading chapters one and two of the report, which discuss how AI is transforming scientific research and its implications for research integrity and trustworthiness. We are also working on the regulation of foundation AI models in light of the upcoming government AI bill, and would be interested to share thoughts on that. 

Speakers: 

Areeq Chowdhury is Head of Policy, Data and Digital Technologies, at the Royal Society. His team focuses on how artificial intelligence and other data-driven technologies can and should be used to benefit humanity. Areeq is also an elected Councillor for Canning Town, in East London, and previously founded the influential technology policy think tank, WebRoots Democracy, which ran between 2014 and 2020. 

Ali Griswold is a Senior Policy Adviser on the Data and Digital Technologies team. Before that, she worked as an investigative technology reporter for Quartz (qz.com) and Slate Magazine. She also writes Oversharing, a Substack newsletter on the gig economy and smart and sustainable urbanism. 

Nicole Mwananshiku is a Policy Adviser on the Data and Digital Technologies team. She works on the following projects: Science in the age of AI (exploring how AI is changing the nature and method of scientific research) and The online information environment (exploring how the internet and data-driven technologies affect the production of disinformation). 

 

More information

Besides the GDPR, new European regulations on AI (adopted in April 2024) have sparked discussions about the consequences of data sharing. While openly sharing data has the potential to increase trust in research, there is uncertainty around what can happen to that shared data and how it will be used. This uncertainty could even stall recent advances in openness and FAIRness, leading to hidden ‘dungeons’ of data created by researchers concerned about the implications, such as lack of attribution. 

What do technology experts who are developing AI tools to improve our lives think about this? What are the views of researchers? And what can legal experts tell us about what is allowed and what is not? 

Reproducible Research Oxford (RROx) is opening a forum to address these difficult questions. Our goal is to spread the culture of reproducible research across Oxford, and we believe one important step in that direction is sharing data transparently, within the limits of individual privacy, confidentiality, and commercial product development. 

However, recent European regulations can feel like a labyrinth, and we want to understand how they will affect our research practices. In this first Dungeons and Data event, we invite you to explore the depths of this complex topic and help shed light on the hidden chambers of regulation and data sharing. 

We invite students, professors, researchers, and staff to come, and talk, and listen, and think together. And what better place than the Bodleian Library? Many residents of Oxford have never set foot inside! The Bodleian has treasures both displayed and hidden in its underground vaults. By contributing to this workshop, you will have the chance to explore these remarkable buildings and gain new perspectives on the emerging field of AI, making it feel less daunting. It’s a wonderful opportunity to learn, discover, and have a great time! 

 

Disclaimer  

  • This is an emerging field of knowledge. We don’t know everything about it, and we are trying to collectively build that knowledge through discussion. We welcome and encourage respectful expressions of opinions by experts and non-experts in the audience. The content of the talks reflects the speakers’ opinions, not the official views of RROx or the University. 

  • RROx and the University of Oxford support interdisciplinarity and multiprofessional work environments in research. We should work collaboratively, not competitively, and foster each other’s success. 

  • We ask the speakers to disclose any financial or non-financial interests they may have (this includes participation in, work for, or consultancy to companies developing software, AI models and products, legal advice companies, and so on). We also welcome such declarations from members of the public attending the event.