Bias in metadata 1: Monitoring advances in the field of AI, with an emphasis on bias

  • February 2023
  • Ryan Brate
  • Preservation Digitaal Erfgoed
  • Last updated 28 June

Summary

This blog post is the first of three written as part of the Dutch Digital Heritage Network's Preservation Watch, which monitors technological developments relevant to the GLAM sphere.

The blog posts are written by Ryan Brate, a PhD candidate within the DHLab of the Koninklijke Nederlandse Akademie van Wetenschappen (KNAW). You can read an interview with Ryan in this earlier blog post.

The GLAM (Galleries, Libraries, Archives and Museums) sphere must unavoidably deal with loaded content: depictions of events and characterisations of peoples that we may recognise as problematic or worthy of further contextual elaboration. Heritage professionals and the GLAM institutions themselves play a central role in curating and providing context to such content. However, scalable Machine Learning and AI can, and do, also have a role in digesting the wealth of available content in order to identify and explore these biases. In this blog post we discuss the current state of the art in the field of Artificial Intelligence (AI) and bias, and look at who is pushing the envelope in these areas.

Are we thAIr yet?

AI is concerned with the development of automated systems which can contextualise information and perform actions such that their input-output behaviour mimics that of humans. Indeed, the famous Turing test, proposed for qualifying AI, simply tests whether the machine in question can fool a human participant into thinking that they are conversing with another human. Transformer models represent the current state of the art in contextual text understanding, and are used in a variety of downstream AI applications (i.e., applications geared towards some specific functionality, such as Amazon's Alexa or Google Search). These models involve hundreds of millions or billions of parameters, learning numerical representations of sub-words via their associativity, and learning where to focus their attention within spans of text when performing word-prediction problems. They are typically pre-trained on enormous quantities of unstructured text trawled from the web and from books (e.g., BERT).

Transformer models are a successor to earlier RNNs (Recurrent Neural Networks). RNNs consisted of chained 'cells', each cell representing a word in a text span, each cell having an internal state but also receiving state from and passing state to its adjacent cells. This passed state represented the wider context in which a word appeared in a text span, and enabled, to some extent, the tracking of word relationships over a distance. This linear chaining was flawed, however: it was computationally intensive (and expensive) to train, and limited in its ability to represent long-distance relationships. Transformer models instead connect every word with every other word in a text span, which is both less expensive to train (the computation parallelises readily) and much improves the model's ability to account for long-distance relationships in text.
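To make the contrast concrete, the short sketch below (plain Python/NumPy, not taken from any particular library) illustrates the core self-attention idea: every token's representation is updated as a weighted mixture of every other token's representation in the same span, in a single step rather than by passing state along a chain of cells. Real transformer models add learned query/key/value projections, multiple attention heads and many stacked layers, so treat this as a simplification of the mechanism rather than an implementation of any specific model.

```python
import numpy as np

def self_attention(X):
    """Minimal single-head self-attention over a span of token embeddings.

    X: (n_tokens, d) matrix. Every token attends to every other token in
    one step, unlike an RNN, which passes state sequentially cell to cell.
    """
    d = X.shape[1]
    scores = X @ X.T / np.sqrt(d)                    # pairwise token-token affinities
    scores -= scores.max(axis=1, keepdims=True)      # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)    # softmax: each row sums to 1
    return weights @ X                               # context-mixed representations

# Toy example: a span of 4 "tokens" with 8-dimensional embeddings.
tokens = np.random.randn(4, 8)
contextualised = self_attention(tokens)
print(contextualised.shape)  # (4, 8): one context-aware vector per token
```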

The latest iteration of such large language models is ChatGPT from OpenAI. The results can be astounding: both impressive and worrying. ChatGPT can answer homework questions, write essays and (fairly convincingly) answer questions from a news correspondent. Yet, however impressive they seem, such models neither really understand the world nor the implications of their responses. They are ultimately just learning sub-word associations, relying on associativity in text as their only medium for making sense of the concepts of our complicated world, and they do not possess the ability to reflect on the biases of the information they are trained on. In short, such models have to be considered very carefully with regard to their suitability for particular downstream tasks, lest they cause unintended harm (e.g., chat bots in the medical domain). In this regard they are not dissimilar to previous transformer model iterations, to earlier Recurrent Neural Networks, or indeed to word vector embedding representations of words: non-transparent in their entrained biases, yet seemingly capable on various text-understanding tasks. Ultimately such models are beholden to the quality and content of the text they are fed.
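As a small illustration of what 'learning sub-word associations' means in practice, the sketch below asks a pre-trained masked language model to fill in a blanked-out word. It assumes the Hugging Face transformers library and the publicly available bert-base-uncased checkpoint (both assumptions made for illustration; any masked language model would do). The model simply ranks completions by the associations it absorbed from its training text; whether those completions encode, say, gendered assumptions about occupations is not something it can reflect on.

```python
from transformers import pipeline

# Masked-word prediction with a pre-trained BERT checkpoint.
fill = pipeline("fill-mask", model="bert-base-uncased")

sentences = [
    "The doctor said [MASK] would be late.",
    "The nurse said [MASK] would be late.",
]

for sentence in sentences:
    predictions = fill(sentence, top_k=3)
    # Completions are ranked purely by learned association, with no notion
    # of whether those associations are fair, accurate or harmful.
    print(sentence, "->", [p["token_str"] for p in predictions])
```

Whether and how the top-ranked pronouns differ between the two sentences depends on the particular checkpoint, but probes of exactly this kind are a common way of surfacing the associations a model has absorbed.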

AI is not there yet (and may never be), but language models remain valuable tools for cultural heritage and NLP (Natural Language Processing) researchers. Their entrained biases can be problematic, but they can also be illuminating. We must take care not to impart unintended bias on a particular downstream Machine Learning task when using such models. However, the topical associations implicit in these models, reflective as they are of the information they are fed, represent a potentially valuable tool for exploring, at scale, the biases and perspectives of the source data they are trained on. For example, the paper Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings, by Bolukbasi et al. (2016), explores the biases exhibited by word embeddings trained with word2vec, an older state-of-the-art language model. The technique is still often used today and, in any case, is demonstrative of the biases that language models in general exhibit, being based on word co-occurrence. As the title alludes to, the paper shows that trained representations of occupations exhibit gender bias. However, language models are useful in exploring biases beyond gender: to contentious terminology and topics relevant to the GLAM sphere.
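The analogy in the paper's title can be probed, at least in spirit, with a few lines of code. The sketch below assumes the gensim library and its downloadable copy of the Google News word2vec vectors (the same family of vectors studied by Bolukbasi et al.); the exact answers depend on the vectors used, so treat it as an illustration rather than a replication of the paper.

```python
import gensim.downloader as api

# Pre-trained Google News word2vec vectors (a large one-off download).
vectors = api.load("word2vec-google-news-300")

# Analogy probe: "man is to computer_programmer as woman is to ...?",
# i.e. the words closest to (computer_programmer - man + woman).
print(vectors.most_similar(positive=["woman", "computer_programmer"],
                           negative=["man"], topn=5))

# A cruder probe of the gender lean of occupation words:
# similarity to "she" minus similarity to "he".
for occupation in ["homemaker", "nurse", "philosopher", "boss"]:
    lean = vectors.similarity("she", occupation) - vectors.similarity("he", occupation)
    print(f"{occupation:>12}: {lean:+.3f}")
```

The same kind of probe generalises beyond gendered occupation terms: the nearest neighbours of contentious terms, in embeddings trained on a collection's own texts, can reveal the associations and perspectives latent in that collection.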

Who is pushing AI developments?


The end of 2020 saw a very public Twitter spat between recently sacked members of Google's AI ethics research team (co-leads Timnit Gebru and Margaret Mitchell) and their former employer. The spat centred on the contents of a then yet-to-be-published research paper, On the Dangers of Stochastic Parrots. The paper highlights the financial and carbon costs of training such models, as well as their aforementioned unpredictability as to how the language patterns they learn come to manifest in downstream tasks. The stances of its authors, and the apparent resistance of their Big-Tech employer to acknowledge the potentially damaging impacts, are somewhat demonstrative of the ongoing to-and-fro in the field.


[Figure: key milestones in language model advances over the last decade]

On the one hand, advances in the performance of language model, Machine Learning and AI architectures are being driven by Big Tech (Microsoft, Google, Facebook etc.) and tech-entrepreneur-financed organisations (such as OpenAI). The figure above shows key milestones in language model advances over the last decade, all of which come from one of these organisations; computing cost is a factor here. On the other hand, while such organisations do indeed employ algorithmic-bias researchers, a considerable amount of the work on language model bias takes place in the digital humanities and computational linguistics departments of research institutes and universities.

Where to look to keep abreast of research into AI and Machine Learning?

Google Brain

Co-founded by Andrew Ng, Google Brain is the internet giant's AI and machine learning research-focussed sub-division. Google researchers have been hugely influential in language modelling in recent years. Research publications are updated and made available here.

Stanford University

Stanford University attracts some of the most famous researchers in the world in the fields of Machine Learning, AI and Natural Language Processing. See the Stanford AI Lab for their current work, as well as the Institute for Human-Centered AI (HAI), founded in 2019.

OpenAI

A research lab founded in 2015, with billion-dollar start-up funding provided by tech entrepreneurs and subsequent funding from Big Tech. OpenAI has grabbed headlines in recent years for advances on a variety of AI fronts, across a variety of data modalities (text, images, sound): ChatGPT and DALL·E.

In the Netherlands, there are various collaborative efforts to further AI research. The Innovation Center for Artificial Intelligence (ICAI) brings together universities, research institutes, industry, government and societal partners. The majority of ICAI research is organised in so-called ICAI Labs, in which a knowledge partner (a university or research institute) works together with one or more industry, government or societal partners on a particular theme. In January 2022, ICAI received €25M from the Dutch Research Council to start another 17 labs, educating 170 PhD students over the course of 10 years. The Netherlands AI Coalition (NLAIC) is a network organisation for the advancement of AI activities in the Netherlands. It was initiated by VNO-NCW MKB-Nederland, the Ministry of Economic Affairs and Climate Policy, TNO, Seedlink, Philips, Ahold Delhaize, IBM and the Dutch Digital Delta Top Team. The coalition has various building blocks, of which Human-centric AI is the most relevant to the Dutch Digital Heritage Network; within this building block, the ELSA Labs cover the 'Ethical, Legal and Societal Aspects' of AI.

While not directly aimed at the cultural heritage domain, the Hybrid Intelligence Centre is an interesting project to follow. It is funded by a 10-year Zwaartekracht grant from the Dutch Ministry of Education, Culture and Science. In this project, the VU, UvA, Delft University of Technology, Groningen University, Leiden University and Utrecht University work together on designing Hybrid Intelligent systems that embody ethical, legal and societal values, in an approach to Artificial Intelligence that puts humans at the centre.

Where to look to keep abreast of research into bias?

The following is a selected list of organisations and researchers working on AI and algorithmic bias in AI.

Cultural AI Lab

The Cultural AI Lab is a Netherlands-based, pan-organisational lab within the digital humanities field, making use of digital technologies such as Machine Learning and AI. The lab is headed by Dr. Marieke van Erp (KNAW), Dr. Laura Hollink (CWI) and Dr. Victor de Boer (VU). Bias exploration is a key theme within the group, encompassing recent and ongoing research projects including:

  • Culturally Aware AI, which aims to leverage AI approaches to identify polyvocal narratives concerning some subject, and utilise semantic web technologies for viewpoint representation (disclaimer: I am one of the PhD students on this project);

  • SABIO (the social bias observatory), which provides a visual tool for exploring bias in GLAM object metadata according to a variety of measures;

  • Pressing Matter, developing methods to trace object provenance in the context of polyvocal knowledge.

Additionally, the Cultural AI Lab has a monthly paper club, discussing papers often related to bias; the list of past papers can be found here on Zenodo. The Cultural AI Lab is part of both ICAI and the ELSA Labs.

The Turing Institute

The Alan Turing Institute is concerned with AI and data science as applied to real-world problems. One of its identified key challenges is algorithmic fairness, transparency and ethics; a list of projects and events relating to algorithmic fairness can be found here. There are additionally several relevant interest groups and research programmes concerned with AI bias and with AI as applied to the humanities, whose pages are continually updated with relevant research: the trustworthy AI forum, safe and ethical AI, the data ethics group, and humanities and data science. The institute's news page is probably a good place to start in keeping abreast of their recent contributions.

The DAIR institute

Founded by the aforementioned Timnit Gebru, formerly of Google's AI ethics team, the DAIR institute aims to research bias in AI, free from the perceived restrictions of Big Tech's influence. Whilst still in its infancy, given the strength of feeling and the recent publicity surrounding its founder on the subject of bias in AI, this could be one to watch.

The complete series by Ryan Brate:
blog 1: Monitoring advances in the field of AI, with an emphasis on bias
blog 2: The Influence of polyvocality on the life-cycle of the GLAM objects
blog 3: Adding (polyvocal) context to semantic web representations
Interview with Ryan Brate
