AI Transparency: The Key Ingredient for Ethical Technology

In a recent TED talk, data reform advocate Kasia Chmielinski compared AI systems to sandwiches to illustrate the urgent need for transparency in artificial intelligence (AI). Chmielinski argued that just as we need to know what goes into our food, we should have access to the “ingredients” that make up AI systems.

“AI systems, they provide benefit to society,” Chmielinski explained. “They feed us, but they’re also inconsistently making us sick. And we don’t have access to the ingredients that go into the AI.” This lack of transparency makes it difficult to address issues or make informed decisions about the AI we interact with daily.

The problem is compounded by the rapid growth of AI datasets. Chmielinski noted: “If you look at GPT-3, which is a model that launched in 2020, the training dataset included 300 billion words, or parts of words.” Just three years later, another model was trained on eight trillion words, a roughly 27-fold increase that shows how quickly data collection and use are growing.

To address these concerns, Chmielinski and colleagues launched the Data Nutrition Project, which creates “nutrition labels” for datasets. These labels provide crucial information about the data used to train AI systems, allowing developers and users to make more informed decisions.

“Similar to food nutrition labels, the idea here is that you can look inside of a data set before you use it. You can understand the ingredients, see whether it’s healthy for the things that you want to do,” said Chmielinski.
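To make the idea concrete, here is a minimal Python sketch of what a label-like summary of a dataset could look like. It is an illustration only and does not reproduce the Data Nutrition Project's actual label format or tooling; the function names (`build_label`, `value_counts`), the fields reported, and the sample records are all hypothetical.

```python
from collections import Counter


def build_label(name, records, source="unknown"):
    """Summarize a list of dict records as a small, label-like dictionary.

    Hypothetical helper for illustration; not the Data Nutrition Project's format.
    """
    # Collect every field name that appears in at least one record.
    fields = sorted({key for row in records for key in row})
    total = len(records)
    missing = {}
    for field in fields:
        # A value counts as missing if the key is absent, None, or an empty string.
        empty = sum(1 for row in records if row.get(field) in (None, ""))
        missing[field] = round(empty / total, 2) if total else 0.0
    return {
        "name": name,
        "source": source,
        "rows": total,
        "fields": fields,
        "missing_rate": missing,
    }


def value_counts(records, field):
    """Count how often each non-missing value of a field appears."""
    values = (row.get(field) for row in records)
    return dict(Counter(v for v in values if v not in (None, "")))


# Invented sample data, used only to show the shape of the output.
records = [
    {"age": 34, "region": "north", "income": 52000},
    {"age": 51, "region": "south", "income": None},
    {"age": None, "region": "north", "income": 61000},
    {"age": 29, "region": "north"},
]

label = build_label("survey_sample", records, source="hypothetical survey")
print(label)
print(value_counts(records, "region"))
```

Even a summary this small surfaces the kind of “ingredient” information the talk describes: in the sample, half of the income values are missing and three of the four records come from a single region, both of which a developer would want to know before training on the data.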

However, Chmielinski warned that the AI landscape is becoming increasingly opaque.

“With each successive model launch, the datasets are actually less and less transparent. And even if we have access to the information, it’s so big, it’s so hard to look inside without any kind of transparency tooling,” Chmielinski said.

To promote accountability and transparency in AI development, Chmielinski proposes three principles for companies: “Companies that gather data should tell us what they’re gathering. Companies that are gathering our data should tell us what they’re going to do with it before they do anything with it. Companies that build AI should tell us about the data that they use to train the AI.”

By adhering to these principles and implementing tools like dataset nutrition labels, Chmielinski believes we can create “an integrated algorithmic internet that is healthier for everyone.” As AI continues to evolve, transparency and accountability will be important in ensuring that these powerful technologies benefit society.

AI Insider
