Hello, welcome back to Voices of Trusted AI! In our February issue, we take a look at the state of AI in 2021 before exploring new ideas that could set trends for artificial intelligence this year.
As a reminder, where you have questions, our community of expert data scientists has answers. You can ask a data scientist anything at the bottom of this digest.
As always, thank you for joining us on our journey to design and develop more responsible AI systems.
Cal Al-Dhubaib, CEO
What Do We Mean by Trusted AI?
Trusted AI is the discipline of designing AI-powered solutions that maximize the value of humans while being more fair, transparent, and privacy-preserving.
What it's about: Growing adoption rates and more efficient spending topped the list of AI trends in 2021. This McKinsey report compiles responses from 1,843 participants to give an overview of the state of AI last year, covering everything from the overall impact of AI to risk management.
Why it matters: Data scientists and industry leaders alike can use the findings in this report to compare their practices with those of leading AI “high performers”—respondents who attributed at least 20 percent of EBIT to their use of AI. It’s also a great source of inspiration for new AI use cases (think: service operations optimization and customer service analytics, to name a few).
Take a look at the functions and practices of successful AI adopters in this report and you’ll see one major commonality—they’re using design thinking and clear ethical frameworks to develop AI they can trust. Can your organization say the same?
What it's about: What does it mean for a data analysis to fail? In this article, author Roger Peng explores this question—and ultimately concludes that we can learn more than we think from an analysis that goes wrong.
Why it matters: When working with data, it’s unlikely that we’ll avoid failure altogether. As Peng elaborates in his article, a verification failure occurs when an analysis produces an unexpected outcome, whereas a validation failure involves considerations that sit outside the data entirely.
All failure can, and should, be traced back to the design process of an AI solution—how could this have been prevented, and what can we learn from it? Peng goes on to explain how data scientists and leaders have the opportunity to analyze potential failures and outcomes before a data analysis even begins. The resulting fluid cycle of analyzing, succeeding, and failing is what leads to AI design that teams can truly trust.
What it's about: Timnit Gebru, formerly of Google, has announced the launch of her new project: DAIR, the Distributed Artificial Intelligence Research Institute. With a focus on marginalized communities, the global institute is designed to benefit those most in danger of the misuse of AI.
Why it matters: Bias in data science and AI solutions is not new, especially as it affects minorities and underserved populations. However, Gebru’s DAIR project is a massive step forward in understanding the origins of unethical practices and working to create transparent AI solutions.
Previously, Gebru led research into racially biased facial recognition software, which prompted major brands like Amazon to change their practices. Last year, she was fired from Google for critiquing its large language models. We’ve seen companies make moves towards more ethical AI standards and practices, but until stricter regulations and projects like DAIR are in place, it’s hard to tell who is truly committed to designing trustworthy AI solutions. How does your organization prevent discrimination in trusted AI? What can you do better?
What it's about: In this article, George Anadiotis explains the rise of synthetic data. Where high-quality, real-world data is hard to come by, synthetic data could be the next best option in training machine learning models.
Why it matters: Training machine learning models you can trust requires transparent, annotated data—and that data is labor-intensive to produce, prone to human error, and not easy to obtain.
One solution? Synthetic data. It is computer-generated, realistic data that can be created using techniques like generative adversarial networks (GANs). And because it is generated via simulation, much of the usual bias from human annotation is eliminated—resulting in cleaner datasets for AI design and development.
Right now, synthetic data is used primarily in computer vision applications; however, it’s only a matter of time before we see increased reliance on it in other areas of AI.
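To make the GAN idea above concrete, here is a minimal, purely illustrative sketch: a generator learns to produce rows that a discriminator can no longer tell apart from rows of a small, made-up “real” dataset. The data, network sizes, and training schedule are all assumptions for demonstration (using PyTorch), not a production recipe.

```python
# Illustrative GAN sketch for synthetic tabular data (toy example, not production code).
import torch
import torch.nn as nn

torch.manual_seed(0)

# Stand-in "real" data: 1,000 rows of two correlated features.
real = torch.randn(1000, 2)
real[:, 1] = 0.8 * real[:, 0] + 0.2 * real[:, 1]

# Generator maps random noise to candidate rows; discriminator scores rows as real/fake.
generator = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 2))
discriminator = nn.Sequential(nn.Linear(2, 16), nn.ReLU(), nn.Linear(16, 1), nn.Sigmoid())

loss_fn = nn.BCELoss()
g_opt = torch.optim.Adam(generator.parameters(), lr=1e-3)
d_opt = torch.optim.Adam(discriminator.parameters(), lr=1e-3)

for step in range(2000):
    # Discriminator step: learn to separate real rows from generated rows.
    noise = torch.randn(64, 8)
    fake = generator(noise).detach()
    batch = real[torch.randint(0, len(real), (64,))]
    d_loss = loss_fn(discriminator(batch), torch.ones(64, 1)) + \
             loss_fn(discriminator(fake), torch.zeros(64, 1))
    d_opt.zero_grad()
    d_loss.backward()
    d_opt.step()

    # Generator step: learn to produce rows the discriminator accepts as real.
    noise = torch.randn(64, 8)
    g_loss = loss_fn(discriminator(generator(noise)), torch.ones(64, 1))
    g_opt.zero_grad()
    g_loss.backward()
    g_opt.step()

# Draw a synthetic dataset from the trained generator.
synthetic = generator(torch.randn(500, 8)).detach()
print(synthetic[:5])
```

The same adversarial pattern scales to richer tabular and image data; in practice, teams typically reach for dedicated synthetic-data libraries rather than hand-rolled training loops like this one.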
Who should lead my organization in identifying bias in our AI systems?
Risk management is a shared responsibility. Everyone should be empowered to identify and report bias in your AI systems. However, not everyone can be responsible for everything, so risk management requires delineating who is responsible for identifying bias at each level of the organization.
At the highest level, your organization’s leaders must prioritize identifying bias by assigning resources to the problem and encouraging a culture that values discussions about bias. Trust in leadership inspires open reporting and a transparent dialogue about bias.
Similarly, your data team must integrate bias identification into their standard practices: the project lead should build it in at the beginning of each project and check it at regular intervals. Models drift, and you might begin to see bias where it did not exist before. Individual contributors should build bias checking into the AI systems themselves to facilitate this validation.
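As a minimal illustration of what such a built-in check could look like, here is a sketch that compares positive-prediction rates across groups and flags any group falling below the common “four-fifths” rule of thumb relative to a reference group. The group names, data, and threshold are hypothetical; a real system would use whichever fairness metrics and tooling your team has standardized on.

```python
# Sketch of a recurring bias check (hypothetical groups and threshold; adapt to your data).
import numpy as np

def disparate_impact_ratios(predictions, groups, privileged="group_a"):
    """Ratio of each group's positive-prediction rate to the reference group's rate."""
    rates = {g: predictions[groups == g].mean() for g in np.unique(groups)}
    return {g: rate / rates[privileged] for g, rate in rates.items()}

# Toy example: binary model outputs and a protected attribute for ten records.
preds = np.array([1, 0, 1, 1, 0, 1, 0, 0, 1, 1])
groups = np.array(["group_a"] * 5 + ["group_b"] * 5)

for g, ratio in disparate_impact_ratios(preds, groups).items():
    flag = "REVIEW" if ratio < 0.8 else "ok"  # 0.8 reflects the four-fifths rule of thumb
    print(f"{g}: ratio={ratio:.2f} [{flag}]")
```

A check like this can run on every retraining cycle or scoring batch, so drift-induced bias surfaces as a routine alert rather than a surprise.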
While the nuts and bolts of bias identification occur at the lowest level of an organization, that work will not happen without a transparent culture that values and encourages discussions about bias.
- Merilys Huhn // Associate Data Science Consultant