Between Microsoft’s new Bing Search, the advent of ChatGPT, and a wave of other generative AI startups, it’s quickly becoming clear that the old “move fast and break things” approach is not one that should make a comeback.
Since these tools launched, their creators have been reactively scrambling to mitigate a number of harmful unintended consequences, some of which could have been prevented if diligent ethical AI auditing had occurred throughout the design process.
As your own organization builds AI solutions, consider the strategy, framework, and responsible AI questions you should be asking before users are affected by the technology. It’s only a matter of time before government regulations enforce these considerations.
What Do We Mean by Trusted AI?
Trusted AI is the discipline of designing AI-powered solutions that maximize the value of humans while being more fair, transparent, and privacy-preserving.
What it's about: This article explores the potential benefits and limitations of AI in the legal profession. It highlights how AI can streamline legal processes, increase efficiency, and improve accuracy. However, the author also emphasizes that AI should not be viewed as a complete solution but rather as part of the equation. The author suggests that the human touch in legal practice is still essential and that AI should be used to augment the skills and abilities of legal professionals rather than replace them.
Why it matters: While this article explicitly focuses on the legal industry, its takeaways are applicable to any industry that is designing and piloting AI. AI will never be perfect, which is why professionals must approach it with caution, and use it in conjunction with their expertise and judgment.
Studies like this one are even beginning to show that AI combined with human judgment is more successful than an ML model or a human operating alone. Organizations that are seeing the most success with building AI are the ones that also focus on the people and processes surrounding the solution. What sort of training, education, and trust must be established to yield the full impact of an AI/ML model?
What it's about: Researchers at Texas A&M University School of Public Health are developing a framework, known as Copyleft AI with Trusted Enforcement, or CAITE, for governing the ethical use of AI that balances the need for regulation with the promotion of innovation.
The framework includes guidelines for transparency, accountability, and fairness in AI systems, as well as the establishment of oversight bodies to monitor compliance with these guidelines. This framework can promote responsible AI development and deployment while also avoiding excessive regulation that may stifle innovation.
Why it matters: The rapid evolution of AI and inflexible nature of government regulation have made creating ethical AI policies historically challenging. And although we’re seeing a number of regulatory advancements like the EU AI Act, Canada’s Artificial Intelligence & Data Act proposal, and New York City’s automated employment regulation, they take considerable time to finalize and are slow to adapt to changing conditions.
As an organization that is building AI, are you waiting for government regulation to be enforced or are you proactively following an ethical AI design framework?
Jennifer Wagner, JD, of Penn State puts it best: “Efforts to promote ethical and trustworthy AI must go beyond what is legally mandated as the baseline for acceptable conduct. We can and should strive to do better than what is minimally acceptable.”
What it's about: In a recent VentureBeat interview, AI expert Andrew Ng discusses the current state and future potential of artificial intelligence. Ng argues that despite the hype surrounding generative AI, there is still significant rising momentum behind technologies like supervised learning (where machine learning models are trained on labeled data).
He suggests that generative AI will play an increasingly important role in areas such as natural language processing, computer vision, and healthcare. Ng also addresses ethical concerns around AI, emphasizing the importance of transparency and collaboration in ensuring that AI systems are developed and used responsibly.
Why it matters: Andrew Ng’s impact on AI as an educator, researcher, innovator, and leader makes this interview an especially interesting read. Ng draws on his industry experience to separate the generative AI buzz from realistic expectations of the technology.
Much like in the early days of deep learning, Ng points out that companies and data scientists are still figuring out specific use cases where generative AI makes sense. The technology may be exciting, but it likely will not make sense for your organization without the right data and the right strategy in place.
What it's about: While ChatGPT has demonstrated impressive abilities in natural language processing, it has also been criticized for generating offensive or harmful content. OpenAI has responded to these concerns by implementing a range of measures to improve the safety and fairness of ChatGPT, including adding more diverse training data, developing a toxicity filter, and creating a human moderation team to review potentially harmful output.
Why it matters: While these efforts are a step in the right direction, they are not a complete solution to the challenges of developing safe and unbiased AI models. This discussion also raises the question: why were these risks not detected and mitigated while the model was being designed? And if they were, what changed in the real world to produce this unexpected harmful content?
These are questions you should be considering when building AI/ML solutions for your own organization. It’s also critical to consider the ethical implications of AI in all stages of development and deployment. AI design is an iterative process that requires constant monitoring and managing—especially when humans are involved.
OpenAI’s attempts at reducing bias in ChatGPT are just one example of why the industry requires ongoing research and development in this area. Will pending regulations like the EU AI Act prevent the release of biased or unfair models? Only time will tell.
How do you build models that preserve user privacy?
Privacy is of utmost importance when building machine learning models. As data professionals, we are responsible for handling sensitive information carefully and creating models that respect users' privacy. To achieve this, we can explore various privacy-preserving techniques that help protect user data while still developing effective models. These techniques fall into two overarching themes: data-level privacy and model-level privacy. The techniques below only scratch the surface of how we can achieve privacy:
Data-level privacy focuses on noise injection techniques, which add a layer of protection by altering the data itself. Differential Privacy adds calibrated “noise” to the data or to query results, making it difficult to link them back to individual users, while Anonymization removes personally identifiable information (PII) from the dataset. This can be taken one step further with Synthetic Data Generation, a technique that mimics the distribution of the original data without carrying any original records into the newly generated dataset.
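To make the noise injection idea concrete, here is a minimal sketch of the Laplace mechanism for differential privacy, assuming only NumPy. The dataset, value bounds, and epsilon are hypothetical and chosen purely for illustration; a production system would use a vetted DP library and a carefully managed privacy budget.

```python
import numpy as np

def laplace_mechanism(true_value: float, sensitivity: float, epsilon: float) -> float:
    """Return a differentially private estimate of a numeric query result.

    Adds Laplace noise scaled to sensitivity / epsilon, the standard
    mechanism for epsilon-differential privacy on numeric queries.
    """
    noise = np.random.laplace(loc=0.0, scale=sensitivity / epsilon)
    return true_value + noise

# Hypothetical example: privately release the average age in a small dataset.
ages = np.array([34, 29, 41, 52, 38, 45])
true_mean = ages.mean()

# Sensitivity of the mean when ages are bounded in [0, 100] and n = 6:
# changing one record moves the mean by at most 100 / n.
sensitivity = 100 / len(ages)
private_mean = laplace_mechanism(true_mean, sensitivity=sensitivity, epsilon=1.0)

print(f"True mean: {true_mean:.1f}, DP mean (epsilon=1.0): {private_mean:.1f}")
```

Lower epsilon values add more noise and give stronger privacy; higher values give more accurate answers with weaker guarantees.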
Model-level privacy focuses on collaborative and encrypted techniques. Federated Learning trains models on local data across multiple devices or servers and shares only model updates, never the raw data. Secure Multi-Party Computation (SMPC) enables joint computations on encrypted data without revealing the data itself, and Homomorphic Encryption allows computations to be performed directly on encrypted data without decryption.
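As a sketch of the federated learning idea, the toy example below runs federated averaging for linear regression with plain NumPy. The three clients, their datasets, and the hyperparameters are all hypothetical; a real deployment would rely on a dedicated federated learning framework and secure aggregation rather than this hand-rolled loop.

```python
import numpy as np

def local_update(weights, X, y, lr=0.1, epochs=5):
    """Run a few steps of linear-regression gradient descent on one client's private data."""
    w = weights.copy()
    for _ in range(epochs):
        grad = 2 * X.T @ (X @ w - y) / len(y)  # MSE gradient
        w -= lr * grad
    return w

def federated_averaging(global_weights, client_datasets, rounds=10):
    """Clients train locally; only model weights are shared and averaged."""
    w = global_weights
    for _ in range(rounds):
        client_weights = [local_update(w, X, y) for X, y in client_datasets]
        # Only parameters leave each client; the raw data never does.
        w = np.mean(client_weights, axis=0)
    return w

# Hypothetical example: three clients holding private data for the same task.
rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0])
clients = []
for _ in range(3):
    X = rng.normal(size=(50, 2))
    y = X @ true_w + rng.normal(scale=0.1, size=50)
    clients.append((X, y))

w = federated_averaging(np.zeros(2), clients, rounds=10)
print("Learned weights (should approach [2, -1]):", np.round(w, 2))
```

The same structure extends to neural networks: each client computes a local update, and only the aggregated parameters form the new global model.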
By employing a combination of these privacy-preserving techniques, we can address the challenges of maintaining user privacy while still leveraging machine learning models for valuable insights and predictions. The key is to choose the right technique or combination of techniques based on the underlying data and the specific requirements of the project. Happy modeling!