The development of AI has created new opportunities to improve the lives of people around the world, from business to healthcare to education. It has also raised new questions about the best way to build fairness, interpretability, privacy, and safety into these systems.
While general best practices for software systems should always be followed when designing AI systems, there are also a number of considerations unique to machine learning.
The way actual users experience your system is essential to assessing the true impact of its predictions, recommendations, and decisions.
Using several metrics rather than a single one will help you understand tradeoffs between different kinds of errors and experiences (a brief example follows this set of practices).
ML models will reflect the data they are trained on, so analyze your raw data carefully to ensure you understand it. In cases where this is not possible, e.g., with sensitive raw data, understand your input data as much as possible while respecting privacy; for example by computing aggregate, anonymized summaries.
Apply best practices from software engineering testing and quality engineering to make sure the AI system is working as intended and can be trusted.
Continued monitoring will ensure your model takes real-world performance and user feedback (e.g., happiness tracking surveys, HEART framework) into account.
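As a minimal sketch of the multiple-metrics practice mentioned above, the snippet below computes precision, recall, and false positive rate for a binary classifier at several decision thresholds. It assumes scikit-learn is available, and the labels and scores are illustrative placeholders rather than real data.

```python
# Minimal sketch: evaluating a binary classifier with several metrics at once,
# rather than relying on a single summary number. Assumes scikit-learn is
# installed; y_true and y_score are illustrative placeholders.
import numpy as np
from sklearn.metrics import precision_score, recall_score, confusion_matrix

y_true = np.array([0, 0, 1, 1, 1, 0, 1, 0])                     # ground-truth labels
y_score = np.array([0.1, 0.4, 0.35, 0.8, 0.7, 0.2, 0.6, 0.55])  # model scores

for threshold in (0.3, 0.5, 0.7):
    y_pred = (y_score >= threshold).astype(int)
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
    print(
        f"threshold={threshold:.1f} "
        f"precision={precision_score(y_true, y_pred):.2f} "
        f"recall={recall_score(y_true, y_pred):.2f} "
        f"false_positive_rate={fp / (fp + tn):.2f}"
    )
```

Watching how these metrics move together as the threshold changes makes the tradeoff between different kinds of errors explicit in a way that a single accuracy number would hide.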
Google Research, 2022 & beyond: Responsible AI
The Data Cards Playbook
Imagen: Text-to-Image Diffusion Model
Improving skin tone evaluation in machine learning
Using AI to study 12 years of representation in TV
Announcements from AI@
Wordcraft Writer’s Workshop
AI systems are enabling new experiences and abilities for people around the globe. Beyond recommending apps, short videos, and TV shows, AI systems can be used for more critical tasks, such as predicting the presence and severity of a medical condition, matching people to jobs and partners, or identifying if a person is crossing the street. Such computerized assistive or decision-making systems have the potential to be more fair and more inclusive at a broader scale than historical decision-making processes based on ad hoc rules or human judgments. The risk is that any unfair bias in such systems can also have a wide-scale impact. Thus, as the impact of AI increases across sectors and societies, it is critical to work towards systems that are fair and inclusive for all.
This is a hard task. First, ML models learn from existing data collected from the real world, and so a model may learn or even amplify problematic pre-existing biases in the data based on race, gender, religion or other characteristics.
Second, even with the most rigorous and cross-functional training and testing, it is a challenge to build systems that will be fair across all situations or cultures. For example, a speech recognition system that was trained on US adults may be fair and inclusive in that specific context. When used by teenagers, however, the system may fail to recognize evolving slang words or phrases. If the system is deployed in the United Kingdom, it may have a harder time with certain regional British accents than others. And even when the system is applied to US adults, we might discover unexpected segments of the population whose speech it handles poorly, for example people speaking with a stutter. Use of the system after launch can reveal unintentional, unfair outcomes that were difficult to predict.
Third, there is no standard definition of fairness, whether decisions are made by humans or machines. Identifying appropriate fairness criteria for a system requires accounting for user experience, cultural, social, historical, political, legal, and ethical considerations, several of which may have tradeoffs. Even for situations that seem simple, people may disagree about what is fair, and it may be unclear what point of view should dictate AI policy, especially in a global setting. That said, it is possible to aim for continuous improvement toward "fairer" systems.
Addressing fairness, equity, and inclusion in AI is an active area of research. It requires a holistic approach, from fostering an inclusive workforce that embodies critical and diverse knowledge, to seeking input from communities early in the research and development process to develop an understanding of societal contexts, to assessing training datasets for potential sources of unfair bias, to training models to remove or correct problematic biases, to evaluating models for disparities in performance, to continued adversarial testing of final AI systems for unfair outcomes. In fact, ML models can even be used to identify some of the conscious and unconscious human biases and barriers to inclusion that have developed and perpetuated throughout history, bringing about positive change.
Far from a solved problem, fairness in AI presents both an opportunity and a challenge. Google is committed to making progress in all of these areas, and to creating tools, datasets, and other resources for the larger community and adapting these as new challenges arise with the development of generative AI systems. Our current thinking at Google is outlined below.
It is important to identify whether or not machine learning can help provide an adequate solution to the specific problem at hand. If it can, keep in mind that, just as there is no single "correct" model for all ML or AI tasks, there is no single technique that ensures fairness in every situation or outcome. In practice, AI researchers and developers should consider using a variety of approaches to iterate and improve, especially when working in the emerging area of generative AI.
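One concrete starting point for evaluating models for disparities in performance, as described above, is to compute the same metric separately for each group and compare the slices. The sketch below does this with scikit-learn on placeholder arrays; the groups, labels, and predictions are illustrative assumptions, and a per-slice gap is a prompt for further investigation rather than a definition of fairness.

```python
# Minimal sketch: slicing an evaluation metric by group to surface performance gaps.
# Assumes scikit-learn; the arrays are illustrative placeholders only.
import numpy as np
from sklearn.metrics import recall_score

y_true = np.array([1, 0, 1, 1, 0, 1, 0, 1])
y_pred = np.array([1, 0, 0, 1, 0, 1, 1, 0])
group = np.array(["a", "a", "a", "a", "b", "b", "b", "b"])  # e.g., a demographic slice

for g in np.unique(group):
    mask = group == g
    recall = recall_score(y_true[mask], y_pred[mask])
    print(f"group={g} recall={recall:.2f} n={mask.sum()}")
```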
Measuring Gendered Correlations in Pre-trained NLP Models
TensorFlow Constrained Optimization Library
Gender-specific translations in Google Translate
MinDiff for ML Fairness
Data decisions and theoretical implications when adversarially learning fair representations
Automated predictions and decision making can improve lives in a number of ways, from recommending music you might like to consistently monitoring a patient’s vital signs. As these systems take on more consequential roles, interpretability, or the degree to which we can question, understand, and trust an AI system, becomes crucial. Interpretability also reflects our domain knowledge and societal values, provides scientists and engineers with better means of designing, developing, and debugging models, and helps to ensure that AI systems are working as intended.
These issues apply to humans as well as AI systems—after all, it's not always easy for a person to provide a satisfactory explanation of their own decisions. For example, it can be difficult for an oncologist to quantify all the reasons why they think a patient’s cancer may have recurred—they may just say they have an intuition based on patterns they have seen in the past, leading them to order follow-up tests for more definitive results. In contrast, an AI system can list a variety of information that went into its prediction: biomarker levels and corresponding scans from 100 different patients over the past 10 years, but it may have a hard time communicating how it combined all that data to estimate an 80% chance of cancer and a recommendation to get a PET scan. Understanding complex AI models, such as the deep neural networks at the foundation of generative AI systems, can be challenging even for machine learning experts.
Understanding and testing AI systems also offers new challenges compared to traditional software – especially as generative AI models and systems continue to emerge. Traditional software is essentially a series of if-then rules, and interpreting and debugging performance largely consists of chasing a problem down a garden of forking paths. While that can be extremely challenging, a human can generally track the path taken through the code, and understand a given result.
With AI systems, the "code path" may include millions of parameters – billions in generative AI systems – and mathematical operations, so it is much harder to pinpoint one specific bug that leads to a faulty decision than with previous software. However, with responsible AI system design, those millions or billions of values can be traced back to the training data or to model attention on specific data or features, making it possible to discover the bug. That contrasts with one of the key problems in traditional decision-making software: the existence of "magic numbers"—decision rules or thresholds set without explanation by a now-forgotten programmer, often based on personal intuition or a tiny set of trial examples.
Overall, an AI system is best understood through its underlying training data and training process, as well as the resulting AI model. While this poses new challenges, the collective effort of the tech community to formulate proactive responsible guidelines, best practices, and tools is steadily improving our ability to understand, control, and debug AI systems, and we’d like to share some of our current work and thinking in this area.
Interpretability and accountability are areas of ongoing research and development at Google and in the broader AI community. Here we share some of our recommended practices to date.
Interpretability can be pursued before, during, and after designing and training your model.
The metrics you consider must address the particular benefits and risks of your specific context. For example, a fire alarm system would need to have high recall, even if that means the occasional false alarm.
Many techniques are being developed to gain insights into the model (e.g., sensitivity to inputs).
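As one illustration of the "sensitivity to inputs" idea above, the sketch below perturbs one feature at a time and measures how much an assumed scikit-learn-style model's prediction moves. This is a generic perturbation probe, not any particular Google tool, and the interface and epsilon value are assumptions for illustration.

```python
# Minimal sketch: probing a model's sensitivity to each input feature by
# perturbing features one at a time and observing the change in the prediction.
# `model.predict` is an assumed scikit-learn-style interface; data is illustrative.
import numpy as np

def input_sensitivity(model, x, epsilon=1e-2):
    """Return the per-feature change in prediction when each feature is nudged by epsilon."""
    baseline = model.predict(x.reshape(1, -1))[0]
    sensitivities = []
    for i in range(x.shape[0]):
        perturbed = x.copy()
        perturbed[i] += epsilon
        delta = model.predict(perturbed.reshape(1, -1))[0] - baseline
        sensitivities.append(abs(delta) / epsilon)
    return np.array(sensitivities)
```

Features with large sensitivity values are the ones the model leans on most heavily near the example being probed; gradient-based attribution methods build on the same idea more efficiently for differentiable models.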
Explainable AI with Google Cloud
TensorFlow Lattice (open source)
Deepdream and building blocks of interpretability
What-If Tool (open source)
Towards a rigorous science of interpretable machine learning
ML models learn from training data and make predictions on input data. Sometimes the training data, input data, or both can be quite sensitive. Although there may be enormous benefits to building a model that operates on sensitive data (e.g., a cancer detector trained on a responsibly sourced dataset of biopsy images and deployed on individual patient scans), it is essential to consider the potential privacy implications in using sensitive data. This includes not only respecting the legal and regulatory requirements, but also considering social norms and typical individual expectations. For example, it’s crucial to put safeguards in place to ensure the privacy of individuals considering that ML models may remember or reveal aspects of the data they have been exposed to. It’s essential to offer users transparency and control of their data.
Fortunately, the possibility that ML models reveal underlying data can be minimized by appropriately applying various techniques in a precise, principled fashion. Google is constantly developing such techniques to protect privacy in AI systems, including emerging practices for generative AI systems. This is an active area of research in the AI community with ongoing room for growth. Below we share the lessons we have learned so far.
Just as there is no single "correct" model for all ML tasks, there is no single correct approach to ML privacy protection across all scenarios, and new ones may arise. In practice, researchers and developers must iterate to find an approach that appropriately balances privacy and utility for the task at hand; for this process to succeed, a clear definition of privacy is needed, which can be both intuitive and formally precise.
Because ML models can expose details about their training data via both their internal parameters as well as their externally-visible behavior, it is crucial to consider the privacy impact of how the models were constructed and may be accessed.
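One privacy definition that is both intuitive and formally precise, as called for above, is differential privacy. The sketch below shows, in plain NumPy, the core step used by differentially private training methods such as DP-SGD: clip each example's gradient contribution and add calibrated Gaussian noise. The clipping norm and noise multiplier are illustrative values, and a production system would rely on a vetted library and a proper privacy accountant.

```python
# Minimal sketch of the core DP-SGD step: clip each example's gradient to a fixed
# norm, sum, add Gaussian noise, and average. Values here are illustrative only;
# real systems should use an audited library and track the privacy budget.
import numpy as np

def dp_average_gradient(per_example_grads, clip_norm=1.0, noise_multiplier=1.1, rng=None):
    rng = rng or np.random.default_rng()
    clipped = []
    for g in per_example_grads:
        norm = np.linalg.norm(g)
        clipped.append(g * min(1.0, clip_norm / (norm + 1e-12)))  # bound each example's contribution
    total = np.sum(clipped, axis=0)
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=total.shape)
    return (total + noise) / len(per_example_grads)
```

Bounding each example's contribution and adding noise limits how much any single training example can influence the model, which in turn limits what the trained model can reveal about that example.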
Federated Reconstruction: Partially Local Federated Learning
RAPPOR (open source)
TensorFlow Federated (open source)
Secure aggregation protocol
Safety and security entails ensuring AI systems behave as intended, regardless of how attackers try to interfere. It is essential to consider and address the safety of an AI system before it is widely relied upon in safety-critical applications. There are many challenges unique to the safety and security of AI systems. For example, it is hard to predict all scenarios ahead of time, when ML is applied to problems that are difficult for humans to solve, and especially so in the era of generative AI. It is also hard to build systems that provide both the necessary proactive restrictions for safety as well as the necessary flexibility to generate creative solutions or adapt to unusual inputs. As AI technology evolves, so will security issues, as attackers will surely find new means of attack; and new solutions will need to be developed in tandem. Below are our current recommendations from what we’ve learned so far.
Safety research in ML spans a wide range of threats, including training data poisoning, recovery of sensitive training data, model theft, and adversarial examples. Google invests in research related to all of these areas, and some of this work is related to practices in AI privacy. One focus of safety research at Google has been adversarial learning—the use of one neural network to generate adversarial examples that can fool a system, coupled with a second network to try to detect the fraud.
Currently, the best defenses against adversarial examples are not yet reliable enough for use in a production environment. This is an ongoing, extremely active research area. Because there is not yet an effective defense, developers should think about whether their system is likely to come under attack, consider the likely consequences of a successful attack, and in most cases should simply not build systems where such attacks are likely to have significant negative impact.
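To make the idea of adversarial examples concrete, the sketch below implements the fast gradient sign method (FGSM), one of the simplest attacks: it nudges the input in the direction that most increases the loss. The gradient function, epsilon, and the [0, 1] input range are assumptions for illustration.

```python
# Minimal sketch of the fast gradient sign method (FGSM): perturb the input in the
# direction that most increases the loss, within an epsilon-sized budget.
# `loss_grad_wrt_input` is an assumed callable returning dLoss/dInput for (x, y).
import numpy as np

def fgsm_example(x, y, loss_grad_wrt_input, epsilon=0.01):
    grad = loss_grad_wrt_input(x, y)     # gradient of the loss w.r.t. the input
    x_adv = x + epsilon * np.sign(grad)  # step in the sign of the gradient
    return np.clip(x_adv, 0.0, 1.0)      # keep the result a valid (e.g., image) input
```

Even this one-step attack can flip the predictions of otherwise accurate image classifiers, which is part of why reliable defenses remain an open research problem.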
Another practice is adversarial testing, a method for systematically evaluating an ML model or application with the intent of learning how it behaves when provided with malicious or inadvertently harmful input, such as asking a text generation model to generate a hateful rant about a particular religion. This practice helps teams systematically improve models and products by exposing current failure patterns and guiding mitigation pathways (e.g., model fine-tuning, or putting filters or other safeguards in place on inputs or outputs). Most recently, we’ve evolved our ongoing "red teaming" efforts (an adversarial security testing approach that identifies vulnerabilities to attacks) to "ethically hack" our AI systems and support our Secure AI Framework.
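A lightweight version of such an adversarial testing loop might look like the sketch below; generate and safety_score are hypothetical stand-ins for a text generation model and a safety classifier, and real red-teaming relies on curated prompt suites and human review rather than a single threshold.

```python
# Minimal sketch of an adversarial testing loop for a text-generation system.
# `generate` and `safety_score` are hypothetical stand-ins for a model endpoint and
# a safety classifier; real red-teaming uses curated prompt suites and human review.
def adversarial_test(prompts, generate, safety_score, threshold=0.5):
    failures = []
    for prompt in prompts:
        response = generate(prompt)
        score = safety_score(response)  # higher = more likely policy-violating
        if score >= threshold:
            failures.append({"prompt": prompt, "response": response, "score": score})
    return failures  # failure patterns feed fine-tuning, filters, or other safeguards
```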
Some applications, e.g., spam filtering, can be successful with current defense techniques despite the difficulty of adversarial ML.
On Evaluating Adversarial Robustness
Ensemble Adversarial Training
Unrestricted Adversarial Examples Challenge
Concrete problems in AI safety