
Explainer: Impact Assessments for Artificial Intelligence

Artificial intelligence (AI) promises transformative benefits to society but comes at a cost when a system initially designed to solve one problem has unintended consequences. For example, an AI system recently created to automate the drug discovery process by identifying proteins that indicate disease also identified 40,000 potential bioweapons. Impact assessments are one tool to anticipate and manage an AI system’s benefits, risks, and limitations throughout its entire life cycle. The Bipartisan Policy Center has been working for the past year with outside stakeholders to examine the benefits and challenges of potential AI impact assessments. After convening representatives from academia, civil society, government, and industry, we are sharing what we have learned about this emerging tool.

In this piece, we address seven questions to help explain what an impact assessment actually is, and what it means for AI.

What are impact assessments?

An impact assessment is a risk management tool that seeks to ensure an organization has sufficiently considered a system’s relative benefits and costs before implementation. In the context of AI, an impact assessment helps to answer a simple question: alongside this system’s intended use, for whom could it fail?

An organization might use an impact assessment to understand how an AI system will impact users and society more broadly at every stage: data-gathering, development, deployment, and continuous monitoring in society. These assessments can serve as critical guard rails for public and private organizations, with the investment required proportionate to risk. For instance, AI systems that offer movie recommendations would receive less scrutiny than those that make hiring or loan decisions. An impact assessment promotes accountability by requiring an organization to document its decision-making process and “show its work.”

During BPC’s work in 2020 on an AI National Strategy for Congress, there was little consensus among experts on the baseline language, scope, and design of a potential assessment. As we learn how to identify and measure harm from AI systems, impact assessments can make the inner workings of the algorithms that power these systems more transparent. The goal is for impact assessments to promote accountability through documentation and knowledge production, rather than to instill fear of liability. This process seeks to create a regulatory incentive for organizations to answer basic questions about decision-making: why was a given regression criterion or training data set chosen and why were alternatives rejected?

What is the difference between impact assessments and risk assessments?

While risk assessments and impact assessments are closely related, each conveys certain assumptions about AI. A risk assessment takes a more negative slant: it assumes that AI is inherently risky and leads to negative outcomes, so its goal is to identify potential threats and vulnerabilities. An impact assessment takes a more balanced slant, seeking to weigh an AI system’s benefits alongside its risks.

In what contexts outside AI have impact assessments been applied?

While initially focused on the economic and administrative impacts of proposed regulations, impact assessments are now used in many policy contexts. Governments have introduced impact assessments with lenses for data protection, the environment, agriculture, transport, climate change, health, human rights, and racial equity. The purpose is to help an organization compare the costs and benefits of a system through a specific impact lens.

What are shortcomings of impact assessments?

Regardless of context, an impact assessment seeks to establish a systematic approach to outlining risks and documenting decision-making. However, several significant challenges emerge. First, harms are difficult to measure: risks to human dignity, fairness, and privacy are hard to quantify, which complicates the process of setting standards. Second, even where an impact assessment is embedded in an organization, its use may be uneven. It may be applied rigorously by an ethics board but ignored by the business unit, or considered at deployment but not during initial design. Third, there is concern that impact assessments focus too much on the short term given the uncertainty of long-term outcomes. Fourth, impact assessments may focus too much on AI’s downside risks and not enough on its upside benefits. Fifth, it can be costly and time-consuming to implement an impact assessment; a risk-based approach would ease this burden by allowing a lighter review of less sensitive applications and a more rigorous review of sensitive ones. Overall, given these uncertainties, it is important to keep impact assessments adaptable to technological change.

When do AI impact assessments take place?

An impact assessment can take place either ex ante (before) or ex post (after) the action in question. Whereas ex post assessments consider the actual impacts of a technology, policy, or business practice after the activity has been initiated, ex ante assessments consider potential impacts before initiating the activity. AI impact assessments begin ex ante to foresee potential risks, with continuous review of impacts throughout the AI life cycle. This allows an organization to mitigate the negative consequences of an AI system before its deployment. The analysis emphasizes accountability, putting the burden on organizations to prevent harms from happening rather than waiting for affected individuals to invoke their rights.

How have AI impact assessments evolved?

The role of impact assessments in AI is starting to become more pronounced. In 2018, the AI Now Institute introduced the concept of an “Algorithmic Impact Assessment.” The purpose was to help communities and stakeholders assess claims made by AI systems and determine “where—or if—their use is appropriate.” Since then, organizations worldwide have begun discussing potential baselines for AI impact assessments. European efforts include Project Sherpa, the EU’s Assessment List for Trustworthy Artificial Intelligence, and “conformity assessments” introduced in the EU’s AI Act.

Canada’s federal government has also co-designed an Algorithmic Impact Assessment tool with academia, civil society, and public institutions. The quantitative tool—open source and available on GitHub—is a questionnaire with 48 risk and 33 mitigation questions that collectively create a “risk score” based on factors such as systems design, algorithm, decision type, impact, and data. The risk score has two components: a “raw impact score” based on the risks of the automation and a “mitigation score” based on how the risks of automation are managed. The final score determines the level of oversight and safeguards required for the project. The U.S. CIO Council also released a similar Algorithmic Impact Assessment, introducing broadly applicable standards that govern the ethical use of automated decision systems. In 2020, South Korea became the first country to introduce AI impact assessments as national-level legislation.
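The two-part scoring described above can be sketched in a few lines of Python. This is a minimal illustration, not the Canadian tool's actual logic: the class and function names are hypothetical, and the credit threshold, credit size, and level cut-offs are assumptions chosen only to show how a raw impact score and a mitigation score might combine into an oversight level.

```python
from dataclasses import dataclass

# Illustrative assumptions only; a real assessment tool defines its own values.
MITIGATION_CREDIT_THRESHOLD = 0.80  # share of the maximum mitigation score needed to earn a credit
MITIGATION_CREDIT = 0.15            # fraction deducted from the raw impact score when the credit applies
LEVEL_UPPER_BOUNDS = [(0.25, 1), (0.50, 2), (0.75, 3), (1.00, 4)]  # (max share of points, oversight level)


@dataclass
class Assessment:
    raw_impact: int      # points earned on the risk questions
    max_raw_impact: int  # maximum attainable risk points
    mitigation: int      # points earned on the mitigation questions
    max_mitigation: int  # maximum attainable mitigation points


def final_score(a: Assessment) -> float:
    """Reduce the raw impact score when mitigation is strong enough."""
    score = float(a.raw_impact)
    if a.mitigation >= MITIGATION_CREDIT_THRESHOLD * a.max_mitigation:
        score *= 1 - MITIGATION_CREDIT
    return score


def oversight_level(a: Assessment) -> int:
    """Map the final score, as a share of the maximum, to an oversight level (1 = lightest)."""
    share = final_score(a) / a.max_raw_impact
    for upper_bound, level in LEVEL_UPPER_BOUNDS:
        if share <= upper_bound:
            return level
    return LEVEL_UPPER_BOUNDS[-1][1]


weak = Assessment(raw_impact=55, max_raw_impact=100, mitigation=10, max_mitigation=50)
strong = Assessment(raw_impact=55, max_raw_impact=100, mitigation=45, max_mitigation=50)
print(oversight_level(weak), oversight_level(strong))
```

In this sketch, the same system lands in a lower oversight level once its mitigation measures are well documented, which reflects the incentive structure a scored questionnaire is meant to create.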

However, there are no internationally accepted standards that would give organizations a consistent structure to evaluate their AI systems. Absent these standards, some organizations have voluntarily developed their own techniques. For example, model cards are short documents that communicate key information about how a model works and its intended use cases. The nonprofit EqualAI has also launched a badge program to train senior executives on how to understand and implement responsible AI practices.
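To make the model card idea concrete, the sketch below represents one as a small Python data structure. The field names and the `missing_sections` and `render` helpers are hypothetical, chosen for illustration; they do not reproduce any organization's official template.

```python
from dataclasses import dataclass, field


@dataclass
class ModelCard:
    """A hypothetical, minimal model card; real templates vary by organization."""
    name: str
    intended_use: str
    out_of_scope_uses: list[str] = field(default_factory=list)
    training_data: str = ""
    evaluation_metrics: dict[str, float] = field(default_factory=dict)
    known_limitations: list[str] = field(default_factory=list)

    def missing_sections(self) -> list[str]:
        """List optional sections left empty, so reviewers can see documentation gaps."""
        optional = ["out_of_scope_uses", "training_data", "evaluation_metrics", "known_limitations"]
        return [section for section in optional if not getattr(self, section)]

    def render(self) -> str:
        """Produce a short plain-text summary of the card."""
        lines = [f"Model: {self.name}", f"Intended use: {self.intended_use}"]
        if self.out_of_scope_uses:
            lines.append("Out-of-scope uses: " + "; ".join(self.out_of_scope_uses))
        if self.training_data:
            lines.append(f"Training data: {self.training_data}")
        for metric, value in self.evaluation_metrics.items():
            lines.append(f"Metric {metric}: {value}")
        if self.known_limitations:
            lines.append("Known limitations: " + "; ".join(self.known_limitations))
        return "\n".join(lines)


card = ModelCard(
    name="loan-screening-demo",  # made-up example system
    intended_use="Rank applications for manual review",
    known_limitations=["Not evaluated on applicants under 21"],
)
print(card.render())
print("Missing:", card.missing_sections())
```

Flagging empty sections, as `missing_sections` does here, is one simple way such a document could support the "show its work" goal of an impact assessment.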

What are the differences between impact assessments and third-party audits?

The main difference between impact assessments and audits is whether the organization brings in a third party to evaluate the AI system. There are three distinct approaches to AI audits: functionality audits (focusing on the rationale behind using an AI system); code audits (focusing on an algorithm’s source code); and impact audits (focusing on the effects of a system’s outputs). Some worry that such audits could formalize a rigid regulatory process that is convoluted, time-consuming, and politicized. For example, a requirement to give external auditors access to an organization’s source code and algorithms could run afoul of U.S. intellectual property laws. Further, absent a consistent set of standards to guide behavior, it is difficult to conduct an objective audit across industries. Some organizations have pursued a middle-ground option in which an outside law firm helps to establish the organization’s internal assessment process, but no data or source code is handed over to an external party.

Conclusion

While AI impact assessments show promise at a conceptual level, at least three big questions remain. First, to what extent should organizations be required to expose their work to third parties or the wider public rather than use this tool for internal self-assessment? Second, how might we embed impact assessments into an organization’s culture, from developer to executive? Third, how might we measure the risk and societal impact of AI systems in a way that organizations can understand and governments can regulate if necessary? The National Institute of Standards and Technology (NIST) is collaborating with stakeholders to answer these questions through its proposed AI Risk Management Framework, but there is still a long way to go to mainstream the use of AI impact assessments.

Read BPC’s comments to the NIST AI Risk Management Framework here.
