AI alignment: Matching artificial intelligence goals with human values

Image credit: iStock

Some researchers believe that measures should be implemented to ensure artificial intelligence does not harm society.
    • Author: Quantumrun Foresight
    • January 25, 2023

    Artificial intelligence (AI) alignment is when an AI system's goals match human values. Companies like OpenAI, DeepMind, and Anthropic have teams of researchers whose sole focus is to study guardrails for scenarios in which an AI system's goals might drift away from human values.

    AI alignment context

    According to a 2021 Cornell University research study, several studies have shown that tools or models created by algorithms display bias sourced from the data they were trained on. For example, in natural language processing (NLP), select NLP models trained on limited data sets have been documented making predictions based on harmful gender stereotypes against women. Similarly, other studies found that algorithms trained on tampered data sets produced racially biased recommendations, particularly in policing.

    There are plenty of examples in which machine learning systems have performed worse for minorities or groups suffering from multiple disadvantages. In particular, automated facial analysis and healthcare diagnostics typically work poorly for women and people of color. When systems that are supposed to be based on facts and logic rather than emotion are used in contexts like allocating healthcare or education, they can do even more damage because it is harder to identify the reasoning behind their recommendations.
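
    One way such disparities surface in practice is through subgroup error audits. Below is a minimal sketch, assuming invented toy data and placeholder group labels (nothing here comes from the studies cited above), of how a team might compare a model's error rate across demographic groups:

    ```python
    # Minimal, hypothetical subgroup audit: compare a classifier's
    # error rate across demographic groups. The triples below are
    # invented toy data for illustration only.
    from collections import defaultdict

    # (true_label, predicted_label, group)
    results = [
        (1, 1, "group_a"), (0, 0, "group_a"), (1, 1, "group_a"), (0, 0, "group_a"),
        (1, 0, "group_b"), (0, 1, "group_b"), (1, 1, "group_b"), (1, 0, "group_b"),
    ]

    totals, errors = defaultdict(int), defaultdict(int)
    for true_label, predicted, group in results:
        totals[group] += 1
        errors[group] += int(true_label != predicted)

    # A large gap between groups is the kind of disparity fairness
    # and alignment teams look for before deploying a system.
    for group, n in totals.items():
        print(f"{group}: error rate {errors[group] / n:.2f} over {n} examples")
    ```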

    As a result, tech firms are creating AI alignment teams to focus on keeping algorithms fair and humane. Research is essential to understanding the direction of advanced AI systems, as well as the challenges we might face as AI capabilities grow.

    Disruptive impact

    According to Jan Leike, head of AI alignment at OpenAI (2021), given that AI systems only became truly capable in the 2010s, it's understandable that most AI alignment research has been theory-heavy. Even when immensely powerful AI systems are aligned, one challenge humans face is that these machines might create solutions that are too complicated for people to review and assess for ethical soundness.

    Leike devised a recursive reward modeling (RRM) strategy to address this problem. With RRM, several "helper" AIs are taught to help a human evaluate how well a more complex AI performs. He is optimistic about the possibility of creating something he refers to as an "alignment MVP." In startup terms, an MVP (or minimum viable product) is the simplest possible product a company can build to test an idea. The hope is that AI will someday match human performance at researching AI and its alignment with values while also remaining functional.
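
    As a rough sketch of the RRM loop, the hypothetical code below shows helper models distilling a complex agent's outputs into critiques a human can judge, with the resulting preferences collected as training data for a reward model. Every function name here is an illustrative placeholder, not OpenAI's implementation:

    ```python
    # A minimal, hypothetical sketch of recursive reward modeling (RRM).
    # Every function below is an illustrative stub, not OpenAI's code.
    import random

    def complex_agent(task: str, seed: int) -> str:
        """Stand-in for a powerful AI whose raw output is too complex
        for a human to audit directly."""
        return f"candidate-{seed} plan for {task}"

    def helper_ai(output: str) -> str:
        """A weaker 'helper' model that critiques/summarizes the output
        so a human evaluator can actually assess it."""
        return f"critique of {output}"

    def human_preference(critique_a: str, critique_b: str) -> int:
        """The human compares two critiques and prefers one (0 or 1).
        A random stub stands in for real human judgment here."""
        return random.randint(0, 1)

    # Collect preference data; in RRM this trains a reward model, which
    # then supervises training of the next, more capable agent - the
    # "recursive" step that this toy loop does not implement.
    preference_data = []
    for task in ["summarize report", "draft policy memo"]:
        a = complex_agent(task, seed=1)
        b = complex_agent(task, seed=2)
        choice = human_preference(helper_ai(a), helper_ai(b))
        preference_data.append((task, (a, b)[choice]))

    print(preference_data)
    ```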

    While increasing interest in AI alignment is a net positive, many analysts in the field think that much of the "ethics" work at leading AI labs is just public relations designed to make tech companies look good and avoid negative publicity. These individuals don't expect ethical development practices to become a priority for these companies anytime soon.

    These observations highlight the importance of interdisciplinary approaches for value alignment efforts, as this is a relatively new area of moral and technical inquiry. Different branches of knowledge should be part of an inclusive research agenda. This initiative also points to the need for technologists and policymakers to remain aware of their social context and stakeholders, even as AI systems become more advanced.

    Implications of AI alignment

    Wider implications of AI alignment may include: 

    • Artificial intelligence labs appointing diverse ethics boards to oversee projects and comply with ethical AI guidelines.
    • Governments creating laws that require companies to submit their responsible AI frameworks and their plans for further developing their AI projects.
    • Increased controversy over the use of algorithms in recruitment, public surveillance, and law enforcement.
    • Researchers being fired from large AI labs due to conflicts of interest between ethics and corporate goals.
    • Increased pressure on governments to regulate advanced AI systems that are incredibly powerful but can potentially violate human rights.

    Questions to comment on

    • How can firms be made accountable for the AI systems they create?
    • What other potential dangers could arise if AI is misaligned?

    Insight references

    The following popular and institutional links were referenced for this insight: