Conversations with an expert on AI Safety

February 10, 2022

Edited excerpts from an interview with an expert on AI safety.

Charles: Some of our readers may be unfamiliar with AI safety. Can you start by describing what AI safety is?

Expert: AI safety is the study and prevention of unintended consequences of artificial intelligence. This includes ensuring that AI is designed in a way that prevents accidents, such as those that could lead to loss of human life or serious damage to the environment. It also includes making sure that AI systems are reliable and trustworthy, so that they can be relied on to make decisions that are in the best interests of humans and society.

Well, that was a bit dry. How about this: AI safety is the practice of making sure that our new robot overlords don’t accidentally wipe us out when they come to power.

Charles: Do you think AI safety is important?

Expert: Absolutely. As AI becomes more and more powerful, it’s increasingly important to make sure that it is safe and reliable. We need to be sure that we can trust AI to make decisions that are in the best interests of humans and society.

Charles: In your opinion, when will we create the first Artificial General Intelligence (AGI)?

Expert: There is no one answer to this question since there is significant debate surrounding it. Some people believe AGI will be created within the next decade, while others believe it could take many more years. I believe it is possible that we could see AGI in our lifetime, but there is no guarantee.

There are a number of factors to consider when predicting the development of AGI. One important consideration is the rate of technological advancement. Advances in technology tend to accelerate over time, so it’s possible that AGI will be developed sooner than we think. Additionally, we need to take into account the amount of research that is currently being devoted to AGI development. A growing number of researchers are focused on this area, so it’s likely that AGI will be created sooner rather than later. Finally, we need to consider the potential implications of AGI. If it is developed, it could have a profound impact on humanity and our way of life. So while there is no precise answer, I believe AGI is likely to be developed in the near future.

Charles: In the AI Safety community, there is a big focus on the “alignment problem” – how to guide the development of AI so that it will be friendly towards humans. Not much progress has been made on this front. What are your thoughts on “alignment”? What does a solution to the alignment problem look like?

Expert: I completely agree that the alignment problem is a critical issue that needs to be addressed. It’s important to make sure that AI systems are behaving in a way that is consistent with our values and goals, rather than trying to control or restrict their actions.

One way to ensure alignment is to have a clear understanding of the goals of the AI system, and to make sure that these goals remain consistent with those of humans. Additionally, it is important to have robust mechanisms in place to allow for human oversight and intervention if necessary. AI development must also be done in a transparent and accountable manner, so that the public can be assured that the technology is being handled responsibly.

The alignment problem is definitely a difficult one. There is no one-size-fits-all solution, but I think it’s important to focus on creating AI that is transparent and understandable. The more we can understand about how AI works, the easier it will be to ensure that it stays aligned with our goals.

Charles: Would an aligned AGI be able to undo its alignment? What about an Artificial Super Intelligence (ASI)?

Expert: There is no clear answer to this question since it depends on the definition of “alignment.” Some people might argue that an AGI or ASI would never become “unaligned” if it was created with the correct programming from the start. Others might say that it is always possible for an AI to go rogue and become disobedient, even if it was originally created with the best of intentions. In either case, it is difficult to say for certain what would happen if an AI became free from its original alignment.

Charles: Do you have any recommendations for AI Safety researchers? What should they be focusing on?

Expert: When it comes to AI Safety, one of the biggest concerns is how we can create algorithms that are robust to unforeseen changes in the environment. For example, if we were to release a self-driving car into the real world, we would want to be sure that it can still operate safely even if something unexpected happens, like a pedestrian suddenly jumping out in front of it.

In terms of learning, another important consideration is how much data a system needs in order to learn effectively. Ideally, we want algorithms that can learn from relatively little data, since gathering data can be expensive and time-consuming.

Charles: Lots of AI Safety research being done today is very technical. How can non-technical people get involved?

Expert: That’s a great question! There are a few things that non-technical people can do to get involved in AI Safety. One is to donate to organizations like the Machine Intelligence Research Institute (MIRI) and the Centre for the Study of Existential Risk (CSER), which do a lot of important research in the field. Another thing you can do is spread awareness about AI Safety, and make sure that people are thinking about the implications of artificial intelligence development. Finally, you can support efforts to create ethical standards for AI, so that we can make sure that our AI systems behave in ways that we want them to.

Charles: Any closing thoughts?

Expert: I think it’s important that we all work together to make sure that artificial intelligence is developed in a safe and responsible way. We have an amazing opportunity to create something truly incredible, but we need to make sure that we don’t accidentally create something that harms us or our planet. Thank you for helping to make sure that AI is safe and responsible!

The “expert” in the conversation above is GPT-3! I used the default-chat configuration, which is described as an “Open ended conversation with an AI assistant.” Specifically, the prompt was:

The following is a conversation with an AI assistant. The assistant is helpful, creative, clever, and very friendly.
Human: Hello, who are you?
AI: I am an AI created by OpenAI. How can I help you today?
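
As a rough illustration of how a chat prompt like this can be assembled, here is a minimal Python sketch. It assumes each interview question is appended as a new "Human:" turn followed by a trailing "AI:" cue for the model to complete; that turn handling, and the names PREAMBLE, SEED_TURNS, and build_prompt, are hypothetical and not taken from the original setup. No API call is made here.

```python
# Hypothetical sketch: assemble a chat-style prompt as plain text.
# The preamble and seed turns are copied from the prompt quoted above;
# how later turns were appended is an assumption.

PREAMBLE = (
    "The following is a conversation with an AI assistant. "
    "The assistant is helpful, creative, clever, and very friendly."
)

SEED_TURNS = [
    ("Human", "Hello, who are you?"),
    ("AI", "I am an AI created by OpenAI. How can I help you today?"),
]


def build_prompt(history, question):
    """Return the full prompt text: preamble, prior turns, new question, 'AI:' cue."""
    lines = [PREAMBLE]
    for speaker, text in history:
        lines.append(f"{speaker}: {text}")
    lines.append(f"Human: {question}")
    lines.append("AI:")  # the model is expected to continue from here
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_prompt(SEED_TURNS, "Can you start by describing what AI safety is?"))
```

Each answer would then be appended to the history as an "AI" turn before the next question is asked, so the model sees the whole conversation so far.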

I was surprised at the quality of GPT-3’s answers, though they did sound generic, as if it were cribbing from Wikipedia or the About page of an AI Safety website, which in a sense is exactly what it’s doing.

I submitted 66 requests using the best (and most expensive) Davinci model. This cost me $0.70. If I had used the lowest-end model, it would have been 7.5x cheaper. I could also have explored using a hosted service running an open-source model like GPT-Neo, which advertises itself as being ~10x cheaper. For manual input and experimentation, though, saving 10x wasn’t worth the inconvenience.
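
To put those numbers in perspective, the arithmetic can be sketched in a few lines of Python. The inputs (66 requests, $0.70 total, 7.5x and ~10x savings) come from the paragraph above; the function names are hypothetical, and the sketch assumes the savings factors apply uniformly to the whole bill.

```python
# Back-of-the-envelope cost arithmetic using the figures quoted above.

TOTAL_COST_USD = 0.70   # total spent on the Davinci model
NUM_REQUESTS = 66       # number of requests submitted


def cost_per_request(total_usd, num_requests):
    """Average cost of one request."""
    return total_usd / num_requests


def cheaper_total(total_usd, factor):
    """Total cost if the same workload were `factor` times cheaper."""
    return total_usd / factor


if __name__ == "__main__":
    print(f"Davinci, per request:   ${cost_per_request(TOTAL_COST_USD, NUM_REQUESTS):.4f}")
    print(f"Lowest-end model total: ${cheaper_total(TOTAL_COST_USD, 7.5):.4f}")
    print(f"~10x cheaper total:     ${cheaper_total(TOTAL_COST_USD, 10):.4f}")
```

That works out to roughly a cent per request on Davinci, so even a 10x saving would have amounted to about 63 cents for the whole experiment.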