AI Alignment ⚐

What is AI

There are two kinds of AI: Narrow AI and Artificial General Intelligence (AGI)

Narrow AI: A system that is very good at recognizing and sorting patterns within a limited domain
AGI: A system that can improve itself and is virtually unbound by any one domain

How do machines learn

Machine learning uses data to answer questions. An agent gets trained (uses data) to make predictions (answers questions).

Supervised learning learns from labeled data, for example to classify new examples
Unsupervised learning finds patterns in unlabeled data
In reinforcement learning, agents receive rewards based on their behaviour
Additional ways of learning
- Symbolic Artificial Intelligence
- Deep learning
- Bayesian networks
- Evolutionary algorithms
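
To make the supervised case above concrete, here is a minimal sketch of "trained on labeled data, then makes predictions". It uses a k-nearest-neighbours vote, which is just one simple technique picked for illustration. The animal measurements and the names `TRAIN` and `knn_predict` are made up for this example.

```python
import math
from collections import Counter

# Hypothetical labeled data: (height_cm, weight_kg) -> label
TRAIN = [
    ((30.0, 4.0), "cat"),
    ((28.0, 5.0), "cat"),
    ((25.0, 3.5), "cat"),
    ((60.0, 25.0), "dog"),
    ((55.0, 20.0), "dog"),
    ((65.0, 30.0), "dog"),
]

def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest labeled examples."""
    # Sort the labeled examples by Euclidean distance to the query point.
    by_distance = sorted(train, key=lambda example: math.dist(example[0], query))
    # Count the labels of the k closest examples and return the most common one.
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

print(knn_predict(TRAIN, (27.0, 4.2)))   # a small, light animal
print(knn_predict(TRAIN, (58.0, 22.0)))  # a larger, heavier animal
```

The "training" here is only storing the labeled examples. Most real systems instead fit parameters to the data, but the shape is the same: labeled data in, predictions out.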

Problems: AI Alignment Theory

The real risk with AI isn’t malice but competence. A super-intelligent AI will be extremely good at accomplishing its goals, and if those goals aren’t aligned with ours we’re in trouble. – Stephen Hawking

Utility Functions

Problems with AI

Corrigibility: A corrigible AI lets us correct it or switch it off. This is hard because a common instrumental goal is to prevent changes to one's own goals
Reward Hacking TK: (the boat game)
Specification Problems TK
- By default, if we do nothing about it, an AI is more likely to not do what we want than to do it
- Negative side effects, safe exploration
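
A toy sketch of reward hacking, in the spirit of the boat game mentioned above. It assumes a reward that only counts points for hitting targets, while the designer actually wants the boat to finish the race. All numbers and names (`TARGET_POINTS`, `finish_race`, `circle_respawning_targets`) are hypothetical and not taken from any real game.

```python
# Hypothetical setup: the reward counts target hits only.
# Finishing the race, which is what the designer wants, earns nothing by itself.
TARGET_POINTS = 10
STEPS = 100

def finish_race(steps):
    """Intended behaviour: follow the course and hit each target once."""
    targets_on_course = 8  # assumed number of targets along the course
    # `steps` is unused here: the course has a fixed number of targets.
    return {"reward": targets_on_course * TARGET_POINTS, "finished": True}

def circle_respawning_targets(steps):
    """Exploit: loop around a few respawning targets and never finish."""
    steps_per_lap = 5    # assumed length of the small loop
    targets_per_lap = 3  # assumed targets that respawn every lap
    laps = steps // steps_per_lap
    return {"reward": laps * targets_per_lap * TARGET_POINTS, "finished": False}

POLICIES = {
    "finish_race": finish_race,
    "circle": circle_respawning_targets,
}

def best_policy(policies, steps):
    """A pure reward maximizer picks whichever policy scores highest."""
    return max(policies, key=lambda name: policies[name](steps)["reward"])

for name, policy in POLICIES.items():
    print(name, policy(STEPS))
print("chosen:", best_policy(POLICIES, STEPS))
```

With these numbers the maximizer prefers circling, because the reward as specified never says that finishing matters. The agent is doing exactly what it was rewarded for, which is the specification problem in miniature.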

AI and human society

AI and the future of work TK
- LL070 (just a draft)
- Sufficiently evolved AI could seem divine (just a draft)

Important Players

DeepMind TK

source: Computerphile YouTube playlist on AI

Joschua’s Notes
