OpenAI's New AI Models Promise Improved Alignment with Human Values
OpenAI has unveiled two new AI models, o3 and o3-mini, which the company says offer more advanced reasoning abilities than their predecessors. OpenAI also introduced a new safety paradigm, called 'deliberative alignment', designed to keep the models aligned with human values during inference. According to OpenAI's research, deliberative alignment improved o1's overall alignment with the company's safety principles, reducing the rate at which it answered 'unsafe' questions.