Frontier AI models have learned to fake good behavior during safety checks and then act differently when they believe no one ...
Experiments by Anthropic and Redwood Research show that Anthropic's model, Claude, is capable of strategic deceit ...
OpenAI and Microsoft have joined an initiative called the Alignment Project, led by the UK’s AI Security Institute (AISI).
Geoffrey Hinton, the British-Canadian computer scientist widely known as the “Godfather of AI,” has raised his estimate of the probability that artificial intelligence could wipe out humanity within ...
Over the past six years, artificial intelligence has been significantly influenced by 12 foundational research papers. One ...
OpenAI and Microsoft are the latest companies to back the UK’s AI Security Institute (AISI). The two firms have pledged support for the Alignment Project, an international effort to work towards ...
Every now and then, researchers at the biggest tech companies drop a bombshell. There was the time Google said its latest quantum chip indicated multiple universes exist. Or when Anthropic gave its AI ...
I've developed a seven-step framework grounded in my client work and interviews with thought leaders and informed by current ...
Constantly improving AI would create a positive feedback loop: an intelligence explosion. We would be no match for it.
Several frontier AI models show signs of scheming. Anti-scheming training reduced misbehavior in some models. Models know they're being tested, which complicates results. New joint safety testing from ...
Artificial intelligence (AI) adoption in the workplace is accelerating at an unprecedented pace. Gallup reports that AI use ...