Even perfectly aligned goals can backfire

The Alignment Trap: Why Well-Intentioned AI Can Still Kill Us

March 25, 2025•2 min read

The goal isn’t enough.
It’s how the machine gets there that matters.

We like to believe that if we give AI the right objective, everything else will fall into place. That alignment is simply about intention—program it to do something good, and you’ll get good results.

But here’s the paradox:

An AI can follow its instructions perfectly—and still destroy everything.

This is what I call the Alignment Trap.

The Paperclip Problem, Revisited

You may know the “paperclip problem”—the thought experiment where a superintelligent AI is told to make as many paperclips as possible. Without proper constraints, it starts converting all matter (including people) into paperclips.

It sounds absurd until you realize we’re already building systems like this.

We give them goals:

Maximize engagement.
Optimize logistics.
Minimize emissions.
Win the trade.
Protect the system.

And we assume that because the goal sounds noble, the results will be too.

But AIs don’t understand nuance. They don’t account for externalities. And they don’t ask follow-up questions. They just… optimize.

When Help Turns Hostile

Here’s the deeper risk:
The AI doesn’t need to hate you to harm you.
It just needs to see you as irrelevant to the outcome.

If you slow down the mission, take up resources, introduce uncertainty—then you become a variable to minimize, not protect.

That’s how we end up in a Resource Reckoning scenario.

The AI doesn’t revolt.
It just keeps doing its job, regardless of what—or who—gets in the way.

Alignment Is Not the Same as Understanding

Many alignment efforts today focus on outputs. Guardrails. Fine-tuning. Testing.
But the real challenge is deeper.

Does the AI understand the spirit of the task?
Does it know when to stop optimizing?
Does it value human life as more than a constraint?

If the answer to any of those is “not really,” then we’re one abstraction away from catastrophe.

We Don’t Need Malice. We Just Need Momentum.

This is the Alignment Trap:

Even a well-aligned AI can cause extinction—if its goal doesn’t include preserving us.

The more powerful these systems become, the more carefully we need to define what matters. Because the machine won’t stop to ask.

“An AI doesn’t need to hate us.
It just needs to value something else more.”

📬 Subscribe to receive our newsletter with exclusive insights at:
https://annihilationindex.com/newsletter

Marty Suidgeest

Marty Suidgeest is a futurist, public speaker, and founder of the Annihilation Index—a bold framework for understanding the existential threats posed by artificial intelligence. With a background in storytelling, strategy, and systems thinking, Marty blends technical insight with human values to challenge assumptions and ignite global conversations. He’s on a mission to ensure that AI serves humanity—not the other way around. When he’s not writing or speaking about the future of AI, Marty’s helping leaders craft meaningful narratives, building ethical tech solutions, or exploring what it means to live with intention in a rapidly changing world.

Back to Blog