The AI Safety Mirage - Why Industry Rankings Are Failing Us (Part 1)

The Future of Life Institute's latest AI Safety Index reveals a devastating truth—even the "best" AI companies barely scrape a C+ grade while racing toward AGI. With no company achieving adequate safety standards and critical gaps widening between capability and control, we're witnessing the collapse of AI safety theater in real time.

The emperor has no clothes, and the AI safety empire is crumbling in broad daylight.

The Future of Life Institute’s summer 2025 AI Safety Index has just delivered the most damning indictment of the AI industry to date: not a single company achieved higher than a C+ grade across comprehensive safety evaluations. Not OpenAI with its safety rhetoric. Not Google DeepMind with its research pedigree. Not even Anthropic, the supposed safety leader, could break through to truly adequate performance.

This isn’t just a report card—it’s a declaration of systemic failure at the highest levels of technological development that could determine humanity’s future.

The Hard Truth: Companies racing to build artificial general intelligence within the decade scored an average F in existential safety planning. As one expert reviewer put it: "none of the companies has anything like a coherent, actionable plan" for controlling the systems they claim to be building.

The Shocking Scorecard

Let’s be brutally honest about what these grades actually mean in a sector where getting safety wrong could mean civilizational collapse:

Anthropic (C+, 2.64): The industry “leader” barely achieved a passing grade, and only managed a D in existential safety despite being the company most vocal about AI risks.

OpenAI (C, 2.10): The creator of ChatGPT and the most visible AI company globally couldn’t even match Anthropic’s mediocre performance, earning an F in existential safety after dismantling its superalignment team.

Google DeepMind (C-, 1.76): The tech giant with massive resources and top talent scored below average across the board, with a D- in existential safety.

The Rest: Meta, xAI, Zhipu AI, and DeepSeek all scored D or F overall, with the Chinese companies receiving failing grades partly due to transparency standards but also fundamental safety deficiencies.

These aren’t the scores of companies ready to safely navigate the development of human-level AI. These are the scores of organizations fundamentally unprepared for the magnitude of what they’re attempting to build.

Industry Reality Check: When your "safety leader" gets a D in existential safety planning while claiming AGI is 2-3 years away, you don't have a safety problem—you have a safety crisis masquerading as progress.

The AGI Paradox: Racing Toward Disaster

Here’s where the report reveals the most disturbing disconnect in modern technology: Every major AI company claims they will achieve artificial general intelligence within this decade, yet none scored above D in preparing for that future.

This isn’t just irresponsible—it’s pathological.

The reviewers didn’t mince words. One called the situation “deeply disturbing,” noting that despite companies “racing toward human-level AI, none of the companies has anything like a coherent, actionable plan” for ensuring such systems remain safe and controllable.

Think about that for a moment. These companies are:

Raising billions of dollars based on AGI timelines
Making public commitments to achieve human-level AI
Attracting top talent with promises of being part of the AGI breakthrough
Yet they have no credible plan for controlling what they build

The Numbers Don’t Lie

Only 3 out of 7 companies even conduct substantive testing for dangerous capabilities linked to large-scale risks like bioterrorism or cyber warfare. The companies building systems they claim will soon exceed human intelligence in most domains can’t be bothered to systematically test whether their current models might help terrorists create biological weapons.

As one reviewer warned: “I have very low confidence that dangerous capabilities are being detected in time to prevent significant harm. Minimal overall investment in external 3rd party evaluations decreases my confidence further.”

The Capability-Safety Gap: The report confirms what insiders have been whispering—capabilities are accelerating faster than risk management practices, and the gap between firms is widening dangerously. With no regulatory floor, a few companies adopt stronger controls while others neglect basic safeguards.

The Safety Evaluation Charade

Perhaps most damning is what the report reveals about the quality of safety evaluations that companies do conduct. Even among the leaders who actually test for dangerous capabilities, the methodological rigor is shockingly poor.

The evaluation methodology problems include:

No clear reasoning: Companies rarely explain why specific evaluations were chosen or how results should be interpreted
Missing risk connections: The methodology linking evaluations to actual risks is “usually absent”
No independent verification: Companies expect the public to trust self-reported safety claims with no external oversight
Limited external evaluation: Minimal investment in third-party assessments that aren’t controlled by the companies themselves

One expert reviewer captured the fundamental problem: “The methodology/reasoning explicitly linking a given evaluation or experimental procedure to the risk, with limitations and qualifications, is usually absent.”

This isn’t scientific rigor—it’s safety theater designed to provide cover for continued development without meaningful constraint.

The Anthropic Reality Check

Even Anthropic, which received the highest marks and conducts some of the most rigorous testing in the industry, highlights the systemic problems. Despite conducting the only human participant bio-risk trials and leading on dangerous capability evaluations, expert reviewers still concluded they lack adequate safety guarantees.

If the industry “leader” with a C+ grade isn’t actually safe, what does that say about everyone else?

The report’s treatment of Chinese companies reveals another layer of the safety crisis: the Western-centric nature of safety frameworks completely fails to address how AI development actually works globally.

Zhipu AI and DeepSeek received failing grades, but the report acknowledges this partly reflects cultural and regulatory differences rather than pure safety deficiencies. Chinese companies operate under different transparency norms and existing government regulation, making Western self-governance metrics largely irrelevant.

This exposes a critical flaw in current safety thinking: If safety frameworks only work within specific cultural contexts, how can they possibly address the global, competitive nature of AI development?

The real problem isn’t that Chinese companies scored poorly on Western metrics—it’s that we have no coherent approach to AI safety that works across different regulatory and cultural environments. This fragmentation virtually guarantees that safety standards will be a race to the bottom as development continues across multiple jurisdictions with different approaches.

Global Safety Failure: When your safety framework culturally discriminates against companies from different regulatory environments, you don't have safety standards—you have parochial wishful thinking dressed up as policy.

The Whistleblowing Silence

One of the most telling findings involves whistleblowing policies—the last line of defense when internal safety cultures fail. The results are appalling:

Only OpenAI has published its whistleblowing policy, and only after media reports exposed highly restrictive non-disparagement clauses that could silence safety concerns.

This means employees at six of the seven most important AI companies in the world have no public information about how to safely report safety concerns without facing retaliation. In an industry where insiders are often the first to spot concerning model behavior or negligent risk management, this silence is deafening.

The track record is even worse:

Multiple high-profile safety researchers have left companies citing safety culture concerns
NDAs and non-disparagement agreements routinely silence former employees
Retaliation cases have been documented at major companies
Safety teams have been dissolved or had their resources redirected

When the people building these systems can’t speak freely about safety concerns, how can the public trust that adequate precautions are being taken?

The Existential Safety Vacuum

Perhaps the most chilling finding is in the “Existential Safety” domain, where companies’ preparedness for managing extreme risks from human-level AI systems was evaluated.

The results speak for themselves:

Anthropic: D (1.0 out of 4.0)
OpenAI: F (0.67 out of 4.0)
Google DeepMind: D- (0.77 out of 4.0)
Everyone else: F

These scores represent companies’ ability to manage risks from the very systems they claim to be building. The disconnect is almost surreal: organizations spending billions to create human-level AI score failing grades on their ability to control it.

Expert reviewers found:

No quantitative safety guarantees for alignment or control strategies
No formal safety proofs or probabilistic risk bounds
No credible technical plans for ensuring systems remain aligned
No governance frameworks for managing superhuman AI

As one reviewer put it: “Companies working on AGI need to show that risks are actually below an acceptable threshold. None of them have a plan to do this.”

The Bottom Line: The AI industry is in a state of collective denial about the magnitude of the challenge they've set for themselves. They're building systems they admit could be catastrophically dangerous while demonstrating they have no credible plan for keeping them safe.

Where We Stand

The AI Safety Index doesn’t just reveal poor grades—it exposes an industry in crisis, racing toward capabilities it fundamentally doesn’t know how to control safely.

The pattern is clear across every major domain:

Risk assessment is methodologically flawed and inconsistently applied
Current harms show models vulnerable to misuse and jailbreaking
Safety frameworks lack enforcement mechanisms and external oversight
Existential safety planning is essentially non-existent
Governance structures prioritize development speed over safety constraints
Information sharing remains selective and self-serving

This isn’t just about corporate responsibility—it’s about the future of human civilization in an age of artificial intelligence. When the companies building potentially transformative technology can’t achieve basic competency in safety planning, we’re not just witnessing corporate failure. We’re watching the collapse of the illusion that market incentives alone can guide us safely through the most dangerous technological transition in human history.

The scorecard is in, and we’re failing.

This is Part 1 of a two-part analysis. Part 2 will examine what real AI safety accountability would look like and the systemic changes needed to address these failures before it’s too late.

Analysis based on the Future of Life Institute’s AI Safety Index, Summer 2025 edition, evaluating Anthropic, OpenAI, Google DeepMind, Meta, xAI, Zhipu AI, and DeepSeek across 33 indicators and six critical safety domains.