The Global AI Risk Assessment Convergence: What Engineers Need to Know
Editor's note (June 2026):
The "United States" section below rested on Executive Order 14110 — a Biden Administration order that had, in fact, already been revoked by the time this piece was published. Federal AI policy has since moved from oversight toward deregulation. I review that claim and the current US picture in a separate article: A Correction, in Public: Revisiting AI Oversight in the United States
What AI knows
The email was professional, direct, and absolutely devastating.
An AI-powered management system had analyzed declining performance metrics from a mid-level manager at a tech company and generated a termination notice. The system calculated that letting this person go would improve team efficiency by 12% based on historical data. It drafted the email, queued it for the manager's supervisor to review, and assigned an 89% confidence score to the recommendation.
The supervisor glanced at the confidence score, saw the detailed performance analysis, and clicked "approve." The email went out at 4:47 PM on a Friday.
What the AI didn't know—couldn't know—was that the manager's "declining performance" was because she'd been caring for her dying father for the past three months. What the AI didn't consider was that she was the only team member who understood the legacy codebase for their most critical product. What the AI couldn't predict was that her termination would trigger three other senior engineers to quit in protest, costing the company far more than the 12% efficiency gain.
The AI made its decision based on probabilities derived from patterns in data. It was statistically sound. It was technically correct. And it was catastrophically wrong.
This is why three major jurisdictions—the European Union, South Korea, and the United States—are converging on a similar principle: AI systems in high-risk domains must be designed so meaningful human oversight is possible and accountability is preserved. They can inform, analyze, and recommend. But in contexts where decisions involve significant human judgment—employment, healthcare, law enforcement—the architecture should ensure humans retain decision authority rather than merely rubber-stamping AI outputs.
The Pattern: Three Jurisdictions, One Risk Framework
European Union: AI Act (Regulation 2024/1689)
Entered into force: August 1, 2024
Enforcement: Phased implementation through 2026-2027
The EU's approach is comprehensive and prescriptive. The AI Act classifies systems into risk categories, with "high-risk AI" requiring strict oversight. These high-risk categories aren't random bureaucratic inventions—they reflect domains where AI failure has disproportionate consequences.
Critical infrastructure such as energy grids and water systems is high-risk because a single wrong decision can cascade into widespread disaster. Healthcare systems are high-risk because misdiagnosis or wrong treatment recommendations can kill. Employment systems are high-risk because they affect people's livelihoods and can perpetuate discrimination at scale. Law enforcement systems are high-risk because they involve fundamental rights: liberty, due process, the presumption of innocence.
For these high-risk systems, the EU mandates specific protections: risk management systems, data quality requirements, technical documentation, transparency, and crucially, human oversight. Article 14 requires that high-risk AI systems be designed to enable human oversight, including the ability to override AI decisions.
South Korea: Framework Act on AI Development and Trust (Law No. 20676)
Enacted: January 21, 2025
Enforcement: January 22, 2026
South Korea's law, enacted in January 2025, closely mirrors the EU's structure. The Korean framework defines "high-impact AI" as systems significantly affecting life, bodily safety, and fundamental rights. The domains are nearly identical: energy supply, healthcare and medical devices, transportation, employment and loan decisions, public services, even student evaluation.
The requirements are similarly aligned: transparency obligations, safety assessments, risk management, explainability, human supervision, and impact assessments on fundamental rights. Article 34 explicitly requires "human management and supervision of high-impact artificial intelligence."
The Korean law adds an interesting computational threshold: AI systems exceeding specified learning computation levels face additional safety obligations, recognizing that more powerful models pose greater risks.
United States: Executive Order 14110 and Sector-Specific Regulation
Federal framework: Executive Order issued October 2023
The United States hasn't enacted comprehensive AI legislation, but a regulatory pattern has emerged across sectors. Executive Order 14110, since revoked (see the editor's note above), required safety testing for foundation models and established the AI Safety Institute. The FDA regulates AI in medical devices. The EEOC enforces anti-discrimination law in AI hiring systems. The FTC pursues algorithmic fairness in consumer contexts.
The risk categories converge here too: safety-critical systems in healthcare and transportation, rights-impacting systems in employment and housing, security-critical applications. While enforcement is fragmented across agencies rather than unified in a single act, the underlying logic is the same—AI in high-stakes domains needs oversight.
Why the Convergence Reflects Both Technical Reality and Coordination
Three different political systems, three different legal traditions, three remarkably similar frameworks. This convergence reflects both shared technical realities and international coordination efforts. Organizations like the OECD have published AI Principles that influenced multiple jurisdictions. NIST frameworks inform regulatory thinking globally. International standards bodies facilitate knowledge sharing.
But the underlying technical risks are also genuinely similar across jurisdictions. Consider a medical diagnosis AI. In Brussels, Seoul, or San Francisco, the failure mode is identical: the model outputs an 85% confidence that a patient has cancer based on a CT scan. A doctor sees that number, anchors on it, and makes a diagnosis. But what does 85% actually mean?
In the AI's training data, 85% of similar-looking images were cancer. But "similar" is defined by features the model learned—features that might be correlated with cancer or might be spurious patterns from how a particular hospital's scanners work. The 85% assumes this patient's case is within the distribution of the training data. It assumes no adversarial manipulation of the image. It assumes the model's learned correlations are actually causal relationships.
All of these assumptions could be wrong, and the doctor looking at "85% confidence" has no way to verify them. The number creates false precision. It looks scientific. It feels authoritative. And it can lead to catastrophically wrong decisions if the human simply defers to the machine.
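One way to check what a stated confidence is worth is to measure calibration on held-out cases: among predictions made at roughly 85%, how many were actually correct? The sketch below computes a simple expected calibration error in plain Python. The function name, the ten-bin default, and the sample numbers are illustrative assumptions, not part of any regulation or product.

```python
from typing import List, Sequence, Tuple


def expected_calibration_error(confidences: Sequence[float],
                               outcomes: Sequence[int],
                               n_bins: int = 10) -> float:
    """Weighted average gap between stated confidence and observed accuracy.

    confidences: model scores in [0, 1]
    outcomes:    1 if the prediction turned out correct, else 0
    """
    if len(confidences) != len(outcomes) or not confidences:
        raise ValueError("need equal-length, non-empty inputs")

    bins: List[List[Tuple[float, int]]] = [[] for _ in range(n_bins)]
    for conf, hit in zip(confidences, outcomes):
        idx = min(int(conf * n_bins), n_bins - 1)  # a score of 1.0 lands in the last bin
        bins[idx].append((conf, hit))

    total = len(confidences)
    ece = 0.0
    for bucket in bins:
        if not bucket:
            continue
        avg_conf = sum(c for c, _ in bucket) / len(bucket)
        accuracy = sum(h for _, h in bucket) / len(bucket)
        ece += (len(bucket) / total) * abs(avg_conf - accuracy)
    return ece


# Hypothetical held-out set: 20 scans scored at 0.85, only 12 confirmed.
scores = [0.85] * 20
confirmed = [1] * 12 + [0] * 8
gap = expected_calibration_error(scores, confirmed)  # about 0.25
```

A low value only says the scores were honest on that held-out set. It says nothing about a patient whose scan falls outside the training distribution, which is the failure described above.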
This is why jurisdictions worldwide are requiring human oversight. Not because politicians don't understand technology, but because engineers who've deployed these systems understand their limitations.
The Fundamental Problem: Probability Versus Context
Here's a real case that shows why AI can't make decisions alone.
In 2025, a financial services provider in Germany deployed an AI system to process credit card applications. The system analyzed applicant data—income, credit history, spending patterns—and calculated approval probabilities. When the probability fell below a certain threshold, the system automatically rejected applications and sent denial notices.
The system worked efficiently. It processed thousands of applications quickly. From a purely statistical standpoint, it was making reasonable decisions based on patterns in historical data.
But then came the complaints. Applicants who had demonstrated good creditworthiness—stable income, clean payment history—were being rejected without clear explanation. When they asked why, the company couldn't provide meaningful answers. The AI had made its decision based on a complex web of factors, and the company itself didn't fully understand the logic.
In September 2025, Germany's Hamburg Commissioner for Data Protection and Freedom of Information fined the company €492,000. The violation wasn't that the AI made wrong decisions—it was that the company had automated high-stakes decisions without ensuring human oversight and couldn't explain the reasoning to affected individuals.
Under GDPR Article 22, individuals have the right not to be subject to decisions based solely on automated processing when those decisions produce legal effects or similarly significantly affect them. Credit decisions clearly qualify. The company had violated this by letting the AI make decisions autonomously without meaningful human review and without the ability to provide clear explanations.
What the AI couldn't consider: maybe an applicant had a temporary income dip due to medical leave but was now recovered. Maybe credit history looked thin because the person was young and building credit responsibly. Maybe the spending patterns that triggered rejection were actually signs of responsible financial management in a different cultural context.
The AI optimized over the data it could see. But it lacked the context a human loan officer would naturally consider: the full story behind the numbers, the trajectory of someone's financial situation, the reasonable explanations for what looked like red flags in the data.
The €492,000 fine wasn't just a penalty—it was regulatory enforcement of a principle: in high-stakes decisions affecting people's lives, AI can inform but cannot decide alone. The company's failure wasn't technical. It was architectural. They built a system where AI made decisions and humans merely administered the outcomes, rather than a system where AI provided analysis and humans made informed decisions.
The Training Data Problem: When Verification Becomes Impractical
Here's a challenge with modern AI systems, especially large foundation models: full verification of training data becomes increasingly impractical at scale.
Imagine a hiring AI used by a major corporation. It's been trained on ten years of hiring data—thousands of resumes, interview scores, and performance evaluations. The model learns patterns: certain keywords in resumes correlate with successful hires, certain educational backgrounds predict performance, certain interview response patterns indicate strong candidates.
Now imagine that somewhere in those ten years of training data, subtle bias crept in. Maybe one hiring manager consistently gave lower scores to candidates from certain universities. Maybe the performance evaluation system inadvertently disadvantaged people who took parental leave. Maybe the training data reflects historical discrimination that the company has been trying to correct.
The AI learns these patterns faithfully. It doesn't know they're biased—it just knows they correlate with the labels it was given. When the model outputs "78% probability this candidate will succeed," that number reflects all the biases baked into the training data.
Can you audit the training data to find the bias? In some regulated systems with carefully controlled training pipelines—yes, verification is possible. But for large-scale foundation models trained on billions of web-scraped examples, practical verification becomes infeasible. You'd need to review massive datasets, understand complex feature interactions, and identify which learned patterns are legitimate versus which are proxies for discrimination.
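For pipelines that are small enough to audit, a first screening step is to compare selection rates across groups. This is a minimal sketch, assuming decisions are available as (group, selected) pairs. The 0.8 cutoff follows the four-fifths rule of thumb from US employment-selection guidance; the function names and sample data are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def selection_rates(records: Iterable[Tuple[str, bool]]) -> Dict[str, float]:
    """Fraction of candidates selected, per group."""
    counts = defaultdict(lambda: [0, 0])  # group -> [selected, total]
    for group, selected in records:
        counts[group][0] += int(selected)
        counts[group][1] += 1
    return {group: sel / tot for group, (sel, tot) in counts.items()}


def flag_disparities(rates: Dict[str, float],
                     threshold: float = 0.8) -> Dict[str, float]:
    """Return groups whose rate is below `threshold` times the highest rate."""
    best = max(rates.values())
    if best == 0:
        raise ValueError("no group has any selections; ratios are undefined")
    return {g: r / best for g, r in rates.items() if r / best < threshold}


# Hypothetical model outputs: group A selected 40 of 100, group B 24 of 100.
sample = ([("A", True)] * 40 + [("A", False)] * 60 +
          [("B", True)] * 24 + [("B", False)] * 76)
flagged = flag_disparities(selection_rates(sample))  # group B, ratio about 0.6
```

A ratio above the cutoff is not evidence that the model is fair. An output-level check like this cannot see proxy features or learned interactions; it only catches disparities large enough to show up in aggregate rates.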
Now add a more sinister possibility: what if someone intentionally poisoned the training data? It wouldn't take much. A few hundred carefully crafted fake examples, mixed into a training set of thousands, could teach the model subtle discriminatory patterns that are nearly impossible to detect.
The model would work perfectly in testing. It would pass fairness audits. But it would systematically discriminate in ways that are invisible without exhaustive auditing of every training example and every learned feature interaction.
This is why jurisdictions worldwide are emphasizing human oversight, particularly for systems where training data provenance is difficult to verify. Not because verification is impossible in principle, but because it's often impractical at the scale of modern AI systems.
The regulations have teeth. The Hamburg fine described earlier was imposed for failing to meet transparency obligations in automated credit card application decisions: the company used algorithms to reject applications but couldn't adequately explain the logic behind those decisions to applicants. GDPR Article 22 permits solely automated decisions with significant effects only where proper safeguards, including human review, are in place. The Hamburg case demonstrated that having an AI make credit decisions without meaningful human oversight and explainability isn't just bad engineering. It's illegal under current EU law.
Why "Human Oversight" Isn't Enough
The regulations all require "human oversight" for high-risk AI systems. This sounds reassuring. But in practice, human oversight often becomes rubber-stamping.
Picture a loan officer at a bank. The AI system has processed a mortgage application and returns: "Deny. Default risk: 73%." The officer has thirty seconds per application to review. She sees the 73% risk score, sees that it's above the 70% threshold, and clicks "Deny."
Is this human oversight? Technically yes. Is the human actually making the decision? Not really. She's deferring to the machine because she has neither the time nor the information to second-guess a 73% risk score that seems to be based on sophisticated analysis of credit history, income patterns, and regional economic data.
This is the automation bias problem. When humans see a number from a system they perceive as expert, they anchor on that number. The AI output becomes the decision, and the human becomes a rubber stamp.
Real human oversight requires more than a human in the loop. It requires:
- The human must have sufficient information to actually evaluate the AI's reasoning. Not just "73% default risk" but "73% based on these factors: debt-to-income ratio 42%, employment history shows three job changes in five years, neighborhood default rate 8%." Information the human can verify against their own knowledge.
- The human must have the time and incentive to actually think critically. If the system incentivizes quick approvals, humans will defer to AI recommendations. If the system holds humans accountable for outcomes, they'll think harder about overriding the AI.
- The human must have decision-making authority, not just review authority. They need to be able to say "I see why the AI recommends denial, but I know this applicant's job changes were career advancement, not instability" and override the decision.
- Most importantly, the system must be designed so that the AI informs but never decides. The loan application doesn't go to "AI: recommend, human: approve or override." It goes to "AI: provide analysis, human: make decision." The subtle difference matters enormously.
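Rubber-stamping leaves traces in review logs: near-zero override rates combined with very short review times. The sketch below shows one way to surface that signal per reviewer. It assumes the system records what the AI suggested, what the human decided, and how long the case was open. The `ReviewEvent` fields and both thresholds are hypothetical and would need tuning per workflow.

```python
from dataclasses import dataclass
from statistics import median
from typing import Dict, List


@dataclass(frozen=True)
class ReviewEvent:
    reviewer: str
    ai_output: str         # what the system suggested, e.g. "deny"
    human_decision: str    # what the reviewer actually decided
    seconds_spent: float   # how long the case was open before the decision


def rubber_stamp_signals(events: List[ReviewEvent],
                         min_override_rate: float = 0.02,
                         min_median_seconds: float = 60.0) -> Dict[str, dict]:
    """Per-reviewer override rate and median review time, with a warning flag."""
    by_reviewer: Dict[str, List[ReviewEvent]] = {}
    for event in events:
        by_reviewer.setdefault(event.reviewer, []).append(event)

    report: Dict[str, dict] = {}
    for reviewer, evs in by_reviewer.items():
        overrides = sum(e.ai_output != e.human_decision for e in evs)
        override_rate = overrides / len(evs)
        median_seconds = median(e.seconds_spent for e in evs)
        report[reviewer] = {
            "cases": len(evs),
            "override_rate": override_rate,
            "median_seconds": median_seconds,
            # Flag only when BOTH signals point the same way.
            "flag": (override_rate < min_override_rate
                     and median_seconds < min_median_seconds),
        }
    return report
```

A flag is a prompt to look at workload and incentives, not proof of negligence. A reviewer may agree with the system often because the system is often right.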
The Context AI Cannot See
Let me share one more story, because it captures something crucial about why AI needs human judgment.
An AI email assistant is helping a project manager communicate with her team. The manager drafts an email: "Hey team, we need to talk about the delays on the Johnson project. I'm seeing slippage on three milestones and we need to get back on track. Let's schedule a meeting this week."
The AI assistant analyzes the email. It detects that the tone might be perceived as accusatory. It knows from training data that team morale is important. It suggests a revision: "Hi everyone! I noticed we're having some challenges with the Johnson timeline. Let's sync up this week to brainstorm solutions and get back on course together!"
The manager sees the suggestion. It sounds more positive. She clicks "use this version" and sends it.
What the AI didn't know: the delays are because one team member has been consistently missing deadlines while deflecting blame to others. The manager's original "we need to talk" language was deliberately firm because she needs to address performance issues directly. The AI's "softer" version undercuts her authority and signals that accountability doesn't matter.
Or flip the scenario: the delays are because the manager herself overcommitted the team to an impossible timeline. The original email's "we need to get back on track" wrongly places blame on the team. The AI's revision is actually more appropriate, focusing on collaborative problem-solving rather than finger-pointing.
The AI can't know which scenario is true. It sees patterns in language but doesn't understand the relationships, the history, the power dynamics, or the actual source of the problem. A human would know. A manager who understands her team would choose her words deliberately based on context the AI cannot access.
This is why high-risk decisions need humans. Not because humans are infallible, but because humans can consider context that no model can capture: the full history of a situation, the relationships between people, the long-term implications of actions, the ethical dimensions beyond what's optimizable in a loss function.
What Regulations Actually Require (And What They Miss)
The convergence across the EU, South Korea, and the US reflects a growing understanding of these limitations. The regulations mandate risk assessments, documentation, transparency, and human oversight. They recognize that AI systems in high-stakes domains need guardrails.
But there's a gap between what regulations require and what's actually needed for safe AI deployment.
Most regulations say "implement human oversight" without defining what that means. They don't distinguish between meaningful human decision-making and rubber-stamp approval. They don't specify how much information the human needs to make an informed decision. They don't address automation bias or the incentive structures that lead humans to defer to machines.
Most regulations require "explainability" without acknowledging that explanations can be misleading. A model that says "I predicted cancer based on these features in the image" is providing an explanation, but is it a useful explanation? Can the doctor verify whether those features are actually diagnostic or just correlations the model learned? Can the doctor assess whether this patient's case is within the model's training distribution?
Most regulations require "risk assessment" and "safety testing" without fully addressing a fundamental challenge: you can't exhaustively test a probabilistic system the way you can test deterministic software. Statistical validation, adversarial testing, and robustness testing provide meaningful assurance, but complete verification remains elusive. You can't guarantee that the training data wasn't subtly poisoned. You can't anticipate every failure mode the model might exhibit in deployment.
And crucially, most regulations don't address the authentication and authorization problem for AI agents. When an AI agent makes fifty autonomous tool calls to complete a user's request, who authorized each action? How do you audit the decision chain? What's the human accountability structure? The Model Context Protocol that many AI systems are built on leaves much of authentication and authorization to the host and transport layers, so practice varies from one implementation to the next. This isn't a regulatory failure. Technology is simply moving faster than policy.
What's Actually Needed: Matching Autonomy to Risk and Context
Here's a principle that should guide both regulation and engineering practice: in high-stakes, context-dependent domains involving human judgment, AI systems should be designed so they inform decisions rather than make them autonomously. They can analyze, recommend, and provide structured information. But in domains like employment, healthcare decisions, or loan approvals—where context, relationships, and human judgment matter—the architecture should preserve meaningful human decision authority.
This doesn't mean AI can never operate autonomously. Aircraft autopilot systems, industrial safety controls, and fraud detection systems operate autonomously because their decision scope is constrained and failure modes are bounded. The key is matching autonomy to risk and controllability.
This isn't about adding a human approval button that everyone clicks without thinking. It's about fundamentally different system architecture.
Wrong architecture: AI analyzes the medical scan, makes a diagnosis, queues the treatment plan, waits for doctor to click "approve."
Right architecture: AI analyzes the scan, highlights regions of concern, shows similar cases from training data, provides confidence calibration, presents this information to the doctor who then makes the diagnosis based on the AI analysis plus patient history, symptoms, and clinical judgment. (Distributed Systems Part 2 provides the engineering patterns for this architecture — verification wrappers, semantic circuit breakers, and escalation hierarchies that implement human oversight as system design, not checkbox compliance.)
Wrong architecture: AI evaluates job candidate, calculates 67% success probability, recommends "reject," waits for hiring manager to approve or override.
Right architecture: AI analyzes resume and interview, identifies relevant experience, flags potential concerns, compares to historical data, presents this analysis to hiring manager who makes the hiring decision based on AI insights plus knowledge of team dynamics, organizational fit, and growth potential.
The difference is subtle but fundamental. In the wrong architecture, the AI makes the decision and the human rubber-stamps it. In the right architecture, the AI provides information and the human makes the decision.
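The difference can be enforced in the type system. In the sketch below, the model's output type has no outcome field at all, and a decision can only be constructed with a named human and a written rationale. All class names, field names, and the 20-character rationale minimum are hypothetical choices for illustration.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Factor:
    name: str
    value: str
    source: str  # where the reviewer can verify this independently


@dataclass(frozen=True)
class Analysis:
    """What the model may produce: evidence and caveats, never an outcome."""
    subject_id: str
    factors: Tuple[Factor, ...]
    concerns: Tuple[str, ...]
    caveats: Tuple[str, ...]


@dataclass(frozen=True)
class Decision:
    """Only created through decide(), which requires a human and a rationale."""
    subject_id: str
    outcome: str
    decided_by: str
    rationale: str
    extra_context: Tuple[str, ...]


def decide(analysis: Analysis, *, decided_by: str, outcome: str,
           rationale: str, extra_context: Tuple[str, ...] = ()) -> Decision:
    if not decided_by.strip():
        raise ValueError("a decision needs a named, accountable human")
    if len(rationale.strip()) < 20:
        raise ValueError("rationale is too short to show independent judgment")
    return Decision(analysis.subject_id, outcome, decided_by,
                    rationale.strip(), tuple(extra_context))


# Hypothetical hiring case.
analysis = Analysis(
    subject_id="candidate-117",
    factors=(Factor("years_experience", "6", "resume"),
             Factor("job_changes_5y", "3", "resume")),
    concerns=("short tenure at last two employers",),
    caveats=("few comparable profiles in historical data",),
)
decision = decide(
    analysis,
    decided_by="hiring.manager@example.com",
    outcome="advance",
    rationale="Job changes were promotions; references confirm strong delivery.",
    extra_context=("reference call notes",),
)
```

Because `Analysis` carries no recommendation, there is nothing for the reviewer to click through. The length check is a crude stand-in; it cannot tell a thoughtful rationale from padded text.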
The Authentication Gap: Where Regulations Fall Short
Here's a problem that regulations acknowledge but don't solve: how do you ensure accountability when AI systems act autonomously?
Consider an AI assistant that helps an executive manage their schedule and communications. The executive says "I need to free up some time next week for the board meeting." The AI agent analyzes the calendar, identifies lower-priority meetings, drafts cancellation emails to those meeting organizers, and sends them.
Who authorized canceling those specific meetings? The executive said "free up time," but didn't review which meetings would be canceled. If one of those cancellations damages an important business relationship, who's accountable? The executive who gave the high-level instruction? The AI vendor who built the agent? The system designer who decided the agent could send emails autonomously?
The regulations say "ensure human oversight" but don't specify how to implement that oversight when AI agents are making dozens of micro-decisions autonomously. They don't address the authentication problem: how does the calendar system know whether the AI agent is authorized to cancel meetings on the executive's behalf? They don't solve the audit problem: how do you log and review chains of autonomous decisions?
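One possible way to scope a high-level instruction like "free up time" is a per-tool policy: read-only tools run automatically, side-effecting tools need the human to confirm each call, and anything unlisted is blocked. The sketch below is a minimal illustration. The tool names, the `POLICY` table, and the `approve` callback are all hypothetical, and real tool dispatch is left out.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class ToolCall:
    tool: str
    args: dict
    reason: str  # the agent's stated reason, shown to the human


# Hypothetical policy: "auto" runs without asking, "confirm" asks per call.
POLICY: Dict[str, str] = {
    "read_calendar": "auto",
    "draft_email": "auto",
    "send_email": "confirm",
    "cancel_meeting": "confirm",
}


def run_plan(plan: List[ToolCall],
             approve: Callable[[ToolCall], bool]) -> Dict[str, List[ToolCall]]:
    """Split a plan into executed and blocked calls according to POLICY."""
    result: Dict[str, List[ToolCall]] = {"executed": [], "blocked": []}
    for call in plan:
        mode = POLICY.get(call.tool, "deny")  # unknown tools are never run
        if mode == "auto" or (mode == "confirm" and approve(call)):
            result["executed"].append(call)   # real tool dispatch would go here
        else:
            result["blocked"].append(call)
    return result


# Hypothetical plan produced from "free up some time next week".
plan = [
    ToolCall("read_calendar", {"week": "next"}, "find candidate meetings"),
    ToolCall("cancel_meeting", {"meeting": "weekly sync"}, "low priority"),
    ToolCall("cancel_meeting", {"meeting": "client review"}, "overlaps board prep"),
    ToolCall("delete_file", {"path": "notes.txt"}, "cleanup"),
]
outcome = run_plan(plan, approve=lambda c: c.args.get("meeting") == "weekly sync")
```

Here the executive confirms one cancellation and declines the other, so the question of who authorized each action has a recorded answer. Per-call confirmation does not scale to fifty tool calls, which is why the policy has to separate reversible reads from consequential writes.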
The Model Context Protocol, which many AI systems use to expose tools and data to agents, intentionally delegates authentication and authorization to the host and transport layers rather than defining a protocol-level mechanism, allowing integration with existing systems such as OAuth, API keys, or enterprise identity infrastructure. This creates a compliance gap: regulations require "appropriate security measures," but what's appropriate for AI agents accessing sensitive data varies by implementation, and best practices are still emerging.
This is where engineers need to lead. Regulations can set principles—human accountability, security, auditability—but engineers need to build the actual systems that implement those principles. We need standard authentication protocols for AI agents. We need audit logging that captures not just what actions were taken, but why the agent decided to take them. We need clear attribution: which human authorized this chain of autonomous decisions?
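As a starting point for that kind of logging, the sketch below records each agent action together with the agent's stated rationale and the human whose instruction it traces back to, and chains the entries with SHA-256 hashes so later edits are detectable. The `AuditLog` class and its field names are hypothetical. This is an illustration of the idea, not a standard or an existing protocol.

```python
import hashlib
import json
from datetime import datetime, timezone
from typing import List, Optional

GENESIS = "0" * 64  # placeholder hash for the first entry


def _digest(body: dict) -> str:
    payload = json.dumps(body, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


class AuditLog:
    def __init__(self) -> None:
        self.entries: List[dict] = []

    def append(self, *, agent_id: str, action: str, rationale: str,
               authorized_by: str, instruction: str,
               timestamp: Optional[str] = None) -> dict:
        """Record what was done, why, and which human instruction it serves."""
        if not authorized_by.strip():
            raise ValueError("every agent action needs an accountable human")
        body = {
            "timestamp": timestamp or datetime.now(timezone.utc).isoformat(),
            "agent_id": agent_id,
            "action": action,
            "rationale": rationale,
            "authorized_by": authorized_by,
            "instruction": instruction,
            "prev_hash": self.entries[-1]["hash"] if self.entries else GENESIS,
        }
        entry = {**body, "hash": _digest(body)}
        self.entries.append(entry)
        return entry

    def verify(self) -> bool:
        """Recompute every hash and check that each entry links to the one before."""
        prev = GENESIS
        for entry in self.entries:
            body = {k: v for k, v in entry.items() if k != "hash"}
            if body["prev_hash"] != prev or _digest(body) != entry["hash"]:
                return False
            prev = entry["hash"]
        return True
```

The chain makes tampering evident only if the latest hash is also stored somewhere the agent cannot write. The recorded rationale is the agent's own account of its reasoning, which should be treated as a claim to review rather than as ground truth.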
The Uncomfortable Truth
Three jurisdictions, three regulatory frameworks, one underlying message: in high-stakes, context-dependent domains, AI systems need meaningful human oversight because their limitations are real and consequential.
This isn't AI skepticism or technophobia. It's clear-eyed assessment of where current AI systems excel and where they struggle. AI systems are probabilistic. They output distributions, not certainties. They learn patterns from training data that may be imperfectly auditable. They optimize over observable context that might miss crucial human factors. They can't always consider the full relational and emotional context that makes decisions actually good versus technically correct but catastrophically wrong.
The convergence across the EU, South Korea, and the US reflects recognition of these realities. The specific regulations differ, enforcement mechanisms vary, but the core insight is consistent: in domains where human judgment, context, and relationships matter, AI should augment human decision-making rather than replace it.
For engineers building AI systems, this means designing for human decision-making from the start. Don't build systems where AI makes decisions and humans approve them. Build systems where AI provides analysis and humans make decisions. Don't treat human oversight as a compliance checkbox. Build real accountability into system architecture.
For organizations deploying AI, this means understanding that the most powerful models aren't always the best choice for high-stakes applications. A less capable model with better explainability might be more appropriate than a black-box model with higher accuracy. A system that provides structured information to human decision-makers might be more effective than a system that automates decisions and asks humans to review them.
And for policymakers, this means recognizing that effective regulation requires more than saying "implement human oversight." It requires addressing the hard problems: How do you prevent automation bias? How do you ensure humans have information to make informed decisions? How do you audit autonomous AI agents? How do you attribute responsibility in complex AI systems?
The mathematics of AI is sound. The engineering is improving. The models are getting more capable. But some constraints remain: full verification of large-scale training data is often impractical, probabilistic systems can't be exhaustively tested like deterministic code, and context that humans naturally consider may be unavailable to the model unless explicitly provided through system design, retrieval augmentation, or integration with structured knowledge sources.
That's why regulations emphasize human oversight for high-risk AI in context-dependent domains. It's not bureaucracy. It's recognition that different types of decisions require different levels of human involvement. And engineers who understand where AI excels and where human judgment remains essential will build better, safer systems that actually work when deployed in the complex, contextual, relationship-driven world where humans live.
References
European Union
- European Parliament and Council. (2024). "Regulation (EU) 2024/1689 on Artificial Intelligence (AI Act)." Official Journal of the European Union, L 1689. https://artificialintelligenceact.eu/
- European Commission. (2024). "AI Act - Official Text." https://artificialintelligenceact.eu/the-act/
- European Commission. (2024). "AI Act - Annex III: High-Risk AI Systems." https://artificialintelligenceact.eu/annex/3/
South Korea
- Government of South Korea. (2025). "인공지능 발전과 신뢰 기반 조성 등에 관한 기본법" [Framework Act on the Development of Artificial Intelligence and Creation of a Trust Foundation]. Law No. 20676, enacted January 21, 2025, enforcement January 22, 2026. https://www.law.go.kr
- Min, H. (2025). "United Korean AI Act Bill: Contents & Comparison with EU AI Act (English)." LinkedIn Article. https://www.linkedin.com/pulse/united-korean-ai-act-bill-contents-comparison-eu-english-min-h98kc/
United States
- The White House. (2023). "Executive Order 14110 on Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence." Federal Register, Vol. 88, No. 210.
- Office of Management and Budget. (2024). "Memorandum M-24-10: Advancing Governance, Innovation, and Risk Management for Agency Use of Artificial Intelligence."
- National Institute of Standards and Technology. (2023). "AI Risk Management Framework (AI RMF 1.0)." NIST AI 100-1.
Enforcement Cases
- Hamburg Commissioner for Data Protection and Freedom of Information. (2025). "Administrative Fine for GDPR Violations in Automated Credit Decision-Making." September 30, 2025. https://www.clydeco.com/en/insights/2025/10/lessons-from-hamburg-commissioner-for-data-protect
- Court of Justice of the European Union. (2025). "Ruling on Automated Decision-Making and Data Subject Access (Dun & Bradstreet Austria GmbH)." Case C-203/22, February 27, 2025.
Shaped in collaboration with Claude, an AI assistant by Anthropic, during rainy Pacific Northwest afternoons where engineering problems meet philosophical questions.
