Is KYC safe in 2026? Not centralised. The IDmerit breach exposed 1B records. The math behind sharded KYC and the attack-surface gap.
Is KYC safe in 2026 — the way most vendors run it? The arithmetic says no. On February 18, 2026, identity-verification vendor IDmerit disclosed a database leak exposing roughly 1 billion records across 26 countries. A few weeks earlier, Sumsub admitted an intrusion that had run undetected since July 2024. Two centralised KYC vendors. Nearly two combined years of unmonitored exposure. The question isn't whether centralised KYC vendors get breached. It's how the architecture you choose changes the blast radius when one does. This piece does the math.
Is KYC safe under the centralised vendor model?
The answer is structural, not anecdotal. A centralised KYC vendor's architecture works as follows. The customer uploads their passport, selfie, and proof of address. The vendor verifies the documents, computes a result, and stores both the result and the underlying documents in a database. Most regulated regimes mandate retention for five to seven years. Multiply that retention by every customer the vendor has ever onboarded for every operator on its client roster, and the resulting database is one of the densest concentrations of validated identity records ever assembled.
Three of those databases have been breached in the last 18 months. Read our analysis of the IDmerit data leak, Sumsub security breach lessons, and the identity breach epidemic 2026 analysis for the systemic argument.
The centralised model fails on two axes simultaneously. The single database is a high-value target. And the single point of compromise yields full reconstruction: a successful breach exposes the complete record per customer, not a fragment of it. That's not a security-implementation question; that's an architectural one.
What did the IDmerit and Sumsub breaches actually expose?
The numbers tell the story.
IDmerit, disclosed February 18, 2026. The exposed data set ran to roughly 1 billion records, including approximately 203 million US records, according to Cybernews. Field types included full legal names, home addresses, dates of birth, government-issued ID numbers, phone numbers, email addresses, telecom metadata, and KYC/AML verification logs. Public disclosure came 99 days after the unsecured database was first identified by researchers.
Sumsub, disclosed January 2026. The intrusion reportedly began in July 2024 and went undetected for approximately 18 months, per European reporting. Sumsub's customer list includes Bitget, Bitpanda, Bybit, Huobi, and Wirex — most of mid-cap crypto. The breach exposed contact data on French users; the full extent across the customer base remains under disclosure.
Three KYC vendor incidents in eighteen months. The pattern is not a sequence of unlucky outliers; it's the operating consequence of an architecture in which the vendor's database is the single point of trust.
How does Zyphe's architecture change the attack-surface math?
Zyphe runs the same verification a centralised vendor does — government ID, NFC chip read, biometric liveness, sanctions, PEP, address, source of funds. What changes is what happens to the documents after verification.
The architecture, in numbers:
- 60,000+ decentralised storage nodes, geo-distributed for data residency.
- Each verification record is fragmented across 100 shards.
- Reconstruction requires the user's key plus a configurable threshold (typically 29 of 100 shards).
- AES-GCM-256 encryption with keys held by the user, not the vendor. Zyphe has no master key, no backdoor, no way to reconstruct a customer's record without the user's explicit cryptographic consent.
- Threshold-encrypted audit access for regulators, requiring co-sign with the user before the verification record can be inspected.
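Zyphe doesn't publish its exact sharding scheme, so as an illustration only: a 29-of-100 threshold split is the shape of Shamir's secret sharing, where any 29 shares reconstruct the secret and 28 or fewer reveal nothing. A minimal sketch over a prime field (the field size, share count, and threshold here are taken from the numbers above; everything else is an assumption for illustration):

```python
import secrets

# A large prime field for the arithmetic (2**127 - 1, a Mersenne prime).
PRIME = 2**127 - 1

def split_secret(secret: int, n: int = 100, k: int = 29) -> list[tuple[int, int]]:
    """Split `secret` into n shares; any k of them reconstruct it."""
    # Random polynomial of degree k-1 with the secret as constant term.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(k - 1)]
    def eval_poly(x: int) -> int:
        acc = 0
        for c in reversed(coeffs):       # Horner evaluation mod PRIME
            acc = (acc * x + c) % PRIME
        return acc
    return [(x, eval_poly(x)) for x in range(1, n + 1)]

def reconstruct(shares: list[tuple[int, int]]) -> int:
    """Lagrange interpolation at x = 0 over the prime field."""
    total = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        total = (total + yi * num * pow(den, -1, PRIME)) % PRIME
    return total

secret = 0xDEADBEEF                      # stand-in for a record-encryption key
shares = split_secret(secret)
assert reconstruct(shares[:29]) == secret   # any 29 shares recover the secret
assert reconstruct(shares[:28]) != secret   # 28 shares reveal nothing useful
```

In a real deployment the shared secret would be the record's encryption key, not the record itself, and each share would live on a different storage node.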
The attack-surface comparison, made explicit:
- Centralised vendor: one database, complete verification records, vendor-held keys. One successful breach yields full identity records for every customer — the IDmerit pattern.
- Zyphe: 100 encrypted shards per record spread across 60,000+ nodes, a 29-shard reconstruction threshold, user-held keys. One compromised node yields a single undecryptable fragment.
For the architectural detail, see Decentralized PII Storage and Decentralized KYC.
What's the math on a single-node breach?
Quantified, so the procurement team can model it.
Each verification record is split into 100 shards. The threshold for reconstruction is 29. The shards are AES-GCM-256 encrypted with keys held by the customer. The shards are geographically distributed across 60,000+ nodes, with no single physical location holding a quorum.
Probability of meaningful reconstruction from a single compromised node: zero. A single shard is a 1/100 fragment of an encrypted record. Even if the attacker somehow defeats AES-GCM-256 on that single shard (which the cryptographic literature treats as computationally infeasible at current and projected GPU costs), the result is one one-hundredth of a customer's data with no key to decrypt the rest.
Probability of meaningful reconstruction from a multi-node breach without the user key: zero. The attacker would need to compromise at least 29 nodes simultaneously, defeat AES-GCM-256 on each, and still not have the customer's private key. Without the key, the threshold-decryption operation does not produce a recoverable record.
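"Compromise at least 29 nodes" understates how hard the quorum is to hit, because the attacker doesn't get to pick which nodes hold a given record's shards. Assuming each of the 100 shards sits on a distinct node and the attacker's compromised nodes are effectively random (both are our modelling assumptions, not a published Zyphe spec), the chance that a compromised set holds a 29-shard quorum for any one record is a hypergeometric tail:

```python
from math import comb

def p_quorum(nodes_compromised: int, total_nodes: int = 60_000,
             shards: int = 100, threshold: int = 29) -> float:
    """P(a random set of compromised nodes holds >= threshold shards of
    one record), assuming each shard is on a distinct node.
    This is the hypergeometric upper tail."""
    m, N, K, k = nodes_compromised, total_nodes, shards, threshold
    return sum(
        comb(K, i) * comb(N - K, m - i)   # comb() returns 0 when m-i > N-K
        for i in range(k, min(K, m) + 1)
    ) / comb(N, m)

# An attacker holding 1,000 of 60,000 nodes (1.7% of the network)
# still almost never holds a quorum for any given record:
print(p_quorum(1_000))   # prints a probability far below 1e-20
```

Even before accounting for the AES-GCM-256 layer and the user-held key, the quorum itself is out of reach for anything short of a near-total network takeover.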
Probability of compelled reconstruction via Zyphe alone: zero. The architecture is designed so that Zyphe cannot reconstruct customer records without user cooperation. Subpoenas served on Zyphe alone return the audit hash, not the document.
The math doesn't make breaches impossible. It makes them irrelevant. The exposed shards have no extractable payload. That is the architectural difference between "we encrypt at rest" (a centralised vendor's typical claim, which the IDmerit and Sumsub breaches showed is insufficient) and "we cannot reconstruct your data without you."
How should a crypto firm evaluate whether its KYC vendor is safe?
Five questions to ask in writing, before any procurement contract is signed:
- Where, exactly, do you store customer documents — and for how long? A centralised vendor will say "our cloud, encrypted at rest, for the regulated retention period." That answer is what IDmerit's customers received. Ask for the architectural diagram.
- What is your breach notification SLA? The Sumsub disclosure took 18 months. The IDmerit disclosure took 99 days from first identification. A 24-to-48-hour SLA in writing is now table stakes for any regulated procurement.
- Can you cryptographically prove that you cannot reconstruct customer records without the customer's consent? A centralised vendor will deflect. A vendor whose architecture makes the question unanswerable in your favour should be eliminated from the shortlist.
- What happens to my customer's data if your company is acquired or fails? Insolvency cascades create the worst kind of breach surface. A vendor whose database survives the company is a vendor whose database survives the customer's right to erasure.
- What's your worst breach case in the last 36 months? If the answer is "we don't comment," ask why. The answers from your shortlist will tell you which vendor's architecture is actually defensible.
For the deeper procurement framework, see our top compliance tools evaluation guide.
What does GDPR, MiCA, and FATF actually require for KYC data security?
The rules don't mandate the architecture; they mandate outcomes. The outcomes a centralised vendor can struggle to deliver are exactly the ones a sharded architecture delivers by default.
- GDPR Article 32 (security of processing). Requires "appropriate technical and organisational measures" including encryption, ability to ensure ongoing confidentiality, integrity, availability, and resilience. A breached vendor's defence rests on whether their measures were "appropriate" — a question supervisory authorities and courts get to decide retrospectively.
- GDPR Article 17 (right to erasure). Customers can demand deletion. Centralised vendors must execute the deletion across their database; sharded architectures execute via key revocation and the shards become noise.
- GDPR cross-border transfer restrictions (Chapter V, as interpreted in Schrems II). Geo-locked storage handles this without manual configuration per market.
- MiCA Article 70 (CASP record-keeping). Requires CASPs to maintain identity verification records; doesn't mandate that a single vendor hold them.
- FATF Recommendation 11. Five-year minimum record retention. The retention obligation sits with the obliged entity (the operator), not necessarily with a third-party vendor.
The framing matters: regulators want the records to exist and be auditable, not to live on a particular vendor's server. The architecture that delivers regulator-grade audit without the centralised honeypot satisfies the rule and removes the breach risk simultaneously.
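The erasure-by-key-revocation point above is sometimes called crypto-shredding, and it's worth seeing how little machinery it needs. A toy sketch, using a SHAKE-256 XOR keystream purely as a stand-in for the AES-GCM-256 the article describes (do not use this cipher in production; the structure, not the primitive, is the point):

```python
import hashlib
import secrets

def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy stream cipher: XOR data against a SHAKE-256 keystream.
    Stand-in for AES-GCM-256, which a real deployment would use."""
    stream = hashlib.shake_256(key).digest(len(data))
    return bytes(a ^ b for a, b in zip(data, stream))

record = b'{"name": "A. Customer", "passport": "X1234567"}'
key = secrets.token_bytes(32)            # held by the user, not the vendor

ciphertext = keystream_xor(key, record)
assert keystream_xor(key, ciphertext) == record   # with the key: recoverable

# GDPR Article 17 erasure = destroying the key. The ciphertext shards can
# stay on the storage nodes; without the key they are irrecoverable noise,
# so deletion never has to touch 60,000 nodes.
key = None
assert ciphertext != record              # stored bytes reveal nothing readable
```

The design consequence is the one the paragraph above states: the retention obligation (keep auditable records) and the erasure obligation (forget on demand) stop conflicting, because both are operations on the key, not the shards.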
What's the cost of choosing the wrong KYC architecture?
Three categories of cost, in increasing order of severity.
- The breach-incident cost. Average data breach cost reached USD 4.88 million in 2024, per IBM's tracking. For a regulated crypto operator, the multiplier above that baseline is meaningful — regulator-imposed fines, fiat-rail termination, customer churn.
- The procurement-cycle cost. Re-procuring a KYC vendor mid-cycle is not a quarter-long project; it's a year-long project. Every operator that had to migrate off a breached vendor in 2025 will tell you the same thing.
- The reputational cascade. Crypto teams whose KYC vendor was Sumsub had to communicate the breach to their customers. Some customers churned. Investor-facing data rooms now routinely ask "who's your KYC vendor and what's their breach history?" — a question with no good answer if the answer is one of the breached vendors named above.
The architectural choice that avoids all three is the same: decouple verification from PII storage. Read the systemic argument in identity breach epidemic 2026.
The bottom line
Is KYC safe? Not the way the centralised vendor model runs it. Three breaches in eighteen months at vendors with sophisticated security teams and substantial budgets prove the question is no longer "which vendor's controls are best." It's "which vendor's architecture removes the question."
Sharded user-controlled storage doesn't make breaches impossible. It makes them mathematically irrelevant. If the question belongs in your roadmap, book a 30-minute walkthrough and we'll run a real verification through the platform plus the cryptographic spec your security team needs to sign off on.
Edoardo Mustarelli (Sales Development Representative). Fintech/Web3 strategist at Zyphe, driving sales growth and partnerships with global expertise across technology, finance, and strategy.