Verifiable and Privacy-Preserving Federated Learning Through Differential Privacy and Cryptographic Protocols

During my PhD in Computer Science, I conducted research on a central and unresolved problem in modern machine learning systems:

How can federated learning be made both privacy-preserving and verifiable, without relying on unrealistic trust assumptions?

This work explores the gap between theoretical privacy guarantees and their practical enforcement in distributed systems.

The objective is not only to protect data, but to go one step further:

ensure that privacy mechanisms are correctly applied,
reduce trust in the central server,
and make the entire learning process verifiable and auditable.

You can read more in these documents: [Thesis] · [Poster].

🌐 Federated Learning: Promise and Limitations

Federated Learning (FL) enables multiple participants to collaboratively train a model without sharing their raw data.

Instead of centralizing data:

each client trains locally,
and only model updates are shared with a central server.
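To make this loop concrete, here is a minimal sketch of federated averaging on a toy one-parameter regression. The function names (`local_update`, `fed_avg`), the synthetic data, and the hyperparameters are illustrative assumptions, not the setup used in the thesis.

```python
import random

def local_update(w, data, lr=0.1, epochs=5):
    """One client's local training: gradient steps on a 1-D least-squares fit y ~ w*x."""
    for _ in range(epochs):
        grad = sum((w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w

def fed_avg(client_models, client_sizes):
    """Server step: average client models, weighted by local dataset size."""
    total = sum(client_sizes)
    return sum(m * n for m, n in zip(client_models, client_sizes)) / total

random.seed(0)
TRUE_W = 2.0  # hypothetical ground-truth parameter for the toy data
clients = []
for n in (50, 80, 30):
    xs = [random.gauss(0, 1) for _ in range(n)]
    # Each client keeps its (x, y) pairs locally; they are never sent anywhere.
    clients.append([(x, TRUE_W * x + random.gauss(0, 0.01)) for x in xs])

global_w = 0.0
for _ in range(50):  # communication rounds
    local_models = [local_update(global_w, data) for data in clients]
    global_w = fed_avg(local_models, [len(d) for d in clients])
```

Only `local_models` crosses the network in this sketch. The server in `fed_avg` still sees every individual model in the clear.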

At first glance, this paradigm appears to solve privacy concerns.

However, this assumption is fundamentally incomplete.

⚠️ The Hidden Problem

Even without raw data sharing:

gradients can leak sensitive information,
model updates can be exploited through inference attacks,
and the central server remains a critical point of trust.

This reveals a key insight:

Federated Learning is only a first layer of privacy by design.
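Gradient leakage can be illustrated with a well-known special case: for a dense layer with a bias, the gradient computed on a single sample reveals that sample exactly. The sketch below assumes a squared-error loss and a batch of one. Real attacks on deeper models and larger batches are more involved, and all names here are hypothetical.

```python
def linear_grads(W, b, x, target):
    """Gradients of 0.5 * ||W x + b - target||^2 for ONE training sample."""
    out = [sum(wij * xj for wij, xj in zip(row, x)) + bi for row, bi in zip(W, b)]
    delta = [o - t for o, t in zip(out, target)]    # dL/d(out), which equals dL/db
    grad_W = [[d * xj for xj in x] for d in delta]  # dL/dW = delta * x^T
    return grad_W, delta

def reconstruct_input(grad_W, grad_b):
    """Recover x from the shared gradients: row i of grad_W equals grad_b[i] * x."""
    # Pick the row with the largest bias gradient to avoid dividing by ~0.
    i = max(range(len(grad_b)), key=lambda k: abs(grad_b[k]))
    return [g / grad_b[i] for g in grad_W[i]]

W = [[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]]
b = [0.1, -0.1]
private_x = [1.5, -2.0, 0.25]  # the client's "raw data"

grad_W, grad_b = linear_grads(W, b, private_x, target=[1.0, 0.0])
recovered = reconstruct_input(grad_W, grad_b)  # matches private_x up to rounding
```

Whoever observes `grad_W` and `grad_b` needs neither the weights nor the target to recover the input.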

🧠 Research Direction

This thesis is built on a central idea:

Privacy alone is not sufficient: it must be combined with verifiability.

Among existing techniques, Differential Privacy (DP) provides the only formal guarantee that remains valid after model release, because any post-processing of a differentially private output remains differentially private.

However:

DP introduces a privacy–utility trade-off,
and requires trusting the entity that applies the noise.

To address these limitations and reduce the trade-offs, this work explores the combination of:

Differential Privacy (DP)
Homomorphic Encryption (HE)
Zero-Knowledge Proofs (ZKPs)
Secret Sharing
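As a point of reference for the DP building block and its privacy–utility trade-off, here is a minimal sketch of the usual clip-then-add-noise step. It uses the classic Gaussian-mechanism calibration, which holds for 0 < epsilon < 1. Treating the clipping norm as the L2 sensitivity is an assumption, and the function names are hypothetical rather than taken from the thesis.

```python
import math
import random

def clip_l2(update, clip_norm):
    """Scale the update down so that its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(v * v for v in update))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [v * scale for v in update]

def gaussian_sigma(epsilon, delta, sensitivity):
    """Classic calibration: sigma = S * sqrt(2 ln(1.25/delta)) / epsilon, for 0 < epsilon < 1."""
    return sensitivity * math.sqrt(2 * math.log(1.25 / delta)) / epsilon

def privatize(update, clip_norm, epsilon, delta, rng):
    """Clip, then add Gaussian noise. Sensitivity = clip_norm is an assumption."""
    sigma = gaussian_sigma(epsilon, delta, clip_norm)
    return [v + rng.gauss(0.0, sigma) for v in clip_l2(update, clip_norm)]

rng = random.Random(0)  # demo only; not a cryptographically secure source
strong = gaussian_sigma(epsilon=0.1, delta=1e-5, sensitivity=1.0)
weak = gaussian_sigma(epsilon=0.5, delta=1e-5, sensitivity=1.0)
noisy_update = privatize([3.0, 4.0], clip_norm=1.0, epsilon=0.5, delta=1e-5, rng=rng)
```

The trade-off shows up directly in `strong > weak`: a smaller epsilon means more noise and a less accurate update. The trust issue is that nothing here lets an outside party check that `privatize` was actually called.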

🧩 Contributions

This thesis introduces three complementary contributions, each addressing a specific limitation of federated learning.

1. Reducing Trust with Homomorphic Encryption

A first framework introduces a secure aggregation tunnel using additive homomorphic encryption.

Client updates remain encrypted during transmission and aggregation
The server never accesses raw updates
Noise generation is decoupled from aggregation

This significantly reduces reliance on a trusted server.
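The idea of aggregating under encryption can be sketched with a toy Paillier cryptosystem, one well-known additively homomorphic scheme. The source does not say which scheme the framework uses, so Paillier is an assumption here, as are the fixed-point encoding, the tiny demo primes, and the premise that the decryption key sits with a party other than the aggregator. This is insecure at these parameter sizes and is for illustration only.

```python
import math
import secrets

SCALE = 10**6  # fixed-point scale for encoding floats as integers (assumption)

def keygen(p, q):
    """Simplified Paillier with g = n + 1. Returns public n and private (phi, mu)."""
    n = p * q
    phi = (p - 1) * (q - 1)
    return n, (phi, pow(phi, -1, n))

def encrypt(n, value):
    m = round(value * SCALE) % n  # negative values wrap around mod n
    while True:
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            break
    n2 = n * n
    return (1 + m * n) * pow(r, n, n2) % n2  # (n+1)^m * r^n mod n^2

def add_encrypted(n, c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts."""
    return c1 * c2 % (n * n)

def decrypt(n, priv, c):
    phi, mu = priv
    m = (pow(c, phi, n * n) - 1) // n * mu % n
    if m > n // 2:  # map back to a signed value
        m -= n
    return m / SCALE

# Demo primes (two Mersenne primes); far too small for real use.
n, priv = keygen(2**31 - 1, 2**61 - 1)
client_updates = [[0.25, -1.5], [1.0, 0.5], [-0.75, 2.0]]
encrypted = [[encrypt(n, v) for v in u] for u in client_updates]

# The aggregator works on ciphertexts only and never needs `priv`.
agg = encrypted[0]
for enc in encrypted[1:]:
    agg = [add_encrypted(n, a, c) for a, c in zip(agg, enc)]

total = [decrypt(n, priv, c) for c in agg]  # done by the key holder
```

Only the sum is ever decrypted in this sketch, so individual updates stay hidden from the aggregator as long as it does not hold `priv`.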

2. Verifying Differential Privacy with Zero-Knowledge Proofs

A second contribution introduces a non-interactive verifiability protocol based on:

zk-SNARKs
cryptographic hash commitments

This allows:

proving that differential privacy was correctly applied
without revealing either the data or the noise

This is a key step toward auditable machine learning systems.
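The commitment half of such a protocol can be sketched with a salted SHA-256 hash. This is only an illustration of commit and open under assumed names (`commit`, `verify_opening`) and an assumed encoding of the committed value. Opening a commitment reveals the value, so the zero-knowledge part (proving a statement about the committed value without opening it) requires a zk-SNARK circuit that is not shown here.

```python
import hashlib
import hmac
import secrets

def commit(value: bytes):
    """Bind to `value` now without revealing it. Returns (digest, nonce)."""
    nonce = secrets.token_bytes(32)  # fixed length keeps nonce || value unambiguous
    digest = hashlib.sha256(nonce + value).hexdigest()
    return digest, nonce

def verify_opening(digest: str, nonce: bytes, value: bytes) -> bool:
    """Check that (nonce, value) opens the previously published digest."""
    expected = hashlib.sha256(nonce + value).hexdigest()
    return hmac.compare_digest(expected, digest)

# Hypothetical payload: a serialized noise seed the prover commits to up front.
noise_seed = b"noise-seed-42"
digest, nonce = commit(noise_seed)  # `digest` is published, `nonce` stays private

ok = verify_opening(digest, nonce, noise_seed)
tampered = verify_opening(digest, nonce, b"noise-seed-43")
```

The random nonce hides the committed value, and the hash prevents the prover from swapping in a different value after the fact.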

3. ProoFed: A Unified Framework

The final contribution is ProoFed, a distributed framework that combines:

Differential Privacy
Secret Sharing
Verifiable aggregation

Key properties:

Noise generation is decentralized across clients
No single entity controls privacy
Aggregation is verifiable in zero-knowledge

This removes single points of trust while preserving scalability.
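A minimal sketch of two of these ingredients, decentralized noise and additive secret sharing, is given below. It is not the ProoFed protocol itself: the field modulus, the fixed-point encoding, the number of aggregators, and all function names are assumptions, and the verification layer is omitted. Each client adds Gaussian noise with standard deviation sigma_total / sqrt(n_clients), so that the independent noise terms sum to a variance of sigma_total squared.

```python
import math
import random
import secrets

PRIME = 2**61 - 1  # field modulus (assumption)
SCALE = 10**6      # fixed-point scale (assumption)

def share(value, n_shares):
    """Split a float into additive shares that sum to its encoding mod PRIME."""
    secret = round(value * SCALE) % PRIME
    shares = [secrets.randbelow(PRIME) for _ in range(n_shares - 1)]
    shares.append((secret - sum(shares)) % PRIME)
    return shares

def reconstruct(shares):
    total = sum(shares) % PRIME
    if total > PRIME // 2:  # map back to a signed value
        total -= PRIME
    return total / SCALE

def noisy_share(update, sigma_total, n_clients, n_shares, rng):
    """Client side: add this client's slice of the noise, then secret-share."""
    local_sigma = sigma_total / math.sqrt(n_clients)
    return share(update + rng.gauss(0.0, local_sigma), n_shares)

rng = random.Random(0)  # demo only; not a cryptographically secure sampler
updates = [0.2, -0.1, 0.4, 0.3]  # one scalar update per client
per_client = [
    noisy_share(u, sigma_total=0.05, n_clients=len(updates), n_shares=3, rng=rng)
    for u in updates
]

# Aggregator j only ever sees the j-th share of every client.
partial_sums = [sum(column) % PRIME for column in zip(*per_client)]
noisy_total = reconstruct(partial_sums)  # sum of updates plus the combined noise
```

No single party in this sketch sees an individual update or controls the full noise. The calibration does assume that every client actually adds its slice, which is the kind of claim the verifiable aggregation step is meant to cover.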

🚀 Conclusion

This thesis demonstrates that:

Privacy-preserving machine learning must evolve into verifiable machine learning.

By combining cryptographic techniques with differential privacy, it is possible to design systems that are:

secure (data remains confidential)
trust-reduced (no single central authority has to be trusted)
verifiable (correctness can be proven)

More broadly, this work highlights a shift:

From “trust the system” → to “prove the system”.