Federated Learning
A distributed machine learning approach where models are trained across multiple decentralized devices or servers holding local data, without exchanging raw data.
Also known as: FL, Collaborative Learning
Category: AI
Tags: ai, machine-learning, privacy, distributed-systems, training
Explanation
Federated learning is a machine learning paradigm where a model is trained collaboratively across multiple participants (devices, organizations, or servers) without centralizing their data. Instead of sending raw data to a central server, each participant trains a local copy of the model on their own data and shares only the model updates (gradients or parameters) with a central coordinator. This approach preserves data privacy while still enabling the creation of powerful shared models.
The concept was introduced by Google researchers McMahan et al. (published at AISTATS in 2017), initially motivated by the need to improve mobile keyboard predictions without collecting users' typing data on central servers. The foundational algorithm, Federated Averaging (FedAvg), proceeds in rounds: the coordinator distributes the current global model to participating clients, each client performs several epochs of local training on its own data, and the coordinator then collects the updated parameters and averages them (weighted by each client's dataset size) to produce a new global model. This process repeats until the model converges.
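The round structure above can be sketched in a few lines. This is a minimal, illustrative implementation assuming NumPy and a toy linear model; the function names (`local_update`, `fedavg`) are hypothetical, not from any federated learning library.

```python
import numpy as np

def local_update(weights, X, y, lr=0.1, epochs=5):
    """One client's local training: a few gradient-descent steps on a
    linear regression model (a stand-in for any differentiable model)."""
    w = weights.copy()
    for _ in range(epochs):
        grad = 2 * X.T @ (X @ w - y) / len(y)  # gradient of mean squared error
        w -= lr * grad
    return w

def fedavg(global_w, client_data, rounds=10):
    """Federated Averaging: broadcast the global model, train locally on
    each client, then average the returned weights (weighted by local
    dataset size) to form the next global model."""
    for _ in range(rounds):
        updates, sizes = [], []
        for X, y in client_data:          # raw (X, y) never leaves the client
            updates.append(local_update(global_w, X, y))
            sizes.append(len(y))
        total = float(sum(sizes))
        global_w = sum(n * w for n, w in zip(sizes, updates)) / total
    return global_w
```

Note that only the trained weight vectors cross the network; the coordinator never sees any client's `(X, y)` data.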
Federated learning comes in two main flavors. Cross-device federated learning involves training across many consumer devices like smartphones or IoT sensors, with potentially millions of participants, each holding small amounts of data. Cross-silo federated learning involves training across a smaller number of organizations (hospitals, banks, companies) that each hold substantial datasets but cannot share them due to privacy regulations, competitive concerns, or data sovereignty requirements.
Key challenges in federated learning include statistical heterogeneity (non-IID data across participants), systems heterogeneity (devices with different computational capabilities and network conditions), communication efficiency (reducing the bandwidth needed to share model updates), and security (defending against malicious participants who might try to poison the model or infer others' data from shared gradients).
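As a concrete illustration of the communication-efficiency challenge, clients often compress their updates before transmission, for example by sending only the largest-magnitude entries (top-k sparsification). The sketch below is an assumed minimal implementation; the function names are illustrative.

```python
import numpy as np

def topk_sparsify(update, k):
    """Keep only the k largest-magnitude entries of a model update.
    A client transmits just these (index, value) pairs instead of the
    full dense vector, cutting bandwidth roughly by a factor len/k."""
    idx = np.argsort(np.abs(update))[-k:]
    return idx, update[idx]

def densify(idx, values, size):
    """Server side: expand the sparse (index, value) pairs back into a
    dense update vector, with zeros for the entries that were dropped."""
    out = np.zeros(size)
    out[idx] = values
    return out
```

In practice, systems often accumulate the dropped (zeroed) residual locally and add it back into the next round's update so that small but persistent gradients are not lost forever.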
Privacy-preserving techniques often complement federated learning, since model updates themselves can leak information about the underlying data. Differential privacy adds calibrated noise to model updates to bound what can be learned about any individual data point. Secure aggregation uses cryptographic protocols so the server sees only the sum of the clients' updates, never an individual contribution. Homomorphic encryption allows computation directly on encrypted data, at a significant computational cost. Combined with federated learning, these techniques can provide formal privacy guarantees that sharing raw model updates alone does not.
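The differential-privacy step, clipping each client's update to a fixed norm and then adding Gaussian noise scaled to that norm, can be sketched as follows. This is a simplified illustration of the mechanism (in the style of DP-SGD), not a complete differentially private system; the function name and default parameters are assumptions.

```python
import numpy as np

def privatize_update(update, clip_norm=1.0, noise_mult=1.1, rng=None):
    """Clip a client's update to at most clip_norm in L2 norm, then add
    Gaussian noise proportional to that norm. Clipping bounds any one
    client's influence; the noise masks individual contributions.
    noise_mult trades privacy for utility (higher = more private)."""
    rng = rng or np.random.default_rng()
    norm = np.linalg.norm(update)
    clipped = update * min(1.0, clip_norm / max(norm, 1e-12))
    noise = rng.normal(0.0, noise_mult * clip_norm, size=update.shape)
    return clipped + noise
```

A full deployment would also track the cumulative privacy budget (epsilon) consumed across training rounds, which this sketch omits.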
Federated learning has found applications in healthcare (training diagnostic models across hospitals without sharing patient records), finance (fraud detection across banks), mobile computing (next-word prediction, voice recognition), and autonomous vehicles (sharing driving experiences without centralizing sensitive location data).