Personalized federated learning for a better customer experience


Federated learning (FL) is a framework that allows edge devices (e.g., Alexa devices) to collaboratively train a global model while keeping customers’ data on-device. A standard FL system involves a cloud server and multiple clients (devices). Each device has its local data and a local copy of the machine learning (ML) model being served.

In each round of FL training, a cloud server sends the current global model to the clients; the clients train their local models using on-device data and send the models to the cloud; and the server aggregates the local models and updates the global model. FL also has a personalization branch, which aims to customize local models to improve their performance on local data.
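For a concrete picture of that loop, here is a minimal sketch of one round in the style of FedAvg, written in NumPy. The linear model, squared-error loss, and all hyperparameters are illustrative choices for this post, not the models or settings used in the paper.

```python
import numpy as np

def local_update(global_weights, data, lr=0.01, steps=10):
    """Run a few steps of gradient descent on one client's data.
    The 'model' here is a linear regressor with a mean-squared-error loss,
    purely for illustration."""
    w = global_weights.copy()
    X, y = data
    for _ in range(steps):
        grad = 2 * X.T @ (X @ w - y) / len(y)   # MSE gradient
        w -= lr * grad
    return w

def federated_round(global_weights, client_data):
    """One FedAvg-style round: broadcast the global model, train locally,
    then average local models on the server, weighted by dataset size."""
    local_models, sizes = [], []
    for data in client_data:
        local_models.append(local_update(global_weights, data))
        sizes.append(len(data[1]))
    sizes = np.array(sizes, dtype=float)
    weights = sizes / sizes.sum()
    return sum(w_k * m for w_k, m in zip(weights, local_models))

# Toy usage: three clients with deliberately heterogeneous linear data.
rng = np.random.default_rng(0)
clients = []
for shift in (0.5, 1.0, 2.0):
    X = rng.normal(size=(50, 3))
    y = X @ np.array([shift, -shift, 0.3]) + 0.1 * rng.normal(size=50)
    clients.append((X, y))

w_global = np.zeros(3)
for _ in range(20):
    w_global = federated_round(w_global, clients)
print(w_global)
```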

Illustration of a federated-learning system. Each edge device has its own local model, and it periodically communicates with the cloud server to update a shared global model. A personalization branch aims to customize local models to improve their performance on local data.

In many real-world applications, the local datasets for different clients may have heterogeneous distributions. In a paper we presented at the 36th Conference on Neural Information Processing Systems (NeurIPS), we show that a training procedure that accounts for that heterogeneity improves the efficiency and accuracy of both the local and global models in federated learning.


In particular, we consider two new measures of heterogeneity within an FL system: (1) intra-client uncertainty, which profiles the differences across time for a local model, and (2) inter-client uncertainty, which characterizes the differences between different users’ local models in the same time frame.

A larger inter-client uncertainty indicates a more heterogeneous local data distribution among the edge devices, which makes personalization more important. A larger intra-client uncertainty means that the local model (particularly, its learnable parameters) has large variations across different rounds of FL training. Accordingly, our method adjusts the local training configuration and the FL aggregation rule based on the two uncertainty values.
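One simple way to picture these two quantities, not necessarily the exact estimators used in the paper, is as parameter variances computed across rounds (intra-client) and across clients within a round (inter-client). The sketch below takes that reading, with flattened weight vectors standing in for local models.

```python
import numpy as np

def intra_client_uncertainty(param_history):
    """Variance of one client's (flattened) parameters across FL rounds:
    how much that client's local model moves over time."""
    stacked = np.stack(param_history)      # shape: (num_rounds, num_params)
    return float(stacked.var(axis=0).mean())

def inter_client_uncertainty(round_params):
    """Variance of (flattened) parameters across clients within one round:
    how much different clients' local models disagree at the same time."""
    stacked = np.stack(round_params)       # shape: (num_clients, num_params)
    return float(stacked.var(axis=0).mean())

# Toy usage with random weight vectors standing in for local models.
rng = np.random.default_rng(0)
history_client_0 = [rng.normal(size=10) for _ in range(5)]   # 5 rounds
round_3_models = [rng.normal(size=10) for _ in range(8)]     # 8 clients
print(intra_client_uncertainty(history_client_0))
print(inter_client_uncertainty(round_3_models))
```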

To evaluate our approach, we compared it with seven earlier FL algorithms on seven different datasets, spanning image and audio data. We found that our approach consistently delivered the highest accuracy, on both global and local models.

Accounting for uncertainty

Our intuition was that when training a local model, selecting a proper initial model and an appropriate number of training steps is critical to minimizing the training loss and thus achieving the desired personalization.

Optimization trajectories of a local model. The blue and red curves are the training curves obtained with two different initial models.

Our method, which we call Self-FL, is rooted in a theoretical analysis using Bayesian hierarchical models, in which the intra-client and inter-client uncertainties define different layers of the hierarchy. From the Bayesian analysis, we derive equations relating these two uncertainty measures to three local configuration factors: (1) the local initial model, which is used as the starting point of local model training; (2) the learning rate, which determines how dramatically a single training example can affect network weights; and (3) the early-stop rule, which determines when the training procedure should stop to prevent overfitting.
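The paper gives the exact equations; the sketch below only illustrates the qualitative direction they imply, using placeholder formulas rather than the ones derived in the paper. For instance, the local initial model is taken as an interpolation between the global model and the client's previous local model, and the learning rate and step budget shrink as the client's own training looks noisier.

```python
import numpy as np

def local_training_config(global_w, prev_local_w, intra_u, inter_u,
                          base_lr=0.01, max_steps=100):
    """Map the two uncertainty values to the three local configuration factors:
    (1) initial model, (2) learning rate, (3) early-stop budget.
    The specific formulas below are placeholders, not the paper's equations."""
    # More inter-client heterogeneity -> start closer to the client's own model.
    mix = inter_u / (inter_u + intra_u + 1e-12)
    init_w = mix * prev_local_w + (1.0 - mix) * global_w

    # A client whose training looks noisier takes more cautious steps...
    lr = base_lr / (1.0 + intra_u)

    # ...and stops earlier, to avoid overfitting an unstable local model.
    steps = max(1, int(max_steps / (1.0 + intra_u)))
    return init_w, lr, steps

# Toy usage with flattened weight vectors.
g, p = np.zeros(4), np.ones(4)
print(local_training_config(g, p, intra_u=0.5, inter_u=2.0))
```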


In practice, we don’t have accurate measurements of the intra-client and inter-client uncertainty quantities. But in the paper, we provide several different methods for estimating them. In our experiments, we use an estimate based on the variance of the model parameters over the course of training.
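One lightweight way to obtain such an estimate, shown below as an illustrative sketch rather than the paper's exact estimator, is to maintain a streaming (Welford) variance of the flattened parameters after each optimization step, so no history of full snapshots has to be stored.

```python
import numpy as np

class RunningParamVariance:
    """Streaming (Welford) estimate of per-parameter variance,
    updated after each optimization step."""
    def __init__(self, num_params):
        self.n = 0
        self.mean = np.zeros(num_params)
        self.m2 = np.zeros(num_params)

    def update(self, w):
        """Fold in the current flattened weight vector `w`."""
        self.n += 1
        delta = w - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (w - self.mean)

    def value(self):
        """Scalar summary: average per-parameter variance of the trajectory."""
        return float((self.m2 / max(self.n - 1, 1)).mean())

# Toy usage: accumulate variance over a noisy optimization trajectory.
rng = np.random.default_rng(0)
tracker = RunningParamVariance(num_params=10)
for step in range(200):
    tracker.update(rng.normal(scale=0.1, size=10) + step * 0.001)
print(tracker.value())
```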

To the best of our knowledge, this is the first work to connect personalized FL with hierarchical modeling and to use uncertainty quantification to drive personalization.

Adaptive FL aggregation rule 

Existing FL algorithms typically update the global model using a weighted sum of local models, where the weight for each local model is proportional to the local dataset size. Our framework instead uses an adaptive aggregation rule to update the global model for better personalization. In particular, we derive the aggregation rule from Bayesian hierarchical modeling, where the global model parameters are considered the "root" of the statistical model.

The idea is, essentially, that the more a local model’s training data deviates from the global averages, the more responsive the model should be to training on that data. Conversely, the more uncertain the optimization of a local model’s parameters appears to be, the less weight that model should be given when updating the global model.
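In code, that second point looks like precision weighting: each local model's contribution is scaled by the inverse of its estimated uncertainty. The sketch below is illustrative only; the actual Self-FL rule follows from the Bayesian hierarchical model described above.

```python
import numpy as np

def uncertainty_weighted_aggregate(local_models, uncertainties, eps=1e-12):
    """Precision-style weighting: a client whose parameter optimization looks
    noisier (higher estimated uncertainty) contributes less to the new global
    model. Illustrative only, not the exact Self-FL aggregation rule."""
    precisions = 1.0 / (np.asarray(uncertainties, dtype=float) + eps)
    weights = precisions / precisions.sum()          # normalize to sum to 1
    stacked = np.stack(local_models)                 # (num_clients, num_params)
    return (weights[:, None] * stacked).sum(axis=0)

# Toy usage: three flattened local models with different uncertainty estimates.
models = [np.ones(4), 2 * np.ones(4), 10 * np.ones(4)]
print(uncertainty_weighted_aggregate(models, uncertainties=[0.1, 0.1, 5.0]))
```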

The Self-FL framework for personalized federated learning.

Our method is designed to improve the accuracy of edge devices’ personalized models both directly, by better tailoring them to the types of data they’re likely to see, and indirectly, by making the global models distributed to all clients more accurate. Our empirical results indicate that relative to prior FL schemes, Self-FL improves performance for edge clients. As such, it promises to improve the experience of Amazon customers by making their devices more responsive to their particular needs.


