Federated Learning: What It Is and Why AI Companies Should Care

October 11, 2021 6 min. read

It would be surprising if you had never heard of federated learning, since it has been around for quite some time now. A simple Google search of the term turns up articles dating as far back as 2017. Unsurprisingly, it was Google that coined the term and started to develop the first cloud-based infrastructure dedicated to federated learning. However, only after the recent introduction of Apple’s data privacy policies did businesses start taking this technology seriously.

What is federated learning anyway?

Federated learning is a type of machine learning (ML) that enables multiple decentralized devices to learn collaboratively without exchanging user data and without sending it to a central location. In other words, instead of storing vulnerable user data on a server or in the cloud, the model learns on the device itself. As a result, users’ privacy is far less likely to be compromised. Now you might be wondering how this is possible. Let’s work through the process step by step.

How does federated learning technology work?

Step 1. Training a model

ML models are trained on a centrally located server.

Step 2. Sending the model to user devices

When the training process is completed, the server sends the model to user devices. There can be hundreds or even millions of them. The number depends on how many clients the application has.

Step 3. Learning

As those devices generate data, the models keep learning and get better over time. For example, a model spots mistakes in the application’s predictions, corrects them, and builds a local training dataset on each user’s device. As a result, each device generates a small update.

Step 4. Exchanging and sending encrypted data

These updates can be exchanged between devices or sent to the central server, where they are averaged with other users’ updates to improve the model. What matters is that the server can only see the result of the machine learning process, not the data itself. That means no personal user data is exposed or stored on the server.
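The averaging mentioned in this step can be sketched in a few lines of Python. This is a minimal illustration, not any vendor’s actual implementation: the function name `average_updates`, the sample numbers, and the choice to weight each device by its number of local examples (in the spirit of federated averaging) are all assumptions made for the example.

```python
from typing import List, Sequence


def average_updates(updates: Sequence[Sequence[float]],
                    num_examples: Sequence[int]) -> List[float]:
    """Weighted average of device updates.

    Each update is weighted by how many local examples produced it
    (an assumed weighting scheme for this sketch).
    """
    if not updates or len(updates) != len(num_examples):
        raise ValueError("need exactly one example count per update")
    total = sum(num_examples)
    if total <= 0:
        raise ValueError("total example count must be positive")
    dim = len(updates[0])
    averaged = [0.0] * dim
    for update, n in zip(updates, num_examples):
        if len(update) != dim:
            raise ValueError("all updates must have the same length")
        for i, value in enumerate(update):
            averaged[i] += value * n / total
    return averaged


# Three hypothetical devices report updates for a two-parameter model.
# The server sees only these numbers, never the data behind them.
client_updates = [[0.2, -0.1], [0.4, 0.0], [0.0, 0.3]]
client_sizes = [10, 30, 60]
global_update = average_updates(client_updates, client_sizes)
print(global_update)  # approximately [0.14, 0.17]
```

Weighting by example count means a device with more local data has more influence on the result; an unweighted mean would be an equally valid choice for a sketch like this.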

Step 5. Improving the model

As the server receives the results of the ML training in the form of updates, it aggregates them to improve the model. After that, the new model is sent back to user devices. With each cycle, the model on the central server gets better and better.
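To make the five steps concrete, here is a toy simulation of several training cycles. It is a sketch under stated assumptions rather than a production recipe: the one-parameter model, the synthetic data, the learning rate, and the helper names `local_update`, `run_round`, and `make_device_data` are all invented for illustration.

```python
import random
from typing import List, Tuple

Dataset = List[Tuple[float, float]]  # (x, y) pairs that never leave a device

TRUE_W = 3.0  # assumed ground truth, used only to generate toy data


def make_device_data(n: int, rng: random.Random) -> Dataset:
    """Create n synthetic points following y = TRUE_W * x."""
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    return [(x, TRUE_W * x) for x in xs]


def local_update(w: float, data: Dataset,
                 lr: float = 0.05, epochs: int = 5) -> float:
    """Step 3: train on-device and return only the change in the weight."""
    local_w = w
    for _ in range(epochs):
        for x, y in data:
            grad = 2.0 * (local_w * x - y) * x  # gradient of squared error
            local_w -= lr * grad
    return local_w - w


def run_round(w: float, devices: List[Dataset]) -> float:
    """Steps 2, 4 and 5: send the model out, average updates, improve it."""
    total = sum(len(d) for d in devices)
    avg_delta = sum(local_update(w, d) * len(d) / total for d in devices)
    return w + avg_delta


rng = random.Random(0)
devices = [make_device_data(n, rng) for n in (5, 20, 50)]

w = 0.0  # Step 1: the initial model held by the server
for _ in range(10):
    w = run_round(w, devices)
print(round(w, 3))
```

Note that `run_round` only ever touches the value returned by `local_update`; the raw `(x, y)` pairs stay inside each device’s dataset, which mirrors the privacy property described above.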

What are the benefits of federated learning?

We’ve already talked about the challenges of the recent Apple privacy updates that are shaking the world of marketing. The good news is that Apple’s data privacy algorithms are basically extensions of federated learning technology. With so much buzz around it, let’s look at data privacy and the other benefits federated learning brings.

More privacy

As we know, with traditional machine learning, all the data is sent to and aggregated on a single central server. This scenario can leave private end-user information vulnerable to data breaches or, even worse, violate privacy laws in some countries.

With federated learning, on the contrary, data never leaves end-users’ devices, and the training update sent to a server is usually encrypted. The algorithms still train and improve to give users the best experience, without accessing their individual data. And “yay”! It seems the mission is accomplished: a decent level of usability can be reached while maintaining user privacy.
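One way to see how a server can combine updates without reading any single one is pairwise masking, the idea behind secure aggregation protocols. The sketch below is a toy version and swaps in masking for the encryption mentioned above: the function name `mask_updates` and the device names are hypothetical, and a seeded random generator stands in for the secrets that each pair of devices would agree on cryptographically in a real system.

```python
import random
from typing import Dict, List


def mask_updates(updates: Dict[str, List[float]],
                 seed: int = 0) -> Dict[str, List[float]]:
    """Add pairwise masks that cancel out when all updates are summed."""
    names = sorted(updates)
    dim = len(updates[names[0]])
    masked = {name: list(vec) for name, vec in updates.items()}
    # Assumption: this generator stands in for per-pair shared secrets.
    rng = random.Random(seed)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            mask = [rng.uniform(-100.0, 100.0) for _ in range(dim)]
            for k in range(dim):
                masked[a][k] += mask[k]  # device a adds the shared mask
                masked[b][k] -= mask[k]  # device b subtracts the same mask
    return masked


updates = {
    "phone_a": [0.2, -0.1],
    "phone_b": [0.4, 0.0],
    "phone_c": [0.0, 0.3],
}
masked = mask_updates(updates)

# The server receives only masked vectors, yet their sum is still correct.
server_sum = [sum(vec[k] for vec in masked.values()) for k in range(2)]
true_sum = [sum(vec[k] for vec in updates.values()) for k in range(2)]
print(masked["phone_a"])      # looks like noise to the server
print(server_sum, true_sum)   # equal up to floating-point rounding
```

Each masked vector on its own reveals little about the device’s real update, but because every mask is added once and subtracted once, the total the server needs for averaging is preserved.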

Less power consumption

Have you ever thought about how much energy modern machine learning systems consume? Why is that? Beyond the training itself, the traditional framework presupposes data being sent back and forth between users and a central server. No surprise, it requires massive amounts of energy. In fact, by one widely cited estimate, training a single large model can generate as much carbon dioxide as five cars over their lifetimes.

A federated learning setup can release less carbon dioxide into the atmosphere because it doesn’t require continuous data flows.

On top of that, federated learning models can continue operating even without an internet connection. Add to this on-device inference instead of sending data to the server, and you get a much more energy-efficient system that benefits both sides: users, who get longer battery life, and developers.

Immediate use

Traditional centralized machine learning has a significant limitation: because the data must first be aggregated on a central server, the model cannot keep learning continuously as new data arrives.

Federated learning doesn’t have that limitation. The models improve while the device is in use, and the improved model can be used immediately. For users, it means they can feel the difference in the way their devices respond.

Lower latency

In the world of mobile devices, every fraction of a second is precious, and a fast response makes for a better user experience.

In a traditional ML scenario, where data is sent from user devices to a central server and back, communication may be too slow, which immediately results in a poor user experience.

Federated learning overcomes this hurdle with ease. Each model runs on the user’s own device, next to the data it needs, so predictions come back considerably faster.

Why should AI companies bother?

“Okay, federated learning is a great and very promising technology. But how does it concern me, as an AI-oriented business owner?” you might be wondering. To make a long story short, it is very much possible that federated learning can become a brand new AI business model. Wait a sec. What does that mean?

Today companies can no longer afford to ignore the importance of data security and data privacy. At the same time, the link between higher profits and more accurate data keeps getting tighter. Federated learning, with its data protection mechanisms, offers a new model for companies that need to leverage data.

Where can it be applied? Or has it been applied already?

As privacy becomes a valuable selling point, besides a more obvious application for mobile phone apps, federated learning systems can be advantageous in a much broader context — among industries where data protection is required.

Mobile applications

First things first, federated learning can be used to predict user behavior with the goal of improving the usability of mobile applications. A few examples of federated learning in action are face and voice recognition, along with next-word prediction. And no surprise, users prefer to keep their data on their device instead of sending it to the cloud.

Google, for example, is implementing it in Gboard on Android phones, Gmail, and Google search to personalize word suggestions for each user. Federated learning is also used to improve on-device learning algorithms for the voice-activated system “Hey Google.” Apple’s virtual assistant Siri is similarly reliant on federated learning.

Healthcare

Federated learning makes it possible to connect data from different data sets, generalize, and make predictions without moving individual patient data and therefore without compromising patients’ privacy. Let us give you an example here.

Not all hospitals have enough data at hand to make an accurate analysis. So they might need to exchange data (such as patient medical records) with other hospitals. Unfortunately, data privacy regulations can make this a significant obstacle for a healthcare system. Thanks to its non-centralized, on-device nature, federated learning has already become a game-changer here, helping to save the lives of COVID-19 patients by predicting the amount of oxygen needed.

Autonomous vehicles

By analyzing real-time information from traffic lights and other autonomous vehicles and making more accurate predictions about routes and maneuvers, federated learning technology can drastically improve the overall safety of the self-driving car experience.

Pioneering Federated Learning Startups

Flower

Flower is a tool that aims to make AI training decentralized. It lets developers train models using data from many devices and locations. Flower uses federated learning, which doesn’t give direct access to data, making it safer for privacy and compliance. They recently introduced FedGPT, a way to train large language models like OpenAI’s ChatGPT and GPT-4, allowing companies to train models on data worldwide.

DynamoFL

DynamoFL is a startup with a federated learning platform that focuses on performance without compromising privacy. It’s in its early stages and targets industries like automotive, IoT, and finance. DynamoFL uses novel AI techniques to improve system performance and address vulnerabilities in federated learning, like “membership inference” attacks.

DataFleets

DataFleets offers a new approach to safely access and analyze databases without privacy concerns. Instead of reinventing homomorphic encryption, it uses federated learning. DataFleets acts as a trusted agent between a private database and those who need to access it, ensuring data privacy without revealing raw data.

Sherpa

Sherpa, a startup from Bilbao, Spain, originally focused on voice-based digital assistants. Now, it’s building privacy-first AI services for enterprises using federated learning. Sherpa plans to expand its machine learning platform while maintaining its conversational AI and search services.

Xayn

Xayn, based in Berlin, provides ad-free, personalized, and privacy-safe search as an alternative to big adtech companies. It uses Masked Federated Learning for both desktop and mobile versions, tailoring the user’s web experience without compromising privacy.

Wrapping up

Federated learning is well on its way to making significant improvements to data aggregation across a variety of devices, crucially, without user data migration required. Finally, but perhaps most importantly, there is a lower risk of users’ privacy being compromised. Combined, it is a safer, more accessible, more efficient and cheaper way to apply machine learning technologies in even the most competitive or strictly regulated industries.
