At Scaleout we are committed to enabling the next generation of machine learning systems: systems that allow parties to come together to build models without sharing their data. We firmly believe that advancing such privacy-preserving machine learning technology is a critical component of sustainable AI development.
The current paradigm
In the current paradigm of centralized machine learning, developing accurate models typically starts with collecting as much data as possible in a central data store, and then training machine learning models on the collected, pooled data.
Centralized machine learning is by far the most common practice, but it comes with several challenges that diminish its potential in AI systems. Only a fraction of the potentially available data is currently accessible, which prevents machine learning models from reaching their full potential.
Generally speaking, more data means better machine learning models, but in many instances, data cannot be gathered in a central location.
Reasons for this include:
Private/Proprietary data — Sharing valuable business data with someone else is not an option.
Regulated data — GDPR, HIPAA, etc.
Practical blockers — the data is too big, or the network connection is expensive, slow, or unreliable.
Main issues with centralized machine learning
Federated Machine Learning
Federated Machine Learning (FedML) is a distributed machine learning approach that enables training on decentralized data. A server coordinates a network of nodes, each of which holds local, private training data. The nodes contribute to the construction of a global model by training on their local data, and the server combines the non-sensitive model contributions from the nodes into the global model.
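The server-side combination step can be sketched with federated averaging (FedAvg), a common aggregation rule in federated learning. This is an illustrative example, not Scaleout's actual API; the function name and data structures are assumptions.

```python
# Sketch of a federated averaging (FedAvg) step that a coordinating server
# might run. Hypothetical names; only model parameters ever leave a node,
# never the raw training data.

def federated_average(updates):
    """Combine node model updates into a global model.

    `updates` is a list of (num_examples, parameters) pairs, where
    `parameters` is a flat list of floats trained locally on one node.
    """
    total_examples = sum(n for n, _ in updates)
    num_params = len(updates[0][1])
    global_params = [0.0] * num_params
    for n, params in updates:
        weight = n / total_examples  # nodes with more data weigh more
        for i, p in enumerate(params):
            global_params[i] += weight * p
    return global_params

# Two nodes report locally trained parameters; the server never sees the data.
global_model = federated_average([(100, [1.0, 2.0]), (300, [3.0, 4.0])])
```

Weighting by the number of local training examples means the global model reflects each node's contribution in proportion to how much data it trained on.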
Benefits of FedML
Federated learning addresses the fundamental problems of centralized AI such as privacy, ownership, and locality of data. It extends, even disrupts, the centralized AI paradigm in which better algorithms always come at the cost of collecting more and more sensitive data.
Federated learning enables:
Data security and privacy where data never moves
Reduced communication complexity and costs
Powerful data network effects in industries where data cannot be transferred
How does it work?
In simple terms, it is an approach where you "bring the code to the data, instead of the data to the code".
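A toy training round makes the "code to the data" idea concrete: the server ships the model and the training step to each node, each node runs it on its own private data, and only the updated model travels back. This is a minimal sketch under assumed names, not Scaleout's protocol; the "training" is a trivial stand-in for real local SGD.

```python
# "Bring the code to the data": the server ships the model and training
# logic to the nodes; each node's private data stays where it is.

def local_training_step(model, private_data):
    # Trivial stand-in for local training: nudge the single model
    # parameter halfway toward the node's local data mean.
    local_mean = sum(private_data) / len(private_data)
    return model + 0.5 * (local_mean - model)

def round_of_training(global_model, nodes):
    # Server side: collect updated models, never the data itself,
    # then combine them with a simple (unweighted) average.
    updates = [local_training_step(global_model, data) for data in nodes]
    return sum(updates) / len(updates)

nodes = [[1.0, 3.0], [5.0, 7.0]]  # each inner list is one node's private data
model = round_of_training(0.0, nodes)
```

In a real system each node would run many gradient steps on a full local dataset, and the combination step would typically weight nodes by how much data they hold.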
Our current priority at Scaleout is to build our federated platform to bridge the gap between academic research on privacy-preserving learning on the one hand, and the requirements of production-grade systems for large-scale applications on the other. Our vision is a machine-learning-framework-agnostic platform that removes the technical, security, and trust-related barriers to forming an ML alliance, and that lets data scientists focus on building federated models without being bogged down by the challenges outlined above.
A protocol covering the basic communication needs for training a machine learning model in a distributed setting.
Additional model types (ensembles, deep learning, etc).
Product trials/First alliance running.
Partner joint prioritized features.
Interested? Let us know!
We believe that federated machine learning holds exceptional promise: it has equal potential for constructing integrity-preserving machine learning models at scale and for enabling powerful business models centred around data alliances. Feel free to reach out to us so we can discuss more!