Advanced Nonlinear Gradient Methods
The Advanced Nonlinear Gradient Methods Project will provide automated ways to index, filter, correlate and interpret sensory signals to turn data into knowledge.
Everywhere we look, the combination of cheap sensors with ubiquitous networking is unleashing a flood of information that threatens to overwhelm the human capacity to process it. By providing automated ways to index, filter, correlate and interpret sensory signals — in short, to extract knowledge from data — machine learning promises to help us manage this information glut.
What will this research achieve?
The Advanced Nonlinear Gradient Methods (ANGIE) project will explore various newly developed ways to accelerate the convergence of stochastic gradient descent while preserving its scalability properties and incremental (online) character. Our research agenda includes:
- Rapid stochastic gradient algorithm development and implementation
- Mathematical and empirical analysis of convergence and stability
- Applications in adaptive signal processing such as robotics and computer vision
- Development of software tools such as visual programming and algorithmic differentiation.
What are the key features?
The solution will require machine learning techniques that:
- Scale up to very large nonlinear models with up to 10^7 degrees of freedom
- Handle vast amounts of highly redundant, noisy, unreliable data
- Continually adapt in real-time to non-stationary streams of such data.
Gradient-based optimisation algorithms are the primary engine powering much of machine learning. Although many of these methods are very efficient on smaller problems, they have difficulty meeting the above criteria: either they do not scale up to very large models, or they require a full pass through the data at each iteration. These limitations make them prohibitively expensive for large amounts of data and preclude ongoing adaptation.
Simple gradient descent, by contrast, can make do with noisy estimates of the gradient obtained on the fly, one sample at a time, and scales to very large systems. Despite its very slow convergence, this stochastic form of gradient descent can outperform even sophisticated batch methods on large data sets.
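To make the idea of a noisy, on-the-fly gradient concrete, here is a minimal sketch of plain stochastic gradient descent fitting a linear model to a stream of samples. It is an illustration only, not one of the accelerated methods the project studies; the function names (`sgd_online`, `noisy_linear_stream`), the squared-error loss, the fixed step size and the synthetic data are all assumptions made for the example.

```python
import random


def sgd_online(stream, dim, step_size=0.05):
    """Fit a linear model to (x, y) pairs, updating after every sample."""
    w = [0.0] * dim
    for x, y in stream:
        prediction = sum(wi * xi for wi, xi in zip(w, x))
        error = prediction - y
        # The gradient of 0.5 * error**2 with respect to w is error * x.
        # Computed from one sample, it is a noisy estimate of the gradient
        # over the whole data set, so no full pass through the data is needed.
        w = [wi - step_size * error * xi for wi, xi in zip(w, x)]
    return w


def noisy_linear_stream(true_w, n_samples, noise=0.1, seed=0):
    """Hypothetical data source: yields noisy samples of a linear function."""
    rng = random.Random(seed)
    for _ in range(n_samples):
        x = [rng.uniform(-1.0, 1.0) for _ in true_w]
        y = sum(wi * xi for wi, xi in zip(true_w, x)) + rng.gauss(0.0, noise)
        yield x, y


true_w = [2.0, -1.0, 0.5]
w = sgd_online(noisy_linear_stream(true_w, 20000), dim=len(true_w))
print(w)  # should land close to true_w
```

Each update touches only the current sample, so memory use does not depend on how much data has been seen, and the same loop could keep running on a stream whose statistics drift over time.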
Research team
| Project leader: | Nic Schraudolph |
| Researchers: | Simon Günter • Peter Sunehag • Jochen Trumpf |
| Ph.D. students: | Desmond Chik • Jin Yu |
