What is Optimization? + Learning Gradient Descent | Two Minute Papers #82

Dear Fellow Scholars, this is Two Minute Papers
with Károly Zsolnai-Fehér. Today, we’re not going to have the usual visual
fireworks that we had with most topics in computer graphics, but I really hope you’ll
still find this episode enjoyable and stimulating. This episode is also going to be a bit heavy
on what optimization is and we’ll talk a little bit at the end about the intuition of the
paper itself. We are going to talk about mathematical optimization. This term is not to be confused with the word
“optimization” that we use in our everyday lives for, for instance, improving the efficiency
of computer code or a workflow. Mathematical optimization means finding one,
hopefully optimal solution from a set of possible candidate solutions. An optimization problem is given the following
way: one, there is a set of variables we can play with, and two, there is an objective
function that we wish to minimize or maximize. Well, this probably sounds great for mathematicians,
but for everyone else, maybe this is a bit confusing. Let’s build a better understanding of this
concept through an example! For instance, let’s imagine that we have to
cook a meal for our friends from a given set of ingredients. The question is, how much salt, vegetables,
and meat should go into the pan? These are our variables that we can play with,
and the goal is to choose the optimal amount of these ingredients to maximize the tastiness
of the meal. Tastiness will be our objective function,
and for a moment, we shall pretend that tastiness is an objective measure of a meal. This was just one toy example, but the list
of applications is endless. In fact, optimization is so incredibly ubiquitous,
there is hardly any field of science where some form of it is not used to solve difficult
problems. For instance, if we have the plan of a bridge,
we can use optimization to find the minimal amount of building materials we need to build it
in a way that it remains stable. We can also optimize the layout of the bridge
itself to make sure the inner tension and compression forces line up well. A big part of deep learning is actually also
an optimization problem. There is a given set of neurons, and the
variables are when they should be activated, and we’re fiddling with these variables to
minimize the output error, which can be, for instance, how often we guess wrong about whether
a picture depicts a muffin or a chihuahua. The question for almost any problem is usually
not whether it can be formulated as an optimization problem, but whether it is worth it. And by worth it I mean the question whether
we can solve it quickly and reliably. An optimizer is a technique that is able to
solve these optimization problems and offer us a hopefully satisfactory solution to them. There are many algorithms that excel at solving
problems of different complexities, but what ties them together is that they are usually
handcrafted techniques written by really smart mathematicians. Gradient descent is one of the simplest optimization
algorithms where we change each of the variables a bit and see whether the
objective function changes favorably as a result. After finding a direction that leads to the
most favorable changes, we shall continue our journey in that direction. What does this mean in practice? Intuitively, in our cooking example, after
making several meals, we would ask our guests about the tastiness of these meals. From their responses, we would recognize that
adding a bit more salt led to very favorable results, and since these people are notorious
meat eaters, decreasing the amount of vegetables and increasing the meat content also led to
favorable reviews. And we, of course, on the back of this newfound
knowledge, will cook more with these variable changes in pursuit of the best possible meal
in the history of mankind. This is something that is reasonably close
to what gradient descent is in mathematics. A slightly more sophisticated version of gradient
descent is also a very popular way of training neural networks. If you have any questions regarding the gradient
part, we had an extended Two Minute Papers episode on what gradients are and how to use
them to build an awesome algorithm for light transport. It is available, where? Well, of course, in the video description
box, Károly, why are you even asking? So what about the paper part? This incredible new work of Google DeepMind
shows that an optimization algorithm itself can emerge as a result of learning. Producing an algorithm is not the
same kind of task as deciding what an image depicts or how we should grade a student essay: an algorithm
is a sequence of steps we have to take. If we’re talking about outputting sequences,
a recurrent neural network is a natural fit for that. Their proposed learning algorithm can create
new optimization techniques that outperform previously existing methods not everywhere,
but on a set of specialized problems. I hope you’ve enjoyed the journey, we’ll talk
quite a bit about optimization in the future, you’ll love it. Thanks for watching, and for your generous
support, and I’ll see you next time!
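
As a rough illustration of the gradient descent idea described in the episode, here is a minimal Python sketch of the cooking example. Everything specific in it is an assumption made for illustration: the tastiness function, its ideal amounts of 2, 1 and 5, the step size, and the iteration count are hypothetical and do not come from the episode or the paper. It nudges each variable a bit to estimate how the objective changes, then steps in the direction that improves it. Since gradient descent minimizes, the sketch minimizes negative tastiness, which is the same as maximizing tastiness.

```python
def tastiness(salt, vegetables, meat):
    """Hypothetical objective: highest at salt=2, vegetables=1, meat=5."""
    return -((salt - 2.0) ** 2 + (vegetables - 1.0) ** 2 + (meat - 5.0) ** 2)


def numerical_gradient(f, x, eps=1e-6):
    """Estimate the gradient by nudging each variable slightly up and down."""
    grad = []
    for i in range(len(x)):
        up = list(x)
        down = list(x)
        up[i] += eps
        down[i] -= eps
        # Central difference: how much f changes per unit change in x[i].
        grad.append((f(*up) - f(*down)) / (2 * eps))
    return grad


def gradient_descent(f, x0, step_size=0.1, iterations=200):
    """Minimize f by repeatedly stepping against the estimated gradient."""
    x = list(x0)
    for _ in range(iterations):
        grad = numerical_gradient(f, x)
        x = [xi - step_size * gi for xi, gi in zip(x, grad)]
    return x


def negative_tastiness(salt, vegetables, meat):
    """Minimizing this is equivalent to maximizing tastiness."""
    return -tastiness(salt, vegetables, meat)


# Start from an empty pan and let the optimizer adjust the recipe.
recipe = gradient_descent(negative_tastiness, [0.0, 0.0, 0.0])
print([round(amount, 3) for amount in recipe])
```

With this particular toy objective, the recipe should end up close to the assumed ideal amounts. A real objective is rarely this well behaved, so the step size and the number of iterations usually need tuning.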

8 thoughts on “What is Optimization? + Learning Gradient Descent | Two Minute Papers #82”

  1. More related funny images are available below. My favorite is the "puppy or bagel" one. 🙂
    http://twistedsifter.com/2016/03/puppy-or-bagel-meme-gallery/

  2. Damn chihuafins…

    You mention gradient descent changing the variables a bit. That seems like it would be quite useful with a lot of use cases.

    I'm wondering how people deal with threshold effects or exponential/logarithmic relationships? Does that require so much sensitivity that it gets overwhelmed by noise in the data?

    Besides noise, threshold effects and catalysts, are there other phenomena which make it difficult to interpret data using algorithms or our own brains?

  3. I guess that optimisation really depends on the available variables and quality of feedback. How can you know that you are using the right variables?
