Hold-ups - Gradient Descent
Under the title ‘Hold-ups’, I will be posting about the things that I think are holding me back from moving forward in the course I’m currently following, so that I can revisit these posts later and study the topics in depth.
Some questions I solved
- Why does subtracting lr * ∇f from x bring us closer to the value of x at which f is a minimum? Basically, what is ∇f?
- I got a satisfactory answer from the books Thomas’ Calculus and Ian Goodfellow’s Deep Learning. But these in turn led me to ask the following questions:
- How does minimising the directional derivative at a point give us the direction of steepest descent?
- Why does ∇f always point in the direction of steepest ascent?
- I was able to answer these two questions after watching this great Khan Academy video. I’ve sketched the argument below so I can find it again later.
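
The gist, as I understand it (so treat this as my paraphrase rather than a rigorous proof): the directional derivative of f at a point x along a unit vector u is

$$D_u f(x) = \nabla f(x) \cdot u = \lVert \nabla f(x) \rVert \cos\theta,$$

where θ is the angle between ∇f(x) and u. The rate of change is largest when θ = 0, i.e. when u points along ∇f (steepest ascent), and most negative when θ = π, i.e. when u points along −∇f (steepest descent). That is why stepping from x to x − lr · ∇f(x) moves us downhill.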
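
And here is a tiny numerical sketch of the update itself, just to watch it converge (plain Python; the function f and the learning rate are made up for illustration and have nothing to do with the course notebooks):

```python
def f(x):
    return (x - 3) ** 2        # a simple convex function, minimised at x = 3

def grad_f(x):
    return 2 * (x - 3)         # its derivative, i.e. ∇f in one dimension

x = 0.0                        # starting point
lr = 0.1                       # learning rate

for _ in range(50):
    x = x - lr * grad_f(x)     # step against the gradient: x ← x − lr * ∇f(x)

print(x)                       # ≈ 3.0: each step moved x towards the minimum
```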
Some questions I have
- What is Automatic Differentiation? How does it work?
- To read:
- What is backpropagation?
- How is autograd implemented in PyTorch?
- What does it really mean when we write `loss.backward()` in the lesson2-sgd notebook?
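
I haven’t dug into autograd yet, but for future-me’s benefit, here is a minimal, self-contained example of the call in question. This is my own toy linear fit, not the notebook’s actual code, and the data and learning rate are made up:

```python
import torch

# Toy data: y ≈ 3x + 2 plus a little noise
x = torch.linspace(0, 1, 100)
y = 3 * x + 2 + 0.1 * torch.randn(100)

params = torch.tensor([1.0, 1.0], requires_grad=True)   # [slope, intercept]
lr = 0.1

for _ in range(1000):
    preds = params[0] * x + params[1]
    loss = ((preds - y) ** 2).mean()      # mean squared error
    loss.backward()                       # autograd fills params.grad with ∇loss
    with torch.no_grad():
        params -= lr * params.grad        # the gradient descent update
        params.grad.zero_()               # reset gradients before the next step

print(params)                             # should end up close to [3.0, 2.0]
```

My current understanding is that `loss.backward()` walks back through the operations that produced `loss` and writes the gradient with respect to each `requires_grad` tensor into its `.grad` attribute; that is the ∇f the update then uses.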
Some more stuff to do
- Check out the site explained.ai and definitely read The Matrix Calculus You Need For Deep Learning [arXiv:1802.01528]
A/N: I have added the ability to comment on posts now. Feel free to point out any mistakes, ask questions, or just say hi!