Hold-ups - Gradient Descent
Under the title ‘Hold-ups’, I will be posting about the things that I think are holding me back from moving forward in the course I’m currently following, so that I can revisit these posts later and study the topics in depth.
Some questions I solved
- Why does subtracting lr * ∇f from x bring us closer to the value of x at which f is a minimum? Basically, what is ∇f?
- I got a satisfactory answer from the books Thomas’ Calculus and Ian Goodfellow’s Deep Learning. But these in turn led me to ask the following questions:
- How does minimising the directional derivative at a point give us the direction of steepest descent?
- Why does ∇f always point in the direction of steepest ascent?
- I was able to answer these two questions after watching this great Khan Academy video. I’ve sketched the argument below so I can find it again later.
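
The gist, as I understand it (so treat this as my paraphrase rather than a rigorous proof): the directional derivative of f at a point x along a unit vector u is

$$D_u f(x) = \nabla f(x) \cdot u = \lVert \nabla f(x) \rVert \cos\theta,$$

where θ is the angle between ∇f(x) and u. The rate of change is largest when θ = 0, i.e. when u points along ∇f (steepest ascent), and most negative when θ = π, i.e. when u points along −∇f (steepest descent). That is why stepping from x to x − lr · ∇f(x) moves us downhill.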
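
And here is a tiny numerical sketch of the update itself, just to watch it converge (plain Python; the function f and the learning rate are made up for illustration and have nothing to do with the course notebooks):

```python
def f(x):
    return (x - 3) ** 2        # a simple convex function, minimised at x = 3

def grad_f(x):
    return 2 * (x - 3)         # its derivative, i.e. ∇f in one dimension

x = 0.0                        # starting point
lr = 0.1                       # learning rate

for _ in range(50):
    x = x - lr * grad_f(x)     # step against the gradient: x ← x − lr * ∇f(x)

print(x)                       # ≈ 3.0: each step moved x towards the minimum
```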
Some questions I have
- What is Automatic Differentiation? How does it work?
- To read:
- What is backpropagation?
- How is autograd implemented in PyTorch?
- What does it really mean when we write `loss.backward()` in the lesson2-sgd notebook?
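
I haven’t dug into autograd yet, but for future-me’s benefit, here is a minimal, self-contained example of the call in question. This is my own toy linear fit, not the notebook’s actual code, and the data and learning rate are made up:

```python
import torch

# Toy data: y ≈ 3x + 2 plus a little noise
x = torch.linspace(0, 1, 100)
y = 3 * x + 2 + 0.1 * torch.randn(100)

params = torch.tensor([1.0, 1.0], requires_grad=True)   # [slope, intercept]
lr = 0.1

for _ in range(1000):
    preds = params[0] * x + params[1]
    loss = ((preds - y) ** 2).mean()      # mean squared error
    loss.backward()                       # autograd fills params.grad with ∇loss
    with torch.no_grad():
        params -= lr * params.grad        # the gradient descent update
        params.grad.zero_()               # reset gradients before the next step

print(params)                             # should end up close to [3.0, 2.0]
```

My current understanding is that `loss.backward()` walks back through the operations that produced `loss` and writes the gradient with respect to each `requires_grad` tensor into its `.grad` attribute; that is the ∇f the update then uses.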
Some more stuff to do
- Check out the site explained.ai and definitely read The Matrix Calculus You Need For Deep Learning [arXiv:1802.01528]
A/N: I have added the ability to comment on posts now. Feel free to point out any mistakes, ask questions, or just say hi!