I'm reading a book on deep learning and I'm a bit confused about one of the ideas the author mentioned. This is from the book *Deep Learning with Python* by Francois Chollet:

A gradient is the derivative of a tensor operation. It’s the generalization of the concept of derivatives to functions of multidimensional inputs: that is, to functions that take tensors as inputs.

Consider an input vector x, a matrix W, a target y, and a loss function loss. You can use W to compute a target candidate y_pred, and compute the loss, or mismatch, between the target candidate y_pred and the target y:

y_pred = dot(W, x)

loss_value = loss(y_pred, y)

If the data inputs x and y are frozen, then this can be interpreted as a function mapping values of W to loss values:

loss_value = f(W)

Let’s say the current value of W is W0. Then the derivative of f in the point W0 is a tensor gradient(f)(W0) with the same shape as W, where each coefficient gradient(f)(W0)[i,j] indicates the direction and magnitude of the change in loss_value you observe when modifying W0[i,j]. That tensor gradient(f)(W0) is the gradient of the function f(W)=loss_value in W0.

You saw earlier that the derivative of a function f(x) of a single coefficient can be interpreted as the slope of the curve of f. Likewise, gradient(f)(W0) can be interpreted as the tensor describing the curvature of f(W) around W0.

For this reason, in much the same way that, for a function f(x), you can reduce the value of f(x) by moving x a little in the opposite direction from the derivative, with a function f(W) of a tensor, you can reduce f(W) by moving W in the opposite direction from the gradient: for example, W1=W0-step*gradient(f)(W0) (where step is a small scaling factor). That means going against the curvature, which intuitively should put you lower on the curve. Note that the scaling factor step is needed because gradient(f)(W0) only approximates the curvature when you’re close to W0, so you don’t want to get too far from W0.
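To make the quoted update rule concrete, here is a minimal numeric sketch in plain Python (the two-element weight vector, squared-error loss, and the specific numbers are my own illustration, not from the book):

```python
# Sketch of one gradient-descent step (assumptions: W is a 2-element
# vector, loss is squared error; the concrete values are made up).
x = [1.0, 2.0]        # frozen input
y_true = 1.0          # frozen target
W0 = [0.3, -0.2]      # current weights

def f(W):
    # loss_value = loss(y_pred, y) with y_pred = dot(W, x)
    y_pred = W[0] * x[0] + W[1] * x[1]
    return (y_pred - y_true) ** 2

# Gradient of the squared error: entry i says how fast (and in which
# direction) the loss changes per unit change in W[i].
y_pred = W0[0] * x[0] + W0[1] * x[1]
grad = [2 * (y_pred - y_true) * x[0],
        2 * (y_pred - y_true) * x[1]]

step = 0.01
# W1 = W0 - step * gradient(f)(W0): each coefficient moves against
# its own slope, scaled by step.
W1 = [W0[i] - step * grad[i] for i in range(2)]
```

Running this, f(W1) comes out smaller than f(W0), matching the book's claim that the update moves you lower on the loss surface.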

I don't understand why we subtract step * gradient(f)(W0) from the weights and not just step, since step * gradient(f)(W0) represents a change in loss, while step by itself is in parameter units (i.e. the x value, i.e. a small change in weight).
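Concretely, these are the two updates I'm comparing (a sketch in my own notation; the gradient values are made up for illustration):

```python
# grad stands in for gradient(f)(W0); the values are invented.
W0 = [0.3, -0.2]
grad = [-2.2, 4.4]
step = 0.01

# The book's update: each coefficient's move is scaled by its own
# gradient entry (different size and direction per coefficient).
book_update = [w - step * g for w, g in zip(W0, grad)]

# My proposed update: every coefficient shifts by the same -step,
# regardless of the gradient.
my_update = [w - step for w in W0]
```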
