Calculus

astonzhang · October 18, 2019, 11:23pm

https://d2l.ai/chapter_preliminaries/calculus.html

gpolo · November 25, 2019, 4:24pm

Just some changes and (maybe) some corrections on the Gradients section:

I would say explicitly that these formulas follow the denominator layout
(https://en.wikipedia.org/wiki/Matrix_calculus#Layout_conventions)
In the second and third examples (x^\top A and x^\top A x), I think A is assumed to have n rows rather than m as stated. Neither expression is defined if A has m rows (and the third one requires a square matrix), so this is either confusing or simply a mistake.
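A quick numerical check can make the shape argument concrete. The sketch below is not from the book; the helper names and sample values are made up for illustration. It assumes x has length n, takes a rectangular B with n rows for x^\top B and a square A for x^\top A x, and compares the closed-form denominator-layout gradients (B and (A + A^\top) x) against central finite differences.

```python
# Finite-difference check of two gradient identities in denominator layout.
# All helper names and sample values are illustrative assumptions.

def matvec(M, v):
    return [sum(m * vj for m, vj in zip(row, v)) for row in M]

def transpose(M):
    return [list(col) for col in zip(*M)]

def perturb(x, i, h):
    # copy of x with h added to coordinate i
    return x[:i] + [x[i] + h] + x[i + 1:]

def numerical_grad(f, x, h=1e-6):
    # gradient of a scalar-valued f: one central difference per coordinate
    return [(f(perturb(x, i, h)) - f(perturb(x, i, -h))) / (2 * h)
            for i in range(len(x))]

def numerical_jacobian_denominator(f, x, h=1e-6):
    # row i holds d f(x) / d x_i, so the result is n x m (denominator layout)
    rows = []
    for i in range(len(x)):
        fp, fm = f(perturb(x, i, h)), f(perturb(x, i, -h))
        rows.append([(p - m) / (2 * h) for p, m in zip(fp, fm)])
    return rows

x = [0.5, -1.0, 2.0]                      # n = 3

# x^T B needs B to have n rows; here B is 3 x 2, and the gradient is B itself.
B = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
jac = numerical_jacobian_denominator(lambda v: matvec(transpose(B), v), x)

# x^T A x needs a square A (n x n); the gradient is (A + A^T) x.
A = [[1.0, 2.0, 0.0], [0.0, 3.0, -1.0], [4.0, 0.0, 2.0]]
A_plus_At = [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(A, transpose(A))]
analytic = matvec(A_plus_At, x)
numeric = numerical_grad(
    lambda v: sum(vi * wi for vi, wi in zip(v, matvec(A, v))), x)

print(analytic)  # [7.0, -7.0, 11.0]
```

With A of shape m x n and m different from n, the row and column counts no longer match the length of x, so neither product is defined, which is the point raised above.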

Thank you for the effort!

gold_piggy · November 26, 2019, 6:42pm

Great catch! I guess you are right. It should be like the following:

minhduc0711 · January 3, 2020, 10:37am

I suggest that the numerator layout should be used here for consistency, as the next chapter mentions the Jacobian (an m by n matrix), which confused me for quite a while.

gpolo · January 3, 2020, 1:19pm

@minhduc0711,

You are right; for me the most important thing is to stay consistent. The Automatic Differentiation section says "… the gradient of y (a vector of length m) with respect to x (a vector of length n) is the Jacobian (an m \times n matrix)", which is not consistent with the earlier formulas: that is the numerator layout (or Jacobian formulation), whereas the Calculus section uses the denominator layout.

astonzhang · January 11, 2020, 1:38am

@gpolo @minhduc0711

Thanks. The formula in the Calculus section follows Denominator layout. It’s quite common in deep learning: when you differentiate a loss function (scalar) with respect to a tensor, the shape of the differentiation result is the same as that of the tensor in denominator layout.
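For readers comparing the two conventions, here is a short summary in standard notation (the symbols L, \mathbf{x} \in \mathbb{R}^n and \mathbf{y} \in \mathbb{R}^m are chosen here for illustration). For a scalar loss, the denominator-layout gradient has the same shape as \mathbf{x}:

$$\frac{\partial L}{\partial \mathbf{x}} = \begin{bmatrix} \frac{\partial L}{\partial x_1} \\ \vdots \\ \frac{\partial L}{\partial x_n} \end{bmatrix} \in \mathbb{R}^{n}$$

For a vector-valued \mathbf{y}, the two layouts are transposes of each other:

$$\underbrace{\left[\frac{\partial \mathbf{y}}{\partial \mathbf{x}}\right]_{ij} = \frac{\partial y_j}{\partial x_i}}_{\text{denominator layout, } n \times m} \qquad \underbrace{\left[\frac{\partial \mathbf{y}}{\partial \mathbf{x}}\right]_{ij} = \frac{\partial y_i}{\partial x_j}}_{\text{numerator layout (Jacobian), } m \times n}$$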

I agree that consistency matters. Thus I just removed the Jacobian description (in Numerator layout) in the automatic differentiation section:

Just let us know if you feel more explanations are needed. Thanks.

gpolo · January 12, 2020, 7:16am

@astonzhang,
It is consistent now. I do not think the explanation of the Jacobian there was needed, and now there is no inconsistency.
Thank you very much for your effort!

naveen_kumar · February 16, 2020, 1:59pm

Does anyone have a solution for the 3rd question?

gold_piggy · February 21, 2020, 9:25pm

Hi @naveen_kumar,

Since \|\mathbf{x} \|_2 = (x_1^2 + x_2^2 + ...)^\frac{1}{2},
then \nabla_{\mathbf{x}} \|\mathbf{x} \|_2 = \frac{\mathbf{x}}{\|\mathbf{x} \|_2}.

You can compute the partial derivative with respect to each x_i and concatenate these i^{th} entries into a vector.
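Entry by entry, the chain rule gives \frac{\partial}{\partial x_i} (\sum_j x_j^2)^{1/2} = \frac{1}{2} (\sum_j x_j^2)^{-1/2} \cdot 2 x_i = \frac{x_i}{\|\mathbf{x}\|_2}. As a sanity check, the following sketch (function names are illustrative, not from the book) compares that closed form with central finite differences. It assumes \mathbf{x} \neq \mathbf{0}, since the norm is not differentiable at the origin.

```python
import math

def l2_norm(x):
    return math.sqrt(sum(xi * xi for xi in x))

def analytic_grad(x):
    # closed form x / ||x||_2; undefined at x = 0
    n = l2_norm(x)
    return [xi / n for xi in x]

def numerical_grad(f, x, h=1e-6):
    # central difference in each coordinate
    grad = []
    for i in range(len(x)):
        xp = x[:i] + [x[i] + h] + x[i + 1:]
        xm = x[:i] + [x[i] - h] + x[i + 1:]
        grad.append((f(xp) - f(xm)) / (2 * h))
    return grad

x = [3.0, 4.0]
print(analytic_grad(x))            # [0.6, 0.8]
print(numerical_grad(l2_norm, x))  # close to the line above
```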

BoxOfCereal · March 19, 2020, 9:08am

Warning: spoiler, may contain answers. I hope asking these here is OK. I have no other way of validating my attempts.

Is the answer \nabla_{\mathbf{x}} \|\mathbf{x} \|_2 for question 3 because with

\nabla=\left[ \dfrac{x_{1}}{\sqrt{x_{1}^{2}+x_{2}^{2}+\ldots+x_{n}^{2}}},\dfrac{x_{2}}{\sqrt{x_{1}^{2}+x_{2}^{2}+\ldots+x_{n}^{2}}},\ldots,\dfrac{x_{n}}{\sqrt{x_{1}^{2}+x_{2}^{2}+\ldots+x_{n}^{2}}}\right] the x's in the numerators can be represented as a vector \textbf{x} and the denominator is the definition of the l2 norm?

Also I would like to check my answer for question 4:

\dfrac{\partial u}{\partial a}=\dfrac{\partial u}{\partial x} \cdot \dfrac{\partial x}{\partial a}+\dfrac{\partial u}{\partial y} \cdot \dfrac{\partial y}{\partial a}+\dfrac{\partial u}{\partial z} \cdot \dfrac{\partial z}{\partial a}

If my answer for 4 is correct, however, I would like to provide feedback. I am very much a novice when it comes to calculus and most of this material. If I had not watched a video on how to use the chain rule for multivariable calculus, I would have had no clue how to proceed. I think mentioning the graph method for plotting out the chains could help others who are in my shoes.

Also, the chain rule example omits any use of the partial derivative operator, which technically isn't required for single-variable chains. But the text mentions "multivariate functions in deep learning", and not seeing one confused me.

This book has been an invaluable resource for me and made things very accessible for someone with a weak maths background. Thank you so much to the authors and I hope my feedback can be of use.

doithuong · March 21, 2020, 2:45am

Hi everyone, does anyone have the solution to the 4th question?

sosroot · May 1, 2020, 2:09pm

u=f(x,y,z) with x=x(a,b), y=y(a,b) and z=z(a,b), so:
\frac{\partial u}{\partial a}=\frac{\partial u}{\partial x}\frac{\partial x}{\partial a}+\frac{\partial u}{\partial y}\frac{\partial y}{\partial a}+\frac{\partial u}{\partial z}\frac{\partial z}{\partial a}
\frac{\partial u}{\partial b}=\frac{\partial u}{\partial x}\frac{\partial x}{\partial b}+\frac{\partial u}{\partial y}\frac{\partial y}{\partial b}+\frac{\partial u}{\partial z}\frac{\partial z}{\partial b}
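To see the two formulas at work, here is a small sketch with a concrete, made-up choice of functions (x = ab, y = a + b, z = a - b, f = xy + z^2 are assumptions for illustration only, not part of the exercise). It evaluates the chain-rule sums and compares them with finite differences of the composed function.

```python
# Hypothetical inner and outer functions, chosen only for illustration.
def x_of(a, b): return a * b
def y_of(a, b): return a + b
def z_of(a, b): return a - b
def f(x, y, z): return x * y + z ** 2

def u(a, b):
    return f(x_of(a, b), y_of(a, b), z_of(a, b))

def grad_u_chain(a, b):
    x, y, z = x_of(a, b), y_of(a, b), z_of(a, b)
    du_dx, du_dy, du_dz = y, x, 2 * z          # partials of f
    dx_da, dy_da, dz_da = b, 1.0, 1.0          # partials w.r.t. a
    dx_db, dy_db, dz_db = a, 1.0, -1.0         # partials w.r.t. b
    du_da = du_dx * dx_da + du_dy * dy_da + du_dz * dz_da
    du_db = du_dx * dx_db + du_dy * dy_db + du_dz * dz_db
    return du_da, du_db

def grad_u_numeric(a, b, h=1e-6):
    # central differences on the composed function u(a, b)
    du_da = (u(a + h, b) - u(a - h, b)) / (2 * h)
    du_db = (u(a, b + h) - u(a, b - h)) / (2 * h)
    return du_da, du_db

print(grad_u_chain(2.0, 3.0))    # (19.0, 18.0)
print(grad_u_numeric(2.0, 3.0))  # close to the line above
```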

James1 · May 18, 2020, 8:00pm

Hi, wondering if someone can help me out.

I keep getting a bunch of “IndentationError: unexpected indent” errors when I try to run the code after the sentence

“With these three functions for figure configurations, we define the plot function to plot multiple curves succinctly since we will need to visualize many curves throughout the book.”

I can't seem to get this code to work. Does anyone have any suggestions?
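The thread does not say what caused the error here, so this is only a guess at one common cause: code copied from a web page or pasted line by line can pick up extra leading whitespace, and Python then rejects the first indented statement. The sketch below reproduces that situation with a made-up function (`double` is hypothetical, not from the book) and shows `textwrap.dedent` from the standard library removing the common leading indent.

```python
import textwrap

# A pasted snippet where every line carries four extra leading spaces.
pasted = """
    def double(v):
        return 2 * v
"""

try:
    compile(pasted, "<pasted>", "exec")
    error = None
except IndentationError as e:
    error = e.msg  # message reported by the Python parser

# Strip the whitespace shared by all lines, then run the cleaned code.
fixed = textwrap.dedent(pasted)
namespace = {}
exec(compile(fixed, "<fixed>", "exec"), namespace)

print(error)
print(namespace["double"](21))  # 42
```

If the cause is something else (for example, mixed tabs and spaces), checking that the whole cell uses one consistent indent style is a reasonable next step.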

ManuLasker · June 16, 2020, 10:08pm

Hi $z = x + y$ .

$$a^2 + b^2 = c^2$$

$$\begin{vmatrix}a & b\\ c & d \end{vmatrix}=ad-bc$$
