When you submit your work, follow the instructions on the submission workflow page scrupulously for full credit.
Instead of handing out sample exams in preparation for the midterm and final, homework assignments include several problems that are marked Exam Style . These are problems whose format and contents are similar to those in the exams, and are graded like the other problems. Please use these problems as examples of what may come up in an exam.
As a further exercise in preparation for exams, you may want to think of other plausible problems with a similar style. If you collect these problems in a separate notebook as the course progresses, you will have a useful sample you can use to rehearse for the exams.
Is this assignment properly formatted in the PDF submission, and did you tell Gradescope accurately where each answer is?
This question requires no explicit answer from you, and no space is provided for an answer in the template file.
However, the rubric entry for this problem in Gradescope will be used to decuct points for poor formatting of the assignment (PDF is not readable, code or text is clipped, math is not typeset, ...) and for incorrect indications in Gradescope for the location of each answer.
Sorry we have to do this, but poor formatting and bad answer links in Gradescope make our grading much harder.
Before you start, install the Anaconda Python distribution, which contains essentially all of the code you will need for this course. Please do not use other installations of Python, and install the version for Python 3, not Python 2. Also, if you install new Python packages later on, do not mix package installation methods, unless you know very thoroughly what you are doing (for instance, do not use pip
to install new software, only use conda
). This will save you big headaches as the semester progresses.
conda
Environment¶If you use Python for more than this course this semester, you are strongly advised to create a conda
environment for the class, and do all your work for it in that environment. This will prevent interference with other installations. One way to create such an environment is to run the following command in a terminal:
conda env create --file environment.yml
Here is the file environment.yml .
The name of the resulting environment is compsci527
. If you are in the conda
base environment, you can activate this environment by the command
conda activate compsci527
When you are done using the environment, you can go back to the base environment, if desired, with the command
conda deactivate
In principle, code can be developed and debugged directly in Jupyter notebooks by using Python's own pdb module, or by using more visual debuggers such as the one provided with PixieDust, or even by interspersing the code with print
statements.
However, using a programming IDE on your computer is much easier and provides more powerful debugging tools. Anaconda comes with the Spyder IDE, which is quite good. My absolute favorite is PyCharm, which is insanely well conceived and implemented. If you request it from an .edu
email account, you can even get the professional edition for free.
So you are strongly encouraged to write and test your code in your favorite IDE first, and then copy and paste it into notebook cells. Keep functions (and individual lines of code) short by good code structuring, and put each function in its own cell for easier pagination.
In our experience, nearly everyone ignores this advice to save the hour or so that it takes to learn to use an IDE. As a result, nearly everyone wastes hours or even days of time debugging code that could have been fixed in minutes if it had been developed in an IDE with a proper debugger.
To work on this assignment, open the HTML file assignment00.html
from the class homework page (that's what you are reading now) in one browser window, and open the notebook homework00.ipynb
from the provided zip file in the notebook editor (in another browser window) next to it.
As you work on homework00.ipynb
, add as many cells as you need to answer the questions (use the Insert -> Insert Cell Below
menu or the +
icon in the menu bar). When you do so, make sure you select the proper cell type (Markdown
for text, Code
for Python code) from the menu, or else your formatting of text will not work and your code will not run.
Keep your cells small in both dimensions (few, short lines) for easier pagination.
The problems in this part pertain to the following matrix. No explanations are needed with your answers.
$$ A = \left[\begin{array}{cc} 2 & 6 \\ 1 & 3 \\ 0 & 0 \end{array}\right] $$What is the rank of this matrix?
Give exact numerical expressions for vectors that form orthonormal bases for the four fundamental spaces associated with $A$, namely
Answers are not unique, just pick your favorites and keep things simple. However, make sure that your bases are orthonormal.
State which vectors are for which space. For instance, you could say "The row space is spanned by..." or use a bullet list with headers.
An exact numerical expression may contain any algebraic operation. For instance, $\frac{\sqrt{3}}{2}$ is an exact numerical expression for the cosine of $\pi/6$, but 0.866 is not.
You need not distinguish between row vectors and column vectors. For instance, the following expressions would be considered acceptable representations of the same vector for this problem:
The problems in this part pertain to the following matrix, where $r, s, t, u$ are arbitrary real numbers (watch their order in $B$).
$$ B = \left[\begin{array}{cc} r & u\\ s & t \end{array}\right] $$Give the simplest inequality involving $r, s, t, u$ that constitutes a necessary and sufficient condition for $B$ to have real eigenvalues. Show your reasoning.
Use your reasoning for the previous problem to show that if $r = t$, then $B$ has two coincident eigenvalues iff at least one of $u$ and $s$ is zero.
"Iff" stands for "if and only if."
Use your reasoning for problem 2.1 above to show that a $2\times 2$ symmetric matrix with real entries has real eigenvalues.
It is not too hard to show that a real and symmetric square matrix of any size has real eigenvalues. However, the proof of this more general fact typically uses different reasoning. In contrast, this problem asks you to derive this conclusion for a $2\times 2$ matrix only, and using what you discovered in Problem 2.1.
When we think of the linear map described by a matrix-vector multiplication, as in
$$ \mathbf{y} = A \mathbf{x}\;, $$we often think of the computation of $\mathbf{y}$ "by rows," in the sense that the first entry of $\mathbf{y}$ is the inner product of the first row of $A$ and $\mathbf{x}$, and so forth. This is because this is the naive algorithm we were taught to use to compute this product.
A different view, which is often more useful, is to think of the same product "by columns." If we number everything in Python style, that is, starting with zero, we can write the same product as follows:
$$ \mathbf{y} = A \mathbf{x} = \left[\begin{array}{ccc} \mathbf{a}_0 & \cdots & \mathbf{a}_{n-1} \end{array}\right] \left[\begin{array}{c} x_0\\\vdots\\x_{n-1} \end{array}\right] = x_0 \mathbf{a}_0 + \ldots + x_{n-1} \mathbf{a}_{n-1} \;. $$In words, the output vector $\mathbf{y}$ is a linear combination of the columns of $A$ and the coefficients of the combination are the entries of the input vector $\mathbf{x}$.
(If it is not immediately clear to you that these two ways to see the same product are mathematically equivalent, you may want to compute both on a small example (say, with a $3\times 2$ matrix) to verify. Do not hand in your computations, though.)
If you view the product $A \mathbf{x}$ by columns, it should not be too surprising that, somewhat loosely speaking, if $\mathbf{x}$ is on a circle, then $\mathbf{y}$ is on an ellipse, and that if $\mathbf{x}$ is on a sphere, then $\mathbf{y}$ is on an ellipsoid.
To check this, the code below draws the sphere/ellipsoid case.
import numpy as np
from matplotlib import pyplot as plt
%matplotlib inline
Note the %matplotlib inline
magic above that makes matplotlib
work well in Jupyter notebooks.
def sphere(a, n=30):
assert np.all(a.shape == (3, 3)), 'matrix must be 3 x 3'
phi = np.linspace(0., 2. * np.pi, n)
theta = np.linspace(0., np.pi, n)
phi, theta = np.meshgrid(phi, theta)
phi, theta = np.ravel(phi), np.ravel(theta)
sin_theta = np.sin(theta)
x = np.stack((sin_theta * np.cos(phi), sin_theta * np.sin(phi), np.cos(theta)))
y = np.dot(a, x)
return y
def draw_3d_curve(y):
assert y.ndim == 2 and y.shape[0] == 3, 'y must be a 3 by k array'
fig = plt.figure(figsize=(8, 6))
ax = fig.add_subplot(projection='3d')
ax.plot(y[0], y[1], y[2])
ranges = [np.ptp(yk) for yk in y]
ax.set_box_aspect(ranges)
ax.set_xlabel('x')
ax.set_ylabel('y')
ax.set_zlabel('z')
plt.show()
a1 = np.array([[5., 0., -1], [3., 1., 0.], [0., -2., 2.]])
ys1 = sphere(a1)
draw_3d_curve(ys1)
This is a weird way to draw an ellipsoid: The function sphere
first generates a $30\times 30$ grid of points $\mathbf{x}$ on a unit sphere, then "flattens" the grid (with the numpy.ravel
function) into a path that traverses all the points on the grid in some order, and then returns a $3\times 900$ matrix whose columns are the 900 points $\mathbf{y} = A \mathbf{x}$. Finally, the function draw_3d_curve
draws the path traversed by the points $\mathbf{y}$.
The reason for this strange procedure is that we can then use the same function draw_3d_curve
to also display the circle/ellipse case.
Write a function with header
def ellipse(a, n=100):
that is similar in style to sphere
(but quite a bit simpler). The function takes a $3\times 2$ matrix $A$ as its parameter a
and generates a $3\times$ n
array of points of the form $\mathbf{y} = A\mathbf{x}$ where $\mathbf{x}$ is a point on the unit circle on the plane.
Then show the result of calling draw_3d_curve
on the output from the call ellipse(a1)
where a1
is defined below.
a1 = np.array([[3., 0.], [0., 2.], [1., 1.]])
Do the following:
Run your code to display $A\mathbf{x}$ where $A$ is now the matrix a2
defined below and $\mathbf{x}$ is a set of $n$ points on the unit circle on the plane.
Describe briefly and clearly in what way the drawing in this Problem differs from that in Problem 3.1 in a major qualitative way. That is, what does the drawing look like rather than an ellipse?
Explain briefly and in intuitive terms why the drawing does not look like an ellipse.
Explain in what way the drawing you obtain in this Problem can still be viewed as a special ellipse.
a2 = np.array([[2., 6.], [1., 3.], [4., 12.]])
Do the following:
Use sphere
, the matrix a3
below, and draw_3d_curve
to draw what is prima facie supposed to be a path on an ellipsoid.
Describe briefly and clearly in what way the drawing in this Problem differs from that in the preamble to this Part in a major qualitative way. That is, what does the drawing look like rather than an ellipsoid? And what would it look like if the value of the parameter n
passed to sphere
were very large?
Explain briefly and in intuitive terms why the drawing does not look like an ellipsoid.
Explain in what way the drawing you obtain in this Problem relates to an ellipsoid.
a3 = np.array([[2., 6., 4.], [1., 3., 0.], [1., 3., 0.]])
Could you change the matrix a3
to another $3\times 3$ matrix a4
so that the ellipsoid looks qualitatively similar to the figure you obtained in Problem 3.2? If so, state how. If this cannot be done, explain why not. No need to produce an actual a4
.
In light of the experiments in the previous problems of this Part, we need to revisit the following statement, which slightly paraphrases what was said in the preamble:
Let $\mathbf{y} = A \mathbf{x}$ be a linear mapping. If $\mathbf{x}$ is on a circle, then $\mathbf{y}$ is on an ellipse, and if $\mathbf{x}$ is on a sphere, then $\mathbf{y}$ is on an ellipsoid.
This statement is of course not quite right, since we don't always obtain regular ellipses or ellipsoids. It is also specialized, as it speaks to figures defined in 2 or 3 dimensions.
To restate this observation more correctly and more generally, let us first introduce some terminology.
We use the term $d$-dimensional hypersphere to denote the set of all points at unit distance from the origin in some linear space of dimension (exactly) $d$.
We also use the term $d$-dimensional hyper-ellipsoid to denote the $d$-dimensional generalization of an ellipse or ellipsoid. In particular:
The notion of $d$-dimensional hyper-ellipsoid can be generalized to be valid for any positive integer $d$, and, in a sense, this is your job in this problem.
Specifically, let $A$ be an $m\times n$ matrix for any positive integer values of $m$ and $n$ (including $m < n$), and let $A$ have $r$ linearly independent columns (so that therefore $r \leq \min(m, n)$). As you know, the number $r$ is called the rank of $A$.
Rephrase the statement quoted above using the terms "$d$-dimensional hypersphere" and "$d$-dimensional hyper-ellipsoid," with $d$ replaced by a suitable value; and referring to suitable fundamental spaces associated with $A$ (remember problem 1.2) so that the statement becomes correct regardless of the values of $m, n, r$. Be as precise as you can in your statement.
No need to explain.
Every linear mapping establishes an isomorphism (i.e., a 1-1 mapping) between its row space and its column space.
Let $y = f(\mathbf{x})$ be a function from $\mathbb{R}^m$ to $\mathbb{R}$.
A line in $\mathbb{R}^m$ can be written in parametric form as follows:
$$ \mathbf{x} = \mathbf{p} + \alpha \mathbf{u} $$where $\mathbf{p}$ is a point in $\mathbb{R}^m$, $\mathbf{u}$ is a nonzero vector in $\mathbb{R}^m$, and $\alpha$ is a real-valued parameter. As $\alpha$ varies on the real line, the point $\mathbf{x}$ varies along a line in $\mathbb{R}^m$.
The restriction of $f$ to this line is the function $\mathbb{R} \rightarrow \mathbb{R}$ defined as follows:
$$ \phi(\alpha) = f(\mathbf{p} + \alpha \mathbf{u}) \;. $$The problems in this part pertain to the function $f\ :\ \mathbb{R}^3 \rightarrow \mathbb{R}$ defined as follows:
$$ f(\mathbf{x}) = 2x^2 - xy + z^2 \;\;\;\text{where}\;\;\; \mathbf{x} = \left[\begin{array}{c} x \\ y \\ z \end{array}\right] \;. $$Write an algebraic expression for the gradient $g(\mathbf{x})$ of the function $f$ given above at an arbitrary point $\mathbf{x}$. Your expression should be in terms of the entries $x, y, z$ of $\mathbf{x}$.
Then give the numerical value of $g(\mathbf{p})$ where
$$ \mathbf{p} = \left[\begin{array}{r} 1 \\ -1 \\ -2 \end{array}\right] \;. $$Write a formula for the restriction $\phi(\alpha) = f(\mathbf{p} + \alpha \mathbf{u})$ where $\mathbf{p}$ is the point given in the previous problem and where $\mathbf{u} = -g(\mathbf{p})$. Thus, the line to which $f$ is restricted goes through $\mathbf{p}$ and is oriented along the (negative) gradient of $f$ at $\mathbf{p}$.
Your formula should be a polynomial in $\alpha$. Simplify the polynomial as much as possible.
Find an exact numerical expression and an approximate decimal value for a stationary point $\alpha_0$ of $\phi(\alpha)$. A stationary point is a local minimum, maximum, or saddle.
Then give the approximate decimal value of $\phi(\alpha_0)$ and state whether $\alpha_0$ is a local maximum, minimum, or saddle point for $\phi(\alpha)$.
All your decimal values should have three decimal digits after the period.
Write code that plots $\phi(\alpha)$ for $\alpha$ in $[-1, 1]$.
You may want to check that the stationary point you found in the previous problem has type, position, and value consistent with your plot. However, do not show the results of your check.
Is the point
$$ \mathbf{x}_0 = \mathbf{p} - \alpha_0 g(\mathbf{p}) $$a stationary point for $f(\mathbf{x})$? Prove your answer.