# Lab 2: Broadcasting, Inner/Outer Products and Matrix Multiplication
---

## Lab Objectives

The goals for this lab are to

- Understand broadcasting.
- Understand matrix multiplication (and a few other operations).


At the end of each section there will be questions to answer. 

<span style="color:red">To receive credit for the assignment, the questions must be completed. </span>

We will include import statements for commonly used modules in the cell below. Make sure to type `shift enter` to run it. 

In [2]:
import numpy as np

# shorthand for importing random module from numpy
from numpy import random as npr


# Broadcasting and Array Operations

In the previous lab, we mentioned 1-d arrays of the form, 
$$a=\begin{bmatrix}a_0\\ a_1 \\ \vdots \\ a_{n-1} \end{bmatrix}$$

<br>
and 2d-arrays, i.e., matrices, which have the form,
$$a=\begin{bmatrix}a_{0,0} & a_{0,1} & \dots & a_{0,m-1}  \\ \vdots & \vdots & \ddots & \vdots \\ a_{n-1,0} & a_{n-1,1} & \dots & a_{n-1,m-1} \end{bmatrix}$$

<br>
Linear algebra only defines addition of vectors of the same size:
<br>
$$\begin{bmatrix} 1\\ 2 \\ 3 \\ 4 \end{bmatrix} +\begin{bmatrix} 8\\ 10 \\ 12 \\ 14 \end{bmatrix} = \begin{bmatrix} 9\\ 12 \\ 15 \\ 18 \end{bmatrix}$$
<br>
and scalar multiplication:
$$\begin{bmatrix} 1\\ 2 \\ 3 \\ 4 \end{bmatrix} \times 5 = \begin{bmatrix} 5\\ 10 \\ 15 \\ 20 \end{bmatrix}$$
and regular matrix multiplication. For 2d arrays (matrices), matrix multiplication involves first checking if the number of columns of the first matrix equals the number of rows of the second. So if the first matrix has shape (m,r) the second matrix must have shape (r,n) for some number n, else multiplication is undefined. If the matrices have shapes (m,r) and (r,n), the product has shape (m,n).


But while writing code, numpy allows for operations such as
$$\begin{bmatrix} 1\\ 2 \\ 3 \\ 4 \end{bmatrix} + 5 $$
It does not mean that we have invented new operations. Rather python interprets the above
equation to mean
$$\begin{bmatrix} 1\\ 2 \\ 3 \\ 4 \end{bmatrix} + \begin{bmatrix} 5\\5\\5\\5\end{bmatrix} = \begin{bmatrix} 6\\ 7 \\ 8 \\ 9 \end{bmatrix}$$


Similarly, numpy interprets
$$\begin{bmatrix} 1\\ 2 \\ 3 \\ 4 \end{bmatrix} +\begin{bmatrix} 1 & 3\end{bmatrix} $$
<br>
not as an incompatible vector addition, but as the following legitimate one:
$$\begin{bmatrix} 1&1\\ 2&2 \\ 3&3 \\ 4&4 \end{bmatrix} +\begin{bmatrix} 1 & 3\\1&3\\1&3\\1&3\end{bmatrix}$$
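The expansion above can be made visible in code. The following is a minimal sketch, assuming the same column and row as in the example; the names `col`, `row`, `col_expanded` and `row_expanded` are chosen here for illustration. It uses `np.broadcast_to`, which returns a read-only view with the expanded shape rather than a real copy.

```python
import numpy as np

col = np.array([[1], [2], [3], [4]])   # shape (4, 1)
row = np.array([[1, 3]])               # shape (1, 2)

# Show the expanded arrays that numpy conceptually works with.
col_expanded = np.broadcast_to(col, (4, 2))
row_expanded = np.broadcast_to(row, (4, 2))
print(col_expanded)
print(row_expanded)

# Adding the original arrays gives the same result as adding the expansions.
print(col + row)
print(np.array_equal(col + row, col_expanded + row_expanded))
```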
<br>


These "generous interpretations" built into numpy are called **broadcasting** rules. This often allows for simpler and more human readable/understandable code. The expansions are not arbitrary, and broadcasting cannot always make any two matrices compatible. Rather there are specific rules underlying broadcasting. They are:

1. Compare the number of dimensions of each array. If one array has fewer dimensions, prepend dimensions of size 1 on the left of its shape until both shapes have the same length.
2. Iterate through the new shapes. 
    1. If the array shapes match in a coordinate, nothing to be done.
    2. If the array shapes do not match in a coordinate, but one of the array shapes has a 1 in that coordinate, then that array is copied along that dimension to match the other array. For example, suppose the two arrays have shapes (4,1) and (1,2), as in one of the additions above. The first coordinates of the shapes are 4 and 1 respectively, so the second array is repeated along this dimension 4 times. The second coordinates of the shapes are 1 and 2, so the first array is repeated along this dimension 2 times. The eventual broadcasted shape is therefore 4x2. 
    3. If the array shapes do not match in a coordinate, and neither shape has one as the entry in that coordinate, then they are incompatible. So (4,3) and (2,3) are incompatible. 

After a common shape is found (if any), operations are performed on the broadcasted new matrices.

For example, the shapes
```
 2x3x1 
 3x5
```
are compatible, and would result in  an array of 
``` 
 2x3x5
```
Let us work through this. Applying step 1, we have, 
```
2x3x1
1x3x5
```

Then step two would result in,
```
2x3x5
2x3x5
```
if added or multiplied.

On the other hand, the shapes
```
 3x4x2
 2x1
```
are not compatible.
After step 1, they would have shapes,
```
 3x4x2
 1x2x1
```
Then, by the third case of step 2, they would be deemed incompatible because of the middle dimension (4 versus 2, and neither is 1).
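The two worked examples above can be checked with a small sketch of the rules. The helper name `broadcast_shape` is hypothetical (it is not a numpy function), and it returns `None` for incompatible shapes instead of raising an error, which is a choice made only for this illustration.

```python
import numpy as np

def broadcast_shape(shape_a, shape_b):
    """Return the broadcasted shape of two shapes, or None if incompatible."""
    a, b = list(shape_a), list(shape_b)
    # Step 1: pad the shorter shape with 1s on the left.
    while len(a) < len(b):
        a.insert(0, 1)
    while len(b) < len(a):
        b.insert(0, 1)
    # Step 2: compare the shapes coordinate by coordinate.
    result = []
    for da, db in zip(a, b):
        if da == db:
            result.append(da)      # shapes match, nothing to do
        elif da == 1:
            result.append(db)      # first array is repeated
        elif db == 1:
            result.append(da)      # second array is repeated
        else:
            return None            # mismatch and neither entry is 1
    return tuple(result)

print(broadcast_shape((2, 3, 1), (3, 5)))   # (2, 3, 5)
print(broadcast_shape((3, 4, 2), (2, 1)))   # None

# Cross-check the compatible case against what numpy actually does.
print((np.zeros((2, 3, 1)) + np.zeros((3, 5))).shape)   # (2, 3, 5)
```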


For a more in-depth explanation, see https://docs.scipy.org/doc/numpy/user/basics.broadcasting.html

Now we go over some examples:

In [4]:
## Example of Compatible shapes
a = np.array(np.arange(3)).reshape(1,3)
b = np.array(np.arange(3)).reshape(3,1)

print("a is the row vector \n", a, "\n and has shape ", a.shape,'\n')
print("b is the column vector \n", b, "\n and has shape ", b.shape,'\n')

print("a+b is not a valid matrix operation (you cannot add a row vector to a column, you can only add matrices/vectors of the same shape). But what python understands from a+b here is very different. Both a and b are broadcasted to form 3x3 matrices (can you use the rules above to verify why?). Then python adds the broadcasted matrices to give  \n", a+b, "\n and the output has shape ", (a+b).shape,'\n')
print("a*b is not matrix multiplication, but elementwise multiplication of 2 matrices of the same shape (equivalent to .* in MATLAB). It is unfortunate * is used here. But here, our a and b do not have the same shape, but they can both be broadcasted to 3x3 matrices. Then, python performs elementwise multiplication on the broadcasted matrices to output \n", a*b, "\n and has shape ", (a*b).shape)

## They are compatible, here is the result of expanding them to match the new shape of (3,3) 
print("In both cases, a is broadcasted/expanded to \n", np.repeat(a,3,axis=0),'\n')
print("And b is broadcasted/expanded to \n", np.repeat(b,3,axis=1) ,'\n')

## If we wanted regular matrix multiplication of the vectors instead:
print('This is the regular matrix multiplication between a and b, denoted by a@b, (regular multiplication of a 1x3 and a 3x1 matrix yields a product of size 1x1): \n', a@b)
print('and the regular multiplication between b and a, denoted by b@a, (a 3x1 and a 1x3 matrix multiply to yield a 3x3 product): \n', b@a)

a is the row vector 
 [[0 1 2]] 
 and has shape  (1, 3) 

b is the column vector 
 [[0]
 [1]
 [2]] 
 and has shape  (3, 1) 

a+b is not a valid matrix operation (you cannot add a row vector to a column, you can only add matrices/vectors of the same shape). But what python understands from a+b here is very different. Both a and b are broadcasted to form 3x3 matrices (can you use the rules above to verify why?). Then python adds the broadcasted matrices to give  
 [[0 1 2]
 [1 2 3]
 [2 3 4]] 
 and the output has shape  (3, 3) 

a*b is not matrix multiplication, but elementwise multiplication of 2 matrices of the same shape (equivalent to .* in MATLAB). It is unfortunate * is used here. But here, our a and b do not have the same shape, but they can both be broadcasted to 3x3 matrices. Then, python performs elementwise multiplication on the broadcasted matrices to output 
 [[0 0 0]
 [0 1 2]
 [0 2 4]] 
 and has shape  (3, 3)
In both cases, a is broadcasted/expanded to 
 [[0 1 2]
 [0 1 2]
 [0 1

In [5]:
## Example of Incompatible shapes
a = np.array(np.arange(9)).reshape(3,3)
b = np.array(np.arange(4)).reshape(4,1)

print("a is \n", a, "\n and has shape ", a.shape)
print("b is \n", b, "\n and has shape ", b.shape)

# The following will throw an exception/error.
print("a+b is \n", a+b, "\n and has shape ", (a+b).shape)
print("a*b is \n", a*b, "\n and has shape ", (a*b).shape)


a is 
 [[0 1 2]
 [3 4 5]
 [6 7 8]] 
 and has shape  (3, 3)
b is 
 [[0]
 [1]
 [2]
 [3]] 
 and has shape  (4, 1)


ValueError: operands could not be broadcast together with shapes (3,3) (4,1) 

### Why do we care?
Why not just write `for` loops?

Broadcasting not only simplifies the code we write, it can also be much faster. Python is an interpreted language, and if we write an explicit `for` loop, the code can take hours longer than necessary. Python, or the corresponding module, provides shortcuts to code that is implemented and compiled in C. Part of being an exceptional programmer is understanding how to use the *fast lanes* provided to us by the developers.
For more details see https://docs.scipy.org/doc/numpy/reference/ufuncs.html


<span style="color:red"> This is an important lesson: as we progress through this course, it is the difference between code that finishes running in short order and code that takes days to complete. </span>
<br> Here is a short example illustrating this point by naively writing our own function to multiply a multidimensional array by a scalar. 

In [31]:
## multiplies any multidimensional array, a, by a scalar b
def marrsc(a,b):
    c=np.zeros(a.shape)
    for x,y in np.nditer([a,c], op_flags = [['readonly'],["readwrite"]] ): 
        y[...] = x*b
    return c

## Create arbitrary array
a = npr.rand(50,33)

## Time how long it takes to execute the function we wrote
print('With a for loop: \n')
t1= %timeit -o marrsc(a,3.)

## Time how long it takes to use broadcasting
print('\nWith broadcasting: \n')
t2= %timeit -o a*3.

## Compare the difference
print("\n Our naive for loop is ", t1.average/t2.average, " times slower than using broadcasting.")


With a for loop: 

9.94 ms ± 589 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

With broadcasting: 

7.09 µs ± 127 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

 Our naive for loop is  1403.1222679218004  times slower than using broadcasting.
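`%timeit` is an IPython magic, so it only works inside a notebook or an IPython session. As a rough sketch of the same comparison in plain Python, one could use the standard-library `timeit` module. The function name `scale_with_loop` and the repeat count of 20 are assumptions made for this illustration, and the measured ratio will vary from machine to machine.

```python
import timeit
import numpy as np

def scale_with_loop(arr, scalar):
    """Multiply every entry of arr by scalar using an explicit Python loop."""
    out = np.zeros(arr.shape)
    for idx in np.ndindex(arr.shape):
        out[idx] = arr[idx] * scalar
    return out

rng = np.random.default_rng(0)
a = rng.random((50, 33))

# Average time per call over 20 calls for each approach.
loop_time = timeit.timeit(lambda: scale_with_loop(a, 3.0), number=20) / 20
fast_time = timeit.timeit(lambda: a * 3.0, number=20) / 20

print("Both approaches agree:", np.allclose(scale_with_loop(a, 3.0), a * 3.0))
print("The loop is roughly", loop_time / fast_time, "times slower here.")
```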


## <span style="color:red"> (1) Questions</span>

0. Create a new markdown cell below and type in the answers.

1. Please write down whether the following shapes are compatible and if so what is the resulting shape.
    1. 2x3x1 and 3
    2. 1x3 and 5x1
    3. 10x10 and 5x1
    4. 2x1x4 and 1x7x4
    5. 300x300x3 and 60x1
    
2. Which operations below would be valid operations if python did not do broadcasting? Which operations below can be performed after broadcasting? Check by writing the corresponding code in a code cell below.

    1. $$\begin{bmatrix} 6\\ 9 \\ 0 \\ 7 \end{bmatrix} * \begin{bmatrix} 2 & 4\end{bmatrix} $$ <br>
    2. $$\begin{bmatrix} 1 & 3 \\ 2 & 1  \end{bmatrix} + \begin{bmatrix} 11 & 33\end{bmatrix} $$ <br>
    3. $$\begin{bmatrix} 1 & 7 \\ 2 & 9 \\ 3 & 4 \end{bmatrix} * \begin{bmatrix} 1 \\ 3  \\ 7\end{bmatrix} $$



The basic operation we want to understand is matrix multiplication. Not the funny elementwise multiplication of matrices (which is not physically relevant except while writing clean code), but regular matrix multiplication. In numpy/python, the regular matrix multiplication (for 2d arrays) is @ or matmul.

First, recall a couple of basic points about matrix multiplication from high school. We assume that we are multiplying two matrices, $A$ and $B$, where $A$ has shape $(m,r)$ and $B$ has shape $(r,n)$ (so multiplication is possible). Note that for multiplication to work the number of columns (the second element in the shape of $A$) must equal the number of rows (the first element in the shape of $B$).

We will present two ways to multiply matrices. One will be through an inner product and the other will be using an outer product of vectors. 


In [7]:
# This stands for: a = matrix with random integer entries of shape (3,2), 
# each entry is in range(0,4) ie one of 0,1,2,3 randomly chosen
a = npr.randint(0,4,(3,2))
b = npr.randint(0,4,(2,3))

print('The product of \n',a,'\n and \n',b,'\n is \n', a@b)
print('We could also multiply them using matmul: \n',np.matmul(a,b))

c = npr.randint(0,4,(1,3))
# Uncommenting the following line will throw up an error:
# a @ c
print('a has shape (3,2), c has shape (1,3), product is not defined: ')

d = npr.randint(0,4,(3,3))

# What happens if we try d @ c? d has shape (3,3) and c has shape (1,3).
# We cannot multiply these matrices since the number of columns of the first (3) != no. of rows of second (1). 
# Here, the operator @ will NOT broadcast c in this case into a (3,3) matrix and then 
# multiply. Rather, it will simply throw an error. This is useful because 
# you have to make sure there is some operator that doesn't do things behind your back :).
# But of course, d @ c.T works (c.T is the transpose of c with shape (3,1)).

print(d @ c.T )

# Uncommenting the following line throws an error:
# print(d @ c)


The product of 
 [[1 2]
 [0 0]
 [0 1]] 
 and 
 [[2 0 1]
 [0 0 1]] 
 is 
 [[2 0 3]
 [0 0 0]
 [0 0 1]]
We could also multiply them using matmul: 
 [[2 0 3]
 [0 0 0]
 [0 0 1]]
a has shape (3,2), c has shape (1,3), product is not defined: 
[[2]
 [4]
 [6]]


# Matrix multiplication

This section reviews matrix multiplication. As mentioned above, if you have two matrices $A$ and $B$, for $AB$ to be defined, the number of columns in $A$ (the second entry in the shape of $A$) must equal the number of rows in $B$ (the first entry in the shape of $B$). So if $A$ has shape $(m,r)$ and $B$ has shape $(r,n)$, the product has shape $(m,n)$.

In high school you learnt how to multiply matrices, possibly by rote. If it is unclear why that is natural, you are not to blame. Here we will review matrix multiplication and as we do more work with actual data and concepts underlying learning, it should become more clear why matrix multiplication is defined this way. 

For future reference, the operator "@" performs matrix multiplication. 

Python also provides * for element-wise multiplication, but this is *NOT* called the product of the matrices. It is provided only to facilitate writing code in certain instances, and does not have much physical significance. Mathematically, element-wise multiplication of the entries of matrices $A$ and $B$ is defined if and only if the shapes of $A$ and $B$ are the same (numpy additionally accepts shapes that can be broadcasted to a common shape, as described earlier), and once again, this doesn't really have much conceptual meaning.
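To make the difference concrete, here is a small sketch with two fixed 2x2 matrices (the values are chosen here only for illustration). The two operators accept the same inputs but produce different results.

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])

elementwise = A * B       # multiplies matching entries
matrix_product = A @ B    # rows of A times columns of B

print(elementwise)        # [[ 5 12], [21 32]]
print(matrix_product)     # [[19 22], [43 50]]
```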

In [15]:
# Change the shapes below to try out various other shapes

a = npr.randint(0,4,(3,2))
b = npr.randint(0,4,(2,4))

print('shape of a is',a.shape,'and shape of b is',b.shape)

print('The product of a and b is \n', a @b, '\n and the shape of the product is', (a@b).shape)
print('The elementwise multiplication of entries in a and b is *NOT* the product of a and b. Such an elementwise product is only defined if the shapes of a and b are the same')

shape of a is (3, 2) and shape of b is (2, 4)
The product of a and b is 
 [[4 9 9 9]
 [2 5 3 3]
 [1 3 0 0]] 
 and the shape of the product is (3, 4)
The elementwise multiplication of entries in a and b is *NOT* the product of a and b. Such an elementwise product is only defined if the shapes of a and b are the same


## Transpose of a matrix

The transpose of a matrix $A$, denoted by $A^T$ simply makes the columns of $A$ into rows. python uses A.T to obtain the transpose of a matrix. Clearly, if $A$ has shape $(m,n)$, then $A^T$ has shape $(n,m)$.

In [12]:
a = npr.randint(0,4,(3,2))

print('The matrix a is \n',a,'\n and its transpose is \n',a.T)
print('The shape of a is ', a.shape, 'and the shape of a.T is',a.T.shape)

The matrix a is 
 [[2 0]
 [3 0]
 [2 2]] 
 and its transpose is 
 [[2 3 2]
 [0 0 2]]
The shape of a is  (3, 2) and the shape of a.T is (2, 3)


Before we go into matrix multiplication proper, we will consider vectors first. Vectors are special matrices where either the number of rows is 1 or the number of columns is 1. Column vectors are numbers arranged in a column (vertically, so only one column), while row vectors are numbers arranged in a row (horizontally, so only one row). 

Therefore if you have two vectors $a$ and $b$, with shapes $(m,1)$ and $(n,1)$ respectively, where $m$ and $n$ are numbers, you can define two possible multiplications:

* If $m=n$, a.T @ b:

Here $a^T$ has shape $(1,n)$ and $b$ has shape $(n,1)$. Therefore the product has shape $(1,1)$. This is called the __inner product__ of $a$ and $b$.

* a @ b.T (regardless of $m$ and $n$):

Here $a$ has shape $(m,1)$, and $b^T$ has shape $(1,n)$. Therefore the product has shape $(m,n)$. This is called the __outer product__ of $a$ and $b$.

Note that the inner product is a number (matrix with only one entry) while the outer product is in general a matrix.

## Inner products

The inner product is also known as the dot product, a term you may be more familiar with. If you have two vectors $a$ and $b$ with $n$ coordinates each, you arrange $a$ as a row vector (i.e. with shape 1xn) and $b$ as a column vector (with shape nx1) and multiply the two:
$$
\begin{bmatrix}a_1&\ldots& a_n\end{bmatrix} 
\begin{bmatrix} b_1\\\vdots\\ b_n\end{bmatrix} 
= a_1b_1+a_2b_2 +\ldots + a_n b_n.
$$
Of course, you could arrange $b$ as the row vector and $a$ as the column, and multiply the row vector $b$ by the column vector $a$ to get the same result. So inner products are __commutative__ (the order of the vectors does not matter).
The result is a number (1x1). Recall that the inner product gives you the angle between the vectors $a$ and $b$, specifically,
$$
a \cdot b = |a||b| \cos\theta,
$$
where $|a|$ and $|b|$ are the lengths of the vectors $a$ and $b$ respectively, and $\theta$ the angle between them. The above equation applied on $a$ and $a$ would give,
$$
a \cdot a = |a|^2\cos 0 = |a|^2,
$$
so the length of a vector is just the square root of the inner product of the vector with itself. 
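As a quick numerical illustration of the two formulas above, the sketch below computes the lengths of two vectors and the angle between them using only inner products. The vectors are chosen here for illustration so that the angle is easy to check by hand (45 degrees).

```python
import numpy as np

a = np.array([1.0, 0.0])
b = np.array([1.0, 1.0])

# Length of a vector: square root of its inner product with itself.
length_a = np.sqrt(np.dot(a, a))
length_b = np.sqrt(np.dot(b, b))

# Rearranging a.b = |a||b| cos(theta) gives the angle.
cos_theta = np.dot(a, b) / (length_a * length_b)
theta_degrees = np.degrees(np.arccos(cos_theta))

print(length_a, length_b)   # 1.0 and the square root of 2
print(theta_degrees)        # approximately 45
```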

## Outer products

The outer product is the "other" way to multiply two vectors. Here you take the first vector $a$ and arrange it as a column vector (so has shape nx1). The second vector is arranged as a row vector (so has shape 1xn). The number of columns of the first vector (1) equals the number of rows of the second (1 again), so these are compatible as well. Their product has size nxn, and is called the outer product:
$$
\begin{bmatrix} a_1\\\vdots \\ a_n \end{bmatrix}
\begin{bmatrix} b_1 &\ldots & b_n\end{bmatrix}
=
\begin{bmatrix}
a_1b_1 & a_1b_2 & \ldots & a_1b_n \\
a_2b_1 & a_2 b_2 & \ldots & a_2b_n\\
\vdots & \vdots & \vdots & \vdots\\
a_nb_1 & a_n b_2 & \ldots & a_nb_n
\end{bmatrix}
$$
Now also note that the vectors $a$ and $b$ need not have the same length for their outer product to be defined. Namely,
it is perfectly ok if $a$ had length $m$ and $b$ had length $n$. Then we would multiply an $(m,1)$ matrix and a $(1,n)$ matrix,
which is valid (number of columns of the first = number of rows of the second = 1). The product is then an $(m,n)$ matrix. 
Note that the outer product of $a$ and $b$ is the __transpose__ of the outer product of $b$ and $a$ in general.

If you remember matrix multiplication from high school, verify that the multiplication above is consistent with the "high school" way of multiplying the $(n,1)$ and $(1,n)$ matrices above. 
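The following sketch illustrates the two remarks above for vectors of different lengths (here 2 and 3, values chosen for illustration): the outer product is still defined, and swapping the order of the vectors transposes the result.

```python
import numpy as np

a = np.array([1, 2])        # length m = 2
b = np.array([3, 4, 5])     # length n = 3

# Outer product as an ordinary matrix product of a (2,1) and a (1,3) matrix.
ab = a.reshape(2, 1) @ b.reshape(1, 3)
ba = b.reshape(3, 1) @ a.reshape(1, 2)

print(ab)                   # [[ 3  4  5], [ 6  8 10]]
print(ab.shape, ba.shape)   # (2, 3) (3, 2)
print(np.array_equal(ba, ab.T))
```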


In [19]:
a = npr.randint(0,4,(3,1))
b = npr.randint(0,4,(3,1))

print('The vector a is \n', a, '\n and the vector b is \n',b)

print('The inner product of a and b is a.T @ b and is', a.T@b)
print('The outer product of a and b is a @b.T and is \n', a@ b.T)

The vector a is 
 [[2]
 [3]
 [0]] 
 and the vector b is 
 [[3]
 [0]
 [0]]
The inner product of a and b is a.T @ b and is [[6]]
The outer product of a and b is a @b.T and is 
 [[6 0 0]
 [9 0 0]
 [0 0 0]]


### Programming note: 
You have seen this phenomenon in Lab 1, but we emphasize this again. When you extract individual rows or columns of a matrix, python returns a 1-d array with shape that looks like $(n,)$. There is no second entry. So you have two choices to compute inner and outer products from here:
* convert them into 2-d arrays and then make them into row or column vectors as you need (try to do this on your own using what you learnt in lab0)
* use the builtin functions numpy.dot and numpy.outer

The following snippet extracts rows and columns from a parent matrix and computes their inner and outer products using the builtin dot and outer commands. Please experiment around with manual conversion to 2-d vectors.

In [31]:
P = npr.randint(0,4,(3,2))

print('P is \n',P,'\n')

# Extract rows of P. Recall how to access elements of a matrix from lab0

rowone = P[0,:]
rowtwo = P[1,:]
rowthree = P[2,:]

# Extract cols of P

colone= P[:,0]
coltwo = P[:,1]

print('The shape of the extracted rows are ',rowone.shape,'and the shape of the extracted cols are',colone.shape)

# inner products: since the vector sizes must be equal for inner products, and P has size (3,2)
# we can only take inner products between two rows or two cols, but not between a row and col 

print('Inner product between row 1 and row 2 is ', np.dot(rowone,rowtwo), 'while the inner product between col 2 and col 1 is', np.dot(coltwo,colone))

print('Inner product of rowone and rowtwo:', np.dot(rowone, rowtwo), ' and of rowtwo and rowone:', np.dot(rowtwo,rowone))

# outer product: any two vectors have an outer product

print('Outer product of rowone and coltwo is \n', np.outer(rowone,coltwo),'\n and outer product of coltwo and rowone is\n', np.outer(coltwo,rowone))

print('The inner product is commutative: changing the order of the vectors does not change the inner product.\n Changing the order of the vectors transposes the outer product in general.\n')



P is 
 [[3 1]
 [2 1]
 [3 0]] 

The shape of the extracted rows are  (2,) and the shape of the extracted cols are (3,)
Inner product between row 1 and row 2 is  7 while the inner product between col 2 and col 1 is 5
Inner product of rowone and rowtwo: 7  and of rowtwo and rowone: 7
Outer product of rowone and coltwo is 
 [[3 3 0]
 [1 1 0]] 
 and outer product of coltwo and rowone is
 [[3 1]
 [3 1]
 [0 0]]
The inner product is commutative: changing the order of the vectors does not change the inner product.
 Changing the order of the vectors transposes the outer product in general.
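For the manual route mentioned in the programming note, one possible sketch is shown below. It assumes the same matrix `P` as in the output above and uses `reshape` with `-1` (which lets numpy infer that dimension) to turn 1-d slices into explicit row and column vectors; the variable names are chosen here for illustration.

```python
import numpy as np

P = np.array([[3, 1], [2, 1], [3, 0]])

rowone = P[0, :]    # 1-d array with shape (2,)
rowtwo = P[1, :]    # 1-d array with shape (2,)

# Inner product: (1,2) row vector times (2,1) column vector.
inner = rowone.reshape(1, -1) @ rowtwo.reshape(-1, 1)

# Outer product: (2,1) column vector times (1,2) row vector.
outer = rowone.reshape(-1, 1) @ rowtwo.reshape(1, -1)

print(inner)    # [[7]], a (1,1) matrix rather than a plain number
print(outer)    # [[6 3], [2 1]]
```

Note that the manual route returns a `(1,1)` matrix for the inner product, whereas `np.dot` on 1-d arrays returns a plain number.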



There are two separate ways to multiply matrices, but they both give the same product. Matrices can be multiplied using either the inner product or the outer product. 

The method you learnt in high school is the inner product approach. This has the benefit that it is easy to teach by rote, particularly since at the level of high school matrices are just considered an arrangement of numbers. As we have seen in class, matrices are not just an arrangement of numbers, but linear spaces---and a majority of these concepts are actually better brought about from the outer product approach that you have probably not seen.


### Matrix multiplication via the inner product. 

Let $A$ and $B$ be two matrices with shapes $(m,r)$ and $(r,n)$ respectively. The product has shape $(m,n)$, and the 
$(i,j)$th element of the product is just the inner product between the $i$th row of $A$ and the $j$th column of $B$. So  
$$
A = \begin{bmatrix} --- {{\bf a}_1} --- \\ \vdots \\ ---{{\bf a}_m} ---\end{bmatrix},
$$
where the ${\bf a}_i$s are vectors, each with $r$ coordinates (because $A$ has $r$ columns), and 
$$
B = \begin{bmatrix} | & \ldots & | \\ {{\bf b}_1} &\ldots &{{\bf b}_n} \\ | &\ldots & | \end{bmatrix},
$$
where the ${\bf b}_j$s are vectors, again each with $r$ coordinates (because $B$ has $r$ rows).
Then the product has shape $(m,n)$ (i.e. $m$ rows and $n$ columns) and its entry in the $i$th row and $j$th column is the
inner product between ${\bf a}_i$ and ${\bf b}_j$ as described above.
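As a small worked illustration (the two $2\times 2$ matrices are chosen here only because they are easy to check by hand), each entry of the product is the inner product of a row of the first matrix with a column of the second:
$$
\begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}
\begin{bmatrix} 5 & 6 \\ 7 & 8 \end{bmatrix}
=
\begin{bmatrix}
1\cdot 5 + 2\cdot 7 & 1\cdot 6 + 2\cdot 8 \\
3\cdot 5 + 4\cdot 7 & 3\cdot 6 + 4\cdot 8
\end{bmatrix}
=
\begin{bmatrix} 19 & 22 \\ 43 & 50 \end{bmatrix}
$$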



In [33]:
a = npr.randint(-2,2,(1,4))
b = npr.randint(-2,2,(1,4))

# Calling the dot/inner product on two row vectors throws up an error. Uncommenting the 
# following line will result in an error.
# np.dot(a,b)

# However, if u and v are not 2d arrays like above, but 1d arrays as follows:
u = npr.randint(-2,2,4)
print('u is a vector with shape ',u.shape)
v = npr.randint(-2,2,4)
print('v is a vector with shape ',v.shape)
# Then numpy.dot has no problem with aligning the vectors appropriately and giving the inner product:
print('The inner product of ',u,' and ',v,' is ',np.dot(u,v))
# The above is especially useful when we take 1-d slices of matrices, since these are returned as vectors.

# For the dot/inner product, we must arrange a as a row and b as a column. Lucky
# for us, a is already a row. So we just need to rearrange
# b.

b.shape = (4,1)

# You can now use the numpy.dot operation (or a simple @, because the inner product is just
# a special case of regular matrix multiplication). We use np.dot to emphasize we are dealing
# with vectors and we are taking the inner product. Verify this by hand.

print('The inner product of \n',a,'\n and \n',b,'\n is ',np.dot(a,b),'\n which we could also obtain using @: ', a@b)

A = npr.randint(-3,3,(3,2))
B = npr.randint(-3,3,(2,5))


print('The product of A=\n',A,'\n and B=\n',B,'\n is \n',A @ B,'\n with shape ',(A @ B).shape)
# Note the previous comment about 1-d slices of matrices in the following line:
print('The (1,2)th element of the product is the inner product between the 2nd row of A: \n',A[1,:],'\n and the third column of B: \n',B[:,2],'\n which is ',np.dot(A[1,:],B[:,2]))

# Try a few other elements of the product by taking inner products of appropriate slices.

u is a vector with shape  (4,)
v is a vector with shape  (4,)
The inner product of  [ 0 -2  1  1]  and  [ 1 -1 -1 -1]  is  0
The inner product of 
 [[-2  1 -1 -1]] 
 and 
 [[-2]
 [ 1]
 [-2]
 [ 0]] 
 is  [[7]] 
 which we could also obtain using @:  [[7]]
The product of A=
 [[-3  0]
 [-2  1]
 [ 0 -2]] 
 and B=
 [[ 1 -2 -1 -1  1]
 [ 0  0 -1  1  0]] 
 is 
 [[-3  6  3  3 -3]
 [-2  4  1  3 -2]
 [ 0  0  2 -2  0]] 
 with shape  (3, 5)
The (1,2)th element of the product is the inner product between the 2nd row of A: 
 [-2  1] 
 and the third column of B: 
 [-1 -1] 
 which is  1




###  Multiplying Matrices through the Outer Product
The second method, which is done by using the outer product, calculates $A \times B$ by multiplying columns of $A$ by rows of $B$. Thus if $A$ has shape $(m,r)$ and $B$ has shape $(r,n)$, we now think of the columns of $A$ and the rows of $B$. Namely,
we think of $A$ as  
$$
A = \begin{bmatrix} | & \ldots & | \\ {{\bf a}_1} &\ldots &{{\bf a}_r} \\ | &\ldots & | \end{bmatrix},
$$
where each ${{\bf a}_i}$ is a vector with $m$ coordinates (because $A$ has $m$ rows), and $B$ as
$$
B = \begin{bmatrix} --- {{\bf b}_1} --- \\ \vdots \\ ---{{\bf b}_r} ---\end{bmatrix},
$$
where the ${\bf b}_j$s are vectors, each with $n$ coordinates (because $B$ has $n$ columns). Then the product of $A$ and $B$
is 
$$ 
\begin{bmatrix} | \\ {{\bf a}_1}\\ | \end{bmatrix}
\begin{bmatrix} ---{{\bf b}_1}--- \end{bmatrix} 
+ 
\ldots 
+
\begin{bmatrix} | \\ {{\bf a}_r}\\ | \end{bmatrix}
\begin{bmatrix} ---{{\bf b}_r}--- \end{bmatrix} 
$$
Of course this is consistent with the inner product way of multiplying matrices! Can you see how?

Let us do an example.
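$$\begin{bmatrix} 1 & 2 \\ 3 & 4  \end{bmatrix} \times  \begin{bmatrix} 5 & 6 \\ 7 & 8  \end{bmatrix}=  
\begin{bmatrix} 1 \\ 3 \end{bmatrix}  \begin{bmatrix} 5 & 6 \end{bmatrix} +
\begin{bmatrix} 2 \\ 4 \end{bmatrix} \begin{bmatrix} 7 & 8  \end{bmatrix}
= \begin{bmatrix} 5 & 6 \\ 15 & 18 \end{bmatrix} + \begin{bmatrix} 14 & 16 \\ 28 & 32 \end{bmatrix}
= \begin{bmatrix} 19 & 22 \\ 43 & 50 \end{bmatrix}
$$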
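One way to see why the outer product sum agrees with the inner product rule is to compare the two entry by entry. With ${\bf a}_k$ the $k$th column of $A$ and ${\bf b}_k$ the $k$th row of $B$, the $(i,j)$th entry of the $k$th outer product is the $i$th coordinate of ${\bf a}_k$ times the $j$th coordinate of ${\bf b}_k$, that is $A_{ik}B_{kj}$ (the entry notation $A_{ik}$, $B_{kj}$ is introduced here only for this sketch). Summing over $k$ gives
$$
\left( \sum_{k=1}^{r} {\bf a}_k {\bf b}_k \right)_{ij}
= \sum_{k=1}^{r} A_{ik} B_{kj},
$$
and the right hand side is exactly the inner product between the $i$th row of $A$ and the $j$th column of $B$.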


In [3]:
a = npr.randint(-2,2,(4,1))
b = npr.randint(-2,2,(4,1))
# For the outer product of a and b, we must arrange a as a column and b as a row. 
# You can now use the @ operation, since once we arrange the vectors appropriately, 
# the outer product is just an ordinary matrix multiplication. Verify this by hand.

print('The outer product of \n',a,'\n and \n',b,'\n is \n',a@b.T,'\n which we could also obtain using np.outer: \n', np.outer(a,b))

A = npr.randint(-3,3,(3,2))
B = npr.randint(-3,3,(2,5))


print('The product of \n',A,'\n and \n',B,'\n is \n',A @ B,'\n with shape ',(A @ B).shape)

# We can obtain the product of A and B using the outer product. Let us extract the columns of A
a1 = A[:,0]
a2 = A[:,1]
# and the rows of the second matrix, B
b1 = B[0,:]
b2 = B[1,:]

# Take the outer product of a1 and b1, as well as the outer product of a2 and b2.
o1 = np.outer(a1,b1)
o2 = np.outer(a2,b2)


print('The first col of A is a1 \n', a1.reshape(3,1), '\n the first row of B is b1.T\n ', b1.reshape(1,5), 
      '\n and their outer product is a1 b1.T\n', o1) 

print('The second col of A is a2\n', a2.reshape(3,1), '\n the second row of B is b2.T\n ', b2.reshape(1,5), 
      '\n and their outer product is a2 b2.T\n', o2) 

# The outer product way to multiply A and B:

print('The product AB can be obtained as np.outer(a1,b1)+np.outer(a2,b2) =\n', np.add(o1, o2),'\n which sure enough coincides with \n', A @ B)

The outer product of 
 [[-1]
 [ 1]
 [-2]
 [-2]] 
 and 
 [[0]
 [0]
 [0]
 [1]] 
 is 
 [[ 0  0  0 -1]
 [ 0  0  0  1]
 [ 0  0  0 -2]
 [ 0  0  0 -2]] 
 which we could also obtain using np.outer: 
 [[ 0  0  0 -1]
 [ 0  0  0  1]
 [ 0  0  0 -2]
 [ 0  0  0 -2]]
The product of 
 [[ 1  1]
 [-2 -2]
 [ 0 -3]] 
 and 
 [[-1 -1 -1  1  1]
 [ 0 -1  0 -3  2]] 
 is 
 [[-1 -2 -1 -2  3]
 [ 2  4  2  4 -6]
 [ 0  3  0  9 -6]] 
 with shape  (3, 5)
The first col of A is 
 [[ 1]
 [-2]
 [ 0]] 
 the first row of B is
  [[-1 -1 -1  1  1]] 
 and their outer product is
 [[-1 -1 -1  1  1]
 [ 2  2  2 -2 -2]
 [ 0  0  0  0  0]]
The second col of A is 
 [[ 1]
 [-2]
 [-3]] 
 the second row of B is
  [[ 0 -1  0 -3  2]] 
 and their outer product is
 [[ 0 -1  0 -3  2]
 [ 0  2  0  6 -4]
 [ 0  3  0  9 -6]]
The product AB can be obtained as np.outer(a1,b1)+np.outer(a2,b2) =
 [[-1 -2 -1 -2  3]
 [ 2  4  2  4 -6]
 [ 0  3  0  9 -6]] 
 which sure enough coincides with 
 [[-1 -2 -1 -2  3]
 [ 2  4  2  4 -6]
 [ 0  3  0  9 -6]]


## <span style="color:red"> (2) Warm up</span>

0. Create a new markdown cell below and type in the answers.

1. Write a function that takes in two vectors and returns their inner product. Use a loop or nditer, not inbuilt functions (we aren't aiming for a good implementation, but to ensure you understand inner products)

2. Verify the previous function by using the numpy inbuilt command np.dot that computes the inner product. 

3. Write a function that takes in two vectors and returns their outer product. Again, use a loop or nditer, not inbuilt functions.

4. Verify the previous function by using the numpy inbuilt command np.matmul or the operator @.

## <span style='color:red'> (3) Problems for submission</span>
In this sequence of problems, we will generate random matrices (2d arrays), calculate their product by both the inner and outer product methods, and then compare the results with the product implemented in numpy. 
 
5. Write a function in numpy that takes two arbitrary shape matrices and calculates the product by the inner product method. Make sure it checks whether their shapes are compatible. Reuse the code from problem 1 above.

6. Write a function in numpy that takes two arbitrary shape matrices and calculates the product by the outer product method. Make sure it checks whether their shapes are compatible. Reuse the code from problem 3 above. Feed the same matrices to 5 and 6 above, and make sure the two approaches give the same answer. Do you understand why it is so?

7. Calculate the product by using the numpy operator '@' or np.matmul and verify that it equals the output from the functions written above.

8. Verify (as in Lab 1) that a matrix multiplied by a column vector is simply a linear combination of the columns of the matrix. Namely, if the columns of the matrix A are ${\bf a}_1, {\bf a}_2, \cdots {\bf a}_n$ (that is, $A=\begin{bmatrix} {\bf a}_1& \cdots & {\bf a}_n \end{bmatrix}$) and the vector $\bf x$ is $\begin{bmatrix} x_1\\\vdots\\x_n\end{bmatrix}$, then $A{\bf x} = x_1 {\bf a}_1 + ... + x_n {\bf a}_n$. Now, this is simply the outer product way of multiplying A and $\bf x$---to demonstrate this, use only the function from Part 3 to form the linear combination described.

9. Now verify (as in Lab 1) that a row vector multiplied by a matrix is simply a linear combination of the rows of the matrix. Again, this is just the outer product way of multiplying---demonstrate this by using only the function from part 3 to form the above linear combination.

10. Using the insights in 8: obtain a matrix P such that if A is any matrix with 3 columns, AP is a cyclic shift of the columns of A (namely the first column of A is the second column of AP, second column of A is the third column of AP, and the third column of A becomes the first column of AP).

11. Using the insight in 9: obtain P such that if A is any matrix with 3 rows, PA is a cyclic shift of the rows (cyclic as explained in 10.)

