ML: Lecture 10

in Witten: Chapter 5
Hw 3 is due on Friday
Sign up for the Science Carnival. The deadline says May 3, but you can still sign up this week.

For a t-test, compute the sample average and the sample standard deviation. But first, you should
understand a normal distribution and its table.
Do you know how to find a probability or test a hypothesis for a normal distribution? The formulae are
```
  z = (x - μ) / σ
  Z = (X_total - n⋅μ) / (σ⋅sqrt(n))
  
```
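As a minimal sketch, the two formulae can be evaluated with Python's standard library. The function names and the numbers (population mean 100, standard deviation 15) are made up for illustration; they are not from the lecture.

```python
from math import sqrt
from statistics import NormalDist

def z_single(x, mu, sigma):
    """z-score of a single observation x."""
    return (x - mu) / sigma

def z_total(x_total, n, mu, sigma):
    """z-score of the total of n independent observations."""
    return (x_total - n * mu) / (sigma * sqrt(n))

def upper_tail(z):
    """P(Z >= z) for a standard normal Z (what a normal table gives you)."""
    return 1.0 - NormalDist().cdf(z)

# Hypothetical population: mean 100, standard deviation 15.
z1 = z_single(130, mu=100, sigma=15)         # 2.0
z2 = z_total(2600, n=25, mu=100, sigma=15)   # (2600 - 2500) / 75
p1 = upper_tail(z1)                          # about 0.0228
```

Note that the whole product σ⋅sqrt(n) is in the denominator of the second formula.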
What does the Central Limit Theorem really say?
For a z-test, compute the sample average. The population standard deviation must be known: either the distribution is binomial or you have a large sample from a control group, etc.
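One way to get a feel for the theorem is a small simulation. The sketch below is only an illustration: the uniform distribution, the sample size, and the number of trials are arbitrary choices. It draws many samples from a distribution that is not bell-shaped and checks that the sample averages behave like a normal variable with mean μ and standard deviation σ / sqrt(n).

```python
import random
from statistics import mean, stdev

random.seed(0)   # fixed seed so the run is repeatable

n = 30           # size of each sample (arbitrary choice)
trials = 2000    # number of samples drawn (arbitrary choice)

# Uniform(0, 1) is flat, not bell-shaped: mean 0.5, standard deviation sqrt(1/12).
mu = 0.5
sigma = (1 / 12) ** 0.5

sample_means = [mean([random.random() for _ in range(n)]) for _ in range(trials)]

center = mean(sample_means)          # should be close to mu
spread = stdev(sample_means)         # should be close to sigma / sqrt(n)
predicted_spread = sigma / n ** 0.5

# Fraction of sample means within 1.96 predicted spreads of mu;
# for a normal distribution this would be about 0.95.
coverage = sum(abs(m - mu) <= 1.96 * predicted_spread for m in sample_means) / trials
```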
How do you find the critical value for a given p-value (significance level)?
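Instead of reading a table backwards, the critical value can be sketched with the inverse of the normal cumulative distribution. The function name `critical_z` and the level 0.05 are assumptions for illustration.

```python
from statistics import NormalDist

def critical_z(alpha, two_sided=True):
    """z value whose tail probability (one or both tails) equals alpha."""
    tail = alpha / 2 if two_sided else alpha
    return NormalDist().inv_cdf(1.0 - tail)

z_two = critical_z(0.05)                   # about 1.96
z_one = critical_z(0.05, two_sided=False)  # about 1.645
```

A two-sided test splits the level between both tails, which is why its critical value is larger.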

It is important to test an algorithm on data different from the training data.
Leave one out: how many test runs?
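A small sketch of leave-one-out splitting makes the count concrete: each instance is held out exactly once, so n instances give n test runs. The helper name and the five placeholder instances are made up.

```python
def leave_one_out(data):
    """Yield (training, held_out) pairs, one pair per instance."""
    for i in range(len(data)):
        yield data[:i] + data[i + 1:], data[i]

data = ["a", "b", "c", "d", "e"]   # five placeholder instances
splits = list(leave_one_out(data))

# One run per instance; each run trains on the remaining n - 1 instances.
runs = len(splits)
```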
Sometimes there are three data sets: training, validation, testing (comparison with other
algorithms)
What is a paired t-test and what is the difference between that and a standard t-test?
```
          d_avg
  t = ------------
      sqrt(σ² / n)

  (d = differences within each pair, σ² = variance of the differences, n = number of pairs)
```
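The formula can be sketched as follows, assuming σ² is estimated by the sample variance of the differences (so t has n - 1 degrees of freedom). The two score lists are invented, standing in for two algorithms evaluated on the same five test sets.

```python
from math import sqrt
from statistics import mean, variance

def paired_t(a, b):
    """t statistic for paired samples a and b of equal length."""
    d = [x - y for x, y in zip(a, b)]   # difference within each pair
    n = len(d)
    return mean(d) / sqrt(variance(d) / n)

# Invented scores of two algorithms on the same five test sets.
scores_a = [12, 15, 11, 14, 13]
scores_b = [10, 12, 10, 11, 12]

t = paired_t(scores_a, scores_b)   # differences 2, 3, 1, 3, 1 -> t = 2 * sqrt(5), about 4.47
```

A standard (unpaired) t-test would compare the two averages without matching the runs, so variation between test sets would end up in the denominator.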

A matrix has two applications in this context: linear functions and linear equations.
```
   2x + y = 1
   3x + 2y = 5

   f(x, y) = (2x + y, 3x + 2y)
  
```
What does it mean to solve the equation above?
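In terms of the function, solving means finding the input (x, y) that f maps to (1, 5). A minimal sketch using Cramer's rule for the 2×2 case follows; the helper name `solve_2x2` is made up, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

def solve_2x2(a, b, c, d, e, f):
    """Solve a*x + b*y = e and c*x + d*y = f by Cramer's rule."""
    det = a * d - b * c
    if det == 0:
        raise ValueError("determinant is zero: no unique solution")
    x = Fraction(e * d - b * f, det)
    y = Fraction(a * f - e * c, det)
    return x, y

def f_map(x, y):
    """The linear function from the notes."""
    return (2 * x + y, 3 * x + 2 * y)

x, y = solve_2x2(2, 1, 3, 2, 1, 5)   # x = -3, y = 7
```

Plugging the solution back in, `f_map(x, y)` returns (1, 5).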
Only square matrices can have inverses. The most common case of a non-square matrix is an overconstrained system: more equations than unknowns. Each equation comes from a data instance, so you generally will have more data than attributes.
The pseudo-inverse satisfies a least squares property — it minimizes the sum of squares of the errors.
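A sketch of the least squares property, under stated assumptions: the matrix A has two columns of full rank, so applying the pseudo-inverse amounts to solving the normal equations (AᵀA) w = Aᵀy. The three data points and the helper names are invented.

```python
from fractions import Fraction

def least_squares_2(A, y):
    """Least squares solution of A w = y when A has two columns."""
    s11 = sum(r[0] * r[0] for r in A)
    s12 = sum(r[0] * r[1] for r in A)
    s22 = sum(r[1] * r[1] for r in A)
    t1 = sum(r[0] * v for r, v in zip(A, y))
    t2 = sum(r[1] * v for r, v in zip(A, y))
    det = s11 * s22 - s12 * s12
    if det == 0:
        raise ValueError("columns of A are linearly dependent")
    # Solve the 2x2 normal equations by Cramer's rule.
    return Fraction(t1 * s22 - s12 * t2, det), Fraction(s11 * t2 - s12 * t1, det)

def sse(A, y, w):
    """Sum of squared errors of A w against y."""
    return sum((r[0] * w[0] + r[1] * w[1] - v) ** 2 for r, v in zip(A, y))

# Three equations, two unknowns: fit y = w1 + w2 * t to the points (0, 1), (1, 2), (2, 4).
A = [[1, 0], [1, 1], [1, 2]]
y = [1, 2, 4]

w = least_squares_2(A, y)   # (5/6, 3/2)
best = sse(A, y, w)         # 1/6
```

No choice of w makes all three equations hold exactly; moving w away from this solution only increases the sum of squared errors.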
We have practiced multiplying matrices — what is the rule?
```
 
  |0 1| . |0 1|   =
  |1 0|   |1 0|  
  
```
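The rule is row times column: entry (i, j) of the product is the sum over k of A[i][k] ⋅ B[k][j], so the number of columns of A must equal the number of rows of B. A plain-Python sketch (the function name `matmul` is our own):

```python
def matmul(A, B):
    """Product of matrices given as lists of rows."""
    if len(A[0]) != len(B):
        raise ValueError("columns of A must match rows of B")
    return [
        [sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
        for i in range(len(A))
    ]

P = [[0, 1], [1, 0]]
product = matmul(P, P)   # [[1, 0], [0, 1]]
```

Swapping the two coordinates twice puts them back, so this matrix times itself is the identity.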
What is the determinant of a matrix?
What is the inverse of a matrix?
```
 
  | 1 1|^-1
  |-1 1|
  
```
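For the 2×2 case both questions have short answers, sketched below with made-up helper names: the determinant of |a b; c d| is ad - bc, and when it is nonzero the inverse is (1 / det) ⋅ |d -b; -c a|.

```python
from fractions import Fraction

def det_2x2(M):
    """Determinant a*d - b*c of a 2x2 matrix given as two rows."""
    (a, b), (c, d) = M
    return a * d - b * c

def inverse_2x2(M):
    """Inverse of a 2x2 matrix, using exact fractions."""
    (a, b), (c, d) = M
    det = det_2x2(M)
    if det == 0:
        raise ValueError("determinant is zero: no inverse")
    return [[Fraction(d, det), Fraction(-b, det)],
            [Fraction(-c, det), Fraction(a, det)]]

M = [[1, 1], [-1, 1]]
det = det_2x2(M)         # 2
M_inv = inverse_2x2(M)   # [[1/2, -1/2], [1/2, 1/2]]
```

Multiplying M by M_inv, in either order, gives the identity matrix.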