This week
- in Witten: Chapter 5
- Hw 3 is due on Friday
- <a href="http://www.evergreen.edu/events/sciencecarnival/"Sign up for the Science Carnival the deadline says May 3, but you can still sign up this week.
- Next week we will be discussing clustering (6.8) and regression
- In week 9, we will have a lab on induction and loop invariants
t-test
- read the pages in Witten on t-test and paired t-test. We will do that in lab on Thursday. We will compare two learning algorithms.
Cost analysis
- There are two concepts you should know about: ROC and recall-precision curve. For next week, read about them in Chapter 5 and we will discuss them in class. They both have to do with the tradeoff between FP and FN.
- For cost-benefit analysis, you need a table of costs for TP, TN, FP, FN.
true positive rate = false positive rate = rate of success = TP + TN / TP + TN + FP + FN
- You have already seen the confusion matrix in Weka
- Kappa (Κ) corrects for the rate of success due to random chance.
Some Linear Algebra
- A matrix has two applications in this context: linear functions and linear equations
2x + y = 1 3x + 2y = 5 f(x, y) = (2x + y, 3x + 2y)
- What does it mean to solve the equation above?
- Only square matrices have inverses. The most common case is an overconstrained system:
more equations than unknowns. Each equation comes from a data instance. - The pseudo-inverse satisfies a least squares property — it minimizes the sum of squares
of the errors. - We have practiced multiplying matrices — what is the rule? Square the following matrix:
|0 1| |1 0|
- What is the determinant of a matrix? What is it good for?
- find the determinants of the following matrices:
|0 1| | 1 1| |1 1| |1 0| |-1 1| |1 1|
- What is the inverse of a matrix?
| 1 1|-1 = |-1 1|