CPaT
Computing Practice and Theory
Topics for the quarter
- What are concepts, instances, and attributes?
- Knowledge representation
- Rule Lists, Trees, Linear Models
- Training a machine learning system
- Cleaning and Transforming Data
- Bayesian Networks
- Clustering
- Neural Networks
- Regression
Readings for this week
- Chapter 1 in Witten
Additional resources
- Artificial Intelligence, Russell and Norvig
- Reinforcement Learning, Sutton and Barto
- Principles of Computer Security Lab Manual, Nestler, White, Conklin (not the one by Conklin and White)
- Seattle networking environment: seattle.cs.washington.edu
Algorithms
- Rules (p. 6): the table has 24 rows. How many possible functions are there from a set of 24 instances to a set of 3 outcomes?
- In general, the data is noisy
- Decision list vs. decision tree (p. 13)
- generalization
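The counting question in the rules bullet can be checked with a short sketch. Each of the 24 instances can be assigned any of the 3 outcomes independently, so the number of functions is 3^24. The helper names below (`count_functions`, `enumerate_functions`) are made up for illustration and do not come from the readings.

```python
from itertools import product


def count_functions(num_instances: int, num_outcomes: int) -> int:
    """Number of functions from a set of instances to a set of outcomes.

    Every instance picks one outcome independently of the others,
    so the count is num_outcomes raised to the power num_instances.
    """
    return num_outcomes ** num_instances


def enumerate_functions(instances, outcomes):
    """List every function explicitly (only feasible for tiny sets)."""
    instances = list(instances)
    return [
        dict(zip(instances, assignment))
        for assignment in product(outcomes, repeat=len(instances))
    ]


if __name__ == "__main__":
    # The case from the notes: 24 instances, 3 outcomes.
    print(count_functions(24, 3))
    # Sanity check on a small case: 3 instances, 3 outcomes gives 27 functions.
    print(len(enumerate_functions(range(3), ["a", "b", "c"])))
```

The count is far larger than the 24 rows of evidence available, which is one way to see why a learner has to generalize rather than search over all possible functions.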
Data
- What kinds of data are there?
- Why data needs to be cleaned/preprocessed: missing values, inconsistent values
- Summarizing data: mean, standard deviation, min, max, quartiles
- Attribute subset selection: finding a minimal set of attributes that adequately describes the concept.
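A minimal sketch of the cleaning and summarizing bullets above, using only the Python standard library. The set of missing-value markers and the function names `clean` and `summarize` are assumptions made for illustration. Dropping bad entries is only one possible policy; filling them in with a default or the mean is another.

```python
import statistics

# Assumed markers for a missing value; real data sets may use others.
MISSING_MARKERS = {"", "?", "na", "n/a"}


def clean(raw_values):
    """Return the numeric entries as floats, dropping missing or inconsistent ones."""
    cleaned = []
    for value in raw_values:
        text = str(value).strip()
        if text.lower() in MISSING_MARKERS:
            continue  # missing value
        try:
            cleaned.append(float(text))
        except ValueError:
            continue  # inconsistent value, e.g. a word where a number is expected
    return cleaned


def summarize(values):
    """Mean, sample standard deviation, min, max, and quartiles of a numeric list."""
    q1, median, q3 = statistics.quantiles(values, n=4)
    return {
        "mean": statistics.mean(values),
        "stdev": statistics.stdev(values),
        "min": min(values),
        "max": max(values),
        "q1": q1,
        "median": median,
        "q3": q3,
    }


if __name__ == "__main__":
    # Made-up raw readings with one missing and one inconsistent entry.
    raw = ["85", " 80", "?", "83", "eighty", "70", "68"]
    print(summarize(clean(raw)))
```

Note that `statistics.quantiles` needs at least two data points, and its default method may give slightly different quartiles than other tools.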