Program Description: Computing Practice and Theory, Spring, 2013
Faculty: Judy Cushing, Aaron Skomra, Richard Weiss
This project-oriented program combined the theory and practice of pattern analysis and modeling within the context of eScience, and explored how pattern analysis, modeling, and statistics advance the sciences, particularly environmental science. The program consisted of four components: data analysis and statistics, data mining, seminar, and a project of the student’s own choosing.
Data analysis and statistics exposed students to a range of research design and data analysis methods, and to some modeling and visualization tools. The primary focus was on the application and interpretation of statistical methods, including graphical and tabular summaries, distributions, t-tests, analysis of variance (ANOVA), chi-square tests, linear regression, and non-parametric (Monte Carlo/resampling) approaches. We also briefly addressed modeling (using STELLA) and scientific visualization (using Processing). Students completed weekly labs and demonstrated their understanding of the material in nine lab reports, five quizzes, and midterm and final in-class and take-home exams. Text: Gotelli and Ellison, A Primer of Ecological Statistics. Software: MS Excel for data management and presentation, the Resampling Add-in for Excel, and JMP for statistical analysis. We also introduced students to R and STELLA.
In Data mining, students learned about problems in classification, clustering, and regression. They learned about the theory and application of decision trees, rule sets, and linear models, and how the evaluation and optimization of these techniques are based on statistical tests and information theory. The Weka framework allowed them to perform experiments comparing a wide variety of algorithms, and they also implemented a k-means clustering algorithm themselves. Students were evaluated on six required laboratory reports, one optional laboratory report, and five homework assignments. They also took two exams and five quizzes. Text: Witten, Frank, and Hall, Data Mining: Practical Machine Learning Tools and Techniques, 3rd Edition.
Seminar focused on the impact of computation and statistics on science and society, and included guest lectures on how scientists use computing. Students read three books: Meadows, Thinking in Systems; Salsburg, The Lady Tasting Tea: How Statistics Revolutionized Science in the Twentieth Century; and Lanier, You Are Not a Gadget. We also read scholarly articles by Megler and Howe (see lectures below) and sections from Weizenbaum’s Computer Power and Human Reason. Students were asked to write a one- to two-page paper each week.
In the lecture series that accompanied seminar, experts in computer science and environmental science spoke on topics related to program themes:
1) Nik Stevenson-Molnar, Conservation Biology Institute: Data Basin’s programming practices
2) Lois Delcambre, Portland State University: Machine learning to classify user web sessions for the Danish Cancer Society
3) Veronika Megler, Portland State University: Database technology to serve science, an oceanography case study
4) Guillaume S. Mauger, University of Washington: Introduction to climate change and climate change models
5) Bill Howe, University of Washington: Cloud computing to support interactive, visual, exploratory science
6) Michael Wolfe, The Portland Group Inc.: Technical computing on GPUs
7) Jenny Orr, Willamette University: Computer graphics for non-programmers
8) Bob McKane, Environmental Protection Agency: Modeling natural hydrologic systems
For their project, students were asked to write proposals with clear and attainable learning objectives that involved research, learning or programming. We asked that proposals include a bibliography, and be either small enough to be completed in one quarter or a self-contained part of a larger project. To facilitate project work, faculty organized small project affinity groups that met weekly with a faculty advisor. Students were asked to report orally and in writing weekly, to write a project report, and to present their project orally to the program at the end of the quarter.
Credit Equivalencies
4 – Modeling, Data Analytics, and Statistics
4 – Data Mining, Machine Learning and Pattern Recognition
4 – Computing theory and practice: Advancing the Practice of Science (Seminar & Lecture)
4 – Student project: research and/or programming practicum
16 – Total
*An asterisk would indicate Upper Division Science Credit, which some students earned. Students who had taken Computability or another upper-division CS course could apply for it.