Machine Learning

These resources were gathered primarily by Jenny Orr, in order for the VISTAS group to explore how Machine Learning techniques might be used to enhance our visual analytics.

The following were selected for project participants to explore prior to the meeting on 1/18/2016.

Andrew Ng’s videos from a Stanford University 2012 Graduate Summer School. Andrew Ng is a really great teacher and has an entire course you can watch if you have the time (ha, ha). The other talks in this summer school are very good but are probably too advanced and detailed. Deep Learning is pretty interesting and is what everyone is using these days. Hinton is one of the first people who developed the approach, and he has amazing insight into how it works, but his opening talk is tough to follow; it only started making sense to me after I had watched it multiple times.
  Note: all of these videos are also on YouTube.

  1. Deep Learning, Self-Taught Learning and Unsupervised Feature Learning (Part 1: Slides 1-68; Part 2: Slides 69-109) – This is the first talk on Tuesday, July 10, 2012.
  2. The second video – Advanced topics + Research philosophy / Neural Networks: Representation
    – gives a very nice introduction to a neural net, including a detailed example of how it works for XOR. I highly recommend you watch it if you don’t know much about neural networks. The concept of “features” is very important. He shows a great video created by Yann LeCun of a network trained on digit recognition. Ng also goes over some questions that were handed out to the audience; I don’t have the list of questions, so you can skip over these parts.
  3. The 3rd video – Andrew Ng (Stanford University) Non-linear hypotheses –
    Discusses backpropagation. He goes over some material very quickly, probably because most of the audience already knows backprop, but it is still worth watching. Note: the variable J is just a cost function, a measure of the difference between the actual network output and the desired output (y). There are many different kinds of cost functions; the most common is squared error, 1/2 (network_output - y)^2. Obviously we want to find network parameters (thetas, aka weights) that minimize J. Backprop uses gradient descent, updating the parameters by moving them along the negative gradient of J with respect to theta. We can talk about this more on Monday.
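
  The gradient-descent idea in item 3 can be sketched in a few lines of Python. This is a minimal illustration, not anything from Ng’s slides: a one-parameter “network” output = theta * x, the squared-error cost J described above, and repeated updates along the negative gradient of J with respect to theta. All names here are illustrative.

  ```python
  def cost(theta, x, y):
      # J(theta) = 1/2 (network_output - y)^2, with network_output = theta * x
      return 0.5 * (theta * x - y) ** 2

  def grad(theta, x, y):
      # dJ/dtheta = (theta * x - y) * x
      return (theta * x - y) * x

  def gradient_descent(x, y, theta=0.0, lr=0.1, steps=100):
      for _ in range(steps):
          theta -= lr * grad(theta, x, y)  # step along the negative gradient
      return theta

  theta = gradient_descent(x=2.0, y=6.0)
  print(theta)  # converges toward 3.0, where J reaches its minimum of 0
  ```

  A real network has many thetas and a nonlinear output, so backprop is needed to compute the gradient efficiently, but the update rule is exactly this one.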

Introductions:  I still haven’t found a really good intro-to-machine-learning article that doesn’t quickly get into a lot of math. Below is some of what I have found; don’t get discouraged if these are a bit hard to read. I will keep looking and send along anything that I find. If anyone else comes across good sources, please send them out to everyone.

  1. A brief intro from the perspective of computer graphics: machineLearningSiggraph.pdf. The examples are not great, but the overview of the theory isn’t bad. It is accessible via the ACM Digital Library. Its reference is: Peter M. Hall. 2014. Introduction to machine learning for computer graphics. In ACM SIGGRAPH 2014 Courses (SIGGRAPH ’14). ACM, New York, NY, USA, Article 20, 33 pages. DOI=
  2. A summary of the different algorithms:
  3. A cute graph showing what to use and when:
  4. The beginning of this might make sense. It does raise important topics such as overfitting and regularization.
  5. An old article originally written by Klaus and me, based on a talk Yann gave. Looks like Yann has since modified it. It covers “tricks of the trade”, many of which are still valid.  lecun-98b.pdf
  6. Article in Nature – I haven’t read through this, so I don’t know how comprehensible it is. Deep Learning: NatureDeepReviewLeCUn.pdf
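
The overfitting and regularization ideas raised in item 4 can be made concrete with a tiny, hypothetical Python sketch (pure Python, no libraries): for a one-dimensional least-squares fit, adding an L2 penalty lambda * theta^2 to the cost shrinks the fitted weight toward zero, which is the basic mechanism regularization uses to fight overfitting. The closed-form solution below is standard, but the data and names are made up for illustration.

```python
def fit(xs, ys, lam=0.0):
    # Closed-form minimizer of sum((theta*x - y)^2) + lam * theta^2:
    # theta = sum(x*y) / (sum(x*x) + lam). lam=0 gives the plain
    # least-squares fit; larger lam shrinks theta toward zero.
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)

xs = [1.0, 2.0, 3.0]
ys = [2.1, 3.9, 6.2]

print(fit(xs, ys))           # unregularized fit
print(fit(xs, ys, lam=5.0))  # regularized fit: smaller in magnitude
```

The same idea carries over to neural networks, where the penalty is applied to all the weights ("weight decay").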

This is a very high-level look at the field:
The Five Tribes of Machine Learning (And What You Can Learn from Each)

Python Libraries: there are lots out there.

  1. Theano – used by a lot of the main machine learning folks
  2. scikit-learn