CME 250A: Machine Learning on Big Data

A short course presenting the application of machine learning methods to large datasets. Topics include: a brief review of common issues in machine learning, such as memorizing/overfitting vs. learning, test/train splits, feature engineering, domain knowledge, and fast/simple/dumb learners vs. slow/complex/smart learners; moving your model from your laptop into a production environment using Python (scikit) or R, on small (laptop-sized) data at first; building math clusters using the open source H2O product to tackle Big Data; and finally, some model building on terabyte-sized datasets. Prerequisites: basic knowledge of statistics, matrix algebra, and Unix-like operating systems; basic file and text manipulation skills with Unix tools: pipes, cut, paste, grep, awk, sed, sort, zip; programming skill at the level of CME211 or CS106A.
Terms: Spr | Units: 1
Instructors: Gerritsen, M. (PI)
© Stanford University