1 - 2 of 2 results for: CS 246: Mining Massive Data Sets

CS 246: Mining Massive Data Sets

The availability of massive datasets is revolutionizing science and industry. This course discusses data mining and machine learning algorithms for analyzing very large amounts of data. Topics include: big data systems (Hadoop, Spark); link analysis (PageRank, spam detection); similarity search (locality-sensitive hashing, shingling, minhashing, random hyperplanes); stream data processing; analysis of social-network graphs; association rules; dimensionality reduction (UV, SVD, and CUR decompositions); algorithms for very-large-scale mining (clustering, nearest-neighbor search); large-scale machine learning (gradient descent, decision tree ensembles); multi-armed bandits; computational advertising. We also offer a sister class, CS 246H (Hadoop Labs), and a follow-up project-based class, CS 341 (Project in Mining Massive Datasets). Prerequisites: at least one of CS 107 or CS 145.
Terms: Win | Units: 3-4
Instructors: Leskovec, J. (PI)

CS 246H: Mining Massive Data Sets Hadoop Lab

Supplement to CS 246 providing additional material on the Apache Hadoop family of technologies. Students will learn how to implement data mining algorithms using Hadoop and Apache Spark, how to implement and debug complex data mining algorithms and data transformations, and how to use two of the most popular big data SQL tools. Topics: data mining, machine learning, data ingest, and data transformations using Hadoop, Spark, Apache Impala, Apache Hive, Apache Kafka, Apache Sqoop, Apache Flume, Apache Avro, and Apache Parquet. Prerequisite: CS 107 or equivalent.
Terms: Win | Units: 1
© Stanford University