
Georgetown University


Detailed Course Information


Fall 2017
May 26, 2018

ANLY 502 - Massive Data Fundamentals
In this course, students will learn the technology, business, science, and social implications of "big data" processing. In recent years there has been an explosion of tools, techniques, and technologies for working with massive data sets. Students will build real-world systems, using stand-alone Hadoop/Spark environments running in VirtualBox on personal systems, and scalable clusters on Amazon Web Services. Topics: Big Data terminology, scaling from one computer to thousands, data storage and data privacy, Spark, data formats and data wrangling, text processing and web mining, streaming data, graph processing. Students will be provided Amazon Web Services accounts with allowances that are sufficient to cover the course work. Prerequisites: ANLY 501 or equivalent, and a working knowledge of Python and the Unix command line. Students need to own a laptop computer with at least 100GB of free disk space and 8GB of RAM.

3.000 Credit hours
3.000 Lecture hours

Levels: MN or MC Graduate
Schedule Types: Lecture

Analytics Department

Must be enrolled in one of the following Levels:     
      MN or MC Graduate
Must be enrolled in one of the following Fields of Study (Major, Minor, Concentration, or Certificate):
      Computer Science
      Mathematics and Statistics
