Georgetown University


Detailed Course Information


Fall 2019
Jan 20, 2022

PPOL 567 - Massive Data Fundamentals
In this course, students will learn the technology, business, science, and social implications of "big data" processing. Recent years have seen an explosion of tools, techniques, and technologies for working with massive data sets. Students will build real-world systems using stand-alone Hadoop/Spark environments running in VirtualBox on personal systems and scalable clusters on Amazon Web Services. Topics: Big Data terminology, scaling from one computer to thousands, data storage and data privacy, Spark, data formats and data wrangling, text processing and web mining, streaming data, graph processing. Students will be provided Amazon Web Services accounts with allowances sufficient to cover the course work. Prerequisites: PPOL 565 or equivalent, working knowledge of Python and the Unix command line. Students need to own a laptop computer with at least 100GB of free disk space and 8GB of RAM.

3.000 Credit hours
3.000 Lecture hours

Levels: MN or MC Graduate
Schedule Types: Lecture, Seminar

Public Policy Department

Must be enrolled in one of the following Majors:     
      Data Science for Public Policy
