Welcome to the course website for MUSA 550, Geospatial Data Science in Python, taught at the University of Pennsylvania in fall 2022.

Course Description

This course will provide students with the knowledge and tools to turn data into meaningful insights, with a focus on real-world case studies in the urban planning and public policy realm. Focusing on the latest Python software tools, the course will outline the “pipeline” approach to data science. It will teach students the tools to gather, visualize, and analyze datasets, providing the skills to effectively explore large datasets and transform results into understandable and compelling narratives. The course is organized into five main sections:

  1. Exploratory Data Science: Students will be introduced to the main tools needed to get started analyzing and visualizing data using Python.
  2. Introduction to Geospatial Data Science: Building on the previous set of tools, this module will teach students how to work with geospatial datasets using a range of modern Python toolkits.
  3. Data Ingestion & Big Data: Students will learn how to collect new data through web scraping and APIs, as well as how to work effectively with the large datasets often encountered in real-world applications.
  4. Geospatial Data Science in the Wild: Armed with the necessary data science tools, students will be introduced to a range of advanced analytic and machine learning techniques using a number of innovative examples from modern researchers.
  5. From Exploration to Storytelling: The final module will teach students to present their analysis results using web-based formats to transform their insights into interactive stories.

Schedule & Course Materials

  • Schedule is tentative; lecture and assignment dates are subject to change.
  • Weekly course materials are stored in individual repositories on GitHub, available via the icons below.
  • Lecture slides are distributed as Jupyter notebooks, which are self-contained, executable Python documents.
  • Fully interactive and executable versions of the lecture slides are available via the buttons, which launch the notebooks in a temporary cloud environment. Because the computing platform is temporary (and provided for free!), the cloud environment is deleted after an extended period of inactivity (~20 minutes or so), and you will then need to launch a fresh version.
  • Static, non-interactive versions of the lecture slides are available via the buttons.

| Week | Topic | Date | Homework |
|------|-------|------|----------|
| 1 | Exploratory Data Science in Python | 08/31 (Wed) | |
| | | 09/07 (Wed) | Assign HW #1 (required) |
| 2 | Data Visualization Fundamentals | 09/12 (Mon) | |
| | | 09/14 (Wed) | |
| 3 | Geospatial Data Analysis and GeoPandas | 09/19 (Mon) | Assign HW #2 (required) |
| | | 09/21 (Wed) | |
| 4 | More Interactive Data Viz, Working with Raster Datasets | 09/26 (Mon) | |
| | | 09/28 (Wed) | |
| 5 | Getting Data Part 1: Working with APIs | 10/03 (Mon) | Assign HW #3 (required) |
| | | 10/05 (Wed) | |
| 6 | Getting Data Part 2: Web Scraping | 10/10 (Mon) | |
| | | 10/12 (Wed) | |
| 7 | Analyzing and Visualizing Large Datasets | 10/17 (Mon) | Assign HW #4 (optional) |
| | | 10/19 (Wed) | |
| 8 | Case Study: Advanced Raster Analysis | 10/24 (Mon) | |
| | | 10/26 (Wed) | |
| 9 | Case Study: OpenStreetMap, Urban Networks, and Interactive Web Maps | 10/31 (Mon) | Assign HW #5 (optional) |
| | | 11/02 (Wed) | |
| 10 | Case Study: Clustering Analysis in Python | 11/07 (Mon) | |
| | | 11/09 (Wed) | Assign HW #6 (optional) |
| 11 | Predictive Modeling Part 1: Home Prices in Philadelphia | 11/14 (Mon) | |
| | | 11/16 (Wed) | |
| 12 | Predictive Modeling Part 2: Space/time Rideshare Demand | 11/21 (Mon) | |
| | | 11/28 (Mon) | Assign HW #7 (required) |
| 13 | From Notebooks to the Web: GitHub Pages, Web Servers, and Dash | 11/30 (Wed) | |
| | | 12/05 (Mon) | |
| 14 | From Notebooks to the Web: Dashboarding with Panel and the HoloViz Ecosystem | 12/07 (Wed) | Final project proposal due |
| | | 12/12 (Mon) | |
