CSE 291: Operating Systems in Datacenters

Fall 2022
  • Amy's office hours: Tuesday after class or by appointment in CSE 3130
  • Anil's office hours: Friday 2-3 pm in CSE 3109

Overview


CSE291H is a graduate-level course about recent operating systems research, with a focus on datacenters. The course involves reading and discussing research papers and completing a research project. This quarter we will read papers on a variety of topics: multicore operating systems, network stacks, scheduling, memory management, disaggregation, and new devices such as SmartNICs, FPGAs, GPUs, and TPUs.

The goals of this course are:

Course Structure


Prerequisite

This course is open to PhD and Master's students as well as advanced undergraduate students. Students should have completed CSE 221 or an equivalent graduate-level operating systems course before enrolling.

Reading and Reviews

Each class has 1-2 assigned papers. Students are expected to read the papers ahead of time, submit a short review of each paper (by 11:59 pm the evening before class), and come to class prepared to discuss them. Classes will be interactive, and everyone is expected to participate.

Leading a Discussion

Each student will lead the discussion of one paper. Students will share their discussion outline with the instructor at least two days before the discussion so that they can receive feedback on it and revise it.

Warm-Up Assignment

The goal of this assignment is to familiarize students with CloudLab, so that they may use it as an experimentation platform for their research projects.

Project

A major component of this course will be an open-ended research project, conducted individually or in groups of 2-3 students. Students will submit a brief project proposal and a final project write-up, and will also give an in-class presentation about their project at the end of the quarter.

Grading

There is no final exam. The grading breakdown for the course is:

Schedule


Date | Topics | Papers | Slides
Th 9/22 | Course overview | | Intro
Tu 9/27 | Multicore, intro to CloudLab | Multikernel (SOSP '09), CloudLab (ATC '19) - only first 2 sections | CloudLab, Multicore
Th 9/29 | Network stacks | IX (OSDI '14), XDP (CoNEXT '18) |
Tu 10/4 | RDMA and RPCs | FaRM (NSDI '14) |
Th 10/6 | RDMA and RPCs | eRPC (NSDI '19), PRISM (SOSP '21) |
Tu 10/11 | Congestion control | Homa (SIGCOMM '18), Swift (SIGCOMM '20) |
Th 10/13 | CPU scheduling | Killer Microseconds (CACM '17), Shenango (NSDI '19) |
Tu 10/18 | CPU scheduling | ghOSt (SOSP '21) |
Th 10/20 | Performance diagnosis | NSight (NSDI '22), Collie (NSDI '22) |
Tu 10/25 | Datacenter tax | Warehouse-scale computers (ISCA '15) |
Th 10/27 | SmartNICs | AccelNet (NSDI '18) |
Tu 11/1 | SmartNICs | iPipe (SIGCOMM '19), nanoPU (OSDI '21) |
Th 11/3 | GPUs | PTask (SOSP '11) |
Tu 11/8 | TPUs | TensorFlow (OSDI '16) |
Th 11/10 | FPGAs | AmorphOS (OSDI '18), Coyote (OSDI '20) |
Tu 11/15 | Disaggregation | LegoOS (OSDI '18) |
Th 11/17 | Memory management | Llama (ASPLOS '20) |
Tu 11/22 | Memory management | TLB shootdowns (EuroSys '20) |
Th 11/24 | Thanksgiving holiday | |
Tu 11/29 | Project presentations | |
Th 12/1 | Project presentations | |