CS294-248: Topics in Database Theory

Room: Soda 310

Time: Tuesday, Thursday: 11-12:20

Description

Data processing is at the heart of computing today. Data from small to large is stored, transformed, streamed, updated, queried. Five decades of research have produced a wealth of sophisticated data management techniques that allow us today to perform these tasks very efficiently. This course presents the theoretical foundations of such techniques. The topics covered include: static analysis of queries, basic query evaluation algorithms and hypertree decomposition, incremental view maintenance, worst-case-optimal joins, constraints and semantic query optimization, provenance semirings, and datalog with various extensions. These topics use concepts from logic and model theory, algorithms, graph theory, algebra, complexity theory, and probability theory. All the required theoretical background will be covered in the lectures. Students taking this course are expected to have some basic knowledge of relational databases and SQL (or some other declarative query languages, e.g. Spark, or SparQL, or datalog). There will be a few homework assignments consisting of theoretical exercises. Students will be graded based on their homework assignments and on class participation.

Prerequisites

A good grade in CS186 and in one of (CS170 or CS172).

Is this course for you?

This is an advanced course. If you are unsure whether this course is for you, take this very simple self test.

It covers only theory. If you are interested in advances systems topics, I strongly recomment CS286: Graduate DB Systems in Spring 2024. (This is a natural graduate-level follow-on to the undergrad CS186 class.)

Schedule

Lecture on Thursday, 9/24 is canceled because of conflict with the Simons' Bootcamp.

First lecture will be on Tuesday, 8/29.

Dates/VideosUnitContent
8/21-8/25Bootcamp at Simons. It is recommended that you attend 1 or 2 of the mini-courses
8/29, 8/31 1Logic and Queries
9/5, 9/72Conjunctive Queries
9/12, 9/143Incremental View Maintenance (Dan Olteanu)
9/19, 9/214AGM Bound, Worst-Case Optimal Algorithms (Hung Ngo)
9/25-9/27Workshop 1 at Simons
10/3, 10/55Database Constraints
10/10, 10126Incomplete, Probabilistic Databases
10/16-10/20Workshop 2 at Simons
10/24, 10/267Semirings and K-Relations (Val Tannen)
10/31, 11/28-9FAQ, Datalog, Chase
11/7, 11/99Datalog, Chase (cont'd)
11/13-11/17Workshop 3 at Simons
11/21, 11/28TBD

Zoom link

Available here (the zoom link is on the left).

Direct link is here. You need to be authenticated.

But please attend all lectures in person

Homework Assignments

Please submit on bcourses.

HW 1 due on Monday, 9/4

HW 2 due on Monday, 9/11

HW 3 due on Monday, 9/25

Final Report and presentation

Task: pick a theory problem/result and explain it to a wide audience. Write a short report. Suggested length: 3-5 page. Give a short presentation (10'), in class: tentative date, Tuesday 12/5.

Some suggested topics (but feel free to choose beyond these).

Simons' Program

This course is associated with the Simons' Program on Logic and Algorithms in Database Theory and AI, which has three workshops scheduled during the quarter. Students are expected to attend some of the talks at those workshops.

Grading

The grade in this class is based on the participation during the lectures, the homework assignments, and a final assignment, which may involve a report and either a poster or a presentation TBD.