Photo credit: Hector Garcia-Molina
I study data and data-intensive systems as a member of DAWN and the Stanford Future Data Systems group.
🔥🔥 DAWN is Data Analytics for What's Next 🔥🔥
Bio: Peter Bailis is an assistant professor of Computer Science at Stanford University. Peter's research in the Future Data Systems group and DAWN project focuses on the design and implementation of post-database data-intensive systems. He is the recipient of the ACM SIGMOD Jim Gray Doctoral Dissertation Award, an NSF Graduate Research Fellowship, a Berkeley Fellowship for Graduate Study, best-of-conference citations for research appearing in both SIGMOD and VLDB, and the CRA Outstanding Undergraduate Researcher Award. He received a Ph.D. from UC Berkeley in 2015 and an A.B. from Harvard College in 2011, both in Computer Science.
Teaching: CS145 (Fall 2017) | CS245 (Winter 2017) | CS345 (Fall 2016)
We're building MacroBase, a new analytics engine for prioritizing attention in large-scale data.
410 Gates Hall
Stanford University
Stanford, CA 94305

Our group's research is generously supported in part by the affiliate members and other supporters of the Stanford DAWN project—Intel, Microsoft, Teradata, and VMware—as well as the AHPCRC, Toyota, Visa, Keysight Technologies, Hitachi, Northrop Grumman, Facebook, Juniper, and NetApp.


I am a member of the Stanford DAWN project. Day-to-day, I am a professor in the Future Data Systems group and sit in the Stanford InfoLab.

Paper on DAWNBench to appear at NIPS ML Systems workshop in December.
New pre-print with Kai Sheng Tai, Vatsal Sharan, and Greg Valiant on finding heavily-weighted features in data streams using sketches.
Poster on DAWNBench deep learning benchmarking efforts appeared at SOSP AISys workshop in Shanghai!
Talk on DAWN and MacroBase at Teradata Partners 2017 in Anaheim.
Firas gave a talk on MacroBase at HPTS 2017.
Over 350 students enrolled in our Intro DB (CS145) course at Stanford!
Many thanks to the Okawa Foundation for awarding the 2017 Okawa Research grant!
A great kickoff for the DAWN project, and exciting updates to the project site: over 24 papers this year to date, spanning topics from visualization and ML to new hardware accelerators! Many thanks to our generous founding members for their continued support.
Posted NoScope slides from VLDB 2017.
New blog post on learning sparse models with random projections and compressive sensing.
New results with collaborators in seismology arising from our ongoing project on efficient similarity search in time series (SCEC poster).
New blog post on automatic time series smoothing with ASAP (VLDB 2017).
Two VLDB 2017 camera-ready papers available: NoScope and ASAP.
NoScope repo updated with tutorial and docs!
New arXiv preprint on DROP: Dimensionality Reduction OPtimization via PCA and progressive sampling.
MacroBase: Prioritizing Attention in Fast Data received a "Best of SIGMOD 2017" citation! Thanks, Dan and SIGMOD PC!
NoScope (optimizing CNN-based video analytics via specialization) accepted to VLDB 2017!
Thanks to NetApp for awarding a NetApp Faculty Fellowship to support our work on prioritizing attention in fast data streams!
New work with Vatsal Sharan, Kai Sheng Tai, and Greg Valiant on learning sparse models by exploiting sparsity now on arXiv!
New blog post on accelerating neural network inference over video with NoScope.
More news

Selected Publications · Google Scholar

Finding Heavily-Weighted Features in Data Streams
DROP: Dimensionality Reduction Optimization for Time Series
There and Back Again: A General Approach to Learning Sparse Models
SimDex: Exploiting Model Similarity in Exact Matrix Factorization Recommendations
Infrastructure for Usable Machine Learning: The Stanford DAWN Project
NoScope: Optimizing Neural Network Queries over Video at Scale
ASAP: Prioritizing Attention via Time Series Smoothing
MacroBase: Prioritizing Attention in Fast Data
DAWNBench: An End-to-End Deep Learning Benchmark and Competition
Prioritizing Attention in Fast Data: Principles and Promise
Scalable Atomic Visibility with RAMP Transactions
Coordination Avoidance in Database Systems
Feral Concurrency Control: An Empirical Investigation of Modern Application Integrity
The Missing Piece in Complex Analytics: Low Latency, Scalable Model Management and Serving with Velox
Readings in Database Systems, 5th Edition
Coordination Avoidance in Distributed Databases
Highly Available Transactions: Virtues and Limitations
Quantifying Eventual Consistency with PBS
The Network is Reliable: An Informal Survey of Real-World Communications Failures
Consistency without Borders
PBS at Work: Advancing Data Management with Consistency Metrics
Bolt-on Causal Consistency
HAT, not CAP: Towards Highly Available Transactions
Eventual Consistency Today: Limitations, Extensions, and Beyond
Probabilistically Bounded Staleness for Practical Partial Quorums
The Potential Dangers of Causal Consistency and an Explicit Solution
Programming Micro-aerial Vehicle Swarms with Karma
Dimetrodon: Processor-level Preventive Thermal Management via Idle Cycle Injection
Positional Communication and Private Information in Honeybee Foraging Models

You can follow me on Twitter at @pbailis.