The Machines Haven't Taken Over Yet: Incorporating Domain Knowledge into ML-driven Database Tuning Tools
Dr. Dana Van Aken is a computer scientist and the co-founder/CTO of OtterTune, a platform that automates database tuning through machine learning. She received her Ph.D. in Computer Science from Carnegie Mellon University, where she specialized in database management systems. Her research interests lie at the intersection of self-driving/autonomous database technology and machine learning.
Database management systems (DBMSs) expose dozens of configurable knobs that control their runtime behavior. Correctly tuning these knobs can improve the performance and efficiency of the DBMS, but the task is difficult for humans because it requires understanding the complex, high-dimensional relationship between the database configuration, workload, environment, and performance. This problem has led to research on using machine learning (ML) to automatically optimize DBMS knobs for any application. Today, ML-driven automated tuning tools can be found in products such as OtterTune and MySQL Autopilot. Despite recent breakthroughs in other areas of ML, such as natural language processing (e.g., ChatGPT), the optimization algorithms used by these tools have yet to achieve similar success (spoiler: it's primarily due to a lack of large, high-quality datasets).
In this talk, I will demonstrate the significance of incorporating domain knowledge into these automated DB optimization algorithms. I will give examples that show how to integrate domain knowledge from various sources (e.g., official docs, Percona blogs, PGTune, and ChatGPT) with different parts of the optimization problem (e.g., identifying good/safe regions of the search space, selecting time windows to apply recommended configurations, and explaining why the algorithms are making specific changes).
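As a rough illustration of the "good/safe regions of the search space" idea, the sketch below narrows the ranges of two PostgreSQL knobs with rule-of-thumb constraints before any optimizer samples from them. It is not taken from the talk or from OtterTune: the helper names (`KnobRange`, `clamp_range`, `safe_search_space`, `sample_config`), the raw ranges, and the specific percentages are all assumptions chosen for illustration, loosely modeled on the kind of guidance found in official docs and PGTune-style heuristics.

```python
import random
from dataclasses import dataclass

GIB = 1024 ** 3


@dataclass(frozen=True)
class KnobRange:
    """Inclusive range of allowed values for one knob, in bytes (hypothetical type)."""
    low: int
    high: int


def clamp_range(r, low, high):
    """Intersect a knob range with [low, high]; fail loudly if nothing is left."""
    new_low, new_high = max(r.low, low), min(r.high, high)
    if new_low > new_high:
        raise ValueError("domain rule leaves an empty range")
    return KnobRange(new_low, new_high)


def safe_search_space(raw, ram_bytes, max_connections):
    """Apply illustrative domain-knowledge rules to a raw knob search space."""
    space = dict(raw)
    # Assumed rule 1: keep shared_buffers between 10% and 40% of RAM.
    space["shared_buffers"] = clamp_range(
        raw["shared_buffers"], ram_bytes // 10, ram_bytes * 4 // 10
    )
    # Assumed rule 2: even if every connection uses work_mem once,
    # the total should stay within 25% of RAM.
    space["work_mem"] = clamp_range(
        raw["work_mem"], raw["work_mem"].low, ram_bytes // 4 // max_connections
    )
    return space


def sample_config(space, rng):
    """Stand-in for an optimizer: draw one candidate uniformly from the safe space."""
    return {name: rng.randint(r.low, r.high) for name, r in space.items()}


# Hypothetical raw ranges, far wider than is sensible on any one machine.
RAW_SPACE = {
    "shared_buffers": KnobRange(128 * 1024, 64 * GIB),
    "work_mem": KnobRange(64 * 1024, 2 * GIB),
}

if __name__ == "__main__":
    space = safe_search_space(RAW_SPACE, ram_bytes=16 * GIB, max_connections=100)
    print(space)
    print(sample_config(space, random.Random(0)))
```

Uniform sampling stands in here for whatever ML-driven optimizer would actually propose configurations; the point of the sketch is only that domain rules shrink the region the optimizer is allowed to explore, so that fewer unsafe configurations are ever tried on a live database.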
- 2023 April 20 16:50 PDT
- 50 min
- Santa Clara, Lvl C
- Silicon Valley 2023