Seattle 2024 Program
presented by PASS Data Community Summit
Registration
07:20 - 15:20 Level 1 - Summit Lobby
presented by PASS Data Community Summit
PASS Data Community Summit Keynote
08:10 - 09:30 Microsoft Keynote Room: Level 5 - Ballroom 1
presented by PASS Data Community Summit
Level 3 - Garden Lounge
09:30 - 18:00 Level 3 - Garden Lounge
Exploring the Elephant
presented by Zak Tedder
Okay, let’s talk about the elephant in the room: Postgres! PostgreSQL has a lot going on, and traversing the ups and downs of this database management system can be daunting. We’ll navigate through indexes, partake in partitions, synchronize with sequences, extend on extensions, and more!
Join us on a journey through Postgres! We’ll take a high-level exploration of the PostgreSQL database m...
09:40 - 10:30 Essentials: 420 Essentials
Unlocking Insights for Performance Optimization
presented by Anita Singh and Raj Jayakrishnan
Understanding Database Statistics in PostgreSQL. Database statistics play a crucial role in optimizing query performance and overall database health in Postgres. This session will provide an in-depth look at the different types of database statistics maintained by Postgres, how they are collected and used by the qu...
09:40 - 10:00 Ops: 421 Ops
presented by Eric Lendvai
Presentation Material: https://harbour.wiki/index.asp?page=PublicArticles&mode=show&id=241017030856&sig=7154095596
This session will focus on assisting developers transitioning from Oracle, MSSQL, or MySQL to PostgreSQL.
You will learn about the key differences in PostgreSQL, including field typ...
09:40 - 10:30 Dev: 422 Dev
Lessons learned from OtterTune, an AI-powered database tuning service
presented by Bohan Zhang
Database Management Systems (DBMSs) are complex software that require precise tuning to achieve optimal performance on specific hardware and workloads. However, manual tuning by experienced administrators becomes impractical for large-scale DBMS deployments. To address this challenge, there has been a growing trend in both academia and industry to employ machine learning (ML) for automatic data...
10:10 - 10:30 Ops: 421 Ops
presented by PASS Data Community Summit
"Exhibit Hall Level 2 - Flex AB"
10:30 - 16:00 Exhibit Hall: Level 2
presented by Cary Huang
PostgreSQL has long excelled at native logical replication and is rapidly becoming a top choice for data analytics, thanks to its powerful features and flexibility. As organizations increasingly use diverse databases like MySQL, Oracle, and SQL Server, the need for seamless, high-performance replication into PostgreSQL (aka Change Data Capture (CDC)) has become significant.
In this talk, I ...
10:40 - 11:30 Dev: 422 Dev
presented by Shree Vidhya Sampath
Ensuring high availability is crucial for databases to achieve resiliency and durability. However, enabling high availability while running PostgreSQL on Kubernetes poses unique challenges, such as increased write latency, node failures, leader elections, and replication lag. This talk will focus on:
- Architectural components of how at Datadog we run PostgreSQL...
presented by Peter Farkas
You will learn how easy migration from MongoDB to PostgreSQL can be. MongoDB is a life-changing technology for many developers. However, MongoDB abandoned its Open-Source roots, changing the license to SSPL and making it unusable for many Open Source and Commercial projects. During the talk, you will learn how FerretDB can be used as a MongoDB replacement and why we decided to use PostgreSQL a...
11:10 - 11:30 Essentials: 420 Essentials
presented by Eric Lendvai
DataWharf is a free and open-source data modeling tool designed to facilitate the seamless migration of database schemas from Oracle, MSSQL, and MySQL to PostgreSQL in just minutes.
With DataWharf, you can easily map your column types, manage foreign key constraints, and validate your schema against various integrity rules. You can then generate migration scripts for empty catalogs or extend...
11:40 - 12:00 Dev: 422 Dev
PostgreSQL extensions
presented by Anita Singh and Rajeev Thottathil
PostgreSQL's rich ecosystem of extensions offers a wealth of opportunities to enhance database functionality and overcome performance limitations. In this session, we will explore a curated selection of noteworthy PostgreSQL extensions and showcase how they can be leveraged to optimize database operations, improve performance, and unlock advanced capabilities.
Participants will discover how ...
11:40 - 12:00 Essentials: 420 Essentials
Seamless Integration for Modern Database Ecosystems and Bridging Data Silos
presented by Grant Zhou
As organizations increasingly manage diverse data ecosystems, ensuring seamless synchronization across heterogeneous databases is more important than ever.
We are excited to announce the open-source release of SynchDB at this conference. SynchDB is a powerful PostgreSQL extension designed to simplify data synchronization from heterogeneous databases such as MySQL, Oracle, and...
11:40 - 12:00 Ops: 421 Ops
presented by Fernando Laudares Camargos
"Nothing", you might be thinking. At least I used to: it's okay for deploying application servers in an "elastic" capacity, but not for databases, which are meant to be stable and faultless. However, while Kubernetes environments remain very dynamic, Kubernetes has evolved to support stateful applications better and now has an enthusiastic Data on Kubernetes (DoK) community that keeps increasing in...
12:10 - 13:00 Ops: 421 Ops
Exploring PostgreSQL Catalogs and Extensions
presented by Swanand Kshirsagar and Venkataramanan Govindarajan
Discover how to enhance your PostgreSQL database with powerful catalogs and extensions. Explore their essential roles in database management, from data organization and querying to advanced features like full-text search and geospatial analysis. Uncover how these tools can supercharge your database performance and capabilities.
12:10 - 13:00 Dev: 422 Dev
presented by Grant McAlister
This talk will first introduce the different ways PostgreSQL can use memory, from the operating system, to cluster wide and then into per session and per operation. From there we will dive into specifics around different PostgreSQL parameters like shared_buffers, work_mem, maintenance_work_mem and how to set them depending on your workload. The presentation will also cover some of the lesser kn...
12:10 - 13:00 Essentials: 420 Essentials
presented by PASS Data Community Summit
"AWS Dining Hall Level 2 - Flex C"
12:20 - 14:00 AWS Dining Hall Level 2 - Flex C
CD for Stateful Applications
presented by Stephen Atwell and Christopher Crow
In this talk, Stephen Atwell and Chris Crow will share some thoughts on managing stateful applications as part of a CD Pipeline so that applications - and the application's data - can be versioned and deployed safely and repeatedly. This talk will discuss managing persistent data within Kubernetes, as well as managing structural changes to a database as part of a CD process. With Kubernetes and...
14:00 - 14:50 Ops: 421 Ops
presented by Janis Griffin
Performance tuning can be complex. It’s often hard to know which knob to turn or button to press to get the biggest performance boost. This presentation will detail five steps to identify performance issues and resolve them quickly. Attendees at this session will learn how to fine-tune a SQL statement quickly; identify performance inhibitors to help avoid future performance issues; recognize co...
14:00 - 14:20 Dev: 422 Dev
presented by Thuymy Tran and Chandra Pathivada
Learn the architectural concepts, similarities/differences between the PostgreSQL engine and Oracle Database. Review Multi-version concurrency control (MVCC) and Subtransactions.
14:00 - 14:50 Essentials: 420 Essentials
presented by Pavlo Golub
The talk first introduces all pertinent levels of database monitoring and then focuses on PostgreSQL and the means it provides. The meaning and importance of key metrics will be explained. As the Postgres community has already developed a lot of tools in that area, some popular common options will be highlighted together with the problems that different monitoring approaches have. To overcome...
15:00 - 15:50 Essentials: 420 Essentials
putting words in order without losing your mind or your data
presented by Jeremy Schneider
Comparisons are fundamental to computing - and comparing strings is not nearly as straightforward as you might think. Collations, or the ordering and comparison of strings, is continuously evolving along with all the nuances of natural language. But databases use such comparisons everywhere: for ORDER BY, the humble >, <, and = operators, btree indexes, GROUP BY, range partitioning -- even hash...
15:00 - 15:50 Ops: 421 Ops
Learn about common failure modes when modifying Postgres tables and how to address them
presented by Saraj Munjal
Schema migrations (DDL statements) can be a rough journey, especially when dealing with high-scale, always-online microservices. In this talk, we will:
- Explore the common pitfalls and failure modes when modifying Postgres tables - locking and backwards-incompatibility
- Share real-world examples from large-scale systems with minimal downtime
- Dive into the strategies and solutions we’ve...
presented by Clay Jackson
In today's data-driven world, the demand for data is increasing and its protection is critical. Oracle is everywhere - from ERP to medical and other applications. However, not all applications need all the features and cost of Oracle. In this presentation, we will explore ways to move data from Oracle to PostgreSQL.
We will begin by highlighting the high-level trends in data and the growing ...
16:00 - 16:50 Ops: 421 Ops
presented by Robert Bernier
This talk outlines and demonstrates several opportunistic methods for scaling your PostgreSQL replication cluster across multiple regions. Each architecture is presented as a POC, profiling its particular strengths and, of course, weaknesses. It is hoped that by demonstrating these variations an appropriate architecture can be created that can best meet any long term objectives over its life-cycle.
16:00 - 16:50 Dev: 422 Dev
Modernize your MSSQL Databases to Postgres with Babelfish
presented by Suraj Talreja
Performing heterogeneous database migrations usually involves heavy application redevelopment or re-architecture. In this presentation we will talk about Babelfish for AWS Aurora Postgres as a quick alternative to modernizing to Postgres and the Assessment Service that Cornerstone Consulting Group provides for customers that are interested in modernizing their MSSQL workloads.
The key topic...
16:00 - 16:20 Essentials: 420 Essentials
presented by Chandra Pathivada
In this session, we delve into the common scenarios where PostgreSQL may not be utilizing indexes as expected, leading to suboptimal query performance. We explore the underlying reasons, such as query optimization, data distribution, and indexing strategies, that can impact index usage. Join us as we discuss practical solutions and best practices for diagnosing and rectifying these issues, ulti...
16:30 - 17:20 Essentials: 420 Essentials
Overcoming Migration Challenges: From Oracle to PostgreSQL
presented by BAJI SHAIK and Sameer Malik
Migrating from Oracle to PostgreSQL is a complex, multi-stage process that involves a variety of technologies and skills. This presentation will explore the key challenges faced during this migration, from the initial assessment to the final cutover. We will discuss critical issues such as converting SYSDATE and NUMBER datatypes, which, if not addressed correctly, can significantly impact datab...
17:00 - 17:50 Ops: 421 Ops
Taking the pain out of AI with pg_vectorize
presented by Shaun Thomas
AI is a hot topic right now, and for good reason! Natural Language Search and Retrieval Augmented Generation (RAG) are two great ways to leverage data stored in Postgres in an immediately useful way. Why use Full Text Search when we can search for intent and related topics?
Actually doing it on the other hand is a huge pain. We need to choose an embedding model to vectorize the data, t...
17:00 - 17:20 Dev: 422 Dev
presented by Ryan Booz
As your database grows, the performance and maintenance of large tables can become challenging. Fear not! PostgreSQL has the right tool for the job: declarative table partitioning. In this talk, I will explore the benefits of partitioning in PostgreSQL, including improved performance and simplified maintenance.
After introducing the benefits of table partitioning, I’ll discuss the different ...
17:30 - 17:50 Essentials: 420 Essentials
presented by Neil Hansen
In this talk, we'll explore the incredible extensibility of PostgreSQL by demonstrating how ParadeDB built advanced full-text search capabilities into the database. Using pg_search, our open-source Postgres extension, we'll showcase a real-world application: a blazingly fast search interface for MusicBrainz, the open encyclopedia of music.
Key points we'll cover:
- The challenges of im...
presented by PASS Data Community Summit
Exhibitor Reception
18:00 - 20:00 Exhibit Hall: Level 2
presented by PASS Data Community Summit
Registration
07:50 - 15:50 Level 1 - Summit Lobby
presented by PASS Data Community Summit
"Microsoft Keynote Room Level 5 - Ballroom 1"
08:10 - 09:30 Microsoft Keynote Room: Level 5 - Ballroom 1
presented by PASS Data Community Summit
Level 3 - Garden Lounge
09:30 - 18:00 Level 3 - Garden Lounge
presented by Harry Pierson
Wouldn’t it be nice if you could query your database as it was ten minutes, ten hours, or ten days ago? It would make audits, troubleshooting, and period vs. period reporting so much easier. Once upon a time, Postgres supported time travel queries, but it was deprecated almost thirty years ago because of its abysmal performance. In this talk, we present a new implementation of time travel in Po...
09:40 - 10:30 Dev: 422 Dev
PostgreSQL 17: Performance and Efficiency - A Live Demo Showcase
presented by Kumar Ramamurthy and Adarsha Kuthuru
Unlock the full potential of CloudSQL PostgreSQL 17 with this dynamic demo showcase. Witness the power of its performance enhancements, from faster query execution with streaming reads and optimized indexes to improved vacuuming efficiency for large databases. This session will equip DBAs, data engineers, and solution architects with the knowledge and tools to optimize their Postgres deploymen...
09:40 - 10:30 Ops: 421 Ops
The essentials of leadership and effective teams
presented by Aaron Cutshall
Effective leadership and efficient teams are critical to success, where today's business requirements put people with diverse backgrounds, skills, and experiences together in teams under tight schedules with intense pressure. Therefore, teamwork is essential in all areas of our work life, including how we interact with other teams daily. Lessons learned from highly effective teams, such as SEAL...
09:40 - 10:30 Essentials: 420 Essentials
presented by PASS Data Community Summit
"Exhibit Hall Level 2 - Flex AB"
10:30 - 16:00 Exhibit Hall: Level 2
presented by Fernando Laudares Camargos
Let’s get ready and prepare in advance; space tourism is coming soon!
While the tickets will surely be available online for those with a very large credit card limit, we shall not take the Internet for granted in outer space. When the cruiser stops on Mars for a quick tour, we better have a system that won’t rely on a server hosted on Earth to control the offboarding and onboarding of passen...
10:40 - 11:30 Dev: 422 Dev
Tackling Scheduling Challenges with pg_cron and pg_dbms_job
presented by BAJI SHAIK and Rajeshkumar Sabankar
Migrating from databases like Oracle to PostgreSQL presents unique scheduling challenges that can impact the automation of critical tasks such as maintenance, batch jobs, and data processing. This presentation will address these challenges, highlighting the differences in scheduling mechanisms and the complexities involved in adapting existing task workflows to PostgreSQL. We will explore the a...
10:40 - 11:30 Ops: 421 Ops
Sorting it Out
presented by Joe Conway
Background: "libc" is commonly used as a shorthand for the "standard C library", a library of standard functions that can be used by all C programs. glibc is the GNU C Library implementation, which is used on all major Linux distributions (e.g. AL, RHEL, Debian/Ubuntu, SuSE). The glibc library, libc.so, provides most of the foundational C routines such as open, read, write, malloc, printf, and ...
10:40 - 11:30 Essentials: 420 Essentials
presented by Thuymy Tran and Chandra Pathivada
This session takes the angle of a SQL Server DBA and compares and contrasts Amazon Aurora PostgreSQL platform/features for addressing database lifecycle management challenges such as high availability, disaster recovery, backup/restore, performance scalability, patch/upgrade, and ongoing maintenance tasks.
11:40 - 12:30 Essentials: 420 Essentials
Revolutionizing PostgreSQL Automation with Kubernetes
presented by Julian Fischer
Klutch is a powerful open-source framework to simplify PostgreSQL automation across diverse infrastructures and hundreds of Kubernetes clusters. In this talk, we'll explore how Klutch provides developers with a seamless, Kubernetes-native self-service experience, while delivering centralized control and operational efficiency. Attendees will learn how Klutch addresses common challenges, such a...
11:40 - 12:00 Ops: 421 Ops
presented by Arnab Saha and Krishna Sarabu
Ensuring the smooth operation of Amazon RDS instances requires adept management of patching and version upgrades, spanning from operating system updates to database enhancements. In this session, we delve into the intricacies of RDS maintenance operations, demystifying the process of operating system patching and database version upgrades. We start by exploring how operating system patching wor...
12:10 - 12:30 Ops: 421 Ops
presented by PASS Data Community Summit
"AWS Dining Hall Level 2 - Flex C"
12:20 - 14:00 AWS Dining Hall Level 2 - Flex C
Multi-tenancy with PostgreSQL databases
presented by Raj Jayakrishnan and Rajeev Thottathil
Tenant isolation, resource sharing, security, and performance are critical in designing multi-tenant database architectures. SaaS providers leverage these architectures to consolidate databases, enhancing agility, scalability, operational efficiency, and cost optimization. In this session, we will explore multi-tenancy partitioning models, cost optimization strategies, high availability, disast...
13:30 - 14:20 Ops: 421 Ops
presented by Bill Tang
As the wellness industry continues to evolve, customer experience has become a key differentiator. At HERE Spa, we’ve utilized innovative technology powered by PostgreSQL to deliver exceptional, personalized customer experiences. This proposal outlines how we integrate PostgreSQL into our data management and operational workflows to streamline bookings, personalize treatments, and optimize clie...
13:30 - 13:50 Dev: 422 Dev
Design multi tenant solutions in data and generative AI using powerful knowledge bases
presented by Shailesh Doshi
Although Row Level Security (RLS) is very convenient for implementing fine-grained access control in databases, it can be very inefficient and ineffective if designed improperly. This session discusses RLS in different database platforms, pros and cons, and different use cases and real-world issues with RLS. The focus of the session will be on techniques to design an effective and efficient databa...
14:00 - 14:20 Dev: 422 Dev
presented by Pavlo Golub
Discover the endless possibilities of PostgreSQL as a gaming platform by harnessing its ability to customize the Wordle game. Explore how PostgreSQL empowers developers to redefine the game experience through three core entities:
- The available word set. Do we want to allow all words or only popular and well-known ones? Do we want to limit a set to some topic, e.g., IT-slang terms, or geogr...
presented by Ryan Booz
Creating consistently fast, efficient queries and applications requires effort. Regardless of improvements to hardware, query planning, data storage, and all things AI, users generally only care about one metric: how fast queries respond. And although it may surprise you in 2024, improving problem queries still requires human intervention in many cases. Knowing where to begin and some of the fi...
14:30 - 15:20 Essentials: 420 Essentials
How to stop worrying and embrace Postgres in the cloud
presented by Shaun Thomas
Doing High Availability with Postgres is hard. So hard that even the experts get it wrong due to unforeseen edge cases. Quorum, CAP and PACELC theory, fencing, STONITH, network partitions and split brains, RPO, RTO, SLA, sync and async replication, node count and architecture, proxies, connection pools, load balancers... And then there's the tools! EFM, repmgr, Patroni, Stolon, pg_auto_fail...
14:30 - 14:50 Ops: 421 Ops
presented by Sai Srirampur
PeerDB provides a fast and simple way to replicate data from Postgres to ClickHouse. We implement Postgres Change Data Capture (CDC) to reliably replicate changes from Postgres to ClickHouse. Postgres Logical Decoding is a building block of Postgres CDC. It enables users to stream changes on Postgres as a sequence of logical operations like INSERTs, UPDATEs, and DELETEs. Logical Decoding has ev...
14:30 - 14:50 Dev: 422 Dev
Using Aurora Postgres with Babelfish extension
presented by Alvaro Costa-Neto and Minesh Chande
Migrating from legacy SQL Server databases is time consuming and resource intensive. Babelfish extends your Amazon Aurora PostgreSQL-Compatible Edition database with the ability to accept database connections from SQL Server clients. Join this session to learn how SQL Server applications work directly with Aurora PostgreSQL with few to no code changes compared to traditional migration and witho...
15:00 - 15:50 Dev: 422 Dev
science and practice
presented by Neelam Koshiya
This session explores the ethical dimensions of generative AI, focusing on responsible development and deployment. We delve into challenges like biased outputs, misinformation, and deepfakes. Emphasis is placed on transparency, explainability, and mitigating biases. Privacy safeguards, diversity in training data, and adherence to regulatory frameworks are discussed. Collaboration amo...
CANCELED 15:30 - 16:20 Essentials: 420 Essentials
presented by Robert Bernier
You may have worked with Patroni and even CitusDB but have you ever worked with them together? This session will go through the steps of creating a CitusDB architecture that will have all the automated disaster recovery capabilities that you have come to love and expect from a standard replication cluster managed by Patroni.
16:00 - 16:50 Ops: 421 Ops