How to configure a PostgreSQL cluster for multitenancy
Presented by:
Bryn Llewellyn
Bryn Llewellyn is a Technical Product Manager at Yugabyte, Inc. YugabyteDB is an open source, cloud native distributed SQL database that looks like PostgreSQL to the developer. Bryn’s speciality is SQL and stored procedures in the context of distributed SQL.
Bryn has worked in the software field for more than forty years. He started working with SQL when he joined Oracle UK in 1990. He relocated to Oracle HQ (Redwood Shores, CA) in 1996; his last role there was Product Manager for PL/SQL. He left Oracle in April 2019 to join YugaByte, Inc.
Bryn started off doing image analysis and pattern recognition at Oxford University (programming in FORTRAN) and then worked in Oslo, first at the Norwegian Computing Center and then in a startup. In Norway, Bryn programmed in Simula—recognized as the first object-oriented programming language and as the inspiration for C++.
A multitenancy scheme for Postgres must allow many application backends to be co-installed in a single cluster so that each is securely isolated from every other one. (The term "application backend" is used to denote the complete set of artifacts that jointly implement an application's RDBMS functionality: schemas, schema objects of all kinds, and the roles that own these.) It must be possible to design a single application backend without thinking about what other application backends it might be co-installed with.
The native database notion provides a great deal of what multitenancy needs: a hermetic container for schemas and schema objects. A client session cannot exist except by specifying the database that it will use; the choice that is made when the session starts is fixed for the session's lifetime; and the SQL that's issued in the context of one database cannot see inside any other database.
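This isolation is easy to demonstrate (a minimal illustration; the database names d1 and d2 and the table t are invented for the purpose):

```sql
-- Connected to database d1. SQL issued here sees only d1's contents.
-- Even a fully qualified reference into another database fails outright:
select count(*) from d2.public.t;
-- ERROR:  cross-database references are not implemented: "d2.public.t"
```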
But schemas and schema objects must be owned by roles and, out of the box, a role is a cluster-wide phenomenon that can own objects in each of several databases. With the exception of a superuser, or of a role that is dedicated to provisioning databases or roles, the existence of a role that owns objects in, or can connect to, more than one database serves only to thwart the goals of multitenancy.
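For example (a minimal illustration; the role and database names are invented), nothing out of the box stops a single role from straddling databases:

```sql
-- A role is a cluster-wide phenomenon:
create role r with login password 'p';
-- By default, the connect privilege on a new database is granted to public,
-- so r can connect to both d1 and d2 and come to own objects in each:
--   \c d1 r   then:  create table t1(k int);  -- t1, owned by r, lives in d1
--   \c d2 r   then:  create table t2(k int);  -- t2, owned by r, lives in d2
```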
The scheme that this session explains defines, and enforces, the notion of a local role: a role that has the connect privilege on exactly one so-called tenant database and whose name is unique by construction. A cluster that has been configured using this scheme has any number of tenant databases together with exactly one special database that serves as the "home base" for the very few global roles that are allowed.
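The DDL below shows, as a minimal sketch, the end state that the scheme brings about for a local role (the tenant database d3, the role d3$app, and the password are invented for illustration; in the actual scheme, local roles are created only via the provisioning procedures described next):

```sql
-- Close the default open door: a new database grants connect to public.
revoke connect on database d3 from public;
-- The local role's name starts, by construction, with its database's name:
create role d3$app with login password 'p';
-- The connect privilege on exactly one tenant database:
grant connect on database d3 to d3$app;
```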
- An initial configuration script starts with a freshly created cluster, creates a superuser for "daily" use, and sets the nologin attribute for the bootstrap superuser. It creates a special global role called clstr$mgr with the createdb and createrole attributes (just as the documentation recommends) for the "daily" provisioning of tenant databases. It also customizes the template1 database by installing security definer procedures, owned by clstr$mgr, for the provisioning of local roles. You authorize as clstr$mgr only to create or drop a tenant database. Database names are constrained to follow a simple pattern: d0, d1, and so on.
- The scripted steps that create tenant database dN also create for it a local role, dN$mgr, that has no special attributes but does have the execute privilege on the role-provisioning procedures that every new database inherits from template1. The procedures can create, alter, and drop only local roles for dN; and they ensure that the names of these roles start with dN$. This is sufficient to guarantee that role names cannot collide. You authorize as dN$mgr to maintain a single application backend. Both provisioning steps are sketched below.
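The following sketch condenses both provisioning steps. It assumes that the bootstrap superuser is called postgres and that the provisioning procedures live in a schema called mgr; the role usr and all passwords are illustrative only:

```sql
-- Step 1: initial configuration, run once as the bootstrap superuser.
create role usr with superuser login password 'p';  -- the "daily" superuser
alter role postgres with nologin;                   -- retire the bootstrap superuser
create role clstr$mgr with createdb createrole login password 'p';
-- Then, connected to template1, install the security definer
-- role-provisioning procedures, owned by clstr$mgr, in the mgr schema.

-- Step 2: provision tenant database d1, run as clstr$mgr.
create database d1;                           -- inherits template1's customizations
revoke connect on database d1 from public;
create role d1$mgr with login password 'p';   -- no special attributes
grant connect on database d1 to d1$mgr;
-- Connected to d1: let d1$mgr use the inherited provisioning procedures.
grant execute on all procedures in schema mgr to d1$mgr;
```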
The code that implements the scheme that this session describes, together with some example tenant applications, is available with a one-touch master install script here: github.com/YugabyteDB-Samples/ysql-case-studies. This session demonstrates how it all works.
- Date: 2023 April 21 09:30 PDT
- Duration: 1 h 30 min
- Room: Santa Clara, Lvl C
- Conference: Silicon Valley 2023
- Language:
- Track: Dev
- Difficulty: Medium