Bring Compression to Postgres at Zero Cost of Performance
Tong Zhang is a co-founder and the Chief Scientist of ScaleFlux, a company that focuses on commercializing computational solid-state storage drives. He is also a Professor in the Electrical, Computer and Systems Engineering Department at Rensselaer Polytechnic Institute. He received his Ph.D. degree in electrical and computer engineering from the University of Minnesota, Minneapolis, in 2002. His current research areas include memory and data storage systems, computer architecture, and VLSI signal processing. He has graduated 17 Ph.D. students and co-authored over 160 papers, with over 4900 citations and an h-index of 40. Among his research accomplishments, he made pioneering contributions to establishing the research area of flash memory signal processing and to enabling the practical implementation of low-density parity-check (LDPC) codecs in commercial data storage and communication systems.
This proposed talk presents a solution that allows Postgres users to achieve significant data storage savings through compression at zero CPU/performance cost. The key is to deploy Postgres on new compression-capable solid-state drives (SSDs), which are developed following the current industry trend of empowering data storage devices with additional computing capability. The talk will discuss and present: (1) the inefficiency and performance penalty inherent in the current practice of realizing data compression for Postgres, which relies on either filesystems (e.g., ZFS and Btrfs) or the Linux block layer (e.g., Red Hat VDO); (2) an introduction to commercially available SSDs with built-in hardware-based transparent compression; (3) experimental results showing that, by replacing a leading-edge commodity SSD with the new compression-capable SSD, one can reduce storage cost by over 50% while achieving the same or better Postgres TPS performance; and (4) experimental results showing that, by deploying Postgres on the compression-capable SSD, one can substantially reduce the fillfactor (e.g., from the default 100 to 75) at almost zero cost in physical storage capacity, while achieving over 2x higher TPS under update-intensive workloads. Finally, the talk will discuss the potential of leveraging computation-capable SSDs to improve the efficiency of important Postgres operations (e.g., vacuum and table scan).
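As a rough illustration of point (4), the sketch below models why a lower fillfactor might cost little physical capacity once pages are compressed: the space a page reserves for future updates is empty, and empty space compresses to almost nothing. This is a toy model under stated assumptions, not the drive's actual behavior. Python's software `zlib` stands in for the hardware compression engine, row data is modeled as incompressible random bytes (a worst case), the reserved space is modeled as zeros, and the helper names `make_page` and `physical_bytes_per_payload_byte` are hypothetical.

```python
import os
import zlib

PAGE_SIZE = 8192  # default Postgres block size in bytes


def make_page(fillfactor: int) -> bytes:
    """Toy page: `fillfactor` percent random (incompressible) payload, the rest zeros."""
    used = PAGE_SIZE * fillfactor // 100
    return os.urandom(used) + bytes(PAGE_SIZE - used)


def physical_bytes_per_payload_byte(fillfactor: int, pages: int = 64) -> float:
    """Compressed bytes stored per byte of row payload, averaged over `pages` toy pages."""
    compressed = sum(len(zlib.compress(make_page(fillfactor))) for _ in range(pages))
    payload = pages * (PAGE_SIZE * fillfactor // 100)
    return compressed / payload


# Assumption: zlib is only a stand-in for the SSD's built-in compression.
for ff in (100, 75):
    ratio = physical_bytes_per_payload_byte(ff)
    print(f"fillfactor={ff}: {ratio:.3f} physical bytes per payload byte")
```

In this model the table at fillfactor 75 occupies about a third more logical pages, yet the compressed footprint per byte of payload stays close to that of fillfactor 100, because only the payload portion of each page consumes physical space.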
- 20 min
- Postgres Conference 2020
- Ops and Administration