Maximize Greenplum For Any Use Case: Decoupling Compute and Storage
Shivram Mani is a long-time distributed systems enthusiast and a contributor to open source projects including Greenplum and Apache HAWQ. He holds a Masters in Computer Science. During his five years at Yahoo, he led engineering efforts on the search engine teams and was among the first users of Hadoop-based data grids for search analytics. As a principal engineer at Pivotal, he has shepherded various projects related to Greenplum’s integration with the Hadoop, Spark, and external ecosystems.
Francisco Guerrero is passionate about data and machine learning, and contributes to open source projects including PXF and Greenplum. He holds a Masters in Computer Science. As part of the Greenplum team at Pivotal, he works on Greenplum's integration with the Spark and Hadoop ecosystems.
No video of the event yet, sorry!
Traditional data warehouses are deployed with dedicated on-premise compute and storage. As a result, compute and storage must be scaled together, and clusters must run continuously to keep data available. In the cloud, compute and storage can be decoupled by taking advantage of on-demand infrastructure. Greenplum on Kubernetes brings the ability to scale compute horizontally, while S3 and Azure provide cloud storage. This means compute and storage can be scaled separately depending on the data engineers’ needs, separating data processing from storage.
In this presentation, we will demonstrate the ability to decouple compute and storage in the cloud using Greenplum and Platform Extension Framework (PXF). Deploying a Greenplum cluster in Kubernetes will give us an elastic MPP database engine. Moreover, PXF will allow us to access data residing in multiple clouds. As a result, we expect increased resource utilization and flexibility, while lowering infrastructure costs.
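To illustrate the PXF approach described above, the following is a minimal sketch of how Greenplum can query data left in object storage via a PXF external table. The bucket, path, server name, and column definitions here are hypothetical examples, not taken from the talk:

```sql
-- Hypothetical example: expose Parquet files in an S3 bucket to Greenplum
-- through PXF. Bucket, path, server name, and columns are illustrative.
CREATE EXTERNAL TABLE sales_s3 (
    sale_id   int,
    sale_date date,
    amount    numeric
)
LOCATION ('pxf://my-bucket/sales/?PROFILE=s3:parquet&SERVER=s3')
FORMAT 'CUSTOM' (FORMATTER='pxfwritable_import');

-- The external table can then be queried like any local table,
-- while the data itself remains in cloud storage:
SELECT sale_date, sum(amount)
FROM sales_s3
GROUP BY sale_date;
```

Because the data never has to be loaded into the cluster, the Kubernetes-hosted compute tier can be resized or torn down independently of the data in S3.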
- 2019 March 19 17:30 EDT
- 20 min
- Postgres Conference
- Greenplum Summit