Hadoop Can’t Query? (PostgreSQL Features)
Presented by: Venkatesh Raghavan
With the explosion of data stores and cloud services, data now resides across many disparate systems and in a variety of formats. When multiple datasets exist in external systems, it is often necessary to perform a lengthy ETL (extract, transform, load) operation to get the data into the database. But what if we only need a small subset of the data? What if we only want to query the data to answer a specific question or to create a specific visualization? In these cases, it is often more efficient to join datasets remotely and return only the results than to pay the time and storage costs of an expensive full data load.
This talk explores the Platform Extension Framework (PXF), an open-source project that enables users to query heterogeneous data sources via pre-built connectors. PXF's architecture lets users efficiently query large datasets from multiple external sources without requiring that those datasets be loaded into Greenplum, a Postgres-based MPP (massively parallel processing) database.
- Date:
- Duration: 1 h
- Room:
- Conference: PostgresWorld Webinars 2022
- Language:
- Track: Dev
- Difficulty:
- Requires Registration: Yes (Registered: 37)