Presented by:


Woo Jung


Woo J. Jung is Head of Data Science APJ at Pivotal. He joined the team in 2011 and focuses on delivering data-science-driven value to clients across a wide range of verticals. Woo is also passionate about the adoption and usability of advanced analytics tools in the open source data science ecosystem, including the interoperability of R, PostgreSQL, and Greenplum Database. He holds an MSc from Stanford and a BSc from Cornell.


More enterprises want to go beyond traditional, backward-looking Business Intelligence and move towards predictive and explanatory analysis of data. This session will begin with examples of real customer problems and their solutions, followed by a hands-on session that uses a Greenplum cluster to illustrate some techniques.

During this workshop, attendees will be given an introduction to Data Science, focusing on real problems the Pivotal Data Science group has solved. Attendees will then run exercises in a cloud-based environment. The exercises are annotated and scripted, so no prior experience in the subject is required. The flow will include the common steps Pivotal Data Scientists perform in real customer engagements. There will be a maximum of 25 attendees.

Workshop subjects:

What is Data Science?

Examples of Problems Solved with Data Science

Short GPDB/MPP discussion

Brief overview of MADlib

Overview of PL languages and PL Container

Simple applied DS examples (hands-on)

  • Data Loading and Transformation

  • Feature creation

  • Model building, e.g. regression, classification, etc.

  • Model validation

Pre-requisites: Laptop with a modern browser and an SSH client (see: Instruction on using SSH on Windows); basic knowledge of SQL

Users will connect to a cloud-based Greenplum cluster.


Videos on YouTube Channel

GP Database basics

GP & analytics

GP & MADlib

2018 April 17 09:00
7 h
Enterprise I
PostgresConf US 2018
Greenplum Summit
Requires Registration: Yes (Registered: 1)