Presented by:

Ben Rogojan

Acheron Analytics
No video of the event yet, sorry!

As data scientists, we hear the saying “Garbage in, garbage out” all the time. It might come from our comrades or from the DBAs who maintain much of the data we receive. Too often, though, it feels like teams are throwing data quality issues over the fence rather than developing methods to improve the data. No one really says much about how to keep your systems clean. It is even harder to find good data QAs, since very few companies employ them, and so the skill set is being lost. The poor data scientists and data analysts are left on their own, spending hours cleaning data and making do with missing data points.

Our team wanted to discuss the importance of spending time on designing systems for data QA, as well as some techniques to ensure that all the “Big Data” flying around in various systems and data stores is kept as clean as possible. Nope, this talk is not about the sexiest job in the world, as per Harvard's research and post. Yet without data QA, all the investment companies put into data science and data analytics wouldn’t be worth it! So let’s keep our data clean.

Date:
Duration:
50 min
Room:
Conference:
PGConf Local: Seattle [PgConf.US]
Language:
Track:
Data Science
Difficulty:
Medium