I Can't Believe It's Not Real Data! An Introduction into Synthetic Data

Oct 18, 10:40 AM to 11:05 AM PDT

About This Talk

From data science and machine learning to software engineering and testing, access to accurate data is one of the biggest bottlenecks hindering development. Developers need accurate, relevant data to experiment safely when building applications, training machine learning models, and running tests. However, they often run into problems gathering it, from a simple lack of data to being unable to access existing data because of privacy policies. But what if you could have instant access to an unlimited supply of high-fidelity data that’s statistically accurate, privacy-protected, and safe to share? This is where Synthetic Data comes in. In this talk, you’ll learn about Synthetic Data, the problems it solves, and how to get started generating as much relevant data as you want.

We’ll discuss what Synthetic Data is, the benefits of using it, and how effective it is. You’ll see real-world situations where Synthetic Data removes bias, augments data sets, and makes once-private data easily shareable while still protecting the privacy of the original data set.


    Mason Egger

    Mason is currently the Lead Developer Advocate at Gretel, where he specializes in synthetic data, data privacy, and Python. Prior to his role at Gretel, he was a Developer Advocate at DigitalOcean and an SRE helping build and maintain a highly available hybrid multicloud PaaS. He is an avid programmer, speaker, educator, and writer. He is an organizer of PyTexas and President of the PyTexas Foundation, and he actively contributes to open source projects. In his spare time, he enjoys reading, camping, kayaking, and exploring new places.