Best Podcasts in the Category: Data Engineering (2024)

Building a Data Vision Board: A Guide to Strategic Planning 49:59

5d ago

Summary In this episode of the Data Engineering Podcast Lior Barak shares his insights on developing a three-year strategic vision for data management. He discusses the importance of having a strategic plan for data, highlighting the need for data teams to focus on impact rather than just enablement. He introduces the concept of a "data vision boar…

Exploring the Power of Airflow 3 at Astronomer with Amogh Desai 30:24

8d ago

What does it take to go from fixing a broken link to becoming a committer for one of the world’s leading open-source projects? Amogh Desai, Senior Software Engineer at Astronomer, takes us through his journey with Apache Airflow. From small contributions to building meaningful connections in the open-source community, Amogh’s story provides actiona…

How Orchestration Impacts Data Platform Architecture 59:39

12d ago

Summary The core task of data engineering is managing the flows of data through an organization. Ensuring that those flows execute on schedule and without error is the role of the data orchestrator. Which orchestration engine you choose impacts the ways that you architect the rest of your data platform. In this episode Hugo Lu shares his…

Using Airflow To Power Machine Learning Pipelines at Optimove with Vasyl Vasyuta 24:11

16d ago

Data orchestration and machine learning are shaping how organizations handle massive datasets and drive customer-focused strategies. Tools like Apache Airflow are central to this transformation. In this episode, Vasyl Vasyuta, R&D Team Leader at Optimove, joins us to discuss how his team leverages Airflow to optimize data processing, orchestrate ma…

An Exploration Of The Impediments To Reusable Data Pipelines 51:32

20d ago

Summary In this episode of the Data Engineering Podcast the inimitable Max Beauchemin talks about reusability in data pipelines. The conversation explores the "write everything twice" problem, where similar pipelines are built without code reuse, and discusses the challenges of managing different SQL dialects and relational databases. Max also touc…

Maximizing Business Impact Through Data at GlossGenius with Katie Bauer 25:49

23d ago

Bridging the gap between data teams and business priorities is essential for maximizing impact and building value-driven workflows. Katie Bauer, Senior Director of Data at GlossGenius, joins us to share her principles for creating effective, aligned data teams. In this episode, Katie draws from her experience at GlossGenius, Reddit and Twitter to h…

Optimizing Large-Scale Deployments at LinkedIn with Rahul Gade 27:47

26d ago

Scaling deployments for a billion users demands innovation, precision and resilience. In this episode, we dive into how LinkedIn optimizes its continuous deployment process using Apache Airflow. Rahul Gade, Staff Software Engineer at LinkedIn, shares his insights on building scalable systems and democratizing deployments for over 10,000 engineers. …

The Art of Database Selection and Evolution 59:56

27d ago

Summary In this episode of the Data Engineering Podcast Sam Kleinman talks about the pivotal role of databases in software engineering. Sam shares his journey into the world of data and discusses the complexities of database selection, highlighting the trade-offs between different database architectures and how these choices affect system design, q…

Bridging Code and UI in Data Orchestration with Kestra 44:30

1M ago

Summary In this episode of the Data Engineering Podcast, Anna Geller talks about the integration of code and UI-driven interfaces for data orchestration. Anna defines data orchestration as automating the coordination of workflow nodes that interact with data across various business functions, discussing how it goes beyond ETL and analytics to enabl…

Tech Stacks and Tradeoffs: Xudo's Founder on Picking the Right Tools for BI Success 24:56

1M ago

Wouter Trappers is the founder of Xudo and shares his slightly unconventional path from philosopher to data consultant with the Bros in this latest episode of The Data Engineering Show. Wouter’s grounding in philosophy has proved to be a shaping influence on his approach to business intelligence. Much more than just a software solution, for Wouter,…

Streaming Data Into The Lakehouse With Iceberg And Trino At Going 39:49

1M ago

In this episode, I had the pleasure of speaking with Ken Pickering, VP of Engineering at Going, about the intricacies of streaming data into a Trino and Iceberg lakehouse. Ken shared his journey from product engineering to becoming deeply involved in data-centric roles, highlighting his experiences in ecommerce and InsurTech. At Going, Ken leads th…

How Uber Manages 1 Million Daily Tasks Using Airflow, with Shobhit Shah and Sumit Maheshwari 28:44

1M ago

When data orchestration reaches Uber’s scale, innovation becomes a necessity, not a luxury. In this episode, we discuss the innovations behind Uber’s unique Airflow setup. With our guests Shobhit Shah and Sumit Maheshwari, both Staff Software Engineers at Uber, we explore how their team manages one of the largest data workflow systems in the world.…

An Opinionated Look At End-to-end Code Only Analytical Workflows With Bruin 56:11

2M ago

Summary The challenges of integrating all of the tools in the modern data stack have led to a new generation of tools that focus on a fully integrated workflow. At the same time, there have been many approaches to how much of the workflow is driven by code vs. not. Burak Karakan is of the opinion that a fully integrated workflow that is driven entir…

Building Resilient Data Systems for Modern Enterprises at Astrafy with Andrea Bombino 28:29

2M ago

Efficient data orchestration is the backbone of modern analytics and AI-driven workflows. Without the right tools, even the best data can fall short of its potential. In this episode, Andrea Bombino, Co-Founder and Head of Analytics Engineering at Astrafy, shares insights into his team’s approach to optimizing data transformation and orchestration …

Feldera: Bridging Batch and Streaming with Incremental Computation 47:36

2M ago

Summary In this episode of the Data Engineering Podcast, the creators of Feldera talk about their incremental compute engine designed for continuous computation of data, machine learning, and AI workloads. The discussion covers the concept of incremental computation, the origins of Feldera, and its unique ability to handle both streaming and batch …

Data Rewind: Conversation Highlights from Zach Wilson, Matthew Housley, Joe Reis, and Krishnan Viswanathan 28:02

2M ago

In this special roundup episode of The Data Engineering Show, the Bros revisit some of the best bits from episodes with data thought leaders Zach Wilson, Matthew Housley, Joe Reis, and Krishnan Viswanathan, spotlighting essential trends and lessons learned across the evolving data engineering landscape. From data observability to bridging academia…

Inside Airflow 3: Redefining Data Engineering with Vikram Koka 30:08

2M ago

Data orchestration is evolving faster than ever and Apache Airflow 3 is set to revolutionize how enterprises handle complex workflows. In this episode, we dive into the exciting advancements with Vikram Koka, Chief Strategy Officer at Astronomer and PMC Member at The Apache Software Foundation. Vikram shares his insights on the evolution of Airflow…

Accelerate Migration Of Your Data Warehouse with Datafold's AI Powered Migration Agent 48:50

2M ago

Summary Gleb Mezhanskiy, CEO and co-founder of DataFold, joins Tobias Macey to discuss the challenges and innovations in data migrations. Gleb shares his experiences building and scaling data platforms at companies like Autodesk and Lyft, and how these experiences inspired the creation of DataFold to address data quality issues across teams. He out…

Building a Data-Driven HR Platform at 15Five with Guy Dassa 20:25

2M ago

Data and AI are revolutionizing HR, empowering leaders to measure performance and drive strategic decisions like never before. In this episode, we explore the transformation of HR technology with Guy Dassa, Chief Technology Officer at 15Five, as he shares insights into their evolving data platform. Guy discusses how 15Five equips HR leaders with to…

Bring Vector Search And Storage To The Data Lake With Lance 58:01

2M ago

Summary The rapid growth of generative AI applications has prompted a surge of investment in vector databases. While there are numerous engines available now, Lance is designed to integrate with data lake and lakehouse architectures. In this episode Weston Pace explains the inner workings of the Lance format for table definitions and file storage, …

The Role of Python in Shaping the Future of Data Platforms with DLT 54:08

3M ago

Summary In this episode of the Data Engineering Podcast, Adrian Broderieux and Marcin Rudolph, co-founders of DLT Hub, delve into the principles guiding DLT's development, emphasizing its role as a library rather than a platform, and its integration with lakehouse architectures and AI application frameworks. The episode explores the impact of the P…

Build Your Data Transformations Faster And Safer With SDF 42:36

3M ago

Summary In this episode of the Data Engineering Podcast Lukas Schulte, co-founder and CEO of SDF, explores the development and capabilities of this fast and expressive SQL transformation tool. From its origins as a solution for addressing data privacy, governance, and quality concerns in modern data management, to its unique features like static an…

The Intersection of AI and Data Management at Dosu with Devin Stein 20:18

3M ago

Unlocking engineering productivity goes beyond coding — it’s about managing knowledge efficiently. In this episode, we explore the innovative ways in which Dosu leverages Airflow for data orchestration and supports the Airflow project. Devin Stein, Founder of Dosu, shares his insights on how engineering teams can focus on value-added work by automa…

The Resurgence of SQL: Insights from Ryanne Dolan from LinkedIn 32:57

3M ago

In this episode of The Data Engineering Show, the bros, Eldad and Benjamin, are joined by Ryanne Dolan from LinkedIn to discuss the innovative Hoptimator (H2) project. This conversation reveals how LinkedIn has improved its data pipelines by automating the setup and management of complex workflows. Together they cover: Automated Data Pipelines: Ryan…

Scaling Airbyte: Challenges and Milestones on the Road to 1.0 57:11

3M ago

Summary Airbyte is one of the most prominent platforms for data movement. Over the past 4 years they have invested heavily in solutions for scaling the self-hosted and cloud operations, as well as the quality and stability of their connectors. As a result of that hard work, they have declared their commitment to the future of the platform with a 1.…

AI-Powered Vehicle Automation at Ford Motor Company with Serjesh Sharma 26:11

4M ago

Harnessing data at scale is the key to driving innovation in autonomous vehicle technology. In this episode, we uncover how advanced orchestration tools are transforming machine learning operations in the automotive industry. Serjesh Sharma, Supervisor ADAS Machine Learning Operations (MLOps) at Ford Motor Company, joins us to discuss the challenge…

From Task Failures to Operational Excellence at GumGum with Brendan Frick 24:06

4M ago

Data failures are inevitable but how you manage them can define the success of your operations. In this episode, we dive deep into the challenges of data engineering and AI with Brendan Frick, Senior Engineering Manager, Data at GumGum. Brendan shares his unique approach to managing task failures and DAG issues in a high-stakes ad-tech environment.…

Enhancing Data Accessibility and Governance with Gravitino 38:41

4M ago

Summary As data architectures become more elaborate and the number of applications of data increases, it becomes increasingly challenging to locate and access the underlying data. Gravitino was created to provide a single interface to locate and query your data. In this episode Junping Du explains how Gravitino works, the capabilities that it unloc…

From Sensors to Datasets: Enhancing Airflow at Astronomer with Maggie Stark and Marion Azoulai 22:25

4M ago

A 13% reduction in failure rates: this is how two data scientists at Astronomer revolutionized their data pipelines using Apache Airflow. In this episode, we enter the world of data orchestration and AI with Maggie Stark and Marion Azoulai, both Senior Data Scientists at Astronomer. Maggie and Marion discuss how their team re-architected their use …

Mastering Data Orchestration with Airflow at M Science with Ben Tallman 24:36

4M ago

Mastering the flow of data is essential for driving innovation and efficiency in today’s competitive landscape. In this episode, we explore the evolution of data orchestration and the pivotal role of Apache Airflow in modern data workflows. Ben Tallman, Chief Technology Officer at M Science, joins us and shares his extensive experience with Airflow,…

Welcome to The Data Flowcast 2:01

4M ago

Welcome to The Data Flowcast: Mastering Airflow for Data Engineering & AI, the podcast where we keep you up to date with insights and ideas propelling the Airflow community forward. Join us each week, as we explore the current state, future and potential of Airflow with leading thinkers in the community, and discover how best to leverage this workf…

Enhancing Business Metrics With Airflow at Artlist with Hannan Kravitz 23:51

4M ago

Data orchestration is revolutionizing the way companies manage and process data. In this episode, we explore the critical role of data orchestration in modern data workflows and how Apache Airflow is used to enhance data processing and AI model deployment. Hannan Kravitz, Data Engineering Team Leader at Artlist, joins us to share his insights on lev…

Cutting-Edge Data Engineering at Teya with Alexandre Magno Lima Martins 23:46

5M ago

Data engineering is constantly evolving and staying ahead means mastering tools like Apache Airflow. In this episode, we explore the world of data engineering with Alexandre Magno Lima Martins, Senior Data Engineer at Teya. Alexandre talks about optimizing data workflows and the smart solutions they've created at Teya to make data processing easier…

The Evolution of DataOps: Insights from DataKitchen's CEO 53:30

5M ago

Summary In this episode of the Data Engineering Podcast, host Tobias Macey welcomes back Chris Berg, CEO of DataKitchen, to discuss his ongoing mission to simplify the lives of data engineers. Chris explains the challenges faced by data engineers, such as constant system failures, the need for rapid changes, and high customer demands. Chris delves …
