Beam and Dataflow: a perfect combination for parallel processing
Apache Beam is a unified model for building both batch and streaming pipelines, and Dataflow is a managed service for executing a wide variety of data processing patterns.
Beam supports multiple runners on which your pipeline can be executed, such as the local runner, Spark, and Flink. Among them, Dataflow stands out as a managed, auto-scalable runner: it takes care of running the pipeline, allocating the required resources, scaling the number of workers, and other cool things.
In this session, we will give a brief introduction to both technologies, explain some of their features and use cases, and do a live demo together.
El Mehdi El Khayati
Mehdi is a data engineer with a serious passion for developing software, extracting insights from data, and solving problems. With years of experience in the software field, and in data engineering specifically, Mehdi has played principal roles in various data teams.
© 2022 Geeksblabla | All Rights Reserved