The Kedro team have published an official video course to teach you the skills you need to unlock the power of Kedro. This blog post explains what you’ll find in the videos and how to optimise your learning.
Why study Kedro?
It’s hard to ship machine learning projects at a quality suitable for production. Within any team there can be many reasons for this, but frequently the main issues include lack of standardisation, lack of software engineering best practices, and accumulated technical and data debt.
By using Kedro to create your data and machine learning pipelines, you can easily create future-friendly, platform-agnostic data science code, reduce deployment times and upskill your collaborators in the process.
Kedro is an open source (Apache 2.0) Python library, so you don’t need any complex setup or an account to try it out. You can install it with either of these commands and get started:
pip install kedro
conda install -c conda-forge kedro
The course is structured in five parts, and each part is divided into short videos of 3 to 8 minutes that cover a specific part of the Kedro learning path. It is based on the venerable spaceflights tutorial, so you can use the documentation as supporting material while watching the course.
The introduction gives an overview of what problem Kedro solves, what it is, and how it fits in the data science ecosystem.
The following three sections are the bulk of the course and show the actual steps to follow in creating a Kedro project. The course has a hands-on approach, so you can follow along with me as I create a project with kedro new and build up a working pipeline using different file formats and parameters. Later lessons introduce more advanced Kedro features like namespaces and dataset factories. One of the main teachings is how to make the transition from working in a Jupyter Notebook to working with Python files using VSCode. The videos lead you through an example but are structured to help you acquire the necessary setup and skills to work with Kedro on your own, more complex, projects.
The final section offers a few ideas on how you can continue your Kedro journey, including exploring its plugin ecosystem and also contributing to the project!
You can find a complete outline of the videos and links to each in the Kedro documentation that describes the course.
My name is Juan Luis Cano, and aside from being the instructor for this course I’m also the current Product Manager for Kedro at QuantumBlack. I have worked as a developer advocate, software engineer, and freelance researcher in the space, consulting, and banking industries, and I have a decade of experience as a Python trainer for several private and public entities.
I’m a long-time open source advocate and enthusiast: I co-founded the Python España non-profit in 2012, organised and presented at the first PyCon Spain in 2013, and co-created a Python course for aerospace engineers back in 2014. Since then, I have continued teaching Python to scientists and engineers and have given around a hundred talks and workshops at PyCon, PyData, and other events in Europe, Africa, South America and the USA. I became a Python Software Foundation Fellow in 2017, and nowadays I’m lead organiser of the PyData Madrid monthly meetups.
Who is this course for?
This course is for data scientists who already know the basics of Python and the command line, but who want to learn more about how to create maintainable, reusable data science code.
Whether you are a beginner data scientist who wants to know how to refactor Jupyter notebooks into Python functions and packages using Kedro, or a more experienced data scientist who wants to get a glimpse of how to package your code as a Python library or deploy your data science projects using container solutions like Docker and open source orchestrators like Airflow, this course is for you.
We assume some familiarity with these concepts:
Python basics (coding on Jupyter and other notebook interfaces)
Manipulating data with pandas
Command line basics
We don’t assume knowledge of software engineering in Python, so the course contains information about reusability principles, how to create a Python package, and how to use version control.
Find out more…
You don’t need to register for the course, which is available for free on YouTube. You can skip around the sections to find help on a particular area as you pick up the skills needed to build your own Kedro projects. We’ve listed and linked each video on a page in the Kedro documentation to make it easier to see what’s in each video.
We’re always looking for collaborators to write about their experiences using Kedro. Get in touch with us on our Slack workspace to tell us your story!