We're launching a new monthly blog post that'll keep you updated on all the exciting things happening in the Kedro community. From the latest Kedro news to upcoming events and interesting topics, “In the Pipeline” has got you covered.
This month: a new pair of releases, Technical Steering Committee news, upcoming events, and our top picks from recent articles and podcasts.
The latest releases of Kedro and Kedro-Viz are here
Earlier this week, Merel announced on Slack that Kedro 0.18.8 has been released.
Here are the headlines. You can see the full set of release notes on GitHub.
🚀 Major features and changes
Added the KEDRO_LOGGING_CONFIG environment variable, which can be used to configure logging from the beginning of the kedro process.
Removed the logs folder from the Kedro new project template. File-based logging remains, but it now records only level INFO and above, and the log file goes to the project root instead.
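As a rough illustration of the KEDRO_LOGGING_CONFIG change, the sketch below writes a small logging configuration file and points the environment variable at it. The file name, the temporary directory, and the handler layout are assumptions made for this example; the only detail taken from the release notes is the variable name itself. The variable needs to be set before the kedro process starts (or before kedro is imported) for it to take effect.

```python
import os
import tempfile
from pathlib import Path

# A minimal logging configuration in the standard logging dictConfig schema.
# The handler and formatter names are illustrative, not Kedro defaults.
LOGGING_YAML = """\
version: 1
disable_existing_loggers: false
formatters:
  simple:
    format: "%(asctime)s - %(name)s - %(levelname)s - %(message)s"
handlers:
  console:
    class: logging.StreamHandler
    level: INFO
    formatter: simple
root:
  level: INFO
  handlers: [console]
"""


def write_logging_config(directory: Path) -> Path:
    """Write the example configuration to `directory` and return its path."""
    path = directory / "logging.yml"  # hypothetical file name
    path.write_text(LOGGING_YAML, encoding="utf-8")
    return path


# Write the file to a temporary directory and expose it through the
# environment variable named in the release notes.
config_path = write_logging_config(Path(tempfile.mkdtemp()))
os.environ["KEDRO_LOGGING_CONFIG"] = str(config_path)
```

Any kedro command launched from this environment afterwards should pick up the custom configuration from its first log message onwards.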
A set of bug fixes and other changes 🪲
✍️ Documentation changes
Improvements to the Sphinx toolchain, including an upgrade to a newer version.
Improvements to documentation on visualising Kedro projects on Databricks, and additional documentation about the development workflow for Kedro projects on Databricks.
Improvements to documentation about configuration.
Updated table of contents for documentation to reduce scrolling.
Note that kedro.extras.datasets has been officially deprecated and will be removed from Kedro in 0.19. Installing kedro_datasets is now the preferred approach.
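If your project code imports from the deprecated package, a migration could look something like the sketch below. It assumes that the module layout under kedro_datasets mirrors the one under kedro.extras.datasets (for example, that the pandas submodule keeps its name); the helper migrate_imports is hypothetical and not part of Kedro, so check the rewritten imports against the kedro_datasets documentation.

```python
import re

# Matches the deprecated package path as a whole dotted prefix.
_OLD_PACKAGE = re.compile(r"\bkedro\.extras\.datasets\b")


def migrate_imports(source: str) -> str:
    """Point references to kedro.extras.datasets at kedro_datasets instead.

    Hypothetical helper for illustration: it is a plain text substitution
    and does not verify that the target module exists.
    """
    return _OLD_PACKAGE.sub("kedro_datasets", source)


old_code = "from kedro.extras.datasets.pandas import CSVDataSet\n"
new_code = migrate_imports(old_code)
print(new_code)
```

The same substitution could be applied file by file across a project, followed by a test run to confirm that every dataset still resolves.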
In the last week of April, Nero announced the release of Kedro-Viz
Kedro-Viz is an interactive development tool for building and visualising data science pipelines with Kedro. It enables you to monitor the status of your ML project, present it to stakeholders, and smoothly bring new team members onboard. It also offers experiment tracking, and the ability to preview code and datasets.
How do I get Kedro-Viz?
pip install kedro-viz==6.0.0
npm install @quantumblack/kedro-viz@latest
🚀 What can you expect in this release?
Experiment tracking updates allowing users to filter (show/hide) metrics in the time series and parallel coordinates metrics plots. 📈
A set of bug fixes and other changes 🪲
You can see the full Kedro-Viz release notes on GitHub.
🔮 What's coming next?
Collaboration features within Kedro-Viz.
Create your own reports.
Technical Steering Committee news
We’d also like to share some numbers that we collected recently:
GitHub Stars on https://github.com/kedro-org/kedro: 8.3K
Monthly Downloads: 467,000
4th May 2023
The Kedro team is organising a 2-hour virtual training session on Thursday, May 4th, 2023 that is open to everyone. The session introduces you to Kedro and explains how to turn Jupyter notebooks into reusable Python libraries. You’ll learn the benefits of Kedro pipelines and how to visualise them using Kedro-Viz in an interactive session with plenty of Q&A.
Register now to reserve your slot on 4th May 2023 at 4:00pm–6:00pm CEST (which is 10:00am–12:00pm EDT).
18th May 2023
Juan Luis, Kedro’s Developer Advocate, is giving a talk on 18th May at PyCon Lithuania. His talk is titled “Analyze your data at the speed of light with Polars and Kedro” and presents how to combine Kedro with Polars, a new dataframe library backed by Arrow and Rust, for lightning-fast data manipulation and exploratory data analysis.
In the pipeline: top picks from the Kedro team
Towards Data Science recently published a pair of nice posts by João Pedro about writing a data pipeline with Airflow and AWS Tools (S3, Lambda & Glue) and automatically managing data pipeline infrastructures with Terraform.
GetInData | Part of Xebia publishes a weekly newsletter on LinkedIn called Data Pill. It hit its 50th edition this week and celebrated with a compilation of the most popular case studies from previous editions.
The NerdOut@Spotify podcast is always a must-listen. It’s produced by the nerds at Spotify, and made for the nerds inside all of us. You get to hear from Spotify engineers about challenging tech problems and get a firsthand look into what they’re doing. The most recent episode is a fascinating look into building at scale.
Speaking of podcasts and Spotify, the R&D Engineering team recently blogged about the generation of podcast previews using Google Dataflow. The result: a neat way of providing users with audio teasers so they can make listening decisions that aren’t based just on static content, such as cover art and descriptions.
In last month’s virtual Kedro update meeting, we walked the community through the new OmegaConfigLoader, described user research and ongoing collaboration with Databricks, and discussed experiment tracking in Kedro-Viz. If you missed the session, you can catch up with a recording on the Kedro YouTube channel.
That’s it for May 2023
And that’s a wrap for this month. But if you can’t wait for next month’s In the Pipeline news, we also toot out regular updates onto Mastodon (https://social.lfx.dev/@kedro) and across the popular channels of the Slack community.
Spoiler alert! Next month, we might unveil a fresh new look. But shh, let's keep it between us for now. Make sure to bookmark this blog or add our RSS feed to your favorite reader to stay in the loop and join us in the first week of June for another update from the Kedro team.