Towards the end of the working year, in December 2023, we made a new major release of Kedro. Kedro 0.19 contains a host of new features, bug fixes, and documentation improvements.
This blog post gives details of the release and explains how to find out more about recent enhancements and improvements to Kedro. The release also includes a few breaking changes with respect to 0.18.x because we have streamlined configuration management, dataset loading and project structure. We’ll explain below what’s changed and where to get more information.
Headline news
Kedro 0.19 introduced project tools to help you create a new Kedro project, customised for your needs. You can now invoke kedro new in the CLI and generate a project that contains the code you need while omitting the tools and example code you don’t want. We added a set of new spaceflights starters (spaceflights-pandas, spaceflights-pandas-viz, spaceflights-pyspark, and spaceflights-pyspark-viz) for use in combination with the kedro new command, and substantially revised the documentation for this area. There is a new guide to help users get a new project created swiftly and a section to explain the customisation options in detail.
When it comes to default project structure, Kedro 0.19 now includes the build configuration and project metadata in pyproject.toml, so that Kedro projects follow modern Python packaging standards and have a similar structure to any other Python library.
We have previously explained that our goal of making Kedro leaner, lightweight, and fast-evolving required us to decouple framework code from the Kedro dataset code. In Kedro 0.19, we’ve released the kedro-datasets package and removed kedro.extras.datasets from framework code. Alongside this change, we’ve also improved the error messages displayed when a dataset is not found: Kedro now raises a more explicit error when dependencies are missing, so that these cases can be told apart from errors caused simply by typos.
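To illustrate the idea behind the clearer error messages, here is a minimal sketch of how a loader can separate "the module could not be imported" from "the module imported, but the name is wrong". This is not Kedro's implementation: resolve_class, MissingDependencyError and UnknownClassError are hypothetical names invented for this example, and it uses only the standard library.

```python
import importlib


class MissingDependencyError(ImportError):
    """Hypothetical error: the module holding the class could not be imported."""


class UnknownClassError(AttributeError):
    """Hypothetical error: the module imported, but has no such attribute."""


def resolve_class(dotted_path: str):
    """Return the object named by a 'package.module.ClassName' string."""
    module_path, _, class_name = dotted_path.rpartition(".")
    if not module_path:
        raise ValueError(f"Expected 'module.ClassName', got {dotted_path!r}")

    try:
        module = importlib.import_module(module_path)
    except ModuleNotFoundError as exc:
        # Usually an optional dependency that is not installed. A misspelled
        # module name ends up here too, so the message mentions both.
        raise MissingDependencyError(
            f"Could not import {module_path!r}: install the missing "
            f"dependency or check the module name ({exc})."
        ) from exc

    try:
        return getattr(module, class_name)
    except AttributeError as exc:
        # The module exists, so the likeliest cause is a typo in the class name.
        raise UnknownClassError(
            f"Module {module_path!r} has no attribute {class_name!r}: "
            f"check the spelling."
        ) from exc
```

Calling resolve_class("json.JSONDecoder") returns the class, while a misspelled class name and a module that is not installed raise two different exception types, which is the distinction the new messages aim to make visible.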
OmegaConfigLoader is now the only configuration loader in Kedro, as we have removed the alternatives. Furthermore, Kedro 0.19 now enables you to choose a merge strategy for configuration files loaded with OmegaConfigLoader. The default is a destructive merge, but there is also an option for a soft merge.
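The difference between the two strategies is easiest to see on a small example. The sketch below uses plain dictionaries rather than OmegaConfigLoader itself, and assumes the usual meaning of the terms: a destructive merge replaces a top-level key wholesale, while a soft merge combines nested keys. The function names and the model_options parameters are illustrative only.

```python
from copy import deepcopy


def destructive_merge(base: dict, override: dict) -> dict:
    """Top-level keys in `override` replace the matching keys in `base` wholesale."""
    merged = deepcopy(base)
    merged.update(deepcopy(override))
    return merged


def soft_merge(base: dict, override: dict) -> dict:
    """Nested dictionaries are merged key by key; any other value is replaced."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = soft_merge(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


# Illustrative parameters, as they might appear in a base and a local environment.
base = {"model_options": {"test_size": 0.2, "random_state": 3}}
local = {"model_options": {"test_size": 0.3}}

print(destructive_merge(base, local))
# {'model_options': {'test_size': 0.3}}
print(soft_merge(base, local))
# {'model_options': {'test_size': 0.3, 'random_state': 3}}
```

With the destructive merge, random_state disappears because the whole model_options entry is replaced. With the soft merge it survives. In a Kedro project the strategy is selected through the config loader settings, so check the configuration documentation for the exact keys.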
The main changes in Kedro 0.19
Here’s a short list of some of the other changes we made in this release:
We dropped Python 3.7 support.
We added the --conf-source option to %reload_kedro to enable users to specify a source for project configuration.
We added validation for the configuration file used to override run commands via the CLI.
We moved the default environment base and local from config loader to _ProjectSettings. This enables the use of config loader as a standalone class without affecting existing Kedro users’ code.
We enhanced the documentation with a new top-level navigation to easily switch between Kedro, Kedro Viz, and Kedro-Datasets documentation, and a new search-as-you-type to improve the search experience.
There were numerous bug fixes and tweaks in the release, such as the following:
Added a new field tools to pyproject.toml when a project is created.
Added validation to node tags to be consistent with node names.
Removed pip-tools as a dependency.
Accepted path-like filepaths more broadly for datasets.
For the complete list of changes, see the release notes for Kedro 0.19.0 and Kedro 0.19.1.
Breaking changes in Kedro 0.19
These are the significant breaking changes in Kedro 0.19 compared to Kedro 0.18.x:
ConfigLoader and TemplatedConfigLoader have been removed.
The new datasets package (kedro-datasets) replaces kedro.extras.datasets and tests.
PartitionedDataset and IncrementalDataset were removed from kedro.io and moved to kedro-datasets.
Logging was removed from OmegaConfigLoader in favour of the environment variable KEDRO_LOGGING_CONFIG.
Support for the layer attribute when defined at a top-level within DataCatalog was removed.
Inconsistencies in the use of naming were eliminated by renaming data_set and DataSet to dataset and Dataset across the codebase.
The create_default_data_set() method in the AbstractRunner was removed in favour of using dataset factories to create default dataset instances.
The default project template now has only one pyproject.toml at the root of the project (containing both the packaging metadata and the Kedro build config).
If you are upgrading from Kedro 0.18, have a look at the migration guide for more information.
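The naming change is likely to touch the most lines in an existing project. As a rough starting point, the renaming listed above can be sketched as a plain text substitution. The helper below is hypothetical and not part of Kedro. It rewrites every occurrence of the two spellings, including inside unrelated identifiers and strings, so review the resulting diff before committing it.

```python
# Spellings renamed in Kedro 0.19, per the list of breaking changes above.
RENAMES = (
    ("DataSet", "Dataset"),
    ("data_set", "dataset"),
)


def rename_dataset_identifiers(source: str) -> str:
    """Apply the 0.19 naming convention to a piece of source text.

    Hypothetical helper: a blind, case-sensitive substring replacement.
    """
    for old, new in RENAMES:
        source = source.replace(old, new)
    return source


# An illustrative line of 0.18-style project code.
old_code = "my_data_set = MemoryDataSet(data=[1, 2, 3])"
print(rename_dataset_identifiers(old_code))
# my_dataset = MemoryDataset(data=[1, 2, 3])
```

A substitution like this only covers the renaming. The other breaking changes, such as the removed config loaders and the datasets that moved to kedro-datasets, still need to be handled by following the migration guide.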
Get started with Kedro 0.19
You can install Kedro 0.19 with pip install kedro==0.19.1 or conda/mamba/micromamba install -c conda-forge kedro=0.19.1.
Note that we released Kedro 0.19.0 but detected a problematic bug in it, so we released Kedro 0.19.1 with a fix immediately afterwards. This is the version you should install and use.
Find out more about Kedro
There are many ways to learn more about Kedro:
Join our Slack organisation to reach out to us directly if you’ve a question or want to stay up to date with news. There's an archive of past conversations on Slack too.
Read our documentation or take a look at the Kedro source code on GitHub.
Check out our video course on YouTube.
What’s next?
At the time of writing, in January 2024, we are planning the milestones for our next releases. (You can see what we're working on right now, whenever you are reading this post, on our sprint board.)
We welcome every community contribution, large or small, so please do continue to report bugs or suggest future features over on GitHub and raise discussions on Slack.
Stay tuned for an online community session about the new release soon. We’ll announce dates just as soon as we can!