Kedro-Viz — 6 min read

How to share a Kedro-Viz pipeline visualisation

How to set up and use "publish and share" to host a Kedro-Viz project on Amazon S3.

20 Nov 2023 (last updated 17 Jan 2024)

“…Kedro-Viz is a way for us to have a meeting of minds and be able to problem solve with each other. It helps me have conversations about what’s in the code and how it’s organised...”

“Kedro-Viz is incredibly valuable for demos even if only used briefly. Without Viz, people would be spending hours preparing bespoke presentations to walk through a pipeline”

These are recurring examples of the value proposition of Kedro-Viz. It enables users to understand data pipelines and connected datasets, fosters collaboration during data modelling between technical and non-technical team stakeholders, and helps the team lead present updates on a project.

Until now, without a technical or “translator” team member, there was no straightforward way to make Kedro-Viz accessible to non-technical users on a project team. The only options were to screenshot a PNG version of a Kedro-Viz pipeline, make a GIF or create a hosted Kedro-Viz. The first two options meant that users lost the interactivity of Kedro-Viz, and the final option required infrastructure setup and coding.

Recent research by our team indicated that almost half of Kedro users wanted to share a version of their pipeline visualisation for others to explore. Crucial to this use case is the ability to share with non-technical stakeholders who cannot copy the Kedro project, then install and run Kedro-Viz to see the pipeline.

We addressed this pain point in Kedro-Viz 6.6.0 by launching a way to publish and share a visualisation. The new feature enables users to share their pipeline visualisation with other stakeholders by hosting a Kedro-Viz project on Amazon S3. Team members and senior stakeholders can now view, explore and interrogate updates of the project visualisation to provide feedback to the team.

How to use publish and share

You can host your Kedro-Viz project on Amazon S3. You must first create an S3 bucket and credentials, and then enable static website hosting.

Update and install the dependencies

Kedro-Viz requires specific minimum versions of fsspec[s3] and kedro to publish your project. To make sure you have them, update the requirements.txt file in the src folder of your Kedro project to include the following:

fsspec[s3]>=2023.9.0
kedro>=0.18.2

Install the dependencies from the project root directory by typing the following in your terminal:

pip install -r src/requirements.txt
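After installing, you may want to confirm that the installed versions meet the minimums listed above. The following is a minimal sketch using only the Python standard library. The helper names (`parse_version`, `meets_minimum`, `check_dependencies`) are hypothetical and not part of Kedro or Kedro-Viz, and the version parsing is deliberately simple: it compares the leading numeric parts only.

```python
from importlib import metadata

# Minimums quoted in this post; adjust if your project pins other versions.
MINIMUM_VERSIONS = {"fsspec": "2023.9.0", "kedro": "0.18.2"}


def parse_version(version: str) -> tuple:
    """Turn '2023.9.0' into (2023, 9, 0), ignoring suffixes such as 'rc1'."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for char in piece:
            if not char.isdigit():
                break
            digits += char
        if not digits:
            break
        parts.append(int(digits))
    # Pad so that '2023.9' and '2023.9.0' compare as equal.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def meets_minimum(installed: str, minimum: str) -> bool:
    """Compare versions numerically, so 2023.10.0 counts as newer than 2023.9.0."""
    return parse_version(installed) >= parse_version(minimum)


def check_dependencies(minimums=MINIMUM_VERSIONS) -> dict:
    """Map each package to True (ok), False (too old) or None (not installed)."""
    results = {}
    for name, minimum in minimums.items():
        try:
            results[name] = meets_minimum(metadata.version(name), minimum)
        except metadata.PackageNotFoundError:
            results[name] = None
    return results


if __name__ == "__main__":
    for package, ok in check_dependencies().items():
        print(package, "OK" if ok else "missing or too old")
```

Note that this checks fsspec itself, not the extra packages that the `[s3]` extra pulls in, so running `pip install -r src/requirements.txt` remains the reliable way to get everything in place.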

Configure your AWS S3 bucket and set credentials

To create the S3 bucket and enable static website hosting, follow the AWS tutorial to configure a static website on Amazon S3.

Once the S3 bucket is created, you'll need to create an Identity and Access Management (IAM) user account and user group, and generate the corresponding access keys. To do so, first sign in to the AWS Management Console and create an IAM user account. For more information, see the official AWS documentation about IAM Identities.
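If you prefer scripting to the AWS console, the static website hosting step can be sketched in Python. This is an illustration under assumptions, not part of Kedro-Viz: it assumes boto3 is installed (it is not in the requirements listed earlier), that your AWS credentials are already available to it, and that `index.html` is the index document, as in the AWS tutorial. The function names are hypothetical.

```python
def build_website_configuration(index_document="index.html", error_document=None):
    """Return a WebsiteConfiguration dict in the shape S3 expects.

    The index document name is an assumption taken from the AWS tutorial.
    """
    config = {"IndexDocument": {"Suffix": index_document}}
    if error_document:
        config["ErrorDocument"] = {"Key": error_document}
    return config


def enable_static_website(bucket_name, region_name):
    """Apply the configuration to an existing bucket.

    Needs boto3 and valid AWS credentials; the bucket must already exist.
    """
    import boto3  # imported lazily so the helper above works without boto3

    client = boto3.client("s3", region_name=region_name)
    client.put_bucket_website(
        Bucket=bucket_name,
        WebsiteConfiguration=build_website_configuration(),
    )
```

Calling `enable_static_website("your-bucket-name", "your-region")` would then have the same effect as switching on static website hosting in the bucket's properties.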


Create a user group from the IAM dashboard, ensuring the user group has a policy attached that grants full access to Amazon S3. For more information, see the official AWS documentation about IAM user groups.


Add the IAM user to the user group (this is only possible if the group has been created).


Select the user, then select Create access key. Follow the steps and create your keys.


Once that's completed, you'll need to set your AWS credentials as environment variables in your terminal window, as shown below:

export AWS_ACCESS_KEY_ID="your_access_key_id"
export AWS_SECRET_ACCESS_KEY="your_secret_access_key"

For more information, see the official AWS documentation about how to work with credentials.
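Because the variables are exported in a terminal window, they only exist in that shell session, so it can be worth checking that they are set before starting Kedro-Viz. A minimal sketch, assuming you run it from the same terminal; the helper name `missing_aws_variables` is hypothetical, and the script reports only variable names, never their values.

```python
import os

# The two variable names used in this post.
REQUIRED_VARIABLES = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")


def missing_aws_variables(environ=None):
    """Return the names of required variables that are unset or empty."""
    if environ is None:
        environ = os.environ
    return [name for name in REQUIRED_VARIABLES if not environ.get(name)]


if __name__ == "__main__":
    missing = missing_aws_variables()
    if missing:
        print("Not set in this shell:", ", ".join(missing))
    else:
        print("Both AWS credential variables are set.")
```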

Publish and share the project

You're now ready to publish and share your Kedro-Viz project. Start Kedro-Viz by running the following command in your terminal:

kedro viz

Click the Publish and share icon in the lower-left of the application. A modal dialog asks you to select your AWS Bucket Region and enter your Bucket Name. Once those two details are complete, click Publish. When the process completes, you receive a hosted, shareable URL.

Permissions and access control

All permissions and access control are managed through AWS. It's up to you, the user, whether to allow anyone to see your project or to limit access to certain IP addresses, users, or groups. You can control who can view your visualisation using bucket and user policies or access control lists. See the official AWS documentation for more information.
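As an illustration of limiting access by IP address, the sketch below builds a bucket policy document that allows reading objects only from a given address range. It is an example under assumptions, not a recommended policy: the function name, the `Sid` value, the bucket name and the address range (a range reserved for documentation) are all placeholders, and your bucket's public access settings also determine whether such a policy takes effect.

```python
import json


def build_read_policy(bucket_name, allowed_cidr=None):
    """Build a bucket policy that allows reading objects in the bucket.

    If allowed_cidr is given, reads are limited to that IP range.
    """
    statement = {
        "Sid": "ReadPublishedKedroViz",  # hypothetical statement identifier
        "Effect": "Allow",
        "Principal": "*",
        "Action": "s3:GetObject",
        "Resource": f"arn:aws:s3:::{bucket_name}/*",
    }
    if allowed_cidr:
        statement["Condition"] = {"IpAddress": {"aws:SourceIp": allowed_cidr}}
    return {"Version": "2012-10-17", "Statement": [statement]}


if __name__ == "__main__":
    # Placeholder bucket name and documentation-only IP range.
    policy = build_read_policy("my-kedro-viz-bucket", "203.0.113.0/24")
    print(json.dumps(policy, indent=2))
```

The printed JSON can be pasted into the bucket policy editor in the AWS console. Omitting `allowed_cidr` produces a policy that lets anyone read the published files.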

You pay for storing objects in your S3 buckets. The amount you pay depends on the size of your objects, how long you store them during the month, and the storage class. See the official AWS documentation for more information.

Hosting responsibilities

As Kedro-Viz is an open-source application, the team behind the development of this feature needed to make some technical tradeoffs. For example, from the outset, the answer to the question, “Who’s in charge of hosting the application, the Kedro team or the user?”, was the latter. Because our scope is limited, the Kedro-Viz team could not take on the financial costs, maintenance overheads, or other realities of hosting.

The next tradeoff was how to enable a user to host Kedro-Viz. We wanted users to be able to easily publish and share their instance of the app from the Kedro-Viz UI, which led us to explore software development kits from the major cloud computing players (Amazon’s AWS, Google’s GCP, and Microsoft’s Azure). We knew from our own user research that a large share of our users already used AWS elsewhere in their projects. In addition, we as a team had previously built the collaborative experiment tracking feature using AWS infrastructure.

So, with this knowledge, we chose AWS as the hosting provider for our users to publish and share Kedro-Viz. The feature enables hosting via Amazon Simple Storage Service (a.k.a. S3), and users maintain full control over the setup, access and configuration of their published Kedro-Viz project.

Summary

Publish and Share Kedro-Viz enables users to easily share their pipeline visualisation with other stakeholders by hosting their Kedro-Viz project on Amazon S3. It facilitates collaboration and project debugging amongst all stakeholders, and the onboarding of new team members.

This article described the feature, its setup, and its development. You can learn more from our documentation. The next step for us is to extend the feature to enable sharing via the command line, and offer the option to deploy onto GitHub Pages and other platforms (beyond AWS).

Find out more about Kedro-Viz

Kedro-Viz is an interactive development tool for building and visualising data science pipelines with Kedro. It enables you to monitor the status of your ML project, present it to stakeholders, and easily bring new team members up to speed. You can try it out using our hosted demo.

If you are new to Kedro-Viz you can learn more about the product with this video tutorial series.


Nero Okwa
Product Manager, Kedro