Part 1: Execute a simple pipeline

The main goal of the Craft AI platform is to let you deploy your machine learning pipelines easily.

In this part, we will use the platform to build a simple “hello world” application: a basic Python function that prints “Hello world” along with the number of days elapsed since January 1, 2000.

You will learn how to:

  • Package your application code into a step on the platform
  • Embed it in a pipeline
  • Execute it on the platform
  • Check the logs of the executions on the web interface

Create a step with the SDK

The first thing to do to build an application on the Craft AI platform is to create a step.

A Step is the equivalent of a Python function in the Craft AI platform. Like a regular function, a step is defined by the inputs it ingests, the code it runs, and the outputs it returns. For this “hello world” use case, we are focusing on the code part so we will ignore inputs and outputs for now.

A step can be created from any Python function, using the create_step() method of the sdk object.

All of the code in this example can also be found on GitHub here.

For this example, we will use the following code:

import datetime

def helloWorld() -> None:

    # Count the number of days between January 1, 2000, and today
    start_date = datetime.datetime(2000, 1, 1)
    now = datetime.datetime.now()

    difference = now - start_date

    print(f'Hello world! Number of days since January 1, 2000: {difference.days}')

Create a new folder that will contain the step's files, and save the code above in it as src/part-1-helloWorld.py. The helloWorld function is therefore located in src/part-1-helloWorld.py, relative to that folder.
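
Before packaging the function, it can be useful to check that it runs locally. Because the file name contains hyphens, it cannot be loaded with a plain import statement; the sketch below loads it by path with the standard importlib module instead. To stay self-contained, it writes a copy of the function to a temporary folder. The helper name load_function and the temporary layout are illustrative assumptions, not part of the Craft AI SDK.

```python
import contextlib
import importlib.util
import io
import pathlib
import tempfile

# Copy of the tutorial function, written to disk below so the sketch is self-contained.
SOURCE = '''import datetime

def helloWorld() -> None:
    start_date = datetime.datetime(2000, 1, 1)
    now = datetime.datetime.now()
    difference = now - start_date
    print(f'Hello world! Number of days since January 1, 2000: {difference.days}')
'''


def load_function(file_path, function_name):
    """Load a function from a Python file whose name is not a valid module name."""
    spec = importlib.util.spec_from_file_location("step_module", file_path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return getattr(module, function_name)


def run_locally():
    """Recreate src/part-1-helloWorld.py in a temporary folder and run helloWorld."""
    with tempfile.TemporaryDirectory() as folder:
        path = pathlib.Path(folder) / "src" / "part-1-helloWorld.py"
        path.parent.mkdir(parents=True)
        path.write_text(SOURCE)
        hello = load_function(str(path), "helloWorld")
        buffer = io.StringIO()
        with contextlib.redirect_stdout(buffer):
            hello()
        return buffer.getvalue()


output = run_locally()
print(output, end="")
```

With your own folder, you would call load_function directly on the path of your existing file instead of writing a temporary copy.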

We can now create the step by running the following command in a Python terminal:

sdk.create_step(
    step_name='part-1-hello-world',
    function_path='src/part-1-helloWorld.py',
    function_name='helloWorld',
    container_config={
        "local_folder": ".../get_started", # Enter the path to your local folder here, the one that contains `src/part-1-helloWorld.py`
    }
)

Its main arguments are:

  • The step_name is the name of the step that will be created. This is the identifier you will use later to refer to this step.
  • The function_path argument is the path of the Python module containing the function that you want to execute for this step. This path must be relative to the local_folder specified in the container_config.
  • The function_name argument is the name of the function that you want to execute for this step.
  • The container_config is the configuration of the container that will be used to execute the function.

Note

One of the container_config parameters is local_folder, the path to the local folder that contains the function to execute; the platform retrieves the step's files from it. We will explain in a later part how to do this differently, but for now, we focus on deploying steps from local code.
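
Since function_path must be relative to local_folder, a quick local check before calling create_step() can catch a wrong path or a misspelled function name early. The following is a minimal sketch using only the standard library; check_step_source is a hypothetical helper, not an SDK function, and it only inspects top-level function definitions.

```python
import ast
import pathlib
import tempfile


def check_step_source(local_folder, function_path, function_name):
    """Return a list of problems; an empty list means the layout looks consistent."""
    source_file = pathlib.Path(local_folder) / function_path
    if not source_file.is_file():
        return [f"{function_path} was not found inside {local_folder}"]
    # Parse the file without executing it and collect top-level function names.
    tree = ast.parse(source_file.read_text())
    defined = {node.name for node in tree.body if isinstance(node, ast.FunctionDef)}
    if function_name not in defined:
        return [f"function {function_name} is not defined in {function_path}"]
    return []


# Demonstration on a temporary folder that mimics the tutorial layout.
with tempfile.TemporaryDirectory() as folder:
    src = pathlib.Path(folder) / "src"
    src.mkdir()
    (src / "part-1-helloWorld.py").write_text(
        "def helloWorld() -> None:\n    print('Hello world!')\n"
    )
    ok = check_step_source(folder, "src/part-1-helloWorld.py", "helloWorld")
    wrong_name = check_step_source(folder, "src/part-1-helloWorld.py", "hello_world")
    wrong_path = check_step_source(folder, "part-1-helloWorld.py", "helloWorld")

print(ok)
print(wrong_name)
print(wrong_path)
```

An empty list for the first call means the three arguments agree with each other; the other two calls each report one problem.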

The above code should give you the following output:

>>> Please wait while step is being created. This may take a while...
>>> Steps creation succeeded
>>> {'name': 'part-1-hello-world'}

You can view the list of steps that you created in the platform with the list_steps() function of the SDK.

step_list = sdk.list_steps()
print(step_list)
>>> [{'step_name': 'part-1-hello-world',
>>>   'created_at': 'xxxx-xx-xxTxx:xx:xx.xxxZ',
>>>   'updated_at': 'xxxx-xx-xxTxx:xx:xx.xxxZ',
>>>   'status': 'Ready',
>>>   'origin': 'local'}]

The list shows your step, with its creation status set to Ready.
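
Once several steps exist, you may want to keep only those with a given status. Here is a small sketch that works on a list shaped like the list_steps() output above. The sample data is written by hand: the second entry and its 'Pending' status are hypothetical and only serve the illustration.

```python
def steps_with_status(step_list, status="Ready"):
    """Return the names of the steps whose 'status' field equals the given status."""
    return [step["step_name"] for step in step_list if step.get("status") == status]


# Hand-written sample with the same keys as the list_steps() output.
sample_steps = [
    {"step_name": "part-1-hello-world", "status": "Ready", "origin": "local"},
    # Hypothetical entry: this step name and status value are assumptions.
    {"step_name": "another-step", "status": "Pending", "origin": "local"},
]

ready = steps_with_status(sample_steps)
print(ready)
```

In practice, you would pass the result of sdk.list_steps() instead of sample_steps.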

You can also get the information of a specific step with the get_step() function of the SDK.

step_info = sdk.get_step('part-1-hello-world')
print(step_info)
>>> {
>>>   'parameters': {
>>>     'step_name': 'part-1-hello-world',
>>>     'function_path': 'src/part-1-helloWorld.py',
>>>     'function_name': 'helloWorld',
>>>     'description': None,
>>>     'container_config': {
>>>       'language': 'python:3.X-slim',
>>>       'requirements_path': 'requirements.txt',
>>>       'dockerfile_path': None
>>>     },
>>>     'inputs': [],
>>>     'outputs': []
>>>   },
>>>   'creation_info': {
>>>     'created_at': 'xxxx-xx-xxTxx:xx:xx.xxxZ',
>>>     'updated_at': 'xxxx-xx-xxTxx:xx:xx.xxxZ',
>>>     'created_by': 'xxxxxxxx-xxxx-xxxx-xxxxx-xxxxxxxxxxx',
>>>     'updated_by': 'xxxxxxxx-xxxx-xxxx-xxxxx-xxxxxxxxxxx',
>>>     'status': 'Ready',
>>>     'origin': 'local'
>>>   }
>>> }

Success

🎉 Your step has been created. You can now create your pipeline (and after that, you’ll execute it on the platform).

Create a pipeline with the SDK

The step part-1-hello-world containing our helloWorld code is now created in the platform and ready to be used in a pipeline that we will then execute.

A pipeline is a machine learning workflow, consisting of one or more steps, that can be easily deployed on the Craft AI platform. Conceptually, a full pipeline is a directed acyclic graph (DAG) of steps, built by specifying the output of one step as the input of another step.

In the future, it will be possible to assemble multiple steps into a complex machine learning pipeline. For now, the platform only allows single step pipelines.
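
To make the DAG idea concrete, here is a short illustration that is independent of the platform and of its SDK. It uses the standard graphlib module (Python 3.9+) to derive an execution order from step dependencies. The step names are hypothetical, and this is only a sketch of the concept, not a description of how the platform schedules steps.

```python
import graphlib

# Hypothetical step names; each step maps to the set of steps whose outputs it consumes.
dependencies = {
    "prepare-data": set(),
    "train-model": {"prepare-data"},
    "evaluate-model": {"train-model"},
}


def execution_order(step_dependencies):
    """Return an order in which every step runs after the steps it depends on."""
    return list(graphlib.TopologicalSorter(step_dependencies).static_order())


def has_cycle(step_dependencies):
    """Return True if the dependencies contain a cycle, i.e. they do not form a DAG."""
    try:
        execution_order(step_dependencies)
    except graphlib.CycleError:
        return True
    return False


order = execution_order(dependencies)
print(order)
print(has_cycle({"a": {"b"}, "b": {"a"}}))
```

The "acyclic" requirement is what guarantees that such an order exists: two steps that each need the other's output could never be started.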

To create a pipeline consisting of the previous step, you must use the create_pipeline() function of the SDK.

sdk.create_pipeline(
    pipeline_name='part-1-hello-world',
    step_name='part-1-hello-world',
)

This function has two arguments:

  • The pipeline_name is the name of the pipeline to create. As with the step_name, you will use this name to refer to the pipeline later.
  • The step_name is the name of the step used in the pipeline.

After executing this function, you should see the following output:

>>> Pipeline creation succeeded
>>> {'pipeline_name': 'part-1-hello-world',
>>> 'created_at': 'xxxx-xx-xxTxx:xx:xx.xxxZ',
>>> 'steps': ['part-1-hello-world'],
>>> 'open_inputs': [],
>>> 'open_outputs': []}

Success

🎉 Now that our pipeline is created (around our step), we want to execute it. To do this, we will run the pipeline with the SDK function run_pipeline(), which executes the code contained in the step.

Execute your pipeline (run)

You can execute a pipeline on the platform directly with the run_pipeline() function.

This function has two arguments:

  • The name of the existing pipeline to execute (pipeline_name)
  • Optional (only if you have inputs): a dict of inputs to pass to the pipeline, with input names as keys and the corresponding values as values.

sdk.run_pipeline(pipeline_name='part-1-hello-world')
>>> The pipeline execution may take a while, you can check its status and get information on the Executions page of the front-end.
>>> Its execution ID is 'part-1-hello-world-xxxxx'.
>>> Pipeline execution results retrieval succeeded
>>> Pipeline execution startup succeeded

Success

🎉 You have now created a step for the helloWorld function, included it in a pipeline, and executed it on the platform! Our hello world application is built and ready to be executed again!

Get information about an execution

We have now executed the pipeline. The return value of run_pipeline() shows that the pipeline executed successfully, but it does not include the execution logs. (It would also contain the pipeline outputs, but we did not define any here.)

To find the list of executions along with the information and associated logs, you can use the user interface as follows:

  1. Connect to https://mlops-platform.craft.ai

  2. Click on your project:

    step1_3

  3. Click on the Execution page, then on “Select an execution”. This displays the list of environments:

    step1_4

  4. Select your environment to get the list of runs and deployments:

    step1_6

  5. Finally, click on a run name to get its executions:

    step1_7

  6. Use the “General” tab to get general information about your execution, and the “Logs” tab to view and download the execution logs:

    step1_8

Using the SDK

It is possible to get the logs of the executions with the SDK. Let’s see how.

Once your pipeline has been executed, you can list its executions with the sdk.list_pipeline_executions() command.

sdk.list_pipeline_executions(
    pipeline_name='part-1-hello-world'
)
>>> [{'execution_id': 'part-1-hello-world-XXXX',
>>> 'status': 'Succeeded',
>>> 'created_at': 'xxxx-xx-xxTxx:xx:xx.xxxZ',
>>> 'end_date': 'xxxx-xx-xxTxx:xx:xx.xxxZ',
>>> 'created_by': 'xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxx',
>>> 'pipeline_name': 'part-1-hello-world',
>>> 'steps':
>>>     [{'name': 'part-1-hello-world',
>>>       'status': 'Succeeded',
>>>       'end_date': 'xxxx-xx-xxTxx:xx:xx.xxxZ',
>>>       'start_date': 'xxxx-xx-xxTxx:xx:xx.xxxZ',
>>>       'requirements_path': 'requirements.txt',
>>>       'origin': 'local'}]}]
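
Each step entry in this output carries a start_date and an end_date, so the duration of a step can be derived from them. The sketch below assumes the timestamps follow the masked pattern shown above (milliseconds and a trailing Z); the sample dates are invented for the illustration and step_durations is a hypothetical helper.

```python
from datetime import datetime

# Assumed timestamp layout, inferred from the masked 'xxxx-xx-xxTxx:xx:xx.xxxZ' pattern.
TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%S.%fZ"


def step_durations(execution):
    """Map each step name of an execution record to its duration in seconds."""
    durations = {}
    for step in execution["steps"]:
        start = datetime.strptime(step["start_date"], TIMESTAMP_FORMAT)
        end = datetime.strptime(step["end_date"], TIMESTAMP_FORMAT)
        durations[step["name"]] = (end - start).total_seconds()
    return durations


# Hand-written record with the same keys as the output above; the dates are made up.
sample_execution = {
    "execution_id": "part-1-hello-world-XXXX",
    "status": "Succeeded",
    "steps": [
        {
            "name": "part-1-hello-world",
            "status": "Succeeded",
            "start_date": "2024-01-15T10:00:00.000Z",
            "end_date": "2024-01-15T10:00:12.500Z",
        }
    ],
}

durations = step_durations(sample_execution)
print(durations)
```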

Then, you can get the logs of an execution with the sdk.get_pipeline_execution_logs() command. You will need the execution ID, which can be found with the previous command.

The logs are formatted line by line in JSON, but we can display them more simply with the print() command below. The logs also contain error messages if the execution encounters any.

Here is a complete sequence to print the logs of the latest execution:

pipeline_executions = sdk.list_pipeline_executions(
    pipeline_name='part-1-hello-world'
)

logs = sdk.get_pipeline_execution_logs(
    pipeline_name='part-1-hello-world',
    execution_id=pipeline_executions[-1]['execution_id'] # [-1] to get the last execution
)

print('\n'.join(log['message'] for log in logs))
>>> Please wait while logs are being downloaded. This may take a while…
>>> Hello world! Number of days since January 1, 2000: xxx
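
The sequence above relies on the last element of the list being the most recent execution. If you prefer not to depend on the list order, one option is to pick the execution with the latest created_at value. The sketch below does this on hand-written sample data; it assumes that created_at follows the masked pattern shown earlier (milliseconds and a trailing Z), and the helper names are hypothetical.

```python
from datetime import datetime

# Assumed timestamp layout, inferred from the masked 'xxxx-xx-xxTxx:xx:xx.xxxZ' pattern.
TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%S.%fZ"


def latest_execution_id(executions):
    """Return the ID of the execution with the most recent 'created_at' timestamp."""
    if not executions:
        raise ValueError("no executions found for this pipeline")
    latest = max(
        executions,
        key=lambda execution: datetime.strptime(execution["created_at"], TIMESTAMP_FORMAT),
    )
    return latest["execution_id"]


def format_logs(logs):
    """Join the 'message' field of each log entry, one message per line."""
    return "\n".join(log["message"] for log in logs)


# Hand-written samples; the IDs, dates and day count are made up for the illustration.
sample_executions = [
    {"execution_id": "part-1-hello-world-bbbb", "created_at": "2024-01-16T09:30:00.000Z"},
    {"execution_id": "part-1-hello-world-aaaa", "created_at": "2024-01-15T10:00:00.000Z"},
]
sample_logs = [{"message": "Hello world! Number of days since January 1, 2000: 8781"}]

print(latest_execution_id(sample_executions))
print(format_logs(sample_logs))
```

With the SDK, you would pass the result of sdk.list_pipeline_executions() to latest_execution_id and the result of sdk.get_pipeline_execution_logs() to format_logs.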

Success

🎉 You can now get your execution's logs.

What we have learned

In this part we learned how to easily build, deploy and use a simple application with the Craft AI platform with the following workflow:

step1_9

These three main steps form the fundamental workflow of the platform, and we will see them again and again throughout this tutorial.

Now that we know how to run our code on the platform, it is time to create more complex steps to have a real ML use case.

Next step: Part 2: Execute a simple ML model