Create an environment

An environment is the infrastructure used by the platform to store data and run computation. Once you have created a new project, the first step is to create an environment, which consists of a cluster and a data store.

Summary

  1. Set up an environment
  2. Manage an environment
  3. Handle data in the datastore

Set up an environment

To create an environment, open a project and click "New environment" in the top right corner.

You need to fill in some information to create your environment. The form is split into 4 sections:

  • General: general information such as the name
  • Computing resources: the specifications of the virtual machine your pipeline will run on
  • Storage resources: the specifications of the data store, which allows you to upload files or use a vector database for LLMs
  • Runtime Planner: a schedule for shutting down and restarting your environment so that it does not waste resources, for example by keeping it up only on certain days or shutting it down at night, which reduces its cost and environmental footprint

[Image: create-env]

In General:

  • Environment name: must be between 3 and 20 characters long, with no special characters (@, \, *, etc.)
  • Tag: one of Experimentation, Testing or Production
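The naming rule above can be checked before submitting the form. The following is a minimal sketch, not the platform's own validation: it assumes that "no special characters" means ASCII letters and digits only, and the function names are hypothetical.

```python
import re

# Assumption: "no special characters" is read here as ASCII letters and digits only.
# The platform may accept other characters (for example hyphens); adjust if so.
NAME_PATTERN = re.compile(r"[A-Za-z0-9]{3,20}")

# Tags listed in the General section of the form.
ALLOWED_TAGS = {"Experimentation", "Testing", "Production"}


def is_valid_environment_name(name: str) -> bool:
    """Return True if the name is 3 to 20 characters long and alphanumeric."""
    return NAME_PATTERN.fullmatch(name) is not None


def is_valid_tag(tag: str) -> bool:
    """Return True if the tag is one of the three values offered by the form."""
    return tag in ALLOWED_TAGS
```

For instance, `is_valid_environment_name("my@env")` returns `False` because of the `@`, while `is_valid_environment_name("prod01")` returns `True`.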

In Computing resources:

  • Cloud provider: AWS, GCP, S3NS or Scaleway
  • Zone: selected automatically based on the cloud provider (France for GCP and Scaleway, Netherlands for S3NS and Europe for AWS)
  • Machine: choose CPU or GPU (a GPU costs more, but may be required depending on the computation your project needs), then select a machine size (full table below)
  • Workers: the number of workers, which determines how many tasks can run in parallel

The "Size" values are arbitrary labels that we define for the machines of each provider. The table below shows the default configuration set in the platform. It can be modified, but you need to contact us to do so.
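As an illustration, the computing choices can be modelled as a small record whose zone is derived from the provider, mirroring how the form fills the zone in for you. This is only a sketch: the `ComputingResources` class and `ZONE_BY_PROVIDER` dictionary are hypothetical names, not part of any platform SDK.

```python
from dataclasses import dataclass

# Zones as listed above; the dictionary itself is illustrative, not a platform API.
ZONE_BY_PROVIDER = {
    "AWS": "Europe",
    "GCP": "France",
    "Scaleway": "France",
    "S3NS": "Netherlands",
}


@dataclass(frozen=True)
class ComputingResources:
    provider: str      # "AWS", "GCP", "S3NS" or "Scaleway"
    machine_type: str  # "CPU" or "GPU"
    size: int          # size label, e.g. 1 for "CPU size 1"
    workers: int       # number of workers used for task parallelization

    def __post_init__(self) -> None:
        # Reject values that the form would not offer.
        if self.provider not in ZONE_BY_PROVIDER:
            raise ValueError(f"Unknown cloud provider: {self.provider}")
        if self.machine_type not in ("CPU", "GPU"):
            raise ValueError(f"Unknown machine type: {self.machine_type}")

    @property
    def zone(self) -> str:
        """The zone is not chosen by the user; it follows from the provider."""
        return ZONE_BY_PROVIDER[self.provider]
```

With this sketch, `ComputingResources("S3NS", "CPU", 2, 4).zone` evaluates to `"Netherlands"`.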

| Provider | Size       | Hardware                  | Reference       | Disk   | Workers           |
|----------|------------|---------------------------|-----------------|--------|-------------------|
| AWS      | CPU size 1 | 2 CPU, 4 GB               | t3a.medium      | 20 GB  | 2, 4, 6, 8, 10    |
| AWS      | CPU size 2 | 4 CPU, 16 GB              | t3a.xlarge      | 20 GB  | 2, 4, 6, 8, 10    |
| AWS      | CPU size 3 | 8 CPU, 32 GB              | t3a.2xlarge     | 20 GB  | 2, 4, 6, 8, 10    |
| AWS      | CPU size 4 | 16 CPU, 64 GB             | m6a.4xlarge     | 20 GB  | 2, 4, 6, 8, 10    |
| AWS      | GPU size 1 | 1 GPU (T4), 16 GB VRAM    | g4dn.xlarge     | 100 GB | 2, 4, 6, 8, 10    |
| AWS      | GPU size 2 | 1 GPU (A10G), 24 GB VRAM  | g5.xlarge       | 100 GB | 2, 4, 6, 8, 10    |
| AWS      | GPU size 3 | 4 GPU (T4), 48 GB VRAM    | g4dn.12xlarge   | 200 GB | 2, 4, 6, 8, 10    |
| Scaleway | CPU size 1 | 2 CPU, 4 GB               | PLAY2-NANO      | 20 GB  | 1, 2, 4, 6, 8, 10 |
| Scaleway | CPU size 2 | 4 CPU, 16 GB              | PLAY2-MICRO     | 20 GB  | 1, 2, 4, 6, 8, 10 |
| Scaleway | CPU size 3 | 8 CPU, 32 GB              | PRO2-S          | 20 GB  | 1, 2, 4, 6, 8, 10 |
| Scaleway | CPU size 4 | 16 CPU, 64 GB             | POP2-16C-64G    | 20 GB  | 1, 2, 4, 6, 8, 10 |
| Scaleway | GPU size 1 | 1 GPU (L4), 24 GB VRAM    | L4-1-24G        | 100 GB | 1, 2, 4, 6, 8, 10 |
| Scaleway | GPU size 2 | 1 GPU (L40S), 96 GB VRAM  | L40S-1-48G      | 100 GB | 1, 2, 4, 6, 8, 10 |
| GCP      | CPU size 1 | 2 CPU, 8 GB               | n2d-standard-2  | 20 GB  | 1, 2, 4, 6, 8, 10 |
| GCP      | CPU size 2 | 4 CPU, 16 GB              | n2d-standard-4  | 20 GB  | 1, 2, 4, 6, 8, 10 |
| GCP      | CPU size 3 | 8 CPU, 32 GB              | n2d-standard-8  | 20 GB  | 1, 2, 4, 6, 8, 10 |
| GCP      | CPU size 4 | 16 CPU, 64 GB             | n2d-standard-16 | 20 GB  | 1, 2, 4, 6, 8, 10 |
| GCP      | GPU size 1 | 1 GPU (T4)                | n1-standard-4   | 100 GB | 1, 2, 4, 6, 8, 10 |
| GCP      | GPU size 2 | 1 GPU (L4)                | g2-standard-4   | 100 GB | 1, 2, 4, 6, 8, 10 |
| S3NS     | CPU size 1 | 2 CPU, 8 GB               | n2d-standard-2  | 20 GB  | 1, 2, 4, 6, 8, 10 |
| S3NS     | CPU size 2 | 4 CPU, 16 GB              | n2d-standard-4  | 20 GB  | 1, 2, 4, 6, 8, 10 |
| S3NS     | CPU size 3 | 8 CPU, 32 GB              | n2d-standard-8  | 20 GB  | 1, 2, 4, 6, 8, 10 |
| S3NS     | CPU size 4 | 16 CPU, 64 GB             | n2d-standard-16 | 20 GB  | 1, 2, 4, 6, 8, 10 |
| S3NS     | GPU size 1 | 1 GPU (T4)                | n1-standard-4   | 100 GB | 1, 2, 4, 6, 8, 10 |
| S3NS     | GPU size 2 | 1 GPU (L4)                | g2-standard-4   | 100 GB | 1, 2, 4, 6, 8, 10 |
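The table can also be read as a lookup: the provider determines which worker counts are available, and the provider, machine type and size together determine the instance reference. The sketch below encodes the worker counts and a few sample rows in plain dictionaries. The dictionary and function names are hypothetical, and the values reflect the default configuration only.

```python
# Worker counts offered per provider, copied from the table above.
WORKER_CHOICES = {
    "AWS": (2, 4, 6, 8, 10),
    "Scaleway": (1, 2, 4, 6, 8, 10),
    "GCP": (1, 2, 4, 6, 8, 10),
    "S3NS": (1, 2, 4, 6, 8, 10),
}

# A few sample rows: (provider, machine type, size) -> instance reference.
# Extend with the remaining rows of the table as needed.
MACHINE_REFERENCE = {
    ("AWS", "CPU", 1): "t3a.medium",
    ("AWS", "GPU", 2): "g5.xlarge",
    ("Scaleway", "CPU", 3): "PRO2-S",
    ("GCP", "GPU", 1): "n1-standard-4",
    ("S3NS", "CPU", 4): "n2d-standard-16",
}


def is_valid_worker_count(provider: str, workers: int) -> bool:
    """Return True if the default configuration offers this worker count."""
    return workers in WORKER_CHOICES.get(provider, ())
```

Note that, with the default configuration, a single worker is available on Scaleway, GCP and S3NS but not on AWS, so `is_valid_worker_count("AWS", 1)` returns `False`.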

In Storage resources:

  • Data store provider: filled in automatically with the cloud provider
  • Vector database provider: you can choose to add a vector database to your project. Only one option is available for now: Weaviate. More details are available in the Vector DB documentation.

In Runtime Planner:

  • Operational Days: toggle the days on which you want your environment to be shut down (the weekend, for example)
  • Resume Time: the time, in UTC, at which the environment should restart (keep in mind that the restart is not instantaneous, so plan for a short delay)
  • Standby Time: the time, in UTC, at which the environment starts shutting itself down
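To reason about a schedule, it can help to express it as a function of the current time. The sketch below is an interpretation of the three fields, not the platform's scheduler: it assumes the environment runs only on its operational days, between the resume time and the standby time of the same day, and it ignores the restart delay. The function name and parameters are hypothetical.

```python
from datetime import datetime, time, timezone


def is_operational(
    now: datetime,
    operational_days: set[int],
    resume_time: time,
    standby_time: time,
) -> bool:
    """Return True if the environment is expected to be up at `now`.

    now: timezone-aware datetime (converted to UTC before comparison)
    operational_days: weekdays on which the environment runs (Monday is 0)
    resume_time, standby_time: naive times, interpreted as UTC
    """
    now_utc = now.astimezone(timezone.utc)
    # On a day that is not operational, the environment stays on standby.
    if now_utc.weekday() not in operational_days:
        return False
    # Assumption: resume_time is earlier than standby_time on the same day.
    return resume_time <= now_utc.time() < standby_time
```

For example, with operational days Monday to Friday (`{0, 1, 2, 3, 4}`), a resume time of `time(7, 0)` and a standby time of `time(19, 0)`, the function returns `True` on a Monday at 09:00 UTC and `False` on a Saturday at any hour.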

Manage an environment

Once the environment is created, you can manage it from the environment page and its settings.

From the environment page, you have 3 options:

  • Access the settings: opens the Settings page, detailed below
  • Put operational / Put on standby: switches the environment between its operational (up and running) and standby states. While it is on standby, you cannot run anything, but you can still access everything in read-only mode.
  • Delete: deletes the environment and all its associated content

The Settings page is divided into two main parts:

  • A drop-down menu that lists all the environments created in the project
  • A section that presents information about the currently selected environment, organized in two tabs:
    • A first tab that presents information about:
      • The environment, as defined at creation (name and type), plus extra information (status, URL, IP address, creation date)
      • Its resources (cloud provider and machine size), which cannot be modified
      • Its runtime planner, which can be changed with the same options as in the creation form (operational days, resume time, standby time)
    • A second tab that lists all the environment variables of the environment, with the option to add more as key/value pairs

Handle data in the datastore

As each environment has its own datastore, you can store documents or information in it and download the results of your processing. This is detailed in the Datastore part of the documentation.