At cnvrg, we believe datasets should be managed at the organization level rather than separately for each project.
Once you have uploaded a dataset to the organization, you can reuse it in any project, experiment, or notebook.
Create a new dataset
Change to the directory with the data you want to link and run:
dataset1>$ cnvrg data init
In the dataset directory (after running data init), run:
dataset1>$ cnvrg data upload
The dataset will be compressed and uploaded to cnvrg's servers.
Note: if the dataset is very large, the upload may take a while.
To view all datasets the organization owns:
$ cnvrg data list
List dataset commits
To view the list of commits for a specific dataset:
dataset1>$ cnvrg data commits
Run an experiment with a dataset
To run an experiment with a dataset you have uploaded, add the --data flag to the run command, in the form --data=DATA_ID.
The DATA_ID can be found in the data_id column of the data list command output.
project1>$ cnvrg run --data=DATA_ID python train.py
Inside the experiment, the dataset is available at: /data/DATA_ID
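As a rough illustration of how train.py might locate the dataset files, here is a minimal Python sketch. It assumes only what is stated above, namely that the dataset is mounted at /data/DATA_ID. The helper names dataset_dir and list_dataset_files are hypothetical and not part of the cnvrg CLI or any cnvrg library.

```python
from pathlib import Path

# Assumption: datasets are mounted under /data/<DATA_ID>, as described above.
DATA_ROOT = Path("/data")


def dataset_dir(data_id, root=DATA_ROOT):
    """Return the directory where the dataset with this ID is expected.

    Hypothetical helper for illustration; not a cnvrg API.
    """
    return Path(root) / data_id


def list_dataset_files(data_id, root=DATA_ROOT):
    """Return the sorted relative paths of all files in the dataset directory."""
    base = dataset_dir(data_id, root)
    if not base.is_dir():
        # Fail early with a clear message if the dataset was not attached
        raise FileNotFoundError(f"dataset directory not found: {base}")
    return sorted(
        p.relative_to(base).as_posix() for p in base.rglob("*") if p.is_file()
    )
```

In train.py, replace DATA_ID with the value passed to --data, for example files = list_dataset_files("DATA_ID"). The root parameter exists only so the helper can be tried locally against a directory other than /data.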