Containers
Users of the Strong Compute ISC develop and launch experiments from inside Containers, which provide access to their project data and the flexibility to install any necessary project dependencies. Users can generate and start their Containers by visiting the "Workstations" page on Control Plane (https://cp.strongcompute.ai).
Users can generate one Container per Organisation that they are a member of, and each Container is generated on one and only one Constellation Cluster ("Cluster"). Once a Container has been generated on a particular Cluster, it cannot be started on any other Cluster. To choose where a Container is generated, select a Cluster from the Constellation Cluster menu associated with that Container, then click "Generate".
A Workstation is a computer in a Constellation Cluster that runs User Containers.
Containers can be started with access to one or more GPUs, subject to the availability of GPUs on Workstation machines within the Cluster. Users can start their Container with zero or more GPUs by selecting from the GPUs menu for the Container before clicking "Start".
GPUs are made available within Containers this way for development and testing purposes, and are not intended to be used for training. Each Organisation is limited to one (1) "running" Container with one or more GPUs attached at any given time. If a member of your Organisation already has a Container associated with that Organisation running with one or more GPUs, Control Plane will return an error when you try to start another Container associated with that Organisation with one or more GPUs attached.
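As a quick sanity check after starting a Container with one or more GPUs attached, you can confirm the GPUs are visible from inside the Container. This is a minimal sketch assuming NVIDIA GPUs with the standard nvidia-smi utility available in the Container:

# List the GPUs visible inside the running Container
nvidia-smi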
Users can also optionally select a Dataset to be mounted to their Container when it is started, from the "Mounted Dataset" menu for the Container. Datasets mounted to the Container are accessible within the Container at /data.
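As an illustration, you can verify the mount and inspect the Dataset from a shell inside the Container (standard commands, not specific to Strong Compute):

# Confirm the Dataset is mounted and list its contents
ls -lh /data
# Report the total size of the mounted Dataset
du -sh /data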
Containers running with zero GPUs share the system resources of the machine, such as CPU cores and memory, with other User Containers running on that machine. Performance of Containers running with zero GPUs may therefore be impacted at times by the number of other User Containers also running on the same Workstation.
Containers running with one or more GPUs will have dedicated system resources such as CPU cores and memory. Performance of Containers running with one or more GPUs will not be impacted by other User activity.
Containers are automatically stopped after a period of 2 hours without an active SSH connection.
Organisation Admin and Finance members can stop running member containers associated with that Organisation by visiting the Member Containers section at the bottom of the User Credentials page. This allows Organisation Admin and Finance members to manage the utilisation of Organisation Containers, including those running with one or more GPUs.
Containers provide secure access to your user directory, which is an external volume mounted within your Container at /root. User directories have a default maximum capacity of 75GB (subject to change). Attempting to save more than 75GB of data to /root, such as datasets or model checkpoints, will result in the system refusing the write and/or confusing errors.
If an experiment running on the cluster is saving checkpoints to /root and your user directory runs out of storage, the experiment will fail. Users are strongly encouraged to delete any unnecessary data from their user directories to avoid running out of storage space.
Tip: run du -h . to see a breakdown of the current directory's storage use!
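For a more targeted breakdown, the following sketch (assuming GNU coreutils are available inside the Container) summarises usage per top-level directory under /root and reports the space remaining on the volume:

# Usage per top-level directory in your user directory, largest last
du -h --max-depth=1 /root | sort -h
# Size, used and available space on the /root volume
df -h /root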
To share files across an Organisation easily, use the /shared directory. This has a 100GB allocation, which is separate from your per-user limit.
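For example, you might copy a large checkpoint into /shared so that other members of your Organisation can access it (the file path below is purely illustrative):

# Copy a checkpoint into the Organisation-wide shared directory
cp /root/checkpoints/model_final.pt /shared/
# Check how much of the 100GB shared allocation is in use
du -sh /shared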
Upon login, you'll see your current usage printed out.