
How to use Berzelius

Note that some of the links below may require you to have a SNIC login.

  1. The information below is from memory and should be updated by the next person going through the process.
  2. Get a login to SNIC: https://supr.snic.se.
  3. You will be walked through the process.
  4. Note that you will have to accept the SNIC user agreements.
  5. After every step you will receive an email from supr@supr.snic.se. Read these emails carefully; they tell you what to do next.
  6. At some point you will have to write a small project proposal: https://supr.snic.se/proposal/
    1. Go to the Rounds page.
    2. In the sub-menu, select AI/ML and then select LiU Berzelius.
    3. Create the proposal for getting computation time on Berzelius.
  7. You will get a confirmation email from snic.se and NSC Berzelius and later a confirmation email that your project was accepted.
  8. To access Berzelius a login account is needed:
    1. Go to the Accounts page and request an account for Berzelius: https://supr.snic.se/account/
    2. Accept the Berzelius User Agreement.
    3. Wait for your account to be created.
    4. When your account is ready you will receive an email instructing you to choose a password.
  9. Before you can log in you need to set up two-factor authentication (2FA): https://www.nsc.liu.se/support/2fa/migration/
  10. Go through the section “How to enable 2FA for your cluster login account - detailed version”.
  11. Finally, you can run ssh berzelius.nsc.liu.se. You will be asked for your password and then for the 6-digit code from your authenticator app.
  12. … and you should be in.
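
If you log in often, a host alias in your SSH client config saves some typing. The following is a minimal sketch, not something from the NSC documentation: the function name add_berzelius_host and the alias "berzelius" are made up for illustration, and the username is whatever NSC assigned to you. Host, HostName and User are standard OpenSSH client config keywords.

```shell
# Append a "berzelius" host alias to an SSH client config file.
# Usage: add_berzelius_host <username> [config-file]
# (hypothetical helper; the config file defaults to ~/.ssh/config)
add_berzelius_host() {
    user="$1"
    config="${2:-$HOME/.ssh/config}"
    mkdir -p "$(dirname "$config")"
    cat >> "$config" <<EOF

Host berzelius
    HostName berzelius.nsc.liu.se
    User $user
EOF
}
```

After calling add_berzelius_host with your username, ssh berzelius should be equivalent to the full command in step 11. You are still asked for the password and the 2FA code.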

If you want to use conda:

  • On your private workstation, use conda env export > environment.yml to get a description of your conda environment. Use scp or rsync to copy environment.yml to Berzelius.
  • Run module load Anaconda/2021.05-nsc1 to load the conda module. It is a good idea to add this to your .bashrc.
  • Run ln -s ~/.conda /proj/<your_project_dir>/users/$(id -un). Don't forget to replace <your_project_dir> with your project ID; in my case, it is “berzelius-2022-58”.
  • conda env create -f environment.yml will replicate your home conda environment with the same name, etc. Then, run conda activate…
  • To install PyTorch, run conda install pytorch torchvision torchaudio cudatoolkit=11.3 -c pytorch
  • If you need OpenAI Gym and the Atari games, use conda install -c conda-forge gym.
  • If you need the Atari ROMs, the steps become a bit mysterious: see https://github.com/mgbellemare/Arcade-Learning-Environment and perhaps the outdated https://github.com/openai/atari-py
  • So far I have only tested PyTorch. It seems to see only a single GPU and apparently does not fully support the NVIDIA A100-SXM4-40GB GPU. This is under investigation :-)
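
To narrow down the single-GPU observation, it helps to compare what the driver reports with what PyTorch reports in the same session. The sketch below is only a diagnostic aid; the function name check_gpus is made up, and it assumes python3 resolves to the interpreter of your active conda environment. One possible explanation (not confirmed here) is that the session was only allocated one GPU, in which case both tools would agree on a count of one.

```shell
# Report which GPUs the driver and PyTorch can see in the current session.
# (hypothetical helper, meant to be run on a compute node)
check_gpus() {
    # GPUs exposed to this session by the NVIDIA driver.
    if command -v nvidia-smi >/dev/null 2>&1; then
        nvidia-smi -L || echo "nvidia-smi failed (no GPU allocated?)"
    else
        echo "nvidia-smi not found"
    fi
    # If set, this variable restricts which GPUs CUDA programs may use.
    echo "CUDA_VISIBLE_DEVICES=${CUDA_VISIBLE_DEVICES:-<unset>}"
    # GPUs visible to PyTorch, plus the CUDA version it was built for.
    python3 -c 'import torch
print("torch", torch.__version__, "built for CUDA", torch.version.cuda)
print("cuda available:", torch.cuda.is_available())
print("device count:", torch.cuda.device_count())' \
        || echo "PyTorch could not be imported in this environment"
}
```

Run check_gpus after conda activate. If nvidia-smi lists more GPUs than PyTorch's device count, the problem is on the PyTorch side; if both show one GPU, look at how the resources were requested.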

Install Singularity by following this guide: https://sylabs.io/guides/3.0/user-guide/quick_start.html

Creating a simple Singularity image using a recipe file: https://sylabs.io/guides/3.0/user-guide/definition_files.html

  1. Create and open a recipe file using vim Singularity.recipe
  2. Choose a bootstrap agent that will create the base OS you want to use and add the corresponding lines to the recipe file:
    • Bootstrap: docker
    • From: ubuntu:20.04
  3. Create the %post section which will execute commands within the singularity container:
    • %post
    • apt -y update
    • apt -y install python3
    • apt -y install python3-pip
    • pip3 install --upgrade pip
    • pip3 install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu113
  4. Add a %files section if any files are to be used by the singularity container:
    • %files
    • main.py /
    • eval.py /
    • logs/ /
  5. Specify in the %runscript section the standard script to be run:
    • %runscript
    • python3 main.py
  6. Create a .sif file from the recipe by running sudo singularity build image.sif Singularity.recipe
  7. The script in the %runscript section can then be run with singularity run image.sif
  8. Other scripts can be run using singularity exec image.sif python3 eval.py
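
For reference, steps 2 to 5 put together give a recipe file like the one below. This is a sketch assembled from the steps above, written out with a shell here-document so it can be pasted into a terminal. It uses the Ubuntu package name python3-pip for pip and the double-dash spellings --upgrade and --extra-index-url. The files main.py, eval.py and logs/ are the example names from step 4 and must exist next to the recipe when you build.

```shell
# Write the complete recipe file into the current directory.
cat > Singularity.recipe <<'EOF'
Bootstrap: docker
From: ubuntu:20.04

%post
    # Commands run inside the container at build time.
    apt -y update
    apt -y install python3
    apt -y install python3-pip
    pip3 install --upgrade pip
    pip3 install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu113

%files
    # <source on the host> <destination in the container>
    main.py /
    eval.py /
    logs/ /

%runscript
    python3 main.py
EOF
```

The quoted 'EOF' keeps the shell from expanding anything inside the recipe, so the file is written exactly as shown.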

Using the .sif file on Berzelius:

  1. Upload the .sif file to Berzelius using scp image.sif <username>@berzelius1.nsc.liu.se:/proj/<project name>/users/<username>
  2. Also upload all other files you need to the same folder on Berzelius.
  3. Log into Berzelius, request computing resources and change directory to /proj/<project name>/users/<username>
  4. Run your script using, for example, singularity exec --nv image.sif python3 main.py
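
Steps 3 and 4 can also be done non-interactively with a batch script. The sketch below assumes that Berzelius schedules jobs through Slurm, which this page does not state, so check the NSC documentation before relying on it. The file name run_main.sh, the job name, the GPU count and the time limit are illustrative values, not recommendations.

```shell
# Write a small Slurm batch script into the current directory.
cat > run_main.sh <<'EOF'
#!/bin/bash
#SBATCH --job-name=main
#SBATCH --gpus=1
#SBATCH --time=01:00:00

# Slurm starts the job in the directory sbatch was called from,
# so image.sif and main.py are found if you submit from that folder.
singularity exec --nv image.sif python3 main.py
EOF
```

Under that assumption, you would submit the job from the folder containing image.sif with sbatch run_main.sh.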
  • Last modified: 2022/09/02 14:04