See also: The Ultimate Guide to Distributed Computing with a MWE.
Get an account on the cluster, such as UNC's, and set up an ssh key, which allows you to log on without entering passwords.
Try this out by logging into the cluster via ssh from the terminal.
Installing Julia on a Cluster¶
This is for cases where one does not want to use the version that is installed for everyone (usually because it lags behind the current version).
The command line installation instructions for Linux produce a directory named like julia-1.6.1, with the binary at bin/julia inside it.
All it then takes is to replace the generic julia -e with ~/julia-1.6.1/bin/julia -e in the command files.
Getting started with a test script¶
How to get your code to run on a typical Linux cluster?
- Get started by writing a simple test script (Test3.jl) so we can test running from the command line.
- Make sure you can run the test script locally from the command line.
- Now copy Test3.jl to a directory on the cluster and repeat the same.
- Once: make Julia available on the cluster with module add julia, or with module add julia/1.5.3 if you want a specific version.
- Then run the test script on the cluster.
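As a minimal sketch of the steps above, the following shell snippet writes a one-line test script and runs it only if a julia binary is on the PATH. The contents of Test3.jl shown here are an assumption for illustration; any script that prints something recognizable serves the same purpose.

```shell
# Write a tiny test script. Its contents are hypothetical; the original
# Test3.jl is not shown in this guide.
cat > Test3.jl <<'EOF'
println("Test3 is running on ", gethostname())
EOF

# Run it if Julia is available (on the cluster: after `module add julia`).
if command -v julia >/dev/null 2>&1; then
    julia Test3.jl
else
    echo "julia not found on PATH; load it first" >&2
fi
```

The same two commands can be repeated unchanged on the cluster once the file has been copied there.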
Now we know that things run on the cluster and it's time to submit a batch file:
sbatch -p general -N 1 -J "test_job" -t 3-00 --mem 16384 -n 1 --mail-type=end --mail-user=email@example.com -o "test1.out" --wrap="julia /full/path/to/Test3.jl"
The usual way of submitting jobs consists of writing an sbatch file and then submitting it using the sbatch command:
- Copy your code and all of its dependencies to the cluster (see below). This is not needed when all dependencies are registered.
- Write a Julia script that contains the startup code for the project and then runs the actual computation (call this batch.jl).
- Write a batch file that submits julia batch.jl as a job to the cluster's job scheduler. For UNC's longleaf cluster, this would be slurm. So you need to write job.sl, which will be submitted using sbatch.
Each line in the sbatch file looks like #SBATCH -o value.
Options (indicated by -o) include:
- -t 03-00: time in days-hours
- -N 1: number of nodes
- --mem 24576: memory in megabytes (per node; --mem-per-cpu sets a per-CPU limit instead)
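To make the file layout concrete, here is a hedged sketch that writes a job.sl using the options listed above. The partition, job name, output file, and script path are assumptions taken from or modeled on the sbatch one-liner earlier in this guide; adjust them for your cluster.

```shell
# Write a sketch of job.sl. Partition, job name, and resource values are
# illustrative assumptions, not required settings.
cat > job.sl <<'EOF'
#!/bin/bash
#SBATCH -p general
#SBATCH -N 1
#SBATCH -n 1
#SBATCH -t 03-00
#SBATCH --mem 24576
#SBATCH -J test_job
#SBATCH -o test1.out

julia /full/path/to/batch.jl
EOF

echo "Submit on the cluster with: sbatch job.sl"
```

Keeping the options in the file rather than on the command line makes a run easier to reproduce later.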
Status of running jobs:¶
- squeue -u
- squeue --job XXXX
- sacct --format="JobID,JobName%30,State,ExitCode" (best typed using KeyboardMaestro)
Examining memory and cpu usage¶
After jobs have completed, the MaxRSS field of sacct shows memory usage.
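One way to avoid retyping the format string is a small shell function. This is a sketch: the name job_usage is hypothetical, and the Elapsed and TotalCPU fields are additions beyond what the text mentions.

```shell
# Hypothetical helper: show memory and CPU usage for one finished job.
job_usage() {
    # $1 is the slurm job ID
    sacct -j "$1" --format="JobID,JobName%30,State,ExitCode,MaxRSS,Elapsed,TotalCPU"
}
```

On the cluster, it would be called as job_usage XXXX, with XXXX replaced by the job ID reported by sbatch or squeue.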
From time to time, GitHub asks for user credentials when trying to download private repos, even if those have been downloaded many times before. When that happens, precompile the package from the REPL on the cluster, entering the credentials by hand. They will then be stored for some time again.
Enter the personal access token instead of the account password.
The Julia script¶
Submitting a job is (almost) equivalent to running julia batch.jl from the terminal.
Note: cd() does not work in these command files. To include a file, provide a full path.
If you only use registered packages, life is easy. Your code would simply say:
    using Pkg
    # This needs to be done only once, but it does not hurt
    Pkg.add("MyPackage")
    # Make sure all required packages are downloaded
    Pkg.instantiate()
    using MyPackage
    MyPackage.run()
If the code for MyPackage has been copied to the remote, then julia --project="/path/to/MyPackage" --startup-file=no batch.jl activates MyPackage and runs batch.jl.
- The --project option is equivalent to Pkg.activate.
- Julia looks for batch.jl relative to the current working directory.
- Disabling the startup-file prevents surprises where the startup-file changes the directory before looking for batch.jl.
- ~ is not expanded when relative paths are used.
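A hedged sketch of that invocation as a shell helper follows. It expects absolute paths, so that neither ~ expansion nor the working directory matters. The function name run_project_script and the example paths are hypothetical.

```shell
# Hypothetical helper: run a script inside a package's environment.
run_project_script() {
    project_dir=$1   # absolute path to the package directory
    script=$2        # absolute path to batch.jl
    julia --project="$project_dir" --startup-file=no "$script"
}
# Example call (paths are placeholders):
# run_project_script "$HOME/julia/MyPackage" "$HOME/julia/MyPackage/batch.jl"
```

Using $HOME rather than ~ lets the shell build the full path before Julia sees it.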
If MyPackage is unregistered or contains unregistered dependencies, things get more difficult. Now batch.jl must:
- Activate the package's environment.
- develop all unregistered dependencies. This replaces the invalid paths to directories on the local machine (e.g. /Users/lutz/julia/...) with the corresponding paths on the cluster (e.g. /nas/longleaf/...). Note: I verified that one cannot replace homedir() with ~ in Manifest.toml.
- using MyPackage
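The three steps above might look as follows in batch.jl. This is a sketch written out via a shell heredoc, not the author's PackageTools.jl approach; all paths and the dependency name MyDep are hypothetical placeholders.

```shell
# Write a sketch of batch.jl. All paths and the dependency name MyDep are
# hypothetical placeholders.
cat > batch.jl <<'EOF'
using Pkg
# 1. Activate the package's environment.
Pkg.activate("/path/on/cluster/MyPackage")
# 2. Point each unregistered dependency at its location on the cluster.
Pkg.develop(path="/path/on/cluster/MyDep")
# 3. Load the package and run the computation.
using MyPackage
MyPackage.run()
EOF
```

With several unregistered dependencies, step 2 would be repeated once per dependency.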
Developing MyPackage in a blank folder does not work (for reasons I do not understand). It results in errors indicating that dependencies of MyPackage could not be found.
This approach requires you to keep track of all unregistered dependencies and where they are located on the remote machine. My way of doing this is contained in PackageTools.jl in the shared repo (this is not a package because its very purpose is to facilitate loading of unregistered packages). But the easier way is to create a private registry and register all dependencies.
A reliable command line transfer option is rsync (on Mac / Linux). The command would be something like
rsync -atuzv "/someDirectory/sourceDir/" "firstname.lastname@example.org:someDirectorySourceDir"
- The source dir should end in “/”; the target dir should not.
- Excluding .git speeds up the transfer.
- --delete ensures that no old files remain on the server.
- This will use ssh for authentication if it is set up.
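The notes above can be combined into one helper. This is a sketch under assumptions: the function name sync_to_cluster is hypothetical, and the slash handling is added here for convenience. Note that --delete removes files on the target that are absent from the source, so check the target path before running it.

```shell
# Hypothetical helper: mirror a local directory to the cluster.
sync_to_cluster() {
    src=${1%/}/    # source dir: make sure it ends in a slash
    dest=${2%/}    # target dir: strip a trailing slash if present
    rsync -atuzv --delete --exclude=".git" "$src" "$dest"
}
# Example call (user, host, and paths are placeholders):
# sync_to_cluster "/someDirectory/sourceDir" "user@host:someDirectory/sourceDir"
```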
An alternative is to use
To transfer an individual file:
run(`scp $filename hostname:/path/to/newfile.txt`).