Packages¶
A package is basically a directory that
- contains a module
- declares all of its dependencies
The idea is to make code reusable: we can copy the package directory to another computer and run it there.
You encounter packages in two ways:
- when you use code written by others (as in `using Revise`)
- when you write your own code, because Julia really wants you to wrap it in packages.
Environments¶
If a block of related code were totally self-contained, life would be easy. Reusing code would be as simple as copying a directory full of `.jl` files. This is how languages without dependency management (such as Matlab) work.
But writing self-contained packages would be very hard. Imagine having to code random number generation each time you want to draw some random numbers.
So we want to be able to use packages that use other packages that use other packages...
To do so, our code needs to declare what other code (packages) is needed to run it. These are dependencies.
In Julia, dependencies are declared in `Project.toml` files. They basically contain a list of the packages that the code needs; the exact versions and the locations of their code are recorded in a companion `Manifest.toml` file.
But which `Project.toml` should Julia use when you type `using Revise`?
The short (not quite accurate) answer is: Julia looks for `Project.toml` in the current environment.
- The longer (more accurate) answer is that Julia searches every environment listed in `LOAD_PATH` (the current environment comes first), while the package code itself is stored in the directories listed in `DEPOT_PATH`.
- My recommendation is never to touch either of them and never to worry about them.
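To see these locations in a running session, here is a minimal sketch that only reads state and changes nothing. `Base.active_project` is not exported, hence the `Base.` prefix; `LOAD_PATH` lists the environments that `using` searches, and `DEPOT_PATH` lists where installed package code is stored.

```julia
# Where does this Julia session look for environments and package code?
active = Base.active_project()            # Project.toml of the current environment
println("Active project: ", active)
println("LOAD_PATH:      ", LOAD_PATH)    # environments searched by `using`
println("DEPOT_PATH:     ", DEPOT_PATH)   # where installed package code is stored
```

The exact paths depend on the machine and on how Julia was started, so no output is shown here.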
At any point in time, exactly one directory is "activated" as the current environment. To see which one, type `] st`. This does two things:
- the `]` switches the REPL to package mode.
  - the prompt changes to `(@v1.6) pkg>`
  - the `(@v1.6)` part of the prompt tells us that the active environment is `v1.6`, which is Julia's startup environment
  - the `pkg>` part reminds you that your commands are interpreted as package commands, not regular REPL commands.
- the `st` is short for `status`.
The same info can be displayed by issuing `Pkg` commands directly:
julia> using Pkg
julia> Pkg.status()
Status `~/.julia/environments/v1.5/Project.toml`
[5fb14364] OhMyREPL v0.5.10
[295af30f] Revise v3.1.12
Or you can look at `Project.toml` directly:
# To switch to shell prompt, type `;`
shell> cat ~/.julia/environments/v1.6/Project.toml
[deps]
OhMyREPL = "5fb14364-9ced-5910-84b2-373655c76a03"
Revise = "295af30f-e4ad-537b-8983-00126c2a3abe"
Each line gives the name and UUID (a unique id) of one available package.
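The same name-to-UUID table can be read programmatically. The sketch below assumes the `TOML` standard library is available (Julia 1.6 or later) and inlines the file content shown above, so it does not depend on any particular machine.

```julia
using TOML  # standard library, assumed available (Julia 1.6 or later)

# Same content as the Project.toml shown above, inlined to keep the sketch self-contained.
project = TOML.parse("""
[deps]
OhMyREPL = "5fb14364-9ced-5910-84b2-373655c76a03"
Revise = "295af30f-e4ad-537b-8983-00126c2a3abe"
""")

deps = project["deps"]                      # Dict: package name => UUID string
for name in sort(collect(keys(deps)))
    println(rpad(name, 10), deps[name])
end
```

To inspect the real file of the active environment, `TOML.parsefile(Base.active_project())` should work in the same way, provided the file exists.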
This means that `using Revise` will make the code in `Revise.jl` available. But if I try `using Plots`, I get an error message:
julia> using Plots
ERROR: ArgumentError: Package Plots not found in current path:
- Run `import Pkg; Pkg.add("Plots")` to install the Plots package.
This basically says: Julia cannot find an entry for `Plots` in `Project.toml`. So the code cannot be used until I run `Pkg.add("Plots")`.
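Whether a package can be loaded can also be checked without triggering the error. This sketch uses `Base.find_package`, an unexported helper that returns the path of a loadable package or `nothing`. The second package name is made up for illustration and is assumed not to be installed.

```julia
# Check whether `using SomeName` would succeed, without actually loading anything.
for name in ("Pkg", "SomePackageThatIsNotInstalled")
    path = Base.find_package(name)          # `nothing` if no environment knows the package
    if path === nothing
        println(name, ": not found, `using ", name, "` would fail")
    else
        println(name, ": loadable from ", path)
    end
end
```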
Tip: keep the startup environment minimal. Here you just want to list packages that you use while developing your code (e.g. `Revise.jl`).
Activating an environment¶
`Pkg.activate("/path/to/dir")` activates this directory as the current environment.
Equivalently, we can use
cd("/path/to/dir");
] activate .
The `.` always means "the current directory".
Note: The current directory (set using `cd`) and the active environment need not be the same.
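That the two can differ is easy to check. In this sketch a temporary directory stands in for `/path/to/dir` (an assumption, so that nothing on disk is touched permanently); activating it changes the environment but leaves the working directory alone.

```julia
using Pkg

env_dir = mktempdir()          # hypothetical stand-in for "/path/to/dir"
Pkg.activate(env_dir)          # changes the active environment only

println("Current directory:  ", pwd())
println("Active environment: ", dirname(Base.active_project()))
```

The activation lasts only for the current session; a fresh REPL starts in the startup environment again.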
Exercise: Create a directory for the code that you will write in this class. Make this the current environment.
Stacked environments¶
If you activate a different directory using `Pkg.activate("/path/to/dir")`, additional packages become available.
Note that the packages known from `v1.6` are not "forgotten."
The newly activated environment is placed on top of the startup environment, so the packages of both are on the list of loadable packages. Packages that were already loaded in the session also stay loaded.
This is known as stacked environments. They are stacked in the sense that activating another environment retains the packages of the startup environment and everything that is already loaded.
This is the reason why you want your `v1.6` environment to be minimal.
This can lead to great confusion. E.g.: You try to update to a new version of a package, but it does not work because the old one was already loaded in the previous environment.
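When this kind of confusion strikes, it helps to look at the stack and at what is already loaded. The sketch below uses `Base.load_path()` and the internal table `Base.loaded_modules` purely for inspection; the comments assume the default `LOAD_PATH`.

```julia
# Expanded environment stack: with the default LOAD_PATH, the active environment
# comes first, followed by the startup environment and the standard library.
stack = Base.load_path()
for (i, entry) in enumerate(stack)
    println(i, ": ", entry)
end

# Modules already loaded in this session stay loaded, whatever is activated later.
loaded = sort([id.name for id in keys(Base.loaded_modules)])
println("Loaded: ", join(loaded, ", "))
```

If an outdated package shows up in the second list, restarting the REPL is the reliable way to get rid of it.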
Adding packages¶
To add a registered package to the list of known dependencies, use `Pkg.add("Revise")`.
In response, Julia
- looks up `Revise` in the General Registry and adds an entry for `Revise` to `Project.toml` (for the current environment).
- downloads the code from the package's GitHub repo.
- copies the code into a hidden directory in `.julia/packages`.
  - Each version of `Revise` that you ever use gets stored there.
  - You rarely need to worry about where this code lives.
- precompiles `Revise` (right away or when it is first loaded, depending on the Julia version).
Now we can issue `using Revise`.
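If you are curious where the downloaded code ends up, the following sketch peeks into the package store of the first depot. It assumes the usual layout with one directory per package and one subdirectory per installed version, and it only reads the file system.

```julia
# Peek into the first depot's package store (typically ~/.julia/packages).
store = joinpath(first(DEPOT_PATH), "packages")
if isdir(store)
    names = filter(n -> isdir(joinpath(store, n)), readdir(store))
    for name in Iterators.take(names, 5)               # show at most five packages
        versions = readdir(joinpath(store, name))      # one subdirectory per installed version
        println(name, ": ", length(versions), " installed version(s)")
    end
else
    println("No package store at ", store, " yet")
end
```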
Exercise:
- After activating the `Econ890` environment, add the package `DataFrames`
- Check that `using DataFrames` works
- Check that `DataFrames` is listed as a dependency in `Project.toml`
- Type `] st -m` to see all the dependencies that you just added!
- Remove `DataFrames` by typing `] rm DataFrames`
- Check that `DataFrames` and all its dependencies have disappeared
- But note that `using DataFrames` still works (because it is already loaded)
- To really get rid of it, restart the REPL
Tip: After updating package info, it is usually a good idea to restart the REPL.
If a package is not registered, presumably because you wrote it yourself, using it gets a bit more complicated.
A good approach is to create your own local registry (using LocalRegistry.jl). Then your unregistered packages are treated like registered ones.
For starter purposes, the alternative is to `develop` your packages instead with `Pkg.develop(path="/path/to/MyPackage")`.
What does this do? It simply adds `MyPackage` to `Project.toml` and records, in the accompanying `Manifest.toml`, the directory where the code can be found.
There is one fundamental difference between `add` and `develop`.
- `add` fixes the version of the package until you manually `Pkg.update("MyPackage")`. Even if the developer changes the code, the version that you are using remains unchanged.
- `develop` tells Julia to track whatever code changes happen in the directory where the package code resides (the one you provide with the `develop` command).
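A minimal sketch of the `develop` workflow, run entirely in temporary directories so that it does not touch your real environments. `Pkg.generate` creates a skeleton package to stand in for your own code, and the name `MyPackage` is only a placeholder. It is assumed here that developing a package without dependencies needs no network access.

```julia
using Pkg

Pkg.activate(mktempdir())                        # fresh throwaway environment
pkg_path = joinpath(mktempdir(), "MyPackage")    # placeholder location for your package
Pkg.generate(pkg_path)                           # creates Project.toml and src/MyPackage.jl
Pkg.develop(path = pkg_path)                     # track the directory, not a fixed version

# The environment now resolves MyPackage to the code in pkg_path.
println(Base.find_package("MyPackage"))
```

Because the environment points at the directory itself, edits to `src/MyPackage.jl` are picked up the next time the package is loaded.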
Packages¶
So, what is a package?
It really is a special case of an environment that satisfies some additional criteria (e.g., a specific directory structure is present).
One expectation is that `src/MyPackage.jl` defines the module `MyPackage` plus types and functions.
To use a package, write `using MyPackage` and voila - all the types and functions exported by `MyPackage` are available in your code, and the packages that `MyPackage` requires are loaded along with it.
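To illustrate what such a module looks like, here is a sketch with the module defined inline rather than in `src/MyPackage.jl`. The functions `greet` and `helper` are made up for the example.

```julia
# In a real package this module would live in src/MyPackage.jl;
# it is defined inline here so the sketch is self-contained.
module MyPackage

export greet                      # exported: visible after `using`

greet(name) = "Hello, $name!"
helper() = "internal detail"      # not exported

end # module

using .MyPackage                  # leading dot: a module defined in this session, not an installed package

println(greet("world"))           # exported name, usable directly
println(MyPackage.helper())       # unexported name needs the module prefix
```

For an installed package the only difference is the missing dot: `using MyPackage`.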
Dependency Hell¶
So, you create an environment and add packages `A` and `B`.
Both `A` and `B` depend on `X` and `Y`, but `X` also depends on `Y`.
But when different people wrote `A`, `B`, and `X` they were using different versions of `Y` (and perhaps also of `X`).
How can the resulting code possibly run successfully? This problem is called "dependency hell".
The solution relies on meaningful version numbers together with explicit compatibility specifications.
Version numbers (at least for registered packages) have precise meaning.
- Minor version bumps (e.g. from `1.4` to `1.5`) are expected to be non-breaking. They can add features, but not change the existing API.
- This is called semantic versioning and it is the cornerstone of decentralized software development.
Each package's `Project.toml` contains a `[compat]` section that specifies which versions of each of its dependencies the package is compatible with.
The package manager's job is to combine the `Project.toml` files of all packages used (directly or indirectly) and to figure out a combination of version numbers that satisfies all compatibility requirements.
One might expect that this could never work in a project that uses dozens of packages, but, surprisingly, it generally works out just fine.
The result is that basically all Julia software heavily relies on packages written by others.
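To make the package manager's task concrete, here is a toy sketch for the example above, restricted to the single package `Y`. It is not the algorithm that `Pkg` uses: the list of available versions and the requirements of `A`, `B`, and `X` are invented, and the compatibility rule is simplified to "at least this version, same major version".

```julia
# Toy rule (versions >= 1.0 only): a requirement of 1.4 admits 1.4.0 up to, but excluding, 2.0.0.
caret_ok(v::VersionNumber, lower::VersionNumber) =
    v >= lower && v.major == lower.major

# Invented data: released versions of Y, and the lowest version of Y each package accepts.
available_Y  = [v"1.2.0", v"1.4.1", v"1.6.0", v"2.0.0"]
requirements = Dict("A" => v"1.2.0", "B" => v"1.4.0", "X" => v"1.3.0")

# Keep the versions that satisfy every requirement, then pick the newest one.
candidates = filter(v -> all(lower -> caret_ok(v, lower), values(requirements)), available_Y)
chosen = isempty(candidates) ? nothing : maximum(candidates)

println("Candidates: ", candidates)
println("Chosen:     ", chosen)       # 1.6.0 for the data above
```

The real resolver has to solve this for all packages at once, which is why it reports an "unsatisfiable requirements" error when no combination exists.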