See a typo? Have a suggestion? Edit this page on Github
This blog post covers an issue with how packages are shipped with GHC, and an edge case of how this can negatively interact with Stack. The overall point about the package contents mismatch will apply more broadly, but I'm mostly focused in this post on how to better handle this in Stack.
Intro
GHC ships with a few wired in packages which cannot be reinstalled. Examples are ghc
, base
, and template-haskell
. The situation with these for all build tools is simple: the GHC version determines the package version.
There are also packages which are not shipped with GHC at all. All build tools get to determine whatever version of these packages they want, assuming compatibility with the GHC version and the packages that are wired in with it.
There's an in-between category: packages like process
and transformers
are shipped with GHC, but can also be easily reinstalled from Hackage. Some tools may decide to eagerly recompile the latest version of them. Stackage takes a different approach: in order to avoid invalidating the ghc
package, it avoids recompiling any of these packages, and sticks by default with the global version. Stack is designed to work with that.
When recompilation happens
Suppose you have the following stack.yaml
file:
resolver: ghc-8.6.3
packages:
- .
And suppose your local package depends on the unix
package. On non-Windows systems, unix
ships with GHC. Running stack build
will therefore result in only building the local package, not unix
, since unix
is in the global package database.
Now let's suppose that, instead of using a ghc-8.6.3
resolver, you use an LTS Haskell resolver:
resolver: lts-13.4
packages:
- .
Same thing: LTS Haskell snapshots never specify versions of global packages. Instead, they are implicitly inherited from the GHC global package database. In a few specific cases (like the stack init
command), we want to avoid requiring that GHC is already installed, so we instead use a global hints file to specify the versions of packages which ship with GHC.
The unix
package depends on the time
package. Let's suppose your stack.yaml
file instead said:
resolver: lts-13.4
packages:
- .
extra-deps:
- time-1.9.2
And suppose your local package depends on both unix
and time
. If you run stack build
, Stack has three basic options:
- Only recompile
time
, but keep the originalunix
. That's a bad idea: you'll now have two different versions of the datatypes in thetime
package floating around. People may remember extremely confusing error messages about "cannot matchByteString
withByteString
" or similar. It comes from this kind of a build plan. - Fail to compile at all. The
unix
in the global database is "invalidated" by the presence of a newly compiledtime
, and your package depends onunix
. Just give up. - Automatically add an
extra-dep
version ofunix
to the build plan, and recompile bothtime
andunix
.
(1) is a really bad idea, and (2) is pretty annoying. So today, Stack follows (3), and automatically/implicitly adds the invalidated global packages to the build plan.
Where do global packages come from?
Look at the following simplified output of a command describing the global process
package shipped with GHC 8.6.3:
$ stack --resolver ghc-8.6.3 exec -- ghc-pkg describe process
name: process
version: 1.6.3.0
...
depends:
base-4.12.0.0 deepseq-1.4.4.0 directory-1.3.3.0 filepath-1.4.2.1
unix-2.7.2.2
Important bits:
- We're using
process-1.6.3.0
- We're depending on
base-4.12.0.0
Now check out the process.cabal
file for process-1.6.3.0
on Hackage and on Github. You'll notice the following bound:
build-depends: base >= 4.4 && < 4.12,
These pieces of information are contradictory: the cabal file says base < 4.12
, but GHC ships with this version of process
and base-4.12.0.0
! How is this possible?
Well, it turns out, when GHC says it has process-1.6.3.0
, it's not code that's downloaded from Hackage. Instead, it's established via a submodule. Specifically, you can see on Github that GHC 8.6.3 is using process
at Git commit 36a3ad5. By contrast, the v1.6.3.0
tag points to Git commit 7c0b5814.
This isn't a recent revelation, and not only my revelation. I peripherally knew about this, but I've heard of three different people discovering this problem in the past month. I'm writing this blog post to point out how it negatively affects Stack, and what to do about it.
How it breaks Stack
process
depends on the directory
package. Let's use an extra-dep to provide a new version of directory
, and then try to build:
$ cat stack.yaml
resolver: ghc-8.6.3
packages: []
extra-deps:
- directory-1.3.3.1
$ stack build process
Error: While constructing the build plan, the following exceptions were encountered:
In the dependencies for process-1.6.3.0:
base-4.12.0.0 from stack configuration does not match >=4.4 && <4.12 (latest matching version is 4.11.1.0)
needed since process is a build target.
Well that's darn confusing. I changed a version of directory
, and suddenly process
has a bounds error?!? This makes perfect sense given the above (though it's certainly unsatisfying):
- GHC ships with a different version of
process-1.6.3.0
process
needs to be recompiled when we change thedirectory
package- Stack assumes that it can simply use
process-1.6.3.0
to replace the globalprocess
package - However, the
process
on Hackage differs slightly from the one shipped with GHC
This is a relatively simple problem which results in an annoying bounds error, and can be easily fixed by adding a new process
extra-dep. However, other problems are much more severe:
- What if GHC is compiling one of these packages with a special cabal flag? We won't be able to reproduce that. In fact, there are comments in the Stack code base about this already.
- What if the code on Hackage is fundamentally different from what's shipped with GHC? We could get silent but significant behavior changes.
Solution: make no assumptions
The core problem here is that Stack assumes process-1.6.3.0
provided by GHC is the same as what's on Hackage. That assumption plays itself out in our selection of option (3) above. I propose: get rid of the assumption! Instead, move to option (2): if we replace a global package, prune all global packages that depend on it, and require adding those as explicit extra-dep
entries.
This will break some existing build plans, but I'm proposing this change for Stack 2.0, which is already allowing some slight breakage along those lines. This will add some annoyance to users, but with the result of more clarity on what's happening in a build plan. And perhaps we can even make an error message for these cases which points right at this post!