// Copyright (c) The cargo-guppy Contributors
// SPDX-License-Identifier: MIT OR Apache-2.0

#![warn(missing_docs)]

//! Figure out what packages in a Rust workspace changed between two commits.
//!
//! A typical continuous integration system runs every build and every test on every pull request or
//! proposed change. In large monorepos, most proposed changes have no effect on most packages. A
//! *target determinator* decides, given a proposed change, which packages may have had changes
//! to them.
//!
//! The determinator is designed to be used in the
//! [Diem Core workspace](https://github.com/diem/diem), which is one such monorepo.
//!
//! # Examples
//!
//! ```rust
//! use determinator::{Determinator, rules::DeterminatorRules};
//! use guppy::{CargoMetadata, graph::DependencyDirection};
//! use std::path::Path;
//!
//! // guppy accepts `cargo metadata` JSON output. Use a pre-existing fixture for these examples.
//! let old_metadata = CargoMetadata::parse_json(include_str!("../../../fixtures/guppy/metadata_guppy_78cb7e8.json")).unwrap();
//! let old = old_metadata.build_graph().unwrap();
//! let new_metadata = CargoMetadata::parse_json(include_str!("../../../fixtures/guppy/metadata_guppy_869476c.json")).unwrap();
//! let new = new_metadata.build_graph().unwrap();
//!
//! let mut determinator = Determinator::new(&old, &new);
//!
//! // The determinator supports custom rules read from a TOML file.
//! let rules = DeterminatorRules::parse(include_str!("../../../fixtures/guppy/path-rules.toml")).unwrap();
//! determinator.set_rules(&rules).unwrap();
//!
//! // The determinator expects a list of changed files to be passed in.
//! determinator.add_changed_paths(vec!["guppy/src/lib.rs", "tools/determinator/README.md"]);
//!
//! let determinator_set = determinator.compute();
//! // determinator_set.affected_set contains the workspace packages directly or indirectly affected
//! // by the change.
//! for package in determinator_set.affected_set.packages(DependencyDirection::Forward) {
//!     println!("affected: {}", package.name());
//! }
//! ```
//!
//! # Platform support
//!
//! * **Unix platforms**: The determinator works and is supported.
//! * **Windows**: experimental support. There may still be bugs around path normalization: please
//!   [report them](https://github.com/guppy-rs/guppy/issues/new)!
//!
//! # How it works
//!
//! A Rust package can behave differently if one or more of the following change:
//! * The source code or `Cargo.toml` of the package.
//! * A dependency.
//! * The build or test environment.
//!
//! The determinator gathers data from several sources, and processes it through
//! [guppy](https://docs.rs/guppy), to figure out which packages need to be re-tested.
//!
//! ## File changes
//!
//! The determinator takes as input a list of file changes between two revisions. For each
//! file provided:
//! * The determinator looks for the package nearest to the file and marks it as changed.
//! * If the file is outside a package, the determinator assumes that everything needs to be
//!   rebuilt.
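//!
//! As a rough illustration of the first rule, the sketch below picks the deepest package
//! directory that contains a changed file. It is not the determinator's implementation: the
//! `nearest_package` function and its `package_dirs` input are hypothetical, and custom rules
//! are not taken into account.
//!
//! ```rust
//! use std::path::{Path, PathBuf};
//!
//! /// Returns the deepest package directory that contains `file`, if any.
//! ///
//! /// `package_dirs` holds workspace-relative package directories (hypothetical input).
//! fn nearest_package<'a>(file: &Path, package_dirs: &'a [PathBuf]) -> Option<&'a Path> {
//!     package_dirs
//!         .iter()
//!         .map(PathBuf::as_path)
//!         // `Path::starts_with` compares whole components, so a file under "guppy-cmdlib"
//!         // is not treated as being inside "guppy".
//!         .filter(|dir| file.starts_with(dir))
//!         // The deepest match is the one with the most path components.
//!         .max_by_key(|dir| dir.components().count())
//! }
//! ```
//!
//! A `None` result corresponds to the second rule: the file is outside every package, so
//! everything is assumed to need a rebuild.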
//!
//! The list of file changes can be obtained from a source control system such as Git. This crate
//! provides a helper which simplifies the process of enumerating file lists while handling some
//! gnarly edge cases. For more information, see the documentation for
//! [`Utf8Paths0`].
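//!
//! To give a feel for the kind of input involved, here is a minimal sketch that splits
//! NUL-separated output, assuming the file list comes from `git diff -z --name-only`. The
//! `split_nul_paths` function is a hypothetical stand-in written with the standard library only;
//! it is not the API of [`Utf8Paths0`], which should be preferred in real use.
//!
//! ```rust
//! use std::str::Utf8Error;
//!
//! /// Splits NUL-separated bytes into UTF-8 paths, failing on the first invalid path.
//! fn split_nul_paths(bytes: &[u8]) -> Result<Vec<&str>, Utf8Error> {
//!     bytes
//!         .split(|&b| b == 0)
//!         // Each path is NUL-terminated, so splitting leaves an empty chunk at the end.
//!         .filter(|chunk| !chunk.is_empty())
//!         .map(std::str::from_utf8)
//!         .collect()
//! }
//! ```
//!
//! Splitting on NUL rather than on newlines keeps paths that contain spaces or newlines intact.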
//!
//! These simple rules may need to be customized for particular scenarios (e.g. to ignore certain
//! files, or mark a package changed if a file outside of it changes). For those situations, the
//! determinator lets you specify *custom rules*. See the
//! [Customizing behavior](#customizing-behavior) section below for more.
//!
//! ## Dependency changes
//!
//! A dependency is assumed to have changed if one or more of the following change:
//!
//! * For a workspace dependency, its source code.
//! * For a third-party dependency, its version or feature set.
//! * Something in the environment that it depends on.
//!
//! The determinator runs Cargo build simulations on every package in the workspace. For each
//! package, the determinator figures out whether any of its dependencies (including feature sets)
//! have changed. These simulations are done with:
//! * dev-dependencies enabled (by default; this can be customized)
//! * both the host and target platforms set to the current platform (by default; this can be
//!   customized)
//! * three sets of features for each package:
//!   * no features enabled
//!   * default features
//!   * all features enabled
//!
//! If any of these simulated builds indicates that a workspace package has had any dependency
//! changes, then it is marked changed.
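//!
//! Setting features and platforms aside, the propagation of changes can be pictured as a
//! reachability computation over the dependency graph. The sketch below is a deliberately
//! simplified model: `affected_packages` and the shape of the `deps` map are hypothetical, and
//! the determinator itself relies on guppy's build simulations rather than on this loop.
//!
//! ```rust
//! use std::collections::{BTreeSet, HashMap};
//!
//! /// Returns every package that changed directly or depends, transitively, on a changed one.
//! ///
//! /// `deps` maps each package name to its direct dependencies (hypothetical input shape).
//! fn affected_packages<'a>(
//!     deps: &HashMap<&'a str, Vec<&'a str>>,
//!     changed: &[&'a str],
//! ) -> BTreeSet<&'a str> {
//!     let mut affected: BTreeSet<&'a str> = changed.iter().copied().collect();
//!     // Iterate to a fixed point: a package joins the set once any dependency is in it.
//!     loop {
//!         let newly_affected: Vec<&'a str> = deps
//!             .iter()
//!             .filter(|(pkg, pkg_deps)| {
//!                 !affected.contains(*pkg) && pkg_deps.iter().any(|dep| affected.contains(dep))
//!             })
//!             .map(|(pkg, _)| *pkg)
//!             .collect();
//!         if newly_affected.is_empty() {
//!             return affected;
//!         }
//!         affected.extend(newly_affected);
//!     }
//! }
//! ```
//!
//! In this model, a change to a package at the bottom of the graph marks everything that
//! depends on it, while unrelated packages stay out of the set.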
//!
//! ## Environment changes
//!
//! The *environment* of a build or test run is anything not part of the source code that may
//! influence it. This includes but is not limited to:
//!
//! * the version of the Rust compiler used
//! * system libraries that a crate depends on
//! * environment variables that a crate depends on
//! * external services that a test depends on
//!
//! **By default, the determinator assumes that the environment stays the same between runs.**
//!
//! To represent changes to the environment, you may need to find ways to represent those changes
//! as files checked into the repository, and add [custom rules](#customizing-behavior) for them.
//! For example:
//!
//! * Use a [`rust-toolchain` file](https://doc.rust-lang.org/edition-guide/rust-2018/rustup-for-managing-rust-versions.html#managing-versions)
//!   to represent the version of the Rust compiler. There is a default rule which causes a full
//!   run if `rust-toolchain` changes.
//! * Record all environment variables in CI configuration files, such as [GitHub Actions workflow
//!   files](https://docs.github.com/en/free-pro-team@latest/actions/reference/workflow-syntax-for-github-actions),
//!   and add a custom rule to do a full run if any of those files change.
//! * As far as possible, make tests hermetic and not reach out to the network. If you only have a
//!   few tests that make network calls, run them unconditionally.
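//!
//! The idea behind the first two suggestions can be sketched as a check on the list of changed
//! files. This is only an illustration with exact path matching: `needs_full_run` is a
//! hypothetical function, and in practice the same effect is achieved through custom rules
//! rather than hand-written code.
//!
//! ```rust
//! /// Returns true if any changed path is one of the files that stand in for the environment.
//! fn needs_full_run(changed_paths: &[&str], environment_files: &[&str]) -> bool {
//!     changed_paths
//!         .iter()
//!         .any(|path| environment_files.contains(path))
//! }
//! ```
//!
//! With `rust-toolchain` listed as an environment file, a change to it would trigger a full run
//! even if no package source changed.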
//!
//! # Customizing behavior
//!
//! The standard rules followed by the determinator may need to be tweaked in some situations:
//! * Some files should be ignored.
//! * If some files or packages change, a full test run may be necessary.
//! * *Virtual dependencies* that Cargo isn't aware of may need to be inserted.
//!
//! For these situations, the determinator allows for custom *rules* to be specified. The
//! determinator also ships with
//! [a default set of rules](crate::rules::DeterminatorRules::DEFAULT_RULES_TOML) for common files
//! like `.gitignore` and `rust-toolchain`.
//!
//! For more about custom rules, see the documentation for the [`rules` module](crate::rules).
//!
//! # Limitations
//!
//! While the determinator can bring significant benefits to CI and local workflows, its model is
//! quite different from Cargo's. **Please understand these limitations before using the
//! determinator for your project.**
//!
//! For best results, consider doing occasional full runs in addition to determinator-based runs.
//! You may wish to configure your CI system to use the determinator for pull requests, and also
//! schedule full runs every few hours on the main branch in case the determinator misses something.
//!
//! ## Build scripts and include/exclude instructions
//!
//! **The determinator cannot run [build scripts](https://doc.rust-lang.org/cargo/reference/build-scripts.html).**
//! The standard Cargo method for declaring a dependency on a file or environment variable is to
//! output `rerun-if-changed` or `rerun-if-env-changed` instructions in build scripts. These
//! instructions must be duplicated through custom rules.
//!
//! **The determinator doesn't track the [`include` and `exclude` fields in
//! `Cargo.toml`](https://doc.rust-lang.org/cargo/reference/manifest.html#the-exclude-and-include-fields).**
//! This is because the determinator's view of what's changed doesn't always align with these
//! fields. For example, packages typically include `README` files, but the determinator has a
//! default rule to ignore them.
//!
//! If a package includes a file outside of it, either move it into the package (recommended) or
//! add a custom rule for it. Exclusions may be duplicated as custom rules that cause those files
//! to be ignored.
//!
//! ## Path dependencies outside the workspace
//!
//! **The determinator may not be able to figure out changes to path dependencies outside the
//! workspace.** The determinator relies on metadata to figure out whether a non-workspace
//! dependency has changed. The metadata includes:
//! * the version number
//! * the source, such as `crates.io` or a revision in a Git repository
//!
//! This approach works for dependencies on `crates.io` or other package repositories, because a
//! change to their source code necessarily requires a version change.
//!
//! This approach also works for [Git
//! dependencies](https://doc.rust-lang.org/cargo/reference/specifying-dependencies.html#specifying-dependencies-from-git-repositories).
//! It even works for Git dependencies that aren't pinned to an exact revision in `Cargo.toml`,
//! because `Cargo.lock` records exact revisions. For example:
//!
//! ```toml
//! # Specifying this in Cargo.toml...
//! [dependencies]
//! rand = { git = "https://github.com/rust-random/rand", branch = "master" }
//!
//! # ...results in Cargo.lock with:
//! [[package]]
//! name = "rand"
//! version = "0.7.4"
//! source = "git+https://github.com/rust-random/rand?branch=master#50c34064c80762ddae11447adc6240f42a6bd266"
//! ```
//!
//! The hash at the end is the exact Git revision used, and a change to it is recognized by the
//! determinator.
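//!
//! To make the shape of that `source` string concrete, the sketch below pulls out the revision
//! that follows the `#`. The `git_revision` function is hypothetical and only illustrates the
//! format shown above; the determinator works from guppy's metadata and does not necessarily
//! parse the string this way.
//!
//! ```rust
//! /// Returns the revision after `#` in a `git+` source string, if there is one.
//! fn git_revision(source: &str) -> Option<&str> {
//!     // Registry sources such as `registry+https://...` carry no revision.
//!     if !source.starts_with("git+") {
//!         return None;
//!     }
//!     source.rsplit_once('#').map(|(_, revision)| revision)
//! }
//! ```
//!
//! Two lockfiles whose `rand` entries yield different revisions describe different source code,
//! even though `Cargo.toml` and the version number are unchanged.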
//!
//! Where this scheme may not work is with path dependencies, because the files on disk can change
//! without a version bump. `cargo build` can recognize those changes because it compares mtimes of
//! files on disk, but the determinator cannot do that.
//!
//! This is not expected to be a problem for most projects that use workspaces. If
//! there's future demand, it would be possible to add support for changes to non-workspace path
//! dependencies if they're in the same repository.
//!
//! # Alternatives and tradeoffs
//!
//! One way to look at the determinator is as a kind of
//! [*cache invalidation*](https://martinfowler.com/bliki/TwoHardThings.html). Viewed through this
//! lens, the main purpose of a build or test system is to cache results, and invalidate those
//! caches based on certain parameters. When the determinator marks a package as changed, it
//! invalidates any cached results for that package.
//!
//! There are several other ways to design caching systems:
//! * The caching built into Cargo and other systems like GNU Make, which is based on file
//!   modification times.
//! * [Mozilla's `sccache`](https://github.com/mozilla/sccache) and other "bottom-up" hash-based
//!   caching build systems.
//! * [Bazel](https://bazel.build/), [Buck](https://buck.build/) and other "top-down" hash-based
//!   caching build systems.
//!
//! These other systems end up making different tradeoffs:
//! * Cargo can use build scripts to track file and environment changes over time. However, it
//!   relies on a previous build being done on the same machine. Also, as of Rust 1.48, there is no
//!   way to use Cargo caching for test results, only for builds.
//! * `sccache` [requires paths to be exact across
//!   machines](https://github.com/mozilla/sccache#known-caveats), and is unable to cache [some
//!   kinds of Rust artifacts](https://github.com/mozilla/sccache/blob/master/docs/Rust.md). Also,
//!   just like Cargo's caching, there is no way to use it for test results, only for builds.
//! * Bazel and Buck have stringent requirements around the environment not affecting build results.
//!   They're also not seamlessly integrated with Cargo.
//! * The determinator works for both builds and tests, but cannot track file and environment
//!   changes over time and must rely on custom rules. This scheme may produce both false negatives
//!   and false positives.
//!
//! While the determinator is geared towards test runs, it also works for builds. If you wish to
//! use the determinator for build runs, consider stacking it with another layer of caching:
//! * Use the determinator as a first pass to filter out packages that haven't changed.
//! * Then use a system like `sccache` to get hash-based caching for builds.
//!
//! # Inspirations
//!
//! This determinator is inspired by, and shares its name with, the target determinator used in
//! Facebook's main source repository.

mod determinator;
pub mod errors;
mod paths0;
pub mod rules;

pub use crate::{determinator::*, paths0::*};