- Storing the paths of all object files for a program’s libraries, which can then be passed to a linker action through a provider.
- For an interpreted language, storing the transitive source files that are included in an executable’s runfiles.
Description and operations
Conceptually, a depset is a directed acyclic graph (DAG) that typically looks similar to the target graph. It is constructed from the leaves up to the root. Each target in a dependency chain can add its own contents on top of the previous without having to read or copy them. Each node in the DAG holds a list of direct elements and a list of child nodes. The contents of the depset are the transitive elements, such as the direct elements of all the nodes. A new depset can be created using the depset constructor: it accepts a list of direct elements and another list of child nodes.Order
Theto_list operation performs a traversal over the DAG. The kind of traversal
depends on the order that was specified at the time the depset was
constructed. It is useful for Bazel to support multiple orders because sometimes
tools care about the order of their inputs. For example, a linker action may
need to ensure that if B depends on A, then A.o comes before B.o on the
linker’s command line. Other tools might have the opposite requirement.
Three traversal orders are supported: postorder, preorder, and
topological. The first two work exactly like tree
traversals
except that they operate on DAGs and skip already visited nodes. The third order
works as a topological sort from root to leaves, essentially the same as
preorder except that shared children are listed only after all of their parents.
Preorder and postorder operate as left-to-right traversals, but note that within
each node direct elements have no order relative to children. For topological
order, there is no left-to-right guarantee, and even the
all-parents-before-child guarantee does not apply in the case that there are
duplicate elements in different nodes of the DAG.
order keyword argument. If this
argument is omitted, the depset has the special default order, in which case
there are no guarantees about the order of any of its elements (except that it
is deterministic).
Full example
This example is available at https://github.com/bazelbuild/examples/tree/main/rules/depsets. Suppose there is a hypothetical interpreted language Foo. In order to build eachfoo_binary you need to know all the *.foo files that it directly or
indirectly depends on.
d are all of the *.foo files in
the srcs fields of a, b, c, and d. In order for the foo_binary
target to know about any file besides d.foo, the foo_library targets need to
pass them along in a provider. Each library receives the providers from its own
dependencies, adds its own immediate sources, and passes on a new provider with
the augmented contents. The foo_binary rule does the same, except that instead
of returning a provider, it uses the complete list of sources to construct a
command line for an action.
Here’s a complete implementation of the foo_library and foo_binary rules.
*.foo files with dummy content, and
building the d target.
Performance
To see the motivation for using depsets, consider what would happen ifget_transitive_srcs() collected its sources in a list.
a
will appear twice on the command line and twice in the contents of the output
file.
An alternative is using a general set, which can be simulated by a
dictionary where the keys are the elements and all the keys map to True.
to_list()
at the end in a binary rule is fine, since the overall cost is just O(n). It’s
when many non-terminal targets try to call to_list() that quadratic behavior
occurs.
For more information about using depsets efficiently, see the performance page.