Determinism and reproducibility of results from GPU algorithms are increasingly important to many use cases.
There are several types of determinism guarantees one might care about. CCCL defines the following vocabulary for identifying and describing different kinds of determinism:
- GPU-to-GPU - results will always be bitwise identical for identical invocations, no matter the GPU
- Run-to-run - results will be identical for identical invocations on the same GPU, but may differ from one GPU to the next
- Not Guaranteed - results may differ between identical invocations
Floating-point arithmetic is not associative, and parallel execution does not guarantee an order of operations. Together, these generally mean that two identical invocations of an algorithm on identical floating-point inputs may produce different results.
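As a rough illustration of the effect described above, the following host-only C++ sketch sums the same three `float` values in two different orders. The helper names (`sum_in_order`, `forward_sum`, `shuffled_sum`) are hypothetical and are not part of CCCL; the explicit index order only stands in for whatever order a parallel reduction happens to pick at run time.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not a CCCL API): accumulates `values` in the sequence
// given by `order`, which is expected to be a permutation of the indices.
float sum_in_order(const std::vector<float>& values,
                   const std::vector<std::size_t>& order)
{
  assert(values.size() == order.size());
  float total = 0.0f;
  for (std::size_t i : order)
  {
    total += values[i];
  }
  return total;
}

// The same three inputs are used for both accumulation orders.
const std::vector<float> inputs = {1e20f, -1e20f, 1.0f};

// (1e20 + -1e20) + 1: the large terms cancel first, so the 1 survives.
float forward_sum()
{
  return sum_in_order(inputs, {0, 1, 2}); // yields 1.0f
}

// (-1e20 + 1) + 1e20: the 1 is absorbed by rounding before the cancellation.
float shuffled_sum()
{
  return sum_in_order(inputs, {1, 2, 0}); // yields 0.0f
}
```

Assuming IEEE 754 single precision and no value-changing compiler options such as fast-math, the two functions return different results for identical inputs. A deterministic reduction has to remove this dependence on accumulation order.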
In #1558 we introduced our first major new deterministic algorithm, `cub::DeviceReduce`, along with the new `cuda::execution::determinism::gpu_to_gpu` guarantees API for specifying the determinism behavior desired from an algorithm.
This issue is meant to track work related to adding new features/algorithms to CCCL aimed at improving determinism and reproducibility guarantees.
This issue is also meant to collect input from our users on which use cases for determinism are important to you, to help us prioritize what we build next. If there is a feature you would like to see, please leave a comment below telling us as much as you can about it. What algorithm/feature for determinism do you need? What kind of determinism do you require?
See also: