PrefixScan

The PrefixScan class implements the prefix scan (prefix sum) algorithm for parallel computing over buffers of uint32_t elements. It scans either a single array or multiple array regions per dispatch, and every scan is performed in place. The modes and flags passed at creation select which variants are enabled: single-array scans, multiple-array scans, and indirect scans whose parameters are read from a dispatch buffer. The maximum number of regions for multiple-array scans is set at creation time.

#include <parallel/TellusimPrefixScan.h>

Constructors

PrefixScan()

Methods

Clear scan.

void clear()

Check scan.

bool isCreated(Flags flags) const

Scan parameters.

uint32_t getGroupSize() const
uint32_t getScanElements() const
uint32_t getMaxElements() const
uint32_t getMaxRegions() const

Create prefix scan.

bool create(const Device &device, Mode mode, uint32_t groups = 256, uint32_t regions = 1, Async *async = nullptr)
bool create(const Device &device, Flags flags, uint32_t groups = 256, uint32_t regions = 1, Async *async = nullptr)
Type      Name     Description
uint32_t  groups   Prefix scan group size.
uint32_t  regions  Maximum number of regions for multiple scans.

Dispatch single in-place prefix scan.

bool dispatch(Compute &compute, Buffer &data, uint32_t offset, uint32_t size)
Type      Name    Description
Buffer    data    Buffer of uint32_t elements to scan.
uint32_t  offset  Elements offset index (4 aligned).
uint32_t  size    Number of uint32_t elements to scan.
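As a way to check results, a CPU reference for a single in-place scan over a region can be sketched as follows. This is not the library's implementation. The reference above does not state whether the scan is inclusive or exclusive, so the sketch covers both; `ScanKind` and `scan_region` are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical selector: the documentation does not say which variant the scan produces.
enum class ScanKind { Inclusive, Exclusive };

// CPU reference: in-place prefix sum over data[offset, offset + size).
// Elements outside the region are left untouched. Sums wrap modulo 2^32.
inline void scan_region(std::vector<uint32_t> &data, uint32_t offset, uint32_t size, ScanKind kind) {
    assert(static_cast<std::size_t>(offset) + size <= data.size());
    uint32_t sum = 0;
    for (uint32_t i = 0; i < size; i++) {
        const uint32_t value = data[offset + i];
        if (kind == ScanKind::Inclusive) {
            sum += value;            // element i includes its own value
            data[offset + i] = sum;
        } else {
            data[offset + i] = sum;  // element i holds the sum of the elements before it
            sum += value;
        }
    }
}
```

Comparing a read-back buffer against both variants shows which one a given build produces.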

Dispatch multiple in-place prefix scans.

bool dispatch(Compute &compute, Buffer &data, uint32_t count, const uint32_t *offsets, const uint32_t *sizes, Flags flags = FlagNone)
Type              Name     Description
Buffer            data     Buffer of uint32_t elements to scan.
uint32_t          count    Number of regions to scan.
const uint32_t *  offsets  Elements offset index of each region (4 aligned).
const uint32_t *  sizes    Number of uint32_t elements to scan in each region.
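Because each region offset must be 4 aligned, regions of arbitrary length cannot simply be packed back to back. A minimal sketch of one way to build the offsets and sizes arrays on the CPU is shown below. `RegionLayout` and `pack_regions` are hypothetical helpers, not part of the SDK, and rounding each region up to a multiple of 4 elements is only one possible layout.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper type: parallel arrays shaped like the offsets/sizes arguments.
struct RegionLayout {
    std::vector<uint32_t> offsets;  // element offset of each region, always a multiple of 4
    std::vector<uint32_t> sizes;    // element count of each region
    uint32_t total = 0;             // number of elements the data buffer must hold
};

// Packs regions one after another, padding each region to a multiple of 4 elements
// so that the next region starts on a 4-aligned offset.
inline RegionLayout pack_regions(const std::vector<uint32_t> &lengths) {
    RegionLayout layout;
    uint32_t cursor = 0;
    for (uint32_t length : lengths) {
        assert(cursor % 4u == 0u);
        layout.offsets.push_back(cursor);
        layout.sizes.push_back(length);
        cursor += (length + 3u) & ~3u;  // round up to the next multiple of 4
    }
    layout.total = cursor;
    return layout;
}
```

The resulting `offsets.data()`, `sizes.data()`, and the region count are shaped to match the `offsets`, `sizes`, and `count` parameters listed above.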

Dispatch single in-place indirect prefix scan.

bool dispatchIndirect(Compute &compute, Buffer &data, Buffer &dispatch, uint32_t offset, Flags flags = FlagNone, uint32_t max_size = Maxu32)
Type      Name      Description
Buffer    data      Buffer of uint32_t elements to scan.
Buffer    dispatch  Dispatch indirect buffer.
uint32_t  offset    Dispatch indirect buffer offset.
uint32_t  max_size  Maximum number of elements to scan.

Dispatch multiple in-place indirect prefix scans.

bool dispatchIndirect(Compute &compute, Buffer &data, uint32_t count, Buffer &dispatch, uint32_t offset, Flags flags = FlagNone, uint32_t max_size = Maxu32)
Type      Name      Description
Buffer    data      Buffer of uint32_t elements to scan.
uint32_t  count     Number of regions to scan.
Buffer    dispatch  Dispatch indirect buffer.
uint32_t  offset    Dispatch indirect buffer offset.
uint32_t  max_size  Maximum number of elements to scan.

Dispatch multiple in-place indirect prefix scans with an indirect region count.

bool dispatchIndirect(Compute &compute, Buffer &data, Buffer &count, Buffer &dispatch, uint32_t count_offset, uint32_t dispatch_offset, Flags flags = FlagNone, uint32_t max_size = Maxu32)
Type      Name             Description
Buffer    data             Buffer of uint32_t elements to scan.
Buffer    count            Count indirect buffer.
Buffer    dispatch         Dispatch indirect buffer.
uint32_t  count_offset     Count indirect buffer offset.
uint32_t  dispatch_offset  Dispatch indirect buffer offset.
uint32_t  max_size         Maximum number of elements to scan.

Enums

Mode

Scan modes.

Name          Value  Description
ModeSingle    0      Single array scan.
ModeMultiple  1      Multiple arrays scan.
NumModes      2

Flags

Scan flags.

Name          Value                                      Description
FlagNone      0
FlagSingle    (1 << ModeSingle)                          Enable Single array prefix mode.
FlagMultiple  (1 << ModeMultiple)                        Enable Multi array prefix mode.
FlagIndirect  (1 << (NumModes + 0))                      Enable Dispatch Indirect mode.
FlagRepeat    (1 << (NumModes + 1))                      Repeat Dispatch Indirect prefix scan with the same parameters.
FlagsAll      (FlagSingle | FlagMultiple | FlagIndirect)
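The flag values are defined in terms of the Mode enum, so the concrete bit values are easy to misread. The sketch below mirrors the two enums locally and evaluates them at compile time. The `sketch` namespace and the `has_flag` helper are introduced here for illustration only; real code should use the definitions from the SDK header.

```cpp
#include <cstdint>

namespace sketch {

// Local mirror of the documented Mode values.
enum Mode : uint32_t { ModeSingle = 0, ModeMultiple, NumModes };

// Local mirror of the documented Flags expressions.
enum Flags : uint32_t {
    FlagNone = 0,
    FlagSingle = (1u << ModeSingle),
    FlagMultiple = (1u << ModeMultiple),
    FlagIndirect = (1u << (NumModes + 0)),
    FlagRepeat = (1u << (NumModes + 1)),
    FlagsAll = (FlagSingle | FlagMultiple | FlagIndirect),
};

// Hypothetical helper: true when any bit of `flag` is set in `flags`.
constexpr bool has_flag(uint32_t flags, Flags flag) { return (flags & flag) != 0u; }

static_assert(FlagSingle == 1u, "bit 0");
static_assert(FlagMultiple == 2u, "bit 1");
static_assert(FlagIndirect == 4u, "bit 2");
static_assert(FlagRepeat == 8u, "bit 3");
static_assert(FlagsAll == 7u, "Single | Multiple | Indirect");
static_assert(!has_flag(FlagsAll, FlagRepeat), "FlagRepeat is not part of FlagsAll");

}  // namespace sketch
```

Note that FlagsAll, as defined in the table, does not include FlagRepeat.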

Structs

DispatchParameters

Dispatch parameters.

Variables

Type      Name       Description
uint32_t  offset     Elements offset index (4 aligned).
uint32_t  size       Number of elements to scan.
uint32_t  padding_0
uint32_t  padding_1
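The four uint32_t members give the structure a size of 16 bytes. The sketch below mirrors that layout locally and fills an array of records on the CPU. It assumes, without confirmation from this page, that the dispatch indirect buffer holds one such record per region; `sketch::DispatchParameters` and `make_parameters` are illustrative stand-ins, not SDK names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

namespace sketch {

// Local mirror of the documented member list.
struct DispatchParameters {
    uint32_t offset;     // elements offset index (4 aligned)
    uint32_t size;       // number of elements to scan
    uint32_t padding_0;
    uint32_t padding_1;
};

static_assert(sizeof(DispatchParameters) == 16, "four uint32_t members");
static_assert(offsetof(DispatchParameters, size) == 4, "size follows offset");

// Hypothetical helper: builds one record per region with zeroed padding.
inline std::vector<DispatchParameters> make_parameters(const std::vector<uint32_t> &offsets,
                                                       const std::vector<uint32_t> &sizes) {
    assert(offsets.size() == sizes.size());
    std::vector<DispatchParameters> parameters;
    parameters.reserve(offsets.size());
    for (std::size_t i = 0; i < offsets.size(); i++) {
        assert(offsets[i] % 4u == 0u);  // documented alignment requirement
        parameters.push_back(DispatchParameters{offsets[i], sizes[i], 0u, 0u});
    }
    return parameters;
}

}  // namespace sketch
```

Whether the dispatch buffer offsets taken by dispatchIndirect are expressed in bytes or in records is not stated here, so that detail should be checked against the SDK before relying on this layout.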