Passes

General StreamTensor Passes

-streamtensor-comprehensive-bufferize

Comprehensively bufferize the program

-streamtensor-fuse-linalg-fill

Fuse linalg fill op into generic op

-streamtensor-linalg-fake-quantize

Convert to a quantized model (for testing use only)

Options

-quan-bits       : Number of bits for quantization of non-constant values
-const-quan-bits : Number of bits for quantization of constant values
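
When invoked from an MLIR opt-style driver, pass options go in a quoted, space-separated string, with the option names written without their leading dash. A minimal sketch, assuming the project ships a driver named streamtensor-opt and using a placeholder input file model.mlir:

    streamtensor-opt model.mlir \
      --streamtensor-linalg-fake-quantize="quan-bits=8 const-quan-bits=8" \
      -o quantized.mlir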

-streamtensor-raise-scf-to-affine

Raise SCF to affine

-streamtensor-strip-annotations

Strip all annotations with the given name

Options

-annotation-name : Name of the annotations to strip

-streamtensor-transform-interpreter

Interpret the transform sequence with the given entry point

Options

-debug-payload-root-tag   : Select the operation with 'transform.target_tag' attribute having the given value as payload IR root. If empty select the pass anchor operation as the payload IR root.
-disable-expensive-checks : Disable expensive checks in the interpreter for a faster run.
-entry-point              : Entry point of the pass pipeline.
-delete-entry-point       : Delete the entry point after transformation.
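
A sketch of the option syntax, under the same driver assumption as above; the entry-point symbol __transform_main and the file names are placeholders for whatever the transform script actually defines:

    streamtensor-opt schedule.mlir \
      --streamtensor-transform-interpreter="entry-point=__transform_main delete-entry-point=true" \
      -o scheduled.mlir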

'tensor_ext' Dialect Passes

-streamtensor-convert-tensor-empty-to-instance

Convert tensor empty op to instance op

-streamtensor-convert-tensor-primitive-to-linalg

Convert tensor primitive op to linalg op

-streamtensor-decompose-tensor-ops

Decompose tensor ops

-streamtensor-lower-pack-unpack

Lower tensor pack/unpack ops

Options

-constant-folding : Whether to apply constant folding
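
Passes given as separate flags run in command-line order, so the decomposition and pack/unpack lowering steps above can be chained in one invocation. A sketch under the same assumptions as the earlier examples; the ordering is illustrative, not a mandated pipeline:

    streamtensor-opt packed.mlir \
      --streamtensor-decompose-tensor-ops \
      --streamtensor-lower-pack-unpack="constant-folding=true" \
      -o lowered.mlir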

-streamtensor-lower-quantize-ops

Lower quantize ops to normal arithmetic ops

-streamtensor-raise-extract-slice-to-chunk

Raise extract_slice to chunk if possible

'dataflow' Dialect Passes

-streamtensor-convert-itensor-empty-to-instance

Convert itensor empty op to instance op

-streamtensor-convert-tensor-to-kernel

Convert tensor ops to kernel ops

-streamtensor-ensure-itensor-single-use

Ensure each itensor has a single use

-streamtensor-fold-itensor

Fold trivial itensor instances

-streamtensor-lower-itensor-to-stream

Lower itensor operations to stream operations

-streamtensor-materialize-kernel

Materialize kernel ops to task ops

-streamtensor-merge-data-driven-task

Merge data-driven tasks

-streamtensor-merge-task-partition

Merge a partition of tasks into a single task

-streamtensor-pack-kernel-interface

Pack/unpack the interface of kernel ops

-streamtensor-reduce-stream-depth

Reduce the depth of stream channels

Options

-max-depth : Maximum depth of stream channels
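
A sketch chaining the stream lowering and depth reduction passes above, with a hypothetical depth bound; same driver assumption as the earlier examples, and the ordering is illustrative only:

    streamtensor-opt dataflow.mlir \
      --streamtensor-lower-itensor-to-stream \
      --streamtensor-reduce-stream-depth="max-depth=2" \
      -o streams.mlir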

-streamtensor-simplify-task-structure

Simplify task structure

-streamtensor-vectorize-itensor

Vectorize itensor elements

Options

-max-vectorize-bitwidth : Maximum bitwidth of vectorization

-streamtensor-widen-kernel-interface

Widen/unwiden the interface of kernel ops

Options

-max-widen-bitwidth : Maximum bitwidth of widening
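
Both interface-tuning passes above take a cap in bits. A sketch with hypothetical 512-bit bounds, under the same driver assumption as the earlier examples:

    streamtensor-opt kernels.mlir \
      --streamtensor-vectorize-itensor="max-vectorize-bitwidth=512" \
      --streamtensor-widen-kernel-interface="max-widen-bitwidth=512" \
      -o widened.mlir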

'runtime' Dialect Passes

-streamtensor-convert-memref-to-pointer

Convert memref views to pointer passing for C++ emission

-streamtensor-generate-runtime-host-func

Generate runtime host function

Options

-host-func-name : Name of the host function
-timeout-in-ms  : Timeout in ms when waiting for the kernel run to complete
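
A sketch with a placeholder host function name and a 10-second timeout, under the same driver assumption as the earlier examples:

    streamtensor-opt runtime.mlir \
      --streamtensor-generate-runtime-host-func="host-func-name=forward timeout-in-ms=10000" \
      -o host.mlir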

'hls' Dialect Passes

-streamtensor-allocate-memory

Allocate memory kind and layout for tensors and itensors

Options

-num-uram           : Number of URAMs available on chip
-num-bram           : Number of BRAMs available on chip
-size-lutram-in-bit : The size of LUTRAMs in bits
-platform-num-bram  : Number of BRAMs used by the static platform
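
The options describe the on-chip memory budget of the target device. A sketch with placeholder budget values (not taken from any real platform), under the same driver assumption as the earlier examples:

    streamtensor-opt design.mlir \
      --streamtensor-allocate-memory="num-uram=64 num-bram=512 size-lutram-in-bit=1000000 platform-num-bram=100" \
      -o allocated.mlir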

-streamtensor-convert-dataflow-to-func

Convert structural dataflow to function for C++ emission

Options

-create-kernel-wrapper : Whether to create kernel wrapper function

-streamtensor-generate-connectivity

Generate connectivity ops from kernel tasks

Options

-num-hbm-port : Number of system ports for HBM access
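
A sketch with a placeholder HBM port count, under the same driver assumption as the earlier examples:

    streamtensor-opt design.mlir \
      --streamtensor-generate-connectivity="num-hbm-port=16" \
      -o connected.mlir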

-streamtensor-generate-directive

Generate HLS directives

-streamtensor-materialize-directive

Materialize HLS directives