'transforms' Submodule
Introduction
from streamtensor import transforms
Transformations for the StreamTensor compiler.
This module provides the transformations used by the StreamTensor compiler to optimize an input MLIR module for spatial accelerators. Every transformation modifies the module in place.
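Because the module is transformed in place, a compilation flow amounts to calling the functions one after another on the same module object. The following is a minimal sketch of that calling pattern only. `run_in_place` and the two stand-in transforms are hypothetical helpers, not part of `streamtensor`, and nothing here implies the order in which the real transformations should run.

```python
from typing import Any, Callable, Dict, List, Tuple

# A step is a transform callable plus the keyword arguments passed after the module.
Step = Tuple[Callable[..., Any], Dict[str, Any]]


def run_in_place(module: Any, steps: List[Step]) -> List[str]:
    """Call each step on the same module object and return the names applied."""
    applied = []
    for transform, kwargs in steps:
        # The return value is ignored: the module object itself is mutated.
        transform(module, **kwargs)
        applied.append(transform.__name__)
    return applied


# Stand-in transforms (assumptions) so the sketch runs without streamtensor.
def fake_optimize(module, constant_folding=False):
    module.append(("optimize", constant_folding))


def fake_bufferize(module, task_chain_merging=False):
    module.append(("bufferize", task_chain_merging))


module = []  # hypothetical stand-in for an MLIR Module
applied = run_in_place(
    module,
    [(fake_optimize, {"constant_folding": True}), (fake_bufferize, {})],
)
```

With the real library, a step would look like `(transforms.apply_bufferization, {"task_chain_merging": True})`, using the signatures documented below.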
apply_bufferization(module, task_chain_merging=False)
Apply bufferization.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
module | Module | The module to be transformed. | required |
task_chain_merging | bool | Whether to merge task chains. | False |
apply_dataflow_optimization_transforms(module, max_widen_bitwidth=512, max_vectorize_bitwidth=4096, constant_folding=False)
Apply dataflow optimization transforms.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
module | Module | The module to be transformed. | required |
max_widen_bitwidth | int | The maximum bitwidth to widen. | 512 |
max_vectorize_bitwidth | int | The maximum bitwidth to vectorize. | 4096 |
constant_folding | bool | Whether to perform constant folding. | False |
apply_greedy_kernel_fusion(module, entry_name, max_fusion_cost, dot_file=None, fused_dot_file=None, delete_sequence=True)
Apply greedy kernel fusion.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
module | Module | The module to be transformed. | required |
entry_name | str | The entry function name. | required |
max_fusion_cost | int | The maximum fusion cost. | required |
dot_file | Optional[str] | The DOT file to print the kernel graph. | None |
fused_dot_file | Optional[str] | The DOT file to print the post-fusion kernel graph. | None |
delete_sequence | bool | Whether to delete the transform sequence after transform. | True |
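The two DOT parameters are optional, so it can be convenient to switch both graph dumps on or off together. The sketch below builds the keyword arguments for `apply_greedy_kernel_fusion` under that assumption. `kernel_fusion_kwargs`, the file names, the entry name `"forward"`, and the cost value are all hypothetical, and the unit of `max_fusion_cost` is not specified in this reference.

```python
from pathlib import Path
from typing import Any, Dict, Optional


def kernel_fusion_kwargs(
    entry_name: str, max_fusion_cost: int, debug_dir: Optional[str] = None
) -> Dict[str, Any]:
    """Build keyword arguments for apply_greedy_kernel_fusion (module excluded)."""
    kwargs: Dict[str, Any] = {
        "entry_name": entry_name,
        "max_fusion_cost": max_fusion_cost,
        "dot_file": None,        # documented default: no pre-fusion graph dump
        "fused_dot_file": None,  # documented default: no post-fusion graph dump
        "delete_sequence": True,
    }
    if debug_dir is not None:
        out = Path(debug_dir)
        # File names are illustrative choices, not names the library requires.
        kwargs["dot_file"] = str(out / "kernels.dot")
        kwargs["fused_dot_file"] = str(out / "kernels_fused.dot")
    return kwargs


quiet = kernel_fusion_kwargs("forward", 1_000_000)
debug = kernel_fusion_kwargs("forward", 1_000_000, debug_dir="dumps")
```

The result would be passed as `transforms.apply_greedy_kernel_fusion(module, **debug)`.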
apply_hls_codegen(module, num_uram=960, num_bram=4032, size_lutram_in_bit=36700000, num_hbm_port=32, create_kernel_wrapper=True)
Apply HLS C++ code generation transforms.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
module | Module | The module to be transformed. | required |
num_uram | int | The number of URAMs. | 960 |
num_bram | int | The number of BRAMs. | 4032 |
size_lutram_in_bit | int | The size of LUTRAM in bits. | 36700000 |
num_hbm_port | int | The number of HBM ports. | 32 |
create_kernel_wrapper | bool | Whether to create kernel wrapper function. | True |
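The four resource parameters describe the memory budget of the target device, so they are natural to group when retargeting. The sketch below is one possible way to do that. `HlsResourceBudget` is a hypothetical helper whose defaults are copied from the signature above; the non-negativity check and the smaller budget values are assumptions made for illustration.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class HlsResourceBudget:
    """On-chip resource limits; defaults mirror the apply_hls_codegen signature."""

    num_uram: int = 960
    num_bram: int = 4032
    size_lutram_in_bit: int = 36_700_000
    num_hbm_port: int = 32

    def __post_init__(self):
        # Sanity check added by this sketch, not documented library behavior.
        for name, value in asdict(self).items():
            if value < 0:
                raise ValueError(f"{name} must be non-negative, got {value}")


default_budget = HlsResourceBudget()

# Hypothetical budget for a smaller device; the numbers are made up.
small_budget = HlsResourceBudget(
    num_uram=64, num_bram=512, size_lutram_in_bit=4_000_000, num_hbm_port=4
)
hls_kwargs = {**asdict(small_budget), "create_kernel_wrapper": True}
```

The dictionary would be passed as `transforms.apply_hls_codegen(module, **hls_kwargs)`.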
apply_intensity_aware_linalg_tiling(module, entry_name, default_tile_size, overall_unroll_size, convert_linalg_to_dataflow, op_ii_map=None, dot_file=None, delete_sequence=True)
Apply intensity-aware linalg tiling.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
module | Module | The module to be transformed. | required |
entry_name | str | The entry function name. | required |
default_tile_size | int | The default tile size. | required |
overall_unroll_size | int | The overall unroll size. | required |
convert_linalg_to_dataflow | bool | Whether to convert tiled linalg op to dataflow kernel. | required |
op_ii_map | Optional[Dict[str, int]] | The map from op name to its initiation interval on hardware. | None |
dot_file | Optional[str] | The DOT file to print the linalg graph. | None |
delete_sequence | bool | Whether to delete the transform sequence after transform. | True |
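`op_ii_map` maps an op name to an integer initiation interval. As a minimal sketch, the helper below checks such a map before it is passed on. `check_op_ii_map` is hypothetical, the op-name keys are illustrative guesses at the key format, and the tile and unroll sizes are made-up values. The only fact relied on is that an initiation interval is a positive number of cycles.

```python
from typing import Any, Dict, Optional


def check_op_ii_map(op_ii_map: Optional[Dict[str, int]]) -> Optional[Dict[str, int]]:
    """Return a copy of the map, rejecting initiation intervals below 1."""
    if op_ii_map is None:
        return None  # documented default: no per-op overrides
    for op_name, ii in op_ii_map.items():
        if not isinstance(ii, int) or ii < 1:
            raise ValueError(f"initiation interval for {op_name!r} must be an int >= 1")
    return dict(op_ii_map)


# Keyword arguments for apply_intensity_aware_linalg_tiling (module excluded).
tiling_kwargs: Dict[str, Any] = {
    "entry_name": "forward",             # hypothetical entry function
    "default_tile_size": 16,             # hypothetical value
    "overall_unroll_size": 64,           # hypothetical value
    "convert_linalg_to_dataflow": True,
    # Hypothetical op names: the exact key format is not specified here.
    "op_ii_map": check_op_ii_map({"math.exp": 4, "arith.divf": 8}),
}
```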
apply_linalg_optimization_transforms(module, fake_quantize=False, quantize_bitwidth=8, constant_quantize_bitwidth=8)
Apply linalg optimization transforms.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
module | Module | The module to be transformed. | required |
fake_quantize | bool | Whether to apply fake quantization. | False |
quantize_bitwidth | int | The bitwidth for fake quantization. | 8 |
constant_quantize_bitwidth | int | The bitwidth of constants for fake quantization. | 8 |
apply_linear_programming_stream_sizing(module, entry_name, pipeline_rewinding=True, stream_size_map=None, stream_full_size=False, dot_file=None)
Apply linear programming stream sizing.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
module | Module | The module to be transformed. | required |
entry_name | str | The entry function name. | required |
pipeline_rewinding | bool | Whether to enable pipeline rewinding. | True |
stream_size_map | Optional[Dict[Tuple[str, str], int]] | The overriding map from edge to stream size. | None |
stream_full_size | bool | Set all stream sizes to the maximum (token number). | False |
dot_file | Optional[str] | The DOT file to print the stream graph. | None |
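`stream_size_map` is keyed by an edge expressed as a pair of strings. The sketch below shows one way to build such an override map, assuming each key is a `(producer, consumer)` pair of node names. That reading of the tuple, the node names, the helper `make_stream_size_map`, and the requirement that sizes be positive are all assumptions.

```python
from typing import Any, Dict, Tuple

Edge = Tuple[str, str]  # assumed to mean (producer name, consumer name)


def make_stream_size_map(overrides: Dict[Edge, int]) -> Dict[Edge, int]:
    """Validate and copy per-edge stream size overrides."""
    checked: Dict[Edge, int] = {}
    for edge, size in overrides.items():
        if len(edge) != 2 or not all(isinstance(name, str) for name in edge):
            raise ValueError(f"edge must be a pair of names, got {edge!r}")
        if size < 1:
            raise ValueError(f"stream size for {edge!r} must be positive")
        checked[edge] = size
    return checked


# Keyword arguments for apply_linear_programming_stream_sizing (module excluded).
sizing_kwargs: Dict[str, Any] = {
    "entry_name": "forward",  # hypothetical entry function
    "pipeline_rewinding": True,
    # Hypothetical node names; these edges override whatever size is computed.
    "stream_size_map": make_stream_size_map(
        {("matmul_0", "softmax_0"): 128, ("softmax_0", "matmul_1"): 32}
    ),
    "stream_full_size": False,
    "dot_file": None,
}
```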
apply_runtime_codegen(module, host_func_name, timeout_in_ms=1000)
Apply runtime C++ code generation transforms.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
module | Module | The module to be transformed. | required |
host_func_name | str | The name of the host function. | required |
timeout_in_ms | int | The timeout in milliseconds. | 1000 |
apply_tensor_to_dataflow_conversion(module)
Apply tensor to dataflow conversion.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
module | Module | The module to be transformed. | required |
construct_kernel_fusion_transform_sequence(target, design_space)
Construct a transform sequence for applying kernel fusion.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
target | BlockArgument | The handle of the target op. | required |
design_space | KernelFusionDesignSpace | The design space of kernel fusion. | required |
Returns:
Type | Description |
---|---|
List[Value] | A list of transformed values, which is empty in this case. |
Raises:
Type | Description |
---|---|
ValueError | If the kernel is not a KernelOp. |
construct_linalg_tiling_transform_sequence(target, design_space, convert_linalg_to_dataflow=True)
Construct a transform sequence for applying linalg tiling.
The following parameters are required in the input design space:
- parallel_tile_sizes: The parallel tile sizes.
- reduction_tile_sizes: The reduction tile sizes.
- unroll_sizes: The unroll sizes.
- inputs_vec_sizes: The input vector sizes.
- outputs_vec_sizes: The output vector sizes.
- permutation: The permutation of the loop order.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
target | BlockArgument | The handle of the target op. | required |
design_space | LinalgTilingDesignSpace | The design space of linalg tiling. | required |
convert_linalg_to_dataflow | bool | Whether to convert tiled linalg op to dataflow kernel. | True |
Returns:
Type | Description |
---|---|
List[Value] | A list of transformed values, which is empty in this case. |
Raises:
Type | Description |
---|---|
ValueError | If the required attributes are not found. |
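The six required design-space parameters and the documented `ValueError` suggest a check along the following lines. This is a sketch only: it uses a `SimpleNamespace` as a stand-in for `LinalgTilingDesignSpace`, assumes the parameters are exposed as plain attributes, and uses made-up values. `missing_attrs` and `check_design_space` are hypothetical helpers, not the library's own validation.

```python
from types import SimpleNamespace
from typing import Any, List

# Names taken from the list of required design-space parameters above.
REQUIRED_ATTRS = (
    "parallel_tile_sizes",
    "reduction_tile_sizes",
    "unroll_sizes",
    "inputs_vec_sizes",
    "outputs_vec_sizes",
    "permutation",
)


def missing_attrs(design_space: Any) -> List[str]:
    """List the required parameters that the design space does not provide."""
    return [name for name in REQUIRED_ATTRS if not hasattr(design_space, name)]


def check_design_space(design_space: Any) -> None:
    """Raise ValueError when any required parameter is absent."""
    missing = missing_attrs(design_space)
    if missing:
        raise ValueError("missing required attributes: " + ", ".join(missing))


# Stand-in objects with illustrative values (assumptions, not real defaults).
complete = SimpleNamespace(
    parallel_tile_sizes=[16, 16],
    reduction_tile_sizes=[8],
    unroll_sizes=[4, 4, 2],
    inputs_vec_sizes=[4, 4],
    outputs_vec_sizes=[4],
    permutation=[0, 1, 2],
)
incomplete = SimpleNamespace(parallel_tile_sizes=[16, 16])

check_design_space(complete)  # passes silently
```

Running such a check before building the transform sequence surfaces a missing parameter early, with the parameter name in the error message.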