'compiler' Submodule

Introduction

from streamtensor import compiler

StreamTensor compiler module.

This module provides a high-level interface to the StreamTensor compiler. Its entry points are the function compile_hls_pipeline and the Compiler.hls_pipeline method; each takes a serialized MLIR module and a set of options, and returns a serialized MLIR module that has been optimized for high-level synthesis (HLS) targeting FPGA devices.

CompileConfig(default_tile_size=16, overall_unroll_size=16, max_fusion_cost=8000, max_widen_bitwidth=512, max_vectorize_bitwidth=4096, partition=3, num_dsp=[2664, 2784, 2928], num_uram=[320, 320, 320], num_bram=[1200, 1152, 1200], num_lut=[386880, 364320, 395040], resource_threshold=0.9, num_hbm_port=32, timeout_in_ms=1000, clock_freqhz=150000000, resource_allocation=True, pipeline_rewinding=True, constant_folding=True, create_kernel_wrapper=False, max_stream_depth=None, task_partition_merging=True, task_chain_merging=False, data_driven_task_merging=False, op_ii_map={'math.erf': 2}, stream_size_map=None, fake_quantize=False, quantize_bitwidth=8, constant_quantize_bitwidth=8, imbalance_threshold=0.3, assign_slr=False)

Configuration for the StreamTensor compiler.

Attributes:

Name Type Description
default_tile_size int

The default tile size for linalg tiling.

overall_unroll_size int

The overall unroll size for linalg tiling.

max_fusion_cost int

The maximum fusion cost for greedy kernel fusion.

max_widen_bitwidth int

The maximum bitwidth for dataflow optimization.

max_vectorize_bitwidth int

The maximum bitwidth for stream vectorization.

partition int

The number of partitions on the target FPGA.

num_dsp List[int]

The number of DSPs on the target FPGA for each partition.

num_uram List[int]

The number of URAMs on the target FPGA for each partition.

num_bram List[int]

The number of BRAMs on the target FPGA for each partition.

num_lut List[int]

The number of LUTs on the target FPGA for each partition.

resource_threshold float

The resource threshold for DSPs, URAMs, BRAMs, and LUTs.

num_hbm_port int

The number of HBM ports on the target FPGA.

timeout_in_ms int

The timeout in milliseconds when running the kernel.

clock_freqhz int

The clock frequency in Hz of the target FPGA.

resource_allocation bool

Whether to perform profile-based resource allocation.

pipeline_rewinding bool

Whether to enable pipeline rewinding.

constant_folding bool

Whether to perform constant folding.

create_kernel_wrapper bool

Whether to create a kernel wrapper function.

max_stream_depth Optional[int]

The maximum stream depth for stream reduction.

task_partition_merging bool

Whether to merge task partitions.

task_chain_merging bool

Whether to merge task chains.

data_driven_task_merging bool

Whether to merge data-driven tasks.

op_ii_map Optional[Dict[str, int]]

The map from op name to its initiation interval on hardware.

stream_size_map Optional[Dict[Tuple[str, str], int]]

The overriding map from edge to stream size.

fake_quantize bool

Whether to apply fake quantization.

quantize_bitwidth int

The bitwidth for fake quantization.

constant_quantize_bitwidth int

The bitwidth of constants for fake quantization.

imbalance_threshold float

The imbalance threshold (>= 0) for graph partitioning; the larger the value, the more imbalance is allowed.

assign_slr bool

Whether to assign SLR for each task.
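To make the per-partition resource lists, resource_threshold, and clock_freqhz concrete, the sketch below works through the default values from the signature above. It assumes the threshold is the fraction of each resource that may be used, which the description does not state explicitly. The helpers partition_budget and clock_period_ns are hypothetical names, not part of streamtensor.

```python
import math
from typing import Dict, List

# Default per-partition resource counts copied from the CompileConfig signature.
DEFAULTS: Dict[str, List[int]] = {
    "num_dsp": [2664, 2784, 2928],
    "num_uram": [320, 320, 320],
    "num_bram": [1200, 1152, 1200],
    "num_lut": [386880, 364320, 395040],
}
RESOURCE_THRESHOLD = 0.9
CLOCK_FREQHZ = 150_000_000


def partition_budget(counts: List[int], threshold: float) -> List[int]:
    """Scale each partition's count by the threshold, rounding down.

    Assumption: the threshold is the usable fraction of each resource.
    """
    return [math.floor(count * threshold) for count in counts]


def clock_period_ns(freq_hz: int) -> float:
    """Convert a clock frequency in Hz to a clock period in nanoseconds."""
    return 1e9 / freq_hz


budgets = {
    name: partition_budget(counts, RESOURCE_THRESHOLD)
    for name, counts in DEFAULTS.items()
}
for name, budget in budgets.items():
    print(name, budget)
print(f"clock period: {clock_period_ns(CLOCK_FREQHZ):.3f} ns")
```

Under this reading, the default 150 MHz clock corresponds to a period of roughly 6.67 ns, and each partition keeps about 10% of its resources as headroom.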

Parameters:

Name Type Description Default
default_tile_size int

The default tile size for linalg tiling.

16
overall_unroll_size int

The overall unroll size for linalg tiling.

16
max_fusion_cost int

The maximum fusion cost for greedy kernel fusion.

8000
max_widen_bitwidth int

The maximum bitwidth for dataflow optimization.

512
max_vectorize_bitwidth int

The maximum bitwidth for stream vectorization.

4096
partition int

The number of partitions on the target FPGA.

3
num_dsp List[int]

The number of DSPs on the target FPGA for each partition.

[2664, 2784, 2928]
num_uram List[int]

The number of URAMs on the target FPGA for each partition.

[320, 320, 320]
num_bram List[int]

The number of BRAMs on the target FPGA for each partition.

[1200, 1152, 1200]
num_lut List[int]

The number of LUTs on the target FPGA for each partition.

[386880, 364320, 395040]
resource_threshold float

The resource threshold for DSPs, URAMs, BRAMs, and LUTs.

0.9
num_hbm_port int

The number of HBM ports on the target FPGA.

32
timeout_in_ms int

The timeout in milliseconds when running the kernel.

1000
clock_freqhz int

The clock frequency in Hz of the target FPGA.

150000000
resource_allocation bool

Whether to perform profile-based resource allocation.

True
pipeline_rewinding bool

Whether to enable pipeline rewinding.

True
constant_folding bool

Whether to perform constant folding.

True
create_kernel_wrapper bool

Whether to create a kernel wrapper function.

False
max_stream_depth Optional[int]

The maximum stream depth for stream reduction.

None
task_partition_merging bool

Whether to merge task partitions.

True
task_chain_merging bool

Whether to merge task chains.

False
data_driven_task_merging bool

Whether to merge data-driven tasks.

False
op_ii_map Optional[Dict[str, int]]

The map from op name to its initiation interval on hardware.

{'math.erf': 2}
stream_size_map Optional[Dict[Tuple[str, str], int]]

The overriding map from edge to stream size.

None
fake_quantize bool

Whether to apply fake quantization.

False
quantize_bitwidth int

The bitwidth for fake quantization.

8
constant_quantize_bitwidth int

The bitwidth of constants for fake quantization.

8
imbalance_threshold float

The imbalance threshold (>= 0) for graph partitioning; the larger the value, the more imbalance is allowed.

0.3
assign_slr bool

Whether to assign SLR for each task.

False
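Because several parameters are lists with one entry per partition, a consistency check before compilation can catch mismatched lengths early. The sketch below uses a stand-in dataclass, ConfigSketch, that mirrors only a few CompileConfig fields and their defaults; it is not the library's class. The rule that each list must have exactly partition entries is an assumption inferred from the "for each partition" descriptions, and check_config is a hypothetical helper.

```python
from dataclasses import dataclass, field, replace
from typing import Dict, List, Optional, Tuple


@dataclass
class ConfigSketch:
    """Stand-in mirroring a subset of CompileConfig fields (hypothetical)."""

    partition: int = 3
    num_dsp: List[int] = field(default_factory=lambda: [2664, 2784, 2928])
    num_uram: List[int] = field(default_factory=lambda: [320, 320, 320])
    num_bram: List[int] = field(default_factory=lambda: [1200, 1152, 1200])
    num_lut: List[int] = field(default_factory=lambda: [386880, 364320, 395040])
    imbalance_threshold: float = 0.3
    op_ii_map: Optional[Dict[str, int]] = field(
        default_factory=lambda: {"math.erf": 2}
    )
    stream_size_map: Optional[Dict[Tuple[str, str], int]] = None


def check_config(cfg: ConfigSketch) -> List[str]:
    """Return human-readable problems; an empty list means consistent."""
    problems = []
    for name in ("num_dsp", "num_uram", "num_bram", "num_lut"):
        values = getattr(cfg, name)
        if len(values) != cfg.partition:
            problems.append(
                f"{name} has {len(values)} entries, expected {cfg.partition}"
            )
    if cfg.imbalance_threshold < 0:
        problems.append("imbalance_threshold must be >= 0")
    return problems


default_cfg = ConfigSketch()
# Override one field on a copy, leaving the defaults untouched.
two_partitions = replace(default_cfg, partition=2)
print(check_config(default_cfg))
print(check_config(two_partitions))
```

Changing partition alone leaves the four resource lists with three entries each, so the check reports one problem per list.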

Compiler(work_path='.', config=CompileConfig())

StreamTensor compiler.

Attributes:

Name Type Description
work_path Path

The working directory for the compiler.

config CompileConfig

The configuration for the compiler.

hls_pipeline(module_bytes, module_name, print_dot=True, from_stage=HLSPipelineStage.START, to_stage=HLSPipelineStage.END, synthesis_config=SynthesisConfig())

Compiles a serialized MLIR module for high-level synthesis (HLS).

Parameters:

Name Type Description Default
module_bytes bytes

The serialized MLIR module to compile.

required
module_name str

The name of the module.

required
print_dot bool

Whether to print the dataflow graph as a DOT file.

True
from_stage HLSPipelineStage

The stage to start the compilation from.

START
to_stage HLSPipelineStage

The stage to stop the compilation at.

END
synthesis_config SynthesisConfig

The configuration for the synthesis.

SynthesisConfig()

Returns:

Type Description
Module

The compiled serialized MLIR module.
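A call sequence based only on the signatures listed above might look like the following sketch. The import is deferred into the function so the snippet loads without streamtensor installed. compile_with_compiler is a hypothetical helper name, and how module_bytes is produced is not covered on this page.

```python
from typing import Any


def compile_with_compiler(
    module_bytes: bytes, module_name: str, work_dir: str = "."
) -> Any:
    """Run the HLS pipeline from START to END through a Compiler instance.

    Uses only names documented on this page; streamtensor is required
    when the function is called, not when it is defined.
    """
    from streamtensor import compiler  # deferred import

    # Override two documented options; all others keep their defaults.
    config = compiler.CompileConfig(default_tile_size=16, create_kernel_wrapper=True)
    hls_compiler = compiler.Compiler(work_path=work_dir, config=config)
    return hls_compiler.hls_pipeline(
        module_bytes,
        module_name,
        print_dot=False,
        from_stage=compiler.HLSPipelineStage.START,
        to_stage=compiler.HLSPipelineStage.END,
    )
```

The synthesis_config argument is left at its default here because this page does not document SynthesisConfig.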

HLSPipelineStage

Bases: Enum

The stages in the HLS pipeline.
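The from_stage and to_stage parameters suggest that a contiguous range of stages can be selected. The sketch below illustrates that idea with a stand-in enum. Only START and END appear on this page, so the intermediate members are hypothetical, and treating both ends of the range as inclusive is an assumption.

```python
from enum import Enum
from typing import List


class StageSketch(Enum):
    """Stand-in for HLSPipelineStage; only START and END are documented."""

    START = 0
    MIDDLE_A = 1  # hypothetical intermediate stage
    MIDDLE_B = 2  # hypothetical intermediate stage
    END = 3


def stages_to_run(from_stage: StageSketch, to_stage: StageSketch) -> List[StageSketch]:
    """List the stages from from_stage to to_stage, both inclusive (assumed)."""
    if from_stage.value > to_stage.value:
        raise ValueError("from_stage must not come after to_stage")
    return [s for s in StageSketch if from_stage.value <= s.value <= to_stage.value]


print([s.name for s in stages_to_run(StageSketch.START, StageSketch.END)])
print([s.name for s in stages_to_run(StageSketch.MIDDLE_A, StageSketch.MIDDLE_B)])
```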

compile_hls_pipeline(module_bytes, module_name, print_dot=True, from_stage=HLSPipelineStage.START, to_stage=HLSPipelineStage.END, work_path='.', compile_config=CompileConfig(), synthesis_config=SynthesisConfig())

Compiles a serialized MLIR module for high-level synthesis (HLS).

Parameters:

Name Type Description Default
module_bytes bytes

The serialized MLIR module to compile.

required
module_name str

The name of the module.

required
print_dot bool

Whether to print the dataflow graph as a DOT file.

True
from_stage HLSPipelineStage

The stage to start the compilation from.

START
to_stage HLSPipelineStage

The stage to stop the compilation at.

END
work_path str

The working directory for the compiler.

'.'
compile_config CompileConfig

The configuration for the compiler.

CompileConfig()
synthesis_config SynthesisConfig

The configuration for the synthesis.

SynthesisConfig()

Returns:

Type Description
Module

The compiled serialized MLIR module.
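The function form takes the working directory and compiler configuration directly, so no Compiler instance is needed. The sketch below enables fake quantization through documented CompileConfig parameters. compile_quantized is a hypothetical helper name, and the import is deferred so the snippet loads without streamtensor installed.

```python
from typing import Any


def compile_quantized(
    module_bytes: bytes, module_name: str, work_dir: str = "."
) -> Any:
    """Compile with fake quantization enabled via compile_hls_pipeline.

    Uses only names documented on this page; streamtensor is required
    when the function is called, not when it is defined.
    """
    from streamtensor import compiler  # deferred import

    config = compiler.CompileConfig(
        fake_quantize=True,
        quantize_bitwidth=8,
        constant_quantize_bitwidth=8,
    )
    # from_stage, to_stage, and synthesis_config keep their defaults.
    return compiler.compile_hls_pipeline(
        module_bytes,
        module_name,
        work_path=work_dir,
        compile_config=config,
    )
```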