furiosa.quantizer package

Module contents

A FuriosaAI quantizer.

class furiosa.quantizer.CalibrationMethod(value)

Bases: enum.IntEnum

Calibration method.

MIN_MAX_ASYM

Min-max calibration (Asymmetric).

Type

CalibrationMethod

MIN_MAX_SYM

Min-max calibration (Symmetric).

Type

CalibrationMethod

ENTROPY_ASYM

Entropy calibration (Asymmetric).

Type

CalibrationMethod

ENTROPY_SYM

Entropy calibration (Symmetric).

Type

CalibrationMethod

PERCENTILE_ASYM

Percentile calibration (Asymmetric).

Type

CalibrationMethod

PERCENTILE_SYM

Percentile calibration (Symmetric).

Type

CalibrationMethod

MSE_ASYM

Mean squared error (MSE) calibration (Asymmetric).

Type

CalibrationMethod

MSE_SYM

Mean squared error (MSE) calibration (Symmetric).

Type

CalibrationMethod

SQNR_ASYM

Signal-to-quantization-noise ratio (SQNR) calibration (Asymmetric).

Type

CalibrationMethod

SQNR_SYM

Signal-to-quantization-noise ratio (SQNR) calibration (Symmetric).

Type

CalibrationMethod

ENTROPY_ASYM = 2
ENTROPY_SYM = 3
MIN_MAX_ASYM = 0
MIN_MAX_SYM = 1
MSE_ASYM = 6
MSE_SYM = 7
PERCENTILE_ASYM = 4
PERCENTILE_SYM = 5
SQNR_ASYM = 8
SQNR_SYM = 9
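
The integer values listed above pair up: each calibration algorithm occupies two consecutive values, with the even value for the asymmetric variant and the odd value for the symmetric one. The following sketch copies the documented values into a local stand-in enum to show lookup by name and by value. `CalibrationMethodMirror` and `is_symmetric` are hypothetical names used for illustration only and are not part of `furiosa.quantizer`.

```python
from enum import IntEnum


class CalibrationMethodMirror(IntEnum):
    """Local stand-in that copies the values documented above.

    Hypothetical name for illustration only; real code would import
    CalibrationMethod from furiosa.quantizer instead.
    """

    MIN_MAX_ASYM = 0
    MIN_MAX_SYM = 1
    ENTROPY_ASYM = 2
    ENTROPY_SYM = 3
    PERCENTILE_ASYM = 4
    PERCENTILE_SYM = 5
    MSE_ASYM = 6
    MSE_SYM = 7
    SQNR_ASYM = 8
    SQNR_SYM = 9


def is_symmetric(method: IntEnum) -> bool:
    """Return True for *_SYM members, based on the documented naming."""
    return method.name.endswith("_SYM")


# Members can be looked up by name or by integer value.
by_name = CalibrationMethodMirror["PERCENTILE_ASYM"]
by_value = CalibrationMethodMirror(7)
```

Looking a member up by name is convenient when the method comes from a configuration file or a command-line argument.
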
class furiosa.quantizer.Calibrator(model: Union[onnx.onnx_ml_pb2.ModelProto, bytes], calibration_method: furiosa.quantizer.CalibrationMethod, *, percentage: float = 99.99)

Bases: object

Calibrator.

This collects the values of tensors in an ONNX model and computes their ranges.

collect_data(calibration_dataset: Iterable[Sequence[numpy.ndarray]]) → None

Collect the values of tensors that will be used for range computation.

This can be called multiple times.

Parameters

calibration_dataset (Iterable[Sequence[numpy.ndarray]]) – An iterable that yields input data for the model, one sequence of input arrays at a time.

compute_range(verbose: bool = False) → Dict[str, Tuple[float, float]]

Estimate the ranges of the tensors on the basis of the collected data.

Parameters

verbose (bool) – Whether to show a progress bar. Defaults to False.

Returns

A dictionary that maps a tensor name to a tuple of the tensor’s min and max.

Return type

Dict[str, Tuple[float, float]]
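
To make the shape of the returned dictionary concrete, here is a minimal pure-Python sketch of min-max range estimation over a few hand-written samples. It is not the library's implementation: the helper `min_max_ranges`, the tensor names, and the symmetric rule (widening each range to `[-m, m]` with `m = max(|min|, |max|)`) are assumptions chosen for illustration.

```python
from typing import Dict, Iterable, Mapping, Sequence, Tuple


def min_max_ranges(
    samples: Iterable[Mapping[str, Sequence[float]]],
    symmetric: bool = False,
) -> Dict[str, Tuple[float, float]]:
    """Track the running min and max of every named tensor across samples."""
    ranges: Dict[str, Tuple[float, float]] = {}
    for sample in samples:
        for name, values in sample.items():
            lo, hi = min(values), max(values)
            if name in ranges:
                # Widen the range seen so far with this sample's extremes.
                lo = min(lo, ranges[name][0])
                hi = max(hi, ranges[name][1])
            ranges[name] = (lo, hi)
    if symmetric:
        # Assumed convention: center the range on zero.
        ranges = {
            name: (-max(abs(lo), abs(hi)), max(abs(lo), abs(hi)))
            for name, (lo, hi) in ranges.items()
        }
    return ranges


# Hypothetical tensor names and values, standing in for collected data.
samples = [
    {"input": [0.0, 0.5, 1.0], "conv1_out": [-2.0, 3.5]},
    {"input": [0.1, 0.9], "conv1_out": [-4.0, 1.0]},
]
asym = min_max_ranges(samples)
sym = min_max_ranges(samples, symmetric=True)
```

The result has the same `Dict[str, Tuple[float, float]]` shape that `compute_range` documents, so it can be read as a model of what gets passed on to `quantize` as `tensor_name_to_range`.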

class furiosa.quantizer.Graph

Bases: object

An intermediate representation (IR) of an ONNX or TFLite model.

furiosa.quantizer.quantize(model: Union[onnx.onnx_ml_pb2.ModelProto, bytes], tensor_name_to_range: Mapping[str, Sequence[float]], *, with_quantize: bool = True, normalized_pixel_outputs: Optional[Sequence[int]] = None) → Graph

Quantize an ONNX model on the basis of the range of its tensors.

Parameters
  • model (onnx.ModelProto or bytes) – An ONNX model to quantize.

  • tensor_name_to_range (Mapping[str, Sequence[float]]) – A mapping from a tensor name to a 2-tuple (or list) of the tensor’s min and max.

  • with_quantize (bool) – Whether to put a Quantize operator at the beginning of the resulting model. Defaults to True.

  • normalized_pixel_outputs (Optional[Sequence[int]]) – A sequence of indices of output tensors in the ONNX model that produce pixel values in a normalized format ranging from 0.0 to 1.0. If specified, the corresponding output tensors in the resulting quantized model will generate pixel values in an unnormalized format from 0 to 255, represented as unsigned 8-bit integers (uint8). Defaults to None.

Returns

An intermediate representation (IR) of the quantized model.

Return type

Graph
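
Putting the pieces together, a typical flow would create a `Calibrator`, feed it calibration data, compute the tensor ranges, and pass them to `quantize`. The sketch below relies only on the signatures documented on this page. The wrapper name `calibrate_and_quantize` and its parameters are hypothetical, and the import sits inside the function so that the snippet can be loaded even where the package is not installed.

```python
from typing import Any, Iterable, Sequence


def calibrate_and_quantize(
    model: Any,
    calibration_dataset: Iterable[Sequence[Any]],
    method_name: str = "MIN_MAX_ASYM",
) -> Any:
    """Calibrate `model` on `calibration_dataset`, then quantize it.

    `model` is an onnx.ModelProto or its serialized bytes, and each dataset
    item is a sequence of numpy arrays, following the documented types.
    """
    # Imported lazily: the furiosa.quantizer package is needed at call time.
    from furiosa.quantizer import CalibrationMethod, Calibrator, quantize

    calibrator = Calibrator(model, CalibrationMethod[method_name])
    calibrator.collect_data(calibration_dataset)  # may be called repeatedly
    tensor_name_to_range = calibrator.compute_range()
    # Returns a Graph, the IR of the quantized model.
    return quantize(model, tensor_name_to_range)
```

The keyword-only options (`percentage` on `Calibrator`, `with_quantize` and `normalized_pixel_outputs` on `quantize`) are left at their documented defaults here. How to serialize or save the returned `Graph` is not covered on this page, so that step is omitted.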