furiosa.quantizer package

Submodules

furiosa.quantizer.calibrator module

class furiosa.quantizer.calibrator.CalibrationMethod(value)

Bases: IntEnum

Calibration method.

MIN_MAX_ASYM

Min-max calibration (Asymmetric).

Type:

CalibrationMethod

MIN_MAX_SYM

Min-max calibration (Symmetric).

Type:

CalibrationMethod

ENTROPY_ASYM

Entropy calibration (Asymmetric).

Type:

CalibrationMethod

ENTROPY_SYM

Entropy calibration (Symmetric).

Type:

CalibrationMethod

PERCENTILE_ASYM

Percentile calibration (Asymmetric).

Type:

CalibrationMethod

PERCENTILE_SYM

Percentile calibration (Symmetric).

Type:

CalibrationMethod

MSE_ASYM

Mean squared error (MSE) calibration (Asymmetric).

Type:

CalibrationMethod

MSE_SYM

Mean squared error (MSE) calibration (Symmetric).

Type:

CalibrationMethod

SQNR_ASYM

Signal-to-quantization-noise ratio (SQNR) calibration (Asymmetric).

Type:

CalibrationMethod

SQNR_SYM

Signal-to-quantization-noise ratio (SQNR) calibration (Symmetric).

Type:

CalibrationMethod

ENTROPY_ASYM = 2
ENTROPY_SYM = 3
MIN_MAX_ASYM = 0
MIN_MAX_SYM = 1
MSE_ASYM = 6
MSE_SYM = 7
PERCENTILE_ASYM = 4
PERCENTILE_SYM = 5
SQNR_ASYM = 8
SQNR_SYM = 9
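
The values listed above follow a regular pattern: each calibration method occupies a consecutive pair, with the asymmetric variant on the even value and the symmetric variant on the odd value. The sketch below rebuilds the enum as a local stand-in (`CalibrationMethodSketch` and `is_symmetric` are hypothetical names, not part of the library) to show lookup by name and by value. The parity rule is only an observation from this listing, not a documented guarantee.

```python
from enum import IntEnum


class CalibrationMethodSketch(IntEnum):
    """Local stand-in mirroring the documented values (not the library class)."""

    MIN_MAX_ASYM = 0
    MIN_MAX_SYM = 1
    ENTROPY_ASYM = 2
    ENTROPY_SYM = 3
    PERCENTILE_ASYM = 4
    PERCENTILE_SYM = 5
    MSE_ASYM = 6
    MSE_SYM = 7
    SQNR_ASYM = 8
    SQNR_SYM = 9


def is_symmetric(method: IntEnum) -> bool:
    # Assumption: odd values are the symmetric variants, as in the listing above.
    return int(method) % 2 == 1


# IntEnum members can be looked up by name or by integer value.
by_name = CalibrationMethodSketch["ENTROPY_SYM"]
by_value = CalibrationMethodSketch(4)
```
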
class furiosa.quantizer.calibrator.Calibrator(model: ModelProto | bytes, calibration_method: CalibrationMethod, *, percentage: float = 99.99)

Bases: object

Calibrator.

This collects the values of tensors in an ONNX model and computes their ranges.

collect_data(calibration_dataset: Iterable[Sequence[ndarray]]) -> None

Collect the values of tensors that will be used for range computation.

This can be called multiple times.

Parameters:

calibration_dataset (Iterable[Sequence[numpy.ndarray]]) – An iterable that yields one set of model inputs at a time, each as a sequence of arrays.

compute_range(verbose: bool = False) -> Dict[str, Tuple[float, float]]

Estimate the ranges of the tensors on the basis of the collected data.

Parameters:

verbose (bool) – Whether to show a progress bar. Defaults to False.

Returns:

A dictionary that maps a tensor name to a tuple of the tensor’s min and max.

Return type:

Dict[str, Tuple[float, float]]
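
The collect-then-compute pattern described above can be illustrated with a toy stand-in. This is a minimal sketch, not the library's implementation: `MinMaxSketch` is a hypothetical class that only tracks the running minimum and maximum of the named inputs, using plain Python lists in place of arrays, whereas the real Calibrator works on the tensors of an ONNX model. It shows why `collect_data` can be called several times before `compute_range`.

```python
from typing import Dict, Iterable, Sequence, Tuple


class MinMaxSketch:
    """Toy stand-in that tracks a running min/max per named input."""

    def __init__(self, input_names: Sequence[str]) -> None:
        self._names = list(input_names)
        self._ranges: Dict[str, Tuple[float, float]] = {}

    def collect_data(self, dataset: Iterable[Sequence[Sequence[float]]]) -> None:
        # Each dataset item holds one sequence of values per model input.
        for sample in dataset:
            for name, values in zip(self._names, sample):
                lo, hi = min(values), max(values)
                if name in self._ranges:
                    old_lo, old_hi = self._ranges[name]
                    lo, hi = min(lo, old_lo), max(hi, old_hi)
                self._ranges[name] = (float(lo), float(hi))

    def compute_range(self) -> Dict[str, Tuple[float, float]]:
        # Return a copy so callers cannot mutate the collected state.
        return dict(self._ranges)


sketch = MinMaxSketch(["input"])
sketch.collect_data([[[0.5, -1.0, 2.0]]])
sketch.collect_data([[[3.5, 0.0]]])  # a second call widens the range
ranges = sketch.compute_range()
```
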

furiosa.quantizer.editor module

class furiosa.quantizer.editor.ModelEditor(model: ModelProto)

Bases: object

A utility class for manipulating ONNX models.

convert_input_type(tensor_name: str, tensor_type: TensorType) -> None

Convert the element type of an input tensor named tensor_name to tensor_type.

Parameters:
  • tensor_name (str) – The name of an input tensor to convert.

  • tensor_type (TensorType) – The desired element type.

convert_output_type(tensor_name: str, tensor_type: TensorType, tensor_range: Tuple[float, float] | None = None) -> None

Convert the element type of an output tensor named tensor_name to tensor_type.

Parameters:
  • tensor_name (str) – The name of an output tensor to convert.

  • tensor_type (TensorType) – The desired element type.

  • tensor_range (Optional[Tuple[float, float]]) – A new min/max range of the output tensor. If it is None, the original range will be retained. Defaults to None.
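
To see why a min/max range matters when an output is converted to an 8-bit type, the sketch below derives a scale and zero point from a range using the conventional affine quantization scheme. Whether the library computes its parameters exactly this way is an assumption; `affine_params` is a hypothetical helper and not part of the furiosa.quantizer API.

```python
from typing import Tuple


def affine_params(tensor_range: Tuple[float, float], signed: bool) -> Tuple[float, int]:
    """Return (scale, zero_point) for an 8-bit affine quantization of tensor_range."""
    qmin, qmax = (-128, 127) if signed else (0, 255)
    # Widen the range so that 0.0 is always inside it.
    lo = min(tensor_range[0], 0.0)
    hi = max(tensor_range[1], 0.0)
    if hi == lo:
        # Degenerate range: fall back to an identity-like mapping.
        return 1.0, 0
    scale = (hi - lo) / (qmax - qmin)
    zero_point = int(round(qmin - lo / scale))
    # Clamp the zero point to the representable integer range.
    return scale, max(qmin, min(qmax, zero_point))


uint8_scale, uint8_zero = affine_params((-1.0, 3.0), signed=False)
int8_scale, int8_zero = affine_params((-128.0, 127.0), signed=True)
```

A narrower range gives a smaller scale, so each integer step covers a smaller slice of the real-valued output.
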

class furiosa.quantizer.editor.TensorType(value)

Bases: IntEnum

An enumeration class representing the element type of a tensor.

This class is used with the ModelEditor.convert_input_type and ModelEditor.convert_output_type methods to specify the desired element type.

INT8 = 2
UINT8 = 1
furiosa.quantizer.editor.get_output_names(model: ModelProto) -> List[str]

Return the names of outputs in an ONNX model.

Parameters:

model (onnx.ModelProto) – An ONNX model.

Returns:

A list of the names of outputs in the model.

Return type:

List[str]

furiosa.quantizer.editor.get_pure_input_names(model: ModelProto) -> List[str]

Return the names of inputs in an ONNX model that have no associated initializers.

Parameters:

model (onnx.ModelProto) – An ONNX model.

Returns:

A list of the names of inputs in the model that have no associated initializers.

Return type:

List[str]
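
The distinction between all graph inputs and "pure" inputs can be sketched as a simple name filter. The functions below are hypothetical re-implementations written for illustration, not the library's code. They rely only on the `graph.input`, `graph.initializer`, and `graph.output` fields of an ONNX model, and a duck-typed stand-in replaces a real `onnx.ModelProto` so the sketch stays self-contained.

```python
from types import SimpleNamespace
from typing import List


def pure_input_names(model) -> List[str]:
    """Names in graph.input that do not also appear in graph.initializer."""
    initializer_names = {init.name for init in model.graph.initializer}
    return [inp.name for inp in model.graph.input if inp.name not in initializer_names]


def output_names(model) -> List[str]:
    """Names of all entries in graph.output."""
    return [out.name for out in model.graph.output]


# Duck-typed stand-in for an onnx.ModelProto (illustration only).
toy_model = SimpleNamespace(
    graph=SimpleNamespace(
        input=[SimpleNamespace(name="image"), SimpleNamespace(name="conv.weight")],
        initializer=[SimpleNamespace(name="conv.weight")],
        output=[SimpleNamespace(name="logits")],
    )
)
```

Here `conv.weight` is listed as a graph input but is backed by an initializer, so only `image` counts as a pure input.
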

Module contents

A FuriosaAI quantizer.

class furiosa.quantizer.CalibrationMethod(value)

Bases: IntEnum

Calibration method.

MIN_MAX_ASYM

Min-max calibration (Asymmetric).

Type:

CalibrationMethod

MIN_MAX_SYM

Min-max calibration (Symmetric).

Type:

CalibrationMethod

ENTROPY_ASYM

Entropy calibration (Asymmetric).

Type:

CalibrationMethod

ENTROPY_SYM

Entropy calibration (Symmetric).

Type:

CalibrationMethod

PERCENTILE_ASYM

Percentile calibration (Asymmetric).

Type:

CalibrationMethod

PERCENTILE_SYM

Percentile calibration (Symmetric).

Type:

CalibrationMethod

MSE_ASYM

Mean squared error (MSE) calibration (Asymmetric).

Type:

CalibrationMethod

MSE_SYM

Mean squared error (MSE) calibration (Symmetric).

Type:

CalibrationMethod

SQNR_ASYM

Signal-to-quantization-noise ratio (SQNR) calibration (Asymmetric).

Type:

CalibrationMethod

SQNR_SYM

Signal-to-quantization-noise ratio (SQNR) calibration (Symmetric).

Type:

CalibrationMethod

ENTROPY_ASYM = 2
ENTROPY_SYM = 3
MIN_MAX_ASYM = 0
MIN_MAX_SYM = 1
MSE_ASYM = 6
MSE_SYM = 7
PERCENTILE_ASYM = 4
PERCENTILE_SYM = 5
SQNR_ASYM = 8
SQNR_SYM = 9
class furiosa.quantizer.Calibrator(model: ModelProto | bytes, calibration_method: CalibrationMethod, *, percentage: float = 99.99)

Bases: object

Calibrator.

This collects the values of tensors in an ONNX model and computes their ranges.

collect_data(calibration_dataset: Iterable[Sequence[ndarray]]) -> None

Collect the values of tensors that will be used for range computation.

This can be called multiple times.

Parameters:

calibration_dataset (Iterable[Sequence[numpy.ndarray]]) – An iterable that yields one set of model inputs at a time, each as a sequence of arrays.

compute_range(verbose: bool = False) -> Dict[str, Tuple[float, float]]

Estimate the ranges of the tensors on the basis of the collected data.

Parameters:

verbose (bool) – Whether to show a progress bar. Defaults to False.

Returns:

A dictionary that maps a tensor name to a tuple of the tensor’s min and max.

Return type:

Dict[str, Tuple[float, float]]

class furiosa.quantizer.ModelEditor(model: ModelProto)

Bases: object

A utility class for manipulating ONNX models.

convert_input_type(tensor_name: str, tensor_type: TensorType) -> None

Convert the element type of an input tensor named tensor_name to tensor_type.

Parameters:
  • tensor_name (str) – The name of an input tensor to convert.

  • tensor_type (TensorType) – The desired element type.

convert_output_type(tensor_name: str, tensor_type: TensorType, tensor_range: Tuple[float, float] | None = None) -> None

Convert the element type of an output tensor named tensor_name to tensor_type.

Parameters:
  • tensor_name (str) – The name of an output tensor to convert.

  • tensor_type (TensorType) – The desired element type.

  • tensor_range (Optional[Tuple[float, float]]) – A new min/max range of the output tensor. If it is None, the original range will be retained. Defaults to None.

class furiosa.quantizer.TensorType(value)

Bases: IntEnum

An enumeration class representing the element type of a tensor.

This class is used with the ModelEditor.convert_input_type and ModelEditor.convert_output_type methods to specify the desired element type.

INT8 = 2
UINT8 = 1
furiosa.quantizer.get_output_names(model: ModelProto) -> List[str]

Return the names of outputs in an ONNX model.

Parameters:

model (onnx.ModelProto) – An ONNX model.

Returns:

A list of the names of outputs in the model.

Return type:

List[str]

furiosa.quantizer.get_pure_input_names(model: ModelProto) -> List[str]

Return the names of inputs in an ONNX model that have no associated initializers.

Parameters:

model (onnx.ModelProto) – An ONNX model.

Returns:

A list of the names of inputs in the model that have no associated initializers.

Return type:

List[str]
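
ModelEditor, TensorType, and get_pure_input_names can be combined as in the sketch below, which converts every pure input of a model to UINT8. It uses only the signatures documented above. `convert_inputs_to_uint8` is a hypothetical wrapper name, and because the conversion methods return None, the sketch assumes the editor updates the model it was constructed with.

```python
from typing import List


def convert_inputs_to_uint8(model) -> List[str]:
    """Convert every pure input of an onnx.ModelProto to UINT8; return the names."""
    # Imported lazily so the sketch can be read without the SDK installed.
    from furiosa.quantizer import ModelEditor, TensorType, get_pure_input_names

    editor = ModelEditor(model)
    names = get_pure_input_names(model)
    for name in names:
        # Assumption: the editor modifies the model passed to its constructor.
        editor.convert_input_type(name, TensorType.UINT8)
    return names
```
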

furiosa.quantizer.quantize(model: ModelProto | bytes, tensor_name_to_range: Mapping[str, Sequence[float]]) -> bytes

Quantize an ONNX model on the basis of the range of its tensors.

Parameters:
  • model (onnx.ModelProto or bytes) – An ONNX model to quantize.

  • tensor_name_to_range (Mapping[str, Sequence[float]]) – A mapping from a tensor name to a 2-tuple (or list) of the tensor’s min and max.

Returns:

A serialized ONNX model that incorporates quantization information.

Return type:

bytes
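
Since compute_range returns a mapping from tensor names to (min, max) tuples and quantize accepts such a mapping, the two can be chained. The sketch below shows one possible flow using only the signatures documented above. `calibrate_and_quantize` is a hypothetical wrapper name, and MIN_MAX_ASYM is an arbitrary choice of method for the example.

```python
def calibrate_and_quantize(model, calibration_dataset, verbose=False) -> bytes:
    """Calibrate an ONNX model on a dataset and return the quantized model as bytes."""
    # Imported lazily so the sketch can be read without the SDK installed.
    from furiosa.quantizer import CalibrationMethod, Calibrator, quantize

    calibrator = Calibrator(model, CalibrationMethod.MIN_MAX_ASYM)
    # collect_data may be called repeatedly, for example once per data shard.
    calibrator.collect_data(calibration_dataset)
    tensor_name_to_range = calibrator.compute_range(verbose=verbose)
    return quantize(model, tensor_name_to_range)
```

The returned bytes are a serialized ONNX model, so they can be written to disk with a file opened in binary mode.
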