furiosa.quantizer package
Module contents
A FuriosaAI quantizer.
- class furiosa.quantizer.CalibrationMethod(value)
Bases:
enum.IntEnum
Calibration method.
- MIN_MAX_ASYM
Min-max calibration (Asymmetric).
- MIN_MAX_SYM
Min-max calibration (Symmetric).
- ENTROPY_ASYM
Entropy calibration (Asymmetric).
- ENTROPY_SYM
Entropy calibration (Symmetric).
- PERCENTILE_ASYM
Percentile calibration (Asymmetric).
- PERCENTILE_SYM
Percentile calibration (Symmetric).
- MSE_ASYM
Mean squared error (MSE) calibration (Asymmetric).
- MSE_SYM
Mean squared error (MSE) calibration (Symmetric).
- SQNR_ASYM
Signal-to-quantization-noise ratio (SQNR) calibration (Asymmetric).
- SQNR_SYM
Signal-to-quantization-noise ratio (SQNR) calibration (Symmetric).
- ENTROPY_ASYM = 2
- ENTROPY_SYM = 3
- MIN_MAX_ASYM = 0
- MIN_MAX_SYM = 1
- MSE_ASYM = 6
- MSE_SYM = 7
- PERCENTILE_ASYM = 4
- PERCENTILE_SYM = 5
- SQNR_ASYM = 8
- SQNR_SYM = 9
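A minimal sketch of how these members behave. Since CalibrationMethod is an IntEnum, each member compares equal to its integer value; the choice of PERCENTILE_ASYM below is only illustrative, and the note about the `percentage` keyword is an assumption based on the Calibrator signature.

```python
from furiosa.quantizer import CalibrationMethod

# IntEnum members compare equal to their integer values.
assert CalibrationMethod.MIN_MAX_ASYM == 0
assert CalibrationMethod.SQNR_SYM == 9

# Pick a method to pass to Calibrator; the `percentage` keyword of
# Calibrator is presumably relevant to the percentile methods (assumption).
method = CalibrationMethod.PERCENTILE_ASYM
```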
- class furiosa.quantizer.Calibrator(model: Union[onnx.onnx_ml_pb2.ModelProto, bytes], calibration_method: furiosa.quantizer.CalibrationMethod, *, percentage: float = 99.99)
Bases:
object
Calibrator.
This collects the values of tensors in an ONNX model and computes their ranges.
- collect_data(calibration_dataset: Iterable[Sequence[numpy.ndarray]]) → None
Collect the values of tensors that will be used for range computation.
This can be called multiple times.
- Parameters
calibration_dataset (Iterable[Sequence[numpy.ndarray]]) – An object that provides input data for the model one at a time.
- compute_range(verbose: bool = False) → Dict[str, Tuple[float, float]]
Estimate the ranges of the tensors on the basis of the collected data.
- Parameters
verbose (bool) – Whether to show a progress bar. Defaults to False.
- Returns
A dictionary that maps a tensor name to a tuple of the tensor’s min and max.
- Return type
Dict[str, Tuple[float, float]]
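A minimal usage sketch of Calibrator, assuming an ONNX model at a hypothetical path and a single-input model fed with random float32 tensors; real calibration data should come from a representative dataset.

```python
import numpy as np
import onnx

from furiosa.quantizer import CalibrationMethod, Calibrator

model = onnx.load("model.onnx")  # hypothetical path
calibrator = Calibrator(model, CalibrationMethod.MIN_MAX_ASYM)

# Each element of the dataset is a sequence of numpy arrays, one per model
# input; the (1, 3, 224, 224) shape is an assumption for illustration.
calibration_dataset = (
    [np.random.rand(1, 3, 224, 224).astype(np.float32)] for _ in range(8)
)
calibrator.collect_data(calibration_dataset)  # may be called multiple times
ranges = calibrator.compute_range(verbose=True)  # {tensor_name: (min, max)}
```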
- class furiosa.quantizer.Graph
Bases:
object
An intermediate representation (IR) of an ONNX or TFLite model.
- furiosa.quantizer.quantize(model: Union[onnx.onnx_ml_pb2.ModelProto, bytes], tensor_name_to_range: Mapping[str, Sequence[float]], *, with_quantize: bool = True, normalized_pixel_outputs: Optional[Sequence[int]] = None) → Graph
Quantize an ONNX model on the basis of the range of its tensors.
- Parameters
model (onnx.ModelProto or bytes) – An ONNX model to quantize.
tensor_name_to_range (Mapping[str, Sequence[float]]) – A mapping from a tensor name to a 2-tuple (or list) of the tensor’s min and max.
with_quantize (bool) – Whether to put a Quantize operator at the beginning of the resulting model. Defaults to True.
normalized_pixel_outputs (Optional[Sequence[int]]) – A sequence of indices of output tensors in the ONNX model that produce pixel values in a normalized format ranging from 0.0 to 1.0. If specified, the corresponding output tensors in the resulting quantized model will generate pixel values in an unnormalized format from 0 to 255, represented as unsigned 8-bit integers (uint8). Defaults to None.
- Returns
An intermediate representation (IR) of the quantized model.
- Return type
Graph
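A short sketch of quantizing with precomputed ranges. The tensor names and range values below are purely illustrative; in practice the mapping comes from Calibrator.compute_range(), and the model path is hypothetical.

```python
import onnx

from furiosa.quantizer import quantize

model = onnx.load("model.onnx")  # hypothetical path
# Illustrative ranges only; obtain real ones from Calibrator.compute_range().
ranges = {"input": (0.0, 1.0), "output": (-6.3, 8.1)}

graph = quantize(model, ranges, with_quantize=True)
# Passing normalized_pixel_outputs=[0] instead would make output 0 emit
# unnormalized uint8 pixel values in [0, 255].
```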