furiosa.quantizer package

class furiosa.quantizer.Calibrator(model: Union[onnx.onnx_ml_pb2.ModelProto, bytes], calibration_method: furiosa.quantizer.CalibrationMethod, *, percentage: float = 99.99)

Bases: object

Calibrator.

This collects the values of tensors in an ONNX model and computes their ranges.

collect_data(calibration_dataset: Iterable[Sequence[numpy.ndarray]]) → None

Collect the values of tensors that will be used for range computation.

This can be called multiple times.

Parameters

calibration_dataset (Iterable[Sequence[numpy.ndarray]]) – An iterable that yields input data for the model one sample at a time; each sample is a sequence of numpy.ndarray objects.
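As an illustration of the expected argument type, the following minimal sketch builds an object that satisfies Iterable[Sequence[numpy.ndarray]]. The helper name random_calibration_dataset, the input shape, and the use of random values are assumptions made for this example only; a real calibration run would normally yield preprocessed samples that are representative of the model's actual inputs.

```python
from typing import Iterator, List, Tuple

import numpy as np


def random_calibration_dataset(
    num_samples: int,
    shape: Tuple[int, ...] = (1, 3, 224, 224),  # hypothetical input shape
    seed: int = 0,
) -> Iterator[List[np.ndarray]]:
    """Yield one sample at a time; each sample is a list of input arrays.

    This hypothetical helper assumes a model with a single float32 input.
    """
    rng = np.random.default_rng(seed)
    for _ in range(num_samples):
        # One list entry per array in the sample (here: exactly one).
        yield [rng.random(shape, dtype=np.float32)]
```

A generator like this could be passed directly as calibration_dataset. Since collect_data can be called multiple times, several such iterables could also be supplied in separate calls.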

compute_range(verbose: bool = False) → Dict[str, Tuple[float, float]]

Estimate the ranges of the tensors on the basis of the collected data.

Parameters

verbose (bool) – Whether to show a progress bar. Defaults to False.

Returns

A dictionary that maps a tensor name to a tuple of the tensor’s min and max.

Return type

Dict[str, Tuple[float, float]]
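To make the shape of the returned dictionary concrete, the sketch below derives per-tensor (min, max) pairs from collected values with plain NumPy. It is not the library's implementation: the function names, the collected argument, and the way the percentage value is turned into a clipped range are all assumptions chosen for illustration.

```python
from typing import Dict, List, Tuple

import numpy as np


def min_max_ranges(
    collected: Dict[str, List[np.ndarray]]
) -> Dict[str, Tuple[float, float]]:
    """Map each tensor name to the (min, max) over all collected values."""
    ranges = {}
    for name, arrays in collected.items():
        values = np.concatenate([a.ravel() for a in arrays])
        ranges[name] = (float(values.min()), float(values.max()))
    return ranges


def percentile_ranges(
    collected: Dict[str, List[np.ndarray]], percentage: float = 99.99
) -> Dict[str, Tuple[float, float]]:
    """Like min_max_ranges, but trims outliers at both tails.

    Assumption: the range spans the (100 - percentage)-th to the
    percentage-th percentile of the collected values.
    """
    tail = 100.0 - percentage
    ranges = {}
    for name, arrays in collected.items():
        values = np.concatenate([a.ravel() for a in arrays])
        low, high = np.percentile(values, [tail, percentage])
        ranges[name] = (float(low), float(high))
    return ranges
```

Both helpers return the same structure as the documented return type, a Dict[str, Tuple[float, float]] keyed by tensor name.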

class furiosa.quantizer.Graph

Bases: object

An intermediate representation (IR) of an ONNX or TFLite model.

furiosa.quantizer.quantize(model: Union[onnx.onnx_ml_pb2.ModelProto, bytes], tensor_name_to_range: Mapping[str, Sequence[float]], *, with_quantize: bool = True, normalized_pixel_outputs: Optional[Sequence[int]] = None) → Graph

Quantize an ONNX model on the basis of the range of its tensors.

Parameters

model (onnx.ModelProto or bytes) – An ONNX model to quantize.
tensor_name_to_range (Mapping[str, Sequence[float]]) – A mapping from a tensor name to a 2-tuple (or list) of the tensor’s min and max.
with_quantize (bool) – Whether to put a Quantize operator at the beginning of the resulting model. Defaults to True.
normalized_pixel_outputs (Optional[Sequence[int]]) – A sequence of indices of output tensors in the ONNX model that produce pixel values in a normalized format ranging from 0.0 to 1.0. If specified, the corresponding output tensors in the resulting quantized model will generate pixel values in an unnormalized format from 0 to 255, represented as unsigned 8-bit integers (uint8). Defaults to None.
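The relationship between the normalized and unnormalized pixel formats described for normalized_pixel_outputs can be sketched with NumPy as follows. The function name normalized_to_uint8 is hypothetical, and the rounding and clipping behaviour shown here is an assumption; the quantized model may map values differently.

```python
import numpy as np


def normalized_to_uint8(pixels: np.ndarray) -> np.ndarray:
    """Map pixel values in [0.0, 1.0] to uint8 values in [0, 255].

    Assumption: values are scaled by 255, rounded to the nearest integer,
    and clipped to the uint8 range.
    """
    scaled = np.rint(np.asarray(pixels, dtype=np.float32) * 255.0)
    return np.clip(scaled, 0, 255).astype(np.uint8)
```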

Returns

An intermediate representation (IR) of the quantized model.

Return type

Graph
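Because tensor_name_to_range accepts either 2-tuples or lists, computed ranges can be stored as JSON and reloaded before quantization. The sketch below shows one way to do that using only the standard library; the helper names dump_ranges and load_ranges and the validation rules are assumptions for this example, not part of furiosa.quantizer.

```python
import json
from typing import Dict, List, Mapping, Sequence


def dump_ranges(ranges: Mapping[str, Sequence[float]]) -> str:
    """Serialize a tensor-name-to-range mapping as a JSON string."""
    return json.dumps(
        {name: [float(low), float(high)] for name, (low, high) in ranges.items()}
    )


def load_ranges(text: str) -> Dict[str, List[float]]:
    """Parse ranges from JSON and check that each entry is a [min, max] pair."""
    ranges = json.loads(text)
    for name, bounds in ranges.items():
        if len(bounds) != 2 or bounds[0] > bounds[1]:
            raise ValueError(f"invalid range for tensor {name!r}: {bounds}")
    return ranges
```

The dictionary returned by load_ranges maps each tensor name to a two-element list, which matches the Mapping[str, Sequence[float]] type that quantize expects for tensor_name_to_range.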