furiosa.quantizer.frontend.onnx package

Subpackages

Submodules

furiosa.quantizer.frontend.onnx.calibrate module

class furiosa.quantizer.frontend.onnx.calibrate.CalibrationDataReaderForIterator(iterator: Iterator[Dict[str, numpy.ndarray]])

Bases: onnxruntime.quantization.calibrate.CalibrationDataReader

A CalibrationDataReader that wraps dicts mapping input tensor names to their values.

get_next()

Generates the next input data dict for an ONNX InferenceSession run.
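The wrapping behavior can be illustrated with a minimal stand-alone sketch. This is not the real implementation (which derives from onnxruntime.quantization.calibrate.CalibrationDataReader); the class name and internals here are assumptions for illustration only:

```python
from typing import Dict, Iterator, Optional

import numpy as np


class IteratorReaderSketch:
    """Illustrative stand-in for CalibrationDataReaderForIterator."""

    def __init__(self, iterator: Iterator[Dict[str, np.ndarray]]):
        self.iterator = iterator

    def get_next(self) -> Optional[Dict[str, np.ndarray]]:
        # Return the next input-name -> value dict, or None once the
        # iterator is exhausted, which is how onnxruntime's calibrator
        # detects the end of the calibration data.
        return next(self.iterator, None)


reader = IteratorReaderSketch(iter([{"input": np.zeros((1, 3), dtype=np.float32)}]))
first = reader.get_next()   # the single wrapped dict
second = reader.get_next()  # None: the iterator is exhausted
```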

exception furiosa.quantizer.frontend.onnx.calibrate.CalibrationError

Bases: Exception

The base class for all exceptions that are related to calibration.

furiosa.quantizer.frontend.onnx.calibrate.calibrate(model: onnx.onnx_ml_pb2.ModelProto, dataset: Iterable[Dict[str, numpy.ndarray]], augmented_model_path: Optional[str] = None) Dict[str, Tuple[float, float]]

Estimates the range of tensors in a model, based on a dataset.

Parameters
  • model – An ONNX model to calibrate.

  • dataset – An Iterable that yields dicts mapping input tensor names to their values.

  • augmented_model_path – A path to save an augmented model to.

Returns

A dict mapping tensors in the model to their minimum and maximum values.
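The returned ranges are plain (min, max) pairs per tensor. For a model input, the estimate conceptually reduces to the extremes observed across the dataset; a sketch with a hypothetical input named "x" (intermediate tensors are covered by running the augmented model, which this sketch omits):

```python
import numpy as np

# A small calibration dataset: 4 samples for one input tensor "x".
rng = np.random.default_rng(0)
dataset = [{"x": rng.standard_normal((1, 8)).astype(np.float32)} for _ in range(4)]

# Min and max of the input over every calibration sample -- the shape of
# the entries in the dict that calibrate() returns.
samples = np.concatenate([d["x"] for d in dataset])
ranges = {"x": (float(samples.min()), float(samples.max()))}
```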

furiosa.quantizer.frontend.onnx.calibrate.calibrate_with_random_data(model: onnx.onnx_ml_pb2.ModelProto, dataset_size: int = 8, augmented_model_path: Optional[str] = None) Dict[str, Tuple[float, float]]

Estimates the range of tensors in a model, based on a random dataset.

Parameters
  • model – An ONNX model to calibrate.

  • dataset_size – The size of the random dataset to use.

  • augmented_model_path – A path to save an augmented model to.

Returns

A dict mapping tensors in the model to their minimum and maximum values.
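Presumably this amounts to generating dataset_size random inputs that match the model's input signature and calibrating on them. A sketch of what such a dataset looks like (the input name "input" and the shape are assumptions, not read from a model):

```python
import numpy as np


def random_dataset(shape, size=8, seed=0):
    # Build `size` random samples for a single hypothetical input "input",
    # mirroring the Dict[str, np.ndarray] element type calibrate() expects.
    rng = np.random.default_rng(seed)
    return [{"input": rng.standard_normal(shape).astype(np.float32)}
            for _ in range(size)]


ds = random_dataset((1, 3, 32, 32))  # 8 samples, the documented default size
```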

Module contents

exception furiosa.quantizer.frontend.onnx.AlreadyQuantizedError(op_type: str)

Bases: ValueError

Exception raised if the given model is already partially quantized.

furiosa.quantizer.frontend.onnx.optimize_model(model: onnx.onnx_ml_pb2.ModelProto, input_shapes: Optional[Dict[str, List[int]]] = None, opset_version: int = 13) onnx.onnx_ml_pb2.ModelProto
furiosa.quantizer.frontend.onnx.parse_onnx_graph(model: onnx.onnx_ml_pb2.ModelProto) Tuple[Dict[str, onnx.onnx_ml_pb2.ValueInfoProto], Dict[str, onnx.onnx_ml_pb2.NodeProto], Dict[str, List[onnx.onnx_ml_pb2.NodeProto]]]
furiosa.quantizer.frontend.onnx.post_training_quantization_with_random_calibration(model: onnx.onnx_ml_pb2.ModelProto, per_channel: bool, static: bool, mode: furiosa.quantizer.frontend.onnx.quantizer.utils.QuantizationMode, num_data: int = 8, opset_version: int = 13) onnx.onnx_ml_pb2.ModelProto
furiosa.quantizer.frontend.onnx.post_training_quantize(model: onnx.onnx_ml_pb2.ModelProto, dataset: Iterable[Dict[str, numpy.ndarray]], per_channel: bool = True, opset_version: int = 13) onnx.onnx_ml_pb2.ModelProto

Post-training-quantizes an ONNX model with a calibration dataset.

Parameters
  • model – An ONNX model to quantize.

  • dataset – A calibration dataset.

  • per_channel – If per_channel is True, Conv’s filters are per-channel quantized. Otherwise, they are per-tensor quantized.

  • opset_version – ONNX OperatorSet version to use.

Returns

An ONNX model post-training-quantized with the calibration dataset.
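A typical end-to-end flow might look like the sketch below. The model path, input name, and shape are assumptions, and the library-specific lines are shown as comments since they require an installed furiosa-quantizer package and a real model file:

```python
import numpy as np

# Calibration dataset: an iterable of input-name -> ndarray dicts
# (here, 8 random samples for a hypothetical input "input").
rng = np.random.default_rng(0)
dataset = [{"input": rng.standard_normal((1, 3, 224, 224)).astype(np.float32)}
           for _ in range(8)]

# With furiosa-quantizer and a real model available, the call would be:
#
#   import onnx
#   from furiosa.quantizer.frontend.onnx import post_training_quantize
#
#   model = onnx.load("model.onnx")  # hypothetical path
#   quantized = post_training_quantize(model, dataset, per_channel=True)
#   onnx.save(quantized, "model_quant.onnx")
```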