furiosa.quantizer.frontend.onnx package

Submodules

furiosa.quantizer.frontend.onnx.calibrate module

class furiosa.quantizer.frontend.onnx.calibrate.CalibrationDataReaderForIterator(iterator: Iterator[Dict[str, numpy.ndarray]])

Bases: onnxruntime.quantization.calibrate.CalibrationDataReader

A CalibrationDataReader that wraps an iterator of dicts mapping input tensor names to their values.

get_next()

Generates the input data dict for an ONNX Runtime InferenceSession run.
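
A minimal usage sketch, assuming a hypothetical model with a single input tensor named "input" of shape 1x3x224x224; the sample count and shapes are placeholders:

import numpy as np

from furiosa.quantizer.frontend.onnx.calibrate import CalibrationDataReaderForIterator

# A generator of input dicts, one dict per calibration sample.
samples = ({"input": np.random.rand(1, 3, 224, 224).astype(np.float32)} for _ in range(8))
reader = CalibrationDataReaderForIterator(iter(samples))

# Drain the reader the way onnxruntime's calibrator would; get_next() is expected
# to return one input dict per call and a falsy value once the iterator is exhausted.
batch = reader.get_next()
while batch:
    # feed `batch` to an onnxruntime InferenceSession here
    batch = reader.get_next()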

exception furiosa.quantizer.frontend.onnx.calibrate.CalibrationError

Bases: Exception

The base class for all exceptions that are related to calibration.

furiosa.quantizer.frontend.onnx.calibrate.calibrate(model: onnx.onnx_ml_pb2.ModelProto, dataset: Iterable[Dict[str, numpy.ndarray]], augmented_model_path: Optional[str] = None) → Dict[str, Tuple[float, float]]

Estimates the range of tensors in a model, based on a dataset.

Args:

model: An ONNX model to calibrate.
dataset: An Iterable that returns dicts mapping input tensor names to their values.
augmented_model_path: A path to save an augmented model to.

Returns:

A dict mapping tensors in the model to their minimum and maximum values.
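
For illustration, a minimal sketch of calibrating a model with a small in-memory dataset; the model path "model.onnx", the input name "input", and the tensor shapes are placeholders:

import numpy as np
import onnx

from furiosa.quantizer.frontend.onnx.calibrate import calibrate

model = onnx.load("model.onnx")
dataset = [{"input": np.random.rand(1, 3, 224, 224).astype(np.float32)} for _ in range(8)]
ranges = calibrate(model, dataset)
# `ranges` maps tensor names to (min, max) pairs and can be passed to quantize()
# as its dynamic_ranges argument.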

furiosa.quantizer.frontend.onnx.calibrate.calibrate_with_random_data(model: onnx.onnx_ml_pb2.ModelProto, dataset_size: int = 8, augmented_model_path: Optional[str] = None) → Dict[str, Tuple[float, float]]

Estimates the range of tensors in a model, based on a random dataset.

Args:

model: An ONNX model to calibrate.
dataset_size: The size of the random dataset to use.
augmented_model_path: A path to save an augmented model to.

Returns:

A dict mapping tensors in the model to their minimum and maximum values.
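
A minimal sketch, useful when no representative calibration data is at hand; the model path is a placeholder:

import onnx

from furiosa.quantizer.frontend.onnx.calibrate import calibrate_with_random_data

model = onnx.load("model.onnx")
ranges = calibrate_with_random_data(model, dataset_size=16)

Ranges estimated from random inputs are only a rough stand-in for real calibration data, so prefer calibrate() with a representative dataset when accuracy matters.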

Module contents

furiosa.quantizer.frontend.onnx.export_spec(model: onnx.onnx_ml_pb2.ModelProto, output: IO[str])
furiosa.quantizer.frontend.onnx.optimize_model(model: onnx.onnx_ml_pb2.ModelProto, input_shapes: Optional[Dict[str, List[int]]] = None) → onnx.onnx_ml_pb2.ModelProto
furiosa.quantizer.frontend.onnx.post_training_quantization_with_random_calibration(model: onnx.onnx_ml_pb2.ModelProto, per_channel: bool, static: bool, mode: furiosa.quantizer.frontend.onnx.quantizer.utils.QuantizationMode, num_data: int = 8) → onnx.onnx_ml_pb2.ModelProto
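
A minimal sketch combining optimize_model and post_training_quantization_with_random_calibration; the model path is a placeholder, and QuantizationMode.DFG is an assumption about the members of furiosa.quantizer.frontend.onnx.quantizer.utils.QuantizationMode (check the enum in your installed version):

import onnx

from furiosa.quantizer.frontend.onnx import (
    optimize_model,
    post_training_quantization_with_random_calibration,
)
from furiosa.quantizer.frontend.onnx.quantizer.utils import QuantizationMode

# Optimize the float model first, then quantize it using randomly generated calibration data.
model = optimize_model(onnx.load("model.onnx"))
quantized = post_training_quantization_with_random_calibration(
    model,
    per_channel=True,
    static=True,
    mode=QuantizationMode.DFG,  # assumed mode; see QuantizationMode for the available options
    num_data=8,
)
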
furiosa.quantizer.frontend.onnx.post_training_quantize(model: onnx.onnx_ml_pb2.ModelProto, dataset: List[Dict[str, numpy.ndarray]], per_channel: bool = True) → onnx.onnx_ml_pb2.ModelProto

Post-training-quantizes an ONNX model with a calibration dataset.

Args:

model: An ONNX model to quantize.
dataset: A calibration dataset.
per_channel: If per_channel is True, Conv’s filters are per-channel quantized. Otherwise, they are per-tensor quantized.

Returns:

An ONNX model post-training-quantized with the calibration dataset.
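
A minimal sketch of this end-to-end entry point; the model paths, input name, and tensor shapes are placeholders:

import numpy as np
import onnx

from furiosa.quantizer.frontend.onnx import post_training_quantize

model = onnx.load("model.onnx")
dataset = [{"input": np.random.rand(1, 3, 224, 224).astype(np.float32)} for _ in range(8)]
quantized = post_training_quantize(model, dataset, per_channel=True)
onnx.save(quantized, "model_quantized.onnx")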

furiosa.quantizer.frontend.onnx.quantize(model: onnx.onnx_ml_pb2.ModelProto, per_channel: bool, static: bool, mode: furiosa.quantizer.frontend.onnx.quantizer.utils.QuantizationMode, dynamic_ranges: Dict[str, Tuple[float, float]]) → onnx.onnx_ml_pb2.ModelProto
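
A minimal sketch of the lower-level flow this function enables: optimize, calibrate, then quantize with explicit dynamic ranges. The model path, input name, and shapes are placeholders, and QuantizationMode.DFG is an assumption about the available modes:

import numpy as np
import onnx

from furiosa.quantizer.frontend.onnx import optimize_model, quantize
from furiosa.quantizer.frontend.onnx.calibrate import calibrate
from furiosa.quantizer.frontend.onnx.quantizer.utils import QuantizationMode

model = optimize_model(onnx.load("model.onnx"))
dataset = [{"input": np.random.rand(1, 3, 224, 224).astype(np.float32)} for _ in range(8)]
dynamic_ranges = calibrate(model, dataset)
quantized = quantize(
    model,
    per_channel=True,
    static=True,
    mode=QuantizationMode.DFG,  # assumed mode; see QuantizationMode for alternatives
    dynamic_ranges=dynamic_ranges,
)

Splitting calibration from quantization like this lets the same dynamic ranges be reused across quantization settings without re-running the calibration dataset.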