furiosa.quantizer.frontend.onnx package
Subpackages
- furiosa.quantizer.frontend.onnx.quantizer package
- furiosa.quantizer.frontend.onnx.transformer package
- Subpackages
- Submodules
- furiosa.quantizer.frontend.onnx.transformer.convert_2d_sum_to_add module
- furiosa.quantizer.frontend.onnx.transformer.convert_conv1d_to_conv2d module
- furiosa.quantizer.frontend.onnx.transformer.eliminate_redundant_shape_pattern module
- furiosa.quantizer.frontend.onnx.transformer.fuse_batchnorm module
- furiosa.quantizer.frontend.onnx.transformer.fuse_conv module
- furiosa.quantizer.frontend.onnx.transformer.fuse_depth_to_space module
- furiosa.quantizer.frontend.onnx.transformer.fuse_gather_matmul module
- furiosa.quantizer.frontend.onnx.transformer.fuse_gelu module
- furiosa.quantizer.frontend.onnx.transformer.fuse_layer_normalization module
- furiosa.quantizer.frontend.onnx.transformer.fuse_lp_normalization module
- furiosa.quantizer.frontend.onnx.transformer.fuse_pad module
- furiosa.quantizer.frontend.onnx.transformer.fuse_redundant_reshape_pattern module
- furiosa.quantizer.frontend.onnx.transformer.infer_squeeze_axes module
- furiosa.quantizer.frontend.onnx.transformer.polish_model module
- furiosa.quantizer.frontend.onnx.transformer.utils module
- Module contents
- furiosa.quantizer.frontend.onnx.utils package
Submodules
furiosa.quantizer.frontend.onnx.calibrate module
- class furiosa.quantizer.frontend.onnx.calibrate.CalibrationDataReaderForIterator(iterator: Iterator[Dict[str, numpy.ndarray]])
Bases: onnxruntime.quantization.calibrate.CalibrationDataReader
A CalibrationDataReader that wraps an iterator of dicts mapping input tensor names to their values.
- get_next()
Generates the next input data dict for an ONNX InferenceSession run.
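Example (a minimal sketch; the input name "input" and the shape are placeholders, substitute your model's actual inputs):

import numpy as np
from furiosa.quantizer.frontend.onnx.calibrate import CalibrationDataReaderForIterator

# An iterator of input-name -> value dicts; get_next() returns them one by one
# and, per the onnxruntime CalibrationDataReader contract, None when exhausted.
samples = iter([{"input": np.random.rand(1, 3, 224, 224).astype(np.float32)} for _ in range(4)])
reader = CalibrationDataReaderForIterator(samples)
batch = reader.get_next()  # the first dict, or None if the iterator is empty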
- exception furiosa.quantizer.frontend.onnx.calibrate.CalibrationError
Bases: Exception
The base class for all exceptions that are related to calibration.
- furiosa.quantizer.frontend.onnx.calibrate.calibrate(model: onnx.onnx_ml_pb2.ModelProto, dataset: Iterable[Dict[str, numpy.ndarray]], augmented_model_path: Optional[str] = None) → Dict[str, Tuple[float, float]]
Estimates the range of tensors in a model, based on a dataset.
- Parameters
model – An ONNX model to calibrate.
dataset – An Iterable that returns dicts mapping input tensor names to their values.
augmented_model_path – A path to save an augmented model to.
- Returns
A dict mapping tensors in the model to their minimum and maximum values.
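Example (a minimal sketch; the model path, input name, and shape are placeholders):

import numpy as np
import onnx
from furiosa.quantizer.frontend.onnx.calibrate import calibrate

model = onnx.load("model.onnx")  # placeholder path
# Calibration dataset: an iterable of dicts mapping input tensor names to values.
dataset = [{"input": np.random.rand(1, 3, 224, 224).astype(np.float32)} for _ in range(8)]
ranges = calibrate(model, dataset)
for tensor_name, (minimum, maximum) in ranges.items():
    print(tensor_name, minimum, maximum)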
- furiosa.quantizer.frontend.onnx.calibrate.calibrate_with_random_data(model: onnx.onnx_ml_pb2.ModelProto, dataset_size: int = 8, augmented_model_path: Optional[str] = None) → Dict[str, Tuple[float, float]]
Estimates the range of tensors in a model, based on a random dataset.
- Parameters
model – An ONNX model to calibrate.
dataset_size – The size of the random dataset to use.
augmented_model_path – A path to save an augmented model to.
- Returns
A dict mapping tensors in the model to their minimum and maximum values.
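Example (a minimal sketch; the model path is a placeholder):

import onnx
from furiosa.quantizer.frontend.onnx.calibrate import calibrate_with_random_data

model = onnx.load("model.onnx")  # placeholder path
# Estimate tensor ranges from 8 randomly generated input samples.
ranges = calibrate_with_random_data(model, dataset_size=8)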
Module contents
- furiosa.quantizer.frontend.onnx.optimize_model(model: onnx.onnx_ml_pb2.ModelProto, input_shapes: Optional[Dict[str, List[int]]] = None) → onnx.onnx_ml_pb2.ModelProto
- furiosa.quantizer.frontend.onnx.parse_onnx_graph(model: onnx.onnx_ml_pb2.ModelProto) → Tuple[Dict[str, onnx.onnx_ml_pb2.ValueInfoProto], Dict[str, onnx.onnx_ml_pb2.NodeProto], Dict[str, List[onnx.onnx_ml_pb2.NodeProto]]]
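Example (a minimal sketch of the two functions above; the model path, input name, and shape are placeholders, and the semantics of the three mappings returned by parse_onnx_graph are only indicated by the return annotation):

import onnx
from furiosa.quantizer.frontend.onnx import optimize_model, parse_onnx_graph

model = onnx.load("model.onnx")  # placeholder path
# Optionally pin input shapes, e.g. for a model with dynamic dimensions.
optimized = optimize_model(model, input_shapes={"input": [1, 3, 224, 224]})
# Three name-keyed views over the graph: ValueInfoProto, NodeProto,
# and lists of NodeProto, per the return annotation.
value_info_map, node_map, node_list_map = parse_onnx_graph(optimized)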
- furiosa.quantizer.frontend.onnx.post_training_quantization_with_random_calibration(model: onnx.onnx_ml_pb2.ModelProto, per_channel: bool, static: bool, mode: furiosa.quantizer.frontend.onnx.quantizer.utils.QuantizationMode, num_data: int = 8, check_idempotency: bool = False) → onnx.onnx_ml_pb2.ModelProto
- furiosa.quantizer.frontend.onnx.post_training_quantize(model: onnx.onnx_ml_pb2.ModelProto, dataset: List[Dict[str, numpy.ndarray]], per_channel: bool = True, check_idempotency: bool = False) → onnx.onnx_ml_pb2.ModelProto
Post-training-quantizes an ONNX model with a calibration dataset.
- Parameters
model – An ONNX model to quantize.
dataset – A calibration dataset.
per_channel – If per_channel is True, Conv’s filters are per-channel quantized. Otherwise, they are per-tensor quantized.
- Returns
An ONNX model post-training-quantized with the calibration dataset.
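Example (a minimal sketch; the model paths, input name, and shape are placeholders):

import numpy as np
import onnx
from furiosa.quantizer.frontend.onnx import post_training_quantize

model = onnx.load("model.onnx")  # placeholder path
# Calibration dataset: dicts mapping input tensor names to sample values.
dataset = [{"input": np.random.rand(1, 3, 224, 224).astype(np.float32)} for _ in range(8)]
quantized = post_training_quantize(model, dataset, per_channel=True)
onnx.save(quantized, "model_quantized.onnx")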
- furiosa.quantizer.frontend.onnx.quantize(model: onnx.onnx_ml_pb2.ModelProto, per_channel: bool, static: bool, mode: furiosa.quantizer.frontend.onnx.quantizer.utils.QuantizationMode, dynamic_ranges: Dict[str, Tuple[float, float]]) → onnx.onnx_ml_pb2.ModelProto
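Example (a minimal sketch tying calibration to quantization; the model path is a placeholder, and the QuantizationMode member used below is an assumption, check furiosa.quantizer.frontend.onnx.quantizer.utils.QuantizationMode for the modes available in your installed version):

import onnx
from furiosa.quantizer.frontend.onnx import quantize
from furiosa.quantizer.frontend.onnx.calibrate import calibrate_with_random_data
from furiosa.quantizer.frontend.onnx.quantizer.utils import QuantizationMode

model = onnx.load("model.onnx")  # placeholder path
# Estimate dynamic ranges, then quantize statically with per-channel weights.
dynamic_ranges = calibrate_with_random_data(model, dataset_size=8)
quantized = quantize(
    model,
    per_channel=True,
    static=True,
    mode=QuantizationMode.DFG,  # assumed member name; verify against your installation
    dynamic_ranges=dynamic_ranges,
)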