********************************************************* Release Notes - 0.9.0 ********************************************************* Furiosa SDK 0.9.0 is a major release, including many performance enhancements, additional functions, and bug fixes. In partcular, 0.9.0 release includes the significant improvements of the quantization tools. .. list-table:: Component Version Information :widths: 200 50 :header-rows: 1 * - Package Name - Version * - NPU Driver - 1.7.0 * - NPU Firmware Tools - 1.4.0 * - NPU Firmware Image - 1.7.0 * - HAL (Hardware Abstraction Layer) - 0.11.0 * - Furiosa Compiler - 0.9.0 * - Python SDK (furiosa-runtime, furiosa-server, furiosa-serving, furiosa-quantizer, ..) - 0.9.0 * - NPU Management CLI (furiosactl) - 0.11.0 * - NPU Device Plugin - 0.10.1 * - NPU Feature Discovery - 0.2.0 Installing the latest SDK -------------------------------------------------------- If you are using APT repository, the upgrade process is simpler. .. code-block:: sh apt-get update && apt-get upgrade If you wish to designate a specific package for upgrade, execute as below: You can find more details about APT repository setup at :ref:`RequiredPackages`. .. code-block:: sh apt-get update && \ apt-get install -y furiosa-driver-pdma furiosa-libhal-warboy furiosa-libnux furiosactl You can upgrade firmware as follows: .. code-block:: sh apt-get update && \ apt-get install -y furiosa-firmware-tools furiosa-firmware-image You can upgrade Python package as follows: .. code-block:: sh pip install --upgrade pip setuptools wheel pip install --upgrade furiosa-sdk .. warning:: When installing or upgrading the furiosa-sdk without updating pip to the latest version, you may encounter the following errors. .. code-block:: sh ERROR: Could not find a version that satisfies the requirement furiosa-quantizer-impl==0.9.* (from furiosa-quantizer==0.9.*->furiosa-sdk) (from versions: none) ERROR: No matching distribution found for furiosa-quantizer-impl==0.9.* (from furiosa-quantizer==0.9.*->furiosa-sdk) Major changes -------------------------------------------------------- Quantization tool ================================================================ Quantization tool is a library that converts a pre-trained model to a quantized model. You can refer to more details at :ref:`ModelQuantization` 0.9.0 release includes the API improvement and new calibration methods, possibly leading to better accuracy. * Added new quantization-related APIs that are more flexible and solid. (``furiosa.quantizer`` , ``furiosa.optimizer`` ) .. code-block:: python optimized_onnx_model = optimize_model(source_onnx_model) calibrator = Calibrator(optimized_onnx_model, CalibrationMethod.MIN_MAX_ASYM) for calibration_data, _ in tqdm.tqdm(calibration_dataloader, desc="Calibration", unit="images", mininterval=0.5): calibrator.collect_data([[calibration_data.numpy()]]) ranges = calibrator.compute_range() quantizated_graph = quantize(optimized_onnx_model, ranges) * Added an option to decide whether to perform quantize at the beginning of the model. * Instead of ``without_quantize`` being removed from the compiler options, it can be specified via the argument ``with_quantize`` to the ``quantize`` function. * The ``normalized_pixel_outputs`` argument to the ``quantize`` function can be set to convert the model output to uint8 instead of dequantizing to fp32. * A tensor with an element range of ``(0. , 1.)`` can be optimized to convert to pixel data in uint8. * Provides more calibration methods. .. list-table:: Supported Calibration Methods :widths: 300 50 50 :header-rows: 1 * - Calibration Method - Asymmetric - QuasiSymmetric * - Min-Max - MIN_MAX_ASYM - MIN_MAX_SYM * - Entropy - ENTROPY_ASYM - ENTROPY_SYM * - Percentile - PERCENTILE_ASYM - PERCENTILE_SYM * - Mean squared error - MSE_ASYM - MSE_SYM * - Signal-to-quantization-noise ratio - SQNR_ASYM - SQNR_SYM To ensure the effectiveness of new calibration methods, we measured the accuracy of 10 popular models with the new calibration methods. Among them, 8 models showed better accuracy than the existing calibration methods. For example, the accuracy of EfficientNet-B0 increased by 57.452%. With the min-max calibration method, EfficientNet-B0 had an accuracy of 16.104%. In contrast, with the percentile calibration method, the accuracy was 73.556%. The details of the experiment results can be found at :ref:`QuantizationAccuracyTable`. For more information on installing and using the new quantizer, you can refer to the following examples. * `Tutorial: How to use Furiosa SDK from Start to Finish `_ Compiler ============== * Added acceleration support for operators Lower, Unlower * Added acceleration support for operator Dequantize * Support for executing binaries that are larger than the hardware's instruction memory * Improved scheduler and memory allocator to eliminate unnecessary I/O * Various improvements optimize compilation for better execution performance furiosa-toolkit ================================================================ The ``furiosactl`` command-line tool included in the furiosa-toolkit 0.11.0 release includes improvements to the includes the following major improvements The newly added ``furiosactl top`` command is used to view utilization by NPU device over time. .. code-block:: sh $ furiosactl top --interval 200 NOTE: furiosa top is under development. Usage and output formats may change. Please enter Ctrl+C to stop. Datetime PID Device NPU(%) Comp(%) I/O(%) Command 2023-03-21T09:45:56.699483936Z 152616 npu1pe0-1 19.06 100.00 0.00 ./npu_runtime_test -n 10000 results/ResNet-CTC_kor1_200_nightly3_128dpes_8batches.enf 2023-03-21T09:45:56.906443888Z 152616 npu1pe0-1 51.09 93.05 6.95 ./npu_runtime_test -n 10000 results/ResNet-CTC_kor1_200_nightly3_128dpes_8batches.enf 2023-03-21T09:45:57.110489333Z 152616 npu1pe0-1 46.40 97.98 2.02 ./npu_runtime_test -n 10000 results/ResNet-CTC_kor1_200_nightly3_128dpes_8batches.enf 2023-03-21T09:45:57.316060982Z 152616 npu1pe0-1 51.43 100.00 0.00 ./npu_runtime_test -n 10000 results/ResNet-CTC_kor1_200_nightly3_128dpes_8batches.enf 2023-03-21T09:45:57.521140588Z 152616 npu1pe0-1 54.28 94.10 5.90 ./npu_runtime_test -n 10000 results/ResNet-CTC_kor1_200_nightly3_128dpes_8batches.enf 2023-03-21T09:45:57.725910558Z 152616 npu1pe0-1 48.93 98.93 1.07 ./npu_runtime_test -n 10000 results/ResNet-CTC_kor1_200_nightly3_128dpes_8batches.enf 2023-03-21T09:45:57.935041998Z 152616 npu1pe0-1 47.91 100.00 0.00 ./npu_runtime_test -n 10000 results/ResNet-CTC_kor1_200_nightly3_128dpes_8batches.enf 2023-03-21T09:45:58.13929122Z 152616 npu1pe0-1 49.06 94.94 5.06 ./npu_runtime_test -n 10000 results/ResNet-CTC_kor1_200_nightly3_128dpes_8batches.enf The ``furiosactl info`` command has been improved to display concise information about each device. As before, you can enter the ``--full`` option if you want to see more information about a device. .. code-block:: $ furiosactl info +------+--------+----------------+-------+--------+--------------+ | NPU | Name | Firmware | Temp. | Power | PCI-BDF | +------+--------+----------------+-------+--------+--------------+ | npu1 | warboy | 1.6.0, 3c10fd3 | 54°C | 0.99 W | 0000:44:00.0 | +------+--------+----------------+-------+--------+--------------+ $ furiosactl info --full +------+--------+--------------------------------------+-------------------+----------------+-------+--------+--------------+---------+ | NPU | Name | UUID | S/N | Firmware | Temp. | Power | PCI-BDF | PCI-DEV | +------+--------+--------------------------------------+-------------------+----------------+-------+--------+--------------+---------+ | npu1 | warboy | 00000000-0000-0000-0000-000000000000 | WBYB0000000000000 | 1.6.0, 3c10fd3 | 54°C | 0.99 W | 0000:44:00.0 | 511:0 | +------+--------+--------------------------------------+-------------------+----------------+-------+--------+--------------+---------+ More information about installing and using ``furiosactl`` can be found in :ref:`Toolkit`.