Command Line Tools Quickstart

FuriosaAI SDK provides a variety of command line tools to help compile, optimize and quantize models.

Since some command line tools function based on FuriosaAI’s web services, you need to set up an API key to use all the features. If you have not yet set up a key, refer to API Key Settings.

Installation

If the Python execution environment has not been set up, refer to Configuring the Python SDK Runtime Environment.

Install Command Line Tools

$ pip install 'furiosa-sdk[cli, validator]'

Verifying the Installation

$ furiosa version
Server version: 0.2.0 (rev: a4bbbb1b4 built_at: 2021-03-15T21:20:59Z)
Client version: 0.1.0

Command Quickstart

Checking the toolchain version

The toolchain command provides additional functions related to the toolchain. Currently, it provides a function to print the version of the provided toolchains.

$ furiosa toolchain list

Available Toolchains:
[0] 0.1.0 (rev: 952707e5f built_at: 2020-12-15 23:38:22)

Model Compilation

The compile command creates a program that uses the FuriosaAI NPU by compiling models in TFLite and ONNX formats. (Is it using the NPU to compile?*)

$ furiosa compile \
  test_data/MNISTnet_uint8_quant_without_softmax.tflite

output.enf has been generated (elapsed: 513.661 ms)

The -o option specifies the location where the generated program binary will be saved.

$ furiosa compile \
  MNISTnet_uint8_quant_without_softmax.tflite \
  -o /tmp/mnist.enf

mnist.enf has been generated (elapsed: 513.661 ms)

The --conf option allows you to use yaml files when specifying different compiler options.

$ furiosa compile \
  test_data/MNISTnet_uint8_quant_without_softmax.tflite \
  --config test_data/compiler_config.yml

outout.enf has been generated (elapsed: 513.661 ms)

Checking the Model Fit

The validate command takes a TFLite or ONNX as an argument, automatically quantizes it, and tries to compile to a final binary. Success indicates that the model can be run on the FuriosaAI NPU..

$ furiosa validate yolov4.onxx

Passed!
$ furiosa validate efficientnet-lite4-11.onnx

Failed. The result has been written to validation.txt

$ cat validation.txt

  Stdout:
  [Step 1] Checking if the model can be transformed into a quantized model ...
  Stderr:
  /root/miniconda3/envs/furiosa/lib/python3.8/site-packages/onnx/__init__.py:97: RuntimeWarning: Unexpected end-group tag: Not all data was converted
      decoded = cast(Optional[int], proto.ParseFromString(s))
  [Step 1] Failed

Compilation Analysis and Optimization Tool

Both additional options given during compilation and separate commands generate NPU utilization level and model compilation process report for the compiled model.

The --mem-allow-report option for the compile command generates a memory allocation report in HTML format.

$ furiosa compile \
  MNISTnet_uint8_quant_without_softmax.tflite \
  -o /tmp/mnist.enf \
  --mem-alloc-report ./mem-report.html

The figure below is an example of the memory allocation report output.

Memory Allocation Report

The perfeye command predicts and outputs the compiled model’s NPU utilization in operator units.

$ furiosa perfeye \
  MNISTnet_uint8_quant_without_softmax.tflite \
  -o output.html

output.html has been generated (elapsed: 510.783 ms)

The figure below is an example of a perfeye report.

Perfeye Report