Navigating Models from FuriosaAI Model Zoo¶
FuriosaAI's Software Stack¶
FuriosaAI's software stack supports a wide range of deep learning models, with a primary focus on vision tasks. Within this stack, the FuriosaAI Compiler optimizes Deep Neural Network (DNN) models and generates executable code for the FuriosaAI NPU. It currently accepts TFLite and ONNX models and applies up-to-date optimization techniques. The compiler accelerates vision-related operators on the NPU and falls back to the CPU for unsupported operations.
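To see how these pieces fit together, here is a minimal sketch that hands an ONNX model to the runtime, which invokes the FuriosaAI Compiler under the hood. The model.onnx path is hypothetical, and passing an ONNX file path directly to create_runner is an assumption about your installed SDK version; the rest of this page uses the pre-built Model Zoo models instead.
from furiosa.runtime.sync import create_runner
# Hypothetical ONNX model; the runtime compiles it for the Warboy NPU when the runner is created
with create_runner("model.onnx") as runner:
    # Inspect the compiled model's input tensors
    print(runner.model.inputs())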
Vision Models and Beyond¶
FuriosaAI's first-generation NPU, Warboy, is specialized for vision tasks. It accelerates popular vision models such as ResNet50, SSD-MobileNet, and EfficientNet, and also lets users build custom models from the supported operators. This flexibility makes it possible to generate highly optimized NPU-ready code for a wide variety of vision workloads.
Exploring Vision Models¶
For easy exploration of vision models tailored for FuriosaAI's NPU, navigate to the furiosa.models.vision module. Here, you'll find a curated selection of models that have been optimized for efficient deployment on the FuriosaAI Warboy NPU.
from furiosa.models import vision
# List of available vision models
print(dir(vision))
['EfficientNetB0', 'EfficientNetV2s', 'ResNet50', 'SSDMobileNet', 'SSDResNet34', 'YOLOv5l', 'YOLOv5m', 'YOLOv7w6Pose']
# Alternatively, use the command-line tool to list the models
! furiosa-models list
Model name | Model description | Task type | Available postprocesses |
---|---|---|---|
ResNet50 | MLCommons ResNet50 model | Image Classification | Python |
SSDMobileNet | MLCommons MobileNet v1 model | Object Detection | Python, Rust |
SSDResNet34 | MLCommons SSD ResNet34 model | Object Detection | Python, Rust |
YOLOv5l | YOLOv5 Large model | Object Detection | Rust |
YOLOv5m | YOLOv5 Medium model | Object Detection | Rust |
EfficientNetB0 | EfficientNet B0 model | Image Classification | Python |
EfficientNetV2s | EfficientNetV2-s model | Image Classification | Python |
YOLOv7w6Pose | YOLOv7 w6 Pose Estimation model | Pose Estimation | Python |
Now, let's instantiate one of the vision models and take a closer look at its attributes.
model = vision.ResNet50()
print(model)
# Display the static fields of the model
print("Static fields:", list(model.model_fields.keys()))
# Show the lazy-loaded fields of the model
print("Lazy loaded fields:", list(model.model_computed_fields.keys()))
name='ResNet50' task_type=<ModelTaskType.IMAGE_CLASSIFICATION: 'IMAGE_CLASSIFICATION'> format=<Format.ONNX: 'ONNX'> family='ResNet' version='v1.5' metadata=Metadata(description='ResNet50 v1.5 int8 ImageNet-1K', publication=Publication(authors=None, title=None, publisher=None, date=None, url='https://arxiv.org/abs/1512.03385.pdf')) tags=None
Static fields: ['name', 'task_type', 'format', 'family', 'version', 'metadata', 'tags', 'preprocessor', 'postprocessor']
Lazy loaded fields: ['origin', 'tensor_name_to_range']
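Lazy-loaded fields are materialized only when first accessed, which keeps model instantiation cheap. The following is a rough sketch; the return types noted in the comments are assumptions based on the field names rather than guarantees of the API.
# Accessing a lazy-loaded field triggers its download/derivation on first use
origin = model.origin                  # assumed: the original, pre-compilation model artifact
ranges = model.tensor_name_to_range    # assumed: a mapping of tensor names to calibration ranges
print(type(origin))
print(len(ranges))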
# Moreover, you can view the same static fields using the command-line tool:
! furiosa-models desc ResNet50
libfuriosa_hal.so --- v0.11.0, built @ 43c901f
name: ResNet50
format: ONNX
family: ResNet
version: v1.5
metadata:
  description: ResNet50 v1.5 int8 ImageNet-1K
  publication:
    url: https://arxiv.org/abs/1512.03385.pdf
task type: Image Classification
available postprocess versions: Python
Acquire the ENF Binary with model_source()¶
FuriosaAI's Model object offers a model_source() method that returns the ENF (FuriosaAI Compiler-specific format) binary for the model. This ENF binary can be used directly for further processing or deployment without recompilation, which saves the time and resources the compilation step would otherwise require.
Using model_source() is straightforward: call it on a Model object and it returns the ENF binary. Its num_pe parameter (default 2) specifies the number of processing elements (PEs) to use; set it to 1 to target a single PE, or keep the default to use both PEs fused together. This lets you tailor deployment to single-PE or fused 2-PE scenarios.
Here's an example of how to use model_source():
from furiosa.runtime.sync import create_runner
model_source = model.model_source(num_pe=2)
# Create a runner with the model source
with create_runner(model_source) as runner:
    # Print model inputs metadata
    print(runner.model.inputs())
    # Run inferences, ...
    ...
libfuriosa_hal.so --- v0.11.0, built @ 43c901f
:-) Finished in 0.000006756s
[TensorDesc(shape=(1, 3, 224, 224), dtype=UINT8, format=NCHW, size=150528, len=150528)]
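If you want to run the model on a single PE instead, for example to serve independent model instances side by side on one Warboy, the same flow applies; this is a minimal sketch that only changes num_pe.
# Obtain an ENF binary compiled for a single PE rather than the fused 2-PE configuration
single_pe_source = model.model_source(num_pe=1)
with create_runner(single_pe_source) as runner:
    print(runner.model.inputs())
    # Run inferences, ...
    ...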