furiosa.runtime package

Submodules

furiosa.runtime.compiler module

furiosa.runtime.compiler.generate_compiler_log_path() pathlib.Path

Generate a log path for compilation log

furiosa.runtime.consts module

furiosa.runtime.envs module

furiosa.runtime.envs.current_npu_device() str

Return the current npu device name

Returns

NPU device name

furiosa.runtime.envs.is_compile_log_enabled() bool

Return whether the compile log is enabled.

Returns

True if the compile log is enabled, or False.

furiosa.runtime.envs.log_dir() str

Return FURIOSA_LOG_DIR where the logs are stored.

Returns

The log directory of furiosa sdk

furiosa.runtime.envs.profiler_output() None | str

Return FURIOSA_PROFILER_OUTPUT_PATH, the path where profiler outputs are written.

For compatibility, NUX_PROFILER_PATH is also currently supported, but it will be deprecated in favor of FURIOSA_PROFILER_OUTPUT_PATH.

Returns

The file path of profiler output if specified, or None.
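
The fallback between the two variables can be sketched with the standard library alone. This is an illustration, not the SDK's implementation: the helper name resolve_profiler_output is hypothetical, and giving FURIOSA_PROFILER_OUTPUT_PATH precedence over NUX_PROFILER_PATH is an assumption.

```python
import os


def resolve_profiler_output(environ=None):
    """Return the profiler output path from the environment, or None.

    Assumption: the newer variable wins when both are set.
    """
    env = os.environ if environ is None else environ
    for name in ("FURIOSA_PROFILER_OUTPUT_PATH", "NUX_PROFILER_PATH"):
        value = env.get(name)
        if value:  # treat an unset or empty variable as "not specified"
            return value
    return None
```

Passing a plain dict instead of os.environ makes the lookup easy to exercise without touching the real process environment.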

furiosa.runtime.envs.xdg_state_home() str

Return XDG_STATE_HOME which is the base directory of furiosa logs, history, and other states

Returns

Furiosa home directory

furiosa.runtime.errors module

Nux Exception and Error

exception furiosa.runtime.errors.ApiClientInitFailed

Bases: furiosa.runtime.errors.NativeException

When the API client fails to initialize due to API keys or other reasons

exception furiosa.runtime.errors.CompilationFailed

Bases: furiosa.runtime.errors.NativeException

when Nux fails to compile a given model image to NPU model binary

exception furiosa.runtime.errors.DeviceBusy

Bases: furiosa.runtime.errors.NativeException

The device is already occupied

exception furiosa.runtime.errors.IncompatibleApiClientError

Bases: furiosa.runtime.errors.NativeException

When the API client and the server are incompatible

exception furiosa.runtime.errors.IncompatibleModel

Bases: furiosa.runtime.errors.NativeException

When Renegade compiler cannot recognize a given model image binary

exception furiosa.runtime.errors.InternalError(cause='unknown')

Bases: furiosa.runtime.errors.NativeException

internal error or no corresponding error in Python binding

exception furiosa.runtime.errors.InvalidCompilerConfig

Bases: furiosa.runtime.errors.NativeException

Compiler config is invalid

exception furiosa.runtime.errors.InvalidInput(message: str = 'Invalid input tensors')

Bases: furiosa.common.error.FuriosaError

When input tensors are invalid for any reason

exception furiosa.runtime.errors.InvalidSessionOption

Bases: furiosa.runtime.errors.NativeException

When the given session options are invalid

exception furiosa.runtime.errors.InvalidYamlException

Bases: furiosa.runtime.errors.NativeException

When a given YAML input is invalid

class furiosa.runtime.errors.NativeError(value)

Bases: enum.IntEnum

Python object corresponding to nux_error_t in Nux C API

API_CLIENT_INIT_FAILED = 18
COMPILATION_FAILED = 14
DEVICE_BUSY = 23
DUMP_PROFILE_FAILED = 10
GET_TASK_FAILED = 9
INCOMPATIBLE_API_CLIENT_ERROR = 17
INCOMPATIBLE_MODEL = 13
INTERNAL_ERROR = 15
INVALID_BUFFER = 6
INVALID_COMPILER_CONFIG = 30
INVALID_INPUTS = 7
INVALID_INPUT_INDEX = 4
INVALID_OUTPUTS = 8
INVALID_OUTPUT_INDEX = 5
INVALID_SESSION_OPTIONS = 21
INVALID_YAML = 16
MODEL_DEPLOY_FAILED = 2
MODEL_EXECUTION_FAILED = 3
NO_API_KEY = 19
NULL_POINTER_EXCEPTION = 20
NUX_CREATION_FAILED = 1
QUEUE_NO_DATA = 12
QUEUE_WAIT_TIMEOUT = 11
SESSION_TERMINATED = 22
SUCCESS = 0
TENSOR_NAME_NOT_FOUND = 24
UNSUPPORTED_FEATURE = 25

exception furiosa.runtime.errors.NativeException(message: str, native_err: Optional[furiosa.runtime.errors.NativeError] = None)

Bases: furiosa.common.error.FuriosaError

general exception caused by Nuxpy

native_error() Optional[furiosa.runtime.errors.NativeError]

Return a native error if this exception comes from C native extension
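
A minimal sketch of how a caller might use native_error() when reporting failures. The NativeError class below is a local stand-in that copies three documented values so the snippet runs without the SDK, and describe is a hypothetical helper, not part of furiosa.runtime.

```python
from enum import IntEnum


class NativeError(IntEnum):
    """Local stand-in mirroring a few documented NativeError values."""

    SUCCESS = 0
    QUEUE_WAIT_TIMEOUT = 11
    DEVICE_BUSY = 23


def describe(exc):
    """Format an exception, appending the native code when one is attached."""
    getter = getattr(exc, "native_error", None)
    code = getter() if callable(getter) else None
    if code is None:
        # Either not a NativeException-like object, or no native error attached.
        return f"{type(exc).__name__}: {exc}"
    return f"{type(exc).__name__}: {exc} (native error {code.name}={int(code)})"
```

Because native_error() is documented as optional, the helper handles None explicitly rather than assuming every exception carries a code.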

exception furiosa.runtime.errors.NoApiKeyException

Bases: furiosa.runtime.errors.NativeException

When the API client fails to initialize because no API key is available

exception furiosa.runtime.errors.QueueWaitTimeout

Bases: furiosa.runtime.errors.NativeException

Timed out in Completion queue

exception furiosa.runtime.errors.SessionClosed

Bases: furiosa.common.error.FuriosaError

Session is already terminated

exception furiosa.runtime.errors.SessionTerminated

Bases: furiosa.runtime.errors.NativeException

Session is already terminated

exception furiosa.runtime.errors.TensorNameNotFound

Bases: furiosa.runtime.errors.NativeException

When a given tensor name is not found in this model

exception furiosa.runtime.errors.UnsupportedDataType(dtype)

Bases: furiosa.runtime.errors.NativeException

Unsupported tensor data type

exception furiosa.runtime.errors.UnsupportedFeature

Bases: furiosa.runtime.errors.NativeException

Feature is not supported

exception furiosa.runtime.errors.UnsupportedTensorType

Bases: furiosa.runtime.errors.NativeException

Unsupported tensor type

furiosa.runtime.errors.into_exception(err: Union[ctypes.c_int, int]) furiosa.runtime.errors.NativeException

Convert a nux_error_t value from the Nux C API to the corresponding NativeException

Parameters

err (NativeError) – Native error code (a nux_error_t value)

Returns

The NativeException corresponding to err

furiosa.runtime.model module

Model and its methods to access model metadata

class furiosa.runtime.model.Model

Bases: abc.ABC

NPU model binary compiled by Renegade compiler

allocate_inputs() furiosa.runtime.tensor.TensorArray

Creates an array of input tensors with allocated buffers

allocate_outputs() furiosa.runtime.tensor.TensorArray

Creates an array of output tensors with allocated buffers

allocate_tensors(names: List[str]) furiosa.runtime.tensor.TensorArray

Creates an array of tensors corresponding to tensor names with allocated buffers

create_inputs() furiosa.runtime.tensor.TensorArray

Creates an array of input tensors without allocated buffers

create_outputs() furiosa.runtime.tensor.TensorArray

Creates an array of output tensors without allocated buffers

create_tensors(names: List[str]) furiosa.runtime.tensor.TensorArray

Creates an array of tensors corresponding to tensor names without allocated buffers

input(idx) furiosa.runtime.tensor.TensorDesc

Return tensor description of i-th input tensor of Model

property input_num: int

Number of input tensors of Model

inputs() List[furiosa.runtime.tensor.TensorDesc]

Tensor descriptions of all input tensors of Model

output(idx) furiosa.runtime.tensor.TensorDesc

Returns tensor description of i-th output tensor of Model

property output_num: int

Number of output tensors of Model

outputs() List[furiosa.runtime.tensor.TensorDesc]

Tensor descriptions of all output tensors of Model

print_summary()

Prints the summary of this model

summary() str

Returns the summary of this model
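
As an illustration of the metadata accessors above, the following sketch collects one row per input tensor using only input_num and input(idx), plus the name, shape, and dtype properties of TensorDesc. The helper name input_signature is hypothetical, and model is assumed to be any object exposing the Model interface (for example a Session).

```python
def input_signature(model):
    """Collect (index, name, shape, dtype) for every input tensor of a model."""
    rows = []
    for idx in range(model.input_num):
        desc = model.input(idx)  # TensorDesc of the idx-th input
        rows.append((idx, desc.name, tuple(desc.shape), desc.dtype))
    return rows
```

The same loop applies to outputs by substituting output_num and output(idx).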

furiosa.runtime.profiler module

class furiosa.runtime.profiler.ChromeTraceConfig(*, file: io.IOBase = None)

Bases: pydantic.main.BaseModel

ChromeTrace specific config.

file

file descriptor to write profile data. By default, sys.stdout.

class Config

Bases: object

arbitrary_types_allowed = True
extra = 'forbid'
json_encoders = {<class '_io._IOBase'>: <function ChromeTraceConfig.Config.<lambda>>}
file: io.IOBase

class furiosa.runtime.profiler.PandasDataFrameConfig(*, file: int, **extra_data: Any)

Bases: pydantic.main.BaseModel

PandasDataFrame specific config.

file

file descriptor to write profile data. By default, memfd.

class Config

Bases: object

arbitrary_types_allowed = True
extra = 'allow'
file: int

class furiosa.runtime.profiler.RecordFormat(value)

Bases: enum.IntEnum

Profiler format to record profile data.

ChromeTrace = 0
PandasDataFrame = 1

class furiosa.runtime.profiler.Resource(value)

Bases: enum.Flag

Profiler target resource to be recorded.

ALL = 3
CPU = 1
NPU = 2
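
Since Resource is an enum.Flag, its members combine with bitwise operators. The sketch below re-declares the three documented values in a local stand-in class (so it runs without the SDK) to show that ALL is the union of CPU and NPU.

```python
from enum import Flag


class Resource(Flag):
    """Local stand-in with the values documented for profiler.Resource."""

    CPU = 1
    NPU = 2
    ALL = 3


combined = Resource.CPU | Resource.NPU
print(combined == Resource.ALL)       # True
print(Resource.NPU in Resource.ALL)   # True
print(Resource.NPU in Resource.CPU)   # False
```
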
class furiosa.runtime.profiler.profile(resource: furiosa.runtime.profiler.Resource = Resource.ALL, format: furiosa.runtime.profiler.RecordFormat = RecordFormat.ChromeTrace, **config: Any)

Bases: object

Profiler context manager.

Examples

>>> from furiosa.runtime.profiler import RecordFormat, profile
>>> with open("profile.json", "w") as f:
...     with profile(format=RecordFormat.ChromeTrace, file=f) as profiler:
...         # Profiler enabled from here
...         with profiler.record("Inference"):
...             ...  # Profiler recorded with span named 'Inference'

export_chrome_trace(filename)
get_cpu_pandas_dataframe()
get_npu_pandas_dataframe()
get_pandas_dataframe()
get_pandas_dataframe_with_filter(column, value)
print_external_operators()
print_inferences()
print_npu_executions()
print_npu_operators()
print_summary()
record(name: str = '', warm_up=False)

Create profiler span with specified name.

Parameters
  • name (str) – Profiler record span name.

  • warm_up (bool) – If true, do not record profiler result, and just warm up.
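
One way the warm_up flag might be used is to run a few unrecorded iterations before the measured ones. The sketch below is an assumption about usage, not a documented recipe: profile_runs and run_once are hypothetical names, and profiler is expected to be the object yielded by the profile context manager.

```python
def profile_runs(profiler, run_once, warmups=2, iterations=5, name="Inference"):
    """Run warm-up iterations unrecorded, then record the measured ones."""
    for _ in range(warmups):
        # warm_up=True: the span is executed but its result is not recorded.
        with profiler.record(name, warm_up=True):
            run_once()
    for _ in range(iterations):
        with profiler.record(name):
            run_once()
```

Here run_once would typically be a zero-argument callable that performs a single inference.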

furiosa.runtime.session module

Session and its asynchronous API for model inference

class furiosa.runtime.session.AsyncSession(ref: ctypes.c_void_p)

Bases: furiosa.runtime.model.Model

An asynchronous session for a given model, which allows submitting prediction requests

close(*, _LIBNUX=<CDLL 'libnux.so', handle 22aea10>)

Closes this session

After a session is closed, CompletionQueue will return an error if CompletionQueue.recv() is called.

_LIBNUX is only for internal use and not supposed to be specified by a user.

submit(values: Union[numpy.ndarray, numpy.generic, furiosa.runtime.tensor.TensorArray], context: Optional[object] = None) None

Submit a prediction request

It returns immediately without blocking the caller. Once the prediction is completed, the outputs are sent to CompletionQueue.

Parameters
  • values – Input values

  • context – an additional context to identify the prediction request

class furiosa.runtime.session.CompletionQueue(ref: ctypes.c_void_p, context_ty: Optional[type], output_descs: List[furiosa.runtime.tensor.TensorDesc], profiler, profiler_file)

Bases: object

Receives the completion results asynchronously from AsyncSession

close(*, _LIBNUX=<CDLL 'libnux.so', handle 22aea10>)

Closes this completion queue.

If it is closed, AsyncSession also will stop working.

_LIBNUX is only for internal use and not supposed to be specified by a user.

recv(timeout: Optional[int] = None) Tuple[object, furiosa.runtime.tensor.TensorArray]

Receives the prediction results which are asynchronously coming from AsyncSession

If prediction outputs are already available, it returns immediately. Otherwise, it blocks until the next result is ready.

If timeout is set, recv() blocks only until the timeout expires. If it times out, recv() raises a QueueWaitTimeout exception.

If AsyncSession was closed earlier, recv() raises a SessionTerminated exception.

Parameters
  • timeout (int) – How long to wait before giving up. It should be a positive integer in milliseconds.

Returns

A tuple, whose first value is the context value passed when you submit an inference task and the second value is inference output.
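
The submit/recv pairing can be sketched as follows. The helper uses only submit(values, context=...) and recv(timeout=...) as documented above; submit_and_collect is a hypothetical name, and using the submission index as the context value is an illustrative choice. The sketch does not assume that results arrive in submission order.

```python
def submit_and_collect(session, queue, batches, timeout_ms=1000):
    """Submit every batch, then gather outputs in submission order."""
    for index, batch in enumerate(batches):
        # The context travels with the request and comes back from recv().
        session.submit(batch, context=index)
    results = {}
    for _ in range(len(batches)):
        context, outputs = queue.recv(timeout=timeout_ms)
        results[context] = outputs
    return [results[index] for index in range(len(batches))]
```

Any QueueWaitTimeout or SessionTerminated raised by recv() propagates to the caller, which keeps the retry policy outside the helper.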

class furiosa.runtime.session.Session(model: Union[bytes, str, pathlib.Path], device: Optional[str] = None, worker_num: Optional[int] = None, batch_size: Optional[int] = None, compiler_hints: bool = True, compiler_config: Optional[Mapping[str, object]] = None)

Bases: furiosa.runtime.model.Model

Provides a blocking API to run an inference task with a given model

close(*, _LIBNUX=<CDLL 'libnux.so', handle 22aea10>)

Close the session and release all resources belonging to the session

_LIBNUX is only for internal use and not supposed to be specified by a user.

run(inputs: Union[numpy.ndarray, numpy.generic, furiosa.runtime.tensor.TensorArray, List[Union[numpy.ndarray, numpy.generic]]]) furiosa.runtime.tensor.TensorArray

Runs an inference task with inputs

Parameters

inputs – A single runtime.Tensor, runtime.TensorArray, or numpy.ndarray object, or a list of numpy.ndarray objects.

Returns

Inference output

run_with(outputs: List[str], inputs: Dict[str, numpy.ndarray]) furiosa.runtime.tensor.TensorArray

Runs an inference task with named inputs and returns the requested output tensors

Parameters
  • outputs – Names of the tensors to return

  • inputs – A mapping from input tensor name to numpy.ndarray

Returns

Inference output

furiosa.runtime.session.create(model: Union[bytes, str, pathlib.Path, furiosa.registry.model.Model], device: Optional[str] = None, worker_num: Optional[int] = None, batch_size: Optional[int] = None, compiler_config: Optional[Mapping[str, object]] = None, compiler_hints: bool = True) furiosa.runtime.session.Session

Creates a session for a model

Parameters
  • model (bytes, str, Path, Model) – a byte string containing a model image or a path string of a model image file or furiosa.registry.Model

  • device – NPU device (str) (e.g., npu0pe0, npu0pe0-1)

  • worker_num – Number of workers

  • batch_size – Batch size of input tensors

  • compiler_config (Mapping[str, object]) – Compile config

  • compiler_hints – Print compiler hints if True (default: True)

Returns

The session for the given model, which allows running predictions. Session is thread safe.
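
A minimal sketch of the blocking workflow: create a session, run one inference, and close the session even if run() fails. The helper name run_once and its create parameter are hypothetical; create defaults to furiosa.runtime.session.create and exists only so the helper can be exercised without the SDK. The outputs are converted with TensorArray.numpy() before the session is closed, on the assumption that the converted arrays remain usable afterwards.

```python
from contextlib import closing


def run_once(model, inputs, create=None, **options):
    """Create a session, run one inference, and always close the session."""
    if create is None:
        # Imported lazily so the helper can be tested without the SDK installed.
        from furiosa.runtime import session

        create = session.create
    # closing() calls sess.close() on exit, including when run() raises.
    with closing(create(model, **options)) as sess:
        return sess.run(inputs).numpy()
```

Keyword options such as device or batch_size are forwarded unchanged to create.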

furiosa.runtime.session.create_async(model: Union[bytes, str, pathlib.Path, furiosa.registry.model.Model], context_ty: Optional[type] = None, device: Optional[str] = None, worker_num: Optional[int] = None, batch_size: Optional[int] = None, compiler_hints: Optional[bool] = True, input_queue_size: Optional[int] = None, output_queue_size: Optional[int] = None, compiler_config: Optional[Mapping[str, object]] = None) Tuple[furiosa.runtime.session.AsyncSession, furiosa.runtime.session.CompletionQueue]

Creates a pair of the asynchronous session and the completion queue for a given model

Parameters
  • model (bytes, str, Path, Model) – a byte string containing a model image or a path string of a model image file or furiosa.registry.Model

  • context_ty (type) – Type for passing context from AsyncSession to CompletionQueue

  • device – NPU device (str) (e.g., npu0pe0, npu0pe0-1)

  • worker_num – Number of workers

  • batch_size – Batch size of input tensors

  • compiler_hints – Print compiler hints if True (default: True)

  • input_queue_size – The input queue size; it must be > 0 and < 2^31.

  • output_queue_size – The output queue size; it must be > 0 and < 2^31.

  • compiler_config (Mapping[str, object]) – Compile config

Returns

A pair of the asynchronous session and the completion queue. The asynchronous session allows submitting prediction requests for the given model. The completion queue allows users to receive the prediction outputs asynchronously.
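
The documented bounds on input_queue_size and output_queue_size can be checked on the caller's side before a session is created. The helper below is a hypothetical pre-check, not part of the SDK; treating None as valid reflects only that both parameters are Optional in the signature.

```python
def is_valid_queue_size(value):
    """Return True if value is None or an int in the open range (0, 2**31)."""
    if value is None:
        return True  # parameter omitted
    if isinstance(value, bool) or not isinstance(value, int):
        return False  # bool is an int subclass, so reject it explicitly
    return 0 < value < 2**31
```
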

furiosa.runtime.tensor module

Tensor object and its utilities

class furiosa.runtime.tensor.Axis(value)

Bases: enum.IntEnum

Axis of Tensor

BATCH = 3
BATCH_OUTER = 7
CHANNEL = 2
CHANNEL_OUTER = 6
HEIGHT = 1
HEIGHT_OUTER = 5
UNKNOWN = 8
WIDTH = 0
WIDTH_OUTER = 4

class furiosa.runtime.tensor.DataType(value)

Bases: enum.IntEnum

Tensor data type

BFLOAT16 = 5
FLOAT32 = 0
INT32 = 3
INT64 = 4
INT8 = 2
UINT8 = 1
property numpy_dtype

Return the numpy dtype corresponding to this DataType

class furiosa.runtime.tensor.Tensor(ref: ctypes.c_void_p, desc: furiosa.runtime.tensor.TensorDesc, allocated: bool = False)

Bases: object

A tensor which contains data and tensor description including shape

copy_from(data: Union[numpy.ndarray, numpy.generic])

Copy the contents of Numpy ndarray to this tensor

property dtype: furiosa.runtime.tensor.DataType

Data type of tensor

numpy() numpy.ndarray

Return numpy.ndarray converted from this tensor

property numpy_dtype

Return numpy dtype

property shape: tuple

Return the tensor shape

Returns

Tensor shape. An example shape is `(1, 28, 28, 1)`.

view() numpy.ndarray

Return numpy.ndarray view converted from this tensor

class furiosa.runtime.tensor.TensorArray(ref: ctypes.c_void_p, descs: [<class 'furiosa.runtime.tensor.TensorDesc'>], allocated: bool = False)

Bases: object

A list of tensors

It is used for input and output values of model inferences.

is_empty() bool

True if it has no Tensor

numpy() [<class 'numpy.ndarray'>]

Convert TensorArray to a list of numpy.ndarray

view() [<class 'numpy.ndarray'>]

Convert TensorArray to a list of numpy.ndarray view

class furiosa.runtime.tensor.TensorDesc(ref: ctypes.c_void_p)

Bases: object

Tensor description including dimension, shape, and data type

axis(idx: int) furiosa.runtime.tensor.Axis

Axis type of i-th dimension (e.g., width, height, channel)

dim(idx: int) int

Size of i-th dimension

property dtype: furiosa.runtime.tensor.DataType

Data type of tensor

property format: str

Tensor memory layout (e.g., NHWC, NCHW)

property length: int

Number of all elements across all dimensions

property name: Optional[str]
property ndim: int

Number of dimensions

property numpy_dtype

Return numpy dtype

property shape: tuple

tensor shape

property size: int

Size in bytes

stride(idx: int) int

Stride of i-th dimension
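
The relationship between shape, length, and size can be worked through with plain arithmetic. The sketch assumes a densely packed buffer with no padding, which the actual size property may not guarantee; the ITEM_SIZE table and both helper names are illustrative, with byte widths taken from the usual meaning of each DataType name.

```python
from math import prod

# Bytes per element, keyed by DataType member name (assumed widths).
ITEM_SIZE = {
    "FLOAT32": 4,
    "UINT8": 1,
    "INT8": 1,
    "INT32": 4,
    "INT64": 8,
    "BFLOAT16": 2,
}


def element_count(shape):
    """Number of elements across all dimensions (compare TensorDesc.length)."""
    return prod(shape)


def byte_size(shape, dtype_name):
    """Size in bytes of a densely packed tensor (compare TensorDesc.size)."""
    return element_count(shape) * ITEM_SIZE[dtype_name]


print(element_count((1, 28, 28, 1)))          # 784
print(byte_size((1, 28, 28, 1), "UINT8"))     # 784
print(byte_size((1, 28, 28, 1), "FLOAT32"))   # 3136
```
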

furiosa.runtime.tensor.numpy_dtype(value)

Return numpy dtype from any eligible object of Nux

furiosa.runtime.tensor.rand(tensor: furiosa.runtime.tensor.TensorDesc) numpy.ndarray

Return a new array of given shape and type, filled with random numbers.

furiosa.runtime.tensor.zeros(tensor: furiosa.runtime.tensor.TensorDesc) numpy.ndarray

Return a new array of given shape and type, filled with zeros.

Module contents

Provide high-level Python APIs to access Furiosa AI’s NPUs and its eco-system

furiosa.runtime.full_version() str

Returns a full version string including the native library version