furiosa.runtime package
Submodules
furiosa.runtime.compiler module
- furiosa.runtime.compiler.generate_compiler_log_path() pathlib.Path
Generate a file path for the compilation log.
furiosa.runtime.consts module
furiosa.runtime.envs module
- furiosa.runtime.envs.current_npu_device() str
Return the current NPU device name.
- Returns
NPU device name
- furiosa.runtime.envs.is_compile_log_enabled() bool
Return whether the compile log is enabled.
- Returns
True if the compile log is enabled, or False.
- furiosa.runtime.envs.log_dir() str
Return FURIOSA_LOG_DIR where the logs are stored.
- Returns
The log directory of furiosa sdk
- furiosa.runtime.envs.profiler_output() None | str
Return FURIOSA_PROFILER_OUTPUT_PATH, the path where profiler outputs are written.
For compatibility, NUX_PROFILER_PATH is also currently supported, but it will be deprecated in favor of FURIOSA_PROFILER_OUTPUT_PATH.
- Returns
The file path of profiler output if specified, or None.
- furiosa.runtime.envs.xdg_state_home() str
Return XDG_STATE_HOME which is the base directory of furiosa logs, history, and other states
- Returns
Furiosa home directory
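The fallback described for profiler_output() can be illustrated with a minimal sketch. This is not the SDK's implementation: resolve_profiler_output is a hypothetical helper, and treating an empty value as unset is an assumption. Only the two variable names come from the text above.

```python
import os
from typing import Mapping, Optional


def resolve_profiler_output(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Hypothetical helper: prefer the new variable, fall back to the legacy one.

    Assumption: an empty string is treated the same as an unset variable.
    """
    return env.get("FURIOSA_PROFILER_OUTPUT_PATH") or env.get("NUX_PROFILER_PATH") or None


if __name__ == "__main__":
    # Prints the resolved path for the current process, or None if neither is set.
    print(resolve_profiler_output())
```

Passing the environment as a mapping keeps the lookup easy to exercise without modifying the real process environment.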
furiosa.runtime.errors module
Nux Exception and Error
- exception furiosa.runtime.errors.ApiClientInitFailed
Bases:
furiosa.runtime.errors.NativeException
When the API client fails to initialize due to API keys or other causes
- exception furiosa.runtime.errors.CompilationFailed
Bases:
furiosa.runtime.errors.NativeException
When Nux fails to compile a given model image into an NPU model binary
- exception furiosa.runtime.errors.DeviceBusy
Bases:
furiosa.runtime.errors.NativeException
The device is already occupied
- exception furiosa.runtime.errors.IncompatibleApiClientError
Bases:
furiosa.runtime.errors.NativeException
When the API client and the server are incompatible
- exception furiosa.runtime.errors.IncompatibleModel
Bases:
furiosa.runtime.errors.NativeException
When Renegade compiler cannot recognize a given model image binary
- exception furiosa.runtime.errors.InternalError(cause='unknown')
Bases:
furiosa.runtime.errors.NativeException
internal error or no corresponding error in Python binding
- exception furiosa.runtime.errors.InvalidCompilerConfig
Bases:
furiosa.runtime.errors.NativeException
Compiler config is invalid
- exception furiosa.runtime.errors.InvalidInput(message: str = 'Invalid input tensors')
Bases:
furiosa.common.error.FuriosaError
When input tensors are invalid for any reason
- exception furiosa.runtime.errors.InvalidSessionOption
Bases:
furiosa.runtime.errors.NativeException
When a session option is invalid
- exception furiosa.runtime.errors.InvalidYamlException
Bases:
furiosa.runtime.errors.NativeException
When the given YAML is invalid
- class furiosa.runtime.errors.NativeError(value)
Bases:
enum.IntEnum
Python object corresponding to nux_error_t in the Nux C API
- API_CLIENT_INIT_FAILED = 18
- COMPILATION_FAILED = 14
- DEVICE_BUSY = 23
- DUMP_PROFILE_FAILED = 10
- GET_TASK_FAILED = 9
- INCOMPATIBLE_API_CLIENT_ERROR = 17
- INCOMPATIBLE_MODEL = 13
- INTERNAL_ERROR = 15
- INVALID_BUFFER = 6
- INVALID_COMPILER_CONFIG = 30
- INVALID_INPUTS = 7
- INVALID_INPUT_INDEX = 4
- INVALID_OUTPUTS = 8
- INVALID_OUTPUT_INDEX = 5
- INVALID_SESSION_OPTIONS = 21
- INVALID_YAML = 16
- MODEL_DEPLOY_FAILED = 2
- MODEL_EXECUTION_FAILED = 3
- NO_API_KEY = 19
- NULL_POINTER_EXCEPTION = 20
- NUX_CREATION_FAILED = 1
- QUEUE_NO_DATA = 12
- QUEUE_WAIT_TIMEOUT = 11
- SESSION_TERMINATED = 22
- SUCCESS = 0
- TENSOR_NAME_NOT_FOUND = 24
- UNSUPPORTED_FEATURE = 25
- exception furiosa.runtime.errors.NativeException(message: str, native_err: Optional[furiosa.runtime.errors.NativeError] = None)
Bases:
furiosa.common.error.FuriosaError
general exception caused by Nuxpy
- native_error() Optional[furiosa.runtime.errors.NativeError]
Return the native error if this exception comes from the C native extension
- exception furiosa.runtime.errors.NoApiKeyException
Bases:
furiosa.runtime.errors.NativeException
When the API client fails to initialize due to API keys or other causes
- exception furiosa.runtime.errors.QueueWaitTimeout
Bases:
furiosa.runtime.errors.NativeException
Timed out in Completion queue
- exception furiosa.runtime.errors.SessionClosed
Bases:
furiosa.common.error.FuriosaError
Session is already terminated
- exception furiosa.runtime.errors.SessionTerminated
Bases:
furiosa.runtime.errors.NativeException
Session is already terminated
- exception furiosa.runtime.errors.TensorNameNotFound
Bases:
furiosa.runtime.errors.NativeException
When a given tensor name is not found in this model
- exception furiosa.runtime.errors.UnsupportedDataType(dtype)
Bases:
furiosa.runtime.errors.NativeException
Unsupported tensor data type
- exception furiosa.runtime.errors.UnsupportedFeature
Bases:
furiosa.runtime.errors.NativeException
Feature is not supported
- exception furiosa.runtime.errors.UnsupportedTensorType
Bases:
furiosa.runtime.errors.NativeException
Unsupported tensor type
- furiosa.runtime.errors.into_exception(err: Union[ctypes.c_int, int]) furiosa.runtime.errors.NativeException
Convert a nux_error_t value from the Nux C API to a NuxException
- Parameters
err (NativeError) – Error code (nux_error_t) returned by the Nux C API
- Returns
NuxException
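As a rough illustration of how an integer error code can be turned into a typed exception, here is a self-contained sketch. It redeclares a small subset of NativeError with the values listed above and uses stand-in exception classes. The lookup table _BY_CODE and the fallback to InternalError for unmapped codes are assumptions for illustration, not the binding's actual logic.

```python
import ctypes
from enum import IntEnum
from typing import Optional, Union


class NativeError(IntEnum):
    """Subset of the codes listed above (values copied from the listing)."""
    SUCCESS = 0
    COMPILATION_FAILED = 14
    INTERNAL_ERROR = 15
    DEVICE_BUSY = 23


class NativeException(Exception):
    """Stand-in for furiosa.runtime.errors.NativeException."""

    def __init__(self, message: str, native_err: Optional[NativeError] = None):
        super().__init__(message)
        self._native_err = native_err

    def native_error(self) -> Optional[NativeError]:
        return self._native_err


class CompilationFailed(NativeException):
    """Stand-in exception type."""


class DeviceBusy(NativeException):
    """Stand-in exception type."""


class InternalError(NativeException):
    """Stand-in used when no specific exception type matches."""


# Hypothetical lookup table; the real mapping lives inside the SDK.
_BY_CODE = {
    NativeError.COMPILATION_FAILED: CompilationFailed,
    NativeError.DEVICE_BUSY: DeviceBusy,
}


def into_exception(err: Union[ctypes.c_int, int]) -> NativeException:
    """Map an error code to an exception instance (sketch only).

    Callers are assumed to check for SUCCESS before calling this.
    """
    code = err.value if isinstance(err, ctypes.c_int) else int(err)
    try:
        native = NativeError(code)
    except ValueError:
        # Code unknown to this Python-side enum.
        return InternalError(f"unknown native error code {code}")
    exc_type = _BY_CODE.get(native, InternalError)
    return exc_type(native.name, native)
```

Returning the exception instead of raising it leaves the decision of when to raise to the caller, which matches the return type shown in the signature above.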
furiosa.runtime.model module
Model and its methods to access model metadata
- class furiosa.runtime.model.Model
Bases:
abc.ABC
NPU model binary compiled by Renegade compiler
- allocate_inputs() furiosa.runtime.tensor.TensorArray
Creates an array of input tensors with allocated buffers
- allocate_outputs() furiosa.runtime.tensor.TensorArray
Creates an array of output tensors with allocated buffers
- allocate_tensors(names: List[str]) furiosa.runtime.tensor.TensorArray
Creates an array of tensors corresponding to tensor names with allocated buffers
- create_inputs() furiosa.runtime.tensor.TensorArray
Creates an array of input tensors without allocated buffers
- create_outputs() furiosa.runtime.tensor.TensorArray
Creates an array of output tensors without allocated buffers
- create_tensors(names: List[str]) furiosa.runtime.tensor.TensorArray
Creates an array of tensors corresponding to tensor names without allocated buffers
- input(idx) furiosa.runtime.tensor.TensorDesc
Return tensor description of i-th input tensor of Model
- property input_num: int
Number of input tensors of Model
- inputs() List[furiosa.runtime.tensor.TensorDesc]
Tensor descriptions of all input tensors of Model
- output(idx) furiosa.runtime.tensor.TensorDesc
Returns tensor description of i-th output tensor of Model
- property output_num: int
Number of output tensors of Model
- outputs() List[furiosa.runtime.tensor.TensorDesc]
Tensor descriptions of all output tensors of Model
- print_summary()
Prints the summary of this model
- summary() str
Returns the summary of this model
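The metadata accessors above can be combined to list a model's inputs. The sketch below relies only on the documented members input_num and input(idx) and on the TensorDesc properties name, shape, and dtype. describe_inputs is a hypothetical helper that accepts any object exposing that interface, such as a Session.

```python
from typing import Any, List, Tuple


def describe_inputs(model: Any) -> List[Tuple[int, Any, tuple, Any]]:
    """Collect (index, name, shape, dtype) for every input tensor of a model.

    `model` is assumed to expose `input_num` and `input(idx)` as documented
    for furiosa.runtime.model.Model.
    """
    rows = []
    for idx in range(model.input_num):
        desc = model.input(idx)  # TensorDesc of the idx-th input
        rows.append((idx, desc.name, tuple(desc.shape), desc.dtype))
    return rows
```

The same loop works for outputs by swapping in output_num and output(idx).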
furiosa.runtime.profiler module
- class furiosa.runtime.profiler.ChromeTraceConfig(*, file: io.IOBase = None)
Bases:
pydantic.main.BaseModel
ChromeTrace specific config.
- file
file descriptor to write profile data. By default, sys.stdout.
- class Config
Bases:
object
- arbitrary_types_allowed = True
- extra = 'forbid'
- json_encoders = {<class '_io._IOBase'>: <function ChromeTraceConfig.Config.<lambda>>}
- file: io.IOBase
- class furiosa.runtime.profiler.PandasDataFrameConfig(*, file: int, **extra_data: Any)
Bases:
pydantic.main.BaseModel
PandasDataFrame specific config.
- file
file descriptor to write profile data. By default, memfd.
- file: int
- class furiosa.runtime.profiler.RecordFormat(value)
Bases:
enum.IntEnum
Profiler format to record profile data.
- ChromeTrace = 0
- PandasDataFrame = 1
- class furiosa.runtime.profiler.Resource(value)
Bases:
enum.Flag
Profiler target resource to be recorded.
- ALL = 3
- CPU = 1
- NPU = 2
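Because Resource derives from enum.Flag, its members combine with bitwise operators. The sketch below redeclares the enum with the values listed above so that it runs without the SDK. The helper selected is hypothetical and only illustrates membership tests.

```python
from enum import Flag
from typing import List


class Resource(Flag):
    """Local copy of the values listed above, for illustration only."""
    CPU = 1
    NPU = 2
    ALL = 3  # same bits as CPU | NPU


def selected(resource: Resource) -> List[str]:
    """Return the names of the basic resources contained in `resource`."""
    return [r.name for r in (Resource.CPU, Resource.NPU) if r in resource]


if __name__ == "__main__":
    print(Resource.CPU | Resource.NPU == Resource.ALL)
    print(selected(Resource.NPU))
```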
- class furiosa.runtime.profiler.profile(resource: furiosa.runtime.profiler.Resource = Resource.ALL, format: furiosa.runtime.profiler.RecordFormat = RecordFormat.ChromeTrace, **config: Any)
Bases:
object
Profiler context manager.
Examples
>>> from furiosa.runtime.profiler import RecordFormat, profile
>>> with open("profile.json", "w") as f:
>>>     with profile(format=RecordFormat.ChromeTrace, file=f) as profiler:
>>>         # Profiler enabled from here
>>>         with profiler.record("Inference"):
>>>             ...  # Profiler recorded with span named 'Inference'
- export_chrome_trace(filename)
- get_cpu_pandas_dataframe()
- get_npu_pandas_dataframe()
- get_pandas_dataframe()
- get_pandas_dataframe_with_filter(column, value)
- print_external_operators()
- print_inferences()
- print_npu_executions()
- print_npu_operators()
- print_summary()
- record(name: str = '', warm_up=False)
Create profiler span with specified name.
- Parameters
name (str) – Profiler record span name.
warm_up (bool) – If true, do not record profiler result, and just warm up.
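One way the warm_up flag of record() might be used is to run a few unrecorded iterations before the measured ones. The sketch below is an assumption about usage, not an official recipe: profile_inference, its parameters, and the span names are hypothetical. It relies only on record(name, warm_up) being usable as a context manager, as in the example above.

```python
from typing import Any, Callable, List


def profile_inference(
    profiler: Any,
    run: Callable[[], Any],
    warm_up_runs: int = 2,
    measured_runs: int = 5,
) -> List[Any]:
    """Run `run` several times under profiler spans and return measured results.

    `profiler` is assumed to be the object yielded by
    `with profile(...) as profiler:`.
    """
    # Warm-up iterations: spans are opened with warm_up=True so that,
    # per the parameter description, their results are not recorded.
    for _ in range(warm_up_runs):
        with profiler.record("warm up", warm_up=True):
            run()

    # Measured iterations: one named span per run.
    results = []
    for i in range(measured_runs):
        with profiler.record(f"inference {i}"):
            results.append(run())
    return results
```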
furiosa.runtime.session module
Session and its asynchronous API for model inference
- class furiosa.runtime.session.AsyncSession(ref: ctypes.c_void_p)
Bases:
furiosa.runtime.model.Model
An asynchronous session for a given model, which allows submitting prediction requests
- close(*, _LIBNUX=<CDLL 'libnux.so', handle 22aea10>)
Closes this session
After a session is closed, CompletionQueue will return an error if CompletionQueue.recv() is called.
_LIBNUX is only for internal use and not supposed to be specified by a user.
- submit(values: Union[numpy.ndarray, numpy.generic, furiosa.runtime.tensor.TensorArray], context: Optional[object] = None) None
Submit a prediction request
It returns immediately without blocking the caller. When the prediction is completed, the outputs are sent to CompletionQueue.
- Parameters
values – Input values
context – an additional context to identify the prediction request
- class furiosa.runtime.session.CompletionQueue(ref: ctypes.c_void_p, context_ty: Optional[type], output_descs: List[furiosa.runtime.tensor.TensorDesc], profiler, profiler_file)
Bases:
object
Receives the completion results asynchronously from AsyncSession
- close(*, _LIBNUX=<CDLL 'libnux.so', handle 22aea10>)
Closes this completion queue.
If it is closed, AsyncSession also will stop working.
_LIBNUX is only for internal use and not supposed to be specified by a user.
- recv(timeout: Optional[int] = None) Tuple[object, furiosa.runtime.tensor.TensorArray]
Receives the prediction results which are asynchronously coming from AsyncSession
If there are already prediction outputs, it will return immediately. Otherwise, it will block until the next result is ready.
If timeout is set, recv() will block only until the timeout occurs. If timed out, recv() throws a QueueWaitTimeout exception.
If AsyncSession is closed earlier, recv() will throw a SessionTerminated exception.
- Parameters
timeout (int) – How long to wait before giving up. It should be a positive integer in milliseconds.
- Returns
A tuple, whose first value is the context value passed when you submit an inference task and the second value is inference output.
- class furiosa.runtime.session.Session(model: Union[bytes, str, pathlib.Path], device: Optional[str] = None, worker_num: Optional[int] = None, batch_size: Optional[int] = None, compiler_hints: bool = True, compiler_config: Optional[Mapping[str, object]] = None)
Bases:
furiosa.runtime.model.Model
Provides a blocking API to run an inference task with a given model
- close(*, _LIBNUX=<CDLL 'libnux.so', handle 22aea10>)
Close the session and release all resources belonging to the session
_LIBNUX is only for internal use and not supposed to be specified by a user.
- run(inputs: Union[numpy.ndarray, numpy.generic, furiosa.runtime.tensor.TensorArray, List[Union[numpy.ndarray, numpy.generic]]]) furiosa.runtime.tensor.TensorArray
Runs an inference task with inputs
- Parameters
inputs – It can be a single runtime.Tensor, runtime.TensorArray or numpy.ndarray object. Also, you can pass one TensorArray or a list of numpy.ndarray objects.
- Returns
Inference output
- run_with(outputs: List[str], inputs: Dict[str, numpy.ndarray]) furiosa.runtime.tensor.TensorArray
Runs an inference task with named inputs
- Parameters
outputs – A list of tensor names to return as outputs.
inputs – A dict mapping tensor names to numpy.ndarray objects.
- Returns
Inference output
- furiosa.runtime.session.create(model: Union[bytes, str, pathlib.Path, furiosa.registry.model.Model], device: Optional[str] = None, worker_num: Optional[int] = None, batch_size: Optional[int] = None, compiler_config: Optional[Mapping[str, object]] = None, compiler_hints: bool = True) furiosa.runtime.session.Session
Creates a session for a model
- Parameters
model (bytes, str, Path, Model) – a byte string containing a model image, a path to a model image file, or a furiosa.registry.Model
device – NPU device (str) (e.g., npu0pe0, npu0pe0-1)
worker_num – Number of workers
batch_size – Batch size of input tensors
compiler_config (Mapping[str, object]) – Compile config
compiler_hints – Print compiler hints if True (default: True)
- Returns
The session for a given model, which allows running predictions. Session is thread-safe.
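A blocking inference round trip with create(), Session.run(), and Session.close() might look like the sketch below. To keep it runnable without the SDK or an NPU, the session factory and the input generator are passed in as arguments. run_with_random_input is a hypothetical helper. It relies on inputs() returning TensorDesc objects, on run() accepting a list of arrays, and on TensorArray.numpy(), all as documented in this reference.

```python
from typing import Any, Callable, List


def run_with_random_input(
    create: Callable[[Any], Any],
    rand: Callable[[Any], Any],
    model: Any,
) -> List[Any]:
    """Create a session, run one inference on generated inputs, and close it.

    `create` is assumed to behave like furiosa.runtime.session.create and
    `rand` like furiosa.runtime.tensor.rand.
    """
    sess = create(model)
    try:
        # One generated array per input tensor description.
        inputs = [rand(desc) for desc in sess.inputs()]
        outputs = sess.run(inputs)
        # Convert the TensorArray to a list of arrays before the session closes.
        return outputs.numpy()
    finally:
        # Release the device even if inference fails.
        sess.close()
```

With the SDK installed, a call could be written as run_with_random_input(session.create, tensor.rand, "path/to/model"), where the model path is a placeholder.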
- furiosa.runtime.session.create_async(model: Union[bytes, str, pathlib.Path, furiosa.registry.model.Model], context_ty: Optional[type] = None, device: Optional[str] = None, worker_num: Optional[int] = None, batch_size: Optional[int] = None, compiler_hints: Optional[bool] = True, input_queue_size: Optional[int] = None, output_queue_size: Optional[int] = None, compiler_config: Optional[Mapping[str, object]] = None) Tuple[furiosa.runtime.session.AsyncSession, furiosa.runtime.session.CompletionQueue]
Creates a pair of the asynchronous session and the completion queue for a given model
- Parameters
model (bytes, str, Path, Model) – a byte string containing a model image, a path to a model image file, or a furiosa.registry.Model
context_ty (type) – Type for passing context from AsyncSession to CompletionQueue
device – NPU device (str) (e.g., npu0pe0, npu0pe0-1)
worker_num – Number of workers
batch_size – Batch size of input tensors
compiler_hints – Print compiler hints if True (default: True)
input_queue_size – The input queue size, and it must be > 0 and < 2^31.
output_queue_size – The output queue size, and it must be > 0 and < 2^31.
compiler_config (Mapping[str, object]) – Compile config
- Returns
A pair of the asynchronous session and the completion queue. The asynchronous session for a given model allows submitting predictions. The completion queue allows users to receive the prediction outputs asynchronously.
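The submit/recv flow can be sketched as follows. Requests are tagged with an integer context so that results can be matched back to their inputs, whatever order they arrive in. collect_async is a hypothetical helper, and the factory is injected so the sketch runs without the SDK. It assumes only the documented signatures of create_async, AsyncSession.submit, CompletionQueue.recv, and the two close() methods.

```python
from typing import Any, Callable, Dict, Sequence


def collect_async(
    create_async: Callable[..., Any],
    model: Any,
    batches: Sequence[Any],
    timeout_ms: int = 1000,
) -> Dict[int, Any]:
    """Submit every batch, then gather the outputs keyed by submission index.

    `create_async` is assumed to behave like
    furiosa.runtime.session.create_async.
    """
    sess, queue = create_async(model, context_ty=int)
    try:
        # submit() returns immediately; the index travels along as context.
        for idx, batch in enumerate(batches):
            sess.submit(batch, context=idx)

        results: Dict[int, Any] = {}
        for _ in range(len(batches)):
            # recv() blocks up to timeout_ms milliseconds per result.
            context, outputs = queue.recv(timeout=timeout_ms)
            results[context] = outputs
        return results
    finally:
        sess.close()
        queue.close()
```

With the SDK installed, a call could be written as collect_async(session.create_async, "path/to/model", batches), where the model path is a placeholder. A QueueWaitTimeout raised by recv() propagates to the caller after both objects are closed.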
furiosa.runtime.tensor module
Tensor object and its utilities
- class furiosa.runtime.tensor.Axis(value)
Bases:
enum.IntEnum
Axis of Tensor
- BATCH = 3
- BATCH_OUTER = 7
- CHANNEL = 2
- CHANNEL_OUTER = 6
- HEIGHT = 1
- HEIGHT_OUTER = 5
- UNKNOWN = 8
- WIDTH = 0
- WIDTH_OUTER = 4
- class furiosa.runtime.tensor.DataType(value)
Bases:
enum.IntEnum
Tensor data type
- BFLOAT16 = 5
- FLOAT32 = 0
- INT32 = 3
- INT64 = 4
- INT8 = 2
- UINT8 = 1
- property numpy_dtype
Return the numpy dtype corresponding to this DataType
- class furiosa.runtime.tensor.Tensor(ref: ctypes.c_void_p, desc: furiosa.runtime.tensor.TensorDesc, allocated: bool = False)
Bases:
object
A tensor which contains data and tensor description including shape
- copy_from(data: Union[numpy.ndarray, numpy.generic])
Copy the contents of Numpy ndarray to this tensor
- property dtype: furiosa.runtime.tensor.DataType
Data type of tensor
- numpy() numpy.ndarray
Return numpy.ndarray converted from this tensor
- property numpy_dtype
Return numpy dtype
- property shape: tuple
Return the tensor shape
- Returns
Tensor shape. An example shape is `(1, 28, 28, 1)`.
- view() numpy.ndarray
Return numpy.ndarray view converted from this tensor
- class furiosa.runtime.tensor.TensorArray(ref: ctypes.c_void_p, descs: [<class 'furiosa.runtime.tensor.TensorDesc'>], allocated: bool = False)
Bases:
object
A list of tensors
It is used for input and output values of model inferences.
- is_empty() bool
True if it has no Tensor
- numpy() [<class 'numpy.ndarray'>]
Convert TensorArray to a list of numpy.ndarray
- view() [<class 'numpy.ndarray'>]
Convert TensorArray to a list of numpy.ndarray view
- class furiosa.runtime.tensor.TensorDesc(ref: ctypes.c_void_p)
Bases:
object
Tensor description including dimension, shape, and data type
- axis(idx: int) furiosa.runtime.tensor.Axis
Axis type of i-th dimension (e.g., width, height, channel)
- dim(idx: int) int
Size of i-th dimension
- property dtype: furiosa.runtime.tensor.DataType
Data type of tensor
- property format: str
Tensor memory layout (e.g., NHWC, NCHW)
- property length: int
Number of all elements across all dimensions
- property name: Optional[str]
- property ndim: int
Number of dimensions
- property numpy_dtype
Return numpy dtype
- property shape: tuple
tensor shape
- property size: int
Size in bytes
- stride(idx: int) int
Stride of i-th dimension
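The relationship between shape, length, and size can be worked through with a small sketch. It redeclares DataType with the values listed above. The per-element byte widths in ITEM_SIZE and the assumption of a densely packed buffer without padding are not stated in this reference, so the results should be read as illustrative.

```python
from enum import IntEnum
from math import prod
from typing import Tuple


class DataType(IntEnum):
    """Local copy of the values listed above, for illustration only."""
    FLOAT32 = 0
    UINT8 = 1
    INT8 = 2
    INT32 = 3
    INT64 = 4
    BFLOAT16 = 5


# Assumed element widths in bytes (standard widths for these type names).
ITEM_SIZE = {
    DataType.FLOAT32: 4,
    DataType.UINT8: 1,
    DataType.INT8: 1,
    DataType.INT32: 4,
    DataType.INT64: 8,
    DataType.BFLOAT16: 2,
}


def length(shape: Tuple[int, ...]) -> int:
    """Number of elements across all dimensions."""
    return prod(shape)


def size_in_bytes(shape: Tuple[int, ...], dtype: DataType) -> int:
    """Buffer size, assuming dense packing with no padding."""
    return length(shape) * ITEM_SIZE[dtype]


if __name__ == "__main__":
    shape = (1, 28, 28, 1)
    print(length(shape), size_in_bytes(shape, DataType.FLOAT32))
```

For the example shape `(1, 28, 28, 1)`, this gives 784 elements and, under the stated assumptions, 3136 bytes for FLOAT32.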
- furiosa.runtime.tensor.numpy_dtype(value)
Return numpy dtype from any eligible object of Nux
- furiosa.runtime.tensor.rand(tensor: furiosa.runtime.tensor.TensorDesc) numpy.ndarray
Return a new array of given shape and type, filled with random numbers.
- furiosa.runtime.tensor.zeros(tensor: furiosa.runtime.tensor.TensorDesc) numpy.ndarray
Return a new array of given shape and type, filled with zeros.