Navi is a high-performance, machine learning model serving framework written in Rust. It serves as a critical component within X's recommendation algorithm, playing the role of a "Software framework" as described in the Core Architecture. Its primary responsibility is to efficiently serve machine learning models, ensuring low-latency predictions for various product surfaces like the "For You Timeline" and "Recommended Notifications."
As a high-performance serving layer, Navi leverages Rust's performance characteristics to handle the demanding requirements of real-time recommendation systems.
Navi is composed of several key sub-components that manage different aspects of the model serving pipeline:
segdense

The segdense component is responsible for parsing model schemas and mapping features to the expected tensor inputs. Its main function is to interpret a JSON schema that defines how input features should be arranged for a machine learning model.
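The feature-to-tensor mapping that segdense produces can be sketched as follows. This is a minimal illustration, not Navi's actual implementation: the type and field names (FeatureInfo, FeatureMapper, tensor_index, index_within_tensor) follow the component's description, but the build_mapper helper and its input shape are hypothetical; the real component derives this mapping from a parsed JSON schema.

```rust
use std::collections::HashMap;

/// Where a feature's value lands in the model's input tensors.
/// Field names follow the component's description; exact definitions
/// in Navi may differ.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct FeatureInfo {
    pub tensor_index: usize,
    pub index_within_tensor: usize,
}

/// Maps a 64-bit hash of a feature name to its tensor location.
pub type FeatureMapper = HashMap<i64, FeatureInfo>;

/// Hypothetical helper: build a mapper from (feature_id, tensor_index,
/// index_within_tensor) triples, as a parsed schema might yield them.
pub fn build_mapper(entries: &[(i64, usize, usize)]) -> FeatureMapper {
    entries
        .iter()
        .map(|&(id, t, i)| {
            (id, FeatureInfo { tensor_index: t, index_within_tensor: i })
        })
        .collect()
}

fn main() {
    // Two made-up hashed feature IDs placed in different input tensors.
    let mapper = build_mapper(&[(0x1234, 0, 3), (0x5678, 1, 0)]);
    let info = mapper[&0x1234];
    println!("tensor {}, offset {}", info.tensor_index, info.index_within_tensor);
}
```

At inference time, a lookup in this map tells the data-preparation code exactly which tensor, and which slot within it, a raw feature value should populate.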
segdense loads and parses configuration files (e.g., JSON schemas) that describe the structure of the features a model expects. From these it creates a FeatureMapper, which translates feature identifiers (such as 64-bit hashes of feature names) into precise locations within the model's input tensors (e.g., (tensor index, index of the feature within that tensor)). This mapping is crucial for correctly populating model inputs from raw feature data. The component also defines the data structures (Root, DensificationTransformSpec, InputFeature) that represent these schemas.

- SegDenseError: a custom error type for failures during file I/O or JSON parsing.
- FeatureInfo: holds tensor_index and index_within_tensor, indicating where a feature's value should be placed in the model's input.
- FeatureMapper: a hash map from feature IDs to their FeatureInfo.

dr_transform

The dr_transform component handles the transformation of incoming data records into the tensor formats required by the deployed machine learning models. This is where BatchPredictionRequest Thrift objects are converted into InputTensors, preparing the data for inference.
dr_transform acts as a crucial data preparation layer. It takes serialized data (e.g., BatchPredictionRequest Thrift objects, whose DataRecords carry continuous, binary, and discrete features as well as embeddings) and converts it into the specific InputTensor format expected by the model runtime (e.g., ONNX, PyTorch). It also integrates with an external configuration (all_config) to understand the overall model structure and feature renaming. Utilities for loading base64-encoded prediction requests and for saving data to NumPy files (for testing and debugging) are also part of this component.

- AllConfig: represents the overarching configuration, likely defining model-specific settings and feature renaming rules.
- BatchPredictionRequestToTorchTensorConverter: a concrete implementation of the Converter trait that transforms incoming prediction requests into a vector of InputTensors.
- TensorInputEnum: an enum encapsulating the supported data types (String, Int, Int64, Float, Double, Boolean) for individual tensor inputs, allowing flexible data handling.
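The shape of TensorInputEnum can be sketched as below. The variant names mirror the data types listed in the text, but this is a simplified illustration: Navi's actual enum and its converter carry considerably more machinery, and the convert function here is a hypothetical stand-in for BatchPredictionRequestToTorchTensorConverter, not its real signature.

```rust
/// Sketch of a type-erased tensor input; variants follow the dtypes the
/// text names (String, Int, Int64, Float, Double, Boolean).
#[derive(Debug, Clone, PartialEq)]
pub enum TensorInputEnum {
    String(Vec<String>),
    Int(Vec<i32>),
    Int64(Vec<i64>),
    Float(Vec<f32>),
    Double(Vec<f64>),
    Boolean(Vec<bool>),
}

impl TensorInputEnum {
    /// Element count, regardless of the underlying dtype.
    pub fn len(&self) -> usize {
        match self {
            TensorInputEnum::String(v) => v.len(),
            TensorInputEnum::Int(v) => v.len(),
            TensorInputEnum::Int64(v) => v.len(),
            TensorInputEnum::Float(v) => v.len(),
            TensorInputEnum::Double(v) => v.len(),
            TensorInputEnum::Boolean(v) => v.len(),
        }
    }
}

/// Toy converter (hypothetical): continuous features become a Float
/// input, binary features a Boolean input. The real converter walks the
/// DataRecords of a BatchPredictionRequest using the segdense mapping.
pub fn convert(continuous: Vec<f32>, binary: Vec<bool>) -> Vec<TensorInputEnum> {
    vec![
        TensorInputEnum::Float(continuous),
        TensorInputEnum::Boolean(binary),
    ]
}

fn main() {
    let tensors = convert(vec![0.5, 1.25], vec![true, false, true]);
    println!("{} input tensors; first has {} values", tensors.len(), tensors[0].len());
}
```

Wrapping each input in an enum like this lets a single Vec<TensorInputEnum> feed a runtime-agnostic inference call while preserving per-tensor dtype information.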