Intel Embodied Intelligence SDK

Embodied Intelligence SDK is an intuitive, easy-to-use software stack designed to streamline the development of Embodied Intelligence products and applications on Intel platforms. The SDK provides developers with a comprehensive environment for developing, testing, and optimizing Embodied Intelligence software and algorithms efficiently. It also provides the necessary software frameworks, libraries, tools, Best Known Configuration (BKC), tutorials, and example code to facilitate AI solution development.

Embodied Intelligence SDK includes the following features:

  • Comprehensive software platform spanning BSP, acceleration libraries, SDK, and reference demos, with documentation and developer tutorials;

  • Real-time BKC, Linux RT kernel and optimized EtherCAT;

  • Traditional vision and motion planning acceleration on CPU, Reinforcement/Imitation Learning-based manipulation, AI-based vision & LLM/VLM acceleration on iGPU & NPU;

  • Typical workflows and examples including ACT/DP-based manipulation, LLM task planning, Pick & Place, ORB-SLAM3, etc.

Software Architecture

The following picture shows the high-level software architecture of Embodied Intelligence SDK:

../_images/sdk_architecture.png

This software architecture is designed to power Embodied Intelligence systems by integrating computer vision, AI-driven manipulation, locomotion, SLAM, and large models into a unified framework. Built on ROS2 middleware, it takes advantage of Intel’s CPU, iGPU, dGPU, and NPU to optimize performance for robotics and AI applications. The stack includes high-performance AI frameworks, real-time libraries, and system-level optimizations, making it a comprehensive solution for Embodied Intelligence products.

At the highest level, the architecture is structured around key reference pipelines and demos that demonstrate its core capabilities. These include Vision Servo, which enhances robotic perception using AI-powered vision modules, and ACT-based Manipulation, which applies reinforcement learning and imitation learning to improve robotic grasping and movement. Optimized Locomotion leverages traditional control algorithms like MPC (Model Predictive Control) and LQR (Linear Quadratic Regulator), alongside reinforcement learning models for adaptive motion. Additionally, the ORB-SLAM3 pipeline focuses on real-time simultaneous localization and mapping, while LLM Task Planning integrates large language models for intelligent task execution.
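As a rough illustration of the LQR idea mentioned above, the following Python sketch computes a feedback gain for a scalar discrete-time system by iterating the Riccati recursion. The system model x[k+1] = a*x[k] + b*u[k], the cost weights q and r, and the function names are assumptions chosen for this example. It is a textbook formulation, not the SDK's locomotion controller.

```python
def scalar_lqr_gain(a, b, q, r, iterations=500, tol=1e-12):
    """Return the LQR gain K for x[k+1] = a*x[k] + b*u[k].

    The cost is sum(q*x^2 + r*u^2) and the control law is u = -K*x.
    All parameters are illustrative assumptions, not SDK values.
    """
    p = q  # initial guess for the Riccati solution
    for _ in range(iterations):
        # Scalar discrete-time Riccati recursion
        p_next = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
        converged = abs(p_next - p) < tol
        p = p_next
        if converged:
            break
    return a * b * p / (r + b * b * p)


def simulate(a, b, k, x0, steps):
    """Roll out the closed-loop system under u = -k*x and return the states."""
    x = x0
    trajectory = [x]
    for _ in range(steps):
        u = -k * x
        x = a * x + b * u
        trajectory.append(x)
    return trajectory


if __name__ == "__main__":
    # An open-loop unstable plant (a > 1) that the gain should stabilize
    gain = scalar_lqr_gain(a=1.2, b=1.0, q=1.0, r=1.0)
    print("gain:", gain)
    print("final state:", simulate(1.2, 1.0, gain, x0=1.0, steps=20)[-1])
```

Raising r relative to q penalizes control effort more heavily and yields a smaller gain, which is the usual tuning trade-off in this kind of controller.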

Beneath these pipelines, the software stack includes specialized AI and robotics modules. The vision module supports CNN-based models, OpenCV, and PCL operators for optimized perception, enabling robots to interpret their surroundings efficiently. The manipulation module combines traditional motion planning with AI-driven control, allowing robots to execute complex movements. For locomotion, the system blends classic control techniques with reinforcement learning models, ensuring smooth and adaptive movement. Meanwhile, SLAM components such as GPU ORB extraction and ADBSCAN optimization enhance mapping accuracy, and BEV (Bird’s Eye View) models contribute to improved spatial awareness. The large model module supports LLMs, Vision-Language Models (VLM), and Vision-Language-Action Models (VLA), enabling advanced reasoning and decision-making capabilities.

At the core of the system are ROS2 middleware and acceleration frameworks, which provide a standardized foundation for robotics development. The architecture is further enhanced by Intel’s AI acceleration libraries, including Intel® OpenVINO™ for deep learning inference, Intel® LLM Library for PyTorch* (IPEX-LLM) for optimized large model execution, and compatibility with TensorFlow*, PyTorch*, and ONNX*. The Intel® oneAPI compiler and libraries offer high-performance computing capabilities, leveraging oneMKL for mathematical operations, oneDNN for deep learning, and oneTBB for parallel processing. Additionally, Intel’s real-time libraries ensure low-latency execution, with tools for performance tuning and EtherCAT-based industrial communication.
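To make the OpenVINO™ inference step more concrete, here is a minimal Python sketch of choosing an accelerator and compiling a model for it. The helper names `pick_device` and `compile_on_best_device`, the NPU > GPU > CPU preference order, and the model path are assumptions made for illustration; the SDK tutorials may select devices differently. The `openvino` package is imported lazily so the device-selection logic can be exercised without it.

```python
def pick_device(available, preferred=("NPU", "GPU", "CPU")):
    """Return the first available device that matches the preference order.

    `available` is a list of device names such as the one reported by
    openvino.Core().available_devices. The preference order is an
    assumption for this sketch.
    """
    for wanted in preferred:
        for name in available:
            # Devices can be reported with an index suffix, e.g. "GPU.0"
            if name == wanted or name.startswith(wanted + "."):
                return name
    raise RuntimeError("none of the preferred devices is available")


def compile_on_best_device(model_path):
    """Read a model file and compile it for the preferred device.

    `model_path` is a hypothetical path to an OpenVINO IR or ONNX file.
    """
    import openvino as ov  # imported lazily; requires the openvino package

    core = ov.Core()
    device = pick_device(core.available_devices)
    model = core.read_model(model_path)
    return core.compile_model(model, device)


if __name__ == "__main__":
    print(pick_device(["CPU", "GPU.0", "GPU.1"]))
```

Keeping the device choice in a small pure function makes it easy to change the preference order per workload, for example preferring the CPU for small latency-sensitive models.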

To ensure seamless integration with robotic hardware, the SDK runs on a real-time optimized Linux BSP. It includes support for optimized EtherCAT and camera drivers, along with Intel-specific features such as Speed Shift Technology and Cache Allocation to enhance power efficiency and performance. These system-level enhancements allow the software stack to deliver high responsiveness, making it suitable for real-time robotics applications.

Overall, the Embodied Intelligence SDK provides a highly optimized, AI-driven framework for robotics and Embodied Intelligence, combining computer vision, motion planning, real-time processing, and large-scale AI models into a cohesive system. By leveraging Intel’s hardware acceleration and software ecosystem, it enables next-generation robotic applications with enhanced intelligence, efficiency, and adaptability.

Release Notes

The following release notes describe the new and updated features in each release of Intel® Embodied Intelligence SDK.

Embodied Intelligence SDK v25.15 provides the necessary software frameworks, libraries, tools, BKC, tutorials, and example code to facilitate embodied intelligence solution development on Intel® Core Ultra Series 2 processors (Arrow Lake-H). It provides the Intel Linux LTS kernel v6.12.8 with Preempt-RT, supports Canonical® Ubuntu® 22.04, and introduces initial support for ROS2 Humble. It supports optimization of many models with Intel® OpenVINO™ and provides typical workflows and examples, including ACT manipulation and ORB-SLAM3.

New Features:

  • Provided Linux 6.12.8 BSP with Preempt-RT

  • Provided real-time optimization BKC

  • Optimized IgH EtherCAT master with Linux kernel v6.12

  • Added ACT manipulation pipeline with Intel® OpenVINO™/Intel® Extension for PyTorch* optimization

  • Added ORB-SLAM3 pipeline focused on real-time simultaneous localization and mapping

  • Provided optimization tutorials for typical AI models with Intel® OpenVINO™
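As a quick sanity check after installing the Preempt-RT BSP listed above, the running kernel can be inspected from Python. The sketch below assumes that the kernel version string (the `uname -v` output) contains `PREEMPT_RT`, as real-time kernels commonly report, or `PREEMPT RT` on some older patched kernels. The function names are hypothetical, and this check does not validate the rest of the real-time BKC.

```python
import platform


def is_preempt_rt(version_string):
    """Return True if a kernel version string advertises Preempt-RT.

    Assumption: RT kernels include "PREEMPT_RT" (or "PREEMPT RT" on
    some older patched kernels) in the `uname -v` output.
    """
    return "PREEMPT_RT" in version_string or "PREEMPT RT" in version_string


def describe_kernel():
    """Return (release, is_rt) for the kernel this script runs on."""
    release = platform.release()   # kernel release, as in `uname -r`
    version = platform.version()   # build string, as in `uname -v`
    return release, is_preempt_rt(version)


if __name__ == "__main__":
    release, is_rt = describe_kernel()
    print(f"kernel {release}: Preempt-RT {'detected' if is_rt else 'not detected'}")
```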

Known Issues and Limitations

  • There is a known deadlock risk when using intel_gpu_top to read i915 perf events on the Preempt-RT kernel. This limitation will be fixed in the next release.

The following model algorithms were optimized with Intel® OpenVINO™:

| Algorithm | Description |
| --- | --- |
| YOLOv8 | CNN-based object detection |
| YOLOv12 | CNN-based object detection |
| MobileNetV2 | CNN-based object detection |
| SAM | Transformer-based segmentation |
| SAM2 | Extends SAM to video segmentation and object tracking with cross attention to memory |
| FastSAM | Lightweight substitute for SAM |
| MobileSAM | Lightweight substitute for SAM (same model architecture as SAM; refer to the OpenVINO SAM tutorials for model export and application) |
| U-Net | CNN-based segmentation and diffusion model |
| DETR | Transformer-based object detection |
| DETR GroundingDino | Transformer-based object detection |
| CLIP | Transformer-based image classification |
| Action Chunking with Transformers - ACT | An end-to-end imitation learning model designed for fine manipulation tasks in robotics |
| Feature Extraction Model: SuperPoint | A self-supervised framework for interest point detection and description in images, suitable for a large number of multiple-view geometry problems in computer vision |
| Feature Tracking Model: LightGlue | A model designed for efficient and accurate feature matching in computer vision tasks |
| Bird’s Eye View Perception: Fast-BEV | Produces a BEV perception that gives a comprehensive understanding of the spatial layout and relationships between objects in a scene |
| Monocular Depth Estimation: Depth Anything V2 | A powerful tool that leverages deep learning to infer 3D information from 2D images |

The following pipelines were added:

| Pipeline Name | Description |
| --- | --- |
| Imitation Learning - ACT | Imitation learning pipeline that uses the Action Chunking with Transformers (ACT) algorithm to train and evaluate in a simulator or a real robot environment with Intel optimization |
| VSLAM: ORB-SLAM3 | One of the popular real-time feature-based SLAM libraries, able to perform Visual, Visual-Inertial, and Multi-Map SLAM with monocular, stereo, and RGB-D cameras, using pin-hole and fisheye lens models |
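For readers new to ACT, the sketch below illustrates the temporal ensembling step described in the original ACT paper: because each policy query predicts a chunk of future actions, several overlapping predictions exist for the same time step, and they are averaged with exponential weights. The scalar actions, the function name `temporal_ensemble`, and the value of `m` are assumptions for illustration. Whether and how the SDK pipeline applies this step is not stated here.

```python
import math


def temporal_ensemble(chunks, t, m=0.01):
    """Blend all predictions that cover time step t.

    chunks[s] is the list of actions predicted at step s for steps
    s, s+1, s+2, ... (scalar actions, for simplicity).
    Weights follow w_i = exp(-m * i), where i = 0 is the oldest prediction.
    """
    candidates = []
    for s, chunk in enumerate(chunks[: t + 1]):
        offset = t - s
        if offset < len(chunk):
            candidates.append(chunk[offset])  # appended oldest first
    if not candidates:
        raise ValueError("no predicted chunk covers this time step")
    weights = [math.exp(-m * i) for i in range(len(candidates))]
    total = sum(weights)
    return sum(w * a for w, a in zip(weights, candidates)) / total


if __name__ == "__main__":
    # Three chunks of length 3, predicted at steps 0, 1, and 2 (made-up values)
    chunks = [[0.0, 1.0, 2.0], [1.5, 2.5, 3.5], [3.0, 4.0, 5.0]]
    print(temporal_ensemble(chunks, t=2, m=0.0))  # plain average when m = 0
    print(temporal_ensemble(chunks, t=2, m=1.0))  # older predictions weigh more
```

A smaller `m` spreads the weight more evenly across overlapping predictions, while a larger `m` concentrates it on the oldest prediction.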