Creating and maintaining vector indexes on millions of embeddings can be painfully slow on CPU alone. Oracle AI Database (23ai/26ai) now lets you offload index creation to a GPU, which shortens build times substantially while keeping your database CPU free for queries and transactions.
In this practical guide, you’ll learn how to set up GPU-powered vector index creation using the Private AI Services Container — perfect for developers and DBAs who want faster indexing without complex infrastructure.
Why GPU Offload Matters for Vector Workloads
- CPU-based HNSW index creation is slow on large datasets
- GPU can build indexes significantly faster (often 5x–10x depending on data size)
- Frees up database CPU for real-time similarity searches
- Easy to run on separate machines (on-prem or cloud)
High-Level Architecture
Your Oracle AI Database sends embedding vectors to a remote GPU container over a secure HTTPS connection. The GPU builds the index and sends it back. The whole process is transparent to your SQL queries.
Prerequisites (Keep It Minimal)
- Oracle AI Database 23ai or 26ai (Free or Enterprise)
- One NVIDIA GPU with compute capability 7.5+ (RTX 3060, A10, A100, etc.)
- Oracle Linux 8 or 9 on the GPU machine
- At least 24GB VRAM recommended for good performance
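The GPU requirements above can be checked from the command line. The sketch below assumes a reasonably recent NVIDIA driver whose nvidia-smi supports the compute_cap query field. The helper name check_gpu_line and the 24000 MiB threshold are illustrative choices, not part of any Oracle tooling. Cards sold as 24 GB often report slightly less than 24576 MiB, so the VRAM check only warns.

```shell
# Classify one CSV line produced by:
#   nvidia-smi --query-gpu=name,compute_cap,memory.total --format=csv,noheader,nounits
# Example line: "NVIDIA A100-SXM4-40GB, 8.0, 40960"
# check_gpu_line is a hypothetical helper written for this guide.
check_gpu_line() {
  # $1 = CSV line, $2 = minimum compute capability, $3 = minimum VRAM in MiB
  printf '%s\n' "$1" | awk -F', *' -v min_cc="$2" -v min_mib="$3" '{
    status = "OK"
    if ($2 + 0 < min_cc + 0)
      status = "FAIL (compute capability " $2 " < " min_cc ")"
    else if ($3 + 0 < min_mib + 0)
      status = "WARN (VRAM " $3 " MiB < " min_mib " MiB)"
    print $1 ": " status
  }'
}

# Usage on the GPU machine (the thresholds are assumptions; adjust as needed):
#   nvidia-smi --query-gpu=name,compute_cap,memory.total --format=csv,noheader,nounits |
#     while IFS= read -r line; do check_gpu_line "$line" 7.5 24000; done
```

A FAIL means the card is below the stated compute capability. A WARN means the card should still work but has less memory than recommended.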
Step-by-Step Setup Overview
1. Prepare the GPU Server
# Update the system; NVIDIA drivers must already be installed (OCI GPU images ship with them)
sudo dnf update -y
# Install NVIDIA Container Toolkit
curl -s -L https://nvidia.github.io/libnvidia-container/stable/rpm/nvidia-container-toolkit.repo | sudo tee /etc/yum.repos.d/nvidia-container-toolkit.repo
sudo dnf install -y nvidia-container-toolkit
# Verify GPU
nvidia-smi
2. Install Podman and Pull the Container
sudo dnf install -y container-tools
# Login to Oracle Container Registry
podman login container-registry.oracle.com
# Pull the GPU Index Service image
podman pull container-registry.oracle.com/database/private-ai:gpu-index-26.1.0.0.0
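The remaining steps call podman, unzip, and curl, so a quick pre-flight check can catch a missing tool before the setup scripts run. This is a minimal sketch. The function name missing_tools is made up for this guide, and the tool list is inferred from the commands used here.

```shell
# Print every command named in the arguments that is not found on PATH.
# Returns non-zero if at least one is missing.
# missing_tools is a hypothetical helper, not part of the Oracle setup scripts.
missing_tools() {
  local rc=0 tool
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "$tool"
      rc=1
    fi
  done
  return "$rc"
}

# Usage:
#   missing_tools podman unzip curl nvidia-smi || echo "install the tools listed above first"
```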
3. Run the Easy Setup Scripts
# Extract setup scripts from the image (podman create returns the ID of a stopped container)
CONTAINER_ID=$(podman create container-registry.oracle.com/database/private-ai:gpu-index-26.1.0.0.0)
podman cp "$CONTAINER_ID":/privateai/scripts/privateai-setup-gpu-index-26.1.0.0.0.zip .
podman rm "$CONTAINER_ID"
unzip privateai-setup-gpu-index-26.1.0.0.0.zip
# Run configuration
mkdir -p ~/privateai ~/secrets
cd setup
./secretsSetup.sh -s ~/secrets
./configSetup.sh -d ~/privateai -s ~/secrets
./containerSetup.sh -d ~/privateai
4. Start and Verify the Service
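# Confirm the container is running
podman ps
# Query the health endpoint over HTTP/2, trusting the certificate generated by the setup scripts
curl --http2-prior-knowledge --cacert ~/secrets/cert.pem https://$(hostname -f):8443/health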
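The container may need a few seconds before it answers, so a single curl right after startup can fail even though nothing is wrong. The sketch below retries the same health request until it succeeds or a limit is reached. The function name wait_for_health and the retry counts are illustrative, and it assumes the endpoint returns a successful HTTP status once the service is ready.

```shell
# Poll the health endpoint until curl succeeds or the attempts run out.
# wait_for_health is a hypothetical helper; the URL and certificate path are
# passed in by the caller.
wait_for_health() {
  # $1 = health URL, $2 = CA certificate, $3 = max attempts, $4 = seconds between attempts
  local url="$1" cacert="$2" attempts="${3:-30}" delay="${4:-2}" i=1
  while [ "$i" -le "$attempts" ]; do
    # --fail makes curl exit non-zero on HTTP error statuses (4xx/5xx)
    if curl --silent --fail --http2-prior-knowledge --cacert "$cacert" "$url" >/dev/null; then
      echo "healthy after $i attempt(s)"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "not healthy after $attempts attempts" >&2
  return 1
}

# Usage (same URL and certificate as the manual check):
#   wait_for_health "https://$(hostname -f):8443/health" ~/secrets/cert.pem 30 2
```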
Connect from Oracle AI Database
-- In your database session
BEGIN
  DBMS_VECTOR_INDEX.SET_OFFLOAD(
    offload_url => 'https://your-gpu-host:8443/v1/index',
    api_key     => 'your-api-key-from-secrets',
    cert_path   => '/path/to/cert.pem'
  );
END;
/
Best Practices & Tips
- Run the GPU container on a separate machine from the database for best results
- Start with smaller datasets to test performance gains
- Monitor GPU utilization with nvidia-smi during index creation
- Use TLS 1.3 (automatically configured by the setup scripts)
- Scale vertically with bigger GPUs or horizontally with multiple GPU nodes later
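For the monitoring tip, nvidia-smi can emit one CSV sample per second, and a small filter can reduce the samples to peak values. This is a sketch for a single-GPU host. The function name summarize_gpu_samples is an assumption made for this guide, and on multi-GPU machines the samples from all devices would be mixed together.

```shell
# Reduce a stream of "utilization, memory_used" CSV samples to peak values.
# Input is expected in the format produced by:
#   nvidia-smi --query-gpu=utilization.gpu,memory.used --format=csv,noheader,nounits -l 1
# summarize_gpu_samples is a hypothetical helper written for this guide.
summarize_gpu_samples() {
  awk -F', *' '
    $1 + 0 > peak_util { peak_util = $1 + 0 }
    $2 + 0 > peak_mem  { peak_mem  = $2 + 0 }
    END { printf "samples=%d peak_util=%d%% peak_mem=%dMiB\n", NR, peak_util, peak_mem }
  '
}

# Usage: sample for 10 minutes while the index builds, then print the summary
#   timeout 600 nvidia-smi --query-gpu=utilization.gpu,memory.used \
#     --format=csv,noheader,nounits -l 1 | summarize_gpu_samples
```

If peak utilization stays low during a build, the bottleneck is likely elsewhere, such as the network between the database and the GPU host.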
Expected Results
Index creation time should drop substantially, in line with the 5x–10x range noted earlier, with the largest gains on datasets with millions of vectors. Your database remains responsive while the heavy compute happens on the GPU.
Next Steps
Once set up, you can create vector indexes normally using DBMS_VECTOR_INDEX.CREATE_INDEX — the offload happens automatically in the background.