embed

Generate vector embeddings from text, images, and multimodal content for semantic search and AI applications

Generate Vector Embeddings for Semantic Understanding

Overview

The embed primitive generates vector embeddings for text, images, and multimodal content. It converts content into high-dimensional vectors that can be used for semantic search, similarity matching, clustering, and other AI-powered applications, using the models listed under Embedding Models.

SDK Usage

import { embed } from 'sdk.do'

// Generate text embeddings
const textEmbedding = await embed.text('Machine learning fundamentals')

// Generate image embeddings
const imageEmbedding = await embed.image('https://example.com/photo.jpg')

// Batch embedding generation
const embeddings = await embed.batch([
  { type: 'text', content: 'AI research paper' },
  { type: 'text', content: 'Machine learning tutorial' },
  { type: 'image', url: 'https://example.com/diagram.png' }
])
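
For large corpora, inputs usually need to be split into fixed-size chunks before they are handed to a batch call. The sketch below models the batch items shown above as a discriminated union and chunks them. The `EmbedInput` type name, the `chunk` helper, and the chunk size are illustrative assumptions, not part of the documented SDK.

```typescript
// Hypothetical type mirroring the batch items shown above; not an official SDK type.
type EmbedInput =
  | { type: 'text'; content: string }
  | { type: 'image'; url: string }

// Split a list into consecutive chunks of at most `size` items.
function chunk<T>(items: T[], size: number): T[][] {
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError('size must be a positive integer')
  }
  const out: T[][] = []
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size))
  }
  return out
}

const inputs: EmbedInput[] = [
  { type: 'text', content: 'AI research paper' },
  { type: 'text', content: 'Machine learning tutorial' },
  { type: 'image', url: 'https://example.com/diagram.png' },
]

// Each chunk could then be passed to embed.batch(...) in turn.
const batches = chunk(inputs, 2)
console.log(batches.map((b) => b.length)) // [ 2, 1 ]
```

A suitable chunk size depends on the limits of the underlying model provider, which this page does not state.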

Semantic Pattern

// Embed Pattern: $.Embed.verb.Object
await $.Embed.generate.TextEmbedding({ text: 'content' })
await $.Embed.generate.ImageEmbedding({ url: 'image-url' })
await $.Embed.generate.MultimodalEmbedding({ text, image })
await $.Embed.compute.Similarity({ embedding1, embedding2 })
await $.Embed.search.Similar({ query, collection })

// Semantic search workflow
const query = await $.Embed.generate.TextEmbedding({
  text: 'artificial intelligence research'
})

const results = await $.Vectors.search.Similar({
  embedding: query,
  collection: 'research-papers',
  limit: 10,
  threshold: 0.8
})
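
To make the `limit` and `threshold` parameters concrete, the following in-memory sketch ranks stored vectors by cosine similarity, discards matches below the threshold, and returns the top results. It assumes that `threshold` is a minimum cosine similarity and `limit` a maximum result count, which this page does not spell out. `searchSimilar` and `StoredVector` are hypothetical names, not the platform's implementation.

```typescript
// Hypothetical record shape for this sketch.
interface StoredVector {
  id: string
  embedding: number[]
}

// Cosine similarity: dot(a, b) / (|a| * |b|).
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new RangeError('dimension mismatch')
  let dot = 0
  let normA = 0
  let normB = 0
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]
    normA += a[i] * a[i]
    normB += b[i] * b[i]
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB)
  return denom === 0 ? 0 : dot / denom
}

// Brute-force search: score everything, filter by threshold, keep the top `limit`.
function searchSimilar(
  query: number[],
  collection: StoredVector[],
  limit: number,
  threshold: number,
): { id: string; score: number }[] {
  return collection
    .map((v) => ({ id: v.id, score: cosineSimilarity(query, v.embedding) }))
    .filter((r) => r.score >= threshold)
    .sort((x, y) => y.score - x.score)
    .slice(0, limit)
}

// Toy 2-dimensional vectors; real embeddings have hundreds or thousands of dimensions.
const papers: StoredVector[] = [
  { id: 'a', embedding: [1, 0] },
  { id: 'b', embedding: [0.9, 0.1] },
  { id: 'c', embedding: [0, 1] },
]

const hits = searchSimilar([1, 0], papers, 10, 0.8)
console.log(hits.map((h) => h.id)) // [ 'a', 'b' ]
```

A production vector store would typically use an approximate nearest-neighbor index rather than a full scan, but the ranking and filtering semantics are the same.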

Quick Example

import { embed, on, $ } from 'sdk.do' // assumes sdk.do also exports the on() event helper used below

// Generate embedding for search query
const queryEmbedding = await embed.text('best AI frameworks for production')

// Search similar documents
const results = await $.Vectors.search.Similar({
  embedding: queryEmbedding,
  collection: 'documentation',
  limit: 10
})

// Compute similarity between two texts
const similarity = await embed.similarity(
  'machine learning models',
  'neural network architectures'
)

console.log(`Similarity score: ${similarity}`) // e.g. 0.87 (illustrative value)

// Embed each new document as it is created and store it for indexing
on($.Document.created, async (doc) => {
  const embedding = await embed.text(doc.content)
  await $.Vectors.store({
    id: doc.id,
    embedding,
    metadata: { title: doc.title, created: doc.createdAt }
  })
})

Core Capabilities

  • Text Embeddings - Generate embeddings from text using models from OpenAI, Cohere, and Voyage AI
  • Image Embeddings - Convert images to vectors for visual similarity search
  • Multimodal Embeddings - Unified embeddings for text + image content (CLIP-style)
  • Batch Processing - Efficient batch embedding generation with automatic parallelization
  • Multiple Models - Choose from various embedding models optimized for different use cases
  • Similarity Search - Compute cosine similarity, dot product, or Euclidean distance
  • Dimension Reduction - Reduce embedding dimensions for faster search (PCA, UMAP)
  • Edge Deployment - Generate embeddings at edge locations for low latency
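
The three similarity measures named in the list above are closely related. The following is a plain TypeScript sketch of the standard formulas, independent of the SDK; the function names are illustrative. For unit-length vectors, the dot product equals the cosine similarity, and the squared Euclidean distance equals 2 - 2 × cosine.

```typescript
// Dot product: sum of element-wise products.
function dotProduct(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new RangeError('dimension mismatch')
  return a.reduce((sum, x, i) => sum + x * b[i], 0)
}

// Euclidean (L2) distance: lower means more similar.
function euclideanDistance(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new RangeError('dimension mismatch')
  return Math.sqrt(a.reduce((sum, x, i) => sum + (x - b[i]) ** 2, 0))
}

// Cosine similarity: dot product divided by the product of the vector lengths.
function cosine(a: number[], b: number[]): number {
  const denom = Math.sqrt(dotProduct(a, a)) * Math.sqrt(dotProduct(b, b))
  return denom === 0 ? 0 : dotProduct(a, b) / denom
}

// Scale a vector to unit length (zero vectors are returned unchanged).
function normalize(v: number[]): number[] {
  const n = Math.sqrt(dotProduct(v, v))
  return n === 0 ? v.slice() : v.map((x) => x / n)
}

const u = normalize([3, 4]) // approximately [0.6, 0.8]
const w = normalize([4, 3]) // approximately [0.8, 0.6]

console.log(cosine([3, 4], [4, 3]))        // 0.96
console.log(dotProduct(u, w))              // approximately 0.96
console.log(euclideanDistance(u, w) ** 2)  // approximately 0.08, i.e. 2 - 2 * 0.96
```

Because of this relationship, all three measures produce the same ranking once embeddings are normalized, so the choice mostly matters for unnormalized vectors.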

Embedding Models

  • text-embedding-3-large (OpenAI) - 3,072 dimensions, best quality
  • text-embedding-3-small (OpenAI) - 1,536 dimensions, fast and cost-effective
  • embed-english-v3.0 (Cohere) - 1,024 dimensions, optimized for English
  • voyage-large-2 (Voyage AI) - 1,536 dimensions, high accuracy
  • CLIP-ViT-L (OpenAI) - Multimodal text + image embeddings
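
Dimension count drives storage and search cost. As a rough worked example, assuming each component is stored as a 32-bit float (4 bytes) and ignoring index and metadata overhead, the raw size of a collection can be estimated as follows. The `MODEL_DIMENSIONS` table restates the figures listed above; the helper name is illustrative.

```typescript
// Dimensions as listed above (CLIP is omitted because no dimension is given for it).
const MODEL_DIMENSIONS: Record<string, number> = {
  'text-embedding-3-large': 3072,
  'text-embedding-3-small': 1536,
  'embed-english-v3.0': 1024,
  'voyage-large-2': 1536,
}

// Assumption: vectors are stored as float32 values.
const BYTES_PER_FLOAT32 = 4

// Raw bytes needed for `vectorCount` embeddings from the given model.
function rawStorageBytes(model: string, vectorCount: number): number {
  const dims = MODEL_DIMENSIONS[model]
  if (dims === undefined) throw new Error(`unknown model: ${model}`)
  return dims * BYTES_PER_FLOAT32 * vectorCount
}

const oneMillion = 1e6
console.log(rawStorageBytes('text-embedding-3-large', oneMillion) / 1e9) // 12.288 (GB)
console.log(rawStorageBytes('text-embedding-3-small', oneMillion) / 1e9) // 6.144 (GB)
```

Halving the dimension count halves the raw footprint, which is the trade-off behind choosing a smaller model or reducing dimensions.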

Access Methods

SDK

TypeScript/JavaScript library for programmatic embedding generation

import { embed } from 'sdk.do'
const embedding = await embed.text('sample text')
const similarity = await embed.similarity('neural networks', 'deep learning')

CLI

Command-line tool for embed operations

do embed text "machine learning fundamentals"
do embed image https://example.com/photo.jpg
do embed batch --file documents.jsonl

API

REST/RPC endpoints for embedding integration

curl -X POST https://api.do/v1/embed/text \
  -H 'Content-Type: application/json' \
  -d '{"text":"machine learning fundamentals","model":"text-embedding-3-large"}'
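
The same request can be prepared from TypeScript. The sketch below only builds the request so it can be inspected and does not send anything. The endpoint and JSON body come from the curl example above, and `Content-Type: application/json` is the standard header for a JSON body. The `Authorization` header and the `buildEmbedRequest` helper are assumptions, since this page does not show how the API authenticates.

```typescript
// Request payload as shown in the curl example; `model` is treated as optional here (assumption).
interface EmbedTextBody {
  text: string
  model?: string
}

interface PreparedRequest {
  url: string
  init: { method: string; headers: Record<string, string>; body: string }
}

// Hypothetical helper: prepares arguments for fetch(url, init) without sending anything.
function buildEmbedRequest(body: EmbedTextBody, apiKey?: string): PreparedRequest {
  const headers: Record<string, string> = { 'Content-Type': 'application/json' }
  // Assumption: the API accepts a bearer token; the source does not show authentication.
  if (apiKey) headers['Authorization'] = `Bearer ${apiKey}`
  return {
    url: 'https://api.do/v1/embed/text',
    init: { method: 'POST', headers, body: JSON.stringify(body) },
  }
}

const prepared = buildEmbedRequest({
  text: 'machine learning fundamentals',
  model: 'text-embedding-3-large',
})
console.log(prepared.init.body)
// To send it: const response = await fetch(prepared.url, prepared.init)
```

Keeping request construction separate from the network call makes the payload easy to log and unit test.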

API Documentation

MCP

Model Context Protocol for AI assistant integration

Generate embedding for text "AI research paper"
Compute similarity between "neural networks" and "deep learning"

Related Primitives

  • embeddings - Embedding storage and management
  • vectors - Vector database operations
  • llm - Language model access
  • searches - Semantic search capabilities
  • models - AI model access