
Ontology Datasets

17 core .org.ai vocabularies providing semantic foundations for business applications

The .org.ai ontologies are the foundational vocabularies of the .do platform. These datasets provide standardized, authoritative taxonomies for industries, occupations, skills, processes, and more, derived from trusted sources such as O*NET, NAICS, APQC, and GS1.

Overview

All ontology datasets are:

  • Standardized - Based on authoritative sources (O*NET, NAICS, APQC, GS1)
  • Semantic - Fully typed with Schema.org-compatible metadata
  • Maintained - Refreshed from source systems on a regular schedule (monthly for most datasets; see Dataset Statistics)
  • Free - Public domain or open licenses for most datasets (see Data Sources for per-source terms)
  • Versioned - Full git history of all changes
  • Type-safe - Complete TypeScript definitions

Core Ontologies (17)

Work & Occupations (7 datasets)

Foundation data for workforce, skills, and career information from O*NET:

occupations.org.ai

923 O*NET SOC occupations with detailed profiles, education requirements, and salary data

skills.org.ai

35 skill categories from O*NET covering cognitive, technical, and interpersonal abilities

tasks.org.ai

19,000+ task statements describing work activities across all occupations

activities.org.ai

2,069 work activities from O*NET detailing what workers do on the job

knowledge.org.ai

33 knowledge areas required for occupations (business, technical, scientific)

context.org.ai

41 work context categories describing physical and social work environments

jobs.org.ai

35,000+ alternate job titles mapped to standard SOC occupations

Technology & Tools (2 datasets)

Equipment and technology used across occupations:

tech.org.ai

2,800+ technology skills from O*NET (software, programming languages, platforms)

tools.org.ai

15,000+ tools and equipment used in various occupations

Business & Industry (4 datasets)

Business processes, industries, products, and services:

industries.org.ai

1,170 NAICS industry codes with hierarchical classification

processes.org.ai

1,000+ APQC business processes across 12 major categories

products.org.ai

6,000+ product categories for inventory and catalog management

services.org.ai

50,000+ service codes for professional and business services

Integration & Data (3 datasets)

Integration actions, events, and educational institutions:

actions.org.ai

14,116+ Zapier actions from 7,000+ apps for workflow automation

events.org.ai

GS1 EPCIS event vocabulary for supply chain tracking

education.org.ai

6,500+ educational institutions (from College Scorecard)

Geographic (1 dataset)

places.org.ai

11M+ geographical places from GeoNames with coordinates and metadata

Quick Start

Installation

# Install specific ontology
pnpm add industries.org.ai

# Or install all ontologies
pnpm add @dotdo/ontologies

Basic Usage

import { getAllTypes, getType } from 'industries.org.ai'
import { $ } from 'sdk.do'

// Get all industries
const industries = getAllTypes()

// Get specific industry
const software = getType('5112') // Software Publishers

// Use with SDK: list occupations in that industry
const occupations = await $.Occupation.list().where({ industry: '5112' })

Data Sources

O*NET (Occupational Information Network)

Source: U.S. Department of Labor
URL: https://www.onetonline.org
License: Public Domain
Update Frequency: Monthly

Datasets (9):

  • occupations.org.ai (923 occupations)
  • skills.org.ai (35 skills)
  • tasks.org.ai (19,000+ tasks)
  • activities.org.ai (2,069 activities)
  • knowledge.org.ai (33 knowledge areas)
  • context.org.ai (41 work contexts)
  • jobs.org.ai (35,000+ job titles)
  • tech.org.ai (2,800+ technologies)
  • tools.org.ai (15,000+ tools)

NAICS (North American Industry Classification System)

Source: U.S. Census Bureau
URL: https://www.census.gov/naics
License: Public Domain
Update Frequency: Every 5 years (revisions)

Datasets (1):

  • industries.org.ai (1,170 industry codes)

APQC (American Productivity & Quality Center)

Source: APQC Process Classification Framework
URL: https://www.apqc.org/process-frameworks
License: CC BY-ND 4.0
Update Frequency: Annual

Datasets (1):

  • processes.org.ai (1,000+ business processes)

GS1

Source: GS1 Standards
URL: https://www.gs1.org
License: GS1 General Specifications
Update Frequency: Quarterly

Datasets (1):

  • events.org.ai (EPCIS event types)

Zapier

Source: Zapier Platform
URL: https://zapier.com
License: API Terms of Service
Update Frequency: Weekly

Datasets (1):

  • actions.org.ai (14,116+ actions from 7,000+ apps)

GeoNames

Source: GeoNames Geographical Database
URL: https://www.geonames.org
License: CC BY 4.0
Update Frequency: Continuous

Datasets (1):

  • places.org.ai (11M+ places)

U.S. Department of Education

Source: College Scorecard
URL: https://collegescorecard.ed.gov
License: Public Domain
Update Frequency: Annual

Datasets (1):

  • education.org.ai (6,500+ institutions)

Schema Examples

Occupation

interface Occupation {
  $id: string // "15-1252.00"
  $type: 'Occupation'
  name: string // "Software Developers"
  description: string // Full description
  data: {
    socCode: string // "15-1252.00"
    category: string // "Computer and Mathematical Occupations"
    brightOutlook: boolean // true/false
    greenOccupation: boolean
    apprenticeship: boolean
    education: string // "Bachelor's degree"
    experience: string // "None"
    onTheJobTraining: string
    medianWage: number // 120730 (annual)
    employment: number // 1847900
    projectedGrowth: number // 25.7 (percent)
  }
}
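
The `socCode` values above follow the O*NET-SOC pattern `NN-NNNN.NN`. A minimal sketch of splitting such a code into its parts, assuming only the major group and base SOC code are needed. `parseSocCode` and `SocParts` are hypothetical names for illustration, not exports of occupations.org.ai:

```typescript
// Hypothetical helper (not part of occupations.org.ai): split an O*NET-SOC
// code such as "15-1252.00" into its components.
interface SocParts {
  majorGroup: string // first two digits, e.g. "15"
  socCode: string // base SOC code without the O*NET suffix, e.g. "15-1252"
  onetSuffix: string // two-digit O*NET extension, e.g. "00"
}

function parseSocCode(code: string): SocParts | null {
  const match = /^(\d{2})-(\d{4})\.(\d{2})$/.exec(code)
  if (!match) return null // not in NN-NNNN.NN form
  const [, major, detail, suffix] = match
  return { majorGroup: major, socCode: `${major}-${detail}`, onetSuffix: suffix }
}

const parts = parseSocCode('15-1252.00')
console.log(parts)
```

Returning `null` for malformed input keeps the caller in charge of how to handle codes that do not match the expected pattern.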

Industry

interface Industry {
  $id: string // "5112"
  $type: 'Industry'
  name: string // "Software Publishers"
  description: string // Full description
  data: {
    naicsCode: string // "5112"
    level: number // 4
    sector: string // "51"
    sectorName: string // "Information"
    subsector: string // "511"
    subsectorName: string // "Publishing Industries"
  }
}
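
The `level`, `sector`, and `subsector` fields above can be read off the NAICS code by prefix. A rough sketch, assuming plain numeric codes where `level` equals the number of digits, as in the "5112" example. `naicsHierarchy` is a hypothetical helper, not part of industries.org.ai:

```typescript
// Hypothetical helper (not part of industries.org.ai): derive hierarchy
// fields from a NAICS code by prefix. Assumes a plain numeric code; range
// sectors such as Manufacturing ("31-33") would need a lookup table instead.
interface NaicsHierarchy {
  level: number // number of digits: 2 = sector ... 6 = national industry
  sector: string // first two digits
  subsector: string | null // first three digits, or null for 2-digit codes
}

function naicsHierarchy(code: string): NaicsHierarchy {
  if (!/^\d{2,6}$/.test(code)) {
    throw new Error(`Not a NAICS code: ${code}`)
  }
  return {
    level: code.length,
    sector: code.slice(0, 2),
    subsector: code.length >= 3 ? code.slice(0, 3) : null,
  }
}

console.log(naicsHierarchy('5112'))
```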

Skill

interface Skill {
  $id: string // "2.A.1.a"
  $type: 'Skill'
  name: string // "Reading Comprehension"
  description: string // Full description
  data: {
    element: string // "2.A.1.a"
    elementName: string // "Reading Comprehension"
    category: string // "Basic Skills"
    type: string // "Content"
  }
}
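
Element IDs such as "2.A.1.a" are dotted paths, so the parent groupings of a skill can be recovered from the ID alone. A small sketch of that idea; `elementAncestors` is a hypothetical helper, and whether skills.org.ai exposes records for the intermediate IDs is an assumption to verify:

```typescript
// Hypothetical helper: list the ancestor element IDs of a dotted O*NET
// element ID, from the root down to the immediate parent.
function elementAncestors(elementId: string): string[] {
  const segments = elementId.split('.')
  const ancestors: string[] = []
  for (let i = 1; i < segments.length; i++) {
    ancestors.push(segments.slice(0, i).join('.'))
  }
  return ancestors
}

console.log(elementAncestors('2.A.1.a')) // [ '2', '2.A', '2.A.1' ]
```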

Process

interface Process {
  $id: string // "10001"
  $type: 'Process'
  name: string // "Develop Vision and Strategy"
  description: string // Full description
  data: {
    processId: string // "1.0"
    level: number // 1
    category: string // "Operating Processes"
    parentId?: string // Parent process ID
  }
}
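
Because each process carries an optional `parentId`, a flat list of records can be assembled into a hierarchy. The sketch below assumes `parentId` holds the parent's `$id` (the schema comment does not say whether it refers to `$id` or `processId`). `ProcessNode`, `buildProcessTree`, and the child records are made up for illustration:

```typescript
// Minimal shape for this sketch; mirrors the Process fields used here.
interface ProcessNode {
  $id: string
  name: string
  parentId?: string // assumed to hold the parent's $id
  children: ProcessNode[]
}

// Build a forest of processes by linking each record to its parent.
function buildProcessTree(records: Array<Omit<ProcessNode, 'children'>>): ProcessNode[] {
  const byId = new Map<string, ProcessNode>()
  for (const record of records) {
    byId.set(record.$id, { ...record, children: [] })
  }
  const roots: ProcessNode[] = []
  byId.forEach((node) => {
    const parent = node.parentId ? byId.get(node.parentId) : undefined
    if (parent) parent.children.push(node)
    else roots.push(node) // no parentId, or parent missing from the input
  })
  return roots
}

// Made-up sample records; only "10001" appears in the schema example above.
const tree = buildProcessTree([
  { $id: '10001', name: 'Develop Vision and Strategy' },
  { $id: 'sample-a', name: 'Sample child process A', parentId: '10001' },
  { $id: 'sample-b', name: 'Sample child process B', parentId: '10001' },
])
console.log(tree[0].children.map((child) => child.name))
```

Records whose parent is absent from the input are treated as roots rather than dropped, which keeps partial query results usable.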

Common Use Cases

Job Matching

Match candidates to occupations based on skills:

import { $, db } from 'sdk.do'

// Find occupations requiring a specific skill
// ('Programming' is an O*NET skill; technologies such as JavaScript are listed in tech.org.ai)
const jobs = await db.query($.Occupation).related($.requires, $.Skill).where({ 'skill.name': 'Programming' })

// Get required education for occupation
const occupation = await db.get($.Occupation, '15-1252.00')
console.log(occupation.data.education) // "Bachelor's degree"

Industry Analysis

Analyze industries and related companies:

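// Get all industries in NAICS subsector 511 (Publishing Industries), which includes Software Publishers (5112)
const industries = await db.query($.Industry).where({ naicsCode: { startsWith: '511' } })

// Get companies in the first matching industry
const [industry] = industries
const companies = await db.related(industry, $.hasIndustry, $.Organization)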

Skills Gap Analysis

Compare required vs actual skills:

// Get required skills for an occupation (`occupation` loaded as in Job Matching)
const requiredSkills = await db.related(occupation, $.requires, $.Skill)

// Get the user's current skills (`user` is an entity loaded elsewhere)
const userSkills = await db.related(user, $.has, $.Skill)

// Calculate skills gap (entities are keyed by $id, as in the schema examples)
const gaps = requiredSkills.filter((skill) => !userSkills.some((us) => us.$id === skill.$id))
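
For larger skill lists, the same comparison can use a `Set` lookup and also report how much of the requirement is already covered. A self-contained sketch with in-memory data; `SkillRef` and `skillsGap` are hypothetical names, and the sample records are illustrative rather than query results:

```typescript
// Minimal shape for this sketch: only the fields the comparison needs.
interface SkillRef {
  $id: string
  name: string
}

// Return the required skills the user lacks, plus the share already covered.
function skillsGap(required: SkillRef[], held: SkillRef[]) {
  const heldIds = new Set(held.map((skill) => skill.$id))
  const missing = required.filter((skill) => !heldIds.has(skill.$id))
  const coverage = required.length === 0 ? 1 : (required.length - missing.length) / required.length
  return { missing, coverage }
}

// Illustrative sample data; "2.A.1.a" matches the Skill schema example.
const requiredSample: SkillRef[] = [
  { $id: '2.A.1.a', name: 'Reading Comprehension' },
  { $id: '2.A.2.a', name: 'Critical Thinking' },
  { $id: '2.B.3.e', name: 'Programming' },
]
const heldSample: SkillRef[] = [{ $id: '2.A.1.a', name: 'Reading Comprehension' }]

const gapReport = skillsGap(requiredSample, heldSample)
console.log(gapReport.missing.map((skill) => skill.name), gapReport.coverage)
```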

Process Automation

Map business processes to automation actions:

// Get business process (named businessProcess to avoid shadowing Node's global `process`)
const businessProcess = await db.get($.Process, '10001')

// Find automation actions
const actions = await db.related(businessProcess, $.automatedBy, $.Action)

// Filter Zapier actions
const zapierActions = actions.filter((action) => action.ns === 'actions.org.ai')

Location-Based Services

Find places and related data:

// Search places by name
const places = await $.Place.search('San Francisco')

// Get nearby places
const nearby = await $.Place.nearby(37.7749, -122.4194, {
  radius: 10, // miles
  type: 'city',
})

// Get place details
const place = await db.get($.Place, 'geonames:5391959')
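
How `$.Place.nearby` computes distance is not described here. One common way to implement a radius filter over latitude/longitude pairs is the haversine great-circle distance, sketched below. `haversineMiles`, `withinRadius`, and the sample coordinates are illustrative assumptions, not part of places.org.ai:

```typescript
// Sketch only: haversine great-circle distance as a radius filter.
const EARTH_RADIUS_MILES = 3958.8 // mean Earth radius in miles

interface Coordinates {
  name: string
  latitude: number
  longitude: number
}

function toRadians(degrees: number): number {
  return (degrees * Math.PI) / 180
}

function haversineMiles(lat1: number, lon1: number, lat2: number, lon2: number): number {
  const sinLat = Math.sin(toRadians(lat2 - lat1) / 2)
  const sinLon = Math.sin(toRadians(lon2 - lon1) / 2)
  const a = sinLat * sinLat + Math.cos(toRadians(lat1)) * Math.cos(toRadians(lat2)) * sinLon * sinLon
  return 2 * EARTH_RADIUS_MILES * Math.asin(Math.sqrt(a))
}

// Keep only the places within radiusMiles of the given point.
function withinRadius(places: Coordinates[], lat: number, lon: number, radiusMiles: number): Coordinates[] {
  return places.filter((p) => haversineMiles(lat, lon, p.latitude, p.longitude) <= radiusMiles)
}

// Approximate city-center coordinates, for illustration only.
const samplePlaces: Coordinates[] = [
  { name: 'Oakland', latitude: 37.8044, longitude: -122.2712 },
  { name: 'Los Angeles', latitude: 34.0522, longitude: -118.2437 },
]

const nearSanFrancisco = withinRadius(samplePlaces, 37.7749, -122.4194, 10)
console.log(nearSanFrancisco.map((p) => p.name)) // [ 'Oakland' ]
```

A linear scan like this is fine for small lists. At the scale of 11M+ places, a spatial index would be the more realistic choice.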

API Reference

Query Functions

All ontologies export the same interface:

import { getAllTypes, getType, search } from '[ontology].org.ai'

// Get all types
const all = getAllTypes()

// Get by ID
const type = getType(id)

// Search (if supported)
const results = search(query)
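
To make the shared interface concrete, here is a sketch of what such a module might look like over an in-memory record list. The `OntologyModule` type, the `createOntology` factory, the case-insensitive name search, and the placeholder descriptions are all assumptions for illustration; the published packages may type and implement these functions differently:

```typescript
// Assumed common record shape, based on the schema examples.
interface OntologyType {
  $id: string
  $type: string
  name: string
  description: string
}

// Assumed shape of the shared module interface.
interface OntologyModule<T extends OntologyType> {
  getAllTypes(): T[]
  getType(id: string): T | undefined
  search(query: string): T[]
}

// Hypothetical factory: back the three functions with an in-memory list.
function createOntology<T extends OntologyType>(records: T[]): OntologyModule<T> {
  const byId = new Map<string, T>()
  for (const record of records) byId.set(record.$id, record)
  return {
    getAllTypes: () => records.slice(),
    getType: (id) => byId.get(id),
    // Case-insensitive substring match on the name field.
    search: (query) => {
      const needle = query.toLowerCase()
      return records.filter((record) => record.name.toLowerCase().includes(needle))
    },
  }
}

const sampleIndustries = createOntology([
  { $id: '5111', $type: 'Industry', name: 'Newspaper, Periodical, Book, and Directory Publishers', description: 'Placeholder' },
  { $id: '5112', $type: 'Industry', name: 'Software Publishers', description: 'Placeholder' },
])

console.log(sampleIndustries.getType('5112')?.name)
console.log(sampleIndustries.search('software').length)
```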

SDK Integration

import { $, db } from 'sdk.do'

// List with filters
const results = await db.list($.Type, {
  where: { category: 'Software' },
  limit: 10,
  orderBy: { name: 'asc' },
})

// Get relationships
const related = await db.related(entity, $.predicate, $.Object)

// Aggregations
const stats = await db.aggregate($.Occupation, {
  groupBy: ['category'],
  count: true,
  avg: ['medianWage'],
})
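
For intuition, the aggregation above corresponds roughly to a group-by with a count and an average per group. The sketch below shows that computation over in-memory rows. It does not describe how `db.aggregate` is implemented or what shape it returns, and apart from the 120730 wage from the Occupation schema example, the sample rows are made up:

```typescript
// Minimal row shape for this sketch.
interface OccupationRow {
  category: string
  medianWage: number
}

interface CategoryStats {
  category: string
  count: number
  avgMedianWage: number
}

// Group rows by category, then compute a count and mean wage per group.
function aggregateByCategory(rows: OccupationRow[]): CategoryStats[] {
  const groups = new Map<string, { count: number; wageTotal: number }>()
  for (const row of rows) {
    const group = groups.get(row.category) ?? { count: 0, wageTotal: 0 }
    group.count += 1
    group.wageTotal += row.medianWage
    groups.set(row.category, group)
  }
  const stats: CategoryStats[] = []
  groups.forEach((group, category) => {
    stats.push({ category, count: group.count, avgMedianWage: group.wageTotal / group.count })
  })
  return stats
}

// Made-up sample rows, except the 120730 wage from the schema example.
const sampleStats = aggregateByCategory([
  { category: 'Computer and Mathematical Occupations', medianWage: 120730 },
  { category: 'Computer and Mathematical Occupations', medianWage: 100000 },
  { category: 'Sample Category', medianWage: 60000 },
])
console.log(sampleStats)
```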

Updates & Maintenance

Ontologies are refreshed from their source systems on the schedules listed under Dataset Statistics (monthly for most datasets):

# Check last update
do dataset info industries.org.ai

# Manually trigger update (admin only)
do dataset sync industries.org.ai

# View update history
do dataset history industries.org.ai --limit 10

Dataset Statistics

| Dataset | Records | Source | License | Update |
| --- | --- | --- | --- | --- |
| occupations.org.ai | 923 | O*NET | Public Domain | Monthly |
| skills.org.ai | 35 | O*NET | Public Domain | Monthly |
| tasks.org.ai | 19,000+ | O*NET | Public Domain | Monthly |
| activities.org.ai | 2,069 | O*NET | Public Domain | Monthly |
| knowledge.org.ai | 33 | O*NET | Public Domain | Monthly |
| context.org.ai | 41 | O*NET | Public Domain | Monthly |
| jobs.org.ai | 35,000+ | O*NET | Public Domain | Monthly |
| tech.org.ai | 2,800+ | O*NET | Public Domain | Monthly |
| tools.org.ai | 15,000+ | O*NET | Public Domain | Monthly |
| industries.org.ai | 1,170 | NAICS | Public Domain | Every 5 years |
| processes.org.ai | 1,000+ | APQC | CC BY-ND 4.0 | Annual |
| products.org.ai | 6,000+ | Internal | MIT | Monthly |
| services.org.ai | 50,000+ | Internal | MIT | Monthly |
| actions.org.ai | 14,116+ | Zapier | API ToS | Weekly |
| events.org.ai | Vocabulary | GS1 | GS1 Specs | Quarterly |
| education.org.ai | 6,500+ | College Scorecard | Public Domain | Annual |
| places.org.ai | 11M+ | GeoNames | CC BY 4.0 | Continuous |

Contributing

To update an ontology:

  1. Fork dot-do/ai
  2. Update source data in ai/sources/datasets/
  3. Run mdxdb dataset publish [DatasetName] --dry-run
  4. Submit PR with changes

For questions or issues, visit GitHub Discussions or open an issue.


Last updated: 2025-11-02