Ontology Datasets
17 core .org.ai vocabularies providing semantic foundations for business applications
The .org.ai ontologies are the foundational vocabularies of the .do platform. These datasets provide standardized taxonomies for industries, occupations, skills, processes, and more, derived from authoritative sources such as O*NET, NAICS, APQC, and GS1.
Overview
All ontology datasets are:
- Standardized - Based on authoritative sources (O*NET, NAICS, APQC, GS1)
- Semantic - Fully typed with Schema.org-compatible metadata
- Maintained - Regularly updated from source systems (monthly for most datasets)
- Free - Public domain or open licenses
- Versioned - Full git history of all changes
- Type-safe - Complete TypeScript definitions
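As an illustration of the typing described above, the records shown under Schema Examples all follow one envelope: `$id`, `$type`, `name`, `description`, and a dataset-specific `data` payload. The sketch below captures that shape with a generic type and a runtime guard. The names `OntologyRecord`, `IndustryData`, and `isRecordOfType` are hypothetical and are not confirmed exports of any .org.ai package.

```typescript
// Hypothetical generic envelope inferred from the Schema Examples section;
// not a confirmed export of any .org.ai package.
interface OntologyRecord<TType extends string, TData> {
  $id: string
  $type: TType
  name: string
  description: string
  data: TData
}

// Payload for an industry record, mirroring the Industry schema example.
interface IndustryData {
  naicsCode: string
  level: number
  sector: string
  sectorName: string
  subsector: string
  subsectorName: string
}

type IndustryRecord = OntologyRecord<'Industry', IndustryData>

// Runtime guard: checks the envelope fields and the expected $type tag.
// It does not validate the contents of `data`.
function isRecordOfType<TType extends string>(
  value: unknown,
  type: TType,
): value is OntologyRecord<TType, unknown> {
  if (typeof value !== 'object' || value === null) return false
  const v = value as Record<string, unknown>
  return (
    typeof v['$id'] === 'string' &&
    v['$type'] === type &&
    typeof v['name'] === 'string' &&
    typeof v['description'] === 'string' &&
    typeof v['data'] === 'object' &&
    v['data'] !== null
  )
}

// Sample record using the values from the Industry schema example.
const softwarePublishers: IndustryRecord = {
  $id: '5112',
  $type: 'Industry',
  name: 'Software Publishers',
  description: 'Full description',
  data: {
    naicsCode: '5112',
    level: 4,
    sector: '51',
    sectorName: 'Information',
    subsector: '511',
    subsectorName: 'Publishing Industries',
  },
}
```

A guard like this is mainly useful at system boundaries, for example when records arrive as untyped JSON and need narrowing before use.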
Core Ontologies (17)
Work & Occupations (7 datasets)
Foundation data for workforce, skills, and career information from O*NET:
occupations.org.ai
923 O*NET SOC occupations with detailed profiles, education requirements, and salary data
skills.org.ai
35 skill categories from O*NET covering cognitive, technical, and interpersonal abilities
tasks.org.ai
19,000+ task statements describing work activities across all occupations
activities.org.ai
2,069 work activities from O*NET detailing what workers do on the job
knowledge.org.ai
33 knowledge areas required for occupations (business, technical, scientific)
context.org.ai
41 work context categories describing physical and social work environments
jobs.org.ai
35,000+ alternate job titles mapped to standard SOC occupations
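Occupation identifiers in these datasets are O*NET-SOC codes such as "15-1252.00". As a minimal sketch, assuming the standard SOC layout (major group ending in 0000, broad occupation ending in 0, plus a two-digit O*NET extension), a code can be split into its levels as follows. The function name `parseOnetSocCode` and the `SocCodeParts` type are hypothetical helpers, not part of any published package. Minor groups are deliberately left out because their coding has exceptions that a simple rule does not cover.

```typescript
// Hypothetical helper types and function; not part of any .org.ai package.
interface SocCodeParts {
  majorGroup: string // e.g. "15-0000"
  broadOccupation: string // e.g. "15-1250"
  detailedOccupation: string // e.g. "15-1252"
  onetExtension: string // e.g. "00"
}

function parseOnetSocCode(code: string): SocCodeParts {
  // Expected shape: two digits, a hyphen, four digits, a dot, two digits.
  const match = /^(\d{2})-(\d{4})\.(\d{2})$/.exec(code)
  if (match === null) {
    throw new Error(`Not an O*NET-SOC code: ${code}`)
  }
  const major = match[1] as string
  const detail = match[2] as string
  const extension = match[3] as string
  return {
    majorGroup: `${major}-0000`,
    // Broad occupations share the first three detail digits and end in 0.
    broadOccupation: `${major}-${detail.slice(0, 3)}0`,
    detailedOccupation: `${major}-${detail}`,
    onetExtension: extension,
  }
}

const softwareDevelopers = parseOnetSocCode('15-1252.00')
```

Grouping by `majorGroup` is one way to roll occupations up to broad categories such as "Computer and Mathematical Occupations".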
Technology & Tools (2 datasets)
Equipment and technology used across occupations:
tech.org.ai
2,800+ technology skills from O*NET (software, programming languages, platforms)
tools.org.ai
15,000+ tools and equipment used in various occupations
Business & Industry (4 datasets)
Business processes, industries, products, and services:
industries.org.ai
1,170 NAICS industry codes with hierarchical classification
processes.org.ai
1,000+ APQC business processes across 12 major categories
products.org.ai
6,000+ product categories for inventory and catalog management
services.org.ai
50,000+ service codes for professional and business services
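The hierarchical classification mentioned for industries.org.ai follows from the NAICS code itself: each additional digit narrows the classification, from 2-digit sectors down to 6-digit national industries. The sketch below derives a code's level name and ancestor codes from its digits alone. `naicsLevelName` and `naicsAncestors` are hypothetical helpers written for illustration, not functions exported by industries.org.ai.

```typescript
// Level names by code length, per the NAICS structure.
const NAICS_LEVEL_NAMES = new Map<number, string>([
  [2, 'sector'],
  [3, 'subsector'],
  [4, 'industry group'],
  [5, 'NAICS industry'],
  [6, 'national industry'],
])

// Hypothetical helper: returns the level name for a 2- to 6-digit code.
function naicsLevelName(code: string): string {
  const name = /^\d+$/.test(code) ? NAICS_LEVEL_NAMES.get(code.length) : undefined
  if (name === undefined) {
    throw new Error(`Not a 2- to 6-digit NAICS code: ${code}`)
  }
  return name
}

// Hypothetical helper: lists ancestor codes from the sector downward.
// Note: some sectors are published as ranges (for example 31-33); this
// sketch returns the plain 2-digit prefix and does not map it to a range.
function naicsAncestors(code: string): string[] {
  naicsLevelName(code) // validates the code, throws if malformed
  const ancestors: string[] = []
  for (let length = 2; length < code.length; length++) {
    ancestors.push(code.slice(0, length))
  }
  return ancestors
}

const ancestorsOf5112 = naicsAncestors('5112') // sector and subsector prefixes
const levelOf5112 = naicsLevelName('5112')
```

For the code "5112" this matches the Industry schema example, where `sector` is "51", `subsector` is "511", and `level` is 4.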
Integration & Data (3 datasets)
Integration actions, events, and educational institutions:
actions.org.ai
14,116+ Zapier actions from 7,000+ apps for workflow automation
events.org.ai
GS1 EPCIS event vocabulary for supply chain tracking
education.org.ai
6,500+ educational institutions (from College Scorecard)
Geographic (1 dataset)
places.org.ai
11M+ geographical places from GeoNames with coordinates and metadata
Quick Start
Installation
# Install specific ontology
pnpm add industries.org.ai
# Or install all ontologies
pnpm add @dotdo/ontologies
Basic Usage
import { getAllTypes, getType } from 'industries.org.ai'
// Get all industries
const industries = getAllTypes()
// Get specific industry
const software = getType('5112') // Software Publishers
// Use with SDK
import { $ } from 'sdk.do'
const jobs = await $.Occupation.list().where({ industry: '5112' })
Data Sources
O*NET (Occupational Information Network)
Source: U.S. Department of Labor
URL: https://www.onetonline.org
License: Public Domain
Update Frequency: Monthly
Datasets (9):
- occupations.org.ai (923 occupations)
- skills.org.ai (35 skills)
- tasks.org.ai (19,000+ tasks)
- activities.org.ai (2,069 activities)
- knowledge.org.ai (33 knowledge areas)
- context.org.ai (41 work contexts)
- jobs.org.ai (35,000+ job titles)
- tech.org.ai (2,800+ technologies)
- tools.org.ai (15,000+ tools)
NAICS (North American Industry Classification System)
Source: U.S. Census Bureau
URL: https://www.census.gov/naics
License: Public Domain
Update Frequency: Every 5 years (revisions)
Datasets (1):
- industries.org.ai (1,170 industry codes)
APQC (American Productivity & Quality Center)
Source: APQC Process Classification Framework
URL: https://www.apqc.org/process-frameworks
License: CC BY-ND 4.0
Update Frequency: Annual
Datasets (1):
- processes.org.ai (1,000+ business processes)
GS1
Source: GS1 Standards
URL: https://www.gs1.org
License: GS1 General Specifications
Update Frequency: Quarterly
Datasets (1):
- events.org.ai (EPCIS event types)
Zapier
Source: Zapier Platform
URL: https://zapier.com
License: API Terms of Service
Update Frequency: Weekly
Datasets (1):
- actions.org.ai (14,116+ actions from 7,000+ apps)
GeoNames
Source: GeoNames Geographical Database
URL: https://www.geonames.org
License: CC BY 4.0
Update Frequency: Continuous
Datasets (1):
- places.org.ai (11M+ places)
U.S. Department of Education
Source: College Scorecard
URL: https://collegescorecard.ed.gov
License: Public Domain
Update Frequency: Annual
Datasets (1):
- education.org.ai (6,500+ institutions)
Schema Examples
Occupation
interface Occupation {
  $id: string // "15-1252.00"
  $type: 'Occupation'
  name: string // "Software Developers"
  description: string // Full description
  data: {
    socCode: string // "15-1252.00"
    category: string // "Computer and Mathematical Occupations"
    brightOutlook: boolean // true/false
    greenOccupation: boolean
    apprenticeship: boolean
    education: string // "Bachelor's degree"
    experience: string // "None"
    onTheJobTraining: string
    medianWage: number // 120730 (annual)
    employment: number // 1847900
    projectedGrowth: number // 25.7 (percent)
  }
}
Industry
interface Industry {
  $id: string // "5112"
  $type: 'Industry'
  name: string // "Software Publishers"
  description: string // Full description
  data: {
    naicsCode: string // "5112"
    level: number // 4
    sector: string // "51"
    sectorName: string // "Information"
    subsector: string // "511"
    subsectorName: string // "Publishing Industries"
  }
}
Skill
interface Skill {
  $id: string // "2.A.1.a"
  $type: 'Skill'
  name: string // "Reading Comprehension"
  description: string // Full description
  data: {
    element: string // "2.A.1.a"
    elementName: string // "Reading Comprehension"
    category: string // "Basic Skills"
    type: string // "Content"
  }
}
Process
interface Process {
  $id: string // "10001"
  $type: 'Process'
  name: string // "Develop Vision and Strategy"
  description: string // Full description
  data: {
    processId: string // "1.0"
    level: number // 1
    category: string // "Operating Processes"
    parentId?: string // Parent process ID
  }
}
Common Use Cases
Job Matching
Match candidates to occupations based on skills:
import { $, db } from 'sdk.do'
// Find occupations requiring specific skills
const jobs = await db.query($.Occupation).related($.requires, $.Skill).where({ 'skill.name': 'JavaScript' })
// Get required education for occupation
const occupation = await db.get($.Occupation, '15-1252.00')
console.log(occupation.data.education) // "Bachelor's degree"
Industry Analysis
Analyze industries and related companies:
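// Get all industries under NAICS subsector 511 (publishing, including software publishers)
const industries = await db.query($.Industry).where({ naicsCode: { startsWith: '511' } })
// Get companies in the first matching industry (assumes the query resolves to an array)
const [industry] = industries
const companies = await db.related(industry, $.hasIndustry, $.Organization)
Skills Gap Analysis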
Compare required vs actual skills:
// Get required skills for occupation
const requiredSkills = await db.related(occupation, $.requires, $.Skill)
// Get user's current skills
const userSkills = await db.related(user, $.has, $.Skill)
// Calculate skills gap: required skills the user does not have (records are keyed by $id)
const gaps = requiredSkills.filter((skill) => !userSkills.some((us) => us.$id === skill.$id))
Process Automation
Map business processes to automation actions:
// Get business process
const process = await db.get($.Process, '10001')
// Find automation actions
const actions = await db.related(process, $.automatedBy, $.Action)
// Filter Zapier actions
const zapierActions = actions.filter((action) => action.ns === 'actions.org.ai')
Location-Based Services
Find places and related data:
// Search places by name
const places = await $.Place.search('San Francisco')
// Get nearby places
const nearby = await $.Place.nearby(37.7749, -122.4194, {
  radius: 10, // miles
  type: 'city',
})
// Get place details
const place = await db.get($.Place, 'geonames:5391959')
API Reference
Query Functions
All ontologies export the same interface:
import { getAllTypes, getType, search } from '[ontology].org.ai'
// Get all types
const all = getAllTypes()
// Get by ID
const type = getType(id)
// Search (if supported)
const results = search(query)
SDK Integration
import { $, db } from 'sdk.do'
// List with filters
const results = await db.list($.Type, {
  where: { category: 'Software' },
  limit: 10,
  orderBy: { name: 'asc' },
})
// Get relationships
const related = await db.related(entity, $.predicate, $.Object)
// Aggregations
const stats = await db.aggregate($.Occupation, {
  groupBy: ['category'],
  count: true,
  avg: ['medianWage'],
})
Updates & Maintenance
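Ontologies are refreshed from their source systems on the schedule listed under Dataset Statistics (monthly for most datasets). The commands below inspect and trigger syncs: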
# Check last update
do dataset info industries.org.ai
# Manually trigger update (admin only)
do dataset sync industries.org.ai
# View update history
do dataset history industries.org.ai --limit 10
Dataset Statistics
| Dataset | Records | Source | License | Update |
|---|---|---|---|---|
| occupations.org.ai | 923 | O*NET | Public Domain | Monthly |
| skills.org.ai | 35 | O*NET | Public Domain | Monthly |
| tasks.org.ai | 19,000+ | O*NET | Public Domain | Monthly |
| activities.org.ai | 2,069 | O*NET | Public Domain | Monthly |
| knowledge.org.ai | 33 | O*NET | Public Domain | Monthly |
| context.org.ai | 41 | O*NET | Public Domain | Monthly |
| jobs.org.ai | 35,000+ | O*NET | Public Domain | Monthly |
| tech.org.ai | 2,800+ | O*NET | Public Domain | Monthly |
| tools.org.ai | 15,000+ | O*NET | Public Domain | Monthly |
| industries.org.ai | 1,170 | NAICS | Public Domain | Every 5 years |
| processes.org.ai | 1,000+ | APQC | CC BY-ND 4.0 | Annual |
| products.org.ai | 6,000+ | Internal | MIT | Monthly |
| services.org.ai | 50,000+ | Internal | MIT | Monthly |
| actions.org.ai | 14,116+ | Zapier | API ToS | Weekly |
| events.org.ai | Vocabulary | GS1 | GS1 Specs | Quarterly |
| education.org.ai | 6,500+ | College Scorecard | Public Domain | Annual |
| places.org.ai | 11M+ | GeoNames | CC BY 4.0 | Continuous |
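To sanity-check the totals quoted on this page, the table above can be transcribed into a small typed catalog and summarized. This is only a sketch: `DatasetEntry`, `DATASETS`, and `countBy` are hypothetical names, and the rows are copied by hand from the table rather than loaded from any package.

```typescript
// Hypothetical catalog shape; rows are transcribed from the table above.
interface DatasetEntry {
  name: string
  source: string
  license: string
  update: string
}

const ROWS: Array<[string, string, string, string]> = [
  ['occupations.org.ai', 'O*NET', 'Public Domain', 'Monthly'],
  ['skills.org.ai', 'O*NET', 'Public Domain', 'Monthly'],
  ['tasks.org.ai', 'O*NET', 'Public Domain', 'Monthly'],
  ['activities.org.ai', 'O*NET', 'Public Domain', 'Monthly'],
  ['knowledge.org.ai', 'O*NET', 'Public Domain', 'Monthly'],
  ['context.org.ai', 'O*NET', 'Public Domain', 'Monthly'],
  ['jobs.org.ai', 'O*NET', 'Public Domain', 'Monthly'],
  ['tech.org.ai', 'O*NET', 'Public Domain', 'Monthly'],
  ['tools.org.ai', 'O*NET', 'Public Domain', 'Monthly'],
  ['industries.org.ai', 'NAICS', 'Public Domain', 'Every 5 years'],
  ['processes.org.ai', 'APQC', 'CC BY-ND 4.0', 'Annual'],
  ['products.org.ai', 'Internal', 'MIT', 'Monthly'],
  ['services.org.ai', 'Internal', 'MIT', 'Monthly'],
  ['actions.org.ai', 'Zapier', 'API ToS', 'Weekly'],
  ['events.org.ai', 'GS1', 'GS1 Specs', 'Quarterly'],
  ['education.org.ai', 'College Scorecard', 'Public Domain', 'Annual'],
  ['places.org.ai', 'GeoNames', 'CC BY 4.0', 'Continuous'],
]

const DATASETS: DatasetEntry[] = ROWS.map(([name, source, license, update]) => ({
  name,
  source,
  license,
  update,
}))

// Counts entries per distinct value of the given column.
function countBy(entries: DatasetEntry[], key: keyof DatasetEntry): Map<string, number> {
  const counts = new Map<string, number>()
  for (const entry of entries) {
    counts.set(entry[key], (counts.get(entry[key]) ?? 0) + 1)
  }
  return counts
}

const bySource = countBy(DATASETS, 'source')
const byUpdate = countBy(DATASETS, 'update')
```

Summaries like `bySource` make it easy to confirm that the per-source dataset counts under Data Sources agree with the table.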
Contributing
To update an ontology:
- Fork dot-do/ai
- Update source data in ai/sources/datasets/
- Run mdxdb dataset publish [DatasetName] --dry-run
- Submit PR with changes
For questions or issues, visit GitHub Discussions or open an issue.
Last updated: 2025-11-02