Data Validation Services

Build robust data validation services that ensure data quality, accuracy, and compliance

A data validation service checks records against schemas, business rules, and regulatory requirements before they reach downstream systems. Catching bad records at this boundary maintains data integrity across systems and prevents costly errors.

Overview

Data validation is the foundation of data quality. Whether you're validating form submissions, enforcing database integrity, checking regulatory compliance, or maintaining data standards, a validation service automates the check that each record meets its required criteria and reports which rules failed.

Key Capabilities

  • Schema Validation: Verify data structure and types against defined schemas
  • Business Rule Validation: Enforce domain-specific business logic and constraints
  • Data Quality Checks: Assess completeness, accuracy, consistency, and timeliness
  • Compliance Validation: Ensure adherence to regulatory requirements (GDPR, HIPAA, etc.)
  • Cross-Field Validation: Validate relationships between fields
  • Custom Validation Logic: Execute complex validation rules

Common Use Cases

  1. Form Validation: Validate user input in real-time
  2. Data Import Validation: Check data quality before loading into systems
  3. API Request Validation: Ensure incoming requests meet requirements
  4. Data Quality Monitoring: Continuously assess data quality metrics
  5. Compliance Auditing: Verify regulatory compliance
  6. Data Migration Validation: Ensure data integrity during migrations

Building Your First Validation Service

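Let's start with a form validation service that checks each field against its schema, then applies cross-field rules and custom business rules. The helpers `executeValidator`, `evaluateBusinessRule`, and `compareValues` are called but not defined in this listing; supply your own implementations.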

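// `ai` is used by the AI-backed validators later in this guide; its export from 'sdk.do' is assumed here
import $, { ai, db, on, send } from 'sdk.do'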

const formValidationService = await $.Service.create({
  name: 'Smart Form Validator',
  description: 'Real-time form validation with custom rules and error messages',
  type: $.ServiceType.DataValidation,
  subtype: 'form-validation',

  input: {
    required: ['formData', 'schema'],
    optional: ['customRules', 'strictMode'],
  },

  output: {
    valid: 'boolean',
    errors: 'array',
    warnings: 'array',
    fieldStatus: 'object',
  },

  pricing: {
    model: 'per-validation',
    rate: 0.001, // $0.001 per validation
    minimumCharge: 0.01,
  },
})

on.ServiceRequest.created(async (request) => {
  if (request.serviceId !== formValidationService.id) return

  const { formData, schema, customRules = [], strictMode = false } = request.inputs

  try {
    const validationResult = {
      valid: true,
      errors: [],
      warnings: [],
      fieldStatus: {},
    }

    // Step 1: Schema validation
    for (const [fieldName, fieldSchema] of Object.entries(schema.fields)) {
      const value = formData[fieldName]
      const fieldErrors = []
      const fieldWarnings = []

      // Required field check
      if (fieldSchema.required && (value === undefined || value === null || value === '')) {
        fieldErrors.push({
          field: fieldName,
          rule: 'required',
          message: fieldSchema.messages?.required || `${fieldName} is required`,
        })
      }

      // Type validation
      if (value !== undefined && value !== null && value !== '') {
        const typeValid = validateType(value, fieldSchema.type)
        if (!typeValid) {
          fieldErrors.push({
            field: fieldName,
            rule: 'type',
            message: fieldSchema.messages?.type || `${fieldName} must be of type ${fieldSchema.type}`,
          })
        }
      }

      // Length validation
      if (value && fieldSchema.minLength && value.length < fieldSchema.minLength) {
        fieldErrors.push({
          field: fieldName,
          rule: 'minLength',
          message: fieldSchema.messages?.minLength || `${fieldName} must be at least ${fieldSchema.minLength} characters`,
        })
      }

      if (value && fieldSchema.maxLength && value.length > fieldSchema.maxLength) {
        fieldErrors.push({
          field: fieldName,
          rule: 'maxLength',
          message: fieldSchema.messages?.maxLength || `${fieldName} must not exceed ${fieldSchema.maxLength} characters`,
        })
      }

      // Range validation
      if (typeof value === 'number') {
        if (fieldSchema.min !== undefined && value < fieldSchema.min) {
          fieldErrors.push({
            field: fieldName,
            rule: 'min',
            message: fieldSchema.messages?.min || `${fieldName} must be at least ${fieldSchema.min}`,
          })
        }

        if (fieldSchema.max !== undefined && value > fieldSchema.max) {
          fieldErrors.push({
            field: fieldName,
            rule: 'max',
            message: fieldSchema.messages?.max || `${fieldName} must not exceed ${fieldSchema.max}`,
          })
        }
      }

      // Pattern validation
      if (value && fieldSchema.pattern) {
        const regex = new RegExp(fieldSchema.pattern)
        if (!regex.test(value)) {
          fieldErrors.push({
            field: fieldName,
            rule: 'pattern',
            message: fieldSchema.messages?.pattern || `${fieldName} format is invalid`,
          })
        }
      }

      // Enum validation
      if (value && fieldSchema.enum) {
        if (!fieldSchema.enum.includes(value)) {
          fieldErrors.push({
            field: fieldName,
            rule: 'enum',
            message: fieldSchema.messages?.enum || `${fieldName} must be one of: ${fieldSchema.enum.join(', ')}`,
          })
        }
      }

      // Custom validators
      if (value && fieldSchema.validators) {
        for (const validator of fieldSchema.validators) {
          const result = await executeValidator(value, validator)
          if (!result.valid) {
            fieldErrors.push({
              field: fieldName,
              rule: validator.name,
              message: result.message,
            })
          }
        }
      }

      // Update field status
      validationResult.fieldStatus[fieldName] = {
        valid: fieldErrors.length === 0,
        errors: fieldErrors,
        warnings: fieldWarnings,
      }

      if (fieldErrors.length > 0) {
        validationResult.valid = false
        validationResult.errors.push(...fieldErrors)
      }

      if (fieldWarnings.length > 0) {
        validationResult.warnings.push(...fieldWarnings)
      }
    }

    // Step 2: Cross-field validation
    if (schema.crossFieldRules) {
      for (const rule of schema.crossFieldRules) {
        const result = validateCrossField(formData, rule)
        if (!result.valid) {
          validationResult.valid = false
          validationResult.errors.push({
            rule: rule.name,
            message: result.message,
            fields: rule.fields,
          })
        }
      }
    }

    // Step 3: Custom business rules
    for (const rule of customRules) {
      const result = await evaluateBusinessRule(formData, rule)
      if (!result.valid) {
        if (rule.severity === 'error' || strictMode) {
          validationResult.valid = false
          validationResult.errors.push({
            rule: rule.name,
            message: result.message,
          })
        } else {
          validationResult.warnings.push({
            rule: rule.name,
            message: result.message,
          })
        }
      }
    }

    // Deliver results
    await send.ServiceResult.deliver({
      requestId: request.id,
      outputs: validationResult,
    })

    // Charge for validation
    await send.Payment.charge({
      customerId: request.customerId,
      amount: formValidationService.pricing.rate,
      description: 'Form validation',
    })
  } catch (error) {
    await send.ServiceRequest.fail({
      requestId: request.id,
      error: error.message,
      retryable: false,
    })
  }
})

function validateType(value: any, type: string): boolean {
  switch (type) {
    case 'string':
      return typeof value === 'string'
    case 'number':
      return typeof value === 'number' && !isNaN(value)
    case 'integer':
      return Number.isInteger(value)
    case 'boolean':
      return typeof value === 'boolean'
    case 'email':
      return /^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(value)
    case 'url':
      try {
        new URL(value)
        return true
      } catch {
        return false
      }
    case 'date':
      return value instanceof Date || !isNaN(Date.parse(value))
    case 'array':
      return Array.isArray(value)
    case 'object':
      return typeof value === 'object' && value !== null && !Array.isArray(value)
    default:
      return true
  }
}

function validateCrossField(data: any, rule: any): { valid: boolean; message?: string } {
  switch (rule.type) {
    case 'dependent':
      // If field A has value, field B is required
      if (data[rule.fields[0]] && !data[rule.fields[1]]) {
        return {
          valid: false,
          message: `${rule.fields[1]} is required when ${rule.fields[0]} is provided`,
        }
      }
      break

    case 'mutuallyExclusive': {
      // Only one of the fields can have a value
      const withValues = rule.fields.filter((f) => data[f])
      if (withValues.length > 1) {
        return {
          valid: false,
          message: `Only one of ${rule.fields.join(', ')} can be provided`,
        }
      }
      break
    }

    case 'comparison': {
      // Compare two fields
      const val1 = data[rule.fields[0]]
      const val2 = data[rule.fields[1]]
      if (!compareValues(val1, val2, rule.operator)) {
        return {
          valid: false,
          message: `${rule.fields[0]} must be ${rule.operator} ${rule.fields[1]}`,
        }
      }
      break
    }
  }

  return { valid: true }
}
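
The `comparison` rule above calls `compareValues`, which this guide does not define. Here is a minimal sketch, assuming rules use only the six operators shown and that operands are numbers, strings, or `Date` objects. The `ComparisonOperator` type and the `toComparable` helper are hypothetical names introduced for illustration.

```typescript
// Hypothetical operator set; the guide does not specify which operators rules may use.
type ComparisonOperator = '<' | '<=' | '>' | '>=' | '==' | '!='

// Map Dates to timestamps so they can be ordered like numbers.
function toComparable(value: unknown): number | string | undefined {
  if (value instanceof Date) return value.getTime()
  if (typeof value === 'number' || typeof value === 'string') return value
  return undefined
}

function compareValues(a: unknown, b: unknown, operator: ComparisonOperator): boolean {
  const left = toComparable(a)
  const right = toComparable(b)

  // Reduce the comparison to a sign: negative, zero, or positive.
  let order: number
  if (typeof left === 'number' && typeof right === 'number') {
    order = left - right
  } else if (typeof left === 'string' && typeof right === 'string') {
    order = left < right ? -1 : left > right ? 1 : 0
  } else {
    // Missing or mixed-type operands cannot be ordered, so the rule fails.
    return false
  }

  switch (operator) {
    case '<':
      return order < 0
    case '<=':
      return order <= 0
    case '>':
      return order > 0
    case '>=':
      return order >= 0
    case '==':
      return order === 0
    case '!=':
      return order !== 0
  }
}
```

With this helper, a rule such as `{ type: 'comparison', fields: ['startDate', 'endDate'], operator: '<' }` works for ISO 8601 date strings in the same format, because those sort lexicographically.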

Business Rule Validation Service

Enforce complex business logic:

const businessRuleValidationService = await $.Service.create({
  name: 'Business Rule Validator',
  description: 'Validate data against complex business rules and constraints',
  type: $.ServiceType.DataValidation,
  subtype: 'business-rules',

  input: {
    required: ['data', 'rules'],
    optional: ['context', 'mode'],
  },

  features: ['rule-engine', 'conditional-logic', 'external-lookups', 'ai-validation', 'custom-functions'],

  pricing: {
    model: 'per-validation',
    base: 0.01,
    complexity: {
      simple: 1.0,
      moderate: 2.0,
      complex: 4.0,
    },
  },
})

on.ServiceRequest.created(async (request) => {
  if (request.serviceId !== businessRuleValidationService.id) return

  const { data, rules, context = {}, mode = 'strict' } = request.inputs

  try {
    const results = {
      valid: true,
      violations: [],
      warnings: [],
      evaluated: 0,
      passed: 0,
      failed: 0,
    }

    for (const rule of rules) {
      results.evaluated++

      // Evaluate rule
      const evaluation = await evaluateRule(data, rule, context)

      if (!evaluation.valid) {
        if (rule.severity === 'critical' || mode === 'strict') {
          results.valid = false
          results.failed++
          results.violations.push({
            rule: rule.name,
            severity: rule.severity,
            message: evaluation.message,
            field: evaluation.field,
            actualValue: evaluation.actualValue,
            expectedValue: evaluation.expectedValue,
          })
        } else {
          results.warnings.push({
            rule: rule.name,
            severity: rule.severity,
            message: evaluation.message,
          })
        }
      } else {
        results.passed++
      }
    }

    // Calculate complexity
    const complexity = calculateRuleComplexity(rules)

    // Deliver results
    await send.ServiceResult.deliver({
      requestId: request.id,
      outputs: results,
    })

    // Charge based on complexity
    const multiplier = businessRuleValidationService.pricing.complexity[complexity]
    const cost = businessRuleValidationService.pricing.base * multiplier

    await send.Payment.charge({
      customerId: request.customerId,
      amount: cost,
      description: `Business rule validation (${complexity})`,
    })
  } catch (error) {
    await send.ServiceRequest.fail({
      requestId: request.id,
      error: error.message,
      retryable: true,
    })
  }
})

async function evaluateRule(data: any, rule: any, context: any): Promise<any> {
  switch (rule.type) {
    case 'condition':
      return evaluateCondition(data, rule.condition)

    case 'range':
      return evaluateRange(data, rule)

    case 'lookup':
      return await evaluateLookup(data, rule, context)

    case 'calculation':
      return evaluateCalculation(data, rule)

    case 'ai-validation':
      return await evaluateWithAI(data, rule)

    case 'custom':
      return await executeCustomRule(data, rule)

    default:
      return { valid: true }
  }
}

async function evaluateWithAI(data: any, rule: any): Promise<any> {
  // Use AI for complex validation logic
  const result = await ai.validate({
    model: 'gpt-5',
    data,
    rule: rule.description,
    examples: rule.examples,
  })

  return {
    valid: result.valid,
    message: result.reasoning,
    confidence: result.confidence,
  }
}
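
`evaluateRule` dispatches to helpers such as `evaluateRange` that are not shown. As one possible sketch, a range rule could return the same `field`, `actualValue`, and `expectedValue` properties that the handler copies into each violation. The rule shape (`field`, `min`, `max`) is an assumption, not something the guide specifies.

```typescript
// Hypothetical rule and result shapes, inferred from how the handler reads evaluations.
interface RangeRule {
  name: string
  field: string
  min?: number
  max?: number
}

interface RuleEvaluation {
  valid: boolean
  message?: string
  field?: string
  actualValue?: unknown
  expectedValue?: string
}

function evaluateRange(data: Record<string, unknown>, rule: RangeRule): RuleEvaluation {
  const actualValue = data[rule.field]
  const expectedValue = `between ${rule.min ?? '-Infinity'} and ${rule.max ?? 'Infinity'}`

  // A range check only makes sense for real numbers.
  if (typeof actualValue !== 'number' || Number.isNaN(actualValue)) {
    return { valid: false, field: rule.field, actualValue, expectedValue, message: `${rule.field} must be a number` }
  }

  const tooLow = rule.min !== undefined && actualValue < rule.min
  const tooHigh = rule.max !== undefined && actualValue > rule.max
  if (tooLow || tooHigh) {
    return { valid: false, field: rule.field, actualValue, expectedValue, message: `${rule.field} must be ${expectedValue}` }
  }

  return { valid: true }
}
```

Returning the actual and expected values alongside the message lets callers build precise error text without re-reading the input record.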

Data Quality Assessment Service

Evaluate data quality across multiple dimensions:

const dataQualityService = await $.Service.create({
  name: 'Data Quality Assessor',
  description: 'Comprehensive data quality assessment across multiple dimensions',
  type: $.ServiceType.DataValidation,
  subtype: 'quality-assessment',

  dimensions: ['completeness', 'accuracy', 'consistency', 'timeliness', 'validity', 'uniqueness'],

  input: {
    required: ['dataset'],
    optional: ['dimensions', 'thresholds', 'sampleSize'],
  },

  pricing: {
    model: 'per-assessment',
    base: 10.0,
    perThousandRecords: 1.0,
  },
})

on.ServiceRequest.created(async (request) => {
  if (request.serviceId !== dataQualityService.id) return

  const { dataset, dimensions = dataQualityService.dimensions, thresholds = {}, sampleSize } = request.inputs

  try {
    // Sample dataset if needed
    const sample = sampleSize && dataset.length > sampleSize ? sampleDataset(dataset, sampleSize) : dataset

    const assessment = {
      totalRecords: dataset.length,
      sampleSize: sample.length,
      dimensions: {},
      overallScore: 0,
      issues: [],
      recommendations: [],
    }

    // Assess each dimension
    for (const dimension of dimensions) {
      const result = await assessDimension(sample, dimension, thresholds[dimension])
      assessment.dimensions[dimension] = result

      // Use ?? so an explicit minimum of 0 is respected; the default threshold is 80
      if (result.score < (thresholds[dimension]?.minimum ?? 80)) {
        assessment.issues.push({
          dimension,
          score: result.score,
          details: result.details,
        })
      }
    }

    // Calculate overall score (0 when no dimensions were assessed, which avoids NaN)
    const scores = Object.values(assessment.dimensions).map((d: any) => d.score)
    assessment.overallScore = scores.length > 0 ? scores.reduce((sum, score) => sum + score, 0) / scores.length : 0

    // Generate recommendations
    assessment.recommendations = generateRecommendations(assessment)

    // Deliver results
    await send.ServiceResult.deliver({
      requestId: request.id,
      outputs: assessment,
    })

    // Calculate cost
    const thousands = Math.ceil(dataset.length / 1000)
    const cost = dataQualityService.pricing.base + thousands * dataQualityService.pricing.perThousandRecords

    await send.Payment.charge({
      customerId: request.customerId,
      amount: cost,
      description: `Data quality assessment (${dataset.length} records)`,
    })
  } catch (error) {
    await send.ServiceRequest.fail({
      requestId: request.id,
      error: error.message,
      retryable: true,
    })
  }
})

async function assessDimension(sample: any[], dimension: string, threshold?: any): Promise<any> {
  switch (dimension) {
    case 'completeness':
      return assessCompleteness(sample, threshold)

    case 'accuracy':
      return await assessAccuracy(sample, threshold)

    case 'consistency':
      return assessConsistency(sample, threshold)

    case 'timeliness':
      return assessTimeliness(sample, threshold)

    case 'validity':
      return assessValidity(sample, threshold)

    case 'uniqueness':
      return assessUniqueness(sample, threshold)

    default:
      return { score: 100, details: {} }
  }
}

function assessCompleteness(sample: any[], threshold?: any): any {
  const fields = Object.keys(sample[0] || {})
  const fieldCompleteness = {}

  let totalFields = 0
  let completeFields = 0

  for (const field of fields) {
    const nonNull = sample.filter((record) => record[field] !== null && record[field] !== undefined && record[field] !== '').length

    const completeness = (nonNull / sample.length) * 100
    fieldCompleteness[field] = {
      completeness,
      nullCount: sample.length - nonNull,
    }

    totalFields++
    if (completeness >= (threshold?.fieldThreshold ?? 95)) {
      completeFields++
    }
  }

  // An empty sample has no fields to check; report 0 rather than NaN
  const overallCompleteness = totalFields > 0 ? (completeFields / totalFields) * 100 : 0

  return {
    score: overallCompleteness,
    details: {
      fields: fieldCompleteness,
      completeFields,
      totalFields,
    },
  }
}

async function assessAccuracy(sample: any[], threshold?: any): Promise<any> {
  // Use AI to assess accuracy; only the first 100 records are checked to bound cost and latency
  const accuracyChecks = []

  for (const record of sample.slice(0, 100)) {
    const check = await ai.validate({
      model: 'gpt-5',
      record,
      validationRules: threshold?.rules || ['check for obvious errors', 'verify data makes sense', 'validate relationships'],
    })

    accuracyChecks.push({
      valid: check.valid,
      issues: check.issues,
    })
  }

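  const accurate = accuracyChecks.filter((c) => c.valid).length
  // Guard against an empty sample, which would otherwise produce NaN
  const score = accuracyChecks.length > 0 ? (accurate / accuracyChecks.length) * 100 : 0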

  return {
    score,
    details: {
      checked: accuracyChecks.length,
      accurate,
      issues: accuracyChecks.filter((c) => !c.valid).map((c) => c.issues),
    },
  }
}

function assessConsistency(sample: any[], threshold?: any): any {
  const fields = Object.keys(sample[0] || {})
  const inconsistencies = []

  for (const field of fields) {
    const values = sample.map((r) => r[field]).filter((v) => v !== null && v !== undefined)
    const types = [...new Set(values.map((v) => typeof v))]

    // Check type consistency
    if (types.length > 1) {
      inconsistencies.push({
        field,
        issue: 'mixed-types',
        types,
      })
    }

    // Check format consistency for strings
    if (types.includes('string')) {
      const formats = detectFormats(values.filter((v) => typeof v === 'string'))
      if (formats.length > 1) {
        inconsistencies.push({
          field,
          issue: 'mixed-formats',
          formats,
        })
      }
    }
  }

  const score = Math.max(0, 100 - inconsistencies.length * 10)

  return {
    score,
    details: {
      inconsistencies,
      fieldsChecked: fields.length,
    },
  }
}
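
`assessDimension` also routes to `assessUniqueness`, which is not shown. One way to score uniqueness is the share of records whose key has not been seen before. The `keyFields` threshold option below is a hypothetical parameter; when omitted, every field of the first record is used as the key.

```typescript
interface UniquenessResult {
  score: number
  details: { checked: number; duplicates: number; keyFields: string[] }
}

function assessUniqueness(
  sample: Record<string, unknown>[],
  threshold?: { keyFields?: string[] } // hypothetical option, not defined by the guide
): UniquenessResult {
  const keyFields = threshold?.keyFields ?? Object.keys(sample[0] ?? {})
  if (sample.length === 0) {
    return { score: 100, details: { checked: 0, duplicates: 0, keyFields } }
  }

  const seen = new Set<string>()
  let duplicates = 0
  for (const record of sample) {
    // Serialize the key fields in a fixed order so equal records produce equal keys.
    const key = JSON.stringify(keyFields.map((field) => record[field] ?? null))
    if (seen.has(key)) {
      duplicates++
    } else {
      seen.add(key)
    }
  }

  const score = ((sample.length - duplicates) / sample.length) * 100
  return { score, details: { checked: sample.length, duplicates, keyFields } }
}
```

The first occurrence of a key counts as unique and each later occurrence counts as one duplicate, so four records with one repeated email score 75.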

Schema Validation Service

Validate data against JSON Schema and other schema formats:

import $, { on, send } from 'sdk.do'
import Ajv from 'ajv'
import addFormats from 'ajv-formats'

const schemaValidationService = await $.Service.create({
  name: 'Schema Validator',
  description: 'Validate data against JSON Schema, OpenAPI, and custom schemas',
  type: $.ServiceType.DataValidation,
  subtype: 'schema-validation',

  supportedFormats: ['json-schema', 'openapi', 'graphql', 'protobuf', 'avro'],

  pricing: {
    model: 'per-validation',
    rate: 0.005,
    volume: [
      { min: 0, max: 1000, rate: 0.005 },
      { min: 1001, max: 10000, rate: 0.003 },
      { min: 10001, max: Infinity, rate: 0.001 },
    ],
  },
})

on.ServiceRequest.created(async (request) => {
  if (request.serviceId !== schemaValidationService.id) return

  const { data, schema, schemaFormat = 'json-schema', options = {} } = request.inputs

  try {
    // Initialize validator
    const ajv = new Ajv({ allErrors: true, verbose: true })
    addFormats(ajv)

    // Compile schema
    let validate
    if (schemaFormat === 'json-schema') {
      validate = ajv.compile(schema)
    } else {
      // Convert other schema formats to JSON Schema
      const jsonSchema = await convertToJSONSchema(schema, schemaFormat)
      validate = ajv.compile(jsonSchema)
    }

    // Validate data
    const dataArray = Array.isArray(data) ? data : [data]
    const results = []

    for (const item of dataArray) {
      const valid = validate(item)

      results.push({
        valid,
        errors: valid
          ? []
          : (validate.errors ?? []).map((error) => ({
              path: error.instancePath,
              message: error.message,
              keyword: error.keyword,
              params: error.params,
            })),
        data: item,
      })
    }

    const allValid = results.every((r) => r.valid)

    // Deliver results
    await send.ServiceResult.deliver({
      requestId: request.id,
      outputs: {
        valid: allValid,
        results,
        summary: {
          total: results.length,
          valid: results.filter((r) => r.valid).length,
          invalid: results.filter((r) => !r.valid).length,
        },
      },
    })

    // Calculate cost
    const rate = getVolumeRate(dataArray.length, schemaValidationService.pricing.volume)
    const cost = dataArray.length * rate

    await send.Payment.charge({
      customerId: request.customerId,
      amount: cost,
      description: `Schema validation (${dataArray.length} records)`,
    })
  } catch (error) {
    await send.ServiceRequest.fail({
      requestId: request.id,
      error: error.message,
      retryable: false,
    })
  }
})

async function convertToJSONSchema(schema: any, format: string): Promise<any> {
  switch (format) {
    case 'openapi':
      return convertOpenAPIToJSONSchema(schema)

    case 'graphql':
      return convertGraphQLToJSONSchema(schema)

    case 'protobuf':
      return convertProtobufToJSONSchema(schema)

    case 'avro':
      return convertAvroToJSONSchema(schema)

    default:
      throw new Error(`Unsupported schema format: ${format}`)
  }
}
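
The cost calculation above relies on `getVolumeRate`, which the guide leaves undefined. A minimal sketch, assuming the tiers are flat (the whole request is billed at the rate of the tier its record count falls into, as `dataArray.length * rate` suggests) rather than graduated:

```typescript
interface VolumeTier {
  min: number
  max: number
  rate: number
}

// Return the rate of the tier whose inclusive [min, max] range contains the count.
function getVolumeRate(count: number, tiers: VolumeTier[]): number {
  const tier = tiers.find((t) => count >= t.min && count <= t.max)
  if (!tier) {
    throw new Error(`No pricing tier covers a volume of ${count}`)
  }
  return tier.rate
}

// Tiers copied from the schema validation service definition.
const schemaTiers: VolumeTier[] = [
  { min: 0, max: 1000, rate: 0.005 },
  { min: 1001, max: 10000, rate: 0.003 },
  { min: 10001, max: Infinity, rate: 0.001 },
]

const rateFor5000 = getVolumeRate(5000, schemaTiers) // second tier
```

Flat tiers are simple but create a price cliff at each boundary: 1,001 records at 0.003 cost less than 1,000 records at 0.005. A graduated scheme would avoid that at the cost of a slightly longer calculation.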

Compliance Validation Service

Ensure data meets regulatory requirements:

const complianceValidationService = await $.Service.create({
  name: 'Compliance Validator',
  description: 'Validate data compliance with GDPR, HIPAA, PCI-DSS, and other regulations',
  type: $.ServiceType.DataValidation,
  subtype: 'compliance',

  regulations: ['gdpr', 'hipaa', 'pci-dss', 'sox', 'ccpa', 'iso-27001'],

  input: {
    required: ['data', 'regulation'],
    optional: ['jurisdiction', 'dataType', 'customRules'],
  },

  pricing: {
    model: 'per-validation',
    base: 0.1,
    regulations: {
      gdpr: 1.0,
      hipaa: 1.5,
      'pci-dss': 2.0,
    },
  },
})

on.ServiceRequest.created(async (request) => {
  if (request.serviceId !== complianceValidationService.id) return

  const { data, regulation, jurisdiction = 'US', dataType, customRules = [] } = request.inputs

  try {
    const compliance = {
      compliant: true,
      violations: [],
      warnings: [],
      requirements: [],
      recommendations: [],
    }

    // Get regulation requirements
    const requirements = await getComplianceRequirements(regulation, jurisdiction, dataType)
    compliance.requirements = requirements

    // Check each requirement
    for (const requirement of requirements) {
      const check = await checkCompliance(data, requirement)

      if (!check.compliant) {
        compliance.compliant = false
        compliance.violations.push({
          requirement: requirement.name,
          severity: requirement.severity,
          description: requirement.description,
          issue: check.issue,
          remediation: check.remediation,
        })
      } else if (check.warning) {
        compliance.warnings.push({
          requirement: requirement.name,
          warning: check.warning,
        })
      }
    }

    // Generate recommendations
    if (!compliance.compliant) {
      compliance.recommendations = await generateComplianceRecommendations(compliance.violations, regulation)
    }

    // Deliver results
    await send.ServiceResult.deliver({
      requestId: request.id,
      outputs: compliance,
    })

    // Charge based on regulation complexity
    const multiplier = complianceValidationService.pricing.regulations[regulation] || 1.0
    const cost = complianceValidationService.pricing.base * multiplier

    await send.Payment.charge({
      customerId: request.customerId,
      amount: cost,
      description: `Compliance validation (${regulation})`,
    })
  } catch (error) {
    await send.ServiceRequest.fail({
      requestId: request.id,
      error: error.message,
      retryable: true,
    })
  }
})

async function getComplianceRequirements(regulation: string, jurisdiction: string, dataType?: string): Promise<any[]> {
  const requirements = []

  switch (regulation) {
    case 'gdpr':
      requirements.push(
        { name: 'data-minimization', severity: 'high', description: 'Collect only necessary data' },
        { name: 'consent', severity: 'high', description: 'Explicit consent required for data processing' },
        { name: 'right-to-erasure', severity: 'high', description: 'Support data deletion requests' },
        { name: 'data-portability', severity: 'medium', description: 'Enable data export in machine-readable format' },
        { name: 'encryption', severity: 'high', description: 'Encrypt personal data' }
      )
      break

    case 'hipaa':
      requirements.push(
        { name: 'phi-protection', severity: 'critical', description: 'Protected Health Information must be secured' },
        { name: 'access-controls', severity: 'high', description: 'Implement role-based access controls' },
        { name: 'audit-logs', severity: 'high', description: 'Maintain audit logs of PHI access' },
        { name: 'encryption', severity: 'critical', description: 'Encrypt PHI at rest and in transit' }
      )
      break

    case 'pci-dss':
      requirements.push(
        { name: 'cardholder-data-protection', severity: 'critical', description: 'Protect stored cardholder data' },
        { name: 'encryption', severity: 'critical', description: 'Encrypt transmission of cardholder data' },
        { name: 'access-restriction', severity: 'high', description: 'Restrict access to cardholder data' },
        { name: 'monitoring', severity: 'high', description: 'Monitor and test networks regularly' }
      )
      break
  }

  return requirements
}
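
`checkCompliance` is not shown, and real compliance checks depend heavily on the regulation. As an illustration only, the `cardholder-data-protection` requirement could include a heuristic that flags string fields holding what looks like an unmasked card number, using the standard Luhn checksum. The function names and result messages are hypothetical, and passing this check is not evidence of PCI-DSS compliance.

```typescript
// Luhn checksum: double every second digit from the right, subtract 9 from
// results above 9, and require the total to be divisible by 10.
function passesLuhn(digits: string): boolean {
  let sum = 0
  let double = false
  for (let i = digits.length - 1; i >= 0; i--) {
    let digit = digits.charCodeAt(i) - 48
    if (double) {
      digit *= 2
      if (digit > 9) digit -= 9
    }
    sum += digit
    double = !double
  }
  return sum % 10 === 0
}

interface ComplianceCheck {
  compliant: boolean
  issue?: string
  remediation?: string
}

// Hypothetical check: scan top-level string fields for 13 to 19 digit values that pass Luhn.
function checkCardholderDataProtection(data: Record<string, unknown>): ComplianceCheck {
  for (const [field, value] of Object.entries(data)) {
    if (typeof value !== 'string') continue
    const digits = value.replace(/[\s-]/g, '')
    if (/^\d{13,19}$/.test(digits) && passesLuhn(digits)) {
      return {
        compliant: false,
        issue: `Field "${field}" appears to contain an unmasked card number`,
        remediation: 'Mask, truncate, or tokenize the value before storing it',
      }
    }
  }
  return { compliant: true }
}
```

The sketch inspects only top-level fields; nested objects and free-text fields with embedded numbers would need a recursive scan.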

Real-Time Validation Service

Validate data streams in real-time:

const realtimeValidationService = await $.Service.create({
  name: 'Real-Time Validator',
  description: 'Validate data streams with sub-10ms latency',
  type: $.ServiceType.DataValidation,
  subtype: 'real-time',

  latency: '<10ms',
  throughput: '100000 validations/second',

  pricing: {
    model: 'subscription',
    tiers: [
      { name: 'starter', price: 100, validations: 1000000 },
      { name: 'professional', price: 500, validations: 10000000 },
      { name: 'enterprise', price: 2000, validations: 100000000 },
    ],
  },
})

on.Stream.data(async (stream) => {
  const validationRules = await db.ValidationRule.list({
    where: { streamId: stream.id, enabled: true },
  })

  // Compile rules for performance
  const compiledRules = compileValidationRules(validationRules)

  for await (const record of stream) {
    const startTime = Date.now()

    try {
      const validation = {
        valid: true,
        errors: [],
      }

      // Apply validation rules
      for (const rule of compiledRules) {
        const result = rule.validate(record)
        if (!result.valid) {
          validation.valid = false
          validation.errors.push(result.error)
        }
      }

      const latency = Date.now() - startTime

      // Emit validation result
      await send.Stream.emit({
        streamId: stream.id,
        data: {
          ...record,
          _validation: validation,
          _latency: latency,
        },
      })

      // Route based on validation
      if (!validation.valid) {
        await send.Stream.routeToDeadLetter({
          streamId: stream.id,
          record,
          validation,
        })
      }
    } catch (error) {
      await send.Stream.error({
        streamId: stream.id,
        record,
        error: error.message,
      })
    }
  }
})

function compileValidationRules(rules: any[]): any[] {
  return rules.map((rule) => {
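    // Compile rule for fast execution.
    // WARNING: new Function runs rule.code with full privileges, so only compile
    // rules written by trusted authors. The rule body signals failure by throwing.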
    return {
      name: rule.name,
      validate: new Function(
        'record',
        `
        try {
          ${rule.code}
          return { valid: true }
        } catch (error) {
          return { valid: false, error: error.message }
        }
      `
      ),
    }
  })
}
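
If rules come from untrusted users, compiling them with `new Function` is risky. An alternative is to describe rules as data and compile each one into a closure. The sketch below assumes three hypothetical rule kinds (`required`, `max`, `pattern`) and returns objects with the same `name` and `validate` shape the stream handler expects.

```typescript
// Hypothetical declarative rule format; none of these shapes come from the SDK.
type DeclarativeRule =
  | { name: string; kind: 'required'; field: string }
  | { name: string; kind: 'max'; field: string; limit: number }
  | { name: string; kind: 'pattern'; field: string; pattern: string }

type RecordData = Record<string, unknown>
interface RuleResult { valid: boolean; error?: string }
interface CompiledRule { name: string; validate: (record: RecordData) => RuleResult }

const pass: RuleResult = { valid: true }

function compileDeclarativeRule(rule: DeclarativeRule): CompiledRule {
  const { name, field } = rule
  const fail = (reason: string): RuleResult => ({ valid: false, error: `${name}: ${field} ${reason}` })

  switch (rule.kind) {
    case 'required':
      return {
        name,
        validate: (record) => (record[field] != null && record[field] !== '' ? pass : fail('is required')),
      }
    case 'max': {
      const { limit } = rule
      return {
        name,
        validate: (record) => {
          const value = record[field]
          return typeof value === 'number' && value <= limit ? pass : fail(`must not exceed ${limit}`)
        },
      }
    }
    case 'pattern': {
      const regex = new RegExp(rule.pattern) // compiled once, reused for every record
      return {
        name,
        validate: (record) => {
          const value = record[field]
          return typeof value === 'string' && regex.test(value) ? pass : fail('has an invalid format')
        },
      }
    }
  }
}
```

Work such as building the regular expression happens once at compile time, so the per-record path stays cheap without executing arbitrary code.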

Batch Validation Service

Validate large datasets efficiently:

const batchValidationService = await $.Service.create({
  name: 'Batch Validator',
  description: 'Validate large datasets with detailed error reporting',
  type: $.ServiceType.DataValidation,
  subtype: 'batch',

  maxBatchSize: 1000000,

  pricing: {
    model: 'per-thousand',
    rate: 1.0,
    volume: [
      { min: 0, max: 10000, rate: 1.0 },
      { min: 10001, max: 100000, rate: 0.75 },
      { min: 100001, max: Infinity, rate: 0.5 },
    ],
  },
})

on.ServiceRequest.created(async (request) => {
  if (request.serviceId !== batchValidationService.id) return

  const { dataUrl, validationRules, options = {} } = request.inputs

  try {
    // Download dataset
    const dataset = await downloadDataset(dataUrl)

    // Process in batches
    const batchSize = 10000
    const results = {
      total: dataset.length,
      valid: 0,
      invalid: 0,
      errors: [],
      warnings: [],
    }

    for (let i = 0; i < dataset.length; i += batchSize) {
      const batch = dataset.slice(i, i + batchSize)

      // Validate batch in parallel
      const batchResults = await Promise.all(
        batch.map(async (record, index) => {
          const validation = await validateRecord(record, validationRules)
          return {
            index: i + index,
            valid: validation.valid,
            errors: validation.errors,
            warnings: validation.warnings,
          }
        })
      )

      // Aggregate results
      batchResults.forEach((result) => {
        if (result.valid) {
          results.valid++
        } else {
          results.invalid++
          if (options.includeErrors) {
            results.errors.push(result)
          }
        }

        if (options.includeWarnings && (result.warnings?.length ?? 0) > 0) {
          results.warnings.push(result)
        }
      })

      // Update progress
      await send.ServiceProgress.updated({
        requestId: request.id,
        progress: Math.min(i + batchSize, dataset.length) / dataset.length,
        message: `Validated ${Math.min(i + batchSize, dataset.length)} of ${dataset.length} records`,
      })
    }

    // Generate report
    const reportUrl = await generateValidationReport(results, options)

    // Deliver results
    await send.ServiceResult.deliver({
      requestId: request.id,
      outputs: {
        summary: results,
        reportUrl,
        validationRate: results.total > 0 ? (results.valid / results.total) * 100 : 0,
      },
    })

    // Calculate cost
    const thousands = Math.ceil(dataset.length / 1000)
    const rate = getVolumeRate(dataset.length, batchValidationService.pricing.volume)
    const cost = thousands * rate

    await send.Payment.charge({
      customerId: request.customerId,
      amount: cost,
      description: `Batch validation (${dataset.length} records)`,
    })
  } catch (error) {
    await send.ServiceRequest.fail({
      requestId: request.id,
      error: error.message,
      retryable: true,
    })
  }
})
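
The slicing and progress arithmetic in the handler above can be factored into a small generator, which keeps the loop body focused on validation. This is a sketch; `toBatches` and the `Batch` shape are hypothetical names, and progress is derived from the number of records actually processed.

```typescript
interface Batch<T> {
  offset: number // index of the first record in this batch
  items: T[]
  progress: number // fraction of the dataset processed once this batch is done
}

function* toBatches<T>(dataset: T[], batchSize: number): Generator<Batch<T>> {
  if (batchSize <= 0) {
    throw new Error('batchSize must be positive')
  }
  for (let offset = 0; offset < dataset.length; offset += batchSize) {
    const items = dataset.slice(offset, offset + batchSize)
    yield { offset, items, progress: (offset + items.length) / dataset.length }
  }
}

// Example: five records in batches of two yield three batches, the last one partial.
const exampleBatches = Array.from(toBatches(['a', 'b', 'c', 'd', 'e'], 2))
```

Because `offset` travels with each batch, a record's position in the original dataset is still `offset + index`, matching the `index: i + index` bookkeeping in the handler.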

Pricing Models for Validation Services

Per-Validation Pricing

Best for: Form validation, API validation

pricing: {
  model: 'per-validation',
  rate: 0.001,
  minimumCharge: 0.01,
}
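
To make the arithmetic concrete, here is a sketch of how a charge could be derived from this configuration. It assumes `minimumCharge` applies to the total for a billing period or request batch rather than to each individual validation; the guide does not say which.

```typescript
interface PerValidationPricing {
  rate: number
  minimumCharge: number
}

// Bill count * rate, but never less than the minimum charge.
function perValidationCharge(count: number, pricing: PerValidationPricing): number {
  return Math.max(count * pricing.rate, pricing.minimumCharge)
}

const formPricing: PerValidationPricing = { rate: 0.001, minimumCharge: 0.01 }

const singleCharge = perValidationCharge(1, formPricing) // the minimum applies
const bulkCharge = perValidationCharge(5000, formPricing) // the per-validation rate applies
```

With these numbers the minimum only matters below 10 validations, since 10 validations at 0.001 each already reach 0.01.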

Subscription Pricing

Best for: Real-time validation, continuous monitoring

pricing: {
  model: 'subscription',
  tiers: [
    { name: 'starter', price: 50, validations: 100000 },
    { name: 'professional', price: 200, validations: 1000000 },
    { name: 'enterprise', price: 1000, validations: 10000000 },
  ],
}
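
Given tiers like these, a customer's plan can be chosen as the cheapest tier whose quota covers the expected volume. A minimal sketch; `selectTier` is a hypothetical helper, and returning `undefined` when no tier is large enough is a design assumption.

```typescript
interface SubscriptionTier {
  name: string
  price: number
  validations: number
}

// Pick the lowest-priced tier whose included validations cover the expected volume.
function selectTier(monthlyValidations: number, tiers: SubscriptionTier[]): SubscriptionTier | undefined {
  return [...tiers]
    .sort((a, b) => a.price - b.price)
    .find((tier) => tier.validations >= monthlyValidations)
}

// Tiers copied from the subscription pricing example.
const subscriptionTiers: SubscriptionTier[] = [
  { name: 'starter', price: 50, validations: 100000 },
  { name: 'professional', price: 200, validations: 1000000 },
  { name: 'enterprise', price: 1000, validations: 10000000 },
]

const chosen = selectTier(250000, subscriptionTiers)
```

At full utilization these tiers work out to 0.0005, 0.0002, and 0.0001 per validation respectively, which is the volume discount the tier structure encodes.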

Complexity-Based Pricing

Best for: Business rule validation, compliance checks

pricing: {
  model: 'complexity-based',
  base: 0.01,
  multipliers: {
    simple: 1.0,
    moderate: 2.0,
    complex: 4.0,
  },
}
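
Complexity-based pricing needs a way to classify a request. The business rule service earlier calls `calculateRuleComplexity` without defining it, so the heuristic below is purely an assumption: AI-backed rules make a request complex, rules that reach outside the record make it moderate, and everything else is simple.

```typescript
type Complexity = 'simple' | 'moderate' | 'complex'

// Values copied from the complexity-based pricing example.
const complexityPricing = {
  base: 0.01,
  multipliers: { simple: 1.0, moderate: 2.0, complex: 4.0 },
}

// Hypothetical heuristic based only on the rule types present in the request.
function calculateRuleComplexity(rules: { type: string }[]): Complexity {
  if (rules.some((rule) => rule.type === 'ai-validation')) return 'complex'
  if (rules.some((rule) => rule.type === 'lookup' || rule.type === 'custom')) return 'moderate'
  return 'simple'
}

// Cost is the base price scaled by the multiplier for the request's complexity.
function complexityCost(rules: { type: string }[]): number {
  return complexityPricing.base * complexityPricing.multipliers[calculateRuleComplexity(rules)]
}
```

A request mixing a `range` rule with an `ai-validation` rule is classified as complex and billed at 0.01 × 4.0 = 0.04.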

Best Practices

1. Provide Clear Error Messages

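function formatValidationError(error: any): string {
  // Include actual/expected values only when the validator supplied them
  const parts = [['current value', error.actualValue], ['expected', error.expectedValue]]
  const details = parts.filter(([, v]) => v !== undefined).map(([k, v]) => `${k}: ${v}`).join(', ')
  return `${error.field}: ${error.message}${details ? ` (${details})` : ''}`
}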

2. Cache Validation Rules

const ruleCache = new Map()

async function getCachedRule(ruleId: string): Promise<any> {
  if (ruleCache.has(ruleId)) {
    return ruleCache.get(ruleId)
  }

  const rule = await db.ValidationRule.get({ where: { id: ruleId } })
  ruleCache.set(ruleId, rule)
  return rule
}
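
The cache above never expires entries, so an edited rule would keep being served in its old form until the process restarts. One option is a small time-to-live cache. `TtlCache` below is a hypothetical, dependency-free sketch; the clock is injectable so expiry can be tested without waiting.

```typescript
class TtlCache<V> {
  private readonly entries = new Map<string, { value: V; expiresAt: number }>()
  private readonly ttlMs: number
  private readonly now: () => number

  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs
    this.now = now
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key)
    if (!entry) return undefined
    if (entry.expiresAt <= this.now()) {
      // Expired: drop the entry so the caller reloads a fresh copy.
      this.entries.delete(key)
      return undefined
    }
    return entry.value
  }

  set(key: string, value: V): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs })
  }
}
```

In `getCachedRule`, this class could stand in for the plain `Map`: call `get` first, and on `undefined` load the rule from the database and `set` it. The TTL then bounds how stale a cached rule can become.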

3. Support Incremental Validation

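async function validateIncremental(previousValidation: any, changes: any): Promise<any> {
  // Only re-validate changed fields; copy fieldStatus so the previous result is not mutated
  const fieldStatus = { ...previousValidation.fieldStatus }

  for (const field of Object.keys(changes)) {
    fieldStatus[field] = await validateField(changes[field], field)
  }

  // Recompute the aggregates from the merged per-field results.
  // Cross-field and business rules are not re-run here.
  const statuses: any[] = Object.values(fieldStatus)
  return {
    ...previousValidation,
    fieldStatus,
    valid: statuses.every((status) => status.valid),
    errors: statuses.flatMap((status) => status.errors ?? []),
  }
}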

Next Steps