Data Validation Services
Build robust data validation services that ensure data quality, accuracy, and compliance
These services check records against schemas, business rules, and regulatory requirements. They help maintain data integrity across systems and catch costly errors before they propagate.
Overview
Data validation is the foundation of data quality. Whether you're validating form submissions, ensuring database integrity, checking compliance with regulations, or maintaining data standards, validation services automate the process of ensuring your data meets required criteria.
Key Capabilities
- Schema Validation: Verify data structure and types against defined schemas
- Business Rule Validation: Enforce domain-specific business logic and constraints
- Data Quality Checks: Assess completeness, accuracy, consistency, and timeliness
- Compliance Validation: Ensure adherence to regulatory requirements (GDPR, HIPAA, etc.)
- Cross-Field Validation: Validate relationships between fields
- Custom Validation Logic: Execute complex validation rules
Common Use Cases
- Form Validation: Validate user input in real-time
- Data Import Validation: Check data quality before loading into systems
- API Request Validation: Ensure incoming requests meet requirements
- Data Quality Monitoring: Continuously assess data quality metrics
- Compliance Auditing: Verify regulatory compliance
- Data Migration Validation: Ensure data integrity during migrations
Building Your First Validation Service
Let's start with a comprehensive form validation service:
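// `ai` is used by the AI-backed checks later on this page; it is assumed here to be exported by sdk.do
import $, { ai, db, on, send } from 'sdk.do'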
const formValidationService = await $.Service.create({
name: 'Smart Form Validator',
description: 'Real-time form validation with custom rules and error messages',
type: $.ServiceType.DataValidation,
subtype: 'form-validation',
input: {
required: ['formData', 'schema'],
optional: ['customRules', 'strictMode'],
},
output: {
valid: 'boolean',
errors: 'array',
warnings: 'array',
fieldStatus: 'object',
},
pricing: {
model: 'per-validation',
rate: 0.001, // $0.001 per validation
minimumCharge: 0.01,
},
})
on.ServiceRequest.created(async (request) => {
if (request.serviceId !== formValidationService.id) return
const { formData, schema, customRules = [], strictMode = false } = request.inputs
try {
const validationResult = {
valid: true,
errors: [],
warnings: [],
fieldStatus: {},
}
// Step 1: Schema validation
for (const [fieldName, fieldSchema] of Object.entries(schema.fields)) {
const value = formData[fieldName]
const fieldErrors = []
const fieldWarnings = []
// Required field check
if (fieldSchema.required && (value === undefined || value === null || value === '')) {
fieldErrors.push({
field: fieldName,
rule: 'required',
message: fieldSchema.messages?.required || `${fieldName} is required`,
})
}
// Type validation
if (value !== undefined && value !== null && value !== '') {
const typeValid = validateType(value, fieldSchema.type)
if (!typeValid) {
fieldErrors.push({
field: fieldName,
rule: 'type',
message: fieldSchema.messages?.type || `${fieldName} must be of type ${fieldSchema.type}`,
})
}
}
// Length validation
if (value && fieldSchema.minLength && value.length < fieldSchema.minLength) {
fieldErrors.push({
field: fieldName,
rule: 'minLength',
message: fieldSchema.messages?.minLength || `${fieldName} must be at least ${fieldSchema.minLength} characters`,
})
}
if (value && fieldSchema.maxLength && value.length > fieldSchema.maxLength) {
fieldErrors.push({
field: fieldName,
rule: 'maxLength',
message: fieldSchema.messages?.maxLength || `${fieldName} must not exceed ${fieldSchema.maxLength} characters`,
})
}
// Range validation
if (typeof value === 'number') {
if (fieldSchema.min !== undefined && value < fieldSchema.min) {
fieldErrors.push({
field: fieldName,
rule: 'min',
message: fieldSchema.messages?.min || `${fieldName} must be at least ${fieldSchema.min}`,
})
}
if (fieldSchema.max !== undefined && value > fieldSchema.max) {
fieldErrors.push({
field: fieldName,
rule: 'max',
message: fieldSchema.messages?.max || `${fieldName} must not exceed ${fieldSchema.max}`,
})
}
}
// Pattern validation
if (value && fieldSchema.pattern) {
const regex = new RegExp(fieldSchema.pattern)
if (!regex.test(value)) {
fieldErrors.push({
field: fieldName,
rule: 'pattern',
message: fieldSchema.messages?.pattern || `${fieldName} format is invalid`,
})
}
}
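// Enum validation (explicit emptiness check so that 0 and false are still validated)
if (value !== undefined && value !== null && value !== '' && fieldSchema.enum) {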
if (!fieldSchema.enum.includes(value)) {
fieldErrors.push({
field: fieldName,
rule: 'enum',
message: fieldSchema.messages?.enum || `${fieldName} must be one of: ${fieldSchema.enum.join(', ')}`,
})
}
}
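// Custom validators (explicit emptiness check so that 0 and false are still validated)
if (value !== undefined && value !== null && value !== '' && fieldSchema.validators) {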
for (const validator of fieldSchema.validators) {
const result = await executeValidator(value, validator)
if (!result.valid) {
fieldErrors.push({
field: fieldName,
rule: validator.name,
message: result.message,
})
}
}
}
// Update field status
validationResult.fieldStatus[fieldName] = {
valid: fieldErrors.length === 0,
errors: fieldErrors,
warnings: fieldWarnings,
}
if (fieldErrors.length > 0) {
validationResult.valid = false
validationResult.errors.push(...fieldErrors)
}
if (fieldWarnings.length > 0) {
validationResult.warnings.push(...fieldWarnings)
}
}
// Step 2: Cross-field validation
if (schema.crossFieldRules) {
for (const rule of schema.crossFieldRules) {
const result = validateCrossField(formData, rule)
if (!result.valid) {
validationResult.valid = false
validationResult.errors.push({
rule: rule.name,
message: result.message,
fields: rule.fields,
})
}
}
}
// Step 3: Custom business rules
for (const rule of customRules) {
const result = await evaluateBusinessRule(formData, rule)
if (!result.valid) {
if (rule.severity === 'error' || strictMode) {
validationResult.valid = false
validationResult.errors.push({
rule: rule.name,
message: result.message,
})
} else {
validationResult.warnings.push({
rule: rule.name,
message: result.message,
})
}
}
}
// Deliver results
await send.ServiceResult.deliver({
requestId: request.id,
outputs: validationResult,
})
// Charge for validation
await send.Payment.charge({
customerId: request.customerId,
amount: formValidationService.pricing.rate,
description: 'Form validation',
})
} catch (error) {
await send.ServiceRequest.fail({
requestId: request.id,
error: error.message,
retryable: false,
})
}
})
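The handler above awaits `executeValidator(value, validator)` for each entry in `fieldSchema.validators`, but the helper is not defined on this page. One possible shape, sketched under the assumption that validators are referenced by name and resolved from an in-process registry (the `ValidatorSpec` type, its `options` field, and both registry entries are hypothetical):

```typescript
// Hypothetical shapes: the original listing does not define these.
interface ValidatorSpec {
  name: string
  options?: Record<string, unknown>
}

interface ValidatorResult {
  valid: boolean
  message?: string
}

type ValidatorFn = (
  value: unknown,
  options: Record<string, unknown>
) => ValidatorResult | Promise<ValidatorResult>

// Registry of named validators (illustrative entries only).
const validatorRegistry = new Map<string, ValidatorFn>([
  [
    'noWhitespace',
    (value) =>
      typeof value === 'string' && /\s/.test(value)
        ? { valid: false, message: 'must not contain whitespace' }
        : { valid: true },
  ],
  [
    'multipleOf',
    (value, options) => {
      const step = Number(options.step)
      const ok = typeof value === 'number' && step > 0 && value % step === 0
      return ok ? { valid: true } : { valid: false, message: `must be a multiple of ${step}` }
    },
  ],
])

async function executeValidator(value: unknown, validator: ValidatorSpec): Promise<ValidatorResult> {
  const fn = validatorRegistry.get(validator.name)
  if (!fn) {
    // Unknown validators fail closed instead of silently passing.
    return { valid: false, message: `Unknown validator: ${validator.name}` }
  }
  return fn(value, validator.options ?? {})
}
```

Failing on an unknown name is a design choice: a misspelled validator then surfaces as a validation error rather than as a check that never ran.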
function validateType(value: any, type: string): boolean {
switch (type) {
case 'string':
return typeof value === 'string'
case 'number':
return typeof value === 'number' && !isNaN(value)
case 'integer':
return Number.isInteger(value)
case 'boolean':
return typeof value === 'boolean'
case 'email':
return /^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(value)
case 'url':
try {
new URL(value)
return true
} catch {
return false
}
case 'date':
return value instanceof Date || !isNaN(Date.parse(value))
case 'array':
return Array.isArray(value)
case 'object':
return typeof value === 'object' && value !== null && !Array.isArray(value)
default:
return true
}
}
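The cross-field step relies on a `compareValues` helper for `comparison` rules, and that helper is not defined on this page. A minimal sketch, assuming `rule.operator` is one of the symbolic operators below and that both fields hold numbers, `Date` objects, or strings in a sortable format such as ISO 8601 (these are assumptions, not something the original specifies):

```typescript
// Hypothetical operator set; the original does not say which operators are supported.
type ComparisonOperator = '==' | '!=' | '>' | '>=' | '<' | '<='

// Normalize a value to something orderable, or null when it cannot be compared.
function toComparable(value: unknown): number | string | null {
  if (value === null || value === undefined || value === '') return null
  if (value instanceof Date) return value.getTime()
  if (typeof value === 'number') return Number.isNaN(value) ? null : value
  if (typeof value === 'string') return value
  return null
}

function compareValues(a: unknown, b: unknown, operator: ComparisonOperator): boolean {
  const left = toComparable(a)
  const right = toComparable(b)
  // Missing or mixed-type operands cannot be ordered, so the comparison fails.
  if (left === null || right === null || typeof left !== typeof right) return false

  switch (operator) {
    case '==':
      return left === right
    case '!=':
      return left !== right
    case '>':
      return left > right
    case '>=':
      return left >= right
    case '<':
      return left < right
    case '<=':
      return left <= right
    default:
      return false
  }
}
```

With this sketch a missing operand makes the comparison fail, so a `comparison` rule is reported as violated when either field is empty. If you would rather skip the rule in that case, check for empty fields before calling the helper.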
function validateCrossField(data: any, rule: any): { valid: boolean; message?: string } {
switch (rule.type) {
case 'dependent':
// If field A has value, field B is required
if (data[rule.fields[0]] && !data[rule.fields[1]]) {
return {
valid: false,
message: `${rule.fields[1]} is required when ${rule.fields[0]} is provided`,
}
}
break
case 'mutuallyExclusive': {
// Only one of the fields can have a value
const withValues = rule.fields.filter((f) => data[f])
if (withValues.length > 1) {
return {
valid: false,
message: `Only one of ${rule.fields.join(', ')} can be provided`,
}
}
break
}
case 'comparison': {
// Compare two fields
const val1 = data[rule.fields[0]]
const val2 = data[rule.fields[1]]
if (!compareValues(val1, val2, rule.operator)) {
return {
valid: false,
message: `${rule.fields[0]} must be ${rule.operator} ${rule.fields[1]}`,
}
}
break
}
}
return { valid: true }
}
Business Rule Validation Service
Enforce complex business logic:
const businessRuleValidationService = await $.Service.create({
name: 'Business Rule Validator',
description: 'Validate data against complex business rules and constraints',
type: $.ServiceType.DataValidation,
subtype: 'business-rules',
input: {
required: ['data', 'rules'],
optional: ['context', 'mode'],
},
features: ['rule-engine', 'conditional-logic', 'external-lookups', 'ai-validation', 'custom-functions'],
pricing: {
model: 'per-validation',
base: 0.01,
complexity: {
simple: 1.0,
moderate: 2.0,
complex: 4.0,
},
},
})
on.ServiceRequest.created(async (request) => {
if (request.serviceId !== businessRuleValidationService.id) return
const { data, rules, context = {}, mode = 'strict' } = request.inputs
try {
const results = {
valid: true,
violations: [],
warnings: [],
evaluated: 0,
passed: 0,
failed: 0,
}
for (const rule of rules) {
results.evaluated++
// Evaluate rule
const evaluation = await evaluateRule(data, rule, context)
if (!evaluation.valid) {
if (rule.severity === 'critical' || mode === 'strict') {
results.valid = false
results.failed++
results.violations.push({
rule: rule.name,
severity: rule.severity,
message: evaluation.message,
field: evaluation.field,
actualValue: evaluation.actualValue,
expectedValue: evaluation.expectedValue,
})
} else {
results.warnings.push({
rule: rule.name,
severity: rule.severity,
message: evaluation.message,
})
}
} else {
results.passed++
}
}
// Calculate complexity
const complexity = calculateRuleComplexity(rules)
// Deliver results
await send.ServiceResult.deliver({
requestId: request.id,
outputs: results,
})
// Charge based on complexity
const multiplier = businessRuleValidationService.pricing.complexity[complexity]
const cost = businessRuleValidationService.pricing.base * multiplier
await send.Payment.charge({
customerId: request.customerId,
amount: cost,
description: `Business rule validation (${complexity})`,
})
} catch (error) {
await send.ServiceRequest.fail({
requestId: request.id,
error: error.message,
retryable: true,
})
}
})
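The handler above prices each request with `calculateRuleComplexity(rules)`, which must return one of the keys of `pricing.complexity` (`simple`, `moderate`, or `complex`). The function itself is not shown, so the following is only a sketch: the per-type weights and the two cut-off values are hypothetical and would need tuning against what each rule type really costs to run.

```typescript
type Complexity = 'simple' | 'moderate' | 'complex'

// Hypothetical weights per rule type (the types match the cases in evaluateRule).
const RULE_WEIGHTS: Record<string, number> = {
  condition: 1,
  range: 1,
  calculation: 2,
  lookup: 3,
  custom: 3,
  'ai-validation': 5,
}

function calculateRuleComplexity(rules: Array<{ type: string }>): Complexity {
  // Unknown rule types count as the cheapest weight.
  const total = rules.reduce((sum, rule) => sum + (RULE_WEIGHTS[rule.type] ?? 1), 0)
  if (total <= 5) return 'simple'
  if (total <= 15) return 'moderate'
  return 'complex'
}
```

Summing weights means a long list of cheap rules can cost as much as a few expensive ones, which may or may not match your pricing intent.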
async function evaluateRule(data: any, rule: any, context: any): Promise<any> {
switch (rule.type) {
case 'condition':
return evaluateCondition(data, rule.condition)
case 'range':
return evaluateRange(data, rule)
case 'lookup':
return await evaluateLookup(data, rule, context)
case 'calculation':
return evaluateCalculation(data, rule)
case 'ai-validation':
return await evaluateWithAI(data, rule)
case 'custom':
return await executeCustomRule(data, rule)
default:
return { valid: true }
}
}
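`evaluateRule` dispatches to several evaluators that are not defined on this page. As an illustration of what one of them might return, here is a sketch of `evaluateRange`. The `RangeRule` shape (`field`, `min`, `max`) is an assumption; the result fields mirror what the handler copies into `violations` (`message`, `field`, `actualValue`, `expectedValue`).

```typescript
// Hypothetical rule shape; only `name` and `type` are implied by the original.
interface RangeRule {
  name: string
  type: 'range'
  field: string
  min?: number
  max?: number
}

interface RuleEvaluation {
  valid: boolean
  message?: string
  field?: string
  actualValue?: unknown
  expectedValue?: unknown
}

function evaluateRange(data: Record<string, unknown>, rule: RangeRule): RuleEvaluation {
  const actual = data[rule.field]
  const expected = { min: rule.min, max: rule.max }

  // Build a failed evaluation carrying the context the handler reports.
  const fail = (message: string): RuleEvaluation => ({
    valid: false,
    message,
    field: rule.field,
    actualValue: actual,
    expectedValue: expected,
  })

  if (typeof actual !== 'number' || Number.isNaN(actual)) {
    return fail(`${rule.field} must be a number`)
  }
  if (rule.min !== undefined && actual < rule.min) {
    return fail(`${rule.field} must be at least ${rule.min}`)
  }
  if (rule.max !== undefined && actual > rule.max) {
    return fail(`${rule.field} must not exceed ${rule.max}`)
  }
  return { valid: true }
}
```

Note that this sketch treats a missing or non-numeric value as a violation. If a field is optional, pair the range rule with a separate presence check instead.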
async function evaluateWithAI(data: any, rule: any): Promise<any> {
// Use AI for complex validation logic
const result = await ai.validate({
model: 'gpt-5',
data,
rule: rule.description,
examples: rule.examples,
})
return {
valid: result.valid,
message: result.reasoning,
confidence: result.confidence,
}
}
Data Quality Assessment Service
Evaluate data quality across multiple dimensions:
const dataQualityService = await $.Service.create({
name: 'Data Quality Assessor',
description: 'Comprehensive data quality assessment across multiple dimensions',
type: $.ServiceType.DataValidation,
subtype: 'quality-assessment',
dimensions: ['completeness', 'accuracy', 'consistency', 'timeliness', 'validity', 'uniqueness'],
input: {
required: ['dataset'],
optional: ['dimensions', 'thresholds', 'sampleSize'],
},
pricing: {
model: 'per-assessment',
base: 10.0,
perThousandRecords: 1.0,
},
})
on.ServiceRequest.created(async (request) => {
if (request.serviceId !== dataQualityService.id) return
const { dataset, dimensions = dataQualityService.dimensions, thresholds = {}, sampleSize } = request.inputs
try {
// Sample dataset if needed
const sample = sampleSize && dataset.length > sampleSize ? sampleDataset(dataset, sampleSize) : dataset
const assessment = {
totalRecords: dataset.length,
sampleSize: sample.length,
dimensions: {},
overallScore: 0,
issues: [],
recommendations: [],
}
// Assess each dimension
for (const dimension of dimensions) {
const result = await assessDimension(sample, dimension, thresholds[dimension])
assessment.dimensions[dimension] = result
if (result.score < (thresholds[dimension]?.minimum ?? 80)) {
assessment.issues.push({
dimension,
score: result.score,
details: result.details,
})
}
}
// Calculate overall score
const scores = Object.values(assessment.dimensions).map((d: any) => d.score)
// Avoid NaN when no dimensions were assessed
assessment.overallScore = scores.length > 0 ? scores.reduce((sum, score) => sum + score, 0) / scores.length : 0
// Generate recommendations
assessment.recommendations = generateRecommendations(assessment)
// Deliver results
await send.ServiceResult.deliver({
requestId: request.id,
outputs: assessment,
})
// Calculate cost
const thousands = Math.ceil(dataset.length / 1000)
const cost = dataQualityService.pricing.base + thousands * dataQualityService.pricing.perThousandRecords
await send.Payment.charge({
customerId: request.customerId,
amount: cost,
description: `Data quality assessment (${dataset.length} records)`,
})
} catch (error) {
await send.ServiceRequest.fail({
requestId: request.id,
error: error.message,
retryable: true,
})
}
})
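The handler calls `sampleDataset(dataset, sampleSize)` when the dataset is larger than the requested sample, but the helper is not shown. One reasonable way to draw a uniform random sample without replacement is a partial Fisher-Yates shuffle. This is a sketch, not necessarily what the service does; the optional `random` parameter is an addition that makes the function testable.

```typescript
function sampleDataset<T>(
  dataset: readonly T[],
  sampleSize: number,
  random: () => number = Math.random
): T[] {
  const n = dataset.length
  const k = Math.max(0, Math.min(Math.floor(sampleSize), n))
  // Work on a copy so the caller's dataset is left untouched.
  const pool = dataset.slice()

  // Partial Fisher-Yates: only the first k positions need to be shuffled.
  for (let i = 0; i < k; i++) {
    const j = i + Math.floor(random() * (n - i))
    const tmp = pool[i]
    pool[i] = pool[j]
    pool[j] = tmp
  }
  return pool.slice(0, k)
}
```

Keep in mind that scores computed on a sample are estimates. The handler reports both `totalRecords` and `sampleSize`, so consumers can judge how far to trust them.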
async function assessDimension(sample: any[], dimension: string, threshold?: any): Promise<any> {
switch (dimension) {
case 'completeness':
return assessCompleteness(sample, threshold)
case 'accuracy':
return await assessAccuracy(sample, threshold)
case 'consistency':
return assessConsistency(sample, threshold)
case 'timeliness':
return assessTimeliness(sample, threshold)
case 'validity':
return assessValidity(sample, threshold)
case 'uniqueness':
return assessUniqueness(sample, threshold)
default:
return { score: 100, details: {} }
}
}
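Of the six dimension helpers, only completeness, accuracy, and consistency are defined on this page. As an example of how one of the missing ones could work, here is a sketch of `assessUniqueness` that scores the share of non-duplicate records. The `keyFields` option is hypothetical; the original passes an untyped `threshold` object.

```typescript
interface DimensionResult {
  score: number
  details: Record<string, unknown>
}

function assessUniqueness(
  sample: Array<Record<string, unknown>>,
  threshold?: { keyFields?: string[] }
): DimensionResult {
  if (sample.length === 0) {
    // Nothing to compare, so report a perfect score instead of dividing by zero.
    return { score: 100, details: { checked: 0, duplicates: 0 } }
  }

  const keyFields = threshold?.keyFields
  const seen = new Set<string>()
  let duplicates = 0

  for (const record of sample) {
    // With keyFields, compare only those values; otherwise compare the whole record.
    // Whole-record keys depend on property order, which is a limitation of this sketch.
    const key = JSON.stringify(keyFields ? keyFields.map((f) => record[f]) : record)
    if (seen.has(key)) {
      duplicates++
    } else {
      seen.add(key)
    }
  }

  return {
    score: ((sample.length - duplicates) / sample.length) * 100,
    details: { checked: sample.length, duplicates },
  }
}
```

Every repeat after the first occurrence counts as one duplicate, so four records with one repeated key score 75.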
function assessCompleteness(sample: any[], threshold?: any): any {
const fields = Object.keys(sample[0] || {})
const fieldCompleteness = {}
let totalFields = 0
let completeFields = 0
for (const field of fields) {
const nonNull = sample.filter((record) => record[field] !== null && record[field] !== undefined && record[field] !== '').length
const completeness = (nonNull / sample.length) * 100
fieldCompleteness[field] = {
completeness,
nullCount: sample.length - nonNull,
}
totalFields++
if (completeness >= (threshold?.fieldThreshold || 95)) {
completeFields++
}
}
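// Guard against an empty sample (no fields), which would otherwise yield NaN
const overallCompleteness = totalFields > 0 ? (completeFields / totalFields) * 100 : 0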
return {
score: overallCompleteness,
details: {
fields: fieldCompleteness,
completeFields,
totalFields,
},
}
}
async function assessAccuracy(sample: any[], threshold?: any): Promise<any> {
// Use AI to assess accuracy
const accuracyChecks = []
for (const record of sample.slice(0, 100)) {
// Sample for accuracy check
const check = await ai.validate({
model: 'gpt-5',
record,
validationRules: threshold?.rules || ['check for obvious errors', 'verify data makes sense', 'validate relationships'],
})
accuracyChecks.push({
valid: check.valid,
issues: check.issues,
})
}
const accurate = accuracyChecks.filter((c) => c.valid).length
// Guard against an empty sample, which would otherwise yield NaN
const score = accuracyChecks.length > 0 ? (accurate / accuracyChecks.length) * 100 : 0
return {
score,
details: {
checked: accuracyChecks.length,
accurate,
issues: accuracyChecks.filter((c) => !c.valid).map((c) => c.issues),
},
}
}
function assessConsistency(sample: any[], threshold?: any): any {
const fields = Object.keys(sample[0] || {})
const inconsistencies = []
for (const field of fields) {
const values = sample.map((r) => r[field]).filter((v) => v !== null && v !== undefined)
const types = [...new Set(values.map((v) => typeof v))]
// Check type consistency
if (types.length > 1) {
inconsistencies.push({
field,
issue: 'mixed-types',
types,
})
}
// Check format consistency for strings
if (types.includes('string')) {
const formats = detectFormats(values.filter((v) => typeof v === 'string'))
if (formats.length > 1) {
inconsistencies.push({
field,
issue: 'mixed-formats',
formats,
})
}
}
}
const score = Math.max(0, 100 - inconsistencies.length * 10)
return {
score,
details: {
inconsistencies,
fieldsChecked: fields.length,
},
}
}
Schema Validation Service
Validate data against JSON Schema and other schema formats:
import $, { on, send } from 'sdk.do'
import Ajv from 'ajv'
import addFormats from 'ajv-formats'
const schemaValidationService = await $.Service.create({
name: 'Schema Validator',
description: 'Validate data against JSON Schema, OpenAPI, and custom schemas',
type: $.ServiceType.DataValidation,
subtype: 'schema-validation',
supportedFormats: ['json-schema', 'openapi', 'graphql', 'protobuf', 'avro'],
pricing: {
model: 'per-validation',
rate: 0.005,
volume: [
{ min: 0, max: 1000, rate: 0.005 },
{ min: 1001, max: 10000, rate: 0.003 },
{ min: 10001, max: Infinity, rate: 0.001 },
],
},
})
on.ServiceRequest.created(async (request) => {
if (request.serviceId !== schemaValidationService.id) return
const { data, schema, schemaFormat = 'json-schema', options = {} } = request.inputs
try {
// Initialize validator
const ajv = new Ajv({ allErrors: true, verbose: true })
addFormats(ajv)
// Compile schema
let validate
if (schemaFormat === 'json-schema') {
validate = ajv.compile(schema)
} else {
// Convert other schema formats to JSON Schema
const jsonSchema = await convertToJSONSchema(schema, schemaFormat)
validate = ajv.compile(jsonSchema)
}
// Validate data
const dataArray = Array.isArray(data) ? data : [data]
const results = []
for (const item of dataArray) {
const valid = validate(item)
results.push({
valid,
errors: valid
? []
: (validate.errors ?? []).map((error) => ({
path: error.instancePath,
message: error.message,
keyword: error.keyword,
params: error.params,
})),
data: item,
})
}
const allValid = results.every((r) => r.valid)
// Deliver results
await send.ServiceResult.deliver({
requestId: request.id,
outputs: {
valid: allValid,
results,
summary: {
total: results.length,
valid: results.filter((r) => r.valid).length,
invalid: results.filter((r) => !r.valid).length,
},
},
})
// Calculate cost
const rate = getVolumeRate(dataArray.length, schemaValidationService.pricing.volume)
const cost = dataArray.length * rate
await send.Payment.charge({
customerId: request.customerId,
amount: cost,
description: `Schema validation (${dataArray.length} records)`,
})
} catch (error) {
await send.ServiceRequest.fail({
requestId: request.id,
error: error.message,
retryable: false,
})
}
})
async function convertToJSONSchema(schema: any, format: string): Promise<any> {
switch (format) {
case 'openapi':
return convertOpenAPIToJSONSchema(schema)
case 'graphql':
return convertGraphQLToJSONSchema(schema)
case 'protobuf':
return convertProtobufToJSONSchema(schema)
case 'avro':
return convertAvroToJSONSchema(schema)
default:
throw new Error(`Unsupported schema format: ${format}`)
}
}
Compliance Validation Service
Ensure data meets regulatory requirements:
const complianceValidationService = await $.Service.create({
name: 'Compliance Validator',
description: 'Validate data compliance with GDPR, HIPAA, PCI-DSS, and other regulations',
type: $.ServiceType.DataValidation,
subtype: 'compliance',
regulations: ['gdpr', 'hipaa', 'pci-dss', 'sox', 'ccpa', 'iso-27001'],
input: {
required: ['data', 'regulation'],
optional: ['jurisdiction', 'dataType', 'customRules'],
},
pricing: {
model: 'per-validation',
base: 0.1,
regulations: {
gdpr: 1.0,
hipaa: 1.5,
'pci-dss': 2.0,
},
},
})
on.ServiceRequest.created(async (request) => {
if (request.serviceId !== complianceValidationService.id) return
const { data, regulation, jurisdiction = 'US', dataType, customRules = [] } = request.inputs
try {
const compliance = {
compliant: true,
violations: [],
warnings: [],
requirements: [],
recommendations: [],
}
// Get regulation requirements
const requirements = await getComplianceRequirements(regulation, jurisdiction, dataType)
compliance.requirements = requirements
// Check each requirement
for (const requirement of requirements) {
const check = await checkCompliance(data, requirement)
if (!check.compliant) {
compliance.compliant = false
compliance.violations.push({
requirement: requirement.name,
severity: requirement.severity,
description: requirement.description,
issue: check.issue,
remediation: check.remediation,
})
} else if (check.warning) {
compliance.warnings.push({
requirement: requirement.name,
warning: check.warning,
})
}
}
// Generate recommendations
if (!compliance.compliant) {
compliance.recommendations = await generateComplianceRecommendations(compliance.violations, regulation)
}
// Deliver results
await send.ServiceResult.deliver({
requestId: request.id,
outputs: compliance,
})
// Charge based on regulation complexity
const multiplier = complianceValidationService.pricing.regulations[regulation] || 1.0
const cost = complianceValidationService.pricing.base * multiplier
await send.Payment.charge({
customerId: request.customerId,
amount: cost,
description: `Compliance validation (${regulation})`,
})
} catch (error) {
await send.ServiceRequest.fail({
requestId: request.id,
error: error.message,
retryable: true,
})
}
})
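The handler calls `checkCompliance(data, requirement)` for every requirement, but the per-requirement checks are not shown. To give a feel for what a single check could look like, the sketch below covers one narrow case: flagging string fields that look like unmasked payment card numbers, which relates to the `cardholder-data-protection` requirement. It uses the Luhn checksum as a heuristic. The function name and the result wording are hypothetical, and passing this check says nothing about actual PCI DSS compliance.

```typescript
interface ComplianceCheck {
  compliant: boolean
  issue?: string
  remediation?: string
}

// Luhn checksum: from the right, double every second digit, subtract 9 from
// any result above 9, and require the total to be divisible by 10.
function passesLuhn(digits: string): boolean {
  let sum = 0
  let doubleNext = false
  for (let i = digits.length - 1; i >= 0; i--) {
    let digit = digits.charCodeAt(i) - 48
    if (doubleNext) {
      digit *= 2
      if (digit > 9) digit -= 9
    }
    sum += digit
    doubleNext = !doubleNext
  }
  return sum % 10 === 0
}

function checkCardholderDataProtection(data: Record<string, unknown>): ComplianceCheck {
  const exposedFields: string[] = []
  for (const [field, value] of Object.entries(data)) {
    // Only top-level string fields are inspected in this sketch.
    if (typeof value !== 'string') continue
    const digits = value.replace(/[\s-]/g, '')
    if (/^\d{13,19}$/.test(digits) && passesLuhn(digits)) {
      exposedFields.push(field)
    }
  }

  if (exposedFields.length === 0) return { compliant: true }
  return {
    compliant: false,
    issue: `Possible unmasked card number in: ${exposedFields.join(', ')}`,
    remediation: 'Mask, tokenize, or encrypt the value before storing or forwarding it',
  }
}
```

A heuristic like this produces false positives (any 13 to 19 digit string that happens to pass the checksum) and misses numbers stored as numeric types or nested inside objects, so treat it as a first filter only.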
async function getComplianceRequirements(regulation: string, jurisdiction: string, dataType?: string): Promise<any[]> {
const requirements = []
switch (regulation) {
case 'gdpr':
requirements.push(
{ name: 'data-minimization', severity: 'high', description: 'Collect only necessary data' },
{ name: 'consent', severity: 'high', description: 'Explicit consent required for data processing' },
{ name: 'right-to-erasure', severity: 'high', description: 'Support data deletion requests' },
{ name: 'data-portability', severity: 'medium', description: 'Enable data export in machine-readable format' },
{ name: 'encryption', severity: 'high', description: 'Encrypt personal data' }
)
break
case 'hipaa':
requirements.push(
{ name: 'phi-protection', severity: 'critical', description: 'Protected Health Information must be secured' },
{ name: 'access-controls', severity: 'high', description: 'Implement role-based access controls' },
{ name: 'audit-logs', severity: 'high', description: 'Maintain audit logs of PHI access' },
{ name: 'encryption', severity: 'critical', description: 'Encrypt PHI at rest and in transit' }
)
break
case 'pci-dss':
requirements.push(
{ name: 'cardholder-data-protection', severity: 'critical', description: 'Protect stored cardholder data' },
{ name: 'encryption', severity: 'critical', description: 'Encrypt transmission of cardholder data' },
{ name: 'access-restriction', severity: 'high', description: 'Restrict access to cardholder data' },
{ name: 'monitoring', severity: 'high', description: 'Monitor and test networks regularly' }
)
break
}
return requirements
}
Real-Time Validation Service
Validate data streams in real-time:
const realtimeValidationService = await $.Service.create({
name: 'Real-Time Validator',
description: 'Validate data streams with sub-10ms latency',
type: $.ServiceType.DataValidation,
subtype: 'real-time',
latency: '<10ms',
throughput: '100000 validations/second',
pricing: {
model: 'subscription',
tiers: [
{ name: 'starter', price: 100, validations: 1000000 },
{ name: 'professional', price: 500, validations: 10000000 },
{ name: 'enterprise', price: 2000, validations: 100000000 },
],
},
})
on.Stream.data(async (stream) => {
const validationRules = await db.ValidationRule.list({
where: { streamId: stream.id, enabled: true },
})
// Compile rules for performance
const compiledRules = compileValidationRules(validationRules)
for await (const record of stream) {
const startTime = Date.now()
try {
const validation = {
valid: true,
errors: [],
}
// Apply validation rules
for (const rule of compiledRules) {
const result = rule.validate(record)
if (!result.valid) {
validation.valid = false
validation.errors.push(result.error)
}
}
const latency = Date.now() - startTime
// Emit validation result
await send.Stream.emit({
streamId: stream.id,
data: {
...record,
_validation: validation,
_latency: latency,
},
})
// Route based on validation
if (!validation.valid) {
await send.Stream.routeToDeadLetter({
streamId: stream.id,
record,
validation,
})
}
} catch (error) {
await send.Stream.error({
streamId: stream.id,
record,
error: error.message,
})
}
}
})
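The stream handler depends on `compileValidationRules`, which in this section turns stored rule code into functions with `new Function`. That runs whatever is in `rule.code`, so it is only appropriate when rules come from a trusted source. If rules are user-supplied, one alternative is to store them as data and compile them into closures. The sketch below assumes a hypothetical `RuleSpec` format with three rule kinds; it returns the same `{ valid, error }` shape the handler reads.

```typescript
// Hypothetical declarative rule format; not part of the original listing.
type RuleSpec =
  | { name: string; kind: 'required'; field: string }
  | { name: string; kind: 'range'; field: string; min?: number; max?: number }
  | { name: string; kind: 'pattern'; field: string; pattern: string }

interface RuleResult {
  valid: boolean
  error?: string
}

interface CompiledRule {
  name: string
  validate: (record: Record<string, unknown>) => RuleResult
}

function compileDeclarativeRules(specs: RuleSpec[]): CompiledRule[] {
  return specs.map((spec): CompiledRule => {
    const { name, field } = spec
    switch (spec.kind) {
      case 'required':
        return {
          name,
          validate: (record) => {
            const value = record[field]
            const present = value !== undefined && value !== null && value !== ''
            return present ? { valid: true } : { valid: false, error: `${field} is required` }
          },
        }
      case 'range': {
        const { min, max } = spec
        return {
          name,
          validate: (record) => {
            const value = record[field]
            if (typeof value !== 'number') return { valid: false, error: `${field} must be a number` }
            if (min !== undefined && value < min) return { valid: false, error: `${field} must be at least ${min}` }
            if (max !== undefined && value > max) return { valid: false, error: `${field} must not exceed ${max}` }
            return { valid: true }
          },
        }
      }
      case 'pattern': {
        // The RegExp is built once at compile time, not once per record.
        const regex = new RegExp(spec.pattern)
        return {
          name,
          validate: (record) => {
            const value = record[field]
            const ok = typeof value === 'string' && regex.test(value)
            return ok ? { valid: true } : { valid: false, error: `${field} format is invalid` }
          },
        }
      }
    }
  })
}
```

This keeps per-record work to a few property reads and comparisons. User-supplied regular expressions can still be expensive to evaluate, so you may want to restrict or review the `pattern` kind as well.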
function compileValidationRules(rules: any[]): any[] {
return rules.map((rule) => {
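// Compile rule for fast execution.
// WARNING: new Function runs rule.code with full privileges, so only compile rules from trusted sources.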
return {
name: rule.name,
validate: new Function(
'record',
`
try {
${rule.code}
return { valid: true }
} catch (error) {
return { valid: false, error: error.message }
}
`
),
}
})
}
Batch Validation Service
Validate large datasets efficiently:
const batchValidationService = await $.Service.create({
name: 'Batch Validator',
description: 'Validate large datasets with detailed error reporting',
type: $.ServiceType.DataValidation,
subtype: 'batch',
maxBatchSize: 1000000,
pricing: {
model: 'per-thousand',
rate: 1.0,
volume: [
{ min: 0, max: 10000, rate: 1.0 },
{ min: 10001, max: 100000, rate: 0.75 },
{ min: 100001, max: Infinity, rate: 0.5 },
],
},
})
on.ServiceRequest.created(async (request) => {
if (request.serviceId !== batchValidationService.id) return
const { dataUrl, validationRules, options = {} } = request.inputs
try {
// Download dataset
const dataset = await downloadDataset(dataUrl)
// Process in batches
const batchSize = 10000
const results = {
total: dataset.length,
valid: 0,
invalid: 0,
errors: [],
warnings: [],
}
for (let i = 0; i < dataset.length; i += batchSize) {
const batch = dataset.slice(i, i + batchSize)
// Validate batch in parallel
const batchResults = await Promise.all(
batch.map(async (record, index) => {
const validation = await validateRecord(record, validationRules)
return {
index: i + index,
valid: validation.valid,
errors: validation.errors,
warnings: validation.warnings,
}
})
)
// Aggregate results
batchResults.forEach((result) => {
if (result.valid) {
results.valid++
} else {
results.invalid++
if (options.includeErrors) {
results.errors.push(result)
}
}
if (result.warnings.length > 0 && options.includeWarnings) {
results.warnings.push(result)
}
})
// Update progress
await send.ServiceProgress.updated({
requestId: request.id,
progress: Math.min(i + batchSize, dataset.length) / dataset.length, // clamp so the final batch reports at most 1
message: `Validated ${Math.min(i + batchSize, dataset.length)} of ${dataset.length} records`,
})
}
// Generate report
const reportUrl = await generateValidationReport(results, options)
// Deliver results
await send.ServiceResult.deliver({
requestId: request.id,
outputs: {
summary: results,
reportUrl,
validationRate: results.total > 0 ? (results.valid / results.total) * 100 : 0,
},
})
// Calculate cost
const thousands = Math.ceil(dataset.length / 1000)
const rate = getVolumeRate(dataset.length, batchValidationService.pricing.volume)
const cost = thousands * rate
await send.Payment.charge({
customerId: request.customerId,
amount: cost,
description: `Batch validation (${dataset.length} records)`,
})
} catch (error) {
await send.ServiceRequest.fail({
requestId: request.id,
error: error.message,
retryable: true,
})
}
})
Pricing Models for Validation Services
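The schema and batch validators look up their per-unit rate with `getVolumeRate(quantity, pricing.volume)`, a helper that is not defined in the listings. A minimal sketch, assuming tiers are inclusive `{ min, max, rate }` ranges as in those two services and that the whole request is billed at the rate of the tier it falls into:

```typescript
interface VolumeTier {
  min: number
  max: number
  rate: number
}

function getVolumeRate(quantity: number, tiers: VolumeTier[]): number {
  // Tiers are treated as inclusive on both ends, matching { min: 1001, max: 10000 } style ranges.
  const tier = tiers.find((t) => quantity >= t.min && quantity <= t.max)
  if (!tier) {
    throw new Error(`No pricing tier covers quantity ${quantity}`)
  }
  return tier.rate
}

// Tiers copied from the schema validation service on this page.
const schemaTiers: VolumeTier[] = [
  { min: 0, max: 1000, rate: 0.005 },
  { min: 1001, max: 10000, rate: 0.003 },
  { min: 10001, max: Infinity, rate: 0.001 },
]

const rateForSmallJob = getVolumeRate(500, schemaTiers)
const rateForLargeJob = getVolumeRate(50000, schemaTiers)
```

Flat tiers like this create price cliffs: at these rates, 1,001 records (about 3.00) cost less than 1,000 records (5.00). If that matters, consider graduated pricing, where each tier's rate applies only to the units that fall inside that tier.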
Per-Validation Pricing
Best for: Form validation, API validation
pricing: {
model: 'per-validation',
rate: 0.001,
minimumCharge: 0.01,
}
Subscription Pricing
Best for: Real-time validation, continuous monitoring
pricing: {
model: 'subscription',
tiers: [
{ name: 'starter', price: 50, validations: 100000 },
{ name: 'professional', price: 200, validations: 1000000 },
{ name: 'enterprise', price: 1000, validations: 10000000 },
],
}
Complexity-Based Pricing
Best for: Business rule validation, compliance checks
pricing: {
model: 'complexity-based',
base: 0.01,
multipliers: {
simple: 1.0,
moderate: 2.0,
complex: 4.0,
},
}
Best Practices
1. Provide Clear Error Messages
function formatValidationError(error: any): string {
return `${error.field}: ${error.message} (current value: ${error.actualValue}, expected: ${error.expectedValue})`
}
2. Cache Validation Rules
const ruleCache = new Map()
async function getCachedRule(ruleId: string): Promise<any> {
if (ruleCache.has(ruleId)) {
return ruleCache.get(ruleId)
}
const rule = await db.ValidationRule.get({ where: { id: ruleId } })
ruleCache.set(ruleId, rule)
return rule
}
3. Support Incremental Validation
async function validateIncremental(previousValidation: any, changes: any): Promise<any> {
// Only re-validate changed fields
const fieldsToValidate = Object.keys(changes)
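// Copy fieldStatus too: a shallow spread alone would share it with previousValidation and mutate the earlier result
const validation = { ...previousValidation, fieldStatus: { ...previousValidation.fieldStatus } }
for (const field of fieldsToValidate) {
validation.fieldStatus[field] = await validateField(changes[field], field)
}
// Note: top-level valid, errors, and warnings are carried over unchanged; recompute them if callers rely on them
return validation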
}
Next Steps
- Data Transformation Services: Convert and normalize data
- Data Enrichment Services: Add value to data
- Data Analytics Services: Generate insights
- Service Composition: Combine validation services