Figure: Face identity extractor pipeline showing detection, alignment, and embedding generation
The face identity extractor provides production-grade face recognition using SCRFD for detection and ArcFace for embeddings. It detects faces, aligns each one to a canonical template, and generates 512-dimensional embeddings with 99.8%+ verification accuracy on the LFW benchmark.
View extractor details at api.mixpeek.com/v1/collections/features/extractors/face_identity_extractor_v1 or fetch programmatically with GET /v1/collections/features/extractors/{feature_extractor_id}.
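As a minimal sketch of the programmatic fetch, the snippet below builds (but does not send) the GET request with Python's standard library. The `https` scheme, the bearer-token `Authorization` header, and the helper name `build_extractor_request` are assumptions for illustration; check the API reference for the actual authentication scheme.

```python
from urllib.parse import quote
from urllib.request import Request

# Host taken from the URL above; the https scheme is an assumption.
BASE_URL = "https://api.mixpeek.com"


def build_extractor_request(feature_extractor_id: str, api_key: str) -> Request:
    """Build (without sending) a GET request for one extractor's details."""
    path = "/v1/collections/features/extractors/" + quote(feature_extractor_id, safe="")
    # Assumption: bearer-token auth. Replace with the scheme your account uses.
    headers = {"Authorization": f"Bearer {api_key}"}
    return Request(BASE_URL + path, headers=headers, method="GET")


request = build_extractor_request("face_identity_extractor_v1", "YOUR_API_KEY")
print(request.get_method(), request.full_url)
```

Passing the returned object to `urllib.request.urlopen` would perform the call.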

When to Use

| Use Case | Description |
| --- | --- |
| Face verification | 1:1 matching to verify identity |
| Face identification | 1:N search to identify a person in a database |
| Face clustering | Group photos by person automatically |
| Employee verification | Workplace identity systems |
| Photo organization | Organize photo libraries by people |
| Surveillance | Security and monitoring applications |

When NOT to Use

| Scenario | Recommended Alternative |
| --- | --- |
| General image search | image_extractor |
| Object/scene detection | multimodal_extractor |
| Video content analysis | multimodal_extractor |
| Non-face biometrics | Specialized extractors |

Supported Input Types

| Input | Type | Description | Processing |
| --- | --- | --- | --- |
| image | string | URL or S3 path | Detect and embed all faces |
| video | string | URL or S3 path | Sample frames, detect faces, deduplicate |
| video_frame | string | URL or S3 path | Treated as image |
Supported formats:
  • Image: JPEG, PNG, WebP, BMP
  • Video: MP4, MOV, AVI, MKV, WebM
Recommended resolution: 640px+ for optimal face detection

Input Schema

Provide one of the following inputs:
{
  "image": "s3://photos/john-doe-portrait.jpg"
}
{
  "video": "s3://segments/interview-clip.mp4"
}
| Field | Type | Description |
| --- | --- | --- |
| image | string | Image URL or S3 path containing faces |
| video | string | Video URL or S3 path. Subject to max_video_length limit |
| video_frame | string | Single video frame URL or S3 path (treated as image) |
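A client-side check of the input payload might look like the sketch below. It reads "provide one of" as "exactly one of", which is an assumption, and `validate_input` is a hypothetical helper rather than part of any SDK.

```python
INPUT_FIELDS = ("image", "video", "video_frame")


def validate_input(payload: dict) -> str:
    """Return the name of the single input field present, or raise ValueError."""
    present = [name for name in INPUT_FIELDS if payload.get(name)]
    # Assumption: exactly one input field is allowed per request.
    if len(present) != 1:
        raise ValueError(f"expected exactly one of {INPUT_FIELDS}, got {present or 'none'}")
    value = payload[present[0]]
    if not isinstance(value, str):
        raise ValueError(f"{present[0]} must be a string URL or S3 path")
    return present[0]


print(validate_input({"image": "s3://photos/john-doe-portrait.jpg"}))
```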

Output Schema

Each detected face produces one document with the following fields:
| Field | Type | Description |
| --- | --- | --- |
| face_identity_extractor_v1_embedding | float[512] | ArcFace embedding, L2 normalized |
| face_index | integer | Index of this face in source image (0-based) |
| bbox | object | Bounding box {x1, y1, x2, y2, width, height} |
| detection_score | number | SCRFD detection confidence (0.0-1.0) |
| landmarks | object | 5 facial landmarks for alignment |
| quality_score | number | Face quality score (0.0-1.0) |
| quality_components | object | Quality component scores (blur, size, etc.) |
| aligned_face_crop | string | Base64 aligned 112x112 face crop (optional) |
| frame_number | integer | Frame number in source video |
| timestamp | number | Timestamp in source video (seconds) |
| embedding_model | string | Embedding model used |
| detection_model | string | Detection model used |
| processing_time_ms | number | Processing time (milliseconds) |
{
  "face_identity_extractor_v1_embedding": [0.023, -0.041, 0.018, ...],
  "face_index": 0,
  "bbox": {"x1": 120, "y1": 80, "x2": 280, "y2": 300, "width": 160, "height": 220},
  "detection_score": 0.98,
  "landmarks": {"left_eye": [150, 140], "right_eye": [230, 142], ...},
  "quality_score": 0.85,
  "embedding_model": "arcface_r100",
  "detection_model": "scrfd_2.5g",
  "processing_time_ms": 45.2
}
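To illustrate how a consumer might read such a document, here is a small sketch that parses a trimmed copy of the example and checks two things: that the bounding box fields agree with each other, and that the face clears a caller-chosen confidence cutoff. The helper names are hypothetical, and the embedding and landmarks are shortened to the values visible above.

```python
import json

# Trimmed copy of the example document: the embedding is cut to the three
# values shown above and only the two visible landmarks are kept.
face = json.loads("""
{
  "face_identity_extractor_v1_embedding": [0.023, -0.041, 0.018],
  "face_index": 0,
  "bbox": {"x1": 120, "y1": 80, "x2": 280, "y2": 300, "width": 160, "height": 220},
  "detection_score": 0.98,
  "landmarks": {"left_eye": [150, 140], "right_eye": [230, 142]},
  "quality_score": 0.85
}
""")


def bbox_is_consistent(bbox: dict) -> bool:
    """Check that width/height match the corner coordinates, as in the example."""
    return (bbox["x2"] - bbox["x1"] == bbox["width"]
            and bbox["y2"] - bbox["y1"] == bbox["height"])


def keep_confident(faces: list, min_score: float) -> list:
    """Keep only faces whose detection_score is at least min_score."""
    return [f for f in faces if f["detection_score"] >= min_score]


print(bbox_is_consistent(face["bbox"]), len(keep_confident([face], 0.9)))
```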

Parameters

Detection Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| detection_model | string | "scrfd_2.5g" | SCRFD model variant |
| min_face_size | integer | 20 | Minimum face size in pixels to detect |
| detection_threshold | float | 0.5 | Confidence threshold (0.0-1.0) |
| max_faces_per_image | integer | null | Maximum faces to process per image |

Detection Models

| Model | Speed | Accuracy | Best For |
| --- | --- | --- | --- |
| scrfd_500m | 2-3ms | Good | Real-time applications |
| scrfd_2.5g | 5-7ms | Better | Recommended - balanced |
| scrfd_10g | 10-15ms | Best | Maximum accuracy |

Embedding Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| embedding_model | string | "arcface_r100" | Face embedding model |
| normalize_embeddings | boolean | true | L2-normalize to unit vectors |
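For readers unfamiliar with L2 normalization, the sketch below shows what "normalize to unit vectors" means: each component is divided by the vector's Euclidean length. It is a plain-Python illustration of the math, not the extractor's own code.

```python
import math


def l2_normalize(vector: list) -> list:
    """Scale a vector so that its Euclidean (L2) norm equals 1."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vector]


unit = l2_normalize([3.0, 4.0])
print(unit)
```

With unit vectors, cosine similarity reduces to a plain dot product, which is why normalized embeddings pair naturally with a cosine index.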

Embedding Models

| Model | Accuracy (LFW) | Speed | Notes |
| --- | --- | --- | --- |
| arcface_r100 | 99.8%+ | Standard | Recommended - highest accuracy |
| arcface_r50 | 99.5%+ | Faster | Slightly lower accuracy |
| magface_r100 | 99.7%+ | Standard | Includes built-in quality score |

Quality Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| enable_quality_scoring | boolean | true | Compute quality scores (adds ~5ms per face) |
| quality_threshold | float | null | Minimum quality to index (null = index all) |
Quality threshold guide:
  • null - Index all detected faces
  • 0.5 - Moderate filtering (removes low quality)
  • 0.7 - High quality only
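The threshold semantics in the guide above can be sketched as a simple predicate. Treating the comparison as inclusive (a score equal to the threshold passes) is an assumption, and `passes_quality` is a hypothetical name.

```python
from typing import Optional


def passes_quality(face: dict, quality_threshold: Optional[float]) -> bool:
    """Return True if the face should be indexed under the given threshold."""
    if quality_threshold is None:
        return True  # null threshold: index every detected face
    # Assumption: a score exactly equal to the threshold is kept.
    return face["quality_score"] >= quality_threshold


faces = [{"quality_score": 0.85}, {"quality_score": 0.55}, {"quality_score": 0.30}]
for threshold in (None, 0.5, 0.7):
    kept = [f for f in faces if passes_quality(f, threshold)]
    print(threshold, len(kept))
```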

Video Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| max_video_length | integer | 60 | Maximum video length in seconds |
| video_sampling_fps | float | 1.0 | Frames per second to sample |
| video_deduplication | boolean | true | Remove duplicate faces across frames |
| video_deduplication_threshold | float | 0.8 | Cosine similarity for deduplication |
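The exact deduplication algorithm is not described here, so the following is only one plausible reading: a greedy pass that keeps a face when its cosine similarity to every face already kept is below the threshold. It assumes unit-length embeddings (so a dot product equals cosine similarity) and uses 2-d toy vectors in place of real 512-d embeddings.

```python
def dot(a: list, b: list) -> float:
    """Dot product; equals cosine similarity when both vectors are unit length."""
    return sum(x * y for x, y in zip(a, b))


def deduplicate(embeddings: list, threshold: float = 0.8) -> list:
    """Greedy sketch: drop any embedding too similar to one already kept."""
    kept = []
    for emb in embeddings:
        if all(dot(emb, other) < threshold for other in kept):
            kept.append(emb)
    return kept


# Toy unit vectors standing in for faces seen in three sampled frames.
frames = [[1.0, 0.0], [0.6, 0.8], [1.0, 0.0]]
unique = deduplicate(frames, threshold=0.8)
print(len(unique))
```

The third vector repeats the first (similarity 1.0), so it is dropped, while the second (similarity 0.6 to the first) is kept.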

Output Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| output_mode | string | "per_face" | per_face or per_image |
| include_face_crops | boolean | false | Include aligned 112x112 face crops as base64 |
| store_detection_metadata | boolean | true | Store bbox, landmarks, detection scores |

Configuration Examples

{
  "feature_extractor": {
    "feature_extractor_name": "face_identity_extractor",
    "version": "v1",
    "input_mappings": {
      "image": "payload.photo_url"
    },
    "field_passthrough": [
      { "source_path": "metadata.employee_id" }
    ],
    "parameters": {
      "detection_model": "scrfd_2.5g",
      "detection_threshold": 0.7,
      "embedding_model": "arcface_r100",
      "enable_quality_scoring": true,
      "quality_threshold": 0.5,
      "max_faces_per_image": 1,
      "min_face_size": 40
    }
  }
}
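Before submitting a configuration like this one, a client could sanity-check the `parameters` block against the values documented in the tables above. The sketch below is a hypothetical pre-flight check, not a server-side validator; restricting `quality_threshold` to 0.0-1.0 is an assumption based on the range of `quality_score`.

```python
DETECTION_MODELS = {"scrfd_500m", "scrfd_2.5g", "scrfd_10g"}
EMBEDDING_MODELS = {"arcface_r100", "arcface_r50", "magface_r100"}


def check_parameters(params: dict) -> list:
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    if params.get("detection_model", "scrfd_2.5g") not in DETECTION_MODELS:
        problems.append("unknown detection_model")
    if params.get("embedding_model", "arcface_r100") not in EMBEDDING_MODELS:
        problems.append("unknown embedding_model")
    # Assumption: both thresholds live in the same 0.0-1.0 range as the scores.
    for name in ("detection_threshold", "quality_threshold"):
        value = params.get(name)
        if value is not None and not 0.0 <= value <= 1.0:
            problems.append(f"{name} must be within 0.0-1.0")
    return problems


example = {
    "detection_model": "scrfd_2.5g",
    "detection_threshold": 0.7,
    "embedding_model": "arcface_r100",
    "enable_quality_scoring": True,
    "quality_threshold": 0.5,
    "max_faces_per_image": 1,
    "min_face_size": 40,
}
print(check_parameters(example))
```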

Face Matching

Use cosine similarity to match faces:
| Similarity Score | Interpretation |
| --- | --- |
| > 0.30 | Very likely same person |
| 0.25 - 0.30 | Likely same person |
| 0.20 - 0.25 | Possibly same person |
| < 0.20 | Different people |
Recommended threshold: 0.25-0.30 for same person verification
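A minimal sketch of the comparison, in plain Python: compute cosine similarity between two embeddings, then map the score onto the bands in the table. The table's bands share their endpoints, so assigning a boundary value (for example exactly 0.25) to the higher band is an assumption made here.

```python
import math


def cosine_similarity(a: list, b: list) -> float:
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


def interpret(similarity: float) -> str:
    """Map a similarity score to the interpretation bands documented above."""
    if similarity > 0.30:
        return "Very likely same person"
    if similarity >= 0.25:  # assumption: shared boundaries go to the higher band
        return "Likely same person"
    if similarity >= 0.20:
        return "Possibly same person"
    return "Different people"


print(interpret(cosine_similarity([1.0, 0.0], [1.0, 0.0])))
```

For embeddings that are already L2-normalized, the two norms are 1 and the function reduces to a dot product.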

Performance & Costs

| Metric | Value |
| --- | --- |
| Detection accuracy | 99%+ (WIDER FACE benchmark) |
| Verification accuracy | 99.8%+ (LFW benchmark) |
| Processing speed | Detection: 5-7ms, Embedding: 10-15ms per face |
| Cost per image | 30 credits base |
| Cost per face | 30 credits additional per detected face |
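As a worked example of the pricing rows above, the sketch below estimates credits for one image. Whether faces later dropped by `quality_threshold` or `max_faces_per_image` still count as "detected" for billing is not stated here, so the function simply takes the billed face count as input.

```python
BASE_CREDITS_PER_IMAGE = 30  # "Cost per image" row
CREDITS_PER_FACE = 30        # "Cost per face" row


def image_cost(num_faces: int) -> int:
    """Estimated credits for one image with num_faces billed faces."""
    if num_faces < 0:
        raise ValueError("num_faces must be non-negative")
    return BASE_CREDITS_PER_IMAGE + CREDITS_PER_FACE * num_faces


# A group photo with 3 faces: 30 base + 3 * 30 per face.
print(image_cost(3))  # prints 120
```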

Video Processing

  • Deduplication: Reduces redundancy in video by 90-95%
  • Sampling: 1 FPS recommended for most use cases
  • Max length: 300 seconds (extraction only)

Vector Index

| Property | Value |
| --- | --- |
| Index name | face_identity_extractor_v1_embedding |
| Dimensions | 512 |
| Type | Dense |
| Distance metric | Cosine |
| Datatype | float32 |
| Inference model | face_identity_arcface_r100_v1 |
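To make the cosine metric concrete, here is a brute-force sketch of a 1:N lookup over a small in-memory gallery. A real deployment would query the vector index instead; the gallery labels, the 2-d toy vectors, and the `top_k` helper are all hypothetical, and the embeddings are assumed to be unit length so that a dot product equals cosine similarity.

```python
def top_k(query: list, gallery: dict, k: int = 3) -> list:
    """Return the k (label, score) pairs most similar to the query."""
    scored = [
        (sum(q * g for q, g in zip(query, vector)), label)
        for label, vector in gallery.items()
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [(label, score) for score, label in scored[:k]]


# Toy unit vectors standing in for stored 512-d embeddings.
gallery = {
    "alice": [1.0, 0.0],
    "bob": [0.0, 1.0],
    "carol": [0.6, 0.8],
}
matches = top_k([0.8, 0.6], gallery, k=2)
print([label for label, _ in matches])
```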

Pipeline Overview

  1. SCRFD Detection - Bounding boxes + 5 landmarks
  2. 5-Point Affine Alignment - 112x112 canonical face
  3. ArcFace Embedding - 512-d L2-normalized vector
  4. Quality Scoring (optional) - Filter low-quality faces

Limitations

  • Face only: Does not identify age, gender, or expressions
  • Pose sensitivity: Extreme angles may reduce accuracy
  • Occlusion: Masks, glasses, hair may affect detection
  • Resolution: Minimum 20px face size, 40px+ recommended
  • Lighting: Poor lighting reduces quality scores
  • Video length: Maximum 300 seconds per video