Feature Extraction
Configure and customize multimodal feature extraction for different content types
Feature extractors allow you to define how content is processed and what information is extracted, with support for text, images, video, and audio content.
Extractor Types
Text Processing
- Text embedding
- Language detection
- Sentiment analysis
- Named entity recognition
Image Analysis
- Object detection
- Scene classification
- Face recognition
- OCR extraction
Video Processing
- Frame analysis
- Transcription
- Scene detection
- Motion tracking
Configuration
Extractor Options
Output Configuration
Processing Flow
Best Practices
Optimize Intervals
Choose appropriate processing intervals for video content
Configure Thresholds
Set confidence thresholds based on use case requirements
Select Indexes
Use appropriate vector indexes for different content types
Monitor Resources
Balance processing depth with resource utilization
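The first two practices can be illustrated with a minimal sketch. It is not the service's API: the `Detection` type and both helper names are hypothetical, chosen only to show how a sampling interval and a confidence threshold trade processing depth against cost.

```python
import math
from dataclasses import dataclass


@dataclass
class Detection:
    """Hypothetical shape of a single extractor result."""
    label: str
    confidence: float


def sample_timestamps(duration_s: float, interval_s: float) -> list[float]:
    """Return the timestamps (in seconds) at which frames would be analyzed.

    A larger interval means fewer frames and lower processing cost.
    """
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    count = math.ceil(duration_s / interval_s)
    return [i * interval_s for i in range(count)]


def filter_by_confidence(detections: list[Detection], threshold: float) -> list[Detection]:
    """Keep only detections whose confidence meets the threshold."""
    return [d for d in detections if d.confidence >= threshold]


if __name__ == "__main__":
    # A 10-second clip sampled every 4 seconds yields three frames.
    print(sample_timestamps(10, 4))
    results = [Detection("car", 0.92), Detection("dog", 0.41)]
    print(filter_by_confidence(results, 0.5))
```

A stricter threshold suits use cases where false positives are costly; a looser one suits recall-oriented search.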
Limitations
Be aware of these technical constraints:
- Maximum video duration: 4 hours
- Maximum file size: 2 GB
- Processing timeout: 30 minutes
- Rate limits apply to extraction requests
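A client-side pre-flight check against these limits might look like the sketch below. The constants mirror the list above; the function name is hypothetical, and treating 2 GB as 2 × 1024³ bytes is an assumption, since the page does not say whether decimal or binary units are meant.

```python
from typing import Optional

MAX_VIDEO_DURATION_S = 4 * 60 * 60      # 4 hours
MAX_FILE_SIZE_BYTES = 2 * 1024 ** 3     # assumption: 2 GB read as 2 GiB
PROCESSING_TIMEOUT_S = 30 * 60          # 30 minutes, usable as a client-side wait limit


def check_limits(size_bytes: int, duration_s: Optional[float] = None) -> list[str]:
    """Return a list of human-readable limit violations (empty if none)."""
    problems = []
    if size_bytes > MAX_FILE_SIZE_BYTES:
        problems.append(f"file size {size_bytes} bytes exceeds {MAX_FILE_SIZE_BYTES}")
    if duration_s is not None and duration_s > MAX_VIDEO_DURATION_S:
        problems.append(f"duration {duration_s}s exceeds {MAX_VIDEO_DURATION_S}s")
    return problems


if __name__ == "__main__":
    # A 3 GiB, 5-hour video violates both limits.
    for problem in check_limits(3 * 1024 ** 3, 5 * 3600):
        print(problem)
```

Rejecting oversized inputs before upload avoids spending rate-limited requests on jobs that cannot complete.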
Embedding Models
Feature extractors support multiple embedding types and models for different content formats. You can generate embeddings from text, URLs, files, and base64-encoded content.
Supported Input Types
Text & URLs
- Direct text input
- Web page content
- Remote file URLs
- Field-specific embeddings
Files & Encoded Content
- Local file processing
- Base64 encoded content
- Multi-modal inputs
- Batch processing
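As an illustration of the base64 input path, the following sketch encodes raw bytes with Python's standard `base64` module. The dictionary keys in `build_input` are hypothetical placeholders, not the documented request schema; consult the API reference for the actual field names.

```python
import base64


def encode_bytes(data: bytes) -> str:
    """Base64-encode raw bytes into an ASCII string suitable for JSON."""
    return base64.b64encode(data).decode("ascii")


def build_input(data: bytes, content_type: str) -> dict:
    """Wrap encoded content in a payload; the key names are assumptions."""
    return {
        "type": "base64",               # hypothetical field
        "content_type": content_type,   # hypothetical field
        "value": encode_bytes(data),
    }


if __name__ == "__main__":
    payload = build_input(b"hello", "text/plain")
    print(payload["value"])
```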
Embedding Configuration
Output Formats
When multiple embedding requests use the same model:
- Embeddings are generated in the same vector space
- The final embedding is the element-wise average of the per-input embeddings
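The averaging step can be sketched in a few lines. This is an illustration of element-wise averaging under the assumption that all inputs were embedded with the same model and therefore share a dimension; the function name is hypothetical and the service may apply additional steps (such as normalization) that this page does not describe.

```python
def average_embeddings(vectors: list[list[float]]) -> list[float]:
    """Element-wise mean of equal-length embedding vectors."""
    if not vectors:
        raise ValueError("at least one embedding is required")
    dim = len(vectors[0])
    if any(len(v) != dim for v in vectors):
        raise ValueError("all embeddings must have the same dimension")
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(dim)]


if __name__ == "__main__":
    text_vec = [1.0, 2.0]
    image_vec = [3.0, 4.0]
    print(average_embeddings([text_vec, image_vec]))
```

Averaging is only meaningful because the embeddings live in the same vector space; vectors from different models should not be combined this way.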
For detailed implementation examples, see the Feature Extraction API Reference.