The Instagram integration uses the Business Discovery API to fetch public media from other Instagram Business or Creator accounts. You authenticate with your own account, then monitor any number of public accounts.
Prerequisites
- A Facebook account with a Facebook Page that has a linked Instagram Business or Creator account. This is your “viewer” account — you don’t need to own the accounts you’re monitoring.
- The Facebook/Instagram app must have the instagram_basic, pages_show_list, and business_management permissions.
You don’t need any relationship with the accounts you monitor. Business Discovery works with any public Instagram Business or Creator account.
How It Works
The Instagram integration uses Meta’s Business Discovery API, which allows an authenticated Instagram Business Account to query public profile data and media from other Business or Creator accounts.
Key concepts:
- Viewer Account: Your authenticated Instagram Business Account. This is the account that makes API calls on your behalf.
- Target Accounts: The public Instagram accounts you want to monitor. Each target is configured as a separate sync config with the account’s username as the source_path.
- One Connection, Many Targets: A single OAuth connection provides the viewer account. You can create unlimited sync configs to monitor different target accounts.
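As a rough illustration of the viewer/target relationship, the sketch below builds a Business Discovery request URL: the viewer's Instagram user ID goes in the path, and the target's username goes inside the `business_discovery.username(...)` field expansion. The Graph API version, the helper name `build_business_discovery_url`, the exact media field list, and the use of `.limit(...)` for page size are assumptions for illustration, not a description of how Mixpeek issues the call.

```python
from urllib.parse import urlencode

# Assumption: pin whichever Graph API version your app actually targets.
GRAPH_API_VERSION = "v19.0"

# Media fields matching what the integration reports syncing.
MEDIA_FIELDS = [
    "id", "caption", "media_type", "media_url",
    "permalink", "timestamp", "like_count", "comments_count",
]


def build_business_discovery_url(viewer_ig_user_id, target_username,
                                 access_token, limit=25):
    """Build a Business Discovery URL (hypothetical helper, no network I/O).

    The viewer account authenticates the call; the target account is named
    only by its public username.
    """
    media = "media.limit({}){{{}}}".format(limit, ",".join(MEDIA_FIELDS))
    fields = "business_discovery.username({}){{{}}}".format(target_username, media)
    query = urlencode({"fields": fields, "access_token": access_token})
    return "https://graph.facebook.com/{}/{}?{}".format(
        GRAPH_API_VERSION, viewer_ig_user_id, query
    )
```

Because only the username changes between targets, one token and one viewer ID can serve every monitored account.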
Data Flow
What Gets Synced
For each media item on the target account, Mixpeek captures:

| Field | Description |
|---|---|
| Media Content | The image or video file, downloaded and stored in your bucket |
| Media Type | IMAGE, VIDEO, CAROUSEL_ALBUM, or REEL |
| Caption | The post’s caption text |
| Timestamp | When the post was published |
| Like Count | Number of likes at sync time |
| Comments Count | Number of comments at sync time |
| Permalink | Direct link to the post on Instagram |
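To make the table concrete, here is a minimal sketch of how one media item from a Graph API response might be mapped to these fields. The `SyncedMedia` class and `parse_media_item` helper are hypothetical names, and the input keys assume the standard Graph API media fields; Mixpeek's internal record shape may differ.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class SyncedMedia:
    """Hypothetical record mirroring the fields listed in the table."""
    media_type: str
    caption: Optional[str]
    published_at: datetime
    like_count: Optional[int]
    comments_count: Optional[int]
    permalink: str
    media_url: Optional[str]


def parse_media_item(item: dict) -> SyncedMedia:
    """Convert one Graph API media dict into a SyncedMedia record."""
    return SyncedMedia(
        media_type=item["media_type"],
        caption=item.get("caption"),  # posts may have no caption
        # Graph API timestamps look like "2017-05-19T23:27:28+0000".
        published_at=datetime.strptime(item["timestamp"], "%Y-%m-%dT%H:%M:%S%z"),
        like_count=item.get("like_count"),  # may be absent on some posts
        comments_count=item.get("comments_count"),
        permalink=item["permalink"],
        media_url=item.get("media_url"),
    )
```

Like and comment counts are snapshots taken at sync time, so a later sync of the same post can report different values.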
Setup
Create an Instagram Connection
Navigate to Connections in Mixpeek Studio and click Add Connection. Select Instagram and complete the OAuth flow.
During OAuth, you’ll be asked to grant permissions and select which Facebook Pages to share. Make sure to select a Page that has a linked Instagram Business Account.
Add Sync Configs for Target Accounts
Create a sync config for each Instagram account you want to monitor. The source_path is the target account’s username.
Sync modes:
- initial_only: Fetches all available media once.
- continuous: Periodically checks for new posts and syncs them incrementally.
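The one-connection, many-targets pattern can be sketched as a list of config dictionaries, one per username. Only `source_path` and the two sync mode names come from this page; the `connection_id` and `sync_mode` keys and the `build_sync_configs` helper are hypothetical, so check the Mixpeek API reference for the real request shape.

```python
VALID_SYNC_MODES = {"initial_only", "continuous"}


def build_sync_configs(connection_id, usernames, sync_mode="continuous"):
    """Build one sync config per target account (hypothetical payload shape)."""
    if sync_mode not in VALID_SYNC_MODES:
        raise ValueError("unknown sync mode: {}".format(sync_mode))
    return [
        {
            "connection_id": connection_id,  # hypothetical key: the shared OAuth connection
            "source_path": username,         # the target account's username
            "sync_mode": sync_mode,          # hypothetical key name
        }
        for username in usernames
    ]


configs = build_sync_configs("conn_123", ["brand_a", "brand_b"])
```

Every config reuses the same connection, so adding a target does not require another OAuth flow.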
Process with a Collection
Create a collection with a feature extractor to process the synced media. The multimodal_extractor generates 1408-dimensional embeddings for both images and videos.
Batch processing runs automatically when syncs complete. Documents are created in Qdrant with multimodal embeddings, video segments, thumbnails, and full lineage back to the original bucket object.
Resilience
The Instagram sync provider includes built-in resilience for handling the Facebook Graph API at scale:

| Feature | Behavior |
|---|---|
| Adaptive Page Size | Starts at 25 items per request. Automatically halves on server errors (minimum 5), preventing failures on large accounts. |
| Exponential Backoff | Retries up to 3 times with exponential backoff and jitter on 5xx errors and network timeouts. |
| Rate Limit Handling | Respects 429 Retry-After headers from the Graph API. The Instagram API allows ~200 calls per hour per account. |
| Graph API Error Parsing | Parses 400 error responses from the Graph API. Retries “reduce data” errors with smaller page sizes. Fails fast on non-retryable errors (invalid username, permission denied). |
| Graceful Degradation | If a page of results fails after all retries, items already synced from previous pages are preserved. |
| CDN Download Retries | Media downloads from Instagram’s CDN retry independently with their own backoff logic. |
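The first two rows of the table can be sketched together as a single retry loop. This is a minimal illustration, not the provider's actual code: the `ServerError` class, the `fetch_page_resilient` helper, the one-second base delay, and the choice to halve the page size on every failed attempt are all assumptions. The numbers taken from the table are the starting size of 25, the floor of 5, and the limit of 3 retries.

```python
import random
import time


class ServerError(Exception):
    """Stand-in for a 5xx response or network timeout (hypothetical)."""


def fetch_page_resilient(fetch, page_size=25, min_page_size=5,
                         max_retries=3, base_delay=1.0, sleep=time.sleep):
    """Call fetch(page_size), shrinking the page and backing off on failure.

    Returns (items, page_size_used). Re-raises ServerError once the
    initial attempt and all retries have failed.
    """
    size = page_size
    for attempt in range(max_retries + 1):
        try:
            return fetch(size), size
        except ServerError:
            if attempt == max_retries:
                raise
            # Halve the page size, but never go below the floor.
            size = max(min_page_size, size // 2)
            # Exponential backoff (1x, 2x, 4x base) plus random jitter.
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            sleep(delay)
```

Passing `sleep` as a parameter keeps the function testable without real waiting. A caller that pages through an account would keep the items from earlier successful pages even if this function eventually raises, which is the graceful degradation behavior described in the table.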
Token Lifecycle
Instagram access tokens have a limited lifespan. The integration handles token management automatically:
- Short-lived token (1 hour): Obtained during the OAuth callback.
- Long-lived token (60 days): Exchanged automatically during the OAuth flow.
- Auto-refresh: When a token is within 7 days of expiry, it’s refreshed automatically before each sync execution.
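The auto-refresh rule amounts to a date comparison run before each sync. The sketch below assumes the token's expiry time is stored alongside it; the `needs_refresh` helper and the inclusive treatment of the exact 7-day boundary are illustrative choices, not confirmed behavior.

```python
from datetime import datetime, timedelta, timezone

LONG_LIVED_TTL = timedelta(days=60)  # lifetime of a long-lived token
REFRESH_WINDOW = timedelta(days=7)   # refresh once expiry is this close


def needs_refresh(expires_at, now=None):
    """Return True when the token expires within the refresh window."""
    if now is None:
        now = datetime.now(timezone.utc)
    return expires_at - now <= REFRESH_WINDOW


issued = datetime(2024, 1, 1, tzinfo=timezone.utc)
expires = issued + LONG_LIVED_TTL
print(needs_refresh(expires, now=issued))                       # False
print(needs_refresh(expires, now=issued + timedelta(days=55)))  # True
```

With a 60-day lifetime and a 7-day window, a token first becomes eligible for refresh 53 days after it is issued.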
Limitations
- Business/Creator accounts only: Business Discovery only works with public Instagram Business or Creator accounts. Personal accounts cannot be discovered.
- Public media only: Only publicly visible posts are accessible. Stories, DMs, and private account content are not available.
- Rate limits: The Graph API allows approximately 200 calls per hour per authenticated account. With a default page size of 25, this supports syncing ~5,000 media items per hour.
- No real-time updates: Media is fetched on-demand when a sync is triggered. Use continuous sync mode for periodic polling.
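The throughput figure in the rate limit item follows from simple arithmetic: 200 calls per hour times 25 items per call. The sketch below works that out and estimates sync time for a larger backlog. It assumes one API call per page and no other calls against the quota, so treat the results as an upper bound on throughput; the function names are illustrative.

```python
import math

CALLS_PER_HOUR = 200  # approximate per-account limit stated above


def max_items_per_hour(page_size=25):
    """Upper bound on media items listed per hour at a given page size."""
    return CALLS_PER_HOUR * page_size


def hours_to_sync(total_items, page_size=25):
    """Estimated hours to page through total_items at the rate limit."""
    calls_needed = math.ceil(total_items / page_size)
    return calls_needed / CALLS_PER_HOUR


print(max_items_per_hour())   # 5000
print(max_items_per_hour(5))  # 1000
print(hours_to_sync(12000))   # 2.4
```

If the page size drops to the minimum of 5 after repeated server errors, the same quota covers only about 1,000 items per hour.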
Use Cases
Brand Monitoring
Track competitor visual strategies across Instagram. Analyze creative trends, posting frequency, and content themes using multimodal search.
Influencer Analysis
Build a searchable database of influencer content. Find visually similar posts, track engagement patterns, and identify content themes.
Content Intelligence
Process Instagram media through feature extractors to detect objects, extract text from images, transcribe video audio, and generate semantic embeddings for search.
Creative Benchmarking
Compare visual content across brands. Use multimodal retrieval to find similar creative executions and track how visual trends evolve over time.

