POST /extract
from mixpeek import Mixpeek

mixpeek = Mixpeek("API_KEY")

# Text extraction.
# No modality is passed here: it is optional and is detected automatically when omitted.
extraction = mixpeek.extract(
  input="s3://document.pdf",
  input_type="url"
)

The extract method pulls data out of inputs of various modalities. Depending on the modality, different techniques can be applied:

  • image: optical character recognition, object detection, etc.
  • audio: transcription, diarization, etc.
  • video: object detection, scene detection, etc.
  • text: named entity recognition, tokenization, etc.

Currently only the text modality is available for public use. Contact us to use the beta for image, video and audio.

Request

model_id (string)

Optional. Indicates which model to use for the extraction.

input (string, required)

The URL, raw text, or base64 encoding of the object to extract from.

modality (string)

The type of data to be extracted (e.g., image, audio, video, text).

modality is optional. If you omit it, we'll automatically detect how to process the input. This is useful when extraction is the first step in a pipeline and you don't know in advance which file types you'll need to process.
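
To illustrate the optional fields, here is a minimal sketch of how the arguments for an extract call might be assembled. The helper build_extract_kwargs is hypothetical and not part of the Mixpeek client; it assumes only the parameter names and input_type values listed on this page.

```python
from typing import Optional

# Values taken from the parameter descriptions on this page.
INPUT_TYPES = {"url", "base64", "text"}


def build_extract_kwargs(
    input: str,
    input_type: str,
    modality: Optional[str] = None,
    model_id: Optional[str] = None,
) -> dict:
    """Hypothetical helper: assemble arguments for an extract call.

    Optional fields are left out entirely when they are None, so an
    omitted modality falls back to automatic detection.
    """
    if input_type not in INPUT_TYPES:
        raise ValueError(f"input_type must be one of {sorted(INPUT_TYPES)}")
    kwargs = {"input": input, "input_type": input_type}
    if modality is not None:
        kwargs["modality"] = modality
    if model_id is not None:
        kwargs["model_id"] = model_id
    return kwargs


# Modality omitted: left to automatic detection.
auto = build_extract_kwargs("s3://document.pdf", "url")

# Modality stated explicitly.
explicit = build_extract_kwargs("s3://document.pdf", "url", modality="text")
```

The resulting dictionary could then be unpacked into the call, e.g. mixpeek.extract(**auto), assuming the client accepts modality and model_id as keyword arguments in the same way as input and input_type.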

input_type (string, required)

Specifies whether the input is a URL (url), base64-encoded data (base64), or raw text (text).
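
For the base64 case, a rough sketch of preparing the input with Python's standard library follows. The stand-in bytes and the extract_kwargs dictionary are illustrative assumptions; this page does not state which base64 flavor the API expects, so standard encoding without a data-URI prefix is assumed here.

```python
import base64


def to_base64_input(data: bytes) -> str:
    """Encode raw bytes as an ASCII base64 string (standard alphabet)."""
    return base64.b64encode(data).decode("ascii")


# Stand-in bytes; in practice these would come from open(path, "rb").read().
raw = b"%PDF-1.4 example bytes"
encoded = to_base64_input(raw)

# Hypothetical call arguments. The exact base64 flavor the API expects
# (standard vs. URL-safe, with or without a data-URI prefix) is an assumption.
extract_kwargs = {"input": encoded, "input_type": "base64"}
```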
