
Models

SIE supports extraction models for named entity recognition, relation extraction, text classification, and vision tasks. Model performance varies by task and domain. Run `mise run eval <model> -t <task>` to benchmark a model on your own data.


GLiNER models extract entities with zero-shot label support. Define your own entity types at query time.

| Model | Languages | Max Tokens | F1 (CoNLL2003) |
| --- | --- | --- | --- |
| urchade/gliner_small-v2.1 | English | 16384 | 0.60 |
| urchade/gliner_medium-v2.1 | English | 16384 | 0.61 |
| urchade/gliner_large-v2.1 | English | 16384 | 0.55 |
| urchade/gliner_multi-v2.1 | Multilingual | 16384 | 0.60 |
| Model | Domain | Languages | F1 (CoNLL2003) |
| --- | --- | --- | --- |
| urchade/gliner_multi_pii-v1 | PII detection | Multilingual | 0.54 |
| EmergentMethods/gliner_large_news-v2.1 | News articles | English | 0.55 |
| Ihor/gliner-biomed-large-v1.0 | Biomedical | English | 0.64 |
| Model | Languages | F1 (CoNLL2003) | Notes |
| --- | --- | --- | --- |
| numind/NuNER_Zero | English | 0.61 | Zero-shot; merges adjacent spans |
| numind/NuNER_Zero-span | English | 0.64 | Span extraction |
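
Iterating over a GLiNER result might look like the following. The `entities` key, the field names, and the response payload are assumptions for illustration, not captured output:

```python
# Illustrative response payload; a real call goes through the SIE client,
# with your own entity types passed as labels at query time.
response = {
    "entities": [
        {"text": "Marie Curie", "label": "person", "score": 0.97},
        {"text": "Nobel Prize", "label": "award", "score": 0.91},
        {"text": "Paris", "label": "location", "score": 0.88},
    ]
}

# Keep only confident predictions.
confident = [e for e in response["entities"] if e["score"] >= 0.9]
for entity in confident:
    print(f"{entity['text']} -> {entity['label']}")
```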

GLiREL models extract relationships between entities. Specify relation types at query time.

| Model | Max Tokens | F1 (FewRel) | Notes |
| --- | --- | --- | --- |
| jackboyla/glirel-large-v0 | 16384 | 0.26 | Zero-shot relations |
```python
# Extract relations between the entity types given as labels;
# relation types are specified at query time via output_schema.
result = client.extract(
    "jackboyla/glirel-large-v0",
    Item(text="Tim Cook is the CEO of Apple Inc."),
    labels=["person", "organization"],
    output_schema={"relation_types": ["works_for", "ceo_of", "founded"]},
)
for relation in result["relations"]:
    print(f"{relation['head']} --{relation['relation']}--> {relation['tail']}")
```
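
If the server also returns a confidence score per relation (an assumption about the payload; the sample below is illustrative, not real output), low-confidence edges can be dropped before downstream use:

```python
# Illustrative relation list, as might come back from a GLiREL call.
relations = [
    {"head": "Tim Cook", "relation": "ceo_of", "tail": "Apple Inc.", "score": 0.93},
    {"head": "Tim Cook", "relation": "founded", "tail": "Apple Inc.", "score": 0.12},
]

# Keep only relations above a confidence threshold.
kept = [r for r in relations if r["score"] >= 0.5]
for r in kept:
    print(f"{r['head']} --{r['relation']}--> {r['tail']}")
```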

GLiClass models classify text into arbitrary categories without fine-tuning.

| Model | Max Length | Notes |
| --- | --- | --- |
| knowledgator/gliclass-small-v1.0 | 512 | Faster, smaller |
| knowledgator/gliclass-base-v1.0 | 512 | Higher quality |

DeBERTa models use natural language inference for zero-shot classification.

| Model | Max Length | Notes |
| --- | --- | --- |
| MoritzLaurer/deberta-v3-base-zeroshot-v2.0 | 512 | Balanced |
| MoritzLaurer/deberta-v3-large-zeroshot-v2.0 | 512 | Higher quality |
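
Under the hood, NLI-based zero-shot classification treats the input text as a premise, builds one hypothesis per candidate label, and picks the label whose hypothesis the model finds most entailed. A sketch of the templating step, with the exact template wording as an assumption:

```python
def build_hypotheses(labels, template="This example is about {}."):
    """Build one NLI hypothesis per candidate label."""
    return [template.format(label) for label in labels]

hypotheses = build_hypotheses(["sports", "politics", "technology"])
# The NLI model then scores entailment for each (premise, hypothesis) pair;
# the label with the highest entailment probability wins.
```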

Vision models require the florence2 bundle:

```sh
docker run --gpus all -p 8080:8080 ghcr.io/superlinked/sie:florence2
```

Microsoft Florence-2 models handle multiple vision tasks: OCR, captioning, object detection, and more.

| Model | Tasks | Notes |
| --- | --- | --- |
| microsoft/Florence-2-base | OCR, caption, detection | Base model |
| microsoft/Florence-2-large | OCR, caption, detection | Larger, higher quality |
| microsoft/Florence-2-base-ft | OCR, caption, detection | Fine-tuned variant |
| mynkchaudhry/Florence-2-FT-DocVQA | Document QA | DocVQA fine-tuned |

Florence-2 supports multiple task prompts:

| Task | Instruction | Output |
| --- | --- | --- |
| OCR | `<OCR>` | Extracted text |
| OCR with regions | `<OCR_WITH_REGION>` | Text with bounding boxes |
| Caption | `<CAPTION>` | Image description |
| Detailed caption | `<DETAILED_CAPTION>` | Extended description |
| Object detection | `<OD>` | Bounding boxes and labels |
| Document QA | `<DocVQA>` | Answer to question |
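
The task prompts above are plain instruction strings. A small helper mapping friendly task names to the instructions in the table (the mapping keys and the helper itself are hypothetical conveniences, not part of the SIE API):

```python
# Friendly task names -> Florence-2 instruction strings (from the table above).
FLORENCE2_PROMPTS = {
    "ocr": "<OCR>",
    "ocr_with_regions": "<OCR_WITH_REGION>",
    "caption": "<CAPTION>",
    "detailed_caption": "<DETAILED_CAPTION>",
    "object_detection": "<OD>",
    "document_qa": "<DocVQA>",
}

def task_prompt(task: str) -> str:
    """Look up the Florence-2 instruction string for a task name."""
    try:
        return FLORENCE2_PROMPTS[task]
    except KeyError:
        raise ValueError(f"Unknown Florence-2 task: {task!r}")
```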

Donut models parse structured documents without OCR pre-processing.

| Model | Task | Notes |
| --- | --- | --- |
| naver-clova-ix/donut-base-finetuned-docvqa | Document QA | Question answering |
| naver-clova-ix/donut-base-finetuned-cord-v2 | Receipt parsing | Key-value extraction |
| naver-clova-ix/donut-base-finetuned-rvlcdip | Document classification | Document types |
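
Donut's structured output is nested (fields within fields, repeated line items). A generic sketch for flattening such output into dotted key-value pairs; the sample receipt payload is illustrative, loosely modeled on CORD-style fields, not real model output:

```python
def flatten(node, prefix=""):
    """Flatten nested dicts/lists into a flat dict of dotted keys."""
    items = {}
    if isinstance(node, dict):
        for key, value in node.items():
            items.update(flatten(value, f"{prefix}{key}."))
    elif isinstance(node, list):
        for i, value in enumerate(node):
            items.update(flatten(value, f"{prefix}{i}."))
    else:
        items[prefix.rstrip(".")] = node
    return items

# Illustrative parsed receipt (CORD-like shape).
parsed = {"menu": [{"nm": "Latte", "price": "4.50"}], "total": {"total_price": "4.50"}}
flat = flatten(parsed)
```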

Grounding DINO and OWLv2 models perform zero-shot object detection with text prompts.

| Model | AP (COCO) | P50 Latency | Notes |
| --- | --- | --- | --- |
| IDEA-Research/grounding-dino-tiny | 0.49 | 602 ms | Smaller, faster |
| IDEA-Research/grounding-dino-base | 0.58 | 671 ms | Higher quality |
| google/owlv2-base-patch16-ensemble | 0.52 | 547 ms | OWL-ViT based |
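
The AP (COCO) figures above come from IoU-based matching of predicted boxes against ground truth. For reference when post-processing detector output, a standard intersection-over-union helper for boxes in `(x1, y1, x2, y2)` form:

```python
def iou(a, b):
    """Intersection-over-union of two boxes in (x1, y1, x2, y2) form."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0
```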

Extraction models are grouped into bundles based on dependency compatibility:

| Bundle | Models | Notes |
| --- | --- | --- |
| default | GLiNER, GLiREL, GLiClass, NLI, detection | Standard dependencies |
| florence2 | Florence-2, Donut | Vision dependencies (timm) |

Start with a specific bundle:

```sh
# NER, classification, relations (default bundle)
docker run --gpus all -p 8080:8080 ghcr.io/superlinked/sie:default

# Vision models (Florence-2, Donut)
docker run --gpus all -p 8080:8080 ghcr.io/superlinked/sie:florence2
```