
Models

SIE supports extraction models for named entity recognition, relation extraction, text classification, and vision tasks. Model performance varies by task and domain. Run `mise run eval <model> -t <task>` to benchmark a model on your own data.


GLiNER models extract entities with zero-shot label support. Define your own entity types at query time.

| Model | Languages | Max Tokens | F1 (CoNLL2003) |
| --- | --- | --- | --- |
| urchade/gliner_small-v2.1 | English | 16384 | 0.60 |
| urchade/gliner_medium-v2.1 | English | 16384 | 0.61 |
| urchade/gliner_large-v2.1 | English | 16384 | 0.55 |
| urchade/gliner_multi-v2.1 | Multilingual | 16384 | 0.60 |
| Model | Domain | Languages | F1 (CoNLL2003) |
| --- | --- | --- | --- |
| urchade/gliner_multi_pii-v1 | PII detection | Multilingual | 0.54 |
| EmergentMethods/gliner_large_news-v2.1 | News articles | English | 0.55 |
| Ihor/gliner-biomed-large-v1.0 | Biomedical | English | 0.64 |
| Model | Languages | F1 (CoNLL2003) | Notes |
| --- | --- | --- | --- |
| numind/NuNER_Zero | English | 0.61 | Zero-shot; merges adjacent spans |
| numind/NuNER_Zero-span | English | 0.64 | Span extraction |
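
Iterating over a GLiNER result might look like the following. The `entities` key, the field names, and the response payload are assumptions for illustration, not captured output:

```python
# Illustrative response payload; a real call goes through the SIE client,
# with your own entity types passed as labels at query time.
response = {
    "entities": [
        {"text": "Marie Curie", "label": "person", "score": 0.97},
        {"text": "Nobel Prize", "label": "award", "score": 0.91},
        {"text": "Paris", "label": "location", "score": 0.88},
    ]
}

# Keep only confident predictions.
confident = [e for e in response["entities"] if e["score"] >= 0.9]
for entity in confident:
    print(f"{entity['text']} -> {entity['label']}")
```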

GLiREL models extract relationships between entities. Specify relation types at query time.

| Model | Max Tokens | F1 (FewRel) | Notes |
| --- | --- | --- | --- |
| jackboyla/glirel-large-v0 | 16384 | 0.26 | Zero-shot relations |
```python
# Extract relations between the entity types given as labels;
# relation types are specified at query time via output_schema.
result = client.extract(
    "jackboyla/glirel-large-v0",
    Item(text="Tim Cook is the CEO of Apple Inc."),
    labels=["person", "organization"],
    output_schema={"relation_types": ["works_for", "ceo_of", "founded"]},
)
for relation in result["relations"]:
    print(f"{relation['head']} --{relation['relation']}--> {relation['tail']}")
```
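
If the server also returns a confidence score per relation (an assumption about the payload; the sample below is illustrative, not real output), low-confidence edges can be dropped before downstream use:

```python
# Illustrative relation list, as might come back from a GLiREL call.
relations = [
    {"head": "Tim Cook", "relation": "ceo_of", "tail": "Apple Inc.", "score": 0.93},
    {"head": "Tim Cook", "relation": "founded", "tail": "Apple Inc.", "score": 0.12},
]

# Keep only relations above a confidence threshold.
kept = [r for r in relations if r["score"] >= 0.5]
for r in kept:
    print(f"{r['head']} --{r['relation']}--> {r['tail']}")
```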

GLiClass models classify text into arbitrary categories without fine-tuning.

| Model | Max Length | Notes |
| --- | --- | --- |
| knowledgator/gliclass-small-v1.0 | 512 | Faster, smaller |
| knowledgator/gliclass-base-v1.0 | 512 | Higher quality |

DeBERTa models use natural language inference for zero-shot classification.

| Model | Max Length | Notes |
| --- | --- | --- |
| MoritzLaurer/deberta-v3-base-zeroshot-v2.0 | 512 | Balanced |
| MoritzLaurer/deberta-v3-large-zeroshot-v2.0 | 512 | Higher quality |
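
Under the hood, NLI-based zero-shot classification treats the input text as a premise, builds one hypothesis per candidate label, and picks the label whose hypothesis the model finds most entailed. A sketch of the templating step, with the exact template wording as an assumption:

```python
def build_hypotheses(labels, template="This example is about {}."):
    """Build one NLI hypothesis per candidate label."""
    return [template.format(label) for label in labels]

hypotheses = build_hypotheses(["sports", "politics", "technology"])
# The NLI model then scores entailment for each (premise, hypothesis) pair;
# the label with the highest entailment probability wins.
```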

Vision models require the florence2 bundle:

```sh
docker run --gpus all -p 8080:8080 ghcr.io/superlinked/sie:florence2
```

Microsoft Florence-2 models handle multiple vision tasks: OCR, captioning, object detection, and more.

| Model | Tasks | Notes |
| --- | --- | --- |
| microsoft/Florence-2-base | OCR, caption, detection | Base model |
| microsoft/Florence-2-large | OCR, caption, detection | Larger, higher quality |
| microsoft/Florence-2-base-ft | OCR, caption, detection | Fine-tuned variant |
| mynkchaudhry/Florence-2-FT-DocVQA | Document QA | DocVQA fine-tuned |

Florence-2 supports multiple task prompts:

| Task | Instruction | Output |
| --- | --- | --- |
| OCR | `<OCR>` | Extracted text |
| OCR with regions | `<OCR_WITH_REGION>` | Text with bounding boxes |
| Caption | `<CAPTION>` | Image description |
| Detailed caption | `<DETAILED_CAPTION>` | Extended description |
| Object detection | `<OD>` | Bounding boxes and labels |
| Document QA | `<DocVQA>` | Answer to question |
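
The task prompts above are plain instruction strings. A small helper mapping friendly task names to the instructions in the table (the mapping keys and the helper itself are hypothetical conveniences, not part of the SIE API):

```python
# Friendly task names -> Florence-2 instruction strings (from the table above).
FLORENCE2_PROMPTS = {
    "ocr": "<OCR>",
    "ocr_with_regions": "<OCR_WITH_REGION>",
    "caption": "<CAPTION>",
    "detailed_caption": "<DETAILED_CAPTION>",
    "object_detection": "<OD>",
    "document_qa": "<DocVQA>",
}

def task_prompt(task: str) -> str:
    """Look up the Florence-2 instruction string for a task name."""
    try:
        return FLORENCE2_PROMPTS[task]
    except KeyError:
        raise ValueError(f"Unknown Florence-2 task: {task!r}")
```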

Donut models parse structured documents without OCR pre-processing.

| Model | Task | Notes |
| --- | --- | --- |
| naver-clova-ix/donut-base-finetuned-docvqa | Document QA | Question answering |
| naver-clova-ix/donut-base-finetuned-cord-v2 | Receipt parsing | Key-value extraction |
| naver-clova-ix/donut-base-finetuned-rvlcdip | Document classification | Document types |
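
Donut's structured output is nested (fields within fields, repeated line items). A generic sketch for flattening such output into dotted key-value pairs; the sample receipt payload is illustrative, loosely modeled on CORD-style fields, not real model output:

```python
def flatten(node, prefix=""):
    """Flatten nested dicts/lists into a flat dict of dotted keys."""
    items = {}
    if isinstance(node, dict):
        for key, value in node.items():
            items.update(flatten(value, f"{prefix}{key}."))
    elif isinstance(node, list):
        for i, value in enumerate(node):
            items.update(flatten(value, f"{prefix}{i}."))
    else:
        items[prefix.rstrip(".")] = node
    return items

# Illustrative parsed receipt (CORD-like shape).
parsed = {"menu": [{"nm": "Latte", "price": "4.50"}], "total": {"total_price": "4.50"}}
flat = flatten(parsed)
```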

Grounding DINO and OWLv2 models perform zero-shot object detection with text prompts.

| Model | AP (COCO) | P50 Latency | Notes |
| --- | --- | --- | --- |
| IDEA-Research/grounding-dino-tiny | 0.49 | 602 ms | Smaller, faster |
| IDEA-Research/grounding-dino-base | 0.58 | 671 ms | Higher quality |
| google/owlv2-base-patch16-ensemble | 0.52 | 547 ms | OWL-ViT based |
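
The AP (COCO) figures above come from IoU-based matching of predicted boxes against ground truth. For reference when post-processing detector output, a standard intersection-over-union helper for boxes in `(x1, y1, x2, y2)` form:

```python
def iou(a, b):
    """Intersection-over-union of two boxes in (x1, y1, x2, y2) form."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0
```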

Extraction models are grouped into bundles based on dependency compatibility:

| Bundle | Models | Notes |
| --- | --- | --- |
| default | GLiNER, GLiREL, GLiClass, NLI, detection | Standard dependencies |
| florence2 | Florence-2, Donut | Vision dependencies (timm) |

Start with a specific bundle:

```sh
# NER, classification, relations (default bundle)
docker run --gpus all -p 8080:8080 ghcr.io/superlinked/sie:default

# Vision models (Florence-2, Donut)
docker run --gpus all -p 8080:8080 ghcr.io/superlinked/sie:florence2
```