Models
SIE supports extraction models for named entity recognition, relation extraction, text classification, and vision tasks. Model performance varies by task and domain. Run `mise run eval <model> -t <task>` to benchmark on your data.
NER Models (GLiNER)
GLiNER models extract entities with zero-shot label support. Define your own entity types at query time.
General Purpose
| Model | Languages | Max Tokens | F1 (CoNLL2003) |
|---|---|---|---|
| urchade/gliner_small-v2.1 | English | 16384 | 0.60 |
| urchade/gliner_medium-v2.1 | English | 16384 | 0.61 |
| urchade/gliner_large-v2.1 | English | 16384 | 0.55 |
| urchade/gliner_multi-v2.1 | Multilingual | 16384 | 0.60 |
Domain-Specific
| Model | Domain | Languages | F1 (CoNLL2003) |
|---|---|---|---|
| urchade/gliner_multi_pii-v1 | PII detection | Multilingual | 0.54 |
| EmergentMethods/gliner_large_news-v2.1 | News articles | English | 0.55 |
| Ihor/gliner-biomed-large-v1.0 | Biomedical | English | 0.64 |
NuNER Models
| Model | Languages | F1 (CoNLL2003) | Notes |
|---|---|---|---|
| numind/NuNER_Zero | English | 0.61 | Zero-shot, merges adjacent entities |
| numind/NuNER_Zero-span | English | 0.64 | Span extraction |
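The NER models above are called with the same `client.extract` pattern used elsewhere in these docs. The sketch below assumes the Python client and that entities come back under an `entities` key with `text` and `label` fields (mirroring the TypeScript example in the next section); your client version may differ.

```python
# Minimal NER sketch: pass your own entity labels at query time.
result = client.extract(
    "urchade/gliner_medium-v2.1",
    Item(text="Barack Obama visited Microsoft headquarters in Redmond."),
    labels=["person", "organization", "location"],
)

# Assumed result shape: a list of entities with text and label fields.
for entity in result["entities"]:
    print(f"{entity['label']}: {entity['text']}")
```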
Relation Extraction (GLiREL)
GLiREL models extract relationships between entities. Specify relation types at query time.
| Model | Max Tokens | F1 (FewRel) | Notes |
|---|---|---|---|
| jackboyla/glirel-large-v0 | 16384 | 0.26 | Zero-shot relations |
```python
result = client.extract(
    "jackboyla/glirel-large-v0",
    Item(text="Tim Cook is the CEO of Apple Inc."),
    labels=["person", "organization"],
    output_schema={"relation_types": ["works_for", "ceo_of", "founded"]},
)

for relation in result["relations"]:
    print(f"{relation['head']} --{relation['relation']}--> {relation['tail']}")
```

```typescript
const result = await client.extract(
  "jackboyla/glirel-large-v0",
  { text: "Tim Cook is the CEO of Apple Inc." },
  { labels: ["person", "organization"] }
);

for (const entity of result.entities) {
  console.log(`${entity.label}: ${entity.text}`);
}
```

Classification Models
GLiClass (Zero-Shot)
GLiClass models classify text into arbitrary categories without fine-tuning.
| Model | Max Length | Notes |
|---|---|---|
| knowledgator/gliclass-small-v1.0 | 512 | Faster, smaller |
| knowledgator/gliclass-base-v1.0 | 512 | Higher quality |
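As a sketch of zero-shot classification, the example below reuses the `client.extract` call with candidate categories passed as labels. The `classifications` key and its `label`/`score` fields are assumptions, not confirmed API.

```python
# Zero-shot classification sketch: categories are supplied as labels.
result = client.extract(
    "knowledgator/gliclass-base-v1.0",
    Item(text="The new GPU delivers twice the throughput at half the power."),
    labels=["technology", "sports", "finance", "politics"],
)

# Assumed result shape: label/score pairs.
for prediction in result["classifications"]:
    print(f"{prediction['label']}: {prediction['score']:.2f}")
```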
NLI-Based Classification
DeBERTa models use natural language inference for zero-shot classification.
| Model | Max Length | Notes |
|---|---|---|
| MoritzLaurer/deberta-v3-base-zeroshot-v2.0 | 512 | Balanced |
| MoritzLaurer/deberta-v3-large-zeroshot-v2.0 | 512 | Higher quality |
Vision Models
Vision models require the florence2 bundle:
```bash
docker run --gpus all -p 8080:8080 ghcr.io/superlinked/sie:florence2
```

Florence-2
Microsoft Florence-2 models handle multiple vision tasks: OCR, captioning, object detection, and more.
| Model | Tasks | Notes |
|---|---|---|
| microsoft/Florence-2-base | OCR, caption, detection | Base model |
| microsoft/Florence-2-large | OCR, caption, detection | Larger, higher quality |
| microsoft/Florence-2-base-ft | OCR, caption, detection | Fine-tuned variant |
| mynkchaudhry/Florence-2-FT-DocVQA | Document QA | DocVQA fine-tuned |
Florence-2 supports multiple task prompts:
| Task | Instruction | Output |
|---|---|---|
| OCR | <OCR> | Extracted text |
| OCR with regions | <OCR_WITH_REGION> | Text with bounding boxes |
| Caption | <CAPTION> | Image description |
| Detailed caption | <DETAILED_CAPTION> | Extended description |
| Object detection | <OD> | Bounding boxes and labels |
| Document QA | <DocVQA> | Answer to question |
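For illustration only: the sketch below assumes an image can be passed to `Item` via a hypothetical `image_url` field and that the Florence-2 task prompt is supplied through `output_schema`; check the extraction guide for the actual request shape.

```python
# Florence-2 OCR sketch. The image_url field and the way the task prompt
# is passed (via output_schema here) are hypothetical; consult the usage
# guide for the actual parameters.
result = client.extract(
    "microsoft/Florence-2-base",
    Item(image_url="https://example.com/invoice.png"),  # hypothetical field
    output_schema={"task": "<OCR>"},                     # hypothetical parameter
)

print(result)  # extracted text for the <OCR> task
```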
Donut
Donut models parse structured documents without OCR pre-processing.
| Model | Task | Notes |
|---|---|---|
| naver-clova-ix/donut-base-finetuned-docvqa | Document QA | Question answering |
| naver-clova-ix/donut-base-finetuned-cord-v2 | Receipt parsing | Key-value extraction |
| naver-clova-ix/donut-base-finetuned-rvlcdip | Document classification | Document types |
Object Detection
Zero-shot object detection with text prompts.
| Model | AP (COCO) | P50 Latency | Notes |
|---|---|---|---|
| IDEA-Research/grounding-dino-tiny | 0.49 | 602ms | Smaller, faster |
| IDEA-Research/grounding-dino-base | 0.58 | 671ms | Higher quality |
| google/owlv2-base-patch16-ensemble | 0.52 | 547ms | OWL-ViT based |
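A sketch of zero-shot detection where the text prompts are passed as labels, as in the other extract calls; the `image_url` field and the `detections` result shape are assumptions.

```python
# Zero-shot object detection sketch: text prompts become detection labels.
result = client.extract(
    "IDEA-Research/grounding-dino-tiny",
    Item(image_url="https://example.com/street.jpg"),  # hypothetical field
    labels=["car", "bicycle", "traffic light"],
)

# Assumed result shape: detections with a label, score, and bounding box.
for detection in result["detections"]:
    print(detection["label"], detection["score"], detection["box"])
```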
Bundle Compatibility
Extraction models are grouped into bundles based on dependency compatibility:
| Bundle | Models | Notes |
|---|---|---|
| default | GLiNER, GLiREL, GLiClass, NLI, detection | Standard dependencies |
| florence2 | Florence-2, Donut | Vision dependencies (timm) |
Start with a specific bundle:
```bash
# NER, classification, relations (default bundle)
docker run --gpus all -p 8080:8080 ghcr.io/superlinked/sie:default

# Vision models (Florence-2, Donut)
docker run --gpus all -p 8080:8080 ghcr.io/superlinked/sie:florence2
```

What’s Next
- Extract and Tag Data - usage guide with examples
- Evals - benchmark models on your tasks