
Bundles

Python ML libraries often have conflicting dependency requirements, and models using trust_remote_code=True may pin specific transformers versions. SIE solves this with bundles: each bundle is a self-contained environment with mutually compatible dependencies.

For example:

  • sentence-transformers requires transformers>=4.57
  • GritLM/GritLM-7B requires transformers<4.54
  • These cannot coexist in the same environment
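The conflict above can be made concrete with a small sketch (plain Python, not part of SIE): no single transformers release can satisfy both pins, because the two version ranges do not overlap.

```python
# Illustrative sketch: show that transformers>=4.57 and transformers<4.54
# have no common solution. Versions are compared as (major, minor) tuples,
# which ignores patch releases but is enough to expose the conflict.

def parse(version: str) -> tuple[int, int]:
    """Parse 'MAJOR.MINOR' into a comparable tuple."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)

def satisfies_both(candidate: str) -> bool:
    # sentence-transformers: transformers>=4.57
    # GritLM/GritLM-7B:      transformers<4.54
    v = parse(candidate)
    return v >= parse("4.57") and v < parse("4.54")

# No candidate passes: the lower bound of one range sits above
# the upper bound of the other.
assert not any(satisfies_both(v) for v in ["4.53", "4.54", "4.57", "5.0"])
```

Because pip must resolve a single transformers version per environment, the only way to serve both model families is to put them in separate environments, which is exactly what bundles do.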

Bundles group models with compatible dependencies into separate Docker images.


Bundle      Purpose                           Key Models
default     Standard models                   BGE-M3, E5, Qwen3, GLiNER, ColBERT
legacy      Older transformers (before 4.56)  Stella, GritLM-7B
gte-qwen2   Alibaba GTE models                gte-Qwen2-1.5B, gte-Qwen2-7B
sglang      Large LLM embeddings              Qwen3-4B+, E5-Mistral-7B, NV-Embed
florence2   Vision-language models            Florence-2, Donut

The default bundle includes most models using transformers>=4.57. This is the recommended starting point.

Included models:

  • Dense: BAAI/bge-m3, intfloat/e5-*, Alibaba-NLP/gte-multilingual-base
  • Qwen3: Qwen/Qwen3-Embedding-0.6B, Qwen/Qwen3-Embedding-4B
  • 7B models: intfloat/e5-mistral-7b-instruct, Salesforce/SFR-Embedding-*
  • NVIDIA: nvidia/NV-Embed-v2, nvidia/llama-embed-nemotron-8b
  • Sparse: OpenSearch neural sparse, SPLADE variants, Granite sparse
  • ColBERT: jinaai/jina-colbert-v2, answerdotai/answerai-colbert-small-v1
  • NER: GLiNER models, GLiREL relation extraction

The legacy bundle contains models that require transformers versions before 4.56. Use it when you need Stella or GritLM.

Included models:

  • dunzhang/stella_en_1.5B_v5
  • dunzhang/stella_en_400M_v5
  • GritLM/GritLM-7B

The gte-qwen2 bundle contains Alibaba GTE-Qwen2 models, which have DynamicCache API compatibility requirements.

Included models:

  • Alibaba-NLP/gte-Qwen2-1.5B-instruct
  • Alibaba-NLP/gte-Qwen2-7B-instruct

The sglang bundle serves large LLM embedding models (4B+ parameters) through the SGLang backend for memory efficiency.

Included models:

  • Qwen/Qwen3-Embedding-4B, Qwen/Qwen3-Embedding-8B
  • Alibaba-NLP/gte-Qwen2-7B-instruct
  • intfloat/e5-mistral-7b-instruct
  • Linq-AI-Research/Linq-Embed-Mistral
  • Salesforce/SFR-Embedding-Mistral, Salesforce/SFR-Embedding-2_R
  • nvidia/llama-embed-nemotron-8b

The florence2 bundle covers Microsoft Florence-2 and Donut vision-language models. These models require timm for the DaViT vision encoder.

Included models:

  • microsoft/Florence-2-base, microsoft/Florence-2-large
  • microsoft/Florence-2-base-ft
  • mynkchaudhry/Florence-2-FT-DocVQA
  • naver-clova-ix/donut-base-finetuned-cord-v2 (receipt parsing)
  • naver-clova-ix/donut-base-finetuned-docvqa (document QA)
  • naver-clova-ix/donut-base-finetuned-rvlcdip (document classification)

Each bundle is published as a separate Docker image with a corresponding tag, one image per bundle.

# Default bundle (recommended)
docker run -p 8080:8080 ghcr.io/superlinked/sie:default
# With GPU
docker run --gpus all -p 8080:8080 ghcr.io/superlinked/sie:default
# Legacy bundle for Stella/GritLM
docker run --gpus all -p 8080:8080 ghcr.io/superlinked/sie:legacy
# GTE-Qwen2 bundle
docker run --gpus all -p 8080:8080 ghcr.io/superlinked/sie:gte-qwen2
# SGLang bundle for large LLM models
docker run --gpus all -p 8080:8080 ghcr.io/superlinked/sie:sglang
# Florence-2 bundle for vision models
docker run --gpus all -p 8080:8080 ghcr.io/superlinked/sie:florence2

Choose a bundle based on the models you need:

  1. Start with default - covers most use cases with 60+ models
  2. Use legacy if you need Stella or GritLM-7B
  3. Use gte-qwen2 for Alibaba GTE-Qwen2 instruction models
  4. Use sglang for memory-efficient large LLM embeddings
  5. Use florence2 for document understanding and OCR
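The decision steps above can be sketched as a simple lookup (a hypothetical helper, not part of SIE; the mapping is abridged, and some models such as e5-mistral-7b-instruct are available in more than one bundle):

```python
# Hypothetical bundle picker: map a model ID to the first specialized
# bundle that lists it, falling back to the default bundle.
BUNDLE_MODELS = {
    "legacy": {
        "dunzhang/stella_en_1.5B_v5",
        "dunzhang/stella_en_400M_v5",
        "GritLM/GritLM-7B",
    },
    "gte-qwen2": {
        "Alibaba-NLP/gte-Qwen2-1.5B-instruct",
        "Alibaba-NLP/gte-Qwen2-7B-instruct",
    },
    "sglang": {
        "Qwen/Qwen3-Embedding-4B",
        "intfloat/e5-mistral-7b-instruct",
        "nvidia/llama-embed-nemotron-8b",
    },
    "florence2": {
        "microsoft/Florence-2-base",
        "naver-clova-ix/donut-base-finetuned-docvqa",
    },
}

def pick_bundle(model_id: str) -> str:
    """Return the first specialized bundle listing the model, else 'default'."""
    for bundle, models in BUNDLE_MODELS.items():
        if model_id in models:
            return bundle
    return "default"

print(pick_bundle("GritLM/GritLM-7B"))  # legacy
print(pick_bundle("BAAI/bge-m3"))       # default
```

In practice the choice is made once, when you pick the image tag for docker run; a model that appears in several bundles (such as the large embedding models in both default and sglang) runs in whichever image you start.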

Models are loaded on first request. The bundle only determines which models are available.
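A minimal client sketch of that first request, using only the standard library. The /v1/embeddings path and payload shape are assumptions here (an OpenAI-style API); check the SIE API reference for the actual route before using this:

```python
import json
import urllib.request

def build_embed_request(base_url: str, model: str,
                        texts: list[str]) -> urllib.request.Request:
    """Build (but do not send) an embedding request for a running SIE server.

    The endpoint path and JSON body are assumed, not confirmed by SIE docs.
    """
    payload = json.dumps({"model": model, "input": texts}).encode()
    return urllib.request.Request(
        f"{base_url}/v1/embeddings",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_embed_request("http://localhost:8080", "BAAI/bge-m3", ["hello"])
# urllib.request.urlopen(req)  # the first call triggers the model load,
#                              # so expect it to be slower than later calls
```

Subsequent requests for the same model reuse the loaded weights, so only the first request pays the loading cost.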