Model Profiles

Profiles are named bundles of runtime options. Instead of passing the same options repeatedly, define a profile once and reference it by name.

Quick Example

from sie_sdk import SIEClient
from sie_sdk.types import Item

client = SIEClient("http://localhost:8080")

# Use the "sparse" profile
result = client.encode(
    "BAAI/bge-m3",
    Item(text="machine learning"),
    options={"profile": "sparse"}
)

# Returns sparse embeddings only
sparse = result["sparse"]
print(f"Non-zero tokens: {len(sparse['indices'])}")

Built-in Profiles

Models can define multiple profiles. Common patterns include:

Profile	Purpose	Typical Settings
`default`	Standard behavior	Model’s default output types
`sparse`	Lexical search	`output_types: [sparse]`
`muvera`	ColBERT-style retrieval via dense vectors	`muvera: {}`, `output_types: [dense]`

BGE-M3 includes `sparse`, `banking`, and `medical-vn` profiles. ColBERT models include `muvera` profiles for MUVERA-based retrieval.

Using Profiles

Pass the profile name in the options parameter:

from sie_sdk import SIEClient
from sie_sdk.types import Item

client = SIEClient("http://localhost:8080")

# Sparse-only embedding
result = client.encode(
    "BAAI/bge-m3",
    Item(text="search query"),
    options={"profile": "sparse"}
)

# Domain-specific LoRA with custom instruction
result = client.encode(
    "BAAI/bge-m3",
    Item(text="transfer funds"),
    is_query=True,
    options={"profile": "banking"},
)

The HTTP API uses the same options field:

curl -X POST http://localhost:8080/v1/encode/BAAI/bge-m3 \
  -H "Content-Type: application/json" \
  -H "Accept: application/json" \
  -d '{"items": [{"text": "search query"}], "params": {"options": {"profile": "sparse"}}}'
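
For callers that cannot use the SDK, the same request can be built with Python's standard library. The sketch below only mirrors the curl command above: the helper name `build_encode_request` is hypothetical, and the response shape is left unparsed because it is not documented in this section.

```python
import json
import urllib.request


def build_encode_request(base_url: str, model: str, text: str, profile: str) -> urllib.request.Request:
    """Build (but do not send) a POST request equivalent to the curl example."""
    body = {
        "items": [{"text": text}],
        "params": {"options": {"profile": profile}},
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/encode/{model}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )


request = build_encode_request("http://localhost:8080", "BAAI/bge-m3", "search query", "sparse")

# To send it against a running server:
#   with urllib.request.urlopen(request) as response:
#       payload = json.load(response)
```

Building the request separately from sending it makes the payload easy to inspect or log before it reaches the server.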

Profile Fields

Profiles support these fields:

Field	Type	Description
`is_default`	bool	Marks the default profile (one per model)
`output_types`	list	Output types to return (dense, sparse, multivector)
`output_similarity`	dict	Similarity function per output type (cosine, dot)
`instruction`	string	Instruction prefix for queries
`lora`	string	LoRA adapter path (HuggingFace ID or local path)

Profiles can also include any adapter-specific runtime options. For example, MUVERA profiles include `muvera: {}` to enable the MUVERA postprocessor.

Defining Custom Profiles

Define profiles in the model’s config YAML file:

name: BAAI/bge-m3
hf_id: BAAI/bge-m3
adapter: sie_server.adapters.bge_m3_flash:BGEM3FlashAdapter

profiles:
  default:
    is_default: true
  sparse:
    output_types:
      - sparse
  banking:
    lora: saivamshiatukuri/bge-m3-banking77-lora
    instruction: "Classify banking intent"
  legal:
    instruction: "Given a legal query, retrieve relevant case law"
    lora: org/bge-m3-legal-lora

Each model must have exactly one profile with `is_default: true`.
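
The one-default rule can be checked before a config is deployed. The following is a minimal sketch, not the server's own validation: the function name `find_default_profile` is hypothetical, and it assumes the `profiles` mapping has already been loaded from the YAML file into a plain dict.

```python
def find_default_profile(profiles: dict) -> str:
    """Return the name of the single default profile, or raise ValueError."""
    defaults = [
        name
        for name, fields in profiles.items()
        # A profile with no fields may load as None, so guard with `or {}`.
        if (fields or {}).get("is_default") is True
    ]
    if len(defaults) != 1:
        raise ValueError(
            f"expected exactly one profile with is_default: true, "
            f"found {len(defaults)}: {defaults}"
        )
    return defaults[0]


# The same profiles as the YAML example above, as a Python dict.
profiles = {
    "default": {"is_default": True},
    "sparse": {"output_types": ["sparse"]},
    "banking": {
        "lora": "saivamshiatukuri/bge-m3-banking77-lora",
        "instruction": "Classify banking intent",
    },
    "legal": {
        "instruction": "Given a legal query, retrieve relevant case law",
        "lora": "org/bge-m3-legal-lora",
    },
}

default_name = find_default_profile(profiles)
```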

Profile Resolution

Runtime options are resolved in this order (later overrides earlier):

1. Defaults - `adapter_options_runtime` from the model config
2. Profile - options from the selected profile
3. Request - options passed in the request

# Request-level options override profile settings
result = client.encode(
    "BAAI/bge-m3",
    Item(text="query"),
    options={
        "profile": "sparse",
        "is_query": True  # Overrides any profile setting
    }
)

If no profile is specified, the default profile is used. If a profile name is invalid, the server returns an error listing available profiles.
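
The resolution order described above can be sketched as a layered dictionary merge. This is an illustration under stated assumptions, not the server's implementation: the function name `resolve_options` is hypothetical, the merge is assumed to be shallow (a later layer replaces a key's whole value), and the `model_defaults` values are made up for the example.

```python
def resolve_options(model_defaults: dict, profiles: dict, request_options: dict) -> dict:
    """Merge runtime options: model defaults, then profile, then request."""
    request_options = dict(request_options)  # copy so the caller's dict is untouched
    profile_name = request_options.pop("profile", None)

    if profile_name is None:
        # No profile requested: fall back to the one marked is_default.
        profile_name = next(
            name for name, fields in profiles.items() if fields.get("is_default")
        )
    elif profile_name not in profiles:
        raise ValueError(
            f"unknown profile {profile_name!r}; available profiles: {sorted(profiles)}"
        )

    # is_default is metadata about the profile, not a runtime option.
    profile_options = {
        key: value for key, value in profiles[profile_name].items() if key != "is_default"
    }
    # Later layers override earlier ones.
    return {**model_defaults, **profile_options, **request_options}


model_defaults = {"output_types": ["dense"], "is_query": False}  # assumed values
profiles = {
    "default": {"is_default": True},
    "sparse": {"output_types": ["sparse"]},
}

resolved = resolve_options(model_defaults, profiles, {"profile": "sparse", "is_query": True})
print(resolved)  # {'output_types': ['sparse'], 'is_query': True}
```

Here `output_types` comes from the profile and `is_query` from the request, while an empty request would return the model defaults unchanged.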

Listing Available Profiles

Query the models endpoint to see available profiles:

curl http://localhost:8080/v1/models

The response includes profile information for each model.

What’s Next

Sparse embeddings - when to use sparse output
Multi-vector embeddings - MUVERA profile for ColBERT models