SAM Family

This page documents the RankSEG integration path for SAM-family outputs from Hugging Face Transformers.

SAM-family outputs use explicit adapter classes instead of the standard Transformers semantic-segmentation helper.

Why SAM has its own adapters

SAM outputs are not plain semantic-segmentation logits. Before RankSEG is called, SAM masks must be restored from model space back to image space using family-specific geometry:

SAM processor -> SAM model outputs -> restore mask probabilities
-> RankSEG -> prompt, instance, or semantic masks

The adapters in rankseg.integration.sam keep this restoration step explicit. RankSEG replaces only the final binary mask step, after the masks have been resized and converted to probabilities.

from rankseg.integration import sam

Adapter map

| Adapter | Input family | Main prediction method | Probability-only method |
|---|---|---|---|
| sam.Sam1 | SAM1 and SAM-HQ prompt masks | postprocess(...) | restore_mask_probs(...) |
| sam.Sam2 | SAM2 prompt masks | postprocess(...) | restore_mask_probs(...) |
| sam.Sam3 | SAM3 instance masks | postprocess_instance(...) | restore_instance_mask_probs(...) |
| sam.Sam3 | SAM3 semantic masks | postprocess_semantic(...) | restore_semantic_mask_probs(...) |

SAM1 prompt masks

adapter = sam.Sam1(rankseg_kwargs={"metric": "dice"})
preds = adapter.postprocess(
    outputs,
    original_sizes=inputs["original_sizes"],
    reshaped_input_sizes=inputs["reshaped_input_sizes"],
)

original_sizes and reshaped_input_sizes should come from the SAM processor inputs. The adapter removes padding, resizes masks back to the original image size, applies sigmoid, and then calls RankSEG.
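
The three restoration steps named above (remove padding, resize, sigmoid) can be sketched in plain Python. This is only a conceptual illustration, not the adapter's implementation: the helper names are hypothetical, the masks are nested lists rather than tensors, and nearest-neighbour resizing stands in for whatever interpolation the adapter actually uses.

```python
import math


def crop_padding(mask, valid_h, valid_w):
    """Keep only the top-left region that holds real image content."""
    return [row[:valid_w] for row in mask[:valid_h]]


def resize_nearest(mask, out_h, out_w):
    """Nearest-neighbour resize (a simplification chosen for this sketch)."""
    in_h, in_w = len(mask), len(mask[0])
    return [
        [mask[(y * in_h) // out_h][(x * in_w) // out_w] for x in range(out_w)]
        for y in range(out_h)
    ]


def restore_probs(logits, reshaped_size, original_size):
    """Hypothetical helper: crop padding, resize, then apply sigmoid."""
    cropped = crop_padding(logits, *reshaped_size)
    resized = resize_nearest(cropped, *original_size)
    return [[1.0 / (1.0 + math.exp(-v)) for v in row] for row in resized]


# 4x4 logits in model space; the last row and column are padding.
logits = [
    [2.0, 2.0, -2.0, 0.0],
    [2.0, 2.0, -2.0, 0.0],
    [-2.0, -2.0, -2.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
probs = restore_probs(logits, reshaped_size=(3, 3), original_size=(6, 6))
```

The result is a 6x6 probability map in image space, which is the kind of input RankSEG receives in place of a fixed threshold.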

SAM2 prompt masks

adapter = sam.Sam2(
    rankseg_kwargs={"metric": "dice"},
    apply_non_overlapping_constraints=False,
)
preds = adapter.postprocess(
    outputs,
    original_sizes=inputs["original_sizes"],
)

Set apply_non_overlapping_constraints=True when you want lower-scoring overlapping SAM2 masks to be suppressed before converting logits to probabilities.
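
One way to picture this suppression is a per-pixel winner-takes-all rule applied to the logits. The sketch below assumes that rule. The function name and the -10.0 floor are illustrative choices, not values confirmed for this adapter.

```python
def suppress_overlaps(logits, floor=-10.0):
    """Per pixel, keep the highest logit and push all others down to `floor`.

    `logits` is a list of masks, each an H x W nested list of floats.
    Returns a new structure; the input is left unchanged.
    """
    num_masks = len(logits)
    height, width = len(logits[0]), len(logits[0][0])
    out = [[row[:] for row in mask] for mask in logits]
    for y in range(height):
        for x in range(width):
            # Index of the mask with the largest logit at this pixel.
            winner = max(range(num_masks), key=lambda i: logits[i][y][x])
            for i in range(num_masks):
                if i != winner:
                    out[i][y][x] = min(out[i][y][x], floor)
    return out


mask_a = [[3.0, 1.0]]
mask_b = [[2.0, 4.0]]
constrained = suppress_overlaps([mask_a, mask_b])
```

After sigmoid, a logit at the floor maps to a probability near zero, so each pixel is effectively claimed by at most one mask.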

SAM3 instance masks

adapter = sam.Sam3(rankseg_kwargs={"metric": "dice"}, threshold=0.3)
results = adapter.postprocess_instance(
    outputs,
    target_sizes=target_sizes,
)

The instance method returns one dictionary per image with scores, boxes, and masks. threshold filters low-confidence instances before RankSEG is applied to the remaining mask probabilities.
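
The filtering step can be illustrated with a small stand-alone function. This is a sketch under assumptions: `filter_instances` is a hypothetical name, the inputs are plain lists, and the strict `>` comparison is a guess at how the boundary case is handled.

```python
def filter_instances(scores, boxes, masks, threshold=0.3):
    """Keep only instances whose score exceeds `threshold` (strict, by assumption)."""
    keep = [i for i, score in enumerate(scores) if score > threshold]
    return {
        "scores": [scores[i] for i in keep],
        "boxes": [boxes[i] for i in keep],
        "masks": [masks[i] for i in keep],
    }


scores = [0.9, 0.2, 0.5]
boxes = [[0, 0, 4, 4], [1, 1, 2, 2], [2, 2, 6, 6]]
masks = ["mask_0", "mask_1", "mask_2"]  # placeholders for probability maps
result = filter_instances(scores, boxes, masks, threshold=0.3)
```

Only the retained masks would then be passed on to RankSEG, so a higher threshold reduces both the number of returned instances and the amount of post-processing.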

SAM3 semantic masks

adapter = sam.Sam3(rankseg_kwargs={"metric": "dice"})
preds = adapter.postprocess_semantic(
    outputs,
    target_sizes=target_sizes,
)

The SAM adapters follow the official Transformers post-processing order up to and including geometry and score restoration. RankSEG then replaces only the final binary mask step. For SAM3, callers choose postprocess_instance(...) or postprocess_semantic(...) explicitly.
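
For orientation, the conventional final step is a fixed cutoff on the restored probabilities, and the "dice" metric refers to the Dice coefficient, 2|A∩B| / (|A| + |B|). The sketch below shows that baseline and how Dice would score it. Both function names are hypothetical, and this is not RankSEG's own algorithm, only the step it stands in for.

```python
def threshold_masks(probs, cutoff=0.5):
    """Baseline binarization: 1 where the probability exceeds `cutoff`."""
    return [[1 if p > cutoff else 0 for p in row] for row in probs]


def dice(pred, target):
    """Dice coefficient between two binary masks given as nested lists."""
    intersection = sum(
        p * t
        for pred_row, target_row in zip(pred, target)
        for p, t in zip(pred_row, target_row)
    )
    total = sum(map(sum, pred)) + sum(map(sum, target))
    # Two empty masks agree perfectly by convention.
    return 1.0 if total == 0 else 2.0 * intersection / total


probs = [[0.9, 0.6], [0.4, 0.1]]
target = [[1, 1], [1, 0]]
baseline = threshold_masks(probs)
baseline_dice = dice(baseline, target)
```

Comparing a score like `baseline_dice` against the Dice of the adapter's prediction on the same restored probabilities is one way to see the effect of swapping the final step.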

Shape conventions

| Stage | Typical shape | Meaning |
|---|---|---|
| Restored SAM prompt probabilities | (num_masks, 1, H, W) or compatible per-image tensors | One probability map per proposed prompt mask. |
| Restored SAM3 instance probabilities | (num_instances, H, W) | One probability map per retained instance. |
| RankSEG prompt predictions | Binary mask tensors matching restored mask geometry | Final prompt masks after RankSEG post-processing. |
| RankSEG semantic predictions | One tensor per image | Final semantic masks from SAM3 semantic probabilities. |

Restored probabilities

Use the restore_* methods when you need the restored mask probabilities instead of final RankSEG predictions:

sam1_probs = sam.Sam1().restore_mask_probs(
    outputs,
    original_sizes=inputs["original_sizes"],
    reshaped_input_sizes=inputs["reshaped_input_sizes"],
)

sam2_probs = sam.Sam2().restore_mask_probs(
    outputs,
    original_sizes=inputs["original_sizes"],
)

sam3_instances = sam.Sam3(threshold=0.3).restore_instance_mask_probs(
    outputs,
    target_sizes=target_sizes,
)

sam3_semantic_probs = sam.Sam3().restore_semantic_mask_probs(
    outputs,
    target_sizes=target_sizes,
)

Explicit adapter imports are also supported when you prefer shorter local names:

from rankseg.integration.sam import Sam1, Sam2, Sam3

Current exclusions

The SAM integration does not currently support SAM video tracker state.

Executable tutorial

The SAM-family notebook is the recommended way to learn this integration. It runs SAM1, SAM2, and SAM3 examples in separate sections, compares the official post-processing path with the RankSEG path, and keeps the model outputs shared between the two paths so the replacement point is visible.