SAM Family

This page documents the RankSEG integration path for SAM-family outputs from Hugging Face Transformers.

SAM-family outputs use explicit adapter classes instead of the standard Transformers semantic-segmentation helper.

Why SAM has its own adapters

SAM outputs are not plain semantic-segmentation logits. Before RankSEG is called, SAM masks must be restored from model space back to image space using family-specific geometry:

SAM processor -> SAM model outputs -> restore mask probabilities
-> RankSEG -> prompt, instance, or semantic masks

The adapters in rankseg.integration.sam keep this restoration step explicit. RankSEG replaces only the final binarization step, which runs after the masks have been resized and converted to probabilities.

from rankseg.integration import sam
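To illustrate why swapping only the last step can change the result, here is a minimal, dependency-free sketch that contrasts a fixed 0.5 threshold with a simplified ranking rule on a flattened probability map. `ranked_dice_mask` is a hypothetical helper that maximizes a plug-in Dice estimate. It is an illustrative approximation, not the solver RankSEG actually runs.

```python
def threshold_mask(probs, cutoff=0.5):
    """Baseline binarization: keep pixels whose probability exceeds the cutoff."""
    return [1 if p > cutoff else 0 for p in probs]


def ranked_dice_mask(probs):
    """Keep the top-k pixels, with k chosen to maximize an approximate Dice.

    Assumption: Dice is approximated as
    2 * sum(top-k probs) / (k + sum(all probs)).
    """
    # Rank pixel indices from most to least probable.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    total = sum(probs)
    best_k, best_score, running = 0, 0.0, 0.0
    for k, idx in enumerate(order, start=1):
        running += probs[idx]
        score = 2.0 * running / (k + total)
        if score > best_score:
            best_k, best_score = k, score
    mask = [0] * len(probs)
    for idx in order[:best_k]:
        mask[idx] = 1
    return mask


probs = [0.9, 0.4, 0.4, 0.1]
baseline = threshold_mask(probs)
ranked = ranked_dice_mask(probs)
```

On this input the threshold keeps only the first pixel, while the ranking rule also keeps the two 0.4 pixels because doing so raises the estimated Dice.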

Adapter map

Adapter	Input family	Main prediction method	Probability-only method
`sam.Sam1`	SAM1 and SAM-HQ prompt masks	`postprocess(...)`	`restore_mask_probs(...)`
`sam.Sam2`	SAM2 prompt masks	`postprocess(...)`	`restore_mask_probs(...)`
`sam.Sam3`	SAM3 instance masks	`postprocess_instance(...)`	`restore_instance_mask_probs(...)`
`sam.Sam3`	SAM3 semantic masks	`postprocess_semantic(...)`	`restore_semantic_mask_probs(...)`

Recommended RankSEG options

SAM prompt and instance masks are naturally represented as per-mask binary probability maps, so the adapters default to output_mode="multilabel" when rankseg_kwargs does not specify an output mode.

adapter = sam.Sam1(
    rankseg_kwargs={"metric": "dice", "solver": "RMA"}
)
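The defaulting behaviour described above can be pictured as a dictionary merge. This is only a sketch, assuming that caller-supplied keys always win over the default. `resolve_rankseg_kwargs` is a hypothetical name and is not part of rankseg.

```python
def resolve_rankseg_kwargs(rankseg_kwargs=None):
    """Return a copy of the kwargs with output_mode defaulted to "multilabel"."""
    resolved = {"output_mode": "multilabel"}
    # Caller-supplied keys, including an explicit output_mode, override the default.
    resolved.update(rankseg_kwargs or {})
    return resolved


defaults_applied = resolve_rankseg_kwargs({"metric": "dice", "solver": "RMA"})
```

Copying into a new dictionary keeps the caller's `rankseg_kwargs` object unmodified.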

SAM1 prompt masks

adapter = sam.Sam1(rankseg_kwargs={"metric": "dice"})
preds = adapter.postprocess(
    outputs,
    original_sizes=inputs["original_sizes"],
    reshaped_input_sizes=inputs["reshaped_input_sizes"],
)

original_sizes and reshaped_input_sizes should come from the SAM processor inputs. The adapter removes padding, resizes masks back to the original image size, applies sigmoid, and then calls RankSEG.
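The three geometry steps named above (remove padding, resize, sigmoid) can be sketched on a single 2D logit map with plain Python lists. `restore_probs_sketch` is a hypothetical helper, and it uses nearest-neighbour sampling to stay dependency-free. That is an assumption made for brevity; real pipelines typically interpolate with a tensor library.

```python
import math


def restore_probs_sketch(mask_logits, reshaped_size, original_size):
    """Crop padding, resize to the original image size, then apply sigmoid."""
    reshaped_h, reshaped_w = reshaped_size
    original_h, original_w = original_size
    # 1. Remove padding: keep only the region that held the resized image.
    cropped = [row[:reshaped_w] for row in mask_logits[:reshaped_h]]
    # 2. Resize back to the original size (nearest-neighbour sampling).
    resized = [
        [cropped[y * reshaped_h // original_h][x * reshaped_w // original_w]
         for x in range(original_w)]
        for y in range(original_h)
    ]
    # 3. Convert logits to probabilities.
    return [[1.0 / (1.0 + math.exp(-v)) for v in row] for row in resized]


padded_logits = [
    [2.0, -2.0, 0.0, 0.0],
    [-2.0, 2.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],  # padding rows
    [0.0, 0.0, 0.0, 0.0],
]
probs = restore_probs_sketch(padded_logits, reshaped_size=(2, 2), original_size=(4, 4))
```

The result is a probability map at the original image size, which is the kind of input the final binarization step receives.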

SAM2 prompt masks

adapter = sam.Sam2(
    rankseg_kwargs={"metric": "dice"},
    apply_non_overlapping_constraints=False,
)
preds = adapter.postprocess(
    outputs,
    original_sizes=inputs["original_sizes"],
)

Set apply_non_overlapping_constraints=True when you want lower-scoring overlapping SAM2 masks to be suppressed before the logits are converted to probabilities.
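As a rough picture of what overlap suppression can look like, the sketch below keeps, at each pixel, only the mask with the highest logit and pushes every other mask's logit down to a strongly negative floor. Both the per-pixel winner rule and the `floor=-10.0` value are assumptions made for illustration, and `suppress_overlaps` is a hypothetical helper rather than the adapter's code.

```python
def suppress_overlaps(mask_logits, floor=-10.0):
    """Clamp non-winning masks to `floor` at every pixel.

    mask_logits: list of masks, each a list of rows of float logits.
    """
    num_masks = len(mask_logits)
    height, width = len(mask_logits[0]), len(mask_logits[0][0])
    # Work on a copy so the input logits stay untouched.
    out = [[row[:] for row in mask] for mask in mask_logits]
    for y in range(height):
        for x in range(width):
            # The mask with the highest logit at this pixel wins.
            winner = max(range(num_masks), key=lambda m: mask_logits[m][y][x])
            for m in range(num_masks):
                if m != winner:
                    out[m][y][x] = min(out[m][y][x], floor)
    return out


logits = [
    [[3.0, 1.0]],  # mask 0 wins at the first pixel
    [[2.0, 4.0]],  # mask 1 wins at the second pixel
]
constrained = suppress_overlaps(logits)
```

After a sigmoid, the clamped logits map to probabilities near zero, so the suppressed masks no longer compete at those pixels.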

SAM3 instance masks

adapter = sam.Sam3(rankseg_kwargs={"metric": "dice"}, threshold=0.3)
results = adapter.postprocess_instance(
    outputs,
    target_sizes=target_sizes,
)

The instance method returns one dictionary per image with scores, boxes, and masks. threshold filters low-confidence instances before RankSEG is applied to the remaining mask probabilities.
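The confidence filter can be sketched as an index selection applied consistently to scores, boxes, and masks. `filter_instances` is a hypothetical helper, and the strict `>` comparison is an assumption about how the boundary case is handled.

```python
def filter_instances(scores, boxes, mask_probs, threshold=0.3):
    """Keep only instances whose score exceeds the threshold."""
    keep = [i for i, score in enumerate(scores) if score > threshold]
    # Apply the same selection to every field so they stay aligned.
    return {
        "scores": [scores[i] for i in keep],
        "boxes": [boxes[i] for i in keep],
        "masks": [mask_probs[i] for i in keep],
    }


result = filter_instances(
    scores=[0.9, 0.2, 0.5],
    boxes=[[0, 0, 4, 4], [1, 1, 2, 2], [2, 2, 6, 6]],
    mask_probs=[[[0.8]], [[0.1]], [[0.6]]],
    threshold=0.3,
)
```

Only the surviving entries in `result["masks"]` would then be passed on to the binarization step.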

SAM3 semantic masks

adapter = sam.Sam3(rankseg_kwargs={"metric": "dice"})
preds = adapter.postprocess_semantic(
    outputs,
    target_sizes=target_sizes,
)

All SAM adapters apply geometry and score restoration in the same order as the official Transformers post-processing; RankSEG replaces only the final binary mask step. For SAM3, the caller must choose postprocess_instance(...) or postprocess_semantic(...) explicitly.

Shape conventions

Stage	Typical shape	Meaning
Restored SAM prompt probabilities	`(num_masks, 1, H, W)` or compatible per-image tensors	One probability map per proposed prompt mask.
Restored SAM3 instance probabilities	`(num_instances, H, W)`	One probability map per retained instance.
RankSEG prompt predictions	Binary mask tensors matching restored mask geometry	Final prompt masks after RankSEG post-processing.
RankSEG semantic predictions	One tensor per image	Final semantic masks from SAM3 semantic probabilities.

Restored probabilities

Use the restore_* methods when you need the restored mask probabilities instead of final RankSEG predictions:

sam1_probs = sam.Sam1().restore_mask_probs(
    outputs,
    original_sizes=inputs["original_sizes"],
    reshaped_input_sizes=inputs["reshaped_input_sizes"],
)

sam2_probs = sam.Sam2().restore_mask_probs(
    outputs,
    original_sizes=inputs["original_sizes"],
)

sam3_instances = sam.Sam3(threshold=0.3).restore_instance_mask_probs(
    outputs,
    target_sizes=target_sizes,
)

sam3_semantic_probs = sam.Sam3().restore_semantic_mask_probs(
    outputs,
    target_sizes=target_sizes,
)

Explicit adapter imports are also supported when you prefer shorter local names:

from rankseg.integration.sam import Sam1, Sam2, Sam3

Current exclusions

The SAM integration does not currently support SAM video tracker state.

Executable tutorial

The SAM-family notebook is the recommended way to learn this integration. It runs SAM1, SAM2, and SAM3 examples in separate sections, compares the official post-processing path with the RankSEG path, and keeps the model outputs shared between the two paths so the replacement point is visible.
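One simple way to quantify how much two binarization paths differ on the same outputs is the Dice overlap between their binary masks. The helper below is a hypothetical, dependency-free sketch on flattened masks; it is not taken from the notebook.

```python
def dice_overlap(mask_a, mask_b):
    """Dice coefficient between two flattened binary masks of equal length."""
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(a * b for a, b in zip(mask_a, mask_b))
    size_sum = sum(mask_a) + sum(mask_b)
    # Convention chosen here: two empty masks agree perfectly.
    return 1.0 if size_sum == 0 else 2.0 * intersection / size_sum


official_mask = [1, 0, 0, 0]
rankseg_mask = [1, 1, 0, 0]
agreement = dice_overlap(official_mask, rankseg_mask)
```

A value of 1.0 means the two paths produced identical masks, and lower values indicate that the final step changed which pixels were selected.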