SAM2

class grid.model.perception.segmentation.sam2.SAM2(*args, **kwargs)

SAM2 model for object segmentation and tracking.

This class implements a wrapper for the Segment Anything 2 (SAM2) model, which segments objects in RGB images and videos based on input prompts.

Credits:

https://github.com/facebookresearch/segment-anything-2

License:

This code is licensed under the Apache 2.0 and BSD-3 licenses.

Parameters:

model_size (str) -- The size of the SAM2 model to use.

__init__(model_size='large')

Initialize the SAM2 model.

Parameters:

model_size (str) -- The size of the SAM2 model to use. Options are "tiny", "small", "base_plus", and "large".

Return type:

None
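Since an invalid model_size can only fail at load time, it can help to validate the argument up front. A minimal sketch (the option list is taken from the docs above; the helper name is illustrative, not part of the API):

```python
# Valid model sizes for SAM2.__init__, per the documentation above.
VALID_MODEL_SIZES = ("tiny", "small", "base_plus", "large")

def check_model_size(model_size: str) -> str:
    """Validate a model_size argument before constructing SAM2."""
    if model_size not in VALID_MODEL_SIZES:
        raise ValueError(
            f"model_size must be one of {VALID_MODEL_SIZES}, got {model_size!r}"
        )
    return model_size

# Example: check_model_size("large") passes; check_model_size("huge") raises.
```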

segment_image(rgbimage, input_prompts, input_labels, multimask_output)

Segment objects from an RGB image based on input prompts.

Parameters:
  • rgbimage (np.ndarray) -- Target RGB image represented as a numpy array of shape (H, W, 3).

  • input_prompts (np.ndarray) -- Input prompts for segmentation (e.g., bounding boxes, points).

  • input_labels (np.ndarray) -- Labels associated with the prompts.

  • multimask_output (bool) -- Whether to output multiple masks for each object.

Returns:

List of object masks (np.ndarray of shape (H, W, 1)) where object pixels are 1, others 0.

Return type:

List[np.ndarray]
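A sketch of how the inputs might be prepared, assuming point prompts given as (x, y) coordinates with label 1 for foreground (the array shapes are illustrative; the actual model call is shown commented out since it needs SAM2 weights):

```python
import numpy as np

# Dummy RGB image and a single foreground point prompt in the
# shapes segment_image expects.
rgb = np.zeros((480, 640, 3), dtype=np.uint8)             # (H, W, 3) image
input_prompts = np.array([[320, 240]], dtype=np.float32)  # one (x, y) point
input_labels = np.array([1], dtype=np.int32)              # 1 = foreground

# sam2 = SAM2(model_size="large")
# masks = sam2.segment_image(rgb, input_prompts, input_labels,
#                            multimask_output=False)
# Each returned mask is an (H, W, 1) array: 1 inside the object, 0 elsewhere.
```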

segment_video(video_dir, input_prompts, input_labels)

Segment objects from video frames based on input prompts.

Parameters:
  • video_dir (str) -- Directory containing JPEG frames of the video.

  • input_prompts (np.ndarray) -- Input prompts for segmentation (e.g., points).

  • input_labels (np.ndarray) -- Labels associated with the prompts.

Returns:

Segmentation masks for each frame in the video.

Return type:

Dict[int, Dict[int, np.ndarray]]
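The nested return type maps frame indices to per-object masks. A sketch of iterating such a result, using a mocked output (assuming the inner dict is keyed by object id and each mask is a 2-D binary array, which is not spelled out above):

```python
import numpy as np

# Mocked segment_video output: Dict[frame_idx, Dict[object_id, mask]],
# here two frames with one tracked object (illustrative only).
video_masks = {
    0: {1: np.zeros((480, 640), dtype=np.uint8)},
    1: {1: np.ones((480, 640), dtype=np.uint8)},
}

for frame_idx in sorted(video_masks):
    for obj_id, mask in video_masks[frame_idx].items():
        coverage = mask.mean()  # fraction of pixels assigned to this object
        print(f"frame {frame_idx}: object {obj_id} covers {coverage:.0%}")
```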