GroundedSAM

class grid.model.perception.segmentation.gsam.GroundedSAM(*args, **kwargs)

GroundedSAM: Object Segmentation Model

This class implements a wrapper for the GroundedSAM model, which segments objects in RGB images based on text prompts.

Credits:

https://github.com/IDEA-Research/Grounded-Segment-Anything and https://github.com/facebookresearch/segment-anything

License:

This code is licensed under the Apache 2.0 License.

__init__(box_threshold=0.5, text_threshold=0.25, nms_threshold=0.8)

Initialize GroundedSAM model.

Parameters:
  • box_threshold (float) -- Confidence threshold for bounding box predictions. Defaults to 0.5.

  • text_threshold (float) -- Confidence threshold for matching text phrases to detected boxes. Defaults to 0.25.

  • nms_threshold (float) -- IoU threshold for non-maximum suppression of overlapping boxes. Defaults to 0.8.

Return type:

None
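
The thresholds can be tuned at construction time to trade recall for precision. The values below are purely illustrative, not recommended settings:

>>> # illustrative threshold values only; tune per application
>>> gsam = GroundedSAM(box_threshold=0.6, text_threshold=0.3, nms_threshold=0.7)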

segment_object(rgbimage, text_prompt)

Detect and segment objects in rgbimage, where the target objects are specified by text_prompt.

Parameters:
  • rgbimage (np.ndarray) -- target RGB image represented as a numpy array of shape (H, W, 3)

  • text_prompt (str) -- text prompt specifying the objects to be detected and segmented.

Returns:

List of object masks (np.ndarray of shape (H, W, 1)): object pixels are 1, others 0. The list is empty if no object is detected.

Return type:

List[np.ndarray]

Example

>>> gsam = GroundedSAM()
>>> masks = gsam.segment_object(img, "turbine")
>>> if len(masks) > 0:
...     # always check that the list is non-empty before processing the result
...     mask = masks[0]  # np.ndarray of shape (H, W, 1)
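
Since each returned mask is a binary (H, W, 1) array, standard numpy operations suffice for post-processing. The sketch below assumes img is the same (H, W, 3) array passed to segment_object; the variable names are illustrative.

>>> import numpy as np
>>> masks = gsam.segment_object(img, "turbine")
>>> if len(masks) > 0:
...     union = np.clip(np.sum(masks, axis=0), 0, 1)  # union of all masks, shape (H, W, 1)
...     isolated = img * union                        # zero out all non-object pixels
...     coverage = float(union.mean())                # fraction of pixels covered by objects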