GSAM2
- class grid.model.perception.segmentation.gsam2.GSAM2(*args, **kwargs)
GSAM2: Grounded Segment Anything 2.1 Model
This class wraps the GSAM2 pipeline, which combines Grounding DINO for text-prompted, open-vocabulary object detection with SAM2 for high-precision segmentation of RGB images.
- Credits:
https://github.com/facebookresearch/segment-anything-2 https://github.com/IDEA-Research/GroundingDINO
- License:
This code is licensed under the Apache 2.0 and BSD-3-Clause licenses.
- __init__(model_size='large', box_threshold=0.35, text_threshold=0.25, nms_threshold=0.8)
Initialize the GSAM2 model with Grounding DINO and SAM2 components.
- Parameters:
model_size (str) -- The size of the SAM2 model to use. Options are "tiny", "small", "base_plus", and "large".
box_threshold (float) -- Confidence threshold for object boxes from Grounding DINO.
text_threshold (float) -- Confidence threshold for associating detected boxes with phrases in the text prompt.
nms_threshold (float) -- Non-maximum suppression (NMS) threshold for filtering overlapping boxes.
- Return type:
None
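The `nms_threshold` parameter controls how aggressively overlapping Grounding DINO boxes are pruned before segmentation. A minimal sketch of greedy IoU-based NMS, illustrating the effect of the threshold (plain NumPy, independent of the model itself; the `[x1, y1, x2, y2]` box format and score-descending ordering are assumptions):

```python
import numpy as np

def iou(a, b):
    """Intersection-over-union of two [x1, y1, x2, y2] boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def nms(boxes, scores, nms_threshold=0.8):
    """Greedy NMS: keep the highest-scoring box, then drop any box
    whose IoU with an already-kept box exceeds nms_threshold."""
    order = np.argsort(scores)[::-1]
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= nms_threshold for j in keep):
            keep.append(int(i))
    return keep
```

With the default of 0.8, only near-duplicate boxes are suppressed; lowering the threshold merges detections more aggressively.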
- segment_object(rgbimage, text_prompt)
Segment objects in an RGB image based on a text prompt.
Grounding DINO first detects objects matching the text prompt; SAM2 then produces a segmentation mask for each detected object.
- Parameters:
rgbimage (str) -- Path to the RGB image file.
text_prompt (str) -- Text prompt to guide object detection.
- Returns:
List of segmentation masks, one per detected object.
- Return type:
List[np.ndarray]
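Because `segment_object` returns a list of per-object masks, a common follow-up step is merging them into a single labeled image. A minimal sketch (NumPy only; treating each mask as a binary array of the image's height and width is an assumption consistent with the `List[np.ndarray]` return type):

```python
import numpy as np

def masks_to_label_image(masks):
    """Merge per-object binary masks into one integer label image:
    0 = background, k = the k-th detected object (later masks overwrite
    earlier ones where they overlap)."""
    if not masks:
        raise ValueError("no masks to merge")
    label = np.zeros(masks[0].shape, dtype=np.int32)
    for k, mask in enumerate(masks, start=1):
        label[mask.astype(bool)] = k
    return label
```

For example, passing the output of `segment_object` through this helper yields a per-pixel object-index map suitable for visualization or downstream counting.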