OWLv2
- class grid.model.perception.detection.owlv2.OWLv2(*args, **kwargs)
OWLv2: Open-vocabulary Object Detection Model
This class implements a wrapper for the OWLv2 model, which detects objects in RGB images based on a text prompt.
- Credits:
- Code:
- License:
This code is licensed under the Apache 2.0 License.
- __init__(box_threshold=0.2)
Initialize the OWLv2 model.
- Parameters:
box_threshold (float, optional) -- The threshold value for object detection. Defaults to 0.2.
- Return type:
None
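The effect of box_threshold can be pictured with a small NumPy sketch. It assumes the threshold acts as a minimum confidence score, which this page does not spell out. filter_by_threshold is a hypothetical helper, not part of the OWLv2 wrapper, and the arrays are synthetic.

```python
import numpy as np

def filter_by_threshold(boxes, scores, box_threshold=0.2):
    """Hypothetical helper: keep detections whose score is at least box_threshold."""
    keep = scores >= box_threshold  # boolean mask of shape (N,)
    return boxes[keep], scores[keep]

# Synthetic detections for illustration only, not real model output.
boxes = np.array(
    [[10, 20, 50, 80], [0, 0, 30, 30], [5, 5, 15, 25]], dtype=float
)
scores = np.array([0.65, 0.10, 0.20])

kept_boxes, kept_scores = filter_by_threshold(boxes, scores)
print(kept_boxes.shape, kept_scores)
```

Under this assumption, raising box_threshold trades recall for fewer low-confidence boxes.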
- detect_object(rgbimage, text_prompt)
Detect the objects named in text_prompt in the RGB image and return their bounding boxes, scores, and labels.
- Parameters:
rgbimage (np.ndarray) -- Target RGB image represented as a numpy array of shape (H, W, 3).
text_prompt (str) -- Text prompt specifying the objects to detect. The prompt may name several objects, separated by commas (,).
- Returns:
bounding boxes (np.ndarray): Bounding boxes in 2D pixel coordinates with respect to the image, in xyxy format. Shape (N, 4).
scores (np.ndarray): Confidence scores of the detected bounding boxes. Shape (N).
labels (np.ndarray): Labels corresponding to the detected objects. Shape (N).
- Return type:
Tuple[Optional[np.ndarray], Optional[np.ndarray], Optional[np.ndarray]]
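Because the return type is Optional and the boxes use xyxy pixel coordinates, calling code usually needs a None check and integer indices before slicing the image. The following is a minimal sketch of that pattern. crop_detections is a hypothetical helper, the rounding and clamping are assumptions about what a caller would want, and the image and boxes are synthetic.

```python
import numpy as np

def crop_detections(rgbimage, boxes):
    """Hypothetical helper: cut one image patch per xyxy bounding box."""
    if boxes is None:  # the return type allows None
        return []
    height, width = rgbimage.shape[:2]
    crops = []
    for x1, y1, x2, y2 in boxes:
        # Round to whole pixels and clamp to the image bounds.
        x1 = int(np.clip(round(float(x1)), 0, width))
        x2 = int(np.clip(round(float(x2)), 0, width))
        y1 = int(np.clip(round(float(y1)), 0, height))
        y2 = int(np.clip(round(float(y2)), 0, height))
        # Rows are indexed by y, columns by x.
        crops.append(rgbimage[y1:y2, x1:x2])
    return crops

# Synthetic (H, W, 3) image and boxes; the second box runs past the right edge.
image = np.zeros((100, 200, 3), dtype=np.uint8)
boxes = np.array([[10.0, 20.0, 50.0, 80.0], [150.0, 40.0, 250.0, 90.0]])

crops = crop_detections(image, boxes)
print([c.shape for c in crops])
```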
Example
>>> owlv2_model = OWLv2()
>>> boxes, scores, labels = owlv2_model.detect_object(img, "fire, redline")
>>> print(boxes, scores, labels)
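A sketch of how the prompt and results from a call like the one above might be handled. build_prompt and best_per_label are hypothetical helpers, not part of the wrapper. The sketch assumes labels are returned as strings matching the prompt entries, which this page does not confirm, and it uses synthetic arrays in place of real model output.

```python
import numpy as np

def build_prompt(object_names):
    """Hypothetical helper: join object names into a comma-separated prompt."""
    return ", ".join(name.strip() for name in object_names)

def best_per_label(boxes, scores, labels):
    """Hypothetical helper: keep the highest-scoring detection for each label."""
    if boxes is None or scores is None or labels is None:
        return {}
    best = {}
    for box, score, label in zip(boxes, scores, labels):
        label = str(label)  # assumption: labels are string-like
        if label not in best or score > best[label][1]:
            best[label] = (box, float(score))
    return best

prompt = build_prompt(["fire", " redline"])
print(prompt)

# Synthetic stand-ins for the three arrays returned by detect_object.
boxes = np.array([[0, 0, 10, 10], [5, 5, 40, 60], [20, 20, 30, 30]], dtype=float)
scores = np.array([0.4, 0.7, 0.9])
labels = np.array(["fire", "redline", "fire"])

best = best_per_label(boxes, scores, labels)
for label, (box, score) in best.items():
    print(label, score, box)
```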