from grid.model.perception.detection.gdino import GroundingDINO
# AirGenCar is assumed to be provided by the GRID SDK's robot module
from grid.robot.wheeled.airgen_car import AirGenCar

car = AirGenCar()

# We will be capturing an image from the AirGen simulator
# and running model inference on it.

img = car.getImage("front_center", "rgb").data

model = GroundingDINO(use_local=False)
# Replace <prompt> with your detection prompt, e.g. "car . person"
box, scores, labels = model.run(rgbimage=img, prompt=<prompt>)
print(box, scores, labels)

# If you want to run the model locally, set use_local=True
model = GroundingDINO(use_local=True)
box, scores, labels = model.run(rgbimage=img, prompt=<prompt>)
print(box, scores, labels)

The GroundingDINO class wraps the GroundingDINO model, which detects objects in RGB images based on text prompts.

class GroundingDINO()

box_threshold (float, default: 0.4)
Confidence threshold for bounding box detection.

text_threshold (float, default: 0.25)
Confidence threshold for text-based object detection.

use_local (boolean, default: False)
If True, inference runs on the local VM; otherwise it is offloaded to GRID-Cortex.
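The thresholds documented above can be tightened to suppress low-confidence detections. A minimal sketch, assuming the constructor accepts these parameters as keyword arguments:

from grid.model.perception.detection.gdino import GroundingDINO

# Stricter thresholds: boxes below 0.5 confidence and text matches
# below 0.3 are discarded before results are returned.
model = GroundingDINO(use_local=False, box_threshold=0.5, text_threshold=0.3)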

def run()

rgbimage (np.ndarray, required)
The input RGB image of shape (M, N, 3).

prompt (str, required)
Text prompt for object detection. Multiple prompts can be separated by a ".".

Returns
List[float], List[float], List[str]

Returns three lists: bounding box coordinates, confidence scores, and label strings.
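Since run() returns three parallel lists, detections can be zipped together and filtered. A minimal sketch, assuming img is an RGB array obtained as in the example above; the 0.6 cutoff is an arbitrary illustration:

from grid.model.perception.detection.gdino import GroundingDINO

model = GroundingDINO(use_local=False)

# Multiple object classes in a single prompt, separated by "."
boxes, scores, labels = model.run(rgbimage=img, prompt="car . person . traffic light")

# The lists are parallel: boxes[i], scores[i], and labels[i]
# describe the same detection.
for box, score, label in zip(boxes, scores, labels):
    if score >= 0.6:  # keep only high-confidence detections
        print(f"{label}: {score:.2f} at {box}")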


This code is licensed under the Apache 2.0 License.
