Metric3D

class grid.model.perception.depth.metric3d.Metric3D(*args, **kwargs)

Metric3D: Depth Estimation Model

This class implements a wrapper for the Metric3D model, which estimates depth maps from RGB images using a variety of encoder types.

Credits:

https://github.com/YvanYin/Metric3D

License:

This code is licensed under the BSD 2-Clause License.

__init__()

Initialize the Metric3D model with the specified encoder configuration.

We use the ViT-Large encoder for this model; an option to choose the encoder type will be provided in the next update. The model is loaded onto the GPU if one is available, otherwise it falls back to the CPU.

Return type:

None

preprocess_image(rgbimage, model_type='vit')

Preprocesses the image for model input, adjusts intrinsic parameters, and tracks padding information.

Parameters:
  • rgbimage (Image.Image) -- The input RGB image.

  • model_type (str) -- The type of model to use (default is 'vit').

Returns:

The preprocessed image tensor.

Return type:

torch.Tensor

Example

>>> from PIL import Image
>>> img = Image.open('path_to_image.jpg')
>>> m3d = Metric3D()
>>> processed = m3d.preprocess_image(img)
>>> print(processed.shape)
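The intrinsic-scaling and padding bookkeeping that preprocess_image performs can be sketched as follows. This is a NumPy-only illustration of the general technique, not Metric3D's actual implementation: the target resolution (616, 1064) is the ViT input size used by the upstream repository, but the `scale_and_pad` helper and its signature are hypothetical.

```python
import numpy as np

def scale_and_pad(rgb, fx, fy, target_h=616, target_w=1064):
    """Resize to fit the target resolution, scale the intrinsics by the
    same factor, and pad to the exact target size, tracking the padding.

    Hypothetical helper illustrating the general approach; not the
    actual Metric3D code.
    """
    h, w = rgb.shape[:2]
    scale = min(target_h / h, target_w / w)
    new_h, new_w = int(h * scale), int(w * scale)

    # Nearest-neighbor resize with pure NumPy indexing.
    rows = (np.arange(new_h) / scale).astype(int).clip(0, h - 1)
    cols = (np.arange(new_w) / scale).astype(int).clip(0, w - 1)
    resized = rgb[rows[:, None], cols]

    # Focal lengths scale with the image.
    fx, fy = fx * scale, fy * scale

    # Pad symmetrically to the target resolution; remember the pads so
    # the output depth map can be cropped back afterwards.
    pad_top = (target_h - new_h) // 2
    pad_bottom = target_h - new_h - pad_top
    pad_left = (target_w - new_w) // 2
    pad_right = target_w - new_w - pad_left
    padded = np.pad(resized,
                    ((pad_top, pad_bottom), (pad_left, pad_right), (0, 0)),
                    mode="constant")
    pad_info = (pad_top, pad_bottom, pad_left, pad_right)
    return padded, (fx, fy), pad_info

img = np.zeros((256, 256, 3), dtype=np.uint8)
padded, (fx, fy), pad_info = scale_and_pad(img, fx=500.0, fy=500.0)
print(padded.shape)   # (616, 1064, 3)
print(pad_info)       # (0, 0, 224, 224)
```

Keeping `pad_info` alongside the scaled intrinsics is what lets the pipeline map the network's fixed-resolution output back to the original image frame.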

run(rgbimage)

Runs the Metric3D depth estimation model on the given RGB image.

Parameters:

rgbimage (np.ndarray) -- The input RGB image.

Returns:

The 2D depth map generated by the Metric3D model.

Return type:

np.ndarray

Example

>>> import numpy as np
>>> img = np.random.randint(0, 255, (256, 256, 3), dtype=np.uint8)
>>> m3d = Metric3D()
>>> depth = m3d.run(img)
>>> print(depth.shape)
(256, 256)
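
As the (256, 256) output above shows, run() returns the depth map at the original image resolution, so the padding added during preprocessing must be undone and the map resized back. A minimal sketch of that un-padding step, assuming the pad_info tuple from preprocessing; the `unpad_and_resize` helper is hypothetical, not part of the Metric3D API:

```python
import numpy as np

def unpad_and_resize(depth, pad_info, out_h, out_w):
    """Crop the padding applied during preprocessing and resize the
    depth map back to the original resolution (nearest-neighbor).

    Hypothetical helper; the wrapper may perform this step internally.
    """
    pad_top, pad_bottom, pad_left, pad_right = pad_info
    h, w = depth.shape
    cropped = depth[pad_top:h - pad_bottom, pad_left:w - pad_right]
    ch, cw = cropped.shape

    # Nearest-neighbor resize back to the original image size.
    rows = (np.arange(out_h) * ch / out_h).astype(int).clip(0, ch - 1)
    cols = (np.arange(out_w) * cw / out_w).astype(int).clip(0, cw - 1)
    return cropped[rows[:, None], cols]

net_out = np.ones((616, 1064), dtype=np.float32)  # stand-in for model output
depth = unpad_and_resize(net_out, pad_info=(0, 0, 224, 224),
                         out_h=256, out_w=256)
print(depth.shape)  # (256, 256)
```

Note that the actual pipeline also rescales the predicted depth from the model's canonical camera to metric units using the (scaled) focal length; that step is omitted here for brevity.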