MoonDream

class grid.model.perception.vlm.moondream.MoonDream(*args, **kwargs)

MoonDream: Visual Question Answering Model

This class wraps the MoonDream v3 model, which answers natural-language questions about images.

Credits:

https://github.com/vikhyat/moondream

License:

This code is licensed under the Apache 2.0 License.

__init__()

Initialize the MoonDream model.

The model is loaded and initialized with pre-trained weights.

Return type:

None

run(rgbimage, question)

Runs the MoonDream visual question answering model on the given RGB image and question.

Parameters:
  • rgbimage (np.ndarray) -- The input RGB image.

  • question (str) -- The question to be answered.

Returns:

The predicted answer.

Return type:

np.ndarray

Example

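>>> import numpy as np
>>> from grid.model.perception.vlm.moondream import MoonDream
>>> rgbimage = np.random.randint(0, 256, (224, 224, 3), dtype=np.uint8)  # random 8-bit RGB image
>>> moondream = MoonDream()
>>> answer = moondream.run(rgbimage, "What do you see in this image?")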
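Images from cameras or files are not always three-channel arrays like the one in the example above; they may be grayscale, carry an alpha channel, or use floating-point values. The sketch below shows one way to normalize such input into an (H, W, 3) 8-bit array before passing it to run(). `to_rgb_uint8` is a hypothetical helper, not part of the documented API, and it assumes that float images are scaled to [0, 1] and that the model expects 8-bit RGB.

```python
import numpy as np


def to_rgb_uint8(image):
    """Return an (H, W, 3) uint8 array, or raise ValueError for other shapes.

    Hypothetical helper, not part of the documented MoonDream API.
    """
    arr = np.asarray(image)
    if arr.ndim == 2:
        # Grayscale: repeat the single channel three times.
        arr = np.stack([arr, arr, arr], axis=-1)
    elif arr.ndim == 3 and arr.shape[2] == 4:
        # RGBA: drop the alpha channel.
        arr = arr[:, :, :3]
    if arr.ndim != 3 or arr.shape[2] != 3:
        raise ValueError(f"expected an (H, W, 3) image, got shape {arr.shape}")
    if np.issubdtype(arr.dtype, np.floating):
        # Assumption: float images are scaled to [0, 1].
        scaled = np.clip(arr, 0.0, 1.0) * 255.0
        return np.round(scaled).astype(np.uint8)
    # Integer images: widen first so clipping cannot overflow, then narrow.
    return np.clip(arr.astype(np.int64), 0, 255).astype(np.uint8)


gray = np.full((4, 5), 0.5)
rgb = to_rgb_uint8(gray)
print(rgb.shape, rgb.dtype)
```

The result can then be passed as the rgbimage argument, for example moondream.run(to_rgb_uint8(frame), question), where frame is whatever array the caller has on hand.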