StableDiffusion

class grid.model.generation.image.stablediffusion.StableDiffusion(*args, **kwargs)

Stable Diffusion 2.1: Text-to-Image Generation Model

This class implements a wrapper for the Stable Diffusion 2.1 model, which generates high-quality images from text prompts. It utilizes the Hugging Face Diffusers library and employs the DPMSolverMultistepScheduler for efficient inference.

pipe

The Stable Diffusion pipeline object.

Type:

StableDiffusionPipeline

generate_image(prompt: str) → np.ndarray

Generates an image based on the input text prompt.

Parameters:

prompt (str)

Return type:

ndarray

clear_weights()

Clears the model weights to free up memory.

Example

>>> import matplotlib.pyplot as plt
>>> from grid.model.generation.image.stablediffusion import StableDiffusion
>>> model = StableDiffusion()
>>> img = model.generate_image("A photo of a friendly looking robot")
>>> plt.imshow(img)
>>> plt.show()

Note

This model requires a CUDA-enabled GPU for optimal performance.

License:

This model is licensed under the MIT License.

__init__()

Initializes the Stable Diffusion 2.1 model.

This constructor sets up the Stable Diffusion pipeline with the DPMSolverMultistepScheduler and moves the model to the appropriate device (GPU if available).
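A constructor along these lines is commonly written with the Hugging Face Diffusers API; the model id, dtype choice, and function name below are assumptions, not taken from this reference:

```python
def build_pipeline(model_id: str = "stabilityai/stable-diffusion-2-1"):
    """Sketch: load Stable Diffusion 2.1 and swap in the DPM multistep solver."""
    import torch
    from diffusers import StableDiffusionPipeline, DPMSolverMultistepScheduler

    device = "cuda" if torch.cuda.is_available() else "cpu"
    pipe = StableDiffusionPipeline.from_pretrained(
        model_id,
        torch_dtype=torch.float16 if device == "cuda" else torch.float32,
    )
    # DPMSolverMultistepScheduler typically reaches good quality in ~20-25
    # inference steps, far fewer than the pipeline's default scheduler needs.
    pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config)
    return pipe.to(device)
```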

Return type:

None

generate_image(prompt)

Generates an image from the given text prompt using the Stable Diffusion 2.1 model.

Parameters:

prompt (str) -- The text description of the image to be generated.

Returns:

The generated image as a NumPy array with shape (height, width, 3) and dtype uint8, representing an RGB image.

Return type:

np.ndarray

Raises:

RuntimeError -- If the model fails to generate an image.
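Diffusers pipelines return PIL images by default, so a wrapper that returns the documented uint8 array has to convert the pipeline output. A sketch of that conversion (the helper name is an assumption), exercised here on a synthetic float image rather than a real generation:

```python
import numpy as np

def to_uint8_rgb(image) -> np.ndarray:
    """Convert a pipeline output (PIL.Image or float array in [0, 1])
    to the documented (height, width, 3) uint8 RGB array."""
    arr = np.asarray(image)
    if arr.dtype != np.uint8:
        # Float images are assumed to lie in [0, 1]; rescale to [0, 255].
        arr = (np.clip(arr, 0.0, 1.0) * 255).round().astype(np.uint8)
    return arr

# Synthetic stand-in for a generated image: 768x768 float RGB in [0, 1].
fake = np.random.rand(768, 768, 3)
img = to_uint8_rgb(fake)
print(img.shape, img.dtype)  # (768, 768, 3) uint8
```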

Example

>>> from grid.model.generation.image.stablediffusion import StableDiffusion
>>> model = StableDiffusion()
>>> img = model.generate_image("A photo of a friendly looking robot")
>>> img.shape
(768, 768, 3)