StableDiffusion
- class grid.model.generation.image.stablediffusion.StableDiffusion(*args, **kwargs)
Stable Diffusion 2.1: Text-to-Image Generation Model
This class implements a wrapper for the Stable Diffusion 2.1 model, which generates high-quality images from text prompts. It utilizes the Hugging Face Diffusers library and employs the DPMSolverMultistepScheduler for efficient inference.
- pipe
The Stable Diffusion pipeline object.
- Type:
StableDiffusionPipeline
- generate_image(prompt: str) → np.ndarray
Generates an image based on the input text prompt.
- Parameters:
prompt (str) -- The text prompt describing the image to generate.
- Return type:
ndarray
- clear_weights()
Clears the model weights to free up memory.
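The weight-clearing step can be sketched as follows. This is an illustrative assumption, not the wrapper's actual implementation: `clear_weights_sketch` and its use of a `pipe` attribute are hypothetical, mirroring the `pipe` attribute documented above.

```python
import gc


def clear_weights_sketch(model):
    """Hypothetical illustration of freeing model memory.

    Assumes the model holds its pipeline in a `pipe` attribute,
    as documented for this class.
    """
    # Drop the reference to the pipeline so Python can reclaim it.
    model.pipe = None
    gc.collect()
    # On a CUDA machine one would typically also release cached GPU
    # memory, e.g. torch.cuda.empty_cache() (not called here so the
    # sketch stays runnable without torch).
```

A quick usage check with a stand-in object:

```python
from types import SimpleNamespace

m = SimpleNamespace(pipe=object())
clear_weights_sketch(m)
assert m.pipe is None
```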
Example
>>> import matplotlib.pyplot as plt
>>> from grid.model.generation.image.stablediffusion import StableDiffusion
>>> model = StableDiffusion()
>>> img = model.generate_image("A photo of a friendly looking robot")
>>> plt.imshow(img)
>>> plt.show()
Note
This model requires a CUDA-enabled GPU for optimal performance.
- Credits:
Stable Diffusion 2.1: https://huggingface.co/stabilityai/stable-diffusion-2-1
- License:
This model is licensed under the MIT License.
- __init__()
Initializes the Stable Diffusion 2.1 model.
This constructor sets up the Stable Diffusion pipeline with the DPMSolverMultistepScheduler and moves the model to the appropriate device (GPU if available).
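The setup described above corresponds roughly to the following Diffusers calls. This is a hedged sketch, not the wrapper's actual source: the model id is taken from the Credits link, while the half-precision dtype and device handling are assumptions based on the class description.

```python
def build_pipeline():
    """Sketch of the constructor's setup, assuming the standard
    diffusers API for Stable Diffusion 2.1."""
    # Imports are local so the sketch can be defined without
    # diffusers/torch installed.
    import torch
    from diffusers import StableDiffusionPipeline, DPMSolverMultistepScheduler

    pipe = StableDiffusionPipeline.from_pretrained(
        "stabilityai/stable-diffusion-2-1",  # model id from the Credits link
        torch_dtype=torch.float16,           # assumed half precision for GPU use
    )
    # Swap in the multistep DPM-Solver scheduler for efficient inference.
    pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config)
    # Move the model to the appropriate device (GPU if available).
    pipe = pipe.to("cuda" if torch.cuda.is_available() else "cpu")
    return pipe
```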
- Return type:
None
- generate_image(prompt)
Generates an image from the given text prompt using the Stable Diffusion 2.1 model.
- Parameters:
prompt (str) -- The text description of the image to be generated.
- Returns:
- The generated image as a NumPy array with shape (height, width, 3) and dtype uint8, representing an RGB image.
- Return type:
np.ndarray
- Raises:
RuntimeError -- If the model fails to generate an image.
Example
>>> from grid.model.generation.image.stablediffusion import StableDiffusion
>>> model = StableDiffusion()
>>> img = model.generate_image("A photo of a friendly looking robot")
>>> img.shape
(768, 768, 3)
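Because generate_image returns a plain uint8 RGB array, standard NumPy tooling applies directly. The sketch below uses a synthetic array as a stand-in for a generated image, so it runs without a GPU or model download; the shapes match the documented return contract.

```python
import numpy as np

# Stand-in for model.generate_image(...): same (height, width, 3)
# uint8 RGB contract as documented above.
img = np.zeros((768, 768, 3), dtype=np.uint8)

# Validate the documented contract before further processing.
assert img.ndim == 3 and img.shape[2] == 3
assert img.dtype == np.uint8

# Convert to float in [0, 1], e.g. for downstream vision models.
img_float = img.astype(np.float32) / 255.0
```

To write the image to disk, the usual route is Pillow's `Image.fromarray(img).save("out.png")`, which accepts exactly this uint8 RGB layout.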