In case you’ve been backpacking in the Himalayas for the last year, generative AI has recently become enormously popular. Text generators like OpenAI's ChatGPT and Google Bard are one type of generative AI model. Text-to-image generators are another. One of the leaders in the space is Stable Diffusion, an open source image generation AI system.
We'll use Stable Diffusion's free trial API to build a React.js client that connects and interacts with the service.
Getting started with Stable Diffusion
Stable Diffusion is available from its GitHub repository. Several projects offer endpoints for interacting with hosted Stable Diffusion installs, so you can avoid having to set up and train the model yourself. One of the best interfaces to Stable Diffusion is the Stable Diffusion API, which offers a free trial. We’ll use the API to see how we can interact with Stable Diffusion in React, a front-end JavaScript library that is also free and open source.
To begin, we’ll use the create-react-app command-line tool to start a new React application. You’ll need Node and NPM installed for this. Then, install create-react-app with $ npm i -g create-react-app. Now, you can create a simple application with the command: $ create-react-app react-sd. To test it, move into the react-sd directory and type $ npm start, and visit the landing page at localhost:3000.
Build the React UI
Now that we have React and Stable Diffusion set up, let’s start building a UI that will let us enter a text prompt, send it to the Stable Diffusion API endpoint, and display the resultant image. We’ll use a simple scrollable pane to show the list of generated images.
First, we'll create a text box to receive the prompt and a Submit button to go with it, as shown in Listing 1. This code can replace React's default App.js file.
Listing 1. Add a text box and a Submit option to the basic App.js file
import React, { useState } from "react";

const App = () => {
  const [textPrompt, setTextPrompt] = useState("");
  const [prompts, setPrompts] = useState([]);

  // Placeholder: the API call is implemented in Listing 2
  const generateImage = async () => {};

  const handleClick = () => {
    setPrompts([...prompts, textPrompt]);
    setTextPrompt("");
    generateImage();
  };

  return (
    <div>
      <input
        type="text"
        placeholder="Enter a text prompt"
        value={textPrompt}
        onChange={(e) => setTextPrompt(e.target.value)}
      />
      <button onClick={handleClick}>Generate Image</button>
      <ul>
        {/* Use the index as the key so repeated prompts don't collide */}
        {prompts.map((prompt, index) => (
          <li key={index}>{prompt}</li>
        ))}
      </ul>
    </div>
  );
};

export default App;
Now the application will accept text input and save it to an array that is displayed in the UI as an unordered list. The prompt input and array of submitted prompts are both useState hook variables.
Connect to the API endpoint
So far, the generateImage() function is a no-op. Let’s begin implementing it. The first step is to submit the text prompt to the Stable Diffusion endpoint. The Stable Diffusion API is RESTful in style and expects JSON requests and responses, as described in its documentation. The documentation lists the API endpoints, including the text-to-image endpoint we'll use here.
One wrinkle we have to deal with is that the Stable Diffusion API doesn’t accept requests made directly from the browser, because doing so exposes the API key. We’ll use a public CORS proxy to work around the restriction. Don’t do this in a real-world application where your application is exposed. In that case, you would use a back-end application to hold the key and add it to the request on behalf of the front end. In this case, we’ll use the CORS Anywhere proxy hosted on Heroku, which I'll demonstrate shortly.
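For comparison, here is a minimal sketch of what such a back-end application might look like. It assumes Node 18 or later (for the global fetch), and the names withApiKey and startProxy are hypothetical, not part of any library or of the Stable Diffusion API. The browser would post only the prompt, and the server would add the key before forwarding the request.

```javascript
// Hypothetical back-end proxy sketch (assumes Node 18+, which provides fetch).
// The API key stays on the server; the browser never sees it.
const UPSTREAM_URL = "https://stablediffusionapi.com/api/v3/text2img";

// Merge the secret key into whatever JSON body the browser sent.
function withApiKey(clientBody, apiKey) {
  return { ...clientBody, key: apiKey };
}

async function startProxy(port, apiKey) {
  // Dynamic import works in both CommonJS and ES module projects
  const http = await import("node:http");
  const server = http.createServer(async (req, res) => {
    if (req.method !== "POST") {
      res.writeHead(405).end();
      return;
    }
    try {
      // Collect the request body as UTF-8 text
      req.setEncoding("utf8");
      let raw = "";
      for await (const chunk of req) raw += chunk;

      // Forward the request upstream with the key attached
      const upstream = await fetch(UPSTREAM_URL, {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify(withApiKey(JSON.parse(raw || "{}"), apiKey)),
      });
      res.writeHead(upstream.status, { "Content-Type": "application/json" });
      res.end(await upstream.text());
    } catch (error) {
      // Any failure is reported as 502 in this sketch
      res.writeHead(502, { "Content-Type": "application/json" });
      res.end(JSON.stringify({ status: "error", message: String(error) }));
    }
  });
  server.listen(port);
  return server;
}

// Usage (sketch): startProxy(4000, process.env.SD_API_KEY);
```

The SD_API_KEY environment variable and port 4000 are placeholders. A front end served from a different origin would also need CORS headers from this server, or a development proxy in front of it.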
So, the general plan is to take the user's text and submit it to the API endpoint as a POST request. The JSON body holds fields for the API key and user text. The response comes back as JSON (the format is covered in the API documentation). The main thing we are after is the output array, which in our case has a single element: the URL of the generated image.
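As a rough sketch of that plan, the request and response handling can be pulled into two small helpers. The function names buildTextToImageRequest and extractImageUrl are hypothetical; only the key, prompt, and output fields come from the description above.

```javascript
// Hypothetical helpers; only the key, prompt, and output fields come from the article.

// Build the options object passed to fetch() for a text-to-image request.
function buildTextToImageRequest(apiKey, prompt) {
  return {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ key: apiKey, prompt }),
  };
}

// Return the first URL in the output array, or null if there is none.
function extractImageUrl(data) {
  if (data && Array.isArray(data.output) && data.output.length > 0) {
    return data.output[0];
  }
  return null;
}
```

Returning null instead of reading output[0] directly gives the caller a chance to react when the response holds no image URL.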
In the user interface, we’ll take the URL and put it in an image component next to the text prompt in the unordered list. Listing 2 shows the code to perform these tasks by evolving Listing 1.
Listing 2. Calling the API and displaying the generated image
import React, { useState } from "react";

const App = () => {
  const [textPrompt, setTextPrompt] = useState("");
  const [prompts, setPrompts] = useState([]);

  const generateImage = async () => {
    const apiKey = "YOUR KEY HERE";
    const url = "https://stablediffusionapi.com/api/v3/text2img";
    // Public CORS proxy, for local experimentation only
    const proxyUrl = "https://cors-anywhere.herokuapp.com/";
    const requestOptions = {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ key: apiKey, prompt: textPrompt }),
    };
    try {
      const response = await fetch(proxyUrl + url, requestOptions);
      const data = await response.json();
      // Get the image URL from the response
      const imageUrl = data.output[0];
      // Update the prompts array with the generated image URL. Here prompts is
      // the array captured before handleClick added its placeholder, so this
      // entry takes the placeholder's place
      setPrompts([...prompts, { prompt: textPrompt, imageUrl }]);
    } catch (error) {
      console.error("Error generating image:", error);
    }
  };

  const handleClick = async () => {
    // Show the prompt right away with an empty image slot
    setPrompts([...prompts, { prompt: textPrompt, imageUrl: "" }]);
    setTextPrompt("");
    await generateImage();
  };

  return (
    <div className="container">
      <input
        className="input-container"
        type="text"
        placeholder="Enter a text prompt"
        value={textPrompt}
        onChange={(e) => setTextPrompt(e.target.value)}
      />
      <button onClick={handleClick}>Generate Image</button>
      <ul className="prompts-list">
        {prompts.map((item, index) => (
          <li key={index} className="prompt-item">
            <p className="prompt-text">{item.prompt}</p>
            {/* Render the image only once its URL has arrived */}
            {item.imageUrl && (
              <img src={item.imageUrl} alt={item.prompt} className="generated-image" />
            )}
          </li>
        ))}
      </ul>
    </div>
  );
};

export default App;
Two important notes here: First, be sure to put your real API key where it says "YOUR KEY HERE." Second (once more), don’t do this in a public-facing web application because if you do, your key will be exposed.
Next, note that we’ve added the imageUrl field to each entry in the prompts useState variable, so we can track and display the generated images. Also notice the await in the handleClick() function, which waits for generateImage() to finish before the handler returns.
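Because a request can finish after other state changes, one option is to update state from the previous value rather than from the prompts variable captured when the handler ran. The following is a minimal sketch of that idea. The helper names and the id field are hypothetical additions, not part of the listing.

```javascript
// Hypothetical pure helpers, meant to be used as setPrompts((prev) => ...).
// Each entry carries an id so a finished request can find its own placeholder.

// Append a placeholder entry with an empty image URL.
function addPendingPrompt(prompts, id, prompt) {
  return [...prompts, { id, prompt, imageUrl: "" }];
}

// Return a new array in which the entry with the given id has its image URL set.
function attachImage(prompts, id, imageUrl) {
  return prompts.map((item) => (item.id === id ? { ...item, imageUrl } : item));
}

// Usage inside the component (sketch):
//   const id = Date.now();
//   setPrompts((prev) => addPendingPrompt(prev, id, textPrompt));
//   ...after the fetch resolves...
//   setPrompts((prev) => attachImage(prev, id, imageUrl));
```

Both helpers return new arrays instead of mutating their input, which is what React expects from a state update.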
We've also added some styling, which you'll see soon.
If you start this application with npm start and view it in a browser, the first time you submit a prompt, the Heroku CORS proxy may require you to request access. In the JavaScript console, you’ll see a message about missing permissions (status 401) and a link. Just click the link and, once the page opens, click the temporary access button, as shown in Figure 1.
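To make a refusal like this easier to spot, the code could check the HTTP status before trying to parse the body. This is only a sketch, and readJsonOrThrow is a hypothetical helper name. It relies on the standard ok and status properties of a fetch response.

```javascript
// Hypothetical helper: fail loudly on non-2xx responses before parsing JSON.
async function readJsonOrThrow(response) {
  if (!response.ok) {
    throw new Error("Request failed with HTTP status " + response.status);
  }
  return response.json();
}

// Usage (sketch):
//   const data = await readJsonOrThrow(await fetch(proxyUrl + url, requestOptions));
```

With this in place, the existing catch block would log the status code instead of a JSON parsing error.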
Now when you submit the prompts, the application will display the list of prompts and their images, as shown in Figure 2.
Fine-tune the application
Everything is working, but we could make several improvements. For one, the loading state displays for all images. For another, the Stable Diffusion API sometimes returns a status of processing with an ETA field for when the image should be completed. A more sophisticated UI would handle the processing status.
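Handling the processing status could start with a small function that classifies each response body. The sketch below is an assumption-laden outline: the status, output, and message fields appear in this article's code, but the eta field name and the five-second fallback are guesses that should be checked against the API documentation.

```javascript
// Hypothetical interpreter for the response body. The status, output, and
// message fields appear in this article; the "eta" field name is an assumption.
function interpretResponse(data) {
  if (data.status === "error") {
    return { state: "error", message: data.message };
  }
  if (data.status === "processing") {
    // Fall back to 5 seconds (an arbitrary choice) if no usable ETA is present
    return { state: "pending", retryInSeconds: Number(data.eta) || 5 };
  }
  if (Array.isArray(data.output) && data.output.length > 0) {
    return { state: "done", imageUrl: data.output[0] };
  }
  return { state: "error", message: "Response contained no image URL" };
}
```

A UI could show a per-image "loading" label while the state is pending and try again after the suggested delay.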
The first priority is to improve the styling. Let’s add CSS to make things look nicer, including displaying the image in a tighter format and styling the text.
Listing 3. The application with improved styling
.container {
  text-align: center;
  margin-top: 50px;
}

.input-container {
  margin-bottom: 20px;
}

input[type="text"] {
  padding: 10px;
  font-size: 16px;
  border: none;
  border-radius: 4px;
}

button {
  padding: 10px 20px;
  font-size: 16px;
  background-color: #007bff;
  color: #fff;
  border: none;
  border-radius: 4px;
  cursor: pointer;
}

.prompts-list {
  list-style-type: none;
  padding: 0;
  margin: 0;
}

.prompt-item {
  display: flex;
  align-items: center;
  margin-bottom: 10px;
}

.prompt-text {
  flex: 1;
  font-size: 18px;
}

.generated-image {
  width: 200px;
  height: 200px;
  object-fit: cover;
  border-radius: 4px;
}

.loading-text {
  font-size: 14px;
  font-style: italic;
  color: #aaa;
}
Now we can import the new styles by adding this line to the beginning of the App.js file:
import "./App.css";
Now, we’ll see the text input, button, and image history with a more reasonable look, as shown in Figure 3.
As a final improvement, let’s add the ability to click on an image and send it along with the prompt to Stable Diffusion's img2img endpoint, which combines an existing image with prompt text to produce a new output image URL. When clicked, the image issues a request that has an extra field in the JSON body: init_image, with the URL of the image you've clicked on. You can see the updated code in Listing 4, which has the added image-click handler and the updated generateImage() function.
Listing 4. Updated code
// Add next to the other useState hooks in App; generateImage() toggles it
const [isLoading, setIsLoading] = useState(false);

const generateImage = async (imageUrl) => {
  const apiKey = "YOUR KEY HERE";
  let url = "https://stablediffusionapi.com/api/v3/text2img";
  const proxyUrl = "https://cors-anywhere.herokuapp.com/";
  const requestBody = {
    key: apiKey,
    prompt: textPrompt,
  };
  // When an image was clicked, switch to img2img and add its required fields
  if (imageUrl) {
    requestBody.init_image = imageUrl;
    requestBody.samples = 1;
    requestBody.width = 800;
    requestBody.height = 800;
    url = "https://stablediffusionapi.com/api/v3/img2img";
  }
  const requestOptions = {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(requestBody),
  };
  try {
    setIsLoading(true);
    const response = await fetch(proxyUrl + url, requestOptions);
    const data = await response.json();
    if (data.status === "error") {
      console.error("Error generating image: " + data.message);
      alert(data.message);
      return;
    }
    // Get the image URL from the response
    const generatedImageUrl = data.output[0];
    // Update the prompts array with the generated image URL
    setPrompts([...prompts, { prompt: textPrompt + (imageUrl ? " (image2image)" : ""), imageUrl: generatedImageUrl }]);
  } catch (error) {
    console.error("Error generating image:", error);
  } finally {
    // Runs on success, on error, and after the early return above
    setIsLoading(false);
  }
};

//...

const handleImageClick = (imageUrl) => {
  console.log("Clicked on image:", imageUrl);
  generateImage(imageUrl);
};

// ...

<img onClick={() => handleImageClick(item.imageUrl)} src={item.imageUrl} alt={item.prompt} className="generated-image"/>
The main thing we do here is add a click handler to the image. If the imageUrl argument is present on generateImage(), we use the img2img URL and add the necessary arguments to the JSON body (this endpoint requires samples, width, and height parameters).
Now you can evolve images by clicking them and adding prompts, as shown in Figure 4.
Conclusion
Generative AI has really made its mark over the last year or so, and Stable Diffusion has proved itself to be one of the most important models, with its powerful ability to create images based on text prompts. By using a hosted API, as we’ve done here, we can gain access to AI-generated images from a web application without installing and training the model ourselves. In this article, you've seen several examples of how to integrate a React front end with the Stable Diffusion API, including how to take existing images and refine them with further text prompts.