Reactive JavaScript with Stable Diffusion

Use React and the Stable Diffusion API to build a reactive AI application that generates images from user-submitted text.

Intro to Stable Diffusion AI

In case you’ve been backpacking in the Himalayas for the last year, generative AI has recently become enormously popular. Text generators like OpenAI's ChatGPT and Google Bard are one type of generative AI model. Text-to-image generators are another. One of the leaders in the space is Stable Diffusion, an open source image generation AI system. 

We'll use Stable Diffusion's free trial API to build a React.js client that connects and interacts with the service.

Getting started with Stable Diffusion

Stable Diffusion is available from its GitHub repository. Several projects offer endpoints for interacting with hosted Stable Diffusion installs, so you can avoid having to set up and train the model yourself. One of the best interfaces to Stable Diffusion is the Stable Diffusion API, which offers a free trial. We’ll use the API to see how we can interact with Stable Diffusion in React, a front-end JavaScript library that is also free and open source.

To begin, we’ll use the create-react-app command-line tool to start a new React application. You’ll need Node and NPM installed for this. Then, install create-react-app with $ npm i -g create-react-app. Now, you can create a simple application with the command: $ create-react-app react-sd. To test it, move into the react-sd directory and type $ npm start, and visit the landing page at localhost:3000.

Build the React UI

Now that we have React and Stable Diffusion set up, let’s start building a UI that will let us enter a text prompt, send it to the Stable Diffusion API endpoint, and display the resultant image. We’ll use a simple scrollable pane to show the list of generated images.

First, we'll create a text box to receive the prompt and a Submit button to go with it, as shown in Listing 1. This code can replace React's default App.js file.

Listing 1. Add a text box and a Submit option to the basic App.js file


import React, { useState } from "react";

const App = () => {
  const [textPrompt, setTextPrompt] = useState("");
  const [prompts, setPrompts] = useState([]);

  const generateImage = async () => {
  };

  const handleClick = () => {
    setPrompts([...prompts, textPrompt]);
    setTextPrompt("");
    generateImage();
  };

  return (
    <div>
      <input
        type="text"
        placeholder="Enter a text prompt"
        value={textPrompt}
        onChange={(e) => setTextPrompt(e.target.value)}
      />
      <button onClick={handleClick}>Generate Image</button>
      <ul>
        {prompts.map((prompt, index) => (
          <li key={index}>{prompt}</li>
        ))}
      </ul>
    </div>
  );
};

export default App;

Now the application will accept text input and save it to an array that is displayed in the UI as an unordered list. The prompt input and array of submitted prompts are both useState hook variables.

Connect to the API endpoint

So far, the generateImage() function is a no-op. Let’s begin implementing it. The first step is to submit the text prompt to the Stable Diffusion endpoint. The Stable Diffusion API is RESTful in style and exchanges JSON requests and responses, as described here. The documentation lists the API endpoints, including the text-to-image endpoint, which is described here.

One wrinkle we have to deal with is that the Stable Diffusion API doesn’t accept requests sent directly from a browser. The restriction is there for security (a browser request exposes the API key), so we’ll use a public proxy to work around it. Don’t do this in a real-world application that is exposed to the public. In that case, you would use a back-end application to hold the key and add it to the request on the server. Here, we’ll use the CORS Anywhere proxy hosted on Heroku, which I'll demonstrate shortly.
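To make the back-end alternative concrete, here is a minimal sketch of the server-side logic. It is not part of the application built in this article. It assumes Node 18 or later (for the global fetch), and the function names and the SD_API_KEY environment variable mentioned afterward are hypothetical. The endpoint URL is the one used in Listing 2.

```javascript
// Sketch of server-side logic that keeps the API key out of the browser.
// Assumptions: Node 18+ (global fetch); function names are hypothetical.
const SD_TEXT2IMG_URL = "https://stablediffusionapi.com/api/v3/text2img";

// Build the upstream request from the prompt the browser sent.
// The browser never sees apiKey; only the server adds it.
function buildUpstreamRequest(prompt, apiKey) {
  if (typeof prompt !== "string" || prompt.trim() === "") {
    throw new Error("prompt must be a non-empty string");
  }
  return {
    url: SD_TEXT2IMG_URL,
    options: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ key: apiKey, prompt: prompt.trim() }),
    },
  };
}

// Forward the prompt and return the parsed JSON to the caller.
// fetchImpl is injectable so the function can be exercised without a network.
async function forwardPrompt(prompt, apiKey, fetchImpl = fetch) {
  const { url, options } = buildUpstreamRequest(prompt, apiKey);
  const response = await fetchImpl(url, options);
  return response.json();
}
```

A route handler in whatever server framework you use could call forwardPrompt() with the prompt from the browser and a key read from an environment variable such as SD_API_KEY, then send the result back to the React client.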

So, the general plan is to take the user's text and submit it to the API endpoint as a POST request. The JSON body holds fields for the API key and the user's text. The response is also JSON (more about the format here). The main thing we are after is the output array, which in our case has a single element: the URL of the generated image.
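Because the image URL is read from the first element of the output array, it helps to guard against a response that has no such array. The following is a small sketch of that idea. The helper name is hypothetical, the sample objects are made up for illustration, and returning null for a missing or empty array is a choice made for this example.

```javascript
// Sketch: read the image URL out of a parsed response body without throwing.
// Only the `output` array comes from the article; the rest is illustrative.
function firstImageUrl(data) {
  const output = data && data.output;
  if (!Array.isArray(output) || output.length === 0) {
    return null;
  }
  const url = output[0];
  return typeof url === "string" && url !== "" ? url : null;
}

// Hypothetical response bodies, for illustration only.
console.log(firstImageUrl({ output: ["https://example.com/image.png"] })); // prints https://example.com/image.png
console.log(firstImageUrl({ status: "processing" })); // prints null
```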

In the user interface, we’ll take the URL and put it in an image component next to the text prompt in the unordered list. Listing 2 shows the code to perform these tasks by evolving Listing 1. 

Listing 2. Calling the API and displaying the generated image


import React, { useState } from "react";

const App = () => {
  const [textPrompt, setTextPrompt] = useState("");
  const [prompts, setPrompts] = useState([]);

  const generateImage = async () => {
    const apiKey = "YOUR KEY HERE"; 
    const url = "https://stablediffusionapi.com/api/v3/text2img"; 
    const proxyUrl = "https://cors-anywhere.herokuapp.com/";

    const requestOptions = {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ key: apiKey, prompt: textPrompt }),
    };

    try {
      const response = await fetch(proxyUrl + url, requestOptions);
      const data = await response.json();
      
      // Get the image URL from the response
      const imageUrl = data.output[0];
      
      // Update the prompts array with the generated image URL
      setPrompts([...prompts, { prompt: textPrompt, imageUrl }]);
    } catch (error) {
      console.error("Error generating image:", error);
    }
  };
  
  const handleClick = async () => {
    setPrompts([...prompts, { prompt: textPrompt, imageUrl: "" }]);
    setTextPrompt("");
    await generateImage();
  };

  return (
    <div className="container">
      <input className="input-container"
        type="text"
        placeholder="Enter a text prompt"
        value={textPrompt}
        onChange={(e) => setTextPrompt(e.target.value)}
      />
      <button onClick={handleClick}>Generate Image</button>
      <ul className="prompts-list">
        {prompts.map((item, index) => (
          <li key={index} className="prompt-item">
            <p className="prompt-text">{item.prompt}</p>
            {item.imageUrl && <img src={item.imageUrl} alt="Generated Image" className="generated-image"/>}
          </li>
        ))}
      </ul>
    </div>
  );
};

export default App;

Two important notes here: First, be sure to put your real API key where it says "YOUR KEY HERE." Second (once more), don’t do this in a public-facing web application, because your key will be exposed.

Next, note that each entry in the prompts state variable now has an imageUrl field, so we can track and display the generated images. Also notice that handleClick() is now async and awaits generateImage(), which updates the list once the image URL arrives.
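One thing to watch with this pattern is that a state setter called with an array built from the prompts variable uses the value captured when the handler was created. React's state setters also accept a function that receives the latest state. The sketch below shows two pure helpers that could be passed that way, for example setPrompts((prev) => addPending(prev, text)). The helper names are hypothetical and are not part of the listings in this article.

```javascript
// Sketch: pure helpers for the prompts array, intended for use with the
// functional form of a React state setter. Names are hypothetical.

// Append an entry that has a prompt but no image yet.
function addPending(prompts, prompt) {
  return [...prompts, { prompt, imageUrl: "" }];
}

// Fill in the oldest entry for this prompt that is still waiting for an image.
// If no such entry exists, append a new one. The input array is not mutated.
function attachImage(prompts, prompt, imageUrl) {
  const index = prompts.findIndex(
    (item) => item.prompt === prompt && !item.imageUrl
  );
  if (index === -1) {
    return [...prompts, { prompt, imageUrl }];
  }
  return prompts.map((item, i) => (i === index ? { ...item, imageUrl } : item));
}
```

Keeping the helpers free of React calls means they can be checked in isolation before being wired into the component.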

We've also added some styling, which you'll see soon.

If you start this application with npm start and view it in a browser, the Heroku CORS proxy may require you to request access the first time you submit a prompt. In the JavaScript console, you’ll see a message about missing permissions (status 401) and a link. Just click the link and, once the page opens, hit the temporary access button, as shown in Figure 1.

Figure 1. Request CORS Proxy permission.

Now when you submit the prompts, the application will display the list of prompts and their images, as shown in Figure 2.

Figure 2. The user interface is working.

Fine-tune the application

Everything is working, but we could make several improvements. For one, loading is not tracked per image: a single loading state applies to every image at once. For another, the Stable Diffusion API sometimes returns a status of processing with an ETA field for when the image should be completed. A more sophisticated UI would handle the processing status.
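As a rough sketch of how the processing status might be handled, the function below maps a parsed response to one of three UI states. The processing status and the ETA come from the description above, and the error status and message field are the ones checked in Listing 4. The field name eta, its unit (seconds), the five-second fallback, and the function name are all assumptions made for this example.

```javascript
// Sketch: turn a parsed API response into one of three UI states.
// Assumed: the ETA field is named `eta` and is expressed in seconds.
function classifyResponse(data) {
  if (data.status === "error") {
    return { state: "error", message: data.message || "Unknown error" };
  }
  if (data.status === "processing") {
    const seconds = Number(data.eta);
    // Fall back to 5 seconds when the ETA is missing or not a positive number.
    const retryInMs =
      Number.isFinite(seconds) && seconds > 0 ? Math.ceil(seconds * 1000) : 5000;
    return { state: "pending", retryInMs };
  }
  if (Array.isArray(data.output) && data.output.length > 0) {
    return { state: "ready", imageUrl: data.output[0] };
  }
  return { state: "error", message: "Response contained no image" };
}
```

A component could show a per-item "still working" message for the pending state and try again after retryInMs, rather than treating a response without an image as a failure.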

The first priority is to improve the styling. Let’s add CSS to make things look nicer, including displaying the image in a tighter format and styling the text.

Listing 3. The application with improved styling


.container {
  text-align: center;
  margin-top: 50px;
}

.input-container {
  margin-bottom: 20px;
}

input[type="text"] {
  padding: 10px;
  font-size: 16px;
  border: none;
  border-radius: 4px;
}

button {
  padding: 10px 20px;
  font-size: 16px;
  background-color: #007bff;
  color: #fff;
  border: none;
  border-radius: 4px;
  cursor: pointer;
}

.prompts-list {
  list-style-type: none;
  padding: 0;
  margin: 0;
}

.prompt-item {
  display: flex;
  align-items: center;
  margin-bottom: 10px;
}

.prompt-text {
  flex: 1;
  font-size: 18px;
}

.generated-image {
  width: 200px;
  height: 200px;
  object-fit: cover;
  border-radius: 4px;
}

.loading-text {
  font-size: 14px;
  font-style: italic;
  color: #aaa;
}

Now we can import the new styles by adding this line to the beginning of the App.js file:


import "./App.css";

Now, we’ll see the text input, button, and image history with a more reasonable look, as shown in Figure 3.

Figure 3. The application with an improved layout.

As a final improvement, let’s make each image clickable. Clicking an image sends it, along with the current prompt, to Stable Diffusion's img2img endpoint, which combines an existing image with prompt text to produce a new output image URL. The request has one extra field in the JSON body: init_image, which holds the URL of the image you've clicked on. You can see the updated code in Listing 4, which has the added image-click handler and the updated generateImage() function.

Listing 4. Updated code


// Assumes the component also declares: const [isLoading, setIsLoading] = useState(false);
const generateImage = async (imageUrl) => {
  const apiKey = "YOUR KEY HERE";
  let url = "https://stablediffusionapi.com/api/v3/text2img";
  const proxyUrl = "https://cors-anywhere.herokuapp.com/";

  const requestBody = {
    key: apiKey,
    prompt: textPrompt,
  };

  if (imageUrl) {
    requestBody.init_image = imageUrl;
    requestBody.samples = 1;
    requestBody.width = 800; 
    requestBody.height = 800;
    url = "https://stablediffusionapi.com/api/v3/img2img";
  }

  const requestOptions = {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(requestBody),
  };

  try {
    setIsLoading(true);
    const response = await fetch(proxyUrl + url, requestOptions);
    const data = await response.json();

    if (data.status === "error") {
      console.error("Error generating image: " + data.message);
      alert(data.message);
      return;
    }
    // Get the image URL from the response
    const generatedImageUrl = data.output[0];

    // Update the prompts array with the generated image URL
    setPrompts([...prompts, { prompt: textPrompt + (imageUrl ? " (image2image)" : ""), imageUrl: generatedImageUrl }]);
  } catch (error) {
    console.error("Error generating image:", error);
  } finally {
    setIsLoading(false);
  }
};

// ...
const handleImageClick = (imageUrl) => {
  console.log("Clicked on image:", imageUrl);
  generateImage(imageUrl);
};

// ...
<img onClick={() => handleImageClick(item.imageUrl)} src={item.imageUrl} alt="Generated Image" className="generated-image"/>

The main change is a click handler on the image. When generateImage() receives an imageUrl argument, it switches to the img2img URL and adds the extra fields to the JSON body (this endpoint requires the samples, width, and height parameters).
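The endpoint selection can also be pulled out of generateImage() into a pure function, which makes the branching easier to check on its own. This is only a sketch of that refactoring. The function name is hypothetical, and the URLs and field values are the ones that appear in Listing 4.

```javascript
// Sketch: choose the endpoint and build the JSON body for either request type.
const TEXT2IMG = "https://stablediffusionapi.com/api/v3/text2img";
const IMG2IMG = "https://stablediffusionapi.com/api/v3/img2img";

function buildGenerationRequest(apiKey, prompt, initImageUrl) {
  const body = { key: apiKey, prompt };
  if (!initImageUrl) {
    // No source image: plain text-to-image request.
    return { url: TEXT2IMG, body };
  }
  // A source image was clicked: add the fields used in Listing 4.
  return {
    url: IMG2IMG,
    body: { ...body, init_image: initImageUrl, samples: 1, width: 800, height: 800 },
  };
}
```

With this in place, generateImage() would only need to call the function, serialize body with JSON.stringify(), and pass the result to fetch.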

Now you can evolve images by clicking them and adding prompts, as shown in Figure 4.

Figure 4. Image to image prompts.

Conclusion

Generative AI has made its mark over the last year or so, and Stable Diffusion has proved itself to be one of the most important models, with its powerful ability to create images based on text prompts. By using a hosted API, as we’ve done here, we can gain access to AI-generated images from a web application without installing and training the model ourselves. In this article, you've seen several examples of how to integrate a React front end with the Stable Diffusion API, including how to take existing images and refine them with further text prompts.

Copyright © 2023 IDG Communications, Inc.