Development of Stable Diffusion API
Stable Diffusion is an open-source AI model that generates images from text prompts. Running it on your own PC requires a GPU, although stable_diffusion.openvino can run it on a CPU.
I wanted an API that could be called from any PC and integrated with services like Slack, so I built one for Stable Diffusion.
Conclusion
I used stability-sdk, the SDK for DreamStudio.ai.
The code is in the following repository.
It works both locally and in a Docker container.
To run it, you need a DreamStudio.ai API key. Because it runs in Docker, it can be deployed to any service that can run a container, and no GPU is required.
I like GCP, so I deployed it to Cloud Run.
The API accepts a prompt in the form <url>/?prompt=<text> and returns an image.
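As a rough illustration of calling an endpoint of this shape from Python, here is a minimal client sketch that uses only the standard library. The function names and the base URL passed in are hypothetical, and it assumes the response body is the raw image bytes.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def build_image_url(base_url, prompt):
    """Return the request URL for a prompt, with the text URL-encoded."""
    return f"{base_url.rstrip('/')}/?{urlencode({'prompt': prompt})}"


def save_image(base_url, prompt, path):
    """Fetch the generated image and write the raw bytes to `path`."""
    with urlopen(build_image_url(base_url, prompt)) as response:
        data = response.read()
    with open(path, "wb") as f:
        f.write(data)
    return len(data)  # number of bytes written
```

Encoding the prompt matters because prompts usually contain spaces and punctuation that are not valid in a raw query string.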
When I used it on Slack, it looked like this.
So for now, I can generate images with Stable Diffusion through an API.
GPU and Design
Before settling on stability-sdk, I considered building my own environment to run Stable Diffusion. My notes from that design research are at the following link.
Specifically, I considered the following patterns.
- Run Stable Diffusion using the GPU of Google Colaboratory and publish it with a simple API
- Run Stable Diffusion using a GPU on a server (such as GCE or Cloud Run) and publish it with a simple API
- Run Stable Diffusion using a GPU in a batch job (Cloud Batch) only when necessary, with the API kicking off the batch process
The first option runs into Google Colaboratory's 12-hour usage limit, so it would need some way around that. I rejected it because I think circumventing the limit deviates from the original purpose.
I rejected the second option because the running costs would be tens to hundreds of thousands of yen.
The third option was actually the first one I came up with. Keeping a GPU server running all the time, as in the second option, would be wasteful, so I considered running generation as a batch job instead. When I actually built it, though, startup took more than 30 minutes (I haven't investigated the cause in depth), which made it impractical.
So, after some consideration, I concluded that stability-sdk was the quick solution: no maintenance and no always-on running costs.
Of course, there are disadvantages.
- Dependence on the SDK, which you can't control (you can't do img2img)
- Pay-as-you-go system
However, assuming it was for personal use, I judged the advantages to outweigh the disadvantages.
stability-sdk
DreamStudio.ai runs on Stable Diffusion, and stability-sdk is published as a client for its API. The SDK is written in Python, so you need Python to use it. Having read the source, though, I think it would be relatively easy to write an equivalent SDK in another language, because it talks to the service over gRPC. I put my API together quickly in Python with Flask and stability-sdk.
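A Flask wrapper around stability-sdk along these lines might look like the sketch below. It is not the code from the repository. The function names are hypothetical, reading the key from a STABILITY_KEY environment variable is an assumption, and so is serving the result as image/png. The request handling is kept in a plain function so it can be exercised without Flask or the SDK installed.

```python
import os
from urllib.parse import parse_qs


def handle_query(query_string, generate_image):
    """Turn a raw query string into (status, content_type, body)."""
    prompt = parse_qs(query_string).get("prompt", [""])[0].strip()
    if not prompt:
        return 400, "text/plain", b"missing 'prompt' query parameter"
    return 200, "image/png", generate_image(prompt)


def generate_with_stability_sdk(prompt):
    """Ask DreamStudio for one image and return its raw bytes."""
    # Imported lazily so the rest of the module works without the SDK.
    from stability_sdk import client
    import stability_sdk.interfaces.gooseai.generation.generation_pb2 as generation

    # Assumption: the API key is supplied via the STABILITY_KEY variable.
    api = client.StabilityInference(key=os.environ["STABILITY_KEY"])
    for response in api.generate(prompt=prompt):
        for artifact in response.artifacts:
            # Skip non-image artifacts and return the first image found.
            if artifact.type == generation.ARTIFACT_IMAGE:
                return artifact.binary
    raise RuntimeError("no image artifact returned")


def create_app(generate_image=generate_with_stability_sdk):
    """Build the Flask app; the generator is injectable for testing."""
    from flask import Flask, Response, request

    app = Flask(__name__)

    @app.route("/")
    def index():
        status, content_type, body = handle_query(
            request.query_string.decode("utf-8"), generate_image
        )
        return Response(body, status=status, mimetype=content_type)

    return app
```

Calling `create_app().run()` would start a local development server. Passing a stub as `generate_image` lets you check the routing without spending API credits.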
For now, it is an ultra-simple API that accepts only a prompt. stability-sdk exposes various other parameters, so I'm thinking of accepting those as well. It would also be fun to write something like Midjourney's Discord bot.
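One way to accept more parameters would be to whitelist a few query parameters and coerce them to the types the SDK expects. The sketch below is hypothetical. The names `seed`, `steps`, `cfg_scale`, `width`, and `height` are keyword arguments of the SDK's `generate` method as far as I know, but exposing them as query parameters under the same names is my assumption, not something the current API does.

```python
from urllib.parse import parse_qs

# Query parameter name -> type used to coerce its string value.
# Anything not listed here is ignored rather than passed to the SDK.
ALLOWED_PARAMS = {
    "seed": int,
    "steps": int,
    "cfg_scale": float,
    "width": int,
    "height": int,
}


def parse_generation_args(query_string):
    """Build keyword arguments for image generation from a query string."""
    params = parse_qs(query_string)
    if "prompt" not in params:
        raise ValueError("prompt is required")
    kwargs = {"prompt": params["prompt"][0]}
    for name, cast in ALLOWED_PARAMS.items():
        if name in params:
            kwargs[name] = cast(params[name][0])  # raises ValueError on bad input
    return kwargs
```

The resulting dictionary could then be passed on as `api.generate(**kwargs)`. Keeping an explicit whitelist avoids forwarding arbitrary user input to the SDK.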
In conclusion
If you use this API as the image source in Markdown, the image changes every time the Markdown is opened. You could pin the result by specifying both the prompt and a seed, but I find the changing image interesting too.
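For illustration, a small helper that produces such a Markdown image line might look like this. The helper name and base URL are hypothetical, and the `seed` query parameter is an assumption, since the API as described accepts only a prompt.

```python
from urllib.parse import urlencode


def markdown_image(base_url, prompt, seed=None):
    """Return a Markdown image line whose source is the generation API."""
    query = {"prompt": prompt}
    if seed is not None:
        # Hypothetical parameter: a fixed seed would pin the generated image.
        query["seed"] = seed
    return f"![{prompt}]({base_url.rstrip('/')}/?{urlencode(query)})"
```

Without a seed, every render of the page requests a fresh image, which is the behavior described above.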