Build an AI-Powered WhatsApp Sticker Generator with Python

Imagine sending your own custom memes or cartoons instead of ones from the internet. You can transform your selfies or photos into fun, stylized stickers using OpenAI's new GPT-Image-1 model. In this tutorial, we'll build a WhatsApp sticker generator in Python that applies various art styles, including caricature and Pixar-style filters, to your images.

You'll learn how to set up the OpenAI image editing API, capture or upload images in Colab, define funny text categories or use your own text, and process three stickers in parallel using multiple API keys for speed. By the end, you'll have a working sticker maker powered by GPT-Image-1 and custom text prompts.

Why GPT-Image-1?

We evaluated several cutting-edge image-generation models, including Gemini 2.0 Flash, Flux, and Phoenix, on the Leonardo.ai platform. In particular, all of these models struggled to render text and expressions correctly. For instance:

Google's Gemini 2.0 image API often produces misspelled or jumbled words even when given exact instructions. For example, when the requested text is 'Big Sale Today!', we get outputs like "Big Sale Todai" or random gibberish.
Flux delivers high image quality in general, but users report that it quickly introduces small errors into any text it renders. It makes tiny spelling mistakes or garbles letters, especially as the text gets longer. Flux also defaults to very similar faces, so all faces look the same unless the prompt is heavily constrained.
Phoenix is optimized for fidelity and prompt adherence, but like most diffusion models, it still treats text as a visual pattern and can introduce errors. We found that Phoenix generated a sticker with the correct wording only sporadically, and it tended to repeat the same default face for a given prompt.

Together, these limitations led us to adopt GPT-Image-1. Unlike the models above, GPT-Image-1 incorporates a specialized prompt pipeline that explicitly enforces correct text and expression changes.

Read more: How to run the Flux model?

How GPT-Image-1 Powers Image Editing

GPT-Image-1 is OpenAI's flagship multimodal model. It creates and edits images from text and image prompts to generate high-quality image outputs. Essentially, we can instruct GPT-Image-1 to apply an edit to a source image based on a text prompt. In our case, we use the images.edit API endpoint with GPT-Image-1 to apply fun, humorous filters and overlay text on an input photo to create stickers.

The prompt is carefully constructed to enforce a sticker-friendly output (1024×1024 PNG). GPT-Image-1 then acts as the AI-powered sticker creator: it changes the appearance of the subject in the photo and adds humorous text.

# Set up OpenAI clients for each API key (to run parallel requests)

clients = [OpenAI(api_key=key) for key in API_KEYS]

To do that, we create one OpenAI client per API key. With three keys, we can make three simultaneous API calls. This multi-key, multi-thread approach uses ThreadPoolExecutor and lets us generate 3 stickers in parallel on each run. As the code prints, it uses "3 API keys for SIMULTANEOUS generation", which dramatically speeds up sticker creation.
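A minimal sketch of the one-task-per-key pattern, with a stub standing in for the real API call. The names fake_edit_call and run_parallel are hypothetical, and time.sleep only imitates network latency; the point is that three tasks submitted to a ThreadPoolExecutor finish in roughly the time of the slowest one rather than the sum of all three.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed

# Hypothetical stand-in for an image-edit request; sleeps to imitate latency.
def fake_edit_call(client_idx, delay):
    time.sleep(delay)
    return f"sticker_{client_idx + 1}.png"

def run_parallel(delays):
    """Submit one task per (pretend) API key and collect results as they finish."""
    results = []
    start = time.time()
    with ThreadPoolExecutor(max_workers=len(delays)) as executor:
        futures = [executor.submit(fake_edit_call, idx, delay)
                   for idx, delay in enumerate(delays)]
        # as_completed yields futures in completion order, not submission order
        for future in as_completed(futures):
            results.append(future.result())
    return results, time.time() - start

results, elapsed = run_parallel([0.3, 0.2, 0.1])
print(sorted(results), f"{elapsed:.2f}s")
```

Run sequentially, the three stub calls would take about 0.6 seconds; in the pool, the elapsed time should stay close to the longest single delay.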

Step-by-Step Guide

Creating your own AI sticker generator may sound complex, but this guide breaks the whole process down. You'll begin by preparing the environment in Google Colab; then we will review the API, look at phrase categories, validate text, generate different artistic styles, and finally generate stickers in parallel. Each part is accompanied by code snippets and explanations so you can follow along easily. Now, let's move on to the code:

Installing and Running on Colab

To generate stickers, we need the right setup. This project uses the Python libraries PIL and rembg for basic image processing, and google-genai is installed for use in the Colab instance. The first step is to install the dependencies directly in your Colab notebook.

!pip install --upgrade openai google-genai pillow rembg
!pip install --upgrade onnxruntime
!pip install python-dotenv

OpenAI Integration and API Keys

After installation, import the modules and set up the API keys. The script creates one OpenAI client per API key, which lets the code distribute image-edit requests across multiple keys in parallel. The client list is then used by the sticker-generation functions.

API_KEYS = [  # 3 API keys
    "API KEY 1",
    "API KEY 2",
    "API KEY 3"
]

# Stickerverse
import os
import random
import base64
import threading
import time
from concurrent.futures import ThreadPoolExecutor, as_completed
from io import BytesIO

from openai import OpenAI
from PIL import Image
from rembg import remove
from google.colab import files
from google.colab.output import eval_js
from IPython.display import display, Javascript

# One client per API key so requests can run in parallel
clients = [OpenAI(api_key=key) for key in API_KEYS]
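Hard-coded keys are easy to leak when a notebook is shared. A hedged alternative, assuming the keys are exported as environment variables named OPENAI_API_KEY_1 to OPENAI_API_KEY_3 (these variable names and the helper load_api_keys are illustrative and not part of the original script):

```python
import os

def load_api_keys(env=None, prefix="OPENAI_API_KEY_", count=3):
    """Collect up to `count` keys from an environment-style mapping."""
    env = os.environ if env is None else env
    keys = [env.get(f"{prefix}{i}", "").strip() for i in range(1, count + 1)]
    keys = [key for key in keys if key]  # drop missing or blank entries
    if not keys:
        raise RuntimeError(f"No API keys found; expected {prefix}1..{prefix}{count}")
    return keys

# Example with an explicit mapping instead of the real environment
demo_env = {"OPENAI_API_KEY_1": "key-a", "OPENAI_API_KEY_2": "key-b"}
print(len(load_api_keys(demo_env)))
```

The literal list could then be replaced by API_KEYS = load_api_keys(). If fewer than three keys are available, the client index would need to wrap around, for example idx % len(clients).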

Image upload & camera capture (logic)

The next step is to access the camera to capture a photo, or to upload an image file. capture_photo() uses JavaScript injected into Colab to open the webcam and return a captured image. upload_image() uses Colab's file upload widget and verifies the uploaded file with PIL.

# Camera capture via JS
def capture_photo(filename="photo.jpg", quality=0.9):
    js_code = """
    async function takePhoto(quality) {
        const div = document.createElement('div');
        const video = document.createElement('video');
        const btn = document.createElement('button');
        btn.textContent = "📸 Capture";
        div.appendChild(video);
        div.appendChild(btn);
        document.body.appendChild(div);
        const stream = await navigator.mediaDevices.getUserMedia({video: true});
        video.srcObject = stream;
        await video.play();
        await new Promise(resolve => btn.onclick = resolve);
        const canvas = document.createElement('canvas');
        canvas.width = video.videoWidth;
        canvas.height = video.videoHeight;
        canvas.getContext('2d').drawImage(video, 0, 0);
        stream.getTracks().forEach(track => track.stop());
        div.remove();
        return canvas.toDataURL('image/jpeg', quality);
    }
    """
    display(Javascript(js_code))
    # The JS function returns a data URL; keep only the base64 payload
    data = eval_js("takePhoto(%f)" % quality)
    binary = base64.b64decode(data.split(',')[1])
    with open(filename, 'wb') as f:
        f.write(binary)
    print(f"Saved: {filename}")
    return filename

# Image upload function
def upload_image():
    print("Please upload your image file...")
    uploaded = files.upload()
    if not uploaded:
        print("No file uploaded!")
        return None
    filename = list(uploaded.keys())[0]
    print(f"Uploaded: {filename}")
    # Validate that the file is an image
    try:
        img = Image.open(filename)
        img.verify()
        print(f"📸 Image verified: {img.format} {img.size}")
        return filename
    except Exception as e:
        print(f"Invalid image file: {str(e)}")
        return None

# Interactive image source selection
def select_image_source():
    print("Choose image source:")
    print("1. Capture from camera")
    print("2. Upload image file")
    while True:
        try:
            choice = input("Select option (1-2): ").strip()
            if choice == "1":
                return "camera"
            elif choice == "2":
                return "upload"
            else:
                print("Invalid choice! Please enter 1 or 2.")
        except KeyboardInterrupt:
            print("\nGoodbye!")
            return None
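The capture step relies on the fact that canvas.toDataURL returns a string of the form data:image/jpeg;base64,<payload>, which is why the code splits on the comma before decoding. A small sketch of that decoding step with basic validation; decode_data_url is a hypothetical helper, not part of the original script:

```python
import base64

def decode_data_url(data_url):
    """Split a 'data:<mime>;base64,<payload>' string into (mime_type, raw_bytes)."""
    header, sep, payload = data_url.partition(",")
    if not sep or not header.startswith("data:") or not header.endswith(";base64"):
        raise ValueError("Expected a base64 data URL")
    mime_type = header[len("data:"):-len(";base64")]
    return mime_type, base64.b64decode(payload)

# Build a tiny fake data URL to show the round trip
sample = "data:image/jpeg;base64," + base64.b64encode(b"\xff\xd8demo").decode("ascii")
mime, raw = decode_data_url(sample)
print(mime, len(raw))
```

Checking the header first gives a clear error if the browser returns something unexpected, instead of an IndexError from the split.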


Examples of Categories and Phrases

Now we'll create the phrase categories to put on our stickers. We use a PHRASE_CATEGORIES dictionary that contains many categories, such as corporate, Bollywood, Hollywood, Tollywood, sports, memes, and others. When a category is selected, the code randomly picks three unique phrases, one for each of the three sticker styles.

PHRASE_CATEGORIES = {
    "corporate": [
        "Another meeting? May the force be with you!",
        "Monday blues activated!",
        "This could have been an email, boss!"
    ],
    "bollywood": [
        "Mogambo khush hua!",
        "Kitne aadmi the?",
        "Picture abhi baaki hai mere dost!"
    ],
    "memes": [
        "Bhagwan bharose!",
        "Main thak gaya hoon!",
        "Beta tumse na ho payega!"
    ]
}
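Picking three unique phrases maps naturally onto random.sample, which raises ValueError if a category holds fewer phrases than requested. The sketch below wraps that call with clearer errors; pick_phrases and the demo dictionary are illustrative names, not taken from the original script.

```python
import random

def pick_phrases(categories, category, count=3, rng=random):
    """Return `count` distinct phrases from a category, or raise a clear error."""
    phrases = categories.get(category)
    if phrases is None:
        raise KeyError(f"Unknown category: {category!r}")
    if len(phrases) < count:
        raise ValueError(f"{category!r} has only {len(phrases)} phrases, need {count}")
    # sample() draws without replacement, so the phrases never repeat
    return rng.sample(phrases, count)

demo = {"corporate": ["Monday blues activated!",
                      "This could have been an email, boss!",
                      "Another meeting? May the force be with you!"]}
print(pick_phrases(demo, "corporate"))
```

Passing a seeded random.Random instance as rng makes the selection reproducible, which can help when debugging prompts.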

Phrase Categories and Custom Text

The generator uses a dictionary of phrase categories. The user can either select a category for random phrase selection or enter their own custom phrase. There are also helper functions for interactive selection, as well as a simple function that validates the length of a custom phrase.

def select_category_or_custom():
    print("\nChoose your sticker text option:")
    print("1. Pick from phrase category (random selection)")
    print("2. Enter my own custom phrase")
    while True:
        try:
            choice = input("Choose option (1 or 2): ").strip()
            if choice == "1":
                return "category"
            elif choice == "2":
                return "custom"
            else:
                print("Invalid choice! Please enter 1 or 2.")
        except KeyboardInterrupt:
            print("\nGoodbye!")
            return None

# Get a custom phrase from the user
def get_custom_phrase():
    while True:
        phrase = input("\nEnter your custom sticker text (2-50 characters): ").strip()
        if len(phrase) < 2:
            print("Too short! Please enter at least 2 characters.")
            continue
        elif len(phrase) > 50:
            print("Too long! Please keep it to 50 characters or fewer.")
            continue
        else:
            print(f"Custom phrase accepted: '{phrase}'")
            return phrase

For custom phrases, the input length is checked (2 to 50 characters) before the phrase is accepted.
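The same length rule can be expressed as a pure function, which is easier to test than an input() loop. This is a sketch under the 2 to 50 character limits stated above; validate_phrase is a hypothetical helper and the messages are illustrative.

```python
def validate_phrase(phrase, min_len=2, max_len=50):
    """Return (ok, value): the trimmed phrase on success, an error message otherwise."""
    cleaned = phrase.strip()
    if len(cleaned) < min_len:
        return False, f"Too short! Please enter at least {min_len} characters."
    if len(cleaned) > max_len:
        return False, f"Too long! Please keep it to {max_len} characters or fewer."
    return True, cleaned

for candidate in ["  a ", "Monday blues activated!", "x" * 51]:
    ok, value = validate_phrase(candidate)
    print(ok, value[:40])
```

An interactive loop can then call validate_phrase on each input and repeat until ok is True.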

Phrase Validation and Spelling Guardrails

def validate_and_correct_spelling(text):
    spelling_prompt = f"""
    Please check the spelling and grammar of the following text and return ONLY the corrected version.
    Do not add explanations, comments, or change the meaning.
    Text to check: "{text}"
    """
    # A low temperature keeps the correction close to the input
    response = clients[0].chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": spelling_prompt}],
        max_tokens=100,
        temperature=0.1
    )
    corrected_text = response.choices[0].message.content.strip()
    return corrected_text
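Because the corrected text ends up printed on the sticker, it can be worth guarding against a failed or over-eager correction. The following is a sketch of one possible fallback, not the original script's behavior: safe_correct is a hypothetical wrapper, corrector is any function that maps text to corrected text (for example the spelling validator), and the max_growth threshold is an arbitrary assumption.

```python
def safe_correct(text, corrector, max_growth=2.0):
    """Run corrector(text) but fall back to the original text if the result looks unsafe."""
    try:
        corrected = (corrector(text) or "").strip()
    except Exception:
        return text  # network or API failure: keep the user's wording
    # Remove surrounding quotes in case the model echoes them from the prompt
    corrected = corrected.strip("\"'").strip()
    if not corrected:
        return text
    if len(corrected) > max_growth * max(len(text), 1):
        return text  # much longer than the input, likely an explanation
    return corrected

# Stub correctors stand in for the real API call
print(safe_correct("Big Sale Todai", lambda t: '"Big Sale Today!"'))
print(safe_correct("hi", lambda t: "Here is the corrected text: hi there everyone"))
```

The length check is deliberately crude; it only catches responses that are obviously not a light-touch correction.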

Next, we'll create a sample build_prompt function that sets up some basic instructions for the model. Note that build_prompt() calls the spelling validator and then embeds the corrected text into the strict sticker prompt:

# Concise prompt builder with spelling validation
def build_prompt(text, style_variant):
    corrected_text = validate_and_correct_spelling(text)
    base_prompt = f"""
    Create a HIGH-QUALITY WhatsApp sticker in {style_variant} style.
    OUTPUT:
    - 1024x1024 transparent PNG with 8px white border
    - Subject centered, balanced composition, sharp details
    - Preserve original facial identity and proportions
    - Match expression to sentiment of text: '{corrected_text}'
    TEXT:
    - Use EXACT text: '{corrected_text}' (no changes, no emojis)
    - Bold comic font with black outline, high-contrast colors
    - Place text in empty space (top/bottom), never covering the face
    RULES:
    - No hallucinated elements or decorative glyphs
    - No cropping of head/face or text
    - Maintain realistic but expressive look
    - Ensure consistency across stickers
    """
    return base_prompt.strip()
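The prompt asks for a 1024x1024 transparent PNG, but a prompt is a request rather than a guarantee. One lightweight way to check what came back is to read the PNG header directly. The sketch below uses only the standard library; png_info is a hypothetical helper, and it detects an alpha channel only through the IHDR color type, so palette images that use a tRNS chunk for transparency are not recognized.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

def png_info(data):
    """Read width, height, and an alpha-channel flag from the IHDR chunk of PNG bytes."""
    if len(data) < 26 or data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("Not a PNG file")
    width, height = struct.unpack(">II", data[16:24])
    color_type = data[25]
    has_alpha = color_type in (4, 6)  # grayscale+alpha or RGBA
    return width, height, has_alpha

# Minimal fake header: signature + IHDR chunk for a 1024x1024 RGBA image
header = PNG_SIGNATURE + struct.pack(">I4sIIBBBBB", 13, b"IHDR", 1024, 1024, 8, 6, 0, 0, 0)
print(png_info(header))
```

In the sticker script, a check like this could run on the decoded image bytes before they are written to disk.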

Style Variants: Caricature vs Pixar

The three style templates live in STYLE_VARIANTS. The first two are caricature transformations and the last is a Pixar-esque 3D look. These strings are sent directly to the prompt builder and dictate the visual style.

STYLE_VARIANTS = [
    "Transform into detailed caricature with slightly exaggerated facial features...",
    "Transform into expressive caricature with enhanced personality features...",
    "Transform into high-quality Pixar-style 3D animated character..."
]

Generating Stickers in Parallel

The real power of the project is parallel sticker generation. All three stickers are generated at the same time on separate threads, each using its own API key, so wait times are dramatically reduced.

# Generate a single sticker using OpenAI gpt-image-1 with a specific client (with timing)
def generate_single_sticker(input_path, output_path, text, style_variant, client_idx):
    try:
        start_time = time.time()
        thread_id = threading.current_thread().name
        print(f"[START] Thread-{thread_id}: API-{client_idx+1} generating {style_variant[:30]}... at {time.strftime('%H:%M:%S', time.localtime(start_time))}")
        prompt = build_prompt(text, style_variant)
        # Open the source photo in a context manager so the handle is closed afterwards
        with open(input_path, "rb") as image_file:
            result = clients[client_idx].images.edit(
                model="gpt-image-1",
                image=[image_file],
                prompt=prompt,
                # input_fidelity="high"
                quality='medium'
            )
        image_base64 = result.data[0].b64_json
        image_bytes = base64.b64decode(image_base64)
        with open(output_path, "wb") as f:
            f.write(image_bytes)
        end_time = time.time()
        duration = end_time - start_time
        style_type = "Caricature" if "caricature" in style_variant.lower() else "Pixar"
        print(f"[DONE] Thread-{thread_id}: {style_type} saved as {output_path} | Duration: {duration:.2f}s | Text: '{text[:30]}...'")
        return True
    except Exception as e:
        print(f"[ERROR] API-{client_idx+1} failed: {str(e)}")
        return False

# Create stickers with a custom phrase (all 3 styles use the same custom text)
def create_custom_stickers_parallel(photo_file, custom_text):
    print(f"\nCreating 3 stickers with your custom phrase: '{custom_text}'")
    print("   • Style 1: Caricature #1")
    print("   • Style 2: Caricature #2")
    print("   • Style 3: Pixar Animation")
    # Map futures to their info
    tasks_info = {}
    with ThreadPoolExecutor(max_workers=3, thread_name_prefix="CustomSticker") as executor:
        start_time = time.time()
        print(f"\n[PARALLEL START] Submitting 3 API calls SIMULTANEOUSLY at {time.strftime('%H:%M:%S', time.localtime(start_time))}")
        # Submit ALL tasks at once (non-blocking), all using the same custom text
        for idx, style_variant in enumerate(STYLE_VARIANTS):
            output_name = f"custom_sticker_{idx+1}.png"
            future = executor.submit(generate_single_sticker, photo_file, output_name, custom_text, style_variant, idx)
            tasks_info[future] = {
                'output_name': output_name,
                'text': custom_text,
                'style_variant': style_variant,
                'client_idx': idx,
                'submit_time': time.time()
            }
        print("All 3 API requests submitted! Processing as they complete...")
        completed = 0
        completion_times = []
        # Process results as they complete
        for future in as_completed(tasks_info.keys(), timeout=180):
            try:
                success = future.result()
                task_info = tasks_info[future]
                if success:
                    completed += 1
                    completion_time = time.time()
                    completion_times.append(completion_time)
                    duration = completion_time - task_info['submit_time']
                    style_type = "Caricature" if "caricature" in task_info['style_variant'].lower() else "Pixar"
                    print(f"[{completed}/3] {style_type} completed: {task_info['output_name']} "
                          f"(API-{task_info['client_idx']+1}, {duration:.1f}s)")
                else:
                    print(f"Failed: {task_info['output_name']}")
            except Exception as e:
                task_info = tasks_info[future]
                print(f"Error with {task_info['output_name']} (API-{task_info['client_idx']+1}): {str(e)}")
        total_time = time.time() - start_time
        print(f"\n[FINAL RESULT] {completed}/3 custom stickers completed in {total_time:.1f} seconds!")

# Create 3 stickers in PARALLEL (using as_completed)
def create_category_stickers_parallel(photo_file, category):
    if category not in PHRASE_CATEGORIES:
        print(f"Category '{category}' not found! Available: {list(PHRASE_CATEGORIES.keys())}")
        return
    # Choose 3 unique phrases for 3 stickers
    chosen_phrases = random.sample(PHRASE_CATEGORIES[category], 3)
    print(f"Selected phrases for {category.title()} category:")
    for i, phrase in enumerate(chosen_phrases, 1):
        style_type = "Caricature" if i <= 2 else "Pixar Animation"
        print(f"   {i}. [{style_type}] '{phrase}' → API Key {i}")
    # Map futures to their info
    tasks_info = {}
    with ThreadPoolExecutor(max_workers=3, thread_name_prefix="StickerGen") as executor:
        start_time = time.time()
        print(f"\n[PARALLEL START] Submitting 3 API calls SIMULTANEOUSLY at {time.strftime('%H:%M:%S', time.localtime(start_time))}")
        # Submit ALL tasks at once (non-blocking)
        for idx, (style_variant, text) in enumerate(zip(STYLE_VARIANTS, chosen_phrases)):
            output_name = f"{category}_sticker_{idx+1}.png"
            future = executor.submit(generate_single_sticker, photo_file, output_name, text, style_variant, idx)
            tasks_info[future] = {
                'output_name': output_name,
                'text': text,
                'style_variant': style_variant,
                'client_idx': idx,
                'submit_time': time.time()
            }
        print("All 3 API requests submitted! Processing as they complete...")
        print("   • API Key 1 → Caricature #1")
        print("   • API Key 2 → Caricature #2")
        print("   • API Key 3 → Pixar Animation")
        completed = 0
        completion_times = []
        # Process results as they complete (NOT in submission order)
        for future in as_completed(tasks_info.keys(), timeout=180):  # 3 minute total timeout
            try:
                success = future.result()  # Returns immediately: this future is already done
                task_info = tasks_info[future]
                if success:
                    completed += 1
                    completion_time = time.time()
                    completion_times.append(completion_time)
                    duration = completion_time - task_info['submit_time']
                    style_type = "Caricature" if "caricature" in task_info['style_variant'].lower() else "Pixar"
                    print(f"[{completed}/3] {style_type} completed: {task_info['output_name']} "
                          f"(API-{task_info['client_idx']+1}, {duration:.1f}s) - '{task_info['text'][:30]}...'")
                else:
                    print(f"Failed: {task_info['output_name']}")
            except Exception as e:
                task_info = tasks_info[future]
                print(f"Error with {task_info['output_name']} (API-{task_info['client_idx']+1}): {str(e)}")
        total_time = time.time() - start_time
        print(f"\n[FINAL RESULT] {completed}/3 stickers completed in {total_time:.1f} seconds!")
        if len(completion_times) > 1:
            fastest_completion = min(completion_times) - start_time
            print(f"Parallel efficiency: Fastest completion in {fastest_completion:.1f}s")

Here, generate_single_sticker() builds the prompt and calls the images.edit endpoint using the client selected by client_idx. The parallel functions create a ThreadPoolExecutor with max_workers=3, submit the three tasks, and process the results with as_completed. This lets the script log each sticker as soon as it finishes. The logs also show what is happening on each thread: the timing and whether the sticker was caricature or Pixar style.
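One detail worth knowing about as_completed(..., timeout=...): if some futures are still pending when the timeout expires, the TimeoutError is raised by the iterator itself, so a try/except placed inside the loop body does not catch it. A small sketch of handling that case, using a sleeping stub instead of a real API call (slow_task and collect_with_timeout are hypothetical names):

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed, TimeoutError

def slow_task(delay):
    time.sleep(delay)
    return delay

def collect_with_timeout(delays, timeout):
    """Return (finished_results, timed_out) for a batch of stub tasks."""
    finished, timed_out = [], False
    with ThreadPoolExecutor(max_workers=len(delays)) as executor:
        futures = [executor.submit(slow_task, delay) for delay in delays]
        try:
            for future in as_completed(futures, timeout=timeout):
                finished.append(future.result())
        except TimeoutError:
            # Raised by the as_completed iterator, not by future.result()
            timed_out = True
    return finished, timed_out

print(collect_with_timeout([0.05, 1.0], timeout=0.3))
```

Note that leaving the with block still waits for the slow task to finish; the timeout only bounds how long the loop waits for results.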

Main execution block

At the bottom of the script, the __main__ guard defaults to running sticker_from_camera(). However, you can comment or uncomment lines as needed to run interactive_menu(), create_all_category_stickers(), or other functions.

# Main execution
if __name__ == "__main__":
    sticker_from_camera()


For the complete version of this WhatsApp sticker generator code, visit this GitHub repository.

Conclusion

In this tutorial, we walked through setting up GPT-Image-1 calls, constructing an extended prompt for stickers, capturing or uploading images, selecting amusing phrases or custom text, and running 3 style variants simultaneously. In just a few hundred lines of code, this project converts your pictures into comic-styled stickers.

By combining OpenAI's image model with some creative prompt engineering and multi-threading, you can generate fun, personalized stickers in seconds. The result is an AI-based WhatsApp sticker generator that produces instantly shareable stickers you can send to any of your friends and groups with a single click. Now try it with your own photo and your favorite joke!

Frequently Asked Questions

Q1. What does the AI-Powered WhatsApp Sticker Generator do?

A. It transforms your uploaded or captured photos into fun, stylized WhatsApp stickers with text using OpenAI's GPT-Image-1 model.

Q2. Why is GPT-Image-1 better than other image models?

A. In our tests, GPT-Image-1 handled text accuracy and facial expressions better than models like Gemini, Flux, or Phoenix, so stickers have correct wording and expressive visuals.

Q3. How does the script speed up sticker generation?

A. It uses three OpenAI API keys and a ThreadPoolExecutor to generate three stickers in parallel, cutting down processing time.

Hello! I'm Vipin, a passionate data science and machine learning enthusiast with a strong foundation in data analysis, machine learning algorithms, and programming. I have hands-on experience in building models, managing messy data, and solving real-world problems. My goal is to apply data-driven insights to create practical solutions that drive results. I'm eager to contribute my skills in a collaborative environment while continuing to learn and grow in the fields of Data Science, Machine Learning, and NLP.
