Generating 3D Game Assets with ChatGPT and Shap-E


I’ve always had this dream of building a video game, but creating 3D assets has always been a painful barrier. In this tutorial, we are going to create 3D game assets automatically using ChatGPT and Shap-E. ✨

More specifically, we’re going to use the ChatGPT API to generate ideas for different game items (e.g., Iron Sword) and characters (e.g., Farmer, Cow), which we’ll then pass to Shap-E, a new model by OpenAI that generates 3D objects from text.

We’ll also write some code to convert the 3D models to standard 3D file formats (glTF, obj), so you can import them into your favorite game editor (Unity, Godot, Unreal, …).

Step 1: Generate game items and characters using ChatGPT API

Start by installing and importing the OpenAI library:

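# openai.ChatCompletion (used below) is the pre-1.0 interface, so pin the library.
%pip install "openai<1"

import json  # needed for json.loads below

import openai
openai.api_key = "<YOUR_API_KEY>"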

You can now do something like this to generate a list of game item ideas:

items = json.loads(openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{
        "role": "user",
        "content": """
            Generate 9 items for a medieval MMORPG game.
            Each item should contain a name (e.g Iron Sword), a short visual description (e.g A razor sharp sword).
            The output should be a JSON array, where each item is a JSON object with name, short_visual_description, and skills (attack / strength / defence).
            The short visual description should be maximum 4 words.
            Return only the JSON array.
        """
    }],
)["choices"][0]["message"]["content"])

This will generate a list such as:

[{'name': 'Dragonfire Staff',
  'short_visual_description': 'Fiery wooden staff',
  'skills': {'attack': 20, 'strength': 10, 'defence': 5}},
 {'name': 'Mithril Plate Armor',
  'short_visual_description': 'Shiny silver armor',
  'skills': {'attack': 0, 'strength': 5, 'defence': 25}},
 {'name': 'Elven Bow',
  'short_visual_description': 'Graceful wooden bow',
  'skills': {'attack': 15, 'strength': 5, 'defence': 0}},
 {'name': 'Dwarven Warhammer',
  'short_visual_description': 'Heavy metal hammer',
  'skills': {'attack': 25, 'strength': 20, 'defence': 5}},
 {'name': 'Enchanted Dagger',
  'short_visual_description': 'Glowing sharp blade',
  'skills': {'attack': 10, 'strength': 5, 'defence': 0}},
 {'name': 'Holy Avenger',
  'short_visual_description': 'Radiant longsword',
  'skills': {'attack': 30, 'strength': 15, 'defence': 10}},
 {'name': 'Shadow Cloak',
  'short_visual_description': 'Dark flowing cloak',
  'skills': {'attack': 0, 'strength': 0, 'defence': 20}},
 {'name': 'Giant Slayer',
  'short_visual_description': 'Massive two-handed sword',
  'skills': {'attack': 35, 'strength': 25, 'defence': 5}},
 {'name': "Siren's Song",
  'short_visual_description': 'Enchanted lute',
  'skills': {'attack': 5, 'strength': 0, 'defence': 0}}]
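The json.loads call above assumes the reply is nothing but a JSON array. Chat models sometimes wrap the array in a Markdown fence or add a sentence around it, so a slightly more forgiving parser can help. The following is a minimal sketch; parse_json_array and the sample reply are hypothetical names and data, not part of the OpenAI library:

```python
import json


def parse_json_array(reply: str) -> list:
    """Parse the JSON array found between the first '[' and the last ']'."""
    start = reply.find("[")
    end = reply.rfind("]")
    if start == -1 or end < start:
        raise ValueError("no JSON array found in reply")
    # json.loads still raises if the extracted text is not valid JSON.
    return json.loads(reply[start:end + 1])


# Hypothetical reply wrapped in a Markdown fence (assumed, for illustration).
fence = "`" * 3
reply = fence + 'json\n[{"name": "Iron Sword", "short_visual_description": "A razor sharp sword"}]\n' + fence

items_example = parse_json_array(reply)
print(items_example[0]["name"])
```

In the snippet above, you would pass the message content to parse_json_array instead of calling json.loads on it directly.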

And we can do something similar to generate characters:

characters = json.loads(openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{
        "role": "user",
        "content": """
            Generate 9 characters for a medieval MMORPG game, like World of Warcraft, MapleStory, or RuneScape.
            Each character should contain a name (e.g Farmer, Cow, Goblin), a short visual description (e.g A green goblin).
            The output should be a JSON array, where each item is a JSON object with name, short_visual_description, and combat level.
            The short visual description should be maximum 4 words.
            Return only the JSON array.
        """
    }],
)["choices"][0]["message"]["content"])

This will generate a list such as:

[{'name': 'Grimm',
  'short_visual_description': 'Dark hooded figure',
  'combat level': 50},
 {'name': 'Thorn',
  'short_visual_description': 'Hulking brute',
  'combat level': 65},
 {'name': 'Sylph',
  'short_visual_description': 'Graceful elf archer',
  'combat level': 40},
 {'name': 'Grendel',
  'short_visual_description': 'Massive troll',
  'combat level': 75},
 {'name': 'Raven',
  'short_visual_description': 'Sneaky thief',
  'combat level': 30},
 {'name': 'Aurora',
  'short_visual_description': 'Radiant sorceress',
  'combat level': 55},
 {'name': 'Frost', 'short_visual_description': 'Ice mage', 'combat level': 45},
 {'name': 'Balthazar',
  'short_visual_description': 'Fiery demon',
  'combat level': 80},
 {'name': 'Oberon',
  'short_visual_description': 'Regal fairy king',
  'combat level': 70}]
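Since the model is free to ignore parts of the prompt, it can be worth checking the generated entries before sending them to Shap-E. The sketch below is one possible check, assuming the only fields we rely on later are name and short_visual_description; check_entries and the two faulty sample entries are hypothetical:

```python
def check_entries(entries, required_keys, max_words=4):
    """Return a list of problems found; an empty list means every entry passed."""
    problems = []
    for index, entry in enumerate(entries):
        missing = [key for key in required_keys if key not in entry]
        if missing:
            problems.append(f"entry {index}: missing {missing}")
            continue
        # The prompt asks for descriptions of at most four words.
        word_count = len(entry.get("short_visual_description", "").split())
        if word_count > max_words:
            problems.append(f"entry {index}: description has {word_count} words")
    return problems


sample = [
    # First entry is taken from the output above; the other two are made up.
    {"name": "Grimm", "short_visual_description": "Dark hooded figure", "combat level": 50},
    {"name": "Thorn"},
    {"name": "Sylph", "short_visual_description": "A very graceful elven archer"},
]

for problem in check_entries(sample, ("name", "short_visual_description")):
    print(problem)
```

If the list of problems is not empty, the simplest recovery is usually to call the API again.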

Step 2: Create 3D Objects From Text Using Shap-E

First, install Shap-E, which is available from OpenAI’s shap-e GitHub repository. Then import the library and load the models:

import torch

from shap_e.diffusion.sample import sample_latents
from shap_e.diffusion.gaussian_diffusion import diffusion_from_config
from shap_e.models.download import load_model, load_config
from shap_e.util.notebooks import decode_latent_mesh

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

xm = load_model('transmitter', device=device)
model = load_model('text300M', device=device)
diffusion = diffusion_from_config(load_config('diffusion'))

Next, we’ll generate a 3D mesh for each one of the items and characters above. We’ll use the “short visual description” that ChatGPT generates and pass that to Shap-E:

guidance_scale = 30.0

prompts = {
    **{item["name"]: item["short_visual_description"] for item in items},
    **{character["name"]: character["short_visual_description"] for character in characters},
}

latents = {}

for name, description in prompts.items():
    print(name)

    slug = name.lower().replace(' ', '_')
    latents[slug] = sample_latents(
        batch_size=1,
        model=model,
        diffusion=diffusion,
        guidance_scale=guidance_scale,
        model_kwargs=dict(texts=[description]),
        progress=True,
        clip_denoised=True,
        use_fp16=True,
        use_karras=True,
        karras_steps=64,
        sigma_min=1e-3,
        sigma_max=160,
        s_churn=0,
    )
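The loop above builds each slug with name.lower().replace(' ', '_'), and that slug later becomes a file name. Generated names can contain apostrophes or other punctuation (for example "Siren's Song" in the sample output), so a stricter slug function may be safer. This is a sketch under that assumption; slugify is a hypothetical helper, not part of Shap-E:

```python
import re


def slugify(name: str) -> str:
    """Turn a display name into a lowercase, file-name-friendly slug."""
    # Drop apostrophes first so "Siren's" becomes "sirens" rather than "siren_s".
    cleaned = name.lower().replace("'", "")
    # Collapse every run of other non-alphanumeric characters into one underscore.
    slug = re.sub(r"[^a-z0-9]+", "_", cleaned)
    return slug.strip("_")


print(slugify("Siren's Song"))         # sirens_song
print(slugify("Mithril Plate Armor"))  # mithril_plate_armor
```

Two different names can still map to the same slug, so if that matters you may want to check for collisions before writing files.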

Finally, we can use the following code to generate .ply files for each 3D model:

for name, latent in latents.items():
    decode_latent_mesh(xm, latent).tri_mesh().write_ply(f"{name}.ply")

However, if you want to use another format such as glTF or obj, there’s a cool trick you can do. There’s a library called trimesh that supports all of these formats. We’ll just need to convert Shap-E’s format to trimesh:

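import os

import numpy as np
import trimesh
from trimesh.visual import ColorVisuals

# The export loop writes into models/, so make sure the directory exists.
os.makedirs("models", exist_ok=True)
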
for name, latent in latents.items():
    print(name)
    latent_mesh = decode_latent_mesh(xm, latent).tri_mesh()

    vertex_colors = np.vstack((
        latent_mesh.vertex_channels['R'],
        latent_mesh.vertex_channels['G'],
        latent_mesh.vertex_channels['B']
    )).T

    mesh = trimesh.Trimesh(vertices=latent_mesh.verts,
                           faces=latent_mesh.faces,
                           face_normals=latent_mesh.normals,
                           visual=ColorVisuals(vertex_colors=vertex_colors))

    scene = trimesh.Scene()
    scene.add_geometry(mesh)

    with open(f"models/{name}.glb", "wb") as f:
        f.write(trimesh.exchange.gltf.export_glb(scene))

In this example, we convert the 3D objects into binary glTF (.glb) files, which can later be imported into three.js, Unity, etc.

You can also use the trimesh library to convert to other file formats, such as obj (see trimesh.exchange.obj.export_obj).
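To get a feel for what ends up in an obj file, here is a minimal pure-Python sketch that serializes vertices and triangle faces as Wavefront OBJ text. It is only an illustration of the format: it ignores vertex colors and normals, to_obj_text is a hypothetical helper, and in practice the trimesh exporter mentioned above is the better choice:

```python
def to_obj_text(verts, faces):
    """Serialize vertices and triangle faces as Wavefront OBJ text."""
    lines = [f"v {x} {y} {z}" for x, y, z in verts]
    # OBJ face indices are 1-based, while mesh arrays in code are 0-based.
    lines += ["f " + " ".join(str(index + 1) for index in face) for face in faces]
    return "\n".join(lines) + "\n"


# A hypothetical single triangle, just to show the output format.
verts = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
faces = [(0, 1, 2)]

print(to_obj_text(verts, faces))
```

The same function should work with the verts and faces arrays of a decoded Shap-E mesh, since it only iterates over rows of three values.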

Bias & Fairness

Like any model, Shap-E is not without risks. In the Shap-E paper, the authors probed the model with ambiguous captions, in which certain details, such as body shape or color, were left unspecified.

As you can see, the samples generated by the model likely perpetuate common gender-role stereotypes in response to these ambiguous prompts:

Examples where Shap-E likely exhibits biases from its dataset, from the original paper.

Conclusion

In this tutorial, we used ChatGPT and Shap-E to generate 3D game assets, including game items and characters, and converted them into popular 3D file formats. While the results are exciting, we also saw that Shap-E can reproduce biases from its dataset, a reminder to be cautious about ambiguous prompts when designing games.

Alon Gubkin

Alon is the CTO of Aporia.
