Generating 3D Game Assets with ChatGPT and Shap-E

Alon Gubkin
6 min read May 08, 2023

    I’ve always had this dream of building a video game, but creating 3D assets has always been a painful barrier. In this tutorial, we are going to create 3D game assets automatically using ChatGPT and Shap-E. ✨

    More specifically, we’re going to use the ChatGPT API to generate ideas for different game items (e.g., Iron Sword) and characters (e.g., Farmer, Cow), which we’ll then pass to Shap-E, a new model by OpenAI that generates 3D objects from text.

    3D Game Assets with ChatGPT Graphic

    We’ll also write some code to convert the 3D models to standard 3D file formats (glTF, obj), so you can import them into your favorite game editor (Unity, Godot, Unreal, …).

    Step 1: Generate game items and characters using ChatGPT API

    Start by installing and importing the OpenAI library:

    # openai.ChatCompletion (used below) exists only in the pre-1.0 OpenAI Python SDK
    %pip install "openai<1.0"

    import openai

    openai.api_key = "<YOUR_API_KEY>"
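
Hard-coding the key is fine for a quick notebook experiment, but a common alternative is to read it from an environment variable. Here is a minimal sketch, assuming the key is stored in a variable named `OPENAI_API_KEY`. The helper name `read_api_key` is hypothetical and not part of the OpenAI library.

```python
import os


def read_api_key(var_name: str = "OPENAI_API_KEY") -> str:
    """Return the API key stored in the given environment variable.

    Raises RuntimeError with a readable message if the variable is unset or empty.
    """
    key = os.environ.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"Set the {var_name} environment variable first")
    return key
```

You could then write `openai.api_key = read_api_key()` instead of pasting the key into the notebook.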

    You can now do something like this to generate a list of game item ideas:

    import json

    items = json.loads(openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[{
            "role": "user",
            "content": f"""
                Generate 9 items for a medieval MMORPG game.
                Each item should contain a name (e.g Iron Sword), a short visual description (e.g A razor sharp sword).
                The output should be a JSON array, where each item is a JSON object with name, short_visual_description, and skills (attack / strength / defence).
                The short visual description should be maximum 4 words.
                Return only the JSON array.
            """
        }],
    )["choices"][0]["message"]["content"])

    This will generate a list such as:

    [{'name': 'Dragonfire Staff',
      'short_visual_description': 'Fiery wooden staff',
      'skills': {'attack': 20, 'strength': 10, 'defence': 5}},
     {'name': 'Mithril Plate Armor',
      'short_visual_description': 'Shiny silver armor',
      'skills': {'attack': 0, 'strength': 5, 'defence': 25}},
     {'name': 'Elven Bow',
      'short_visual_description': 'Graceful wooden bow',
      'skills': {'attack': 15, 'strength': 5, 'defence': 0}},
     {'name': 'Dwarven Warhammer',
      'short_visual_description': 'Heavy metal hammer',
      'skills': {'attack': 25, 'strength': 20, 'defence': 5}},
     {'name': 'Enchanted Dagger',
      'short_visual_description': 'Glowing sharp blade',
      'skills': {'attack': 10, 'strength': 5, 'defence': 0}},
     {'name': 'Holy Avenger',
      'short_visual_description': 'Radiant longsword',
      'skills': {'attack': 30, 'strength': 15, 'defence': 10}},
     {'name': 'Shadow Cloak',
      'short_visual_description': 'Dark flowing cloak',
      'skills': {'attack': 0, 'strength': 0, 'defence': 20}},
     {'name': 'Giant Slayer',
      'short_visual_description': 'Massive two-handed sword',
      'skills': {'attack': 35, 'strength': 25, 'defence': 5}},
     {'name': "Siren's Song",
      'short_visual_description': 'Enchanted lute',
      'skills': {'attack': 5, 'strength': 0, 'defence': 0}}]
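
Keep in mind that `json.loads` raises an error if the model wraps the array in extra prose, which chat models sometimes do despite the "Return only the JSON array" instruction. The sketch below shows one defensive way to parse the reply and check that each entry has the fields we rely on later. The names `parse_json_array` and `required_keys` are hypothetical and not part of the OpenAI library.

```python
import json


def parse_json_array(reply: str, required_keys=("name", "short_visual_description")):
    """Extract the JSON array spanning the first '[' to the last ']' and check each entry."""
    start = reply.find("[")
    end = reply.rfind("]")
    if start == -1 or end < start:
        raise ValueError("no JSON array found in reply")

    # json.JSONDecodeError is a subclass of ValueError, so callers can catch ValueError
    data = json.loads(reply[start:end + 1])

    for entry in data:
        if not isinstance(entry, dict):
            raise ValueError(f"expected a JSON object, got {entry!r}")
        missing = [key for key in required_keys if key not in entry]
        if missing:
            raise ValueError(f"entry {entry!r} is missing keys: {missing}")
    return data
```

This is a simple heuristic: it will still fail if the surrounding prose itself contains square brackets, in which case retrying the request is probably the easiest fix.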

    And we can do something similar to generate characters:

    characters = json.loads(openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[{
            "role": "user",
            "content": f"""
                Generate 9 characters for a medieval MMORPG game, like World of Warcraft, MapleStory, or RuneScape.
                Each character should contain a name (e.g Farmer, Cow, Goblin), a short visual description (e.g A green goblin).
                The output should be a JSON array, where each item is a JSON object with name, short_visual_description, and combat level.
                The short visual description should be maximum 4 words.
                Return only the JSON array.
            """
        }],
    )["choices"][0]["message"]["content"])

    which will generate the following:

    [{'name': 'Grimm',
      'short_visual_description': 'Dark hooded figure',
      'combat level': 50},
     {'name': 'Thorn',
      'short_visual_description': 'Hulking brute',
      'combat level': 65},
     {'name': 'Sylph',
      'short_visual_description': 'Graceful elf archer',
      'combat level': 40},
     {'name': 'Grendel',
      'short_visual_description': 'Massive troll',
      'combat level': 75},
     {'name': 'Raven',
      'short_visual_description': 'Sneaky thief',
      'combat level': 30},
     {'name': 'Aurora',
      'short_visual_description': 'Radiant sorceress',
      'combat level': 55},
     {'name': 'Frost', 'short_visual_description': 'Ice mage', 'combat level': 45},
     {'name': 'Balthazar',
      'short_visual_description': 'Fiery demon',
      'combat level': 80},
     {'name': 'Oberon',
      'short_visual_description': 'Regal fairy king',
      'combat level': 70}]
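
Notice that the character objects came back with the key `combat level` (with a space), most likely because the prompt described that field in plain words rather than naming the key exactly. If you prefer identifier-style keys in your game code, one option is to normalize them after parsing. Here is a small sketch, where `normalize_keys` is a hypothetical helper:

```python
def normalize_keys(records):
    """Return copies of the dicts with keys lowercased and spaces or hyphens turned into underscores."""
    normalized = []
    for record in records:
        normalized.append({
            key.strip().lower().replace(" ", "_").replace("-", "_"): value
            for key, value in record.items()
        })
    return normalized
```

Calling `characters = normalize_keys(characters)` would turn `combat level` into `combat_level` while leaving the original list untouched.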

    Step 2: Create 3D Objects From Text Using Shap-E

    First, import the Shap-E library and load the models:

    import torch

    from shap_e.diffusion.sample import sample_latents
    from shap_e.diffusion.gaussian_diffusion import diffusion_from_config
    from shap_e.models.download import load_model, load_config
    from shap_e.util.notebooks import decode_latent_mesh

    # Use the GPU when available; sampling on CPU works but is much slower
    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

    xm = load_model('transmitter', device=device)      # decodes latents into 3D representations
    model = load_model('text300M', device=device)      # text-conditional latent diffusion model
    diffusion = diffusion_from_config(load_config('diffusion'))

    Next, we’ll generate a 3D mesh for each one of the items and characters above. We’ll use the “short visual description” that ChatGPT generates and pass that to Shap-E:

    guidance_scale = 30.0

    prompts = {
        **{item["name"]: item["short_visual_description"] for item in items},
        **{character["name"]: character["short_visual_description"] for character in characters},
    }

    latents = {}

    for name, description in prompts.items():
        print(name)

        slug = name.lower().replace(' ', '_')
        latents[slug] = sample_latents(
            batch_size=1,
            model=model,
            diffusion=diffusion,
            guidance_scale=guidance_scale,
            model_kwargs=dict(texts=[description]),
            progress=True,
            clip_denoised=True,
            use_fp16=True,
            use_karras=True,
            karras_steps=64,
            sigma_min=1e-3,
            sigma_max=160,
            s_churn=0,
        )
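
The slug above only lowercases the name and replaces spaces, so a name such as "Siren's Song" becomes `siren's_song`, apostrophe included, and that string is later used as a file name. A slightly stricter variant could look like the sketch below. The `slugify` helper is hypothetical and not part of Shap-E.

```python
import re


def slugify(name: str) -> str:
    """Lowercase the name and collapse every run of non-alphanumeric characters into one underscore."""
    slug = re.sub(r"[^a-z0-9]+", "_", name.lower())
    return slug.strip("_")
```

Note that this version also replaces non-ASCII letters with underscores, which may or may not be what you want for your asset names.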

    Finally, we can use the following code to generate .ply files for each 3D model:

    for name, latent in latents.items():
        decode_latent_mesh(xm, latent).tri_mesh().write_ply(f"{name}.ply")
    3D Game Assets with ChatGPT Graphic

    However, if you want to use another format such as glTF or obj, there’s a cool trick you can do. There’s a library called trimesh that supports all of these formats. We’ll just need to convert Shap-E’s format to trimesh:

    import trimesh
    from trimesh.visual import ColorVisuals
    import numpy as np

    for name, latent in latents.items():
        print(name)
        latent_mesh = decode_latent_mesh(xm, latent).tri_mesh()

        # Stack the per-vertex R, G, B channels into an (N, 3) color array
        vertex_colors = np.vstack((
            latent_mesh.vertex_channels['R'],
            latent_mesh.vertex_channels['G'],
            latent_mesh.vertex_channels['B']
        )).T

        mesh = trimesh.Trimesh(vertices=latent_mesh.verts,
                               faces=latent_mesh.faces,
                               face_normals=latent_mesh.normals,
                               visual=ColorVisuals(vertex_colors=vertex_colors))

        scene = trimesh.Scene()
        scene.add_geometry(mesh)

        with open(f"models/{name}.glb", "wb") as f:
            # Assumption: the export call is cut off in the source text. One way to
            # get the binary glTF bytes is trimesh's Scene.export with file_type="glb".
            f.write(scene.export(file_type="glb"))
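
One practical detail: opening `models/{name}.glb` for writing raises `FileNotFoundError` if the `models` directory does not exist yet. Here is a minimal sketch of a helper that creates the directory and builds the output path. The name `asset_path` is hypothetical.

```python
from pathlib import Path


def asset_path(name: str, directory: str = "models", extension: str = "glb") -> Path:
    """Create the output directory if needed and return the file path for one asset."""
    out_dir = Path(directory)
    out_dir.mkdir(parents=True, exist_ok=True)
    return out_dir / f"{name}.{extension}"
```

With this in place, `open(asset_path(name), "wb")` could replace the hard-coded path, and passing `extension="obj"` or `extension="ply"` would reuse the same helper for other formats.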

    In this example, we convert the 3D objects into the standard glTF file format, which can later be imported into three.js, Unity, etc.

    You can also use the trimesh library to convert to other file formats, such as obj (see the trimesh documentation).

    Bias & Fairness

    Like any model, Shap-E is not without risks. In the Shap-E paper, the authors probed the model with ambiguous captions, in which certain details, such as body shape or color, were left unspecified.

    As you can see, the samples generated by the model likely perpetuate common gender-role stereotypes in response to these ambiguous prompts:

    3D Game Assets with ChatGPT Graphic
    Examples where Shap-E likely exhibits biases from its dataset, from the original paper.


    In this tutorial, we used ChatGPT and Shap-E to generate 3D game assets, including game items and characters, and converted them into popular 3D file formats. While the results are exciting, we also saw the risk of bias and fairness issues, a reminder to be cautious when generating assets from ambiguous prompts.
