I’ve always had this dream of building a video game, but creating 3D assets has always been a painful barrier. In this tutorial, we are going to create 3D game assets automatically using ChatGPT and Shap-E. ✨
More specifically, we’re going to use the ChatGPT API to generate ideas for different game items (e.g. Iron Sword) and characters (e.g. Farmer, Cow), which we’ll then pass to Shap-E, a new model by OpenAI that generates 3D objects from text.
We’ll also write some code to convert the 3D models to standard 3D file formats (glTF, obj), so you can import them into your favorite game editor (Unity, Godot, Unreal, …).
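Start by installing and importing the OpenAI library. The snippets below use openai.ChatCompletion, which was removed in version 1.0 of the library, so install an earlier release:
%pip install "openai<1"
import json
import openai
openai.api_key = "<YOUR_API_KEY>"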
You can now do something like this to generate a list of game item ideas:
items = json.loads(openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{
        "role": "user",
        "content": """
        Generate 9 items for a medieval MMORPG game.
        Each item should contain a name (e.g Iron Sword), a short visual description (e.g A razor sharp sword).
        The output should be a JSON array, where each item is a JSON object with name, short_visual_description, and skills (attack / strength / defence).
        The short visual description should be maximum 4 words.
        Return only the JSON array.
        """
    }],
)["choices"][0]["message"]["content"])
This will generate a list such as:
[{'name': 'Dragonfire Staff',
'short_visual_description': 'Fiery wooden staff',
'skills': {'attack': 20, 'strength': 10, 'defence': 5}},
{'name': 'Mithril Plate Armor',
'short_visual_description': 'Shiny silver armor',
'skills': {'attack': 0, 'strength': 5, 'defence': 25}},
{'name': 'Elven Bow',
'short_visual_description': 'Graceful wooden bow',
'skills': {'attack': 15, 'strength': 5, 'defence': 0}},
{'name': 'Dwarven Warhammer',
'short_visual_description': 'Heavy metal hammer',
'skills': {'attack': 25, 'strength': 20, 'defence': 5}},
{'name': 'Enchanted Dagger',
'short_visual_description': 'Glowing sharp blade',
'skills': {'attack': 10, 'strength': 5, 'defence': 0}},
{'name': 'Holy Avenger',
'short_visual_description': 'Radiant longsword',
'skills': {'attack': 30, 'strength': 15, 'defence': 10}},
{'name': 'Shadow Cloak',
'short_visual_description': 'Dark flowing cloak',
'skills': {'attack': 0, 'strength': 0, 'defence': 20}},
{'name': 'Giant Slayer',
'short_visual_description': 'Massive two-handed sword',
'skills': {'attack': 35, 'strength': 25, 'defence': 5}},
{'name': "Siren's Song",
'short_visual_description': 'Enchanted lute',
'skills': {'attack': 5, 'strength': 0, 'defence': 0}}]
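Passing the reply straight to json.loads works when the model returns exactly a JSON array, but chat models sometimes wrap the output in a Markdown code fence or leave out a field. The following is a minimal sketch of a more defensive parser; the helper name parse_json_array and the set of required keys are assumptions made for illustration, not part of the OpenAI library.

```python
import json

FENCE = "`" * 3  # a Markdown code fence, built this way to keep the literal out of the source
REQUIRED_KEYS = {"name", "short_visual_description"}


def parse_json_array(reply, required_keys=REQUIRED_KEYS):
    """Parse a chat reply that is expected to contain a JSON array of objects."""
    text = reply.strip()
    # Drop a surrounding code fence if the model added one.
    if text.startswith(FENCE):
        lines = text.splitlines()[1:]  # skip the opening fence line
        if lines and lines[-1].strip() == FENCE:
            lines = lines[:-1]  # skip the closing fence line
        text = "\n".join(lines)
    data = json.loads(text)  # raises json.JSONDecodeError on malformed JSON
    if not isinstance(data, list):
        raise ValueError("expected a JSON array")
    for entry in data:
        if not isinstance(entry, dict):
            raise ValueError(f"expected a JSON object, got {entry!r}")
        missing = set(required_keys) - set(entry)
        if missing:
            raise ValueError(f"{entry!r} is missing keys: {sorted(missing)}")
    return data


# Example with a fenced reply, as a model might return it
reply = FENCE + 'json\n[{"name": "Iron Sword", "short_visual_description": "A razor sharp sword"}]\n' + FENCE
print(parse_json_array(reply)[0]["name"])
```

If parsing fails, the simplest recovery is usually to send the same request again, since the output differs from call to call.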
And we can do something similar to generate characters:
characters = json.loads(openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{
        "role": "user",
        "content": """
        Generate 9 characters for a medieval MMORPG game, like World of Warcraft, MapleStory, or RuneScape.
        Each character should contain a name (e.g Farmer, Cow, Goblin), a short visual description (e.g A green goblin).
        The output should be a JSON array, where each item is a JSON object with name, short_visual_description, and combat level.
        The short visual description should be maximum 4 words.
        Return only the JSON array.
        """
    }],
)["choices"][0]["message"]["content"])
This will generate a list such as:
[{'name': 'Grimm',
'short_visual_description': 'Dark hooded figure',
'combat level': 50},
{'name': 'Thorn',
'short_visual_description': 'Hulking brute',
'combat level': 65},
{'name': 'Sylph',
'short_visual_description': 'Graceful elf archer',
'combat level': 40},
{'name': 'Grendel',
'short_visual_description': 'Massive troll',
'combat level': 75},
{'name': 'Raven',
'short_visual_description': 'Sneaky thief',
'combat level': 30},
{'name': 'Aurora',
'short_visual_description': 'Radiant sorceress',
'combat level': 55},
{'name': 'Frost', 'short_visual_description': 'Ice mage', 'combat level': 45},
{'name': 'Balthazar',
'short_visual_description': 'Fiery demon',
'combat level': 80},
{'name': 'Oberon',
'short_visual_description': 'Regal fairy king',
'combat level': 70}]
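The item and character prompts differ only in a handful of words, so if you plan to generate more asset types (buildings, pets, and so on), it may be convenient to build the prompt from parameters. Here is one possible sketch; build_prompt and its parameter names are hypothetical, and the wording simply follows the two prompts above.

```python
def build_prompt(kind, name_examples, description_example, extra_field,
                 count=9, setting="a medieval MMORPG game"):
    """Return a prompt asking for `count` assets of the given kind as a JSON array."""
    return (
        f"Generate {count} {kind} for {setting}.\n"
        f"Each entry should contain a name (e.g. {name_examples}), "
        f"a short visual description (e.g. {description_example}).\n"
        "The output should be a JSON array, where each item is a JSON object "
        f"with name, short_visual_description, and {extra_field}.\n"
        "The short visual description should be maximum 4 words.\n"
        "Return only the JSON array."
    )


item_prompt = build_prompt(
    "items", "Iron Sword", "A razor sharp sword",
    "skills (attack / strength / defence)",
)
character_prompt = build_prompt(
    "characters", "Farmer, Cow, Goblin", "A green goblin", "combat level",
)
print(item_prompt)
```

Either string can then be sent as the "content" of the user message in the request shown earlier.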
First, import the Shap-E library and load the models:
import torch
from shap_e.diffusion.sample import sample_latents
from shap_e.diffusion.gaussian_diffusion import diffusion_from_config
from shap_e.models.download import load_model, load_config
from shap_e.util.notebooks import decode_latent_mesh
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
xm = load_model('transmitter', device=device)
model = load_model('text300M', device=device)
diffusion = diffusion_from_config(load_config('diffusion'))
Next, we’ll generate a 3D mesh for each one of the items and characters above. We’ll use the “short visual description” that ChatGPT generates and pass that to Shap-E:
guidance_scale = 30.0
# Map each asset name to the description that Shap-E will receive
prompts = {
    **{item["name"]: item["short_visual_description"] for item in items},
    **{character["name"]: character["short_visual_description"] for character in characters},
}
latents = {}
for name, description in prompts.items():
    print(name)
    # File-friendly key, e.g. "Iron Sword" -> "iron_sword"
    slug = name.lower().replace(' ', '_')
    latents[slug] = sample_latents(
        batch_size=1,
        model=model,
        diffusion=diffusion,
        guidance_scale=guidance_scale,
        model_kwargs=dict(texts=[description]),
        progress=True,
        clip_denoised=True,
        use_fp16=True,
        use_karras=True,
        karras_steps=64,
        sigma_min=1e-3,
        sigma_max=160,
        s_churn=0,
    )
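The slug built in the loop above only replaces spaces, so a name like "Siren's Song" keeps its apostrophe in the file name, and two assets with the same name would overwrite each other. A stricter variant might look like the sketch below; slugify and unique_slug are hypothetical helpers, not part of Shap-E.

```python
import re


def slugify(name):
    """Lowercase the name and collapse every run of other characters into '_'."""
    slug = re.sub(r"[^a-z0-9]+", "_", name.lower()).strip("_")
    return slug or "asset"  # fall back to a placeholder for names with no usable characters


def unique_slug(name, taken):
    """Return a slug not yet in `taken`, adding a numeric suffix on collisions."""
    base = slugify(name)
    slug, n = base, 2
    while slug in taken:
        slug = f"{base}_{n}"
        n += 1
    taken.add(slug)
    return slug


taken = set()
for name in ["Iron Sword", "Siren's Song", "Raven", "raven"]:
    print(name, "->", unique_slug(name, taken))
```

Note that the prompts dictionary itself is keyed by name, so an item and a character that share a name would already be merged before the slug is computed.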
Finally, we can use the following code to generate .ply files for each 3D model:
for name, latent in latents.items():
    # write_ply expects an open binary file rather than a path
    with open(f"{name}.ply", "wb") as f:
        decode_latent_mesh(xm, latent).tri_mesh().write_ply(f)
However, if you want to use another format such as glTF or obj, there’s a cool trick you can do. There’s a library called trimesh that supports all of these formats. We’ll just need to convert Shap-E’s format to trimesh:
import os
import numpy as np
import trimesh
from trimesh.visual import ColorVisuals
# The files are written to models/, so make sure the directory exists
os.makedirs("models", exist_ok=True)
for name, latent in latents.items():
    print(name)
    latent_mesh = decode_latent_mesh(xm, latent).tri_mesh()
    # Stack the per-vertex R, G, B channels into an (N, 3) color array
    vertex_colors = np.vstack((
        latent_mesh.vertex_channels['R'],
        latent_mesh.vertex_channels['G'],
        latent_mesh.vertex_channels['B']
    )).T
    mesh = trimesh.Trimesh(vertices=latent_mesh.verts,
                           faces=latent_mesh.faces,
                           face_normals=latent_mesh.normals,
                           visual=ColorVisuals(vertex_colors=vertex_colors))
    scene = trimesh.Scene()
    scene.add_geometry(mesh)
    # Export the scene as binary glTF (.glb)
    with open(f"models/{name}.glb", "wb") as f:
        f.write(trimesh.exchange.gltf.export_glb(scene))
In this example, we convert the 3D objects into the standard glTF file format, which can later be imported into three.js, Unity, etc.
You can also use the trimesh library to convert to other file formats, such as obj (see trimesh.exchange.obj.export_obj).
Like any model, Shap-E is not without risks. In the Shap-E paper, the authors prompted the model with ambiguous captions, in which certain details, such as body shape or color, were left unspecified.
As you can see, the samples generated by the model likely perpetuate common gender-role stereotypes in response to these ambiguous prompts:
In this tutorial, we used ChatGPT and Shap-E to generate 3D game assets, including game items and characters, and converted them into popular 3D file formats. The results are exciting, but we also saw that the model can reproduce bias when prompts are ambiguous, a reminder to write specific prompts and review the generated assets when designing a game.