Building an AI Fantasy RPG with CrewAI: A Deep Dive into AI Agents


Introduction

Building intelligent game agents that can maintain context, generate consistent content, and interact naturally with players is a complex challenge. In this post, I'll share my experience building a fantasy RPG using CrewAI, explain why I chose specific tools and architectures, and dive deep into the implementation details.

Why Agent-Based Architecture?

Modern AI applications often require multiple specialized components working together. Traditional sequential prompting can become unwieldy when managing complex interactions. Agent frameworks solve this by allowing:

  1. Specialized Roles: Each agent can focus on specific tasks

  2. Dynamic Communication: Agents can interact and delegate tasks

  3. State Management: Maintain context across interactions

  4. Error Recovery: Better handling of failures and edge cases
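As a toy illustration of points 1 and 2, routing player actions to specialized agents can be sketched in plain Python. This is not the CrewAI API; the agent names and keyword-based routing rule are made up for illustration:

```python
# Toy sketch of specialized roles plus delegation -- plain Python,
# not CrewAI; agent names and routing keywords are hypothetical.
class NarratorAgent:
    def handle(self, action: str) -> str:
        return f"The narrator describes: {action}"

class CombatAgent:
    def handle(self, action: str) -> str:
        return f"Combat resolved for: {action}"

class Dispatcher:
    """Routes each player action to the agent specialized for it."""
    def __init__(self):
        self.agents = {"combat": CombatAgent(), "narration": NarratorAgent()}

    def dispatch(self, action: str) -> str:
        # Crude keyword routing stands in for real delegation logic
        role = "combat" if "attack" in action.lower() else "narration"
        return self.agents[role].handle(action)

dispatcher = Dispatcher()
print(dispatcher.dispatch("Attack the goblin"))     # handled by CombatAgent
print(dispatcher.dispatch("Look around the cave"))  # handled by NarratorAgent
```

Frameworks like CrewAI replace this hand-rolled routing with LLM-driven delegation, but the division of responsibility is the same idea.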

Comparing Agent Frameworks

Let's look at the main options:

  1. CrewAI:

    • Lightweight and focused on agent collaboration

    • Simple API with clear task delegation

    • Built-in state management

    • Easy integration with custom LLMs

  2. AutoGen:

    • More complex but very flexible

    • Better suited for multi-turn conversations

    • Requires more setup and configuration

    • Heavier resource usage

  3. LangGraph:

    • Graph-based approach to agent workflows

    • Good for complex decision trees

    • Steeper learning curve

    • Better for fixed workflows

I chose CrewAI for this project because:

  • It provides just enough structure without over-complicating the design

  • Easy integration with Together AI

  • Lightweight state management

  • Clear agent-to-agent communication

Implementation Details

Agent Structure

from crewai import Agent, Task
from openai import OpenAI
from together import Together
from typing import Any, Dict, List

class GameMasterAgent:
    def __init__(self, api_key: str, openai_api_key: str):
        self.together_client = Together(api_key=api_key)
        self.openai_client = OpenAI(api_key=openai_api_key)

        self.agent = Agent(
            role='Game Master',
            goal='Manage game flow and player interactions',
            backstory='Expert storyteller with deep knowledge of fantasy realms',
            allow_delegation=True,
            llm=self._setup_llm()
        )

    def _setup_llm(self):
        return CustomTogetherModel(
            together_client=self.together_client,
            model_name="meta-llama/Llama-3-70b-chat-hf"
        )

    async def process_action(self, action: str, game_state: GameState) -> str:
        """Process player action with context awareness"""
        context = self._build_context(game_state)
        task = Task(
            description=f"Process player action: {action}",
            context=context
        )
        return await self.agent.execute(task)

Why Together AI and Llama?

I chose Together AI as the infrastructure provider and Llama as the base model for several reasons:

  1. Cost Efficiency: Together AI provides competitive pricing for Llama model usage

  2. Model Performance: Llama-3-70B offers excellent performance for text generation

  3. Customization: Easy to fine-tune and adapt for specific use cases

  4. Low Latency: Together AI provides fast response times

Here's how we integrate Together AI:

import logging
from typing import Any, Dict, List

class CustomTogetherModel(BaseChatModel):
    def __init__(self, together_client, **kwargs):
        super().__init__(**kwargs)
        self.client = together_client
        self.model_name = "meta-llama/Llama-3-70b-chat-hf"

    async def _generate(self, messages: List[Dict[str, Any]], 
                       stop: List[str] | None = None) -> str:
        try:
            response = await self.client.chat.completions.create(
                model=self.model_name,
                messages=messages,
                temperature=0.7,
                max_tokens=2000
            )
            return response.choices[0].message.content
        except Exception as e:
            logging.error(f"Generation error: {e}")
            raise

State Management with Pydantic

State management is crucial for maintaining consistency. We use Pydantic for:

  • Type validation

  • State transitions

  • History tracking

from datetime import datetime
from typing import Any, Dict, List, Optional

from pydantic import BaseModel, Field

class GameState(BaseModel):
    world: Dict[str, Any]
    current_location: Dict[str, Any]
    inventory: Dict[str, int]
    history: List[Dict[str, Any]]
    puzzle_progress: Optional[PuzzleProgress] = None
    character: Dict[str, Any] = Field(default_factory=dict)

    def add_to_history(self, action: str, response: str):
        """Add action and response to history with timestamp"""
        self.history.append({
            'action': action,
            'response': response,
            'timestamp': datetime.utcnow().isoformat()
        })
        # Keep last 10 interactions for context
        if len(self.history) > 10:
            self.history.pop(0)

    def get_context_window(self) -> List[Dict[str, Any]]:
        """Get recent history for context"""
        return self.history[-5:]
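The 10-interaction cap in `add_to_history` can also be expressed with `collections.deque`, which enforces the window automatically and removes the manual `pop(0)`. A standalone sketch:

```python
from collections import deque
from datetime import datetime, timezone

# Standalone sketch of the history window: deque(maxlen=10) evicts the
# oldest entry automatically, replacing the manual pop(0) above.
history = deque(maxlen=10)

def add_to_history(action: str, response: str) -> None:
    history.append({
        'action': action,
        'response': response,
        'timestamp': datetime.now(timezone.utc).isoformat(),
    })

for i in range(12):  # two more than the cap
    add_to_history(f"action {i}", f"response {i}")

print(len(history))          # 10 -- the two oldest entries were evicted
print(history[0]['action'])  # 'action 2'
```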

Safety and Guardrails

We implement multiple layers of safety checks:

  1. Content Filtering:

     class SafetyChecker:
         def __init__(self, api_key: str):
             # CustomTogetherModel expects a Together client, not a raw API key
             self.llm = CustomTogetherModel(together_client=Together(api_key=api_key))
    
         async def check_content(self, content: str) -> bool:
             """Check content for inappropriateness"""
             messages = [{"role": "user", "content": self._build_safety_prompt(content)}]
             result = await self.llm._generate(messages)
             return self._parse_safety_result(result)
    
         def _build_safety_prompt(self, content: str) -> str:
             return f"""
             Analyze the following content for appropriateness in a fantasy game:
             {content}
    
             Consider:
             1. Violence level
             2. Language appropriateness
             3. Thematic elements
    
             Respond with SAFE or UNSAFE and explanation.
             """
    
  2. Input Validation:

     def validate_action(self, action: str) -> bool:
         """Validate player action"""
         # Check length
         if len(action) > 200:
             return False
    
         # Check for prohibited content
         prohibited = ['hack', 'cheat', 'exploit']
         if any(word in action.lower() for word in prohibited):
             return False
    
         return True
    

DALL-E 3 Integration

For visual content generation:

async def generate_scene_image(self, scene_description: str) -> Optional[Dict[str, Any]]:
    """Generate scene imagery using DALL-E 3"""
    try:
        response = await self.openai_client.images.generate(
            model="dall-e-3",
            prompt=self._enhance_prompt(scene_description),
            n=1,
            size="1024x1024",
            quality="hd",
            style="vivid"
        )

        return {
            'url': response.data[0].url,
            'prompt': scene_description,
            'generated_at': datetime.utcnow().isoformat()
        }
    except Exception as e:
        logging.error(f"Image generation failed: {e}")
        return None

def _enhance_prompt(self, base_prompt: str) -> str:
    """Enhance prompt for better DALL-E results"""
    return f"""A high-quality, detailed fantasy illustration showing:
    {base_prompt}
    Style: Epic fantasy art with dramatic lighting and cinematic composition.
    """

Key Learnings and Best Practices

  1. Agent Design:

    • Keep agents focused on specific tasks

    • Use clear communication protocols

    • Implement proper error handling

    • Store important states

  2. Performance Optimization:

    • Cache frequently used responses

    • Batch similar requests

    • Use async where possible

    • Implement timeouts

  3. Content Safety:

    • Multiple layers of checking

    • Clear content guidelines

    • Fallback content options

    • User feedback mechanisms
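The caching point under Performance Optimization can be sketched as a small in-memory TTL cache. The 300-second default, the key scheme, and the class name are all assumptions for illustration, not the game's actual code:

```python
import time

# Minimal in-memory TTL cache sketch for "cache frequently used responses";
# the TTL value and key scheme are hypothetical.
class ResponseCache:
    def __init__(self, ttl_seconds: float = 300.0):
        self.ttl = ttl_seconds
        self._store = {}  # key -> (expiry_timestamp, value)

    def get(self, key: str):
        entry = self._store.get(key)
        if entry is None:
            return None
        expiry, value = entry
        if time.monotonic() > expiry:  # expired -- evict and report a miss
            del self._store[key]
            return None
        return value

    def set(self, key: str, value: str) -> None:
        self._store[key] = (time.monotonic() + self.ttl, value)

cache = ResponseCache(ttl_seconds=0.05)
cache.set("describe:tavern", "A smoky tavern hums with chatter.")
print(cache.get("describe:tavern"))  # cache hit
time.sleep(0.1)
print(cache.get("describe:tavern"))  # None -- the entry expired
```

Keying on the scene or prompt text lets repeated "look around" actions skip an LLM round trip entirely.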


Conclusion

Building an intelligent game with AI agents requires careful consideration of architecture, tools, and safety measures. CrewAI provides a solid foundation for agent coordination, while Together AI and Llama offer powerful and cost-effective language processing capabilities.

The full code is available on GitHub. Feel free to explore and build upon this framework!