[WIP] Update README.md to reflect actual grader weights and system logic #1
Closed
Copilot wants to merge 1 commit into
Original prompt
Update the README.md to reflect the actual grader weights and system logic based on the code in app/graders.py and inference.py.
Current Issues:
Changes Required:
Update Grader Formula Section (lines 181-187):
Add Comprehensive Action Space Table (after line 132):
Add Grader System Breakdown (new section after Reward System):
Add Baseline Agent Behavior Section (expand current section):
Fix Tasks & Difficulty Levels (lines 118-126):
Keep all original content, structure, and ensure backward compatibility with existing documentation.
The following is the prior conversation context from the user's chat exploration (may be truncated):
User: can u conver action space to a table
Assistant: I'll convert the Action Space section to a table format for you.
Assistant: Here's the Action Space converted into a table format:
| Action | Parameters |
| --- | --- |
| `request_info` | `zone_id` |
| `allocate_resource` | `zone_id`, `resource_type`, `amount` |
| `finalize` | |

Would you like me to update the README.md with this table format and commit the changes to your repository?
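As a rough illustration of how the three actions and their parameters listed above could be checked before being sent to the environment, here is a minimal sketch. The action and parameter names come from the list above; the `action_type` key and the `validate_action` helper are hypothetical and are not taken from the repository.

```python
from typing import Any, Dict, List

# Required parameters per action, as listed above.
REQUIRED_PARAMS: Dict[str, List[str]] = {
    "request_info": ["zone_id"],
    "allocate_resource": ["zone_id", "resource_type", "amount"],
    "finalize": [],
}


def validate_action(action: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the action is well-formed."""
    # ASSUMPTION: "action_type" is a hypothetical key name, not confirmed by the source.
    name = action.get("action_type")
    if name not in REQUIRED_PARAMS:
        return [f"unknown action: {name!r}"]
    return [
        f"missing parameter: {param}"
        for param in REQUIRED_PARAMS[name]
        if param not in action
    ]
```

For example, `validate_action({"action_type": "allocate_resource", "zone_id": 2})` reports that `resource_type` and `amount` are missing, while a bare `finalize` action passes.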
User: heres the inference """
inference.py
Hackathon inference script for DisasterTriageEnv.
Connects an LLM agent (via OpenAI-compatible API) to the running
DisasterTriage FastAPI server and runs one full episode.
Logging output (stdout only):
[START] task=<task_name> env=disaster-triage-env model=<model_name>
[STEP] step= action=<action_str> reward=<0.00> done=<true|false> error=
[END] success=<true|false> steps= rewards=<r1,r2...rn>
Usage:
python inference.py [--task easy|medium|hard] [--base-url http://127.0.0.1:8000]
Environment variables (loaded from .env):
API_BASE_URL — LLM API base URL (default: https://api.openai.com/v1)
MODEL_NAME — model identifier (default: gpt-4.1-mini)
HF_TOKEN — API key / HF token (REQUIRED — raises error if missing)
"""
from __future__ import annotations
import argparse
import json
import os
import sys
import time
from typing import Any, Dict, List, Optional
import numpy as np
import requests
from dotenv import load_dotenv
from openai import OpenAI
# ---------------------------------------------------------------------------
# Load .env
# ---------------------------------------------------------------------------
load_dotenv()
# ---------------------------------------------------------------------------
# Environment variables
# ---------------------------------------------------------------------------
# LLM Proxy (must be provided by the environment, e.g. LiteLLM)
LLM_API_BASE: str = os.getenv("API_BASE_URL", "https://api.openai.com/v1")
MODEL_NAME: str = os.getenv("MODEL_NAME", "gpt-4.1-mini")
HF_TOKEN: str | None = os.getenv("HF_TOKEN")
if not HF_TOKEN:
    print("[WARN] HF_TOKEN not found. Simulation will fail until a secret is provided.")
    # We define a dummy client to avoid crash during import
    client = None
else:
    # ---------------------------------------------------------------------------
    # OpenAI client (OpenAI-compatible, pointed at API_BASE_URL)
    # ---------------------------------------------------------------------------
    client = OpenAI(
        base_url=LLM_API...
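The stdout logging format described in the script's docstring could be produced by small helpers along these lines. This is only a sketch: the helper names are hypothetical, and the values after `step=`, `error=`, and `steps=` are not visible in the pasted docstring, so the code assumes a step index, the literal `null` when there is no error, and a step count. Two-decimal rewards in the `[END]` line are likewise an assumption.

```python
from typing import List, Optional


def _flag(value: bool) -> str:
    # The docstring shows lowercase true/false.
    return "true" if value else "false"


def format_start(task: str, model: str, env: str = "disaster-triage-env") -> str:
    return f"[START] task={task} env={env} model={model}"


def format_step(
    step: int,
    action: str,
    reward: float,
    done: bool,
    error: Optional[str] = None,
) -> str:
    # ASSUMPTION: "null" stands in for "no error"; the source does not show this value.
    err = error if error is not None else "null"
    return (
        f"[STEP] step={step} action={action} "
        f"reward={reward:.2f} done={_flag(done)} error={err}"
    )


def format_end(success: bool, rewards: List[float]) -> str:
    # ASSUMPTION: steps is the number of recorded rewards.
    joined = ",".join(f"{r:.2f}" for r in rewards)
    return f"[END] success={_flag(success)} steps={len(rewards)} rewards={joined}"
```

Keeping the formatting in pure functions makes the log lines easy to check without a running server or an API key.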
This pull request was created from Copilot chat.