vertiport_autonomy.evaluation package
Submodules
vertiport_autonomy.evaluation.framework module
Evaluation Framework for Vertiport Autonomy
Provides systematic evaluation of agents with comprehensive KPI reporting. Supports multiple scenarios, multiple runs, and statistical analysis.
- class vertiport_autonomy.evaluation.framework.EvaluationMetrics(scenario: str, agent_type: str, episode_id: int, seed: int, episode_length: int, total_reward: float, average_reward: float, collisions: int, los_violations: int, unauthorized_landings: int, missions_completed: int, completion_rate: float, average_mission_time: float, throughput: float, final_states: List[str], time_in_states: Dict[str, float])[source]
Bases: object
Container for episode evaluation metrics.
- __init__(scenario: str, agent_type: str, episode_id: int, seed: int, episode_length: int, total_reward: float, average_reward: float, collisions: int, los_violations: int, unauthorized_landings: int, missions_completed: int, completion_rate: float, average_mission_time: float, throughput: float, final_states: List[str], time_in_states: Dict[str, float]) → None
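A minimal construction sketch. Every field value below is illustrative, and the scenario path and state labels are placeholders rather than values taken from the package:

from vertiport_autonomy.evaluation.framework import EvaluationMetrics

# All values are made up for illustration; a real instance would normally be
# produced by EvaluationFramework.evaluate_agent rather than built by hand.
metrics = EvaluationMetrics(
    scenario="scenarios/nominal.yaml",      # placeholder path
    agent_type="baseline",
    episode_id=0,
    seed=42,
    episode_length=500,
    total_reward=123.4,
    average_reward=0.25,
    collisions=0,
    los_violations=1,
    unauthorized_landings=0,
    missions_completed=8,
    completion_rate=0.8,
    average_mission_time=61.5,
    throughput=0.96,
    final_states=["LANDED", "HOLDING"],     # assumed state labels
    time_in_states={"CRUISE": 310.0, "HOLDING": 95.0, "LANDED": 95.0},
)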
- class vertiport_autonomy.evaluation.framework.EvaluationFramework(output_dir: str = 'evaluation_results')[source]
Bases: object
Comprehensive evaluation framework for vertiport coordination agents.
- evaluate_agent(agent_fn, agent_name: str, scenarios: List[str] | None = None, num_episodes: int = 10, max_steps: int = 500, seeds: List[int] | None = None) → List[EvaluationMetrics] [source]
Evaluate an agent across multiple scenarios and episodes.
- Parameters:
agent_fn – Function that takes (scenario_path, seed, max_steps) and returns metrics
agent_name – Name identifier for the agent
scenarios – List of scenario files to test (default: all scenarios)
num_episodes – Number of episodes per scenario
max_steps – Maximum steps per episode
seeds – List of random seeds (default: sequential integers)
- Returns:
List of EvaluationMetrics for all episodes
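A usage sketch, assuming a callable that runs one episode per invocation; the exact structure of the metrics it must return is not specified by this signature, so the agent below is only a stub:

from vertiport_autonomy.evaluation.framework import EvaluationFramework

def random_agent(scenario_path, seed, max_steps):
    """Run one episode with a random policy (stub body for illustration)."""
    ...

framework = EvaluationFramework(output_dir="evaluation_results")
results = framework.evaluate_agent(
    agent_fn=random_agent,
    agent_name="random_baseline",
    scenarios=None,        # None evaluates the default set (all scenarios)
    num_episodes=10,
    max_steps=500,
    seeds=None,            # None uses sequential integer seeds
)
# results is a flat List[EvaluationMetrics], one entry per evaluated episode.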
vertiport_autonomy.evaluation.metrics module
Performance metrics calculation utilities.
- vertiport_autonomy.evaluation.metrics.calculate_performance_metrics(episode_data: List[Dict[str, Any]]) → Dict[str, float] [source]
Calculate aggregated performance metrics from episode data.
- Parameters:
episode_data – List of episode results containing metrics
- Returns:
Dictionary of aggregated performance metrics
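A hedged sketch. The keys expected in each episode record are not documented here, so the names below are assumptions modeled on the EvaluationMetrics fields:

from vertiport_autonomy.evaluation.metrics import calculate_performance_metrics

# Keys (total_reward, collisions, completion_rate) are assumed, not taken
# from the package source.
episode_data = [
    {"total_reward": 118.2, "collisions": 0, "completion_rate": 0.8},
    {"total_reward": 131.7, "collisions": 1, "completion_rate": 0.7},
]
summary = calculate_performance_metrics(episode_data)
print(summary)  # Dict[str, float] of aggregated metrics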
Module contents
Evaluation framework and metrics.
- class vertiport_autonomy.evaluation.EvaluationFramework(output_dir: str = 'evaluation_results')[source]
Bases: object
Comprehensive evaluation framework for vertiport coordination agents.
- compare_agents(results_files: List[str]) → DataFrame [source]
Compare multiple agents from saved results.
- Parameters:
results_files – List of CSV files with evaluation results
- Returns:
DataFrame with comparative statistics
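A minimal sketch of the comparison workflow; the CSV paths are placeholders for result files previously written by evaluate_agent runs, since the exact output naming is not documented here:

from vertiport_autonomy.evaluation import EvaluationFramework

framework = EvaluationFramework(output_dir="evaluation_results")
comparison = framework.compare_agents(
    results_files=[
        "evaluation_results/random_baseline.csv",   # placeholder file names
        "evaluation_results/ppo_agent.csv",
    ]
)
print(comparison)  # pandas DataFrame with per-agent comparative statistics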
- evaluate_agent(agent_fn, agent_name: str, scenarios: List[str] | None = None, num_episodes: int = 10, max_steps: int = 500, seeds: List[int] | None = None) → List[EvaluationMetrics] [source]
Evaluate an agent across multiple scenarios and episodes.
- Parameters:
agent_fn – Function that takes (scenario_path, seed, max_steps) and returns metrics
agent_name – Name identifier for the agent
scenarios – List of scenario files to test (default: all scenarios)
num_episodes – Number of episodes per scenario
max_steps – Maximum steps per episode
seeds – List of random seeds (default: sequential integers)
- Returns:
List of EvaluationMetrics for all episodes
- vertiport_autonomy.evaluation.calculate_performance_metrics(episode_data: List[Dict[str, Any]]) → Dict[str, float] [source]
Calculate aggregated performance metrics from episode data.
- Parameters:
episode_data – List of episode results containing metrics
- Returns:
Dictionary of aggregated performance metrics