agenteval-framework

v0.1.1 suspicious
4.0
Medium Risk

Unit tests for agents. Define an agent with capa's capabilities.yaml, write scenarios and assertions, mock MCP servers, run in CI.

🤖 AI Analysis

Final verdict: SUSPICIOUS

The package shows no immediate signs of malicious activity, but the recent creation and rapid commit history of the repository raise concerns about potential tampering or a new supply-chain attack.

  • Recent repository creation with rapid commit history
  • Detected shell execution patterns requiring further context

Per-check LLM notes
  • Network: No network calls were detected, which is normal and not suspicious.
  • Shell: Shell execution patterns detected may be for package installation or cleaning purposes, but without context, further investigation is needed to confirm legitimacy.
  • Obfuscation: No obfuscation patterns detected, indicating low risk.
  • Credentials: No credential harvesting patterns detected, indicating low risk.
  • Metadata: The repository was created very recently and all commits occurred within a short period, raising suspicion.

🔬 Heuristic Checks

Outbound Network Calls

No suspicious network call patterns found

Code Obfuscation

No obfuscation patterns detected

Shell / Subprocess Execution score 8.0

Found 4 shell execution pattern(s)

  • ompt) try: proc = subprocess.run( cmd, stdin=subprocess.DEVNULL, capture_output=T
  • _bin", "capa") proc = subprocess.run( [capa_bin, "install", "-p", "claude-code"],
  • pa") try: subprocess.run( [capa_bin, "clean"], cwd=st
  • try: proc = subprocess.run( cmd, cwd=str(workspace),
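
The four excerpts above are truncated, so this report alone does not show whether any call passes `shell=True`. The two excerpts that show their arguments pass a list, which `subprocess.run` executes without a shell unless `shell=True` is also set. As a minimal sketch of how such call sites could be reviewed by hand, the following uses Python's `ast` module to list `subprocess.run` calls and flag those that set `shell=True`. The `SAMPLE` source and the function name `find_subprocess_calls` are hypothetical and are not taken from the package.

```python
import ast


def find_subprocess_calls(source: str):
    """Return sorted (line, uses_shell_true) pairs for each subprocess.run(...) call."""
    results = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Match only the literal form `subprocess.run(...)`.
        if (
            isinstance(func, ast.Attribute)
            and func.attr == "run"
            and isinstance(func.value, ast.Name)
            and func.value.id == "subprocess"
        ):
            uses_shell = any(
                kw.arg == "shell"
                and isinstance(kw.value, ast.Constant)
                and kw.value.value is True
                for kw in node.keywords
            )
            results.append((node.lineno, uses_shell))
    return sorted(results)


# Hypothetical source to scan; not code from agenteval-framework.
SAMPLE = '''
import subprocess
subprocess.run(["tool", "clean"], cwd=".")
subprocess.run("tool clean", shell=True)
'''

for line, uses_shell in find_subprocess_calls(SAMPLE):
    print(line, "shell=True" if uses_shell else "argument list, no shell")
```

This only matches the literal `subprocess.run` spelling; aliased imports such as `from subprocess import run` would need extra handling.
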

Credential Harvesting

No credential harvesting patterns detected

Typosquatting

No typosquatting candidates detected

Registered Email Domain

Email domain looks legitimate: zaitoun.dev

Suspicious Page Links

All external links appear legitimate

Git Repository History score 5.0

Git history flags:

  • Repository created very recently: 6 day(s) ago (2026-05-31T15:40:56Z)
  • All 14 commits happened within 24 hours
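
A check such as "all commits happened within 24 hours" can be reproduced from commit timestamps. The sketch below assumes ISO 8601 timestamps with a trailing `Z`, like the one shown in this report. The timestamps in `SAMPLE` and the function names are hypothetical, and this is not the scanner's actual implementation.

```python
from datetime import datetime, timedelta


def commit_span(timestamps):
    """Return the time between the earliest and latest ISO 8601 timestamp."""
    parsed = [datetime.fromisoformat(t.replace("Z", "+00:00")) for t in timestamps]
    return max(parsed) - min(parsed)


def all_within(timestamps, window=timedelta(hours=24)):
    """True if every timestamp falls inside a single window of the given length."""
    return commit_span(timestamps) <= window


# Hypothetical commit times for illustration only.
SAMPLE = [
    "2026-01-01T10:00:00Z",
    "2026-01-01T12:30:00Z",
    "2026-01-02T09:00:00Z",
]

print(commit_span(SAMPLE))
print(all_within(SAMPLE))
```
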

Maintainer History score 4.0

2 maintainer concern(s) found

  • Author name is missing or very short
  • Author "" appears to have only 1 package on PyPI (new or inactive account)

Known CVE Vulnerabilities

No known vulnerabilities found in OSV database.

💡 AI App Starter Prompt

Use this prompt to build a project with agenteval-framework
Create a fully-functional mini-application called 'AgentTestSuite' using the Python package 'agenteval-framework'. This application will serve as a unit testing tool specifically designed for evaluating AI agents in various simulated environments. The goal is to demonstrate how 'agenteval-framework' can be used to define agent capabilities, create test scenarios, and assert expected behaviors, all within a continuous integration (CI) pipeline setup.

**Step-by-Step Application Design:**
1. **Define Agent Capabilities:** Use capa's 'capabilities.yaml' file to describe the capabilities of your AI agent. For example, if your agent can navigate a grid world, understand natural-language commands, and interact with objects, specify these capabilities in the YAML file.
2. **Craft Scenarios and Assertions:** Develop several test scenarios where the agent must perform tasks based on its defined capabilities. Each scenario should include a set of assertions that check whether the agent behaves as expected under different conditions. For instance, a scenario could involve the agent receiving a command to move to a specific location and then checking if the agent successfully navigates there.
3. **Mock MCP Servers:** Since real-world interaction might not always be feasible during testing, use 'agenteval-framework' to mock Model Context Protocol (MCP) servers. This allows the application to simulate environments and interactions without needing actual server setups.
4. **Integrate into CI Pipeline:** Set up a CI pipeline that automatically runs these tests whenever changes are made to the agent's codebase. This ensures that any modifications do not break existing functionality.
5. **Report Test Results:** The application should generate comprehensive reports detailing which tests passed, failed, or were skipped. These reports should also highlight any unexpected behaviors or errors encountered during testing.

**Suggested Features:**
- Support for multiple agent types (e.g., navigation agents, dialogue agents)
- Extensible scenario creation allowing users to add new test cases easily
- Detailed logging and error handling mechanisms
- Integration with popular CI/CD tools like Jenkins, GitHub Actions, etc.
- Visualization of test results through graphs or charts to better understand performance over time.

By following these steps and implementing the suggested features, 'AgentTestSuite' will showcase the robustness and flexibility of the 'agenteval-framework' package in a practical, real-world application.
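
The scenario-and-assertion pattern in steps 2 and 3 can be sketched with the Python standard library alone. This sketch does not use agenteval-framework's API, which this page does not document. `Scenario`, `run_agent`, `check`, and the tool name `search` are hypothetical names chosen for illustration, and the mocked tool stands in for a call to a mocked MCP server.

```python
from dataclasses import dataclass
from unittest.mock import Mock


@dataclass
class Scenario:
    prompt: str          # input given to the agent
    expected_tool: str   # tool the agent is expected to call
    expected_reply: str  # reply the agent is expected to produce


def run_agent(prompt, tools):
    """Hypothetical stand-in agent: calls the 'search' tool and reports its result."""
    result = tools["search"](prompt)
    return f"Found: {result}"


def check(scenario, tools):
    """Run one scenario and assert on both the tool call and the reply."""
    reply = run_agent(scenario.prompt, tools)
    # Raises AssertionError if the tool was not called exactly once with the prompt.
    tools[scenario.expected_tool].assert_called_once_with(scenario.prompt)
    return reply == scenario.expected_reply


# The Mock replaces a real tool server, so no network or process is needed.
mock_search = Mock(return_value="3 results")
scenario = Scenario("find docs", "search", "Found: 3 results")
print(check(scenario, {"search": mock_search}))
```

In a CI run, a function like `check` would typically be wrapped in a test case so that a failed assertion fails the build.
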