agt-eval

v0.1.0 suspicious
4.0
Medium Risk

CLI toolkit for evaluating LLM agents

🤖 AI Analysis

Final verdict: SUSPICIOUS

No direct risk indicators such as network calls or shell execution were detected, but the package's sparse, low-effort metadata warrants further investigation.

  • Low metadata quality
  • No package description provided

Per-check LLM notes

  • Network: No network calls detected, which is normal unless the package requires internet access for its functionality.
  • Shell: No shell execution patterns detected, indicating no immediate risk of executing arbitrary commands.
  • Obfuscation: No obfuscation patterns detected, indicating low risk.
  • Credentials: No credential harvesting patterns detected, indicating low risk.
  • Metadata: The package shows signs of low maintenance and possibly low effort, which raises some suspicion but is not definitive evidence of malice.

📦 Package Quality Overall: Low (3.6/10)

✦ High Test Suite 9.0

Test suite present — 12 test file(s) found

  • Test runner config found: conftest.py
  • Test runner config found: pyproject.toml
  • 12 test file(s) detected (e.g. conftest.py)

○ Low Documentation 1.0

No documentation detected

  • No documentation URL, doc files, or meaningful description found

○ Low Contributing Guide 2.0

No contributing guide or governance files found

  • No CONTRIBUTING, CODE_OF_CONDUCT, or governance files found

◈ Medium Type Annotations 5.0

Partial type annotation coverage

  • 152 type-annotated function signatures detected in source

○ Low Multiple Contributors 1.0

Unable to verify contributor count: no GitHub repository found

  • No GitHub repository linked — contributor count unavailable
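
The overall quality score of 3.6/10 is consistent with an unweighted mean of the five sub-scores listed in this section. The report does not state how the overall figure is computed, so the following Python sketch is only a sanity check under that assumption; the function name and dictionary keys are illustrative and simply mirror the section labels.

```python
from statistics import mean

# Sub-scores as listed in the Package Quality section of this report.
sub_scores = {
    "Test Suite": 9.0,
    "Documentation": 1.0,
    "Contributing Guide": 2.0,
    "Type Annotations": 5.0,
    "Multiple Contributors": 1.0,
}


def overall_quality(scores):
    """Unweighted mean of the sub-scores, rounded to one decimal.

    This formula is an assumption; the scanner's real weighting is not documented here.
    """
    return round(mean(scores.values()), 1)


print(overall_quality(sub_scores))  # prints 3.6
```

Under this reading, the single high Test Suite score is outweighed by the four low or medium scores, which is why the overall rating lands in the Low band.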

🔬 Heuristic Checks

Outbound Network Calls

No suspicious network call patterns found

Code Obfuscation

No obfuscation patterns detected

Shell / Subprocess Execution

No shell execution patterns detected

Credential Harvesting

No credential harvesting patterns detected

Typosquatting

No typosquatting candidates detected
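
The report does not describe how typosquatting candidates are identified. One common approach, sketched here purely as an assumption, is to compare the package name against a list of well-known package names using string similarity. The `POPULAR` list, the `typosquat_candidates` helper, and the 0.8 cutoff are illustrative choices, not values taken from the scanner.

```python
from difflib import get_close_matches

# Hypothetical short list of well-known package names; a real check
# would use a much larger list (for example, the most-downloaded packages).
POPULAR = ["requests", "numpy", "pandas", "pytest"]


def typosquat_candidates(name, known=POPULAR, cutoff=0.8):
    """Return well-known names that `name` closely resembles.

    An exact match is the well-known package itself, so it yields no candidates.
    """
    name = name.lower()
    if name in known:
        return []
    return get_close_matches(name, known, n=3, cutoff=cutoff)


print(typosquat_candidates("agt-eval"))
print(typosquat_candidates("reqeusts"))
```

With this toy list, `agt-eval` resembles none of the known names, while a transposed spelling such as `reqeusts` is flagged as close to `requests`.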

Registered Email Domain

No author email provided

Suspicious Page Links

All external links appear legitimate

Git Repository History

No GitHub repository linked

  • No GitHub repository link found

Maintainer History score 8.0

4 maintainer concern(s) found

  • Only one version has ever been released — brand new package
  • Author name is missing or very short
  • Author (no name provided) appears to have only 1 package on PyPI (new or inactive account)
  • Package has no PyPI classifiers (low effort / metadata quality)
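
Some of the metadata concerns above can be reproduced locally from a package's core metadata fields. The sketch below is an assumption about how such a check might look, not the scanner's actual implementation: `metadata_concerns` is a hypothetical helper, and the three-character threshold for a "very short" author name is an arbitrary choice. It accepts any object with `get` and `get_all` methods, such as the one returned by `importlib.metadata.metadata()`.

```python
from email.message import Message


def metadata_concerns(meta):
    """List basic metadata gaps for a package.

    `meta` must provide get() and get_all(), as email.message.Message
    and the result of importlib.metadata.metadata() both do.
    """
    concerns = []
    author = (meta.get("Author") or "").strip()
    if len(author) < 3:  # arbitrary threshold for "very short"
        concerns.append("author name is missing or very short")
    if not (meta.get("Author-email") or "").strip():
        concerns.append("no author email provided")
    if not meta.get_all("Classifier"):
        concerns.append("no classifiers declared")
    return concerns


# Stand-in metadata with only a name and version, built by hand for the demo.
demo = Message()
demo["Name"] = "demo-package"
demo["Version"] = "0.1.0"
for concern in metadata_concerns(demo):
    print(concern)
```

For an installed distribution, the same function can be called as `metadata_concerns(importlib.metadata.metadata("some-package"))`. Release count and the author's other packages are not part of local metadata and would need a query against the package index instead.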

Known CVE Vulnerabilities

No known vulnerabilities found in OSV database.

💡 AI App Starter Prompt

Use this prompt to build a project with agt-eval
Create a comprehensive evaluation tool for assessing the performance of various Large Language Models (LLMs) using the 'agt-eval' Python package. This tool will allow users to input different prompts and evaluate how well each model responds based on predefined criteria such as accuracy, coherence, and relevance. The application should also provide a user-friendly interface for visualizing the results through graphs and charts. Additionally, incorporate a feature to compare multiple models simultaneously, showcasing their strengths and weaknesses side by side. Use 'agt-eval' to handle the evaluation process, ensuring that it integrates seamlessly with other Python libraries for data visualization like Matplotlib or Seaborn.