agi-evals

v0.1.0 suspicious
6.0
Medium Risk

Plug any model into any major AGI eval and actually run it.

🤖 AI Analysis

Final verdict: SUSPICIOUS

The package exhibits elevated risks due to potential shell execution and obfuscation techniques, which could be employed for malicious purposes. However, there is no concrete evidence of malicious intent.

  • High shell execution risk
  • Significant obfuscation techniques
Per-check LLM notes
  • Network: Network calls indicate external API interactions which may be legitimate but require further investigation into the purpose and destination of the requests.
  • Shell: Shell execution patterns suggest the package might execute arbitrary commands, posing a risk if not properly sanitized or controlled.
  • Obfuscation: The code uses complex encoding and decoding techniques which can be used for hiding malicious payloads or to protect sensitive data, indicating a higher risk of obfuscation.
  • Credentials: No clear patterns indicative of credential harvesting were detected, but caution is advised as the obfuscation could potentially hide such activities.
  • Metadata: The package is new and has limited maintainer history, which raises some suspicion but does not conclusively indicate malicious intent.

📦 Package Quality Overall: Low (4.2/10)

◈ Medium Test Suite 6.0

Partial test coverage signals detected

  • Test runner config found: pyproject.toml
◈ Medium Documentation 7.0

Some documentation present

  • Documentation URL: "Documentation" -> https://agi-eval.studio/evals
  • Detailed PyPI description (8280 chars)
○ Low Contributing Guide 2.0

No contributing guide or governance files found

  • No CONTRIBUTING, CODE_OF_CONDUCT, or governance files found
◈ Medium Type Annotations 5.0

Partial type annotation coverage

  • 203 type-annotated function signatures detected in source
○ Low Multiple Contributors 1.0

Unable to verify contributor count: no GitHub repository found

  • No GitHub repository linked; contributor count unavailable
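
The overall 4.2/10 is consistent with an unweighted mean of the five sub-scores listed above. That the scanner computes it this way is an assumption, since the report does not state its formula. A quick check:

```python
# Sub-scores copied from the quality section of this report.
sub_scores = {
    "Test Suite": 6.0,
    "Documentation": 7.0,
    "Contributing Guide": 2.0,
    "Type Annotations": 5.0,
    "Multiple Contributors": 1.0,
}

# Assumption: the overall score is a plain (unweighted) average.
overall = round(sum(sub_scores.values()) / len(sub_scores), 1)
print(overall)  # 4.2
```

If that assumption holds, the two lowest items (contributing guide and contributor count) pull the overall score down the most.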

🔬 Heuristic Checks

⚠ Outbound Network Calls score 7.5

Found 5 network call pattern(s)

  • model, run_meta) resp = httpx.post( f"{base.rstrip('/')}/runs", json=payload,
  • perf_counter() resp = httpx.post( f"{self._base_url}/api/chat", json=payload, tim
  • """ offset = 0 with httpx.Client(timeout=_TIMEOUT, headers=_hf_headers()) as client:
  • t) tuples. """ resp = httpx.get( HF_SPLITS_URL, params={"dataset": dataset},
  • ody bytes. """ resp = httpx.get(url, timeout=_TIMEOUT, follow_redirects=True) resp.raise
⚠ Code Obfuscation score 8.0

Found 4 obfuscation pattern(s)

  • ormat zlib.decompress(base64.b64decode(raw.encode("utf-8"))) ) return json.loads(decoded) i
  • return str(round(float(eval(expression, {"__builtins__": None}, {})), 2)) except
  • ented upstream format zlib.decompress(base64.b64decode(raw.encode("utf-8"))) ) return json
  • ): pass decoded = pickle.loads( # noqa: S301 - documented upstream format zlib.dec
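
The excerpts above show two patterns that scanners commonly flag: a base64 → zlib → pickle/JSON decode chain, and `eval` with `__builtins__` set to `None` (a restriction that is widely regarded as bypassable). The sketch below shows lower-risk equivalents: a JSON-only decode and an AST-based arithmetic evaluator. The names `decode_json_blob` and `safe_arith` are hypothetical and not part of agi-evals, and whether the upstream data format can be read without pickle is unknown from this report.

```python
import ast
import base64
import json
import operator
import zlib


def decode_json_blob(raw: str):
    """Reverse a base64 -> zlib -> JSON encoding without touching pickle."""
    data = zlib.decompress(base64.b64decode(raw.encode("utf-8")))
    return json.loads(data)


# Whitelisted arithmetic operators; any other node type is rejected.
_BIN_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY_OPS = {ast.USub: operator.neg, ast.UAdd: operator.pos}


def safe_arith(expression: str) -> float:
    """Evaluate + - * / on numeric literals by walking the AST (no eval)."""

    def _eval(node):
        if isinstance(node, ast.Expression):
            return _eval(node.body)
        if (
            isinstance(node, ast.Constant)
            and isinstance(node.value, (int, float))
            and not isinstance(node.value, bool)
        ):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BIN_OPS:
            return _BIN_OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](_eval(node.operand))
        raise ValueError("unsupported expression")

    return float(_eval(ast.parse(expression, mode="eval")))


# Round-trip demo with a blob built locally (illustrative data only).
blob = base64.b64encode(
    zlib.compress(json.dumps({"answer": 42}).encode("utf-8"))
).decode("ascii")
print(decode_json_blob(blob))
print(round(safe_arith("2 * (3 + 4) / 5"), 2))
```

Names, calls, and attribute access are neither constants nor whitelisted operators, so an input such as `__import__('os')` raises `ValueError` instead of executing.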
⚠ Shell / Subprocess Execution score 6.0

Found 3 shell execution pattern(s)

  • fworld" ) subprocess.run([exe], check=True) rows: list[dict[str, Any]] = []
  • try: proc = subprocess.run( [sys.executable, str(path)],
  • try: proc = subprocess.run( [sys.executable, str(path), *argv],
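
The excerpts above pass arguments to `subprocess.run` as a list and launch `sys.executable`, which avoids shell interpolation of the arguments. A minimal sketch of that pattern follows, with a timeout and captured output added. The helper name `run_script`, the 30-second timeout, and the demo script are assumptions for illustration, not code from the package.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_script(path: Path, argv: list[str], timeout: float = 30.0) -> subprocess.CompletedProcess:
    """Run a Python script as a child process without invoking a shell."""
    return subprocess.run(
        [sys.executable, str(path), *argv],  # list form: no shell parsing
        capture_output=True,
        text=True,
        timeout=timeout,
        check=False,  # caller inspects returncode instead of catching an exception
    )


with tempfile.TemporaryDirectory() as tmp:
    script = Path(tmp) / "echo_args.py"
    script.write_text("import sys\nprint(' '.join(sys.argv[1:]))\n")
    # "$HOME" is passed through as literal text because no shell expands it.
    proc = run_script(script, ["hello", "$HOME"])
    print(proc.returncode, proc.stdout.strip())
```

The remaining risk in this pattern is the script at `path` itself: if that file comes from an untrusted source, list-form arguments do not make running it safe.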
✓ Credential Harvesting

No credential harvesting patterns detected

✓ Typosquatting

No typosquatting candidates detected

✓ Registered Email Domain

No author email provided

✓ Suspicious Page Links

All external links appear legitimate

✓ Git Repository History

No GitHub repository linked

  • No GitHub repository link found
⚠ Maintainer History score 6.0

3 maintainer concern(s) found

  • Only one version has ever been released: brand new package
  • Package is very new: uploaded 2 day(s) ago
  • Author "iso-ai" appears to have only 1 package on PyPI (new or inactive account)
✓ Known CVE Vulnerabilities

No known vulnerabilities found in OSV database.

💡 AI App Starter Prompt

Use this prompt to build a project with agi-evals
Create a mini-application that evaluates the performance of various AI models across different AGI (Artificial General Intelligence) tasks using the 'agi-evals' Python package. This application will serve as a tool for researchers and developers to quickly assess the capabilities of their AI models against established benchmarks. Here's a step-by-step guide on how to build this application:

1. **Setup Environment**: Start by setting up your Python environment. Ensure you have Python 3.x installed along with pip. Install the 'agi-evals' package via pip.
2. **Define Models**: List the AI models that you want to evaluate. These could include language models, image recognition models, and others from providers such as Hugging Face or Google.
3. **Select AGI Tasks**: Choose several AGI evaluation tasks that cover a broad spectrum of intelligence types, such as logical reasoning, creativity, problem-solving, and natural language processing.
4. **Integrate 'agi-evals'**: Utilize the 'agi-evals' package to plug these models into the selected AGI tasks. Use the package's functionalities to automate the process of running these evaluations.
5. **Develop UI/CLI**: Create a user-friendly interface (either a web-based UI using Flask/Django or a command-line interface) where users can select which models and tasks they wish to evaluate. The application should provide real-time feedback on the evaluation progress.
6. **Results Visualization**: Implement a feature within the application to visualize the results of the evaluations. Users should be able to see comparative analysis between different models across the various tasks.
7. **Report Generation**: Add functionality for generating detailed reports based on the evaluation results. These reports should summarize the strengths and weaknesses of each model in relation to the tasks performed.
8. **Security Measures**: Ensure that the application includes security measures to protect sensitive data and model weights if necessary.
9. **Testing & Documentation**: Thoroughly test the application and document all its features, including how to install, use, and extend the application.

By following these steps, you'll develop a comprehensive tool that leverages the 'agi-evals' package to provide deep insights into the performance of AI models across diverse AGI tasks.
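
Steps 2 through 7 above can be sketched as a small command-line harness. This report does not show the agi-evals API, so the evaluator here is a stand-in callable (`fake_eval`) that returns deterministic dummy scores. Every name in the block is hypothetical, and the real package call would replace `fake_eval` once its interface is confirmed.

```python
import argparse
from typing import Callable, Dict, List, Optional

# Hypothetical evaluator signature: (model_name, task_name) -> score in [0, 1].
EvalFn = Callable[[str, str], float]


def fake_eval(model: str, task: str) -> float:
    """Placeholder scorer producing deterministic dummy values (not real results)."""
    return round((len(model) * 7 + len(task) * 3) % 100 / 100, 2)


def run_matrix(models: List[str], tasks: List[str], evaluate: EvalFn) -> Dict[str, Dict[str, float]]:
    """Score every model on every task."""
    return {m: {t: evaluate(m, t) for t in tasks} for m in models}


def format_report(results: Dict[str, Dict[str, float]], tasks: List[str]) -> str:
    """Render a plain-text comparison table, one row per model."""
    width = 12
    lines = ["model".ljust(width) + "".join(t.ljust(width) for t in tasks)]
    for model, scores in results.items():
        row = "".join(f"{scores[t]:.2f}".ljust(width) for t in tasks)
        lines.append(model.ljust(width) + row)
    return "\n".join(lines)


def main(argv: Optional[List[str]] = None) -> str:
    parser = argparse.ArgumentParser(description="Compare models across eval tasks (sketch).")
    parser.add_argument("--models", nargs="+", required=True)
    parser.add_argument("--tasks", nargs="+", required=True)
    args = parser.parse_args(argv)
    report = format_report(run_matrix(args.models, args.tasks, fake_eval), args.tasks)
    print(report)
    return report


# Example invocation with made-up model and task names.
main(["--models", "model-a", "model-b", "--tasks", "reasoning", "planning"])
```

Keeping the evaluator behind a plain callable means the table and CLI code can be tested without network access or model weights. Given the SUSPICIOUS verdict above, it also allows the real package to be introduced later, inside an isolated environment.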