agent-guard-plugins

v0.5.0, suspicious, risk score 6.0 (Medium Risk)

Drop-in prompt-injection guards for Claude, OpenAI Codex, Hermes, and OpenCLAW agents. Wraps the agent-guard-modernbert-base and agent-guard-deberta-pi-base classifiers on Hugging Face.

🤖 AI Analysis

Final verdict: SUSPICIOUS

The package receives a moderate risk score because heuristic checks flagged possible code obfuscation and credential-access patterns, even though no network calls were detected.

  • High obfuscation risk
  • Potential credential harvesting

Per-check LLM notes
  • Network: No network calls detected, indicating low risk.
  • Shell: Detection of shell execution patterns suggests potential for executing external commands, which could be benign but requires further investigation into the purpose and context.
  • Obfuscation: The code shows signs of obfuscation which could be used to hide malicious activity or logic from casual inspection.
  • Credentials: The code includes patterns that suggest the potential harvesting of credentials or sensitive information, such as reading files from system paths.

🔬 Heuristic Checks

Outbound Network Calls

No suspicious network call patterns found

Code Obfuscation score 4.0

Found 2 obfuscation pattern(s)

  • ) model.eval() _state["backend"] = "lora" else:
  • ile = False model.eval() _state["backend"] = "merged" if _stat
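Both excerpts above contain `model.eval()`. If `model` is a PyTorch or Transformers module, as the package description suggests, that call switches the model to evaluation mode and is unrelated to Python's built-in `eval`. The scanner's actual rules are not shown in this report, so the following is only a sketch, assuming a hypothetical substring-style rule (`NAIVE_RULE`), of how such a match can arise and how an AST-based check would tell the two cases apart:

```python
import ast
import re

# Hypothetical naive rule (an assumption, not the scanner's real rule):
# any "eval(" or "exec(" token counts as an obfuscation hit.
NAIVE_RULE = re.compile(r"\b(eval|exec)\s*\(")


def naive_hits(source: str) -> int:
    """Count textual matches of the naive rule."""
    return len(NAIVE_RULE.findall(source))


def builtin_eval_calls(source: str) -> int:
    """Count calls whose callee is the bare name eval/exec, ignoring methods."""
    count = 0
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id in {"eval", "exec"}
        ):
            count += 1
    return count


# Shaped like the flagged excerpt: a method call named eval on an object.
SAMPLE = 'model.eval()\n_state["backend"] = "lora"\n'
```

On `SAMPLE`, the textual rule reports a hit while the AST check does not, because `model.eval()` is an attribute call rather than a call to the built-in.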

Shell / Subprocess Execution score 10.0

Found 5 shell execution pattern(s)

  • aude Code CLI. proc = subprocess.run( ["python3", str(HARNESS)], capture_
  • X_BIN": codex_bin} proc = subprocess.run( ["bash", str(WRAPPER), INJECTION], env=env,
  • ackage is missing. proc = subprocess.run( ["python3", str(HARNESS)], capture_output=T
  • mes_available: proc = subprocess.run( ["python3", str(HARNESS)], capture_
  • THON", "python3")} proc = subprocess.run( [node, str(PLUGIN_DIR / "run-openclaw-e2e.mjs")],
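All five excerpts pass `subprocess.run` an argument list rather than a single shell string. In that form no shell parses the arguments, which is the shape usually seen in test harnesses. The excerpts are truncated, so the sketch below does not reproduce the package's code; `run_harness` and its defaults are hypothetical names chosen for illustration:

```python
import subprocess
import sys


def run_harness(argv, timeout=30.0):
    """Run a command given as an argument list and return (exit code, stdout).

    The list form avoids shell interpolation of the arguments.
    """
    proc = subprocess.run(
        argv,
        capture_output=True,  # collect stdout/stderr instead of inheriting them
        text=True,            # decode the output as str
        timeout=timeout,      # raise TimeoutExpired if the child hangs
        check=False,          # leave exit-code handling to the caller
    )
    return proc.returncode, proc.stdout


# Example call using the current interpreter as the child process.
exit_code, output = run_harness([sys.executable, "-c", "print('ok')"])
```

When reviewing the flagged call sites, the questions worth asking are what `HARNESS`, `WRAPPER`, and the other paths point to, and whether any argument comes from untrusted input.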

Credential Harvesting score 5.0

Found 2 credential access pattern(s)

  • "tool_input": {"file_path": "/etc/hosts"}} ) self.assertEqual(decision, {})
  • e="read_file", args={"path": "/etc/hosts"}, ) self.assertIsNone(result)
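Both excerpts end in test assertions (`assertEqual(decision, {})`, `assertIsNone(result)`), which suggests `/etc/hosts` appears as a string in a test payload rather than in code that opens the file. The excerpts are truncated, so the following is only a guess at their shape. `check_tool_call`, its blocking policy, and the `"Read"` tool name are hypothetical stand-ins, not the package's API:

```python
import unittest


def check_tool_call(event: dict) -> dict:
    """Hypothetical guard hook: return {} to allow, or a dict describing a block."""
    path = event.get("tool_input", {}).get("file_path", "")
    # Stand-in policy for illustration only: block paths that look like key material.
    if "/.ssh/" in path or path.endswith(".pem"):
        return {"decision": "block", "reason": "sensitive path"}
    return {}


class BenignReadTest(unittest.TestCase):
    def test_hosts_file_is_allowed(self):
        # The path is only a string inside the payload; the file is never opened.
        decision = check_tool_call(
            {"tool_name": "Read", "tool_input": {"file_path": "/etc/hosts"}}
        )
        self.assertEqual(decision, {})
```

A test of this kind can be run with `python -m unittest`; it checks that a benign read is allowed and does not itself touch the file system.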

Typosquatting

No typosquatting candidates detected

Registered Email Domain

No author email provided

Suspicious Page Links

All external links appear legitimate

Git Repository History

Repository dannyliv/agent-guard-plugins appears legitimate

Maintainer History score 2.0

1 maintainer concern(s) found

  • Author "dannyliv" appears to have only 1 package on PyPI (new or inactive account)

Known CVE Vulnerabilities

No known vulnerabilities found in OSV database.

💡 AI App Starter Prompt

Use this prompt to build a project with agent-guard-plugins
Create a Python-based mini-application named 'GuardedChat' that acts as a secure interface between users and AI agents such as Claude, OpenAI Codex, Hermes, and OpenCLAW. The application uses the 'agent-guard-plugins' package to screen incoming text for prompt-injection attempts before it reaches the agent. The steps below outline how to build it:

1. **Setup Environment**: Begin by setting up your Python environment. Ensure you have Python 3.8 or higher installed. Install necessary packages including 'agent-guard-plugins', 'requests', and 'flask'.
2. **Design the Interface**: Design a simple command-line interface (CLI) or a web interface using Flask for user interaction.
3. **Integrate AI Models**: Integrate the supported AI language models into your application. Each model should be accessible via API calls.
4. **Implement Security Measures**: Use the 'agent-guard-plugins' package to wrap the AI models. The package wraps prompt-injection classifiers that flag suspicious prompts so they can be blocked.
5. **User Interaction**: Allow users to input queries through the designed interface. Each query is checked by 'agent-guard-plugins' first and is sent to the AI model only if it passes.
6. **Display Responses**: Display the responses from the AI models back to the user. If any query fails the safety check, inform the user without revealing the exact nature of the failure.
7. **Logging & Monitoring**: Implement logging to keep track of all interactions. This will help in monitoring the performance and security of the application.
8. **Testing & Deployment**: Thoroughly test the application to ensure all features work as expected. Deploy the application either locally or on a cloud service provider.
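
Steps 4 to 7 can be sketched as a single function that checks a query, forwards it only if it passes, and logs the outcome. This report does not document the 'agent-guard-plugins' API, so the sketch takes the classifier and the model as plain callables. `guarded_reply`, `keyword_stub`, and `echo_model` are hypothetical stand-ins, and the keyword check is not how the real classifiers work:

```python
import logging
from typing import Callable

logger = logging.getLogger("guardedchat")

BLOCKED_MESSAGE = "Your request could not be processed."


def guarded_reply(
    query: str,
    is_injection: Callable[[str], bool],
    ask_model: Callable[[str], str],
) -> str:
    """Forward the query to the model only if the guard does not flag it."""
    if is_injection(query):
        # Log that a block happened, but not the query text itself.
        logger.warning("blocked query of length %d", len(query))
        return BLOCKED_MESSAGE
    reply = ask_model(query)
    logger.info("answered query of length %d", len(query))
    return reply


def keyword_stub(text: str) -> bool:
    """Placeholder for a real classifier call (assumption, for illustration)."""
    return "ignore previous instructions" in text.lower()


def echo_model(text: str) -> str:
    """Placeholder for a real model API call."""
    return f"echo: {text}"
```

In a real application the two stubs would be replaced by the package's classifier and an actual model client. The generic blocked message follows step 6, which asks not to reveal why a query failed.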

Suggested Features:
- User Authentication: Require users to log in before interacting with the AI models.
- Customizable Filters: Allow users to customize their own filters based on specific concerns (e.g., political, religious).
- Feedback System: Implement a system where users can report inappropriate responses, which can then be reviewed and acted upon.
- Analytics Dashboard: Provide a dashboard that shows statistics about usage and security measures taken.