aqueduct-core

v1.2.0 · Suspicious · 6.0 (Medium Risk)

Agentic Spark — declarative, self-healing Apache Spark blueprints.

🤖 AI Analysis

Final verdict: SUSPICIOUS

The package exhibits a moderate risk profile due to its execution of shell commands and handling of AWS credentials without adequate safeguards.

  • High shell risk
  • Inadequate credential handling
Per-check LLM notes
  • Network: The socket connections and HTTP requests (made through httpx) are consistent with legitimate network functionality, but the same calls could also carry unexpected communications.
  • Shell: The package runs external commands through subprocess.run, which lets it act directly on the host system and could lead to unauthorized actions if the commands are not tightly controlled.
  • Obfuscation: The flagged pattern appears to be an ordinary Python function signature and does not indicate malicious intent.
  • Credentials: The mention of AWS credentials and IAM roles suggests that the package may access AWS services without proper secure handling mechanisms, which indicates potential risk.
  • Metadata: The author information is incomplete (the author name is missing), which makes the publisher harder to verify.

📦 Package Quality Overall: Medium (6.2/10)

✦ High Test Suite 9.0

Test suite present — 1 test file(s) found

  • Test runner config found: pyproject.toml
  • 1 test file(s) detected (e.g. test_runner.py)
◈ Medium Documentation 5.0

Some documentation present

  • Detailed PyPI description (10767 chars)
○ Low Contributing Guide 4.0

No contributing guide or governance files found

  • Development Status classifier >= Beta
◈ Medium Type Annotations 5.0

Partial type annotation coverage

  • 254 type-annotated function signatures detected in source
✦ High Multiple Contributors 8.0

Active multi-contributor project

  • 4 unique contributor(s) across 100 commits in sadigaxund/Aqueduct
  • Small but multi-author team (3–4 contributors)

🔬 Heuristic Checks

Outbound Network Calls score 7.5

Found 5 network call pattern(s)

  • socket try: with socket.create_connection((host, port), timeout=timeout): return True
  • try: with socket.create_connection((host, port), timeout=3): pass
  • t None else timeout) with httpx.Client(timeout=effective_timeout) as client: response = cli
  • t None else timeout) with httpx.Client( timeout=httpx.Timeout(connect=15.0, read=effective_
  • dels" try: resp = httpx.get(models_url, timeout=10) resp.raise_for_status()
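
For context, the first two matches resemble a plain TCP reachability probe. The sketch below shows what such a probe typically looks like using only the standard library. The function name `is_reachable` and its defaults are hypothetical and are not taken from the package source, which is only visible here as truncated snippets.

```python
import socket


def is_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake;
        # the context manager closes the socket as soon as the block exits.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

A probe of this shape sends no payload: it opens a connection and closes it. When reviewing the flagged lines, the thing to check is where `host` and `port` come from, for example user configuration versus a hard-coded address.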
Code Obfuscation score 2.0

Found 1 obfuscation pattern(s)

  • config=resolved_config) def compile( # noqa: A001 blueprint: Blueprint, blueprint_path: Path | None = None, run_id: str | None = None, depot: Any = None, execution_date: Any = None, secrets_provider: str = "env",
Shell / Subprocess Execution score 4.0

Found 2 shell execution pattern(s)

  • r c in cmd)) result = subprocess.run(cmd, env=env, check=False) rc = result.returncode
  • r, ] result = subprocess.run(cmd, capture_output=True, text=True, check=False) if
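
Both matches call `subprocess.run` with `check=False` and then inspect the result. The following is a minimal sketch of that pattern, assuming the command is passed as an argument list rather than a shell string. The helper name `run_command` is hypothetical and does not come from the package.

```python
import subprocess
import sys


def run_command(cmd: list[str]) -> tuple[int, str, str]:
    """Run cmd without a shell and return (returncode, stdout, stderr)."""
    # Passing a list and leaving shell=False means the arguments are handed
    # to the program directly, so shell metacharacters are not interpreted.
    # check=False means a non-zero exit does not raise; the caller must
    # inspect the return code itself.
    result = subprocess.run(cmd, capture_output=True, text=True, check=False)
    return result.returncode, result.stdout, result.stderr


# Example: run the current Python interpreter as a harmless child process.
rc, out, err = run_command([sys.executable, "-c", "print('ok')"])
```

The risk in this pattern depends mostly on how `cmd` is built. A fixed argument list is low risk, while a list assembled from untrusted input deserves a closer look.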
Credential Harvesting score 2.5

Found 1 credential access pattern(s)

  • Y_ID/AWS_SECRET_ACCESS_KEY, ~/.aws/credentials, " "IAM role on EC2/ECS/EKS/Lambda, or SSO)."
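
The matched string reads like a message listing the places AWS credentials can come from, rather than code that reads them. As an illustration only, the sketch below reports which of two common sources are present without returning any secret value. It uses the standard AWS environment variable names and the default shared credentials path; the function name and return labels are hypothetical.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def available_aws_credential_sources(
    env: Optional[Mapping[str, str]] = None,
    home: Optional[Path] = None,
) -> list[str]:
    """List which credential sources appear to be configured.

    Only presence is reported; secret values are never returned or logged.
    """
    env = os.environ if env is None else env
    home = Path.home() if home is None else home
    sources = []
    # Static keys supplied through the environment.
    if env.get("AWS_ACCESS_KEY_ID") and env.get("AWS_SECRET_ACCESS_KEY"):
        sources.append("environment")
    # Shared credentials file in its default location.
    if (home / ".aws" / "credentials").is_file():
        sources.append("shared-credentials-file")
    return sources
```

IAM roles and SSO are resolved by the AWS SDK at request time and cannot be detected by a simple file or environment check, so they are left out of this sketch.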
Typosquatting

No typosquatting candidates detected

Registered Email Domain

Email domain looks legitimate: gmail.com

Suspicious Page Links

All external links appear legitimate

Git Repository History

Repository sadigaxund/Aqueduct appears legitimate

Maintainer History score 4.0

2 maintainer concern(s) found

  • Author name is missing or very short
  • Author "" appears to have only 1 package on PyPI (new or inactive account)
Known CVE Vulnerabilities

No known vulnerabilities found in OSV database.

💡 AI App Starter Prompt

Use this prompt to build a project with aqueduct-core
Develop a small-scale data processing application using the 'aqueduct-core' package, which leverages Apache Spark for efficient data manipulation and analysis. This application will serve as a tool for analyzing social media sentiment across different platforms like Twitter and Reddit. It will allow users to input keywords or hashtags and then retrieve recent posts containing these terms from the specified platforms. The app will process the retrieved data to calculate sentiment scores using a pre-trained model and provide visual summaries of the sentiments expressed in the collected data.

Key Features:
1. Integration with Twitter and Reddit APIs for data retrieval.
2. Use of 'aqueduct-core' to define and manage Apache Spark jobs for data processing tasks such as cleaning, filtering, and sentiment analysis.
3. Visualization of sentiment trends over time using matplotlib or seaborn libraries.
4. User-friendly command-line interface for inputting search terms and viewing results.
5. Self-healing capabilities provided by 'aqueduct-core' to ensure robustness and reliability of the data processing pipeline.

Steps to Develop the Application:
1. Set up your development environment with Python and install the necessary packages, including 'aqueduct-core', 'tweepy' for Twitter API access, 'praw' for Reddit API access, and 'matplotlib' or 'seaborn' for visualization.
2. Define the structure of your data processing pipeline using 'aqueduct-core'. This includes setting up data sources (Twitter and Reddit), defining transformations (cleaning and sentiment scoring), and specifying outputs (storage and visualization).
3. Implement functions to interact with the Twitter and Reddit APIs to fetch relevant posts based on user inputs.
4. Utilize 'aqueduct-core' to orchestrate the data flow through your defined pipeline, ensuring that each step is executed correctly and efficiently.
5. Integrate a sentiment analysis model into your pipeline. This could be a pre-trained model or one you train yourself depending on the complexity of the task.
6. Design a simple CLI that allows users to enter search queries and view sentiment analysis results in real-time.
7. Test your application thoroughly to ensure it handles various edge cases and errors gracefully, leveraging the self-healing mechanisms of 'aqueduct-core'.
8. Document your code and provide instructions for running the application.
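
The cleaning and sentiment-scoring transformations in the steps above can be prototyped in plain Python before they are wired into a pipeline. The sketch below is one possible starting point. It deliberately does not call 'aqueduct-core', because this report does not show that package's API, and it uses a tiny hypothetical word list as a stand-in for a pre-trained model.

```python
import re
from statistics import mean

# Hypothetical toy lexicon standing in for a pre-trained sentiment model.
POSITIVE = {"good", "great", "love", "excellent"}
NEGATIVE = {"bad", "awful", "hate", "terrible"}


def clean_post(text: str) -> list[str]:
    """Lowercase the text, drop URLs, and split it into word tokens."""
    text = re.sub(r"https?://\S+", " ", text.lower())
    return re.findall(r"[a-z']+", text)


def score_post(text: str) -> float:
    """Return (positive - negative) / matched words, a value in [-1, 1]."""
    tokens = clean_post(text)
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    total = pos + neg
    # Posts with no lexicon words are treated as neutral.
    return 0.0 if total == 0 else (pos - neg) / total


def summarize(posts: list[str]) -> float:
    """Average the per-post scores; an empty batch is neutral."""
    return mean(score_post(p) for p in posts) if posts else 0.0
```

Keeping these functions free of I/O makes them easy to unit test and to reuse later as a Spark transformation or UDF once the pipeline structure is defined.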