airflow-provider-skaledata

v0.3.0 suspicious
4.0
Medium Risk

SkaleData's Airflow extensions: drop-in replacements for upstream Airflow providers that talk to SkaleData-managed services. Imports live under the `skale.providers.*` namespace, mirroring Airflow's own provider layout.

🤖 AI Analysis

Final verdict: SUSPICIOUS

The package scores low risk for network calls, shell execution, obfuscation, and credential harvesting. The metadata risk score is elevated, however, because the package was created recently, shows little activity, and has incomplete author information; that combination drives the suspicious verdict.

  • Metadata risk due to recent creation and incomplete author information
  • Low activity level
Per-check LLM notes
  • Network: No network calls detected, which is normal if the package does not require external communications.
  • Shell: No shell execution patterns detected, indicating the package does not attempt to execute system commands.
  • Obfuscation: No obfuscation patterns detected, indicating low risk.
  • Credentials: No credential harvesting patterns detected, indicating low risk.
  • Metadata: The recent creation, low activity, and incomplete author information suggest potential risk.

📦 Package Quality Overall: Low (3.2/10)

◈ Medium Test Suite 6.0

Partial test coverage signals detected

  • 1 test file(s) detected (e.g. test_airbyte.py)
◈ Medium Documentation 5.0

Some documentation present

  • Detailed PyPI description (2660 chars)
○ Low Contributing Guide 2.0

No contributing guide or governance files found

  • No CONTRIBUTING, CODE_OF_CONDUCT, or governance files found
○ Low Type Annotations 1.0

No type annotations detected

  • No type annotations, py.typed marker, or stub files detected
○ Low Multiple Contributors 2.0

Single-author or unverifiable project

  • 1 unique contributor(s) across 13 commits in skaledata/airflow-base
  • Single author with few commits — possibly a personal or throwaway project

🔬 Heuristic Checks

Outbound Network Calls

No suspicious network call patterns found

Code Obfuscation

No obfuscation patterns detected

Shell / Subprocess Execution

No shell execution patterns detected

Credential Harvesting

No credential harvesting patterns detected

Typosquatting

No typosquatting candidates detected

Registered Email Domain

Email domain looks legitimate: skaledata.com

Suspicious Page Links

All external links appear legitimate

Git Repository History score 7.5

Git history flags: 3 concern(s) found

  • Repository created very recently: 4 day(s) ago (2026-06-02T16:22:34Z)
  • Repository has zero stars and zero forks
  • All 13 commits happened within 24 hours
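
The report does not show how these flags are computed. As a rough illustration only, the repository-age and commit-burst checks could be sketched as below. The function names, the scan time, and the sample commit timestamps are hypothetical and are not taken from the scanner; only the creation timestamp comes from the report.

```python
from datetime import datetime, timedelta, timezone


def parse_utc(stamp: str) -> datetime:
    """Parse an ISO-8601 'Z' timestamp such as 2026-06-02T16:22:34Z."""
    return datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def age_in_days(created: str, now: datetime) -> int:
    """Whole days elapsed between repository creation and the scan time."""
    return (now - parse_utc(created)).days


def commits_within(stamps: list[str], window: timedelta = timedelta(hours=24)) -> bool:
    """True if the earliest and latest commit fall inside one window."""
    times = [parse_utc(s) for s in stamps]
    if not times:
        return False
    return max(times) - min(times) <= window


if __name__ == "__main__":
    # Assumed scan time; the report only says "4 day(s) ago".
    scan_time = datetime(2026, 6, 6, 18, 0, tzinfo=timezone.utc)
    print(age_in_days("2026-06-02T16:22:34Z", scan_time))

    # Hypothetical commit timestamps, for illustration only.
    print(commits_within(["2026-06-02T16:22:34Z", "2026-06-03T09:00:00Z"]))
```

A real scanner would likely read these timestamps from the hosting platform's API; that part is omitted here because the report does not say which source it uses.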
Maintainer History score 6.0

3 maintainer concern(s) found

  • Only one version has ever been released — brand new package
  • Author name is missing or very short
  • Author "" appears to have only 1 package on PyPI (new or inactive account)
Known CVE Vulnerabilities

No known vulnerabilities found in OSV database.

💡 AI App Starter Prompt

Use this prompt to build a project with airflow-provider-skaledata
Create a mini-application using the Python package 'airflow-provider-skaledata' that automates data ingestion from various sources into a SkaleData-managed data lake. This application will serve as a proof-of-concept for integrating external data sources into SkaleData's ecosystem via Apache Airflow.

The application should include the following steps:
1. **Initialization**: Set up a basic Airflow environment that includes the 'airflow-provider-skaledata' package. Ensure that all necessary dependencies are installed and configured properly.
2. **Source Configuration**: Define different data sources (e.g., CSV files, SQL databases, APIs) that the application will ingest data from. For each source, create a dedicated Airflow operator that fetches the data and formats it according to SkaleData's requirements.
3. **Transformation & Validation**: Implement a series of transformation tasks within Airflow that clean and transform the ingested data. These tasks should handle common data issues such as null values, data type mismatches, and inconsistencies.
4. **Loading into SkaleData**: Use the 'airflow-provider-skaledata' package to create operators that load the transformed data into a SkaleData-managed data lake. This step should demonstrate how the package integrates seamlessly with SkaleData's services.
5. **Monitoring & Alerts**: Set up monitoring and alerting mechanisms within Airflow to track the status of each task and notify relevant stakeholders if any issues arise during the data ingestion process.
6. **Documentation & Testing**: Write comprehensive documentation detailing how to set up and run the application. Include testing scripts and examples to ensure that the application functions correctly across different scenarios.

Key Features:
- Utilize 'airflow-provider-skaledata' to streamline interactions with SkaleData-managed services.
- Implement robust error handling and logging to maintain data integrity.
- Provide flexibility in data source types to showcase the application's versatility.
- Incorporate best practices for data security and privacy during the data ingestion process.

This project aims to demonstrate the capabilities of 'airflow-provider-skaledata' in simplifying the integration of diverse data sources into SkaleData's managed services.
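
Step 3 of the prompt (Transformation & Validation) can be prototyped without installing the package at all, which may be preferable while its verdict is still suspicious. The sketch below is plain Python and assumes nothing about the provider's API: `clean_rows`, `CleanResult`, the schema format, and the sample rows are all hypothetical names and data chosen for illustration. It handles the two issues the prompt names, null values and data type mismatches, by filling defaults where given and rejecting rows that cannot be repaired.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class CleanResult:
    rows: list[dict[str, Any]]        # rows that passed validation
    rejected: list[tuple[int, str]]   # (row index, reason) for rows that failed


def _coerce(value: Any, target: type) -> Any:
    """Convert value to target; raises ValueError or TypeError on mismatch."""
    if isinstance(value, str):
        value = value.strip()
    return target(value)


def clean_rows(
    rows: list[dict[str, Any]],
    schema: dict[str, type],
    defaults: Optional[dict[str, Any]] = None,
) -> CleanResult:
    """Validate each row against schema, filling defaults for missing values."""
    defaults = defaults or {}
    good: list[dict[str, Any]] = []
    bad: list[tuple[int, str]] = []
    for index, row in enumerate(rows):
        cleaned: dict[str, Any] = {}
        reason = None
        for column, target in schema.items():
            value = row.get(column)
            if value is None or value == "":
                if column in defaults:
                    cleaned[column] = defaults[column]
                    continue
                reason = f"missing value for {column!r}"
                break
            try:
                cleaned[column] = _coerce(value, target)
            except (TypeError, ValueError):
                reason = f"cannot convert {column!r}={value!r} to {target.__name__}"
                break
        if reason is None:
            good.append(cleaned)
        else:
            bad.append((index, reason))
    return CleanResult(good, bad)


if __name__ == "__main__":
    sample = [
        {"id": "1", "amount": " 3.5 ", "region": None},
        {"id": "x", "amount": "2"},
        {"id": "3", "amount": ""},
    ]
    result = clean_rows(
        sample,
        schema={"id": int, "amount": float, "region": str},
        defaults={"region": "unknown"},
    )
    print(result.rows)
    print(result.rejected)
```

In an Airflow DAG, a function like this would typically be wrapped in a Python task, with the rejected rows logged or routed to a separate table so that bad input stays visible instead of being silently dropped.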