airflow-provider-google-sheets

v0.9.1 suspicious
4.0
Medium Risk

Apache Airflow provider for Google Sheets — read, write, and smart merge

🤖 AI Analysis

Final verdict: SUSPICIOUS

The package shows low risk of obfuscation and credential theft, but low repository activity and a limited maintainer history suggest it may be unreliable.

  • Low obfuscation risk
  • Low credential risk
  • Repository with low activity and limited maintainer history

Per-check LLM notes
  • Obfuscation: No obfuscation patterns detected, indicating low risk.
  • Credentials: No credential harvesting patterns detected, indicating low risk.
  • Metadata: The repository's low activity and the maintainer's limited history suggest potential unreliability.

📦 Package Quality Overall: Medium (6.0/10)

✦ High Test Suite 9.0

Test suite present — 9 test file(s) found

  • Test runner config found: pyproject.toml
  • 9 test file(s) detected (e.g. test_google_sheets.py)

◈ Medium Documentation 7.0

Some documentation present

  • Documentation URL: "Documentation" -> https://github.com/mkozhin/airflow-provider-google-sheets#re
  • Detailed PyPI description (23904 chars)

○ Low Contributing Guide 4.0

No contributing guide or governance files found

  • Development Status classifier >= Beta

◈ Medium Type Annotations 5.0

Partial type annotation coverage

  • 86 type-annotated function signatures detected in source

◈ Medium Multiple Contributors 5.0

Limited contributor diversity

  • 1 unique contributor(s) across 46 commits in mkozhin/airflow-provider-google-sheets
  • Single author but highly active (46 commits)

🔬 Heuristic Checks

Outbound Network Calls

No suspicious network call patterns found

Code Obfuscation

No obfuscation patterns detected

Shell / Subprocess Execution

No shell execution patterns detected

Credential Harvesting

No credential harvesting patterns detected

Typosquatting

No typosquatting candidates detected

Registered Email Domain

Email domain looks legitimate: kozhin.cc

Suspicious Page Links

All external links appear legitimate

Git Repository History score 2.5

Git history flags: Repository has zero stars and zero forks

  • Repository has zero stars and zero forks

Maintainer History score 4.0

2 maintainer concern(s) found

  • Author name is missing or very short
  • Author "" appears to have only 1 package on PyPI (new or inactive account)

Known CVE Vulnerabilities

No known vulnerabilities found in OSV database.

💡 AI App Starter Prompt

Use this prompt to build a project with airflow-provider-google-sheets
Develop a mini-application that integrates Apache Airflow with Google Sheets to automate data processing tasks. This application will serve as a tool for businesses to streamline their data management workflows. Here’s a detailed plan on how to proceed:

1. **Setup Environment**: Begin by setting up your development environment. Install Apache Airflow, the `airflow-provider-google-sheets` package, and any other necessary dependencies.

2. **Google Sheets Integration**: Utilize the `airflow-provider-google-sheets` package to connect to a Google Sheet. Ensure you have the appropriate OAuth credentials set up for authentication.

3. **Data Extraction**: Write a DAG (Directed Acyclic Graph) in Apache Airflow that extracts data from specific sheets within a Google Sheet document. Consider implementing filters based on date ranges or other criteria to refine the extracted data.

4. **Data Transformation**: After extraction, transform the raw data into a more usable format. This could involve cleaning the data, performing calculations, or even merging data from multiple sheets into a single dataset using the smart merge feature of the package.

5. **Data Loading**: Once transformed, load the data back into Google Sheets or into another destination such as a database. This step should also write logs and handle errors gracefully.

6. **Scheduling & Monitoring**: Set up scheduling for your DAGs using Airflow’s scheduler. Additionally, implement monitoring mechanisms to track the execution status and performance of your DAGs.

7. **User Interface**: Develop a simple web-based interface that allows users to interact with the data through the Google Sheets integration. Users should be able to trigger data extraction and loading processes, view logs, and monitor task statuses.

8. **Security & Compliance**: Ensure that all interactions with Google Sheets are secure and comply with relevant data protection regulations. This includes handling OAuth tokens securely and ensuring data privacy.

9. **Testing & Deployment**: Thoroughly test your application under various scenarios to ensure reliability. Deploy your application in a production-like environment to validate its performance and stability.
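
The extraction filter in step 3 and the merge in step 4 can be sketched in plain Python before wiring them into a DAG. This is a minimal illustration only: the report does not describe how the package's smart merge works, so `merge_by_key` below is a generic key-based upsert used as a stand-in, and the function names, the `id` and `date` columns, and the sample rows are all hypothetical.

```python
from datetime import date


def filter_by_date(rows, start, end, date_field="date"):
    """Keep rows whose ISO-formatted date falls within [start, end]."""
    return [
        row for row in rows
        if start <= date.fromisoformat(row[date_field]) <= end
    ]


def merge_by_key(existing, incoming, key="id"):
    """Generic upsert (not the package's algorithm): incoming rows update
    existing rows that share a key; rows with new keys are appended."""
    merged = {row[key]: row for row in existing}
    for row in incoming:
        merged[row[key]] = {**merged.get(row[key], {}), **row}
    return list(merged.values())


# Hypothetical rows, shaped like records read from two sheets.
sheet_a = [
    {"id": "1", "date": "2024-01-05", "amount": "10"},
    {"id": "2", "date": "2024-02-10", "amount": "20"},
]
sheet_b = [
    {"id": "2", "date": "2024-02-11", "amount": "25"},
    {"id": "3", "date": "2024-03-01", "amount": "30"},
]

merged = merge_by_key(sheet_a, sheet_b)
recent = filter_by_date(merged, date(2024, 2, 1), date(2024, 3, 31))
```

Keeping both helpers free of Airflow and Google imports means they can be unit-tested on their own and then called from whichever task callable the DAG ends up using.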

This project aims to demonstrate the power of integrating Apache Airflow with Google Sheets for automating complex data management tasks. It will showcase how the `airflow-provider-google-sheets` package simplifies these integrations and enhances productivity.
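
For the logging and error handling mentioned in step 5, one possible shape is a small retry wrapper around the write call. The sketch below assumes a hypothetical `write_fn` callable standing in for whatever actually writes to the sheet or database; it does not use the package's API. Airflow tasks also support built-in retries through the `retries` and `retry_delay` operator arguments, which may make a wrapper like this unnecessary.

```python
import logging
import time

logger = logging.getLogger("sheets_load")


def load_with_retry(write_fn, rows, attempts=3, base_delay=0.0):
    """Call write_fn(rows), retrying failures with exponential backoff.

    write_fn is a hypothetical stand-in for the real write call.
    """
    for attempt in range(1, attempts + 1):
        try:
            result = write_fn(rows)
            logger.info("loaded %d rows on attempt %d", len(rows), attempt)
            return result
        except Exception as exc:
            logger.warning("attempt %d failed: %s", attempt, exc)
            if attempt == attempts:
                raise  # out of attempts: surface the error to the caller
            time.sleep(base_delay * 2 ** (attempt - 1))


# Demo with a fake writer that fails twice before succeeding.
calls = {"n": 0}


def flaky_write(rows):
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("temporary failure")
    return len(rows)


written = load_with_retry(flaky_write, [{"id": "1"}, {"id": "2"}])
```

The demo uses a zero delay so it runs instantly; a real task would pass a non-zero `base_delay`.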