ascairn

v0.3.0 suspicious
4.0
Medium Risk

Analyze centromere variation from short-read data using rare k-mers in alpha satellite sequences

🤖 AI Analysis

Final verdict: SUSPICIOUS

The package has minimal risks associated with network calls, shell executions, and obfuscation. However, the missing maintainer's author name and the apparent newness or inactivity of the account raise concerns about its legitimacy.

  • Missing maintainer's author name
  • Account appears new or inactive
Per-check LLM notes
  • Network: No network calls detected, which is normal and expected.
  • Shell: Shell executions observed are likely intended for local command-line tools execution related to the package's functionality, but warrant scrutiny to ensure commands are not being misused.
  • Obfuscation: No obfuscation patterns detected, indicating low risk of malicious activity related to code obfuscation.
  • Credentials: No credential harvesting patterns detected, suggesting no immediate risk of secret or sensitive information being stolen.
  • Metadata: The maintainer's author name is missing and the account seems new or inactive, which could indicate potential risk.

📦 Package Quality Overall: Medium (5.0/10)

✦ High Test Suite 9.0

Test suite present: 5 test file(s) found

  • 5 test file(s) detected (e.g. test_cen_type.py)
◈ Medium Documentation 5.0

Some documentation present

  • Detailed PyPI description (18821 chars)
○ Low Contributing Guide 4.0

No contributing guide or governance files found

  • Development Status classifier >= Beta
○ Low Type Annotations 1.0

No type annotations detected

  • No type annotations, py.typed marker, or stub files detected
◈ Medium Multiple Contributors 6.0

Limited contributor diversity

  • 2 unique contributor(s) across 99 commits in friend1ws/ascairn
  • Two distinct contributors found

🔬 Heuristic Checks

✓ Outbound Network Calls

No suspicious network call patterns found

✓ Code Obfuscation

No obfuscation patterns detected

⚠ Shell / Subprocess Execution score 10.0

Found 6 shell execution pattern(s)

  • Checking sequence depth") subprocess.run([ "ascairn", "check_depth", bam_file,
  • 2: Counting rare k-mers") subprocess.run([ "ascairn", "kmer_count", bam_file,
  • pend("--single_hap") subprocess.run(cmd, check=True) # Aggregate per-chromosome cen_typ
  • , reference] try: subprocess.run(cmd, check = True, stdout = subprocess.DEVNULL) except E
  • ence is not None else [] subprocess.run(["samtools", "view", "-bh", bam_file, "-L", baseline_region_
  • + ref_args, check=True) subprocess.run(["samtools", "index", baseline_bam], check=True) mosdep
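
One way to give these call sites the scrutiny suggested above is to confirm that each one passes its command as an argument list and never sets `shell=True`, as the excerpts appear to do. The following is a minimal sketch using Python's standard `ast` module. The function name `find_subprocess_calls` and the sample source (including the `baseline.bam` file name) are illustrative assumptions, not part of ascairn or of this scanner.

```python
import ast


def find_subprocess_calls(source):
    """Return sorted (line, uses_shell, list_args) tuples for subprocess.run calls."""
    results = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Match only the literal spelling subprocess.run(...).
        is_run = (
            isinstance(func, ast.Attribute)
            and func.attr == "run"
            and isinstance(func.value, ast.Name)
            and func.value.id == "subprocess"
        )
        if not is_run:
            continue
        # shell=True hands the command string to a shell, which is the risky form.
        uses_shell = any(
            kw.arg == "shell"
            and isinstance(kw.value, ast.Constant)
            and kw.value.value is True
            for kw in node.keywords
        )
        # A list literal as first argument means no shell parsing of the command.
        list_args = bool(node.args) and isinstance(node.args[0], ast.List)
        results.append((node.lineno, uses_shell, list_args))
    return sorted(results)


# Hypothetical sample source: one list-form call and one shell-string call.
SAMPLE = '''
import subprocess
subprocess.run(["samtools", "index", "baseline.bam"], check=True)
subprocess.run("samtools index " + name, shell=True)
'''

findings = find_subprocess_calls(SAMPLE)
for line, uses_shell, list_args in findings:
    print(f"line {line}: shell={uses_shell}, list_args={list_args}")
```

The sketch only recognizes the literal spelling `subprocess.run(...)` and literal `shell=True`. Calls made through an alias, or commands built in a variable before the call (as in `subprocess.run(cmd, check=True)`), would still need manual review.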
✓ Credential Harvesting

No credential harvesting patterns detected

✓ Typosquatting

No typosquatting candidates detected

✓ Registered Email Domain

Email domain looks legitimate: gmail.com

✓ Suspicious Page Links

All external links appear legitimate

✓ Git Repository History

Repository friend1ws/ascairn appears legitimate

⚠ Maintainer History score 4.0

2 maintainer concern(s) found

  • Author name is missing or very short
  • Author "" appears to have only 1 package on PyPI (new or inactive account)
✓ Known CVE Vulnerabilities

No known vulnerabilities found in OSV database.

💡 AI App Starter Prompt

Use this prompt to build a project with ascairn
Your task is to develop a user-friendly command-line tool that leverages the 'ascairn' package to analyze centromere variations from short-read data using rare k-mers in alpha satellite sequences. This tool will serve researchers and geneticists who need to quickly assess genetic variations within centromeres. Here's a step-by-step guide on how to approach this project:

1. **Project Setup**: Begin by setting up your development environment. Ensure you have Python installed and create a virtual environment. Install necessary dependencies, including 'ascairn'.

2. **Core Functionality**: Your main function should accept input files containing short-read data. Utilize 'ascairn' to identify rare k-mers in these sequences, which will help in detecting variations in centromere regions.

3. **Data Processing**: Implement data processing steps such as quality control checks, trimming of reads if necessary, and alignment to a reference genome. 'ascairn' can assist in filtering out common k-mers to focus on rare ones.

4. **Analysis Output**: Design an output format that summarizes the detected variations. This could include tables listing the identified rare k-mers, their frequency, and their location within the centromere region. Visualizations like graphs showing the distribution of rare k-mers across different samples can also be beneficial.

5. **User Interface**: Create a simple yet effective command-line interface where users can specify input files, choose analysis parameters (like k-mer size), and select output formats. Include options for saving results to a file or displaying them directly on the console.

6. **Error Handling & Documentation**: Ensure your application handles errors gracefully, providing informative messages when something goes wrong. Write comprehensive documentation explaining how to install and use the tool, along with examples and FAQs.

7. **Testing & Validation**: Test your application thoroughly using known datasets to validate its accuracy and reliability. Consider adding unit tests for various components of your codebase.

8. **Deployment**: Prepare your application for deployment by packaging it into a distributable form (e.g., a Python wheel). Make sure it's easily installable via pip or other package managers.
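
As a starting point for steps 2 and 5, the wrapper could be sketched with `argparse` and list-form `subprocess.run` calls. Only the subcommand names `check_depth` and `kmer_count` and the leading BAM argument come from the shell excerpts in this report. The remaining ascairn arguments are truncated there, so this sketch forwards unrecognized arguments unchanged rather than guessing them. The program name `cen-wrap`, the `--dry-run` flag, and `sample.bam` are hypothetical.

```python
import argparse
import subprocess

# Subcommand names visible in the scan excerpts; ascairn may provide others.
STEPS = ("check_depth", "kmer_count")


def build_command(step, bam_file, extra_args=()):
    """Build the argument list for one ascairn step (no shell involved)."""
    if step not in STEPS:
        raise ValueError(f"unsupported step: {step}")
    return ["ascairn", step, bam_file, *extra_args]


def main(argv=None):
    # "cen-wrap" and --dry-run are hypothetical names for this sketch.
    parser = argparse.ArgumentParser(
        prog="cen-wrap",
        description="Thin wrapper around selected ascairn subcommands.",
    )
    parser.add_argument("step", choices=STEPS, help="ascairn subcommand to run")
    parser.add_argument("bam_file", help="path to the input BAM file")
    parser.add_argument(
        "--dry-run",
        action="store_true",
        help="print the command instead of running it",
    )
    # Anything the wrapper does not recognize is passed to ascairn as-is;
    # consult the ascairn documentation for the arguments each step expects.
    args, extra = parser.parse_known_args(argv)
    cmd = build_command(args.step, args.bam_file, extra)
    if args.dry_run:
        print(" ".join(cmd))
    else:
        # check=True raises CalledProcessError on a non-zero exit status.
        subprocess.run(cmd, check=True)
    return cmd


# Dry run: prints the command without executing anything.
main(["kmer_count", "sample.bam", "--dry-run"])
```

Keeping command construction in `build_command` separates it from execution, so it can be unit-tested (step 7) without ascairn or samtools installed.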

Suggested Features:
- Interactive mode allowing users to explore different k-mer sizes and see immediate results.
- Integration with cloud storage services for handling large datasets.
- Advanced visualization tools to highlight significant variations visually.
- Support for multiple input file formats commonly used in genomics research.