agentic-rag-pdf

v0.1.5 suspicious
4.0
Medium Risk

Agentic RAG over any PDF — grounded, cited answers from research reports, ESG/sustainability disclosures, contracts, manuals, and filings. Multi-agent LangGraph workflow with hybrid retrieval, RAPTOR summaries, NL→SQL over extracted tables, CRAG corrective loops, and a defense-in-depth hallucination stack.

🤖 AI Analysis

Final verdict: SUSPICIOUS

The package is flagged as suspicious because it uses unconventional coding practices and makes network calls to an external API that the package description does not clearly justify.

  • Unconventional import and string formatting techniques
  • External API network calls

Per-check LLM notes
  • Network: The package makes network calls to an external API, which could be legitimate depending on its functionality but may also indicate unexpected behavior if not documented.
  • Shell: No shell execution patterns were detected.
  • Obfuscation: The code uses unconventional import and string formatting techniques which may indicate an attempt to obfuscate the code, but without more context it's hard to determine if it's malicious.
  • Credentials: No clear patterns of credential harvesting were detected.

🔬 Heuristic Checks

Outbound Network Calls score 3.0

Found 2 network call pattern(s)

  • , } try: r = requests.get(f"{settings.ollama_base_url}/api/tags", timeout=timeout_s)
  • mport requests resp = requests.post( f"{settings.ollama_base_url}/api/show",
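
Both matched calls target settings.ollama_base_url, and the paths /api/tags and /api/show belong to Ollama's HTTP API, which listens on http://localhost:11434 by default. Whether any traffic leaves the machine therefore depends on how that setting is configured. The sketch below shows one way a reviewer might check a configured base URL. The helper name `is_local_endpoint` is hypothetical and not part of the package, and the package's actual default value for the setting is an assumption.

```python
import ipaddress
from urllib.parse import urlparse


def is_local_endpoint(base_url: str) -> bool:
    """Return True if base_url points at the local machine (loopback)."""
    host = urlparse(base_url).hostname
    if host is None:
        # No parseable host (for example, a URL without a scheme).
        return False
    if host == "localhost":
        return True
    try:
        # Covers 127.0.0.0/8 and the IPv6 loopback address ::1.
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Any DNS name other than "localhost" is treated as remote.
        return False


# Ollama's documented default address; the package default is assumed, not verified.
print(is_local_endpoint("http://localhost:11434"))
print(is_local_endpoint("https://example.com"))
```

A loopback result suggests the two calls only query a local model server. A remote host would warrant a closer look at what is sent in the POST body.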

Code Obfuscation score 6.0

Found 3 obfuscation pattern(s)

  • turn total _TABLE_NAME_RE = __import__("re").compile(r"^[a-zA-Z0-9_]{1,64}$") def _fetch_table_data(ta
  • ry thread_id = f"chat-{__import__('uuid').uuid4().hex[:8]}" print("\n=== SUSTAINABILITY SME — IN
  • thread_id = f"chat-{__import__('uuid').uuid4().hex[:8]}" print(f"New session: {thread
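
All three matches are inline `__import__` calls on standard-library modules (re and uuid). The sketch below rewrites them with ordinary imports to show that the behavior is the same. It is an illustration based only on the truncated snippets above, not the package's actual source. The helper `is_safe_table_name` is hypothetical, and reading the regex as an allowlist for table identifiers is an assumption drawn from its name.

```python
import re
import uuid

# Conventional equivalent of: __import__("re").compile(r"^[a-zA-Z0-9_]{1,64}$")
_TABLE_NAME_RE = re.compile(r"^[a-zA-Z0-9_]{1,64}$")

# Conventional equivalent of: f"chat-{__import__('uuid').uuid4().hex[:8]}"
thread_id = f"chat-{uuid.uuid4().hex[:8]}"

# __import__ returns the same module object that a normal import binds,
# so the inline form hides nothing beyond the import statement itself.
assert __import__("re") is re
assert __import__("uuid") is uuid


def is_safe_table_name(name: str) -> bool:
    """Hypothetical helper: accept 1 to 64 letters, digits, or underscores."""
    return _TABLE_NAME_RE.match(name) is not None


print(is_safe_table_name("esg_table_1"))
print(is_safe_table_name("t; DROP TABLE x"))
```

Read this way, the flagged patterns look like stylistic shortcuts, not concealment, although a full review of the source is still needed to confirm that.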

Shell / Subprocess Execution

No shell execution patterns detected

Credential Harvesting

No credential harvesting patterns detected

Typosquatting

No typosquatting candidates detected

Registered Email Domain

Email domain looks legitimate: gmail.com

Suspicious Page Links

All external links appear legitimate

Git Repository History score 2.5

Git history flags: Repository has zero stars and zero forks

  • Repository has zero stars and zero forks

Maintainer History score 4.0

2 maintainer concern(s) found

  • Author name is missing or very short
  • Author "" appears to have only 1 package on PyPI (new or inactive account)

Known CVE Vulnerabilities

No known vulnerabilities found in OSV database.

💡 AI App Starter Prompt

Use this prompt to build a project with agentic-rag-pdf
Develop a mini-application named 'PDFInsight' which leverages the 'agentic-rag-pdf' package to provide comprehensive analysis and insights from various types of PDF documents such as research reports, sustainability disclosures, contracts, manuals, and filings. The application should be designed to assist users in extracting meaningful information quickly and accurately.

### Steps to Create the Application:
1. **Setup**: Begin by installing the necessary packages including 'agentic-rag-pdf'. Ensure your environment supports Python and has the latest version of the package installed.
2. **Document Upload**: Design a user-friendly interface where users can upload their PDF files. This could be done through a web-based application or a command-line tool depending on your preference.
3. **Content Analysis**: Use the 'agentic-rag-pdf' package to run its multi-agent LangGraph workflow on the uploaded document. The workflow combines hybrid retrieval, RAPTOR summaries, NL→SQL over extracted tables, CRAG corrective loops, and a defense-in-depth hallucination stack to keep the analysis accurate and thorough.
4. **Insight Generation**: Based on the analysis, generate key insights and summaries that highlight important points within the document. These insights should be actionable and provide value to the user.
5. **Hallucination Check**: Implement a feature to verify the accuracy of the generated insights against the original document content using the package’s defense-in-depth hallucination stack.
6. **User Interface**: Develop an intuitive user interface where users can view the uploaded document, see the generated insights, and interact with the analysis results. Consider including options to download the insights in a preferred format (e.g., PDF, CSV).
7. **Feedback Loop**: Allow users to provide feedback on the accuracy of the insights. Use this feedback to improve future analyses through CRAG corrective loops.

### Suggested Features:
- **Interactive Table Extraction**: Enable users to explore tables within the document and extract specific data points.
- **Custom Query Interface**: Provide users with a way to ask custom questions about the document content and receive answers based on the document’s information.
- **Visualization Tools**: Include basic visualization tools to help users understand complex data within the document more easily.
- **Comparison Tool**: Offer a feature to compare different versions of the same document or similar documents side-by-side.
- **Export Options**: Allow users to export the analyzed content and insights into various formats such as PDF, Excel, or CSV for further use.

This project aims to showcase the power and versatility of the 'agentic-rag-pdf' package while providing a practical solution for analyzing complex PDF documents.