TEDBench

v0.2.0 suspicious
7.0
High Risk

TEDBench: Large-Scale Protein Fold Classification Benchmark and MiAE Pretraining

🤖 AI Analysis

Final verdict: SUSPICIOUS

The package is rated moderate to high risk for two reasons: it runs shell commands through os.system, and it contains patterns flagged as code obfuscation. Either can obscure malicious activity.

  • High shell risk due to os.system usage
  • Moderate obfuscation risk indicating possible hidden functionality
Per-check LLM notes
  • Network: The network call pattern suggests legitimate file retrieval, but could be risky if URLs are not vetted.
  • Shell: The use of os.system for shell execution poses significant risks including potential code injection and privilege escalation.
  • Obfuscation: The code shows signs of obfuscation which may be used to hide the functionality and intent of the code from casual inspection.
  • Credentials: No patterns indicative of credential harvesting were detected.
  • Metadata: The author's details are sparse, and there is no associated GitHub repository, which raises some suspicion.

🔬 Heuristic Checks

Outbound Network Calls score 1.5

Found 1 network call pattern(s)

  • checkpoint_path): urllib.request.urlretrieve(ckpt_url, cached_checkpoint_path) return
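
One way to address the URL-vetting concern noted for this check is to compare a downloaded checkpoint against a known digest before loading it. The sketch below is an illustration only and is not taken from the package: `sha256_of_file` and `verify_checkpoint` are hypothetical helpers, and it assumes the publisher distributes a SHA-256 digest alongside the checkpoint.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large checkpoints do not fill memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_checkpoint(path, expected_sha256):
    """Hypothetical helper: True only if the file matches the expected digest."""
    return sha256_of_file(path) == expected_sha256.lower()
```

A caller would run `verify_checkpoint(cached_checkpoint_path, expected)` after the download and refuse to load the file when it returns False.
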
Code Obfuscation score 8.0

Found 4 obfuscation pattern(s)

  • bet(cfg.model.name) model.eval() train_loader, val_loader, test_loader, ext_test_loade
  • bet("mif") self.model.eval() self.model = self.model.to(device) @torch.no_
  • etrained=True ) model.eval() tokenizer = EsmTokenizer.from_pretrained(cfg.model.pat
  • er, device="cuda"): model.eval() X = [] y = [] for batch in tqdm(data_loader):
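
All four matches listed above contain `model.eval()`, which appears to be the PyTorch call that switches a module to evaluation mode and is unrelated to Python's built-in `eval`. The scanner's actual rule is not shown, so the sketch below only assumes it is a plain match on `eval(`; both patterns are hypothetical and illustrate how a stricter pattern would skip method calls.

```python
import re

# Assumed naive rule: any "eval(" counts as dynamic code execution.
NAIVE_EVAL = re.compile(r"eval\(")

# Stricter rule: only a bare eval( call, not a method such as model.eval().
BARE_EVAL = re.compile(r"(?<![\w.])eval\(")

snippets = [
    "model.eval()",
    "self.model.eval()",
    "result = eval(user_input)",
]

for snippet in snippets:
    naive = bool(NAIVE_EVAL.search(snippet))
    strict = bool(BARE_EVAL.search(snippet))
    print(f"{snippet!r}: naive={naive} strict={strict}")
```

Under this assumption, the naive pattern flags all three snippets, while the stricter one flags only the bare `eval(user_input)` call.
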
Shell / Subprocess Execution score 4.0

Found 2 shell execution pattern(s)

  • h}\' \'{tmp_save_path}\'" os.system(cmd) # Check whether the structure is predicted by
  • pdb_dir} {tmp_save_path}" os.system(cmd) with open(tmp_save_path, "r") as r, open(save_
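
The injection risk behind this check comes from interpolating paths into a single shell string. A common alternative is to pass the command as a list of arguments to `subprocess.run`, which does not invoke a shell. The sketch below is not the package's code: `run_tool` and `tricky_path` are hypothetical, and the Python interpreter stands in for the external tool whose name is truncated in the excerpts above.

```python
import subprocess
import sys


def run_tool(args):
    """Run a command given as a list of arguments, without invoking a shell."""
    completed = subprocess.run(args, capture_output=True, text=True, check=True)
    return completed.stdout


# A file name containing shell metacharacters (hypothetical example value).
tricky_path = "structure; echo injected.pdb"

# The Python interpreter stands in for the external tool here; it simply
# prints the first argument it receives.
output = run_tool(
    [sys.executable, "-c", "import sys; print(sys.argv[1])", tricky_path]
)
print(output.strip())
```

Because no shell parses the arguments, the semicolon in the file name reaches the child process as literal text instead of starting a second command.
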
Credential Harvesting

No credential harvesting patterns detected

Typosquatting

No typosquatting candidates detected

Registered Email Domain score 3.0

Suspicious email domain flags:

  • Very short email domain: ki.uni-stuttgart.de
Suspicious Page Links

All external links appear legitimate

Git Repository History

No GitHub repository linked

  • No GitHub repository link found
Maintainer History score 4.0

2 maintainer concern(s) found

  • Author name is missing or very short
  • Author name is empty, and the account appears to have only 1 package on PyPI (new or inactive account)
Known CVE Vulnerabilities

No known vulnerabilities found in OSV database.

💡 AI App Starter Prompt

Use this prompt to build a project with TEDBench
Your task is to create a Python-based mini-application that leverages the TEDBench package to classify protein folds using pre-trained models. This application will serve as a user-friendly tool for researchers and students interested in exploring large-scale protein fold classification. Here are the steps and features your application should include:

1. **Set Up Environment**: Ensure Python is installed along with the necessary libraries, such as TensorFlow, PyTorch, and scikit-learn. Install TEDBench via pip.
2. **Data Preparation**: Use TEDBench's built-in datasets for training and testing. Your application should allow users to load these datasets easily.
3. **Model Selection**: Provide options for selecting different pre-trained models available in TEDBench. Users should be able to choose between different MiAE (Multi-Invariance Autoencoder) models.
4. **Interactive Interface**: Develop a simple command-line interface where users can input their choice of model, dataset, and any other parameters needed for classification.
5. **Performance Evaluation**: After running the classification, display key performance metrics such as accuracy, precision, recall, and F1-score.
6. **Visualization**: Implement basic visualization tools to help users understand the classification results better. For instance, plotting confusion matrices or ROC curves.
7. **Documentation**: Include clear documentation within the code and as external README files explaining how to install dependencies, run the application, and interpret the output.

Your goal is to make this application accessible to users who may not be experts in machine learning but are interested in exploring protein fold classification using advanced AI techniques.
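
For the performance evaluation step in the prompt above, the four metrics can be computed without relying on any particular TEDBench API. The sketch below is a minimal illustration in plain Python: `classification_metrics` is a hypothetical helper, the labels are made-up fold names, and precision, recall, and F1 are macro-averaged, which is one of several possible averaging choices.

```python
from collections import Counter


def classification_metrics(y_true, y_pred):
    """Return accuracy and macro-averaged precision, recall, and F1."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    def safe_div(num, den):
        return num / den if den else 0.0

    precisions, recalls, f1s = [], [], []
    for label in labels:
        p = safe_div(tp[label], tp[label] + fp[label])
        r = safe_div(tp[label], tp[label] + fn[label])
        precisions.append(p)
        recalls.append(r)
        f1s.append(safe_div(2 * p * r, p + r))

    n = len(labels)
    return {
        "accuracy": safe_div(sum(tp.values()), len(y_true)),
        "precision": safe_div(sum(precisions), n),
        "recall": safe_div(sum(recalls), n),
        "f1": safe_div(sum(f1s), n),
    }


# Made-up fold labels for illustration only.
metrics = classification_metrics(
    ["alpha", "alpha", "beta", "beta"],
    ["alpha", "beta", "beta", "beta"],
)
print(metrics)
```

Macro averaging weights every fold class equally, which may matter for a benchmark with many rare classes; a weighted or micro average would be a reasonable alternative.
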