Conversation
There was a problem hiding this comment.
Pull Request Overview
This PR adds support for the IJSEM (International Journal of Systematic and Evolutionary Microbiology) data source to the kg-microbe project. The implementation includes a new transform class for processing IJSEM data and integrates it into the existing transformation pipeline.
- Adds IJSEMTransform class with basic structure for processing IJSEM phenotype database files
- Integrates IJSEM into the main transformation registry and constants
- Updates SLURM job scripts with email notifications for job status monitoring
Reviewed Changes
Copilot reviewed 11 out of 12 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| kg_microbe/transform_utils/ijsem/ijsem_transform.py | Implements the main IJSEM transformation class with file processing logic |
| kg_microbe/transform_utils/constants.py | Adds IJSEM-related constants including source name and resource filename |
| kg_microbe/transform.py | Registers IJSEM transform in the main transformation registry |
| download.yaml | Adds download configuration for IJSEM data from Figshare |
| hpc/*.sl | Updates SLURM scripts with email notification settings |
| data/transformed/ijsem/*.tsv | Creates placeholder output files with TSV headers |
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
| @@ -0,0 +1,56 @@ | |||
| """IJSEM tranform script.""" | |||
There was a problem hiding this comment.
There is a typo in 'tranform' - it should be 'transform'.
| """IJSEM tranform script.""" | |
| """IJSEM transform script.""" |
| data = line.strip().split("\t") | ||
| data_dict = dict(zip(header, data)) | ||
|
|
||
| import pdb; pdb.set_trace() |
There was a problem hiding this comment.
Debug breakpoint (pdb.set_trace()) should not be left in production code. This will cause the program to halt execution and wait for user input.
| import pdb; pdb.set_trace() | |
| # Debug breakpoint removed |
|
|
||
| # Unzip file | ||
| if not Path(data_file).is_file(): | ||
| os.system(f"unzip {input_file} -d {self.input_base_dir}") |
There was a problem hiding this comment.
Using os.system() with user-controlled input can lead to command injection vulnerabilities. Consider using subprocess.run() with proper argument handling or the zipfile module instead.
| os.system(f"unzip {input_file} -d {self.input_base_dir}") | |
| with zipfile.ZipFile(input_file, 'r') as zip_ref: | |
| zip_ref.extractall(self.input_base_dir) |
|
|
||
| class IJSEMTransform(Transform): | ||
|
|
||
| """Template for how the transform class would be designed.""" |
There was a problem hiding this comment.
The docstring describes this as a template but this appears to be the actual IJSEM implementation. Update the docstring to describe the specific purpose of the IJSEM transform.
| """Template for how the transform class would be designed.""" | |
| """Transform class for processing and converting IJSEM (International Journal of Systematic and Evolutionary Microbiology) data into standardized node and edge files for downstream use.""" |
Fixes #180
from @turbomam
see also