This project focuses on the identification of structural variations in a bacterial genome through comparison with a reference genome. The analysis is performed using next-generation sequencing (NGS) data and a combination of bioinformatics tools and custom-developed scripts.
The reference genome is Lactobacillus casei, while the target genome (lact.sp) is a simulated organism containing artificial structural variations such as insertions, deletions, and inversions.
- Align sequencing reads to a reference genome
- Analyze coverage and mapping quality
- Detect structural variations (SVs), including:
  - Long and short insertions
  - Long and short deletions
  - Inversions
- Develop custom tracks for deeper genomic analysis
- BWA – sequence alignment
- Samtools – SAM/BAM file processing
- IGV (Integrative Genomics Viewer) – visualization
- Python – custom track generation
- Code/ – Python scripts for track generation
- Img/ – images from analysis
- Plot IGV/ – IGV screenshots
- Wig/ – generated track files (.wig)
The project includes:
- Sequence coverage analysis
- Physical coverage analysis
- Single mate analysis
- Fragment length evaluation
- Orientation analysis
- Detection of anomalous mate pairs
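As a rough illustration of how the single mate, fragment length, and orientation checks in this list could be expressed, the sketch below classifies one SAM record from its FLAG, POS, PNEXT, and TLEN fields. It is not the code in Code/. The function names, the category labels, and the assumption of an inward-facing (forward/reverse) paired library are all hypothetical, and the length bounds are simply the mean plus or minus k standard deviations.

```python
from statistics import mean, pstdev

# SAM FLAG bits (defined by the SAM specification)
PAIRED, UNMAPPED, MATE_UNMAPPED = 0x1, 0x4, 0x8
REVERSE, MATE_REVERSE = 0x10, 0x20


def length_bounds(lengths, k=3.0):
    """Return (low, high) fragment-length bounds as mean +/- k * std (assumed rule)."""
    m = mean(lengths)
    s = pstdev(lengths)
    return m - k * s, m + k * s


def classify_pair(flag, pos, pnext, tlen, min_len, max_len):
    """Classify a read by mate status, orientation, and fragment length.

    Assumes an inward-facing (FR) library: the leftmost read maps to the
    forward strand and the rightmost read to the reverse strand.
    """
    if not flag & PAIRED:
        return "unpaired"
    if flag & UNMAPPED:
        return "unmapped"
    if flag & MATE_UNMAPPED:
        return "single_mate"  # read is mapped, its mate is not

    read_rev = bool(flag & REVERSE)
    mate_rev = bool(flag & MATE_REVERSE)
    if read_rev == mate_rev:
        return "same_strand"  # both mates on one strand: candidate inversion

    # Strand of whichever mate starts further left on the reference
    leftmost_is_reverse = read_rev if pos <= pnext else mate_rev
    if leftmost_is_reverse:
        return "outward"  # mates point away from each other

    length = abs(tlen)
    if length > max_len:
        return "long_fragment"  # candidate deletion in the target genome
    if length < min_len:
        return "short_fragment"  # candidate insertion in the target genome
    return "proper"
```

For example, `classify_pair(99, 100, 300, 350, 200, 500)` falls in the "proper" category, while the same record with a TLEN of 5100 is reported as "long_fragment".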
Custom tracks were implemented to enhance the detection of structural variations and improve interpretability of genomic regions.
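A custom coverage track can be sketched in a few lines. The example below is a minimal sketch, not the project's actual scripts: `coverage` and `to_wig` are hypothetical names, positions are assumed to be 1-based, and intervals are given as (start, length) pairs. Passing read positions and read lengths gives sequence coverage; passing the leftmost position and fragment length of each pair gives physical coverage. The output uses the fixedStep variant of the WIG format, which IGV can load.

```python
def coverage(genome_length, intervals):
    """Per-base coverage for 1-based (start, length) intervals.

    Uses a difference array, so the cost is linear in the genome length
    plus the number of intervals.
    """
    diff = [0] * (genome_length + 2)
    for start, length in intervals:
        if length <= 0 or start < 1 or start > genome_length:
            continue  # skip intervals that fall outside the genome
        end = min(start + length - 1, genome_length)
        diff[start] += 1
        diff[end + 1] -= 1

    values = []
    running = 0
    for position in range(1, genome_length + 1):
        running += diff[position]
        values.append(running)
    return values


def to_wig(values, chrom, name):
    """Render per-base values as fixedStep WIG text."""
    lines = [
        f'track type=wiggle_0 name="{name}"',
        f"fixedStep chrom={chrom} start=1 step=1 span=1",
    ]
    lines.extend(str(v) for v in values)
    return "\n".join(lines) + "\n"
```

The `chrom` value must match the sequence name in the reference FASTA header, otherwise IGV will not place the track on the genome.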
- Index the reference genome: `bwa index reference.fasta`
- Align reads: `bwa mem reference.fasta reads1.fastq reads2.fastq > output.sam`
- Convert, sort, and index:
  - `samtools view -bS output.sam > output.bam`
  - `samtools sort output.bam -o sorted.bam`
  - `samtools index sorted.bam`
- Generate custom tracks using the provided Python scripts
- Load the .bam and .wig files into IGV for visualization
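The alignment steps above could also be driven from Python. The sketch below is one possible wrapper and is not part of the repository: `build_steps` and `run_steps` are hypothetical names, and it assumes `bwa` and `samtools` are on the PATH. It writes the BAM with `samtools view -o` rather than shell redirection, and it captures the output of `bwa mem` by redirecting stdout to a file.

```python
import subprocess


def build_steps(reference, reads1, reads2, prefix="output"):
    """Return (argv, stdout_path) pairs for the alignment pipeline.

    stdout_path is None when the tool writes its own output files.
    """
    sam = f"{prefix}.sam"
    bam = f"{prefix}.bam"
    sorted_bam = "sorted.bam"
    return [
        (["bwa", "index", reference], None),
        (["bwa", "mem", reference, reads1, reads2], sam),
        (["samtools", "view", "-b", "-o", bam, sam], None),
        (["samtools", "sort", "-o", sorted_bam, bam], None),
        (["samtools", "index", sorted_bam], None),
    ]


def run_steps(steps):
    """Run each step in order, stopping at the first failing command."""
    for argv, stdout_path in steps:
        if stdout_path is None:
            subprocess.run(argv, check=True)
        else:
            with open(stdout_path, "w") as out:
                subprocess.run(argv, check=True, stdout=out)


# Usage (requires bwa and samtools to be installed):
# run_steps(build_steps("reference.fasta", "reads1.fastq", "reads2.fastq"))
```

Keeping command construction separate from execution makes it possible to inspect or log the exact commands before anything is run.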
This project was developed as part of a university assignment in Bioinformatics (LM-18 UniPD).
It is intended for educational purposes only. Unauthorized copying, reuse, or submission of this work (in whole or in part) as original work for academic evaluation is strictly prohibited.
If you use or reference this project, proper citation of the author is required.
Eleonora Signor
Master’s Degree in Computer Science, University of Padua
2021/2022