Bioinformatic_starter_pack contains programs written as homework for the Python course at the Bioinformatics Institute.
Usage examples for the programs marked with * are presented in Showcases.ipynb.
The program bioinf_starter_pack.py provides basic tools to work with different types of bioinformatic data.
- RNASequence / DNASequence / AminoAcidSequence (classes) *: manipulate the three main bioinformatic data types: DNA, RNA and proteins. The implementation of these classes follows the four basic principles of OOP: encapsulation, inheritance, polymorphism, and abstraction.
- filter_fastq (function): filters FASTQ files by GC-content, sequence length and quality scores.
- run_genscan (function) *: sends requests to the Genscan website to predict possible CDS, exons and introns in a sequence of interest. The output is also implemented as a class.
- telegram_logger (decorator): sends the user information about a function of interest. The information includes the result of the run, the running time, and the stdout and stderr output as a log file. The decorator is implemented directly on the Telegram API without specific libraries.
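
The three filtering criteria named for filter_fastq can be sketched as below. This is only an illustration and not the code from bioinf_starter_pack.py: the helper names (gc_content, mean_quality, passes_filters), the parameter names, the Phred+33 encoding and the use of the mean quality score are all assumptions.

```python
def gc_content(seq: str) -> float:
    """Return the GC-content of a sequence as a percentage (0-100)."""
    if not seq:
        return 0.0
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def mean_quality(quality: str, offset: int = 33) -> float:
    """Return the mean Phred score of a quality string (Phred+33 assumed)."""
    if not quality:
        return 0.0
    return sum(ord(ch) - offset for ch in quality) / len(quality)


def passes_filters(seq, quality, gc_bounds=(0, 100),
                   length_bounds=(0, 2**32), quality_threshold=0):
    """Hypothetical helper: True if a read satisfies all three criteria."""
    gc_min, gc_max = gc_bounds
    len_min, len_max = length_bounds
    return (
        gc_min <= gc_content(seq) <= gc_max
        and len_min <= len(seq) <= len_max
        and mean_quality(quality) >= quality_threshold
    )


# Toy reads: name -> (sequence, quality string)
reads = {
    "read1": ("GGCCGGCC", "IIIIIIII"),  # GC-content too high for the bounds below
    "read2": ("ATATGCAT", "!!!!!!!!"),  # quality too low for the threshold below
    "read3": ("ATGC", "IIII"),          # satisfies all three criteria
}
kept = {
    name: record
    for name, record in reads.items()
    if passes_filters(*record, gc_bounds=(20, 80),
                      length_bounds=(4, 10), quality_threshold=20)
}
```

Keeping each criterion in its own small function makes the thresholds easy to test separately; the real function may combine them differently.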
The program bio_files_processor.py facilitates parsing FASTA and GenBank files.
- convert_multiline_fasta_to_oneline (function): converts any number of DNA/RNA/protein sequences in a FASTA file from multiline format to one line per sequence.
- select_genes_from_gbk_to_fasta (function): selects the nearest neighbors of a gene of interest from a GenBank file and outputs them in FASTA format.
- OpenFasta (context manager) *: opens a FASTA file and returns separate FASTA records, each including id, description and sequence. The implementation of the context manager is similar to the built-in open function.
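
A context manager of the kind described for OpenFasta could look roughly like the sketch below. It is not the actual OpenFasta class: the names FastaRecord, FastaReaderSketch and read_records are hypothetical, and splitting the header at the first space into id and description is an assumption. Joining the sequence lines of each record also shows the idea behind converting a multiline FASTA to one line per sequence.

```python
import os
import tempfile
from dataclasses import dataclass


@dataclass
class FastaRecord:
    """Hypothetical container for one FASTA record."""
    id: str
    description: str
    seq: str


class FastaReaderSketch:
    """Hypothetical stand-in for a FASTA context manager."""

    def __init__(self, path):
        self.path = path
        self.handle = None

    def __enter__(self):
        self.handle = open(self.path)
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.handle.close()
        return False  # do not suppress exceptions

    def read_records(self):
        """Yield records one by one, joining multiline sequences."""
        header, chunks = None, []
        for line in self.handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                if header is not None:
                    yield self._make_record(header, chunks)
                header, chunks = line[1:], []
            else:
                chunks.append(line)
        if header is not None:
            yield self._make_record(header, chunks)

    @staticmethod
    def _make_record(header, chunks):
        # Assumption: id is the text before the first space in the header
        seq_id, _, description = header.partition(" ")
        return FastaRecord(seq_id, description, "".join(chunks))


# Usage on a small temporary file
with tempfile.NamedTemporaryFile("w", suffix=".fasta", delete=False) as tmp:
    tmp.write(">seq1 first example\nATGC\nGGTA\n>seq2\nTTAA\n")

with FastaReaderSketch(tmp.name) as fasta:
    records = list(fasta.read_records())
os.remove(tmp.name)
```

Reading lazily with a generator keeps memory use low for large files, which is one common reason to wrap parsing in a context manager.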
The program custom_random_forest.py contains a custom implementation of the Random Forest classifier as the RandomForestClassifierCustom class *.
As part of this course, the class was refined using parallel programming: setting the n_jobs argument speeds up execution.
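
The general idea of distributing ensemble training over n_jobs workers can be sketched with the standard library alone. This is a toy bagged ensemble of decision stumps, not RandomForestClassifierCustom: only the n_jobs name comes from the description above, while fit_stump, fit_ensemble, predict and the use of a thread pool are assumptions. With pure-Python learners threads are unlikely to give a real speedup because of the GIL, so an actual implementation would probably rely on processes or on estimators that release the GIL.

```python
import random
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def fit_stump(X, y, seed):
    """Fit one decision stump on a bootstrap sample, using one random feature."""
    rng = random.Random(seed)
    n = len(X)
    idx = [rng.randrange(n) for _ in range(n)]   # bootstrap sample indices
    feature = rng.randrange(len(X[0]))           # random feature choice
    values = [X[i][feature] for i in idx]
    threshold = sum(values) / len(values)        # split at the sample mean
    left = Counter(y[i] for i in idx if X[i][feature] <= threshold)
    right = Counter(y[i] for i in idx if X[i][feature] > threshold)
    fallback = Counter(y[i] for i in idx).most_common(1)[0][0]
    left_label = left.most_common(1)[0][0] if left else fallback
    right_label = right.most_common(1)[0][0] if right else fallback
    return feature, threshold, left_label, right_label


def fit_ensemble(X, y, n_estimators=25, n_jobs=1, random_state=0):
    """Fit the stumps with up to n_jobs worker threads; one seed per estimator."""
    seeds = [random_state + i for i in range(n_estimators)]
    with ThreadPoolExecutor(max_workers=n_jobs) as pool:
        # pool.map preserves input order, so results do not depend on n_jobs
        return list(pool.map(lambda seed: fit_stump(X, y, seed), seeds))


def predict(stumps, row):
    """Majority vote over all stumps."""
    votes = Counter(
        left if row[feature] <= threshold else right
        for feature, threshold, left, right in stumps
    )
    return votes.most_common(1)[0][0]


# Toy data: one feature, two well-separated classes
X = [[0.0], [1.0], [2.0], [8.0], [9.0], [10.0]]
y = [0, 0, 0, 1, 1, 1]
stumps_serial = fit_ensemble(X, y, n_jobs=1)
stumps_parallel = fit_ensemble(X, y, n_jobs=4)
```

Giving each estimator its own seed keeps the fitted ensemble reproducible regardless of how many workers are used.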
The program test_bioinf_starter_pack.py includes classes to test the programs mentioned above.
I would like to express my deep gratitude to the course team. All participants are listed at the end of this page.
They are the best teachers ever✨✨