Hw18 #4
base: main
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,2 +1,3 @@ | ||
| # BI_toolkit | ||
| Homework for Python course in the Bioinformatic Institute | ||
| # Homework repository for 2023-2024 Informatics Bioinstitute professional retraining program | ||
|
|
||
| Here, several homework assignments concerning the processing of bioinformatics data are gathered. Please see Showcases.ipynb for some examples. | ||
|
It would still be worth saying more here about what is in the repository, what data you work with, and what functionality is available. After reading the README, a person should understand whether it is worth digging further into the repository or not :) |
||
|
Since the notebook replaces the README in your repository, it would be better to add some explanations: here we test the basic operations for DNA, RNA and proteins, here we check that the error is raised, and so on. Ideally, someone who does not really understand what the code is about should be able to grasp everything at a quick glance. |
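As an illustration of the kind of annotated cell the comment above asks for, here is a minimal, self-contained sketch. It does not use the code from `beginner_bioinf_tools`; the names `reverse_complement_rna` and `check_dna` are hypothetical stand-ins, and a plain `ValueError` is assumed in place of the repository's `InvalidInputError`.

```python
# Hypothetical stand-ins for illustration; not the code from beginner_bioinf_tools.
RNA_COMPLEMENT = str.maketrans("AUGCaugc", "UACGuacg")
DNA_ALPHABET = set("ATGCatgc")


def reverse_complement_rna(sequence: str) -> str:
    """Complement every base (preserving case), then reverse the result."""
    return sequence.translate(RNA_COMPLEMENT)[::-1]


def check_dna(sequence: str) -> str:
    """Return the sequence unchanged, or raise ValueError on a non-DNA symbol."""
    unexpected = set(sequence) - DNA_ALPHABET
    if unexpected:
        raise ValueError(f"Not a DNA sequence, unexpected symbols: {sorted(unexpected)}")
    return sequence


# Basic operation: reverse complement of a mixed-case RNA sequence.
print(reverse_complement_rna("AugGUUaC"))  # prints GuAACcaU

# Error check: 'Y' is not a DNA nucleotide, so the input is rejected.
try:
    check_dna("ATGY")
except ValueError as error:
    print(f"Rejected as expected: {error}")
```

Catching the exception inside the cell keeps the notebook readable: the reader sees a one-line message instead of a full traceback, and the short comments state what each cell is meant to demonstrate.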
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,224 @@ | ||
| { | ||
| "cells": [ | ||
| { | ||
| "cell_type": "code", | ||
| "execution_count": 11, | ||
| "id": "4b3c7110", | ||
| "metadata": {}, | ||
| "outputs": [], | ||
| "source": [ | ||
| "import os\n", | ||
| "from bio_files_processor import OpenFasta\n", | ||
| "from beginner_bioinf_tools import DNASequence, RNASequence, AminoAcidSequence, filter_fastq" | ||
| ] | ||
| }, | ||
| { | ||
| "cell_type": "code", | ||
| "execution_count": 12, | ||
| "id": "cce834fa", | ||
| "metadata": {}, | ||
| "outputs": [], | ||
| "source": [ | ||
| "path_to_fasta = os.path.join('data', 'fasta_example.fasta')\n", | ||
| "path_to_fastq = os.path.join('data', 'example_fastq.fastq')" | ||
| ] | ||
| }, | ||
| { | ||
| "cell_type": "code", | ||
| "execution_count": 13, | ||
| "id": "b17d1431", | ||
| "metadata": {}, | ||
| "outputs": [ | ||
| { | ||
| "data": { | ||
| "text/plain": [ | ||
| "'UACG'" | ||
| ] | ||
| }, | ||
| "execution_count": 13, | ||
| "metadata": {}, | ||
| "output_type": "execute_result" | ||
| } | ||
| ], | ||
| "source": [ | ||
| "str(DNASequence('ATGC').transcribe())" | ||
| ] | ||
| }, | ||
| { | ||
| "cell_type": "code", | ||
| "execution_count": 14, | ||
| "id": "f0550631", | ||
| "metadata": {}, | ||
| "outputs": [ | ||
| { | ||
| "data": { | ||
| "text/plain": [ | ||
| "'GuAACcaU'" | ||
| ] | ||
| }, | ||
| "execution_count": 14, | ||
| "metadata": {}, | ||
| "output_type": "execute_result" | ||
| } | ||
| ], | ||
| "source": [ | ||
| "str(RNASequence('AugGUUaC').reverse_complement())" | ||
| ] | ||
| }, | ||
| { | ||
| "cell_type": "code", | ||
| "execution_count": 15, | ||
| "id": "cf32f6b5", | ||
| "metadata": {}, | ||
| "outputs": [ | ||
| { | ||
| "ename": "InvalidInputError", | ||
| "evalue": "Cannot complement: incorrect input sequence. Only nucleotides (in both cases) are supported!", | ||
| "output_type": "error", | ||
| "traceback": [ | ||
| "\u001b[1;31m---------------------------------------------------------------------------\u001b[0m", | ||
| "\u001b[1;31mInvalidInputError\u001b[0m Traceback (most recent call last)", | ||
| "Cell \u001b[1;32mIn[15], line 1\u001b[0m\n\u001b[1;32m----> 1\u001b[0m \u001b[38;5;28mstr\u001b[39m(DNASequence(\u001b[38;5;124m'\u001b[39m\u001b[38;5;124mATGY\u001b[39m\u001b[38;5;124m'\u001b[39m)\u001b[38;5;241m.\u001b[39mcomplement())\n", | ||
| "File \u001b[1;32m~\\1ib\\python\\hw18\\beginner_bioinf_tools.py:148\u001b[0m, in \u001b[0;36mDNASequence.__init__\u001b[1;34m(self, sequence)\u001b[0m\n\u001b[0;32m 146\u001b[0m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39msequence \u001b[38;5;241m=\u001b[39m sequence\n\u001b[0;32m 147\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m \u001b[38;5;129;01mnot\u001b[39;00m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mis_correct_alphabet():\n\u001b[1;32m--> 148\u001b[0m \u001b[38;5;28;01mraise\u001b[39;00m InvalidInputError(\u001b[38;5;124m'\u001b[39m\u001b[38;5;124mCannot complement: incorrect input sequence. Only nucleotides (in both cases) are supported!\u001b[39m\u001b[38;5;124m'\u001b[39m)\n", | ||
| "\u001b[1;31mInvalidInputError\u001b[0m: Cannot complement: incorrect input sequence. Only nucleotides (in both cases) are supported!" | ||
| ] | ||
| } | ||
| ], | ||
| "source": [ | ||
| "str(DNASequence('ATGY').complement())" | ||
| ] | ||
| }, | ||
| { | ||
| "cell_type": "code", | ||
| "execution_count": 16, | ||
| "id": "e6ec1bdd", | ||
| "metadata": {}, | ||
| "outputs": [ | ||
| { | ||
| "data": { | ||
| "text/plain": [ | ||
| "<beginner_bioinf_tools.DNASequence at 0x1b7f1878290>" | ||
| ] | ||
| }, | ||
| "execution_count": 16, | ||
| "metadata": {}, | ||
| "output_type": "execute_result" | ||
| } | ||
| ], | ||
| "source": [ | ||
| "AminoAcidSequence('KMGf').convert_to_gene()  # returns a DNASequence object; wrap it in str() to see the sequence" | ||
| ] | ||
| }, | ||
| { | ||
| "cell_type": "code", | ||
| "execution_count": 21, | ||
| "id": "06999e39", | ||
| "metadata": {}, | ||
| "outputs": [ | ||
| { | ||
| "data": { | ||
| "text/plain": [ | ||
| "'AAAATGGGGttc'" | ||
| ] | ||
| }, | ||
| "execution_count": 21, | ||
| "metadata": {}, | ||
| "output_type": "execute_result" | ||
| } | ||
| ], | ||
| "source": [ | ||
| "str(AminoAcidSequence('KMGf').convert_to_gene())" | ||
| ] | ||
| }, | ||
| { | ||
| "cell_type": "code", | ||
| "execution_count": 17, | ||
| "id": "562bb696", | ||
| "metadata": {}, | ||
| "outputs": [ | ||
| { | ||
| "data": { | ||
| "text/plain": [ | ||
| "'LYS---MET---GLY---phe'" | ||
| ] | ||
| }, | ||
| "execution_count": 17, | ||
| "metadata": {}, | ||
| "output_type": "execute_result" | ||
| } | ||
| ], | ||
| "source": [ | ||
| "AminoAcidSequence('KMGf').recode_3letter_to_1letter('---')" | ||
| ] | ||
| }, | ||
| { | ||
| "cell_type": "code", | ||
| "execution_count": 18, | ||
| "id": "4b6286fb", | ||
| "metadata": {}, | ||
| "outputs": [ | ||
| { | ||
| "data": { | ||
| "text/plain": [ | ||
| "{'H': 30.77,\n", | ||
| " 'S': 23.08,\n", | ||
| " 'F': 15.38,\n", | ||
| " 'K': 7.69,\n", | ||
| " 'M': 7.69,\n", | ||
| " 'G': 7.69,\n", | ||
| " 'L': 7.69}" | ||
| ] | ||
| }, | ||
| "execution_count": 18, | ||
| "metadata": {}, | ||
| "output_type": "execute_result" | ||
| } | ||
| ], | ||
| "source": [ | ||
| "AminoAcidSequence('HKSHMGFFHSHSL').info_amino_acid_percentage()" | ||
| ] | ||
| }, | ||
| { | ||
| "cell_type": "code", | ||
| "execution_count": 20, | ||
| "id": "697f02c5", | ||
| "metadata": {}, | ||
| "outputs": [ | ||
| { | ||
| "data": { | ||
| "text/plain": [ | ||
| "[SeqRecord(seq=Seq('TATAGCTACTACACCTTCATGTGATATAACTTCAAGCAATTTTTCATTTAACAT...CTC'), id='SRX079804:1:SRR292678:1:1101:391832:391832', name='SRX079804:1:SRR292678:1:1101:391832:391832', description='SRX079804:1:SRR292678:1:1101:391832:391832 2:N:0:1 BH:ok', dbxrefs=[])]" | ||
| ] | ||
| }, | ||
| "execution_count": 20, | ||
| "metadata": {}, | ||
| "output_type": "execute_result" | ||
| } | ||
| ], | ||
| "source": [ | ||
| "filter_fastq(path_to_fastq, gc_bounds=30, length_bounds=(60,70), quality_threshold=35)" | ||
| ] | ||
| } | ||
| ], | ||
| "metadata": { | ||
| "kernelspec": { | ||
| "display_name": "Python 3 (ipykernel)", | ||
| "language": "python", | ||
| "name": "python3" | ||
| }, | ||
| "language_info": { | ||
| "codemirror_mode": { | ||
| "name": "ipython", | ||
| "version": 3 | ||
| }, | ||
| "file_extension": ".py", | ||
| "mimetype": "text/x-python", | ||
| "name": "python", | ||
| "nbconvert_exporter": "python", | ||
| "pygments_lexer": "ipython3", | ||
| "version": "3.11.5" | ||
| } | ||
| }, | ||
| "nbformat": 4, | ||
| "nbformat_minor": 5 | ||
| } |
It would be somewhat better to add more here about what is implemented :) What kind of data is processed, and what the processing actually is. After reading the README, a person should have some idea of whether this repository is useful to them or not.
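
One way to describe a feature in the README is to pair a sentence with a small runnable example. The sketch below reproduces the output of the `info_amino_acid_percentage()` cell from the notebook. It is a standalone approximation under stated assumptions, not the repository's implementation: the function name `amino_acid_percentage` is hypothetical, and rounding to two decimals with residues sorted by descending frequency is inferred from the notebook output.

```python
from collections import Counter


def amino_acid_percentage(sequence: str) -> dict[str, float]:
    """Return the percentage of each residue, most frequent first.

    Assumption: values are rounded to two decimals, as in the notebook output.
    """
    counts = Counter(sequence)
    total = len(sequence)
    # most_common() sorts by count in descending order; ties keep first-seen order.
    return {
        residue: round(100 * count / total, 2)
        for residue, count in counts.most_common()
    }


print(amino_acid_percentage("HKSHMGFFHSHSL"))
# prints {'H': 30.77, 'S': 23.08, 'F': 15.38, 'K': 7.69, 'M': 7.69, 'G': 7.69, 'L': 7.69}
```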