Develop by CaptnClementine · Pull Request #1 · CaptnClementine/gene_code_tools

CaptnClementine · 2023-10-07T17:21:10Z

This branch was made for homework 5.


albidgy

Good work!

Pros:

Excellent commit messages.
A detailed and clear README.
The FASTQ code looks very good. Everything works correctly, except that passing a floating-point number as gc_bounds is not handled.
This was a good place to see the student's progress, and yours is very good :)
Code style and naming in the last part of the homework are excellent.

Remarks:

You did not pay enough attention to the previous assignments. This mostly concerns code style (see the comments).
The main script does not have the .py extension.
The import in the middle of the main script was not removed.
dna_rna_tools_utils.py has no import from typing.

Points:

3 FASTQ filters 3/3
Main function 0.2/1 (no .py extension, forgot to remove the import in the middle of the script, a float cannot be passed to gc_bounds)
README 2/2
Repository structure and code quality 2.5/3. No typing import, spacing, naming (i), indentation, redundant parentheses in if. For the most part these are penalties carried over from previous homework that should have been fixed.
Improvement of the DNA/RNA and protein tools code 0.5/1

Total: 8.2 points

albidgy · 2023-10-12T10:41:01Z

+DNA = set ('ATGCatgc')
+RNA = set ('AUGCaugc')


Great that you took Nikita's comment into account.
One flaw: there should be no space between a function name (here, a type-conversion function) and its argument list.

Suggested change

DNA = set ('ATGCatgc')

RNA = set ('AUGCaugc')

DNA = set('ATGCatgc')

RNA = set('AUGCaugc')

albidgy · 2023-10-12T10:41:05Z

+    "a": "a", "A": "A",
+    "t": "u", "T": "U",
+    "u": "t", "U": "T",
+    "g": "g", "G": "G",
+    "c": "c", "C": "C"


Just for the future: people usually use either single or double quotes. It is better to pick one style.

albidgy · 2023-10-12T10:41:08Z

+    return unique_chars <= RNA
+
+
+def type_rna_or_dna(seqs: List[str]) -> str:


List is not imported, so the code will not run.
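To illustrate the point about the missing import: annotations in a `def` signature are evaluated when the function is defined, so `List[str]` raises `NameError` unless `List` has been imported (on Python 3.9+ the built-in `list[str]` also works without an import). A minimal sketch follows; `count_ambiguous` is a hypothetical helper used only for illustration, not a function from the PR.

```python
from typing import List  # without this line, the annotation below raises NameError

DNA = set('ATGCatgc')
RNA = set('AUGCaugc')


def is_dna(seq: str) -> bool:
    """Return True if every character of seq is a DNA letter."""
    return set(seq) <= DNA


def is_rna(seq: str) -> bool:
    """Return True if every character of seq is an RNA letter."""
    return set(seq) <= RNA


def count_ambiguous(seqs: List[str]) -> int:
    """Hypothetical helper: count sequences that are valid as both DNA and RNA."""
    ambiguous = 0
    for seq in seqs:
        if is_dna(seq) and is_rna(seq):
            ambiguous += 1
    return ambiguous
```

A sequence is counted as ambiguous here when it contains neither T nor U, so it passes both checks.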

albidgy · 2023-10-12T10:41:10Z

+    counter_rna = 0
+    ambigiuos = 0
+    for i in seqs:
+      if is_dna(i) and is_rna(i):


The indentation is too shallow.

Suggested change

      if is_dna(i) and is_rna(i):

        if is_dna(i) and is_rna(i):

albidgy · 2023-10-12T10:41:11Z

+    ambigiuos = 0
+    for i in seqs:
+      if is_dna(i) and is_rna(i):
+          ambigiuos = ambigiuos+1


Spaces are missing around the operator.

Suggested change

ambigiuos = ambigiuos+1

ambigiuos = ambigiuos + 1

albidgy · 2023-10-12T11:16:32Z

+    return filtered_seqs
+
+
+from amino_analyzer_utils import aa_weight, count_hydroaffinity, peptide_cutter, one_to_three_letter_code, sulphur_containing_aa_counter


It is better to move all imports to the top of the file. Also, this import will not work, because the amino_analyzer_utils script is located in the gene_code_utils directory. You probably forgot to delete this line, but the code will not run with it.

albidgy · 2023-10-12T11:17:32Z

@@ -0,0 +1,145 @@
+from typing import Dict, Tuple, Union


Nothing will work at all, because this file does not have the .py extension.

albidgy · 2023-10-12T11:27:34Z

+    filtered_seqs = dict()
+    for fastq_name, (seq, quality) in seqs.items():
+        if is_dna(seq):
+            if type(gc_bounds) == int:


And if you pass, for example, gc_bounds=44.4, it will not work.

Thanks
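One way to accept both integers and floats is to test with `isinstance(..., (int, float))` instead of `type(...) == int`. The sketch below is only an illustration: `normalize_bounds` and `is_in_bounds` are hypothetical names, and it assumes that a single number means an upper bound with a lower bound of 0 and that both bounds are inclusive, which this thread does not confirm.

```python
from typing import Tuple, Union

Number = Union[int, float]


def normalize_bounds(bounds: Union[Number, Tuple[Number, Number]]) -> Tuple[Number, Number]:
    """Turn a single number into (0, number); pass a (lower, upper) pair through.

    Assumption: a lone number is treated as the upper bound.
    """
    if isinstance(bounds, (int, float)):  # accepts 44 as well as 44.4
        return (0, bounds)
    lower, upper = bounds
    return (lower, upper)


def is_in_bounds(value: float, bounds: Union[Number, Tuple[Number, Number]]) -> bool:
    """Check lower <= value <= upper (inclusive bounds are an assumption)."""
    lower, upper = normalize_bounds(bounds)
    return lower <= value <= upper
```

With this shape, `is_in_bounds(40.0, 44.4)` and `is_in_bounds(40.0, (20, 80))` go through the same code path.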

albidgy · 2023-10-12T11:46:53Z

+    u_to_t = {'U': 'T', 'u': 't'}
+    for i in seq:
+        if i in u_to_t:
+            c_dna.append(u_to_t.get(i))
+        else:
+            c_dna.append(i)


It would have been better to implement this the same way as the transcribe function.
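The body of `transcribe` is not shown in this thread, so the following is only a guess at what a more compact version could look like, assuming a lookup table with a default: `dict.get(base, base)` returns the base unchanged when it is not in the table, which removes the explicit `if`/`else`.

```python
U_TO_T = {'U': 'T', 'u': 't'}


def reverse_transcription(seq: str) -> str:
    """Replace every U/u with T/t and leave all other characters as they are."""
    return ''.join(U_TO_T.get(base, base) for base in seq)
```

An equivalent alternative is `seq.translate(str.maketrans('Uu', 'Tt'))`.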

albidgy · 2023-10-12T11:47:03Z

+    """
+
+    filtered_seqs = dict()
+    for fastq_name, (seq, quality) in seqs.items():


Great! 👍

Thanks for the review!

CaptnClementine · 2023-10-17T19:01:21Z

Ahh, got it, thanks

On Tue, Oct 17, 2023 at 18:59, Alexandra Kasianova ***@***.***> wrote:

In gene_code_utils/amino_analyzer_utils.py:

+    average_weights = {
+        'A': 71.0788, 'R': 156.1875, 'N': 114.1038, 'D': 115.0886, 'C': 103.1388,
+        'E': 129.1155, 'Q': 128.1307, 'G': 57.0519, 'H': 137.1411, 'I': 113.1594,
+        'L': 113.1594, 'K': 128.1741, 'M': 131.1926, 'F': 147.1766, 'P': 97.1167,
+        'S': 87.0782, 'T': 101.1051, 'W': 186.2132, 'Y': 163.1760, 'V': 99.1326
+    }

The idea here is that a constant is a data structure (preferably, but not always, immutable) that you fill in by hand once and never change afterwards. It is like an English-Russian dictionary: you consult it to look up the meaning of a word. The same principle applies here :)
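The advice about constants can be sketched as follows: the table moves to module level, gets an upper-case name, and is wrapped in `types.MappingProxyType` so accidental writes fail. The name `AVERAGE_WEIGHTS` and the helper `sum_residue_weights` are assumptions made for this illustration; the helper only adds up the table values and is not the PR's `aa_weight`.

```python
from types import MappingProxyType

# Module-level constant: written once by hand, only read afterwards.
# MappingProxyType gives a read-only view, so item assignment raises TypeError.
AVERAGE_WEIGHTS = MappingProxyType({
    'A': 71.0788, 'R': 156.1875, 'N': 114.1038, 'D': 115.0886, 'C': 103.1388,
    'E': 129.1155, 'Q': 128.1307, 'G': 57.0519, 'H': 137.1411, 'I': 113.1594,
    'L': 113.1594, 'K': 128.1741, 'M': 131.1926, 'F': 147.1766, 'P': 97.1167,
    'S': 87.0782, 'T': 101.1051, 'W': 186.2132, 'Y': 163.1760, 'V': 99.1326,
})


def sum_residue_weights(seq: str) -> float:
    """Hypothetical helper: look up each residue in the constant and sum the values."""
    return sum(AVERAGE_WEIGHTS[aa] for aa in seq.upper())
```

Functions then read from the table the way one consults a dictionary, and nothing in the module rebuilds it on each call.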

CaptnClementine and others added 30 commits October 6, 2023 22:11

Create filter_dna_utils.py

607a749

Add 'is_dna' function to check input sequence

d279d38

Add 'count_gc_content' function

2165478

Add 'is_in_gc_bounds' function to check sequence threshold

a934e0c

Add 'is_in_length_bounds' function to check sequence threshold

85f43f7

Add 'check_quality' function to check sequence threshold

d3a3456

Create gene_code_main_operations

547e14e

Add 'filter_dna' to filter sequences by GC-content, length and quality

44a9be2

Create amino_analyzer_utils.py

499b0e9

Add 'is_aa' function to check if a sequence contains only amino acids

7a35ab5

Add 'choose_weight' function to choose the weight type

5349b7f

Add 'aa_weight' function to calculate the protein weight

e5d408b

Add 'count_hydroaffinity' function to count it in protein sequence

f142af3

Add 'peptide_cutter' to identifies cleavage sites for "trypsin" and "chymotrypsin"

1d0610c

Add 'one_to_three_letter_code' function to convert protein sequence

17859ad

Add 'sulphur_containing_aa_counter' function

8e2e973

This function counts sulphur-containing amino acids (Cysteine and Methionine) in a protein sequence.

Add 'run_amino_analyzer' function to analyse protein sequence in one-letter code

8b2a489

Create dna_rna_tools_utils.py

167f505

Add 'is_dna' function to check if a sequence is DNA

9690bbd

Add 'is_rna' function to check if a sequence is RNA

9cacb05

Add 'reverse' function to reverse a sequence

11d5112

Add 'complement' function to find complement of sequence

bed0dcc

Add 'reverse_complement' function to find reverse complement sequence

e40468c

Add 'reverse_transcription' function to reverse transcription of RNA

d9dbaa8

Add 'type_rna_or_dna' function to detect type of sequence

0a64a42

Add 'has_start_codon' function to check if RNA has start codon

8855e07

Add 'is_palindrome' function to check if a sequence is a palindrome

1af134a

Add 'transcribe' function to transcribe a DNA sequence into RNA

49e1d63

Add 'run_dna_rna_tools' function to make various opertaions with DNA/RNA

a95b168

Update README.md

40b9e86

CaptnClementine added 4 commits October 8, 2023 10:05

Correct bounds in 'filter_dna' function

94751f3

Correct bounds in GC and length filter functions

6172210

Update README.md to correct bounds descrition in 'filter_dna' function

f412a04

Update README.md

a06d117

albidgy reviewed Oct 12, 2023
