Development by grishchenkoira · Pull Request #1 · grishchenkoira/BSAT

grishchenkoira · 2023-10-07T21:10:37Z

This pull request adds the utility Bio_Seq_Analysis_Tool.py with 3 main functions, a folder with the 3 modules required by Bio_Seq_Analysis_Tool, and a README.md file.

… of this one

albidgy

Pros:

Good job adding examples to the README that show what is returned for a given input sequence.
The code you write works correctly.
The docstrings and type hints are correct and detailed. Great!

Remarks:

Don't forget to add spaces.
The boundary values are not included in fastq_analysis.
System files should not be added to GitHub; it is bad practice. You added the .ipynb_checkpoints directory.
It is not at all clear what was done in each commit. I would like more specific commit messages that say what was added or done.

- Your code turns out rather hard to read. That is not a big problem right now, but it needs work. I have 2 suggestions:

Once you have written a function, try to think of how to remove the repeated parts of the code (if there are any).
I think you can look at your groupmates' past homework. We point out strengths and weaknesses there. Analyze the code and work out why you like or dislike it.
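The remark about boundary values in fastq_analysis can be illustrated with a minimal sketch. The helper name `is_within_bounds` and the `(lower, upper)` tuple are hypothetical and not taken from the PR; the only point is that both ends of the interval are accepted.

```python
from typing import Tuple


def is_within_bounds(value: float, bounds: Tuple[float, float]) -> bool:
    """Return True if value lies in [lower, upper], both ends included.

    Hypothetical helper: the name and signature are assumptions,
    not the code from the PR.
    """
    lower, upper = bounds
    # A chained comparison with <= keeps both boundary values inside the range.
    return lower <= value <= upper
```

With strict comparisons (`<`), a read whose value equals a bound would be filtered out, which is presumably what the remark refers to.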

Баллы:

3 фильтрации FASTQ 3/3
Главная функция 0.9/1 (границы значений не включены)
README 2/2
Структура репозитория и качество кода 2/3 (-0.5 за системные файлы, -0.3 за комментарии к коммитам, -0.2 Отсутствие пробелов).
Улучшение кода ДНК/РНК и белковых тулов 0.8/1.

Итого: 8.7 баллов

albidgy · 2023-10-16T09:20:29Z

+    :rtype: str
+    :return: complement sequence   
+    """
+    complement_dict = {'A': 'T', 'C': 'G', 


About the dictionary: as I wrote to you before, it is better to make it a constant and move it outside the function.
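A minimal sketch of what this could look like. The mapping is copied from the diff; the constant name `COMPLEMENT_DICT` and the function name `complement` are assumptions.

```python
# Module-level constant: created once at import time and shared by all functions.
# The mapping itself is the one shown in the diff.
COMPLEMENT_DICT = {'A': 'T', 'C': 'G', 'G': 'C', 'T': 'A', 'U': 'A',
                   'a': 't', 'c': 'g', 'g': 'c', 't': 'a', 'u': 'a'}


def complement(seq: str) -> str:
    """Return the complement of seq (hypothetical function name)."""
    # The function only reads the constant; it no longer rebuilds it on every call.
    return ''.join(COMPLEMENT_DICT[nucl] for nucl in seq)
```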

albidgy · 2023-10-16T09:20:30Z

+    complement_dict = {'A': 'T', 'C': 'G', 
+                   'G': 'C', 'T': 'A', 'U': 'A', 'a': 't',
+                   'c': 'g', 'g': 'c', 't': 'a', 'u': 'a'}
+    complement_seq = []


Thank you for this!

albidgy · 2023-10-16T09:20:31Z

+    :return: GC-contentn percent 
+    """
+    length = len(seq)
+    gc_content = 0.0


You don't need to create this variable in advance; it will be created on the fly on line 78.

albidgy · 2023-10-16T09:20:34Z

+    length = len(seq)
+    gc_content = 0.0
+    seq_up = seq.upper()
+    c = seq_up.count("C")


The previous name was better: c_nucl

albidgy · 2023-10-16T09:20:36Z

+    seq_up = seq.upper()
+    c = seq_up.count("C")
+    g = seq_up.count("G")
+    gc_content = round(((c+g)/length*100),2)


Don't forget about spaces. Also, one pair of parentheses here is redundant.

Suggested change

gc_content = round(((c+g)/length*100),2)

gc_content = round((c+g) / length * 100 , 2)
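Putting the remarks on this function together (no pre-initialized variable, descriptive names, spaces around operators, no redundant parentheses), a sketch could look like the following. The function name `gc_content` and the name `g_nucl` are assumptions; `c_nucl` is the name mentioned in the review.

```python
def gc_content(seq: str) -> float:
    """Return the GC content of seq as a percentage, rounded to 2 decimals.

    Sketch only: raises ZeroDivisionError for an empty sequence.
    """
    seq_up = seq.upper()
    c_nucl = seq_up.count('C')
    g_nucl = seq_up.count('G')
    # The result is computed and returned directly, so no prior `gc_content = 0.0`.
    return round((c_nucl + g_nucl) / len(seq) * 100, 2)
```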

albidgy · 2023-10-16T09:20:54Z

+            if len(sep_seq[0]) == 1:
+                encode_seq.append(RESIDUES_NAMES_THREE[residue])
+    for residue, reg in zip(encode_seq, save_register(seq)):
+        if (reg == 1):


The parentheses are not needed.

albidgy · 2023-10-16T09:20:56Z

+                encode_seq.append(RESIDUES_NAMES_THREE[residue])
+    for residue, reg in zip(encode_seq, save_register(seq)):
+        if (reg == 1):
+            encode_seq_registered += residue.upper()


It is better to use append().
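A sketch of the append() approach: collect the pieces in a list and join them once at the end, instead of growing a string with `+=` inside the loop. The function name `apply_register` and the `else` branch (lowercasing) are assumptions, since the diff only shows the `reg == 1` case.

```python
from typing import Iterable, List


def apply_register(residues: Iterable[str], registers: Iterable[int]) -> str:
    """Restore letter case of residues from a register mask (hypothetical helper)."""
    parts: List[str] = []
    for residue, reg in zip(residues, registers):
        if reg == 1:
            parts.append(residue.upper())
        else:
            # Assumption: any other register value means lowercase.
            parts.append(residue.lower())
    # A single join at the end builds the final string.
    return ''.join(parts)
```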

albidgy · 2023-10-16T09:20:58Z

+            fin_seq.append(residue)
+            if ((i+1) % 3 == 0):
+                fin_seq.append(' ')
+        return ' '.join(fin_seq)


Suggested change

return ' '.join(fin_seq)

return ''.join(fin_seq)

It is better to remove the space, otherwise the output looks odd.

The code works correctly, but it has a lot of repetition. If you are interested in how this could be made shorter, message me and I will write an alternative version.

albidgy · 2023-10-16T09:21:01Z

+        residue_count[[tl_code for tl_code in RESIDUES_NAMES if RESIDUES_NAMES[tl_code] == residue][0]] = 0
+    for residue in seq:
+        residue_count[[tl_code for tl_code in RESIDUES_NAMES if RESIDUES_NAMES[tl_code] == residue][0]] += 1


This part could have been rewritten. You now have 2 dictionaries, so it is fairly easy to do this more neatly. The remark has not been addressed.
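One possible way to use the second dictionary, sketched under assumptions: `RESIDUES_NAMES_THREE` is taken to map one-letter codes to three-letter codes (as its use elsewhere in the diff suggests), its contents below are a made-up excerpt, and the function name `count_residues` is hypothetical.

```python
from typing import Dict

# Hypothetical excerpt; the real dictionary in the PR is assumed to cover all residues.
RESIDUES_NAMES_THREE = {'A': 'ALA', 'G': 'GLY', 'K': 'LYS'}


def count_residues(seq: str) -> Dict[str, int]:
    """Count residues in seq, keyed by three-letter code."""
    residue_count: Dict[str, int] = {}
    for residue in seq:
        # Direct lookup replaces the list comprehension that searched
        # RESIDUES_NAMES by value on every iteration.
        tl_code = RESIDUES_NAMES_THREE[residue]
        residue_count[tl_code] = residue_count.get(tl_code, 0) + 1
    return residue_count
```

A direct lookup also removes the need for the separate loop that initializes every counter to 0.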

albidgy · 2023-10-16T09:21:03Z

+            site_full_position.append(f'{site_start_position[counter]}:{site_end_position[counter]}')
+        return f'Site entry in sequence = {site_count}. Site residues can be found at positions: {site_full_position}'
+    else:
+        return f'{site} site is not in sequence!'


The remark has not been addressed.

grishchenkoira added 10 commits October 7, 2023 23:46

Innitial commit for dna_rna_analysis.py

91e416f

Innitial commit for fastq_analysis.py

52772ec

Innitial commit for protein_analysis

edf755b

Add python script with all functions for this module

b247115

Add python script with all functions for this module

78dc3b5

Add python script with all functions for protein module

c6ade48

Initail commit for main script of Bio_Seq_Analysis_Tool

caee4ae

Add forder with required modules for Bio_Seq_Analysis_Tool.py

10e061b

Add python script with all functions into Bio_Seq_Analysis_Tool

1de06fd

Add README for Bio_Seq_Analysis_Tool module with detailed description…

be10c57

… of this one

albidgy reviewed Oct 16, 2023

grishchenkoira wants to merge 10 commits into main from development
