forked from arkatebi/CAFA-Toolset
FormatChecker.py
#!/usr/bin/python
'''
The methods in this script check the format of an input file.
It has the following methods to check the file format:
    check_gaf_format(fh_goa):
        This method checks whether the file (with file handle
        fh_goa) is in GAF format.
        If the file is in GAF format, it returns True.
        Otherwise, it returns False.
    check_sprot_format(fh_sprot):
        This method checks whether the file (with file handle
        fh_sprot) is in UniProtKB/Swissprot format.
        If the file is in UniProtKB/Swissprot format,
        it returns True.
        Otherwise, it returns False.
    check_benchmark_format(fh_benchmark):
        This method returns False if any line of the file
        (with file handle fh_benchmark) does NOT have exactly
        two tab-separated columns.
        Otherwise, it returns True.
        It takes an open file handle, so it does not itself check
        whether the file name is empty, whether the file exists,
        or whether the file size is zero.
'''
import os
import sys
import re
from os.path import basename
from Bio import SwissProt as sp
import stat

def check_gaf_format(fh_goa):
    """
    This method checks whether the format of the file
    (with file handle fh_goa) is in GAF 1.0 or GAF 2.0.
    If the file is in GAF format, it returns True.
    Otherwise, it returns False.
    """
    firstline = fh_goa.readline()
    fields = firstline.strip().split('\t')
    # A header line that starts with '!gaf' marks the file as GAF.
    if re.search(r'^!gaf', firstline):
        return True
    # Without a header, accept a first line with 15 tab-separated fields.
    elif len(fields) == 15:
        return True
    else:
        return False

def check_sprot_format(fh_sprot):
    """
    This method checks whether the format of the file
    (with file handle fh_sprot) is in UniProtKB/Swissprot format.
    If the file is in UniProtKB/Swissprot format,
    it returns True.
    Otherwise, it returns False.
    """
    iter_handle = sp.parse(fh_sprot)  # sp.parse method returns a generator
    try:
        # Reading the first record is enough to trigger a parse error.
        for rec in iter_handle:
            break
    except Exception:
        return False
    else:
        return True

def check_benchmark_format(fh_benchmark):
    """
    This method checks the format of a benchmark file.
    It returns False if any line of the file is NOT in the
    correct 2-column (tab-separated) format.
    Otherwise, it returns True.
    """
    for line in fh_benchmark:
        cols = line.strip().split('\t')
        if len(cols) != 2:
            return False
    return True

if __name__ == '__main__':
    print(sys.argv[0] + ':')
    print(__doc__)
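
check_benchmark_format receives an already-open file handle, so path-level checks (an empty file name, a missing file, a zero-byte file) have to happen before it is called. The following is a minimal sketch of such a wrapper, not part of the original script: the name benchmark_file_ok and the sample file contents are hypothetical, and the 2-column rule is restated inline so that the example runs on its own without Biopython.

```python
import os
import tempfile


def benchmark_file_ok(path):
    """Hypothetical helper: path-level checks followed by the 2-column check."""
    # Reject an empty file name, a missing file, or a zero-byte file.
    if not path or not os.path.isfile(path) or os.path.getsize(path) == 0:
        return False
    with open(path) as fh:
        # Same rule as check_benchmark_format: every line must split
        # into exactly two tab-separated columns.
        return all(len(line.strip().split('\t')) == 2 for line in fh)


if __name__ == '__main__':
    with tempfile.TemporaryDirectory() as tmp:
        good = os.path.join(tmp, 'good.txt')
        with open(good, 'w') as fh:
            fh.write('protein_1\tGO:0005515\n')  # made-up sample row
        print(benchmark_file_ok(good))
        print(benchmark_file_ok(''))
```

Keeping the path checks in a separate wrapper leaves check_benchmark_format usable with any file-like object, such as an in-memory buffer.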