forked from saleslab/ParsingLabSolutionsASCIIOutput
ASCIItoTable.py
'''
*ASCIItoTable*
This script is a Python translation of the "TableGenerator" module
originally written in Excel's Visual Basic for Applications (VBA) IDE.
The general purpose is to take as input one or more ASCII data results
files generated from a Shimadzu HPLC via LabSolutions software and parse
them into a tabular, more readable format.
The output is a single, tab-delimited text file with one row per sample,
containing the sample name, sample identifier, and concentration of
each elutant found.
The original code was written, developed, and tested in an Excel 2010
version of VBA on a Windows PC. It utilized the spreadsheet style of the
program as a means of manipulating the data before producing the final
product in .xlsx format. Due to regular software updates and the
requirement of manual reference enabling, it was preferred to have the
process written in an OS-independent, compact format such as Python.
Written by Tommy Thompson
'''
### Initializations ###
## Libraries
from os import listdir
from pathlib import Path
## Key Labels
str1 = "Sample Name"
str2 = "Sample ID"
str3 = "ID#"
## Output Heading
header = "SampleName\tSampleID\t"
elutList = []
## Result Containers
sNameList = []
sIDList = []
concDict = {}
## Iterator
sampNum = 0
## I/O Locations
sourceInput = input("Enter the path of the folder containing the files: ")
resultInput = input("Enter the path of where the output will be saved: ")
sourcePath = Path(sourceInput) #make the path OS-agnostic
resultPath = Path(resultInput) / "results.txt" #append results file name to path
### Reading the Data ###
## File Loop
# Read each file in folder given at path
for ascFile in listdir(sourcePath):
    path = sourcePath / ascFile #append file name to complete path
    if not path.is_file(): #skip subfolders
        continue
    # Open and read text file
    with open(path, "r") as rawMat:
        ## Line Loop
        # Read each line one at a time
        for line in rawMat:
            # Check for any key labels
            if str1 in line: #if "Sample Name" is found...
                sName = line.split("\t") #...list tab-delimited contents
                sNameList.append(sName[1].rstrip()) #add sample name to container w/o end characters
            elif str2 in line: #if "Sample ID" is found...
                sID = line.split("\t")
                sIDList.append(sID[1].rstrip()) #add sample ID to container w/o end characters
            elif line[0:3] == str3: #if "ID#" is found at beginning of line...
                line = next(rawMat, "") #...move to next line ("" if the file ends here)
                ## Elutant While
                # Continue reading until a blank line or the end of the file
                while line.strip():
                    sLine = line.split("\t") #list tab-delimited contents
                    eName = sLine[1].rstrip() #elutant name w/o end characters
                    # Check if new or not
                    if eName not in elutList: #if elutant has not already been seen before...
                        elutList.append(eName) #...add it to list
                        concDict[eName] = [""]*sampNum #establish new key in dictionary;
                                                       #fill previous samples with empty strings
                    # Take the concentration found for this elutant and add it to its key's list of values
                    concDict[eName].append(sLine[5].rstrip())
                    line = next(rawMat, "") #move to next line to continue loop
    ## Equalize Keys
    # Runs once per file (one sample per file is assumed).
    # Every elutant's list should now hold #samps + 1 values
    for e in concDict:
        if len(concDict[e]) != sampNum+1: #known elutant wasn't found in this file...
            concDict[e].append("") #...so add an empty string in its place
    sampNum += 1
### Writing the Data ###
with open(resultPath, "w") as finProd:
    ## Heading
    # Write the fixed columns followed by each elutant found, tab-delimited
    finProd.write(header + "\t".join(elutList) + "\n")
    ## Data Loops
    # Create a results line for each sample
    for i, samp in enumerate(sNameList):
        resLine = [samp, sIDList[i]] #combine current sample name and sample ID
        # Step through each elutant in the same order as the header
        for elut in elutList:
            resLine.append(concDict[elut][i]) #append concentration value to results line
        # Delimit each line's contents by a tab and terminate with a newline
        resLine = "\t".join(resLine) + "\n"
        # Write the line to the file
        finProd.write(resLine)
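
The parsing and column-alignment logic above can be tried without any instrument files. The sketch below is not part of the original script: `parse_sample`, `build_table`, `FILE_A`, and `FILE_B` are hypothetical names, and the sample text is invented for illustration. It assumes, as the script does, that the elutant name sits in field 1 and the concentration in field 5 of each row under the `ID#` header. It keeps one dictionary per sample rather than the padded lists used above, which is a different way of reaching the same table.

```python
import io


def parse_sample(text, name_col=1, conc_col=5):
    """Return (sample_name, sample_id, {elutant: concentration}) for one file's text."""
    name, sample_id, concs = "", "", {}
    lines = iter(io.StringIO(text))
    for line in lines:
        if "Sample Name" in line:
            name = line.split("\t")[1].rstrip()
        elif "Sample ID" in line:
            sample_id = line.split("\t")[1].rstrip()
        elif line.startswith("ID#"):
            # Data rows follow the header until a blank line or end of input.
            for row in lines:
                if not row.strip():
                    break
                fields = row.rstrip("\n").split("\t")
                concs[fields[name_col].strip()] = fields[conc_col].strip()
    return name, sample_id, concs


def build_table(samples):
    """Merge per-sample results into rows, using "" where an elutant is absent."""
    elutants = []
    for _, _, concs in samples:
        for e in concs:
            if e not in elutants:
                elutants.append(e)  # keep first-seen order for the header
    rows = [["SampleName", "SampleID"] + elutants]
    for name, sample_id, concs in samples:
        rows.append([name, sample_id] + [concs.get(e, "") for e in elutants])
    return rows


# Invented stand-ins for two exported files (column headers are assumptions).
FILE_A = (
    "Sample Name\tStd1\n"
    "Sample ID\tA001\n"
    "\n"
    "ID#\tName\tRet.Time\tArea\tHeight\tConc.\n"
    "1\tGlucose\t5.1\t1000\t50\t2.50\n"
    "2\tFructose\t6.3\t800\t40\t1.25\n"
    "\n"
)
FILE_B = (
    "Sample Name\tStd2\n"
    "Sample ID\tA002\n"
    "\n"
    "ID#\tName\tRet.Time\tArea\tHeight\tConc.\n"
    "1\tFructose\t6.3\t900\t45\t1.40\n"
)

table = build_table([parse_sample(FILE_A), parse_sample(FILE_B)])
for row in table:
    print("\t".join(row))
```

The second sample has no Glucose row, so its Glucose cell is left empty, which mirrors what the "Equalize Keys" step does with empty strings.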