Hi Nicola,
I've just been trying to run your handy phastSim method on several different sequences. It seems to fail on a few -- for example on the genome AE001825:
phastSim --outpath ./ --outputFile AE001825.1.phastsim-001.temp --reference AE001825.1.fasta --treeFile AE001825.1.phastsim-001.phastsimtree --createFasta --hyperMutProbs 0.01 0.04 --hyperMutRates 100 10 --indels --insertionRate GAMMA 0.1 1.0 --deletionRate CONSTANT 0.1 --insertionLength GEOMETRIC 0.9 --deletionLength NEGBINOMIAL 2 0.95
With a simple tree file (AE001825.1.phastsim-001.phastsimtree):
((phastsim0:0.1,phastsim1:0.1,phastsim2:0.1,phastsim3:0.1,phastsim4:0.1,phastsim5:0.1,phastsim6:0.1,phastsim7:0.1,phastsim8:0.1,phastsim10:0.1):0.001);
The AE001825 sequence does have some non-ACGT characters (R and K), which I presume is the issue for phastSim? Should I randomly select A/G for R and G/T for K?
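A minimal sketch of that workaround, assuming the ambiguity codes really are what trips phastSim up (not confirmed here). It is a standalone preprocessing step, not part of phastSim; the function names and the output filename are made up for illustration. Each IUPAC ambiguity code is replaced by a base drawn uniformly at random from the set it stands for, and header lines are copied through unchanged:

```python
import random

# Standard IUPAC nucleotide ambiguity codes and the bases each can represent.
IUPAC = {
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def resolve_ambiguous(seq, rng):
    """Replace each ambiguity code with one uniformly chosen compatible base.

    The sequence is upper-cased first, so soft-masking (lower case) is lost.
    Characters that are not ambiguity codes are left untouched.
    """
    return "".join(
        rng.choice(IUPAC[base]) if base in IUPAC else base
        for base in seq.upper()
    )


def resolve_fasta(in_path, out_path, seed=0):
    """Copy a FASTA file, resolving ambiguity codes on the sequence lines."""
    rng = random.Random(seed)  # fixed seed so the cleaned reference is reproducible
    with open(in_path) as src, open(out_path, "w") as dst:
        for line in src:
            if line.startswith(">"):
                dst.write(line)  # header line: keep as-is
            else:
                dst.write(resolve_ambiguous(line.rstrip("\n"), rng) + "\n")


# Hypothetical usage (the output filename is an assumption):
# resolve_fasta("AE001825.1.fasta", "AE001825.1.resolved.fasta")
```

The resolved file would then be passed to --reference in place of the original. Fixing the seed means the same bases are chosen on every run, which keeps simulations comparable.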
Best wishes,
Paul.