Inquiry Regarding PECAT Phasing and Assembly Interpretation #42

@HLHsieh

Hi,

Thank you for developing such a helpful tool. I’m planning to use it for local assembly of a repetitive region and would appreciate your insights on a couple of questions related to its usage and feasibility for my case.

  1. Does PECAT report the phasing result for each read in dual-mode assembly? Specifically, I would like to know whether it assigns each read to a specific haplotype during the assembly process, and if so, where that assignment is written.

  2. For my test, I simulated reads with lengths shorter than the known repetitive region I intend to assemble. Given this input, can PECAT reliably assemble the entire repetitive region? Below are the settings I used for assembly:

project=P8_dual
reads=./P8_extracted_region.fasta
genome_size=300000
threads=8
cleanup=1
grid=local
prep_min_length=3000
prep_output_coverage=50
corr_iterate_number=1
corr_block_size=4000000000
corr_correct_options=
corr_filter_options=--filter0=l=5000:al=2500:alr=0.5:aal=5000:oh=3000:ohr=0.3
corr_rd2rd_options=-x ava-ont
corr_output_coverage=80
align_block_size=4000000000
align_rd2rd_options=-X -g3000 -w30 -k19 -m100 -r500
align_filter_options=--filter0=l=5000:aal=6000:aalr=0.5:oh=3000:ohr=0.3 --task=extend --filter1=oh=300:ohr=0.03
asm1_assemble_options=
phase_method=
phase_rd2ctg_options=-x map-ont -c -p 0.5 -r 1000
phase_use_reads=1
phase_phase_options=
phase_clair3_command=singularity exec -B `pwd -P`:`pwd -P` -B /tmp:/tmp clair3_v0.1-r12.sif /opt/bin/run_clair3.sh
phase_clair3_use_reads=0
phase_clair3_options=--platform=ont --model_path=/opt/models/ont_guppy5/  --include_all_ctgs
phase_clair3_rd2ctg_options=-x map-ont -w10 -k19 -c -p 0.5 -r 1000
phase_clair3_phase_options=--coverage lc=30 --phase_options icr=0.1:icc=6:sc=10 --filter i=70
phase_clair3_filter_options=--threshold=2500 --rate 0.05
asm2_assemble_options=--contig_format dual,prialt
polish_map_options=-x map-ont
polish_filter_options=--filter0 oh=1000:ohr=0.1
polish_cns_options=
polish_medaka=
polish_medaka_command = singularity exec -B `pwd -P`:`pwd -P`  -B /tmp:/tmp medaka_v1.7.2.sif medaka
polish_medaka_map_options=-x map-ont
polish_medaka_cns_options=--model r941_prom_sup_g507
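
Since question 2 depends on how long the simulated reads are relative to the length thresholds in the settings above, a quick tally of the input can help frame it. The sketch below is only illustrative: it assumes that `prep_min_length=3000` and the `l=5000` field in `corr_filter_options` and `align_filter_options` act as minimum-length cutoffs, which should be checked against the PECAT documentation. The function names `fasta_lengths` and `count_passing` are hypothetical helpers, not part of PECAT.

```python
from typing import Dict, Iterable, Iterator, List, Tuple


def fasta_lengths(lines: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """Yield (read_name, sequence_length) for each record in a FASTA stream."""
    name = None
    length = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, length
            parts = line[1:].split()
            name = parts[0] if parts else ""
            length = 0
        else:
            length += len(line)
    if name is not None:
        yield name, length


def count_passing(lengths: List[int], cutoffs: List[int]) -> Dict[int, int]:
    """Count how many reads are at least as long as each cutoff."""
    return {c: sum(1 for n in lengths if n >= c) for c in cutoffs}


if __name__ == "__main__":
    # Path taken from the `reads=` setting above.
    with open("./P8_extracted_region.fasta") as handle:
        lengths = [n for _, n in fasta_lengths(handle)]
    # ASSUMPTION: 3000 and 5000 are treated as minimum-length cutoffs.
    for cutoff, kept in count_passing(lengths, [3000, 5000]).items():
        print(f">= {cutoff} bp: {kept} of {len(lengths)} reads")
```

If most reads fall below one of these values, that would be worth mentioning alongside the question, since it may matter more than the repeat length itself.
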
  3. I noticed that PECAT can be run without using Clair3 or Medaka. Could you clarify the differences in phasing performance between the built-in method and using Clair3? Additionally, how much improvement can be expected by enabling Medaka polishing (polish_medaka=1)?

  4. I would appreciate some clarification on how to interpret the file 5-assemble/haplotype_1_tiles. I am performing a local assembly of a specific region (~50 kb), and the assembly generates several contigs rather than one. Could you advise on how to read the entries below, and on what they imply about how the contigs relate to each other?

ctg7 edge=9364:B~5840:E
ctg7 edge=5840:E~978:B
ctg3 edge=2237:E~1756:E
ctg3 edge=1756:E~3895:E
ctg3 edge=3895:E~8776:B
ctg3 edge=8776:B~10451:B
ctg1 edge=11619:E~2988:B
ctg1 edge=2988:B~1562:E
ctg1 edge=1562:E~11796:B
ctg12 edge=8148:E~15022:B
ctg12 edge=15022:B~10968:B
ctg12 edge=10968:B~10451:B
ctg0 edge=12123:B~10881:E
ctg0 edge=10881:E~12073:B

Thank you in advance for your time and assistance. I look forward to your feedback.

Best regards,
Hsin
