Description
Thank you for providing such a useful tool. I keep getting the following errors when I run the juicer.sh script:
Picked up _JAVA_OPTIONS: -Xmx1024m -Xms1024m
Using 1 CPU thread(s) for primary task
Picked up _JAVA_OPTIONS: -Xmx1024m -Xms1024m
Using 1 CPU thread(s) for primary task
Picked up _JAVA_OPTIONS: -Xmx150000m -Xms150000m
Using 32 CPU thread(s) for primary task
Using 10 CPU thread(s) for secondary task
Unable to find All-All or All-All
Unable to find 1-AEMK02000504.1 or AEMK02000504.1-1
Unable to find 1-AEMK02000475.1 or AEMK02000475.1-1
Unable to find 1-AEMK02000418.1 or AEMK02000418.1-1
Unable to find 1-AEMK02000436.1 or AEMK02000436.1-1
Unable to find 1-AEMK02000488.1 or AEMK02000488.1-1
Unable to find 1-AEMK02000242.1 or AEMK02000242.1-1
Unable to find 1-AEMK02000530.1 or AEMK02000530.1-1
Unable to find 1-AEMK02000688.1 or AEMK02000688.1-1
Meanwhile, thousands of files named inter.hic_1-1, inter.hic_1-1_1, etc., appear in the ./aligned folder. Most of them are 0 KB, but some have content. This step usually takes 2-3 days. I know it corresponds to the command time ${juiceDir}/scripts/common/juicer_tools pre -n -s $outputdir/inter.txt -g $outputdir/inter_hists.m $fragstr $resstr $threadHicString $outputdir/merged1.txt $outputdir/inter.hic $genomePath, and I expected it to eventually produce inter.hic and inter_30.hic, but it did not. The ./aligned folder does contain inter.txt, inter_hists.m, merged1.txt, and merged1_index.txt, so I don't understand why the .hic files cannot be generated.
After a long run of "Unable to find..." messages, the log file ends as follows:
Unable to find AEMK02000417.1-AEMK02000654.1 or AEMK02000654.1-AEMK02000417.1
Unable to find AEMK02000323.1-AEMK02000519.1 or AEMK02000519.1-AEMK02000323.1
Unable to find AEMK02000654.1-AEMK02000519.1 or AEMK02000519.1-AEMK02000654.1
Unable to find AEMK02000417.1-AEMK02000519.1 or AEMK02000519.1-AEMK02000417.1
Not including fragment map
Start preprocess
Writing header
Writing body
........................................................................................................................................................
Writing footer
nBytesV5: 21512946
masterIndexPosition: 8905665690
Finished preprocess
Done creating .hic file. Normalization not calculated due to -n flag.
To run normalization, run: java -jar juicer_tools.jar addNorm
real 2555m17.185s
user 1575m12.996s
sys 698m47.528s
Picked up _JAVA_OPTIONS: -Xmx150000m -Xms150000m
Using 64 CPU thread(s) for primary task
Error reading datasetnull
java.io.EOFException
at htsjdk.tribble.util.LittleEndianInputStream.readString(LittleEndianInputStream.java:119)
at juicebox.data.DatasetReaderV2.read(DatasetReaderV2.java:95)
at juicebox.tools.utils.norm.NormalizationVectorUpdater.updateHicFile(NormalizationVectorUpdater.java:137)
at juicebox.tools.clt.old.AddNorm.launch(AddNorm.java:83)
at juicebox.tools.clt.old.AddNorm.run(AddNorm.java:137)
at juicebox.tools.HiCTools.main(HiCTools.java:97)
real 0m1.434s
user 0m1.004s
sys 0m0.332s
Thu Nov 6 13:56:52 CST 2025
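The stack trace shows the EOFException being raised while AddNorm reads the dataset through DatasetReaderV2.read, so one thing worth checking is whether the file it opened is empty or truncated. Below is a minimal sketch of such a check. It assumes the .hic file starts with a null-terminated magic string "HIC", followed by a little-endian 32-bit version and a 64-bit master index position. The function names `read_hic_header` and `looks_complete` are hypothetical helpers, not part of juicer_tools.

```python
import os
import struct


def read_hic_header(path):
    """Return (magic, version, master_index_position, file_size) for a .hic file.

    Raises EOFError if the file ends before the header is complete.
    Assumed layout: null-terminated magic string, int32 version, int64 offset.
    """
    file_size = os.path.getsize(path)
    with open(path, "rb") as handle:
        # The magic string is null-terminated, so read it byte by byte.
        magic = bytearray()
        while True:
            byte = handle.read(1)
            if not byte:
                raise EOFError("file ended inside the magic string")
            if byte == b"\x00":
                break
            magic.extend(byte)
        rest = handle.read(12)
        if len(rest) < 12:
            raise EOFError("file ended inside the version/master index fields")
        # "<iq" = little-endian int32 followed by int64, with no padding.
        version, master_index = struct.unpack("<iq", rest)
    return magic.decode("ascii", "replace"), version, master_index, file_size


def looks_complete(path):
    """True if the header parses and the master index lies inside the file."""
    try:
        magic, _version, master_index, file_size = read_hic_header(path)
    except EOFError:
        return False
    return magic == "HIC" and 0 < master_index < file_size
```

Calling `looks_complete("aligned/inter.hic")` (path assumed) on an empty or truncated file would return False. For a file that parses, the master index position returned by `read_hic_header` can be compared with the masterIndexPosition value printed in the log.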
I asked GPT-5 why this error occurs, and it suggested a chromosome naming mismatch, but I have confirmed that the chromosome names are consistent across all my files. On the first run, the .fa file header was "1 dna:primary_assembly primary_assembly:Sscrofa11.1:1:1:274330532:1 REF". Even after I trimmed the header to just the number 1, rerunning juicer.sh produced the same error.
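For reference, the header trimming described above can be sketched as follows. This is only an illustration of the idea, assuming the sequence name is the first whitespace-delimited token of each header line. `trim_fasta_headers` is a hypothetical helper name.

```python
def trim_fasta_headers(lines):
    """Yield FASTA lines with each header cut down to its first token."""
    for line in lines:
        if line.startswith(">"):
            # Keep only the sequence name before the first whitespace.
            fields = line[1:].split()
            name = fields[0] if fields else ""
            yield ">" + name + "\n"
        else:
            # Sequence lines pass through unchanged.
            yield line
```

It can be applied by streaming the original file into a new one, for example `out.writelines(trim_fasta_headers(src))` with both files opened in text mode.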
The command I used to run juicer.sh and the related input files are as follows:
nohup bash ~/HiC/juicer/CPU/juicer.sh -D ~/HiC -d ~/HiC/S1_LDM -z ~/HiC/references/Sus_scrofa.Sscrofa11.1.dna.toplevel.fa -y ~/HiC/restriction_sites/Sscrofa11.1_DpnII.txt -p ~/HiC/restriction_sites/Sscrofa11.1.chrom.sizes -s DpnII -t 64 -T 32 --cleanup > S1_LDM_juicer.log 2>&1 &
The juicer code was obtained directly via git clone; juicer.sh displays "# Juicer version 2.0". The versions of other software are:
samtools: Version: 1.22.1 (using htslib 1.22.1)
bwa: version: 0.7.19-r1273
java: openjdk version "1.8.0_412", OpenJDK Runtime Environment (Zulu 8.78.0.19-CA-linux64) (build 1.8.0_412-b08), OpenJDK 64-Bit Server VM (Zulu 8.78.0.19-CA-linux64) (build 25.412-b08, mixed mode)
juicer_tools.2.20.00.jar
The pig reference genome file I used is as follows:
>1 dna:primary_assembly primary_assembly:Sscrofa11.1:1:1:274330532:1 REF
GCTTAATTTTTGTCATTTCTCACCCCTGCTCTTGAGAGCTTTTGTTGATAATGTTGTTAT
TGCTTTCATTCTGCTTTTATTTTGTAAGCCCTGCACTCATTCATCGCTGTACCCGAATA......
The contents of the Sscrofa11.1.chrom.sizes file are as follows:
1 274330532
2 151935994
3 132848913
4 130910915
5 104526007
6 170843587
7 121844099
8 138966237
9 139512083
10 69359453
11 79169978
12 61602749
13 208334590
14 141755446
15 140412725
16 79944280
17 63494081
18 55982971
X 125939595
Y 43547828
MT 16613
AEMK02000452.1 3843259
AEMK02000698.1 2641821
AEMK02000361.1 2125600
AEMK02000598.1 2117578
......
The contents of the inter.txt file are as follows:
Experiment description: Sample name HiC_sample; Juicer version 2.0; BWA 0.7.19-r1273; 64 threads; openjdk version "1.8.0_412"; Juicer Tools Version 2.20.00; ~/HiC/juicer/CPU/juicer.sh -D ~/HiC -d ~/HiC/S1_LDM -z ~/HiC/references/Sus_scrofa.Sscrofa11.1.dna.toplevel.fa -y ~/HiC/restriction_sites/Sscrofa11.1_DpnII.txt -p ~/HiC/restriction_sites/Sscrofa11.1.chrom.sizes -s DpnII -t 64 -T 32 --cleanup
Read type: Paired End
Sequenced Read Pairs: 973313542
No chimera found: 18598281 (1.91%)
One or both reads unmapped: 18598281 (1.91%)
2 alignments: 874775550 (89.88%)
2 alignments (A...B): 573405368 (58.91%)
2 alignments (A1...A2B; A1B2...B1A2): 301370182 (30.96%)
3 or more alignments: 79939711 (8.21%)
Ligation Motif Present: 484148292 (49.74%)
Average insert size: 471.46
Total Unique: 556029602 (63.56%, 57.13%)
Total Duplicates: 318745948 (36.44%, 32.75%)
Library Complexity Estimate*: 886,486,733
Intra-fragment Reads: 4,616,340 (0.47% / 0.83%)
Below MAPQ Threshold: 61,892,191 (6.36% / 11.13%)
Hi-C Contacts: 489,521,071 (50.29% / 88.04%)
3' Bias (Long Range): 95% - 5%
Pair Type %(L-I-O-R): 25% - 25% - 25% - 25%
L-I-O-R Convergence: 1874
Inter-chromosomal: 138,937,092 (14.27% / 24.99%)
Intra-chromosomal: 350,583,979 (36.02% / 63.05%)
Short Range (<20Kb):
<500BP: 84,437,326 (8.68% / 15.19%)
500BP-5kB: 36,299,387 (3.73% / 6.53%)
5kB-20kB: 24,036,088 (2.47% / 4.32%)
Long Range (>20Kb): 205,811,178 (21.15% / 37.01%)
The first few lines of the merged1.txt file are:
16 10 51759 143 16 10 57364 162
0 10 57180 160 16 10 57381 162
0 10 57484 161 16 10 57394 162
0 10 57513 161 0 10 176158 483
0 10 57513 161 0 10 489621 1372
0 10 57468 161 0 10 11830688 33938
0 10 57512 161 16 10 14304906 42557
0 10 57489 161 0 10 17001633 51302
0 10 57551 161 0 10 46835758 153718
0 10 57483 161 16 10 61971604 200195
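The name consistency between merged1.txt and the chrom.sizes file can be checked with a short script. The sketch below assumes, based on the lines shown above, that columns 2 and 6 of merged1.txt hold the chromosome names and that the first column of chrom.sizes holds the name. `chrom_sizes_names` and `unknown_names` are hypothetical helper names.

```python
def chrom_sizes_names(lines):
    """Collect the first column of each non-empty chrom.sizes line."""
    return {line.split()[0] for line in lines if line.strip()}


def unknown_names(merged_lines, known, limit=None):
    """Return names in columns 2 and 6 of merged1.txt that are not in `known`.

    `limit` optionally caps the number of lines scanned; note that a sorted
    file will only show a few chromosomes in its first lines.
    """
    missing = set()
    for index, line in enumerate(merged_lines):
        if limit is not None and index >= limit:
            break
        fields = line.split()
        if len(fields) < 6:
            # Skip lines that do not have both chromosome columns.
            continue
        for name in (fields[1], fields[5]):
            if name not in known:
                missing.add(name)
    return missing
```

An empty result from `unknown_names(open("merged1.txt"), chrom_sizes_names(open("Sscrofa11.1.chrom.sizes")))` would indicate that every chromosome name in the contact file also appears in the chrom.sizes file.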
This problem has been bothering me for a long time, and I can't find a solution at all. I would be extremely grateful if someone could provide a solution. Please also let me know if you need any other information to help resolve the issue.