Unable to find All-All or All-All #386

@HHan-ZJU

Thank you for providing such a useful tool. I keep getting the following errors when I run the juicer.sh script:

Picked up _JAVA_OPTIONS: -Xmx1024m -Xms1024m
Using 1 CPU thread(s) for primary task
Picked up _JAVA_OPTIONS: -Xmx1024m -Xms1024m
Using 1 CPU thread(s) for primary task
Picked up _JAVA_OPTIONS: -Xmx150000m -Xms150000m
Using 32 CPU thread(s) for primary task
Using 10 CPU thread(s) for secondary task
Unable to find All-All or All-All
Unable to find 1-AEMK02000504.1 or AEMK02000504.1-1
Unable to find 1-AEMK02000475.1 or AEMK02000475.1-1
Unable to find 1-AEMK02000418.1 or AEMK02000418.1-1
Unable to find 1-AEMK02000436.1 or AEMK02000436.1-1
Unable to find 1-AEMK02000488.1 or AEMK02000488.1-1
Unable to find 1-AEMK02000242.1 or AEMK02000242.1-1
Unable to find 1-AEMK02000530.1 or AEMK02000530.1-1
Unable to find 1-AEMK02000688.1 or AEMK02000688.1-1
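
Because these messages follow a fixed pattern, a long log can be condensed into a list of the reported pairs. The following is a minimal sketch, assuming every message has the shape "Unable to find A-B or B-A" as in the lines above; the function names are hypothetical and not part of Juicer.

```python
import re
from collections import Counter

# Assumed message shape, taken from the log lines above:
#   "Unable to find <A>-<B> or <B>-<A>"
PATTERN = re.compile(r"^Unable to find (\S+) or (\S+)$")


def split_pair(key, reversed_key):
    """Split 'A-B' into (A, B) by finding the hyphen at which 'B-A' equals reversed_key."""
    for i, ch in enumerate(key):
        if ch == "-" and key[i + 1:] + "-" + key[:i] == reversed_key:
            return key[:i], key[i + 1:]
    return None  # the two keys are not mirror images of each other


def missing_pairs(log_lines):
    """Return the (A, B) pairs reported as not found, in log order."""
    pairs = []
    for line in log_lines:
        match = PATTERN.match(line.strip())
        if match:
            pair = split_pair(match.group(1), match.group(2))
            if pair is not None:
                pairs.append(pair)
    return pairs


def summarize(pairs):
    """Count in how many missing pairs each sequence name appears."""
    counts = Counter()
    for a, b in pairs:
        counts.update({a, b})  # a set, so a self-pair such as All-All counts once
    return counts
```

Applied to the lines of the run log (S1_LDM_juicer.log in the command further down), `summarize(missing_pairs(lines)).most_common(20)` would show which sequence names dominate the messages, for example whether they are mostly unplaced scaffolds.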

Meanwhile, thousands of files named inter.hic_1-1, inter.hic_1-1_1, etc., appear in the ./aligned folder. Most of them are 0 KB, but some contain data. This step usually takes 2-3 days. I know that it corresponds to the command

time ${juiceDir}/scripts/common/juicer_tools pre -n -s $outputdir/inter.txt -g $outputdir/inter_hists.m $fragstr $resstr $threadHicString $outputdir/merged1.txt $outputdir/inter.hic $genomePath

and I expected it to eventually generate the inter.hic and inter_30.hic files, but it did not. My ./aligned folder contains inter.txt, inter_hists.m, merged1.txt, and merged1_index.txt, but I don't know why the .hic files are not generated.

After a long run of "Unable to find..." messages, the log file ends as follows:

Unable to find AEMK02000417.1-AEMK02000654.1 or AEMK02000654.1-AEMK02000417.1
Unable to find AEMK02000323.1-AEMK02000519.1 or AEMK02000519.1-AEMK02000323.1
Unable to find AEMK02000654.1-AEMK02000519.1 or AEMK02000519.1-AEMK02000654.1
Unable to find AEMK02000417.1-AEMK02000519.1 or AEMK02000519.1-AEMK02000417.1
Not including fragment map
Start preprocess
Writing header
Writing body
........................................................................................................................................................
Writing footer
nBytesV5: 21512946
masterIndexPosition: 8905665690

Finished preprocess
Done creating .hic file. Normalization not calculated due to -n flag.
To run normalization, run: java -jar juicer_tools.jar addNorm

real 2555m17.185s
user 1575m12.996s
sys 698m47.528s
Picked up _JAVA_OPTIONS: -Xmx150000m -Xms150000m
Using 64 CPU thread(s) for primary task
Error reading datasetnull
java.io.EOFException
at htsjdk.tribble.util.LittleEndianInputStream.readString(LittleEndianInputStream.java:119)
at juicebox.data.DatasetReaderV2.read(DatasetReaderV2.java:95)
at juicebox.tools.utils.norm.NormalizationVectorUpdater.updateHicFile(NormalizationVectorUpdater.java:137)
at juicebox.tools.clt.old.AddNorm.launch(AddNorm.java:83)
at juicebox.tools.clt.old.AddNorm.run(AddNorm.java:137)
at juicebox.tools.HiCTools.main(HiCTools.java:97)

real 0m1.434s
user 0m1.004s
sys 0m0.332s
Thu Nov 6 13:56:52 CST 2025
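
Since addNorm fails with an EOFException while reading the dataset, one check that could narrow things down is whether a given .hic file has a readable header and whether the master index position it records lies inside the file. The following is a minimal sketch, assuming the published .hic layout in which the file starts with the null-terminated magic string "HIC", followed by a little-endian 32-bit version and a 64-bit master index position; the function names are hypothetical.

```python
import io
import struct


def read_hic_header(stream: io.BufferedIOBase):
    """Read (version, master_index_position) from the start of a .hic stream.

    Assumes the layout: b"HIC\\0", int32 version, int64 master index position,
    all little-endian.
    """
    magic = stream.read(4)
    if magic != b"HIC\x00":
        raise ValueError("not a .hic header (file may be empty or truncated)")
    raw = stream.read(12)
    if len(raw) < 12:
        raise EOFError("header ends before the master index position")
    version, master_index_position = struct.unpack("<iq", raw)
    return version, master_index_position


def looks_truncated(master_index_position, file_size):
    """True if the recorded master index cannot lie inside a file of this size."""
    return master_index_position <= 0 or master_index_position >= file_size
```

For a file on disk, the stream would come from `open(path, "rb")` and the size from `os.path.getsize(path)`. A 0-byte file raises the ValueError, and a master index position beyond the end of the file suggests the file was cut off or deleted before addNorm ran.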

I asked GPT-5 why this error occurs, and it suggested a chromosome naming mismatch, but I have confirmed that the chromosome names are consistent across all my files. In my first run, the .fa file header was "1 dna:primary_assembly primary_assembly:Sscrofa11.1:1:1:274330532:1 REF". Even after I trimmed the header to just the number 1, rerunning juicer.sh produced the same error.
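
A consistency check of this kind can be scripted. The following is a minimal sketch, assuming the sequence name is the first whitespace-delimited token of each FASTA header and the first column of the chrom.sizes file; the function names are hypothetical.

```python
def fasta_names(lines):
    """Return the first token of every '>' header line, in file order."""
    names = []
    for line in lines:
        if line.startswith(">"):
            tokens = line[1:].split()
            if tokens:
                names.append(tokens[0])
    return names


def chrom_sizes_names(lines):
    """Return the first column of every non-empty chrom.sizes line."""
    names = []
    for line in lines:
        fields = line.split()
        if fields:
            names.append(fields[0])
    return names


def compare_names(fasta, sizes):
    """Return (names only in the FASTA, names only in chrom.sizes), each sorted."""
    fasta_set, sizes_set = set(fasta), set(sizes)
    return sorted(fasta_set - sizes_set), sorted(sizes_set - fasta_set)
```

Passing the open reference .fa and chrom.sizes files to these functions yields two lists; both being empty means the two files use the same set of names.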

The command I used to run juicer.sh and the related input files are as follows:
nohup bash ~/HiC/juicer/CPU/juicer.sh -D ~/HiC -d ~/HiC/S1_LDM -z ~/HiC/references/Sus_scrofa.Sscrofa11.1.dna.toplevel.fa -y ~/HiC/restriction_sites/Sscrofa11.1_DpnII.txt -p ~/HiC/restriction_sites/Sscrofa11.1.chrom.sizes -s DpnII -t 64 -T 32 --cleanup > S1_LDM_juicer.log 2>&1 &

The juicer code was obtained directly via git clone; juicer.sh displays "# Juicer version 2.0". The versions of other software are:
samtools: Version: 1.22.1 (using htslib 1.22.1)
bwa: version: 0.7.19-r1273
java: openjdk version "1.8.0_412", OpenJDK Runtime Environment (Zulu 8.78.0.19-CA-linux64) (build 1.8.0_412-b08), OpenJDK 64-Bit Server VM (Zulu 8.78.0.19-CA-linux64) (build 25.412-b08, mixed mode)
juicer_tools.2.20.00.jar

The pig reference genome file I used is as follows:

>1 dna:primary_assembly primary_assembly:Sscrofa11.1:1:1:274330532:1 REF
GCTTAATTTTTGTCATTTCTCACCCCTGCTCTTGAGAGCTTTTGTTGATAATGTTGTTAT
TGCTTTCATTCTGCTTTTATTTTGTAAGCCCTGCACTCATTCATCGCTGTACCCGAATA......

The contents of the Sscrofa11.1.chrom.sizes file are as follows:

1 274330532
2 151935994
3 132848913
4 130910915
5 104526007
6 170843587
7 121844099
8 138966237
9 139512083
10 69359453
11 79169978
12 61602749
13 208334590
14 141755446
15 140412725
16 79944280
17 63494081
18 55982971
X 125939595
Y 43547828
MT 16613
AEMK02000452.1 3843259
AEMK02000698.1 2641821
AEMK02000361.1 2125600
AEMK02000598.1 2117578
......

The contents of the inter.txt file are as follows:

Experiment description: Sample name HiC_sample; Juicer version 2.0; BWA 0.7.19-r1273; 64 threads; openjdk version "1.8.0_412"; Juicer Tools Version 2.20.00; ~/HiC/juicer/CPU/juicer.sh -D ~/HiC -d ~/HiC/S1_LDM -z ~/HiC/references/Sus_scrofa.Sscrofa11.1.dna.toplevel.fa -y ~/HiC/restriction_sites/Sscrofa11.1_DpnII.txt -p ~/HiC/restriction_sites/Sscrofa11.1.chrom.sizes -s DpnII -t 64 -T 32 --cleanup
Read type: Paired End
Sequenced Read Pairs: 973313542
No chimera found: 18598281 (1.91%)
One or both reads unmapped: 18598281 (1.91%)
2 alignments: 874775550 (89.88%)
2 alignments (A...B): 573405368 (58.91%)
2 alignments (A1...A2B; A1B2...B1A2): 301370182 (30.96%)
3 or more alignments: 79939711 (8.21%)
Ligation Motif Present: 484148292 (49.74%)
Average insert size: 471.46
Total Unique: 556029602 (63.56%, 57.13%)
Total Duplicates: 318745948 (36.44%, 32.75%)
Library Complexity Estimate*: 886,486,733
Intra-fragment Reads: 4,616,340 (0.47% / 0.83%)
Below MAPQ Threshold: 61,892,191 (6.36% / 11.13%)
Hi-C Contacts: 489,521,071 (50.29% / 88.04%)
3' Bias (Long Range): 95% - 5%
Pair Type %(L-I-O-R): 25% - 25% - 25% - 25%
L-I-O-R Convergence: 1874
Inter-chromosomal: 138,937,092 (14.27% / 24.99%)
Intra-chromosomal: 350,583,979 (36.02% / 63.05%)
Short Range (<20Kb):
<500BP: 84,437,326 (8.68% / 15.19%)
500BP-5kB: 36,299,387 (3.73% / 6.53%)
5kB-20kB: 24,036,088 (2.47% / 4.32%)
Long Range (>20Kb): 205,811,178 (21.15% / 37.01%)

The first few lines of the merged1.txt file contain the following:

16 10 51759 143 16 10 57364 162
0 10 57180 160 16 10 57381 162
0 10 57484 161 16 10 57394 162
0 10 57513 161 0 10 176158 483
0 10 57513 161 0 10 489621 1372
0 10 57468 161 0 10 11830688 33938
0 10 57512 161 16 10 14304906 42557
0 10 57489 161 0 10 17001633 51302
0 10 57551 161 0 10 46835758 153718
0 10 57483 161 16 10 61971604 200195
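
To see which chromosome pairs actually carry contacts in merged1.txt, the lines can be tallied per pair. The following is a minimal sketch, assuming each line holds eight whitespace-separated fields in the order strand1, chr1, pos1, frag1, strand2, chr2, pos2, frag2, which matches the lines above; the function names are hypothetical.

```python
from collections import Counter


def pair_counts(lines):
    """Count contacts per (chr1, chr2) pair; lines with fewer than 8 fields are skipped."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        if len(fields) < 8:
            continue
        counts[(fields[1], fields[5])] += 1
    return counts


def names_not_in(counts, known_names):
    """Return sorted chromosome names used in the contacts but absent from known_names."""
    used = {name for pair in counts for name in pair}
    return sorted(used - set(known_names))
```

With the first column of Sscrofa11.1.chrom.sizes as `known_names`, a non-empty result from `names_not_in` would point to a name that the contact file uses but the sizes file lacks.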

This problem has been bothering me for a long time, and I can't find a solution at all. I would be extremely grateful if someone could provide a solution. Please also let me know if you need any other information to help resolve the issue.
