Error after Performing machine learning classification #32

@Aciole-David

Hello!
I'm testing vRhyme and got stuck after the 'Performing machine learning classification' step.

Running on a Slurm HPC system
Fresh mamba install
Inputs:
a) Single-end NextSeq reads
b) VirSorter output sequences from MEGAHIT contigs

Slurm log below:


/home/hpc_scientist/miniforge3/envs/vrhyme_env/bin/vRhyme:16: DeprecationWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html
  import pkg_resources
Command: /home/hpc_scientist/miniforge3/envs/vrhyme_env/bin/vRhyme
-i final-viral-combined.fa
-u putativeVLP_data_106.fastq putativeVLP_data_76.fastq putativeVLP_data_77.fastq putativeVLP_data_78.fastq putativeVLP_data_79.fastq putativeVLP_data_80.fastq putativeVLP_data_81.fastq putativeVLP_data_82.fastq putativeVLP_data_83.fastq putativeVLP_data_85.fastq putativeVLP_data_86.fastq putativeVLP_data_87.fastq putativeVLP_data_88.fastq putativeVLP_data_89.fastq
-t 20
-o vrhyme_out
--method longest
--verbose

Date: 2024-03-12 (y-m-d)
Start: 11:30:34 (h:m:s)
Program: vRhyme v1.1.0

Time (min) | Log

0.0 Initializing and validating vRhyme parameters
0.11 Running 'longest' dereplication: 97% identity and 70% coverage
0.69 No sequences were of sufficient similarity to dereplicate
0.69 Single end read file(s) identified. Running bowtie2 on 14 unpaired file(s)
3.43 Extracting coverage information from BAM files
3.57 Coverage extraction complete. Generating coverage table
3.57 Performing pairwise coverage comparisons
3.58 Running Prodigal on filtered sequences
3.64 Generating codon usage features
3.64 Generating nucleotide features
3.67 Performing pairwise distance calculations
3.67 Performing machine learning classification
Traceback (most recent call last):
  File "/home/hpc_scientist/miniforge3/envs/vrhyme_env/bin/vRhyme", line 960, in <module>
    net_data = machine_stuff.machine_stuff(distances, presets, model_method, pairs_machine, cohen_machine, iterations, cohen_check)
  File "/home/hpc_scientist/miniforge3/envs/vrhyme_env/bin/machine_stuff.py", line 73, in machine_stuff
    model_ET = pickle.load(read_model_ET)
  File "sklearn/tree/_tree.pyx", line 865, in sklearn.tree._tree.Tree.__setstate__
  File "sklearn/tree/_tree.pyx", line 1571, in sklearn.tree._tree._check_node_ndarray
ValueError: node array from the pickle has an incompatible dtype:
- expected: {'names': ['left_child', 'right_child', 'feature', 'threshold', 'impurity', 'n_node_samples', 'weighted_n_node_samples', 'missing_go_to_left'], 'formats': ['<i8', '<i8', '<i8', '<f8', '<f8', '<i8', '<f8', 'u1'], 'offsets': [0, 8, 16, 24, 32, 40, 48, 56], 'itemsize': 64}
- got     : [('left_child', '<i8'), ('right_child', '<i8'), ('feature', '<i8'), ('threshold', '<f8'), ('impurity', '<f8'), ('n_node_samples', '<i8'), ('weighted_n_node_samples', '<f8')]
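
In case it helps with triage, the two dtype descriptions in the error differ by a single field. The sketch below compares the field names copied from the message and reports the scikit-learn version installed in the active environment. It rests on an assumption that is not confirmed against the vRhyme source: that the bundled model was pickled with an older scikit-learn than the one mamba installed, which is the usual cause of this kind of mismatch. The helper names `missing_fields` and `installed_version` are made up for illustration.

```python
from importlib.metadata import PackageNotFoundError, version

# Field names copied from the ValueError in the log above.
EXPECTED_FIELDS = [
    "left_child", "right_child", "feature", "threshold", "impurity",
    "n_node_samples", "weighted_n_node_samples", "missing_go_to_left",
]
PICKLED_FIELDS = [
    "left_child", "right_child", "feature", "threshold", "impurity",
    "n_node_samples", "weighted_n_node_samples",
]


def missing_fields(expected, got):
    """Return fields the installed library expects but the pickle lacks, in order."""
    return [name for name in expected if name not in got]


def installed_version(package):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("Fields missing from the pickled model:",
          missing_fields(EXPECTED_FIELDS, PICKLED_FIELDS))
    # Run this inside the vRhyme environment (here: vrhyme_env).
    print("Installed scikit-learn:", installed_version("scikit-learn"))
```

Running this inside the environment shows which scikit-learn release the fresh install resolved to, which can then be compared against whatever version the project documents as supported.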

Thank you!
