dbCAN-sub error #129

Open
typhainepl opened this issue Sep 7, 2023 · 9 comments
Labels
bug Something isn't working

Comments

@typhainepl

typhainepl commented Sep 7, 2023

Hi,

I'm encountering an error while trying to run dbCAN, and it appears to be related to the output generation. Any assistance you could provide would be greatly appreciated.

I've installed dbcan through conda.

The command I am running is the following:

run_dbcan test_seq.faa protein --out_dir test_2

The output and error message:

***************************1. DIAMOND start*************************************************

diamond v2.1.8.162 (C) Max Planck Society for the Advancement of Science, Benjamin Buchfink, University of Tuebingen
Documentation, support and updates available at http://www.diamondsearch.org
Please cite: http://dx.doi.org/10.1038/s41592-021-01101-x Nature Methods (2021)

#CPU threads: 4
Scoring parameters: (Matrix=BLOSUM62 Lambda=0.267 K=0.041 Penalties=11/1)
Temporary directory: test_2
#Target sequences to report alignments for: 1
Opening the database...  [0.08s]
Database: db/CAZy (type: Diamond database, sequences: 2428817, letters: 1157024505)
Block size = 2000000000
Building query seed set...  [0s]
Algorithm: Query-indexed
Building query histograms...  [0s]
Seeking in database...  [0s]
Loading reference sequences...  [3.2s]
Initializing temporary storage...  [0.014s]
Building reference histograms...  [6.609s]
Allocating buffers...  [0s]
Processing query block 1, reference block 1/1, shape 1/2.
Building reference seed array...  [2.74s]
Building query seed array...  [0s]
Computing hash join...  [0.058s]
Searching alignments...  [0s]
Deallocating memory...  [0s]
Processing query block 1, reference block 1/1, shape 2/2.
Building reference seed array...  [4.652s]
Building query seed array...  [0s]
Computing hash join...  [0.515s]
Searching alignments...  [0.001s]
Deallocating memory...  [0s]
Deallocating buffers...  [0.009s]
Clearing query masking...  [0s]
Computing alignments... Loading trace points...  [0.004s]
Sorting trace points...  [0s]
Computing alignments...  [0.001s]
Deallocating buffers...  [0s]
Loading trace points...  [0s]
 [0.007s]
Deallocating reference...  [0.007s]
Loading reference sequences...  [0s]
Deallocating buffers...  [0s]
Deallocating queries...  [0s]
Total time = 17.908s
Reported 0 pairwise alignments, 0 HSPs.
0 queries aligned.

***************************1. DIAMOND end***************************************************


***************************2. HMMER start*************************************************


***************************2. HMMER end***************************************************


***************************3. dbCAN_sub start***************************************************

ID count: 8
total time: 5.017667531967163

***************************3. dbCAN_sub end***************************************************

Traceback (most recent call last):
  File "/homes/typhaine/miniconda3/envs/run_dbcan/bin/run_dbcan", line 10, in <module>
    sys.exit(cli_main())
  File "/homes/typhaine/miniconda3/envs/run_dbcan/lib/python3.8/site-packages/dbcan_cli/run_dbcan.py", line 883, in cli_main
    run(inputFile=args.inputFile, inputType=args.inputType, cluster=args.cluster, dbCANFile=args.dbCANFile,
  File "/homes/typhaine/miniconda3/envs/run_dbcan/lib/python3.8/site-packages/dbcan_cli/run_dbcan.py", line 290, in run
    with open(f"{outPath}dbsub.out") as f:
FileNotFoundError: [Errno 2] No such file or directory: 'test_2/dbsub.out'

My input file (test_seq.faa) looks like this:

>0A023FBW4 E1142_AMBCJ
MTSHGAVKIAIFAVIALHSIFECLSKPQILQRTDHSTDSDWDPQMCPETCNPSKNISCSSECLCVTLGGGDETGTCFNMSGVDWLGHAQASDGHNDG
>A0A023FF81 E1126_AMBCJ
MTSHSAVRIAIFAVIALHSIFECLSKPQILQRTDKSTDSEWDPQTCPETCIPSKNITCSDGCVCVKLGEEEEGTCFNMTGVDWLGSPSDD
>A0A023PXA5 YA19A_YEAST
MLLSELVATASSLPYTAISIHNNCRVPAARHIHHGCRYFHGPPVMHLPQCLRTIQFSPSVISTSYQIPVICQHHAVVPTARYLPDYCSIISWHRPLWGIHILIVPQSQLPLPIRPKRIHTTHRYKPVIAFNDHIPSLALWICLHYQGSNGCVTPVAAKFFIIFHFVGLKEIMSPSRNATRNLNQYWRVL
>A0A023PXB5 IRC2_YEAST
MFALIISSKGKTSGFFFNSSFSSSALVGIAPLTAYSALVTPVFKSFLVILPAGLKSKSFAVNTPFKSCWCVIVMCSYFFCVYHLQKQHYCGAPSLYSYLLCL
>A0A023PXC2 YE53A_YEAST
MLPLCLTFLSFFLSLGGSFKAVMTKEEADGTTEAAACLFWIFNWTVTLIPLNSLVALAISSPTFFGDRPKGPIFGAKAAEAPTSPPTALRYKYLTSLGSNFGGIFVYPLFLLSTF
>A0A023PXD3 YE88A_YEAST
MTRLPPIPRMTVTLTTRPAVPTCNEGSSILHYIYIPIYEPNEQKEKRRRKTPPEPRAYTTTTTIATNSRISGCSLTLEDGIHLRGKRAETARLPAATPQKRTGPARG
>A0A023PXD5 YE147_YEAST
MMTAAKRLGLYSALRACSATVFRSNLHPKVTVATMFCSVGTIPDVAEVSFSDSGAALFMSSSLWKVVAGFVPSRFWFSHTCLVFGSNTILFASLNSFKRSSSAIIKKVSLDTPVYVGLEKKNKMQPLLPCFFRRAV
>A0A023PXE5 YH006_YEAST
MDLYPPASWAALVPFCKALTFKVPVVLGNRNPSPPSPLPPMALSLSLLIPLSRLSLSGSSDTADGSLLISCISRGSCGIFRMGCEAVKGRSLGCLLPRSNCTYGCMSLRKYVSVCSM

Best,

Typhaine

@yinlabniu
Collaborator

yinlabniu commented Sep 7, 2023 via email

@typhainepl
Author

My input file already has the > symbol in front of the seq IDs.

@typhainepl
Author

It is not shown in the email, but it is there.

@linnabrown
Owner

I will try your input on my local machine.

@yinlabniu
Collaborator

yinlabniu commented Sep 7, 2023 via email

@typhainepl
Author

Thank you!

@linnabrown
Owner

Hi @typhainepl, I figured it out. The E-value and coverage thresholds are too strict for your input (our default hmm_eval and hmm_cov are 1e-15 and 0.35). As a result, the parsed file is empty and the output file does not exist. We had not run into this case before, so the code assumes the file always exists.

  1. I am adding handling for the missing-file case to make the code robust.
  2. You can relax those thresholds by changing --hmm_eval and --hmm_cov on your command line.
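
For illustration, the missing-file case in item 1 could be handled along these lines. This is only a sketch, not the fix that went into run_dbcan: the helper name read_dbsub_lines is hypothetical, and it assumes, as the traceback suggests, that the output prefix already ends with a path separator (e.g. "test_2/").

```python
import os


def read_dbsub_lines(out_path):
    """Return the lines of <out_path>dbsub.out, or [] if it is missing or empty.

    Hypothetical helper: `out_path` is assumed to end with a path separator,
    matching the 'test_2/dbsub.out' path shown in the traceback.
    """
    path = f"{out_path}dbsub.out"
    # No hit passed the thresholds, so the file was never written (or is empty).
    if not os.path.isfile(path) or os.path.getsize(path) == 0:
        return []
    with open(path) as f:
        return f.readlines()
```

Returning an empty list lets the later steps treat "no dbCAN_sub hits" like any other empty result instead of raising FileNotFoundError.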

@linnabrown
Owner

linnabrown commented Sep 7, 2023

These are the E-values from the hmmscan output:


#                                                                            --- full sequence --- -------------- this domain -------------   hmm coord   ali coord   env coord
# target name        accession   tlen query name           accession   qlen   E-value  score  bias   #  of  c-Evalue  i-Evalue  score  bias  from    to  from    to  from    to  acc description of target
#------------------- ---------- ----- -------------------- ---------- ----- --------- ------ ----- --- --- --------- --------- ------ ----- ----- ----- ----- ----- ----- ----- ---- ---------------------
GH28_e71.hmm|GH28:685 -            346 A0A023FF81           -             90     0.022   14.4   0.1   1   1   2.9e-06     0.025   14.2   0.1   145   195    28    78    13    82 0.89 -
GH28_e105.hmm|GH28:14 -            364 A0A023FF81           -             90      0.11   12.0   0.2   1   1   1.4e-05      0.12   11.9   0.2   181   226    32    77    15    83 0.85 -
GH28_e4.hmm|GH28:43   -            363 A0A023FF81           -             90      0.13   11.6   0.0   1   1   1.5e-05      0.13   11.6   0.0   182   221    40    79    13    84 0.85 -
GT2_e221.hmm|GT2:21  -            228 A0A023PXD5           -            136      0.15   11.7   0.0   1   1   6.3e-06      0.16   11.7   0.0   165   190    53    78    13   107 0.84 -
#
# Program:         hmmscan
# Version:         3.3.2 (Nov 2020)
# Pipeline mode:   SCAN
# Query file:      test_2/0.txt
# Target file:     db/dbCAN_sub.hmm
# Option settings: hmmscan -o /dev/null --domtblout test_2/d0.txt --cpu 5 db/dbCAN_sub.hmm test_2/0.txt 
# Current dir:     /Users/xxx/Desktop/proj/dbcan/run_dbcan
# Date:            Thu Sep  7 18:24:01 2023
# [ok]
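
To make the filtering concrete, here is a small sketch that applies the default thresholds to the four hits listed above. It rests on assumptions that the thread does not confirm: coverage is taken as (hmm_to - hmm_from) / tlen, the i-Evalue column is the one compared against hmm_eval, and both comparisons are strict. The function name passing_hits is hypothetical.

```python
# Hits copied from the domtblout above:
# (target name, tlen, query name, i-Evalue, hmm_from, hmm_to)
HITS = [
    ("GH28_e71.hmm|GH28:685", 346, "A0A023FF81", 0.025, 145, 195),
    ("GH28_e105.hmm|GH28:14", 364, "A0A023FF81", 0.12, 181, 226),
    ("GH28_e4.hmm|GH28:43", 363, "A0A023FF81", 0.13, 182, 221),
    ("GT2_e221.hmm|GT2:21", 228, "A0A023PXD5", 0.16, 165, 190),
]


def passing_hits(hits, max_evalue=1e-15, min_coverage=0.35):
    """Keep hits whose E-value is below and HMM coverage is above the thresholds."""
    kept = []
    for target, tlen, query, evalue, hmm_from, hmm_to in hits:
        # Assumed definition: fraction of the HMM length covered by the alignment.
        coverage = (hmm_to - hmm_from) / tlen
        if evalue < max_evalue and coverage > min_coverage:
            kept.append((target, query, evalue, round(coverage, 3)))
    return kept


print(passing_hits(HITS))  # → []
```

Under these assumptions every hit fails both tests: the smallest i-Evalue is 0.025 and the largest coverage is about 0.14, so the filtered result is empty.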

@typhainepl
Author

Thank you for investigating and getting back to me. I'll change the e-value and coverage to see if I can get some results, but it would be great if the pipeline could handle empty results.

@linnabrown linnabrown added the bug Something isn't working label Jan 11, 2024