
RuntimeError: cuda runtime error (77) : an illegal memory access was encountered at /opt/conda/conda-bld/pytorch_1565287148058/work/aten/src/THC/THCCachingHostAllocator.cpp:296 #5

Open
yflv-yanxia opened this issue Sep 29, 2019 · 1 comment

yflv-yanxia commented Sep 29, 2019

If I run your demo code on CUDA I get the error below, but it runs correctly on CPU:

THCudaCheck FAIL file=/opt/conda/conda-bld/pytorch_1565287148058/work/aten/src/THC/THCCachingHostAllocator.cpp line=296 error=77 : an illegal memory access was encountered
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 18, in test_run
  File "/public/home/yflv/pytorch/torch-asg/torch_asg/asg.py", line 135, in forward
    target_lengths)
  File "/public/home/yflv/pytorch/torch-asg/torch_asg/asg.py", line 77, in forward
    batch_input_len, num_batches, num_labels, batch_output_len)
RuntimeError: cuda runtime error (77) : an illegal memory access was encountered at /opt/conda/conda-bld/pytorch_1565287148058/work/aten/src/THC/THCCachingHostAllocator.cpp:296
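
CUDA kernel launches are asynchronous, so the line this traceback blames is often not where the illegal access actually happened; error 77 is typically reported by whichever later call happens to synchronize. A standard first debugging step (not part of the original report) is to rerun with synchronous launches so the failure surfaces at the offending kernel:

# Debugging sketch: force synchronous kernel launches so the traceback
# points at the kernel that actually faults. The variable must be set
# before anything initializes CUDA.
import os
os.environ['CUDA_LAUNCH_BLOCKING'] = '1'
import torch

Equivalently, launch the script as CUDA_LAUNCH_BLOCKING=1 python test.py.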

My test code:

import torch
from torch_asg import ASGLoss

def test_run():
    num_labels = 7
    input_batch_len = 6
    num_batches = 2
    target_batch_len = 5
    asg_loss = ASGLoss(num_labels=num_labels,
                       reduction='mean',  # mean (default), sum, none
                       gpu_no_stream_impl=False,  # see below for explanation
                       forward_only=False  # see below for explanation
                       ).cuda()
    for i in range(1):
        # Note that inputs follow the CTC convention, so the batch dimension
        # is 1 instead of 0, in order to have a more efficient GPU implementation.
        inputs = torch.randn(input_batch_len, num_batches, num_labels, requires_grad=True).cuda()
        targets = torch.randint(0, num_labels, (num_batches, target_batch_len)).cuda()
        input_lengths = torch.randint(1, input_batch_len + 1, (num_batches,))
        target_lengths = torch.randint(1, target_batch_len + 1, (num_batches,))
        loss = asg_loss.forward(inputs, targets, input_lengths, target_lengths)
        print('loss', loss)
        # You can get the transition matrix if you need it.
        # transition[i, j] is the transition score from label j to label i.
        print('transition matrix', asg_loss.transition)
        loss.backward()
        print('transition matrix grad', asg_loss.transition.grad)
        print('inputs grad', inputs.grad)

test_run()
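
Two details in this snippet are worth checking while hunting the crash (general observations, not a confirmed diagnosis): input_lengths and target_lengths stay on the CPU while inputs and targets are moved to CUDA, and torch.randn(..., requires_grad=True).cuda() makes inputs a non-leaf tensor, so inputs.grad will print None after backward(). A sketch of the tensor setup with both ruled out:

# Hypothetical drop-in replacement for the tensor setup inside test_run:
# everything lives on one CUDA device, and `inputs` is a leaf tensor so
# that `inputs.grad` is populated by backward().
device = torch.device('cuda')
inputs = torch.randn(input_batch_len, num_batches, num_labels,
                     device=device, requires_grad=True)
targets = torch.randint(0, num_labels, (num_batches, target_batch_len),
                        device=device)
input_lengths = torch.randint(1, input_batch_len + 1, (num_batches,),
                              device=device)
target_lengths = torch.randint(1, target_batch_len + 1, (num_batches,),
                               device=device)

If torch-asg in fact expects the length tensors on the CPU (as torch.nn.CTCLoss permits), keep them there; the point is only to make the device placement deliberate while narrowing down the illegal access.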

My environment:

CentOS 7.2
Python 3.6.5 :: Anaconda, Inc.
PyTorch 1.2.0
CUDA 9.2.148
cuDNN 7.1.4

Daniel-CUHK commented

Hi, how do I solve this issue? I have the same one.
