
can't utilize GPU to accelerate computation. #55

Open
yb70 opened this issue Jul 5, 2018 · 3 comments

Comments

@yb70

yb70 commented Jul 5, 2018

Hi,
I tried to visualize the InceptionV4 model layers by feeding an MRI image to the network. Everything works well except that the GPU does not seem to be involved in the computation. TensorFlow does allocate GPU memory for the process, but GPU utilization stays at 0%, and most of the time only one CPU core is working. How can I use the GPU to accelerate this? My code is below.

```python
import tensorflow as tf
from tf_cnnvis import *
from nets.inception_v4 import inception_v4_base, inception_v4_arg_scope
import matplotlib.image as mpimg
import numpy as np

slim = tf.contrib.slim

if __name__ == '__main__':
    X = tf.placeholder(tf.float32, [None, 160, 160, 3])

    # Load a single-channel image, crop to 160x160 and replicate to 3 channels
    img = mpimg.imread('data/image.png')
    img = img[42:202, 3:163]
    img = np.stack([img, img, img], axis=2)
    img = np.reshape(img, [1, 160, 160, 3])

    with slim.arg_scope(inception_v4_arg_scope()):
        net_out = inception_v4_base(inputs=X)

    t_vars = tf.trainable_variables()
    IV4_vars = [var for var in t_vars if var.name.startswith('InceptionV4')]

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        saver = tf.train.Saver(var_list=IV4_vars)
        saver.restore(sess,
                      'checkpoint/inception_v4_2016_09_09/inception_v4.ckpt')

        _ = deconv_visualization(
            sess_graph_path=sess,
            value_feed_dict={X: img},
            layers=['r', 'p', 'c'],
            path_logdir='summary/cnnvis/log',
            path_outdir='summary/cnnvis/out')
```
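For reference, GPU visibility can be checked with a minimal snippet like the one below (assuming a standard TF 1.x setup; `log_device_placement` only reports where ops are placed, it does not change placement):

```python
import tensorflow as tf

# Print the device each op is assigned to when the session is created.
config = tf.ConfigProto(log_device_placement=True)
with tf.Session(config=config) as sess:
    a = tf.random_normal([1024, 1024])
    b = tf.random_normal([1024, 1024])
    # The matmul should be reported on /device:GPU:0 if a GPU is visible.
    print(sess.run(tf.reduce_sum(tf.matmul(a, b))))
```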

Thanks for the excellent work : )

@jidebingfeng

I have the same problem!

```
Reconstruction Completed for FirstStageFeatureExtractor/resnet_v1_101/resnet_v1_101/conv1/Conv2D layer. Time taken = 7.082620 s
Reconstruction Completed for FirstStageFeatureExtractor/resnet_v1_101/resnet_v1_101/block3/unit_1/bottleneck_v1/conv1/Conv2D layer. Time taken = 123.632443 s
Reconstruction Completed for FirstStageFeatureExtractor/resnet_v1_101/resnet_v1_101/block3/unit_9/bottleneck_v1/conv2/Conv2D layer. Time taken = 111.493410 s
Reconstruction Completed for FirstStageFeatureExtractor/resnet_v1_101/resnet_v1_101/block3/unit_17/bottleneck_v1/conv3/Conv2D layer. Time taken = 602.894801 s
```

@falaktheoptimist
Member

Here is how we currently compute the deconvolution: for each layer whose deconvolution output we want, we run one forward pass. Then we run 8 backward passes simultaneously, one for each of 8 channels in that layer, then the next 8 feature maps, and so on. The value 8 was chosen to account for memory limitations. Also, there is no learning happening here, so we cannot exploit the GPU the way an optimizer does with repeated operations on the same data, as in the case of learning weights. Do suggest if you see other possible solutions.
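The batching pattern is roughly the sketch below. It is only illustrative: it uses plain `tf.gradients` instead of the modified deconvnet ReLU backward pass that tf_cnnvis actually applies, and the function and argument names are made up for the example.

```python
import numpy as np
import tensorflow as tf

def deconv_in_channel_batches(sess, input_ph, image, layer_tensor, batch=8):
    """Reconstruct an input-space map for every channel of `layer_tensor`,
    running `batch` backward passes together in each session call."""
    n_channels = layer_tensor.get_shape().as_list()[-1]
    reconstructions = []
    for start in range(0, n_channels, batch):
        grads = [
            # Gradient of one channel's strongest activation w.r.t. the input.
            tf.gradients(tf.reduce_max(layer_tensor[..., c]), input_ph)[0]
            for c in range(start, min(start + batch, n_channels))
        ]
        # Evaluating the list in a single sess.run lets TensorFlow execute
        # the `batch` backward passes in one go.
        reconstructions.extend(sess.run(grads, feed_dict={input_ph: image}))
    return np.array(reconstructions)
```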

@yb70
Author

yb70 commented Aug 23, 2018 via email
