run_classifier.py Error COLA #31

Open
zzj0402 opened this issue Jan 15, 2020 · 2 comments
Comments

@zzj0402

zzj0402 commented Jan 15, 2020

Running the CoLA script returns:

2020-01-15 17:53:21.504699: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:163] no NVIDIA GPU device is present: /dev/nvidia0 does not exist
2020-01-15 17:53:21.505194: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 AVX512F FMA
2020-01-15 17:53:21.518577: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 3599910000 Hz
2020-01-15 17:53:21.519665: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x3c2f130 executing computations on platform Host. Devices:
2020-01-15 17:53:21.519701: I tensorflow/compiler/xla/service/service.cc:175]   StreamExecutor device (0): Host, Default Version
Traceback (most recent call last):
  File "run_classifer.py", line 457, in <module>
    app.run(main)
  File "/usr/local/lib/python3.6/dist-packages/absl/app.py", line 299, in run
    _run_main(main, args)
  File "/usr/local/lib/python3.6/dist-packages/absl/app.py", line 250, in _run_main
    sys.exit(main(argv))
  File "run_classifer.py", line 307, in main
    loss_multiplier=loss_multiplier)
  File "run_classifer.py", line 195, in get_model
    pooled_output, _ = albert_layer(input_word_ids, input_mask, input_type_ids)
  File "/root/ALBERT-TF2.0/albert.py", line 212, in __call__
    return super(AlbertModel, self).__call__(inputs, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/keras/engine/base_layer.py", line 842, in __call__
    outputs = call_fn(cast_inputs, *args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/autograph/impl/api.py", line 237, in wrapper
    raise e.ag_error_metadata.to_exception(e)
RuntimeError: in converted code:

    /root/ALBERT-TF2.0/albert.py:229 call  *
        word_embeddings = self.embedding_lookup(input_word_ids)
    /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/keras/engine/base_layer.py:817 __call__
        self._maybe_build(inputs)
    /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/keras/engine/base_layer.py:2141 _maybe_build
        self.build(input_shapes)
    /root/ALBERT-TF2.0/albert.py:273 build
        dtype=self.dtype)
    /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/keras/engine/base_layer.py:522 add_weight
        aggregation=aggregation)
    /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/tracking/base.py:744 _add_variable_with_custom_getter
        **kwargs_for_getter)
    /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/keras/engine/base_layer_utils.py:139 make_variable
        shape=variable_shape if variable_shape else None)
    /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/ops/variables.py:258 __call__
        return cls._variable_v1_call(*args, **kwargs)
    /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/ops/variables.py:219 _variable_v1_call
        shape=shape)
    /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/ops/variables.py:65 getter
        return captured_getter(captured_previous, **kwargs)
    /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/distribute/distribute_lib.py:1322 creator_with_resource_vars
        return self._create_variable(*args, **kwargs)
    /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/distribute/one_device_strategy.py:262 _create_variable
        return next_creator(*args, **kwargs)
    /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/ops/variables.py:197 <lambda>
        previous_getter = lambda **kwargs: default_variable_creator(None, **kwargs)
    /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/ops/variable_scope.py:2507 default_variable_creator
        shape=shape)
    /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/ops/variables.py:262 __call__
        return super(VariableMetaclass, cls).__call__(*args, **kwargs)
    /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/ops/resource_variable_ops.py:1406 __init__
        distribute_strategy=distribute_strategy)
    /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/ops/resource_variable_ops.py:1537 _init_from_args
        initial_value() if init_from_fn else initial_value,
    /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/keras/engine/base_layer_utils.py:119 <lambda>
        init_val = lambda: initializer(shape, dtype=dtype)
    /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/ops/init_ops_v2.py:343 __call__
        self.stddev, dtype)
    /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/ops/init_ops_v2.py:809 truncated_normal
        shape=shape, mean=mean, stddev=stddev, dtype=dtype, seed=self.seed)
    /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/ops/random_ops.py:171 truncated_normal
        mean_tensor = ops.convert_to_tensor(mean, dtype=dtype, name="mean")
    /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py:1184 convert_to_tensor
        return convert_to_tensor_v2(value, dtype, preferred_dtype, name)
    /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py:1242 convert_to_tensor_v2
        as_ref=False)
    /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py:1296 internal_convert_to_tensor
        ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
    /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/tensor_conversion_registry.py:52 _default_conversion_function
        return constant_op.constant(value, dtype, name=name)
    /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/constant_op.py:227 constant
        allow_broadcast=True)
    /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/constant_op.py:235 _constant_impl
        t = convert_to_eager_tensor(value, ctx, dtype)
    /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/constant_op.py:96 convert_to_eager_tensor
        return ops.EagerTensor(value, ctx.device_name, dtype)

    RuntimeError: /job:localhost/replica:0/task:0/device:GPU:0 unknown device.
@suresh96458

@zzj0402, can you help find a solution to this issue: #32

@Bidek56

Bidek56 commented Jan 17, 2020

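Are you trying to run on a GPU when you don't have one, or when it isn't configured? The first log line (no NVIDIA GPU device is present: /dev/nvidia0 does not exist) suggests TensorFlow cannot see a GPU.
If you do have one, please try a Docker image, e.g. docker pull tensorflow/tensorflow:latest-gpu-py3, to ensure your GPU is configured.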
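
A quick way to confirm what TensorFlow can see, and to fall back to the CPU when no GPU is visible, might look like the sketch below. It is only an illustration: pick_device and visible_gpus are hypothetical helper names, not functions from this repository, and how run_classifier.py actually chooses its device is an assumption inferred from the traceback, which passes through one_device_strategy.py and ends at device:GPU:0.

```python
def pick_device(gpu_names):
    """Return the first GPU's device string if any GPU is visible, else the CPU."""
    return "/gpu:0" if gpu_names else "/cpu:0"


def visible_gpus():
    """List the GPUs TensorFlow can see; empty if TensorFlow is not installed."""
    try:
        import tensorflow as tf
    except ImportError:
        return []
    # TF 2.0 exposes this call under tf.config.experimental.
    return [d.name for d in tf.config.experimental.list_physical_devices("GPU")]


device = pick_device(visible_gpus())
print("Selected device:", device)

# Assumption: the script builds its strategy from a device string, e.g.
#   strategy = tf.distribute.OneDeviceStrategy(device=device)
# Passing "/cpu:0" here should avoid the "GPU:0 unknown device" error
# on a machine without a usable GPU.
```

If this prints /cpu:0 on a machine that has an NVIDIA card, the driver or CUDA setup is probably the problem rather than the script, which is where the GPU Docker image can help.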

3 participants