Add to_device() to VarBuilder to load weights to a specific GPU #1388

Open · wants to merge 1 commit into main
Conversation

@mokeyish (Contributor) commented Nov 30, 2023

This makes it easier to load transformer blocks onto different GPUs:

fn load(vb: VarBuilder) {
    // ...
    let blocks: Vec<_> = (0..cfg.n_layers)
        .map(|i| {
            // Place consecutive layers on the same GPU, moving to the next
            // device every `per_device_layers` layers.
            let dev_ordinal = i / per_device_layers;
            log::debug!("load block {i} into GPU{dev_ordinal}");

            // Re-target the VarBuilder at the chosen CUDA device before
            // loading this block's weights.
            let vb = vb.to_device(&Device::new_cuda(dev_ordinal).unwrap());

            Block::load(
                vb.pp(&format!("model.layers.{i}")),
                cache,
                &cfg,
                comm.clone(),
            )
            .unwrap()
        })
        .collect();
    // ...
}

@mokeyish (Contributor, Author) commented Dec 2, 2023

@LaurentMazare Hi, two checks failed because the credentials could not be obtained.

But what do you think of this feature? Or are there any other alternatives?

@LaurentMazare (Collaborator) commented
I'm not really sure about this; I'm a bit afraid it makes the VarBuilder harder to reason about. For example, currently when using a VarBuilder backed by a VarMap, the tensors stored in the map are on the device specified by the VarBuilder and can be returned on subsequent calls with the same name. With this change, if to_device was called on the VarBuilder, it's a bit unclear what should happen: the current version would return the old tensors even if they are not on the same device, so we should at least check for this.
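To make the concern concrete, here is a minimal sketch (not part of the PR) assuming the proposed to_device method; the VarMap/VarBuilder usage follows candle-nn's existing API as I understand it:

// Sketch of the caching ambiguity with a VarMap-backed VarBuilder.
// `to_device` is the API proposed in this PR; its exact behavior is the open question.
use candle_core::{DType, Device, Result};
use candle_nn::{VarBuilder, VarMap};

fn sketch() -> Result<()> {
    let varmap = VarMap::new();
    let vb = VarBuilder::from_varmap(&varmap, DType::F32, &Device::Cpu);

    // First call: "w" is created on the CPU and cached in the VarMap.
    let w_cpu = vb.get((4, 4), "w")?;

    // Proposed API: re-target the builder at GPU 0.
    let vb_gpu = vb.to_device(&Device::new_cuda(0)?);

    // Same name again: does this return the cached CPU tensor, move it to the
    // GPU, or fail? This is the behavior that would need to be specified.
    let w_again = vb_gpu.get((4, 4), "w")?;
    println!("{:?} vs {:?}", w_cpu.device(), w_again.device());
    Ok(())
}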

@mokeyish (Contributor, Author) commented Dec 4, 2023

For some models, a single GPU does not have enough memory to hold the entire model, so the model has to be split by layer across several GPU devices.

Because candle has no equivalent of torch's model.to(device), putting some of a transformer's layers on a specific GPU is rather cumbersome. With this PR, we can write:

let vb4layer = vb.pp(&format!("transformer.layers.{i}")).to_device(device)?;
let block = BlockLayer::load(vb4layer)?;

Without this PR, we can only load the weights into the layer first and then write a lot of code to move them to the target device one by one, roughly as sketched below.
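For illustration, a rough sketch of that manual workaround; the Block struct and its fields are hypothetical, and only Tensor::to_device is existing candle API:

// Per-tensor workaround when VarBuilder has no to_device.
// There is no module-level `.to(device)`, so every field is moved by hand.
use candle_core::{Device, Result, Tensor};

struct Block {
    attn_q: Tensor,
    attn_k: Tensor,
    attn_v: Tensor,
    mlp_up: Tensor,
    mlp_down: Tensor,
}

fn move_block_to(block: &Block, device: &Device) -> Result<Block> {
    Ok(Block {
        attn_q: block.attn_q.to_device(device)?,
        attn_k: block.attn_k.to_device(device)?,
        attn_v: block.attn_v.to_device(device)?,
        mlp_up: block.mlp_up.to_device(device)?,
        mlp_down: block.mlp_down.to_device(device)?,
        // ...and so on for every remaining tensor in a real block.
    })
}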

@LaurentMazare (Collaborator) commented
I agree that the use case you mentioned is not well covered at the moment. What I was saying is that the changes proposed by this PR have some drawbacks that we have to be a bit more careful about; I would have to think a bit more about whether there is a good way around this.
