sedutil-cli --initialsetup fails with “NOT_AUTHORIZED” on uninitialized drive when booting with UEFI. #291

Closed
cpintado opened this issue Jun 13, 2019 · 9 comments

Comments

@cpintado

cpintado commented Jun 13, 2019

Our infrastructure requires that we boot servers with SEDs via PXE and unlock the SEDs unattended. We boot a Linux environment in which we have installed sedutil-cli, check whether the disk has locking enabled, and if not, configure the SED as per the steps documented here (except for the setup of the PBA, as we don’t use it). In some particular cases, it doesn’t work. For the purposes of this issue I will describe some tests we did manually to isolate the problem:

First, we boot in a Linux rescue mode in which we have installed sedutil-cli.

We then run the following command:

sedutil-cli --query /dev/nvme0

In all drives we used to test, part of the output of the above command is this:
Locking function (0x0002)
Locked = N, LockingEnabled = N, LockingSupported = Y, MBRDone = N, MBREnabled = N, MediaEncrypt = Y

We see here that because LockingEnabled is N, the disk has not been initialized. We then run the following command:

sedutil-cli -v --initialsetup test /dev/nvme0n1
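For unattended use, the check on the query output described above could be scripted. This is a minimal shell sketch, assuming the "Locking function" line always uses the comma-separated `Name = Y/N` layout quoted above. The helper names `locking_field` and `drive_is_uninitialized` are made up for illustration and are not part of sedutil.

```shell
# Print the value (Y or N) of one field from the "Locking function" line
# of `sedutil-cli --query` output, read on stdin.
# ASSUMPTION: the line looks like the one quoted in this issue, e.g.
#   Locked = N, LockingEnabled = N, LockingSupported = Y, ...
locking_field() {
    awk -v want="$1" '
        /Locked = / {
            n = split($0, pairs, /, */)          # one "Name = V" per element
            for (i = 1; i <= n; i++) {
                split(pairs[i], kv, / *= */)     # kv[1] = name, kv[2] = value
                gsub(/^[ \t]+|[ \t]+$/, "", kv[1])
                gsub(/^[ \t]+|[ \t]+$/, "", kv[2])
                if (kv[1] == want) { print kv[2]; exit }
            }
        }'
}

# Succeeds (exit status 0) when the query output on stdin reports
# LockingEnabled = N, i.e. the drive looks uninitialized.
drive_is_uninitialized() {
    [ "$(locking_field LockingEnabled)" = "N" ]
}
```

With these helpers, something like `sedutil-cli --query /dev/nvme0 | drive_is_uninitialized` could gate the call to `--initialsetup`.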

We have observed that when booting in Legacy mode with the “Intel SSD DC P4610” and the “Samsung SSD 983 DCT M.2 1” drives, the command works well and the rest of the configuration can be completed. However, when UEFI was used to boot the server with the “Samsung SSD 983 DCT M.2 1” drive, running the above command returned this error:

method status code NOT_AUTHORIZED
Session start failed rc = 1
One or more header fields have 0 length
EndSession Failed
takeOwnership failed
Initial setup failed - unable to take ownership

We don’t understand why this is happening, because we know that the disk was not initialized previously and also LockingEnabled is “N”. Switching to boot in Legacy mode makes it work well again.
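In an unattended run it may help to tell this specific failure apart from other ones. This is a minimal shell sketch that only pattern-matches the messages quoted above. The function name `classify_setup_log` and its three result strings are hypothetical, and the output of a successful run is not shown in this issue, so the fallback case deliberately claims nothing about success.

```shell
# Classify captured `sedutil-cli --initialsetup` output read on stdin.
# Prints one of: not_authorized, failed, no_known_error.
# ASSUMPTION: the error strings are exactly the ones quoted in this issue.
classify_setup_log() {
    log="$(cat)"
    case "$log" in
        *"method status code NOT_AUTHORIZED"*) echo not_authorized ;;
        *"Initial setup failed"*)              echo failed ;;
        *)                                     echo no_known_error ;;
    esac
}
```

A wrapper could capture the command output (stdout and stderr) into a variable and pipe it into `classify_setup_log` to decide whether a PSID revert is worth attempting.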

In our rescue environment, we are installing the sedutil package from the CentOS 7 repositories, which is version 1.15.1-1

Another detail is that if we boot the server in UEFI in rescue mode and remove the configuration using a PSID revert, then we can complete the setup (as was suggested in other issues like #232 and #289). However, if the OPAL configuration is removed using "sedutil-cli --revertnoerase" and "sedutil-cli --reverttper", then the problem happens again the next time we try to set it up. It seems that when booting in UEFI, for some reason a PSID revert is required before setting up the configuration.

Have you seen this issue before? Are there any pointers you can give us to test it? We are also trying to contact the manufacturer (Samsung) to see if this is something particular to their implementation (we will also test with disks from another manufacturer).

I would like to understand whether this is a common issue when booting in UEFI or something specific to this hardware. I'm interested because I have to PXE boot a large number of servers and apply this configuration to them unattended, and it will be difficult (though not impossible) to collect all the PSIDs of the disks in advance so that I can automate running a PSID revert before the initial setup.
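If the PSIDs were collected in advance, the lookup step of such automation could be as small as the sketch below. Everything about the inventory is an assumption: a hypothetical plain-text file with one `<serial> <psid>` pair per line, which sedutil does not provide and which would have to be filled in from the drive labels. The function name `psid_for_serial` is likewise made up.

```shell
# Look up the PSID for a drive serial number in a plain-text inventory.
# ASSUMPTION: the inventory is a hypothetical file with one
# "<serial> <psid>" pair per line, separated by whitespace.
psid_for_serial() {
    inventory="$1"
    serial="$2"
    awk -v s="$serial" '
        $1 == s { print $2; found = 1; exit }
        END     { exit !found }   # non-zero status when the serial is unknown
    ' "$inventory"
}
```

A non-zero exit status lets the calling script skip or flag drives whose PSID has not been recorded, instead of passing an empty value on.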

@ladar

ladar commented Jul 1, 2019

When booting with UEFI, you need to ensure CSM support has been enabled.

@oom-is

oom-is commented Oct 22, 2019

You might not need CSM at this point.

I submitted a Pull Request #306 that updates Buildroot and the Linux kernel significantly. Using either my tree with that Pull Request integrated, or the @ChubbyAnt tree which also integrates it, I would be VERY interested whether you still see this behavior.

I have a hard requirement to support true UEFI boot and installs, and right now It Just Works without CSM for me using the updated Buildroot (syslinux) and kernel. I also have included a UEFI-bootable ISO version of the Rescue64 in the binaries I uploaded earlier today after testing.

@th-joerger

@oom-is I just tried your beta-build and I still have the exact same problem as @cpintado
Used this iso, UEFI bootstick, CSM activated and 970 Pro 1 TB

I can give you further debugging output, if this helps!

@cpintado
Author

@oom-is thank you! I don't have access to that system anymore as I have changed jobs, but I will forward your suggestion to the new team to see if they can check that.

@oom-is

oom-is commented Oct 22, 2019

@cpintado Thanks. I'll be interested to hear anything more - noting that at this point I only have one NVMe drive that supports TCG Opal, and zero drives that support TCG Enterprise. Regardless, I want this to just work...and it looks like new Samsung 983 DCT M.2 drives are surprisingly inexpensive.

Before I go buy a new drive, though:
@th-joerger You're getting equivalent results (unable to initialize "normally" after true UEFI boot, but either CSM or PSID full crypto erase allows --initialsetup) with a Samsung 970 Pro? I have a 970 Evo which I've not yet attempted to initialize - I'm going to do some initial "let's see what happens" with that drive before I head off in other directions.

I'm very interested that the behavior is specific so far to NVMe Samsung drives but is reported as similar across both TCG Enterprise (DCT 983) and TCG Opal 2.0 (970 Pro / Evo) compliant drives. I also note that @ChubbyAnt has stated their configuration as "HP 15M-DS0012DX, Envy x360 with the AMD Ryzen 3700u and a Samsung 970 Evo Plus 2TB" and reported that the recent buildroot/kernel updates did remove their need for CSM.

There's something I'm missing here.

@oom-is

oom-is commented Oct 23, 2019

My working hypothesis: this isn't a SEDutil bug or omission but rather a Samsung "feature".

I found the thing I was vaguely remembering: #248, but scroll down to the second comment. The original issue was a SanDisk drive that actually didn't support TCG Opal.

We did find a Samsung new drive had to be PSID reverted before
it could be noticed as Opal. In effect, this blocks the attack vector
of a bad guy seizing an un-owned Opal drive which has been
known for many years. However, I do not believe this blocking behavior
is approved by any of the TCG specs yet. Sedutil can be employed
for the revert. regards,
Bob Thibadeau

For those who might not be aware, Bob Thibadeau is at least the godfather of SEDutil 1.15.1 and probably much more. He was heavily involved in the TCG when they wrote the Opal SSC, he was Chief Technologist at Seagate when they developed their Opal SED line, then Chief Scientist at WAVE (SED management software), and in a SHOCKING development he's the main driver behind Bright Plaza/Drive Trust Alliance.

Why would Samsung require a PSID revert to use the drive with Opal?
If you're a security-aware person, this actually makes some (limited) sense. If malware ever gets onto a PC where there are SEDs installed but no program or individual has "taken crypto ownership" then one could postulate the ultimate ransomware scam (take crypto ownership of the drives, and the users can't possibly recover their data.) Requiring a full PSID revert implies that someone with physical access to the system REALLY wants to start managing security on the drive. Well, at least intends to start managing....oh I give up.

@th-joerger

@oom-is I can confirm your findings. Just tested with the beta-build and a 970 Evo drive. PSID reset did enable Opal functionality and the drive could be initialised. Strangely enough, this did NOT delete anything from the drive, so I'd guess Samsung uses this "hack" where they require you to provide the PSID to unlock Opal but don't actually reset the device when no crypto ownership has been taken.

The How-To is not up to date, as it has some commands wrong (missing userid). Any ideas how to update the wiki?

n.b.: I ran into some problems with the locked drive preventing the system from even POSTing, but this ultimately was a bug in the MSI MPG Z390 firmware and easily fixed with a BIOS upgrade.

@oom-is

oom-is commented Oct 23, 2019

@th-joerger I'm glad to hear that it sounds as though you're able to use your 970 Evo drive. From my notes I've seen a similar behavior (unable to perform --initialSetup until a PSID "revert" is run, and the PSID revert doesn't actually erase data) on a Seagate drive that had Windows 10 installed. Again, I think that behavior is by design...but we certainly need to pinpoint where and when it occurs and get it documented.

Edited To Add: You're correct that my "beta" build needs better docs, and I'm working on those. For right now, the initialization process is as noted by @ladar here - both his tree and mine (and others) include the patches that start implementing multiuser support in SEDutil. The --setmbrdone and --setlockingrange commands now expect a username, which conventionally is Admin1 during the initialization process.

As for updating the Wiki on this main DTA tree - at this point I'm expecting to re-write from scratch; only project contributors can update the Wiki here and the chance of getting anyone added as a contributor seems slim.

Separately, though - the one piece above that still doesn't make sense is that on the DCT 983 (Enterprise) drive, the behavior appears different depending on whether CSM is enabled or not. I think the consumer (Opal) Samsung drive behavior is sufficiently explained (and probably intentional) but without an Enterprise drive I'm still guessing at what is happening there and why. More news to follow.

@nuku97

nuku97 commented May 1, 2020

First of all, thanks to everyone who is contributing to this software. I recently set up a new computer build (Ryzen 3700X, MSI B450 Mortar Max mainboard) with an OPAL compatible PCI Express NVMe SSD, a Mushkin Pilot-E 2 TB. I am using a dual-boot configuration of Gentoo Linux and Windows with Grub/UEFI as boot loader.

Thanks to the build provided by @ChubbyAnt I was able to boot my Ryzen into the Rescue System (the original image from this repository did not boot) and installed the encryption with sedutil. However, I experienced the same behaviour as described here: "sedutil-cli --initialsetup debug /dev/nvme0" failed with "NOT_AUTHORIZED". After doing a full backup, I tried the method described here of doing a PSID revert ("sedutil-cli --PSIDrevert /dev/nvme0") prior to the "initialsetup". In my case too, the drive was NOT erased by the PSID revert, and afterwards the "initialsetup" was successful.

Now I have run into another problem, as described in #27: after Preboot Authentication the computer forgets the Grub/Gentoo Linux UEFI boot entry and only boots into Windows. Thankfully, the workaround documented in that issue (moving Microsoft's bootloader from /EFI/Microsoft to /EFI/MS and installing Grub's bootloader as /EFI/Microsoft/Boot/bootmgfw.efi, with Grub chainloading the Windows bootloader from the MS directory) solved this for now, although I worry that I will have to redo this regularly after Windows updates.
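The file shuffling in that workaround can be sketched as a small shell function. This is only an illustration under assumptions: the function name `apply_bootmgfw_workaround` is made up, the ESP mount point and the location of Grub's EFI binary differ between installs and are therefore passed in as arguments, and the function refuses to run if /EFI/MS already exists so that a second run cannot overwrite the saved Microsoft files.

```shell
# Move Microsoft's bootloader directory aside and put Grub in its place.
#   $1 = root of the mounted EFI System Partition (ASSUMPTION: already mounted)
#   $2 = path to Grub's EFI binary (ASSUMPTION: location varies per install)
apply_bootmgfw_workaround() {
    esp="$1"
    grub_efi="$2"

    if [ ! -d "$esp/EFI/Microsoft" ]; then
        echo "no EFI/Microsoft directory under $esp" >&2
        return 1
    fi
    if [ -e "$esp/EFI/MS" ]; then
        echo "EFI/MS already exists, refusing to overwrite it" >&2
        return 1
    fi
    if [ ! -f "$grub_efi" ]; then
        echo "Grub EFI binary not found: $grub_efi" >&2
        return 1
    fi

    # Keep the original Microsoft files under EFI/MS ...
    mv "$esp/EFI/Microsoft" "$esp/EFI/MS" &&
    # ... and install Grub under the name the firmware falls back to.
    mkdir -p "$esp/EFI/Microsoft/Boot" &&
    cp "$grub_efi" "$esp/EFI/Microsoft/Boot/bootmgfw.efi"
}
```

The Grub menu entry for Windows then has to chainload /EFI/MS/Boot/bootmgfw.efi instead of the original path. If a Windows update recreates /EFI/Microsoft, the steps would need to be reviewed by hand rather than rerun blindly.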
