Rogue Carrier Support for AGX Xavier Jetson #710

Open
frankvp11 opened this issue Dec 23, 2024 · 6 comments

@frankvp11

frankvp11 commented Dec 23, 2024

Issue: Re-adding Support for CTI Rogue Carrier Boards on AGX Xavier

Previous Context


Current Problem

I am attempting to re-add support for the CTI Rogue Carrier boards for the AGX Xavier. While building the image was successful, flashing it to the carrier board in recovery mode fails.


Build Process

The image was built successfully, producing the following output:

NOTE: Tasks Summary: Attempted 4902 tasks of which 0 didn't need to be rerun and all succeeded.

Summary: There were 74 WARNING messages.  
[000005777][LOG]Build for cti-rogue-xavier succeeded.  
[000005777][LOG]If build for cti-rogue-xavier succeeded, final image should have been generated here:  
[000005777][LOG]   build/tmp/deploy/images/cti-rogue-xavier/balena-image-cti-rogue-xavier.balenaos-img  
[000005777][LOG]Done.  

console-latest.log


Flashing Error

When flashing the image to the carrier board in recovery mode, the following error occurs:

The above License Agreement can be consulted at https://developer.download.nvidia.com/embedded/L4T/r35_Release_v2.1/release/Tegra_Software_License_Agreement-Tegra-Linux.txt  
Consistency check done for /tmp/22/Linux_for_Tegra  
Using cached Linux_for_Tegra  
Checking resin cache  
Cache image check done  
Extracting partition bootloader-dtb from /tmp/22/resin/img to /tmp/22/resin  
...  
TypeError: Cannot read properties of undefined (reading 'name')  
    at exports.extractPartition (/usr/src/app/jetson-flash/lib/utils.js:137:38)  
    at async ResinJetsonFlash.generateArtifacts (/usr/src/app/jetson-flash/lib/resin-jetson-flash.js:423:14)  
    at async ResinJetsonFlash.run (/usr/src/app/jetson-flash/lib/resin-jetson-flash.js:451:3)  
    at async run (/usr/src/app/jetson-flash/bin/cmd.js:79:2)  

Flashing was attempted using the Docker container approach, following the steps outlined here.
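The trace shows a `name` property being read from an undefined value inside `extractPartition`, while the log line just before it mentions extracting `bootloader-dtb`. One plausible reading is that a by-name partition lookup found nothing in the image. The sketch below illustrates that failure mode and a guard that reports it clearly. It is written in Python for illustration only and is not the jetson-flash code; `find_partition`, `require_partition`, and the dictionary layout are hypothetical.

```python
from typing import Dict, List, Optional


def find_partition(partitions: List[Dict], name: str) -> Optional[Dict]:
    """Return the first partition entry with the given name, or None."""
    return next((p for p in partitions if p.get("name") == name), None)


def require_partition(partitions: List[Dict], name: str) -> Dict:
    """Like find_partition, but fail with a descriptive error when absent.

    Without a guard like this, code that reads a field from the lookup
    result fails with an opaque error when the lookup returns nothing.
    """
    part = find_partition(partitions, name)
    if part is None:
        available = ", ".join(p.get("name", "?") for p in partitions) or "none"
        raise LookupError(
            f"partition {name!r} not found in image; available: {available}"
        )
    return part
```

With a partition list that only contains the `resin-*` entries, `require_partition(parts, "bootloader-dtb")` raises a `LookupError` that names the missing partition instead of failing on a property read.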


Changes Made So Far

I have made the following changes in an attempt to re-add support for the CTI Rogue Carrier boards:

  1. Added the following files:

  2. Updated the PINMUXCFG to my copy at:
    Reference Line

  3. Re-added cti-rogue-xavier DTBName in balena-image.inc:
    Reference Line

  4. Made related CTI changes in:
    Reference File


Request for Help

I would greatly appreciate assistance in identifying what might be causing the flashing error and whether there are additional changes or corrections I should make. Please let me know if more logs or details are needed.

@frankvp11
Author

frankvp11 commented Dec 24, 2024

Comparing the image that the script created with the one available from balena, it is evident that this is a partitioning issue:

sudo parted balena-image-cti-rogue-xavier_15\:26.balenaos-img print

outputs:

Model:  (file)
Disk /home/frank/images/balena-image-cti-rogue-xavier_15:26.balenaos-img: 1279MB
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Disk Flags: 

Number  Start   End     Size    File system  Name         Flags
 1      4194kB  46.1MB  41.9MB  fat16        resin-boot   boot, esp
 2      46.1MB  545MB   499MB   ext4         resin-rootA
 3      545MB   1044MB  499MB   ext4         resin-rootB
 4      1044MB  1065MB  21.0MB  ext4         resin-state
 5      1065MB  1278MB  213MB   ext4         resin-data

On the image from balena:

sudo parted balena-cloud-roya-nv-jetson-xavier-6.0.13-v16.4.6.img print
Model:  (file)
Disk /home/frank/images/balena-cloud-roya-nv-jetson-xavier-6.0.13-v16.4.6.img: 1829MB
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Disk Flags: 

Number  Start   End     Size    File system  Name              Flags
 1      20.5kB  532kB   512kB                mts-mce
 2      532kB   1044kB  512kB                mts-mce_b
 3      1044kB  5239kB  4194kB               mts-proper
 4      5239kB  9433kB  4194kB               mts-proper_b
 5      9433kB  9957kB  524kB                cpu-bootloader
 6      9957kB  10.5MB  524kB                cpu-bootloader_b
 7      10.5MB  10.9MB  393kB                bootloader-dtb
 8      10.9MB  11.3MB  393kB                bootloader-dtb_b
 9      11.3MB  13.4MB  2097kB               secure-os
10      13.4MB  15.5MB  2097kB               secure-os_b
11      15.5MB  15.6MB  131kB                eks
12      15.6MB  15.7MB  131kB                eks_b
13      15.7MB  16.8MB  1049kB               bpmp-fw
14      16.8MB  17.8MB  1049kB               bpmp-fw_b
15      17.8MB  18.9MB  1049kB               bpmp-fw-dtb
16      18.9MB  19.9MB  1049kB               bpmp-fw-dtb_b
17      19.9MB  20.2MB  262kB                xusb-fw
18      20.2MB  20.4MB  262kB                xusb-fw_b
19      20.4MB  21.0MB  524kB                rce-fw
20      21.0MB  21.5MB  524kB                rce-fw_b
21      21.5MB  25.7MB  4194kB               adsp-fw
22      25.7MB  29.9MB  4194kB               adsp-fw_b
23      29.9MB  30.4MB  524kB                sce-fw
24      30.4MB  30.9MB  524kB                sce-fw_b
25      30.9MB  37.2MB  6291kB               sc7
26      37.2MB  43.5MB  6291kB               sc7_b
27      43.5MB  178MB   134MB                BMP
28      178MB   312MB   134MB                BMP_b
29      312MB   379MB   67.1MB               kernel
30      379MB   446MB   67.1MB               kernel_b
31      446MB   447MB   524kB                kernel-dtb
32      447MB   447MB   524kB                kernel-dtb_b
33      447MB   448MB   1049kB               CPUBL-CFG
34      448MB   457MB   8389kB               RP1
35      457MB   465MB   8389kB               RP2
36      465MB   470MB   4723kB               PADDING
37      470MB   596MB   126MB   fat16        resin-boot        boot, esp
38      596MB   1095MB  499MB   ext4         resin-rootA
39      1095MB  1594MB  499MB   ext4         resin-rootB
40      1594MB  1615MB  21.0MB  ext4         resin-state
41      1615MB  1828MB  213MB   ext4         resin-data
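The comparison between the two `parted` listings can be automated. The following is a minimal sketch, assuming the `parted ... print` text has already been captured into strings; it reads the `Name` column by its position in the header row, because the `File system` column is often empty and makes whitespace splitting ambiguous. The function names are illustrative.

```python
from typing import List


def partition_names(parted_output: str) -> List[str]:
    """Extract the Name column from the text printed by `parted <image> print`."""
    lines = parted_output.splitlines()
    header_idx = next(
        (i for i, line in enumerate(lines) if line.startswith("Number")), None
    )
    if header_idx is None:
        raise ValueError("no partition table header found in parted output")
    header = lines[header_idx]
    # Column boundaries are taken from the header row itself.
    name_col = header.index("Name")
    flags_col = header.index("Flags")
    names = []
    for line in lines[header_idx + 1:]:
        name = line[name_col:flags_col].strip()
        if name:
            names.append(name)
    return names


def missing_partitions(candidate_output: str, reference_output: str) -> List[str]:
    """List partitions named in the reference image but absent from the candidate."""
    have = set(partition_names(candidate_output))
    return [n for n in partition_names(reference_output) if n not in have]
```

Run against the two listings above, `missing_partitions` would report the boot-chain partitions (`mts-mce`, `cpu-bootloader`, `bootloader-dtb`, and so on) that the locally built image lacks.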

Any ideas on what might have gone wrong, and how it can be fixed?

Edit: I'm not entirely sure what I did differently, but the build now creates what appears to be a proper image. In any case, the partitions are correct.
The problem at this point is that nothing really happens after flashing. I use the balena-cli to inject a config.json into the image before flashing, but after I flash, the device doesn't show up on balena-cloud. It just shows the splash screen, and I am able to SSH into it (when it's a dev image).

@acostach
Contributor

Hi @frankvp11, if the board can be flashed now and it's just not showing up in balena-cloud, this could be because the config.json that you injected uses a different slug than the device type in /mnt/boot/device-type.json. When you SSH into the device, you can compare /mnt/boot/config.json and /mnt/boot/device-type.json, and update the slug in device-type.json to match. From the OS image name you shared, I assume that would be 'jetson-xavier'. After making the change, reboot the board and check whether it shows up as a Jetson AGX Xavier in balena-cloud.
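The slug comparison described above can be sketched as a small check. This assumes that config.json stores the device type under a `deviceType` key and that device-type.json stores it under `slug`; both key names are assumptions to verify against the files on your device, and the function names are illustrative.

```python
import json
from typing import Dict, Optional


def load_json(path: str) -> Dict:
    """Read a JSON file such as /mnt/boot/config.json."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def slug_mismatch(config: Dict, device_type: Dict) -> Optional[str]:
    """Return a description of the mismatch, or None when the slugs agree.

    Assumed keys: "deviceType" in config.json, "slug" in device-type.json.
    """
    configured = config.get("deviceType")
    actual = device_type.get("slug")
    if configured == actual:
        return None
    return (
        f"config.json deviceType is {configured!r} "
        f"but device-type.json slug is {actual!r}"
    )
```

On the device this could be called as `slug_mismatch(load_json("/mnt/boot/config.json"), load_json("/mnt/boot/device-type.json"))`; a non-`None` result means the two files disagree.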

Also, could you please drop us a line at https://www.balena.io/contact to tell us more about your use-case, so we can better assist you? Thanks

@frankvp11
Author

That seems to have worked.
I'll have to test over the next little bit if it has everything running that I need, but great breakthrough.
Thank you very much!

@frankvp11
Author

Hi @acostach,
In the past few days, I've gotten most of what I needed running and tested.
However, I am unable to get the GPU / NVIDIA-related things to work. With the help of this forum post, I got jtop working, which shows me that a GPU does exist (it's able to read its temperature and so on); however, it can't find the JetPack version, nor does the GPU ever get used.
My guess is that the drivers didn't get installed properly. Note that the Dockerfile in which I am running jtop is as follows:

FROM nvcr.io/nvidia/l4t-base:r32.4.3

ENV DEBIAN_FRONTEND=noninteractive

RUN apt-get update && \
    apt-get install -y --no-install-recommends \
        lbzip2 git wget unzip jq xorg tar python3 libegl1 binutils xz-utils \
        bzip2 python3-pip systemd-sysv dbus systemd python3-dev libffi-dev libssl-dev && \
    rm -rf /var/lib/apt/lists/*

RUN python3 --version && pip3 --version

RUN pip3 install --upgrade pip setuptools wheel

RUN pip3 install --upgrade jetson-stats

RUN jetson_stats --version || echo "jetson-stats failed to install"

RUN systemctl mask \
        dev-hugepages.mount \
        sys-fs-fuse-connections.mount \
        sys-kernel-config.mount \
        display-manager.service \
        getty@.service \
        systemd-logind.service \
        systemd-remount-fs.service \
        getty.target \
        graphical.target

COPY entry.sh /usr/local/bin/entry.sh
RUN chmod +x /usr/local/bin/entry.sh

COPY balena_start.service /etc/systemd/system/balena_start.service
RUN systemctl enable jtop.service

ENTRYPOINT ["/usr/local/bin/entry.sh"]

The entry script and service files are the same as in the linked forum post's solution.
Here are some potentially useful outputs:

root@ae858bd:/# nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Wed_Oct_23_21:14:42_PDT_2019
Cuda compilation tools, release 10.2, V10.2.89
root@ae858bd:/# cat /etc/nv_tegra_release
cat: /etc/nv_tegra_release: No such file or directory
root@ae858bd:/# nvidia-smi
bash: nvidia-smi: command not found
root@ae858bd:/# dpkg -l | grep -i nvidia
root@ae858bd:/# dpkg -l | grep -E 'cuda|cudnn|tensorrt|opencv|visionworks'


You can view what I used to build the image on my fork of the repo.

@frankvp11
Author

Never mind, I read this article, which describes roughly how to do it. Thanks for the help!

@acostach
Contributor

acostach commented Jan 6, 2025

@frankvp11 I assume the image you built is based on L4T 32.7.3, in which case I recommend unpacking the 32.7.3 BSP in your container, as in this example.

However, please note that we're in the process of moving the Xavier AGX and NX devices to JetPack 5 (L4T 35.4.1), and we won't be releasing any more JetPack 4 images for them from this repository.
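The Dockerfile earlier in the thread is based on `l4t-base:r32.4.3`, while the OS image is assumed here to be L4T 32.7.3. A minimal sketch for spotting that kind of mismatch follows. It assumes the first line of `/etc/nv_tegra_release`, once the BSP is unpacked, has the usual `# R32 (release), REVISION: 7.3, ...` form, and that the image tag ends in `:r<version>`; both formats are assumptions, and the function names are illustrative.

```python
import re
from typing import Optional


def l4t_from_release_line(line: str) -> Optional[str]:
    """Parse e.g. '# R32 (release), REVISION: 7.3, ...' into '32.7.3'."""
    match = re.search(r"#\s*R(\d+)\s*\(release\),\s*REVISION:\s*(\d+(?:\.\d+)*)", line)
    if match is None:
        return None
    return f"{match.group(1)}.{match.group(2)}"


def l4t_from_image_ref(image_ref: str) -> Optional[str]:
    """Parse e.g. 'nvcr.io/nvidia/l4t-base:r32.4.3' into '32.4.3'."""
    match = re.search(r":r(\d+(?:\.\d+)*)$", image_ref.strip())
    return match.group(1) if match else None


def versions_match(release_line: str, image_ref: str) -> bool:
    """True only when both versions parse and are identical."""
    host = l4t_from_release_line(release_line)
    base = l4t_from_image_ref(image_ref)
    return host is not None and host == base
```

Whether a given mismatch actually breaks GPU access depends on the libraries involved, so treat a `False` result as a prompt to check rather than as a diagnosis.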
