This repository has been archived by the owner on Jul 21, 2020. It is now read-only.

default core clock limit hard wall #8

Open
GloriousEggroll opened this issue Nov 4, 2018 · 13 comments
Labels
bug Something isn't working

Comments

@GloriousEggroll

Hi, I was hitting a default soft wall of 1630 MHz on the core clock of my Vega 64. I found via the Arch wiki:
https://wiki.archlinux.org/index.php/AMDGPU#Overclocking
you can increase the limit. I set
echo "10" > /sys/class/drm/card0/device/pp_sclk_od
to allow a 10% max increase from default, and now I am able to get past the soft wall.
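
A minimal sketch of that step as a reusable function, assuming the card is `card0` and the shell runs as root. The function name `set_sclk_od` and its optional directory argument are illustrative only; they are not part of any tool mentioned in this thread.

```shell
#!/bin/bash
# Sketch: write a percentage to pp_sclk_od and read it back.
# Assumption: the default path points at the right GPU (card0 may differ on
# multi-GPU systems). A different directory can be passed as a second argument.
set_sclk_od() {
    local percent="$1"
    local device_dir="${2:-/sys/class/drm/card0/device}"
    local node="$device_dir/pp_sclk_od"

    # Accept only a plain non-negative integer.
    case "$percent" in
        ''|*[!0-9]*)
            echo "percent must be a non-negative integer" >&2
            return 1
            ;;
    esac

    if [ ! -w "$node" ]; then
        echo "cannot write $node (missing node, or not running as root?)" >&2
        return 1
    fi

    echo "$percent" > "$node" || return 1
    # Print what the node reports afterwards so a rejected write is visible.
    cat "$node"
}
```

Calling `set_sclk_od 10` as root would be the equivalent of the `echo` line above, with the value read back afterwards.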

@Haxk20

Haxk20 commented Nov 4, 2018

This should be reported to the AMDGPU bug list, as I'm experiencing the same issue, but on an RX 560X.
Sadly I have to set high performance mode on my card because I can't get past the soft wall on memory clocks. That way my card runs at 100% all the time while I'm gaming.

@BoukeHaarsma23
Owner

What do you mean by "soft wall"? The problem is a bit unclear to me

@Haxk20

Haxk20 commented Nov 4, 2018

What do you mean by "soft wall"? The problem is a bit unclear to me

GPU clocks stay at the lowest state until you "overclock" the GPU; then they start changing states depending on load. Without overclocking it or changing the performance state, it's stuck at the lowest clock state.

But still, this isn't a bug in this program.
It's most likely a bug in the AMDGPU driver.

@GloriousEggroll
Author

GloriousEggroll commented Nov 4, 2018

The Vega 64 is able to clock much higher than 1630 MHz, but by default on Linux it's limited to 1630 no matter what you set, unless you increase the max limit first as mentioned above.

@Haxk20 my GPU goes through all the clock states properly. The max advertised boost on my card is 1590. The problem is that it can normally go much higher than that. I was hitting a wall at 1630: whenever I tried to set any state higher than 1630, it would revert to the previous state. Once I added the fix mentioned above, it allowed clocking higher than 1630.

To put this in perspective: some people have managed to clock their Vega 64s to 1645-1680 on average safely, and I have done this on Windows.

This is not an amdgpu bug, because the states are working properly. It's just the default max clock limit that doesn't let you go past 1630 until you add the line I mentioned from the Arch wiki. The percentage is adjustable; it does not need to be 10. I just used 10 for plenty of overclocking headroom.

here's what my modified bash script looks like with the limit raised and fan speed set:

#!/bin/bash
# Raise the sclk overdrive limit by 10% (the fix described above).
echo "10" > /sys/class/drm/card0/device/pp_sclk_od
# Switch the performance level to manual.
echo "manual" > /sys/class/drm/card0/device/power_dpm_force_performance_level
# Power cap in microwatts (260000000 = 260 W).
echo 260000000 > /sys/class/drm/card0/device/hwmon/hwmon0/power1_cap
# Core (sclk) states: "s <state> <clock in MHz> <voltage in mV>".
echo "s 0 877 800" > /sys/class/drm/card0/device/pp_od_clk_voltage
echo "s 1 1020 900" > /sys/class/drm/card0/device/pp_od_clk_voltage
echo "s 2 1200 925" > /sys/class/drm/card0/device/pp_od_clk_voltage
echo "s 3 1300 935" > /sys/class/drm/card0/device/pp_od_clk_voltage
echo "s 4 1450 940" > /sys/class/drm/card0/device/pp_od_clk_voltage
echo "s 5 1590 970" > /sys/class/drm/card0/device/pp_od_clk_voltage
echo "s 6 1620 1070" > /sys/class/drm/card0/device/pp_od_clk_voltage
echo "s 7 1660 1100" > /sys/class/drm/card0/device/pp_od_clk_voltage
# Commit the core states.
echo "c" > /sys/class/drm/card0/device/pp_od_clk_voltage
# Memory (mclk) states: "m <state> <clock in MHz> <voltage in mV>".
echo "m 0 167 800" > /sys/class/drm/card0/device/pp_od_clk_voltage
echo "m 1 500 800" > /sys/class/drm/card0/device/pp_od_clk_voltage
echo "m 2 800 950" > /sys/class/drm/card0/device/pp_od_clk_voltage
echo "m 3 1000 960" > /sys/class/drm/card0/device/pp_od_clk_voltage
# Commit the memory states.
echo "c" > /sys/class/drm/card0/device/pp_od_clk_voltage
# Fan: 1 = manual PWM control, then set the duty cycle to 200 out of 255.
echo 1 > /sys/class/drm/card0/device/hwmon/hwmon0/pwm1_enable
echo 200 > /sys/class/drm/card0/device/hwmon/hwmon0/pwm1
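
The script above hardcodes `hwmon0`, but the hwmon index is not guaranteed to be the same on every machine or boot. A hedged sketch of looking the directory up instead; the helper name `find_hwmon_dir` is made up for illustration, and it assumes the card exposes a single hwmon directory:

```shell
#!/bin/bash
# Sketch: locate the hwmon directory of a card instead of assuming hwmon0.
# Assumption: the first hwmon* directory under the device is the right one.
find_hwmon_dir() {
    local device_dir="${1:-/sys/class/drm/card0/device}"
    local d
    for d in "$device_dir"/hwmon/hwmon*; do
        # An unmatched glob stays literal, so check for a real directory.
        if [ -d "$d" ]; then
            echo "$d"
            return 0
        fi
    done
    echo "no hwmon directory under $device_dir" >&2
    return 1
}
```

The power cap and fan lines could then be written as, for example, `hwmon=$(find_hwmon_dir) && echo 260000000 > "$hwmon/power1_cap"`.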

@Haxk20

Haxk20 commented Nov 4, 2018

Weird, because when I do your "fix" my states start working properly. But this really isn't the proper solution, as the driver should switch states normally without me needing to switch the first state.
My solution was to set the performance level to high, which forces the highest state on core and memory.
Without this, the auto performance level keeps the clocks at the lowest state. Only when I apply your fix do my core clocks start moving to other states, but the memory clocks are still stuck.
I would file a bug, but tbh I don't know where.
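
One way to check which state the card is actually in: the `pp_dpm_sclk` and `pp_dpm_mclk` nodes in the same sysfs directory list the available levels and mark the active one with a trailing `*`. A small sketch, with `active_dpm_state` as a made-up helper name:

```shell
#!/bin/bash
# Sketch: print the currently active level from a pp_dpm_sclk or pp_dpm_mclk node.
# The node lists one level per line and marks the active one with '*'.
active_dpm_state() {
    local node="$1"
    # -F treats '*' as a literal character rather than a regex operator.
    grep -F '*' "$node"
}
```

For example, `active_dpm_state /sys/class/drm/card0/device/pp_dpm_mclk` run repeatedly under load should show whether the memory clock ever leaves level 0 (the `card0` path is an assumption).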

BoukeHaarsma23 added a commit that referenced this issue Nov 7, 2018
BoukeHaarsma23 added the wontfix label Nov 7, 2018
@BoukeHaarsma23
Owner

I was able to reproduce this. However, since I will not write any overclock without the user's knowledge, I updated the FAQ (7bae16e) and will close this.

@GloriousEggroll
Author

GloriousEggroll commented Nov 7, 2018

The option doesn't write an overclock; it only increases how far the card is allowed to be clocked. It does not change any of the card's default clock speeds.

Example:
My p7 default clock speed and voltage are 1590/1200. Should I choose to overclock, I can increase it up to 1630 by default without raising any upper limit.

If I add the option specified, I can overclock further than that, but only if I manually change the states.

I can set the option and still not touch the states and the speeds remain at default for all states.

If you do nothing but execute that option, the max clock for p7 will still remain at the 1590 default unless manually set by the user. All it does is allow it to go past 1630 -if- manually set.

It's like putting my size 11 foot into a size 12 shoe vs size 13 shoe. my foot stays the same size unless I somehow magically grow it myself.

@BoukeHaarsma23
Owner

BoukeHaarsma23 commented Nov 7, 2018

Then I think we misunderstand each other. Or I don't understand everything properly.

Does

echo "10" > /sys/class/drm/card0/device/pp_sclk_od

not overclock your GPU core by 10%?

@GloriousEggroll
Author

GloriousEggroll commented Nov 7, 2018

no. it increases the max limit that it is able to clock to IF manually set.

think of it like this:
you put the gpu in a box.
you then take the gpu out, and put it in a bigger box, 10% bigger.
the gpu (core) still stays the same size (clock) unless you manually set it to a larger size (clock)

so if I wanted to overclock the card to 1640, I would have to both set the 10% limit and then set the p7 state clock speed to 1640. Without setting the p7 clock speed it stays at 1590 even though you have the option now to make it go to 1640.
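
The two steps described above could be sketched like this. The function name `set_top_state` and its argument order are made up for illustration, and the voltage has to be supplied by the caller, since the thread does not give one for 1640:

```shell
#!/bin/bash
# Sketch of the two-step sequence: raise the ceiling, then program one sclk state.
# Arguments: state index, clock in MHz, voltage in mV, ceiling percent,
# and optionally the device directory (card0 is an assumption).
set_top_state() {
    local state="$1" mhz="$2" mv="$3" percent="$4"
    local dev="${5:-/sys/class/drm/card0/device}"

    # Step 1: raise the upper limit. Per the thread, this does not move the
    # state clocks by itself when the states are set manually.
    echo "$percent" > "$dev/pp_sclk_od" || return 1

    # Step 2: switch to manual mode, write the state, then commit with "c".
    echo "manual" > "$dev/power_dpm_force_performance_level" || return 1
    echo "s $state $mhz $mv" > "$dev/pp_od_clk_voltage" || return 1
    echo "c" > "$dev/pp_od_clk_voltage" || return 1
}
```

A call such as `set_top_state 7 1640 1100 10` would then target p7 at 1640 with a 10% ceiling. The 1100 mV figure is only a placeholder borrowed from the state 7 line of the script earlier in the thread, not a recommended value.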

@BoukeHaarsma23
Owner

I see, I misread the documentation on this. Thanks for pointing this out!

BoukeHaarsma23 reopened this Nov 7, 2018
BoukeHaarsma23 added the bug label and removed the wontfix label Nov 7, 2018
@BoukeHaarsma23
Owner

However, this does suggest otherwise:
https://www.phoronix.com/scan.php?page=news_item&px=AMDGPU-OverDrive-Linux-4.15

@GloriousEggroll
Author

He used it with the automatic default scaling values. The option should not be set unless manual values are being set as well; otherwise the GPU will try to auto-scale its overclock to the max value in the range. So what you would want to do is turn it on only if the user enables manual clock setting mode, and set it back to 0 if they disable manual clock mode.
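
A rough sketch of that toggle, assuming root access and `card0`. The function names are hypothetical and this is not the project's implementation:

```shell
#!/bin/bash
# Sketch: tie the raised ceiling to manual clock mode, as suggested above.
# Assumption: the default path points at the right GPU.

# Raise the ceiling only when the user opts into manual clocks.
enter_manual_mode() {
    local percent="$1"
    local dev="${2:-/sys/class/drm/card0/device}"
    echo "$percent" > "$dev/pp_sclk_od" || return 1
    echo "manual" > "$dev/power_dpm_force_performance_level"
}

# Put the ceiling back to 0 before handing control back to automatic scaling.
leave_manual_mode() {
    local dev="${1:-/sys/class/drm/card0/device}"
    echo "0" > "$dev/pp_sclk_od" || return 1
    echo "auto" > "$dev/power_dpm_force_performance_level"
}
```

Resetting `pp_sclk_od` before switching back to `auto` is a deliberate ordering choice in this sketch, so that automatic scaling never runs with the raised ceiling still in place.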

@BoukeHaarsma23
Owner

BoukeHaarsma23 commented Nov 7, 2018

Ah gotcha, working on it ;) This would first need a root implementation though, so I'm working on that first.
