I basically only buy AMD, but I want to point out that ROCm still doesn't fully support the 780M.
I have a laptop with a 680M and a mini PC with a 780M, both beefy enough to play around with small LLMs. You basically have to force the GPU detection to an older architecture version, and I get tons of GPU resets on both.
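For anyone hitting the same wall, the usual workaround is overriding the architecture ROCm detects via an environment variable. The 780M reports gfx1103, which isn't on the official support list, so people map it onto a supported RDNA3 target; which value is actually stable varies by card and ROCm release, so treat the one below as a guess:

```shell
# Tell the ROCm runtime to treat the iGPU as a supported architecture.
# gfx1103 (780M) is commonly overridden to 11.0.0 or 11.0.2.
export HSA_OVERRIDE_GFX_VERSION=11.0.0
rocminfo | grep gfx   # check which ISA the runtime now reports
```

This is exactly the "force the GPU detection to an older version" hack: unsupported but often workable, and likely related to the resets.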
AMD your hardware is good please give the software more love.
AMD doesn't realise the wide penetration and availability of CUDA is what makes the ecosystem so strong. Developers can develop and test on their personal devices which are prevalent, and that's what creates such a big software ecosystem for the expensive chips.
When I raised this feedback with our AMD Rep, they said it was intentional and that consumer GPUs are primarily meant for gaming. Absolutely shortsighted.
I can forgive AMD for not seeing how important CUDA was ten years ago. Nvidia was both smart and lucky.
But failing to see it five years ago is inexcusable. Missing it two years ago is insane. And still failing to treat ML as an existential threat is, IDK, I’ve got no words.
That's beside the point. They are offering ML solutions. I believe PyTorch and most other stuff works decently well on their datacenter/HPC GPUs these days. They just haven't managed to offer something attractive to small-scale enterprises and hobbyists, which costs them a lot of mindshare in discussions like these.
But they're definitely aware of AI/ML stuff, pitching it to their investors, acquiring other companies in the field and so on.
Meanwhile the complete lack of enthusiast ML software for their consumer-grade cards means they could put gobs of memory on those GPUs without eating into their HPC business line.
I feel like that's something they would be explaining to their investors if it was intentional though.
Not sure which complete lack you're talking about. You can run the SotA open source image and text generation models on the 7900 XTX. They might be one or two iterations behind their Nvidia counterparts and you will run into more issues, but there is a community.
It's either strategic incompetence, technical incompetence, or both at this point.
At least they seem to be seriously trying to fix it now.
https://www.techpowerup.com/324171/amd-is-becoming-a-softwar...
They just revoked permission to open source the CUDA on AMD driver. I don't think they've gotten the message yet.
There is still an active fork: https://github.com/lshqqytiger/ZLUDA
One thing I wonder about with AMD: they know the history of how CUDA got to its current position. Even if you say competing in that market is fighting yesterday's war and they don't want to dedicate many resources to it, they don't seem to have the vision to start a long-term commitment to what could be the next big thing. What projects do they have that build on their strengths and can't be copied easily? The examples I would point to are Zen (a massive course correction after K10-Bulldozer) and early HSA after acquiring ATi.
I suspect that it is legal fears, tbh. It is almost certain that if AMD or anyone else tried to ship a CUDA compatibility layer, Nvidia would pretty fiercely sue them into the ground. This is almost certainly why both Intel and AMD bailed on ZLUDA.
They don't need compatibility but functionality in the first place.
Most AI workloads use an abstraction layer anyway, e.g. PyTorch.
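That abstraction really is thin from the user's side: PyTorch's ROCm builds expose AMD GPUs through the same `torch.cuda` API, so typical device-agnostic code is identical on either vendor's hardware. A minimal sketch (the import guard is only there so it also illustrates the CPU fallback on machines without torch):

```python
# Device-agnostic PyTorch pattern. On ROCm builds, AMD GPUs show up
# through the same torch.cuda API (HIP underneath), so this code is
# unchanged between Nvidia (CUDA) and AMD (ROCm) hardware.
try:
    import torch
    device = "cuda" if torch.cuda.is_available() else "cpu"
    x = torch.ones(2, 2, device=device)   # lands on the GPU if one is usable
    total = float(x.sum())                # 2x2 matrix of ones sums to 4.0
except ImportError:                       # torch not installed here:
    device, total = "cpu", 4.0            # illustrative fallback only
print(device, total)
```

Which is exactly why functionality matters more than CUDA compatibility: the framework already hides the vendor API.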
Obviously anything that's known on this thread is known to AMD management, or at least their assistants.
I recently tried to set up Linux on a few machines with Nvidia and AMD GPUs, and, while AMD could improve, they're way ahead of Nvidia on all fronts except machine learning.
Nvidia's drivers are still uniformly garbage (as they have been for the last 20 years) across the board, but they do work sometimes, and I guess they're better for machine learning. I have a pile of "supported" Nvidia cards that can't run most OpenGL/GLX software, even after installing DKMS modules, recompiling the planet, etc, etc, etc.
Since AMD upstreamed their stuff into the kernel, everything just works out of the box, but you're stuck with rocm.
So, for all use cases except machine learning, AMD's software blows Nvidia's out of the water for me. This includes running Windows games, which works better under Linux than Windows (the last time I checked), thanks to Steam's Proton.
On my 780M, I installed current Devuan (~= Debian) stable and had a few xscreensaver crashes and reboots. I checked dmesg, and it had clear errors about IRQ state machines being wrong for some of the Radeon stuff. So even when running hardware newer than the distro's kernel, their error logs are great.
After enabling backports and upgrading the kernel, the dmesg errors went away, and it's a 100% uptime machine.
The remaining problem is that PulseAudio is still terrible after all these years, so I have to repeatedly switch the audio output back to HDMI.
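If it helps anyone on the same setup, that switch can at least be scripted with pactl instead of clicked through a mixer each time (the sink name below is an example; list yours first):

```shell
pactl list short sinks   # find the exact name of the HDMI sink
# example name -- yours will differ:
pactl set-default-sink alsa_output.pci-0000_03_00.1.hdmi-stereo
```

New streams then go to HDMI by default; already-running streams can be moved with `pactl move-sink-input`.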
Use PipeWire instead of PulseAudio. Much better.
I would have already switched, but, apparently, the PipeWire authors decided it shouldn't daemonize properly:
https://dev1galaxy.org/viewtopic.php?id=5867
Taking a dependency on systemd is a strange choice for a project whose entire point is ripping out Poettering's second-to-last train wreck.
That doesn't read like taking a dependency on systemd. Rather, PipeWire just doesn't include custom code to daemonize itself.
Is there any reason why every individual tool should learn how to daemonize (in addition to, or instead of, running in the foreground)? There are external tools that can take care of that uniformly, using the latest and greatest syscalls for it. That seems better than every application including this code. As highlighted in the thread, there are other programs that can launch and daemonize another process (like the aptly named [daemon(1)](https://manpages.debian.org/unstable/daemon/daemon.1.en.html) tool). Seems more like the UNIX way, to me.
That tool’s RSS is somehow 170KB (vs zero for a self-daemonizing process).
Also, it’s incredibly complicated. (I looked at the source.)
Here’s a writeup of a simple daemon: https://pavaka.github.io/simple-linux-daemon-tutorial/
Given that it’s typed once (by the daemon author, and not the end user), it seems like a big win vs. daemon(1) to me.
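The sequence that writeup walks through really is short. A minimal sketch of the classic double-fork in Python (the marker file is just this demo's way of proving the detached process actually ran; a real daemon would do its work here instead):

```python
import os
import tempfile
import time

marker = os.path.join(tempfile.gettempdir(), "daemon_demo.pid")
if os.path.exists(marker):
    os.remove(marker)                # clear any stale marker from a prior run

def daemonize():
    """Classic double-fork: returns False in the original process,
    True in the fully detached daemon."""
    pid = os.fork()
    if pid > 0:                      # original process keeps running
        os.waitpid(pid, 0)           # reap the intermediate child
        return False
    os.setsid()                      # new session, drop the controlling tty
    if os.fork() > 0:                # intermediate child exits so the daemon
        os._exit(0)                  # (a non-session-leader) can't regain a tty
    os.chdir("/")                    # don't pin any mount point
    devnull = os.open(os.devnull, os.O_RDWR)
    for fd in (0, 1, 2):             # detach stdin/stdout/stderr
        os.dup2(devnull, fd)
    return True

if daemonize():
    # Daemon side: record our PID atomically, then exit.
    tmp = marker + ".tmp"
    with open(tmp, "w") as f:
        f.write(str(os.getpid()))
    os.replace(tmp, marker)          # atomic, so readers never see a partial file
    os._exit(0)

# Original process: wait briefly for the daemon's marker to appear.
for _ in range(50):
    if os.path.exists(marker):
        break
    time.sleep(0.1)
daemon_pid = int(open(marker).read())
print("daemon ran as pid", daemon_pid)
```

It's a handful of lines once, by the author, versus every user wiring up an external supervisor.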
> That tool’s RSS is somehow 170KB (vs zero for a self-daemonizing process).
Why is the RSS relevant? I assume it doesn't need to keep on running. Also, even if it kept running, 170KB is not the end of the world.
> Also, it’s incredibly complicated. (I looked at the source.) Here’s a writeup of a simple daemon: https://pavaka.github.io/simple-linux-daemon-tutorial/
Maybe it's complicated, but perhaps it's trying to replicate daemon(3) without the bugs, and for arbitrary processes. See the BUGS section in the daemon(3) man page.
> Given that it’s typed once (by the daemon author, and not the end user), it seems like a big win vs. daemon(1) to me.
This seems like a false comparison. It's not as if the end user writes the daemonization code when the program doesn't include it; the user would just use daemon(1) or systemd(8) or something else that can daemonize. Or perhaps a service manager that doesn't need daemonization at all, like runit(8) (https://smarden.org/runit/) and its ilk.
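For comparison, under runit the service never daemonizes: the supervisor keeps it in the foreground and handles restarts. A run script is typically one exec line (the service path is illustrative; PipeWire is normally a per-user service, so this is just to show the shape):

```shell
#!/bin/sh
# /etc/sv/pipewire/run -- runit supervises this process in the
# foreground and restarts it if it dies; no daemonizing code needed.
exec pipewire
```

Which is the other way to dodge the whole question: neither the application nor an external tool forks into the background at all.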
The more I read about this, the more I want to know why it's so important that pipewire is running "daemonized" (whether it does it itself or not). Can you explain the advantages and disadvantages?
Having had AMD Ryzen laptops for the last six-plus years: so much this.
Right now I'm messing around trying to get PyTorch's Vulkan support compiling just so I can avoid switching to ROCm.