A few months ago, I got new drivers, and O3AS, which had been my best-performing E@H app, started throwing errors within a few seconds of starting. I recently tried it again, and the error looks the same as far as I can remember. Here's an example:
https://einsteinathome.org/task/1722984570
But, it used to work. So, I'm confused. This isn't entirely out of character for RDNA3, though. At the same time, another E@H app -- maybe MeerKat? -- started working when it never had worked before. And there are yet other problems, like crashes, that I'm not even thinking about right now.
In any case, I could use some help troubleshooting it. This seems like a good place to start because it's 100 % reproducible and occurs very quickly. I've reported some of the past problems upstream, but they say they cannot help without being able to debug the live code for themselves. I pointed them to the source page, but neither they nor I can quite get started building a test environment. We could use some help. I would appreciate someone who could help me build a test environment, so I could document it and explain it to the AMDGPU devs so they can reproduce the problem. I realize this will require a bit more interaction, but there's no hurry; we can work asynchronously here, or by DM, or maybe personal e-mail? Whatever works.
Copyright © 2025 Einstein@Home. All rights reserved.
It appears to be an issue related to incompatibility with Fedora's drivers.
https://einsteinathome.org/cs/content/all-sky-gravitational-wave-search-o3-v107-tasks-compilation-fail-ldlld-error-undefined-symbo
https://github.com/ROCm/ROCm/issues/3575
Only the developers can confirm whether there is a way around it in the code. You could try contacting Oliver Behnke.
Yeah, thanks. I believe I have contacted the right people, but that's a new name.
There really shouldn't be any issues running O3AS on a 7900xtx.
I see that you also have an A750 in that system. Do you have mesa-libOpenCL installed? There can be conflicts between the two OpenCL drivers.
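For anyone checking the same thing, here is a minimal sketch. It assumes the usual Linux ICD loader layout, where each installed OpenCL driver registers itself with a small .icd file under /etc/OpenCL/vendors that names its library. The helper name list_icds is made up for illustration.

```shell
# list_icds [DIR]
# Print each registered OpenCL ICD file and the library it points to.
# DIR defaults to /etc/OpenCL/vendors (assumed standard loader location).
list_icds() {
    dir=${1:-/etc/OpenCL/vendors}
    for icd in "$dir"/*.icd; do
        # Skip the unexpanded glob when the directory has no .icd files.
        [ -f "$icd" ] || continue
        printf '%s: %s\n' "$(basename "$icd")" "$(head -n 1 "$icd")"
    done
}
```

Running list_icds with no argument lists whatever is registered on the system. More than one entry for the same GPU vendor would be a hint that two OpenCL drivers are competing.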
Interesting suggestion. I'm not sure why I don't, since I have everything else from mesa, but no, I don't have mesa-libOpenCL installed. I wonder if I ran into this conflict before, removed it, and forgot about it.
I think I finally see what AHOREK's Team was saying about the error. I now see that it looks like a simple case of a missing call. Not sure why O3AS is the only app that calls it, out of all the ones I run or have tried recently, but that is what the error suggests.
So, follow-up question: what is __printf_alloc()? The suggestion above is that there is something wrong with Fedora. But, that isn't a good explanation for the symptom. I can find __printf_alloc() in both llvm libs and rocm-comgr. So, it doesn't *seem* like it's missing.
It also looks like an internal call, so I'm really confused as to how any library could be built if it were missing internal calls. Something doesn't add up.
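The kind of search described above can be sketched roughly as follows. This is only a plain string match over the raw file contents, and the helper name find_symbol is hypothetical.

```shell
# find_symbol SYMBOL FILE...
# Print every FILE whose contents contain the literal string SYMBOL.
# This is a raw string search, not a symbol-table lookup.
find_symbol() {
    sym=$1
    shift
    # -l: print matching file names only; -F: treat SYMBOL as a fixed string.
    grep -l -F -- "$sym" "$@" 2>/dev/null
}
```

A call might look like find_symbol __printf_alloc /usr/lib64/libLLVM*.so* /usr/lib64/libamd_comgr.so* (those paths are assumptions and will differ per system). A hit only shows that the name appears somewhere in the file. It does not show that the library defines or exports it; for ELF libraries, nm -D --defined-only gives a stricter answer.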
So, I'm back to my original question. Can someone please help me actually build a test environment for E@H? It seems like the only way I can satisfy people who are willing to help me is if I can actually figure out how to build and test this app for myself.
btw what is your glibc version?
ldd --version
__printf_alloc is included in glibc 2.34+, but Einstein apps usually include all libraries statically for compatibility with older systems. The app is simply searching for a function in a dynamic library that is either missing on your system or doesn't match the expected version.
Unfortunately, the current O3AS source doesn’t appear to be publicly available, so only the Einstein developers can assist. They may try building it with different flags, libraries, or configurations...
ldd (GNU libc) 2.40
Ah, okay, that is very helpful information about O3AS not being available. That definitely is a problem for my current approach.
Other people I've asked say __printf_alloc is NOT in gcc libs. I've looked for it myself, and I cannot find it.
Thank you for your continued help! I have to run, but I'll do more digging later...
well, it could be a custom function, for instance:
https://github.com/stjordanis/ROCm-Device-Libs/blob/master/opencl/src/misc/printf.cl#L18
it's more likely because it crashes when building the OpenCL code
ld.lld: error: undefined symbol: __printf_alloc
Error: Creating the executable from LLVM IRs failed.
XLAL Error - XLALOpenCLGetProgramFromSource (/home/jenkins/workspace/workspace/EaH-GW-OpenCL-Testing/SLAVE/LIBC215/TARGET/linux-x86_64/EinsteinAtHome/source/lalsuite/lalpulsar/lib/GPUUtils/OpenCLUtils.c:705): clBuildProgram failed with OpenCL error: CL_BUILD_PROGRAM_FAILURE
unfortunately, this log doesn't show the OpenCL backtrace.
I attempted to find references on https://git.ligo.org/lscsoft/lalsuite, but it appears that the current O3AS code is sourced from a different location. Without developers who have access to and are knowledgeable about the current codebase, there's not much we can do.
I appreciate your help!
Yes, that's what I was thinking, too.
You got me thinking about other versions of libs that might be on my system, like mesa. So, I checked for LLVM and found several different versions installed. Since I think __printf_alloc() might be in there, I removed some unnecessary LLVM libs. I haven't tried O3AS again, yet, though.
I have to have more than one, because it seems Intel uses llvm-15, while ROCm is built against llvm-18, and llvm-19 is the current version of the standard llvm-libs pkg on F41, with several pkgs, like mesa-dri-drivers and mesa-libEGL, built against it. So, I like this line of thinking about conflicting libs being available at OCL compile time.
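To see at a glance which LLVM major versions are installed side by side, something like this sketch could help. It assumes input lines of the form "NAME VERSION", and the helper name llvm_majors is made up for illustration.

```shell
# llvm_majors
# Read "NAME VERSION" lines on stdin and print the distinct major
# version numbers, one per line, in ascending order.
llvm_majors() {
    awk 'NF >= 2 { split($2, v, "."); print v[1] }' | sort -n | uniq
}
```

On an RPM-based system the input could come from rpm -qa --qf '%{NAME} %{VERSION}\n' 'llvm*' 'clang*' | llvm_majors. The package name patterns are assumptions about how the distro names its LLVM packages.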
I'll enable O3AS again, but I'm not hopeful. I think we might be on the right track, but I'm not sure what to try/check, next.