But this is a segmentation fault inside the BOINC software itself, so someone should file a bug report in BOINC's Trac system, I guess. This issue should not be related to Einstein@Home.
CU
Bikeman
Most, and perhaps all, of the signal 11 problems occurred when I had network problems. Also, the problem machines are all running the newer 5.10.x versions of BOINC.
Right, this is the BOINC Core Client. Given the pretty old version, though, I wonder if anyone @BOINC cares...
BM
Donald (and others), a possible solution would be to use a local DNS server, running BIND on one of your machines.
I have read that the signal 11 issues in newer versions of BOINC are related to the new synchronous DNS access. If DNS is unavailable for a while, the core client stalls while trying to reach it, and the running science task fails.
Using a local DNS server that is always available should solve the problem.
In my particular case, I already had BIND running, because my box crunching for E@H is a mail server. So I simply pointed to it as the first DNS server to query (putting the line "nameserver 127.0.0.1" first in /etc/resolv.conf).
If you don't have BIND running on any machine, a "caching-only" installation of BIND should be enough. That's always a nice addition to a local network anyway, since it speeds up repeated DNS lookups. Look at the following "HowTo":
http://www.langfeldt.net/DNS-HOWTO/BIND-9/
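To illustrate why a blocking DNS lookup can make a program look "unresponsive", here is a minimal Python sketch. It is not BOINC code (the core client is not written in Python), and the function name `lookup_with_timeout` and its parameters are made up for this example. It runs the blocking lookup in a helper thread, so the caller can give up waiting after a while instead of hanging for as long as DNS is unreachable.

```python
import socket
import threading


def lookup_with_timeout(hostname, timeout_s=2.0, resolver=socket.gethostbyname):
    """Resolve hostname, but stop waiting after timeout_s seconds.

    Returns the address string, or None if the lookup failed or has not
    finished in time. The resolver argument is a hypothetical hook that
    lets a fake resolver stand in for real DNS.
    """
    result = {}

    def worker():
        try:
            # This call blocks until DNS answers or the system gives up.
            result["address"] = resolver(hostname)
        except OSError:
            # socket.gaierror is a subclass of OSError.
            result["address"] = None

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    # Wait for at most timeout_s; the caller stays in control afterwards.
    thread.join(timeout_s)
    if thread.is_alive():
        return None  # lookup still pending, treat it as "no answer yet"
    return result.get("address")
```

Calling `lookup_with_timeout("einstein.phys.uwm.edu")` with the network cable unplugged should come back after roughly the timeout, whereas a plain `socket.gethostbyname` call may block much longer. The hostname here is only an example.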
Interesting idea. I need to set up a BIND server, anyway, so I may give this a try.
Sorry, but I've done some tests, and it seems that this "solution" does not solve anything...
I've been experimenting with my DSL connection: I disconnected the router from the telephone line, which produces obvious "connection problems", and then forced an "update prefs" in BOINC's core client. The result: the running WU is immediately aborted with a signal 11 error... I'm using BOINC 5.8.16 and einstein 4.20.
I read the suggestion of using local DNS servers on BOINC's forums, but now I see that it does not work.
So it seems that the only possible fix for now is to use an older version of BOINC that uses the old-style asynchronous DNS lookup. This means a 5.4.x client or one of the very early 5.8.x versions.
Well, some more tests, and it seems that perhaps a local DNS really helps...
I have now left only the "nameserver 127.0.0.1" line in /etc/resolv.conf and removed every other "nameserver" entry from the file.
The present WU has survived the lack of network connection, and it is running OK now. I will see whether it finally validates.
One caveat: the machine that runs the local DNS server must not be switched off or restarted during a network failure. Doing so erases BIND's cache (which is held in RAM), and the next attempt by a BOINC client to read DNS data will fail if there is no Internet connection available.
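For anyone who wants to check their own setup, here is a small Python sketch that lists the "nameserver" entries found in resolv.conf-style text and reports whether only the local resolver is configured. The function names and the set of addresses treated as "local" are assumptions made for this example. It only reads text and does not change anything.

```python
def parse_nameservers(resolv_conf_text):
    """Return the nameserver addresses in the order they appear."""
    servers = []
    for line in resolv_conf_text.splitlines():
        # Drop anything after '#' or ';' and strip surrounding whitespace.
        stripped = line.split("#", 1)[0].split(";", 1)[0].strip()
        fields = stripped.split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers


def uses_only_local_resolver(resolv_conf_text,
                             local_addresses=("127.0.0.1", "::1")):
    """True if at least one nameserver is listed and all of them are local."""
    servers = parse_nameservers(resolv_conf_text)
    return bool(servers) and all(s in local_addresses for s in servers)


# Example with two entries; 192.0.2.53 is a documentation-only address.
sample = "nameserver 127.0.0.1\nnameserver 192.0.2.53\n"
print(parse_nameservers(sample))
print(uses_only_local_resolver(sample))
```

To test a real machine, read /etc/resolv.conf into a string and pass it to the same two functions.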
Hi!
OTOH, if you downgrade BOINC, you'll get back some bugs that have long since been fixed in BOINC.
I guess that crunchers who know they have frequent connection problems should consider setting "Network activity" to "Never" in the BOINC GUI and periodically (e.g. once a day) pressing the Update button after checking that the Internet connection is working. This is especially true for users who have slower PCs, because it's not nice to lose (say) a full day of crunching.
CU
Bikeman
Actually, I'm still running some old P-III machines, so it would be more like two to three days' worth of crunching.
Thanks
Guess I will stick with this old BOINC 5.2.13 anyway; it was reliable for a long time.
Another 13 units lost with signal 11 when I forgot to switch off the network in BOINC Manager (hostid=1090631).
Fedora Core 6, BOINC 5.10.21
Well, I am a rather normal person (I think ;)) and miss things now and then, but I don't have the time to babysit BOINC.
Restarted with 5.2.13.
The "signal 11" happens in the BOINC library (the part of BOINC that gets linked into the application) whenever the Core Client becomes unresponsive. Newer Clients seem to become unresponsive more often than older ones (e.g. for DNS requests), but in principle it could happen with older Clients, too. We are working on fixing the problem.
BM