Switching Apps in the middle of a task usually confuses the run-time estimation ...
I disagree :-).
My experience says that the run time estimate remains as before, and if the new app is appreciably faster, the remaining time estimate will simply drop faster than it otherwise would. DCF is not affected by the app change until the task completes, at which point the faster-than-expected finish (of this and subsequent tasks) will cause the DCF to fall until a new equilibrium is reached. In other words, the app change in the middle should have no immediate effect on DCF.
Two things that will have an immediate effect on DCF are a faulty benchmark test and changing the clock while BOINC is running. There are probably more.
If you just want to fix things, stop BOINC and adjust the DCF in the state file to be 0.922172 x a / b, where a is the true time the task will take and b is the faulty estimate. From Kathryn's figures, a and b should be about 9 and 40 respectively, which gives roughly 0.2075. So change the DCF to 0.210000 and all should be well. Then restart BOINC.
Cheers,
Gary.
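The arithmetic behind that suggestion can be checked with a few lines of Python. This is only a sketch of the calculation described in the post: the function name `corrected_dcf` and its parameter names are made up for illustration, and the figures are the ones quoted above.

```python
def corrected_dcf(current_dcf: float, true_time: float, estimated_time: float) -> float:
    """Scale the current DCF by the ratio of true run time to the faulty estimate.

    Hypothetical helper, not part of BOINC. Both times must use the same unit.
    """
    if estimated_time <= 0:
        raise ValueError("estimated_time must be positive")
    return current_dcf * true_time / estimated_time


# Figures from the post: DCF 0.922172, true time about 9, faulty estimate about 40.
new_dcf = corrected_dcf(0.922172, 9.0, 40.0)
print(f"{new_dcf:.6f}")  # 0.207489, which the post rounds to 0.21
```

Since a and b are only approximate, rounding the result to 0.21 loses nothing; the DCF will settle on its own as further tasks finish.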
Thank you very much, Bernd. The S5R3_449.SSE2 app, although it came to light by accident, is a real bullseye. Fortune favors the capable, after all. Great work!
Best regards,
Waldi
Bikeman was quite correct that the SSE2 version of the app is significantly faster than the SSE version on the Core architecture. I've now seen results for a Q6600 which show only a modest speedup with the SSE app but a much better speedup with the SSE2 app. So, if you are using the beta app on a C2D or C2Q, you should be using the SSE2 version if you want the best performance. It would be interesting to know why the SSE version performs relatively poorly on just this architecture.
@Bernd - is there any prospect of a Windows version any time soon? Whilst I converted about half my fleet to Linux some time ago, I don't really want to have to convert the balance if Windows users will get the benefits in the near future.
Cheers,
Gary.
I see heavy optimization going on.
I haven't had time to check how much memory copying the app does, but wouldn't it be good to use SSE(2) for that as well? I'm thinking of an implementation like the one in VLC media player... (that code is GPL, though, so most probably it could only serve as inspiration :-|)
And how long will it take to see the current optimizations in the Windows app?
Since I switched my X9650 Extreme quad-core machine over to the SSE2 app, I've been getting crunch times as short as just under three hours. It's also done some good on my AMD 6000+ machine, though probably not to as great an extent.
If anyone would like to see for themselves...
For my 6000+ and X9650 machines, workunits that have been completed with the regular SSE app are branded with app version 4.49. Any that have been completed with the SSE2 app are branded with app version 44.91. (It was supposed to be "4.491", but the BOINC client apparently doesn't know what to do with three decimal places.)
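One plausible reading of the 44.91 label, offered as an assumption rather than confirmed client behaviour: if the version is kept as a single integer and displayed as that integer divided by 100 with a two-digit remainder, then 449 shows as 4.49 and 4491 shows as 44.91. The function name `display_version` below is hypothetical and only illustrates that arithmetic.

```python
def display_version(version_num: int) -> str:
    """Format an integer version as 'major.minor' with a two-digit minor part.

    Assumed display rule for illustration only: major = n // 100, minor = n % 100.
    """
    major, minor = divmod(version_num, 100)
    return f"{major}.{minor:02d}"


print(display_version(449))   # 4.49
print(display_version(4491))  # 44.91
```

Under that assumption there is no way to express "4.491"; a fourth digit simply pushes the major part up to 44.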
Well, the speedup on your X2 6000 is even better than on your Xeon. :-)
The lowest boost on your Xeon is less than 10%, while the X2 does not fall below 16.x%. The maximum boost is about the same on both, more than 20% faster.
My X2 5000 even lies between 19 and 22%, but it hasn't gotten any short units so far.
cu,
Michael
I'm running with Hyperthreading enabled on the Xeon. That may be skewing the results a bit.
I've had a successful run.
I'm running Fedora Core 8 x64, BOINC 6.2.1 / 6.2.2 (recently updated), AMD Athlon x64 4200 (on a single core).
I don't think that there's much copying done in the App, apart from the first ~1min after startup when setting up the memory structures, reading the data files etc.
The prefetching is not new at all, it's already used in the MacOS and Windows (Beta & Power) Apps.
As for the advantage of the SSE2 over the SSE version, I first need to find out where it comes from; further measurements will depend on that.
With some recent developments around BOINC I'll have another try on compiling the "science App" for Windows with gcc, too, so we can focus on a single compiler (yes, I'm aware of icc, but using that is even more difficult with our current code).
BM
Thanks for the answers.
My question was not about prefetching and the like (the Windows beta is being used)... :-) but about SSE2. How long will those measurements take?
And it looks like I forgot that memory-intensive does not have to mean copy operations...