[mathicgb] 60/393: Fixed what is, to the best of my ability to determine, an issue caused by a bug in the MSVC implementation of OpenMP. Go read the comment I added for the details. So that was more than a few hours out of my life. Great. I really enjoyed that. Wonderful. No really. (if someone has an explanation other than "compiler bug", I'd really be interested to hear it).
Doug Torrance
dtorrance-guest at moszumanska.debian.org
Fri Apr 3 15:58:32 UTC 2015
This is an automated email from the git hooks/post-receive script.
dtorrance-guest pushed a commit to branch upstream
in repository mathicgb.
commit dca8151201458da111913f9c19f123e168b06409
Author: Bjarke Hammersholt Roune <bjarkehr.code at gmail.com>
Date: Fri Oct 12 19:06:49 2012 +0200
Fixed what is, to the best of my ability to determine, an issue caused by a bug in the MSVC implementation of OpenMP. Go read the comment I added for the details. So that was more than a few hours out of my life. Great. I really enjoyed that. Wonderful. No really. (if someone has an explanation other than "compiler bug", I'd really be interested to hear it).
---
src/mathicgb/F4MatrixReducer.cpp | 39 +++++++++++++++++++++++++++++++++++----
1 file changed, 35 insertions(+), 4 deletions(-)
diff --git a/src/mathicgb/F4MatrixReducer.cpp b/src/mathicgb/F4MatrixReducer.cpp
index 4af072e..b74f312 100755
--- a/src/mathicgb/F4MatrixReducer.cpp
+++ b/src/mathicgb/F4MatrixReducer.cpp
@@ -116,14 +116,45 @@ public:
}
}
+
template<class Iter>
- void addRowMultiple(SparseMatrix::Scalar multiple, Iter begin, Iter end) {
+ void addRowMultiple(SparseMatrix::Scalar multiple, const Iter& begin, const Iter& end) {
+ // Now, you may be wondering why begin and end are passed by reference
+ // instead of by value, and that would be a good question. As it turns
+ // out, this method does not work otherwise when run in parallel using
+ // OpenMP on MS Visual Studio 2012 when being called from myReduce().
+ // Strange but true.
+ //
+ // Why is that? To the best of my ability
+ // to determine what was going on, it appears that, sometimes,
+ // two threads would be running on the same stack when calling this method,
+ // overwriting each other's local variables causing all kinds of havoc.
+ // My evidence for this is that I could find no other explanation after
+ // hours of investigation and that I could consistently get two threads
+ // with different return values of omp_get_thread_num() to print out the
+ // same address for a local variable - and this would happen just before
+ // things went wrong. So at this point I'm concluding that it is a compiler
+ // bug. All the writes and reads outside critical sections are to local
+ // variables, memory allocated by the same thread or to data structures
+ // that do not change within the scope of the parallel code in myReduce(),
+ // so I don't know what the issue would otherwise be. I thought perhaps
+ // not building all the code with OpenMP enabled could be the issue,
+ // but changing that did not fix the issue.
+ //
+ // Now you may be wondering what that has to do with passing iterators by
+ // reference. As it happens, the issue does not happen this way. "But that
+ // doesn't make any sense", you say, and you would be right. Feel free
+ // to come up with a better explanation of this issue.
+ //
+ // If you want to take a look at this issue, the issue only turns up for 64
+ // bit debug builds. This was on Visual Studio version
+ // "11.0.50727.1 RTMREL" - Bjarke Hammersholt Roune
T mult = static_cast<T>(multiple);
- for (; begin != end; ++begin) {
- MATHICGB_ASSERT(begin.index() < colCount());
+ for (Iter it = begin; it != end; ++it) {
+ MATHICGB_ASSERT(it.index() < colCount());
// Watch out for overflow here! This is only OK because
// T is promoting the multiplication to type T.
- mEntries[begin.index()] += begin.scalar() * mult;
+ mEntries[it.index()] += it.scalar() * mult;
}
}
--
Alioth's /usr/local/bin/git-commit-notice on /srv/git.debian.org/git/debian-science/packages/mathicgb.git
More information about the debian-science-commits
mailing list