[SCM] libm4ri: library of Method of the Four Russians Inversion branch, master, created. b1a8d2b916a4bd14de3b8c6949b8816f46a9636f

Felix Salfelder felix at salfelder.org
Fri Jun 15 11:39:26 UTC 2012


The branch, master has been created
        at  b1a8d2b916a4bd14de3b8c6949b8816f46a9636f (commit)

- Shortlog ------------------------------------------------------------
commit b1a8d2b916a4bd14de3b8c6949b8816f46a9636f
Author: Felix Salfelder <felix at salfelder.org>
Date:   Fri Jun 15 13:29:01 2012 +0200

    removed ltmain.sh (autogenerated)

commit 5c617835e45556a706544213becadb0ce38c4e74
Author: Felix Salfelder <felix at salfelder.org>
Date:   Fri Jun 15 13:08:43 2012 +0200

    new upstream release, 20120415

commit 08ba8f6db3f7eaa27ddda866a2e5190795af23c1
Merge: ab4ab5d 8732060
Author: Felix Salfelder <felix at salfelder.org>
Date:   Fri Jun 15 13:25:20 2012 +0200

    Merge commit 'release-20120415'

commit ab4ab5d1b2f32d2f8a1b3f9a4b1485b729b86434
Author: Felix Salfelder <felix at salfelder.org>
Date:   Fri Jun 15 12:30:35 2012 +0200

    switch to dh

commit 1970625d866ca914a29eab200bd5b7114f7462aa
Author: Felix Salfelder <felix at salfelder.org>
Date:   Fri Jun 15 12:24:43 2012 +0200

    remove autogenerated files

commit 2a53b6c7c9420ae4d0956819c05847049b219b6a
Author: Felix Salfelder <felix at salfelder.org>
Date:   Sat Feb 11 18:48:38 2012 +0100

    debian/0.0.20111203-1.
    
    - maintainer set to debian-science
    - adapted version numbers

commit 8732060769d586a44af79471e17bf5e748449bb9
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Fri Apr 13 23:59:56 2012 +0100

    fixed nasty bug in mzd_t_malloc/free

commit c9392bae233bcc9bafe713a1f429d16c0eea9a13
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Fri Apr 13 23:58:09 2012 +0100

    mzd_inv_m4ri makes no guarantees what happens when the input is not invertible

commit 2766fc3b0607fce8c6b898917d0d6bf2b99b0638
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Thu Mar 1 17:53:15 2012 +0100

    a few small changes to make scan-build shut up

commit 98afc34f0c4d4f76ae0657d2d6d31fbe225c2216
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Thu Mar 1 15:38:51 2012 +0100

    move OMP stuff to lower level (which gives better results on my 4 core i7)

commit c4f418505db96cfa4dc7d454fc29037d80f4e9d2
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Mon Feb 20 10:43:27 2012 +0000

    open files in binary mode (for Windows) (fixes: #43)
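
Editorial sketch of what "binary mode" means here, not the code from this commit. On Windows, a stream opened in text mode translates line endings and treats byte 0x1A as end-of-file when reading, so raw matrix data can be corrupted unless "b" is added to the fopen mode. The helper name roundtrip_binary and its 256-byte limit are made up for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write n bytes to path, read them back,
 * and return 1 if the bytes survived unchanged, 0 otherwise. */
int roundtrip_binary(const char *path, const unsigned char *buf, size_t n) {
  unsigned char back[256];
  assert(n <= sizeof back);

  /* "wb": the 'b' disables newline translation on Windows. */
  FILE *f = fopen(path, "wb");
  if (f == NULL) return 0;
  size_t written = fwrite(buf, 1, n, f);
  if (fclose(f) != 0 || written != n) return 0;

  /* "rb": read the raw bytes, including 0x0D, 0x0A and 0x1A. */
  f = fopen(path, "rb");
  if (f == NULL) return 0;
  size_t got = fread(back, 1, n, f);
  fclose(f);

  return got == n && memcmp(buf, back, n) == 0;
}
```

On POSIX systems the "b" flag has no effect, so the same call is safe on every platform.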

commit bae91fd8d8b3eabd5d64978d0eb297c4ee1e4f87
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Sun Feb 19 17:38:01 2012 +0000

    mzd_transpose() on 0x0 matrix should not crash (fixes #31)

commit 22d95d7169592497a348ea84016359c248cc1165
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Sun Feb 19 17:32:21 2012 +0000

    limit size of mzd_t cache to 16*64 (fixes #40)

commit 4b4305ea5ab7f551ef88429ded55ff2d24c9f5f4
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Sun Feb 19 12:03:23 2012 +0000

    do not use HAVE_OPENMP, always use __M4RI_HAVE_OPENMP (fixes #41)

commit 2d2b38fb5786a48d00e8165e36d307f6b0511dad
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Sat Feb 18 19:53:39 2012 +0000

    allow for really really big matrices (fixes #39)

commit 9d2c9a6ce18e1203889f3e0ffa81ee65feecb85b
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Sat Feb 18 13:59:50 2012 +0000

    fixes #38 (fix suggested anonymously)

commit 9acc8ba0aea5bd73f678a4b0a3da26f41514c5f1
Author: Alexander Dreyer <adreyer at gmx.de>
Date:   Fri Feb 17 14:58:03 2012 +0100

    FIX: m4ri called exit in library code

commit 759ec7c8fe086dc9c11edee172d8c56b0bf51d24
Author: Felix Salfelder <felix at salfelder.org>
Date:   Sat Feb 11 18:47:37 2012 +0100

    update autogenerated files (do we need them?)

commit e59917e3a908c16aacaffe43b458514b8fef2e31
Author: Felix Salfelder <felix at salfelder.org>
Date:   Sat Feb 11 14:49:45 2012 +0100

    debian/0.0.20111203

commit d0bb6fc1e1b5bb7a5def3fbd927aa30833f069b8
Merge: bf59f52 5f49f56
Author: Felix Salfelder <felix at salfelder.org>
Date:   Sat Feb 11 14:47:29 2012 +0100

    Merge commit 'release-20111203'

commit bf59f5297fe4b6baa334f5d5cb38cfaf4610ec6a
Author: Felix Salfelder <felix at salfelder.org>
Date:   Sat Feb 11 14:46:43 2012 +0100

    debian/0.0.20111004

commit 01e0003149fd5daf1c6afbf26a1ca5c666252211
Merge: 0326a3f e98c22e
Author: Felix Salfelder <felix at salfelder.org>
Date:   Sat Feb 11 14:40:16 2012 +0100

    Merge commit 'release-20111004'
    
    Conflicts:
    	COPYING

commit 0326a3fbfcd693c07b01709d4b347e92fffc0a1c
Author: Felix Salfelder <felix at salfelder.org>
Date:   Sat Feb 11 12:09:36 2012 +0100

    imported debian from libm4ri_0.0.20080521-2.diff.gz

commit 47e7fcb94cf906028463d36e3119f8cb94c964b6
Author: Felix Salfelder <felix at salfelder.org>
Date:   Sat Feb 11 12:06:33 2012 +0100

    libm4ri_0.0.20080521.orig.tar.gz

commit e219310cb39ca7cdf784bed80202a124f86aa8b5
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Fri Feb 3 11:13:28 2012 +0000

    define __STDC_LIMIT_MACROS before including <stdint.h> (reported by Jerry James)
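
Editorial sketch of the pattern this commit message describes, not the actual patch. C99 specified that C++ implementations expose the <stdint.h> limit macros (UINT64_MAX and friends) only when __STDC_LIMIT_MACROS is defined before the header is first included; C++11 later dropped that requirement. The function low_bits is a hypothetical example of code that depends on such a macro.

```c
/* Must come before the first inclusion of <stdint.h>, direct or indirect. */
#ifndef __STDC_LIMIT_MACROS
#define __STDC_LIMIT_MACROS
#endif
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: a word with the n lowest bits set, 1 <= n <= 64. */
uint64_t low_bits(int n) {
  assert(n >= 1 && n <= 64);
  return UINT64_MAX >> (64 - n);
}
```

In plain C the definition is harmless, which is why a header shared between C and C++ users can define it unconditionally.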

commit 05263c999e5f7386548e0e3fbd6a8e46b83c30f8
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Wed Jan 11 17:54:19 2012 +0100

    fixed reference to bench_packedmatrix

commit c7b54c7106ff3ff4ce041f4ebca23cd97e135651
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Wed Jan 11 17:54:03 2012 +0100

    corrected guardian ifdef

commit 7fb76c303275dea3989302cec7d2c1922fd1b8da
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Thu Jan 5 11:41:53 2012 +0100

    mzd_extract_u & mzd_extract_l and faster TRSM upper right

commit f353c86ff47e5dd7a92ac2a6580006e3c26625d5
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Wed Jan 4 19:11:59 2012 +0100

    removing bench_trsm_* which are replaced by bench_trsm

commit c80971f6b4dcf72fa72a1fb97addc8ddf68922f7
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Wed Jan 4 19:10:59 2012 +0100

    Added bench_trsm which unifies bench_trsm_*

commit 89ef734803e469bb5e3c7408875686dd2b007028
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Tue Jan 3 18:11:41 2012 +0100

    fixed spelling of Kronrod

commit b96fe3b2da88f4735f6cc0f391364887363c59e4
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Tue Jan 3 18:11:27 2012 +0100

    fixed cutting of matrices to fix problems with SSE2 instructions

commit 5ef613cc3b18c2eb177cd4b197d3d77544705a4d
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Mon Dec 26 00:58:24 2011 +0100

    refactoring: renamed files & functions (probably a bit more clean up to do)

commit 1d9f18e2d41f5fe64d819ea16f0c523ae1b02473
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Sun Dec 25 23:45:04 2011 +0100

    faster & nicer trtri for upper triangular matrices

commit 381faa770ad72218dde39d4cf34b9088a3eee344
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Sun Dec 25 22:00:32 2011 +0100

    improved benchmarketing code for PLUQ/PLE decomposition (still called PLS in code)

commit 406c376692145b0b8b444e62d1c82a8ab49b9e88
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Sat Dec 24 22:01:02 2011 +0100

    benchmarking for asymptotically fast trtri for upper triangular matrices

commit 63aae2694efaaf802f2e1a4a92f13b11411aa595
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Sat Dec 24 21:56:30 2011 +0100

    bugfixes & asymptotically fast inversion for upper triangular matrices

commit 5bf121b36f3f71c85e484c53f422ccd51f443f61
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Sat Dec 24 21:55:55 2011 +0100

    benchmarking & testing code for inversion

commit 83cfc83402b55f235baa99776c6425decd89270d
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Sat Dec 24 20:25:05 2011 +0100

    mzd_invert_m4ri replaced by mzd_inv_m4ri() with sane interface, implemented mzd_invert_upper_m4ri for inverting upper triangular matrices

commit 1774bd88226a5403c88447b19c26cd2157a8a643
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Sat Dec 24 20:24:32 2011 +0100

    removed trailing whitespaces

commit 09e57cba7411eacb4d52e26cf432cd7341188ad8
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Sun Dec 18 23:24:45 2011 +0000

    allow -n 1 in benchmarketing code

commit 9ea9bee593de010ddc33c039de10838f7961d2a4
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Sat Dec 3 13:43:31 2011 +0000

    Added tag release-20111203 for changeset 8c2115cc469c

commit 5f49f56e530f88a764faee3555c7f77587720ccd
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Sat Dec 3 13:43:06 2011 +0000

    changed library release to 20111203

commit d247a7505b4e03a056ef71ba36857966c21eac0c
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Wed Nov 30 21:51:02 2011 +0000

    fixed png stuff for machines without libpng (i.e., not building png stuff)

commit 3fd3a57af7e0a7175ba65772da1fa592fd3b3640
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Wed Nov 30 15:17:44 2011 +0000

    use MATHJAX when generating Doxygen HTML docs

commit cc408a13c3eb1dad6d7f188bdd8c3d7be7efd25e
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Wed Nov 30 15:17:24 2011 +0000

    shipping pkg.m4 for systems which don't have pkg-config such as OpenSolaris

commit d00bcbf2cd1d0bd77f1eb91f0cba79b313838129
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Tue Oct 25 23:14:43 2011 +0200

    removed misleading "this file is broken" comment, it *should* not be broken.

commit 865bfa008a31e98e2e24f885aac48d79c227a869
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Fri Oct 14 21:48:51 2011 +0100

    invert mono doesn't work for colormap so we do it by hand

commit fe2abcbf5bba4c7017b01793f37c1a9db90c211d
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Fri Oct 14 18:10:54 2011 +0100

    bugfix reading/writing png

commit efac98030407ce8711539f919f5105c1c00c520d
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Thu Oct 13 16:37:49 2011 +0100

    PNG reading/writing & reading of JCF's sparse matrix format

commit ba41260fe1b342955bb6645868d98b3231ac0642
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Tue Oct 4 11:40:25 2011 +0100

    Added tag release-20111004 for changeset 7453821cbd9b

commit e98c22e79cf4c186a8f5ed12308ebcadc1c289f0
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Mon Oct 3 15:42:40 2011 +0100

    changing version number for upcoming release

commit f867d08d61bad06e14ea50c31fce0a1eec9ee5f6
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Mon Oct 3 15:33:20 2011 +0100

    ... and fixed a bug in the last check-in

commit 7b5a13ba4ad9044c0aed81facb5436696218ae3b
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Mon Oct 3 15:28:18 2011 +0100

    renamed SIMD_FLAGS to SIMD_CFLAGS && defined __M4RI_SIMD_CFLAGS and __M4RI_OPENMP_CFLAGS in m4ri_config.h

commit e9cab4b86af290814bc6b5cc5a4c3e595acc56c9
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Fri Sep 30 12:58:39 2011 +0100

    bugfix in mzd_is_zero()

commit b6fa483f5170622499798a71c582e2299c946eba
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Wed Sep 28 15:57:33 2011 +0100

    whitespace stuff

commit b2105f3b3705acc54e04d28b9d845c603641a764
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Tue Sep 13 12:51:14 2011 +0100

    mzd_cmp() should not compare stuff after ncols

commit 9f226f57d5bb0790a684bbcbf0847d584c0e2f0e
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Mon Aug 29 17:02:53 2011 +0100

    Added tag release-20110901 for changeset 753358af056e

commit 22e9155bba32d3b808879f0523ec35f6c829760a
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Mon Aug 29 17:02:41 2011 +0100

    changing version to 20110901 for upcoming release

commit 3bf650705b5a12785afba7f4882a13c4a1950124
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Mon Aug 29 16:54:15 2011 +0100

    handle cflags better
    - install pkg-config .pc file
    - write CFLAGS to m4ri_config.h
    - throw a meaningful error message if __M4RI_HAVE_SSE2 is set but __SSE2__ is not

commit 61416ab36f5c83e7f339b7da0c21409960c612d4
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Wed Aug 24 23:42:46 2011 +0100

    use new-style config.h

commit 88a4073ca2f475f814e8688da9b7dd325fc8097e
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Mon Jul 11 15:15:15 2011 +0100

    Added tag release-20110715 for changeset ab55c3167691

commit aba251b6cc3f1d26c87ae4f6f9dd0e3403a01de3
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Mon Jul 11 15:15:03 2011 +0100

    Added tag release-20110613 for changeset 68c0b623b59a

commit 7bd1b0d428d7ebb656337c1e34e57d9015509b54
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Mon Jul 11 15:12:49 2011 +0100

    Added tag release-20110601 for changeset 75bcfb497a80

commit 4ac593ce747d5161fabcc4b555bbd2a86051cc66
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Wed Jul 6 16:55:32 2011 +0200

    adapting soname for upcoming release

commit 69fc78fb8648e63ec8767485500a1a6ea0cd75ed
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Wed Jul 6 16:54:20 2011 +0200

    added option to pass cache sizes explicitly to configure

commit 4206f38d2af1fa6125fff2fc957738a1b3175e99
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Fri Jul 1 15:29:13 2011 +0200

    use less iterations per experiment in cache tuning but more experiments

commit 2810d717c4c028bae1f08a4da924c86a1bc5b94a
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Fri Jul 1 15:21:38 2011 +0200

    flush the buffer in tuning such that the user gets feedback that we are not hanging

commit a7f2190ae2df3169bea0f755d93b8d16bbda6d97
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Fri Jul 1 10:38:20 2011 +0200

    revise PLE decomposition to match new block-iterative algorithm.
    
    This improves performance for sparse-ish matrices and guarantees O(n^3/log n) complexity in the worst case.

commit 3c9fe75558c9551a8ce0778f843a761374b53c7c
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Tue Jun 28 17:46:00 2011 +0200

    fix memleak in vector_destruct()

commit f32ef4b809d5000c04723de4e0f96a6415b4a844
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Wed Jun 22 16:48:34 2011 +0200

    adding m4ri_spread_bits and m4ri_shrink_bits + testcases

commit 16cf24b6bfb7598181c851620cc28ebcc98b32bd
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Wed Jun 22 16:36:30 2011 +0200

    disable manual zero-ing out the transpose matrix since our tests indicate it happens on the fly anyway. Added tests though.

commit e1737a88d9d3f152a5427ce7153a3d5b8ec56bd6
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Wed Jun 22 16:25:58 2011 +0200

    documentation update on PLE factorisation

commit 49e75e2b21933d0f6ce834116e92dd95be6129f2
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Wed Jun 22 16:24:07 2011 +0200

    zero out transpose target matrix before writing to it

commit fc5e2c80760395fbf09b5d0d06020db06513ed07
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Wed Jun 8 16:45:12 2011 +0100

    fix compilation and segfaults when OpenMP is enabled
    large parts of this patch are based on patches provided by Jerry James

commit 4ca1d52f6d1d590c0516bd9bab4bdbe381704750
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Wed Jun 8 15:33:16 2011 +0100

    fixing printing of benchmarketing information

commit 0f4a5f8145e12329ea8d4c6e4916659783e6c227
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Mon May 30 16:54:16 2011 +0100

    print cycles per bit in bench elimination and multiplication

commit 01f252899c63eca46b631fd114757678f0c6ab63
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Mon May 30 16:53:27 2011 +0100

    updating README and AUTHORS for upcoming 20110601 release

commit 328a1c5c117b58eaeec4d12755f21676cf31ea34
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Mon May 23 16:23:18 2011 +0100

    xor is a reserved keyword in C++

commit c48b1de86bbfb05a63e70ca541d21b34192c6dd8
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Mon May 23 11:41:59 2011 +0100

    fixed typo which prevented compilation

commit f9b89ed829c17a1616b6e0368463dfdaafe54f72
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Sat May 21 21:03:42 2011 +0100

    MS Visual Studio 10 support

commit 91b418c922734bf13269a0e61afdd9f2c96bcec6
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Sat May 21 18:10:23 2011 +0100

    adapting release version

commit ba242219743d62115e816a514026b45dbe31ca4c
Merge: bff05c6 77cc697
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Sat May 21 18:06:00 2011 +0100

    merging Carlo's swap patches

commit 77cc69772053fb642280a2045b484d663f644043
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Sat Apr 30 19:25:38 2011 +0100

    only set HAVE_PAPI if we have papi

commit bff05c637a632e4aea2519a98f7489bf57d5bbc4
Author: Carlo Wood <carlo at alinoe.com>
Date:   Sat Apr 30 19:16:11 2011 +0200

    Copied the improved code of mzd_col_swap to mzd_col_swap_in_rows and added support for start_row/stop_row.
    
    The result has the same speed as mzd_col_swap (per row).

commit 415f319297c211e1998d482350bc0a1834c046b9
Author: Carlo Wood <carlo at alinoe.com>
Date:   Sat Apr 30 18:08:35 2011 +0200

    Add support for transposing multi-block matrices.

commit 2dd557a950baa38cf78e9aabb69a1cd444efa7c8
Author: Carlo Wood <carlo at alinoe.com>
Date:   Sat Apr 30 13:38:10 2011 +0200

    Also ignore generated maintainer file ltmain.sh

commit 8eb50c1745448bb505f055b424d27e714d77a237
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Fri Apr 29 16:14:49 2011 +0100

    do not fail if realpath is not installed

commit 6bf10856e0487cdcb40ea1a6e9b992bb17847234
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Fri Apr 29 16:14:35 2011 +0100

    follow-up check-in for cache size fix

commit 3f4cc07578306338128d9c21c2d97d5fee2648ea
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Fri Apr 29 15:37:11 2011 +0100

    initialise variables (i.e., take care of Wall reported errors)

commit 17b65d8d30d2072532768690529a60f7aacca696
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Fri Apr 29 15:36:48 2011 +0100

    install debug_dump.h otherwise programs linking against the library will fail to compile

commit ce1671342ac40975e2cf0d9fa18ff0dbefa95af7
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Fri Apr 29 15:36:02 2011 +0100

    remove ltmain.sh which is autogenerated

commit 81b93952abb106b84d651c0d11eb45e28b6fe309
Author: Carlo Wood <carlo at alinoe.com>
Date:   Fri Apr 29 05:21:33 2011 +0200

    Speed up mzd_col_swap by a factor of two.
    
    Plus added a testsuite for it.

commit 64394b029c0515b5ca49c94ea4dbc05fddb55b5c
Author: Carlo Wood <carlo at alinoe.com>
Date:   Fri Apr 29 01:59:59 2011 +0200

    Bug fix in mzd_equal.
    
    When shift = B->offset - A->offset turns out to
    be negative, we swap A and B. I forgot to also
    reinitialize 'width'. Renamed 'width' to Awidth.
    
    Also got rid of __M4RI_LEFT/RIGHT_BITMASK macros.

commit 6f6f87279a1bdb1f2fb52a5b18f8a542e934b7b9
Author: Carlo Wood <carlo at alinoe.com>
Date:   Wed Apr 27 21:15:39 2011 +0200

    Bug fix and general fixups. Testsuite for transpose.
    
    Added test_transpose.c to the testsuite.
    Fixed a bug for non-square matrices of specific sizes where
    uninitialized data was written to the excess bits of the
    destination matrix of mzd_transpose.
    Added a few asserts related to multiblock matrices.
    A few minor documentation fixes and typos.

commit 36b04287c28bba1db8fe4548e889afa159f524ec
Author: Carlo Wood <carlo at alinoe.com>
Date:   Tue Apr 26 22:51:23 2011 +0200

    Major improvement of transposing.

commit 18268ea42cac36675d0b396e6236039d604732aa
Author: Carlo Wood <carlo at alinoe.com>
Date:   Tue Apr 19 22:16:30 2011 +0200

    Rewrite of _mzd_addmul_even_weird to use rowstride.
    
    Doesn't seem to speed anything up, but it was a 'test case'
    to show how it's done ;). Eliminates the use of 'rows',
    reducing the memory accesses by roughly a factor of two.
    
    Of course, in the light of calling mzd_init, which
    still calls malloc for blocks, and rows and fills the
    latter with data... this all makes little sense unless
    we really get rid of rows (and also cache allocations
    of blocks[]).

commit ab65c44f4805e0a5d9eb13411640ad27138324a7
Author: Carlo Wood <carlo at alinoe.com>
Date:   Tue Apr 19 21:49:48 2011 +0200

    Compiler warning fixes.

commit 0594d84a16bd32a3c3c414b8bf49ad423ef14db6
Author: Carlo Wood <carlo at alinoe.com>
Date:   Tue Apr 19 19:37:11 2011 +0200

    Implement separate cache for mzd_t.
    
    This cache only calls malloc if more than 64 matrices
    are used, and then allocates space for 64 mzd_t structs
    at a time.
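
Editorial sketch of the caching idea described above, assuming a design with 64-slot slabs and a bitmap of used slots. The names item_t, slab_t, item_alloc and item_free are hypothetical, and the real m4ri code may be organised differently. The first slab is static, so malloc (here calloc) is only reached once more than 64 items are live.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for mzd_t: only the 64-byte size matters here. */
typedef struct { unsigned char bytes[64]; } item_t;

/* One slab holds 64 items; bit i of `used` is set while slot i is handed out. */
typedef struct slab {
  item_t slots[64];
  uint64_t used;
  struct slab *next;
} slab_t;

static slab_t first_slab; /* static storage: no heap call for the first 64 items */

item_t *item_alloc(void) {
  slab_t *s = &first_slab;
  for (;;) {
    if (s->used != UINT64_MAX) {          /* this slab still has a free slot */
      for (int i = 0; i < 64; ++i) {
        uint64_t bit = (uint64_t)1 << i;
        if (!(s->used & bit)) {
          s->used |= bit;
          return &s->slots[i];
        }
      }
    }
    if (s->next == NULL) {                /* all slabs full: add room for 64 more */
      s->next = calloc(1, sizeof(slab_t));
      if (s->next == NULL) return NULL;
    }
    s = s->next;
  }
}

void item_free(item_t *p) {
  uintptr_t addr = (uintptr_t)p;
  for (slab_t *s = &first_slab; s != NULL; s = s->next) {
    uintptr_t lo = (uintptr_t)s->slots;
    uintptr_t hi = (uintptr_t)(s->slots + 64);
    if (addr >= lo && addr < hi) {        /* p lives in this slab */
      s->used &= ~((uint64_t)1 << (p - s->slots));
      return;
    }
  }
  assert(!"pointer does not belong to the cache");
}
```

This sketch never returns slabs to the system and is not thread-safe; both would need attention in real library code.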

commit 8aa7dd01282d235fcd6f96ba7d289045068e4d97
Author: Carlo Wood <carlo at alinoe.com>
Date:   Tue Apr 19 16:04:10 2011 +0200

    Added row_offset and accessor functions for mzd_t using it.
    
    row_offset is the distance in rows from the beginning of
    block 0 to the first row. This allows the following functions
    to be computed in just a few clock cycles. Note that
    I had to reduce the size of offset, flags and blockrows_log
    in order to keep sizeof(mzd_t) equal to 64 byte.
    
    This patch adds the mzd_t functions:
    mzd_first_row, mzd_first_row_next_block, mzd_row_to_block,
    mzd_rows_in_block and mzd_row.
    
    The canonical way to walk over all rows then becomes:
    
    int n = 0;
    rci_t row = 0;
    word* ptr = mzd_first_row(M);
    int count = mzd_rows_in_block(M, 0);
    while(1) {
      while (count--) {
        assert(ptr == mzd_row(M, row) && ptr == M->rows[row++]);
        ptr += M->rowstride;
      }
      if ((count = mzd_rows_in_block(M, ++n)) <= 0)
        break;
      ptr = mzd_first_row_next_block(M, n);
    }
    assert(M->ncols == 0 || row == M->nrows);
    
    Also fixed a bug where mzd_flag_multiple_blocks was set even
    if a windowed matrix fell completely inside a single block.
    
    Removed one multiplication and the only division from mzd_init_window.
    
    Code has now been tested with small blocks (down to 4 kB, although
    that causes two tests to fail because the gray code tables then
    no longer always fit in one block. Using 8 kB blocks is 100%
    successful, however).

commit dc034ec8f37cd4900530c14847f7f5550cc950f4
Author: Carlo Wood <carlo at alinoe.com>
Date:   Mon Apr 18 19:15:12 2011 +0200

    Added mzd_t::offset_vector and made mzd_t::blocks non-zero also for windowed matrices.
    
    After this patch, also for windowed matrices, you can find the first
    word of the first row with M->blocks[0].begin + M->offset_vector.
    
    Subsequently you can find the next rows by adding rowstride until
    the resulting pointer >= M->blocks[n].end. Then increment n and
    set the pointer to the first word in the next block with
    M->blocks[n].begin + (M->offset_vector % M->rowstride).
    
    For example, to run over all rows of an arbitrary matrix:
    
    rci_t r = 0;            // Only used for assertion that we checked all rows.
    int n = 0;              // Current block number.
    rci_t counted = 0;      // Number of rows processed already.
    word* row = M->blocks[n].begin + M->offset_vector;
    
    // Run over all blocks.
    while(1) {
      // Number of rows till the end of the current block.
      int row_count = (M->blocks[n].end + M->rowstride - 1 - row) / M->rowstride;
      // Don't go past the end of the matrix.
      row_count = MIN(row_count, M->nrows - counted);
      // Keep track of how many we already processed.
      counted += row_count;
    
      // START INNER LOOP
      while (row_count--) {
        assert(row == M->rows[r++]); // Assertion to check it works and count the total number of rows.
        row += M->rowstride;
      }
      // END INNER LOOP
    
      if (!M->blocks[++n].size)     // Was this the last block?
        break;
    
      // Position row at first word of first row in the next block.
      row = M->blocks[n].begin + (M->offset_vector % M->rowstride);
    }
    
    // Make sure we checked all rows.
    assert(M->ncols == 0 || r == M->nrows);

commit aa4e2a5142728b1f5abe100ead59f9cf43885a55
Author: Carlo Wood <carlo at alinoe.com>
Date:   Mon Apr 18 16:06:53 2011 +0200

    Move __M4RI_CPU_L1_CACHE and __M4RI_CPU_L2_CACHE to m4ri_config.h.in.

commit f8332c2e36dc4b65fbed733e93c939e3adfc6c41
Author: Carlo Wood <carlo at alinoe.com>
Date:   Mon Apr 18 15:21:01 2011 +0200

    Add option --debug-mzd.
    
    This allows one to run heavy consistency checks on the elements
    of mzd_t (which are heavily correlated) without dumping the
    hash values. Using just --debug-dump also does the consistency
    checks on top of printing the hash values.

commit 3870d4db4b12d69ffc15d65c5f2f9603f1032979
Author: Carlo Wood <carlo at alinoe.com>
Date:   Mon Apr 18 15:18:52 2011 +0200

    Add new elements to mzd_t and keep them consistent.
    
    After this patch it is guaranteed that excess bits and padding
    are zero (I think they already were, but now I checked it, see
    next commit).
    
    rowstride is the offset between two rows within a block.
    blockrows_log and blockrows_mask are respectively the log2
    of the number of rows in blocks before the last block,
    and the number of rows minus one. Note that number of rows
    is exactly a power of 2: 1 << blockrows_log, so that given
    some row, the first word in that row is given by:
    blocks[row >> blockrows_log] + (row & blockrows_mask) * rowstride.
    
    high_bitmask and low_bitmask are precomputed bitmasks, masking
    valid bits in the case of excess bits and/or offset bits respectively.
    If width == 1, then both are equal and mask all valid bits.
    
    Matrices with a width less than mzd_paddingwidth (currently 3),
    are no longer 128-bit aligned. This makes small matrices (width = 1),
    twice as compact while they already weren't processed with SSE2
    instructions anyway (in fact, it now might be possible to start
    to use SSE2 to do two rows at once).
    
    Finally, the flags element contains (currently) five bits that
    should allow certain algorithms to be sped up. The basic idea is
    that if flags is zero then we can use the fastest possible algorithm.
    The flags contain information about offset, excess bits, it
    being a windowed matrix, and the existence of more than one block.
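
Editorial sketch of the row-addressing formula quoted in the commit message above, blocks[row >> blockrows_log] + (row & blockrows_mask) * rowstride. The struct layout_t is a simplified, hypothetical layout: here blocks[n] is a plain pointer to the first word of block n, and offsets and windows are ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t word;

/* Hypothetical, simplified layout; the real mzd_t has more fields. */
typedef struct {
  word **blocks;       /* blocks[n] points at the first word of block n */
  int blockrows_log;   /* log2 of the number of rows per block */
  int blockrows_mask;  /* rows per block minus one */
  ptrdiff_t rowstride; /* words between two consecutive rows in a block */
} layout_t;

/* First word of the given row: a shift picks the block, a mask picks
 * the row inside the block, so no division or modulo is needed. */
word *row_ptr(const layout_t *m, int row) {
  assert(row >= 0);
  assert(m->blockrows_mask == (1 << m->blockrows_log) - 1);
  return m->blocks[row >> m->blockrows_log]
       + (row & m->blockrows_mask) * m->rowstride;
}
```

The shift-and-mask form only works because the number of rows per block is an exact power of two, as the commit message points out.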

commit 5f3d07884d8fc823879c6fde72eb72fdc4a5b1c6
Author: Carlo Wood <carlo at alinoe.com>
Date:   Mon Apr 18 15:02:49 2011 +0200

    Documentation fix.
    
    mzd_t::offset is already modulo m4ri_radix.

commit ebeb512f71f56aa2222d9f8acb0768094debaec1
Author: Carlo Wood <carlo at alinoe.com>
Date:   Sun Apr 17 02:48:57 2011 +0200

    Add --enable-debug-dump.
    
    When configured with --enable-debug-dump, print a trace
    of (hash values of) output values and their function/location,
    upon leaving any function that does something significant.
    
    This can be used to quickly find the function that behaves
    differently in the case that some patch breaks the testsuite.

commit da91dc7d35e52628c7fe4a9accef43c3a37617e0
Author: Carlo Wood <carlo at alinoe.com>
Date:   Sun Apr 17 02:34:13 2011 +0200

    More constness and some whitespace issues.

commit c9fc96be7bebb3c0a64825b86896d21fc902ccc3
Author: Carlo Wood <carlo at alinoe.com>
Date:   Sat Apr 16 21:23:10 2011 +0200

    A few more compiler warning fixes and a const thingy.

commit b761ba069210cd8b6fa4a99d32c28875a2b0aa68
Author: Carlo Wood <carlo at alinoe.com>
Date:   Sat Apr 16 16:32:12 2011 +0200

    More constness fixes.
    
    This should cause all pointers passed to functions
    to be a pointer-to-const when the content is not changed.
    
    I needed to introduce mzd_init_window_const, which creates
    a window into a const mzd_t, returning mzd_t const* as well.
    I decided to demand an explicit cast when freeing such
    a window (we can fix that later once I added flags to mzd_t,
    and add runtime checking when freeing a const window,
    removing the need for explicit casts).
    
    Also fixed a few -Wall compiler warnings that sneaked
    into the testsuite.
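
Editorial sketch of the const-window pattern described above, using hypothetical miniature types (win_t, win_init_const, win_free, win_sum) rather than the real mzd_t API. It shows why the caller needs an explicit cast: the constructor returns a pointer-to-const, while the free function takes a non-const pointer.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical miniature of a matrix window; not the real mzd_t. */
typedef struct {
  const int *data; /* borrowed cells, never written through this view */
  int nrows, ncols;
} win_t;

/* Read-only view of const data, in the spirit of mzd_init_window_const. */
win_t const *win_init_const(const int *data, int nrows, int ncols) {
  win_t *w = malloc(sizeof *w);
  if (w == NULL) return NULL;
  w->data = data;
  w->nrows = nrows;
  w->ncols = ncols;
  return w;
}

/* Frees the view itself, not the borrowed data. */
void win_free(win_t *w) { free(w); }

/* Example of a function that only needs read access. */
int win_sum(win_t const *w) {
  assert(w != NULL);
  int total = 0;
  for (int i = 0; i < w->nrows * w->ncols; ++i) total += w->data[i];
  return total;
}
```

A caller would release the view with win_free((win_t *)w); the cast is the explicit step the commit message says is demanded for now.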

commit 91504b25d7817fd2c4622ed603a3822b111e4c68
Author: Carlo Wood <carlo at alinoe.com>
Date:   Fri Apr 15 18:42:20 2011 +0200

    Fix constness of trsm* functions.

commit bded76b2689fd0b2fd44cf7c898225857cd69f0d
Author: Carlo Wood <carlo at alinoe.com>
Date:   Fri Apr 15 16:33:25 2011 +0200

    Do not install or include config.h in header files.
    
    This patch introduces src/m4ri_config.h.in, from which
    src/m4ri_config.h is generated during configure, which
    subsequently is installed instead of config.h and included
    by other headers. This avoids having the whole list
    of macros defined in config.h pollute the macro namespace
    for users of the library.
    
    Header files now use __M4RI_ prefixed versions of
    HAVE_SSE2, HAVE_MM_MALLOC, HAVE_POSIX_MEMALIGN and HAVE_OPENMP,
    although rather than undefining HAVE_MM_MALLOC when HAVE_SSE2
    isn't set, we use two helper macros: __M4RI_USE_MM_MALLOC
    and __M4RI_USE_POSIX_MEMALIGN (the idea being, as before,
    that without sse2 alignment is not needed).
    
    Also include testsuite/testing.h in the dist tar-ball,
    because otherwise 'make check' is broken for tar-ball
    releases.
    
    Renamed SSE2_CUTOFF --> __M4RI_SSE2_CUTOFF because it's
    also visible / used in a header.
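
Editorial sketch of the helper-macro idea described above: a "use" macro is derived from two "have" macros instead of undefining one of them. The EX_ names are hypothetical stand-ins for the __M4RI_ macros, and the 16-byte alignment is an assumption chosen to match SSE2 register width. Only the posix_memalign branch is shown; an _mm_malloc branch would follow the same shape.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical feature macros; a configure script would set these to 0 or 1. */
#ifndef EX_HAVE_SSE2
#define EX_HAVE_SSE2 0
#endif
#ifndef EX_HAVE_POSIX_MEMALIGN
#define EX_HAVE_POSIX_MEMALIGN 0
#endif

/* Aligned allocation is only worth using when SSE2 code paths exist. */
#define EX_USE_POSIX_MEMALIGN (EX_HAVE_POSIX_MEMALIGN && EX_HAVE_SSE2)

/* Allocate size bytes, 16-byte aligned if the build asked for it. */
void *ex_alloc(size_t size) {
  assert(size > 0);
#if EX_USE_POSIX_MEMALIGN
  void *p = NULL;
  return posix_memalign(&p, 16, size) == 0 ? p : NULL;
#else
  return malloc(size);
#endif
}
```

Memory from either branch is released with free(), so callers do not need to know which one was compiled in.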

commit 29accda9db14aadbdbaf82bcb8267eaa7ab9c21d
Author: Carlo Wood <carlo at alinoe.com>
Date:   Wed Apr 13 18:41:09 2011 +0200

    Make it harder for the compiler to put parts of inlined functions outside our loop.

commit 95281a500e091307dcf9797e25552280e7eca488
Author: Carlo Wood <carlo at alinoe.com>
Date:   Wed Apr 13 18:12:15 2011 +0200

    Add dependency on m4ri headers to testsuite.

commit 55fec0a02b92708952ae9ad2d9d576b8b01b552c
Author: Carlo Wood <carlo at alinoe.com>
Date:   Tue Apr 12 21:56:40 2011 +0200

    Allow to only dump a single counter.

commit c818ee2e357aa3be49bfb8e6ed43dbabe8ca1097
Author: Carlo Wood <carlo at alinoe.com>
Date:   Tue Apr 12 21:26:18 2011 +0200

    Compiler warning fixes.
    
    Added missing headers.
    Removed unused parameters.
    'static inline' must come before return type.
    
    This warning is still not fixed:
    
    src/packedmatrix.h:862: warning: unused parameter ‘c_startblock’
    src/packedmatrix.h:863: warning: unused parameter ‘a_startblock’
    src/packedmatrix.h:864: warning: unused parameter ‘b_startblock’

commit 6b8561e4908b8fe0608c571d8576a4d85793c85b
Author: Carlo Wood <carlo at alinoe.com>
Date:   Tue Apr 12 20:33:32 2011 +0200

    Added forgotten m4/ax_func_posix_memalign.m4
    
    Oops, I had forgotten to add this.

commit ca633abbcee2eac95c28dafe18f48c0753feb938
Merge: 21c3ab0 faf0883
Author: Carlo Wood <carlo at alinoe.com>
Date:   Tue Apr 12 20:19:11 2011 +0200

    Merge with malb

commit 21c3ab0e315d85335c31611324506fe0a4300851
Author: Carlo Wood <carlo at alinoe.com>
Date:   Tue Apr 12 18:38:10 2011 +0200

    Doxygen warning fixes.
    
    Added missing documentation (that I was responsible for).
    Added M4RI_DOXYGEN to PREDEFINED in Doxyfile, and used
    that to help doxygen understand what we want to generate
    documentation for and what not.
    Also updated Doxyfile by running 'doxygen -u Doxyfile'.

commit 18ccb433e3a03efbc273f1a929a6f9029c670884
Author: Carlo Wood <carlo at alinoe.com>
Date:   Tue Apr 12 18:01:15 2011 +0200

    Moved mmc functions to their own file.
    
    Added back doxygen documentation for mmc functions and macros.
    Renamed __M4RI_MM_MAX_MALLOC --> __M4RI_MAX_MZD_BLOCKSIZE and put
    it in packedmatrix.h. This constant is the size for mzd_t blocks,
    and has nothing to do with the memory management code.
    Renamed __M4RI_MZD_MUL_BLOCKSIZE --> __M4RI_MUL_BLOCKSIZE

commit 8ac1513ea7032cd7c333942c3887337aab3caefe
Author: Carlo Wood <carlo at alinoe.com>
Date:   Tue Apr 12 04:20:20 2011 +0200

    __M4RI_ENABLE_MMC juggling and support for posix_memalign

commit 34208ddd63aca72a0fb221b756f4fd63aa6dc2a2
Author: Carlo Wood <carlo at alinoe.com>
Date:   Tue Apr 12 03:29:57 2011 +0200

    Bug fix, forgot a few instances of CPU_L2_CACHE.

commit 289e82883e0ba3f79e4e59ea0dc4076ab8990c93
Author: Carlo Wood <carlo at alinoe.com>
Date:   Tue Apr 12 02:01:36 2011 +0200

    Bug fix for crash of bench_* programs.
    
    PAPI needs to be initialized before calling PAPI_event_code_to_name.

commit 0368d1c35142e31353628219be6947b45214c70b
Author: Carlo Wood <carlo at alinoe.com>
Date:   Tue Apr 12 01:55:44 2011 +0200

    Move _mmc_ code from misc.h to packedmatrix.c.
    
    Those functions were way too large to inline anyway
    (and allocating memory is very slow, so useless to
    inline). Moreover, packedmatrix.c is the only file
    using those functions!

commit 7a5e99a011b594f7b4f3b1ca6a96933c8c74a8f4
Author: Carlo Wood <carlo at alinoe.com>
Date:   Tue Apr 12 01:13:06 2011 +0200

    Prefix all exported variables, functions and macros.
    
    Macros are now prefixed with __M4RI_, with the exception of MAX and MIN.
    Functions are prefixed with m4ri_, mzd_ or _mzd.
    Constants defined in headers are lower case and prefixed with m4ri_.
    This doesn't make the code more readable unfortunately, but it's just
    not professional to export things in a library that can easily collide with others.

commit faf088323fad703ac7e76e6efbb3e88995a79ea1
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Mon Apr 11 17:42:01 2011 +0100

    allow --with-papi=PREFIX when AC_CHECK_LIB cannot find it (i.e. the case it is meant for)

commit eff72b2843d030dc25633283883885ece84b65f4
Author: Carlo Wood <carlo at alinoe.com>
Date:   Mon Apr 11 18:05:50 2011 +0200

    Also search for papi.h by using -I include flags.
    
    Prints found paths and warnings as appropriate.

commit 02d1c5a9555ec5fc95715ecc055b4ddaf2cd11c2
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Mon Apr 11 11:06:06 2011 +0100

    add hack to add PAPI include directory.
    
    Now PAPI can be installed in funny places (read: not /usr/local) and the build succeeds

commit 41e50e9992f42f045ed163aed28987e534e79b66
Author: Carlo Wood <carlo at alinoe.com>
Date:   Mon Apr 11 03:45:52 2011 +0200

    Determine and use LIBPAPI_PATH.
    
    This sets PAPI_LDFLAGS="-Wl,-rpath,$LIBPAPI_PATH" instead
    of explicitly using /usr/local/lib.
    It also adds -L$LIBPAPI_PATH to PAPI_LIBS.
    
    LIBPAPI_PATH can be set explicitly with --with-papi=/papi/prefix,
    or support for libpapi can be explicitly skipped by
    specifying --without-papi, or --with-papi=no. If no --with-papi
    is given then LIBPAPI_PATH is set to the directory where
    libpapi.so is found, by searching first all -L directories in LDFLAGS
    and finally in /usr/local/lib.

commit b1284a846141c8aae3c45089079f84961adf8459
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Sun Apr 10 13:57:11 2011 +0100

    matops doesn't exist anymore

commit 580c5fdc43cc903cb2fe77a7f85c3f553793e3df
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Sun Apr 10 13:38:16 2011 +0100

    bench_smallops made obsolete by bench_packedmatrix

commit 570acb14161e75bc25ffa8c109100602f8b42cea
Merge: 78dda31 473e509
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Sun Apr 10 13:32:51 2011 +0100

    merging in Carlo's benchmarking patches

commit 78dda319a5a54647918c4dd770dbb888d5efe09c
Author: Carlo Wood <carlo at alinoe.com>
Date:   Sun Apr 10 01:46:30 2011 +0200

    Added support for PAPI.

commit 32664de37683a33eadcbefa7627c3d2510c9ba7f
Author: Carlo Wood <carlo at alinoe.com>
Date:   Tue Apr 5 18:55:14 2011 +0200

    Fix order of calloc function parameters.
    
    Rationale is that if one of the arguments can be odd,
    it should be the first parameter. Although this doesn't
    have effect on this library, it's basically a possible
    alignment issue.

commit 052152b51a1fa39d90c4fbb49f1cc4f7424624f6
Author: Carlo Wood <carlo at alinoe.com>
Date:   Tue Apr 5 18:38:21 2011 +0200

    Add LIKELY/UNLIKELY macros for future use
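
    A minimal sketch of how branch-hint macros of this kind are commonly
    written for GCC-compatible compilers, using __builtin_expect. The
    names EX_LIKELY, EX_UNLIKELY and count_nonzero are hypothetical and
    the definitions are an assumption, not the code from this commit.

```c
#include <stddef.h>

/* Hypothetical branch-hint macros; the library's own definitions may differ. */
#if defined(__GNUC__)
#define EX_LIKELY(cond)   __builtin_expect(!!(cond), 1)
#define EX_UNLIKELY(cond) __builtin_expect(!!(cond), 0)
#else
#define EX_LIKELY(cond)   (cond)
#define EX_UNLIKELY(cond) (cond)
#endif

/* Counts non-zero entries; a NULL input is hinted as the unlikely case. */
size_t count_nonzero(int const* values, size_t n) {
  if (EX_UNLIKELY(values == NULL)) return 0;
  size_t count = 0;
  for (size_t i = 0; i < n; ++i) {
    if (values[i] != 0) ++count;
  }
  return count;
}
```

    The hint only influences code layout; the result of the condition
    is unchanged, so the macros can fall back to the plain expression
    on other compilers.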

commit b7837dc553f989f8fdc21808888ed8f7d035217b
Author: Carlo Wood <carlo at alinoe.com>
Date:   Tue Apr 5 03:52:34 2011 +0200

    Create a randomized matrix for each call of mzd_gauss_delayed and mzd_echelonize_naive.

commit a6f105a830aa1cdbd1f567e6990750bab5c3c20f
Author: Carlo Wood <carlo at alinoe.com>
Date:   Tue Apr 5 00:32:34 2011 +0200

    Fix constness of packedmatrix mzd_t input pointers.

commit 473e50938156fec139272c7cb2d6cec18618085d
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Mon Apr 4 23:05:46 2011 +0200

    correcting a few minor things in bench_packedmatrix
    - mzd_transpose is more general (m x n instead of only square matrices)
    - mzd_mul_va does support more general inputs
    - corrected theoretical complexity of various algorithms

commit f77a9d160311865980ec9bf30bcdbee59a8228fe
Author: Carlo Wood <carlo at alinoe.com>
Date:   Mon Apr 4 20:08:03 2011 +0200

    Packedmatrix benchmark fixes.
    
    Only test mzd_first_zero_row with a square matrix, and assign result to volatile dummy.
    Updated several 'cost' values to bring single measurements near 20 ms.

commit cac52abdca9fc31f7489863ee30f607df60f9ffc
Author: Carlo Wood <carlo at alinoe.com>
Date:   Mon Apr 4 20:04:34 2011 +0200

    Minor changes to mzd_first_zero_row

commit 42a347e728db5cc24719cfaa6e5b8f43b1955ebb
Author: Carlo Wood <carlo at alinoe.com>
Date:   Mon Apr 4 01:14:46 2011 +0200

    Bug fix in print_complexity1_human and complexity code updates.

commit 4ba1bf995bcd9308dfcb421258e2bed80eb0f73f
Author: Carlo Wood <carlo at alinoe.com>
Date:   Sun Apr 3 19:39:25 2011 +0200

    Improved printed Usage output.

commit c90a43a0cd80b7c316f266eda09eee4b0d476c64
Merge: d17a15a 4f21a4d
Author: Carlo Wood <carlo at alinoe.com>
Date:   Sun Apr 3 19:26:43 2011 +0200

    Merge with malb

commit 4f21a4d3be45a5d8655ffbfd525fbb5ffa326b56
Author: Carlo Wood <carlo at alinoe.com>
Date:   Sun Apr 3 19:23:54 2011 +0200

    Added general benchmark program for individual packedmatrix functions.

commit d17a15a4929f61aa45a2c59e6a0794c41bdf5c64
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Fri Apr 1 09:48:18 2011 +0200

    I foolishly forgot to add some of the newly added files

commit 8f732876dc1a4a5dc7ac24167da73c15c723c705
Merge: 3b17969 f9fe044
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Thu Mar 31 18:06:46 2011 +0200

    merge with Carlo's copyright update

commit f9fe0445a5ba37e6cd35b9139572fd4d1b104fe5
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Thu Mar 31 18:01:20 2011 +0200

    improved mzd_add
    - mzd_add is much faster now for small matrices
    - mzd_add respects ncols & offset properly
    - there is a benchmark to test the performance of mzd_add, mzd_copy, mzd_transpose
    - there is a test to test correctness of mzd_add (including whether it writes to places it shouldn't)

commit 3b17969301d7af1814f1692fc13746fca4321d10
Author: Carlo Wood <carlo at alinoe.com>
Date:   Wed Mar 30 14:59:29 2011 +0200

    Fixed copyright header in testsuite/test_random.c.
    
    Also added explicit Copyright notice to some
    of the files (not all of them, because at some
    point the non-legal aspect of such a statement
    (let's call it the "boost" factor) is too high
    for me to feel comfortable with adding such
    notice when basically all I did was changing
    the types of variables into int, rci_t or wi_t.

commit 8e9aa032446fd6a4e04666e626d78c4bb97f713b
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Wed Mar 30 14:27:33 2011 +0200

    adding autogenerated files to .hgignore

commit 927b01bec54387fb243d41b96d61efc2d7bd0a26
Merge: cce6b35 0bc3d4b
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Tue Mar 29 08:56:18 2011 +0100

    merging with Carlo's random() fixes

commit 0bc3d4bf6af0f17d3299c4c8024f024ee9d49696
Author: Carlo Wood <carlo at alinoe.com>
Date:   Tue Mar 29 03:49:05 2011 +0200

    Take BENCH_RANDOM_REVERSE into account in bench_randomize.
    
    Just to be pedantic, because the testsuite doesn't
    use this (with offset != 0).

commit 5bc161044b8d10e2eea45335026d8365a8a2787a
Author: Carlo Wood <carlo at alinoe.com>
Date:   Tue Mar 29 03:31:48 2011 +0200

    Duplicated code of m4ri_randomize m4ri_random_word to benchmarketing.c
    
    Also needed to fix this file of course :/

commit 84614d268799fed091025da81a8b86951123a776
Author: Carlo Wood <carlo at alinoe.com>
Date:   Tue Mar 29 03:23:36 2011 +0200

    Bug fix in m4ri_random_word.
    
    m4ri_random_word accidentally only returned 31
    bits instead of 64.
    
    Improved output format of mzd_print to print
    a '|' between words instead of a space, also
    when offset != 0 (before a colon was printed
    in that case). And to not print a trailing ':'
    after the last bit.
    
    Fixed mzd_equal (now returning int instead
    of BIT), to also work for windowed matrices.
    
    Fixed mzd_randomize to generate the same
    matrix (after a call to srandom() with the same
    seed) independent of the offset of a matrix,
    and without overwriting the excess bits. In
    other words, this function now also works
    for windowed matrices.
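
    The 31-bit limit in the first fix above matches a generator such as
    POSIX random(), which yields 31 bits per call. One way to fill a
    64-bit word from such a source is sketched below. This is an
    assumption about the general technique, not the code from this
    commit; source31_fn, all_ones31 and random_word_from are
    hypothetical names, and the source is passed in so the sketch stays
    deterministic.

```c
#include <stdint.h>

typedef uint64_t word;

/* Signature of a generator that yields 31 random bits per call. */
typedef uint32_t (*source31_fn)(void);

/* Deterministic stand-in source for demonstration: all 31 bits are set. */
uint32_t all_ones31(void) { return 0x7FFFFFFFu; }

/* Builds one 64-bit word from three 31-bit draws placed at bit offsets
   0, 24 and 48. The draws overlap, which is harmless for independent,
   uniformly random input because the XOR of such bits is again uniform. */
word random_word_from(source31_fn next31) {
  word const a0 = next31() & 0x7FFFFFFFu;
  word const a1 = next31() & 0x7FFFFFFFu;
  word const a2 = next31() & 0x7FFFFFFFu;
  return a0 ^ (a1 << 24) ^ (a2 << 48);
}
```

    A single draw can never set bit 31 or above, whereas the combined
    word reaches bit 63.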

commit cce6b35235fcf658b66bb6c869813b331149c873
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Mon Mar 28 19:14:56 2011 +0100

    removing workarounds for compiler bug (not properly aligned loops) since they are not cross platform

commit 6e465932ebdf9bad109d68e9f3db3c7e8e6b448c
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Mon Mar 28 19:14:18 2011 +0100

    adding swap_bits() function to ease transition for third parties to new matrix layout

commit 35d5d1195b5251129d0c8282333e0013be5fb348
Author: Carlo Wood <carlo at alinoe.com>
Date:   Mon Mar 28 00:24:08 2011 +0200

    Use TOPSRCDIR Makefile var instead of PWD. Inverse random bits when needed.

commit 3e1bfc44b77925d1450c4ee45c6b7e8a3a2d3e71
Author: Carlo Wood <carlo at alinoe.com>
Date:   Sun Mar 27 23:45:00 2011 +0200

    Random benchmark improvements.
    
    Speed up of m4ri_random_word, and introduction
    of bench_random_word so it can be used during
    benchmarking of older revisions too.
    
    Fixed testsuite/Makefile.in such that it is possible
    to compile the testsuite against a different source tree.
    
    Various other minor improvements and fixes.

commit 614da55cd7651147f1d83c7062a4406e2f5db580
Author: Carlo Wood <carlo at alinoe.com>
Date:   Sat Mar 26 20:26:46 2011 +0100

    Benchmark facelift.
    
    Added testsuite/benchmarketing.c with code
    for general commandline options to control
    the minimum/maximum number of times a test is run,
    desired accuracy, confidence level, maximum running
    time and amount of output printed.
    
    Also fixed the bench_pluq.c and bench_trsm*.c
    to support the new benchmarketing engine.

commit e73f08dc54fdcf1608b8d4959e9d585f0f9cb265
Merge: 9143a2f af558b2
Author: Carlo Wood <carlo at alinoe.com>
Date:   Fri Mar 25 04:23:19 2011 +0100

    Merge with https://bitbucket.org/malb/m4ri changeset 7d7a103dfba3

commit 9143a2f94239f69bac68bb35acd9185b64bb699b
Author: Carlo Wood <carlo at alinoe.com>
Date:   Fri Mar 25 02:04:30 2011 +0100

    Type and whitespace clean up (part 2).
    
    This finalizes this patch.
    C++ wrapping and hacks needed for that removed.
    More manual whitespace fixes.
    One bug fix: A header had type int as parameter while it should have been wi_t.

commit fd1624b54e3e1131015f54e1753dbef670096447
Author: Carlo Wood <carlo at alinoe.com>
Date:   Thu Mar 24 18:20:36 2011 +0100

    Type and whitespace clean up.
    
    Introduction of rci_t and wi_t.
    
    rci_t is the type of row or column index, and differences thereof.
    However, when it is known that a certain row/col difference is
    sufficiently small, the type int is used. For example:
    
    int skip = col % RADIX;
    
    wi_t is the type of word index, and differences thereof.
    This is the type of differences between two word*.
    Most notably, rci_t / RADIX = wi_t.
    
    Note the types in mzd_t::rows[rci_t][wi_t].
    
    This patch also moves a lot of (all?) variable declarations from the
    beginning of functions to where they are actually (first) used.
    This is supported by C99 and makes the code a lot more clear.
    
    This patch also includes a manual re-indent, most notably putting
    const on the right-hand side of types, using pre-increment when
    possible and adding spaces around operators.
    
    Finally, this patch contains temporary code to check correct usage
    of rci_t and wi_t relative to int (size_t) and to each other. This is
    achieved by, next to word, also wrapping multiples of RADIX in a C++
    class, as Radix_t, using wordPtr and wordPtrPtr as wrappers for
    respectively word* and word** and by wrapping rci_t and wi_t in
    C++ classes. This commit contains this code for archival purposes,
    as it will be removed again in the next patch (this code only
    compiles with a C++ compiler at the moment).
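
    A minimal sketch of how the two index types relate, assuming both
    are plain ints, that a word holds 64 bits and that bit 0 of a word
    is its first column. The helper names word_index, bit_index and
    read_bit are hypothetical and not taken from the library.

```c
#include <assert.h>
#include <stdint.h>

/* Assumptions for this sketch: both index types are plain ints. */
typedef int rci_t;       /* row or column index */
typedef int wi_t;        /* word index */
typedef uint64_t word;

#define RADIX 64

/* Word that holds column col: rci_t / RADIX yields a wi_t. */
wi_t word_index(rci_t col) {
  assert(col >= 0);
  return col / RADIX;
}

/* Bit position inside that word; always below RADIX, so int suffices. */
int bit_index(rci_t col) {
  assert(col >= 0);
  return col % RADIX;
}

/* Reads one bit from a row stored as an array of words. */
int read_bit(word const* row, rci_t col) {
  return (int)((row[word_index(col)] >> bit_index(col)) & 1);
}
```

    Column 69, for example, lives in word 1 at bit position 5.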

commit fb8ae5b1ddfa1741473a703f167c1098b2a5956f
Author: Carlo Wood <carlo at alinoe.com>
Date:   Tue Mar 15 20:09:34 2011 +0100

    Removed FIXME from _mzd_addmul_weird_weird
    
    Not really sure if this was worth 2 hours of my life,
    but Okay... moved a shift outside of inner loop.

commit 58163eb948ddc5da35f36608460a550eb672b297
Author: Carlo Wood <carlo at alinoe.com>
Date:   Tue Mar 15 17:09:43 2011 +0100

    Remove the FIXME from _mzd_transpose_direct_128
    
    Doesn't really gain anything, but I guess it feels better
    to start with 0xFFFFFFFF ;).
    
    Note that I had to write a separate test program for this,
    because this function is NOT tested by the current testsuite!

commit 80f25bcc66ce4e2592bc2e2c32d18853435aef92
Author: Carlo Wood <carlo at alinoe.com>
Date:   Tue Mar 15 16:37:38 2011 +0100

    Use int consistently (needed for wordwrapper)

commit 97ffa52d03615308509ffd570a9811ff197cb053
Author: Carlo Wood <carlo at alinoe.com>
Date:   Tue Mar 15 15:15:25 2011 +0100

    Minor cleanup of misc.h.
    
    Among other things, get rid of BITMASK again now
    that it is too simple; after some contemplation I
    decided that seeing 'ONE << spot' in the code
    is more instructive than seeing 'BITMASK(spot)',
    especially since the latter more or less hides
    whether or not a modulo with RADIX is needed or
    not.
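
    A minimal sketch of the point about visibility, assuming ONE is
    simply 1 as a 64-bit word and RADIX is 64. The functions set_bit
    and single_bit are hypothetical illustrations, not library code.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t word;

#define RADIX 64
#define ONE ((word)1)

/* Sets column col in a row of words. The modulo is visible at the call
   site of the shift, which is the property the message argues for. */
void set_bit(word* row, int col) {
  assert(col >= 0);
  int const spot = col % RADIX;      /* reduce to a bit position first */
  row[col / RADIX] |= ONE << spot;   /* shift count is now in [0, RADIX) */
}

/* When the caller already holds a bit position, no modulo is needed. */
word single_bit(int spot) {
  assert(spot >= 0 && spot < RADIX); /* shifting by 64 or more is undefined in C */
  return ONE << spot;
}
```

    With a BITMASK(spot) macro, a reader cannot tell from the call
    alone which of the two situations applies.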

commit dfeecd85868c251ee2f814bd2819758b04489ebf
Author: Carlo Wood <carlo at alinoe.com>
Date:   Tue Mar 15 14:58:26 2011 +0100

    Reverse ONE.
    
    This is why we have done all this. By reversing ONE (making it 1 again
    in the process) columns and shifts match up and a lot of code (and
    macros) become simpler and more logical.

commit 91b5ccf0959609e681c4b6edaf034aaf7425bc3d
Author: Carlo Wood <carlo at alinoe.com>
Date:   Tue Mar 15 03:57:54 2011 +0100

    Remove last traces of reverse.
    
    This patch reverses the k bits of Gray codes of k bits long:
    code::ord. It also moves those k bits to the other side
    of the word when read with mzd_read_bits and also adjusts
    mzd_xor_bits to work on the other side.
    
    This means that the index of all L*, M* and E* arrays
    that take a Gray code as index are reversed (for their
    respective lengths).
    
    It was discovered that a test to break a loop in mzd_find_pivot
    was unnecessary, as it was testing a bit that is always zero.
    This test was removed (if we left it in we would have needed
    to check bit index -1 of a word).
    
    The library now compiles and works again when compiled with
    a C compiler!

commit 4d80ada8f252beefb4bd7bef3ba286c79e0e3000
Author: Carlo Wood <carlo at alinoe.com>
Date:   Mon Mar 14 18:54:38 2011 +0100

    Bring word::convert_to_int back to its original state.

commit 6b5c6a6b9474fbb89644f35dce7c7f86bd3510d3
Author: Carlo Wood <carlo at alinoe.com>
Date:   Mon Mar 14 18:25:44 2011 +0100

    Bring word::convert_to_BIT back to its original state.

commit 31a7ad95c077460666de3df388011a15b26354ae
Author: Carlo Wood <carlo at alinoe.com>
Date:   Mon Mar 14 18:20:25 2011 +0100

    Added extra asserts to make sure that shifts are within defined range.

commit 2c2919b64c49ce7a7b4faafd74177e95f8182258
Author: Carlo Wood <carlo at alinoe.com>
Date:   Mon Mar 14 17:28:51 2011 +0100

    Remove word::operator-(int).
    
    Fixed documentation of LEFT_BITMASK and RIGHT_BITMASK.
    Improved implementation of LEFT_BITMASK.
    Eliminated need for word::operator-(int).

commit f3984f6502c2a7db1f24b3b38ff03a99765598a1
Author: Carlo Wood <carlo at alinoe.com>
Date:   Mon Mar 14 16:31:17 2011 +0100

    Change shift operators back to their original state.
    
    In other words: swap >> and << (operating on words) throughout the code.
    
    This patch also fixes a bug in m4ri_random_word, which never returned
    more than 16 bits! This was fixed by NOT inverting the shift direction
    in this function.

commit 7757e451f3a265ac1f8af8255b1ce10bffb1d138
Author: Carlo Wood <carlo at alinoe.com>
Date:   Mon Mar 14 14:51:40 2011 +0100

    Cancel reversals in word::operator-(void) const and WRITE_BIT.
    
    Since WRITE_BIT is the only place where the negation operator
    is called, we can cancel the reversals here - bringing both
    back to their original state (meaning: we could switch back
    to typedef uint64_t word as far as those are concerned).

commit 7362299fac91cb4067232e95b51310c30dd6985d
Author: Carlo Wood <carlo at alinoe.com>
Date:   Mon Mar 14 14:47:18 2011 +0100

    Export reversal from CONVERT_TO_WORD to code.
    
    This also removes the FIXME64 class again.

commit c2b4d6bc179f1849d7225e9712504c58170ac166
Author: Carlo Wood <carlo at alinoe.com>
Date:   Mon Mar 14 14:32:36 2011 +0100

    Move reversal from word(uint64_t) to CONVERT_TO_WORD.
    
    This patch uses a temporary FIXME64 wrapper class to
    make sure that ALL calls to this constructor go through
    CONVERT_TO_WORD.
    
    Internal constructions of word are done directly now (without reversal).

commit 54953d0e476a9400487059e560f419e770c0cf73
Author: Carlo Wood <carlo at alinoe.com>
Date:   Mon Mar 14 04:25:44 2011 +0100

    Reversed the bit order of the internal representation of class word.
    
    On top of that, remove operators <, >, <= and >=.
    Comparing integer values for less than or greater than has to happen
    by first converting to the underlying uint64_t (using CONVERT_TO_UINT64_T).
    
    This patch already exports the reversal of the bits through
    CONVERT_TO_UINT64_T, so that functions that used operator< needed
    to be reimplemented for the reversed case (larger_log2 --> lesser_LSB).
    However, the reversal on conversion from uint64_t to word is still hidden,
    breaking the demand that CONVERT_TO_UINT64_T(CONVERT_TO_WORD(val)) == val.
    This will be fixed in the next commit.

commit d17437314b8b3bea0b881d219ae9f0f4a063f6e3
Merge: 19e254c eeb5da4
Author: Carlo Wood <carlo at alinoe.com>
Date:   Sun Mar 13 22:13:06 2011 +0100

    Fake merge of dead head

commit 19e254cfd41d906f819aa97e9a0adac17cfd1365
Author: Carlo Wood <carlo at alinoe.com>
Date:   Sun Mar 13 21:53:49 2011 +0100

    Introduction of M4RI_WORDWRAP and the C++ class word.
    
    This patch turns word into a C++ class that allows
    extensive testing and checks, if the library is
    configured with a C++ compiler as compiler.
    
    For example:
    CFLAGS="-O2" CC="g++" ./configure --enable-maintainer-mode --enable-debug
    
    It is not possible to pass --enable-sse2, as the class contains
    more members than just the underlying uint64_t and therefore
    no 128-bit operations are possible anymore.

commit 421af9e9d34d1f291c8f1ccaab084322f0d19e08
Author: Carlo Wood <carlo at alinoe.com>
Date:   Sun Mar 13 21:07:22 2011 +0100

    Add braces around expressions with & used as truth value.

commit e8999b94cfd8672fba5fe848a5bd4fbcbad049e2
Author: Carlo Wood <carlo at alinoe.com>
Date:   Sun Mar 13 21:02:16 2011 +0100

    Out of range bug fix.
    
    The added assert fails without the added if.

commit 4b996b1ac70e2dfe2fe50ffca496760ef22f5fed
Author: Carlo Wood <carlo at alinoe.com>
Date:   Sun Mar 13 20:59:39 2011 +0100

    Use sizeof(word) where appropriate.
    
    If word has a different size (even with RADIX == 64, as is the case
    for the word wrapper that will be added in a few commits) then we
    really need to use sizeof(word) in this place instead of 8.

commit e7aef678612e9c146b8039c3afc15cefd3068d98
Author: Carlo Wood <carlo at alinoe.com>
Date:   Sun Mar 13 20:56:11 2011 +0100

    Bug fix for _mzd_transpose_direct
    
    I was incorrectly assuming that the rowstride is constant;
    it is not when we enter a new block. This fix no doubt will
    make transposing slower again, but we can't fix that until
    after 'rowstride' is introduced.
    
    Also, use ONE instead of 1 where appropriate, and use size_t
    instead of word for bit distances.

commit 46a72f759b32a313bb4b75a15557094f9004fc64
Author: Carlo Wood <carlo at alinoe.com>
Date:   Sun Mar 13 20:38:33 2011 +0100

    Use uint64_t, not word, when we are dealing with 64-bit integers.

commit 30f7f546a4794e2d9d3645fffe7266112c850338
Author: Carlo Wood <carlo at alinoe.com>
Date:   Sun Mar 13 20:25:50 2011 +0100

    testsuite changes
    
    Generate testsuite/Makefile from testsuite/Makefile.in.
    Add cleandist (to remove Makefile).
    Include config.h in all tests.
    Add targets for test_% executables and collapse the bench_% targets.
    
    Generation from Makefile.in allows the use of @CC@, which -- together with
    the include of config.h -- is needed for the word wrapper (see future commit).

commit f87d88c6545fa6d5d5917cc58e48de305abc0d91
Author: Carlo Wood <carlo at alinoe.com>
Date:   Sun Mar 13 20:13:37 2011 +0100

    Explicitly use CONVERT_TO_UINT64_T when a word is transformed to integer.
    
    While the macro does nothing, it makes clear that after this
    conversion typical integer operations like addition, subtraction
    and multiplication might have a meaning again.
    
    Typically, all relationship with columns should be considered to
    be lost (but certain properties, like the number of bits set,
    are preserved).

commit c6133d2bcf1d89c73fe4c027a0978f9e135f4a8b
Author: Carlo Wood <carlo at alinoe.com>
Date:   Sun Mar 13 19:54:31 2011 +0100

    Explicitly use CONVERT_TO_WORD every time an int (or BIT) is converted into word.

commit 0476820c4c17f3e96d72997758393bd9a9a5773f
Author: Carlo Wood <carlo at alinoe.com>
Date:   Sun Mar 13 19:52:27 2011 +0100

    Explicitly convert a word to BIT.
    
    Turn BIT into an int, as is normal for a BOOL.
    The explicit conversion macro, CONVERT_TO_BIT, does nothing
    but makes clear that we expect the word to only have its
    least significant bit set and that we want to transform it
    to a boolean: false if the bit is not set and true when the
    bit is set.

commit eeb5da41e6b5ef0c921803443a801d19d1f21198
Author: Carlo Wood <carlo at alinoe.com>
Date:   Sun Mar 13 19:17:57 2011 +0100

    Work in progress

commit 943202017f2528a7bdce0159576b91da3b01c38e
Author: Carlo Wood <carlo at alinoe.com>
Date:   Sat Mar 12 23:02:40 2011 +0100

    Make things work for both, g++ and gcc.
    
    When setting CC=g++, 'word' is now automatically wrapped in a class.

commit 994a4ea97a129ca809b04834b743c2e339aaf41e
Author: Carlo Wood <carlo at alinoe.com>
Date:   Sat Mar 12 03:07:50 2011 +0100

    Implement WRAPWORD

commit 0056b5146057390093f8f05b0dc3e890ad33a93d
Author: Carlo Wood <carlo at alinoe.com>
Date:   Sat Mar 12 02:50:24 2011 +0100

    Introduction of mzd_read_bits_int.
    
    This function makes explicit when we expect
    the result to be interpreted as a (small) integer
    that can be used as index for tables.

commit af558b2103a2cca7756ea2a2a12e42009caf98f6
Author: Minh Van Nguyen <nguyenminh2 at gmail.com>
Date:   Tue Mar 8 05:22:31 2011 -0800

    more user friendly documentation in README
    
    Some typo fixes in README. Basic instructions on installation. Some
    instruction on building the reference manual.

commit 7a95eda9e108df7d86d4af892eaf603e039e4edc
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Wed Mar 9 18:35:03 2011 +0000

    added benchmarketing "framework" for getting more reliable timings out of bench_ files.
    Switched bench_elimination, bench_multiplication and bench_elimination_sparse to new "framework".

commit 6b76705b5a59c6f055c9756cce574e5fde5de789
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Wed Mar 9 16:49:43 2011 +0000

    set the random seed to a fixed value to allow reproducible tests/benchmarks

commit 6edefb93de74b983d4ad727b64ae34f7736fb545
Author: Carlo Wood <carlo at alinoe.com>
Date:   Tue Mar 8 01:16:13 2011 +0100

    Remove dependency on cwautomacros.
    
    As per Martin's request.

commit d730ec2e97051681e8deaaa7563a995e6181f854
Author: Carlo Wood <carlo at alinoe.com>
Date:   Tue Mar 8 00:56:25 2011 +0100

    Code alignment that makes _mzd_mul_naive-64 20% faster (or not).

commit 68bc54308da1592172d916a20f0ef3e1fae9b62f
Author: Carlo Wood <carlo at alinoe.com>
Date:   Sun Mar 6 21:20:14 2011 +0100

    Rewrite of _mzd_transpose_direct
    
    This version is a lot faster, and more importantly,
    no longer confusing the compiler.

commit c2fa76d3434d8e49c631a3f721c59793b6ce2e33
Author: Carlo Wood <carlo at alinoe.com>
Date:   Sat Mar 5 19:21:30 2011 +0100

    Hijack testsuite/Makefile to compile matops.

commit 472f641055583289e00e49d1cf8a357e9632ff88
Author: Carlo Wood <carlo at alinoe.com>
Date:   Sat Mar 5 18:50:52 2011 +0100

    Introduction of MIDDLE_BITMASK.
    
    MIDDLE_BITMASK is just 'LEFT_BITMASK & RIGHT_BITMASK' when operating
    on the same word (thus, small matrices) but a little bit faster.
    
    Also, set RADIX to just '64'. While 'sizeof(word) << 3' results in the
    same value, it 1) has type 'size_t' (*) and 2) only works when CHAR_BIT == 8,
    which is not the case on all machines :p.
    
    Since we use uint64_t as type for word, it is more logical to use 64
    explicitly than to pretend that word could be changed to a different
    type (which it cannot).
    
    (*) We shouldn't use size_t for bit sizes... but I'll get back to
    that later.
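
    A minimal sketch of the two objections to the sizeof-based formula.
    The functions sizeof_formula_matches and bits_per_word are
    hypothetical; the CHAR_BIT variant is shown only for comparison and
    is not something this commit introduces.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t word;

/* Plain int constant, independent of CHAR_BIT. */
#define RADIX 64

/* Returns 1 when the sizeof-based formula agrees with RADIX here.
   The formula has type size_t and silently assumes 8-bit chars. */
int sizeof_formula_matches(void) {
  size_t const computed = sizeof(word) << 3;
  return computed == (size_t)RADIX;
}

/* Portable spelling of the formula: multiply by CHAR_BIT instead of
   shifting by 3. uint64_t has no padding bits, so this is always 64. */
int bits_per_word(void) {
  return (int)(sizeof(word) * CHAR_BIT);
}
```

    On a platform where CHAR_BIT is 8 both functions agree with RADIX;
    elsewhere only the CHAR_BIT variant does.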

commit 2780181bd009d027d10bd3e7b0c05ddc2a904347
Author: Carlo Wood <carlo at alinoe.com>
Date:   Sat Mar 5 18:44:34 2011 +0100

    Make sure the correct library is being used at run time for the testsuite.
    
    Hard code the path to the library under test into the benchmarks,
    otherwise they might run against an installed version if
    LD_LIBRARY_PATH is not set (correctly).

commit 3c1a1d4ce38788f280ed8ac61966a671a3203b32
Author: Carlo Wood <carlo at alinoe.com>
Date:   Sat Mar 5 18:31:49 2011 +0100

    Add back the +1 to the result of log2_floor.
    
    The old log2_floor returned one too much,
    however, apparently this is what had been
    taken into account(?).
    
    Benchmarking shows that the correction
    of log2_floor had the following effect:
    
    bench_multiplication_10000_10000         same speed
    bench_elimination_10000_10000_pluq       slower
    bench_multiplication_10000_2048          faster
    bench_multiplication_10000_4096          slower
    bench_elimination_10000_10000_m4ri       same speed
    
    Because bench_elimination_10000_10000_pluq is
    more important than bench_multiplication_10000_2048
    this change is hereby reverted.

commit bd42b6e17f42de5cafbac8de751444c614da32a6
Author: Carlo Wood <carlo at alinoe.com>
Date:   Fri Mar 4 04:13:58 2011 +0100

    Compiler warning fixes.

commit f68ffb4e3baac7cf8b7b71acdeb9f5440471ed7a
Author: Carlo Wood <carlo at alinoe.com>
Date:   Fri Mar 4 03:42:40 2011 +0100

    Final micro optimizations in packedmatrix.h.

commit 4df7a0208c7d36d64b143b5b4605bbde568f4547
Author: Carlo Wood <carlo at alinoe.com>
Date:   Thu Mar 3 00:45:46 2011 +0100

    More micro optimizations and a bug fix.
    
    Removed LEFTMOST_BITS and RIGHTMOST_BITS completely:
    the meaning 'left' and 'right' is a bit fuzzy at
    the moment ;)
    
    Wherever we do an AND with ~0xF, use ~0xFUL instead.
    Although sign extension is saving us, it's a bit
    scary to interpret 0xF as (signed) int, invert it
    to get 0xFFFFFFF0 and then because it is signed
    get 0xFFFFFFFFFFFFFFF0 before it's assigned to an
    unsigned long (long).
    
    Some micro optimization mostly involving the fact
    that we can swap bits by doing an XOR with the XOR
    of the bits that have to be swapped.
    
    Fixed a bug in mzd_row_add_offset where non-zero
    excess bits in the source would end up in the destination.
    Apparently the destination is never a window into
    a larger matrix, so this wasn't discovered by the
    testsuite, but it might matter in cases where someone
    relies on those excess bits to be zero.
    While I was at it, I more or less rewrote the
    function to be faster for smaller matrices without
    being a clock cycle slower for larger ones ;)
    
    Fixed several typos in documentation of xor.h.
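
    A minimal sketch of two points from this message: how the type of
    the literal decides what ~0xF becomes in 64 bits, and how two bits
    can be swapped with a single XOR. All function names are
    hypothetical. The message itself uses ~0xFUL, which gives the full
    mask when unsigned long is 64 bits wide; the cast-based spelling
    below is an alternative chosen for this sketch, not the commit's
    code.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t word;

/* ~0xF has type int (value -16); converting it to 64 bits sign-extends. */
word mask_from_int(void) { return (word)~0xF; }

/* ~0xFU has type unsigned int; converting it zero-extends, so with a
   32-bit int the upper 32 bits of the mask are lost. */
word mask_from_unsigned(void) { return (word)~0xFU; }

/* Width-independent spelling: widen first, then invert. */
word mask_from_word(void) { return ~(word)0xF; }

/* Swaps bits i and j of w by XORing w with the XOR of those two bits,
   placed at both positions. */
word swap_two_bits(word w, int i, int j) {
  assert(i >= 0 && i < 64 && j >= 0 && j < 64);
  word const differ = ((w >> i) ^ (w >> j)) & 1; /* 1 only if the bits differ */
  return w ^ ((differ << i) | (differ << j));    /* flip both, or neither */
}
```

    If the two bits are equal, the XOR mask is zero and the word is
    returned unchanged, so no branch is needed.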

commit 03ec153a8e463dfcd7d283c3f0734da3eb237487
Author: Carlo Wood <carlo at alinoe.com>
Date:   Wed Mar 2 03:21:29 2011 +0100

    More bit hack improvements.
    
    No longer do a modulo in the macros when not needed,
    but require calling code to do that.
    
    BITMASK wasn't used. I redefined it and now use it
    to create a mask with just one bit set for one column,
    and use it everywhere where before ONE was shifted.
    
    Improved documentation of LEFT_BITMASK and RIGHT_BITMASK,
    and use them in packedmatrix.h instead of explicitly
    shifting ONE there.

commit 69e7ca6e660ed3b4d7649fa5a3d3c269c58fd2af
Author: Carlo Wood <carlo at alinoe.com>
Date:   Wed Mar 2 01:15:15 2011 +0100

    Fix inclusion of misc.h.
    
    (Only) include it in headers that need it,
    and remove it from source files that already
    include another header that needs it itself.

commit 17258f6aaaf90ba16ba7dab7a0d90c91d7e40fee
Author: Carlo Wood <carlo at alinoe.com>
Date:   Wed Mar 2 00:51:27 2011 +0100

    Bit hack speed ups and documentation fixes for misc.h.

commit b36d949d2aad94ef693442e6139950301d973b6a
Author: Carlo Wood <carlo at alinoe.com>
Date:   Wed Mar 2 00:47:41 2011 +0100

    Fixed a typo.

commit ba56ed23a28aa54cd1a72d99960ff1ae6ad42816
Author: Carlo Wood <carlo at alinoe.com>
Date:   Tue Mar 1 23:45:26 2011 +0100

    Get rid of explicit unsigned long long constants.

commit 8615f12f1eef897ff2d0e3397caff40ee70d744b
Author: Carlo Wood <carlo at alinoe.com>
Date:   Sun Feb 27 03:09:41 2011 +0100

    Use the canonical 64-bit type for word.

commit 1df28630637f011eee9d77c13c3d095f13307d1a
Author: Carlo Wood <carlo at alinoe.com>
Date:   Sat Feb 26 21:07:34 2011 +0100

    Added cwautomacro's 'autogen.sh' to generate auto tool files.
    Removed generated src/config.h.in from repository.
    Added .hgignore to ignore all generated auto tool files.
    Fixed configure.ac warning.

commit 8cfd448797d661c1ec352a5385618cad50fd785f
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Sun Feb 6 16:40:49 2011 +0000

    *** empty log message ***

commit fe5687ad95ad91179165d730758104c377984704
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Fri Feb 4 00:31:24 2011 +0000

    slightly better _mzd_combine

commit 60591fb8ed93f86d71b765d60be5be9324d3f439
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Thu Feb 3 23:35:14 2011 +0000

    don't compute the full PLUQ in mzd_echelonize_pluq() if full=0

commit ff3480e40e32d8f48e1604501fa5e411b57d518b
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Wed Feb 2 22:54:37 2011 +0000

    _mzd_pls_submatrix() only considers the currently needed words instead of whole rows
    this fixes #24 and ensures that M4RI-style PLE decomposition is O(n^3/log n) also in the worst case

commit 55a4888058a8db584593c51f68d66bca60259195
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Sun Jan 30 15:09:17 2011 +0000

    benchmark(et)ing code for sparse-ish matrices

commit 36a0565012323c2f97759f1753857c1b6f4966a7
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Sat Jan 29 22:01:23 2011 +0000

    optimised compression L in _mzd_pls() (fixes #23)

commit a901bc2dffdd660144895610b028a783d4797426
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Fri Jan 28 15:35:46 2011 +0000

    more work on compression of L

commit dfbe749ca602357e2aa6408165272189437c68ca
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Fri Jan 28 15:15:41 2011 +0000

    new function _mzd_compress_l which implements compression of L for PLS

commit 9da1005bd6017630c9b1ee37ff69c384f02d6edd
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Fri Jan 28 14:01:52 2011 +0000

    allow generic ranks in bench_elimination so we can improve rank sensitivity

commit 8bd9d6cf5feb639e92827f961c6ce250b4795295
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Thu Jan 27 20:46:31 2011 +0000

    adapt testsuite to new build structure

commit cee7187cdb7cc4ff127e23ccf37b923afabd3465
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Mon Jan 10 12:06:01 2011 +0000

    package passes make distcheck now

commit 0cd7c01150885e484f6b99f9a7175368e67851ee
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Thu Nov 18 23:59:24 2010 +0000

    slight speed improvement for TRSM upper left

commit be578bd07761b8d1ba4bbd21b0dfeee3e1a7d60a
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Thu Nov 18 23:21:20 2010 +0000

    new TRSM passing all tests now

commit 5394f76417b4f6899b2c7c9e1d5230d4aa772a79
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Thu Nov 18 23:20:41 2010 +0000

    adding optional randomized tests for TRSM

commit 9a077d5201bfc17983a56e73af3435c7c3fbbd0d
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Thu Nov 18 23:10:30 2010 +0000

    a more comprehensive test suite for TRSM

commit 64c21eef7859843f6114a8b1d2ba0be71589c2aa
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Thu Nov 18 22:29:44 2010 +0000

    rewrote mzd_make_table in order to support offset!=0 needed for M4RI based TRSM (experimental)

commit 94b0f8ffcd874d99a12c6b7dc741dda13d035fcd
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Thu Nov 18 17:42:03 2010 +0000

    implemented simple TRSM upper left using Greasing
    cf. http://bitbucket.org/malb/m4ri/issue/21/use-greasing-in-trsm

commit 68f69d3239e430121f35b54071ecff2958453754
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Mon Sep 6 16:45:01 2010 +0100

    fixing segfault in corner case of solve.c

commit 2b51d76c7e42a9141cd83f3f670c487bfbee71ca
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Tue Aug 31 12:42:09 2010 +0100

    yet another fix for system solving. Inconsistent systems *are* detected despite
    the previous commit message. However, A->offset != 0 was never supported by
    our PLUQ but test_solve.c used to assume it did.

commit a579d74099f59f4d8a22754d535befdf040d4039
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Mon Aug 30 19:37:36 2010 +0100

    fixing solving (for systems which are consistent)

commit 37cafc478a0265707d2325d5f7244781a8825a3b
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Tue Aug 17 19:12:41 2010 +0100

    Added tag release-20100817 for changeset 6758e6a445c0

commit 6f49b137ff1231283793c2d42c8c88a1354454a7
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Tue Aug 17 19:12:03 2010 +0100

    new release

commit 597fb044350daca329bf3fa4258b1f42ec1640ff
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Mon Aug 16 13:01:43 2010 +0100

    improved speed of cache tuning, seems to give good results on t2,bsd,road,prai243,redhawk,eno,iras

commit 48f9cedb1787ae1f172ec94596f9824ed0442b5f
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Sat Aug 14 16:31:26 2010 +0100

    more robust cache tuning by increasing the number of trials

commit f467f7d2b098f1106a6c54ca4ef1ad0958eb38bd
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Wed Jul 21 23:55:04 2010 +0100

    make sure the memory managers match!

commit da12640602b6d53721f27c8aac282b12f0fe1fc7
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Wed Jul 21 23:54:54 2010 +0100

    Cygwin requires no-undefined, otherwise no shared lib is built

commit 86be9caf690d80aae40c297c3c18c0a6f0ddbede
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Tue Jul 20 17:08:13 2010 +0100

    wide should be a size_t

commit a72d8aa0c597110fb4ea972db47632d0db49ed3f
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Mon Jul 19 23:18:23 2010 +0100

    refactoring to allow m4rie to reach into some of our fast routines

commit 853092939e2088f5551c584a94100cb6ef0da047
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Sun Jul 18 21:18:48 2010 +0100

    exporting all mzd_process_rowsX variants

commit fbbec54414350297e85b631da5144334a5a3bf02
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Sun Jul 18 21:18:03 2010 +0100

    Added tag release-20100701 for changeset 8513835b2a92

commit bd924e629bcff28885830dfb131ea89c449e99b9
Author: Martin Albrecht <martinralbrecht at googlemail.com>
Date:   Sun Jul 18 21:17:46 2010 +0100

    fix default parameters in configure

commit 596ab3e260c3bbd58d5b433e32670d520bc2c973
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Sun Jul 11 16:27:59 2010 +0100

    allow the user to disable SSE2 instructions
    (needed for SAGE_FAT_BINARY, cf. #9381 on Sage Trac)

commit 569b5d1102d134c601b1e92e55bc9b1fe643c1a5
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Mon Jun 28 22:20:17 2010 +0100

    updating Visual Studio project

commit 1d2c10865f579e8f7486f7651cf36c51e3eb7dd2
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Mon Jun 28 22:08:05 2010 +0100

    fixed docstring for PLS decomposition

commit 74b19f30a2f9c68561e9d16930c7138c2c9e8c73
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Mon Jun 28 21:39:56 2010 +0100

     * renamed LQUP functions and filenames to PLS
     * added echelonform.[c|h] files, which provide highlevel echelon forms

commit 8cca3368ba96fb26a36ec69f138d6cb56e6698d3
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Mon Jun 29 17:41:30 2009 +0100

    implemented heuristic algorithm which starts with M4RI and switches to PLS based
    decomposition when the remaining matrix has a density of > 0.15.

commit 58742964d379cdcc4961d0ea06191154dda97918
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Fri Jun 4 13:23:53 2010 -0700

    tuned OpenMP parameters for M4RI on sage.math

commit 790733539cac40d40ca942c2a03ab8f18bfe5c89
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Tue Jun 1 23:45:50 2010 +0100

    current OpenMP complains about return from critical blocks; also removed nested critical blocks

commit 15ac8ded206f584d553a1c4eb5d14c0487a840c1
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Tue Jun 1 23:24:24 2010 +0100

    updated to current Debian version (this file should be removed from revision control, it doesn't belong here)

commit dc8928289e07f5115f7cd2ac2a8f1e099b6b0c78
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Wed Apr 14 11:21:04 2010 +0100

    revert temporary switch to _mzd_lqup_naive in _mzd_lqup (it was just a benchmarketing test)

commit 3025aba5e1ab59fa793c9f4d012ef87552d094a1
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Tue Mar 23 23:53:00 2010 +0000

    be slightly more clever about selecting 'k' in _mzd_lqup_mmpf() by mirroring M4RI strategy

commit f24aa7eb10c5d09b3848db2404bfba81611d1a2d
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Fri Feb 19 15:20:55 2010 +0000

    fixed a bug in permutation which caused segfaults (cf. Sage #8301)

commit 606a7c2ff5e0feda430f122d1c683209b6596d95
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Tue Nov 24 23:34:42 2009 +0000

    renamed mzd_apply_p_right_tri to mzd_apply_p_right_trans_tri because this is what it does
    some sparse-ish performance enhancements

commit e313afed53a29c584ffd77771a540f92c1d4ffa2
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Tue Nov 24 21:25:02 2009 +0000

    only perform column swaps on non-zero rows in mzd_echelonize_pluq. For some sparse matrices, this gives an advantage

commit b13a65ec9bd0df852911a660d41c7dbfb365de6a
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Thu Nov 19 10:40:31 2009 +0000

    considerable portability improvement in configure.ac due to David Kirkby
    cf. http://trac.sagemath.org/sage_trac/ticket/7375#comment:6

commit 2253817e29c2db48f15ab32365b486036eb99777
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Wed Nov 18 16:40:53 2009 +0000

    defaulting to '0' instead of 'unkown' in ax_cache_size.m4. This should make things more cross-platform

commit c53933588d5535c5d15d6056ba250a9b6889ae7c
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Wed Nov 4 11:47:34 2009 +0000

    fixed doxygen warnings

commit bee51b0912eec55a918b00b9bd0889f138e968a4
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Wed Nov 4 11:39:11 2009 +0000

    Added tag release-20091101 for changeset 66644740d92d

commit f00c655dd119e13105651a8992d524655baf47d3
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Tue Nov 3 12:35:43 2009 +0000

    fixing warnings/errors reported by Microsoft Visual Studio

commit 98bb3bae43c21749b301ec2e4eb3ec2fee42f946
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Mon Nov 2 16:16:54 2009 +0000

    another sizeof(size_t) != sizeof(word) bug

commit 42c077d18f4adbc1f3571ad42f593784d48a8abe
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Mon Nov 2 15:11:17 2009 +0000

    fix bug which led to wrong results on t2.math.washington.edu

commit 13c20fa55d6c20df1abdec275f56781c7290a4ab
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Mon Nov 2 10:44:57 2009 +0000

    changing the soname version to 20091101 in preparation for new release

commit b6e30a2d367bd17be56250a08a070f536abe0036
Merge: 37e42f4 a3ef911
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Mon Nov 2 07:05:52 2009 +0000

    merge

commit 37e42f4e24659dd703d4af32f28d99d6cf78f15f
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Mon Nov 2 07:04:56 2009 +0000

    fixed potential segmentation fault in mzd_row_add_offset

commit a3ef91134fd2b32460d254744f7d560b42ea37f4
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Sun Nov 1 16:44:35 2009 +0000

    moving 'step 1.5' of LQUP MMPF to _mzd_lqup_submatrix because it caused confusion that the postprocessing is outside of that function.

commit 06c8f5a35112ee828117534b1aa6ea6803a622ff
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Sun Nov 1 14:47:30 2009 +0000

    implemented timing experiment to calculate L1 and L2 cache size. This isn't working perfectly yet and thus it is only optional for now.

commit dcb6e833cfa027a496e4979bcd8ebc85e09d916d
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Wed Oct 28 21:31:11 2009 +0000

    whoops, forgot to check in configure.in

commit 6ca80f4cf039bd4b7766b1f0018eafa6d5c3c8fe
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Wed Oct 28 21:25:37 2009 +0000

    don't check for the number of CPUs on configure. The macro is not cross-platform and we don't use it anyway (fixes #16)

commit a0641d9ba654a2ba86222d9e1b5c2127e211bef1
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Wed Sep 9 19:36:26 2009 +0100

    improve performance of mzd_transpose using Hacker's Delight bit-fiddling trick (closes: #15)
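    
    For illustration, the well-known Hacker's Delight style block-swap
    transpose on an 8x8 bit matrix packed into one 64-bit integer. This is
    a sketch under assumptions, not the mzd_transpose code from this
    commit: the function names and the layout (row i, column j stored at
    bit 8*i + j) are chosen for the example only.

```c
#include <stdint.h>

/* Transpose an 8x8 bit matrix stored row by row in x.
 * Each step swaps the off-diagonal blocks at one level:
 * single bits inside 2x2 blocks, 2x2 blocks inside 4x4 blocks,
 * then the 4x4 blocks of the whole matrix. */
uint64_t transpose8x8(uint64_t x) {
    uint64_t t;
    t = (x ^ (x >> 7))  & UINT64_C(0x00AA00AA00AA00AA);
    x ^= t ^ (t << 7);
    t = (x ^ (x >> 14)) & UINT64_C(0x0000CCCC0000CCCC);
    x ^= t ^ (t << 14);
    t = (x ^ (x >> 28)) & UINT64_C(0x00000000F0F0F0F0);
    x ^= t ^ (t << 28);
    return x;
}

/* Bit-by-bit reference implementation, useful for checking the fast one. */
uint64_t transpose8x8_naive(uint64_t x) {
    uint64_t r = 0;
    for (int i = 0; i < 8; i++)
        for (int j = 0; j < 8; j++)
            if ((x >> (8 * i + j)) & 1)
                r |= UINT64_C(1) << (8 * j + i);
    return r;
}
```

    The fast version uses three masked XOR steps instead of 64 single-bit
    moves; the same idea extends to larger blocks with more levels.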

commit ffcb62356cecb4e4c63cc3eeeb15b0caaa2b110d
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Tue Jun 30 13:08:22 2009 +0100

    use L2_CACHE_SIZE for PLUQ cutoff (experimental)

commit 371647a613874227692448fd33f29edd3d524630
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Tue Jun 30 10:21:49 2009 +0100

    copy submatrix to temporary when switching to MMPF

commit 0c27162e37b44c527642e3a535bb39f7504c5f3a
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Sat Jun 27 14:39:27 2009 +0100

    added _pluq_mmpf back for debugging etc.

commit 75ce4c4ac4d7a25a1403400db6e222d54dc91049
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Sat Jun 27 09:01:08 2009 +0100

    don't apply permutation if todo rows == 0

commit 81bfc1c77dd1256d71c527c8ac9c08af7500b7cb
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Sat Jun 27 04:12:38 2009 +0100

    some performance improvements for sparse-ish matrices

commit 0a1f1a7072f4f49773714e897235ab6f4c94bdfa
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Fri Jun 26 19:22:03 2009 +0100

    fixed a bug which escaped me for the last check in because I didn't check with cutoff=64

commit 79cd369289b5495996d90678dfc9cb0284cdf941
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Fri Jun 26 18:38:23 2009 +0100

    improved performance for LQUP factorisation to roughly match that of PLUQ, still work to be done
    to improve upon PLUQ

commit f833b38265bfbc2510237873195ec6438f1ba5a1
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Fri Jun 26 15:56:13 2009 +0100

    switching MMPF from PLUQ to LQUP and enabling it

commit f71b32d1f8d76c5f6266699a295b7207a916ecb7
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Fri Jun 26 15:42:17 2009 +0100

    Added tag release-20090617 for changeset 46b89e01b348

commit b478058a069e81c2182584656f45784d973880c9
Author: Clement Pernet <clement.pernet at imag.fr>
Date:   Fri Jun 26 16:27:20 2009 +0200

    Switch PLUQ -> LQUP
    Test suite passes
    Need optimizations

commit adee03389dbc7ad12dc487a0bf3fbfa1d44d4329
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Fri Jun 26 13:41:15 2009 +0100

    made mzd_apply_p_right and mzd_apply_p_right_trans more efficient to decrease the penalty of column swaps.

commit d789c5c599a3717bf670dbbb5ca7f43b4ceb9b88
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Thu Jun 25 22:08:49 2009 +0100

    only swap at the end of the base case, not while finding the pivots. This allows a more
    performant search for pivots since we don't fill up the matrix with zero columns. This improves the 'sparse' case considerably.
    
    Joint work with Clement Pernet.

commit 1ca96154fe41c7aae8fc562292dc65f14b55e31a
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Tue Jun 23 14:03:24 2009 +0100

    implemented adding 3 and 4 rows in one step for PLUQ MMPF and adapted constants accordingly

commit d8a2e34127d50237a928e7c72ec71a44f16024c0
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Mon Jun 22 09:48:37 2009 +0100

    fix bug in mzd_is_zero() where small zero matrices wouldn't be reported as such

commit 73f6f8095253f0940450420f7df8b801ba8dd491
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Thu Jun 18 18:16:47 2009 +0100

    switch back to using threads if any additional thread is available, don't require at least four

commit 4401f1cc33b670c26e2fd445b3260088998635a8
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Thu Jun 18 09:55:10 2009 -0700

    added low-level parallelisation to process_rows2_pluq and specified that the parallel sections in mzd_mul_mp_even() should use num_threads(4)

commit 0a74845f587ca25bdf03a30a8edff6829f5c992f
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Thu Jun 18 15:03:46 2009 +0100

    fixing OpenMP doctest failures

commit 3db4efffe2240803789ee77af05652aad7580ca1
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Thu Jun 18 14:56:13 2009 +0100

    experiments with OpenMP

commit 93ded0a3481dd34eebfcbcee888d84a00bf8b50b
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Wed Jun 17 16:39:43 2009 -0700

    fix compilation with --enable-openmp

commit ff6150f95395a6993d951c9485d226d637d9ea76
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Wed Jun 17 14:41:27 2009 -0700

    fix L1 detection on OSX x86

commit 75abf88c5a70cabf2dd817d903b4c42a29e2f157
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Wed Jun 17 20:40:43 2009 +0100

    yet another fixing attempt for cache size detection

commit 622221cbd228ea27016fd851030e9d1f63890cef
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Mon Jun 15 12:07:19 2009 +0100

    do not prepend zero in cache size detection since that will trigger octal interpretation of the result

commit bbc1ab911d22c5101af7170f891b71cca347397a
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Fri May 29 12:57:04 2009 +0100

    fixed testcode for mzd_kernel_left_pluq()

commit b505de5bf3aaa72ed21463cac0248c31d6387696
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Fri May 29 12:23:01 2009 +0100

    added test code for mzd_kernel_left_pluq()

commit 2c2fc8cf0d06648ac39055a5ebabd75c583f9417
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Thu May 28 23:59:11 2009 +0100

    implemented mzd_kernel_left_pluq to compute kernels via PLUQ

commit c268996232f1e7f6e6fc271b4e5a457df1bcb451
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Tue May 12 02:27:21 2009 +0100

    fixed release soname

commit 7442fcdb2e0e1cec86bb06c078b1fbafbbf25cfb
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Thu Apr 9 11:27:30 2009 +0100

    set max malloc size to 1GB

commit 181bab66245db75853c3574471000a45101089f5
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Thu Apr 9 11:21:17 2009 +0100

    fixed a few warnings and one error as reported by MSVC

commit b95acce69d985f69519b49f98eaded288ab04278
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Thu Apr 9 11:08:18 2009 +0100

    added Marco as an author

commit 1bd98f275b04c5b387ded348d2d952a96722664e
Author: Peter Jeremey <peterjeremy at optushome.com.au>
Date:   Sat Mar 21 13:04:52 2009 +0000

    this patch solves:
    1) The 'o: not found' problem
       caused by a stray 'o' in configure.in
    2) The 'test: x: unexpected operator'
       caused by non-standard test syntax in configure.in
    3) The 'arith: syntax error: " 16#40020140"'
       caused by non-standard $((...)) syntax in m4/ax_cache_size.m4
    4) The failure in "checking for x86 cpuid  output... unknown"
       I'm not sure why this check is done at all but I've corrected the
       syntax in ax_gcc_x86_cpuid.m4
    5) The failure to correctly detect the number of CPUs
       Caused by badly broken code in m4/ax_count_cpus.m4
    6) Several autoconf warnings caused by incorrect cache variable names.
    
    I've also enhanced m4/ax_cache_size.m4 to verify that it's safe to
    query the CPU cache size before doing so.

commit a0cf4ed6ee225b80d90fd4b9ae5b91aac9d548f4
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Thu Mar 19 15:40:22 2009 +0000

    cleaning up the new code

commit cab25a1fd44181a35647c2a6268c56923b64fb74
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Thu Mar 19 12:00:12 2009 +0000

    refactored packedmatrix to allow more than one malloc call to allocate the matrix
    also renamed packedmatrix to mzd_t and permutation to mzp_t

commit e246edc9d73594a02ad6237c67ee7ec5081062b9
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Fri Mar 13 17:48:09 2009 +0000

    call _mzd_mul_va from mzd_mul_naive if appropriate

commit 69bf5d2c3f3056717bcffc4d5064e82d9c0595a7
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Fri Mar 13 16:53:52 2009 +0000

    fix bench_elimination.c vs. new mzd_echelonize_m4ri() API.

commit 60f5186724b563e7eb16730151e8b16ee12e699e
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Fri Mar 13 15:44:57 2009 +0000

    some trivial doxygen fixes

commit e1fc9b2efb3b0bb1d69b1306d566b7bb1e5fb308
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Fri Mar 13 15:36:09 2009 +0000

    improved documentation (added docs on return values) and removed redundant parameters from mzd_echelonize_m4ri()

commit 982e6a18cbade0620f555bd7d77f158c27d30487
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Fri Mar 13 14:11:39 2009 +0000

    remove unnecessary if() statements in mzd_pluq_mmpf()

commit 8146b2ba99c32c0591dba915a07d32cd0c71cb95
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Fri Mar 13 14:11:15 2009 +0000

    fixed a SIGSEGV in mzd_echelonize_pluq() when full==0

commit 732be065a5720ce7661666739062089a5378bb75
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Sat Feb 7 19:30:25 2009 +0000

    renamed Marco's functions _evenb -> _even and the original functions _even -> even_orig to
    make sure that users actually call Marco's version.

commit 17933e484ca9237401a9a2a962de14e6a3b084dc
Author: bodrato at mail.dm.unipi.it <bodrato at mail.dm.unipi.it>
Date:   Tue Jan 13 20:07:32 2009 +0100

    Added new functions for addmul and addsqr using new sequences.
    Added also some trivial tests to test_multiplication.c

commit 961a545fa80d464d6dce5684104e75c35cbb0bda
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Sat Jan 10 16:52:25 2009 +0000

    inlining a couple of often called functions, this should help a bit

commit 315dd1397aa623d83fd9eff9ee4d25c0cd7e6107
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Sat Jan 10 16:15:28 2009 +0000

    make m4ri_coin_flip static inline to remove noise from oprofile run

commit fc820bf0f6cf4cd1030d4f7f40da89ede0c521af
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Sat Jan 10 15:47:09 2009 +0000

    make bench targets depend on Makefile

commit a6c9f9e8aadb49b64ee4b7d7923def43ffbd9101
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Fri Jan 9 13:11:15 2009 +0000

    small clean-up in mzd_copy()

commit 56e32b22e7b746e798fba999f6a4d3612bb6fd1a
Author: bodrato at localhost <bodrato at localhost>
Date:   Sat Jan 3 15:13:29 2009 +0100

    New Strassen-like sequence for multiplication, and squaring.

commit 25d8a96fa37428d5dfcb82722b327a63ec6c74d7
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Mon Jan 5 12:45:47 2009 +0000

    Added tag release-20090105 for changeset 0b25b0a1474a

commit bb0baaad494a1d5958e6ad5f6371fcbb6d8ca0ea
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Mon Jan 5 12:42:15 2009 +0000

    updated MSVC project to include pluq_mmpf and solve

commit f61bffe3fb8f7e74125178ada78b0bce67a8e738
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Mon Jan 5 12:40:40 2009 +0000

    fixed MSVC compiler warnings

commit 9a7b5b30ff09667c78d9d0f4a945c06c3eabaf1e
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Mon Jan 5 12:17:29 2009 +0000

    fixed doxygen warning

commit b27f2f26ca8bff784eb023835c30d6c572a1d616
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Mon Jan 5 12:09:48 2009 +0000

    preparing for release 20090105

commit 309149c2448210e0bfa2ebb1c1f4b12cbf71ac3b
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Sun Jan 4 21:38:33 2009 +0000

    fixed memleak in test_solve

commit a70b47883b5a219f8dcc8e23453c5fe5665a6bac
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Fri Jan 2 13:59:28 2009 +0100

    improvement for sparse matrices in M4RI

commit 0b735b5e2182b02ff995596499c747cab21bf893
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Fri Jan 2 00:37:42 2009 +0100

    improved cache friendliness of column swaps in LQUP

commit 47c83feed2969e5fdf3c4b40cb3fb9149033392c
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Fri Jan 2 00:09:09 2009 +0100

    apply_p_right_trans() more cache friendly

commit b6a94567749b9f3d4d62cabc279ccff4f88c988f
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Thu Jan 1 22:37:18 2009 +0100

    added mzd_density()

commit 77550f7f85f411c1c2a8d9de7e8c1fd019bb00ea
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Sun Dec 28 18:19:27 2008 +0100

    spend less time in mzd_process_rows when in mzd_process_rows2_pluq

commit c5c531406df3f7243db5547cba59236f61cb87f0
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Sun Dec 28 16:47:38 2008 +0100

    improved bench_elimination to allow choice of algorithms (mmpf, pluq, naive)

commit dfe011132bc8846c9c5ccdcaed7cd65671d1e28c
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Sun Dec 28 16:36:03 2008 +0100

    fixed PLUQ MMPF bug

commit 0f247d1820f7c19d1d56c7291aa9e86fdcf8b85e
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Sun Dec 28 01:19:51 2008 +0100

    fixed some minor bug in TRSM

commit 56bce6e88709f4b97492e9eb99d9fd9fd6928b4d
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Sat Dec 27 18:59:06 2008 +0100

    more testcases (mzd_echelonize_m4ri() fails)

commit 056282f3e77f4da4d1c06a6c10867ec1b513e429
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Sat Dec 27 17:27:58 2008 +0100

    better handling of sparse matrices in MMPF

commit 00e4d5f56ede1427affa2b655e575a5b0fcb9bdf
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Sat Dec 27 16:04:14 2008 +0100

    renamed a fullrank -> halfrank in testroutine

commit 65af57e48035aa582e37eacd8c7430f56a84a0e2
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Sat Dec 27 15:06:47 2008 +0100

    new strategy for dealing with not-full rank submatrices in MMPF

commit 540e5b4514754f041784a303dad4754be79b3a6e
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Fri Dec 26 18:23:00 2008 +0100

    added COPYING file to repository because autotools insist on GPLv3+ while we're GPLv2+

commit b3072cc22d2d1ebeb94c6c16350d48539790022e
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Fri Dec 26 17:56:01 2008 +0100

    yet another printf fix

commit 5aa254631656826e3da43232ef9e2cd1f63cd596
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Fri Dec 26 17:54:28 2008 +0100

    fixed warning in test_solve

commit a934a5dd77d5c16f58eef3d59fb2efb144b1ff45
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Fri Dec 26 17:37:33 2008 +0100

    fixed solve for full rank A

commit 096e3f45eb981990d5a5f237a2d23082442303e1
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Fri Dec 26 13:54:39 2008 +0100

    made some todos more visible

commit 352b02307f658c34a27bb1abcf3740e3d0f8eec5
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Fri Dec 26 12:11:33 2008 +0100

    updated AUTHORS

commit de69e2fd2c55d955b5b55c061af25ba9a9387ec4
Merge: 2300348 72f6dca
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Thu Dec 25 23:18:28 2008 +0100

    merge mzd_row_add_offset move

commit 23003482f66005f4830bf1de968d84369da62931
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Thu Dec 25 23:03:13 2008 +0100

    mzd_print_matrix -> mzd_print; mzd_mul_m4rm_t removed

commit e8d56c57ee314761d1d867a2dbf1d63789590b5b
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Thu Dec 25 22:53:13 2008 +0100

    changed API and updated docs: mzd_reduce_ mzd_echelonize_

commit 72f6dca5d50489ea4e50582c97823e4e3fd6cdd0
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Wed Dec 24 13:43:52 2008 +0000

    mzd_row_add_offset not static inline anymore

commit 6c24c702e4b6fa344fc57144ea049c7ebdd3f7bb
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Wed Dec 24 11:47:03 2008 +0100

    fixed MMPF

commit 668c993ceb43d07e8b261c54ca8192b071c9be49
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Tue Dec 23 18:00:02 2008 +0100

    implemented mzd_echelonize_pluq (mzd_echelonize_FOO is so much better than mzd_reduce_FOO)

commit 5c1ac5af174c07d422384dd8d688c4c6142782df
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Tue Dec 23 15:00:33 2008 +0100

    slightly faster column swaps?

commit b0bd9aa12d8e9b80727edefc0efed83c68dd970d
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Tue Dec 23 14:34:34 2008 +0100

    use fast pivot searching code in mzd_reduce_m4ri

commit 072ce641a0a5373903c5104137a661d3194e8f03
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Tue Dec 23 13:52:48 2008 +0100

    massive speed-up for sparse matrices

commit ddfb1277396ce8187bad7c33c1e33f58d4053882
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Mon Dec 22 18:02:44 2008 +0100

    commented out SSE2 attempt for mzd_col_swap

commit b7546a643e7104b26fbd9482035be401fc7cf74d
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Mon Dec 22 17:50:25 2008 +0100

    factored out pivot finding to fast subroutine

commit 46319871eaf660b79b381514cffe21f20e2f2aa1
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Mon Dec 22 15:54:16 2008 +0100

    allow half rank in bench_elimination

commit b71c0fd999273e0ca8ae0f2f19072ff717badefa
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Mon Dec 22 15:50:35 2008 +0100

    clarified documentation

commit b77efda242a29b1991bb427f805dc908829678d3
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Sun Dec 21 01:50:24 2008 +0000

    improved (faster) pivot search in MMPF

commit 7a89588ad2fbc1232cd715c7adf4ddab6ff28ba7
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Sat Dec 20 23:09:27 2008 +0000

    faster M4RI for sparse matrices

commit 16bccc9c2e248ba2c71386aa3b5873547da1295d
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Sat Dec 20 21:31:30 2008 +0000

    PLUQ is really really slow for e.g. half ranks. Some code to fix this but no luck so far

commit 199dbf7ab85fd533fead22529dac0292692f4514
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Sat Dec 20 19:16:08 2008 +0000

    factored out PLUQ MMPF and wrote faster MMPF routine

commit b67ae3925344394df3dd02e3753c2e33b8066005
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Sat Dec 20 17:54:33 2008 +0000

    better strategy for column swaps in mzd_pluq_mmpf (still way too slow for matrices with r << n)

commit b89f15dcc2c9828771550e49fe9488bb82f763ab
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Sat Dec 20 17:52:49 2008 +0000

    use m4ri_random_word in mzd_randomize (todo: check randomness)

commit 02a3b2fc13091db6ef42b5406ea59f21ee97b4b8
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Sat Dec 20 17:51:58 2008 +0000

    added m4ri_random_word (check randomness of output)

commit fca50c10511df4196368fb6ccb33fbb20dd57db2
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Fri Dec 19 17:11:18 2008 +0000

    PLUQ factorisation with MMPF base case seems to be working!

commit c31c38d932e0b1388d45c0ee0b2b656d763f7210
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Fri Dec 19 16:20:13 2008 +0000

    a supposedly working PLUQ implementation (doesn't work with MMPF yet)

commit 6d09fc5864f1d575bb37e076dd420cd86b920064
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Fri Dec 19 12:26:16 2008 +0000

    -fixed spelling of naive across the board
    -alternative take on Q update (still buggy)

commit 6a4fd1ea0509c2ea1a858fa5f63459a12c7190e9
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Fri Dec 5 22:27:47 2008 +0100

    doctest should cat LQUP failures for smaller examples

commit f8ad2923d4e0363c046398eca322cd15b7821dfa
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Fri Dec 5 18:59:07 2008 +0100

    better crossover and 'better' Q update

commit b5b8121f590f5b2a23b6a1bd5b8ddde295e43ab6
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Fri Dec 5 18:26:39 2008 +0100

    mzd_col_swap is a bit faster now, fixed memleak in bench_lqup

commit e786f350bbdcadf94d1b1b182eab5d511b0ca14e
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Fri Dec 5 16:20:43 2008 +0100

    documented/cleaned up MMPF

commit 50f7e47e0952efe19c73df31780af864b150c967
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Thu Dec 4 16:34:23 2008 +0100

    remove debug printing

commit b55b952c6a326a9934d3bdd8b8e16b626fe69a3d
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Thu Dec 4 16:18:17 2008 +0100

    mzd_submatrix accepts offsets now

commit e1f197c2bd27c7fb696ec04a4c34d9a58ecb8626
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Thu Dec 4 15:33:50 2008 +0100

    cleaned up some cruft left over from debugging sessions

commit 2a354c0d122c7617faa9618b0f3678a1224575a3
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Thu Dec 4 15:17:23 2008 +0100

    PLUQ permutations are still wrong, MMPF might be alright

commit af8ec2428b95d2cca7e7fd77267ac9d3c3b3a3b3
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Fri Nov 21 15:11:17 2008 +0000

    and added mzd_row_clear_offset again

commit 9a3e25a1556d44d1f899266c01833264479bd28c
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Fri Nov 21 14:51:39 2008 +0000

    added (untested) mzd_copy_row function. The function is based on Michael Brickenstein's copy_row.

commit 549a35846749170d4c280a7d770f1fa06d83ee35
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Fri Nov 21 14:37:06 2008 +0000

    removed a lot of old functions that were not needed anymore

commit db628d9cb619ea3a5fe67013c75a9f91514b2508
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Fri Nov 21 14:01:48 2008 +0000

    added optimized function for v*A where v is a (1,d) vector and A is a (d,d) matrix. The code is
    based on code contributed by Michael Brickenstein (mult_by_combining_rows).

commit 830d4eeef8f80b0e9643d15ac7dc64501fdcc4b0
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Fri Nov 21 12:27:49 2008 +0000

    bumped version in Makefile.am to aim for release for end of month

commit 51116957738c55f213946bbf5be2d5224a70de23
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Thu Nov 20 19:27:57 2008 +0000

    Q seems to be correct now for MMPF

commit 759b9c44b472ed74ea8af14f85d71d8f15cbd9bd
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Thu Nov 20 19:27:39 2008 +0000

    added method for permutation printing

commit be618e0c336df062d387d029f7b3f08fe20fb3ab
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Wed Nov 19 21:42:41 2008 +0000

    MMPF: deleting L for now for debugging purposes, once Q is correct, don't kill L

commit 118c8aa8d7d996530795972604740b585f25025c
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Wed Nov 19 18:55:22 2008 +0000

    renamed LQUP -> PLUQ

commit 8800dde7dd4bc603b3fb30057af16cfa5f2f620a
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Wed Nov 19 17:20:01 2008 +0000

    remove some assert(M->offset==0)

commit 4208ce31696e261181f6d9492ebb015b6bd63178
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Wed Nov 19 16:47:31 2008 +0000

    some minor clean-up after fixing the TRSM tests

commit be7e755989e4e225e515a1026c352c9d67ccd6d9
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Wed Nov 19 16:43:11 2008 +0000

    fixed a bug introduced by fixing RIGHT_BITMASK

commit 0dce193ce87e38321f9f7fc605bf56c38bb03a9d
Author: Clement Pernet <clement.pernet at gmail.com>
Date:   Wed Nov 19 15:55:55 2008 +0000

    Added the 2 remaining trsm and the corresponding testsuite and benchmarks.

commit 528a30c9a5bf9cfb85fbb7acd987faa048474c1d
Author: Jean-Guillaume Dumas <Jean-Guillaume.Dumas at imag.fr>
Date:   Wed Nov 19 15:08:35 2008 +0000

    * added is_zero
    * added linear system solving using pluq
    * added appropriate test suite for solve

commit 588ba0185cd37546384732564b34f4889a0f394a
Author: Michael Brickenstein <brickenstein.mfo.de>
Date:   Wed Nov 19 11:52:17 2008 +0000

    don't bail out in mm_malloc if asked for nothing

commit dc4706b2164d8e8daf9e414aade1d619b84c2a10
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Wed Nov 19 11:47:00 2008 +0000

    PLUQ MMPF work in progress

commit 407090cc19be7c573944b84362c8dcd2a70ff44e
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Sun Nov 16 22:37:56 2008 +0000

    PLUQ work in progress

commit 2c2aa7edf72371f4bb5c29387e3cfb6e28f1f685
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Wed Nov 12 21:09:20 2008 +0000

    faster LQUP (use Strassen instead of M4RM only) and more comprehensive test suite

commit bfc6b75399af8ff182db0edcc6e43de8c9080e45
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Wed Nov 12 19:00:55 2008 +0000

    improved testsuite build process

commit 513207e1d430216d890b723b43023dc9feb39a90
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Mon Nov 10 22:03:31 2008 +0000

    do not use rowswap array to swap rows, always copy:
    (1) to fix bug in rowswapping matrix windows
    (2) to improve data locality

commit 469a3b3b347878dcfb5a0f3c9783e849fbf21005
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Mon Nov 10 21:20:24 2008 +0000

    update/correct license statement in source files. M4RI was always GPLv2+

commit 4dd9fc8daf2ad2befbb813e80b283bbfa531feec
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Mon Nov 10 20:58:20 2008 +0000

    I'm just playing with MMPF LQUP (not to be taken seriously)

commit c03bec41870cf5c9ebdbe6d063fb6141d6e6a33e
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Tue Oct 28 19:42:11 2008 +0000

    fix doctest failure under OpenSolaris

commit 7188b801d00bcfcd91c348af8bfd4f10ff459eb4
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Tue Oct 28 18:59:22 2008 +0000

    update MSVC project file

commit eeaa6a970a1a294a8c94433ef00ad9d565bfb350
Author: Clement Pernet <clement.pernet at gmail.com>
Date:   Tue Oct 28 18:40:42 2008 +0000

    fix LQUP doctest

commit 703f3b64838f0a44afff55154ee50afd5068df26
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Tue Oct 28 18:23:47 2008 +0000

    fix two MSVC warnings

commit ff3c77cfad3bb38f3c2844886969bad81a74315a
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Tue Oct 28 17:53:21 2008 +0000

    enabling LQUP doctests

commit 809a95a69a476789fd66b77a315c59d399bd90d3
Author: Clement Pernet <clement.pernet at gmail.com>
Date:   Tue Oct 28 17:43:37 2008 +0000

    Work in progress on the LQUP front: fixed a bunch of bugs, and got LQUP working on full rank matrices.

commit 04323cebdddeb161e81a00e862bc5b891878c128
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Tue Oct 28 17:11:51 2008 +0000

    new release

commit f41ba0018a8d8dd94bc270021d9a2b142b4edc96
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Mon Oct 13 14:41:55 2008 +0200

    improved Makefile.am and added make check

commit c24f84c0b245346d10e6875b0460ca25234d17c6
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Sat Oct 11 10:16:18 2008 +0200

    better benchmarketing code

commit ebeb5e92b2b36c7c32fb7ba386ca511b8acb40df
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Wed Oct 8 17:17:06 2008 +0100

    fixed bug in trsm routines

commit f123e20e8d2286f2003dc002e8eaca3c95d27754
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Mon Sep 15 20:50:35 2008 +0100

    fix mzd_add speed regression

commit 089e243fc77a094e650e9dbae76a892aa2605b99
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Mon Sep 15 20:36:51 2008 +0100

    make C++ compiler happy by fixing m4ri_die's signature

commit 720e6c9dbd3b80445bb456804cf03c06f66368cb
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Sat Sep 6 16:24:05 2008 +0100

    LQUP basecase (slow but seemingly correct for square matrices)

commit 8ee1fd075dfc40d9df6629b97421c36db14e8a87
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Sat Sep 6 16:10:04 2008 +0100

    some more work on LQUP but not working yet

commit 336df66a5268384ea249219f5cee518ee4350901
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Fri Sep 5 18:17:55 2008 +0100

    anakha's fix again for configure's cache detection

commit 393901cc226ea846973614b647c45b8d7aa5753f
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Fri Sep 5 11:01:39 2008 +0100

    playing around with LQUP

commit ef73ba280cef2ac5e23c7eead486dcfa81cecb6e
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Fri Sep 5 11:00:48 2008 +0100

    checking in fix by anakha

commit 25a88a3f438bad367935ebcda5c84c78eff5d435
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Thu Sep 4 15:50:17 2008 +0100

    some work on LQUP basecase, not working yet

commit 7080ad3d9739d8271dd6faefe838f1f2b33055a6
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Thu Sep 4 15:50:04 2008 +0100

    suppress redundant output

commit 4ad753523dac2d714afcabd9c61e9021e2d6b327
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Thu Sep 4 10:55:15 2008 +0100

    Added tag release-20080904 for changeset ce71e2c84ad1

commit 7574ec02d26bee06c8ae79cb9f056107ad011559
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Thu Sep 4 10:53:26 2008 +0100

    added RIGHT_BITMASK equivalent for LEFT_BITMASK and (hopefully) made the code more readable
    by factoring out the bit shifting to a macro

commit e987c9bb21d1ab641448eff52942fb2228f5a044
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Thu Sep 4 10:33:25 2008 +0100

    ... and reverted my changes again since they don't work

commit 6498839c457ecbe5f05ccc398ed168526a1df751
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Thu Sep 4 10:22:53 2008 +0100

    checking in Arnaud Bergeron's cache detection fix for PPC + my adaptation

commit 5023095c3285b0f9947531b18200d2bf4778ebf2
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Wed Sep 3 19:38:40 2008 +0100

    fix/unify bit shifting bugs as exposed on Itanium

commit 70f39d308efb05c32c6e2c227257d21a7df28087
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Wed Sep 3 18:07:03 2008 +0100

    fix cache size detection handling

commit 3c04873a96cad47e92cc29b24712caf34565e6ef
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Wed Sep 3 17:35:37 2008 +0100

    Added tag release-20080901 for changeset bf3d55ccb73b

commit dbca7bc5719bdbad58cf72fc867634c0cc61bfbf
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Wed Sep 3 17:35:15 2008 +0100

    release 20080901

commit 0e02578bf93f5006e24a2656369b7939e7c3a5ed
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Mon Sep 1 12:09:04 2008 +0100

    more scratch code for LQUP

commit 5f7b5a80237c57a4e94aceb7bd84b45289bcade0
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Mon Sep 1 12:08:53 2008 +0100

    fix memleak in addmul

commit 3048846c48b068ffb5fdccb11b2709f719727905
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Thu Aug 28 16:09:30 2008 +0100

    more work on LUP, still not correct

commit 0693ffbe9efde7af25ee37b07e46127667d6f8cd
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Thu Aug 28 16:09:16 2008 +0100

    removed watch.h from m4ri.h

commit 473b2e5c233272f0678d76dba7f59bc41b6c274c
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Thu Aug 28 12:40:51 2008 +0100

    fix warnings issued by ICC & remove unused watch.c/.h

commit b4d15ddc42ec323517bf5a988fab5511ae8a4097
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Thu Aug 28 12:10:10 2008 +0100

    work on LQUP (or LUP right now)

commit 2e7c501ee3c0a0725720c2a9c682b0818685d149
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Tue Aug 26 15:24:25 2008 +0100

    Added tag release-20080826 for changeset 6b307aa254cb

commit 47e64ba936ce2cfa507f253752504e4a82d36f0a
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Tue Aug 26 15:23:51 2008 +0100

    new release

commit 9d21cb718cae2434808219232379f1995c31c0c6
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Tue Aug 26 15:23:02 2008 +0100

    fix a SIGSEGV and sometimes wrong results for matrix multiplication

commit 362fa317e203686037710ce1886bd0bf2b612d9c
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Thu Aug 21 22:02:56 2008 +0100

    new strategy for k for multiplication, should fit Opteron and C2D

commit 44db00205c5e38ba47e1d5698f49484b57b017b7
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Thu Aug 21 14:21:26 2008 +0100

    documentation update

commit 88aa669bafd3c82910e02c6f96f717502f850299
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Tue Aug 19 16:57:18 2008 +0100

    slight coding-style clean-up after merging Clement's patch

commit 41f375d1042e98db00c58484b238196a16ff8a3c
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Tue Aug 19 16:27:33 2008 +0100

    merge of Clement Pernet's patch:
    --------------------------------------
    Intermediate progress to the matmul based LQUP implementation project:
    
     * Introduces offset for matrices, updates mzd_copy, mzd_mul_naiv, all the matmul code, to be consistent with odd offsets
     * introduces the triangular system solving with matrix: left and right looking with upper and lower triangular matrices.
     * add a test_trsm test in the test suite

commit c496849f79956df5455e4826d5157be0a580467e
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Mon Aug 18 16:46:50 2008 +0100

    fix docs

commit c392c2daccd0ca6522afd3b5898d8435b30e2567
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Mon Aug 18 16:46:37 2008 +0100

     - fix compilation with MSVC
     - enable shared lib constructor/destructor for SunCC
     - fix docs

commit f3adb57ac3e5abacc4b243c4923de52bd402d3c4
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Sun Aug 17 16:43:09 2008 +0100

    define CPU_L2_CACHE in misc.h if it isn't there already
    revert strategy in elimination to good compromise across platforms. 2x strategy is best on Opteron but not best on C2D

commit 0ea343b3c24a22a8d1462a06489d229419f29fbd
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Sat Aug 16 17:27:31 2008 +0100

    new strategy for k in M4RI, seems to work well on Opteron and C2D

commit 9b35fcf77f28f891b44c452492bfd1d5ac3edab9
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Thu Aug 14 21:27:03 2008 +0100

    renamed reduction to elimination

commit d9cacdbe728bbef7448525531d54a157a16c7446
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Thu Aug 14 21:23:41 2008 +0100

    preparation for next release (targeted: Sunday)

commit fdb6b8249d0b87e7dbd4085b6292d25703a51dad
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Thu Aug 14 21:22:38 2008 +0100

    updated README and AUTHORS

commit ea0e7f5cb08a080566928e6b9218ff931e199004
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Wed Aug 13 18:22:29 2008 +0100

    changed strategy for parallel multiplication to block-parallel-then-strassen

commit 21783c25ca81904971da8581915331724a799d29
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Wed Aug 13 16:33:04 2008 +0100

    adapted parameters for Opteron

commit 717dc9669a478613023b335637a6fc638b3b2745
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Wed Aug 6 19:51:57 2008 +0100

    removed proximity schedule again

commit dfd294bbf09b2ab52f96d7494d219a66b2df0031
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Wed Aug 6 19:51:32 2008 +0100

    added "proximity schedule" from FFLAS, but that doesn't seem to improve performance

commit 016688624063450d264130ca230f5ed65388fd80
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Wed Aug 6 19:51:06 2008 +0100

    __SUNCC__ -> __SUNPRO_C__, untested

commit 29cdd9ff4498cc7fad7c5607d1bd4f5cd8614741
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Wed Aug 6 19:50:39 2008 +0100

    quick rename of one variable, trivial

commit b92128f83c965db6068588df9e812906586869cf
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Wed Aug 6 19:50:11 2008 +0100

    added extern "C" safeguard

commit e8d9ee9743f4ef2705f748c054e155f5fcb12f26
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Thu Jul 17 19:24:47 2008 +0100

    thread safe-ness + refined lib constructor/destructor

commit d769826ab0496d9b156609db5221ca3946dca71f
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Thu Jul 17 17:19:48 2008 +0100

    improved and enabled memory manager, also introduced shared library constructors and destructors. These seem to work with GCC, needs
    testing with SunCC and needs implementation with MSVC.

commit 1b5e19f89247d047ce63b38a71218f122de8a8c7
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Thu Jul 17 17:18:42 2008 +0100

    renamed combineX_sse2 to combineX

commit 4229833bfea43c4a4db4693cda839652969a1e65
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Thu Jul 17 17:18:12 2008 +0100

    if create/destroy_all_codes is called twice ignore the second call.

commit 1d18c76709ee2b67b9b10234a221a11674f39fa7
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Wed Jul 16 16:05:22 2008 +0100

    removed -fopenmp

commit 2d174fe4c7fc455379a8fb4cb2414e033b48160b
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Wed Jul 16 16:05:08 2008 +0100

    added cached memory management option, which is disabled since it doesn't seem to make a difference

commit c75286d369837c34e0abafa1c43b23c467c52333
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Tue Jun 24 15:53:53 2008 +0100

    int/long -> size_t cleanup courtesy of MSVC

commit 625801d98da0196ecbd9423c6db919d7b5a1c1a7
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Sun Jun 22 02:20:04 2008 -0700

    patch bomb:
     - use libtool -release versioning for now, since our API is not stable
     - added --enable-debug option to configure
     - threw in a bunch of asserts to make sure we can catch the ignorance of A->offset
     - added documentation to most functions
     - migrated int -> size_t in many places which seems like the right thing to do

commit 23e42f83952a65100703dd17b462027ab9472861
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Fri Jun 20 02:15:37 2008 -0700

    commenting stuff out that prevents the build

commit e92aca60d7ea71a692f04fee735117fe610be835
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Fri Jun 20 02:13:22 2008 -0700

    checking in all files that automake doesn't autogenerate

commit 1d1ab5b7a055fae93aed739dba81cd3bc9b29b1c
Author: Clement Pernet <clement.pernet at gmail.com>
Date:   Fri Jun 20 00:35:08 2008 -0700

    some more stuff on the weird addmul

commit 42e56c97104b489f7c7d5bb4c14a29556e32f3ac
Author: Clement Pernet <clement.pernet at gmail.com>
Date:   Fri Jun 20 00:04:40 2008 -0700

    Martin's patch: "more experimental permutation code, needs testing"

commit 35b8a8de244767998932c0ec75370fc239192de4
Author: Clement Pernet <clement.pernet at gmail.com>
Date:   Thu Jun 19 23:55:18 2008 -0700

    * new matrix_addmul with any weird dimensions (still needs to be tested)
    * lqup in progress

commit 433565ccc2b1e60f51dfabe42dcc44f2f9f800b1
Author: Clement Pernet <clement.pernet at gmail.com>
Date:   Wed Jun 18 17:21:40 2008 -0700

    fixing trsm calls to addmul
    further work on lqup

commit 66e480d6804d8664827647263dbcaa0c98300c8a
Author: Clement Pernet <clement.pernet at gmail.com>
Date:   Wed Jun 18 12:42:54 2008 -0700

    * add permutation window
    * work in progress in lqup

commit 1a2397a255a2015444dd89e548f927ba2ef98947
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Tue Jun 17 22:40:51 2008 -0700

    initial untested code for permutations

commit e7567913d29c1e7f6bb5c636066c68bbe1b38550
Author: Clement Pernet <clement.pernet at gmail.com>
Date:   Tue Jun 17 22:44:32 2008 -0700

    work in progress in lqup

commit 20284599f583b010e63fa714aed165119eaf487f
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Tue Jun 17 21:29:25 2008 -0700

    merging Clement's patch, everything should work

commit f0e8f698973c77f8f3c9b91898844919509f271a
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Tue Jun 17 21:19:16 2008 -0700

    API CHANGE, dropping all _impl's. also improved MP Strassen slightly

commit c9bb71ac7e0c3bd2ae9cd326d65405cd0b94727d
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Tue Jun 17 19:51:50 2008 -0700

    sane default value for Strassen cutoff

commit 8ccdd98207e9cb7f6943796e899d3c8b78ecaff6
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Tue Jun 17 18:32:53 2008 -0700

    M4/autoconf trickery

commit 9f7a44308e56a2a330aad7b33d67a195d5db65b1
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Tue Jun 17 17:37:08 2008 -0700

    2nd attempt at col_rotate, doesn't update permutation yet

commit ed0e6b167e4e991c3015a502a98c7d0942c44240
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Tue Jun 17 16:59:37 2008 -0700

    first version of col_rotate

commit 0303bc5fa7d64e2c36555a8f44bdc6f77e28efe2
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Tue Jun 17 16:59:23 2008 -0700

    macros more robust by adding lots of brackets

commit faed0fe9f7e6ec3937d17e01a82d3908ef0f3be3
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Tue Jun 17 00:32:39 2008 -0700

    added a bunch of functions and CHANGED THE API!

commit f63bdad7d4166dac2e64bde050d4623fa736a26d
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Mon Jun 16 22:48:27 2008 -0700

    fixed dimensions of X0,X1,X2 in addmul_strassen

commit 275a3e02923d72af65669e04a604be3a2c4217ff
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Mon Jun 16 00:48:43 2008 -0700

    added mzd_col_swap

commit 8f3c4ca2802343f5c9e5dbee57e79d08810e4e32
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Sun Jun 15 22:38:15 2008 -0700

    implemented memory efficient addmul

commit 65e15c7ba29334628b7c7ae8d4a80fdce6521558
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Sun Jun 15 22:37:56 2008 -0700

    fix typo in documentation

commit b21c2ff3b7feb883bc61af038643f0d099536bb2
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Sun Jun 15 22:37:47 2008 -0700

    fix printing for ncols%RADIX == 0

commit 72bbfbf193375dbffd54a712408b553c044827d2
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Sun Jun 15 16:58:57 2008 -0700

    work in progress: mzd_addmul_strassen

commit fe2a2bc44eef06587d0d6ce438f78cac8eeaf4e8
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Sun Jun 15 16:58:40 2008 -0700

    adapted parameter k for top_reduce too

commit 88b8ef995a6c72fe355540ca89b9ba2c117d8474
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Sat Jun 14 12:54:54 2008 -0700

    slightly improved the k parameter for reduction, the M4RM k parameter can be adapted for the Core2
    but not for the Opteron

commit 6d257c41f699d0c89a8e5c8104f756df2955f266
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Tue Jun 3 09:50:00 2008 +0100

    fix Gaussian reduction for full=FALSE, reported by Wael Said

commit a975af3174a135ab7073ccb3ee024a757fe0ec4b
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Mon Jun 2 21:29:22 2008 +0100

    added documentation for lacking bounds checks
    print matrices only up to ncols not up to RADIX*width

commit ccd0b9ebf1dbeeb5bd289954393a1b1b80eec14e
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Sun Jun 1 18:46:03 2008 +0100

    big check-in (sorry):
     - mzd_transpose much faster due to improved data locality
     - parity.h documented
     - mzd_reduce_m4ri uses 4 Gray code tables now
     - removed a couple of "unsigned" qualifiers since MSVC doesn't like comparisons between signed
       and unsigned; having the sign bit is also nice for detecting overflows, and you can check
       i > 0, which is also nice
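
    The point about signed counters in the last item can be illustrated with a
    short sketch. The helper last_nonzero below is hypothetical and is not part
    of M4RI; it only shows why a signed index makes a count-down loop terminate
    where an unsigned one would wrap around.

```c
#include <assert.h>

/* Hypothetical helper, not an M4RI function: returns the index of the
 * last nonzero entry of v[0..n-1], or -1 if every entry is zero. */
static long last_nonzero(const int *v, long n) {
  assert(n >= 0);
  /* Because i is signed, the test i >= 0 fails once i reaches -1.
   * With an unsigned counter the same test would always be true and
   * the decrement past zero would wrap to a huge value. */
  for (long i = n - 1; i >= 0; --i) {
    if (v[i] != 0) return i;
  }
  return -1;
}
```

    For an array {0, 3, 0, 7, 0} of length 5 the helper returns 3; for an
    all-zero or empty array it returns -1.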

commit c33f9412afd023b5be2d1d2b54b42863eda6bb67
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Fri May 30 12:47:30 2008 +0100

    don't reduce a row if it is already reduced, slight overhead for random matrices, huge gain
    for e.g. GB matrices

commit b14c19706c2d409be863cffd2c9495a51bf439c8
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Fri May 30 11:32:05 2008 +0100

    4 Graycode tables seem to be good, need to test on Opteron. For large matrices we hit L2 so
    we might reconsider block'ing. For a 19907 x 29323 matrix we are twice as slow as M4RM multiplication, which needs way more RAM.

commit 645e366c09a2f79814d91ad1eb10f648c74e8329
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Fri May 30 11:17:11 2008 +0100

    another attempt at speed improvements

commit c53303125eeb94bd0af9aca253ceda79e9b4d44e
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Thu May 29 11:34:41 2008 +0100

    avoid potential memleak in shared library mode where the Gray codes are rebuilt several times.

commit 9866e3426395166f568d161c7e0fca32c1aec4ef
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Thu May 29 11:34:11 2008 +0100

    renamed GRAY8 macro to M4RM_GRAY8 since it only applies to multiplication

commit 9287acb566847ffecfd2a1fbca72328dfe38ca9a
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Thu May 29 11:07:31 2008 +0100

    removed references to old implementations

commit bf1eaaee3857772ec0eb0920f85c03c633ac3f65
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Thu May 29 10:35:45 2008 +0100

    removed old commented-out reduce implementation

commit 850af6fc4cdd50e4a227b93c06bfc71bac3264b2
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Wed May 28 22:44:46 2008 +0100

    implement lazy strategy, i.e. attempt to not reduce rows already reduced.
             Before      After
    --------------------------
    hfe25_5:  7.21s      4.98s
    hfe30_5: 43.50s     30.86s

commit e70b7311cb93c9ef0997ebdc194a94e0332e7da4
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Tue May 27 22:51:14 2008 +0100

    reducing the number of rows processed in parallel to two.

commit ba4206a22602bea689cb0757da2f440e7227218c
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Tue May 27 22:29:26 2008 +0100

    some slight improvement to mzd_row_add_offset

commit 2bc8715ecda6dd71fd0d7ee1fc417f2d048b7074
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Tue May 27 15:50:32 2008 +0100

    implemented using two Gray code tables at the same time, which improves performance.
    Need to check if e.g. 8 tables still improve performance
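
    For readers unfamiliar with the term, the property that makes Gray code
    tables cheap to build is that consecutive codes differ in exactly one bit,
    so each table entry can be derived from its predecessor with a single row
    addition. A minimal sketch of that property, using the standard reflected
    binary code and hypothetical helper names (this is not the M4RI table
    construction code):

```c
#include <assert.h>

/* i-th reflected binary Gray code. */
static unsigned gray(unsigned i) {
  return i ^ (i >> 1);
}

/* Number of set bits in x; used here only to check that two
 * consecutive Gray codes differ in exactly one bit position. */
static unsigned count_bits(unsigned x) {
  unsigned c = 0;
  while (x != 0) {
    x &= x - 1; /* clear the lowest set bit */
    ++c;
  }
  return c;
}
```

    For any i, count_bits(gray(i) ^ gray(i + 1)) is 1, which is the
    one-bit-difference property described above.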

commit 58dae2439819f10c7d7bb4d131698b61071de402
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Tue May 27 13:35:19 2008 +0100

    more speed improvements for M4RI

commit fc136e8a07144d2932df4603dbdb89d60b512de3
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Tue May 27 11:45:47 2008 +0100

    speed improvement for M4RI

commit 02435209adbf1113066be3669630021e40154d0d
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Tue May 27 10:51:41 2008 +0100

    remove mzd_process_row and changed interface for mzd_process_rows to treat stoprow as exclusive (this is more C-ish)
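
    The "exclusive stoprow" convention is the usual half-open range idiom in C.
    A minimal sketch, assuming a hypothetical function sum_rows that is not the
    M4RI routine and only borrows the parameter names:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical illustration, not mzd_process_rows: adds up the entries
 * rows[startrow], ..., rows[stoprow - 1]. stoprow itself is excluded. */
static long sum_rows(const long *rows, size_t startrow, size_t stoprow) {
  assert(startrow <= stoprow);
  long total = 0;
  for (size_t r = startrow; r < stoprow; ++r) {
    total += rows[r];
  }
  return total;
}
```

    With an exclusive upper bound, adjacent ranges [a, b) and [b, c) tile
    [a, c) without overlap, an empty range is simply a == b, and the number of
    rows processed is stoprow - startrow.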

commit 5f35772ac9ab39e71d398162143b253334ba77cf
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Mon May 26 21:27:31 2008 +0100

    M4RI doesn't fall back to Gaussian elimination so easily anymore. In fact, it never does. This
    is advantageous in most cases except very sparse cases, which should probably be treated completely differently.

commit 443c46a46fe244e9b3dfa5ec8fd3149a210147c8
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Mon May 26 13:46:23 2008 +0100

    Michael Brickenstein:
    - define long constants using ll suffix
    Alexander Dreyer:
    - mask MIN/MAX in ifndef/endif to avoid multiple definitions of these common macros

commit 80589016150c0b81a0e5abce85d414ca0284f5e5
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Mon May 26 13:37:12 2008 +0100

    more small work on m4ri1, this is buggy, experimental, play-around code

commit 7fdd881b86ba663c9fb3f58e775015c26263e2af
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Sat May 24 00:20:31 2008 +0100

    new M4RI1 routine for matrix reduction, which is still buggy for singular matrices

commit 7aef26d011a1147131e6a65c88c3218c529adb15
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Fri May 23 18:40:15 2008 +0100

    slight simplification for process rows and HAVE_SSE2

commit 470d198f729ff0454e32898d9c0f9525af6168d2
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Fri May 23 18:39:58 2008 +0100

    remove unused variable

commit 489d55856ea1af5a64dd4707e880e486e819fb7e
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Thu May 22 19:20:21 2008 +0100

    slightly more clever loop unrolling using a Duff device, doesn't make much of a difference

commit c79262fb0f429738a050f28e5d0062bc74f60456
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Wed May 21 22:52:08 2008 +0100

    fix include order

commit b9cc4f5f7bc0c17066c77ee30bd03b805638a723
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Wed May 21 21:51:57 2008 +0100

    updated MSVC project, added all relevant headers to m4ri.h

commit 8823fdd51568f9e27bdc30d9445093fe103068e5
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Wed May 21 18:45:47 2008 +0100

    make OpenMP support configurable

commit 2cdb95ba5b9adfb2ecb4d2402309eb69c25c1eef
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Wed May 21 17:33:59 2008 +0100

    added Bill's cutoff improvement

commit 9fb92a4cb7a3e33b93725d136b84d119a54196a1
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Wed May 21 17:15:36 2008 +0100

    fixed bug Bill Hart reported, fixed all things Valgrind reported and made code run faster on C2D.
    Unsure about sage.math though, it isn't benchmarkable right now

commit eadfaf56d649c65c3dd82115b722607c45fdc08e
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Wed May 21 16:01:19 2008 +0100

    added more test (corner) cases

commit a8c282476e8fad7d6ac61d9f44386142dd6ff881
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Wed May 21 14:38:24 2008 +0100

    added new testcase, cleanup for valgrind

commit ff4685423c4ce53b9fa147b6d35f40545f2737ac
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Tue May 20 11:49:49 2008 +0100

    fix bug in reduction introduced by speeding up make_table
    add test code to catch these things

commit fe2df1dad262f932fc47da22c0b2a4bf516875be
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Mon May 19 22:14:42 2008 +0100

    added support for SSE2 to new _mzd_mul_m4rm_impl; this improves performance on C2D considerably,
    but makes things worse on the Opteron

commit 0e00979aa8b87a3b853674e4f39fe488734904d8
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Mon May 19 18:54:01 2008 +0100

    allow control over number of Gray code tables via define GRAY8
    renamed HAVE_OMP to HAVE_OPENMP

commit e6d4d314dfe391d2569fa1e47c68d23daf82d032
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Mon May 19 18:12:38 2008 +0100

    use 8 instead of 2 Gray code tables (implementation and idea by Bill Hart)

commit 2a96ec5a1ac088369e352dbb321d601cc9b6a272
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Mon May 19 17:07:25 2008 +0100

    fixes for the last check-in (all rows are aligned now if no windows are used)

commit 306c2ca91a63f88598b4617b143b349b0341d42d
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Mon May 19 17:00:11 2008 +0100

    some (style) improvements for SSE2 code by Bill Hart

commit b53370817c497987f7c37e6ad5a27dc5bca2824a
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Mon May 19 13:03:29 2008 +0100

    implemented first parallel Strassen-Winograd multiplication (compile with -fopenmp -DHAVE_OPENMP)

commit 00cebb7422415ef2307c93a40d478055a479fc43
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Sun May 18 00:29:57 2008 +0100

    new implementation of M4RM multiplication with two Gray code tables. The idea is by Bill Hart

commit a64434147c98cd72cb3324545c853124c49a4dd4
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Sat May 17 22:26:31 2008 +0100

    removed parameters T and L for M4RM (they weren't used anyway)

commit 24705af8d4d2610d87546b7805cfac460b94fe1a
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Sat May 17 20:41:14 2008 +0100

    fix commenting style

commit 642e7b461cce16874bc445598e30587740dfdafe
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Sat May 17 20:34:13 2008 +0100

    copy window to matrix to improve data locality in Strassen multiplication

commit 26ec41f987eed7fca800836cef8d3bf80213b57c
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Sat May 17 18:10:55 2008 +0100

    reverting benchmarking code to square matrices

commit ce763b5a9b29382df8018887026514ee3b10adfe
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Sat May 17 17:44:23 2008 +0100

    blocking naive matrix multiplication and using that by default if B->ncols < some threshold

commit 1475ec80fe9dea634c0d31d98c56acd19dfe5547
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Sat May 17 16:11:54 2008 +0100

    faster transpose
    faster naive multiplication
    faster mzd_make_table

commit cb800ab944a3e68d25ccbc56f22c4fbb8778ffce
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Fri May 16 23:36:47 2008 +0100

    re-added SSE2 support to mul_m4rm which gives a quite tiny speed-up
    removed unused variables

commit 904488bdbd871911393eac78a53c8f02b6492dee
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Fri May 16 21:45:59 2008 +0100

    nicer parameter names for mzd_combine

commit 65ab0b4af2e91da1b66f0b3fb145f8956de56c62
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Fri May 16 20:55:03 2008 +0100

    make run_bench return min,median,average and max

commit 1ecce5b758f1ef92e798f46866781952d6668bcd
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Fri May 16 18:15:39 2008 +0100

    document M4RM_BLOCKSIZE

commit 0f918fe4ea92075f28d251795054d3d2518da495
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Fri May 16 17:27:57 2008 +0100

    added William Hart's Block M4RM implementation which gives a significant speed-up!
    adapted m4ri_opt_k for that purpose too

commit 056bc801861bf95b7bb5db8da18615c2501fdc7e
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Fri May 16 15:42:12 2008 +0100

    faster naive multiplication, but still not as fast as it could be.

commit 300d31ba7f76d544fc7716532c07758ac2fe6295
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Fri May 16 15:41:25 2008 +0100

    only call _mm_malloc if it is really available

commit f1d3d6a17476ba805aedba455cc030c00819bfc9
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Thu May 15 20:23:52 2008 +0100

    fixing benchmarking/testing code and adding it to revision control

commit 9fefca504a8eb63769e54f6529985c08c26fdd87
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Thu May 15 18:33:00 2008 +0100

    don't use free on _mm_malloc'd memory

commit c8b70bbb71d6ba5ab56182de0242ccc8503bd1aa
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Wed May 14 21:23:26 2008 +0100

    compile fix for HAVE_SSE2 == False

commit da9779ce6c0f2aad87011280a90fbb66689d71ae
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Wed May 14 21:12:07 2008 +0100

    some minor documentation updates

commit efd9968d0f8ba9d6ddb33c0301f84a05c9dd58e6
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Wed May 14 18:11:30 2008 +0100

    fix SIGSEGV

commit 43a48764021935efe533c06bf756d67f3f9ce413
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Wed May 14 15:30:15 2008 +0100

    unify SSE2_CUTOFF

commit 8231e2d86f3da04c57e146d1ba11666acb4c4e25
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Wed May 14 15:15:38 2008 +0100

    reintroducing SSE2 to m4rm multiplication

commit a84461ff90e44d513d70a863fc54d3e4ab2f89a2
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Wed May 14 14:30:09 2008 +0100

    more documentation for the Opteron vs. Core2Duo performance compromise

commit a40d246f33db308bfa7dd3e211ec47691ed1af86
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Wed May 14 14:21:01 2008 +0100

    adapt documentation: We use Strassen-Winograd not Strassen

commit 727beb25d7cbeb695e184ecbc65e8a35eb4a1811
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Wed May 14 14:20:44 2008 +0100

    using XOR directly rather than calling mzd_combine gives a significant speed-up so we do that for now. Need to check if this is related to SSE2 and if we can re-introduce it

commit 0b42f8ad8fd3bf0997ea8a1c2745211cc8e770d5
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Sun May 11 23:40:39 2008 +0100

    added support for Visual Studio 2008 Express

commit f3d659cf73b050c2f517105de5b467c68b09c3e2
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Sun May 11 23:34:47 2008 +0100

    remove unnecessary local variables, add explicit casts as picked up by MSVC

commit de41147cf97033fd1503890f19de170ba3f26818
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Sun May 11 23:33:32 2008 +0100

    fixed compilation under OSX (32-bit) and under OpenSolaris (32-bit)

commit 2080a5f5681d7052598be311f297b40f76be78af
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Mon May 5 00:41:27 2008 +0800

    docstring updates and API unification

commit 8c380fc82eef860c835269938fbbea5278d469b1
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Sat May 3 23:58:32 2008 +0800

    declaring more parameters const

commit cb94853e61340c99a6c377f627426ba45ff34667
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Sat May 3 23:49:01 2008 +0800

    some cosmetic changes to packedmatrix.c

commit db82ad6e0bb0cb39587550b32433e165a8dd8888
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Sat May 3 23:43:38 2008 +0800

    marking more parameters const

commit 696350b226fee246107b5753d867e43a6fff5f0a
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Sat May 3 23:37:23 2008 +0800

    slightly improved clearing of target matrix in _mzd_mul_m4rm_impl

commit 7d4314a8561a98bcf378c1e659b99cb25dee89c5
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Sat May 3 23:31:51 2008 +0800

    SAFECHAR =  (1.3 * RADIX) is sufficient

commit eebec4f8cadd9fad7bd8d09af3fb92aa77852451
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Sat May 3 23:30:08 2008 +0800

    moved mzd_combine to packedmatrix.[c|h]
    _mzd_add_impl uses mzd_combine

commit 4db38668acd0af6b538b3b4f7a7f83d883157c82
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Sat May 3 23:22:46 2008 +0800

    removed dead test code, added strassen.h to m4ri.h

commit eb8a32bf4a1308ef8ec2fa157c5af1d8fa8a775c
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Fri May 2 00:21:41 2008 +0800

    implemented memory-efficient Strassen multiplication operation schedule

commit 59bcd961cabb8bb4d17c9ebfff8f60ca02009ce4
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Wed Apr 30 18:16:50 2008 +0800

    Doxygen coverage 100%

commit be17748886adc69d5ff232cfcc6486a1a9e7b957
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Fri Apr 25 11:29:22 2008 +0100

    fix version-info

commit d0c2e49b20a6ce209db73c6fc4433189531a1c06
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Fri Apr 25 11:25:36 2008 +0100

    misc cleanups

commit fd00a0fa5e80c13206c82c22885f71dd73085e8f
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Fri Apr 25 00:45:29 2008 +0100

    a potentially more cache-friendly implementation, needs checking

commit 44ce2a3f4c6272b0f6a46f6df5c61bdc0b6caded
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Thu Apr 24 23:17:38 2008 +0100

    doxygen updates

commit f29e19a7a50ff0db2031438a695c4b357ac72835
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Thu Apr 24 01:23:53 2008 +0100

    simplified combine, don't try to outsmart the compiler

commit f80146cdb85003c08c019c0ffdc40f3c10257306
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Thu Apr 24 00:07:38 2008 +0100

    refactoring should be done

commit d9edfb90cb670e01de46a908e0eb72eb3d85befe
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Wed Apr 23 18:16:16 2008 +0100

    continued refactoring (should be almost done) and fixed bug in naive multiplication

commit 2c82de491d61017683fca503ed22b9bc3b7eddcd
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Sun Apr 20 22:08:48 2008 +0100

    fix build on PPC

commit 575b8e25cf425ce04cb25c3a793a7265beb80dec
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Sun Apr 20 21:34:23 2008 +0100

    - added support for SSE2 if available (autodetection)
    - implemented Strassen multiplication
    - made API more C-ish (this is work in progress but most functions are done)
    - added lots of documentation in Doxygen style
    - added some tests to the test suite (still incomplete)

commit 802d57ce290bee4fe427539c9b90704241dd17df
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Sun Apr 20 11:36:26 2008 +0100

    Strassen multiplication seems to work now

commit 126735f6ff95f9629e52f0998b99893a266a0e3f
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Sat Apr 19 20:38:16 2008 +0100

    added support for SSE2 instructions (for now these need to be enabled by hand). The speed-up is
    hardly noticeable for realistic examples though. Also renamed a bunch of functions.

commit 44dc98bdc754a04a8c76c675f00a3e9a6e238e4f
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Sat Apr 19 11:29:30 2008 +0100

    Strassen seems to work if the matrix dimensions are exactly right

commit e4d8226c5ca8f421b940afd6e67fc8aa100f74aa
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Tue Apr 15 16:03:20 2008 +0100

    - refactoring (renaming of functions, files)
    - more documentation
    - added topReduceM4RI function
    - added first steps towards a testsuite

commit 03b02ba88920d6272684c04fe9d6997810404231
Author: Martin Albrecht <malb at informatik.uni-bremen.de>
Date:   Mon Apr 14 11:29:43 2008 +0100

    initial commit

-----------------------------------------------------------------------

-- 
libm4ri: library of Method of the Four Russians Inversion


