[vspline] 14/72: small changes, documentation

Sun Jul 2 09:02:38 UTC 2017

This is an automated email from the git hooks/post-receive script.

kfj-guest pushed a commit to branch master
in repository vspline.

commit 441efc5e85ce364069b609e39c315bac7c105c13
Author: Kay F. Jahnke <kfjahnke at gmail.com>
Date:   Wed Nov 23 11:59:01 2016 +0100

    small changes, documentation
---
 bspline.h                   |   7 ++-
 eval.h                      |  20 +++---
 example/impulse_response.cc | 146 ++++++++------------------------------------
 example/pano_extract.cc     |  69 +++++++++++++++++++--
 filter.h                    |  25 ++++++--
 multithread.h               |  60 ++++++++++++++++++
 prefilter.h                 |  13 +++-
 prefilter_poles.cc          |   2 +
 remap.h                     |  30 ++++-----
 9 files changed, 215 insertions(+), 157 deletions(-)

diff --git a/bspline.h b/bspline.h
index 51cf5e6..e5b07f4 100644
--- a/bspline.h
+++ b/bspline.h
@@ -121,6 +121,11 @@
   object with spline degree 0 and 'shifting' it to the desired degree for evaluation. Note that
   you'll need the EXPLICIT strategy for the purpose, because otherwise the spline won't have
   enough 'headroom' for shifting.
+  
+  If stronger smoothing is needed, this can be achieved with the code in filter.h, passing in
+  appropriate pole values. a single-pole filter with a positive pole in [ 0 , 1 [ will do the
+  trick - the larger the pole, the stronger the smoothing. Note that smoothing with large pole
+  values will need a large 'horizon' as well to handle the margins properly.
 
   With shifting, you can also create a 'poor man's pyramid'. While using no additional storage,
   you can extract smoothed data from the spline by shifting it up. This only goes so far, though,
@@ -530,7 +535,7 @@ public:
   int shift ( int d )
   {
     int new_degree = spline_degree + d ;
-    if ( new_degree < 0 )
+    if ( new_degree < 0 || new_degree > 24 )
       return 0 ;
 
     bracer<view_type> br ;
diff --git a/eval.h b/eval.h
index 3627a57..4bf3108 100644
--- a/eval.h
+++ b/eval.h
@@ -54,6 +54,12 @@
     and less memory-bound, so the profit from vectorization, especially for float data,
     is more pronounced here than for the prefiltering. On my system, I found single-precision
     operation was speeded up about four times (AVX2).
+    
+    The central class of this file is class evaluator. evaluator objects are set up to
+    provide evaluation of a specific b-spline. Once they are set up they don't change and
+    effectively become functors with several overloaded evaluation methods for different
+    constellations of parameters. The evaluation methods typically take their arguments per
+    reference.
 */
 
 #ifndef VSPLINE_EVAL_H
@@ -437,26 +443,22 @@ struct alt_bspline_derivative_weight_functor
 /// The code in class evaluator begins with a sizeable constructor which sets up as much
 /// as possible to support the evaluation code to do it's job as fast as possible. Next follows
 /// the unvectorized code, finally the vectorized code.
-/// There is a great variety of overloads of class evaluator's operator() to make it as versatile
+/// There is a great variety of overloads of class evaluator's eval() method to make it as versatile
 /// as possible. The strategy is to have all dependencies of the evaluation except for the actual
 /// coordinates taken care of by the constructor - and immutable for the evaluator's lifetime.
 /// The resulting object has no state which is modified after construction, making it thread-safe.
-/// The operator() variants take only coordinates (in their various guises) as arguments, relying
-/// on the state of the evaluator fixed at it's construction. The operator() overloads also
+/// The eval() variants take only coordinates (in their various guises) as arguments, relying
+/// on the state of the evaluator fixed at it's construction. The eval() overloads also
 /// form a hierarchy, as evaluation progresses from accepting unsplit real coordinates to
-/// split coordinates and finally offsets and weights. This offers calling code to handle
+/// split coordinates and finally offsets and weights. This allows calling code to handle
 /// parts of the delegation hierarchy itself, only using class evaluator at a specific level.
 /// By providing the evaluation in this way, it becomes easy for calling code to integrate
-/// the evaluation into more complex functors by using std::bind. Consider, for example, code
+/// the evaluation into more complex functors. Consider, for example, code
 /// which generates coordinates with a functor, then evaluates a b-spline at these coordinates,
 /// and finally subjects the resultant values to some postprocessing. All these processing
 /// steps can be bound into a single functor, and the calling code can be reduced to polling
 /// this functor until it has obtained the desired number of output values.
 
-// TODO: if evaluator objects are only ever used one per thread, the 'workspace' for the
-// weights can be made a member of class evaluator. The current implementation is thread-safe, 
-// though, and does allow to use the same evaluator object from several threads.
-
 template < int dimension ,         ///< dimension of the spline
            class value_type ,      ///< type of spline coefficients/result (pixels etc.)
            class rc_type = float , ///< singular real coordinate, float or double
diff --git a/example/impulse_response.cc b/example/impulse_response.cc
index f9a42ec..8c43343 100644
--- a/example/impulse_response.cc
+++ b/example/impulse_response.cc
@@ -69,51 +69,36 @@
 /// note how three different ways of getting the result are given, the variants
 /// using lower-level access to the filter are commented out.
 
+// when playing with the filter a bit, I noticed that feeding the filter with the
+// absolute values of the poles produces an impulse response which looks much like
+// a gaussian and also represents a partition of unity (like the b-spline reconstruction
+// filter). Using this filter instead of the prefilter produces a blurred image, as one
+// might expect from a filter with an impulse response like this - the precise maths
+// aren't clear to me yet. I may have reinvented a very efficient method of calculating
+// a gaussian smoothing filter...
+// the impulse resonse produced by passing in the absolute values of the poles is the
+// absolute of the 'normal' impulse response, scaled by a factor.
+// to keep the filter stable, the poles must be less than 1.0. positive poles approaching
+// 1 produce more and more blurring when coming closer to 1. Since the impulse response
+// with larger poles becomes very broad, smoothing is pronounced then. Within the context of
+// my spline prefiltering code, such far-reaching effects aren't dealt with properly, since
+// the calculation of the 'horizon' looks at the absolute value of the pole, so when smoothing
+// the signal instead of sharpening it, if the absolute value of the pole is large, the margins
+// won't be right because the default horizon is too small. If the horizon is widened, eventually
+// it'll be good enough to accomodate even strong smoothing.
 
 #include <iomanip>
 #include <assert.h>
 #include <vspline/multithread.h>
 #include <vspline/vspline.h>
 
-/// sample function to run with multithread
-/// for any dimension d, we define 'rtwosome' as a function taking a shape_range_type
-/// of this dimension (this is a TinyVector of two nD shapes) and two MultiArrayViews
-/// of this dimension, containing doubles. Depending on the concrete coding needs,
-/// the functions used with multithread might be more or less specialized.
-
-template < int dimension >
-void rtwosome ( vspline::shape_range_type<dimension> r ,
-                vigra::MultiArrayView < dimension , double > _in1 ,
-                vigra::MultiArrayView < dimension , double > _in2 )
-{
-  // first, 'cut out' the subsets of the manifolds. Here, the ends of the range
-  // are nD shapes, which can be passed directly to subarray().
-  
-  const auto in1 = _in1.subarray ( r[0] , r[1] ) ;
-  auto in2 = _in2.subarray ( r[0] , r[1] ) ;
-  
-  // now set up iterators over the subsets:
-  
-  auto it1 = in1.begin() ;
-  auto it2 = in2.begin() ;
-  const auto end = in2.end() ;
-  
-  // and perform some calculation
-  
-  while ( it2 < end )
-  {
-    *it2 = *it1 + 42.0 ;
-    ++it1 ;
-    ++it2 ;
-  }
-}
-
 int main ( int argc , char * argv[] )
 {
   int degree = std::atoi ( argv[1] ) ;
   double cutoff = std::atof ( argv[2] ) ;
   assert ( degree >= 0 && degree < 25 ) ;
   
+  
 // using the highest-level access to prefiltering, we code:
 
   vspline::bspline < double , 1 > bsp ( 10001 , degree , vspline::PERIODIC ) ; // , vspline::EXPLICIT ) ;
@@ -142,90 +127,13 @@ int main ( int argc , char * argv[] )
 //         0.00001 ) ;
 //   f.solve ( v1.begin() ) ;
         
-//   std::cout << "double ir_" << degree << "[] = {" << std::endl ;
-//   std::cout << std::fixed << std::showpos << std::showpoint << std::setprecision(16);
-//   int k ;
-//   for ( k = 0 ; k < 10000 ; k++ )
-//   {
-//     if ( fabs ( v1[k] ) > cutoff )
-//       std::cout << v1[k] << " ," << std::endl ;
-//   }
-//   if ( fabs ( v1[k] ) > cutoff )
-//     std::cout << v1[k] ;
-//   std::cout << " } ;" << std::endl ;
-  
-  typedef vspline::evaluator < 1 , double , double > eval_type ; // get the appropriate evaluator type 
-  eval_type ev ( bsp ) ;                                   // create the evaluator
-//   std::cout << ev ( 5001.5 ) << std::endl ;
-  
-  vigra::MultiArray < 1 , double > coords ( 25 ) ;
-  coords[2] = 5000.1 ;
-  vigra::MultiArray < 1 , double > target ( 25 ) ;
-  
-  typedef vigra::MultiArrayView < 1 , double > view_type ;
-  typedef typename view_type::difference_type shape_type ;
-  
-  view_type va ( coords ) ;
-  view_type vb ( target ) ;
-  view_type vc ;
-//   // we need an instance here, passing twosome directly doesn't compile
-//   std::function < void ( view_type , view_type ) > fun ( twosome ) ;
-//   coprocess ( fun , 4 , va , vb ) ;
-//   
-//   std::function < void ( std::tuple < view_type , view_type > ) > bfun ( btwosome ) ;
-//   auto t = std::make_tuple ( va , vb ) ;
-//   coprocess ( bfun , 4 , t ) ;
-  
-  vspline::shape_range_type<1> range ( shape_type(3) , va.shape() ) ;
-  std::function < void ( vspline::shape_range_type<1> , view_type , view_type ) > rfun ( rtwosome<1> ) ;
-  vspline::multithread ( rfun , 4 , range , va , vb ) ;
-  
-  for ( auto v : vb )
-    std::cout << v << std::endl ;
-
-//   typedef vspline::split_type < 1 , int , double > splt ;
-//   if ( degree & 1 )
-//   {
-//     for ( int i = 0 ; i < 25  ; i++ )
-//       ev.eval ( coords[i] , target[i] ) ;
-//   }
-//   else
-//   {
-//     for ( int i = 0 ; i < 25  ; i++ )
-//       ev.eval ( coords[i] , target[i] ) ;
-//   }
-    
-//   typedef vigra::TinyVector < double , 1 > monad_t ;
-//   vigra::MultiArray < 1 , monad_t > mcoords ( 25 ) ;
-//   mcoords[5] = { 5000.1 } ;
-//   vigra::MultiArray < 1 , monad_t > mtarget ( 25 ) ;
-  vspline::remap < double , double , 1 , 1 > ( bsp , coords , target ) ;
-  for ( auto v : target )
-    std::cout << v << std::endl ;
-  
-  typedef vspline::vector_traits < double > :: type dvs_t ;
-  typedef vigra::TinyVector < dvs_t , 1 > dv_t ;
-  
-  dv_t dv ;
-  dv[3] = 5000.1 ;
-  std::cout << dv << std::endl ;
-  
-  dv_t rv ;
-  ev.v_eval ( dv , rv ) ;
-  std::cout << rv << std::endl ;
-  
-  ev.v_eval ( (double*)&(dv) , (double*)&(rv) ) ;
-//   ev.v_eval ( (vigra::TinyVector<double,1>*)&(dv) , (double*)&(rv) ) ;
-  
-//   using namespace vigra::multi_math ;
-//   vigra::TinyVector < double , 12 > xx ;
-//   xx[3] = 1.0 ;
-//   std::cout << sqrt ( xx ) << std::endl ;
-//   pseudo_vector < double , 3 > tv3 ;
-//   pseudo_vector < double , 3 > zero ;
-//   {
-//     auto less = ( tv3 < zero ) ;
-//     for ( int i = 0 ; i < 3 ; i++ )
-//       std::cout << less[i] << std::endl ;
-//   }
+  std::cout << "double ir_" << degree << "[] = {" << std::endl ;
+  std::cout << std::fixed << std::showpos << std::showpoint << std::setprecision(16);
+  int k ;
+  for ( k = 0 ; k < 10001 ; k++ )
+  {
+    if ( fabs ( v1[k] ) > cutoff )
+      std::cout << v1[k] / << " ," << std::endl ;
+  }
+  std::cout << " } ;" << std::endl ;
 }
\ No newline at end of file
diff --git a/example/pano_extract.cc b/example/pano_extract.cc
index 13a5b65..939f921 100644
--- a/example/pano_extract.cc
+++ b/example/pano_extract.cc
@@ -287,16 +287,29 @@ public:
 
 #include <vigra/colorconversions.hxx>
 
+// TODO: the next three point functors currently can't be run, because the
+// infrastructure is gone (applying point functions to arrays)
+
 /// functor to convert a UINT8 RGB pixel from sRGB to linear RGB
 /// this uses vigra's functor, which in turn uses a lookup table
 /// and is therefore fast.
 
-void degamma ( TinyVector<unsigned char,3> & v )
+typedef vigra::RGBValue<unsigned char,0,1,2> uc_pixel ;
+
+void degamma ( uc_pixel & v )
 {
   static vigra::sRGB2RGBFunctor< unsigned char , unsigned char > fun ;
   v = fun ( v ) ;
 } 
 
+void rdegamma ( shape_range_type < 2 > range ,
+                vigra::MultiArrayView < 2 , uc_pixel > * p_data )
+{
+  auto data = p_data->subarray ( range[0] , range[1] ) ;
+  for ( auto & v : data )
+    degamma ( v ) ;
+}
+
 /// functor to convert a pixel from sRGB to linear RGB. This functor
 /// actually performs the calculation. This is used for other pixels.
 
@@ -316,6 +329,18 @@ void to_l_rgb ( value_type& v )
   }
 }
 
+template < typename ele_type , int _max = 255 >
+void e_to_l_rgb ( ele_type& component )
+{
+  const ele_type max ( _max ) ;
+
+  component /= max ;
+
+  component = ( component <= 0.04045 ) 
+              ? max * component / 12.92
+              : max * pow ( (component + 0.055) / 1.055 , 2.4 ) ;
+}
+
 #ifdef USE_VC
 
 /// functor to convert a vector of single values to linear RGB
@@ -333,14 +358,33 @@ void v_to_l_rgb ( Vc::Vector < entry_type > & v )
                  * exp (    entry_type ( 2.4 )
                           * log ( ( v + entry_type ( 0.055 ) ) * entry_type ( 1.0 / 1.055 ) )
                        ) ;
+}
 
+template < typename entry_type , int _max = 255 >
+void r_v_to_l_rgb ( shape_range_type < 2 > range ,
+                    vigra::MultiArrayView < 2 , entry_type > * p_view )
+{
+  auto view = p_view->subarray ( range[0] , range[1] ) ;
+  auto p_data = view.data() ;
+  int nv = view.size() / Vc::Vector < entry_type > :: Size ;
+  
+  for ( int i = 0 ; i < nv ; i++ )
+  {
+    Vc::Vector < entry_type > v ( p_data ) ;
+    v_to_l_rgb ( v ) ;
+    v.store ( p_data ) ;
+    p_data += Vc::Vector < entry_type > :: Size ;
+  }
+  for ( int i = nv * Vc::Vector < entry_type > :: Size ; i < view.size() ; i++ )
+    e_to_l_rgb ( view[i] ) ;
+}
 // ouch... Vc does not provide pow(), so we use
 // pow(x,y) = exp^(y*lnx)
 
 // would be nice:
 //                  * pow ( ( v + entry_type ( 0.055 ) ) * entry_type ( 1.0 / 1.055 ) ,
 //                          entry_type ( 2.4 ) ) ;
-}
+
 
 #endif
 
@@ -486,7 +530,11 @@ void process_image ( char * name ,
 
     // vectorization doesn't work for unsigned char, but we can multithread:
 
-    divide_and_conquer < buffer_view > :: run ( buffer , degamma ) ;
+//     divide_and_conquer < buffer_view > :: run ( buffer , degamma ) ;
+
+    buffer_view bv ( buffer ) ;
+    shape_range_type < 2 > range ( shape_type() , bv.shape() ) ;
+    multithread ( &rdegamma , ncores , range , &bv ) ;
 
     // still we need to work on floating point data
 
@@ -524,11 +572,20 @@ void process_image ( char * name ,
     // silently assumes input actually is sRGB...
     // this conversion takes quite a lot of time!
 
+    MultiArrayView < 2 , rc_type > core_flat_view ( vigra::Shape2 ( core.shape(0) * 3 , core.shape(1) ) ,
+                                                    (rc_type*) ( core.data() ) ) ;
+                                                    
+    shape_range_type < 2 > range2 ( shape_type() , core_flat_view.shape() ) ;
+    
 #ifdef USE_VC
-    divide_and_conquer < decltype(core) > :: run ( core , v_to_l_rgb<rc_type,65535> ) ;
-#else
-    divide_and_conquer < decltype(core) > :: run ( core , to_l_rgb<pixel_type,65535> ) ;
+    multithread ( &r_v_to_l_rgb<rc_type,65535> , ncores , range2 , &core_flat_view ) ;
 #endif
+    
+// #ifdef USE_VC
+//     divide_and_conquer < decltype(core) > :: run ( core , v_to_l_rgb<rc_type,65535> ) ;
+// #else
+//     divide_and_conquer < decltype(core) > :: run ( core , to_l_rgb<pixel_type,65535> ) ;
+// #endif
     component_max = 65535 ;
   }
   else
diff --git a/filter.h b/filter.h
index 6200ba8..d49be89 100644
--- a/filter.h
+++ b/filter.h
@@ -40,15 +40,28 @@
     The code in this file provides efficient filtering of nD arrays with an n-pole
     forward-backward recursive filter, accepting a variety of boundary conditions and
     optionally using multithreading and/or vectorization to speed things up.
-    
+
     The data have to be presented as vigra MultiArrayViews of elementary floating point
     types or their 'aggregates' (TinyVectors, pixels, etc.), the code is dimension-agnostic
     but templated to the array types used, so the dimensionality is not a run time parameter.
     
     Note the code organization is bottom-up, so the highest-level code comes last.
     Most code using filter.h will only call the final routine, filter_nd.
+    
+    While the initial purpose for the code in this file was, of course, b-spline prefiltering,
+    the generalized version I present here can be used for arbitrary filters. There is probably
+    one filter which is most useful in the context of vspline: passing a single positive pole
+    in the range of ] 0 , 1 [ smoothes the signal very efficiently.
 */
 
+// TODO: with buffering used throughout, we might make the code more general:
+// input could be of any type convertible to a type the 1D filter can handle,
+// and would be cast to the type the buffer contains when the buffer is filled.
+// after completion of the filter, the data might be submitted to yet another cast
+// on copying them out to the target. With such a scheme, operation on, for example,
+// integral data, would become possible without need for converting the data first to
+// an nD array of real values.
+
 // include common.h for the border condition codes
 // TODO pull these out?
 
@@ -1113,17 +1126,17 @@ void ele_aggregating_filter ( source_type &source ,
                               double tolerance
                             )
 {
-  typedef typename source_type::value_type ele_type ;    ///< elementary type in source
+  typedef typename source_type::value_type ele_type ;    // elementary type in source
 
   // we have a traits class fixing the simdized type (in common.h):
   
   typedef typename vector_traits < ele_type > :: type simdized_type ;
 
-  const int vsize ( simdized_type::Size ) ;              ///< number of ele_type in a simdized_type
-  typedef typename simdized_type::IndexType index_type ; ///< indexes for gather/scatter
-  typedef typename simdized_type::MaskType mask_type ;   ///< and mask for masked operation
+  const int vsize ( simdized_type::Size ) ;              // number of ele_type in a simdized_type
+  typedef typename simdized_type::IndexType index_type ; // indexes for gather/scatter
+  typedef typename simdized_type::MaskType mask_type ;   // and mask for masked operation
   
-  int count = source.shape ( axis ) ;                    ///< number of vectors we'll process
+  int count = source.shape ( axis ) ;                    // number of vectors we'll process
 
   // I initially tried to use Vc::Memory, but the semanics of the iterator obtained
   // by buffer.begin() didn't work for me.
diff --git a/multithread.h b/multithread.h
index 96abd37..21ab2c9 100644
--- a/multithread.h
+++ b/multithread.h
@@ -78,7 +78,9 @@
 
 #include <assert.h>
 #include <vigra/tinyvector.hxx>
+#include <vigra/multi_array.hxx>
 #include <thread>
+#include <vspline/common.h>
 
 namespace vspline
 {
@@ -110,6 +112,9 @@ using shape_type = vigra::TinyVector < long , dimension > ;
 template < int dimension >
 using shape_range_type = range_type < shape_type < dimension > > ;
 
+template < int dimension >
+using shape_partition_type = partition_type < shape_range_type < dimension > > ;
+
 // #include <iomanip>
 // #include <vspline/vspline.h>
 // #include <vigra/multi_math.hxx>
@@ -409,6 +414,61 @@ int multithread ( void (*pfunc) ( range_type , Types... ) ,
   return multithread ( pfunc , partitioning , args... ) ;
 }
 
+// some apply type routines, preliminary
+// the idea is to write a functor which bundles all processing steps.
+// this functor is then aplied to the target array, filling in the result
+// by calculating it value by value.
+
+template < typename view_type >
+void apply_point_func ( void (*pfunc) ( typename view_type::value_type & ) ,
+                        shape_range_type < view_type::actual_dimension > range ,
+                        view_type * p_view )
+{
+  auto view = p_view->subarray ( range[0] , range[1] ) ;
+  for ( auto & v : view )
+    pfunc ( v ) ;
+}
+  
+#ifdef USE_VC
+
+template < typename view_type >
+void apply_v_point_func ( void (*pvfunc) ( typename view_type::value_type * ) ,
+                          shape_range_type < view_type::actual_dimension > range ,
+                          view_type * p_view )
+{
+  auto view = p_view->subarray ( range[0] , range[1] ) ;
+  typedef typename view_type::value_type value_type ;
+  typedef typename vigra::ExpandElementResult < value_type > :: type ele_type ;
+  const int vsize = vspline::vector_traits < ele_type > :: vsize ;
+  int aggregates = view.size() / vsize ;
+  int leftover = view.size() - aggregates * vsize ;
+  assert ( leftover == 0 ) ;
+  ele_type * base = (ele_type*) (view.data()) ;
+  for ( int a = 0 ; a < aggregates ; a++ )
+    pvfunc ( base + vsize * a ) ;    
+}
+ 
+template < typename view_type >
+void apply_v_point_func ( void (*pvfunc) ( typename view_type::value_type * ) ,
+                          void (*pfunc) ( typename view_type::value_type & ) ,
+                          shape_range_type < view_type::actual_dimension > range ,
+                          view_type * p_view )
+{
+  auto view = p_view->subarray ( range[0] , range[1] ) ;
+  typedef typename view_type::value_type value_type ;
+  typedef typename vigra::ExpandElementResult < value_type > :: type ele_type ;
+  const int vsize = vspline::vector_traits < ele_type > :: vsize ;
+  int aggregates = view.size() / vsize ;
+  int leftover = view.size() - aggregates * vsize ;
+  ele_type * base = (ele_type*) (view.data()) ;
+  for ( int a = 0 ; a < aggregates ; a++ )
+    pvfunc ( base + vsize * a ) ;    
+  for ( int l = 0 ; l < leftover ; l++ )
+    pfunc ( view [ vsize * aggregates + l ] ) ;    
+}
+ 
+#endif
+
 } ; // end if namespace vspline
 
 #endif // #ifndef VSPLINE_MULTITHREAD_H
\ No newline at end of file
diff --git a/prefilter.h b/prefilter.h
index 0b06b47..cff1266 100644
--- a/prefilter.h
+++ b/prefilter.h
@@ -60,7 +60,6 @@
     available as solve_vigra(), and the vectorized version, solve_vc(). Note that
     the vectorized code is more constrained in what numeric types it can process
     (namely, only float, double and their aggregates).
-    Unit testing code for these two versions is in prefilter_test...
     
     In another version of this code I used vigra's BSPlineBase class to obtain prefilter
     poles. This required passing the spline degree/order as a template parameter. Doing it
@@ -100,6 +99,8 @@
     I was tempted to choose the explicit scheme over the implicit. Yet since the code for the
     implicit scheme is there already and some of it is even used in the explicit scheme I keep
     both methods in the code base for now.
+
+    [CIT2000] Interpolation Revisited by Philippe Thévenaz, Member,IEEE, Thierry Blu, Member, IEEE, and Michael Unser, Fellow, IEEE in IEEE TRANSACTIONS ON MEDICAL IMAGING, VOL. 19, NO. 7, JULY 2000,
 */
 
 #ifndef VSPLINE_PREFILTER_H
@@ -149,7 +150,7 @@ using namespace vigra ;
 /// splines become less of an issue; the extra arithemtic to prefilter for, say, quintic splines is
 /// done very quickly, since no additional memory access is needed beyond a buffer's worth of data
 /// already present in core memory.
-
+///
 /// solve_vigra and solve_vc are just thin wrappers around filter_nd in filter.h,
 /// injecting the actual number of poles, the poles themselves and a flag wether to
 /// use vector code or not.
@@ -197,6 +198,14 @@ void solve_vigra ( input_array_type & input ,
 
 #ifdef USE_VC
 
+/// solve_vc is the equivalent to solve_vigra, but uses vectorization. If your machine
+/// supports vectorization (SSE, AVX...), this is the variant you should try if you can,
+/// to establish if you can get the process to speed up. This may not be the case - on my
+/// system (core i5, 4 cores, AVX2), depending on other parameters, sometimes this routine
+/// is faster than solve_vigra. Prefiltering is very memory-intensive, the amount of actual
+/// caclulation is, by contrast, small, so the effect of vectorization for prefiltering
+/// isn't as large as it's efect on evaluation.
+
 template < typename input_array_type ,      // type of array with knot point data
            typename output_array_type >     // type of array for coefficients (may be the same)
 void solve_vc ( input_array_type & input ,
diff --git a/prefilter_poles.cc b/prefilter_poles.cc
index d30ec5b..5165f43 100644
--- a/prefilter_poles.cc
+++ b/prefilter_poles.cc
@@ -40,7 +40,9 @@
 
     compile with:
     g++ -std=c++11 prefilter_poles.cc -oprefilter_poles -lgsl -lblas
+    
     run
+    
     ./prefilter_poles > poles.cc
 
     TODO: could do with some TLC...
diff --git a/remap.h b/remap.h
index fdb5e29..9383245 100644
--- a/remap.h
+++ b/remap.h
@@ -68,6 +68,11 @@
 /// If the source/target array lend themselves to it, we can pass it's memory directly
 /// to the vectorized eval code. To keep the complexity down,  if the memory is not
 /// suited, I 'manually' aggregate to a small buffer and pass the buffer to and fro.
+/// There is possibly room for improvement here - one might use gather/scatter operations
+/// where the data aren't strictly in sequence - but the bulk of the processing time
+/// needed is in the access to (possibly widely scattered) memory and the complex
+/// calculations needed to provide the result, so I don't include code for more
+/// data access modes here.
 ///
 /// There is also a second set of remap functions in this file, which don't take a
 /// 'warp' array but a functor instead. This functor receives, for every location in the
@@ -81,7 +86,8 @@
 /// lens correction, etc.
 ///
 /// In the 'example' folder, there is a program called pano_extract.cc, which demonstrates
-/// the use of a transformation-based remap.
+/// the use of a transformation-based remap. There's also pv.cc which uses remap extensively,
+/// and, being a panorama viewer, gives instant visual feedback showing the results as images.
 
 #ifndef VSPLINE_REMAP_H
 #define VSPLINE_REMAP_H
@@ -137,6 +143,12 @@ struct coordinate_traits < split_type < N , IT , FT > >
 /// Note how I pass in several pointers. Normally I'd pass such data by reference,
 /// but I can't do this here (or haven't found a way yet) because of my parameter-handling
 /// in multithread() which I couldn't get to work with references. The effect is the same.
+// TODO instead of passing in a b-spline, one might pass in an evaluator. If this is
+// coded so that the evaluator can be any object capable of the methods called for class
+// evaluator in this routine, the remap becomes more general, since it can use any old
+// 'evaluator' with this interface. For the unvectorized code, supplying this interface
+// is trivial, there is only one method used. For the vectorized version, only one more is
+// needed (taking pointers to vsize values/coordinates).
 
 template < class value_type ,      // like, float, double, complex<>, pixels, TinyVectors etc.
            class coordinate_type , // usually TinyVector of real for coordinates
@@ -375,18 +387,14 @@ int remap ( MultiArrayView < dim_in , value_type > input ,
 // TODO: we want an equivalent processing offset/tune pairs which would be even more compact
 
 template < class ic_type , class rc_type , int dimension , int coordinate_dimension >
-class warper
+struct warper
 {
-public:
-  
   typedef vigra::TinyVector < ic_type , coordinate_dimension > nd_ic_type ;
   typedef vigra::TinyVector < rc_type , coordinate_dimension > nd_rc_type ;
 
   vigra::MultiArray < dimension , nd_ic_type > _select ;
   vigra::MultiArray < dimension , nd_rc_type > _tune ;
   
-public:
-  
   vigra::MultiArrayView < dimension , nd_ic_type > select ;
   vigra::MultiArrayView < dimension , nd_rc_type > tune ;
 
@@ -394,8 +402,6 @@ public:
   
   shape_type shape ;
 
-public:
-  
   /// constructor taking a shape argument. This creates the containers for integral and
   /// remainder parts of the coordinates internally.
 
@@ -420,7 +426,7 @@ public:
 
 /// single-threaded remap taking separate arrays 'tune' and 'select', which are chunks of
 /// the arrays in a warper object. This routine is called by the one below it via
-/// divide_and_conquer_3.
+/// 'multithread'
 
 template < class value_type ,      // like, float, double, complex<>, pixels, TinyVectors etc.
            class nd_ic_type , 
@@ -819,7 +825,7 @@ public:
 /// The first is what is implemented in the previous implementation of remap: we
 /// calculate a 'warp' array full of transformed coordinates and process it en bloc.
 /// Alternatively, with the coordinate transformation at hand, we can write a remap
-/// variant which performs the coordinate transformation to each target coordinate
+/// variant which performs the coordinate transformation for each target coordinate
 /// as it goes along, saving the construction of the warp array.
 ///
 /// The transformation is passed in by using a 'transformation' object, which is a
@@ -919,10 +925,6 @@ void st_tf_remap ( shape_range_type < dim_out > range ,
 /// multithreaded tf_remap routine, splitting the target array to chunks.
 /// This is just as for the warp-array-based remap. The target array is split into
 /// chunks, which are in turn processed in a thread each by the single-threaded routine.
-/// Note the parameter 'offset'. Since we operate on the coordinates of the target array,
-/// we have to keep track of the coordinate the origin of each chunk had in the unsplit
-/// target. This is what 'offest' does, it's effectively added to the coordinates in the
-/// chunks when they are transformed.
 
 template < class value_type ,      // like, float, double, complex<>, pixels, TinyVectors etc.
            class rc_type ,         // float or double for coordinates

-- 
Alioth's /usr/local/bin/git-commit-notice on /srv/git.debian.org/git/debian-science/packages/vspline.git