[Pkg-exppsy-maintainers] SVM+RFE vs. SMLR

Per B. Sederberg persed at princeton.edu
Thu Mar 6 17:13:22 UTC 2008


I'm in the process of modifying the SMLR code to resample the zero
weights a bit more because the current method can give rise to too
much variation due to sparse random resampling.  This will slow the
code slightly and take up more RAM, since I'm calculating it on a
per-weight basis, but it should be far more accurate.
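Roughly, the idea is something like the following sketch (illustrative
only, not the actual code going in; the function name and parameters are
made up): instead of giving each zero-weight feature a single sparse
random draw from the non-zero weights, draw many values per weight and
average them.

    import numpy as np

    def resample_zero_weights(weights, n_resamples=100, rng=None):
        # Illustrative sketch, not the real SMLR sensitivity code: estimate
        # an importance for every zero-weight feature from many resamples of
        # the non-zero weights, averaged per weight.  This costs more time
        # and RAM than a single sparse draw, but is far less noisy.
        rng = np.random.default_rng() if rng is None else rng
        w = np.abs(np.asarray(weights, dtype=float))
        nonzero = w[w != 0]
        zeros = np.flatnonzero(w == 0)
        draws = rng.choice(nonzero, size=(len(zeros), n_resamples))
        out = w.copy()
        out[zeros] = draws.mean(axis=1)
        return out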

I'll email when I push the new code.

Best,
Per

On Thu, Mar 6, 2008 at 11:27 AM, Yaroslav Halchenko
<debian at onerussian.com> wrote:
> Plenty, but I just now found that I screwed up a bit: the min error in the
>  labels is not the minimum across the per-run means but the global min, and
>  thus is not what we see in the plots.
>
>  Also, the damn legend covered some parts, but those are not the interesting ones.
>
>  Find all subjects' plots for those SMLRs at
>  http://www.onerussian.com/Sci/analysis/pymvpa/smlrs1/
>
>  As I see it, lm=0.01 doesn't do well.
>
>  going to try it with lm=1.5 and all 1e-5 (on all lms)
>
>  Another tiny new bit:
>  As you can see in git's yoh/master, there is now
>  doc/examples/clfs_examples.py (could be renamed), the main purpose of
>  which is to serve as an extended version of the smlr_example benchmark.
>
>  For now I have added only very dummy datasets (so avg train/test times are
>  kinda bogus ;-)) and basic classifiers, and here is the
>  current output:
>
>  $> doc/examples/clfs_examples.py
>  Dummy 2-class univariate with 2 useful features: <Dataset / float64 20 x 1000 uniq: 2 labels 5 chunks>
>   Linear C-SVM (default)        : correct=65.0% train:0.0sec predict:0.0sec
>   Linear nu-SVM (default)       : correct=65.0% train:0.0sec predict:0.0sec
>   SMLR(default)                 : correct=90.0% train:0.1sec predict:0.0sec
>   SMLR(Python)                  : correct=90.0% train:7.8sec predict:0.0sec
>   RidgeReg(default)             : correct=50.0% train:6.8sec predict:0.0sec
>   Rbf C-SVM (default)           : correct=60.0% train:0.0sec predict:0.0sec
>   Rbf nu-SVM (default)          : correct=65.0% train:0.1sec predict:0.0sec
>   kNN(default)                  : correct=55.0% train:0.0sec predict:0.0sec
>  Dummy XOR-pattern: <Dataset / float64 80 x 2 uniq: 2 labels 80 chunks>
>   Linear C-SVM (default)        : correct=0.0% train:0.0sec predict:0.0sec
>   Linear nu-SVM (default)       : correct=71.2% train:0.0sec predict:0.0sec
>   SMLR(default)                 : correct=0.0% train:0.0sec predict:0.0sec
>   SMLR(Python)                  : correct=0.0% train:0.0sec predict:0.0sec
>   RidgeReg(default)             : correct=50.0% train:0.0sec predict:0.0sec
>   Rbf C-SVM (default)           : correct=0.0% train:0.0sec predict:0.0sec
>   Rbf nu-SVM (default)          : correct=97.5% train:0.0sec predict:0.0sec
>   kNN(default)                  : correct=98.8% train:0.0sec predict:0.0sec
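>
>  The XOR pattern is the classic case where no single linear boundary
>  separates the classes, which is why the linear classifiers end up at or
>  below chance above.  A rough sketch of such data (not the actual
>  generator clfs_examples.py uses):
>
>    import numpy as np
>
>    # Illustrative XOR-style data, roughly matching the 80 x 2 dataset
>    # above: the label is 1 only when exactly one of the two features is
>    # positive, so no single linear boundary can separate the two classes.
>    rng = np.random.default_rng(0)
>    X = rng.uniform(-1, 1, size=(80, 2))
>    y = ((X[:, 0] > 0) ^ (X[:, 1] > 0)).astype(int)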
>
>
>  The goal is to extend it with interesting data and evolved SVMs (i.e. SVM + RFE
>  for instance, or SVM + feature selection based on ANOVA / SMLR weights / SVM
>  weights but without RFE -- just plain non-zero weights, or the 1% highest
>  weights). That should provide an illustrative example of the built-in ML
>  techniques we have at hand here, and allow an easy assessment of efficiency
>  in terms of computation time.
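>
>  As a rough, library-neutral sketch of one such combination (ANOVA scoring,
>  keep the 1% highest-scoring features, then train whatever classifier on
>  just those) -- the names here are illustrative, not the eventual API:
>
>    import numpy as np
>    from scipy.stats import f_oneway
>
>    def anova_select(X, y, fraction=0.01):
>        # Score every feature with a one-way ANOVA F-value across the
>        # classes, then keep the top `fraction` of features.
>        groups = [X[y == label] for label in np.unique(y)]
>        fscores = np.array([f_oneway(*[g[:, i] for g in groups]).statistic
>                            for i in range(X.shape[1])])
>        n_keep = max(1, int(round(fraction * X.shape[1])))
>        return np.argsort(fscores)[-n_keep:]
>
>    # usage: select on the training data only, then train any classifier
>    # keep = anova_select(X_train, y_train, fraction=0.01)
>    # clf.train(X_train[:, keep], y_train)   # clf being e.g. a linear SVM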
>
>
>
>  On Thu, 06 Mar 2008, Per B. Sederberg wrote:
>
>  > So, do you have the new results?
>  > P
>
>  > On Wed, Mar 5, 2008 at 3:59 PM, Yaroslav Halchenko
>  > <debian at onerussian.com> wrote:
>  > > > pulling these "relevant" weights away.  So, unless you are redoing the
>  > >  > regression at each step, removing those features completely, you are
>  > >  > actually punishing the SMLR each time you pull out weights that it
>  > >  > thinks are valuable.
>  > >  But we do retrain after each such feature removal,
>  > >  i.e. we don't simply prune the weight (set it to 0); we remove that
>  > >  feature from the training data for the classifier, then retrain the classifier.
>
>  > >  So we do a fair job IMHO ;-) or have I misread your message?
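>
>  > >  For what it's worth, a minimal sketch of that loop (the helper names
>  > >  are illustrative, not our actual RFE implementation): drop the
>  > >  weakest features from the training data, retrain, repeat.
>
>  > >    import numpy as np
>
>  > >    def rfe(X, y, train_fn, weight_fn, n_remove=1, n_keep=1):
>  > >        # Recursive feature elimination as described above: remove the
>  > >        # weakest features from the training data and retrain, rather
>  > >        # than merely zeroing their weights.
>  > >        features = np.arange(X.shape[1])
>  > >        while len(features) > n_keep:
>  > >            model = train_fn(X[:, features], y)           # retrain on survivors
>  > >            order = np.argsort(np.abs(weight_fn(model)))  # weakest first
>  > >            n_drop = min(n_remove, len(features) - n_keep)
>  > >            features = features[order[n_drop:]]           # drop the weakest
>  > >        return features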
>
>  > >  --
>  > >  Yaroslav Halchenko
>  > >  Research Assistant, Psychology Department, Rutgers-Newark
>  > >  Student  Ph.D. @ CS Dept. NJIT
>  > >  Office: (973) 353-5440x263 | FWD: 82823 | Fax: (973) 353-1171
>  > >         101 Warren Str, Smith Hall, Rm 4-105, Newark NJ 07102
>  > >  WWW:     http://www.linkedin.com/in/yarik
>
>
>
>
>  --
>
>
> Yaroslav Halchenko
>  Research Assistant, Psychology Department, Rutgers-Newark
>  Student  Ph.D. @ CS Dept. NJIT
>  Office: (973) 353-5440x263 | FWD: 82823 | Fax: (973) 353-1171
>         101 Warren Str, Smith Hall, Rm 4-105, Newark NJ 07102
>  WWW:     http://www.linkedin.com/in/yarik
>


