[Pkg-exppsy-maintainers] Q: Crossvalidation feature selection

Yaroslav Halchenko debian at onerussian.com
Mon Dec 24 21:08:27 UTC 2007


Hi Per,

And Merry Christmas!

> > In any case, Michael  would correct me if I am wrong, by now we didn't
> > yet have a Classifier which would do some feature selection, ie you had
> > to implement loop through the splits manually and run RFE
> > FeatureSelection (using some SensitivityAnalyzer such as OnewayAnova or
> > LinearSVMWeights if you use SVM) on each split manually.
> OK, I can certainly do that loop and keep track of the results myself.
>  I just thought some version may already be there (and I see it may be
> soon :))
oki doki -- it is accomplished to some degree ;-)
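
For reference, the manual approach from the quoted text (loop over the
splits and redo the feature selection on each training portion) could
look roughly like the sketch below. It is plain Python, not PyMVPA code:
all function names are made up for illustration, a class-mean difference
stands in for a real SensitivityAnalyzer, and a nearest-mean rule stands
in for a real classifier.

```python
def mean_difference(samples, labels):
    """Toy sensitivity: |mean(class 1) - mean(class 0)| for each feature."""
    scores = []
    for column in zip(*samples):
        class0 = [x for x, y in zip(column, labels) if y == 0]
        class1 = [x for x, y in zip(column, labels) if y == 1]
        scores.append(abs(sum(class1) / len(class1) - sum(class0) / len(class0)))
    return scores


def select_top(samples, labels, n_keep):
    """Indices of the n_keep features with the highest toy sensitivity."""
    scores = mean_difference(samples, labels)
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:n_keep])


def nearest_mean_predict(train_x, train_y, test_x):
    """Assign each test sample to the class with the closest training mean."""
    means = {}
    for label in set(train_y):
        rows = [r for r, y in zip(train_x, train_y) if y == label]
        means[label] = [sum(col) / len(rows) for col in zip(*rows)]

    def sq_dist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))

    return [min(means, key=lambda lab: sq_dist(row, means[lab])) for row in test_x]


def cross_validate(samples, labels, n_keep):
    """Leave-one-out error; features are selected on the training part only."""
    errors = 0
    for test in range(len(samples)):
        train = [i for i in range(len(samples)) if i != test]
        train_x = [samples[i] for i in train]
        train_y = [labels[i] for i in train]
        kept = select_top(train_x, train_y, n_keep)  # never sees the test sample
        reduced_train = [[row[j] for j in kept] for row in train_x]
        reduced_test = [[samples[test][j] for j in kept]]
        prediction = nearest_mean_predict(reduced_train, train_y, reduced_test)[0]
        errors += prediction != labels[test]
    return errors / len(samples)


# Feature 1 separates the classes; features 0 and 2 are noise.
samples = [
    [0.3, 0.0, 1.0], [-0.3, 0.2, 1.1], [0.3, 0.1, 0.9],
    [-0.3, 1.0, 1.0], [0.3, 1.2, 1.1], [-0.3, 1.1, 0.9],
]
labels = [0, 0, 0, 1, 1, 1]
error_rate = cross_validate(samples, labels, n_keep=1)
```

The point of the loop is that select_top only ever sees the training
samples of the current split, so the held-out sample cannot influence
which features get chosen.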

I really hate the naming of the classifiers we have now, so a bit of
refactoring will be needed to make the names sane. If you have any
suggestions, please don't hesitate to share them.

OK -- we have a few new classifiers and feature selections. First let
me describe the FeatureSelection classes:

SensitivityBasedFeatureSelection -- the basic one, akin to a single step
  of RFE: it just removes some features based on the results of some
  SensitivityAnalyzer.

FeatureSelectionPipeline -- applies a list of FeatureSelection
  algorithms in sequence (e.g. first remove the 50% of features that are
  silent according to ANOVA, then do SVM-sensitivity-based RFE on the
  rest).
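
To make the two selections above a bit more concrete, here is a minimal
plain-Python sketch of the idea. It is not the PyMVPA code: the names
mean_difference, drop_lowest and run_pipeline are hypothetical, and a
simple class-mean difference stands in for both the ANOVA and the
SVM-weight sensitivities.

```python
def mean_difference(samples, labels):
    """Toy sensitivity: |mean(class 1) - mean(class 0)| for each feature."""
    scores = []
    for column in zip(*samples):
        class0 = [x for x, y in zip(column, labels) if y == 0]
        class1 = [x for x, y in zip(column, labels) if y == 1]
        scores.append(abs(sum(class1) / len(class1) - sum(class0) / len(class0)))
    return scores


def drop_lowest(score_fn, fraction):
    """Build a one-step selection that discards the lowest-scoring fraction."""
    def step(samples, labels):
        scores = score_fn(samples, labels)
        n_discard = int(len(scores) * fraction)
        # rank feature indices from least to most sensitive
        ranked = sorted(range(len(scores)), key=lambda i: scores[i])
        return sorted(ranked[n_discard:])
    return step


def run_pipeline(steps, samples, labels):
    """Apply the steps in order; return surviving indices in original numbering."""
    kept = list(range(len(samples[0])))
    for step in steps:
        reduced = [[row[i] for i in kept] for row in samples]
        local = step(reduced, labels)        # indices into the reduced data
        kept = [kept[i] for i in local]      # map back to original indices
    return kept


samples = [
    [0.0, 1.0, 5.0, 2.0],
    [0.1, 1.2, 5.1, 2.0],
    [0.0, 3.0, 9.0, 2.1],
    [0.1, 3.2, 9.1, 2.1],
]
labels = [0, 0, 1, 1]

# Two stages, each dropping half of whatever features are left.
steps = [drop_lowest(mean_difference, 0.5), drop_lowest(mean_difference, 0.5)]
kept = run_pipeline(steps, samples, labels)
```

A single drop_lowest step plays the role of the sensitivity-based
selection, and run_pipeline chains several of them. Each later step is
scored on the reduced data only, which is why the indices have to be
mapped back to the original numbering after every stage.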

Now we come to the classifier you wanted:

FeatureSelectionClassifier -- given a base classifier and a
  FeatureSelection (like one of those above), creates a classifier which
  first does feature selection, then trains and predicts using only the
  selected features.
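
The wrapping idea can be sketched in a few lines of plain Python. Again
this is only an illustration under assumptions, not the actual class:
SelectThenClassify, NearestMeanClassifier and keep_top_half are made-up
names, and the train/predict interface is assumed for the sketch.

```python
def keep_top_half(samples, labels):
    """Toy selection: keep the half of the features whose class means differ most."""
    scores = []
    for column in zip(*samples):
        class0 = [x for x, y in zip(column, labels) if y == 0]
        class1 = [x for x, y in zip(column, labels) if y == 1]
        scores.append(abs(sum(class1) / len(class1) - sum(class0) / len(class0)))
    n_keep = max(1, len(scores) // 2)
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:n_keep])


class NearestMeanClassifier:
    """Toy base classifier: predict the class with the closest mean."""

    def train(self, samples, labels):
        groups = {}
        for row, label in zip(samples, labels):
            groups.setdefault(label, []).append(row)
        self.means = {
            label: [sum(col) / len(rows) for col in zip(*rows)]
            for label, rows in groups.items()
        }

    def predict(self, samples):
        def sq_dist(a, b):
            return sum((p - q) ** 2 for p, q in zip(a, b))
        return [
            min(self.means, key=lambda lab: sq_dist(row, self.means[lab]))
            for row in samples
        ]


class SelectThenClassify:
    """Select features on the training data, then train/predict on those only."""

    def __init__(self, clf, select):
        self.clf = clf
        self.select = select
        self.kept = None

    def _reduce(self, samples):
        return [[row[i] for i in self.kept] for row in samples]

    def train(self, samples, labels):
        self.kept = self.select(samples, labels)   # selection sees training data only
        self.clf.train(self._reduce(samples), labels)

    def predict(self, samples):
        return self.clf.predict(self._reduce(samples))


# Feature 0 is large-amplitude noise, feature 1 carries the class signal.
train_x = [[10.0, 0.0], [-10.0, 0.2], [10.0, 1.0], [-10.0, 1.2]]
train_y = [0, 0, 1, 1]

clf = SelectThenClassify(NearestMeanClassifier(), keep_top_half)
clf.train(train_x, train_y)
predictions = clf.predict([[50.0, 0.1], [-50.0, 1.0]])
```

Because the selection happens inside train(), such a wrapper can be
handed to a cross-validation loop like any other classifier, and the
selection is then redone on each training split automatically.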


I've done only basic testing, so it might still have bugs. Also, I
really want to discuss this with Michael and maybe come up with a better
naming convention. Maybe we could simply shorten the class names: e.g.
every Classifier-derived class gets a Clf suffix, every
FeatureSelection-derived class gets an FS suffix, etc.

> I'm working on an analysis right now that could easily be generalized
> into an example once this classifier is in there.  Let me know when
> you want me to try it out.
please do ;-)




-- 
Yaroslav Halchenko
Research Assistant, Psychology Department, Rutgers-Newark
Student  Ph.D. @ CS Dept. NJIT
Office: (973) 353-5440x263 | FWD: 82823 | Fax: (973) 353-1171
        101 Warren Str, Smith Hall, Rm 4-105, Newark NJ 07102
WWW:     http://www.linkedin.com/in/yarik        


