[Pkg-exppsy-maintainers] Q: Crossvalidation feature selection
Yaroslav Halchenko
debian at onerussian.com
Mon Dec 24 21:08:27 UTC 2007
Hi Per,
And Merry Christmas!
> > In any case, Michael would correct me if I am wrong, by now we didn't
> > yet have a Classifier which would do some feature selection, ie you had
> > to implement loop through the splits manually and run RFE
> > FeatureSelection (using some SensitivityAnalyzer such as OnewayAnova or
> > LinearSVMWeights if you use SVM) on each split manually.
> OK, I can certainly do that loop and keep track of the results myself.
> I just thought some version may already be there (and I see it may be
> soon :))
oki doki -- it is accomplished to some degree ;-)
I really hate the naming of the classifiers we have now, so a bit of
refactoring will be needed to make them sane. If you have any
suggestions -- please don't hesitate to share them.
Ok -- we have a few new classifiers and sensitivity-based selections.
First let me describe the FeatureSelections:
SensitivityBasedFeatureSelection -- just a basic one, akin to a 1-step
RFE -- i.e. it just removes some features based on the results of some
SensitivityAnalyzer.
FeatureSelectionPipeline -- applies a list of FeatureSelection
algorithms in sequence (e.g. first remove the 50% of features that are
"silent" according to ANOVA, and then do SVM-sensitivity-based RFE on
the rest).
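
To make the two selections above a bit more concrete, here is a rough
sketch in plain Python. It is not the PyMVPA code and does not use its
API: sensitivity_step, apply_pipeline and class_gap are names made up
for illustration, and class_gap is only a toy stand-in for a real
SensitivityAnalyzer such as OnewayAnova (it assumes exactly two classes).

```python
def sensitivity_step(score, discard_fraction=0.5):
    """Build a one-step selection: drop the lowest-scoring features.

    `score(samples, labels)` is assumed to return one number per feature.
    The returned step gives the kept indices relative to its input.
    """
    def step(samples, labels):
        scores = score(samples, labels)
        n_discard = int(len(scores) * discard_fraction)
        # rank features from least to most sensitive
        ranked = sorted(range(len(scores)), key=lambda i: scores[i])
        return sorted(ranked[n_discard:])
    return step


def apply_pipeline(samples, labels, steps):
    """Run selection steps in order; return kept indices in the
    numbering of the original feature space."""
    kept = list(range(len(samples[0])))
    current = samples
    for step in steps:
        local = step(current, labels)           # indices into `current`
        kept = [kept[i] for i in local]         # map back to original
        current = [[row[i] for i in local] for row in current]
    return kept


def class_gap(samples, labels):
    """Toy sensitivity: absolute gap between the two class means."""
    first, second = sorted(set(labels))
    def mean_of(cls):
        rows = [s for s, l in zip(samples, labels) if l == cls]
        return [sum(col) / len(rows) for col in zip(*rows)]
    return [abs(a - b) for a, b in zip(mean_of(first), mean_of(second))]
```

A pipeline is then just a list of steps, e.g.
apply_pipeline(samples, labels, [sensitivity_step(class_gap, 0.5),
sensitivity_step(class_gap, 0.5)]). Each later step only sees the
features that survived the earlier ones.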
Now we come to the classifier you wanted:
FeatureSelectionClassifier -- given a base classifier and a
FeatureSelection (like one of those above), it creates a classifier
which first does feature selection, and then trains and predicts using
only the selected features.
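
The idea of such a wrapper could be sketched roughly like this, again in
plain Python and not with the real PyMVPA classes. SelectingClassifier,
NearestMeanClassifier and largest_gap are hypothetical names used only
for this illustration, and largest_gap assumes exactly two classes.

```python
class NearestMeanClassifier:
    """Tiny base classifier: predict the class with the closest mean."""

    def train(self, samples, labels):
        self.means = {}
        for cls in set(labels):
            rows = [s for s, l in zip(samples, labels) if l == cls]
            self.means[cls] = [sum(col) / len(rows) for col in zip(*rows)]

    def predict(self, samples):
        def dist(a, b):
            return sum((x - y) ** 2 for x, y in zip(a, b))
        return [min(self.means, key=lambda c: dist(s, self.means[c]))
                for s in samples]


class SelectingClassifier:
    """Wrap a classifier so it trains and predicts on selected features."""

    def __init__(self, clf, select):
        self.clf = clf
        self.select = select    # callable: (samples, labels) -> indices
        self.kept = None

    def _reduce(self, samples):
        return [[row[i] for i in self.kept] for row in samples]

    def train(self, samples, labels):
        # selection sees the training data only
        self.kept = self.select(samples, labels)
        self.clf.train(self._reduce(samples), labels)

    def predict(self, samples):
        return self.clf.predict(self._reduce(samples))


def largest_gap(n_keep):
    """Toy selection: keep the n_keep features whose class means differ most."""
    def select(samples, labels):
        first, second = sorted(set(labels))
        def mean_of(cls):
            rows = [s for s, l in zip(samples, labels) if l == cls]
            return [sum(col) / len(rows) for col in zip(*rows)]
        gaps = [abs(a - b) for a, b in zip(mean_of(first), mean_of(second))]
        ranked = sorted(range(len(gaps)), key=lambda i: gaps[i], reverse=True)
        return sorted(ranked[:n_keep])
    return select
```

Because the selection runs inside train(), using such a wrapper in a
cross-validation loop redoes the feature selection on every training
split, so the test samples of a split never influence which features
are kept.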
I've done just basic testing, so it still might have bugs. Also, I
really want to discuss this with Michael and maybe come up with a better
naming convention. Maybe we simply shorten the names of the classes:
e.g. every Classifier-derived class simply gets a Clf suffix, every
FeatureSelection gets FS, etc.
> I'm working on an analysis right now that could easily be generalized
> into an example once this classifier is in there. Let me know when
> you want me to try it out.
please do ;-)
--
Yaroslav Halchenko
Research Assistant, Psychology Department, Rutgers-Newark
Student Ph.D. @ CS Dept. NJIT
Office: (973) 353-5440x263 | FWD: 82823 | Fax: (973) 353-1171
101 Warren Str, Smith Hall, Rm 4-105, Newark NJ 07102
WWW: http://www.linkedin.com/in/yarik