[scikit-learn] annotated tag 0.14 created (now ed8d2a2)
Andreas Tille
tille at debian.org
Wed Dec 28 13:10:48 UTC 2016
This is an automated email from the git hooks/post-receive script.
tille pushed a change to annotated tag 0.14
in repository scikit-learn.
at ed8d2a2 (tag)
tagging d13928cc0653f52de55e22118915b0c5bcba13d7 (commit)
replaces 0.4
tagged by Gael Varoquaux
on Thu Aug 8 00:50:13 2013 +0200
- Log -----------------------------------------------------------------
0.14 release
A. Flaxman (3):
DOC: add random_state parameter to StratifiedShuffleSplit doc string
DOC: latex beautification
DOC: latex beautification
Abhijeet Kolhe (1):
Fix setup.py to resolve numpy requirement
Adrien Gaidon (5):
FIX: typo for default init_size in MiniBatchKMeans
Added tests to check for the correct value of init_size
FIX: make GridSearchCV work with precomputed kernels
raise ValueError when given a kernel_function or a non-square kernel matrix + some tests
Fixed a small typo
Alejandro Weinstein (1):
Fix link to plot_lda_qda example.
Alex Companioni (1):
Issue #339: minimizing number of calls in tests.test_hmm.
Alexander Fabisch (1):
DOC update example path
Alexandre Abraham (7):
Fix a bug in the ward clustering.
Add a non-regression test for the bug of connectivity fixing.
Put conversion after component computation
Fix test function name.
Fix typos
BUG: Fix path in doc cleaning
Merge branch 'master' of https://github.com/jaquesgrobler/scikit-learn into fix_doc_clean
Alexandre Gramfort (620):
Merge branch 'master' of /Volumes/DAVID/scikit-learn
Merge branch 'master' of ssh://scikit-learn.git.sourceforge.net/gitroot/scikit-learn/scikit-learn
API: changing the way the parameters of Lasso+E-Net are optimized
ENH : imroving documentation of lasso + enet paths function
ENH : add LeavePLabelOut cross-validation generator
ENH : adding support for mean-shift clustering with a flat kernel
ENH: making data contiguous in memory in coordinate descent
ENH: adding affinity propagation algorithm
removing pl.show()
ENH: adding exception raising
Merge branch 'master' of github.com:agramfort/scikit-learn
using staticmethod rather than property
cosmit
setting array as fortran in lasso + enet coordinate descent
Merge branch 'master' of ssh://scikit-learn.git.sourceforge.net/gitroot/scikit-learn/scikit-learn
MISC : renaming affinity propagation example
broke glm to improve model selection
ongoing work on glm with crossval
Merge branch 'master' of http://github.com/GaelVaroquaux/scikit-learn
Merge branch 'master' of http://github.com/GaelVaroquaux/scikit-learn
Merge branch 'master' of http://github.com/GaelVaroquaux/scikit-learn
continue improve glm cv
Merge branch 'master' of http://github.com/GaelVaroquaux/scikit-learn
fix glm cv
Merge branch 'master' of http://github.com/GaelVaroquaux/scikit-learn
fix frozenset
BUG : fix in affinity propagation
BUG : fix in stock market example
BUG : fix with blas on mac os x
ENH : moving bench_glm.py to benchmarks folder
ENH : glm coordinate descent with BLAS
BUG : fix blas support in setup.py with coordinate descent
ENH : adding stratified cross-validation object
ENH : fix doctests in glm, svm and lda
adding grid search code
BUG : fix doctests in neighbors
BUG : fix doctest in datasets/base.py
ENH : using digits in grid search example
Merge branch 'master' of github.com:GaelVaroquaux/scikit-learn
API : renaming GridSearch to GridSearchCV
API : cross val generator in now given in fit in grid search object
Merge branch 'master' of github.com:GaelVaroquaux/scikit-learn
ENH : update grid search example
ENH : first draft of RFE
ENH : fix RFE + example
ENH : improve RFE
ENH : adding loss functions in metrics.py
Merge branch 'master' of github.com:agramfort/scikit-learn
Merge branch 'master' of github.com:agramfort/scikit-learn
ENH : fix RFE and RFECV
Merge branch 'master' of ssh://scikit-learn.git.sourceforge.net/gitroot/scikit-learn/scikit-learn
ENH : allow grid search to work with lists of grids
ENH : using BaseEstimator with GNB
cosmit'
ENH : adding BaseClassifier and BaseRegressor base classes
ENH : using mixin rather than base class to bring score methods to estimators
ENH : fix in svc.coef_ + cosmit
ENH : fix in svc.coef_ + cosmit
ENH : using np.logspace instead of np.linspace in paths
ENH : using np.logspace instead of np.linspace in paths (after merge)
API : making Y optional in fit for OneClassSVM
FIX : removing duplicated example
Merge branch 'master' of ssh://scikit-learn.git.sourceforge.net/gitroot/scikit-learn/scikit-learn
ENH : new SVR example
ENH : new SVR example
Merge branch 'temp'
ENH : improve QDA (taken from Matt Perrot)
ENH : improve LDA (taken from Matt Perrot)
ENH : improve LDA QDA example (taken from Matt Perrot)
MISC: cosmit in LDA, QDA
ENH : new example for LDA vs QDA
ENH : removing old example for LDA vs QDA
ENH : attempt to have a default parameter for bandwidth in MeanShift algorithm
ENH : adding doc to clustering module API : adding trailing underscores to estimates in clustering classes
ENH: adding test for RFE and reaching 100% coverage
ENH : adding doc for grid_search module
MISC : cosmit nfeatures -> n_features, nsamples -> n_samples, nclasses -> n_classes
FIX : adding missing doc file
FIX : fix in subplot index in plot_iris.py
ENH : removing unused preprocessing routines
FIX : in Makefile that calls now nosetests directly
FIX: removing useless imports
ENH : more work on LARS (doc + examples)
Merge branch 'master' of github.com:scikit-learn/scikit-learn
ENH : continue refactoring of GLM module (doc, moving files, config etc.)
ENH : more refactoring of GLM module
FIX : fixing __init__ files for examples
ENH : cosmit + fix examples for doc generation
cosmit in examples
Merge branch 'master' of github.com:scikit-learn/scikit-learn
BUG : fix in Lars at the end of path + more tests (not working yet)
ENH : using explained variance as score for regression problems
ENH: on the use of explained_variance in mixin regressor class
FIX : fix in handling of intercept in glm base and ridge
TEST : adding test to ridge with no intercept
ENH : draft of what could be a preprocessing routine (done by hand for now)
FIX : prevent pipeline.score to do a fit which was wrong
FIX : it may happen that pipeline.estimator do not implement predict
moving ridge out of bayes.py
ENH : adding PCA filter
ENH : adding computation of percentage of variance explained by each component
FIX : for doctest in PCA
ENH : adding ledoit-wolf for robust covariance estimation
ENH : adding FastICA class + example
Merge branch 'add_ica' of http://github.com/bthirion/scikit-learn into ica
ENH : more on ICA (examples + doc)
Merge branch 'ica'
FIX : fix ica vs pca example
ENH : adding example + refactor in covariance module
splitting ledoit_wolf.py in two files
oups missing example file
FIX: missing covariance.py
cleaning the handling of the intercept in GLM linear models
ENH : avoiding computing a pinv at each iteration in BayesianRidge
Merge branch 'master' of github.com:scikit-learn/scikit-learn
TEST : bayes
FIX : imports in __init__ of glm
cosmit in ARDRegression
TEST : removing lda.py
COSMIT : PEP 8 in PCA
Merge branch 'master' of github.com:scikit-learn/scikit-learn
FIX : RegressinMix score had flipped y_true and y_pred
EXAMPLE : adding model selection example with train/test error graphical illustration
EXAMPLE : making only one figure in model selection example
Merge branch 'master' of github.com:scikit-learn/scikit-learn
FIX __init__.py of glm.sparse
EXAMPLE: add example of dense vs sparse Lasso on dense and sparse data
FIX : example of dense vs sparse Lasso on dense and sparse data
passing Gram in LARS and LassoLARS
ENH : more doc in lars.py, handling of intercept
Merge branch 'fabian/python_lars_fast_2'
fix doctests in lars.py
more on lasso benchmark
Merge branch 'master' of github.com:scikit-learn/scikit-learn
FIX : preprocessing : scaler should not be allowed with axis=1 (opt removed)
skipping BayesianRidge failing test
using diabetes in lasso/lars examples
removing assert for debug
ENH : speeding up the LARS
BUG: bug fix in LARS Lasso mode + speed improvement (we can still do better)
adding LARS with Gram to benchmark
ENH : speed in LARS by forcing X to be fortran ordered + cosmit (unactive -> inactive)
pretifying the LAR / LARS examples to match with results on wikipedia page
cosmit in docs of glm module
Merge branch 'master' of github.com:scikit-learn/scikit-learn
ENH : improvements in bayes
Merge branch 'master' of github.com:scikit-learn/scikit-learn
fix doc generation on plot_lasso_coordinate_descent_path.py example (pb on my box)
DOC: updating doc on Univariate feature selection
FIX: ticket 147 on pb with 2d y in f_regression
FIX: ticket 147 on pb with 2d y in f_classif
adding 'iid' option in cross_val_score
Merge branch 'master' of git@github.com:scikit-learn/scikit-learn
pretifying plot_weighted_classes.py
FIX: quick fix in predict_proba in LogisticRegression
removing debug compile flags
sgd module code review
sgd module code review
adding path example on logistic on IRIS dataset
Merge branch 'sgd'
increasing precision in plot_logistic_path.py to get nicer path
DOC: spelling
ENH : pyflakes on examples to avoid useless imports + addint print __doc__
ENH : more love in examples (adding print __doc__ + some brief descriptions in headers + fixing Anova SVC Pipeline example)
FIX: fix docstrings in LARS (issue 8 on github)
cosmit + typos in doc
Merge branch 'master' of github.com:scikit-learn/scikit-learn
ENH: adding partial support for predict_log_proba in Pipeline and log reg
rewriting f_oneway in the scikit to avoid useless recomputations
Merge branch 'master' into log_proba
adding comment to explain the reimplementation of f_oneway
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Merge branch 'master' into log_proba
typo
removing use_svd option in LDA. Only scipy SVD is supported.
Merge branch 'master' into log_proba
ENH: adding predict_log_proba to LDA and QDA + tests to reach 100% coverage
ENH : adding support for predict_log_proba in Naive Bayes
ENH: adding support for predict_log_proba in SVC and sparse.SVC
ENH : adding predict_log_proba in sparse logistic regression
FIX: make sure class_weight='auto' do not change the result for balanced problems
Merge branch 'master' into log_proba
API : implement coef_init as fit parameter in glm.coordinate_descent module.
API: exposing fit_intercept params in LassoCV and ElasticNetCV
ENH : adding test in pipeline + increase coverage
fix doc generation pb introduced by previous commit
FIX: fix class weight auto
pep8 in plot_weighted_samples.py
ENH : adding kneighbors_graph to build the graph of neighbors as a sparse matrix
FIX fragile doctest
ENH : adding NeighborsBarycenter for regression pbs using k-Nearest Neighbors
DOC: adding NeighborsBarycenter to doc
DOC: better docstring for barycenter_weights function
DOC: even better docstrings in neighbors
MISC: reindenting BallTree C++ code (no tabs + 4 spaces)
DOC : more on docstrings in neighbors.py
review of gaussian process module
API renmae k->n_neighbors
Merge branch 'log_proba'
ENH : improving the speed of ridge with inplace computation + symmetric pos def constraint
Merge branch 'neighbor_barycenter'
ENH : coordinate descent speed up when n_samples > n_features in cd_fast.pyx
ENH : allowing Gram matrix precomputing in Lasso / ElasticNet to
Merge branch 'master' of github.com:scikit-learn/scikit-learn
ENH : speed improvement in lasso_path with precomputed gram matrix
Merge branch 'master' of github.com:scikit-learn/scikit-learn
pep8 in coordinate_descent.py
pep8 + N->n_samples and D->n_features
Merge branch 'master' of github.com:scikit-learn/scikit-learn
giving more love to benchmarks (pep8, pyflakes, var names, etc...)
Merge branch 'master' of github.com:scikit-learn/scikit-learn
revert previous commit regarding mpl_toolkits.mplot3d in bench
API : maxit replaced by max_iter everywhere
ENH : new scikits.learn.metrics.pairwise module
Merge branch 'master' of https://github.com/dubourg/scikit-learn into dubourg-master
pyflakes in plot_gp_diabetes_dataset.py
renaming plot_gp_diabetes_dataset.py as nothing is plotted
FIX : fix extra parenthesis in mixture ...
reviewing hierarchical clustering code
adding missing setup.py in cluster
ENH : nicer implementation of StratifiedKFold now usable with regression
DOC: updating doc for StratifiedKFold + ellipsis in svm support
ENH : adding function to test the significance of a cross val score with permutations in supervised problems
ENH : add possibility to pass RandomState
s/permutation_score/permutation_test_score
fix pb with nose and permutation_test_score function
Merge branch 'permutations'
FIX : really accurate pvalue in cross-val permutation test
FIX : even more accurate pvalue in cross-val permutation test
s/euclidian_distances/euclidean_distances
typo
ENH : cross-val generator can now return integer indices
DOC: better docstring in cross val with indices
DOC: update RST doc for crossval with indices
removing print used for debug
ENH : speeding up kneighbors_graph function avoiding the use of a LIL matrix
FIX : in hierarchial cluster + Mixin fix + tests + coverage + PEP8
FIX : fix pb in affinity propagation when S dtype is not float
ENH : adding inverse_transform to pipeline + better handling of coef_
ENH : adding coef_ attribute in GridSearchCV
ENH : adding inverse_transform to univariate selectors + pep8
removing old svn id tag
ENH : refactoring Ward feature agglomeration to make it work with Pipeline
first attempt to use caching in gridsearch with hierarchical clustering... WIP
ENH : improving ward for better joblib caching
removing plot_dendogram function
TEST : fix ward clustering tests
in hierarchical : s/adjacency_matrix/connectivity, s/k/n_clusters
remaining s/k/n_clusters
ENH (ward): return children as numpy array (better for joblib)
Merge branch 'master' into asaf
ENH: avoid storing parent and weights in Ward (better joblib)
DOC : better docstring in hierarchical clustering
adding example to rst doc
better ward rst doc examples
moving swiss_roll generator in samples generator
removing Return from class docstring
s/cord_/coord_
in setup.py s/ward/cluster
Merge branch 'hcluster' into hcluster2 that matches master
fix remaining n_comp
FIX : fixing Lars lasso with early stopping using alph_min + adding test for it
Merge branch 'master' of github.com:scikit-learn/scikit-learn
fix LassoLARS docstring
Merge branch 'hcluster2' of http://github.com/bthirion/scikit-learn into hcluster2
adding test scikit vs scipy.
FIX: ugly bug in connectivity on grids and images
ENH : factorizing img_to_graph and grid_to_graph
ENH : ones on diag in grid_to_graph + fix dtype
cosmit
Merge branch 'hcluster2'
cosmits with trailing spaces
Merge branch 'master' of github.com:scikit-learn/scikit-learn
pretifying nmf plot
pep8
ENH : using make_blobs in plot_affinity_propagation
ENH : using make_blobs in plot_mean_shift
ENH : using make_blobs in plot_mini_batch_kmeans
FIX : removing useless seed fix in plot_mean_shift
Merge pull request #178 from kwgoodman/master
Merge pull request #181 from lucaswiman/master
prettify plot_sparse_pca.py
adding authors in sparse pca
ENH : prettify dict learn example on image patches
pep8
prettify plot_sparse_pca.py
adding authors in sparse pca
FIX : using product form utils.fixes for python 2.5
pep8
MISC : fix docstring, cosmit in image.py
FIX; missing import in dict_learning.py (OMP in transform in not tested
ENH : new radius_neighbors_graph to build graph of nearest neighbor from radius
DOC: adding radius_neighbors_graph to doc
pep8
Merge pull request #230 from agramfort/radius_neighbors_graph
pep8
FIX : fix failing test in comparison between lassoCD and lars
pyflakes warnings
pep8
DOC: adding note on glmnet parameter correspondance in ElasticNet
ENH : adding LASSO model selection example based on BIC and AIC
BUG: s/empty/zeros in plot_lasso_bic_aic.py
pep8
Merge pull request #265 from JeanKossaifi/master
API : renaming LARS to Lars
MISC: s/larslasso_results/lars_lasso_results
pep8
Merge branch 'master' into rename_lars
Merge branch 'master' of github.com:scikit-learn/scikit-learn into rename_lars
Merge branch 'master' of github.com:scikit-learn/scikit-learn into rename_lars
ENH: adding LARS and LassoLARS deprecated classes
Merge pull request #278 from agramfort/rename_lars
Merge pull request #281 from glouppe/master
pep8
ENH : prettify OMP/LARS benchmark
Merge pull request #277 from vene/omp
ENH: speed up estimate_bandwidth with BallTree + use make_blobs in test_mean_shift.py
ENH : using make_blobs in cluster examples
pep8
FIX : using product form utils.fixes for python 2.5
pep8
MISC : fix docstring, cosmit in image.py
Merge pull request #295 from bdholt1/boston
DOC : fix doc building
ENH : new LassoLarsIC estimator
MISC : adding GaelVaroquaux to the authors of least_angle.py
ENH: addressing @ogrisel's comments on PR 298
ENH + DOC: addressing @GaelVaroquaux's comments
DOC: clarify doc on BIC/AIC
Merge branch 'master' of github.com:scikit-learn/scikit-learn into normalize_data
Style + typos
API : adding proper normalize options in Lasso and ElasticNet with clean up
ENH : more standard import of scipy.sparse
FIX : fix rounding error in test + pep8
FIX : putting back common.py
FIX : in meanshift typos, style, example
Merge pull request #346 from npinto/patch-1
DOC : fix sgd docstring
ENH : better plot_img_denoising
Merge pull request #350 from tinyclues/master
STY : pep8
STY: mostly style + avoid a zip in favor of an np.argsort
STY : in label_propagation.py
ENH : using numpy broadcasting instead of dot_out
Merge pull request #376 from fabianp/fast_tests
STY: imports in covariance + pep8
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Merge pull request #404 from amueller/grid_search_doc
STY: pep8 + naming
DOC: prettify plot_permutation_test_for_classification.py
DOC : adding permutation_test_score to changelog
ENH : adding support for scaling by n_samples of the C parameter in libsvm and liblinear models
FIX : removing param nu from sparse.SVR, C from NuSVR + pep8
$Merge branch 'master' into n_samples_scaling
typo
s/C_scale_n_samples/scale_C
STY: pep8 + pyflakes
Merge pull request #464 from NelleV/FIX_bibtex
Merge branch 'master' into n_samples_scaling
STY: prettify doctest
ENH : adding scale_C in NuSVR
ENH : more contrasted colormap
MISC: typos + subplot adjust
ENH : C scaling of sparse models
Merge remote-tracking branch 'origin/master' into n_samples_scaling
ENH : adding missing scale_C in docstring
Merge pull request #465 from amueller/fastica_wowhiten
STY: PEP 257 in ridge.py
Merge pull request #473 from amueller/dataset_whitespace
Merge pull request #477 from jakevdp/gmm-fix
ENH : avoid global seeding in plot_polynomial_interpolation.py
ENH : clean up plot_feature_selection.py
Merge pull request #482 from DraXus/master
STY : pep8 and add print __doc__ in plot_sparse_coding.py
Merge branch 'master' of https://github.com/scikit-learn/scikit-learn
STY : pep8
Merge branch 'master' of https://github.com/scikit-learn/scikit-learn
misc
STY: s/grid_points/cv_scores
Merge pull request #495 from vene/sc-mixin
Merge pull request #507 from jakevdp/neighbors-check
Merge pull request #532 from amueller/grid_search_attributes
ENH : reformatting hmm_stock_analysis.py examples
MISC : typos in hmm_stock_analysis.py
ENH : rename hmm_stock_analysis.py so it appears as a figure in the doc
ENH : make metrics.auc work with 2 samples + add test
Merge pull request #591 from jaquesgrobler/doc_update
fix with new as_float_array
STY: pep8
mv randomized_lasso.py randomized_l1.py
ENH : some doc + renaming in RandomizedLasso
ENH : better plot_randomized_lasso.py with score path
ENH : prettify plot_randomized_lasso.py
ENH : creating lasso_stability_path func + adding tests on randomized_l1
ENH : add docstring to RandomizedLogistic
FIX: fix test_randomized_logistic
STY: s/a/scaling + adding docstring
DOC : adding doc for Randomized sparse linear models + fix test
ENH : adding sample_fraction to lasso_stability_path + add to doc
typos
cosmit in doc + pep8
cosmit in doc
ENH : addressing @ogrisel comments (PEP257, naming, see also)
DOC: rephrase rand linear model doc
ENH : fix docstrings + add func missing reference
ENH : center y too in _randomized_lasso
ENH : adding support for multiple regularization parameters in RandomizedLinearModel
MISC: removing one XXX
ENH : early stopping in lasso_stability_path (faster)
ENH : fix legeng of plot_randomized_lasso.py
pep8
API: set scale_C to True by default in libsvm/liblinear models
update what's new
DOC : add warning in docstrings for scale_C gone in 0.12
DOC: indent pb
DOC: update scale_C docstrings + add notes to svm.rst
ENH : use not(scale_C)
remaining docstring to be updated
update docstring with WARNING
TST: use assert_true instead of assert + remove some relative imports
FIX : fix SVM examples with new scale_C=True
FIX : fix ward benchmark
Merge pull request #654 from GaelVaroquaux/enet_cv
Merge pull request #679 from amueller/logistic_l1_l2_sample
API: use C=None by default in libsvm/liblinear bindings so (C=1, scale_C=False) which is libsvm default == (C=None, scale_C=True) which is the scikit default
FIX : remove useless C definition in non-fit methods
ENH : adding scaled_C_ attribute
Merge pull request #699 from njwilson/issue-250
TST : add test on ridge shapes for different y shapes
TST : add test failing test to reproduce #708
FIX : fix test for #708
FIX : fix test failing with OMP
ENH: y_mean with consistent shape in _center_data
FIX : prevent ICA with defined n_camponents and whiten=False (fix for #697)
TST: capture warning in test
FIX : use joblib from externals
Merge pull request #728 from satra/fix/f_regression
ENH : speed up f_regression
FIX : array copy for compat pb
FIX : missing self.copy = copy in PLS GH Issue #758
cosmit : consistent linestyle in plot_lasso_coordinate_descent_path
ENH : add duality gap check with Lasso(positive=True)
Merge pull request #747 from ibayer/posCoeff
Merge pull request #773 from amueller/forest_pre_dispatch
Merge pull request #782 from jaquesgrobler/Update_Changelog
Merge pull request #783 from dwf/svm_docs_minor
change web site for agramfort
FIX : fix SVC pickle with callable kernel
cosmit
FIX : callable kernel for prediction
FIX : sparse SVC clone with callable kernel
Merge pull request #796 from amueller/kmeans_dtype
Merge pull request #814 from invisibleroads/master
Merge pull request #813 from invisibleroads/patch-1
FIX : make plot_ica_vs_pca.py deterministic (fix for #815)
Merge pull request #802 from amueller/arpack_backports
typo
fix for #824
DOC : update SVM examples with scale_C
API : change back default C to 1. explicitely and epsilon 0.1
FIX : svm decision function test
Merge pull request #851 from duckworthd/master
TST : tesitng intercept_ between dense and sparse
adding alexis to authors
typo
update tip on svm C param
Merge pull request #872 from jaquesgrobler/master
FIX : use RandomState rather than global seed
Merge pull request #881 from amueller/fix_ica_components_rename
FIX: fix buildbot ICA pb
Merge pull request #876 from alexis-mignon/master
FIX : fix a division by zero in LARS #63
Merge pull request #892 from ibayer/fix_mldata_docstring
FIX: C range in plot_cv_digits
Merge pull request #891 from ibayer/merge_cd
FIX : cleanup classes.rst + pep8 after merge of coordinate descent
Merge pull request #900 from kernc/neighbors_predict_proba
FIX : fix predict_proba in KNeighborsClassifier for old numpy
FIX: fix grid search when X is list #925
Merge pull request #932 from jaquesgrobler/master
Merge pull request #938 from ogrisel/svmlight-double-precision
Merge pull request #969 from jaquesgrobler/master
missing pl.show() in plot_digits_agglomeration.py
Merge pull request #983 from GaelVaroquaux/faster_ward
MISC : update my web site URL in what's new
ENH : MultiTaskLasso works (still draft)
FIX : fix docstring in MultiTaskLasso
ENH : add multi task lasso example
ENH + DOC : add MultiTaskElasticNet + doc + 1 example
update what's new
FIX : support 1d y in MultiTaskFoobar
rename ylabel in MultiTaskLasso example
moving MultiTaskLasso doc after E-net
FIX : remove unnecessary dgemm in cd_fast.pyx
FIX : catching pb with sparse input in MultiTaskElasticNet
FIX : make as_float_array keep fortran order on dense array when copy
ENH: simplify dict learning with gram and reg_param handling
ENH : add copy arg to array2d and new atleast2d_or_csr usual for sparse coordinate descent
ENH : add copy param to array2d_or_csx
ENH : add support for multitarget in sparse enet + simplify input checking
ENH : use multitarget in dict learning
FIX : fix tests
DOC : getting over docstrings
ENH : avoid a copy in MultiTaskElasticNet
add note on what's new
ENH : add support for sparse data in ElasticNetCV/LassoCV (not optimal)
ENH : use multitarget Lars and LassoLars in dict_learning
ENH : simplify handle of copy of Gram and X with array2d in OMP
style + typo
DOC : better reg_param docstring in dict learning
ENH : use build_dataset in multi target test
ENH : update warn for multitarget
update coef_path_ docstrings
use assert_true
API : consistent use alpha_/alphas_ for alpha/alphas estimated by CV in linear models (issue #1041)
DOC : add useful comment in code
addressing for round of reviews
DOC : better docstring for fit_path
DOC : fix rho=1 is L1 penalty #1139
fix failing test
TST : use nose assert_true and not python assert
ENH : proper IsotonicRegression model + example + test
remove support for extrapolation
FIX : for test_common sparse support
pep8
adding my name in IR example
ENH : finish addressing @GaelVaroquaux comments + improve coverage + add linear regression to example
typo
FIX : fix LLE test (don't ask me why...)
misc
DOC : avoid mentioning ElasticNet in Lasso.fit docstring
Merge pull request #1223 from ibayer/master
ENH : cleanup FactorAnalysis object
API : rename psi to noise_variance + some cleanup in FA
TST : add test that FA log like increases over iterations
add Bishop's book to refs in FA
update what's new with FactorAnalysis
DOC : adding FactorAnalysis to classes.rst
FIX : fix application example due to API change
FIX : missing import warnings
typo
typos
DOC: typos in ensemble.rst
DOC: typos in ensemble.rst
FIX : clean test + pep8 + reply fix to the code
API : move isotonic regression out of linear_model
DOC : fix move of isotonic in doc + examples
TST : use assert_true and not assert in test
Merge pull request #1483 from aweinstein/fix_doc_example
Merge pull request #1504 from NelleV/isotonic
Merge pull request #1505 from NelleV/mds
DOC : add doctring in plot_lasso_and_elasticnet.py
DOC: adding Bishop as ref for ARD
Merge pull request #1577 from ApproximateIdentity/n_jobs-documentation
Merge pull request #1578 from zaxtax/elastic_documentation
DOC : missing alpha doc in LassoLars
ENH : add reconstruction_err_ for NMF with sparse input
use scipy.linalg in test_nmf.py
adding comment on why sparse frobenius is ok as done
Merge pull request #1607 from agramfort/reconstruction_err_nmf_sparse
FIX : fix kfold balance due to int rounding
FIX : test due to KFold change
FIX : better fix of KFold balance
fix doctest
TST : improve test_kfold_balance test
update what's new
TST : improve again test_kfold_balance test
Merge pull request #1772 from jnothman/comment_exhaustive_search
typo
pep8
Merge pull request #1907 from aflaxman/stratified_shuffle_split_rand_state_doc_str
Merge pull request #2071 from djv/patch-1
Merge pull request #2075 from jnothman/agglomeration_simplify
FIX : use unique from fixes
Merge pull request #2074 from jnothman/ward_docstring
Merge pull request #2080 from ahojnnes/dist-todo
FIX : missing y=None in FactorAnalysis
Merge pull request #2087 from ahojnnes/examples-print-doc
Merge pull request #2118 from NelleV/DOC_fix
Merge pull request #2135 from fhs/meanshift-doc
Merge pull request #2138 from NelleV/kCCA
Merge pull request #2142 from sergeyf/master
Merge pull request #2145 from NelleV/kCCA
FIX : finish get rid of fit_... param
ENH : avoid one copy in FastICA code
misc
update ICA examples
adding comment
Merge pull request #2196 from erg/labelencoder-docs-fix
ENH : massive refactoring of CV models in coordinate descent. Now the algo core is in path functions
update what's new
DOC : more fixes in covariance module
Merge pull request #2202 from NelleV/isotonic_reverse
Merge pull request #6 from jaquesgrobler/cov_doc_fix
Merge pull request #2203 from agramfort/cov_doc_fix
cosmit : protect attributes in RBM for sphinx
pep8
better coverage
fix doctest
ENH : use warning instead of print
update what's new
Merge pull request #2212 from dengemann/ica_memory
Merge pull request #2213 from cmd-ntrf/master
Merge pull request #2217 from vene/ica_fit_transform
Merge pull request #2182 from NelleV/pls_refactor_2
DOC+ENH: fixes in least_angle + one vectorization
DOC : better doc of array shapes in fastica
MISC : use linalg from scipy
ENH : removing warnings from tests in cd linear models
Merge pull request #2194 from NicolasTr/as_float_array_copy
Merge pull request #2223 from arjoly/doc-datasets
DOC : docstring fixes
DOC : more docstring fixes
use pre_fit in OMP
API : deprecate a lot of extra parameters in OMP object
API : deprecations in orthogonal_mp
ENH : update example of OMP
update what's new + classes.rst
Merge pull request #2247 from pgervais/docfixes
Merge pull request #2258 from NicolasTr/ignore_pycharm_files
Alexandre Passos (87):
Adding random projections SVD to scikits.learn.pca as an option
Adding the power iteration parameter to fast_svd (to make it better in high-rank very-big very-sparse matrices according to the Martinsson et al survey
Merging the rng changes
The derivation of the variational algorithm for the DP mixture of gaussians
Beginning the code; so far only doing the E step
First draft of the code; untested
The dp is already fitting properly
Fixing indentation bug
Changing the DP derivation to rst---equations don't work
Fixed the math
Removing useless whitespace between methods
Reorganizing the directory structure
Adding variational inference for a finite gaussian mixture model
I'm returning precision, not covariance matrices. Make that clear
Editing the documentation
Making it clear that the covariances don't work
Merge branch 'master' into variational-infinite-gmm
Fixing small bug
Adding example; adding explicit lower bound computation; optionally monitoring convergence; full and tied work, somehow spherical and diag diverge.
Using a smaller example to speed things up
Simplifying the code a bit
Fixing last bugs in the bound and updates; improving docs
Merge branch 'master' of https://github.com/scikit-learn/scikit-learn into variational-infinite-gmm
Fix docstring find&replace issue; restoring VBGMM
Adding reference in the derivation
pep8 dpgmm.py
Fixing test failures in mixture
Fixing pyflakes warnings
Adding complexity note to the documentation
Replacing DP by dirichlet process
Don't use np.linalg
Explaining what is dpgmm
Adding see also sections to the mixture models
Fix the 'give' in plot-dpmm
Editing a single example for the GMM and DPGMM explaining the difference
Making the documentation findable
Editing the documentation substantially
Adding doc to VBGMM
Adding usage note to dp-derivation
Adding some test coverage. For some odd reason some tests fail on 'make test' but pass on 'nosetests scikits/learn/tests/test_mixture.py'. Any idea why?
Merge branch 'master' of https://github.com/scikit-learn/scikit-learn into variational-infinite-gmm
Fixing the docs
Changing the image url in the doc
Even seeding the RNG in setup_func doesn't make the tests consistent
There was a bug in the setup, now things are working deterministically
Deleting stray print statement
Adding an rng parameter to the GMM classes
Fixing the imports
Inlining the helper norms
Beginning to vectorize the code
more vectorizing
Finish removing quadratic dependence on n_states; update docs
Adding norm to scikits.learn.base, using that
Putting norm in utils
Vectorizing parts of the VBGMM, which I had skipped due to it being a lot less useful than DPGMM
Incorporating some caching and vectorizing to improve performance as per line profiles
Fixing typo bug
Caching another computation
Small typo bug in _bound_z
a no-op that fixes tests
Change monitor to verbose, better output
Fixing typo-bug in the full covar update. There are still a couple of nondeterministic bugs to be taken care of
Making test_sample stop failing for no reason
Removing the square from norm() and creating helper sqnorm() in dpgmm
Prevent setting the covariance parameters
Caching the computation of the constant part on _bound_pxgivenz
Caching part of the bound for diag that was missing
moving some parameters from fit to __init__.
Merge branch 'variational-infinite-gmm' of https://github.com/GaelVaroquaux/scikit-learn into gael-variational
Fixing the names in the hmm test
Merging gael's branch
Merge branch 'variational-infinite-gmm' of https://github.com/GaelVaroquaux/scikit-learn into variational-infinite-gmm
Renaming bound_pxgivenz
Renaming covar to prec
Finishing the renamings
Adding a squiggly curve example for the mixture models
Improving the coverage of dpgmm
Testing lognormalize
Splitting test_mixture
Preventing underflow in wishart_logz
Fixing 0* problem in z log z
Fixing another underflow bug in digamma. Now the bound for spherical covariance never diverges as a cluster gets empty
Also, no warnings when running these tests
Fixing test failures resulting from the merge
Fixing some under and overflows; this doesn't fix all test errors yet
Removing some more underflows, still not all
dpgmm: setting the weights to something reasonable
Alexis Metaireau (3):
fix a typo in neighbors docs
fix restructured text problems in the developers doc
Merge branch 'master' of https://github.com/scikit-learn/scikit-learn
Alexis Mignon (30):
Added positive constraints for the elastic net
Made the code pep friendly
Added fit_intercept for sparse ElasticNet as well as corresponding test
Corrected bad comment and the use of a typedef
Made code pep compliant
DOUBLE doesn't stand for a dtype
Added utility functions for csc sparse matrices
Modified: uses utility function for sparse csc matrices
Modified data generation so it can generate data adapted to positiveness constraints
Removed most python function calls
Removed duplicate definition of csc_mean_variance_axis0
Made the code pep8 compliant
Corrected docstring: CSR -> CSC
Regenerated with Cython
Corrected missing import of csc_mean_variance_axis0
Made code pep compliant
Modified: in 'center_data' makes a copy only when needed
Made code pep8 compliant
Unified access to 'mean_variance_axis0' for CSC and CSR matrices
Removed unneeded functions
Added warm restart option and completed docstring
Completed docstrings, factorized some tests and added checks on dimensions
Added test case for warm_start
Added size check on coef_init
Made code pep8 friendly. Used random state with fixed seed.
Made code pep8 friendly.
Modified chi2 kernel approximation such that it deals with zero elements
kernel approximation: simplified management of non-zero elements
For the sake of clarity, creates new temporary arrays instead of copying the same one several times. Modified error message for negative valued arrays.
pep8 compatibility
Amit Aides (9):
Fix to sparse SVC with kernel='poly'
Added Multinomial Naive Bayes classifier
Fix to the documentation of the Multinomial Naive Bayes.
Pep 8 compliance and cleanup for the multinomial naive bayes
Merge remote branch 'upstream/master'
Some more pep8
Merge branch 'master' of git://github.com/scikit-learn/scikit-learn
Merge remote branch 'upstream/master'
naive bayes name change MNNB->MultinomialNB
Andreas Mueller (1368):
Remove copy and paste errors from nearest neighbors example
Fixed issue 82: bug in init of Kmeans.
Minor documentation: how passing a callable for init works.
Changed default initialization method to "k-means++" for consistency with k_means
k-means clustering test: changed data points to be far away from zero. Now
transpose data on input and sources on output.
Adjusted examples to new ICA interface
Merge branch 'master' of https://github.com/scikit-learn/scikit-learn
typo
I don't really understand this, but it makes the error go away.
Added warning to fastica
pep8
fixed bug
pep8
typo
mention LDA in docstring
print docstring in examples
typo
pep8 and starting with X in right shape
rst fix
Notes on Fortran-ordering in fastica
test for vectorizer_inverse_transform
non-regression test for warm-start intercept shape using a binary dataset
letting intercept_init be of shape (), reshape to (1,) for consistency
added hopefully more intelligible error messages.
pep8
pep8
typo, pep8 and line continuations
test for new error strings
slight beautification (in my opinion)
don't test on error message, just on raise
pep8
DOCS: Image is aligned to the right...
DOC Added documentation for important attributes of GridSearchCV
specify dict type
DOCS: Typo in url
ENH: Adds more verbosity to grid_search. verbose > 2 gives scores while running grid.
Merge pull request #414 from amueller/grid_search_verbosity
DOC: Document "cache size" argument of SVR
COSMIT: remove unused error string.
COSMIT: remove unused error string.
Merge branch 'master' of https://github.com/scikit-learn/scikit-learn
ENH: removed kernel cache from fit method of DenseLibSVM, added to __init__ of BaseLibSVM
Added kernel cache argument to init of all SVC and SVR classes. For the moment the conservative 100MB default.
BUG cache_size instead of cache as parameter name
BUG: cache_size also for sparse SVMs
ENH: SVM cache_size default value changed to 200 mb
ENH Sparse SVM: removed cache_size parameter from fit method. Is now part of constructor.
DOC fixed doctests for cache_size parameter
DOC slight reformatting of kernel cache note in module docs.
BUG: minor mistake in earlier commit.
DOC: forgot doctests in python files.
DOC: another doctest.
ENH: in Scaler, warn if fit or transform called with integer data.
Merge pull request #425 from amueller/svm_cache_size
ENH parameter "epsilon" in SVR error messages is given correct name.
Merge branch 'master' of https://github.com/scikit-learn/scikit-learn
DOC Made reference to "Getting started" in "Datasets" section a link.
DOC: inline example for precomputed svm kernel
ENH in preprocessing.Scaler, raise warning also if given unsigned int
DOC/Website: Changed link on "support" page to scikit-learn.org, added 0.9 release doc link
DOC fixed whitespace in GridSearchCV doc string so that html doc is generated correctly.
COSMIT removed unused import, pep8
COSMIT pep8 in cluster module, removed unused import
COSMIT pep8 whitespace
COSMIT removed emacs modeline
COSMIT: pep8 whitespaces instead of aligned decimal points
COSMIT indentation
COSMIT ugly line break for pep8
COSMIT reindented for pep8
COSMIT pep8 whitespaces
Merge pull request #447 from amueller/pep8
ENH: in sgd classifier, check that parameter alpha is greater than zero
COSMIT some pep8
some pyflakes
COSMIT more pep8
COSMIT more pyflakes
COSMIT: more pep8. enough for today...
ENH: fastica returns whitening matrix "None" when whitening=False
TEST non-regression test for issue 238, FastICA failing with whiten="False"
COSMIT pep8
COSMIT pyflakes
COSMIT: pep8
COSMIT pep8 in backported sparsetools...
DOC Added Gael's explanations about the memory usage in grid_search / joblib
DOC: Auto example digit classification plot without interpolation and axis.
FIX: typo in with statement
Example for random dataset function.
Random dataset example: make figure look nice on the web
DOC: Added random dataset plot to doc.
COSMIT: random dataset plot prettified
DOC Added comment about equivalence of nu-SVM and C-SVM to the docs
Examples: Replaced NuSVM by rbf SVM in example. RBF-SVMs are really important, NuSVMs not so much imho.
pep8. whoops..
COSMIT: pep8
FIX: Return "None" first.
Example for finding the hyperparameters in a RBF SVM
Examples: Make SVM parameter estimation look good on the web.
DOC: Fixed legend in iris svm example
DOC Nonlinear SVM example changed to satisfy my sense of aesthetics. Hope you like it.
Merge branch 'master' of https://github.com/scikit-learn/scikit-learn
COSMIT pep8
Example illustrating parameters of an RBF SVM
COSMIT removed unused import math in utils/extmath.py
FIX: make kmeans test not raise warning when init is passed.
FIX: make kmeans test not raise warning when init is passed.
DOC Description of the basic dataset API
DOC: Corrections and additions to the dataset docs. Also more detailed docstrings
DOC test fix. Set printoptions to get rid of epsilon.
FIX: whitespace after ..
DOC test fix finally....
DOC fixed fastica docstring: if whiten=False K=None
ENH linnerud dataset interface adjusted to be consistent with the others
FIX: typo in diabetes docs
DOC RST field lists don't behave as I want them to:(
COSMIT datasets doc using rst tables
FIX This should fix the doctests in the datasets dir. They take quite long, I think it's because of the svmlight loaders. So I didn't include them in the standard make target
COSMIT rst formatting
DOC: Added missing rst label
FIX RST references
Merge branch 'master' of https://github.com/scikit-learn/scikit-learn
FIX rst errors in docs
FIX doc rst references
DOC Added link for Satrajit Gosh, removed dead link for Robert Layton since I couldn't find his website.
Merge branch 'master' of https://github.com/scikit-learn/scikit-learn
DOC Robert Layton again.
ENH prettify kmeans vs minibatch kmeans example
ENH adjust subplots to look good on the web.
FIX minor typo
FIX errors in doc
FIX minor docfixes
added kernel approximation using monte carlo approximation of fourier transform
ENH pipeline compatible interface to fit, transform and fit_transform
DOC example comparing linear classification, kernel svm and kernel approximation with explicit mapping.
kernel approximation example
DOC beautified kernel approximation example plot
better docs, remove unimplemented kernel approximations
COSMIT pep8
DOC added kernel_approximation module to docs
DOC: placeholder entry in user guide
ENH: renamed D to n_components for consistency
DOC approximate kernel functions narrative docs
DOC: more narrative documentation for kernel_approximation
DOC: references for approximate feature maps
COSMIT: pep8 in kernel approximation test
DOC approximate kernels: added formula for skewed chi squared kernel
COSMIT removed commented out import
ENH: additive chi squared kernel implemented and tested
pep8
DOC: added AdditiveChi2Sampler to doc modules
ENH: Default value for n in AdditiveChi2Sampler
DOC narrative doc for additive chi squared kernel
ENH: sensible defaults for RBFSampler and SkewedChi2Sampler
ENH Added AdditiveChi2Sampler to feature_extraction __init__
BUG: AdditiveChi2Sampler fit method should return self
ENH: in Chi2Samplers, check if input inside inside desired range.
FIX: Renaming of RBFSampler argument
DOC: Move kernel approximation to be a "plot" example.
Don't test as strictly so not to fail randomly..
Example of decision surface of approximate kernel svm
Moving kernel_approximation to the top level
ENH: Restructuring User Guide: kernel_approximation, preprocessing and feature_extraction are under a common chapter, "
DOC: finetuning the narrative docs for kernel_approximation
DOC: kernel_approx make examples show correctly
DOC rst
ENH Addressing some of Gael's comments, mainly naming and docstrings
ENH better testing
ENH fixed location of the legend in kernel_approximation example
DOC more discussion in docstring
ENH timing results in approx kernel example
ENH kernel approximation: More specific references and example referencing the narrative docs.
FIX: use safe_sparse_dot in kernel_approx transform
DOC minor doc improvements, different example
NONSENSE improve the example that i'll remove in a sec
BUG import ...
COSMIT + SPELL
DOC added reference to the user guide in kernel_approximation module
FIX path in plot
FIX typo that cost me half a day of sprinting...
ENH Remove redundant example
FIX fix module links, figure split into two
COSMIT pep8
FIX: Kernel approximation module in references in alphabetical order.
DOC trying to clarify the kernel_approx documentation.
DOC FIX typo
FIX docstring errors...
Merge branch 'master' of https://github.com/scikit-learn/scikit-learn
FIX: missing import
FIX: fixed link to Virgile in whats_new
Merge pull request #486 from jakevdp/util-docs
Merge pull request #490 from mblondel/news20_loader
Merge pull request #488 from mblondel/sparse-kmeans
FIX: Added DBSCAN to references
FIX: typo in docs
Merge pull request #417 from larsmans/multilabel
COSMIT minor ticks
FIX getting rid of some more sphinx problems
FIX: WHAT A LOAD OF CRAP!
COSMIT fixing indents in balltree
Merge pull request #510 from amueller/aaarrrgghhh
Merge branch 'master' of https://github.com/scikit-learn/scikit-learn
COSMIT "Examples" instead of "Example"
COSMIT Addressing @agramfort's comments about whitespace and a minor fix in pipeline.
FIX: Section Returns not Return
COSMIT: Class docs don't have a 'Methods' section. It is autogenerated.
COSMIT Examples not Example
COSMIT make 'References' bold and minor other fixes.
COSMIT underscore fixes
COSMIT "Optional Parameters" Section removed
COSMIT pep8
FIX developers rst malformed
remove unused link
COSMIT remove unused malformed tag
FIX indentation and string literals
FIX backticks for members_, spaces around colon (not cologne)
COSMIT minor docstring stuff
COSMIT remove Methods section
FIX: rename complexity section into notes section
FIX docstring variable names
FIX rename "Details" into "Notes"
FIX remove infinite recursion
COSMIT: Make references link and show up correctly, parameters of __init__ documented in Class, not in function.
COSMIT make formulas show up correctly, use reference formatting for references
COSMIT make references use reference formatting
COSMIT format references and dict stuff...
COSMIT Indentation of formulas
FIX removed duplicate explicit link for Vlad
FIX: RST indentation and blank lines
FIX RST and references
FIX minor rst
FIX workarounds for docutils bug
FIX whitespace where rst demands it...
FIX workaround for table problem
FIX two more underscores
Merge branch 'master' of https://github.com/scikit-learn/scikit-learn into doc_underscores_for_real
COSMIT docs hmm
FIX: don't use latex in rst
FIX + COSMIT rst warnings
COSMIT docs
FIX: fix again errors in NMF after merge
FIX: Document properties in a way that the docstring actually shows
FIX: rst errors in ball_tree
FIX: Notes instead of Note in preprocessing init
FIX remove handles for references as they are not used anywhere and raise warnings if doubled.
Merge pull request #513 from amueller/doc_underscores_for_real
COSMIT docs underscore fixes (again)
COSMIT fixing doc errors and making html docs pretty
COSMIT Minor beautifications and RST error fixes
FIX doctest errors + cosmit
Merge branch 'master' of github.com:scikit-learn/scikit-learn
DOC use SVC in grid search instead of SVR. Iris is a classification dataset as pointed out by @agramfort here:
DOC score returns accuracy, not error
FIX for doctest I just broke :-/
DOC uncommenting doctests in balltree.pyx, adding doctest: +SKIP
COSMIT a little less skips...
ENH Add underscores to estimated attributes in GridSearchCV and deprecation warnings.
ENH renamed best_estimator and best_score in examples and tests.
COSMIT typo
ENH in GaussianNB, let estimated parameters have underscores.
DOC Reworking Bayesian regression documentation
DOC mentioning sparsity of ARD, reblocking text
COSMIT typo, thanks @vmichel for pointing it out.
DOC added reference for sparsity of ARD
COSMIT pep8
DOC fix linking to load_sample_images and load_sample_image in docs
DOC underscores in DeprecationWarnings... shame on me for forgetting that....
DOCs workaround for docutils bug (column alignment problem)
DOC external references go under "references" not "see also". "See also" can only handle internal references
ENH liblinear: cythonized sign switch for n_class<=2
ENH liblinear: get rid of n_class sign by switching class signs in liblinear implementation.
COSMIT typo
whatsnew: gave myself some credit
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Merge branch 'master' into svm_coef_sign
FIX adjust _set_coef_ and _set_intercept_ to sign switch
ENH DenseBaseLibSVM.coef_ correct. test simplified.
DOC try to document layout of dual_coef_ in multiclass libsvm
DOC fixed errors in load_images doc and SKIP'ed load_image doctest as was already the case for load_images
DOC: OCD and added image loader to class reference
DOC Trying to enhance the tree/forest docs. Headlines in tree, added reference, hopefully better description of 'min_density'.
DOC layout of dual_coef_ in 1vs1 svm in user guide, example
DOC fixed indices in dual_coef_ example
COSMIT factor out 1vs1 coef construction in libSVM, PEP8
DOC added RidgeClassifier to References
DOC fixes in Multiclass docs. Didn't show correctly on web.
DOC multi-class narrative: added links to the references, made citation clickable
ENH trees in random forests save the indices of the training data used in bootstrap sample
ENH Add function to predict on left part of training set
ENH use self.classes_, check input on predict_oob, add test
DOC Out of bag error estimates in grid_search module
COSMIT @glouppe says this is more pythonic :)
DOC reformulation out of bag error
COSMIT in doc: @ogrisel's remarks
ENH oob score as attribute, not separate function.
ENH: added oob_score_ and oob_prediction_ to regression ensembles
FIX copy/paste error. guess it was too late
ENH made oob_score an ``__init__`` param as suggested by @agramfort
DOC what's new, minor doc improvements
Merge pull request #571 from amueller/tree_indices
ENH: Replace asserts by appropriate errors. Fixes the rest of issue #570.
COSMIT how I love these sphinx errors
DOC complicated objects as parameters confuse sphinx and the reader. Fixes issue #567.
ENH: Default in Vectorizer "None" as @ogrisel suggested
DOC website: added link to 0.10 docs under support.
DOC added required versions of python, numpy and scipy to install documentation. Closes issue #579
COSMIT pep8
COSMIT removed unused imports
Merge branch 'master' into svm_coef_sign
DOC comment in linear.cpp
DOC @ogrisel's suggestion: putting a link to pull request in liblinear.cpp
COSMIT pep8
DOC fixed doc errors in metrics module
COSMIT removed unused imports
Merge pull request #546 from amueller/svm_coef_sign
FIX RandomizedLogisticRegression test import
COSMIT removed unused import
DOC fix sphinx errors
DOC more fixes in Docs
DOC cluster metrics: fixed see also sections, errors in references section.
COSMIT pep8
FIX SGD loss example for new hinge loss.
FIX lasso_dense_vs_sparse_data.py example needed update.
COSMIT pep8
DOC add cross_val_score to references, OCD.
FIX bug in text feature extraction, issue #606
COSMIT pep8
DOC fix sphinx errors
ENH: moved class_weight parameter in svms from fit to ``__init__``.
MISC Adjusted class_weight param in examples, fixed legend in unbalanced dataset examples.
DOC typos.
MISC reinserted class_weight as fit parameter, added deprecation warning.
MISC cleanup
DOC margin for old warning wrapper fixed
MISC Deprecated class weights in SGDClassifier
Merge pull request #578 from jakevdp/old-version-warning
pep8
COSMIT pep8
COSMIT get rid of warning in nosetests for equidistant neighbors. it's intentional.
MISC more sensible NMF test.
COSMIT pep8 wooops thanks @ogrisel
MISC forest tests: boston faster, probability test faster and no warning.
MISC decision tree test faster and no warning
COSMIT simplified error message checking, remove deprecation warning.
MISC more iterations for test_lasso_path. Still runs in <.1s, gives no warning and more accuracy.
MISC more iterations also for test_enet_path, same runtime as before, no warning.
COSMIT pep8
COSMIT pep8
FIX added missing import
MISC added warning to coordinate descent if alpha=0, don't call cd with alpha=0 in tests.
MISC replaced deprecated mean_square_error in test.
MISC test for warnings as @ogrisel suggested.
Merge pull request #620 from amueller/coordinate_decent_alpha_warning
add min_leaf (minimum size of leaf node) to decision tree
ENH min_leaf for ExtraTree
ENH added test for "min_leaf"
ENH set min_split if min_leaf is set.
DOC add load_svmlight_file to references
DOC minor fixes and typos
DOC more rst fixes....
DOC typo in whatsnew
Merge branch 'master' into svm_class_weights
DOC renamed duplicate label
FIX flip sign in decision function of LibSVM in binary case.
MISC renamed min_split and min_leaf to min_samples_split, min_samples_leaf, added them to the ensemble classifiers and documented them....
FIX OneClassSVM decision function sign.
ENH more elaborate one class svm testing....
MISC address @mblondel's comments
MISC simplified test
FIX one class test, added more decision function tests.
COSMIT pep8 + "leafs" typo.
DOC Added changes to decision functions and coef_ to whatsnew
MISC don't use deprecated mean_square_error
Merge branch 'master' into svm_class_weights
Merge pull request #610 from amueller/svm_class_weights
COSMIT pep8
FIX whooops sorry
DOC Insert hidden toctree, mv "included" files from rst to txt
MISC Issue #639. Remove unused member types in linear_model CVs
DOCs change extension from txt to inc, add inc as doctest extension to makefile
MISC verbosity parameter for forests: better control over tree building.
Merge pull request #641 from amueller/doc_fixes
FIX dataset docs: changed suffixes in include to match rename.
DOC fixed inconsistent titles. sphinx didn't like them and didn't show these sections.
MISC @ogrisel's comment about human-parsable counting
Merge pull request #643 from amueller/forest_logging
DOC C is pretty large now...
MISC class_weights constructor parameter in RidgeCV
DOC doc fixes
MISC added removal version for scikits.learn deprecation warning.
MISC remove ball_tree and cross_val namespaces
MISC scikits.learn removal at .12. I'm not so good at counting, sorry.
Merge pull request #660 from amueller/remove_namespaces
COSMIT renaming scikits.learn to sklearn in some places
COSMIT pep8
MISC Update all the other deprecation warnings that I forgot.
FIX: class_weight only in classifier Ridge classes
DOC Documentation for RidgeClassifierCV
DOC add removed docstring.
COSMIT pep8
ENH Added tests and fixes
DOC remove "for dense data" heading for SVM classes
Merge branch 'master' into linear_model_class_weights
DOC document classification plot
MISC removed deprecated api from examples
WEBSITE: make example gallery look even better!
DOC added reference to r2 score
ENH rename parameter "multi_class" of LinearSVC to "crammer_singer", add docs, add tests
FIX forgot doctest
DOC minor addition to SVM kernel parameters
DOC more readable make_friedman docs....
Merge pull request #649 from daien/GridSearchCV_precomputed_kernel
COSMIT don't use deprecated names
ENH new samples generators for classification and clustering. Refactored label propagation example a bit
ENH cluster comparison example (starting)
Merge pull request #669 from amueller/example_gallery_css
ENH added "shadow" parameter class_weight_ as @ogrisel suggested.
MISC changed parameter name back but changed semantics, as @mblondel suggested.
COSMIT pep8
DOC added one more sentence about crammer-singer
COSMIT typo. thanks @ogrisel.
DOC crammer_singer docstring by @ogrisel
ENH clustering example with spectral clustering and ward with connectivity. looking better now, still not perfect.
FIX broke label_propagation example, now fixed it again.
Merge branch 'master' into sample_datasets
Merge pull request #673 from amueller/crammer_singer_rename
DOC add new dataset generators to class reference
WEBSITE: another css enhancement to give figures a max width.
DOC move references from Notes to References section in docstrings
MISC simplified kpca example with new dataset generator, another minor fix in generator
DOC lasso/enet regression example with coefficient plots, corrected r2 score
DOC Basic docstrings for LDA and QDA classes
DOC lda/qda examples: remove redundant example, prettified other.
DOC Added QDA to references, narrative docs, improved docstrings
COSMIT newline in LDA doc
DOC explanation for plot in lda/qda narrative
MISC use Gaels pretty plot, add dbscan, normalize data...
COSMIT cleanup, pep8
ENH issue #661, plus some renaming and minor cleanup
MISC forbid mle initialization of PCA for n_samples < n_features
DOC added clustering example to the docs
COSMIT make plot look more like other coef plots
COSMIT removed debugging print
MISC added xlim and ylim for @ogrisel's weird matplotlib ;)
ENH fixed seed, added center positions
Merge pull request #674 from amueller/sample_datasets
FIX minor doc fixes
DOC add link to narrative in lda and qda references
DOC add ``estimate_bandwidth`` utility for MeanShift to the references and narrative
MISC make Ward check if input is sparse.
MISC make Ward test if connectivity is a valid connectivity matrix.
COSMIT changed error message for Ward
DOC another coefficient plot
COSMIT Adjust title for example gallery
ENH 2d plot for l1l2 digits example
COSMIT last try to make my plot pretty....
BUG fixed error that I introduced earlier: connectivity can also be `None`
DOC fixed reference to an example (that I also broke before)
Cosmit typo
FIX plot example fix for old matplotlib, so that it shows on the website.
Merge branch 'master' of github.com:amueller/scikit-learn
COSMIT make cross_validation nosetest slightly more readable and more pep8 respecting
FIX make class weight nosetests work
FIX get rid of some doctest errors (with the stricter nosetester)
ENH refactoring of dot-file export
COSMIT comments
COSMIT minor visual enhancement
ENH: don't fail on "yeast" dataset
Merge pull request #711 from davidmarek/sparse_pca
DOC Added clustering functions to references.
Merge pull request #685 from ibayer/master
ENH local variable in ``fit`` instead of modifying the estimator parameters. thanks @GaelVaroquaux
DOC: Added EllipticEnvelop to the References
DOC added reference for EllipticEnvelop and fixed some sphinx errors.
FIXed nosetests. Thanks @pprett
Merge pull request #707 from amueller/graphviz_dot_refactoring
Merge pull request #648 from amueller/linear_model_class_weights
COSMIT Typo
COSMIT pep8
DOC sphinx/rst errors
DOC Believe it or not - this fixes the annoying sphinx error. And don't dare to
COSMIT minor fixes to docs
COSMIT fixed references to covariance.EllipticEnvelop in docs
COSMIT pep8
DOC correct links to face recognition example, take care of trailing underscores.
COSMIT pep8
ENH grid_search forgets estimators
DOC slightly better docs for ``refit``, document ``best_params``.
FIX clone base_clf before setting params.
FIX messed up something in the short cut method.
ENH pre_dispatch for foresters
FIX redundant code is redundant
COSMIT add todo comment to grep
Merge pull request #770 from amueller/oblivious_grid_search
ENH normalized_mutual_information
Revert "Merge pull request #773 from amueller/forest_pre_dispatch"
COSMIT don't use deprecated attributes in tutorial.
COSMIT pep8
FIX don't use parameters to fit in GMMHMM.
FIX don't use Python 2.5 method of checking for warnings
MISC Don't warn on equidistant on iris. iris has duplicate datapoints.
FIX don't use fit parameters in grid_search test
ENH convert X to float in k_means predict.
MISC don't use private ``set_params`` method as that raises a warning.
MISC don't use iris in testing as it has duplicate data entries. Add some noise to simple examples.
MISC added note that we need better tests
DOC typo
ENH check if backport of sparse scipy ARPACK is needed. The backport breaks with scipy 0.11
Added mutual_info_score to the references
DOC narrative docs for normalized_mutual_info_score
DOC make formulas for clustering metrics more pleasing to the eye
ENH fix if entropy is zero in normalized_mutual_info_score
COSMIT cleanup + pep8 in examples
MISC extended example, fixed doc build warning
DOC made it more explicit that AMI is better than NMI
COSMIT + MISC pep8, pyflakes, typos and some other cleanup of examples.
DOC typos (thanks @ogrisel) and some elaboration in docstring.
Merge pull request #800 from amueller/less_neighbors_warnings
FIXed pca example that I broke when "cleaning up"
ENH checked for scipy version
ENH add ``decision_function`` to ``Pipeline``
ENH joined tests for less duplication, checked shapes as @ogrisel suggested.
FIX we need to do "LooseVersion" to support dev/git versions of scipy
COSMIT pep8
COSMIT make test more explicit
COSMIT removed unused "verbose" option in dbscan
COSMIT removed unused import in test
FIX copy/paste error
FIX removed verbose also from main DBSCAN class
DOC added reference to Hila's thesis, added comment about equivalence.
ENH replaced v_measure_score computation with nmi computation.
DOC removed NMI from example plot as it is the same as V-measure
COSMIT dbscan test doesn't use fit params
DOC comment on normalized mutual information
ENH simplified entropy calculation
Revert "DOC removed NMI from example plot as it is the same as V-measure"
Revert "Revert "DOC removed NMI from example plot as it is the same as V-measure""
Revert "ENH replaced v_measure_score computation with nmi computation."
COSMIT typos by `git grep independant`
DOC corrected relation of V-measure to normalized mutual information.
MISC removed unused lines, see #666.
COSMIT rst in example
ENH adjusted examples to new matplotlib 1.1.1
MISC don't use ``set_cmap``
MISC use logsumexp in DPGMM for less warnings
FIX typos in examples
FIX one more example
MISC trying to remove scale_C
MISC forgot two
DOC docs and examples have scale_C removed
FIXed many tests
DOC some doc corrections
ENH remove duplicate definition of "assert_lower" in tests
FIX ditto (numbers are too random)
ENH backport "assert_less" and "assert_greater", rename "assert_lower" and use it everywhere :)
ENH rename out_dim to n_components in manifold module
FIX assert_greater message
DOC Added pipeline user guide
ENH use random states everywhere, never call np.random.
FIX don't do anything in the __init__
WEB Added page with links to various tutorials/presentations on scikit-learn
DOC added some explanation to video page
ENH added random_state to Gaussian Process
FIX testing: random state problem in forest testing.
DOC minor fixes to rst and image paths
DOC banner 14 duplication?
DOC more minor fixes
DOC fix last docstring error. Don't remove redundant docstring. I dare you, I double dare you mother******!
RELEASE 0.11
COSMIT typo in whatsnew
RELEASE HEAD is now 0.12-git
COSMIT pep8
MISC don't use fit parameters in example
ENH rename unmixing_matrix_ to components_ in FastICA
DOC document 'labels' argument of confusion_matrix
DOC fix see also in gmm
FIX made "unmixing_matrix_" a property as @larsmans suggested.
COSMIT pep8
ENH rename 'k' in KMeans and MiniBatchKMeans
ENH renamed 'k' to n_clusters in SpectralClustering
ENH rename k in clustering examples and doctests to n_clusters
ENH fixed ``n_cluster`` to ``n_clusters`` in examples. Thanks @agramfort
ENH check whether "k" was used in fit, not init, as GaelVaroquaux suggested.
Merge pull request #874 from temporaer/master
Merge pull request #858 from amueller/fastica_components_rename
COSMIT pep8
FIX typo in example. My bad.
FIX renamed what was `components_` to `sources_`
COSMIT rst error
COSMIT fixing doc building errors.
COSMIT typo
Merge pull request #776 from amueller/normalized_mutual_information
Merge pull request #868 from larsmans/liblinear-1.91
ENH "fit_pairwise" for spectral clustering.
ENH Starting on affinity propagation
DOC typo
DOC Improving docstring for SpectralClustering
ENH fixed affinity propagation test. Need more tests.
ENH fit_pairwise, transform_pairwise for KernelPCA
ENH base svm has fit_pairwise and predict_pairwise.
ENH fit_transform_pairwise for KernelPCA
ENH isomap uses new interface.
COSMIT get rid of debugging output
ENH GridSearchCV uses the new API
COSMIT forgot one print...
DOC Deprecation warning with removal version 0.13.
ENH going for a universal property ``_pairwise`` instead of many functions.
ENH Cleanup
FIX Fixing rebasing problems...
COSMIT avoid errors in tests.
ENH slight improvement to mds speed, modified examples to not run mds that long.
ENH added old confusion_matrix implementation as alternative for few labels.
Merge pull request #887 from danohuiginn/master
BUG fixing bug in entropy that I introduced, adding regression test.
FIX faces_decomposition example. That this broke only now is a sign of deep magic, better left unexplored.
Merge pull request #888 from jaquesgrobler/master
DOC removed irrelephant/confusion reference, added pointer to source (as there is no other possible reference).
DOC user guide pdf building. Kicked out a formula that rendered neither in html nor latex. Please don't hit me.
Merge pull request #889 from vene/generate-multitarget
Merge pull request #875 from AlexandreAbraham/ward_coo_bug
COSMIT pep8
MISC raise more helpful error message in GaussianProcess if optimization fails.
MISC added bigger "tiny" in lars_path. least_squares is float32.
MISC reduce code duplication, fix "self.gamma" modification
MISC A bit more cleaning up in BaseLibSVM
DOC added "fetch_mldata" to references.
CLEANUP remove linear_model.sparse.setup.py
COSMIT pep8
DOC rename lambda to alpha in plot_lasso_model_selection. Closes #903.
TESTING check that SVC checks the shape of precomputed kernels.
ENH Check that X is non_zero for MultinomialNB.
ENH fixed doctests, addressed comments.
DOC improve kmeans init doc.
Merge pull request #894 from amueller/svm_sparse_dense
FIX more doctests that I broke.
DOC comment in whats_new on changed behavior of ``gamma`` in SVM
Merge pull request #914 from alexis-mignon/master
Merge branch 'master' into fit_pairwise
MISC callable kernel gridsearch fix...
ENH factorize common tests.
ENH don't list abstract base classes
ENH make base classes abstract meta classes
ENH make all Estimators default constructible (except SparseCoder)
ENH Add MetaEstimatorMixin, make RFE default constructible
ENH make GMMs and LLE cloneable.
COSMIT get rid of warnings (can't get rid of deprecation warnings only :-/)
ENH make BaseLabelPropagation abstract base class, make OutlierDetectionMixin not inherit from ClassifierMixin
BUG fix testing for abstract classes
ENH default score func for univariate feature selection: f_classif
Make sparse svm base class ABC
FIX better class selection, more strict testing.
ENH more tests
MISC raise NotImplementedError instead of value error in decision_function of sparse SVM
ENH do zero mean, unit variance on iris, don't test naive Bayes (for the moment)
ENH change defaults on SGD (works on digits and iris and I just guessed them).
ENH avoid division by zero in LDA, also avoid reusing variable names.
MISC don't test SVM for the moment, rest works :)
ENH make LinearModel and LinearModelCV abstract base classes
ENH test regressors
MISC shuffle iris for SGD based methods
Revert "ENH change defaults on SGD (works on digits and iris and I just guessed them)."
ENH Fix seed that makes SGDClassifier work.
ENH create BaseRidge base class
ENH test more shapes, test non-consecutive classes, test accuracy on test set
FIX minor rebasing and other problems
MISC cleanup common testing
Merge pull request #893 from amueller/common_test
FIX for filtering of meta estimators in python2.6
ENH better input validation for prediction in SVC, LinearSVC.
DOC Also added some notes on my recent merge with tests and stuff to the whatsnew.
MISC fixed random seeds in LLE tests.
Merge branch 'master' of https://github.com/scikit-learn/scikit-learn
COSMIT pep8
COSMIT pep8
ENH in OvR, use constant predictor if one class always present or never.
MISC address Gael's and Lars's comments, make ECOC tests deterministic.
FIX trying to fix long-standing linker issue
COSMIT pep8
trying out some testing stuff
ENH put atlas checking in one place and load from there.
DOC typo / wrong parameter in lle docs
Improve test-coverage ;)
COSMIT some RST fixes for the docs
Remove empty statement
DOC doctest failed on my box because I had higher precision...
COSMIT typos in covertype benchmark
Merge branch 'master' of https://github.com/scikit-learn/scikit-learn
Merge pull request #886 from amueller/multiclass_always_present
COSMIT, removed scikits.learn things, removed orphan file.
ENH trying to catch that damn thing.
ENH better error messages in multiclass as @mblondel suggested.
Merge pull request #1 from cournape/linking_arrayfuncs
ENH corrected error messages for always present labels. ugh
FIX doctests for changed dtype
ENH fixed warning for output code
FIXed another doctest.
ENH add verbose warning about too few trees for oob. Should we catch the division by zero warning for classification?
DOC made the pls example plots so much prettier
Merge branch 'master' into fit_pairwise
Fixed merge problem
ENH Removed stupid ``_pairwise`` property in BaseEstimator.
MISC minor cleanup in spectral clustering
FIX/TST test and fix grid search with kernel pca and precomputed kernel in pipeline.
COSMIT comments not docstrings in tests
Merge branch 'master' into fit_pairwise
TST precision issue on my windows box :-/
ENH slight cleanup in LDA, QDA, support for arbitrary class labels.
ENH use LabelEncoder
COSMIT typo in pairwise docs
DOC added LabelEncoder to the References.
Merge pull request #1001 from serch/master
Merge pull request #1008 from mrjbq7/doc-fixes
COSMIT pep8
ENH just a little more input validation testing
DOC added default value of shrink_threshold to NearestCentroid docstring.
DOC added ``lowercase`` to CountVectorizer docstring.
FIX feature selection dies on non-csr sparse matrices (that are unsubscribable). Regression test should go in common testing.
DOC added class_weight to LogisticRegression docstring
ENH auc_score and average_precision_score. Closes issue #158.
ENH added to ``__init__.py`` and references.
DOC explained RFE default behavior in docstring.
MISC Added unconfigured windows box to mailmap. Sorry about that.
DOC add parameters to TfidfTransformer docstring
ENH slight cleanup in LDA, QDA, support for arbitrary class labels.
ENH use LabelEncoder
FIX Removed code-duplication introduced in rebase.
FIX Fixed variable names. Thanks @mblondel
DOC Added wikipedia references to docstrings
Merge pull request #1013 from amueller/auc_score
DOC Updated whatsnew
ENH sparse matrix support in univariate feature selection
TST Simplified tests, test that sparse and dense versions give the same result, always return arrays, not matrices.
DOC Polished some docstrings
ENH Added copy keyword to safe_sqr, added to dev docs.
COSMIT Fixed commata
ENH Addressed @mblondel's comments.
ENH simplify as @mblondel suggests
ENH sparse matrix support for RFE and RFECV. Closes issue #1018.
DOC updated whats_new
ENH going back to not using LabelEncoder.
Merge branch 'qda_lda_1000' of github.com:amueller/scikit-learn into qda_lda_1000
Merge pull request #1000 from amueller/qda_lda_1000
typo in linear_model doc
ENH add verbosity parameter to cross_validation_score
MISC catch warnings in covariance tests
Typo in last commit :-/ sry
ENH catch expected warning in ward clustering
ENH renamed ``min_n`` and ``max_n`` parameters in CountVectorizer to enable gridsearch over them together.
ENH renamed parameter bounds_n to ngram_range, fixed doctests and tests.
ENH addresses @ogrisel's comments
ENH fix merge with char_wb_ngram
ENH check that classifier decision_function and predict_proba validate shape of input.
Merge pull request #1046 from TimSC/master
COSMIT pep8
ENH rename parameter ``p`` of AffinityPropagation to ``preference``, slightly change the meaning of scalar parameter. Scaling the medium seems more intuitive than giving absolute values.
DOC fixed renaming of ngram_range in feature_extraction narrative
TST check that transformers fail gracefully on sparse input
ENH affinity propagation now has an ``affinity`` parameter, instead of a ``precomputed`` parameter, to support other affinities in the future.
ENH renamed ``gaussian`` affinity to ``rbf`` in spectral clustering for consistency.
COSMIT renamed n_points to n_samples everywhere, fixed shape docstring that @mblondel pointed out.
FIX Worst feature in RFECV missing. closes issue #681.
ENH renamed ``neq_sqr_euclidean`` to ``euclidean`` so it is easier to parse
ENH Convert input into float in GMM
ENH add test, revert affinity propagation to previous parametrization (was a bit over-eager there)
TST added tests for different spectral clustering affinities
Merge branch 'fit_pairwise'
MISC add verbose keyword to AffinityPropagation
FIX fixed horrible bug in spectral clustering!!!!
ENH updated whatsnew for bugfix, removed warning box, tightened test.
TST classifier behavior with only one class present
ENH also test MultinomialNB
ENH some cleanup in grid_search.
Merge pull request #1068 from amueller/grid_search_cleanup
ENH add test for consistent predict_proba shape also in the two-class case.
tst add check for isotropic data in spectral clustering
FIX try to be a bit nicer to arpack - any one with a different setting care to try to make a more stable test?
FIX doctest corrected (hopefully this is deterministic) + cosmit
FIX removed isotropic spectral clustering test because of arpack problems.
FIX use backport of np.unique
FIX forgot some uniques
DOC fix minor sphinx errors and stuff
enh: try to get decision function to work in two class case
ENH make QDA and LDA decision functions adhere to standard shape [n_samples,] in two class case.
Fixed tests for RidgeClassifier
DOC updated whatsnew, moved @pprett's api fix into the api section.
ENH addressed @agramfort's comment, also removed the special case from testing as @mblondel fixed it :)
ENH added min_df keyword to CountVectorizer, default=2
ENH more robust testing for int
ENH more robust testing if parameter is int or float, as suggested by @larsmans in #1066.
FIX typo
COSMIT Typo. Englais svp. Closes #1090.
COSMIT trying to fix doc issues
DOC added min_df change to whatsnew, made more estimator names clickable.
ENH rudimentary testing of transformer objects
MISC added comment to explain SelectKBest k in common tests
COSMIT copy+paste error
ENH test that regressors can handle integer data.
ENH add ClassifierMixin with ``fit_predict`` and some tests.
COSMIT remove commented out score
DOC CountVectorizerDocstring readability
DOC Added section on issue tracker tags to development docs
ENH raise ValueError in r2_score when given only a single sample.
ENH support custom kernels on sparse matrices
ENH added low-level bail out in sparse svm
MISC use assert instead of value error.
FIX add exception, check exception, if sparse.SVC is called with kernel='precomputed'
ENH fix error by removing unnecessary test.
DOC added some comments to the sparse precomputed kernel tests.
DOC updated whatsnew with ProbabilisticPCA fix by @kuantkid
Merge pull request #1109 from buma/predict_proba_doc
FIX affinity propagation typo
DOC fixed some sphinx errors, issues in docs....
COSMIT pep8
DOC fixed reference in whatsnew
DOC added some more API changes to whatsnew
FIX removed sparse_encode_parallel
COSMIT pep8
COSMIT typo, thanks @ogrisel
DOC add people and commits to whatsnew
MISC starting 0.13 cycle
ENH more robust transformer testing.... don't ask why that came up
ENH address issue 1028: clone estimator in RFE
ENH issue 1115: grid search support for rfe via ``estimator_params``
ENH fixed bug in sparse RFECV, fixed bug in RFECV init (step was always 1), added decision_function and predict_proba to RFE and RFECV
MISC rfe outputs loss, not score
FIX typo
add y to tfidf vectorizer
WEBSITE updated logo, changed scikits-learn to scikit-learn.
ENH remove some deprecated parameters / functions /estimators
FIX remove test for deprecated parameter.
Example: added a pretty PCA 3D plot of iris, as this dataset is used in so many examples.
ENH minor example beautification
DOC fixed default value of ``compute_importance`` in DecisionTreeClassifier docstring.
DOC typo in ElasticNet docstring
DOC add isotonic regression to References (even if we move it soon), also OCD.
FIX error in error message ^^ closes #1155.
ENH fix percentile tiebreaking, add warning
DOC document attributes scores_ and pvalues_ in feature selection docstrings, some superficial cleanup.
DOC somewhat improved feature selection example
ENH in NMF only use svd initialization by default if n_components < n_features.
FIX fixed typo in code, added smoke test.
COSMIT remove unused imports
DOC added Conrad Lee's PR to whatsnew
COSMIT pep8
FIX unicode support in count vectorizer. Closes #1098.
FIX docstring for count vectorizer. Sorry about that.
COSMIT remove unused import
ENH add MinMaxScaler, #1111
ENH do normalization in single pass over data
DOC added missing docstrings
ENH rename Scaler to StandardScaler everywhere
COSMIT pep8
DOC remove sparse support from docstring as there is none. Also cosmit on docstrings.
ENH add FeatureStacker estimator
ENH add feature stacker example
COSMIT + DOC more docstrings, minor improvements
ENH implement get_feature_names
TST added tests, fix feature names.
ENH add parallel fit and transform with joblib.
ENH add transformer weights
TST add test for feature weights in feature stacker
DOC move example (there is nothing to plot) and add some text
MISC renaming FeatureStacker to FeatureUnion, adding docs
DOC added FeatureUnion to whatsnew.
ENH remove deprecated sparse SVM class from cross-validation test.
COSMIT pep8
FIX bug in pipeline.inverse_transform, improve coverage.
ENH support for string labels in Neighbors classifiers
ENH rename ``_classes`` to ``classes_``, fix outlier labeling, remove unnecessary mapping to indices.
COSMIT reuse variable name
ENH added non-regression test
COSMIT removed unused import
FIX np.unique doesn't have return_inverse keyword, use backport from utils.
ENH slightly better error message for robust covariance
enh even better error message
ENH make multi-class more robust in discovering scoring functions
ENH in all_estimators, skip testing modules. They have dummies.
TST improve test-coverage in base, remove unreachable code-path
COSMIT pep8
DOC added whatsnew entry for mutual info fix and faster confusion matrix.
ENH rename k to n_folds and n_bootstraps to n_iterations
DOC cleanup some docstrings (not scipy standard yet)
ENH set n_fold default to 3, rename k to n_fold in all doctests, docs, and examples
COSMIT rename n_iterations to n_iter in cross_validation
MISC renamed n_iterations to n_iter in all other places.
DOC added changes / renames to whatsnew
ENH rewrite K-Means m-step as loop over samples, cythonize.
ENH separate sparse and dense case, cythonize further.
ENH fix int type in kmeans
ENH fix kmeans for old numpy (bincount minlength)
FIX also the other function in kmeans. whoops
FIX bincount mess I made in kmeans.
ENH rename rho to l1_ratio in ElasticNet and friends
ENH rename rho in SGD
ENH address @agramfort's comments, fix some doctests
DOC add changes to whatsnew.
ENH simplify as suggested by @larsmans.
FIX for len(result) > minlength
DOC tried to clarify meaning of l1_ratio in whatsnew
ENH remove some unreachable code from gridsearch
ENH sparse matrix support in randomized logistic regression
FIX doctests for max_iter
FIX two more docstrings. Sorry.
FIX seed liblinear using srand. Fixes issue #919.
ENH add random seed to logistic regression
ENH don't use deprecated interface in PPCA & cosmit
REL put myself as contact / maintainer, fixed url
FIX rebase mishap
DOC small example / doctest for kernel approximation.
DOC typo in whatsnew
DOC more typos in whatsnew.
ENH use the numbers module introduced in Python 2.6 to check for number types.
ENH added OneHotEncoder
DOC minor fixes / typos. Thanks @larsmans.
ENH user-specified dtype, defaults to np.float, nicer numpy stuff :)
TST skip test in common_tests, reach 100% coverage on new code.
DOC more typos omg. comment about automatically inferring maximum values.
ENH better example.
enh masking out zero features
TST fixed doctests, added more tests. Still 100% line coverage :)
ENH removed ``remove_zeros`` parameter.
DOC more extensive classifier comparison on synthetic datasets
ENH more noise, cross-validated parameters.
ENH train/test split, plot accuracy, make plot pretty.
ENH simplify circles dataset generator, make classes balanced.
FIX typo in dataset generation
ENH I'm more happy with the last example now....
FIX adjust gamma in kernelPCA tests to fit slightly modified circles with balanced classes.
FIX HMM test failures
ENH used asarray to avoid copy
COSMIT pep8
enh: add code analysis target to makefile
FIX small bug in feature selection error message.
COSMIT do less deprecated things.
FIX revert useless change.
DOC warn about parallel k-means on OS X.
ENH minor improvements in testing, new utility function for safely setting random states.
FIX cross_val_score now honors ``_pairwise``
DOC added my last PR (cross_val_score fix) to whatsnew
WEB color fix for link headlines
DOC document callable kernels in SVM docstring.
DOC add user guide for MinMaxScaler
COSMIT in mean shift docs
FIX hotfix for NMF sparsity problem
FIX dirty fix for expected mutual info in cython.
ENH added OneHotEncoder
DOC minor fixes / typos. Thanks @larsmans.
ENH user-specified dtype, defaults to np.float, nicer numpy stuff :)
TST skip test in common_tests, reach 100% coverage on new code.
DOC more typos omg. comment about automatically inferring maximum values.
ENH better example.
enh masking out zero features
TST fixed doctests, added more tests. Still 100% line coverage :)
ENH removed ``remove_zeros`` parameter.
Merge branch 'larsmans_pr' into one_hot_encoder
COSMIT pep8
DOC corrected whatsnew.rst. Thanks @ogrisel.
ENH check in all classifiers in fit and predict that input is finite. inspired by #1027.
ENH add checks for clustering, regressors, transformers
FIX revert old behavior, all tests work :)
MISC address Gael's comments
DOC added comment about default for n_nonzero_coefs.
COSMIT pep8
ENH added check for non-negative input.
Merge pull request #1279 from amueller/one_hot_encoder
FIX don't use pl.subplots.
ENH adding "apply" to random forests
ENH add RandomHashingForest estimator.
ENH added docs, example and tests.
DOC Some narrative documentation for Random Forest Hashing.
FIX for sparse matrix in RandomForestHasher
ENH refactor inheritance structure.
ENH use random regression task to avoid memory overhead of n_sample classes.
ENH Added Example
DOC added references
MISC renamed RandomForestHasher to RandomForestEmbedding
MISC don't use pl.subplots, fix typo
MISC rename plot_random_forest_hasher to plot_random_forest_embedding
ENH fix plot in docs. thanks @pprett.
DOC forgotten rename
DOC fixed links in whatsnew.
DOC added dump_svmlight_file to the references
DOC improve MinMaxScaler narrative docs.
DOC added new precision_recall_curve to whatsnew
DOC fix some layout on the "presentations" page, add Jake's recent PyData NYC tutorial.
MISC rename RandomForestEmbedding to RandomTreesEmbedding
COSMIT don't do deprecated things in test (hmm)
COSMIT pep8, removing unused imports and recommend ``toarray`` instead of ``todense``
ENH make sparse svm test more robust, catch warning on deprecated class
ENH use blobs instead of iris in the common classifier tests. Iris has duplicate data points, which raise annoying neighbors warnings.
ENH slight cleanup in common tests, less warnings.
ENH Check what ``__init__`` does in test_common
FIX messed up memorizing gmms parameter in GMMHMM before.
DOC added comment to test.
DOC explain what the test is doing.
ENH add chi2 and exponentiated chi2 kernel.
FIX add generated c file
DOC add chi2_kernel and exponential_chi2_kernel references.
TST added a test for chi2 and exponential chi2 kernel.
FIX input validation, test chi2 in pairwise function, add reference.
ENH fused types for chi2_kernel
ENH renamed chi2 to additive_chi2 and exponential_chi2 to chi2, as usually the exponential version is meant with "chi2"
DOC updated whatsnew
DOC cleared up difference to AdditiveChi2Sampler, added some "see also"s
DOC added stuff about chi2 kernel to narrative docs
FIX typo bug, more tests. Still more tests coming right up!
DOC added "precomputed" variant to docs.
TST 100% line coverage
ENH explicit check for zero denominator
ENH address @ogrisel's comments.
ENH addressed @kuantkid's comments. Also add myself to pairwise.py authors.
FIX import assert_greater from testing module
FIX csr conversion in amg code in spectral embedding
Merge pull request #1428 from tnunes/feature_union_fit_transform
ENH cleanup tests, lower tolerance
COSMIT pep8
FIX and test deprecated import of spectral_embedding from cluster
TST better test-coverage in clustering module
COSMIT in cross-validation tests
FIX random state in test by @briancheung. Thanks
TST better coverage in dict learning and cross validation
TST better coverage in preprocessing module
DOC add matplotlib version requirement, rephrase
COSMIT Mean Shift docs.
Merge pull request #1441 from kuantkid/fix_spectral_test
COSMIT some fixes in whatsnew rst
ENH Nystroem kernel approximation
ENH renamed class NystromKernelApproximation to Nystrom (it is in the kernel_approximation module). Also improvements to example docstring
DOC docstrings for Nystroem.
ENH cosmit, gamma defaults to None, not 0. address some of @mblondel's comments.
ENH tests for Nystrom, check that n_components is smaller than n_samples.
DOC narrative doc for Nystroem.
DOC updated whatsnew with Nystroem.
ENH don't import * in utils __init__.py
TST better coverage for GridSearchCV, test unsupervised algorithm.
TST better test-coverage for image patch extraction.
TST better coverage in kernel_approximation
ENH input validation only in ``fit`` in LassoLarsIC, check that error is raised.
TST document and test verbosity parameter of lars_path
TST some more tests for SGDClassifier input validation
ENH / TST better coverage of supervised clustering metrics, slight cleanup
DOC make unit test requirements a bit stricter. 80% is sub-par with current code-base
COSMIT pep8
COSMIT renaming chunk_size to batch_size in MiniBatchDictionaryLearning and MiniBatchSparsePCA
DOC add rename to whatsnew
cosmit pep8
FIX GridSearchCV on lists that I broke in 8b3e4d06c05ac82130176161404f0434b74fe2c7
ENH added test, started on cross_val_score
ENH allow lists in check_arrays
ENH make cross_val_score work, some refactoring in GridSearchCV
ENH consistency: stuff is not an array if it doesn't have ``shape``.
TST GridSearchCV raises ValueError when precomputed kernels are not matrices.
ENH Simplify estimator type checking in GridSearchCV.
FIX don't use assert_allclose. It is not supported in numpy 1.3
COSMIT pep8
COSMIT featuers -> features typo
COSMIT PEP8
COSMIT pep8
DOC add version when setting parameters in fit will be removed to docstring
FIX typo / bug in test_common that ignored the first init parameter.
TST make test more stable.
ENH slight improvement of common tests.
DOC slight cosmit in metrics docstrings.
FIX i should trust my past self a bit more
ENH use an array instead of a dict in RFECV
Cosmit pep8
TST a little more coverage in unsupervised metrics.
ENH clean up redundant code in pairwise
ENH more test coverage in pairwise distances
FIX more robust test for silhouette score
DOC classifier comparison: plot data without decision boundary first, better (imho) color scheme.
DOC add Nystroem kernel approximation to the references
FIX stupid mistake
COSMIT pep8
COSMIT Typo
COSMIT update warning, pep8
ENH refactoring class weights for SVM and SGD
TST all classifiers now have "classes_". adjust test_common.
ENH remove class_weight_label from python side of SVM
ENH remove class_weight_label from sparse svm
TST move test of "classes_" to the appropriate test in "test_common".
FIX remaining doctests
DOC docstring for compute_class_weight
ENH remove class_weight_label from LibLinear python side.
ENH removed unused old function
TST fix import in test
ENH addressed @ogrisel's comments.
DOC changed docstring to be more clear.
ENH documented changes for SVC classes_ changes.
ENH move utility function into dedicated file, not __init__.py
TST start on testing consistent class weights
ENH nu-SVC doesn't support class_weights
FIX liblinear class weight in binary case, robust testing.
cosmit whitespace
DOC add comment in liblinear
TST better test for class weights (that actually tests something)
ENH test automatic setting of class weights in common test
TST skip RidgeClassifier in class weight test for the moment
DOC added fix to whatsnew.
FIX don't test auto in ridge classifier as it is not supported currently
FIX tests for auto class weights
DOC more concrete whatsnew
FIX skip tests for naive bayes for the moment.
DOC made myself contact for authors, changed my website to blog.
TST add cosine_kernel to kernel tests, pep8
ENH lazy import of metrics in base, not preprocessing in metrics.
ENH document attributes in QDA and LDA, rename to adhere to sklearn conventions.
DOC fix shape of coef_ for LDA.
TST somewhat hacky fix for tests on image loading.
ENH more logical behavior, better docstring, tests
FIX do checks even if allow_lists
DOC try to be as clear as possible.
ENH cleanup in check_pairwise_arrays, raise error on sparse input in chi2_kernel and manhattan_distance
COSMIT doc formatting
DOC updated whatsnew
ENH added class_weight to Naive Bayes docs.
FIX random seed in FastICA testing.
DOC fix docstring of GMM
ENH rename proximity to dissimilarity
ENH common test that set_params returns self.
COSMIT remove empty file
DOC more accurate comment in class weight computation
FIX make sure laplacian in spectral clustering test is really PSD
DOC add recall_score to new classification metrics listing
DOC document gamma in chi2_kernel.
TST add common test to check if estimators overwrite their init params
ENH use only a few samples in test.
FIX in tree and ensemble: don't overwrite random state in fit.
FIX don't overwrite random_state in fit in EllipticEnvelope
FIX don't modify random_state in clustering algorithms.
ENH make code more clear: MiniBatchKMeans only uses random_state in first run of partial_fit.
FIX in ward: don't overwrite n_components.
FIX remaining parameter issues in GradientBoosting
TST took the safety off the tests ;)
Merge pull request #1582 from ApproximateIdentity/doc-n_jobs-parallel
DOC some sphinx fixes
DOC fix in mds example (new interface)
DOC mds example: suppress warning for explicit initialization
DOC don't use deprecated parameter rho in the lasso / enet examples
COSMIT typos in hierarchical clustering warning
DOC more sphinx fixes
EXAMPLE don't use deprecated interface in lasso model selection
COSMIT pep8 in examples
COSMIT pep8
DOC more sphinx fixes
FIX sort indices in CSR matrix for SVM
TST add regression tests for Alex' fix.
ENH rename cosine_kernel to cosine_similarity. Also make the test actually do something.
DOC fixed problem in citations in spectral_embedding
COSMIT typos
ENH don't use deprecated class_prior fit parameter for NB in test
ENH in spectral_embedding: do input validation before anything else
TST in testing deprecated load_filenames catch deprecation warning
TST catch expected warning in sparse coordinate descent test.
DOC cosmit fix column span alignment errors.
FIX example uses old parameter name
COMPatibility more careful deprecation of mode and k in SpectralClustering
COMP more careful deprecation of seed in SGDClassifier
COMP add deprecated property rho to ElasticNet
COMP keep seed as init parameter of Perceptron, only deprecate
COMP add deprecated ``labels_`` property to LinearSVC
FIX deprecated properties in ElasticNet
COMP in SVC rename self.label_ to self._label (it is redundant now but I don't want to refactor the rest of the day) and add a deprecated property label_, that points to classes_.
FIX in Perceptron and doctest
FIX in common tests: don't test init parameters that are deprecated. They might be changed.
FIX some doctests for SGD
COSMIT typo thanks @jaquesgrobler
ENH don't return deprecated parameters by get_params.
FIX typo in spectral clustering deprecation
TST catch deprecation warning when testing SVC label_ attribute, also test new classes_ attribute.
DOC reorganized whatsnew a bit, put new estimators on top.
DOC added user guide links to all estimators on the whatsnew page
DOC some more fixes for whatsnew
EXAMPLES add header to hash_vs_dict_vectorizer.py - otherwise it won't show in the html docs.
COSMIT pep8
ENH undo renaming of class_prior to class_weight in naive bayes
Merge pull request #1529 from vene/lgamma_port
DOC some more minor fixes to syntax / links
DOC fix indentation typo
DOC added commit counts for 0.13 to whatsnew, added website for Rob Zinkov aka zxtx
COSMIT pep8
DOC updated commit counts.
REL change version to 0.14-git everywhere, update news, support page.
website: fix for survey bar
COSMIT remove unused imports, pep8
TST some more tests for multi output lars
DOC fix typo in LinearSVC error message
FIX make error message work when return_path=False. Btw I feel that getting "references" for numbers out of numpy arrays is pretty ugly.
TST fix random states in all dict learning tests, make test independent of test sequence.
Revert "trying travis cfg with system-site-packages"
COSMIT pep8
DOC add return values of cross_val_score and train_test_split to docstrings.
ENH added test, started on cross_val_score
ENH adding SomeScore objects for better (?!) grid search interface.
ENH refactor, taking @GaelVaroquaux's and @ogrisel's suggestions into account
ENH deprecated ``score_func``, introduced ``score`` parameter in GridSearchCV
TST test giving score as string in GridSearchCV
FIX rename ``score`` to ``scoring`` because of the name-clash with the ``score`` function.
FIX two score objects, adjust tests to new interface
ENH remove old interface completely from tests.
DOC fix docstring
ENH working on cross_val_score, trying to simplify unsupervised treatment.
ENH better testing of old and new interface. Still a bit to do for unsupervised grid search, though.
FIX usage of scores for unsupervised algorithms.
ENH use new api in permutation_test_score, don't use old api in testing.
ENH fbeta score working, more tests
DOC-string for AsScorer
ENH renamed ap and auc, added RecallScorer
DOC narrative docs for scoring functions. Put them next to GridSearchCV. Should they go into metrics?
ENH update example, minor fix.
DOC improve cross validation and grid search docstring
FIX rename error
DOC add whatsnew entry
DOC fixed formatting in user guide
FIX example
DOC added a new template to sphinx to view the "__call__" function.
COSMIT address @ogrisel's comment.
FIX rename ZeroOneScorer to AccuracyScorer
DOCFIX for zero_one_score / accuracy_score renaming
DOC add narrative about score func objects to the model_evaluation docs.
ENH rename scorer objects to lowercase as they are instances, not classes
DOC minor fixes in pairwise docs.
ENH/DOC add "score_objects" function for documenting the score object dict.
DOC add metrics.score_objects to the references
DOC use table from score_functions docstring in model_evaluation narrative.
DOC move scoring function narrative above dummy estimators, fix tables, some refinement.
DOC minor fixes in score_objects documentation.
DOC better table of score functions in grid-search docs.
ENH GridSearchCV and cross_val_score check whether the returned score is actually a number, not an array (otherwise cross_val_score returns bogus).
TST improve coverage of permutation test scores
TST slightly better test coverage in cross_val_score
COSMIT built-in typo
DOC some improvements as suggested by @ogrisel
TST add test for pickling custom scorer objects
DOC more improvements by @ogrisel
COSMIT rename AsScorer to Scorer
MISC moved score_objects.py to scorer.py, added module level doc string and license note.
DOC add kwargs in Scorer to docstring.
ENH add ``__repr__`` to Scorer
DOC addressed @ogrisel's comments.
COSMIT text reflow
MISC pep8: rename scorers to SCORERS, remove score_objects getter
DOC remove duplicate table, add references to appropriate user guide section to docstrings of cross_val_score, GridSearchCV and permutation_test_score
DOC add note on deprecation of score_func to whatsnew
FIX imports for Scorer and SCORERS
DOC fixes in whatsnew, typo
TST smoke test repr
COSMIT removed unused imports, fixed error message in test of boosting
ENH break ties in OvO using scores
TST test for breaking OVO ties
COSMIT pep8
ENH get rid of imports in test_common by checking by names, not classes.
ENH fix test_estimators_overwrite_params to also test regressors and transformers. Then fix all the regressors and transformers ... meh!
ENH set the random state to avoid heisenfailures
COSMIT pep8, removing unused imports
FIX remove dtype from covertype, add fetch_covtype to init, add missing docstrings.
FIX doctest kernelpca
ENH get rid of most imports in test_common
TST stronger tests for arbitrary classes. make explicit what works and what doesn't.
FIX rebasing trouble in common tests: the meaning of dont_test changed
FIX don't compare strings with "is". that is really not robust!
ENH in transformer pickle test, only test transformers that provide a 'transform' method. and only test that.
ENH in common tests, use long variable names for all tests
FIX remove all unseeded random variables from common tests.
Merge pull request #1695 from mrjbq7/issue-1694
COSMIT pep8: blank line contains whitespace
DOC added sentence about oob_decision_function_ containing NaN to docstring. Still need some narrative about oob score.
DOC add 0.13.1 changelog to whats_new.rst
DOC add random_state parameter to docs of LogisticRegression and LinearSVC
TST/FIX set random_state in logistic regression tests
TST/FIX always use "almost equal" for floats.
FIX MinMaxScaler bug.
TST FIX random state for LibLinear sparse tests
ENH add randomized hyperparameter optimization
DOC fixed links in whatsnew
Merge pull request #1736 from jamestwebber/patch-1
Merge pull request #1740 from tjanez/move_roc_curve_test
COSMIT pep8
DOC FIX links on grid search narrative
FIX compute_class_weight edge case
DOC some sphinx / rst fixes
MISC minor fixes in examples
DOC FIX column span alignment problem in NMF ^^
COSMIT typo
DOC fixing some more rst / sphinx errors :-/
DOC more sphinx stuff.
Merge pull request #1767 from rmcgibbo/balltree_docstring
DOC add roll your own estimator docs
FIX for iid weighting in grid-search
DOC FIX finite precision
COSMIT pep8
DOC correct / simplify dbscan example
COSMIT typo. the French again ;)
FIX setting k in KMeans and MiniBatchKMeans was silently ignored. Left over in 07c56d7cd2ddfe71e7a4399d74fc367d6000d854 Damn, that was nasty :-/
COSMIT pep8
FIX jenkins error on numpy 1.3.0
DOC documented n_init parameter of MiniBatchKMeans. Closes #1900.
FIX broken scorer, add non-regression test.
FIX WARN about **params being not used in GridSearchCV.fit. Closes #1815.
FIX bug in callable kernel decision function - Sorry, I think that was me.
FIX test error in test common for KernelPCA that doesn't respect its n_components.
FIX typo in test for RidgeCV
DOC typo in RandomizedSearchCV docstring
DOC fetch_20newsgroups returns the text, not text files. see SO question: http://stackoverflow.com/questions/16615523/using-scikits-kmeans-to-cluster-ones-own-documents
DOC Fixed documentation of kernel parameters: sigm uses gamma, but not degree. Closes #1972.
DOC clarification in Scoring objects: Its not a good sign if I don't understand my own wording.
DOC much more readable formula in chi2 kernel doc
COSMIT sphinx fixes
COSMIT pep8
DOC FIX typo on fbeta, closes #2219
fix whitespace around new tree.pyx docstring
use new virtualenv features of travis, so we don't have to kill the virtualenv
FIX hopefully fixing travis.
FIX hopefully fixing travis.
DOC improve svm sample weight example
DOC improve documentation of sample_weight, add to docstring.
TST small improvement of test for sample weight in svm
cosmit typo
Show 95% confidence interval, not 40% confidence ^^
FIX whoops sorry!
fix pycharm file ending
ENH add "make_y_1d" to utils, use it in estimators where needed.
fix: make ``make_y_1d`` safe for lists.
use column_or_1d, move it to utils
ENH rename eval / pseudolikelihood to score_samples
fixing ridge and label binarizer... I'm pretty sure that worked before?
FIX make neighbors y prediction shape consistent
TST add regression test for label_binarizer
FIX/ENH make StandardScaler convert int input to float and warn about it, instead of warning and rounding for dense and crashing for sparse.
DOC adjust docstring as suggested by @gvaroquaux
addressing @ogrisel's comments: catch warnings in test, no unneeded digits
COSMIT fixing some unused imports, adding stuff to __all__, and light pep8 (not all whitespace to make rebasing less painful)
DOC fixing some sphinx stuff.
more sphinx fixes
first try at bootstrap-based website
"fix" sidebar stuff - this was not my idea
remove gray boxes around h3 on the two new pages
put banner into header, make it spread over whole page
Fix link to flowchart, add text descriptions.
Minor fixes in front-page text, css
rework front-page box texts
fix typo, missing p
fix and refine some css and html tags
add example banner image
add section, estimator and model links on the frontpage
fix styling of rst links
add links for examples
fix css that I just broke with the sphinx links
flatten the tutorial / doc structure as proposed by @ogrisel
add js for collapsible toc tree in the user guide.
minor typo thing
don't have old version warning on install, as that will be shared across all versions.
added "show source" link to footer, made dimensionality reduction examples link to decomposition
slightly hackish way of inserting a whatsnew link. I really don't want all the sphinx containers here, though. Asked on stackoverflow about it btw.
a little less ugly footer. @glouppe should maybe have a look ;)
make links to old versions actually do something (currently link to the user guide as the other versions are not rebuild yet).
replaced lorem ipsum in news. still a draft but whatever.
nicer dates
Try to raise and test warnings.
DOC added website to whatsnew, added link to github for Nelle
FIX don't use old API in examples
more fixes for docs, deprecated interfaces
FIX made the building of the docs slightly more robust. readme files in folders without examples kill it otherwise.
try to fix the toctree in a semi-meaningful way.
DOC/EXAMPLES fix more documentation errors, deprecated api usages.
EXAMPLES remove non-existing example from doc, don't trigger deprecated interface in enet_path, lasso_path
much better input validation, test that warning is raised on (n_samples, 1) y
rearrange permutation_score parameters to match previous ones.
DOC add link to fetch_covertype to covertype narrative docs
Andrew Winterman (18):
implemented predict_proba for OneVsRestClassifier
forgot an except clause
removed unnecessary repeat
corrected doc for predict_proba, also caught a few errors.
wrote test_ovr_predic_proba method
divided test for predict_proba into two functions
removed check for predict_proba method.
[pep 257](http://www.python.org/dev/peps/pep-0257/) and other doc improvements.
corrected bad test in test_multiclass
Flake8 Corrections made
spell checked
Spelling is checked, passes Flake8 without errors.
added backtick around self.classes_ in multiclass.py
changed n_folds > min_labels error to warning
removed tests for the old error.
added test for warning. Added warning category
removed a carriage return in warning message
added space between # and text
Anne-Laure Fouque (3):
ENH added R^2 coeff which is now used as score function in RegressorMixin
renamed explained_variance_score to r2_score in linear_model
adding r2_score : fixed typos and doctest
Anze (5):
P3K: Fixed imports.
P3K: Cannot compare list to tuple.
Replaced use of deprecated method.
P3K: Changed StringIO to BytesIO to fix a failing test.
P3K: Fix build for py3k + pip.
ApproximateIdentity (4):
Changed a minus sign to a plus sign in the documentation of n_jobs in some files.
Changed minus sign to plus.
Added n_jobs to multiclass.py
Revisions due to previous pull request.
Ariel Rokem (1):
Added description of input parameters in svm.SVC docstring
Arnaud Joly (267):
ENH add random-seed args
Call DecisionTreeRegressor instead of Tree
COSMIT Remove duplicated assignment
Use the check_input argument
DOC : add description of check_input args
DOC explain parameter estimators_
DOC explain parameter estimators_ (2)
ENH Move parameter checking to fit
COSMIT
FIX casting bug
ENH preserve contiguous property
COSMIT
DOC describe reasons for reshape
PEP8
FIX: perform transition from tree to DecisionTreeRegressor
FIX feature importance computation + Enable smoke test of feature importances
Update whats new
ENH add author
COSMIT use sklearn.utils.testing
ENH Let the user decide the number of random projections
Clean random_dot features
Clean random_dot features (2)
Clean random_dot features (3)
Clean random_dot features (3)
ENH let the user decide density between 0 and 1
COSMIT
ENH Strengthens the input checking
ENH Add gaussian projection + refactor sparse random matrix to reuse code
ENH add more tests with wrong input
ENH add warning when user ask n_components > n_features
DOC: correct doc
ENH add more tests
Update doctests
ENH cosmit naming consistency
FIX renaming bug
COSMIT
WIP: add benchmark for random_projection module
ENH finish benchmark
Typo
ENH optim sparse bernoulli matrix
FIX example import (name changed)
FIX: argument passing selection of sparse/dense matrix
ENH assert_raise_message check for substring existence instead of equality
ENH add two test to check proper transformation matrix
PEP 8 + PEP257
DOC improve dev doc on reservoir sampling
COSMIT + ENH better handle dense bernoulli random matrix
FIX: make test_commons succeed with random_projection
DOC removed irrelevant paragraph(s)
ENH add implementation choice for sample_int
ENH add various sampling without replacement algorithm
Typo
TST: Add tests for every sampling algorithm + DOC: improved doc
DOC: fix mistake in the doc + ADD benchmarking script
ENH Rename sample_int to sample_without_replacement
DOC + ENH: minor add in doc + set correct default
FIX: broken import
FIX typo mistakes + ENH change default behavior to speed the bench with Gaussian random projection
ENH Add allclose to sklearn.testing
ENH improve naming consistency
PEP8
COSMIT
DOC + typo
DOC set narrative doc for random projection
FIX: broken test due to typo correction
DOC minor improvements
DOC mainly switch from .\n:: to ::
FIX typo mistakes
DOC improve name in example
DOC Separate the jl example from references
ENH Add jl lemma figure to random_projection.rst
COSMIT (typo, doc, simplify code)
pep8
Typo
DOC typo in narrative doc
DOC fix typo in filename
DOC clarification
ENH flatten random_projection module + add sklearn.utils.random
ENH refactor matrix generation BaseRandomProjectiona and subclass
DOC improve layout (url)
Make the JL / RP example use the digits dataset by default
FIX broken import
pep257 + COSMIT: naming consistency
COSMIT
COSMIT
Remove unused line
DOC improve doc for jl lemma function
typo
ENH Rename Bernoulli random projection to sparse random projection
ENH Rename Bernoulli random projection to sparse random projection
DOC add see also
pep8
COSMIT make everything use the common interface
DOC improve + fix mistakes + TST added
ENH Simplify assert_raise_message + TST add them
DOC add utilities to the doc
DOC + FIX density to Ping et al. recommendation
ENH make jl lemma work even with non numpy array
DOC add default values
ENH Add support for multioutput metrics
DOC add narrative doc for regression metrics
Update what's new
TST check that ValueError is raised when the number of output differ
ENH add mean absolute error
DOC cosmit alphabet order of classification metric in ref
DOC typo
ENH add multioutput support for dummy estimator
DOC instance attributes + TST: do not record warning
DOC typo
ENH preserve output ndim
COSMIT reorganized functions in the module
DOC add narrative overall description of classification metrics
DOC add hinge loss narrative doc
DOC Set reference links in the doc
DOC add narrative doc on zero_one loss metric
DOC add narrative doc on zero_one_score
DOC add narrative doc for precision, recall and fbeta measures
DOC add narrative doc on roc curve
DOC add narrative doc on auc and average precision
DOC add narrative doc on matthews_corrcoef
DOC add narrative doc for explained variance
DOC add reference to multioutput metrics in regression
DOC add link to clustering metrics
Update what's new
ENH renamed metrics.zero_one to metrics.zero_one_loss
ENH rename zero_loss_score to accuracy_score
ENH ClassifierMixin use a metrics from sklearn.metrics
DOC add classification_report to the narrative doc
DOC typo and mistakes
DOC comment from @amueller + several minor improvements
TST + DOC add many examples on sklearn.metrics
DOC typo + minor improvements
DOC remove redundant comment
DOC better example with dummy estimator + link to appropriate reference
ENH use deprecated decorator
FIX DOC missing default behavior change
DOC COSMIT pretty math
DOC clarification of api change
FIX catch deprecation warning
COSMIT (don't change anything see sparse_random_matrix)
Typo
FIX add doctest ellipsis
FIX doctests dtype
Typo
ENH multilabel metrics: accuracy, Hamming, 0-1 loss
DOC FIX floating point issue
FIX numpy 1.3 issues with multilabel metrics
ENH add normalize option to accuracy_score + FIX bug with 1d array
DOC return_path argument, prettier references
ENH more pythonic way to treat list of list of labels
ENH add jaccard similarity score metrics
FIX compatibility issue with np 1.3 py 2.6
ENH add multilabel support to PRF metric family
ENH remove pos_label argument with multilabel binary indicator format
ENH remove warnings at testing time
FIX unique_labels in corner case
FIX issue with comparable but different dtype
ENH don't allow mix of input multilabel format
ENH simpler check for mix of string and number input
COSMIT better name
Typo
ENH use type_of_target within unique_labels
ENH improve documentation with allowed label types
ENH check that we don't mix number and strings
Flatten label type checking
TST add smoke test for all supported format
COSMIT
PY3K use six.string_type
OPTIM + ENH simplify mix string and number check
FIX bug with indicator format
ENH use a comprehension over imap
@arjoly and @glouppe thanks their funding FNRS and DYSCO
ENH remove _is_1d and _check_1d_array thanks to @GaelVaroquaux
flake8
ENH raise ValueError with row vector if multilabel or multioutput is not supported
ENH being less permissive thanks to @jnothman
DOC add example is_multilabel
ENH handle properly row vector
Flake8
ENH better error message
FIX switch to the new format syntax
ENH prettier error message for _binary_clf_curve with bad input shape
ENH use ravel instead of atleast_1d and squeeze whenever possible
ENH coherently input checking for regression metrics
ENH dryer thanks to @jnothman
TST stronger test for _column_or_1d function
FIX ^ is a symmetric difference
MAINT Set random_state, modernize tests
TST max_features for more tree estimators
TST remove unused tests
ENH add missing pxd of utils.random
ENH Use file configuration
FIX signature
TST error message for _check_clf_target
COSMIT
FIX TST given cosmit
COSMIT don't need set
DOC explain the code
COSMIT product(..., repeat=2)
Update mailmap
DOC add missing datasets helper
ENH remove deprecated
ENH remove deprecated things (2)
Update what's new thanks @NicolasTr
ENH add support for string input with classification metrics
ENH use the new format syntax
ENH remove inspect
COSMIT
Update what's new
DOC state that string is possible
TST with labels arguments
FIX what's new...
ENH remove bad examples
DOC let some example for prf metrics
ENH allows make_multilabel_classification to return label indicator f…
TST grid_search_cv works with multioutput data
TST cross_val_score with multioutput data
COSMIT
ENH consistency mse=> mean_squared_error ari => adjusted_rand_score
FIX docstring
Update what's new
DOC add missing links to the scorer and classification section
ENH add multioutput support to KNeighborsRegressor
ENH add multioutput support to RadiusNeighborsRegressor
ENH add multioutput support for KNeighborsClassifier
ENH add multioutput support to RadiusNeighborsClassifier
DOC + example with multioutput regression face completion for knn
ENH allows make_multilabel_classification to return label indicator format
ENH TST grid search with multioutput
ENH TST random search with multioutput data
DOC gridsearch support multioutput data
TST cross_val_score with multioutput data
DOC more information about which classifier support multilabel
DOC unveil that some estimators support multilabel classification and multioutput-multiclass classification
DOC overall improvements
pep8
DOC credit + fix typo + wording + use matplotlib.pyplot
ENH take @glouppe comments into account
FIX small title issue
DOC update what's knn and radius-nn support multioutput data
FIX bug in f_score with beta !=1
FIX formula inversion for sample-based precision/recall
FIX set same default behavior for precision, recall and f-score
ENH raise warning with ill define precision, recall and fscore
Backport assert_warns and assert_no_warnings from np 1.7
TST test warning + ENH Add warning average=samples
FIX TST with warnings thx to @jnothman
flake8
ENH set warning to stacklevel 2
TST silence warning
ENH use with np.errstate
DOC TST correct comment
FIX warning test
FIX warning tests in preprocessing
PY3K remove __pycache__ in make clean
FIX PY3K warning.catch_filter set record
DOC overall improvements in the multiclass documentation
DOC take into account @vene and @ogrisel + specify format for multioutput-multiclass
DOC rewording
Typo
DOC ENH take into account @NelleV comments
DOC more comments from @NelleV
DOC Remove deprecated reference + acknowledge @larsman
DOC Update what's new
ENH more explicit name for auc + consistency for scorer, fix #2096
DOC put the narrative documentation of roc_curve and roc_auc_score in one place
FIX search and replace mistake
Aymeric Masurelle (19):
FIX : pass random_state to kmeans in gmm.fit
FIX : add condition pos_label!=None for multiclass purpose in metrics.precision_recall_fscore_support
TEST : add a test, test_precision_recall_f1_score_multiclass_break(), that breaks with current master and now works
Change metrics.py as before and shorten test (test_precision_recall_f1_score_multiclass_break() in test_metrics.py) to show where it breaks
ADD : cosinus kernel calculation in metrics/pairwise.py
add cos_kernel in help of decomposition/kernel_pca.py
name change: cos into cosine
change way of calculating cosine_kernel in metrics/pairwise.py
add test for cosine_kernel in metrics/test_pairwise.py
correct indent pb and re-edit cosine_kernel help in metrics/pairwise.py
fix style issue by running pep8 on metrics/pairwise.py and on metrics/tests/test_pairwise.py
remove duplicated test_cosine_kernel() in metrics/tests/test_pairwise.py
change test_cosine_kernel to include normalize from preprocessing.py in metrics/tests/test_pairwise.py
remove duplicated dimension check in metrics/pairwise.py
add reference to cosine similarity in cosine_kernel help from metrics/pairwise.py
modify cosine_kernel func to use normalize from preprocessing.py and change the test_cosine_kernel adding scipy.sparse inputs respectively in metrics/pairwise.py and metrics/test_pairwise.py
modify test_cosine_kernel to compare result obtain with linear kernel in metrics/tests/test_pairwise.py
FIX: add prefix 'np.' to sqrt for test_cosine_kernel in metrics/tests/test_pairwise.py
FIX: move import of normalize function into the cosine_kernel call in metrics/pairwise.py
Bala Subrahmanyam Varanasi (2):
modified 'gid' to 'git'
pep8 compliant
Bastiaan van den Berg (1):
BUG allow outlier_label=0 in RadiusNeighborClassifier
Ben Root (7):
This should make the hungarian algorithm accept rectangular cost matrices. Also enabled the tests.
An additional check needed in case where there are fewer columns than rows.
Added support for hungarian assignment problems where one dimension of the cost function is zero-length.
Created an alternative hungarian solver for rectangular matrices that does not involve matrix padding.
hungarian() now returns a 2-D array of indices instead of a 1-D array. Also modified the find_permutations test to accommodate.
Some minor changes to docs, and small simplification in code.
Updating namespace usage from scikits.learn to sklearn
Benjamin Peterson (1):
ENH import six package for Py2/Py3 compat in a single codebase
Bertrand Thirion (74):
introduced gael's implementation of fast_ica and debugged GS orthogonalization
cosmit in fastica, that created a bug -- to be fixed
updates in fastica and more tests
completed and cleaned the tests
improved the tests
solved conflict in test_fastica
added probabilistic PCA and associated tests; works reasonably well
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Merge branch 'master' of github.com:scikit-learn/scikit-learn
fix in ppca
cosmit in pca and test_pca
merged origin and fixed a conflict
merged origin
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Merge branch 'master' of github.com:scikit-learn/scikit-learn
merged the main repo
Merge branch 'master' of github.com:scikit-learn/scikit-learn
new criterion for wards clustering
one single cython module for inertia and ward distance
always use scikits ward algo when no structure is provided
tiny updates on lda (checks, numerical stability)
removed the unused inertia stuff
Variable renaming and docstring fixing
merged with master logsum -> logsumexp
ENH: renaming estimated variables from self._variable to self.variable_
removed the decode
removed the decode in dpgmm and removed return_log in eval
ENH: Cleaned after rebase and compatibility with hmm
ENH: Removed X and z variables from dpmm class (should not ship the data)
ENH: avoid initializing GMM means with zeros
ENH: more sensible initialization in case of divergence
BF: Mended the tied covariance estimator
ENH: added multiple initialization to the GMM -- untested
FIX: fixed collateral damages in hmm
added some tests to ensure that GMMs work in about all conditions
ENH: renaming cv_type and posterior to more explicit name + tested multiple init
avoid changing the covariance when computing the Gaussian density
FIX: Fixed a bug I introduced in dpgmm
ENH: Added AIC/BIC + tests. Seems to work
Cosmit in dpgmm
merged with master
Changed the shape of spherical covariance matrices to be equal to diagonal covariance matrix, in order to avoid handling the dimension in particular
Merge branch 'master' of github.com:scikit-learn/scikit-learn into gmm-fixes
detail fixed in an example
Hopefully clarified notations in dpgmm
Many corrections in dpgmm to remove unnecessary loops (significant speed-up) + renaming
Fixed an example that happened to fail
Several details outlined by Jake
handled the eval on Null data
merged the master repo
Added an example with model selection
Oups: really added an example with model selection
ENH: Removal of properties from GMM -- unfinished
removed properties from dpgmm
replace log_weights_ by weights_, which makes the API more consistent
Getting rid of properties in hmm, gmm, dpgmm
fixed a doctest
ENH: Some cleaning in the examples
ENH:pep8
ENH: enforcing skls conventions
A pass on the docs
corrected the doc for dpgmm
removed get_means, set_means, get_weights, set_weights
ENH: renamed plot_gmm_model_selection.py to plot_gmm_selection.py
Fixed the doctests in hmm
COSMIT:pep8 in hmm
Corrected the docs
ENH: changes in the code to fulfill Gaels requirements
Merge branch 'master' of github.com:scikit-learn/scikit-learn into gmm-fixes
ENH:Added back rvs as deprecated and updated whatsnew.rst
ENH: fixed the GMM docs
ENF changed INF_EPS to EPS in hmm too.
Bogdan Trach (1):
* doc/conf.py: added required latex packages (bm and morefloats)
Brandyn A. White (2):
Fixed docstring to reflect current code in precision_recall_curve.
Faster confusion_matrix implementation
Brian Cajes (6):
Improving code coverage for datasets module. Moved dataset imports inside test_data_home, because it is preferable for import errors to only affect the tests that require those imported methods. My first commit to scikit. -bcajes
revert to original import placement style
Improving code coverage for datasets module. Moved dataset imports inside test_data_home, because it is preferable for import errors to only affect the tests that require those imported methods. My first commit to scikit. -bcajes
bring datasets.base to 97% coverage with a few more tests
removing backup file
checking data.shape for each test dataset
Brian Cheung (15):
Discretization method for spectral clustering added along with tolerance setting to loosen eigendecomposition constraints
Documentation and small bugs fixed and code cleaned up
Small comments/constants added
Added more info in documentation
Small aesthetic fixes to discretization
pep8 formatting
More description of the discretization algorithm.
Even more description of the discretization algorithm.
Documentation changes, removed more camel case variables
Fixed some memory inefficiencies and cleaned up documentation and code semantics
Example for spectral clustering embedding handling
Added newline to the end of file
removed a hardcoded value
Modified lena segmentation example to include different embedding solvers
Removed savefig
Brian Holt (198):
Refactored decision trees and forests to support CART algorithm.
Refactored decision trees and forests to support CART algorithm.
Added documentation
make number of classes explicit
Added visualisation and corrected bugs in CART algo
Merge branch 'enh/ensemble' of https://github.com/satra/scikit-learn
Merge https://github.com/scikit-learn/scikit-learn
improved nosetests doctest time
PEP8
Merged decisiontree and tree_model into tree, random_forests to ensemble
20% speed improvement by moving _find_best_split to cython
removed occurrences of tree_model
Merge https://github.com/scikit-learn/scikit-learn
Added the Boston House Prices dataset
Fixed imports and run unit test
Merge pull request #6 from vene/boston
Corrected import of the data: all 506 columns are now usable
merge
Merge branch 'boston' of https://github.com/bdholt1/scikit-learn
Updated documentation for boston house prices dataset
FIX: removed the required parameter K
FIX: dataset description
Further optimisation of _find_best_split
Further optimisation of _find_best_split
Refactoring: speedup of decision tree code
Further performance improvements. Now approx 30 - 50% faster than MILK.
Merge branch 'boston'
Updated benchmarking for trees
PEP8
DOC: added documentation for graphviz method
FIX: corrected computation error and typed incoming arrays
FIX: corrected graphviz visualisation.
removed everything except the plain and simple decision tree to make reviewing easier
DOC: Updated the documentation to reflect decision trees.
Corrected newlines, and ensured only tree related changes are in this set
FIX: replaced ad-hoc RNG with suggested scikits.learn implementation. Tidied up dependent examples.
Merge with master
Merge https://github.com/scikit-learn/scikit-learn into enh/tree
Merge https://github.com/scikit-learn/scikit-learn
Removed unused import
PROF: improved speedup thanks to ppret
Merge branch 'enh/tree'
Initialise random state for examples
DOC: Added +ELLIPSIS for examples
ENH: Support binary (-1,1) classification as well as [0,...,K) multiclass classification
Merge git://github.com/scikit-learn/scikit-learn into enh/tree
Removed unnecessary import
Fixed doctest example
Updated documentation for class interface
Minor patches to docs
Optimisation: moved _find_best_split to cython.
DOC: change classification to regression
Merge github.com:scikit-learn/scikit-learn into enh/tree
DOC: corrected doctest
don't allocate a new pm for each call: 3 times faster
Moved to @pprett's faster splitting code (debugged)
Added more debugging info to graphviz
Moved to the version without a sample mask, since correctly implemented it is almost as fast
Fixed error of splitting between identical feature vals
DOC: updated comments
Fixed memory leak in libsvm
Improved graph visualisation
Move initial entropy computation outside loop.
raise ValueErrors with appropriate messages
merged upstream master
merged upstream master
Standardise error messages
Copied ensemble and random forest classifiers to new branch
Check that labels are in range for multiclass classification
Check that labels are in range for multiclass classification
Further clarification of error messages
Merge branch 'upstream-master' into crossval
Fixed regression bug. Thanks @pprett
Merge branch 'upstream-master' into enh/tree2
merged enh/tree2 into enh/tree
Fixed doctest
Enforce 64bit and 32bit types and correct regression bug (divide by zero).
Refactored construct to subsample dimensions.
store all tree parameters in the RF base class so that clone() will work
Revert to _Fixed Doctest_ and added regression bug fix
update to unit test and doc test
enforce type on storage arrays
enforce 64 bit types on parameters
further type enforcement
initialise variables
removed unused import, removed unnecessary backslash
Improved names and documentation for Leaf and Node
Renamed K to n_classes
renamed F to max_features
renamed features to X
renamed labels to y
renamed n_dims to n_features
explained min_split
renamed C to predictions
improve documentation
renamed K to n_classes
COSMIT: improved documentation
renamed pm to label_count
renamed K to n_classes
improved documentation and renamed features and labels
renamed var to variance
fixed comments
Updated docstrings
merged upstream-master into enh/tree
Merge pull request #9 from ogrisel/bdholt1-enh-tree
Merge pull request #10 from ogrisel/bdholt1-enh-tree
merged upstream/master
renamed scikits.learn to sklearn
Push coverage up to 96%, added graphviz test
merging
Merge pull request #11 from pprett/bdholt1-enh/tree
added example usage of graphviz
Merge branch 'enh/tree' of github.com:bdholt1/scikit-learn into enh/tree
fixed unit test of graphviz
added trees (boston and iris datasets)
pep8
moved the min_split test to beginning of recursive_split
group imports by hierarchy
sed s/dimension/feature/g
time is measured in seconds
print left and right child repr in graphviz
Merge branch 'enh/tree' of github.com:bdholt1/scikit-learn into enh/tree
fixed graphviz test failure
added feature_mask to reduce fancy indexing
replaced == with 'not' operator
updated the decision tree docs (not done yet)
use Fortran array layout
corrected feature_mask implementation
allow for different architectures
merged upstream/master moving to sklearn
merged enh/tree
Merge pull request #12 from pprett/bdholt1-enh/tree
Incorporated suggested changes to Graphviz exporter
visit -> export
cosmit: added spaces
cosmit: improved documentation
fixed indentation and added section on memory requirements
Updated documentation to include the iris svg example
improved documentation
np.float64 -> DTYPE. Set DTYPE to np.float32.
make sorting more efficient by transposing and sorting along last axis.
Use a sample mask instead of fancy indexing.
Merge pull request #13 from pprett/bdholt1-enh/tree
COSMIT: corrected comments
made sample_mask a fit parameter
updated documentation to reflect min_density concept
Merge pull request #14 from pprett/bdholt1-enh/tree
there is no more Leaf class
added feature_names to GraphViz export
Tidied up graphviz related code
test for improperly formed feature_names
removed sample_mask parameter
only return values that are used
Merge branch 'master' of github.com:scikit-learn/scikit-learn into enh/tree
Merge branch 'enh/tree' of github.com:bdholt1/scikit-learn into enh/tree
Merge pull request #16 from pprett/bdholt1-enh/tree
use np.isfortran
use None as the default marker
compute node id's on the fly
removed leftover class_counter
Merge pull request #17 from larsmans/enh/tree
added test for pickle-ability
Merge branch 'enh/tree' of github.com:bdholt1/scikit-learn into enh/tree
Merge pull request #19 from pprett/bdholt1-enh/tree
fixed failing doctest
improved tree documentation
included a mathematical formulation for CART
verify that scores from pickled objects are equal to original
pep8
Merge pull request #20 from GaelVaroquaux/tree
COSMIT: +SKIP on classification doctest
rewrote GraphvizExporter into a function export_graphviz
removed duplicate tests (already in fit)
Merge pull request #21 from glouppe/tree
classes can be any integer values
require that the next_sample_larger_than is greater than the previous by at least 1.e-7
regenerate cython
if threshold is indistinguishable from a, choose b
modified threshold comparison from < to <=
Merge branch 'master' of github.com:scikit-learn/scikit-learn into enh/tree
Added tree module to whats_new
release sv_coef memory
tree construction depends on n_features
Merge pull request #22 from ogrisel/bdholt1-enh-tree
Added person webpage
added trailing underscore
Merge branch 'master' of github.com:scikit-learn/scikit-learn into enh/ensemble
Merge pull request #23 from larsmans/enh/ensemble
scikits.learn -> sklearn
update parameter names
Merge branch 'master' of github.com:scikit-learn/scikit-learn into enh/ensemble
remove enforcement of return type
replaced ratio r with sampling with replacement
Re-ran the tests and found that the GaussianNB error was much lower.
Fixed typo
added multi-output tree example
updated documentation to reflect multi-output DT regression
added link
Bryan Silverthorn (3):
Test KernelPCA support for n_components.
Add support for n_components in KernelPCA.
PEP8 fix.
Bussonnier Matthias (1):
[Docstring Typo] making there -> making their
Carlos Scheidegger (1):
BUG: missing subpackage svm/sparse on setup.py. fixes issue #559
Charles McCarthy (2):
Fixed data.filenames consistency issue when 'all' specified for 'subset'.
Added basic test for filenames consistency when all specified.
Charles-Pierre Astolfi (1):
Typo fix
Christian Jauvin (4):
Mechanism to propagate optional estimator.fit arguments when using CV
changed **fit_kwargs to explicit fit_params dict
make sure that param has len attr + a test
replaced assert with assert_true + error msg
Christian Osendorfer (17):
Fixed problem with big full covariance matrices: sum,log instead of log,prod for loglikelihood computations.
Factor Analysis -- implemented with EM + SVD.
TST: Make factor analysis test repeatable.
Extended faces decomposition example with Factor Analysis.
Factor Analysis learns variance of generative model for every dimension. Illustrated with faces.
pep 257.
Make sure that psi=0 does not break em.
Some documentation for FA.
More or less same code already available.
Plot noise variance for FA. Changed some things to make plot_gallery usable for this, too.
Adding some plots for FA. Ordering of articles must be adopted.
Extended test a bit.
Added score function.
Two iterations are enough for the test.
score works like ppca.score().
adapted to new signature of score().
Moved paragraph on FA before ICA.
Christoph Deil (1):
Fix typo in README
Claire Revillet (1):
- fix missing links to the C math library
Clay Woolam (103):
added label propagation class
switch map and sum commands to numpy
fixing up tests, adding "unlabeled_identifier"
basic features of multiclass labeling up
fixing the way labeling works
checking in minor changes
added documentation, reworking tests
fixing up tests
added a lot more to label propagation, explained algorithms and differences between the two models
more documentation
added beginning of examples
added "structure" example
tweaked structure plot
finalized SVM comparison example
all tests pass
removed some stuff from documentation
updated pydoc to make behaviour clearer
passed PEP8, using already implemented kernel functions
making everything more numpy compatible
graph construction and example more numpy-like
fixed other diagonal matrix construction
rename misnamed "plot" example
example conforms to pep8
other example conforms to pep8
made test conform to pep8
predict() method now numpy friendly (100% numpy friendly now)
more numpy integration
removed function kernel, switched to string for picklability
fixed a bug in the circle example
moved label propagation examples to lower subfolder
more numpy friendliness
more numpy use,
fine tuned some documentation
added a snazzy label propagation versus SVM decision boundary plot
added more explanation to the plot
added semi_supervised directory
removed old, useless code
removed unused imports
added more documentation, another doctest for LabelSpreading
minor tweaks to the overall layout of the code
reverted plot_iris accidental commit
added unlabeled_identifier explanation to docstrings
Merge remote-tracking branch 'upstream/master'
fixed indentation problem in documentation rst
conformance to pep8
fixed bug in tests causing gram matrix construction to not work properly (assumed casts to floats)
added two new examples, including an active learning demo with label propagation
heavily downsampled digits examples (runtime a few seconds now) and removed supporess_warrnging bug
changed doc to remove long runningtime warning
renamed active learning example so it won't be run for doc compilation
changed subplot titles so the plot is more clear
fixed structure example
added vene's subplot adjustments
Merge branch 'new_lp'
made convergence check function private
fixed spelling error with variable name (indicies -> indices)
optimized _build_graph with inplace methods, conform to standards with variable names
one more optimization! avoids cast to numpy matrix and does in place matrix multiplications
fixed test cases to conform to api changes & new internal parameters
updated docs!
Merge git://github.com/scikit-learn/scikit-learn
localized a variable
fixed test suite, changed module to conform to new sklearn naming scheme
fixed examples for new naming scheme
merged ogrisel's docs & optimization, also fixed active learning example plot
changed a bunch of variable names, fixed some test cases
all code works great, all tests pass, full coverage
changed a variable name to conform to scikits code
correct variable names and added inline comments for active learning examples
added attributes text to explain named attributes
Merge branch 'master' of git://github.com/scikit-learn/scikit-learn
Merge branch 'master' of https://github.com/scikit-learn/scikit-learn
added support for sparse KNN graphs and tests
finishing up sparse additions (need to complete todo)
sparse KNN graphs now work
ENH add label propagation algorithm
finalized KNN work, all tests pass properly
Merge branch 'larsmans-label-propagation'
removed extra semisupervised folder
polished the lp & test code
Merge branch 'master' of https://github.com/scikit-learn/scikit-learn
Merge branch 'master' of https://github.com/scikit-learn/scikit-learn into label-propagation
variable name changes, using premade functions, doc fixes as per
variable name changes, doc corrections
removed unlabeled_identifier, updated tests and examples to reflect this
corrected example that still referred to unlabeled_identifier
optimization that stores the spatial index when using knn graphs
updated rst docs with kernel information
shuffled digits example, added sensible point colors to plot chart,
docs describe the different kernels available in techniques
TL directory change to push label propagation code into semi_supervised
added __init__.py file to semi_supervised folder
Updated docs for label propagation, added more technical details about
specific fine tuning to the label propagation docs
doc updates & tweaks
fixed typo in test code
added AISTAT ref to docs
added AISTAT ref to rst doc
fixed bug causing error on sparse input data
corrected the documentation and add semi-supervised section to the user
placed semi-supervised under supervised learning techniques in user
Merge remote-tracking branch 'upstream/master'
fixed error in graphviz export code causing graph error raised with
Conrad Lee (39):
Modified learn.cluster.mean_shift_.py so that the mean_shift function uses a KDTree to efficiently calculate distance within some threshold. KDTree implementation is in C and is from scipy.spatial. Tested only using the example located in examples/cluster/plot_mean_shift.py
Added another variant of mean shift clustering (in scikits/learn/cluster/mean_shift_.py that seeds using a binning technique on a grid.
Modified learn.cluster.mean_shift_.py in the following ways: Replaced old seeding strategy with bucket strategy which should be scalable. Modified nearest neighbor lookup to make it more scalable by adding a maximum number of neighbors -- in most cases this will not make a difference in the results --- the impact of this change is tunable with the max_per_kernel parameter. It is now possible to force all points to belong to a cluster (default) or only those points that are within t [...]
Modified learn/cluster/mean_shift_.py in the following ways: Added more efficient and proper removal of duplicate clusters. Took seed detection out of mean_shift function and put it in its own function. Default bucket size for seed detection is now the bandwidth.
Made following changes to cluster.mean_shift_.py: Added documentation for new functions. Made following changes to cluster.__init__.py: this module now imports the get_bucket_seeds function from mean_shift_.py
scikits.learn.cluster.mean_shift_.py modified in the following way: improved documentation
Changed plot_mean_shift.py example to use larger data set to show how bandwidth estimation dominates the runtime.
Changed scikits.learn.cluster.mean_shift.py: Updated reference for mean_shift algorithm
Changed scikits.learn.cluster.mean_shift.py: Added Conrad Lee as author.
Changed scikits.learn.cluster.mean_shift: modified so that complies with pep8.
Changed scikits.learn.cluster.__init__.py and examples/cluster/plot_mean_shift.py: modified so that complies with pep8.
Changed scikits.learn.cluster.mean_shift_.py: Now uses BallTree because of built in query_radius function, allowing us to get rid of the get_points_within_range function. Changed MeanShift to not use bucket seeding by default.
Hard coded bandwidth to 1.30 because otherwise its calculation is too slow.
Changed scikits.learn.cluster.mean_shift_.py: now uses blas nrm2 to compute norm.
Modified file scikits.learn.cluster.mean_shift_ Replaced a list comprehension and a for loop with numpy operations to improve efficiency.
Modified file scikits.learn.cluster.mean_shift_: removed print lines used for debugging, made code compliant with pep8
Modified file scikits.examples.plot_mean_shift.py: updated reference.
Mean shift: now uses norm function from utils.extmath
Mean shift: removed obsolete reference to KD-Tree with reference to BallTree
Removed obsolete import of izip, made description of complexity more concise and accurate
Mean shift: settled on term 'bin' and removed unnecessary references to 'bucketing' or 'discretization' from variable names and documentation
Mean shift: Fixed a minor type
Mean shift: Moved a test file in preparation for merge with agramfort's branch
Merged agramfort's branch with my own
Mean shift: removed my old test script due to merge with agramfort, changed num points in plot example to ten thousand to speed it up.
Brought my branch for mean shift modification up to date with current head on github
Mean shift: modified get_bin_seeds so that it no longer has to copy all points
Mean shift: Fixed a bug that occurs when the cluster_all argument is False
Merge remote-tracking branch 'upstream/master'
Mean shift: fixed bug introduced during upstream merge
cross_validation.py: fixed bug in text of error message
metrics.py: modified precision_recall_curve to lower computational complexity
metrics.py: pep8 and other cosmetic changes
metrics.py: Added more comments to precision_recall_curve.
metrics.py: bugfix in precision_recall_curve and added tests
metrics.py: more detailed comment in precision_recall_curve
metrics.py: pep8
metrics.py: COSMIT more comments on precision_recall_curve
metrics.py: COSMIT, replaced cryptic np.r_ with np.hstack
Corey Lynch (10):
cythonized expected_mutual_information
added authors
Changed example svc kernel to be linear, however the error curve ends up flat under the new kernel.
Used more extreme values of C to show a more pronounced error curve.
Took out a save image line
Edited docs to reflect change in kernel used.
added yticks
added yticks
added yticks
limited range of C cross validation
Dan O'Huiginn (1):
Fix a few spelling/grammar errors in the docs
Dan Yamins (14):
added arithmetical ordering patch for labels in linear.cpp and test for liblinear predict
comment
simplification in liblinear testing
pep8 compliance in liblinear testing code
simplified liblinear prediction function
two trailing whitespaces removed from an multiline comment :)
minor syntactic improvement in liblinear test function ...
one more minor improvement to liblinear test code
I think i've got it this time ...
pep8 compliant at last!
various changes to handle fortran ordering in matrices
some pep8 fixes .... but probably more to come
removed testing thing
pep8 stuff as well removed testing stuff
Daniel Duckworth (9):
Merged svm parameter selection visualization
split plot_rbf_parameters.py's plot into two
Added plot_rbf_parameters example to SVM doc
Fixed bug in plot_rbf_parameters.py causing only one figure to show
Fixed location of ".. _svm_mathematical_formulation:" in svm.rst
Convert input dtype to float in pairwise_distances
Convert input dtype to float in pairwise_distances
Merge remote-tracking branch 'upstream/master'
Python 2.6 bugfix for plot_rbf_parameters.py
Daniel Nouri (14):
Test qda with 'priors' parameter
Test QDA.covariances_ attribute
Don't cover this deprecated method
Test non-normalized GaussianProcess
Test _BaseHMM._decode_map
Test _BaseHMM.{predict,predict_proba}
Make this bit of code more compact (and improve code coverage).
Remove unused code branch. (_hmmc must be always available nowadays.)
Remove stale test code
Remove obsolete comment
Improve cross_validation test coverage: 94% -> 99%
Improve metrics.metrics code coverage: 95% -> 100%
Improve svm.base test coverage: 92% -> 98%
Add docs for `vocabulary_` and `stop_words_` attributes of Countvectorizer.
Daniel Velkov (1):
Fix wrong argument name in RFECV docstring
David Cournapeau (1):
REF: hack to be able to share distutils utilities.
David Marek (17):
fixed SparsePCA.transform returning NaN for 0 in all samples. (fixes #615)
Added test for SparsePCA.transform (checks #615)
ENH: Added p to classes in sklearn.neighbors
TEST: tested different p values in nearest neighbors
DOC: Documented p value in nearest neighbors
DOC: Added mention of Minkowski metrics to nearest neighbors.
FIX+TEST: Special case nearest neighbors for p = np.inf
FIX: pep8
ENH: Use squared euclidean distance for p = 2
ENH: train_size and test_size in ShuffleSplit (#721)
TEST: Added more tests for ShuffleSplit
TEST: Tested ShuffleSplit with different types of test_size
Changed deprecation warning.
DOC: Added changes in ShuffleSplit and sklearn.neighbors
Error checking now works for more types than just int and float.
Use numpy dtype.kind instead of isinstance
TEST: assert_equal instead of assert
David Warde-Farley (21):
Rephrase motivation for Sparse PCA
Misc rephrasings of sparse PCA docs.
Remove 'structured sparsity not implemented' comment
Prefix explanation of sparse PCA formulation with 'Note that'
atoms -> components for clarity
Trailing whitespace fix.
Rewording in docstring
gradient descent -> coordinate descent in docstring
'Returns' section of the _update_code docstring
Wrap np.seterr reset in a try..finally block
ImporError -> ImportError
Added loader code for (Roweis) Olivetti faces dataset.
Added imports to __init__.py for Olivetti faces
Documentation for the Olivetti Faces dataset.
Remove 'load_' alias for 'fetch_'
Use prints for now instead of logging at Gael's request
Add a shuffle keyword, default False
Fix math notation for exp and tanh.
Add pointer to kernel equations from SVC docstring.
Rephrased narrative doc reference in docstring.
Added RST comment about where to find narrative docs.
Denis Engemann (28):
FIX + ENH: catch custom function argument errors and inform user
FIX transform tests
FIX: remove inplace mod
COSMITS
FIX: inverse transform + add mean_
COSMITS
FIX: syntax typo
FIX: tutorial
COSMITS + DOC
COSMITS
ENH: improve tutorial to be more clean.
ENH + FIX: remove inverse-t kwarg + fix mean_
FIX: address @agramfort 's comments
FIX: address remaining issues
ENH: speed up logcosh
ENH: improve ICA memory profile by 40%
ENH: add failing test exposing bug in RandomizedPCA
FIX: only center if copy == True
ENH: get it right.
FIX: inverse_transform; tests
DOC better doc message
API: get rid of **params in PCA estimators.
DOC: more doc string fixes in pca.py
DOC: more fixes in pca.py doc strings
STY: get rid of unnecessary identifiers
FIX: X.copy() test now works
STY: removing unnecessary import
COSMITS
Denton Cockburn (3):
DOC fix some docstring/parameter list mismatches
renamed weight to sample_weight in sklearn/isotonic.py
DOC missing stuff in randomized_l1 module
Diego Molla (2):
Minor bug fix in metrics.adjusted_rand_score
Added tests
Doug Coleman (18):
BUG: Don't test test_k_means_plus_plus_init_2_jobs on Mac OSX >= 10.7 because it's broken. See #636. Closes #1407.
BUG: Fix the random_state on make_blobs() in test_classifiers_classes(). Fixes #1462.
BUG: Make a RandomState object and use it in test_transformers(). Fixes #1368.
FIX: Cast floats to int before slicing in robust_covariance
BUG: Build random forests the same way regardless of n_jobs and add a test for this. Don't predict in parallel since the cost of copying memory in joblib outweighs the speedups for random forests. Fixes #1685.
COSMIT: Fix up a loop.
COSMIT: Better assert.
DOC: Update new magic numbers in docs since random forests train differently now.
FIX: sklearn.ensemble.forest: Refactor to remove references to parallelism in predict() functions.
BUG: Fix performance regression on large datasets in random forest.
DOC: Emphasize that n_jobs is for fit and predict methods in random forests.
BUG: Use Py_ssize_t to index into numpy arrays to help Python handle big data.
MISC: Update _tree.c with cython.
BUG: Use ``Py_ssize_t`` in a few more places for strides. Add the c file again.
DOC: Clarify docs on preprocessing.Binarizer.
FIX: Finish package rename from mst -> sparsetools. Fixes #2189.
DOC: Fix backwards docs on thresholds for preprocessing.
FIX: Newer numpy causes scipy to issue a DeprecationWarning. Ignore it. Fixes #2234.
Dougal Sutherland (3):
StratifiedKFold: remove pointless copy of labels
stochastic_gradient: fix mistake in _init_t docstring
stochastic_gradient: describe all losses, fix epsilon description
DraXus (2):
peping8 examples
peping8 examples/applications
Edouard DUCHESNAY (45):
add pipeline
WIP pipeline
Example of feature selection pipeline
Merge branch 'master' of github.com:vmichel/scikit-learn
Cosmetic on Pipeline
Merge branch 'master' of github.com:vmichel/scikit-learn
Partial Least Square 2 blocks mode A (PLS) implementation
PLS examples
Merge branch 'master' of github.com:scikit-learn/scikit-learn
PLS mode A : two estimation algo: NIPALS & SVD
PLS: WIP
PLS : cosmetic changes
PLS
PLS cosmetic
PLS: optimize, compare against R implementation, clarify terms
PLS: simplify API + some additional tests
PLS: add transform function
PLS: test_pls fix a bug
Merge branch 'master' of github.com:scikit-learn/scikit-learn
PLS: transform method
PLS : add predict function
PLS : predict
Merge branch 'master' of github.com:scikit-learn/scikit-learn
PLS : make sure this also works with 1 dimensional response (PLS1)
Merge branch 'master' of github.com:scikit-learn/scikit-learn
remove quotes "" on columns names
PLS cosmetic: PEP8, etc.
PLS, new specific classes: PLSCanonical, PLSRegression, CCA + some cosmetics
PLS: computation optimization
PLS API
PLS: API (2)
PLS : coefficients computation
PLS : check for numerical instabilities + force float
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Merge branch 'pls' of https://github.com/fabianp/scikit-learn
Merge branch 'master' of github.com:scikit-learn/scikit-learn
resolve conflict
Merge branch 'pls' of https://github.com/fabianp/scikit-learn
resolve conflict
samples generators: remove multivariate_normal_from_latent_variables
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Check that scikit-learn implementation of PLS provides exactly the same outcomes
Some more non regression test on PLS
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Merge pull request #869 from pprett/pls-scale-by-zero
Emmanuelle Gouillart (7):
Corrected a few typos in the documentation.
In spectral clustering example, forced the solver to be arpack
Example on tomography reconstruction with Lasso for the gallery.
COSMIT: PEP08
Tomography example: PEP08, typos...
Reference to tomography example in narrative doc
ENH: a few typos in docstrings
Eugene Nizhibitsky (1):
Fix staged_predict_proba() in gradient_boosting.
Eustache Diemert (35):
added first version of out-of-core example
revision round #1 (move to examples/applications, 1 file, auto-download dataset)
pep8 / pep257 compliant formatting
get rid of feature dicts, leverage HashingVectorizer class directly
plot as both a function of time and n_examples
using print() function
improve explanations on out-of-core learning paradigm
improve explanations on example structure
fixed use of docstrings + added section in whats_new.rst + added data dir to .gitignore
more robust data location
use same, separate held-out data to estimate accuracy after each mini-batch
added first version of out-of-core example
revision round #1 (move to examples/applications, 1 file, auto-download dataset)
pep8 / pep257 compliant formatting
get rid of feature dicts, leverage HashingVectorizer class directly
plot as both a function of time and n_examples
using print() function
improve explanations on out-of-core learning paradigm
improve explanations on example structure
fixed use of docstrings + added section in whats_new.rst + added data dir to .gitignore
more robust data location
use same, separate held-out data to estimate accuracy after each mini-batch
fixed conflict in whats_new.rst
Merge branch 'master' of git://github.com/scikit-learn/scikit-learn into out-of-core-examples
factorized instance extraction + plots
added note on test set creation rationale
cosmit : inline extract_instance
Merge branch 'master' of git://github.com/scikit-learn/scikit-learn into out-of-core-examples
more structured iteration using islice + wrappers; renamed chunk for minibatch as the latter seems more common in the literature
added sub section on out-of-core scaling in the narrative docs
Merge branch 'master' of git://github.com/scikit-learn/scikit-learn into out-of-core-examples
some more language corrections
more pep257 fixes (not for ReuterStreamReader as it is not really the interesting class here)
DOC recommend understanding NumPy in the tutorial
DOC expand feature selection docs with an example
Fabian Pedregosa (846):
Add intercept to classes Lasso and ElasticNet
Cosmetic changes in SVM doc.
Start of 0.5 development cycle.
Re-enable code that was removed for the release
Cleanup gmm example. Removed unused modules.
In LAR, normalize only non-zero columns.
Add support in LAR for unnormalized columns.
LAR: add a test for zero coefficients.
Cosmetic changes in glm module.
Add modules to top-level __init__.
Rename ninter --> n_iter in the API guidelines.
Add documentation to svm module.
FIX: bug in blas_opt detection.
Link against compiled cblas in case this is not in the system.
Bug fixing in setup.py
Apply changes made by Olivier.
One more tests on LibSVM with precomputed callable kernel.
Refactoring of LibSVM bindings.
Test for libsvm margin.
More bugfixes for blas detection in setup.py
FIX: numpy 1.4 fixes.
Mark as known to fail some tests in test_hmm.
Use numpy.testing instead of unittest to skip failing tests.
Refine cblas detection on OSX.
FIX: compatibility fixes for py3k.
Initial support for sparse matrices in SVMs (scikits.learn.sparse.svm)
Refine cblas detection on OSX.
FIX: compatibility fixes for py3k.
Initial support for sparse matrices in SVMs (scikits.learn.sparse.svm)
FIX: bug fixing on sparse.svm.
FIX: more bugfixing in sparse.svm.
Doc updates to the svm module.
Remove unused imports in qda module.
Some doc for the svm module.
Add target in to Makefile.
Fix names and missing parameters in LinearSVC.
Add support for sparse matrices in liblinear bindings.
Add a reference to density estimation in GMM docs.
Use relative imports inside scikits.learn.
Remove unused imports from hmm module.
Refinement and bugfixing in the liblinear bindings.
More refactoring and bugfixing with liblinear.
More refactoring in libsvm + liblinear.
remove unused imports from setup.py
run all tests suite through nose.
move liblinear into its own folder
Bug fixing in liblinear bindings.
Added some failing tests.
Bug fixing in liblinear bindings.
XFail tests that fail (or are plainly wrong).
Refactor layout of developer docs.
Revert unwanted changes (aka ooooops!).
Added tests to trigger failure on classes using liblinear.
Refinement and bugfixing in the liblinear bindings.
More refactoring and bugfixing with liblinear.
More refactoring in libsvm + liblinear.
remove unused imports from setup.py
run all tests suite through nose.
move liblinear into its own folder
Bug fixing in liblinear bindings.
Added some failing tests.
Bug fixing in liblinear bindings.
XFail tests that fail (or are plainly wrong).
Refactor layout of developer docs.
Revert unwanted changes (aka ooooops!).
Added tests to trigger failure on classes using liblinear.
Update the developer docs.
Refactoring & bug solving in liblinear.
FIX: fix liblinear predict in the multiclass case.
Merge branch 'master' of ssh://scikit-learn.git.sourceforge.net/gitroot/scikit-learn/scikit-learn
Add reference to pybrain from the ann docs.
Update numpydoc (sphinx extension).
Update svm rst doc.
Add rst doc for logistic (empty for now).
FIX: fix shape of support vectors in liblinear sparse.
Update git information.
Do not compare LinearSVC and SVC for exactly equal classification.
Update git information in README.rst.
Update sphinxext/docscrape from numpy's trunk.
Refactoring in the svm module.
Re-enable probability estimates in logistic regression.
Rename failing example in order to build the doc.
FIX: fix generating the examples with some tricky uses of pylab.
Update install information.
Fallback to plain html for image rendering in index.html
Fix bugs in dev docs.
Updates on install doc.
Update mailmap file.
Updates on sparse.svm.SVC
Remove install_requires line.
Update README. Remove unused dependencies.
Use str for printing parameters.
Update setup.py.
Change make setup to run setup.py
Use repr for arrays in representation of classifiers.
Use nosetest as testing tool in README.
Allow setting variable PYTHON, NOSETESTS in Makefile.
Doc: correct size of intercept in svm.
Keep shrinking and probability as booleans in SVM.
Refactoring: put all gmm examples in its own directory.
Some love for the rst docs.
Create new class NuSVR.
Some patches for k_means.
Backport changes in sparsetools to compile under python2.7.
Add a pure-python version of LARS and refactor structure in glm.
add README for gmm examples.
Refactoring and doc for svm module.
FIX: fixes for the lars lasso code.
Fix build system.
Fixes on the Lasso-LARS algorithm.
Add benchmarks for the LARS algorithm.
Change score function and add docstrings.
Some work on the rst docs.
more doc love.
DOC: more work on svm module.
Fix in LARS: specify manually number of iterations for full path.
Remove "debugging" traces...
Fix doctests.
DOC: some doc glm module.
Convert to ndarray in Ridge
DOC: glm module.
FIX: fix intercept in LinearRegression.
Doc: Add stub file.
Benchmarks for some Glm classifiers.
DOC: more on glm module.
Fix typo
Backport total_seconds from python2.7 to use in benchmarks
Refactoring in glm.benchmarks.
Rename nSV_ --> n_support_ in svm module.
Some more doc for the glm module.
Update doc svm.NuSVC
Make BSD find happy.
Compat: Add function copysign in the case of numpy < 1.4
Do not import pylab globally in benchmarks
Test for function utils.fixes._copysign.
FIX: fix previous (stupid) commit.
Welcome Virgile Fritsch
Update docstring for BayesianRidge.
DOC: update docstrings in glm.bayes.
Update docstrings in svm and logistic.
FIX: some fixes for bayesian doc in glm.rst
DOC: some more fixes on BayesianRegression doc.
Better error message in fit svm.
Fix failing tests (sparse svm).
Merge http://github.com/GaelVaroquaux/scikit-learn into gael
Be able to do _get_params and _set_params in a recursive way.
Fix imports.
Temporary fix for Table of Contents not showing.
Do not import pylab in the benchmarks.
LARS refactoring speedup Work In Progress!!!
more on lars optimization
more on lars speedup WIP
LARS with precomputed kernel working.
more work in lars optimization.
More on LARS performance: triangular solving and cholesky deletes.
A more challenging example.
cleanup and fix some tests.
More on LARS.
more optimizations
cleanup
Update CBLAS files: add rotg, rot, trsv, remove tpsv.
Good bye minilearn.
Fixes for ref atlas.
Some fixes for the atlas we ship.
Add missing cblas_dcopy files.
New theme for the web page.
Add what's new page and a nicer sidebar for index page.
Move glm related benchmarks to a common location.
Performance improvements for LARS precomputed Gram matrix.
Remove weight_label keyword from SVR.
Remove Minilearn C sources.
Add some developer information for the CBLAS we ship.
Cosmetic changes to the cblas README.
More love for the new web page theme.
Refactoring in the svm module.
Also remove Windows python extension (.pyd) in make clean.
move sparse.svm into the svm module to match glm.sparse.
Make a reference page in the docs.
Update class reference.
Cleanup in setup scripts.
Bugfixing in setup.py
cosmit.
Cosmetic changes to svm tests.
Correct default value of gamma in the svm docstrings.
Refactoring in sparse SVM and bug solving (default value of gamma).
Refactor svm tests.
Fix doctest failing by last bug fixing.
Add target test-doc to Makefile to test the RST docs.
Remove obsolete debugging code from grid_search.
Remove obsolete comments from doc.
Cosmetic changes to grid_search_digits.
Reduce import time.
Polynomial kernel also uses keyword gamma.
Fix typo in svm docs.
Fix wrong link in doc.
DOC: update doc about LARS.
Add LARS, LassoLARS to class reference.
Update funding info.
Update API changes in feature_selection doc.
Remove the ann module.
Remove obsolete css code from the docs.
Update docs on sparse svm.
Correct spelling errors in svm documentation.
Fix spelling errors in glm.rst
Remove non ASCII characters from the docs (problem in latex output)
Fix non-html (latex) generation of the docs.
Fix RST (numpydoc) markup.
Update organization in index.rst
Update doc of neighbors module.
Update links in svm doc.
Update theme in web page (sidebar color)
Fix malformed RST in BallTree.cpp
Change doctests that are machine dependent.
Update joblib to 0.4.5
Comment out fragile joblib tests.
Add test.py script that runs nose.
Remove printing statements from tests.
Adopt numpy naming scheme for __version__ attribute.
Compatibility fixes for utils.graph.
Use by default np.unique.
Compatibility fixes for scipy <= 0.7 and numpy <= 1.4
Compatibility fixes for old scipy.sparse.
Do not include Makefile in final release.
Add missing files to setup.py
Add missing images
Rename features --> feature_extraction to match module feature_selection.
Update information on testing.
FIX: bug in setup.py file from glm/sparse/
Update web page.
FIX: fix imports in example for renamed modules.
Add function template for doc.
Update web page theme.
Update mailmap file.
Add Feature Selection classes to the reference docs.
0.5 changelog (Work in progress).
DOC: better link that literalinclude.
Combine user guide into a single file.
Update changelog.
Add png logo.
Update MANIFEST.in file.
Update test.py and README.
Simplify test machinery.
Use ELLIPSIS for machine-dependent results in joblib.
Comment out machine-dependent tests from joblib.
Update Makefile
Welcome Mathieu Blondel.
Fix doctests from the tutorial.
0.5 release candidate.
FIX: some setuptools oddities.
0.5.rc2 release.
Still fixing distutils oddities ...
0.5.rc3 release.
Web page update.
Add sparse to glm/__init__
Fix typo in docstring.
Fix typo
Fix doctests from RST docs.
Fix links in about page.
Cosmetic changes in install.rst
You want the truth well here it is.
Add a link to the PDF version of the docs.
0.5 final release.
Start of 0.6 development cycle
Add note on executing the test suite.
Update web page.
Add a note on complexity for SVMs.
Add datasets to __init__ file.
Correct typo in docstring.
Allow access to multi-class SVM in liblinear.
Do not execute test coverage by default.
lighten GMM tests.
remove n_dim property (use plain field).
Fix and enable _covar_mstep_full in gmm.py
Cosmetic changes.
Bindings for libsvm-dense
Update svm benchmark with latest libsvm.
Some fixes for libsvm-dense
More accurate info in examples.
Update svm examples affected by latest API changes.
DOC: Some docstring for libsvm low level API.
Revert "DOC: Some docstring for libsvm low level API."
Revert "Update svm examples affected by latest API changes."
Revert "More accurate info in examples."
Revert "Some fixes for libsvm-dense"
Revert "Update svm benchmark with latest libsvm."
Revert "Bindings for libsvm-dense"
ENH: enhancements in the gmm module.
Make previous commit work also with old versions of scipy.
No specific need that matrix is upper-triangular in gmm.
Fix doctests in gmm (skip random ones).
Revert "Fix doctests in gmm (skip random ones)."
Revert "No specific need that matrix is upper-triangular in gmm."
Revert "Make previous commit work also with old versions of scipy."
Revert "ENH: enhancements in the gmm module."
Bindings for libsvm-dense
Update svm benchmark with latest libsvm.
Some fixes for libsvm-dense
More accurate info in examples.
Update svm examples affected by latest API changes.
DOC: Some docstring for libsvm low level API.
Compile _libsvm_sparse in the sparse module.
Add setup.py to svm.sparse
Preliminary fix for naming issue in OSX with libsvm.
Add a namespace to svm methods to avoid same name mangling.
Fix for building libsvm in a portable way.
FIX: fix doctest with recent API changes.
FIX: fix fragile doctest.
Updated liblinear to latest version 1.7.
Make liblinear quieter.
Update classes to use new features from liblinear 1.7.
Move logistic into glm and add a sparse version.
Doc: better tests for logistic.
Fix imports in example.
Fix doctests in sgd module.
Welcome Peter.
Avoid iterating over features in gmm.
Add more sanity checks for svm with precomputed kernels.
Use n_jobs=1 as default value in SGD module.
Unique URL for release-specific doc
Cleanup in libsvm bindings.
Cosmetic changes in gmm.
Improve docstrings in metrics.py
Cosmetic changes
DOC: Add new installation media and a note for pythonxy users.
FIX: prefix with plot examples that produce output image.
New implementation of LARS algorithm.
Add a test for lars_path.
Fix typo (wantto -> want to)
remove obsolete bench_lars.py
FIX: replace nsamples --> n_samples in svm docstrings.
Remove BaseLib class.
Implement make html-noplot for building the doc.
Update libsvm docstring with latest API changes.
Rename predict_margin --> decision_function.
Indentation fixes in libsvm bindings.
Performance improvements in LARS.
Better heuristic in LARS.
Add support for np.float32 matrices in lars_path.
Add parameter precompute='auto' for *LARS classes.
Some LARS refactoring.
Rename scikits.learn.gmm to scikits.learn.mixture.
Update developers info.
Add GridSearch and GridSearchCV to the class reference.
Update svm docs (content of dual_coef_).
Account for lower=True option in solve_triangular.
Do not import gaussian_process from top level __init__.
update NuSVC docstring.
Fix failing doctests in gaussian_process.rst.
Fix GridSearch does not exist.
Give credit for web page layout.
glm --> linear_model rename holocaust.
Welcome Vincent Dubourg.
Update AUTHORS information.
Initial support for weighted samples in svm module.
Cosmetic changes to web page layout.
Fix example paths for GMM after renaming.
Update class reference list.
Cosmetic changes in documentation.
Add sgd.* to class reference.
Move benchmarks outside the source tree.
Fix precompute keyword in LARS.
Update LARS benchmarks with latest API changes.
Cosmetic changes in plot_weighted_samples.py
Add cross-references between LassoLARS and Lasso.
More rename in the sgd module.
ENH: prettify web page layout.
Some love for scikits.learn.svm.
FIX web page layout for very long paths.
Update LARS documentation.
Fix for linear_model.rst
More love for rst docs.
Like it or not, we depend on setuptools.
Use original diabetes data as shipped by the R package lars.
rename lars --> least_angle
Remove duplicates in linear_model/__init__.py
Use relative imports in datasets.
FIX: sparse svms do not accept callable kernels.
py3k fixes: callable has been removed.
Py3k compatibility
Remove redundant site.cfg parsing.
Update status of py3k support.
Cosmetic changes in LARS.
FIX: correctly add depends files to setup.py.
Make libsvm recognize labels in increasing order.
Correct array size in decision_function docstring
TEST: sanity check on decision_function.
Inverse sign in decision_function.
No need to sort predict_proba any more.
Add a comment on inverting the sign of decision_function.
FIX: order of indices of support vectors in multiclass.
Shuffle globally for iris in test_svm.
Divide parameter alpha / n_samples for consistency with Lasso.
Cosmetic changes.
Cosmetic changes in lars.
Update .mailmap
FIX: fix bug in sparse liblinear: bias parameter was not set.
FIX lda, qda: new numpy.bincount requires integer arguments.
Started Changelog 0.6.
Change link in plot_face_recognition.
Remove example plot_lar.py
FIX: do not invert the sign of decision_function in OneClassSVM.
Add missing options to OneClassSVM.
web page layout fixes.
Remove duplicate docs (sphinx generates this for us).
Prepare for 0.6 release.
Remove generated classes on make clean.
Add notes on fluctuations of liblinear.
Add type info to docstrings.
FIX: backwards compatibility for scipy <= 0.8
Remove Methods from docstring.
FIX: scipy 0.9 compatibility fixes
FIX: second argument in euclidean_distances.
Cosmetic changes.
Better version detection for scipy
FIX: stupid mistake.
FIX Stupid mistake
More robust utils.fixes.
FIX: docstring.
FIX: np.unique.
Start 0.7 development cycle.
Add AUTHORS to web page.
Note on LinearSVC.
Web page layout.
FIX: update to latest API.
Web page update.
FIX tests when run with scikits.learn.test()
Update doc.
Update Mailmap.
Update authors list.
Update README.
Add all doc to generated latex.
Add species distribution modelling to OneClass examples.
Add other ways to contribute to the doc.
Little doc improvements to the grid_search.
DOC: remove duplicate information.
Remove unused imports
Add installation instructions for NetBSD.
Revert "Partial Least Square 2 blocks mode A (PLS) implementation"
Revert "PLS examples"
Revert "PLS mode A : two estimation algo: NIPALS & SVD"
Some docstrings added to ridge.
Rename lb -> label_binarizer.
Add note on multi-class classification.
Add some more doc to LabelBinarizer.
Some love for lars_path.
Turn off axis in plot_iris.
ENH: implement decision_function for libsvm-based classes.
DOC: svm.rst refactoring.
FIX: always raise ValueError on deficient input in BaseLibSVM.
FIX: fixes & tests for liblinear decision_function.
ENH decision_function liblinear, sparse variant.
FIX: fixes for liblinear decision_function.
Nicer support vectors in example plot_separating_hyperplane.py
PEP8 fixes.
Doctest fixes.
Remove obsolete info.
Squash function in test_svm.py
remove unused.
Add RandomizedPCA to RST docs.
PCA docstrings restructuring.
Do not resize the array on k=1.
ENH: Neighbors refactoring.
Add parameter eps to NeighborsBarycenter.predict.
FIX: fix dimensions in plot_neighbors_regression.
Simpler doctest for neighbors.
FIX: rename adjacency --> connectivity in kneighbors_graph.
Change the algorithm used in neighbors.barycenter.
small fix in barycenter
remove unused imports.
Rename barycenter --> barycenter_weights (as it was before).
Neighbors refactoring.
FIX: fix collinearity issues in least_angle.py
Regenerate Cython file _liblinear.pyx
Remove arbitrary code in tests.
Simpler check for orthogonality.
Add pls to __init__
DOC: set up barebones documentation for PLS.
FIX: do not resize array in knn_brute.
Faster Neighbors* in high dimensional spaces.
Use squared distances.
FIX: typos and missing info in docstring.
metrics.pairwise has right to live.
Rename inplace --> brute_inplace
ENH: better consistency tests for neighbors module.
FIX: typo.
FIX: don't import assert_allclose
So this is why people kept posting issues to SF's trac ...
Deleted code is debugged code.
Cosmetic changes in decision_function.
Rename strategy --> algorithm in Neighbors*.
Improve performance of GMM sampling
Second patch by f0k.
Cosmetic fixes in GMM.
More cosmetic changes in GMM.
Rename ndim --> n_dim
Rename nobs --> n_obs
Some more docstring fixes for mixture.
Examples cleanup: remove pl.close, it is now handled by gen_rst.
Changelog for 0.7
More doc on 0.7 release.
More on changelog.
Minor fixes in changelog.
Add metrics to the doc.
More fixes for the changelog.
Some more changelog stuff.
FIX: mxf --> Xinfan Meng
Documentation update.
Replace latex with simple syntax in docstrings.
Start of 0.8 development cycle.
Building on Windows.
Build precompiled windows binaries.
ENH: make transform() work when no Y is given.
Remain compatible with numpy 1.2
Do not import scipy.sparse globally.
Implement probability estimates for SVR and OneClass.
Raise NotImplementedError on predict_proba when model do not implement
Update numpy/scipy requirements.
Read README.rst for description in PYPI
DOC: clearer doc for BallTree.
DOC: docstring enhancements for Gaussian Naive Bayes.
DOC: some documentation for naive_bayes module.
Refactoring in svm module.
ENH: better doc and tests for unbalanced svm's
Python 3 compatibility.
Nicer low-level API for libsvm.
Ignore OSX .DS_Store files.
Revert "Python 3 compatibility."
FIX: rename eps to tol also in svm.sparse.
ENH: cython bindings for libsvm's cross_validation routine.
Revert "Python 3 compatibility."
FIX: rename eps to tol also in svm.sparse.
ENH: cython bindings for libsvm's cross_validation routine.
FIX: cross val return array size.
Initial implementation of cross validated SVC
Python 3 compat, this time with npy_3kcompat.h
Revert "Initial implementation of cross validated SVC"
Merge branch 'cython-balltree-wrapper' of https://github.com/thouis/scikit-learn
Cosmetic changes in base.py
FIX: py3k compat.
I won't import scipy.sparse globally.
Some cleaning in libsvm sparse bindings.
name consistency in sparse svm
ENH: low-level API of libsvm.
Cleanup in libsvm helper.
FIX: important fix for sparse SVC (weights were not initialized correctly).
Don't hardcode n_jobs.
Add regularization in the computation of barycenter weights.
Add regularization in the computation of barycenter weights.
libsvm low-level API refactoring.
PEP inquisition.
Some fixes for web layout.
Remove obsolete information.
More low-level refactoring.
Return first score in case of ties.
rename grid_points_scores_ to grid_scores_ in GridSearchCV
Some tests for the things I changed in GridSearchCV.
Merged pull request #135 from paolo-losi/l1_logreg_minC.
DOC: fix links to l1_min_c
FIX: reference to l1_min_c
Merge branch 'covariance' of git://github.com/VirgileFritsch/scikit-learn
Cosmetic changes in covariance.
DOC: add low-level methods from libsvm.
FIX: fix rename of grid_scores_
Do not open file write file until download is complete.
Add tests for libsvm.cross_validation.
Add optional parameter n_class to load_digits.
Merge pull request #144 from larsmans/balltree-cleanup.
FIX: missing import in plot_covariance_estimation.py
Py3K: use explicit floor division
Return also t from swiss_roll generator (needed to plot colors)
CSS style tweaks.
more CSS tweaks.
Some more CSS tweaks
Initial implementation of Locally Linear Embedding.
Re-generate .cpp from ball_tree.pyx
pep8 clean.
FIX: python2.5 SyntaxError
FIX: tuples have no .index in python2.5
FIX: more python2.5 SyntaxError
FIX: explicit linking against std++ breaks under mingw32.
FIX: fix import paths in doctests.
Merge pull request #157 from fabianp/joblib_fix
FIX: compatibility python2.5
DOC: add docstrings to BallTree.
Update neighbors with latest changes to BallTree.
Update .mailmap
Layout fixes.
Add analytics code to web page, SF discontinued web page stats.
Changelog
Some doctest fixes.
More docstring fixes.
FIX: change doctest to avoid results with NaN
I have no idea why, but this fixes the broken doctest.
Start of 0.9 development cycle
Welcome Lars & Edouard.
FIX: pls docstring.
DOC: added section on complexity for LLE.
Rename embdding_vectors_ --> embedding_
Add submodule for manifold.
Cosmetic changes.
Merge pull request #3 from GaelVaroquaux/manifold
More on practical tips.
Typo
FIX: bad import
Move cache_size out of model parameters.
Cosmetic changes in the docs.
Docstring for test.
Test for non-contiguous input for svms
Implement predict_proba for sparse svms.
FIX: doctests in svm doc
ENH: support instance of BallTree as input to kneighbors_graph.
Merge branch 'master' of github.com:scikit-learn/scikit-learn into manifold
Implement transform method in LLE.
FIX: fix test.
more fixes.
FIX: fix segfault in cases of infeasible nu (NuSVM)
FIX: transform method.
Merge pull request #153 from fabianp/manifold
FIX: use NeighborsClassifier in test.
FIX: some bugs in locally_linear_embedding.
DOC: remove obsolete information in neighbors.rst
Add max_iter to LARS.
DOC: fix errors in manifold doc + style tweaks.
Explicit cmap in swissroll example.
Add test and cleanup for 2c1c88
Test: test for unnormalized predictors.
Add failing test.
DOC: add reference to FastICA from the ICA docs.
DOC: add fit_intercept to LinearSVC docstring.
Refactoring in ridge.py
Rename of cg -> dense_cg and 'default'-> 'dense_cholesky'.
Some docstring updates.
Move scipy_future into utils.arpack
Add Jake to the manifold credits.
Merge pull request #222 from jakevdp/balltree-doc
Explicit cmap for plot_compare_methods.
Cosmetic cleanup.
FIX: bad logic in Pipeline.
Revert "FIX: bad logic in Pipeline."
Refactoring in libsvm bindings.
FIX: fix bug in LLE with dense solver
Update ARPACK from scipy.
Backward compatibility fixes for testing LLE.
FIX: arpack doctest
comment LLE arpack test
Protect against MemoryError in libsvm.fit
FIX: doctest Ridge.
FIX: add newline after autosummary:: sphinx directive.
Layout & consistency fixes linear models documentation.
cosmetic linear_model.rst
FIX doc linear_model.rst
Layout tweaks.
DOC: new example for Ridge + more rst docs
Merge pull request #236 from JeanKossaifi/sparse_matrix_type
Don't use np.atleast_2d when interfacing with native code.
Some documentation for hmm module, and a warning.
Revert "pyflakes warnings"
Covariance with residual at the end for path is zero.
FIX: LARS doctest in linear_model.rst
Update rsync command
Merge branch 'variational-infinite-gmm' of https://github.com/GaelVaroquaux/scikit-learn
Replace logsum by np.logaddexp in hmm, tweaked some tests.
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Merge pull request #218 from fabianp/fix_lars
Rename n_states --> n_components in mixture & hmm + cosmetic changes.
FIX: support numpy < 1.3
Merge pull request #280 from vene/lars_n_features
Remove max_features keyword from lars_path.
Use default value for n_nonzero_coefs
Remove hardcoded n_jobs from examples.
Revert "Remove hardcoded n_jobs from examples."
Don't use n_jobs=-1 in the examples.
Refactor tests for SVR.
Correct NuSVR impl in the sparse case.
Add tests for last commit.
Remove fit params from all objects.
Merge pull request #316 from jakevdp/cython-ball-tree
Compatibility fixes for Python 2.6 and doctest fixes
FIX: py3k compatibility.
FIX: py3k compat.
Merge pull request #326 from bdholt1/fix/svm
Welcome Brian Holt
FIX: broken example
Generate thumbnails in the example gallery
Link images to example file in new gallery
FIX some broken examples.
Rename face_recognition so that the result is plotted.
Revert "Rename face_recognition so that the result is plotted."
FIX: linnerud dataset mixed variables.
Layout tweaks
Cosmetic changes in example docstring
layout tweaks
Move project directory from scikits.learn to sklearn
Add a compatibility layer for some modules.
Forgot to add a blank image for the docs.
Revert "Add a compatibility layer for some modules."
Revert "Move project directory from scikits.learn to sklearn"
Move project directory from scikits.learn to sklearn
Add a compatibility layer for some modules.
correct imports
Merge pull request #335 from fabianp/rename
Add more modules for compatibility layer.
More renaming.
scipy.lena() no longer works on scipy's dev version.
FIX: fix variable referenced before assignment in libsvm.pyx
Do not import mixture from top-level sklearn.
DOC: add parameter C to docstring.
Use LinearSVC's docstring instead of outdated one.
scipy.lena has moved to scipy.misc.lena in scipy's dev version.
Use 1 / n_features as value for gamma.
FIX broken tests by last commit.
Add changelog for changing gamma parameter.
FIX example logistic regression.
Move matrix factorization to work in progress.
Initial changelog -- to be completed.
More changelog and .mailmap
Why, emacs, why ??
Update changelog
DOC: broken link to example
FIX: add test fixtures to distribution.
FIX: broken link to example
DOC: always generate pages for autosummary.
FIX: some sklearn.test() fixes.
Add Vlad as GSOCer
Complete Changelog.
FIX: import path under scipy's dev version.
Comment tests that depend on PIL.
Comment out tests for the current release.
FIX typo
FIX: docstring for RadiusNeighborsRegressor
sklearn.test() does not like doctest that don't print.
Doc: Print --> Issue
Safer assert_all_finite.
Some more doctest fixes for sklearn.test()
Update committers list
Start of 0.10 development cycle.
Some Python 2.5 fixes.
More python2.5 fixes
FIX: assign NaN to an integer array has no effect on old numpy
Some more changelog stuff.
Update MANIFEST.in: scikit-learn --> sklearn
Add mldata loader and olivetti dataset to changelog.
Faster tests for coordinate_descent.
Add changelog entry.
Merge pull request #375 from VirgileFritsch/mcd
Merge pull request #383 from bdholt1/svm-mem-leak2
Add Brian's name to the Changelog.
FIX: keywords {precompute, Xy} were implemented and documented but unused ...
Cosmetic changes in LARS
FIX: Py3k compatibility.
Delete benchmarks/bench_svm.py
Delete benchmarks/bench_neighbors.py
MISC: More meaningful names for lapack functions in least_angle.
Removed unused parameters in least_angle
Convert to scipy doc convention + add missing options
FIX: array2d did not return contiguous arrays with order='C' ...
FIX: do not use reshape in libsvm sparse bindings.
Use centralized directory for generated files.
Description for logo: font, color, etc.
DOC: Move practical info into its section and delete duplicates.
Style: webpage tweaks
Style update in documentation.
Doc: minor fixes
Minor update and fixes to linear_model documentation
Minor update and fixes to linear_model documentation
Move implementation details into RST doc.
Docstring conventions.
DOC: rename n -> p
Web page layout tweaks.
Small comment on the dual parameter
Use M.dot instead of np.dot on sparse matrices
FIX: LLE mode='auto' for small matrices and tuples.
FIX: use .toarray() instead of .todense()
COSMETIC: more readable syntax for mult. of sparse matrices.
Merge pull request #466 from amueller/svm_iris_example
Remove useless benchmark.
FIX: broken benchmark
Move uninteresting example to docstring
FIX docstring
Merge pull request #456 from vene/sparse-coder
Remove duplicate definition in RST
Replace unmaintainable test
More robust test for lars_path
Typo in example. Thanks Virgile for the cool example.
Revert code that I erroneously changed
Remove old API change warning
Merge pull request #504 from jakevdp/sphinx-images
FIX: docstring
DOC: example for sklearn.test()
FIX: convert lena to float32 (originally it's ints)
FIX: doctest
Still some tweaks for the sklearn.test() example
Remove pylab code from docstring and +SKIP those that require PIL
FIX: explicit conversion to float64 in ElasticNet
FIX: bug in elasticnet with precompute not being updated correctly.
DOC: complete docstring for regression score function
DOC: restructure docstring of ElasticNet.
Changelog
Start of 0.11 development cycle.
Mailmap alias
And the winner is ...
DOC: links for people that have webpage.
DOC: some documentation fixes.
DOC: docstring update for dump_svmlight_file
Refactor in KFold.
Set the download link to PYPI.
FIX: bug in DenseBaseLibSVM when subclasses implement new params
FIX: inheritance in DenseBaseSVM
Add Satra to the AUTHORS list.
WEB: update the designer's URL
FIX: latex underscore
Explicit cmap for background.
Some doc for the example "Lasso path using LARS"
Some documentation for example plot_ridge_path
BUILD: add gemv cblas routine
BUILD: add dger cblas function
Update README.rst
Merge pull request #1078 from buguen/docs
Print running time as a floating-point number with two decimals.
Merge pull request #1138 from fabianp/doc_float
Robustify LARS. Fixes issue #487
New (faster) implementation of isotonic regression
ENH Improve Ridge's conjugate gradient descent
Added the paper I used to implement isotonic_regression.
Add support for preference constraints in svmlight format.
FIX: query_id parameter and other cosmetic changes
Add test for load_svmlight_files
Merge pull request #1182 from fabianp/svmlight
FIX: typo in ValueError message.
Add support for query_id in dump_svmlight_file
DOC: added svmlight qid support to whats_new.rst
Python3 compat: print()
ENH: Consider order in X for IsotonicRegression.
Better tests + cosmetic changes.
Store X as an ordered array.
Clarify docstring in lars_path
Update LIBSVM_CHANGES
Add SVD-based solver to ridge regression.
Remove unnecessary code in ridge svd
BUG: solver was not passed to computational method in Ridge object
Use Cholesky solver by default, but use SVD as fallback
Use ValueError for non-existent solvers
Merge pull request #1914 from fabianp/ridge_svd
Test for singular matrices in Ridge regression
Fix broken link to web designer
Fix broken link to web designer
Fazlul Shahriar (1):
DOC fix docstring typos in cluster/mean_shift_
Federico Vaggi (5):
Added test_regressor_pickle to tests.
Added test_classifiers_pickle to tests.
Finished adding pickle tests.
Removed the use of StringIO, using pickle.dumps instead.
cosmetic: Changed all instances of nonlinear to non-linear
Felix Brockherde (1):
FIX scores calculation in ovo multiclass
Feth Arezki (1):
lfw: import imread from new location in scipy
Florian Hoenig (3):
added test that fails because Scaler.fit changes a sparse input vector when Scaler is initialized with copy=False
removed bug in Scaler.fit
improved test_scaler_without_copy
Francois Savard (2):
Fixed docstring for C param in BaseLibLinear/SVM subclasses.
Added version info to deprecation warning
Félix-Antoine Fortin (2):
Modified package name in Easy Install section.
DOC/FIX affinity_propagation damping default value.
Gael Varoquaux (1272):
MISC: Make sure that the tests pass on numpy 1.2
MISC: Cosmit + replace some global seeds with RandomState
MISC: Rename to let the underscore RULE!
Merge branch 'master' of ssh://scikit-learn.git.sourceforge.net/gitroot/scikit-learn/scikit-learn
API: Create a base estimator class.
ENH: improve base class
ENH: Temporarily remove the typing for the base_estimator
TEST/BUG: Test the BaseEstimator class and fix the repr
Merge branch 'master' of http://github.com/agramfort/scikit-learn
Cosmit
MISC: Remove the #$! import *
ENH: convert all GLM estimators to the BaseEstimator class
BUG: Fix the OLS regression
BUG: Fix constructors with arguments.
BUG: Syntax error
BUG: str of linear models now working.
API: Change the type of params: turn this into a frozenset: immutable and
Merge branch 'master' of http://github.com/agramfort/scikit-learn
ENH: Change _params to frozenset
API: Change argument controlling whether intercept should be fitted
Cosmit in tests
Merge branch 'master' of http://github.com/agramfort/scikit-learn
API: Change the BaseEstimator and parameter signature logic.
Cosmit
Cosmit
ENH: Convert LDA and clustering to use the new BaseEstimator
MISC: Change the title of the documentation.
Cosmit
ENH: Make the clustering more usable
ENH: Add an example of playing with the stock market
ENH: Make SVMs fit to the BaseEstimator API.
ENH: Make SVMs fit to the BaseEstimator API.
Cosmit
MISC: Put the nearest neighbors estimator to the BaseEstimator
MISC: rename base_estimator.py to base.py
BUG: Make sure that the docs still build with recent versions of numpy
BUG: Make sure the docs still build with recent versions of numpy
Merge branch 'master' of ssh://scikit-learn.git.sourceforge.net/gitroot/scikit-learn/scikit-learn
BUG: Adapt the sparse SVM to the rename of base_estimator.
MISC: Remove warning when compiling docs.
MISC: Adding titles to examples.
DOC: Document best practices/coding guidelines to make it easier for
Cosmit
MISC: Put the nearest neighbors estimator to the BaseEstimator
MISC: rename base_estimator.py to base.py
BUG: Make sure that the docs still build with recent versions of numpy
BUG: Make sure the docs still build with recent versions of numpy
BUG: Adapt the sparse SVM to the rename of base_estimator.
MISC: Remove warning when compiling docs.
MISC: Adding titles to examples.
DOC: Document best practices/coding guidelines to make it easier for
Cosmit
ENH: Make the grid_search take instances of estimators rather than
Add a setup.cfg to specify default nosetests behavior.
MISC: 80 character bordel!
Add a setup.cfg to specify default nosetests behavior.
MISC: 80 character bordel!
Cosmit: rename grid to iter_grid
Merge branch 'master' of github.com:GaelVaroquaux/scikit-learn
DOC: Beautify example
Merge branch 'master' of github.com:GaelVaroquaux/scikit-learn
MISC: Beautify examples.
Cosmit
Merge branch 'master' of ssh://scikit-learn.git.sourceforge.net/gitroot/scikit-learn/scikit-learn
Merge branch 'master' of http://github.com/vmichel/scikit-learn
ENH: rework univariate selection to reach a compromise between ease of
ENH: First go at a help for cross-validated evaluation of a score.
Merge branch 'master' of ssh://scikit-learn.git.sourceforge.net/gitroot/scikit-learn/scikit-learn
Merge branch 'master' of http://github.com/agramfort/scikit-learn
ENH: Add an example showing the dependency of SVC+Anova on the number of
ENH: Add joblib as a bundle dependency.
BUG: Fix doctests in pipeline.py
BUG: Fix doctest in GMM
ENH: Add script to update joblib dependency
Cosmit: rename MixinClassif to ClassifMixin
ENH: Make sure that in cross_val_scores the StratifiedKFold is used only
Merge branch 'master' of github.com:GaelVaroquaux/scikit-learn
MISC: Small change to contribution guidelines, suggested by Mathieu
COSMIT: For the sake of underscores
MISC: comment
ENH: Add parallel computing for cross validation.
BUG: Get the BaseEstimator to work even if there is not __init__
ENH: Make the parallel cross validation more efficient.
BUG: Fix cross_val_scores for unsupervised problems.
ENH: Improve cross_val in parallel
Merge branch 'cross_val_gael' of git at github.com:GaelVaroquaux/scikit-learn
Misc
ENH: Improve the repr for the BaseEstimator
Merge branch 'gael' of http://github.com/agramfort/scikit-learn
BUG: Fix a bug preventing from LinearModelCV to print.
Misc
Merge branch 'master' of ssh://scikit-learn.git.sourceforge.net/gitroot/scikit-learn/scikit-learn
BUG: Fix forgotten import in example
MISC: Remove pointless ellipsis directive (doctest)
Merge branch 'master' of ssh://scikit-learn.git.sourceforge.net/gitroot/scikit-learn/scikit-learn
BUG: Fix lda and qda on 64 bits.
Merge branch 'master' of git at github.com:scikit-learn/scikit-learn
Merge branch 'master' of ssh://gvaroquaux@scikit-learn.git.sourceforge.net/gitroot/scikit-learn/scikit-learn
TEST: Make sure that doctests for bundled dependencies pass.
Merge branch 'master' of git at github.com:scikit-learn/scikit-learn
BUG: Make sure that joblib does get installed.
ENH: Make sure tests get installed.
ENH: Improve the repr for the BaseEstimator
TEST: Make sure that doctests for bundled dependencies pass.
BUG: Make sure that joblib does get installed.
ENH: Improve the repr for the BaseEstimator
TEST: Re-enable external tests.
BUG: Fix doctests to account for change in BaseEstimator repr
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Merge branch 'master' of git at github.com:scikit-learn/scikit-learn
Merge branch 'cross_val_gael' of git at github.com:GaelVaroquaux/scikit-learn into cross_val_gael
MISC: Update joblib to 0.4.4
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Merge branch 'master' of git at github.com:scikit-learn/scikit-learn
Merge branch 'master' of git at github.com:scikit-learn/scikit-learn
ENH: Make sure that the QDA inherits from the ClassifierMixin
Merge branch 'cross_val_gael' of github.com:GaelVaroquaux/scikit-learn into cross_val_gael
ENH: Make sure that the QDA inherits from the ClassifierMixin
ENH: Make sure that the logistic regression does inherit from
ENH: Add image to graph feature-extraction helper, and some basic graph
ENH: Make sure that the logistic regression does inherit from
ENH: Add some code to compute a graph Laplacian on sparse and non sparse
Merge branch 'master' of github.com:scikit-learn/scikit-learn into cross_val_gael
BUG: Temporary fix for 'array does not own memory' in SVM
ENH: First implementation of spectral clustering.
BUG: Temporary fix for 'array does not own memory' in SVM
Merge branch 'master' of github.com:scikit-learn/scikit-learn
ENH: Clean up the image clustering code and add an example on lena.
DOC: Add documentation for spectral clustering.
ENH: Add an estimator object for the spectral clustering.
ENH: Add k-means cluster with clever initialization.
ENH: K-means algorithm with good initialization, more
MISC: Restore an example that is now working again.
API: Change 'clustering' to 'cluster'
Merge branch 'master' of git at github.com:GaelVaroquaux/scikit-learn
Merge branch 'master' of git at github.com:scikit-learn/scikit-learn
BUG: Add a forgotten setup.py line
ENH: Backport a fast graph connect component algorithm from scipy.
MISC: Cosmit based on comments from Olivier and Alex
DOC: Add some notes on complexity of clustering algorithms.
Merge branch 'master' of git at github.com:scikit-learn/scikit-learn
BUG: Adding missing setup.py file.
BUG: Remove UTF8 character checked in by mistake.
BUG: Make graph laplacian and spectral clustering work in 64 bits.
ENH: For numpy >= 1.5, use np.linalg.slogdet as a fast_logdet
BUG: Fix bug with numpy >= 1.5 introduced by my previous (stupid) commit.
ENH: Remove 'import *' in glm/__init__
BUG: Fix tests broken by last commit
ENH: Make spectral clustering tests more robust
Cosmit (PEP 8)
BUG: Fix glm/setup.py so that the glm sub package installs right.
TEST: make the test location consistent.
API: Add cluster as an import of the main __init__
BUG: Fix warnings module not imported in coordinate_descent. Thanks to
Merge branch 'master' of github.com:scikit-learn/scikit-learn
MISC: Move AUTHORS to AUTHORS.rst so that it displays better on github
BUG: Make sure computations do not get executed at import time, so that
Cosmit
BUG: Remove failing doctest.
MISC: More tests and more docs for preprocessing.
BUG: Cater for NaNs in SelectPercentile.
Cosmit: 2 lines between function definitions
ENH: Make sure that the cross_val_score uses StratifiedKFold for
ENH: GridSearchCV: add an 'iid=True' and open the option to optimize
MISC: 3-Fold cross-val by default
ENH: Make sure that grid_search uses a StratifiedKFold by default on
BUG: Fix doctest
DOC: better example for SVM-Anova
ENH: Make sure docs build on older versions of sphinx
Merge branch 'master' of git@github.com:scikit-learn/scikit-learn
DOC: Prettify
DOC: Make first page more compact.
DOC: Update the developer guidelines.
MISC: Tweak front page
Merge branch 'master' of git@github.com:scikit-learn/scikit-learn
MISC: Cosmit on PCA tests to get understandable errors from the buildbot.
MISC: Delayed import of pylab, to work on the buildbot
MISC: Relative imports
BUG: Fix tests to be moroe robust
Cosmit
Merge branch 'master' of github.com:scikit-learn/scikit-learn
MISC: Add a warning in the spectral clustering if pymag is not present
Revert "MISC: Add a warning in the spectral clustering if pymag is not present"
Revert "Revert "MISC: Add a warning in the spectral clustering if pymag is not present""
BUG: Import stats explicitely to work with scipy > 0.7
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Fix typo on David's name
Merge branch 'master' of github.com:scikit-learn/scikit-learn
ENH: Add _set_params/_get_params in Pipeline
Merge branch 'master' of github.com:GaelVaroquaux/scikit-learn
ENH: Implement _reinit on Pipelines.
ENH: Use new pipeline framework in SVN-ANOVA
BUG: Import stats explicitely to work with scipy > 0.7
ENH: Finish the __repr__ for the pipeline
ENH: Add error management to the KFolds
ENH: Port the nipy k-means with some cleanups and enhancements.
ENH: Change the initialisation heuristic for k-means: in general random
BUG: Adapt spectral to new k_means API
Merge branch 'master' of github.com:scikit-learn/scikit-learn into kmeans
ENH: Add error management to the KFolds
Cosmit
Cosmit
Merge branch 'master' of github.com:scikit-learn/scikit-learn into pipeline
ENH: Change pipelines so that they are simpler and address subobjects
BUG: Fix bug introduced by previous commit
BUG: Fix doctests.
DOC: Better docstring.
ENH: Make setting nested parameters on Pipeline really work.
Cosmit
TEST: Add a smoke test for cross_val_score
MISC: Fix spelling.
Merge branch 'master' of github.com:scikit-learn/scikit-learn
DOC: Add more links to the index
DOC: Add forgotten link targets.
DOC: prettify the nearest neighbors docs
DOC: Prettify the SVM docs.
MISC: Add 'Python' in the examples page
DOC: fix the PCA iris example.
MISC: Remove non-necessary lines from PCA example
DOC: Prettify the clustering documentation.
DOC: Fix the reference classes documentation
DOC: Prettify the GLM docs
MISC: Quiet down the tests.
Merge branch 'master' of github.com:scikit-learn/scikit-learn
ENH: Add a tester to the scikit.
ENH: Make doctests pass with numpy tester.
MISC: Make sure tests always run.
MISC/DOC: fix reference
TEST: Fix test_pca sign error.
TEST: Fix whitespace in doctests.
Merge branch 'master' of git@github.com:scikit-learn/scikit-learn
Merge branch 'master' of git@github.com:scikit-learn/scikit-learn
ENH: GridsearchCV, Pipelines and cross validation
ENH: Make sure that Pipeline and GridSearch objects are indeed recognized
ENH: Make sure clone works on pipelines
ENH: Implement a score for the GridSearch.
ENH: Make sure that a GridSearchCV has a score
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Merge branch 'cross_val' of github.com:GaelVaroquaux/scikit-learn
Merge branch 'master' of http://github.com/fabianp/scikit-learn
ENH: Small optimization to BaseEstimator
BUG: Make sure that grid_search works with sparse data.
MISC: Cosmit in new GMM classifier example
DOC: Make the plot_ica_vs_pca example richer.
MISC: Some tweeks to the layout so that the docs display better on a
DOC: Fix title level in install
DOC: make the index page content clearer
MISC: Explicit acronym
MISC: PEP8 in docs
DOC: Change the titles' layout
DOC: Rewamp the tables of contents and corresponding layout
Merge branch 'master' of github.com:scikit-learn/scikit-learn
DOC: Make sure the docstring of pca render well
DOC: Remove empty section
DOC: Add documentation for ICA/PCA
DOC: Remove useless tables of contents
DOC: work on the clustering documentation
Merge branch 'master' of github.com:scikit-learn/scikit-learn
DOC: Tweak in the clustering docs.
DOC: document with more details the GMM module.
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Merge branch 'master' of github.com:scikit-learn/scikit-learn
DOC: make the neighbors doc sexier
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Cosmit: explicit what OVA means as much as possible.
Cosmit
MISC: Recover changes overidden by manual merge.
BUG: Fix metrics to run on 2.5
ENH: Cosmetic improvements to the face example
MISC: Cosmit+Doc in fast truncated PCA
MISC: Remove redundant code and cosmit
Merge branch 'master' of github.com:scikit-learn/scikit-learn
ENH: Update embedded joblib to 0.4.6
ENH: import symbols on subpackage's __init__
API: Return self in _set_params
BUG: svm_gui: C is not defined for OC-SVM
ENH: Raise error when cloning bug estimators
BUG: Deal with 1D data in preprocessings.
BUG: fix cross_val and GridSearch in unsupervised.
BUG: Fix GridSearch in unsupervised
BUG: Fix the doc-generation of examples
Cosmit
MISC: Fix example
DOC: minor changes in gaussian_process docs
BUG: Fix missing gaussian_process subpackage in setup.py
FIX more missing files in setup.py
API: Remove long-depreciated function
BUG: FIx doctests broken in previous commit
DOC: documentation CD Enet fit parameters
DOC: Cosmit in docs
DOC: score is reserved to 'better is higher'
DOC: Better plotting in RFE example
ENH: Small tweak in BaseEstimator repr
ENH: Add control of the dtype in img_to_graph
ENH: dtype is img_to_graph defaults to input dtype
DOC: Add scipy in the install dependencies.
DOC: Typo in docstring
DOC: document better similarity matrix of spectral clustering
DOC: typos in docstring
ENH: Reorganise the feature agglomeration
ENH: Accept strings as memory
DOC: Add the logistic regression to linear models doc
DOC: Be explicite about what criteria are used in GridSearchCV
ENH: Add inverse transform to univariate_selection
MISC: Make sure that nosetests doesn't try to run the bench
ENH: Add a benchmark for ward
API: fit params -> class params in GrideSearchCV
MISC: Docstring formating
ENH: Tweaks for k_means performance.
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Cosmit
ENH: n_leaves = n_samples in ward tree
MISC: np.zeros -> np.empty
ENH: Avoid big temporaries in hierarchical
MISC: Cleanup
ENH: Hierarchial: don't compute moments twice
ENH: hierarchical: gain memory with izip
ENH: hierarchical: simpler, faster without connectivity
MISC: labels in cluster -> int
MISC: Fix ward bench vs scipy
MISC: Avoid depending on numpy > 1.4
BUG: Add missing import
MISC: Less code duplication in lfw
Cosmit
API: SVMs: eps -> tol
MISC: Fix example to adjust to eps -> tol
ENH: Fixed seed for shuffling in SGD
BUG: fix grid_to_graph
MISC: cosmit + use private prng
MISC: fix typo
Merge remote branch 'vincentschut/master'
BUG: Fix bug introduced by PLS
DOC: Minor fixes to documentation
BUG: fix kneighbors method in high dim
DOC: improve PLS docs and example
MISC: Update joblib
ENH: Add verbosity to the gird_search
ENH: More parallelism in GridSearchCV
ENH: GridSearCV: better verbose
TEST: Fix trivail doctest failure
BUG: iter on complete grid (GridSearchCV)
MISC: html-nodoc default target
Merge remote branch 'origin'
Merge branch 'master' of https://github.com/yml/scikit-learn
TEST: Ellipsis on numericaly instable docs
Cosmit
BUG: doctest the joblib in externals not global
BUG: restore ellipsis in doctests
DOC: add the show-source back on html
BUG: fix multiple figure plotting
BUG: restore ellipsis in doctests
Merge branch 'master' of github.com:scikit-learn/scikit-learn
DOC: update rst docs to use multiple figures
DOC: front page link: Ward lena
DOC: better docs for Ward
ENH: Avoid gathering old images in docs
MISC: reduce disk consumption when generating docs
COSMIT: make the layout a bit cleaner in NMF docs
BUG: giving up on cleaning build
DOC: Better center of images
MISC: Two figures in plot_pca_vs_lda
DOC: move working notes to wiki
DOC: tweak sidebar
DOC: More sidebar tweaks
DOC: fix link to bug tracker
DOC: tweaks to developers notes
DOC: move KMeans to top of clustering
DOC: Less warnings during build
DOC: Fix more warnings
BUG: fix links to examples
MISC: cleaner generated code in doc/examples
MISC: separate decomposition examples to new dir
DOC: rmk on Sphinx version
DOC: add tiny docstrings where missing
DOC: Fix warnings
DOC: module entries in reference documentation
DOC: fix indentation
DOC: link fixes in kernel PCA
DOC: move working notes to wiki
DOC: tweak sidebar
DOC: More sidebar tweaks
DOC: fix link to bug tracker
DOC: tweaks to developers notes
DOC: move KMeans to top of clustering
DOC: Less warnings during build
DOC: Fix more warnings
BUG: fix links to examples
MISC: cleaner generated code in doc/examples
MISC: separate decomposition examples to new dir
DOC: rmk on Sphinx version
DOC: add tiny docstrings where missing
DOC: Fix warnings
DOC: module entries in reference documentation
DOC: fix indentation
DOC: link fixes in kernel PCA
DOC: Make sure that mpl is not interactive
DOC: Tweak DPGMM docs
COSMIT: lognormalize->log_normalize
COSMIT: avoid 'import as'
COSMITs
COSMIT: avoid one-liner
COSMIT: Local imports last
Revert "COSMIT: avoid one-liner"
MISC: move mixture's test to sub directory
COSMIT
COSMIT
COSMIT
ENH: Vectorize bound computing
ENH: Speed up _bound_z in DPGMM
COSMIT: DPGMM: Move bound computing to functions
ENH: Speed improvements in DPGMM
DOC: Improve the GMM vs DPGMM example
BUG: Fix bug introduced by moving test_mixture
DOC: Fix layout
DOC: Tweaks in mixture
BUG: fix testing (heisen) bugs in hmm
TEST: Heisen-bug fixing
ENH: Update joblib
Merged pull request #140 from fabianp/lfw.
MISC: Prettify GMM example
COSMIT: Pep 8 and remove useless imports
ENH: avoid useless computation and warnings
Merge pull request #148 from yarikoptic/master.
DOC: Fix layout
DOC: Fix layout
DOC: improve datasets information
MISC: links to upstream datasets
BUG: Make SVMs work on non contiguous arrays
COSMIT: Better fix for continuity in SVMs.
MISC: Recythonize the ball_tree
DOC: Fix links in covariance
MISC: Remove unused imports
ENH: Make LLE work with older PyAMG
MISC: Prettify the swiss roll example
COSMIT: Remove unused import
DOC: Prettify MiniBatchKmean example
BUG: Fix pyflakes warning in k_means
BUG: params not applied in MiniBatchKMeans
ENH: MiniBatchKMeans: avoid useless computation
BUG: MiniBatchKMeans: Error in stopping criteria
MISC: Cosmit in k_means_
DOC: cosmit in MiniBatchKMeans docs
TEST: Control seed in fastica tests
ENH: Capture different data length in grid_search
BUG: Avoid NaNs in lars_path
ENH: add pre_dispatch to GridSearchCV
ENH: Catter for lists in grid_search
Merge pull request #185 from amueller/master
BUG: Minor bugs in cross_val
Merge pull request #203 from amueller/docs_fix_again
Python2.5 compatibility
BUG: explicit imports in doctests
ENH: Lasso and LassoCV: fit params -> class params
DOC: Tweak the cross-val lasso path example
ENH: LARS: fit_params -> class params
COSMIT: Minor refactor in lars_path
ENH: Add a ShuffleSplit cross-validation iterator
BUG: Fix bug introduced in 83cf11c
Merge pull request #216 from ametaireau/master
BUG: change alpha scaling in LassoLARS
BUG: LassoLARS: X: modified during the normalization
BUG: LassoLARS didn't renormalize the coefs
ENH: Update joblib to 0.5.2
ENH: Update joblib
MISC: minor cleanups
ENH: Add small info on diabetes
COSMIT: Simplify the lena ward example
Cosmit
Merge pull request #247 from NelleV/FIX_doc
Cosmit
ENH: l1_distance: gaussian_process -> metrics
Doc: fix minor error in docstring
DOC: sparse_pca: put maths at the end
DOC on sparse_pca
DOC: Add l1_distances to classes.rst
TEST: faster tests, and more coverage
Merge pull request #248 from dwf/misc_fixes
BUG: Fix gmm bug + test failures
FIX/ENH: numerical stability in GMM
Comsit: PEP8
Cosmit: remove useless comments
MISC: Restructure compound decomposition example
TEST: SparsePCA: testing fit_transform useless
TEST: testing HMM more robust
Merge pull request #212 from vene/sparsepca
ENH: mixture: better numerical stability
Cosmit: Fix (some) pyflakes warnings
ENH: olivetti faces: control RNG in shuffle
DOC: Add a descr to olivetti_faces
DOC: fix some formating issues
DOC: Fix layout
DOC: fix layout
DOC: fix layout
Add forgotten 'install' for mixture.
BUG: Fix clone for ndarrays and sparse matrices
BUG: Fix clone for nadrrays and sparse matrices
Removing unused code
ENH: Avoid np.logaddexp.reduce
DOC: more precisions in univariate_selection
Merge pull request #266 from glouppe/master
BUG: fix dotests
DOC: stress that only chi2 works with sparse
COSMIT: remove unused import
DOC: Improve the Bayesian regression docs
Typo
Sorry, other typo
Merge pull request #279 from JeanKossaifi/master
ENH: Add a subset="all" to 20news
API: load_20newsgroups is depreciated
Cosmit
API+ENH: load data by default in mlcomp and 20news
ENH: compression in 20newsgroup caching
DOC: leftover false info in docstrings
DOC: load_filenames -> load_files
DOC: Link the Olivetti docs in the main docs
DOC: more explicit docs on alpha/rho in elasticnet
ENH: cv objects created by a helper function
COSMIT: fix doc indentation to PEP8
BUG+COSMIT: rewamp the lasso path examples
ENH: Add a LassoCV using LARS
COSMIT: Nobody expects the PEP8 inquisition
API: add import paths for LarsCV and LassoLarsCV
MISC: Follow changes to alpha scaling
ENH: Add normalization of X to LarsCV
BUG: Propagate fix 086b58f5 to LassoLarsCV
DOC: LARS docstring
BUG: Avoid div by 0 in lars_path_residues
ENH: Expose eps in LARS
DOC: Tweak the bayesian ridge docs
DOC+TEST: LarsCV
TEST: Improve test coverage of LarsCV
DOC: document eps in least_angle better
MISC: LarsCV: preallocate mse_path
ENH: use _check_cv in LassoLarsCV
DOC; fix documentation
MISC: mse_path in LassoLarsCV is now the mean
DOC: add example comparing LassoCV and LassoLarsCV
DOC: typos
API: _check_cv -> check_cv
Merge remote branch 'jakevdp/kernelpca-arpack'
TEST: Robustify LLE tests
BUG: Fix a bug introduced in rebasing
BUG: normalize before center in lars_path_residue
DOC: cosmetic changes to lars-bic doc and examples
DOC: make lasso docs easier to read
COSMIT: remove unused import
BUG: make lobpcg work with non-sparse matrices
COSMIT: tweak plot_compare_methods example layout
COSMIT: print time in plot_lle_digits example
MISC: fix image in manifold doc
MISC: prettify the faces example
COSMIT: doc and examples in decomposition
Merge pull request #314 from emmanuelle/spectral
ENH: More interesting benchmarks for OMP
API: eps -> tol in bayes
Merge pull request #317 from agramfort/normalize_data
Merge pull request #318 from JeanKossaifi/master
DOC: change the name scikits.learn to scikit-learn
Merge pull request #331 from JeanKossaifi/master
DOC: Fix doctest
DOC: scikits.learn -> scikit-learn
DOC: fix link
DOC: scikits.learn -> sklearn
DOC: Minor scikits -> scikit
BUG: sklearn/setup.py : learn -> sklearn
BUG: Backward compatibility layer sklearn.externals
ENH: Add verbosity control to LinearModelCV
BUG: scikits.learn -> sklearn: backward compatibility
COSMIT: PEP08
Unused import
BUG: backward compat: scikits.learn -> sklearn
ENH: add control of n_init in spectral clustering
BUG: scikits.learn -> sklearn backward compat
DOC: larger lena size in denoising example
Cosmit: make in-place modifications explicit
DOC: update whats_new.rst
BUG: ShuffleSplit: repr for random_state not number
DOC: formatting examples as a topic
ENH: GridSearchCV can has predict_proba
FIX bug introduced in 68e6544
Remove BaseLibLinear.predict_proba not implemented
DOC: Install.rst wrong packaging info
COSMIT
scikits.learn -> scikit-learn in README
`scikits.learn` in the README, to catch google
DOC: fix rst
TEST: skip unreliable doctest
DOC: minor doc ENH for trees
COSMIT: tree code simplification
COSMIT: np.random should never be called
COSMIT: no seeding of the global RNG
ENH: move parameter checking to fit
COSMIT: y is a vector, not a matrix
Cosmit, PEP8
DOC: doc and example cosmetics for trees
DOC: improve spectral clustering docs
API: spectral clustering uses arpack by default
DOC: proper docstring for load_sample_image
API: default in spectral clustering: auto
ENH: add doc target to Makefile
Merge branch 'master' into tree
Minor cosmit
DOC: use random_state in KMeans
DOC: improve silhouette coefficient docs
MISC: better check_build error reporting
PEP08 names in graph_shortest_path
COSMIT
TEST: simplify test case
SPEED tree: 2X in Gini criteria
MISC: mk roc_curve work on lists
MISC: __version__ in scikits.learn
DOC: add IterGrid in reference
COSMIT: no import as
MISC: Warn for integers in scaling/normalize
MISC: better warning message
COSMIT: never use np.linalg, but scipy.linalg
BUG: ProbabilisticPCA.score work with pipeline
MISC: remove links to sourceforge URL
DOC: fix links in mixture
MISC: add citation information
BUG: vectorizer.inverse_transform on arrays
DOC: pdf compilation
ENH: Easier debugging in check_build
ENH check_build: better error msg for local imports
DOC: turn off generation of index pages
ENH: Capture stdout in executed examples
COSMIT: layout in plot_kmeans_digits example
DOC: minor fix to AMI docs
ENH: First sketch of glasso
ENH: example for l1 covariance estimator
ENH: Add cd solver to glasso
COSMIT glasso: docstring and cleanup
ENH: the GLasso estimator
DOC: Better glasso example
TEST: test GLasso
ENH Glasso: don't penalize the diagonal
ENH: Add a GLassoCV
ENH GLassoCV: iteratively-refined Grid search
ENH GLasso: stability on correlated data
ENH GLassoCV: better parameter optimization
TEST GLasso: increase test coverage
DOC: narrative documentation for GLasso
COSMIT: @agramfort's comments
DOC: add sparse inverse covariance in whats_new
PEP8
DOC: rmks on structure recovery
DOC: better stock_market example (WIP)
COSMIT: address most of @ogrisel's comments
ENH: don't echo convergence warning on CV grid
DOC GraphLasso: be explicit about which algorithm
DOC GraphLasso: notes on algorithms and recovery
DOC: docstring in stock market example
DOC/API: integrate make_sparse_spd_matrix
Typo
MISC: address @larsman's comments
API: g_lasso.py -> graph_lasso_.py
DOC: GLasso -> GraphLasso
MISC: @VirgileFritsch and @mblondel's comments
MISC: silence stdout in GraphLassoCV tests
ENH GraphLasso: Silence warning
ENH: graph_lasso works on empirical covariance
BUG: update tests to changes in graph_lasso
BUG: fix layout in examples
MISC: fix rst bug
DOC: put class reference in the banner
COSMIT: prettify plot_oneclass
DOC: rework front page
DOC: Add 'up' relative link
DOC: title for the user guide content file
DOC: don't display empty tocs
MISC: scikits.learn -> sklearn
DOC: proper link structure in examples
DOC: title to relative links
DOC: EPD ships a recent version, but not latest
DOC: state clearly the version number
MISC: plot_stock_market cluster on learned covariance
BUG: fix score() with GraphLasso
Compatibility with numpy 1.1
BUG GraphLassoCV: score() needs a store_precision attribute
DOC: restore 'This page' in sidebar
Merge pull request #463 from npinto/patch-2
MISC: update joblib
BUG: fix joblib doctest
BUG: make the tests pass with numpy 2
COSMIT
COSMIT: prettify datasets docs
Merge pull request #469 from amueller/preprocessing_epsilon_doctest
DOC: start to merge statistical learning tutorial
Merge pull request #471 from amueller/linnerud_renaming
DOC: explicit the __init__ convention
Cosmit on randomized range finder
Merge pull request #475 from amueller/datasets_doctests
BUG: fix RandomizePCA: renaming of fast_svd args
DOC: scikit.learn -> sklearn
BUG: casting with numpy 2.0
BUG: API change in fast_svd
Merge branch 'master' of https://github.com/scikit-learn/scikit-learn
Merge branch 'master' into n_samples_scaling
MISC: FutureWarning on C scaling
Merge branch 'master' of https://github.com/scikit-learn/scikit-learn
COSMIT: beautify the plot_oneclass example
DOC: outlier detection improve docs and examples
DOC: improve outlier detection docs
API: h -> support_fraction
Cosmit
ENH: use controled random numbers
BUG: follow API change in outlier_detection
MISC: update whats_new
DOC: cosmit in kernel approximation
DOC: removing dangling link
Cosmit in metrics
BUG: fix bug introduced in fd8c
ENH: Store the full cv_scores in grid_search
DOC: add alpha_ to attributes of LassoLarsIC
COSMIT: utils.fixes: document versions
Merge pull request #512 from amueller/doc_consitency
ENH: update joblib
MISC: improve copy_joblib script
ENH: Integrate joblib 0.5.7
Merge pull request #478 from glouppe/tree
DOC: doctest bug
Cosmit: example prettier without colorbar
DOC: add links to examples.
DOC: improve univariate feature selection docs
MISC: move SelectorMixin outside of __init__.py
OPTIM: minor optimization
MISC: better error message
COSMIT
TST: fix doctest
TST: fix touchy doctest
COSMIT: avoid set_cmap and pcolormesh in example
Cosmit in docs
MISC: fix bibtex
ENH: Make LinearRegression work with sparse
DOC: update LinearRegression docstring
FIX: sparse LinearRegression with scipy 0.7.0
ENH: update joblib
MISC: tag explicitely a dependency
ENH: use joblib compression in datasets
MISC: tune test verbosity
ENH: update joblib
DOC: restore index on pages
Merge pull request #526 from amueller/ball_tree_skip_doctests
Merge pull request #537 from amueller/gaussian_nb_underscore
MISC: species distribution example plotted
ENH: better error messages
MISC: shorten a bit the description
DOC: fix image
DOC: layout
DOC: random selection of frontpage images
DOC: compress a bit the layout
DOC: shorten a bit the front page
DOC: avoid imgs taking 2 lines
DOC: Add a few images to the banner
DOC: fix wrong link
DOC: avoid line return
ENH: get the murmurhash to build properly
DOC: prettify ensemble docs
BUG: restore score functionality in grid_search
ENH: refit now works in the GridSearchCV
FIX: MurmurHash3 compilation on older GCC
Cosmit: remove unused imports
MISC: fix bibtex
Merge pull request #588 from jakevdp/balltree-fix
ENH: make LassoLarsIC more reproductible
BUG: fix test_precision_recall_curve
ENH: Add randomized lasso
ENH: randomized_lasso example: multiple alpha
Better randomized_lasso
Jacknife in randomized_lasso
Add a randomized logistic
COSMIT: pep08
ENH: Add pre_dispath to RandomizedLinearModel
ENH: RandomizedLinearModels transformers + memory
BUG: fix broken merge
MISC: inherit from BaseClassifier
BUG: parameter was not set right
DOC: Improve feature selection docs
DOC: try to improve randomized lasso example
ENH: numerical stability in LassoLarsCV
DOC: update dostring
ENH: grid in terms of alpha/alpha_max
DOC: nicer path
DOC: beautify feature_selection docs
DOC: cross-reference linear_model and randomized_lasso
DOC: enrich example docstring.
DOC: better example for randomized lasso
MISC: make sure two figures hold on a line
DOC: example and docs for randomized-lasso
MISC: address @ogrisel and @mblondel's comments
Cosmit
MISC: add randomized linear models to what's new
BUG: make clone work on 2D arrays
TST: add a test for bug fixed in previous commit
COSMIT: make the plot landscape
DOC: improve the label_propagation docs
COSMIT: authorship and licensing info
Cosmits
DOC: minor rmk on label_propagation
TEST: assert -> nose.tools.assert_equal
Merge branch 'label-propagation'
BUG: fix typo in tests
DOC: update whats_new
BUG: fix tests under numpy 1.5
TEST: add a test for whitening in ICA
PEP8
ENH: control random state in ICA
BUG: SVM raw_coef_ must be fortran ordered.
MISC: cosmit: use subpackage setup.py
DOC: reorganize GMM docs
DOC: reorganize GMM docs
DOC: more examples for DPGMM
Cosmit
MISC: remove custom __repr__
Merge branch 'master' of github.com:scikit-learn/scikit-learn
BUG: fix doctests
ENH: optim hierarchical: heapq in tree traversal
ENH: hierarchical: speedups in tree cut
MISC: clean up old c file
MISC: assert -> raise ValueError
BUG: typo
MISC: fix broken link to example
ENH: parallel in lasso_stability_path
API univariate_selection: _scores -> scores_
ENH: update joblib to release 0.6.2: bugfix
Merge pull request #613 from bwhite/patch-1
MISC: remove joblib from .gitignore
BUG: add missing file in joblib
Merge pull request #601 from agramfort/scale_C_true
BUG: follow API change in example
ENH: update joblib
Merge pull request #603 from jakevdp/GPML-fixes
Merge pull request #637 from fannix/fix
ENH: optim in ward_tree
Cosmit
BUG: ShuffleSplit should give reproducible splits
ENH: small speedups in coordinate descent
Revert "ENH: small speedups in coordinate descent"
ENH/FIX: in graph shortest path
Faster hierarchical cluster for very dense trees
ENH: Add the ability to set rho by cross-val
ENH: store the path for rho in ENet
BUG: fix tests and reorganize code
ENH: draft of parallel CV in elastic net
TEST: setting rho with ElasticNetCV
DOC: document ElasticNetCV
MISC: cosmit to please @agramfort
BUG: Same MSE scaling for LassoLarsCV and LassoCV
TEST: better tests of LassoCV and LassoLarsCV
DOC: add a link the Gohlke's 64bit windows binaries
DOC/TEST: HMM fix doc layout and doctest
ENH: Add controled random_state in HMMs
DOC: prettify HMM sampling example
Cosmit
COSMIT: underscores are better than unseparated words
TST: fix trivial bug and control the rng
MISC: fix the random number generators
Merge branch 'hmmc'
TEST: fix doctest on non 64bit boxes
COSMIT: readability
TEST: Fix cross_validation tests
BUG: fix cross_validation on numpy 1.3
Merge pull request #709 from ibayer/cleanExamples
Merge pull request #705 from agramfort/fix_ica
MISC: better verbosity in lars
DOC: more visible version remark
ENH Ward: better behavior for non-fully-connected graphs
ENH: Don't modify connectivity unless specified
DOC: affinity-propagation in clustering comparison
DOC: add clustering example on front page
Merge pull request #726 from emmanuelle/doc_correction
ENH: summary table on clustering
DOC: better clustering comparison table
DOC clustering comparison: link table and figure
MISC: tweak example layout
DOC: finish table to compare clustering
Merge branch 'WIP_tut'
DOC: Better narrative for DBSCAN
DOC: finish misc in tutorial
BUG: no plotting in doctests
COSMIT: layout tweak
Redo CSS layout killed by commut 94088b81
BUG: fix doctests
Merge pull request #730 from jaquesgrobler/rename_EllipticEnvelope
DOC: timings in cluster comparison example
COSMIT: prettier plot
Merge pull request #733 from jaquesgrobler/master
DOC: misc wording
TEST GNB: test that class_prior sum to 1
Merge pull request #751 from jaquesgrobler/master
DOC: Manhattan distance == l1 norm
BUG fix LinearSVM doctest
MISC: verbosity in SVMs
ENH: use warning.catch_warnings
ENH: neighbor warning always raised
API: n_test -> test_size in Bootstrap
COSMITs on GGM
TEST: Fix doctest
Cosmit: comment on 'clever' code
Warn: Passing params to fit is depreciated
DOC: testing without sklearn.test()
COSMIT: macports package name
COSMIT: better warnings
ENH MiniBatchKMEans: increase init_size for large k
DOC: better description of init_size
DOC create example section for datasets
DOC title for the tutorial examples
EXMPL: fix legend in sgd sample weights
COSMIT we no longer support Py 2.5
COSMIT simplify a bit examples
DOC: restructure what new
BUG: explicit adding of libm at build
BUG test_oneclass_decision_function: fix RNG
COSMIT: no capitals outside of class names
COSMIT: remove print
BUILD: add libm onlyon posix systems
MISC: simpler faster code with vectorization
SPD: Minor speedups
SPD: minor speedups
FIX: handle deprecation with estimator API
BUG: fix assert_greater/assert_lower
BUG: fix assert_greater
BUG: fix doctests
DOC: cosmits in docs
COSMIT: only classes should have capitals
ENH: make LinearSVC copyiable
TST: do not raise warnings in sklearn.test()
BUG: fix testing on older numpy
DOC: cosmits on tutorials and videos
DOC: wording of whats_new
BUG: use permutation rather than shuffle
CLEAN sparse_encode: remove unused arguments
ENH: avoid an underflow
Revert "ENH: avoid an underflow"
DOC: instructions on testing
DOC: faster and more meaningful example
ENH: prevent multiprocessing in tests under Windows
DOC: avoid 2 rows of images
DOC: more readable title
DOC: Feature extraction vs feature selection
DOC: image to graph utilities
ENH: update joblib
BUG: remove n_jobs=-1 from examples
Merge branch 'install-windows' of https://github.com/vene/scikit-learn
FIX: control RNG seeds in ICA tests
DOC: fix rst layout
MISC: clean up top-level namespace
P3K: more Py3k compat changes
BUG: multiple jobs in dict_learning
BUG: fix install bug for _check_build
BUG: casting error with recent numpys
DOC: note on heat kernel for spectral clustering
Typo
Typo
BUG: reassigning cluster centers with X sparse
BUG: k_means k -> n_clusters
COSMIT: k -> n_clusters
COSMIT: avoid deprecation warnings
MISC: os.name -> platform.system()
FIX: unique in old numpy
COSMIT in plot_mds.py example
DOC: misc improvements in MDS docs
DOC: minor MDS doc/example changes
MISC: update whats_new with MDS
BUG: address ill-conditionned designs in Lars
Cosmit: PEP8 :P
Cosmit: PEP8
COSMIT: intermediate variable
Merge pull request #953 from jaquesgrobler/nature_css_addons
ENH: backport gen_rst changes from NISL
ENH: minor speedup in Ward
ENH: factor 2 speedup in Ward
ENH: minor speed up in ward
ENH: minor speed up in Ward
Merge branch 'master' of github.com:scikit-learn/scikit-learn
MISC: avoid unprotected np.random
TST: testing without hard-coding the values
TST: test on diabetes rather than iris
Cosmit
BUG: example now needs 'assume_centered'
ENH: using slices rather than indice masks
ENH: avoid unecessary steps (covariance)
Cosmit: more explicit names
FIX: remove leftover print
Note on control of the RNG seed during testing
DOC: cosmit performance instructions
TST: test check_build
ENH: remove setuptools
ENH: restore 'develop' mode install
FIX: remove executable bit on joblib files
BUG: fix setup.py for develop
TST: test the setup.py using the configure step
MISC cleanup old coverage info in Makefile
ENH: Faster ward for large n_clusters
BUG: fix ward tests
DOC: ward docstring and testing
TEST: improve test coverage in hierarchical
FIX: make ward_tree work on 1D data
MISC: very minor speedup
COSMIT: remove left over profiling
TST: More testing in hierarchical
TST: test TypeError in Ward
TST: more tests for hierarchical
DOC: notes on improving code coverage
COSMIT: explainations of the partial import
MISC: build_utils: module rather than a subpackages
ENH: use sklearn.__version__ in setup.py
Merge branch 'linking_arrayfuncs'
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Cosmit: comment
TST: fix doctest
Update whats_new
Clean: remove debug print
PEP8
Typos
BUG: keep same shape for y in MultiTaskLasso
DOC: explicit MultiTaskLasso.coef_ dimensions
DOC: formatting and rephrasing in MultiTaskLasso
Merge pull request #1005 from NelleV/MDS
ENH: understandable error message for X sparse
BUG: casting rule with recent numpy
BUG: do not use diag_indices
BUG: choose seed to get affinity test working
BUG: fix my fix for affinity :(
DOC: link to Randomized sparsity in Lasso section
Merge branch 'master' into mixins
Revert "Rename Y to y in PLS"
Merge branch 'master' of github.com:scikit-learn/scikit-learn
BUG: sparse matrices in ElasticNetCV
MISC rest
DOC: improve scale_c_example
DOC: add a reference on multi-output trees
MISC: docstring work
BUG: fix setuptools feature
MISC: small docstring work
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Minor changes to contributing
BUG: parallel computing in MDS
BUG: deprecated k parameter in MiniBatchKMeans
BUG: copy and keep ordering
BUG: remove leftout debug prints
DOC: Enet alpha=0 => advice to use LinearRegression
MISC: add ltsa in docstring
ENH: MCD for large dataset
BUG in error message for k-means
BUG in error msg for spectral clustering
BUG: propagate random-state in MCD
DOC: protect `classes_` for valid rst
FIX: doctests under Windows 64bit
Update changelog
DOC: use nosetests rather than sklearn.test()
ENH: support arbitrary dtype in kNN classifiers
TEST: predict_proba in knn classifier with y string
DOC: fix doc mistakes
DOC: another layout fix
DOC: add doc on making a release
TST: cater for 0.9 not > 0.9
BUG: obey numpy 1.7's stricter rules
Merge remote-tracking branch 'origin/pr/1234'
BUG: cater for dev versions of numpy
MISC: use toarray instead of todense
BUG: RandomizedPCA needs random_state set
ENH: make RandomizedPCA.fit idempotent
TST: fix doctest
TST: fix test counting warnings
BUG: follow scipy API change
DOC: typo
DOC: improve the model selection exercise
ENH/FIX add a lobpcg solver to spectral embedding
MISC: decrease verbosity by default
FIX: numerical stability in spectral
MISC: addressing @satra's comments
ENH: make sure that spectral EVD is solve once
MISC: @agramfort's comments
PEP8
BUG: fix test error
BUG: make precision_recall invariant by scaling probs
BUG: fix setuptools feature
MISC: split example in two plots
API: change 'embed_solve' to 'assign_labels'
TST: increase coverage in spectral clustering
DOC: add docs of assign_label in spectral clustering
COSMIT: long remarks go in 'notes' section
BUG: restore numpy 1.3 compatbibility
MISC: minor clean ups in hmm code
Merge pull request #1290 from tjanez/master
COSMIT: pep8 in arrayfuncs.pyx
BUG: dot on sparse matrices broken in recent numpy
BUG: fix doctest bug
DOC: improve wording in covariance docs
DOC: typo
COSMIT: pep8, wording, layout
DOC: fixed string formatting in example
MISC: remove unused import
BUG: LassoLars path ending contained junk
TEST: one addition test on the length of the path
TEST: test that alpha is decreasing in LassoLars
BUG: lars corner case with path length == 1
ENH: multi-target Lars: lists rather than arrays
ENH: early stopping LARS for degenerate active set
MISC: address comments
MISC: more precise warning
WIP: drop for good correlated regressors
MISC lars_path: cleaner code in degenerate case
ENH: early stopping for lars
TST: add a test for lasso and lars
MISC: comment
ENH lars_path: early stopping after drop for good
TST: difficult test for early stopping
COSMIT: better comments
BUG: missing import introduced by rebase
DOC: Update whats_new with lars improvements
BUG: compat with numpy 1.3
DOC: spelling
BUG: AUC should not assume curve is increasing
COSMIT
DOC: LinearRegression document the shape of coef_
DOC: n_responses -> n_targets
TEST: decrease precision in test_lars_drop_for_good
BUG: imports should be locals
MISC: wording of doc/comments in example
ENH: RandomForestEmbedding in lle_digits example
DOC: cross-ref Random Forest embedding and manifold
DOC: list of dicts in GridSearchCV
DOC: wording and layout on front page
ENH: update joblib to 0.7.0a
BUG: fix properties on joblib files
BUG: Add forgotten file
BUG: update joblib to 0.7.0b
BUG: fix murmurhash compilation with recent Cython
ENH: use broadcasting, not tile
COSMIT: pep8
TEST: fixing randomly failing test
ENH: rng local to tests
TEST: add a test of sample weights
TST: improve last test
BUG: fix sample_weights in ridge
BUG: shape bug in tests
BUG: fix sample weights in ridge
BUG: Ridge: sample_weights in intercept
TST/BUG: test_common sample_weights in ridge
ENH: random reassignements in MiniBatchKMeans
ENH: fine tuning to the random assignement
DOC: example of dict-learning with KMeans
DOC: improve online KMeans example
DOC: dict learning with kmeans in narrative doc
DOC: fix typo
BUG: check n_clusters == len(cluster_centers_)
PEP8
DOC: change the example to lighter dataset
ENH: more control on reassignment in MiniBachKMeans
DOC: link to example
DOC: add comment
DOC: complete whats_new
TST: random reassignment in MiniBatchKmeans
TST: test verbosity mini_bach_kmeans
ENH: control random_state in MiniBatchKMeans
COSMIT: simplify parallel code in multiclass
DOC: put math at back, simplify formulation
MISC: fix rst in whats_new
MISC: index arrays with integers
DOC: voronoi + kmeans picture
DOC: typo in warning
BUG: reassignment_ratio == 0 in MiniBatchKmeans
BUG: sparse center reassignment MiniBatchKMeans
BUG: sparse vs non sparse centers
BUG: fix test to use sparse array
DOC: reference for discretise option
COSMIT :: in rst is easier for syntax highlighters
DOC: minor formatting in model_evaluation.rst
DOC: minor rst issues
DOC: misc rst formatting
COSMIT: prettify code and figure in example
COSMIT
Merge branch 'treeweights'
Merge pull request #1656 from rlmv/idf_diag
BUG: update joblib to 0.7.0d
TST: add a test for empty reassignment in MBKmeans
BUG: highly-degenerate roc curves
BUG: fix change of behavior in last commit
DOC: add example and ref to lars_path in lasso_path
BUG: ElasticNectCV choosing improper l1_ratio
ENH: minor changes for numpy versions
DOC: remove typo
DOC: libatlas3-base in requirement
ENH: Avoid computations in ElasticNetCV
ENH: improve memory usage in ElasticNetCV
DOC: docstring of private functions
BUG: fix sparse support in ElasticNetCV
COSMIT: address @agramfort's comments
DOC add 2012 GSOC students
COSMIT: labels in plot_lasso_coordinate_descent_path
COSMIT: txt -> rst
DOC: cosmit - fix latex typo
ENH: avoid MemoryError on manhattan_distances
BUG: old versions of numpy
BUG: old versions of numpy
MISC: details about the donations
BUG: type conversion in spectral_embedding
MISC: remove unused imports
BUG: restore Python 2.6
COSMIT: two empty lines between functions
Merge branch 'pr_1732'
BUG: fix sparsetools tests in old scipy
PEP8
Cosmit
Merge branch 'pr_2002'
BUG: fix unsafe casting
DOC: improve RBM example
MISC: remove unecessary dtype
ENH: better error message on scoring
DOC: reorganize model_evaluation
MISC: address comments and test failure
DOC: address remarks by @NelleV
DOC: Address @larsman's comments
DOC: @amueller's comments
ENH: Add the hungarian algorithm
TEST: Increase testing of hungarian
MISC: cosmit in hungarian
ENH: Speed up in hungarian
ENH: More speedups in hungarian
ENH: More speedups in hungarian
ENH: Still more speed ups in Hungarian
ENH: More speedups on Hungarian
API: scikits.learn -> sklearn
BUG: fix some numpy 1.3 compat issue
BUG: numpy 1.6 compat
:
BUG: fix kde tests
MAINT: update copy_joblib script
ENH: update joblib to 0.7.1
MAINT: misc change to copy_joblib
ENH: make bdist_rpm work
COMPAT: empty_like does not have a dtype in np 1.3
COMPAT: fix arpack and pls on old scipy/numpy
COMPAT: string formatting syntax in Py 2.6
COMPAT: median and nans in old numpys
COMPAT: no assert_warns in np 1.3
BUG: fix Py 3
DOC: invert priorities bootstrap <-> nature.css
DOC: sidebar lighter
ENH: add a new DataConversionWarning
MISC: fix plot_multilabel example
BUG: implement concrete __init__ for SGDRegressor
BUG: tests were raising the DataConversionWarning
Merge branch 'pr_2304'
MAINT: recompile Cython files
DOC: add whats_new on the news
TST: adjust test relying on change order
MISC: deprecate balance_weights (it's internal)
REL: 0.14a1 Release candidate for 0.14
MISC: update whats_new
MISC: fix reference to example
DOC: DBSCAN misc doc formatting
DOC: also point installation menu to stable
DOC: reduce the number of examples
MAINT: remove sklearn.test()
MISC: deprecation notice
MISC: document sklearn.test deprecation
ENH: custom distutils clean command
DOC: layout tweaks
DOC: bigger menu fonts
DOC: button layout tweak
TST: avoid a crash in Windows + Anaconda Py3.3
MISC: fix wrong timing in example
TST: avoid nose running sklearn.test as a test
MAINT: randn on float is deprecated
MISC: deprection is in 2 releases
DOC: update documentation for release
DOC: fix CSS bug
MAINT Update mailmap
REL: 0.14 release: update whats_new and version
Gilles Louppe (719):
DOC: Missing dot in Pipeline class description
Enforce axis=1 in Normalizer.transform + doc fixes
DOC: Fixed issue #110
DOC: Missing import in doctests
BUG: `copy=None` in `Scaler.transform` instead of `copy=False`
Complete rewriting of samples_generator.py
Fixes for broken tests due to the API changes in samples_generator.py (1)
Merge remote-tracking branch 'upstream/master' into samples_generator
Merge remote-tracking branch 'upstream/master' into samples_generator
Fixes for broken tests due to the API changes in samples_generator.py (2)
Fixes for broken benchmarks due to API changes in samples_generator.py
Fixes for broken examples due to changes in samples_generator.py
`seed` renamed to `random_state` and default value set to None.
Added references to functions in the `datasets` module.
Merge remote-tracking branch 'upstream/master' into samples_generator
Fixed a broken test.
Added tests for the samples generator module.
Added references to samples_generator.make_* functions in the documentation.
Small improvements in the documentation of the toy datasets.
dictionnary -> dictionary
Merge remote-tracking branch 'upstream/master'
Improvements of the RFE module.
Merge remote-tracking branch 'upstream/master'
Documentation + PEP8
More robust test on `step`.
Fixed a syntax error
Small code simplification.
Merge remote-tracking branch 'upstream/master'
Improved test coverage of rfe.py to 100%
Fixes of minor bugs + improved test coverage (now 100%)
Addressed Gael's comments.
Addresses Gael's comments. (2)
Addresses Gael's comments. (3)
Typo.
Improved test coverage of samples_generator and feature_extraction modules.
Fixed a small introduced due to a previous commit.
Merge remote-tracking branch 'upstream/master' into test-coverage
Improved documentation + predict/score.
Cosmit
Typo
Typo (2)
Merge remote-tracking branch 'upstream/master'
PEP8
Merge remote-tracking branch 'upstream/master'
Fixed examples
Improved test coverage to 100%
Added RFE into the narrative documentation
Doc: grammar
Added n_features_ attribute to RFE
Moved "feature selection" section back into the "supervised learning" chapter
Ensure 0.0 on diagonal elements if X is Y
Doc: Implementation details of euclidean_distances
Merge pull request #343 from glouppe/euclidean_distances
ENH: `np.fill_diagonal` replaced with more portable code. Added an explanatory comment.
scikits-learn -> sklearn
Added link to personal web page
Changes on the feature_selection module.
ENH: Cleaned setup.py
Merge remote-tracking branch 'bdholt1/enh/tree' into tree
DOC: Some docstrings have been rewritten + small cosmetic changes
Merge remote-tracking branch 'bdholt1/enh/tree' into tree
DOC: Improved documentation + cosmit changes
COSMIT: GraphViz exporter cleaned up
ENH: Made apply_tree_sample slightly more efficient + various cosmits
Regenerated _tree.c
Fixed issue #378 on the RFE module
Updated changelog.
Added a numerical stability test to decision trees
Added a numerical stability test to decision trees
Revert "Added a numerical stability test to decision trees"
Merge remote-tracking branch 'upstream/master'
Merge remote-tracking branch 'upstream/master'
Merge remote-tracking branch 'upstream/master'
DOC: Added load_boston in classes.rst
Merge remote-tracking branch 'upstream/master'
Simplified tree module API.
Added some comments
Allow for max_depth to be set to None
Simplified the tree code
Added k_features argument to build randomized trees.
First draft at find_best_random_split (not yet tested)
Renamed k_features to max_features
Added some explanatory comments into the code logic
Re-extended the _build_tree API
Factored is_classification
Added ExtraTreeClassifier and ExtraTreeRegressor
Typo
First draft at forest of random trees (work in progress)
Added some tests
Cosmit
Fixed bugs in forest + first test
Check X is a fortran-array and y is contiguous
Fixed bugs
Added tests of the forest module (work in progress)
Default value of n_trees=10
bootstrap=False for extra-trees
Set random_state=1 in tests
Added documentation in the forest module (work in progress)
Cosmit
Completed documentation
Added some tests
Added predict_log_proba
Added some more tests
Removed old random forest files
Added some more tests
Cosmit
Regenerate _tree.c
Fixed a small bug
Cosmit
Use super()
Use take instead of __get_item__
Rewrote some comments
Cosmit
Revert changes on conf.py (mistake on my part)
Added random_state parameter to _find_split functions
Factored out changes on the ensemble module
Merge remote-tracking branch 'origin/master' into tree
Fixing conflicts
Merge remote-tracking branch 'upstream/master'
Removed extra-trees (for now)
Removed extra-trees from __init__
Removed extra-trees (again!)
Merge pull request #432 from glouppe/tree
Merge remote-tracking branch 'upstream/master'
Merge remote-tracking branch 'upstream/master'
Rebase of @bdholt1's ensemble branch
DOC: Added module descriptions
PEP8: tree.py, forest.py
Merge remote-tracking branch 'upstream/master' into ensemble-rebased
DOC: Added warning and see also
ENH: Modified forest API to make it possible to grid-search the parameters of the underlying trees
Merge remote-tracking branch 'upstream/master' into ensemble-rebased
ENH: Check that base_tree is an estimator
ENH: Make forest derive from BaseEnsemble
Removed Bagging and Boosting modules from this PR
ENH: Make the Forest's API coherent with BaseEnsemble's API
FIX: Don't clone estimators at instantiation
TEST: Added test case for grid-searching over the base tree parameters
ENH: Cosmit
EXAMPLES: Improved plot_tree_regression
Typo
EXAMPLES: Improved plot_iris
EXAMPLES: Added plot_forest_iris
FIX: Trees couldn't be cloned properly
ENH: Added __init__.py into ensemble/tests/
DOC: Improved documentation in the examples
PEP8
TEST: Added tests of BaseEnsemble
TEST: Improved test coverage
EXAMPLES: Fixed a bug in plot_forest_iris
DOC: Cosmitis in the narrative documentation of the tree module
DOC: Improved narrative documentation of the tree module
DOC: Added ensemble methods to TOC
DOC: Added ensemble methods to the class reference
DOC: First draft at the narrative documentation of the ensemble module
DOC: Narrative doc of the ensemble module (work in progress)
DOC: Completed the narrative documentation (work in progress) + What's new
DOC: Fixed What's new
DOC: Last details on the narrative documentation
DOC: Added a last example in the narrative doc
Merge pull request #1 from ogrisel/glouppe-ensemble-rebased
DOC: Address @vene and @satra comments
TEST: Added test_base_estimator
DOC: Cosmit
ENH: Simplified RandomForest and ExtraTrees API
ENH: Use trailing _ for private attributes
DOC: Added warning in make_estimator
DOC: Removed 'default'
FIX: Bug with bootstrapping
FIX: Bug with bootstrapping (2)
FIX: Bug in plot_forest_iris
Merge remote-tracking branch 'upstream/master'
DOC: Use ELLIPSIS in doc-test
Cosmit
ENH: Address @agramfort comments
Benchmark: Added random forests and extra-trees to bench_sgd_covertype.py
Merge remote-tracking branch 'upstream/master'
Merge remote-tracking branch 'upstream/master'
FIX: Use random_state in _find_best_random_split
Merge remote-tracking branch 'upstream/master'
Merge remote-tracking branch 'upstream/master'
First draft at Reference rewrite
DOC: "the scikit-learn" -> "scikit-learn"
DOC: References to user guide sections
DOC: Standardize the module documentation format (work in progress)
DOC: Standardized the module documentation format (2)
DOC: Fixed graph_lasso reference
DOC: "Class Reference" -> "Reference"
DOC: Fixed warning
DOC: Changed sections titles in the reference
Merge pull request #461 from Balu-Varanasi/bug_in_rst_file
Merge pull request #467 from Balu-Varanasi/pep8-compliant
DOC: Fixed broken reference to user guide
Merge remote-tracking branch 'upstream/master'
ENH: Added feature importances to decision trees and to forests
TEST: Added test on feature importances
EXAMPLE: Added examples for feature importances using trees
COSMIT: rfe examples
Merge branch 'master' of https://github.com/scikit-learn/scikit-learn
EXAMPLE: Improved plot_forest_importances.py plot
COSMIT: tree examples
DOC: Fixed links to modules in the example gallery
DOC: Fixed broken links
EXAMPLE: Moved to the Olivetti dataset
ENH: Accelerate ensemble of trees by precomputing X_argsorted
FIX: bootstrap=False by default with extra-trees
EXAMPLES: Removed useless import
ENH: Use extra-trees instead of rf
COSMIT: examples
Added links and various cosmits
DOC: Added fetch_olivetti_faces to Reference
Merge branch 'master' of https://github.com/scikit-learn/scikit-learn
DOC: Cosmits on the Support page
ENH: Parallel fit/predict/predic_proba/feature_importances in forest
FIX: Ensure random random_states
ENH: use pre_dispatch
DOC: Return->Returns
Merge branch 'master' of https://github.com/scikit-learn/scikit-learn
DOC: Cosmit on the reference
ENH: Improved _parallel_predict_proba
DOC: add n_jobs to specs
ENH: Assign chunk of trees to jobs
EXAMPLE: renamed Frankenstein, set cmap in matshow
ENH: Forest -> BaseForest
DOC: Added reference for feature importance
ENH: Revisited importances API
Merge remote-tracking branch 'upstream/master' into tree
EXAMPLE: Fixed API changes
ENH: Missing default value for feature_importances_
ENH: Added SelectorMixin
TEST: Added tests of transform
ENH: Simplified API
DOC: Tree-based feature selection
ENH: don't sum if coef_ is 1-d
ENH: Inherits from TransformerMixin
PEP 257
PEP 257 (bis)
ENH: address @ogrisel comments
FIX: Used np.abs instead of ** 2
Merge remote-tracking branch 'upstream/master' into tree
ENH: Smart thresholds
Cosmit
PEP8
DOC: :mod: link
ENH: no predispatch with chunk strategy
FIX: Address Gael comments
Merge remote-tracking branch 'upstream/master' into parallel-forest
ENH: Simplified parallelization
PEP8
ENH: Simplified code
DOC: Quick docstrings for private functions
FIX: Revert changes
DOC: What's new
Merge pull request #2 from ogrisel/glouppe-parallel-forest
FIX: Address @ogrisel comments (1)
Merge branch 'parallel-forest' of github.com:glouppe/scikit-learn into parallel-forest
TEST: Added tests of parallel computation
DOC: Parallel computations in forest
TEST: Improved coverage of the ensemble package to 100%
DOC: Renamed example (+Parallel)
Merge remote-tracking branch 'upstream/master' into parallel-forest
Merge pull request #491 from glouppe/parallel-forest
DOC: Added missing BSD3 licenses
ENH: Better default values to trees and forests
TEST: Added tests of max_features values
DOC: Review of the narrative doc wrt max_features
DOC: Added warning to default values
DOC: typo
Merge pull request #523 from glouppe/tree-doc
DOC: fix broken doctest
FIX: max_features=None by default on single DT
Merge pull request #527 from otizonaizit/master
FIX: Add reference (stop words)
Merge remote-tracking branch 'upstream/master' into issue349
Merge pull request #528 from glouppe/issue349
DOC: Removed performance and utilities from toctree (they were appearing twice)
DOC: Fixed 'See also' in tree/forest
DOC: typo
2011 -> 2012
Merge pull request #627 from amueller/min_leaf_cherrypick
Merge pull request #684 from clayw/graphviz-fix
PEP8
ENH: move _compute_feature_importance into Tree
ENH: Use DTYPE instead of float64
Cosmit
ENH: Moved _build_tree into Tree
Cosmits + Fix to a test
Revert "ENH: Use DTYPE instead of float64"
FIX: return; instead of return NULL;
FIX: avoid dividing by zero in Tree.compute_importances
ENH: parallel computation of X_argsort
ENH: better argsort
ENH: cosmit and doc
Merge pull request #761 from glouppe/master
ENH: MultiOutputTree (wip)
Merge branch 'master' of github.com:scikit-learn/scikit-learn into tree-mo
ENH: Multi-output decision trees
ENH: Regenerate .c file
FIX: graphviz test
Merge branch 'master' of github.com:scikit-learn/scikit-learn into tree-mo
FIX: test_classification_toy
TEST: test_multioutput (1)
TEST: test_multioutput
ENH: make forests support multi-output
TEST: test_multioutput
ENH: Patch GradientBoosting
ENH: Patch GradientBoosting (2)
FIX: log_proba + DOC
DOC: What's new
PEP8
ENH: graphviz
DOC: narrative documentation
DOC: typo
DOC: Scikit-Learn -> scikit-learn
ENH: Cython improved code
ENH: Cython improved code (2)
DOC: narrative documentation
FIX: use and modify own y
COSMIT
FIX: segfault
DOC: Example
DOC: typo
DOC: example
DOC: typo
DOC: narrative documentation
DOC: docstrings for criteria
DOC: docstrings
Merge branch 'master' of github.com:scikit-learn/scikit-learn into tree-mo
Merge pull request #3 from bdholt1/glouppe-tree-mo
Merge branch 'master' of github.com:scikit-learn/scikit-learn into tree-mo
DOC: format
Merge pull request #923 from glouppe/tree-mo
Fix broken bot (sorry for that!)
Fix broken bot (again ;))
Merge branch 'master' of github.com:scikit-learn/scikit-learn into tree-speedup
DOC: What's new > Missing links
Merge branch 'master' of github.com:scikit-learn/scikit-learn into tree-speedup
Tree refactoring (1)
Tree refactoring (2)
Tree refactoring (3)
Tree refactoring (4)
Tree refactoring (5)
Tree refactoring (6)
Tree refactoring (7)
Tree refactoring (8)
Tree refactoring (9)
Tree refactoring (10)
Merge branch 'master' of github.com:scikit-learn/scikit-learn into tree-speedup
Merge pull request #948 from mrjbq7/trees
Merge branch 'master' of github.com:scikit-learn/scikit-learn into tree-speedup
Merge pull request #950 from mrjbq7/trees
Merge branch 'master' of github.com:scikit-learn/scikit-learn into tree-speedup
ENH: Tree properties
Tree refactoring (11)
ENH: make Tree picklable
Tree refactoring (12)
Tree refactoring (13)
FIX: avoid useless data conversion
FIX: avoid useless data conversion (2)
Tree refactoring (14)
Tree refactoring (15)
Tree refactoring (16)
FIX: @mrjbq7 comments
Tree refactoring (17)
Tree refactoring (18)
FIX: sample_mask
Merge branch 'tree-speedup' of github.com:glouppe/scikit-learn into tree-speedup
FIX: init/del => cinit/dealloc
Added _tree.pxd
FIX: gradient boosting (1)
COSMIT
Tree refactoring (19)
FIX: PyArray_ZEROS -> np.zeros?
FIX: gradient boosting (2)
Tree refactoring (20)
What's new
PEP8
Merge pull request #956 from Carreau/patch-1
COSMIT
Turn off warnings
FIX: test_feature_importances
FIX: test_feature_importances?
TEST: disable test_feature_importances for now
Merge pull request #946 from glouppe/tree-speedup
FIX: dtype conversion of y
EXAMPLE: plot importances with bars
FIX: forest / check_random_state in fit
FIX: tree / check_random_state in fit
FIX: bug in multi-output forest.predict_proba
Check for memory errors (1)
Check for memory errors (2)
Check for memory errors (3)
Avoid useless if-statements
Added a comment to clarify initial capacity
Merge pull request #1144 from glouppe/tree-malloc
DOC: return values of make_moons and make_circles
Merge pull request #1197 from glouppe/master
FIX: prevent early stopping in tree construction
FIX: prevent early stopping in tree construction (2) + Test
Merge pull request #1263 from glouppe/fix-1254
Merge pull request #1269 from mrjbq7/doc-fixes
ENH: Simplify the shape of (n_)classes_ for single output trees
ENH: Simplify the shape of (n_)classes in forest
PEP8
TEST: regression test for shape of (n_)classes
TEST: enforce flat classes_
What's new: API changes
ENH: better names for variables
What's new: added :class: keyword
FIX: convert predictions into a numpy array
FIX: docstring tests
Merge pull request #1445 from glouppe/tree-shape
What's new: typo
Merge pull request #1388 from arjoly/issue1047_gradient_boosting_uses_decision_trees
Merge pull request #1458 from seberg/contig_strides
What's new: fix by @seberg
Checkout files from ndawe:treeweights
FIX: roll back some changes
FIX: what's new
flake8
ENH: early binding + allocate features at tree creation (by @pprett)
FIX: oob test
DOC: sample_weight=None
DOC: what's new
DOC: typo
DOC: cosmit
FIX: use sklearn.utils.fixes.bincount
ENH: use random_state.shuffle
ENH: import aliases
ENH: import aliases (2)
ENH: import aliases (3)
PEP8 (some)
TEST: sample_weight
TEST: sample_weight (once more)
FIX: iris.target
FIX: raise an exception if negative number of samples
TEST: use rng
FIX: do not overwrite min_samples_split
FIX: set min_samples_split=2 by default
DOC: updated docstring
Typo
ENH: weighted r2 score for regression
COSMITs
ENH: Added balance_weights
ENH: added some tests
FIX: test_oob_score_regression
FIX: compute weighted oob scores
FIX: NaN problem + Added some tests
TEST: added some more tests
EXAMPLE: simplify n_estimators and n_samples
TEST: importances
TEST: multi-output problems
ENH: WeightedClassifier/Regressor mixins
DOC
FIX: drop support for multi-output
TEST: errors
ENH: staged_score
EXAMPLE: reduce the number of samples
EXAMPLE: merge plot_adaboost_iris into plot_forest_iris
EXAMPLE: drop plot_adaboost_quantiles
FIX: move balance_weights into preprocessing
PEP8 + PyFlakes
FIX: broken test
FIX: one more bug
FIX: remove prints
DOC: edited some docstrings
DOC: added references into classes.rst
ENH: rename boost method to _boost
DOC: cosmits + narrative documentation (begin)
DOC: proper citations
DOC
TEST: make test_importances more stable
DOC: narrative documentation
DOC: What's new
TEST: base_estimator
DOC: classes_ and n_classes_
DOC: put docstrings into subclasses to make them appear in the documentation
DOC + Better default parameter values
DOC: cosmits
DOC: typo
PEP8 and DOC
ENH: use shuffle
Roll back some changes
Roll back some changes (2)
FIX: what's new
Merge branch 'master' of github.com:scikit-learn/scikit-learn into adaboost
FIX: broken test
FIX: @amueller comments
Cosmits, code structure and tests
EXAMPLE: better plot_adaboost_regression
Revert changes on plot_adaboost_error.py
ENH: set default parameter values
Cleanup
EXAMPLE: give plot_adaboost_classification some love
DOC: narrative documentation
Merge branch 'master' of github.com:scikit-learn/scikit-learn into adaboost
Merge branch 'master' of github.com:scikit-learn/scikit-learn into adaboost
FIX: some nitpicks
ENH: remove boost_method parameter and use a string as switch
ENH: weights_ -> estimator_weights_
FIX: pprett comments
DOC: Added a References section in _samme_proba
COSMIT: flake8
ENH: weight -> estimator_weight
ENH: weight -> estimator_weight (2)
ENH: weight -> estimator_weight (3)
EXAMPLE: better x-axis label
EXAMPLE (2)
FIX: make_hastie_10_2 reference docstring
DOC: add a short dataset description in hastie example
DOC: narrative documentation
FIX: doctest
EXAMPLE: add AdaBoost to plot_classifier_comparison
FIX: some of Gael comments
What's new: Adaboost
Remove compute_importances parameter
What's new
ENH: Remove compute_importances in AdaBoost
ENH: Update feature_importances in GBRT
ENH: remove "mse" method and simplify
COSMIT
DOC: feature importances
Merge pull request #1657 from glouppe/feature-importances
DOC: add balance_weights to reference
EXAMPLE: compute_importances=True is no longer required (1)
EXAMPLE: compute_importances=True is no longer required (2)
DOC: narrative documentation on feature importances
ENH: precompute X_argsorted when possible
DOC: X_argsorted
Flake8
ENH: use isinstance instead
Merge pull request #1668 from glouppe/adaboost-tree
Merge pull request #1700 from erg/rf
FIX: use DOUBLE_t type
Merge pull request #1705 from glouppe/tree-fix
ENH: support float value for max_features
DOC: if float, then max_features is a percentage
ENH: Defer parameter checking of trees
DOC: GBRT max_features
TEST: added test
ENH: use numbers
FIX: numpy integers
PEP8
Merge pull request #1712 from glouppe/tree-maxfeatures
What's new: float values support for max_features
What's new: fix indentation
Merge pull request #1816 from ndawe/master
Merge pull request #1823 from erg/issue-1466
Merge pull request #1852 from slattarini/typofixes
ENH: moved export_graphviz to sklearn/tree/export.py
ENH: add max_depth to export_graphviz
ENH: output criterion name instead of "error" in export_graphviz
Merge pull request #1998 from kgeis/fix-setup-instruction
Merge pull request #2031 from jnothman/tree_comments
WIP: new Cython interface for decision trees
WIP: comments on the Cython interface
WIP: Criterion interface and base class
WIP: ClassificationCriterion (reset, update)
WIP: Gini criterion
WIP: entropy criterion
WIP: remove n_left and n_right attributes
WIP: MSE criterion
WIP: tree class
WIP: tree algorithm
WIP: add_node
WIP: node_value
WIP: node_value
WIP: predict + apply
WIP: Random Splitter
WIP: splitter
WIP: Best Splitter
WIP: sort features
WIP: first pass on tree.py
WIP: some debug
WIP: some more debug
WIP: debug in progress...
WIP: debug (tests still don't pass...)
WIP: one more bug fixed
WIP: cleanup
WIP: one more test fixed
WIP: more bugs fixed :)
WIP: 19 tests passed
WIP: test_tree.py now passes \o/
Cleanup
WIP: feature importances
WIP: discard samples with weight = 0
WIP: fix export functions
Cleanup
WIP: first pass on ensembles
WIP: use heapsort
WIP: small optimization to heapsort
WIP: remove asserts
WIP: use C-based random number generator
WIP: set n_classes as ndarray
FIX: fix test_random_hasher
WIP: fix adaboost
WIP: small optim to regression criterion
WIP: optimize tree construction procedure
WIP: optimization of the tree construction procedure
cleanup
recompile _tree.pyx
FIX: export_graphviz test
FIX: set random_state in adaboost
FIX: doctests
FIX: doctests in partial_dependence
FIX: feature_selection doctest
FIX: feature_selection doctest (bis)
WIP: allow Splitter objects to be passed in constructors
FIX
Some PEP8 / Flake8
Small optimization to RandomSplitter
FIX: fix RandomSplitter
Cosmit
FIX: free old structures
WIP: Added BreimanSplitter
WIP: small optimizations
WIP: fix BreimanSplitter
Cleanup
WIP: optimize swaps
Regenerate _tree.c
WIP: some optimizations to criteria
WIP: add -O3 to setup.py
WIP: normalize option for compute_feature_importances
WIP: Added deprecations in tree.py
WIP: updated documentation in tree.py
WIP: added deprecations in forest.py
WIP: updated documentation
WIP: unroll loops
WIP: setup.py
WIP: make sort a function, not a method
WIP: Cleaner Splitter interface
WIP: even cleaner splitter interface
WIP: some optimization in criteria
WIP: remove some left-out comments
WIP: declare weighted_n_node_samples
WIP: better swaps
WIP: remove BreimanSplitter
WIP: small optimization to predict
WIP: catch ValueError only
WIP: added some documentation details in _tree.pxd
WIP: PEP8 a few things
Benchmark: use default values in forests
WIP: remove irrelevant and unstable doctests
WIP: address @ogrisel comments
WIP: address @ogrisel comments (2)
WIP: remove partition_features
WIP: style in _tree.pyx
WIP: make resize a private method, improve docstring
WIP: use re-entrant rand_r
FIX: doctest in partial_dependence
WIP: break or shorten some long lines
FIX: doctest in feature_selection
WIP: break one-liner if statements
WIP: revert use of rand_r
FIX: broken tests based on rng
DOC: update header in rand_r.c
TEST: skip test in feature_selection (too unstable)
FIX: one more doctest
WIP: Faster predictions if n_outputs==1
WIP: Break comments on new line
WIP: make criteria nogil ready
WIP: enforce contiguous arrays to optimize construction
WIP: avoid data conversion in AdaBoost
WIP: use np.ascontiguousarray instead of array2d
TEST: add test_memory_layout
FIX: broken test
WIP: Make trees and forests support string labels
WIP: refactor some code in forest.fit
TEST: skip doctest in feature_selection (unstable)
WIP: better check inputs
WIP: check inputs for gbrt
Merge pull request #2131 from glouppe/trees-v2
What's new: new implementation for trees
FIX: remove debug message
FIX: remove -funroll-all-loops
FIX: ur strings are not supported in Python 3.3
DOC: some documentation for the Tree Cython structure
Merge pull request #2216 from glouppe/tree-doc
Benchmark: use specified dtype
TEST: cosmit on err_msg
Raise an exception if rows are full of missing values
FIX: doctest
Better error message
FIX: use range instead of xrange
FIX: imputation example
Merge pull request #2241 from arjoly/grid-cv-multioutput
Merge pull request #2262 from NicolasTr/fix_statistics
FIX: remove blank lines
Use epsilon=1e-7
FIX: partial dependence test
TEST: skip test_oob_multilcass_iris for now
Merge pull request #2277 from glouppe/tree-fix-32bits
COSMIT: typo in examples/imputation.py
Mr. Proper, act 1
Banner improvements
Banner style
Boxes on front page
Load bootstrap first
FIX: footer character encoding
CSS tweaks
CSS tweaks (2)
Lower part of the index
CSS tweaks
More css tweaks
Better alignment in the sidebar
CSS tweaks
More css kungfu
CSS stuff
Remove testimonials for now
CSS tweaks
Donate button + citing
Enhance contrasts
Contributin
Remove toc on the API page (it is already in the sidebar)
FIX: sidebar.js
Move Google javascript near </body>
FIX: remove duplicate entry in What's new
Polishing on "Who's using scikit-learn"
Website: bottom buttons
Hannes Schulz (2):
MISC privatize/deprecate internal function of gaussian process
typo
Harikrishnan S (1):
DOC/FIX twenty_newsgroups.rst should use TfidfVectorizer
Hrishikesh Huilgolkar (7):
chi2 and additive_chi2 raise error if input are sparse matrices
Added same for additive_chi2_kernel
Fixed pep8 issues
pairwise_distance_functions renamed to PAIRWISE_DISTANCE_FUNCTIONS
Made more changes renamed pairwise_kernel_functions, kernel_params to allcaps
Added test for fit_transform(X)==fit(X).transform(X)
Fixed pep8 issues
Ian Ozsvald (3):
clearer decision surface plots and classifier final predictions for the ensembles
improved formatting
updated docs to fix formatting errors
Immanuel Bayer (72):
Test added for multiple-outcome:
bugfix: lstsq coefficients output needed to be transposed
fixed spelling error
docstring updated and list append replaced with
consistency
spelling
pep8 errors fixed
pep8 errors fixed
parallelized
parameter n_jobs added
BugFix, matrix was not flagged as sparse.
cleaned some examples
combat for sp_linalg.lsqr
test for positive constrained lasso added
positive constrained option for lasso added
lasso docstring update
remove outcommented lines
wording
example for lasso with positive constraint
renaming
reset wrongly committed file
use scikit function to make train test split
set w[ii] = 0 if tmp > 0
- changed parameter from positive_constraint to positive
indent
add examples for positive constraint lasso and enet
merged into plot_lasso_coordinate_descent_path
fix doctest
fixed doctest
Merge pull request #1 from agramfort/posCoeff
add dense attribute and dummy for sparse fit
Merge branch 'master' of https://github.com/scikit-learn/scikit-learn into merge_cd
add dense attribute and dummy for sparse fit
Merge branch 'merge_cd' of https://github.com/ibayer/scikit-learn into merge_cd
support of sparse input data added
tests of sparse coordinate_descent applied to the modified dense
-remove sparse option
remove sparse_coef_
Test is redundant since _set_coef function has been removed.
add property for sparse_coef_
add test for sparse_coef_ property
docstrings updated
merge cd_fast and cd_fast_sparse
remove redundant tests
remove redundant files, functionality has been moved to cd_fast.pyx
code removed and deprecated message added
fix docstring example
add test to check normalize option in sparse enet
Revert "remove redundant files, functionality has been moved to cd_fast.pyx"
Revert "remove redundant tests"
add sparse_std that has been wrongly removed in commit 48ba97f1 from the
update sparse_std call
some tests didn't use the numpy sparse matrix as input data and
make sure X is of dtype float64 in _sparse_fit
change input to inplace_csc_column_scale
modify test_normalize_option
test data changed for test_normalize_option
remove redundant folders in linear_model/sparse
remove unused imports
fix pep8
move sparse_center_data to linear_model.base
avoid copy if X has proper type, modify docstring
fix warning: add underscore to: grid_search.best_estimator_ and
add dual_gap_ and eps_ to Enet and Lasso docstring
extend eps_ description
renaming 'learn_rate' in 'learning_rate'
ENH homepage add links to headers in left panel
ENH add link to Citing
ENH renaming 'max_iters' to 'max_iter' for consistency
DOC missing class mention
ENH renaming 'n_atoms' to 'n_components' for consistency
ENH fix pep8
Imran Haque (2):
ENH Release GIL when entering LibSVM/Liblinear code
Release GIL around sparse liblinear training
Jack Hale (1):
WMinkowskiDistance corrections to error messages and docstring
Jacques Kvam (4):
add verbose output for gradient boosting algorithms
Changed verbose to int, added a low verbose option to just print '.'.
remove '\r' and format numbers to be fixed width, 7 digits of precision
fix GradientBoostingClassifier by passing verbose as a keyword argument
Jake VanderPlas (290):
fixed bug in BallTree cython wrapper
fixed small bug in cython wrapper for BallTree
updated ball_tree documentation
Merge commit 'upstream/master'
added MLLE, made some small fixes to manifold module
wrapped brute force neighbor search
added cython wrapper to BallTree.query_ball
query_ball -> query_radius, removed knn_brute
speed up BallTree.h
slight speedups to BallTree.h and ball_tree.pyx
added unit test for BallTree.query_radius
fixed reference-passing bug in BallTree.h
vastly improved MLLE speed
added HLLE code
sped up HLLE code
added ability to return distances and specify multiple search radii for BallTree.query_radius()
fixed r shape bug
Merge branch 'manifold' of git://github.com/fabianp/scikit-learn into manifold
pep8 changes
cosmetic changes
added arpack support in scipy_future; wrapped MLLE and HLLE into locally_linear function
removed old files; moved example to examples directory
pep8 changes
added LTSA method
pep8
fixed bug in modified LLE: now works for higher dimensions
added method argument to digits example
Merge pull request #1 from ogrisel/jakevdp-manifold
minor changes
NeighborsClassifier: changed window_size to leaf_size & updated documentation as discussed in Issue #195
fixed doc formatting
merged with sparse classifier commit
merged changes in master
pep8
merge with previous commits
H_tol/M_tol -> hessian_tol/modified_tol
Initial commit
fixed bug in calculating tau
added cythonized Floyd-Warshall algorithm
speed tweaks in Floyd-Warshall, and renamed graph_search->shortest_path
speedup in Floyd-Warshall: unsigned ints to prevent negativity checks
Added Dijkstra's algorithm with Fibonacci Heaps for significant speed gains in path searches
bug fix: free allocated memory
changed shortest_path() to accept a sparse distance matrix for more flexibility
cleanups & pep8
add tests, doc update
combined manifold examples
manifold doc update
Revert "combined manifold examples"
fixed bug in shortest path; consolidated isomap examples
ex. change
Merge branch 'manifold-test' into manifold-doc
cleaned up and documented Fibonacci code
added tests; cleanup; pep8
remove unused imports
first stab at implementation via KernelPCA
add arpack support to KernelPCA
small efficiency boost to KernelCenterer
np.random -> RandomState
K_pred_cols -> K_pred_cols_
Merge branch 'master' into manifold-isomap
manifold/shortest_path -> utils/graph_shortest_path
Implement Isomap + transform in terms of KernelPCA
add description to isomap transform
added Isomap.reconstruction_error()
store BallTree in Isomap for faster transform()
fix conflicts with master
Merge branch 'manifold-isomap' into manifold-doc
update manifold documentation
Merge commit 'upstream/master' into manifold-doc
changes to manifold doc
speed improvements on LLE variants for high dimensional data
manifold example updates
typo in HLLE
examples: make out_dim explicit
remove lobpcg from LocallyLinearEmbedding
merge with master; remove lobpcg references
initial commit
added compiled cython
assure C-ordered on init
fix NeighborsClassifier doctest
make memory allocation more efficient
documentation clarifications
ball_tree protocol 2, but paths are broken
Merge branch 'cython-ball-tree'
move ball_tree.pyx to scikits/learn/ and write pickle test
Merge commit 'upstream-RW/master' into cython-ball-tree
add BallTree pickle test cases
Merge branch 'cython-ball-tree'
refactor neighbors module
doc fixes
merge with upstream/master
Merge commit 'upstream/master' into neighbors-refactor
scikits.learn -> sklearn
add neighbors benchmark
change implementation to mixin pattern
move neighbors.py -> neighbors
fix doctests
merge upstream/master
move barycenter_weights to manifold
deprecation of NeighborsClassifier and NeighborsRegressor
Merge commit 'upstream/master' into neighbors-refactor
add deprecation warning to sklearn.ball_tree
Note neighbors module changes in doc/whats_new.rst
fix typos
gitignore: scikits.learn -> scikit_learn
Merge commit 'upstream-RW/master'
move neighbors examples to examples/neighbors/
Nearest Neighbors examples & documentation
switch to dynamically generated docstrings
commit dynamic doc changes
add weighting to classification and regression
add neighbors/tools to commit
add tests for weighted regression and classification
documentation of weighted classification and regression
add graphical neighbors benchmark
pep8 + move weighted_mode to utils
add tests & example for weighted_mode
benchmark -> bar plot
make constants uppercase
return to simple docstrings
increase BallTree test coverage
fix BallTree linkage
fix typos
Merge pull request #3 from ogrisel/jakevdp-neighbors-refactor
increase test coverage
pep8 + cosmetic changes
add warning flag to balltree + tests
warning_flag doc
add warning messages to KNeighbors
fixes for tests
attempt to address warnings catcher
hack to fix warning test
change warning message
simplify warning test; remove assert_warns from utils
bug: mode='LM' -> mode='LA'
remove unused return_log keyword in GMM
BUG/DOC: address manifold singularity issue
DOC: add utility information for developers
Move graph_shortest_path to utils/graph.py
remove duplicative utils.fixes.arpack_eigsh
Move validation utils to their own submodule
BUG: example plot compatibility with older matplotlib versions
Merge branch 'example-fix'
Merge pull request #4 from glouppe/dev-doc
randomized_range_finder -> randomized_power_iteration
Change logsum to logsumexp for comparability with scipy
BUG: fix scale_C bug in svm
TESTS: remove deprecated NeighborsClassifier calls
species datasets commit
clean up species distribution example
randomized_power_iteration -> randomized_range_finder
typo in fastica doc
Merge commit 'upstream/master' into util-docs
Merge commit 'upstream/master' into util-docs
DOC: add toc for developers resources
DOC: add warning that utils should only be used internally
use joblib for saving species data
Merge commit 'upstream/master' into dataset-fix
fix logsum test
Change deprecated behavior in feature agglomeration example
HACK: sphinx/prevent proliferation of build images in doc
simplify removal of _images dir
remove unneeded import
BallTree -> NearestNeighbors in Isomap
DOC: isomap fixes
convert LLE neighbors to NearestNeighbors object
BallTree -> NearestNeighbors in mean_shift
pep8
Merge pull request #501 from jakevdp/dataset-fix
remove unused import
remove unused imports
pep8
Merge commit 'upstream/master'
COSMIT: pep8
DOC: formatting
DOC: pep8, add quotations, and fix typos
fix for doc math issue
TYPO: generate all images
small simplification in LDA
add old version warning
add newline at file end
turn off old version warning
add random_state to LocallyLinearEmbedding
initialize indices and distances in balltree
check random state in _fit_transform
Address Issue #590 : use relative path link to about.html
Merge commit 'upstream/master'
ball_tree: more efficient array initialization
add info about valgrind to dev documents
Current version -> Latest version
Merge commit 'upstream/master' into old-version-warning
set warning margins to zero
allow for multiple nuggets in gaussian process
example + documentation of gaussian processes on noisy data
Merge commit 'upstream/master' into GPML-fixes
DOC: expand nugget explanation; combine two GPML examples
Merge pull request #6 from amueller/old-version-warning
fix link in warning
latest version -> latest stable version
BUG: fibonacci heap implementation
TEST: non-regression test for fibonacci heap bug fix
Generate c-code with cython 0.15.1
ENH: use shift-invert in spectral clustering
add detailed comment on ARPACK usage
DOC: add tutorial links
Merge branch 'cov-speedup' of git://github.com/vene/scikit-learn into vene-cov-speedup
speed up symmetric_pinv
additional speedup: all eigenvalues are real for symmetric matrix
TST: change LLE test to stable seed
DOC: fix documentation of arpack
Merge pull request #991 from jakevdp/doc-update
@jakevdp's version of pinvh
DOC: add google analytics theme option
clarify documentation for radius_neighbors
BUG update graph_laplacian to upstream SciPy version
Ball Tree, KD Tree, and tests
Fix tests for scipy <= 0.9
speed up KD tree construction by ~25%
add author & license information to pyx files
add median of 3 pivoting to quicksort
add pydist code
fix binary tree sort bug
add pydist: user-defined metric
add haversine distance
add exception passing to C functions
rename dist conversion funcs
Implement correct d-dimensional kernel norms
add metric mappings to dist_metrics
binary tree: make valid_metrics a class variable
dist_metrics: allow callable metric
add chebyshev distance to kd tree
add functionality to NearestNeighbors estimators
Roger-Stanimoto -> Rogers-Tanimoto
calculate kernel norm only once
compute kernel norm only once
TST: compare gaussian KDE against scipy version
Change dual splits to single splits in query_dual
Merge pull request #7 from jhale/new_ball_tree
add notes on implementation details to binary_tree.pxi
remove scipy cKDTree support from neighbors
add neighbors module changes to whats_new
Merge pull request #2104 from kastnerkyle/master
BUG: fix precision issues in kernel_density; remove buggy dual-tree KDE versions
add KDE Estimator class
add kwargs to PyFuncDistance
DOC: document the new neighbors functions & KDE
undo change to clustering example
fix conflicts with master
import KernelDensity from neighbors module
adjust math formatting in neighbors docs
fix NearestNeighbors to pass common tests
add KernelDensity to class list
set random seed in KDE example
skip KDE test to prevent failure due to older SciPy versions
fix typo: SkipTe -> SkipTest
fix doctest in neighbors
BUG: return proper algorithm in KDE
add species KDE example
PEP8: neighbors module
DOC: rearrange KDE examples
TST: increase test coverage in neighbors module
DOC: pep8 & formatting in neighbors docs
DOC: make doc tests pass
add 1D KDE example
DOC: small fixes to neighbors doc
DOC: move KDE discussion to separate page
add some notes and doc strings to neighbors cython code
add more documentation to ball tree and kd tree
DOC: tweak kde examples and move density docs
BUG: fix tophat sampling in KDE
Xplot -> X_plot
bt->tree; dm->dist_metric
Additional implementation notes in binary tree
BUG: use correct algorithm for callable metric
TST: set random state in callable_metric test
BUG: add new preprocessing module to setup.py
Merge pull request #2264 from jakevdp/setup_fix
neighbors numpy1.3 compat: fix typedefs, regen with cython 0.19
numpy 1.3 compat: use explicit type definitions
numpy 1.3 compat: make neighbors/dist_metrics compatible
COMPAT: make NeighborsHeap compatible with numpy 1.3
COMPAT: make NodeHeap compatible with numpy 1.3
COMPAT: make BinaryTree class compatible with numpy 1.3
COMPAT: make BallTree & KDTree compatible with numpy 1.3
COMPAT: last few BallTree/KDTree numpy 1.3 issues
BUG: type->dtype in a cross-platform way
compute offset in a cross-platform way
BUG: don't subtract offset in binary_tree
add explicit types to neighbors cython code
JakeMick (1):
TST added test of fit and transform for kernels for nystroem
James Bergstra (27):
k_means_ - added optional rng parameter to work routines
Centering data for k-means before fitting
k-means - added verbose-level print after initialization
added faster distance-computation algorithm to k-means _e_step
PCA train() stores eigenvalues associated with components
adding James Bergstra as author of k_means_ file
k-means adding all_paris_l2_distance_squared function
k-means - modified k_init to use pre-computed distances for faster, clearer code
k-means - added support for a callable "init" argument instead of copying all the k_init parameters as optional arguments - invite user to use a lambda or something
k-means - fixed misleading typo in error message
k-means - added optional parameters "precompute_distances" and "x_squared_norms"
k-means - added "verbose" parameter to KMeans class
k-means - added copy_x parameter to worker routine and BaseEstimator, allowing optional in-place operation
added optional args to euclidean_distances and removed k_means_.all_pairs_l2_distances_squared
fixed typo in my previous patch to PCA
added PCA.inverse_transform and unit test
added components_coefs_ (eigenvalues) member to RandomizedPCA to match PCA
test_pca - modified to use assert_almost_equal
euclidian_distances - repair special case for when X is Y
ENH: adding iter_limit to libsvm
FIX: committing updated Cython-generated libsvm bindings
ENH: Solver iter_limit emits warning instead of raising exception
ENH: renaming iter_limit -> max_iter
FIX: missing file hidden among the Cython output
ENH: hint about data normalization when SVC stops early
FIX: adding missing c files from cython
ENH: assert -> assert_equals
James McDermott (1):
DOC rename lambda to alpha in plot_lasso_coordinate_descent_path. (Re)-Closes #903.
Jan Hendrik Metzen (4):
Fixed bug in updating structure matrix in ward_tree algorithm.
Added test case that reproduces crashes in old version of ward_tree algorithm.
Performance tweaking in ward_tree.
FIX : Fixed bug in single_source_shortest_path_length in sklearn.utils.graph
Jan Schlüter (3):
Replaced wrong k-means++ implementation with a correct one.
Extended docstring, renamed variables from javaStyle to python_style, replaced tab-indents with space-indents, pep8
Use scikits distance functions instead of scipy's. Avoid recomputations of x_squared_norms wherever possible. Completion and unification of docstrings.
Jaques Grobler (278):
Added a note to the install documentation
Added a note to the contributors documentation
Shortened the long line
Added a small note about the use of an upstream remote in the Contributions documentation
Shortened a line in the code
Merge branch 'WIP_tut', remote-tracking branch 'gaelVaroqueux/stat_tutorial' into WIP_tut
- Further integrated tutorial.rst (Section 2 in Userguide) with links to
moved tutorial files into separate folder within main tutorial folder. added folder for section2 tutorial. fixed some links. removed savefigure from plot_cv_diabetes.py
Merge remote-tracking branch 'origin/master' into WIP_tut
Merge branch 'master' into WIP_tut
Removed savefig from tutorial plot files.
Updated tutorial folders in doc with placeholders for other tutorials. updated index.rst for the tutorial menu accordingly
added an html page for plot_digits_first_image.py
Added links to some keywords.
Links, image resize and updated ipython code in tutorial
Added a dataset image, some links and 'import sklearn' updates
Added Knn classification example image&html
changed colours of plots, added links
Fixed link typo
Merge branch 'master' into WIP_tut
Simple linear regression example added to tut
Fixed spelling error,import lines,figures and html links for shrinkage section
Added links, images and docstrings to some plot files
fixed plots to have class coloured datapoints
Fixed some figures, added links & corrected SVM Param C explanation
Fixed missing image and GUI download link
Image page fixed
added div.green to the theme for Exercises in scikit-tutorial
fixed link/updated some code
renamed file-names, finished model-selection, changed cv plot to use C
Section 4 done - images/links/htmls for images
All scikit tutorial images and links redone
Fixes for doctests
modified makefile for doctesting - not permanent
Merge remote-tracking branch 'origin/master' into WIP_tut
remove redundant file
removed redundant file
Better doctest time(wip),removed duplicate examples, update plot_ols.py
Merge remote-tracking branch 'origin/master' into WIP_tut
3 files moved into main example pool - links to them updated
Merged some examples into examples folder.
Merged a few examples into the example pool
delete redundant file, merged some examples and updated links
examples merged to example pool
deleted unused file, tutorial examples folder removed
replaced silence parameter in makefile, links removed in stat_learn tutorial, big_toc_css copy deleted, heading changed in tutorial index, tutorial index info added
added ELLIPSIS to 4 examples
added ... to ellipsis
Merge remote-tracking branch 'origin/master' into WIP_tut
merged ols and ridge variance + some neating
fixed links & neatening
moved exercises into separate folder, neating up
path fix of moved figure
fixed typo,changed 2.2s numbering, fixed 4 examples in exercises
fixed numbering in main User Guide
added collapsible sidebar - still WIP
Collapsible sidebar adding complete - appears to work well
Deleted redundant files
color change for button
comment added to gen_rst. Arrow added to button
Next button added:position correct,but does nothin
button is mostly working
spelling fixes
cleaned up
more cleaning-finished off
spelling errors,edit curse of dimensionality, explain top-down
bug fix - layout
changed hover colours for button
previous button added with hovering-effect
Merge branch 'master' into WIP_tut
fixed new doc-test error
Made old EllipticEnvelop deprecated class
changed message to *Use EllipticEnvelope instead*
Fixed broken image link
Removed `_plot` from the face recognition example
Added the name change for the recent change EllipticEnvelope
Changed GMM's API to suit the rest of sklearn
1.Fixed typo 2.Removed has_key entries
restored last changes
Fixed syntax error
mixture/plot_gmm* examples updated
restored last changes
DPGMM API updated, along with plot_gmm_sin example
DPGMM and VBGMM API change, example updated
modified test_gmm to match API changes in gmm.py
updated documentation for gmm,dpgmm and vbgmm
Changed variable name `x` to `covar_type`
Updated `whats_new.rst` with API change
Added `note` to tutorial index for `doctest_mode` in `ipython`
Merge branch 'master' of https://github.com/scikit-learn/scikit-learn
changes to `fit` and `__init__`
decision logic removed from __init__
API update for HMM types with docstrings
tests updated to match API
fixed example`s fit(..) to new API
made `diag` explicit in example
Fixed typos, spacing errors & updated `Whats New`
fixed broken GaussianHMM documentation generation
correct some wrong fixes
reversed the order of the thresholds array
metrics.py
test added for this
fixed typos,updated `whats new`
typo fixed in what`s new
added alternating columns for tables in documentation and a tighter layout in pre
docstring fixes
Merge branch 'master' of https://github.com/scikit-learn/scikit-learn
Fixed broken links on Support page
Fixed broken links on Support page
Merge pull request #974 from jaquesgrobler/master
fixed long-name-references madness + removed some whitespace
trailing whitespace removed
blank line removed
slight adjustment to header size
Merge pull request #1075 from jaquesgrobler/master
Merge pull request #1077 from ludwigschwardt/minor-fixes
Added scale_c fiasco example
gael`s suggestions/tweaks
docstring change
docstring fixes
changed includes back - change broke JENKINS build
not the problem after all - switch back
docstring changes
typos and alex`s review changes
small tweaks
changed includes back - change broke JENKINS build
not the problem after all - switch back
add first collapsible toctree test
moved buttons to themes
working version
Links now clickable
-collapse toc moved to front page-
button colour change + comments
fixes - seemingly good version
highlighting of + implemented
-line highlight bug fixed, buttons changed, full expansion added
small bug fix and colour tweak
nitpick fix
cleanups
cleanups
toggle bug fixed
highlight fix
what`s new updated
remove `steps` from Attributes of docstring
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Merge pull request #1331 from jaquesgrobler/master
Merge pull request #1367 from fannix/master
Merge pull request #1369 from AlexandreAbraham/fix_doc_clean
doc fix - trailing underscore and init param update
goodness of fit fix
trailing whitespace sentence
Merge pull request #1372 from jaquesgrobler/doc-fix-dev_guide
plot fix
variable name change
new example added for manifold learning
Andy`s suggestions
links for MDS
small changes
final changes
pep8
heading change
added links to astropy and scipy workflow guides
Merge pull request #1564 from jaquesgrobler/contributor_guide_links
remove 1000`s of warnings from example
Merge pull request #1592 from jaquesgrobler/master
Add temporary survey banner
remove the equaldistance code warning, replace with doc warnings
typo fix
remove warning
warning removal
update warning box
deprecation warnings, indent fix
andys suggestions and test
add warning for no internet
Merge pull request #1644 from jaquesgrobler/doc_url_error
TYPO fix
example title change
gallery effects,icon change,cleanups
typo fix and heading changes
fix indentation error-cause lots of build warnings
4 thumbs per row/hover effect/some cleanup
fix for iris dataset
line_count sort added, some changes reverted
move comment out of list
remove comment, undo change
Merge pull request #1803 from kmike/hmm
rename example title
Switch off survey banner
newline at end of file
Merge pull request #1581 from jaquesgrobler/example_gallery_cleanup
temp disable line-count-sort for gallery while fixing bug
sort-by-line-count bug fixed
Merge branch 'master' of github.com:scikit-learn/scikit-learn
fix numbering for tutorials page
Add bit more instruction on writing docs
big O/tilde add in
removed old complexity info
image and html file added
link fixes
add further links
last links fixed
jquerys added
integrated to tutorial index
update tutorial page
make links relative
rename image/html
add instructions for editing Readme, and script needed for that
remove svg2html script,toctree section added,doc page for ml_map created
sidebar added
layout fixes and top paragraph
TYPO fix
update what`s new
deleted unnecessary thumbnail
DOC improve description of cross validation
resized image
disable sidebar using cookies to remember last position
COSMIT pep8
Merge pull request #1884 from jaquesgrobler/ml_map
DOC added link to scipy lecture notes to tuts
Merge pull request #1924 from jaquesgrobler/FIX_sidebar_on_index_page
Merge pull request #1911 from Jim-Holmstroem/generalize_label_type_for_confusion_matrix
Merge pull request #1944 from jnothman/selectpercentile_limit_bug
fixed typo
maintenance scripts added for machine learning maps - needed for modifying the map in future
DOC Fix references to missing examples
fix incorrect reference
Merge pull request #1986 from jaquesgrobler/DOC_reference_fixes
add optional banner to index page to advertise code sprints
link updated
Merge pull request #1996 from jaquesgrobler/DOC_sprint_sponser_banner
hover removed from nature, jquery more recent version, containerexpansion on mouseover add
image resizing added
Zoom bug fixed
added docstring space to popup block
docstrings embedded into example hovers
Final visual effects added to hovering
Nelle`s review fixes addressed
Cross browser shadows covered
remove forgotten print
shorten displayed docstring to 95 chars
fix white space inconsistency between header and docstring
example docstring fixes
logistic regression example fix
Merge pull request #2056 from jnothman/leavepout_clarify
firefox bug fixed
classifiers comparison fix
DOC spellfixes
Donate buttons added `About us` and front page
donations paragraphs added
Merge branch 'master' of github.com:scikit-learn/scikit-learn
misalignment fix
example fixes to clean first docstring paragraph of rst code
fix merge conflict
border added for IE
make new classes for lasso_path/enet_path and deprecate old
rel_canonical prelim
Merge branch 'master' into ENH_docstrings_in_gallery
syntax fix
cleaned up-ready
Merge pull request #2017 from jaquesgrobler/ENH_docstrings_in_gallery
Small docstring changes for plot_ward_structured_vs_unstructered example, as mentioned in PR #2017
nitpick fixes, pep8 and fix math equations
removed old_version block test
Merge pull request #2205 from jaquesgrobler/ENH_rel_canonical
sidebar fix - sidebar.js was called before jquery. works fine under new version jquery too
sidebar/toctree harmonie, must still fix toggle
jquery reverted to 1.7.2 version. sidebar/toc-collapse works
DOC: few small doc fixes to layout bugs on new website
comments added to the changes
first carousel version added
firefox fix and more images added, auto-cycling disabled
arrows switched for dots
have images link to relevant examples
slight layout adjust
small layout changes for firefox, images taken from generated images now
indentation fixes
add more examples and cropping to first image
disable carousel for small displays, small tweaks
Jean Kossaifi (32):
Changed the default return type of ward_tree from bool to int
adding a comment on the test for grid_to_graph
pep8 and using np.bool instead of bool
FIX : _to_graph failed if mask's data was not of type bool
Test to check that the grid_to_graph function works with every type of
COSMIT : used implicit continuation inside parenthesis instead of
Typo : fix the 0.5 coefficient
Added normalize parameter to LinearModel
Added parameter normalize to LinearRegression
LassoLARS now uses the normalize parameter
Completed the integration of the parameter normalize
Implementation of the parameter normalize in bayes.py
added parameter normalize to coordinate_descent
added parameter normalize to ridge.py
Added parameter normalize to omp.py
Added parameter normalize
Fixed some errors (mainly docstrings)
Merge remote branch 'upstream/master' into normalize_data
Added a function as_float_array in scikits.learn.utils
Fix : deleted a forgotten line
FIX : corrected a bug in as_float_array and added a test function
PEP8 : replaced tabulations by spaces
FIX : if X is already of the right type, we mustn't modify it
FIX : if X.dtype is changed, then a copy of X is returned, even if overwrite_X is True
Test : lasso_lars_vs_lasso_*
Merge branch 'normalize_data'
FIX : Ellipsis in least_angle.py doctests
FIX : ELLIPSIS in least_angle.py doctests
Merge branch 'master' of https://github.com/scikit-learn/scikit-learn
Sorting parameters in BaseEstimator.__repr__
FIX : doctest fail
Cross_val : Removed useless & tricky parameter iid
Jim Holmström (10):
Added random_state=0 for AdaBoostRegressor
Replaced 'for i' with 'for _' at place where i is not used.
Extended test_confusion_matrix_binary to incorporate non-integer labels
Extended test_confusion_matrix_multiclass to incorporate non-integer labels
BUG: Fix for non-integer datatypes in confusion_matrix
ENH: faster preallocation and integer type for the accumulators
STY: one-lined lines that were less than 79
MAINT: let the result type be inferred by coo_matrix, possible since np.ones already integer typed
MAINT: refactored metrics.auc to use np.trapz
ENH: Added input checks in confusion_matrix
Jochen Wersdörfer (2):
ENH CountVectorizer using arrays instead of lists
ENH added multiclass_log_loss metric
Joel Nothman (79):
Fix comment: returns fbeta_score, not f1_score
ENH allow SelectKBest to select all features in a parameter search
DOC Allowing a list of param_grids means GridSearchCV is more than grids
DOC clarify relationship between pos_label and average parameters for
ENH/FIX make best_estimator_'s predict functions available in parameter search
FIX make *SearchCV picklable
REFACTOR combine train_wrap and csr_train_wrap
ENH call asarray on returned scores and pvalues
TST ensure SelectKBest and SelectPercentile scores are best
FIX ensure SelectPercentile only removes tied features in case of ties
ENH _BaseFilter.inverse_transform should respect dtype
DOC Fix comment for _BaseFilter.inverse_transform
ENH sparse _BaseFilter.inverse_transform
FIXTST fix errors introduced to feature selection tests
DOC comment feature selection sparse inverse_transform
Merge pull request #1935 from jnothman/base_filter_inv_transform
ENH Feature selection should use CSC matrices
COSMIT Remove redundant code in CountVectorizer
TST test CountVectorizer.stop_words_ value
ENH Use csr_matrix.sum_duplicates instead of tocoo
DOC small typographical fixes in grid_search documentation
COSMIT refactor roc_curve and precision_recall_curve
FIX bug where hinge_loss(..., neg_label=1) produced incorrect results
Merge pull request #1880 from NicolasTr/patch_extractor_float_max_patches
DOC Fix estimator unsupervised fit method signature
DOC clarification of parameter search
DOC fix typos
COSMIT shorten long line for pep8
ENH Create FeatureSelectionMixin for shared [inverse_]transform code
DOC rewrite descriptions of P/R/F averages and define support
DOC/COSMIT fix typos in What's New
DOC add some contributions to What's New
TST Use assert_almost_equal in test_symmetry
COSMIT prefer partial over lambda in test_metrics
TSTFIX use name, not metric, in test_metrics error messages
DOC correct note on handling 0-denominator in P/R/F
Merge pull request #2005 from kmike/test_pipeline_methods_preprocessing_svm
ENH faster unique_labels for big sequences of sequences
DOC explain labels parameter to confusion_matrix
DOC Detail on parent-child relationship in tree
FIX/COSMIT helper to identify target types
FIX cannot use set notation for Py2.6
FIX need explicit dtype for array of sequences in numpy 1.3
COSMIT remove redundant target size check
FIX numpy 1.3 has no float16; use float32
FIX/TST np.squeeze in numpy1.3 fails with array of sequences
FIX numpy 1.3 throws error with array of arrays
FIX use Python 2.6-compatible str.format
COSMIT refactor cross-validation strategies
Include LeavePLabelOut in refactoring
A further refactor
COSMIT Base class for KFold/StratifiedKFold validation
COSMIT make BaseKFold abstract
COSMIT pep8 in cross_val_score
COSMIT Base class for [Stratified]ShuffleSplit
DOC clarify LeavePOut's combinatoric explosion
DOC similar note in narrative docs
DOC More explicit note
DOC fix docstring headings
COSMIT make helpers private with underscore
COSMIT make BaseKFold private with underscore
TST additional tests for preprocessing.Binarizer
COSMIT add underscore prefixes where forgotten in cross_validation
COSMIT much simpler agglomeration inverse_transform
TST stronger test for agglomeration transforms
DOC minor fixes to Ward docstrings
DOC fix docstrings for AgglomerationTransform
DOC detail Ward.children_ and fix n_components_ type
DOC comment on Ward algorithm
DOC clean pooling_func arg type
DOC copy comment describing hierarchical clustering children
Merge pull request #2054 from ogrisel/invalid-n-folds
FIX avoid spectral_embedding naming conflict
Merge pull request #2085 from agramfort/fix_y_score_fa
Merge pull request #2090 from kanielc/fix_weight
COSMIT move deprecated parameter to end
COSMIT refactor document frequency implementations
ENH print number of fits in BaseSearchCV._fit
DOC fix comment on svm probability param
Johannes Schönberger (3):
Remove invalid todo comment
Add missing doc string printing for examples
DOC : fixes in covariance module
John Benediktsson (11):
tree: check length of sample_mask and X_argsorted.
DOC: fix typos in tree docstrings.
DOC: fix value error text in Tree.compute_feature_importances.
COSMIT: Use np.array.fill for scalar values.
COSMIT: doc fixes to sklearn.feature_selection.univariate_selection.
COSMIT: fix typo of homoscedasticity.
COSMIT: fix reference to scipy.stats.kruskal.
COSMIT: fix more typos.
DOC: fix 'Controls' typo in sklearn.ensemble.forest.
COSMIT: fix typo in AUTHORS.rst.
COSMIT: fix excessive indentation.
John Zwinck (1):
FIX use float64 in metrics.r2_score() to prevent overflow
Joonas Sillanpää (3):
Radius-based classifier now raises exception, if no neighbors found
Corrected some mistakes, added optional outlier_label parameter, which can be given to outliers
Fixed weight calculation from distances (1. / dist), and weight function in tests (lambda d : d ** -2)
Joshua Vredevoogd (1):
DBSCAN BallTree implementation
Juan Manuel Caicedo Carvajal (1):
Check for consistent input in Logistic Regression.
Julien Miotte (2):
Fetching every figure generated by the example scripts.
Since we changed the name of the figure names, changing the rst files.
Justin Pati (1):
changed warnings in grid_search.py related to loss_func and score_func being passed
Justin Vincent (19):
PY3 xrange, np.divide, string.uppercase, None comparison
TST + PY3 various fixes
Got all the doc-tests working
Merge in master
More python3 fixes (and just plain bugs)
use ELLIPSIS in doctest to deal with numpy changes.
Forcing the deprecation warnings to happen while in get_params.
Force warning to be heeded in deprecated args check. Possibly fixed a test bug (but maybe I just got it wrong)
Make a test not dictionary order dependent.
Fix up last doc tests.
Make the fixes 2.6 compatible
ELLIPSIS around a unicode issue.
Fix y vector. We wanted round off division so that y == [0 0 1 1 2 2 ...], not [0 .5 1 1.5...]
A little more of those unicode helpers
Another ELLIPSIS
Pop off the recently added filter after testing for deprecation warnings.
merge in origin
Comment change
Fix two remaining python3 bugs.
Kamel Ibn Hassen Derouiche (1):
FIX: compilation issues under NetBSD
Keith Goodman (2):
DOC: minor typos in covariance doc.
BUG: price accidentally used instead of volume
Kemal Eren (99):
ridge regression uses compute_class_weight()
Re-add deprecated class_weight parameter.
removed class_weight parameter from RidgeClassifier.fit()
check_pairwise_arrays() preserves dtype==numpy.float32
implement spectral biclustering and spectral co-clustering
wrote tests
wrote methods for generating bicluster data
added option to return piecewise vectors
cast data in fit()
made internal functions private
use random state in test
removed pickle test
shorten first lines of test docstrings
use random state in preprocess tests
duck typing, minor corrections: spacing and typos
fixed exceptions and their messages
updated svd()
better array validation
use random state in data generator
tests reuse data generators
user may select svd method
Added to docstring
split spectral biclustering into two classes
removed unused code
test bad arguments
now supports sparse data
check n_clusters parameter more thoroughly
made base class an abstract class
checkerboard panels may have arbitrary values.
fixed exception type
removed empty mixin
started biclustering documentation and examples
shorter array slicing
made some methods into private methods
cleaner use of check_arrays()
named arguments
use safe_sparse_dot()
use np.random.RandomState directly
do not do any checks during __init__()
do not use mutable default arguments
added new tests for sample data generators
fixed bug in make_checkerboard(), so tests pass again
use assert_all_finite
skip permutation test for now
fixed some errors reported by pyflakes
raise exception instead of converting sparse arrays to dense
expanded biclustering documentation
corrected k_means in docstring
rearranged imports from general to specific
moved and renamed _make_nonnegative() and _safe_min()
added option to use mini-batch k-means
use dia_matrix
renamed 'preprocess' to 'normalize'
use sklearn.utils.extmath.norm
base class __init__ is no longer abstract
added more information to error messages
also use norm in _project_and_cluster()
make test more sparse
made 'bicluster' a submodule of 'cluster'
removed svd_kwargs argument
added n_svd_vecs parameter
tests use ParameterGrid to avoid deep nesting
replaced kmeans_kwargs with some useful k-means parameters
updated documentation
keep biclustering algorithms in submodule
renamed examples; added to example docstrings
re-added bicluster mixin, this time with some functionality
wrote newsgroup biclustering example
fixed a few things in examples, documentation, and docstrings
wrote bicluster scoring using jaccard index and hungarian matching
removed some parameters to speed up test
added default arguments to base class's __init__ to make test pass
test_make_checkerboard was wrong after api change
added documentation for bicluster evaluation
moved shuffle functionality to utility function
added consensus score to bicluster examples
renamed example to get output to work
made bicluster utilities for dealing with indicator vectors
index in one go. added sparse test.
documentation and docstring fixes
merged newsgroup example with Vlad's
moved bicluster examples to their own category
reduced noise in spectral coclustering example
updated newsgroups example
added n_discard parameter to _svd()
check value of n_components and n_best
a fix for nan values in singular vectors.
wrote tests to ensure svd works on perfect checkerboard
redundant phrase in docstring
put biclustering section after clustering section in reference
misc. fixes
changes to newsgroups example:
fixed some docstrings: backticks and missing parameters
updated setup.py
added myself to authors; added biclustering to whats new
examples use matplotlib.pyplot instead of pylab
consistency changes:
removed plot_ from newsgroups example file
import biclustering methods in sklearn.cluster and sklearn.metrics.cluster
Ken Geis (4):
Changed the setup instructions in the README to properly install the package in the user home.
FIX mbkmeans benchmark bug (k instead of n_clusters)
FIX off-by-one error in neighbors benchmark
ENH lots of benchmarks fixes
Kenneth C. Arnold (4):
Cosmit
Cosmit
fast_svd: factor out the randomized range finder (more generally useful)
Mark Cython outputs as binary so their changes don't clutter diffs.
Kernc (12):
KNeighborsClassifier now has a predict_proba() method
reversed changes to KNeighborsClassifier.predict()
a simple test case for KNeighborsClassifier.predict_proba()
feature_extraction.text.CountVectorizer analyzer 'char_nospace'
Oneliner docstring
words for n-grams padded with one space on each side
missing unicode modifier
replaced str.format() with string concatenation as it's 3 times faster
char_nspace -> char_nospace, thanks Lars
changed 'char_nospace' keyword to shorter and meaningful 'char_wb'
some narrative documentation...
mentioned 'char' vs 'char_wb' in the narrative
Kevin Hughes (1):
ENH actually use scikit-learn's PCA class in plot_pca_3d.py
Kyle Beauchamp (10):
Added code to address issue #1403
In preprocessing.binarize, eliminate zeros from sparse matrices
Added feature for issue #1527
Minor PEP8 fixes for issue #1527
Minor docstring fix for issue #1527
Added tests and docs for normalized zero_one loss
Fixed pep8 spacing issue and floating point doctest issue
Added CSC matrix testing for binarize and added type tests.
Added MinMaxScaler inverse_transform for issue #1552
Dummy commit to trigger travis
Kyle Kastner (6):
Removed pl.axis('tight') and set the plot limits with pl.xlim(), pl.ylim(). pl.axis('tight') appears to be adding whitespace around the colormesh
Added decision_function support to OneVsRestClassifier and a test, test_ovr_single_label_decision_function, in test_multiclass.py
Updated fixes for #2012.
Strengthened tests for OneVsRestClassifier decision_function
Cleaned up tests, and removed unused multilabel parameter in decision_function_ovr
Inlined extraneous function call from decision_function and added a check that the base estimator has a decision_function attribute
Kyle Kelley (1):
Converted Markdown style link to restructured text
Lars Buitinck (832):
Make ball tree code safer and 64-bit clean
Cleanup lib{linear,svm} C helper routines
Spellcheck and formatting in developers' docs
typo
Updated installation instructions
Merge pull request #160 from larsmans/master
Be more explicit about coverage testing
cosmetic change to ball tree C++ code
cosmetic doc changes
cosmetic: pep8 in utils/ + rewrote factorial (2x as fast)
factorial should not use O(n) memory
Python 3-safe attempted import of factorial and combinations
typos in README
typos in covariance docs
Merge branch 'master' of git://github.com/scikit-learn/scikit-learn
Merge branch 'master' of https://github.com/amitibo/scikit-learn into amitibo-naive-bayes
naive bayes: copyedit + rename alpha_i to alpha
ENH: optional and user-settable priors in multinom naive bayes
naive bayes: minor fixes
Merge sparse and vanilla naive Bayes
docs + cosmit in naive_bayes
naive bayes: handle 1-d input
ball tree cleanup & 64-bit safety
naive bayes: fix predict_proba bug and change priors behavior
fix naive bayes docs and example + credit mblondel + vanity
typo: interation/iteration + re-Cythonize cd_fast.pyx
Merge branch 'master' of github.com:scikit-learn/scikit-learn
naive bayes: test pickling
naive bayes: safe_sparse_dot, doc and docstring updates
rename MultinomialNB params, rename GNB GaussianNB
reformulate MultinomialNB as linear classifier
NB: add class_log_prior_ and feature_log_prob_ back as properties
NB cosmit: *feature* independence
cosmit: expand MultinomialNB docstring
Safer importing in grid_search module
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Merge pull request #184 from larsmans/amitibo-naive-bayes
rm references to naive_bayes.sparse in docs
NB: rename use_prior to fit_prior
slightly improved logging in a few easy cases
rm self.sparse attr in MultinomialNB; not needed outside of fit
fix priors bug in MultinomialNB
2010 is so last year
Merge branch 'mldata' of https://github.com/pberkes/scikit-learn into pberkes-mldata
Improved error handling + reduce memory use
Simplify intercept fitting in MultinomialNB
Error in MultinomialNB docs
Added naive Bayes classifier for multivariate Bernoulli models
some documentation for BernoulliNB
Do binarizing in BernoulliNB
Simplify binarizing in BernoulliNB
fix message in document classification example
Merge branch 'master' into bernoulli-naive-bayes
Optimize BernoulliNB + improve docstring + add to doc-class example
Copyedit preprocessing docs
Refactor MultinomialNB: separate prior estimation and feature counting
Use unique from utils.fixes in naive_bayes
Fix bug in MultinomialNB: output transposed
Replace loop in MultinomialNB._count with dot product + pep8
BUG: binary classification failed in MultinomialNB, +regression test
Fix 404 from broken URL in release log
ENH: fit_transform on TfidfTransformer
add C parameter to LinearSVC docstring
Fix pprett's website URL (<> caused it to be a relative URL)
Merge branch 'master' into bernoulli-naive-bayes
Refactor MultinomialNB and BernoulliNB: introduce BaseDiscreteNB
vectorize loop in BernoulliNB for 100x speedup in sparse case
svmlight reader: don't use leading _ in identifiers
Merge branch 'svmlight_format' of git://github.com/mblondel/scikit-learn into mblondel-svmlight
SVMlight reader: minor fixes
SVMlight reader: ensure C calling conventions + docstring
Plumb memory leak in SVMlight reader
SVMlight reader: one more clear() instead of delete
SVMlight reader: cosmetic
SVMlight reader: skip one level of indirection
Simplify and document SVMlight/libSVM data reader
Use C++ exception handling in SVMlight reader.
finish exception handling in SVMlight reader
Extend MultinomialNB tests to BernoulliNB
Update BernoulliNB docs
BUG: broken doctest in BernoulliNB
Glitches in BernoulliNB and DiscreteNB (mostly docs)
Merge pull request #210 from larsmans/bernoulli-naive-bayes
SVMlight reader: memory leak, type test
(Hopefully) full exception safety in SVMlight reader
datasets/mldata.py is not a script, chmod 644
Python 2.5 and SciPy 0.7 (tentative) compat in mldata
Fix broken doctest in mldata
document placement new in SVMlight reader
fit_transform does NOT return self + other docfixes
Parallel vectorizing is slower than serial
Rewrote SVMlight parse_line with C++ iostreams
SVMlight reader: some extra tests + cleanup
Adapt kNN classifier to sparse input
Use new utils.atleast2d_or_csr in naive_bayes as well
document placement new in SVMlight reader
Document sparsity in k-NN
Correctly document sparse input possibilities in naive_bayes
Merge branch 'master' into sparse-knn
Add sparse k-NN test, fix a bug
Extend sparse k-NN test to try pairs of sparse matrix types
Fix bug in sparse k-NN and add disabled (!) test for sparse regression
Better document scipy.sparse support in neighbors module
Prevent some copying in neighbors + docstring for euclidean_distances
Use 10 neighbors in k-NN document classification
neighbors: check string equality with ==, not is
Copyedit SparsePCA docs
Copyedit SparsePCA docs
Merge pull request #219 from larsmans/sparse-knn
Some doc copyediting
Change normalization behavior in TfidfTransformer
Docfixes in feature_extraction.text
Remove bogus sparse vectorizing tests
docfixes in feature_extraction.text
document classification example doesn't demo only linear classifiers anymore
make parse_file in SVMlight reader static
Fix broken doctest in NeighborsRegressor
Search tfidf__norm space in text class. grid search example
Merge pull request #228 from larsmans/tfidf
Use four categories instead of all in doc. class. example
Optimize CountVectorizer.fit_transform (+ minor refactoring)
pep8 feature_extraction.text + rm content word "computer" from stop list
DOC: Expand and copyedit naive Bayes docs
Recythonize libsvm.pyx with Cython 0.14
Refactor/simplify CountVectorizer
Refactor feature_extraction.text (again) to use Counter
Replace mixture.logsum with numpy.logaddexp
on demand inverse vocabulary
Implement fit_transform for Vectorizer as well and document it
Default argument safety + cosmit in feature_extraction.text
typo
DOC fixes in datasets
Merge pull request #234 from larsmans/inverse-vectorizer
FIX hmm.py to succeed tests; stopgap, put old logsum.py in that module
FIX and ENH feature_extraction.text.CountVectorizer
default arg safety + docfixes
Started one-hot transformer
Merge branch 'master' of github.com:scikit-learn/scikit-learn
FIX broken test for CountVectorizer
Revert "Started one-hot transformer"
DOC grid_search + pep8
Refactor naive_bayes and don't treat BernoulliNB as linear model
ENH show top 10 terms per category in document classifier example
DOCFIX typos in svm module
cosmetic changes to DBSCAN
vectorize loop in DBSCAN with np.where
Cosmit DBSCAN test
DOCFIX DBSCAN: we use arrays, not matrices
Streamline imports in lfw.py: don't try anything with PIL
Restore conditional PIL import in datasets.lfw
DOC: copyedit docstrings in pls.py + (almost) pep8-clean
pep8 and docfixes in various modules
pep8 and docfixes for LLE
suppress division by zero warnings from precision, recall, f1
simplify np.seterr handling in sparse_pca
Use isinstance instead of the ancient (Py2.1) types module in fastica
FIX handles NaNs in LogisticRegression, and many more classes
assert_all_finite: pre-check if we're dealing with floats.
Rework assert_all_finite and related functions in utils
callable now actually allowed in fastica
disallow sparse input in dense liblinear
Merge pull request #259 from larsmans/input-validation
Chmod 644 feature_extraction/image.py: not a script
FIX more useful diagnostics in mlcomp_sparse_document_classification.py
Add χ² feature selection
Demo chi2 feature selection on document classification
document Chi2 feature selection
ENH test and fix chi2 feature selection
Rename f_chi2 to chi2
avoid mutable default arguments
s/euclidian/euclidean/g
More mutable default args
ENH decorator to mark functions and classes as deprecated
New-style deprecation of datasets.load_files
deprecated decorator won't work on __init__; skip it
make deprecated work on classes
typo
ENH optimize euclidean_distances for memory use
document and test sparse matrix support in euclidean_distances
ENH optimize idf computation in TfidfTransformer using np.binsort
ENH and DOC TfidfTransformer
FIX add idf smoothing to Vectorizer as well, defaulting to True
More specific exception in GaussianProcess + regression test
(Micro)optimization in DBSCAN
fix DBSCAN bug (oops)
new-style deprecation of load_20newsgroups
ENH set_params method on BaseEstimator, deprecate estimator params to fit
set_params: update according to @GaelVaroquaux's review
Rm k param from KMeans.fit again
DOC improve fbeta docstring
minor fixes in clustering metrics
cosmetic changes to ari_score
rename ari_score adjusted_rand_score
pep8 sklearn/utils/__init__.py
refactor linear models to call as_float_array only from _center_data
unconditionally call as_float_array in LinearModel._center_data
DOC: fix typos
DOC small stuff in base.py and multiclass.py
trees: don't use deprecated cross_val, error messages, use super
typo: threhold -> threshold
DOC minor editing to naive_bayes docs
Merge branch 'tmp'
rename overwrite_Foo params to copy_Foo (and inversed their meaning)
document overwrite_ -> copy_ API change in ChangeLog
BUG LinearSVC.predict would choke on 1-d input (+ regression test)
more helpful error message in SGDClassifier.predict_proba with wrong loss
Merge pull request #357 from larsmans/overwrite-to-copy
fix doctest failures in linear_models docs
refactor and simplify naive_bayes
prevent some copying in sparse SGD
BUG adapt text feature grid search example to new 20news loader
BUG fixed and cosmetics in CountVectorizer
BUG + optimization in GaussianNB
refactor common code of NB estimators into BaseNB class
Refactor/simplify naive Bayes tests
API change: 1-d output from BaseNB.predict_(log_)proba in binary case
ENH SGD error messages better still
FIX embarrassing SyntaxError in linear_model.base
BUG multiclass.predict_binary still relied on old MultinomialNB.predict_proba
DOC prob_predict -> predict_proba in SVM docstrings
Revert "BUG multiclass.predict_binary still relied on old MultinomialNB.predict_proba"
Revert "API change: 1-d output from BaseNB.predict_(log_)proba in binary case"
refactor SVMlight reader and writer
API change in SVMlight reader: handle multiple files with svmlight_load_files
Retry "BUG fixed and cosmetics in CountVectorizer"
CountVectorizer.fit_transformer refactoring, part N
Micro-optimize NMF for memory usage: topic spotting example down by ~17%
Replace two more flatten()s in NMF with ravel()s
FIX broken doctests in NMF + pep8
Allow sparse input to NMF
NMF: cosmit
Refactor ensemble learning code
FIX Issue 379 and use the opportunity to refactor libsvm code
DOC copy-edit naive bayes doc, with an emphasis on the formulas
COSMIT in chi² feature selection
DOC ported latexpdf target from Sphinx 1.0.7-generated Makefile
DOC typos in Ward tree docstring
COSMIT little things in hierarchical.py
BUG NMF topic spotting example would output n_top_words-1 terms
DOC explain multiclass behavior in LogisticRegression
COSMIT pep8 feature_extraction.text
DOC some stuff on input validation
ENH Cython version of SVMlight loader
ENH accept matrix input throughout
COSMIT rename safe_asanyarray to safe_asarray to prevent confusion
DOC correct Google URL
pep8 grid_search.py
FIX replace np.atleast_2d with new utils.array2d
DOC correct and clean up empirical covariance docstrings
ENH test input validation code on memmap arrays
Merge pull request #410 from larsmans/accept-matrix-input
ENH sample_weight argument in discrete NB estimators
BUG handle two-class multilabel case in LabelBinarizer
TEST better test for binary multilabel case in LabelBinarizer
ENH multilabel learning in OneVsRestClassifier
DOC OneVsRestClassifier multilabel stuff
ENH multilabel support in SVMlight loader
DOC multilabel classification in narrative docs
FIX Python 2.5 compat in utils/tests
COSMIT multiclass.predict_ovr
DOC expand Naive Bayes narrative doc (BernoulliNB formula)
COSMIT in naive_bayes
ENH prevent copy in sparse.LogisticRegression
Revert "ENH prevent copy in sparse.LogisticRegression"
DOC typos and style in linear_model docs
COSMIT cleanup sgd Cython code
DOC update cross validation docstrings for default indices=True
BUG handle broken estimators in grid search by cloning them
ENH don't require numeric class labels in SGDClassifier
BUG fix SGD doctests
BUG fix Naive Bayes test + refactor module
DOC typo
ENH support array-like y (lists, tuples) in GridSearchCV
ENH support arbitrary labels in metrics module
COSMIT rm comment in coord descent code about np.dot
COSMIT no need for csr_matrix "cast" in coord descent
ENH prevent copy in PCA if not necessary
FIX use super consistently in SVMs
ENH incrementally build arrays in SVMlight loader to reduce memory usage
Merge pull request #446 from larsmans/svmlight-loader-memory-use
DOC typos in ensemble.forest
drop Python 2.5; no more with statements from the __future__
drop Python 2.5; no more need for utils.fixes.product
drop Python 2.5; document and rm some workarounds for kwargs quirks
COSMIT rm some SciPy pre-0.7 compat code
raise TypeError instead of ValueError in check_arrays
COSMIT docstring fix + US spelling in K-means code
DOC I don't think Ubuntu 10.04 will be the last LTS release
test @deprecated using warnings.catch_warnings
COSMIT use utils.deprecated as a class decorator
don't use assert_in, not supported by nose on buildbot
Revert "FIX: more python2.5 SyntaxError"
Revert "FIX: python2.5 SyntaxError"
COSMIT use urlretrieve and "with" syntax in LFW module
COSMIT use ABCMeta in naive_bayes
COSMIT a few more easy cases of with open syntax
rm Py2.5 compat factorial and combinations from utils.extmath
use cPickle in spectral clustering tests
COSMIT use Python 2.6 except-as syntax
DOC rm Methods section from KMeans docstring
BUG typo in NB error msg
DOC fix datasets.load_digits example
DOC fix datasets.load_digits example, second attempt
COSMIT rename load_vectorized_20newsgroups + DOC + pep8
Merge pull request #2 from mblondel/multilabel
BUG only handle labels specially in SVMlight loader + multilabel
BUG fix off-by-one error in SVMlight format loader
DOC multilabel learning: note that it's experimental + @mueller's remark
DOC document svmlight file loader changes in changelog
COSMIT reorganise utils tests
TST add test for sklearn.utils.extmath.logsum
DOC copyedit kernel approximations docstring
DOC kernel approximations, some last bits
DOC unbreak kernel approx docstrings (UTF-8 + s/References/Notes/g)
Merge branch 'master' into multilabel
ENH add multilabel_ property to OvR and raise NotImplementedError in score
ENH demo sparse KMeans on 20news set (it's slow!)
Merge remote-tracking branch 'vene/lars_multilabel' into multilabel
BUG forget a return keyword in OvR classifier
DOC describe test_ovr_multilabel better
TST extra test for LabelBinarizer's multilabel behavior
COSMIT set union in LabelBinarizer
ENH improve stoplist handling in feature_extraction.text
DOC rm References sections in docstrings
DOC I broke the docs and I liked it
COSMIT make BaseLibSVM an abstract base class
BUG input validation in kernel approximations + pep8
BUG fix Vectorizer to play nicely with Pipeline
Revert "BUG Disallow negative tf-idf weight"
PY3K fix in datasets.samples_generator
scikits.learn -> sklearn migration in label propagation
BUG don't pass estimator params to fit in label propagation
DOC cosmetics in SVM docstring
COSMIT reintroduce ABCMeta into BaseSGD*
BUG refactor SGD classes to not store sample_weight
COSMIT rm unused svm.base.dot
BUG use ValueError in BaseLibSVM.coef_
BUG update test for SVMs raising ValueError for coef_
COSMIT remove superfluous imports in svm/sparse/base.py
BUG don't use deprecated attributes in GaussianNB.predict
remove deprecated Neighbors{Classifier,Regressor}
ENH raise ValueError in metrics instead of AssertionError
ENH intercept_ on linear OvR clf + change exception to AttributeError
DOC pep257, or "sentences end with a full stop"
ENH input validation in DBSCAN
DOC rm confusing line in BernoulliNB docstring
FIX small stuff in new tomography example
factor out some common code in dense/sparse SGD
prevent a copy in SGD regressor fitting
refactor SGD, part 2: simplify parameter passing
refactor SGD, part 3: factor out more sparse/dense common code
COSMIT rm no-op conversion in SGDRegressor
BUG restore symbolic class label support in SGD + test it
ENH merge dense/sparse LinearSVC, part 1: no more SparseBaseLibLinear
ENH merge dense/sparse LinearSVC, part 2: no more sparse.CoefSelectTransformer
ENH merge dense/sparse LinearSVC, part 3: deprecate sparse.LinearSVC
ENH merge dense/sparse LinearSVC, part 4: deprecate sparse.LogisticRegression
DOC reference for logistic regression training with liblinear
COSMIT refactor liblinear bindings
TST merge dense and sparse LogisticRegression tests
Merge branch 'master' into merge-linearsvcs
COSMIT fix ugly import, left over from LinearSVC refactoring
DOC put merged LinearSVC and LR in changelog + explain @mblondel's work
BUG fix SGD doctest
Merge pull request #561 from larsmans/merge-linearsvcs
BUG promote type-safety in murmurhash
BUG make coef_ 1-d in Naive Bayes for binary case
BUG replace assert by custom exceptions
COSMIT refactor SGD code further
Revert "COSMIT refactor SGD code further"
ENH merge sparse and dense SVMs, part 1
ENH merge sparse and dense SVMs, part 2
ENH merge sparse and dense SVMs, part 3: adapt sparse tests
DOC merge sparse and dense SVMs, part 4
Merge pull request #576 from larsmans/merge-svms
DOC improve intro to Git in the developers' documentation
DOC rm unused param from sparse.ElasticNet docstring
COSMIT abstract base class in univariate feature selection
ENH sublinear tf scaling in TfidfTransformer
DOC s/with dense data// in merged SGD module
refactor SGD regression input validation + doc fixes
ENH more generic dict-like test in CountVectorizer
DOC typos in whats_new
DOC typos
DOC typo
COSMIT refactor SGD with Dataset factory function
COSMIT rename _mkdataset function in SGD
ENH add DictVectorizer
ENH test feature_extraction.DictVectorizer
DOC syntax error in DictVectorizer docstring
COMPAT turns out collections.Mapping has an iteritems member
ENH add test for DictVectorizer.restrict
DOC + ENH DictVectorizer: complete docs, add dict_type param
COSMIT disable liblinear I/O code
ENH implement one-of-K/one-hot coding in DictVectorizer
COSMIT rename DictVectorizer source files
ENH optimize DictVectorizer (sparse case)
TEST more strict test for one-of-K coding in DictVectorizer
DOC narrative documentation for DictVectorizer
DOC + pyflakes in DictVectorizer
ENH reduce memory usage of DictVectorizer.transform in sparse case
BUG fix doctests for DictVectorizer (nose 0.X compat)
Merge branch 'dictvectorizer'
COSMIT simplify input validation in KMeans
DOC small fixes to NearestCentroid classifier
BUG disallow shrinking with sparse data in NearestCentroid
DOC typos, line-width and minor stylistic fixes in pipeline module
COSMIT shallow copy of steps in Pipeline + code style
Merge pull request #741 from ogrisel/sorted-dictvectorizer
COSMIT use sorted instead of list.sort in DictVectorizer
DOC small fixes to DictVectorizer documentation
BUG fix issue #753, "Sparse OneClassSVM missing argument to super()"
BUG re-allow zero-based indexes in SVMlight files
COSMIT replace utils.testing.assert_in with Nose-compatible functions
DOC + FIX DictVectorizer: actually support single Mapping arg in transform
ENH zero_based="auto" support + better n_features=None in load_svmlight_files
COSMIT vanity + license for ArrayBuilder
COSMIT refactor SVMlight loader
ENH fit_predict convenience method on KMeans and MiniBatchKMeans
Merge pull request #729 from larsmans/fit-predict
COSMIT pep8 SVMlight loader
BUG close files in time in SVMlight loader (with statement)
TEST + FIX zero_based="auto" behavior in SVMlight loader
DOC + PEP8 SVMlight loader
Merge pull request #756 from larsmans/svmlight_fix
DOC typo
COSMIT pep8 document classification example
DOC typo in example
DOC clarify zero_one_score
DOC typo
revert PLS param rename + move input validation out of loop
BUG chi² feature selection didn't work for COO matrices
ENH export f_oneway from feature_selection module
BUG ensure that SelectKBest actually selects k features
DOC clarify __check_build messages
DOC instruct new devs to *always* work in branches
COSMIT pyflakes + pep8 linear_model/base.py
ENH generalize LabelBinarizer to arbitrary Sequence types
BUG remove debugging statements from multiclass
BUG in LabelBinarizer (forgot to run the full testsuite)
DOC fixed sentence that was missing a verb
rm deprecated euclidian_distances synonym
ENH fix and test LabelBinarizer's handling of string labels
ENH import liblinear 1.91
COSMIT make a liblinear C private helper function static
BUG set new p parameter in liblinear helper
ENH support opening compressed files in SVMlight reader
ENH always support file descriptors in SVMlight loader
DOC typo in docstring
BUG do not close fd passed by user in SVMlight loader
FIX NearestCentroid.fit could not handle sparse formats other than CSR
DOC typo
DOC fix dead link
DOC + COSMIT additive chi² sampler
ENH scipy.sparse support in additive chi² sampler
DOC output from additive chi² sampler
COSMIT refactor input validation code and tests
COSMIT + DOC input handling and docstrings in RandomizedPCA
ENH classes_ on OvR classifier
DOC typos
COSMIT remove some dead code
BUG remove predict{_log,}_proba from SVR
COSMIT cleanup tests with pyflakes
ENH better input validation for dump_svmlight_file
ENH make generated SVMlight files self-describing in a comment
COSMIT don't call magic methods directly
ENH allow user-specified comment in SVMlight dumper
rm the long-deprecated scikits.learn package
TST: improve coverage of feature_selection.SelectorMixin
COSMIT suppress warning from qr_economic + docstring on Counter
TST absolute imports in spectral clustering tests
ENH more specific warning filter for qr_economic
TST upgrade trivial (single-class) k-NN problems to binary ones
DOC + TST vocabulary arg in CountVect docstring
COSMIT move BaseSGD to its only place of usage
COSMIT minor refactoring of SGD
DOC tutorial: explain what an estimator is
DOC rewrote logistic regression docs
DOC yet another AKA
DOC copyediting
TST (near-)empty lines and explicit zeros in SVMlight loader
COSMIT use property.setter in sklearn.svm
ENH performance of TfidfTransformer
COSMIT replace useless safe_sparse_dot in chi2 with np.dot
BUG fix broken top-10 features printing in text clf example
DOC copyedit HMM documentation
COSMIT const and void* correctness in liblinear wrapper
ENH refactor liblinear prediction code and add classes_ member
COSMIT liblinear C code cleanup
COSMIT comment out more unneeded liblinear code
DOC + COSMIT LogisticRegression: docstring + rewrite predict_proba
Merge pull request #1141 from pprett/sgd-predict-proba
DOC small fixes to SGD docstrings
COSMIT rm svm.sparse tests to prevent deprecation warnings
ENH micro-optimizations in SVMlight loader
BUG rm RidgeClassifier from 20newsgroups
Merge pull request #1143 from larsmans/refactor-liblinear
ENH no more distinction between "sparse" and "dense" LinearSVC
COSMIT rm deprecated SGDClassifier.classes property
COSMIT clarify L1/L2 LR sparsity demo
DOC fix link for IsotonicRegression
DOC fix IsotonicRegression docstrings
BUG allow array-like y in RFE
DOC RFE docstring + link RFECV in narrative docs
BUG rm LARS from linear_model.__init__
COSMIT refactor linear classifiers
TST improve Ridge test
COSMIT use LinearClassifierMixin in RidgeClassifier
COSMIT + DOC univariate feature selection
COSMIT re-indent docstring for safe_mask
BUG make GridSearchCV work with non-CSR sparse matrix
COSMIT rm deprecated class_weight from fit in Ridge
Revert "BUG rm RidgeClassifier from 20newsgroups"
ENH add max_iter argument to Ridge estimators
DOC Ridge improvements in whats_new
Merge pull request #1169 from larsmans/ridge-cg
COSMIT rm deprecated stuff -- lots of it
DOC rm references to deprecated stuff
TST writable coef_ and intercept_ on LogisticRegression
ENH let DictVectorizer build a CSR matrix directly and use array.array
DOC DictVectorizer returning CSR in ChangeLog
Merge pull request #1193 from larsmans/dictvectorizer-csr
COSMIT error messages in GenericUnivariateSelect
ENH perform feature selection on scores, not p-values, when possible
DOC some improvements to FeatureUnion docs
DOC LaTeX error in SVM narrative docs
ENH better error messages in CountVectorizer for empty vocabulary
TST CountVectorizer with empty vocabulary
Merge pull request #1208 from larsmans/check-empty-vocabulary
Merge pull request #1211 from kcarnold/gitattributes
DOC typos in README
DOC feature selection by scores instead of p-values
DOC various typos and other minor stuff
DOC clarify zero_based's implications in SVMlight loader
Merge pull request #1204 from larsmans/mi-feature-selection
BUG + DOC l1_ratio in SGD and CD
COSMIT correct error msgs in SGD and make them more consistent
Merge branch 'pr/1214'
DOC let BibTeX handle its own capitalization, except for {P}ython
BUG NaN handling in SelectPercentile and SelectKBest
COSMIT rm unused import
COSMIT website address + copyedit in __init__.py
DOC move implementation details on mixins to comments
Revert (rebased) merge of euclidean_distances speedup
ENH allow more than 1000 linear SVMs with custom random seeds
BUG halve the number of LinearSVCs
COSMIT use np.clip in SGD
ENH fit_transform on KMeans
ENH input validation in chi2, error for negative input
Merge branch 'master' into pr/1279
ENH OneHotEncoder docs + TypeError + test active_features_
ENH cut down on memory use of text vectorizers
DOC copyedit tutorials
COSMIT rm outdated file of changes to liblinear
Merge pull request #1335 from robertlayton/clustdocs
DOC typo in k-means docs
Merge pull request #1366 from agramfort/move_isotonic
DOC grammar in isotonic regression narrative docs
ENH feature hashing transformer
DOC narrative documentation for feature hashing
ENH speed up hashing and reduce memory usage by 1/3
ENH allow (feature, value) pairs in FeatureHasher
ENH 20newsgroups example for FeatureHasher
ENH + DOC FeatureHasher
ENH add dict support to FeatureHasher and make it the default input_type
Merge pull request #1374 from jakevdp/doc_GA_flag
BUG enforce and document max. n_features for FeatureHasher
DOC update Ubuntu installation instructions
FIX smoothing in Naive Bayes and refactor the discrete estimators
COSMIT no diff for pairwise_fast.c
DOC credit @sjackman in what's new for BernoulliNB fix
COSMIT refactor input validation code; skip some issparse calls
BUG Cholesky delete routines wouldn't compile on Solaris
COSMIT simplify unique_labels in sklearn.metrics
COSMIT shut up the build by calling np.import_array in Cython modules
Merge pull request #1556 from larsmans/cython-cleanup
COSMIT wrong path in .gitattributes
Update sklearn/metrics/metrics.py
update year in copyright notices
BUG don't write comments in SVMlight dumper by default
BUG hotfix for issue #1501: sort indices in SVMlight i/o
DOC fix travis URLs in README
TST sorting CSR matrix indices in SVMlight file handling
DOC improve cosine similarity docs
COSMIT make BaseVectorizer a mixin
DOC copyedit HashingVectorizer docs
Merge pull request #1598 from amueller/naive_bayes_class_prior_rename_revert
COSMIT rm deprecated svm.sparse module
COSMIT rm deprecated attrs from [LQ]DA
BUG last references to svm.sparse
COSMIT rm deprecated stuff
BUG fix failing doctest
BUG one more failing doctest
BUG move label_ from BaseLibSVM to BaseSVC
COSMIT decouple regression and classification in SVMs
BUG in RadiusNeighborClassifier outlier handling
Merge pull request #1576 from mrorii/fix_kneighbors
ENH rewrite radius-NN classifier's outlier handling
COSMIT translate lgamma replacement to C and clean it up
COSMIT add lgamma to gitattributes
DOC update SMART notation in TfidfTransformer docs
P3K: use print as a function in the examples
ENH refactor univariate feature selection
P3K use six.string_types and six.PY3
P3K one more iteritems
COSMIT rm Python 2.5 and Jython compat from six
BUG fix import problem in preprocessing
P3K StringIO vs BytesIO
DOC fix failing doctest due to unicode_literals
DOC whitespace in doctest
BUG revert P3K changes that broke mldata tests
rm gender classification example
P3K death to the print statement
P3K fix broken doctest and add forgotten print_function import
DOC no more need for compute_importances in trees
DOC copyedit FeatureHasher narrative
ENH move covtype loading to sklearn.datasets
TST covertype loader
DOC copyedit FeatureHasher narrative further
P3K range vs. xrange
Merge pull request #1524 from amueller/break_ovo_ties
DOC pretty math in kernel docstrings
BUG MinMaxScaler missing from preprocessing.__all__
BUG in KernelPCA: wrong default value for gamma
Merge pull request #1688 from hrishikeshio/fit_transform
ENH speed up RBFSampler by ~10%
BUG oops, removed validation by accident
BUG fix broken grid search example
COSMIT update mailmap
ENH sparsify method for L1-reg linear models
DOC developer guidelines for unit tests and classes_
DOC dev guide: random_state_ + @amueller's remarks
DOC r2_score may return negative values
Merge branch 'sparse-coef'
COSMIT callable instead of hasattr __call__
DOC rm failing doctest on graph_laplacian
DOC fix text vectorizer docs and add NLTK example
DOC fix broken doctests for feature_extraction.text
BUG restore empty vocabulary exc in CountVectorizer
ENH prevent copying of indices in CountVectorizer
DOC credit @ephes
Merge pull request #1713 from larsmans/vectorizer-memory-use
COSMIT use callable instead of hasattr
Merge pull request #1727 from amueller/min_max_scaler_fix
BUG broke the what's new while rebasing
ENH set min_df in fe.text back to 1
TST compute_class_weight in utils
FIX + TST + DOC compute_class_weight
ENH use bincount in compute_class_weight
BUG use fixes.unique
BUG in SVM tests
BUG fix compute_class_weights issue in SGD
Merge pull request #1753 from NelleV/FIX
P3K some more fixes in random places
DOC OpenBLAS is more dangerous than I thought
DOC oops, typo
COSMIT get rid of undocumented attributes on SVMs
PEP8 and allow non-bool truth values in CD
BUG + ENH: removal of components in kernel PCA
Merge pull request #1758 from larsmans/kernelpca-fix
P3K make feature_extraction.text work
BUG failing doctest
DOC IsotonicRegression wasn't in the changelog at all
P3K all of feature_extraction passes tests on Py2 and 3
DOC clarify column ordering in SVC scores
COSMIT DictVectorizer.inverse_transform readability
DOC CountVectorizer does NOT do stopword filtering by default
ENH don't recompute distances in MBKMeans
ENH cut MiniBatchKMeans memory usage in half for large n_clusters
DOC installation instructions: MacPorts, fix types, stdeb instructions
Merge pull request #1773 from jnothman/prf_docstring
BUG StandardScaler would ignore with_std for CSR input
BUG SGDClassifier and friends did not forget labels_ in re-fit
DOC clarify C parameter on LogisticRegression
TST + DOC + COSMIT refactor ParameterGrid and test it
ENH len on ParameterGrid and ParameterSampler
BUG deprecation of grid_scores_ in GridSearchCV
BUG always do cross-validation in GridSearchCV
DOC fix clone and get_params documentation
TST grid search/randomized search on non-BaseEstimator
TST actual sparse input in sparse k-NN tests
COSMIT prevent a copy in randomized LR
TST speed up comment tests by ~20%
TST radius-neighbors regression test not entirely stable
BUG additive_chi2 missing in KERNEL_PARAMS
BUG + DOC fix Nystroem for other kernels than RBF
COSMIT rm repetitive __main__ blocks from tests
ENH allow additional kernels on KernelPCA
TST fix broken doctest
P3K developer docs
Merge branch 'pr/1790' -- Python 3 support from PyCon sprint
Merge pull request #1812 from kmike/testing-fixes
DOC describe SVM probability calibration (and advise against it)
DOC further comments on SVM probabilities
ENH multiclass probability estimates for SGDClassifier
BUG digits grid search was passing cv to the wrong method
DOC typos in grid search docstrings
PY3 + TST decouple test_metrics from random module
Merge pull request #1836 from kmike/master
DOC distributions produced by hashing trick depend on input
DOC multiclass: typo and use case
DOC PR means pull request
FIX BytesIO and urllib usage in fetch_olivetti_faces
DOC I didn't mean soft-O by "tilde notation"
DOC describe API, not internals, for AdaBoost
DOC replace "arithmetical order" in AdaBoost docs
TST strengthen AdaBoost tests
FIX SVR complaining about a single class in the input
COSMIT do np.unique(y) once in SVC
DOC rewrite description of k-fold CV
mailmap entry for @lqdc
DOC define validation before cross validation
DOC typos in cross-validation description
clean up mailmap/deduplicate contributors
BUG disable memory-blowing SVD for sparse input in RidgeCV
FIX DictVectorizer behavior on empty X and empty samples
TST + DOC AdaBoostClassifier.predict_proba fix
COSMIT refactor AdaBoost code
ignore PDFs
ENH speed up sklearn.feature_selection.chi2
DOC dependency installation with yum (Red Hat, CentOS)
FIX bug (swapped args) in chi2
FIX yet another chi2 bug
ENH add latent semantic analysis/sparse truncated SVD
ENH use rnd SVD in TruncatedSVD by default for speed
COSMIT omit unused parameter/return value in svd_flip
TST strengthen TruncatedSVD tests
DOC + MAINT deprecate RandomizedPCA scipy.sparse support
FIX and link LSA clustering example
DOC explain normalization in LSA KMeans example
Merge pull request #1716 from larsmans/truncated-svd
FIX metrics/scoring bug with LeaveOneOut CV
MAINT remove deprecated gprime handling from FastICA + refactoring
Merge pull request #2067 from jnothman/test_binarizer
DOC no more mention of the Bunch in the narrative docs
FIX don't rely on Bunch behavior with fetch_covtype
DOC fix some docstring/parameter list mismatches
DOC fix RandomizedPCA docstring for n_components=None
ENH allow empty grid in ParameterGrid
MAINT ignore kernprof.py reports
DOC ParameterGrid on lists
Merge pull request #2082 from larsmans/empty-parameter-grid
DOC fix V-measure docstring
MAINT dedup Clay Woolam's contribs (>100 commits!)
FIX/ENH mean shift clustering
DOC typo
ENH micro-optimize RFECV
COSMIT refactor LibSVM wrapper for safety and readability
DOC fix some broken URLs
FIX charset -> encoding in load_files
DOC typo
Revert "FIX charset -> encoding in load_files"
FIX verbose output from k-means
FIX remove params from RandomizedSearchCV
FIX charset -> encoding in load_files
FIX search bug introduced in 1327057f4258f41712ecab5c94770aac5ff01982
FIX inconsistent attributes shapes in naive Bayes
FIX test failure in naive Bayes
FIX failing doctest for CountVectorizer
Merge pull request #2027 from mblondel/select_categorical
FIX copy in OneHotEncoder and _transform_selected
ENH optimize KMeans for sparse inputs
FIX KMeans bug; argsort result apparently not always C-contiguous
DOC what's new: faster KMeans
DOC more explicit description of degree param on SVMs
COSMIT pep8
ENH order *does* matter for sparse matrices
FIX get rid of the last few asanyarray calls
DOC fix erroneous docstring on preprocessing._transform_selected.
MAINT: dedup @jakevdp and @jnothman in mailmap
COSMIT simplify printing of number of fits in grid search
COSMIT fix a docstring in feature_extraction.text
P3K developer docs
TST r2_score float32 overflow fix
Revert "TST r2_score float32 overflow fix"
PY3 use urllib2 or urllib.request, based on Py2/3
DOC let OneHotEncoder, DictVectorizer and FeatureHasher refer to each other
DOC correct class_weight description for LogisticRegression
FIX memory usage in DictVectorizer.fit
ENH back-port rand_r from 4.4BSD
FIX move rand_r to tree module for now
DOC 20news filtering with smaller set and MultinomialNB
PY3 fix string literal syntax error
TST skip Graphviz export docstring in trees
TST use TruncatedSVD in random forest tests
COSMIT refactor random forests
COSMIT refactor forests, part 2
FIX faulty import in 20news docs
ENH fit_inverse_transform for FastICA
DOC document mixing_ attr on FastICA
COSMIT attribute checking in FastICA
COSMIT explicit None check in naive Bayes
ENH simplify the Scorer API
FIX bug in scorers that take probabilities
COSMIT RBM test in usual nose style + moved to proper module
BUG + COSMIT + ENH RBMs
Merge branch 'pr/1954'
MAINT _logistic_sigmoid.c is "binary"
PY3 fix RBM test
DOC copyedit RBM docstrings
DOC pep257 + c/e in sklearn.base
TST fix string labels in metrics tests
DOC copyedit preprocessing docs
MAINT ignore profiling results from kernprof.py
DOC copyedit KernelCenterer docstring
DOC minimal kernel centering narrative docs
DOC minor copyedit to FS docs
Merge pull request #2230 from pprett/neighbors-segfault-fix
TST catch deprecation warning in feature_extraction.text
Merge branch 'pr/2246'
DOC correct/copyedit linear model docstrings
FIX inline rand_r to fix build on Windows
DOC add an extremely simple classifier code example to dev docs
ENH rewrite multiclass_log_loss, rename log_loss, document it
ENH Scorer object for log loss
ENH add log_likelihood_score as -log_loss
PY3 new overfit prevention stuff in 20newsgroups loader
DOC SGDClassifier has multiclass predict_proba
DOC minor copyedit to narratives
FIX don't use old scoring API in randomized search
FIX use category and stacklevel=2 for {loss,score}_func
ENH speed up BernoulliNB's predictions
DOC "creating features" -> "feature extraction" + minor stuff
Revert "ENH add log_likelihood_score as -log_loss"
DOC copyedit example docstring
DOC XHTML fixes (unclosed tags, type="text/javascript")
ENH speed up logistic_sigmoid (using less code)
FIX make BaseSGDClassifier an ABC
Merge pull request #2295 from larsmans/fast-sigmoid
DOC credit to @ephes and myself for log loss in metrics
DOC copyedit SGDClassifier docstring
FIX integer types in Ward clustering
Lucas Wiman (1):
Fix spelling in dosctring.
Ludwig Schwardt (1):
FIX removed ancient templates from manifest to make sklearn pip-installable.
Luis Pedro Coelho (1):
cd_fast: use square norm directly
Mark Veronda (2):
Type-os and added great links to learning more about Machine Learning
Feedback from @amueller
Marko Burjek (7):
DOC Added SGDCLassifier support only binary prediction probabilites.
DOC Fixed a return in predict_proba in SGDClassifier
DOC add support for sparse arrays to SGDCLassifer
DOC forgot dot in SGDCLassifier documentation
DOC Fixed a return in predict_proba in SGDClassifier
DOC add support for sparse arrays to SGDCLassifer
DOC forgot dot in SGDCLassifier documentation
Martin Luessi (6):
WIP: doc hyperlinks, fixed size thumbnails
gzip support, whats_new
use Sphinx searchindex.js
no_image.png for examples w/o thumbnail
fix paths for Windows
links for scipy, cleanup
Mathieu Blondel (679):
Added filters to WordNGramAnalyzer.
Added a non-hashing dense vectorizer object.
Added transform() method to Pipeline object.
Updated dense vectorizer to follow transformer API.
Support fit_transform() in pipeline.
Support lists for training data in grid_search.
Use fit_transform and use iterables for documents.
Remove unnecessary code.
Save memory when the matrix is built.
Added fit_transform() to pipeline.
Vectorizer should implement fit_transform.
SparseCountVectorizer, SparseTfidfTransformer and Sparse Vectorizer
normalize option for TfidfTransformer
fix garbage
Fix indentation.
Fix cross_val when y is a 2d-array.
Add refit option to GridSearchCV.
Merge branch 'master' into textextract
API changes to precision_recall
Fix consistency problem in the order of arguments for loss functions.
Add fbeta_score and f1_score metrics.
Rename roc to roc_curve.
Merge branch 'master' into textextract
Fix doctest in grid_search.
Merge branch 'master' into textextract
Add tests for predict_proba in LogisticRegression.
Use filter object.
Add dtype parameter to CountVectorizer and SparseCountVectorizer.
A few optimizations.
Move sparse code to sparse module.
Remove Sparse prefix from class names.
Move preprocessing to its own module.
Add Normalizer, LengthNormalizer and Binarizer.
Remove normalize option from TfidfTransformer.
Sparse equivalents of Normalizer, LengthNormalizer and Binarizer.
Fix hierarchy inconsistency for sparse module.
Move common sparse code to SparseBaseLibLinear.
Fix Sparse Logistic Regression.
Import LogisticRegression in sparse/__init__.py.
Merge branch 'master' into textextract
Activate class_weight option in fit() for liblinear-based classes.
Merge branch 'master' into textextract
Fix slicing issue when using sparse matrices.
Y -> y (capital letter is for 2d-arrays)
Raise exception when X_train.shape[1] and X_test.shape[1] don't agree.
Merge branch 'textextract' of git://github.com/ogrisel/scikit-learn into textextract
Merge branch 'textextract' of git://github.com/ogrisel/scikit-learn into textextract
Merge branch 'textextract'
Convert sparse matrix to CSR format in grid search.
Fix imports.
Pass kwargs to mlcomp loader.
Fix SGD-based binary classification example.
Note on fit_transform.
Test compute Gram matrix with support vectors only.
Activate stop word removal by default.
Add vocabulary property.
Fix small typos.
Make max_df to 1.0 by default.
Update matrix type in documentation.
Fix broken test.
class_weight="auto" for liblinear-based and sparse classes.
Fix math rendering in SVM documentation.
Fix typo.
Add LabelBinarizer.
Add sparse Ridge.
Support 2-d Y.
Add RidgeClassifier.
Add RidgeClassifier to 20newsgroup classification example.
Add efficient LOO cross-val for Ridge.
Add sample_weight to fit.
Add reference.
Add support for custom loss or score function.
Add label binarizer documentation.
Test 2-d y case.
Support fit_intercept in RidgeLOO.
Forgot to use sample_weight...
Default fit_intercept to True.
Add sparse RidgeLOO.
Add RidgeClassifierLOO.
Add class_weight.
Add some more documentation.
Add sample_weight.
Add dense_output option to safe_sparse_dot.
Use safe_sparse_dot.
Fix problem when output is a vector.
Add safe_asanyarray.
Handle sparse matrix in LinearModel.
Import necessary modules.
Fix tests for sparse case.
Add RidgeCV.
Merge dense and sparse code.
Rename to RidgeClassifierCV.
Fix 20newsgroup example.
Make RidgeLOO private.
Fix test.
Predict is already implemented in LinearModel.
Fix issue in RidgeCV.
PEP8!
Fix typo.
Add documentation on matrices used for clustering.
Rename _RidgeLOO to _RidgeGCV.
Note on efficiency.
Improve the documentation for LabelBinarizer.
Add TransformerMixin.
Use TransformerMixin in LabelBinarizer.
Merge branch 'ridge'
Fix typos.
Fix TransformerMixin.fit_transform.
Remove references to y in preprocessing objects.
Add sample_weight to Ridge.
Improve documentation for Ridge objects.
Move cv parameter to constructor in RidgeCV.
Temporarily disable sample_weight when cv is passed to RidgeCV.
Preserve backward compatibility in GridSearch.
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Fix error in documentation.
Remove coef_ and get_support from Pipeline.
Add SparseTransformerMixin.
Use sparse.base.SparseTransformerMixin.
Add documentation on model persistence.
Minor fixes in RidgeCV.
Add reference for GCV.
Add Olivier Grisel to metrics.py's credits.
Comment broken test.
Rename SparseTransformerMixin to CoefSelectTransformerMixin.
Can now specify desired percentage of explained variance ratio in PCA.
Add a few sanity checks for SVC.
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Add tests for sanity checks in SVC.
Flip the sign when the user accesses coef_ or intercept_ in the 2-class case.
Implement transform in LDA.
Add LDA to plot_pca.py and rename to plot_pca_vs_lda.py.
Proper implementation of predict_log_proba in LDA.
Add polynomial interpolation example.
Use np.vander.
Support multilabel case in LabelBinarizer.
Add linear_kernel, polynomial_kernel and rbf_kernel.
Small optimizations for polynomial_kernel and rbf_kernel.
Add KernelCenterer.
Add KernelPCA.
Add kernel PCA example.
Merge branch 'master' into kpca
Add KernelPCA documentation.
Add test for precomputed kernel.
Optim in polynomial_kernel.
Efficient fit_transform in PCA.
Merge branch 'mblondel-kpca' of https://github.com/vene/scikit-learn into kpca
Cosmit.
Use TransformerMixin in KernelPCA.
Merge branch 'master' into lda
Merge branch 'lda' of https://github.com/bthirion/scikit-learn into lda
Fix doctest.
pep8 love (integrism?).
Add test for invalid kernel.
Rename plot_kpca.py to plot_kernel_pca.py.
Add comment regarding PCA's fit_transform method.
Add note on sign ambiguity in PCA.
Merge branch 'kpca'
Add kernel PCA and linear PCA equivalence test in its own function.
Merge pull request #163 from paolo-losi/revert_preprocessing
Make the author file more consistent.
Merge pull request #167 from bsilverthorn/fix-kernelpca-ncomponents
Add sparse.LogisticRegression to class reference.
Better doc for the dataset loaders.
Make kernels consistent with SVM and add sigmoid kernel.
Fix LDA transform.
Add LDA to the handwritten digit 2d-projection example.
Add TransformerMixin to LDA and RandomizedPCA.
Cosmetics.
Merge pull request #200 from amueller/minor_docs
Merge pull request #193 from ogrisel/preprocessing-simplification
Better PCA docstrings.
Fix LDA.transform's docstring.
Typo.
Add hinge_loss to metrics.
Fast and memory-efficient loader for the svmlight format.
Allow the user to fix n_features.
Docstring.
Important note.
Propagate errors up to the Python level.
Narrative documentation.
Update credits.
Return false when couldn't read the file.
Fix comment.
Merge pull request #6 from larsmans/mblondel-svmlight
Merge branch 'mblondel-svmlight' of git://github.com/larsmans/scikit-learn into svmlight_format
Fix compile issues on Mac OS X.
Fix ref counting bug.
More comments.
Merge pull request #7 from larsmans/mblondel-svmlight
load_svmlight_format -> load_svmlight_file.
Merge branch 'master' into svmlight_format
Merge pull request #209 from mblondel/svmlight_format
Documentation fixes.
Add note to base fit_transform doc.
Raise error if file doesn't exist.
Fix parsing issues.
More tests for the svmlight reader.
Documentation fixes.
Better performance of Ax=b solver when b is 2d and A is sparse, and add
Fix doctest.
Reverse coef_ in Ridge.
Merge pull request #235 from mblondel/fix_ridge
Improve Logistic Regression sparsity example.
Better test and remove old garbage.
Allow CountVectorizer to be fitted twice.
Remove unnecessary submethod.
2011!
squared loss -> squared hinge loss.
Merge pull request #255 from vene/kernel-pca
Merge pull request #260 from glouppe/master
Merge pull request #261 from glouppe/master
Merge branch 'dbscan' of https://github.com/robertlayton/scikit-learn into dbscan
Handle metric="precomputed" in dbscan.
Use euclidean_distances in kmeans.
Cosmit: use dense_output=True.
Sparse matrix support in kernels.
PCA: fix issue #258.
PCA: better doc string for 0 < n_components < 1 case.
Partial support for sparse matrices in kernel PCA.
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Remove unnecessary import.
Merge branch 'dbscan' of git://github.com/robertlayton/scikit-learn into dbscan
calculate_distances -> pairwise_distances + goodies.
Improve DBSCAN doc.
Fix DBSCAN example.
Remove automatically generated auto examples.
Test pickability in DBSCAN.
Test precomputed similarity in pairwise_distances.
Merge branch 'samples_generator' of git://github.com/glouppe/scikit-learn into samples_generator
Doc for sample generator cosmits.
Merge branch 'kmeans_transform2' of https://github.com/robertlayton/scikit-learn into kmeans_transform2
Add tests and fix bug.
Kmeans transform and predict doc improvements.
Merge pull request #296 from bdholt1/fix/feature_extraction
Add TransformerMixin (back?) to preprocessing classes.
Fix plot_kmeans_digits.py.
Typo.
Implement one-vs-the-rest multiclass strategy.
Fix bug in one-vs-rest when underlying estimator uses predict_proba.
Implement one-vs-one multiclass strategy.
Merge pull request #2 from ogrisel/robertlayton-kmeans_transform2
Implement error-correcting output-code multiclass strategy.
Test grid searchability.
Merge pull request #273 from robertlayton/kmeans_transform2
Docstrings!
Add new meta module to setup.py
Merge branch 'master' into multiclass
Check estimator and fix syntax error.
Documentation for the meta learners.
pep8-proof.
Fill missing docstrings.
Allow one-class only in LabelBinarizer.
Rewrite svmlight loader in pure Python for now.
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Merge branch 'master' into multiclass
Fix mistake and docstring cosmits in SVC.
Moved multiclass module to top-level module.
Fix doc!
Fix setup!
Address @agramfort and @ogrisel's comments.
Merge branch 'master' into multiclass
More informative name for color quantization example.
More explanations and pep8.
Use 256 colors and add title.
Emphasize one-vs-all.
Better documentation.
Fix doctest errors (hopefully!).
Document fit_ecoc.
Typo.
Fix currentmodule.
Fix bad copy-paste.
Merge pull request #320 from mblondel/multiclass
64 colors + random codebook comparison.
Better title + authors.
Welcome to Robert and Gilles.
Sparse matrix support in the `density` util.
Documenting a secret feature and fixing bugs in the process.
Use l1 penalty.
Giving due credit (last minute ChangeLog item).
Cosmit.
Merge pull request #354 from amueller/liblinear_parameter_errors
Add dump_svmlight_file.
Export data option in SVG gui.
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Merge pull request #407 from amueller/sgd_url_typo
BUG: Use threshold in LabelBinarizer in multi-label case.
ENH support decision_function in multi-label classification
Cosmit: used named parameter.
ENH Label indicator matrix support in LabelBinarizer and OVRClassifier
Remove C from NuSVR.
Revert "Remove C from NuSVR."
Revert "FIX : removing param nu from sparse.SVR, C from NuSVR + pep8"
Small comment on the dual parameter in LinearSVC.
Update svmlight loader documentation.
Fix svmlight loader doc.
Implement mean_variance_axis0.
Fix bug with sparse matrices.
Cosmit.
Test edge case.
tmp -> diff
Add score method to KMeans.
Use int for indptr and indices.
Merge branch 'master' of https://github.com/scikit-learn/scikit-learn
Sparse matrix support in KMeans.
Vectorized news20 dataset loader.
Merge multilabel branch with master.
Check that LabelBinarizer was fitted.
Multilabel classification dataset generator.
Test multilabel classifier on random dataset.
scale_C will be True in scikit-learn 0.11.
Merge pull request #8 from larsmans/news20_loader
Return bunch object.
Merge pull request #493 from amueller/kernel_approximation_doc
Add to class reference.
Add precompute_distances option back and export it.
Merge branch 'minibatch-kmeans-optim' of https://github.com/ogrisel/scikit-learn into minibatch-kmeans-optim
Address @ogrisel and @amueller's comments.
Better doc for the 20newsgroup dataset loader.
Do not use joblib's memoizer.
Use int16 for more compactness.
Merge branch 'master' into sparse-kmeans
Merge with master.
One more test.
Fix test.
Cosmit in MiniBatchKMeans.
Optimize for high dimensional data.
Use CCA as well in multilabel example.
Add missing reference.
Break down fit_transform into parts.
Cosmit
More tests for nuSVR.
Use rbf_kernel.
Add decision_function to ElasticNet.
FIX: support for regressors in multiclass module.
Support for coef_ in OneVsRestClassifier.
Mention multi-variate regression support in Ridge.
Add safe_mask utility.
coef_ and intercept_ in LinearSVC are now writable.
Add safe_mask to developer doc.
Typos.
Create partial_fit and call partial_fit from fit.
Add partial_fit to SGDRegressor.
Partial tests + fix bugs.
Fix a few more bugs.
Use proper assertions.
Fix more bugs + tests.
Add decision_function to SGDRegressor.
Multiclass tests.
Merge dense and sparse SGD implementations.
Re-enable sparse tests.
Add deprecation warning.
Update docstrings.
What's new.
Removed needless line.
Use only one epoch in partial_fit.
Use named parameters.
Update examples.
Update doc.
Use only epoch SGDRegressor.partial_fit.
Save iteration number.
More tests + fixes.
Fix bug when fit is called multiple times.
Fix "what's new".
Merge pull request #10 from larsmans/sgd_partial_fit
Address @ogrisel and @larsmans 's comments.
pep8!
FIX: y should be np.float64.
Add filter_params option to pairwise_kernels.
Precomputed kernel can actually be non-squared.
Use pairwise_kernels in KernelPCA.
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Merge pull request #11 from larsmans/sgd_partial_fit
More technically correct description.
Rename _get_params() to get_params().
Merge branch 'sgd_partial_fit'
Use classes_.
Better title in README.rst.
More intuitive warm-restart in SGD.
Fix doctests.
warm_restart -> warm_start
More intuitive warm-start in ElasticNet.
Fix doctests.
Copy in user-land.
Missing docstring in ElasticNet and Lasso.
Fix failure in `test_bad_input`.
Revert change on svm.base.
Remove if statement.
Suppress deprecation warnings.
Merge branch 'warm_start' of github.com:mblondel/scikit-learn into warm_start
Make sure order="C".
Merge branch 'warm_start'
Fix doctest.
preprocessing/__init__.py -> preprocessing/preprocessing.py
Move preprocessing.py to sklearn/.
Remove CoefSelectTransformerMixin and use SelectorMixin instead.
Better default threshold for L1-regularized models.
euclidian_distances is to be deprecated in v0.11.
Add n_jobs option to pairwise_distances and pairwise_kernels.
Merge branch 'enh/metrics' of https://github.com/satra/scikit-learn into metrics
Backward compatibility in precision, recall and f1-score.
Factor some code.
More what's new items.
Fix what's news.
Add Perceptron.
Add Perceptron to document classification example.
Minimal documentation.
Add references and implementation details.
Propagate parameters.
Expose more parameters.
Explain parameter in Hinge loss.
Don't rescale coef if not necessary.
Quick note on sparsity.
Don't break API in precision_recall_fscore_support.
Pep8!
Fix scale_C warning.
Merge branch 'perceptron' of github.com:mblondel/scikit-learn into perceptron
t -> threshold
Add mean_squared_error and deprecate mean_square_error.
Don't raise warning when passing explicit scale_C=False.
Merge branch 'master' of github.com:scikit-learn/scikit-learn
DOC: scaling regression targets.
Merge pull request #623 from npinto/ridge-docfix
Set label encoding in LabelBinarizer.
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Guess threshold if not explicitly provided.
Bug: must be strictly less than.
Pep8.
Don't raise warning in auto mode.
Merge pull request #712 from agramfort/fix_y_center
Merge branch 'shuffle_kfold' of https://github.com/NelleV/scikit-learn into kfold-shuffle
Test indices=False case.
Factor tests.
Merge branch 'combat' of https://github.com/ibayer/scikit-learn into lsqr_fix
Fix lsqr for scipy 0.7.
Add test for grid search with only one grid point.
Check param grid.
Return early if there's only one grid point.
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Fix doctest failure.
Merge branch 'nearest_centroids' of https://github.com/robertlayton/scikit-learn into nearest_centroids
Fix doc mistakes.
Precomputed distance matrices can be rectangular.
Add test for precomputed distance.
Doc cosmits.
Fix bug when refit=False.
Fix kernel pca example.
Fix doctest in PLS.
Rename "p" to "epsilon".
Allow regression losses for classification.
Add epsilon-insensitive loss.
predict_proba with loss="modified_huber".
Update doc.
Doc: predict_proba.
What's new.
Document API change.
Easier to understand formula.
DOC LabelBinarizer
BUG: now build works.
Add LabelNormalizer.
Documentation for LabelBinarizer and LabelNormalizer.
Pep8.
Cosmit: LabelBinarizer and LabelNormalizer are not classifiers.
More useful error message.
Doc cosmit.
Add test for non-numerical labels.
LabelNormalizer -> LabelEncoder.
Add documentation for non-numerical label case.
What's new.
Cosmit: be more explicit why LabelEncoder is useful.
Address @larsmans' comments.
Merge branch 'sgd_losses' of github.com:mblondel/scikit-learn into sgd_losses
Address @ogrisel and @pprett's comments.
Fix remaining merge conflict.
Fix doctest.
Merge branch 'master' of github.com:scikit-learn/scikit-learn
What's new.
Fix typo.
Note regarding multilabel example.
Note on one-vs-all classification in SGD module.
Unused import.
Fix warning.
Merge pull request #877 from duckworthd/master
Fix #904.
Removed needless method redefinition.
Fix: RidgeClassifier must not inherit from RegressorMixin.
Clean up unused code.
Test default input.
Credits and license.
Update doc/whats_new.rst
Update doc/whats_new.rst
Typo.
Check that feature indices are sorted.
Add missing test file.
Optim in LabelEncoder.
Remove needless loop in inverse_transform.
Simplify LabelEncoder.fit_transform.
Fix warnings in multiclass module tests.
Remove duplicated line.
Add all_categories option.
Normalize training and test times.
Typo.
Simplify LabelEncoder.transform.
Test LabelEncoder.fit_transform with arbitrary labels.
Ignore joblib folder.
Fix #1080.
Decision threshold is now 0 in RidgeClassifier.
Optim + cosmit in StratifiedShuffleSplit.
Use fixed random state in isotonic regression example.
Note on the use of X in isotonic regression.
Fix confusing notation in isotonic regression.
Fix latex formula in isotonic regression doc.
Release manager change + fix Satra's URL.
Move solver option to constructor.
Add lsqr solver.
BUG: transmit parameters correctly from Ridge to ridge_regression.
Can afford better precision in news20 example.
Fix docstrings and doctests.
Add minimalistic test for each solver.
Fix damp parameter.
Fall back to dense_cholesky if sample_weight is given.
lsqr is not available in old scipy versions...
Better documentation on the choice of solver.
PEP8!
Cosmit: not a fan of defining a function in a loop :)
Update what's new.
More accurate API change description.
Fix warning message.
Merge pull request #1215 from amueller/pipeline_muliclass
Merge pull request #1237 from kalaidin/typos
Merge extmath tests into the same file.
Add common assertions to sklearn.utils.testing.
Fix density utility when input is sparse.
Typo.
Fix test failure.
Use sklearn.utils.testing in tests.
Merge branch 'master' of github.com:scikit-learn/scikit-learn
More use sklearn.utils.testing.
Even more sklearn.utils.testing.
Missing random_state in LinearSVC.
Merge pull request #1323 from dnouri/countvectorizer_doc_1154
FIX: vocabulary_ maps to feature indices.
Merge pull request #1320 from dnouri/test_coverage
Merge branch 'sgd_learners' of https://github.com/zaxtax/scikit-learn into passive_aggressive
Rename pa.py to passive_aggressive.py.
Cosmit: random_state is not necessary.
Fix many bugs and test PA-I.
Do not expose C in SGDClassifier / Regressor.
Implement and test PA-II.
Add SquaredHingeLoss.
Test different losses.
Add squared epsilon insensitive loss.
Test PA-II (regression).
Fix random_state in SGD.
Update narrative documentation.
Fix example.
Credit myself.
Fix see also.
Fix a few test failures.
Add one more test for PassiveAggressiveRegressor.
Fix underflow detected by test_common :)
Update document classification example.
Fix doctests.
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Better documentation for C.
Add PassiveAggressive* to class reference.
Remove sample_weight and class_weight from PassiveAggressive*.
Add tests for partial fit.
Document epsilon.
Better documentation for epsilon in SGD.
Remove predict_proba from Perceptron and PassiveAggressiveClassifier.
Remove transform from PassiveAggressive*.
Fix typos and wording in RandomForestEmbedding.
Indicate dimensionality in RandomForestEmbedding example.
Cosmit: use less memory in feature hasher tests.
Cosmit: make KernelCenterer a private attribute in KernelPCA.
Improve KernelCenterer docstring.
Add add_dummy_feature.
Add RandomClassifier and tests.
Fix tests.
Add docstrings for RandomClassifier.
PEP8.
random_state=None by default.
Remove label encoder.
Implement predict_proba.
Add some narrative doc.
Address @amueller's comments.
Rename to dummy.DummyClassifier.
Add DummyRegressor.
Add dummy estimators to references.
Add what's new entry.
Add comments.
Check returned types.
Test expectations.
Test string labels.
Test exceptions.
Cosmit: save one line.
Address @amueller doc comments.
Skip common tests for Dummy*.
Typo :/
Add example in docstring.
Add to references.
Merge pull request #1382 from mblondel/add_intercept
Merge pull request #1373 from mblondel/random_clf
Remove unused import.
Improve error message when vocabulary is empty.
Fix bug in sqnorm (used by PassiveAggressive).
Link to travis.
Specify branch in status button.
Add missing assertion.
Update what's new.
Cosmits and typos.
Add perceptron loss to plot.
threshold parameter was ignored in SquaredHinge loss.
Welcome to Wei Li and Arnaud Joly.
Clean up test_pairwise.py.
More clean up of test_pairwise.py.
Cosmit: break up long line.
Merge pull request #1530 from agramfort/doc_lasso
X is not a constructor parameter.
Add missing types to docstring.
Move more minor contributors to what's new file.
Remove contact address.
Merge pull request #1561 from kyleabeauchamp/MinMaxScaler_Inverse
Merge pull request #1536 from kyleabeauchamp/issue-1403
Merge pull request #1604 from darkrho/doc-linear-model-typo
DOC: make distinction between evaluation and pairwise metrics.
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Cosmit: more explicit xlabel.
Cosmit: more explicit label.
Update load_svmlight_file docstring.
FIX: X was converted twice.
Merge pull request #1804 from AlexanderFabisch/fix_example_path
Cosmit: remove needless blank lines.
Cosmit: more idiomatic way of clipping to zero.
Demystify magic values in NNLS implementation.
BUG: fix replacement for _neg.
Fix random state where appropriate.
Fix doctest.
DOC: document attributes fitted by DictVectorizer.
DOC: put feature extraction before pre-processing.
COSMIT: better notation in CountVectorizer.
COSMIT: same changes in transform method.
COSMIT: more robust condition in inverse_transform.
Import gzip and bz2 only if necessary.
Move balance_weights out of preprocessing.
Add categorical_features option to OneHotEncoder.
Support both masks and arrays of indices.
Typo.
Rename _apply_transform to _transform_selected and make it a function
Merge branch 'master' of github.com:scikit-learn/scikit-learn into select_categorical
Address @jnothman's comments.
Test exception is raised when number of targets and penalties don't
Simplify ridge solvers (ongoing work).
Extract sparse_cg and lsqr solvers.
Extract dense_cholesky solver (linear case).
Extract dense_cholesky solver (kernel case).
Clean up.
Extract SVD-based solver.
Clean ups.
Remove copy option.
Cosmit in docstring.
What's new.
Remove if statement.
Cosmit.
Fix failures in grid search.
Do not set sample_weights unless need to.
Add warning when fall back to other solver.
Remove unused variable.
Fix failure in svd-based ridge solver w/ old numpy.
BUG: replace elif by if in Ridge solver selection.
Add fit_transform to FastICA.
Add inverse_transform to FastICA.
Add docstrings to methods in FastICA.
Address @dengemann's comments.
Add test.
Push failing test.
Merge pull request #2229 from larsmans/kernel-center-narrative
Typo.
Matthias Ekman (1):
ENH: add pre_dispatch option to cross_val_score
Matthieu Brucher (1):
Fixed a typo
Matthieu Perrot (25):
ENH: optional computing of estimated covariance of LDA classifier.
MISC: add an unfinished toy example to compare LDA with a (not yet implemented) QDA.
Merge branch 'master' of http://github.com/GaelVaroquaux/scikit-learn
BUG: Fixed example after last API changes
BUG: add missing call to pylab show function
BUG: Fixed pipeline feature selection example after last API changes
MISC: lda: Y -> y
Merge branch 'master' of http://github.com/GaelVaroquaux/scikit-learn
BUG: Fixed example after last API changes
BUG: add missing call to pylab show function
BUG: Fixed pipeline feature selection example after last API changes
ENH: add QDA classifier, some docs, examples and tests. LDA has been reworked a bit to follow the API of QDA and avoid useless operations.
Merge branch 'master' of http://github.com/GaelVaroquaux/scikit-learn
Merge branch 'master' of git://scikit-learn.git.sourceforge.net/gitroot/scikit-learn/scikit-learn
ENH: optional computing of estimated covariance of LDA classifier.
MISC: add an unfinished toy example to compare LDA with a (not yet implemented) QDA.
MISC: lda: Y -> y
ENH: add QDA classifier, some docs, examples and tests. LDA has been reworked a bit to follow the API of QDA and avoid useless operations.
Merge branch 'master' of git://scikit-learn.git.sourceforge.net/gitroot/scikit-learn/scikit-learn
cosmit in LDA/QDA
MISC: vectorize priors computation for LDA and QDA
Merge branch 'master' of ssh://revilyo@scikit-learn.git.sourceforge.net/gitroot/scikit-learn/scikit-learn
MISC: remove debug
Merge branch 'lda' of https://github.com/mblondel/scikit-learn into discriminant_analysis
re-add self.means_
Matti Lyra (2):
Fixed an issue where CountVectorizer.decode leaves file pointers open after reading the contents of the file. This produces unpredictable behaviour as closing the file pointer is left to the implementation of the python interpreter.
Changed the CountVectorizer charset default back to 'utf-8' instead of 'utf8'. This was due to debugging on my local machine.
Meng Xinfan (2):
Update the docstring to reflect the package name changes.
fix an error in naive bayes docs
Michael Eickenberg (22):
fixed the function definition of cross_val_score
changed cross_val_score doc again
Added strided patch extractor to feature_extraction/image. Extracts patches 16x faster on the MiniBatchDictionaryLearning example
Now added extract_patches for random extraction as well
Now replaced max_patches part by fancy indexing
removed stuff i commented out
testing for correct output shapes and patch content of the last patch for 1 to 3 dimensional arrays
Changes in documentation and notation
ridge multi target with individual penalties written. To be tested
old tests passing
new multiple target tests added, functionality confined to direct usage of ridge_regression function
Ridge estimator works with individual penalties
test for ridge estimator
ridge doc string
ValueError for wrong shaped input instead of assertion failure, in order for sklearn/tests/test_common.py, line 238 to pass
docstring in Ridge estimator
added individual penalties function for all other solvers. Tests passing for all of them
always make alpha into an array
updated tests
tests passing
removed elaborate testing in ridge.fit, not necessary anymore
simplified _solve_svd
Mikhail Korobov (11):
P3K fix incorrect import
P3K: division should produce integer.
PY3 array.array wants str in Python 2.x and 3.x - give it a str
Update outdated comments in sklearn.hmm.
PY3: fix exception syntax in tests/test_common.py
PY3 fix test_cross_validation
PY3 fix OneHotEncoder doctest ( "<type 'float'>" is "<class 'float'>" in Python 3.x)
PY3 fix metaclasses. See #1829.
ENH speed improvements in HMM
TST Fixed test_pipeline_methods_preprocessing_svm: pca was unused
Fixed typo in metrics.py
Minwoo Jake Lee (2):
Merge remote-tracking branch 'master/master' into sparse-mbkm
moved _gen_even_slices to utils/init
Miroslav Batchkarov (1):
fixed the __repr__ method of cross_validation.Bootstrap, which failed if self.random_state is None
Miroslav Shubernetskiy (1):
PY3 allow multiple base classes in six.with_metaclass
Naoki Orii (1):
FIX issue #1457 KNeighbors should test that n_samples > 0
Nelle Varoquaux (177):
First draft of the mini batch KMeans - works, but a lot of cleaning up to do
Refactored: deleted the batch_k_means function, and created an option for the batch_k_means to avoid code duplication - Added some documentation
Added test on the batch k_means
Improve documentation
Batch K-Means
[batch k-means] Changed the algorithm to compute the centroids.
[batch k-means] Fixed the computation of the batch kmeans centroids
[MiniBatchKMeans] Starting refactoring code after the review
[MiniBatchKMeans] Small fixes
Merge branch 'master' into batchKMeans
[MiniBatchKMeans] Small fix in the initialisation for the random initialisation of the centroids
[MiniBatchKMeans] Fixed the tests for the new API
[BatchKMeans] Small fixes following Olivier & Gael's review
Merge remote branch 'scikit/master' into batchKMeans
[MiniBatchKMeans] Removed the unnecessary import in examples/cluster/mini_batch_kmeans.py
[MiniBatchKMeans] Now checks the validity of the data only when initializing the centroids. When the data is empty, return immediately
Merge with Olivier's branch
[MiniBatchKMeans] Documentation fixes
[MiniBatchKMeans] Added a benchmark
[MiniBatchKMeans] Added chart showing the speed and the inertia / total number of points depending on the chunk size and number of iteration
merge with master
[MiniBatchKMeans] PEP8 Compliance
[MiniBatchKMeans] Fixed typo in attribute: cluster_centers_
[MiniBatchKMeans] Added some documentation and example
[MiniBatchKMeans] PEP8 compliance
[MiniBatchKMeans] Added a fit method to the MiniBatchKMeans
Merge branch 'master' into batchKMeans
Merge branch 'master' into batchKMeans
[MiniBatchKMeans] PEP8 compliance and small fixed
Trailing white space
[MiniBatchKMeans] Small fixes
[MiniBatchKMeans] Added an example
[MiniBatchKMeans] Updated the example to compare BatchKMeans and MiniBatchKMeans - added the copy_x option to the BatchKMeans
[MiniBatchKMeans] Minor modifications on the examples
[MiniBatchKMeans] Added labels and scaled the axis properly on the benchmark plot
merge with master
Merge remote branch 'gael/batchKMeans' into batchKMeans
FIX the IRC chan used is scikit-learn, and not learn
FIX - error in the bibtex entry - extra comma that makes bibtex fail
closes #677 - improved affinity propagation docstrings
closes #703 - KFold has now an option to shuffle the data
Added unit test for shuffle option in KFold
Now tests the randomness of the KFolds when shuffle is True, and that all indices are returned in the different test folds
Updated mailmap
Updated mailmap (bis)
Added Pool Adjacent Violator
SMACOF algorithm for MDS
Added tests and documentation to the smacof algorithm
PAV now uses Kruskal's first approach to ties
Added a new dataset: traveling distances between 17 cities in france
MDS now computes the SMACOF algorithm several times, and returns the results with the lowest stress
Added documentation on MDS
MDS can now run several jobs in parallel thanks to joblib - when initial array passed, MDS will also only run once. If n_init is not set to 1, it will raise a warning
FIX mds tests were failing because of an interface change
Added docstrings to MDS
Cleaned up MDS's documentation
Added more documentation on the cities dataset
Fix errors due to previous refactoring on MDS
Changed dataset from france's mileage to knuth's USA mileage dataset
Replaced MDS US mileage distance example by a generated, more representative one
Added paragraphs on metric and nonmetric MDS, explaining the difference
MDS: out_dim → n_components
MDS: added documentation for n_jobs parameter
MDS - fixed some latex error in the documentation
Added a fit_transform method to the MDS class
Pool Adjacent Violators now does a max_iter number of iteration
DOC: added references to papers and licence - fixed the MDS example
a += a.T is different from a = a + a.T
Small explanation on the plot_mds example
np.diag raised a red flag - used broadcasting instead
Set the seed of the random_state generators to have nicely aligned results
Knuth load_cities dataset isn't used anymore
MDS: renamed positions_ to embedding_
Added MDS to manifold comparison methods
MDS: documentation fixes
FIX: load_cities doesn't exist anymore
Added test to sklearn.utils.bench's total_seconds method
FIX - the eps option of the MDS was overwritten
FIX in the makefile - we should delete pyc and so only from the source code, and not from everything in the root folder
Deprecated sparse classes from the SVM module - refs #1093
FIX sparse OneClassSVM was using the wrong parameter
FIX the AP was using a deprecated parameter
Decrease the number of convit in the AP
Renamed parameter convit to convergence_iteration and deprecated the old API
FIX typo in deprecation warning in the AP module
DOC better documentation on the AP
FIX The new parameter of the AP is called convergence_iter and not convergence_iteration anymore
ENH: Isotonic regression
MDS is now using the new isotonic_regression submodule
Added tests to isotonic_regression
DOC - added paragraph in user documentation on the isotonic regression + an example plot.
More documentation
FIX IsotonicRegression only takes vector input, hence don't test it in the common estimators
ENH IsotonicRegression now uses variable names that have more than 3 letters
ENH better error messages on the IsotonicRegression
Added a predict method to the IsotonicRegression
FIX random_state in MDS was not initialized properly
ENH isotonic regression is now slightly more robust to noise
Added test to check whether the isotonic regression changed y when all ranks were equal
ENH uses the IsotonicRegression classifier instead of the method
FIX the mds example did not plot the NMDS
FIX - nmds now uses the same scaling as previously
ENH we require a version of sphinx sufficient for "new" numpy_ext to work
FIX instead of appending numpy_doc to the list of extensions, directly add when creating the list
DOC: small fix in the regression's score method documentation
FIX make_classification now outputs integer labels
DOC formatting (k_means)
ENH - 3x speedup in the isotonic regression
FIX gen_rst.py was something using an undefined variable
Merge pull request #1886 from NelleV/DOX_fix
Added sponsors to the about.rst page
Spelling mistake
DOC fix in the hierarchical clustering
DOC Acknowledge sponsors for the Paris sprint
DOC fixed small mistakes in the pls module
Merge pull request #2140 from arjoly/ajoly-glouppe-sponsor
DOC fix small mistakes
DOC fixed some formatting in kernel approximation
DOC fixed some formatting in the multiclass module
Merge pull request #2146 from ianozsvald/clearer_iris_decision_surfaces
Merge pull request #2163 from ianozsvald/fix_plot_forest_iris_docs
ENH better error message when estimators don't specify their parameters in the signature.
Merge pull request #2187 from FedericoV/non_negative_style
Merge pull request #2195 from erg/bug-2189
ENH added an option to do an isotonic regression on decreasing functions
TEST: added a small test for fitting an isotonic regression on a decreasing function
TEST tests the class instead of the function for the decreasing isotonic regression
MAINT moved the pls file based module to a folder
TEST fixing pls tests failing:
MAINT Move the pls to the cca to a cross_decomposition module
MAINT renamed pls to cross_decomposition in the documentation
FIX the example plots of the pls module did not import pls methods from the correct module
FIX removed the cca and pls modules
FIX added the new module to the setup.py installation
DOC improved docs/docstrings on cross_decomposition
MAINT deprecated the pls module, moved CCA to cca_
FIX init methods of ABCMeta class also need to be abstract
FIX on py3k, we need explicit relative imports
FIX missing deprecation release information.
MAINT charset is deprecated in favor of encoding
TST added tests for encoding/charset deprecation
DOC better deprecation warning messages.
TST better testing of the PLS module
FIX PLSSVD now returns the correct number of components
COSMIT small documentation tweaks
DOC ignoring gen_rst's parsing errors
Merge pull request #2280 from larsmans/randomsearch-scoring
Merge pull request #2281 from ogrisel/improvements-to-setup-py
DOC fixed the optional arguments
FIX added some descriptions to each categories in the main webpage
FIX spelling mistake
FIX the css in the API
ENH added the fork me ribbon to the website
WEB added testimonials
DOC fixed the previous/next button
DOC fixed the collapsible sidebar
DOC dropdown menu works
FIX minor edits on the website
DOC fixed z-index on the website
FIX website layout on small screens
FIX improve display on small device
DOC fix dropdown menu
FIX backward compatibility was broken
DOC added link from banner to example.
DOC now building to html/stable
DOC home always points to stable
ENH added an orange cite us button on the front page
FIX cite us button made blue bar span too much
DOC added testimonials
FIX forgot evernote's logo
ENH added telecom to the testimonials
DOC updated evernote's testimonials
ENH added AWeber's testimonial
ENH added carousel back on front page for testimonials
ENH better spacing on the first page
ENH testimonials img are now centered.
FIX typo in testimonials
Nick Wilson (7):
DOC: Various minor fixes to "Contributing" docs
Skip k-means parallel test on Mac OS X Lion (10.7)
FIX: Delete temporary cache directory
BUG: Fix metrics.aux() w/ duplicate values
FIX: Add NORMALIZE_WHITESPACE to broken doctest
Stop passing keyword arguments for positional args
Add verbose parameter to SVMs (fixes #250)
Nicolas Pinto (34):
MISC: cosmetic -- setup.py is now pep8 safe
MISC: cosmetic -- cross_val.py is now pep8 safe
MISC: cosmetic -- fastica.py is now pep8 safe
MISC: cosmetic -- pca.py is now pep8 safe
MISC: cosmetic -- scikits/learn/setup.py is now pep8 safe
MISC: cosmetic -- pls.py is now (almost) pep8 safe
MISC: cosmetic -- hmm.py is now pep8 safe (getting tiring, next time I'll show up earlier at the sprint ;-)
MISC: cosmetic -- base.py is now pep8 safe
MISC: cosmetic -- grid_search.py is now pep8 safe
MISC: cosmetic -- grid_search.py is now pep8 safe
MISC: cosmetic -- more pep8
MISC: cosmetic -- setup.py is now pep8 safe
MISC: cosmetic -- cross_val.py is now pep8 safe
MISC: cosmetic -- fastica.py is now pep8 safe
MISC: cosmetic -- pca.py is now pep8 safe
MISC: cosmetic -- scikits/learn/setup.py is now pep8 safe
MISC: cosmetic -- pls.py is now (almost) pep8 safe
MISC: cosmetic -- hmm.py is now pep8 safe (getting tiring, next time I'll show up earlier at the sprint ;-)
MISC: cosmetic -- base.py is now pep8 safe
MISC: cosmetic -- grid_search.py is now pep8 safe
MISC: cosmetic -- grid_search.py is now pep8 safe
MISC: cosmetic -- more pep8
Fix typo in SGDClassifier's docstring (via GitHub).
Add arXiv link to Halko et al. 2009 paper.
DOC: fix a few incoherencies in ridge.py
ENH: add verbose option to LinearSVC
BUG: fix LibLinear verbosity for L2R_L2_SVC
MISC: verbose should be int, not bool
TST: add smoke test for LinearSVC's verbose option
ENH: add store_loo_values attribute to _RidgeGCV see Issue #957
FIX: expose loo_values_ in RidgeCV instead of the private _RidgeGCV
COSMIT: rename M matrix to loo_values
COSMIT: -loo_values +cv_values
FIX: use rng with fixed seed
Nicolas Trésegnie (38):
DOC fix macports package name
Add test for PatchExtractor (float value for max_patches)
Fix float value support for max_patches in PatchExtractor
Fix as_float_array behaviour when copy=True
Add test of the as_float_array behaviour when copy=True
Add a copy parameter to safe_asarray()
Imp readability
Missing value imputation
Fix tests
Fix tests + doc improvements + renaming
Add test with default value of copy + doc improvements
Imp readability
Fix use of as_float_array
pep8
Imp variables names
Del use of as_float_array + naming and documentation improvements
Fix use of mask
Fix import names
Add pycharm files in .gitignore
Imp splitting of preprocessing.py
Imp splitting of test_preprocessing.py
Del unused imports in preprocessing + pep8
Fix imports
Imp move OneHotEncoder to preprocessing/data.py
pyflakes and pep8
Fix self.statistics_ shouldn't be set if axis==1
Fix use of self
Refactor loss_func and score_func warnings in grid_search
Add score_overrides_loss to _deprecate_loss_and_score_funcs
Add deprecation warnings in Ridge
Add deprecation warnings in rfe
Add catching of the deprecation warnings in rfe and ridge tests
Refactor loss_func and score_func warnings in cross_validation + replacement in two examples
Fix 'scoring' docstrings
Imp documentation
Fix tests
Fix grid_search.py example
Fix tests
Noel Dawe (152):
adding boosting and decision trees
adding bagging and gradboost
minor change
working on interfacing with Cython
minor updates
Merge branch 'master' of git://github.com/scikit-learn/scikit-learn
pull from upstream
Merge branch 'master' of git://github.com/scikit-learn/scikit-learn
implemented AdaBoost
refactoring
minor fix
minor fix
almost done...
it compiles!
now it really compiles
minor fix
working on segfault
now it works
trying to fix score bounds
updates
Merge branch 'master' of git://github.com/scikit-learn/scikit-learn
sanity check in adaboost
more sanity checks in adaboost
fairly stable now
fixed bug where node cuts were not set but left at 0
working on limiting cases
updates
fixing bug in adaboost
Merge branch 'master' of git://github.com/scikit-learn/scikit-learn
updates
minor change
Merge branch 'master' of git://github.com/scikit-learn/scikit-learn
bagging now implemented
removing committee for now
updates
adding tests
better demonstration in test module
minor change
bugfix
Merge branch 'master' of git://github.com/scikit-learn/scikit-learn
minor change
pep8
Merge branch 'master' of git://github.com/scikit-learn/scikit-learn into decisiontree
updates
ignore splits that yield nodes with net negative weight in find_best_split
rm unneeded negative weight logic in Criterion.init_value and Gini.eval
add note about negative weight treatment in BaseDecisionTree.fit
add negative weights test (currently fails): predict_proba should still be valid probabilities
FIX: negative weight test. do not allow any class to have negative weight after a split
DOC: document negative weight treatment in the case of classification
implement AdaBoost
use weighted mean in ClassifierMixin.score
FIX: DecisionTreeRegressor.score
FIX: import not used
FIX: overlapping y-axis labels
FIX: use generator instead of np.random
rm doctest in make_gaussian_quantiles
fix variable naming in weight_boosting
FIX: TypeError for regressor
FIX minor comment
FIX: docs, code clean up, learn_rate -> learning_rate
FIX: plot_adaboost_classification.py
don't enforce DTYPE at the ensemble level
DOCS: note generator behaviour in staged methods
Make BaseWeightBoosting abstract and other misc changes
revert changes to grid_search
FIX: import
revert implementation of sample weights in BaseWeightBoosting.staged_score
revert a few spurious changes
pep8 + pyflakes, use arrays for errors_ and weights_
init weights_ to zeros and errors_ to ones
add Hastie 10.2 example
pep8
implement SAMME.R algorithm
update adaboost hastie example and weight_boosting tests
use broadcasting
combine real and discrete algorithms under one class
DOC: AdaBoostClassifier real arg
update example: fix histogram range
Merge pull request #20 from glouppe/adaboost
Merge pull request #21 from glouppe/adaboost
update adaboost example: exposes instability
displace predict_proba by 1e-10
Merge pull request #22 from glouppe/adaboost
FIX: adaboost predict_proba
only boost positive sample weights
FIX: only boost positive sample weights
Merge pull request #23 from glouppe/adaboost
FIX: negative and zero probabilities while boosting with SAMME.R
FIX: doctest
FIX: doctest and slightly larger displacement from zero probabilities (32 vs 64bit doctest instability)
remove weighted_r2_score (leave for next PR scikit-learn#1574)
revert spurious change in metrics.py
FIX: use full decision tree in AdaBoost and fix title in plot_forest_iris.py
DOC: add __doc__ to plot_adaboost_hastie_10_2.py
FIX: reference format
FIX: show decision boundary in plot_adaboost_classification.py
FIX: refactor plot_adaboost_classification.py and add legend
rename plot_adaboost_classification.py -> plot_adaboost_twoclass.py and add predict_twoclass method to AdaBoostClassifier
FIX: only possible split sometimes creating children with negative or zero weight in the presence of negative sample weights
FIX: improve multi-class AdaBoost example (rename to plot_adaboost_multiclass.py)
add author
typo
use metrics module and pep8
typo
fix class ordering in two-class
faster sample_weight initialization
speed improvements to make_gaussian_quantiles
even more speed improvements to make_gaussian_quantiles
py3k
DOC: note initialization of sample_weight if None
factorize common sample_weight check
Merge pull request #24 from glouppe/adaboost
add decision_function and staged_decision_function and refactor some code
Merge remote-tracking branch 'upstream/master' into treeweights
Merge pull request #25 from glouppe/adaboost
pep8
Merge pull request #26 from glouppe/adaboost
update adaboost regression example and use estimator_errors_
rm n_estimators argument from predict methods
DOC: fix docstring for make_gaussian_quantiles
FIX: alpha=.5 and use more difficult dataset in two-class example. Add mean and cov arguments to make_gaussian_quantiles
FIX: learning_rate default value consistency
FIX: TypeError message if base_estimator does not support class probabilities
FIX: comments from @ogrisel
make learning_rate=1 default for classification
only sum sample_weight once
rm sphinx/docutils formatting in exception messages
inline comment about learning_rate in hastie example
add note about SAMME.R converging faster than SAMME
add note about y coding construction
add description of dataset in two-class example
fix missing parenthesis in make_hastie_10_2 dataset
Merge pull request #27 from glouppe/adaboost
import pylab as pl
remove check for fit_predict
fix importance test and test both SAMME and SAMME.R algs
don't show class B probabilities in two-class example
two-class decision scores -> decision scores
clarification on two-class decision scores plot
explain decision scores in two-class example
fix AdaBoost.R2 and update example
DOC: loss_function
fix failing tests
fix failing doctest
Merge pull request #28 from glouppe/adaboost
API consistency with gradient boosting: loss_function -> loss
Merge pull request #29 from glouppe/adaboost
minor edits in docs
DOC: notes about examples and minor edits
make setup.py executable
AdaBoost: use estimator weights in predict_proba
Norbert Crombach (1):
Fix L2 regularization order in sgd_fast
Olivier Grisel (1411):
test to reproduce issue 67 on LARS coef shape
Merge branch 'master' into issue-67-LARS-shape
tracking changes from master
follow API change in LARS
Merge branch 'master' into issue-67-LARS-shape
Merge branch 'master' into issue-67-LARS-shape
make sparse coding test pass
more .gitignore
Merge branch 'master' of ssh://scikit-learn.git.sourceforge.net/gitroot/scikit-learn/scikit-learn
started work on document classification: bag of words extraction and hashed tfidf
some tests for the text features extractor
checkpointing work in progress on MLComp dataset integration
remove labels handling from vectorizer code
more work on document classification dataset loader
smaller default dim: faster to load by default, need experimental setting to find good tradeoff
make it easy to find the raw source document
better parameter ordering
example usage of MLComp document classification datasets
use compiled re pattern
small fixes
Merge branches 'master' and 'master' of ssh://scikit-learn.git.sourceforge.net/gitroot/scikit-learn/scikit-learn
typos
add the ability to use stop words for text classification, but does not improve accuracy hence not enabled by default
typo in comments
faster and better accuracy with hinge loss of doc classif example but not sparse anymore since l2 reg...
make the features package a first class citizen
Merge branch 'master' of ssh://scikit-learn.git.sourceforge.net/gitroot/scikit-learn/scikit-learn
ENH: more efficient stopping criterion coordinate descent GLM and comparison with python-glmnet
blas-ification of elastic net + ensure that the gap is initialized and evaluated
cosmit
work in progress on sparse vector extraction for document datasets
exclude scikits.learn.external package from top level nosetests env
missing pl.show() in rfe examples
more missing pl.show() in examples
using a separate class for the sparse version of the hashing vectorizer
readd the dense version of the vectorizer
checkpointing work in progress on the sparse version of the document vectorizer
more scalable TF-IDF computation unfortunately using a python for loop
new example to demonstrate sparse TF-IDF + sparse SVM on 20 newsgroups (too slow right now)
Merge branch 'sparse-documents'
avoid useless allocations in dense_to_sparse conversion
Merge branch 'master' of ssh://scikit-learn.git.sourceforge.net/gitroot/scikit-learn/scikit-learn
experimenting with character n-grams features (basic morphological analyzer)
Merge branch 'master' of ssh://scikit-learn.git.sourceforge.net/gitroot/scikit-learn/scikit-learn
fix one simple import in doctest for GMM
simple Makefile for repetitive dev tasks on POSIX OS
fix broken/unstable sparse SVM tests
even more stability fix for sparse SVM
fix broken doctest in HMM
disabling broken doctest in Gaussian Mixture Models
fix HMM doctests
cosmit
ignore coverage output folder
trailing spaces
skip remaining failing tests in HMM test suite
fix inline comment
Showcase the new LinearSVC wrapper with sparse liblinear bindings in the 20 newsgroups document classification example
tracking changes in master branch
fix broken test for text features extraction
fix broken test for text features extraction
tracking changes from master and restore broken SparseHashingVectorizer
add ability to compute token ngrams too
fix broken doctests for SVC / NuSVC
Merge branch 'master' into char-ngram-features
cosmit
cosmit + trailing spaces + improved some comments
pep8 spacing
more cosmit
cosmit
Merge branch 'master' of github.com:scikit-learn/scikit-learn into issue-77-sparse-cd
starting boilerplate for sparse coordinate descent
Merge branch 'master' of github.com:scikit-learn/scikit-learn into issue-77-sparse-cd
fix broken test
checkpointing work in progress
avoid confusing cython extension names
fixed various issues with sparse datatype handling in previous checkpoint
better note
first stab at the sparse CD
leave sparse evaluation of the dual gap for later
forgot files from previous checkin
Merge branch 'master' of github.com:scikit-learn/scikit-learn into issue-77-sparse-cd
check that sparse API for coordinate descent also work with dense list-based input
one more test for sparse CD
sparse dual gap too!
Merge branch 'master' of github.com:scikit-learn/scikit-learn into issue-77-sparse-cd
cosmit
Merge branch 'master' of github.com:scikit-learn/scikit-learn
fix broken doctest in cross_val
Merge branch 'master' into issue-77-sparse-cd
more robust tests
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Merge branch 'master' into issue-77-sparse-cd
missing import
added sparse Lasso utility class
OPTIM: fix a typo and some suboptimal cython constructs in dense coordinate descent
Merge branch 'master' of github.com:scikit-learn/scikit-learn
share the same cython impl for both lasso and elastic net CD
make d_w_max early stopping criterion scale invariant
cosmit: s/nsamples/n_samples/g and s/nfeatures/n_features/g
group stopping criterion related boilerplate in the same place for readability
FIX: make CD lasso robust to zero valued columns (useless features)
Merge branch 'master' of github.com:scikit-learn/scikit-learn
add duration to glmnet benchmark output
removed useless includes
make d_w_max threshold independent of the squared norm of y to make it useful in practice
Merge branch 'master' into issue-77-sparse-cd
port latest bugfix and optims from dense CD to sparse CD
fix NORMALIZE_WHITESPACE issues in doctests
more robust and understandable CD elastic net test using explained variance score instead of RMSE
forgot to setup the good value of rho in last checkin
Merge branch 'master' into issue-77-sparse-cd
Merge branch 'textextract' of git://github.com/mblondel/scikit-learn
Merge branch 'master' of github.com:scikit-learn/scikit-learn
better docstring + cosmit for the RFE module
pep8
move the RFE module to the feature selection package
merge textextract branch from mblondel
make analyzers inherit from BaseEstimators to get better repr and parameters management
work in progress: refactoring the document classification dataset API to remove the feature extraction step
Merge branch 'master' into textextract
tracking changes from master
cosmit
more precise doc on SVM complexity
FIX: explained variance is not symmetric: ground truth comes first
Merge branch 'textextract' of git://github.com/mblondel/scikit-learn into textextract
ENH: s/filter/preprocessor/ + docstring cosmit
ENH: more docstring love
ENH: implement __repr__ for DefaultPreprocessor so that estimators __repr__ looks prettier
cosmit
make the pipeline / grid_search object nicer to introspect in tests
FIX: make grid_search output deterministic even in case of tie on the scores
merging pprett sgd work while tracking master
mark heisentest as skipped: it randomly passes 3 out of 5 times on my box with pyamg installed
Merge branch 'master' into sgd
kill evil tabs
cosmit
cosmit on example
cosmit: PEP8 + some missing docstrings
more cosmit
more cosmit
merging with alexandre's fixes
fix broken doctests: they are space sensitive unfortunately
Merge branch 'master' into textextract
better way to load folder / files dataset
porting the sparse document classification example to the new API
cosmit: PEP8 + some missing docstrings
Merge branch 'master' into textextract
PEP8 + better docstrings
FIX: add missing README.txt file for the sgd examples
ENH: more cosmit, docstring, test cleanup for the metrics module
cosmit
register the SGD chapter in the user guide TOC
PEP 8 in metrics module
ignore generated doc elements
some more cosmits / PEP8
more PEP8
helpers to use tags with vim / emacs
Make it possible to pass explicit labels to confusion matrix
make binary classification recall explicit
factorizing code to make it easier to do the multiclass refactoring in one place
refactored test_metrics to handle the binary case explicitly and make room for
test precision recall for binary classification
more code factorization: fscore joins the party
and you thought you could escape the PEP8 screening
missing test for f1 output
cosmit
extract label extraction logics
ENH: make precision, recall and f1_score handle multi-class
removing test for multi label perf evaluation
FIX: area under curve: recall is x and precision is y
Merge branch 'master' into issue-155-multiclass-precision-recall
Merge branch 'master' into issue-155-multiclass-precision-recall
ENH: new utility in metrics module: the classification report
showcase the new classification report in the examples
add detailed performance report to the digits example
Merge branch 'master' into issue-155-multiclass-precision-recall
tracking changes occurring on master
cosmits
ENH: handle support to do weighted averages of P/R/F scores
scalar scoring functions for P / R / Fbeta
s/explained_variance/explained_variance_score
make the distinction between loss and score function more explicit
Merge branch 'master' into textextract
spelling
make the grid search able to use an arbitrary score function
Merge branch 'master' into textextract
removing the hashing vectorizers code that need a full rewrite
update SGD example to showcase the new OVA implementation
cosmit in k_means module
FIX: better k-means tests + fixed broken array init
FIX: potential division by zero in scaler
FIX: fixed more cheesy NaNs than an Indian restaurant in Paris
New example to demonstrate the KMeans API with various init strategies
trailing spaces holocaust
let me introduce the culprit of the last checkin
Merge branch 'master' into dense
more trailing spaces cleanup
ignore downloaded data from example
Various improvement in low dim classification example
Merge branch 'dense' of git://github.com/pprett/scikit-learn into dense
remaining conflict markers in previous checkin
cosmit in SGD example
s/libsvm/liblinear/ in classification example
remove the dependency to explicit ABC to keep 2.5 compat + PEP8
make the dense SGD code & docstring more readable
more precise docstring in base SGD class
forgot to finish a sentence on regularization in a docstring
more docstring love
cosmit
use multi proc in multiclass SGD by default
cosmits in SGD tests
more comits in the sgd tests
cleanup
cosmit
better test file name
reuse the dense SGD test suite for the sparse variant using test case inheritance
cosmit
cosmits in the SGD pyx files
more cosmit in pyx files
more cosmit in example
PEP8 in SGD tests + docstring
better looking docstring for sparse sgd
more info on loss and penalty params for sparse SGD
propagate spelling fixes to the dense SGD docstring
ducktyping in analyzers
work in progress on vocabulary dimension restriction
small fixes + updated the tests
cosmit
add note on fortran contiguous memory optim for the X array
Merge branch 'master' into textextract2
OPTIM: vectorizer with predefined dictionary 5x faster by eliminating scipy.sparse.vstack calls
some optims in the text preprocessors
OPTIM: sparse vectorizer uses COO a init
multi-line print cosmit
use a SGD model in the mlcomp demo since it is the fastest for this problem
cosmit
make it possible to do fancy indexing on filenames
move the mlcomp SGD example as a generic 20 newsgroup classification example
cosmit
better pipeline notation in vectorizer + classifier grid search example
4 more years!^W^W^W 1 more test for vectorizers with max_features
cosmit
factorize out shuffling dataset since it might be useful by default
new example on how to use pipeline / grid_search for extraction parameters
sample run output in the grid_search_text_extraction_parameters example
reST formatting of example
cosmit
better title for the mlcomp example
better example filename
reference new example in the documentation of the grid_search module
cosmit
ENH: automated class_weight for SVC on imbalanced data
more s/predict_margin/decision_function/ in examples
FIX: typo in custom score_func in grid_search
initial face recognition example using eigenfaces
FIX: better handling of NaNs in precision / recall / f-score metrics
Merge branch 'faces-example'
face recognition example using eigenfaces and SVMs
more explicit subplot titles
cosmit
FIX: actually truncate the SVD to make it faster + add some test
forgot the test file in my last checkin...
drop the warning since useful even if approximate as demoed in the faces example
make fast_svd deterministic by default while allowing to pass rng seeds
test singular values as well
new benchmark: comparing SVD implementations
remove useless import
more documentation on fast SVD + missing reference
PEP8 + various cosmits in sample generators
more tests for the iterated power refinement of the Martinsson randomized SVD
ENH: make the PCA transformer use the iterated power refinement by default
one more test for SVD
Welcome to Alexandre Passos
OPTIM: do not allocate a (n_samples, n_samples) temporary array with scipy.linalg.qr when (n_samples, k + p) is all that is needed
OPTIM: fast_svd now has an auto transpose mode that switches to the fastest impl
cosmit
switching back to scipy.linalg.qr with econ=True to avoid half-installed numpy issues with wrong lapack bindings
FIX: numerical instability in Ridge regression tests
cosmit
new example: principal eigen / singular vector of the wikipedia graph
Better docstrings in the example
simpler SVD benchmark: use the sample_generator utility and fixed effective rank
moving real word examples to the applications subfolder
better gitignore data archives
s/_sparsedot/safe_sparse_dot/g
even better .gitignore (teasing...)
cosmit on PCA module
avoid global variable in test
ENH: make the PCA transformer perform variance scaling by default + update the face recognition accordingly
FIX: GridSearchCV refit did not propagate the fit params
switch whitening off in PCA by default + ensure unit scale + better docstring
use a grid search for the SVM params in the faces example
updated lasso benchmark to showcase the region where LassoLARS is faster than Lasso CD
OPTIM: ensure lasso_path aligns the data only once if not already fortran contiguous
pep8
ENH: LassoCV / ElasticNetCV now uses all folds data + example
ENH: make the LassoLARS and LassoCD path examples easier to compare
make MSE plot of LassoCV more readable by scaling the y axis
FIX: update broken tests by last checkin
switch to base 10 for the alpha logs in the Lasso CD path plot
revert the plot style to the LARS paper conventions
select the best alpha using the mean of the CV MSEs instead of the median
cosmit: += assignment replaced by plain = in coordinate_descent (more natural, less confusing)
extract the randomized SVD implementation as a toplevel class able to handle sparse data as well
consistently rename n_comp to n_components
Merge branch 'master' into sparse-pca
update doctest to handle the change in regularizer strength definition in LARS
FIX: typo s/mean/mean_/g in RandomizedPCA
Merge branch 'master' into sparse-pca
sed -i "s/\<n_componentsonents\>/n_components/g"
SVD benchmark have a consistent filename
factorized out correlated regression dataset utility function and updated
do not allocate useless memory in make_regression_dataset
launch test on documentation by default when running make
cosmit
OPTIM: do not precompute r2_score_ in ElasticNet in the fit call
do not precompute explained_variance_ in linear model: can be too costly: use r2_score when needed instead
new benchmark for lasso path implementations
merging master
temporary test fix for refit instability in linear SVC: a bugfix branch will be open to reproduce the issue
cosmit (reST formatting of the SGD module documentation)
Merge branch 'master' of github.com:scikit-learn/scikit-learn
more formatting in SGD reST and fixed docstest broken by last checkin :(
cosmit
ENH: make it possible to customize the WordNGramAnalyzer token regexp
Merge branch 'master' of git://github.com/jaberg/scikit-learn
PEP8
more PEP8
more PEP8
style conventions for variable names
FIX: allow the trivial border case k==n in KFold CV
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Merge branch 'cv_indices' of https://github.com/agramfort/scikit-learn
ENH: KMeans tolerance parameter renamed tol (as in coordinate descent) and made public
FIX and more tests for PCA and inverse_transform also for RandomizedPCA
Add documentation for the RandomizedPCA class
add new method for fetching datadir + reorg os related imports
checkpoint WIP for the LFW dataset loader
Merge branch 'master' into lfw-dataset
fix broken dataset description
checkpointing work in progress
Merge branch 'master' into lfw-dataset
work in progress on LFW: fetching the data
more work on dataset loader for LFW pairs
get rid of the normalization that should not be part of the load time
Merge branch 'master' into lfw-dataset
make it possible to load the LFW people dataset using the scikits.learn.datasets infra
remove stupid color slicing 'feature' and shuffle the examples
pep8
better default slice values
better looking example
Merge branch 'master' into lfw-dataset
face verification example will be implemented later
Merge branch 'master' into lfw-dataset
cosmit typo
first test for the LFW loader skipped if missing data folder
more LFW tests
pep8
documentation for the LFW dataset loaders
Merge branch 'master' into lfw-dataset
generate fake LFW dataset to fully test the LFW loader even without access to the real data
add HTML coverage report
more robustness test checks for LFW loader
first stab at factoring the 20 newsgroups dataset loading
cosmit
cosmit
fix kw params propagation to load_files
update the grid search example
remove function autodoc section that breaks sphinx
better name: rename load_files to load_filenames
better name: rename class_names to target_names for consistency
merge lfw-dataset to 20newsgroups-dataset
cosmit
Merge branch 'lfw-dataset' into 20newsgroups-dataset
Merge branch 'lfw-dataset' of https://github.com/GaelVaroquaux/scikit-learn into lfw-dataset
cosmit / ordering
use explicit parameter passing
merge changes from LFW branch
Merge branch 'lfw-dataset'
Merge branch 'master' into 20newsgroups-dataset
some more work on the datasets documentation
improvements to the datasets documentation
fix: avoid creating a spurious '~' in the current working directory
pep8
typo
missing justification for the shuffling of samples
Merge branch 'master' into 20newsgroups-dataset
restore python 2.5 compat
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Merge branch 'master' into 20newsgroups-dataset
FIX: make PCA models usable in pipelines
Merge branch 'master' into 20newsgroups-dataset
add backward compat for old load_files public API
Merge branch 'master' of github.com:scikit-learn/scikit-learn
trailing spaces
pep8
style
add check to the nature of y to have more explicit error messages
explicit ValueError when not enough data for kmeans and some pep8
style
make RandomizedPCA work on list data
FIX: the datasets doctest fixture could never skip the tests when required
use WARNING level logs before using network access
make the test display the output on stdout
ENH: add function to clear the data_home cache + tests
full PEP8 compliance for the scikits.learn.datasets package
renamed load_* to fetch_* when network connection is potentially involved
add load_lfw_pairs and load_lfw_functions for backward compat and consistency
load_20newsgroups as an alias for fetch_20newsgroups in offline mode
trailing spaces
break test data symmetry to avoid heisenfailure in RandomizedPCA test
Merge branch 'master' of github.com:scikit-learn/scikit-learn
FIX: heisen test failure + some pep8 in test_pca.py
FIX: make the PIL dependency optional (skip LFW tests if not present) + explicit error message
FIX: make the PIL dependency optional (skip LFW tests if not present) + explicit error message
FIX: workaround broken PIL installs
Merge branch 'nmf-lite' of https://github.com/vene/scikit-learn into vene-nmf-lite
Merge branch 'nmf-lite' of https://github.com/vene/scikit-learn into vene-nmf-lite
ENH: plot eigenfaces in face recognition example
ENH: do not download LFW when building the documentation by default
Merge branch 'master' into vene-nmf-lite
Merge branch 'nmf-lite' of https://github.com/vene/scikit-learn into vene-nmf-lite
Merge branch 'text' of https://github.com/vmichel/scikit-learn into vmichel-text
FIX: update the examples to match the new text feature extraction API
FIX: feature_extraction.text is now a module instead of package
FIX: forgot to update the documentation after the feature_extraction.text refactoring
FIX: decrease disk usage in LFW data folder
ENH: factorize some plot code in face recognition example
FIX: broken link to plot_kernel_pca kernel in the documentation
typo
MISC: style fixes in NMF
ENH: improved contributors guide
Merge branch 'master' of github.com:scikit-learn/scikit-learn
ENH: add coverage install command
cosmit
DOC: first stab at the performance chapter (full of TODOs)
Merge branch 'master' of github.com:scikit-learn/scikit-learn
DOC: missing class reference
DOC: cosmit
MISC: another style fix for a private function in nmf
DOC: add sample python profiling session
DOC: note for later
DOC: add some missing reference in the performance guide
ENH: avoid the use of lambdas in NMF to get a more informative profiling output
Merge branch 'master' of github.com:scikit-learn/scikit-learn
DOC: fix small inaccuracy
DOC: more warning fixes for the classes reference toc
FIX: stupid statement in plot_face_recognition
DOC: make the face recognition example static (to avoid having to download the dataset to build the doc)
MISC: style fixes in NMF
ENH: improved contributors guide
ENH: add coverage install command
cosmit
DOC: first stab at the performance chapter (full of TODOs)
DOC: missing class reference
DOC: cosmit
MISC: another style fix for a private function in nmf
DOC: add sample python profiling session
DOC: note for later
DOC: add some missing reference in the performance guide
ENH: avoid the use of lambdas in NMF to get a more informative profiling output
DOC: fix small inaccuracy
DOC: more warning fixes for the classes reference toc
FIX: stupid statement in plot_face_recognition
DOC: make the face recognition example static (to avoid having to download the dataset to build the doc)
DOC: refined the python profiling example
DOC: fix / add more class reference links in perf doc
wording
DOC: started intro YEP
Merge branch 'variational-infinite-gmm' of https://github.com/alextp/scikit-learn into alextp-variational-infinite-gmm
DOC: use uppercase for project / language names
Merge branch 'batchKMeans' of https://github.com/NelleV/scikit-learn into NelleV-batchKMeans
ignore 'cython -a' HTML reports
Merge branch 'batchKMeans' of https://github.com/NelleV/scikit-learn into NelleV-batchKMeans
ENH: style, pep8, docstrings comments, variable names
ENH: more interesting batch size
ENH: more fixes for variable names
ENH: fix example docstring
DOC: more work on the performance chapter
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Merge branch 'variational-infinite-gmm' of https://github.com/alextp/scikit-learn into alextp-variational-infinite-gmm
Merge branch 'master' into alextp-variational-infinite-gmm
Merge branch 'variational-infinite-gmm' of https://github.com/alextp/scikit-learn into alextp-variational-infinite-gmm
Merge branch 'variational-infinite-gmm' of https://github.com/alextp/scikit-learn into alextp-variational-infinite-gmm
Merge branch 'variational-infinite-gmm' of https://github.com/alextp/scikit-learn into alextp-variational-infinite-gmm
Merge branch 'variational-infinite-gmm' of https://github.com/alextp/scikit-learn into alextp-variational-infinite-gmm
Merge branch 'variational-infinite-gmm' of https://github.com/alextp/scikit-learn into alextp-variational-infinite-gmm
Merge branch 'variational-infinite-gmm' of https://github.com/alextp/scikit-learn into alextp-variational-infinite-gmm
ENH: more informative test error message
typo
ENH: spectral clustering doc and style improvements (pep8, docstrings, references, variable names)
cosmit
cosmit
ENH: style / pep8 / docstring fixes in s/l/utils/fixes.py
ENH: new make_rng utility function to help make PRNG seeding explicit
TEST: forgot to checkin the unittest for the make_rng function
ENH: add test for picklability of the spectral clustering model
FIX: make normalizer use the real l1 norm on each row (without assuming positive values)
DOC: typo in line-prof package name
FIX: broken import in bench_plot_nmf
DOC: fix doctests to make them work with numpy 1.5 and older
merged master
DOC: trim_doctests_flags = True for sphinx
Merge pull request #147 from larsmans/master.
rename rng to random_state
cosmit
delayed check_random_state in k means and spectral clustering
Merge pull request #154 from larsmans/master.
kill trailing spaces
merge master
merge from master, update random_state API + pep8
Merge pull request #150 from pprett/learningrate
track changes from master
Compressed README.rst to make it an executive summary
started work on homogeneity, completeness and V-measure as clustering metrics
working implementation of V-measure, still needs doc and updated clustering examples
use V-measure metrics in K-means example
add missing return info in swiss roll docstring
illustrate clustering metrics on affinity propagation example
100% test coverage for the new clustering metrics
more tests
add more documentation for the new metrics
typo
typo
split some tests to make them more atomic
Merge branch 'master' into clustering-metrics
pep8
typos
Merge branch 'master' into clustering-metrics
typo
Merge branch 'batchKMeans' of https://github.com/NelleV/scikit-learn into NelleV-batchKMeans
pep8
Merge branch 'master' of github.com:scikit-learn/scikit-learn
more pep8
better docstring for the LabelBinarizer in the multilabel case
started work on normalizer API simplification
work in progress on package structure
FIX: rounding issues on python 2.6 in clustering metrics doctests
ENH: add a note on the symmetry of the metrics
ENH: simpler import statement in example
ENH: simpler import statement in example + explicit square
ENH: add links to the reference guide
ENH: better docstrings for symmetric considerations
cosmit
ENH: better organization of metrics references
ENH: reorganization of the document to be operational quicker
fix broken test introduced in last checkin
new utility function to generate blobby datasets
Merge branch 'master' of github.com:scikit-learn/scikit-learn
FIX: indexing bug when labels are not consecutive
Merge branch 'master' into clustering-metrics
FIX: broken doctests
Merge branch 'master' of github.com:scikit-learn/scikit-learn
ENH: new utility function to shuffle data in a consistent way
Merge branch 'batchKMeans' of https://github.com/NelleV/scikit-learn into NelleV-batchKMeans
Merge pull request #161 from ogrisel/clustering-metrics
ENH: small fixes in scikits.learn.utils.shuffle
Merge branch 'batchKMeans' of https://github.com/NelleV/scikit-learn into NelleV-batchKMeans
Welcome to Nelle!
pep8
ENH: syntactic sugar for the shuffle utility
ENH: better / simpler handling of shuffling in MiniBatchKMeans
ENH: refactored shuffle to address the resampling with replacement case + more tests
FIX: n_samples bug in shuffle, 100% coverage in utils, missing reference doc entries
first shot at a bootstrapping cross validator
typos
more typos
ENH: ensure that training and test split do not share any sample
ENH: better input validation + more representative doctest
ooops
cosmit
DOC: cleanup in cross validation doc
Merge branch 'master' into bootstrap
add bootstrap to reference doc
DOC: new section for the Bootstrap cross-validation
cosmit
cosmit
add see also in resample docstring
FIX: make cross_validation_score work with sparse inputs
merge master
cleanup leftover
ENH: add test for the permutation_test_score with sparse data
Merge branch 'master' into bootstrap
more tests
Merge branch 'balltree-wrapper' of https://github.com/jakevdp/scikit-learn into jakevdp-balltree-wrapper
Merge branch 'bootstrap'
FIX: make r2_score and explained_variance_score never return NaNs
Merge branch 'master' of github.com:scikit-learn/scikit-learn
pep8
add a comment explaining the + 10
Merge branch 'mldata' of https://github.com/pberkes/scikit-learn into pberkes-mldata
pep8 / style
fix broken test in MultinomialNB
ENH: more readable datasets definitions
ENH: avoid double HDD copy of mocked datasets + style
merge
merge master
add random projection and PCA to digits manifold example
use scikit-learn QR compat alias
cosmit
ENH: split figures for better reusability and readability
Merge branch 'extended-digits-manifold-example'
ENH: make the LLE random seeding controllable and deterministic by default
Merge branch 'master' of github.com:scikit-learn/scikit-learn
docstring style
FIX: broken doctests and missing max_iter attribute in LassoLARS
FIX: broken doctest in the documentation caused by the last fix
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Merge branch 'master' into preprocessing-simplification
work in progress on SampleNormalizer unification
enable test for the sparse variant
getting rid of the remaining stuff in the preprocessing.sparse package
more explicit / descriptive low level cython function names
cosmits / pyflakes / pep8
ENH: improve docstring with missing parameters and motivations
factorize a normalize utility function
s/SampleNormalizer/Normalizer/g
Merge branch 'master' into preprocessing-simplification
moar tests
more tests for preprocessing (scaling)
more tests for preprocessing: coverage is now 100%
make centering optional in Scaler / scale + fix broken test
one more test
one more test for preprocessing (no mean centering)
fail early
pep8
ENH: docstrings for Scaler / scale
bugfix: sparse_format can be omitted
typo
better docstring for Scaler
register the preprocessing utilities to the reference documentation
fixes in See also sections
ENH: give motivations for standardization in the Scaler docstring
ENH: style fixes and better use of the scikit-learn API in ROC example
Merge branch 'master' into preprocessing-simplification
started work on the narrative documentation for the preprocessing package
typo
reorg TODO and notes
DOC: section on normalization
DOC: section on feature binarization
factorize the binarize function + write documentation
format
Merge pull request #194 from jakevdp/balltree-queryrad
Merge pull request #198 from amueller/fastICA_transposed
Merge pull request #207 from pprett/mbkm-fix
Merge branch 'master' into pberkes-mldata
DOC: reorg of dataset page to make it more consistent
FIX: make the dataset doctest fixture modular
typo
track changes from master
FIX: make the dataset doctest fixture modular
typo
PEP8
ENH: make rng of the LLE tests controllable to hunt down potential NaNs
FIX: add tolerance for lack of numerical precision
Merge remote-tracking branch 'lemin/sparse-mbkm'
remove leading _ in _gen_even_slices and duplicate implementation in sparse_pca
remove verbose output from GMMHMM test
Merge pull request #272 from glouppe/master
fixed broken doctest in HMM
Merge remote-tracking branch 'sabba/master'
Merge pull request #289 from sabba/master
ENH: more rng instance instead of singleton in tests
FIX: potential division by zero when normalizing non-pruned CSR matrices
PEP8 in LLE tests + better assertion failure messages
display the eigen solver name in case of LLE reconstruction test failure
ENH: make the file loader keep the filenames information
cosmit on docstring first line
FIX: broken Gram handling in OMP estimator + minor style improvements
Merge branch 'master' into jakevdp-manifold-isomap
FIX: broken dataset generator import + minor styling issues
fix comment
Merge pull request #303 from glouppe/master
FIX: avoid the dependency on pylab in the doctests
Merge remote-tracking branch 'vene/patch-extraction' into vene-patch-extraction
fix broken doctests
ENH: remove references to digits + format
plot the original centered sample + make sparse pca a little less sparse + kmean a little less like init
DOC: make the decomposition doc more consistent with running faces example
cosmit
ENH: use introspection to find the cluster components
DOC: group SparsePCA and MiniBatchSparsePCA chapter to reduce redundancy
cosmit
ENH: minor style fixes in docstrings and comments
cosmit
cosmit
FIX: removed recently introduced mistake from dict_learning_online docstring
Carve the emerging consensus on __init__ vs fit parameters in the contributors documentation
cosmit
DOC: give some motivation for the return of self in fit
DOC: formatting mistake
DOC: more fitting doc improvements
typo
DOC: more formatting
yet another typo
Merge pull request #311 from glouppe/test-coverage
Merge pull request #302 from jakevdp/manifold-doc
DOC: section level fix in clustering doc
Merge remote-tracking branch 'robertlayton/kmeans_transform2' into robertlayton-kmeans_transform2
checkpoint style improvements for the KMeans predict
track changes from upstream/master
time the main operations
add warning utils and use it in KMeans when data matrix is integers, boolean, complex...
checkpointing work in progress on VQ example
ENH: add missing inverse_transform method for Scaler
Merge branch 'master' into robertlayton-kmeans_transform2
fix the VQ example by switching to floats in range 0 - 1
Merge branch 'master' into robertlayton-kmeans_transform2
cosmit
use the scipy public API rather than PIL
update the documentation
ENH: 'make test' now runs the doc doctests as well
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Merge remote-tracking branch 'JeanKossaifi/master'
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Merge remote-tracking branch 'JeanKossaifi/sorted_repr' into JeanKossaifi-sorted_repr
FIX NMF doctests
ENH: shorter doctest output
ENH: pipeline doctest style improvements
FIX: updating doctests in gaussian_process.rst and linear_model.rst
FIX: remaining broken doctests
FIX: doctests on buildbot
cosmit
ENH: new example: NMF topic extraction on 20 newsgroups
FIX: useless arg to argsort in NMF example
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Merge remote-tracking branch 'glouppe/master' into glouppe-master
Merge pull request #328 from bdholt1/crossval
more scikits.learn => sklearn updates
ENH: new Makefile target to cythonize everything
Merge branch 'master' of github.com:scikit-learn/scikit-learn
batch re-cythonization with version 0.15 and new package names
new package name
more renamings
fix typo in scikits.learn.qda
update the Makefile test-coverage target to work with the new package layout
Merge branch 'master' into bdholt1-enh-tree
trailing spaces in pyx file
More style consistency improvements
style: constant in capital letter on top + extract graphviz tree template
cosmit
More style improvements in _tree.pyx
ENH: cross_val module docstring and style improvements
ENH: more randomized cross val docstring & var naming improvements
Merge branch 'master' into bdholt1-enh-tree
ENH: doctest simplification by using the cross_val_score func
He who seeks only vanity and no love for humanity shall fade away
style
Better exception messages in SVM
ENH: make the cross_val_score able to use functions from the metrics module
ENH: better docstrings for SVMs
Merge branch 'master' into cross-validation-improvements
DOC: improvements to the cross validation doc layout + missing ref to ShuffleSplit and cross_val_score
Merge remote-tracking branch 'bdholt1/enh/tree' into bdholt1-enh-tree
Merge branch 'master' into bdholt1-enh-tree
Merge remote-tracking branch 'glouppe/master' into glouppe-master
Merge branch 'master' into glouppe-master
Add missing authorship + license info to NMF topics example
Merge branch 'master' into cross-validation-improvements
ENH: more cross_val doc for LOLO and LPLO
DOC: add info about smart CV and IC estimators
cosmit
ENH: s/n_labels/n_unique_labels/g in cross_val
FIX: compat with numpy version lacking the out argument for dot
ENH: misc style / docstrings improvements
Merge pull request #341 from ogrisel/cross-validation-improvements
s/\bcross_val\b/cross_validation/g
backward compat for cross_val namespace
cosmit
API: start 'API changes summary' section in doc/whats_new.rst
API: removal of fit parameters
FIX: fix broken tests on ElasticNetCV
batch trailing spaces cleanup
ENH: docstring cleanup
Mark sklearn.hmm as orphaned
FIX: make the @deprecated class decorator not break the __repr__ of estimators
ENH: implementation Adjusted Rand Index for clustering evaluation
cosmit
removing the undocumented implementation of the unadjusted Rand index in kmeans_
cosmit
missing import in the metrics namespace
DOC: narrative documentation for the ARI
DOC: typos
FIX: fix broken document clustering example and add ARI to examples
add doctest for combinations (to document the n < k case)
more tests for ARI and clustering metrics
test non consecutive integers in perfect match
FIX: use scipy's fast implementation of comb + fix tests + limit cases + faster adjustment test
cosmit
OPTIM: use exact comb evaluation since it's faster for the ARI case
cosmit
cosmit
DOC: add example to illustrate the concept of adjustment for chance
more details about ARI value range
make example script filename more explicit
typo
Merge branch 'master' into cluster-metrics-2
Merge remote-tracking branch 'jakevdp/neighbors-refactor' into jakevdp-neighbors-refactor
cosmit + doctest
DOC: reorg, bold important points, include adjustment plot as figure
typo
Merge pull request #347 from ogrisel/cluster-metrics-2
Merge remote-tracking branch 'jakevdp/neighbors-refactor'
more enhancements, variable names and test fixes
Added items for cross validation and clustering metrics
trailing spaces
Merge remote-tracking branch 'vene/sc' into vene-sc
cosmit
DOC: howto register the %lprun line_profiler magic on IPython 0.11+
Merge pull request #313 from robertlayton/pairwise_distance
Merge branch 'master' into vene-sc
Merge remote-tracking branch 'vene/sc' into vene-sc
Merge branch 'sc' of https://github.com/vene/scikit-learn into vene-sc
Merge branch 'sc' of https://github.com/vene/scikit-learn into vene-sc
Merge branch 'sc' of https://github.com/vene/scikit-learn into vene-sc
Merge branch 'sc' of https://github.com/vene/scikit-learn into vene-sc
Merge branch 'vene-sc'
LassoLarsIC/CV and metrics.roc_curve in whats_new
Cosmit.
Merge pull request #353 from amueller/sgd_warm_starts
DOC: cross validation: introduce motivation and basic usage first
Merge branch 'master' of github.com:scikit-learn/scikit-learn
typo: s/accurracy/accuracy/g
Merge pull request #360 from cmd-ntrf/master
ENH: no need for L2 norm on input in doc clustering
ENH: make load_files use a fixed shuffling of the samples
DOC: better svmlight_loader / dumper docstrings
ENH: 30% speed improvements in load_svmlight_file
ENH: remove useless call to strip while staying robust to empty lines
ENH: make MiniBatchKMeans display more info in verbose mode
Merge pull request #373 from larsmans/svmlight
Revert "BUG fixed and cosmetics in CountVectorizer"
ENH: make it possible to skip label assignments in MiniBatchKMeans
thanks to @larsmans, TFIDF is now always positive :)
Merge remote-tracking branch 'bdholt1/enh/tree' into bdholt1-enh-tree
Merge pull request #381 from satra/doc/permutation
FIX: compat with numpy 1.5.1 and earlier in NMF
Merge remote-tracking branch 'bdholt1/enh/tree' into bdholt1-enh-tree
Merge pull request #377 from larsmans/sparse-nmf
pep8
pep8
OPTIM: inplace max in distances computation
OPTIM: avoid unnecessary repeated memory allocations in minibatch k-means
Merge remote-tracking branch 'bdholt1/enh/tree' into bdholt1-enh-tree
cosmit: pep8 and trailing spaces
merge master
DOC: fix broken links + various cosmits
FIX: remove non-ASCII char from silhouette docstrings
Some clarification of the memory copy issues.
OPTIM: inplace dense minibatch updates and better variable names
cosmit
cosmit: better variable name in MiniBatchKMeans
Merge branch 'master' of github.com:scikit-learn/scikit-learn
ENH: make it possible to control the add variance caused by Randomized SVD
ENH: document clustering example simplification
FIX broken doctests on buildbot + pep257
Merge branch 'master' of github.com:scikit-learn/scikit-learn
first stab at nearest center in cython (+30% perf, need check correctness)
factorized label assignment as a reusable python func for the predict method
use direct blas ddot call and reuse _assign_labels in predict
FIX: broken test caused by the use of todense, which returns a matrix instance instead of a regular numpy array
WIP on simpler cython impl of the center update (still buggy)
compute inertia + remove code :)
update renamed function call
factorize dot product and bootstrap implementation for the dense case
use cpdef + less array overhead in ddot
started kmeans test suite refactoring
more code factorization
refactored the kmeans tests
test and fix input checks for various dtypes
much cheaper yet stable stopping criterion for the minibatch kmeans
FIX: missing relative import marker
Merge pull request #400 from amueller/docs_typo
DOC: LogisticRegression is a wrapper for liblinear.
FIX #401: update tutorial doctests to reflect recent changes and add them to
Merge branch 'master' of github.com:scikit-learn/scikit-learn
DOC: new scikit-learn.org URLs and mention license in README.md
Merge remote-tracking branch 'robertlayton/ami' into robertlayton-ami
measure runtimes for various clustering metrics in adjusted for chance example
FIX warnings by avoiding 0.0 values in the log + cosmit
Merge branch 'master' into minibatch-kmeans-optim
unused import
low memory computation of the square diff
be more consistent with the usual behavior of fitted attributes
base convergence detection on EWA inertia monitoring
various cython cleanups
work in progress to make it possible to use a speedy version based on smoothed inertia only
ENH: more informative error messages when input has invalid shapes
Merge branch 'master' of github.com:scikit-learn/scikit-learn
ENH: more informative error message when shape mismatch in TF IDF transformer
merge master
preparing new stopping criterion impl
ENH: make it possible to pass class_weight='auto' as constructor param for SGDClassifier
Merge branch 'master' into minibatch-kmeans-optim
work in progress (broken tests) on early stopping with both tol and inertia lack of improvement
make min_dist test more explicit
fixed broken test
optimize label assignment for dense minibatch and new test
fix tests
fix tests
start with zero counts in tests
fix bug: x_squared_norms should follow the shuffle...
ensure that the sparse and dense variant of the minibatch update compute the same thing
better default value and parameter handling for max_no_improvement
switch to lazy sampling with explicit index to divide memory usage almost by 2 and decrease code complexity with no measurable impact on the run time
more code simplification
started example to check the convergence stability in various settings
FIX: buggy usage of for / else for k-means n_init loop
DOC: update what's new
tracking changes from master
FIX: broken HMM tests caused by KMeans convergence in one step
merge master
ENH: use integer indexing instead of boolean masks by default for CV
implemented n_init for MiniBatchKMeans
Merge branch 'master' into minibatch-kmeans-optim
refactored the init logic for MiniBatchKMeans
Merge branch 'master' into minibatch-kmeans-optim
fix stability and warning in tests
make k-means++ work on sparse input and use it as default for MB k-means
add version info in deprecation message
factorized out the early stopping logic in a dedicated method
first stab at a reinit strategy that work on low dim data only
new example to emphasize issues with current naive reinit scheme on sparse data
second experiment on reinit that does not work on high dim sparse data either
PEP8 + various cosmits
pep8 in sparse covariance example
PEP8 + PEP257 in samples_generator
PEP257 - docstring style
Merge branch 'master' into minibatch-kmeans-optim
FIX: make the doctests outcome deterministic
DOC: better toplevel docstring
DOC: add simple descriptions in the concrete class docstrings
FIX: workaround what looks like a numerical instability in doctest
Merge pull request #439 from glouppe/ensemble-rebased
Merge pull request #453 from yarikoptic/master
pep8
Merge pull request #452 from glouppe/doc
PEP257 cosmit
cosmit
Update README.txt dependencies info to match the configuration tested on jenkins
cosmit
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Merge branch 'master' of https://github.com/scikit-learn/scikit-learn
track changes from master
pep8
fix k_means docstring to better match the scikit naming conventions
WIP: n_init refactoring
merge master
Merge pull request #481 from mblondel/mean_var2
Merge branch 'master' of https://github.com/scikit-learn/scikit-learn
Merge branch 'master' into minibatch-kmeans-optim
scale tolerance of minibatch kmeans on CSR input variance
delete broken example
example script is not meant to be executed when building the doc as it is slow
Merge branch 'master' of https://github.com/scikit-learn/scikit-learn
typo: accross => across
Merge branch 'master' into minibatch-kmeans-optim
typo: accross => across
Use python int for indices and indptr of scipy sparse matrices to ensure cross platform support
Make init less expensive by default on MinibatchKMeans to avoid dominating computation on large scale datasets
Fix broken duplicated / tests and more practical init
consolidating all cython utils for sparse CSR in the same file under utils
WIP: scaling CSRs
Merge branch 'master' into minibatch-kmeans-optim
FIX compat for errorbar legend for old matplotlib versions
slight optim: remove useless assignment from the inner loop
FIX: numerical instability caused by collapsed allocation of bad clusters to the center of mass
example tweaks
fix text position in example
its
better documentation for the convergence stability example
Merge branch 'master' into minibatch-kmeans-optim
simplify stability evaluation example
enable the kmeans stability as an auto examples as the speed is now fast enough
docstring in cython funcs + better var name: with_sqrt
Merge branch 'master' of https://github.com/scikit-learn/scikit-learn into minibatch-kmeans-optim
cosmit
merge master
readd dtype and ccontiguous checks removed by mistake during last conflict resolution
Merge branch 'master' of https://github.com/scikit-learn/scikit-learn into minibatch-kmeans-optim
merge master
remove useless dependency on pylab
fixed conflict in import resolution
FIX: validation is a relative package
FIX: py3k - more relative imports
FIX: py3k: string.letters is locale dependent and absent in py3k
Merge branch 'master' into sparse-scaler
WIP: feature scaling for CSR input (lacks some tests)
fix scaling, more tests and docstrings
Merge branch 'master' into sparse-scaler
wording
FIX: py3k integer division in robust covariance estimation
FIX: py3k integer division in samples generator
FIX: in py3k svmlight files must be explicitly opened in binary mode
FIX: py3k bytes split in svmlight format parser
Merge branch 'master' of github.com:scikit-learn/scikit-learn
FIX: py3k need explicit bytes buffers for svmlight format serialization
FIX: py3k need output file in binary mode for svmlight format serialization
FIX: py3k: string formatting is not supported on byte strings
FIX: fix test: integers are valid file descriptors in py3k
Merge branch 'master' into sparse-scaler
FIX: unused cython variable
More checks when transforming sparse matrices with centering scalers + typo
DOC: update narrative documentation
optim: avoid useless memory copy when input is non CSR
DOC: typo / wording
DOC: document sparsefuncs cython routines in developer section.
DOC: wording
DOC: wording
Merge pull request #515 from ogrisel/sparse-scaler
update what's new for sparse scaling
Fix the docstring of the univariate feature selection module to match the scikit conventions
cosmit
typo
cosmit
FIX: None and int comparison not authorized in py3k (in PCA)
FIX: dicts no longer have the has_key method in py3k: test for the method we actually use instead
FIX: make feature extraction work with the new py3k string API too
FIX: py3k's zip is not subscriptable
FIX: handle py3k exception API
FIX: previous fix for py3k str API in feature extraction was a bug in python 2
FIX: pervasive use of unicode in feature extraction for py3k compat
Update random forest face example to use several cores
ENH: make ShuffleSplit able to subsample the data
FIX: ensure fetch_20newsgroups_vectorized outputs CSR matrices to work with cross validators
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Merge pull request #519 from ogrisel/subsampling-shufflesplit
PEP257: docstring cosmits in utils.extmath
ENH: renamed fast_svd to randomized_svd + related improvements
FIX: incomplete test for inverse_transform in text feature extraction
Merge pull request #521 from lucidfrontier45/master
pep8 in forest.py
pep8 in tree.py
pep8 in kmeans tests
more pep8
more pep8
FIX: heisen doctest
DOC: readability: make colon after 'Parameters' stay on the same line in reference documentation
FIX: Boston is a regression dataset
oops, the last test is about classification, not regression
Merge pull request #529 from eickenberg/doc_fix
ENH: mark coef_ as immutable for linear SVM models trained in the dual
immutable coef for the sparse SVM variant too
mark liblinear coef as immutable too
document the fact that coef_ is readonly for LogisticRegression and LinearSVC
avoid a memory copy in coef_ property
Merge pull request #541 from ogrisel/immutable-readonly-coef
FIX: broken link in SVM doc
Merge pull request #551 from fannix/master
FIX: make sklearn.base.clone robust to empty params
first stab at trying to wrap MurmurHash3
Merge pull request #3 from GaelVaroquaux/murmurhash
implementation & test for the murmurhash wrapper module
Export some public cython API
DOC: add entry for murmurhash in the developer utilities section
ENH: add the ability to hash int arrays
Better docstring
Shorter cpdef function names + missing docstrings
DOC: give usage example
test developers utilities as well
OPTIM: avoid unlikely np.int32 test upfront
Merge pull request #564 from ogrisel/murmurhash
FIX: broken build / tests
Merge remote-tracking branch 'larsmans/typesafe-murmurhash'
Merge pull request #587 from jakevdp/arpack-init
Merge pull request #593 from jaquesgrobler/doc_update
cosmit in memory debugging doc
Merge pull request #602 from jaquesgrobler/doc_remotes_note
ENH: use linear gradient cmap for more readable hyperparam heatmap
docstring cosmits and typos in label_propagation.py
useless imports
simpler random seeding scheme for parallel kmeans
less hackish parallel random state seeding
avoid pl.set_cmap and align colors of colormesh with scatter
started work on utility function for quick train test split
more doctest
add parameters in docstring
DOC: narrative doc for train_test_split
add tests for invalid argument + fixed a type error
more tests
typo
reworked nested grid search example for better doc and output, use train_test_split and add more cross links
DOC: related improvement in GridSearchCV doc
DOC: more cross references
cosmit
DOC: what's new
Merge pull request #618 from ogrisel/train_test_split
FIX: make LFW data shapes consistent with Olivetti faces
ENH: more informative exception message
DOC: improved SVM docstrings
typo
Merge pull request #628 from daien/master
Merge pull request #633 from robertlayton/ig
Merge pull request #634 from amueller/svm_decision_function_dirty_fix
FIX #614: raise ValueError at KernelPCA init if fit_inverse_transform and precomputed kernel
DOC: formatting improvement to ensemble.rst
FIX: make the 20 newsgroups loader explicitly decode latin1 content
shorten example a bit with train_test_split
manually rescale C in face recognition example
Merge pull request #664 from conradlee/663-kfold-init-bug
Flatten the feature extraction API
Merge branch 'master' of github.com:scikit-learn/scikit-learn into text-feature-extraction-simplification
missing C re-scaling in example
missing C re-scaling in example
MiniBatchSparsePCA and MiniBatchDictionaryLearning still use chunk_size as argument
merge master
factorize feature names array
make CountVectorizer able to output binary occurrence info
add a test for custom dtype
DOC: improve docstring for Vectorizer
Flatten the combined vectorizer as well
Merge remote-tracking branch 'upstream/master' into text-feature-extraction-simplification
Fix grid search example
Fix charset in mlcomp example
DOC: started section on text feature extraction
Merge remote-tracking branch 'upstream/master' into text-feature-extraction-simplification
switch back to the old vocabulary constructor argument
Merge remote-tracking branch 'upstream/master' into text-feature-extraction-simplification
better blob seed so that both DBSCAN and meanshift are working well
Merge branch 'master' into text-feature-extraction-simplification
finally the right API with plenty of efficient overrides
Filter stop words before ngrams
demonstrate stop words in example (+ slightly faster convergence)
missing sklearn.semi_supervised package in setup.py
ENH: remove useless array wrap for feature names + more TF-IDF tests
Make Vectorizer not inherit from TfidfTransformer while preserving direct gridsearchability
FIX: division by zero errors and negative IDF
DOC: TF-IDF and customizing
DOC: updated parameters
Merge branch 'master' into text-feature-extraction-simplification
updated whats new
s/Bags/Bag/ and Vector Space Model
better explanation for bigram features
No accent stripping by default + various doc fixes
update strip_accents in Vectorizer as well
typo
typo
typos
remove lambda + better comment position
enable stop words in clustering example
typo
Renamed Vectorizer to TfidfVectorizer + deprecation warning
updated what's new + backward compat for vocabulary attribute
fixed an inheritance bug in TfidfVectorizer.fit_transform + removed vocabulary backward compat that breaks grid_search
useless import
Merge pull request #668 from ogrisel/text-feature-extraction-simplification
trailing whitespace
FIX: broken doctest under OSX
Merge pull request #694 from njwilson/skip-kmeans-2-jobs-mac
Merge pull request #692 from njwilson/minor-doc-fixes
Had a link to autopep8
Merge pull request #695 from njwilson/tmp-dir-for-cache
Merge pull request #696 from njwilson/issue-691
Merge pull request #698 from njwilson/master
OPTIM: skip buffer unpacking in kmeans
Merge pull request #693 from jaquesgrobler/Collapse_Sidebar
Merge pull request #714 from jaquesgrobler/Next_button
Merge pull request #717 from jaquesgrobler/Issue714
typo + cosmetics
ENH: sort features in dict vectorizer + new doc
ENH: refactored the HMM tests to ease PY3K transition
Fix bad reference to LFW in example
useless import
FIX #752: raise explicit ValueError if k is too large
FIX: missing string formatting argument in MBKMeans error message
removed useless assert
Merge pull request #748 from ogrisel/hmm-test-hierarchy-simplification
Merge pull request #742 from davidmarek/pdistance
FIX: #774 Add documentation for lprun config in qtconsole and notebook
FIX #807: non regression test for KPCA on make_circles dataset
Merge pull request #809 from zaxtax/master
Merge pull request #812 from amueller/pipeline_decision_function
typo
Add note for port install py27-scikits-learn
trailing space
add missing attribute estimators_ to the docstring of forest models
FIX #898: narrative documentation for feature importances in forest models
Merge pull request #921 from fhoeni/scaler_bugfix
FIX: heisentest for robust covariance: seed MinCovDet
Merge pull request #926 from agramfort/fix_X_list_grid_search
Merge pull request #928 from yarikoptic/master
FIX #937: preserve double precision values in svmlight serializer
add a what's new entry
work on svmlight serializer to preserve double precision values
track master
Merge pull request #945 from cpa/master
Merge pull request #971 from acompa/master
Update doc/support.rst
Merge pull request #955 from vene/mem_prof
Merge pull request #995 from kernc/CountVectorizer_analyzer_char_nospace
fix broken doctests for the new char_wb text analyzer
DOC: better narrative for char_wb text analyzer + add a whats_new entry
Merge pull request #1043 from jaquesgrobler/master
Merge pull request #1039 from jakevdp/lle-test-fix
Merge pull request #1045 from agramfort/fix/as_float_array
Merge pull request #1049 from fsav/c-docstring-patch
Merge pull request #1063 from welinder/peter-dev
Merge pull request #1009 from amueller/one_class_check
Merge pull request #1094 from ibayer/warnings
Merge pull request #1100 from NelleV/makefile
Merge pull request #1110 from buma/predict_proba_doc
ENH: pass verbose consistently in forest module
cosmit
FIX: wrong probabilities for OvR LogisticRegression
ENH: make test_common check normalized probabilities
Merge pull request #1189 from fabianp/svmlight
Merge pull request #1187 from ogrisel/bugfix-logistic-ovr-probabilities
FIX: broken doctest for DictVectorizer
FIX: missing figures in FA narrative doc
Merge pull request #1266 from cdeil/patch-1
Merge pull request #1292 from aymas/pass_rng_kmeans_gmm
Merge pull request #1344 from mattilyra/CountVectorizer.decode
FIX: missing # for comment in pyx file and readded missing AMI docstring
FIX: lars drop for good platform specific test failure
FIX #1354: machine precision assertion failure in test_liblinear_random_state
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Merge pull request #1361 from astaric/py3k
DOC: make MinMaxScaler example snippet readable outside of other sections context
DOC: more improvements / fixes on the MinMaxScaler doc
Merge pull request #909 from larsmans/hashing-trick
Merge pull request #1397 from SnippyHolloW/travis
Improved bench_covtype.py to load data faster and support configurable n_jobs
Merge pull request #1415 from SnippyHolloW/travis
Merge pull request #1418 from kuantkid/archlinux
Merge pull request #1408 from satra/fix/rebase1396
Merge pull request #1425 from arjoly/enh_bench_covertype
Merge pull request #1424 from jaquesgrobler/plot_omg_fix
FIX #1417: move nosetests configuration parameter to setup.cfg
Remove doctest-options from setup.cfg as not supported in old version of nose
Merge pull request #1430 from erg/issue-1407
Merge pull request #1429 from tnunes/fix_pipeline_fit_transform
Merge pull request #1440 from amueller/matplotlib_requirement
Display the test names to understand which test is triggering the segfault on jenkins
FIX: fixed random_state for heisen doctest failure in multiclass module
Merge pull request #1468 from erg/random-failures-12345
Delete iris.dot in tree.rst doctest
FIX: seed blobs dataset to have a stable spectral clustering under OSX 10.8
Merge pull request #1470 from kuantkid/fix_spectral_cluster_test
Add comment in test_spectral_clustering_sparse
Merge pull request #1465 from AWinterman/issue-1017
first pass at implementing sparse random projections
DOC: better docstrings
DOC: more docstring improvements
Remove non-ASCII char from docstring
use random projections in the digits manifold example
test embedding quality and bad inputs (100% line coverage)
typos
one more typo
OPTIM: CPU and memory optim by using a binomial and reservoir sampling instead of direct uniform sampling in the n_features space
note for later possible optims
fix borked doctests
make it possible to use random projection on the 20 newsgroups classification example
FIX: raise ValueError when n_components is too large
remove the random projection option from the 20 newsgroups example
leave self.density to 'auto' to implement the curified estimator pattern
more curified estimator API
useless import
change API to enforce dense_output representation by default
ENH: vectorize the johnson_lindenstrauss_bound function
started work on plotting the JL bounds to be used in the narrative documentation
More vectorization of the johnson_lindenstraus_bound function
More work on the JL example to plot the distribution of the distortion
WIP: tweaking JL function names
check JL bound domain
JL Example improvements
WIP: starting implementation implicit random matrix dot product
working on implicit random projections using a hashing function
OPTIM: call murmurhash once + update test & example
first stab at CSR input for hashing dot projections
implemented dense_output=False for hashing_dot
refactored test to check that both materialized and implicit RP behave the same
fixed broken seeding of the hashing_dot function
leave dense_output=False by default
use the 20 newsgroups as example dataset instead
make it possible to use a preallocated output array for hashing_dot
missing docstring and s/hashing_dot/random_dot/g
eps=1.0 is no longer a valid value
Typo / fix in JL lemma example
FIX: MinMaxScaler on zero variance features
Simpler inline comment
Add one more test for MinMaxScaler on newly transformed data
ENH: issue warning when minmax scaling integer data + test
ENH: add the squared hinge loss to the SGD loss example
Merge pull request #1517 from amueller/lda_qda_cleanup
Merge pull request #1562 from kmike/master
P3K: avoid iteritems / itervalues when feasible
P3K: decode error message in svm wrapper
ENH: output processing speed in MB/s for vectorizer example
Initial work on hashing vectorizer
Add fit_transform support using the TransformerMixin + missing ABCMeta marker
Improved the clustering example with HashingVectorizer
Remove TransformerMixin from vectorizers and do a direct fit_transform alias for HashingVectorizer instead
Improve module docstring of document clustering example
cosmit
Updated whats_new.rst
DOC: Started section on hashing vectorizer in narrative section
DOC: narrative doc for HashingVectorizer
DOC: typos
DOC: merged the whats new entries and add links to the narrative doc
DOC: address @mblondel's comments
ENH: measure feature extraction speed in document classification example
DOC: typos
Update travis config to remove -qq flag for scipy
P3K: support for py3k in dict_vectorizer module
PY3: Fix stdout capture in graph lasso test
P3K More python 2 / 3 compat in tree exports
Merge pull request #1660 from rlmv/fe_tests
P3K use six to have a python 2 & 3 compatible code base
Merge pull request #1726 from agramfort/round_kfold
Merge pull request #1730 from arjoly/doc-feature-selection
Merge pull request #1741 from arjoly/metrics-fix-np-1.3
PY3: Disable lib2to3
PY3: fix urlopen in mldata and california housing loaders
PY3: fix remaining cStringIO imports
PY3: fix for string literals in datasets' test_base.py
PY3: print function in coordinate descent doctest
PY3: record is a kwarg argument for warnings.catch_warnings
PY3: long is no longer a type in Python 3
Merge pull request #1839 from amueller/dbscan_example
FIX: use the mldata mock in docstring as well
Merge pull request #1913 from Jim-Holmstroem/refactored_precision_recall_fscore_support_to_count_with_integer_type
FIX: restore numpy 1.3.0 compat with np.divide fix
FIX #2032, FIX #2033: ensure module names consistency with __all__
Remove redundant test that was checked in by mistake
FIX inconsistent cv_scores_ generation for randomized search and re-add example
ENH: removed leftover condition to get a wider application of the import all consistency check
Enforce n_folds >= 2 for k-fold cross-validation
Merge pull request #2004 from oddskool/out-of-core-examples
FIX: make doc auto-linking support any Unicode / UTF-8 content
Make the out-of-core example plot work when launched by the sphinx extension
FIX: do not print too many messages to stdout when generating the documentation
PY3: New test for the get_params handling of deprecated attributes.
Better status for the Py3 port
Merge more Py3 fixes
PY3: refcounting change introduced a regression on the use of resize in LARS
FIX: pep8 and Py3 support in sklearn.neighbors.base
FIX: Python 3 support for the neighbors doctests
FIX: pep8 + Py3 fixes in test_dist_metrics
FIX: pep8 and Py3 support in sklearn.neighbors.dist_metrics
FIX: Py3 / pep8 fixes in test_ball_tree / test_kd_tree
Update Python 3 support status
Style
More readable condition and more precise error message
FIX: Py3 print statements to print functions
Rename LabelBinarizer.multilabel to .multilabel_ + DOC
WIP: partial fit for discrete naive Bayes models
Remove the class_prior partial_fit param
WIP: started to factorize the raw count collection
Incrementally is useless now
Add reference to the Manning text + restore previous smoothing
FIX shape issue when y has only one single class + some missing doc
Factorize common classes checks in partial_fit implementations
Add note on a possible future performance optimization
Add a note on performance tradeoffs in the docstring of partial_fit
More informative error message. Also, CV now uses integer indices by default.
Use floats everywhere to get rid of warnings when using sample_weight
More input checks
Better test name
Remove redundant shape check already done by check_arrays
Add missing test for sample weight with partial_fit + fix issue classes passed as a list instead of an array
One more input check test
Add missing test for deprecation warning
Found a bug: add a failing test
Use unique_labels more consistently in the multiclass model
Fix broken partial_fit test
Factorize label_binarize for binarizing a sequence of labels with fixed classes
Add a new whats_new entry
Add some doc for the new partial_fit method
wording
Avoid raising a deprecation warning on label_binarizer_.multilabel_
Fix docstring and add some usage examples
FIX: do not update feature_log_prob_ in _update_class_log_prior
Add one more test to check the performance on digits
Make test_deprecated_fit_param pass under python 3 as well
Address wording and typos identified in review
Better parameterization for test_check_accuracy_on_digits
Add a whitespace in parameter docstring item
More accurate documentation for class_count_ and feature_count_
Rename helper partial_fit function
Merge pull request #2175 from ogrisel/nb-partial-fit
Merge pull request #2228 from amueller/travis_virtualenv_stuff
Trying to enable python 3.3 too.
Update .travis.yml
One more Python 3 fix in feature_extraction.rst
Py3 fix
More explicit tests in test_label_binarizer_column_y
Catch expected warning in sklearn/tests/test_naive_bayes.py (part of #2274)
Revert "Catch expected warning in sklearn/tests/test_naive_bayes.py (part of #2274)"
FIX PY3: list and tuples cannot be compared in Python 3
Py3: fix version comparison in imputation module
Add supported python versions to the classifiers + fixes
Sample compiler config for windows
Force stdc++ link for the windows build
Regenerate pairwise_fast.pyx with recent cython for windows build
Fix atomics definitions under windows for sklearn._hmm.pyx
typo
Use extra_link_args for -lstdc++
Ignore compiled shared library files generated in the source tree under windows
Merge pull request #2293 from amueller/warning_input_shapes
Rename cv_scores(_) back to grid_scores(_) to keep the name free for a future refactoring
Merge pull request #2299 from ogrisel/grid-scores
WIP: explicitly mark all base classes as ABC with abstractmethod inits
Add concrete __init__ for LinearSVM
Add concrete implementation for SGDClassifier
Fixed a typo in a contributor's name
Re-align the what's new file with the new ordering of items from master
partial_fit for naive Bayes was done for 0.14-rc, not 0.11...
Ignore the generated MANIFEST file
Also clean the dist folder when calling make
Olivier Hervieu (7):
Refactor roc_curve method.
Merge branch 'master' of git://github.com/scikit-learn/scikit-learn
fixes typo in roc_curve method
[refs #350] - variable renaming regarding reviewer comments
Removes useless (and time consuming) statement.
Improves signal sorting method (using numpy primitives).
FIX inconsistent coef_.shape in LinearRegression
Paolo Losi (38):
liblinear bias/intercept handling
l1 logreg (liblinear): minimum C calculation
l1 logreg (liblinear): minimum C (sparse version)
review of min_C doc strings
numpy/scipy idioms as suggested by agramfort
pep8 compliance
min_C: reworked _y calculation
min_C: check for ill-posed problem _y * X == 0
min_C: let's avoid scipy.sparse top level import
Merge branch 'master' of https://github.com/scikit-learn/scikit-learn into l1_logreg_minC
min_C: fixes to the doc strings
s / shape = / .reshape() /
removed float64 and int32 conversion
docstrings updated
fix for "removed float64 and int32 conversion"
Merge branch 'master' of https://github.com/scikit-learn/scikit-learn into l1_logreg_minC
got rid of np.where
reimplemented l1_min_C as a function
removed old version of min_C
cleanup tests
some more cleanups
bound on C can be calculated also with one class
cleaned up tests
fixes to docstring (as for Fabian comments)
l1_min_c import in svm/__init__.py
Merge branch 'master' of https://github.com/scikit-learn/scikit-learn into l1_logreg_minC
Merge branch 'master' into l1_logreg_minC
DOC: added reference to l1_min_c
the l1 logreg example now works with l1_min_c
coverage 100% + pep8 small fix
Revert "Remove references to y in preprocessing objects."
Merge remote branch 'upstream/master' into revert_preprocessing
TEST: test for scaler in Pipeline
FIX: for SGD log loss
FIX: partial revert of the SGD log loss fix
DOC: Better doc string for l1_min_C
BENCHMARK covertype: select classifier via cmd line opt
Merge pull request #736 from paolo-losi/bench_covtype
Pavel (1):
Fixed typos.
Peter Prettenhofer (749):
initial checkin of sgd package.
set rho on 1 or 0 if L2 or L1 penalty.
l1 penalty implemented.
added class encoding.
does not belong to the repo.
Merge branch 'master' of git@github.com:pprett/scikit-learn
Code review from Alexandre:
Merge branch 'master' of github.com:pprett/scikit-learn
removed unnecessary print statements.
100% code coverage.
added doctests to SGD and sgd.LinearModel
initial checkin of sgd package.
set rho on 1 or 0 if L2 or L1 penalty.
l1 penalty implemented.
added class encoding.
Code review from Alexandre:
100% code coverage.
added doctests to SGD and sgd.LinearModel
Merge commit 'origin/master'
initial *draft* of the sgd module documentation added.
added Readme so that sphinx stops complaining.
additional documentation for sgd (plot of various convex loss functions).
math formulation cont'
penalty contour plot added.
more SGD documentation added: example, math formulation , implementation details.
EfficientBackprop reference added.
Documentation for sgd polished.
fixed doctests after SGD class index refactoring.
Removed tabs.
implemented OVA for multi-class SGD.
implemented OVA for multi-class SGD.
Merge branch 'master' into ova
SGD supports multi-class classification using one-vs.-all.
SGD multi-class documentation added.
SGD classifier supports multi-class with OVA.
documentation for multi-class sgd updated.
Changed docstrings for coef_ and intercept_ in sgd package. Wrap intercept_ in an array in the case of binary classification.
Merge branch 'master' of git@github.com:scikit-learn/scikit-learn
Liblinear docstring modified: deleted irrelevant attributes support_ and changed shape of intercept_ and coef_ accordingly.
Added dense implementation of SGD.
Merge branch 'dense' of github.com:pprett/scikit-learn
Commit broken cython header import.
moved sgd_fast_sparse from sgd/sparse/src to sgd/src.
Moved sgd extension modules from sgd/src to sgd.
performance improvements in cython files; cython files rebuild.
added covertype example for dense sgd.
bugfix in plot_loss_functions (import loss functions).
covertype example now downloads dataset automatically.
Updated sgd documentation with multi-class documentation.
docstrings: n_jobs defaults to 1.
cosmit: color of data points matches color of decision regions and OVA hyperplanes.
warm start optimization changed from coef_ to init_coef_ and intercept_ to init_intercept_.
Multi-class documentation for module sgd added.
Include models with L1 and Elastic-Net penalty.
changed init_coef to coef_init (intercept likewise).
Merge branch 'warmstart'
Added new example on modeling the geographic distribution of species.
Merge branch 'speciesmodeling' of git@github.com:pprett/scikit-learn
added species distribution example as plot example.
if possible, species distribution example now uses basemap by default.
deleted old species_distribution_modeling example.
cosmit: pep8 and author
Reduced memory consumption in covertype example due to memory leak in np.loadtxt.
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Added note on the importance of shuffling. Minor changes in text.
Runtime improvement of species distribution example (fancy indexing).
set basemap as default.
Class weights for SGD similar to svm package. Same heuristic as Liblinear for multi-class (OVA): use only weight for the positive class.
removed parameters `p` and `C` from OneClassSVM (dense and sparse).
Merge branch 'master' of git@github.com:scikit-learn/scikit-learn
added tksvm from git://gist.github.com/673953.git.
use np.fromstring to load data from large csv text files.
changed predict_margin to decision_function
Merge branch 'svmgui'
added GUI example for SVM.
added tksvm from git://gist.github.com/673953.git.
added GUI example for SVM.
Merge branch 'master' of git@github.com:scikit-learn/scikit-learn
Merge branch 'importanceweighting' of git@github.com:pprett/scikit-learn
RegressorSGD added.
Merge branch 'master' into importanceweighting
changed "squarederror" to "squaredloss".
automatic refitting on radiobutton change and add example.
Merge branch 'master' of git@github.com:scikit-learn/scikit-learn
changed loss function names in SGD (squaredloss -> squared_loss; also for modified_huber).
added Oliviers ElasticNet convergence test to SGD.
move sgd into linear_model and rename sgd to stochastic_gradient.
finalized sgd module renaming.
moved sgd examples to examples/linear_model and added sgd prefix.
Merge branch 'master' of git@github.com:scikit-learn/scikit-learn
Merge branch 'master' into sgd-rename
COSMIT: smaller data points
updated SGD documentation (referenced in linear_model.rst and classes.rst).
fixed imports in non-auto examples.
BUGFIX in sparse.SGDRegressor
Merge branch 'master' of git@github.com:scikit-learn/scikit-learn
Merge branch 'sgd-rename'
refactored SGD module (removed code duplication, better variable naming).
Sample weights for SGDClassifier.
Additional tests for sample weights.
Merge branch 'master' into sgdsampleweight
pep8 + oliviers remarks
SGD: documentation for sample weights and class weights.
added doctests for sparse and dense SVR, NuSVR, NuSVC, and sparse SVC.
added doctests for sparse and dense SVR, NuSVR, NuSVC, and sparse SVC.
Cosmit in fast_sgd.pyx
fixed failed doctest. SVR parameter `p` renamed to `epsilon`.
SGD module supports two additional learning rates: constant and inverse scaling.
Added SGD regression benchmark
Fixed doc-tests and added doc strings for SGD learning rates.
added notes on learning rate schedules to sgd.rst.
added learning rate arguments to docstrings.
Run cython on sgd_fast and sgd_fast_sparse.
pep8 compliance
cosmit: removed trailing whitespace
ROC fixes for trivial classifiers (always predict one class) and input checks (raise ValueError in case of multi-class).
added doctests for roc and refined documentation.
cosmit: pep8
cosmit: beautified plotting.
docstrings: added note to roc_curve, changed y_scores to y_score.
docstrings: changed signature of metrics.auc from fpr and tpr to x and y.
Merge branch 'rocfix'
changed semantics of LossFunction.dloss.
cosmit: pep8 + doc
cosmit: changed docstring of svm_gui.py.
cosmit: removed requirements in svm_gui doc.
Merge remote branch 'upstream/master'
bugfix: bad Scaler example.
fixed LARS doctest.
Initial checkin of sparse.MiniBatchKmeans clustering + document clustering example on 20 news.
enh: compute cache only on samples in current batch.
Added different compute_cache functions: dot and euclidean distance.
added SpectralClustering to document_clustering example.
fix: random_state was set to None.
use provided x_squared_norms instead of recompute (if none euclidean_distances will recompute).
reuse squared sample norms if possible (_calculate_labels_inertia).
Use euclidean distance.
Merge remote branch 'upstream/master' into sparse-mbkm
dense and sparse seed differences: change order of shuffling X and init centroids to ensure consistent results.
changed mini batch representation in dense MiniBatchKMeans - saves mem but increases runtime.
deleted sparse clustering package.
Merged dense and sparse MiniBatchKMeans implementations.
Document clustering example updated.
cosmit: pep8
fast function to compute l2 norm of rows in CSR matrix.
set max_terms to 10k. Added spectral clustering.
more tests for (mini-batch) k-means (99% coverage).
Merge branch 'master' into sparse-mbkm
changed batch representation from indices to slices.
remove assert_warns from test case (not supported by build bots numpy version).
cosmit: docstring of MiniBatchKMeans
remove n_init arg from MiniBatchKMeans signature
fix: doctest formatting.
fix: remove n_init from mbkm tests.
fix: call set_params in mbkm fit.
Merge remote branch 'upstream/master'
cosmit: docstring + raise ValueError if kmeans input is sparse.
added document clustering example to KMeans clustering section.
Merge pull request #305 from vincentschut/mini-batch-kmeans-batch-labeling
fix: if n_samples < chunksize n_batches was 0 and no iterations are performed.
cosmit: rm debug output
add smoke test for MiniBatchKMeans
Merge remote branch 'upstream/master'
added NavigationToolbar to SVM gui
Merge branch 'enh/tree' of https://github.com/bdholt1/scikit-learn into bdholt1-enh/tree
Merge https://github.com/bdholt1/scikit-learn into bdholt1-enh/tree
Merge branch 'enh/tree' of https://github.com/bdholt1/scikit-learn into bdholt1-enh/tree
introduce reset method for Criterion and implemented linear version of MSE.
fix: weight left and right variance by num samples in each branch
added CART to covertype benchmark -> look at that error rate!
Merge remote branch 'bdholt1/enh/tree' into bdholt1-enh/tree
visitor pattern for export graphviz
cosmit: pep8 + docs
Merge remote branch 'bdholt1/enh/tree' into bdholt1-enh/tree
Merge remote branch 'bdholt1/enh/tree' into bdholt1-enh/tree
use hybrid sample_mask fancy indexing approach.
cosmit: docs + rm comments
added `min_density` parameter to CART
raise ValueError for min_split and max_depth on __init__ rather than fit.
we grow our trees deep
cosmit + n_samples fix
MSE bugfix (MSE.eval used to weight variances by n_left and n_right).
take DTYPE from _tree extension module
fix: inc n_left, n_right before variance computation; hopefully the last bugfix for MSE...
fix doctest + recompile cython code (accident)
make Node an extension type + change class label indexing.
recompile _tree.pyx
make _tree import relative
make node pickleable & tidy up some rebase mistakes
remove obsolete tests
check if y.shape[0] == X.shape[0]; this is especially troublesome for svm.sparse because most people are not aware of the sparse matrix - KFold troubles..
unified predict for sparse and dense SGD.
cosmit
fix: use None as default value for class_weight and sample_weight for sparse OneClassSVM; ample_weight -> sample_weight
cosmit: pep8
added y.shape[0] == X.shape[0] check to DiscreteNB
added X.shape[0] == y.shape[0] check to ElasticNet
Merge remote branch 'upstream/master'
documented changes in whats_new
Merge remote branch 'upstream/master'
Merge branch 'fix-split-sample-mask' of https://github.com/TimSC/scikit-learn into TimSC-fix-split-sample-mask
compute threshold as t = low + (high - low) / 2.0
initial checkin of gradient boosting
GBRT benchmark from ELSII Example 10.2
added GBRT regressor + classifier classes; added shrinkage
use super in DecisionTree subclasses
first work on various loss functions for gradient boosting.
added store_sample_mask flag to build_tree
implemented lad and binomial deviance - still a bug in binomial deviance -> mapping to {-1,1} or {0,1} ?
updated benchmark script for gbrt.
some debug stmts
new benchmarks for gbrt classification
fix: MSE criterion was wrong (don't weight variance!)
more benchmarks
binomial deviance now works!!!!!
add gradient boosting to covtype benchmark
add documentation to GB
timeit stmts in boosting procedure.
add previously rm c code
updated tree
hopefully the last bugfix in MSE
new params in gbrt benchmark and comment out debug output
make Node an extension type + change class label indexing.
predict_proba now returns an array w/ as many cols as classes.
cosmit: tidied up RegressionCriterion
added VariableImportance visitor and variable_importance property
minor changes to benchmark scripts
use `np.take` if possible, added monitor object to `fit` method for algorithm introspection.
cosmit
choose left branch if smaller or equal to threshold; add epsilon to find_larger_than.
compiled changes for last commit
cosmit
some tweaks and debug msg in tree to spot numerical difficulties.
added TimSC tree fix
changed from node.error to node.initial_error in graphviz exporter
recompiled cython code after rebase
fix: _tree.Node
comment out HuberLoss and comment in benchmarks
changed from y in {-1,1} to {0,1}
cosmit: beautified RegressionCriterion (sum and sq_sum instead of mean).
rename node.sample_mask to node.terminal_region
fix: Node.__reduce__
fix init predictor for binomial loss
performance enh: update predictions during update_terminal_regions
fix: samplemask
added timing info
cosmit: get rid of gcc warning (q_data_ptr was not initialized)
fix: overflow of `offset` variable if X.shape[0] * X.shape[1] > 250M
fix: broken doctest with precomputed kernel
changed Decision Tree representation to struct of arrays instead of composite structure.
fix: use tree.predict instead of functor
Graphviz visitor now works on array repr.
cosmit: doc strings
use safe_sparse_dot instead of np.dot
changed int64 to int32 in tree repr;
Merge branch 'tree-array-repr'
changed for `for i in 0 <= i < n` to `for i in xrange(n)`.
Merge branch 'tree-array-repr'
changed tree.left and tree.right to tree.children (similar to cluster.hierarchical)
use new tree repr; adapt gradient boosting for new tree repr.
Merge branch 'master' into gradient_boosting
cythonized tree (still broken)
clear tree.py
updated _tree.c
updated GradientBoosting with current master
fix: update variable importance
added gradient boosting regression example
added test deviance to GBRT example
updated TODO in module doc
fix: sgd module clone issue w/ rho parameter
Merge remote branch 'upstream/master'
Merge branch 'master' into gradient_boosting
fix: make GradientBoostingBase clonable.
fix: learning rate schedule doc.
Merge remote branch 'upstream/master'
fix: rm `nu` argument from sparse.SVR (taken from dense SVR).
added unit tests for gradient boosting (coverage ~95%)
better test coverage
store loss object in estimator
don't use dict comprehensions (support python 2.5 and 2.6).
fix: tree doctests + ensemble doctests
Merge branch 'master' into gradient_boosting
stub for gradient boosting documentation
restore original bench_tree.py
Merge branch 'master' into gradient_boosting
min_density now works with store_terminal_regions (however, this only matters if you learn deep trees max_depth >> 5 which rarely happens).
cosmit
added input type and shape test
Merge remote branch 'upstream/master' into gradient_boosting
n_samples > min_split instead of >=
cosmits (cleanup after profiling)
repeat decorator now with arguments
fix: xmin -> X.min()
eliminate `compute_importances` fit parameter - make `feature_importances_` a property that will be computed on demand.
initial_error -> init_error
max_features bug in _tree.pyx (check if < 0 and assume all features!)
Merge branch 'tree-feature-importance' into old-gradient-boosting
merge with master finally resolved!
enh: performance enhancement by removing redundant computation of values - we use the state of `criterion` instead.
started work on gradient boosting docs
remove obsolete `sparse_coef_` doc string
remove reference to obsolete `sparse_coef_` parameter.
set coef_ to fortran layout after fit - this will enhance the test time performance for predicting single data points.
added to whats new
cosmit: more detailed doc string for why fortran style arrays
Merge branch 'sgd-fortran-layout'
Merge branch 'master' into old-gradient-boosting
removed feature_importances_ property in tree module
work in progress on GBRT docs
added script to bench sklearn gbrt against R's gbm package.
cosmit: pep8 + comments
fix: undo compute_importances property merge in forest module and examples
wip: narrative doc
fix: table layout
restore original
restored original version
restored original version
restored original version
restored original version
Merge branch 'master' into gradient_boosting
Merge branch 'master' into gradient_boosting
test_oob_score_regression oob_score below 0.8 if n_estimators < 50
changed ``n_iter`` to ``n_estimators`` and attribute ``trees`` to ``estimators``.
added artificial dataset generator from Hastie et al. 2009, Example 10.2
wip: narrative doc for gradient boosting.
fix: wrong assertion
renamed estimators to estimators_
wip: narrative documentation for gradient boosting.
fix: import numpy in doctest
Merge remote branch 'upstream/master' into gradient_boosting
Merge remote branch 'upstream/master' into gradient_boosting
use mean_squared_error
added new mean_squared_error to metric imports
Merge remote branch 'upstream/master'
Merge branch 'master' into gradient_boosting
polished narrative documentation. fixed doctest.
cosmit: fix doc format
cosmit: fix doc format
Merge branch 'master' into gradient_boosting
factored out weight vector class; dense SGD now uses ``WeightVector`` instead of explicit ndarray and wscale.
enh: performance of WeightVector now comparable to explicit weight vector. some cosmits in dense sgd extension module.
wip: sparse sgd now uses WeightVector - there are some broken tests though.
ENH changed naive bayes' self._classes attr to self.classes_
wip: still hunting sparse sgd bug
fix: forgot to scale by wscale at the end of dot_sparse. All tests are green again!
added new sgd dataset abstraction to unify sparse and dense implementations.
Merge branch 'master' into sgd-refactoring
major refactoring of sgd module::
use Py_ssize_t where appropriate; cosmit
Merge remote branch 'upstream/master' into sgd-refactoring
cosmit: better docstrings for SGD
Merge remote branch 'upstream/master' into sgd-refactoring
WeightVector now keeps track of its squared norm.
move WeightVector and Dataset abstraction to new module
moved WeightVector and dataset abstraction to new module
updated Dataset imports
no need for sgd_fast header anymore.
added largescale ext module to setup.py
fix: declare extension type attributes
comment in forest classes for covertype benchmark
Merge branch 'master' into gradient_boosting
renamed and updated covertype benchmark.
uncomment RandomForest
cosmit
expose 'ls' loss function for classification
cosmit: pep8
Merge branch 'master' into sgd-weight-vector
renamed largescale -> large_scale
Merge branch 'master' into gradient_boosting
Merge branch 'master' into sgd-weight-vector
moved WeightVector and SequentialDataset into separate modules.
re-cythonized
fix: min_samples_split
Merge branch 'master' into sgd-weight-vector
don't need self here.
factored out norm updates and moved them to a dedicated subclass
cythonized
Merge branch 'master' into gradient_boosting
Merge branch 'gradient_boosting' of https://github.com/scottblanc/scikit-learn into scottblanc-gradient_boosting
Merge branch 'gradient_boosting' into scottblanc-gradient_boosting
cosmit: pep8
cosmit
added serialization test case
use `deviance` instead of `medviance` and `bdeviance`
wip: refactor ``fit_stage``; fix feature importances regression; tests still not green (performance regression on Example 12.7).
fix: make binary classification a special case.
refactoring for multi-class
test case for multi-class
comment out - yahoo learning to rank dataset
some profiling
impl. deviance for MultinomialDeviance.
fast tree prediction methods.
faster ``_predict`` by using low-level tree predict functions.
cosmit
forgot to remove debug function
changed self.classes to self.classes_
fix: forgot to rename classes
updated documentation: plots for gradient_boosting, new sample generator
new predict utils for early stopping; updated examples
Merge remote branch 'upstream/master' into gradient_boosting
updated benchmark script
delete benchmark scripts - include them in dedicated branch or ml-benchmarks
Merge remote branch 'upstream/master' into gradient_boosting
removed ``store_terminal_region`` from ``build_tree``.
mention multi-class
use ``apply_tree`` to compute terminal region. This is faster and reduces code complexity.
added __all__
enhanced documentation
type (differentiable)
boston -> Boston
Merge remote branch 'upstream/master' into sgd-weight-vector
un-done NormedWeightVector factorization; performance decrease on RCV1 is negligible.
cythonized sgd files
Merge branch 'master' into gradient_boosting
Merge branch 'pprett/gradient_boosting' of https://github.com/glouppe/scikit-learn into glouppe-pprett/gradient_boosting
cythonized
added Gilles to authors
whats new? Gradient Boosting!
Merge remote branch 'upstream/master' into gradient_boosting
added util func to create random sample_masks
use random_sample_mask (issue pointed out by @glouppe);
update examples
update tests
remove np.seterr
cosmit: comments + rm unnecessary variables
cosmit: add comment to replace ``random_sample_mask`` if numpy requirement allows to do so
cosmit: fix ClassPriorPredictor docstring; rm comment
typos
typo
mv *Predictor to *Estimator
mv classification init estimators; use np.bincount for PriorProbabilityEstimator.
is_multi_class now is a class attribute.
update docs
don't need to store n_classes.
cosmit: no need for float literals
Merge branch 'master' into gradient_boosting
point out scalability problem with large numbers of classes;
cosmit; mention scalability issues w.r.t. large number of classes
Merge branch 'master' of https://github.com/udi/scikit-learn into udi-master
added prior test
more test cases for naive bayes
GaussianNB: use epsilon to overcome zero sigma problem.
rm print stmt
added gbrt extension module (faster prediction methods)
rm custom regression tree prediction method
faster prediction methods
wip
add prediction method for specific stage
add staged predict
use staged predict in gbrt examples
fast tree prediction based on mystic cython kung-fu
cosmit
staged_predict for regression
test for staged predict and cosmit
more test cases
more test cases (input check at prediction time, degenerate inputs)
use appropriate data types (Py_ssize_t)
better input checks at prediction time
rm old tree prediction methods;
cosmit
Merge branch 'gradient-boosting-enh2'
add test for multiple fits w/ different input shapes
fix issue 762: SGDRegressor does not clear coef_ from previous fit
asarray not needed because of check_arrays stmt above
rm unused vars
Merge branch 'fix-issue-762'
typo: Viola-Jones
Gradient Boosting also provided OOB estimates
fix: gradient boosting regressor does not check if X is C-contiguous
Merge remote branch 'upstream/master'
started work on Huber loss function for robust regression
ensure that std is not zero
add test case for scale div through zero
Merge branch 'master' into gbrt-huber
add huber loss to test
implemented huber loss for robust regression
fix errors in huber loss
add alpha parameter for huber robust regression loss
fix: ensure X is C-contiguous
fix: make sure X is C-contiguous
Merge branch 'master' into gbrt-huber
added feature subsampling to GBRT (via max_features)
fix: forgot comma
added test for max_features
fix: alpha needs to be scaled by 100
wip: added quantile regression loss; this allows for prediction intervals; adapted the GP regression example to showcase prediction intervals
added title to example
performance improvement for random split (ctyped two variables).
import random split
test for quantile loss function
Use BaseEstimator for constant predictors
cosmit
huber and quantile loss for gbrt
better docs for quantile reg
Merge branch 'master' into gbrt-huber
Merge remote branch 'upstream/master' into gbrt-huber
ctyped variables in ``find_random_split`` and use for loop over index range instead of array elements
Merge branch 'master' into gbrt-huber
fix: np.arange dtype issue; fix dtype to be np.int32
use np.int32_t instead of Py_ssize_t
Merge branch 'master' into gbrt-huber
Merge remote branch 'upstream/master' into gbrt-huber
use dtype float32
proper pylab import
Merge branch 'master' into gbrt-huber
Merge remote branch 'upstream/master' into gbrt-huber
added test case for symbol labels
y must be one dimensional
more tests
removed quantile regression example
added max_features to gbrt regularization example
fix: section label for gbrt was wrong
add quantile example again
added new features to whatsnew
Merge branch 'gbrt-huber'
change dtype of y to float64 (aka DOUBLE_t)
cosmit: better docstrings
forest uses DOUBLE for y
Merge branch 'master' into tree-y-float64
changed shape of predict_proba
adapted tests because of changed shape of predict_proba
adapted tests because of changed shape of predict_proba
cosmit in sgd docs
added change to ``whats_new``
add quantile regression example to gbm doc
Merge branch 'master' into sgd-predict-proba
Merge branch 'sgd-predict-proba'
added failing test for 2d y
rm redundant input check (we check in _partial_fit)
ravel y; use atleast2d_or_csr for input validation
_tocsr not needed because of atleast2d_or_csr
inline comment
cosmit: constants for penalty types and learning rate types; inline comments;
Merge branch 'master' into sgd-yshape-fix
fix typo
make smoke tests explicit; check ValueError on 2d inputs
work on BaseGradientBoostingCV
refactored prediction and decision_function (rm duplicate code)
ENH: use gini for feature importance
GradientBoosting classes with built in cross-validation; implemented via Decorator pattern
wip: aggregate fold via groupby
wip: fixing some set attr errors but still buggy if params not lists
remove *CV classes - only pick decision_function and staged predict refactoring
rm CV class tests
rm CV class legacy
remove CV class legacy
add API changes and feature_importance fix to whatsnew
added failing test for clone
rm instance variables learing_rate_type, loss_function, and penalty_type; create them before plain_fit
move get_loss_function to _partial_fit
add test for proper loss instantiation
n_iter must not be 0
refactored input validation; special loss function factory for huber and epsilon insensitive loss
use DEFAULT_EPSILON consistently
rename get_loss_function to _get_loss_function
Merge remote-tracking branch 'upstream/master' into sgd-clone-fix
added test to expose the predict_proba w/ sparse matrix regression
fix the predict_proba w/ sparse matrix regression by using shape instead of len
cosmit
followed @larsmans tip to get rid of _decision_function
fix docstring of predict_proba
add predict_log_proba and test; better docstrings
wip on fx interactions for GBRT
Merge branch 'master' into gbrt-interactions
implemented partial dependency plot
fix: grid and model
cleaned tree traversal and sorted out weighting
cythonized and cosmit
automatically create grid from training data
add cartesian product
partial dependency plot example from ESLII 10.14.1
Merge branch 'master' into pr/975
docstrings for init and loss_
cosmit
added Emanuele to authors
Merge remote-tracking branch 'upstream/master' into pr/975
Merge branch 'master' into gbrt-interactions
Merge branch 'master' into gbrt-interactions
Merge branch 'master' into gbrt-interactions
add learn rate to partial dependency function
common ylim; comment out 3d plot
make fit_stage private
return axes instead of grid
3d plot of 2-way interaction plot
Merge branch 'master' into gbrt-interactions
multi-class is supported
cosmit
doc: use n_iter instead of epochs; remove backslash
Merge branch 'master' into gbrt-interactions
california housing dataset
cosmit
use California housing dataset loader
Merge branch 'master' into gbrt-interactions
remove legacy code
Merge remote-tracking branch 'upstream/master' into gbrt-interactions
renamed dependency -> dependence; docstring and cosmit
typo
fix: feature_importances_
rename dependency -> dependence
rename dependency -> dependence
add partial dependence plot example
document sample_mask and X_argsorted in BaseDecisionTree.fit and validate inputs using np.asarray (added two tests as well)
Merge branch 'master' into gbrt-interactions
tidy up deprecated warnings for learn_rate
Merge branch 'master' into gbrt-interactions
raise error if both grid and X are specified
initialize estimators_ with empty array not None
more input validation for partial dependence and doctest
tests for partial_dependence
rename learn_rate -> learning_rate
input validation for grid
test cases for grid
pep8
added test for cartesian
add partial dependence to whats new
documentation for partial dependence plots
add module imports
typo
cosmit
call pl.show
renamed datasets.cal_housing to datasets.california_housing
add plot titles
cosmit
Merge branch 'master' into gbrt-interactions
cosmit: docstrings
better narrative docs for partial dependence
cosmit: footnote header
empty instead of zeros
Merge branch 'master' into gbrt-interactions
more explicit typing (int32, float64)
Merge branch 'master' into gbrt-interactions
Merge branch 'master' into gbrt-interactions
add plotting convenience function
uses plotting convenience function
moved partial dependence into its own module.
doctest fix + cosmit
fix imports
remove partial dependence (moved to own module)
updated example
fix imports (partial dependence)
fix: california_housing not cal_housing
cosmit
switch axis for 2-way plot; better to compare with above plot
added partial dependence and fetch_california_housing to classes
better documentation
fix links
add partial dependence module
add test for staged_predict_proba
Merge branch 'master' into pr/1409
Merge branch 'master' into gbrt-interactions
better formatting of xticks (prevent overlap)
show how to use ``partial_dependence`` to generate custom plots.
doctest skip for plot function
fix doctests skip
renamed: ncols -> n_cols;
test decorator to skip tests if matplotlib cannot be imported
smoke test for plot_partial_dependence
fix: doc rename partial_dependence_plots -> plot_partial_dependence
Merge remote-tracking branch 'upstream/master' into gbrt-interactions
better input checking (e.g. for str features)
better handling of multi-class case (w/ symbol labels)
code snippets for narrative doc and restructuring
fix: random_state got initialized in fit_stage; caused same feature subsample in each tree
add test for gbrt random_state regression
Merge branch 'master' into gbrt-random-state-fix
Merge branch 'master' into gbrt-interactions
doctest skip: matplotlib not available on travis
fix: doctest in ensemble.rst
Merge branch 'master' into gbrt-interactions
rephrased the one-way PDP description
Merge branch 'master' into gbrt-interactions
topics -> topic
Merge remote-tracking branch 'upstream/master'
use Agg backend with warn=False for matplotlib enabled tests
check in ``if_matplotlib`` if $DISPLAY set
use subplots_adjust instead of tight_layout
use 100 instead of 800 n_estimators; looks the same but faster; ESLII uses 800
ZipFile context manager is only available in Python >= 2.7
cosmit: remove fourth quote
set min_density when growing deep trees during gradient boosting
sampling w/ replacement via sample_weights
rename learn_rate -> learning_rate
raise ValueError if len(y_true) is less than or equal to 1
fix: docstring for power_t in SGDClassifier was not correct (0.25 instead of 0.5)
cosmit: rephrased doc
zero_one_loss now does normalize on default.
fix: map labels to {0, 1}
fix: deviance computation in BinomialDeviance was wrong (ignored cases where y == 0) - thanks to ChrisBeaumont for reporting this issue
raise ValueError if division through zero in LogOddsEstimator
add loss function for gradient boosting binomial deviance
pep8 and assert_equal instead of assert
correct docstring
Merge branch 'master' into gbrt-deviance-fix
use unique from sklearn backports (return_inverse)
Merge branch 'master' into gbrt-deviance-fix
Merge branch 'master' into gbrt-deviance-fix
decision_function forces dense output (in the case of sparse coef_)
Merge branch 'master' into pr/1798
get rid of ``rho`` in sgd documentation - has been replaced by ``l1_ratio``
Merge pull request #1893 from dougalsutherland/sgd-docs
corrected doctests after moving L2 penalty application in SGD
Merge remote-tracking branch 'upstream/master' into pr/2016
added SGD L2 fix to whatsnew
fix: add missing str formatting operator
enhanced (hopefully) DBScan documentation; killed some whitespace along the way...
Merge remote-tracking branch 'upstream/master' into dbscan-doc-enh
fix: needs_threshold not plural in repr
removed min_density example - dropped param
gbrt now works with new DecisionTree implementation
import classes - now they work!
fix: proper dtype for SIZE_t
add GBRT to covertype benchmark
added pxd to Manifest (to be included in source tarball)
Merge remote-tracking branch 'upstream/master'
add OOB improvement and set oob_score deprecated
example for oob estimates in GBRT
plot cv error as well
rm print stmt
rn: plt -> pl
fix: oob_improvement_ with trailing _
more docstrings
cosmit: use train_test_split - tuned params for nice plot
narrative documentation for oob improvement.
more tests
cosmit: better links and a note on efficiency using max_features
comments
cosmit: n -> n_samples
cosmit: rs -> random_state
more doc for OOB example
use new style str formatting
rearranged some code
rn: ACC -> Accuracy
rephrased max_features doc
moved to new pyplot import
more narrative documentation for oob in gbrt
regression tests for oob_improvement_
example doc string
Merge branch 'gbrt-oob-improvement'
covertype benchmark: use C-style input as default (most models require it as input)
fix: use asserts from sklearn.utils.testing
fix: python3.3 warning fix
doc: hedge the use of OOB estimates
Refactored verbose output in GBRT - output much more nice
fix: newest numpy doesn't like all-indexing non-existing dimension (reported by erg #2233)
Merge remote-tracking branch 'upstream/master'
remove negative indices from neighbors cython code
fix: check for impurity ties
added 32bit 64bit equality test case
adapt OOB regression test to change in tree module
Peter Welinder (2):
add support for non-ndarray lists
Merge branch 'master' into peter-dev
Philippe Gervais (11):
Style fixes
[DOC] missing parameter description
GraphLassoCV works with alphas given as list.
Simplified GraphLassoCV code.
Put back cov_init parameter in graph_lasso_path_
Speed up some tests
Removed unused import
Added GraphLassoCV changes to whatsnew.rst
[DOC] Corrected errors in clustering documentation
[DOC] fixed a typo in an warning message.
One more typo fixed
Pietro Berkes (22):
NEW: Function to automatically download any mldata dataset given its name
ERF: load files in "mldata" subdir; some documentation improvement
ERF: Error checking in fetch_mldata
ERF: fetch_mldata allows to use natural mldata.org names for datasets
FIX: trying to reverse-engineer mldata.org conventions
FIX: fetch_mldata fixed to support non-standard data sets in mldata.org
NEW: mldata tests
ERF: Simplify conversion of mldata.org data set name to filename
Merge pull request #1 from ogrisel/pberkes-mldata
FIX: Remove column name when renaming in fetch_mldata
ERF: Improved coverage of mldata, taking into account network availability
DOC: documentation for fetch_mldata
ERF: Test mldata download using mock urllib2
FIX: fix pep8 and pyflakes issues
ERF: refactor object mocking urllib2 for general use (to be used in doctests)
ERF: Refactor utility function to test that list of names are (not) in an object
ERF: Move testing utilities to make them accessible from doctests
FIX: Doctests use mock mldata.org and do not download
DOC: small fix in datasets.rst docs
Merge pull request #2 from larsmans/pberkes-mldata
Merge pull request #3 from ogrisel/pberkes-mldata
FIX: update mldata tests to match recent updates; mock_urllib2 now accepts ordering parameter
Rafael Cunha de Almeida (1):
Only reassign centers if to_reassign.sum() > 1
Raul Garreta (5):
PY3: used six.u to fix unicode variables in svmlight
PY3: six.moves.cStringIO to fix StringIO import
PY3: fix None comparison (when not in OS X) in test_k_means.py
PY3: used six.moves.xrange to fix xrange
PY3: used six.iteritems to fix dict iteritems in module pipeline.py
Richard T. Guy (4):
Switched dynamic default args in random forest
Added test
Switched default parameter to tuple from lists.
move tuple back into arguments
Rob Speer (6):
Change 'charse_error' to 'charset_error' in load_files.
Revise documentation about handling text and bytes.
Add a documentation section about decoding text.
Move the new "Decoding text files" doc section
FIX Minor stuff in document_classification_20newsgroups output
ENH Add filters on newsgroup text
Rob Zinkov (36):
Fixed typo in documentation
Adding guide on how to contribute to project
Fix indentation
Removed tabs from indentation
COSMIT: noting that PRs don't send mail to mailing list
Moved link for further info to be more prominent
Adding Passive Aggressive learning rates
Added documentation to stochastic_gradient
Added to documentation
Added documentation and removed PA
Added tests
COSMIT: spelling correction
Adding example
Added smoothing to example
COSMIT typo
PEP8 fix
PEP8 COSMIT
PEP8 COSMIT
Enforcing non-negative step-size
Split out PassiveAggressive Classifier into its own object
Adding PassiveAggressiveRegressor estimator
COSMIT
Added documentation for new classifier and changed seed to random_state
Fixed typo
Renamed learning_rate loss in PassiveAggressive
Correct documentation
Corrected doctests
Fix indentation
Fixed docstrings and seed tests
Fresh fixes of grammar errors
Grammar fixes
Adding support indices in svm for sparse matrices
COSMIT PEP8
Adding test to check support_ is equal in dense and sparse matrices
COSMIT PEP8
Recompiled base
Robert (11):
Twenty newsgroups will not create folder if the folder doesn't exist and the files won't be downloaded anyway
Example file based on Affinity Propagation example.
Fixed noted issues with previous version
params in DBSCAN.fit description
DBSCAN now takes either a similarity matrix, OR a feature matrix.
label_num is now only calculated once. This corrects a previous patch, in which I incorrectly left a refactoring half finished, breaking the code badly :(
dbscan_.py file reinstated after accidental deletion
Function to calculate similarity matrix given either a feature matrix or a similarity matrix
Fixed documentation, and the input matrix is now consistently called 'X'.
NOW X is used consistently everywhere
pep8'd and pyflakes'd
Robert Layton (228):
DBSCAN clustering algorithm. A density based cluster analysis algorithm that looks for core points in dense neighbourhoods.
DBSCAN density based clustering algorithm (Ester et al. 1996)
Merge pull request #1 from larsmans/dbscan
labels_ doc updated
Added a paragraph in the documentation.
K-means with transform method.
pep8 fix for k_means_.py
Fixed documentation in example
Examples for dbscan in documentation
Much better example with pyplot, thanks to suggestions by GaelVaroquaux.
vq now the default in KMeans.transform
n_samples used instead of n_points in transform()
American spelling
Example now much more likely to return 3 clusters.
calculate_similarity changed to calculate_distance, moved to metrics.pairwise.py
Import of calculate_distance in metrics.__init__.py.
Merge branch 'master' of https://github.com/robertlayton/scikit-learn
Tests updated to work with the new distance based method.
Test using a callable function as the metric
Multiple small changes
pep8'd
kmeans example renamed
Digits example has plot.
Merge branch 'origin/master' into dbscan
Small changes, mostly to wording
Reference to calculate_distances fixed
Returned line I removed for some reason
Deleted line I returned that I really didn't delete.
K-means documentation updated to include information based on this PR
Extra example removed
Small fixes as per ogrisel's comments.
Merge remote-tracking branch 'remotes/origin/master' into kmeans_transform2
Small changes based on mblondel's comments. Nothing overly noticeable
Replace points with samples everywhere
random_state used instead of giving index_order as argument
Description for components_ attribute. Renamed core_samples_ attribute to core_samples_indices_ to remove confusion
Split the transform method into a predict and a transform.
Merge remote-tracking branch 'upstream/master'
Merge branch 'master' into kmeans_transform2
Merge remote-tracking branch 'mblondel/kmeans_transform2' into kmeans_transform2
Merge remote branch 'upstream/master' into pairwise_distance
Initial changes to improve this module. pairwise_distance now uses a dict for functions.
Working through some of the errors in testing
Fixing twenty_newsgroups
Fixed a few import errors
Example images
VQ example. Not working yet - clusters aren't well formed I think.
Fixed loader problems
X -> XA, Y -> XB. pairwise_distance back to metrics
check_set_Y -> check_arrays
Ran tests and fixed a few bugs. Unit tests added.
Less verbose name
Test for tuple input. Tests now run in suite (forgot to have test_ at start of func name!)
XA -> X, XB -> Y
Merge branch 'master' into pairwise_distance
Moved metrics file to sklearn
pairwise_kernel function (untested, for comment)
PEP8 of metrics.py
import to metrics namespace for pairwise_kernels
Merge branch 'master' of https://github.com/scikit-learn/scikit-learn into pairwise_distance
Merge branch 'master' into pairwise_distance
Merge branch 'master' into pairwise_distance
Tests working, mostly pass
Merged PR263 into this PR
Fixed merge conflict
Fixes based on ogrisel's comments
l1_distances -> manhattan distance
pep8'd and pyflakes'd
Remove l1_distances completely, updated gaussian_process
Actually removed l1_distances this time
test_checks merged into test_pairwise. test_checks is empty for now.
Removed test_checks
Fixed doctest and checked tests working - most are;
pairwise callable metrics fixed
Now tests if tuples given as input
check_pairwise_arrays now ensures at least two dimensional arrays are returned.
pep8'd and pyflakes'd
metrics listed in pairwise_distances and pairwise_kernels
kwds was being passed to squareform instead of pdist. This has been fixed, with a test added
pairwise helper functions to give verbose knowledge of which metrics
Fix commenting in pairwise_distance
check for sparse matrices for scipy metrics, and throw error. test included
Brief description of kernels and distance metrics in doc
Added a list
Little more description
Fixed typos
manhattan_distances now returns [n_samples_X * n_samples_Y, n_features_X] shape array
Doc update for manhattan_distance
Fixed doctest error
Edited sklearn/metrics/pairwise.py via GitHub
Initial Silhouette Coefficient code. no tests yet, and haven't checked it actually works yet as well
Initial test. Not working yet
Included distance helper functions line for 0.9 release
API changes in metrics/pairwise.py
Merge branch 'silhouette' of https://github.com/robertlayton/scikit-learn into silhouette
Test working, pep8'd and pyflakes'd
Sparse matrix testing
Swapped y, D to distance, labels
silhouette_coefficient -> silhouette_score
Restructured metrics/cluster into a folder with supervised and unsupervised modules
Narrative documentation
Merge remote-tracking branch 'upstream/master' into silhouette
"whats_new" updated
Example updated, which required fixing a backwards compatibility bug (adjusted_rand_score not imported in metrics/cluster/__init__.py)
Silhouette added to AP example
Using pairwise_distances in the Silhouette Coefficient. Updates to docs, code, tests and examples
Silhouette calculated for all forms of k-means in example
Faster version by removing inner loop comprehension
Sampling to improve SC speed
sampling added to silhouette_score, examples updated to match
pep8 and pyflakes
Updated doc with new API
Removed unneeded line from doc
Merge pull request #364 from robertlayton/silhouette
Trying to fix NaN errors, but it's not working. Pushing to work on it later.
Mutual information now works (tested!)
AMI now works, and has been tested against the matlab code (test based on this to come!)
Remove phantom double v-measure !?
Added tests. There are two errors, but I'm going to bed. I'll fix them in the morning.
Merge branch 'master' into ami
Merge branch 'ami' of github.com:robertlayton/scikit-learn into ami
- AMI in the cluster examples
Higher level import for ami_score
There is an overflow problem. It can be reproduced with the plot_adjusted_for_chance_measures.py example
Narrative doc, and I think I fixed the overflow issue (more tests to come)
Fixed logs to match the matlab code results.
Test now tests a much larger array
Test actually does what I meant it to do, and works sufficiently
Fixed this example. Tested the others (they worked!)
pep8 and pyflakes
Merge pull request #3 from ogrisel/robertlayton-ami
Optimising the expected mutual information code
Adding old version of EMI, as I'm about to change it
This version doesn't work either. I am uploading for historical sake.
Initial usage of gammaln. Not yet tested
Merge branch 'master' of https://github.com/scikit-learn/scikit-learn into ami
Still overflows, but the closest so far. Using gammaln
It works! Still have some optimisation to do, but it works for larger arrays
Moved start and finish outside of loop
comments, pep8 and pyflakes
Merge branch 'master' of https://github.com/scikit-learn/scikit-learn into ami
ami_score -> adjusted_mutual_info_score
ami_score -> adjusted_mutual_info_score
"What's new?" AMI!
Merge branch 'ami' of https://github.com/robertlayton/scikit-learn into ami
mutual_information_score -> mutual_info_score
and in plot_adjusted example (mutual_info_score)
Merge branch 'master' of https://github.com/scikit-learn/scikit-learn into ami
cosmit
Merge pull request #402 from robertlayton/ami
Fixed values in Adjusted Mutual Information doctests
l1_distances was renamed to manhattan_distances.
Mutual Information docstring incorrectly said it was the adjusted mutual information
Moved single k-means run to its own function to enable optional parallelisation later.
Parallel version of k-means.
pep8 and pyflakes tested
Not doing a full sort for getting the best results
Updating random_state inbetween iterations of k-means fixes some issues
Doc updates
Fixed author reference (removed link as it wasn't working)
Added my twitter account as homepage.
feature_extraction/text.py: 'ignore' removed as a default, class param
Added a test (that doesn't work yet)
Test now works, testing both the Word and Char analyzers
decode_error -> charset_error
docstring update
cosmit
cosmit undoing (was testing)
pep8
cosmit to docstrings
NearestCentroid classifier, with test suite.
Shrink threshold working, along with a test
Sparse tests, but they are currently failing. Committing for comment
Typo for "neighbours", and converted to en-US
Test for sparse matrices. Tests fails, my guess is that centroids are the same.
Fixed bug in nearest_centroid, and removed boston test.
Narrative documentation
Sparse tests pass when using shrinkage
Turned on final test (it works!)
Broadcasting used to remove a loop
Removed asserts in code
Test use assert_array_equal where appropriate
pyflakes on test
Update to documentation
Moved to the `neighbors` namespace
Example of nearest neighbor, getting an improvement when using a shrink threshold of 0.1
Explain example in docs
Update examples/neighbors/plot_nearest_centroid.py
Update doc/whats_new.rst
Update doc/whats_new.rst
Removed unneeded numpy.array call in test
metric fixed in tests
Merge remote-tracking branch 'origin/nearest_centroids' into nearest_centroids
Merge pull request #5 from larsmans/nearest_centroids
This test repeats issues 960, with the silhouette coefficient returning nan
nan values are converted to zeros
k-means now no longer needed in test.
Distance matrix doesn't matter, and was therefore removed
Test for "amg" mode for spectral clustering added.
docfix: spectral_cluster doesn't return n_centers
pep8
Spectral will raise an error if the mode is set to amg and pyamg is not available
Test that an unknown mode raises the appropriate error
Update to the clustering.rst module file for k-means. Added a plain language description and the objective function.
Updated fixes from larsmans
Merge pull request #1478 from amueller/pep8
Merge pull request #1451 from amueller/chunksize_batchsize_rename
First draft of new Affinity Propagation description in docs.
Who doesn't love equations?
Spelling
Update doc/modules/clustering.rst
DOC improve mini-batch k-means narrative
DOC: Replaced all BSD style licenses with "BSD 3 clause"
Minimal spanning tree backported from scipy 0.13
Added test
Moved mst to a subfolder and added a README file
Added new files (from previous commit)
Merge pull request #2055 from jnothman/cv_refactor
Merge pull request #2076 from pprett/dbscan-doc-enh
Traversal in and tested. Next step is to remove references to old code
Removed reference from spectral_clustering to old csgraph
csgraph updated from hierarchical.py
Removed actual _csgraph file, tests still all pass
Turns out sparsetools wasn't needed either
Missed a spot
Reference to graph components updated in dev docs
Two more spots. I think that's it
Now that the folder has more than just mst in it, rename to sparsetools, which should help with referencing it.
Robert Marchman (13):
test case for unfitted idf vector
raise ValueError for unfitted idf vector
FIX docstring deletions
ADD test coverage for _check_stop_list
FIX comment typo
ADD test cases to fill out VectorizerMixin coverage
ADD another VectorizerMixin test
ADD test for get_feature_names
ADD test for tfidf fit with incompatible n_features
ADD test for TfidfVectorizer attribute setters
MV Mixin tests to CountVectorizer tests
RM CV import
MV _check_stop_list tests to CV get_stop_words
Robert McGibbon (3):
fix the kwarg name
updated the .c file
remade the cython with 0.18
Rolando Espinoza La fuente (1):
DOC typo: Pereptron -> Perceptron.
Roman Sinayev (3):
ENH Rewrote CountVectorizer fit_transform to be ~40% faster
ENH refactor and further speed up CountVectorizer
ENH speed up TfidfTransformer using spdiags
Ron Weiss (63):
added hmm code from http://github.com/ronw/gm
removed logging, dependency on abc, and unnecessary imports
added hmm unit tests
cleanup hmm module: made properties compatible with Python 2.5, etc.
changed hmm.trainer usage: each hmm object must have a _default_trainer property which can be overridden by passing a different trainer into hmm.train()
changed "train" -> "fit". Removed HMMGMM for now.
removed references to gmm.init() in gmm docstrings
fixed random seed in hmm unittests
removed init() method from hmm classes
minor tweaks to make hmm.GaussianHMM look like gmm.GMM
fixed bug in HMM viterbi logprob
added MultinomialHMM, unit tests
fixed *HMM.fit() to include *all* parameters by default
removed ndim argument from gmm.rvs()
added validate_covars back to gmm.py
added ndim property back to gmm to keep it consistent with HMM
Merge branch 'hmm'
added support for HMMs with GMM emissions
Merge remote branch 'upstream/master'
fixed GMM examples
fixed GMM examples
fixed broken doctests in gmm.py
updated gmm.py to comply with scikit-learn API. fixed pep8, pyflakes errors
DOC: fixed typos in developer documentation
Merge branch 'master' of ssh://scikit-learn.git.sourceforge.net/gitroot/scikit-learn/scikit-learn
BUG: fixed failing test in GMM
remove GMM.lpdf method
merge
update hmm module to comply with scikit-learn API
remove hmm.HMM factory to simplify hmm module's interface
merge hmm_trainers into hmm module
finish merge of hmm_trainers with hmm and remove hmm_trainers
remove extraneous tests from test_hmm.py
speed up hmm unit tests, add test for GaussianHMM with priors
fix GMMHMM bugs. speed up tests
Merge branch 'master' of github.com:scikit-learn/scikit-learn
FIX: failing gmm tests
Merge branch 'master' of github.com:scikit-learn/scikit-learn
change GMM initialization to use cluster.KMeans
change GaussianHMM initialization to use cluster.KMeans
merge
fix bug in hmm.GaussianHMM mstep update for 'full' covariance
Reapply "ENH: enhacements in the gmm module."
fix gmm examples
merge
fix bug in GMM._get_covars dimensions
make HMM interface consistent with GMM
clean up interfaces in hmm and gmm
remove n_symbols argument from MultinomialHMM.__init__
Merge branch 'master' of github.com:scikit-learn/scikit-learn
add GMM classification example
clarify GMM classifier labels
add GMM.predict_proba
add default initialization of GMM.weights to constructor
rename GMM.n_dim to GMM.n_features to be consistent with the rest of the scikit
add HMM.predict_proba
rename HMM.n_dim to HMM.n_features to be consistent with the rest of the scikit
fix pep8, pyflakes errors
Merge branch 'master' of github.com:scikit-learn/scikit-learn
scikits.learn.gmm -> scikits.learn.mixture
Merge branch 'master' of github.com:scikit-learn/scikit-learn
BUG: fix GaussianHMM.fit to allow input sequences of different lengths
FIX remove broken test in test_mixture
Ronan Amicel (2):
Fix broken merge by ogrisel :-P
PEP8 + missing fit methods
Roy Hyunjin Han (2):
Fixed some typos
Update examples/exercises/plot_iris_exercise.py
Salvatore Masecchia (6):
FIX: coordinate descent stopping rule
added missing _set_params call in LineadModelCV
unified and simplified path params creation in LinearModelCV
fixed parameters passing of LinearModelCV.fit, with test
thread safe tests for coordinate descent
pyflakes/pep8 on coordinate descent
Satrajit Ghosh (58):
BF: k-fold should accept k==n
BF: k-fold should accept k==n
resolved init
initial import from milk
renamed, additional import
started conversion to scikits
updated information gain and set_entropy functions
modified base classes
updated docstring to reflect use
updated load_iris to return features
enh: updated decision tree classifier and associated example
updated default impurity measure
added new impurity measures
updated random forest classifier to operational status
updated cython script to calculate gini measure
removed classifier.py
resolved conflicts
Merge remote-tracking branch 'noel/decisiontree' into treemerge
fix: trailing-spaces option fixed to be executed
doc: updated docstring for permutation_test_score to reflect nature of p-value given the type of score_func
sty: ran make trailing-spaces
doc: fixed spelling
doc: updated docstring based on feedback
fix: permutation test score averages across folds
added avg_f1_score
tst: added tests
enh: added matthew's correlation coefficient
sty: pep8 + doc
Merge branch 'master' into enh/metrics
fix: added ensemble to setup.
Merge branch 'master' into enh/metrics
enh: added support for weighted metrics closes #83
doc: added description for matthew's corrcoef from wikipedia
sty: pep8 fixes
sty: pep8 on test file
doc: removed strange character
fix: updated tests to reflect that micro shows the same precision and recall
fix: average with elif
doc: improved description of average
api: changed pos_label to None for metrics
Merge remote-tracking branch 'upstream/master' into enh/metrics
Merge remote-tracking branch 'mblondel/metrics' into enh/metrics
Merge pull request #443 from satra/enh/metrics
fix: convert input arrays to float
fix: force copy to True in case underlying default behavior changes.
tst: added test for feature selection. this test would have failed in the previous case. closes #727
doc: added reference to lobpcg and note about small number of nodes
fix: addressing gael's comments
fix: set syntax
fix: increase robustness of label binarizer test
sty: white space
fix: change affinity check
doc: clean up style and grammar
ref: change name to indicate semantics
fix: removed unused keyword precomputed and clean up if clauses
fix: moved random state check to fit
doc: removed merge diff markers
doc: align hyphens
Scott Dickerson (3):
train_test_split: test_size default is None
Modified docstrings
Modified docstrings and tests
Scott White (2):
add support for multi-class
add todo
Seamus Abshere (1):
ENH reduce size of files produced by dump_svmlight_file
Sebastian Berg (1):
FIX: Do not rely on strides for contiguous arrays
Sergey Feldman (1):
Adding covariance regularization to QDA
Sergey Karayev (2):
fixing bug in linear_model.SGDClassifier for multi-class warm start
removing accidental space
Sergio Medina (2):
Fixed small typo, even though the message is kind of the same and the one with the typo is waaay funnier.
Corrected a few things on the Mutual Information doc pages.
Shaun Jackman (1):
BernoulliNB: Fix the denominator of P(feature)
Shiqiao Du (36):
improved computational speed by calling fast scipy built-in function and replacing double loop
fixed some pep8 warnings
Merge remote branch 'upstream/master'
added a cython module to the hmm
replaced (T, N) -> (n_samples, n_states)
- renamed (n_samples, n_states) -> (n_observations, n_components) in hmm.py
Merge pull request #1 from agramfort/hmmc
dropped "_c" suffix
debugged _hmmc.pyx
fixed problem of _accumulate_sufficient_statictics in hmm.py
- removed unnecessary **kwargs specification in fit and _do_mstep methods
replaced deprecated "rvs" to "sample"
made `sample` also return the sequence of internal hidden states
added doc for hmm
- fixed typo in hmm.rst
made `sample` also return the sequence of internal hidden states
rebased to the master and fixed conflicts
bug fixed
fixed _do_viterbi_pass
fixed doc
fixed typo
replaced function call of decode to predict
removed pure python codes and beam pruning options
Added change history to what's new
updated author and pep8
modified phrases in what's new
- added decoder selection
fixed some typo, doctest and pep8
added comment on decoder algorithm in the rst doc.
Merge pull request #2 from GaelVaroquaux/hmmc
Merge pull request #847 from kwgoodman/master
fixed bug of initialization in hmm.py
added test_fit_with_init to tests/test_hmm.py
pep8, ignored E126-E128
- avoid startprob, transmat, emissionprob containing a zero element by
- check input format of MultinomialHMM.fit
Stefano Lattarini (1):
COSMIT various typofixes
Steve Koch (1):
Update hmm.rst
Steven De Gryze (9):
PY3: fixed basestring in crossvalidation.py
PY3: use b() convenience function for string literals
PY3: ensuring file stream is read as binary
PY3: convert string literal to bytes using six in cython file
replacing numpy array with range for use in random.sample
PY3: changing None to 0 to ensure comparability in py3
PY3 fixing utf8 comments in svm through try/except and six.b
PY3: forcing execution of map by using tosequence
PY3 fix comparison of ndarray and string
Sturla Molden (1):
Update typedefs.pxd with correct ITYPECODE
Subhodeep Moitra (17):
P3K: 'type' has been renamed 'class' in python3
P3K: Fixed dtype doctests for Python3
P3K: Fixed print related Python3 errors
P3K : Fixed range iterator to be list
P3K: __len__ returned float instead of int. Typecasted.
P3K : Convert int type checking to np.integer
P3K : Typecasted float to int
P3K : Changed / to // to typecast float to int
P3K: Modified RuntimeError message args
P3K : Replaced / by //
P3K : Refactored test cases to use setUp
P3K: print back compatible with python2.6-7 with __future__ import
P3K: Fixed None < Float Python 3 error
P3K: Fixed unicode pickling error by changing to BytesIO
P3K: Fixing prints and dtypes
P3K: Fixed RuntimeError.message
P3K: Fixed print related Python3 errors
Szabo Roland (3):
ENH Added custom kernels to SpectralClustering
BUG Add lambda_ attribute to ARDRegression after fit
DOC Add labels and some explanation to confusion matrix example
Tadej Janež (17):
DOC: further improvements to the model selection exercise
DOC: further improvements to the model selection exercise
Merge remote-tracking branch 'upstream/master'
DOC: another improvement to the model selection exercise
DOC: Improved the code that shows how to export a decision tree to Graphviz and generate a PDF file.
Skip doctest for the Python code involving pydot.
Skip doctest for the remaining line involving pydot.
Removed an unnecessary if statement in KFold __iter__ method.
Improved the test that checks the balance of sizes of folds returned by KFold.
DOC Corrected the docstring of KFold about the sizes of the folds.
COSMIT Moved the test_roc_curve_one_label test where other ROC curve tests are.
FIX KFold should return the same result when indices=True and when indices=False.
ENH Function auc_score should throw an error when y_true doesn't contain two unique class values.
ENH optimizations in sklearn.cross_validation
FIX Moved copying of labels in LeaveOneLabelOut and LeavePLabelOut to __init__.
TST Added test that checks if LeaveOneLabelOut and LeavePLabelOut work normally if the labels variable is changed before calling __iter__.
DOC Fixed doc test to work with the fixed versions of LeaveOneLabelOut and LeavePLabelOut.
Thomas Jarosch (1):
BUG delete/delete[] error in Liblinear
Thouis (Ray) Jones (4):
Wrapped BallTree in Cython.
Renamed for backwards compatibility, fixed C++ Exceptions to propagate to python
balltree - be explicit about return types' width
check input arguments to BallTree, and be more careful in dealloc'ing
Tiago Nunes (6):
Add fit_transform to FeatureUnion
Change / to (…) line continuation
Add test case for FeatureUnion.fit_transform
Fallback to fit followed by transform if fit_transform is unavailable
Add test case for fit_transform fallback
Fix pipeline fails if final estimator doesn't implement fit_transform
Tim Sheerman-Chase (5):
FIX: Corrected NuSVR impl type and set epsilon to None
Added a fix to prevent tree splits on samples that are
Removed exception from _find_best_split to avoid code bloat.
Removed unnecessary variables
Enable graphviz export function to export trees as well as regressors
Tiziano Zito (1):
FIX broken links to Rubinstein's K-SVD paper.
Udi Weinsberg (1):
corrected Gaussian naive-bayes to correctly compute the class priors
Vincent Dubourg (35):
Hello list,
Correction of a bug with the management of the dimension of the autocorrelation parameters.
Forgot to retire pdb.
Commit of a 'Gaussian Process for Machine Learning' module in the gpml directory. The module implement a class named GaussianProcessModel. I also add doc, examples and tests (involving a coupling with the cross_val module).
Correction of a bug in test_gpml.py (now runs perfect on my machine!). I just don't know how to involve this test within the whole scikit testing procedure (nosetests). Also add a modification of the TOC in doc.
Correction of a bug in the basic regression example.
Delete the old kriging.py module
Modification of the score function. The score function now evaluates the deviation between the predicted targets and the true ones. This is for convenience only because it allows then to use the distributing capacity of the cross_val module. The old score function is renamed with the more explicit name: `reduced_likelihood_function` (see eg the DACE documentation).
Modification of the main __init__.py file of the scikits.learn package in order to load the gpml module and tests.
Renames as suggested by Alexandre. Simplification of the examples. Remove the interactive contour label picking in the probabilistic classification example.
Bugged example after modification. Now correct!
I Ran the PEP8 and PYFLAKES utils and corrected the gaussian_process module related files.
Can't comply with contradictory PEP8 rules on some specific code such as:
I removed the time-consuming test and made a regression example from it.
Replaced np.matrix(A) * np.matrix(B) by np.dot(A,B), so that the code is a lot clearer to read...
Removed plotting command from the examples in the GaussianProcess class docstring.
Simplification of input's shape checking using np.atleast_2d()
Changes in format of the fit() input (np.atleast_2d for X, and np.newaxis cat for y).
Force y to np.array before concatenating np.newaxis in fit().
Modifications following Gaël latest remarks.
Added Welch's MLE optimizer in arg_max_reduced_likelihood_function() plus reference in the docstring.
Correction of a minor typo error in correlation_models docstring
Improvement of the documentation with a piece of code and reference to the regression auto_example. Add a README.txt file at the root of the examples/gaussian_process directory.
From: agramfort: don't use capital letters for a vector. Y -> y.
Forgot to retire pdb... Again!
Forgot one capital Y in the piece of code of the RST docpage.
Removed trailing spaces in the RST doc page.
Merge branch 'master' of https://github.com/scikit-learn/scikit-learn
metrics.explained_variance was renamed to metrics.explained_variance_score so that I needed to modify this example.
Removal of the submodule relative imports in the toplevel init file.
gaussian_process module changes:
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Merge branch 'gaussian_process_review'
Debug in GaussianProcess.predict for batchwise computation
Debug GaussianProcess.predict for variance estimation in 'light' storage mode.
Vincent Michel (46):
New feature selection
Last version of univariate selection
Merge branch 'master' of git@github.com:vmichel/scikit-learn
Corrections of indentation in univariate_selection
remove old univariate_selection
Correct nosetests for univariate_selection
Add doc to univariate_selection
Merge branch 'master' of git@github.com:vmichel/scikit-learn
Merge branch 'master' of git@github.com:agramfort/scikit-learn
Merge branch 'master' of git@github.com:agramfort/scikit-learn
Add rfe example
update example
Merge branch 'master' of git@github.com:agramfort/scikit-learn
Merge branch 'master' of git@github.com:agramfort/scikit-learn
Corrections in rfe
Remove feature selection
Add ranking_
Add Crossvalidated version of RFE
Add example of RFE CV
ENH : New version of Bayes Ridge
Merge branch 'master' of git@github.com:vmichel/scikit-learn
Newer (and faster !) version of Bayesian regression.
Merge branch 'master' of https://vmichel@github.com/scikit-learn/scikit-learn
ENH : New version of Bayes Ridge
Newer (and faster !) version of Bayesian regression.
Update tests for bayes
Merge branch 'master' of git@github.com:scikit-learn/scikit-learn
Add first draft of variational bayes
Add variational inference
DOC: Update doc for bayesian regression + examples
Merge branch 'master' of github.com:scikit-learn/scikit-learn
FIX : Remove reference to Variational Bayes
More doc in bayes.py, fix bug in high dimension, add score
Merge branch 'master' of github.com:scikit-learn/scikit-learn
ENH : change the convergence trigger
More coverage for bayes
DOC : create and start doc for cross-validation.
Merge branch 'master' of github.com:scikit-learn/scikit-learn
DOC : add changes in classes.rst
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Add ward algorithm + feature agglomeration
Add documentation on Ward algorithm
Add documentation on hierarchical clustering.
Merge branch 'master' of github.com:scikit-learn/scikit-learn
[base.py] revert previous commit, as the error is raised when object does not follow scikit API
[feature_extraction] Refactor text/* to text.py
Vincent Schut (8):
added a converged_ attribute to GMM to indicate whether fit() returned because of convergence or because max_iter was reached.
reset GMM.converged_ when calling fit() again
split >80 char comment in 2
add GMM.converged_ attribute to GMM docstring
some optimizations for GaussianProcess
pep8 improvements
remove unnecessary parens
batch k-means: calculate labels and inertia in chunks to prevent memory errors
Virgile Fritsch (79):
DOC: Fix typos in svm module documentation.
Remove Y from fit in OneClassSVM.
Add a reinitialization function for estimators + write test for
Merge branch 'master' of github.com:GaelVaroquaux/scikit-learn
Change test name for the _reinit() method.
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Merge branch 'master' of github.com:GaelVaroquaux/scikit-learn
DOC: Explain the _set_params method in BaseEstimator class.
Rename PyBallTree* --> BallTree* in BallTree.cpp.
DOC: typos + change name of LDA vs QDA examples
Refactoring of the covariance estimators modules.
OAS estimator of covariance + new example.
Refactoring of the covariance module and examples + add OAS.
Merge branch 'covariance' of github.com:VirgileFritsch/scikit-learn into covariance
More covariance refactoring: separate MLE computation from object.
Rename BaseCovariance as EmpiricalCovariance + reviews comments.
Remove useless calls to np.asanyarray and improve computation.
Cosmit
Handle integer type case for the estimation of covariances.
Use np.cov instead of empirical_covariance in covariance module.
Reintroduce empirical_covariance function + docstrings + cosmit.
DOC: Documentation about covariance estimation.
Compatibility Ubuntu 11.04 (with matplotlib 0.99.3)
Modify the method computing errors on covariances (<cov_object>.error)
Bug fix: turn <covariance_object>.mse into <covariance_object>.error
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Covariance errors computation API changes.
Docstrings about labels + cosmit in the metrics module.
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Add a void cython module affording to check that `make` has been run.
Implements a robust covariance estimator: Rousseeuw's MCD.
Integrate Fabian's comments on Minimum Covariance Determinant.
Implements a robust covariance estimator: Rousseeuw's MCD.
Integrate Fabian's comments on Minimum Covariance Determinant.
Merge branch 'mcd' of github.com:VirgileFritsch/scikit-learn into mcd
BF: index out of bound in GraphLassoCV grid refinement.
Refactor MCD robust covariance estimator: it is easier to regularize.
Merge with Gael's glasso changes.
Make the design even more modular for MinCovDet.
Make the "robustness parameter" accessible through the API.
Integrate Gael's minor comments + Magnify examples + 1D data case.
Remove `correction` and `reweighting` parameters from the API.
Merge pull request #396 from VirgileFritsch/refactor_mcd
OPT: (minor) remove useless determinant computation in FastMCD.
Separate correction and reweighting steps from raw MCD computation.
Add a set of tools and a new object for outliers detection (+ example).
Add tools to perform outlier detection with sklearn + documentation.
Clean working directory
Integrate AlexG's comments on doc and examples + add tests.
Magnify novelty and outlier detection examples again + minor fixes.
DOC: Move Parameters section outside objects __init__ method.
Example on real data (outlier detection on boston housing data set).
Fix bugs + adjust OCSVM parameter in outlier detection example.
Cosmit: address Olivier's comments on examples naming.
BF: Avoid two consecutive centering of the data in outlier_detection.
rename mahalanobis_values to raw_values in covariance decision method.
ENH: make LedoitWolf estimation scale (memory usage) with n_features.
The LedoitWolf object has to return a covariance estimate or breaks.
Put Ledoit-Wolf shrinkage coefficient estimation in a separate function.
Avoid extra computations + clean `assume_centered` argument use.
Remove forgotten line related to previous commit.
Catch non-invertibility errors within MinCovDet computation.
Improve covariance module test coverage.
More tests for the covariance module.
BF: adapt a svm test to recent numpy versions.
BF: Make MinCovDet work with n_samples >> n_features.
Merge branch 'cov-speedup' of https://github.com/vene/scikit-learn into cov-speedup
Add comments on optimized precision computations.
Add comments on optimized precision computations.
Merge pull request #1015 from vene/cov-speedup
BF: Address issue #1059 in GMM by adding a supplementary check.
BF: Fix broken tests: change a check for compatibility with HMM.
BF: fix issue #1127 about MinCovDet breaking with X.shape = (3, 1)
Improve doc and error msg in MinCovDet in response to issue #1153.
BF: GridSearchCV + unsupervised covariance shrinkage selection.
Change legend + complete docstrings.
Improve example narrative doc (rewritten intro).
Fix typos in doc.
Add y=None to covariance estimators for API consistence purpose.
Vlad Niculae (758):
Barely functional NMF implementation.
Updated the example with doctest tags.
Cleaned some syntax, implemented more flexibility.
Fixed svd-based initialization, fixed example
Wrote a few test cases.
Merge branch 'master' into nmf
Merged upstream changes
Added benchmark.
Merge branch 'master' into nmf
Added CRO-based initialization, TODO tests, bench
Untracked changes
Merge branch 'master' into nmf-nnls
Put CRO inside nmf.py
Sparsity constraints and measures of sparsity
Merge branch 'master' into nmf-nnls
Style fixes all around. Clarified NNDSVD docstring.
Decreased default NMF tolerance to improve results.
Corrected sparseness measures in NMF.fit
Removed print in CRO.fit; moved utils to top.
Possibly fixed errors in doctest (not verified yet)
Doctests pass now
Fixed bug in transform (lack of .T), renaming
Non-negative least squares testing
Renamed tolerance to tol for consistency.
Wrote tests to cover mostly everything
NMF example on faces dataset
Implemented fit_transform
Tweaked plot aspect ratio
Fixed broken tests due to interface change
Tests now behave better
Renaming; removed numpy 2-norm
Removed useless _fit_transform
CRO inherits from BaseEstimator
Applied suggestions; updated bench and example
Updated doctest
pep8 fixes
Abbreviation expansion in benchmark
Fixed comments in NMF example
pep8 on test_nmf
Removed comments.
Removed CRO for now
Added nndsvda and nndsvdar options for NMF.init
Merge branch 'nmf-nnls' into nmf-lite
Benchmarks and more pep8
Fixed benchmark, removed unused import.
Fixed NMF benchmark colors
Merge branch 'nmf-lite' of git://github.com/agramfort/scikit-learn into nmf-lite
Merged
Fix benchmarks printing of error for alt-nmf
Documentation. Discussed fixes. Set default to ar.
Added KPCA citation.
Added NMF to classes.rst
Fixed non-ascii characters
Change PCA test to fit just once
Updated documentation with references
Added y=None in fit for pipelining
Fixed relative URI in NMF doc refs
Clarification of example in NMF doc
Capitalized Gram, added y=None in fit, pep8 test.
Docstring formatting in test_nmf.py
Docstrings in nmf.py
Merge branch 'nmf-nnls'. Docstring fixes, mainly.
Clarified NNDSVD in docstring
Documented NNDSVD. Fixed ar perturbation range.
Corrected error in docstring re: nndsvdar
Added disclaimer in nndsvdar docstring
Clarified invalid sparseness parameter error msg.
Clarified init parameter error message.
Transposed shape of components_ attribute
Renamed NMF to ProjectedGradientNMF
Updated authors
Merge branch 'master' into nmf-lite
DOC: Added both plots to NMF doc, tweaked plots.
DOC: Made plots look better.
pep8 in plot_kpca
Attributes renamed and documented.
Began work on decompositions package.
FIX: very confusing internal naming in NMF
Merge branch 'nmf-fix' into decomposition
Decomposition module WIP
Merge branch 'master' into nmf-fix
Merge branch 'master' into decomposition
Working decomposition package
MISC: pep8ification
Missed one reference
Merge branch 'master' into decomposition
FIX: KernelPCA plot in doc
FIX: forgot to track init file in tests
API: components_ shape fixed in PCA classes
ENH: More accurate and clean numeric code in PCA
ENH: More avoidance of np.dot for diagonal entries
Renamed fastica.py to fastica_.py
Merge branch 'master' into decomposition
FIX: Explicit docstring inheritance
FIX: last char in char analyzer, max_df behaviour
FIX: doctest
Copied the Sparse PCA file from the gist
Fixed Lasso call, all is still not right
LARS _update_V fixed by Gael
PEP-8
Initial factoring into SparsePCA class
Implemented transform, fixed confusion
DOC: clarified the default for NMF initialization
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Merge branch 'master' into sparsepca
Updated transform function, began tests
Merged Gael's gist newest update
Merge branch 'master' into sparsepca
A couple of passing tests
factored out the example code
DOC: a little commenting
renaming, included tests
Merge branch 'master' of github.com:scikit-learn/scikit-learn into sparsepca
Updated init.py
one more test and a quick example
pep8
DOC: foundations, prettified example
Doc enhancement, added alpha in transform
Merge branch 'master' into sparsepca
Added ridge in transform (factored here for now)
Removed print statement from test. Whoopsie!
Merge pull request #2 from agramfort/sparsepca
Initial integration of Orthogonal MP
Renaming, some transposing
Tests and the refactoring they induce
PEP8
Added signal recovery test
rigorous pep8
Added the example
Cosmetized the example
s/nonzero/non-zero
Added Olivier's patch extractor with enhancements
cleanup
Tests for various cases
PEP8, renaming, removed image size from params
Merged Gael's latest update to sparse_pca.py
Merge branch 'sparsepca' of github.com:vene/scikit-learn into sparsepca
Merge branch 'sparsepca' into sc
FIX: update_V without warm restart
FIX: weird branching accident
Merge branch 'sparsepca' into sc
Revert "FIX: update_V without warm restart"
Revert "FIX: update_V without warm restart"
Revert "Revert "FIX: update_V without warm restart""
Merge branch 'sparsepca' into sc
Initial integration of Orthogonal MP
Renaming, some transposing
Tests and the refactoring they induce
PEP8
Added signal recovery test
rigorous pep8
Added the example
Cosmetized the example
Added Olivier's patch extractor with enhancements
cleanup
Tests for various cases
PEP8, renaming, removed image size from params
FIX: weird branching accident
Revert "FIX: update_V without warm restart"
Revert "Revert "FIX: update_V without warm restart""
Merge branch 'sc' of github.com:vene/scikit-learn into sc
FIX: update_V without warm restart
Added dictionary learning example
Merge pull request #3 from agramfort/sc
renaming for consistency, tests for PatchExtractor
Initial shape of dictionary learning object
Added DictionaryLearning to __init__.py
FIX: silly bugs so that the example runs
ENH: Tweaked the example a bit
PEP8
Copied the Sparse PCA file from the gist
Fixed Lasso call, all is still not right
LARS _update_V fixed by Gael
PEP-8
Initial factoring into SparsePCA class
Implemented transform, fixed confusion
Updated transform function, began tests
Merged Gael's gist newest update
A couple of passing tests
factored out the example code
DOC: a little commenting
renaming, included tests
Updated init.py
one more test and a quick example
pep8
DOC: foundations, prettified example
Doc enhancement, added alpha in transform
Added ridge in transform (factored here for now)
Removed print statement from test. Whoopsie!
s/nonzero/non-zero
Merged Gael's latest update to sparse_pca.py
FIX: update_V without warm restart
FIX: weird branching accident
Revert "FIX: update_V without warm restart"
Revert "Revert "FIX: update_V without warm restart""
Merge pull request #5 from agramfort/sc
Merge branch 'sparse_pca' of git://github.com/GaelVaroquaux/scikit-learn into sparsepca
Finished merging Gael's pull request
Merge branch 'master' into sparsepca
Merge branch 'master' into sc
Merge branch 'sparsepca' into sc
Merge branch 'sc' of git://github.com/larsmans/scikit-learn into sc
Renaming, part one
Renaming, part two
Renamed online dict_learning appropriately
Merge branch 'sparsepca' into sc
Renaming part three
Fixed dico learning example
Used @fabianp's ridge refactoring
Exposed ridge_regression in linear_model init.py
Merge branch 'master' into sparsepca
Updated ridge import
Merge branch 'sparsepca' into sc
FIX: checks in orthogonal_mp
Cleanup orthogonal_mp docstrings
OMP docs, a little broken for now
DOC: omp documentation improved
DOC: omp documentation fixes
DOC: dict_learning docs
dictionary learning tests
Fixed overcomplete case and updated dl example
fixed overcomplete case
online dictionary learning object
factored base dico object
pep8
Merge branch 'sparsepca' into sc
pep8
more transform methods, split_sign
OMP dictionary must have normalized columns.
Merge branch 'master' into sparsepca
Merge branch 'master' into sc
DOC: improved dict learning docs
Tweaked the dico example
exposed dict learning online in init
working on partial fit
denoising example
Annotate the example
partial fit iteration tracking, test still fails
FIX: typo, s/treshold/threshold
Merge branch 'sparsepca' into mblondel-fix_ridge
simplify sparse pca
Tweak denoise example spacing
pep8 examples
pep8
Merge branch 'master' into sparsepca
Merge branch 'mblondel-fix_ridge' into sparsepca
Merge branch 'sparsepca' into sc
random state control, comment fixes
Merge branch 'sparsepca' into sc
random state control
clarify lasso method param
Merge branch 'sparsepca' into sc
clarify lasso method param in sc too
s/seed/random_state in patch extractor
DOC: fixed patch extraction comments
ENH: PatchExtractor transform
s/seed/random_state in dico learning example
s/seed/random_state in denoising example
FIX: s/V_views/code_views and pickling
Merge branch 'sparsepca' into sc
DOC: more sparse pca narrative documentation
FIX: gram when method=cd
Merge branch 'master' into sparsepca
removed fit_transform overload
Merge branch 'sparsepca' into sc
DOC: consistent punctuation, minor enh
DOC: missed a couple of dots
ENH: verbose and title in sparse pca example
DOC: fixed typo in sparse pca narratives
Merge branch 'dwf_sparse_pca' of git://github.com/GaelVaroquaux/scikit-learn into dwf_sparse_pca
TEST: fake parallelism
TEST: fake only on win32
TEST: no meddling with joblib outside of win32
Merge branch 'master' into sparsepca
Lower tolerance in sparse pca example
DOC: sparse pca transform rephrasing
DOC: more sparse pca transform rephrasing
One big decomposition example
DOC: consistent coding method in docstrings
Merge pull request #7 from GaelVaroquaux/dwf_sparse_pca
TEST: more coverage
FIX: sparse pca ignored initialization
Merge pull request #8 from GaelVaroquaux/dwf_sparse_pca
Merge branch 'sparsepca' of github.com:vene/scikit-learn into sparsepca
FIX: typo in example s/cluter/cluster
pep8
pep8 in example
FIX: messed up images in narrative doc
FIX: example image order is consistent (for now)
ENH: predictable ordering in example, included kmeans
kernel pca gets its own module
Merge branch 'master' into sc
DOC: fixed SparsePCA docstring issue
Brought in OMP from the larger branch
added functions to classes.rst
Remove useless prints in example
Merge branch 'master' into omp
consistency with lasso: s/n_atoms/n_features
DOC: some fixes
failing test for expected behaviour
FIX: LARS and LassoLARS did not accept n_features
PEP8
FIX: doctests
Merge branch 'master' into lars_n_features
FIX: broken doctest in Lars
cleared n_features naming confusion
s/n_nonzero_features/n_nonzero_coefs
Factored out sparse samples generator
pep8
OrthogonalMatchingPursuit estimator
pep8
Merge branch 'master' into omp
cosmit in example
unified notation
made code consistent with docstring
cleaned up tests, added count_nonzero to fixes
Added OMP bench
better cholesky management
pep8
arrayfuncs solve_triangular and EPIC creeping bugfix
fixed check for None
set random seed to hide odd random test failures
fix more None checks
more clarity
Added early stopping as in reference implementation
n_nonzero_coefs defaults to 10% if eps not passed
began rewriting the tests
transposed generator, updated tests
fixed stupid mistake causing the sample generator to be inconsistent
warn when omp stops early
no need for min, it would break on the previous line
change matrix order, gram looks ok now
use np.asfortranarray
tests robust to warnings
do not overwrite global warn filters in test
use np.argmax instead of x.argmax()
while 1 instead of while True
use nrm2 from BLAS
It's official: omp is faster than lars (w/o Gram)
API changes, part I
API changes, part II: Return of the Estimator
FIX: precompute_gram=auto
DOC: docstrings fixes
pep8
don't use gram in example, useless slowdown
FIX: benchmark was broken
DOC: docstrings
Convert to F-order as soon as possible
F-order asap, don't assume any overwriting
that was unneeded
clearer benchmark
Merge pull request #11 from agramfort/omp
DOC: referenced OrthogonalMatchingPursuit in doc
Merge branch 'omp' of github.com:vene/scikit-learn into omp
updated samples generator according to @glouppe's refactoring
typo s/dictionnary/dictionary
PEP8
Merge branch 'master' into omp
FIX: broken samples generator test
FIX: cruel bug in OMP, no more unneeded warnings now.
Merge branch 'master' into sc
Added Olivier's patch extractor with enhancements
Tests for various cases
PEP8, renaming, removed image size from params
s/seed/random_state in patch extractor
DOC: fixed patch extraction comments
ENH: PatchExtractor transform
extra blank line
pep8 in test file
image.py authors
speed up tests
improved warning for invalid max_patches
New file: Feature extraction documentation
Added feature extraction as a chapter
fix copy paste error in docstring
DOC: improved docstrings
Updated documentation, fixed bug in the process
DOC: clarified docstrings even more
Merge branch 'master' into sc
Accidentally removed a line in a test
pep8 in doc
rename coding_method, transform_method to fit/transform_algorithm
fix broken test
changed digits to faces decomposition example
added dict_learning_online function
MiniBatchSparsePCA is born
Removed dict_init in MiniBatchSparsePCA, docstrings
code reuse by inheritance, more tests
Fast-running face decomposition example
DOC: updated narrative docs for MiniBatchSparsePCA and example
DOC: fixes and updates
DOC: minor errors
FIX: broken test
Added MiniBatchSparsePCA and dict_learning_online to classes.rst
DOC: fixed issue in MiniBatchSparsePCA docstring
ENH: cleaner random number handling in tests
Removed default value of n_components=None in SparsePCA
Fixed inappropriate checks for None
Switched dict_learning_online returns order for consistency
ridge_alpha as instance parameter
prettify face decomposition example (ft. GaelVaroquaux)
add refs to example
Merge branch 'master' into sc
duplicated import
FIX: denoise example was broken
FIX: reconstruction test
make tests share data
clarify docstrings
added init test
partial_fit passes the test
added least-angle regression to dictionary learning transform
plugged in lars instead of lasso_lars in denoising example
Merge branch 'master' into sc
redesign the denoising example
FIX: BayesianRidge doctest
tweaked the example a little more
removed thresholding from denoising example
completely removed thresholding from denoising example
Prettify example
More work on example
tweaking example
DOC: clarified and enhanced dictionary learning narratives
added dictionary learning to classes.rst
corrected reference to omp
DOC: fixed link to decomposition example
DOC: fix See also
DOC: fix See also in both places
DOC: cleaner see also section
DOC: improved dict learning narratives some more
Data centering in denoising example
Prettify structure example
DOC: minor style changes
DOC: tweaks
Removed print in digits classification example
DOC: fixed links and made examples build
Merge branch 'clayw-label_prop' of github.com:vene/scikit-learn into clayw-label_prop
DOC: clarified example titles
Removed fit_params from dictionary learning objects
plot the dictionary in denoising example, other one will disappear
completely removed the duplicated example
Prettify the example
Rehauled example to show the difference
Renamed the example, bounded the difference range
Lower the range of the difference in example for better contrast
Added norm to titles
More explicit docstring in the example
Removed verbosity (example now 4s faster!), prettier output
fix output bug
PEP8 and style
Merge branch 'master' into sc
style
Merge branch 'master' into sc
Use fit_params in Pipeline
Moved dict_learning stuff out of sparse_pca.py
rename eps to tol in omp code
Exposed sparse_encoding, docs not updated
Consistent defaults
Updated the first part of the docs
Updated the docs
removed fit_transform for dict learning
Updated the narrative doc
Tweaking the example
Improved the example clarity
Merge branch 'master' into sc
removed unused imports
fixed all pyflakes warnings
Merge branch 'master' into sc
Copied tests, fixed examples imports, enhanced see alsos
Merge branch 'master' into sc
Merge branch 'sc' of git://github.com/agramfort/scikit-learn into sc
Merge branch 'master' into sc
Included dictionary learning online in decomp example
Added missing dashes in doc
Merge branch 'master' into sc
Merge branch 'vene-sc' of git://github.com/ogrisel/scikit-learn into sc
Merge branch 'master' into sc
Merge branch 'dictionary_learning' of git://github.com/GaelVaroquaux/scikit-learn into sc
renamed MiniBatchDictionaryLearning
layout
Reordered dictionary learning docs
tweaked faces decomposition and added to dict learning docs
added dict learning face decomposition to docs
Fixed image display in docs
simplified fit_algorithm keyword
s/img_denoising/image_denoising
made sparse_encode functions visible
added see also refs to sparse_encode functions
Reordered dictionary learning docs
Stabilized and improved face decomposition example
explicit seeding of olivetti faces loader
MISC: even better check_build error reporting
DOC: added Gaussian Processes to class reference
FIX: keep track of index swapping in OMP
Merge branch 'master' into omp_bug
Merge branch 'omp-bug-test' into omp_bug
Testing for swapped regressors in OMP
Merge branch 'omp-bug-test' into omp_bug
PEP8
Merge branch 'master' into omp_bug
Merge pull request #408 from vene/omp_bug
Skip tests in OMP that fail on old Python versions
Fix one-dimensional y in Gram OMP estimator
Added SparseCoder estimator
Basic testing
DOC: add missing split_sign in docstrings
FIX: 10% of features should be at least 1
PEP257 :)
restore typo
Added SparseCoder to init and class index
initial work on docs
implement noop fit in SparseCoder
clean up test
Fixed doc links
Fixed lena in example
Fixed lena import in denoising example
Merge branch 'master' into sparse-coder
cleaned up imports in test
Merge branch 'master' into sparse-coder
FIX: objective functions in Lasso linear model docs
DOC: correct ordering of returns in dict_learning_online
DOC: clarified dimensions in _update_dict
Fix the API and the scaling inside dict_learning
DOC: specify scaling in linear_model.rst
work on failing tests
Merge branch 'master' into sparse-coder
skip tests that were wrongly passing before
Test for almost equal instead of equal in sparse_encode_error
FIX: slices generation
Hide sparse_encode -- redundant
DOC: add optimization objective to lasso and enet docstrings
DOC: make docstrings as good as I could
Warnings and deprecation
DOC: better cross refs and docstrings
Adapted examples for alpha scaling
Merge branch 'master' into sparse-coder
PEP8
added sparse coding example
s/threhold/threshold
Merge branch 'master' of https://github.com/scikit-learn/scikit-learn into sparse-coder
Add SparseCoder example
Rehauled SparseCoder example
Merge branch 'master' into sc-example
Added @vene's work to the changelog
sparse coding transform is now a mixin
ENH: multilabel samples generator can create different number of labels per instance
pyflakes test_multiclass
Add the samples generator to the references
ENH: Added the synthetic example
ENH: Really added the synthetic example
DOC: add multiclass to class reference
DOC: add example to multiclass.rst
DOC: really add example to multiclass.rst
DOC: add image to narrative doc
Added missing space in PIL warning
DOC update changelog
Add Andy to the author list
Allow unlabeled samples in multilabel ex, collab between @vene and @mblondel on the plane
FIX typo that broke the test
ENH make example more expressive
Change seed to make example behave better
Removed unused imports in species dataset
FIX: issue #540, make omp robust to empty solution
Merge branch 'omp-zerofix'
ENHanced the multilabel example aspect
s/jacknife/jackknife
DOCFIX: make math block render
Add warnings and clean up tests
FIX: doctests for scale_C, took some liberties
FIX: bug in test_setup. Actually avoid multiprocessing now.
FIX: wrong cover-package, misleading coverage as 100%
DOC: updated testing instructions
Remove a warning from kmeans tests
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Remove deprecation warning in sparse_encode
Merge pull request #873 from vene/remove_sc_warning
ENH: make_regression supports multiple targets
Update make_regression return shapes in docstring
FIX: sparse ElasticNet tests that were not testing much at all
fix typo
ENH: faster design in FastICA
Begin updating the developers performance documentation
Update and fix errors in memory profiling documentation
DOC: better phrasing about memory profiling
Begin updating the developers performance documentation
Update and fix errors in memory profiling documentation
DOC: better phrasing about memory profiling
We already have the inverse at that step
Replace pinv calls with dgetri
More lapack inverting
Refactored fast_pinv without lapack calls
Compute pseudoinverse using eigendecomposition
Vectorize singular value inversion
Remove unused import
Merge branch 'master' into cov-speedup
Merge remote-tracking branch 'VirgileFritsch/cov-speedup' into cov-speedup
Merge remote-tracking branch 'jakevdp/vene-cov-speedup' into cov-speedup
Update and rename pinvh (by @jakevdp)
Cloned @jakevdp's pinvh tests
Remove odd-looking period in tests
Use pinvh in plot_sparse_recovery example
grammar
Use pinvh in bayes.py
Use pinvh in GMM and DPGMM
Remove deprecated _set_params and the call in grid_search
Remove chunk_size from k_means
Removed load_filenames and load_20newsgroups
Remove sparse_encode_parallel
Removed deprecated parameters in GridSearchCV
Remove LARS and LassoLARS
Remove fast_svd.
Remove _get_params
Corrected deprecation schedule in cross_validation
Remove deprecated properties in naive_bayes
Add or fix deprecation schedule in warnings.
Fix example using deprecated API, output was misleading.
Remove deprecated load_20newsgroups from classes.rst
FIX: randomly failing CountVectorizer test
MDS is not a transformer, fix the test to skip PLS
Merge branch 'master' into mixins
Improve the common tests, make fast_ica pipelinable
Support y-dependent transform as in PLS
fit_transform in PLS to support y
Make PLS degrade gracefully on sparse data
Rename Y to y in PLS
Check for sparse input in isomap and lle
Check for sparse data in MDS despite not being tested
Skip CCA in test_regressors_int
First effort in multitarget lassolars
ENH: move Gram precomputation outside of the loop
TEST: precomputed lasso and lars
Unnecessary copying
FIX: add test, fix memory initialization bug
ENH: multidimensional y in ElasticNet (WIP)
return_path option in lars_path
Add possibility to ignore the path in Lars objects
Fix doctests
Add __all__ for half of the scikit
Add __all__ for the second half of the scikit
Expose ENGLISH_STOP_WORDS
We already have the inverse at that step
Compute pseudoinverse using eigendecomposition
Vectorize singular value inversion
Cloned @jakevdp's pinvh tests
Use pinvh wherever it helps in the codebase.
First go at speeding up Euclidean distances
Make it less yellow
More reusable code, speed up symmetric case
Better cython style.
Add dense sparse support and precomputation
FIX: buggy case when X=dense, Y=sparse
Consistent argument naming and useful maintenance notes
FIX using out with sparse matrices
Relative imports, fix todense bug
safe_sparse_dot into preallocated output
Add test for dense_output, fix bug, cleaned up logic
Avoid reallocation in manifold.mds
add type prefix to blas funcs
DOC Clarify the docstrings
Added Cython-generated euclidean_fast.c
Separate dense_output and out parameters, document better
API change: mutually exclusive preallocation and precomputation
FIX: csr_matrix induced unwanted copying
Rename euclidean_fast to _euclidean_fast
Clean setup.py in metrics
ENH: improve test coverage
Add failing test and no-op flip
Sign flipping as suggested by @ogrisel, not in place
Make sign flip in place
Test more seeds for svd sign flipping
Add sign flip as flag in randomized_svd
Make sure svd_flip test actually tests something
Make randomized_svd flipped by default
svd_flip test fails on Travis. Change random seed, see if it helps
Cannot easily ensure non-uniqueness without the fix, just test uniqueness
TEST flipped svd remains correct
FIX: makes our libsvm port compile under MSVC
Merge branch 'master' of github.com:scikit-learn/scikit-learn
DOC: fix typo and formatting around MurmurHash3
DOC: Fixed wrong link and formatting in decomposition docs
DOC: fixed latex and formatting in SVM docs
DOC: more consistency in metrics docstrings
DOC: More consistency in metrics and clustering metrics docstrings
DOC: more consistency in docstrings for unsup clustering metrics & missing link
DOC: fixed missed details in metrics docstrings
DOC: addressed more inconsistencies in metrics docstrings
ENH: use lgamma function from John D. Cook
Merge branch 'master' into lgamma_port
FIX: variable naming inconsistency in NMF
DOC FIX: multi-target linear model attribute shapes
DOC spelling and clarification
Make callable svc test more robust for MacOSX.
Added RBM to whats_new.rst
DOC Added skeleton for RBM documentation
ENH Rename RestrictedBolzmannMachine to BernoulliRBM
FIX: make BernoulliRBM doctest pass
FIX: BernoulliRBM check random state in fit, not in init
FIX: validation in `BernoulliRBM.transform`
DOC: first attempt at RBM documentation
Link to RBM docs from the unsupervised toctree
FIX: uneven RBM image
DOC: PCD details and references
Fix typos in example
PEP8 and indentation
DOC add plot and example to docs
DOC rewrite BernoulliRBM example description
Set seed through params, not globally
FIX handling of random state, hide some of API
Pep8 example
Update example params by grid search, and docstring
One space after dot
DOCFIX neural networks module
DOCFIX spacing and clarification in RBM docstring
More stable implementation of logistic function and its derivative by @fabianp
Use gen_even_slices instead of homebaked code
ENH Add fast and stable logistic sigmoid to utils and RBM
ENH Support sparse input in RBMs
ENH Prevent memory copying in RBM's _fit
Do not touch uncopied memory
Nudge images using convolve, slower but more readable
Clarify narrative docs
Clarify and python3 RBM example
Periods and other docstring issues
Remove redundant test
Python3 support in RBM
TST RBM smoke-test verbosity
FIX missing class attribute in ICA. Common test was failing
FIX: fastica function dictionary default value
Deprecate FastICA.sources_
TEST remove deprecated stuff from fastica tests
Document the deprecation
FIX bug in test
Clean up and rename Hungarian algorithm
Clarify and clean up example
Remove print in Hungarian tests
Consistency for floats in consensus score
Add warning in private _Hungarian docstring just in case
ENH make spectral clustering test more stable to random seed
ENH add return_path in orthogonal matching pursuit
TEST for omp path feature
ENH OrthogonalMatchingPursuitCV estimator
FIX respect conventions in OMP init
FIX OrthogonalMatchingPursuit normalized twice
Use projected gradient solver in transform to support sparse matrices
Use same parameters when solving the transform
Use scipy.nnls.optimize for dense data
Add failing test for libsvm random state proba
FIX support random state in libsvm
DOC document changes in LIBSVM_CHANGES
DOC update docstrings to reflect libsvm random_state
Fix libsvm seed when predict_proba in tests and examples
Clarify and make libsvm random seed more consistent
Comment predict params in libsvm
DOC reference and rename cross decomposition module
FIX raise tolerance in svm predict_proba test
Make common PLS tests more stable
FIX for MSVC inline fmin, fmax and log2
FIX for MSVC inline fmax in dist_metrics
Add LibSVM random state to changelog
Wei Li (109):
FIX: this fixes issues #746 ProbabilisticPCA minor things
FIX: this further fixes issues #746 with API compatibility warning and integer division fix
ENH: using coo matrix construction to accelerate calculation of the contingency matrix
FIX: numerical issues in NMI
COSMIT pep8
ENH add refs to issue #884
FIX: ADD test cases for exact 0 case, and nmi equal to v_measure case
FIX: accelerate v_measure calculation based on mutual information
COSMIT add doc to clarify how nmi is normalized and pep8 fix
COSMIT pep8 fix for test_supervised
FIX: fixes error caused by break line
Using coo_matrix to accelerate confusion_matrix calculation
COSMIT
ENH add test for testing v_measure is a variant of nmi
COSMIT typos in doc strings
FIX let test use random_state(seed)
PEP8..
FIX typos and vague comments
DOC add comments for log(a) - log(b) precision
COSMIT fails to see the function name use mi rather than mutual information
FIX doctest to check up to 6 digits precision
FIX: eliminate \ for continuation from doctests
FIX issue #1239 when confusion matrix y_true/y_labels has unexpected labels
PEP8
ENH docstring misleading
ADD install guide for archlinux
ADD spectra_embedding for wrap function spectra_embeeding as an estimator from spectral clustering
ENH finish sketch for the estimator wrapper
ENH add warning for inverse transform
ADD test cases for spectra_embedding
ADD empty test scripts
COSMIT
FIX typos
FIX inconsistent typos
FIX nearest_neighbor graph build
ADD add test_examples for pipelined spectral clustering and callable affinity
FIX remote does not have test file wired...
MOV move spectra_embedding from decomposition to manifold
ENH docs partially updated, happy mooncake festival
ENH move spectral_embedding as standalone, fixes for tests
COSMIT
ADD add the laplacian eigenmap to examples
ADD test cases for two components, unknown eigenvectors, unknown affinity
COSMIT
ENH test-coverage
PEP8 test files
ADD spectra_embedding for wrap function spectra_embeeding as an estimator from spectral clustering
rebase: fixing conflict
ENH add warning for inverse transform
ADD test cases for spectra_embedding
ADD empty test scripts
COSMIT
FIX typos
FIX inconsistent typos
FIX nearest_neighbor graph build
ADD add test_examples for pipelined spectral clustering and callable affinity
FIX remote does not have test file wired...
rebase: fixing conflict
ENH docs partially updated, happy mooncake festival
ENH move spectral_embedding as standalone, fixes for tests
COSMIT
ADD add the laplacian eigenmap to examples
ADD test cases for two components, unknown eigenvectors, unknown affinity
COSMIT
ENH test-coverage
PEP8 test files
SYNC doc built error on one machine, sync with another
DOC docs for spectral embedding
DOC dox fix and misc post-rebase things
MRG merge with @Gael's PR 1221 and some name changes
FIX lobpcg, amg drops the constant eigen vectors by default
ADD check for symmetric and check for connectivity
ADD add test for check_connectivity
COSMIT
Change sparse graph to use cs_graph funcs. minor doc changes
Minor doc changes
FIX spectral embedding offers choice whether to drop the first eigenvector
COSMIT
RENAME parameter rename in examples
RENAME rename eigen_tol and eigen_solver, and warning about using old variable name eig_tol and mode
ADD add a test for discretize function
COSMIT and Typo
FIX backwards support
FIX doc fix and test fix
COSMIT
ADD added examples, and eliminate unnecessary imports
FIX nn-affinity does not support sparse input
COSMIT and minor fixes
DOC update whatsnew
FIX: amg requires sparse matrices input
missing _set_diag
fix spectral related testing errors
COSMIT and unused lines
FIX further improve the thresholds
FIX discretization test have shape problem, use coo_matrix instead of LabelBinarizer
Addressing @ogrisel's comments
FIX roc_curve failed when one class is available
COSMIT
DOC fix
TYPO fixes
DOC address @amueller's comment
FIX typo
Update whatsnew
FIX spectral_embedding test errors, ADD spectral embedding to sphere examples
MOD use safe_asarray instead of np.asarray
MISC update my mailmap
MOD address @mblondel's comments
MOD move generating matrix out of the loop
Merge pull request #1563 from kuantkid/sparse_knn_graph
X006 (2):
Dataset loader moved to datasets.base, but not being installed
Updates for DBSCAN clustering docs
Xinfan Meng (5):
fix a bug of affinity propagation, which is caused by incorrect index
BUG Disallow negative tf-idf weight
Fix a test case
Fix broken links
DOC Change URLs of NNDSVD papers to avoid paywall
Yann Malet (2):
Update the installation guide with Ubuntu related info
Fix a Broken link in the documentation
Yann N. Dauphin (25):
ENH added Restricted Boltzmann machines
30% speed-up thanks to in-place binomial
ENH 12% RBM speedup with ingenious ordering of operations
rename h_samples to h_samples_
added URI for RBM reference
improved docstring for transform
renamed _sigmoid to _logistic_sigmoid
use double backquotes around equations
logistic_sigmoid moved to function
transposed components_, no performance penalty
only compute pseudolikelihood if verbose=True
more accurate pseudo-likelihood
use iteration terminology instead of epochs in RBM
default n_components from 1024 to 256
clarify some method names (ex: mean_h -> mean_hiddens)
added epoch time
ENH RBM example
switched to digits
moved rbms to neural_networks module
add tests for rbm
trim whitespace
use train_test_split
neural_networks -> neural_network
ENH rename n_particles to batch_size in RBM
TST added more RBM tests
Yannick Schwartz (26):
added a StratifiedShuffleSplit in the cross validation schemes
added test for stratified shuffle split
updated stratified shuffle split test
fixed sss test
cleanup of arg check and doc update
put sss validation in external function
updated doc/whats_new.rst, doc/modules/classes.rst and doc/modules/cross_validation.rst for the sss
sss raises error if a class has only one sample, added associated test
pep8
changed train_fraction to train_size
Fixed random state, changed _validate_sss name, fixed _validate_stratified_shuffle_split bug
New stratified shuffle split version that only return indices arrays
stratified shuffle split can return masks
Fixed StratifiedShuffleSplit issue for unbalanced classes
Fixed n_test issue in StratifiedShuffleSplit
pep8 fix
Added new tests for StratifiedShuffleSplit
Fixed SSS test
Removed redefinition of variable i in SSS
Permute the train and test sets in SSS to avoid class-sorted folds
Added validation for some corner cases in SSS
Updated tests for SSS
Added tests for the StratifiedShuffleSplit to check the sizes of the training and testing sets, and that they don't overlap
Minor cleanup of StratifiedShuffleSplit
BUG: set random state in LogisticRegression
Update multiclass/multilabel documentation
Yaroslav Halchenko (23):
DOC: removing a stale request for subversion write permissions
Allow to build _libsvm.so against system-wide LIBSVM's svm.h
API to control LIBSVM verbosity without patching
recythoning _libsvm.pyx for previous commit
revert change to libsvm -- now verbosity is controlled via API
enable more doc testing for test-doc Makefile rule
adding acknowledgement to Dr.Haxby for my support ;-)
FIX: removed obsolete entries and added current ones for top-level __all__ + unittest
DOC: minor spellings and formatting (trailing spaces, consistent spacing etc)
RF: use joblib.logger submodule itself while accessing its function in grid_search
FIX: reflect SVC API change (eps -> tol) in doc/tutorial.rst
FIX: lars_path -- assure that at least some features get added if necessary
test case for previous commit
minor -- pass verbose into LARS in the test case
FIX: strings are not necessarily singletons + catch mistakes earlier
DOC: minor spellings fixes in pls.py
DOC: minor typo "precom[p]uted"
DOC: fix name for line_profiler_ext.py extension
DOC: enhancement for Debian installation + fixed various typos
DOC rudimentary docstring to deprecated.__init__ describing "extra"
ENH do not fail the test relying on numpy div 0 warnings if those are not spit out by numpy in general
ENH: sklearn.setup_module to preseed RNGs to reproduce failures
BF: explicitly mark train_test_split as not the one for nosetesting
Your Name (1):
[base.py] Do not break while trying to pprint not existent attribut
andy (8):
FIX manifold example - sorry, my bad.
COSMIT RST in manifold sphere example.
ENH fix random seed in manifold example
DOC added note in example that digits data is too small.
ENH Add "proximity" parameter to MDS.
FIX some typos, modify test.
FIX another typo, fix examples
ENH updated to more examples.
bob (1):
Couple of small changes from comments
buguen (1):
correcting typos in the doc
draix (1):
PY3: replaced izip
emanuele (1):
FIX: added logsumexp and nan_to_num to avoid underflows and NaNs
fcostin (9):
optimisations to Ridge Regression GCV
faster GCV for Ridge for n_samples > n_features
fixed tests to work with Ridge GCV
updated RidgeCV docstring and changelog
fixed bug with _values (thanks @mblondel)
fixed bug with > 1d y arrays
svd fails for sample_weights, use eig instead
coerce sparse matrices to dense before SVD
refactoring (thanks @GaelVaroquaux, @mblondel)
hrishikeshio (1):
DOC dev guide: deprecation
jamestwebber (2):
Update coordinate_descent.py
Fixed precompute issue (again) in ElasticNet and enet_path
jansoe (1):
fix error in unwhitened case
leonpalafox (1):
Change exception text when multiple input features have the same value from: "Multiple X are not allowed" to: "Multiple input features cannot have the same value"
mr.Shu (9):
moved class_prior in NB to __init__
added deprecation warning to fit function
fixed docstring tests
fixed typos
added warnings
updated based on comments
fixed local variables
renamed the new parameter to class_wieght
fixed docstring test
nzer0 (1):
Documentation ERROR: mixture.DPGMM.precs_
sergeyf (8):
Update qda.py
Update qda.py
Missed a space!
Updating to ensure pep8 compliance
reg_param is a float
Update qda.py
Update test_qda.py
Update qda.py
syhw (23):
travis config file
update travis config
put the requirements at the right place
added requirements to travis config file
Merge https://github.com/scikit-learn/scikit-learn
Travis CI cfg + status in README + sklearn requirements
with Ubuntu's scipy instead of pip's
with python-nose
removed requirements.txt from travis cfg
removed requirements.txt
changed the build image URL in README for after pull-merge
trying travis cfg with system-site-packages
Merge https://github.com/scikit-learn/scikit-learn into travis
nudging the digits dataset for BernouilliRBM example
TST added a 'fit [[0],[1]] + gibbs sample it' test for RBMs
replaced test_gibbs by a smoke test for NaNs
check for pseudo_likelihood clipping
COSMIT refactoring rbm
RBM example now verbose
squeezing logistic_sigmoid result only on 1D arrays
adding a test for sparse matrices in RBM
changing free_energy to private in RBM
added neural_network to setup
uber (1):
example yahoo stock issue fix
unknown (4):
Added documentation for the Naive Bayes classifiers.
Added sparse MNNB and modified the textual examples to benchmark it.
Modified the Naive Bayes nose tests to the new location of the module and added sparse test.
changed wording in linear model docs about Normalized. It was frustrating me haha
-----------------------------------------------------------------------
No new revisions were added by this update.
--
Alioth's /usr/local/bin/git-commit-notice on /srv/git.debian.org/git/debian-science/packages/scikit-learn.git