[scikit-learn] annotated tag 0.17 created (now 26241eb)
Andreas Tille
tille at debian.org
Wed Dec 28 13:10:51 UTC 2016
This is an automated email from the git hooks/post-receive script.
tille pushed a change to annotated tag 0.17
in repository scikit-learn.
at 26241eb (tag)
tagging da4f480a6adf5fed30a42500fe0e5a21c404ac2a (commit)
replaces 0.4
tagged by Andreas Mueller
on Thu Nov 5 14:58:00 2015 -0500
- Log -----------------------------------------------------------------
scikit-learn 0.17 release
= (3):
add polynomial features to user guide
add connection to kernel method
fix SVM can be tricked into running proba() #4791
A. Flaxman (4):
DOC: add random_state parameter to StratifiedShuffleSplit doc string
DOC: latex beautification
DOC: latex beautification
DOC: copy-edits and active voice
Aaron Schumacher (8):
target is response, not explanatory
typo: "fot" -> "for"
typo: "requiered" -> "required"
DOC: typo: 1e-2 is 0.01 not 100
DOC remove import scikits.learn from tutorial
typos: remove extra s's
tweak to more natural version
typo: 'precomputed' should be in quotes
Aaron Staple (1):
Add quantile strategy to DummyRegressor. Fixes #3421.
Abhijeet Kolhe (1):
Fix setup.py to resolve numpy requirement
Adithya Ganesh (5):
Added check to ensure that NumPy and SciPy meet minimum version requirements (NumPy >= 1.6.1, SciPy >= 0.9)
Amended error message to remain valid when NumPy/SciPy are installed but out of date
Added reference to SciPy/NumPy min_version vars in ImportError
Made setup.py pep8 compliant
Added output of out-of-date SciPy/NumPy versions to setup.py, if detected
Adrien Gaidon (5):
FIX: typo for default init_size in MiniBatchKMeans
Added tests to check for the correct value of init_size
FIX: make GridSearchCV work with precomputed kernels
raise ValueError when given a kernel_function or a non-square kernel matrix + some tests
Fixed a small typo
Akshay (1):
Removed n_jobs parameter from the fit method and added it to the constructor
Aldrian Obaja (1):
FIX bug in fit_params assumed to be an array
Alejandro Weinstein (1):
Fix link to plot_lda_qda example.
Alex Companioni (1):
Issue #339: minimizing number of calls in tests.test_hmm.
Alexander Fabisch (88):
DOC update example path
FIX Do one run with MiniBatchKMeans and explicit centers
Add learning curve
Refactor cv code
Clean up
Refactor RFE and add _check_scorable
FIX typo in docstring
Merge `fit_grid_point` into `_cross_val_score`
Return time
Move set_params back to fit_grid_point
Log score and time in 'cross_val_score'
check_scorable returns scorer
Clean up
Replace '_fit_estimator' by '_cross_val_score'
Fix PEP8, style and documentation
Remove wrong variable names
Remove helper function '_fit'
Remove 'fit_grid_point' from 'BaseSearchCV'
Check substrings of error messages
Rename '_split' to '_split_with_kernel'
_passthrough_scorer is a function
Remove '_deprecate_loss_and_score_funcs'
Check error message
Use assert_raises_regexp to check error messages
Add prototype for validation curve
Add documentation and tests
Improve test coverage
Improve interface and documentation
Mocks inherit from BaseEstimator
Correct docstring
Add narrative documentation
Improve documentation
Fix link to validation curve
Add example with polynomial regression
Correct and improve documentation
Improve phrasing
Rephrase sentence
Simplify first part of the documentation
Fix typo
Return scores of all folds
Fix documentation
Improve test coverage
Matching colors
MAINT Rename over/underfitting example
Add t-SNE
Generalize with average of nearest neighbors
Shorten documentation of perplexity
Compare t-SNE with other manifold learners
Simplify optimization
Refactoring
Add first test
Test gradient descent
Test binary search
Add t-SNE to other examples
Add more tests
Use regular expressions
100% test coverage
Remove example with digits dataset
Rename dist to affinities
Modify learning schedule
Fix tests
Explain example
Rename attributes
Document attributes
PEP8
Update documentation
Do not stop too early
Describe how to set the learning rate
Remove generalization
Add section in narrative documentation
Adress Gael's comments
Do not use global random number generator
Replace squareform(pdist(*, "sqeuclidean"))
Adress Olivier's comments
Mention Barnes-Hut-SNE and fast optimization
Integrate PCA initialization
Use PCA initialization in examples
Fix docstring
Use euclidean_distances in original space
Mention TruncatedSVD and clean up (PEP8, Pyflakes)
Affinity must be 'precomputed' or 'euclidean'
Rename arguments
Allow sparse data
Correct examples
Remove trailing '\' and add test
Describe default metric
Add t-SNE to whatsnew
Fix issue #4154
Alexander Measure (1):
Update naive_bayes.rst
Alexandre Abraham (7):
Fix a bug in the ward clustering.
Add a non-regression test for the bug of connectivity fixing.
Put conversion after component computation
Fix test function name.
Fix typos
BUG: Fix path in doc cleaning
Merge branch 'master' of https://github.com/jaquesgrobler/scikit-learn into fix_doc_clean
Alexandre Gramfort (860):
Merge branch 'master' of /Volumes/DAVID/scikit-learn
Merge branch 'master' of ssh://scikit-learn.git.sourceforge.net/gitroot/scikit-learn/scikit-learn
API: changing the way the parameters of Lasso+E-Net are optimized
ENH : imroving documentation of lasso + enet paths function
ENH : add LeavePLabelOut cross-validation generator
ENH : adding support for mean-shift clustering with a flat kernel
ENH: making data contiguous in memory in coordinate descent
ENH: adding affinity propagation algorithm
removing pl.show()
ENH: adding exception raising
Merge branch 'master' of github.com:agramfort/scikit-learn
using staticmethod rather than property
cosmit
setting array as fortran in lasso + enet coordinate descent
Merge branch 'master' of ssh://scikit-learn.git.sourceforge.net/gitroot/scikit-learn/scikit-learn
MISC : renaming affinity propagation example
broke glm to improve model selection
ongoing work on glm with crossval
Merge branch 'master' of http://github.com/GaelVaroquaux/scikit-learn
Merge branch 'master' of http://github.com/GaelVaroquaux/scikit-learn
Merge branch 'master' of http://github.com/GaelVaroquaux/scikit-learn
continue improve glm cv
Merge branch 'master' of http://github.com/GaelVaroquaux/scikit-learn
fix glm cv
Merge branch 'master' of http://github.com/GaelVaroquaux/scikit-learn
fix frozenset
BUG : fix in affinity propagation
BUG : fix in stock market example
BUG : fix with blas on mac os x
ENH : moving bench_glm.py to benchmarks folder
ENH : glm coordinate descent with BLAS
BUG : fix blas support in setup.py with coordinate descent
ENH : adding stratified cross-validation object
ENH : fix doctests in glm, svm and lda
adding grid search code
BUG : fix doctests in neighbors
BUG : fix doctest in datasets/base.py
ENH : using digits in grid search example
Merge branch 'master' of github.com:GaelVaroquaux/scikit-learn
API : renaming GridSearch to GridSearchCV
API : cross val generator in now given in fit in grid search object
Merge branch 'master' of github.com:GaelVaroquaux/scikit-learn
ENH : update grid search example
ENH : first draft of RFE
ENH : fix RFE + example
ENH : improve RFE
ENH : adding loss functions in metrics.py
Merge branch 'master' of github.com:agramfort/scikit-learn
Merge branch 'master' of github.com:agramfort/scikit-learn
ENH : fix RFE and RFECV
Merge branch 'master' of ssh://scikit-learn.git.sourceforge.net/gitroot/scikit-learn/scikit-learn
ENH : allow grid search to work with lists of grids
ENH : using BaseEstimator with GNB
cosmit'
ENH : adding BaseClassifier and BaseRegressor base classes
ENH : using mixin rather than base class to bring score methods to estimators
ENH : fix in svc.coef_ + cosmit
ENH : fix in svc.coef_ + cosmit
ENH : using np.logspace instead of np.linspace in paths
ENH : using np.logspace instead of np.linspace in paths (after merge)
API : making Y optional in fit for OneClassSVM
FIX : removing duplicated example
Merge branch 'master' of ssh://scikit-learn.git.sourceforge.net/gitroot/scikit-learn/scikit-learn
ENH : new SVR example
ENH : new SVR example
Merge branch 'temp'
ENH : improve QDA (taken from Matt Perrot)
ENH : improve LDA (taken from Matt Perrot)
ENH : improve LDA QDA example (taken from Matt Perrot)
MISC: cosmit in LDA, QDA
ENH : new example for LDA vs QDA
ENH : removing old example for LDA vs QDA
ENH : attempt to have a default parameter for bandwidth in MeanShift algorithm
ENH : adding doc to clustering module API : adding trailing underscores to estimates in clustering classes
ENH: adding test for RFE and reaching 100% coverage
ENH : adding doc for grid_search module
MISC : cosmit nfeatures -> n_features, nsamples -> n_samples, nclasses -> n_classes
FIX : adding missing doc file
FIX : fix in subplot index in plot_iris.py
ENH : removing unused preprocessing routines
FIX : in Makefile that calls now nosetests directly
FIX: removing useless imports
ENH : more work on LARS (doc + examples)
Merge branch 'master' of github.com:scikit-learn/scikit-learn
ENH : continue refactoring of GLM module (doc, moving files, config etc.)
ENH : more refactoring of GLM module
FIX : fixing __init__ files for examples
ENH : cosmit + fix examples for doc generation
cosmit in examples
Merge branch 'master' of github.com:scikit-learn/scikit-learn
BUG : fix in Lars at the end of path + more tests (not working yet)
ENH : using explained variance as score for regression problems
ENH: on the use of explained_variance in mixin regressor class
FIX : fix in handling of intercept in glm base and ridge
TEST : adding test to ridge with no intercept
ENH : draft of what could be a preprocessing routine (done by hand for now)
FIX : prevent pipeline.score to do a fit which was wrong
FIX : it may happen that pipeline.estimator do not implement predict
moving ridge out of bayes.py
ENH : adding PCA filter
ENH : adding computation of percentage of variance explained by each component
FIX : for doctest in PCA
ENH : adding ledoit-wolf for robust covariance estimation
ENH : adding FastICA class + example
Merge branch 'add_ica' of http://github.com/bthirion/scikit-learn into ica
ENH : more on ICA (examples + doc)
Merge branch 'ica'
FIX : fix ica vs pca example
ENH : adding example + refactor in covariance module
splitting ledoit_wolf.py in two files
oups missing example file
FIX: missing covariance.py
cleaning the handling of the intercept in GLM linear models
ENH : avoiding computing a pinv at each iteration in BayesianRidge
Merge branch 'master' of github.com:scikit-learn/scikit-learn
TEST : bayes
FIX : imports in __init__ of glm
cosmit in ARDRegression
TEST : removing lda.py
COSMIT : PEP 8 in PCA
Merge branch 'master' of github.com:scikit-learn/scikit-learn
FIX : RegressinMix score had flipped y_true and y_pred
EXAMPLE : adding model selection example with train/test error graphical illustration
EXAMPLE : making only one figure in model selection example
Merge branch 'master' of github.com:scikit-learn/scikit-learn
FIX __init__.py of glm.sparse
EXAMPLE: add example of dense vs sparse Lasso on dense and sparse data
FIX : example of dense vs sparse Lasso on dense and sparse data
passing Gram in LARS and LassoLARS
ENH : more doc in lars.py, handling of intercept
Merge branch 'fabian/python_lars_fast_2'
fix doctests in lars.py
more on lasso benchmark
Merge branch 'master' of github.com:scikit-learn/scikit-learn
FIX : preprocessing : scaler should not be allowed with axis=1 (opt removed)
skipping BayesianRidge failing test
using diabetes in lasso/lars examples
removing assert for debug
ENH : speeding up the LARS
BUG: bug fix in LARS Lasso mode + speed improvement (we can still do better)
adding LARS with Gram to benchmark
ENH : speed in LARS by forcing X to be fortran ordered + cosmit (unactive -> inactive)
pretifying the LAR / LARS examples to match with results on wikipedia page
cosmit in docs of glm module
Merge branch 'master' of github.com:scikit-learn/scikit-learn
ENH : improvements in bayes
Merge branch 'master' of github.com:scikit-learn/scikit-learn
fix doc generation on plot_lasso_coordinate_descent_path.py example (pb on my box)
DOC: updating doc on Univariate feature selection
FIX: ticket 147 on pb with 2d y in f_regression
FIX: ticket 147 on pb with 2d y in f_classif
adding 'iid' option in cross_val_score
Merge branch 'master' of git@github.com:scikit-learn/scikit-learn
pretifying plot_weighted_classes.py
FIX: quick fix in predict_proba in LogisticRegression
removing debug compile flags
sgd module code review
sgd module code review
adding path example on logistic on IRIS dataset
Merge branch 'sgd'
increasing precision in plot_logistic_path.py to get nicer path
DOC: spelling
ENH : pyflakes on examples to avoid useless imports + addint print __doc__
ENH : more love in examples (adding print __doc__ + some brief descriptions in headers + fixing Anova SVC Pipeline example)
FIX: fix docstrings in LARS (issue 8 on github)
cosmit + typos in doc
Merge branch 'master' of github.com:scikit-learn/scikit-learn
ENH: adding partial support for predict_log_proba in Pipeline and log reg
rewriting f_oneway in the scikit to avoid useless recomputations
Merge branch 'master' into log_proba
adding comment to explain the reimplementation of f_oneway
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Merge branch 'master' into log_proba
typo
removing use_svd option in LDA. Only scipy SVD is supported.
Merge branch 'master' into log_proba
ENH: adding predict_log_proba to LDA and QDA + tests to reach 100% coverage
ENH : adding support for predict_log_proba in Naive Bayes
ENH: adding support for predict_log_proba in SVC and sparse.SVC
ENH : adding predict_log_proba in sparse logistic regression
FIX: make sure class_weight='auto' do not change the result for balanced problems
Merge branch 'master' into log_proba
API : implement coef_init as fit parameter in glm.coordinate_descent module.
API: exposing fit_intercept params in LassoCV and ElasticNetCV
ENH : adding test in pipeline + increase coverage
fix doc generation pb introduced by previous commit
FIX: fix class weight auto
pep8 in plot_weighted_samples.py
ENH : adding kneighbors_graph to build the graph of neighbors as a sparse matrix
FIX fragile doctest
ENH : adding NeighborsBarycenter for regression pbs using k-Nearest Neighbors
DOC: adding NeighborsBarycenter to doc
DOC: better docstring for barycenter_weights function
DOC: even better docstrings in neighbors
MISC: reindenting BallTree C++ code (no tabs + 4 spaces)
DOC : more on docstrings in neighbors.py
review of gaussian process module
API renmae k->n_neighbors
Merge branch 'log_proba'
ENH : improving the speed of ridge with inplace computation + symmetric pos def constraint
Merge branch 'neighbor_barycenter'
ENH : coordinate descent speed up when n_samples > n_features in cd_fast.pyx
ENH : allowing Gram matrix precomputing in Lasso / ElasticNet to
Merge branch 'master' of github.com:scikit-learn/scikit-learn
ENH : speed improvement in lasso_path with precomputed gram matrix
Merge branch 'master' of github.com:scikit-learn/scikit-learn
pep8 in coordinate_descent.py
pep8 + N->n_samples and D->n_features
Merge branch 'master' of github.com:scikit-learn/scikit-learn
giving more love to benchmarks (pep8, pyflakes, var names, etc...)
Merge branch 'master' of github.com:scikit-learn/scikit-learn
revert previous commit regarding mpl_toolkits.mplot3d in bench
API : maxit replaced by max_iter everywhere
ENH : new scikits.learn.metrics.pairwise module
Merge branch 'master' of https://github.com/dubourg/scikit-learn into dubourg-master
pyflakes in plot_gp_diabetes_dataset.py
renaming plot_gp_diabetes_dataset.py as nothing is plotted
FIX : fix extra parenthesis in mixture ...
reviewing hierarchical clustering code
adding missing setup.py in cluster
ENH : nicer implementation of StratifiedKFold now usable with regression
DOC: updating doc for StratifiedKFold + ellipsis in svm support
ENH : adding function to test the significance of a cross val score with permutations in supervised problems
ENH : add possibility to pass RandomState
s/permutation_score/permutation_test_score
fix pb with nose and permutation_test_score function
Merge branch 'permutations'
FIX : really accurate pvalue in cross-val permutation test
FIX : even more accurate pvalue in cross-val permutation test
s/euclidian_distances/euclidean_distances
typo
ENH : cross-val generator can now return integer indices
DOC: better docstring in cross val with indices
DOC: update RST doc for crossval with indices
removing print used for debug
ENH : speeding up kneighbors_graph function avoiding the use of a LIL matrix
FIX : in hierarchial cluster + Mixin fix + tests + coverage + PEP8
FIX : fix pb in affinity propagation when S dtype is not float
ENH : adding inverse_transform to pipeline + better handling of coef_
ENH : adding coef_ attribute in GridSearchCV
ENH : adding inverse_transform to univariate selectors + pep8
removing old svn id tag
ENH : refactoring Ward feature agglomeration to make it work with Pipeline
first attempt to use caching in gridsearch with hierarchical clustering... WIP
ENH : improving ward for better joblib caching
removing plot_dendogram function
TEST : fix ward clustering tests
in hierarchical : s/adjacency_matrix/connectivity, s/k/n_clusters
remaining s/k/n_clusters
ENH (ward): return children as numpy array (better for joblib)
Merge branch 'master' into asaf
ENH: avoid storing parent and weights in Ward (better joblib)
DOC : better docstring in hierarchical clustering
adding example to rst doc
better ward rst doc examples
moving swiss_roll generator in samples generator
removing Return from class docstring
s/cord_/coord_
in setup.py s/ward/cluster
Merge branch 'hcluster' into hcluster2 that matches master
fix remaining n_comp
FIX : fixing Lars lasso with early stopping using alph_min + adding test for it
Merge branch 'master' of github.com:scikit-learn/scikit-learn
fix LassoLARS docstring
Merge branch 'hcluster2' of http://github.com/bthirion/scikit-learn into hcluster2
adding test scikit vs scipy.
FIX: ugly bug in connectivity on grids and images
ENH : factorizing img_to_graph and grid_to_graph
ENH : ones on diag in grid_to_graph + fix dtype
cosmit
Merge branch 'hcluster2'
cosmits with trailing spaces
Merge branch 'master' of github.com:scikit-learn/scikit-learn
pretifying nmf plot
Merged pull request #5 from larsmans/master.
pep8
ENH : using make_blobs in plot_affinity_propagation
ENH : using make_blobs in plot_mean_shift
ENH : using make_blobs in plot_mini_batch_kmeans
FIX : removing useless seed fix in plot_mean_shift
Merge pull request #178 from kwgoodman/master
Merge pull request #181 from lucaswiman/master
prettify plot_sparse_pca.py
adding authors in sparse pca
ENH : prettify dict learn example on image patches
pep8
prettify plot_sparse_pca.py
adding authors in sparse pca
FIX : using product form utils.fixes for python 2.5
pep8
MISC : fix docstring, cosmit in image.py
FIX; missing import in dict_learning.py (OMP in transform in not tested
ENH : new radius_neighbors_graph to build graph of nearest neighbor from radius
DOC: adding radius_neighbors_graph to doc
pep8
Merge pull request #230 from agramfort/radius_neighbors_graph
pep8
FIX : fix failing test in comparison between lassoCD and lars
pyflakes warnings
pep8
DOC: adding note on glmnet parameter correspondance in ElasticNet
ENH : adding LASSO model selection example based on BIC and AIC
BUG: s/empty/zeros in plot_lasso_bic_aic.py
pep8
Merge pull request #265 from JeanKossaifi/master
API : renaming LARS to Lars
MISC: s/larslasso_results/lars_lasso_results
pep8
Merge branch 'master' into rename_lars
Merge branch 'master' of github.com:scikit-learn/scikit-learn into rename_lars
Merge branch 'master' of github.com:scikit-learn/scikit-learn into rename_lars
ENH: adding LARS and LassoLARS deprecated classes
Merge pull request #278 from agramfort/rename_lars
Merge pull request #281 from glouppe/master
pep8
ENH : prettify OMP/LARS benchmark
Merge pull request #277 from vene/omp
ENH: speed up estimate_bandwidth with BallTree + use make_blobs in test_mean_shift.py
ENH : using make_blobs in cluster examples
pep8
FIX : using product form utils.fixes for python 2.5
pep8
MISC : fix docstring, cosmit in image.py
Merge pull request #295 from bdholt1/boston
DOC : fix doc building
ENH : new LassoLarsIC estimator
MISC : adding GaelVaroquaux to the authors of least_angle.py
ENH: addressing @ogrisel's comments on PR 298
ENH + DOC: addressing @GaelVaroquaux's comments
DOC: clarify doc on BIC/AIC
Merge branch 'master' of github.com:scikit-learn/scikit-learn into normalize_data
Style + typos
API : adding proper normalize options in Lasso and ElasticNet with clean up
ENH : more standard import of scipy.sparse
FIX : fix rounding error in test + pep8
FIX : putting back common.py
FIX : in meanshift typos, style, example
Merge pull request #346 from npinto/patch-1
DOC : fix sgd docstring
ENH : better plot_img_denoising
Merge pull request #350 from tinyclues/master
STY : pep8
STY: mostly style + avoid a zip in favor of an np.argsort
STY : in label_propagation.py
ENH : using numpy broadcasting instead of dot_out
Merge pull request #376 from fabianp/fast_tests
STY: imports in covariance + pep8
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Merge pull request #404 from amueller/grid_search_doc
STY: pep8 + naming
DOC: prettify plot_permutation_test_for_classification.py
DOC : adding permutation_test_score to changelog
ENH : adding support for scaling by n_samples of the C parameter in libsvm and liblinear models
FIX : removing param nu from sparse.SVR, C from NuSVR + pep8
$Merge branch 'master' into n_samples_scaling
typo
s/C_scale_n_samples/scale_C
STY: pep8 + pyflakes
Merge pull request #464 from NelleV/FIX_bibtex
Merge branch 'master' into n_samples_scaling
STY: prettify doctest
ENH : adding scale_C in NuSVR
ENH : more contrasted colormap
MISC: typos + subplot adjust
ENH : C scaling of sparse models
Merge remote-tracking branch 'origin/master' into n_samples_scaling
ENH : adding missing scale_C in docstring
Merge pull request #465 from amueller/fastica_wowhiten
STY: PEP 257 in ridge.py
Merge pull request #473 from amueller/dataset_whitespace
Merge pull request #477 from jakevdp/gmm-fix
ENH : avoid global seeding in plot_polynomial_interpolation.py
ENH : clean up plot_feature_selection.py
Merge pull request #482 from DraXus/master
STY : pep8 and add print __doc__ in plot_sparse_coding.py
Merge branch 'master' of https://github.com/scikit-learn/scikit-learn
STY : pep8
Merge branch 'master' of https://github.com/scikit-learn/scikit-learn
misc
STY: s/grid_points/cv_scores
Merge pull request #495 from vene/sc-mixin
Merge pull request #507 from jakevdp/neighbors-check
Merge pull request #532 from amueller/grid_search_attributes
ENH : reformatting hmm_stock_analysis.py examples
MISC : typos in hmm_stock_analysis.py
ENH : rename hmm_stock_analysis.py so it appears as a figure in the doc
ENH : make metrics.auc work with 2 samples + add test
Merge pull request #591 from jaquesgrobler/doc_update
fix with new as_float_array
STY: pep8
mv randomized_lasso.py randomized_l1.py
ENH : some doc + renaming in RandomizedLasso
ENH : better plot_randomized_lasso.py with score path
ENH : prettify plot_randomized_lasso.py
ENH : creating lasso_stability_path func + adding tests on randomized_l1
ENH : add docstring to RandomizedLogistic
FIX: fix test_randomized_logistic
STY: s/a/scaling + adding docstring
DOC : adding doc for Randomized sparse linear models + fix test
ENH : adding sample_fraction to lasso_stability_path + add to doc
typos
cosmit in doc + pep8
cosmit in doc
ENH : addressing @ogrisel comments (PEP257, naming, see also)
DOC: rephrase rand linear model doc
ENH : fix docstrings + add func missing reference
ENH : center y too in _randomized_lasso
ENH : adding support for multiple regularization parameters in RandomizedLinearModel
MISC: removing one XXX
ENH : early stopping in lasso_stability_path (faster)
ENH : fix legeng of plot_randomized_lasso.py
pep8
API: set scale_C to True by default in libsvm/liblinear models
update what's new
DOC : add warning in docstrings for scale_C gone in 0.12
DOC: indent pb
DOC: update scale_C docstrings + add notes to svm.rst
ENH : use not(scale_C)
remaining docstring to be updated
update docstring with WARNING
TST: use assert_true instead of assert + remove some relative imports
FIX : fix SVM examples with new scale_C=True
FIX : fix ward benchmark
Merge pull request #654 from GaelVaroquaux/enet_cv
Merge pull request #679 from amueller/logistic_l1_l2_sample
API: use C=None by default in libsvm/liblinear bindings so (C=1, scale_C=False) which is libsvm default == (C=None, scale_C=True) which is the scikit default
FIX : remove useless C definition in non-fit methods
ENH : adding scaled_C_ attribute
Merge pull request #699 from njwilson/issue-250
TST : add test on ridge shapes for different y shapes
TST : add test failing test to reproduce #708
FIX : fix test for #708
FIX : fix test failing with OMP
ENH: y_mean with consistent shape in _center_data
FIX : prevent ICA with defined n_camponents and whiten=False (fix for #697)
TST: capture warning in test
FIX : use joblib from externals
Merge pull request #728 from satra/fix/f_regression
ENH : speed up f_regression
FIX : array copy for compat pb
FIX : missing self.copy = copy in PLS GH Issue #758
cosmit : consistent linestyle in plot_lasso_coordinate_descent_path
ENH : add duality gap check with Lasso(positive=True)
Merge pull request #747 from ibayer/posCoeff
Merge pull request #773 from amueller/forest_pre_dispatch
Merge pull request #782 from jaquesgrobler/Update_Changelog
Merge pull request #783 from dwf/svm_docs_minor
change web site for agramfort
FIX : fix SVC pickle with callable kernel
cosmit
FIX : callable kernel for prediction
FIX : sparse SVC clone with callable kernel
Merge pull request #796 from amueller/kmeans_dtype
Merge pull request #814 from invisibleroads/master
Merge pull request #813 from invisibleroads/patch-1
FIX : make plot_ica_vs_pca.py deterministic (fix for #815)
Merge pull request #802 from amueller/arpack_backports
typo
fix for #824
DOC : update SVM examples with scale_C
API : change back default C to 1. explicitely and epsilon 0.1
FIX : svm decision function test
Merge pull request #851 from duckworthd/master
TST : tesitng intercept_ between dense and sparse
adding alexis to authors
typo
update tip on svm C param
Merge pull request #872 from jaquesgrobler/master
FIX : use RandomState rather than global seed
Merge pull request #881 from amueller/fix_ica_components_rename
FIX: fix buildbot ICA pb
Merge pull request #876 from alexis-mignon/master
FIX : fix a division by zero in LARS #63
Merge pull request #892 from ibayer/fix_mldata_docstring
FIX: C range in plot_cv_digits
Merge pull request #891 from ibayer/merge_cd
FIX : cleanup classes.rst + pep8 after merge of coordinate descent
Merge pull request #900 from kernc/neighbors_predict_proba
FIX : fix predict_proba in KNeighborsClassifier for old numpy
FIX: fix grid search when X is list #925
Merge pull request #932 from jaquesgrobler/master
Merge pull request #938 from ogrisel/svmlight-double-precision
Merge pull request #969 from jaquesgrobler/master
missing pl.show() in plot_digits_agglomeration.py
Merge pull request #983 from GaelVaroquaux/faster_ward
MISC : update my web site URL in what's new
ENH : MultiTaskLasso works (still draft)
FIX : fix docstring in MultiTaskLasso
ENH : add multi task lasso example
ENH + DOC : add MultiTaskElasticNet + doc + 1 example
update what's new
FIX : support 1d y in MultiTaskFoobar
rename ylabel in MultiTaskLasso example
moving MultiTaskLasso doc after E-net
FIX : remove unnecessary dgemm in cd_fast.pyx
FIX : catching pb with sparse input in MultiTaskElasticNet
FIX : make as_float_array keep fortran order on dense array when copy
ENH: simplify dict learning with gram and reg_param handling
ENH : add copy arg to array2d and new atleast2d_or_csr usual for sparse coordinate descent
ENH : add copy param to array2d_or_csx
ENH : add support for multitarget in sparse enet + simplify input checking
ENH : use multitarget in dict learning
FIX : fix tests
DOC : getting over docstrings
ENH : avoid a copy in MultiTaskElasticNet
add note on what's new
ENH : add support for sparse data in ElasticNetCV/LassoCV (not optimal)
ENH : use multitarget Lars and LassoLars in dict_learning
ENH : simplify handle of copy of Gram and X with array2d in OMP
style + typo
DOC : better reg_param docstring in dict learning
ENH : use build_dataset in multi target test
ENH : update warn for multitarget
update coef_path_ docstrings
use assert_true
API : consistent use alpha_/alphas_ for alpha/alphas estimated by CV in linear models (issue #1041)
DOC : add useful comment in code
addressing for round of reviews
DOC : better docstring for fit_path
DOC : fix rho=1 is L1 penalty #1139
fix failing test
TST : use nose assert_true and not python assert
ENH : proper IsotonicRegression model + example + test
remove support for extrapolation
FIX : for test_common sparse support
pep8
adding my name in IR example
ENH : finish addressing @GaelVaroquaux comments + improve coverage + add linear regression to example
typo
FIX : fix LLE test (don't ask me why...)
misc
DOC : avoid mentioning ElasticNet in Lasso.fit docstring
Merge pull request #1223 from ibayer/master
ENH : cleanup FactorAnalysis object
API : rename psi to noise_variance + some cleanup in FA
TST : add test that FA log like increases over iterations
add Bishop's book to refs in FA
update what's new with FactorAnalysis
DOC : adding FactorAnalysis to classes.rst
FIX : fix application example due to API change
FIX : missing import warnings
typo
typos
DOC: typos in ensemble.rst
DOC: typos in ensemble.rst
FIX : clean test + pep8 + reply fix to the code
API : move isotonic regression out of linear_model
DOC : fix move of isotonic in doc + examples
TST : use assert_true and not assert in test
Merge pull request #1483 from aweinstein/fix_doc_example
Merge pull request #1504 from NelleV/isotonic
Merge pull request #1505 from NelleV/mds
DOC : add doctring in plot_lasso_and_elasticnet.py
DOC: adding Bishop as ref for ARD
Merge pull request #1577 from ApproximateIdentity/n_jobs-documentation
Merge pull request #1578 from zaxtax/elastic_documentation
DOC : missing alpha doc in LassoLars
ENH : add reconstruction_err_ for NMF with sparse input
use scipy.linalg in test_nmf.py
adding comment on why sparse frobenius is ok as done
Merge pull request #1607 from agramfort/reconstruction_err_nmf_sparse
FIX : fix kfold balance due to int rounding
FIX : test due to KFold change
FIX : better fix of KFold balance
fix doctest
TST : improve test_kfold_balance test
update what's new
TST : improve again test_kfold_balance test
Merge pull request #1772 from jnothman/comment_exhaustive_search
typo
pep8
Merge pull request #1907 from aflaxman/stratified_shuffle_split_rand_state_doc_str
Merge pull request #2071 from djv/patch-1
Merge pull request #2075 from jnothman/agglomeration_simplify
FIX : use unique from fixes
Merge pull request #2074 from jnothman/ward_docstring
Merge pull request #2080 from ahojnnes/dist-todo
FIX : missing y=None in FactorAnalysis
Merge pull request #2087 from ahojnnes/examples-print-doc
Merge pull request #2118 from NelleV/DOC_fix
Merge pull request #2135 from fhs/meanshift-doc
Merge pull request #2138 from NelleV/kCCA
Merge pull request #2142 from sergeyf/master
Merge pull request #2145 from NelleV/kCCA
FIX : finish get rid of fit_... param
ENH : avoid one copy in FastICA code
misc
update ICA examples
adding comment
Merge pull request #2196 from erg/labelencoder-docs-fix
ENH : massive refactoring of CV models in coordinate descent. Now the algo core is in path functions
update what's new
DOC : more fixes in covariance module
Merge pull request #2202 from NelleV/isotonic_reverse
Merge pull request #6 from jaquesgrobler/cov_doc_fix
Merge pull request #2203 from agramfort/cov_doc_fix
cosmit : protect attributes in RBM for sphinx
pep8
better coverage
fix doctest
ENH : use warning instead of print
update what's new
Merge pull request #2212 from dengemann/ica_memory
Merge pull request #2213 from cmd-ntrf/master
Merge pull request #2217 from vene/ica_fit_transform
Merge pull request #2182 from NelleV/pls_refactor_2
DOC+ENH: fixes in least_angle + one vectorization
DOC : better doc of array shapes in fastica
MISC : use linalg from scipy
ENH : removing warnings from tests in cd linear models
Merge pull request #2194 from NicolasTr/as_float_array_copy
Merge pull request #2223 from arjoly/doc-datasets
DOC : docstring fixes
DOC : more docstring fixes
use pre_fit in OMP
API : deprecate a lot of extra parameters in OMP object
API : deprecations in orthogonal_mp
ENH : update example of OMP
update what's new + classes.rst
Merge pull request #2247 from pgervais/docfixes
Merge pull request #2258 from NicolasTr/ignore_pycharm_files
Merge pull request #2290 from dengemann/more_ica_improvements
FIX : backport tanh out param in old numpy
revert the tanh fix
ENH : simplify ProbabilisticPCA covariance_ computation + misc in pca.py
ENH : avoid extra alloc in _infer_dimension_ in pca.py
FIX : self.component in PCA was changed by fit when passed mle or explained variance
ENH : avoid extra allocation in FactorAnalysis
ENH : use math.log and not np.log for scalars
doc fix
misc
FIX : broken fit_transform
MISC : pep8 and pyflakes on test_isotonic.py
Merge pull request #2412 from rolisz/patch-2
FIX : fix randomized SVD in FA
Merge pull request #2406 from dengemann/tinker_fa
ENH : add score to PCA from ProbabilisticPCA
API : deprecating ProbabilisticPCA
misc
typo
ENH : make PCA scoring work with n_features is big. Avoid covariance computation in PCA.fit
DOC : remove reference to ProbabilisticPCA in doc
update what's new
FIX : do not store precision_ attribute in score method
ENH : add get_precision method with matrix inversion lemma to PCA for faster scoring
ENH : add tests for get_precision in PCA + some fixes for corner cases
ENH : add get_precision method with matrix inverse lemma to FactorAnalysis + use precision in score
pep8
ENH : add PCA+FA model selection example
pep8 + misc
ENH : add score samples to PCA
API : add score_samples in FactorAnalysis + update example
better wording
add ref to online pdf
FIX : fix ProbabilisticPCA.score with homoscedastic=True
for loop in tests of get_covariance + get_precision
typo
typo
wording
simplify FA tests after rebase
pimp example
show example in narrative doc
avoid zip + topic in doc
FIX : get rid of coef_init in Lasso in dict learning
FIX : cleanup coef_init in coordinate descent
FIX : warm_start in coordinate_descent (was ignored)
FIX : set deprecation to 0.16 in coordinate descent
API : set return_models to False by default in path functions in coordinate descent
FIX : get rid of convergence warning in test_coordinate_descent
ENH : Gaussian process for arbitrary-dimensional output spaces by @JohntheBear
ENH : simplify input checking in GP
update what's new
Merge pull request #2497 from dengemann/extend_fast_dot
ENH : better shrinkage range for ShrunkCovariance
s/n_features/n_samples in doc/datasets/index.rst
Merge pull request #2576 from ankit-maverick/issue2560
Merge pull request #2583 from amueller/doc_rbf_parameters
Merge pull request #2624 from dengemann/fix_dot
Merge pull request #2628 from dengemann/improve_ica_example
Merge pull request #2647 from trein/patch-1
FIX : explained_variance_ratio_ in RandomizedPCA
update tests
better sparse support
pep8
update what's new
add test with sparse data
address @ogrisel's comments
Merge pull request #2669 from GaelVaroquaux/phimeca
fix f_oneway with ints
pyflakes
Merge pull request #2683 from blagarde/master
Merge pull request #2708 from likang7/patch-3
Merge pull request #2706 from likang7/patch-2
TST : speed up tests + cosmit
FIX : n_jobs was missing from LassoCV
update what's new
Merge pull request #2742 from ai8rahim/issue-2741
update what's new
clarify what's new
Merge pull request #2857 from eltermann/copyright-updated
Merge pull request #2858 from ameasure/patch-1
Merge pull request #2866 from Manoj-Kumar-S/Iss2751
Merge pull request #2884 from dsullivan7/clusterdoc
Merge pull request #2895 from kaushik94/master
Merge pull request #2920 from yoni/master
Merge pull request #2931 from GaelVaroquaux/rm_solve_triangular
Merge pull request #2945 from maheshakya/olivetti
Merge pull request #2958 from hamsal/check_arrays_n_dim
Merge pull request #2951 from Manoj-Kumar-S/speed_sparse
Merge pull request #3017 from cgohlke/patch-2
Merge pull request #3035 from ogrisel/test-sgd-stability-error
Merge pull request #3022 from ogrisel/travis-old-numpy-scipy
Merge pull request #3116 from eickenberg/ridge_wrong_solver_exception
Merge pull request #3135 from ugurthemaster/patch-1
typo
Merge pull request #3151 from rajatkhanduja/fixing_examples
Merge pull request #3152 from jdowns/lfw-conditional-import
Merge pull request #3154 from rajatkhanduja/fixing_examples
DOC : fix alpha docstring in dict learning
DOC : really fix alpha docstring in dict learning
FIX : omp default param could fail
ENH : use local seed + avoid warning
Merge pull request #3200 from ugurthemaster/patch-2
Merge pull request #3176 from agramfort/omp_default
Merge pull request #3230 from bwignall/quickfix-gaussian
Merge pull request #3231 from bwignall/quickfix-cap
Merge pull request #2822 from AlexanderFabisch/tsne
Merge pull request #3254 from laurentluce/presentations-links-fix
Merge pull request #3260 from larsmans/cd-deprecation
DOC : clarify error message in _binary_roc_auc_score
FIX : allow nd X for cross_val_score (was working in 0.14)
Merge pull request #3343 from ihaque/doc_partial_fit
Merge pull request #3344 from ihaque/remove_dup_fit
Merge pull request #3368 from stevetjoa/typo-diriclet
Merge pull request #3375 from ldirer/hashing_fix3356
Merge pull request #3472 from arjoly/fix-metrics-division
Merge pull request #2862 from MechCoder/LogCV
Merge pull request #3465 from pvnguyen/patch-1
Merge pull request #3588 from tejesh95/patch-1
fix_gnb_proba
Merge pull request #3653 from MechCoder/minor_opt
rephrase doc, rename unseen, fix pep8
Merge pull request #3674 from MechCoder/fix_l1_penalty
Merge pull request #3697 from luispedro/cv_doc_fix
Merge pull request #3705 from floydsoft/patch-1
Merge pull request #3724 from mlopezantequera/patch-1
Merge pull request #3740 from MechCoder/lassolarsicattribute
cosmit in warning message
Merge pull request #3480 from dsullivan7/sgd
Merge pull request #3770 from dsullivan7/asgdintercept
Merge pull request #3773 from dsullivan7/sgdtestfix
Merge pull request #3646 from s8wu/multinomial_newtoncg
Merge pull request #3778 from MechCoder/expose_positive
Merge pull request #3783 from jakevdp/gmm-doc-fix
Merge pull request #3772 from MechCoder/manhattan_metric
Merge pull request #3795 from dsullivan7/svm_doc
Merge pull request #3811 from MechCoder/fix_repeated_checking
Merge pull request #3861 from ogrisel/fix-python-2.6-test
Merge pull request #3857 from MechCoder/test_sparse_knc
Merge pull request #3867 from snuderl/patch-1
Merge pull request #3868 from MechCoder/typo_example_faces
Merge pull request #3889 from jdcaballero/master
Merge pull request #3905 from jlopezpena/fix-example-plot_pca_3d
Merge pull request #3920 from jlopezpena/pls-coef
Merge pull request #3923 from Lothiraldan/patch-1
add example for birch + make it more pythonic
Merge pull request #3940 from ragv/doc_own_estimators
Merge pull request #3954 from amueller/fix_sgd_learningrate_docs
Merge pull request #3802 from MechCoder/birch
Merge pull request #3962 from MechCoder/check_X_y
Merge pull request #3931 from dsullivan7/cwwarn
Merge pull request #3964 from bwignall/boostdoctypo
Merge pull request #3824 from nmayorov/rfecv_bugfix
Merge pull request #3969 from MechCoder/remove_check_X_y
Merge pull request #3966 from mvdoc/wtree_children
rename alpha to shrinkage + docstring cosmits
use random state in test (no global seed) + cosmit
Merge pull request #3974 from lesteve/fix-silhouette-typo
Merge pull request #3979 from lmichelbacher/patch-1
Merge pull request #3952 from MechCoder/callable_connectivity
Merge pull request #3985 from MechCoder/return_distances
Merge pull request #3997 from pricingassistant/doc-typo-fix
Merge pull request #3994 from jnothman/vec_dbscan
Merge pull request #4010 from MechCoder/sparse_bug
Merge pull request #4012 from pricingassistant/doc-typo-fix
Merge pull request #4021 from jakevdp/fix_check_symmetric
Merge pull request #4024 from jakevdp/ensure_symmetric
Merge pull request #4044 from andreasvc/patch-1
FIX : avoid unnecessary computation sparse dual gap
Merge pull request #4105 from amueller/minor_spelling_fixes
Merge pull request #4113 from ragv/isotonic
Merge pull request #4118 from jmetzen/kernel_ridge_doc
Merge pull request #4109 from amueller/gmm_tied_covariance
Merge pull request #4229 from ragv/4135
Merge pull request #4244 from ugurcaliskan/patch-1
Merge pull request #4214 from ogrisel/fix-empty-input-data
Merge pull request #4204 from mvdoc/master
Merge pull request #4265 from harrymvr/doc_spectral_clustering
COSMIT : pep8 + 2 spaces
TST improve coverage of calibration.py
Merge pull request #4359 from josephlewis42/master
Merge pull request #4364 from amueller/appveyor_badge
Merge pull request #4395 from akitty/patch-1
FIX : allow NaN in input of calibration if estimator handles it
Merge pull request #4476 from CaMeLCa5e/master
Merge pull request #4514 from sinhrks/missing_all
Merge pull request #4519 from ClimbsRocks/patch-1
Merge pull request #4516 from amueller/distance_function_docstring
Merge pull request #4581 from kno10/patch-1
Merge pull request #4596 from ibayer/doc_train_test_split
Merge pull request #4600 from jfraj/bug_20newsgroup
Merge pull request #4647 from lesteve/better-appveyor-badge
Merge pull request #4662 from GaelVaroquaux/infonea_testimonial
Merge pull request #4528 from amueller/rfe_feature_importance_test
Merge pull request #4676 from zacstewart/master
Merge pull request #4161 from rasbt/ensemble-classifier
Merge pull request #4734 from ajschumacher/patch-6
Merge pull request #4679 from TomDLT/newton_cg
Merge pull request #4753 from TomDLT/newton_cg
Merge pull request #4762 from trevorstephens/gridsearch_unused_params
Merge pull request #4778 from untom/fixname
Merge pull request #4777 from TomDLT/liblinear_randomstate
Merge pull request #4792 from amueller/coveralls-badge
Merge pull request #4810 from mblondel/cite_api_paper
Merge pull request #4818 from amueller/sgd_decision_function_deprecation
Merge pull request #4829 from rvraghav93/remove_deprecated_check_cv
Merge pull request #4833 from tw991/feature
Merge pull request #4763 from sonnyhu/RidgeCV_slice_sampleweight
Merge pull request #4835 from tw991/feature2
Merge pull request #4847 from TomDLT/doc_dev
Merge pull request #4857 from christophebourguignat/master
Merge pull request #4861 from ltiao/patch-1
Merge pull request #4562 from hsuantien/emptylabelset
Merge pull request #4894 from tw991/rf
Merge pull request #4893 from jmschrei/cca_stability
Merge pull request #4918 from ejcaropr/doc-fix
Merge pull request #4977 from edsonduarte1990/master
Merge pull request #5001 from hlin117/svm-documentation
Merge pull request #4933 from banilo/regr_score_docs
Merge pull request #5011 from TomDLT/rcv1_order
Merge pull request #5025 from jakirkham/fix_typo
cosmit in plot_cv_predict
Merge pull request #4960 from mkrump/lb_encoder
Merge pull request #5092 from kazemakase/fix-lda-covariance
typos
Merge pull request #5126 from rasbt/rbf
Merge pull request #5143 from Eric89GXL/fix-r2
Merge pull request #5157 from amueller/ridge_intercept_docs
Merge pull request #5131 from michigraber/nonnegative-lars
pep8 + simplify positive test
unused import
Merge pull request #5216 from JPFrancoia/master
Merge pull request #5258 from amueller/docfixes
Merge pull request #5283 from TomDLT/remove_warnings
Merge pull request #5335 from rrohan/current
misc + pep8
Merge pull request #5336 from giorgiop/pca-warning
Merge pull request #5363 from Naereen/patch-1
Merge pull request #5365 from Naereen/patch-2
DOC : update docstring of l1_ratio for E-Net CV classes
Alexandre Passos (87):
Adding random projections SVD to scikits.learn.pca as an option
Adding the power iteration parameter to fast_svd (to make it better in high-rank very-big very-sparse matrices according to the Martinsson et al survey)
Merging the rng changes
The derivation of the variational algorithm for the DP mixture of gaussians
Beginning the code; so far only doing the E step
First draft of the code; untested
The dp is already fitting properly
Fixing indentation bug
Changing the DP derivation to rst---equations don't work
Fixed the math
Removing useless whitespace between methods
Reorganizing the directory structure
Adding variational inference for a finite gaussian mixture model
I'm returning precision, not covariance matrices. Make that clear
Editing the documentation
Making it clear that the covariances don't work
Merge branch 'master' into variational-infinite-gmm
Fixing small bug
Adding example; adding explicit lower bound computation; optionally monitoring convergence; full and tied work, somehow spherical and diag diverge.
Using a smaller example to speed things up
Simplifying the code a bit
Fixing last bugs in the bound and updates; improving docs
Merge branch 'master' of https://github.com/scikit-learn/scikit-learn into variational-infinite-gmm
Fix docstring find&replace issue; restoring VBGMM
Adding reference in the derivation
pep8 dpgmm.py
Fixing test failures in mixture
Fixing pyflakes warnings
Adding complexity note to the documentation
Replacing DP by dirichlet process
Don't use np.linalg
Explaining what is dpgmm
Adding see also sections to the mixture models
Fix the 'give' in plot-dpmm
Editing a single example for the GMM and DPGMM explaining the difference
Making the documentation findable
Editing the documentation substantially
Adding doc to VBGMM
Adding usage note to dp-derivation
Adding some test coverage. For some odd reason some tests fail on 'make test' but pass on 'nosetests scikits/learn/tests/test_mixture.py'. Any idea why?
Merge branch 'master' of https://github.com/scikit-learn/scikit-learn into variational-infinite-gmm
Fixing the docs
Changing the image url in the doc
Even seeding the RNG in setup_func doesn't make the tests consistent
There was a bug in the setup, now things are working deterministically
Deleting stray print statement
Adding an rng parameter to the GMM classes
Fixing the imports
Inlining the helper norms
Beginning to vectorize the code
more vectorizing
Finish removing quadratic dependence on n_states; update docs
Adding norm to scikits.learn.base, using that
Putting norm in utils
Vectorizing parts of the VBGMM, which I had skipped due to it being a lot less useful than DPGMM
Incorporating some caching and vectorizing to improve performance as per line profiles
Fixing typo bug
Caching another computation
Small typo bug in _bound_z
a no-op that fixes tests
Change monitor to verbose, better output
Fixing typo-bug in the full covar update. There are still a couple of nondeterministic bugs to be taken care of
Making test_sample stop failing for no reason
Removing the square from norm() and creating helper sqnorm() in dpgmm
Prevent setting the covariance parameters
Caching the computation of the constant part on _bound_pxgivenz
Caching part of the bound for diag that was missing
moving some parameters from fit to __init__.
Merge branch 'variational-infinite-gmm' of https://github.com/GaelVaroquaux/scikit-learn into gael-variational
Fixing the names in the hmm test
Merging gael's branch
Merge branch 'variational-infinite-gmm' of https://github.com/GaelVaroquaux/scikit-learn into variational-infinite-gmm
Renaming bound_pxgivenz
Renaming covar to prec
Finishing the renamings
Adding a squiggly curve example for the mixture models
Improving the coverage of dpgmm
Testing lognormalize
Splitting test_mixture
Preventing underflow in wishart_logz
Fixing 0* problem in z log z
Fixing another underflow bug in digamma. Now the bound for spherical covariance never diverges as a cluster gets empty
Also, no warnings when running these tests
Fixing test failures resulting from the merge
Fixing some under and overflows; this doesn't fix all test errors yet
Removing some more underflows, still not all
dpgmm: setting the weights to something reasonable
Alexey Grigorev (1):
printing top terms even if LSI is used
Alexis Metaireau (7):
Configure sphinx to be able to load extensions
Fix OptionParser import
Fix little typos in the general_concepts document.
merge with upstream
fix a typo in neighbors docs
fix restructured text problems in the developers doc
Merge branch 'master' of https://github.com/scikit-learn/scikit-learn
Alexis Mignon (39):
Added positive constraints for the elastic net
Made the code pep friendly
Added fit_intercept for sparse ElasticNet as well as corresponding test
Corrected bad comment and the use of a typedef
Made code pep compliant
DOUBLE doesn't stand for a dtype
Added utility functions for csc sparse matrices
Modified: uses utility function for sparse csc matrices
Modified data generation so it can generate data adapted to positiveness constraints
Removed most python function calls
Removed duplicate definition of csc_mean_variance_axis0
Made the code pep8 compliant
Corrected docstring: CSR -> CSC
Regenerated with Cython
Corrected missing import of csc_mean_variance_axis0
Made code pep compliant
Modified: in 'center_data' makes a copy only when needed
Made code pep8 compliant
Unified access to 'mean_variance_axis0' for CSC and CSR matrices
Removed unneeded functions
Added warm restart option and completed docstring
Completed docstrings, factorized some tests and added checks on dimensions
Added test case for warm_start
Added size check on coef_init
Made code pep8 friendly. Used random state with fixed seed.
Made code pep8 friendly.
Modified chi2 kernel approximation such that it deals with zero elements
kernel approximation: simplified management of non zero elements
For the sake of clarity, creates new temporary arrays instead of copying the same one several times. Modified error message for negative valued arrays.
pep8 compatibility
MAINT remove default values from private funcs in GMM
ENH faster sampling in GMMs
Change the way the covariance is computed to avoid problem with not positively defined covariance matrices
Added test to check that obtained covariance matrices are positive definite after learning a GMM
Cosmetic changes in the doc string. Added an indentation level
Clarified a comment and factorized the tests for all covariance types
Converted test_positive_definite_covars to a generator.
Converted docstring of 'test_positive_definite_covars' into a comment to avoid description problems with nose (Related to issue #4250)
Put the error state change about underflow errors in a 'with' statement in '_covar_mstep_full'.
Ali Baharev (1):
Typo: PCA is not the abbreviation of Probabilistic
Allen Riddell (2):
DOC: Update dead link in cross_decomposition.rst
DOC: Fix typo in CalibratedClassifierCV
Alyssa (4):
[DOC] Updated install instructions for Arch Linux
<MRG> Doc generation works in Python 2 and 3
[MRG] Addressed comments
[MRG] Addressed comment, fixed open-ended except
Amit Aides (9):
Fix to sparse SVC with kernel='poly'
Added Multinomial Naive Bayes classifier
Fix to the documentation of the Multinomial Naive Bayes.
Pep 8 compliance and cleanup for the multinomial naive bayes
Merge remote branch 'upstream/master'
Some more pep8
Merge branch 'master' of git://github.com/scikit-learn/scikit-learn
Merge remote branch 'upstream/master'
naive bayes name change MNNB->MultinomialNB
Amos Waterland (4):
Explicitly invoke the Python interpreter.
Use double backticks.
Remove extraneous period.
Explain the -1 Python syntax.
Anders Aagaard (1):
FIX six issue with module imports
Ando Saabas (1):
Store tree node info for all nodes, instead of just leaves
Andreas Mueller (1798):
Remove copy and paste errors from nearest neighbors example
Fixed issue 82: bug in init of Kmeans.
Minor documentation: how passing a callable for init works.
Changed default initialization method to "k-means++" for consistency with k_means
k-means clustering test: changed data points to be far away from zero. Now
transpose data on input and sources on output.
Adjusted examples to new ICA interface
Merge branch 'master' of https://github.com/scikit-learn/scikit-learn
typo
I don't really understand this, but it makes the error go away.
Added warning to fastica
pep8
fixed bug
pep8
typo
mention LDA in docstring
print docstring in examples
typo
pep8 and starting with X in right shape
rst fix
Notes on Fortran-ordering in fastica
test for vectorizer_inverse_transform
non-regression test for warm-start intercept shape using a binary dataset
letting intercept_init be of shape (), reshape to (1,) for consistency
added hopefully more intelligible error messages.
pep8
pep8
typo, pep8 and line continuations
test for new error strings
slight beautification (in my opinion)
don't test on error message, just on raise
pep8
DOCS: Image is aligned to the right...
DOC Added documentation for important attributes of GridSearchCV
specify dict type
DOCS: Typo in url
ENH: Adds more verbosity to grid_search. verbose > 2 gives scores while running grid.
Merge pull request #414 from amueller/grid_search_verbosity
DOC: Document "cache size" argument of SVR
COSMIT: remove unused error string.
COSMIT: remove unused error string.
Merge branch 'master' of https://github.com/scikit-learn/scikit-learn
ENH: removed kernel cache from fit method of DenseLibSVM, added to __init__ of BaseLibSVM
Added kernel cache argument to init of all SVC and SVR classes. For the moment the conservative 100MB default.
BUG cache_size instead of cache as parameter name
BUG: cache_size also for sparse SVMs
ENH: SVM cache_size default value changed to 200 mb
ENH Sparse SVM: removed cache_size parameter from fit method. Is now part of constructor.
DOC fixed doctests for cache_size parameter
DOC slight reformatting of kernel cache note in module docs.
BUG: minor mistake in earlier commit.
DOC: forgot doctests in python files.
DOC: another doctest.
ENH: in Scaler, warn if fit or transform called with integer data.
Merge pull request #425 from amueller/svm_cache_size
ENH parameter "epsilon" in SVR error messages is given correct name.
Merge branch 'master' of https://github.com/scikit-learn/scikit-learn
DOC Made reference to "Getting started" in "Datasets" section a link.
DOC: inline example for precomputed svm kernel
ENH in preprocessing.Scaler, raise warning also if given unsigned int
DOC/Website: Changed link on "support" page to scikit-learn.org, added 0.9 release doc link
DOC fixed whitespace in GridSearchCV doc string so that html doc is generated correctly.
COSMIT removed unused import, pep8
COSMIT pep8 in cluster module, removed unused import
COSMIT pep8 whitespace
COSMIT removed emacs modeline
COSMIT: pep8 whitespaces instead of aligned decimal points
COSMIT indentation
COSMIT ugly line break for pep8
COSMIT reindented for pep8
COSMIT pep8 whitespaces
Merge pull request #447 from amueller/pep8
ENH: in sgd classifier, check that parameter alpha is greater than zero
COSMIT some pep8
some pyflakes
COSMIT more pep8
COSMIT more pyflakes
COSMIT: more pep8. enough for today...
ENH: fastica returns whitening matrix "None" when whitening=False
TEST non-regression test for issue 238, FastICA failing with whiten="False"
COSMIT pep8
COSMIT pyflakes
COSMIT: pep8
COSMIT pep8 in backported sparsetools...
DOC Added Gael's explanations about the memory usage in grid_search / joblib
DOC: Auto example digit classification plot without interpolation and axis.
FIX: typo in with statement
Example for random dataset function.
Random dataset example: make figure look nice on the web
DOC: Added random dataset plot to doc.
COSMIT: random dataset plot prettified
DOC Added comment about equivalence of nu-SVM and C-SVM to the docs
Examples: Replaced NuSVM by rbf SVM in example. RBF-SVMs are really important, NuSVMs not so much imho.
pep8. whoops..
COSMIT: pep8
FIX: Return "None" first.
Example for finding the hyperparameters in a RBF SVM
Examples: Make SVM parameter estimation look good on the web.
DOC: Fixed legend in iris svm example
DOC Nonlinear SVM example changed to satisfy my sense of aesthetics. Hope you like it.
Merge branch 'master' of https://github.com/scikit-learn/scikit-learn
COSMIT pep8
Example illustrating parameters of an RBF SVM
COSMIT removed unused import math in utils/extmath.py
FIX: make kmeans test not raise warning when init is passed.
FIX: make kmeans test not raise warning when init is passed.
DOC Description of the basic dataset API
DOC: Corrections and additions to the dataset docs. Also more detailed docstrings
DOC test fix. Set printoptions to get rid of epsilon.
FIX: whitespace after ..
DOC test fix finally....
DOC fixed fastica docstring: if whiten=False K=None
ENH linnerud dataset interface adjusted to be consistent with the others
FIX: typo in diabetes docs
DOC RST field lists don't behave as I want them to:(
COSMIT datasets doc using rst tables
FIX This should fix the doctests in the datasets dir. They take quite long, I think it's because of the svmlight loaders. So I didn't include them in the standard make target
COSMIT rst formatting
DOC: Added missing rst label
FIX RST references
Merge branch 'master' of https://github.com/scikit-learn/scikit-learn
FIX rst errors in docs
FIX doc rst references
DOC Added link for Satrajit Gosh, removed dead link for Robert Layton since I couldn't find his website.
Merge branch 'master' of https://github.com/scikit-learn/scikit-learn
DOC Robert Layton again.
ENH prettify kmeans vs minibatch kmeans example
ENH adjust subplots to look good on the web.
FIX minor typo
FIX errors in doc
FIX minor docfixes
added kernel approximation using monte carlo approximation of fourier transform
ENH pipeline compatible interface to fit, transform and fit_transform
DOC example comparing linear classification, kernel svm and kernel approximation with explicit mapping.
kernel approximation example
DOC beautified kernel approximation example plot
better docs, remove unimplemented kernel approximations
COSMIT pep8
DOC added kernel_approximation module to docs
DOC: placeholder entry in user guide
ENH: renamed D to n_components for consistency
DOC approximate kernel functions narrative docs
DOC: more narrative documentation for kernel_approximation
DOC: references for approximate feature maps
COSMIT: pep8 in kernel approximation test
DOC approximate kernels: added formula for skewed chi squared kernel
COSMIT removed commented out import
ENH: additive chi squared kernel implemented and tested
pep8
DOC: added AdditiveChi2Sampler to doc modules
ENH: Default value for n in AdditiveChi2Sampler
DOC narrative doc for additive chi squared kernel
ENH: sensible defaults for RBFSampler and SkewedChi2Sampler
ENH Added AdditiveChi2Sampler to feature_extraction __init__
BUG: AdditiveChi2Sampler fit method should return self
ENH: in Chi2Samplers, check if input inside desired range.
FIX: Renaming of RBFSampler argument
DOC: Move kernel approximation to be a "plot" example.
Don't test as strictly so not to fail randomly..
Example of decision surface of approximate kernel svm
Moving kernel_approximation to the top level
ENH: Restructuring User Guide: kernel_approximation, preprocessing and feature_extraction are under a common chapter, "
DOC: finetuning the narrative docs for kernel_approximation
DOC: kernel_approx make examples show correctly
DOC rst
ENH Addressing some of Gael's comments, mainly naming and docstrings
ENH better testing
ENH fixed location of the legend in kernel_approximation example
DOC more discussion in docstring
ENH timing results in approx kernel example
ENH kernel approximation: More specific references and example referencing the narrative docs.
FIX: use safe_sparse_dot in kernel_approx transform
DOC minor doc improvements, different example
NONSENSE improve the example that i'll remove in a sec
BUG import ...
COSMIT + SPELL
DOC added reference to the user guide in kernel_approximation module
FIX path in plot
FIX typo that cost me half a day of sprinting...
ENH Remove redundant example
FIX fix module links, figure split into two
COSMIT pep8
FIX: Kernel approximation module in references in alphabetical order.
DOC trying to clarify the kernel_approx documentation.
DOC FIX typo
FIX docstring errors...
Merge branch 'master' of https://github.com/scikit-learn/scikit-learn
FIX: missing import
FIX: fixed link to Virgile in whats_new
Merge pull request #486 from jakevdp/util-docs
Merge pull request #490 from mblondel/news20_loader
Merge pull request #488 from mblondel/sparse-kmeans
FIX: Added DBSCAN to references
FIX: typo in docs
Merge pull request #417 from larsmans/multilabel
COSMIT minor ticks
FIX getting rid of some more sphinx problems
FIX: WHAT A MESS!
COSMIT fixing indents in balltree
Merge pull request #510 from amueller/aaarrrgghhh
Merge branch 'master' of https://github.com/scikit-learn/scikit-learn
COSMIT "Examples" instead of "Example"
COSMIT Addressing @agramfort's comments about whitespace and a minor fix in pipeline.
FIX: Section Returns not Return
COSMIT: Class docs don't have a 'Methods' section. It is autogenerated.
COSMIT Examples not Example
COSMIT make 'References' bold and minor other fixes.
COSMIT underscore fixes
COSMIT "Optional Parameters" Section removed
COSMIT pep8
FIX developers rst malformed
remove unused link
COSMIT remove unused malformed tag
FIX indentation and string literals
FIX backticks for members_, spaces around colon (not cologne)
COSMIT minor docstring stuff
COSMIT remove Methods section
FIX: rename complexity section into notes section
FIX docstring variable names
FIX rename "Details" into "Notes"
FIX remove infinite recursion
COSMIT: Make references link and show up correctly, parameters of __init__ documented in Class, not in function.
COSMIT make formulas show up correctly, use reference formatting for references
COSMIT make references use reference formatting
COSMIT format references and dict stuff...
COSMIT Indentation of formulas
FIX removed duplicate explicit link for Vlad
FIX: RST indentation and blank lines
FIX RST and references
FIX minor rst
FIX workarounds for docutils bug
FIX whitespace where rst demands it...
FIX workaround for table problem
FIX two more underscores
Merge branch 'master' of https://github.com/scikit-learn/scikit-learn into doc_underscores_for_real
COSMIT docs hmm
FIX: don't use latex in rst
FIX + COSMIT rst warnings
COSMIT docs
FIX: fix again errors in NMF after merge
FIX: Document properties in a way that the docstring actually shows
FIX: rst errors in ball_tree
FIX: Notes instead of Note in preprocessing init
FIX remove handles for references as they are not used anywhere and raise warnings if doubled.
Merge pull request #513 from amueller/doc_underscores_for_real
COSMIT docs underscore fixes (again)
COSMIT fixing doc errors and making html docs pretty
COSMIT Minor beautifications and RST error fixes
FIX doctest errors + cosmit
Merge branch 'master' of github.com:scikit-learn/scikit-learn
DOC use SVC in grid search instead of SVR. Iris is a classification dataset as pointed out by @agramfort here:
DOC score returns accuracy, not error
FIX for doctest I just broke :-/
DOC uncommenting doctests in balltree.pyx, adding doctest: +SKIP
COSMIT a little less skips...
ENH Add underscores to estimated attributes in GridSearchCV and deprecation warnings.
ENH renamed best_estimator and best_score in examples and tests.
COSMIT typo
ENH in GaussianNB, let estimated parameters have underscores.
DOC Reworking Bayesian regression documentation
DOC mentioning sparsity of ARD, reblocking text
COSMIT typo, thanks @vmichel for pointing it out.
DOC added reference for sparsity of ARD
COSMIT pep8
DOC fix linking to load_sample_images and load_sample_image in docs
DOC underscores in DeprecationWarnings... shame on me for forgetting that....
DOCs workaround for docutils bug (column alignment problem)
DOC external references go under "references" not "see also". "See also" can only handle internal references
ENH liblinear: cythonized sign switch for n_class<=2
ENH liblinear: get rid of n_class sign by switching class signs in liblinear implementation.
COSMIT typo
whatsnew: gave myself some credit
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Merge branch 'master' into svm_coef_sign
FIX adjust _set_coef_ and _set_intercept_ to sign switch
ENH DenseBaseLibSVM.coef_ correct. test simplified.
DOC try to document layout of dual_coef_ in multiclass libsvm
DOC fixed errors in load_images doc and SKIP'ed load_image doctest as was already the case for load_images
DOC: OCD and added image loader to class reference
DOC Trying to enhance the tree/forest docs. Headlines in tree, added reference, hopefully better description of 'min_density'.
DOC layout of dual_coef_ in 1vs1 svm in user guide, example
DOC fixed indices in dual_coef_ example
COSMIT factor out 1vs1 coef construction in libSVM, PEP8
DOC added RidgeClassifier to References
DOC fixes in Multiclass docs. Didn't show correctly on web.
DOC multi-class narrative: added links to the references, made citation clickable
ENH trees in random forests save the indices of the training data used in bootstrap sample
ENH Add function to predict on left part of training set
ENH use self.classes_, check input on predict_oob, add test
DOC Out of bag error estimates in grid_search module
COSMIT @glouppe says this is more pythonic :)
DOC reformulation out of bag error
COSMIT in doc: @ogrisel's remarks
ENH oob score as attribute, not separate function.
ENH: added oob_score_ and oob_prediction_ to regression ensembles
FIX copy/paste error. guess it was too late
ENH made oob_score an ``__init__`` param as suggested by @agramfort
DOC what's new, minor doc improvements
Merge pull request #571 from amueller/tree_indices
ENH: Replace asserts by appropriate errors. Fixes the rest of issue #570.
COSMIT how I love these sphinx errors
DOC complicated objects as parameters confuse sphinx and the reader. Fixes issue #567.
ENH: Default in Vectorizer "None" as @ogrisel suggested
DOC website: added link to 0.10 docs under support.
DOC added required versions of python, numpy and scipy to install documentation. Closes issue #579
COSMIT pep8
COSMIT removed unused imports
Merge branch 'master' into svm_coef_sign
DOC comment in linear.cpp
DOC @ogrisel's suggestion: putting a link to pull request in liblinear.cpp
COSMIT pep8
DOC fixed doc errors in metrics module
COSMIT removed unused imports
Merge pull request #546 from amueller/svm_coef_sign
FIX RandomizedLogisticRegression test import
COSMIT removed unused import
DOC fix sphinx errors
DOC more fixes in Docs
DOC cluster metrics: fixed see also sections, errors in references section.
COSMIT pep8
FIX SGD loss example for new hinge loss.
FIX lasso_dense_vs_sparse_data.py example needed update.
COSMIT pep8
DOC add cross_val_score to references, OCD.
FIX bug in text feature extraction, issue #606
COSMIT pep8
DOC fix sphinx errors
ENH: moved class_weight parameter in svms from fit to ``__init__``.
MISC Adjusted class_weight param in examples, fixed legend in unbalanced dataset examples.
DOC typos.
MISC reinserted class_weight as fit parameter, added deprecation warning.
MISC cleanup
DOC margin for old warning wrapper fixed
MISC Deprecated class weights in SGDClassifier
Merge pull request #578 from jakevdp/old-version-warning
pep8
COSMIT pep8
COSMIT get rid of warning in nosetests for equidistant neighbors. it's intentional.
MISC more sensible NMF test.
COSMIT pep8 wooops thanks @ogrisel
MISC forest tests: boston faster, probability test faster and no warning.
MISC decision tree test faster and no warning
COSMIT simplified error message checking, remove deprecation warning.
MISC more iterations for test_lasso_path. Still runs in <.1s, gives no warning and more accuracy.
MISC more iterations also for test_enet_path, same runtime as before, no warning.
COSMIT pep8
COSMIT pep8
FIX added missing import
MISC added warning to coordinate descent if alpha=0, don't call cd with alpha=0 in tests.
MISC replaced deprecated mean_square_error in test.
MISC test for warnings as @ogrisel suggested.
Merge pull request #620 from amueller/coordinate_decent_alpha_warning
add min_leaf (minimum size of leaf node) to decision tree
ENH min_leaf for ExtraTree
ENH added test for "min_leaf"
ENH set min_split if min_leaf is set.
DOC add load_svmlight_file to references
DOC minor fixes and typos
DOC more rst fixes....
DOC typo in whatsnew
Merge branch 'master' into svm_class_weights
DOC renamed duplicate label
FIX flip sign in decision function of LibSVM in binary case.
MISC renamed min_split and min_leaf to min_samples_split, min_samples_leaf, added them to the ensemble classifiers and documented them....
FIX OneClassSVM decision function sign.
ENH more elaborate one class svm testing....
MISC address @mbondels comments
MISC simplified test
FIX one class test, added more decision function tests.
COSMIT pep8 + "leafs" typo.
DOC Added changes to decision functions and coef_ to whatsnew
MISC don't use deprecated mean_square_error
Merge branch 'master' into svm_class_weights
Merge pull request #610 from amueller/svm_class_weights
COSMIT pep8
FIX whooops sorry
DOC Insert hidden toctree, mv "included" files from rst to txt
MISC Issue #639. Remove unused member types in linear_model CVs
DOCs change extension from txt to inc, add inc as doctest extension to makefile
MISC verbosity parameter for forests: better control over tree building.
Merge pull request #641 from amueller/doc_fixes
FIX dataset docs: changed suffixes in include to match rename.
DOC fixed inconsistent titles. sphinx didn't like them and didn't show these sections.
MISC @ogrisels comment about human-parsable counting
Merge pull request #643 from amueller/forest_logging
DOC C is pretty large now...
MISC class_weights constructor parameter in RidgeCV
DOC doc fixes
MISC added removal version for scikits.learn deprecation warning.
MISC remove ball_tree and cross_val namespaces
MISC scikits.learn removal at .12. I'm not so good at counting, sorry.
Merge pull request #660 from amueller/remove_namespaces
COSMIT renaming scikits.learn to sklearn in some places
COSMIT pep8
MISC Update all the other deprecation warnings that I forgot.
FIX: class_weight only in classifier Ridge classes
DOC Documentation for RidgeClassifierCV
DOC add removed docstring.
COSMIT pep8
ENH Added tests and fixes
DOC remove "for dense data" heading for SVM classes
Merge branch 'master' into linear_model_class_weights
DOC document classification plot
MISC removed deprecated api from examples
WEBSITE: make example gallery look even better!
DOC added reference to r2 score
ENH rename parameter "multi_class" of LinearSVC to "crammer_singer", add docs, add tests
FIX forgot doctest
DOC minor addition to SVM kernel parameters
DOC more readable make_friedman docs....
Merge pull request #649 from daien/GridSearchCV_precomputed_kernel
COSMIT don't use deprecated names
ENH new samples generators for classification and clustering. Refactored label propagation example a bit
ENH cluster comparison example (starting)
Merge pull request #669 from amueller/example_gallery_css
ENH added "shadow" parameter class_weight_ as @ogrisel suggested.
MISC changed parameter name back but changed semantics, as @mbondel suggested.
COSMIT pep8
DOC added one more sentence about crammer-singer
COSMIT typo. thanks @ogrisel.
DOC crammer_singer docstring by @ogrisel
ENH clustering example with spectral clustering and ward with connectivity. looking better now, still not perfect.
FIX broke label_propagation example, now fixed it again.
Merge branch 'master' into sample_datasets
Merge pull request #673 from amueller/crammer_singer_rename
DOC add new dataset generators to class reference
WEBSITE: another css enhancement to give figures a max width.
DOC move references from Notes to References section in docstrings
MISC simplified kpca example with new dataset generator, another minor fix in generator
DOC lasso/enet regression example with coefficient plots, corrected r2 score
DOC Basic docstrings for LDA and QDA classes
DOC lda/qda examples: remove redundant example, prettyfied other.
DOC Added QDA to references, narrative docs, improved docstrings
COSMIT newline in LDA doc
DOC explanation for plot in lda/qda narrative
MISC use Gaels pretty plot, add dbscan, normalize data...
COSMIT cleanup, pep8
ENH issue #661, plus some renaming and minor cleanup
MISC forbid mle initialization of PCA for n_samples < n_features
DOC added clustering example to the docs
COSMIT make plot look more like other coef plots
COSMIT removed debugging print
MISC added xlim and ylim for @ogrisel's weird matplotlib ;)
ENH fixed seed, added center positions
Merge pull request #674 from amueller/sample_datasets
FIX minor doc fixes
DOC add link to narrative in lda and qda references
DOC add ``estimate_bandwidth`` utility for MeanShift to the references and narrative
MISC make Ward check if input is sparse.
MISC make Ward test if connectivity is a valid connectivity matrix.
COSMIT changed error message for Ward
DOC another coefficient plot
COSMIT Adjust title for example gallery
ENH 2d plot for l1l2 digits example
COSMIT last try to make my plot pretty....
BUG fixed error that I introduced earlier: connectivity can also be `None`
DOC fixed reference to an example (that I also broke before)
Cosmit typo
FIX plot example fix for old matplotlib, so that it shows on the website.
Merge branch 'master' of github.com:amueller/scikit-learn
COSMIT make cross_validation nosetest slightly more readable and more pep8 respecting
FIX make class weight nosetests work
FIX get rid of some doctest errors (with the stricter nosetester)
ENH refactoring of dot-file export
COSMIT comments
COSMIT minor visual enhancement
ENH: don't fail on "yeast" dataset
Merge pull request #711 from davidmarek/sparse_pca
DOC Added clustering functions to references.
Merge pull request #685 from ibayer/master
ENH local variable in ``fit`` instead of modifying the estimator parameters. thanks @GaelVaroquaux
DOC: Added EllipticEnvelop to the References
DOC added reference for EllipticEnvelop and fixed some sphinx errors.
FIXed nosetests. Thanks @pprett
Merge pull request #707 from amueller/graphviz_dot_refactoring
Merge pull request #648 from amueller/linear_model_class_weights
COSMIT Typo
COSMIT pep8
DOC sphinx/rst errors
DOC Believe it or not - this fixes the annoying sphinx error. And don't dare to
COSMIT minor fixes to docs
COSMIT fixed references to covariance.EllipticEnvelop in docs
COSMIT pep8
DOC correct links to face recognition example, take care of trailing underscores.
COSMIT pep8
ENH grid_search forgets estimators
DOC slightly better docs for ``refit``, document ``best_params``.
FIX clone base_clf before setting params.
FIX messed up something in the short cut method.
ENH pre_dispatch for foresters
FIX redundant code is redundant
COSMIT add todo comment to grep
Merge pull request #770 from amueller/oblivious_grid_search
ENH normalized_mutual_information
Revert "Merge pull request #773 from amueller/forest_pre_dispatch"
COSMIT don't use deprecated attributes in tutorial.
COSMIT pep8
FIX don't use parameters to fit in GMMHMM.
FIX don't use Python 2.5 method of checking for warnings
MISC Don't warn on equidistant on iris. iris has duplicate datapoints.
FIX don't use fit parameters in grid_search test
ENH convert X to float in k_means predict.
MISC don't use private ``set_params`` method as that raises a warning.
MISC don't use iris in testing as it has duplicate data entries. Add some noise to simple examples.
MISC added note that we need better tests
DOC typo
ENH check if backport of sparse scipy ARPACK is needed. The backport breaks with scipy 0.11
Added mutual_info_score to the references
DOC narrative docs for normalized_mutual_info_score
DOC make formulas for clustering metrics more pleasing to the eye
ENH fix if entropy is zero in normalized_mutual_info_score
COSMIT cleanup + pep8 in examples
MISC extended example, fixed doc build warning
DOC made it more explicit that AMI is better than NMI
COSMIT + MISC pep8, pyflakes, typos and some other cleanup of examples.
DOC typos (thanks @ogrisel) and some elaboration in docstring.
Merge pull request #800 from amueller/less_neighbors_warnings
FIXed pca example that I broke when "cleaning up"
ENH checked for scipy version
ENH add ``decision_function`` to ``Pipeline``
ENH joined tests for less duplication, checked shapes as @ogrisel suggested.
FIX we need to do "LooseVersion" to support dev/git versions of scipy
COSMIT pep8
COSMIT make test more explicit
COSMIT removed unused "verbose" option in dbscan
COSMIT removed unused import in test
FIX copy/paste error
FIX removed verbose also from main DBSCAN class
DOC added reference to Hila's thesis, added comment about equivalence.
ENH replaced v_measure_score computation with nmi computation.
DOC removed NMI from example plot as it is the same as V-measure
COSMIT dbscan test doesn't use fit params
DOC comment on normalized mutual information
ENH simplified entropy calculation
Revert "DOC removed NMI from example plot as it is the same as V-measure"
Revert "Revert "DOC removed NMI from example plot as it is the same as V-measure""
Revert "ENH replaced v_measure_score computation with nmi computation."
COSMIT typos by `git grep independant`
DOC corrected relation of V-measure to normalized mutual information.
MISC removed unused lines, see #666.
COSMIT rst in example
ENH adjusted examples to new matplotlib 1.1.1
MISC don't use ``set_cmap``
MISC use logsumexp in DPGMM for less warnings
FIX typos in examples
FIX one more example
MISC trying to remove scale_C
MISC forgot two
DOC docs and examples have scale_C removed
FIXed many tests
DOC some doc corrections
ENH remove duplicate definition of "assert_lower" in tests
FIX ditto (numbers are too random)
ENH backport "assert_less" and "assert_greater", rename "assert_lower" and use it everywhere :)
ENH rename out_dim to n_components in manifold module
FIX assert_greater message
DOC Added pipeline user guide
ENH use random states everywhere, never call np.random.
FIX don't do anything in the __init__
WEB Added page with links to various tutorials/presentations on scikit-learn
DOC added some explanation to video page
ENH added random_state to Gaussian Process
FIX testing: random state problem in forest testing.
DOC minor fixes to rst and image paths
DOC banner 14 duplication?
DOC more minor fixes
DOC fix last docstring error. Don't remove redundant docstring. I dare you, I double dare you mother******!
RELEASE 0.11
COSMIT typo in whatsnew
RELEASE HEAD is now 0.12-git
COSMIT pep8
MISC don't use fit parameters in example
ENH rename unmixing_matrix_ to components_ in FastICA
DOC document 'labels' argument of confusion_matrix
DOC fix see also in gmm
FIX made "unmixing_matrix_" a property as @larsmans suggested.
COSMIT pep8
ENH rename 'k' in KMeans and MiniBatchKMeans
ENH renamed 'k' to n_clusters in SpectralClustering
ENH rename k in clustering examples and doctests to n_clusters
ENH fixed ``n_cluster`` to ``n_clusters`` in examples. Thanks @agramfort
ENH check whether "k" was used in fit, not init, as GaelVaroquaux suggested.
Merge pull request #874 from temporaer/master
Merge pull request #858 from amueller/fastica_components_rename
COSMIT pep8
FIX typo in example. My bad.
FIX renamed what was `components_` to `sources_`
COSMIT rst error
COSMIT fixing doc building errors.
COSMIT typo
Merge pull request #776 from amueller/normalized_mutual_information
Merge pull request #868 from larsmans/liblinear-1.91
ENH "fit_pairwise" for spectral clustering.
ENH Starting on affinity propagation
DOC typo
DOC Improving docstring for SpectralClustering
ENH fixed affinity propagation test. Need more tests.
ENH fit_pairwise, transform_pairwise for KernelPCA
ENH base svm has fit_pairwise and predict_pairwise.
ENH fit_transform_pairwise for KernelPCA
ENH isomap uses new interface.
COSMIT get rid of debugging output
ENH GridSearchCV uses the new API
COSMIT forgot one print...
DOC Deprecation warning with removal version 0.13.
ENH going for a universal property ``_pairwise`` instead of many functions.
ENH Cleanup
FIX Fixing rebasing problems...
COSMIT avoid errors in tests.
ENH slight improvement to mds speed, modified examples to not run mds that long.
ENH added old confusion_matrix implementation as alternative for few labels.
Merge pull request #887 from danohuiginn/master
BUG fixing bug in entropy that I introduced, adding regression test.
FIX faces_decomposition example. That this broke only now is a sign of deep magic, better left unexplored.
Merge pull request #888 from jaquesgrobler/master
DOC removed irrelephant/confusion reference, added pointer to source (as there is no other possible reference).
DOC user guide pdf building. Kicked out a formula that rendered neither in html nor latex. Please don't hit me.
Merge pull request #889 from vene/generate-multitarget
Merge pull request #875 from AlexandreAbraham/ward_coo_bug
COSMIT pep8
MISC raise more helpful error message in GaussianProcess if optimization fails.
MISC added bigger "tiny" in lars_path. least_squares is float32.
MISC reduce code duplication, fix "self.gamma" modification
MISC A bit more cleaning up in BaseLibSVM
DOC added "fetch_mldata" to references.
CLEANUP remove linear_model.sparse.setup.py
COSMIT pep8
DOC rename lambda to alpha in plot_lasso_model_selection. Closes #903.
TESTING check that SVC checks the shape of precomputed kernels.
ENH Check that X is non_zero for MultinomialNB.
ENH fixed doctests, addressed comments.
DOC improve kmeans init doc.
Merge pull request #894 from amueller/svm_sparse_dense
FIX more doctests that I broke.
DOC comment in whats_new on changed behavior of ``gamma`` in SVM
Merge pull request #914 from alexis-mignon/master
Merge branch 'master' into fit_pairwise
MISC callable kernel gridsearch fix...
ENH factorize common tests.
ENH don't list abstract base classes
ENH make base classes abstract meta classes
ENH make all Estimators default constructible (except SparseCoder)
ENH Add MetaEstimatorMixin, make RFE default constructible
ENH make GMMs and LLE cloneable.
COSMIT get rid of warnings (can't get rid of deprecation warnings only :-/)
ENH make BaseLabelPropagation abstract base class, make OutlierDetectionMixin not inherit from ClassifierMixin
BUG fix testing for abstract classes
ENH default score func for univariate feature selection: f_classif
Make sparse svm base class ABC
FIX better class selection, more strict testing.
ENH more tests
MISC raise NotImplementedError instead of value error in decision_function of sparse SVM
ENH do zero mean, unit variance on iris, don't test naive Bayes (for the moment)
ENH change defaults on SGD (works on digits and iris and I just guessed them).
ENH avoid division by zero in LDA, also avoid reusing variable names.
MISC don't test SVM for the moment, rest works :)
ENH make LinearModel and LinearModelCV abstract base classes
ENH test regressors
MISC shuffle iris for SGD based methods
Revert "ENH change defaults on SGD (works on digits and iris and I just guessed them)."
ENH Fix seed that makes SGDClassifier work.
ENH create BaseRidge base class
ENH test more shapes, test non-consecutive classes, test accuracy on test set
FIX minor rebasing and other problems
MISC cleanup common testing
Merge pull request #893 from amueller/common_test
FIX for filtering of meta estimators in python2.6
ENH better input validation for prediction in SVC, LinearSVC.
DOC Also added some notes on my recent merge with tests and stuff to the whatsnew.
MISC fixed random seeds in LLE tests.
Merge branch 'master' of https://github.com/scikit-learn/scikit-learn
COSMIT pep8
COSMIT pep8
ENH in OvR, use constant predictor if one class always present or never.
MISC address Gael's and Lars's comments, make ECOC tests deterministic.
FIX trying to fix long-standing linker issue
COSMIT pep8
trying out some testing stuff
ENH put atlas checking in one place and load from there.
DOC typo / wrong parameter in lle docs
Improve test-coverage ;)
COSMIT some RST fixes for the docs
Remove empty statement
DOC doctest failed on my box because I had higher precision...
COSMIT typos in covertype benchmark
Merge branch 'master' of https://github.com/scikit-learn/scikit-learn
Merge pull request #886 from amueller/multiclass_always_present
COSMIT, removed scikits.learn things, removed orphan file.
ENH trying to catch that damn thing.
ENH better error messages in multiclass as @mbondel suggested.
Merge pull request #1 from cournape/linking_arrayfuncs
ENH corrected error messages for always present labels. ugh
FIX doctests for changed dtype
ENH fixed warning for output code
FIXed another doctest.
ENH add verbose warning about too few trees for oob. Should we catch the division by zero warning for classification?
DOC made the pls example plots so much prettier
Merge branch 'master' into fit_pairwise
Fixed merge problem
ENH Removed stupid ``_pairwise`` property in BaseEstimator.
MISC minor cleanup in spectral clustering
FIX/TST test and fix grid search with kernel pca and precomputed kernel in pipeline.
COSMIT comments not docstrings in tests
Merge branch 'master' into fit_pairwise
TST precision issue on my windows box :-/
ENH slight cleanup in LDA, QDA, support for arbitrary class labels.
ENH use LabelEncoder
COSMIT typo in pairwise docs
DOC added LabelEncoder to the References.
Merge pull request #1001 from serch/master
Merge pull request #1008 from mrjbq7/doc-fixes
COSMIT pep8
ENH just a little more input validation testing
DOC added default value of shrink_threshold to NearestCentroid docstring.
DOC added ``lowercase`` to CountVectorizer docstring.
FIX feature selection dies on non-csr sparse matrices (that are unsubscribable). Regression test should go in common testing.
DOC added class_weight to LogisticRegression docstring
ENH auc_score and average_precision_score. Closes issue #158.
ENH added to ``__init__.py`` and references.
DOC explained RFE default behavior in docstring.
MISC Added unconfigured windows box to mailmap. Sorry about that.
DOC add parameters to TfidfTransformer docstring
ENH slight cleanup in LDA, QDA, support for arbitrary class labels.
ENH use LabelEncoder
FIX Removed code-duplication introduced in rebase.
FIX Fixed variable names. Thanks @mblondel
DOC Added wikipedia references to docstrings
Merge pull request #1013 from amueller/auc_score
DOC Updated whatsnew
ENH sparse matrix support in univariate feature selection
TST Simplified tests, test that sparse and dense versions give the same result, always return arrays, not matrices.
DOC Polished some docstrings
ENH Added copy keyword to safe_sqr, added to dev docs.
COSMIT Fixed commata
ENH Addressed @mblondel's comments.
ENH simplify as @mblondel suggests
ENH sparse matrix support for RFE and RFECV. Closes issue #1018.
DOC updated whats_new
ENH going back to not using LabelEncoder.
Merge branch 'qda_lda_1000' of github.com:amueller/scikit-learn into qda_lda_1000
Merge pull request #1000 from amueller/qda_lda_1000
typo in linear_model doc
ENH add verbosity parameter to cross_validation_score
MISC catch warnings in covariance tests
Typo in last commit :-/ sry
ENH catch expected warning in ward clustering
ENH renamed ``min_n`` and ``max_n`` parameters in CountVectorizer to enable gridsearch over them together.
ENH renamed parameter bounds_n to ngram_range, fixed doctests and tests.
ENH addresses @ogrisel's comments
ENH fix merge with char_wb_ngram
ENH check that classifier decision_function and predict_proba validate shape of input.
Merge pull request #1046 from TimSC/master
COSMIT pep8
ENH rename parameter ``p`` of AffinityPropagation to ``preference``, slightly change the meaning of scalar parameter. Scaling the medium seems more intuitive than giving absolute values.
DOC fixed renaming of ngram_range in feature_extraction narrative
TST check that transformers fail gracefully on sparse input
ENH affinity propagation now has an ``affinity`` parameter, instead of a ``precomputed`` parameter, to support other affinities in the future.
ENH renamed ``gaussian`` affinity to ``rbf`` in spectral clustering for consistency.
COSMIT renamed n_points to n_samples everywhere, fixed shape docstring that @mblondel pointed out.
FIX Worst feature in RFECV missing. closes issue #681.
ENH renamed ``neq_sqr_euclidean`` to ``euclidean`` so it is easier to parse
ENH Convert input into float in GMM
ENH add test, revert affinity propagation to previous parametrization (was a bit over-eager there)
TST added tests for different spectral clustering affinities
Merge branch 'fit_pairwise'
MISC add verbose keyword to AffinityPropagation
FIX fixed horrible bug in spectral clustering!!!!
ENH updated whatsnew for bugfix, removed warning box, tightened test.
TST classifier behavior with only one class present
ENH also test MultinomialNB
ENH some cleanup in grid_search.
Merge pull request #1068 from amueller/grid_search_cleanup
ENH add test for consistent predict_proba shape also in the two-class case.
tst add check for isotropic data in spectral clustering
FIX try to be a bit nicer to arpack - any one with a different setting care to try to make a more stable test?
FIX doctest corrected (hopefully this is deterministic) + cosmit
FIX removed isotropic spectral clustering test because of arpack problems.
FIX use backport of np.unique
FIX forgot some uniques
DOC fix minor sphinx errors and stuff
enh: try to get decision function to work in two class case
ENH make QDA and LDA decision functions adhere to standard shape [n_samples,] in two class case.
Fixed tests for RidgeClassifier
DOC updated whatsnew, moved @pprett's api fix into the api section.
ENH addressed @agramfort's comment, also removed the special case from testing as @mblondel fixed it :)
ENH added min_df keyword to CountVectorizer, default=2
ENH more robust testing for int
ENH more robust testing if parameter is int or float, as suggested by @larsmans in #1066.
FIX typo
COSMIT Typo. Englais svp. Closes #1090.
COSMIT trying to fix doc issues
DOC added min_df change to whatsnew, made more estimator names clickable.
ENH rudimentary testing of transformer objects
MISC added comment to explain SelectKBest k in common tests
COSMIT copy+paste error
ENH test that regressors can handle integer data.
ENH add ClassifierMixin with ``fit_predict`` and some tests.
COSMIT remove commented out score
DOC CountVectorizerDocstring readability
DOC Added section on issue tracker tags to development docs
ENH raise ValueError in r2_score when given only a single sample.
ENH support custom kernels on sparse matrices
ENH added low-level bail out in sparse svm
MISC use assert instead of value error.
FIX add exception, check exception, if sparse.SVC is called with kernel='precomputed'
ENH fix error by removing unnecessary test.
DOC added some comments to the sparse precomputed kernel tests.
DOC updated whatsnew with ProbabilisticPCA fix by @kuantkid
Merge pull request #1109 from buma/predict_proba_doc
FIX affinity propagation typo
DOC fixed some sphinx errors, issues in docs....
COSMIT pep8
DOC fixed reference in whatsnew
DOC added some more API changes to whatsnew
FIX removed sparse_encode_parallel
COSMIT pep8
COSMIT typo, thanks @ogrisel
DOC add people and commits to whatsnew
MISC starting 0.13 cycle
ENH more robust transformer testing.... don't ask why that came up
ENH address issue 1028: clone estimator in RFE
ENH issue 1115: grid search support for rfe via ``estimator_params``
ENH fixed bug in sparse RFECV, fixed bug in RFECV init (step was always 1), added decision_function and predict_proba to RFE and RFECV
MISC rfe outputs loss, not score
FIX typo
add y to tfidf vectorizer
WEBSITE updated logo, changed scikits-learn to scikit-learn.
ENH remove some deprecated parameters / functions /estimators
FIX remove test for deprecated parameter.
Example: added a pretty PCA 3D plot of iris, as this dataset is used in so many examples.
ENH minor example beautification
DOC fixed default value of ``compute_importance`` in DecisionTreeClassifier docstring.
DOC typo in ElasticNet docstring
DOC add isotonic regression to References (even if we move it soon), also OCD.
FIX error in error message ^^ closes #1155.
ENH fix percentile tiebreaking, add warning
DOC document attributes scores_ and pvalues_ in feature selection docstrings, some superficial cleanup.
DOC somewhat improved feature selection example
ENH in NMF only use svd initialization by default if n_components < n_features.
FIX fixed typo in code, added smoke test.
COSMIT remove unused imports
DOC added Conrad Lee's PR to whatsnew
COSMIT pep8
FIX unicode support in count vectorizer. Closes #1098.
FIX docstring for count vectorizer. Sorry about that.
COSMIT remove unused import
ENH add MinMaxScaler, #1111
ENH do normalization in single pass over data
DOC added missing docstrings
ENH rename Scaler to StandardScaler everywhere
COSMIT pep8
DOC remove sparse support from docstring as there is none. Also cosmit on docstrings.
ENH add FeatureStacker estimator
ENH add feature stacker example
COSMIT + DOC more docstrings, minor improvements
ENH implement get_feature_names
TST added tests, fix feature names.
ENH add parallel fit and transform with joblib.
ENH add transformer weights
TST add test for feature weights in feature stacker
DOC move example (there is nothing to plot) and add some text
MISC renaming FeatureStacker to FeatureUnion, adding docs
DOC added FeatureUnion to whatsnew.
ENH remove deprecated sparse SVM class from cross-validation test.
COSMIT pep8
FIX bug in pipeline.inverse_transform, improve coverage.
ENH support for string labels in Neighbors classifiers
ENH rename ``_classes`` to ``classes_``, fix outlier labeling, remove unnecessary mapping to indices.
COSMIT reuse variable name
ENH added non-regression test
COSMIT removed unused import
FIX np.unique doesn't have return_inverse keyword, use backport from utils.
ENH slightly better error message for robust covariance
enh even better error message
ENH make multi-class more robust in discovering scoring functions
ENH in all_estimators, skip testing modules. They have dummies.
TST improve test-coverage in base, remove unreachable code-path
COSMIT pep8
DOC added whatsnew entry for mutual info fix and faster confusion matrix.
ENH rename k to n_folds and n_bootstraps to n_iterations
DOC cleanup some docstrings (not scipy standard yet)
ENH set n_fold default to 3, rename k to n_fold in all doctests, docs, and examples
COSMIT rename n_iterations to n_iter in cross_validation
MISC renamed n_iterations to n_iter in all other places.
DOC added changes / renames to whatsnew
ENH rewrite K-Means m-step as loop over samples, cythonize.
ENH separate sparse and dense case, cythonize further.
ENH fix int type in kmeans
ENH fix kmeans for old numpy (bincount minlength)
FIX also the other function in kmeans. whoops
FIX bincount mess I made in kmeans.
ENH rename rho to l1_ratio in ElasticNet and friends
ENH rename rho in SGD
ENH address @agramfort's comments, fix some doctests
DOC add changes to whatsnew.
ENH simplify as suggested by @larsmans.
FIX for len(result) > minlength
DOC tried to clarify meaning of l1_ratio in whatsnew
ENH remove some unreachable code from gridsearch
ENH sparse matrix support in randomized logistic regression
FIX doctests for max_iter
FIX two more docstrings. Sorry.
FIX seed liblinear using srand. Fixes issue #919.
ENH add random seed to logistic regression
ENH don't use deprecated interface in PPCA & cosmit
REL put myself as contact / maintainer, fixed url
FIX rebase mishap
DOC small example / doctest for kernel approximation.
DOC typo in whatsnew
DOC more typos in whatsnew.
ENH use the numbers module introduced in Python 2.6 to check for number types.
ENH added OneHotEncoder
DOC minor fixes / typos. Thanks @larsmans.
ENH user-specified dtype, defaults to np.float, nicer numpy stuff :)
TST skip test in common_tests, reach 100% coverage on new code.
DOC more typos omg. comment about automatically inferring maximum values.
ENH better example.
enh masking out zero features
TST fixed doctests, added more tests. Still 100% line coverage :)
ENH removed ``remove_zeros`` parameter.
DOC more extensive classifier comparison on synthetic datasets
ENH more noise, cross-validated parameters.
ENH train/test split, plot accuracy, make plot pretty.
ENH simplify circles dataset generator, make classes balanced.
FIX typo in dataset generation
ENH I'm more happy with the last example now....
FIX adjust gamma in kernelPCA tests to fit slightly modified circles with balanced classes.
FIX HMM test failures
ENH used asarray to avoid copy
COSMIT pep8
enh: add code analysis target to makefile
FIX small bug in feature selection error message.
COSMIT do less deprecated things.
FIX revert useless change.
DOC warn about parallel k-means on OS X.
ENH minor improvements in testing, new utility function for safely setting random states.
FIX cross_val_score now honors ``_pairwise``
DOC added my last PR (cross_val_score fix) to whatsnew
WEB color fix for link headlines
DOC document callable kernels in SVM docstring.
DOC add user guide for MinMaxScaler
COSMIT in mean shift docs
FIX hotfix for NMF sparsity problem
FIX dirty fix for expected mutual info in cython.
ENH added OneHotEncoder
DOC minor fixes / typos. Thanks @larsmans.
ENH user-specified dtype, defaults to np.float, nicer numpy stuff :)
TST skip test in common_tests, reach 100% coverage on new code.
DOC more typos omg. comment about automatically inferring maximum values.
ENH better example.
enh masking out zero features
TST fixed doctests, added more tests. Still 100% line coverage :)
ENH removed ``remove_zeros`` parameter.
Merge branch 'larsmans_pr' into one_hot_encoder
COSMIT pep8
DOC corrected whatsnew.rst. Thanks @ogrisel.
ENH check in all classifiers in fit and predict that input is finite. inspired by #1027.
ENH add checks for clustering, regressors, transformers
FIX revert old behavior, all tests work :)
MISC address Gael's comments
DOC added comment about default for n_nonzero_coefs.
COSMIT pep8
ENH added check for non-negative input.
Merge pull request #1279 from amueller/one_hot_encoder
FIX don't use pl.subplots.
ENH adding "apply" to random forests
ENH add RandomHashingForest estimator.
ENH added docs, example and tests.
DOC Some narrative documentation for Random Forest Hashing.
FIX for sparse matrix in RandomForestHasher
ENH refactor inheritance structure.
ENH use random regression task to avoid memory overhead of n_sample classes.
ENH Added Example
DOC added references
MISC renamed RandomForestHasher to RandomForestEmbedding
MISC don't use pl.subplots, fix typo
MISC rename plot_random_forest_hasher to plot_random_forest_embedding
ENH fix plot in docs. thanks @ppret.
DOC forgotten rename
DOC fixed links in whatsnew.
DOC added dump_svmlight_file to the references
DOC improve MinMaxScaler narrative docs.
DOC added new precision_recall_curve to whatsnew
DOC fix some layout on the "presentations" page, add Jake's recent PyData NYC tutorial.
MISC rename RandomForestEmbedding to RandomTreesEmbedding
COSMIT don't do deprecated things in test (hmm)
COSMIT pep8, removing unused imports and recommend ``toarray`` instead of ``todense``
ENH make sparse svm test more robust, catch warning on deprecated class
ENH use blobs instead of iris in the common classifier tests. Iris has duplicate datapoints, which raise annoying neighbors warnings.
ENH slight cleanup in common tests, less warnings.
ENH Check what ``__init__`` does in test_common
FIX messed up memorizing gmms parameter in GMMHMM before.
DOC added comment to test.
DOC explain what the test is doing.
ENH add chi2 and exponentiated chi2 kernel.
FIX add generated c file
DOC add chi2_kernel and exponential_chi2_kernel references.
TST added a test for chi2 and exponential chi2 kernel.
FIX input validation, test chi2 in pairwise function, add reference.
ENH fused types for chi2_kernel
ENH renamed chi2 to additive_chi2 and exponential_chi2 to chi2, as usually the exponential version is meant with "chi2"
DOC updated whatsnew
DOC cleared up difference to AdditiveChi2Sampler, added some "see also"s
DOC added stuff about chi2 kernel to narrative docs
FIX typo bug, more tests. Still more tests coming right up!
DOC added "precomputed" variant to docs.
TST 100% line coverage
ENH explicit check for zero denominator
ENH address @ogrisel's comments.
ENH addressed @kuantkid's comments. Also add myself to pairwise.py authors.
FIX import assert_greater from testing module
FIX csr conversion in amg code in spectral embedding
Merge pull request #1428 from tnunes/feature_union_fit_transform
ENH cleanup tests, lower tolerance
COSMIT pep8
FIX and test deprecated import of spectral_embedding from cluster
TST better test-coverage in clustering module
COSMIT in cross-validation tests
FIX random state in test by @briancheung. Thanks
TST better coverage in dict learning and cross validation
TST better coverage in preprocessing module
DOC add matplotlib version requirement, rephrase
COSMIT Mean Shift docs.
Merge pull request #1441 from kuantkid/fix_spectral_test
COSMIT some fixes in whatsnew rst
ENH Nystroem kernel approximation
ENH renamed class NystromKernelApproximation to Nystrom (it is in the kernel_approximation module). Also improvements to example docstring
DOC docstrings for Nystroem.
ENH cosmit, gamma defaults to None, not 0. address some of @mblondel's comments.
ENH tests for Nystrom, check that n_components is smaller than n_samples.
DOC narrative doc for Nystroem.
DOC updated whatsnew with Nystroem.
ENH don't import * in utils __init__.py
TST better coverage for GridSearchCV, test unsupervised algorithm.
TST better test-coverage for image patch extraction.
TST better coverage in kernel_approximation
ENH input validation only in ``fit`` in LassoLarsIC, check that error is raised.
TST document and test verbosity parameter of lars_path
TST some more tests for SGDClassifier input validation
ENH / TST better coverage of supervised clustering metrics, slight cleanup
DOC make unit test requirements a bit stricter. 80% is sub-par with current code-base
COSMIT pep8
COSMIT renaming chunk_size to batch_size in MiniBatchDictionaryLearning and MiniBatchSparsePCA
DOC add rename to whatsnew
cosmit pep8
FIX GridSearchCV on lists that I broke in 8b3e4d06c05ac82130176161404f0434b74fe2c7
ENH added test, started on cross_val_score
ENH allow lists in check_arrays
ENH make cross_val_score work, some refactoring in GridSearchCV
ENH consistency: stuff is not an array if it doesn't have ``shape``.
TST GridSearchCV raises ValueError when precomputed kernels are not matrices.
ENH Simplify estimator type checking in GridSearchCV.
FIX don't use assert_allclose. It is not supported in numpy 1.3
COSMIT pep8
COSMIT featuers -> features typo
COSMIT PEP8
COSMIT pep8
DOC add version when setting parameters in fit will be removed to docstring
FIX typo / bug in test_common that ignored the first init parameter.
TST make test more stable.
ENH slight improvement of common tests.
DOC slight cosmit in metrics docstrings.
FIX i should trust my past self a bit more
ENH use an array instead of a dict in RFECV
Cosmit pep8
TST a little more coverage in unsupervised metrics.
ENH clean up redundant code in pairwise
ENH more test coverage in pairwise distances
FIX more robust test for silhouette score
DOC classifier comparison: plot data without decision boundary first, better (imho) color scheme.
DOC add Nystroem kernel approximation to the references
FIX stupid mistake
COSMIT pep8
COSMIT Typo
COSMIT update warning, pep8
ENH refactoring class weights for SVM and SGD
TST all classifiers now have "classes_". adjust test_common.
ENH remove class_weight_label from python side of SVM
ENH remove class_weight_label from sparse svm
TST move test of "classes_" to the appropriate test in "test_common".
FIX remaining doctests
DOC docstring for compute_class_weight
ENH remove class_weight_label from LibLinear python side.
ENH removed unused old function
TST fix import in test
ENH addressed @ogrisel's comments.
DOC changed docstring to be more clear.
ENH documented changes for SVC classes_ changes.
ENH move utility function into dedicated file, not __init__.py
TST start on testing consistent class weights
ENH nu-SVC doesn't support class_weights
FIX liblinear class weight in binary case, robust testing.
cosmit whitespace
DOC add comment in liblinear
TST better test for class weights (that actually tests something)
ENH test automatic setting of class weights in common test
TST skip RidgeClassifier in class weight test for the moment
DOC added fix to whatsnew.
FIX don't test auto in ridge classifier as it is not supported currently
FIX tests for auto class weights
DOC more concrete whatsnew
FIX skip tests for naive bayes for the moment.
DOC made myself contact for authors, changed my website to blog.
TST add cosine_kernel to kernel tests, pep8
ENH lazy import of metrics in base, not preprocessing in metrics.
ENH document attributes in QDA and LDA, rename to adhere to sklearn conventions.
DOC fix shape of coef_ for LDA.
TST somewhat hacky fix for tests on image loading.
ENH more logical behavior, better docstring, tests
FIX do checks even if allow_lists
DOC try to be as clear as possible.
ENH cleanup in check_pairwise_arrays, raise error on sparse input in chi2_kernel and manhattan_distance
COSMIT doc formatting
DOC updated whatsnew
ENH added class_weight to Naive Bayes docs.
FIX random seed in FastICA testing.
DOC fix docstring of GMM
ENH rename proximity to dissimilarity
ENH common test that set_params returns self.
COSMIT remove empty file
DOC more accurate comment in class weight computation
FIX make sure laplacian in spectral clustering test is really PSD
DOC add recall_score to new classification metrics listing
DOC document gamma in chi2_kernel.
TST add common test to check if estimators overwrite their init params
ENH use only a few samples in test.
FIX in tree and ensemble: don't overwrite random state in fit.
FIX don't overwrite random_state in fit in EllipticEnvelope
FIX don't modify random_state in clustering algorithms.
ENH make code more clear: MiniBatchKMeans only uses random_state in first run of partial_fit.
FIX in ward: don't overwrite n_components.
FIX remaining parameter issues in GradientBoosting
TST took the safety off the tests ;)
Merge pull request #1582 from ApproximateIdentity/doc-n_jobs-parallel
DOC some sphinx fixes
DOC fix in mds example (new interface)
DOC mds example: suppress warning for explicit initialization
DOC don't use deprecated parameter rho in the lasso / enet examples
COSMIT typos in hierarchical clustering warning
DOC more sphinx fixes
EXAMPLE don't use deprecated interface in lasso model selection
COSMIT pep8 in examples
COSMIT pep8
DOC more sphinx fixes
FIX sort indices in CSR matrix for SVM
TST add regression tests for Alex' fix.
ENH rename cosine_kernel to cosine_similarity. Also make the test actually do something.
DOC fixed problem in citations in spectral_embedding
COSMIT typos
ENH don't use deprecated class_prior fit parameter for NB in test
ENH in spectral_embedding: do input validation before anything else
TST in testing deprecated load_filenames catch deprecation warning
TST catch expected warning in sparse coordinate descent test.
DOC cosmit fix column span alignment errors.
FIX example uses old parameter name
COMPatibility more careful deprecation of mode and k in SpectralClustering
COMP more careful deprecation of seed in SGDClassifier
COMP add deprecated property rho to ElasticNet
COMP keep seed as init parameter of Perceptron, only deprecate
COMP add deprecated ``labels_`` property to LinearSVC
FIX deprecated properties in ElasticNet
COMP in SVC rename self.label_ to self._label (it is redundant now but I don't want to refactor the rest of the day) and add a deprecated property label_, that points to classes_.
FIX in Perceptron and doctest
FIX in common tests: don't test init parameters that are deprecated. They might be changed.
FIX some doctests for SGD
COSMIT typo thanks @jaquesgrobler
ENH don't return deprecated parameters by get_params.
FIX typo in spectral clustering deprecation
TST catch deprecation warning when testing SVC label_ attribute, also test new classes_ attribute.
DOC reorganized whatsnew a bit, put new estimators on top.
DOC added user guide links to all estimators on the whatsnew page
DOC some more fixes for whatsnew
EXAMPLES add header to hash_vs_dict_vectorizer.py - otherwise it won't show in the html docs.
COSMIT pep8
ENH undo renaming of class_prior to class_weight in naive bayes
Merge pull request #1529 from vene/lgamma_port
DOC some more minor fixes to syntax / links
DOC fix indentation typo
DOC added commit counts for 0.13 to whatsnew, added website for Rob Zinkov aka zxtx
COSMIT pep8
DOC updated commit counts.
REL change version to 0.14-git everywhere, update news, support page.
website: fix for survey bar
COSMIT remove unused imports, pep8
TST some more tests for multi output lars
DOC fix typo in LinearSVC error message
FIX make error message work when return_path=False. Btw I feel that getting "references" for numbers out of numpy arrays is pretty ugly.
TST fix random states in all dict learning tests, make test independent of test sequence.
Revert "trying travis cfg with system-site-packages"
COSMIT pep8
DOC add return values of cross_val_score and train_test_split to docstrings.
ENH added test, started on cross_val_score
ENH adding SomeScore objects for better (?!) grid search interface.
ENH refactor, taking @GaelVaroquaux's and @ogrisel's suggestions into account
ENH deprecated ``score_func``, introduced ``score`` parameter in GridSearchCV
TST test giving score as string in GridSearchCV
FIX rename ``score`` to ``scoring`` because of the name-clash with the ``score`` function.
FIX two score objects, adjust tests to new interface
ENH remove old interface completely from tests.
DOC fix docstring
ENH working on cross_val_score, trying to simplify unsupervised treatment.
ENH better testing of old and new interface. Still a bit to do for unsupervised grid search, though.
FIX usage of scores for unsupervised algorithms.
ENH use new api in permutation_test_score, don't use old api in testing.
ENH fbeta score working, more tests
DOC-string for AsScorer
ENH renamed ap and auc, added RecallScorrer
DOC narrative docs for scoring functions. Put them next to GridSearchCV. Should they go into metrics?
ENH update example, minor fix.
DOC improve cross validation and grid search docstring
FIX rename error
DOC add whatsnew entry
DOC fixed formatting in user guide
FIX example
DOC added a new template to sphinx to view the "__call__" function.
COSMIT address @ogrisel's comment.
FIX rename ZeroOneScorer to AccuracyScorer
DOCFIX for zero_one_score / accuracy_score renaming
DOC add narrative about score func objects to the model_evaluation docs.
ENH rename scorer objects to lowercase as they are instances, not classes
DOC minor fixes in pairwise docs.
ENH/DOC add "score_objects" function for documenting the score object dict.
DOC add metrics.score_objects to the references
DOC use table from score_functions docstring in model_evaluation narrative.
DOC move scoring function narrative above dummy estimators, fix tables, some refinement.
DOC minor fixes in score_objects documentation.
DOC better table of score functions in grid-search docs.
ENH GridSearchCV and cross_val_score check whether the returned score is actually a number, not an array (otherwise cross_val_score returns bogus).
TST improve coverage of permutation test scores
TST slightly better test coverage in cross_val_score
COSMIT built-in typo
DOC some improvements as suggested by @ogrisel
TST add test for pickling custom scorer objects
DOC more improvements by @ogrisel
COSMIT rename AsScorer to Scorer
MISC moved score_objects.py to scorer.py, added module level doc string and license note.
DOC add kwargs in Scorer to docstring.
ENH add ``__repr__`` to Scorer
DOC addressed @ogrisel's comments.
COSMIT text reflow
MISC pep8: rename scorers to SCORERS, remove score_objects getter
DOC remove duplicate table, add references to appropriate user guide section to docstrings of cross_val_score, GridSearchCV and permutation_test_score
DOC add note on deprecation of score_func to whatsnew
FIX imports for Scorer and SCORERS
DOC fixes in whatsnew, typo
TST smoke test repr
COSMIT removed unused imports, fixed error message in test of boosting
ENH break ties in OvO using scores
TST test for breaking OVO ties
COSMIT pep8
ENH get rid of imports in test_common by checking by names, not classes.
ENH fix test_estimators_overwrite_params to also test regressors and transformers. Then fix all the regressors and transformers ... meh!
ENH set the random state to avoid heisenfailures
COSMIT pep8, removing unused imports
FIX remove dtype from covertype, add fetch_covtype to init, add missing docstrings.
FIX doctest kernelpca
ENH get rid of most imports in test_common
TST stronger tests for arbitrary classes. make explicit what works and what doesn't.
FIX rebasing trouble in common tests: the meaning of dont_test changed
FIX don't compare strings with "is". that is really not robust!
ENH in transformer pickle test, only test transformers that provide a 'transform' method. and only test that.
ENH in common tests, use long variable names for all tests
FIX remove all unseeded random variables from common tests.
Merge pull request #1695 from mrjbq7/issue-1694
COSMIT pep8: blank line contains whitespace
DOC added sentence about oob_decision_function_ containing NaN to docstring. Still need some narrative about oob score.
DOC add 0.13.1 changelog to whats_new.rst
DOC add random_state parameter to docs of LogisticRegression and LinearSVC
TST/FIX set random_state in logistic regression tests
TST/FIX always use "almost equal" for floats.
FIX MinMaxScaler bug.
TST FIX random state for LibLinear sparse tests
ENH add randomized hyperparameter optimization
DOC fixed links in whatsnew
Merge pull request #1736 from jamestwebber/patch-1
Merge pull request #1740 from tjanez/move_roc_curve_test
COSMIT pep8
DOC FIX links on grid search narrative
FIX compute_class_weight edge case
DOC some sphinx / rst fixes
MISC minor fixes in examples
DOC FIX column span alignment problem in NMF ^^
COSMIT typo
DOC fixing some more rst / sphinx errors :-/
DOC more sphinx stuff.
Merge pull request #1767 from rmcgibbo/balltree_docstring
DOC add roll your own estimator docs
FIX for iid weighting in grid-search
DOC FIX finite precision
COSMIT pep8
DOC correct / simplify dbscan example
COSMIT typo. the French again ;)
FIX setting k in KMeans and MiniBatchKMeans was silently ignored. Left over in 07c56d7cd2ddfe71e7a4399d74fc367d6000d854 Damn, that was nasty :-/
COSMIT pep8
FIX jenkins error on numpy 1.3.0
DOC documented n_init parameter of MiniBatchKMeans. Closes #1900.
FIX broken scorer, add non-regression test.
FIX WARN about **params being not used in GridSearchCV.fit. Closes #1815.
FIX bug in callable kernel decision function - Sorry, I think that was me.
FIX test error in test common for KernelPCA that doesn't respect its n_components.
FIX typo in test for RidgeCV
DOC typo in RandomizedSearchCV docstring
DOC fetch_20newsgroups returns the text, not text files. see SO question: http://stackoverflow.com/questions/16615523/using-scikits-kmeans-to-cluster-ones-own-documents
DOC Fixed documentation of kernel parameters: sigm uses gamma, but not degree. Closes #1972.
DOC clarification in Scoring objects: It's not a good sign if I don't understand my own wording.
DOC much more readable formula in chi2 kernel doc
COSMIT sphinx fixes
COSMIT pep8
DOC FIX typo on fbeta, closes #2219
fix whitespace around new tree.pyx docstring
use new virtualenv features of travis, so we don't have to kill the virtualenv
FIX hopefully fixing travis.
FIX hopefully fixing travis.
DOC improve svm sample weight example
DOC improve documentation of sample_weight, add to docstring.
TST small improvement of test for sample weight in svm
cosmit typo
Show 95% confidence interval, not 40% confidence ^^
FIX whoops sorry!
fix pycharm file ending
ENH add "make_y_1d" to utils, use it in estimators where needed.
fix make ``make_y_1d`` safe for lists.
use column_or_1d, move it to utils
ENH rename eval / pseudolikelihood to score_samples
fixing ridge and label binarizer... I'm pretty sure that worked before?
FIX make neighbors y prediction shape consistent
TST add regression test for label_binarizer
FIX/ENH make StandardScaler convert int input to float and warn about it, instead of warning and rounding for dense and crashing for sparse.
DOC adjust docstring as suggested by @gvaroquaux
addressing @ogrisel's comments: catch warnings in test, no unneeded digits
COSMIT fixing some unused imports, adding stuff to __all__, and light pep8 (not all whitespace to make rebasing less painful)
DOC fixing some sphinx stuff.
more sphinx fixes
first try at bootstrap-based website
"fix" sidebar stuff - this was not my idea
remove gray boxes around h3 on the two new pages
put banner into header, make it spread over whole page
Fix link to flowchart, add text descriptions.
Minor fixes in front-page text, css
rework front-page box texts
fix typo, missing p
fix and refine some css and html tags
add example banner image
add section, estimator and model links on the frontpage
fix styling of rst links
add links for examples
fix css that I just broke with the sphinx links
flatten the tutorial / doc structure as proposed by @ogrisel
add js for collapsible toc tree in the user guide.
minor typo thing
don't have old version warning on install, as that will be shared across all versions.
added "show source" link to footer, made dimensionality reduction examples link to decomposition
slightly hackish way of inserting a whatsnew link. I really don't want all the sphinx containers here, though. Asked on stackoverflow about it btw.
a little less ugly footer. @glouppe should maybe have a look ;)
make links to old versions actually do something (currently link to the user guide as the other versions are not rebuilt yet).
replaced lorem ipsum in news. still a draft but whatever.
nicer dates
Try to raise and test warnings.
DOC added website to whatsnew, added link to github for Nelle
FIX don't use old API in examples
more fixes for docs, deprecated interfaces
FIX made the building of the docs slightly more robust. readme files in folders without examples kill it otherwise.
try to fix the toctree in a semi-meaningful way.
DOC/EXAMPLES fix more documentation errors, deprecated api usages.
EXAMPLES remove non-existing example from doc, don't trigger deprecated interface in enet_path, lasso_path
much better input validation, test that warning is raised on (n_samples, 1) y
rearrange permutation_score parameters to match previous ones.
DOC add link to fetch_covertype to covertype narrative docs
WEBSITE add yhat testimonial
ENH make sure all "SkipTest" calls have an error message.
don't import tests
FIX don't raise file-level SkipTests
FIX website: css hiccup
DOC explanatory sentence for svm grid search example
Website: add spotify testimonials
ENH make minibatch k-means use batches in prediction and computing labels
ENH use nosetests yield construct for better error reporting in common tests.
enh give yielded tests nicer names
FIX remove ``.description`` from test generators as this is not thread-safe.
ENH Define fused types so only two functions are generated.
WEBSITE FIX spotify logo!
MISC remove accidentally committed file. Whoops!
DOC make loss function in SGD consistent with subgradient. Comment by Martin Jaggi :)
MAINT remove _label attribute from SVC.
Add FAQ to docs.
removed sklearn, removed statsmodel, capitalization in SciPy
Add GPU support (there is none and will probably be none) to faq.
DOC faq formulation
Minor fixes in the docs
Fix some fun column span alignment errors.
removed unused variables and imports, add used imports where missing.
COSMIT TYPO
Stop using some deprecated interfaces.
TST add test for minibatch k-means reassigns.
FIX bug where random state was reset in each call to partial_fit, such that reassignment either occurred never or always.
FIX bug where distances were of illegal shape
FIX crash when to_reassign.sum() > X.shape[0]
TST split test for partial_fit and fit
TST add additional test for batch_size > n_samples in fit.
MISC rename random.pyx to _random.pyx
ENH backport np.random.choice.
FIX broken tests for minibatch k-means
FIX use choice in minibatch_kmeans
FIX another (two?) bugs: the default code path didn't ever compute distances.
skip doctests
simplify tests
remove redundant code
FIX up the rest of the stuff. Introduce .5 as a magic number. Hurray for magic numbers?
DOC more / better documentation
FIX reset counts of reassigned clusters.
ENH never reassign everything, more meaningful tests.
[DOC] minor fixes
ENH allow y to be a list in GridSearchCV, cross_val_score and train_test_split.
add test for safe_indexing, add another test for cross_val_score
BUG: Support array interface
Catch ConvergenceWarning in RandomizedL1
ENH rename parameters in MockListClassifiers.
move around examples for better structure.
Add related project to website.
ENH don't convert dataframes in grid search and cross-validation
split test_common.py into checks and test file.
Closes #2360. Fix tiebreaking.
some cleanups in common_test, speedup.
Make fit_transform and fit().transform() equivalent in nmf
slight speedup
Make everything accept lists as input.
ENH Make all decision functions have the same shape. Fixes SVC and GradientBoosting. Closes #1490.
Refactor input validation.
remove check_arrays stuff and old input validation
ENH add allowed_sparse named argument for @ogrisel
Merge pull request #3447 from amueller/input_validation_b
DOC Add solido testimonial
Update feature_stacker.py
FIX in KDE fit and score allow y=None, add test for use with pipeline and gridsearch
Fix using ground truth in silhouette score in document clustering. Closes #3806.
DOC explain SelectorMixin strategy with multiple classes.
DOC Add note about sample weights in export_graphviz.
FIX raise error when number of features changes in partial fit, Add common test for inconsistent number of features for partial fit.
DOC RidgeCV more explicit documentation of CV parameter, taken from GridSearchCV.
Remove unused distance calculation from clustering example.
DOC FIX latex error in multiclass hinge loss
Update documentation.rst
WEB fix css for new version of sphinx.
Use plot_gallery=0 in makefile for sphinx >1.2.3 Also works with 1.2.3
Merge pull request #3892 from mrshu/mrshu/naive-bayes-small-fixes
Merge pull request #3921 from trevorstephens/docstring-format-fixes
DOC add missing import to plot_underfitting_overfitting.py, fix plot height.
Merge branch 'pr/3914'
FIX random state in isomap test
Allow list of strings for type_filter in all_estimators.
DOC explain t0 in the docstring (currently somewhat confusing)
Merge pull request #3955 from hammer/patch-1
remove _iff_attr_has_method function
don't give names to decorators
FIX backward compatibility.
Merge pull request #3523 from cle1109/slda
make attribute descriptor have the right docstring.
add regression test for docstring delegation
fix docstrings in sphinx
rename decorator
Fix typos, no shortenings in name, simplify example.
TST add tests for nd grid-search and train_test_split
remove deprecated ``n_bootstraps`` parameter in Bootstrap cv object, verbose in FactorAnalysis and SelectorMixin. Also adjust some deprecation strings.
remove deprecation tests.
Add parameters to pipeline methods, add some missing docstrings.
If scoring is provided, it MUST be used. Also, a scorer might not need predict.
tighten tests and adjust for new logic.
add some nonregression tests.
add fixes to whatsnew
remove bad estimator test as it was enforcing too many restrictions, instead test that we don't enforce these restrictions.
make gridsearch_no_predict test stricter.
add another bug that I just discovered existed and was fixed in the PR to whatsnew.
address @jnothman's comments
fix docstring for pipeline score func.
Merge pull request #4049 from akshayah3/sparse
Merge pull request #4054 from tttthomasssss/bugfix-4051-affinity-prop
Say best_estimator_ depends on refit=True. Fixes #2976.
Merge pull request #4078 from pratapvardhan/misc-docs
TST slight cleanup of common tests.
Merge pull request #4075 from banilo/pipe_typo
Merge pull request #4089 from wujiang/patch-1
Merge pull request #3640 from untom/fix_consensus_score
does not compute
make related packages more prominent
Merge pull request #3562 from arjoly/bench-mnist
Make ParameterSampler sample without replacement if all parameters are given as lists.
TYPO in model evaluation docs
TYPO in coverage_area docstring.
COSMIT spelling
DOC minor fixes to docstrings, don't document deprecated parameters.
TST add test for sparse matrix handling in clustering
Merge pull request #4052 from amueller/clustering_sparse_matrix
TST set random state in check_classifiers_train
remove duplicated test, probably caused by rebase issues.
Explain why we are somewhat selective, lower citation rule of thumb
FIX check for integer 1 before checking for floating 1 (isinstance(1, numbers.Real) == True)
Merge pull request #3836 from untom/fixedsplit
add PredefinedSplit to classes.rst and whats_new.rst
Added input validation refactoring from last sprint to whatsnew. Not patting my back, just keep forgetting if that was in 0.15.2 or not.
DOC staged_* returns generators. Fixes #3831.
TST remove tempfiles that we create in svmlight tests and agglomerative clustering tests
Merge pull request #4155 from trevorstephens/nb-fix-3186
Add test for MSE estimate bug.
sort labels in precision_recall_fscore_support
search C not gamma of linear svm in tutorial
Make nystroem approximation robust to singular kernel.
TST Silence some tests.
minor fixes, addressing @agramfort's comments.
website: add skll to related projects
Merge pull request #4201 from amueller/related_projects_skll
Add test that score takes y, fix KMeans, FIX pipeline compatibility of clustering algorithms!
Explain why we multiply mean by two
Better error messages in MeanShift, slightly more robust to bad binning.
Merge pull request #4176 from amueller/mean_shift_no_centers
Merge pull request #4221 from saketkc/fixes
Merge pull request #4232 from lesteve/use-absolute-imports-in-tests
Fix gibbs sampling behavior in RBM with integral random_state.
FIX check (and enforce) that estimators can accept different dtypes.
fixes in GMM, TSNE, MDS, LSHForest, exclude SpectralEmbedding
DOC minor improvement in Ensemble user guide
Merge pull request #24 from ogrisel/fix-nan-1d-regularized-covariance
Merge pull request #4136 from amueller/test_dtypes_all
Merge pull request #3891 from ragv/decision_function_ovo_1523
make check_array convert object to float.
more robust check for dtype object
extensive (excessive?) testing of FDR
raise noise in test again, be ascii
ENH make defensive copies in GradientBoosting*.staged_decision_function.
Merge pull request #4165 from amueller/gbrt_staged_defensive_copies
Merge pull request #4146 from amueller/fdr_treshold_bug_2
special case if dist is object in _get_weights
whatsnew entry for KNeighbors zero division in weights.
Merge pull request #4269 from MechCoder/deprecate_n_components
Merge pull request #4092 from ogrisel/iris-rbf-params-heatmap-example
Merge pull request #4293 from kyleabeauchamp/patch-1
Make BaggingClassifier use if_delegate_has_method in decision_function
remove special case for BaggingEstimator after merge of #4137
TST make TheilSenRegressor run faster in common tests, I could swear I did that before...
Fix docstring of mahalanobis distance in empirical covariance. Closes #4168.
Merge pull request #4310 from larsmans/svm-bounds-l2
skip OMPCV on travis by raising SkipTest in set_fast_parameters.
some fixes for sphinx and in examples
don't use unicode in latex. That just complicates things too much. Also, who the hell uses unicode whitespace?
use PNG images for latex compatibility
Merge pull request #4314 from vortex-ape/RBFSampler
catch some warnings, be less verbose in testing.
Merge pull request #4320 from amueller/minor_doc_fixes
Generate example rst files for inclusion in docstrings.
Merge pull request #4323 from amueller/generate_empty_example_rsts
minor fixes in docs.
change default to shuffle=True in SGDClassifier and friends.
Merge pull request #3965 from amueller/sgd_classifier_shuffle
Merge pull request #4325 from amueller/more_minor_doc_fixes
Merge pull request #4234 from amueller/rbm_random_state_gibbs
Merge pull request #4338 from lesteve/fix-typo-in-pyamg-skip-test-message
Implement "secondary" tie strategy in isotonic.
Merge pull request #4345 from bendavies/20newsgroups_example
add scipy2013 tutorial links to presentations on website.
Merge pull request #4339 from amueller/scipy2013_lecture_link
Merge pull request #4342 from sotte/fix_contributing_format
Fix rebase conflict
Fix #4351. Rendering of docs in MinMaxScaler.
Merge pull request #4352 from amueller/issue-4297-infinite-isotonic_bak
Move dev branch to 0.17
add appveyor badge to alert us of failures
Merge pull request #4363 from arimbr/patch-1
add whatsnew entry for GaussianNB sample weights
make landscape.io much more useful
Merge pull request #4331 from martin0258/uni-feature-doc-seealso
Merge pull request #4335 from martin0258/dpgmm-lowerbound-bug
Merge pull request #4378 from dan-blanchard/patch-1
remove some style errors in bench code.
Merge pull request #4381 from amueller/style_errors_b
Merge pull request #4383 from saketkc/maint
Pass the appropriate include_self argument to kneighbors_graph everywhere.
DOC make defaults more explicit in text feature extraction.
Merge pull request #4392 from sinhrks/setparams
Merge pull request #4390 from Barmaley-exe/tests-fix
fix ompcv on old scipy versions
remove deprecated stuff from 0.17
remove cross_validation.Bootstrap
Merge pull request #4411 from ragv/maint_remove_n_iterations
Merge pull request #4412 from ragv/maint_remove_n_iterations_2
avoid NaN in CCA
FIX Minor shape fixes in PLS
make cross_decomposition pass common tests
deal with y lists
DOC fix n_jobs docs in KMeans as in 0a611193b12900dbc11f3dae4448809364161bb2.
DOC fix typo in output shape of fetch_lfw_pairs (and minor additions)
FIX LDA(solver="lsqr"): make sure the right error is raised on transform.
DOC fix link description to setuptools in devdocs.
COSMIT missing newlines in metrics test
Merge pull request #4436 from ogrisel/rebased-pr-3747
fix setting description in 20newsgroups
cleaned up input validation, added tests for copy keyword.
ENH minor fixes / simplifications for PLS input validation, add test for error.
DOC minor fixes in formatting, don't use deprecated n_components in Agglomerative
DOC/MAINT remove deleted cluster.Ward from references.
Merge pull request #4451 from ragv/travis_ignore_docstring
Fixes to sparse decision_function, rebase fixes.
DOC added highlights to whatsnew
DOC / website add 0.16 release to "news".
Merge pull request #4462 from xuewei4d/deprecate_estimator_params_docstring
Merge pull request #4471 from baharev/small-typo
FIX don't raise memory error in ledoit wolf
Merge pull request #4428 from amueller/ledoit_wolf_shrinkage_fix
Add tags to classifiers and regressors to identify them as such.
COSMIT use consistent shape description in docstring.
Merge pull request #4464 from vortex-ape/logregcv
Merge pull request #4456 from cgohlke/patch-1
Merge pull request #4356 from vortex-ape/dict_vectorizer
Merge pull request #4484 from vortex-ape/remove_array2d
Merge pull request #4496 from vmichel/rfe_feature_importances
DOC adding clusterer tag to dev docs.
Merge pull request #4418 from amueller/classifier_regressor_tags
DOC try to clarify pairwise_distances docstring for Y.
DOC fix link in text extraction narrative
Merge pull request #4506 from mheilman/master
DOC GradientBoostingX.estimators_ docstring shape.
FIX make CalibratedClassifierCV deterministic by default.
Merge pull request #4537 from amueller/callibrated_classifier_cv_linearsvc_seed
FIX be robust to columns name dtype and also to dataframes that hold dtype=object.
Merge pull request #4531 from zhaipro/fix
Merge pull request #4561 from mspacek/patch-1
don't to input validation in each tree for RandomForest.predict
Misc trying to make k_neighbors warning even more explicit (before it didn't say that %d was n_samples)/
add 0.16.1 bugfixes to whatsnew.
Merge pull request #4467 from giantoak/param_invariance
Merge pull request #4534 from xuewei4d/optimize_rfecv
DOC fix libsvm docstring consistency
Merge pull request #4636 from scls19fr/scls19fr-Update_confusion_matrix_docstring
Merge pull request #4631 from Aerlinger/documentation_improvements
ENH add high-level estimator-validation function
check output, not error on sparse data handling
fix warning message by not using assert_raise_message, which adds escaping.
add comments to tests
FIX make backport of assert_raises_regex raise the same error as the python3 version.
FIX make rfe feature importance test deterministic.
Merge pull request #4640 from dougalsutherland/patch-1
Merge pull request #4526 from vortex-ape/if_matplotlib
FIX pass percentiles to partial_dependence in plotting.
Merge pull request #4593 from clorenz7/gmm_fit_predict
Merge pull request #4659 from MAnyKey/sampler-samples
fix minor documentation link issues. looks more complicated than it is as I replaced tabs with spaces
FIX work-around for read only dataframes
Merge pull request #4678 from amueller/pandas_safe_indexing_fix
ENH make pipeline.named_steps a property, fix pipeline.named_steps doctest
DOC whatsnew for unaveraged multi-output regression metrics.
Merge pull request #4438 from mbatchkarov/stratified_train_test_split
DOC add whats_new for stratified train_test_split
DOC add random_state to parameter docstring in gradient boosting
ENH don't use mutable defaults in RidgeCV.
FIX add missing numpy imports to VotingClassifier examples, pep8 fixes.
COSMIT remove unnecessary float casting from tomography example
TST don't check that warning was raised as whether a copy needs to be made or not depends on the pandas version. Version 0.12 and 0.13 apparently don't use cython here, and >0.17 will have a fix.
FIX / TST raise error when init shape doesn't match n_clusters in KMeans, check for sensible errors.
DOC fix sample_without_replacement docstring. Closes #4699.
Merge pull request #4694 from betatim/check-fit-returns-self
Merge pull request #4702 from betatim/online-learning-batch-size
Doc sphinx fix Return->Returns in label_ranking_loss
Merge pull request #4727 from Spikhalskiy/truncated-svd-doc-T
FIX make random_choice_csc and therefor DummyClassifier deterministic.
ENH support for sparse coef_ in ovr classifier
TST test that all default arguments are not mutable
DOC some fixes for the EnsembleClassifier
Merge pull request #4645 from TomDLT/astype_fix
FIX / TST make cross_val_predict work on lists, test pass-through and compatibility with various input types.
FIX ransac output shape, add test for regressor output shapes
don't warn about multi-output deprecation if not using multi-output
Merge pull request #4743 from ogrisel/fix-check-finite-gram-omp
Merge pull request #4684 from ogrisel/readonly-lars-path
Merge pull request #4749 from ogrisel/travis-no-sudo
FIX n_jobs slicing bug in dict learning, add input validation to gen_even_slices
Merge pull request #4716 from TomDLT/notfitted
Merge pull request #4125 from untom/RobustScaler
Merge pull request #4761 from tiagofrepereira2012/fix_lle
DOC added some bug fixes to whats_new.rst
TST/COSMIT remove nose call boilerplate
Fix RFE / RFECV estimator tags
DOC Add coveralls badge to Readme
Merge pull request #4680 from amueller/pipeline_named_steps_fix
Merge pull request #4431 from vortex-ape/deprecate_func
Use more natural class_weight="auto" heuristic
FIX add class_weight="balanced_subsample" to the forests to keep backward compatibility to 0.16
Cosmit readability improvements and better whatsnew.
Merge pull request #4785 from amueller/no_noise_boilerplate
float division
Merge pull request #4786 from amueller/estimator_tags_bug
DOC adding backlinks to docstrings
Merge pull request #4723 from amueller/backlinks
DOC minor sphinx fixes
Fix deprecation of decision function in SGD
ENH refactor OVO decision function, use it in SVC for sklearn-like decision_function shape
TST set missing random state in test
ENH minor fixes to the tests, don't raise as many warnings in the test suite
Merge pull request #4819 from sornars/svcgamma
Merge pull request #4832 from kronosapiens/update-docs
Better docstring for PCA, closes #3322.
Merge pull request #4838 from trevorstephens/ridge_sw
Merge pull request #4796 from joshloyal/standard-scaler-std-doc
DOC remove linewidth in pca/ica example
Merge pull request #4854 from opahk/master
Merge pull request #4849 from jnothman/contrib-link
Merge pull request #4868 from rasbt/silhouette
Merge pull request #4870 from rasbt/decisiontreeregressor
Merge pull request #4836 from amueller/pca_docstring
Merge pull request #4947 from zyrikby/patch-1
Merge pull request #4953 from mrphilroth/issue4940
Merge pull request #4783 from tokoroten/decrease_randomforest_memory
Merge pull request #4962 from vinc456/unexpected-facial-recognition
Merge pull request #4915 from untom/minmax_scale
Merge pull request #4971 from dotsdl/issue-4724
add whatsnew entry for fix of penalty passing in LogisticRegressionCV
remove links to old docs
Make more clear that adding algorithms is not the preferred way to start contributing.
Merge pull request #5038 from jnothman/textdoc
Merge pull request #5042 from amueller/contributing_addition
fix shape of thetaL, thetaU, theta0 in gp
Merge pull request #5048 from marktab/fixlfw
Merge pull request #4965 from mrphilroth/4922
DOC fixed short underline
FIX make sure y is 1d in svm regressor, add common test for regressor shapes.
fix TheilSenRegressor input validation, LassoLarsIC input validation
added handling of 2ndim y in regressors to whatsnew.
Merge pull request #5057 from amueller/fix_2d_y_svm_regressor
Merge pull request #4775 from arthurmensch/cd_fast_readonly_array_brainfix
Merge pull request #5052 from larsmans/kappa
force y to be numeric in kernel ridge.
TST Fix: made the test for y 1d conversion warning more robust.
Merge pull request #5077 from rvraghav93/fix_import_coor_des
ENH: Renames CallableTransformer -> FunctionTransformer.
DOC added whatsnew for k-means fix
FIX be robust to obscure callables that are false.
Add common test that transformers don't change n_samples.
Fix memory access error in OneClassSVM
Merge pull request #5093 from amueller/one_class_svm_memory_fix
Merge pull request #5117 from kylerbrown/patch-1
Merge pull request #5120 from larsmans/faster-lda
DOC mention intercept_ attribute in ridge docstring.
test for accepted sparse matrices
Merge pull request #5037 from betatim/tree-feature-transform
Merge pull request #5195 from jnothman/example_latex
Merge pull request #5218 from mooz/patch-1
Merge pull request #5199 from ogrisel/joblib-0.9.0b4
Merge pull request #5200 from jackzhang84/add_dump_svmlight_file_comment
Updated whatsnew for sprint and LatentDirichletAllocation, add some missing author links
fix rebase mistake
add necessary blas files.
Minor fixes, remove transform for now.
minor fixes to the doc build
Merge pull request #5263 from hlin117/check_X_y-docs
Merge pull request #5238 from rvraghav93/cvdoc
Merge pull request #4887 from amueller/tsne_fixes
Merge pull request #4852 from TomDLT/nmf
add future warning to pipeline.inverse_transform with 1d X.
Merge pull request #5294 from trevorstephens/lda_qda_fit_params
Merge pull request #4955 from copyconstructor/preserve-vocab-ordering
Merge pull request #5349 from jakevdp/naive-bayes-scale
DOC trying to make GridSearchCV docs more accurate... not sure if better
Fixes #4455.
Merge pull request #5104 from giorgiop/scaler-partial-fit
Merge pull request #5400 from kwgoodman/doctypo
DOC formatting fixes for laplacian kernel
Fixed warnings for DataDimensionalityWarning, decision_function and decision_function_shape.
Merge pull request #5410 from GaelVaroquaux/fix_doc_gcv
Merge pull request #5398 from rvraghav93/fix_for_numpy_10
Merge pull request #5409 from lesteve/fix-plot-lle-digits
MAINT don't print things in testing.
FIX Don't compare arrays to strings!!!!
DOC cosmit I find it confusing to say that fit resets the estimator, at it always does that.
FIX skip LDA deprecation test on python3.3 that has no reload.
FIX class_weight in LogisticRegression and LogisticRegressionCV
BF: FIX OvR decision_function_shape in SVC
undo change by @arthurmensch that was probably accidental ;)
TST make tsne tests 32bit save
TST catch warnings in tests that the solver is changed to SAG in the sparse case.
DOC polish documentation of output types of train_test_split, add change to 0.16 whatsnew.
COSMIT don't use the deprecated residuals property of ols
TST close /dev/null in the theil sen tests.
MAINT catch warning for deprecated allow_list option to train_test_split
FIX we shouldn't warn the user if the solver is auto
REL Make 0.17 website changes for release.
Don't use floats to index numpy arrays
fix 1 sparse row scaling in robust scaler
fixed missing import from too much cherry picking.
FIX don't compare things that can be arrays to strings.
DOC some fixes to the doc build.
FIX port LDA covariance fix to decomposition module
Add missing whatsnew entries for 0.17
DOC Typo fixes in whatsnew. Thanks @jnothman
MAINT Don't use deprecated 1d X (or deprecated matplotlib stuff) in examples.
DOC some fixes to docbuild
REL add 0.17.0 release to news
More doc fixes. Latex builds again.
skip unstable tests and doctests on 32bit platform
split installation into simple and advanced part
MAINT version string for 0.17. D'OH
Fix import of reload for python 3.3
Andreas van Cranenburgh (1):
add 'MultiLabelBinarizer' to __all__
Andrew Ash (1):
Typo: thepath -> the path
Andrew Clegg (2):
Workaround for andrewclegg/snake-charmer#12
Fixed typo
Andrew Lamb (5):
Fix check in `compute_class_weight`.
Fix assertion for Python 3, use `assert_raise_message`.
Change formatting of `assert_raise_message`.
Fix SGD partial_fit multiclass w/ average.
Added entry to `doc/whats_new.rst`.
Andrew Tulloch (4):
[Cross Validation] Use itertools.izip consistently across Python 2/3
[tests] Use Python 3 zip/Python 2 zip consistently in tests
ENH - Improve performance of isotonic regression
[Feature Selection] Fix SelectFDR thresholding bug (#2771)
Andrew Walker (1):
Update plot_kernel_approximation.py
Andrew Winterman (18):
implemented predict_proba for OneVsRestClassifier
forgot an except clause
removed unnecessary repeat
corrected doc for predic_proba, also caught few errors.
wrote test_ovr_predic_proba method
divided test for predict_proba into two functions
removed check for predict_proba method.
[pep 257](http://www.python.org/dev/peps/pep-0257/) and and other doc improvements.
corrected bad test in test_multiclass
Flake8 Corrections made
spell checked
Spelling is checked, passes Flake8 without errors.
added backtick around self.classes_ in multiclass.py
changed n_folds > min_labels error to warning
removed tests for the old error.
added test for warning. Added warning category
removed a carriage return in warning message
added space between # and text
Anish Shah (1):
[issue #5043] fix documentation of callable kernel
Ankit Agrawal (7):
Making the results NaiveBayes example more explicit
Adding doc for bin_seeding parameter in cluster.MeanShift
Wrapping line in naive_bayes.rst and specifying bin_seeding argument as optional
Returning an array mask using np.ma.getmaskarray
Adding a test for verifying the shape of imputed matrices
Using assert_equal routine from nose
Minor doc example fix
Ankur Ankan (1):
fixes bug in oob_score when X is sparse.csc matrix [refs #4744]
Anne-Laure Fouque (3):
ENH added R^2 coeff which is now used as score function in RegressorMixin
renamed explained_variance_score to r2_score in linear_model
adding r2_score : fixed typos and doctest
Anthony Erlinger (5):
Minor docstring fixes to svm/classes.py
Improve documentation on Diabetes dataset
Typo fix in diabetes documentation :Aex: -> :Sex:
Improve wording of SVM `crammer_singer` doc following PR feedback.
Clarify documentation on feature scaling for diabetes data
Antony Lee (2):
Accelerate AffinityPropagation.
AffinityPropagation: save memory by reusing tmp.
Anze (5):
P3K: Fixed imports.
P3K: Cannot compare list to tuple.
Replaced use of deprecated method.
P3K: Changed StringIO to BytesIO to fix a failing test.
P3K: Fix build for py3k + pip.
ApproximateIdentity (4):
Changed a minus sign to a plus sign in the documentation of n_jobs in some files.
Changed minus sign to plus.
Added n_jobs to multiclass.py
Revisions due to previous pull request.
Ari Rouvinen (1):
DOC Fix NMF inconsistency and broken links
Ariel Rokem (1):
Added description of input parameters in svm.SVC docstring
Arnaud Joly (555):
ENH add random-seed args
Call DecisionTreeRegressor instead of Tree
COSMIT Remove duplicated assignement
Use the check_input argument
DOC : add description of check_input args
DOC explain parameter estimators_
DOC explain parameter estimators_ (2)
ENH Move parameter checking to fit
COSMIT
FIX casting bug
ENH preserver contiguous property
COSMIT
DOC describe reasons for reshape
PEP8
FIX: perform transition from tree to DecisionTreeRegressor
FIX feature importance computation + Enable smoke test of feature importances
Update whats new
ENH add author
COSMIT use sklearn.utils.testing
ENH Let the user decide the number of random projections
Clean random_dot features
Clean random_dot features (2)
Clean random_dot features (3)
Clean random_dot features (3)
ENH let the user decide density between 0 and 1
COSMIT
ENH Strenghtens the input checking
ENH Add gaussian projeciton + refactor sparse random matrix to reuse code
ENH add more tests with wrong input
ENH add warning when user ask n_components > n_features
DOC: correct doc
ENH add more tests
Update doctests
ENH cosmit naming consistency
FIX renaming bug
COSMIT
WIP: add benchmark for random_projection module
ENH finish benchmark
Typo
ENH optim sparse bernouilli matrix
FIX example import (name changed)
FIX: argumetn passing selection of sparse/dense matrix
ENH assert_raise_message check for substring existence instead of equality
ENH add two test to check proper transformation matrix
PEP 8 + PEP257
DOC improve dev doc on reservoir sampling
COSMIT + ENH better handle dense bernouilli random matrix
FIX: make test_commons succed with random_projection
DOC removed unrelevant paragraph(s)
ENH add implementation choice for sample_int
ENH add various sampling without replacement algorithm
Typo
TST: Add tests for every sampling algorithm + DOC: improved doc
DOC: fix mistake in the doc + ADD benchmarking script
ENH Rename sample_int to sample_without_replacement
DOC + ENH: minor add in doc + set correct default
FIX: broken import
FIX typo mistakes + ENH change default behavior to speed the bench with Gaussian random projection
ENH Add allclose to sklearn.testing
ENH improve naming consistency
PEP8
COSMIT
DOC + typo
DOC set narrative doc for random projection
FIX: broken test due to typo correction
DOC minor improvements
DOC mainly switch from .\n:: to ::
FIX typo mistakes
DOC improve name in example
DOC Separate the jl example from references
ENH Add jl lemma figure to random_projection.rst
COSMIT (typo, doc, simplify code)
pep8
Typo
DOC typo in narrative doc
DOC fix typo in filename
DOC clarification
ENH flatten random_projection module + add sklearn.utils.random
ENH refactor matrix generation BaseRandomProjectiona and subclass
DOC improve layout (url)
Make the JL / RP example use the digits dataset by default
FIX broken import
pep257 + COSMIT: naming consistency
COSMIT
COSMIT
Remove unused line
DOC improve doc for jl lemma function
typo
ENH Rename Bernoulli random projection to sparse random projection
ENH Rename Bernoulli random projection to sparse random projection
DOC add see also
pep8
COSMIT make everything use the common interface
DOC improve + fix mistakes + TST added
ENH Simplify assert_raise_message + TST add them
DOC add utitilies to the doc
DOC + FIX density to Ping and al. recommandation
ENH make jl lemma work even with non numpy array
DOC add default values
ENH Add support for multioutput metrics
DOC add narrative doc for regression metrics
Update what's new
TST check that ValueError is raised when the number of output differ
ENH add mean absolute error
DOC cosmit alphabet order of classification metric in ref
DOC typo
ENH add multioutput support for dummy estimator
DOC instance attributes + TST: do not record warning
DOC typo
ENH preserve output ndim
COSMIT reorganized functions in the module
DOC add narrative overall description of classification metrics
DOC add hinge loss narrative doc
DOC Set reference links in the doc
DOC add narrative doc on zero_one loss metric
DOC add narrative doc on zero_one_score
DOC add narrative doc for precision, recall and fbeta measures
DOC add narrative doc on roc curve
DOC add narrative doc on auc and average precision
DOC add narrative doc on matthews_corrcoef
DOC add narrative doc for explained variance
DOC add reference to multioutput metrics in regression
DOC add link to clustering metrics
Update what's new
ENH renamed metrics.zero_one to metrics.zero_one_loss
ENH rename zero_loss_score to accuracy_score
ENH ClassifierMixin use a metrics from sklearn.metrics
DOC add classification_report to the narrative doc
DOC typo and mistakes
DOC comment from @amueller + several minor improvements
TST + DOC add many examples on sklearn.metrics
DOC typo + minor improvements
DOC remove redundant comment
DOC better example with dummy estimator + link to appropriate reference
ENH use deprecated decorator
FIX DOC missing default behavior change
DOC COSMIT pretty math
DOC clarification of api change
FIX catch deprecation warning
COSMIT (don't change anything see sparse_random_matrix)
Typo
FIX add doctest ellipsis
FIX doctests dtype
Typo
ENH multilabel metrics: accuracy, Hamming, 0-1 loss
DOC FIX foating point issue
FIX numpy 1.3 issues with multilabel metrics
ENH add normalize option to accuracy_score + FIX bug with 1d array
DOC return_path argument, prettier references
ENH more pythonic way to treat list of list of labels
ENH add jaccard similarity score metrics
FIX compatibility issue with np 1.3 py 2.6
ENH add multilabel support to PRF metric family
ENH remove pos_label argument with multilabel binary indicator format
ENH remove warnings at testing time
FIX unique_labels in corner case
FIX issue with comparable but different dtype
ENH don't allow mix of input multilabel format
ENH simpler check for mix of string and number input
COSMIT better name
Typo
ENH use type_of_target within unique_labels
ENH improve documentation with allowed label types
ENH check that we don't mix number and strings
Flatten label type checking
TST add smoke test for all supported format
COSMIT
PY3K use six.string_type
OPTIM + ENH simplify mix string and number check
FIX bug with indicator format
ENH use a comprehension over imap
@arjoly and @glouppe thanks their funding FNRS and DYSCO
ENH remove _is_1d and _check_1d_array thanks to @GaelVaroquaux
flake8
ENH raise ValueError with row vector if multilabel or multioutput is not supported
ENH being less permissive thanks to @jnothman
DOC add example is_multilabel
ENH handle properly row vector
Flake8
ENH better error message
FIX switch to the new format syntax
ENH prettier error message for _binary_clf_curve with bad input shape
ENH use ravel instead of atleast_1d and squeeze whenever possible
ENH coherently input checking for regression metrics
ENH dryer thanks to @jnothman
TST stronger test for _column_or_1d function
FIX ^ is a symetric difference
MAINT Set random_state, modernize tests
TST max_features for more tree estimators
TST remove unused tests
ENH add missing pxd of utis.random
ENH Use file configuration
FIX signature
TST error message for _check_clf_target
COSMIT
FIX TST given cosmit
COSMIT don't need set
DOC explain the code
COSMIT product(..., repeat=2)
Update mailmap
DOC add missing datasets helper
ENH remove deprecated
ENH remove deprecated things (2)
Update what's thanks @NicolasTr
ENH add support for string input with classification metrics
ENH use the new format syntax
ENH remove inspect
COSMIT
Update what's new
DOC state that string is possible
TST with labels arguments
FIX what's new...
ENH remove bad examples
DOC let some example for prf metrics
ENH allows make_multilabel_classification to return label indicator f…
TST grid_search_cv works with multioutput data
TST cross_val_score with multoutput data
COSMIT
ENH consistency mse=> mean_squared_error ari => adjusted_rand_score
FIX docstring
Update what's new
DOC add missing links to the scorer and classication section
ENH add multioutput support to KNeighborsRegressor
ENH add multioutput support to RadiusNeighborsRegressor
ENH add multioutput support for KNeighborsClassifier
ENH add multioutput support to RadiusNeighborsClassifier
DOC + example with multioutput regression face completion for knn
ENH allows make_multilabel_classification to return label indicator format
ENH TST grid search with multioutput
ENH TST random search with multioutput data
DOC gridsearch support mulioutput data
TST cross_val_score with multioutput data
DOC more information about which classifier support multilabel
DOC unveil that some estimators support multilabel classification and multioutput-multiclass classification
DOC overall improvements
pep8
DOC credit + fix typo + wording + use mathplotlib.pyplot
ENH take @glouppe comments into account
FIX small title issue
DOC update what's knn and radius-nn support multioutput data
FIX bug in f_score with beta !=1
FIX formula inversion for sample-based precision/recall
FIX set same default behavior for precision, recall and f-score
ENH raise warning with ill define precision, recall and fscore
Backport assert_warns and assert_no_warnings from np 1.7
TST test warning + ENH Add warning average=samples
FIX TST with warnings thx to @jnothman
flake8
ENH set warning to stacklevel 2
TST silence warning
ENH use with np.errstate
DOC TST correct comment
FIX warning test
FIX warning tests in preprocessing
PY3K remove __pycache__ in make clean
FIX PY3K warning.catch_filter set record
DOC overall improvements in the multiclass documentation
DOC take into account @vene and @ogrisel + specify format for multioutput-multiclass
DOC rewording
Typo
DOC ENH take into account @NelleV comments
DOC more comments from @NelleV
DOC Remove deprecated reference + acknowledge @larsman
DOC Update what's new
ENH more explicit name for auc + consistency for scorer, fix #2096
DOC put the narrative documentation of roc_curve and roc_auc_score in one place
FIX search and replace misstake
ENH reduce memory overhead of storing tree ensemble
Merge pull request #2438 from arjoly/tree-mem-overhead2
Update what's new
FIX issue #1993: passing a multilabel indicator is no more noop
Merge pull request #2521 from dmedri/master
Merge pull request #2504 from glouppe/tree-tweaks
Merge pull request #2614 from zyv/patch-1
TST add generic test for averaging
ENH test that most metrics work with one sample
ENH auc averaging for multilabel-indicator format
ENH add multilabel-indicator support for average_precision
TST perform testing on average precision + simplify invariance test system
ENH add scorer support for multi-label threshold scorer
TST scorer works with multi-output decision function
flake8
TST more clean up test_metrics
DOC narrative doc for averaging support of average_precision_score
FIX decision function scorer multilabel
DOC narrative doc for roc_auc_score multilabel-indicator
DOC not all functions support multilabel-sequence format
DOC wording
FIX doctests
DOC extend roc example
DOC update example of precision-recall curve
TST properly raise ValueError...
DOC nicer plot
TST add roc_curve and precision_recall_curve on toydata
DOC TST explain how to use common tests in test_metrics
Typo
TST clean copy paste mistake
TST remove print
TST full coverage for _average_binary_score
DOC TST typo
WIP TST more sample invariance test
WIP TST adapt test for sample based metrics
TST finish to add invariance test for sample based metrics
DOC typo
DOC wording and typo
TST add test for garbage averaging input string
TST typo
DOC explain why roc auc score is useful in multilabel classification
typo
TST more corner case tests
TST ensure that exceptions are properly raised on np 1.3
Merge pull request #2629 from arjoly/fix-auc
Update what's new
Merge pull request #2633 from josericardo/FixingTypos
Merge pull request #2643 from jnothman/bench_mutilabel_metrics
Remove deprecated zero_one and zero_one_score
flake8
FIX silence numpy 1.8 warning for using non integer
Merge pull request #2656 from arjoly/np1.8-warnings
Merge pull request #2719 from amueller/remove_description_in_common_tests
Merge pull request #2717 from Manoj-Kumar-S/test_log_loss
TST Gini is equivalent to mse in binary classification
Merge pull request #2744 from arjoly/test-impurity
OPTIM MSE criterion
OPTIM more optimisation of mse
OPTIM more optimisation of mse
ENH use memset instead of for loop
Typo
Update what's new
Merge pull request #2732 from jnothman/two-array-tree
Merge pull request #2780 from ugurthemaster/patch-1
Merge pull request #2825 from bryan-lunt/master
Uniformize max_features semantics
TST add tests with constant features
Update what's news
Remove duplicate what's new
Merge pull request #2829 from arjoly/maxfeatures-seamantics
ENH Cache features value for extra trees
Merge pull request #2897 from ugurthemaster/patch-1
WIP add n_constant_features argument
WIP add constant_features array
ENH avoid splitting on constant features for best splitter
FIX bug with BestSPlitter + ENH avoid searching for a split on constant features
ENH avoid trying to split on constant features for presort best splitter
COSMIT
COSMIT
FIX deallocate memory
FIX proper inititalization
ENH use memcopy instead of while loop
COSMIT
FIX bug with invariant
Simpler swap
pep8
ENH remove unused constant
ENH document and rename EPSILON_FLOAT
DOC improve documentation for the splitting algorihtm
DOC verbose documentation of features and constant_features
Rename EPSILON_FLOAT -> CONSTANT_FEATURE_THRESHOLD + DOC n_drawn_constants & n_found_constant
DOC more comments about features and constant_features
DOC explain invariant with constant features + DOC clarify comments
DOC more comments of the splitting algorithm
Merge pull request #2886 from maheshakya/dummy_regressor
Update what's new
Merge pull request #2875 from arjoly/et-skip-invalid-
Update what's new
Merge pull request #2971 from jyu-rmn/master
Clean tree builder interface
ENH inline sme function
ENH avoid serializing the splitter & criterion + avoid storing tree parameter in the tree structure
ENH avoid serialising the random_state in the tree
ENH clean public tree strucure
ENH for the tree structure max_depth is part of the inner structure
ENH stricter separation between tree builder, splitter and criterion
ENH add a dedicated constant for min_impurity_split
Document maximal depth of the tree
ENH stop splitting for best first tree builder if pure node + change strict to unstrict inequality
ENH stricter separation between splitter and criterion
FIX tons of flake8
DOC document max_depth of TreeBuilder
ENH re-generate c file
Add space to be consistent with _tree.pyx
DOC document internal tree class
ENH rename CONSTANT_FEATURE_THRESHOLD to FEATURE_THRESHOLD + revert some pep8 fix for readability
DOC missing words
Merge pull request #2977 from arjoly/tree-cleanup
DOC example rendering
FIX numerical stability issues on 32 bit platform
Merge pull request #3049 from glouppe/tree-bestfirst
TST define only one list of name for sample weight invariance testing
FIX missing comma in METRICS_WITHOUT_SAMPLE_WEIGHT
DOC advertize GenericUnivariateSelect in the narrative documentation and api
Merge pull request #2936 from ugurthemaster/master
FIX raise memory error if constant_feature can't be realloc
Update numpy and scipy requirement
Merge pull request #3127 from rmcgibbo/_fit_and_score-docstring
pep8 remove unused variable
DOC typo
MAINT remove redundant class hierarchy + fix api perform parameter check in fit
DOC fix example in univariate selection (fix issue #3132)
FIX testing datasets have too few feature for default of SelectKBest
TST improve parameter checking + use nose assert_ functions
DOC remove comment thanks to @jnothman
Merge pull request #3178 from MechCoder/Iss3174
TST Refactor test of sklearn/ensemble/tests/test_forest.py
TST refactor oob score testing
Update what's new with adaboost sparse input support by @hamsal
TST reduce max_depth for numerical stability
Merge pull request #3286 from ilam/seconddoc
MAINT reduce amount of boiler code using standard C operations
Merge pull request #3308 from jnothman/ovr_constant_predict_proba
Merge pull request #3318 from abhishekkrthakur/master
Update what's new
Update what's new
FIX failing tests due to the change of the underlying sample generator
Merge pull request #3366 from MechCoder/link-to-blog
MAINT Re-order argument to put deprecated one at the end
ENH improve forest testing + avoid *args
ENH improve rand_int and rand_uniform
Merge pull request #3406 from jnothman/minor
ENH add label ranking average precision
DOC write narrative doc for label ranking average precision
DOC FIX error + wording
TST invariance testing + handle degenerate case
flake8
FIX use np.bincount
DOC friendlier narrative documentation
ENH simplify label ranking average precision (thanks @jnothman)
ENH be backward compatible for old version of scipy
Typo
pep8
DOC remove confusing mention to mean average precision
Merge pull request #3412 from brentp/logistic_l1_l2_ex
Merge pull request #3411 from lesteve/scheduled-removal-from-0.15
DOC improve documentation thanks to @vene and remove mention of relevant labels
ENH add sample_weight support to dummy classifier
DOC update what's new
DOC sample_weight attribute
DOC more intuition about corner case
DOC add documentation to backported function
ENH less nested code
Merge pull request #3430 from amueller/test_list_input
Merge pull request #3433 from lesteve/remove-non-integer-deprecation-warnings
ENH better default for test for SelectKBest and random projection
MAINT tree compute feature_importance by default
pep8
Merge pull request #3439 from arjoly/test-commons
FIX encoding issue
Merge pull request #2804 from arjoly/lrap
DOC update what's new
MAINT deprecate fit_ovr, fit_ovo, fit_ecoc, predict_ovr, predict_ovo, predict_ecoc and predict_proba_ovr
MAINT split sklearn/metrics/metrics.py
ENH + DOC set a default scorer in the multiclass module
MAINT flatten metrics module and avoid nested bicluster module
DOC typo + not forgetting single output case
ENH in sklearn.metrics set bicluster as a submodule of cluster
FIX import
DOC improve documentation and distinguish each module
Merge pull request #3401 from vene/scorer_weights
ENH add a friendly warnings before deleting the file
Merge pull request #3445 from arjoly/flatten-metrics
MAINT move log_loss and hinge_loss to the classification metrics
MAINT use assert_warns
MAINT Fix regression random projections work sparse matrices
Merge pull request #3522 from ugurthemaster/patch-1
DOC add link toward Jatin Shash webpage
DOC update what's new
MAINT simplify covertype benchmark
DOC document copy_X parameter from LinearRegression
MAINT add a deprecation version
Merge pull request #3732 from queqichao/fix_typo
FIX ensure that pipeline delegate classes_ to the estimator
pep8
TST bagging of pipeline of classifier
DOC update what's new
Update what's new
COSMIT reshape for all regression strategy + avoid xrange
COSMIT less nested constant check + use comprehension
Revert unwanted modification
Add sample_weight support to Dummy Regressor
Use np.average instead of np.mean
Update what's new: full name of Staple
MAINT remove deprecated oob_score_
MAINT remove deprecated loss in gradient boosting
DOC add missing public function into the references
Update what's new
ENH Bring sparse input support to tree-based methods
FIX+ENH add min_weight_fraction_split support for sparse splitter
Re-organize code dense splitter then sparse splitter
Simplify call to extract_nnz making it a method
ENH while -> for loop
ENH reduce number of parameters
FIX min_weight_fraction_split with random splitter
FIX min_weight_leaf in best sparse splitter
ENH remove spurious code
cosmit
ENH adaboost should accept c and fortran array
COSMIT simplify function call
ENH expand ternary operator
Revert previous version
ENH move utils near its use
ENH add a benchmark script for sparse input data
Extract non zero value extraction constant
Lower number of trees
wip benchmark
Temporarily allows to set algorithm switching through an environment variable
Benchmark: Add more estimators + uncomment text
FIX duplicate type coercion + DOC fix inversion between csc and csr
Remove constant print
COSMIT add Base prefix to DenseSplitter and DenseSplitter
MAINT refactor gradient boosting code
FIX unlikely pickling error of splitters
TST add a toy test for min_weight_samples_leaf
ENH add coverage multilabel ranking metric
Remove copy paste mistake
DOC improve documentation of coverage
TST make separate tests for coverage + remove redundant tests with commons
DOC clarify how ties are broken for coverage_error
ENH add sample_weight support
DOC typo
TST + FIX ensure that it fails if sample_weight has not proper length
DOC more explicit title
Update what's new
DOC add missing load_svmlight_files to api references
ENH Release gil in feature importance
ENH use threading backend for features importance parallelisation
FIX explicit initialization of normalizer
Update what's new
FIX copy paste mistake
MAINT remove deprecated auc_score function
ENH add a benchmark on mnist
ENH Improve script output display
DOC add performance for all available classifiers
FIX Ensure at least 1 feature is sampled when max_features is a float
FIX ensure that negative float max_features will lead to an error
FIX numpy deprecation warning from unsafe type comparison
ENH add a prior strategy for the dummy estimator
Update what's new
FIX raise error properly when n_features differ in fit and apply
Factorize input validation
ENH Factor out validation of X for apply/predict in forest
FIX properly raised not fitted error
FIX use the new notfittederror in feature_importances_
DOC use same error message in decision tree and random forest
ENH add ranking_loss multilabel ranking metric
FIX raise ValueError if sample weight are passed but unsupported by the base estimator
Update what's new
DOC document missing attributes
FIX raises memory error in depth first builder
Merge pull request #4932 from rasbt/randomforest
Optimize MSE criterion
ENH Faster tree-based methods by implementing reverse update of criterion
Merge pull request #5230 from jmschrei/_tree_split
FIX unstable test due to bootstrap and unset random state
Merge pull request #5252 from jmschrei/presort
DOC FIX missing minus in Shannon's entropy formula
Arnaud Rachez (1):
Addressed comments on PR #5451
Arthur Mensch (4):
Bugfix : type in cd changed for read only memmap compatibility
Added check_input in Lasso for DictLearning performance
Fix sparse_encode input checks
Fix fit_transform, stability issue and scale issue in PLS
Aymeric Masurelle (19):
FIX : pass random_state to kmeans in gmm.fit
FIX : add condition pos_label!=None for multiclass purpose in metrics.precision_recall_fscore_support
TEST : add a test, test_precision_recall_f1_score_multiclass_break(), that breaks with current master and now works
Change metrics.py as before and shorten test (test_precision_recall_f1_score_multiclass_break() in test_metrics.py) to show where it breaks
ADD : cosinus kernel calculation in metrics/pairwise.py
add cos_kernel in help of decomposition/kernel_pca.py
name change: cos into cosine
change way of calculating cosine_kernel in metrics/pairwise.py
add test for cosine_kernel in metrics/test_pairwise.py
correct indent pb and re-edit cosine_kernel help in metrics/pairwise.py
fix style issue by running pep8 on metrics/pairwise.py and on metrics/tests/test_pairwise.py
remove duplicated test_cosine_kernel() in metrics/tests/test_pairwise.py
change test_cosine_kernel to include normalize from preprocessing.py in metrics/tests/test_pairwise.py
remove duplicated dimension check in metrics/pairwise.py
add reference to cosine similarity in cosine_kernel help from metrics/pairwise.py
modify cosine_kernel func to use normalize from preprocessing.py and change the test_cosine_kernel adding scipy.sparse inputs respectively in metrics/pairwise.py and metrics/test_pairwise.py
modify test_cosine_kernel to compare result obtain with linear kernel in metrics/tests/test_pairwise.py
FIX: add prefix 'np.' to sqrt for test_cosine_kernel in metrics/tests/test_pairwise.py
FIX: move import of normalize function into the cosine_kernel call in metrics/pairwise.py
Bala Subrahmanyam Varanasi (21):
modified 'gid' to 'git'
pep8 compliant
fixed visual indentation errors
fix indentation errors
use spaces in indentation
fix indentation
fix E501: line too long
fixed indentation and visual indentation errors
fixed visual indentation errors
fixed indentation and visual indentation errors
fix for 'line too long' warning
pep8 fixes
fix indentation and convert tabs to spaces
fix indentation
fix too many blank lines
fix visual indent
add expected blank line
fix line endings and visual indents
fix visual indentation
fix visual indentation
removed unused import from example
Baptiste Lagarde (5):
FIX: Typo
FIX: Typo
FIX: Typos
FIX: Typo
FIX: Typo
Barmaley.exe (8):
FIX Flips sign of SVM's dual_coef_ in binary case
Updates whats new and fixes the documentation
Removes RBF calculation by hand
Minor fix for a comment
DOC Adds SVR description to the tutorial
TST Adds sanity check for SVR's decision_function
DOC A small change to the SVM's tutorial
TST Fixes #4386
Bastiaan van den Berg (1):
BUG allow outlier_label=0 in RadiusNeighborClassifier
Ben Davies (1):
use a pipeline for pre-processing feature selection, as per best practise
Ben Root (7):
This should make the hungarian algorithm accept rectangular cost matrices. Also enabled the tests.
An additional check needed in case where there are fewer columns than rows.
Added support for hungarian assignment problems where one dimension of the cost function is zero-length.
Created an alternative hungarian solver for rectangular matrices that does not involve matrix padding.
hungarian() now returns a 2-D array of indices instead of a 1-D array. Also modified the find_permutations test to accommodate.
Some minor changes to docs, and small simplification in code.
Updating namespace usage from scikits.learn to sklearn
Benedikt Koehler (1):
Typo
Benjamin Peterson (1):
ENH import six package for Py2/Py3 compat in a single codebase
Bertrand Thirion (74):
introduced gael's implementation of fast_ica and debugged GS orthogonalization
cosmit in fastica, that created a bug -- to be fixed
updates in fastica and more tests
completed and cleaned the tests
improved the tests
solved conflict in test_fastica
added probabilistic PCA and associated tests; works reasonably well
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Merge branch 'master' of github.com:scikit-learn/scikit-learn
fix in ppca
cosmit in pca and test_pca
merged origin and fixed a conflict
merged origin
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Merge branch 'master' of github.com:scikit-learn/scikit-learn
merged the main repo
Merge branch 'master' of github.com:scikit-learn/scikit-learn
new criterion for wards clustering
one single cython module for inertia and ward distance
always use scikits ward algo when no structure is provided
tiny updates on lda (checks, numerical stability)
removed the unused inerta stuff
Variable renaming and docstring fixing
merged with master logsum -> logsumexp
ENH: renaming estimated variables from self._variable to self.variable_
removed the decode
removed the decode in dpgmm and removed return_log in eval
ENH: Cleaned after rebase and compatibility with hmm
ENH: Removed X and z variables from dpmm class (should not ship the data)
ENH: avoid initializing GMM means with zeros
ENH: more sensible initialization in case of divergence
BF: Mended the tied covariance estimator
ENH: added multiple initialization to the GMM -- untested
FIX: fixed collateral damages in hmm
added some tests to ensure that GMMs work in about all conditions
ENH: renaming cv_type and posterior to more explicit name + tested multiple init
avoid changing the covariance when computing the Gaussian density
FIX: Fixed a bug I introduced in dpgmm
ENH: Added AIC/BIC + tests. Seems to work
Cosmit in dpgmm
merged with master
Changed the shape of spherical covariance matrices to be equal to diagonal covariance matrix, in order to avoid handling the dimension in particular
Merge branch 'master' of github.com:scikit-learn/scikit-learn into gmm-fixes
detail fixed in an example
Hopefully clarified notations in dpgmm
Many corrections in dpgmm to remove unnecessary loops (significant speed-up) + renaming
Fixed an example that happened to fail
Several details outlined by Jake
handled the eval on Null data
merged the master repo
Added an example with model selection
Oups: really added an example with model selection
ENH: Removal of properties from GMM -- unfinished
removed properties from dpgmm
replace log_weights_ by weights_, which makes the API more consistent
Getting rid of properties in hmm, gmm, dpgmm
fixed a doctest
ENH: Some cleaning in the examples
ENH:pep8
ENH: enforcing skls conventions
A pass on the docs
corrected the doc for dpgmm
removed get_means, set_means, get_weights, set_weights
ENH: renamed plot_gmm_model_selection.py to plot_gmm_selection.py
Fixed the doctests in hmm
COSMIT:pep8 in hmm
Corrected the docs
ENH: changes in the code to fulfill Gaels requirements
Merge branch 'master' of github.com:scikit-learn/scikit-learn into gmm-fixes
ENH: Added back rvs as deprecated and updated whatsnew.rst
ENH: fixed the GMM docs
ENF changed INF_EPS to EPS in hmm too.
Bogdan Trach (1):
* doc/conf.py: added required latex packages (bm and morefloats)
Boris Feld (1):
Fix tinyclues logo in doc/about.rst
Borja Ayerdi (3):
Fix RFE n_features minimum value #3812
Fix RFE n_features minimum value #3812 and make it simpler.
If step is explicitly zero or negative, raise a ValueError exception.
Boyuan Deng (3):
Use immutable default arguments throughout repo
Link to functions and fix typos in dataset docs
Fix class reference in twenty_newsgroups.rst
Brandyn A. White (2):
Fixed docstring to reflect current code in precision_recall_curve.
Faster confusion_matrix implementation
Brent Pedersen (1):
use more interesting range for C in logistic l1 l2 example.
Brett Naul (2):
Added/improved GraphLasso tests, per #4134
Add enet_tol parameter to GraphLasso class/methods
Brian Cajes (6):
Improving code coverage for datasets module. Moved dataset imports inside test_data_home, because it is preferable for import errors to only affect the tests that require those imported methods. My first commit to scikit. -bcajes
revert to original import placement style
Improving code coverage for datasets module. Moved dataset imports inside test_data_home, because it is preferable for import errors to only affect the tests that require those imported methods. My first commit to scikit. -bcajes
bring datasets.base to 97% coverage with a few more tests
removing backup file
checking data.shape for each test dataset
Brian Cheung (15):
Discretization method for spectral clustering added along with tolerance setting to loosen eigendecomposition constraints
Documentation and small bugs fixed and code cleaned up
Small comments/constants added
Added more info in documentation
Small aesthetic fixes to discretization
pep8 formatting
More description of the discretization algorithm.
Even more description of the discretization algorithm.
Documentation changes, removed more camel case variables
Fixed some memory inefficiencies and cleaned up documentation and code semantics
Example for spectral clustering embedding handling
Added newline to the end of file
removed a hardcoded value
Modified lena segmentation example to include different embedding solvers
Removed savefig
Brian Holt (198):
Refactored decision trees and forests to support CART algorithm.
Refactored decision trees and forests to support CART algorithm.
Added documentation
make number of classes explicit
Added visualisation and corrected bugs in CART algo
Merge branch 'enh/ensemble' of https://github.com/satra/scikit-learn
Merge https://github.com/scikit-learn/scikit-learn
improved nosetests doctest time
PEP8
Merged decisiontree and tree_model into tree, random_forests to ensemble
20% speed improvement by moving _find_best_split to cython
removed occurrences of tree_model
Merge https://github.com/scikit-learn/scikit-learn
Added the Boston House Prices dataset
Fixed imports and run unit test
Merge pull request #6 from vene/boston
Corrected import of the data: all 506 columns are now usable
merge
Merge branch 'boston' of https://github.com/bdholt1/scikit-learn
Updated documentation for boston house prices dataset
FIX: removed the required parameter K
FIX: dataset description
Further optimisation of _find_best_split
Further optimisation of _find_best_split
Refactoring: speedup of decision tree code
Further performance improvements. Now approx 30 - 50% faster than MILK.
Merge branch 'boston'
Updated benchmarking for trees
PEP8
DOC: added documentation for graphviz method
FIX: corrected computation error and typed incoming arrays
FIX: corrected graphviz visualisation.
removed everything except the plain and simple decision tree to make reviewing easier
DOC: Updated the documentation to reflect decision trees.
Corrected newlines, and ensured only tree related changes are in this set
FIX: replaced ad-hoc RNG with suggested scikits.learn implementation. Tidied up dependent examples.
Merge with master
Merge https://github.com/scikit-learn/scikit-learn into enh/tree
Merge https://github.com/scikit-learn/scikit-learn
Removed unused import
PROF: improved speedup thanks to ppret
Merge branch 'enh/tree'
Initialise random state for examples
DOC: Added +ELLIPSIS for examples
ENH: Support binary (-1,1) classification as well as [0,...,K) multiclass classification
Merge git://github.com/scikit-learn/scikit-learn into enh/tree
Removed unnecessary import
Fixed doctest example
Updated documentation for class interface
Minor patches to docs
Optimisation: moved _find_best_split to cython.
DOC: change classification to regression
Merge github.com:scikit-learn/scikit-learn into enh/tree
DOC: corrected doctest
don't allocate a new pm for each call: 3 times faster
Moved to @pprett's faster splitting code (debugged)
Added more debugging info to graphviz
Moved to the version without a sample mask, since correctly implemented it is almost as fast
Fixed error of splitting between identical feature vals
DOC: updated comments
Fixed memory leak in libsvm
Improved graph visualisation
Move initial entropy computation outside loop.
raise ValueErrors with appropriate messages
merged upstream master
merged upstream master
Standardise error messages
Copied ensemble and random forest classifiers to new branch
Check that labels are in range for multiclass classification
Check that labels are in range for multiclass classification
Further clarification of error messages
Merge branch 'upstream-master' into crossval
Fixed regression bug. Thanks @pprett
Merge branch 'upstream-master' into enh/tree2
merged enh/tree2 into enh/tree
Fixed doctest
Enforce 64bit and 32bit types and correct regression bug (divide by zero).
Refactored construct to subsample dimensions.
store all tree parameters in the RF base class so that clone() will work
Revert to _Fixed Doctest_ and added regression bug fix
update to unit test and doc test
enforce type on storage arrays
enforce 64 bit types on parameters
further type enforcement
initialise variables
removed unused import, removed unnecessary backslash
Improved names and documentation for Leaf and Node
Renamed K to n_classes
renamed F to max_features
renamed features to X
renamed labels to y
renamed n_dims to n_features
explained min_split
renamed C to predictions
improve documentation
renamed K to n_classes
COSMIT: improved documentation
renamed pm to label_count
renamed K to n_classes
improved documentation and renamed features and labels
renamed var to variance
fixed comments
Updated docstrings
merged upstream-master into enh/tree
Merge pull request #9 from ogrisel/bdholt1-enh-tree
Merge pull request #10 from ogrisel/bdholt1-enh-tree
merged upstream/master
renamed scikits.learn to sklearn
Push coverage up to 96%, added graphviz test
merging
Merge pull request #11 from pprett/bdholt1-enh/tree
added example usage of graphviz
Merge branch 'enh/tree' of github.com:bdholt1/scikit-learn into enh/tree
fixed unit test of graphviz
added trees (boston and iris datasets)
pep8
moved the min_split test to beginning of recursive_split
group imports by hierarchy
sed s/dimension/feature/g
time is measured in seconds
print left and right child repr in graphviz
Merge branch 'enh/tree' of github.com:bdholt1/scikit-learn into enh/tree
fixed graphviz test failure
added feature_mask to reduce fancy indexing
replaced == with 'not' operator
updated the decision tree docs (not done yet)
use Fortran array layout
corrected feature_mask implementation
allow for different architectures
merged upstream/master moving to sklearn
merged enh/tree
Merge pull request #12 from pprett/bdholt1-enh/tree
Incorporated suggested changes to Graphviz exporter
visit -> export
cosmit: added spaces
cosmit: improved documentation
fixed indentation and added section on memory requirements
Updated documentation to include the iris svg example
improved documentation
np.float64 -> DTYPE. Set DTYPE to np.float32.
make sorting more efficient by transposing and sorting along last axis.
Use a sample mask instead of fancy indexing.
Merge pull request #13 from pprett/bdholt1-enh/tree
COSMIT: corrected comments
made sample_mask a fit parameter
updated documentation to reflect min_density concept
Merge pull request #14 from pprett/bdholt1-enh/tree
there is no more Leaf class
added feature_names to GraphViz export
Tidied up graphviz related code
test for improperly formed feature_names
removed sample_mask parameter
only return values that are used
Merge branch 'master' of github.com:scikit-learn/scikit-learn into enh/tree
Merge branch 'enh/tree' of github.com:bdholt1/scikit-learn into enh/tree
Merge pull request #16 from pprett/bdholt1-enh/tree
use np.isfortran
use None as the default marker
compute node id's on the fly
removed leftover class_counter
Merge pull request #17 from larsmans/enh/tree
added test for pickle-ability
Merge branch 'enh/tree' of github.com:bdholt1/scikit-learn into enh/tree
Merge pull request #19 from pprett/bdholt1-enh/tree
fixed failing doctest
improved tree documentation
included a mathematical formulation for CART
verify that scores from pickled objects are equal to original
pep8
Merge pull request #20 from GaelVaroquaux/tree
COSMIT: +SKIP on classification doctest
rewrote GraphvizExporter into a function export_graphviz
removed duplicate tests (already in fit)
Merge pull request #21 from glouppe/tree
classes can be any integer values
require that the next_sample_larger_than is greater than the previous by at least 1.e-7
regenerate cython
if threshold is indistinguishable from a, choose b
modified threshold comparison from < to <=
Merge branch 'master' of github.com:scikit-learn/scikit-learn into enh/tree
Added tree module to whats_new
release sv_coef memory
tree construction depends on n_features
Merge pull request #22 from ogrisel/bdholt1-enh-tree
Added person webpage
added trailing underscore
Merge branch 'master' of github.com:scikit-learn/scikit-learn into enh/ensemble
Merge pull request #23 from larsmans/enh/ensemble
scikits.learn -> sklearn
update parameter names
Merge branch 'master' of github.com:scikit-learn/scikit-learn into enh/ensemble
remove enforcement of return type
replaced ratio r with sampling with replacement
Re-ran the tests and found that the GaussianNB error was much lower.
Fixed typo
added multi-output tree example
updated documentation to reflect multi-output DT regression
added link
Brian Kearns (1):
FIX regression in CountVectorizer handling of float min_df/max_df
Brian McFee (1):
ENH: added LabelShuffleSplit cv iterator
Brian Wignall (1):
CLN: Fix typo in comment
Brooke Osborn (30):
DOC adding description for handling of the auto parameter in neighbors
BUG fix whitespace in error messages
changing private _fit to public partial_fit
removing NWS from list of stock prices
adding test to rbm
changing partial fit to be separate from _fit
changing test
saving the random state
making adjustments to variable names and account for sparse matrices
changing xrange to range
putting cluster on new line to solve sphinx documentation formatting issue
changing formatting of documentation
Merge remote-tracking branch 'upstream/master' into rbmbranch
changing computation of batch slices to np.array_split
pep8 corrections
adding test for csr data for partial fit
pep8-ing
removing copy() and adding tests for multiple forms of sparse matrices
removing tests for multiple sparse matrix formats
adding test for csc and csr sparse matrix
adding change to whats_new
adding avg weights and flag to plain_sgd method
asgd is added
adding test that computes the average sgd
checkpoint for classifier
fixing tests
adding documentation fixes and changing avg to average
fixing documentation
adding comments to linear algebra operations
updating doc
Bryan Lunt (1):
Fixed FS1995 citation.
Bryan Silverthorn (3):
Test KernelPCA support for n_components.
Add support for n_components in KernelPCA.
PEP8 fix.
Buddha Prakash (7):
ENH: split LDA's n_iter_ into n_iter_ and n_batch_iter_
Add check for sparse prediction in cross_val_predict
Use a single vstack for concatenating all blocks in prediction matrix
Use Inverted locations to reorder the predictions
Remove redundant p variable
Add test to check sparse predictions in cross_val_predict
Fix minor indentation issues
Bussonnier Matthias (1):
[Docstring Typo] making there -> making their
CJ Carey (12):
DOC: grammar and spelling fixes
BUG: avoid NaNs throwing off class probabilities
fixing warning ignore type
adding test case
conform to pep8
apply the fix closer to the source of the issue
ENH: use partial sort for kneighbors selection
ENH: Use the scipy C-based L1 distance if possible
WIP: adding 'max' normalizer to normalize()
TST: covering norm='max' branches of normalize()
DOC: updating Normalizer docstring for norm='max'
Fixing sparse max for older scipy
Calvin Giles (2):
Changed f=open() to with open() as f to eliminate ResourceWarnings. Fixes #3410.
Moved code out of context block where not required for #3612
Carlos Scheidegger (1):
BUG: missing subpackage svm/sparse on setup.py. fixes issue #559
Cathy Deng (1):
FIX agglomerative clustering w/ connectivity constraints & precomputed distances
Celeo (1):
Fixed typo in CONTRIBUTING.md about how to submit changes back to the project
Charles Earl (1):
Added docstrings for model attributes in LabelPropagation and LabelSpreading
Charles McCarthy (2):
Fixed data.filenames consistency issue when 'all' specified for 'subset'.
Added basic test for filenames consistency when all specified.
Charles-Pierre Astolfi (1):
Typo fix
Chen Liu (6):
fix numpy deprecationWarning('using a non-integer number ...') in svm
fix converting an array with ndim > 0 to an index DeprecationWarning
using floor division in python3
handle the case when length of sign changed coefficients is greater than 1 in LARS algorithm
Merge remote-tracking branch 'upstream/master' into fix-DeprecationWarning
handle the case where length of idx in least_angle greater than one
Chi Zhang (2):
added comments for y in dump_svmlight_file() method
changed shape parameter
Chih-Wei Chang (1):
Add multilabel support for dump_svmlight_file.
Chris Filo Gorgolewski (1):
DOC: Fixed roc_curve docstring
Christian Jauvin (4):
Mechanism to propagate optional estimator.fit arguments when using CV
changed **fit_kwargs to explicit fit_params dict
make sure that param has len attr + a test
replaced assert with assert_true + error msg
Christian Osendorfer (17):
Fixed problem with big full covariance matrices: sum,log instead of log,prod for loglikelihood computations.
Factor Analysis -- implemented with EM + SVD.
TST: Make factor analysis test repeatable.
Extended faces decomposition example with Factor Analysis.
Factor Analysis learns variance of generative model for every dimension. Illustrated with faces.
pep 257.
Make sure that psi=0 does not break em.
Some documentation for FA.
More or less same code already available.
Plot noise variance for FA. Changed some things to make plot_gallery usable for this, too.
Adding some plots for FA. Ordering of articles must be adopted.
Extended test a bit.
Added score function.
Two iterations are enough for the test.
score works like ppca.score().
adapted to new signature of score().
Moved paragraph on FA before ICA.
Christian Stade-Schuldt (1):
TST make catch_warnings blocks more robust
Christof Angermueller (3):
Update documentation of predict_proba in tree module
Update docstring predict_proba()
Add conventions section to userguide
Christoph Deil (1):
Fix typo in README
Christoph Gohlke (1):
Fix signature not compatible with previous declaration
Christophe Bourguignat (5):
DOC document missing attributes
DOC Updated documentation for cv parameter
DOC Updated documentation for cv parameter (issue #4533)
DOC Updated documentation for cv parameter (issue #4533)
[DOC] Precision on random_state in KFold() doc
Christopher Erick Moody (1):
FEAT Barnes-Hut t-SNE
Chyi-Kwei Yau (33):
COSMIT remove duplicate key 'hamming' in METRIC_MAPPING
fix import error in lda.py
add OnlineLDA model and test
don't use sparse.block_diag since it is new in scipy 0.11.0
fix typos and improve documentation
fix typo & improve doc 2
testing code improvement
move input validation and variable assignment to fit method
fix _approx_bound with subsampling
native dense input support
rename variable and minor pep8 fix
clean up test and add error check in perplexity method
remove duplicate code in _approx_bound method
move check_non_negative to utils/validation.py
fix params and import test coverage
update example and change total_samples in fit method
add LDA user guide
add LDA to classes.rst
random initialize in e-step
change LDA example to script
change doc, fix n_jobs, and remove fit_transform
merge NMF and LDA example
improve LDA doc
fix typo and improve doc
use n_features_
remove self.n_features_
remove main in test
make dimension mismatch err msg more explicit
change all rng to random_state
fix feature_names
improve document in _online_lda.pyx
use logsumexp
remove dirichlet_component_ variable
Cindy Sridharan (3):
vocabulary of type set now coerced to list to preserve iteration ordering after serialization
use sklearn.utils.random
removed list around sorted
Claire Revillet (1):
- fix missing links to the C math library
Clay Woolam (103):
added label propagation class
switch map and sum commands to numpy
fixing up tests, adding "unlabeled_identifier"
basic features of multiclass labeling up
fixing the way labeling works
checking in minor changes
added documentation, reworking tests
fixing up tests
added a lot more to label propagation, explained algorithms and differences between the two models
more documentation
added beginning of examples
added "structure" example
tweaked structure plot
finalized SVM comparison example
all tests pass
removed some stuff from documentation
updated pydoc to make behaviour clearer
passed PEP8, using already implemented kernel functions
making everything more numpy compatible
graph construction and example more numpy-like
fixed other diagonal matrix construction
rename misnamed "plot" example
example conforms to pep8
other example conforms to pep8
made test conform to pep8
predict() method now numpy friendly (100% numpy friendly now)
more numpy integration
removed function kernel, switched to string for picklability
fixed a bug in the circle example
moved label propagation examples to lower subfolder
more numpy friendliness
more numpy use,
fine tuned some documentation
added a snazzy label propagation versus SVM decision boundary plot
added more explanation to the plot
added semi_supervised directory
removed old, useless code
removed unused imports
added more documentation, another doctest for LabelSpreading
minor tweaks to the overall layout of the code
reverted plot_iris accidental commit
added unlabeled_identifier explanation to docstrings
Merge remote-tracking branch 'upstream/master'
fixed indentation problem in documentation rst
conformance to pep8
fixed bug in tests causing gram matrix construction to not work properly (assumed casts to floats)
added two new examples, including an active learning demo with label propagation
heavily downsampled digits examples (runtime a few seconds now) and removed supporess_warrnging bug
changed doc to remove long running time warning
renamed active learning example so it won't be run for doc compilation
changed subplot titles so the plot is more clear
fixed structure example
added vene's subplot adjustments
Merge branch 'new_lp'
made convergence check function private
fixed spelling error with variable name (indicies -> indices)
optimized _build_graph with inplace methods, conform to standards with variable names
one more optimization! avoids cast to numpy matrix and does in place matrix multiplications
fixed test cases to conform to api changes & new internal parameters
updated docs!
Merge git://github.com/scikit-learn/scikit-learn
localized a variable
fixed test suite, changed module to conform to new sklearn naming scheme
fixed examples for new naming scheme
merged ogrisel's docs & optimization, also fixed active learning example plot
changed a bunch of variable names, fixed some test cases
all code works great, all tests pass, full coverage
changed a variable name to conform to scikits code
correct variable names and added inline comments for active learning examples
added attributes text to explain named attributes
Merge branch 'master' of git://github.com/scikit-learn/scikit-learn
Merge branch 'master' of https://github.com/scikit-learn/scikit-learn
added support for sparse KNN graphs and tests
finishing up sparse additions (need to complete todo)
sparse KNN graphs now work
ENH add label propagation algorithm
finalized KNN work, all tests pass properly
Merge branch 'larsmans-label-propagation'
removed extra semisupervised folder
polished the lp & test code
Merge branch 'master' of https://github.com/scikit-learn/scikit-learn
Merge branch 'master' of https://github.com/scikit-learn/scikit-learn into label-propagation
variable name changes, using premade functions, doc fixes as per
variable name changes, doc corrections
removed unlabeled_identifier, updated tests and examples to reflect this
corrected example that still referred to unlabeled_identifier
optimization that stores the spatial index when using knn graphs
updated rst docs with kernel information
shuffled digits example, added sensible point colors to plot chart,
docs describe the different kernels available in techniques
TL directory change to push label propagation code into semi_supervised
added __init__.py file to semi_supervised folder
Updated docs for label propagation, added more technical details about
specific fine tuning to the label propagation docs
doc updates & tweaks
fixed typo in test code
added AISTAT ref to docs
added AISTAT ref to rst doc
fixed bug causing error on sparse input data
corrected the documentation and add semi-supervised section to the user
placed semi-supervised under supervised learning techniques in user
Merge remote-tracking branch 'upstream/master'
fixed error in graphviz export code causing graph error raised with
Clemens Brunner (54):
Refactored LDA module (first working version).
Added LabelEncoder and updated most of the docstrings
Updated more docstrings
Return self in fit()
Use equal priors by default
Added simple LDA example
Moved some common code into helper function
Support manually set shrinkage parameter (alpha)
For now only the svd solver supports the transform method
Updated LDA unit tests
Simplified LDA vs. shrinkage LDA example
Removed magic number
Addressed @mblondel's comments
Updated parameter name in help text
Parameters are now consistently named; also refactored the _means_cov() function
Added different shrinkage values to test cases
Updated LDA example and removed decision plane comparison
Increased coverage
PEP8 style conventions
Catch NotImplementedError
Fixed doctest
Use assert_raises() correctly
Removed private _decision_function()
Revert last commit because it broke some tests
Assert that the coefficients of the solvers are approximately the same
Replaced str with six.string_types
Raise error if alpha is not the expected type
Cleaned up docstrings.
Cleaned up one more docstring.
Addressed some comments by @agramfort.
Changed format of shape parameter in docstrings.
Updated test.
Use broadcasting to scale back instead of multiplying with a diagonal matrix.
Added random seed.
By default, estimate priors from sample.
Fixed docstrings.
Updated example because store_covariance is now set in __init__() and not in fit().
Changed parametrization of coef_ and intercept_.
Treat the binary case differently from the case when n_classes > 2.
Introduced step size to speed up the example.
(Hopefully) fixed Travis bug on Python 2.6.
Another try to fix the Travis bug.
Inherit from LinearClassifierMixin.
Added shrinkage LDA entry.
Rename parameter value 'ledoit_wolf' to 'auto'.
Test that lsqr and eigen solvers return almost exactly the same coefficients.
Added deprecation warnings to store_covariance and tol parameters in fit() method.
Added sentence on lsqr vs. eigen solvers.
Now correctly use weighted average to compute the class covariance. Cleaned up code and added references.
Works with one feature.
Updated LDA tests.
Fix problem with np.linalg.norm for version <= 1.6.
Added documentation for the three solvers.
Updated documentation on shrinkage.
Clyde-fare (1):
Added laplacian kernel
Conrad Lee (39):
Modified learn.cluster.mean_shift_.py so that the mean_shift function uses a KDTree to efficiently calculate distance within some threshold. KDTree implementation is in C and is from scipy.spatial. Tested only using the example located in examples/cluster/plot_mean_shift.py
Added another variant of mean shift clustering (in scikits/learn/cluster/mean_shift_.py that seeds using a binning technique on a grid.
Modified learn.cluster.mean_shift_.py in the following ways: Replaced old seeding strategy with bucket strategy which should be scalable. Modified nearest neighbor lookup to make it more scalable by adding a maximum number of neighbors -- in most cases this will not make a difference in the results --- the impact of this change is tunable with the max_per_kernel parameter. It is now possible to force all points to belong to a cluster (default) or only those points that are within t [...]
Modified learn/cluster/mean_shift_.py in the following ways: Added more efficient and proper removal of duplicate clusters. Took seed detection out of mean_shift function and put it in its own function. Default bucket size for seed detection is now the bandwidth.
Made following changes to cluster.mean_shift_.py: Added documentation for new functions. Made following changes to cluster.__init__.py: this module now imports the get_bucket_seeds function from mean_shift_.py
scikits.learn.cluster.mean_shift_.py modified in the following way: improved documentation
Changed plot_mean_shift.py example to use larger data set to show how bandwidth estimation dominates the runtime.
Changed scikits.learn.cluster.mean_shift.py: Updated reference for mean_shift algorithm
Changed scikits.learn.cluster.mean_shift.py: Added Conrad Lee as author.
Changed scikits.learn.cluster.mean_shift: modified so that complies with pep8.
Changed scikits.learn.cluster.__init__.py and examples/cluster/plot_mean_shift.py: modified so that complies with pep8.
Changed scikits.learn.cluster.mean_shift_.py: Now uses BallTree because of built in query_radius function, allowing us to get rid of the get_points_within_range function. Changed MeanShift to not use bucket seeding by default.
Hard coded bandwidth to 1.30 because otherwise its calculation is too slow.
Changed scikits.learn.cluster.mean_shift_.py: now uses blas nrm2 to compute norm.
Modified file scikits.learn.cluster.mean_shift_ Replaced a list comprehension and a for loop with numpy operations to improve efficiency.
Modified file scikits.learn.cluster.mean_shift_: removed print lines used for debugging, made code compliant with pep8
Modified file scikits.examples.plot_mean_shift.py: updated reference.
Mean shift: now uses norm function from utils.extmath
Mean shift: removed obsolete reference to KD-Tree with reference to BallTree
Removed obsolete import of izip, made description of complexity more concise and accurate
Mean shift: settled on term 'bin' and removed unnecessary references to 'bucketing' or 'discretization' from variable names and documentation
Mean shift: Fixed a minor type
Mean shift: Moved a test file in preparation for merge with agramfort's branch
Merged agramfort's branch with my own
Mean shift: removed my old test script due to merge with agramfort, changed num points in plot example to ten thousand to speed it up.
Brought my branch for mean shift modification up to date with current head on github
Mean shift: modified get_bin_seeds so that it no longer has to copy all points
Mean shift: Fixed a bug that occurs when the cluster_all argument is False
Merge remote-tracking branch 'upstream/master'
Mean shift: fixed bug introduced during upstream merge
cross_validation.py: fixed bug in text of error message
metrics.py: modified precision_recall_curve to lower computational complexity
metrics.py: pep8 and other cosmetic changes
metrics.py: Added more comments to precision_recall_curve.
metrics.py: bugfix in precision_recall_curve and added tests
metrics.py: more detailed comment in precision_recall_curve
metrics.py: pep8
metrics.py: COSMIT more comments on precision_recall_curve
metrics.py: COSMIT, replaced cryptic np.r_ with np.hstack
Corey Lynch (10):
cythonized expected_mutual_information
added authors
Changed example svc kernel to be linear, however the error curve ends up flat under the new kernel.
Used more extreme values of C to show a more pronounced error curve.
Took out a save image line
Edited docs to reflect change in kernel used.
added yticks
added yticks
added yticks
limited range of C cross validation
Cory Lorenz (2):
Fix Float Resolution Bug on Gaussian Process Test
Add a fit_predict method for the GMM classes
Craig Thompson (1):
MAINT: unreachable code removed from BernoulliNB
Daiki Aminaka (1):
fix typo (on -> one)
Dan Blanchard (2):
ENH sort option for memory-efficient DictVectorizer
Vainly add link to my GH profile in What's New.
Dan O'Huiginn (1):
Fix a few spelling/grammar errors in the docs
Dan Yamins (14):
added arithmetical ordering patch for labels in linear.cpp and test for liblinear predict
comment
simplification in liblinear testing
pep8 compliance in liblinear testing code
simplified liblinear prediction function
two trailing whitespaces removed from an multiline comment :)
minor syntactic improvement in liblinear test function ...
one more minor improvement to liblinear test code
I think i've got it this time ...
pep8 compliant at last!
various changes to handle fortran ordering in matrices
some pep8 fixes .... but probably more to come
removed testing thing
pep8 stuff as well removed testing stuff
Daniel Duckworth (9):
Merged svm parameter selection visualization
split plot_rbf_parameters.py's plot into two
Added plot_rbf_parameters example to SVM doc
Fixed bug in plot_rbf_parameters.py causing only one figure to show
Fixed location of ".. _svm_mathematical_formulation:" in svm.rst
Convert input dtype to float in pairwise_distances
Convert input dtype to float in pairwise_distances
Merge remote-tracking branch 'upstream/master'
Python 2.6 bugfix for plot_rbf_parameters.py
Daniel Galvez (6):
Make apply method of trees public. Added test for consistency with private method.
Added docstring
Added example demonstrating tree.apply
Added indentation to docstring
Removed cruft
Added tests of apply() for valid and invalid inputs. Fixed style.
Daniel Kronovet (3):
fixed typos and update formatting in doc/developers/index
additional typos, formatting, and links for doc/developers/index and /utilities
Fixed factual error in description of PR flow
Daniel Nouri (20):
Test qda with 'priors' parameter
Test QDA.covariances_ attribute
Don't cover this deprecated method
Test non-normalized GaussianProcess
Test _BaseHMM._decode_map
Test _BaseHMM.{predict,predict_proba}
Make this bit of code more compact (and improve code coverage).
Remove unused code branch. (_hmmc must be always available nowadays.)
Remove stale test code
Remove obsolete comment
Improve cross_validation test coverage: 94% -> 99%
Improve metrics.metrics code coverage: 95% -> 100%
Improve svm.base test coverage: 92% -> 98%
Add docs for `vocabulary_` and `stop_words_` attributes of Countvectorizer.
FIX #2372: StratifiedKFold less impact on the original order of samples.
Fix accidental doctest breakage.
Instead of linking to NB, explain the problem inside the test itself.
Avoid list, preallocate a numpy array for indices instead.
Update comment with numbers for when we run with 800 samples.
Add entry for #2372 to whats_new.rst
Daniel Velkov (1):
Fix wrong argument name in RFECV docstring
DanielWeitzenfeld (1):
added howaboutwe testimonial
Daniele Medri (1):
Update linear_model.rst
Danny Sullivan (52):
more documentation changes
adding test that contains a simple asgd implementation and compares it with the output of sgd_fast
adding more precision to almost equal
converting to setting coef after the fit has been made
putting const on pointer arguments
adding test for binary classifier
adding support for partial fit and adding test for partial fit
adding support for averaged intercept
adding test for intercept part of asgd
adding support for multiclass with a test included
changing float to double
adding comments to xnnz implementation
adding faster implementation of asgd with sparse data
changing api to have average_sgd and plain_sgd
separating out the average and plain apis in stochastic_gradient.py
adding sgd_fast.c
fixing typos
adding explanation of asgd to doc
adding if self.average logic
adding test to make sure plain sgd does not have average parameters
changing sgd to asgd
adding comparison for ASGD
removing standard_coef and standard_intercept from plain sgd
making optimal for asgd the constant learning rate
fixing merge conflicts
changing averaging to use sparse trick
adding value error for partial fit with auto weights
adding note to whats_new.rst
removing miss-merge and adding github
adding a more descriptive error message
adding escape characters to regex
cleaning up commented out code and previous_coef_ parameter
cleaning up floating point and unneeded todos, also removing constant learning rate for asgd
removing documentation about constant learning rate for averaging
adding add_average method and solving iteration bug by replacing definition of t
adding note for partial_fit and n_iter
clean up and average can now be set to an int indicating the number of iterations before averaging
increasing testing precision, putting comment on one line and adding asgd to regression benchmark
changing parameters for asgd regression benchmark and changing array shape for docstring
adding space to shape and putting // in average division
removing spaces for tuples of width 1
adding asgd to whats_new.rst
fixing merge conflicts
fix to a minor bug with intercept
fixing duplicated classifier in asgd test
adding example for memory wrapper
adding support for class_weight in fit method
adding warning if passing class_weight through fit
changing warn message, making it Deprecation Warning, and removing negative index for py2.7
adding simplefilter for warning
changing warning test to use assert_raises_message
Adding Implementation of SAG
Data1010 (2):
fix #5139
DOC FAQ on loading data as numpy arrays
David (1):
[Pipeline] add named_steps attribute documentation #4482
David Cournapeau (2):
REF: hack to be able to share distutils utilities.
BUG: remove remaining utf-8 characters in docstrings.
David D Lowe (3):
Add example for precision_recall_fscore_support
Shrink the doc for precision_recall_fscore_support
Shrink precision_recall_fscore_support's doc again
David Dotson (2):
Closes #4614
Added list of multi-label supporting classifiers.
David Fletcher (1):
DOC fix/expand Imputer example docs
David Marek (17):
fixed SparsePCA.transform returning NaN for 0 in all samples. (fixes #615)
Added test for SparsePCA.transform (checks #615)
ENH: Added p to classes in sklearn.neighbors
TEST: tested different p values in nearest neighbors
DOC: Documented p value in nearest neighbors
DOC: Added mention of Minkowski metrics to nearest neighbors.
FIX+TEST: Special case nearest neighbors for p = np.inf
FIX: pep8
ENH: Use squared euclidean distance for p = 2
ENH: train_size and test_size in ShuffleSplit (#721)
TEST: Added more tests for ShuffleSplit
TEST: Tested ShuffleSplit with different types of test_size
Changed deprecation warning.
DOC: Added changes in ShuffleSplit and sklearn.neighbors
Error checking now works for more types than just int and float.
Use numpy dtype.kind instead of isinstance
TEST: assert_equal instead of assert
David Warde-Farley (21):
Rephrase motivation for Sparse PCA
Misc rephrasings of sparse PCA docs.
Remove 'structured sparsity not implemented' comment
Prefix explanation of sparse PCA formulation with 'Note that'
atoms -> components for clarity
Trailing whitespace fix.
Rewording in docstring
gradient descent -> coordinate descent in docstring
'Returns' section of the _update_code docstring
Wrap np.seterr reset in a try..finally block
ImporError -> ImportError
Added loader code for (Roweis) Olivetti faces dataset.
Added imports to __init__.py for Olivetti faces
Documentation for the Olivetti Faces dataset.
Remove 'load_' alias for 'fetch_'
Use prints for now instead of logging at Gael's request
Add a shuffle keyword, default False
Fix math notation for exp and tanh.
Add pointer to kernel equations from SVC docstring.
Rephrased narrative doc reference in docstring.
Added RST comment about where to find narrative docs.
Denis (1):
FIX Move pooling_func to constructor.
Denis Engemann (117):
FIX + ENH: catch custom function argument errors and inform user
FIX transform tests
FIX: remove inplace mod
COSMITS
FIX: inverse transform + add mean_
COSMITS
FIX: syntax typo
FIX: tutorial
COSMITS + DOC
COSMITS
ENH: improve tutorial to be more clean.
ENH + FIX: remove inverse-t kwarg + fix mean_
FIX: address @agramfort 's comments
FIX: address remaining issues
ENH: speed up logcosh
ENH: improve ICA memory profile by 40%
ENH: add failing test exposing bug in RandomizedPCA
FIX: only center if copy == True
ENH: get it right.
FIX: inverse_transform; tests
DOC better doc message
API: get rid of **params in PCA estimators.
DOC: more doc string fixes in pca.py
DOC: more fixes in pca.py doc strings
STY: get rid of unnecessary identifiers
FIX: X.copy() test now works
STY: removing unnecessary import
COSMITS
ENH/FIX: improve logcosh function + tests
FIX/ENH: revert changes, improve doc
FIX: 80 characters
COSMITS
COSMITS
COSMITS
ENH: address discussion
ENH: use empty, not zeros
COSMITS
ENH: add fast_dot function
employ DataConversionWarning
FIX: avoid zero-division warnings
FIX: fix test warning
better handling of warning in test
COSMITS
FIX: workaround for missing BLAS
ENH: return np.dot if ndim does not match
CLEANUP
FIX: remove spurious test
ENH: add fast_dot shape check
ENH: cover another cornercase + fix messages
ENH: add new Warning class, improve tests, update docs
FIX: superfluous check
ENH/DOC: add performance tips on fast_dot
COSMITS
update what's new
FIX: proper doctest
ENH/FIX: address @ogrisel comments
FIX: move inline comment to its right place
FIX: performance.rst doctest
ENH: restrict fast_dot to np < 1.8
FIX: ghost diff
ENH: update version checking
ENH: put fast dot to PCA, ICA and FA
FIX: test doc np.allclose
ENH: add randomized_svd option to FactorAnalysis
FIX: misc
misc
COSMITS+DOC
ENH: use `algorithm` keyword + expose failing test
FIX: make SVD working again + add tests + update whats new
COSMITS
ENH/FIX: misc + address discussion
COSMITS
ENH: add strong comparisons
COSMITS: address discussion
COSMITS2
ENH: add check on init + appropriate test; COSMITS
ENH: address discussion + actually make randomized default
ENH: exposed n_iter; add missing fast_dot
ENH: improve wording + recommendation
COSMITS3
ENH: add warning instead of print
ENH: deprecate `verbose` parameter
ENH: optimize euclidean norms
FIX scipy API
... add the fix to import check
ENH: improve FA vs PCA example
COSMITS
MAINT: use stable blas getter API
FIX: unintended assignment to ValueError
STY: tuple for Gael
ENH: refactor warnings 1
ENH: add assert_warn_message
ENH: revem tests_metrics.py
ENH: more of this
ENH: final fix: clear all registries
FIX: PPCA related warnings
ENH: context manager + decorator
FIX: doc example
ENH: address discussion
ENH: address discussion
rebase
cosmits
FIX
FIX typo 2
address discussion
remove import
Improve doc
py3k syntax
ENH: color scheme + add PCA
COSMIT
address discussion
address discussion 2 + expose modern API
ICA vs PCA barrier free + fixed
COSMIT
Improve wording + COSMITS
Improve wording + COSMITS II
manual subplots adjust
Denton Cockburn (3):
DOC fix some docstring/parameter list mismatches
renamed weight to sample_weight in sklearn/isotonic.py
DOC missing stuff in randomized_l1 module
Diego Molla (2):
Minor bug fix in metrics.adjusted_rand_score
Added tests
Dmitrijs Milajevs (1):
Don’t embed hyperlinks during latex file generation.
Dmitry Spikhalskiy (1):
Fix mistake in TruncatedSVD explanation in docs
Donne Martin (1):
Fix #4978: Typo in CONTRIBUTING.md
Doug Coleman (18):
BUG: Don't test test_k_means_plus_plus_init_2_jobs on Mac OSX >= 10.7 because it's broken. See #636. Closes #1407.
BUG: Fix the random_state on make_blobs() in test_classifiers_classes(). Fixes #1462.
BUG: Make a RandomState object and use it in test_transformers(). Fixes #1368.
FIX: Cast floats to int before slicing in robust_covariance
BUG: Build random forests the same way regardless of n_jobs and add a test for this. Don't predict in parallel since the cost of copying memory in joblib outweighs the speedups for random forests. Fixes #1685.
COSMIT: Fix up a loop.
COSMIT: Better assert.
DOC: Update new magic numbers in docs since random forests train differently now.
FIX: sklearn.ensemble.forest: Refactor to remove references to parallelism in predict() functions.
BUG: Fix performance regression on large datasets in random forest.
DOC: Emphasize that n_jobs is for fit and predict methods in random forests.
BUG: Use Py_ssize_t to index into numpy arrays to help Python handle big data.
MISC: Update _tree.c with cython.
BUG: Use ``Py_ssize_t`` in a few more places for strides. Add the c file again.
DOC: Clarify docs on preprocessing.Binarizer.
FIX: Finish package rename from mst -> sparsetools. Fixes #2189.
DOC: Fix backwards docs on thresholds for preprocessing.
FIX: Newer numpy causes scipy to issue a DeprecationWarning. Ignore it. Fixes #2234.
Dougal J. Sutherland (3):
fix factor of 2 in RBFSampler; make test more rigorous
clarify KernelDensity.score{,_samples} docstrings
ForestClassifier.predict docstring correction
Dougal Sutherland (4):
StratifiedKFold: remove pointless copy of labels
stochastic_gradient: fix mistake in _init_t docstring
stochastic_gradient: describe all losses, fix epsilon description
support X_norm_squared in euclidean_distances
DraXus (2):
peping8 examples
peping8 examples/applications
Eddy L O Jansson (1):
Wrong format specifier used when formatting exception message.
Edouard DUCHESNAY (45):
add pipeline
WIP pipeline
Example of feature selection pipeline
Merge branch 'master' of github.com:vmichel/scikit-learn
Cosmetic on Pipeline
Merge branch 'master' of github.com:vmichel/scikit-learn
Partial Least Square 2 blocks mode A (PLS) implementation
PLS examples
Merge branch 'master' of github.com:scikit-learn/scikit-learn
PLS mode A : two estimation algo: NIPALS & SVD
PLS: WIP
PLS : cosmetic changes
PLS
PLS cosmetic
PLS: optimize, compare against R implementation, clarify terms
PLS: simplify API + some additional tests
PLS: add transform function
PLS: test_pls fix a bug
Merge branch 'master' of github.com:scikit-learn/scikit-learn
PLS: transform method
PLS : add predict function
PLS : predict
Merge branch 'master' of github.com:scikit-learn/scikit-learn
PLS : make sure this also works with 1 dimensional response (PLS1)
Merge branch 'master' of github.com:scikit-learn/scikit-learn
remove quotes "" on columns names
PLS cosmetic: PEP8, etc.
PLS, new specific classes: PLSCanonical, PLSRegression, CCA + some cosmetics
PLS: computation optimization
PLS API
PLS: API (2)
PLS : coefficients computation
PLS : check for numerical instabilities + force float
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Merge branch 'pls' of https://github.com/fabianp/scikit-learn
Merge branch 'master' of github.com:scikit-learn/scikit-learn
resolve conflict
Merge branch 'pls' of https://github.com/fabianp/scikit-learn
resolve conflict
samples generators: remove multivariate_normal_from_latent_variables
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Check that scikit-learn implementation of PLS provides exactly the same outcomes
Some more non regression test on PLS
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Merge pull request #869 from pprett/pls-scale-by-zero
Eduardo Ariño de la Rubia (1):
Fixed incredibly minor spelling mistake
Eduardo Caro (1):
Change name of variable to be consistent with dataset
Emmanuelle Gouillart (7):
Corrected a few typos in the documentation.
In spectral clustering example, forced the solver to be arpack
Example on tomography reconstruction with Lasso for the gallery.
COSMIT: PEP08
Tomography example: PEP08, typos...
Reference to tomography example in narrative doc
ENH: a few typos in docstrings
Eric Jacobsen (3):
ENH: Call plot twice per class label rather than for every point.
ENH: Call plot twice per class label rather than for every point.
Merge branch 'plot_dbscan' of https://github.com/ericjster/scikit-learn into plot_dbscan
Eric Larson (1):
FIX: Fix for changing mode
Eric Martin (1):
Directly compute polynomial features.
Erich Schubert (2):
Do not shuffle in DBSCAN (warn if `random_state` is used).
Update whats_new.rst
Erik Shilts (2):
DOC missing parts of docstrings
DOC Remove target_names from boston dataset object description
Erwin Marsi (3):
Added cosine distance metric for sparse matrices
added missing assert in unit test
Fixed doc string; compute cosine distance without copying matrix.
Ethan White (1):
Fix typo in linear_model documentation
Eugene Nizhibitsky (1):
Fix staged_predict_proba() in gradient_boosting.
Eustache Diemert (95):
added first version of out-of-core example
revision round #1 (move to examples/applications, 1 file, auto-download dataset)
pep8 / pep257 compliant formatting
get rid of feature dicts, leverage HashingVectorizer class directly
plot as both a function of time and n_examples
using print() function
improve explanations on out-of-core learning paradigm
improve explanations on example structure
fixed use of docstrings + added section in whats_new.rst + added data dir to .gitignore
more robust data location
use same, separate held-out data to estimate accuracy after each mini-batch
added first version of out-of-core example
revision round #1 (move to examples/applications, 1 file, auto-download dataset)
pep8 / pep257 compliant formatting
get rid of feature dicts, leverage HashingVectorizer class directly
plot as both a function of time and n_examples
using print() function
improve explanations on out-of-core learning paradigm
improve explanations on example structure
fixed use of docstrings + added section in whats_new.rst + added data dir to .gitignore
more robust data location
use same, separate held-out data to estimate accuracy after each mini-batch
fixed conflict in whats_new.rst
Merge branch 'master' of git://github.com/scikit-learn/scikit-learn into out-of-core-examples
factorized instance extraction + plots
added note on test set creation rationale
cosmit : inline extract_instance
Merge branch 'master' of git://github.com/scikit-learn/scikit-learn into out-of-core-examples
more structured iteration using islice + wrappers; renamed chunk for minibatch as the latter seems more common in the literature
added sub section on out-of-core scaling in the narrative docs
Merge branch 'master' of git://github.com/scikit-learn/scikit-learn into out-of-core-examples
some more language corrections
more pep257 fixes (not for ReuterStreamReader as it is not really the interesting class here)
DOC recommend understanding NumPy in the tutorial
DOC expand feature selection docs with an example
first draft of scaling_strategies.rst - still wip
improved scaling_strategies.rst, inc. working links
added Federico as author
cosmit pep8/257
added link to user guide + small cosmit
fix re: ogrisel comments
added note on learning rate evolution
readded a paragraph that got lost in the rebase
COSMIT pep8/257
COSMIT typos
COSMIT typos
added links to plots in narrative docs
added bestofmedia logo
added bestofmedia testimonial
cosmits
cosmits (2)
COSMIT more typos/precisions
initial checkin
first plot working
working boxplot
pep8/257 fixes
more pep8/257 fixes
1st draft of performance docs
Merge remote-tracking branch 'upstream/master' into pred_latency
some substance to the perf docs
added influence of n_features; added throughput explanation/graph
Merge remote-tracking branch 'upstream/master' into pred_latency
wip sparsify()
Merge remote-tracking branch 'upstream/master' into pred_latency
completed sparsity sections; added benchmark
cosmit link
cosmit double backticks
precisions from @ogrisel
Merge remote-tracking branch 'upstream/master' into pred_latency
Merge remote-tracking branch 'upstream/master' into pred_latency
@ogrisel precision on sparse + elasticnet tradeoff
link to benchmark in github
fix for np.count_nonzero not present in Numpy < 1.6
whats_new.rst
s/speed/latency/g
added section on model complexity
now compares linear vs svr vs rf
added example and fulfilled section on influence of model complexity
removed CV code to tune hyperparameters (better for CI)
mentioned training throughput
added ElasticNet to the narrative doc
more descriptive legend for ElasticNet
merge doc/whats_new.rst
Merge remote-tracking branch 'upstream/master' into pred_latency
found a good elasticnet example
fixed link & desc in WN.rst
added model complexity in latency/throughput example plots
added prediction time plot to Reuters example
typos in narrative doc
Merge remote-tracking branch 'upstream/master' into pred_latency
Joel's feedbacks
Merge remote-tracking branch 'upstream/master' into pred_latency
Merge remote-tracking branch 'upstream/master' into pred_latency
added model re-shaping + compressed verbose §
emph. reshaping benefits on I/O
Fabian Pedregosa (876):
Add intercept to classes Lasso and ElasticNet
Cosmetic changes in SVM doc.
Start of 0.5 development cycle.
Re-enable code that was removed for the release
Cleanup gmm example. Removed unused modules.
In LAR, normalize only non-zero columns.
Add support in LAR for unnormalized columns.
LAR: add a test for zero coefficients.
Cosmetic changes in glm module.
Add modules to top-level __init__.
Rename ninter --> n_iter in the API guidelines.
Add documentation to svm module.
FIX: bug in blas_opt detection.
Link against compiled cblas in case this is not in the system.
Bug fixing in setup.py
Apply changes made by Olivier.
One more tests on LibSVM with precomputed callable kernel.
Refactoring of LibSVM bindings.
Test for libsvm margin.
More bugfixes for blas detection in setup.py
FIX: numpy 1.4 fixes.
Mark as known to fail some tests in test_hmm.
Use numpy.testing instead of unittest to skip failing tests.
Refine cblas detection on OSX.
FIX: compatibility fixes for py3k.
Initial support for sparse matrices in SVMs (scikits.learn.sparse.svm)
Refine cblas detection on OSX.
FIX: compatibility fixes for py3k.
Initial support for sparse matrices in SVMs (scikits.learn.sparse.svm)
FIX: bug fixing on sparse.svm.
FIX: more bugfixing in sparse.svm.
Doc updates to the svm module.
Remove unused imports in qda module.
Some doc for the svm module.
Add target in to Makefile.
Fix names and missing parameters in LinearSVC.
Add support for sparse matrices in liblinear bindings.
Add a reference to density estimation in GMM docs.
Use relative imports inside scikits.learn.
Remove unused imports from hmm module.
Refinement and bugfixing in the liblinear bindings.
More refactoring and bugfixing with liblinear.
More refactoring in libsvm + liblinear.
remove unused imports from setup.py
run all tests suite through nose.
move liblinear into its own folder
Bug fixing in liblinear bindings.
Added some failing tests.
Bug fixing in liblinear bindings.
XFail tests that fail (or are plainly wrong).
Refactor layout of developer docs.
Revert unwanted changes (aka ooooops!).
Added tests to trigger failure on classes using liblinear.
Refinement and bugfixing in the liblinear bindings.
More refactoring and bugfixing with liblinear.
More refactoring in libsvm + liblinear.
remove unused imports from setup.py
run all tests suite through nose.
move liblinear into its own folder
Bug fixing in liblinear bindings.
Added some failing tests.
Bug fixing in liblinear bindings.
XFail tests that fail (or are plainly wrong).
Refactor layout of developer docs.
Revert unwanted changes (aka ooooops!).
Added tests to trigger failure on classes using liblinear.
Update the developer docs.
Refactoring & bug solving in liblinear.
FIX: fix liblinear predict in the multiclass case.
Merge branch 'master' of ssh://scikit-learn.git.sourceforge.net/gitroot/scikit-learn/scikit-learn
Add reference to pybrain from the ann docs.
Update numpydoc (sphinx extension).
Update svm rst doc.
Add rst doc for logistic (empty for now).
FIX: fix shape of support vectors in liblinear sparse.
Update git information.
Do not compare LinearSVC and SVC for exactly equal classification.
Update git information in README.rst.
Update sphinxext/docscrape from numpy's trunk.
Refactoring in the svm module.
Re-enable probability estimates in logistic regression.
Rename failing example in order to build the doc.
FIX: fix generating the examples with some tricky uses of pylab.
Update install information.
Fallback to plain html for image rendering in index.html
Fix bugs in dev docs.
Updates on install doc.
Update mailmap file.
Updates on sparse.svm.SVC
Remove install_requires line.
Update README. Remove unused dependencies.
Use str for printing parameters.
Update setup.py.
Change make setup to run setup.py
Use repr for arrays in representation of classifiers.
Use nosetest as testing tool in README.
Allow setting variable PYTHON, NOSETESTS in Makefile.
Doc: correct size of intercept in svm.
Keep shrinking and probability as booleans in SVM.
Refactoring: put all gmm examples in its own directory.
Some love for the rst docs.
Create new class NuSVR.
Some patches for k_means.
Backport changes in sparsetools to compile under python2.7.
Add a pure-python version of LARS and refactor structure in glm.
add README for gmm examples.
Refactoring and doc for svm module.
FIX: fixes for the lars lasso code.
Fix build system.
Fixes on the Lasso-LARS algorithm.
Add benchmarks for the LARS algorithm.
Change score function and add docstrings.
Some work on the rst docs.
more doc love.
DOC: more work on svm module.
Fix in LARS: specify manually number of iterations for full path.
Remove "debugging" traces...
Fix doctests.
DOC: some doc glm module.
Convert to ndarray in Ridge
DOC: glm module.
FIX: fix intercept in LinearRegression.
Doc: Add stub file.
Benchmarks for some Glm classifiers.
DOC: more on glm module.
Fix typo
Backport total_seconds from python2.7 to use in benchmarks
Refactoring in glm.benchmarks.
Rename nSV_ --> n_support_ in svm module.
Some more doc for the glm module.
Update doc svm.NuSVC
Make BSD find happy.
Compat: Add function copysign in the case of numpy < 1.4
Do not import pylab globally in benchmarks
Test for function utils.fixes._copysign.
FIX: fix previous (stupid) commit.
Welcome Virgile Fritsch
Update docstring for BayesianRidge.
DOC: update docstrings in glm.bayes.
Update docstrings in svm and logistic.
FIX: some fixes for bayesian doc in glm.rst
DOC: some more fixes on BayesianRegression doc.
Better error message in fit svm.
Fix failing tests (sparse svm).
Merge http://github.com/GaelVaroquaux/scikit-learn into gael
Be able to do _get_params and _set_params in a recursive way.
Fix imports.
Temporary fix for Table of Contents not showing.
Do not import pylab in the benchmarks.
LARS refactoring speedup Work In Progress!!!
more on lars optimization
more on lars speedup WIP
LARS with precomputed kernel working.
more work in lars optimization.
More on LARS performance: triangular solving and cholesky deletes.
A more challenging example.
cleanup and fix some tests.
More on LARS.
more optimizations
cleanup
Update CBLAS files: add rotg, rot, trsv, remove tpsv.
Good bye minilearn.
Fixes for ref atlas.
Some fixes for the atlas we ship.
Add missing cblas_dcopy files.
New theme for the web page.
Add what's new page and a nicer sidebar for index page.
Move glm related benchmarks to a common location.
Performance improvements for LARS precomputed Gram matrix.
Remove weight_label keyword from SVR.
Remove Minilearn C sources.
Add some developer information for the CBLAS we ship.
Cosmetic changes to the cblas README.
More love for the new web page theme.
Refactoring in the svm module.
Also remove Windows python extension (.pyd) in make clean.
move sparse.svm into the svm module to match glm.sparse.
Make a reference page in the docs.
Update class reference.
Cleanup in setup scripts.
Bugfixing in setup.py
cosmit.
Cosmetic changes to svm tests.
Correct default value of gamma in the svm docstrings.
Refactoring in sparse SVM and bug solving (default value of gamma).
Refactor svm tests.
Fix doctest failing by last bug fixing.
Add target test-doc to Makefile to test the RST docs.
Remove obsolete debugging code from grid_search.
Remove obsolete comments from doc.
Cosmetic changes to grid_search_digits.
Reduce import time.
Polynomial kernel also uses keyword gamma.
Fix typo in svm docs.
Fix wrong link in doc.
DOC: update doc about LARS.
Add LARS, LassoLARS to class reference.
Update funding info.
Update API changes in feature_selection doc.
Remove the ann module.
Remove obsolete css code from the docs.
Update docs on sparse svm.
Correct spelling errors in svm documentation.
Fix spelling errors in glm.rst
Remove non ASCII characters from the docs (problem in latex output)
Fix non-html (latex) generation of the docs.
Fix RST (numpydoc) markup.
Update organization in index.rst
Update doc of neighbors module.
Update links in svm doc.
Update theme in web page (sidebar color)
Fix malformed RST in BallTree.cpp
Change doctests that are machine dependent.
Update joblib to 0.4.5
Comment out fragile joblib tests.
Add test.py script that runs nose.
Remove printing statements from tests.
Adopt numpy naming scheme for __version__ attribute.
Compatibility fixes for utils.graph.
Use by default np.unique.
Compatibility fixes for scipy <= 0.7 and numpy <= 1.4
Compatibility fixes for old scipy.sparse.
Do not include Makefile in final release.
Add missing files to setup.py
Add missing images
Rename features --> feature_extraction to match module feature_selection.
Update information on testing.
FIX: bug in setup.py file from glm/sparse/
Update web page.
FIX: fix imports in example for renamed modules.
Add function template for doc.
Update web page theme.
Update mailmap file.
Add Feature Selection classes to the reference docs.
0.5 changelog (Work in progress).
DOC: better link that literalinclude.
Combine user guide into a single file.
Update changelog.
Add png logo.
Update MANIFEST.in file.
Update test.py and README.
Simplify test machinery.
Use ELLIPSIS for machine-dependent results in joblib.
Comment out machine-dependent tests from joblib.
Update Makefile
Welcome Mathieu Blondel.
Fix doctests from the tutorial.
0.5 release candidate.
FIX: some setuptools oddities.
0.5.rc2 release.
Still fixing distutils oddities ...
0.5.rc3 release.
Web page update.
Add sparse to glm/__init__
Fix typo in docstring.
Fix typo
Fix doctests from RST docs.
Fix links in about page.
Cosmetic changes in install.rst
You want the truth well here it is.
Add a link to the PDF version of the docs.
0.5 final release.
Start of 0.6 development cycle
Add note on executing the test suite.
Update web page.
Add a note on complexity for SVMs.
Add datasets to __init__ file.
Correct typo in docstring.
Allow access to multi-class SVM in liblinear.
Do not execute test coverage by default.
lighten GMM tests.
remove n_dim property (use plain field).
Fix and enable _covar_mstep_full in gmm.py
Cosmetic changes.
Bindings for libsvm-dense
Update svm benchmark with latests libsvm.
Some fixes for libsvm-dense
More accurate info in examples.
Update svm examples affected by latest API changes.
DOC: Some docstring for libsvm low level API.
Revert "DOC: Some docstring for libsvm low level API."
Revert "Update svm examples affected by latest API changes."
Revert "More accurate info in examples."
Revert "Some fixes for libsvm-dense"
Revert "Update svm benchmark with latests libsvm."
Revert "Bindings for libsvm-dense"
ENH: enhacements in the gmm module.
Make previous commit work also with old versions of scipy.
No specific need that matrix is upper-triangular in gmm.
Fix doctests in gmm (skip random ones).
Revert "Fix doctests in gmm (skip random ones)."
Revert "No specific need that matrix is upper-triangular in gmm."
Revert "Make previous commit work also with old versions of scipy."
Revert "ENH: enhacements in the gmm module."
Bindings for libsvm-dense
Update svm benchmark with latests libsvm.
Some fixes for libsvm-dense
More accurate info in examples.
Update svm examples affected by latest API changes.
DOC: Some docstring for libsvm low level API.
Compile _libsvm_sparse in the sparse module.
Add setup.py to svm.sparse
Preliminary fix for naming issue in OSX with libsvm.
Add a namespace to svm methods to avoid same name mangling.
Fix for building libsvm in a portable way.
FIX: fix doctest with recent API changes.
FIX: fix fragile doctest.
Updated liblinear to latest version 1.7.
Make liblinear quieter.
Update classes to use new features from liblinear 1.7.
Move logistic into glm and add a sparse version.
Doc: better tests for logistic.
Fix imports in example.
Fix doctests in sgd module.
Welcome Peter.
Avoid iterating over features in gmm.
Add more sanity checks for svm with precomputed kernels.
Use n_jobs=1 as default value in SGD module.
Unique URL for release-specific doc
Cleanup in libsvm bindings.
Cosmetic changes in gmm.
Improve docstrings in metrics.py
Cosmetic changes
DOC: Add new installation media and a note for pythonxy users.
FIX: prefix with plot examples that produce output image.
New implementation of LARS algorithm.
Add a test for lars_path.
Fix typo (wantto -> want to)
remove obsolete bench_lars.py
FIX: replace nsamples --> n_samples in svm docstrings.
Remove BaseLib class.
Implement make html-noplot for building the doc.
Update libsvm docstring with latest API changes.
Rename predict_margin --> decision_function.
Indentation fixes in libsvm bindings.
Performance improvements in LARS.
Better heuristic in LARS.
Add support for np.float32 matrices in lars_path.
Add parameter precompute='auto' for *LARS classes.
Some LARS refactoring.
Rename scikits.learn.gmm to scikits.learn.mixture.
Update developers info.
Add GridSearch and GridSearchCV to the class reference.
Update svm docs (content of dual_coef_).
Account for lower=True option in solve_triangular.
Do not import gaussian_process from top level __init__.
update NuSVC docstring.
Fix failing doctests in gaussian_process.rst.
Fix GridSearch does not exist.
Give credit for web page layout.
glm --> linear_model rename holocaust.
Welcome Vincent Dubourg.
Update AUTHORS information.
Initial support for weighted samples in svm module.
Cosmetic changes to web page layout.
Fix example paths for GMM after renaming.
Update class reference list.
Cosmetic changes in documentation.
Add sgd.* to class reference.
Move benchmarks outside the source tree.
Fix precompute keyword in LARS.
Update LARS benchmarks with latest API changes.
Cosmetic changes in plot_weighted_samples.py
Add cross-references between LassoLARS and Lasso.
More rename in the sgd module.
ENH: prettify web page layout.
Some love for scikits.learn.svm.
FIX web page layout for very long paths.
Update LARS documentation.
Fix for linear_model.rst
More love for rst docs.
Like it or not, we depend on setuptools.
Use original diabetes data as shipped by the R package lars.
rename lars --> least_angle
Remove duplicates in linear_model/__init__.py
Use relative imports in datasets.
FIX: sparse svms do not accept callable kernels.
py3k fixes: callable has been removed.
Py3k compatibility
Remove redundant site.cfg parsing.
Update status of py3k support.
Cosmetic changes in LARS.
FIX: correctly add depends files to setup.py.
Make libsvm recognize labels in increasing order.
Correct array size in decision_function docstring
TEST: sanity check on decision_function.
Inverse sign in decision_function.
No need to sort predict_proba any more.
Add a comment on inverting the sign of decision_function.
FIX: order of indices of support vectors in multiclass.
Shuffle globally for iris in test_svm.
Divide parameter alpha / n_samples for consistency with Lasso.
Cosmetic changes.
Cosmetic changes in lars.
Update .mailmap
FIX: fix bug in sparse liblinear: bias parameter was not set.
FIX lda, qda: new numpy.bincount requires integer arguments.
Started Changelog 0.6.
Change link in plot_face_recognition.
Remove example plot_lar.py
FIX: do not invert the sign of decision_function in OneClassSVM.
Add missing options to OneClassSVM.
web page layout fixes.
Remove duplicate docs (sphinx generates this for us).
Prepare for 0.6 release.
Remove generated classes on make clean.
Add notes on fluctuations of liblinear.
Add type info to docstrings.
FIX: backwards compatibility for scipy <= 0.8
Remove Methods from docstring.
FIX: scipy 0.9 compatibility fixes
FIX: second argument in euclidean_distances.
Cosmetic changes.
Better version detection for scipy
FIX: stupid mistake.
FIX Stupid mistake
More robust utils.fixes.
FIX: docstring.
FIX: np.unique.
Start 0.7 development cycle.
Add AUTHORS to web page.
Note on LinearSVC.
Web page layout.
FIX: update to latest API.
Web page update.
FIX tests when run with scikits.learn.test()
Update doc.
Update Mailmap.
Update authors list.
Update README.
Add all doc to generated latex.
Add species distribution modelling to OneClass examples.
Add other ways to contribute to the doc.
Little doc improvements to the grid_search.
DOC: remove duplicate information.
Remove unused imports
Add installation instructions for NetBSD.
Revert "Partial Least Square 2 blocks mode A (PLS) implementation"
Revert "PLS examples"
Revert "PLS mode A : two estimation algo: NIPALS & SVD"
Some docstrings added to ridge.
Rename lb -> label_binarizer.
Add note on multi-class classification.
Add some more doc to LabelBinarizer.
Some love for lars_path.
Turn off axis in plot_iris.
ENH: implement decision_function for libsvm-based classes.
DOC: svm.rst refactoring.
FIX: always raise ValueError on deficient input in BaseLibSVM.
FIX: fixes & tests for liblinear decision_function.
ENH decision_function liblinear, sparse variant.
FIX: fixes for liblinear decision_function.
Nicer support vectors in example plot_separating_hyperplane.py
PEP8 fixes.
Doctest fixes.
Remove obsolete info.
Squash function in test_svm.py
remove unused.
Add RandomizedPCA to RST docs.
PCA docstrings restructuring.
Do not resize the array on k=1.
ENH: Neighbors refactoring.
Add parameter eps to NeighborsBarycenter.predict.
FIX: fix dimensions in plot_neighbors_regression.
Simpler doctest for neighbors.
FIX: rename adjacency --> connectivity in kneighbors_graph.
Change the algorithm used in neighbors.barycenter.
small fix in barycenter
remove unused imports.
Rename barycenter --> barycenter_weights (as it was before).
Neighbors refactoring.
FIX: fix collinearity issues in least_angle.py
Regenerate Cython file _liblinear.pyx
Remove arbitrary code in tests.
Simpler check for orthogonality.
Add pls to __init__
DOC: set up barebones documentation for PLS.
FIX: do not resize array in knn_brute.
Faster Neighbors* in high dimensional spaces.
Use squared distances.
FIX: typos and missing info in docstring.
metrics.pairwise has right to live.
Rename inplace --> brute_inplace
ENH: better consistency tests for neighbors module.
FIX: typo.
FIX: don't import assert_allclose
So this is why people kept posting issues to SF's trac ...
Deleted code is debugged code.
Cosmetic changes in decision_function.
Rename strategy --> algorithm in Neighbors*.
Improve performance of GMM sampling
Second patch by f0k.
Cosmetic fixes in GMM.
More cosmetic changes in GMM.
Rename ndim --> n_dim
Rename nobs --> n_obs
Some more docstring fixes for mixture.
Examples cleanup: remove pl.close, it is now handled by gen_rst.
Changelog for 0.7
More doc on 0.7 release.
More on changelog.
Minor fixes in changelog.
Add metrics to the doc.
More fixes for the changelog.
Some more changelog stuff.
FIX: mxf --> Xinfan Meng
Documentation update.
Replace latex with simple syntax in docstrings.
Start of 0.8 development cycle.
Building on Windows.
Build precompiled windows binaries.
ENH: make transform() work when no Y is given.
Remain compatible with numpy 1.2
Do not import scipy.sparse globally.
Implement probability estimates for SVR and OneClass.
Raise NotImplementedError on predict_proba when model do not implement
Update numpy/scipy requirements.
Read README.rst for description in PYPI
DOC: clearer doc for BallTree.
DOC: docstring enhancements for Gaussian Naive Bayes.
DOC: some documentation for naive_bayes module.
Refactoring in svm module.
ENH: better doc and tests for unbalanced svm's
Python 3 compatibility.
Nicer low-level API for libsvm.
Ignore OSX .DS_Store files.
Revert "Python 3 compatibility."
FIX: rename eps to tol also in svm.sparse.
ENH: cython bindings for libsvm's cross_validation routine.
Revert "Python 3 compatibility."
FIX: rename eps to tol also in svm.sparse.
ENH: cython bindings for libsvm's cross_validation routine.
FIX: cross val return array size.
Initial implementation of cross validated SVC
Python 3 compat, this time with npy_3kcompat.h
Revert "Initial implementation of cross validated SVC"
Merge branch 'cython-balltree-wrapper' of https://github.com/thouis/scikit-learn
Cosmetic changes in base.py
FIX: py3k compat.
I won't import scipy.sparse globally.
Some cleaning in libsvm sparse bindings.
name consistency in sparse svm
ENH: low-level API of libsvm.
Cleanup in libsvm helper.
FIX: important fix for sparse SVC (weights were not initialized correctly).
Don't hardcode n_jobs.
Add regularization in the computation of barycenter weights.
Add regularization in the computation of barycenter weights.
libsvm low-level API refactoring.
PEP inquisition.
Some fixes for web layout.
Remove obsolete information.
More low-level refactoring.
Return first score in case of ties.
rename grid_points_scores_ to grid_scores_ in GridSearchCV
Some tests for the things I changed in GridSearchCV.
Merged pull request #135 from paolo-losi/l1_logreg_minC.
DOC: fix links to l1_min_c
FIX: reference to l1_min_c
Merge branch 'covariance' of git://github.com/VirgileFritsch/scikit-learn
Cosmetic changes in covariance.
DOC: add low-level methods from libsvm.
FIX: fix rename of grid_scores_
Do not open file write file until download is complete.
Add tests for libsvm.cross_validation.
Add optional parameter n_class to load_digits.
Merge pull request #144 from larsmans/balltree-cleanup.
FIX: missing import in plot_covariance_estimation.py
Py3K: use explicit floor division
Return also t from swiss_roll generator (needed to plot colors)
CSS style tweaks.
more CSS tweaks.
Some more CSS tweaks
Initial implementation of Locally Linear Embedding.
Re-generate .cpp from ball_tree.pyx
pep8 clean.
FIX: python2.5 SyntaxError
FIX: tuples have no .index in python2.5
FIX: more python2.5 SyntaxError
FIX: explicit linking against std++ breaks under mingw32.
FIX: fix import paths in doctests.
Merge pull request #157 from fabianp/joblib_fix
FIX: compatibility python2.5
DOC: add docstrings to BallTree.
Update neighbors with latest changes to BallTree.
Update .mailmap
Layout fixes.
Add analytics code to web page, SF discontinued web page stats.
Changelog
Some doctest fixes.
More docstring fixes.
FIX: change doctest to avoid results with NaN
I have no idea why, but this fixes the broken doctest.
Start of 0.9 development cycle
Welcome Lars & Edouard.
FIX: pls docstring.
DOC: added section on complexity for LLE.
Rename embdding_vectors_ --> embedding_
Add submodule for manifold.
Cosmetic changes.
Merge pull request #3 from GaelVaroquaux/manifold
More on practical tips.
Typo
FIX: bad import
Move cache_size out of model parameters.
Cosmetic changes in the docs.
Docstring for test.
Test for non-contiguous input for svms
Implement predict_proba for sparse svms.
FIX: doctests in svm doc
ENH: support instance of BallTree as input to kneighbors_graph.
Merge branch 'master' of github.com:scikit-learn/scikit-learn into manifold
Implement transform method in LLE.
FIX: fix test.
more fixes.
FIX: fix segfault in cases of infeasible nu (NuSVM)
FIX: transform method.
Merge pull request #153 from fabianp/manifold
FIX: use NeighborsClassifier in test.
FIX: some bugs in locally_linear_embedding.
DOC: remove obsolete information in neighbors.rst
Add max_iter to LARS.
DOC: fix errors in manifold doc + style tweaks.
Explicit cmap in swissroll example.
Add test and cleanup for 2c1c88
Test: test for unnormalized predictors.
Add failing test.
DOC: add reference to FastICA from the ICA docs.
DOC: add fit_intercept to LinearSVC docstring.
Refactoring in ridge.py
Rename of cg -> dense_cg and 'default'-> 'dense_cholesky'.
Some docstring updates.
Move scipy_future into utils.arpack
Add Jake to the manifold credits.
Merge pull request #222 from jakevdp/balltree-doc
Explicit cmap for plot_compare_methods.
Cosmetic cleanup.
FIX: bad logic in Pipeline.
Revert "FIX: bad logic in Pipeline."
Refactoring in libsvm bindings.
FIX: fix bug in LLE with dense solver
Update ARPACK from scipy.
Backward compatibility fixes for testing LLE.
FIX: arpack doctest
comment LLE arpack test
Protect against MemoryError in libsvm.fit
FIX: doctest Ridge.
FIX: add newline after autosummary:: sphinx directive.
Layout & consistency fixes linear models documentation.
cosmetic linear_model.rst
FIX doc linear_model.rst
Layout tweaks.
DOC: new example for Ridge + more rst docs
Merge pull request #236 from JeanKossaifi/sparse_matrix_type
Don't use np.atleast_2d when interfacing with native code.
Some documentation for hmm module, and a warning.
Revert "pyflakes warnings"
Covariance with residual at the end for path is zero.
FIX: LARS doctest in linear_model.rst
Update rsync command
Merge branch 'variational-infinite-gmm' of https://github.com/GaelVaroquaux/scikit-learn
Replace logsum by np.logaddexp in hmm, tweaked some tests.
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Merge pull request #218 from fabianp/fix_lars
Rename n_states --> n_components in mixture & hmm + cosmetic changes.
FIX: support numpy < 1.3
Merge pull request #280 from vene/lars_n_features
Remove max_features keyword from lars_path.
Use default value for n_nonzero_coefs
Remove hardcoded n_jobs from examples.
Revert "Remove hardcoded n_jobs from examples."
Don't use n_jobs=-1 in the examples.
Refactor tests for SVR.
Correct NuSVR impl in the sparse case.
Add tests for last commit.
Remove fit params from all objects.
Merge pull request #316 from jakevdp/cython-ball-tree
Compatibility fixes for Python 2.6 and doctest fixes
FIX: py3k compatibility.
FIX: py3k compat.
Merge pull request #326 from bdholt1/fix/svm
Welcome Brian Holt
FIX: broken example
Generate thumbnails in the example gallery
Link images to example file in new gallery
FIX some broken examples.
Rename face_recognition so that the result is plotted.
Revert "Rename face_recognition so that the result is plotted."
FIX: linnerud dataset mixed variables.
Layout tweaks
Cosmetic changes in example docstring
layout tweaks
Move project directory from scikits.learn to sklearn
Add a compatibility layer for some modules.
Forgot to add a blank image for the docs.
Revert "Add a compatibility layer for some modules."
Revert "Move project directory from scikits.learn to sklearn"
Move project directory from scikits.learn to sklearn
Add a compatibility layer for some modules.
correct imports
Merge pull request #335 from fabianp/rename
Add more modules for compatibility layer.
More renaming.
scipy.lena() no longer works on scipy's dev version.
FIX: fix variable referenced before assignment in libsvm.pyx
Do not import mixture from top-level sklearn.
DOC: add parameter C to docstring.
Use LinearSVC's docstring instead of outdated one.
scipy.lena has moved to scipy.misc.lena in scipy's dev version.
Use 1 / n_features as value for gamma.
FIX broken tests by last commit.
Add changelog for changing gamma parameter.
FIX example logistic regression.
Move matrix factorization to work in progress.
Initial changelog -- to be completed.
More changelog and .mailmap
Why, emacs, why ??
Update changelog
DOC: broken link to example
FIX: add test fixtures to distribution.
FIX: broken link to example
DOC: always generate pages for autosummary.
FIX: some sklearn.test() fixes.
Add Vlad as GSOCer
Complete Changelog.
FIX: import path under scipy's dev version.
Comment tests that depend on PIL.
Comment out tests for the current release.
FIX typo
FIX: docstring for RadiusNeighborsRegressor
sklearn.test() does not like doctest that don't print.
Doc: Print --> Issue
Safer assert_all_finite.
Some more doctest fixes for sklearn.test()
Update committers list
Start of 0.10 development cycle.
Some Python 2.5 fixes.
More python2.5 fixes
FIX: assign NaN to an integer array has no effect on old numpy
Some more changelog stuff.
Update MANIFEST.in: scikit-learn --> sklearn
Add mldata loader and olivetti dataset to changelog.
Faster tests for coordinate_descent.
Add changelog entry.
Merge pull request #375 from VirgileFritsch/mcd
Merge pull request #383 from bdholt1/svm-mem-leak2
Add Brian's name to the Changelog.
FIX: keywords {precompute, Xy} were implemented and documented but unused ...
Cosmetic changes in LARS
FIX: Py3k compatibility.
Delete benchmarks/bench_svm.py
Delete benchmarks/bench_neighbors.py
MISC: More meaningful names for lapack functions in least_angle.
Removed unused parameters in least_angle
Convert to scipy doc convention + add missing options
FIX: array2d did not return contiguous arrays with order='C' ...
FIX: do not use reshape in libsvm sparse bindings.
Use centralized directory for generated files.
Description for logo: font, color, etc.
DOC: Move practical info into its section and delete duplicates.
Style: webpage tweaks
Style update in documentation.
Doc: minor fixes
Minor update and fixes to linear_model documentation
Minor update and fixes to linear_model documentation
Move implementation details into RST doc.
Docstring conventions.
DOC: rename n -> p
Web page layout tweaks.
Small comment on the dual parameter
Use M.dot instead of np.dot on sparse matrices
FIX: LLE mode='auto' for small matrices and tuples.
FIX: use .toarray() instead of .todense()
COSMETIC: more readable syntax for mult. of sparse matrices.
Merge pull request #466 from amueller/svm_iris_example
Remove useless benchmark.
FIX: broken benchmark
Move uninteresting example to docstring
FIX docstring
Merge pull request #456 from vene/sparse-coder
Remove duplicate definition in RST
Replace unmaintainable test
More robust test for lars_path
Typo in example. Thanks Virgile for the cool example.
Revert code that I erroneously changed
Remove old API change warning
Merge pull request #504 from jakevdp/sphinx-images
FIX: docstring
DOC: example for sklearn.test()
FIX: convert lena to float32 (originally it's ints)
FIX: doctest
Still some tweaks for the sklearn.test() example
Remove pylab code from docstring and +SKIP those that require PIL
FIX: explicit conversion to float64 in ElasticNet
FIX: bug in elasticnet with precompute not being updated correctly.
DOC: complete docstring for regression score function
DOC: restructure docstring of ElasticNet.
Changelog
Start of 0.11 development cycle.
Mailmap alias
And the winner is ...
DOC: links for people that have webpage.
DOC: some documentation fixes.
DOC: docstring update for dump_svmlight_file
Refactor in KFold.
Set the download link to PYPI.
FIX: bug in DenseBaseLibSVM when subclasses implement new params
FIX: inheritance in DenseBaseSVM
Add Satra to the AUTHORS list.
WEB: update the designer's URL
FIX: latex underscore
Explicit cmap for background.
Some doc for the example "Lasso path using LARS"
Some documentation for example plot_ridge_path
BUILD: add gemv cblas routine
BUILD: add dger cblas function
Update README.rst
Merge pull request #1078 from buguen/docs
Print running time as a floating-point number with two decimals.
Merge pull request #1138 from fabianp/doc_float
Robustify LARS. Fixes issue #487
New (faster) implementation of isotonic regression
ENH Improve Ridge's conjugate gradient descent
Added the paper I used to implement isotonic_regression.
Add support for preference constraints in svmlight format.
FIX: query_id parameter and other cosmetic changes
Add test for load_svmlight_files
Merge pull request #1182 from fabianp/svmlight
FIX: typo in ValueError message.
Add support for query_id in dump_svmlight_file
DOC: added svmlight qid support to whats_new.rst
Python3 compat: print()
ENH: Consider order in X for IsotonicRegression.
Better tests + cosmetic changes.
Store X as an ordered array.
Clarify docstring in lars_path
Update LIBSVM_CHANGES
Add SVD-based solver to ridge regression.
Remove unnecessary code in ridge svd
BUG: solver was not passed to computational method in Ridge object
Use Cholesky solver by default, but use SVD as fallback
Use ValueError for non-existent solvers
Merge pull request #1914 from fabianp/ridge_svd
Test for singular matrices in Ridge regression
Fix broken link to web designer
Fix broken link to web designer
Add artwork with reference to logos in about.rst
Cosmetic: correct latex formula display
DOC: write cost function of logistic regression.
DOC: add transpose and intercept to formula of logistic regression
Merge pull request #3378 from kastnerkyle/dense_cholesky_warnings
FIX: check with tolerance on lars_path
Cosmetic np.abs -> abs
DOC: expand documentation for logistic regression
Implementation of logistic_regression_path.
Take into account @agramfort's comments.
Some fixes for LogisticCV object
Docstring of LogisticRegressionCV
Docstring
Refactor and some bug fixing
Docstring
FIX missing import
FIX: bug in LogisticRegressionCV
Make tests deterministic
FIX: coef.shape
Just to be sure
Add test
BUG when fit_intercept=False
Fallback for failing line search
Remove warning (not needed any more)
Compatibility for old scipy
iterate some more on line search
cosmetic
ENH LinearSVR using liblinear code
DOC: make LinearSVR appear in the doc reference
Change loss names for LinearSVC and LinearSVR()
Fazlul Shahriar (1):
DOC fix docstring typos in cluster/mean_shift_
Federico Vaggi (7):
Added test_regressor_pickle to tests.
Added test_classifiers_pickle to tests.
Finished adding pickle tests.
Removed the use of StringIO, using pickle.dumps instead.
cosmetic: Changed all instances of nonlinear to non-linear
ENH: Added comparison of other classifiers using partial fit methods
Fixed misc style changes suggested by Ogrisel
Felix Brockherde (1):
FIX scores calculation in ovo multiclass
Fernando Carrillo (1):
Updated link. Original link to ZhangIJCV06 paper "Local Features and Kernels for Classification of Texture and Object Categories: A Comprehensive Study" was dead. Replaced all occurrences of "http://eprints.pascal-network.org/archive/00002309/01/Zhang06-IJCV.pdf" with "http://research.microsoft.com/en-us/um/people/manik/projects/trade-off/papers/ZhangIJCV06.pdf"
Feth Arezki (1):
lfw: import imread from new location in scipy
Florian Hoenig (3):
added test that fails because Scaler.fit changes a sparse input vector when Scaler is initialized with copy=False
removed bug in Scaler.fit
improved test_scaler_without_copy
Florian Wilhelm (100):
Added multiple linear Theil-Sen regression
Added an example and documentation for Theil-Sen
Added subpopulation parameter to Theil-Sen
Added parallelization support to TheilSen Estimator
Improved parallelization for Theilsen estimator
Merge branch 'master' into theilsen
Cleanups and corrections for Theil-Sen regression.
Removed subpopulation=None option in TheilSen
xrange fix for Python3 in TheilSen
FIX Theil-Sen unittest for older Scipy Version
FIX that some functions in Theil-Sen were public
FIX usage of linalg from Scipy in Theil-Sen
FIX: Let Theil-Sen handle n_samples < n_features case
FIX: Python 2.6 format syntax in Theil-Sen
Vectorization of theilsen._modweiszfeld_step
FIX: Parallel unittests for Theil-Sen estimator
FIX: TheilSen supports old Numpy versions
DOC: Comparison of Theil-Sen and RANSAC
DOC: Fixed typo in Theil-Sen example.
FIX: Some coding style fixes in TheilSen unittest.
FIX: Reduced the runtime of the TheilSen unittest.
DOC: Small corrections in the docs of Theil-Sen
DOC: Explanation when TheilSen outperforms RANSAC.
Merge branch 'master' into theilsen
Merge branch 'master' into theilsen
Fix for old Numpy 1.6.3
Added to comments to better explain last commit
Use string argument for legend's loc parameter
Merge remote-tracking branch 'gvaroquaux/pr_2949' into theilsen
Merge remote-tracking branch 'upstream/master' into theilsen
TST: Cleanups in test_theil_sen
COSMIT: Renamed _lse to _lstsq in theil_sen.py
ENH: Removed shared-memory parallelism in theil_sen
COSMIT: Inlined two methods in theil_sen.py
ENH: Use warnings instead of logging in theil_sen
ENH: Removed _split_indices method in TheilSen
ENH: Rewrote TheilSen._get_n_jobs as a function
COSMIT: More explicit names for vars in theil_sen
Merge branch 'master' into theilsen
FIX: usage of check_array in theil_sen
FIX: Use check_consistent_length in theil_sen
ENH: Refactoring in theil_sen
ENH: Removed unnecessary generator in theil_sen
FIX: doctest of get_n_jobs
ENH: Theil-Sen vs. RANSAC example
Merge branch 'master' into theilsen
COSMIT: Small changes regarding Theil-Sen
DOC: Better documentation for Theil-Sen
ENH: Improvements in the Theil-Sen regressor
ENH: Shortcut for 1d case in spatial median
ENH: Avoid trailing \ in test_theilsen imports
FIX: TheilSen -> TheilSenRegressor in docs
DOC: Narrative doc for median_absolute_error
ENH: Reworked _modified_weiszfeld_step
DOC: Improved _spatial_median docs
COSMIT: Replaced xrange by range
COSMIT: Renamed y -> x_old in _modified_weiszfeld_step
COSMIT: Renamed spmed[_old] to spatial_median[_old]
COSMIT: Break and for .. else in _spatial_median
ENH: Reworked _lstsq in theil_sen.py
ENH: Replace AssertionError by ValueError
COSMIT: Improved error message in theil_sen.py
COSMIT: Renamed n_all to all_combinations in theil_sen.py
COSMIT: Consistent naming for n_subpop
COSMIT: Fixed pep8 problem in theil_sen.py
DOC: Moved notes section to long description
Merge branch 'master' into theilsen
ENH: Added median_absolute_error to metrics
DOC: Added doc in model_evaluation.rst
ENH: unit tests for median_absolute_error
DOC: Added return doc to _lstsq
COSMIT: Better variable names for _modified_weiszfeld_step
COSMIT: Moved epsilon to module level
COSMIT: Some empty lines for better readability
ENH: test_common.py unit tests pass
FIX: Small typo
Merge branch 'master' into median_absolute_error
ENH: Removed median_absolute_error due to PR #3761
ENH: Made n_subpopulation a fit parameter
COSMIT: Some renamings and PEP8 compliance
Merge branch 'master' into theilsen
ENH: Removed 1d shortcut in _spatial_median again
COSMIT: Clearer slicing syntax in _modified_weiszfeld_step
ENH: Fixed confusing X = X.T renaming
COSMIT: Some renamings in _lstsq
FIX: Fix of last merge with master
COSMIT: Renamings for easier understanding
COSMIT: Another slicing syntax cleanup
Revert "ENH: Removed 1d shortcut in _spatial_median again"
ENH: sample without replacement
Merge branch 'master' into theilsen
COSMIT: pep8 and renaming
COSMIT: replaced assert by assert_less/greater etc.
TEST: No console output during unit tests
ENH: Always set random_state in unit tests
ENH: Speedup of unit tests
COSMIT: Better consistency
ENH: Added random_state in plot_theilsen.py
MAINT remove multi-output support from median absolute error
DOC: Added whats_new for TheilSenRegressor
Francois Savard (2):
Fixed docstring for C param in BaseLibLinear/SVM subclasses.
Added version info to deprecation warning
Frank C. Eckert (1):
Fix typo in model evaluation documentation
Frank Zalkow (9):
added a string for FriedmanMSE (instead impurity) when exporting a dot file
missed elif
now uses isinstance and keeps original name (FriedmanMSE)
added a test and reverted string to friedman_mse
corrected test: criterion name has to be only in those nodes where there is "samples"
took if clause into regex
implemented improvement suggestions
added newline (for pep8) and reverted to regex solution due to 0.0/-0.0 problem on windows
fixes
François Boulogne (1):
DOC: minor improvement in comments of an example
Félix-Antoine Fortin (5):
Modified package name in Easy Install section.
DOC/FIX affinity_propagation damping default value.
DOC fix roc_curve docstring.
DOC: fix 2 covariance examples rst math markup.
DOC: Remove extra # from url in fit.
Gael Varoquaux (1621):
MISC: Make sure that the tests pass on numpy 1.2
MISC: Cosmit + replace some global seeds with RandomState
MISC: Rename to let the underscore RULE!
Merge branch 'master' of ssh://scikit-learn.git.sourceforge.net/gitroot/scikit-learn/scikit-learn
API: Create a base estimator class.
ENH: improve base class
ENH: Temporarily remove the typing for the base_estimator
TEST/BUG: Test the BaseEstimator class and fix the repr
Merge branch 'master' of http://github.com/agramfort/scikit-learn
Cosmit
MISC: Remove the #$! import *
ENH: convert all GLM estimators to the BaseEstimator class
BUG: Fix the OLS regression
BUG: Fix constructors with arguments.
BUG: Syntax error
BUG: str of linear models now working.
API: Change the type of params: turn this into a frozenset: immutable and
Merge branch 'master' of http://github.com/agramfort/scikit-learn
ENH: Change _params to frozenset
API: Change argument controlling whether intercept should be fitted
Cosmit in tests
Merge branch 'master' of http://github.com/agramfort/scikit-learn
API: Change the BaseEstimator and parameter signature logic.
Cosmit
Cosmit
ENH: Convert LDA and clustering to use the new BaseEstimator
MISC: Change the title of the documentation.
Cosmit
ENH: Make the clustering more usable
ENH: Add an example of playing with the stock market
ENH: Make SVMs fit to the BaseEstimator API.
ENH: Make SVMs fit to the BaseEstimator API.
Cosmit
MISC: Put the nearest neighbors estimator to the BaseEstimator
MISC: rename base_estimator.py to base.py
BUG: Make sure that the docs still build with recent versions of numpy
BUG: Make sure the docs still build with recent versions of numpy
Merge branch 'master' of ssh://scikit-learn.git.sourceforge.net/gitroot/scikit-learn/scikit-learn
BUG: Adapt the sparse SVM to the rename of base_estimator.
MISC: Remove warning when compiling docs.
MISC: Adding titles to examples.
DOC: Document best practices/coding guidelines to make it easier for
Cosmit
MISC: Put the nearest neighbors estimator to the BaseEstimator
MISC: rename base_estimator.py to base.py
BUG: Make sure that the docs still build with recent versions of numpy
BUG: Make sure the docs still build with recent versions of numpy
BUG: Adapt the sparse SVM to the rename of base_estimator.
MISC: Remove warning when compiling docs.
MISC: Adding titles to examples.
DOC: Document best practices/coding guidelines to make it easier for
Cosmit
ENH: Make the grid_search take instances of estimators rather than
Add a setup.cfg to specify default nosetests behavior.
MISC: 80 character bordel!
Add a setup.cfg to specify default nosetests behavior.
MISC: 80 character bordel!
Cosmit: rename grid to iter_grid
Merge branch 'master' of github.com:GaelVaroquaux/scikit-learn
DOC: Beautify example
Merge branch 'master' of github.com:GaelVaroquaux/scikit-learn
MISC: Beautify examples.
Cosmit
Merge branch 'master' of ssh://scikit-learn.git.sourceforge.net/gitroot/scikit-learn/scikit-learn
Merge branch 'master' of http://github.com/vmichel/scikit-learn
ENH: rework univariate selection to reach a compromise between ease of
ENH: First go at a help for cross-validated evaluation of a score.
Merge branch 'master' of ssh://scikit-learn.git.sourceforge.net/gitroot/scikit-learn/scikit-learn
Merge branch 'master' of http://github.com/agramfort/scikit-learn
ENH: Add an example showing the dependency of SVC+Anova on the number of
ENH: Add joblib as a bundle dependency.
BUG: Fix doctests in pipeline.py
BUG: Fix doctest in GMM
ENH: Add script to update joblib dependency
Cosmit: rename MixinClassif to ClassifMixin
ENH: Make sure that in cross_val_scores the StratifiedKFold is used only
Merge branch 'master' of github.com:GaelVaroquaux/scikit-learn
MISC: Small change to contribution guidelines, suggested by Mathieu
COSMIT: For the sake of underscores
MISC: comment
ENH: Add parallel computing for cross validation.
BUG: Get the BaseEstimator to work even if there is not __init__
ENH: Make the parallel cross validation more efficient.
BUG: Fix cross_val_scores for unsupervised problems.
ENH: Improve cross_val in parallel
Merge branch 'cross_val_gael' of git@github.com:GaelVaroquaux/scikit-learn
Misc
ENH: Improve the repr for the BaseEstimator
Merge branch 'gael' of http://github.com/agramfort/scikit-learn
BUG: Fix a bug preventing from LinearModelCV to print.
Misc
Merge branch 'master' of ssh://scikit-learn.git.sourceforge.net/gitroot/scikit-learn/scikit-learn
BUG: Fix forgotten import in example
MISC: Remove pointless ellipsis directive (doctest)
Merge branch 'master' of ssh://scikit-learn.git.sourceforge.net/gitroot/scikit-learn/scikit-learn
BUG: Fix lda and qda on 64 bits.
Merge branch 'master' of git@github.com:scikit-learn/scikit-learn
Merge branch 'master' of ssh://gvaroquaux@scikit-learn.git.sourceforge.net/gitroot/scikit-learn/scikit-learn
TEST: Make sure that doctests for bundled dependencies pass.
Merge branch 'master' of git@github.com:scikit-learn/scikit-learn
BUG: Make sure that joblib does get installed.
ENH: Make sure tests get installed.
ENH: Improve the repr for the BaseEstimator
TEST: Make sure that doctests for bundled dependencies pass.
BUG: Make sure that joblib does get installed.
ENH: Improve the repr for the BaseEstimator
TEST: Re-enable external tests.
BUG: Fix doctests to account for change in BaseEstimator repr
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Merge branch 'master' of git@github.com:scikit-learn/scikit-learn
Merge branch 'cross_val_gael' of git@github.com:GaelVaroquaux/scikit-learn into cross_val_gael
MISC: Update joblib to 0.4.4
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Merge branch 'master' of git@github.com:scikit-learn/scikit-learn
Merge branch 'master' of git@github.com:scikit-learn/scikit-learn
ENH: Make sure that the QDA inherits from the ClassifierMixin
Merge branch 'cross_val_gael' of github.com:GaelVaroquaux/scikit-learn into cross_val_gael
ENH: Make sure that the QDA inherits from the ClassifierMixin
ENH: Make sure that the logistic regression does inherit from
ENH: Add image to graph feature-extraction helper, and some basic graph
ENH: Make sure that the logistic regression does inherit from
ENH: Add some code to compute a graph Laplacian on sparse and non sparse
Merge branch 'master' of github.com:scikit-learn/scikit-learn into cross_val_gael
BUG: Temporary fix for 'array does not own memory' in SVM
ENH: First implementation of spectral clustering.
BUG: Temporary fix for 'array does not own memory' in SVM
Merge branch 'master' of github.com:scikit-learn/scikit-learn
ENH: Clean up the image clustering code and add an example on lena.
DOC: Add documentation for spectral clustering.
ENH: Add an estimator object for the spectral clustering.
ENH: Add k-means cluster with clever initialization.
ENH: K-means algorithm with good initialization, more
MISC: Restore an example that is now working again.
API: Change 'clustering' to 'cluster'
Merge branch 'master' of git@github.com:GaelVaroquaux/scikit-learn
Merge branch 'master' of git@github.com:scikit-learn/scikit-learn
BUG: Add a forgotten setup.py line
ENH: Backport a fast graph connect component algorithm from scipy.
MISC: Cosmit based on comments from Olivier and Alex
DOC: Add some notes on complexity of clustering algorithms.
Merge branch 'master' of git@github.com:scikit-learn/scikit-learn
BUG: Adding missing setup.py file.
BUG: Remove UTF8 character checked in by mistake.
BUG: Make graph laplacian and spectral clustering work in 64 bits.
ENH: For numpy >= 1.5, use np.linalg.slogdet as a fast_logdet
BUG: Fix bug with numpy >= 1.5 introduced by my previous (stupid) commit.
ENH: Remove 'import *' in glm/__init__
BUG: Fix tests broken by last commit
ENH: Make spectral clustering tests more robust
Cosmit (PEP 8)
BUG: Fix glm/setup.py so that the glm sub package installs right.
TEST: make the test location consistent.
API: Add cluster as an import of the main __init__
BUG: Fix warnings module not imported in coordinate_descent. Thanks to
Merge branch 'master' of github.com:scikit-learn/scikit-learn
MISC: Move AUTHORS to AUTHORS.rst so that it displays better on github
BUG: Make sure computations do not get executed at import time, so that
Cosmit
BUG: Remove failing doctest.
MISC: More tests and more docs for preprocessing.
BUG: Cater for NaNs in SelectPercentile.
Cosmit: 2 lines between function definitions
ENH: Make sure that the cross_val_score uses StratifiedKFold for
ENH: GridSearchCV: add an 'iid=True' and open the option to optimize
MISC: 3-Fold cross-val by default
ENH: Make sure that grid_search uses a StratifiedKFold by default on
BUG: Fix doctest
DOC: better example for SVM-Anova
ENH: Make sure docs build on older versions of sphinx
Merge branch 'master' of git@github.com:scikit-learn/scikit-learn
DOC: Prettify
DOC: Make first page more compact.
DOC: Update the developer guidelines.
MISC: Tweak front page
Merge branch 'master' of git@github.com:scikit-learn/scikit-learn
MISC: Cosmit on PCA tests to get understandable errors from the buildbot.
MISC: Delayed import of pylab, to work on the buildbot
MISC: Relative imports
BUG: Fix tests to be more robust
Cosmit
Merge branch 'master' of github.com:scikit-learn/scikit-learn
MISC: Add a warning in the spectral clustering if pyamg is not present
Revert "MISC: Add a warning in the spectral clustering if pyamg is not present"
Revert "Revert "MISC: Add a warning in the spectral clustering if pyamg is not present""
BUG: Import stats explicitly to work with scipy > 0.7
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Fix typo on David's name
Merge branch 'master' of github.com:scikit-learn/scikit-learn
ENH: Add _set_params/_get_params in Pipeline
Merge branch 'master' of github.com:GaelVaroquaux/scikit-learn
ENH: Implement _reinit on Pipelines.
ENH: Use new pipeline framework in SVM-ANOVA
BUG: Import stats explicitly to work with scipy > 0.7
ENH: Finish the __repr__ for the pipeline
ENH: Add error management to the KFolds
ENH: Port the nipy k-means with some cleanups and enhancements.
ENH: Change the initialisation heuristic for k-means: in general random
BUG: Adapt spectral to new k_means API
Merge branch 'master' of github.com:scikit-learn/scikit-learn into kmeans
ENH: Add error management to the KFolds
Cosmit
Cosmit
Merge branch 'master' of github.com:scikit-learn/scikit-learn into pipeline
ENH: Change pipelines so that they are simpler and address subobjects
BUG: Fix bug introduced by previous commit
BUG: Fix doctests.
DOC: Better docstring.
ENH: Make setting nested parameters on Pipeline really work.
Cosmit
TEST: Add a smoke test for cross_val_score
MISC: Fix spelling.
Merge branch 'master' of github.com:scikit-learn/scikit-learn
DOC: Add more links to the index
DOC: Add forgotten link targets.
DOC: prettify the nearest neighbors docs
DOC: Prettify the SVM docs.
MISC: Add 'Python' in the examples page
DOC: fix the PCA iris example.
MISC: Remove non-necessary lines from PCA example
DOC: Prettify the clustering documentation.
DOC: Fix the reference classes documentation
DOC: Prettify the GLM docs
MISC: Quiet down the tests.
Merge branch 'master' of github.com:scikit-learn/scikit-learn
ENH: Add a tester to the scikit.
ENH: Make doctests pass with numpy tester.
MISC: Make sure tests always run.
MISC/DOC: fix reference
TEST: Fix test_pca sign error.
TEST: Fix whitespace in doctests.
Merge branch 'master' of git@github.com:scikit-learn/scikit-learn
Merge branch 'master' of git@github.com:scikit-learn/scikit-learn
ENH: GridsearchCV, Pipelines and cross validation
ENH: Make sure that Pipeline and GridSearch objects are indeed recognized
ENH: Make sure clone works on pipelines
ENH: Implement a score for the GridSearch.
ENH: Make sure that a GridSearchCV has a score
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Merge branch 'cross_val' of github.com:GaelVaroquaux/scikit-learn
Merge branch 'master' of http://github.com/fabianp/scikit-learn
ENH: Small optimization to BaseEstimator
BUG: Make sure that grid_search works with sparse data.
MISC: Cosmit in new GMM classifier example
DOC: Make the plot_ica_vs_pca example richer.
MISC: Some tweaks to the layout so that the docs display better on a
DOC: Fix title level in install
DOC: make the index page content clearer
MISC: Explicit acronym
MISC: PEP8 in docs
DOC: Change the titles' layout
DOC: Revamp the tables of contents and corresponding layout
Merge branch 'master' of github.com:scikit-learn/scikit-learn
DOC: Make sure the docstring of pca render well
DOC: Remove empty section
DOC: Add documentation for ICA/PCA
DOC: Remove useless tables of contents
DOC: work on the clustering documentation
Merge branch 'master' of github.com:scikit-learn/scikit-learn
DOC: Tweak in the clustering docs.
DOC: document with more details the GMM module.
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Merge branch 'master' of github.com:scikit-learn/scikit-learn
DOC: make the neighbors doc sexier
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Cosmit: explicit what OVA means as much as possible.
Cosmit
MISC: Recover changes overridden by manual merge.
BUG: Fix metrics to run on 2.5
ENH: Cosmetic improvements to the face example
MISC: Cosmit+Doc in fast truncated PCA
MISC: Remove redundant code and cosmit
Merge branch 'master' of github.com:scikit-learn/scikit-learn
ENH: Update embedded joblib to 0.4.6
ENH: import symbols on subpackage's __init__
API: Return self in _set_params
BUG: svm_gui: C is not defined for OC-SVM
ENH: Raise error when cloning bug estimators
BUG: Deal with 1D data in preprocessings.
BUG: fix cross_val and GridSearch in unsupervised.
BUG: Fix GridSearch in unsupervised
BUG: Fix the doc-generation of examples
Cosmit
MISC: Fix example
DOC: minor changes in gaussian_process docs
BUG: Fix missing gaussian_process subpackage in setup.py
FIX more missing files in setup.py
API: Remove long-deprecated function
BUG: Fix doctests broken in previous commit
DOC: documentation CD Enet fit parameters
DOC: Cosmit in docs
DOC: score is reserved to 'better is higher'
DOC: Better plotting in RFE example
ENH: Small tweak in BaseEstimator repr
ENH: Add control of the dtype in img_to_graph
ENH: dtype is img_to_graph defaults to input dtype
DOC: Add scipy in the install dependencies.
DOC: Typo in docstring
DOC: document better similarity matrix of spectral clustering
DOC: typos in docstring
ENH: Reorganise the feature agglomeration
ENH: Accept strings as memory
DOC: Add the logistic regression to linear models doc
DOC: Be explicit about what criteria are used in GridSearchCV
ENH: Add inverse transform to univariate_selection
MISC: Make sure that nosetests doesn't try to run the bench
ENH: Add a benchmark for ward
API: fit params -> class params in GridSearchCV
MISC: Docstring formatting
ENH: Tweaks for k_means performance.
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Cosmit
ENH: n_leaves = n_samples in ward tree
MISC: np.zeros -> np.empty
ENH: Avoid big temporaries in hierarchical
MISC: Cleanup
ENH: Hierarchical: don't compute moments twice
ENH: hierarchical: gain memory with izip
ENH: hierarchical: simpler, faster without connectivity
MISC: labels in cluster -> int
MISC: Fix ward bench vs scipy
MISC: Avoid depending on numpy > 1.4
BUG: Add missing import
MISC: Less code duplication in lfw
Cosmit
API: SVMs: eps -> tol
MISC: Fix example to adjust to eps -> tol
ENH: Fixed seed for shuffling in SGD
BUG: fix grid_to_graph
MISC: cosmit + use private prng
MISC: fix typo
Merge remote branch 'vincentschut/master'
BUG: Fix bug introduced by PLS
DOC: Minor fixes to documentation
BUG: fix kneighbors method in high dim
DOC: improve PLS docs and example
MISC: Update joblib
ENH: Add verbosity to the grid_search
ENH: More parallelism in GridSearchCV
ENH: GridSearchCV: better verbose
TEST: Fix trivial doctest failure
BUG: iter on complete grid (GridSearchCV)
MISC: html-nodoc default target
Merge remote branch 'origin'
Merge branch 'master' of https://github.com/yml/scikit-learn
TEST: Ellipsis on numerically unstable docs
Cosmit
BUG: doctest the joblib in externals not global
BUG: restore ellipsis in doctests
DOC: add the show-source back on html
BUG: fix multiple figure plotting
BUG: restore ellipsis in doctests
Merge branch 'master' of github.com:scikit-learn/scikit-learn
DOC: update rst docs to use multiple figures
DOC: front page link: Ward lena
DOC: better docs for Ward
ENH: Avoid gathering old images in docs
MISC: reduce disk consumption when generating docs
COSMIT: make the layout a bit cleaner in NMF docs
BUG: giving up on cleaning build
DOC: Better center of images
MISC: Two figures in plot_pca_vs_lda
DOC: move working notes to wiki
DOC: tweak sidebar
DOC: More sidebar tweaks
DOC: fix link to bug tracker
DOC: tweaks to developers notes
DOC: move KMeans to top of clustering
DOC: Less warnings during build
DOC: Fix more warnings
BUG: fix links to examples
MISC: cleaner generated code in doc/examples
MISC: separate decomposition examples to new dir
DOC: rmk on Sphinx version
DOC: add tiny docstrings where missing
DOC: Fix warnings
DOC: module entries in reference documentation
DOC: fix indentation
DOC: link fixes in kernel PCA
DOC: move working notes to wiki
DOC: tweak sidebar
DOC: More sidebar tweaks
DOC: fix link to bug tracker
DOC: tweaks to developers notes
DOC: move KMeans to top of clustering
DOC: Less warnings during build
DOC: Fix more warnings
BUG: fix links to examples
MISC: cleaner generated code in doc/examples
MISC: separate decomposition examples to new dir
DOC: rmk on Sphinx version
DOC: add tiny docstrings where missing
DOC: Fix warnings
DOC: module entries in reference documentation
DOC: fix indentation
DOC: link fixes in kernel PCA
DOC: Make sure that mpl is not interactive
DOC: Tweak DPGMM docs
COSMIT: lognormalize->log_normalize
COSMIT: avoid 'import as'
COSMITs
COSMIT: avoid one-liner
COSMIT: Local imports last
Revert "COSMIT: avoid one-liner"
MISC: move mixture's test to sub directory
COSMIT
COSMIT
COSMIT
ENH: Vectorize bound computing
ENH: Speed up _bound_z in DPGMM
COSMIT: DPGMM: Move bound computing to functions
ENH: Speed improvements in DPGMM
DOC: Improve the GMM vs DPGMM example
BUG: Fix bug introduced by moving test_mixture
DOC: Fix layout
DOC: Tweaks in mixture
BUG: fix testing (heisen) bugs in hmm
TEST: Heisen-bug fixing
ENH: Update joblib
Merged pull request #140 from fabianp/lfw.
MISC: Prettify GMM example
COSMIT: Pep 8 and remove useless imports
ENH: avoid useless computation and warnings
Merge pull request #148 from yarikoptic/master.
DOC: Fix layout
DOC: Fix layout
DOC: improve datasets information
MISC: links to upstream datasets
BUG: Make SVMs work on non contiguous arrays
COSMIT: Better fix for continuity in SVMs.
MISC: Recythonize the ball_tree
DOC: Fix links in covariance
MISC: Remove unused imports
ENH: Make LLE work with older PyAMG
MISC: Prettify the swiss roll example
COSMIT: Remove unused import
DOC: Prettify MiniBatchKmean example
BUG: Fix pyflakes warning in k_means
BUG: params not applied in MiniBatchKMeans
ENH: MiniBatchKMeans: avoid useless computation
BUG: MiniBatchKMeans: Error in stopping criteria
MISC: Cosmit in k_means_
DOC: cosmit in MiniBatchKMeans docs
TEST: Control seed in fastica tests
ENH: Capture different data length in grid_search
BUG: Avoid NaNs in lars_path
ENH: add pre_dispatch to GridSearchCV
ENH: Cater for lists in grid_search
Merge pull request #185 from amueller/master
BUG: Minor bugs in cross_val
Merge pull request #203 from amueller/docs_fix_again
Python2.5 compatibility
BUG: explicit imports in doctests
ENH: Lasso and LassoCV: fit params -> class params
DOC: Tweak the cross-val lasso path example
ENH: LARS: fit_params -> class params
COSMIT: Minor refactor in lars_path
ENH: Add a ShuffleSplit cross-validation iterator
BUG: Fix bug introduced in 83cf11c
Merge pull request #216 from ametaireau/master
BUG: change alpha scaling in LassoLARS
BUG: LassoLARS: X: modified during the normalization
BUG: LassoLARS didn't renormalize the coefs
ENH: Update joblib to 0.5.2
ENH: Update joblib
MISC: minor cleanups
ENH: Add small info on diabetes
COSMIT: Simplify the lena ward example
Cosmit
Merge pull request #247 from NelleV/FIX_doc
Cosmit
ENH: l1_distance: gaussian_process -> metrics
Doc: fix minor error in docstring
DOC: sparse_pca: put maths at the end
DOC on sparse_pca
DOC: Add l1_distances to classes.rst
TEST: faster tests, and more coverage
Merge pull request #248 from dwf/misc_fixes
BUG: Fix gmm bug + test failures
FIX/ENH: numerical stability in GMM
Cosmit: PEP8
Cosmit: remove useless comments
MISC: Restructure compound decomposition example
TEST: SparsePCA: testing fit_transform useless
TEST: testing HMM more robust
Merge pull request #212 from vene/sparsepca
ENH: mixture: better numerical stability
Cosmit: Fix (some) pyflakes warnings
ENH: olivetti faces: control RNG in shuffle
DOC: Add a descr to olivetti_faces
DOC: fix some formatting issues
DOC: Fix layout
DOC: fix layout
DOC: fix layout
Add forgotten 'install' for mixture.
BUG: Fix clone for ndarrays and sparse matrices
BUG: Fix clone for ndarrays and sparse matrices
Removing unused code
ENH: Avoid np.logaddexp.reduce
DOC: more precisions in univariate_selection
Merge pull request #266 from glouppe/master
BUG: fix doctests
DOC: stress that only chi2 works with sparse
COSMIT: remove unused import
DOC: Improve the Bayesian regression docs
Typo
Sorry, other typo
Merge pull request #279 from JeanKossaifi/master
ENH: Add a subset="all" to 20news
API: load_20newsgroups is deprecated
Cosmit
API+ENH: load data by default in mlcomp and 20news
ENH: compression in 20newsgroup caching
DOC: leftover false info in docstrings
DOC: load_filenames -> load_files
DOC: Link the Olivetti docs in the main docs
DOC: more explicit docs on alpha/rho in elasticnet
ENH: cv objects created by a helper function
COSMIT: fix doc indentation to PEP8
BUG+COSMIT: revamp the lasso path examples
ENH: Add a LassoCV using LARS
COSMIT: Nobody expects the PEP8 inquisition
API: add import paths for LarsCV and LassoLarsCV
MISC: Follow changes to alpha scaling
ENH: Add normalization of X to LarsCV
BUG: Propagate fix 086b58f5 to LassoLarsCV
DOC: LARS docstring
BUG: Avoid div by 0 in lars_path_residues
ENH: Expose eps in LARS
DOC: Tweak the bayesian ridge docs
DOC+TEST: LarsCV
TEST: Improve test coverage of LarsCV
DOC: document eps in least_angle better
MISC: LarsCV: preallocate mse_path
ENH: use _check_cv in LassoLarsCV
DOC: fix documentation
MISC: mse_path in LassoLarsCV is now the mean
DOC: add example comparing LassoCV and LassoLarsCV
DOC: typos
API: _check_cv -> check_cv
Merge remote branch 'jakevdp/kernelpca-arpack'
TEST: Robustify LLE tests
BUG: Fix a bug introduced in rebasing
BUG: normalize before center in lars_path_residue
DOC: cosmetic changes to lars-bic doc and examples
DOC: make lasso docs easier to read
COSMIT: remove unused import
BUG: make lobpcg work with non-sparse matrices
COSMIT: tweak plot_compare_methods example layout
COSMIT: print time in plot_lle_digits example
MISC: fix image in manifold doc
MISC: prettify the faces example
COSMIT: doc and examples in decomposition
Merge pull request #314 from emmanuelle/spectral
ENH: More interesting benchmarks for OMP
API: eps -> tol in bayes
Merge pull request #317 from agramfort/normalize_data
Merge pull request #318 from JeanKossaifi/master
DOC: change the name scikits.learn to scikit-learn
Merge pull request #331 from JeanKossaifi/master
DOC: Fix doctest
DOC: scikits.learn -> scikit-learn
DOC: fix link
DOC: scikits.learn -> sklearn
DOC: Minor scikits -> scikit
BUG: sklearn/setup.py : learn -> sklearn
BUG: Backward compatibility layer sklearn.externals
ENH: Add verbosity control to LinearModelCV
BUG: scikits.learn -> sklearn: backward compatibility
COSMIT: PEP08
Unused import
BUG: backward compat: scikits.learn -> sklearn
ENH: add control of n_init in spectral clustering
BUG: scikits.learn -> sklearn backward compat
DOC: larger lena size in denoising example
Cosmit: make in-place modifications explicit
DOC: update whats_new.rst
Merge pull request #7 from larsmans/sklearn
BUG: ShuffleSplit: repr for random_state not number
DOC: formatting examples as a topic
ENH: GridSearchCV can has predict_proba
FIX bug introduced in 68e6544
Remove BaseLibLinear.predict_proba not implemented
DOC: Install.rst wrong packaging info
COSMIT
scikits.learn -> scikit-learn in README
`scikits.learn` in the README, to catch google
DOC: fix rst
TEST: skip unreliable doctest
DOC: minor doc ENH for trees
COSMIT: tree code simplification
COSMIT: np.random should never be called
COSMIT: no seeding of the global RNG
ENH: move parameter checking to fit
COSMIT: y is a vector, not a matrix
Cosmit, PEP8
DOC: doc and example cosmetics for trees
DOC: improve spectral clustering docs
API: spectral clustering uses arpack by default
DOC: proper docstring for load_sample_image
API: default in spectral clustering: auto
ENH: add doc target to Makefile
Merge branch 'master' into tree
Minor cosmit
DOC: use random_state in KMeans
DOC: improve silhouette coefficient docs
MISC: better check_build error reporting
PEP08 names in graph_shortest_path
COSMIT
TEST: simplify test case
SPEED tree: 2X in Gini criteria
MISC: mk roc_curve work on lists
MISC: __version__ in scikits.learn
DOC: add IterGrid in reference
COSMIT: no import as
MISC: Warn for integers in scaling/normalize
MISC: better warning message
COSMIT: never use np.linalg, but scipy.linalg
BUG: ProbabilisticPCA.score work with pipeline
MISC: remove links to sourceforge URL
DOC: fix links in mixture
MISC: add citation information
BUG: vectorizer.inverse_transform on arrays
DOC: pdf compilation
ENH: Easier debugging in check_build
ENH check_build: better error msg for local imports
DOC: turn off generation of index pages
ENH: Capture stdout in executed examples
COSMIT: layout in plot_kmeans_digits example
DOC: minor fix to AMI docs
ENH: First sketch of glasso
ENH: example for l1 covariance estimator
ENH: Add cd solver to glasso
COSMIT glasso: docstring and cleanup
ENH: the GLasso estimator
DOC: Better glasso example
TEST: test GLasso
ENH Glasso: don't penalize the diagonal
ENH: Add a GLassoCV
ENH GLassoCV: iteratively-refined Grid search
ENH GLasso: stability on correlated data
ENH GLassoCV: better parameter optimization
TEST GLasso: increase test coverage
DOC: narrative documentation for GLasso
COSMIT: @agramfort's comments
DOC: add sparse inverse covariance in whats_new
PEP8
DOC: rmks on structure recovery
DOC: better stock_market example (WIP)
COSMIT: address most of @ogrisel's comments
ENH: don't echo convergence warning on CV grid
DOC GraphLasso: be explicit about which algorithm
DOC GraphLasso: notes on algorithms and recovery
DOC: docstring in stock market example
DOC/API: integrate make_sparse_spd_matrix
Typo
MISC: address @larsman's comments
API: g_lasso.py -> graph_lasso_.py
DOC: GLasso -> GraphLasso
MISC: @VirgileFritsch and @mblondel's comments
MISC: silence stdout in GraphLassoCV tests
ENH GraphLasso: Silence warning
ENH: graph_lasso works on empirical covariance
BUG: update tests to changes in graph_lasso
BUG: fix layout in examples
MISC: fix rst bug
DOC: put class reference in the banner
COSMIT: prettify plot_oneclass
DOC: rework front page
DOC: Add 'up' relative link
DOC: title for the user guide content file
DOC: don't display empty tocs
MISC: scikits.learn -> sklearn
DOC: proper link structure in examples
DOC: title to relative links
DOC: EPD ships a recent version, but not latest
DOC: state clearly the version number
MISC: plot_stock_market cluster on learned covariance
BUG: fix score() with GraphLasso
Compatibility with numpy 1.1
BUG GraphLassoCV: score() needs a store_precision attribute
DOC: restore 'This page' in sidebar
Merge pull request #463 from npinto/patch-2
MISC: update joblib
BUG: fix joblib doctest
BUG: make the tests pass with numpy 2
COSMIT
COSMIT: prettify datasets docs
Merge pull request #469 from amueller/preprocessing_epsilon_doctest
DOC: start to merge statistical learning tutorial
Merge pull request #471 from amueller/linnerud_renaming
DOC: explicit the __init__ convention
Cosmit on randomized range finder
Merge pull request #475 from amueller/datasets_doctests
BUG: fix RandomizePCA: renaming of fast_svd args
DOC: scikit.learn -> sklearn
BUG: casting with numpy 2.0
BUG: API change in fast_svd
Merge branch 'master' of https://github.com/scikit-learn/scikit-learn
Merge branch 'master' into n_samples_scaling
MISC: FutureWarning on C scaling
Merge branch 'master' of https://github.com/scikit-learn/scikit-learn
COSMIT: beautify the plot_oneclass example
DOC: outlier detection improve docs and examples
DOC: improve outlier detection docs
API: h -> support_fraction
Cosmit
ENH: use controled random numbers
BUG: follow API change in outlier_detection
MISC: update whats_new
DOC: cosmit in kernel approximation
DOC: removing dangling link
Cosmit in metrics
BUG: fix bug introduced in fd8c
ENH: Store the full cv_scores in grid_search
DOC: add alpha_ to attributes of LassoLarsIC
COSMIT: utils.fixes: document versions
Merge pull request #512 from amueller/doc_consitency
ENH: update joblib
MISC: improve copy_joblib script
ENH: Integrate joblib 0.5.7
Merge pull request #478 from glouppe/tree
DOC: doctest bug
Cosmit: example prettier without colorbar
DOC: add links to examples.
DOC: improve univariate feature selection docs
MISC: move SelectorMixin outside of __init__.py
OPTIM: minor optimization
MISC: better error message
COSMIT
TST: fix doctest
TST: fix touchy doctest
COSMIT: avoid set_cmap and pcolormesh in example
Cosmit in docs
MISC: fix bibtex
ENH: Make LinearRegression work with sparse
DOC: update LinearRegression docstring
FIX: sparse LinearRegression with scipy 0.7.0
ENH: update joblib
MISC: tag explicitely a dependency
ENH: use joblib compression in datasets
MISC: tune test verbosity
ENH: update joblib
DOC: restore index on pages
Merge pull request #526 from amueller/ball_tree_skip_doctests
Merge pull request #537 from amueller/gaussian_nb_underscore
MISC: species distribution example plotted
ENH: better error messages
MISC: shorten a bit the description
DOC: fix image
DOC: layout
DOC: random selection of frontpage images
DOC: compress a bit the layout
DOC: shorten a bit the front page
DOC: avoid imgs taking 2 lines
DOC: Add a few images to the banner
DOC: fix wrong link
DOC: avoid line return
ENH: get the murmurhash to build properly
DOC: prettify ensemble docs
BUG: restore score functionality in grid_search
ENH: refit now works in the GridSearchCV
FIX: MurmurHash3 compilation on older GCC
Cosmit: remove unused imports
MISC: fix bibtex
Merge pull request #588 from jakevdp/balltree-fix
ENH: make LassoLarsIC more reproductible
BUG: fix test_precision_recall_curve
ENH: Add randomized lasso
ENH: randomized_lasso example: multiple alpha
Better randomized_lasso
Jacknife in randomized_lasso
Add a randomized logistic
COSMIT: pep08
ENH: Add pre_dispath to RandomizedLinearModel
ENH: RandomizedLinearModels transformers + memory
BUG: fix broken merge
MISC: inherit from BaseClassifier
BUG: parameter was not set right
DOC: Improve feature selection docs
DOC: try to improve randomized lasso example
ENH: numerical stability in LassoLarsCV
DOC: update dostring
ENH: grid in terms of alpha/alpha_max
DOC: nicer path
DOC: beautify feature_selection docs
DOC: cross-reference linear_model and randomized_lasso
DOC: enrich example docstring.
DOC: better example for randomized lasso
MISC: make sure two figures hold on a line
DOC: example and docs for randomized-lasso
MISC: address @ogrisel and @mblondel's comments
Cosmit
MISC: add randomized linear models to what's new
BUG: make clone work on 2D arrays
TST: add a test for bug fixed in previous commit
COSMIT: make the plot landscape
DOC: improve the label_propagation docs
COSMIT: authorship and licensing info
Cosmits
DOC: minor rmk on label_propagation
TEST: assert -> nose.tools.assert_equal
Merge branch 'label-propagation'
BUG: fix typo in tests
DOC: update whats_new
BUG: fix tests under numpy 1.5
TEST: add a test for whitening in ICA
PEP8
ENH: control random state in ICA
BUG: SVM raw_coef_ must be fortran ordered.
MISC: cosmit: use subpackage setup.py
DOC: reorganize GMM docs
DOC: reorganize GMM docs
DOC: more examples for DPGMM
Cosmit
MISC: remove custom __repr__
Merge branch 'master' of github.com:scikit-learn/scikit-learn
BUG: fix doctests
ENH: optim hierarchical: heapq in tree traversal
ENH: hierarchical: speedups in tree cut
MISC: clean up old c file
MISC: assert -> raise ValueError
BUG: typo
MISC: fix broken link to example
ENH: parallel in lasso_stability_path
API univariate_selection: _scores -> scores_
ENH: update joblib to release 0.6.2: bugfix
Merge pull request #613 from bwhite/patch-1
MISC: remove joblib from .gitignore
BUG: add missing file in joblib
Merge pull request #601 from agramfort/scale_C_true
BUG: follow API change in example
ENH: update joblib
Merge pull request #603 from jakevdp/GPML-fixes
Merge pull request #637 from fannix/fix
ENH: optim in ward_tree
Cosmit
BUG: ShuffleSplit should give reproducible splits
ENH: small speedups in coordinate descent
Revert "ENH: small speedups in coordinate descent"
ENH/FIX: in graph shortest path
Faster hierarchical cluster for very dense trees
ENH: Add the ability to set rho by cross-val
ENH: store the path for rho in ENet
BUG: fix tests and reorganize code
ENH: draft of parallel CV in elastic net
TEST: setting rho with ElasticNetCV
DOC: document ElasticNetCV
MISC: cosmit to please @agramfort
BUG: Same MSE scaling for LassoLarsCV and LassoCV
TEST: better tests of LassoCV and LassoLarsCV
DOC: add a link the Gohlke's 64bit windows binaries
DOC/TEST: HMM fix doc layout and doctest
ENH: Add controled random_state in HMMs
DOC: prettify HMM sampling example
Cosmit
COSMIT: underscores are better than unseparated words
TST: fix trivial bug and control the rng
MISC: fix the random number generators
Merge branch 'hmmc'
TEST: fix doctest on non 64bit boxes
COSMIT: readability
TEST: Fix cross_validation tests
BUG: fix cross_validation on numpy 1.3
Merge pull request #709 from ibayer/cleanExamples
Merge pull request #705 from agramfort/fix_ica
MISC: better verbosity in lars
DOC: more visible version remark
ENH Ward: better behavior for non-fully-connected graphs
ENH: Don't modify connectivity unless specified
DOC: affinity-propagation in clustering comparison
DOC: add clustering example on front page
Merge pull request #726 from emmanuelle/doc_correction
ENH: summary table on clustering
DOC: better clustering comparison table
DOC clustering comparison: link table and figure
MISC: tweak example layout
DOC: finish table to compare clustering
Merge branch 'WIP_tut'
DOC: Better narrative for DBSCAN
DOC: finish misc in tutorial
BUG: no plotting in doctests
COSMIT: layout tweak
Redo CSS layout killed by commut 94088b81
BUG: fix doctests
Merge pull request #730 from jaquesgrobler/rename_EllipticEnvelope
DOC: timings in cluster comparison example
COSMIT: prettier plot
Merge pull request #733 from jaquesgrobler/master
DOC: misc wording
TEST GNB: test that class_prior sum to 1
Merge pull request #751 from jaquesgrobler/master
DOC: Manhattan distance == l1 norm
BUG fix LinearSVM doctest
MISC: verbosity in SVMs
ENH: use warning.catch_warnings
ENH: neighbor warning always raised
API: n_test -> test_size in Bootstrap
COSMITs on GGM
TEST: Fix doctest
Cosmit: comment on 'clever' code
Warn: Passing params to fit is depreciated
DOC: testing without sklearn.test()
COSMIT: macports package name
COSMIT: better warnings
ENH MiniBatchKMEans: increase init_size for large k
DOC: better description of init_size
DOC create example section for datasets
DOC title for the tutorial examples
EXMPL: fix legend in sgd sample weights
COSMIT we no longer support Py 2.5
COSMIT simplify a bit examples
DOC: restructure what new
BUG: explicit adding of libm at build
BUG test_oneclass_decision_function: fix RNG
COSMIT: no capitals outside of class names
COSMIT: remove print
BUILD: add libm onlyon posix systems
MISC: simpler faster code with vectorization
SPD: Minor speedups
SPD: minor speedups
FIX: handle deprecation with estimator API
BUG: fix assert_greater/assert_lower
BUG: fix assert_greater
BUG: fix doctests
DOC: cosmits in docs
COSMIT: only classes should have capitals
ENH: make LinearSVC copyiable
TST: do not raise warnings in sklearn.test()
BUG: fix testing on older numpy
DOC: cosmits on tutorials and videos
DOC: wording of whats_new
BUG: use permutation rather than shuffle
CLEAN sparse_encode: remove unused arguments
ENH: avoid an underflow
Revert "ENH: avoid an underflow"
DOC: instructions on testing
DOC: faster and more meaningful example
ENH: prevent multiprocessing in tests under Windows
DOC: avoid 2 rows of images
DOC: more readable title
DOC: Feature extraction vs feature selection
DOC: image to graph utilities
ENH: update joblib
BUG: remove n_jobs=-1 from examples
Merge branch 'install-windows' of https://github.com/vene/scikit-learn
FIX: control RNG seeds in ICA tests
DOC: fix rst layout
MISC: clean up top-level namespace
P3K: more Py3k compat changes
BUG: multiple jobs in dict_learning
BUG: fix install bug for _check_build
BUG: casting error with recent numpys
DOC: note on heat kernel for spectral clustering
Typo
Typo
BUG: reassigning cluster centers with X sparse
BUG: k_means k -> n_clusters
COSMIT: k -> n_clusters
COSMIT: avoid deprecation warnings
MISC: os.name -> platform.system()
FIX: unique in old numpy
COSMIT in plot_mds.py example
DOC: misc improvements in MDS docs
DOC: minor MDS doc/example changes
MISC: update whats_new with MDS
BUG: address ill-conditionned designs in Lars
Cosmit: PEP8 :P
Cosmit: PEP8
COSMIT: intermediate variable
Merge pull request #953 from jaquesgrobler/nature_css_addons
ENH: backport gen_rst changes from NISL
ENH: minor speedup in Ward
ENH: factor 2 speedup in Ward
ENH: minor speed up in ward
ENH: minor speed up in Ward
Merge branch 'master' of github.com:scikit-learn/scikit-learn
MISC: avoid unprotected np.random
TST: testing without hard-coding the values
TST: test on diabetes rather than iris
Cosmit
BUG: example now needs 'assume_centered'
ENH: using slices rather than indice masks
ENH: avoid unecessary steps (covariance)
Cosmit: more explicit names
FIX: remove leftover print
Note on control of the RNG seed during testing
DOC: cosmit performance instructions
TST: test check_build
ENH: remove setuptools
ENH: restore 'develop' mode install
FIX: remove executable bit on joblib files
BUG: fix setup.py for develop
TST: test the setup.py using the configure step
MISC cleanup old coverage info in Makefile
ENH: Faster ward for large n_clusters
BUG: fix ward tests
DOC: ward docstring and testing
TEST: improve test coverage in hierarchical
FIX: make ward_tree work on 1D data
MISC: very minor speedup
COSMIT: remove left over profiling
TST: More testing in hierarchical
TST: test TypeError in Ward
TST: more tests for hierarchical
DOC: notes on improving code coverage
COSMIT: explainations of the partial import
MISC: build_utils: module rather than a subpackages
ENH: use sklearn.__version__ in setup.py
Merge branch 'linking_arrayfuncs'
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Cosmit: comment
TST: fix doctest
Update whats_new
Clean: remove debug print
PEP8
Typos
BUG: keep same shape for y in MultiTaskLasso
DOC: explicit MultiTaskLasso.coef_ dimensions
DOC: formatting and rephrasing in MultiTaskLasso
Merge pull request #1005 from NelleV/MDS
ENH: understandable error message for X sparse
BUG: casting rule with recent numpy
BUG: do not use diag_indices
BUG: choose seed to get affinity test working
BUG: fix my fix for affinity :(
DOC: link to Randomized sparsity in Lasso section
Merge branch 'master' into mixins
Revert "Rename Y to y in PLS"
Merge branch 'master' of github.com:scikit-learn/scikit-learn
BUG: sparse matrices in ElasticNetCV
MISC rest
DOC: improve scale_c_example
DOC: add a reference on multi-output trees
MISC: docstring work
BUG: fix setuptools feature
MISC: small docstring work
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Minor changes to contributing
BUG: parallel computing in MDS
BUG: deprecated k parameter in MiniBatchKMeans
BUG: copy and keep ordering
BUG: remove leftout debug prints
DOC: Enet alpha=0 => advice to use LinearRegression
MISC: add ltsa in docstring
ENH: MCD for large dataset
BUG in error message for k-means
BUG in error msg for spectral clustering
BUG: propagate random-state in MCD
DOC: protect `classes_` for valid rst
FIX: doctests under Windows 64bit
Update changelog
DOC: use nosetests rather than sklearn.test()
ENH: support arbitrary dtype in kNN classifiers
TEST: predict_proba in knn classifier with y string
DOC: fix doc mistakes
DOC: another layout fix
DOC: add doc on making a release
TST: cater for 0.9 not > 0.9
BUG: obey numpy 1.7's stricter rules
Merge remote-tracking branch 'origin/pr/1234'
BUG: cater for dev versions of numpy
MISC: use toarray instead of todense
BUG: RandomizedPCA needs random_state set
ENH: make RandomizedPCA.fit idempotent
TST: fix doctest
TST: fix test counting warnings
BUG: follow scipy API change
DOC: typo
DOC: improve the model selection exercise
ENH/FIX add a lobpcg solver to spectral embedding
MISC: decrease verbosity by default
FIX: numerical stability in spectral
MISC: addressing @satra's comments
ENH: make sure that spectral EVD is solve once
MISC: @agramfort's comments
PEP8
BUG: fix test error
BUG: make precision_recall invariant by scaling probs
BUG: fix setuptools feature
MISC: split example in two plots
API: change 'embed_solve' to 'assign_labels'
TST: increase coverage in spectral clustering
DOC: add docs of assign_label in spectral clustering
COSMIT: long remarks go in 'notes' section
BUG: restore numpy 1.3 compatbibility
MISC: minor clean ups in hmm code
Merge pull request #1290 from tjanez/master
COSMIT: pep8 in arrayfuncs.pyx
BUG: dot on sparse matrices broken in recent numpy
BUG: fix doctest bug
DOC: improve wording in covariance docs
DOC: typo
COSMIT: pep8, wording, layout
DOC: fixed string formatting in example
MISC: remove unused import
BUG: LassoLars path ending contained junk
TEST: one addition test on the length of the path
TEST: test that alpha is decreasing in LassoLars
BUG: lars corner case with path length == 1
ENH: multi-target Lars: lists rather than arrays
ENH: early stopping LARS for degenerate active set
MISC: address comments
MISC: more precise warning
WIP: drop for good correlated regressors
MISC lars_path: cleaner code in degenerate case
ENH: early stopping for lars
TST: add a test for lasso and lars
MISC: comment
ENH lars_path: early stopping after drop for good
TST: difficult test for early stopping
COSMIT: better comments
BUG: missing import introduced by rebase
DOC: Update whats_new with lars improvements
BUG: compat with numpy 1.3
DOC: spelling
BUG: AUC should not assume curve is increasing
COSMIT
DOC: LinearRegression document the shape of coef_
DOC: n_responses -> n_targets
TEST: decrease precision in test_lars_drop_for_good
BUG: imports should be locals
MISC: wording of doc/comments in example
ENH: RandomForestEmbedding in lle_digits example
DOC: cross-ref Random Forest embedding and manifold
DOC: list of dicts in GridSearchCV
DOC: wording and layout on front page
ENH: update joblib to 0.7.0a
BUG: fix properties on joblib files
BUG: Add forgotten file
BUG: update joblib to 0.7.0b
BUG: fix murmurhash compilation with recent Cython
ENH: use broadcasting, not tile
COSMIT: pep8
TEST: fixing randomly failing test
ENH: rng local to tests
TEST: add a test of sample weights
TST: improve last test
BUG: fix sample_weights in ridge
BUG: shape bug in tests
BUG: fix sample weights in ridge
BUG: Ridge: sample_weights in intercept
TST/BUG: test_common sample_weights in ridge
ENH: random reassignements in MiniBatchKMeans
ENH: fine tuning to the random assignement
DOC: example of dict-learning with KMeans
DOC: improve online KMeans example
DOC: dict learning with kmeans in narrative doc
DOC: fix typo
BUG: check n_clusters == len(cluster_centers_)
PEP8
DOC: change the example to lighter dataset
ENH: more control on reassignment in MiniBachKMeans
DOC: link to example
DOC: add comment
DOC: complete whats_new
TST: random reassignment in MiniBatchKmeans
TST: test verbosity mini_bach_kmeans
ENH: control random_state in MiniBatchKMeans
COSMIT: simplify parallel code in multiclass
DOC: put math at back, simplify formulation
MISC: fix rst in whats_new
MISC: index arrays with integers
DOC: voronoi + kmeans picture
DOC: typo in warning
BUG: reassignment_ratio == 0 in MiniBatchKmeans
BUG: sparse center reassignment MiniBatchKMeans
BUG: sparse vs non sparse centers
BUG: fix test to use sparse array
DOC: reference for discretise option
COSMIT :: in rst is easier for syntax highlighters
DOC: minor formatting in model_evaluation.rst
DOC: minor rst issues
DOC: misc rst formatting
COSMIT: prettify code and figure in example
COSMIT
Merge branch 'treeweights'
Merge pull request #1656 from rlmv/idf_diag
BUG: update joblib to 0.7.0d
TST: add a test for empty reassignment in MBKmeans
BUG: highly-degenerate roc curves
BUG: fix change of behavior in last commit
DOC: add example and ref to lars_path in lasso_path
BUG: ElasticNectCV choosing improper l1_ratio
ENH: minor changes for numpy versions
DOC: remove typo
DOC: libatlas3-base in requirement
ENH: Avoid computations in ElasticNetCV
ENH: improve memory usage in ElasticNetCV
DOC: docstring of private functions
BUG: fix sparse support in ElasticNetCV
COSMIT: address @agramfort's comments
DOC add 2012 GSOC students
COSMIT: labels in plot_lasso_coordinate_descent_path
COSMIT: txt -> rst
DOC: cosmit - fix latex typo
ENH: avoid MemoryError on manhattan_distances
BUG: old versions of numpy
BUG: old versions of numpy
MISC: details about the donations
BUG: type conversion in spectral_embedding
MISC: remove unused imports
BUG: restore Python 2.6
COSMIT: two empty lines between functions
Merge branch 'pr_1732'
BUG: fix sparsetools tests in old scipy
PEP8
Cosmit
Merge branch 'pr_2002'
BUG: fix unsafe casting
DOC: improve RBM example
MISC: remove unecessary dtype
ENH: better error message on scoring
DOC: reorganize model_evaluation
MISC: address comments and test failure
DOC: address remarks by @NelleV
DOC: Address @larsman's comments
DOC: @amueller's comments
ENH: Add the hungarian algorithm
TEST: Increase testing of hungarian
MISC: cosmit in hungarian
ENH: Speed up in hungarian
ENH: More speedups in hungarian
ENH: More speedups in hungarian
ENH: Still more speed ups in Hungarian
ENH: More speedups on Hungarian
API: scikits.learn -> sklearn
BUG: fix some numpy 1.3 compat issue
BUG: numpy 1.6 compat
:
BUG: fix kde tests
MAINT: update copy_joblib script
ENH: update joblib to 0.7.1
MAINT: misc change to copy_joblib
ENH: make bdist_rpm work
COMPAT: empty_like does not have a dtype in np 1.3
COMPAT: fix arpack and pls on old scipy/numpy
COMPAT: string formatting syntax in Py 2.6
COMPAT: median and nans in old numpys
COMPAT: no assert_warns in np 1.3
BUG: fix Py 3
DOC: invert priorities bootstrap <-> nature.css
DOC: sidebar lighter
ENH: add a new DataConversionWarning
MISC: fix plot_multilabel example
BUG: implement concrete __init__ for SGDRegressor
BUG: tests were raising the DataConversionWarning
Merge branch 'pr_2304'
MAINT: recompile Cython files
DOC: add whats_new on the news
TST: adjust test relying on change order
MISC: deprecate balance_weights (it's internal)
MISC: update whats_new
MISC: fix reference to example
DOC: DBSCAN misc doc formatting
DOC: also point installation menu to stable
MAINT: remove sklearn.test()
MISC: deprecation notice
DOC: reduce the number of examples
MISC: document sklearn.test deprecation
ENH: custom distutils clean command
DOC: layout tweaks
DOC: bigger menu fonts
DOC: button layout tweak
TST: avoid a crash in Windows + Anaconda Py3.3
MISC: fix wrong timing in example
TST: avoid nose running sklearn.test as a test
MAINT: randn on float is deprecated
MISC: deprection is in 2 releases
DOC: fix CSS bug
MAINT Update mailmap
REL: update whats_new and version
MAINT/DOC: bump docs and rev numbers to 0.15-git
DOC: link to documentation, not main page
TST: fix test on scipy dev version
MISC: switch line returns back to unix
BUG: restore setup.py clean functionality
MISC: fix rst
COSMIT: fix docstrings in affinity_propagation
ENH: add single_linkage
ENH: single linkage with Cython
ENH: speed up single linkage
ENH: start of complete linkage
ENH: first sketch of weighted linkage
BUG: fix bug introduced in Ward
ENH: add average linkage
BUG: remove debug print
WIP: MST for single linkage
Cosmit
MISC: refactor Ward and linkage in same object
ENH: hierarchical clustering example
DOC: nice hierarchical clustering example
API: rename to agglomerative clustering
Add cython code
MISC: manual merge of master
MISC: Take in account @NelleV's comments
TEST: increase test coverage
MISC: move fast_dict to utils
MISC: new formulation in message
COSMIT: remove unused import
ENH: add single_linkage
ENH: single linkage with Cython
ENH: speed up single linkage
ENH: start of complete linkage
ENH: first sketch of weighted linkage
BUG: fix bug introduced in Ward
ENH: add average linkage
BUG: remove debug print
WIP: MST for single linkage
Cosmit
MISC: refactor Ward and linkage in same object
ENH: hierarchical clustering example
DOC: nice hierarchical clustering example
API: rename to agglomerative clustering
Add cython code
MISC: manual merge of master
MISC: Take in account @NelleV's comments
TEST: increase test coverage
MISC: move fast_dict to utils
MISC: new formulation in message
Merge branch 'hc_linkage' of github.com:GaelVaroquaux/scikit-learn into hc_linkage
ENH: better error message
ENH: add the cosine distance to paired distances
ENH: different metrics in hierarchical cluster
BUG: Remove UTF8 character
MAINT: remove utf-8 headers
WIP
Merge pull request #2418 from rsivapr/typo-docs
Merge pull request #2404 from agramfort/fix_probabilitic_pca
DOC: add Rangespan testimonial
DOC: add new testimonial
MAINT: Add COPYING and README to MANIFEST
Merge pull request #2424 from sergiopasra/no-shebang
DOC: fix bestofmedia testimonial
Cleanup
ENH: address @larsman comments on fast_dict.pyx
MISC: improve tests of fast_dict
Merge pull request #2462 from kmike/metrics-unicode
BUG: fix typing
BUG: fix types
Clean: Separate hierarchical code from fast_dict
Merge pull request #2527 from shoyer/permutation_test_score-docstring
Merge pull request #2505 from rphlypo/master
Merge pull request #2525 from rmcgibbo/hmmfix2
Merge pull request #2517 from jaquesgrobler/simpler-collapsible-toctrees
Merge pull request #2533 from samuelstjean/patch-1
Merge pull request #2543 from johncollins/dev
Merge pull request #2565 from change/change_org_testimonial
MAINT: make gen_rst more robust (minor)
BUG: ward does not take an affinity
DOC: example metric + hierarchical clustering
WIP example l1 metrics hierarchical cluster
BUG: fix partial_fit of MiniBatchDictLearning
DOC: update whats_new
Merge pull request #2615 from rgommers/fix-numfocus-link
Merge pull request #2620 from jaquesgrobler/master
BUG: fix convergence check in OMP
Merge pull request #2488 from oddskool/pred_latency
Merge pull request #2626 from jnothman/string_objects
Merge pull request #2653 from jaquesgrobler/remove_ellipticenvelope_deprecation
Merge pull request #2654 from arjoly/metrics-0.15
Merge pull request #2667 from kowalski87/master
DOC: add phimeca testimonial
Merge pull request #2671 from agramfort/fix_f_oneway_int
BUG: OneVsOneClassifier was broken with string labels
COSMIT: simplify code
Merge pull request #2674 from GaelVaroquaux/bug_ovo_string_y
Merge pull request #2702 from glouppe/imputer-copy
Merge pull request #2714 from amormachine/master
Typo in examples
COSMIT: remove unused import
Merge pull request #2760 from AlexanderFabisch/check_error_patterns
TST: add Py2.6 on travis
BUG: assert_raises_regexp for Py2.6 compatibility
TST: get travis to find the system packages
TST: try to get travis to work with Py2.6
TST: get travis working on Py26 and Py27
TST: Add Python 3.3 to travis
Merge pull request #2767 from GaelVaroquaux/my_master
Merge pull request #2779 from jnothman/ref_sec_link
DOC: some underlines were too short
Merge pull request #2781 from blagarde/patch-1
Merge pull request #2782 from blagarde/patch-2
Merge pull request #2776 from DanielWeitzenfeld/master
Merge pull request #2775 from stefan-w/python-3-fix
WIP: fix StratifiedShuffleSplit
BUG: StratifiedShuffleSplit not obeying n_train
BUG: avoid same indices in test and train
COSMIT
Merge pull request #2791 from adrinjalali/master
Merge pull request #2794 from cli248/master
Merge pull request #2799 from charlescearl/charles-label-prop-doc-updates
Merge pull request #2797 from cli248/fix-DeprecationWarning
DOC: more context in embedded code
DOC: more robust compilation and CSS
Merge pull request #2768 from robertlayton/meanshiftdoc
Merge pull request #2820 from ankit-maverick/issue2819
MISC plot_rfe_with_cross_validation: better comments
Merge branch 'master' into hc_linkage
BUG: different distance names in scipy
BUG: fix minor floating-point precision detail
ENH: Py3 support
BUG: fix tests under Python 3
DOC: improve the clustering metric example
DOC: AgglomerationClustering doc and examples
BUG: _alpha_grid undefined symbol
DOC: Agglomeration clustering docs and examples
DOC: connectivity in agglomerative clustering
DOC: finish touches to hierarchical docs
DOC: fix rest syntax
MISC: typos in docs
MISC: remove reference to deprecated Ward
MISC: replace 'assert' with actual ValueError
MISC: increase test coverage and cosmetics
Merge pull request #2830 from eltermann/wrt-abbreviation
Merge pull request #2750 from jnothman/no-str
MISC: better error messages in BaseLibLinear
MISC+DOC: documentation for FeatureAgglomeration
Merge pull request #2859 from eltermann/fixed-typo
Merge pull request #2864 from sciunto/doc
COSMIT: many cosmetic comments
FIX: fix failing doctest
DOC: link FeatureAgglomeration better to dim red
MISC: documentation and cosmits
BUG: work aroundd Cython bug to build with clang
Merge pull request #2865 from hanke/boston_13features
Merge branch 'master' into hc_linkage
Merge pull request #2877 from pmandera/tfidf-citation
Merge pull request #2917 from ariddell/patch-1
TST: test utils.extmath more robust
MAINT: remove our solve_triangular
MAINT: commit generated code
MAINT: scipy 0.10 and 0.11
DOC: rmk different feature scaling in agglomeration
Merge pull request #2199 from GaelVaroquaux/hc_linkage
Merge pull request #2929 from ajtulloch/izip-tests
BUG: avoid NaNs in arrays passed to scipy.linalg
DOC: docstring formating
Merge pull request #2943 from jyu-rmn/patch-1
Merge pull request #2941 from perimosocordiae/patch-1
Merge pull request #2938 from jnothman/clean_impute
Cosmit
MAINT: point out HMMLearn
DOC: improve Thiel-Sen vs RANSAC example
ENH: speed up theilsen
Merge pull request #2955 from larsmans/less-blas
DOC: add OkCupid testimonial
Merge pull request #3003 from jnothman/hyperlinked_tt_colour
BUG FeatureAgglomeration meaningless for no samples
TST: more robust test of dropping in Lars
TST: relax warning class checked in lars
TST: lars tests: more brutal ill-conditionning
Merge pull request #3074 from msalahi/iris-file-warnings
Merge pull request #3066 from denisgarci/agglomeration-transform-constructor
Merge pull request #3096 from ash211/patch-1
Merge pull request #3101 from fabianp/logos
BUG: test_lasso_lars_vs_lasso_cd_ill_conditioned
COSMIT: rng.permutation rather than rng.shuffle
Merge pull request #3104 from MechCoder/swap_sparse
BUG: fix doctests
DOC: fix restructured text
Remove HMM examples
Revert "DOC : really fix alpha docstring in dict learning"
COSMIT: remove unused import
MISC: use precomputed variable
MISC: update comment in source
Merge pull request #3193 from mjbommar/issue-3167-eradicate-todense
Merge pull request #3157 from mjbommar/isotonic-increasing-auto
Merge pull request #3199 from mjbommar/isotonic-out-of-bounds-2
Optim: small optimizations
FIX: examples with more than 10 images
Merge pull request #3280 from larsmans/remove-hmm-docs
Merge pull request #3297 from esc/doc/random_kitchen_sink
Merge pull request #3333 from ogrisel/maint-skip-download-tests
Merge pull request #3357 from houbysoft/patch-1
Merge pull request #3369 from NelleV/install_doc
Merge pull request #3353 from kastnerkyle/skip_ompcv_travis
MAINT: more explicit glob pattern in doc generation
DOC: more readable whats_new
DOC: fix classes not referenced in docs
TEST: metrics sample_weight test: almost_equal
Merge pull request #3381 from kastnerkyle/skip_omp_cv_regressors
TEST: fix test failing due to numeric instability
COSMIT: address misc comments
Minibatch k-means: change reassignment to uniform
MISC: avoid deprecation warning
MISC: minor changes in MBKmeans
DOC: spelling and phrasing
MISC: better formulation
Merge pull request #3363 from ogrisel/appveyor-ci
Merge pull request #3376 from GaelVaroquaux/fix_mb_kmeans
Merge pull request #3382 from ogrisel/install-doc-update
BUG: fix windows pointer size problem
FIX: try to get windows working
Merge pull request #3385 from GaelVaroquaux/fix_intp
MAINT: minor mailmap update
MISC: avoid overriding figure in benchs
MISC: <= rather than < in tol check (KMeans)
Merge pull request #2694 from amueller/allow_y_lists
BUG: Support array interface
ENH: enable y to only implement the array interface
COSMIT: fix indentation
BUG: fix validation bug
ENH: support non-ndarray subclasses in supervised estimator
ENH: transformers work on non ndarray subclasses
DOC: more comments
COSMIT:
FIX broken test
Merge pull request #3414 from lesteve/scheduled-removal-from-0.15
Merge pull request #3418 from arjoly/dummy-weight
ENH: add a median absolute deviation metric
DOC: add an example of robust fitting
TST: fix doctest
DOC: better documentation for robust models
API: naming: CamelCase class -> camel_case function
Merge pull request #3444 from arjoly/explain-default-score
Merge pull request #3442 from arjoly/split-metrics-module
TST: fix tests on numpy 1.9.b2
Merge pull request #3443 from amueller/input_validation_refactoring
DOC: Fix class link
TST: Fix warnings in np 1.9
DOC:Typo in doc
Merge pull request #3438 from hamsal/sprs-out-dmy
MAINT: add license info
MAINT: update joblib to latest release 0.8.3
MAINT: better gitignore
MAINT: be robust to numpy's DeprecationWarning
Merge pull request #3598 from GaelVaroquaux/np_warnings
Merge pull request #3604 from rahiel/patch-1
DOC: more explicit title
BUG: n_samples instead of n_features in cd_fast
ENH: minor speed-up in k-means
ENH: Remove unused copy
COSMIT: explicit comment
Merge pull request #3608 from agramfort/fix_gnb_proba
Merge pull request #3580 from jnothman/gridsearch-score
Merge pull request #3611 from ogrisel/fix-isfinite-windows
Merge pull request #3564 from jnothman/bounds
Merge pull request #3587 from jnothman/transforms-doc
Merge pull request #3678 from akshayah3/LinearReg
Merge pull request #3730 from ogrisel/fix-libsvm-random-seed
Merge pull request #3763 from ogrisel/fix-warning-test-affinities
Merge pull request #3797 from Titan-C/clean
Merge pull request #3854 from justmarkham/doc-changes
Merge pull request #3862 from Titan-C/doc_comment
Merge pull request #3852 from lesteve/add-cds-funding
COSMIT: minor cleanup
Merge pull request #3690 from queqichao/fix_bug_in_cross_validation_when_using_sparse_matrix_for_fit_params
Merge pull request #3823 from ragv/omp-3644
Merge pull request #3900 from dimazest/latexpdf
Merge pull request #3869 from borjaayerdi/rfe_n_features_to_remove
Merge pull request #3936 from sethdandridge/typofix
Merge pull request #3953 from Titan-C/savefigarg
Merge pull request #3158 from mvdoc/clustering
Merge pull request #3986 from amueller/remove_allow_nd_in_train_test_split
Merge pull request #4042 from lmichelbacher/patch-2
Merge pull request #3961 from trevorstephens/rf-class_weight
Merge pull request #4097 from amueller/remove_unneccessary_initialization
Merge pull request #4111 from ragv/slinear_isotonic
Merge pull request #4159 from aflaxman/gp_docs
Merge pull request #4174 from ragv/fixes_4173
Fix rst formatting
DOC: fix path
Merge pull request #4203 from gwulfs/rel_projs
DOC: stock_market example: follow change of symbol
Merge pull request #4401 from ogrisel/rebased-pr-4157
Merge pull request #4433 from gwulfs/master
Merge pull request #4389 from vortex-ape/move_test
MAINT: update mailmap
Merge pull request #4505 from jmetzen/isotonic_make_unique_fix
Merge pull request #4563 from mspacek/patch-2
Merge pull request #4560 from ogrisel/shuffle-passthrough
Merge pull request #4588 from amueller/random_forest_predict_check
Merge pull request #4541 from amueller/robust_input_dtype_check
Merge pull request #4595 from sseg/bicluster-example
Merge pull request #4592 from clorenz7/bugfix/gauss_proc_eps_issue
Merge pull request #4434 from xuewei4d/catch_LinAlgError_gmm
DOC: add a testimonial by infonea
Merge pull request #4653 from ssaeger/issue_4633
Merge pull request #4681 from trevorstephens/gplearn-link
Merge pull request #4665 from kianho/plot-ensemble-oob-trace
Merge pull request #4695 from perimosocordiae/patch-1
Merge pull request #4686 from bnaul/graph_lasso_tests
Merge pull request #4663 from Titan-C/sidebar
Merge pull request #4347 from amueller/class_weight_auto
Merge pull request #4952 from stephen-hoover/fix_nmf_init_doc
Merge pull request #4972 from ogrisel/doc-whatsnew-joblib-0.9.0b2
Merge pull request #5018 from ltiao/patch-1
Merge pull request #5016 from lesteve/update-joblib-to-0.9.0b3
Merge pull request #5049 from amueller/gp_reshape_fixes
Merge pull request #5047 from amueller/scipy_0.16_fixes
Merge pull request #5072 from amueller/fix_fun_conversion_warning
DOC: minor cosmetic to ROC example
Merge branch 'master' into pr_3651
Merge branch 'master' into pr_4009
Merge branch 'master' into pr_4009
Merge pull request #5187 from GaelVaroquaux/pr_4009
Merge pull request #4253 from Flimm/doc-precision_recall_fscore_support
DOC: update whats_new
TST: Use a local random state
Merge pull request #4104 from maheshakya/johnson-lindenstrauss_bound_example_fix
Merge pull request #4312 from xuewei4d/friendly_error_on_kneighbors
TST: No n_jobs=-1 in the test
Merge pull request #5188 from GaelVaroquaux/pr_4090
Merge pull request #5184 from christophebourguignat/master
Merge pull request #4767 from trevorstephens/passive-aggressive_cw
Merge pull request #4779 from martinosorb/parallel-ms
Merge pull request #4881 from sonnyhu/weighted_least_squares
Merge pull request #5189 from GaelVaroquaux/pr_4779
Merge pull request #5182 from MechCoder/predict_proba_fix
Merge pull request #2459 from dougalsutherland/euclidean_distances
Merge pull request #5194 from christophebourguignat/master
Merge pull request #5201 from christophebourguignat/master
Merge pull request #5289 from carrillo/master
Merge pull request #5253 from ogrisel/online-lda-with-parallel-context-manager
Merge pull request #5301 from ogrisel/python-3.5-inspect-deprecation
Merge pull request #5311 from jereze/master
Merge pull request #5317 from hlin117/verbose-sparse-encode
Merge pull request #5334 from ogrisel/better-ensure-min-error-message
Merge pull request #5332 from larsmans/nmf-cleanup
Merge pull request #5346 from Naereen/patch-1
DOC: correct docstring
Ganiev Ibraim (5):
OneClassSVM sparsity regression test added
deleted useless assert
Removed unnecessary class checks
Style fix
Fix of issue #5601
Garrett-R (2):
Edited _get_weights to make it robust to zero distances
Avoid division by zero warning in KNeighbors estimators
Gilles Louppe (888):
DOC: Missing dot in Pipeline class description
Enforce axis=1 in Normalizer.transform + doc fixes
DOC: Fixed issue #110
DOC: Missing import in doctests
BUG: `copy=None` in `Scaler.transform` instead of `copy=False`
Complete rewriting of samples_generator.py
Fixes for broken tests due to the API changes in samples_generator.py (1)
Merge remote-tracking branch 'upstream/master' into samples_generator
Merge remote-tracking branch 'upstream/master' into samples_generator
Fixes for broken tests due to the API changes in samples_generator.py (2)
Fixes for broken benchmarks due to API changes in samples_generator.py
Fixes for broken examples due to changes in samples_generator.py
`seed` renamed to `random_state` and default value set to None.
Added references to functions in the `datasets` module.
Merge remote-tracking branch 'upstream/master' into samples_generator
Fixed a broken test.
Added tests for the samples generator module.
Added references to samples_generator.make_* functions in the documentation.
Small improvements in the documentation of the toy datasets.
dictionnary -> dictionary
Merge remote-tracking branch 'upstream/master'
Improvements of the RFE module.
Merge remote-tracking branch 'upstream/master'
Documentation + PEP8
More robust test on `step`.
Fixed a syntax error
Small code simplification.
Merge remote-tracking branch 'upstream/master'
Improved test coverage of rfe.py to 100%
Fixes of minor bugs + improved test coverage (now 100%)
Addressed Gael's comments.
Addresses Gael's comments. (2)
Addresses Gael's comments. (3)
Typo.
Improved test coverage of samples_generator and feature_extraction modules.
Fixed a small bug introduced by a previous commit.
Merge remote-tracking branch 'upstream/master' into test-coverage
Improved documentation + predict/score.
Cosmit
Typo
Typo (2)
Merge remote-tracking branch 'upstream/master'
PEP8
Merge remote-tracking branch 'upstream/master'
Fixed examples
Improved test coverage to 100%
Added RFE into the narrative documentation
Doc: grammar
Added n_features_ attribute to RFE
Moved "feature selection" section back into the "supervised learning" chapter
Ensure 0.0 on diagonal elements if X is Y
Doc: Implementation details of euclidean_distances
Merge pull request #343 from glouppe/euclidean_distances
ENH: `np.fill_diagonal` replaced with more portable code. Added an explanatory comment.
scikits-learn -> sklearn
Added link to personal web page
Changes on the feature_selection module.
ENH: Cleaned setup.py
Merge remote-tracking branch 'bdholt1/enh/tree' into tree
DOC: Some docstrings have been rewritten + small cosmetic changes
Merge remote-tracking branch 'bdholt1/enh/tree' into tree
DOC: Improved documentation + cosmit changes
COSMIT: GraphViz exporter cleaned up
ENH: Made apply_tree_sample slightly more efficient + various cosmits
Regenerated _tree.c
Fixed issue #378 on the RFE module
Updated changelog.
Added a numerical stability test to decision trees
Added a numerical stability test to decision trees
Revert "Added a numerical stability test to decision trees"
Merge remote-tracking branch 'upstream/master'
Merge remote-tracking branch 'upstream/master'
Merge remote-tracking branch 'upstream/master'
DOC: Added load_boston in classes.rst
Merge remote-tracking branch 'upstream/master'
Simplified tree module API.
Added some comments
Allow for max_depth to be set to None
Simplified the tree code
Added k_features argument to build randomized trees.
First draft at find_best_random_split (not yet tested)
Renamed k_features to max_features
Added some explanatory comments into the code logic
Re-extended the _build_tree API
Factored is_classification
Added ExtraTreeClassifier and ExtraTreeRegressor
Typo
First draft at forest of random trees (work in progress)
Added some tests
Cosmit
Fixed bugs in forest + first test
Check X is a fortran-array and y is contiguous
Fixed bugs
Added tests of the forest module (work in progress)
Default value of n_trees=10
bootstrap=False for extra-trees
Set random_state=1 in tests
Added documentation in the forest module (work in progress)
Cosmit
Completed documentation
Added some tests
Added predict_log_proba
Added some more tests
Removed old random forest files
Added some more tests
Cosmit
Regenerate _tree.c
Fixed a small bug
Cosmit
Use super()
Use take instead of __get_item__
Rewrote some comments
Cosmit
Revert changes on conf.py (mistake on my part)
Added random_state parameter to _find_split functions
Factored out changes on the ensemble module
Merge remote-tracking branch 'origin/master' into tree
Fixing conflicts
Merge remote-tracking branch 'upstream/master'
Removed extra-trees (for now)
Removed extra-trees from __init__
Removed extra-trees (again!)
Merge pull request #432 from glouppe/tree
Merge remote-tracking branch 'upstream/master'
Merge remote-tracking branch 'upstream/master'
Rebase of @bdholt1's ensemble branch
DOC: Added module descriptions
PEP8: tree.py, forest.py
Merge remote-tracking branch 'upstream/master' into ensemble-rebased
DOC: Added warning and see also
ENH: Modified forest API to make it possible to grid-search the parameters of the underlying trees
Merge remote-tracking branch 'upstream/master' into ensemble-rebased
ENH: Check that base_tree is an estimator
ENH: Make forest derive from BaseEnsemble
Removed Bagging and Boosting modules from this PR
ENH: Make the Forest's API coherent with BaseEnsemble's API
FIX: Don't clone estimators at instantiation
TEST: Added test case for grid-searching over the base tree parameters
ENH: Cosmit
EXAMPLES: Improved plot_tree_regression
Typo
EXAMPLES: Improved plot_iris
EXAMPLES: Added plot_forest_iris
FIX: Trees couldn't be cloned properly
ENH: Added __init__.py into ensemble/tests/
DOC: Improved documentation in the examples
PEP8
TEST: Added tests of BaseEnsemble
TEST: Improved test coverage
EXAMPLES: Fixed a bug in plot_forest_iris
DOC: Cosmits in the narrative documentation of the tree module
DOC: Improved narrative documentation of the tree module
DOC: Added ensemble methods to TOC
DOC: Added ensemble methods to the class reference
DOC: First draft at the narrative documentation of the ensemble module
DOC: Narrative doc of the ensemble module (work in progress)
DOC: Completed the narrative documentation (work in progress) + What's new
DOC: Fixed What's new
DOC: Last details on the narrative documentation
DOC: Added a last example in the narrative doc
Merge pull request #1 from ogrisel/glouppe-ensemble-rebased
DOC: Address @vene and @satra comments
TEST: Added test_base_estimator
DOC: Cosmit
ENH: Simplified RandomForest and ExtraTrees API
ENH: Use trailing _ for private attributes
DOC: Added warning in make_estimator
DOC: Removed 'default'
FIX: Bug with bootstrapping
FIX: Bug with bootstrapping (2)
FIX: Bug in plot_forest_iris
Merge remote-tracking branch 'upstream/master'
DOC: Use ELLIPSIS in doc-test
Cosmit
ENH: Address @agramfort comments
Benchmark: Added random forests and extra-trees to bench_sgd_covertype.py
Merge remote-tracking branch 'upstream/master'
Merge remote-tracking branch 'upstream/master'
FIX: Use random_state in _find_best_random_split
Merge remote-tracking branch 'upstream/master'
Merge remote-tracking branch 'upstream/master'
First draft at Reference rewrite
DOC: "the scikit-learn" -> "scikit-learn"
DOC: References to user guide sections
DOC: Standardize the module documentation format (work in progress)
DOC: Standardized the module documentation format (2)
DOC: Fixed graph_lasso reference
DOC: "Class Reference" -> "Reference"
DOC: Fixed warning
DOC: Changed sections titles in the reference
Merge pull request #461 from Balu-Varanasi/bug_in_rst_file
Merge pull request #467 from Balu-Varanasi/pep8-compliant
DOC: Fixed broken reference to user guide
Merge remote-tracking branch 'upstream/master'
ENH: Added feature importances to decision trees and to forests
TEST: Added test on feature importances
EXAMPLE: Added examples for feature importances using trees
COSMIT: rfe examples
Merge branch 'master' of https://github.com/scikit-learn/scikit-learn
EXAMPLE: Improved plot_forest_importances.py plot
COSMIT: tree examples
DOC: Fixed links to modules in the example gallery
DOC: Fixed broken links
EXAMPLE: Moved to the Olivetti dataset
ENH: Accelerate ensemble of trees by precomputing X_argsorted
FIX: bootstrap=False by default with extra-trees
EXAMPLES: Removed useless import
ENH: Use extra-trees instead of rf
COSMIT: examples
Added links and various cosmits
DOC: Added fetch_olivetti_faces to Reference
Merge branch 'master' of https://github.com/scikit-learn/scikit-learn
DOC: Cosmits on the Support page
ENH: Parallel fit/predict/predict_proba/feature_importances in forest
FIX: Ensure random random_states
ENH: use pre_dispatch
DOC: Return->Returns
Merge branch 'master' of https://github.com/scikit-learn/scikit-learn
DOC: Cosmit on the reference
ENH: Improved _parallel_predict_proba
DOC: add n_jobs to specs
ENH: Assign chunk of trees to jobs
EXAMPLE: renamed Frankenstein, set cmap in matshow
ENH: Forest -> BaseForest
DOC: Added reference for feature importance
ENH: Revisited importances API
Merge remote-tracking branch 'upstream/master' into tree
EXAMPLE: Fixed API changes
ENH: Missing default value for feature_importances_
ENH: Added SelectorMixin
TEST: Added tests of transform
ENH: Simplified API
DOC: Tree-based feature selection
ENH: don't sum if coef_ is 1-d
ENH: Inherits from TransformerMixin
PEP 257
PEP 257 (bis)
ENH: address @ogrisel comments
FIX: Used np.abs instead of ** 2
Merge remote-tracking branch 'upstream/master' into tree
ENH: Smart thresholds
Cosmit
PEP8
DOC: :mod: link
ENH: no predispatch with chunk strategy
FIX: Address Gael comments
Merge remote-tracking branch 'upstream/master' into parallel-forest
ENH: Simplified parallelization
PEP8
ENH: Simplified code
DOC: Quick docstrings for private functions
FIX: Revert changes
DOC: What's new
Merge pull request #2 from ogrisel/glouppe-parallel-forest
FIX: Address @ogrisel comments (1)
Merge branch 'parallel-forest' of github.com:glouppe/scikit-learn into parallel-forest
TEST: Added tests of parallel computation
DOC: Parallel computations in forest
TEST: Improved coverage of the ensemble package to 100%
DOC: Renamed example (+Parallel)
Merge remote-tracking branch 'upstream/master' into parallel-forest
Merge pull request #491 from glouppe/parallel-forest
DOC: Added missing BSD3 licenses
ENH: Better default values to trees and forests
TEST: Added tests of max_features values
DOC: Review of the narrative doc wrt max_features
DOC: Added warning to default values
DOC: typo
Merge pull request #523 from glouppe/tree-doc
DOC: fix broken doctest
FIX: max_features=None by default on single DT
Merge pull request #527 from otizonaizit/master
FIX: Add reference (stop words)
Merge remote-tracking branch 'upstream/master' into issue349
Merge pull request #528 from glouppe/issue349
DOC: Removed performance and utilities from toctree (they were appearing twice)
DOC: Fixed 'See also' in tree/forest
DOC: typo
2011 -> 2012
Merge pull request #627 from amueller/min_leaf_cherrypick
Merge pull request #684 from clayw/graphviz-fix
PEP8
ENH: move _compute_feature_importance into Tree
ENH: Use DTYPE instead of float64
Cosmit
ENH: Moved _build_tree into Tree
Cosmits + Fix to a test
Revert "ENH: Use DTYPE instead of float64"
FIX: return; instead of return NULL;
FIX: avoid dividing by zero in Tree.compute_importances
ENH: parallel computation of X_argsort
ENH: better argsort
ENH: cosmit and doc
Merge pull request #761 from glouppe/master
ENH: MultiOutputTree (wip)
Merge branch 'master' of github.com:scikit-learn/scikit-learn into tree-mo
ENH: Multi-output decision trees
ENH: Regenerate .c file
FIX: graphviz test
Merge branch 'master' of github.com:scikit-learn/scikit-learn into tree-mo
FIX: test_classification_toy
TEST: test_multioutput (1)
TEST: test_multioutput
ENH: make forests support multi-output
TEST: test_multioutput
ENH: Patch GradientBoosting
ENH: Patch GradientBoosting (2)
FIX: log_proba + DOC
DOC: What's new
PEP8
ENH: graphviz
DOC: narrative documentation
DOC: typo
DOC: Scikit-Learn -> scikit-learn
ENH: Cython improved code
ENH: Cython improved code (2)
DOC: narrative documentation
FIX: use and modify own y
COSMIT
FIX: segfault
DOC: Example
DOC: typo
DOC: example
DOC: typo
DOC: narrative documentation
DOC: docstrings for criteria
DOC: docstrings
Merge branch 'master' of github.com:scikit-learn/scikit-learn into tree-mo
Merge pull request #3 from bdholt1/glouppe-tree-mo
Merge branch 'master' of github.com:scikit-learn/scikit-learn into tree-mo
DOC: format
Merge pull request #923 from glouppe/tree-mo
Fix broken bot (sorry for that!)
Fix broken bot (again ;))
Merge branch 'master' of github.com:scikit-learn/scikit-learn into tree-speedup
DOC: What's new > Missing links
Merge branch 'master' of github.com:scikit-learn/scikit-learn into tree-speedup
Tree refactoring (1)
Tree refactoring (2)
Tree refactoring (3)
Tree refactoring (4)
Tree refactoring (5)
Tree refactoring (6)
Tree refactoring (7)
Tree refactoring (8)
Tree refactoring (9)
Tree refactoring (10)
Merge branch 'master' of github.com:scikit-learn/scikit-learn into tree-speedup
Merge pull request #948 from mrjbq7/trees
Merge branch 'master' of github.com:scikit-learn/scikit-learn into tree-speedup
Merge pull request #950 from mrjbq7/trees
Merge branch 'master' of github.com:scikit-learn/scikit-learn into tree-speedup
ENH: Tree properties
Tree refactoring (11)
ENH: make Tree picklable
Tree refactoring (12)
Tree refactoring (13)
FIX: avoid useless data conversion
FIX: avoid useless data conversion (2)
Tree refactoring (14)
Tree refactoring (15)
Tree refactoring (16)
FIX: @mrjbq7 comments
Tree refactoring (17)
Tree refactoring (18)
FIX: sample_mask
Merge branch 'tree-speedup' of github.com:glouppe/scikit-learn into tree-speedup
FIX: init/del => cinit/dealloc
Added _tree.pxd
FIX: gradient boosting (1)
COSMIT
Tree refactoring (19)
FIX: PyArray_ZEROS -> np.zeros?
FIX: gradient boosting (2)
Tree refactoring (20)
What's new
PEP8
Merge pull request #956 from Carreau/patch-1
COSMIT
Turn off warnings
FIX: test_feature_importances
FIX: test_feature_importances?
TEST: disable test_feature_importances for now
Merge pull request #946 from glouppe/tree-speedup
FIX: dtype conversion of y
EXAMPLE: plot importances with bars
FIX: forest / check_random_state in fit
FIX: tree / check_random_state in fit
FIX: bug in multi-output forest.predict_proba
Check for memory errors (1)
Check for memory errors (2)
Check for memory errors (3)
Avoid useless if-statements
Added a comment to clarify initial capacity
Merge pull request #1144 from glouppe/tree-malloc
DOC: return values of make_moons and make_circles
Merge pull request #1197 from glouppe/master
FIX: prevent early stopping in tree construction
FIX: prevent early stopping in tree construction (2) + Test
Merge pull request #1263 from glouppe/fix-1254
Merge pull request #1269 from mrjbq7/doc-fixes
ENH: Simplify the shape of (n_)classes_ for single output trees
ENH: Simplify the shape of (n_)classes in forest
PEP8
TEST: regression test for shape of (n_)classes
TEST: enforce flat classes_
What's new: API changes
ENH: better names for variables
What's new: added :class: keyword
FIX: convert predictions into a numpy array
FIX: docstring tests
Merge pull request #1445 from glouppe/tree-shape
What's new: typo
Merge pull request #1388 from arjoly/issue1047_gradient_boosting_uses_decision_trees
Merge pull request #1458 from seberg/contig_strides
What's new: fix by @seberg
Checkout files from ndawe:treeweights
FIX: roll back some changes
FIX: what's new
flake8
ENH: early binding + allocate features at tree creation (by @pprett)
FIX: oob test
DOC: sample_weight=None
DOC: what's new
DOC: typo
DOC: cosmit
FIX: use sklearn.utils.fixes.bincount
ENH: use random_state.shuffle
ENH: import aliases
ENH: import aliases (2)
ENH: import aliases (3)
PEP8 (some)
TEST: sample_weight
TEST: sample_weight (once more)
FIX: iris.target
FIX: raise an exception if negative number of samples
TEST: use rng
FIX: do not overwrite min_samples_split
FIX: set min_samples_split=2 by default
DOC: updated docstring
Typo
ENH: weighted r2 score for regression
COSMITs
ENH: Added balance_weights
ENH: added some tests
FIX: test_oob_score_regression
FIX: compute weighted oob scores
FIX: NaN problem + Added some tests
TEST: added some more tests
EXAMPLE: simplify n_estimators and n_samples
TEST: importances
TEST: multi-output problems
ENH: WeightedClassifier/Regressor mixins
DOC
FIX: drop support for multi-output
TEST: errors
ENH: staged_score
EXAMPLE: reduce the number of samples
EXAMPLE: merge plot_adaboost_iris into plot_forest_iris
EXAMPLE: drop plot_adaboost_quantiles
FIX: move balance_weights into preprocessing
PEP8 + PyFlakes
FIX: broken test
FIX: one more bug
FIX: remove prints
DOC: edited some docstrings
DOC: added references into classes.rst
ENH: rename boost method to _boost
DOC: cosmits + narrative documentation (begin)
DOC: proper citations
DOC
TEST: make test_importances more stable
DOC: narrative documentation
DOC: What's new
TEST: base_estimator
DOC: classes_ and n_classes_
DOC: put docstrings into subclasses to make them appear in the documentation
DOC + Better default parameter values
DOC: cosmits
DOC: typo
PEP8 and DOC
ENH: use shuffle
Roll back some changes
Roll back some changes (2)
FIX: what's new
Merge branch 'master' of github.com:scikit-learn/scikit-learn into adaboost
FIX: broken test
FIX: @amueller comments
Cosmits, code structure and tests
EXAMPLE: better plot_adaboost_regression
Revert changes on plot_adaboost_error.py
ENH: set default parameter values
Cleanup
EXAMPLE: give plot_adaboost_classification some love
DOC: narrative documentation
Merge branch 'master' of github.com:scikit-learn/scikit-learn into adaboost
Merge branch 'master' of github.com:scikit-learn/scikit-learn into adaboost
FIX: some nitpicks
ENH: remove boost_method parameter and use a string as switch
ENH: weights_ -> estimator_weights_
FIX: pprett comments
DOC: Added a References section in _samme_proba
COSMIT: flake8
ENH: weight -> estimator_weight
ENH: weight -> estimator_weight (2)
ENH: weight -> estimator_weight (3)
EXAMPLE: better x-axis label
EXAMPLE (2)
FIX: make_hastie_10_2 reference docstring
DOC: add a short dataset description in hastie example
DOC: narrative documentation
FIX: doctest
EXAMPLE: add AdaBoost to plot_classifier_comparison
FIX: some of Gael comments
What's new: Adaboost
Remove compute_importances parameter
What's new
ENH: Remove compute_importances in AdaBoost
ENH: Update feature_importances in GBRT
ENH: remove "mse" method and simplify
COSMIT
DOC: feature importances
Merge pull request #1657 from glouppe/feature-importances
DOC: add balance_weights to reference
EXAMPLE: compute_importances=True is no longer required (1)
EXAMPLE: compute_importances=True is no longer required (2)
DOC: narrative documentation on feature importances
ENH: precompute X_argsorted when possible
DOC: X_argsorted
Flake8
ENH: use isinstance instead
Merge pull request #1668 from glouppe/adaboost-tree
Merge pull request #1700 from erg/rf
FIX: use DOUBLE_t type
Merge pull request #1705 from glouppe/tree-fix
ENH: support float value for max_features
DOC: if float, then max_features is a percentage
ENH: Defer parameter checking of trees
DOC: GBRT max_features
TEST: added test
ENH: use numbers
FIX: numpy integers
PEP8
Merge pull request #1712 from glouppe/tree-maxfeatures
What's new: float values support for max_features
What's new: fix indentation
Merge pull request #1816 from ndawe/master
Merge pull request #1823 from erg/issue-1466
Merge pull request #1852 from slattarini/typofixes
ENH: moved export_graphviz to sklearn/tree/export.py
ENH: add max_depth to export_graphviz
ENH: output criterion name instead of "error" in export_graphviz
Merge pull request #1998 from kgeis/fix-setup-instruction
Merge pull request #2031 from jnothman/tree_comments
WIP: new Cython interface for decision trees
WIP: comments on the Cython interface
WIP: Criterion interface and base class
WIP: ClassificationCriterion (reset, update)
WIP: Gini criterion
WIP: entropy criterion
WIP: remove n_left and n_right attributes
WIP: MSE criterion
WIP: tree class
WIP: tree algorithm
WIP: add_node
WIP: node_value
WIP: node_value
WIP: predict + apply
WIP: Random Splitter
WIP: splitter
WIP: Best Splitter
WIP: sort features
WIP: first pass on tree.py
WIP: some debug
WIP: some more debug
WIP: debug in progress...
WIP: debug (tests still don't pass...)
WIP: one more bug fixed
WIP: cleanup
WIP: one more test fixed
WIP: more bugs fixed :)
WIP: 19 tests passed
WIP: test_tree.py now passes \o/
Cleanup
WIP: feature importances
WIP: discard samples with weight = 0
WIP: fix export functions
Cleanup
WIP: first pass on ensembles
WIP: use heapsort
WIP: small optimization to heapsort
WIP: remove asserts
WIP: use C-based random number generator
WIP: set n_classes as ndarray
FIX: fix test_random_hasher
WIP: fix adaboost
WIP: small optim to regression criterion
WIP: optimize tree construction procedure
WIP: optimization of the tree construction procedure
cleanup
recompile _tree.pyx
FIX: export_graphviz test
FIX: set random_state in adaboost
FIX: doctests
FIX: doctests in partial_dependence
FIX: feature_selection doctest
FIX: feature_selection doctest (bis)
WIP: allow Splitter objects to be passed in constructors
FIX
Some PEP8 / Flake8
Small optimization to RandomSplitter
FIX: fix RandomSplitter
Cosmit
FIX: free old structures
WIP: Added BreimanSplitter
WIP: small optimizations
WIP: fix BreimanSplitter
Cleanup
WIP: optimize swaps
Regenerate _tree.c
WIP: some optimizations to criteria
WIP: add -O3 to setup.py
WIP: normalize option for compute_feature_importances
WIP: Added deprecations in tree.py
WIP: updated documentation in tree.py
WIP: added deprecations in forest.py
WIP: updated documentation
WIP: unroll loops
WIP: setup.py
WIP: make sort a function, not a method
WIP: Cleaner Splitter interface
WIP: even cleaner splitter interface
WIP: some optimization in criteria
WIP: remove some left-out comments
WIP: declare weighted_n_node_samples
WIP: better swaps
WIP: remove BreimanSplitter
WIP: small optimization to predict
WIP: catch ValueError only
WIP: added some documentation details in _tree.pxd
WIP: PEP8 a few things
Benchmark: use default values in forests
WIP: remove irrelevant and unstable doctests
WIP: address @ogrisel comments
WIP: address @ogrisel comments (2)
WIP: remove partition_features
WIP: style in _tree.pyx
WIP: make resize a private method, improve docstring
WIP: use re-entrant rand_r
FIX: doctest in partial_dependence
WIP: break or shorten some long lines
FIX: doctest in feature_selection
WIP: break one-liner if statements
WIP: revert use of rand_r
FIX: broken tests based on rng
DOC: update header in rand_r.c
TEST: skip test in feature_selection (too unstable)
FIX: one more doctest
WIP: Faster predictions if n_outputs==1
WIP: Break comments on new line
WIP: make criteria nogil ready
WIP: enforce contiguous arrays to optimize construction
WIP: avoid data conversion in AdaBoost
WIP: use np.ascontiguousarray instead of array2d
TEST: add test_memory_layout
FIX: broken test
WIP: Make trees and forests support string labels
WIP: refactor some code in forest.fit
TEST: skip doctest in feature_selection (unstable)
WIP: better check inputs
WIP: check inputs for gbrt
Merge pull request #2131 from glouppe/trees-v2
What's new: new implementation for trees
FIX: remove debug message
FIX: remove -funroll-all-loops
FIX: ur strings are not supported in Python 3.3
DOC: some documentation for the Tree Cython structure
Merge pull request #2216 from glouppe/tree-doc
Benchmark: use specified dtype
TEST: cosmit on err_msg
Raise an exception if rows are full of missing values
FIX: doctest
Better error message
FIX: use range instead of xrange
FIX: imputation example
Merge pull request #2241 from arjoly/grid-cv-multioutput
Merge pull request #2262 from NicolasTr/fix_statistics
FIX: remove blank lines
Use epsilon=1e-7
FIX: partial dependence test
TEST: skip test_oob_multilcass_iris for now
Merge pull request #2277 from glouppe/tree-fix-32bits
COSMIT: typo in examples/imputation.py
Mr. Proper, act 1
Banner improvements
Banner style
Boxes on front page
Load bootstrap first
FIX: footer character encoding
CSS tweaks
CSS tweaks (2)
Lower part of the index
CSS tweaks
More css tweaks
Better alignment in the sidebar
CSS tweaks
More css kungfu
CSS stuff
Remove testimonials for now
CSS tweaks
Donate button + citing
Enhance contrasts
Contributin
Remove toc on the API page (it is already in the sidebar)
FIX: sidebar.js
Move Google javascript near </body>
What's new: put major 0.14 additions in first
FIX: remove duplicate entry in What's new
Polishing on "Who's using scikit-learn"
Website: bottom buttons
Merge pull request #2361 from rolisz/patch-1
FIX: release reference to X in _tree.splitter
Merge pull request #2363 from glouppe/tree-memory
Merge pull request #2385 from ndawe/tree
DOC: docstring for predict_log_proba was wrong
Typo
Merge pull request #2394 from ndawe/export_graphviz
FIX: free self.estimators_
Merge pull request #2430 from glouppe/forest-2414
Merge pull request #2432 from jwkvam/docs_cluster
ENH bagging meta-estimator
ENH: move _partition_estimators to ensemble.base
TST: test_base in sklearn.ensemble
BUG: do not force base estimators to inherit from sklearn in AdaBoost
TST: fix random_state in test_weight_boosting
DOC: base_estimator_ + estimators_ in ensemble.base
COSMIT: tidy up forest code (inspired by bagging)
What's new: bagging
FIX: logaddexp(-inf, -inf) == -inf and not NaN
Merge branch 'master' of github.com:scikit-learn/scikit-learn into logaddexp-fix
DOC: add docstring to _logaddexp
DOC: bootstrap=True by default
ENH: added PresortBestSplitter
ENH: remove assert
COSMIT: shorten some long lines
DOC: added entry to What's New
Merge pull request #2469 from glouppe/tree-presort
FIX: don't bias random feature selection
FIX: random number generator
FIX: module RAND_R_MAX
TEST: improve test_distribution
TEST: improved test_distribution
FIX: use XorShift random number generator
DOC: remove rand_r license
TEST: add tests for splitter="presort-best"
Merge pull request #2474 from glouppe/tree-tests
FIX: unsigned int -> UINT32_t
ENH: small optimization in PresortBestSplitter
Merge pull request #2500 from ndawe/r2
ENH: remove offset calculations
ENH: use memset and memcpy when possible
ENH: make sort an inline nogil function
ENH: make node_reset/split/value gil-free
ENH: use npy_int32 type
Merge pull request #2546 from StevenMaude/patch-1
Merge pull request #2550 from StevenMaude/minor-typo-fixes
Merge pull request #2566 from Manoj-Kumar-S/constant-output
What's new: DummyClassifier now supports constant output
COSMIT + FIX in _utils.pyx
Merge commit 'fbab2cd' into gbm
COSMITS + FIX to impurity_improvement
FIX: impurity_improvement
TEST: disable test_32bit_equality
Update max_depth in DepthFirstSplitter + Release splitter in BestFirstSplitter
Add max_leaf_nodes to forest + DOC
Cleanup criteria
Nitpicks
Normalize variance
Merge pull request #2684 from blagarde/master
Merge pull request #2685 from bicycle1885/fix-oob-score
ENH Release gil in _tree.pyx:predict
ENH Use backend="threading" for forest regression as well
FIX: Ensure expected behaviour when copy=True|False in Imputer
FIX: Don't copy X at imputation in the sparse case
COSMIT: remove useless import
Update docstring + Use np.diff
Merge pull request #2710 from eshilts/DOC-load_boston
FIX: don't force finite in imputation
FIX: use _assert_all_finite
Merge pull request #2712 from glouppe/imputer-copy
Pre-initialize all trees before dispatching
Merge pull request #2724 from glouppe/forest-2721
Regenerate C files
Merge pull request #2745 from arjoly/mse-criterion
COSMIT: PEP8 + coding style
COSMIT: PEP8 in test_gradient_boosting.py
COSMIT: Fix some PEP8 in gradient_boosting
COSMIT: PEP8 in test_imputation
COSMIT: Various PEP8
Merge pull request #2790 from jnothman/tree_without_cycles
ENH Make PresortBestSplitter cache friendly + cosmetics
Merge pull request #2800 from eltermann/doc-fix-tree
FIX: Weight impurity decreases with weighted_n_samples
TEST: add non-regression test
ENH: add weighted_n_node_samples property
Cleanup
FIX: Recompile _gradient_boosting.c
TEST: check with scaled sample_weight
Merge pull request #2836 from glouppe/tree-importances
DOC: Entry for #2835 + Put together entries wrt. trees and ensemble
What's new: group entries by topic
ENH: small optimization of the classification criterion
ENH: small optimization to the regression criterion
DOC: update examples, use matplotlib.pyplot instead of pylab
ENH: add explanatory comments
Merge pull request #2964 from abhishekkrthakur/patch-1
TEST: fix test_forest:test_boston (#2965)
DOC: the search for a split does not stop until one valid partition is found
Merge pull request #2966 from maheshakya/feature_selector
DOC: wordings
FIX: Set correct impurity values in BestFirstBuilder
PEP8
Merge pull request #3053 from ajschumacher/patch-1
FIX: weight impurity improvement by the size of the node
DOC: better doc for impurity_improvement
ENH: declare weighted_n_samples
ENH: rename Node attributes to avoid confusion
ENH: rename normalizer -> weighted_n_samples
ENH: renamed things too fast...
ENH: better API for passing weighted_n_samples
ENH: remove unused variables
Merge pull request #3056 from glouppe/tree-bestfirst
Merge pull request #3078 from ndawe/examples
Merge pull request #3076 from msalahi/sparse-bagging-tests
Merge pull request #3156 from IvicaJovic/fixes
Merge pull request #3334 from ami-GS/external
MAINT: Effectively remove deprecated parameters
Merge pull request #3715 from trevorstephens/correct-gbc-docstring
Merge pull request #3731 from ndawe/master
Merge pull request #4026 from akshayah3/master
FIX: ensure that feature importances are properly scaled
Merge pull request #4139 from MSusik/negative_njobs
Merge pull request #4190 from trevorstephens/refactor_cw
FIX: add except* in Splitter.init definitions
Merge pull request #4282 from glouppe/tree-fix-4281
Merge pull request #4489 from betatim/rf-oob-docs
ENH: public apply method for decision trees
ENH: raise NotFittedError instead of ValueError
Update what's new
DOC: better documentation tree.apply
Merge pull request #4488 from glouppe/tree-apply
Merge pull request #4570 from thedrow/topic/call-enumerate-on-empty-clusters-only
Merge pull request #4530 from amueller/gbrt_estimators_docstring
Merge pull request #4642 from arjoly/error-bagging-sw
COSMIT: VotingClassifier
Merge pull request #4692 from kianho/plot-ensemble-oob-trace
Merge pull request #3735 from trevorstephens/pretty-decision-trees
Merge pull request #4698 from trevorstephens/doc-export_graphviz
Merge pull request #4958 from rasbt/treeregressor
Merge pull request #4979 from donnemartin/master
Merge pull request #4954 from ankurankan/fix/4744
FIX: ensure that get_params returns VotingClassifier own params
Merge pull request #5010 from jmschrei/_tree_doc
DOC: fixes to _tree.pyx
DOC: add cross_validation.LabelShuffleSplit to classes.rst
Merge pull request #5162 from ogrisel/faq-freeze
ENH: rename to LabelKFold
COSMIT: variable names, documentation, etc
ENH: remove unnecessary assignments
Merge pull request #5220 from arjoly/fast-reverse-update
Merge pull request #5190 from glouppe/labelkfold
Merge pull request #5223 from mattilyra/allow_sparse_in_baggingclassifier
Merge pull request #5207 from MechCoder/ensemble_sample_weight_bug
Merge pull request #5242 from arjoly/stable-test-2
TEST: stronger tests for variable importances
TEST: use sklearn.fixes.bincount
TEST: take comments into account
TEST: reduce test time, variable name, etc
TEST: check parallel computation
Merge pull request #5261 from glouppe/check-importances
Merge pull request #5278 from jmschrei/_criterion_cleanup
DOC: better docstring for sum_total
Merge pull request #5305 from jmschrei/criterion_patch
FIX: Ensure correct LabelKFold folds when shuffle=True
Merge pull request #5310 from JeanKossaifi/add_repo
Merge pull request #5300 from glouppe/fix-5292
Merge pull request #5347 from ariddell/patch-3
FIX: remove shuffling in LabelKFold
Giorgio Patrini (5):
remove numpy's RuntimeWarning from corner case of PCA.fit
partial_fit for scalers
BUG: reset internal state of scaler before fitting
MAINT: deprecation warns from StandardScaler std_
DOC: only 'sag' supports sparse input in Ridge now
Graham Clenaghan (2):
Implement dropping of suboptimal thresholds in roc_curve
Ensure that n_nonzero_coefs is an int
Gryllos Prokopis (3):
Add more extensions and related packages to related_projects.rst
Add some newline characters to respect the 80 characters rule
Fixed minor typo
Guilherme Trein (1):
Fix exception message when cloning an estimator that does not implement 'get_params' method.
Hampus Bengtsson (1):
5 of 6 are zeros, not ones.
Hamzeh Alsalhi (82):
Included sparse testing functions test_sparse_classification() and test_sparse_regression() in test_weighted_boosting.py, based on the corresponding testing functions found in test_bagging.py
Removed densification calls on X '.asarry()' from all predict methods, removed 'dense' parameter from check arrays call, removed contiguous cast of the input array X, detected X as list or array to find the number of examples accordingly using len() or shape[0]
In functions def test_sparse_classification() and test_sparse_regression() added tests for coo, lil, and dok sparse matrices. Removed parameter set feature since there are no parameters that alter sparse/dense data usage. Revised data set to use dataset.make_{regression/classification} for improved test time.
Revised recurring import statement to respect conventions
Updated documentation to reflect sparse support on input data
Removed print statements used for debugging
Revised test_sparse_{regression,classification} to use make_multilabel_classification, revised import statements to adhere to one name per import conventions
Enforced dtype for X, and clarified documentation by indicating the sparse formats supported and how they are treated
Re inserted the check for X.ndim == 2 in initialization, updated a doc detail for X: coo is converted to csr only in the initialization not in the predict functions
Inserted base classifier check for BaseDecisionTree and BaseForest before enforcing DTYPE in weighted boosting fit
Resolved python3 forest import error in weighted_boosting.py
Updated documentation for weighted_boosting fit() to indicate behavior of forced dtype
Included testing of all of the predict/score functions of AdaBoost
Assert sample weight negative sum value error, add authors entry
Fix score function parameter mistake in test_weighted_boosting
Remove executable permission bit
Fixed typo toCSR -> tocsr
Trivial doc alteration, travis timeout
Fixed parameters to make_multilabel_classification and make_regression to keep tests under 1 second, reformatted some blank lines
ENH sparse matrix support in label binarization
FIX Forced dtype to int when densifying array in label_binarize
Modified sparse OvR to handle sparse target data
FIX Support unseen labels LabelBinarizer and test
Test sparse target data with dummy classifier
Fit dummy classifier with sparse target data
Op directly on y.data in fit, xrange -> range, y.tocsc
Share array initialization sparse and dense fit
Reorder fit and share denes sprs, Scafold sprs pred
Remove temporary code from test_sparse_target_data
remove zero from random uniform return data
Emulate sample weight support from dense (Rebase)
Add four tests for sparse target dummy, one for each strategy
Correct the class priors computation in the sparse target case
Remove redundant indices, data, indptr appends in predict
Test sparse stratified corner case with 0s
Combine dense column check in fit
Update y with y.tocsc (Not an inplace op)
Select nonzero elements in predict stratified case
Implement uniform and stratified w/ sparse random choices, optimize concats
np.random.choice -> choice (from utils/random)
Use array for sparse matrix construction, fix constant check
Replace np.random.choice with utils.random.sample_without_replacement
Raise warning uniform fit sparse target, Raise error uniform predict sparse target
Support matrix output random_choice_csc, Update usage in dummy predict
Validate lengths equal each element of classes and class_probabilities
Test random_choice_csc
Test ValueError for length mismatch between array in classes, and probs
Abstract class prior construction to function sparse_class_distribution
Test sparse_class_distribution, Correct data indexing
Move array initializations into dense conditional, nz_indices -> col_nonzero
Revert formatting of warning, Reword warning for sparse uniform case
Correct spelling of 'Predicting', document (sparse_target_input_ -> sparse_output)
Reverse conditionals in fit and predict, positive case first
Make cosmetic revisions to random_choice_csc
import division from future to give real results with two ints
Make cosmetic changes to sparse_class_distribution
Make cosmetic changes to sparse_class_distribution
Make naming changes in sparse_class_distribution
Make default parameters for test_random_choice_csc, and use almost equal
move random_choice_csc to utils/random from utils/sparsefuncs
Move test_sparse_class_distribution to utils/multiclass from utils/sparsefuncs
Update imports to correct function locations
Comment .eliminate_zeros() in sparse class distribution
Include dense case support in class_distribution, pep8 revisions
Make sparse target dummy tests multioutput-multiclass
Test class_distribution w/ multioutput-multiclass sparse and dense
Replace numpy.random.choice with a search sorted strategy for faster runtime
Test random_choice_csc implicit, readability adjustments
Remove transposes from testing of random_choice_csc
Clarify uniform sparse warning message, Reword sparse_output_ doc
Clarify uniform sparse warning message, Reword sparse_output_ doc
Use UserWarning in place of SparseEfficiencyWarning
Remove outer for loop in dense case predict
Make cosmetic changes
Test that class dtypes string and float fail in random_choice_csc
Test explicit sample weights in the sparse case of class_distribution
Test insertion of 0 class in random_choice_csc
Change dtype check conditional to look for everything other than int
Make cosmetic adjustments, Fix random.py header, use message with assert_warns
Test additional corner cases with random_choice_csc, error on probabilities not summing to 1
Combine sparse and dense test for class_distribution
Manage explicit zeros manually in class_distribution, test with explicit zeros
Hanna Wallach (2):
FIX issue #4268 (bug in BernoulliNB).
TST check that Bernoulli NB is consistent with Manning et al. IR book example
Hannes Schulz (2):
MISC privatize/deprecate internal function of gaussian process
typo
Harikrishnan S (1):
DOC/FIX twenty_newsgroups.rst should use TfidfVectorizer
Harry Mavroforakis (1):
DOC: specifies the default number of eigenvectors
Hasil Sharma (1):
FIX Make Spectral Embedding deterministic
Helder (1):
Fix a typo on svm.rst
Hendrik Heuer (1):
PY3 fixed examples
Henry Lin (5):
fixed newline errors in documentation for SVC
fixed newline errors in documentation for NuSVC
fixed newline errors in documentation for LinearSVC and LinearSVR
check_X_y: Added mention that function returns y array
#5309: Added verbosity argument to sparse_encode, verbosity for LassoLars and Lasso
Herve Bredin (8):
ENH: improve GMM convergence check #4178
ENH: issue #4178 (cont.)
FIX: fix doctest
doc: mention GMM stopping criterion change
a few changes according to comments by @ogrisel
ENH: better convergence check
DOC: what's new about GMMs
DOC: make it clear 'thresh' should be removed in v0.18
Hrishikesh Huilgolkar (7):
chi2 and additive_chi2 raise error if input are sparse matrices
Added same for additive_chi2_kernel
Fixed pep8 issues
pairwise_distance_functions renamed to PAIRWISE_DISTANCE_FUNCTIONS
Made more changes renamed pairwise_kernel_functions, kernel_params to allcaps
Added test for fit_transform(X)==fit(X).transform(X)
Fixed pep8 issues
Hsiang-Fu Yu (1):
MAINT some cleanup in Liblinear
Hsuan-Tien Lin (8):
a simple fix to reading empty label sets in the multilabel case
cython compiled from pyx
directly set target to empty list instead of empty string
given that line_parts has been copied to features, parsing should be based on features rather than line_parts
cython the pyx
use COLON to avoid byte buffer error
add the empty label test case
add another test case with empty labelset
Hugues SALAMIN (1):
Fix for overflow warning
Ian Gilmore (1):
ENH added digits as optional arg for classification_report
Ian Ozsvald (3):
clearer decision surface plots and classifier final predictions for the ensembles
improved formatting
updated docs to fix formatting errors
Ignacio Rossi (4):
Move model persistence doc inside model selection section
Remove trailing whitespace
Simplified security and maintenance section
Metadata information for unpickling models in future versions
Ilambharathi Kanniah (6):
Sparse input support in BaggingClassfier and BaggingRegressor #3241
Sparse support in Bagging #3241
DOC Document return of fetch_olivetti_faces
DOC Improve documentation of the olivetti dataset
DOC adding return type and improving module doc
DOC return types in datasets.lfw
Immanuel Bayer (74):
Test added for multiple-outcome:
bugfix: lstsq coefficients output needed to be transposed
fixed spelling error
docstring updated and list append replaced with
consistency
spelling
pep8 errors fixed
pep8 errors fixed
parallelized
parameter n_jobs added
BugFix, matrix was not flagged as sparse.
cleaned some examples
combat for sp_linalg.lsqr
test for positive constrained lasso added
positive constrained option for lasso added
lasso docstring update
remove outcommented lines
wording
example for lasso with positive constraint
renaming
reset wrongly committed file
use scikit function to make train test split
set w[ii] = 0 if tmp > 0
- changed parameter from positive_constraint to positive
indent
add examples for positive constraint lasso and enet
merged into plot_lasso_coordinate_descent_path
fix doctest
fixed doctest
Merge pull request #1 from agramfort/posCoeff
add dense attribute and dummy for sparse fit
Merge branch 'master' of https://github.com/scikit-learn/scikit-learn into merge_cd
add dense attribute and dummy for sparse fit
Merge branch 'merge_cd' of https://github.com/ibayer/scikit-learn into merge_cd
support of sparse input data added
tests of sparse coordinate_descent applied to the modified dense
-remove sparse option
remove sparse_coef_
Test is redundant since _set_coef function has been removed.
add property for sparse_coef_
add test for sparse_coef_ property
docstrings updated
merge cd_fast and cd_fast_sparse
remove redundant tests
remove redundant files, functionality has been moved to cd_fast.pyx
code removed and deprecated message added
fix docstring example
add test to check normalize option in sparse enet
Revert "remove redundant files, functionality has been moved to cd_fast.pyx"
Revert "remove redundant tests"
add sparse_std that has been wrongly removed in commit 48ba97f1 from the
update sparse_std call
some tests didn't use the numpy sparse matrix as input data and
make sure X is of dtype float64 in _sparse_fit
change input to inplace_csc_column_scale
modify test_normalize_option
test data changed for test_normalize_option
remove redundant folders in linear_model/sparse
remove unused imports
fix pep8
move sparse_center_data to linear_model.base
avoid copy if X has proper type, modify docstring
fix warning: add underscore to: grid_search.best_estimator_ and
add dual_gap_ and eps_ to Enet and Lasso docstring
extend eps_ description
renaming 'learn_rate' in 'learning_rate'
ENH homepage add links to headers in left panel
ENH add link to Citing
ENH renaming 'max_iters' to 'max_iter' for consistency
DOC missing class mention
ENH renaming 'n_atoms' to 'n_components' for consistency
ENH fix pep8
DOC use standard X, y notation
remove order checks for sparse matrices
Imran Haque (7):
ENH Release GIL when entering LibSVM/Liblinear code
Release GIL around sparse liblinear training
ENH Add partial_fit to GaussianNB
Update Sphinx docs for GaussianNB partial_fit
Remove duplicate GaussianNB.fit() code
Preserve public API
docstring
Ishank Gulati (1):
Breast cancer dataset added
IvicaJovic (1):
fix wrong confidence interval
Jack Hale (1):
WMinkowskiDistance corrections to error messages and docstring
Jack Martin (2):
Added import statement for Lasso example
added import np statements
Jacob Schreiber (16):
Epsilon added to x and y weights in nipals inner loop for numeric stability on windows
Changed addition of epsilon for cases where y_weights is near 0
Generated C file
Merge branch 'master' of https://github.com/scikit-learn/scikit-learn
Documentation added to criterion, dense splitters.
Verbosity reduced, suggestions merged
Responses changed to targets
Responses changed to targets
PEP257 adhered to closer
Final round of revisions
ENH split _tree.pyx into several files
ENH apply method added to Gradient Boosting
ENH criterion file cleaned up
FIX criterion variable names
ENH gbt sparse support
FIX mailmap
Jacques Kvam (7):
add verbose output for gradient boosting algorithms
Changed verbose to int, added a low verbose option to just print '.'.
remove '\r' and format numbers to be fixed width, 7 digits of precision
fix GradientBoostingClassifier by passing verbose as a keyword argument
DOC: Update pyamg site and fix a typo.
Remove superfluous AdaBoostRegressor call.
PY3: fix map usage in ensemble.partial_dependence
Jaidev Deshpande (1):
ENH Add option in cosine_similarity for sparse output
Jake VanderPlas (326):
fixed bug in BallTree cython wrapper
fixed small bug in cython wrapper for BallTree
updated ball_tree documentation
Merge commit 'upstream/master'
added MLLE, made some small fixes to manifold module
wrapped brute force neighbor search
added cython wrapper to BallTree.query_ball
query_ball -> query_radius, removed knn_brute
speed up BallTree.h
slight speedups to BallTree.h and ball_tree.pyx
added unit test for BallTree.query_radius
fixed reference-passing bug in BallTree.h
vastly improved MLLE speed
added HLLE code
sped up HLLE code
added ability to return distances and specify multiple search radii for BallTree.query_radius()
fixed r shape bug
Merge branch 'manifold' of git://github.com/fabianp/scikit-learn into manifold
pep8 changes
cosmetic changes
added arpack support in scipy_future; wrapped MLLE and HLLE into locally_linear function
removed old files; moved example to examples directory
pep8 changes
added LTSA method
pep8
fixed bug in modified LLE: now works for higher dimensions
added method argument to digits example
Merge pull request #1 from ogrisel/jakevdp-manifold
minor changes
NeighborsClassifier: changed window_size to leaf_size & updated documentation as discussed in Issue #195
fixed doc formatting
merged with sparse classifier commit
merged changes in master
pep8
merge with previous commits
H_tol/M_tol -> hessian_tol/modified_tol
Initial commit
fixed bug in calculating tau
added cythonized Floyd-Warshall algorithm
speed tweaks in Floyd-Warshall, and renamed graph_search->shortest_path
speedup in Floyd-Warshall: unsigned ints to prevent negativity checks
Added Dijkstra's algorithm with Fibonacci Heaps for significant speed gains in path searches
bug fix: free allocated memory
changed shortest_path() to accept a sparse distance matrix for more flexibility
cleanups & pep8
add tests, doc update
combined manifold examples
manifold doc update
Revert "combined manifold examples"
fixed bug in shortest path; consolidated isomap examples
ex. change
Merge branch 'manifold-test' into manifold-doc
cleaned up and documented Fibonacci code
added tests; cleanup; pep8
remove unused imports
first stab at implementation via KernelPCA
add arpack support to KernelPCA
small efficiency boost to KernelCenterer
np.random -> RandomState
K_pred_cols -> K_pred_cols_
Merge branch 'master' into manifold-isomap
manifold/shortest_path -> utils/graph_shortest_path
Implement Isomap + transform in terms of KernelPCA
add description to isomap transform
added Isomap.reconstruction_error()
store BallTree in Isomap for faster transform()
fix conflicts with master
Merge branch 'manifold-isomap' into manifold-doc
update manifold documentation
Merge commit 'upstream/master' into manifold-doc
changes to manifold doc
speed improvements on LLE variants for high dimensional data
manifold example updates
typo in HLLE
examples: make out_dim explicit
remove lobpcg from LocallyLinearEmbedding
merge with master; remove lobpcg references
initial commit
added compiled cython
assure C-ordered on init
fix NeighborsClassifier doctest
make memory allocation more efficient
documentation clarifications
ball_tree protocol 2, but paths are broken
Merge branch 'cython-ball-tree'
move ball_tree.pyx to scikits/learn/ and write pickle test
Merge commit 'upstream-RW/master' into cython-ball-tree
add BallTree pickle test cases
Merge branch 'cython-ball-tree'
refactor neighbors module
doc fixes
merge with upstream/master
Merge commit 'upstream/master' into neighbors-refactor
scikits.learn -> sklearn
add neighbors benchmark
change implementation to mixin pattern
move neighbors.py -> neighbors
fix doctests
merge upstream/master
move barycenter_weights to manifold
deprecation of NeighborsClassifier and NeighborsRegressor
Merge commit 'upstream/master' into neighbors-refactor
add deprecation warning to sklearn.ball_tree
Note neighbors module changes in doc/whats_new.rst
fix typos
gitignore: scikits.learn -> scikit_learn
Merge commit 'upstream-RW/master'
move neighbors examples to examples/neighbors/
Nearest Neighbors examples & documentation
switch to dynamically generated docstrings
commit dynamic doc changes
add weighting to classification and regression
add neighbors/tools to commit
add tests for weighted regression and classification
documentation of weighted classification and regression
add graphical neighbors benchmark
pep8 + move weighted_mode to utils
add tests & example for weighted_mode
benchmark -> bar plot
make constants uppercase
return to simple docstrings
increase BallTree test coverage
fix BallTree linkage
fix typos
Merge pull request #3 from ogrisel/jakevdp-neighbors-refactor
increase test coverage
pep8 + cosmetic changes
add warning flag to balltree + tests
warning_flag doc
add warning messages to KNeighbors
fixes for tests
attempt to address warnings catcher
hack to fix warning test
change warning message
simplify warning test; remove assert_warns from utils
bug: mode='LM' -> mode='LA'
remove unused return_log keyword in GMM
BUG/DOC: address manifold singularity issue
DOC: add utility information for developers
Move graph_shortest_path to utils/graph.py
remove duplicative utils.fixes.arpack_eigsh
Move validation utils to their own submodule
BUG: example plot compatibility with older matplotlib versions
Merge branch 'example-fix'
Merge pull request #4 from glouppe/dev-doc
randomized_range_finder -> randomized_power_iteration
Change logsum to logsumexp for comparability with scipy
BUG: fix scale_C bug in svm
TESTS: remove deprecated NeighborsClassifier calls
species datasets commit
clean up species distribution example
randomized_power_iteration -> randomized_range_finder
typo in fastica doc
Merge commit 'upstream/master' into util-docs
Merge commit 'upstream/master' into util-docs
DOC: add toc for developers resources
DOC: add warning that utils should only be used internally
use joblib for saving species data
Merge commit 'upstream/master' into dataset-fix
fix logsum test
Change deprecated behavior in feature agglomeration example
HACK: sphinx/prevent proliferation of build images in doc
simplify removal of _images dir
remove unneeded import
BallTree -> NearestNeighbors in Isomap
DOC: isomap fixes
convert LLE neighbors to NearestNeighbors object
BallTree -> NearestNeighbors in mean_shift
pep8
Merge pull request #501 from jakevdp/dataset-fix
remove unused import
remove unused imports
pep8
Merge commit 'upstream/master'
COSMIT: pep8
DOC: formatting
DOC: pep8, add quotations, and fix typos
fix for doc math issue
TYPO: generate all images
small simplification in LDA
add old version warning
add newline at file end
turn off old version warning
add random_state to LocallyLinearEmbedding
initialize indices and distances in balltree
check random state in _fit_transform
Address Issue #590 : use relative path link to about.html
Merge commit 'upstream/master'
ball_tree: more efficient array initialization
add info about valgrind to dev documents
Current version -> Latest version
Merge commit 'upstream/master' into old-version-warning
set warning margins to zero
allow for multiple nuggets in gaussian process
example + documentation of gaussian processes on noisy data
Merge commit 'upstream/master' into GPML-fixes
DOC: expand nugget explanation; combine two GPML examples
Merge pull request #6 from amueller/old-version-warning
fix link in warning
latest version -> latest stable version
BUG: fibonacci heap implementation
TEST: non-regression test for fibonacci heap bug fix
Generate c-code with cython 0.15.1
ENH: use shift-invert in spectral clustering
add detailed comment on ARPACK usage
DOC: add tutorial links
Merge branch 'cov-speedup' of git://github.com/vene/scikit-learn into vene-cov-speedup
speed up symmetric_pinv
additional speedup: all eigenvalues are real for symmetric matrix
TST: change LLE test to stable seed
DOC: fix documentation of arpack
Merge pull request #991 from jakevdp/doc-update
@jakevdp's version of pinvh
DOC: add google analytics theme option
clarify documentation for radius_neighbors
BUG update graph_laplacian to upstream SciPy version
Ball Tree, KD Tree, and tests
Fix tests for scipy <= 0.9
speed up KD tree construction by ~25%
add author & license information to pyx files
add median of 3 pivoting to quicksort
add pydist code
fix binary tree sort bug
add pydist: user-defined metric
add haversine distance
add exception passing to C functions
rename dist conversion funcs
Implement correct d-dimensional kernel norms
add metric mappings to dist_metrics
binary tree: make valid_metrics a class variable
dist_metrics: allow callable metric
add chebyshev distance to kd tree
add functionality to NearestNeighbors estimators
Roger-Stanimoto -> Rogers-Tanimoto
calculate kernel norm only once
compute kernel norm only once
TST: compare gaussian KDE against scipy version
Change dual splits to single splits in query_dual
Merge pull request #7 from jhale/new_ball_tree
add notes on implementation details to binary_tree.pxi
remove scipy cKDTree support from neighbors
add neighbors module changes to whats_new
Merge pull request #2104 from kastnerkyle/master
BUG: fix precision issues in kernel_density; remove buggy dual-tree KDE versions
add KDE Estimator class
add kwargs to PyFuncDistance
DOC: document the new neighbors functions & KDE
undo change to clustering example
fix conflicts with master
import KernelDensity from neighbors module
adjust math formatting in neighbors docs
fix NearestNeighbors to pass common tests
add KernelDensity to class list
set random seed in KDE example
skip KDE test to prevent failure due to older SciPy versions
fix typo: SkipTe -> SkipTest
fix doctest in neighbors
BUG: return proper algorithm in KDE
add species KDE example
PEP8: neighbors module
DOC: rearrange KDE examples
TST: increase test coverage in neighbors module
DOC: pep8 & formatting in neighbors docs
DOC: make doc tests pass
add 1D KDE example
DOC: small fixes to neighbors doc
DOC: move KDE discussion to separate page
add some notes and doc strings to neighbors cython code
add more documentation to ball tree and kd tree
DOC: tweak kde examples and move density docs
BUG: fix tophat sampling in KDE
Xplot -> X_plot
bt->tree; dm->dist_metric
Additional implementation notes in binary tree
BUG: use correct algorithm for callable metric
TST: set random state in callable_metric test
BUG: add new preprocessing module to setup.py
Merge pull request #2264 from jakevdp/setup_fix
neighbors numpy1.3 compat: fix typedefs, regen with cython 0.19
numpy 1.3 compat: use explicit type definitions
numpy 1.3 compat: make neighbors/dist_metrics compatible
COMPAT: make NeighborsHeap compatible with numpy 1.3
COMPAT: make NodeHeap compatible with numpy 1.3
COMPAT: make BinaryTree class compatible with numpy 1.3
COMPAT: make BallTree & KDTree compatible with numpy 1.3
COMPAT: last few BallTree/KDTree numpy 1.3 issues
BUG: type->dtype in a cross-platform way
compute offset in a cross-platform way
BUG: don't subtract offset in binary_tree
add explicit types to neighbors cython code
DOC: fix metrics documentation format
Merge pull request #2489 from samuela/patch-1
ENH: add PolynomialFeatures preprocessor
address PR2585 comments
import pylab -> import matplotlib.pyplot
TST: fix PolynomialFeatures test error
TST: fix doctests & pep8
DOC: adjust notes in PolynomialFeatures
reorder polynomial features
fix ordering & docs of PolynomialFeatures
TST: remove stray print statement
DOC: typo in kernel formulas
Merge pull request #2608 from jakevdp/master
Merge pull request #2585 from jakevdp/polynomial
DOC: PolynomialFeatures narrative doc
DOC: fix auto-links
DOC: fix Polynomial Regression docs
DOC: fix test failure in linear model
SPD: reuse variable in GaussianNB for speedup
DOC: typo in binary_tree docs
Merge pull request #2765 from AlexanderFabisch/validation_curves
DOC: fix doc inconsistency in GMM
BUG: only symmetrize matrix when it is not already symmetric
MAINT: create ensure_symmetric utility function to check matrix symmetry
rename ensure_symmetric -> test_symmetric
TST: add test of check_symmetric
DOC: fix docstring of check_symmetric
add more informative doc & error message in utils.check_symmetric
make utils.check_symmetric future-proof
Merge pull request #5231 from vighneshbirodkar/kmeans_fix
MAINT: make GaussianNB scale-invariant
TST: GaussianNB scale invariance
GaussianNB: use var() rather than std() in epsilon determination
MAINT: minor fixes to GaussianNB epsilon
DOC: add whats_new entry for PR #5349
fix merge conflict
JakeMick (1):
TST added test of fit and transform for kernels for nystroem
James Bergstra (27):
k_means_ - added optional rng parameter to work routines
Centering data for k-means before fitting
k-means - added verbose-level print after initialization
added faster distance-computation algorithm to k-means _e_step
PCA train() stores eigenvalues associated with components
adding James Bergstra as author of k_means_ file
k-means adding all_paris_l2_distance_squared function
k-means - modified k_init to use pre-computed distances for faster, clearer code
k-means - added support for a callable "init" argument instead of copying all the k_init parameters as optional arguments - invite user to use a lambda or something
k-means - fixed misleading typo in error message
k-means - added optional parameters "precompute_distances" and "x_squared_norms"
k-means - added "verbose" parameter to KMeans class
k-means - added copy_x parameter to worker routine and BaseEstimator, allowing optional in-place operation
added optional args to euclidean_distances and removed k_means_.all_pairs_l2_distances_squared
fixed typo in my previous patch to PCA
added PCA.inverse_transform and unit test
added components_coefs_ (eigenvalues) member to RandomizedPCA to match PCA
test_pca - modified to use assert_almost_equal
euclidian_distances - repair special case for when X is Y
ENH: adding iter_limit to libsvm
FIX: committing updated Cython-generated libsvm bindings
ENH: Solver iter_limit emits warning instead of raising exception
ENH: renaming iter_limit -> max_iter
FIX: missing file hidden among the Cython output
ENH: hint about data normalization when SVC stops early
FIX: adding missing c files from cython
ENH: assert -> assert_equals
James McDermott (1):
DOC rename lambda to alpha in plot_lasso_coordinate_descent_path. (Re)-Closes #903.
James Yu (2):
DOC: fixed docstring formatting
typo
Jan Dlabal (2):
Fix grammar
Fix labels of inliers vs outliers
Jan Hendrik Metzen (45):
Fixed bug in updating structure matrix in ward_tree algorithm.
Added test case that reproduces crashes in old version of ward_tree algorithm.
Performance tweaking in ward_tree.
FIX : Fixed bug in single_source_shortest_path_length in sklearn.utils.graph
FIX Anisotropic hyperparam optimization of GP bounds
FIX Using random_state.rand rather than scipy.rand consistently in Gaussian Process
TST regression test for optimum over several random-starts of GP
TEST A duplicate minimum value should not yield non-finite predictions in IsotonicRegression
FIX Adding eps to minimum value in clipping in IsotonicRegression
DOC Add inline comment with reference to scipy issue
ADD @mblondel's initial implementation of KernelRidge added
ADD Example comparing kernel ridge and support vector regression
TEST Added tests for KernelRidge based on @mblondel's code in lightning
DOC Added documentation for kernel_ridge module
ENH Example comparing KRR and SVR documented and slightly modified
FIX Fixed assigning colors to methods in KRR example
DOC Added narrative doc for kernelized ridge regreesion
DOC Resolved minor issues in documentation of kernel_ridge
MISC Optimized two numpy statements in kernel_ridge.py
FIX Add kernel_ridge to list of all submodules
FIX Input validation in fit() of KernelRidge
FIX Allow multi-output in KernelRidge
REFACTOR Backported changes in _solve_cholesky_kernel to ridge.py
MISC Using RandomState object instead of setting global seed
DOC Revised documentation of kernel ridge regression
DOC Polishing doc of kernel_ridge
PEP8 Removing PEP8 violations in plot_kernel_ridge_regression.py
MISC Let sample_weight default to None
FIX We must not use overwrite_a=True in _solve_cholesky_kernel as K might be reused
TEST Extending tests for kernel_ridge
REFACTOR Using lstsq rather than manual pinv in _solve_cholesky_kernel
MINOR Fixed typo in test name (test_kernel_ridge_singular_kernel)
TEST Use make_regression instead of make_classification when testing kernel_ridge
DOC Added KernelRidge to whats_new.rst
FIX KernelRidge checks if fitted before predict
DOC Added kernel_ridge to classes.rst
ENH Add probability calibration based on isotonic regr. and Platt's sigmoid fit + calibration-curve
ENH Brier-score loss metric for classifiers
TST Tests for the calibration module
DOC Examples for the calibration of predicted probabilities and calibration-curves
DOC Narrative doc for the calibration module
TST Adding brier_score_loss to test_common.py
ENH Adding sample-weight support for Gaussian naive Bayes
TST Tests for sample-weights in Gaussian naive Bayes
FIX _make_unique of _isotonic.pyx uses float64 consistent internally
Jan Schlüter (3):
Replaced wrong k-means++ implementation with a correct one.
Extended docstring, renamed variables from javaStyle to python_style, replaced tab-indents with space-indents, pep8
Use scikits distance functions instead of scipy's. Avoid recomputations of x_squared_norms whereever possible. Completion and unification of docstrings.
Jaques Grobler (362):
Added a note to the install documentation
Added a note to the contributers documentation
Shorted the long line
Added a small note about the use of an upstream remote in the Contributions documentation
Shortened a line in the code
Merge branch 'WIP_tut', remote-tracking branch 'gaelVaroqueux/stat_tutorial' into WIP_tut
- Further integrated tutorial.rst (Section 2 in Userguide) with links to
moved tutorial files into separete folder within main tutorial folder. added folder for section2 tutorial. fixed some links.removed savefigure from plot_cv_diabetes.py
Merge remote-tracking branch 'origin/master' into WIP_tut
Merge branch 'master' into WIP_tut
Removed savefig from tutorial plot files.
Updated tutorial folders in doc with placeholders for other tutorials. updated index.rst for the tutorial menu accordingly
added an html page for plot_digits_first_image.py
Added links to some keywords.
Links, image resize and updated ipython code in tutorial
Added a dataset image, some links and 'import sklearn' updates
Added Knn classification example image&html
changed colours of plots, added links
Fixed link typo
Merge branch 'master' into WIP_tut
Simple linear regression example added to tut
Fixed spelling error,import lines,figures and html links for shrinkage section
Added links, images and docstrings to some plot files
fixed plots to have class coloured datapoints
Fixed some figures, added links & corrected SVM Param C explanation
Fixed missing image and GUI download link
Image page fixed
added div.green to the theme for Exersizes in scikit-tutorial
fixed link/updated some code
renamed file-names, finished model-selection, changed cv plot to use C
Section 4 done - images/links/htmls for images
All scikit tutorial images and links redone
Fixes for doctests
modified makefile for doctesting - not permanent
Merge remote-tracking branch 'origin/master' into WIP_tut
remove redundant file
removed redundant file
Better doctest time(wip),removed duplicate examples, update plot_ols.py
Merge remote-tracking branch 'origin/master' into WIP_tut
3 files moved into main example pool - links to them updated
Merged some examples into examples folder.
Merged a few examples into the example pool
delete redundant file, merged some examples and updated links
examples merged to example pool
deleted unused file, tutorial examples folder removed
replaced silence paramenter in makefile, links removed in stat_learn tutorial, big_toc_css copy deleted, heading changed in tutorial index, tutorial index info added
added ELLIPSIS to 4 examples
added ... to ellipsis
Merge remote-tracking branch 'origin/master' into WIP_tut
merged ols and ridge variance + some neating
fixed links & neatening
moved exercises into seperate folder, neating up
path fix of moved figure
fixed typo,changed 2.2s numbering, fixed 4 examples in exercises
fixed numbering in main User Guide
added collapsable sidebar - still WIP
Collapsable sidebar adding complete - appears to work well
Deleted redundant files
color change for button
comment added to gen_rst. Arrow added to button
Next button added:position correct,but does nothin
button is mostly working
spelling fixes
cleaned up
more cleaning-finished off
spelling errors,edit curse of dimensionality, explain top-down
bug fix - layout
changed hover colours for button
previous button added with hovering-effect
Merge branch 'master' into WIP_tut
fixed new doc-test error
Made old EllipticEnvelop deprecated class
changed message to *Use EllipticEnvelope instead*
Fixed broken image link
Removed `_plot` from the face recognition example
Added the name change for the recent change EllipticEnvelope
Changed GMM's API to suite rest of sklearn
1.Fixed typo 2.Removed has_key entries
restored last changes
Fixed syntax error
mixture/plot_gmm* examples updated
restored last changes
DPGMM API updated, along with plot_gmm_sin example
DPGMM and VBGMM API change, example updated
modified test_gmm to match API changes in gmm.py
updated documentation for gmm,dpgmm and vbgmm
Changed variable name `x` to `covar_type`
Updated `whats_new.rst` with API change
Added `note` to tutorial index for `doctest_mode` in `ipython`
Merge branch 'master' of https://github.com/scikit-learn/scikit-learn
changes to `fit` and `__init__`
decision logic removed from __init__
API update for HMM types with docstrings
tests updated to match API
fixed example`s fit(..) to new API
made `diag` explicit in example
Fixed typos, spacing errors & updated `Whats New`
fixed broken GaussianHMM documentation generation
correct some wrong fixes
reversed the order of the thresholds array
metrics.py
test added for this
fixed typos,updated `whats new`
typo fixed in what`s new
added alternating columns for tables in documentation and a tighter layout in pre
docstring fixes
Merge branch 'master' of https://github.com/scikit-learn/scikit-learn
Fixed broken links on Support page
Fixed broken links on Support page
Merge pull request #974 from jaquesgrobler/master
fixed long-name-references madness + removed some whitespace
trainling whitespace removed
blank line removed
slight adjustment to header size
Merge pull request #1075 from jaquesgrobler/master
Merge pull request #1077 from ludwigschwardt/minor-fixes
Added scale_c fiasco example
gael`s suggestions/tweaks
docstring change
docstring fixes
changed includes back - change broke JENKINS build
not the problem afterall - switch back
docstring changes
typos and alex`s review changes
small tweaks
changed includes back - change broke JENKINS build
not the problem afterall - switch back
add first collapsible toctree test
moved buttons to themes
working version
Links now clickable
-collapse toc moved to front page-
button colour change + comments
fixes - seemingly good version
highlighting of + implemented
-line highlight bug fixed, buttons changed, full expansion added
small bug fix and colour tweak
nitpick fix
cleanups
cleanups
toggle bug fixed
highlight fix
what`s new updated
remove `steps` from Attributes of docstring
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Merge pull request #1331 from jaquesgrobler/master
Merge pull request #1367 from fannix/master
Merge pull request #1369 from AlexandreAbraham/fix_doc_clean
doc fix - trailing underscore and init param update
goodness of fit fix
trailing whitespace sentence
Merge pull request #1372 from jaquesgrobler/doc-fix-dev_guide
plot fix
variable name change
new example added for manifold learning
Andy`s suggestions
links for MDS
small changes
final changes
pep8
heading change
added links to astropy and scipy workflow guids
Merge pull request #1564 from jaquesgrobler/contributor_guide_links
remove 1000`s of warnings from example
Merge pull request #1592 from jaquesgrobler/master
Add temporary survey banner
remove the equaldistance code warning, replace with doc warnings
typo fix
remove warning
warning removal
update warning box
deprecation warnings, indent fix
andys suggestions and test
add warning for no internet
Merge pull request #1644 from jaquesgrobler/doc_url_error
TYPO fix
example title change
gallery effects,icon change,cleanups
typo fix and heading changes
fix indentation error-cause lots of build warnings
4 thumbs per row/hover effect/some cleanup
fix for iris dataset
line_count sort added, some changes reverted
move comment out of list
remove comment, undo change
Merge pull request #1803 from kmike/hmm
rename example title
Switch off survey banner
newline at end of file
Merge pull request #1581 from jaquesgrobler/example_gallery_cleanup
temp disable line-count-sort for gallery while fixing bug
sort-by-line-count bug fixed
Merge branch 'master' of github.com:scikit-learn/scikit-learn
fix numbering for tutorials page
Add bit more instruction on writing docs
big O/tilde add in
removed old complexity info
image and html file added
link fixes
add further links
last links fixed
jquerys added
intigrated to tutorial index
update tutorial page
make links relative
rename image/html
add instructions for editing Readme, and script needed for that
remove svg2html script,toctree section added,doc page for ml_map created
sidebar added
layout fixes and top paragraph
TYPO fix
update what`s new
deleted unnecessary thumbnail
DOC improve description of cross validation
resized image
disable sidebar using cookies to remember last position
COSMIT pep8
Merge pull request #1884 from jaquesgrobler/ml_map
DOC added link to scipy lecture notes to tuts
Merge pull request #1924 from jaquesgrobler/FIX_sidebar_on_index_page
Merge pull request #1911 from Jim-Holmstroem/generalize_label_type_for_confusion_matrix
Merge pull request #1944 from jnothman/selectpercentile_limit_bug
fixed typo
maintenance scripts added for machine learning maps - needed for modifying the map in future
initial commit
DOC Fix references to missing examples
fix incorrect reference
Merge pull request #1986 from jaquesgrobler/DOC_reference_fixes
remove placehoder
Merge branch 'tutorial' into DOC_olivier_tut
remove duplicates for merging larsmans branch
initial tutorial re-added with it`s commit history
Placeholder to tutorials page
add optional banner to index page to advertise code sprints
link updated
Merge pull request #1996 from jaquesgrobler/DOC_sprint_sponser_banner
moved tutorial out of tutorial folder
tutorial index initial touchups
hover removed from nature, jquery more recent version, containerexpansion on mouseover add
image resizing added
Zoom bug fixed
added docstring space to popup block
docstrings embedded into example hovers
Final visual effects added to hovering
Nelle`s review fixes addressed
Cross browser shadows covered
remove forgotten print
shorten displayed dosctring to 95 chars
fix white space inconsistency between header and docstring
example docstring fixes
logistic regresion example fix
Tutorial Setup
Merge pull request #2056 from jnothman/leavepout_clarify
firefox bug fixed
classifiers comparison fix
DOC spellfixes
Donate buttons added `About us` and front page
donations paragraphs added
Merge branch 'master' of github.com:scikit-learn/scikit-learn
misalignment fix
example fixes to clean first docstring paragraph of rst code
fix merge conflict
border added for IE
make new classes for lasso_path/enet_path and deprecate old
rel_canonical prelim
Merge branch 'master' into ENH_docstrings_in_gallery
syntax fix
cleaned up-ready
Merge pull request #2017 from jaquesgrobler/ENH_docstrings_in_gallery
Small docstring changes for plot_ward_structured_vs_unstructered example, as mentioned in PR #2017
nitpick fixes, pep8 and fix math equations
removed old_version block test
Merge pull request #2205 from jaquesgrobler/ENH_rel_canonical
sidebar fix - sidebar.js was called before jquery. works fine under new version jquery too
sidebar/toctree harmonie, must still fix toggle
jquery reverted to 1.7.2 version. sidebar/toc-collapse works
fix and squash broken branch, remove all but section 3 and redo some docstring breaks
fix broken directory name
Merge pull request #2329 from NelleV/website
DOC: few small doc fixes to layout bugs on new website
comments added to the changes
Merge pull request #2331 from jaquesgrobler/master
fix sidebar button size staying constant and leaving a mountain of white-space due to expanding/collapsing toctree
fix cut-off ML map
Merge pull request #2336 from jaquesgrobler/DOC_fix_whitespace_on_userguide
commit added for hack-warning
Merge pull request #2337 from jaquesgrobler/DOC_fix_cut_ML_flowchart
Merge pull request #2345 from rgbkrk/patch-1
first carousel version added
firefox fix and more images added, auto-cycling disabled
arrows switched for dots
have images link to relevant examples
slight layout adjust
small layout changes for firefox, images taken from generated images now
indentation fixes
add more examples and cropping to first image
disable carousel for small displays, small tweaks
merge conflict fix
redo old deletions
contributors logos added to index screen footer
changed hyperlink to funding anchor - added same for the images
old footer removed for index page
removed little underscores appearing between images - caused by link added to images
add about us to community block
Merge pull request #2350 from jaquesgrobler/DOC_inria_google_logos
fix broken logo links on homepage and remove unused duplicate images
another speedup attempt for index.html: carousel images to use thumbnails instead, 1 blocking js removed for page (sidebar.js)
cleanup changes
remove paren
add lossless image compression for docbuilding machine
switch species_kde to thumbnail
logo lossless compressions
add coverage test for travis
coverall added
remove whitespace, fix pip install
adds minutes + seconds to examples
cosmetic - shorten long line
add copybutton to code examples
remove writing of minutes/seconds
remove duplicate line
cleaner toctree collapsing
remove unused images
mention python.org in js file
Merge pull request #2515 from jaquesgrobler/add_minutes_examples
fixes mis-alignment of Documentation link on front page
Merge branch 'master' of github.com:scikit-learn/scikit-learn
SGD and SVC duplicate example sorted
sort out plot_iris vs plot_svm_iris
some cosmetic commits and delete duplicate files
Merge branch 'master' into DOC_olivier_tut
Merge branch 'master' into DOC_olivier_tut
some cleanup, conflict fixes, squashes, re-dos
redo typo fixes
small css tweak
speedup for docbuild using joblib
change cache directory
typo fix
changes to headings and sections
remove face-recognition exercises, skeletons, etc and merge artifacts
further changes to headings and subsections
re-order some Documentation categories in documentation.html
fix broken link for stable version from dev version
fix tutorials index, update dropdown menu in navbar, update stat_learning tutorial link on documentation.html page
typo fix
Merge branch 'master' into DOC_olivier_tut
remove extra headings from tuts page
update what`s new
remove EllipticEnvelop deprecation
Merge pull request #2651 from amueller/spotify_testimonial
whatsnew conflict fix
Merge branch 'master' into DOC_olivier_tut
recommit joel comments
where to from here section
whitespace removals
olivier suggestions
changed PCA to truncated SVD
MAINT deprecate HMMs
conflict fix
small fixes
Merge pull request #1971 from jaquesgrobler/DOC_olivier_tut
sphinx colon error fix
COSMIT fix PEP8 errors
Jatin Shah (2):
Add sample_weight parameter to metrics.jaccard_similarity_score
Add sample_weight parameter to metrics.log_loss
Javier López Peña (3):
Fix example in plot_pca_3d.py (color array had wrong size)
Check if targets is a numpy array and convert it into one if it isn't
Replace 'coefs' by 'coef_' in PLSRegression
Jean Kossaifi (36):
Changed the default return type of ward_tree from bool to int
adding a comment on the test for grid_to_graph
pep8 and using np.bool instead of bool
FIX : _to_graph failed if mask's data was not of type bool
Test to check that the grid_to_graph function works with every type of
COSMIT : used implicit continuation inside parenthesis instead of
Typo : fix the 0.5 coefficient
Added normalize parameter to LinearModel
Added parameter normalize to LinearRegression
LassoLARS now uses the normalize parameter
Completed the integration of the parameter normalize
Implementation of the parameter normalize in bayes.py
added parameter normalize to coordinate_descent
added parameter normalize to ridge.py
Added parameter normalize to omp.py
Added parameter normalize
Fixed some errors (mainly docstrings)
Merge remote branch 'upstream/master' into normalize_data
Added a function as_float_array in scikits.learn.utils
Fix : deleted a forgotten line
FIX : corrected a bug in as_float_array and added a test function
PEP8 : replaced tabulations by spaces
FIX : if X is already of the good type, we musn't modify it
FIX : if X.dtype is changed, then a copy of X is returned, even if overwrite_X is True
Test : lasso_lars_vs_lasso_*
Merge branch 'normalize_data'
FIX : Ellipsis in least_angle.py doctests
FIX : ELLIPSIS in least_angle.py doctests
Merge branch 'master' of https://github.com/scikit-learn/scikit-learn
Sorting parameters in BaseEstimtor.__repr__
FIX : docstest fail
Cross_val : Removed useless & tricky parameter iid
Minor doc enhancement: documented setup develop.
Added link to the setuptools doc and a note on the need to rebuild every time a
Added subject independent KFold
Added link to github repositories in what's new.
Jean Michel Rouly (1):
Using the 'is' token compares identity; this should not be used for string value comparison.
Jeff Hammerbacher (2):
'none' is an acceptable value for penalty
Small typo fix in the comments: "whther" --> "whether"
Jeffrey Blackburne (8):
ENH shuffle option for StratifiedKFold
Added a unit test to ensure that there are no spurious repeating values in the thresholds returned by roc_curve because of machine precision, and a quick stab at a fix.
Slightly expanded the docstring for the test_roc_nonrepeating_thresholds test. Added named slice variables to same, for readability.
Improved the treatment of the machine precision issue for calculating ROC thresholds. Added a comment explaining why the treatment is needed.
Minor refactoring of the repeated-roc-thresholds test, for clarity.
Added a backport of numpy.isclose from numpy v1.8.1. This function was not available until numpy v1.7
Changed metrics.py to use the new backward-compatible version of np.isclose
PEP8 fix
Jeffrey04 (1):
FIX MaxAbsScaler on sparse matrices with 1 row
Jelle Zijlstra (1):
fix defaultdict call
Jeremy (1):
Dataiku testimonial
Jiali Mei (1):
a common test to check if classifiers fail when fed regression targets
Jim Holmström (10):
Added random_state=0 for AdaBoostRegressor
Replaced 'for i' with 'for _' at place where i is not used.
Extended test_confusion_matrix_binary to incorporate non-integer labels
Extended test_confusion_matrix_multiclass to incorporate non-integer labels
BUG: Fix for non-integer datatypes in confusion_matrix
ENH: faster preallocation and integer type for the accumulators
STY: one-lined lines that where less than 79
MAINT: let the result type be infered by coo_matrix, possible since np.ones already integer typed
MAINT: refactored metrics.auc to use np.trapz
ENH: Added input checks in confusion_matrix
Jochen Wersdörfer (2):
ENH CountVectorizer using arrays instead of lists
ENH added multiclass_log_loss metric
Joe Jevnik (2):
ENH: Adds CallableTransformer
COMPAT: Makes test_function_transformer py2 compatible.
Joel Nothman (357):
Fix comment: returns fbeta_score, not f1_score
ENH allow SelectKBest to select all features in a parameter search
DOC Allowing a list of param_grids means GridSearchCV is more than grids
DOC clarify relationship between pos_label and average parameters for
ENH/FIX make best_estimator_'s predict functions available in parameter search
FIX make *SearchCV picklable
REFACTOR combine train_wrap and csr_train_wrap
ENH call asarray on returned scores and pvalues
TST ensure SelectKBest and SelectPercentile scores are best
FIX ensure SelectPercentile only removes tied features in case of ties
ENH _BaseFilter.inverse_transform should respect dtype
DOC Fix comment for _BaseFilter.inverse_transform
ENH sparse _BaseFilter.inverse_transform
FIXTST fix errors introduced to feature selection tests
DOC comment feature selection sparse inverse_transform
Merge pull request #1935 from jnothman/base_filter_inv_transform
ENH Feature selection should use CSC matrices
COSMIT Remove redundant code in CountVectorizer
TST test CountVectorizer.stop_words_ value
ENH Use csr_matrix.sum_duplicates instead of tocoo
DOC small typographical fixes in grid_search documentation
COSMIT refactor roc_curve and precision_recall_curve
FIX bug where hinge_loss(..., neg_label=1) produced incorrect results
Merge pull request #1880 from NicolasTr/patch_extractor_float_max_patches
DOC Fix estimator unsupervised fit method signature
DOC clarification of parameter search
DOC fix typos
COSMIT shorten long line for pep8
ENH Create FeatureSelectionMixin for shared [inverse_]transform code
DOC rewrite descriptions of P/R/F averages and define support
DOC/COSMIT fix typos in What's New
DOC add some contributions to What's New
TST Use assert_almost_equal in test_symmetry
COSMIT prefer partial over lambda in test_metrics
TSTFIX use name, not metric, in test_metrics error messages
DOC correct note on handling 0-denominator in P/R/F
Merge pull request #2005 from kmike/test_pipeline_methods_preprocessing_svm
ENH faster unique_labels for big sequences of sequences
DOC explain labels parameter to confusion_matrix
DOC Detail on parent-child relationship in tree
FIX/COSMIT helper to identify target types
FIX cannot use set notation for Py2.6
FIX need explicit dtype for array of sequences in numpy 1.3
COSMIT remove redundant target size check
FIX numpy 1.3 has no float16; use float32
FIX/TST np.squeeze in numpy1.3 fails with array of sequences
FIX numpy 1.3 throws error with array of arrays
FIX use Python 2.6-compatible str.format
COSMIT refactor cross-validation strategies
Include LeavePLabelOut in refactoring
A further refactor
COSMIT Base class for KFold/StratifiedKFold validation
COSMIT make BaseKFold abstract
COSMIT pep8 in cross_val_score
COSMIT Base class for [Stratified]ShuffleSplit
DOC clarify LeavePOut's combinatoric explosion
DOC similar note in narrative docs
DOC More explicit note
DOC fix docstring headings
COSMIT make helpers private with underscore
COSMIT make BaseKFold private with underscore
TST additional tests for preprocessing.Binarizer
COSMIT add underscore prefixes where forgotten in cross_validation
COSMIT much simpler agglomeration inverse_transform
TST stronger test for agglomeration transforms
DOC minor fixes to Ward docstrings
DOC fix docstrings for AgglomerationTransform
DOC detail Ward.children_ and fix n_components_ type
DOC comment on Ward algorithm
DOC clean pooling_func arg type
DOC copy comment describing hierarchical clustering children
Merge pull request #2054 from ogrisel/invalid-n-folds
FIX avoid spectral_embedding naming conflict
Merge pull request #2085 from agramfort/fix_y_score_fa
Merge pull request #2090 from kanielc/fix_weight
COSMIT move deprecated parameter to end
COSMIT refactor document frequency implementations
ENH print number of fits in BaseSearchCV._fit
DOC fix comment on svm probability param
DOC add missing Returns description
MAINT deprecate indices=False for cross validation generators
COSMIT rewrite precision_recall_fscore_support
FIX warning interaction problem, DOC clarify parameters
FIX ignore corrcoef warning
Use label_binarize instead of LabelBinarizer
Use specialised warning class
TST messages for UndefinedMetricWarning from prf
TST use ignore_warnings helper
TST classification_report example that does not trigger warning
FIX use fixes.bincount for numpy 1.3.0 support
FIX Py2.6-compatible use of str.format
FIX finish deprecation of indices in CV
FIX limit warnings for recall_score, precision_score, f1_score,
Merge branch 'master' of https://github.com/scikit-learn/scikit-learn into prf_derivative_warnings
COSMIT use assert_no_warnings where appropriate
FIX stop forcing deprecation warnings for external packages
FIX use of str.format for Py2.6
ENH support multiclass targets of string objects
TST Add benchmarking script for multilabel metrics
DOC Add what's new entry for #2626
COSMIT in response to feedback
Merge pull request #2660 from amueller/mbkmeans_compute_labels_fix_new
FIX suspected bug for non-hypercube make_classification
FIX Low-memory cluster sampling in make_classification
FIX don't leave clusters in uint8
Remove debug print statement
FIX update data-dependent test assertion
Use internal sample_with_replacement rather than Python's random.sample
TST/COSMIT cleaner make_classification and fix tests
DOC add missing comment for C parameter
DOC remove duplicate reference to example
FIX Use Py2.6-compatible string formatting
DOC add learning_curve to what's new and API reference
FIX explicit type for offset variable
FIX remove confusing BaseEstimator.__str__
ENH/FIX Change Tree underlying data structure
DOC note #2732 in what's new
DOC fix documentation to match default value
ENH remove unnecessary CSR->CSC transform in text feature extractors
Merge pull request #2772 from jnothman/tfidf_no_copy
DOC make indifferent to compressed sparse orientation
DOC add link to module reference from class/function pages
FIX remove reference cycles from Tree
FIX dtype refcount; block resizing after building
Remove 'locked', add comments
COSMIT remove unused imports and variables
Merge pull request #2818 from jnothman/flakes
Merge pull request #2826 from perimosocordiae/patch-1
Merge pull request #2840 from earino/develop
Merge pull request #2869 from jperla/master
STY Simplify and avoided numpy warnings in imputation
Merge pull request #2925 from ajtulloch/izip-cross-validation
ENH/TST remove unnecessary sorts and complete testing for sparse median
COSMIT Remove unnecessary assertion
FIX for numpy compatibility, code clarity
MAINT warn of future behaviour change proposed in #2610
FIX for early numpy where astype(..., copy=...) unavailable
DOC Remove redundant parameter docs
Merge pull request #2991 from Manoj-Kumar-S/csr_sparse_center
DOC use hyperlink colour when in tt
Merge pull request #3010 from griffinmyers/fix-select-k-best
TST/COSMIT tests/comments for [sparse]center_data
Merge pull request #2952 from jnothman/future_warn_two_labels
Merge pull request #3057 from ajschumacher/patch-4
DOC Comment on thresholds[0] for roc_curve
DOC a note on shuffling for cross-validation
DOC shuffled StratifiedKFold now in whats_new
Merge pull request #3133 from kmike/patch-2
Merge pull request #3134 from jrouly/string_values
Merge pull request #3140 from MechCoder/small_fixes
Merge pull request #3192 from mmaker/master
DOC fix typo
Merge pull request #3196 from andrewclegg/master
Merge pull request #3197 from mjbommar/issue-3167-eradicate-todense
Merge pull request #3198 from apw/patch-1
Merge pull request #3208 from staubda/improve_RFE_doc
Merge pull request #3217 from bwignall/innerprod
Merge pull request #3225 from bwignall/quickfix-typo
MAINT deprecate sequences of sequences support
FIX remove duplicates in MultiLabelBinarizer
COSMIT use MultiLabelBinarizer in LabelBinarizer during deprecation
COSMIT use mlb for MultiLabelBinarizer instances
ENH do not allocate memory for temporary array of 1s
DOC remove mention of sequence of sequences in Parameters sections
FIX don't use dict comprehension for Py 2.6
FIX No set construction shorthand in Py2.6
DOC comment on _transform interface
TST Validate input MultiLabelBinarizer.inverse_transform
No set construction shorthand in Py2.6
DOC/FIX Address @arjoly's comments
TST stronger test for non-integers in MultiLabelBinarizer
Assert or ignore all sequence of sequences deprecation warnings
TST avoid more warnings related to sequence of sequences
TST fix testing for sequence of sequences warning in metrics
Merge pull request #3240 from JelleZijlstra/fix-defaultdict
DOC more complete coverage of API linking in examples
TST add test for name identification in RST generation
TST avoid nose running setup() in gen_rst.py
TST add sphinxext to testing
TST remove dependencies for testing sphinxext
DOC give example of binarizing binary targets
DOC Note shape of binary binarized output
COSMIT factor out split data as struct
Merge pull request #3165 from jnothman/links_in_examples
FIX OvR with constant label for non-predict methods
FIX case where label is constantly absent
Merge pull request #3309 from argriffing/remove-symeig
TST use assert_warns and modernise test for constant predictor
FIX avoid using sequences of sequences and fix tests
Merge pull request #3317 from pignacio/model_persistence_doc
MAINT remove residual sparsefuncs*.so when compiling
ENH make_multilabel_classification for large n_features: faster and sparse output support
COSMIT in response to @arjoly's comments
COSMIT pep8 fixes
TST ignore sequence of sequences DeprecationWarnings
DOC link example gallery scripts rather than inline
COSMIT
DOC show referring examples on API reference pages
DOC ensure longer underline
DOC mention doc-building dependency on Pillow
Merge pull request #3327 from jnothman/examples_in_apiref
DOC make neural networks example appear
DOC Fix example path
DOC fix 'Return' -> 'Returns'
FIX Py3k support for out-of-core example
DOC add links to github sourcecode in API reference
DOC fix opaque background glitch when hovering example icons
DOC fix doc errors in utils.testing
DOC fix formatting of attributes etc. in docstrings
DOC fix styling of See Also sections
DOC fix styling of method signatures
DOC fix see also references
DOC move Attributes section to after Parameters and style likewise
Merge pull request #3491 from ogrisel/fix-warning-test-cross-val
DOC fix see also reference
DOC fix markup error
ENH Sparse multilabel target support
DOC label indicators are clearer as ints than floats
DOC more precise input type for multilabel metrics
Merge pull request #3509 from arjoly/fix-rp-sparse
Merge pull request #3489 from jnothman/attributes-doc
DOC remove backticks from around attribute name + PEP8
DOC A less-nested coverage of model evaluation
Merge pull request #3528 from MechCoder/fix_heisen
Merge pull request #3527 from jnothman/unnest-model-evaluation
FIX only use testing.ignore_warnings in tests
ENH faster safe_indexing for common case
Merge pull request #3539 from jnothman/fast_indexing
DOC fix typo
DOC fix typo
DOC specify X shape for precomputed
DOC correct default value
DOC add missing details to what's new
DOC correct shape of Tree.value
COSMIT update sklearn.svm.bounds
MAINT Remove note that assert_warns comes with Numpy 1.7
Merge pull request #3572 from AndrewWalker/patch-1
ENH more explicit error message for ill-posed problem
DOC more explicit parameter descriptions in make_multilabel_classification
FIX PLSRegression again supports 1d target
FIX out-of-core example had been broken
DOC avoid the plot_ prefix in example where no plot
DOC fix line references in tutorial
FIX more intuitive behavior for *SearchCV.score
DOC document MultiLabelBinarizer.sparse_output param
Merge pull request #3586 from MechCoder/python3_installation
DOC fix link embedding regexp
DOC allow link embedding to proceed when web sites unavailable
DOC extend documentation on sample generators
Merge pull request #3001 from jnothman/doc_sample_gens
Merge pull request #3593 from MechCoder/row_norms_doc_fix
Merge pull request #3591 from MechCoder/refactor_logreg
DOC tweak BaseSearchCV.score docstring notes
DOC a further tweak to BaseSearchCV.score docstring
FIX update sklearn.__all__ to include all end-user submodules
DOC restructure data transformation user guide
DOC fix broken URL
Merge pull request #3676 from danfrankj/master
DOC reduce table width through less-qualified function names
Merge pull request #3742 from abhishekkrthakur/master
FIX score precision in doctest
Merge pull request #3571 from jnothman/fix-link-embedding
Merge pull request #3833 from amueller/sgd_partial_fit_variable_length
DOC fix what's new for sparse trees
DOC add documentation improvements to what's new
DOC narrative docs for grid search's robustness to failure
DOC fix layout: ensure stable width of documentwrapper
Merge pull request #3881 from amueller/code_tag_new_sphinx
Merge pull request #3893 from jnothman/css_fix
Merge pull request #3578 from jnothman/pls-validation
DOC more expansive description of well-established algorithms
FIX P/R/F metrics and scorers are now for binary problems only by default
Merge pull request #2679 from jnothman/prf_average_explicit
DOC editing for LSH forest narrative documentation
TST/FIX ensure correct ducktyping for metaestimators
TST extend ducktype testing to handle #2853 case
FIX ducktyping for meta-estimators
TST more rigorous testing of delegation
Merge pull request #3989 from hammer/patch-2
ENH support sparse matrices in LSHForest
ENH vectorise _find_longest_prefix_match
TST LSHForest benchmark script uses vectorised queries and fixed random_state
FIX not all scipy.sparse support rand with random_state
STY cleaning some LSHForest query code
COSMIT flat > nested / avoid duplicate validation
STY remove unnecessary variable/operation
STY remove unnecessary variable/operation
ENH sparse metric support for paired distances
Merge pull request #3995 from jmetzen/fix_isotonic
COSMIT Minor changes to address review comments
ENH vectorize DBSCAN implementation
ENH sparse support in DBSCAN
ENH support weighted samples in DBSCAN
FIX DBSCAN.fit_predict now also supports sample_weight
DOC/COSMIT clean up according to review
DOC mention sparse matrix in LSHForest param docstrings
FIX correct name due to rebase error
Merge pull request #4003 from ClashTheBunny/patch-1
DOC add note on scipy.sparse.rand use
Merge pull request #4038 from jakevdp/ensure_symmetric2
Merge pull request #4071 from amueller/grid_search_document_best_estimator
Merge pull request #4029 from ragv/maint_raise_uniform_error_notfitted
Merge pull request #4086 from ragv/whatsnew
Merge pull request #4093 from MechCoder/regresssion_nc
ENH use parallelism for all metrics in pairwise_{kernels,distances}
Merge pull request #4085 from jnothman/pairwise_parallel
Merge pull request #3850 from amueller/randomized_search_no_replacement
DOC what's new on parallelism of pairwise_distances
Merge pull request #4076 from ragv/silhouette_plot
Merge pull request #4158 from ragv/whatsnew
Merge pull request #4160 from trevorstephens/doc-fix-4098
Merge pull request #4131 from amueller/faq_no_1000_citations
TST/FIX in future, average='binary' iff 2 labels in y one of which is pos_label
Merge pull request #4037 from ragv/make_storing_stop_words_optional
Merge pull request #4206 from ogrisel/fix-strict-select-fdr
TST Test check_consistent_length and TypeError with ensemble arg
Merge pull request #4308 from terrycojones/fix-missing-space
Merge pull request #4315 from ugurcaliskan/patch-2
TST ensure return type of radius_neighbors is object
FIX/TST boundary cases in dbscan (closes #4073)
Merge pull request #4555 from ogrisel/fix-astype-copy-true
ENH faster LSHForest: use take rather than fancy indexing
ENH labels parameter in P/R/F may extend or reduce label set
DOC make contributor's guide more prominent on front page
PEP8 for recent merge
DOC backticks in attribute docstrings unnecessary since #3489
DOC minor tweaks to docs homepage
DOC organise documentation hierarchy / table of contents
DOC links should not always point to dev
FIX avoid memory cost when sampling from large parameter grids
FIX #4902: SV*.predict_proba visible before fit
DOC tweaks for feature_extraction.text
COSMIT prefer loops to repetition in LSHForest benchmark
DOC rearrange related projects
DOC use correct attribute in example
DOC don't show example thumbnails in latex docs
ENH precomputed is now a valid metric for 'auto' neighbors
ENH support precomputed neighbors where query != index
TST precomputed matrix validation in correct place
FIX add _pairwise and test to NeighborsBase
DOC fix up parameter descriptions
FIX broken rebase
TST add test for precomputed metric and X=None at predict
FIX test with invalid input; simplify dbscan precomputed
FIX error introduced during rebase
ENH sparse precomputed distance matrix in DBSCAN
DOC resize images for latex
DOC hide web-specific links from latex build
DOC prefix version headings by "Version"
DOC provide PDF documentation for download
ENH avoid slow CSR fancy indexing variant
DOC comment on obscure syntax
Merge pull request #5288 from jnothman/bicluster-example-speed
Merge pull request #5237 from gclenaghan/remove_thresholds
Merge pull request #5369 from joshloyal/fix_hashing_docstring
Merge pull request #5371 from ylow/sgd_bug_fix
Merge pull request #5402 from lesteve/fix-lshforest-doctest
Johannes Schönberger (137):
Remove invalid todo comment
Add missing doc string printing for examples
DOC : fixes in covariance module
Add implementation of RANSAC algorithm
Add ransac function to __init__.py of utils package
Fix bug in residual determination
Add step-by-step description of RANSAC algorithm
Add parameter description to RANSAC and modify API
Add example plot script for RANSAC
Add simple RANSAC unit test
Add unit tests for RANSAC is_model_valid and is_data_valid
Add unit tests for RANSAC max_trials option
Return number of trials in RANSAC
Fix n_trials return value and add additional test
Add n_trials return value to doc string
Add test for stop_n_inliers parameter
Add test for stop_score parameter
Move ransac.py to _ransac.py to avoid nosetest namespace conflict
Set numpy random seed for all test functions
Remove unused variable
Add missing information predict method of estimator
Skip iteration for empty inlier sample set
Move algorithm description to notes section
Add some reference papers
Fix estimator parameter description
Add empty line between parameter and return value description
Move RANSAC implementation to linear_model subpackage
Implement RANSAC as estimator class
Fix indentation
Add score method
Add RANSAC to __init__ of linear_model sub-package
Add author and license info in source file header
Define source file encoding
Fix deprecated naming of estimator object
Update unit tests with new estimator class interface
Update RANSAC example script for new estimator class interface
Add doc string for predict method
Add doc string for score method
Set default initialization parameters
Fix default parameters of RANSAC estimator
Add missing trailing underscore for estimator attributes
Remove deprecated ransac implementation
Add description for min_n_samples parameter
Move estimator_, n_trials_, inlier_mask_ initialization to fit method and use
Add extra line between doc string sections
Add support for absolute and relative min_n_samples
Add default values for min_n_samples, residual_threshold
Include random_state utility
Use random_state in test cases
Explain behavior when no base estimator is specified
Fix test case function name
Add specific test case for score method
Add specific test for predict method
Mark all parameters as optional
Add missing reference papers
Set random state in fit rather than __init__
Add specific exception description for ValueError
Change indentation of multiline if-statement
Remove unused variable
Use term callable instead of function in doc string
Add more precise description of is_*_valid parameters
Combine nested if-statements
Remove default perceptron base estimator for integer data type
Add support for sparse feature vectors
Fix indentation of multi-line if-statement
Remove trailing empty line after doc strings
Add test case for sparse feature matrix
Fix flake warnings
Add test case for case without specified estimator
Add test case for min_n_samples parameter
Fix description of default base estimator behavior and add corresponding test case
Add description of RANSAC to linear-model docs
Move ransac example to linear-model folder
Add meta-data title, author and year to references
Fix indentation
Change ransac docs section title
Remove academic detail description
Remove duplicate detailed description from doc string
Change default parameters, so example works without specific parameters
Add support for multi-dimensional target-values
Add reference to narrative documentation for detailed description of RANSAC
Remove empty line
Add more descriptive explanation for raised ValueError
Fix bug in is_data_valid test
Add more specific test for is_data_valid and is_model_valid functions
Stylistic multi-line statement change
Improve ValueError description
Remove email address
Fix typo
Set random state of base estimator as well
Use MAD as default residual threshold
Use np.logical_not rather than tilde
Extend RANSAC example with comparison to BaggingRegressor
PEP8 indentation fix
Remove double-space
Add more detailed description for score method
Improve description of base_estimator parameter
Add tests for more types of sparse matrices
Derive Ransac also from MetaEstimatorMixin
Document use cases of is_*_valid functions
Improve and document n_trials_ attribute
Fix parameter description of X and y
Change naming scheme of variables for consistency
Fix return value description of predict
Fix y parameter description of score
Change linear_model access and call explicitly with kwargs
Remove bagged regressor
Add missing parameter of score method
Unwrap lines
Make sure min_n_samples is not larger than number of samples
Remove _n_ from min_samples
Sample subset without replacement
Add reference to max_trials
Rename min_samples
Improve explanation of degenerate case
Fix typo
Improve RANSAC example script
Make sure dimension is correct for all estimators
Add note about computational cost of is_*_valid functions
Rename RANSAC to RANSACRegressor
Rename base_estimator to estimator
Use LinearRegression as default estimator for all data types
Add RegressorMixin base class
Raise ValueError for non-integer absolute min_samples value
Rename base_estimator to estimator in docs
Clarify default estimator by explicitly instantiating the class
Remove unused outliers parameter
Rename estimator to base_estimator
Add additional floating point test for absolute min_samples value
Use assert_equal, assert_less rather than plain assert statement
Add RANSACRegressor to whats-new doc section
Add test for default value of min_samples
Add test for invalid value of min_samples
Add test for custom residual_metric
Add test for default residual threshold
Remove reference to web page
ENH: Add dynamic maximum trial determination to RANSACRegressor
John (1):
adding conditional import to lfw.py for python3 support
John Benediktsson (11):
tree: check length of sample_mask and X_argsorted.
DOC: fix typos in tree docstrings.
DOC: fix value error text in Tree.compute_feature_importances.
COSMIT: Use np.array.fill for scalar values.
COSMIT: doc fixes to sklearn.feature_selection.univariate_selection.
COSMIT: fix typo of homoscedasticity.
COSMIT: fix reference to scipy.stats.kruskal.
COSMIT: fix more typos.
DOC: fix 'Controls' typo in sklearn.ensemble.forest.
COSMIT: fix typo in AUTHORS.rst.
COSMIT: fix excessive indentation.
John Kirkham (1):
DOC: Remove extra "not". [ci skip]
John Schmidt (1):
DOC reference for k-means++ in clustering narrative
John Wittenauer (2):
Added several related projects.
Added a link to the deep learning software list.
John Zwinck (1):
FIX use float64 in metrics.r2_score() to prevent overflow
Jonathan Helmus (1):
BUG: Correct check for local import
Joonas Sillanpää (3):
Radius-based classifier now raises exception, if no neighbors found
Corrected some mistakes, added optional outlier_label parameter, which can be given to outliers
Fixed weight calculation from distances (1. / dist), and weight function in tests (lambda d: d ** -2)
Jorge Cañardo Alastuey (1):
BUG: Compare strings for equality, not identity.
Joseph (1):
Related projects linking to wrong URL
Joseph Perla (1):
Fix typo in fast sgd classifier implementation comments
Joshua Loyal (3):
Updated _std docstring in StandardScaler to make internal handling of
ENH Document how to get valid parameters in set_params
correct optional arguments for FeatureHasher
Joshua Vredevoogd (1):
DBSCAN BallTree implementation
José Ricardo (1):
Fixing small typos in the docs.
Juan Manuel Caicedo Carvajal (1):
Check for consistent input in Logistic Regression.
Julien Miotte (3):
Without pl.show, the figure won't be displayed.
Fetching every figure generated by the example scripts.
Since we changed the name of the figure names, changing the rst files.
Jungkook Park (1):
Use expit function to compute the probability in ExponentialLoss class.
Justin Pati (1):
changed warnings in grid_search.py related to loss_func and score_func being passed
Justin Vincent (19):
PY3 xrange, np.divide, string.uppercase, None comparison
TST + PY3 various fixes
Got all the doc-tests working
Merge in master
More python3 fixes (and just plain bugs)
use ELLIPSIS in doctest to deal with numpy changes.
Forcing the deprecation warnings to happen while in get_params.
Force warning to be heeded in deprecated args check. Possibly fixed a test bug (but maybe I just got it wrong)
Make a test not dictionary order dependent.
Fix up last doc tests.
Make the fixes 2.6 compatible
ELLIPSIS around a unicode issue.
Fix y vector. We wanted round off division so that y == [0 0 1 1 2 2 ...], not [0 .5 1 1.5...]
A little more of those unicode helpers
Another ELLIPSIS
Pop off the recently added filter after testing for deprecation warnings.
merge in origin
Comment change
Fix two remaining python3 bugs.
Kaicheng Zhang (1):
G is upper-triangular, so G.T should be lower-triangular.
KamalakerDadi (1):
Version added for all new classes and parameters
Kamel Ibn Hassen Derouiche (1):
FIX: compilation issues under NetBSD
Karol Pysniak (3):
Added unit tests for parameters in isotonic regression
Fixed style issues as indicated by pep8
Fixed passing y_max and y_min to IsotonicRegression in test_isotonic
Kashif Rasul (1):
make_multilabel_classification sparse target
Keith Goodman (3):
DOC: minor typos in covariance doc.
BUG: price accidentally used instead of volume
DOC wrong default value in docstring
Kemal Eren (100):
ridge regression uses compute_class_weight()
Re-add deprecated class_weight parameter.
removed class_weight parameter from RidgeClassifier.fit()
check_pairwise_arrays() preserves dtype==numpy.float32
implement spectral biclustering and spectral co-clustering
wrote tests
wrote methods for generating bicluster data
added option to return piecewise vectors
cast data in fit()
made internal functions private
use random state in test
removed pickle test
shorten first lines of test docstrings
use random state in preprocess tests
duck typing, minor corrections: spacing and typos
fixed exceptions and their messages
updated svd()
better array validation
use random state in data generator
tests reuse data generators
user may select svd method
Added to docstring
split spectral biclustering into two classes
removed unused code
test bad arguments
now supports sparse data
check n_clusters parameter more thoroughly
made base class an abstract class
checkerboard panels may have arbitrary values.
fixed exception type
removed empty mixin
started biclustering documentation and examples
shorter array slicing
made some methods into private methods
cleaner use of check_arrays()
named arguments
use safe_sparse_dot()
use np.random.RandomState directly
do not do any checks during __init__()
do not use mutable default arguments
added new tests for sample data generators
fixed bug in make_checkerboard(), so tests pass again
use assert_all_finite
skip permutation test for now
fixed some errors reported by pyflakes
raise exception instead of converting sparse arrays to dense
expanded biclustering documentation
corrected k_means in docstring
rearranged imports from general to specific
moved and renamed _make_nonnegative() and _safe_min()
added option to use mini-batch k-means
use dia_matrix
renamed 'preprocess' to 'normalize'
use sklearn.utils.extmath.norm
base class __init__ is no longer abstract
added more information to error messages
also use norm in _project_and_cluster()
make test more sparse
made 'bicluster' a submodule of 'cluster'
removed svd_kwargs argument
added n_svd_vecs parameter
tests use ParameterGrid to avoid deep nesting
replaced kmeans_kwargs with some useful k-means parameters
updated documentation
keep biclustering algorithms in submodule
renamed examples; added to example docstrings
re-added bicluster mixin, this time with some functionality
wrote newsgroup biclustering example
fixed a few things in examples, documentation, and docstrings
wrote bicluster scoring using jaccard index and hungarian matching
removed some parameters to speed up test
added default arguments to base class's __init__ to make test pass
test_make_checkerboard was wrong after api change
added documentation for bicluster evaluation
moved shuffle functionality to utility function
added consensus score to bicluster examples
renamed example to get output to work
made bicluster utilities for dealing with indicator vectors
index in one go. added sparse test.
documentation and docstring fixes
merged newsgroup example with Vlad's
moved bicluster examples to their own category
reduced noise in spectral coclustering example
updated newsgroups example
added n_discard parameter to _svd()
check value of n_components and n_best
a fix for nan values in singular vectors.
wrote tests to ensure svd works on perfect checkerboard
redundant phrase in docstring
put biclustering section after clustering section in reference
misc. fixes
changes to newsgroups example:
fixed some docstrings: backticks and missing parameters
updated setup.py
added myself to authors; added biclustering to whats new
examples use matplotlib.pyplot instead of pylab
consistency changes:
removed plot_ from newsgroups example file
import biclustering methods in sklearn.cluster and sklearn.metrics.cluster
skip perfect checkerboard test
Ken Geis (4):
Changed the setup instructions in the README to properly install the package in the user home.
FIX mbkmeans benchmark bug (k instead of n_clusters)
FIX off-by-one error in neighbors benchmark
ENH lots of benchmarks fixes
Kenneth C. Arnold (4):
Cosmit
Cosmit
fast_svd: factor out the randomized range finder (more generally useful)
Mark Cython outputs as binary so their changes don't clutter diffs.
Kenta Sato (1):
Fix OOB score calculation for non-contiguous targets
Kernc (12):
KNeighborsClassifier now has a predict_proba() method
reversed changes to KNeighborsClassifier.predict()
a simple test case for KNeighborsClassifier.predict_proba()
feature_extraction.text.CountVectorizer analyzer 'char_nospace'
Oneliner docstring
words for n-grams padded with one space on each side
missing unicode modifier
replaced str.format() with string concatenation as it's 3 times faster
char_nspace -> char_nospace, thanks Lars
changed 'char_nospace' keyword to shorter and meaningful 'char_wb'
some narrative documentation...
mentioned 'char' vs 'char_wb' in the narrative
Kevin Hughes (1):
ENH actually use scikit-learn's PCA class in plot_pca_3d.py
Kevin Markham (1):
DOC FIX: typo and minor update
Kian Ho (5):
Initial commit of plot_ensemble_oob.py
Minor amendments to plot_ensemble_oob.py.
Amended plot_ensemble_oob.py according to #4665.
Made more amendments according to PR feedback.
Refined the docstring for plot_ensemble_oob.py.
Konstantin Shmelkov (1):
ENH multioutput regression metrics
Kyle Beauchamp (12):
Added code to address issue #1403
In preprocessing.binarize, eliminate zeros from sparse matrices
Added feature for issue #1527
Minor PEP8 fixes for issue #1527
Minor docstring fix for issue #1527
Added tests and docs for normalized zero_one loss
Fixed pep8 spacing issue and floating point doctest issue
Added CSC matrix testing for binarize and added type tests.
Added MinMaxScaler inverse_transform for issue #1552
Dummy commit to trigger travis
Added feature to fix #1523
Update related_projects.rst
Kyle Kastner (42):
Removed pl.axis('tight') and set the plot limits with pl.xlim(), pl.ylim(). pl.axis('tight') appears to be adding whitespace around the colormesh
Added decision_function support to OneVsRestClassifier and a test, test_ovr_single_label_decision_function, in test_multiclass.py
Updated fixes for #2012.
Strengthened tests for OneVsRestClassifier decision_function
Cleaned up tests, and removed unused multilabel parameter in decision_function_ovr
Inlined extraneous function call from decision_function and added a check that the base estimator has a decision_function attribute
Updated with more explanatory text. Also changed to use train_test_split function
Corrected some typos, and added more explanation for the precision recall example.
Clarified a few wordy sentences
Attempted a clearer explanation of precision-recall score
Adjusted to fit 80 character columns, and corrected definition of precision
Removed random_seed argument to svm.SVC
Added contributions to whats_new
Further expansion of the precision_recall curve explanation, spelling corrections, and general cleanup.
Added random_state argument to svm.SVC
Added links to cross_val_score, auc_score, and recall_score
Added link to precision_score
Updated whats_new and fixed a typo in plot_precision_recall.py
Added Windows Powershell download script
Significant refactoring, will now automatically install Python, pip, and nose. Numpy and Scipy still require manual intervention
Fixed bug with append string getting double "_py_py"
Changed to using wheel packages for scipy and numpy, currently hosted on the sklearn Rackspace account.
Additional functions, ensuring pip is correct for python 3 and passing the proper path
Registry reloading is necessary the very first time Python is installed!
Fixed PATH issues for initial Python install environment
Change directory to $PYTHONHOME to avoid encodings error
Changing directories is *not* useful for the encoding error
Added longer delay after python silent install
Added verbose output when temporarily setting paths
Updated to include git, and install all python versions by default
Added a few new features, and an example of chocolatey-style install
Missing the r in powe(r)shell
Added utility to skip tests if running on Travis
Added FIXME annotation to skipped test
Changed solver from 'dense_cholesky' to 'cholesky' to eliminate deprecation warning.
Additional skipping for omp_cv
Put check_travis in the correct place so that other tests are not skipped, and changed the CCA check to be more explicit.
Added additional skip checking for train and pickle tests
Added directory checking for documentation builds, and corrected for Windows pathing
Merge pull request #3477 from kastnerkyle/docbuild_fix
IncrementalPCA implementation
Updated what's new to add IncrementalPCA
Kyle Kelley (1):
Converted Markdown style link to restructured text
Kyler Brown (1):
fixing a typo in feature extraction documentation
LK (2):
Update dpgmm.py
Update dp-derivation.rst
Lagacherie Matthieu (1):
test case on rtfe grid_scores fix
Lars Buitinck (1261):
There is no scikits.learn.feature_extraction.text.sparse
Fix minor typo
Spelling error
Add a small part about cross-validation + copyedit
Copyedit on the "working with text" chapter
Make ball tree code safer and 64-bit clean
Cleanup lib{linear,svm} C helper routines
Spellcheck and formatting in developers' docs
typo
Updated installation instructions
Merge pull request #160 from larsmans/master
Be more explicit about coverage testing
cosmetic change to ball tree C++ code
cosmetic doc changes
cosmetic: pep8 in utils/ + rewrote factorial (2x as fast)
factorial should not use O(n) memory
Python 3-safe attempted import of factorial and combinations
Text chapter: load_files renamed load_filenames
Merge branch 'master' of github.com:scikit-learn/scikit-learn-tutorial
typos in README
typos in covariance docs
Merge branch 'master' of git://github.com/scikit-learn/scikit-learn
Merge branch 'master' of https://github.com/amitibo/scikit-learn into amitibo-naive-bayes
naive bayes: copyedit + rename alpha_i to alpha
ENH: optional and user-settable priors in multinom naive bayes
naive bayes: minor fixes
Merge sparse and vanilla naive Bayes
docs + cosmit in naive_bayes
naive bayes: handle 1-d input
ball tree cleanup & 64-bit safety
naive bayes: fix predict_proba bug and change priors behavior
fix naive bayes docs and example + credit mblondel + vanity
typo: interation/iteration + re-Cythonize cd_fast.pyx
Merge branch 'master' of github.com:scikit-learn/scikit-learn
naive bayes: test pickling
naive bayes: safe_sparse_dot, doc and docstring updates
rename MultinomialNB params, rename GNB GaussianNB
reformulate MultinomialNB as linear classifier
NB: add class_log_prior_ and feature_log_prob_ back as properties
NB cosmit: *feature* independence
cosmit: expand MultinomialNB docstring
Safer importing in grid_search module
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Merge pull request #184 from larsmans/amitibo-naive-bayes
rm references to naive_bayes.sparse in docs
NB: rename use_prior to fit_prior
slightly improved logging in a few easy cases
rm self.sparse attr in MultinomialNB; not needed outside of fit
fix priors bug in MultinomialNB
2010 is so last year
Merge branch 'mldata' of https://github.com/pberkes/scikit-learn into pberkes-mldata
Improved error handling + reduce memory use
Simplify intercept fitting in MultinomialNB
Error in MultinomialNB docs
Added naive Bayes classifier for multivariate Bernoulli models
some documentation for BernoulliNB
Do binarizing in BernoulliNB
Simplify binarizing in BernoulliNB
fix message in document classification example
Merge branch 'master' into bernoulli-naive-bayes
Optimize BernoulliNB + improve docstring + add to doc-class example
Copyedit preprocessing docs
Refactor MultinomialNB: separate prior estimation and feature counting
Use unique from utils.fixes in naive_bayes
Fix bug in MultinomialNB: output transposed
Replace loop in MultinomialNB._count with dot product + pep8
BUG: binary classification failed in MultinomialNB, +regression test
Fix 404 from broken URL in release log
ENH: fit_transform on TfidfTransformer
add C parameter to LinearSVC docstring
Revised text classification chapter
Copyright and project name in HTML footer
Fix pprett's website URL (<> caused it to be a relative URL)
Merge branch 'master' into bernoulli-naive-bayes
Refactor MultinomialNB and BernoulliNB: introduce BaseDiscreteNB
vectorize loop in BernoulliNB for 100x speedup in sparse case
svmlight reader: don't use leading _ in identifiers
Merge branch 'svmlight_format' of git://github.com/mblondel/scikit-learn into mblondel-svmlight
SVMlight reader: minor fixes
SVMlight reader: ensure C calling conventions + docstring
Plumb memory leak in SVMlight reader
SVMlight reader: one more clear() instead of delete
SVMlight reader: cosmetic
SVMlight reader: skip one level of indirection
Simplify and document SVMlight/libSVM data reader
Use C++ exception handling in SVMlight reader.
finish exception handling in SVMlight reader
Extend MultinomialNB tests to BernoulliNB
Update BernoulliNB docs
Updated README
BUG: broken doctest in BernoulliNB
Glitches in BernoulliNB and DiscreteNB (mostly docs)
Merge pull request #210 from larsmans/bernoulli-naive-bayes
SVMlight reader: memory leak, type test
(Hopefully) full exception safety in SVMlight reader
datasets/mldata.py is not a script, chmod 644
Python 2.5 and SciPy 0.7 (tentative) compat in mldata
Fix broken doctest in mldata
document placement new in SVMlight reader
Remove ugly BLANKLINE stuff from text chapter
Use grid_scores_ instead of _get_params + small fixes in text chapter
fit_transform does NOT return self + other docfixes
Parallel vectorizing is slower than serial
Rewrote SVMlight parse_line with C++ iostreams
SVMlight reader: some extra tests + cleanup
Adapt kNN classifier to sparse input
Use new utils.atleast2d_or_csr in naive_bayes as well
document placement new in SVMlight reader
Document sparsity in k-NN
Correctly document sparse input possibilities in naive_bayes
Merge branch 'master' into sparse-knn
Add sparse k-NN test, fix a bug
Extend sparse k-NN test to try pairs of sparse matrix types
Fix bug in sparse k-NN and add disabled (!) test for sparse regression
Better document scipy.sparse support in neighbors module
Prevent some copying in neighbors + docstring for euclidean_distances
Use 10 neighbors in k-NN document classification
neighbors: check string equality with ==, not is
Copyedit SparsePCA docs
Copyedit SparsePCA docs
Merge pull request #219 from larsmans/sparse-knn
Some doc copyediting
Change normalization behavior in TfidfTransformer
Docfixes in feature_extraction.text
Remove bogus sparse vectorizing tests
docfixes in feature_extraction.text
document classification example doesn't demo only linear classifiers anymore
make parse_file in SVMlight reader static
Fix broken doctest in NeighborsRegressor
Search tfidf__norm space in text class. grid search example
Merge pull request #228 from larsmans/tfidf
Use four categories instead of all in doc. class. example
Optimize CountVectorizer.fit_transform (+ minor refactoring)
pep8 feature_extraction.text + rm content word "computer" from stop list
DOC: Expand and copyedit naive Bayes docs
Recythonize libsvm.pyx with Cython 0.14
Refactor/simplify CountVectorizer
Refactor feature_extraction.text (again) to use Counter
Replace mixture.logsum with numpy.logaddexp
on demand inverse vocabulary
Implement fit_transform for Vectorizer as well and document it
Default argument safety + cosmit in feature_extraction.text
typo
DOC fixes in datasets
Merge pull request #234 from larsmans/inverse-vectorizer
FIX hmm.py to succeed tests; stopgap, put old logsum.py in that module
FIX and ENH feature_extraction.text.CountVectorizer
default arg safety + docfixes
Started one-hot transformer
Merge branch 'master' of github.com:scikit-learn/scikit-learn
FIX broken test for CountVectorizer
Revert "Started one-hot transformer"
DOC grid_search + pep8
Refactor naive_bayes and don't treat BernoulliNB as linear model
ENH show top 10 terms per category in document classifier example
DOCFIX typos in svm module
cosmetic changes to DBSCAN
vectorize loop in DBSCAN with np.where
Cosmit DBSCAN test
DOCFIX DBSCAN: we use arrays, not matrices
Streamline imports in lfw.py: don't try anything with PIL
Restore conditional PIL import in datasets.lfw
DOC: copyedit docstrings in pls.py + (almost) pep8-clean
pep8 and docfixes in various modules
pep8 and docfixes for LLE
suppress division by zero warnings from precision, recall, f1
simplify np.seterr handling in sparse_pca
Use isinstance instead of the ancient (Py2.1) types module in fastica
FIX handles NaNs in LogisticRegression, and many more classes
assert_all_finite: pre-check if we're dealing with floats.
Rework assert_all_finite and related functions in utils
callable now actually allowed in fastica
disallow sparse input in dense liblinear
Merge pull request #259 from larsmans/input-validation
Chmod 644 feature_extraction/image.py: not a script
FIX more useful diagnostics in mlcomp_sparse_document_classification.py
Add χ² feature selection
Demo chi2 feature selection on document classification
document Chi2 feature selection
ENH test and fix chi2 feature selection
Rename f_chi2 to chi2
avoid mutable default arguments
s/euclidian/euclidean/g
More mutable default args
ENH decorator to mark functions and classes as deprecated
New-style deprecation of datasets.load_files
deprecated decorator won't work on __init__; skip it
make deprecated work on classes
typo
ENH optimize euclidean_distances for memory use
document and test sparse matrix support in euclidean_distances
ENH optimize idf computation in TfidfTransformer using np.binsort
ENH and DOC TfidfTransformer
FIX add idf smoothing to Vectorizer as well, defaulting to True
More specific exception in GaussianProcess + regression test
(Micro)optimization in DBSCAN
fix DBSCAN bug (oops)
new-style deprecation of load_20newsgroups
ENH set_params method on BaseEstimator, deprecate estimator params to fit
set_params: update according to @GaelVaroquaux's review
Rm k param from KMeans.fit again
DOC improve fbeta docstring
minor fixes in clustering metrics
cosmetic changes to ari_score
rename ari_score adjusted_rand_score
scikits.learn -> scikit-learn + url of Numpy
s/scikits\.learn/sklearn/g
pep8 sklearn/utils/__init__.py
refactor linear models to call as_float_array only from _center_data
unconditionally call as_float_array in LinearModel._center_data
update cluster docs (DBSCAN)
DOC: fix typos
DOC small stuff in base.py and multiclass.py
trees: don't use deprecated cross_val, error messages, use super
typo: threhold -> threshold
DOC minor editing to naive_bayes docs
Merge branch 'tmp'
rename overwrite_Foo params to copy_Foo (and inversed their meaning)
document overwrite_ -> copy_ API change in ChangeLog
BUG LinearSVC.predict would choke on 1-d input (+ regression test)
more helpful error message in SGDClassifier.predict_proba with wrong loss
Merge pull request #357 from larsmans/overwrite-to-copy
fix doctest failures in linear_models docs
refactor and simplify naive_bayes
prevent some copying in sparse SGD
BUG adapt text feature grid search example to new 20news loader
BUG fixed and cosmetics in CountVectorizer
BUG + optimization in GaussianNB
refactor common code of NB estimators into BaseNB class
Refactor/simplify naive Bayes tests
API change: 1-d output from BaseNB.predict_(log_)proba in binary case
ENH SGD error messages better still
FIX embarrassing SyntaxError in linear_model.base
BUG multiclass.predict_binary still relied on old MultinomialNB.predict_proba
DOC prob_predict -> predict_proba in SVM docstrings
Revert "BUG multiclass.predict_binary still relied on old MultinomialNB.predict_proba"
Revert "API change: 1-d output from BaseNB.predict_(log_)proba in binary case"
refactor SVMlight reader and writer
API change in SVMlight reader: handle multiple files with svmlight_load_files
Retry "BUG fixed and cosmetics in CountVectorizer"
CountVectorizer.fit_transformer refactoring, part N
Micro-optimize NMF for memory usage: topic spotting example down by ~17%
Replace two more flatten()s in NMF with ravel()s
FIX broken doctests in NMF + pep8
Allow sparse input to NMF
NMF: cosmit
Refactor ensemble learning code
FIX Issue 379 and use the opportunity to refactor libsvm code
DOC copy-edit naive bayes doc, with an emphasis on the formulas
COSMIT in chi² feature selection
DOC ported latexpdf target from Sphinx 1.0.7-generated Makefile
DOC typos in Ward tree docstring
COSMIT little things in hierarchical.py
BUG NMF topic spotting example would output n_top_words-1 terms
DOC explain multiclass behavior in LogisticRegression
COSMIT pep8 feature_extraction.text
DOC some stuff on input validation
ENH Cython version of SVMlight loader
ENH accept matrix input throughout
COSMIT rename safe_asanyarray to safe_asarray to prevent confusion
DOC correct Google URL
pep8 grid_search.py
FIX replace np.atleast_2d with new utils.array2d
DOC correct and clean up empirical covariance docstrings
ENH test input validation code on memmap arrays
Merge pull request #410 from larsmans/accept-matrix-input
ENH sample_weight argument in discrete NB estimators
BUG handle two-class multilabel case in LabelBinarizer
TEST better test for binary multilabel case in LabelBinarizer
ENH multilabel learning in OneVsRestClassifier
DOC OneVsRestClassifier multilabel stuff
ENH multilabel support in SVMlight loader
DOC multilabel classification in narrative docs
FIX Python 2.5 compat in utils/tests
COSMIT multiclass.predict_ovr
DOC expand Naive Bayes narrative doc (BernoulliNB formula)
COSMIT in naive_bayes
ENH prevent copy in sparse.LogisticRegression
Revert "ENH prevent copy in sparse.LogisticRegression"
DOC typos and style in linear_model docs
COSMIT cleanup sgd Cython code
DOC update cross validation docstrings for default indices=True
BUG handle broken estimators in grid search by cloning them
ENH don't require numeric class labels in SGDClassifier
BUG fix SGD doctests
BUG fix Naive Bayes test + refactor module
DOC typo
ENH support array-like y (lists, tuples) in GridSearchCV
ENH support arbitrary labels in metrics module
COSMIT rm comment in coord descent code about np.dot
COSMIT no need for csr_matrix "cast" in coord descent
ENH prevent copy in PCA if not necessary
FIX use super consistently in SVMs
ENH incrementally build arrays in SVMlight loader to reduce memory usage
Merge pull request #446 from larsmans/svmlight-loader-memory-use
DOC typos in ensemble.forest
drop Python 2.5; no more with statements from the __future__
drop Python 2.5; no more need for utils.fixes.product
drop Python 2.5; document and rm some workarounds for kwargs quirks
COSMIT rm some SciPy pre-0.7 compat code
raise TypeError instead of ValueError in check_arrays
COSMIT docstring fix + US spelling in K-means code
DOC I don't think Ubuntu 10.04 will be the last LTS release
test @deprecated using warnings.catch_warnings
COSMIT use utils.deprecated as a class decorator
don't use assert_in, not supported by nose on buildbot
Revert "FIX: more python2.5 SyntaxError"
Revert "FIX: python2.5 SyntaxError"
COSMIT use urlretrieve and "with" syntax in LFW module
COSMIT use ABCMeta in naive_bayes
COSMIT a few more easy cases of with open syntax
rm Py2.5 compat factorial and combinations from utils.extmath
use cPickle in spectral clustering tests
COSMIT use Python 2.6 except-as syntax
DOC rm Methods section from KMeans docstring
BUG typo in NB error msg
DOC fix datasets.load_digits example
DOC fix datasets.load_digits example, second attempt
COSMIT rename load_vectorized_20newsgroups + DOC + pep8
Merge pull request #2 from mblondel/multilabel
BUG only handle labels specially in SVMlight loader + multilabel
BUG fix off-by-one error in SVMlight format loader
DOC multilabel learning: note that it's experimental + @mueller's remark
DOC document svmlight file loader changes in changelog
COSMIT reorganise utils tests
TST add test for sklearn.utils.extmath.logsum
DOC copyedit kernel approximations docstring
DOC kernel approximations, some last bits
DOC unbreak kernel approx docstrings (UTF-8 + s/References/Notes/g)
Merge branch 'master' into multilabel
ENH add multilabel_ property to OvR and raise NotImplementedError in score
ENH demo sparse KMeans on 20news set (it's slow!)
Merge remote-tracking branch 'vene/lars_multilabel' into multilabel
BUG forget a return keyword in OvR classifier
DOC describe test_ovr_multilabel better
TST extra test for LabelBinarizer's multilabel behavior
COSMIT set union in LabelBinarizer
ENH improve stoplist handling in feature_extraction.text
DOC rm References sections in docstrings
DOC I broke the docs and I liked it
COSMIT make BaseLibSVM an abstract base class
BUG input validation in kernel approximations + pep8
BUG fix Vectorizer to play nicely with Pipeline
Revert "BUG Disallow negative tf-idf weight"
PY3K fix in datasets.samples_generator
scikits.learn -> sklearn migration in label propagation
BUG don't pass estimator params to fit in label propagation
DOC cosmetics in SVM docstring
COSMIT reintroduce ABCMeta into BaseSGD*
BUG refactor SGD classes to not store sample_weight
COSMIT rm unused svm.base.dot
BUG use ValueError in BaseLibSVM.coef_
BUG update test for SVMs raising ValueError for coef_
COSMIT remove superfluous imports in svm/sparse/base.py
BUG don't use deprecated attributes in GaussianNB.predict
remove deprecated Neighbors{Classifier,Regressor}
ENH raise ValueError in metrics instead of AssertionError
ENH intercept_ on linear OvR clf + change exception to AttributeError
DOC pep257, or "sentences end with a full stop"
ENH input validation in DBSCAN
DOC rm confusing line in BernoulliNB docstring
FIX small stuff in new tomography example
factor out some common code in dense/sparse SGD
prevent a copy in SGD regressor fitting
refactor SGD, part 2: simplify parameter passing
refactor SGD, part 3: factor out more sparse/dense common code
COSMIT rm no-op conversion in SGDRegressor
BUG restore symbolic class label support in SGD + test it
ENH merge dense/sparse LinearSVC, part 1: no more SparseBaseLibLinear
ENH merge dense/sparse LinearSVC, part 2: no more sparse.CoefSelectTransformer
ENH merge dense/sparse LinearSVC, part 3: deprecate sparse.LinearSVC
ENH merge dense/sparse LinearSVC, part 4: deprecate sparse.LogisticRegression
DOC reference for logistic regression training with liblinear
COSMIT refactor liblinear bindings
TST merge dense and sparse LogisticRegression tests
Merge branch 'master' into merge-linearsvcs
COSMIT fix ugly import, left over from LinearSVC refactoring
DOC put merged LinearSVC and LR in changelog + explain @mblondel's work
BUG fix SGD doctest
Merge pull request #561 from larsmans/merge-linearsvcs
BUG promote type-safety in murmurhash
BUG make coef_ 1-d in Naive Bayes for binary case
BUG replace assert by custom exceptions
COSMIT refactor SGD code further
Revert "COSMIT refactor SGD code further"
ENH merge sparse and dense SVMs, part 1
ENH merge sparse and dense SVMs, part 2
ENH merge sparse and dense SVMs, part 3: adapt sparse tests
DOC merge sparse and dense SVMs, part 4
Merge pull request #576 from larsmans/merge-svms
DOC improve intro to Git in the developers' documentation
DOC rm unused param from sparse.ElasticNet docstring
COSMIT abstract base class in univariate feature selection
ENH sublinear tf scaling in TfidfTransformer
DOC s/with dense data// in merged SGD module
refactor SGD regression input validation + doc fixes
ENH more generic dict-like test in CountVectorizer
DOC typos in whats_new
DOC typos
DOC typo
COSMIT refactor SGD with Dataset factory function
COSMIT rename _mkdataset function in SGD
ENH add DictVectorizer
ENH test feature_extraction.DictVectorizer
DOC syntax error in DictVectorizer docstring
COMPAT turns out collections.Mapping has an iteritems member
ENH add test for DictVectorizer.restrict
DOC + ENH DictVectorizer: complete docs, add dict_type param
COSMIT disable liblinear I/O code
ENH implement one-of-K/one-hot coding in DictVectorizer
COSMIT rename DictVectorizer source files
ENH optimize DictVectorizer (sparse case)
TEST more strict test for one-of-K coding in DictVectorizer
DOC narrative documentation for DictVectorizer
DOC + pyflakes in DictVectorizer
ENH reduce memory usage of DictVectorizer.transform in sparse case
BUG fix doctests for DictVectorizer (nose 0.X compat)
Merge branch 'dictvectorizer'
COSMIT simplify input validation in KMeans
DOC small fixes to NearestCentroid classifier
BUG disallow shrinking with sparse data in NearestCentroid
DOC typos, line-width and minor stylistic fixes in pipeline module
COSMIT shallow copy of steps in Pipeline + code style
Merge pull request #741 from ogrisel/sorted-dictvectorizer
COSMIT use sorted instead of list.sort in DictVectorizer
DOC small fixes to DictVectorizer documentation
BUG fix issue #753, "Sparse OneClassSVM missing argument to super()"
BUG re-allow zero-based indexes in SVMlight files
COSMIT replace utils.testing.assert_in with Nose-compatible functions
DOC + FIX DictVectorizer: actually support single Mapping arg in transform
ENH zero_based="auto" support + better n_features=None in load_svmlight_files
COSMIT vanity + license for ArrayBuilder
COSMIT refactor SVMlight loader
ENH fit_predict convenience method on KMeans and MiniBatchKMeans
Merge pull request #729 from larsmans/fit-predict
COSMIT pep8 SVMlight loader
BUG close files in time in SVMlight loader (with statement)
TEST + FIX zero_based="auto" behavior in SVMlight loader
DOC + PEP8 SVMlight loader
Merge pull request #756 from larsmans/svmlight_fix
DOC typo
COSMIT pep8 document classification example
DOC typo in example
DOC clarify zero_one_score
DOC typo
revert PLS param rename + move input validation out of loop
BUG chi² feature selection didn't work for COO matrices
ENH export f_oneway from feature_selection module
BUG ensure that SelectKBest actually selects k features
DOC clarify __check_build messages
DOC instruct new devs to *always* work in branches
COSMIT pyflakes + pep8 linear_model/base.py
ENH generalize LabelBinarizer to arbitrary Sequence types
BUG remove debugging statements from multiclass
BUG in LabelBinarizer (forgot to run the full testsuite)
DOC fixed sentence that was missing a verb
rm deprecated euclidian_distances synonym
ENH fix and test LabelBinarizer's handling of string labels
ENH import liblinear 1.91
COSMIT make a liblinear C private helper function static
BUG set new p parameter in liblinear helper
ENH support opening compressed files in SVMlight reader
ENH always support file descriptors in SVMlight loader
DOC typo in docstring
BUG do not close fd passed by user in SVMlight loader
FIX NearestCentroid.fit could not handle sparse formats other than CSR
DOC typo
DOC fix dead link
DOC + COSMIT additive chi² sampler
ENH scipy.sparse support in additive chi² sampler
DOC output from additive chi² sampler
COSMIT refactor input validation code and tests
COSMIT + DOC input handling and docstrings in RandomizedPCA
ENH classes_ on OvR classifier
DOC typos
COSMIT remove some dead code
BUG remove predict{_log,}_proba from SVR
COSMIT cleanup tests with pyflakes
ENH better input validation for dump_svmlight_file
ENH make generated SVMlight files self-describing in a comment
COSMIT don't call magic methods directly
ENH allow user-specified comment in SVMlight dumper
rm the long-deprecated scikits.learn package
TST: improve coverage of feature_selection.SelectorMixin
COSMIT suppress warning from qr_economic + docstring on Counter
TST absolute imports in spectral clustering tests
ENH more specific warning filter for qr_economic
TST upgrade trivial (single-class) k-NN problems to binary ones
DOC + TST vocabulary arg in CountVect docstring
COSMIT move BaseSGD to its only place of usage
COSMIT minor refactoring of SGD
DOC tutorial: explain what an estimator is
DOC rewrote logistic regression docs
DOC yet another AKA
DOC copyediting
TST (near-)empty lines and explicit zeros in SVMlight loader
COSMIT use property.setter in sklearn.svm
ENH performance of TfidfTransformer
COSMIT replace useless safe_sparse_dot in chi2 with np.dot
BUG fix broken top-10 features printing in text clf example
DOC copyedit HMM documentation
COSMIT const and void* correctness in liblinear wrapper
ENH refactor liblinear prediction code and add classes_ member
COSMIT liblinear C code cleanup
COSMIT comment out more unneeded liblinear code
DOC + COSMIT LogisticRegression: docstring + rewrite predict_proba
Merge pull request #1141 from pprett/sgd-predict-proba
DOC small fixes to SGD docstrings
COSMIT rm svm.sparse tests to prevent deprecation warnings
ENH micro-optimizations in SVMlight loader
BUG rm RidgeClassifier from 20newsgroups
Merge pull request #1143 from larsmans/refactor-liblinear
ENH no more distinction between "sparse" and "dense" LinearSVC
COSMIT rm deprecated SGDClassifier.classes property
COSMIT clarify L1/L2 LR sparsity demo
DOC fix link for IsotonicRegression
DOC fix IsotonicRegression docstrings
BUG allow array-like y in RFE
DOC RFE docstring + link RFECV in narrative docs
BUG rm LARS from linear_model.__init__
COSMIT refactor linear classifiers
TST improve Ridge test
COSMIT use LinearClassifierMixin in RidgeClassifier
COSMIT + DOC univariate feature selection
COSMIT re-indent docstring for safe_mask
BUG make GridSearchCV work with non-CSR sparse matrix
COSMIT rm deprecated class_weight from fit in Ridge
Revert "BUG rm RidgeClassifier from 20newsgroups"
ENH add max_iter argument to Ridge estimators
DOC Ridge improvements in whats_new
Merge pull request #1169 from larsmans/ridge-cg
COSMIT rm deprecated stuff -- lots of it
DOC rm references to deprecated stuff
TST writable coef_ and intercept_ on LogisticRegression
ENH let DictVectorizer build a CSR matrix directly and use array.array
DOC DictVectorizer returning CSR in ChangeLog
Merge pull request #1193 from larsmans/dictvectorizer-csr
COSMIT error messages in GenericUnivariateSelect
ENH perform feature selection on scores, not p-values, when possible
DOC some improvements to FeatureUnion docs
DOC LaTeX error in SVM narrative docs
ENH better error messages in CountVectorizer for empty vocabulary
TST CountVectorizer with empty vocabulary
Merge pull request #1208 from larsmans/check-empty-vocabulary
Merge pull request #1211 from kcarnold/gitattributes
DOC typos in README
DOC feature selection by scores instead of p-values
DOC various typos and other minor stuff
DOC clarify zero_based's implications in SVMlight loader
Merge pull request #1204 from larsmans/mi-feature-selection
BUG + DOC l1_ratio in SGD and CD
COSMIT correct error msgs in SGD and make them more consistent
Merge branch 'pr/1214'
DOC let BibTeX handle its own capitalization, except for {P}ython
BUG NaN handling in SelectPercentile and SelectKBest
COSMIT rm unused import
COSMIT website address + copyedit in __init__.py
DOC move implementation details on mixins to comments
Revert (rebased) merge of euclidean_distances speedup
ENH allow more than 1000 linear SVMs with custom random seeds
BUG halve the number of LinearSVCs
COSMIT use np.clip in SGD
ENH fit_transform on KMeans
ENH input validation in chi2, error for negative input
Merge branch 'master' into pr/1279
ENH OneHotEncoder docs + TypeError + test active_features_
ENH cut down on memory use of text vectorizers
DOC copyedit tutorials
COSMIT rm outdated file of changes to liblinear
Merge pull request #1335 from robertlayton/clustdocs
DOC typo in k-means docs
Merge pull request #1366 from agramfort/move_isotonic
DOC grammar in isotonic regression narrative docs
ENH feature hashing transformer
DOC narrative documentation for feature hashing
ENH speed up hashing and reduce memory usage by 1/3
ENH allow (feature, value) pairs in FeatureHasher
ENH 20newsgroups example for FeatureHasher
ENH + DOC FeatureHasher
ENH add dict support to FeatureHasher and make it the default input_type
Merge pull request #1374 from jakevdp/doc_GA_flag
BUG enforce and document max. n_features for FeatureHasher
DOC update Ubuntu installation instructions
FIX smoothing in Naive Bayes and refactor the discrete estimators
COSMIT no diff for pairwise_fast.c
DOC credit @sjackman in what's new for BernoulliNB fix
COSMIT refactor input validation code; skip some issparse calls
BUG Cholesky delete routines wouldn't compile on Solaris
COSMIT simplify unique_labels in sklearn.metrics
COSMIT shut up the build by calling np.import_array in Cython modules
Merge pull request #1556 from larsmans/cython-cleanup
COSMIT wrong path in .gitattributes
Update sklearn/metrics/metrics.py
update year in copyright notices
BUG don't write comments in SVMlight dumper by default
BUG hotfix for issue #1501: sort indices in SVMlight i/o
DOC fix travis URLs in README
TST sorting CSR matrix indices in SVMlight file handling
DOC improve cosine similarity docs
COSMIT make BaseVectorizer a mixin
DOC copyedit HashingVectorizer docs
Merge pull request #1598 from amueller/naive_bayes_class_prior_rename_revert
COSMIT rm deprecated svm.sparse module
COSMIT rm deprecated attrs from [LQ]DA
BUG last references to svm.sparse
COSMIT rm deprecated stuff
BUG fix failing doctest
BUG one more failing doctest
BUG move label_ from BaseLibSVM to BaseSVC
COSMIT decouple regression and classification in SVMs
BUG in RadiusNeighborClassifier outlier handling
Merge pull request #1576 from mrorii/fix_kneighbors
ENH rewrite radius-NN classifier's outlier handling
COSMIT translate lgamma replacement to C and clean it up
COSMIT add lgamma to gitattributes
DOC update SMART notation in TfidfTransformer docs
P3K: use print as a function in the examples
ENH refactor univariate feature selection
P3K use six.string_types and six.PY3
P3K one more iteritems
COSMIT rm Python 2.5 and Jython compat from six
BUG fix import problem in preprocessing
P3K StringIO vs BytesIO
DOC fix failing doctest due to unicode_literals
DOC whitespace in doctest
BUG revert P3K changes that broke mldata tests
rm gender classification example
P3K death to the print statement
P3K fix broken doctest and add forgotten print_function import
DOC no more need for compute_importances in trees
DOC copyedit FeatureHasher narrative
ENH move covtype loading to sklearn.datasets
TST covertype loader
DOC copyedit FeatureHasher narrative further
P3K range vs. xrange
Merge pull request #1524 from amueller/break_ovo_ties
DOC pretty math in kernel docstrings
BUG MinMaxScaler missing from preprocessing.__all__
BUG in KernelPCA: wrong default value for gamma
Merge pull request #1688 from hrishikeshio/fit_transform
ENH speed up RBFSampler by ~10%
BUG oops, removed validation by accident
BUG fix broken grid search example
COSMIT update mailmap
ENH sparsify method for L1-reg linear models
DOC developer guidelines for unit tests and classes_
DOC dev guide: random_state_ + @amueller's remarks
DOC r2_score may return negative values
Merge branch 'sparse-coef'
COSMIT callable instead of hasattr __call__
DOC rm failing doctest on graph_laplacian
DOC fix text vectorizer docs and add NLTK example
DOC fix broken doctests for feature_extraction.text
BUG restore empty vocabulary exc in CountVectorizer
ENH prevent copying of indices in CountVectorizer
DOC credit @ephes
Merge pull request #1713 from larsmans/vectorizer-memory-use
COSMIT use callable instead of hasattr
Merge pull request #1727 from amueller/min_max_scaler_fix
BUG broke the what's new while rebasing
ENH set min_df in fe.text back to 1
TST compute_class_weight in utils
FIX + TST + DOC compute_class_weight
ENH use bincount in compute_class_weight
BUG use fixes.unique
BUG in SVM tests
BUG fix compute_class_weights issue in SGD
Merge pull request #1753 from NelleV/FIX
P3K some more fixes in random places
DOC OpenBLAS is more dangerous than I thought
DOC oops, typo
COSMIT get rid of undocumented attributes on SVMs
PEP8 and allow non-bool truth values in CD
BUG + ENH: removal of components in kernel PCA
Merge pull request #1758 from larsmans/kernelpca-fix
P3K make feature_extraction.text work
BUG failing doctest
DOC IsotonicRegression wasn't in the changelog at all
P3K all of feature_extraction passes tests on Py2 and 3
DOC clarify column ordering in SVC scores
COSMIT DictVectorizer.inverse_transform readability
DOC CountVectorizer does NOT do stopword filtering by default
ENH don't recompute distances in MBKMeans
ENH cut MiniBatchKMeans memory usage in half for large n_clusters
DOC installation instructions: MacPorts, fix types, stdeb instructions
Merge pull request #1773 from jnothman/prf_docstring
BUG StandardScaler would ignore with_std for CSR input
bring text classification somewhat closer to current API
BUG SGDClassifier and friends did not forget labels_ in re-fit
DOC clarify C parameter on LogisticRegression
TST + DOC + COSMIT refactor ParameterGrid and test it
ENH len on ParameterGrid and ParameterSampler
BUG deprecation of grid_scores_ in GridSearchCV
BUG always do cross-validation in GridSearchCV
DOC fix clone and get_params documentation
TST grid search/randomized search on non-BaseEstimator
TST actual sparse input in sparse k-NN tests
COSMIT prevent a copy in randomized LR
TST speed up comment tests by ~20%
TST radius-neighbors regression test not entirely stable
BUG additive_chi2 missing in KERNEL_PARAMS
BUG + DOC fix Nystroem for other kernels than RBF
COSMIT rm repetitive __main__ blocks from tests
ENH allow additional kernels on KernelPCA
TST fix broken doctest
P3K developer docs
Merge branch 'pr/1790' -- Python 3 support from PyCon sprint
Merge pull request #1812 from kmike/testing-fixes
DOC describe SVM probability calibration (and advise against it)
DOC further comments on SVM probabilities
ENH multiclass probability estimates for SGDClassifier
BUG digits grid search was passing cv to the wrong method
DOC typos in grid search docstrings
PY3 + TST decouple test_metrics from random module
Merge pull request #1836 from kmike/master
DOC distributions produced by hashing trick depend on input
DOC multiclass: typo and use case
DOC PR means pull request
FIX BytesIO and urllib usage in fetch_olivetti_faces
DOC I didn't mean soft-O by "tilde notation"
DOC describe API, not internals, for AdaBoost
DOC replace "arithmetical order" in AdaBoost docs
TST strengthen AdaBoost tests
FIX SVR complaining about a single class in the input
COSMIT do np.unique(y) once in SVC
DOC rewrite description of k-fold CV
mailmap entry for @lqdc
DOC define validation before cross validation
DOC typos in cross-validation description
clean up mailmap/deduplicate contributors
BUG disable memory-blowing SVD for sparse input in RidgeCV
FIX DictVectorizer behavior on empty X and empty samples
TST + DOC AdaBoostClassifier.predict_proba fix
COSMIT refactor AdaBoost code
ignore PDFs
DOC move old tutorial out of the way for merge
Merge branch 'tutorial'
ENH speed up sklearn.feature_selection.chi2
DOC dependency installation with yum (Red Hat, CentOS)
FIX bug (swapped args) in chi2
FIX yet another chi2 bug
ENH add latent semantic analysis/sparse truncated SVD
ENH use rnd SVD in TruncatedSVD by default for speed
COSMIT omit unused parameter/return value in svd_flip
TST strengthen TruncatedSVD tests
DOC + MAINT deprecate RandomizedPCA scipy.sparse support
FIX and link LSA clustering example
DOC explain normalization in LSA KMeans example
Merge pull request #1716 from larsmans/truncated-svd
FIX metrics/scoring bug with LeaveOneOut CV
MAINT remove deprecated gprime handling from FastICA + refactoring
Merge pull request #2067 from jnothman/test_binarizer
DOC no more mention of the Bunch in the narrative docs
FIX don't rely on Bunch behavior with fetch_covtype
DOC fix some docstring/parameter list mismatches
DOC fix RandomizedPCA docstring for n_components=None
ENH allow empty grid in ParameterGrid
MAINT ignore kernprof.py reports
DOC ParameterGrid on lists
Merge pull request #2082 from larsmans/empty-parameter-grid
DOC fix V-measure docstring
MAINT dedup Clay Woolam's contribs (>100 commits!)
FIX/ENH mean shift clustering
DOC typo
ENH micro-optimize RFECV
COSMIT refactor LibSVM wrapper for safety and readability
DOC fix some broken URLs
FIX charset -> encoding in load_files
DOC typo
Revert "FIX charset -> encoding in load_files"
FIX verbose output from k-means
FIX remove params from RandomizedSearchCV
FIX charset -> encoding in load_files
FIX search bug introduced in 1327057f4258f41712ecab5c94770aac5ff01982
FIX inconsistent attributes shapes in naive Bayes
FIX test failure in naive Bayes
FIX failing doctest for CountVectorizer
Merge pull request #2027 from mblondel/select_categorical
FIX copy in OneHotEncoder and _transform_selected
ENH optimize KMeans for sparse inputs
FIX KMeans bug; argsort result apparently not always C-contiguous
DOC what's new: faster KMeans
DOC more explicit description of degree param on SVMs
COSMIT pep8
ENH order *does* matter for sparse matrices
FIX get rid of the last few asanyarray calls
DOC fix erroneous docstring on preprocessing._transform_selected.
MAINT: dedup @jakevdp and @jnothman in mailmap
COSMIT simplify printing of number of fits in grid search
COSMIT fix a docstring in feature_extraction.text
P3K developer docs
TST r2_score float32 overflow fix
Revert "TST r2_score float32 overflow fix"
PY3 use urllib2 or urllib.request, based on Py2/3
DOC let OneHotEncoder, DictVectorizer and FeatureHasher refer to each other
DOC correct class_weight description for LogisticRegression
FIX memory usage in DictVectorizer.fit
ENH back-port rand_r from 4.4BSD
FIX move rand_r to tree module for now
DOC 20news filtering with smaller set and MultinomialNB
PY3 fix string literal syntax error
TST skip Graphviz export docstring in trees
TST use TruncatedSVD in random forest tests
COSMIT refactor random forests
COSMIT refactor forests, part 2
FIX faulty import in 20news docs
ENH fit_inverse_transform for FastICA
DOC document mixing_ attr on FastICA
COSMIT attribute checking in FastICA
COSMIT explicit None check in naive Bayes
ENH simplify the Scorer API
FIX bug in scorers that take probabilities
COSMIT RBM test in usual nose style + moved to proper module
BUG + COSMIT + ENH RBMs
Merge branch 'pr/1954'
MAINT _logistic_sigmoid.c is "binary"
PY3 fix RBM test
DOC copyedit RBM docstrings
DOC pep257 + c/e in sklearn.base
TST fix string labels in metrics tests
DOC copyedit preprocessing docs
MAINT ignore profiling results from kernprof.py
DOC copyedit KernelCenterer docstring
DOC minimal kernel centering narrative docs
DOC minor copyedit to FS docs
Merge pull request #2230 from pprett/neighbors-segfault-fix
TST catch deprecation warning in feature_extraction.text
Merge branch 'pr/2246'
DOC correct/copyedit linear model docstrings
FIX inline rand_r to fix build on Windows
DOC add an extremely simple classifier code example to dev docs
ENH rewrite multiclass_log_loss, rename log_loss, document it
ENH Scorer object for log loss
ENH add log_likelihood_score as -log_loss
PY3 new overfit prevention stuff in 20newsgroups loader
DOC SGDClassifier has multiclass predict_proba
DOC minor copyedit to narratives
FIX don't use old scoring API in randomized search
FIX use category and stacklevel=2 for {loss,score}_func
ENH speed up BernoulliNB's predictions
DOC "creating features" -> "feature extraction" + minor stuff
Revert "ENH add log_likelihood_score as -log_loss"
DOC copyedit example docstring
DOC XHTML fixes (unclosed tags, type="text/javascript")
ENH speed up logistic_sigmoid (using less code)
FIX make BaseSGDClassifier an ABC
Merge pull request #2295 from larsmans/fast-sigmoid
DOC credit to @ephes and myself for log loss in metrics
DOC fix example comments
DOC typos involving Nyström
DOC copyedit SGDClassifier docstring
DOC improve docstrings in sklearn.base
FIX integer types in Ward clustering
MAINT deprecated ENet param used in doc and benchmark
MAINT remove deprecated parameters (the easy cases)
FIX rm spectral_embedding import from sklearn.cluster
DOC typo
FIX + COSMIT Reuters out-of-core example
FIX py2.6 compat in biclustering example
MAINT remove Counter from fixes; no longer used
COSMIT refactor Hungarian algorithm
DOC while we're at it, link to our own RBM docs
DOC make DictVectorizer docstring refer to FeatureHasher
FIX remove warnings from univariate FS
ENH add VarianceThreshold feature selection method
Merge pull request #2308 from pprett/gbrt-check-supported-loss
DOC clarify distances in KMeans' _labels_inertia
COSMIT skip some repeated computations in k-means
DOC ASCII only in docstrings
ENH speed up NMF (about 30% off topic extraction runtime)
ENH export randomized_svd publicly
Merge pull request #2210 from pgervais/distances_argmin
Revert "CSR matrix support in pairwise_distances_argmin_min"
wip
ENH prettier output from NMF example
FIX failing test in NMF due to negative zeros
DOC broken link in HMM narrative
ENH micro-optimize pairwise_distances_argmin_min
BUG duplicate finity check in input validation
MAINT dedup Brandyn A. White in mailmap
MAINT remove useless deprecation in sklearn.utils
BUG rename n_iterations to n_iter in TruncatedSVD
ENH SGD Cython improvements
DOC boolean masks in CV generators are deprecated
DOC what's new: fast_dot is internal, so don't mention it
Merge branch 'bagging'
DOC copyedit DBSCAN implementation notes
DOC output types should never be "array_like"
DOC+COSMIT: typos, lots of them
ENH: don't call astype when a copy is not needed
BUG oops, *safe_* asarray
DOC improve encoding docs
MAINT remove deprecated code from CD
Partially revert "MAINT remove deprecated code from CD"
TST speed up biclustering tests
MAINT remove dead code from LibSVM
ENH 10x speedup in dump_svmlight_format
DOC: NMF narrative: describe optimization problem
Merge pull request #2426 from larsmans/sgd-improvements
Merge pull request #2457 from untom/rbm_csr_format
MAINT zap unused import
DOC optimized PNG images with OptiPNG
DOC remove qda and faces from website carousel
DOC replace jquery by minified version
Merge pull request #2479 from larsmans/website-speed
DOC typo: polynominial
MAINT list @paulgb's full name
DOC improve biclustering docstrings
ENH sparse matrix support in pairwise + optimizations
ENH optimize NMF inner loop
COSMIT pyflakes feature_extraction.text tests
DOC error in feature_extraction.text docstrings
MAINT add authors to validation.py and pairwise.py
MAINT: use python setup.py clean in Makefile
DOC docstring for extmath.norm
ENH refactor squared-norms computation to extmath
ENH use row_norms in KDE code
FIX fit followed by partial_fit in multiclass SGD
MAINT use subprocess.call, not os.system
COSMIT remove dead code in k-means
Merge branch 'refactor-squared-norms'
ENH honor Y_norm_squared when X=Y in euclidean_distances
COSMIT use norm function in feature selection
MAINT simplify f_oneway in feature selection
DOC norm optimizations in what's new
MAINT remove mlcomp document classif. example
ENH optimize one more sq. distance in k-means
TST fix broken fast_dot test
COSMIT friendlier output from faster NMF benchmark
ENH micro-optimize NMF inner loop
TST fix still broken test_fast_dot
DOC sparse matrix support in BernoulliRBM
FIX sparse matrix indexing in BernoulliRBM
TST clean up after ourselves in SVMlight test
COSMIT micro-optimize norm computation in NMF
DOC fix cross-decomposition docstrings
DOC remove confusing comment from TruncatedSVD
DOC: multiclass: make the warning more prominent
DOC improve feature selection docs
COSMIT remove useless "if False" in kernel approximations
DOC explain ARPACK algorithm in TruncatedSVD docstring
BUG SVMlight loader should check whether n_features is big enough
BUG lower space complexity of estimate_bandwidth to linear
DOC be more explicit on mean-shift scalability
Merge pull request #2541 from dengemann/fix_dot
MAINT: use $(MAKE) for recursive make
DOC add sklearn.base to generated docs
ENH use fast row_norms helper in preprocessing.normalize
ENH use fast row_norms in dictionary learning
BUG don't densify sparse matrix in BernoulliRBM.score_samples
ENH speed up progress reporting in RBM
ENH better error message when CountVectorizer prunes away all terms
TST disable non-doctests in Comp. Perf. docs
DOC {min,max}_df look at document freq, not term freq
DOC fix BernoulliRBM._fit docstring
Merge pull request #2642 from larsmans/rbm-speedup
ENH: faster cartesian product in make_classification and document complexity
Revert "ENH: faster cartesian product in make_classification and document complexity"
MAINT remove some dead code from the LibSVM wrapper
MAINT use TruncatedSVD in pipeline tests
DOC optimize Phimeca logo, size halved
Merge pull request #2673 from larsmans/libsvm-310
ENH filter out zeros early in FeatureHasher
FIX PCA.score_samples didn't do input validation
FIX MBKMeans w/ explicit centers and n_init>1, part 2
ENH show top term per cluster in doc. k-means example
FIX n_init bug in k-means
ENH make_pipeline and make_union utility functions
MAINT refactor fast_dot
ENH speed up SVMlight loader using Cython's array support
COSMIT get rid of deprecation warning in tree tests
MAINT ignore sklearn/tree/_utils.c in diff
COSMIT tree: unused variable warnings and use for/range
BUG fix unchecked mallocs in trees
Merge pull request #2715 from larsmans/tree-malloc
Merge pull request #2734 from eloj/cv-broken-format-string
MAINT remove last trail of the ArrayBuilder
DOC clarify order of output in NB predict_proba
FIX don't put data in source dir in bench_covtype
Merge pull request #2754 from eltermann/doc-fix-tfidf
DOC + FIX mean_ in PCA
ENH faster heapsort in trees
ENH introsort in tree learner
Merge pull request #2747 from larsmans/tree-sort
DOC logistic regression attribute docs + authorship
DOC up the sales pitch for SGD
COSMIT pep8 + full stop police
FIX error message with sparse precomputed kernels
Revert "FIX error message with sparse precomputed kernels"
FIX error message with sparse precomputed kernels (second try)
MAINT optimize Spotify logo (- a few hundred bytes)
FIX restore _pairwise on SVMs
ENH sparsify and densify methods for CD models
MAINT comment out unused import in example
MAINT full name for @h10r
FIX error message from trees for large inputs
Revert "ENH sparsify and densify methods for CD models"
MAINT missing import_array() in isotonic r. Cython code
MAINT get rid of compiler warnings from Liblinear
DOC no more "arithmetical order" for classes
FIX integer dtype for labels in DBSCAN
FIX numerical stability issue in BernoulliRBM
ENH speed up RBM training with scipy.special.expit
ENH re-instate extmath.logistic_sigmoid
FIX DictVectorizer handling of empty inputs
TST older nosetests compat in DictVectorizer test
Merge pull request #2882 from larsmans/expit
FIX predict_proba status on SGD and SVC when disabled
ENH use hasattr "predict_proba" in bagging
FIX one more unchecked malloc in the tree code
DOC SVC.predict_{,log_}proba does not return X
DOC what we call poly features are called interaction features in stats
FIX OneHotEncoder: check value max when n_values is integral
Merge pull request #2910 from larsmans/hasattr-predict-proba
Merge pull request #2876 from Manoj-Kumar-S/fix_auto
DOC: note about numerical precision in euclidean_distances
TST fix failing doctest for OneHotEncoder
FIX GBRT missing from covertype benchmark usage
MAINT drop support for NumPy < 1.6.1
MAINT drop support for SciPy < 0.9
MAINT remove useless import
ENH more optimizations for RBM
FIX + TST stability problems with scipy.special.expit
FIX input validation in Nystroem
FIX Nystroem input validation, again
FIX error handling in SVM
FIX decouple spectral embedding from TransformerMixin
DOC fix rendering of fetch_mldata example
MAINT remove leftovers from solve_triangular
COSMIT get rid of warning from expit import
COSMIT hinge_loss: better input validation
MAINT remove some deprecated stuff
MAINT final occurrence of "Scaler"
DOC/MAINT: clarify cblas/README.txt
DOC clarify Imputer constructor: arbitrary strings not accepted
MAINT remove deprecated functionality from SGD
ENH use threads instead of multiprocessing in SGD
MAINT use CBLAS instead of Fortran API in Liblinear
BUG restore joblib logging behavior
BUG joblib writes to wrong dir
DOC trees/ensembles: class labels need not be integers
ENH pairwise L1 distances for sparse matrices
FIX TfidfVectorizer to no longer ignore binary param
MAINT don't use Perl in Makefile when sed suffices
DOC heapsort is not stable at all
FIX cross_val_score to take y as a list
FIX cross_val_score to take y as an *optional* list
FIX sed usage in Makefile
FIX Makefile to use Perl again
COSMIT: pep8, trailing spaces
TST don't run fast_dot tests on numpy>=1.7.2 + pep8
DOC: tfidf is actually tf*(idf+1) = tf + tf*idf
DOC: clear up the big elastic net confusion (I hope)
MAINT lazily import scipy.cluster
ENH micro-optimize a few tests
ENH micro-optimize fast MCD
FIX random_state validation on c_step
DOC mention "shape" in AP docs
ENH less copying in validation for neighbors
FIX safe_asarray to handle LIL, DOK formats
DOC: website: feature selection on front page
FIX loss function example
DOC: clustering: merge discussions of k-means and inertia
MAINT remove unsupported documentation formats
DOC: move mean_shift docs to MeanShift
ENH factor out squared norm helper
DOC no input validation in constructors
COSMIT six.{map,range} usage in partial_dependence
DOC random_state is an arg to LogisticRegression
FIX MemoryError raising in trees (+test)
DOC: add PyPy support to FAQ
Merge branch 'simplify-fselection'
MAINT make export_graphviz more exception-safe
Revert "COSMIT skip some repeated computations in k-means"
DOC L1 distance works for sparse matrices
Merge branch 'pr/3120'
FIX numerical stability in GMM with eigh sampling
DOC multiclass: OvO needs predict_proba or decision_function
Merge pull request #3184 from YS-L/tfidfvectorizer_idf
DOC: DBSCAN: there's no calculate_distance function
DOC: improve feature_extraction.text docstrings
TST skip flaky label propagation test
DOC we work with Python, not against it :)
ENH/DOC fix poly features complexity
ENH interaction_only in PolynomialFeatures
Merge pull request #3239 from larsmans/faster-poly-features
FIX PCA error handling for invalid n_components
TST test interaction features outside doctest
DOC fix deprecation in CD
DOC missing verb in README
TST silence spectral test/test warning message
FIX wrong parameter name in deprecation warning
DOC typo in FastICA docs
DOC: copyedit log loss, hint how multiclass generalizes binary
ENH save a bit of memory in Euclidean distances
DOC remove HMM documentation
ENH fix astype usage to prevent copying
DOC big fat warning about multithreaded BLAS
DOC correct bagging docs regarding sparse inputs
FIX RidgeClassifierCV didn't have scoring parameter
DOC typo in ExtraTrees
DOC SVMs take string class labels as well as integers
DOC what's new: GaussianNB.partial_fit + typo in comment
DOC double backticks for fixed-width (code) font
TST crank up preprocessing tests to 99% coverage
FIX Gaussian KDE should return array, not scalar
COSMIT use explicit RandomState in KDE tests
DOC more explicit docstring for preprocessing.normalize
DOC optimize some PNG images, shaves off 24kB
DOC run optipng before uploading website
DOC: typo, envelop → envelope
Merge branch 'pr/3358'
DOC numpydoc convention for Bunch-returning functions
Merge branch 'pr/3316'
MAINT ignore sparsefuncs_fast in diffs
MAINT mailmap update
FIX expit bug with out != None
FIX potential overflow in _tree.safe_realloc
FIX DBSCAN input validation
MAINT remove deprecated PCA code
MAINT remove deprecated code from trees and forests
MAINT remove deprecated sklearn.pls module
MAINT remove deprecated code from preprocessing
MAINT remove sklearn.test
COSMIT clean up tests with pyflakes
MAINT remove deprecated code
DOC typo in warning
Merge pull request #3479 from MechCoder/improve_logcv_docs
DOC fix references to moved examples
DOC mistake in comment
Merge branch 'pr/3535'
MAINT NumPy 1.10-safe version comparisons in joblib
Merge branch 'pr/3545'
DOC GP fixes in what's new
Merge branch 'pr/3395'
DOC typo
DOC vectorizers were referring to a private function in public docs
Merge pull request #3573 from MechCoder/high_dimensional_enetcv
ENH multinomial logistic regression using L-BFGS
DOC deprecation of **kwargs in neighbors per 0.18
TST fix doctest for k-NN
Merge pull request #3603 from MechCoder/fix_imputation_example
DOC rename "imputation.py" to "missing_values.py"
DOC dtype parameter on validation functions
DOC multi_output parameter on validation function
MAINT optimize machinalis.png, -9kB/42%
MAINT remove harmless, but useless, double call to csr_matrix
MAINT use %r for better printing of regexps
ENH friendlier message for calling predict before fit on SVMs
MAINT remove unused MST code
FIX division by zero warning in LassoLarsIC
MAINT big refactor of sklearn.cluster.bicluster
Merge branch 'pr/3613'
ENH micro-optimize gradient boosting
MAINT move comment to the appropriate place
FIX don't compare strings with 'is'
FIX propagate MemoryError from Tree._resize
MAINT refactor DictVectorizer's transform+fit_transform (1)
Merge branch 'pr/3683'
TST more robust test for MemoryError from Tree._resize
MAINT: handle frombuffer with empty 1st arg in utils.fixes
DOC single-pass DictVectorizer in what's new
MAINT: set attributes as last action in DictVectorizer.fit
Merge pull request #3687 from perimosocordiae/fast-l1-dist
TST stronger test_shuffle_on_ndim_equals_three
Merge branch 'pr/3696'
Merge pull request #3695 from dougalsutherland/kde-docs
Merge pull request #3656 from dougalsutherland/fix-rbf-samp
FIX ovr predict_proba in the binary case
Merge branch 'pr/3710'
DOC for the Git novices: cd to the clone directory
Merge pull request #3746 from MechCoder/nearest_neighbor_manhattan
DOC fix (Nu)SVR attribute shapes in docstring
FIX sign flip in (Nu)SVR with linear kernel
Merge pull request #3761 from FlorianWilhelm/median_absolute_error
DOC: FAQ: what to do with strings
DOC fix Isomap docstring formatting
Merge pull request #3786 from AlexanderFabisch/tsne_fix
Merge pull request #3800 from MechCoder/unknown_transform
Merge pull request #3829 from amueller/feature_selection_from_model_doc
DOC link to NLTK website
TST stronger non-regression test for #3815
Merge branch 'pr/3837'
Merge pull request #3883 from ogrisel/sgd-stability
Merge pull request #3898 from arjoly/nogil-feature-importance
Merge pull request #3899 from jlopezpena/fix-OvOlist
DOC "RF" -> "Random forest" in text classif example
Merge pull request #3925 from Titan-C/cssbox
DOC typo
MAINT get rid of undefined variable warning in Liblinear
Merge pull request #3934 from amueller/common_test_slight_cleanup
Merge branch 'update-liblinear'
DOC clarify RandomTreesEmbedding docstring
Merge pull request #3968 from MechCoder/metric_neighbors_Graph
MAINT remove probability kwarg from SVR and NuSVR
Merge branch 'pr/3944'
TST remove SVR(probability=True) from AdaBoost tests
COSMIT pep8 fixes to sklearn/tests
FIX LSH benchmark: joblib import, hardcoded /tmp
Merge pull request #3980 from larsmans/lsh-forest
DOC: what's new: LSHForest
Merge pull request #4011 from MechCoder/fix_deprecation_paths
Merge pull request #3982 from amueller/metaestimator_delegation
DOC DBSCAN doesn't "initialize centers"
DOC Birch: spaces
ENH: optimize DBSCAN (~10% faster on my data)
Merge pull request #4151 from larsmans/dbscan-faster
Merge pull request #4153 from hsalamin/remove_exp_warning
Merge pull request #4148 from amueller/clean_up_after_tests
Merge pull request #4141 from amueller/bootstrap_int_one_fix
Merge pull request #4147 from amueller/precision_recall_unsorted_indices
Merge pull request #4156 from AlexanderFabisch/fix_tsne_one_comp
Merge pull request #4171 from amueller/statistical_learning_gridsearch
Merge pull request #4188 from miniyou/patch-1
Merge commit 'b35b2fdad3a38e29bf5ef3534aeb9468b7f20f4a' from #4181
Merge pull request #4198 from arjoly/gbrt-fix-percent-max_features
Merge pull request #4193 from lesteve/deactivate-travis-default-venv
Merge pull request #4179 from amueller/nystroem_singular_kernel
Merge pull request #3933 from fabianp/loss_liblinear
FIX TypeError not ValueError in is_fitted
TST test last commit
MAINT change "l2" to "squared_hinge" in svm.bounds
ENH optimize DBSCAN by rewriting in Cython
FIX TSNE.fit: didn't return self
DOC typo in TruncatedSVD narrative
Merge pull request #4739 from amueller/regressor_output_shape_test
DOC fix comment in tf-idf: log+1, not log(1+x)
Merge branch 'pr/4714' + pep8 fixes
TST fix failing doctest
Merge pull request #4905 from lesteve/update-joblib-0.9.0b2
Merge pull request #4900 from arjoly/fix-raise-memerror
Merge pull request #4966 from lazywei/multilabel-dump-svmlight
Merge pull request #4937 from pianomania/FeatureHasher_add_example
DOC FeatureHasher takes (finite) numbers as values
Merge pull request #4956 from nealchau/ransacresid
FIX dump_svmlight_file multilabel arg: should be last
ENH: add Cohen's kappa score to metrics
FIX assign_rows_csr: should not zero the entire out array
FIX commit re-cythonized sparsefuncs_fast
TST common metric tests for cohen_kappa_score
DOC expand FunctionTransform docstring
Merge pull request #5059 from amueller/function_transformer_rebase
DOC/MAINT final touches to FunctionTransformer
Merge pull request #5076 from PGryllos/update_doc
Merge pull request #3659 from chyikwei/onlineldavb
COSMIT pep8 online LDA
ENH micro-optimize LatentDirichletAllocation + cosmetics
Merge pull request #5074 from jdwittenauer/master
DOC: happy new year!
DOC: typo in function name
Merge pull request #5050 from AnishShah/issue5043
Merge pull request #5111 from chyikwei/remove-unnecessary-variable
TST: LatentDirichletAllocation behavior on empty documents
ENH: faster LatentDirichletAllocation (~15% on 20news)
Merge pull request #5271 from vortex-ape/lda-typo
ENH: optimize LDA, ~15% faster on one core
DOC: TODO notice in Hungarian algorithm, use SciPy version when released
DOC: optimize Dataiku logo (16kB => 10kB)
BUG shuffle components, not samples in NMF CD
MAINT: better timing and general fixes in NMF/LDA example
COSMIT some refactoring in NMF
Laurent Direr (11):
Corrected two typos in docstring.
#3356 - Added an exception raising when np.nan is passed into a HashingVectorizer.
#3356 - Added an exception raising when np.nan is passed into a HashingVectorizer.
Added a test on hashing vectorizer behavior with np.nan input.
PEP8 line length fix.
Replaced assert_raises with assert_raise_message as the point is to make sure the exception message is clear.
Added a comment to explain the use of a test.
Added a parameter check on n_estimators in BaseEnsemble to raise error if it is not strictly positive.
Changed _partition_estimators signature to make it compatible with warm start.
Added warm start to random forests.
Modified the confusion matrix example: included a normalized matrix, changed the colors and added class labels.
Laurent Luce (1):
Fix Machine Learning for NeuroImaging in Python link.
Laurent Pierron (1):
Improve the GMM PDF example.
Lilian Besson (3):
Fixed typo in the final ".. note::" of naive_bayes.rst
Missing link for inertia (about K-Means)
Fix few typos on links and doi
Loic Esteve (41):
Remove deprecated 'mode' parameter from sklearn.manifold.spectral_embedding
Remove deprecated 'class_weight' parameter from RidgeClassifierCV.fit
Remove deprecated 'precompute_gram' parameter
Remove deprecated 'copy_Gram', 'copy_Xy' and 'copy_X' parameters
Remove deprecated 'Gram' and 'Xy' parameters from OrthogonalMatchingPursuit.fit
Removed 'Gram' and 'Xy' parameters were still in test_omp.py
Remove deprecated 'score_func' and 'loss_func' parameters from sklearn.metrics.scorer.check_scoring. Amend the code in all the other places they were used.
Remove 'copy_X', 'copy_Gram' and 'copy_Xy' documentation since these parameters have been removed
Tidy up by removing unnecessary local variables
Remove 'DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future' warnings
Remove a couple more 'using a non-integer number instead of an integer DeprecationWarning'
DOC: add funding from Paris-Saclay Centre for Data Science
Added missing GSOC student
MAINT: fix silhouette typo
MAINT create a new venv in travis for "ubuntu"
TST fix tests with numpy 1.6.1
DOC fixes supported numpy version
MAINT improve misleading comment
Move import statement
Simplify the code by using len
MAINT use absolute imports in tests
DOC python 3 fix in plot_kmeans_silhouette_analysis.py
Remove trailing spaces
MAINT fix typo pyagm -> pygamg in SkipTest
DOC remove sphinx warnings when generating the doc
Use anonymous hyperlink targets
Tweak appveyor badge
FIX DBSCAN fit with precomputed matrix in edge cases
Updated joblib to 0.9.0b2
Update joblib to 0.9.0b3
DOC fix README.rst coveralls badge
MAINT make copy_joblib.sh Python 3 compatible
MAINT update joblib to 0.9.2
FIX non deterministic LSHForest doctest failure
DOC add link to the joblib 0.9.2 release notes
DOC Fix plot_tomography_l1_reconstruction example
MAINT Update the 3rd travis build to latest versions of numpy and scipy
FIX AdaBoostRegressor test failure with numpy 1.10
MAINT add safe_{median|mean} for np 1.10.1
DOC fix missing import in plot_lle_digits example
MAINT update joblib to 0.9.3
Louis Tiao (2):
Update plot_ols.py
Filled in missing negation
Lucas Wiman (1):
Fix spelling in docstring.
Ludwig Schwardt (1):
FIX removed ancient templates from manifest to make sklearn pip-installable.
Luis Pedro Coelho (4):
cd_fast: use square norm directly
DOC Fix ``copy_X `` default in documentation
DOC Fix n_jobs documentation
ENH Add cross_val_predict function
Lukas Michelbacher (3):
Fix typo
Add dependence to max_features to docstring
Add newline before bullets
MLG (1):
Quick fix on grid_scores, updating final value with len(cv)
Maheshakya Wijewardena (22):
Added reference to function
Added :func: reference
Added :func: reference
Implemented median and constant strategies in DummyRegressor
Deprecated y_mean_ and y_median. Code is formatted according to pep8
Removed extra lines. Added deprecation warnings for y_mean_
Updated document to reflect changes in the DummyRegressor
Fixed value error. Scalar values are allowed for constant when not multioutput
Done refactoring and necessary changes for multioutput constant strategy. Added tests and updated the documentation.
Changed value error messages to be more informative in constant strategy.
Edited the documentation-unordered list
Implemented additional test cases. Removed the check for lists and numpy arrays in constant strategy.
Refactored the code and added quotes
Recreated test cases with random state
Updated tests. Fixed flake8 errors
fixed pep8 errors in dummy.py
Changed to random_state
Changed documentation in DummyRegressor
Removed one of 'is a regressor' part in the DummyRegressor documentation
Added _LearntSelectorMixin in BaseGradientBoosting
Added extra check to feature importances
Implemented SelectFromModel meta-transformer
Manoj Kumar (255):
Constant output dummy classifier
Minor doctest change
Handled exceptions in fit; Added tests
TST Tests for string labels, DOC Minor doc changes
DOC Minor doc changes
Removed unnecessary attributes
ENH: ElasticNetCV and LassoCV raise ValueError with multitarget outputs
Testing log_loss and hinge_loss under THRESHOLDED_METRICS
FIX: Removed redundant code
TST: Test class variance and string input
FIX: Changed str vs float invariance test
ENH: MultiTaskElasticNet (and Lasso) CV
Proper centering of alpha_grid for sparse matrices
FIX: Normalize=True
Fixes Issue 2751
FIX: Removed coef and improved initialization
FIX: sample_weight='auto' for RidgeClassifier
FIX: Label encoding done in compute_class_weight
ENH: Speed up sparse_coordinate descent
Replaced cython calls for dot operations with BLAS calls
Speed up using typed memory views
Preserve CSR storage format when input is CSR in sparse_center_data
ENH: Refactoring and optimisation of sparsefuncs.pyx
ENH: Optimise sparsefuncs
Moved sparsefuncs to sparsefuncs_fast
TST: Added tests for non-CSR/CSR format
ENH: Swap rows in sparsefuncs
Made the following changes
COSMIT: Replaced ptr1/2 with start/stop
Made the following changes
Improved formatting of unsupported sparse matrices in swapping
FIX: ENetCV and LassoCV now accept np.float32 input
FIX: Use coordinate_descent_gram when precompute is True | auto
Remove unused param precompute from MultiTask models
FIX: Raise ValueError for invalid precompute
Replaced numpy calls with blas in dual_gap
Made the following fixes
Replaced multiprocessing with threading
ENH: Release the GIL for sparse coordinate descent
COSMIT: Replaced xrange with range
ENH: Release GIL in the gram variant
Replaced C ordered contiguity and added typed memory views
Added typed memory-views for H and XtA
Release GIL in the multi-task variant
ENH: Replaced double indexing as it causes python overhead
Link my name to my wordpress blog
ENH: Return attribute n_iter_ for linear models dependent on the enet solver
TST: Added test to check that warm model converges faster than a cold one
TST: Better to test that warm_start runs only once after the prev model has converged
TST: Added test to check that higher alpha converges faster
Added return_n_iter parameter for enet (and lasso) path
FIX: Fit the data in tests
FIX: Remove raw_coef_ attribute that eats up memory
ENH: Added n_iter_ parameters across all iterative solvers
FIX: Preserve public API by introducing return_n_iter param
TST: Test that after fitting n_iter is greater than 1
Added n_iter attribute to OMP (and CV)
MAINT: Fixed a few docstrings and cosmits
COSMIT
Replaced Label Encoder with Label Binarizer
Replaced helper function _phi by special.expit
Do away with intercept helper functions
Refactor fit_intercept case
FIX: Fixed hessian value for intercept
DOC: Add docs for helper functions
ENH: LogisticRegressionCV can now handle sparse matrices
TST: Add tests to explicitly check hessian, loss and gradient for fit_intercept
More docs and tests for LogisticRegressionCV
FIX: Doctests
TST: Improved tests
ENH: Added one-vs-all fit in case of multi-class data
ENH: Added refit parameter
TST: Tests to verify OvA behavior
FIX: PEP8 and other cosmits
Made the following changes
ENH: Logistic Regression now supports newton-cg and lbfgs
ENH: Weighted logistic regression for lbfgs and newton-cg
Changed copy default from True to False, updated docstring for sample_weights
DOC: Minor changes
ENH: Added warnings for convergence, added support for l1 penalty if solver is liblinear
FIX: Changed tolerance of newton-cg to be compliant with that of lbfgs
COSMIT: Utils imports are together
FIX: Class weights are computed for each OvA
FIX: Liblinear solver for LogisticRegressionCV now works for class_weight==auto
FIX: Add DataConversionWarning
FIX: Changes due to recent refactoring of the check functions
MAINT: Improve documentation and coverage
Update whats new!
FIX: Fixes for the cross_validation failure
FIX: Increase testing accuracy
Improve docstring of the LogRegCV model
ENH Stochastic Coordinate Descent for ElasticNet & Lasso
Max iter param in Liblinear is now softcoded
ENH: Return n_iter_ from liblinear and print convergence warnings
TST: Added test to check ConvergenceWarning
Fix heisenbug due to addition of max_iter param in LSVC
FIX: Incomplete download of 20newsgroup_dataset
FIX: Memory crashes for _alpha_grid in ENetCV
ENH: Add support for class weights
FIX: Change alpha to 1./c
ENH: Merge MultinomialLR into LR with multi_class='multinomial'
ENH Added multinomial logreg to plot_classification_probability.py
ENH: Added multinomial option to LogisticRegressionCV
COSMIT: Minor doc fixes
TST: Tests for multinomial logistic regression
COSMIT: Made the following changes
DOC: Improved documentation and error messages
DOC: Made the docs for LogisticRegression clearer
FIX: PEP8 Errors and unused imports
DOC: Changed docstring style for optional arguments
DOC: Explicit instructions for Python3
MAINT: Remove BaseLibLinear for LogisticRegression
DOC: Added np.sqrt since default is 'squared=False'
ENH: Improved verbosity and denesting
FIX: Fix the imputation example
MAINT: Made the following changes
DOC: Explain prediction when decision_function is zero
OPT: Prevent iterating across n_features twice
FIX: Make sure LogRegCV with solver=liblinear works with sparse matrices
Changed default argument of precompute in ElasticNet and Lasso
TST: Added tests to test Deprecation warning
DOC: Document criterion_ attribute in LassoLarsIC
FIX: Raise warnings in f_classif if a given feature is constant throughout
ENH: More descriptive error which prints the feature indices
FIX: Thresholded Nearest Centroid fails with non-encoded y
Bug fix in computing the dataset_centroid in NearestCentroid
FIX: Remove test that tests state of warnings before and after assert_warns
Merge pull request #3750 from MechCoder/bug_nearest_centroid
TST: Test that the warning registry is empty after assert_warns
TST: Test that assert_warns is reset internally
DOC: Make it explicit that assert_warns clears the warning registry
Merge pull request #3752 from MechCoder/remove_testing_test
Expose positive option in elasticnet and lasso path
TST: Add tests and document whats_new.rst
ENH: Patches Nearest Centroid for metric=manhattan for sparse and dense data
FIX: Wrap csc_row_median around the _get_median imputer function
MAINT: Move _get_median into sparsefuncs to avoid circular imports
DOC: Explain why the centroid of the manhattan metric is the median
Renamed csc_row_median to csc_median_axis_0
Warning for non-euclidean and non-manhattan metrics
Sparse matrix conversion depending on the type of metric
Update what's new.rst
Handle_unknown option to OneHotEncoder
Merge pull request #3801 from cmd-ntrf/typo_vbgmm
DOC: Make the comment slightly clearer
COSMIT: Minor opt in pairwise_distances_argmin_min
OPT: Speed improvements by avoiding repeated calls to check_array
Merge pull request #3779 from arjoly/sw-dummy-regressor
COSMIT: Use searchsorted in weighted_percentile
Update about page
Raise error when sparse matrix is supplied to predict in KNeighborsClassifier.
DOC: Typo in dict_faces.py
Minor typo in plot_dict_face_patches.py
FIX: Raise error when patch width/height is greater than image width/height
Fix BallTree and KDTree Docs
what_new entry for KD and BallTree doc fixes
ENH: Birch, first commit
Made the following changes
ENH: Made the following changes
OPT: Set check_X_y=False internally in birch
OPT: Increased the code speed by doing the following.
OPT: Made the following changes.
ENH: Added example to compare birch and minibatchkmeans
ENH: Add arbitrary clusterer in the global clustering step and test
ENH: Sparse matrix support
TST: Add tests for branching_factor
OPT: Remove unwanted calls to np.asarray for self.centroids_
ENH: Add narrative documentation
STY: Cosmits in documentation
Made the following changes
DOC: Update documentation of _CFNode
Made the following changes
Made the following changes.
Made the following changes
Prevent preprocessing for iterating over dense data
DOC: Modification to the documentation of threshold
Made the following changes.
Made the following changes:
API: Rename n_clusters to global_clusters
Changes to narrative documentation
Major changes
Rename global_clusters to n_clusters again
FIX: Remove trailing attributes in __init__
Explicit error when n_dim changes during partial_fit
Rewrite model if fit is being called after partial_fit or vice versa
Update whats_new.rst for n_iter_ attribute
Tuple unpack farthest_dist
Made the following changes
MAINT: Moved compute_label test to common tests
MRG: Fix FeatureAgglomeration docs
COSMIT: Make how to use partial_fit render properly
ENH: Use check_X_y in pairwise_distances_argmin
FIX: Raise AttributeError for fit_predict
ENH: Add metric support to neighbors_graph
Replace feature with sample
Remove optional input validation from pairwise_metrics
Merge pull request #3946 from MechCoder/doc_fix
Validation of params passed to neighbors_graphs
ENH: Allow connectivity to be a callable in AgglomerativeClustering et al.
Remove support for int and update whatsnew.rst
Use check_array and remove shape check
DOC: Improvements to children_ attribute in AgglomerativeClustering
Borrow a bit from scipy definition
ENH: Return distances option for linkage_trees
Merge return_distance tests of both linkage and ward tree
Add documentation for return_distances option
COSMIT: Removed XXX comment related to LIL matrices
FIX: Fix bug in computing full tree when n_clusters is large
DOC: Add note to whatsnew.rst
Merge pull request #3978 from MechCoder/compute_full_tree_bug
Merge pull request #3976 from MechCoder/agglomerative_playground2
TST: Test ground truth in linkage trees
Merge pull request #4000 from he7d3r/patch-1
FIX: Bug in sparse coordinate solver in lazy centering
MAINT: Remove deprecation warnings in enet_path and lasso_path
Merge pull request #4061 from ragv/pep8_cleanup
FIX: Regression in NearestCentroids
FIX/ENH: Fix kneighbors_graph/kneighbors and allow X=None
FIX: Handle and test case where n_duplicates > n_neighbors
Add docstring and test for kneighbors
ENH/FIX: Patch radius_neighbors to follow kneighbors convention
TST: Refactor tests for k and radius neighbors
ENH: Add include_self param to graph variants for backward compat
DOC: Add what's new entry
DOC: Make docs clearer
Merge pull request #4046 from MechCoder/allow_X_none
Merge pull request #4222 from saketkc/benchmark_fixes
Merge pull request #4226 from ragv/fix_4224
FIX: Fix n_components in AgglomerativeClustering and friends
Merge pull request #4277 from Celeo/contributing-request-typo
[MRG] [BUG] Pass penalty to the final logistic regression fit
[BUG] predict_proba should use the softmax function in the multinomial case
override predict_proba in log_reg
Add non regression test
AdaBoostRegressor should not raise errors if the base_estimator
Add numerically stable softmax function to utils.extmath
Add test for softmax function
Merge pull request #5225 from MechCoder/fix_overflow
[BUG] _init_centroids has an optional x_squared_norms parameter which is not exactly optional
Merge pull request #5262 from andylamb/lamb-fix-class-weight-check
Merge pull request #5284 from tdhopper/patch-1
fix test failures
Remove warm start
Catch filters instead of removing the tests
Added example to depict feature selection using SelectFromModel and Lasso
Minor doc changes and removed _set_threshold and _set_importances
Now a fitted estimator can be passed to SelectFromModel
Add narrative docs and fix examples
Merge SelectFromModel and L1-selection examples
Lasso and ElasticNet should handle non-float dtypes for fit_intercept=False
Merge pull request #4707 from amueller/k_means_init_mismatch
1. Added parameter prefit to pass in a fitted estimator.
Refactor tests
Merge pull request #4242 from MechCoder/select_from_model
Fix broken examples using RandomTreeEmbeddings
Manuel (1):
Rename k to n_clusters in docs
Mario Michael Krell (1):
DOC typos and consistency in SVM docstrings
Mark Veronda (2):
Type-os and added great links to learning more about Machine Learning
Feedback from @amueller
MarkTab marktab.net (1):
Update lfw.py
Marko Burjek (7):
DOC Added SGDClassifier support only binary prediction probabilities.
DOC Fixed a return in predict_proba in SGDClassifier
DOC add support for sparse arrays to SGDClassifier
DOC forgot dot in SGDClassifier documentation
DOC Fixed a return in predict_proba in SGDClassifier
DOC add support for sparse arrays to SGDClassifier
DOC forgot dot in SGDClassifier documentation
Martin (2):
added test for transformed scatter matrix
DOC Updated what's new
Martin Billinger (5):
Added test for orthogonal LDA transform.
Fixed LDA transform.
Improved precision of the LDA orthogonality test.
forgot to remove an import
misc
Martin Ku (2):
Fix typo of DPGMM doc
Add "See also" for selectors and scoring funs
Martin Luessi (6):
WIP: doc hyperlinks, fixed size thumbnails
gzip support, whats_new
use Sphinx searchindex.js
no_image.png for examples w/o thumbnail
fix paths for Windows
links for scipy, cleanup
Martin Spacek (2):
DOC fix fastica source matrix output shape
Improve FastICA convergence warning message
MartinBpr (1):
Corrected macro ROC in example plot_roc
MaryanMorel (1):
Deprecate residues_ in LinearRegression
Masafumi Oyamada (1):
Correct expectation notation in DP-GMM doc
Mateusz Susik (1):
FIX Support for negative values of n_jobs
Mathieu Blondel (772):
Added filters to WordNGramAnalyzer.
Added a non-hashing dense vectorizer object.
Added transform() method to Pipeline object.
Updated dense vectorizer to follow transformer API.
Support fit_transform() in pipeline.
Support lists for training data in grid_search.
Use fit_transform and use iterables for documents.
Remove unnecessary code.
Save memory when the matrix is built.
Added fit_transform() to pipeline.
Vectorizer should implement fit_transform.
SparseCountVectorizer, SparseTfidfTransformer and Sparse Vectorizer
normalize option for TfidfTransformer
fix garbage
Fix indentation.
Fix cross_val when y is a 2d-array.
Add refit option to GridSearchCV.
Merge branch 'master' into textextract
API changes to precision_recall
Fix consistency problem in the order of arguments for loss functions.
Add fbeta_score and f1_score metrics.
Rename roc to roc_curve.
Merge branch 'master' into textextract
Fix doctest in grid_search.
Merge branch 'master' into textextract
Add tests for predict_proba in LogisticRegression.
Use filter object.
Add dtype parameter to CountVectorizer and SparseCountVectorizer.
A few optimizations.
Move sparse code to sparse module.
Remove Sparse prefix from class names.
Move preprocessing to its own module.
Add Normalizer, LengthNormalizer and Binarizer.
Remove normalize option from TfidfTransformer.
Sparse equivalents of Normalizer, LengthNormalizer and Binarizer.
Fix hierarchy inconsistency for sparse module.
Move common sparse code to SparseBaseLibLinear.
Fix Sparse Logistic Regression.
Import LogisticRegression in sparse/__init__.py.
Merge branch 'master' into textextract
Activate class_weight option in fit() for liblinear-based classes.
Merge branch 'master' into textextract
Fix slicing issue when using sparse matrices.
Y -> y (capital letter is for 2d-arrays)
Raise exception when X_train.shape[1] and X_test.shape[1] don't agree.
Merge branch 'textextract' of git://github.com/ogrisel/scikit-learn into textextract
Merge branch 'textextract' of git://github.com/ogrisel/scikit-learn into textextract
Merge branch 'textextract'
Convert sparse matrix to CSR format in grid search.
Fix imports.
Pass kwargs to mlcomp loader.
Fix SGD-based binary classification example.
Note on fit_transform.
Test compute Gram matrix with support vectors only.
Activate stop word removal by default.
Add vocabulary property.
Fix small typos.
Make max_df to 1.0 by default.
Update matrix type in documentation.
Fix broken test.
class_weight="auto" for liblinear-based and sparse classes.
Fix math rendering in SVM documentation.
Fix typo.
Add LabelBinarizer.
Add sparse Ridge.
Support 2-d Y.
Add RidgeClassifier.
Add RidgeClassifier to 20newsgroup classification example.
Add efficient LOO cross-val for Ridge.
Add sample_weight to fit.
Add reference.
Add support for custom loss or score function.
Add label binarizer documentation.
Test 2-d y case.
Support fit_intercept in RidgeLOO.
Forgot to use sample_weight...
Default fit_intercept to True.
Add sparse RidgeLOO.
Add RidgeClassifierLOO.
Add class_weight.
Add some more documentation.
Add sample_weight.
Add dense_output option to safe_sparse_dot.
Use safe_sparse_dot.
Fix problem when output is a vector.
Add safe_asanyarray.
Handle sparse matrix in LinearModel.
Import necessary modules.
Fix tests for sparse case.
Add RidgeCV.
Merge dense and sparse code.
Rename to RidgeClassifierCV.
Fix 20newsgroup example.
Make RidgeLOO private.
Fix test.
Predict is already implemented in LinearModel.
Fix issue in RidgeCV.
PEP8!
Fix typo.
Add documentation on matrices used for clustering.
Rename _RidgeLOO to _RidgeGCV.
Note on efficiency.
Improve the documentation for LabelBinarizer.
Add TransformerMixin.
Use TransformerMixin in LabelBinarizer.
Merge branch 'ridge'
Fix typos.
Fix TransformerMixin.fit_transform.
Remove references to y in preprocessing objects.
Add sample_weight to Ridge.
Improve documentation for Ridge objects.
Move cv parameter to constructor in RidgeCV.
Temporarily disable sample_weight when cv is passed to RidgeCV.
Preserve backward compatibility in GridSearch.
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Fix error in documentation.
Remove coef_ and get_support from Pipeline.
Add SparseTransformerMixin.
Use sparse.base.SparseTransformerMixin.
Add documentation on model persistence.
Minor fixes in RidgeCV.
Add reference for GCV.
Add Olivier Grisel to metrics.py's credits.
Comment broken test.
Rename SparseTransformerMixin to CoefSelectTransformerMixin.
Can now specify desired percentage of explained variance ratio in PCA.
Add a few sanity checks for SVC.
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Add tests for sanity checks in SVC.
Flip the sign when the user accesses coef_ or intercept_ in the 2-class case.
Implement transform in LDA.
Add LDA to plot_pca.py and rename to plot_pca_vs_lda.py.
Proper implementation of predict_log_proba in LDA.
Add polynomial interpolation example.
Use np.vander.
Support multilabel case in LabelBinarizer.
Add linear_kernel, polynomial_kernel and rbf_kernel.
Small optimizations for polynomial_kernel and rbf_kernel.
Add KernelCenterer.
Add KernelPCA.
Add kernel PCA example.
Merge branch 'master' into kpca
Add KernelPCA documentation.
Add test for precomputed kernel.
Optim in polynomial_kernel.
Efficient fit_transform in PCA.
Merge branch 'mblondel-kpca' of https://github.com/vene/scikit-learn into kpca
Cosmit.
Use TransformerMixin in KernelPCA.
Merge branch 'master' into lda
Merge branch 'lda' of https://github.com/bthirion/scikit-learn into lda
Fix doctest.
pep8 love (integrism?).
Add test for invalid kernel.
Rename plot_kpca.py to plot_kernel_pca.py.
Add comment regarding PCA's fit_transform method.
Add note on sign ambiguity in PCA.
Merge branch 'kpca'
Add kernel PCA and linear PCA equivalence test in its own function.
Merge pull request #163 from paolo-losi/revert_preprocessing
Make the author file more consistent.
Merge pull request #167 from bsilverthorn/fix-kernelpca-ncomponents
Add sparse.LogisticRegression to class reference.
Better doc for the dataset loaders.
Make kernels consistent with SVM and add sigmoid kernel.
Fix LDA transform.
Add LDA to the handwritten digit 2d-projection example.
Add TransformerMixin to LDA and RandomizedPCA.
Cosmetics.
Merge pull request #200 from amueller/minor_docs
Merge pull request #193 from ogrisel/preprocessing-simplification
Better PCA docstrings.
Fix LDA.transform's docstring.
Typo.
Add hinge_loss to metrics.
Fast and memory-efficient loader for the svmlight format.
Allow the user to fix n_features.
Docstring.
Important note.
Propagate errors up to the Python level.
Narrative documentation.
Update credits.
Return false when couldn't read the file.
Fix comment.
Merge pull request #6 from larsmans/mblondel-svmlight
Merge branch 'mblondel-svmlight' of git://github.com/larsmans/scikit-learn into svmlight_format
Fix compile issues on Mac OS X.
Fix ref counting bug.
More comments.
Merge pull request #7 from larsmans/mblondel-svmlight
load_svmlight_format -> load_svmlight_file.
Merge branch 'master' into svmlight_format
Merge pull request #209 from mblondel/svmlight_format
Documentation fixes.
Add note to base fit_transform doc.
Raise error if file doesn't exist.
Fix parsing issues.
More tests for the svmlight reader.
Documentation fixes.
Better performance of Ax=b solver when b is 2d and A is sparse, and add
Fix doctest.
Reverse coef_ in Ridge.
Merge pull request #235 from mblondel/fix_ridge
Improve Logistic Regression sparsity example.
Better test and remove old garbage.
Allow CountVectorizer to be fitted twice.
Remove unnecessary submethod.
2011!
squared loss -> squared hinge loss.
Merge pull request #255 from vene/kernel-pca
Merge pull request #260 from glouppe/master
Merge pull request #261 from glouppe/master
Merge branch 'dbscan' of https://github.com/robertlayton/scikit-learn into dbscan
Handle metric="precomputed" in dbscan.
Use euclidean_distances in kmeans.
Cosmit: use dense_output=True.
Sparse matrix support in kernels.
PCA: fix issue #258.
PCA: better doc string for 0 < n_components < 1 case.
Partial support for sparse matrices in kernel PCA.
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Remove unnecessary import.
Merge branch 'dbscan' of git://github.com/robertlayton/scikit-learn into dbscan
calculate_distances -> pairwise_distances + goodies.
Improve DBSCAN doc.
Fix DBSCAN example.
Remove automatically generated auto examples.
Test pickability in DBSCAN.
Test precomputed similarity in pairwise_distances.
Merge branch 'samples_generator' of git://github.com/glouppe/scikit-learn into samples_generator
Doc for sample generator cosmits.
Merge branch 'kmeans_transform2' of https://github.com/robertlayton/scikit-learn into kmeans_transform2
Add tests and fix bug.
Kmeans transform and predict doc improvements.
Merge pull request #296 from bdholt1/fix/feature_extraction
Add TransformerMixin (back?) to preprocessing classes.
Fix plot_kmeans_digits.py.
Typo.
Implement one-vs-the-rest multiclass strategy.
Fix bug in one-vs-rest when underlying estimator uses predict_proba.
Implement one-vs-one multiclass strategy.
Merge pull request #2 from ogrisel/robertlayton-kmeans_transform2
Implement error-correcting output-code multiclass strategy.
Test grid searchability.
Merge pull request #273 from robertlayton/kmeans_transform2
Docstrings!
Add new meta module to setup.py
Merge branch 'master' into multiclass
Check estimator and fix syntax error.
Documentation for the meta learners.
pep8-proof.
Fill missing docstrings.
Allow one-class only in LabelBinarizer.
Rewrite svmlight loader in pure Python for now.
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Merge branch 'master' into multiclass
Fix mistake and docstring cosmits in SVC.
Moved multiclass module to top-level module.
Fix doc!
Fix setup!
Address @agramfort and @ogrisel's comments.
Merge branch 'master' into multiclass
More informative name for color quantization example.
More explanations and pep8.
Use 256 colors and add title.
Emphasize one-vs-all.
Better documentation.
Fix doctest errors (hopefully!).
Document fit_ecoc.
Typo.
Fix currentmodule.
Fix bad copy-paste.
Merge pull request #320 from mblondel/multiclass
64 colors + random codebook comparison.
Better title + authors.
Welcome to Robert and Gilles.
Sparse matrix support in the `density` util.
Documenting a secret feature and fixing bugs in the process.
Use l1 penalty.
Giving due credit (last minute ChangeLog item).
Cosmit.
Merge pull request #354 from amueller/liblinear_parameter_errors
Add dump_svmlight_file.
Export data option in SVG gui.
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Merge pull request #407 from amueller/sgd_url_typo
BUG: Use threshold in LabelBinarizer in multi-label case.
ENH support decision_function in multi-label classification
Cosmit: used named parameter.
ENH Label indicator matrix support in LabelBinarizer and OVRClassifier
Remove C from NuSVR.
Revert "Remove C from NuSVR."
Revert "FIX : removing param nu from sparse.SVR, C from NuSVR + pep8"
Small comment on the dual parameter in LinearSVC.
Update svmlight loader documentation.
Fix svmlight loader doc.
Implement mean_variance_axis0.
Fix bug with sparse matrices.
Cosmit.
Test edge case.
tmp -> diff
Add score method to KMeans.
Use int for indptr and indices.
Merge branch 'master' of https://github.com/scikit-learn/scikit-learn
Sparse matrix support in KMeans.
Vectorized news20 dataset loader.
Merge multilabel branch with master.
Check that LabelBinarizer was fitted.
Multilabel classification dataset generator.
Test multilabel classifier on random dataset.
scale_C will be True in scikit-learn 0.11.
Merge pull request #8 from larsmans/news20_loader
Return bunch object.
Merge pull request #493 from amueller/kernel_approximation_doc
Add to class reference.
Add precompute_distances option back and export it.
Merge branch 'minibatch-kmeans-optim' of https://github.com/ogrisel/scikit-learn into minibatch-kmeans-optim
Address @ogrisel and @amueller's comments.
Better doc for the 20newsgroup dataset loader.
Do not use joblib's memoizer.
Use int16 for more compactness.
Merge branch 'master' into sparse-kmeans
Merge with master.
One more test.
Fix test.
Cosmit in MiniBatchKMeans.
Optimize for high dimensional data.
Use CCA as well in multilabel example.
Add missing reference.
Break down fit_transform into parts.
Cosmit
More tests for nuSVR.
Use rbf_kernel.
Add decision_function to ElasticNet.
FIX: support for regressors in multiclass module.
Support for coef_ in OneVsRestClassifier.
Mention multi-variate regression support in Ridge.
Add safe_mask utility.
coef_ and intercept_ in LinearSVC are now writable.
Add safe_mask to developer doc.
Typos.
Create partial_fit and call partial_fit from fit.
Add partial_fit to SGDRegressor.
Partial tests + fix bugs.
Fix a few more bugs.
Use proper assertions.
Fix more bugs + tests.
Add decision_function to SGDRegressor.
Multiclass tests.
Merge dense and sparse SGD implementations.
Re-enable sparse tests.
Add deprecation warning.
Update docstrings.
What's new.
Removed needless line.
Use only one epoch in partial_fit.
Use named parameters.
Update examples.
Update doc.
Use only one epoch in SGDRegressor.partial_fit.
Save iteration number.
More tests + fixes.
Fix bug when fit is called multiple times.
Fix "what's new".
Merge pull request #10 from larsmans/sgd_partial_fit
Address @ogrisel and @larsmans 's comments.
pep8!
FIX: y should be np.float64.
Add filter_params option to pairwise_kernels.
Precomputed kernel can actually be non-squared.
Use pairwise_kernels in KernelPCA.
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Merge pull request #11 from larsmans/sgd_partial_fit
More technically correct description.
Rename _get_params() to get_params().
Merge branch 'sgd_partial_fit'
Use classes_.
Better title in README.rst.
More intuitive warm-restart in SGD.
Fix doctests.
warm_restart -> warm_start
More intuitive warm-start in ElasticNet.
Fix doctests.
Copy in user-land.
Missing docstring in ElasticNet and Lasso.
Fix failure in `test_bad_input`.
Revert change on svm.base.
Remove if statement.
Suppress deprecation warnings.
Merge branch 'warm_start' of github.com:mblondel/scikit-learn into warm_start
Make sure order="C".
Merge branch 'warm_start'
Fix doctest.
preprocessing/__init__.py -> preprocessing/preprocessing.py
Move preprocessing.py to sklearn/.
Remove CoefSelectTransformerMixin and use SelectorMixin instead.
Better default threshold for L1-regularized models.
euclidian_distances is to be deprecated in v0.11.
Add n_jobs option to pairwise_distances and pairwise_kernels.
Merge branch 'enh/metrics' of https://github.com/satra/scikit-learn into metrics
Backward compatibility in precision, recall and f1-score.
Factor some code.
More what's new items.
Fix what's new.
Add Perceptron.
Add Perceptron to document classification example.
Minimal documentation.
Add references and implementation details.
Propagate parameters.
Expose more parameters.
Explain parameter in Hinge loss.
Don't rescale coef if not necessary.
Quick note on sparsity.
Don't break API in precision_recall_fscore_support.
Pep8!
Fix scale_C warning.
Merge branch 'perceptron' of github.com:mblondel/scikit-learn into perceptron
t -> threshold
Add mean_squared_error and deprecate mean_square_error.
Don't raise warning when passing explicit scale_C=False.
Merge branch 'master' of github.com:scikit-learn/scikit-learn
DOC: scaling regression targets.
Merge pull request #623 from npinto/ridge-docfix
Set label encoding in LabelBinarizer.
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Guess threshold if not explicitly provided.
Bug: must be strictly less than.
Pep8.
Don't raise warning in auto mode.
Merge pull request #712 from agramfort/fix_y_center
Merge branch 'shuffle_kfold' of https://github.com/NelleV/scikit-learn into kfold-shuffle
Test indices=False case.
Factor tests.
Merge branch 'combat' of https://github.com/ibayer/scikit-learn into lsqr_fix
Fix lsqr for scipy 0.7.
Add test for grid search with only one grid point.
Check param grid.
Return early if there's only one grid point.
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Fix doctest failure.
Merge branch 'nearest_centroids' of https://github.com/robertlayton/scikit-learn into nearest_centroids
Fix doc mistakes.
Precomputed distance matrices can be rectangular.
Add test for precomputed distance.
Doc cosmits.
Fix bug when refit=False.
Fix kernel pca example.
Fix doctest in PLS.
Rename "p" to "epsilon".
Allow regression losses for classification.
Add epsilon-insensitive loss.
predict_proba with loss="modified_huber".
Update doc.
Doc: predict_proba.
What's new.
Document API change.
Easier to understand formula.
DOC LabelBinarizer
BUG: now build works.
Add LabelNormalizer.
Documentation for LabelBinarizer and LabelNormalizer.
Pep8.
Cosmit: LabelBinarizer and LabelNormalizer are not classifiers.
More useful error message.
Doc cosmit.
Add test for non-numerical labels.
LabelNormalizer -> LabelEncoder.
Add documentation for non-numerical label case.
What's new.
Cosmit: be more explicit why LabelEncoder is useful.
Address @larsmans' comments.
Merge branch 'sgd_losses' of github.com:mblondel/scikit-learn into sgd_losses
Address @ogrisel and @pprett's comments.
Fix remaining merge conflict.
Fix doctest.
Merge branch 'master' of github.com:scikit-learn/scikit-learn
What's new.
Fix typo.
Note regarding multilabel example.
Note on one-vs-all classification in SGD module.
Unused import.
Fix warning.
Merge pull request #877 from duckworthd/master
Fix #904.
Removed needless method redefinition.
Fix: RidgeClassifier must not inherit from RegressorMixin.
Clean up unused code.
Test default input.
Credits and license.
Update doc/whats_new.rst
Update doc/whats_new.rst
Typo.
Check that feature indices are sorted.
Add missing test file.
Optim in LabelEncoder.
Remove needless loop in inverse_transform.
Simplify LabelEncoder.fit_transform.
Fix warnings in multiclass module tests.
Remove duplicated line.
Add all_categories option.
Normalize training and test times.
Typo.
Simplify LabelEncoder.transform.
Test LabelEncoder.fit_transform with arbitrary labels.
Ignore joblib folder.
Fix #1080.
Decision threshold is now 0 in RidgeClassifier.
Optim + cosmit in StratifiedShuffleSplit.
Use fixed random state in isotonic regression example.
Note on the use of X in isotonic regression.
Fix confusing notation in isotonic regression.
Fix latex formula in isotonic regression doc.
Release manager change + fix Satra's URL.
Move solver option to constructor.
Add lsqr solver.
BUG: transmit parameters correctly from Ridge to ridge_regression.
Can afford better precision in news20 example.
Fix docstrings and doctests.
Add minimalistic test for each solver.
Fix damp parameter.
Fall back to dense_cholesky if sample_weight is given.
lsqr is not available in old scipy versions...
Better documentation on the choice of solver.
PEP8!
Cosmit: not a fan of defining a function in a loop :)
Update what's new.
More accurate API change description.
Fix warning message.
Merge pull request #1215 from amueller/pipeline_muliclass
Merge pull request #1237 from kalaidin/typos
Merge extmath tests into the same file.
Add common assertions to sklearn.utils.testing.
Fix density utility when input is sparse.
Typo.
Fix test failure.
Use sklearn.utils.testing in tests.
Merge branch 'master' of github.com:scikit-learn/scikit-learn
More use sklearn.utils.testing.
Even more sklearn.utils.testing.
Missing random_state in LinearSVC.
Merge pull request #1323 from dnouri/countvectorizer_doc_1154
FIX: vocabulary_ maps to feature indices.
Merge pull request #1320 from dnouri/test_coverage
Merge branch 'sgd_learners' of https://github.com/zaxtax/scikit-learn into passive_aggressive
Rename pa.py to passive_aggressive.py.
Cosmit: random_state is not necessary.
Fix many bugs and test PA-I.
Do not expose C in SGDClassifier / Regressor.
Implement and test PA-II.
Add SquaredHingeLoss.
Test different losses.
Add squared epsilon insensitive loss.
Test PA-II (regression).
Fix random_state in SGD.
Update narrative documentation.
Fix example.
Credit myself.
Fix see also.
Fix a few test failures.
Add one more test for PassiveAggressiveRegressor.
Fix underflow detected by test_common :)
Update document classification example.
Fix doctests.
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Better documentation for C.
Add PassiveAggressive* to class reference.
Remove sample_weight and class_weight from PassiveAggressive*.
Add tests for partial fit.
Document epsilon.
Better documentation for epsilon in SGD.
Remove predict_proba from Perceptron and PassiveAggressiveClassifier.
Remove transform from PassiveAggressive*.
Fix typos and wording in RandomForestEmbedding.
Indicate dimensionality in RandomForestEmbedding example.
Cosmit: use less memory in feature hasher tests.
Cosmit: make KernelCenterer a private attribute in KernelPCA.
Improve KernelCenterer docstring.
Add add_dummy_feature.
Add RandomClassifier and tests.
Fix tests.
Add docstrings for RandomClassifier.
PEP8.
random_state=None by default.
Remove label encoder.
Implement predict_proba.
Add some narrative doc.
Address @amueller's comments.
Rename to dummy.DummyClassifier.
Add DummyRegressor.
Add dummy estimators to references.
Add what's new entry.
Add comments.
Check returned types.
Test expectations.
Test string labels.
Test exceptions.
Cosmit: save one line.
Address @amueller doc comments.
Skip common tests for Dummy*.
Typo :/
Add example in docstring.
Add to references.
Merge pull request #1382 from mblondel/add_intercept
Merge pull request #1373 from mblondel/random_clf
Remove unused import.
Improve error message when vocabulary is empty.
Fix bug in sqnorm (used by PassiveAggressive).
Link to travis.
Specify branch in status button.
Add missing assertion.
Update what's new.
Cosmits and typos.
Add perceptron loss to plot.
threshold parameter was ignored in SquaredHinge loss.
Welcome to Wei Li and Arnaud Joly.
Clean up test_pairwise.py.
More clean up of test_pairwise.py.
Cosmit: break up long line.
Merge pull request #1530 from agramfort/doc_lasso
X is not a constructor parameter.
Add missing types to docstring.
Move more minor contributors to what's new file.
Remove contact address.
Merge pull request #1561 from kyleabeauchamp/MinMaxScaler_Inverse
Merge pull request #1536 from kyleabeauchamp/issue-1403
Merge pull request #1604 from darkrho/doc-linear-model-typo
DOC: make distinction between evaluation and pairwise metrics.
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Cosmit: more explicit xlabel.
Cosmit: more explicit label.
Update load_svmlight_file docstring.
FIX: X was converted twice.
Merge pull request #1804 from AlexanderFabisch/fix_example_path
Cosmit: remove needless blank lines.
Cosmit: more idiomatic way of clipping to zero.
Demystify magic values in NNLS implementation.
BUG: fix replacement for _neg.
Fix random state where appropriate.
Fix doctest.
DOC: document attributes fitted by DictVectorizer.
DOC: put feature extraction before pre-processing.
COSMIT: better notation in CountVectorizer.
COSMIT: same changes in transform method.
COSMIT: more robust condition in inverse_transform.
Import gzip and bz2 only if necessary.
Move balance_weights out of preprocessing.
Add categorical_features option to OneHotEncoder.
Support both masks and arrays of indices.
Typo.
Rename _apply_transform to _transform_selected and make it a function
Merge branch 'master' of github.com:scikit-learn/scikit-learn into select_categorical
Address @jnothman's comments.
Test exception is raised when number of targets and penalties don't
Simplify ridge solvers (ongoing work).
Extract sparse_cg and lsqr solvers.
Extract dense_cholesky solver (linear case).
Extract dense_cholesky solver (kernel case).
Clean up.
Extract SVD-based solver.
Clean ups.
Remove copy option.
Cosmit in docstring.
What's new.
Remove if statement.
Cosmit.
Fix failures in grid search.
Do not set sample_weights unless need to.
Add warning when fall back to other solver.
Remove unused variable.
Fix failure in svd-based ridge solver w/ old numpy.
BUG: replace elif by if in Ridge solver selection.
Add fit_transform to FastICA.
Add inverse_transform to FastICA.
Add docstrings to methods in FastICA.
Address @dengemann's comments.
Add test.
Push failing test.
Merge pull request #2229 from larsmans/kernel-center-narrative
BUG: FIX Crammer-Singer formulation in the binary case.
Better test for auc_score.
Optim for Crammer-Singer formulation in binary case.
Completely avoid for loop in _auc.
Typo.
Update my URL.
Better docstring for KMeans.predict.
Move estimate_bandwidth test to its own function.
Add predict method to MeanShift.
Add predict method to AffinityPropagation.
Add missing docstrings to test functions.
Add what's new item.
Remove warning in AffinityPropagation.
Fix grid search test.
More user-friendly error message.
COSMIT: change variable name.
Merge pull request #2368 from emsrc/cosine_distance
Use pairwise_distances_argmin_min function.
Use pairwise_distances_argmin_min in examples.
Add pairwise_distances_argmin.
Add tests for pairwise_distances_argmin.
Merge pull request #2410 from Balu-Varanasi/pep8_fixes
Merge pull request #2411 from Balu-Varanasi/remove_unused_import
Merge pull request #2415 from kemaleren/skip_checkerboard
Add what's new entry for pairwise_distances_argmin_min.
Cosmit: move log_loss.
Merge pull request #2558 from Jorge-C/patch-1
Test correctness of average_precision_score.
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Remove some warnings.
Add docstrings to _auc and _average_precision.
BUG: ValueError was assigned as local variable!
COSMIT: no need for parentheses.
More robust unit tests for fast_dot.
Remove warnings in ProbabilisticPCA tests.
One less warning in QDA tests.
Simplify fast_dot.
Remove print statement.
Add lobpcg in possible solvers.
Fix fragile symmetry check.
Fix fragile test.
Fix typo in test.
Merge pull request #2659 from jakevdp/gaussiannb_speedup
Handle n_features < n_informative case in make_regression.
Merge pull request #2743 from eltermann/doc_fix
Test primal-dual relationship in Ridge.
More robust test for sample_weight in Ridge.
Add mean_absolute_error to scorers.
Merge branch 'master' of https://github.com/scikit-learn/scikit-learn
Fix doctest.
Rename _split_with_kernel to _safe_split.
Rename fit_and_score to _fit_and_score.
Merge pull request #2762 from sergiopasra/fedora-package
Merge pull request #2761 from ogrisel/thanks
BUG: dual_coef must be defined inside the try block.
Merge pull request #2789 from ugurthemaster/patch-2
Merge pull request #2827 from amueller/fix_subgradient_equation
Cosmit in SGD doc.
Merge pull request #2852 from jwkvam/adaboost_test
Add test for _safe_split with pre-computed kernel.
Fix test failure on Python3.
Merge pull request #3055 from ajschumacher/patch-3
Merge pull request #3365 from fabianp/master
Fix wrong attribute doc in BaggingRegressor.
Merge pull request #3667 from ajschumacher/patch-6
Wording.
Fix typo.
Merge pull request #3713 from akshayah3/Boost
Fix sparse_cg solver when max_iter is specified.
Add random forests to document classification example.
Typo
Merge pull request #4047 from agramfort/faster_sparse_enet_dual_gap
Update lightning information.
Update seqlearn information.
Remove singular matrix warning.
Remove unused imports.
Merge pull request #4117 from chyikwei/fix_import_err
Kernelized ridge regression -> Kernel ridge regression
Fix error in CCA docstring.
Handle sample_weight upfront.
Simplify rescaling.
Use relative imports.
Add what's new entry.
Support sparse matrices in KernelRidge.
Merge pull request #4713 from amueller/ridge_cv_nonmutable_alphas
Merge pull request #4733 from amueller/fix_csc_random_sampling
Typo.
Add API paper.
Reference ECML PKDD workshop paper.
Merge pull request #4851 from trevorstephens/ridge_no_copy
Merge pull request #4891 from MechCoder/lrcv_bug
chmod -x
Remove shebang and utf-8 header.
Add link to ETL.
Matt Krump (2):
Don't allow negative values for inverse transform, and raise error similar to transform for new labels
Added test
Matt Pico (1):
DOC on negation of loss functions used as scoring parameters
Matt Terry (1):
FeatureUnion example using a heterogeneous datasource.
Matteo Visconti dOC (18):
FIX: ward_tree now returns children in the same order for both
Add forgotten newline
Cosmetic change in test
Minor changes: float conversion and rename rnd into rng
Merge remote-tracking branch 'upstream/master' into wtree_children
ward_tree returns heights (distances) between the clusters
ENH: `ward_tree` returns distances
changed xrange to range for compatibility with Python 3
TST, test ward_tree_distance on known dataset
FIX: structured and unstructured ward_tree return children sorted in the
Cosmetic change in test_hierarchical, better documentation
Add ENH description in whats_new.rst
Add better description of distances, cosmetic changes
Fix git rebase problems
Add forgotten IFs
Clean code
Clean code
DOC: fix correct shape of children and distance
Matthew Brett (1):
ENH: use setuptools for bdist_wheel command
Matthias Ekman (1):
ENH: add pre_dispatch option to cross_val_score
Matthias Feurer (1):
ENH: make trees work with very small max_features.
Matthieu Brucher (1):
Fixed a typo
Matthieu Perrot (25):
ENH: optional computing of estimated covariance of LDA classifier.
MISC: add an unfinished toy example to compare LDA with a (not yet implemented) QDA.
Merge branch 'master' of http://github.com/GaelVaroquaux/scikit-learn
BUG: Fixed example after last API changes
BUG: add missing call to pylab show function
BUG: Fixed pipeline feature selection example after last API changes
MISC: lda: Y -> y
Merge branch 'master' of http://github.com/GaelVaroquaux/scikit-learn
BUG: Fixed example after last API changes
BUG: add missing call to pylab show function
BUG: Fixed pipeline feature selection example after last API changes
ENH: add QDA classifier, some docs, examples and tests. LDA has been reworked a bit to follow the API of QDA and avoid useless operations.
Merge branch 'master' of http://github.com/GaelVaroquaux/scikit-learn
Merge branch 'master' of git://scikit-learn.git.sourceforge.net/gitroot/scikit-learn/scikit-learn
ENH: optional computing of estimated covariance of LDA classifier.
MISC: add an unfinished toy example to compare LDA with a (not yet implemented) QDA.
MISC: lda: Y -> y
ENH: add QDA classifier, some docs, examples and tests. LDA has been reworked a bit to follow the API of QDA and avoid useless operations.
Merge branch 'master' of git://scikit-learn.git.sourceforge.net/gitroot/scikit-learn/scikit-learn
cosmit in LDA/QDA
MISC: vectorize priors computation for LDA and QDA
Merge branch 'master' of ssh://revilyo@scikit-learn.git.sourceforge.net/gitroot/scikit-learn/scikit-learn
MISC: remove debug
Merge branch 'lda' of https://github.com/mblondel/scikit-learn into discriminant_analysis
re-add self.means_
Matti Lyra (5):
Fixed an issue where CountVectorizer.decode leaves file pointers open after reading the contents of the file. This produces unpredictable behaviour as closing the file pointer is left to the implementation of the python interpreter.
Changed the CountVectorizer charset default back to 'utf-8' instead of 'utf8'. This was due to debugging on my local machine.
Fixed issue 3815. Discrete AdaBoostClassifier now fails early if the base classifier is worse than random.
Added the option of passing in a sparse X matrix into decision function, plus tests for sparse for all prediction functions.
Changed predict_log_proba so that it accepts sparse matrices.
Max Linke (3):
DOC: kmeans runs inits in parallel, not distance computations
MAINT clean up flake8 complaints in k-means
ENH: Precompute distances only if overhead is below 100MB
Maxim Kolganov (2):
remove useless |samples| variable out of ParameterSampler.__iter__()
replace itertools.repeat with six.moves.range
Mehdi Cherti (1):
Add predict_log_proba method to GradientBoostingClassifier
Meng Xinfan (2):
Update the docstring to reflect the package name changes.
fix an error in naive bayes docs
Michael Becker (3):
TST: Switch from python 3.3 to 3.4 in travis
TruncatedSVD: Calculate explained variance.
Authors: Update based on #3067
Michael Bommarito (51):
Adding string={pearson, spearman} option to increasing argument in IsotonicRegression.
Adding increasing and decreasing tests for both Pearson and Spearman increasing argument options
Refactoring increasing_bool set into _check_increasing method
PEP8ing isotonic tests
PEPing isotonic regression
Docstring style changes
Change arguments to increasing={'auto', True, False} and default to 'auto'; implement Fisher transform and warning on 0 \in CI
Adding test for CI check and removing Spearman/Pearson-specific tests.
Minor docstring cleanup
Minor docstring cleanup
Docstring fix
Replacing the non-test .todense() methods with .toarray()
Replacing the test .todense() methods with .toarray()
Improving tests to affirm no CI warnings are thrown when appropriate
Matching CI calculation and docstring with reference
Fixing docstring
Refactoring check_increasing, ensuring that increasing_ is only set on fit/fit_transform, and fixing rho \in {-1, +1}
Reorganizing tests to isolate check_increasing from increasing='auto'
Fixing redundant np.asarray(X.toarray()) in metrics
Improving tests based on feedback from @GaelVaroquaux
Improving tests based on feedback from @GaelVaroquaux
Fixing matrix vs. vector notation for X
Fixing space in docstring for default value
Adding check_increasing to classes.rst
Adding no-warning assertions to IR auto tests
Additional .todense() -> .toarray() fixes
Fixing small formatting issue in doctest
Fixing classes.rst
Adding Notes section to img_to_graph and grid_to_graph re: np.matrix->np.ndarray
Adding what's new item for sklearn.feature_extraction.image np.ndarray changes
Switching from np to math for scalar float ops
Merge remote-tracking branch 'upstream/master' into isotonic-increasing-auto
Additional docstring fixes for issue #3167
Merge branch 'master' of https://github.com/scikit-learn/scikit-learn
Adding tests to handle out_of_bounds parameter.
Adding out_of_bounds parameter to handle values outside training domain
Adding test for out_of_bounds argument validation
Reworking .transform conditionals for out_of_bounds argument
PEP8 fix
ENH Dense pipeline support for RandomTreesEmbedding via sparse_output param
Refactoring, small improvements, and cleaning notation of IsotonicRegression
Changing docs from "vector" to "1d array"
Reverting array X to vector x rename
Docstring addition and x->X
Incorporating @ogrisel's comments for simplification
More simplifications
Fixing simplification for clip
Adding test coverage for weight deprecation in IsotonicRegression and isotonic_regression
Improving exception and warning coverage
Adding unit test to cover ties/duplicate x values in Isotonic Regression re: issue #4184
Adding fix for issue #4297, isotonic infinite loop
Michael Eickenberg (32):
fixed the function definition of cross_val_score
changed cross_val_score doc again
Added strided patch extractor to feature_extraction/image. Extracts patches 16x faster on the MiniBatchDictionaryLearning example
Now added extract_patches for random extraction as well
Now replaced max_patches part by fancy indexing
removed stuff i commented out
testing for correct output shapes and patch content of the last patch for 1 to 3 dimensional arrays
Changes in documentation and notation
ridge multi target with individual penalties written. To be tested
old tests passing
new multiple target tests added, functionality confined to direct usage of ridge_regression function
Ridge estimator works with individual penalties
test for ridge estimator
ridge doc string
ValueError for wrong shaped input instead of assertion failure, in order for sklearn/tests/test_common.py, line 238 to pass
docstring in Ridge estimator
added individual penalties function for all other solvers. Tests passing for all of them
always make alpha into an array
updated tests
tests passing
removed elaborate testing in ridge.fit, not necessary anymore
simplified _solve_svd
TST safe_asarray for dok_matrix and lil_matrix
Now capable of treating sample_weights in feature space
Ridge regression now can use sample_weights in feature space. Summary commit over around 20 commits to avoid failing tests
updated authors
failing test for wrong solver exception
Added raise statement in ridge_regression solver check
FIX Exact inverse transform for whitened PCA
TST test whitened inverse
ENH components of unit length and whitening done by scaling with sqrt of explained variance
DOC What's new
Michael Hanke (1):
BF: load_boston() return 13 features, but 14 feature names
Michael Heilman (2):
fixed reverse sorting issue in label_binarize()
added an additional test case in test_label.test_label_binarize_with_class_order()
Michael Patterson (2):
Changed the number of features from 10 to variable
Now plots based on the number of features in X
Michal Romaniuk (1):
Enable grid search with classifiers that may throw an error on individual fits.
Michele Orrù (1):
Typo.
Mikhail Korobov (21):
P3K fix incorrect import
P3K: division should produce integer.
PY3 array.array wants str in Python 2.x and 3.x - give it a str
Update outdated comments in sklearn.hmm.
PY3: fix exception syntax in tests/test_common.py
PY3 fix test_cross_validation
PY3 fix OneHotEncoder doctest ( "<type 'float'>" is "<class 'float'>" in Python 3.x)
PY3 fix metaclasses. See #1829.
ENH speed improvements in HMM
TST Fixed test_pipeline_methods_preprocessing_svm: pca was unused
Fixed typo in metrics.py
TST "confusion_matrix" was a duplicated key in CLASSIFICATION_METRICS and NOT_SYMMETRIC_METRICS dicts
TST style fix: old_error_settings should be outside try-finally block to be safely used in finally statement
FIX classification_report shouldn't fail on unicode labels in Python 2.x
FIX parallel FeatureUnion.fit and fit_transform
TST skip `test_k_means_plus_plus_init_2_jobs` on Mac OS X 10.9. See GH-636
TST fix sklearn.ensemble.tests.test_bagging.test_parallel
TST move Mac OS checking utility to sklearn.utils.testing
TST skip SparsePCA n_jobs!=1 test on Mac OS X because it hangs
TST remove n_jobs=-1 usages in tests
DOC mention that train_test_split returns a random split
Minwoo Jake Lee (2):
Merge remote-tracking branch 'master/master' into sparse-mbkm
moved _gen_even_slices to utils/init
Miroslav Batchkarov (5):
fixed the __repr__ method of cross_validation.Bootstrap, which failed if self.random_state is None
[WIP] preliminary implementation of stratified train_test_split
fixed issue when train_size < test_size for stratified
using StratifiedShuffleSplit instead of StratifiedKFold
incorporated amueller's feedback
Miroslav Shubernetskiy (1):
PY3 allow multiple base classes in six.with_metaclass
Nantas Nardelli (2):
Added contributing subsection and fixed link to web doc
No whitespaces at the end of file
Naoki Orii (1):
FIX issue #1457 KNeighbors should test that n_samples > 0
Nelle Varoquaux (224):
First draft of the mini batch KMeans - works, but a lot of cleaning up to do
Refactored: deleted the batch_k_means function, and created an option for the batch_k_means to avoid code duplication - Added some documentation
Added test one the batch k_means
Improve documentation
Batch K-Means
[batch k-means] Changed the algorithm to compute the centroids.
[batch k-means] Fixed the computation of the batch kmeans centroids
[MiniBatchKMeans] Starting refactoring code after the review
[MiniBatchKMeans] Small fixes
Merge branch 'master' into batchKMeans
[MiniBatchKMeans] Small fix in the initialisation for the random initialisation of the centroids
[MiniBatchKMeans] Fixed the tests for the new API
[BatchKMeans] Small fixes following Olivier & Gael's review
Merge remote branch 'scikit/master' into batchKMeans
[MiniBatchKMeans] Removed the unnecessary import in examples/cluster/mini_batch_kmeans.py
[MiniBatchKMeans] Now checks the validity of the data only when initializing the centroids. When the data is empty, return immediately
Merge with Olivier's branch
[MiniBatchKMeans] Documentation fixes
[MiniBatchKMeans] Added a benchmark
[MiniBatchKMeans] Added chart showing the speed and the inertia / total number of points depending on the chunk size and number of iteration
merge with master
[MiniBatchKMeans] PEP8 Compliance
[MiniBatchKMeans] Fixed typo in attribute: cluster_centers_
[MiniBatchKMeans] Added some documentation and example
[MiniBatchKMeans] PEP8 compliance
[MiniBatchKMeans] Added a fit method to the MiniBatchKMeans
Merge branch 'master' into batchKMeans
Merge branch 'master' into batchKMeans
[MiniBatchKMeans] PEP8 compliance and small fixed
Trailing white space
[MiniBatchKMeans] Small fixes
[MiniBatchKMeans] Added an example
[MiniBatchKMeans] Updated the example to compare BatchKMeans and MiniBatchKMeans - added the copy_x option to the BatchKMeans
[MiniBatchKMeans] Minor modifications on the examples
[MiniBatchKMeans] Added labels and scaled the axis properly on the benchmark plot
merge with master
Merge remote branch 'gael/batchKMeans' into batchKMeans
FIX the IRC chan used is scikit-learn, and not learn
FIX - error in the bibtex entry - extra comma that makes bibtex fail
closes #677 - improved affinity propagation docstrings
closes #703 - KFold has now an option to shuffle the data
Added unit test for shuffle option in KFold
Now tests the randomness of the KFolds when shuffle is True, and that all indices are returned in the different test folds
Updated mailmap
Updated mailmap (bis)
Added Pool Adjacent Violator
SMACOF algorithm for MDS
Added tests and documentation to the smacof algorithm
PAV now uses Kruskal's first approach to ties
Added a new dataset: traveling distances between 17 cities in france
MDS now computes the SMACOF algorithm several times, and returns the results with the lowest stress
Added documentation on MDS
MDS can now run several jobs in parallel thanks to joblib - when initial array passed, MDS will also only run once. If n_init is not set to 1, it will raise a warning
FIX mds tests were failing because of an interface change
Added docstrings to MDS
Cleaned up MDS's documentation
Added more documentation on the cities dataset
Fix errors due to previous refactoring on MDS
Changed dataset from france's mileage to knuth's USA mileage dataset
Replaced MDS US mileage distance example by a generated, more representative one
Added paragraphs on metric and nonmetric MDS, explaining the difference
MDS: out_dim → n_components
MDS: added documentation for n_jobs parameter
MDS - fixed some latex error in the documentation
Added a fit_transform method to the MDS class
Pool Adjacent Violators now does a max_iter number of iteration
DOC: added references to papers and licence - fixed the MDS example
a += a.T is different from a = a + a.T
Small explanation on the plot_mds example
np.diag raised a red flag - used broadcasting instead
Set the seed of the random_state generators to have nicely aligned results
Knuth load_cities dataset isn't used anymore
MDS: renamed positions_ to embedding_
Added MDS to manifold comparison methods
MDS: documentation fixes
FIX: load_cities doesn't exist anymore
Added test to sklearn.utils.bench's total_seconds method
FIX - the eps option of the MDS was overwritten
FIX in the makefile - we should delete pyc and so only from the source code, and not from everything in the root folder
Deprecated sparse classes from the SVM module - refs #1093
FIX sparse OneClassSVM was using the wrong parameter
FIX the AP was using a deprecated parameter
Decrease the number of convit in the AP
Renamed parameter convit to convergence_iteration and deprecated the old API
FIX typo in deprecation warning in the AP module
DOC better documentation on the AP
FIX The new parameter of the AP is called convergence_iter and not convergence_iteration anymore
ENH: Isotonic regression
MDS is now using the new isotonic_regression submodule
Added tests to isotonic_regression
DOC - added paragraph in user documentation on the isotonic regression + an example plot.
More documentation
FIX IsotonicRegression only takes vector input, hence don't test it in the common estimators
ENH IsotonicRegression now uses variable names that have more than 3 letters
ENH better error messages on the IsotonicRegression
Added a predict method to the IsotonicRegression
FIX random_state in MDS was not initialized properly
ENH isotonic regression is now slightly more robust to noise
Added test to check whether the isotonic regression changed y when all ranks were equal
ENH uses the IsotonicRegression classifier instead of the method
FIX the mds example did not plot the NMDS
FIX - nmds now uses the same scaling as previously
ENH we require a version of sphinx sufficient for "new" numpy_ext to work
FIX instead of appending numpy_doc to the list of extensions, directly add when creating the list
DOC: small fix in the regression's score method documentation
FIX make_classification now outputs integer labels
DOC formatting (k_means)
ENH - 3x speedup in the isotonic regression
FIX gen_rst.py was something using an undefined variable
Merge pull request #1886 from NelleV/DOX_fix
Added sponsors to the about.rst page
Spelling mistake
DOC fix in the hierarchical clustering
DOC Acknowledge sponsors for the Paris sprint
DOC fixed small mistakes in the pls module
Merge pull request #2140 from arjoly/ajoly-glouppe-sponsor
DOC fix small mistakes
DOC fixed some formatting in kernel approximation
DOC fixed some formatting in the multiclass module
Merge pull request #2146 from ianozsvald/clearer_iris_decision_surfaces
Merge pull request #2163 from ianozsvald/fix_plot_forest_iris_docs
ENH better error message when estimators don't specify their parameters in the signature.
Merge pull request #2187 from FedericoV/non_negative_style
Merge pull request #2195 from erg/bug-2189
ENH added an option to do an isotonic regression on decreasing functions
TEST: added a small test for fitting an isotonic regression on a decreasing function
TEST tests the class instead of the function for the decreasing isotonic regression
MAINT moved the pls file based module to a folder
TEST fixing pls tests failing:
MAINT Move the pls to the cca to a cross_decomposition module
MAINT renamed pls to cross_decomposition in the documentation
FIX the example plots of the pls module did not import pls methods from the correct module
FIX removed the cca and pls modules
FIX added the new module to the setup.py installation
DOC improved docs/docstrings on cross_decomposition
MAINT deprecated the pls module, moved CCA to cca_
FIX init methods of ABCMeta class also need to be abstract
FIX on py3k, we need explicit relative imports
FIX missing deprecation release information.
MAINT charset is deprecated in favor of encoding
TST added tests for encoding/charset deprecation
DOC better deprecation warning messages.
TST better testing of the PLS module
FIX PLSSVD now returns the correct number of components
COSMIT small documentation tweaks
DOC ignoring gen_rst's parsing errors
Merge pull request #2280 from larsmans/randomsearch-scoring
Merge pull request #2281 from ogrisel/improvements-to-setup-py
DOC fixed the optional arguments
FIX added some descriptions to each categories in the main webpage
FIX spelling mistake
FIX the css in the API
ENH added the fork me ribbon to the website
WEB added testimonials
DOC fixed the previous/next button
DOC fixed the collapsible sidebar
DOC dropdown menu works
FIX minor edits on the website
DOC fixed z-index on the website
FIX website layout on small screens
FIX improve display on small device
DOC fix dropdown menu
FIX backward compatibility was broken
Merge pull request #2324 from arjoly/missing-contributions
DOC added link from banner to example.
DOC now building to html/stable
DOC home always points to stable
Merge pull request #2338 from agramfort/pca_cleanup
ENH added an orange cite us button on the front page
FIX cite us button made blue bar span too much
DOC added testimonials
FIX forgot evernote's logo
ENH added telecom to the testimonials
DOC updated evernote's testimonials
ENH added AWeber's testimonial
ENH added carousel back on front page for testimonials
ENH better spacing on the first page
ENH testimonials img are now centered.
FIX typo in testimonials
FIX spelling mistakes and whitespace nitpick
PEP8 fixes on hierarchical.py
ENH improved the documentation of the fix_connectivity function
FIX deprecated the copy arguments in hierarchical clustering
FIX a convert on list was applied twice on the inertia matrix
DOC added the docstring to linkage_tree and AgglomerationClustering
ENH Removed copy option and deprecation on new functions and classes
FIX spelling mistake
MAINT deprecated ward class
TEST test that ran with ward linkage now also run with average and complete linkage
DOC small fixes on the hierarchical clustering
DOC improve narrative docs on hierarchical clustering
TEST FeatureAgglomeration does not behave like normal clustering
ENH AgglomerativeClustering now supports different metrics
DOC/TEST improved failing errors and docstrings on metric
ENH now used instead of euclidean distance to gain speed
DOC/TEST improved doc and tests on the paired distances
FIX spelling mistakes and whitespace nitpick
PEP8 fixes on hierarchical.py
ENH improved the documentation of the fix_connectivity function
FIX deprecated the copy arguments in hierarchical clustering
FIX a convert on list was applied twice on the inertia matrix
DOC added the docstring to linkage_tree and AgglomerationClustering
ENH Removed copy option and deprecation on new functions and classes
FIX spelling mistake
MAINT deprecated ward class
TEST test that ran with ward linkage now also run with average and complete linkage
DOC small fixes on the hierarchical clustering
DOC improve narrative docs on hierarchical clustering
TEST FeatureAgglomeration does not behave like normal clustering
ENH AgglomerativeClustering now supports different metrics
DOC/TEST improved failing errors and docstrings on metric
ENH now used instead of euclidean distance to gain speed
DOC/TEST improved doc and tests on the paired distances
Merge master in hc_linkage
DOC Clarified the doc of the hierarchical clustering
FIX precomputed distances on the hierarchical clustering
ENH callable metrics now work for the hierarchical clustering
FIX the option affinity wasn't used in the hierarchical clustering
DOC metrics and hierarchical clustering
FIX verbosity of the mds
Merge pull request #2831 from eltermann/doc-typo
Merge pull request #2909 from ogrisel/peerindex-testimonial
DOC updated installation documentation
Merge pull request #4420 from amueller/cca_low_rank
Nick Wilson (7):
DOC: Various minor fixes to "Contributing" docs
Skip k-means parallel test on Mac OS X Lion (10.7)
FIX: Delete temporary cache directory
BUG: Fix metrics.aux() w/ duplicate values
FIX: Add NORMALIZE_WHITESPACE to broken doctest
Stop passing keyword arguments for positional args
Add verbose parameter to SVMs (fixes #250)
Nicola Montecchio (1):
unused variables are redefined later
Nicolas (1):
FIX make StandardScaler & scale more numerically stable
Nicolas Pinto (34):
MISC: cosmetic -- setup.py is now pep8 safe
MISC: cosmetic -- cross_val.py is now pep8 safe
MISC: cosmetic -- fastica.py is now pep8 safe
MISC: cosmetic -- pca.py is now pep8 safe
MISC: cosmetic -- scikits/learn/setup.py is now pep8 safe
MISC: cosmetic -- pls.py is now (almost) pep8 safe
MISC: cosmetic -- hmm.py is now pep8 safe (getting tiring, next time I'll show up earlier at the sprint ;-)
MISC: cosmetic -- base.py is now pep8 safe
MISC: cosmetic -- grid_search.py is now pep8 safe
MISC: cosmetic -- grid_search.py is now pep8 safe
MISC: cosmetic -- more pep8
MISC: cosmetic -- setup.py is now pep8 safe
MISC: cosmetic -- cross_val.py is now pep8 safe
MISC: cosmetic -- fastica.py is now pep8 safe
MISC: cosmetic -- pca.py is now pep8 safe
MISC: cosmetic -- scikits/learn/setup.py is now pep8 safe
MISC: cosmetic -- pls.py is now (almost) pep8 safe
MISC: cosmetic -- hmm.py is now pep8 safe (getting tiring, next time I'll show up earlier at the sprint ;-)
MISC: cosmetic -- base.py is now pep8 safe
MISC: cosmetic -- grid_search.py is now pep8 safe
MISC: cosmetic -- grid_search.py is now pep8 safe
MISC: cosmetic -- more pep8
Fix typo in SGDClassifier's docstring (via GitHub).
Add arXiv link to Halko et al. 2009 paper.
DOC: fix a few incoherencies in ridge.py
ENH: add verbose option to LinearSVC
BUG: fix LibLinear verbosity for L2R_L2_SVC
MISC: verbose should be int, not bool
TST: add smoke test for LinearSVC's verbose option
ENH: add store_loo_values attribute to _RidgeGCV see Issue #957
FIX: expose loo_values_ in RidgeCV instead of the private _RidgeGCV
COSMIT: rename M matrix to loo_values
COSMIT: -loo_values +cv_values
FIX: use rng with fixed seed
Nicolas Trésegnie (38):
DOC fix macports package name
Add test for PatchExtractor (float value for max_patches)
Fix float value support for max_patches in PatchExtractor
Fix as_float_array behaviour when copy=True
Add test of the as_float_array behaviour when copy=True
Add a copy parameter to safe_asarray()
Imp readability
Missing value imputation
Fix tests
Fix tests + doc improvements + renaming
Add test with default value of copy + doc improvements
Imp readability
Fix use of as_float_array
pep8
Imp variables names
Del use of as_float_array + naming and documentation improvements
Fix use of mask
Fix import names
Add pycharm files in .gitignore
Imp splitting of preprocessing.py
Imp splitting of test_preprocessing.py
Del unused imports in preprocessing + pep8
Fix imports
Imp move OneHotEncoder to preprocessing/data.py
pyflakes and pep8
Fix self.statistics_ shouldn't be set if axis==1
Fix use of self
Refactor loss_func and score_func warnings in grid_search
Add score_overrides_loss to _deprecate_loss_and_score_funcs
Add deprecation warnings in Ridge
Add deprecation warnings in rfe
Add catching of the deprecation warnings in rfe and ridge tests
Refactor loss_func and score_func warnings in cross_validation + replacement in two examples
Fix 'scoring' docstrings
Imp documentation
Fix tests
Fix grid_search.py example
Fix tests
Nikolay Mayorov (6):
FIX Implemented correct handling of multilabel y in cross_val_score
FIX bug with set_params on minkowski 'p'.
Fixed a bug in RFECV when step != 1
Added multithreading support for kneighbors search
Explicit encoding for opened files in gen_rst.py
Support Python 2 and 3
Noel Dawe (187):
adding boosting and decision trees
adding bagging and gradboost
minor change
working on interfacing with Cython
minor updates
Merge branch 'master' of git://github.com/scikit-learn/scikit-learn
pull from upstream
Merge branch 'master' of git://github.com/scikit-learn/scikit-learn
implemented AdaBoost
refactoring
minor fix
minor fix
almost done...
it compiles\!
now it really compiles
minor fix
working on segfault
now it works
trying to fix score bounds
updates
Merge branch 'master' of git://github.com/scikit-learn/scikit-learn
sanity check in adaboost
more sanity checks in adaboost
fairly stable now
fixed bug where node cuts were not set but left at 0
working on limiting cases
updates
fixing bug in adaboost
Merge branch 'master' of git://github.com/scikit-learn/scikit-learn
updates
minor change
Merge branch 'master' of git://github.com/scikit-learn/scikit-learn
bagging now implemented
removing committee for now
updates
adding tests
better demonstration in test module
minor change
bugfix
Merge branch 'master' of git://github.com/scikit-learn/scikit-learn
minor change
pep8
Merge branch 'master' of git://github.com/scikit-learn/scikit-learn into decisiontree
updates
ignore splits that yield nodes with net negative weight in find_best_split
rm unneeded negative weight logic in Criterion.init_value and Gini.eval
add note about negative weight treatment in BaseDecisionTree.fit
add negative weights test (currently fails): predict_proba should still be valid probabilities
FIX: negative weight test. do not allow any class to have negative weight after a split
DOC: document negative weight treatment in the case of classification
implement AdaBoost
use weighted mean in ClassifierMixin.score
FIX: DecisionTreeRegressor.score
FIX: import not used
FIX: overlapping y-axis labels
FIX: use generator instead of np.random
rm doctest in make_gaussian_quantiles
fix variable naming in weight_boosting
FIX: TypeError for regressor
FIX minor comment
FIX: docs, code clean up, learn_rate -> learning_rate
FIX: plot_adaboost_classification.py
don't enforce DTYPE at the ensemble level
DOCS: note generator behaviour in staged methods
Make BaseWeightBoosting abstract and other misc changes
revert changes to grid_search
FIX: import
revert implementation of sample weights in BaseWeightBoosting.staged_score
revert a few spurious changes
pep8 + pyflakes, use arrays for errors_ and weights_
init weights_ to zeros and errors_ to ones
add Hastie 10.2 example
pep8
implement SAMME.R algorithm
update adaboost hastie example and weight_boosting tests
use broadcasting
combine real and discrete algorithms under one class
DOC: AdaBoostClassifier real arg
update example: fix histogram range
Merge pull request #20 from glouppe/adaboost
Merge pull request #21 from glouppe/adaboost
update adaboost example: exposes instability
displace predict_proba by 1e-10
Merge pull request #22 from glouppe/adaboost
FIX: adaboost predict_proba
only boost positive sample weights
FIX: only boost positive sample weights
Merge pull request #23 from glouppe/adaboost
FIX: negative and zero probabilities while boosting with SAMME.R
FIX: doctest
FIX: doctest and slightly larger displacement from zero probabilities (32 vs 64bit doctest instability)
remove weighted_r2_score (leave for next PR scikit-learn#1574)
revert spurious change in metrics.py
FIX: use full decision tree in AdaBoost and fix title in plot_forest_iris.py
DOC: add __doc__ to plot_adaboost_hastie_10_2.py
FIX: reference format
FIX: show decision boundary in plot_adaboost_classification.py
FIX: refactor plot_adaboost_classification.py and add legend
rename plot_adaboost_classification.py -> plot_adaboost_twoclass.py and add predict_twoclass method to AdaBoostClassifier
FIX: only possible split sometimes creating children with negative or zero weight in the presence of negative sample weights
FIX: improve multi-class AdaBoost example (rename to plot_adaboost_multiclass.py)
add author
typo
use metrics module and pep8
typo
fix class ordering in two-class
faster sample_weight initialization
speed improvements to make_gaussian_quantiles
even more speed improvements to make_gaussian_quantiles
py3k
DOC: note initialization of sample_weight if None
factorize common sample_weight check
Merge pull request #24 from glouppe/adaboost
add decision_function and staged_decision_function and refactor some code
Merge remote-tracking branch 'upstream/master' into treeweights
Merge pull request #25 from glouppe/adaboost
pep8
Merge pull request #26 from glouppe/adaboost
update adaboost regression example and use estimator_errors_
rm n_estimators argument from predict methods
DOC: fix docstring for make_gaussian_quantiles
FIX: alpha=.5 and use more difficult dataset in two-class example. Add mean and cov arguments to make_gaussian_quantiles
FIX: learning_rate default value consistency
FIX: TypeError message if base_estimator does not support class probabilities
FIX: comments from @ogrisel
make learning_rate=1 default for classification
only sum sample_weight once
rm sphinx/docutils formatting in exception messages
inline comment about learning_rate in hastie example
add note about SAMME.R converging faster than SAMME
add note about y coding construction
add description of dataset in two-class example
fix missing parenthesis in make_hastie_10_2 dataset
Merge pull request #27 from glouppe/adaboost
import pylab as pl
remove check for fit_predict
fix importance test and test both SAMME and SAMME.R algs
don't show class B probabilities in two-class example
two-class decision scores -> decision scores
clarification on two-class decision scores plot
explain decision scores in two-class example
fix AdaBoost.R2 and update example
DOC: loss_function
fix failing tests
fix failing doctest
Merge pull request #28 from glouppe/adaboost
API consistency with gradient boosting: loss_function -> loss
Merge pull request #29 from glouppe/adaboost
minor edits in docs
DOC: notes about examples and minor edits
make setup.py executable
AdaBoost: use estimator weights in predict_proba
DOC: missing documentation of splitter parameter in tree.py
tree export_graphviz: remove unused close parameter and close the file if out_file is a string
AdaBoostRegressor: fix redundant recalculation of error_vect.max()
plot_adaboost_multiclass.py: handle case where boosting terminated early. Add missing author on other boosting examples.
xrange -> range
add sample_weight to base score and weight_boosting staged_score
weight_boosting: unneeded np.copy
weight_boosting: include sample_weight in test_staged_predict
metrics: add sample_weight support
rm (default=None)
require sample_weight support for binary_metric
newline
atleast_2d.reshape -> reshape
weighted metrics: fix sample_weight handling for average=samples
format
metrics tests
doc: fix default
weighted metrics tests fixes
np.sum(np.multiply( -> np.dot(
update whats_new.rst
add test_base.test_score_sample_weight
sample_weight metrics tests: add missing micro and macro averaging for precision, recall, and f-score
tree: add min_weight_fraction_leaf
forest: add min_weight_fraction_leaf
min_weight_fraction_leaf and min_samples_leaf: test both best-first and depth-first tree building
gbrt: min_weight_fraction_leaf value test
tree: use min_weight_leaf internally while exposing min_weight_fraction_leaf
utils.testing: add assert_greater_equal and assert_less_equal
forest: test min_samples_leaf and min_weight_fraction_leaf
min_weight_fraction: comments and pep8
test assert_greater_equal and assert_less_equal
min_weight_fraction_leaf: narrative docs
min_weight_fraction_leaf: narrative doc update
scorer: add sample_weight support
plot_adaboost_twoclass.py: minor improvements
Norbert Crombach (1):
Fix L2 regularization order in sgd_fast
Okal Billy (2):
bug fix for issue #3526
bug fix for t-SNE (issue #3526) with new inputs
Olivier Grisel (2033):
test to reproduce issue 67 on LARS coef shape
Merge branch 'master' into issue-67-LARS-shape
tracking changes from master
follow API change in LARS
Merge branch 'master' into issue-67-LARS-shape
Merge branch 'master' into issue-67-LARS-shape
make sparse coding test pass
more .gitignore
Merge branch 'master' of ssh://scikit-learn.git.sourceforge.net/gitroot/scikit-learn/scikit-learn
started work on document classification: bag of words extraction and hashed tfidf
some tests for the text features extractor
checkpointing work in progress on MLComp dataset integration
remove labels handling from vectorizer code
more work on document classification dataset loader
smaller default dim: faster to load by default, need experimental setting to find good tradeoff
make it easy to find the raw source document
better parameter ordering
example usage of MLComp document classification datasets
use compiled re pattern
small fixes
Merge branches 'master' and 'master' of ssh://scikit-learn.git.sourceforge.net/gitroot/scikit-learn/scikit-learn
typos
add the ability to use stop words for text classification, but does not improve accuracy hence not enabled by default
typo in comments
faster and better accuracy with hinge loss of doc classif example but not sparse anymore since l2 reg...
make the features package a first class citizen
Merge branch 'master' of ssh://scikit-learn.git.sourceforge.net/gitroot/scikit-learn/scikit-learn
ENH: more efficient stopping criterion coordinate descent GLM and comparison with python-glmnet
blas-ification of elastic net + ensure that the gap is initialized and evaluated
cosmit
work in progress on sparse vector extraction for document datasets
exclude scikits.learn.external package from top level nosetests env
missing pl.show() in rfe examples
more missing pl.show() in examples
using a separate class for the sparse version of the hashing vectorizer
readd the dense version of the vectorizer
checkpointing work in progress on the sparse version of the document vectorizer
more scalable TF-IDF computation unfortunately using a python for loop
new example to demonstrate sparse TF-IDF + sparse SVM on 20 newsgroups (too slow right now)
Merge branch 'sparse-documents'
avoid useless allocations in dense_to_sparse conversion
Merge branch 'master' of ssh://scikit-learn.git.sourceforge.net/gitroot/scikit-learn/scikit-learn
experimenting with character n-grams features (basic morphological analyzer)
Merge branch 'master' of ssh://scikit-learn.git.sourceforge.net/gitroot/scikit-learn/scikit-learn
fix one simple import in doctest for GMM
simple Makefile for repetitive dev tasks on POSIX OS
fix broken/unstable sparse SVM tests
even more stability fix for sparse SVM
fix broken doctest in HMM
disabling broken doctest in Gaussian Mixture Models
fix HMM doctests
cosmit
ignore coverage output folder
trailing spaces
skip remaining failing tests in HMM test suite
fix inline comment
Showcase the new LinearSVC wrapper for with sparse liblinear bindings in the 20 newsgroups document classification example
tracking changes in master branch
fix broken test for text features extraction
fix broken test for text features extraction
tracking changes from master and restore broken SparseHashingVectorizer
add ability to compute token ngrams too
fix broken doctests for SVC / NuSVC
Merge branch 'master' into char-ngram-features
cosmit
cosmit + trailing spaces + improved some comments
pep8 spacing
more cosmit
cosmit
Merge branch 'master' of github.com:scikit-learn/scikit-learn into issue-77-sparse-cd
starting boilerplate for sparse coordinate descent
Merge branch 'master' of github.com:scikit-learn/scikit-learn into issue-77-sparse-cd
fix broken test
checkpointing work in progress
avoid confusing cython extension names
fixed various issues with sparse datatype handling in previous checkpoint
better note
first stab at the sparse CD
leave sparse evaluation of the dual gap for later
forgot files from previous checkin
Merge branch 'master' of github.com:scikit-learn/scikit-learn into issue-77-sparse-cd
check that sparse API for coordinate descent also work with dense list-based input
one more test for sparse CD
sparse dual gap too!
Merge branch 'master' of github.com:scikit-learn/scikit-learn into issue-77-sparse-cd
cosmit
Merge branch 'master' of github.com:scikit-learn/scikit-learn
fix broken doctest in cross_val
Merge branch 'master' into issue-77-sparse-cd
more robust tests
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Merge branch 'master' into issue-77-sparse-cd
missing import
added sparse Lasso utility class
OPTIM: fix a typo and some suboptimal cython constructs in dense coordinate descent
Merge branch 'master' of github.com:scikit-learn/scikit-learn
share the same cython impl for both lasso and elastic net CD
make d_w_max early stopping criterion scale invariant
cosmit: s/nsamples/n_samples/g and s/nfeatures/n_features/g
group stopping criterion related boilerplate in the same place for readability
FIX: make CD lasso robust to zero valued columns (useless features)
Merge branch 'master' of github.com:scikit-learn/scikit-learn
add duration to glmnet benchmark output
removed useless includes
make d_w_max threshold independent of the squared norm of y to make it useful in practice
Merge branch 'master' into issue-77-sparse-cd
port latest bugfix and optims from dense CD to sparse CD
fix NORMALIZE_WHITESPACE issues in doctests
more robust and understandable CD elastic net test using explained variance score instead of RMSE
forgot to setup the good value of rho in last checkin
Merge branch 'master' into issue-77-sparse-cd
Merge branch 'textextract' of git://github.com/mblondel/scikit-learn
Merge branch 'master' of github.com:scikit-learn/scikit-learn
better docstring + cosmit for the RFE module
pep8
move the RFE module to the feature selection package
merge textextract branch from mblondel
make analyzers inherit from BaseEstimators to get better repr and parameters management
work in progress: refactoring the document classification dataset API to remove the feature extraction step
Merge branch 'master' into textextract
tracking changes from master
cosmit
more precise doc on SVM complexity
FIX: explained variance is not symmetric: ground truth comes first
Merge branch 'textextract' of git://github.com/mblondel/scikit-learn into textextract
ENH: s/filter/preprocessor/ + docstring cosmit
ENH: more docstring love
ENH: implement __repr__ for DefaultPreprocessor so that estimators __repr__ looks prettier
cosmit
make the pipeline / grid_search object nicer to introspect in tests
FIX: make grid_search output deterministic even in case of tie on the scores
merging pprett sgd work while tracking master
mark heisentest as skipped: it randomly passes 3 out of 5 times on my box with pyamg installed
Merge branch 'master' into sgd
kill evil tabs
cosmit
cosmit on example
cosmit: PEP8 + some missing docstrings
more cosmit
more cosmit
merging with alexandre's fixes
fix broken doctests: they are space sensitive unfortunately
Merge branch 'master' into textextract
better way to load folder / files dataset
porting the sparse document classification example to the new API
cosmit: PEP8 + some missing docstrings
Merge branch 'master' into textextract
PEP8 + better docstrings
FIX: add missing README.txt file for the sgd examples
ENH: more cosmit, docstring, test cleanup for the metrics module
cosmit
register the SGD chapter in the user guide TOC
PEP 8 in metrics module
ignore generated doc elements
some more cosmits / PEP8
more PEP8
helpers to use tags with vim / emacs
Make it possible to pass explicit labels to confusion matrix
make binary classification recall explicit
factorizing code to make it easier to do the multiclass refactoring in one place
refactored test_metrics to handle the binary case explicitly and make room for
test precision recall for binary classification
more code factorization: fscore joins the party
and you thought you could escape the PEP8 screening
missing test for f1 output
cosmit
extract label extraction logics
ENH: make precision, recall and f1_score handle multi-class
removing test for multi label perf evaluation
FIX: area under curve: recall is x and precision is y
Merge branch 'master' into issue-155-multiclass-precision-recall
Merge branch 'master' into issue-155-multiclass-precision-recall
ENH: new utility in metrics module: the classification report
showcase the new classification report in the examples
add detailed performance report to the digits example
Merge branch 'master' into issue-155-multiclass-precision-recall
tracking changes occurring on master
cosmits
ENH: handle support to do weighted averages of P/R/F scores
scalar scoring functions for P / R / Fbeta
s/explained_variance/explained_variance_score
make the distinction between loss and score function more explicit
Merge branch 'master' into textextract
spelling
make the grid search able to use an arbitrary score function
Merge branch 'master' into textextract
removing the hashing vectorizers code that need a full rewrite
update SGD example to showcase the new OVA implementation
cosmit in k_means module
FIX: better k-means tests + fixed broken array init
FIX: potential division by zero in scaler
FIX: fixed more cheesy NaNs than an Indian restaurant in Paris
New example to demonstrate the KMeans API with various init strategies
trailing spaces holocaust
let me introduce the culprit of the last checkin
Merge branch 'master' into dense
more trailing spaces cleanup
ignore downloaded data from example
Various improvement in low dim classification example
Merge branch 'dense' of git://github.com/pprett/scikit-learn into dense
remaining conflict markers in previous checkin
cosmit in SGD example
s/libsvm/liblinear/ in classification example
remove the dependency to explicit ABC to keep 2.5 compat + PEP8
make the dense SGD code & docstring more readable
more precise docstring in base SGD class
forgot to finish a sentence on regularization in a docstring
more docstring love
cosmit
use multi proc in multiclass SGD by default
cosmits in SGD tests
more cosmits in the sgd tests
cleanup
cosmit
better test file name
reuse the dense SGD test suite for the sparse variant using test case inheritance
cosmit
cosmits in the SGD pyx files
more cosmit in pyx files
more cosmit in example
PEP8 in SGD tests + docstring
better looking docstring for sparse sgd
more info on loss and penalty params for sparse SGD
propagate spelling fixes to the dense SGD docstring
ducktyping in analyzers
work in progress on vocabulary dimension restriction
small fixes + updated the tests
cosmit
add note on fortran contiguous memory optim for the X array
Merge branch 'master' into textextract2
OPTIM: vectorizer with predefined dictionary 5x faster by eliminating scipy.sparse.vstack calls
some optims in the text preprocessors
OPTIM: sparse vectorizer uses COO a init
multi-line print cosmit
use a SGD model in the mlcomp demo since it is the fastest for this problem
cosmit
make it possible to do fancy indexing on filenames
move the mlcomp SGD example as a generic 20 newsgroup classification example
cosmit
better pipeline notation in vectorizer + classifier grid search example
4 more years!^W^W^W 1 more test for vectorizers with max_features
cosmit
factorize out shuffling dataset since it might be useful by default
new example on how to use pipeline / grid_search for extraction parameters
sample run output in the grid_search_text_extraction_parameters example
reST formatting of example
cosmit
better title for the mlcomp example
better example filename
reference new example in the documentation of the grid_search module
cosmit
ENH: automated class_weight for SVC on imbalanced data
more s/predict_margin/decision_function/ in examples
FIX: typo in custom score_func in grid_search
initial face recognition example using eigenfaces
FIX: better handling of NaNs in precision / recall / f-score metrics
Merge branch 'faces-example'
face recognition example using eigenfaces and SVMs
more explicit subplot titles
cosmit
FIX: actually truncate the SVD to make it faster + add some test
forgot the test file in my last checkin...
drop the warning since useful even if approximate as demoed in the faces example
make fast_svd deterministic by default while allowing to pass rng seeds
test singular values as well
new benchmark: comparing SVD implementations
remove useless import
more documentation on fast SVD + missing reference
PEP8 + various cosmits in sample generators
more tests for the iterated power refinement of the Martinsson randomized SVD
ENH: make the PCA transformer use the iterated power refinement by default
one more test for SVD
Welcome to Alexandre Passos
OPTIM: do not allocate a (n_samples, n_samples) temporary array with scipy.linalg.qr when (n_samples, k + p)) is all what is needed
OPTIM: fast_svd now has an auto transpose mode that switches to the fastest impl
cosmit
switching back to scipy.linalg.qr with econ=True to avoid half-installed numpy issues with wrong lapack bindings
FIX: numerical instability in Ridge regression tests
cosmit
new example: principal eigen / singular vector of the wikipedia graph
Better docstrings in the example
simpler SVD benchmark: use the sample_generator utility and fixed effective rank
moving real world examples to the applications subfolder
better gitignore data archives
s/_sparsedot/safe_sparse_dot/g
even better .gitignore (teasing...)
cosmit on PCA module
avoid global variable in test
ENH: make the PCA transformer perform variance scaling by default + update the face recognition accordingly
FIX: GridSearchCV refit did not propagate the fit params
switch whitening off in PCA by default + ensure unit scale + better docstring
use a grid search for the SVM params in the faces example
updated lasso benchmark to showcase the region where LassoLARS is faster than Lasso CD
OPTIM: ensure lasso_path aligns the data only once if not already fortran contiguous
pep8
ENH: LassoCV / ElasticNetCV now uses all folds data + example
ENH: make the LassoLARS and LassoCD path examples easier to compare
make MSE plot of LassoCV more readable by scaling the y axis
FIX: update broken tests by last checkin
switch to base 10 for the alpha logs in the Lasso CD path plot
revert the plot style to the LARS paper conventions
select the best alpha using the mean of the CV MSEs instead of the median
cosmit: += assignment replaced by plain = in coordinate_descent (more natural, less confusing)
extract the randomized SVD implementation as a toplevel class able to handle sparse data as well
consistently rename n_comp to n_components
Merge branch 'master' into sparse-pca
update doctest to handle the change in regularizer strength definition in LARS
FIX: typo s/mean/mean_/g in RandomizedPCA
Merge branch 'master' into sparse-pca
sed -i "s/\<n_componentsonents\>/n_components/g"
SVD benchmark have a consistent filename
factorized out correlated regression dataset utility function and updated
do not allocate useless memory in make_regression_dataset
launch test on documentation by default when running make
cosmit
OPTIM: do not precompute r2_score_ in ElasticNet in the fit call
do not precompute explained_variance_ in linear model: can be too costly: use r2_score when needed instead
new benchmark for lasso path implementations
merging master
temporary test fix for refit instability in linear SVC: a bugfix branch will be open to reproduce the issue
cosmit (reST formatting of the SGD module documentation)
Merge branch 'master' of github.com:scikit-learn/scikit-learn
more formatting in SGD reST and fixed doctest broken by last checkin :(
cosmit
ENH: make it possible to customize the WordNGramAnalyzer token regexp
Merge branch 'master' of git://github.com/jaberg/scikit-learn
PEP8
more PEP8
more PEP8
style conventions for variable names
FIX: allow the trivial border case k==n in KFold CV
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Merge branch 'cv_indices' of https://github.com/agramfort/scikit-learn
ENH: KMeans tolerance parameter renamed tol (as in coordinate descent) and made public
FIX and more tests for PCA and inverse_transform also for RandomizedPCA
Add documentation for the RandomizedPCA class
add new method for fetching datadir + reorg os related imports
checkpoint WIP for the LFW dataset loader
Merge branch 'master' into lfw-dataset
fix broken dataset description
checkpointing work in progress
Merge branch 'master' into lfw-dataset
work in progress on LFW: fetching the data
more work on dataset loader for LFW pairs
get rid of the normalization that should not be part of the load time
Merge branch 'master' into lfw-dataset
make it possible to load the LFW people dataset using the scikits.learn.datasets infra
remove stupid color slicing 'feature' and shuffle the examples
pep8
better default slice values
better looking example
Merge branch 'master' into lfw-dataset
face verification example will be implemented later
Merge branch 'master' into lfw-dataset
cosmit typo
first test for the LFW loader skipped if missing data folder
more LFW tests
pep8
documentation for the LFW dataset loaders
Merge branch 'master' into lfw-dataset
generate fake LFW dataset to fully test the LFW loader even without access to the real data
add HTML coverage report
more robustness test checks for LFW loader
first stab at factoring the 20 newsgroups dataset loading
cosmit
cosmit
fix kw params propagation to load_files
update the grid search example
remove function autodoc section that breaks sphinx
better name: rename load_files to load_filenames
better name: rename class_names to target_names for consistency
merge lfw-dataset to 20newsgroups-dataset
cosmit
Merge branch 'lfw-dataset' into 20newsgroups-dataset
Merge branch 'lfw-dataset' of https://github.com/GaelVaroquaux/scikit-learn into lfw-dataset
cosmit / ordering
use explicit parameter passing
merge changes from LFW branch
Merge branch 'lfw-dataset'
Merge branch 'master' into 20newsgroups-dataset
some more work on the datasets documentation
improvements to the datasets documentation
fix: avoid creating a spurious '~' in the current working directory
pep8
typo
missing justification for the shuffling of samples
Merge branch 'master' into 20newsgroups-dataset
restore python 2.5 compat
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Merge branch 'master' into 20newsgroups-dataset
FIX: make PCA models usable in pipelines
Merge branch 'master' into 20newsgroups-dataset
add backward compat for old load_files public API
Merge branch 'master' of github.com:scikit-learn/scikit-learn
boilerplate
typos
more tutorial boilerplate
missing download script
cosmit
typo
typo
some work on the introductory section
more work in progress
better title structure and missing paragraph on good feature extractors
more work on machine learning 101
cosmit & typos
wording
missing fetch script for movie reviews
work on exercise 01
better use this as exercise number 2
do not forget to introduce linear separability
typo
add missing fetch script for the face dataset
solution for exercise 04
extracted skeleton for exercise 04
more work on the classifier section
starting explaining PCA
useless conf
more work on PCA
better titles
style
more work on general concepts (esp. supervised learning)
more style fixes
more subtitles
ignore workspace
ignore OSX stuff
updated the link to the official documentation
work on clustering section
add some subsections separators
wording
reorganizing notes and adding a scikit-learn oriented complement to the supervised learn flow diagram
remove section on density estimation
started work on linearly separable data
added takeaway points section
cosmit
various improvements
typo
various cosmits / wording
typo / wording
section on regression models
section on overfitting and the train / test split
removing confusing section + slight reorg
wording
wording
work on text feature extraction doc
one more check in the setup
better intro + more work on text features
more work on text classification example
fix header levels
exercises instructions
improvements in exercises instructions
some practical hints
use the pipeline in exercise 02
inversed solution and skeleton
sentiment analysis
missing skeleton
gh pages integration
trailing spaces
pep8
style
add check to the nature of y to have more explicit error messages
explicit ValueError when not enough data for kmeans and some pep8
style
make RandomizedPCA work on list data
FIX: the datasets doctest fixture could never skip the tests when required
use WARNING level logs before using network access
make the test display the output on stdout
ENH: add function to clear the data_home cache + tests
full PEP8 compliance for the scikits.learn.datasets package
renamed load_* to fetch_* when network connection is potentially involved
add load_lfw_pairs and load_lfw_functions for backward compat and consistency
load_20newsgroups as an alias for fetch_20newsgroups in offline mode
trailing spaces
break test data symmetry to avoid heisenfailure in RandomizedPCA test
Merge branch 'master' of github.com:scikit-learn/scikit-learn
FIX: heisen test failure + some pep8 in test_pca.py
better README.rst
use vec.fit_transform instead of vec.transform on the training set
FIX: make the PIL dependency optional (skip LFW tests if not present) + explicit error message
FIX: make the PIL dependency optional (skip LFW tests if not present) + explicit error message
FIX: workaround broken PIL installs
Merge branch 'nmf-lite' of https://github.com/vene/scikit-learn into vene-nmf-lite
Merge branch 'nmf-lite' of https://github.com/vene/scikit-learn into vene-nmf-lite
ENH: plot eigenfaces in face recognition example
ENH: do not download LFW when building the documentation by default
Merge branch 'master' into vene-nmf-lite
Merge branch 'nmf-lite' of https://github.com/vene/scikit-learn into vene-nmf-lite
Merge branch 'text' of https://github.com/vmichel/scikit-learn into vmichel-text
FIX: update the examples to match the new text feature extraction API
FIX: feature_extraction.text is now a module instead of package
FIX: forgot to update the documentation after the feature_extraction.text refactoring
FIX: decrease disk usage in LFW data folder
ENH: factorize some plot code in face recognition example
FIX: broken link to plot_kernel_pca kernel in the documentation
language detection gives slightly better results without IDF
typo
MISC: style fixes in NMF
ENH: improved contributors guide
Merge branch 'master' of github.com:scikit-learn/scikit-learn
ENH: add coverage install command
cosmit
DOC: first stab at the performance chapter (full of TODOs)
Merge branch 'master' of github.com:scikit-learn/scikit-learn
DOC: missing class reference
DOC: cosmit
MISC: another style fix for a private function in nmf
DOC: add sample python profiling session
DOC: note for later
DOC: add some missing reference in the performance guide
ENH: avoid the use of lambdas in NMF to get a more informative profiling output
Merge branch 'master' of github.com:scikit-learn/scikit-learn
DOC: fix small inaccuracy
DOC: more warning fixes for the classes reference toc
FIX: stupid statement in plot_face_recognition
DOC: make the face recognition example static (to avoid having to download the dataset to build the doc)
fix broken doctest
MISC: style fixes in NMF
ENH: improved contributors guide
ENH: add coverage install command
cosmit
DOC: first stab at the performance chapter (full of TODOs)
DOC: missing class reference
DOC: cosmit
MISC: another style fix for a private function in nmf
DOC: add sample python profiling session
DOC: note for later
DOC: add some missing reference in the performance guide
ENH: avoid the use of lambdas in NMF to get a more informative profiling output
DOC: fix small inaccuracy
DOC: more warning fixes for the classes reference toc
FIX: stupid statement in plot_face_recognition
DOC: make the face recognition example static (to avoid having to download the dataset to build the doc)
DOC: refined the python profiling example
DOC: fix / add more class reference links in perf doc
wording
DOC: started intro YEP
Merge branch 'variational-infinite-gmm' of https://github.com/alextp/scikit-learn into alextp-variational-infinite-gmm
DOC: use uppercase for project / language names
Merge branch 'batchKMeans' of https://github.com/NelleV/scikit-learn into NelleV-batchKMeans
ignore 'cython -a' HTML reports
Merge branch 'batchKMeans' of https://github.com/NelleV/scikit-learn into NelleV-batchKMeans
ENH: style, pep8, docstrings comments, variable names
ENH: more interesting batch size
ENH: more fixes for variable names
ENH: fix example docstring
DOC: more work on the performance chapter
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Merge branch 'variational-infinite-gmm' of https://github.com/alextp/scikit-learn into alextp-variational-infinite-gmm
Merge branch 'master' into alextp-variational-infinite-gmm
Merge branch 'variational-infinite-gmm' of https://github.com/alextp/scikit-learn into alextp-variational-infinite-gmm
Merge branch 'master' of https://github.com/ametaireau/scikit-learn-tutorial into ametaireau-master
Merge branch 'variational-infinite-gmm' of https://github.com/alextp/scikit-learn into alextp-variational-infinite-gmm
Merge branch 'variational-infinite-gmm' of https://github.com/alextp/scikit-learn into alextp-variational-infinite-gmm
Merge branch 'variational-infinite-gmm' of https://github.com/alextp/scikit-learn into alextp-variational-infinite-gmm
Merge branch 'variational-infinite-gmm' of https://github.com/alextp/scikit-learn into alextp-variational-infinite-gmm
Merge branch 'variational-infinite-gmm' of https://github.com/alextp/scikit-learn into alextp-variational-infinite-gmm
ENH: more informative test error message
typo
ENH: spectral clustering doc and style improvements (pep8, docstrings, references, variable names)
cosmit
cosmit
ENH: style / pep8 / docstring fixes in s/l/utils/fixes.py
ENH: new make_rng utility function to help make PRNG seeding explicit
TEST: forgot to checkin the unittest for the make_rng function
ENH: add test for picklability of the spectral clustering model
FIX: make normalizer use the real l1 norm on each row (without assuming positive values)
Merged pull request #4 from larsmans/master.
DOC: typo in line-prof package name
FIX: broken import in bench_plot_nmf
DOC: fix doctests to make them work with numpy 1.5 and older
merged master
DOC: trim_doctests_flags = True for sphinx
Merge pull request #147 from larsmans/master.
rename rng to random_state
cosmit
delayed check_random_state in k means and spectral clustering
Merge pull request #154 from larsmans/master.
kill trailing spaces
merge master
merge from master, update random_state API + pep8
Merge pull request #150 from pprett/learningrate
track changes from master
Compressed README.rst to make it an executive summary
started work on homogeneity, completeness and V-measure as clustering metrics
working implementation of V-measure, still needs doc and updated clustering examples
use V-measure metrics in K-means example
add missing return info in swiss roll docstring
illustrate clustering metrics on affinity propagation example
100% test coverage for the new clustering metrics
more tests
add more documentation for the new metrics
typo
typo
split some tests to make them more atomic
Merge branch 'master' into clustering-metrics
pep8
typos
Merge branch 'master' into clustering-metrics
typo
Merge branch 'batchKMeans' of https://github.com/NelleV/scikit-learn into NelleV-batchKMeans
pep8
Merge branch 'master' of github.com:scikit-learn/scikit-learn
more pep8
better docstring for the LabelBinarizer in the multilabel case
started work on normalizer API simplification
work in progress on package structure
FIX: rounding issues on python 2.6 in clustering metrics doctests
ENH: add a note on the symmetry of the metrics
ENH: simpler import statement in example
ENH: simpler import statement in example + explicit square
ENH: add links to the reference guide
ENH: better docstrings for symmetric considerations
cosmit
ENH: better organization of metrics references
ENH: reorganization of the document to be operational quicker
fix broken test introduced in last checkin
new utility function to generate blobby datasets
Merge branch 'master' of github.com:scikit-learn/scikit-learn
FIX: indexing bug when labels are not consecutive
Merge branch 'master' into clustering-metrics
FIX: broken doctests
Merge branch 'master' of github.com:scikit-learn/scikit-learn
ENH: new utility function to shuffle data in a consistent way
Merge branch 'batchKMeans' of https://github.com/NelleV/scikit-learn into NelleV-batchKMeans
Merge pull request #161 from ogrisel/clustering-metrics
ENH: small fixes in scikits.learn.utils.shuffle
Merge branch 'batchKMeans' of https://github.com/NelleV/scikit-learn into NelleV-batchKMeans
Welcome to Nelle!
pep8
ENH: syntactic sugar for the shuffle utility
ENH: better / simpler handling of shuffling in MiniBatchKMeans
ENH: refactored shuffle to address the resampling with replacement case + more tests
FIX: n_samples bug in shuffle, 100% coverage in utils, missing reference doc entries
first shot at a bootstrapping cross validator
typos
more typos
ENH: ensure that training and test split do not share any sample
ENH: better input validation + more representative doctest
ooops
cosmit
DOC: cleanup in cross validation doc
Merge branch 'master' into bootstrap
add bootstrap to reference doc
DOC: new section for the Bootstrap cross-validation
cosmit
cosmit
add see also in resample docstring
FIX: make cross_validation_score work with sparse inputs
merge master
cleanup leftover
ENH: add test for the permutation_test_score with sparse data
Merge branch 'master' into bootstrap
more tests
Merge branch 'balltree-wrapper' of https://github.com/jakevdp/scikit-learn into jakevdp-balltree-wrapper
Merge branch 'bootstrap'
FIX: make r2_score and explained_variance_score never return NaNs
Merge branch 'master' of github.com:scikit-learn/scikit-learn
pep8
add a comment explaining the + 10
Merge branch 'mldata' of https://github.com/pberkes/scikit-learn into pberkes-mldata
pep8 / style
fix broken test in MultinomialNB
ENH: more readable datasets definitions
ENH: avoid double HDD copy of mocked datasets + style
merge
merge master
add random projection and PCA to digits manifold example
use scikit-learn QR compat alias
cosmit
ENH: split figures for better reusability and readability
Merge branch 'extended-digits-manifold-example'
ENH: make the LLE random seeding controllable and deterministic by default
Merge branch 'master' of github.com:scikit-learn/scikit-learn
docstring style
FIX: broken doctests and missing max_iter attribute in LassoLARS
FIX: broken doctest in the documentation caused by the last fix
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Merge branch 'master' into preprocessing-simplification
work in progress on SampleNormalizer unification
enable test for the sparse variant
getting rid of the remaining stuff in the preprocessing.sparse package
more explicit / descriptive low level cython function names
cosmits / pyflakes / pep8
ENH: improve docstring with missing parameters and motivations
factorize a normalize utility function
s/SampleNormalizer/Normalizer/g
Merge branch 'master' into preprocessing-simplification
moar tests
more tests for preprocessing (scaling)
more tests for preprocessing: coverage is now 100%
make centering optional in Scaler / scale + fix broken test
one more test
one more test for preprocessing (no mean centering)
fail early
pep8
ENH: docstrings for Scaler / scale
bugfix: sparse_format can be omitted
typo
better docstring for Scaler
register the preprocessing utilities to the reference documentation
fixes in See also sections
ENH: give motivations for standardization in the Scaler docstring
ENH: style fixes and better use of the scikit-learn API in ROC example
Merge branch 'master' into preprocessing-simplification
started work on the narrative documentation for the preprocessing package
typo
reorg TODO and notes
DOC: section on normalization
DOC: section on feature binarization
factorize the binarize function + write documentation
format
Merge pull request #194 from jakevdp/balltree-queryrad
Merge pull request #198 from amueller/fastICA_transposed
Merge pull request #207 from pprett/mbkm-fix
Merge pull request #6 from larsmans/master
add the test command
Merge branch 'master' into pberkes-mldata
DOC: reorg of dataset page to make it more consistent
FIX: make the dataset doctest fixture modular
typo
track changes from master
FIX: make the dataset doctest fixture modular
typo
PEP8
ENH: make rng of the LLE tests controllable to hunt down potential NaNs
FIX: add tolerance for lack of numerical precision
Merge remote-tracking branch 'lemin/sparse-mbkm'
remove leading _ in _gen_even_slices and duplicate implementation in sparse_pca
remove verbose output from GMMHMM test
Merge pull request #272 from glouppe/master
fixed broken doctest in HMM
Merge remote-tracking branch 'sabba/master'
Merge pull request #289 from sabba/master
ENH: more rng instance instead of singleton in tests
FIX: potential division by zero when normalizing non-pruned CSR matrices
PEP8 in LLE tests + better assertion failure messages
display the eigen solver name in case of LLE reconstruction test failure
ENH: make the file loader keep the filenames information
update doctest in tutorial to match current API
cosmit on docstring first line
FIX: broken Gram handling in OMP estimator + minor style improvements
Merge branch 'master' into jakevdp-manifold-isomap
FIX: broken dataset generator import + minor styling issues
fix comment
Merge pull request #303 from glouppe/master
FIX: avoid the dependency on pylab in the doctests
fix the random state of the files dataset loader for reproducible results
Upgrade the setup documentation to match the current master
updated README.md
fixed left-over of the previous version
Merge remote-tracking branch 'vene/patch-extraction' into vene-patch-extraction
fix broken doctests
more details for poor windowsians
upgrade to simpler new load_file API
s/class_names/target_names/g
upgraded exercise 2 to the new API
updated faces recognition example to the latest API
upgrade inline doctests to new load_files API
reorder exercises
ENH: remove references to digits + format
plot the original centered sample + make sparse pca a little less sparse + kmean a little less like init
DOC: make the decomposition doc more consistent with running faces example
cosmit
ENH: use introspection to find the cluster components
DOC: group SparsePCA and MiniBatchSparsePCA chapter to reduce redundancy
cosmit
ENH: minor style fixes in docstrings and comments
cosmit
cosmit
FIX: removed recently introduced mistake from dict_learning_online docstring
Carve the emerging consensus on __init__ vs fit parameters in the contributors documentation
cosmit
DOC: give some motivation for the return of self in fit
DOC: formatting mistake
DOC: more fitting doc improvements
typo
DOC: more formatting
yet another typo
Merge pull request #311 from glouppe/test-coverage
Merge pull request #302 from jakevdp/manifold-doc
DOC: section level fix in clustering doc
Merge remote-tracking branch 'robertlayton/kmeans_transform2' into robertlayton-kmeans_transform2
checkpoint style improvements for the KMeans predict
track changes from upstream/master
time the main operations
add warning utils and use it in KMeans when data matrix is integers, boolean, complex...
checkpointing work in progress on VQ example
ENH: add missing inverse_transform method for Scaler
Merge branch 'master' into robertlayton-kmeans_transform2
fix the VQ example by switching to floats in range 0 - 1
Merge branch 'master' into robertlayton-kmeans_transform2
cosmit
use the scipy public API rather than PIL
update the documentation
ENH: 'make test' now runs the doc doctests as well
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Merge remote-tracking branch 'JeanKossaifi/master'
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Merge remote-tracking branch 'JeanKossaifi/sorted_repr' into JeanKossaifi-sorted_repr
FIX NMF doctests
ENH: shorter doctest output
ENH: pipeline doctest style improvements
FIX: updating doctests in gaussian_process.rst and linear_model.rst
FIX: remaining broken doctests
FIX: doctests on buildbot
cosmit
ENH: new example: NMF topic extraction on 20 newsgroups
FIX: useless arg to argsort in NMF example
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Merge remote-tracking branch 'glouppe/master' into glouppe-master
Merge pull request #328 from bdholt1/crossval
more scikits.learn => sklearn updates
ENH: new Makefile target to cythonize everything
Merge branch 'master' of github.com:scikit-learn/scikit-learn
batch re-cythonization with version 0.15 and new package names
new package name
more renamings
fix typo in scikits.learn.qda
update the Makefile test-coverage target to work with the new package layout
Merge branch 'master' into bdholt1-enh-tree
trailing spaces in pyx file
More style consistency improvements
style: constant in capital letter on top + extract graphviz tree template
cosmit
More style improvements in _tree.pyx
ENH: cross_val module docstring and style improvements
ENH: more randomized cross val docstring & var naming improvements
Merge branch 'master' into bdholt1-enh-tree
ENH: doctest simplification by using the cross_val_score func
He who seeks only vanity and no love for humanity shall fade away
style
Better exception messages in SVM
ENH: make the cross_val_score able to use functions from the metrics module
ENH: better docstrings for SVMs
Merge branch 'master' into cross-validation-improvements
DOC: improvements to the cross validation doc layout + missing ref to ShuffleSplit and cross_val_score
Merge remote-tracking branch 'bdholt1/enh/tree' into bdholt1-enh-tree
Merge branch 'master' into bdholt1-enh-tree
Merge remote-tracking branch 'glouppe/master' into glouppe-master
Merge branch 'master' into glouppe-master
Add missing authorship + license info to NMF topics example
Merge branch 'master' into cross-validation-improvements
ENH: more cross_val doc for LOLO and LPLO
DOC: add info about smart CV and IC estimators
cosmit
ENH: s/n_labels/n_unique_labels/g in cross_val
FIX: compat with numpy version lacking the out argument for dot
ENH: misc style / docstrings improvements
Merge pull request #341 from ogrisel/cross-validation-improvements
s/\bcross_val\b/cross_validation/g
backward compat for cross_val namespace
cosmit
API: start 'API changes summary' section in doc/whats_new.rst
API: removal of fit parameters
FIX: fix broken tests on ElasticNetCV
batch trailing spaces cleanup
ENH: docstring cleanup
Mark sklearn.hmm as orphaned
FIX: make the @deprecated class decorator not break the __repr__ of estimators
ENH: implementation Adjusted Rand Index for clustering evaluation
cosmit
removing the undocumented implementation of the unadjusted Rand index in kmeans_
cosmit
missing import in the metrics namespace
DOC: narrative documentation for the ARI
DOC: typos
FIX: fix broken document clustering example and add ARI to examples
add doctest for combinations (to document the n < k case)
more tests for ARI and clustering metrics
test non consecutive integers in perfect match
FIX: use scipy's fast implementation of comb + fix tests + limit cases + faster adjustment test
cosmit
OPTIM: use exact comb evaluation since it's faster for the ARI case
cosmit
cosmit
DOC: add example to illustrate the concept of adjustment for chance
more details about ARI value range
make example script filename more explicit
typo
Merge branch 'master' into cluster-metrics-2
Merge remote-tracking branch 'jakevdp/neighbors-refactor' into jakevdp-neighbors-refactor
cosmit + doctest
DOC: reorg, bold important points, include adjustment plot as figure
typo
Merge pull request #347 from ogrisel/cluster-metrics-2
Merge remote-tracking branch 'jakevdp/neighbors-refactor'
more enhancements, variable names and test fixes
Added items for cross validation and clustering metrics
trailing spaces
Merge remote-tracking branch 'vene/sc' into vene-sc
cosmit
DOC: howto register the %lprun line_profiler magic on IPython 0.11+
Merge pull request #313 from robertlayton/pairwise_distance
Merge branch 'master' into vene-sc
Merge remote-tracking branch 'vene/sc' into vene-sc
Merge branch 'sc' of https://github.com/vene/scikit-learn into vene-sc
Merge branch 'sc' of https://github.com/vene/scikit-learn into vene-sc
Merge branch 'sc' of https://github.com/vene/scikit-learn into vene-sc
Merge branch 'sc' of https://github.com/vene/scikit-learn into vene-sc
Merge branch 'vene-sc'
LassoLarsIC/CV and metrics.roc_curve in whats_new
Cosmit.
if __name__ == '__main__' multiprocessing protection in ex2
more details
more details in multiprocessing comment in skeleton too
Merge pull request #353 from amueller/sgd_warm_starts
DOC: cross validation: introduce motivation and basic usage first
Merge branch 'master' of github.com:scikit-learn/scikit-learn
typo: s/accurracy/accuracy/g
Merge pull request #360 from cmd-ntrf/master
ENH: no need for L2 norm on input in doc clustering
ENH: make load_files use a fixed shuffling of the samples
DOC: better svmlight_loader / dumper docstrings
ENH: 30% speed improvements in load_svmlight_file
ENH: remove useless call to strip while staying robust to empty lines
ENH: make MiniBatchKMeans display more info in verbose mode
Merge pull request #373 from larsmans/svmlight
Revert "BUG fixed and cosmetics in CountVectorizer"
ENH: make it possible to skip label assignments in MiniBatchKMeans
thanks to @larsmans, TFIDF is now always positive :)
Merge remote-tracking branch 'bdholt1/enh/tree' into bdholt1-enh-tree
Merge pull request #381 from satra/doc/permutation
FIX: compat with numpy 1.5.1 and earlier in NMF
Merge remote-tracking branch 'bdholt1/enh/tree' into bdholt1-enh-tree
Merge pull request #377 from larsmans/sparse-nmf
pep8
pep8
OPTIM: inplace max in distances computation
OPTIM: avoid unnecessary repeated memory allocations in minibatch k-means
Merge remote-tracking branch 'bdholt1/enh/tree' into bdholt1-enh-tree
cosmit: pep8 and trailing spaces
merge master
DOC: fix broken links + various cosmits
FIX: remove non-ASCII char from silhouette docstrings
Some clarification of the memory copy issues.
OPTIM: inplace dense minibatch updates and better variable names
cosmit
cosmit: better variable name in MiniBatchKMeans
Merge branch 'master' of github.com:scikit-learn/scikit-learn
ENH: make it possible to control the add variance caused by Randomized SVD
ENH: document clustering example simplification
FIX broken doctests on buildbot + pep257
Merge branch 'master' of github.com:scikit-learn/scikit-learn
first stab at nearest center in cython (+30% perf, need check correctness)
factorized label assignment as a reusable python func for the predict method
use direct blas ddot call and reuse _assign_labels in predict
FIX: broken test caused by the use of todense which returns a matrix instance instead of a regular numpy array
WIP on simpler cython impl of the center update (still buggy)
compute inertia + remove code :)
update renamed function call
factorize dot product and bootstrap implementation for the dense case
use cpdef + less array overhead in ddot
started kmeans test suite refactoring
more code factorization
refactored the kmeans tests
test and fix input checks for various dtypes
much cheaper yet stable stopping criterion for the minibatch kmeans
FIX: missing relative import marker
Merge pull request #400 from amueller/docs_typo
DOC: LogisticRegression is a wrapper for liblinear.
FIX #401: update tutorial doctests to reflect recent changes and add them to
Merge branch 'master' of github.com:scikit-learn/scikit-learn
DOC: new scikit-learn.org URLs and mention license in README.md
Merge remote-tracking branch 'robertlayton/ami' into robertlayton-ami
measure runtimes for various clustering metrics in adjusted for chance example
FIX warnings by avoiding 0.0 values in the log + cosmit
Merge branch 'master' into minibatch-kmeans-optim
unused import
low memory computation of the square diff
be more consistent with the usual behavior of fitted attributes
base convergence detection on EWA inertia monitoring
various cython cleanups
work in progress to make it possible to use a speedy version based on smoothed inertia only
ENH: more informative error messages when input has invalid shapes
Merge branch 'master' of github.com:scikit-learn/scikit-learn
ENH: more informative error message when shape mismatch in TF IDF transformer
merge master
preparing new stopping criterion impl
ENH: make it possible to pass class_weight='auto' as constructor param for SGDClassifier
Merge branch 'master' into minibatch-kmeans-optim
work in progress (broken tests) on early stopping with both tol and inertia lack of improvement
make min_dist test more explicit
fixed broken test
optimize label assignment for dense minibatch and new test
fix tests
fix tests
start with zero counts in tests
fix bug: x_squared_norms should follow the shuffle...
ensure that the sparse and dense variant of the minibatch update compute the same thing
better default value and parameter handling for max_no_improvement
switch to lazy sampling with explicit index to divide memory usage almost by 2 and decrease code complexity with no measurable impact on the run time
more code simplification
started example to check the convergence stability in various settings
FIX: buggy usage of for / else for k-means n_init loop
DOC: update what's new
tracking changes from master
FIX: broken HMM tests caused by KMeans convergence in one step
merge master
ENH: use integer indexing instead of boolean masks by default for CV
implemented n_init for MiniBatchKMeans
Merge branch 'master' into minibatch-kmeans-optim
refactored the init logic for MiniBatchKMeans
Merge branch 'master' into minibatch-kmeans-optim
fix stability and warning in tests
make k-means++ work on sparse input and use it as default for MB k-means
add version info in deprecation message
factorized out the early stopping logic in a dedicated method
first stab at a reinit strategy that works on low dim data only
new example to emphasize issues with current naive reinit scheme on sparse data
second experiment on reinit that does not work on high dim sparse data either
PEP8 + various cosmits
pep8 in sparse covariance example
PEP8 + PEP257 in samples_generator
PEP257 - docstring style
Merge branch 'master' into minibatch-kmeans-optim
FIX: make the doctests outcome deterministic
DOC: better toplevel docstring
DOC: add simple descriptions in the concrete class docstrings
FIX: workaround what looks like a numerical instability in doctest
Merge pull request #439 from glouppe/ensemble-rebased
Merge pull request #453 from yarikoptic/master
pep8
Merge pull request #452 from glouppe/doc
PEP257 cosmit
cosmit
Update README.txt dependencies info to match the configuration tested on jenkins
cosmit
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Merge branch 'master' of https://github.com/scikit-learn/scikit-learn
track changes from master
pep8
fix k_means docstring to better match the scikit naming conventions
WIP: n_init refactoring
merge master
Merge pull request #481 from mblondel/mean_var2
Merge branch 'master' of https://github.com/scikit-learn/scikit-learn
Merge branch 'master' into minibatch-kmeans-optim
scale tolerance of minibatch kmeans on CSR input variance
delete broken example
example script is not meant to be executed when building the doc as it is slow
Merge branch 'master' of https://github.com/scikit-learn/scikit-learn
typo: accross => across
Merge branch 'master' into minibatch-kmeans-optim
typo: accross => across
Use python int for indices and indptr of scipy sparse matrices to ensure cross platform support
Make init less expensive by default on MinibatchKMeans to avoid dominating computation on large scale datasets
Fix broken duplicated / tests and more practical init
consolidating all cython utils for sparse CSR in the same file under utils
WIP: scaling CSRs
Merge branch 'master' into minibatch-kmeans-optim
FIX compat for errorbar legend for old matplotlib versions
slight optim: remove useless assignment from the inner loop
FIX: numerical instability caused by collapsed allocation of bad clusters to the center of mass
example tweaks
fix text position in example
its
better documentation for the convergence stability example
Merge branch 'master' into minibatch-kmeans-optim
simplify stability evaluation example
enable the kmeans stability as an auto examples as the speed is now fast enough
docstring in cython funcs + better var name: with_sqrt
Merge branch 'master' of https://github.com/scikit-learn/scikit-learn into minibatch-kmeans-optim
cosmit
merge master
readd dtype and ccontiguous checks removed by mistake during last conflict resolution
Merge branch 'master' of https://github.com/scikit-learn/scikit-learn into minibatch-kmeans-optim
merge master
remove useless dependency on pylab
fixed conflict in import resolution
FIX: validation is a relative package
FIX: py3k - more relative imports
FIX: py3k: string.letters is locale dependent and absent in py3k
Merge branch 'master' into sparse-scaler
WIP: feature scaling for CSR input (lacks some tests)
fix scaling, more tests and docstrings
Merge branch 'master' into sparse-scaler
wording
FIX: py3k integer division in robust covariance estimation
FIX: py3k integer division in samples generator
FIX: in py3k svmlight files must be explicitly opened in binary mode
FIX: py3k bytes split in svmlight format parser
Merge branch 'master' of github.com:scikit-learn/scikit-learn
FIX: py3k need explicit bytes buffers for svmlight format serialization
FIX: py3k need output file in binary mode for svmlight format serialization
FIX: py3k: string formatting is not supported on byte strings
FIX: fix test: integers are valid file descriptors in py3k
Merge branch 'master' into sparse-scaler
FIX: unused cython variable
More checks when transforming sparse matrices with centering scalers + typo
DOC: update narrative documentation
optim: avoid useless memory copy when input is non CSR
DOC: typo / wording
DOC: document sparsefuncs cython routines in developer section.
DOC: wording
DOC: wording
Merge pull request #515 from ogrisel/sparse-scaler
update what's new for sparse scaling
Fix the docstring of the univariate feature selection module to match the scikit conventions
cosmit
typo
cosmit
FIX: None and int comparison not authorized in py3k (in PCA)
FIX: dicts no longer have the has_key method in py3k: test for the method we actually use instead
FIX: make feature extraction work with the new py3k string API too
FIX: py3k's zip is not subscriptable
FIX: handle py3k exception API
FIX: previous fix for py3k str API in feature extraction was a bug in python 2
FIX: pervasive use of unicode in feature extraction for py3k compat
Update random forest face example to use several cores
ENH: make ShuffleSplit able to subsample the data
FIX: ensure fetch_20newsgroups_vectorized outputs CSR matrices to work with cross validators
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Merge pull request #519 from ogrisel/subsampling-shufflesplit
PEP257: docstring cosmits in utils.extmath
ENH: renamed fast_svd to randomized_svd + related improvements
FIX: incomplete test for inverse_transform in text feature extraction
Merge pull request #521 from lucidfrontier45/master
pep8 in forest.py
pep8 in tree.py
pep8 in kmeans tests
more pep8
more pep8
FIX: heisen doctest
DOC: readability: make colon after 'Parameters' stay on the same line in reference documentation
FIX: Boston is a regression dataset
oops, the last test is about classification, not regression
Merge pull request #529 from eickenberg/doc_fix
ENH: mark coef_ as immutable for linear SVM models trained in the dual
immutable coef for the sparse SVM variant too
mark liblinear coef as immutable too
document the fact that coef_ is readonly for LogisticRegression and LinearSVC
avoid a memory copy in coef_ property
Merge pull request #541 from ogrisel/immutable-readonly-coef
FIX: broken link in SVM doc
Merge pull request #551 from fannix/master
FIX: make sklearn.base.clone robust to empty params
first stab at trying to wrap MurmurHash3
Merge pull request #3 from GaelVaroquaux/murmurhash
implementation & test for the murmurhash wrapper module
Export some public cython API
DOC: add entry for murmurhash in the developer utilities section
ENH: add the ability to hash int arrays
Better docstring
Shorter cpdef function names + missing docstrings
DOC: give usage example
test developers utilities as well
OPTIM: avoid unlikely np.int32 test upfront
Merge pull request #564 from ogrisel/murmurhash
FIX: broken build / tests
Merge remote-tracking branch 'larsmans/typesafe-murmurhash'
Merge pull request #587 from jakevdp/arpack-init
Merge pull request #593 from jaquesgrobler/doc_update
cosmit in memory debugging doc
Merge pull request #602 from jaquesgrobler/doc_remotes_note
ENH: use linear gradient cmap for more readable hyperparam heatmap
docstring cosmits and typos in label_propagation.py
useless imports
simpler random seeding scheme for parallel kmeans
less hackish parallel random state seeding
round numerical results for better stability of the doc tests
use internal dataset fetch + switch to SGDClassifier for faster execution
typo + cosmit
avoid pl.set_cmap and align colors of colormesh with scatter
started work on utility function for quick train test split
more doctest
add parameters in docstring
DOC: narrative doc for train_test_split
add tests for invalid argument + fixed a type error
more tests
typo
reworked nested grid search example for better doc and output, use train_test_split and add more cross links
DOC: related improvement in GridSearchCV doc
DOC: more cross references
cosmit
DOC: what's new
Merge pull request #618 from ogrisel/train_test_split
FIX: make LFW data shapes consistent with Olivetti faces
ENH: more informative exception message
use Perceptron and quick splitter
upgrade to new scikit-learn API
make sklearn version dependency explicit
C is now scaled
DOC: improved SVM docstrings
typo
Merge pull request #628 from daien/master
Merge pull request #633 from robertlayton/ig
Merge pull request #634 from amueller/svm_decision_function_dirty_fix
FIX #614: raise ValueError at KernelPCA init if fit_inverse_transform and precomputed kernel
DOC: formatting improvement to ensemble.rst
working on tutorial exercises skeleton generation
FIX: make the 20 newsgroups loader explicitly decode latin1 content
shorten example a bit with train_test_split
manually rescale C in face recognition example
Merge pull request #664 from conradlee/663-kfold-init-bug
Flatten the feature extraction API
Merge branch 'master' of github.com:scikit-learn/scikit-learn into text-feature-extraction-simplification
missing C re-scaling in example
missing C re-scaling in example
MiniBatchSparsePCA and MiniBatchDictionaryLearning still use chunk_size as argument
merge master
factorize feature names array
make CountVectorizer able to output binary occurrence info
add a test for custom dtype
DOC: improve docstring for Vectorizer
Flatten the combined vectorizer as well
Merge remote-tracking branch 'upstream/master' into text-feature-extraction-simplification
Fix grid search example
Fix charset in mlcomp example
DOC: started section on text feature extraction
Merge remote-tracking branch 'upstream/master' into text-feature-extraction-simplification
switch back to the old vocabulary constructor argument
Merge remote-tracking branch 'upstream/master' into text-feature-extraction-simplification
better blob seed so that both DBSCAN and meanshift are working well
Merge branch 'master' into text-feature-extraction-simplification
finally the right API with plenty of efficient overrides
Filter stop words before ngrams
demonstrate stop words in example (+ slightly faster convergence)
missing sklearn.semi_supervised package in setup.py
ENH: remove useless array wrap for feature names + more TF-IDF tests
Make Vectorizer not inherit from TfidfTransformer while preserving direct gridsearchability
FIX: division by zero errors and negative IDF
DOC: TF-IDF and customizing
DOC: updated parameters
Merge branch 'master' into text-feature-extraction-simplification
updated whats new
s/Bags/Bag/ and Vector Space Model
better explanation for bigram features
No accent stripping by default + various doc fixes
update strip_accents in Vectorizer as well
typo
typo
typos
remove lambda + better comment position
enable stop words in clustering example
typo
Renamed Vectorizer to TfidfVectorizer + deprecation warning
updated what's new + backward compat for vocabulary attribute
fixed an inheritance bug in TfidfVectorizer.fit_transform + removed vocabulary backward compat that breaks grid_search
useless import
Merge pull request #668 from ogrisel/text-feature-extraction-simplification
trailing whitespace
FIX: broken doctest under OSX
Merge pull request #694 from njwilson/skip-kmeans-2-jobs-mac
Merge pull request #692 from njwilson/minor-doc-fixes
Add a link to autopep8
Merge pull request #695 from njwilson/tmp-dir-for-cache
Merge pull request #696 from njwilson/issue-691
Merge pull request #698 from njwilson/master
OPTIM: skip buffer unpacking in kmeans
Merge pull request #693 from jaquesgrobler/Collapse_Sidebar
Merge pull request #714 from jaquesgrobler/Next_button
Merge pull request #717 from jaquesgrobler/Issue714
typo + cosmetics
ENH: sort features in dict vectorizer + new doc
ENH: refactored the HMM tests to ease PY3K transition
Fix bad reference to LFW in example
useless import
FIX #752: raise explicit ValueError if k is too large
FIX: missing string formatting argument in MBKMeans error message
removed useless assert
Merge pull request #748 from ogrisel/hmm-test-hierarchy-simplification
Merge pull request #742 from davidmarek/pdistance
FIX: #774 Add documentation for lprun config in qtconsole and notebook
FIX #807: non regression test for KPCA on make_circles dataset
Merge pull request #809 from zaxtax/master
Merge pull request #812 from amueller/pipeline_decision_function
typo
Add note for port install py27-scikits-learn
trailing space
add missing attribute estimators_ to the docstring of forest models
FIX #898: narrative documentation for feature importances in forest models
Merge pull request #921 from fhoeni/scaler_bugfix
FIX: heisentest for robust covariance: seed MinCovDet
Merge pull request #926 from agramfort/fix_X_list_grid_search
Merge pull request #928 from yarikoptic/master
FIX #937: preserve double precision values in svmlight serializer
add a what's new entry
work on svmlight serializer to preserve double precision values
track master
Merge pull request #945 from cpa/master
Merge pull request #971 from acompa/master
Update doc/support.rst
Merge pull request #955 from vene/mem_prof
Merge pull request #995 from kernc/CountVectorizer_analyzer_char_nospace
fix broken doctests for the new char_wb text analyzer
DOC: better narrative for char_wb text analyzer + add a whats_new entry
Merge pull request #1043 from jaquesgrobler/master
Merge pull request #1039 from jakevdp/lle-test-fix
Merge pull request #1045 from agramfort/fix/as_float_array
Merge pull request #1049 from fsav/c-docstring-patch
Merge pull request #1063 from welinder/peter-dev
Merge pull request #1009 from amueller/one_class_check
Merge pull request #1094 from ibayer/warnings
Merge pull request #1100 from NelleV/makefile
Merge pull request #1110 from buma/predict_proba_doc
ENH: pass verbose consistently in forest module
cosmit
FIX: wrong probabilities for OvR LogisticRegression
ENH: make test_common check normalized probabilities
Merge pull request #1189 from fabianp/svmlight
Merge pull request #1187 from ogrisel/bugfix-logistic-ovr-probabilities
FIX: broken doctest for DictVectorizer
FIX: missing figures in FA narrative doc
Merge pull request #1266 from cdeil/patch-1
Merge pull request #1292 from aymas/pass_rng_kmeans_gmm
Merge pull request #1344 from mattilyra/CountVectorizer.decode
FIX: missing # for comment in pyx file and readded missing AMI docstring
FIX: lars drop for good platform specific test failure
FIX #1354: machine precision assertion failure in test_liblinear_random_state
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Merge pull request #1361 from astaric/py3k
DOC: make MinMaxScaler example snippet readable outside of other sections context
DOC: more improvements / fixes on the MinMaxScaler doc
Merge pull request #909 from larsmans/hashing-trick
Merge pull request #1397 from SnippyHolloW/travis
Improved bench_covtype.py to load data faster and support configurable n_jobs
Merge pull request #1415 from SnippyHolloW/travis
Merge pull request #1418 from kuantkid/archlinux
Merge pull request #1408 from satra/fix/rebase1396
Merge pull request #1425 from arjoly/enh_bench_covertype
Merge pull request #1424 from jaquesgrobler/plot_omg_fix
FIX #1417: move nosetests configuration parameter to setup.cfg
Remove doctest-options from setup.cfg as not supported in old version of nose
Merge pull request #1430 from erg/issue-1407
Merge pull request #1429 from tnunes/fix_pipeline_fit_transform
Merge pull request #1440 from amueller/matplotlib_requirement
Display the test names to understand which test is triggering the segfault on jenkins
Merge pull request #9 from benjaminwilson/master
FIX: fixed random_state for heisen doctest failure in multiclass module
Merge pull request #1468 from erg/random-failures-12345
Delete iris.dot in tree.rst doctest
FIX: seed blobs dataset to have a stable spectral clustering under OSX 10.8
Merge pull request #1470 from kuantkid/fix_spectral_cluster_test
Add comment in test_spectral_clustering_sparse
Merge pull request #1465 from AWinterman/issue-1017
first pass at implementing sparse random projections
DOC: better docstrings
DOC: more docstring improvements
Remove non-ASCII char from docstring
use random projections in the digits manifold example
test embedding quality and bad inputs (100% line coverage)
typos
one more typo
OPTIM: CPU and memory optim by using a binomial and reservoir sampling instead of direct uniform sampling in the n_features space
note for later possible optims
fix borked doctests
make it possible to use random projection on the 20 newsgroups classification example
FIX: raise ValueError when n_components is too large
remove the random projection option from the 20 newsgroups example
leave self.density to 'auto' to implement the curified estimator pattern
more curified estimator API
useless import
change API to enforce dense_output representation by default
ENH: vectorize the johnson_lindenstrauss_bound function
started work on plotting the JL bounds to be used in the narrative documentation
More vectorization of the johnson_lindenstrauss_bound function
More work on the JL example to plot the distribution of the distortion
WIP: tweaking JL function names
check JL bound domain
JL Example improvements
WIP: starting implementation implicit random matrix dot product
working on implicit random projections using a hashing function
OPTIM: call murmurhash once + update test & example
first stab at CSR input for hashing dot projections
implemented dense_output=False for hashing_dot
refactored test to check that both materialized and implicit RP behave the same
fixed broken seeding of the hashing_dot function
leave dense_output=False by default
use the 20 newsgroups as example dataset instead
make it possible to use a preallocated output array for hashing_dot
missing docstring and s/hashing_dot/random_dot/g
eps=1.0 is no longer a valid value
Typo / fix in JL lemma example
FIX: MinMaxScaler on zero variance features
Simpler inline comment
Add one more test for MinMaxScaler on newly transformed data
ENH: issue warning when minmax scaling integer data + test
ENH: add the squared hinge loss to the SGD loss example
Merge pull request #1517 from amueller/lda_qda_cleanup
Merge pull request #1562 from kmike/master
P3K: avoid iteritems / itervalues when feasible
P3K: decode error message in svm wrapper
ENH: output processing speed in MB/s for vectorizer example
Initial work on hashing vectorizer
Add fit_transform support using the TransformerMixin + missing ABCMeta marker
Improved the clustering example with HashingVectorizer
Remove TransformerMixin from vectorizers and do a direct fit_transform alias for HashingVectorizer instead
Improve module docstring of document clustering example
cosmit
Updated whats_new.rst
DOC: Started section on hashing vectorizer in narrative section
DOC: narrative doc for HashingVectorizer
DOC: typos
DOC: merged the whats new entries and add links to the narrative doc
DOC: address @mblondel's comments
ENH: measure feature extraction speed in document classification example
DOC: typos
Update travis config to remove -qq flag for scipy
P3K: support for py3k in dict_vectorizer module
PY3: Fix stdout capture in graph lasso test
P3K More python 2 / 3 compat in tree exports
Merge pull request #1660 from rlmv/fe_tests
P3K use six to have a python 2 & 3 compatible code base
Merge pull request #1726 from agramfort/round_kfold
Merge pull request #1730 from arjoly/doc-feature-selection
Merge pull request #1741 from arjoly/metrics-fix-np-1.3
PY3: Disable lib2to3
PY3: fix urlopen in mldata and california housing loaders
PY3: fix remaining cStringIO imports
PY3: fix for string literals in datasets' test_base.py
PY3: print function in coordinate descent doctest
PY3: record is a kwarg argument for warnings.catch_warnings
PY3: long is no longer a type in Python 3
Merge pull request #1839 from amueller/dbscan_example
FIX: use the mldata mock in docstring as well
Merge pull request #1913 from Jim-Holmstroem/refactored_precision_recall_fscore_support_to_count_with_integer_type
FIX: restore numpy 1.3.0 compat with np.divide fix
FIX #2032, FIX #2033: ensure module names consistency with __all__
Remove redundant test that was checked in by mistake
FIX inconsistent cv_scores_ generation for randomized search and re-add example
ENH: removed leftover condition to get a wider application of the import all consistency check
Enforce n_folds >= 2 for k-fold cross-validation
Merge pull request #2004 from oddskool/out-of-core-examples
FIX: make doc auto-linking support any Unicode / UTF-8 content
Make the out-of-core example plot work when launched by the sphinx extension
FIX: do not print too many messages to stdout when generating the documentation
PY3: New test for the get_params handling of deprecated attributes.
Better status for the Py3 port
Merge more Py3 fixes
PY3: refcounting change introduced a regression on the use of resize in LARS
FIX: pep8 and Py3 support in sklearn.neighbors.base
FIX: Python 3 support for the neighbors doctests
FIX: pep8 + Py3 fixes in test_dist_metrics
FIX: pep8 and Py3 support in sklearn.neighbors.dist_metrics
FIX: Py3 / pep8 fixes in test_ball_tree / test_kd_tree
Update Python 3 support status
Style
More readable condition and more precise error message
FIX: Py3 print statements to print functions
Rename LabelBinarizer.multilabel to .multilabel_ + DOC
WIP: partial fit for discrete naive Bayes models
Remove the class_prior partial_fit param
WIP: started to factorize the raw count collection
Incrementally is useless now
Add reference to the Manning text + restore previous smoothing
FIX shape issue when y has only one single class + some missing doc
Factorize common classes checks in partial_fit implementations
Add note on a possible future performance optimization
Add a note on performance tradeoffs in the docstring of partial_fit
More informative error message. Also, CV now uses integer indices by default.
Use floats everywhere to get rid of warnings when using sample_weight
More input checks
Better test name
Remove redundant shape check already done by check_arrays
Add missing test for sample weight with partial_fit + fix issue classes passed as a list instead of an array
One more input check test
Add missing test for deprecation warning
Found a bug: add a failing test
Use unique_labels more consistently in the multiclass model
Fix broken partial_fit test
Factorize label_binarize for binarizing a sequence of labels with fixed classes
Add a new whats_new entry
Add some doc for the new partial_fit method
wording
Avoid raising a deprecation warning on label_binarizer_.multilabel_
Fix docstring and add some usage examples
FIX: do not update feature_log_prob_ in _update_class_log_prior
Add one more test to check the performance on digits
Make test_deprecated_fit_param pass under python 3 as well
Address wording and typos identified in review
Better parameterization for test_check_accuracy_on_digits
Add a whitespace in parameter docstring item
More accurate documentation for class_count_ and feature_count_
Rename helper partial_fit function
Merge pull request #2175 from ogrisel/nb-partial-fit
Merge pull request #2228 from amueller/travis_virtualenv_stuff
Trying to enable python 3.3 too.
Update .travis.yml
One more Python 3 fix in feature_extraction.rst
Py3 fix
More explicit tests in test_label_binarizer_column_y
Catch expected warning in sklearn/tests/test_naive_bayes.py (part of #2274)
Revert "Catch expected warning in sklearn/tests/test_naive_bayes.py (part of #2274)"
FIX PY3: list and tuples cannot be compared in Python 3
Py3: fix version comparison in imputation module
Add supported python versions to the classifiers + fixes
Sample compiler config for windows
Force stdc++ link for the windows build
Regenerate pairwise_fast.pyx with recent cython for windows build
Fix atomics definitions under windows for sklearn._hmm.pyx
typo
Use extra_link_args for -lstdc++
Ignore compiled shared library files generated in the source tree under windows
Merge pull request #2293 from amueller/warning_input_shapes
Rename cv_scores(_) back to grid_scores(_) to keep the name free for a future refactoring
Merge pull request #2299 from ogrisel/grid-scores
WIP: explicitly mark all base classes as ABC with abstractmethod inits
Add concrete __init__ for LinearSVM
Add concrete implementation for SGDClassifier
Ignore the generated MANIFEST file
Fixed a typo in a contributor's name
Also clean the dist folder when calling make
Add missing credit to @smoitra87 for the Python 3 support
partial_fit for naive Bayes was done for 0.14-rc, not 0.11...
Merge pull request #2348 from arjoly/deprecate-auc_score
DOC: Simpler cross-validation iterator doc
Merge pull request #2369 from larsmans/no-warnings-in-fs
2-fold => 2-fold cross-validation
approximately the same percentage of samples
Remove spurious print statements in sample snippets to make the doc easier to follow
Lowercase y and more consistent blank line usage
Merge pull request #2288 from dengemann/fast_dot
Merge pull request #2396 from dengemann/insert_fast_dot
FIX: re-enable test_k_means_plus_plus_init_2_jobs on OSX 10.8
Fix broken link
Merge pull request #2370 from ogrisel/doc-cross-validation
Merge pull request #2278 from jnothman/prf_rewrite3
Merge pull request #2222 from FedericoV/Out_of_core_example
More detailed entry for StratifiedKFold fix in whats_new.rst
Revert "More detailed entry for StratifiedKFold fix in whats_new.rst"
Revert "Add entry for #2372 to whats_new.rst"
Revert "Update comment with numbers for when we run with 800 samples."
Revert "Avoid list, preallocate a numpy array for indices instead."
Revert "Instead of linking to NB, explain the problem inside the test itself."
Revert "Fix accidental doctest breakage."
Revert "FIX #2372: StratifiedKFold less impact on the original order of samples."
FIX: more accurate description of eta0 in SGDClassifier
Merge pull request #2442 from glouppe/tree-shuffle
FIX #2372: non-shuffling StratifiedKFold implementation and updated tests
Merge pull request #2463 from ogrisel/stratified-kfold
FIX: skip numpy.dot + multiprocessing test that segfaults under recently updated OSX 10.8
FIX: broken doctest impacted by stratified CV and tree RNG changes combined...
FIX: use next(iterator) built-in instead of iterator.next()
Merge pull request #2495 from jaquesgrobler/ENH_coverage_travis
Merge pull request #2487 from jaquesgrobler/DOC_speed_up_frontpage
Merge pull request #2502 from ericjster/plot_dbscan
FIX: clean-build target no longer exists
Merge pull request #2516 from jaquesgrobler/add-copybutton-examples
FIX #2481: add warning for bug in old numpy with unicode
Remove leftover print statement
Merge pull request #2519 from ankit-maverick/minor_docfix
Use a RuntimeError instead of a warning to avoid raising a ValueError randomly later
Merge pull request #2523 from ogrisel/skip-numpy-unicode-bug
Release the GIL at tree building time
Put more code under non-GIL block
Merge pull request #2528 from ogrisel/tree-nogil
FIX: make RANSACRegressor pass test_common
FIX: np.ones_like does not support dtype on old numpy
Add an estimator introspection check to test_common
Merge pull request #2538 from ogrisel/test-test-common
Merge pull request #2536 from larsmans/rm-mlcomp-doc-class
Add plot titles and newstyle plt import in OLS vs Ridge example
OPTIM: remove memcopy for X_argsorted in GBRT models
Merge pull request #2524 from rmcgibbo/hmmfix
FIX #1622: OPTIM: remove obsolete random_state instance in the Tree class
Merge branch 'pr/2556'
Better title
ENH: Make forest prediction code more robust to mutations of the estimators list
Merge pull request #2571 from edran/readme
Only run make test when coverage is disabled
Merge pull request #2574 from ogrisel/travis-speed
FIX: errors in kernel parameters for OneClassSVM
Merge pull request #2460 from arjoly/auc-multilabel
FIX: broken sparse matrix check under scipy 0.13.0
FIX: Python 3 dict keys cannot be concatenated with a list
Merge pull request #2631 from pprett/fix-sgd-l1-ratio
ENH: more explicit traceback + message in case of common failure on sparse input
Merge pull request #2592 from jnothman/prf_derivative_warnings
Merge pull request #2649 from jakevdp/polynomial_doc
Merge pull request #2664 from agramfort/fix_randomized_pca
FIX: test failure under windows caused by temp files handling
ENH Upgrade to joblib 0.8.0a2
ENH: reduce memory usage and IPC overhead when fitting forests by using the new threading backend
DOC Update whats_new.rst to document threading backend for forests
FIX: broken sed command under OSX in joblib sync script
ENH: parallelize the BaseForest.apply method with the threading backend
Merge pull request #2700 from amueller/refactor_common_tests
Merge pull request #2720 from amueller/chi2_fused_types
ENH: make forests' test_parallel_train run faster
FIX: np.searchsorted numpy bug on unicode objects also impacts 1.6.1
Python 3 compat
Python 3 fixes in exercises
Python 3 fixes for the second exercise
FIX #1565: fix race condition in parallel pre-dispatch by upgrading joblib
Update whats_new.rst for race condition fix
Merge pull request #2756 from ogrisel/joblib-0.8.0a3
FIX #2645: fix 20 newsgroups downloader under Python 3
Add thanks for infrastructure supporters
ENH: made example/svm/plot_iris.py clearer
peerindex testimonial
FIX #2924: make lobpcg test pass with reference and ATLAS impl of LAPACK
More stable test problem for eigen_solver='lobpcg'
Merge pull request #2928 from ogrisel/fix-linalgerror
ENH: update installation instructions for Ubuntu / Debian
Merge pull request #2973 from pprett/datarobot-testimonial
Merge pull request #2959 from Oscarlsson/silhouette_score_label_number
FIX: np.abs might not work on scipy.sparse matrices
Merge pull request #2981 from matrixorz/master
Merge pull request #2997 from GaelVaroquaux/okcupid_testimonial
Merge pull request #2982 from kmike/fix-macos-hangs
FIX: more robust skip of implicit constructor
Merge conflict in whats_new.rst
Merge pull request #3006 from ogrisel/fix-implicit-init-introspection
Merge pull request #3007 from Manoj-Kumar-S/refactor_sparsefuncs
FIX: np.random.randint expects signed 32 bit integers under Windows
FIX: numpy mtrand does not accept Python long instances under Windows
FIX: make python setup.py clean also delete __pycache__ folders
FIX: remove casting warning under Python 3
FIX #3014: use a different folder for covtype data under Python 3
FIX: fix build under Python 2.6
COSMIT: More readable drop for good test
FIX: checked in bad assertion in the last commit by mistake
Remove redundant yet unstable test_spectral_lobpcg_mode
ENH: make CD Lasso raise ConvergenceWarning
Merge pull request #3030 from ogrisel/fix-lars-drop-for-good-test-failure
Merge pull request #3025 from ogrisel/remove-unstable-lobpcg-test
ENH: stable test + catch warning + pep8
ENH: more robust test_toy_bayesian_ridge_object
ENH: configure travis to also test old numpy & scipy
DOC: add some header doc to the travis scripts
More informative numerical error message in SGD
cosmit
Merge pull request #3059 from mdbecker/update_from_33_to_34
Merge pull request #3061 from jess010/ami-docs
Fix merge conflict and missing URL ref in whats_new.rst
Merge pull request #3070 from luispedro/fix-enet-doc
Merge pull request #3062 from abatula/archlinux-install-doc
More Py3 fixes for sphinx build
Merge branch 'python-3-sphinx-fixes'
Merge pull request #3063 from chalmerlowe/improved-digits-example
FIX: One more PY3 fix in the documentation generator
FIX: typo in gen_rst.py
Merge pull request #3069 from ssaeger/issue_3068
Merge pull request #3085 from eickenberg/update_ridge_authors
Merge pull request #3082 from ElDeveloper/install-link
Merge pull request #3067 from mdbecker/truncated_svd_calculate_explained_variance
Merge pull request #3083 from sdenton4/enh_learning_curves
Merge pull request #3086 from mdbecker/update_authors
FIX: remove deprecation warnings in learning curves under Python 3
Merge pull request #3090 from ogrisel/learning-curves-warnings
COSMIT: use plural, there are 2 learning curves
typo
wrap lovely testimonial's paragraph
Merge branch 'pr/3091'
MAINT: ignore coveralls failures
Merge pull request #3145 from ogrisel/travis-coveralls
FIX: use clip(0) instead of abs()
Merge pull request #3182 from arjoly/test-forest
CI: make travis run the doctests
Merge pull request #3189 from ogrisel/travis-doctest
FIX: more Python 3 fixes for doc/gen_rst.py
FIX: euclidean divide in plot_image_denoising.py for Py3 support
PY3: more gen_rst.py fixes for Python 3 compat
DOC: better docstrings for PCA models
FIX: windows test failures in test_ransac
Merge pull request #3169 from mjbommar/issue-3167-eradicate-todense
ENH: PEP257 style + small code simplification
ENH: more explicit failure messages in test_common
MAINT: joblib 0.8.0
Merge pull request #3212 from ogrisel/joblib-0.8.0
Add optional PCA init to t-SNE
MAINT: run tests on files with the exec bit
Merge pull request #3234 from ogrisel/nosetests-exe
MAINT: bump up to scipy 0.14.0 in travis CI config
Merge pull request #3237 from ogrisel/travis-scipy-0.14.0
Merge pull request #3161 from hamsal/ada-sparse
MAINT: joblib 0.8.1
Merge pull request #3242 from ogrisel/joblib-0.8.1
ENH use the np.int dtype to encode integer classes
FIX workaround doctest failure with old numpy
Merge pull request #3248 from MechCoder/remove_precompute_multi
Merge pull request #3246 from ogrisel/rebased-pr-2657
DOC add whats new for precompute fix
Merge branch 'pr/3247'
FIX use NPY_INFINITY instead of C99 INFINITY for MSVC
Merge pull request #3251 from ogrisel/fix-msvc-INFINITY
Branching 0.15.X to prepare the 0.15.0b1 release
Merge pull request #3261 from larsmans/test-interaction-features
Merge branch 'pr/3263': minor fixes from the 0.15.X branch
MAINT: restore -git version in the master branch after the last release merge
DOC: remove broken example link
MAINT removed shadowed broken test
Merge pull request #3250 from mjbommar/isotonic-refactor-2
WIP: releasing the GIL in the inner loop of coordinate descent
Merge pull request #3102 from MechCoder/gil-enet
DOC fix versions of requirements in README.rst
Merge pull request #3295 from rphlypo/bootstrap_traintest
DOC fix bad formatting in decomposition.rst
FIX python 3 compat for plot_bias_variance.py
DOC safer way to run the tests under windows
FIX Python 3 support for datasets.species_distributions
FIX ensure that matplotlib.use('Agg') is called first when building the docs
DOC formatting fixes in whats_new.rst
DOC one formatting fix in whats_new.rst
FIX Python 3 support for examples/applications/plot_out_of_core_classification.py
FIX Python 3 support for examples/applications/plot_prediction_latency.py
DOC hmm example has been removed
DOC remove reference to the bootstrap target
DOC remove spurious plt.clf() from ROC example
DOC add missing plt.figure() for multiplot support in ROC example
Merge pull request #3256 from kastnerkyle/windows_download_helper
Convert windows_testing_downloader.ps1 from UTF-16le to UTF-8
Merge pull request #3268 from jblackburne/nonrepeating_roc_thresholds
MAINT bump joblib to 0.8.2
Merge pull request #3328 from ogrisel/joblib-0.8.2
MAINT skip tests that require large data download under travis
MAINT add docstring to explain the motivation of the fixture
Typo
Merge pull request #3355 from larsmans/optipng
TST fix precision failure on windows
MAINT skip joblib multiprocessing tests on travis
Merge pull request #3361 from ogrisel/travis-no-multiprocessing-joblib
FIX assert_array_almost_equal for windows tests
FIX multilabel deprecation warning in RidgeClassifierCV
FIX #3372: unstable input check test for RANSACRegressor
MAINT ensure that examples figures are displayed in the correct order
DOC Refreshed the documentation to install scikit-learn
MAINT: skip some unstable transformers test under Win 32
MAINT configure Windows CI with appveyor.com
DOC various typos and fixes for the installation doc
DOC phrasing
MAINT master to 0.16-git
DOC whats_new.rst format fix
MAINT .mailmap update
MAINT whats_new: contributors for 0.15
DOC Copy and paste error, thanks @larsmans.
DOC update news on the homepage
DOC broken formatting for the People list of 0.15
MAINT update doc version navigation
Merge pull request #3377 from ldirer/hashing_fix3356
MAINT update doc version links in support.rst
MAINT point to the configuration of the docbuilder server
MAINT temporary fix to handle cython renaming in 0.15
MAINT 32 bit unstable tests are unstable on all OSs
MAINT typo in Makefile
Merge pull request #3349 from MechCoder/return_niter
TST non-regression test for CV on text pipelines
Merge pull request #3400 from arjoly/forest-test-oob
Merge pull request #3396 from amueller/less_loud_randomized_lasso
Merge pull request #3397 from arjoly/tree-factor-rand
MAINT faster test_weight_boosting
TST speedup test_spectral_biclustering
TST faster check_transformer_pickle by fixing n_iter
TST speedup test_permutation_score
TST remove joblib tests from sklearn
Merge pull request #3403 from ogrisel/speedup-tests
Merge pull request #3460 from jnothman/fix_nn_example_doc
Merge pull request #2777 from jnothman/doc_linkcode
FIX better RandomizedPCA sparse deprecation
Merge pull request #3470 from ogrisel/fix-sparse-randomized-pca-deprecation
MAINT disable verbose tests on travis
MAINT Move the build_ext --inplace call to install.sh
Merge pull request #3441 from ogrisel/travis-no-verbose-tests
MAINT More robust windows installation script
MAINT move skip for unstable 32bit to _check_transformer
FIX unstable test on 32 bit windows
FIX numerically unstable test_logistic.py
DOC whats_new.rst entry for warm_start forests
Merge pull request #3454 from ldirer/confusion_matrix_example
TST stabilize check_transformer_n_iter
FIX revert changes introduced by mistake in previous commit
Merge pull request #3492 from furukama/patch-2
COSMIT simpler & more robust check in test_kfold_valueerrors
MAINT fix prng in test_f_oneway_ints
DOC whats_new.rst: missing backported fixes for 0.15.1
FIX #3485: class_weight='auto' on SGDClassifier
FIX wording in whats new in 0.15.1
ENH removed manual code for parallel task batching forests
ENH: parallel feature importances for forests
FIX #3566: redefine isfinite alias in sklearn
Merge pull request #3612 from calvingiles/fix-resource-warnings
ENH cosmetic reorg of the confusion matrix example
FIX: divide each row of the CM by the true support
FIX bump up the miniconda installation script
Merge pull request #3619 from ogrisel/fix-miniconda
FIX define CC and CXX for travis
FIX Windows CI: use prebuilt numpy / scipy
FIX heisenfailure in test_lasso_lars_path_length
Merge pull request #3629 from ogrisel/fix-unstable-lars-test
FIX heisenfailure on 32 bit python + speedup
FIX finfo.eps should be used instead of tiny for equality checks
ENH upload Windows wheels to rackspace
DOC fixes in 0.15.2
FIX more tolerant early stopping for least angle
MAINT CI: reflect CLI change in wheelhouse-uploader tool
FIX #3370: better lars alpha path inequality checks for 32 bit support
ENH comment on drop for good not being triggered in the tests
Merge pull request #3224 from pprett/gbrt-sample-weight
Merge pull request #3285 from kastnerkyle/incremental_pca
MAINT use Python versions provided by AppVeyor CI
FIX call srand whenever random_seed >= 0 in libsvm
MAINT missing numpy / scipy for the Python 3.3 build
MAINT remove integer warnings under Python 3 + numpy 1.8+
FIX warning check in test_affinities
FIX #3503: use linalg.pinv to better deal with singular input data
MAINT CI: collect rackspace DNS info
MAINT try to avoid HTTP timeouts to rackspace
FIX #2986: ZeroDivisionError in LinearRegression on sparse data
Merge pull request #3853 from ogrisel/backport-lsqr-fix
FIX make assert_raises_regex backport for 2.6 consistent with 2.7+
ENH use specific warning class for RP
Merge pull request #3860 from ogrisel/custom-rp-warning
Merge pull request #2949 from FlorianWilhelm/theilsen
Merge pull request #3173 from arjoly/sparse-tree
Merge pull request #3870 from FlorianWilhelm/whats_new_theilsen
DOC update whats_new.rst (sparse data for trees)
Merge pull request #3826 from amueller/kde_y_none
Merge pull request #3871 from anntzer/fast-affinity-propagation
Merge pull request #3858 from banilo/km_init
MAINT bump joblib to 0.8.4
DOC update whats_new.rst for AffinityPropagation optim
FIX clip SGD gradient on linear models for stability
TST update numerical overflow (non-regression) tests
DOC update whats_new.rst for SGD stability
DOC target URLs on SGD loss / penalty figures
Merge pull request #3882 from MechCoder/doc_tree
MAINT enable verbose output to track random failure on appveyor
Merge pull request #3902 from JeanKossaifi/doc_enhancement
Merge pull request #3915 from ThomasDelteil/fix_css_examples
ENH do not use plt.cm.jet in JL bound example
ENH improve LSHF scalability example
DOC: explain the importance of iid index and queries in bench script
ENH: ensure that max hash size and hash dtype are consistent
ENH more LSH scalability example and doc improvements
MAINT dev version to follow PEP440
MAINT more informative assertion failure in test_common
MAINT make it possible to use wheelhouse-uploader
Merge pull request #4013 from amueller/deprecated_0.16
Merge pull request #3991 from jnothman/lshforest_improvements
Merge pull request #4030 from amueller/add_pipeline_docstrings
Merge pull request #4031 from amueller/check_scoring_fixes
ENH better RBF parameters heat map
Merge pull request #4095 from jjhelmus/bug_local_import
Merge pull request #4058 from amueller/cleanup_common_tests
Merge pull request #4018 from ragv/fix_for_nan_scale
STYLE cosmetics / unused import
FIX #4059: explicit warning for strict selectors
Merge pull request #4186 from xbhsu/non-inheriting_estimator_tests
Merge pull request #4064 from amueller/pipeline_scoring_y_none
TST more strict selectors with empty support
MAINT enable travis notifications in gitter
MAINT enable appveyor notifications in gitter
FIX tests in gmm broken by previous rebase
Merge pull request #4233 from amueller/minor_doc_improvement_ensemble
FIX buggy idiom found by pyflakes
FIX regularized covariance on 1D data
FIX stricter validation for non-empty input data
FIX broken test under 64 bit Python 2 / Windows
Merge pull request #4251 from ogrisel/fix-shape-format
Merge pull request #4266 from lesteve/python3-fix-plot-kmeans-silhouette-analysis
MAINT mark _sigmoid_calibration private
ENH ensure that a warning is raised when sample_weight is not supported
DOC update contributors for the calibration features
FIX DOC TST: alpha is an upper bound on FDR
Merge pull request #4057 from amueller/dtype_object_conversion
DOC cosmetics in test docstring
DOC document GMM fix in whats_new.rst
Merge pull request #4284 from hbredin/dgpmm_convergence
Merge pull request #4261 from ragv/svm_scale_c_plot
DOC update whats_new for deterministic spectral_embedding
FIX test for consistent handling on empty input data
DOC whats_new.rst for BernoulliNB fix
TST check that constructor params are not mutated by fit
DOC whats_new.rst for radius_neighbors boundary handling
Merge pull request #4318 from amueller/skip_omp_cv_on_travis
Merge pull request #4313 from vortex-ape/spectral_clustering
Merge pull request #4192 from jnothman/binary_iff_binary
Merge pull request #4307 from amueller/more_quite_testing
Merge pull request #4326 from Barmaley-exe/svm-dual-coef-fix
DOC fix broken image link
DOC add some missing API links in whats_new.rst
STYLE trailing spaces
ENH no need for tie breaking jitter in calibration
ENH improve docstrings and test for radius_neighbors models
TST boundary handling in LSHForest.radius_neighbors
Merge pull request #4317 from ogrisel/fix-radius-queries
Merge pull request #4028 from ogrisel/wheelhouse-uploader
DOC deprecate random_state for DBSCAN
Merge pull request #4302 from amueller/isotonic_regression_duplicate_fixes
MAINT use canonical PEP-440 dev version consistently
FIX #4358: more tolerant test_scorer_sample_weight for windows
FIX revert wrong previous fix and make weighted scorer test deterministic
MAINT skip unstable (arpack dependent) estimators in check_pipeline_consistency on Win32
Merge pull request #4295 from ragv/sigmoid_decision_fn
FIX & TST at least min_samples to be considered a core sample
DOC whats new for DBSCAN optims
DOC fix broken image link in feature selection
Merge pull request #4402 from amueller/ompcv_fix
Merge pull request #4388 from Barmaley-exe/svr-doc-n-test
Merge pull request #4407 from agramfort/calibration_nan_fix
Merge pull request #4370 from amueller/0.17_deprecated
Merge pull request #4416 from vortex-ape/kernel_precomputed
Merge pull request #4377 from vortex-ape/intercept_scaling
Merge pull request #4322 from amueller/kneighbors_include_self_fixes
Merge pull request #4368 from xuewei4d/deprecate_estimator_params
Merge pull request #4409 from bryandeng/immutable_defaults
Merge pull request #4422 from trevorstephens/y_numeric_for_np180
DOC whats_new entry for stabler StandardScaler
Merge pull request #4432 from ragv/travis_ignore_docstring
FIX test for older versions of numpy
Merge pull request #4427 from amueller/lda_lsqr_predict
Merge pull request #4439 from bryandeng/dataset_docs
Merge pull request #4350 from vortex-ape/build_issue
Merge pull request #4189 from amueller/sparse_decision_function
DOC fix link to Birch in whatsnew.rst
Merge pull request #4423 from vortex-ape/agg_clustering
DOC typo
DOC add warning for partial pip upgrades
FIX respect astype(array, dtype, copy=True) on old numpy
TST add missing test for astype backport
STYLE PEP8
Merge pull request #4542 from ogrisel/fix-doc-partial-pip-upgrades
FIX make shuffle / resample pass-through indexing utilities
MAINT remove debug nslookup call
STYLE PEP8
FIX make shuffle / resample pass-through indexing utilities
MAINT do not ctags index non-source artifacts
STYLE whitespace around operator in sklearn.utils.fixes
Merge pull request #4584 from sdegryze/fit-predict-for-pipeline
Merge pull request #4590 from jfraj/fix_bug4559
Merge pull request #4606 from ibayer/remove_order_check
Merge pull request #4608 from jfraj/test_pickle_bunch
Merge pull request #4604 from jnothman/faster_lshforest
Merge pull request #4610 from sseg/ex-wikipedia-py3
Merge pull request #4613 from bnaul/graph_lasso_tests
Merge pull request #4371 from saketkc/fix_setup
Merge pull request #4550 from amueller/common_test_refactoring
Merge pull request #4650 from lesteve/fix-4641
FIX remove deprecation warning
Merge pull request #4654 from amueller/fix_partial_dependency_percentiles
FIX #4597: LassoLarsCV with readonly folds
Merge pull request #4683 from yanlend/patch-1
Merge pull request #4725 from jseabold/avoid-nan-equals-comp
DOC add comment for np.nan equality test in clone
FIX Gram OMP check_finite on old scipy
Merge pull request #4705 from amueller/cross_val_predict_input_types
Merge pull request #4697 from amueller/remove_warning_test_pandas_slicing
Merge pull request #4730 from amueller/ovr_classifier_sparse_coef
MAINT use the new container-based travis workers
Merge pull request #4741 from amueller/multioutput_warning
Merge pull request #4748 from TomDLT/check_maxiter
Merge pull request #4754 from TomDLT/astype_fix
FIX fix test_lars_path_readonly_data failure under Windows
FIX ResourceWarning in twenty newsgroups loader
Merge pull request #4362 from rvraghav93/make_PyFuncDistance_picklable
STYLE removed unused import, fixed comment indent and spacing
Merge pull request #4751 from amueller/dict_learning_n_jobs_bug
Merge pull request #4765 from alexeygrigorev/master
FIX typo in RobustScaler example
DOC various fixes in whats_new.rst
Merge pull request #4770 from TomDLT/logistic
FIX pass random_state to inner call to randomized_svd
STYLE cosmetic fixes in sklearn.mixture.gmm
MAINT remove reference to HMM in common tests
Merge pull request #4872 from ogrisel/remove-hmm-ref-in-tests
MAINT make sure we do not download large datasets on appveyor
Merge pull request #4928 from rvraghav93/pls_deprecated
Merge pull request #4927 from jnothman/hiding_predict_proba
MAINT ignore Python import state in ctags
MAINT make it possible to use plain HTTP to download pre-built numpy & scipy from rackspace container
Merge pull request #4983 from ogrisel/appveyor-http-rackspace
Merge pull request #4908 from TomDLT/rcv1
Merge pull request #4984 from TomDLT/rcv1-subset
DOC updated whatsnew for joblib 0.9.0b2
Merge pull request #4961 from mrphilroth/issue4959
FIX seed the initialization of NMF
Merge pull request #5020 from ogrisel/fix-nmf-rng-seeding
Merge pull request #5017 from lesteve/fix-coveralls-badge
Merge pull request #5021 from lesteve/make-copy-joblib-sh-python3-compatible
Merge pull request #5005 from amueller/drop_old_doc_links
Merge pull request #4967 from stephen-hoover/threshold-boosting-samme_proba
Merge pull request #5019 from glouppe/voting-get_params
Merge pull request #4957 from dotsdl/issue-4614
Merge pull request #5113 from fzalkow/master
Merge pull request #5133 from arthurmensch/lasso_perf_checks
Merge pull request #5081 from amueller/transformers_consistent_n_samples
Merge pull request #5098 from olologin/OneClassSvm_sparse_test
DOC explain fork related issues in FAQ + stopgap for Python 3.4
Merge pull request #5167 from NoonienSoong/clustering_example
Merge pull request #5161 from beepee14/sparse_prediction_check
DOC add what's new entry for cross_val_predict fix
PEP8 in cross_validation module + tests
MAINT bump joblib to 0.9.0b4 to use forkserver for Py3 & POSIX
DOC update FAQ on multiprocessing forkserver
MAINT enable multiprocessing + kmeans test on Python 3.4
DOC whats_new entry for forkserver
FIX deprecation message for 1d data
Merge pull request #5228 from jmschrei/gb_apply
MAINT re-cythonize sklearn/tree/*.pyx
MAINT PEP8 / slightly faster tests in decomp LDA
Merge pull request #5249 from MechCoder/init_centroids_bug
OPTIM make (Online) LDA reuse a joblib.Parallel instance
Merge pull request #5206 from acganesh/numpy_scipy_ver_check
Merge pull request #5245 from ogrisel/lda-acronym-deprecation
Merge pull request #5280 from ogrisel/pr-4924-rebased
Merge pull request #5293 from amueller/pipeline_X1d_inverse
MAINT use inspect.signature for introspection
MAINT build and test with Python 3.5 on appveyor
MAINT update numpy / scipy wheels used by appveyor
Merge pull request #5303 from larsmans/faster-lda
MAINT configure the appveyor's cache for pip
Merge pull request #5302 from ogrisel/python-3.5-appveyor
ENH better error message for estimators with ensure_min_* checks
FIX consistency of memory layout for linear CD solver
FIX ensure contiguous Gram matrix in graph lasso
Merge pull request #5337 from ogrisel/fix-coordinate-descent-memory-layout
ENH utility to have distinct dataset .pkl filenames
FIX separate filenames for RCV1
FIX separate filenames for species distributions
FIX separate filenames for covtype
FIX separate filenames for 20 newsgroups
FIX separate filenames for california housing
FIX separate filenames for Olivetti faces
DOC what's new entry for dataset fetchers
Merge pull request #5356 from rvraghav93/ridge_appveyor_failure
Merge pull request #5386 from jmschrei/contrib
Merge pull request #5384 from map222/patch-1
Merge pull request #5362 from MechCoder/lasso_fix
Merge pull request #5234 from vighneshbirodkar/mcd_fix
Merge pull request #5378 from amueller/gridsearch_docs_unsupervised
Merge pull request #4478 from amueller/fix_randomized_svd_transpose
ENH use explicit decimal keyword & PEP8
DOC whats_new entry for randomized_svd heuristic
Merge pull request #5395 from amueller/some_test_deprecation_warnings
Merge pull request #5399 from lesteve/update-joblib-to-0.9
Merge pull request #5413 from amueller/cleanup_tests
Merge pull request #5411 from lesteve/fix-plot-tomography-l1-reconstruction-example
MAINT Release 0.17b1
MAINT Use the full listing of the rackspace wheelhouse for appveyor
MAINT disable circle ci on 0.17.X
FIX increase tolerance of class weight check for OS X
Olivier Hervieu (7):
Refactor roc_curve method.
Merge branch 'master' of git://github.com/scikit-learn/scikit-learn
fixes typo in roc_curve method
[refs #350] - variable renaming regarding reviewer comments
Removes useless (and time consuming) statement.
Improves signal sorting method (using numpy primitives).
FIX inconsistent coef_.shape in LinearRegression
Omer Katz (1):
Nest the for loops because they don't need to run if the condition is not true.
Oscar Carlsson (2):
Fix silhouette score n_labels
Added number of labels for debug and regexp in test
Paolo Losi (38):
liblinear bias/intercept handling
l1 logreg (liblinear): minimum C calculation
l1 logreg (liblinear): minimum C (sparse version)
review of min_C doc strings
numpy/scipy idioms as suggested by agramfort
pep8 compliance
min_C: reworked _y calculation
min_C: check for ill-posed problem _y * X == 0
min_C: let's avoid scipy.sparse top level import
Merge branch 'master' of https://github.com/scikit-learn/scikit-learn into l1_logreg_minC
min_C: fixes to the doc strings
s / shape = / .reshape() /
removed float64 and int32 conversion
docstrings updated
fix for "removed float64 and int32 conversion"
Merge branch 'master' of https://github.com/scikit-learn/scikit-learn into l1_logreg_minC
got rid of np.where
reimplemented l1_min_C as a function
removed old version of min_C
cleanup tests
some more cleanups
bound on C can be calculated also with one class
cleaned up tests
fixes to docstring (as for Fabian comments)
l1_min_c import in svm/__init__.py
Merge branch 'master' of https://github.com/scikit-learn/scikit-learn into l1_logreg_minC
Merge branch 'master' into l1_logreg_minC
DOC: added reference to l1_min_c
the l1 logreg example now works with l1_min_c
coverage 100% + pep8 small fix
Revert "Remove references to y in preprocessing objects."
Merge remote branch 'upstream/master' into revert_preprocessing
TEST: test for scaler in Pipeline
FIX: for SGD log loss
FIX: partial revert of the SGD log loss fix
DOC: Better doc string for l1_min_C
BENCHMARK covertype: select classifier via cmd line opt
Merge pull request #736 from paolo-losi/bench_covtype
Paolo Puggioni (1):
add crossvalidation
Paul Butler (1):
FIX Pipeline should raise ValueError for duplicate name
Pauli Virtanen (1):
TST: fix undefined behavior in test
Pavel (1):
Fixed typos.
Paweł Mandera (1):
Fix citation in TfidfTransformer
Peter (1):
cosmit
Peter Fischer (1):
Typos in comments corrected
Peter Prettenhofer (860):
initial checkin of sgd package.
set rho on 1 or 0 if L2 or L1 penalty.
l1 penalty implemented.
added class encoding.
does not belong to the repo.
Merge branch 'master' of git@github.com:pprett/scikit-learn
Code review from Alexandre:
Merge branch 'master' of github.com:pprett/scikit-learn
removed unnecessary print statements.
100% code coverage.
added doctests to SGD and sgd.LinearModel
initial checkin of sgd package.
set rho on 1 or 0 if L2 or L1 penalty.
l1 penalty implemented.
added class encoding.
Code review from Alexandre:
100% code coverage.
added doctests to SGD and sgd.LinearModel
Merge commit 'origin/master'
initial *draft* of the sgd module documentation added.
added Readme so that sphinx stops complaining.
additional documentation for sgd (plot of various convex loss functions).
math formulation cont'
penalty contour plot added.
more SGD documentation added: example, math formulation, implementation details.
EfficientBackprop reference added.
Documentation for sgd polished.
fixed doctests after SGD class index refactoring.
Removed tabs.
implemented OVA for multi-class SGD.
implemented OVA for multi-class SGD.
Merge branch 'master' into ova
SGD supports multi-class classification using one-vs.-all.
SGD multi-class documentation added.
SGD classifier supports multi-class with OVA.
documentation for multi-class sgd updated.
Changed docstrings for coef_ and intercept_ in sgd package. Wrap intercept_ in an array in the case of binary classification.
Merge branch 'master' of git@github.com:scikit-learn/scikit-learn
Liblinear docstring modified: deleted irrelevant attributes support_ and changed shape of intercept_ and coef_ accordingly.
Added dense implementation of SGD.
Merge branch 'dense' of github.com:pprett/scikit-learn
Commit broken cython header import.
moved sgd_fast_sparse from sgd/sparse/src to sgd/src.
Moved sgd extension modules from sgd/src to sgd.
performance improvements in cython files; cython files rebuild.
added covertype example for dense sgd.
bugfix in plot_loss_functions (import loss functions).
covertype example now downloads dataset automatically.
Updated sgd documentation with multi-class documentation.
docstrings: n_jobs defaults to 1.
cosmit: color of data points matches color of decision regions and OVA hyperplanes.
warm start optimization changed from coef_ to init_coef_ and intercept_ to init_intercept_.
Multi-class documentation for module sgd added.
Include models with L1 and Elastic-Net penalty.
changed init_coef to coef_init (intercept likewise).
Merge branch 'warmstart'
Added new example on modeling the geographic distribution of species.
Merge branch 'speciesmodeling' of git@github.com:pprett/scikit-learn
added species distribution example as plot example.
if possible, species distribution example now uses basemap by default.
deleted old species_distribution_modeling example.
cosmit: pep8 and author
Reduced memory consumption in covertype example due to memory leak in np.loadtxt.
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Added note on the importance of shuffling. Minor changes in text.
Runtime improvement of species distribution example (fancy indexing).
set basemap as default.
Class weights for SGD similar to svm package. Same heuristic as Liblinear for multi-class (OVA): use only weight for the positive class.
removed parameters `p` and `C` from OneClassSVM (dense and sparse).
Merge branch 'master' of git@github.com:scikit-learn/scikit-learn
added tksvm from git://gist.github.com/673953.git.
use np.fromstring to load data from large csv text files.
changed predict_margin to decision_function
Merge branch 'svmgui'
added GUI example for SVM.
added tksvm from git://gist.github.com/673953.git.
added GUI example for SVM.
Merge branch 'master' of git@github.com:scikit-learn/scikit-learn
Merge branch 'importanceweighting' of git@github.com:pprett/scikit-learn
RegressorSGD added.
Merge branch 'master' into importanceweighting
changed "squarederror" to "squaredloss".
automatic refitting on radiobutton change and add example.
Merge branch 'master' of git@github.com:scikit-learn/scikit-learn
changed loss function names in SGD (squaredloss -> squared_loss; also for modified_huber).
added Olivier's ElasticNet convergence test to SGD.
move sgd into linear_model and rename sgd to stochastic_gradient.
finalized sgd module renaming.
moved sgd examples to examples/linear_model and added sgd prefix.
Merge branch 'master' of git@github.com:scikit-learn/scikit-learn
Merge branch 'master' into sgd-rename
COSMIT: smaller data points
updated SGD documentation (referenced in linear_model.rst and classes.rst).
fixed imports in non-auto examples.
BUGFIX in sparse.SGDRegressor
Merge branch 'master' of git@github.com:scikit-learn/scikit-learn
Merge branch 'sgd-rename'
refactored SGD module (removed code duplication, better variable naming).
Sample weights for SGDClassifier.
Additional tests for sample weights.
Merge branch 'master' into sgdsampleweight
pep8 + oliviers remarks
SGD: documentation for sample weights and class weights.
added doctests for sparse and dense SVR, NuSVR, NuSVC, and sparse SVC.
added doctests for sparse and dense SVR, NuSVR, NuSVC, and sparse SVC.
Cosmit in fast_sgd.pyx
fixed failed doctest. SVR parameter `p` renamed to `epsilon`.
SGD module supports two additional learning rates: constant and inverse scaling.
Added SGD regression benchmark
Fixed doc-tests and added doc strings for SGD learning rates.
added notes on learning rate schedules to sgd.rst.
added learning rate arguments to docstrings.
Run cython on sgd_fast and sgd_fast_sparse.
pep8 compliance
cosmit: removed trailing whitespace
ROC fixes for trivial classifiers (always predict one class) and input checks (raise ValueError in case of multi-class).
added doctests for roc and refined documentation.
cosmit: pep8
cosmit: beautified plotting.
docstrings: added note to roc_curve, changed y_scores to y_score.
docstrings: changed signature of metrics.auc from fpr and tpr to x and y.
Merge branch 'rocfix'
changed semantics of LossFunction.dloss.
cosmit: pep8 + doc
cosmit: changed docstring of svm_gui.py.
cosmit: removed requirements in svm_gui doc.
Merge remote branch 'upstream/master'
bugfix: bad Scaler example.
fixed LARS doctest.
Initial checkin of sparse.MiniBatchKmeans clustering + document clustering example on 20 news.
enh: compute cache only on samples in current batch.
Added different compute_cache functions: dot and euclidean distance.
added SpectralClustering to document_clustering example.
fix: random_state was set to None.
use provided x_squared_norms instead of recompute (if none euclidean_distances will recompute).
reuse squared sample norms if possible (_calculate_labels_inertia).
Use euclidean distance.
Merge remote branch 'upstream/master' into sparse-mbkm
dense and sparse seed differences: change order of shuffling X and init centroids to ensure consistent results.
changed mini batch representation in dense MiniBatchKMeans - saves mem but increases runtime.
deleted sparse clustering package.
Merged dense and sparse MiniBatchKMeans implementations.
Document clustering example updated.
cosmit: pep8
fast function to compute l2 norm of rows in CSR matrix.
set max_terms to 10k. Added spectral clustering.
more tests for (mini-batch) k-means (99% coverage).
Merge branch 'master' into sparse-mbkm
changed batch representation from indices to slices.
remove assert_warns from test case (not supported by build bots numpy version).
cosmit: docstring of MiniBatchKMeans
remove n_init arg from MiniBatchKMeans signature
fix: doctest formatting.
fix: remove n_init from mbkm tests.
fix: call set_params in mbkm fit.
Merge remote branch 'upstream/master'
cosmit: docstring + raise ValueError if kmeans input is sparse.
added document clustering example to KMeans clustering section.
Merge pull request #305 from vincentschut/mini-batch-kmeans-batch-labeling
fix: if n_samples < chunksize n_batches was 0 and no iterations are performed.
cosmit: rm debug output
add smoke test for MiniBatchKMeans
Merge remote branch 'upstream/master'
added NavigationToolbar to SVM gui
Merge branch 'enh/tree' of https://github.com/bdholt1/scikit-learn into bdholt1-enh/tree
Merge https://github.com/bdholt1/scikit-learn into bdholt1-enh/tree
Merge branch 'enh/tree' of https://github.com/bdholt1/scikit-learn into bdholt1-enh/tree
introduce reset method for Criterion and implemented linear version of MSE.
fix: weight left and right variance by num samples in each branch
added CART to covertype benchmark -> look at that error rate!
Merge remote branch 'bdholt1/enh/tree' into bdholt1-enh/tree
visitor pattern for export graphviz
cosmit: pep8 + docs
Merge remote branch 'bdholt1/enh/tree' into bdholt1-enh/tree
Merge remote branch 'bdholt1/enh/tree' into bdholt1-enh/tree
use hybrid sample_mask fancy indexing approach.
cosmit: docs + rm comments
added `min_density` parameter to CART
raise ValueError for min_split and max_depth on __init__ rather than fit.
we grow our trees deep
cosmit + n_samples fix
MSE bugfix (MSE.eval used to weight variances by n_left and n_right).
take DTYPE from _tree extension module
fix: inc n_left, n_right before variance computation; hopefully the last bugfix for MSE...
fix doctest + recompile cython code (accident)
make Node an extension type + change class label indexing.
recompile _tree.pyx
make _tree import relative
make node pickleable & tidy up some rebase mistakes
remove obsolete tests
check if y.shape[0] == X.shape[0]; this is especially troublesome for svm.sparse because most people are not aware of the sparse matrix - KFold troubles..
unified predict for sparse and dense SGD.
cosmit
fix: use None as default value for class_weight and sample_weight for sparse OneClassSVM; ample_weight -> sample_weight
cosmit: pep8
added y.shape[0] == X.shape[0] check to DiscreteNB
added X.shape[0] == y.shape[0] check to ElasticNet
Merge remote branch 'upstream/master'
documented changes in whats_new
Merge remote branch 'upstream/master'
Merge branch 'fix-split-sample-mask' of https://github.com/TimSC/scikit-learn into TimSC-fix-split-sample-mask
compute threshold as t = low + (high - low) / 2.0
initial checkin of gradient boosting
GBRT benchmark from ELSII Example 10.2
added GBRT regressor + classifier classes; added shrinkage
use super in DecisionTree subclasses
first work on various loss functions for gradient boosting.
added store_sample_mask flag to build_tree
implemented lad and binomial deviance - still a bug in binomial deviance -> mapping to {-1,1} or {0,1} ?
updated benchmark script for gbrt.
some debug stmts
new benchmarks for gbrt classification
fix: MSE criterion was wrong (don't weight variance!)
more benchmarks
binomial deviance now works!!!!!
add gradient boosting to covtype benchmark
add documentation to GB
timeit stmts in boosting procedure.
add previously rm c code
updated tree
hopefully the last bugfix in MSE
new params in gbrt benchmark and comment out debug output
make Node an extension type + change class label indexing.
predict_proba now returns an array w/ as many cols as classes.
cosmit: tidied up RegressionCriterion
added VariableImportance visitor and variable_importance property
minor changes to benchmark scripts
use `np.take` if possible, added monitor object to `fit` method for algorithm introspection.
cosmit
choose left branch if smaller or equal to threshold; add epsilon to find_larger_than.
compiled changes for last commit
cosmit
some tweaks and debug msg in tree to spot numerical difficulties.
added TimSC tree fix
changed from node.error to node.initial_error in graphviz exporter
recompiled cython code after rebase
fix: _tree.Node
comment out HuberLoss and comment in benchmarks
changed from y in {-1,1} to {0,1}
cosmit: beautified RegressionCriterion (sum and sq_sum instead of mean).
rename node.sample_mask to node.terminal_region
fix: Node.__reduce__
fix init predictor for binomial loss
performance enh: update predictions during update_terminal_regions
fix: samplemask
added timing info
cosmit: get rid of gcc warning (q_data_ptr was not initialized)
fix: overflow of `offset` variable if X.shape[0] * X.shape[1] > 250M
fix: broken doctest with precomputed kernel
changed Decision Tree representation to struct of arrays instead of composite structure.
fix: use tree.predict instead of functor
Graphviz visitor now works on array repr.
cosmit: doc strings
use safe_sparse_dot instead of np.dot
changed int64 to int32 in tree repr;
Merge branch 'tree-array-repr'
changed `for i from 0 <= i < n` to `for i in xrange(n)`.
Merge branch 'tree-array-repr'
changed tree.left and tree.right to tree.children (similar to cluster.hierarchical)
use new tree repr; adapt gradient boosting for new tree repr.
Merge branch 'master' into gradient_boosting
cythonized tree (still broken)
clear tree.py
updated _tree.c
updated GradientBoosting with current master
fix: update variable importance
added gradient boosting regression example
added test deviance to GBRT example
updated TODO in module doc
fix: sgd module clone issue w/ rho parameter
Merge remote branch 'upstream/master'
Merge branch 'master' into gradient_boosting
fix: make GradientBoostingBase clonable.
fix: learning rate schedule doc.
Merge remote branch 'upstream/master'
fix: rm `nu` argument from sparse.SVR (taken from dense SVR).
added unit tests for gradient boosting (coverage ~95%)
better test coverage
store loss object in estimator
don't use dict comprehensions (support python 2.5 and 2.6).
fix: tree doctests + ensemble doctests
Merge branch 'master' into gradient_boosting
stub for gradient boosting documentation
restore original bench_tree.py
Merge branch 'master' into gradient_boosting
min_density now works with store_terminal_regions (however, this only matters if you learn deep trees max_depth >> 5 which rarely happens).
cosmit
added input type and shape test
Merge remote branch 'upstream/master' into gradient_boosting
n_samples > min_split instead of >=
cosmits (cleanup after profiling)
repeat decorator now with arguments
fix: xmin -> X.min()
eliminate `compute_importances` fit parameter - make `feature_importances_` a property that will be computed on demand.
initial_error -> init_error
max_features bug in _tree.pyx (check if < 0 and assume all features!)
Merge branch 'tree-feature-importance' into old-gradient-boosting
merge with master finally resolved!
enh: performance enhancement by removing redundant computation of values - we use the state of `criterion` instead.
started work on gradient boosting docs
remove obsolete `sparse_coef_` doc string
remove reference to obsolete `sparse_coef_` parameter.
set coef_ to fortran layout after fit - this will enhance the test time performance for predicting single data points.
added to whats new
cosmit: more detailed doc string for why fortran style arrays
Merge branch 'sgd-fortran-layout'
Merge branch 'master' into old-gradient-boosting
removed feature_importances_ property in tree module
work in progress on GBRT docs
added script to bench sklearn gbrt against R's gbm package.
cosmit: pep8 + comments
fix: undo compute_importances property merge in forest module and examples
wip: narrative doc
fix: table layout
restore original
restored original version
restored original version
restored original version
restored original version
Merge branch 'master' into gradient_boosting
Merge branch 'master' into gradient_boosting
test_oob_score_regression oob_score below 0.8 if n_estimators < 50
changed ``n_iter`` to ``n_estimators`` and attribute ``trees`` to ``estimators``.
added artificial dataset generator from Hastie et al. 2009, Example 10.2
wip: narrative doc for gradient boosting.
fix: wrong assertion
renamed estimators to estimators_
wip: narrative documentation for gradient boosting.
fix: import numpy in doctest
Merge remote branch 'upstream/master' into gradient_boosting
Merge remote branch 'upstream/master' into gradient_boosting
use mean_squared_error
added new mean_squared_error to metric imports
Merge remote branch 'upstream/master'
Merge branch 'master' into gradient_boosting
polished narrative documentation. fixed doctest.
cosmit: fix doc format
cosmit: fix doc format
Merge branch 'master' into gradient_boosting
factored out weight vector class; dense SGD now uses ``WeightVector`` instead of explicit ndarray and wscale.
enh: performance of WeightVector now comparable to explicit weight vector. some cosmits in dense sgd extension module.
wip: sparse sgd now uses WeightVector - there are some broken tests though.
ENH changed naive bayes' self._classes attr to self.classes_
wip: still hunting sparse sgd bug
fix: forgot to scale by wscale at the end of dot_sparse. All tests are green again!
added new sgd dataset abstraction to unify sparse and dense implementations.
Merge branch 'master' into sgd-refactoring
major refactoring of sgd module::
use Py_ssize_t where appropriate; cosmit
Merge remote branch 'upstream/master' into sgd-refactoring
cosmit: better docstrings for SGD
Merge remote branch 'upstream/master' into sgd-refactoring
WeightVector now keeps track of its squared norm.
move WeightVector and Dataset abstraction to new module
moved WeightVector and dataset abstraction to new module
updated Dataset imports
no need for sgd_fast header anymore.
added largescale ext module to setup.py
fix: declare extension type attributes
comment in forest classes for covertype benchmark
Merge branch 'master' into gradient_boosting
renamed and updated covertype benchmark.
uncomment RandomForest
cosmit
expose 'ls' loss function for classification
cosmit: pep8
Merge branch 'master' into sgd-weight-vector
renamed largescale -> large_scale
Merge branch 'master' into gradient_boosting
Merge branch 'master' into sgd-weight-vector
moved WeightVector and SequentialDataset into separate modules.
re-cythonized
fix: min_samples_split
Merge branch 'master' into sgd-weight-vector
don't need self here.
factored out norm updates and moved them to a dedicated subclass
cythonized
Merge branch 'master' into gradient_boosting
Merge branch 'gradient_boosting' of https://github.com/scottblanc/scikit-learn into scottblanc-gradient_boosting
Merge branch 'gradient_boosting' into scottblanc-gradient_boosting
cosmit: pep8
cosmit
added serialization test case
use `deviance` instead of `medviance` and `bdeviance`
wip: refactor ``fit_stage``; fix feature importances regression; tests still not green (performance regression on Example 12.7).
fix: make binary classification a special case.
refactoring for multi-class
test case for multi-class
comment out - yahoo learning to rank dataset
some profiling
impl. deviance for MultinomialDeviance.
fast tree prediction methods.
faster ``_predict`` by using low-level tree predict functions.
cosmit
forgot to remove debug function
changed self.classes to self.classes_
fix: forgot to rename classes
updated documentation: plots for gradient_boosting, new sample generator
new predict utils for early stopping; updated examples
Merge remote branch 'upstream/master' into gradient_boosting
updated benchmark script
delete benchmark scripts - include them in dedicated branch or ml-benchmarks
Merge remote branch 'upstream/master' into gradient_boosting
removed ``store_terminal_region`` from ``build_tree``.
mention multi-class
use ``apply_tree`` to compute terminal region. This is faster and reduces code complexity.
added __all__
enhanced documentation
typo (differentiable)
boston -> Boston
Merge remote branch 'upstream/master' into sgd-weight-vector
un-done NormedWeightVector factorization; performance decrease on RCV1 is negligible.
cythonized sgd files
Merge branch 'master' into gradient_boosting
Merge branch 'pprett/gradient_boosting' of https://github.com/glouppe/scikit-learn into glouppe-pprett/gradient_boosting
cythonized
added Gilles to authors
whats new? Gradient Boosting!
Merge remote branch 'upstream/master' into gradient_boosting
added util func to create random sample_masks
use random_sample_mask (issue pointed out by @glouppe);
update examples
update tests
remove np.seterr
cosmit: comments + rm unnecessary variables
cosmit: add comment to replace ``random_sample_mask`` if numpy requirement allows to do so
cosmit: fix ClassPriorPredictor docstring; rm comment
typos
typo
mv *Predictor to *Estimator
mv classification init estimators; use np.bincount for PriorProbabilityEstimator.
is_multi_class now is a class attribute.
update docs
don't need to store n_classes.
cosmit: no need for float literals
Merge branch 'master' into gradient_boosting
point out scalability problem with large numbers of classes;
cosmit; mention scalability issues w.r.t. large number of classes
Merge branch 'master' of https://github.com/udi/scikit-learn into udi-master
added prior test
more test cases for naive bayes
GaussianNB: use epsilon to overcome zero sigma problem.
rm print stmt
added gbrt extension module (faster prediction methods)
rm custom regression tree prediction method
faster prediction methods
wip
add prediction method for specific stage
add staged predict
use staged predict in gbrt examples
fast tree prediction based on mystic cython kung-fu
cosmit
staged_predict for regression
test for staged predict and cosmit
more test cases
more test cases (input check at prediction time, degenerate inputs)
use appropriate data types (Py_ssize_t)
better input checks at prediction time
rm old tree prediction methods;
cosmit
Merge branch 'gradient-boosting-enh2'
add test for multiple fits w/ different input shapes
fix issue 762: SGDRegressor does not clear coef_ from previous fit
asarray not needed because of check_arrays stmt above
rm unused vars
Merge branch 'fix-issue-762'
typo: Viola-Jones
Gradient Boosting also provided OOB estimates
fix: gradient boosting regressor does not check if X is c-continuous
Merge remote branch 'upstream/master'
started work on Huber loss function for robust regression
ensure that std is not zero
add test case for scale div through zero
Merge branch 'master' into gbrt-huber
add huber loss to test
implemented huber loss for robust regression
fix errors in huber loss
add alpha parameter for huber robust regression loss
fix: ensure X is c-continuous
fix: make sure X is c-continuous
Merge branch 'master' into gbrt-huber
added feature subsampling to GBRT (via max_features)
fix: forgot comma
added test for max_features
fix: alpha needs to be scaled by 100
wip: added quantile regression loss; this allows for prediction intervals; adapted the GP regression example to show-case prediction intervals
added title to example
performance improvement for random split (ctyped two variables).
import random split
test for quantile loss function
Use BaseEstimator for constant predictors
cosmit
huber and quantile loss for gbrt
better docs for quantile reg
Merge branch 'master' into gbrt-huber
Merge remote branch 'upstream/master' into gbrt-huber
ctyped variables in ``find_random_split`` and use for loop over index range instead of array elements
Merge branch 'master' into gbrt-huber
fix: np.arange dtype issue; fix dtype to be np.int32
use np.int32_t instead of Py_ssize_t
Merge branch 'master' into gbrt-huber
Merge remote branch 'upstream/master' into gbrt-huber
use dtype float32
proper pylab import
Merge branch 'master' into gbrt-huber
Merge remote branch 'upstream/master' into gbrt-huber
added test case for symbol labels
y must be one dimensional
more tests
removed quantile regression example
added max_features to gbrt regularization example
fix: section label for gbrt was wrong
add quantile example again
added new features to whatsnew
Merge branch 'gbrt-huber'
change dtype of y to float64 (aka DOUBLE_t)
cosmit: better docstrings
forest uses DOUBLE for y
Merge branch 'master' into tree-y-float64
changed shape of predict_proba
adopted tests because of changed shape of predict_proba
adopted tests because of changed shape of predict_proba
cosmit in sgd docs
added change to ``whats_new``
add quantile regression example to gbm doc
Merge branch 'master' into sgd-predict-proba
Merge branch 'sgd-predict-proba'
added failing test for 2d y
rm redundant input check (we check in _partial_fit)
ravel y; use atleast2d_or_csr for input validation
_tocsr not needed because of atleast2d_or_csr
inline comment
cosmit: constants for penalty types and learning rate types; inline comments;
Merge branch 'master' into sgd-yshape-fix
fix typo
make smoke tests explicit; check ValueError on 2d inputs
work on BaseGradientBoostingCV
refactored prediction and decision_function (rm duplicate code)
ENH: use gini for feature importance
GradientBoosting classes with built in cross-validation; implemented via Decorator pattern
wip: aggregate fold via groupby
wip: fixing some set attr errors but still buggy if params not lists
remove *CV classes - only pick decision_function and staged predict refactoring
rm CV class tests
rm CV class legacy
remove CV class legacy
add API changes and feature_importance fix to whatsnew
added failing test for clone
rm instance variables learning_rate_type, loss_function, and penalty_type; create them before plain_fit
move get_loss_function to _partial_fit
add test for proper loss instantiation
n_iter must not be 0
refactored input validation; special loss function factory for huber and epsilon insensitive loss
use DEFAULT_EPSILON consistently
rename get_loss_function to _get_loss_function
Merge remote-tracking branch 'upstream/master' into sgd-clone-fix
added test to expose the predict_proba w/ sparse matrix regression
fix the predict_proba w/ sparse matrix regression by using shape instead of len
cosmit
followed @larsmans tip to get rid of _decision_function
fix docstring of predict_proba
add predict_log_proba and test; better docstrings
wip on fx interactions for GBRT
Merge branch 'master' into gbrt-interactions
implemented partial dependency plot
fix: grid and model
cleaned tree traversal and sorted out weighting
cythonized and cosmit
automatically create grid from training data
add cartesian product
partial dependency plot example from ESLII 10.14.1
Merge branch 'master' into pr/975
docstrings for init and loss_
cosmit
added Emanuele to authors
Merge remote-tracking branch 'upstream/master' into pr/975
Merge branch 'master' into gbrt-interactions
Merge branch 'master' into gbrt-interactions
Merge branch 'master' into gbrt-interactions
add learn rate to partial dependency function
common ylim; comment out 3d plot
make fit_stage private
return axes instead of grid
3d plot of 2-way interaction plot
Merge branch 'master' into gbrt-interactions
multi-class is supported
cosmit
doc: use n_iter instead of epochs; remove backslash
Merge branch 'master' into gbrt-interactions
california housing dataset
cosmit
use California housing dataset loader
Merge branch 'master' into gbrt-interactions
remove legacy code
Merge remote-tracking branch 'upstream/master' into gbrt-interactions
renamed dependency -> dependence; docstring and cosmit
typo
fix: feature_importances_
rename dependency -> dependence
rename dependency -> dependence
add partial dependence plot example
document sample_mask and X_argsorted in BaseDecisionTree.fit and validate inputs using np.asarray (added two tests as well)
Merge branch 'master' into gbrt-interactions
tidy up deprecated warnings for learn_rate
Merge branch 'master' into gbrt-interactions
raise error if both grid and X are specified
initialize estimators_ with empty array not None
more input validation for partial dependence and doctest
tests for partial_dependence
rename learn_rate -> learning_rate
input validation for grid
test cases for grid
pep8
added test for cartesian
add partial dependence to whats new
documentation for partial dependence plots
add module imports
typo
cosmit
call pl.show
renamed datasets.cal_housing to datasets.california_housing
add plot titles
cosmit
Merge branch 'master' into gbrt-interactions
cosmit: docstrings
better narrative docs for partial dependence
cosmit: footnote header
empty instead of zeros
Merge branch 'master' into gbrt-interactions
more explicit typing (int32, float64)
Merge branch 'master' into gbrt-interactions
Merge branch 'master' into gbrt-interactions
add plotting convenience function
uses plotting convenience function
moved partial dependence into its own module.
doctest fix + cosmit
fix imports
remove partial dependence (moved to own module)
updated example
fix imports (partial dependence)
fix: california_housing not cal_housing
cosmit
switch axis for 2-way plot; better to compare with above plot
added partial dependence and fetch_california_housing to classes
better documentation
fix links
add partial dependence module
add test for staged_predict_proba
Merge branch 'master' into pr/1409
Merge branch 'master' into gbrt-interactions
better formatting of xticks (prevent overlap)
show how to use ``partial_dependence`` to generate custom plots.
doctest skip for plot function
fix doctests skip
renamed: ncols -> n_cols;
test decorator to skip tests if matplotlib cannot be imported
smoke test for plot_partial_dependence
fix: doc rename partial_dependence_plots -> plot_partial_dependence
Merge remote-tracking branch 'upstream/master' into gbrt-interactions
better input checking (e.g. for str features)
better handling of multi-class case (w/ symbol labels)
code snippets for narrative doc and restructuring
fix: random_state got initialized in fit_stage; caused same feature subsample in each tree
add test for gbrt random_state regression
Merge branch 'master' into gbrt-random-state-fix
Merge branch 'master' into gbrt-interactions
doctest skip: matplotlib not available on travis
fix: doctest in ensemble.rst
Merge branch 'master' into gbrt-interactions
rephrased the one-way PDP description
Merge branch 'master' into gbrt-interactions
topics -> topic
Merge remote-tracking branch 'upstream/master'
use Agg backend with warn=False for matplotlib enabled tests
check in ``if_matplotlib`` if $DISPLAY set
use subplots_adjust instead of tight_layout
use 100 instead of 800 n_estimators; looks the same but faster; ESLII uses 800
ZipFile context manager is only available in Python >= 2.7
cosmit: remove fourth quote
set min_density when growing deep trees during gradient boosting
sampling w/ replacement via sample_weights
rename learn_rate -> learning_rate
raise ValueError if len(y_true) is less than or equal to 1
fix: docstring for power_t in SGDClassifier was not correct (0.25 instead of 0.5)
cosmit: rephrased doc
zero_one_loss now does normalize on default.
fix: map labels to {0, 1}
fix: deviance computation in BinomialDeviance was wrong (ignored cases where y == 0) - thanks to ChrisBeaumont for reporting this issue
raise ValueError if division through zero in LogOddsEstimator
add loss function for gradient boosting binomial deviance
pep8 and assert_equal instead of assert
correct docstring
Merge branch 'master' into gbrt-deviance-fix
use unique from sklearn backports (return_inverse)
Merge branch 'master' into gbrt-deviance-fix
Merge branch 'master' into gbrt-deviance-fix
decision_function forces dense output (in the case of sparse coef_)
Merge branch 'master' into pr/1798
get rid of ``rho`` in sgd documentation - has been replaced by ``l1_ratio``
Merge pull request #1893 from dougalsutherland/sgd-docs
corrected doctests after moving L2 penalty application in SGD
Merge remote-tracking branch 'upstream/master' into pr/2016
added SGD L2 fix to whatsnew
fix: add missing str formatting operator
enhanced (hopefully) DBScan documentation; killed some whitespace along the way...
Merge remote-tracking branch 'upstream/master' into dbscan-doc-enh
fix: needs_threshold not plural in repr
removed min_density example - dropped param
gbrt now works with new DecisionTree implementation
import classes - now they work!
fix: proper dtype for SIZE_t
add GBRT to covertype benchmark
added pxd to Manifest (to be included in source tarball)
Merge remote-tracking branch 'upstream/master'
add OOB improvement and set oob_score deprecated
example for oob estimates in GBRT
plot cv error as well
rm print stmt
rn: plt -> pl
fix: oob_improvement_ with trailing _
more docstrings
cosmit: use train_test_split - tuned params for nice plot
narrative documentation for oob improvement.
more tests
cosmit: better links and a note on efficiency using max_features
comments
cosmit: n -> n_samples
cosmit: rs -> random_state
more doc for OOB example
use new style str formatting
rearranged some code
rn: ACC -> Accuracy
rephrased max_features doc
moved to new pyplot import
more narrative documentation for oob in gbrt
regression tests for oob_improvement_
example doc string
Merge branch 'gbrt-oob-improvement'
covertype benchmark: use C-style input as default (most models require it as input)
fix: use asserts from sklearn.utils.testing
fix: python3.3 warning fix
doc: hedge the use of OOB estimates
Refactored verbose output in GBRT - output is much nicer
fix: newest numpy doesn't like all-indexing non-existing dimension (reported by erg #2233)
Merge remote-tracking branch 'upstream/master'
remove negative indices from neighbors cython code
fix: check for impurity ties
added 32bit 64bit equality test case
adapt OOB regression test to change in tree module
GBRT checks if ``loss`` is in self.supported_loss
renamed supported_loss to _SUPPORTED_LOSS (constants)
fix: typo - y_pred instead of y_true
GBRT enhancements:
added EPSILON_FLT and EPSILON_DBL for almost equal impurity and fx value comparisons
fiddled with EPSILON_DBL
added monitor callback w/ early stopping support
fix: VerboseReporter remaining time in the case of partial_fit
added ZeroEstimator
use StackRecord as elements of stack not 5 consecutive entries
Tree: compute partition specific impurity and total impurity in criterion.children_impurity. Pass partition impurity to stack to avoid re-computation (saves some runtime).
set EPSILON_DBL to old val of 1e-7
add tests for complete
fix: use regressor in regression test
Add stack class
PriorityQueue for best-first tree growing
fix: test renamed and check only on nr of leafs.
tree code now supports both c-style and fortran inputs
tree code now supports both c-style and fortran inputs
updated after tree.pxd change
Tree ensemble classes don't enforce c-style inputs.
use assert_equal if possible
check value of max_leaf_nodes
common test for sparsify + fix for SGDRegressor
sparsify test: add multi-class test too
support 'zero' init, more tests for ZeroEstimator
fix: Huber loss function in gradient_boosting fails if negative_gradient is not called before __call__; now computes on-demand.
add huber loss bug fix to whats new
Merge branch 'master' into gbrt-enh-stackrec-greedy
Implement GBM's best-first heuristic tree growing procedure .
remove asserts
Best-first instead of branch heuristic - identical score as GBM on covertype
cleanup: remove BranchBuilder, added more comments
remove check for condition that might not hold
cosmit
rm MSE import
removed Impurity struct, Criterion.child_impurity now returns impurity_left and impurity_right (not total), fix: moved pos >= end check to if branch
moved data structures (Stack + Heap) to _utils
fix: correct impurity_improvement formula (weight by n_left and n_right)
remove split_impurity from node_split signature
update test (slight changes in tree output because of removal of EPSILON insensitive checks)
use max_leaf_nodes in regularization example
narrative documentation on controlling tree size - might be relevant for weighted boosting and CART as well.
rename GBM_MSE to FriedmanMSE
prefer np.as(fortran|continuous)array over np.asarray(order=order)
fix: pass warm_start to BaseSGDRegressor
Merge pull request #2617 from pprett/sgd-regressor-fix
pass impurity to ``impurity_improvement`` - now it is correct so that arbitrary Criteria can be used in best-first search.
Merge pull request #7 from glouppe/gbm
X need not be continuous anymore (since we adhere to col and row strides)
cosmit: doc undocumented functions
cosmit: remove commented out code - add comment
Merge pull request #8 from glouppe/gbm
max_leaf_nodes has precedence over max_depth if the former is not None.
fix: l1_ratio is incorrect; it's (1.0 - rho); added test case to make sure elastic net penalty with l1_ratio near to 1 and 0 matches L1 and L2 penalties.
added change log entry
set l1_ratio to (1.0 - l1_ratio)
revert: reverted l1_ratio change - not SGDRegressor - sorry
Merge branch 'master' into gbrt-enh-stackrec-greedy
Merge pull request #9 from glouppe/gbm
Merge branch 'master' into gbrt-enh-stackrec-greedy
renamed partial_fit -> fit_more for GBRT
moved fit_more to warm_start=True
fix: resize oob improvements, train score when early stopping
warm_start semantics now fit exactly n_estimators rather than self.estimators_ + self.n_estimators
fix: use estimators_.shape[0] rather than n_estimators to make predictions; what if user just changes n_estimators of an already fitted obj.
fix: warm_start demo in narrative documentation
updated whats new
malloc / realloc checks
cosmit: better errors
do not alter n_estimators after early stopping
cosmit: better exception
cosmit: better exceptions
fix: wrong placeholder in format string
better test coverage in tree module
GBRT subsection for warm_start
Merge pull request #2570 from pprett/gbrt-enh-stackrec-greedy
fix: remove print from gbrt test
fix: use six.string_types instead of basestring
uses raises instead of with assert_raises - apparently doesn't work on python3 nose version.
fix: remaining time not correctly computed
Merge pull request #2753 from pprett/gbrt-verbose-remaining-time
add datarobot testimonial
fix: i might be undefined
make IsotonicRegression pickleable
import pickle
Sample weights for gradient boosting
more tests
add exponential loss to narrative documentation
cosmit: doc
bincount supports weights
more elegant PriorProbabilityEstimator
addressed Olivier's feedback
update tests for new exception types
sample weight support for robust regression via weighted percentile algo
fix: consider sample_weights in robust init estimator and negative_gradient
fix: add **kwargs to Multinomial loss' negative_gradient
more tests for weighted percentile
cosmit
add GBRT sample weights to whats new
fix: sample_weight is None in Huber.deviance
cosmit arr -> array
non-uniform weight test case for both reg and clf and all losses
add probability output for exponential loss
we can produce probabilistic outputs using exponential loss
Merge pull request #3669 from pprett/fix-isotonic-pickling
fix issue 4447 : min_weight_leaf not properly passed to PresortBestSplitter
add regression test case for issue 4447
cosmit
add huber to non-uniform sample weight toy test case
Merge pull request #4448 from pprett/fix-gbrt-min-leaf-weight
Peter Welinder (2):
add support for non-ndarray lists
Merge branch 'master' into peter-dev
Phil Roth (9):
Correctly match penalty and loss with arguments
Add a space to an error message
allowing for different cluster stds
adding multiple cluster_std to make_blob test
adding assert statement to check the std
whitespace
Adding a new kmeans example.
Adding kmeans example to the narrative documentation
Adding a reference to the example in the narrative documentation
Philippe Gervais (32):
Style fixes
[DOC] missing parameter description
GraphLassoCV works with alphas given as list.
Simplified GraphLassoCV code.
Put back cov_init parameter in graph_lasso_path_
Speed up some tests
Removed unused import
Added GraphLassoCV changes to whatsnew.rst
[DOC] Corrected errors in clustering documentation
[DOC] fixed a typo in a warning message.
One more typo fixed
Added euclidean_distances_argmin
Chunking on both arrays in euclidean_distances_argmin
Improved tests for euclidean_distances_argmin
Wrote docstring for euclidean_distances_argmin
[BUG] safe_asarray() converts sparse matrices dtype
Used safe_asarray() in check_pairwise_arrays()
Change signature of euclidean_distances_argmin()
Added pairwise_distances_argmin()
Optimized euclidean case in pairwise_distances_argmin
Removed euclidean_distances_argmin
Minor code cleanup
Added gen_batches() to utils
Changed pairwise_distances_argmin API
Code cleanup in pairwise_distances_argmin
[DOC] Fixed pairwise_distances_argmin docstring
[API] Renamed pairwise_distances_argmin
[DOC] updated pairwise_distances_argmin_min doc
Performance enhancement for non-euclidean metrics
Small fixes
CSR matrix support in pairwise_distances_argmin_min
Added argument metric_kwargs
Pietro Berkes (22):
NEW: Function to automatically download any mldata dataset given its name
ERF: load files in "mldata" subdir; some documentation improvement
ERF: Error checking in fetch_mldata
ERF: fetch_mldata allows to use natural mldata.org names for datasets
FIX: trying to reverse-engineer mldata.org conventions
FIX: fetch_mldata fixed to support non-standard data sets in mldata.org
NEW: mldata tests
ERF: Simplify conversion of mldata.org data set name to filename
Merge pull request #1 from ogrisel/pberkes-mldata
FIX: Remove column name when renaming in fetch_mldata
ERF: Improved coverage of mldata, taking into account network availability
DOC: documentation for fetch_mldata
ERF: Test mldata download using mock urllib2
FIX: fix pep8 and pyflakes issues
ERF: refactor object mocking urllib2 for general use (to be used in doctests)
ERF: Refactor utility function to test that list of names are (not) in an object
ERF: Move testing utilities to make them accessible from doctests
FIX: Doctests use mock mldata.org and do not download
DOC: small fix in datasets.rst docs
Merge pull request #2 from larsmans/pberkes-mldata
Merge pull request #3 from ogrisel/pberkes-mldata
FIX: update mldata tests to match recent updates; mock_urllib2 now accepts ordering parameter
Pietro Zambelli (1):
FIX unused pos_label parameter in metrics.precision_recall_curve
Pratap Vardhan (2):
MAINT: pep8ize to an extent cluster module
MAINT: Revert to py3 print version
Preston Parry (1):
DOC Grammar fix in preprocessing
Rafael Carrascosa (1):
DOC: added Machinalis testimonial
Rafael Cunha de Almeida (1):
Only reassign centers if to_reassign.sum() > 1
Raghav R V (82):
Fixes #3644
Added NRT for return_path propagation check
DOC added note to warn users not to __init__ params with trailing _
ENH FIX Raise ValueError when the target does not match with classes arg
TST Added NRT to test if ValueError is raised during target mismatch.
PEP8 Fix E112 and E113 errors
COSMIT Remove trailing : in comment to facilitate pep8 autoindentation
PEP8 Fix E101, E111 errors and W191, W293, W293 and W391 warnings.
COSMIT / PEP8 Limit line length to 79
MAINT Add new function to check if an estimator is fitted
MAINT Make uniform the error raised for not fitted condition
MAINT Remove the _check_fitted test.
FIX the try except block to recognize NotFittedError
TST Add tests to assert if NotFittedError is raised appropriately
FIX support_ parameter should be tested, not the coef_
FIX Refactor _is_fitted method into check_is_fitted.
FIX Make error message more explicit in fit_inverse_transform
DOC/MAINT Update whats new entry for NotFittedError
FIX scale - Raise ValueError when input contains non finite values.
TST Add test to check if warning is raised for integer dtype-ed inputs.
TST Add test to check if ValueError is raised if input contains NaN
FIX Remove unused score_overrides_loss param from check_scoring
FIX various mismatch between docstring and signature params
DOC Add silhouette analysis plot for KMeans
DOC ENH Fill the silhouette with the corresponding cluster color
DOC ENH Remove the vertical x = 0 line and plot avg silhouette score line
FIX Use slinear interpolation for isotonic regression
MAINT Remove temporary fix #3995 in view of the change to slinear.
DOC ENH Simplify the example code; Add plots for n_clusters = 3 and 5
DOC Added what's new entry for silhouette plot example
DOC Correct what's new entry of Jaccard Similarity Score
MAINT threshold -> threshold_ with deprecation warning.
TST Removal or modification of stop_words_ should not affect transform.
ENH decision_function now returns #votes and scaled scores.
PEP8 Clean up minor PEP8 issues.
FIX Return np.zeros(n_alphas) when the targets are uniform
TST Add enet test for the case of uniform targets
FIX Use resolution instead of min from np.finfo(float)
TST Check for equality in alphas_ when uniform targets are used.
FIX Add check_symmetric to __all__
FIX loss, penalty --> penalty, loss
EXAMPLE/FIX l2 --> squared_hinge; L1/L2 --> l1/l2
FIX/MAINT Deprecate and support upper case values for loss/penalty
COSMIT Flake8 fixes for svm/base.py
FIX Make LinearSVC/LinearSVR support the deprecated l1/L1, l2/L2
TST test deprecation warning and support for uppercase
ENH Scale the sum_of_confidences to (-0.5, 0.5)
COSMIT remove trailing whitespaces
MAINT remove the deprecated n_iterations param
MAINT Remove the deprecated n_iterations from StratifiedShuffleSplit
MAINT docstring --> comments to prevent nose from using doc in verbose mode
MAINT use yield for cleaner output in verbose mode
DOC/TST docstring --> comment
FIX Include PyFuncDistance attributes while pickling.
TST Add test to make sure BallTree remains picklable even with callable metric
TST make sure DistMetric is picklable with both predefined as well as custom metric
MAINT merge _check_cv into check_cv as indices argument is removed in 0.17
REF Correct the reference link for additive chi squared sampler.
MAINT PLS Remove deprecated coefs attribute
FIX import column_or_1d from utils.validation directly
MAINT Remove support for the deprecated sequence of sequences
MAINT return 'unknown' for seq. of seqs in type_of_target
MAINT deprecate seq of seq in metrics fully with tests
FIX bailout w/ValueError if multilabel-seq or other unsupported types
MAINT/ENH inline the logic for _is_sequence_of_sequences
FIX make sure empty target vectors are returned as binary
FIX 2D binary array-like must be considered as unknown
TST Explicitly add one empty label
TST/MAINT reintroduce mixed types test sans seq. of seq.
MAINT is_label_indicator_matrix --> is_multilabel
FIX revert generation of seq of seq in make_multilabel_classification
MAINT remove unused import warnings
FIX type_of_target will now raise ValueError on ml-seq; [[1, 2]] is mc-mo;
TST (in metrics) change error message for seq. of seqs
DOC revert the documentation changes
MAINT Use check_is_fitted
FIX/REVERT undo removal of ignore_warnings to avoid a few warnings in tests
MAINT Remove ignore_warnings from test_matthews_corrcoef_nan
DOC Make cv documentation consistent across our codebase
FIX Use float64 to avoid spurious errors
FIX dtypes to conform to the stricter type cast rules of numpy 1.10
FIX Move validation from helper to main function
Rahiel Kasim (1):
Repeated word: 'the the' -> 'the'
Rajat Khanduja (5):
Updated examples to use pyplot for plotting instead of pylab
Fixed pep8 violations. Some 'line too long' errors still remain.
Fixed some PEP8 violations even present in master branch, in examples/
Some more examples updated to use matplotlib.pyplot
Modified example 'plot_stock_market.py' to use matplotlib.pyplot
Ralf Gommers (1):
MAINT: fix broken links to numfocus.org on donations page.
Randall Mason (1):
Spelling in heading newgroups -> newsgroups
Raul Garreta (7):
PY3: used six.u to fix unicode variables in svmlight
PY3: six.moves.cStringIO to fix StringIO import
PY3: fix None comparison (when not in OS X) in test_k_means.py
PY3: used six.moves.xrange to fix xrange
PY3: used six.iteritems to fix dict iteritems in module pipeline.py
added a new section on model persistence
model persistence doc, added improvements from ogrisel comments
Richard T. Guy (4):
Switched dynamic default args in random forest
Added test
Switched default parameter to tuple from lists.
move tuple back into arguments
Rob Speer (6):
Change 'charse_error' to 'charset_error' in load_files.
Revise documentation about handling text and bytes.
Add a documentation section about decoding text.
Move the new "Decoding text files" doc section
FIX Minor stuff in document_classification_20newsgroups output
ENH Add filters on newsgroup text
Rob Zinkov (37):
Fixed typo in documentation
Adding guide on how to contribute to project
Fix indentation
Removed tabs from indentation
COSMIT: noting that PRs don't send mail to mailing list
Moved link for further info to be more prominent
Adding Passive Aggressive learning rates
Added documentation to stochastic_gradient
Added to documentation
Added documentation and removed PA
Added tests
COSMIT: spelling correction
Adding example
Added smoothing to example
COSMIT typo
PEP8 fix
PEP8 COSMIT
PEP8 COSMIT
Enforcing non-negative step-size
Split out PassiveAggressive Classifier into its own object
Adding PassiveAggressiveRegressor estimator
COSMIT
Added documentation for new classifier and changed seed to random_state
Fixed typo
Renamed learning_rate loss in PassiveAggressive
Correct documentation
Corrected doctests
Fix indentation
Fixed docstrings and seed tests
Fresh fixes of grammar errors
Grammar fixes
Adding support indices in svm for sparse matrices
COSMIT PEP8
Adding test to check support_ is equal in dense and sparse matrices
COSMIT PEP8
Recompiled base
Adding sparse support for decision function
Robert (11):
Twenty newsgroups will not create folder if the folder doesn't exist and the files won't be downloaded anyway
Example file based on Affinity Propagation example.
Fixed noted issues with previous version
params in DBSCAN.fit description
DBSCAN now takes either a similarity matrix, OR a feature matrix.
label_num is now only calculated once. This corrects a previous patch, in which I incorrectly half-finished a refactoring, breaking the code badly :(
dbscan_.py file reinstated after accidental deletion
Function to calculate similarity matrix given either a feature matrix or a similarity matrix
Fixed documentation, and the input matrix is now consistently called 'X'.
NOW X is used consistently everywhere
pep8'd and pyflakes'd
Robert Bradshaw (1):
Lift division out of loop in _isotonic_regression.
Robert Layton (233):
DBSCAN clustering algorithm. A density based cluster analysis algorithm that looks for core points in dense neighbourhoods.
DBSCAN density based clustering algorithm (Ester et al. 1996)
Merge pull request #1 from larsmans/dbscan
labels_ doc updated
Added a paragraph in the documentation.
K-means with transform method.
pep8 fix for k_means_.py
Fixed documentation in example
Examples for dbscan in documentation
Much better example with pyplot, thanks to suggestions by GaelVaroquaux.
vq now the default in KMeans.transform
n_samples used instead of n_points in transform()
American spelling
Example now much more likely to return 3 clusters.
calculate_similarity changed to calculate_distance, moved to metrics.pairwise.py
Import of calculate_distance in metrics.__init__.py.
Merge branch 'master' of https://github.com/robertlayton/scikit-learn
Tests updated to work with the new distance based method.
Test using a callable function as the metric
Multiple small changes
pep8'd
kmeans example renamed
Digits example has plot.
Merge branch 'origin/master' into dbscan
Small changes, mostly to wording
Reference to calculate_distances fixed
Returned line I removed for some reason
Deleted line I returned that I really didn't delete.
K-means documentation updated to include information based on this PR
Extra example removed
Small fixes as per ogrisel's comments.
Merge remote-tracking branch 'remotes/origin/master' into kmeans_transform2
Small changes based on mblondel's comments. Nothing overly noticeable
Replace points with samples everywhere
random_state used instead of giving index_order as argument
Description for components_ attribute. Renamed core_samples_ attribute to core_samples_indices_ to remove confusion
Split the transform method into a predict and a transform.
Merge remote-tracking branch 'upstream/master'
Merge branch 'master' into kmeans_transform2
Merge remote-tracking branch 'mblondel/kmeans_transform2' into kmeans_transform2
Merge remote branch 'upstream/master' into pairwise_distance
Initial changes to improve this module. pairwise_distance now uses a dict for functions.
Working through some of the errors in testing
Fixing twenty_newsgroups
Fixed a few import errors
Example images
VQ example. Not working yet - clusters aren't well formed I think.
Fixed loader problems
X -> XA, Y -> XB. pairwise_distance back to metrics
check_set_Y -> check_arrays
Ran tests and fixed a few bugs. Unit tests added.
Less verbose name
Test for tuple input. Tests now run in suite (forgot to have test_ at start of func name!)
XA -> X, XB -> Y
Merge branch 'master' into pairwise_distance
Moved metrics file to sklearn
pairwise_kernel function (untested, for comment)
PEP8 of metrics.py
import to metrics namespace for pairwise_kernels
Merge branch 'master' of https://github.com/scikit-learn/scikit-learn into pairwise_distance
Merge branch 'master' into pairwise_distance
Merge branch 'master' into pairwise_distance
Tests working, mostly pass
Merged PR263 into this PR
Fixed merge conflict
Fixes based on ogrisel's comments
l1_distances -> manhattan distance
pep8'd and pyflakes'd
Remove l1_distances completely, updated gaussian_process
Actually removed l1_distances this time
test_checks merged into test_pairwise. test_checks is empty for now.
Removed test_checks
Fixed doctest and checked tests working - most are;
pairwise callable metrics fixed
Now tests if tuples given as input
check_pairwise_arrays now ensures at least two dimensional arrays are returned.
pep8'd and pyflakes'd
metrics listed in pairwise_distances and pairwise_kernels
kwds was being passed to squareform, instead of pdist. this has been fixed, with a test added
pairwise helper functions to give verbose knowledge of which metrics
Fix commenting in pairwise_distance
check for sparse matrices for scipy metrics, and throw error. test included
Brief description of kernels and distance metrics in doc
Added a list
Little more description
Fixed typos
manhattan_distances now returns [n_samples_X * n_samples_Y, n_features_X] shape array
Doc update for manhattan_distance
Fixed doctest error
Edited sklearn/metrics/pairwise.py via GitHub
Initial Silhouette Coefficient code. no tests yet, and haven't checked it actually works yet as well
Initial test. Not working yet
Included distance helper functions line for 0.9 release
API changes in metrics/pairwise.py
Merge branch 'silhouette' of https://github.com/robertlayton/scikit-learn into silhouette
Test working, pep8'd and pyflakes'd
Sparse matrix testing
Swapped y, D to distance, labels
silhouette_coefficient -> silhouette_score
Restructured metrics/cluster into a folder with supervised and unsupervised modules
Narrative documentation
Merge remote-tracking branch 'upstream/master' into silhouette
"whats_new" updated
Example updated, which required fixing a backwards compatibility bug (adjusted_rand_score not imported in metrics/cluster/__init__.py)
Silhouette added to AP example
Using pairwise_distances in the Silhouette Coefficient. Updates to docs, code, tests and examples
Silhouette calculated for all forms of k-means in example
Faster version by removing inner loop comprehension
Sampling to improve SC speed
sampling added to silhouette_score, examples updated to match
pep8 and pyflakes
Updated doc with new API
Removed unneeded line from doc
Merge pull request #364 from robertlayton/silhouette
Trying to fix NaN errors, but it's not working. Pushing to work on it later.
Mutual information now works (tested!)
AMI now works, and has been tested against the matlab code (test based on this to come!)
Remove phantom double v-measure !?
Added tests. There are two errors, but I'm going to bed. I'll fix them in the morning.
Merge branch 'master' into ami
Merge branch 'ami' of github.com:robertlayton/scikit-learn into ami
- AMI in the cluster examples
Higher level import for ami_score
There is an overflow problem. It can be reproduced with the plot_adjusted_for_chance_measures.py example
Narrative doc, and I think I fixed the overflow issue (more tests to come)
Fixed logs to match the matlab code results.
Test now tests a much larger array
Test actually does what I meant it to do, and works sufficiently
Fixed this example. Tested the others (they worked!)
pep8 and pyflakes
Merge pull request #3 from ogrisel/robertlayton-ami
Optimising the expected mutual information code
Adding old version of EMI, as I'm about to change it
This version doesn't work either. I am uploading for historical sake.
Initial usage of gammaln. Not yet tested
Merge branch 'master' of https://github.com/scikit-learn/scikit-learn into ami
Still overflows, but the closest so far. Using gammaln
It works! Still have some optimisation to do, but it works for larger arrays
Moved start and finish outside of loop
comments, pep8 and pyflakes
Merge branch 'master' of https://github.com/scikit-learn/scikit-learn into ami
ami_score -> adjusted_mutual_info_score
ami_score -> adjusted_mutual_info_score
"What's new?" AMI!
Merge branch 'ami' of https://github.com/robertlayton/scikit-learn into ami
mutual_information_score -> mutual_info_score
and in plot_adjusted example (mutual_info_score)
Merge branch 'master' of https://github.com/scikit-learn/scikit-learn into ami
cosmit
Merge pull request #402 from robertlayton/ami
Fixed values in Adjusted Mutual Information doctests
l1_distances was renamed to manhattan_distances.
Mutual Information docstring incorrectly said it was the adjusted mutual information
Removed single k-means run to its own function to enable optional parallelisation later.
Parallel version of k-means.
pep8 and pyflakes tested
Not doing a full sort for getting the best results
Updating random_state in between iterations of k-means fixes some issues
Doc updates
Fixed author reference (removed link as it wasn't working)
Added my twitter account as homepage.
feature_extraction/text.py: 'ignore' removed as a default, class param
Added a test (that doesn't work yet)
Test now works, testing both the Word and Char analyzers
decode_error -> charset_error
docstring update
cosmit
cosmit undoing (was testing)
pep8
cosmit to docstrings
NearestCentroid classifier, with test suite.
Shrink threshold working, along with a test
Sparse tests, but they are currently failing. Committing for comment
Typo for "neighbours", and converted to en-US
Test for sparse matrices. Tests fails, my guess is that centroids are the same.
Fixed bug in nearest_centroid, and removed boston test.
Narrative documentation
Sparse tests pass when using shrinkage
Turned on final test (it works!)
Broadcasting used to remove a loop
Removed asserts in code
Test use assert_array_equal where appropriate
pyflakes on test
Update to documentation
Moved to the `neighbors` namespace
Example of nearest neighbor, getting an improvement when using a shrink threshold of 0.1
Explain example in docs
Update examples/neighbors/plot_nearest_centroid.py
Update doc/whats_new.rst
Update doc/whats_new.rst
Removed unneeded numpy.array call in test
metric fixed in tests
Merge remote-tracking branch 'origin/nearest_centroids' into nearest_centroids
Merge pull request #5 from larsmans/nearest_centroids
This test repeats issues 960, with the silhouette coefficient returning nan
nan values are converted to zeros
k-means now no longer needed in test.
Distance matrix doesn't matter, and was therefore removed
Test for "amg" mode for spectral clustering added.
docfix: spectral_cluster doesn't return n_centers
pep8
Spectral will raise an error if the mode is set to amg and pyamg is not available
Test that an unknown mode raises the appropriate error
Update to the clustering.rst module file for k-means. Added a plain language description and the objective function.
Updated fixes from larsmans
Merge pull request #1478 from amueller/pep8
Merge pull request #1451 from amueller/chunksize_batchsize_rename
First draft of new Affinity Propagation description in docs.
Who doesn't love equations?
Spelling
Update doc/modules/clustering.rst
DOC improve mini-batch k-means narrative
DOC: Replaced all BSD style licenses with "BSD 3 clause"
Minimal spanning tree backported from scipy 0.13
Added test
Moved mst to a subfolder and added a README file
Added new files (from previous commit)
Merge pull request #2055 from jnothman/cv_refactor
Merge pull request #2076 from pprett/dbscan-doc-enh
Traversal in and tested. Next step is to remove references to old code
Removed reference from spectral_clustering to old csgraph
csgraph updated from hierarchical.py
Removed actual _csgraph file, tests still all pass
Turns out sparsetools wasn't needed either
Missed a spot
Reference to graph components updated in dev docs
Two more spots. I think that's it
Now that the folder has more than just mst in it, rename to sparsetools, which should help with referencing it.
Update to mean shift clustering narrative documentation.
Update to docstring of meanshift
docstring of module
Fix typos found by Alexandre
ENH precomputed is now a valid metric for 'brute'
Robert Marchman (13):
test case for unfitted idf vector
raise ValueError for unfitted idf vector
FIX docstring deletions
ADD test coverage for _check_stop_list
FIX comment typo
ADD test cases to fill out VectorizerMixin coverage
ADD another VectorizerMixin test
ADD test for get_feature_names
ADD test for tfidf fit with incompatible n_features
ADD test for TfidfVectorizer attribute setters
MV Mixin tests to CountVectorizer tests
RM CV import
MV _check_stop_list tests to CV get_stop_words
Robert McGibbon (8):
fix the kwarg name
updated the .c file
remade the cython with 0.18
Fix bug identified in #1817, comment 17340049
Fixes to HMM docstrings
Clarification of the sequence length, per @ogrisel
Add inline comment
Change the order of test_score and train_score in the _fit_and_score docstring to reflect what the code actually does
Rohan Ramanath (1):
fixes scikit-learn/scikit-learn#5329
Rohit Sivaprasad (2):
Edit typo
DOC typo in SVM narrative
Roland Szabo (2):
[DOC] Remove that RBM's are not implemented yet
[DOC] Typo fixes in documentation for Novelty and Outlier Detection
Rolando Espinoza La fuente (1):
DOC typo: Pereptron -> Perceptron.
Roman Sinayev (3):
ENH Rewrote CountVectorizer fit_transform to be ~40% faster
ENH refactor and further speed up CountVectorizer
ENH speed up TfidfTransformer using spdiags
Ron Weiss (63):
added hmm code from http://github.com/ronw/gm
removed logging, dependency on abc, and unnecessary imports
added hmm unit tests
cleanup hmm module: made properties compatible with Python 2.5, etc.
changed hmm.trainer usage: each hmm object must have a _default_trainer property which can be overridden by passing a different trainer into hmm.train()
changed "train" -> "fit". Removed HMMGMM for now.
removed references to gmm.init() in gmm docstrings
fixed random seed in hmm unittests
removed init() method from hmm classes
minor tweaks to make hmm.GaussianHMM look like gmm.GMM
fixed bug in HMM viterbi logprob
added MultinomialHMM, unit tests
fixed *HMM.fit() to include *all* parameters by default
removed ndim argument from gmm.rvs()
added validate_covars back to gmm.py
added ndim property back to gmm to keep it consistent with HMM
Merge branch 'hmm'
added support for HMMs with GMM emissions
Merge remote branch 'upstream/master'
fixed GMM examples
fixed GMM examples
fixed broken doctests in gmm.py
updated gmm.py to comply with scikit-learn API. fixed pep8, pyflakes errors
DOC: fixed typos in developer documentation
Merge branch 'master' of ssh://scikit-learn.git.sourceforge.net/gitroot/scikit-learn/scikit-learn
BUG: fixed failing test in GMM
remove GMM.lpdf method
merge
update hmm module to comply with scikit-learn API
remove hmm.HMM factory to simplify hmm module's interface
merge hmm_trainers into hmm module
finish merge of hmm_trainers with hmm and remove hmm_trainers
remove extraneous tests from test_hmm.py
speed up hmm unit tests, add test for GaussianHMM with priors
fix GMMHMM bugs. speed up tests
Merge branch 'master' of github.com:scikit-learn/scikit-learn
FIX: failing gmm tests
Merge branch 'master' of github.com:scikit-learn/scikit-learn
change GMM initialization to use cluster.KMeans
change GaussianHMM initialization to use cluster.KMeans
merge
fix bug in hmm.GaussianHMM mstep update for 'full' covariance
Reapply "ENH: enhacements in the gmm module."
fix gmm examples
merge
fix bug in GMM._get_covars dimensions
make HMM interface consistent with GMM
clean up interfaces in hmm and gmm
remove n_symbols argument from MultinomialHMM.__init__
Merge branch 'master' of github.com:scikit-learn/scikit-learn
add GMM classification example
clarify GMM classifier labels
add GMM.predict_proba
add default initialization of GMM.weights to constructor
rename GMM.n_dim to GMM.n_features to be consistent with the rest of the scikit
add HMM.predict_proba
rename HMM.n_dim to HMM.n_features to be consistent with the rest of the scikit
fix pep8, pyflakes errors
Merge branch 'master' of github.com:scikit-learn/scikit-learn
scikits.learn.gmm -> scikits.learn.mixture
Merge branch 'master' of github.com:scikit-learn/scikit-learn
BUG: fix GaussianHMM.fit to allow input sequences of different lengths
FIX remove broken test in test_mixture
Ronald Phlypo (12):
bug fixes in graph_lasso and moving non-custom log-likelihood definition to specific definition of cost
corrected sign error in _objective
pep8 compliance
added test cases to test_graph_lasso for alpha=0
deleted the .orig file in this commit
deleted *.py.orig from branch
cosmetic remarks incorporated
Merge branch 'glbugfixes'
cosmetic remarks, pep8 compliance against trailing white spaces, spelling errors
clean up
bug fix and deprecation warning added to cross-validation
Merge branch 'master' of https://github.com/scikit-learn/scikit-learn into bootstrap
Ronan Amicel (3):
Fix broken merge by ogrisel :-P
PEP8 + missing fit methods
Minor edits.
Roy Hyunjin Han (2):
Fixed some typos
Update examples/exercises/plot_iris_exercise.py
Rupesh Kumar Srivastava (1):
FIX max_features in CountVectorizer
Ryan Wang (1):
DOC typos in feature_extraction.text
Saket Choudhary (4):
DOC: Fix is->if
FIX precompute_gram->precompute as in #2224
MAINT: Remove reimport
FIX: Ensure dependencies installed
Salvatore Masecchia (6):
FIX: coordinate descent stopping rule
added missing _set_params call in LinearModelCV
unified and simplified path params creation in LinearModelCV
fixed parameters passing of LinearModelCV.fit, with test
thread safe tests for coordinate descent
pyflakes/pep8 on coordinate descent
Sam Nicholls (1):
FIX Verbose message in RFE to count from 1, not 0
Sam Zhang (1):
BUG: get_param in FeatureUnion disjoint for shallow/deep params
Samuel Charron (2):
DOC add Data Publica testimonial
OPTIM Lower memory usage of spectral_embedding.
Samuel St-Jean (1):
Update dict_learning.py
Satrajit Ghosh (58):
BF: k-fold should accept k==n
BF: k-fold should accept k==n
resolved init
initial import from milk
renamed, additional import
started conversion to scikits
updated information gain and set_entropy functions
modified base classes
updated docstring to reflect use
updated load_iris to return features
enh: updated decision tree classifier and associated example
updated default impurity measure
added new impurity measures
updated random forest classifier to operational status
updated cython script to calculate gini measure
removed classifier.py
resolved conflicts
Merge remote-tracking branch 'noel/decisiontree' into treemerge
fix: trailing-spaces option fixed to be executed
doc: updated docstring for permutation_test_score to reflect nature of p-value given the type of score_func
sty: ran make trailing-spaces
doc: fixed spelling
doc: updated docstring based on feedback
fix: permutation test score averages across folds
added avg_f1_score
tst: added tests
enh: added matthew's correlation coefficient
sty: pep8 + doc
Merge branch 'master' into enh/metrics
fix: added ensemble to setup.
Merge branch 'master' into enh/metrics
enh: added support for weighted metrics closes #83
doc: added description for matthew's corrcoef from wikipedia
sty: pep8 fixes
sty: pep8 on test file
doc: removed strange character
fix: updated tests to reflect that micro shows the same precision and recall
fix: average with elif
doc: improved description of average
api: changed pos_label to None for metrics
Merge remote-tracking branch 'upstream/master' into enh/metrics
Merge remote-tracking branch 'mblondel/metrics' into enh/metrics
Merge pull request #443 from satra/enh/metrics
fix: convert input arrays to float
fix: force copy to True in case underlying default behavior changes.
tst: added test for feature selection. this test would have failed in the previous case. closes #727
doc: added reference to lobpcg and note about small number of nodes
fix: addressing gael's comments
fix: set syntax
fix: increase robustness of label binarizer test
sty: white space
fix: change affinity check
doc: clean up style and grammar
ref: change name to indicate semantics
fix: removed unused keyword precomputed and clean up if clauses
fix: moved random state check to fit
doc: removed merge diff markers
doc: align hyphens
Saurabh Jha (2):
Implements Multiclass Hinge loss
Sample weight support for hinge loss
Scott Dickerson (3):
train_test_split: test_size default is None
Modified docstrings
Modified docstrings and tests
Scott White (2):
add support for multi-class
add todo
Seamus Abshere (1):
ENH reduce size of files produced by dump_svmlight_file
Sebastian Berg (1):
FIX: Do not rely on strides for contiguous arrays
Sebastian Raschka (9):
ensemble classifier 1st commit
votingclassifier
added _check_proba method
property trick to raise AttributeError if voting is "hard"
inline check_proba
silhouette_score docstring update
changed clf to regr in decision tree regression examples
upd randomforest docstring
note about variance reduction
Sebastian Saeger (14):
FIX: AdditiveChi2Sample can be initialized with sample_interval, #3068
Add tests for the sample_interval in #3069
Fix the pep8 violations
TST MiniBatchKMeans with many reassignments
Added a verbose flag to the GMM class.
Fixed typos.
Fixed doctests.
Fixed pep8 warnings.
Added tests for the verbose flag of GMM.
Fixed pep8 and removed unused variable.
Adapted the verbose flag of DPGMM. Boolean flag is deprecated now.
Added tests for the adapted verbose flag of DPGMM.
Removed the deprecation and added a test for a boolean flag value.
Cleaned the code. pep8, pyflakes and added comments.
Sergey Feldman (1):
Adding covariance regularization to QDA
Sergey Karayev (2):
fixing bug in linear_model.SGDClassifier for multi-class warm start
removing accidental space
Sergio Medina (2):
Fixed small typo, even though the message is kind of the same and the one with the typo is waaay funnier.
Corrected a few things on the Mutual Information doc pages.
Sergio Pascual (5):
Remove execute permission
Remove shebang in library code
Move __future__ import after license text
MAINT Update six to version 1.4.1
Add a paragraph about installing a binary release in Fedora
Shaun Jackman (1):
BernoulliNB: Fix the denominator of P(feature)
Shiqiao Du (36):
improved computational speed by calling fast scipy built-in function and replacing double loop
fixed some pep8 warnings
Merge remote branch 'upstream/master'
added a cython module to the hmm
replaced (T, N) -> (n_samples, n_states)
- renamed (n_samples, n_states) -> (n_observations, n_components) in hmm.py
Merge pull request #1 from agramfort/hmmc
dropped "_c" suffix
debugged _hmmc.pyx
fixed problem of _accumulate_sufficient_statictics in hmm.py
- removed unnecessary **kwargs specification in fit and _do_mstep methods
replaced deprecated "rvs" to "sample"
made `sample` also return the sequence of internal hidden states
added doc for hmm
- fixed typo in hmm.rst
made `sample` also return the sequence of internal hidden states
rebased to the master and fixed conflicts
bug fixed
fixed _do_viterbi_pass
fixed doc
fixed typo
replaced function call of decode to predict
removed pure python codes and beam pruning options
Added change history to what's new
updated author and pep8
modified phrases in what's new
- added decoder selection
fixed some typo, doctest and pep8
added comment on decoder algorithm in the rst doc.
Merge pull request #2 from GaelVaroquaux/hmmc
Merge pull request #847 from kwgoodman/master
fixed bug of initialization in hmm.py
added test_fit_with_init to tests/test_hmm.py
pep8, ignored E126-E128
- avoid startprob, transmat, emissionprob containing a zero element by
- check input format of MultinomialHMM.fit
Shivan Sornarajah (1):
Changed SVM to use gamma='auto' by default. Deprecated gamma=0.0.
Simon Frid (2):
Update testimonials.rst
adding lovely logo
SimonPL (2):
Rewrite of the documentation for LDA/QDA
Small edits to LDA/QDA documentation
Skipper Seabold (3):
ENH: Raise explicitly on non-unique vocab.
TST: Regression test for nan parameters cloning
ENH: Avoid equality comparison for nans.
Sonny Hu (2):
fix RidgeCV when cv != None
add sample_weight into LinearRegression
Stefan Otte (1):
DOC remove unnecessary backticks in CONTRIBUTING.
Stefan Walk (1):
python 3 compatibility fix
Stefan van der Walt (2):
DOC Correct definition of multiclass log loss
ENH Speed up and simplify cartesian product
Stefano Lattarini (1):
COSMIT various typofixes
Stephan Hoyer (1):
Update permutation_test_score docstring
Stephen Hoover (3):
DOC Fix doc for NMF `init` default
BUG Use epsilon threshold in `_samme_proba` and `_boost_real`
TST Increase samples for classifier tests
Steve Koch (1):
Update hmm.rst
Steve Tjoa (1):
fixed typo: diriclet -> dirichlet
Steven De Gryze (12):
PY3: fixed basestring in crossvalidation.py
PY3: use b() convenience function for string literals
PY3: ensuring file stream is read as binary
PY3: convert string literal to bytes using six in cython file
replacing numpy array with range for use in random.sample
PY3: changing None to 0 to ensure comparability in py3
PY3 fixing utf8 comments in svm through try/except and six.b
PY3: forcing execution of map by using tosequence
PY3 fix comparison of ndarray and string
ENH implement fit_predict on pipelines
verifies fit_predict is available on last step of pipeline
better docstring one-liner and use of getattr in test
Steven Maude (3):
Update cross_validation.rst
Update feature_selection.rst
Minor typo fixes in grid_search_text_feature_extraction.py
Steven Seguin (4):
add py3 xrange to bicluster example
change to explicitly import function
Change xrange to range and remove patching import
Change urlopen and iteritems functions for Python3 compat
Sturla Molden (1):
Update typedefs.pxd with correct ITYPECODE
Subhodeep Moitra (17):
P3K: 'type' has been renamed 'class' in python3
P3K: Fixed dtype doctests for Python3
P3K: Fixed print related Python3 errors
P3K : Fixed range iterator to be list
P3K: __len__ returned float instead of int. Typecasted.
P3K : Convert int type checking to np.integer
P3K : Typecasted float to int
P3K : Changed / to // to typecast float to int
P3K: Modified RuntimeError message args
P3K : Replaced / by //
P3K : Refactored test cases to use setUp
P3K: print back compatible with python2.6-7 with __future__ import
P3K: Fixed None < Float Python 3 error
P3K: Fixed unicode pickling error by changing to BytesIO
P3K: Fixing prints and dtypes
P3K: Fixed RuntimeError.message
P3K: Fixed print related Python3 errors
Sylvain Zimmer (2):
Fix typo in SVM docstrings
Another doc typo fix
Szabo Roland (3):
ENH Added custom kernels to SpectralClustering
BUG Add lambda_ attribute to ARDRegression after fit
DOC Add labels and some explanation to confusion matrix example
Tadej Janež (17):
DOC: further improvements to the model selection exercise
DOC: further improvements to the model selection exercise
Merge remote-tracking branch 'upstream/master'
DOC: another improvement to the model selection exercise
DOC: Improved the code that shows how to export a decision tree to Graphviz and generate a PDF file.
Skip doctest for the Python code involving pydot.
Skip doctest for the remaining line involving pydot.
Removed an unnecessary if statement in KFold __iter__ method.
Improved the test that checks the balance of sizes of folds returned by KFold.
DOC Corrected the docstring of KFold about the sizes of the folds.
COSMIT Moved the test_roc_curve_one_label test where other ROC curve tests are.
FIX KFold should return the same result when indices=True and when indices=False.
ENH Function auc_score should throw an error when y_true doesn't contain two unique class values.
ENH optimizations in sklearn.cross_validation
FIX Moved copying of labels in LeaveOneLabelOut and LeavePLabelOut to __init__.
TST Added test that checks if LeaveOneLabelOut and LeavePLabelOut work normally if the labels variable is changed before calling __iter__.
DOC Fixed doc test to work with the fixed versions of LeaveOneLabelOut and LeavePLabelOut.
Theodore Vasiloudis (1):
[docs] [trivial] Fix small error in cross-validation docs.
Thomas Delteil (1):
reverted css to current stable, fixes the display bug
Thomas Jarosch (1):
BUG delete/delete[] error in Liblinear
Thomas Unterthiner (41):
Issue #2455: Make RBM work with sparse input when verbose=True.
Change RBM sparse format from CSC to CSR.
Cosmetic changes.
Add issparse import.
Cosmetic changes.
Expanded docstring of the verbose parameter.
Check that verbose output is sound.
Extend utils.sparsefuncs: inplace scale and axis min/max
MAINT explicit float64 in sparsefuncs_fast
ENH Add 'axis' argument to sparsefuncs.mean_variance_axis
ENH improved CSC matrix handling in VarianceThreshold
FIX consensus score on non-square similarity matrices
ENH add testcase for issue 2445
ENH Added RobustScaler
Check if parameters are fitted
ENH: cross-validate over predefined splits
COSMIT pep8 cleanups
Fix documentation errors.
Remove warn_if_not_float
Remove 'copy' parameter from RobustScaler functions
Fix documentation in test case
Export robust_scale and RobustScaler
Improved documentation of robust scaler
Fixed deprecation warning
Changed outdated test
Fixed removal of copy-parameter
Removed superfluous testcase for robust scaler
Remove interquartile_scale parameter
Added robust scaling example
Fix robust_scaling test
Fixed documentation oversight.
robust_scaling example
Improved example
Fix documentation errors
More documentation fixes
Remove redundant code
DOC fix contributor name
ENH add MaxAbsScaler
ENH add minmax_scale
Add missing scalers to classes.rst
DOC Improve wording in minmax_scale documentation
Thouis (Ray) Jones (4):
Wrapped BallTree in Cython.
Renamed for backwards compatibility, fixed C++ Exceptions to propagate to python
balltree - be explicit about return types' width
check input arguments to BallTree, and be more careful in dealloc'ing
Tiago Freitas Pereira (1):
Added the regularization term in the method _fit_transform
Tiago Nunes (6):
Add fit_transform to FeatureUnion
Change / to (…) line continuation
Add test case for FeatureUnion.fit_transform
Fallback to fit followed by transform if fit_transform is unavailable
Add test case for fit_transform fallback
Fix pipeline fails if final estimator doesn't implement fit_transform
Tian Wang (1):
add a new int array to store indices
Tim Head (14):
Mention oob_score in narrative documentation of RandomForest*
Mention oob_score in the narrative docs for Bagging*
Added warm_start to bagging
Fix random_state to make test reproducible
Test for unchanged n_estimator and indentation fix
Test oob_score for warm_start'ed bagging classifiers
Removed overly deep indentation
oob_score will only be calculated if warm_start=False
Keep estimators_samples_ and estimators_features_ across warmstarts
Better assert_raises use, API compliance, better ordering
Ensure dict_learning_online always has at least one batch
Check fit() returns self for all estimators
Example to demonstrate use of tree.apply() method
Explicitly mention RandomTreesEmbedding in the text
Tim Sheerman-Chase (5):
FIX: Corrected NuSVR impl type and set epsilon to None
Added a fix to prevent tree splits on samples that are
Removed exception from _find_best_split to avoid code bloat.
Removed unnecessary variables
Enable graphviz export function to export trees as well as regressors
Timothy Hopper (1):
Go to easy _open_ issues page
Tiziano Zito (1):
FIX broken links to Rubinstein's K-SVD paper.
Tom DLT (1):
FIX temporary fix for sparse ridge with intercept fitting
Tom Dupré la Tour (1):
Merge pull request #5360 from TomDLT/sparseridge
TomDLT (18):
ENH improve check_array
ENH change func_grad_hess into grad_hess
ENH check that unfitted estimators raise ValueError
DOC update whats_new
ENH use astype to avoid unnecessary copy
FIX check parameter in LogisticRegression
ENH improve parameter check in LogisticRegression
FIX use random_state in LogisticRegression
DOC fix rendering of code-block
add fetch_rcv1
ENH implement LYRL2004 train/test split of rcv1
Reorder target_names in rcv1
ENH add sag solver in LogisticRegression and Ridge
ENH remove some warnings in test suite
FIX array_equal for numpy < 1.8.1
ENH refactor NMF and add CD solver
DOC update docstring and warning
FIX decrease tolerance in test_logistic for appveyor failure
Tomas Kazmar (1):
FIX MinMaxScaler behavior on 1D inputs.
Tzu-Ming Kuo (1):
MAINT liblinear class label logic (disabled)
Udi Weinsberg (1):
corrected Gaussian naive-bayes to correctly compute the class priors
Valentin Haenel (1):
DOC add the Random Kitchen Sink synonym to the RBFSampler section
Valentin Stolbunov (1):
Added handling of sample weights in logistic.py
Vighnesh Birodkar (7):
MAINT deprecate 1d input arrays for all estimators
added predict_equal_labels test and changed kmeans single
moved condition out of the loop
Added check for max_iter and test
Added explanatory comment
Fix warnings during tests
removed reshaping in fast_mcd and replaced by check_array
Vijay Ramesh (1):
adding change.org testimonial, logo
Vinayak Mehta (16):
explicitly mentioned include_self
explicitly passed by name
Added Random Kitchen Sink reference
added docstrings for restrict
Raising an error when intercept_scaling is zero
Specified original intercept_scaling value
Added docstrings for kernel='precomputed'
Raising an error when n_clusters <= 0
used special.fdtrc instead of fprob
Moved optimize test to a new file
Modified docs
Removed array2d from developer docs
Deprecated load_lfw_pairs and load_lfw_people
Changed import statement and checked for ImportError
MAINT Deprecate LDA/QDA in favor of expanded names
Fixed LDA typo in doc
Vincent (1):
Added random seed for facial recognition example and updated the docstring
Vincent Dubourg (37):
Hello list,
Correction of a bug with the management of the dimension of the autocorrelation parameters.
Forgot to retire pdb.
Commit of a 'Gaussian Process for Machine Learning' module in the gpml directory. The module implement a class named GaussianProcessModel. I also add doc, examples and tests (involving a coupling with the cross_val module).
Correction of a bug in test_gpml.py (now runs perfect on my machine!). I just don't know how to involve this test within the whole scikit testing procedure (nosetests). Also add a modification of the TOC in doc.
Correction of a bug in the basic regression example.
Delete the old kriging.py module
Modification of the score function. The score function now evaluates the deviation between the predicted targets and the true ones. This is for convenience only because it allows then to use the distributing capacity of the cross_val module. The old score function is renamed with the more explicit name: `reduced_likelihood_function` (see eg the DACE documentation).
Modification of the main __init__.py file of the scikits.learn package in order to load the gpml module and tests.
Renames as suggested by Alexandre. Simplification of the examples. Remove the interactive contour label picking in the probabilistic classification example.
Bugged example after modification. Now correct!
I Ran the PEP8 and PYFLAKES utils and corrected the gaussian_process module related files.
Can't comply with contradictory PEP8 rules on some specific code such as:
I removed the time-consuming test and made a regression example from it.
Replaced np.matrix(A) * np.matrix(B) by np.dot(A,B), so that the code is a lot clearer to read...
Removed plotting command from the examples in the GaussianProcess class docstring.
Simplification of input's shape checking using np.atleast_2d()
Changes in format of the fit() input (np.atleast_2d for X, and np.newaxis cat for y).
Force y to np.array before concatenating np.newaxis in fit().
Modifications following Gaël latest remarks.
Added Welch's MLE optimizer in arg_max_reduced_likelihood_function() plus reference in the docstring.
Correction of a minor typo error in correlation_models docstring
Improvement of the documentation with a piece of code and reference to the regression auto_example. Add a README.txt file at the root of the examples/gaussian_process directory.
From: agramfort: don't use capital letters for a vector. Y -> y.
Forgot to retire pdb... Again!
Forgot one capital Y in the piece of code of the RST docpage.
Removed trailing spaces in the RST doc page.
Merge branch 'master' of https://github.com/scikit-learn/scikit-learn
metrics.explained_variance was renamed to metrics.explained_variance_score so that I needed to modify this example.
Removal of the submodule relative imports in the toplevel init file.
gaussian_process module changes:
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Merge branch 'gaussian_process_review'
Debug in GaussianProcess.predict for batchwise computation
Debug GaussianProcess.predict for variance estimation in 'light' storage mode.
FIX: random_start feature in GaussianProcess
FIX: gp_diabetes_dataset examples (theta_ attribute)
Vincent Michel (47):
New feature selection
Last version of univariate selection
Merge branch 'master' of git@github.com:vmichel/scikit-learn
Corrections of indentation in univariate_selection
remove old univariate_selection
Correct nosetests for univariate_selection
Add doc to univariate_selection
Merge branch 'master' of git@github.com:vmichel/scikit-learn
Merge branch 'master' of git@github.com:agramfort/scikit-learn
Merge branch 'master' of git@github.com:agramfort/scikit-learn
Add rfe example
update example
Merge branch 'master' of git@github.com:agramfort/scikit-learn
Merge branch 'master' of git@github.com:agramfort/scikit-learn
Corrections in rfe
Remove feature selection
Add ranking_
Add Crossvalidated version of RFE
Add example of RFE CV
ENH : New version of Bayes Ridge
Merge branch 'master' of git@github.com:vmichel/scikit-learn
Newer (and faster !) version of Bayesian regression.
Merge branch 'master' of https://vmichel@github.com/scikit-learn/scikit-learn
ENH : New version of Bayes Ridge
Newer (and faster !) version of Bayesian regression.
Update tests for bayes
Merge branch 'master' of git@github.com:scikit-learn/scikit-learn
Add first draft of variational bayes
Add variational inference
DOC: Update doc for bayesian regression + examples
Merge branch 'master' of github.com:scikit-learn/scikit-learn
FIX : Remove reference to Variational Bayes
More doc in bayes.py, fix bug in high dimension, add score
Merge branch 'master' of github.com:scikit-learn/scikit-learn
ENH : change the convergence trigger
More coverage for bayes
DOC : create and start doc for cross-validation.
Merge branch 'master' of github.com:scikit-learn/scikit-learn
DOC : add changes in classes.rst
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Add ward algorithm + feature agglomeration
Add documentation on Ward algorithm
Add documentation on hierarchical clustering.
Merge branch 'master' of github.com:scikit-learn/scikit-learn
[base.py] revert previous commit, as the error is raised when object does not follow scikit API
[feature_extraction] Refactor text/* to text.py
Fix RFE - Add the possibility to use feature_importances rather than coef_ when existing, see #2121
Vincent Schut (8):
added a converged_ attribute to GMM to indicate whether fit() returned because of convergence or because max_iter was reached.
reset GMM.converged_ when calling fit() again
split >80 char comment in 2
add GMM.converged_ attribute to GMM docstring
some optimizations for GaussianProcess
pep8 improvements
remove unnecessary parens
batch k-means: calculate labels and inertia in chunks to prevent memory errors
Virgile Fritsch (83):
DOC: Fix typos in svm module documentation.
Remove Y from fit in OneClassSVM.
Add a reinitialization function for estimators + write test for
Merge branch 'master' of github.com:GaelVaroquaux/scikit-learn
Change test name for the _reinit() method.
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Merge branch 'master' of github.com:GaelVaroquaux/scikit-learn
DOC: Explain the _set_params method in BaseEstimator class.
Rename PyBallTree* --> BallTree* in BallTree.cpp.
DOC: typos + change name of LDA vs QDA examples
Refactoring of the covariance estimators modules.
OAS estimator of covariance + new example.
Refactoring of the covariance module and examples + add OAS.
Merge branch 'covariance' of github.com:VirgileFritsch/scikit-learn into covariance
More covariance refactoring: separate MLE computation from object.
Rename BaseCovariance as EmpiricalCovariance + reviews comments.
Remove useless calls to np.asanyarray and improve computation.
Cosmit
Handle integer type case for the estimation of covariances.
Use np.cov instead of empirical_covariance in covariance module.
Reintroduce empirical_covariance function + docstrings + cosmit.
DOC: Documentation about covariance estimation.
Compatibility Ubuntu 11.04 (with matplotlib 0.99.3)
Modify the method computing errors on covariances (<cov_object>.error)
Bug fix: turn <covariance_object>.mse into <covariance_object>.error
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Covariance errors computation API changes.
Docstrings about labels + cosmit in the metrics module.
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Add an empty Cython module that makes it possible to check that `make` has been run.
Implements a robust covariance estimator: Rousseeuw's MCD.
Integrate Fabian's comments on Minimum Covariance Determinant.
Implements a robust covariance estimator: Rousseeuw's MCD.
Integrate Fabian's comments on Minimum Covariance Determinant.
Merge branch 'mcd' of github.com:VirgileFritsch/scikit-learn into mcd
BF: index out of bound in GraphLassoCV grid refinement.
Refactor MCD robust covariance estimator: it is easier to regularize.
Merge with Gael's glasso changes.
Make the design even more modular for MinCovDet.
Make the "robustness parameter" accessible through the API.
Integrate Gael's minor comments + Magnify examples + 1D data case.
Remove `correction` and `reweighting` parameters from the API.
Merge pull request #396 from VirgileFritsch/refactor_mcd
OPT: (minor) remove useless determinant computation in FastMCD.
Separate correction and reweighting steps from raw MCD computation.
Add a set of tools and a new object for outliers detection (+ example).
Add tools to perform outlier detection with sklearn + documentation.
Clean working directory
Integrate AlexG's comments on doc and examples + add tests.
Magnify novelty and outlier detection examples again + minor fixes.
DOC: Move Parameters section outside objects __init__ method.
Example on real data (outlier detection on boston housing data set).
Fix bugs + adjust OCSVM parameter in outlier detection example.
Cosmit: address Olivier's comments on examples naming.
BF: Avoid two consecutive centering of the data in outlier_detection.
rename mahalanobis_values to raw_values in covariance decision method.
ENH: make LedoitWolf estimation scale (memory usage) with n_features.
The LedoitWolf object has to return a covariance estimate or breaks.
Put Ledoit-Wolf shrinkage coefficient estimation in a separate function.
Avoid extra computations + clean `assume_centered` argument use.
Remove forgotten line related to previous commit.
Catch non-invertibility errors within MinCovDet computation.
Improve covariance module test coverage.
More tests for the covariance module.
BF: adapt a svm test to recent numpy versions.
BF: Make MinCovDet work with n_samples >> n_features.
Merge branch 'cov-speedup' of https://github.com/vene/scikit-learn into cov-speedup
Add comments on optimized precision computations.
Add comments on optimized precision computations.
Merge pull request #1015 from vene/cov-speedup
BF: Address issue #1059 in GMM by adding a supplementary check.
BF: Fix broken tests: change a check for compatibility with HMM.
BF: fix issue #1127 about MinCovDet breaking with X.shape = (3, 1)
Improve doc and error msg in MinCovDet in response to issue #1153.
BF: GridSearchCV + unsupervised covariance shrinkage selection.
Change legend + complete docstrings.
Improve example narrative doc (rewritten intro).
Fix typos in doc.
Add y=None to covariance estimators for API consistency.
BF: Correct degrees of freedom in f_regression + test.
BF in f_regression: variable naming + use assert_*array*_almost_equal.
f_regression and degrees of freedom: update whats_new + minor.
Merge pull request #2960 from ethanwhite/docs-typo
Vjacheslav Murashkin (1):
FIX handle selection in feature names for example
Vlad Niculae (769):
Barely functional NMF implementation.
Updated the example with doctest tags.
Cleaned some syntax, implemented more flexibility.
Fixed svd-based initialization, fixed example
Wrote a few test cases.
Merge branch 'master' into nmf
Merged upstream changes
Added benchmark.
Merge branch 'master' into nmf
Added CRO-based initialization, TODO tests, bench
Untracked changes
Merge branch 'master' into nmf-nnls
Put CRO inside nmf.py
Sparsity constraints and measures of sparsity
Merge branch 'master' into nmf-nnls
Style fixes all around. Clarified NNDSVD docstring.
Decreased default NMF tolerance to improve results.
Corrected sparseness measures in NMF.fit
Removed print in CRO.fit; moved utils to top.
Possibly fixed errors in doctest (not verified yet)
Doctests pass now
Fixed bug in transform (lack of .T), renaming
Non-negative least squares testing
Renamed tolerance to tol for consistency.
Wrote tests to cover mostly everything
NMF example on faces dataset
Implemented fit_transform
Tweaked plot aspect ratio
Fixed broken tests due to interface change
Tests now behave better
Renaming; removed numpy 2-norm
Removed useless _fit_transform
CRO inherits from BaseEstimator
Applied suggestions; updated bench and example
Updated doctest
pep8 fixes
Abbreviation expansion in benchmark
Fixed comments in NMF example
pep8 on test_nmf
Removed comments.
Removed CRO for now
Added nndsvda and nndsvdar options for NMF.init
Merge branch 'nmf-nnls' into nmf-lite
Benchmarks and more pep8
Fixed benchmark, removed unused import.
Fixed NMF benchmark colors
Merge branch 'nmf-lite' of git://github.com/agramfort/scikit-learn into nmf-lite
Merged
Fix benchmarks printing of error for alt-nmf
Documentation. Discussed fixes. Set default to ar.
Added KPCA citation.
Added NMF to classes.rst
Fixed non-ascii characters
Change PCA test to fit just once
Updated documentation with references
Added y=None in fit for pipelining
Fixed relative URI in NMF doc refs
Clarification of example in NMF doc
Capitalized Gram, added y=None in fit, pep8 test.
Docstring formatting in test_nmf.py
Docstrings in nmf.py
Merge branch 'nmf-nnls'. Docstring fixes, mainly.
Clarified NNDSVD in docstring
Documented NNDSVD. Fixed ar perturbation range.
Corrected error in docstring re: nndsvdar
Added disclaimer in nndsvdar docstring
Clarified invalid sparseness parameter error msg.
Clarified init parameter error message.
Transposed shape of components_ attribute
Renamed NMF to ProjectedGradientNMF
Updated authors
Merge branch 'master' into nmf-lite
DOC: Added both plots to NMF doc, tweaked plots.
DOC: Made plots look better.
pep8 in plot_kpca
Attributes renamed and documented.
Began work on decompositions package.
FIX: very confusing internal naming in NMF
Merge branch 'nmf-fix' into decomposition
Decomposition module WIP
Merge branch 'master' into nmf-fix
Merge branch 'master' into decomposition
Working decomposition package
MISC: pep8ification
Missed one reference
Merge branch 'master' into decomposition
FIX: KernelPCA plot in doc
FIX: forgot to track init file in tests
API: components_ shape fixed in PCA classes
ENH: More accurate and clean numeric code in PCA
ENH: More avoidance of np.dot for diagonal entries
Renamed fastica.py to fastica_.py
Merge branch 'master' into decomposition
FIX: Explicit docstring inheritance
FIX: last char in char analyzer, max_df behaviour
FIX: doctest
Copied the Sparse PCA file from the gist
Fixed Lasso call, all is still not right
LARS _update_V fixed by Gael
PEP-8
Initial factoring into SparsePCA class
Implemented transform, fixed confusion
DOC: clarified the default for NMF initialization
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Merge branch 'master' into sparsepca
Updated transform function, began tests
Merged Gael's gist newest update
Merge branch 'master' into sparsepca
A couple of passing tests
factored out the example code
DOC: a little commenting
renaming, included tests
Merge branch 'master' of github.com:scikit-learn/scikit-learn into sparsepca
Updated init.py
one more test and a quick example
pep8
DOC: foundations, prettified example
Doc enhancement, added alpha in transform
Merge branch 'master' into sparsepca
Added ridge in transform (factored here for now)
Removed print statement from test. Whoopsie!
Merge pull request #2 from agramfort/sparsepca
Initial integration of Orthogonal MP
Renaming, some transposing
Tests and the refactoring they induce
PEP8
Added signal recovery test
rigorous pep8
Added the example
Cosmetized the example
s/nonzero/non-zero
Added Olivier's patch extractor with enhancements
cleanup
Tests for various cases
PEP8, renaming, removed image size from params
Merged Gael's latest update to sparse_pca.py
Merge branch 'sparsepca' of github.com:vene/scikit-learn into sparsepca
Merge branch 'sparsepca' into sc
FIX: update_V without warm restart
FIX: weird branching accident
Merge branch 'sparsepca' into sc
Revert "FIX: update_V without warm restart"
Revert "FIX: update_V without warm restart"
Revert "Revert "FIX: update_V without warm restart""
Merge branch 'sparsepca' into sc
Initial integration of Orthogonal MP
Renaming, some transposing
Tests and the refactoring they induce
PEP8
Added signal recovery test
rigorous pep8
Added the example
Cosmetized the example
Added Olivier's patch extractor with enhancements
cleanup
Tests for various cases
PEP8, renaming, removed image size from params
FIX: weird branching accident
Revert "FIX: update_V without warm restart"
Revert "Revert "FIX: update_V without warm restart""
Merge branch 'sc' of github.com:vene/scikit-learn into sc
FIX: update_V without warm restart
Added dictionary learning example
Merge pull request #3 from agramfort/sc
renaming for consistency, tests for PatchExtractor
Initial shape of dictionary learning object
Added DictionaryLearning to __init__.py
FIX: silly bugs so that the example runs
ENH: Tweaked the example a bit
PEP8
Copied the Sparse PCA file from the gist
Fixed Lasso call, all is still not right
LARS _update_V fixed by Gael
PEP-8
Initial factoring into SparsePCA class
Implemented transform, fixed confusion
Updated transform function, began tests
Merged Gael's gist newest update
A couple of passing tests
factored out the example code
DOC: a little commenting
renaming, included tests
Updated init.py
one more test and a quick example
pep8
DOC: foundations, prettified example
Doc enhancement, added alpha in transform
Added ridge in transform (factored here for now)
Removed print statement from test. Whoopsie!
s/nonzero/non-zero
Merged Gael's latest update to sparse_pca.py
FIX: update_V without warm restart
FIX: weird branching accident
Revert "FIX: update_V without warm restart"
Revert "Revert "FIX: update_V without warm restart""
Merge pull request #5 from agramfort/sc
Merge branch 'sparse_pca' of git://github.com/GaelVaroquaux/scikit-learn into sparsepca
Finished merging Gael's pull request
Merge branch 'master' into sparsepca
Merge branch 'master' into sc
Merge branch 'sparsepca' into sc
Merge branch 'sc' of git://github.com/larsmans/scikit-learn into sc
Renaming, part one
Renaming, part two
Renamed online dict_learning appropriately
Merge branch 'sparsepca' into sc
Renaming part three
Fixed dico learning example
Used @fabianp's ridge refactoring
Exposed ridge_regression in linear_model init.py
Merge branch 'master' into sparsepca
Updated ridge import
Merge branch 'sparsepca' into sc
FIX: checks in orthogonal_mp
Cleanup orthogonal_mp docstrings
OMP docs, a little broken for now
DOC: omp documentation improved
DOC: omp documentation fixes
DOC: dict_learning docs
dictionary learning tests
Fixed overcomplete case and updated dl example
fixed overcomplete case
online dictionary learning object
factored base dico object
pep8
Merge branch 'sparsepca' into sc
pep8
more transform methods, split_sign
OMP dictionary must have normalized columns.
Merge branch 'master' into sparsepca
Merge branch 'master' into sc
DOC: improved dict learning docs
Tweaked the dico example
exposed dict learning online in init
working on partial fit
denoising example
Annotate the example
partial fit iteration tracking, test still fails
FIX: typo, s/treshold/threshold
Merge branch 'sparsepca' into mblondel-fix_ridge
simplify sparse pca
Tweak denoise example spacing
pep8 examples
pep8
Merge branch 'master' into sparsepca
Merge branch 'mblondel-fix_ridge' into sparsepca
Merge branch 'sparsepca' into sc
random state control, comment fixes
Merge branch 'sparsepca' into sc
random state control
clarify lasso method param
Merge branch 'sparsepca' into sc
clarify lasso method param in sc too
s/seed/random_state in patch extractor
DOC: fixed patch extraction comments
ENH: PatchExtractor transform
s/seed/random_state in dico learning example
s/seed/random_state in denoising example
FIX: s/V_views/code_views and pickling
Merge branch 'sparsepca' into sc
DOC: more sparse pca narrative documentation
FIX: gram when method=cd
Merge branch 'master' into sparsepca
removed fit_transform overload
Merge branch 'sparsepca' into sc
DOC: consistent punctuation, minor enh
DOC: missed a couple of dots
ENH: verbose and title in sparse pca example
DOC: fixed typo in sparse pca narratives
Merge branch 'dwf_sparse_pca' of git://github.com/GaelVaroquaux/scikit-learn into dwf_sparse_pca
TEST: fake parallelism
TEST: fake only on win32
TEST: no meddling with joblib outside of win32
Merge branch 'master' into sparsepca
Lower tolerance in sparse pca example
DOC: sparse pca transform rephrasing
DOC: more sparse pca transform rephrasing
One big decomposition example
DOC: consistent coding method in docstrings
Merge pull request #7 from GaelVaroquaux/dwf_sparse_pca
TEST: more coverage
FIX: sparse pca ignored initialization
Merge pull request #8 from GaelVaroquaux/dwf_sparse_pca
Merge branch 'sparsepca' of github.com:vene/scikit-learn into sparsepca
FIX: typo in example s/cluter/cluster
pep8
pep8 in example
FIX: messed up images in narrative doc
FIX: example image order is consistent (for now)
ENH: predictable ordering in example, included kmeans
kernel pca gets its own module
Merge branch 'master' into sc
DOC: fixed SparsePCA docstring issue
Brought in OMP from the larger branch
added functions to classes.rst
Remove useless prints in example
Merge branch 'master' into omp
consistency with lasso: s/n_atoms/n_features
DOC: some fixes
failing test for expected behaviour
FIX: LARS and LassoLARS did not accept n_features
PEP8
FIX: doctests
Merge branch 'master' into lars_n_features
FIX: broken doctest in Lars
cleared n_features naming confusion
s/n_nonzero_features/n_nonzero_coefs
Factored out sparse samples generator
pep8
OrthogonalMatchingPursuit estimator
pep8
Merge branch 'master' into omp
cosmit in example
unified notation
made code consistent with docstring
cleaned up tests, added count_nonzero to fixes
Added OMP bench
better cholesky management
pep8
arrayfuncs solve_triangular and EPIC creeping bugfix
fixed check for None
set random seed to hide odd random test failures
fix more None checks
more clarity
Added early stopping as in reference implementation
n_nonzero_coefs defaults to 10% if eps not passed
began rewriting the tests
transposed generator, updated tests
fixed stupid mistake causing the sample generator to be inconsistent
warn when omp stops early
no need for min, it would break on the previous line
change matrix order, gram looks ok now
use np.asfortranarray
tests robust to warnings
do not overwrite global warn filters in test
use np.argmax instead of x.argmax()
while 1 instead of while True
use nrm2 from BLAS
It's official: omp is faster than lars (w/o Gram)
API changes, part I
API changes, part II: Return of the Estimator
FIX: precompute_gram=auto
DOC: docstrings fixes
pep8
don't use gram in example, useless slowdown
FIX: benchmark was broken
DOC: docstrings
Convert to F-order as soon as possible
F-order asap, don't assume any overwriting
that was unneeded
clearer benchmark
Merge pull request #11 from agramfort/omp
DOC: referenced OrthogonalMatchingPursuit in doc
Merge branch 'omp' of github.com:vene/scikit-learn into omp
updated samples generator according to @glouppe's refactoring
typo s/dictionnary/dictionary
PEP8
Merge branch 'master' into omp
FIX: broken samples generator test
FIX: cruel bug in OMP, no more unneeded warnings now.
Merge branch 'master' into sc
Added Olivier's patch extractor with enhancements
Tests for various cases
PEP8, renaming, removed image size from params
s/seed/random_state in patch extractor
DOC: fixed patch extraction comments
ENH: PatchExtractor transform
extra blank line
pep8 in test file
image.py authors
speed up tests
improved warning for invalid max_patches
New file: Feature extraction documentation
Added feature extraction as a chapter
fix copy paste error in docstring
DOC: improved docstrings
Updated documentation, fixed bug in the process
DOC: clarified docstrings even more
Merge branch 'master' into sc
Accidentally removed a line in a test
pep8 in doc
rename coding_method, transform_method to fit/transform_algorithm
fix broken test
changed digits to faces decomposition example
added dict_learning_online function
MiniBatchSparsePCA is born
Removed dict_init in MiniBatchSparsePCA, docstrings
code reuse by inheritance, more tests
Fast-running face decomposition example
DOC: updated narrative docs for MiniBatchSparsePCA and example
DOC: fixes and updates
DOC: minor errors
FIX: broken test
Added MiniBatchSparsePCA and dict_learning_online to classes.rst
DOC: fixed issue in MiniBatchSparsePCA docstring
ENH: cleaner random number handling in tests
Removed default value of n_components=None in SparsePCA
Fixed inappropriate checks for None
Switched dict_learning_online returns order for consistency
ridge_alpha as instance parameter
prettify face decomposition example (ft. GaelVaroquaux)
add refs to example
Merge branch 'master' into sc
duplicated import
FIX: denoise example was broken
FIX: reconstruction test
make tests share data
clarify docstrings
added init test
partial_fit passes the test
added least-angle regression to dictionary learning transform
plugged in lars instead of lasso_lars in denoising example
Merge branch 'master' into sc
redesign the denoising example
FIX: BayesianRidge doctest
tweaked the example a little more
removed thresholding from denoising example
completely removed thresholding from denoising example
Prettify example
More work on example
tweaking example
DOC: clarified and enhanced dictionary learning narratives
added dictionary learning to classes.rst
corrected reference to omp
DOC: fixed link to decomposition example
DOC: fix See also
DOC: fix See also in both places
DOC: cleaner see also section
DOC: improved dict learning narratives some more
Data centering in denoising example
Prettify structure example
DOC: minor style changes
DOC: tweaks
Removed print in digits classification example
DOC: fixed links and made examples build
Merge branch 'clayw-label_prop' of github.com:vene/scikit-learn into clayw-label_prop
DOC: clarified example titles
Removed fit_params from dictionary learning objects
plot the dictionary in denoising example, other one will disappear
completely removed the duplicated example
Prettify the example
Overhauled example to show the difference
Renamed the example, bounded the difference range
Lower the range of the difference in example for better contrast
Added norm to titles
More explicit docstring in the example
Removed verbosity (example now 4s faster!), prettier output
fix output bug
PEP8 and style
Merge branch 'master' into sc
style
Merge branch 'master' into sc
Use fit_params in Pipeline
Moved dict_learning stuff out of sparse_pca.py
rename eps to tol in omp code
Exposed sparse_encoding, docs not updated
Consistent defaults
Updated the first part of the docs
Updated the docs
removed fit_transform for dict learning
Updated the narrative doc
Tweaking the example
Improved the example clarity
Merge branch 'master' into sc
removed unused imports
fixed all pyflakes warnings
Merge branch 'master' into sc
Copied tests, fixed examples imports, enhanced see alsos
Merge branch 'master' into sc
Merge branch 'sc' of git://github.com/agramfort/scikit-learn into sc
Merge branch 'master' into sc
Included dictionary learning online in decomp example
Added missing dashes in doc
Merge branch 'master' into sc
Merge branch 'vene-sc' of git://github.com/ogrisel/scikit-learn into sc
Merge branch 'master' into sc
Merge branch 'dictionary_learning' of git://github.com/GaelVaroquaux/scikit-learn into sc
renamed MiniBatchDictionaryLearning
layout
Reordered dictionary learning docs
tweaked faces decomposition and added to dict learning docs
added dict learning face decomposition to docs
Fixed image display in docs
simplified fit_algorithm keyword
s/img_denoising/image_denoising
made sparse_encode functions visible
added see also refs to sparse_encode functions
Reordered dictionary learning docs
Stabilized and improved face decomposition example
explicit seeding of olivetti faces loader
MISC: even better check_build error reporting
DOC: added Gaussian Processes to class reference
FIX: keep track of index swapping in OMP
Merge branch 'master' into omp_bug
Merge branch 'omp-bug-test' into omp_bug
Testing for swapped regressors in OMP
Merge branch 'omp-bug-test' into omp_bug
PEP8
Merge branch 'master' into omp_bug
Merge pull request #408 from vene/omp_bug
Skip tests in OMP that fail on old Python versions
Fix one-dimensional y in Gram OMP estimator
Added SparseCoder estimator
Basic testing
DOC: add missing split_sign in docstrings
FIX: 10% of features should be at least 1
PEP257 :)
restore typo
Added SparseCoder to init and class index
initial work on docs
implement noop fit in SparseCoder
clean up test
Fixed doc links
Fixed lena in example
Fixed lena import in denoising example
Merge branch 'master' into sparse-coder
cleaned up imports in test
Merge branch 'master' into sparse-coder
FIX: objective functions in Lasso linear model docs
DOC: correct ordering of returns in dict_learning_online
DOC: clarified dimensions in _update_dict
Fix the API and the scaling inside dict_learning
DOC: specify scaling in linear_model.rst
work on failing tests
Merge branch 'master' into sparse-coder
skip tests that were wrongly passing before
Test for almost equal instead of equal in sparse_encode_error
FIX: slices generation
Hide sparse_encode -- redundant
DOC: add optimization objective to lasso and enet docstrings
DOC: make docstrings as good as I could
Warnings and deprecation
DOC: better cross refs and docstrings
Adapted examples for alpha scaling
Merge branch 'master' into sparse-coder
PEP8
added sparse coding example
s/threhold/threshold
Merge branch 'master' of https://github.com/scikit-learn/scikit-learn into sparse-coder
Add SparseCoder example
Overhauled SparseCoder example
Merge branch 'master' into sc-example
Added @vene's work to the changelog
sparse coding transform is now a mixin
ENH: multilabel samples generator can create different number of labels per instance
pyflakes test_multiclass
Add the samples generator to the references
ENH: Added the synthetic example
ENH: Really added the synthetic example
DOC: add multiclass to class reference
DOC: add example to multiclass.rst
DOC: really add example to multiclass.rst
DOC: add image to narrative doc
Added missing space in PIL warning
DOC update changelog
Add Andy to the author list
Allow unlabeled samples in multilabel ex, collab between @vene and @mblondel on the plane
FIX typo that broke the test
ENH make example more expressive
Change seed to make example behave better
Removed unused imports in species dataset
FIX: issue #540, make omp robust to empty solution
Merge branch 'omp-zerofix'
ENHanced the multilabel example aspect
s/jacknife/jackknife
DOCFIX: make math block render
Add warnings and clean up tests
FIX: doctests for scale_C, took some liberties
FIX: bug in test_setup. Actually avoid multiprocessing now.
FIX: wrong cover-package, misleading coverage as 100%
DOC: updated testing instructions
Remove a warning from kmeans tests
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Remove deprecation warning in sparse_encode
Merge pull request #873 from vene/remove_sc_warning
ENH: make_regression supports multiple targets
Update make_regression return shapes in docstring
FIX: sparse ElasticNet tests that were not testing much at all
fix typo
ENH: faster design in FastICA
Begin updating the developers performance documentation
Update and fix errors in memory profiling documentation
DOC: better phrasing about memory profiling
Begin updating the developers performance documentation
Update and fix errors in memory profiling documentation
DOC: better phrasing about memory profiling
We already have the inverse at that step
Replace pinv calls with dgetri
More lapack inverting
Refactored fast_pinv without lapack calls
Compute pseudoinverse using eigendecomposition
Vectorize singular value inversion
Remove unused import
Merge branch 'master' into cov-speedup
Merge remote-tracking branch 'VirgileFritsch/cov-speedup' into cov-speedup
Merge remote-tracking branch 'jakevdp/vene-cov-speedup' into cov-speedup
Update and rename pinvh (by @jakevdp)
Cloned @jakevdp's pinvh tests
Remove odd-looking period in tests
Use pinvh in plot_sparse_recovery example
grammar
Use pinvh in bayes.py
Use pinvh in GMM and DPGMM
Remove deprecated _set_params and the call in grid_search
Remove chunk_size from k_means
Removed load_filenames and load_20newsgroups
Remove sparse_encode_parallel
Removed deprecated parameters in GridSearchCV
Remove LARS and LassoLARS
Remove fast_svd.
Remove _get_params
Corrected deprecation schedule in cross_validation
Remove deprecated properties in naive_bayes
Add or fix deprecation schedule in warnings.
Fix example using deprecated API, output was misleading.
Remove deprecated load_20newsgroups from classes.rst
FIX: randomly failing CountVectorizer test
MDS is not a transformer, fix the test to skip PLS
Merge branch 'master' into mixins
Improve the common tests, make fast_ica pipelinable
Support y-dependent transform as in PLS
fit_transform in PLS to support y
Make PLS degrade gracefully on sparse data
Rename Y to y in PLS
Check for sparse input in isomap and lle
Check for sparse data in MDS despite not being tested
Skip CCA in test_regressors_int
First effort in multitarget lassolars
ENH: move Gram precomputation outside of the loop
TEST: precomputed lasso and lars
Unnecessary copying
FIX: add test, fix memory initialization bug
ENH: multidimensional y in ElasticNet (WIP)
return_path option in lars_path
Add possibility to ignore the path in Lars objects
Fix doctests
Add __all__ for half of the scikit
Add __all__ for the second half of the scikit
Expose ENGLISH_STOP_WORDS
We already have the inverse at that step
Compute pseudoinverse using eigendecomposition
Vectorize singular value inversion
Cloned @jakevdp's pinvh tests
Use pinvh wherever it helps in the codebase.
First go at speeding up Euclidean distances
Make it less yellow
More reusable code, speed up symmetric case
Better cython style.
Add dense sparse support and precomputation
FIX: buggy case when X=dense, Y=sparse
Consistent argument naming and useful maintenance notes
FIX using out with sparse matrices
Relative imports, fix todense bug
safe_sparse_dot into preallocated output
Add test for dense_output, fix bug, cleaned up logic
Avoid reallocation in manifold.mds
add type prefix to blas funcs
DOC Clarify the docstrings
Added Cython-generated euclidean_fast.c
Separate dense_output and out parameters, document better
API change: mutually exclusive preallocation and precomputation
FIX: csr_matrix induced unwanted copying
Rename euclidean_fast to _euclidean_fast
Clean setup.py in metrics
ENH: improve test coverage
Add failing test and no-op flip
Sign flipping as suggested by @ogrisel, not in place
Make sign flip in place
Test more seeds for svd sign flipping
Add sign flip as flag in randomized_svd
Make sure svd_flip test actually tests something
Make randomized_svd flipped by default
svd_flip test fails on Travis. Change random seed, see if it helps
Cannot easily ensure non-uniqueness without the fix, just test uniqueness
TEST flipped svd remains correct
FIX: makes our libsvm port compile under MSVC
Merge branch 'master' of github.com:scikit-learn/scikit-learn
DOC: fix typo and formatting around MurmurHash3
DOC: Fixed wrong link and formatting in decomposition docs
DOC: fixed latex and formatting in SVM docs
DOC: more consistency in metrics docstrings
DOC: More consistency in metrics and clustering metrics docstrings
DOC: more consistency in docstrings for unsup clustering metrics & missing link
DOC: fixed missed details in metrics docstrings
DOC: addressed more inconsistencies in metrics docstrings
ENH: use lgamma function from John D. Cook
Merge branch 'master' into lgamma_port
FIX: variable naming inconsistency in NMF
DOC FIX: multi-target linear model attribute shapes
DOC spelling and clarification
Make callable svc test more robust for MacOSX.
Added RBM to whats_new.rst
DOC Added skeleton for RBM documentation
ENH Rename RestrictedBolzmannMachine to BernoulliRBM
FIX: make BernoulliRBM doctest pass
FIX: BernoulliRBM check random state in fit, not in init
FIX: validation in `BernoulliRBM.transform`
DOC: first attempt at RBM documentation
Link to RBM docs from the unsupervised toctree
FIX: uneven RBM image
DOC: PCD details and references
Fix typos in example
PEP8 and indentation
DOC add plot and example to docs
DOC rewrite BernoulliRBM example description
Set seed through params, not globally
FIX handling of random state, hide some of API
Pep8 example
Update example params by grid search, and docstring
One space after dot
DOCFIX neural networks module
DOCFIX spacing and clarification in RBM docstring
More stable implementation of logistic function and its derivative by @fabianp
Use gen_even_slices instead of homebaked code
ENH Add fast and stable logistic sigmoid to utils and RBM
ENH Support sparse input in RBMs
ENH Prevent memory copying in RBM's _fit
Do not touch uncopied memory
Nudge images using convolve, slower but more readable
Clarify narrative docs
Clarify and python3 RBM example
Periods and other docstring issues
Remove redundant test
Python3 support in RBM
TST RBM smoke-test verbosity
FIX missing class attribute in ICA. Common test was failing
FIX: fastica function dictionary default value
Deprecate FastICA.sources_
TEST remove deprecated stuff from fastica tests
Document the deprecation
FIX bug in test
Clean up and rename Hungarian algorithm
Clarify and clean up example
Remove print in Hungarian tests
Consistency for floats in consensus score
Add warning in private _Hungarian docstring just in case
ENH make spectral clustering test more stable to random seed
ENH add return_path in orthogonal matching pursuit
TEST for omp path feature
ENH OrthogonalMatchingPursuitCV estimator
FIX respect conventions in OMP init
FIX OrthogonalMatchingPursuit normalized twice
Use projected gradient solver in transform to support sparse matrices
Use same parameters when solving the transform
Use scipy.optimize.nnls for dense data
Add failing test for libsvm random state proba
FIX support random state in libsvm
DOC document changes in LIBSVM_CHANGES
DOC update docstrings to reflect libsvm random_state
Fix libsvm seed when predict_proba in tests and examples
Clarify and make libsvm random seed more consistent
Comment predict params in libsvm
DOC reference and rename cross decomposition module
FIX raise tolerance in svm predict_proba test
Make common PLS tests more stable
FIX for MSVC inline fmin, fmax and log2
FIX for MSVC inline fmax in dist_metrics
Add LibSVM random state to changelog
Add .bmp sklearn logo of correct width for Win setup
[TYPO] s/migh/might/
TYPO remove mutli (did you mean Muttley?)
Turn useless line of code into descriptive comment
DOC fix docstrings; add @hamsal to authors
COSMIT Use explicit if/else in scorer
TST default scorers with sample_weight
DOC Update What's New
FIX classes name in OvR
FIX set vectorizer vocabulary outside of init
Deprecate vectorizer fixed_vocabulary attribute
Wei Li (109):
FIX: this fixes issues #746 ProbabilisticPCA minor things
FIX: this further fixes issues #746 with API compatibility warning and integer division fix
ENH: using coo matrix construction to accelerate calculation of the contingency matrix
FIX: numerical issues in NMI
COSMIT pep8
ENH add refs to issue #884
FIX: ADD test cases for exact 0 case, and nmi equal to v_measure case
FIX: accelerate v_measure calculation based on mutual information
COSMIT add doc to clarify how nmi is normalized and pep8 fix
COSMIT pep8 fix for test_supervised
FIX: fixes error caused by break line
Using coo_matrix to accelerate confusion_matrix calculation
COSMIT
ENH add test for testing v_measure is a variant of nmi
COSMIT typos in doc strings
FIX let test use random_state(seed)
PEP8..
FIX typos and vague comments
DOC add comments for log(a) - log(b) precision
COSMIT fails to see the function name use mi rather than mutual information
FIX doctest to check up to 6 digits precision
FIX: eliminate \ for continuation from doctests
FIX issue #1239 when confusion matrix y_true/y_labels has unexpected labels
PEP8
ENH docstring misleading
ADD install guide for archlinux
ADD spectra_embedding for wrap function spectra_embeeding as an estimator from spectral clustering
ENH finish sketch for the estimator wrapper
ENH add warning for inverse transform
ADD test cases for spectra_embedding
ADD empty test scripts
COSMIT
FIX typos
FIX inconsistent typos
FIX nearest_neighbor graph build
ADD add test_examples for pipelined spectral clustering and callable affinity
FIX remote does not have test file wired...
MOV move spectra_embedding from decomposition to manifold
ENH docs partially updated, happy mooncake festival
ENH move spectral_embedding as standalone, fixes for tests
COSMIT
ADD add the laplacian eigenmap to examples
ADD test cases for two components, unknown eigenvectors, unknown affinity
COSMIT
ENH test-coverage
PEP8 test files
ADD spectra_embedding for wrap function spectra_embeeding as an estimator from spectral clustering
rebase: fixing conflict
ENH add warning for inverse transform
ADD test cases for spectra_embedding
ADD empty test scripts
COSMIT
FIX typos
FIX inconsistent typos
FIX nearest_neighbor graph build
ADD add test_examples for pipelined spectral clustering and callable affinity
FIX remote does not have test file wired...
rebase: fixing conflict
ENH docs partially updated, happy mooncake festival
ENH move spectral_embedding as standalone, fixes for tests
COSMIT
ADD add the laplacian eigenmap to examples
ADD test cases for two components, unknown eigenvectors, unknown affinity
COSMIT
ENH test-coverage
PEP8 test files
SYNC doc built error on one machine, sync with another
DOC docs for spectral embedding
DOC dox fix and misc post-rebase things
MRG merge with @Gael's PR 1221 and some name changes
FIX lobpcg, amg drops the constant eigen vectors by default
ADD check for symmetric and check for connectivity
ADD add test for check_connectivity
COSMIT
Change sparse graph to use cs_graph funcs. minor doc changes
Minor doc changes
FIX spectral embedding offers choice whether to drop the first eigenvector
COSMIT
RENAME parameter rename in examples
RENAME rename eigen_tol and eigen_solver, and warning about using old variable name eig_tol and mode
ADD add a test for discretize function
COSMIT and Typo
FIX backwards support
FIX doc fix and test fix
COSMIT
ADD added examples, and eliminate unnecessary imports
FIX nn-affinity does not support sparse input
COSMIT and minor fixes
DOC update whatsnew
FIX: amg requires sparse matrices input
missing _set_diag
fix spectral related testing errors
COSMIT and unused lines
FIX further improve the thresholds
FIX discretization test have shape problem, use coo_matrix instead of LabelBinarizer
Addressing @ogrisel's comments
FIX roc_curve failed when one class is available
COSMIT
DOC fix
TYPO fixes
DOC address @amueller's comment
FIX typo
Update whatsnew
FIX spectral_embedding test errors, ADD spectral embedding to sphere examples
MOD use safe_asarray instead of np.asarray
MISC update my mailmap
MOD address @mblondel's comments
MOD move generating matrix out of the loop
Merge pull request #1563 from kuantkid/sparse_knn_graph
Wei Xue (8):
Deprecate estimator_params in RFE and RFECV #4292
Correct warning messages and move warnings to fit method #4292
FIX coding style #4292
Friendly error on KNeighbors(n_neighbors<=0 )
Raise ValueError when log_multivariate_normal_density has non positive definite covariance
DOC & COSMIT deprecate estimator_params in RFE and RFE_CV in docstring
ENH optimize rfecv by eliminating the for loop
Add regression test for the number of subsets of features
Will Lamond (2):
FIX allow ndim>2 in shuffle
Fixes ovr in the binary classifier case, and adds support for lists of feature arrays in the multiclass case.
Will Myers (1):
Fixed SelectKBest corner case: k=0
Wu Jiang (1):
Fix a typo in OneHotEncoder's docstring
X006 (2):
Dataset loader moved to datasets.base, but not being installed
Updates for DBSCAN clustering docs
Xinfan Meng (6):
fix a bug in affinity propagation caused by an incorrect index
BUG Disallow negative tf-idf weight
Fix a test case
Fix broken links
DOC Change URLs of NNDSVD papers to avoid paywall
DOC Fix broken class references
Yan Yi (2):
FIX inconsistent boundary handling
TST add test for radius_neighbors boundary handling
Yann Malet (2):
Update the installation guide with Ubuntu related info
Fix a Broken link in the documentation
Yann N. Dauphin (25):
ENH added Restricted Boltzmann machines
30% speed-up thanks to in-place binomial
ENH 12% RBM speedup with ingenious ordering of operations
rename h_samples to h_samples_
added URI for RBM reference
improved docstring for transform
renamed _sigmoid to _logistic_sigmoid
use double backquotes around equations
logistic_sigmoid moved to function
transposed components_, no performance penalty
only compute pseudolikelihood if verbose=True
more accurate pseudo-likelihood
use iteration terminology instead of epochs in RBM
default n_components from 1024 to 256
clarify some method names (ex: mean_h -> mean_hiddens)
added epoch time
ENH RBM example
switched to digits
moved rbms to neural_networks module
add tests for rbm
trim whitespace
use train_test_split
neural_networks -> neural_network
ENH rename n_particles to batch_size in RBM
TST added more RBM tests
Yannick Schwartz (26):
added a StratifiedShuffleSplit in the cross validation schemes
added test for stratified shuffle split
updated stratified shuffle split test
fixed sss test
cleanup of arg check and doc update
put sss validation in external function
updated doc/whats_new.rst, doc/modules/classes.rst and doc/modules/cross_validation.rst for the sss
sss raises error if a class has only one sample, added associated test
pep8
changed train_fraction to train_size
Fixed random state, changed _validate_sss name, fixed _validate_stratified_shuffle_split bug
New stratified shuffle split version that only return indices arrays
stratified shuffle split can return masks
Fixed StratifiedShuffleSplit issue for unbalanced classes
Fixed n_test issue in StratifiedShuffleSplit
pep8 fix
Added new tests for StratifiedShuffleSplit
Fixed SSS test
Removed redefinition of variable i in SSS
Permute the train and test sets in SSS to avoid class-sorted folds
Added validation for some corner cases in SSS
Updated tests for SSS
Added tests for the StratifiedShuffleSplit to check the sizes of the training and testing sets, and that they don't overlap
Minor cleanup of StratifiedShuffleSplit
BUG: set random state in LogisticRegression
Update multiclass/multilabel documentation
Yaroslav Halchenko (24):
DOC: removing a stale request for subversion write permissions
Allow to build _libsvm.so against system-wide LIBSVM's svm.h
API to control LIBSVM verbosity without patching
recythoning _libsvm.pyx for previous commit
revert change to libsvm -- now verbosity is controlled via API
enable more doc testing for test-doc Makefile rule
adding acknowledgement to Dr.Haxby for my support ;-)
FIX: removed obsolete entries and added current ones for top-level __all__ + unittest
DOC: minor spellings and formatting (trailing spaces, consistent spacing etc)
RF: use joblib.logger submodule itself while accessing its function in grid_search
FIX: reflect SVC API change (eps -> tol) in doc/tutorial.rst
FIX: lars_path -- assure that at least some features get added if necessary
test case for previous commit
minor -- pass verbose into LARS in the test case
FIX: strings are not necessarily singletons + catch mistakes earlier
DOC: minor spellings fixes in pls.py
DOC: minor typo "precom[p]uted"
DOC: fix name for line_profiler_ext.py extension
DOC: enhancement for Debian installation + fixed various typos
DOC rudimentary docstring to deprecated.__init__ describing "extra"
ENH do not fail the test relying on numpy div 0 warnings if those are not spit out by numpy in general
ENH: sklearn.setup_module to preseed RNGs to reproduce failures
BF: explicitly mark train_test_split as not the one for nosetesting
BF: use hasattr with providing attr name (Thanks to Timo Schulz)
Yoni Ben-Meshulam (1):
Fix a minor typo: 'They requires' should be 'They require'.
Yoshiki Vázquez Baeza (2):
DOC: Remove dead link
DOC: Change URL-based link to sphinx :ref:
Your Name (1):
[base.py] Do not break while trying to pprint a nonexistent attribute
Yu-Chin (4):
COSMIT tabs/spaces in Liblinear
MAINT variable initialization in Liblinear
MAINT use size_t in Liblinear
MAINT changes to (disabled) Liblinear code, from upstream
Yucheng Low (1):
Increase length of array indexing type in ArrayDataset
Yung Siang Liau (2):
FIX: Add allow_nans option to check_arrays
FIX TfidfVectorizer exports idf_ attribute
Yury V. Zaytsev (1):
BUG: typo fixes in sklearn.mixture.gmm
Yury Zhauniarovich (1):
Updated graph ranges
Zac Stewart (1):
Fix string literal concatenation whitespace
Zichen Wang (3):
added macro-average ROC to plot_roc.py
added macro-average ROC to plot_roc.py
rebased upstream/master with master
abhishek thakur (3):
staged_predict predicts classes, not probabilities
small spelling error in doc
fix svm.fit in feature_stacker.py
adrinjalali (1):
svm fit numpy array indexing deprecation warning fix.
ai8rahim (1):
Changed assert_array_equal() in Line 45 and 46 to assert_array_almost_equal(,,decimal=5). This has fixed the AssertionError, which occurs during the installation test.
akitty (1):
Fix typo
akshayah3 (15):
Added Deprecation warning
Small PEP8 fix, n_jobs attribute is compatible with both fit method and the constructor
Fixed a failing test in the doc
Added tests
Added separate tests
Refactored some logic code
PEP8 fixes
Added make_fit_parameter method
Added tests to test_boosting
Added _check_sample_weight
Shifted the _check_sample_weight into BaseWeightBoosting class
Raise exception for sparse inputs in the case of svd solver
Documentation of max depth parameter
Sparse inputs for fit_params
Added helper function
alemagnani (1):
FIX+TST non-consecutive or duplicate vocabulary indices
alex (1):
MAINT: symeig->eigh
amormachine (2):
Updating copyright year to 2014
Update AUTHORS.rst
andy (8):
FIX manifold example - sorry, my bad.
COSMIT RST in manifold sphere example.
ENH fix random seed in manifold example
DOC added note in example that digits data is too small.
ENH Add "proximity" parameter to MDS.
FIX some typos, modify test.
FIX another typo, fix examples
ENH updated to more examples.
banilo (3):
added n_init check for k_means()
FIX: typo in Pipeline error
extend R^2 description
benjamin wilson (1):
remove reference to removed API, fixes #6
benjaminirving (1):
Adding checks for the input LDA prior
bhsu (2):
non-inheriting estimator added for rfe, cross validation, and pipeline tests
PEP8 compliance
bob (1):
Couple of small changes from comments
buguen (1):
correcting typos in the doc
bwignall (8):
Add option to restrict LassoCV to positive-only coefficients
Add option to restrict ElasticNet to positive-only coefficients; add test for this case, and the same for LassoCV
Fix some typos in whats_new.rst, using ispell
MAINT: Fix PEP8 warnings in sklearn/covariance
DOC: Replace GT/LT with angle brackets for inner product
CLN: Fix typo
CLN: Capitalize "Gaussian" in example docstrings
CLN: Capitalize "Dirichlet" and "Mexican" in example docstrings
cgohlke (4):
Fix ValueError: Buffer dtype mismatch, expected 'INTP' but got 'long' on win-amd64
FIX MSVC compile error C2036: 'void *' : unknown size
TST: Fix ValueError: Buffer dtype mismatch, expected 'npy_intp' but got 'long' on win-amd64
MAINT Include binary_tree.pxi in source distribution
chalmerlowe (1):
Improved plot digits classification example.
chebee7i (1):
DOC: Reword docstring and deprecation warning for include_self.
cjlin (3):
MAINT import LibSVM 310
add a comment for sigmoid_predict in svm.cpp
MAINT import LibSVM patch from upstream
csytracy (2):
FIX "ValueError: startprob must sum to 1.0" in HMMs
FIX stability in HMMs
danfrankj (1):
Correct documentation for TfidfVectorizer
djipey (5):
Creation of the attribute LDA.explained_variance_ratio_, for the eigen solver.
Addition of a piece of docstring for the attribute
Correction of the name of a function.
Changing [ to (. Formatting consistency.
Modification of a line after @agramfort's suggestion.
draix (1):
PY3: replaced izip
dzikie drożdże (1):
PY3 file handling
edson duarte (1):
Fixed small typos in SVM examples.
eltermann (8):
Small documentation fix: from CMS to VCS
Fixed documentation typo
Fixed documentation on sklearn/tree
Replaced abbreviated 'w.r.t' to 'with regards to'
Fixed small typo
Doc fix - Compiled .pyx files with Cython 0.20
/s/2013/2014: Updated project copyright date
s/svn/svm
emanuele (1):
FIX: added logsumexp and nan_to_num to avoid underflows and NaNs
fcostin (9):
optimisations to Ridge Regression GCV
faster GCV for Ridge for n_samples > n_features
fixed tests to work with Ridge GCV
updated RidgeCV docstring and changelog
fixed bug with _values (thanks @mblondel)
fixed bug with > 1d y arrays
svd fails for sample_weights, use eig instead
coerce sparse matrices to dense before SVD
refactoring (thanks @GaelVaroquaux, @mblondel)
floydsoft (1):
fix the link of Out-of-core_algorithm
gpassino (1):
DOC fix kernels documentation inconsistencies
gwulfs (3):
PyMC
sklearn-theano
fix uri
hamzeh (3):
Implemented a check for ndim exceeding two in the utils.check_arrays function
Removed unnecessary variable n_dim in utils.check_arrays, removed unnecessary parens in same place
Trivial change in utils.check_arrays from > 2 to >=3 in attempt to rebuild Travis CI
hrishikeshio (1):
DOC dev guide: deprecation
isms (1):
DOC: Fix LabelBinarizer docstring to rst format bug
jamestwebber (3):
Update coordinate_descent.py
Fixed precompute issue (again) in ElasticNet and enet_path
DOC: removed duplicated Python 3 section
jansoe (1):
fix error in unwhitened case
jdcaballero (1):
Update plot_outlier_detection.py
jess010 (2):
updated AMI documentation for issue #2686
update ami doc for feedback on PR
jfraj (3):
Fixing bug in assert_raise_message #4559
Improving Bunch Class to ensure consistent attributes
Adding test to ensure bunch consistency after pickle
jnothman (18):
Merge pull request #4824 from amueller/testing_less_warnings
Merge pull request #4828 from untom/maxabs_scaler
Merge pull request #4853 from amueller/fix_ica_pca_example
Merge pull request #4859 from Titan-C/sidebar
Merge pull request #4862 from jnothman/fixlink
Merge pull request #4860 from rvraghav93/additive_chi2_sampler_reflink
Merge pull request #4840 from jnothman/dynamicgrid
Merge pull request #4934 from thvasilo/patch-1
ENH O(1) stop-word lookup when list provided
Merge pull request #4919 from rvraghav93/label_binarizer
DOC n_thresholds may be < no. of unique scores
Merge pull request #5094 from pv/fix-inplace
Merge pull request #5071 from mth4saurabh/fix-issue-3594
Merge pull request #5063 from amueller/bagging_input_validation
Merge pull request #5186 from sdvillal/issue5165
Merge pull request #5174 from jnothman/dbscan_precomputed_sparse
DOC what's new: DBSCAN sparse precomputed
DOC tweaks to what's new
john collins (3):
LOO is bad doc
updated for typos and a few wording changes
one last comment fix
kaushik94 (3):
Update supervised.py
Update supervised.py
ENH add sparse parameter to OneHotEncoder
kowalski87 (2):
fixed bug in mds.py
added not
leepei (2):
MAINT trailing whitespace in linear.cpp
Let nr_fold = l when nr_fold > l in CV
leonpalafox (1):
Change exception text when multiple input features have the same value from: "Multiple X are not allowed" to: "Multiple input features cannot have the same value"
maheshakya (58):
ENH: Implemented LSH forest
TST tests for LSHForest
ENH Added radius_neighbors method to LSHForest.
MAINT refactor LSH forest code
Updated docstrings and tests.
Used bisect in _find_matching_indices method.
Removed cached hash and replaced with numpy packbits.
Updated lsh_forest and test_lsh_forest.
Updated docstrings for tests.
DOC Added example for LSHForest.
FIX Updated tests. renamed lsh_forest to approximate.
DOC Updated the example.
COSMIT pep8 LSH code
DOC Updated the document of neighbors module.
FIX Completed tests for missed lines.
FIX Updated examples
Removed redundant ran_seed from GaussianRandomProjection Initialization.
Updated Approximate nearest neighbors in neighbors.rst
Inlined generate_hash into create_tree in LSHForest.
Updated test_approximate.
Re-written Approximate nearest neighbors documentation.
FIX Moved random_state to fit method to pass seed to create_tree.
FIX Updated plots in example.
DOC Included information about distance measure and LSH method in the documentation.
Added test to check randomness of hash functions.
FIX Added integer as random_state to test_hash_functions.
Updated example to avoid redundants.
Example splitted into hyperparameters and scalability.
FIX Fixed accuracy error in scalability example.
DOC Added new section-Mathematical description of LSH.
Created benchmark for approximate nearest neighbors.
Added scalability plots into documentation.
FIX Modified tests to handle cosine distances instead of euclidean.
Updated example with cosine measures.
Modified plots in hyper-parameters and scalability plots.
FIX Added user warnings for insufficient candidates.
Changed parameter descriptions of LSHForest.
FIX LSH: Fill up candidates uniformly from remaining indices
Added KNeighborsMixin, RadiusNeighborsMixin.
Added tests for graph methods.
FIX Fixed benchmark.
ENH Added error bars to scalability example plots.
FIX Fixed ordering of returned neighbors.
DOC Modified documentation of approximate neighbors.
FIX Fixed example.
ENH Modified hyperparameters example to calculate standard deviations of accuracies.
ENH Modified approximate neighbors doc string.
ENH Changed doc string and comments in test_distance function.
ENH Used GaussianRandomProjectionHash for handling hashing.
ENH Changed description of hyper-parameters example.
ENH Inlined _create_tree method.
DOC fix LSH forest docstrings
ENH Modified kneighbors and radius_neighbors methods.
FIX Fixed partial_fit.
TST improve tests for approximate neighbors.
FIX Modified assert_warns_message to ignore order of warnings.
DOC Modified math expressions to theoretical bounds.
FIX Written in Latex format.
martinosorb (5):
Implemented parallelised version of mean_shift and test function.
Change par system to joblib and n_jobs convention
Minor appearance changes
Trivial function name bug fixed
pep8 style
matrixorz (1):
fix issue #2901
mbillinger (4):
Added Testcase for LDA-Covariance
Fixed shrinkage covariance for LDA
Explicit broadcasting
Updated whats_new.rst
mhg (25):
lars_path method complemented with nonnegative option.
lars_path method complemented with nonnegative option.
Merge branch 'nonnegative-lars' of https://github.com/michigraber/scikit-learn into nonnegative-lars
"nonnegative" -> "positive" + cleanup
lars_path method complemented with nonnegative option.
"nonnegative" -> "positive" + cleanup
Merge branch 'nonnegative-lars' of https://github.com/michigraber/scikit-learn into nonnegative-lars
cleanup
todos for pull request
docstrings added + fit_intercept reintroduced.
cleanup
positivity constrained can now also be used for the CV estimators.
doctest fix.
positive option for Lars propagated to doc.
agramfort code review input
test lars_path positive constraint.
merge with upstream/master
fit_intercept docstring comment removed from lars_path method
positive option passed on to LarsCV estimator
tests of positive option for estimator classes
tests of positive option for estimator classes condensed.
pep-8 fix
tests for comparison of results for Lasso and LarsLasso added.
docstrings updated with considerations regarding small alphas when positive=True.
comparison test of Lasso and LassoLars under positive restriction refactored and commented
mr.Shu (10):
moved class_prior in NB to __init__
added deprecation warning to fit function
fixed docstring tests
fixed typos
added warnings
updated based on comments
fixed local variables
renamed the new parameter to class_wieght
fixed docstring test
DOC: COSMIT: Naive Bayes long url + PEP8 cleanup
murad (3):
added tests for sparse matrix inputs to BaggingClassifier and BaggingRegressor
open iris files with statement to avoid ResourceWarning
BaggingClassifer/BaggingRegressor tests for sparse input
nzer0 (1):
Documentation ERROR: mixture.DPGMM.precs_
pianomania (3):
add a example for FeatureHasher
add an example for FeatureHasher
try to mollify travis
popo (1):
MAINT some more pointer safety in LibSVM
pvnguyen (1):
Update outlier_detection.rst
queqichao (4):
+ Fix the bug of calling len when using sparse matrix for fit_params in cross_validation.py.
+ Fix the bug of calling len when using sparse matrix for fit_params in cross_validation.py.
Merge branch 'fix_bug_in_cross_validation_when_using_sparse_matrix_for_fit_params' of https://github.com/queqichao/scikit-learn into fix_bug_in_cross_validation_when_using_sparse_matrix_for_fit_params
+ Fix a typo.
rasbt (1):
small typo fix "radius" in rbf kernel
samuela (1):
Fix typo in the Gaussian PDF
santi (1):
ENH exposing extra parameters in t-sne
saurabh.bansod (1):
Fixes #3594.
scls19fr (2):
Update confusion_matrix docstring
Update confusion_matrix docstring
sdenton4 (4):
FIX: More robust test_extmath.py, compatible with older numpy
ENH: Compute lower number of points by default
Improved the learning curve example.
Fixed style errors detected by pep8.
sergeyf (8):
Update qda.py
Update qda.py
Missed a space!
Updating to ensure pep8 compliance
reg_param is a float
Update qda.py
Update test_qda.py
Update qda.py
sethdandridge (1):
Fixed error message typo
sinhrks (2):
DOC: Updated set_params doc
CLN: Added missing __all__
snuderl (1):
Small typo.
sseg (1):
Merge pull request #1 from scikit-learn/master
staubda (1):
Improved documentation of the "estimator_params" argument for RFE and RFECV.
swu (16):
Adding newton-cg solver for multinomial case.
Add comments to _multinomial_loss_grad_hess function.
Fixed pep8 errors. Added docstring for _multinomial_loss.
Fixing docstring on _multinomial_loss.
Fixing more pep8 errors.
Merge remote-tracking branch 'upstream/master' into multinomial_newtoncg
Removed some unnecessary lines and reformatted for consistency in _multinomial_loss and _multinomial_loss_grad_hess.
Modified parameter description in docstring of
Adding test cases for multinomial LogisticRegression using newton-cg
Modified hessp function in _multinomial_loss_grad_hess to compute r_yhat
Update whats_new.rst. Add contact information to logistic.py.
Modified doc/modules/linear_model.rst to incorporate the newton-cg solver
Minor doc and formatting changes.
Reformatted docstrings in multinomial functions. Refactored multinomial
Wrap _multinomial_loss_grad in lambda function for passing into lbfgs.
Modify shape check on coef matrix in _logistic_regression_path.
syhw (23):
travis config file
update travis config
put the requirements at the right place
added requirements to travis config file
Merge https://github.com/scikit-learn/scikit-learn
Travis CI cfg + status in README + sklearn requirements
with Ubuntu's scipy instead of pip's
with python-nose
removed requirements.txt from travis cfg
removed requirements.txt
changed the build image URL in README for after pull-merge
trying travis cfg with system-site-packages
Merge https://github.com/scikit-learn/scikit-learn into travis
nudging the digits dataset for BernoulliRBM example
TST added a 'fit [[0],[1]] + gibbs sample it' test for RBMs
replaced test_gibbs by a smoke test for NaNs
check for pseudo_likelihood clipping
COSMIT refactoring rbm
RBM example now verbose
squeezing logistic_sigmoid result only on 1D arrays
adding a test for sparse matrices in RBM
changing free_energy to private in RBM
added neural_network to setup
t-aft (1):
DOC grid_search_digits.py does not do nested CV
tejesh95 (1):
Update gaussian_process.py
terrycojones (1):
Added missing space to exception message. Simplified (trivially) code and corrected message. Updated tests.
tokoroten (5):
randomforest decrease allocated memory
refactoring to simplify
refactoring to DRY.
change variable name to something meaningful.
autopep8 E501
trevorstephens (30):
GradientBoostingClassifier docstring incorrectly specified default for max_features as "auto" when it is None.
various docstring fixes for web docs
add support for class_weights
add multioutput support & bootstrap auto mode
expanded class_weight dimension fix
add class_weight to trees, expand tests & minor refactor
parameter validation checks & tests for errors
Y-org rename & whats_new update
Merge branch 'master' into rf-class_weight
rename vars & copy sample_weight
Merge branch 'master' into rf-class_weight
rename cw option to subsample & refactor its implementation
fix nb partial_fit w class_prior #3186
fix rst table layout
add compute_sample_weight util
refactor forests & trees class_weight calc
use safe in1d, use compute_sample_weight in ridge
remove inplace multiplication in ridge
add test for None class_weights
fix LarsCV and LassoLarsCV fails for numpy 1.8.0
add gplearn to related projects
Pretty decision trees
update export_graphviz docs
unused params in gridsearchcv
add class_weight to PA cls, remove from PA reg
add sample_weight to RidgeClassifier
fixes #4846
deprecate fit params in qda and lda
fix rounding, adjust tests for 32 bit export_graphviz
OneHotEncoder warn fix
tttthomasssss (1):
fix for issue #4051; replaced X = np.asarray(X) with check_array(...) method
uber (1):
example yahoo stock issue fix
ugurcaliskan (2):
Update plot_rfe_with_cross_validation.py
Update plot_multilabel.py
ugurthemaster (8):
Update plot_digits_pipe.py
Update plot_tree_regression_multioutput.py
Update plot_forest_importances.py
Update plot_svm_regression.py
Update working_with_text_data.rst
Update plot_adaboost_regression.py
Update plot_tree_regression.py
COSMIT Move imports in example
unknown (9):
Added documentation for the Naive Bayes classifiers.
Added sparse MNNB and modified the textual examples to benchmark it.
Modified the Naive Bayes nose tests to the new location of the module and added sparse test.
changed wording in linear model docs about Normalized. It was frustrating me haha
compare residual_threshold directly with 0.0
if there are no inliers, raise ValueError
fixed newlines, comments
Fixed test to check for capital No
changed format string in no inliers exception to be 2.6 compatible
vstolbunov (2):
Updated logistic regression tests with sag solver
Fixed syntax and combined two test functions
wadawson (1):
Corrected two bugs related to 'tied' covariance_type in mixture.GMM(), added test, closes #4036
wangz10 (2):
fixed duplicated part in plot_roc.py
addressed jnothman's minor points.
x0l (6):
FIX error handling/memory deallocation in Liblinear wrapper
Fix a bug occurring when n_samples in a class < n_features
warning + log of prod -> sum of log
warning -> error
tests
ignore warnings, change docstring
zhai_pro (2):
FIX KMeans with fortran-aligned data.
TST: add a test for fortran-aligned data be used in KMeans.
Óscar Nájera (18):
A short cleanup
recovering deleted variable
gen_rst informs file with missing documentation
fix tab error
gen_rst to use css tooltip instruction
remove gallery css chunk from theme css
Remove JS from examples gallery
Examples gallery css file
fix gen_rst file extension
facecolor/edgecolor are passed arguments to savefig if they
sidebar hide on css
delete js of sidebar
A responsive behavior
Recover similar to old style
toggle shift to left larger vertical space
toggle contraction character
remove css for container-index that compensates old js behavior
css for container-index that compensates position of main content on index.rst and documentation.rst in small screens
-----------------------------------------------------------------------
No new revisions were added by this update.
--
Alioth's /usr/local/bin/git-commit-notice on /srv/git.debian.org/git/debian-science/packages/scikit-learn.git