[scikit-learn] annotated tag debian/0.11.0-1 created (now 1f5b11f)
Andreas Tille
tille at debian.org
Wed Dec 28 13:10:56 UTC 2016
This is an automated email from the git hooks/post-receive script.
tille pushed a change to annotated tag debian/0.11.0-1
in repository scikit-learn.
at 1f5b11f (tag)
tagging 7eb39fa0dc43ce485d3af2857c587811332eb148 (commit)
replaces debian/0.10.0-1
tagged by Yaroslav Halchenko
on Tue May 8 21:03:34 2012 -0400
- Log -----------------------------------------------------------------
Debian release 0.11.0-1
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iEYEABECAAYFAk+pwmYACgkQjRFFY3XAJMjhugCaAnDIuy9JV6QlEAJGXpVYlEiu
PL8AniLMbEUhCxR/BITaNA+xs2oX0kV5
=m66F
-----END PGP SIGNATURE-----
Adrien Gaidon (5):
FIX: typo for default init_size in MiniBatchKMeans
Added tests to check for the correct value of init_size
FIX: make GridSearchCV work with precomputed kernels
raise ValueError when given a kernel_function or a non-square kernel matrix + some tests
Fixed a small typo
Alexandre Gramfort (82):
STY: mostly style + avoid a zip in favor of an np.argsort
STY : in label_propagation.py
ENH : using numpy broadcasting instead of dot_out
ENH : reformatting hmm_stock_analysis.py examples
MISC : typos in hmm_stock_analysis.py
ENH : rename hmm_stock_analysis.py so it appears as a figure in the doc
ENH : make metrics.auc work with 2 samples + add test
Merge pull request #591 from jaquesgrobler/doc_update
fix with new as_float_array
STY: pep8
mv randomized_lasso.py randomized_l1.py
ENH : some doc + renaming in RandomizedLasso
ENH : better plot_randomized_lasso.py with score path
ENH : prettify plot_randomized_lasso.py
ENH : creating lasso_stability_path func + adding tests on randomized_l1
ENH : add docstring to RandomizedLogistic
FIX: fix test_randomized_logistic
STY: s/a/scaling + adding docstring
DOC : adding doc for Randomized sparse linear models + fix test
ENH : adding sample_fraction to lasso_stability_path + add to doc
typos
cosmit in doc + pep8
cosmit in doc
ENH : addressing @ogrisel comments (PEP257, naming, see also)
DOC: rephrase rand linear model doc
ENH : fix docstrings + add func missing reference
ENH : center y too in _randomized_lasso
ENH : adding support for multiple regularization parameters in RandomizedLinearModel
MISC: removing one XXX
ENH : early stopping in lasso_stability_path (faster)
ENH : fix legend of plot_randomized_lasso.py
pep8
API: set scale_C to True by default in libsvm/liblinear models
update what's new
DOC : add warning in docstrings for scale_C gone in 0.12
DOC: indent pb
DOC: update scale_C docstrings + add notes to svm.rst
ENH : use not(scale_C)
remaining docstring to be updated
update docstring with WARNING
TST: use assert_true instead of assert + remove some relative imports
FIX : fix SVM examples with new scale_C=True
FIX : fix ward benchmark
Merge pull request #654 from GaelVaroquaux/enet_cv
Merge pull request #679 from amueller/logistic_l1_l2_sample
API: use C=None by default in libsvm/liblinear bindings so (C=1, scale_C=False) which is libsvm default == (C=None, scale_C=True) which is the scikit default
FIX : remove useless C definition in non-fit methods
ENH : adding scaled_C_ attribute
Merge pull request #699 from njwilson/issue-250
TST : add test on ridge shapes for different y shapes
TST : add failing test to reproduce #708
FIX : fix test for #708
FIX : fix test failing with OMP
ENH: y_mean with consistent shape in _center_data
FIX : prevent ICA with defined n_components and whiten=False (fix for #697)
TST: capture warning in test
FIX : use joblib from externals
Merge pull request #728 from satra/fix/f_regression
ENH : speed up f_regression
FIX : array copy for compat pb
FIX : missing self.copy = copy in PLS GH Issue #758
cosmit : consistent linestyle in plot_lasso_coordinate_descent_path
ENH : add duality gap check with Lasso(positive=True)
Merge pull request #747 from ibayer/posCoeff
Merge pull request #773 from amueller/forest_pre_dispatch
Merge pull request #782 from jaquesgrobler/Update_Changelog
Merge pull request #783 from dwf/svm_docs_minor
change web site for agramfort
FIX : fix SVC pickle with callable kernel
cosmit
FIX : callable kernel for prediction
FIX : sparse SVC clone with callable kernel
Merge pull request #796 from amueller/kmeans_dtype
Merge pull request #814 from invisibleroads/master
Merge pull request #813 from invisibleroads/patch-1
FIX : make plot_ica_vs_pca.py deterministic (fix for #815)
Merge pull request #802 from amueller/arpack_backports
typo
fix for #824
DOC : update SVM examples with scale_C
API : change back default C to 1. explicitly and epsilon 0.1
FIX : svm decision function test
Andreas Mueller (284):
ENH liblinear: cythonized sign switch for n_class<=2
ENH liblinear: get rid of n_class sign by switching class signs in liblinear implementation.
COSMIT typo
whatsnew: gave myself some credit
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Merge branch 'master' into svm_coef_sign
FIX adjust _set_coef_ and _set_intercept_ to sign switch
ENH DenseBaseLibSVM.coef_ correct. test simplified.
DOC try to document layout of dual_coef_ in multiclass libsvm
DOC fixed errors in load_images doc and SKIP'ed load_image doctest as was already the case for load_images
DOC: OCD and added image loader to class reference
DOC Trying to enhance the tree/forest docs. Headlines in tree, added reference, hopefully better description of 'min_density'.
DOC layout of dual_coef_ in 1vs1 svm in user guide, example
DOC fixed indices in dual_coef_ example
COSMIT factor out 1vs1 coef construction in libSVM, PEP8
DOC added RidgeClassifier to References
DOC fixes in Multiclass docs. Didn't show correctly on web.
DOC multi-class narrative: added links to the references, made citation clickable
ENH trees in random forests save the indices of the training data used in bootstrap sample
ENH Add function to predict on left part of training set
ENH use self.classes_, check input on predict_oob, add test
DOC Out of bag error estimates in grid_search module
COSMIT @glouppe says this is more pythonic :)
DOC reformulation out of bag error
COSMIT in doc: @ogrisel's remarks
ENH oob score as attribute, not separate function.
ENH: added oob_score_ and oob_prediction_ to regression ensembles
FIX copy/paste error. guess it was too late
ENH made oob_score an ``__init__`` param as suggested by @agramfort
DOC what's new, minor doc improvements
Merge pull request #571 from amueller/tree_indices
ENH: Replace asserts by appropriate errors. Fixes the rest of issue #570.
COSMIT how I love these sphinx errors
DOC complicated objects as parameters confuse sphinx and the reader. Fixes issue #567.
ENH: Default in Vectorizer "None" as @ogrisel suggested
DOC website: added link to 0.10 docs under support.
DOC added required versions of python, numpy and scipy to install documentation. Closes issue #579
COSMIT pep8
COSMIT removed unused imports
Merge branch 'master' into svm_coef_sign
DOC comment in linear.cpp
DOC @ogrisel's suggestion: putting a link to pull request in liblinear.cpp
COSMIT pep8
DOC fixed doc errors in metrics module
COSMIT removed unused imports
Merge pull request #546 from amueller/svm_coef_sign
FIX RandomizedLogisticRegression test import
COSMIT removed unused import
DOC fix sphinx errors
DOC more fixes in Docs
DOC cluster metrics: fixed see also sections, errors in references section.
COSMIT pep8
FIX SGD loss example for new hinge loss.
FIX lasso_dense_vs_sparse_data.py example needed update.
COSMIT pep8
DOC add cross_val_score to references, OCD.
FIX bug in text feature extraction, issue #606
COSMIT pep8
DOC fix sphinx errors
ENH: moved class_weight parameter in svms from fit to ``__init__``.
MISC Adjusted class_weight param in examples, fixed legend in unbalanced dataset examples.
DOC typos.
MISC reinserted class_weight as fit parameter, added deprecation warning.
MISC cleanup
DOC margin for old warning wrapper fixed
MISC Deprecated class weights in SGDClassifier
Merge pull request #578 from jakevdp/old-version-warning
pep8
COSMIT pep8
COSMIT get rid of warning in nosetests for equidistant neighbors. it's intentional.
MISC more sensible NMF test.
COSMIT pep8 wooops thanks @ogrisel
MISC forest tests: boston faster, probability test faster and no warning.
MISC decision tree test faster and no warning
COSMIT simplified error message checking, remove deprecation warning.
MISC more iterations for test_lasso_path. Still runs in <.1s, gives no warning and more accuracy.
MISC more iterations also for test_enet_path, same runtime as before, no warning.
COSMIT pep8
COSMIT pep8
FIX added missing import
MISC added warning to coordinate descent if alpha=0, don't call cd with alpha=0 in tests.
MISC replaced deprecated mean_square_error in test.
MISC test for warnings as @ogrisel suggested.
Merge pull request #620 from amueller/coordinate_decent_alpha_warning
add min_leaf (minimum size of leaf node) to decision tree
ENH min_leaf for ExtraTree
ENH added test for "min_leaf"
ENH set min_split if min_leaf is set.
DOC add load_svmlight_file to references
DOC minor fixes and typos
DOC more rst fixes....
DOC typo in whatsnew
Merge branch 'master' into svm_class_weights
DOC renamed duplicate label
FIX flip sign in decision function of LibSVM in binary case.
MISC renamed min_split and min_leaf to min_samples_split, min_samples_leaf, added them to the ensemble classifiers and documented them....
FIX OneClassSVM decision function sign.
ENH more elaborate one class svm testing....
MISC address @mbondels comments
MISC simplified test
FIX one class test, added more decision function tests.
COSMIT pep8 + "leafs" typo.
DOC Added changes to decision functions and coef_ to whatsnew
MISC don't use deprecated mean_square_error
Merge branch 'master' into svm_class_weights
Merge pull request #610 from amueller/svm_class_weights
COSMIT pep8
FIX whooops sorry
DOC Insert hidden toctree, mv "included" files from rst to txt
MISC Issue #639. Remove unused member types in linear_model CVs
DOCs change extension from txt to inc, add inc as doctest extension to makefile
MISC verbosity parameter for forests: better control over tree building.
Merge pull request #641 from amueller/doc_fixes
FIX dataset docs: changed suffixes in include to match rename.
DOC fixed inconsistent titles. sphinx didn't like them and didn't show these sections.
MISC @ogrisels comment about human-parsable counting
Merge pull request #643 from amueller/forest_logging
DOC C is pretty large now...
MISC class_weights constructor parameter in RidgeCV
DOC doc fixes
MISC added removal version for scikits.learn deprecation warning.
MISC remove ball_tree and cross_val namespaces
MISC scikits.learn removal at .12. I'm not so good at counting, sorry.
Merge pull request #660 from amueller/remove_namespaces
COSMIT renaming scikits.learn to sklearn in some places
COSMIT pep8
MISC Update all the other deprecation warnings that I forgot.
FIX: class_weight only in classifier Ridge classes
DOC Documentation for RidgeClassifierCV
DOC add removed docstring.
COSMIT pep8
ENH Added tests and fixes
DOC remove "for dense data" heading for SVM classes
Merge branch 'master' into linear_model_class_weights
DOC document classification plot
MISC removed deprecated api from examples
WEBSITE: make example gallery look even better!
DOC added reference to r2 score
ENH rename parameter "multi_class" of LinearSVC to "crammer_singer", add docs, add tests
FIX forgot doctest
DOC minor addition to SVM kernel parameters
DOC more readable make_friedman docs....
Merge pull request #649 from daien/GridSearchCV_precomputed_kernel
COSMIT don't use deprecated names
ENH new samples generators for classification and clustering. Refactored label propagation example a bit
ENH cluster comparison example (starting)
Merge pull request #669 from amueller/example_gallery_css
ENH added "shadow" parameter class_weight_ as @ogrisel suggested.
MISC changed parameter name back but changed semantics, as @mbondel suggested.
COSMIT pep8
DOC added one more sentence about crammer-singer
COSMIT typo. thanks @ogrisel.
DOC crammer_singer docstring by @ogrisel
ENH clustering example with spectral clustering and ward with connectivity. looking better now, still not perfect.
FIX broke label_propagation example, now fixed it again.
Merge branch 'master' into sample_datasets
Merge pull request #673 from amueller/crammer_singer_rename
DOC add new dataset generators to class reference
WEBSITE: another css enhancement to give figures a max width.
DOC move references from Notes to References section in docstrings
MISC simplified kpca example with new dataset generator, another minor fix in generator
DOC lasso/enet regression example with coefficient plots, corrected r2 score
DOC Basic docstrings for LDA and QDA classes
DOC lda/qda examples: remove redundant example, prettyfied other.
DOC Added QDA to references, narrative docs, improved docstrings
COSMIT newline in LDA doc
DOC explanation for plot in lda/qda narrative
MISC use Gaels pretty plot, add dbscan, normalize data...
COSMIT cleanup, pep8
ENH issue #661, plus some renaming and minor cleanup
MISC forbid mle initialization of PCA for n_samples < n_features
DOC added clustering example to the docs
COSMIT make plot look more like other coef plots
COSMIT removed debugging print
MISC added xlim and ylim for @ogrisel's weird matplotlib ;)
ENH fixed seed, added center positions
Merge pull request #674 from amueller/sample_datasets
FIX minor doc fixes
DOC add link to narrative in lda and qda references
DOC add ``estimate_bandwidth`` utility for MeanShift to the references and narrative
MISC make Ward check if input is sparse.
MISC make Ward test if connectivity is a valid connectivity matrix.
COSMIT changed error message for Ward
DOC another coefficient plot
COSMIT Adjust title for example gallery
ENH 2d plot for l1l2 digits example
COSMIT last try to make my plot pretty....
BUG fixed error that I introduced earlier: connectivity can also be `None`
DOC fixed reference to an example (that I also broke before)
Cosmit typo
FIX plot example fix for old matplotlib, so that it shows on the website.
Merge branch 'master' of github.com:amueller/scikit-learn
COSMIT make cross_validation nosetest slightly more readable and more pep8 respecting
FIX make class weight nosetests work
FIX get rid of some doctest errors (with the stricter nosetester)
ENH refactoring of dot-file export
COSMIT comments
COSMIT minor visual enhancement
ENH: don't fail on "yeast" dataset
Merge pull request #711 from davidmarek/sparse_pca
DOC Added clustering functions to references.
Merge pull request #685 from ibayer/master
ENH local variable in ``fit`` instead of modifying the estimator parameters. thanks @GaelVaroquaux
DOC: Added EllipticEnvelop to the References
DOC added reference for EllipticEnvelop and fixed some sphinx errors.
FIXed nosetests. Thanks @pprett
Merge pull request #707 from amueller/graphviz_dot_refactoring
Merge pull request #648 from amueller/linear_model_class_weights
COSMIT Typo
COSMIT pep8
DOC sphinx/rst errors
DOC Believe it or not - this fixes the annoying sphinx error. And don't dare to
COSMIT minor fixes to docs
COSMIT fixed references to covariance.EllipticEnvelop in docs
COSMIT pep8
DOC correct links to face recognition example, take care of trailing underscores.
COSMIT pep8
ENH grid_search forgets estimators
DOC slightly better docs for ``refit``, document ``best_params``.
FIX clone base_clf before setting params.
FIX messed up something in the short cut method.
ENH pre_dispatch for foresters
FIX redundant code is redundant
COSMIT add todo comment to grep
Merge pull request #770 from amueller/oblivious_grid_search
Revert "Merge pull request #773 from amueller/forest_pre_dispatch"
COSMIT don't use deprecated attributes in tutorial.
COSMIT pep8
FIX don't use parameters to fit in GMMHMM.
FIX don't use Python 2.5 method of checking for warnings
MISC Don't warn on equidistant on iris. iris has duplicate datapoints.
FIX don't use fit parameters in grid_search test
ENH convert X to float in k_means predict.
MISC don't use private ``set_params`` method as that raises a warning.
MISC don't use iris in testing as it has duplicate data entries. Add some noise to simple examples.
MISC added note that we need better tests
DOC typo
ENH check if backport of sparse scipy ARPACK is needed. The backport breaks with scipy 0.11
COSMIT cleanup + pep8 in examples
COSMIT + MISC pep8, pyflakes, typos and some other cleanup of examples.
Merge pull request #800 from amueller/less_neighbors_warnings
FIXed pca example that I broke when "cleaning up"
ENH checked for scipy version
ENH add ``decision_function`` to ``Pipeline``
ENH joined tests for less duplication, checked shapes as @ogrisel suggested.
FIX we need to do "LooseVersion" to support dev/git versions of scipy
COSMIT pep8
COSMIT make test more explicit
COSMIT removed unused "verbose" option in dbscan
COSMIT removed unused import in test
FIX copy/paste error
FIX removed verbose also from main DBSCAN class
COSMIT dbscan test doesn't use fit params
COSMIT typos by `git grep independant`
MISC removed unused lines, see #666.
COSMIT rst in example
ENH adjusted examples to new matplotlib 1.1.1
MISC don't use ``set_cmap``
MISC use logsumexp in DPGMM for less warnings
FIX typos in examples
FIX one more example
MISC trying to remove scale_C
MISC forgot two
DOC docs and examples have scale_C removed
FIXed many tests
DOC some doc corrections
ENH remove duplicate definition of "assert_lower" in tests
FIX ditto (numbers are too random)
ENH backport "assert_less" and "assert_greater", rename "assert_lower" and use it everywhere :)
ENH rename out_dim to n_components in manifold module
FIX assert_greater message
DOC Added pipeline user guide
ENH use random states everywhere, never call np.random.
FIX don't do anything in the __init__
WEB Added page with links to various tutorials/presentations on scikit-learn
DOC added some explanation to video page
ENH added random_state to Gaussian Process
FIX testing: random state problem in forest testing.
DOC minor fixes to rst and image paths
DOC banner 14 duplication?
DOC more minor fixes
DOC fix last docstring error. Don't remove redundant docstring. I dare you, I double dare you mother******!
RELEASE 0.11
COSMIT typo in whatsnew
Bertrand Thirion (50):
Variable renaming and docstring fixing
merged with master logsum -> logsumexp
ENH: renaming estimated variables from self._variable to self.variable_
removed the decode
removed the decode in dpgmm and removed return_log in eval
ENH: Cleaned after rebase and compatibility with hmm
ENH: Removed X and z variables from dpmm class (should not ship the data)
ENH: avoid initializing GMM means with zeros
ENH: more sensible initialization in case of divergence
BF: Mended the tied covariance estimator
ENH: added multiple initialization to the GMM -- untested
FIX: fixed collateral damages in hmm
added some tests to ensure that GMMs work in about all conditions
ENH: renaming cv_type and posterior to more explicit name + tested multiple init
avoid changing the covariance when computing the Gaussian density
FIX: Fixed a bug I introduced in dpgmm
ENH: Added AIC/BIC + tests. Seems to work
Cosmit in dpgmm
merged with master
Changed the shape of spherical covariance matrices to be equal to diagonal covariance matrix, in order to avoid handling the dimension in particular
Merge branch 'master' of github.com:scikit-learn/scikit-learn into gmm-fixes
detail fixed in an example
Hopefully clarified notations in dpgmm
Many corrections in dpgmm to remove unnecessary loops (significant speed-up) + renaming
Fixed an example that happened to fail
Several details outlined by Jake
handled the eval on Null data
merged the master repo
Added an example with model selection
Oups: really added an example with model selection
ENH: Removal of properties from GMM -- unfinished
removed properties from dpgmm
replace log_weights_ by weights_, which makes the API more consistent
Getting rid of properties in hmm, gmm, dpgmm
fixed a doctest
ENH: Some cleaning in the examples
ENH:pep8
ENH: enforcing skls conventions
A pass on the docs
corrected the doc for dpgmm
removed get_means, set_means, get_weights, set_weights
ENH: renamed plot_gmm_model_selection.py to plot_gmm_selection.py
Fixed the doctests in hmm
COSMIT:pep8 in hmm
Corrected the docs
ENH: changes in the code to fulfill Gaels requirements
Merge branch 'master' of github.com:scikit-learn/scikit-learn into gmm-fixes
ENH:Added back rvs as deprecated and updated whatsnew.rst
ENH: fixed the GMM docs
ENF changed INF_EPS to EPS in hmm too.
Brandyn A. White (1):
Fixed docstring to reflect current code in precision_recall_curve.
Carlos Scheidegger (1):
BUG: missing subpackage svm/sparse on setup.py. fixes issue #559
Charles McCarthy (2):
Fixed data.filenames consistency issue when 'all' specified for 'subset'.
Added basic test for filenames consistency when all specified.
Claire Revillet (1):
- fix missing links to the C math library
Clay Woolam (103):
added label propagation class
switch map and sum commands to numpy
fixing up tests, adding "unlabeled_identifier"
basic features of multiclass labeling up
fixing the way labeling works
checking in minor changes
added documentation, reworking tests
fixing up tests
added a lot more to label propagation, explained algorithms and differences between the two models
more documentation
added beginning of examples
added "structure" example
tweaked structure plot
finalized SVM comparison example
all tests pass
removed some stuff from documentation
updated pydoc to make behaviour clearer
passed PEP8, using already implemented kernel functions
making everything more numpy compatible
graph construction and example more numpy-like
fixed other diagonal matrix construction
rename misnamed "plot" example
example conforms to pep8
other example conforms to pep8
made test conform to pep8
predict() method now numpy friendly (100% numpy friendly now)
more numpy integration
removed function kernel, switched to string for picklability
fixed a bug in the circle example
moved label propagation examples to lower subfolder
more numpy friendliness
more numpy use,
fine tuned some documentation
added a snazzy label propagation versus SVM decision boundary plot
added more explanation to the plot
added semi_supervised directory
removed old, useless code
removed unused imports
added more documentation, another doctest for LabelSpreading
minor tweaks to the overall layout of the code
reverted plot_iris accidental commit
added unlabeled_identifier explanation to docstrings
Merge remote-tracking branch 'upstream/master'
fixed indentation problem in documentation rst
conformance to pep8
fixed bug in tests causing gram matrix construction to not work properly (assumed casts to floats)
added two new examples, including an active learning demo with label propagation
heavily downsampled digits examples (runtime a few seconds now) and removed supporess_warrnging bug
changed doc to remove long running time warning
renamed active learning example so it won't be run for doc compilation
changed subplot titles so the plot is more clear
fixed structure example
added vene's subplot adjustments
Merge branch 'new_lp'
made convergence check function private
fixed spelling error with variable name (indicies -> indices)
optimized _build_graph with inplace methods, conform to standards with variable names
one more optimization! avoids cast to numpy matrix and does in place matrix multiplications
fixed test cases to conform to api changes & new internal parameters
updated docs!
Merge git://github.com/scikit-learn/scikit-learn
localized a variable
fixed test suite, changed module to conform to new sklearn naming scheme
fixed examples for new naming scheme
merged ogrisel's docs & optimization, also fixed active learning example plot
changed a bunch of variable names, fixed some test cases
all code works great, all tests pass, full coverage
changed a variable name to conform to scikits code
correct variable names and added inline comments for active learning examples
added attributes text to explain named attributes
Merge branch 'master' of git://github.com/scikit-learn/scikit-learn
Merge branch 'master' of https://github.com/scikit-learn/scikit-learn
added support for sparse KNN graphs and tests
finishing up sparse additions (need to complete todo)
sparse KNN graphs now work
ENH add label propagation algorithm
finalized KNN work, all tests pass properly
Merge branch 'larsmans-label-propagation'
removed extra semisupervised folder
polished the lp & test code
Merge branch 'master' of https://github.com/scikit-learn/scikit-learn
Merge branch 'master' of https://github.com/scikit-learn/scikit-learn into label-propagation
variable name changes, using premade functions, doc fixes as per
variable name changes, doc corrections
removed unlabeled_identifier, updated tests and examples to reflect this
corrected example that still referred to unlabeled_identifier
optimization that stores the spatial index when using knn graphs
updated rst docs with kernel information
shuffled digits example, added sensible point colors to plot chart,
docs describe the different kernels available in techniques
TL directory change to push label propagation code into semi_supervised
added __init__.py file to semi_supervised folder
Updated docs for label propagation, added more technical details about
specific fine tuning to the label propagation docs
doc updates & tweaks
fixed typo in test code
added AISTAT ref to docs
added AISTAT ref to rst doc
fixed bug causing error on sparse input data
corrected the documentation and add semi-supervised section to the user
placed semi-supervised under supervised learning techniques in user
Merge remote-tracking branch 'upstream/master'
fixed error in graphviz export code causing graph error raised with
Conrad Lee (1):
cross_validation.py: fixed bug in text of error message
David Marek (17):
fixed SparsePCA.transform returning NaN for 0 in all samples. (fixes #615)
Added test for SparsePCA.transform (checks #615)
ENH: Added p to classes in sklearn.neighbors
TEST: tested different p values in nearest neighbors
DOC: Documented p value in nearest neighbors
DOC: Added mention of Minkowski metrics to nearest neighbors.
FIX+TEST: Special case nearest neighbors for p = np.inf
FIX: pep8
ENH: Use squared euclidean distance for p = 2
ENH: train_size and test_size in ShuffleSplit (#721)
TEST: Added more tests for ShuffleSplit
TEST: Tested ShuffleSplit with different types of test_size
Changed deprecation warning.
DOC: Added changes in ShuffleSplit and sklearn.neighbors
Error checking now works for more types than just int and float.
Use numpy dtype.kind instead of isinstance
TEST: assert_equal instead of assert
David Warde-Farley (4):
Fix math notation for exp and tanh.
Add pointer to kernel equations from SVC docstring.
Rephrased narrative doc reference in docstring.
Added RST comment about where to find narrative docs.
Edouard DUCHESNAY (1):
Check that scikit-learn implementation of PLS provides exactly the same outcomes
Emmanuelle Gouillart (5):
Example on tomography reconstruction with Lasso for the gallery.
COSMIT: PEP08
Tomography example: PEP08, typos...
Reference to tomography example in narrative doc
ENH: a few typos in docstrings
Fabian Pedregosa (11):
Start of 0.11 development cycle.
Mailmap alias
And the winner is ...
DOC: links for people that have webpage.
DOC: some documentation fixes.
DOC: docstring update for dump_svmlight_file
Refactor in KFold.
Set the download link to PYPI.
FIX: bug in DenseBaseLibSVM when subclasses implement new params
FIX: inheritance in DenseBaseSVM
Add Satra to the AUTHORS list.
Gael Varoquaux (200):
DOC: start to merge statistical learning tutorial
MISC: species distribution example plotted
ENH: better error messages
MISC: shorten a bit the description
DOC: fix image
DOC: layout
DOC: random selection of frontpage images
DOC: compress a bit the layout
DOC: shorten a bit the front page
DOC: avoid imgs taking 2 lines
DOC: Add a few images to the banner
DOC: fix wrong link
DOC: avoid line return
ENH: get the murmurhash to build properly
DOC: prettify ensemble docs
BUG: restore score functionality in grid_search
ENH: refit now works in the GridSearchCV
FIX: MurmurHash3 compilation on older GCC
Cosmit: remove unused imports
MISC: fix bibtex
Merge pull request #588 from jakevdp/balltree-fix
ENH: make LassoLarsIC more reproducible
BUG: fix test_precision_recall_curve
ENH: Add randomized lasso
ENH: randomized_lasso example: multiple alpha
Better randomized_lasso
Jackknife in randomized_lasso
Add a randomized logistic
COSMIT: pep08
ENH: Add pre_dispatch to RandomizedLinearModel
ENH: RandomizedLinearModels transformers + memory
BUG: fix broken merge
MISC: inherit from BaseClassifier
BUG: parameter was not set right
DOC: Improve feature selection docs
DOC: try to improve randomized lasso example
ENH: numerical stability in LassoLarsCV
DOC: update docstring
ENH: grid in terms of alpha/alpha_max
DOC: nicer path
DOC: beautify feature_selection docs
DOC: cross-reference linear_model and randomized_lasso
DOC: enrich example docstring.
DOC: better example for randomized lasso
MISC: make sure two figures hold on a line
DOC: example and docs for randomized-lasso
MISC: address @ogrisel and @mblondel's comments
Cosmit
MISC: add randomized linear models to what's new
BUG: make clone work on 2D arrays
TST: add a test for bug fixed in previous commit
COSMIT: make the plot landscape
DOC: improve the label_propagation docs
COSMIT: authorship and licensing info
Cosmits
DOC: minor rmk on label_propagation
TEST: assert -> nose.tools.assert_equal
Merge branch 'label-propagation'
BUG: fix typo in tests
DOC: update whats_new
BUG: fix tests under numpy 1.5
TEST: add a test for whitening in ICA
PEP8
ENH: control random state in ICA
BUG: SVM raw_coef_ must be fortran ordered.
MISC: cosmit: use subpackage setup.py
DOC: reorganize GMM docs
DOC: reorganize GMM docs
DOC: more examples for DPGMM
Cosmit
MISC: remove custom __repr__
Merge branch 'master' of github.com:scikit-learn/scikit-learn
BUG: fix doctests
ENH: optim hierarchical: heapq in tree traversal
ENH: hierarchical: speedups in tree cut
MISC: clean up old c file
MISC: assert -> raise ValueError
BUG: typo
MISC: fix broken link to example
ENH: parallel in lasso_stability_path
API univariate_selection: _scores -> scores_
ENH: update joblib to release 0.6.2: bugfix
Merge pull request #613 from bwhite/patch-1
MISC: remove joblib from .gitignore
BUG: add missing file in joblib
Merge pull request #601 from agramfort/scale_C_true
BUG: follow API change in example
ENH: update joblib
Merge pull request #603 from jakevdp/GPML-fixes
Merge pull request #637 from fannix/fix
ENH: optim in ward_tree
Cosmit
BUG: ShuffleSplit should give reproducible splits
ENH: small speedups in coordinate descent
Revert "ENH: small speedups in coordinate descent"
ENH/FIX: in graph shortest path
Faster hierarchical cluster for very dense trees
ENH: Add the ability to set rho by cross-val
ENH: store the path for rho in ENet
BUG: fix tests and reorganize code
ENH: draft of parallel CV in elastic net
TEST: setting rho with ElasticNetCV
DOC: document ElasticNetCV
MISC: cosmit to please @agramfort
BUG: Same MSE scaling for LassoLarsCV and LassoCV
TEST: better tests of LassoCV and LassoLarsCV
DOC: add a link to Gohlke's 64bit windows binaries
DOC/TEST: HMM fix doc layout and doctest
ENH: Add controlled random_state in HMMs
DOC: prettify HMM sampling example
Cosmit
COSMIT: underscores are better than unseparated words
TST: fix trivial bug and control the rng
MISC: fix the random number generators
Merge branch 'hmmc'
TEST: fix doctest on non 64bit boxes
COSMIT: readability
TEST: Fix cross_validation tests
BUG: fix cross_validation on numpy 1.3
Merge pull request #709 from ibayer/cleanExamples
Merge pull request #705 from agramfort/fix_ica
MISC: better verbosity in lars
DOC: more visible version remark
ENH Ward: better behavior for non-fully-connected graphs
ENH: Don't modify connectivity unless specified
DOC: affinity-propagation in clustering comparison
DOC: add clustering example on front page
Merge pull request #726 from emmanuelle/doc_correction
ENH: summary table on clustering
DOC: better clustering comparison table
DOC clustering comparison: link table and figure
MISC: tweak example layout
DOC: finish table to compare clustering
Merge branch 'WIP_tut'
DOC: Better narrative for DBSCAN
DOC: finish misc in tutorial
BUG: no plotting in doctests
COSMIT: layout tweak
Redo CSS layout killed by commit 94088b81
BUG: fix doctests
Merge pull request #730 from jaquesgrobler/rename_EllipticEnvelope
DOC: timings in cluster comparison example
COSMIT: prettier plot
Merge pull request #733 from jaquesgrobler/master
DOC: misc wording
TEST GNB: test that class_prior sum to 1
Merge pull request #751 from jaquesgrobler/master
DOC: Manhattan distance == l1 norm
BUG fix LinearSVM doctest
MISC: verbosity in SVMs
ENH: use warning.catch_warnings
ENH: neighbor warning always raised
API: n_test -> test_size in Bootstrap
COSMITs on GGM
TEST: Fix doctest
Cosmit: comment on 'clever' code
Warn: Passing params to fit is deprecated
DOC: testing without sklearn.test()
COSMIT: macports package name
COSMIT: better warnings
ENH MiniBatchKMeans: increase init_size for large k
DOC: better description of init_size
DOC create example section for datasets
DOC title for the tutorial examples
EXMPL: fix legend in sgd sample weights
COSMIT we no longer support Py 2.5
COSMIT simplify a bit examples
DOC: restructure what's new
BUG: explicit adding of libm at build
BUG test_oneclass_decision_function: fix RNG
COSMIT: no capitals outside of class names
COSMIT: remove print
BUILD: add libm only on posix systems
MISC: simpler faster code with vectorization
SPD: Minor speedups
SPD: minor speedups
FIX: handle deprecation with estimator API
BUG: fix assert_greater/assert_lower
BUG: fix assert_greater
BUG: fix doctests
DOC: cosmits in docs
COSMIT: only classes should have capitals
ENH: make LinearSVC copyable
TST: do not raise warnings in sklearn.test()
BUG: fix testing on older numpy
DOC: cosmits on tutorials and videos
DOC: wording of whats_new
BUG: use permutation rather than shuffle
CLEAN sparse_encode: remove unused arguments
ENH: avoid an underflow
Revert "ENH: avoid an underflow"
DOC: instructions on testing
DOC: faster and more meaningful example
ENH: prevent multiprocessing in tests under Windows
DOC: avoid 2 rows of images
DOC: more readable title
DOC: Feature extraction vs feature selection
DOC: image to graph utilities
ENH: update joblib
BUG: remove n_jobs=-1 from examples
Gilles Louppe (17):
DOC: typo
2011 -> 2012
Merge pull request #627 from amueller/min_leaf_cherrypick
Merge pull request #684 from clayw/graphviz-fix
PEP8
ENH: move _compute_feature_importance into Tree
ENH: Use DTYPE instead of float64
Cosmit
ENH: Moved _build_tree into Tree
Cosmits + Fix to a test
Revert "ENH: Use DTYPE instead of float64"
FIX: return; instead of return NULL;
FIX: avoid dividing by zero in Tree.compute_importances
ENH: parallel computation of X_argsort
ENH: better argsort
ENH: cosmit and doc
Merge pull request #761 from glouppe/master
Immanuel Bayer (30):
Test added for multiple-outcome:
bugfix: lstsq coefficients output needed to be transposed
fixed spelling error
docstring updated and list append replaced with
consistency
spelling
pep8 errors fixed
pep8 errors fixed
parallelized
parameter n_jobs added
BugFix, matrix was not flagged as sparse.
cleaned some examples
compat for sp_linalg.lsqr
test for positive constrained lasso added
positive constrained option for lasso added
lasso docstring update
remove commented-out lines
wording
example for lasso with positive constraint
renaming
reset wrongly committed file
use scikit function to make train test split
set w[ii] = 0 if tmp > 0
- changed parameter from positive_constraint to positive
indent
add examples for positive constraint lasso and enet
merged into plot_lasso_coordinate_descent_path
fix doctest
fixed doctest
Merge pull request #1 from agramfort/posCoeff
Jake VanderPlas (26):
small simplification in LDA
add old version warning
add newline at file end
turn off old version warning
add random_state to LocallyLinearEmbedding
initialize indices and distances in balltree
check random state in _fit_transform
Address Issue #590 : use relative path link to about.html
Merge commit 'upstream/master'
ball_tree: more efficient array initialization
add info about valgrind to dev documents
Current version -> Latest version
Merge commit 'upstream/master' into old-version-warning
set warning margins to zero
allow for multiple nuggets in gaussian process
example + documentation of gaussian processes on noisy data
Merge commit 'upstream/master' into GPML-fixes
DOC: expand nugget explanation; combine two GPML examples
Merge pull request #6 from amueller/old-version-warning
fix link in warning
latest version -> latest stable version
BUG: fibonacci heap implementation
TEST: non-regression test for fibonacci heap bug fix
Generate c-code with cython 0.15.1
ENH: use shift-invert in spectral clustering
add detailed comment on ARPACK usage
Jan Hendrik Metzen (1):
FIX : Fixed bug in single_source_shortest_path_length in sklearn.utils.graph
Jaques Grobler (88):
Added a note to the install documentation
Added a note to the contributors documentation
Shortened the long line
Added a small note about the use of an upstream remote in the Contributions documentation
Shortened a line in the code
Merge branch 'WIP_tut', remote-tracking branch 'gaelVaroqueux/stat_tutorial' into WIP_tut
- Further integrated tutorial.rst (Section 2 in Userguide) with links to
moved tutorial files into separate folder within main tutorial folder. added folder for section2 tutorial. fixed some links. removed savefigure from plot_cv_diabetes.py
Merge remote-tracking branch 'origin/master' into WIP_tut
Merge branch 'master' into WIP_tut
Removed savefig from tutorial plot files.
Updated tutorial folders in doc with placeholders for other tutorials. updated index.rst for the tutorial menu accordingly
added an html page for plot_digits_first_image.py
Added links to some keywords.
Links, image resize and updated ipython code in tutorial
Added a dataset image, some links and 'import sklearn' updates
Added Knn classification example image&html
changed colours of plots, added links
Fixed link typo
Merge branch 'master' into WIP_tut
Simple linear regression example added to tut
Fixed spelling error,import lines,figures and html links for shrinkage section
Added links, images and docstrings to some plot files
fixed plots to have class coloured datapoints
Fixed some figures, added links & corrected SVM Param C explanation
Fixed missing image and GUI download link
Image page fixed
added div.green to the theme for Exercises in scikit-tutorial
fixed link/updated some code
renamed file-names, finished model-selection, changed cv plot to use C
Section 4 done - images/links/htmls for images
All scikit tutorial images and links redone
Fixes for doctests
modified makefile for doctesting - not permanent
Merge remote-tracking branch 'origin/master' into WIP_tut
remove redundant file
removed redundant file
Better doctest time(wip),removed duplicate examples, update plot_ols.py
Merge remote-tracking branch 'origin/master' into WIP_tut
3 files moved into main example pool - links to them updated
Merged some examples into examples folder.
Merged a few examples into the example pool
delete redundant file, merged some examples and updated links
examples merged to example pool
deleted unused file, tutorial examples folder removed
replaced silence parameter in makefile, links removed in stat_learn tutorial, big_toc_css copy deleted, heading changed in tutorial index, tutorial index info added
added ELLIPSIS to 4 examples
added ... to ellipsis
Merge remote-tracking branch 'origin/master' into WIP_tut
merged ols and ridge variance + some neating
fixed links & neatening
moved exercises into separate folder, neating up
path fix of moved figure
fixed typo,changed 2.2s numbering, fixed 4 examples in exercises
fixed numbering in main User Guide
added collapsible sidebar - still WIP
Collapsible sidebar adding complete - appears to work well
Deleted redundant files
color change for button
comment added to gen_rst. Arrow added to button
Next button added: position correct, but does nothing
button is mostly working
spelling fixes
cleaned up
more cleaning-finished off
spelling errors,edit curse of dimensionality, explain top-down
bug fix - layout
changed hover colours for button
previous button added with hovering-effect
Merge branch 'master' into WIP_tut
fixed new doc-test error
Made old EllipticEnvelop deprecated class
changed message to *Use EllipticEnvelope instead*
Fixed broken image link
Removed `_plot` from the face recognition example
Added the name change for the recent change EllipticEnvelope
Changed GMM's API to suit rest of sklearn
1.Fixed typo 2.Removed has_key entries
restored last changes
Fixed syntax error
mixture/plot_gmm* examples updated
restored last changes
DPGMM API updated, along with plot_gmm_sin example
DPGMM and VBGMM API change, example updated
modified test_gmm to match API changes in gmm.py
updated documentation for gmm,dpgmm and vbgmm
Changed variable name `x` to `covar_type`
Updated `whats_new.rst` with API change
Joonas Sillanpää (3):
Radius-based classifier now raises exception, if no neighbors found
Corrected some mistakes, added optional outlier_label parameter, which can be given to outliers
Fixed weight calculation from distances (1. / dist), and weight function in tests (lambda d: d ** -2)
Lars Buitinck (96):
scikits.learn -> sklearn migration in label propagation
BUG don't pass estimator params to fit in label propagation
remove deprecated Neighbors{Classifier,Regressor}
ENH raise ValueError in metrics instead of AssertionError
ENH intercept_ on linear OvR clf + change exception to AttributeError
DOC pep257, or "sentences end with a full stop"
ENH input validation in DBSCAN
DOC rm confusing line in BernoulliNB docstring
FIX small stuff in new tomography example
factor out some common code in dense/sparse SGD
prevent a copy in SGD regressor fitting
refactor SGD, part 2: simplify parameter passing
refactor SGD, part 3: factor out more sparse/dense common code
COSMIT rm no-op conversion in SGDRegressor
BUG restore symbolic class label support in SGD + test it
ENH merge dense/sparse LinearSVC, part 1: no more SparseBaseLibLinear
ENH merge dense/sparse LinearSVC, part 2: no more sparse.CoefSelectTransformer
ENH merge dense/sparse LinearSVC, part 3: deprecate sparse.LinearSVC
ENH merge dense/sparse LinearSVC, part 4: deprecate sparse.LogisticRegression
DOC reference for logistic regression training with liblinear
COSMIT refactor liblinear bindings
TST merge dense and sparse LogisticRegression tests
Merge branch 'master' into merge-linearsvcs
COSMIT fix ugly import, left over from LinearSVC refactoring
DOC put merged LinearSVC and LR in changelog + explain @mblondel's work
BUG fix SGD doctest
Merge pull request #561 from larsmans/merge-linearsvcs
BUG promote type-safety in murmurhash
BUG make coef_ 1-d in Naive Bayes for binary case
BUG replace assert by custom exceptions
COSMIT refactor SGD code further
Revert "COSMIT refactor SGD code further"
ENH merge sparse and dense SVMs, part 1
ENH merge sparse and dense SVMs, part 2
ENH merge sparse and dense SVMs, part 3: adapt sparse tests
DOC merge sparse and dense SVMs, part 4
Merge pull request #576 from larsmans/merge-svms
DOC improve intro to Git in the developers' documentation
DOC rm unused param from sparse.ElasticNet docstring
COSMIT abstract base class in univariate feature selection
ENH sublinear tf scaling in TfidfTransformer
DOC s/with dense data// in merged SGD module
refactor SGD regression input validation + doc fixes
ENH more generic dict-like test in CountVectorizer
DOC typos in whats_new
DOC typos
DOC typo
COSMIT refactor SGD with Dataset factory function
COSMIT rename _mkdataset function in SGD
ENH add DictVectorizer
ENH test feature_extraction.DictVectorizer
DOC syntax error in DictVectorizer docstring
COMPAT turns out collections.Mapping has an iteritems member
ENH add test for DictVectorizer.restrict
DOC + ENH DictVectorizer: complete docs, add dict_type param
COSMIT disable liblinear I/O code
ENH implement one-of-K/one-hot coding in DictVectorizer
COSMIT rename DictVectorizer source files
ENH optimize DictVectorizer (sparse case)
TEST more strict test for one-of-K coding in DictVectorizer
DOC narrative documentation for DictVectorizer
DOC + pyflakes in DictVectorizer
ENH reduce memory usage of DictVectorizer.transform in sparse case
BUG fix doctests for DictVectorizer (nose 0.X compat)
Merge branch 'dictvectorizer'
COSMIT simplify input validation in KMeans
DOC small fixes to NearestCentroid classifier
BUG disallow shrinking with sparse data in NearestCentroid
DOC typos, line-width and minor stylistic fixes in pipeline module
COSMIT shallow copy of steps in Pipeline + code style
Merge pull request #741 from ogrisel/sorted-dictvectorizer
COSMIT use sorted instead of list.sort in DictVectorizer
DOC small fixes to DictVectorizer documentation
BUG fix issue #753, "Sparse OneClassSVM missing argument to super()"
BUG re-allow zero-based indexes in SVMlight files
COSMIT replace utils.testing.assert_in with Nose-compatible functions
DOC + FIX DictVectorizer: actually support single Mapping arg in transform
ENH zero_based="auto" support + better n_features=None in load_svmlight_files
COSMIT vanity + license for ArrayBuilder
COSMIT refactor SVMlight loader
ENH fit_predict convenience method on KMeans and MiniBatchKMeans
Merge pull request #729 from larsmans/fit-predict
COSMIT pep8 SVMlight loader
BUG close files in time in SVMlight loader (with statement)
TEST + FIX zero_based="auto" behavior in SVMlight loader
DOC + PEP8 SVMlight loader
Merge pull request #756 from larsmans/svmlight_fix
DOC typo
COSMIT pep8 document classification example
DOC typo in example
DOC clarify zero_one_score
DOC typo
revert PLS param rename + move input validation out of loop
BUG chi² feature selection didn't work for COO matrices
ENH export f_oneway from feature_selection module
BUG ensure that SelectKBest actually selects k features
Mathieu Blondel (114):
FIX: support for regressors in multiclass module.
Support for coef_ in OneVsRestClassifier.
Mention multi-variate regression support in Ridge.
Add safe_mask utility.
coef_ and intercept_ in LinearSVC are now writable.
Add safe_mask to developer doc.
Typos.
Create partial_fit and call partial_fit from fit.
Add partial_fit to SGDRegressor.
Partial tests + fix bugs.
Fix a few more bugs.
Use proper assertions.
Fix more bugs + tests.
Add decision_function to SGDRegressor.
Multiclass tests.
Merge dense and sparse SGD implementations.
Re-enable sparse tests.
Add deprecation warning.
Update docstrings.
What's new.
Removed needless line.
Use only one epoch in partial_fit.
Use named parameters.
Update examples.
Update doc.
Use only one epoch in SGDRegressor.partial_fit.
Save iteration number.
More tests + fixes.
Fix bug when fit is called multiple times.
Fix "what's new".
Merge pull request #10 from larsmans/sgd_partial_fit
Address @ogrisel and @larsmans 's comments.
pep8!
FIX: y should be np.float64.
Add filter_params option to pairwise_kernels.
Precomputed kernel can actually be non-squared.
Use pairwise_kernels in KernelPCA.
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Merge pull request #11 from larsmans/sgd_partial_fit
More technically correct description.
Rename _get_params() to get_params().
Merge branch 'sgd_partial_fit'
Use classes_.
Better title in README.rst.
More intuitive warm-restart in SGD.
Fix doctests.
warm_restart -> warm_start
More intuitive warm-start in ElasticNet.
Fix doctests.
Copy in user-land.
Missing docstring in ElasticNet and Lasso.
Fix failure in `test_bad_input`.
Revert change on svm.base.
Remove if statement.
Suppress deprecation warnings.
Merge branch 'warm_start' of github.com:mblondel/scikit-learn into warm_start
Make sure order="C".
Merge branch 'warm_start'
Fix doctest.
preprocessing/__init__.py -> preprocessing/preprocessing.py
Move preprocessing.py to sklearn/.
Remove CoefSelectTransformerMixin and use SelectorMixin instead.
Better default threshold for L1-regularized models.
euclidian_distances is to be deprecated in v0.11.
Add n_jobs option to pairwise_distances and pairwise_kernels.
Merge branch 'enh/metrics' of https://github.com/satra/scikit-learn into metrics
Backward compatibility in precision, recall and f1-score.
Factor some code.
More what's new items.
Fix what's new.
Add Perceptron.
Add Perceptron to document classification example.
Minimal documentation.
Add references and implementation details.
Propagate parameters.
Expose more parameters.
Explain parameter in Hinge loss.
Don't rescale coef if not necessary.
Quick note on sparsity.
Don't break API in precision_recall_fscore_support.
Pep8!
Fix scale_C warning.
Merge branch 'perceptron' of github.com:mblondel/scikit-learn into perceptron
t -> threshold
Add mean_squared_error and deprecate mean_square_error.
Don't raise warning when passing explicit scale_C=False.
Merge branch 'master' of github.com:scikit-learn/scikit-learn
DOC: scaling regression targets.
Merge pull request #623 from npinto/ridge-docfix
Set label encoding in LabelBinarizer.
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Guess threshold if not explicitly provided.
Bug: must be strictly less than.
Pep8.
Don't raise warning in auto mode.
Merge pull request #712 from agramfort/fix_y_center
Merge branch 'shuffle_kfold' of https://github.com/NelleV/scikit-learn into kfold-shuffle
Test indices=False case.
Factor tests.
Merge branch 'combat' of https://github.com/ibayer/scikit-learn into lsqr_fix
Fix lsqr for scipy 0.7.
Add test for grid search with only one grid point.
Check param grid.
Return early if there's only one grid point.
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Fix doctest failure.
Merge branch 'nearest_centroids' of https://github.com/robertlayton/scikit-learn into nearest_centroids
Fix doc mistakes.
Precomputed distance matrices can be rectangular.
Add test for precomputed distance.
Doc cosmits.
Fix bug when refit=False.
Fix kernel pca example.
Fix doctest in PLS.
Meng Xinfan (1):
fix an error in naive bayes docs
Nelle Varoquaux (4):
closes #677 - improved affinity propagation docstrings
closes #703 - KFold has now an option to shuffle the data
Added unit test for shuffle option in KFold
Now tests the randomness of the KFolds when shuffle is True, and that all indices are returned in the different test folds
Nick Wilson (7):
DOC: Various minor fixes to "Contributing" docs
Skip k-means parallel test on Mac OS X Lion (10.7)
FIX: Delete temporary cache directory
BUG: Fix metrics.auc() w/ duplicate values
FIX: Add NORMALIZE_WHITESPACE to broken doctest
Stop passing keyword arguments for positional args
Add verbose parameter to SVMs (fixes #250)
Nicolas Pinto (5):
DOC: fix a few incoherencies in ridge.py
ENH: add verbose option to LinearSVC
BUG: fix LibLinear verbosity for L2R_L2_SVC
MISC: verbose should be int, not bool
TST: add smoke test for LinearSVC's verbose option
Olivier Grisel (129):
FIX: compat with numpy version lacking the out argument for dot
ENH: misc style / docstrings improvements
more enhancements, variable names and test fixes
Merge pull request #551 from fannix/master
FIX: make sklearn.base.clone robust to empty params
first stab at trying to wrap MurmurHash3
Merge pull request #3 from GaelVaroquaux/murmurhash
implementation & test for the murmurhash wrapper module
Export some public cython API
DOC: add entry for murmurhash in the developer utilities section
ENH: add the ability to hash int arrays
Better docstring
Shorter cpdef function names + missing docstrings
DOC: give usage example
test developers utilities as well
OPTIM: avoid unlikely np.int32 test upfront
Merge pull request #564 from ogrisel/murmurhash
FIX: broken build / tests
Merge remote-tracking branch 'larsmans/typesafe-murmurhash'
Merge pull request #587 from jakevdp/arpack-init
Merge pull request #593 from jaquesgrobler/doc_update
cosmit in memory debugging doc
Merge pull request #602 from jaquesgrobler/doc_remotes_note
ENH: use linear gradient cmap for more readable hyperparam heatmap
docstring cosmits and typos in label_propagation.py
useless imports
simpler random seeding scheme for parallel kmeans
less hackish parallel random state seeding
avoid pl.set_cmap and align colors of colormesh with scatter
started work on utility function for quick train test split
more doctest
add parameters in docstring
DOC: narrative doc for train_test_split
add tests for invalid argument + fixed a type error
more tests
typo
reworked nested grid search example for better doc and output, use train_test_split and add more cross links
DOC: related improvement in GridSearchCV doc
DOC: more cross references
cosmit
DOC: what's new
Merge pull request #618 from ogrisel/train_test_split
FIX: make LFW data shapes consistent with Olivetti faces
ENH: more informative exception message
DOC: improved SVM docstrings
typo
Merge pull request #628 from daien/master
Merge pull request #633 from robertlayton/ig
Merge pull request #634 from amueller/svm_decision_function_dirty_fix
FIX #614: raise ValueError at KernelPCA init if fit_inverse_transform and precomputed kernel
DOC: formatting improvement to ensemble.rst
FIX: make the 20 newsgroups loader explicitly decode latin1 content
shorten example a bit with train_test_split
manually rescale C in face recognition example
Merge pull request #664 from conradlee/663-kfold-init-bug
Flatten the feature extraction API
Merge branch 'master' of github.com:scikit-learn/scikit-learn into text-feature-extraction-simplification
missing C re-scaling in example
missing C re-scaling in example
MiniBatchSparsePCA and MiniBatchDictionaryLearning still use chunk_size as argument
merge master
factorize feature names array
make CountVectorizer able to output binary occurrence info
add a test for custom dtype
DOC: improve docstring for Vectorizer
Flatten the combined vectorizer as well
Merge remote-tracking branch 'upstream/master' into text-feature-extraction-simplification
Fix grid search example
Fix charset in mlcomp example
DOC: started section on text feature extraction
Merge remote-tracking branch 'upstream/master' into text-feature-extraction-simplification
switch back to the old vocabulary constructor argument
Merge remote-tracking branch 'upstream/master' into text-feature-extraction-simplification
better blob seed so that both DBSCAN and meanshift are working well
Merge branch 'master' into text-feature-extraction-simplification
finally the right API with plenty of efficient overrides
Filter stop words before ngrams
demonstrate stop words in example (+ slightly faster convergence)
missing sklearn.semi_supervised package in setup.py
ENH: remove useless array wrap for feature names + more TF-IDF tests
Make Vectorizer not inherit from TfidfTransformer while preserving direct gridsearchability
FIX: division by zero errors and negative IDF
DOC: TF-IDF and customizing
DOC: updated parameters
Merge branch 'master' into text-feature-extraction-simplification
updated whats new
s/Bags/Bag/ and Vector Space Model
better explanation for bigram features
No accent stripping by default + various doc fixes
update strip_accents in Vectorizer as well
typo
typo
typos
remove lambda + better comment position
enable stop words in clustering example
typo
Renamed Vectorizer to TfidfVectorizer + deprecation warning
updated what's new + backward compat for vocabulary attribute
fixed an inheritance bug in TfidfVectorizer.fit_transform + removed vocabulary backward compat that breaks grid_search
useless import
Merge pull request #668 from ogrisel/text-feature-extraction-simplification
trailing whitespace
FIX: broken doctest under OSX
Merge pull request #694 from njwilson/skip-kmeans-2-jobs-mac
Merge pull request #692 from njwilson/minor-doc-fixes
Had a link to autopep8
Merge pull request #695 from njwilson/tmp-dir-for-cache
Merge pull request #696 from njwilson/issue-691
Merge pull request #698 from njwilson/master
OPTIM: skip buffer unpacking in kmeans
Merge pull request #693 from jaquesgrobler/Collapse_Sidebar
Merge pull request #714 from jaquesgrobler/Next_button
Merge pull request #717 from jaquesgrobler/Issue714
typo + cosmetics
ENH: sort features in dict vectorizer + new doc
ENH: refactored the HMM tests to ease PY3K transition
Fix bad reference to LFW in example
useless import
FIX #752: raise explicit ValueError if k is too large
FIX: missing string formatting argument in MBKMeans error message
removed useless assert
Merge pull request #748 from ogrisel/hmm-test-hierarchy-simplification
Merge pull request #742 from davidmarek/pdistance
FIX: #774 Add documentation for lprun config in qtconsole and notebook
FIX #807: non regression test for KPCA on make_circles dataset
Merge pull request #809 from zaxtax/master
Merge pull request #812 from amueller/pipeline_decision_function
typo
Add note for port install py27-scikits-learn
Paolo Losi (3):
DOC: Better doc string for l1_min_C
BENCHMARK covertype: select classifier via cmd line opt
Merge pull request #736 from paolo-losi/bench_covtype
Peter Prettenhofer (239):
initial checkin of gradient boosting
GBRT benchmark from ELSII Example 10.2
added GBRT regressor + classifier classes; added shrinkage
use super in DecisionTree subclasses
first work on various loss functions for gradient boosting.
added store_sample_mask flag to build_tree
implemented lad and binomial deviance - still a bug in binomial deviance -> mapping to {-1,1} or {0,1} ?
updated benchmark script for gbrt.
some debug stmts
new benchmarks for gbrt classification
fix: MSE criterion was wrong (don't weight variance!)
more benchmarks
binomial deviance now works!!!!!
add gradient boosting to covtype benchmark
add documentation to GB
timeit stmts in boosting procedure.
add previously rm c code
updated tree
hopefully the last bugfix in MSE
new params in gbrt benchmark and comment out debug output
make Node an extension type + change class label indexing.
predict_proba now returns an array w/ as many cols as classes.
cosmit: tidied up RegressionCriterion
added VariableImportance visitor and variable_importance property
minor changes to benchmark scripts
use `np.take` if possible, added monitor object to `fit` method for algorithm introspection.
cosmit
choose left branch if smaller or equal to threshold; add epsilon to find_larger_than.
compiled changes for last commit
cosmit
some tweaks and debug msg in tree to spot numerical difficulties.
added TimSC tree fix
changed from node.error to node.initial_error in graphviz exporter
recompiled cython code after rebase
fix: _tree.Node
comment out HuberLoss and comment in benchmarks
changed from y in {-1,1} to {0,1}
cosmit: beautified RegressionCriterion (sum and sq_sum instead of mean).
rename node.sample_mask to node.terminal_region
fix: Node.__reduce__
fix init predictor for binomial loss
performance enh: update predictions during update_terminal_regions
fix: samplemask
added timing info
use new tree repr; adapt gradient boosting for new tree repr.
Merge branch 'master' into gradient_boosting
cythonized tree (still broken)
clear tree.py
updated _tree.c
updated GradientBoosting with current master
fix: update variable importance
added gradient boosting regression example
added test deviance to GBRT example
updated TODO in module doc
Merge branch 'master' into gradient_boosting
fix: make GradientBoostingBase clonable.
added unit tests for gradient boosting (coverage ~95%)
better test coverage
store loss object in estimator
Merge branch 'master' into gradient_boosting
stub for gradient boosting documentation
restore original bench_tree.py
Merge branch 'master' into gradient_boosting
min_density now works with store_terminal_regions (however, this only matters if you learn deep trees max_depth >> 5 which rarely happens).
cosmit
added input type and shape test
Merge remote branch 'upstream/master' into gradient_boosting
n_samples > min_split instead of >=
cosmits (cleanup after profiling)
repeat decorator now with arguments
eliminate `compute_importances` fit parameter - make `feature_importances_` a property that will be computed on demand.
initial_error -> init_error
max_features bug in _tree.pyx (check if < 0 and assume all features!)
Merge branch 'tree-feature-importance' into old-gradient-boosting
merge with master finally resolved!
enh: performance enhancement by removing redundant computation of values - we use the state of `criterion` instead.
started work on gradient boosting docs
Merge branch 'master' into old-gradient-boosting
removed feature_importances_ property in tree module
work in progress on GBRT docs
added script to bench sklearn gbrt against R's gbm package.
cosmit: pep8 + comments
fix: undo compute_importances property merge in forest module and examples
wip: narrative doc
fix: table layout
restore original
restored original version
restored original version
restored original version
restored original version
Merge branch 'master' into gradient_boosting
Merge branch 'master' into gradient_boosting
test_oob_score_regression oob_score below 0.8 if n_estimators < 50
changed ``n_iter`` to ``n_estimators`` and attribute ``trees`` to ``estimators``.
added artificial dataset generator from Hastie et al. 2009, Example 10.2
wip: narrative doc for gradient boosting.
fix: wrong assertion
renamed estimators to estimators_
wip: narrative documentation for gradient boosting.
fix: import numpy in doctest
Merge remote branch 'upstream/master' into gradient_boosting
Merge remote branch 'upstream/master' into gradient_boosting
use mean_squared_error
added new mean_squared_error to metric imports
Merge remote branch 'upstream/master'
Merge branch 'master' into gradient_boosting
polished narrative documentation. fixed doctest.
cosmit: fix doc format
cosmit: fix doc format
Merge branch 'master' into gradient_boosting
factored out weight vector class; dense SGD now uses ``WeightVector`` instead of explicit ndarray and wscale.
enh: performance of WeightVector now comparable to explicit weight vector. some cosmits in dense sgd extension module.
wip: sparse sgd now uses WeightVector - there are some broken tests though.
ENH changed naive bayes' self._classes attr to self.classes_
wip: still hunting sparse sgd bug
fix: forgot to scale by wscale at the end of dot_sparse. All tests are green again!
added new sgd dataset abstraction to unify sparse and dense implementations.
Merge branch 'master' into sgd-refactoring
major refactoring of sgd module::
use Py_ssize_t where appropriate; cosmit
Merge remote branch 'upstream/master' into sgd-refactoring
cosmit: better docstrings for SGD
Merge remote branch 'upstream/master' into sgd-refactoring
WeightVector now keeps track of its squared norm.
move WeightVector and Dataset abstraction to new module
moved WeightVector and dataset abstraction to new module
updated Dataset imports
no need for sgd_fast header anymore.
added largescale ext module to setup.py
fix: declare extension type attributes
comment in forest classes for covertype benchmark
Merge branch 'master' into gradient_boosting
renamed and updated covertype benchmark.
uncomment RandomForest
cosmit
expose 'ls' loss function for classification
cosmit: pep8
Merge branch 'master' into sgd-weight-vector
renamed largescale -> large_scale
Merge branch 'master' into gradient_boosting
Merge branch 'master' into sgd-weight-vector
moved WeightVector and SequentialDataset into separate modules.
re-cythonized
fix: min_samples_split
Merge branch 'master' into sgd-weight-vector
don't need self here.
factored out norm updates and moved them to a dedicated subclass
cythonized
Merge branch 'master' into gradient_boosting
Merge branch 'gradient_boosting' of https://github.com/scottblanc/scikit-learn into scottblanc-gradient_boosting
Merge branch 'gradient_boosting' into scottblanc-gradient_boosting
cosmit: pep8
cosmit
added serialization test case
use `deviance` instead of `medviance` and `bdeviance`
wip: refactor ``fit_stage``; fix feature importances regression; tests still not green (performance regression on Example 12.7).
fix: make binary classification a special case.
refactoring for multi-class
test case for multi-class
comment out - yahoo learning to rank dataset
some profiling
impl. deviance for MultinomialDeviance.
fast tree prediction methods.
faster ``_predict`` by using low-level tree predict functions.
cosmit
forgot to remove debug function
changed self.classes to self.classes_
fix: forgot to rename classes
updated documentation: plots for gradient_boosting, new sample generator
new predict utils for early stopping; updated examples
Merge remote branch 'upstream/master' into gradient_boosting
updated benchmark script
delete benchmark scripts - include them in dedicated branch or ml-benchmarks
Merge remote branch 'upstream/master' into gradient_boosting
removed ``store_terminal_region`` from ``build_tree``.
mention multi-class
use ``apply_tree`` to compute terminal region. This is faster and reduces code complexity.
added __all__
enhanced documentation
type (differentiable)
boston -> Boston
Merge remote branch 'upstream/master' into sgd-weight-vector
un-done NormedWeightVector factorization; performance decrease on RCV1 is negligible.
cythonized sgd files
Merge branch 'master' into gradient_boosting
Merge branch 'pprett/gradient_boosting' of https://github.com/glouppe/scikit-learn into glouppe-pprett/gradient_boosting
cythonized
added Gilles to authors
whats new? Gradient Boosting!
Merge remote branch 'upstream/master' into gradient_boosting
added util func to create random sample_masks
use random_sample_mask (issue pointed out by @glouppe);
update examples
update tests
remove np.seterr
cosmit: comments + rm unnecessary variables
cosmit: add comment to replace ``random_sample_mask`` if numpy requirement allows to do so
cosmit: fix ClassPriorPredictor docstring; rm comment
typos
typo
mv *Predictor to *Estimator
mv classification init estimators; use np.bincount for PriorProbabilityEstimator.
is_multi_class now is a class attribute.
update docs
don't need to store n_classes.
cosmit: no need for float literals
Merge branch 'master' into gradient_boosting
point out scalability problem with large numbers of classes;
cosmit; mention scalability issues w.r.t. large number of classes
Merge branch 'master' of https://github.com/udi/scikit-learn into udi-master
added prior test
more test cases for naive bayes
GaussianNB: use epsilon to overcome zero sigma problem.
rm print stmt
added gbrt extension module (faster prediction methods)
rm custom regression tree prediction method
faster prediction methods
wip
add prediction method for specific stage
add staged predict
use staged predict in gbrt examples
fast tree prediction based on mystic cython kung-fu
cosmit
staged_predict for regression
test for staged predict and cosmit
more test cases
more test cases (input check at prediction time, degenerate inputs)
use appropriate data types (Py_ssize_t)
better input checks at prediction time
rm old tree prediction methods;
cosmit
Merge branch 'gradient-boosting-enh2'
add test for multiple fits w/ different input shapes
fix issue 762: SGDRegressor does not clear coef_ from previous fit
asarray not needed because of check_arrays stmt above
rm unused vars
Merge branch 'fix-issue-762'
typo: Viola-Jones
Gradient Boosting also provided OOB estimates
Rob Zinkov (1):
Fixed typo in documentation
Robert Layton (42):
Mutual Information docstring incorrectly said it was the adjusted mutual information
Removed single k-means run to its own function to enable optional parallelisation later.
Parallel version of k-means.
pep8 and pyflakes tested
Not doing a full sort for getting the best results
Updating random_state in between iterations of k-means fixes some issues
Doc updates
Fixed author reference (removed link as it wasn't working)
Added my twitter account as homepage.
feature_extraction/text.py: 'ignore' removed as a default, class param
Added a test (that doesn't work yet)
Test now works, testing both the Word and Char analyzers
decode_error -> charset_error
docstring update
cosmit
cosmit undoing (was testing)
pep8
cosmit to docstrings
NearestCentroid classifier, with test suite.
Shrink threshold working, along with a test
Sparse tests, but they are currently failing. Committing for comment
Typo for "neighbours", and converted to en-US
Test for sparse matrices. Tests fails, my guess is that centroids are the same.
Fixed bug in nearest_centroid, and removed boston test.
Narrative documentation
Sparse tests pass when using shrinkage
Turned on final test (it works!)
Broadcasting used to remove a loop
Removed asserts in code
Test use assert_array_equal where appropriate
pyflakes on test
Update to documentation
Moved to the `neighbors` namespace
Example of nearest neighbor, getting an improvement when using a shrink threshold of 0.1
Explain example in docs
Update examples/neighbors/plot_nearest_centroid.py
Update doc/whats_new.rst
Update doc/whats_new.rst
Removed unneeded numpy.array call in test
metric fixed in tests
Merge remote-tracking branch 'origin/nearest_centroids' into nearest_centroids
Merge pull request #5 from larsmans/nearest_centroids
Roy Hyunjin Han (2):
Fixed some typos
Update examples/exercises/plot_iris_exercise.py
Satrajit Ghosh (21):
added avg_f1_score
tst: added tests
enh: added matthew's correlation coefficient
sty: pep8 + doc
Merge branch 'master' into enh/metrics
Merge branch 'master' into enh/metrics
enh: added support for weighted metrics closes #83
doc: added description for matthew's corrcoef from wikipedia
sty: pep8 fixes
sty: pep8 on test file
doc: removed strange character
fix: updated tests to reflect that micro shows the same precision and recall
fix: average with elif
doc: improved description of average
api: changed pos_label to None for metrics
Merge remote-tracking branch 'upstream/master' into enh/metrics
Merge remote-tracking branch 'mblondel/metrics' into enh/metrics
Merge pull request #443 from satra/enh/metrics
fix: convert input arrays to float
fix: force copy to True in case underlying default behavior changes.
tst: added test for feature selection. this test would have failed in the previous case. closes #727
Scott White (2):
add support for multi-class
add todo
Shiqiao Du (27):
added a cython module to the hmm
replaced (T, N) -> (n_samples, n_states)
- renamed (n_samples, n_states) -> (n_observations, n_components) in hmm.py
Merge pull request #1 from agramfort/hmmc
dropped "_c" suffix
debugged _hmmc.pyx
fixed problem of _accumulate_sufficient_statictics in hmm.py
- removed unnecessary **kwargs specification in fit and _do_mstep methods
replaced deprecated "rvs" to "sample"
made `sample` also return the sequence of internal hidden states
added doc for hmm
- fixed typo in hmm.rst
made `sample` also return the sequence of internal hidden states
rebased to the master and fixed conflicts
bug fixed
fixed _do_viterbi_pass
fixed doc
fixed typo
replaced function call of decode to predict
removed pure python codes and beam pruning options
Added change history to what's new
updated author and pep8
modified phrases in what's new
- added decoder selection
fixed some typo, doctest and pep8
added comment on decoder algorithm in the rst doc.
Merge pull request #2 from GaelVaroquaux/hmmc
Udi Weinsberg (1):
corrected Gaussian naive-bayes to correctly compute the class priors
Virgile Fritsch (1):
BF: Avoid two consecutive centering of the data in outlier_detection.
Vlad Niculae (15):
Prettify structure example
DOC: minor style changes
DOC: tweaks
Removed print in digits classification example
DOC: fixed links and made examples build
Merge branch 'clayw-label_prop' of github.com:vene/scikit-learn into clayw-label_prop
DOC: clarified example titles
ENHanced the multilabel example aspect
s/jacknife/jackknife
DOCFIX: make math block render
Add warnings and clean up tests
FIX: doctests for scale_C, took some liberties
FIX: bug in test_setup. Actually avoid multiprocessing now.
FIX: wrong cover-package, misleading coverage as 100%
DOC: updated testing instructions
Xinfan Meng (1):
Fix a test case
Yannick Schwartz (11):
added a StratifiedShuffleSplit in the cross validation schemes
added test for stratified shuffle split
updated stratified shuffle split test
fixed sss test
cleanup of arg check and doc update
put sss validation in external function
updated doc/whats_new.rst, doc/modules/classes.rst and doc/modules/cross_validation.rst for the sss
sss raises error if a class has only one sample, added associated test
pep8
changed train_fraction to train_size
Fixed random state, changed _validate_sss name, fixed _validate_stratified_shuffle_split bug
Yaroslav Halchenko (10):
DOC rudimentary docstring to deprecated.__init__ describing "extra"
Merge tag '0.11' (theirs) into releases
Merge branch 'releases' into dfsg
Merge branch 'dfsg' into debian
Initial changelog for 0.11.0-1
ENH: adjusted Format in debian/copyright
Adjusted patches/deb_use_system_joblib to avoid submodule import
Made running unittests verbose
added patch up_ICA_test_seeding to "cherry-pick" f6d7f45a45d21a779d1a2d59a6f7ff30de83b76e (FIX: control RNG seeds in ICA tests)
exclude test_sparse_svc_clone_with_callable_kernel from tests
fcostin (9):
optimisations to Ridge Regression GCV
faster GCV for Ridge for n_samples > n_features
fixed tests to work with Ridge GCV
updated RidgeCV docstring and changelog
fixed bug with _values (thanks @mblondel)
fixed bug with > 1d y arrays
svd fails for sample_weights, use eig instead
coerce sparse matrices to dense before SVD
refactoring (thanks @GaelVaroquaux, @mblondel)
jansoe (1):
fix error in unwhitened case
leonpalafox (1):
Change exception text when multiple input features have the same value from: "Multiple X are not allowed" to: "Multiple input features cannot have the same value"
-----------------------------------------------------------------------
No new revisions were added by this update.
--
Alioth's /usr/local/bin/git-commit-notice on /srv/git.debian.org/git/debian-science/packages/scikit-learn.git