[scikit-learn] annotated tag debian/0.14_a1+git20-gc9ba2c3-1 created (now 7946ba8)
Andreas Tille
tille at debian.org
Wed Dec 28 13:11:03 UTC 2016
This is an automated email from the git hooks/post-receive script.
tille pushed a change to annotated tag debian/0.14_a1+git20-gc9ba2c3-1
in repository scikit-learn.
at 7946ba8 (tag)
tagging 7c6e398d5a05caa7ab61bfad0c4bdf03482a1c4e (commit)
replaces debian/0.13.1-1
tagged by Yaroslav Halchenko
on Tue Aug 6 23:05:20 2013 -0400
- Log -----------------------------------------------------------------
scikit-learn Debian release 0.14~a1+git20-gc9ba2c3-1
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iEYEABECAAYFAlIBuXAACgkQjRFFY3XAJMi75QCgjGF9L5lblAZxxnMaDy63rUtp
u24An3iqPIesnjWoZqzQBSrmg+9Jj+MS
=X5Lq
-----END PGP SIGNATURE-----
A. Flaxman (3):
DOC: add random_state parameter to StratifiedShuffleSplit doc string
DOC: latex beautification
DOC: latex beautification
Abhijeet Kolhe (1):
Fix setup.py to resolve numpy requirement
Alexander Fabisch (1):
DOC update example path
Alexandre Gramfort (64):
ENH : add reconstruction_err_ for NMF with sparse input
use scipy.linalg in test_nmf.py
adding comment on why sparse frobenius is ok as done
Merge pull request #1607 from agramfort/reconstruction_err_nmf_sparse
FIX : fix kfold balance due to int rounding
FIX : test due to KFold change
FIX : better fix of KFold balance
fix doctest
TST : improve test_kfold_balance test
update what's new
TST : improve again test_kfold_balance test
Merge pull request #1772 from jnothman/comment_exhaustive_search
typo
pep8
Merge pull request #1907 from aflaxman/stratified_shuffle_split_rand_state_doc_str
Merge pull request #2071 from djv/patch-1
Merge pull request #2075 from jnothman/agglomeration_simplify
FIX : use unique from fixes
Merge pull request #2074 from jnothman/ward_docstring
Merge pull request #2080 from ahojnnes/dist-todo
FIX : missing y=None in FactorAnalysis
Merge pull request #2087 from ahojnnes/examples-print-doc
Merge pull request #2118 from NelleV/DOC_fix
Merge pull request #2135 from fhs/meanshift-doc
Merge pull request #2138 from NelleV/kCCA
Merge pull request #2142 from sergeyf/master
Merge pull request #2145 from NelleV/kCCA
FIX : finish get rid of fit_... param
ENH : avoid one copy in FastICA code
misc
update ICA examples
adding comment
Merge pull request #2196 from erg/labelencoder-docs-fix
ENH : massive refactoring of CV models in coordinate descent. Now the algo core is in path functions
update what's new
DOC : more fixes in covariance module
Merge pull request #2202 from NelleV/isotonic_reverse
Merge pull request #6 from jaquesgrobler/cov_doc_fix
Merge pull request #2203 from agramfort/cov_doc_fix
cosmit : protect attributes in RBM for sphinx
pep8
better coverage
fix doctest
ENH : use warning instead of print
update what's new
Merge pull request #2212 from dengemann/ica_memory
Merge pull request #2213 from cmd-ntrf/master
Merge pull request #2217 from vene/ica_fit_transform
Merge pull request #2182 from NelleV/pls_refactor_2
DOC+ENH: fixes in least_angle + one vectorization
DOC : better doc of array shapes in fastica
MISC : use linalg from scipy
ENH : removing warnings from tests in cd linear models
Merge pull request #2194 from NicolasTr/as_float_array_copy
Merge pull request #2223 from arjoly/doc-datasets
DOC : docstring fixes
DOC : more docstring fixes
use pre_fit in OMP
API : deprecate a lot of extra parameters in OMP object
API : deprecations in orthogonal_mp
ENH : update example of OMP
update what's new + classes.rst
Merge pull request #2247 from pgervais/docfixes
Merge pull request #2258 from NicolasTr/ignore_pycharm_files
Andreas Mueller (186):
REL change version to 0.14-git everywhere, update news, support page.
website: fix for survey bar
COSMIT remove unused imports, pep8
TST some more tests for multi output lars
DOC fix typo in LinearSVC error message
FIX make error message work when return_path=False. Btw I feel that getting "references" for numbers out of numpy arrays is pretty ugly.
TST fix random states in all dict learning tests, make test independent of test sequence.
Revert "trying travis cfg with system-site-packages"
COSMIT pep8
DOC add return values of cross_val_score and train_test_split to docstrings.
ENH added test, started on cross_val_score
ENH adding SomeScore objects for better (?!) grid search interface.
ENH refactor, taking @GaelVaroquaux's and @ogrisel's suggestions into account
ENH deprecated ``score_func``, introduced ``score`` parameter in GridSearchCV
TST test giving score as string in GridSearchCV
FIX rename ``score`` to ``scoring`` because of the name-clash with the ``score`` function.
FIX two score objects, adjust tests to new interface
ENH remove old interface completely from tests.
DOC fix docstring
ENH working on cross_val_score, trying to simplify unsupervised treatment.
ENH better testing of old and new interface. Still a bit to do for unsupervised grid search, though.
FIX usage of scores for unsupervised algorithms.
ENH use new api in permutation_test_score, don't use old api in testing.
ENH fbeta score working, more tests
DOC-string for AsScorer
ENH renamed ap and auc, added RecallScorer
DOC narrative docs for scoring functions. Put them next to GridSearchCV. Should they go into metrics?
ENH update example, minor fix.
DOC improve cross validation and grid search docstring
FIX rename error
DOC add whatsnew entry
DOC fixed formatting in user guide
FIX example
DOC added a new template to sphinx so view the "__call__" function.
COSMIT address @ogrisel's comment.
FIX rename ZeroOneScorer to AccuracyScorer
DOCFIX for zero_one_score / accuracy_score renaming
DOC add narrative about score func objects to the model_evaluation docs.
ENH rename scorer objects to lowercase as they are instances, not classes
DOC minor fixes in pairwise docs.
ENH/DOC add "score_objects" function for documenting the score object dict.
DOC add metrics.score_objects to the references
DOC use table from score_functions docstring in model_evaluation narrative.
DOC move scoring function narrative above dummy estimators, fix tables, some refinement.
DOC minor fixes in score_objects documentation.
DOC better table of score functions in grid-search docs.
ENH GridSearchCV and cross_val_score check whether the returned score is actually a number, not an array (otherwise cross_val_score returns bogus).
TST improve coverage of permutation test scores
TST slightly better test coverage in cross_val_score
COSMIT built-in typo
DOC some improvements as suggested by @ogrisel
TST add test for pickling custom scorer objects
DOC more improvements by @ogrisel
COSMIT rename AsScorer to Scorer
MISC moved score_objects.py to scorer.py, added module level doc string and license note.
DOC add kwargs in Scorer to docstring.
ENH add ``__repr__`` to Scorer
DOC addressed @ogrisel's comments.
COSMIT text reflow
MISC pep8: rename scorers to SCORERS, remove score_objects getter
DOC remove duplicate table, add references to appropriate user guide section to docstrings of cross_val_score, GridSearchCV and permutation_test_score
DOC add note on deprecation of score_func to whatsnew
FIX imports for Scorer and SCORERS
DOC fixes in whatsnew, typo
TST smoke test repr
COSMIT removed unused imports, fixed error message in test of boosting
ENH break ties in OvO using scores
TST test for breaking OVO ties
COSMIT pep8
ENH get rid of imports in test_common by checking by names, not classes.
ENH fix test_estimators_overwrite_params to also test regressors and transformers. Then fix all the regressors and transformers ... meh!
ENH set the random state to avoid heisenfailures
COSMIT pep8, removing unused imports
FIX remove dtype from covertype, add fetch_covtype to init, add missing docstrings.
FIX doctest kernelpca
ENH get rid of most imports in test_common
TST stronger tests for arbitrary classes. make explicit what works and what doesn't.
FIX rebasing trouble in common tests: the meaning of dont_test changed
FIX don't compare strings with "is". that is really not robust!
ENH in transformer pickle test, only test transformers that provide a 'transform' method. and only test that.
ENH in common tests, use long variable names for all tests
FIX remove all unseeded random variables from common tests.
Merge pull request #1695 from mrjbq7/issue-1694
COSMIT pep8: blank line contains whitespace
DOC added sentence about oob_decision_function_ containing NaN to docstring. Still need some narrative about oob score.
DOC add 0.13.1 changelog to whats_new.rst
DOC add random_state parameter to docs of LogisticRegression and LinearSVC
TST/FIX set random_state in logistic regression tests
TST/FIX always use "almost equal" for floats.
FIX MinMaxScaler bug.
TST FIX random state for LibLinear sparse tests
ENH add randomized hyperparameter optimization
DOC fixed links in whatsnew
Merge pull request #1736 from jamestwebber/patch-1
Merge pull request #1740 from tjanez/move_roc_curve_test
COSMIT pep8
DOC FIX links on grid search narrative
FIX compute_class_weight edge case
DOC some sphinx / rst fixes
MISC minor fixes in examples
DOC FIX column span alignment problem in NMF ^^
COSMIT typo
DOC fixing some more rst / sphinx errors :-/
DOC more sphinx stuff.
Merge pull request #1767 from rmcgibbo/balltree_docstring
DOC add roll your own estimator docs
FIX for iid weighting in grid-search
DOC FIX finite precision
COSMIT pep8
DOC correct / simplify dbscan example
COSMIT typo. the French again ;)
FIX setting k in KMeans and MiniBatchKMeans was silently ignored. Left over in 07c56d7cd2ddfe71e7a4399d74fc367d6000d854 Damn, that was nasty :-/
COSMIT pep8
FIX jenkins error on numpy 1.3.0
DOC documented n_init parameter of MiniBatchKMeans. Closes #1900.
FIX broken scorer, add non-regression test.
FIX WARN about **params being not used in GridSearchCV.fit. Closes #1815.
FIX bug in callable kernel decision function - Sorry, I think that was me.
FIX test error in test common for KernelPCA that doesn't respect its n_components.
FIX typo in test for RidgeCV
DOC typo in RandomizedSearchCV docstring
DOC fetch_20newsgroups returns the text, not text files. see SO question: http://stackoverflow.com/questions/16615523/using-scikits-kmeans-to-cluster-ones-own-documents
DOC Fixed documentation of kernel parameters: sigm uses gamma, but not degree. Closes #1972.
DOC clarification in Scoring objects: It's not a good sign if I don't understand my own wording.
DOC much more readable formula in chi2 kernel doc
COSMIT sphinx fixes
COSMIT pep8
DOC FIX typo on fbeta, closes #2219
fix whitespace around new tree.pyx docstring
use new virtualenv features of travis, so we don't have to kill the virtualenv
FIX hopefully fixing travis.
FIX hopefully fixing travis.
DOC improve svm sample weight example
DOC improve documentation of sample_weight, add to docstring.
TST small improvement of test for sample weight in svm
cosmit typo
Show 95% confidence interval, not 40% confidence ^^
FIX whoops sorry!
fix pycharm file ending
ENH add "make_y_1d" to utils, use it in estimators where needed.
fix make ``make_y_1d`` safe for lists.
use column_or_1d, move it to utils
ENH rename eval / pseudolikelihood to score_samples
fixing ridge and label binarizer... I'm pretty sure that worked before?
FIX make neighbors y prediction shape consistent
TST add regression test for label_binarizer
FIX/ENH make StandardScaler convert int input to float and warn about it, instead of warning and rounding for dense and crashing for sparse.
DOC adjust docstring as suggested by @gvaroquaux
addressing @ogrisel's comments: catch warnings in test, no unneeded digits
COSMIT fixing some unused imports, adding stuff to __all__, and light pep8 (not all whitespace to make rebasing less painful)
DOC fixing some sphinx stuff.
more sphinx fixes
first try at bootstrap-based website
"fix" sidebar stuff - this was not my idea
remove gray boxes around h3 on the two new pages
put banner into header, make it spread over whole page
Fix link to flowchart, add text descriptions.
Minor fixes in front-page text, css
rework front-page box texts
fix typo, missing p
fix and refine some css and html tags
add example banner image
add section, estimator and model links on the frontpage
fix styling of rst links
add links for examples
fix css that I just broke with the sphinx links
flatten the tutorial / doc structure as proposed by @ogrisel
add js for collapsible toc tree in the user guide.
minor typo thing
don't have old version warning on install, as that will be shared across all versions.
added "show source" link to footer, made dimensionality reduction examples link to decomposition
slightly hackish way of inserting a whatsnew link. I really don't want all the sphinx containers here, though. Asked on stackoverflow about it btw.
a little less ugly footer. @glouppe should maybe have a look ;)
make links to old versions actually do something (currently link to the user guide as the other versions are not rebuild yet).
replaced lorem ipsum in news. still a draft but whatever.
nicer dates
Try to raise and test warnings.
DOC added website to whatsnew, added link to github for Nelle
FIX don't use old API in examples
more fixes for docs, deprecated interfaces
FIX made the building of the docs slightly more robust. readme files in folders without examples kill it otherwise.
try to fix the toctree in a semi-meaningful way.
DOC/EXAMPLES fix more documentation errors, deprecated api usages.
EXAMPLES remove non-existing example from doc, don't trigger deprecated interface in enet_path, lasso_path
much better input validation, test that warning is raised on (n_samples, 1) y
rearrange permutation_score parameters to match previous ones.
Arnaud Joly (121):
Typo
ENH multilabel metrics: accuracy, Hamming, 0-1 loss
DOC FIX floating point issue
FIX numpy 1.3 issues with multilabel metrics
ENH add normalize option to accuracy_score + FIX bug with 1d array
DOC return_path argument, prettier references
ENH more pythonic way to treat list of list of labels
ENH add jaccard similarity score metrics
FIX compatibility issue with np 1.3 py 2.6
ENH add multilabel support to PRF metric family
ENH remove pos_label argument with multilabel binary indicator format
ENH remove warnings at testing time
FIX unique_labels in corner case
FIX issue with comparable but different dtype
ENH don't allow mix of input multilabel format
ENH simpler check for mix of string and number input
COSMIT better name
Typo
ENH use type_of_target within unique_labels
ENH improve documentation with allowed label types
ENH check that we don't mix number and strings
Flatten label type checking
TST add smoke test for all supported format
COSMIT
PY3K use six.string_type
OPTIM + ENH simplify mix string and number check
FIX bug with indicator format
ENH use a comprehension over imap
@arjoly and @glouppe thanks their funding FNRS and DYSCO
ENH remove _is_1d and _check_1d_array thanks to @GaelVaroquaux
flake8
ENH raise ValueError with row vector if multilabel or multioutput is not supported
ENH being less permissive thanks to @jnothman
DOC add example is_multilabel
ENH handle properly row vector
Flake8
ENH better error message
FIX switch to the new format syntax
ENH prettier error message for _binary_clf_curve with bad input shape
ENH use ravel instead of atleast_1d and squeeze whenever possible
ENH coherently input checking for regression metrics
ENH dryer thanks to @jnothman
TST stronger test for _column_or_1d function
FIX ^ is a symmetric difference
MAINT Set random_state, modernize tests
TST max_features for more tree estimators
TST remove unused tests
ENH add missing pxd of utils.random
ENH Use file configuration
FIX signature
TST error message for _check_clf_target
COSMIT
FIX TST given cosmit
COSMIT don't need set
DOC explain the code
COSMIT product(..., repeat=2)
Update mailmap
DOC add missing datasets helper
ENH remove deprecated
ENH remove deprecated things (2)
Update what's thanks @NicolasTr
ENH add support for string input with classification metrics
ENH use the new format syntax
ENH remove inspect
COSMIT
Update what's new
DOC state that string is possible
TST with labels arguments
FIX what's new...
ENH remove bad examples
DOC let some example for prf metrics
ENH allows make_multilabel_classification to return label indicator f…
TST grid_search_cv works with multioutput data
TST cross_val_score with multioutput data
COSMIT
ENH consistency mse=> mean_squared_error ari => adjusted_rand_score
FIX docstring
Update what's new
DOC add missing links to the scorer and classification section
ENH add multioutput support to KNeighborsRegressor
ENH add multioutput support to RadiusNeighborsRegressor
ENH add multioutput support for KNeighborsClassifier
ENH add multioutput support to RadiusNeighborsClassifier
DOC + example with multioutput regression face completion for knn
ENH allows make_multilabel_classification to return label indicator format
ENH TST grid search with multioutput
ENH TST random search with multioutput data
DOC gridsearch support multioutput data
TST cross_val_score with multioutput data
DOC more information about which classifier support multilabel
DOC unveil that some estimators support multilabel classification and multioutput-multiclass classification
DOC overall improvements
pep8
DOC credit + fix typo + wording + use matplotlib.pyplot
ENH take @glouppe comments into account
FIX small title issue
DOC update what's knn and radius-nn support multioutput data
FIX bug in f_score with beta !=1
FIX formula inversion for sample-based precision/recall
FIX set same default behavior for precision, recall and f-score
ENH raise warning with ill define precision, recall and fscore
Backport assert_warns and assert_no_warnings from np 1.7
TST test warning + ENH Add warning average=samples
FIX TST with warnings thx to @jnothman
flake8
ENH set warning to stacklevel 2
TST silence warning
ENH use with np.errstate
DOC TST correct comment
FIX warning test
FIX warning tests in preprocessing
PY3K remove __pycache__ in make clean
FIX PY3K warning.catch_filter set record
DOC overall improvements in the multiclass documentation
DOC take into account @vene and @ogrisel + specify format for multioutput-multiclass
DOC rewording
Typo
DOC ENH take into account @NelleV comments
DOC more comments from @NelleV
DOC Remove deprecated reference + acknowledge @larsman
DOC Update what's new
Bastiaan van den Berg (1):
BUG allow outlier_label=0 in RadiusNeighborClassifier
Ben Root (7):
This should make the hungarian algorithm accept rectangular cost matrices. Also enabled the tests.
An additional check needed in case where there are fewer columns than rows.
Added support for hungarian assignment problems where one dimension of the cost function is zero-length.
Created an alternative hungarian solver for rectangular matrices that does not involve matrix padding.
hungarian() now returns a 2-D array of indices instead of a 1-D array. Also modified the find_permutations test to accommodate.
Some minor changes to docs, and small simplification in code.
Updating namespace usage from scikits.learn to sklearn
Benjamin Peterson (1):
ENH import six package for Py2/Py3 compat in a single codebase
Daniel Velkov (1):
Fix wrong argument name in RFECV docstring
Denis Engemann (27):
FIX transform tests
FIX: remove inplace mod
COSMITS
FIX: inverse transform + add mean_
COSMITS
FIX: syntax typo
FIX: tutorial
COSMITS + DOC
COSMITS
ENH: improve tutorial to be more clean.
ENH + FIX: remove inverse-t kwarg + fix mean_
FIX: address @agramfort 's comments
FIX: address remaining issues
ENH: speed up logcosh
ENH: improve ICA memory profile by 40%
ENH: add failing test exposing bug in RandomizedPCA
FIX: only center if copy == True
ENH: get it right.
FIX: inverse_transform; tests
DOC better doc message
API: get rid of **params in PCA estimators.
DOC: more doc string fixes in pca.py
DOC: more fixes in pca.py doc strings
STY: get rid of unnecessary identifiers
FIX: X.copy() test now works
STY: removing unnecessary import
COSMITS
Denton Cockburn (3):
DOC fix some docstring/parameter list mismatches
renamed weight to sample_weight in sklearn/isotonic.py
DOC missing stuff in randomized_l1 module
Diego Molla (2):
Minor bug fix in metrics.adjusted_rand_score
Added tests
Doug Coleman (15):
FIX: Cast floats to int before slicing in robust_covariance
BUG: Build random forests the same way regardless of n_jobs and add a test for this. Don't predict in parallel since the cost of copying memory in joblib outweighs the speedups for random forests. Fixes #1685.
COSMIT: Fix up a loop.
COSMIT: Better assert.
DOC: Update new magic numbers in docs since random forests train differently now.
FIX: sklearn.ensemble.forest: Refactor to remove references to parallelism in predict() functions.
BUG: Fix performance regression on large datasets in random forest.
DOC: Emphasize that n_jobs is for fit and predict methods in random forests.
BUG: Use Py_ssize_t to index into numpy arrays to help Python handle big data.
MISC: Update _tree.c with cython.
BUG: Use ``Py_ssize_t`` in a few more places for strides. Add the c file again.
DOC: Clarify docs on preprocessing.Binarizer.
FIX: Finish package rename from mst -> sparsetools. Fixes #2189.
DOC: Fix backwards docs on thresholds for preprocessing.
FIX: Newer numpy causes scipy to issue a DeprecationWarning. Ignore it. Fixes #2234.
Dougal Sutherland (3):
StratifiedKFold: remove pointless copy of labels
stochastic_gradient: fix mistake in _init_t docstring
stochastic_gradient: describe all losses, fix epsilon description
Eustache Diemert (35):
added first version of out-of-core example
revision round #1 (move to examples/applications, 1 file, auto-download dataset)
pep8 / pep257 compliant formatting
get rid of feature dicts, leverage HashingVectorizer class directly
plot as both a function of time and n_examples
using print() function
improve explanations on out-of-core learning paradigm
improve explanations on example structure
fixed use of docstrings + added section in whats_new.rst + added data dir to .gitignore
more robust data location
use same, separate held-out data to estimate accuracy after each mini-batch
added first version of out-of-core example
revision round #1 (move to examples/applications, 1 file, auto-download dataset)
pep8 / pep257 compliant formatting
get rid of feature dicts, leverage HashingVectorizer class directly
plot as both a function of time and n_examples
using print() function
improve explanations on out-of-core learning paradigm
improve explanations on example structure
fixed use of docstrings + added section in whats_new.rst + added data dir to .gitignore
more robust data location
use same, separate held-out data to estimate accuracy after each mini-batch
fixed conflict in whats_new.rst
Merge branch 'master' of git://github.com/scikit-learn/scikit-learn into out-of-core-examples
factorized instance extraction + plots
added note on test set creation rationale
cosmit : inline extract_instance
Merge branch 'master' of git://github.com/scikit-learn/scikit-learn into out-of-core-examples
more structured iteration using islice + wrappers; renamed chunk for minibatch as the latter seems more common in the literature
added sub section on out-of-core scaling in the narrative docs
Merge branch 'master' of git://github.com/scikit-learn/scikit-learn into out-of-core-examples
some more language corrections
more pep257 fixes (not for ReuterStreamReader as it is not really the interesting class here)
DOC recommend understanding NumPy in the tutorial
DOC expand feature selection docs with an example
Fabian Pedregosa (11):
Clarify docstring in lars_path
Update LIBSVM_CHANGES
Add SVD-based solver to ridge regression.
Remove unnecessary code in ridge svd
BUG: solver was not passed to computational method in Ridge object
Use Cholesky solver by default, but use SVD as fallback
Use ValueError for non-existent solvers
Merge pull request #1914 from fabianp/ridge_svd
Test for singular matrices in Ridge regression
Fix broken link to web designer
Fix broken link to web designer
Fazlul Shahriar (1):
DOC fix docstring typos in cluster/mean_shift_
Federico Vaggi (5):
Added test_regressor_pickle to tests.
Added test_classifiers_pickle to tests.
Finished adding pickle tests.
Removed the use of StringIO, using pickle.dumps instead.
cosmetic: Changed all instances of nonlinear to non-linear
Felix Brockherde (1):
FIX scores calculation in ovo multiclass
Félix-Antoine Fortin (1):
DOC/FIX affinity_propagation damping default value.
Gael Varoquaux (93):
DOC: typo in warning
BUG: reassignment_ratio == 0 in MiniBatchKmeans
BUG: sparse center reassignment MiniBatchKMeans
BUG: sparse vs non sparse centers
BUG: fix test to use sparse array
DOC: reference for discretise option
COSMIT :: in rst is easier for syntax highlighters
DOC: minor formatting in model_evaluation.rst
DOC: minor rst issues
DOC: misc rst formatting
COSMIT: prettify code and figure in example
COSMIT
Merge branch 'treeweights'
Merge pull request #1656 from rlmv/idf_diag
BUG: update joblib to 0.7.0d
TST: add a test for empty reassignment in MBKmeans
BUG: highly-degenerate roc curves
BUG: fix change of behavior in last commit
DOC: add example and ref to lars_path in lasso_path
BUG: ElasticNetCV choosing improper l1_ratio
ENH: minor changes for numpy versions
DOC: remove typo
DOC: libatlas3-base in requirement
ENH: Avoid computations in ElasticNetCV
ENH: improve memory usage in ElasticNetCV
DOC: docstring of private functions
BUG: fix sparse support in ElasticNetCV
COSMIT: address @agramfort's comments
DOC add 2012 GSOC students
COSMIT: labels in plot_lasso_coordinate_descent_path
COSMIT: txt -> rst
DOC: cosmit - fix latex typo
ENH: avoid MemoryError on manhattan_distances
BUG: old versions of numpy
BUG: old versions of numpy
MISC: details about the donations
BUG: type conversion in spectral_embedding
MISC: remove unused imports
BUG: restore Python 2.6
COSMIT: two empty lines between functions
Merge branch 'pr_1732'
BUG: fix sparsetools tests in old scipy
PEP8
Cosmit
Merge branch 'pr_2002'
BUG: fix unsafe casting
DOC: improve RBM example
MISC: remove unnecessary dtype
ENH: better error message on scoring
DOC: reorganize model_evaluation
MISC: address comments and test failure
DOC: address remarks by @NelleV
DOC: Address @larsman's comments
DOC: @amueller's comments
ENH: Add the hungarian algorithm
TEST: Increase testing of hungarian
MISC: cosmit in hungarian
ENH: Speed up in hungarian
ENH: More speedups in hungarian
ENH: More speedups in hungarian
ENH: Still more speed ups in Hungarian
ENH: More speedups on Hungarian
API: scikits.learn -> sklearn
BUG: fix some numpy 1.3 compat issue
BUG: numpy 1.6 compat
:
BUG: fix kde tests
MAINT: update copy_joblib script
ENH: update joblib to 0.7.1
MAINT: misc change to copy_joblib
ENH: make bdist_rpm work
COMPAT: empty_like does not have a dtype in np 1.3
COMPAT: fix arpack and pls on old scipy/numpy
COMPAT: string formatting syntax in Py 2.6
COMPAT: median and nans in old numpys
COMPAT: no assert_warns in np 1.3
BUG: fix Py 3
DOC: invert priorities bootstrap <-> nature.css
DOC: sidebar lighter
ENH: add a new DataConversionWarning
MISC: fix plot_multilabel example
BUG: implement concrete __init__ for SGDRegressor
BUG: tests were raising the DataConversionWarning
Merge branch 'pr_2304'
MAINT: recompile Cython files
DOC: add whats_new on the news
TST: adjust test relying on change order
MISC: deprecate balance_weights (it's internal)
REL: 0.14a1 Release candidate for 0.14
MISC: update whats_new
MISC: fix reference to example
DOC: DBSCAN misc doc formatting
DOC: also point installation menu to stable
Gilles Louppe (275):
ENH: weighted r2 score for regression
COSMITs
ENH: Added balance_weights
ENH: added some tests
FIX: test_oob_score_regression
FIX: compute weighted oob scores
FIX: NaN problem + Added some tests
TEST: added some more tests
EXAMPLE: simplify n_estimators and n_samples
TEST: importances
TEST: multi-output problems
ENH: WeightedClassifier/Regressor mixins
DOC
FIX: drop support for multi-output
TEST: errors
ENH: staged_score
EXAMPLE: reduce the number of samples
EXAMPLE: merge plot_adaboost_iris into plot_forest_iris
EXAMPLE: drop plot_adaboost_quantiles
FIX: move balance_weights into preprocessing
PEP8 + PyFlakes
FIX: broken test
FIX: one more bug
FIX: remove prints
DOC: edited some docstrings
DOC: added references into classes.rst
ENH: rename boost method to _boost
DOC: cosmits + narrative documentation (begin)
DOC: proper citations
DOC
TEST: make test_importances more stable
DOC: narrative documentation
DOC: What's new
TEST: base_estimator
DOC: classes_ and n_classes_
DOC: put docstrings into subclasses to make them appear in the documentation
DOC + Better default parameter values
DOC: cosmits
DOC: typo
PEP8 and DOC
ENH: use shuffle
Roll back some changes
Roll back some changes (2)
Merge branch 'master' of github.com:scikit-learn/scikit-learn into adaboost
FIX: broken test
FIX: @amueller comments
Cosmits, code structure and tests
EXAMPLE: better plot_adaboost_regression
Revert changes on plot_adaboost_error.py
ENH: set default parameter values
Cleanup
EXAMPLE: give plot_adaboost_classification some love
DOC: narrative documentation
Merge branch 'master' of github.com:scikit-learn/scikit-learn into adaboost
Merge branch 'master' of github.com:scikit-learn/scikit-learn into adaboost
FIX: some nitpicks
ENH: remove boost_method parameter and use a string as switch
ENH: weights_ -> estimator_weights_
FIX: pprett comments
DOC: Added a References section in _samme_proba
COSMIT: flake8
ENH: weight -> estimator_weight
ENH: weight -> estimator_weight (2)
ENH: weight -> estimator_weight (3)
EXAMPLE: better x-axis label
EXAMPLE (2)
FIX: make_hastie_10_2 reference docstring
DOC: add a short dataset description in hastie example
DOC: narrative documentation
FIX: doctest
EXAMPLE: add AdaBoost to plot_classifier_comparison
FIX: some of Gael comments
What's new: Adaboost
Remove compute_importances parameter
What's new
ENH: Remove compute_importances in AdaBoost
ENH: Update feature_importances in GBRT
ENH: remove "mse" method and simplify
COSMIT
DOC: feature importances
Merge pull request #1657 from glouppe/feature-importances
DOC: add balance_weights to reference
EXAMPLE: compute_importances=True is no longer required (1)
EXAMPLE: compute_importances=True is no longer required (2)
DOC: narrative documentation on feature importances
ENH: precompute X_argsorted when possible
DOC: X_argsorted
Flake8
ENH: use isinstance instead
Merge pull request #1668 from glouppe/adaboost-tree
Merge pull request #1700 from erg/rf
FIX: use DOUBLE_t type
Merge pull request #1705 from glouppe/tree-fix
ENH: support float value for max_features
DOC: if float, then max_features is a percentage
ENH: Defer parameter checking of trees
DOC: GBRT max_features
TEST: added test
ENH: use numbers
FIX: numpy integers
PEP8
Merge pull request #1712 from glouppe/tree-maxfeatures
What's new: float values support for max_features
What's new: fix indentation
Merge pull request #1816 from ndawe/master
Merge pull request #1823 from erg/issue-1466
Merge pull request #1852 from slattarini/typofixes
ENH: moved export_graphviz to sklearn/tree/export.py
ENH: add max_depth to export_graphviz
ENH: output criterion name instead of "error" in export_graphviz
Merge pull request #1998 from kgeis/fix-setup-instruction
Merge pull request #2031 from jnothman/tree_comments
WIP: new Cython interface for decision trees
WIP: comments on the Cython interface
WIP: Criterion interface and base class
WIP: ClassificationCriterion (reset, update)
WIP: Gini criterion
WIP: entropy criterion
WIP: remove n_left and n_right attributes
WIP: MSE criterion
WIP: tree class
WIP: tree algorithm
WIP: add_node
WIP: node_value
WIP: node_value
WIP: predict + apply
WIP: Random Splitter
WIP: splitter
WIP: Best Splitter
WIP: sort features
WIP: first pass on tree.py
WIP: some debug
WIP: some more debug
WIP: debug in progress...
WIP: debug (tests still don't pass...)
WIP: one more bug fixed
WIP: cleanup
WIP: one more test fixed
WIP: more bugs fixed :)
WIP: 19 tests passed
WIP: test_tree.py now passes \o/
Cleanup
WIP: feature importances
WIP: discard samples with weight = 0
WIP: fix export functions
Cleanup
WIP: first pass on ensembles
WIP: use heapsort
WIP: small optimization to heapsort
WIP: remove asserts
WIP: use C-based random number generator
WIP: set n_classes as ndarray
FIX: fix test_random_hasher
WIP: fix adaboost
WIP: small optim to regression criterion
WIP: optimize tree construction procedure
WIP: optimization of the tree construction procedure
cleanup
recompile _tree.pyx
FIX: export_graphviz test
FIX: set random_state in adaboost
FIX: doctests
FIX: doctests in partial_dependence
FIX: feature_selection doctest
FIX: feature_selection doctest (bis)
WIP: allow Splitter objects to be passed in constructors
FIX
Some PEP8 / Flake8
Small optimization to RandomSplitter
FIX: fix RandomSplitter
Cosmit
FIX: free old structures
WIP: Added BreimanSplitter
WIP: small optimizations
WIP: fix BreimanSplitter
Cleanup
WIP: optimize swaps
Regenerate _tree.c
WIP: some optimizations to criteria
WIP: add -O3 to setup.py
WIP: normalize option for compute_feature_importances
WIP: Added deprecations in tree.py
WIP: updated documentation in tree.py
WIP: added deprecations in forest.py
WIP: updated documentation
WIP: unroll loops
WIP: setup.py
WIP: make sort a function, not a method
WIP: Cleaner Splitter interface
WIP: even cleaner splitter interface
WIP: some optimization in criteria
WIP: remove some left-out comments
WIP: declare weighted_n_node_samples
WIP: better swaps
WIP: remove BreimanSplitter
WIP: small optimization to predict
WIP: catch ValueError only
WIP: added some documentation details in _tree.pxd
WIP: PEP8 a few things
Benchmark: use default values in forests
WIP: remove irrelevant and unstable doctests
WIP: address @ogrisel comments
WIP: address @ogrisel comments (2)
WIP: remove partition_features
WIP: style in _tree.pyx
WIP: make resize a private method, improve docstring
WIP: use re-entrant rand_r
FIX: doctest in partial_dependence
WIP: break or shorten some long lines
FIX: doctest in feature_selection
WIP: break one-liner if statements
WIP: revert use of rand_r
FIX: broken tests based on rng
DOC: update header in rand_r.c
TEST: skip test in feature_selection (too unstable)
FIX: one more doctest
WIP: Faster predictions if n_outputs==1
WIP: Break comments on new line
WIP: make criteria nogil ready
WIP: enforce contiguous arrays to optimize construction
WIP: avoid data conversion in AdaBoost
WIP: use np.ascontiguousarray instead of array2d
TEST: add test_memory_layout
FIX: broken test
WIP: Make trees and forests support string labels
WIP: refactor some code in forest.fit
TEST: skip doctest in feature_selection (unstable)
WIP: better check inputs
WIP: check inputs for gbrt
Merge pull request #2131 from glouppe/trees-v2
What's new: new implementation for trees
FIX: remove debug message
FIX: remove -funroll-all-loops
FIX: ur strings are not supported in Python 3.3
DOC: some documentation for the Tree Cython structure
Merge pull request #2216 from glouppe/tree-doc
Benchmark: use specified dtype
TEST: cosmit on err_msg
Raise an exception if rows are full of missing values
FIX: doctest
Better error message
FIX: use range instead of xrange
FIX: imputation example
Merge pull request #2241 from arjoly/grid-cv-multioutput
Merge pull request #2262 from NicolasTr/fix_statistics
FIX: remove blank lines
Use epsilon=1e-7
FIX: partial dependence test
TEST: skip test_oob_multilcass_iris for now
Merge pull request #2277 from glouppe/tree-fix-32bits
COSMIT: typo in examples/imputation.py
Mr. Proper, act 1
Banner improvements
Banner style
Boxes on front page
Load bootstrap first
FIX: footer character encoding
CSS tweaks
CSS tweaks (2)
Lower part of the index
CSS tweaks
More css tweaks
Better alignment in the sidebar
CSS tweaks
More css kungfu
CSS stuff
Remove testimonials for now
CSS tweaks
Donate button + citing
Enhance contrasts
Contributin
Remove toc on the API page (it is already in the sidebar)
FIX: sidebar.js
Move Google javascript near </body>
FIX: remove dupplicate entry in What's new
Harikrishnan S (1):
DOC/FIX twenty_newsgroups.rst should use TfidfVectorizer
Hrishikesh Huilgolkar (7):
chi2 and additive_chi2 raise error if input are sparse matrices
Added same for additive_chi2_kernel
Fixed pep8 issues
pairwise_distance_functions renamed to PAIRWISE_DISTANCE_FUNCTIONS
Made more changes renamed pairwise_kernel_functions, kernel_params to allcaps
Added test for fit_transform(X)==fit(X).transform(X)
Fixed pep8 issues
Ian Ozsvald (3):
clearer decision surface plots and classifier final predictions for the ensembles
improved formatting
updated docs to fix formatting errors
Imran Haque (2):
ENH Release GIL when entering LibSVM/Liblinear code
Release GIL around sparse liblinear training
Jack Hale (1):
WMinkowskiDistance corrections to error messages and docstring
Jake VanderPlas (75):
BUG update graph_laplacian to upstream SciPy version
Ball Tree, KD Tree, and tests
Fix tests for scipy <= 0.9
speed up KD tree construction by ~25%
add author & license information to pyx files
add median of 3 pivoting to quicksort
add pydist code
fix binary tree sort bug
add pydist: user-defined metric
add haversine distance
add exception passing to C functions
rename dist conversion funcs
Implement correct d-dimensional kernel norms
add metric mappings to dist_metrics
binary tree: make valid_metrics a class variable
dist_metrics: allow callable metric
add chebyshev distance to kd tree
add functionality to NearestNeighbors estimators
Roger-Stanimoto -> Rogers-Tanimoto
calculate kernel norm only once
compute kernel norm only once
TST: compare gaussian KDE against scipy version
Change dual splits to single splits in query_dual
Merge pull request #7 from jhale/new_ball_tree
add notes on implementation details to binary_tree.pxi
remove scipy cKDTree support from neighbors
add neighbors module changes to whats_new
Merge pull request #2104 from kastnerkyle/master
BUG: fix precision issues in kernel_density; remove buggy dual-tree KDE versions
add KDE Estimator class
add kwargs to PyFuncDistance
DOC: document the new neighbors functions & KDE
undo change to clustering example
fix conflicts with master
import KernelDensity from neighbors module
adjust math formatting in neighbors docs
fix NearestNeighbors to pass common tests
add KernelDensity to class list
set random seed in KDE example
skip KDE test to prevent failure due to older SciPy versions
fix typo: SkipTe -> SkipTest
fix doctest in neighbors
BUG: return proper algorithm in KDE
add species KDE example
PEP8: neighbors module
DOC: rearrange KDE examples
TST: increase test coverage in neighbors module
DOC: pep8 & formatting in neighbors docs
DOC: make doc tests pass
add 1D KDE example
DOC: small fixes to neighbors doc
DOC: move KDE discussion to separate page
add some notes and doc strings to neighbors cython code
add more documentation to ball tree and kd tree
DOC: tweak kde examples and move density docs
BUG: fix tophat sampling in KDE
Xplot -> X_plot
bt->tree; dm->dist_metric
Additional implementation notes in binary tree
BUG: use correct algorithm for callable metric
TST: set random state in callable_metric test
BUG: add new preprocessing module to setup.py
Merge pull request #2264 from jakevdp/setup_fix
neighbors numpy1.3 compat: fix typedefs, regen with cython 0.19
numpy 1.3 compat: use explicit type definitions
numpy 1.3 compat: make neighbors/dist_metrics compatible
COMPAT: make NeighborsHeap compatible with numpy 1.3
COMPAT: make NodeHeap compatible with numpy 1.3
COMPAT: make BinaryTree class compatible with numpy 1.3
COMPAT: make BallTree & KDTree compatible with numpy 1.3
COMPAT: last few BallTree/KDTree numpy 1.3 issues
BUG: type->dtype in a cross-platform way
compute offset in a cross-platform way
BUG: don't subtract offset in binary_tree
add explicit types to neighbors cython code
JakeMick (1):
TST added test of fit and transform for kernels for nystroem
James McDermott (1):
DOC rename lambda to alpha in plot_lasso_coordinate_descent_path. (Re)-Closes #903.
Jaques Grobler (103):
remove the equaldistance code warning, replace with doc warnings
typo fix
remove warning
warning removal
update warning box
deprecation warnings, indent fix
andys suggestions and test
add warning for no internet
Merge pull request #1644 from jaquesgrobler/doc_url_error
TYPO fix
example title change
gallery effects,icon change,cleanups
typo fix and heading changes
fix indentation error-cause lots of build warnings
4 thumbs per row/hover effect/some cleanup
fix for iris dataset
line_count sort added, some changes reverted
move comment out of list
remove comment, undo change
Merge pull request #1803 from kmike/hmm
rename example title
Switch off survey banner
newline at end of file
Merge pull request #1581 from jaquesgrobler/example_gallery_cleanup
temp disable line-count-sort for gallery while fixing bug
sort-by-line-count bug fixed
Merge branch 'master' of github.com:scikit-learn/scikit-learn
fix numbering for tutorials page
Add bit more instruction on writing docs
big O/tilde add in
removed old complexity info
image and html file added
link fixes
add further links
last links fixed
jquerys added
intigrated to tutorial index
update tutorial page
make links relative
rename image/html
add instructions for editing Readme, and script needed for that
remove svg2html script,toctree section added,doc page for ml_map created
sidebar added
layout fixes and top paragraph
TYPO fix
update what`s new
deleted unnecessary thumbnail
DOC improve description of cross validation
resized image
disable sidebar using cookies to remember last position
COSMIT pep8
Merge pull request #1884 from jaquesgrobler/ml_map
DOC added link to scipy lecture notes to tuts
Merge pull request #1924 from jaquesgrobler/FIX_sidebar_on_index_page
Merge pull request #1911 from Jim-Holmstroem/generalize_label_type_for_confusion_matrix
Merge pull request #1944 from jnothman/selectpercentile_limit_bug
fixed typo
maintenance scripts added for machine learning maps - needed for modifying the map in future
DOC Fix references to missing examples
fix incorrect reference
Merge pull request #1986 from jaquesgrobler/DOC_reference_fixes
add optional banner to index page to advertise code sprints
link updated
Merge pull request #1996 from jaquesgrobler/DOC_sprint_sponser_banner
hover removed from nature, jquery more recent version, containerexpansion on mouseover add
image resizing added
Zoom bug fixed
added docstring space to popup block
docstrings embedded into example hovers
Final visual effects added to hovering
Nelle`s review fixes addressed
Cross browser shadows covered
remove forgotten print
shorten displayed dosctring to 95 chars
fix white space inconsistency between header and docstring
example docstring fixes
logistic regresion example fix
Merge pull request #2056 from jnothman/leavepout_clarify
firefox bug fixed
classifiers comparison fix
DOC spellfixes
Donate buttons added `About us` and front page
donations paragraphs added
Merge branch 'master' of github.com:scikit-learn/scikit-learn
misalignment fix
example fixes to clean first docstring paragraph of rst code
fix merge conflict
border added for IE
make new classes for lasso_path/enet_path and deprecate old
rel_canonical prelim
Merge branch 'master' into ENH_docstrings_in_gallery
syntax fix
cleaned up-ready
Merge pull request #2017 from jaquesgrobler/ENH_docstrings_in_gallery
Small docstring changes for plot_ward_structured_vs_unstructered example, as mentioned in PR #2017
nitpick fixes, pep8 and fix math equations
removed old_version block test
Merge pull request #2205 from jaquesgrobler/ENH_rel_canonical
sidebar fix - sidebar.js was called before jquery. works fine under new version jquery too
sidebar/toctree harmonie, must still fix toggle
jquery reverted to 1.7.2 version. sidebar/toc-collapse works
DOC: few small doc fixes to layout bugs on new website
comments added to the changes
Jim Holmström (10):
Added random_state=0 for AdaBoostRegressor
Replaced 'for i' with 'for _' at place where i is not used.
Extended test_confusion_matrix_binary to incorporate non-integer labels
Extended test_confusion_matrix_multiclass to incorporate non-integer labels
BUG: Fix for non-integer datatypes in confusion_matrix
ENH: faster preallocation and integer type for the accumulators
STY: one-lined lines that where less than 79
MAINT: let the result type be infered by coo_matrix, possible since np.ones already integer typed
MAINT: refactored metrics.auc to use np.trapz
ENH: Added input checks in confusion_matrix
Jochen Wersdörfer (2):
ENH CountVectorizer using arrays instead of lists
ENH added multiclass_log_loss metric
Joel Nothman (79):
Fix comment: returns fbeta_score, not f1_score
ENH allow SelectKBest to select all features in a parameter search
DOC Allowing a list of param_grids means GridSearchCV is more than grids
DOC clarify relationship between pos_label and average parameters for
ENH/FIX make best_estimator_'s predict functions available in parameter search
FIX make *SearchCV picklable
REFACTOR combine train_wrap and csr_train_wrap
ENH call asarray on returned scores and pvalues
TST ensure SelectKBest and SelectPercentile scores are best
FIX ensure SelectPercentile only removes tied features in case of ties
ENH _BaseFilter.inverse_transform should respect dtype
DOC Fix comment for _BaseFilter.inverse_transform
ENH sparse _BaseFilter.inverse_transform
FIXTST fix errors introduced to feature selection tests
DOC comment feature selection sparse inverse_transform
Merge pull request #1935 from jnothman/base_filter_inv_transform
ENH Feature selection should use CSC matrices
COSMIT Remove redundant code in CountVectorizer
TST test CountVectorizer.stop_words_ value
ENH Use csr_matrix.sum_duplicates instead of tocoo
DOC small typographical fixes in grid_search documentation
COSMIT refactor roc_curve and precision_recall_curve
FIX bug where hinge_loss(..., neg_label=1) produced incorrect results
Merge pull request #1880 from NicolasTr/patch_extractor_float_max_patches
DOC Fix estimator unsupervised fit method signature
DOC clarification of parameter search
DOC fix typos
COSMIT shorten long line for pep8
ENH Create FeatureSelectionMixin for shared [inverse_]transform code
DOC rewrite descriptions of P/R/F averages and define support
DOC/COSMIT fix typos in What's New
DOC add some contributions to What's New
TST Use assert_almost_equal in test_symmetry
COSMIT prefer partial over lambda in test_metrics
TSTFIX use name, not metric, in test_metrics error messages
DOC correct note on handling 0-denominator in P/R/F
Merge pull request #2005 from kmike/test_pipeline_methods_preprocessing_svm
ENH faster unique_labels for big sequences of sequences
DOC explain labels parameter to confusion_matrix
DOC Detail on parent-child relationship in tree
FIX/COSMIT helper to identify target types
FIX cannot use set notation for Py2.6
FIX need explicit dtype for array of sequences in numpy 1.3
COSMIT remove redundant target size check
FIX numpy 1.3 has no float16; use float32
FIX/TST np.squeeze in numpy1.3 fails with array of sequences
FIX numpy 1.3 throws error with array of arrays
FIX use Python 2.6-compatible str.format
COSMIT refactor cross-validation strategies
Include LeavePLabelOut in refactoring
A further refactor
COSMIT Base class for KFold/StratifiedKFold validation
COSMIT make BaseKFold abstract
COSMIT pep8 in cross_val_score
COSMIT Base class for [Stratified]ShuffleSplit
DOC clarify LeavePOut's combinatoric explosion
DOC similar note in narrative docs
DOC More explicit note
DOC fix docstring headings
COSMIT make helpers private with underscore
COSMIT make BaseKFold private with underscore
TST additional tests for preprocessing.Binarizer
COSMIT add underscore prefixes where forgotten in cross_validation
COSMIT much simpler agglomeration inverse_transform
TST stronger test for agglomeration transforms
DOC minor fixes to Ward docstrings
DOC fix docstrings for AgglomerationTransform
DOC detail Ward.children_ and fix n_components_ type
DOC comment on Ward algorithm
DOC clean pooling_func arg type
DOC copy comment describing hierarchical clustering children
Merge pull request #2054 from ogrisel/invalid-n-folds
FIX avoid spectral_embedding naming conflict
Merge pull request #2085 from agramfort/fix_y_score_fa
Merge pull request #2090 from kanielc/fix_weight
COSMIT move deprecated parameter to end
COSMIT refactor document frequency implementations
ENH print number of fits in BaseSearchCV._fit
DOC fix comment on svm probability param
Johannes Schönberger (3):
Remove invalid todo comment
Add missing doc string printing for examples
DOC : fixes in covariance module
John Benediktsson (1):
COSMIT: fix excessive indentation.
John Zwinck (1):
FIX use float64 in metrics.r2_score() to prevent overflow
Joshua Vredevoogd (1):
DBSCAN BallTree implementation
Justin Pati (1):
changed warnings in grid_search.py related to loss_func and score_func being passed
Justin Vincent (19):
PY3 xrange, np.divide, string.uppsercase, None comparison
TST + PY3 various fixes
Got all the doc-tests working
Merge in master
More python3 fixes (and just plain bugs)
use ELLIPSIS in doctest to deal with numpy changes.
Forcing the deprecation warnings to happen while in get_params.
Force warning to be heeded in deprecated args check. Possibly fixed a test bug (but maybe I just got it wrong)
Make a test not dictionary order dependent.
Fix up last doc tests.
Make the fixes 2.6 compatible
ELLIPSIS around a unicode issue.
Fix y vector. We wanted round off division so that y == [0 0 1 1 2 2 ...], not [0 .5 1 1.5...]
A little more of those unicode helpers
Another ELLIPSIS
Pop off the recently added filter after testing for deprecation warnings.
merge in origin
Comment change
Fix two remaining python3 bugs.
Kemal Eren (99):
ridge regression uses compute_class_weight()
Re-add deprecated class_weight parameter.
removed class_weight parameter from RidgeClassifier.fit()
check_pairwise_arrays() preserves dtype==numpy.float32
implement spectral biclustering and spectral co-clustering
wrote tests
wrote methods for generating bicluster data
added option to return piecewise vectors
cast data in fit()
made internal functions private
use random state in test
removed pickle test
shorten first lines of test docstrings
use random state in preprocess tests
duck typing, minor corrections: spacing and typos
fixed exceptions and their messages
updated svd()
better array validation
use random state in data generator
tests reuse data generators
user may select svd method
Added to docstring
split spectral biclustering into two classes
removed unused code
test bad arguments
now supports sparse data
check n_clusters parameter more thoroughly
made base class an abstract class
checkerboard panels may have arbitary values.
fixed exception type
removed empty mixin
started biclustering documentation and examples
shorter array slicing
made some methods into private methods
cleaner use of check_arrays()
named arguments
use safe_sparse_dot()
use np.random.RandomState directly
do not do any checks during __init__()
do not use mutable default arguments
added new tests for sample data generators
fixed bug in make_checkerboard(), so tests pass again
use assert_all_finite
skip permutation test for now
fixed some errors reported by pyflakes
raise exception instead of converting sparse arrays to dense
expanded biclustering documentation
corrected k_means in docstring
rearranged imports from general to specific
moved and renamed _make_nonnegative() and _safe_min()
added option to use mini-batch k-means
use dia_matrix
renamed 'preprocess' to 'normalize'
use sklearn.utils.extmath.norm
base class __init__ is no longer abstract
added more information to error messages
also use norm in _project_and_cluster()
make test more sparse
made 'bicluster' a submodule of 'cluster'
removed svd_kwargs argument
added n_svd_vecs parameter
tests use ParameterGrid to avoid deep nesting
replaced kmeans_kwargs with some useful k-means parameters
updated documentation
keep biclustering algorithms in submodule
renamed examples; added to example docstrings
re-added bicluster mixin, this time with some functionality
wrote newsgroup biclustering example
fixed a few things in examples, documentation, and docstrings
wrote bicluster scoring using jaccard index and hungarian matching
removed some parameters to speed up test
added default arguments to base class's__init__ to make test pass
test_make_checkerboard was wrong after api change
added documentation for bicluster evaluation
moved shuffle functionality to utility function
added consensus score to bicluster examples
renamed example to get output to work
made bicluster utilities for dealing with indicator vectors
index in one go. added sparse test.
documentation and docstring fixes
merged newsgroup example with Vlad's
moved bicluster examples to their own category
reduced noise in spectral coclustering example
updated newsgroups example
added n_discard parameter to _svd()
check value of n_components and n_best
a fix for nan values in singular vectors.
wrote tests to ensure svd works on perfect checkerboard
redundant phrase in docstring
put biclustering section after clustering section in reference
misc. fixes
changes to newsgroups example:
fixed some docstrings: backticks and missing parameters
updated setup.py
added myself to authors; added biclustering to whats new
examples use matplotlib.pyplot instead of pylab
consistency changes:
removed plot_ from newsgroups example file
import biclustering methods in sklearn.cluster and sklearn.metrics.cluster
Ken Geis (4):
Changed the setup instructions in the README to properly install the package in the user home.
FIX mbkmeans benchmark bug (k instead of n_clusters)
FIX off-by-one error in neighbors benchmark
ENH lots of benchmarks fixes
Kevin Hughes (1):
ENH actually use scikit-learn's PCA class in plot_pca_3d.py
Kyle Kastner (6):
Removed pl.axis('tight') and set the plot limits with pl.xlim(), pl.ylim(). pl.axis('tight') appears to be adding whitespace around the colormesh
Added decision_function support to OneVsRestClassifier and a test, test_ovr_single_label_decision_function, in test_multiclass.py
Updated fixes for #2012.
Strengthened tests for OneVsRestClassifier decision_function
Cleaned up tests, and removed unused multilabel parameter in decision_function_ovr
Inlined extraneous function call from decision_function and added a check that the base estimator has a decision_function attribute
Lars Buitinck (244):
COSMIT rm deprecated svm.sparse module
COSMIT rm deprecated attrs from [LQ]DA
BUG last references to svm.sparse
COSMIT rm deprecated stuff
BUG fix failing doctest
BUG one more failing doctest
BUG move label_ from BaseLibSVM to BaseSVC
COSMIT decouple regression and classification in SVMs
BUG in RadiusNeighborClassifier outlier handling
Merge pull request #1576 from mrorii/fix_kneighbors
ENH rewrite radius-NN classifier's outlier handling
COSMIT translate lgamma replacement to C and clean it up
COSMIT add lgamma to gitattributes
DOC update SMART notation in TfidfTransformer docs
P3K: use print as a function in the examples
ENH refactor univariate feature selection
P3K use six.string_types and six.PY3
P3K one more iteritems
COSMIT rm Python 2.5 and Jython compat from six
BUG fix import problem in preprocessing
P3K StringIO vs BytesIO
DOC fix failing doctest due to unicode_literals
DOC whitespace in doctest
BUG revert P3K changes that broke mldata tests
rm gender classification example
P3K death to the print statement
P3K fix broken doctest and add forgotten print_function import
DOC no more need for compute_importances in trees
DOC copyedit FeatureHasher narrative
ENH move covtype loading to sklearn.datasets
TST covertype loader
DOC copyedit FeatureHasher narrative further
P3K range vs. xrange
Merge pull request #1524 from amueller/break_ovo_ties
DOC pretty math in kernel docstrings
BUG MinMaxScaler missing from preprocessing.__all__
BUG in KernelPCA: wrong default value for gamma
Merge pull request #1688 from hrishikeshio/fit_transform
ENH speed up RBFSampler by ~10%
BUG oops, removed validation by accident
BUG fix broken grid search example
COSMIT update mailmap
ENH sparsify method for L1-reg linear models
DOC developer guidelines for unit tests and classes_
DOC dev guide: random_state_ + @amueller's remarks
DOC r2_score may return negative values
Merge branch 'sparse-coef'
COSMIT callable instead of hasattr __call__
DOC rm failing doctest on graph_laplacian
DOC fix text vectorizer docs and add NLTK example
DOC fix broken doctests for feature_extraction.text
BUG restore empty vocabulary exc in CountVectorizer
ENH prevent copying of indices in CountVectorizer
DOC credit @ephes
Merge pull request #1713 from larsmans/vectorizer-memory-use
COSMIT use callable instead of hasattr
Merge pull request #1727 from amueller/min_max_scaler_fix
BUG broke the what's new while rebasing
ENH set min_df in fe.text back to 1
TST compute_class_weight in utils
FIX + TST + DOC compute_class_weight
ENH use bincount in compute_class_weight
BUG use fixes.unique
BUG in SVM tests
BUG fix compute_class_weights issue in SGD
Merge pull request #1753 from NelleV/FIX
P3K some more fixes in random places
DOC OpenBLAS is more dangerous than I thought
DOC oops, typo
COSMIT get rid of undocumented attributes on SVMs
PEP8 and allow non-bool truth values in CD
BUG + ENH: removal of components in kernel PCA
Merge pull request #1758 from larsmans/kernelpca-fix
P3K make feature_extraction.text work
BUG failing doctest
DOC IsotonicRegression wasn't in the changelog at all
P3K all of feature_extraction passes tests on Py2 and 3
DOC clarify column ordering in SVC scores
COSMIT DictVectorizer.inverse_transform readability
DOC CountVectorizer does NOT do stopword filtering by default
ENH don't recompute distances in MBKMeans
ENH cut MiniBatchKMeans memory usage in half for large n_clusters
DOC installation instructions: MacPorts, fix types, stdeb instructions
Merge pull request #1773 from jnothman/prf_docstring
BUG StandardScaler would ignore with_std for CSR input
BUG SGDClassifier and friends did not forget labels_ in re-fit
DOC clarify C parameter on LogisticRegression
TST + DOC + COSMIT refactor ParameterGrid and test it
ENH len on ParameterGrid and ParameterSampler
BUG deprecation of grid_scores_ in GridSearchCV
BUG always do cross-validation in GridSearchCV
DOC fix clone and get_params documentation
TST grid search/randomized search on non-BaseEstimator
TST actual sparse input in sparse k-NN tests
COSMIT prevent a copy in randomized LR
TST speed up comment tests by ~20%
TST radius-neighbors regression test not entirely stable
BUG additive_chi2 missing in KERNEL_PARAMS
BUG + DOC fix Nystroem for other kernels than RBF
COSMIT rm repetitive __main__ blocks from tests
ENH allow additional kernels on KernelPCA
TST fix broken doctest
P3K developer docs
Merge branch 'pr/1790' -- Python 3 support from PyCon sprint
Merge pull request #1812 from kmike/testing-fixes
DOC describe SVM probability calibration (and advise against it)
DOC further comments on SVM probabilities
ENH multiclass probability estimates for SGDClassifier
BUG digits grid search was passing cv to the wrong method
DOC typos in grid search docstrings
PY3 + TST decouple test_metrics from random module
Merge pull request #1836 from kmike/master
DOC distributions produced by hashing trick depend on input
DOC multiclass: typo and use case
DOC PR means pull request
FIX BytesIO and urllib usage in fetch_olivetti_faces
DOC I didn't mean soft-O by "tilde notation"
DOC describe API, not internals, for AdaBoost
DOC replace "arithmetical order" in AdaBoost docs
TST strengthen AdaBoost tests
FIX SVR complaining about a single class in the input
COSMIT do np.unique(y) once in SVC
DOC rewrite description of k-fold CV
mailmap entry for @lqdc
DOC define validation before cross validation
DOC typos in cross-validation description
clean up mailmap/deduplicate contributors
BUG disable memory-blowing SVD for sparse input in RidgeCV
FIX DictVectorizer behavior on empty X and empty samples
TST + DOC AdaBoostClassifier.predict_proba fix
COSMIT refactor AdaBoost code
ignore PDFs
ENH speed up sklearn.feature_selection.chi2
DOC dependency installation with yum (Red Hat, CentOS)
FIX bug (swapped args) in chi2
FIX yet another chi2 bug
ENH add latent semantic analysis/sparse truncated SVD
ENH use rnd SVD in TruncatedSVD by default for speed
COSMIT omit unused parameter/return value in svd_flip
TST strengthen TruncatedSVD tests
DOC + MAINT deprecate RandomizedPCA scipy.sparse support
FIX and link LSA clustering example
DOC explain normalization in LSA KMeans example
Merge pull request #1716 from larsmans/truncated-svd
FIX metrics/scoring bug with LeaveOneOut CV
MAINT remove deprecated gprime handling from FastICA + refactoring
Merge pull request #2067 from jnothman/test_binarizer
DOC no more mention of the Bunch in the narrative docs
FIX don't rely on Bunch behavior with fetch_covtype
DOC fix some docstring/parameter list mismatches
DOC fix RandomizedPCA docstring for n_components=None
ENH allow empty grid in ParameterGrid
MAINT ignore kernprof.py reports
DOC ParameterGrid on lists
Merge pull request #2082 from larsmans/empty-parameter-grid
DOC fix V-measure docstring
MAINT dedup Clay Woolam's contribs (>100 commits!)
FIX/ENH mean shift clustering
DOC typo
ENH micro-optimize RFECV
COSMIT refactor LibSVM wrapper for safety and readability
DOC fix some broken URLs
FIX charset -> encoding in load_files
DOC typo
Revert "FIX charset -> encoding in load_files"
FIX verbose output from k-means
FIX remove params from RandomizedSearchCV
FIX charset -> encoding in load_files
FIX search bug introduced in 1327057f4258f41712ecab5c94770aac5ff01982
FIX inconsistent attributes shapes in naive Bayes
FIX test failure in naive Bayes
FIX failing doctest for CountVectorizer
Merge pull request #2027 from mblondel/select_categorical
FIX copy in OneHotEncoder and _transform_selected
ENH optimize KMeans for sparse inputs
FIX KMeans bug; argsort result apparently not always C-contiguous
DOC what's new: faster KMeans
DOC more explicit description of degree param on SVMs
COSMIT pep8
ENH order *does* matter for sparse matrices
FIX get rid of the last few asanyarray calls
DOC fix erroneous docstring on preprocessing._transform_selected.
MAINT: dedup @jakevdp and @jnothman in mailmap
COSMIT simplify printing of number of fits in grid search
COSMIT fix a docstring in feature_extraction.text
P3K developer docs
TST r2_score float32 overflow fix
Revert "TST r2_score float32 overflow fix"
PY3 use urllib2 or urllib.request, based on Py2/3
DOC let OneHotEncoder, DictVectorizer and FeatureHasher refer to each other
DOC correct class_weight description for LogisticRegression
FIX memory usage in DictVectorizer.fit
ENH back-port rand_r from 4.4BSD
FIX move rand_r to tree module for now
DOC 20news filtering with smaller set and MultinomialNB
PY3 fix string literal syntax error
TST skip Graphviz export docstring in trees
TST use TruncatedSVD in random forest tests
COSMIT refactor random forests
COSMIT refactor forests, part 2
FIX faulty import in 20news docs
ENH fit_inverse_transform for FastICA
DOC document mixing_ attr on FastICA
COSMIT attribute checking in FastICA
COSMIT explicit None check in naive Bayes
ENH simplify the Scorer API
FIX bug in scorers that take probabilities
COSMIT RBM test in usual nose style + moved to proper module
BUG + COSMIT + ENH RBMs
Merge branch 'pr/1954'
MAINT _logistic_sigmoid.c is "binary"
PY3 fix RBM test
DOC copyedit RBM docstrings
DOC pep257 + c/e in sklearn.base
TST fix string labels in metrics tests
DOC copyedit preprocessing docs
MAINT ignore profiling results from kernprof.py
DOC copyedit KernelCenterer docstring
DOC minimal kernel centering narrative docs
DOC minor copyedit to FS docs
Merge pull request #2230 from pprett/neighbors-segfault-fix
TST catch deprecation warning in feature_extraction.text
Merge branch 'pr/2246'
DOC correct/copyedit linear model docstrings
FIX inline rand_r to fix build on Windows
DOC add an extremely simple classifier code example to dev docs
ENH rewrite multiclass_log_loss, rename log_loss, document it
ENH Scorer object for log loss
ENH add log_likelihood_score as -log_loss
PY3 new overfit prevention stuff in 20newsgroups loader
DOC SGDClassifier has multiclass predict_proba
DOC minor copyedit to narratives
FIX don't use old scoring API in randomized search
FIX use category and stacklevel=2 for {loss,score}_func
ENH speed up BernoulliNB's predictions
DOC "creating features" -> "feature extraction" + minor stuff
Revert "ENH add log_likelihood_score as -log_loss"
DOC copyedit example docstring
DOC XHTML fixes (unclosed tags, type="text/javascript")
ENH speed up logistic_sigmoid (using less code)
FIX make BaseSGDClassifier an ABC
Merge pull request #2295 from larsmans/fast-sigmoid
DOC credit to @ephes and myself for log loss in metrics
DOC copyedit SGDClassifier docstring
Martin Luessi (6):
WIP: doc hyperlinks, fixed size thumbnails
gzip support, whats_new
use Sphinx searchindex.js
no_image.png for examples w/o thumbnail
fix paths for Windows
links for scipy, cleanup
Mathieu Blondel (53):
Merge pull request #1604 from darkrho/doc-linear-model-typo
DOC: make distinction between evaluation and pairwise metrics.
Merge branch 'master' of github.com:scikit-learn/scikit-learn
Cosmit: more explicit xlabel.
Cosmit: more explicit label.
Update load_svmlight_file docstring.
FIX: X was converted twice.
Merge pull request #1804 from AlexanderFabisch/fix_example_path
Cosmit: remove needless blank lines.
Cosmit: more idiomatic way of clipping to zero.
Demystify magic values in NNLS implementation.
BUG: fix replacement for _neg.
Fix random state where appropriate.
Fix doctest.
DOC: document attributes fitted by DictVectorizer.
DOC: put feature extraction before pre-processing.
COSMIT: better notation in CountVectorizer.
COSMIT: same changes in transform method.
COSMIT: more robust condition in inverse_transform.
Import gzip and bz2 only if necessary.
Move balance_weights out of preprocessing.
Add categorical_features option to OneHotEncoder.
Support both masks and arrays of indices.
Typo.
Rename _apply_transform to _transform_selected and make it a function
Merge branch 'master' of github.com:scikit-learn/scikit-learn into select_categorical
Address @jnothman's comments.
Test exception is raised when number of targets and penalties don't
Simplify ridge solvers (ongoing work).
Extract sparse_cg and lsqr solvers.
Extract dense_cholesky solver (linear case).
Extract dense_cholesky solver (kernel case).
Clean up.
Extract SVD-based solver.
Clean ups.
Remove copy option.
Cosmit in docstring.
What's new.
Remove if statement.
Cosmit.
Fix failures in grid search.
Do not set sample_weights unless need to.
Add warning when fall back to other solver.
Remove unused variable.
Fix failure in svd-based ridge solver w/ old numpy.
BUG: replace elif by if in Ridge solver selection.
Add fit_transform to FastICA.
Add inverse_transform to FastICA.
Add docstrings to methods in FastICA.
Address @dengemann's comments.
Add test.
Push failing test.
Merge pull request #2229 from larsmans/kernel-center-narrative
Matthias Ekman (1):
ENH: add pre_dispatch option to cross_val_score
Michael Eickenberg (14):
ridge multi target with individual penalties written. To be tested
old tests passing
new multiple target tests added, functionality confined to direct usage of ridge_regression function
Ridge estimator works with individual penalties
test for ridge estimator
ridge doc string
ValueError for wrong shaped input instead of assertion failure, in order for sklearn/tests/test_common.py, line 238 to pass
docstring in Ridge estimator
added individual penalties function for all other solvers. Tests passing for all of them
always make alpha into an array
updated tests
tests passing
removed elaborate testing in ridge.fit, not necessary anymore
simplified _solve_svd
Mikhail Korobov (9):
PY3 array.array wants str in Python 2.x and 3.x - give it a str
Update outdated comments in sklearn.hmm.
PY3: fix exception syntax in tests/test_common.py
PY3 fix test_cross_validation
PY3 fix OneHotEncoder doctest ( "<type 'float'>" is "<class 'float'>" in Python 3.x)
PY3 fix metaclasses. See #1829.
ENH speed improvements in HMM
TST Fixed test_pipeline_methods_preprocessing_svm: pca was unused
Fixed typo in metrics.py
Miroslav Shubernetskiy (1):
PY3 allow multiple base classes in six.with_metaclass
Naoki Orii (1):
FIX issue #1457 KNeighbors should test that n_samples > 0
Nelle Varoquaux (62):
DOC: small fix in the regression's score method documentation
FIX make_classification now outputs integer labels
DOC formatting (k_means)
ENH - 3x speedup in the isotonic regression
FIX gen_rst.py was something using an undefined variable
Merge pull request #1886 from NelleV/DOX_fix
Added sponsors to the about.rst page
Spelling mistake
DOC fix in the hierarchical clustering
DOC Acknowledge sponsors for the Paris sprint
DOC fixed small mistakes in the pls module
Merge pull request #2140 from arjoly/ajoly-glouppe-sponsor
DOC fix small mistakes
DOC fixed some formatting in kernel approximation
DOC fixed some formatting in the multiclass module
Merge pull request #2146 from ianozsvald/clearer_iris_decision_surfaces
Merge pull request #2163 from ianozsvald/fix_plot_forest_iris_docs
ENH better error message when estimators don't specify their parameters in the signature.
Merge pull request #2187 from FedericoV/non_negative_style
Merge pull request #2195 from erg/bug-2189
ENH added an option to do an isotonic regression on decreasing functions
TEST: added a small test for fitting an isotonic regression on a decreasing function
TEST tests the class instead of the function for the decreasing isotonic regression
MAINT moved the pls file based module to a folder
TEST fixing pls tests failing:
MAINT Move the pls to the cca to a cross_decomposition module
MAINT renamed pls to cross_decomposition in the documentation
FIX the example plots of the pls module did not import pls methods from the correct module
FIX removed the cca and pls modules
FIX added the new module to the setup.py installation
DOC improved docs/docstrings on cross_decomposition
MAINT deprecated the pls module, moved CCA to cca_
FIX init methods of ABCMeta class also need to be abstract
FIX on py3k, we need explicit relative imports
FIX missing deprecation release information.
MAINT charset is deprecated in favor of encoding
TST added tests for encoding/charset deprecation
DOC better deprecation warning messages.
TST better testing of the PLS module
FIX PLSSVD now returns the correct number of components
COSMIT small documentation tweaks
DOC ignoring gen_rst's parsing errors
Merge pull request #2280 from larsmans/randomsearch-scoring
Merge pull request #2281 from ogrisel/improvements-to-setup-py
DOC fixed the optional arguments
FIX added some descriptions to each categories in the main webpage
FIX spelling mistake
FIX the css in the API
ENH added the fork me ribbon to the website
WEB added testimonials
DOC fixed the previous/next button
DOC fixed the collapsible sidebar
DOC dropdown menu works
FIX minor edits on the website
DOC fixed z-index on the website
FIX website layout on small screens
FIX improve display on small device
DOC fix dropdown menu
FIX backward compatibility was broken
DOC added link from banner to example.
DOC now building to html/stable
DOC home always points to stable
Nicolas Trésegnie (38):
DOC fix macports package name
Add test for PatchExtractor (float value for max_patches)
Fix float value support for max_patches in PatchExtractor
Fix as_float_array behaviour when copy=True
Add test of the as_float_array behaviour when copy=True
Add a copy parameter to safe_asarray()
Imp readability
Missing value imputation
Fix tests
Fix tests + doc improvements + renaming
Add test with default value of copy + doc improvements
Imp readability
Fix use of as_float_array
pep8
Imp variables names
Del use of as_float_array + naming and documentation improvements
Fix use of mask
Fix import names
Add pycharm files in .gitignore
Imp splitting of preprocessing.py
Imp splitting of test_preprocessing.py
Del unused imports in preprocessing + pep8
Fix imports
Imp move OneHotEncoder to preprocessing/data.py
pyflakes and pep8
Fix self.statistics_ shouldn't be set if axis==1
Fix use of self
Refactor loss_func and score_func warnings in grid_search
Add score_overrides_loss to _deprecate_loss_and_score_funcs
Add deprecation warnings in Ridge
Add deprecation warnings in rfe
Add catching of the deprecation warnings in rfe and ridge tests
Refactor loss_func and score_func warnings in cross_validation + replacement in two examples
Fix 'scoring' docstrings
Imp documentation
Fix tests
Fix grid_search.py example
Fix tests
Noel Dawe (102):
implement AdaBoost
use weighted mean in ClassifierMixin.score
FIX: DecisionTreeRegressor.score
FIX: import not used
FIX: overlapping y-axis labels
FIX: use generator instead of np.random
rm doctest in make_gaussian_quantiles
fix variable naming in weight_boosting
FIX: TypeError for regressor
FIX minor comment
FIX: docs, code clean up, learn_rate -> learning_rate
FIX: plot_adaboost_classification.py
don't enforce DTYPE at the ensemble level
DOCS: note generator behaviour in staged methods
Make BaseWeightBoosting abstract and other misc changes
revert changes to grid_search
FIX: import
revert implementation of sample weights in BaseWeightBoosting.staged_score
revert a few spurious changes
pep8 + pyflakes, use arrays for errors_ and weights_
init weights_ to zeros and errors_ to ones
add Hastie 10.2 example
pep8
implement SAMME.R algorithm
update adaboost hastie example and weight_boosting tests
use broadcasting
combine real and discrete algorithms under one class
DOC: AdaBoostClassifier real arg
update example: fix histogram range
Merge pull request #20 from glouppe/adaboost
Merge pull request #21 from glouppe/adaboost
update adaboost example: exposes instability
displace predict_proba by 1e-10
Merge pull request #22 from glouppe/adaboost
FIX: adaboost predict_proba
only boost positive sample weights
FIX: only boost positive sample weights
Merge pull request #23 from glouppe/adaboost
FIX: negative and zero probabilities while boosting with SAMME.R
FIX: doctest
FIX: doctest and slightly larger displacement from zero probabilities (32 vs 64bit doctest instability)
remove weighted_r2_score (leave for next PR scikit-learn#1574)
revert spurious change in metrics.py
FIX: use full decision tree in AdaBoost and fix title in plot_forest_iris.py
DOC: add __doc__ to plot_adaboost_hastie_10_2.py
FIX: reference format
FIX: show decision boundary in plot_adaboost_classification.py
FIX: refactor plot_adaboost_classification.py and add legend
rename plot_adaboost_classification.py -> plot_adaboost_twoclass.py and add predict_twoclass method to AdaBoostClassifier
FIX: only possible split sometimes creating children with negative or zero weight in the presence of negative sample weights
FIX: improve multi-class AdaBoost example (rename to plot_adaboost_multiclass.py)
add author
typo
use metrics module and pep8
typo
fix class ordering in two-class
faster sample_weight initialization
speed improvements to make_gaussian_quantiles
even more speed improvements to make_gaussian_quantiles
py3k
DOC: note initialization of sample_weight if None
factorize common sample_weight check
Merge pull request #24 from glouppe/adaboost
add decision_function and staged_decision_function and refactor some code
Merge remote-tracking branch 'upstream/master' into treeweights
Merge pull request #25 from glouppe/adaboost
pep8
Merge pull request #26 from glouppe/adaboost
update adaboost regression example and use estimator_errors_
rm n_estimators argument from predict methods
DOC: fix docstring for make_gaussian_quantiles
FIX: alpha=.5 and use more difficult dataset in two-class example. Add mean and cov arguments to make_gaussian_quantiles
FIX: learning_rate default value consistency
FIX: TypeError message if base_estimator does not support class probabilities
FIX: comments from @ogrisel
make learning_rate=1 default for classification
only sum sample_weight once
rm sphinx/docutils formatting in exception messages
inline comment about learning_rate in hastie example
add note about SAMME.R converging faster than SAMME
add note about y coding construction
add description of dataset in two-class example
fix missing parenthesis in make_hastie_10_2 dataset
Merge pull request #27 from glouppe/adaboost
import pylab as pl
remove check for fit_predict
fix importance test and test both SAMME and SAMME.R algs
don't show class B probabilities in two-class example
two-class decision scores -> decision scores
clarification on two-class decision scores plot
explain decision scores in two-class example
fix AdaBoost.R2 and update example
DOC: loss_function
fix failing tests
fix failing doctest
Merge pull request #28 from glouppe/adaboost
API consistency with gradient boosting: loss_function -> loss
Merge pull request #29 from glouppe/adaboost
minor edits in docs
DOC: notes about examples and minor edits
make setup.py executable
AdaBoost: use estimator weights in predict_proba
Norbert Crombach (1):
Fix L2 regularization order in sgd_fast
Olivier Grisel (107):
Update travis config to remove -qq flag for scipy
P3K: support for py3k in dict_vectorizer module
PY3: Fix stdout capture in graph lasso test
P3K More python 2 / 3 compat in tree exports
Merge pull request #1660 from rlmv/fe_tests
P3K use six to have a python 2 & 3 compatible code base
Merge pull request #1726 from agramfort/round_kfold
Merge pull request #1730 from arjoly/doc-feature-selection
Merge pull request #1741 from arjoly/metrics-fix-np-1.3
PY3: Disable lib2to3
PY3: fix urlopen in mldata and california housing loaders
PY3: fix remaining cStringIO imports
PY3: fix for string literals in datasets' test_base.py
PY3: print function in coordinate descent doctest
PY3: record is a kwarg argument for warnings.catch_warnings
PY3: long is no longer a type in Python 3
Merge pull request #1839 from amueller/dbscan_example
FIX: use the mldata mock in docstring as well
Merge pull request #1913 from Jim-Holmstroem/refactored_precision_recall_fscore_support_to_count_with_integer_type
FIX: restore numpy 1.3.0 compat with np.divide fix
FIX #2032, FIX #2033: ensure module names consistency with __all__
Remove redundant test that was checked in by mistake
FIX inconsistent cv_scores_ generation for randomized search and re-add example
ENH: removed leftover condition to get a wider application of the import all consistency check
Enforce n_folds >= 2 for k-fold cross-validation
Merge pull request #2004 from oddskool/out-of-core-examples
FIX: make doc auto-linking support any Unicode / UTF-8 content
Make the out-of-core example plot work when launched by the sphinx extension
FIX: do not print too many messages to stdout when generating the documentation
PY3: New test for the get_params handling of deprecated attributes.
Better status for the Py3 port
Merge more Py3 fixes
PY3: refcounting change introduced a regression on the use of resize in LARS
FIX: pep8 and Py3 support in sklearn.neighbors.base
FIX: Python 3 support for the neighbors doctests
FIX: pep8 + Py3 fixes in test_dist_metrics
FIX: pep8 and Py3 support in sklearn.neighbors.dist_metrics
FIX: Py3 / pep8 fixes in test_ball_tree / test_kd_tree
Update Python 3 support status
Style
More readable condition and more precise error message
FIX: Py3 print statements to print functions
Rename LabelBinarizer.multilabel to .multilabel_ + DOC
WIP: partial fit for discrete naive Bayes models
Remove the class_prior partial_fit param
WIP: started to factorized the raw count collection
Incrementally is useless now
Add reference to the Manning text + restore previous smoothing
FIX shape issue when y has only one single class + some missing doc
Factorize common classes checks in partial_fit implementations
Add note on a possible future performance optimization
Add a note on performance tradeoffs in the docstring of partial_fit
More informative error message. Also CV now use integer indices by default now.
Use floats everywhere to get rid of warnings when using sample_weight
More input checks
Better test name
Remove redundant shape check already done by check_arrays
Add missing test for sample weight with partial_fit + fix issue classes passed as a list instead of an array
One more input check test
Add missing test for deprecation warning
Found a bug: add a failing test
Use unique_labels more consistently in the multiclass model
Fix broken partial_fit test
Factorize label_binarize for binarizing a sequence of labels with fixed classes
Add a new whats_new entry
Add some doc for the new partial_fit method
wording
Avoid raising a deprecation warning on label_binarizer_.multilabel_
Fix docstring and add some usage examples
FIX: do not update feature_log_prob_ in _update_class_log_prior
Add one more tests to check the performance on digits
Make test_deprecated_fit_param pass under python 3 as well
Address wording and typos identified in review
Better parameterization for test_check_accuracy_on_digits
Add a whitespace in parameter docstring item
More accurate documentation for class_count_ and feature_count_
Rename helper partial_fit function
Merge pull request #2175 from ogrisel/nb-partial-fit
Merge pull request #2228 from amueller/travis_virtualenv_stuff
Trying to enable python 3.3 too.
Update .travis.yml
One more Python 3 fix in feature_extraction.rst
Py3 fix
More explicit tests in test_label_binarizer_column_y
Catch expected warning in sklearn/tests/test_naive_bayes.py (part of #2274)
Revert "Catch expected warning in sklearn/tests/test_naive_bayes.py (part of #2274)"
FIX PY3: list and tuples cannot be compared in Python 3
Py3: fix version comparison in imputation module
Add supported python versions to the classifiers + fixes
Sample compiler config for windows
Force stdc++ link for the windows build
Regenerate pairwise_fast.pyx with recent cython for windows build
Fix atomics definitions under windows for sklearn._hmm.pyx
typo
Use extra_link_args for -lstdc++
Ignore compiled shared library files generated in the source tree under windows
Merge pull request #2293 from amueller/warning_input_shapes
Rename cv_scores(_) back to grid_scores(_) to keep the name free for a future refactoring
Merge pull request #2299 from ogrisel/grid-scores
WIP: explicitly mark all base classes as ABC with abstractmethod inits
Add concrete __init__ for LinearSVM
Add concrete implementation for SGDClassifier
Fixed a typo in a contributor's name
Re-align the what's new file with the new ordering of items from master
partial_fit for naive Bayes was done for 0.14-rc, not 0.11...
Ignore the generated MANIFEST file
Also clean the dist folder when calling make
Peter Prettenhofer (65):
rename learn_rate -> learning_rate
raise ValueError if len(y_true) is less than or equal to 1
fix: map labels to {0, 1}
fix: deviance computation in BinomialDeviance was wrong (ignored cases where y == 0) - thanks to ChrisBeaumont for reporting this issue
raise ValueError if division through zero in LogOddsEstimator
add loss function for gradient boosting binomial deviance
pep8 and assert_equal instead of assert
correct docstring
Merge branch 'master' into gbrt-deviance-fix
use unique from sklearn backports (return_inverse)
Merge branch 'master' into gbrt-deviance-fix
Merge branch 'master' into gbrt-deviance-fix
decision_function forces dense output (in the case of sparse coef_)
Merge branch 'master' into pr/1798
get rid of ``rho`` in sgd documentation - has been replaced by ``l1_ratio``
Merge pull request #1893 from dougalsutherland/sgd-docs
corrected doctests after moving L2 penalty application in SGD
Merge remote-tracking branch 'upstream/master' into pr/2016
added SGD L2 fix to whatsnew
fix: add missing str formatting operator
enhanced (hopefully) DBScan documentation; killed some whitespace along the way...
Merge remote-tracking branch 'upstream/master' into dbscan-doc-enh
fix: needs_threshold not plural in repr
removed min_density example - dropped param
gbrt now works with new DecisionTree implementation
import classes - now they work!
fix: proper dtype for SIZE_t
add GBRT to covertype benchmark
added pxd to Manifest (to be included in source tarball)
Merge remote-tracking branch 'upstream/master'
add OOB improvement and set oob_score deprecated
example for oob estimates in GBRT
plot cv error as well
rm print stmt
rn: plt -> pl
fix: oob_improvement_ with trailing _
more docstrings
cosmit: use train_test_split - tuned params for nice plot
narrative documentation for oob improvement.
more tests
cosmit: better links and a note on efficiency using max_features
comments
cosmit: n -> n_samples
cosmit: rs -> random_state
more doc for OOB example
use new style str formatting
rearranged some code
rn: ACC -> Accuracy
rephrased max_features doc
moved to new pyplot import
more narrative documentation for oob in gbrt
regression tests for oob_improvement_
example doc string
Merge branch 'gbrt-oob-improvement'
covertype benchmark: use C-style input as default (most models require it as input)
fix: use asserts from sklearn.utils.testing
fix: python3.3 warning fix
doc: hedge the use of OOB estimates
Refactored verbose output in GBRT - output much more nice
fix: newest numpy doesn't like all-indexing non-existing dimension (reported by erg #2233)
Merge remote-tracking branch 'upstream/master'
remove negative indices from neighbors cython code
fix: check for impurity ties
added 32bit 64bit equality test case
adapt OOB regression test to change in tree module
Philippe Gervais (11):
Style fixes
[DOC] missing parameter description
GraphLassoCV works with alphas given as list.
Simplified GraphLassoCV code.
Put back cov_init parameter in graph_lasso_path_
Speed up some tests
Removed unused import
Added GraphLassoCV changes to whatsnew.rst
[DOC] Corrected errors in clustering documentation
[DOC] fixed a typo in an warning message.
One more typo fixed
Rafael Cunha de Almeida (1):
Only reassign centers if to_reassign.sum() > 1
Raul Garreta (5):
PY3: used six.u to fix unicode variables in svmlight
PY3: six.moves.cStringIO to fix StringIO import
PY3: fix None comparison (when not in OS X) in test_k_means.py
PY3: used six.moves.xrange to fix xrange
PY3: used six.iteritems to fix dict iteritems in module pipeline.py
Rob Speer (6):
Change 'charse_error' to 'charset_error' in load_files.
Revise documentation about handling text and bytes.
Add a documentation section about decoding text.
Move the new "Decoding text files" doc section
FIX Minor stuff in document_classification_20newsgroups output
ENH Add filters on newsgroup text
Rob Zinkov (5):
Adding support indices in svm for sparse matrices
COSMIT PEP8
Adding test to check support_ is equal in dense and sparse matrices
COSMIT PEP8
Recompiled base
Robert Layton (17):
DOC improve mini-batch k-means narrative
DOC: Replaced all BSD style licenses with "BSD 3 clause"
Minimal spanning tree backported from scipy 0.13
Added test
Moved mst to a subfolder and added a README file
Added new files (from previous commit)
Merge pull request #2055 from jnothman/cv_refactor
Merge pull request #2076 from pprett/dbscan-doc-enh
Traversal in and tested. Next step is to remove references to old code
Removed reference from spectral_clustering to old csgraph
csgraph updated from hierarchical.py
Removed actual _csgraph file, tests still all pass
Turns out sparsetools wasn't needed either
Missed a spot
Reference to graph components updated in dev docs
Two more spots. I think that's it
Now that the folder has more than just mst in it, rename to sparsetools, which should help with referencing it.
Robert Marchman (13):
test case for unfitted idf vector
raise ValueError for unfitted idf vector
FIX docstring deletions
ADD test coverage for _check_stop_list
FIX comment typo
ADD test cases to fill out VectorizerMixin coverage
ADD another VectorizerMixin test
ADD test for get_feature_names
ADD test for tfidf fit with incompatible n_features
ADD test for TfidfVectorizer attribute setters
MV Mixin tests to CountVectorizer tests
RM CV import
MV _check_stop_list tests to CV get_stop_words
Robert McGibbon (3):
fix the kwarg name
updated the .c file
remade the cython with 0.18
Rolando Espinoza La fuente (1):
DOC typo: Pereptron -> Perceptron.
Roman Sinayev (3):
ENH Rewrote CountVectorizer fit_transform to be ~40% faster
ENH refactor and further speed up CountVectorizer
ENH speed up TfidfTransformer using spdiags
Seamus Abshere (1):
ENH reduce size of files produced by dump_svmlight_file
Sergey Feldman (1):
Adding covariance regularization to QDA
Sergey Karayev (2):
fixing bug in linear_model.SGDClassifier for multi-class warm start
removing accidental space
Sergio Medina (1):
Corrected a few things on the Mutual Information doc pages.
Stefano Lattarini (1):
COSMIT various typofixes
Steve Koch (1):
Update hmm.rst
Steven De Gryze (9):
PY3: fixed basestring in crossvalidation.py
PY3: use b() convenience function for string literals
PY3: ensuring file stream is read as binary
PY3: convert string literal to bytes using six in cython file
replacing numpy array with range for use in random.sample
PY3: changing None to 0 to ensure comparability in py3
PY3 fixing utf8 comments in svm through try/except and six.b
PY3: forcing execution of map by using tosequence
PY3 fix comparison of ndarray and string
Sturla Molden (1):
Update typedefs.pxd with correct ITYPECODE
Szabo Roland (3):
ENH Added custom kernels to SpectralClustering
BUG Add lambda_ attribute to ARDRegression after fit
DOC Add labels and some explanation to confusion matrix example
Tadej Janež (10):
Removed an unnecessary if statement in KFold __iter__ method.
Improved the test that checks the balance of sizes of folds returned by KFold.
DOC Corrected the docstring of KFold about the sizes of the folds.
COSMIT Moved the test_roc_curve_one_label test where other ROC curve tests are.
FIX KFold should return the same result when indices=True and when indices=False.
ENH Function auc_score should throw an error when y_true doesn't contain two unique class values.
ENH optimizations in sklearn.cross_validation
FIX Moved copying of labels in LeaveOneLabelOut and LeavePLabelOut to __init__.
TST Added test that checks if LeaveOneLabelOut and LeavePLabelOut work normally if the labels variable is changed before calling __iter__.
DOC Fixed doc test to work with the fixed versions of LeaveOneLabelOut and LeavePLabelOut.
Thomas Jarosch (1):
BUG delete/delete[] error in Liblinear
Vlad Niculae (71):
FIX: variable naming inconsistency in NMF
DOC FIX: multi-target linear model attribute shapes
DOC spelling and clarification
Make callable svc test more robust for MacOSX.
Added RBM to whats_new.rst
DOC Added skeleton for RBM documentation
ENH Rename RestrictedBolzmannMachine to BernoulliRBM
FIX: make BernoulliRBM doctest pass
FIX: BernoulliRBM check random state in fit, not in init
FIX: validation in `BernoulliRBM.transform`
DOC: first attempt at RBM documentation
Link to RBM docs from the unsupervised toctree
FIX: uneven RBM image
DOC: PCD details and references
Fix typos in example
PEP8 and indentation
DOC add plot and example to docs
DOC rewrite BernoulliRBM example description
Set seed through params, not globally
FIX handling of random state, hide some of API
Pep8 example
Update example params by grid search, and docstring
One space after dot
DOCFIX neural networks module
DOCFIX spacing and clarification in RBM docstring
More stable implementation of logistic function and its derivative by @fabianp
Use gen_even_slices instead of homebaked code
ENH Add fast and stable logistic sigmoid to utils and RBM
ENH Support sparse input in RBMs
ENH Prevent memory copying in RBM's _fit
Do not touch uncopied memory
Nudge images using convolve, slower but more readable
Clarify narrative docs
Clarify and python3 RBM example
Periods and other docstring issues
Remove redundant test
Python3 support in RBM
TST RBM smoke-test verbosity
FIX missing class attribute in ICA. Common test was failing
FIX: fastica function dictionary default value
Deprecate FastICA.sources_
TEST remove deprecated stuff from fastica tests
Document the deprecation
FIX bug in test
Clean up and rename Hungarian algorithm
Clarify and clean up example
Remove print in Hungarian tests
Consistency for floats in consensus score
Add warning in private _Hungarian docstring just in case
ENH make spectral clustering test more stable to random seed
ENH add return_path in orthogonal matching pursuit
TEST for omp path feature
ENH OrthogonalMatchingPursuitCV estimator
FIX respect conventions in OMP init
FIX OrthogonalMatchingPursuit normalized twice
Use projected gradient solver in transform to support sparse matrices
Use same parameters when solving the transform
Use scipy.nnls.optimize for dense data
Add failing test for libsvm random state proba
FIX support random state in libsvm
DOC document changes in LIBSVM_CHANGES
DOC update docstrings to reflect libsvm random_state
Fix libsvm seed when predict_proba in tests and examples
Clarify and make libsvm random seed more consistent
Comment predict params in libsvm
DOC reference and rename cross decomposition module
FIX raise tolerance in svm predict_proba test
Make common PLS tests more stable
FIX for MSVC inline fmin, fmax and log2
FIX for MSVC inline fmax in dist_metrics
Add LibSVM random state to changelog
Yann N. Dauphin (25):
ENH added Restricted Boltzmann machines
30% speed-up thanks to in-place binomial
ENH 12% RBM speedup with ingenious ordering of operations
rename h_samples to h_samples_
added URI for RBM reference
improved docstring for transform
renamed _sigmoid to _logistic_sigmoid
use double backquotes around equations
logistic_sigmoid moved to function
transposed components_, no performance penalty
only compute pseudolikelihood if verbose=True
more accurate pseudo-likelihood
use iteration terminology instead of epochs in RBM
default n_components from 1024 to 256
clarify some method names (ex: mean_h -> mean_hiddens)
added epoch time
ENH RBM example
switched to digits
moved rbms to neural_networks module
add tests for rbm
trim whitespace
use train_test_split
neural_networks -> neural_network
ENH rename n_particles to batch_size in RBM
TST added more RBM tests
Yannick Schwartz (2):
BUG: set random state in LogisticRegression
Update multiclass/multilabel documentation
Yaroslav Halchenko (7):
BF: explicitly mark train_test_split as not the one for nosetesting
Merge commit '0.14a1-20-gc9ba2c3' into releases
Merge commit '0.14a1-239-g0872592' into dfsg
Merge branch 'dfsg' into debian
changelog entry
debian/control - python-imaing to build-depends (for documentation) and removed not needed XS-DM-Upload-Allowed
Let's upload to experimental for testing
draix (1):
PY3: replaced izip
hrishikeshio (1):
DOC dev guide: deprecation
jamestwebber (2):
Update coordinate_descent.py
Fixed precompute issue (again) in ElasticNet and enet_path
nzer0 (1):
Documentation ERROR: mixture.DPGMM.precs_
sergeyf (8):
Update qda.py
Update qda.py
Missed a space!
Updating to ensure pep8 compliance
reg_param is a float
Update qda.py
Update test_qda.py
Update qda.py
syhw (10):
nudging the digits dataset for BernoulliRBM example
TST added a 'fit [[0],[1]] + gibbs sample it' test for RBMs
replaced test_gibbs by a smoke test for NaNs
check for pseudo_likelihood clipping
COSMIT refactoring rbm
RBM example now verbose
squeezing logistic_sigmoid result only on 1D arrays
adding a test for sparse matrices in RBM
changing free_energy to private in RBM
added neural_network to setup
uber (1):
example yahoo stock issue fix
unknown (1):
changed wording in linear model docs about Normalized. It was frustrating me haha
-----------------------------------------------------------------------
No new revisions were added by this update.
--
Alioth's /usr/local/bin/git-commit-notice on /srv/git.debian.org/git/debian-science/packages/scikit-learn.git