scikit-learn Seminar

http://scikit-learn.org/stable/index.html

Date        Content        Slide file        Supplementary materials        Presenter
 April 24  1. Supervised Learning
1.1. Generalized Linear Models
1.1.1. Ordinary Least Squares
1.1.1.1. Ordinary Least Squares Complexity
1.1.2. Ridge Regression
1.1.2.1. Ridge Complexity
1.1.2.2. Setting the regularization parameter: generalized Cross-Validation
 sklearn-shinnou-0424.pdf   Shinnou
 May 8  1.1.3. Lasso
1.1.3.1. Setting regularization parameter
1.1.3.1.1. Using cross-validation
1.1.3.1.2. Information-criteria based model selection
 sklearn-kunii-0508.pdf   Kunii
  1.1.4. Elastic Net  sklearn-kikuchi-0508.pdf   Kikuchi
1.1.5. Multi-task Lasso  sklearn-onodera-0508.pdf   Onodera
 May 15  1.1.6. Least Angle Regression  sklearn-xiao-0515.pdf   Xiao
  1.1.7. LARS Lasso
1.1.7.1. Mathematical formulation
 sklearn-kouno-0515.pdf   Kouno
1.1.8. Orthogonal Matching Pursuit (OMP)  sklearn-nagata-0515.pdf   Nagata
 May 22  1.1.9. Bayesian Regression
1.1.9.1. Bayesian Ridge Regression
1.1.9.2. Automatic Relevance Determination - ARD
 sklearn-shinnou-0522.pdf   Shinnou
  1.1.10. Logistic regression  sklearn-kunii-0522.pdf   Kunii
  1.1.11. Stochastic Gradient Descent - SGD  sklearn-kikuchi-0522.pdf   Kikuchi
 May 29  1.1.12. Perceptron  sklearn-onodera-0529.pdf   Onodera
  1.1.13. Passive Aggressive Algorithms sklearn-xiao-0529.pdf   Xiao
1.2. Support Vector Machines
1.2.1. Classification
1.2.1.1. Multi-class classification
1.2.1.2. Scores and probabilities
1.2.1.3. Unbalanced problems
 sklearn-kouno-0529.pdf   Kouno
June 5  1.2.2. Regression  sklearn-nagata-0605.pdf   Nagata
1.2.3. Density estimation, novelty detection  sklearn-shinnou-0605.pdf   Shinnou
1.2.4. Complexity  sklearn-kunii-0605.pdf   Kunii
1.2.5. Tips on Practical Use  sklearn-kikuchi-0605.pdf   Kikuchi
June 12  1.2.6. Kernel functions
1.2.6.1. Custom Kernels
1.2.6.1.1. Using Python functions as kernels
1.2.6.1.2. Using the Gram matrix
1.2.6.1.3. Parameters of the RBF Kernel
 sklearn-onodera-0612.pdf   Onodera
1.2.7. Mathematical formulation
1.2.7.1. SVC
1.2.7.2. NuSVC
sklearn-xiao-0612.pdf   Xiao
1.2.8. Implementation details sklearn-kouno-0612.pdf   Kouno
1.3. Stochastic Gradient Descent
1.3.1. Classification
sklearn-shinnou-0612.pdf   Nagata → Shinnou
June 26  1.3.2. Regression sklearn-shinnou-0626.pdf   Shinnou
1.3.3. Stochastic Gradient Descent for sparse data     Kunii
1.3.4. Complexity     Kikuchi
1.3.5. Tips on Practical Use sklearn-onodera-0626.pdf   Onodera
October 14  1.3.6. Mathematical formulation
1.3.6.1. SGD
sklearn-xiao-1014.pptx   Xiao
1.3.7. Implementation details sklearn-kouno-1014.pdf   Kouno
1.4. Nearest Neighbors
1.4.1. Unsupervised Nearest Neighbors
1.4.1.1. Finding the Nearest Neighbors
1.4.1.2. KDTree and BallTree Classes
sklearn-nagata-1014.pdf   Nagata
1.4.2. Nearest Neighbors Classification sklearn-shinnou-1014.pdf   Shinnou
October 21  1.4.3. Nearest Neighbors Regression sklearn-kunii-1021.pdf   Kunii
1.4.4. Nearest Neighbor Algorithms
1.4.4.1. Brute Force
1.4.4.2. K-D Tree
1.4.4.3. Ball Tree
1.4.4.4. Choice of Nearest Neighbors Algorithm
1.4.4.5. Effect of leaf_size
sklearn-kikuchi-1021.pdf   Kikuchi
1.4.5. Nearest Centroid Classifier
1.4.5.1. Nearest Shrunken Centroid
sklearn-onodera-1021.pdf   Onodera
1.5. Gaussian Processes
1.5.1. Examples
1.5.1.1. An introductory regression example
1.5.1.2. Fitting Noisy Data
sklearn-xiao-1021.pptx   Xiao
October 28  1.5.2. Mathematical formulation
1.5.2.1. The initial assumption
1.5.2.2. The best linear unbiased prediction (BLUP)
1.5.2.3. The empirical best linear unbiased predictor (EBLUP)
sklearn-kouno-1028.pdf   Kouno
1.5.3. Correlation Models sklearn-nagata-1028.pdf   Nagata
1.5.4. Regression Models sklearn-shinnou-1028.pdf   Shinnou
1.5.5. Implementation details sklearn-kunii-1028.pdf   Kunii
November 4  1.6. Cross decomposition sklearn-kikuchi-1104.pdf   Kikuchi
1.7. Naive Bayes
1.7.1. Gaussian Naive Bayes
sklearn-onodera-1104.pdf   Onodera
1.7.2. Multinomial Naive Bayes sklearn-xiao-1104.pptx   Xiao
1.7.3. Bernoulli Naive Bayes sklearn-kouno-1104.pdf   Kouno
1.7.4. Out-of-core naive Bayes model fitting sklearn-nagata-1104.pdf   Nagata
November 11  1.8. Decision Trees
1.8.1. Classification
sklearn-shinnou-1111.pdf   Shinnou
1.8.2. Regression sklearn-kunii-1111.pdf   Kunii
1.8.3. Multi-output problems sklearn-kikuchi-1111.pdf   Kikuchi
1.8.4. Complexity sklearn-onodera-1111.pdf   Onodera
1.8.5. Tips on practical use sklearn-xiao-1111.pdf   Xiao
1.8.6. Tree algorithms: ID3, C4.5, C5.0 and CART sklearn-kouno-1111.pdf   Kouno
1.8.7. Mathematical formulation
1.8.7.1. Classification criteria
1.8.7.2. Regression criteria
sklearn-nagata-1111.pdf   Nagata
 November 21  1.9. Ensemble methods
1.9.1. Forests of randomized trees
1.9.1.1. Random Forests
1.9.1.2. Extremely Randomized Trees
1.9.1.3. Parameters
1.9.1.4. Parallelization
1.9.1.5. Feature importance evaluation
1.9.1.6. Totally Random Trees Embedding
sklearn-shinnou-1121.pdf   Shinnou
  1.9.2. AdaBoost
1.9.2.1. Usage
sklearn-kunii-1121.pdf   Kunii
  1.9.3. Gradient Tree Boosting
1.9.3.1. Classification
1.9.3.2. Regression
1.9.3.3. Mathematical formulation
1.9.3.3.1. Loss Functions
1.9.3.4. Regularization
1.9.3.4.1. Shrinkage
1.9.3.4.2. Subsampling
1.9.3.5. Interpretation
1.9.3.5.1. Feature importance
1.9.3.5.2. Partial dependence
sklearn-kikuchi-1121.pdf   Kikuchi
 November 25  1.10. Multiclass and multilabel algorithms
1.10.1. Multilabel classification format
sklearn-onodera-1125.pdf   Onodera
  1.10.2. One-Vs-The-Rest
1.10.2.1. Multiclass learning
1.10.2.2. Multilabel learning
sklearn-xiao-1125.pdf   Xiao
  1.10.3. One-Vs-One
1.10.3.1. Multiclass learning
sklearn-kouno-1125.pdf   Kouno
  1.10.4. Error-Correcting Output-Codes
1.10.4.1. Multiclass learning
sklearn-nagata-1125.pdf   Nagata
 December 2  1.11. Feature selection
1.11.1. Removing features with low variance
1.11.2. Univariate feature selection
sklearn-shinnou-1202.pdf   Shinnou
  1.11.3. Recursive feature elimination     Kunii
  1.11.4. L1-based feature selection
1.11.4.1. Selecting non-zero coefficients
1.11.4.2. Randomized sparse models
sklearn-kikuchi-1202.pdf   Kikuchi
  1.11.5. Tree-based feature selection     Onodera
  1.11.6. Feature selection as part of a pipeline sklearn-xiao-1202.pdf   Xiao
 December 5  1.12. Semi-Supervised
1.12.1. Label Propagation
    Kouno
  1.13. Linear and quadratic discriminant analysis
1.13.1. Dimensionality reduction using LDA
    Nagata
  1.13.2. Mathematical Idea sklearn-shinnou-1205.pdf   Shinnou
  1.14. Isotonic regression     Kunii
 81 2. Unsupervised Learning
2.1. Gaussian mixture models
2.1.1. GMM classifier
2.1.1.1. Pros and cons of class GMM: expectation-maximization inference
2.1.1.1.1. Pros
2.1.1.1.2. Cons
    Kikuchi
  2.1.1.2. Selecting the number of components in a classical GMM
2.1.1.3. Estimation algorithm Expectation-maximization
    ¬–쎛
  2.1.2. VBGMM classifier: variational Gaussian mixtures
2.1.2.1. Pros and cons of class VBGMM: variational inference
2.1.2.1.1. Pros
2.1.2.1.2. Cons
2.1.2.2. Estimation algorithm: variational inference
    Xiao
  2.1.3. DPGMM classifier: Infinite Gaussian mixtures
2.1.3.1. Pros and cons of class DPGMM: Dirichlet process mixture model
2.1.3.1.1. Pros
2.1.3.1.2. Cons
2.1.3.2. The Dirichlet Process
    Kouno
 82 2.2. Manifold learning
2.2.1. Introduction
    Nagata
2.2.2. Isomap
2.2.2.1. Complexity
Shinnou
  2.2.3. Locally Linear Embedding
2.2.3.1. Complexity
    Kunii
2.2.4. Modified Locally Linear Embedding
2.2.4.1. Complexity
    Kikuchi
  2.2.5. Hessian Eigenmapping
2.2.5.1. Complexity
    ¬–쎛
 83 2.2.6. Spectral Embedding
2.2.6.1. Complexity
    Xiao
  2.2.7. Local Tangent Space Alignment
2.2.7.1. Complexity
    Kouno
  2.2.8. Multi-dimensional Scaling (MDS)
2.2.8.1. Metric MDS
2.2.8.2. Nonmetric MDS
    Nagata
  2.2.9. Tips on practical use     Shinnou
 84 2.3. Clustering
2.3.1. Overview of clustering methods
    Kunii
  2.3.2. K-means
2.3.2.1. Mini Batch K-Means
    Kikuchi
2.3.3. Affinity Propagation     Onodera
  2.3.4. Mean Shift     Xiao
  2.3.5. Spectral clustering
2.3.5.1. Different label assignment strategies
    Kouno
 85 2.3.6. Hierarchical clustering
2.3.6.1. Adding connectivity constraints
    Nagata
  2.3.7. DBSCAN     Shinnou
  2.3.8. Clustering performance evaluation
2.3.8.1. Inertia
2.3.8.1.1. Presentation and usage
2.3.8.1.2. Advantages
2.3.8.1.3. Drawbacks
2.3.8.2. Adjusted Rand index
2.3.8.2.1. Presentation and usage
2.3.8.2.2. Advantages
2.3.8.2.3. Drawbacks
2.3.8.2.4. Mathematical formulation
2.3.8.3. Mutual Information based scores
2.3.8.3.1. Presentation and usage
2.3.8.3.2. Advantages
2.3.8.3.3. Drawbacks
2.3.8.3.4. Mathematical formulation
2.3.8.4. Homogeneity, completeness and V-measure
2.3.8.4.1. Presentation and usage
2.3.8.4.2. Advantages
2.3.8.4.3. Drawbacks
2.3.8.4.4. Mathematical formulation
2.3.8.5. Silhouette Coefficient
2.3.8.5.1. Presentation and usage
2.3.8.5.2. Advantages
2.3.8.5.3. Drawbacks
    Kunii
 86 2.4. Biclustering
2.4.1. Spectral Co-Clustering
2.4.1.1. Mathematical formulation
    Kikuchi
 87 2.4.2. Spectral Biclustering
2.4.2.1. Mathematical formulation
    ¬–쎛
 88 2.4.3. Biclustering evaluation     Xiao
 89 2.5. Decomposing signals in components (matrix factorization problems)
2.5.1. Principal component analysis (PCA)
2.5.1.1. Exact PCA and probabilistic interpretation
2.5.1.2. Approximate PCA
2.5.1.3. Kernel PCA
2.5.1.4. Sparse principal components analysis (SparsePCA and MiniBatchSparsePCA)
    Kouno
 90 2.5.2. Truncated singular value decomposition and latent semantic analysis     Nagata
 91 2.5.3. Dictionary Learning
2.5.3.1. Sparse coding with a precomputed dictionary
2.5.3.2. Generic dictionary learning
2.5.3.3. Mini-batch dictionary learning
    Shinnou
 92 2.5.4. Factor Analysis     Kunii
 93 2.5.5. Independent component analysis (ICA)     Kikuchi
 94 2.5.6. Non-negative matrix factorization (NMF or NNMF)     Onodera
 95 2.6. Covariance estimation
2.6.1. Empirical covariance
    Xiao
 96 2.6.2. Shrunk Covariance
2.6.2.1. Basic shrinkage
2.6.2.2. Ledoit-Wolf shrinkage
2.6.2.3. Oracle Approximating Shrinkage
    Kouno
 97 2.6.3. Sparse inverse covariance     Nagata
 98 2.6.4. Robust Covariance Estimation
2.6.4.1. Minimum Covariance Determinant
    Shinnou
 99 2.7. Novelty and Outlier Detection
2.7.1. Novelty Detection
    Kunii
 100 2.7.2. Outlier Detection
2.7.2.1. Fitting an elliptic envelope
2.7.2.2. One-class SVM versus elliptic envelope
    Kikuchi
 101 2.8. Hidden Markov Models
2.8.1. Using HMM
2.8.1.1. Building HMM and generating samples
2.8.1.2. Training HMM parameters and inferring the hidden states
2.8.1.3. Implementing HMMs with custom emission probabilities
    ¬–쎛
 102 2.9. Density Estimation
2.9.1. Density Estimation: Histograms
    Xiao
 103 2.9.2. Kernel Density Estimation     Kouno
 104 2.10. Neural network models (unsupervised)
2.10.1. Restricted Boltzmann machines
2.10.1.1. Graphical model and parametrization
2.10.1.2. Bernoulli Restricted Boltzmann machines
2.10.1.3. Stochastic Maximum Likelihood learning
    Nagata
 105 3. Model selection and evaluation
3.1. Cross-validation: evaluating estimator performance
3.1.1. Computing cross-validated metrics
    Shinnou
 106 3.1.2. Cross validation iterators
3.1.2.1. K-fold
3.1.2.2. Stratified k-fold
3.1.2.3. Leave-One-Out - LOO
3.1.2.4. Leave-P-Out - LPO
3.1.2.5. Leave-One-Label-Out - LOLO
3.1.2.6. Leave-P-Label-Out
3.1.2.7. Random permutations cross-validation a.k.a. Shuffle & Split
3.1.2.8. See also
3.1.2.9. Bootstrapping cross-validation
    Kunii
 107 3.1.3. Cross validation and model selection     Kikuchi
 108 3.2. Grid Search: Searching for estimator parameters
3.2.1. Exhaustive Grid Search
3.2.1.1. Scoring functions for parameter search
    ¬–쎛
109 3.2.2. Randomized Parameter Optimization Xiao
 110 3.2.3. Alternatives to brute force parameter search
3.2.3.1. Model specific cross-validation
3.2.3.1.1. sklearn.linear_model.RidgeCV
3.2.3.1.2. sklearn.linear_model.RidgeClassifierCV
3.2.3.1.3. sklearn.linear_model.LarsCV
3.2.3.1.4. sklearn.linear_model.LassoLarsCV
3.2.3.1.5. sklearn.linear_model.LassoCV
3.2.3.1.6. sklearn.linear_model.ElasticNetCV
3.2.3.2. Information Criterion
3.2.3.2.1. sklearn.linear_model.LassoLarsIC
3.2.3.3. Out of Bag Estimates
3.2.3.3.1. sklearn.ensemble.RandomForestClassifier
3.2.3.3.2. sklearn.ensemble.RandomForestRegressor
3.2.3.3.3. sklearn.ensemble.ExtraTreesClassifier
3.2.3.3.4. sklearn.ensemble.ExtraTreesRegressor
3.2.3.3.5. sklearn.ensemble.GradientBoostingClassifier
3.2.3.3.6. sklearn.ensemble.GradientBoostingRegressor
    Kouno
 111 3.3. Pipeline: chaining estimators
3.3.1. Usage
    Nagata
 112 3.3.2. Notes     Shinnou
 113 3.4. FeatureUnion: Combining feature extractors
3.4.1. Usage
    Kunii
 114 3.5. Model evaluation: quantifying the quality of predictions
3.5.1. The scoring parameter: defining model evaluation rules
3.5.1.1. Common cases: predefined values
3.5.1.2. Defining your scoring strategy from score functions
3.5.1.3. Implementing your own scoring object
    Kikuchi
 115 3.5.2. Function for prediction-error metrics
3.5.2.1. Classification metrics
3.5.2.1.1. Accuracy score
3.5.2.1.2. Average precision score
3.5.2.1.3. Confusion matrix
3.5.2.1.4. Classification report
3.5.2.1.5. Hamming loss
3.5.2.1.6. Jaccard similarity coefficient score
3.5.2.1.7. Precision, recall and F-measures
3.5.2.1.7.1. Binary classification
3.5.2.1.7.2. Multiclass and multilabel classification
3.5.2.1.8. Hinge loss
3.5.2.1.9. Log loss
3.5.2.1.10. Matthews correlation coefficient
3.5.2.1.11. Receiver operating characteristic (ROC)
3.5.2.1.12. Zero one loss
3.5.2.2. Regression metrics
3.5.2.2.1. Explained variance score
3.5.2.2.2. Mean absolute error
3.5.2.2.3. Mean squared error
3.5.2.2.4. R² score, the coefficient of determination
    ¬–쎛
 116 3.5.3. Clustering metrics     Xiao
 117 3.5.4. Biclustering metrics
3.5.4.1. Clustering metrics
    Kouno
 118 3.5.5. Dummy estimators     Nagata
 119 4.1. Feature extraction
4.1.1. Loading features from dicts
    Shinnou
 120 4.1.2. Feature hashing     Kunii
 121 4.1.3. Text feature extraction     Kikuchi
 122 4.1.4. Image feature extraction     Onodera
 123 4.2. Preprocessing data
4.2.1. Standardization, or mean removal and variance scaling
    Xiao
 124 4.2.2. Normalization     Kouno
 125 4.2.3. Binarization     Nagata
 126 4.2.4. Encoding categorical features     Shinnou
 127 4.2.5. Label preprocessing     Kunii
 128 4.2.6. Imputation of missing values     Kikuchi
 129 4.3. Kernel Approximation
4.3.1. Nystroem Method for Kernel Approximation
    ¬–쎛
 130 4.3.2. Radial Basis Function Kernel     Xiao
 131 4.3.3. Additive Chi Squared Kernel     ‰Í–ì
 132 4.3.4. Skewed Chi Squared Kernel     ‰i“c
 133 4.3.5. Mathematical Details     V”[
 134 4.4. Random Projection
4.4.1. The Johnson-Lindenstrauss lemma
    Kunii
 135 4.4.2. Gaussian random projection     Kikuchi
 136 4.4.3. Sparse random projection     Onodera
 137 4.5. Pairwise metrics, Affinities and Kernels
4.5.1. Cosine similarity
    Xiao
 138 4.5.2. Chi Squared Kernel     Kouno
 139 5. Dataset loading utilities
5.1. General dataset API
    Nagata
 140 5.2. Toy datasets     Shinnou → Nagata
 141 5.3. Sample images     Kunii
 142 5.4. Sample generators     Kikuchi
 143 5.5. Datasets in svmlight / libsvm format     Onodera
 144 5.6. The Olivetti faces dataset     Xiao
 145 5.7. The 20 newsgroups text dataset
5.7.1. Usage
    Kouno
 146 5.7.2. Converting text to vectors     Nagata
 147 5.7.3. Filtering text for more realistic training     Shinnou → Nagata
 148 5.8. Downloading datasets from the mldata.org repository     Kunii
 149 5.9. The Labeled Faces in the Wild face recognition dataset
5.9.1. Usage
    Kikuchi
 150 5.9.2. Examples     Onodera
 151 5.10. Forest covertypes     Xiao