scikit-learn Seminar
http://scikit-learn.org/stable/index.html
| Date / No. | Content | Slide file | Supplementary materials | Presenter |
| April 24 | 1. Supervised Learning; 1.1. Generalized Linear Models; 1.1.1. Ordinary Least Squares; 1.1.1.1. Ordinary Least Squares Complexity; 1.1.2. Ridge Regression; 1.1.2.1. Ridge Complexity; 1.1.2.2. Setting the regularization parameter: generalized Cross-Validation | sklearn-shinnou-0424.pdf | | Shinnou |
| May 8 | 1.1.3. Lasso; 1.1.3.1. Setting regularization parameter; 1.1.3.1.1. Using cross-validation; 1.1.3.1.2. Information-criteria based model selection | sklearn-kunii-0508.pdf | | Kunii |
| | 1.1.4. Elastic Net | sklearn-kikuchi-0508.pdf | | Kikuchi |
| | 1.1.5. Multi-task Lasso | sklearn-onodera-0508.pdf | | Onodera |
| May 15 | 1.1.6. Least Angle Regression | sklearn-xiao-0515.pdf | | Xiao |
| | 1.1.7. LARS Lasso; 1.1.7.1. Mathematical formulation | sklearn-kouno-0515.pdf | | Kouno |
| | 1.1.8. Orthogonal Matching Pursuit (OMP) | sklearn-nagata-0515.pdf | | Nagata |
| May 22 | 1.1.9. Bayesian Regression; 1.1.9.1. Bayesian Ridge Regression; 1.1.9.2. Automatic Relevance Determination - ARD | sklearn-shinnou-0522.pdf | | Shinnou |
| | 1.1.10. Logistic regression | sklearn-kunii-0522.pdf | | Kunii |
| | 1.1.11. Stochastic Gradient Descent - SGD | sklearn-kikuchi-0522.pdf | | Kikuchi |
| May 29 | 1.1.12. Perceptron | sklearn-onodera-0529.pdf | | Onodera |
| | 1.1.13. Passive Aggressive Algorithms | sklearn-xiao-0529.pdf | | Xiao |
| | 1.2. Support Vector Machines; 1.2.1. Classification; 1.2.1.1. Multi-class classification; 1.2.1.2. Scores and probabilities; 1.2.1.3. Unbalanced problems | sklearn-kouno-0529.pdf | | Kouno |
| June 5 | 1.2.2. Regression | sklearn-nagata-0605.pdf | | Nagata |
| | 1.2.3. Density estimation, novelty detection | sklearn-shinnou-0605.pdf | | Shinnou |
| | 1.2.4. Complexity | sklearn-kunii-0605.pdf | | Kunii |
| | 1.2.5. Tips on Practical Use | sklearn-kikuchi-0605.pdf | | Kikuchi |
| June 12 | 1.2.6. Kernel functions; 1.2.6.1. Custom Kernels; 1.2.6.1.1. Using Python functions as kernels; 1.2.6.1.2. Using the Gram matrix; 1.2.6.1.3. Parameters of the RBF Kernel | sklearn-onodera-0612.pdf | | Onodera |
| | 1.2.7. Mathematical formulation; 1.2.7.1. SVC; 1.2.7.2. NuSVC | sklearn-xiao-0612.pdf | | Xiao |
| | 1.2.8. Implementation details | sklearn-kouno-0612.pdf | | Kouno |
| | 1.3. Stochastic Gradient Descent; 1.3.1. Classification | sklearn-shinnou-0612.pdf | | Nagata → Shinnou |
| June 26 | 1.3.2. Regression | sklearn-shinnou-0626.pdf | | Shinnou |
| | 1.3.3. Stochastic Gradient Descent for sparse data | | | Kunii |
| | 1.3.4. Complexity | | | Kikuchi |
| | 1.3.5. Tips on Practical Use | sklearn-onodera-0626.pdf | | Onodera |
| October 14 | 1.3.6. Mathematical formulation; 1.3.6.1. SGD | sklearn-xiao-1014.pptx | | Xiao |
| | 1.3.7. Implementation details | sklearn-kouno-1014.pdf | | Kouno |
| | 1.4. Nearest Neighbors; 1.4.1. Unsupervised Nearest Neighbors; 1.4.1.1. Finding the Nearest Neighbors; 1.4.1.2. KDTree and BallTree Classes | sklearn-nagata-1014.pdf | | Nagata |
| | 1.4.2. Nearest Neighbors Classification | sklearn-shinnou-1014.pdf | | Shinnou |
| October 21 | 1.4.3. Nearest Neighbors Regression | sklearn-kunii-1021.pdf | | Kunii |
| | 1.4.4. Nearest Neighbor Algorithms; 1.4.4.1. Brute Force; 1.4.4.2. K-D Tree; 1.4.4.3. Ball Tree; 1.4.4.4. Choice of Nearest Neighbors Algorithm; 1.4.4.5. Effect of leaf_size | sklearn-kikuchi-1021.pdf | | Kikuchi |
| | 1.4.5. Nearest Centroid Classifier; 1.4.5.1. Nearest Shrunken Centroid | sklearn-onodera-1021.pdf | | Onodera |
| | 1.5. Gaussian Processes; 1.5.1. Examples; 1.5.1.1. An introductory regression example; 1.5.1.2. Fitting Noisy Data | sklearn-xiao-1021.pptx | | Xiao |
| October 28 | 1.5.2. Mathematical formulation; 1.5.2.1. The initial assumption; 1.5.2.2. The best linear unbiased prediction (BLUP); 1.5.2.3. The empirical best linear unbiased predictor (EBLUP) | sklearn-kouno-1028.pdf | | Kouno |
| | 1.5.3. Correlation Models | sklearn-nagata-1028.pdf | | Nagata |
| | 1.5.4. Regression Models | sklearn-shinnou-1028.pdf | | Shinnou |
| | 1.5.5. Implementation details | sklearn-kunii-1028.pdf | | Kunii |
| November 4 | 1.6. Cross decomposition | sklearn-kikuchi-1104.pdf | | Kikuchi |
| | 1.7. Naive Bayes; 1.7.1. Gaussian Naive Bayes | sklearn-onodera-1104.pdf | | Onodera |
| | 1.7.2. Multinomial Naive Bayes | sklearn-xiao-1104.pptx | | Xiao |
| | 1.7.3. Bernoulli Naive Bayes | sklearn-kouno-1104.pdf | | Kouno |
| | 1.7.4. Out-of-core naive Bayes model fitting | sklearn-nagata-1104.pdf | | Nagata |
| November 11 | 1.8. Decision Trees; 1.8.1. Classification | sklearn-shinnou-1111.pdf | | Shinnou |
| | 1.8.2. Regression | sklearn-kunii-1111.pdf | | Kunii |
| | 1.8.3. Multi-output problems | sklearn-kikuchi-1111.pdf | | Kikuchi |
| | 1.8.4. Complexity | sklearn-onodera-1111.pdf | | Onodera |
| | 1.8.5. Tips on practical use | sklearn-xiao-1111.pdf | | Xiao |
| | 1.8.6. Tree algorithms: ID3, C4.5, C5.0 and CART | sklearn-kouno-1111.pdf | | Kouno |
| | 1.8.7. Mathematical formulation; 1.8.7.1. Classification criteria; 1.8.7.2. Regression criteria | sklearn-nagata-1111.pdf | | Nagata |
| November 21 | 1.9. Ensemble methods; 1.9.1. Forests of randomized trees; 1.9.1.1. Random Forests; 1.9.1.2. Extremely Randomized Trees; 1.9.1.3. Parameters; 1.9.1.4. Parallelization; 1.9.1.5. Feature importance evaluation; 1.9.1.6. Totally Random Trees Embedding | sklearn-shinnou-1121.pdf | | Shinnou |
| | 1.9.2. AdaBoost; 1.9.2.1. Usage | sklearn-kunii-1121.pdf | | Kunii |
| | 1.9.3. Gradient Tree Boosting; 1.9.3.1. Classification; 1.9.3.2. Regression; 1.9.3.3. Mathematical formulation; 1.9.3.3.1. Loss Functions; 1.9.3.4. Regularization; 1.9.3.4.1. Shrinkage; 1.9.3.4.2. Subsampling; 1.9.3.5. Interpretation; 1.9.3.5.1. Feature importance; 1.9.3.5.2. Partial dependence | sklearn-kikuchi-1121.pdf | | Kikuchi |
| November 25 | 1.10. Multiclass and multilabel algorithms; 1.10.1. Multilabel classification format | sklearn-onodera-1125.pdf | | Onodera |
| | 1.10.2. One-Vs-The-Rest; 1.10.2.1. Multiclass learning; 1.10.2.2. Multilabel learning | sklearn-xiao-1125.pdf | | Xiao |
| | 1.10.3. One-Vs-One; 1.10.3.1. Multiclass learning | sklearn-kouno-1125.pdf | | Kouno |
| | 1.10.4. Error-Correcting Output-Codes; 1.10.4.1. Multiclass learning | sklearn-nagata-1125.pdf | | Nagata |
| December 2 | 1.11. Feature selection; 1.11.1. Removing features with low variance; 1.11.2. Univariate feature selection | sklearn-shinnou-1202.pdf | | Shinnou |
| | 1.11.3. Recursive feature elimination | | | Kunii |
| | 1.11.4. L1-based feature selection; 1.11.4.1. Selecting non-zero coefficients; 1.11.4.2. Randomized sparse models | sklearn-kikuchi-1202.pdf | | Kikuchi |
| | 1.11.5. Tree-based feature selection | | | Onodera |
| | 1.11.6. Feature selection as part of a pipeline | sklearn-xiao-1202.pdf | | Xiao |
| December 5 | 1.12. Semi-Supervised; 1.12.1. Label Propagation | | | Kouno |
| | 1.13. Linear and quadratic discriminant analysis; 1.13.1. Dimensionality reduction using LDA | | | Nagata |
| | 1.13.2. Mathematical Idea | sklearn-shinnou-1205.pdf | | Shinnou |
| | 1.14. Isotonic regression | | | Kunii |
| 81 | 2. Unsupervised Learning; 2.1. Gaussian mixture models; 2.1.1. GMM classifier; 2.1.1.1. Pros and cons of class GMM: expectation-maximization inference; 2.1.1.1.1. Pros; 2.1.1.1.2. Cons | | | Kikuchi |
| | 2.1.1.2. Selecting the number of components in a classical GMM; 2.1.1.3. Estimation algorithm Expectation-maximization | | | Onodera |
| | 2.1.2. VBGMM classifier: variational Gaussian mixtures; 2.1.2.1. Pros and cons of class VBGMM: variational inference; 2.1.2.1.1. Pros; 2.1.2.1.2. Cons; 2.1.2.2. Estimation algorithm: variational inference | | | Xiao |
| | 2.1.3. DPGMM classifier: Infinite Gaussian mixtures; 2.1.3.1. Pros and cons of class DPGMM: Dirichlet process mixture model; 2.1.3.1.1. Pros; 2.1.3.1.2. Cons; 2.1.3.2. The Dirichlet Process | | | Kouno |
| 82 | 2.2. Manifold learning; 2.2.1. Introduction | | | Nagata |
| | 2.2.2. Isomap; 2.2.2.1. Complexity | | | Shinnou |
| | 2.2.3. Locally Linear Embedding; 2.2.3.1. Complexity | | | Kunii |
| | 2.2.4. Modified Locally Linear Embedding; 2.2.4.1. Complexity | | | Kikuchi |
| | 2.2.5. Hessian Eigenmapping; 2.2.5.1. Complexity | | | Onodera |
| 83 | 2.2.6. Spectral Embedding; 2.2.6.1. Complexity | | | Xiao |
| | 2.2.7. Local Tangent Space Alignment; 2.2.7.1. Complexity | | | Kouno |
| | 2.2.8. Multi-dimensional Scaling (MDS); 2.2.8.1. Metric MDS; 2.2.8.2. Nonmetric MDS | | | Nagata |
| | 2.2.9. Tips on practical use | | | Shinnou |
| 84 | 2.3. Clustering; 2.3.1. Overview of clustering methods | | | Kunii |
| | 2.3.2. K-means; 2.3.2.1. Mini Batch K-Means | | | Kikuchi |
| | 2.3.3. Affinity Propagation | | | Onodera |
| | 2.3.4. Mean Shift | | | Xiao |
| | 2.3.5. Spectral clustering; 2.3.5.1. Different label assignment strategies | | | Kouno |
| 85 | 2.3.6. Hierarchical clustering; 2.3.6.1. Adding connectivity constraints | | | Nagata |
| | 2.3.7. DBSCAN | | | Shinnou |
| | 2.3.8. Clustering performance evaluation; 2.3.8.1. Inertia; 2.3.8.1.1. Presentation and usage; 2.3.8.1.2. Advantages; 2.3.8.1.3. Drawbacks; 2.3.8.2. Adjusted Rand index; 2.3.8.2.1. Presentation and usage; 2.3.8.2.2. Advantages; 2.3.8.2.3. Drawbacks; 2.3.8.2.4. Mathematical formulation; 2.3.8.3. Mutual Information based scores; 2.3.8.3.1. Presentation and usage; 2.3.8.3.2. Advantages; 2.3.8.3.3. Drawbacks; 2.3.8.3.4. Mathematical formulation; 2.3.8.4. Homogeneity, completeness and V-measure; 2.3.8.4.1. Presentation and usage; 2.3.8.4.2. Advantages; 2.3.8.4.3. Drawbacks; 2.3.8.4.4. Mathematical formulation; 2.3.8.5. Silhouette Coefficient; 2.3.8.5.1. Presentation and usage; 2.3.8.5.2. Advantages; 2.3.8.5.3. Drawbacks | | | Kunii |
| 86 | 2.4. Biclustering; 2.4.1. Spectral Co-Clustering; 2.4.1.1. Mathematical formulation | | | Kikuchi |
| 87 | 2.4.2. Spectral Biclustering; 2.4.2.1. Mathematical formulation | | | Onodera |
| 88 | 2.4.3. Biclustering evaluation | | | Xiao |
| 89 | 2.5. Decomposing signals in components (matrix factorization problems); 2.5.1. Principal component analysis (PCA); 2.5.1.1. Exact PCA and probabilistic interpretation; 2.5.1.2. Approximate PCA; 2.5.1.3. Kernel PCA; 2.5.1.4. Sparse principal components analysis (SparsePCA and MiniBatchSparsePCA) | | | Kouno |
| 90 | 2.5.2. Truncated singular value decomposition and latent semantic analysis | | | Nagata |
| 91 | 2.5.3. Dictionary Learning; 2.5.3.1. Sparse coding with a precomputed dictionary; 2.5.3.2. Generic dictionary learning; 2.5.3.3. Mini-batch dictionary learning | | | Shinnou |
| 92 | 2.5.4. Factor Analysis | | | Kunii |
| 93 | 2.5.5. Independent component analysis (ICA) | | | Kikuchi |
| 94 | 2.5.6. Non-negative matrix factorization (NMF or NNMF) | | | Onodera |
| 95 | 2.6. Covariance estimation; 2.6.1. Empirical covariance | | | Xiao |
| 96 | 2.6.2. Shrunk Covariance; 2.6.2.1. Basic shrinkage; 2.6.2.2. Ledoit-Wolf shrinkage; 2.6.2.3. Oracle Approximating Shrinkage | | | Kouno |
| 97 | 2.6.3. Sparse inverse covariance | | | Nagata |
| 98 | 2.6.4. Robust Covariance Estimation; 2.6.4.1. Minimum Covariance Determinant | | | Shinnou |
| 99 | 2.7. Novelty and Outlier Detection; 2.7.1. Novelty Detection | | | Kunii |
| 100 | 2.7.2. Outlier Detection; 2.7.2.1. Fitting an elliptic envelope; 2.7.2.2. One-class SVM versus elliptic envelope | | | Kikuchi |
| 101 | 2.8. Hidden Markov Models; 2.8.1. Using HMM; 2.8.1.1. Building HMM and generating samples; 2.8.1.2. Training HMM parameters and inferring the hidden states; 2.8.1.3. Implementing HMMs with custom emission probabilities | | | Onodera |
| 102 | 2.9. Density Estimation; 2.9.1. Density Estimation: Histograms | | | Xiao |
| 103 | 2.9.2. Kernel Density Estimation | | | Kouno |
| 104 | 2.10. Neural network models (unsupervised); 2.10.1. Restricted Boltzmann machines; 2.10.1.1. Graphical model and parametrization; 2.10.1.2. Bernoulli Restricted Boltzmann machines; 2.10.1.3. Stochastic Maximum Likelihood learning | | | Nagata |
| 105 | 3. Model selection and evaluation; 3.1. Cross-validation: evaluating estimator performance; 3.1.1. Computing cross-validated metrics | | | Shinnou |
| 106 | 3.1.2. Cross validation iterators; 3.1.2.1. K-fold; 3.1.2.2. Stratified k-fold; 3.1.2.3. Leave-One-Out - LOO; 3.1.2.4. Leave-P-Out - LPO; 3.1.2.5. Leave-One-Label-Out - LOLO; 3.1.2.6. Leave-P-Label-Out; 3.1.2.7. Random permutations cross-validation a.k.a. Shuffle & Split; 3.1.2.8. See also; 3.1.2.9. Bootstrapping cross-validation | | | Kunii |
| 107 | 3.1.3. Cross validation and model selection | | | Kikuchi |
| 108 | 3.2. Grid Search: Searching for estimator parameters; 3.2.1. Exhaustive Grid Search; 3.2.1.1. Scoring functions for parameter search | | | Onodera |
| 109 | 3.2.2. Randomized Parameter Optimization | | | Xiao |
| 110 | 3.2.3. Alternatives to brute force parameter search; 3.2.3.1. Model specific cross-validation; 3.2.3.1.1. sklearn.linear_model.RidgeCV; 3.2.3.1.2. sklearn.linear_model.RidgeClassifierCV; 3.2.3.1.3. sklearn.linear_model.LarsCV; 3.2.3.1.4. sklearn.linear_model.LassoLarsCV; 3.2.3.1.5. sklearn.linear_model.LassoCV; 3.2.3.1.6. sklearn.linear_model.ElasticNetCV; 3.2.3.2. Information Criterion; 3.2.3.2.1. sklearn.linear_model.LassoLarsIC; 3.2.3.3. Out of Bag Estimates; 3.2.3.3.1. sklearn.ensemble.RandomForestClassifier; 3.2.3.3.2. sklearn.ensemble.RandomForestRegressor; 3.2.3.3.3. sklearn.ensemble.ExtraTreesClassifier; 3.2.3.3.4. sklearn.ensemble.ExtraTreesRegressor; 3.2.3.3.5. sklearn.ensemble.GradientBoostingClassifier; 3.2.3.3.6. sklearn.ensemble.GradientBoostingRegressor | | | Kouno |
| 111 | 3.3. Pipeline: chaining estimators; 3.3.1. Usage | | | Nagata |
| 112 | 3.3.2. Notes | | | Shinnou |
| 113 | 3.4. FeatureUnion: Combining feature extractors; 3.4.1. Usage | | | Kunii |
| 114 | 3.5. Model evaluation: quantifying the quality of predictions; 3.5.1. The scoring parameter: defining model evaluation rules; 3.5.1.1. Common cases: predefined values; 3.5.1.2. Defining your scoring strategy from score functions; 3.5.1.3. Implementing your own scoring object | | | Kikuchi |
| 115 | 3.5.2. Function for prediction-error metrics; 3.5.2.1. Classification metrics; 3.5.2.1.1. Accuracy score; 3.5.2.1.2. Average precision score; 3.5.2.1.3. Confusion matrix; 3.5.2.1.4. Classification report; 3.5.2.1.5. Hamming loss; 3.5.2.1.6. Jaccard similarity coefficient score; 3.5.2.1.7. Precision, recall and F-measures; 3.5.2.1.7.1. Binary classification; 3.5.2.1.7.2. Multiclass and multilabel classification; 3.5.2.1.8. Hinge loss; 3.5.2.1.9. Log loss; 3.5.2.1.10. Matthews correlation coefficient; 3.5.2.1.11. Receiver operating characteristic (ROC); 3.5.2.1.12. Zero one loss; 3.5.2.2. Regression metrics; 3.5.2.2.1. Explained variance score; 3.5.2.2.2. Mean absolute error; 3.5.2.2.3. Mean squared error; 3.5.2.2.4. R² score, the coefficient of determination | | | Onodera |
| 116 | 3.5.3. Clustering metrics | | | Xiao |
| 117 | 3.5.4. Biclustering metrics; 3.5.4.1. Clustering metrics | | | Kouno |
| 118 | 3.5.5. Dummy estimators | | | Nagata |
| 119 | 4.1. Feature extraction; 4.1.1. Loading features from dicts | | | Shinnou |
| 120 | 4.1.2. Feature hashing | | | Kunii |
| 121 | 4.1.3. Text feature extraction | | | Kikuchi |
| 122 | 4.1.4. Image feature extraction | | | Onodera |
| 123 | 4.2. Preprocessing data; 4.2.1. Standardization, or mean removal and variance scaling | | | Xiao |
| 124 | 4.2.2. Normalization | | | Kouno |
| 125 | 4.2.3. Binarization | | | Nagata |
| 126 | 4.2.4. Encoding categorical features | | | Shinnou |
| 127 | 4.2.5. Label preprocessing | | | Kunii |
| 128 | 4.2.6. Imputation of missing values | | | Kikuchi |
| 129 | 4.3. Kernel Approximation; 4.3.1. Nystroem Method for Kernel Approximation | | | Onodera |
| 130 | 4.3.2. Radial Basis Function Kernel | | | Xiao |
| 131 | 4.3.3. Additive Chi Squared Kernel | | | Kouno |
| 132 | 4.3.4. Skewed Chi Squared Kernel | | | Nagata |
| 133 | 4.3.5. Mathematical Details | | | Shinnou |
| 134 | 4.4. Random Projection; 4.4.1. The Johnson-Lindenstrauss lemma | | | Kunii |
| 135 | 4.4.2. Gaussian random projection | | | Kikuchi |
| 136 | 4.4.3. Sparse random projection | | | Onodera |
| 137 | 4.5. Pairwise metrics, Affinities and Kernels; 4.5.1. Cosine similarity | | | Xiao |
| 138 | 4.5.2. Chi Squared Kernel | | | Kouno |
| 139 | 5. Dataset loading utilities; 5.1. General dataset API | | | Nagata |
| 140 | 5.2. Toy datasets | | | Shinnou → Nagata |
| 141 | 5.3. Sample images | | | Kunii |
| 142 | 5.4. Sample generators | | | Kikuchi |
| 143 | 5.5. Datasets in svmlight / libsvm format | | | Onodera |
| 144 | 5.6. The Olivetti faces dataset | | | Xiao |
| 145 | 5.7. The 20 newsgroups text dataset; 5.7.1. Usage | | | Kouno |
| 146 | 5.7.2. Converting text to vectors | | | Nagata |
| 147 | 5.7.3. Filtering text for more realistic training | | | Shinnou → Nagata |
| 148 | 5.8. Downloading datasets from the mldata.org repository | | | Kunii |
| 149 | 5.9. The Labeled Faces in the Wild face recognition dataset; 5.9.1. Usage | | | Kikuchi |
| 150 | 5.9.2. Examples | | | Onodera |
| 151 | 5.10. Forest covertypes | | | Xiao |
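
For quick reference, the seminar's very first topic (1.1.1. Ordinary Least Squares) corresponds to the minimal scikit-learn sketch below; the toy data is invented purely for illustration.

    # Ordinary least squares with scikit-learn (cf. user guide section 1.1.1).
    import numpy as np
    from sklearn.linear_model import LinearRegression

    # Invented toy data: roughly y = 1 + 2*x with a little noise.
    X = np.array([[0.0], [1.0], [2.0], [3.0]])
    y = np.array([1.1, 2.9, 5.2, 6.8])

    model = LinearRegression()
    model.fit(X, y)                       # estimate coefficients by least squares
    print(model.coef_, model.intercept_)  # fitted slope and intercept
    print(model.predict([[4.0]]))         # prediction for a new input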