<< back to Guides
AI Core Concepts (Part 11): Supervised Learning
Supervised Learning is a type of machine learning where models are trained on labeled data: each input is associated with a known output (label). The model learns to map inputs to outputs by minimizing a loss function.
1. Core Idea
- Learn a function f(x) → y from examples of (x, y) pairs.
- The model predicts outputs on new, unseen inputs.
- Loss is computed based on how close predictions are to true labels.
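To make "minimizing a loss" concrete, here is a minimal sketch (the toy (x, y) values and candidate parameters are invented for illustration): it scores two candidate linear models f(x) = w*x + b with mean squared error. Training is simply a systematic search for the parameters with the lowest loss.
import numpy as np
# Toy labeled data: outputs roughly follow y = 2x + 1 (values invented for illustration)
X = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.1, 2.9, 5.2, 6.8])
def mse_loss(w, b):
    """Mean squared error of the linear model f(x) = w*x + b on the labeled data."""
    predictions = w * X + b
    return np.mean((predictions - y) ** 2)
# Training = finding the (w, b) with the lowest loss; here we just compare two candidates.
print(mse_loss(w=2.0, b=1.0))  # close to the underlying pattern -> small loss
print(mse_loss(w=0.5, b=0.0))  # poor fit -> large loss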
2. Types of Supervised Learning
🔹 Classification
- Goal: Predict a discrete label.
- Examples: Spam detection, image classification, sentiment analysis
from sklearn.linear_model import LogisticRegression
model = LogisticRegression()
model.fit(X_train, y_train)
predictions = model.predict(X_test)
🔹 Regression
- Goal: Predict a continuous value.
- Examples: Price prediction, temperature forecasting, stock values
from sklearn.linear_model import LinearRegression
regressor = LinearRegression()
regressor.fit(X_train, y_train)
y_pred = regressor.predict(X_test)
3. Common Algorithms
| Task | Algorithms |
|---|---|
| Classification | Logistic Regression, SVM, Random Forest, k-NN, Neural Networks |
| Regression | Linear Regression, SVR, XGBoost, Decision Trees, Ridge/Lasso |
| Universal | Gradient Boosting, Neural Nets, Transformers (with labels) |
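The scikit-learn estimators in this table all share the same fit/predict interface, so swapping algorithms is usually a one-line change. A minimal sketch on the iris dataset (the three algorithm choices here are arbitrary examples):
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
X_train, X_test, y_train, y_test = train_test_split(*load_iris(return_X_y=True), test_size=0.2)
for model in (SVC(), KNeighborsClassifier(), RandomForestClassifier()):
    model.fit(X_train, y_train)  # same training call for every algorithm
    acc = accuracy_score(y_test, model.predict(X_test))
    print(type(model).__name__, round(acc, 3))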
4. Supervised Learning Pipeline
- Data Preparation → clean, normalize, encode
- Train/Test Split → evaluate generalization
- Model Selection → pick based on data & task
- Training → learn parameters on training data
- Evaluation → use metrics (accuracy, MSE, F1)
- Tuning → hyperparameter optimization
- Deployment → inference on new data
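A minimal sketch that strings these steps together with scikit-learn's Pipeline and GridSearchCV (the dataset, model, and parameter grid are arbitrary choices for illustration; deployment would then mean calling search.predict on incoming data):
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
# Data preparation + train/test split
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Model selection + training: scaling and the classifier combined in one Pipeline
pipe = Pipeline([("scale", StandardScaler()),
                 ("clf", LogisticRegression(max_iter=1000))])
# Tuning: grid-search the regularization strength C with 5-fold cross-validation
search = GridSearchCV(pipe, param_grid={"clf__C": [0.1, 1.0, 10.0]}, cv=5)
search.fit(X_train, y_train)
# Evaluation on the held-out test set
print("Best C:", search.best_params_["clf__C"])
print("Test F1:", f1_score(y_test, search.predict(X_test)))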
5. Example: End-to-End Classification
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
# Load and split data
data = load_iris()
X_train, X_test, y_train, y_test = train_test_split(data.data, data.target, test_size=0.2)
# Train
clf = RandomForestClassifier()
clf.fit(X_train, y_train)
# Predict and evaluate
y_pred = clf.predict(X_test)
print("Accuracy:", accuracy_score(y_test, y_pred))
6. Key Evaluation Metrics
| Task | Metrics |
|---|---|
| Classification | Accuracy, Precision, Recall, F1, confusion matrix |
| Regression | Mean Squared Error (MSE), R² score |
| All tasks | Cross-validation |
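A quick sketch of how these metrics are computed with scikit-learn, continuing directly from the section 5 example (it reuses data, y_test, y_pred, and RandomForestClassifier from that snippet; iris is multi-class, hence average="macro"):
from sklearn.metrics import precision_score, recall_score, f1_score, confusion_matrix
from sklearn.model_selection import cross_val_score
# Classification metrics for the predictions from section 5
print("Precision:", precision_score(y_test, y_pred, average="macro"))
print("Recall:", recall_score(y_test, y_pred, average="macro"))
print("F1:", f1_score(y_test, y_pred, average="macro"))
print("Confusion matrix:\n", confusion_matrix(y_test, y_pred))
# Cross-validation: average accuracy over 5 different train/validation splits
scores = cross_val_score(RandomForestClassifier(), data.data, data.target, cv=5)
print("CV accuracy:", scores.mean())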
7. When to Use Supervised Learning
✅ Use when:
- You have labeled data
- The problem is well-defined
- You want to predict something specific
❌ Avoid when:
- You only have unlabeled data (consider unsupervised learning)
- Task involves sequential decisions (consider reinforcement learning)
📚 Further Resources
- Supervised Learning - Scikit-learn Guide
- Hands-On ML with Scikit-Learn, Keras, & TensorFlow
- Google ML Crash Course
<< back to Guides