03. Mastering Logistic Regression & Evaluation Metrics
Sigmoid, Cross-Entropy, Confusion Matrix, ROC/PR Curve
Learning Objectives
After completing this tutorial, you will be able to:
- Understand Sigmoid function and Log-Loss (Binary Cross-Entropy) formulas
- Understand logistic regression principles and Gradient Descent learning process
- Fully understand Confusion Matrix components: TP, FP, TN, FN
- Calculate and analyze Precision, Recall, F1-Score trade-offs
- Interpret ROC Curve, AUC, PR Curve
- Meet business requirements through Decision Threshold optimization
- Understand Imbalanced Data problems and class_weight solutions
Key Concepts
1. What is Logistic Regression?
An algorithm for solving Binary Classification problems.
It passes a linear combination of the features through the Sigmoid function to convert it into a probability:
P(y=1|x) = σ(z) = 1 / (1 + e^(-z))
where z = wᵀx + b (linear combination)
Sigmoid Function Properties
| Property | Description |
|---|---|
| Output Range | (0, 1) → Interpretable as probability |
| Center Value | σ(0) = 0.5 |
| Derivative | σ'(z) = σ(z)(1 - σ(z)) |
| Asymptotes | z → ∞: σ(z) → 1, z → -∞: σ(z) → 0 |
Since the Sigmoid function's output is always between 0 and 1, it can be interpreted as a probability. The class is then determined by comparing this probability to a threshold (default 0.5).
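A minimal NumPy sketch of the sigmoid and the default 0.5 threshold (the z values are made up for illustration):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))   # maps any real number into (0, 1)

z = np.array([-3.0, -0.5, 0.0, 0.5, 3.0])   # example linear combinations z = w·x + b
p = sigmoid(z)                              # probabilities of the positive class
print(np.round(p, 3))                       # [0.047 0.378 0.5   0.622 0.953]
print((p >= 0.5).astype(int))               # [0 0 1 1 1]  <- class decision at threshold 0.5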
2. Log-Loss (Binary Cross-Entropy)
L(w) = -(1/n) Σ[yᵢ log(p̂ᵢ) + (1-yᵢ) log(1-p̂ᵢ)]
Intuitive Understanding of Loss Function
| Actual Value | Predicted Probability | Loss | Interpretation |
|---|---|---|---|
| y=1 | p=0.9 | Low | Good prediction |
| y=1 | p=0.1 | High | Confidently wrong (big penalty) |
| y=0 | p=0.1 | Low | Good prediction |
| y=0 | p=0.9 | High | Confidently wrong (big penalty) |
Log-Loss grows without bound as the model becomes confidently wrong (the -log term diverges as the predicted probability for the true class approaches 0). This trains the model to avoid assigning high probability to incorrect predictions.
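A quick numeric check of the table's intuition using per-sample Log-Loss (the probabilities match the rows above):

import numpy as np

def bce(y, p):
    return -(y * np.log(p) + (1 - y) * np.log(1 - p))   # per-sample binary cross-entropy

print(round(bce(1, 0.9), 3))   # 0.105 -> good prediction, low loss
print(round(bce(1, 0.1), 3))   # 2.303 -> confidently wrong, high loss
print(round(bce(0, 0.1), 3))   # 0.105
print(round(bce(0, 0.9), 3))   # 2.303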
3. Understanding Confusion Matrix
|  | Predicted Neg | Predicted Pos |
|---|---|---|
| Actual Neg | TN | FP |
| Actual Pos | FN | TP |

| Item | Meaning | Description |
|---|---|---|
| TN | True Negative | Correctly predicted negative as negative |
| FP | False Positive | Incorrectly predicted negative as positive (Type I Error) |
| FN | False Negative | Incorrectly predicted positive as negative (Type II Error) |
| TP | True Positive | Correctly predicted positive as positive |
Medical Diagnosis Example (Breast Cancer Data):
- FN (False Negative): Misdiagnosing malignant tumor as benign → Critical! Missing treatment opportunity
- FP (False Positive): Misdiagnosing benign tumor as malignant → Additional unnecessary tests needed
Therefore, Recall (sensitivity) is very important in cancer diagnosis.
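scikit-learn's confusion_matrix returns exactly the layout above, [[TN, FP], [FN, TP]]; a small sketch with hypothetical labels:

from sklearn.metrics import confusion_matrix

y_true = [0, 0, 0, 1, 1, 1, 1, 0]    # hypothetical actual labels
y_pred = [0, 1, 0, 1, 0, 1, 1, 0]    # hypothetical predictions

cm = confusion_matrix(y_true, y_pred)
print(cm)                             # [[3 1]
                                      #  [1 3]]
tn, fp, fn, tp = cm.ravel()           # unpack in TN, FP, FN, TP order
print(tn, fp, fn, tp)                 # 3 1 1 3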
4. Evaluation Metrics Summary
| Metric | Formula | Meaning |
|---|---|---|
| Accuracy | (TP+TN) / All | Overall accuracy |
| Precision | TP / (TP+FP) | Ratio of actual positives among positive predictions ("probability of being correct when predicting positive") |
| Recall | TP / (TP+FN) | Ratio of positive predictions among actual positives ("probability of not missing actual positives") |
| F1-Score | 2PR / (P+R) | Harmonic mean of Precision and Recall |
| Specificity | TN / (TN+FP) | Ratio of negative predictions among actual negatives |
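To make the formulas concrete, here is a short sketch computing each metric from hypothetical counts (TP=80, FP=20, FN=10, TN=90):

tp, fp, fn, tn = 80, 20, 10, 90                                # hypothetical confusion matrix counts
accuracy    = (tp + tn) / (tp + fp + fn + tn)                  # 0.85
precision   = tp / (tp + fp)                                   # 0.80
recall      = tp / (tp + fn)                                   # ~0.889
f1          = 2 * precision * recall / (precision + recall)    # ~0.842
specificity = tn / (tn + fp)                                   # ~0.818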
5. ROC Curve & AUC
- ROC Curve: Plots TPR (True Positive Rate) vs FPR (False Positive Rate) at all thresholds
- AUC: Area under the ROC curve
| AUC Value | Interpretation |
|---|---|
| 1.0 | Perfect classifier |
| 0.9-1.0 | Excellent |
| 0.8-0.9 | Good |
| 0.7-0.8 | Fair |
| 0.5 | Random guess level |
from sklearn.metrics import roc_curve, auc, roc_auc_score
fpr, tpr, thresholds = roc_curve(y_test, y_proba)
roc_auc = auc(fpr, tpr)

ROC-AUC is threshold-independent, making it suitable for comparing different models.
6. Precision-Recall Curve
- More useful than ROC for imbalanced data
- AP (Average Precision): Area under the PR curve
from sklearn.metrics import precision_recall_curve, average_precision_score
precision, recall, thresholds = precision_recall_curve(y_test, y_proba)
ap = average_precision_score(y_test, y_proba)

7. Precision-Recall Trade-off
Adjusting threshold changes Precision and Recall inversely:
| Threshold Change | Positive Predictions | Precision | Recall |
|---|---|---|---|
| ↑ (e.g., 0.5→0.7) | Fewer (stricter condition) | ↑ (FP decreases) | ↓ (FN increases) |
| ↓ (e.g., 0.5→0.3) | More (looser condition) | ↓ (FP increases) | ↑ (FN decreases) |
Threshold Selection by Business Scenario
| Scenario | Important Metric | Threshold | Reason |
|---|---|---|---|
| Cancer Diagnosis | Recall | Low | Minimize FN (cannot miss cancer) |
| Spam Filter | Precision | High | Minimize FP (protect normal emails) |
| Balance | F1-Score | Optimal point | Balance Precision/Recall |
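scikit-learn's predict always applies the default 0.5 cutoff, so a business-specific threshold has to be applied to the predict_proba output yourself. A minimal sketch with made-up probabilities (in practice y_proba would come from model.predict_proba(X_test)[:, 1] as in the other snippets):

import numpy as np

y_proba = np.array([0.15, 0.35, 0.55, 0.72, 0.90])   # made-up predicted probabilities

print((y_proba >= 0.5).astype(int))   # default cutoff     -> [0 0 1 1 1]
print((y_proba >= 0.3).astype(int))   # recall-critical    -> [0 1 1 1 1] (more positives, fewer FN)
print((y_proba >= 0.7).astype(int))   # precision-critical -> [0 0 0 1 1] (fewer positives, fewer FP)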
import numpy as np

# Find the threshold that maximizes F1
# (precision_recall_curve returns one more precision/recall value than thresholds,
#  so drop the last point before matching indices)
f1_scores = 2 * (precision[:-1] * recall[:-1]) / (precision[:-1] + recall[:-1] + 1e-10)
best_idx = np.argmax(f1_scores)
best_threshold = thresholds[best_idx]

Code Summary
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (
accuracy_score, precision_score, recall_score, f1_score,
confusion_matrix, classification_report,
roc_auc_score, average_precision_score
)
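# Example data setup (the tutorial's practice notebook uses the Breast Cancer dataset)
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)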
# Training
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)
# Prediction
y_pred = model.predict(X_test)
y_proba = model.predict_proba(X_test)[:, 1]
# Evaluation
print(f"Accuracy: {accuracy_score(y_test, y_pred):.4f}")
print(f"Precision: {precision_score(y_test, y_pred):.4f}")
print(f"Recall: {recall_score(y_test, y_pred):.4f}")
print(f"F1: {f1_score(y_test, y_pred):.4f}")
print(f"ROC-AUC: {roc_auc_score(y_test, y_proba):.4f}")
# Detailed report
print(classification_report(y_test, y_pred))

Handling Imbalanced Data
# Using class_weight (higher weight for minority class)
model = LogisticRegression(class_weight='balanced')
# Or specify directly
model = LogisticRegression(class_weight={0: 1, 1: 10})

class_weight='balanced' automatically calculates weights as the inverse of class frequencies. It assigns higher weights to minority class samples, helping the model learn the minority class better.
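For reference, the 'balanced' weights follow n_samples / (n_classes * np.bincount(y)); a quick check on a hypothetical 90:10 label array:

import numpy as np

y = np.array([0] * 90 + [1] * 10)          # hypothetical 90:10 imbalanced labels
weights = len(y) / (2 * np.bincount(y))    # n_samples / (n_classes * count per class)
print(weights)                             # [0.556 5.   ] -> minority class weighted ~9x higher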
Evaluation Metric Selection Guide
| Scenario | Recommended Metric | Reason |
|---|---|---|
| Balanced data | Accuracy, F1 | Can evaluate overall performance |
| Imbalanced data | Precision, Recall, AUC | Accuracy can be misleading |
| Model comparison | ROC-AUC | Threshold independent |
| Imbalanced + Positive important | PR-AUC | Focuses on positive class |
Interview Questions Preview
- Explain the trade-off between Precision and Recall
- When do you use ROC-AUC vs PR-AUC?
- How does class_weight='balanced' work?
- Why is Log-Loss more suitable than MSE for classification?
Check out more interview questions at Premium Interviews.
Practice Notebook
Practice the above concepts with the Breast Cancer dataset.
The notebook additionally covers:
- Visualization of Sigmoid function and Log-Loss
- Logistic regression from Scratch implementation (Gradient Descent)
- Performance comparison experiments at various thresholds
- Decision Boundary visualization using PCA
- Imbalanced data generation and class_weight effect comparison
- Practice problems (Cost-Sensitive Learning, algorithm comparison, etc.)
Previous: 02. Linear Regression | Next: 04. Decision Tree