Exploring Machine Learning Algorithms



Machine Learning: A Friendly Tour
Ali’s Field Notes

THE ALGORITHM GARDEN

Machine Learning isn’t magic. It’s a toolbox.
From teaching puppies to folding maps, here is your friendly tour.

Start With The Question

Before names or math, ask: What do you want to discover? Pick the right tool, plus a bit of care, and you get useful answers.

Predicting Numbers?

Supervised > Regression

Predicting Categories?

Supervised > Classification

Finding Groups?

Unsupervised > Clustering

Complex Patterns?

Deep Learning
🐶

“The first time I trained a model, it felt like teaching a curious puppy to fetch—clumsy at first, then surprisingly good.”

Family 01

Supervised Learning

Learning from examples that have the “right” answer. Like a student with an answer key.

Linear Regression

The “Ruler”

Think of a ruler laid through a scatter of dots. It tries to draw the best-fitting straight line. Great for quick baselines such as housing prices.
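To make the ruler picture concrete, here is a minimal sketch of a least-squares line fit for a single feature in plain Python. The sizes and prices are made-up illustration values (real data is never this tidy), and `fit_line` is a hypothetical helper name, not a library function.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance(x, y) / variance(x)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Made-up data: size in square metres, price in thousands.
sizes = [50, 70, 90, 110]
prices = [150, 210, 270, 330]

slope, intercept = fit_line(sizes, prices)
predicted = slope * 100 + intercept  # estimated price for a 100 m² home
```

A baseline like this is cheap to compute, so it gives you a number to beat before you reach for anything heavier.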

Logistic Regression

The “Yes/No”

Estimates the chance something is A or B. Will a customer churn? Is this spam? Clean and reliable.
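As a rough sketch of how the "chance" is produced: logistic regression computes a weighted score and squashes it into a probability between 0 and 1. The weights, bias, and feature names below are hand-picked assumptions for illustration, not values learned from any real churn data.

```python
import math

def predict_proba(features, weights, bias):
    """Probability of the positive class under a logistic regression model."""
    score = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-score))  # sigmoid maps any score into (0, 1)

# Hypothetical, hand-picked parameters (not learned from data).
# features = [support tickets last month, years as a customer]
weights = [0.8, -0.5]
bias = -1.0

p_churn = predict_proba([3, 1], weights, bias)  # many tickets, new customer
p_stay = predict_proba([0, 6], weights, bias)   # no tickets, long-time customer
label = "churn" if p_churn >= 0.5 else "stay"
```

The 0.5 cut-off is itself a choice; moving it trades false alarms against misses.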

Trees & Forests

The “Flowchart”

Splits data with simple rules (if a feature is > X, then Y). Random Forests let many trees vote, which increases stability.
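The voting idea can be sketched with three hand-written one-rule "trees". This is only an illustration of the vote: a real Random Forest learns its trees from random samples of the data, and the thresholds and field names here are assumptions.

```python
from collections import Counter

# Each "tree" is a tiny hand-written rule: if feature > threshold then A else B.
def tree_income(applicant):
    return "approve" if applicant["income"] > 40_000 else "reject"

def tree_debt(applicant):
    return "reject" if applicant["debt"] > 20_000 else "approve"

def tree_history(applicant):
    return "approve" if applicant["years_of_history"] > 2 else "reject"

def forest_predict(trees, applicant):
    """Let every tree vote and return the most common answer."""
    votes = Counter(tree(applicant) for tree in trees)
    return votes.most_common(1)[0][0]

trees = [tree_income, tree_debt, tree_history]
applicant = {"income": 55_000, "debt": 25_000, "years_of_history": 5}
decision = forest_predict(trees, applicant)  # two of the three trees approve
```

One tree objects because of the debt, but the other two outvote it, which is why a forest is steadier than any single tree.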

Family 02

Unsupervised Learning

Finding structure without answers. Looking for shapes in the fog.

Clustering (k-Means)

segmentation

Imagine tossing magnets on a metal sheet. Points pull toward the nearest magnet. Great for grouping similar customers.
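The magnet picture maps onto two repeated steps: each point joins its nearest magnet (centroid), then each magnet moves to the middle of its points. Below is a minimal sketch in plain Python; the customer numbers and starting centroids are made up, and the function name `kmeans` is just a label for this sketch.

```python
def kmeans(points, centroids, iterations=10):
    """Plain k-means on 2-D points: assign each point to its nearest centroid,
    then move each centroid to the mean of its points. Runs a fixed number of rounds."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            distances = [(p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [
            (sum(x for x, _ in cl) / len(cl), sum(y for _, y in cl) / len(cl)) if cl else c
            for cl, c in zip(clusters, centroids)
        ]
    return centroids, clusters

# Made-up customers: (visits per month, average spend).
customers = [(1, 10), (2, 12), (1, 14), (9, 80), (10, 85), (11, 90)]
start = [(0, 0), (10, 100)]  # hand-picked starting magnets

centroids, clusters = kmeans(customers, start)
```

In practice the starting positions are usually chosen at random and the features are scaled first, since k-means is sensitive to both.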

PCA (Dimensionality)

simplification

Like folding a complex map. It reshapes the data space so most variation fits into fewer directions.
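A small sketch of the idea for two columns: find the single direction that carries most of the variation, then describe each point by one number along it. The closed-form formula below only works for the 2-D case, the data is made up, and `pca_2d` is a hypothetical helper name.

```python
import math

def pca_2d(points):
    """First principal component of 2-D points.
    Returns (direction, explained_ratio, projected)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    centered = [(x - mean_x, y - mean_y) for x, y in points]
    # Entries of the 2x2 covariance matrix [[a, b], [b, c]].
    a = sum(x * x for x, _ in centered) / (n - 1)
    c = sum(y * y for _, y in centered) / (n - 1)
    b = sum(x * y for x, y in centered) / (n - 1)
    # Closed form for a symmetric 2x2 matrix: largest eigenvalue and its direction.
    largest = (a + c) / 2 + math.sqrt(((a - c) / 2) ** 2 + b ** 2)
    theta = 0.5 * math.atan2(2 * b, a - c)
    direction = (math.cos(theta), math.sin(theta))
    # One number per point: its position along the main direction.
    projected = [x * direction[0] + y * direction[1] for x, y in centered]
    return direction, largest / (a + c), projected

# Made-up points that lie close to a diagonal line.
points = [(1, 1.1), (2, 1.9), (3, 3.2), (4, 3.8), (5, 5.0)]
direction, explained, projected = pca_2d(points)
```

Here `explained` is the share of total variance kept after folding two columns into one; for data this close to a line it is very near 1.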

Family 03

Deep Learning

When data is rich and messy (images, sound, text), neural networks shine.

CNN

Convolutional Neural Networks scan images with filters. They catch edges, textures, shapes.
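To show what "scanning with a filter" means, here is a minimal sketch that slides a 2x2 filter across a tiny image. The filter is written by hand for illustration; a CNN would learn its filters from data. (Strictly, this is cross-correlation, which is what deep learning libraries usually mean by "convolution".)

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image (no padding, stride 1) and return the response map."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j] for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]

# A 4x4 image: dark on the left (0), bright on the right (1).
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# Hand-written filter that responds to a dark-to-bright step from left to right.
vertical_edge = [
    [-1, 1],
    [-1, 1],
]
response = convolve2d(image, vertical_edge)
# response == [[0, 2, 0], [0, 2, 0], [0, 2, 0]]
```

The response is zero over flat regions and peaks exactly where the brightness changes, which is how a filter "catches" an edge.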

Transformers

The modern standard for text. They pay attention to all words at once. Powered by massive compute.
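"Paying attention to all words at once" can be sketched as scaled dot-product attention: every word scores every other word, the scores become weights that sum to 1, and each word's new vector is a weighted mix of all of them. This is a simplified sketch: the 2-D "embeddings" are made up, and queries, keys, and values are the raw vectors themselves, whereas real Transformers apply learned projections first.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(vectors):
    """Scaled dot-product attention with queries = keys = values = the inputs."""
    d = len(vectors[0])
    outputs, all_weights = [], []
    for q in vectors:
        # Score this word against every word, including itself.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in vectors]
        weights = softmax(scores)
        # New vector = weighted mix of all word vectors.
        mixed = [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(d)]
        outputs.append(mixed)
        all_weights.append(weights)
    return outputs, all_weights

# Made-up 2-D "embeddings" for three words.
words = ["river", "bank", "money"]
embeddings = [[1.0, 0.0], [0.7, 0.7], [0.0, 1.0]]
outputs, weights = self_attention(embeddings)
```

Each row of `weights` shows how much one word draws on every other word; "bank" sits between "river" and "money", so both pull on it.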

Neural Nets

Stacked layers of simple units. Can surpass classic models when signal is complex.
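A small sketch of why stacking helps: the two-layer network below computes XOR ("one or the other, but not both"), a pattern no single straight line can separate. The weights are set by hand purely for illustration; a real network would learn them during training.

```python
def relu(z):
    return max(0.0, z)

def dense(inputs, weights, biases, activation):
    """One layer: each unit takes a weighted sum of the inputs, adds a bias,
    then applies the activation."""
    return [
        activation(sum(w * x for w, x in zip(unit_weights, inputs)) + b)
        for unit_weights, b in zip(weights, biases)
    ]

def tiny_network(x1, x2):
    # Hidden layer: two ReLU units with hand-set weights.
    hidden = dense([x1, x2], weights=[[1, 1], [1, 1]], biases=[0, -1], activation=relu)
    # Output layer: one linear unit combining the hidden units.
    output = dense(hidden, weights=[[1, -2]], biases=[0], activation=lambda z: z)
    return output[0]

xor_table = {(a, b): tiny_network(a, b) for a in (0, 1) for b in (0, 1)}
# (0, 0) -> 0, (0, 1) -> 1, (1, 0) -> 1, (1, 1) -> 0
```

Each unit on its own is simple; the second layer combining the first is what makes the non-linear pattern reachable.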

How To Judge Models

Overfitting is the classic pothole. You crush the training set, then stumble on new data. It’s like memorizing the answers instead of learning the subject.
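The memorizing picture can be made literal. In this sketch, one "model" stores every training answer and another uses a simple rule; the data is made up to roughly follow y = 2x, and both models are hand-written stand-ins rather than trained ones.

```python
def mean_abs_error(model, xs, ys):
    """Average size of the model's mistakes on the given data."""
    return sum(abs(model(x) - y) for x, y in zip(xs, ys)) / len(xs)

# Made-up data that roughly follows y = 2x.
train_x, train_y = [1, 2, 3, 4], [2.1, 3.9, 6.2, 7.8]
new_x, new_y = [5, 6], [10.1, 11.9]

# "Memorizer": stores every training answer; for unseen inputs it can only
# fall back on the training average.
answers = dict(zip(train_x, train_y))
average = sum(train_y) / len(train_y)

def memorizer(x):
    return answers.get(x, average)

# Simple model: the underlying rule, y = 2x.
def simple_rule(x):
    return 2 * x

memorizer_train = mean_abs_error(memorizer, train_x, train_y)  # perfect on training data
memorizer_new = mean_abs_error(memorizer, new_x, new_y)        # poor on new data
rule_train = mean_abs_error(simple_rule, train_x, train_y)     # small but nonzero
rule_new = mean_abs_error(simple_rule, new_x, new_y)           # stays small
```

The memorizer wins on the training set and loses badly on new data, while the simple rule is slightly imperfect on both. That gap between training and new-data error is the signature of overfitting.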

Precision vs Recall

Precision: Of everything you flagged, how much was real? (Few false alarms.)
Recall: Of everything real, how much did you catch? (Don’t miss the bad guys.)
(Choose based on what hurts more.)
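Both numbers come from simple counts, as this minimal sketch shows. The labels are made up (1 = "bad guy", 0 = harmless), and `precision_recall` is a hypothetical helper written for this example.

```python
def precision_recall(y_true, y_pred):
    """Compute precision and recall for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)  # caught
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)  # false alarms
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)  # missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Made-up example: 4 real positives; the model flags 3 items, 2 of them correctly.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0]

precision, recall = precision_recall(y_true, y_pred)
# precision = 2/3 (one false alarm), recall = 1/2 (two misses)
```

A spam filter usually leans toward precision (don't bin real mail); a fraud or disease screen usually leans toward recall (don't let a real case slip through).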

Split Fairly

Train / Validate / Test
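One way to split fairly is to shuffle once, then cut the data into three parts that never overlap. This is a minimal sketch with assumed 60/20/20 proportions and a hypothetical function name; time-ordered or grouped data would need a different scheme.

```python
import random

def train_validate_test_split(rows, validate_fraction=0.2, test_fraction=0.2, seed=0):
    """Shuffle a copy of the rows, then cut it into train, validate and test parts."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    n_test = round(len(shuffled) * test_fraction)
    n_validate = round(len(shuffled) * validate_fraction)
    test = shuffled[:n_test]
    validate = shuffled[n_test:n_test + n_validate]
    train = shuffled[n_test + n_validate:]
    return train, validate, test

rows = list(range(100))  # stand-in for 100 data rows
train, validate, test = train_validate_test_split(rows)
```

Fit on `train`, tune choices on `validate`, and look at `test` only once at the end, so it stays an honest estimate of performance on new data.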

The Sweet Spot (Bias vs Variance)

A Quick Field Guide

What to try first when you land on a new planet.

Scenario                 | Start With…                  | Then Try…
Tabular Data (Excel/SQL) | Linear / Logistic Regression | Random Forest or Gradient Boosting
Few rows, many columns   | Regression + Regularization  | PCA (Simplify) before modeling
Images                   | Small CNN                    | Pre-trained ResNet/EfficientNet
Text / NLP               | Bag-of-Words                 | Transformers (BERT/GPT)
Anomalies                | Isolation Forest             | Simple Thresholds

Keep It Responsible

Models touch people. Check for bias. Monitor drift. Explain choices. A model that is fair and stable earns trust.

Trust is your real metric.

“Features beat fancy. Clean data wins.”

© 2025 Ali’s Machine Learning Series
Ali Reza Rashidi
Ali Reza Rashidi is a BI analyst with over nine years of experience. He is the author of three books that delve into the world of data and management.
