CAIIB ABM Statistics & Data Interpretation — Module A Complete Guide with Worked Examples

Last updated by BankersClub on June 29, 2026

Quick Answer: CAIIB ABM Module A (Statistics & Data Interpretation) covers 8 topics: measures of central tendency, dispersion, correlation, regression, probability, sampling, index numbers, and time series. It carries approximately 25–30 marks in ABM. The exam tests calculation and interpretation — not theory. You need to solve problems, not describe formulas. This guide covers every topic with worked examples in the exact format IIBF uses.

ABM Module A is the topic most CAIIB candidates dread and most consistently underestimate. Not because the statistics is hard — it isn’t. The concepts are standard Class 11–12 level. The problem is that most bankers haven’t touched a statistics problem since college, so the gap between “I’ve heard of standard deviation” and “I can calculate it in 90 seconds and answer three MCQs about it” feels enormous.

This guide closes that gap. Every topic is covered with the exact formula, a worked example in IIBF style, and the specific question patterns the exam repeats. Work through each section with a calculator. By the time you finish, Module A will be the part of ABM you’re most confident about.

Module A — Topic Map & Mark Weightage

Topic	Est. Questions	Type	Priority
Measures of Central Tendency	3–4	Calculation + interpretation	HIGH
Measures of Dispersion	3–4	Calculation (SD, variance, CV)	HIGH
Correlation	3–4	Calculation + interpretation of r	HIGH
Regression	3–4	Find equation, predict values	HIGH
Probability	3–4	P(A∪B), P(A|B), Bayes	HIGH
Index Numbers	3–4	Laspeyres, Paasche, Fisher	HIGH
Sampling Theory	2–3	Conceptual + standard error	MEDIUM
Time Series	2–3	Components, trend identification	MEDIUM

Topic 1 — Measures of Central Tendency

Central tendency measures describe the “centre” of a dataset. IIBF tests all three — mean, median, mode — with particular emphasis on weighted mean and when each measure is appropriate.

Key Formulas

Arithmetic Mean

X̄ = Σx / n

Sum of all values ÷ number of values

Weighted Mean

X̄w = Σ(w·x) / Σw

Each value multiplied by its weight

Median

Middle value (sorted data)

Odd n: middle. Even n: average of two middle values

Relationship

Mode = 3·Median − 2·Mean

Empirical relation: approximates the mode from the mean and median for moderately skewed distributions
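The weighted mean formula above has no worked example in this guide, so here is a minimal Python sketch. The interest rates and loan amounts are hypothetical figures chosen only for illustration; they are not taken from any IIBF paper.

```python
# Hypothetical data (illustrative only): interest rates (%) on three loan
# segments, weighted by the outstanding amount (₹ lakh) in each segment.
rates = [8.0, 10.0, 12.0]   # values x
amounts = [50, 30, 20]      # weights w


def weighted_mean(values, weights):
    """Return Σ(w·x) / Σw."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)


portfolio_rate = weighted_mean(rates, amounts)   # (400 + 300 + 240) / 100
simple_rate = sum(rates) / len(rates)            # ignores the weights

print(f"Weighted mean rate: {portfolio_rate:.2f}%")
print(f"Simple mean rate:   {simple_rate:.2f}%")
```

The weighted figure is pulled towards 8% because the largest share of the portfolio sits in that segment, which is the point of weighting.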

Worked Example — IIBF Style

Q: A bank officer’s loan sanctioned amounts (₹ lakh) over 6 months were: 12, 18, 15, 21, 18, 24. Find the mean, median, and mode.

Step 1 — Mean:
Σx = 12 + 18 + 15 + 21 + 18 + 24 = 108
n = 6
Mean = 108 / 6 = 18

Step 2 — Median (sort first):
Sorted: 12, 15, 18, 18, 21, 24
n = 6 (even) → Median = (3rd + 4th) / 2 = (18 + 18) / 2 = 18

Step 3 — Mode:
18 appears twice (most frequent) → Mode = 18

Answer: Mean = Median = Mode = 18 → Symmetric distribution
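To check the three answers by machine, the same data can be run through Python's standard `statistics` module. This is only a verification sketch; in the exam you would do the steps by hand with a calculator.

```python
import statistics

# Loan amounts (₹ lakh) from the worked example above.
loans = [12, 18, 15, 21, 18, 24]

mean = statistics.mean(loans)      # Σx / n
median = statistics.median(loans)  # average of the 3rd and 4th sorted values
mode = statistics.mode(loans)      # most frequent value

print("Mean:", mean)
print("Median:", median)
print("Mode:", mode)
```

All three come out equal to 18, matching the hand calculation.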

IIBF Question Pattern: “Which measure of central tendency is most affected by extreme values?” → Mean. “Which is best for skewed distributions?” → Median. “Which is used for nominal/categorical data?” → Mode. These conceptual questions appear in every exam — memorise the three answers.

Topic 2 — Measures of Dispersion

Dispersion measures how spread out data is around the mean. Standard Deviation (SD) and Coefficient of Variation (CV) are the two most tested — and both appear in calculations, not just definitions.

Key Formulas

Variance (σ²)

σ² = Σ(x − X̄)² / n

Average of squared deviations from mean (population form: divide by n, not n − 1)

Standard Deviation (σ)

σ = √[Σ(x − X̄)² / n]

Square root of variance — same units as data

Coefficient of Variation (CV)

CV = (σ / X̄) × 100

Relative dispersion — compares variability across datasets

Range

Range = Max value − Min value

Simplest measure — highly affected by extremes

Worked Example — SD and CV

Q: NPA rates (%) at 5 branches: 4, 6, 8, 10, 12. Calculate Standard Deviation and Coefficient of Variation.

Step 1 — Mean:
X̄ = (4 + 6 + 8 + 10 + 12) / 5 = 40 / 5 = 8

Step 2 — Deviations and squared deviations:

x	(x − X̄)	(x − X̄)²
4	−4	16
6	−2	4
8	0	0
10	+2	4
12	+4	16
Σ = 40	0	Σ = 40

Step 3 — Variance: σ² = 40 / 5 = 8
Step 4 — SD: σ = √8 = 2.83
Step 5 — CV: CV = (2.83 / 8) × 100 = 35.4%

Interpretation: The standard deviation is 35.4% of the mean, which indicates moderate dispersion in NPA rates across branches.
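The same five steps can be reproduced in Python as a cross-check. The sketch below uses the population functions `pvariance` and `pstdev`, which divide by n as the formulas in this section do; `statistics.stdev` divides by n − 1 and would give a different answer.

```python
import statistics

# NPA rates (%) at the five branches from the worked example.
npa_rates = [4, 6, 8, 10, 12]

mean = statistics.fmean(npa_rates)           # Step 1: 40 / 5
variance = statistics.pvariance(npa_rates)   # Step 3: Σ(x − X̄)² / n
sd = statistics.pstdev(npa_rates)            # Step 4: square root of variance
cv = sd / mean * 100                         # Step 5: relative dispersion in %

print(f"Mean = {mean:.2f}")
print(f"Variance = {variance:.2f}")
print(f"SD = {sd:.2f}")
print(f"CV = {cv:.1f}%")
```

Because the code carries the unrounded SD (about 2.8284) into the CV, the last decimal of a result can occasionally differ from a hand calculation that rounds at each step; here both round to 35.4%.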

Topic 3 — Correlation

Correlation measures the strength and direction of the linear relationship between two variables. Banking examples include the correlation between loan amount and NPA rate, or between interest rate and credit demand. The Pearson correlation coefficient (r) ranges from −1 to +1.

Key Formulas

Karl Pearson’s r

r = Σ(x−X̄)(y−Ȳ) / √[Σ(x−X̄)² · Σ(y−Ȳ)²]

Range: −1 ≤ r ≤ +1

Spearman’s Rank Correlation

rₛ = 1 − [6·Σd² / n(n²−1)]

d = difference in ranks; used for ordinal data

Interpretation of r

r = +1 → Perfect positive
r = −1 → Perfect negative
r = 0 → No correlation
|r| > 0.7 → Strong
0.3 ≤ |r| ≤ 0.7 → Moderate
|r| < 0.3 → Weak
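Karl Pearson's formula and the interpretation bands above can be sketched in Python as follows. The x and y values are hypothetical, chosen so the arithmetic is easy to follow by hand, and the treatment of the exact boundary values 0.3 and 0.7 as "moderate" is an assumption of this sketch.

```python
from math import sqrt


def pearson_r(xs, ys):
    """r = Σ(x−X̄)(y−Ȳ) / √[Σ(x−X̄)² · Σ(y−Ȳ)²]."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def strength(r):
    """Label |r| using the bands in this guide (boundaries count as moderate)."""
    size = abs(r)
    if size > 0.7:
        return "strong"
    if size >= 0.3:
        return "moderate"
    return "weak"


# Hypothetical paired observations (illustrative only).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r = pearson_r(x, y)   # Σ(x−X̄)(y−Ȳ) = 6, Σ(x−X̄)² = 10, Σ(y−Ȳ)² = 6
print(f"r = {r:.3f} ({strength(r)} {'positive' if r > 0 else 'negative'})")
```

Here r = 6 / √60, roughly 0.775, which falls in the strong positive band.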

Worked Example — Spearman’s Rank Correlation

Q: Two officers ranked 5 loan proposals. Find the rank correlation between their rankings.

Proposal	Officer A (Rank)	Officer B (Rank)	d = A−B	d²
P1	1	2	−1	1
P2	2	1	+1	1
P3	3	3	0	0
P4	4	5	−1	1
P5	5	4	+1	1
Σd²				4

rₛ = 1 − [6 × 4 / 5(25−1)] = 1 − [24 / 120] = 1 − 0.2 = 0.8

Interpretation: rₛ = 0.8 → Strong positive correlation. The two officers largely agree on how the loan proposals rank.
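The rank correlation calculation above can be checked with a few lines of Python. This sketch implements the Σd² shortcut formula exactly as given, which assumes there are no tied ranks.

```python
# Ranks given to proposals P1..P5 in the worked example.
rank_a = [1, 2, 3, 4, 5]   # Officer A
rank_b = [2, 1, 3, 5, 4]   # Officer B


def spearman_rs(ranks_1, ranks_2):
    """rₛ = 1 − 6·Σd² / [n(n² − 1)], valid when no ranks are tied."""
    n = len(ranks_1)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(ranks_1, ranks_2))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


rs = spearman_rs(rank_a, rank_b)
print(f"Spearman rank correlation = {rs:.2f}")
```

Identical rankings give rₛ = 1 and exactly reversed rankings give rₛ = −1, which is a quick sanity check on any answer.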

Topic 4 — Regression Analysis

Regression finds the equation of the line that best fits your data — so you can predict one variable from another. IIBF tests two things: finding the regression equation, and using it to estimate a value. Both are straightforward once you know the two-step process.

Regression of Y on X: Y = a + bX

Regression coefficient b (slope)

b = r × (σy / σx)

Also: b = [n·Σxy − Σx·Σy] / [n·Σx² − (Σx)²]

Intercept a

a = Ȳ − b·X̄

Solve for a after finding b

Key relationship

r² = byx × bxy

Product of both regression coefficients = r² (coefficient of determination)
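When a question gives raw data rather than r and the standard deviations, the second form of b is the one to use. The sketch below applies it to hypothetical data (illustrative only) and also confirms the key relationship r² = byx × bxy.

```python
# Hypothetical paired observations (illustrative only).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
sum_x, sum_y = sum(x), sum(y)
sum_xy = sum(a * b for a, b in zip(x, y))
sum_x2 = sum(a * a for a in x)
sum_y2 = sum(b * b for b in y)

# Slope of Y on X: [n·Σxy − Σx·Σy] / [n·Σx² − (Σx)²]
b_yx = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
# Slope of X on Y: same numerator, with the y sums in the denominator
b_xy = (n * sum_xy - sum_x * sum_y) / (n * sum_y2 - sum_y ** 2)

# Intercept of Y on X: a = Ȳ − b·X̄
a = sum_y / n - b_yx * (sum_x / n)

r_squared = b_yx * b_xy   # coefficient of determination

print(f"Y = {a:.2f} + {b_yx:.2f}X")
print(f"byx = {b_yx:.2f}, bxy = {b_xy:.2f}, r² = {r_squared:.2f}")
```

With these numbers Σx = 15, Σy = 20, Σxy = 66, Σx² = 55 and Σy² = 86, so byx = 30/50 and bxy = 30/30.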

Worked Example — Finding Regression Equation

Q: Given: r = 0.6, X̄ = 40, Ȳ = 50, σx = 5, σy = 8. Find the regression equation of Y on X. Estimate Y when X = 45.

Step 1 — Find b:
b = r × (σy / σx) = 0.6 × (8 / 5) = 0.6 × 1.6 = 0.96

Step 2 — Find a:
a = Ȳ − b·X̄ = 50 − (0.96 × 40) = 50 − 38.4 = 11.6

Step 3 — Regression equation:
Y = 11.6 + 0.96X

Step 4 — Estimate Y when X = 45:
Y = 11.6 + 0.96 × 45 = 11.6 + 43.2 = 54.8

Answer: Regression equation: Y = 11.6 + 0.96X. When X = 45, estimated Y = 54.8
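The two-step process in this example translates directly into code. The following Python sketch reuses the given summary values; the function name `predict_y` is just a label chosen for this illustration.

```python
# Summary statistics given in the question.
r = 0.6
mean_x, mean_y = 40, 50
sd_x, sd_y = 5, 8

# Step 1: slope b = r × (σy / σx)
b = r * sd_y / sd_x
# Step 2: intercept a = Ȳ − b·X̄
a = mean_y - b * mean_x


def predict_y(x):
    """Estimate Y from the regression line of Y on X."""
    return a + b * x


print(f"Y = {a:.1f} + {b:.2f}X")
print(f"Estimated Y at X = 45: {predict_y(45):.1f}")
```

A useful check: the line always passes through (X̄, Ȳ), so `predict_y(40)` should return 50.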

Topic 5 — Probability

IIBF tests three levels of probability: basic definitions, addition/multiplication rules, and Bayes’ theorem. Banking context questions are common — “probability that a loan becomes NPA given certain conditions.”

Key Formulas

Addition Rule

P(A∪B) = P(A) + P(B) − P(A∩B)

If mutually exclusive: P(A∪B) = P(A) + P(B)

Multiplication Rule

P(A∩B) = P(A) × P(B|A)

If independent: P(A∩B) = P(A) × P(B)

Conditional Probability

P(A|B) = P(A∩B) / P(B)

Probability of A given B has occurred

Bayes’ Theorem

P(A|B) = P(B|A)·P(A) / P(B)

Revises probability in light of new evidence
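Before the Bayes example, here is a short Python sketch of the addition rule and conditional probability. The customer shares are hypothetical and used only to show how the formulas fit together.

```python
# Hypothetical shares of a branch's customers (illustrative only).
p_home = 0.30   # P(A): customer has a home loan
p_car = 0.20    # P(B): customer has a car loan
p_both = 0.05   # P(A∩B): customer has both

# Addition rule: P(A∪B) = P(A) + P(B) − P(A∩B)
p_either = p_home + p_car - p_both

# Conditional probability: P(B|A) = P(A∩B) / P(A)
p_car_given_home = p_both / p_home

# Independence would require P(A∩B) = P(A) × P(B)
independent = abs(p_both - p_home * p_car) < 1e-12

print(f"P(home or car)  = {p_either:.2f}")
print(f"P(car | home)   = {p_car_given_home:.3f}")
print(f"Independent?      {independent}")
```

Here P(A) × P(B) = 0.06 while P(A∩B) = 0.05, so the two events are not independent and the simplified multiplication rule would not apply.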

Worked Example — Conditional Probability (Banking Context)

Q: In a branch, 40% of loans are to MSME borrowers. Of all MSME loans, 15% become NPA. Of all non-MSME loans, 5% become NPA. A randomly selected NPA loan is reviewed. What is the probability it is an MSME loan?

Given:
P(MSME) = 0.40, P(Non-MSME) = 0.60
P(NPA | MSME) = 0.15, P(NPA | Non-MSME) = 0.05

Step 1 — P(NPA):
P(NPA) = P(NPA|MSME)·P(MSME) + P(NPA|Non-MSME)·P(Non-MSME)
= (0.15 × 0.40) + (0.05 × 0.60) = 0.06 + 0.03 = 0.09

Step 2 — Bayes’ theorem:
P(MSME | NPA) = P(NPA | MSME) × P(MSME) / P(NPA)
= (0.15 × 0.40) / 0.09 = 0.06 / 0.09 = 0.667 = 66.7%

Answer: Given that the loan is NPA, there is a 66.7% probability it was an MSME loan.
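The two steps of this Bayes calculation can be laid out in Python as below. The dictionary layout is simply one convenient way to organise the figures from the question.

```python
# Figures from the question.
prior = {"MSME": 0.40, "Non-MSME": 0.60}          # P(segment)
npa_given = {"MSME": 0.15, "Non-MSME": 0.05}      # P(NPA | segment)

# Step 1: total probability of NPA across both segments.
p_npa = sum(prior[s] * npa_given[s] for s in prior)

# Step 2: Bayes' theorem for each segment.
posterior = {s: prior[s] * npa_given[s] / p_npa for s in prior}

print(f"P(NPA) = {p_npa:.2f}")
for segment, p in posterior.items():
    print(f"P({segment} | NPA) = {p:.3f}")
```

The posterior probabilities for the two segments must add up to 1, which is an easy way to catch arithmetic slips.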

Topic 6 — Index Numbers

Index numbers measure the change in a variable (price, quantity, value) over time relative to a base period. IIBF consistently tests three types: Laspeyres (base-year quantity weights), Paasche (current-year quantity weights), and Fisher’s Ideal (geometric mean of both).

Key Formulas

Laspeyres Price Index

L = [Σ(P₁·Q₀) / Σ(P₀·Q₀)] × 100

Base-year quantities (Q₀) as weights

Paasche Price Index

P = [Σ(P₁·Q₁) / Σ(P₀·Q₁)] × 100

Current-year quantities (Q₁) as weights

Fisher’s Ideal Index

F = √(L × P)

Geometric mean of Laspeyres and Paasche

Worked Example — All Three Index Numbers

Q: Calculate Laspeyres, Paasche, and Fisher’s index from the data below.

Commodity	P₀	P₁	Q₀	Q₁	P₁Q₀	P₀Q₀	P₁Q₁	P₀Q₁
A	5	8	10	12	80	50	96	60
B	4	6	15	10	90	60	60	40
Σ					170	110	156	100

Laspeyres: L = (170 / 110) × 100 = 154.5
Paasche: P = (156 / 100) × 100 = 156.0
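Fisher’s: F = √(154.5 × 156.0) = √24,102 = 155.2 (carrying the unrounded Laspeyres value 154.545 gives 155.3, so check how the answer options are rounded)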

Prices rose approximately 54–56% from the base year to the current year across all three methods.
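The table calculations can be verified with a short Python sketch. The tuple layout (P₀, P₁, Q₀, Q₁) is a choice made for this illustration. The code keeps full precision throughout, so its Fisher value is about 155.27, slightly above a hand calculation that first rounds Laspeyres to 154.5.

```python
from math import sqrt

# Data from the table: commodity -> (P0, P1, Q0, Q1)
data = {
    "A": (5, 8, 10, 12),
    "B": (4, 6, 15, 10),
}

sum_p1q0 = sum(p1 * q0 for p0, p1, q0, q1 in data.values())
sum_p0q0 = sum(p0 * q0 for p0, p1, q0, q1 in data.values())
sum_p1q1 = sum(p1 * q1 for p0, p1, q0, q1 in data.values())
sum_p0q1 = sum(p0 * q1 for p0, p1, q0, q1 in data.values())

laspeyres = sum_p1q0 / sum_p0q0 * 100   # base-year quantity weights
paasche = sum_p1q1 / sum_p0q1 * 100     # current-year quantity weights
fisher = sqrt(laspeyres * paasche)      # geometric mean of the two

print(f"Laspeyres = {laspeyres:.1f}")
print(f"Paasche   = {paasche:.1f}")
print(f"Fisher    = {fisher:.2f}")
```

Fisher's index always lies between the Laspeyres and Paasche values, which gives a quick plausibility check.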

Topic 7 — Sampling Theory

Types of Sampling — Know All 5

Simple Random: Every unit has an equal probability of being selected
Stratified: Population divided into strata; sample from each
Systematic: Every kth unit selected
Cluster: Groups (clusters) randomly selected
Purposive/Judgement: Non-random; researcher’s discretion

Key Formulas

Standard Error of Mean:

SE = σ / √n

Confidence Interval (95%):

X̄ ± 1.96 × SE
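A small numerical illustration of the two formulas, written as a Python sketch. The population standard deviation, sample size and sample mean below are hypothetical values picked to keep the arithmetic simple.

```python
from math import sqrt

# Hypothetical inputs (illustrative only).
sigma = 20.0          # population standard deviation of deposit balances (₹ thousand)
n = 100               # sample size
sample_mean = 150.0   # sample mean (₹ thousand)

se = sigma / sqrt(n)          # standard error of the mean
margin = 1.96 * se            # half-width of the 95% interval
lower = sample_mean - margin
upper = sample_mean + margin

print(f"SE = {se:.2f}")
print(f"95% CI = ({lower:.2f}, {upper:.2f})")
```

Because n sits under a square root, quadrupling the sample size only halves the standard error, a point that is often tested conceptually.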

Topic 8 — Time Series Analysis

4 Components of Time Series

Trend (T): Long-term movement (upward/downward)
Seasonal (S): Regular pattern within a year
Cyclical (C): Long-wave fluctuations over years
Irregular (I): Random, unpredictable variation

Methods to Isolate Trend

Moving Average: Average over n periods, slides forward
Least Squares: Regression of Y on time (T) — same as regression above
Free-hand curve: Visual smoothing (rarely tested numerically)

Additive model: Y = T + S + C + I

Multiplicative model: Y = T × S × C × I
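The moving average method listed above can be sketched in Python as follows. The series is hypothetical (illustrative only), and a 3-period window is assumed; exam questions may specify a different period.

```python
def moving_average(series, window):
    """Return the simple moving averages of `series` over `window` periods."""
    return [
        sum(series[i:i + window]) / window
        for i in range(len(series) - window + 1)
    ]


# Hypothetical quarterly advances (₹ crore), illustrative only.
advances = [10, 12, 14, 13, 15, 17, 16]

trend = moving_average(advances, 3)
print(trend)
```

Each average is conventionally placed against the middle period of its window, so a 3-period moving average yields no trend value for the first and last observations.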

Complete Formula Quick Reference

Topic	Formula	When to Use
Mean	X̄ = Σx / n	Average of numeric data
Weighted Mean	X̄w = Σ(wx) / Σw	When values have different importance
Mode (empirical)	Mode = 3Median − 2Mean	When mode is unknown but mean & median are given
Standard Deviation	σ = √[Σ(x−X̄)²/n]	Spread of data around mean
Coefficient of Variation	CV = (σ/X̄) × 100	Compare variability across different datasets
Pearson’s r	r = Σ(x−X̄)(y−Ȳ) / (n·σx·σy)	Correlation between two continuous variables
Spearman’s rₛ	rₛ = 1 − 6Σd²/n(n²−1)	Rank data or ordinal variables
Regression b	b = r × (σy/σx)	Slope of regression line Y on X
Regression a	a = Ȳ − b·X̄	Intercept of regression line
Addition Rule	P(A∪B) = P(A)+P(B)−P(A∩B)	“OR” probability
Conditional Probability	P(A|B) = P(A∩B) / P(B)	Probability given some condition
Bayes’ Theorem	P(A|B) = P(B|A)·P(A) / P(B)	Reverse conditional probability
Laspeyres Index	L = Σ(P₁Q₀)/Σ(P₀Q₀) × 100	Base-year weighted price index
Paasche Index	P = Σ(P₁Q₁)/Σ(P₀Q₁) × 100	Current-year weighted price index
Fisher’s Index	F = √(L × P)	Ideal index — geometric mean of L and P
Standard Error	SE = σ / √n	Sampling variability of the mean

Frequently Asked Questions — ABM Statistics

How many statistics questions appear in the actual CAIIB ABM exam?

Module A (Statistics) typically contributes 25–30 questions out of 100 in ABM. The exact split varies each cycle, but IIBF consistently weights statistics heavily because it is an objective differentiator between candidates — you either can calculate it or you can’t. Missing Module A completely makes it very difficult to reach 45/100.

Is a calculator allowed in the CAIIB ABM exam?

Yes. IIBF allows a basic scientific calculator in the CBT exam, and an on-screen calculator is provided. You are expected to know the formulas and the sequence of steps; the calculator handles the arithmetic. Practise the workflow in order: mean first → deviations → squared deviations → SD. Don’t just memorise the order; understand why each step comes where it does.

Which statistics topics are most frequently repeated in CAIIB ABM?

Based on patterns across recent exam cycles: (1) Standard Deviation and CV calculation — appears every cycle. (2) Regression equation — given r, σx, σy, find Y = a + bX, then estimate Y for a given X. (3) Spearman’s rank correlation. (4) Index numbers — Laspeyres, Paasche, and Fisher. (5) Conditional probability and Bayes. These five appear with high regularity. Master them and you’re covering 20–22 of the ~25–30 Module A questions.

How much time should I spend on Module A during preparation?

10–12 days out of your ABM preparation time — but those days must be calculation-heavy. Read each topic once to understand the formula and its derivation, then immediately start solving problems. Aim for 50–70 practice problems across all Module A topics before your exam. For SD, correlation, and regression specifically, solve 8–10 examples each — repetition is how these become automatic.