CAIIB ABM Statistics & Data Interpretation — Module A Complete Guide with Worked Examples
Quick Answer: CAIIB ABM Module A (Statistics & Data Interpretation) covers 8 topics: measures of central tendency, dispersion, correlation, regression, probability, sampling, index numbers, and time series. It carries approximately 25–30 marks in ABM. The exam tests calculation and interpretation — not theory. You need to solve problems, not describe formulas. This guide covers every topic with worked examples in the exact format IIBF uses.
ABM Module A is the topic most CAIIB candidates dread and most consistently underestimate. Not because the statistics is hard — it isn’t. The concepts are standard Class 11–12 level. The problem is that most bankers haven’t touched a statistics problem since college, so the gap between “I’ve heard of standard deviation” and “I can calculate it in 90 seconds and answer three MCQs about it” feels enormous.
This guide closes that gap. Every topic is covered with the exact formula, a worked example in IIBF style, and the specific question patterns the exam repeats. Work through each section with a calculator. By the time you finish, Module A will be the part of ABM you’re most confident about.
Module A — Topic Map & Mark Weightage
| Topic | Est. Questions | Type | Priority |
|---|---|---|---|
| Measures of Central Tendency | 3–4 | Calculation + interpretation | HIGH |
| Measures of Dispersion | 3–4 | Calculation (SD, variance, CV) | HIGH |
| Correlation | 3–4 | Calculation + interpretation of r | HIGH |
| Regression | 3–4 | Find equation, predict values | HIGH |
| Probability | 3–4 | P(A∪B), P(A\|B), Bayes | HIGH |
| Index Numbers | 3–4 | Laspeyres, Paasche, Fisher | HIGH |
| Sampling Theory | 2–3 | Conceptual + standard error | MEDIUM |
| Time Series | 2–3 | Components, trend identification | MEDIUM |
Topic 1 — Measures of Central Tendency
Central tendency measures describe the “centre” of a dataset. IIBF tests all three — mean, median, mode — with particular emphasis on weighted mean and when each measure is appropriate.
Key Formulas
Arithmetic Mean
X̄ = Σx / n
Sum of all values ÷ number of values
Weighted Mean
X̄w = Σ(w·x) / Σw
Each value multiplied by its weight
Median
Middle value (sorted data)
Odd n: middle. Even n: average of two middle values
Relationship
Mode = 3·Median − 2·Mean
Empirical relation (approximate, for moderately skewed distributions), used to find the mode when mean and median are given
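The weighted mean and the empirical mode relation have no worked example in this section, so here is a minimal Python sketch of both. The loan-book figures and the mean/median pair are hypothetical, chosen only to illustrate the formulas above.

```python
def weighted_mean(values, weights):
    """Return sum(w * x) / sum(w)."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * x for x, w in zip(values, weights)) / total_weight


def empirical_mode(mean, median):
    """Approximate the mode using Mode = 3*Median - 2*Mean."""
    return 3 * median - 2 * mean


# Hypothetical data: interest rates (%) on three loan books,
# weighted by outstanding amount (Rs crore).
rates = [8.0, 10.0, 12.0]
amounts = [50, 30, 20]
print(round(weighted_mean(rates, amounts), 2))  # (400 + 300 + 240) / 100

# Hypothetical summary figures: mean = 20, median = 22.
print(empirical_mode(mean=20, median=22))  # 3*22 - 2*20
```

Note that the simple (unweighted) mean of the three rates would be 10; the weighted mean is lower because the largest loan book carries the lowest rate.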
Worked Example — IIBF Style
Q: A bank officer’s loan sanctioned amounts (₹ lakh) over 6 months were: 12, 18, 15, 21, 18, 24. Find the mean, median, and mode.
Step 1 — Mean:
Σx = 12 + 18 + 15 + 21 + 18 + 24 = 108
n = 6
Mean = 108 / 6 = 18
Step 2 — Median (sort first):
Sorted: 12, 15, 18, 18, 21, 24
n = 6 (even) → Median = (3rd + 4th) / 2 = (18 + 18) / 2 = 18
Step 3 — Mode:
18 appears twice (most frequent) → Mode = 18
Answer: Mean = Median = Mode = 18 → Symmetric distribution
IIBF Question Pattern: “Which measure of central tendency is most affected by extreme values?” → Mean. “Which is best for skewed distributions?” → Median. “Which is used for nominal/categorical data?” → Mode. These conceptual questions appear in every exam — memorise the three answers.
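To check the worked example above, the three measures can be reproduced with Python's standard `statistics` module. This is only a verification sketch; in the exam you would do the same steps by hand or on the calculator.

```python
import statistics

# Loan sanctioned amounts (Rs lakh) from the worked example
loans = [12, 18, 15, 21, 18, 24]

mean = statistics.mean(loans)      # sum of values / number of values
median = statistics.median(loans)  # averages the two middle values when n is even
mode = statistics.mode(loans)      # most frequent value

print(mean, median, mode)
```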
Topic 2 — Measures of Dispersion
Dispersion measures how spread out data is around the mean. Standard Deviation (SD) and Coefficient of Variation (CV) are the two most tested — and both appear in calculations, not just definitions.
Key Formulas
Variance (σ²)
σ² = Σ(x − X̄)² / n
Average of squared deviations from mean
Standard Deviation (σ)
σ = √[Σ(x − X̄)² / n]
Square root of variance — same units as data
Coefficient of Variation (CV)
CV = (σ / X̄) × 100
Relative dispersion — compares variability across datasets
Range
Range = Max value − Min value
Simplest measure — highly affected by extremes
Worked Example — SD and CV
Q: NPA rates (%) at 5 branches: 4, 6, 8, 10, 12. Calculate Standard Deviation and Coefficient of Variation.
Step 1 — Mean:
X̄ = (4 + 6 + 8 + 10 + 12) / 5 = 40 / 5 = 8
Step 2 — Deviations and squared deviations:
| x | (x − X̄) | (x − X̄)² |
|---|---|---|
| 4 | −4 | 16 |
| 6 | −2 | 4 |
| 8 | 0 | 0 |
| 10 | +2 | 4 |
| 12 | +4 | 16 |
| Σ = 40 | 0 | Σ = 40 |
Step 3 — Variance: σ² = Σ(x − X̄)² / n = 40 / 5 = 8
Step 4 — SD: σ = √8 = 2.83
Step 5 — CV: CV = (2.83 / 8) × 100 = 35.4%
Interpretation: The standard deviation is 35.4% of the mean, which indicates moderate dispersion in NPA rates across branches.
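The same SD and CV calculation can be sketched in Python. The guide's formulas divide by n (population formulas), so the sketch uses `statistics.pvariance` and `statistics.pstdev`; `statistics.stdev` divides by n − 1 and would give a different answer.

```python
import statistics

# NPA rates (%) at 5 branches from the worked example
npa_rates = [4, 6, 8, 10, 12]

mean = statistics.mean(npa_rates)
variance = statistics.pvariance(npa_rates)  # sum of squared deviations / n
sd = statistics.pstdev(npa_rates)           # square root of the variance
cv = sd / mean * 100                        # SD as a percentage of the mean

print(mean, variance, round(sd, 2), round(cv, 1))
```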
Topic 3 — Correlation
Correlation measures the strength and direction of the linear relationship between two variables. In banking context: correlation between loan amount and NPA rate, between interest rate and credit demand, etc. The Pearson correlation coefficient (r) ranges from −1 to +1.
Key Formulas
Karl Pearson’s r
r = Σ(x−X̄)(y−Ȳ) / √[Σ(x−X̄)² · Σ(y−Ȳ)²]
Range: −1 ≤ r ≤ +1
Spearman’s Rank Correlation
rₛ = 1 − [6·Σd² / n(n²−1)]
d = difference in ranks; used for ordinal data
Interpretation of r
r = +1 → Perfect positive
r = −1 → Perfect negative
r = 0 → No correlation
|r| ≥ 0.7 → Strong
0.3 ≤ |r| < 0.7 → Moderate
|r| < 0.3 → Weak
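This section has no numeric example for Karl Pearson's r, so the following is a minimal sketch of the deviation formula given above. The data are hypothetical and the function name `pearson_r` is just an illustrative label.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's r = sum(dx*dy) / sqrt(sum(dx^2) * sum(dy^2))."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    numerator = sum(a * b for a, b in zip(dx, dy))
    denominator = sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    return numerator / denominator


# Hypothetical data: x and y could be any two paired variables.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = pearson_r(x, y)
print(round(r, 3))  # 6 / sqrt(10 * 6)
```

Here the sum of cross-deviations is 6, the sums of squared deviations are 10 and 6, so r = 6 / √60, which is about 0.775 and falls in the "strong" band.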
Worked Example — Spearman’s Rank Correlation
Q: Two officers ranked 5 loan proposals. Find the rank correlation between their rankings.
| Proposal | Officer A (Rank) | Officer B (Rank) | d = A−B | d² |
|---|---|---|---|---|
| P1 | 1 | 2 | −1 | 1 |
| P2 | 2 | 1 | +1 | 1 |
| P3 | 3 | 3 | 0 | 0 |
| P4 | 4 | 5 | −1 | 1 |
| P5 | 5 | 4 | +1 | 1 |
| Σ | | | | Σd² = 4 |
Calculation: rₛ = 1 − [6 × 4 / (5 × (5² − 1))] = 1 − 24/120 = 1 − 0.2 = 0.8
Interpretation: rₛ = 0.8 → Strong positive correlation. The two officers largely agree on their ranking of the loan proposals.
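The rank correlation for the two officers can be verified with a short Python sketch of the formula rₛ = 1 − 6Σd² / n(n² − 1). It assumes there are no tied ranks, which is the case in this example.

```python
def spearman_rank(ranks_a, ranks_b):
    """Spearman's rank correlation, assuming no tied ranks."""
    if len(ranks_a) != len(ranks_b):
        raise ValueError("both rankings must have the same length")
    n = len(ranks_a)
    sum_d_squared = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - 6 * sum_d_squared / (n * (n ** 2 - 1))


# Ranks given to proposals P1..P5 in the worked example
officer_a = [1, 2, 3, 4, 5]
officer_b = [2, 1, 3, 5, 4]

r_s = spearman_rank(officer_a, officer_b)
print(round(r_s, 2))  # 1 - 24/120
```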
Topic 4 — Regression Analysis
Regression finds the equation of the line that best fits your data — so you can predict one variable from another. IIBF tests two things: finding the regression equation, and using it to estimate a value. Both are straightforward once you know the two-step process.
Regression of Y on X: Y = a + bX
Regression coefficient b (slope)
b = r × (σy / σx)
Also: b = [n·Σxy − Σx·Σy] / [n·Σx² − (Σx)²]
Intercept a
a = Ȳ − b·X̄
Solve for a after finding b
Key relationship
r² = byx × bxy
Product of both regression coefficients = r² (coefficient of determination)
Worked Example — Finding Regression Equation
Q: Given: r = 0.6, X̄ = 40, Ȳ = 50, σx = 5, σy = 8. Find the regression equation of Y on X. Estimate Y when X = 45.
Step 1 — Find b:
b = r × (σy / σx) = 0.6 × (8 / 5) = 0.6 × 1.6 = 0.96
Step 2 — Find a:
a = Ȳ − b·X̄ = 50 − (0.96 × 40) = 50 − 38.4 = 11.6
Step 3 — Regression equation:
Y = 11.6 + 0.96X
Step 4 — Estimate Y when X = 45:
Y = 11.6 + 0.96 × 45 = 11.6 + 43.2 = 54.8
Answer: Regression equation: Y = 11.6 + 0.96X. When X = 45, estimated Y = 54.8
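The two-step process (slope first, then intercept) can be sketched in Python as follows. The function name `regression_y_on_x` is an illustrative label, not something from the IIBF syllabus.

```python
def regression_y_on_x(r, mean_x, mean_y, sd_x, sd_y):
    """Return (a, b) for the line Y = a + bX from summary statistics."""
    b = r * sd_y / sd_x       # slope: b = r * (sigma_y / sigma_x)
    a = mean_y - b * mean_x   # intercept: a = Y-bar - b * X-bar
    return a, b


# Summary figures from the worked example
a, b = regression_y_on_x(r=0.6, mean_x=40, mean_y=50, sd_x=5, sd_y=8)
y_estimate = a + b * 45  # estimate Y when X = 45

print(round(a, 2), round(b, 2), round(y_estimate, 2))
```

A quick sanity check that works in the exam hall too: the regression line always passes through (X̄, Ȳ), so substituting X = 40 should return Y = 50.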
Topic 5 — Probability
IIBF tests three levels of probability: basic definitions, addition/multiplication rules, and Bayes’ theorem. Banking context questions are common — “probability that a loan becomes NPA given certain conditions.”
Key Formulas
Addition Rule
P(A∪B) = P(A) + P(B) − P(A∩B)
If mutually exclusive: P(A∪B) = P(A) + P(B)
Multiplication Rule
P(A∩B) = P(A) × P(B|A)
If independent: P(A∩B) = P(A) × P(B)
Conditional Probability
P(A|B) = P(A∩B) / P(B)
Probability of A given B has occurred
Bayes’ Theorem
P(A|B) = P(B|A)·P(A) / P(B)
Revises probability in light of new evidence
Worked Example — Conditional Probability (Banking Context)
Q: In a branch, 40% of loans are to MSME borrowers. Of all MSME loans, 15% become NPA. Of all non-MSME loans, 5% become NPA. A randomly selected NPA loan is reviewed. What is the probability it is an MSME loan?
P(MSME) = 0.40, P(Non-MSME) = 0.60
P(NPA | MSME) = 0.15, P(NPA | Non-MSME) = 0.05
Step 1 — P(NPA):
P(NPA) = P(NPA|MSME)·P(MSME) + P(NPA|Non-MSME)·P(Non-MSME)
= (0.15 × 0.40) + (0.05 × 0.60) = 0.06 + 0.03 = 0.09
Step 2 — Bayes’ theorem:
P(MSME | NPA) = P(NPA | MSME) × P(MSME) / P(NPA)
= (0.15 × 0.40) / 0.09 = 0.06 / 0.09 = 0.667 = 66.7%
Answer: Given that the loan is NPA, there is a 66.7% probability it was an MSME loan.
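The Bayes calculation above can be written as a few lines of Python, which makes the two steps (total probability, then the posterior) explicit. Variable names are illustrative.

```python
# Figures from the worked example
p_msme = 0.40
p_non_msme = 1 - p_msme
p_npa_given_msme = 0.15
p_npa_given_non_msme = 0.05

# Step 1: total probability of an NPA across both borrower types
p_npa = p_npa_given_msme * p_msme + p_npa_given_non_msme * p_non_msme

# Step 2: Bayes' theorem, P(MSME | NPA)
p_msme_given_npa = p_npa_given_msme * p_msme / p_npa

print(round(p_npa, 4), round(p_msme_given_npa, 3))
```

Although MSME loans are only 40% of the book, they account for two-thirds of NPAs because their NPA rate is three times that of non-MSME loans.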
Topic 6 — Index Numbers
Index numbers measure the change in a variable (price, quantity, value) over time relative to a base period. IIBF consistently tests three types: Laspeyres (base-year quantity weights), Paasche (current-year quantity weights), and Fisher’s Ideal (geometric mean of both).
Key Formulas
Laspeyres Price Index
L = [Σ(P₁·Q₀) / Σ(P₀·Q₀)] × 100
Base-year quantities (Q₀) as weights
Paasche Price Index
P = [Σ(P₁·Q₁) / Σ(P₀·Q₁)] × 100
Current-year quantities (Q₁) as weights
Fisher’s Ideal Index
F = √(L × P)
Geometric mean of Laspeyres and Paasche
Worked Example — All Three Index Numbers
Q: Calculate Laspeyres, Paasche, and Fisher’s index from the data below.
| Commodity | P₀ | P₁ | Q₀ | Q₁ | P₁Q₀ | P₀Q₀ | P₁Q₁ | P₀Q₁ |
|---|---|---|---|---|---|---|---|---|
| A | 5 | 8 | 10 | 12 | 80 | 50 | 96 | 60 |
| B | 4 | 6 | 15 | 10 | 90 | 60 | 60 | 40 |
| Σ | | | | | 170 | 110 | 156 | 100 |
Laspeyres: L = (170 / 110) × 100 = 154.5
Paasche: P = (156 / 100) × 100 = 156.0
Fisher’s: F = √(154.5 × 156.0) = √24,102 = 155.2
Prices rose approximately 54–56% from the base year to the current year across all three methods.
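All three indices can be reproduced with a short Python sketch using the commodity data from the table. The helper name `price_index` is illustrative; it takes whichever quantity series is used as the weights.

```python
from math import sqrt


def price_index(p_current, p_base, quantities):
    """Weighted aggregate price index: sum(P1*Q) / sum(P0*Q) * 100."""
    numerator = sum(p * q for p, q in zip(p_current, quantities))
    denominator = sum(p * q for p, q in zip(p_base, quantities))
    return numerator / denominator * 100


# Commodities A and B from the worked example
p0 = [5, 4]    # base-year prices
p1 = [8, 6]    # current-year prices
q0 = [10, 15]  # base-year quantities
q1 = [12, 10]  # current-year quantities

laspeyres = price_index(p1, p0, q0)  # base-year quantities as weights
paasche = price_index(p1, p0, q1)    # current-year quantities as weights
fisher = sqrt(laspeyres * paasche)   # geometric mean of the two

print(round(laspeyres, 2), round(paasche, 2), round(fisher, 2))
```

Without intermediate rounding, Fisher's index comes out at about 155.27. The hand calculation gives 155.2 because Laspeyres was rounded to 154.5 before taking the square root; in an MCQ, pick the option closest to your value.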
Topic 7 — Sampling Theory
Types of Sampling — Know All 5
- Simple Random: Every unit has equal probability
- Stratified: Population divided into strata; sample from each
- Systematic: Every kth unit selected
- Cluster: Groups (clusters) randomly selected
- Purposive/Judgement: Non-random; researcher’s discretion
Key Formulas
Standard Error of Mean:
SE = σ / √n
Confidence Interval (95%):
X̄ ± 1.96 × SE
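This topic has no worked example, so here is a minimal Python sketch of the standard error and the 95% confidence interval. The sample figures (mean 50, σ = 10, n = 100) are hypothetical, and the sketch assumes the population σ is known, as the formula above does.

```python
from math import sqrt


def standard_error(sigma, n):
    """Standard error of the mean: sigma / sqrt(n)."""
    if n <= 0:
        raise ValueError("sample size must be positive")
    return sigma / sqrt(n)


def confidence_interval_95(sample_mean, sigma, n):
    """Return (lower, upper) limits: mean +/- 1.96 * SE."""
    margin = 1.96 * standard_error(sigma, n)
    return sample_mean - margin, sample_mean + margin


# Hypothetical sample: mean = 50, sigma = 10, n = 100
se = standard_error(10, 100)
lower, upper = confidence_interval_95(50, 10, 100)

print(se, round(lower, 2), round(upper, 2))
```

Because n sits under a square root, quadrupling the sample size only halves the standard error.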
Topic 8 — Time Series Analysis
4 Components of Time Series
- Trend (T): Long-term movement (upward/downward)
- Seasonal (S): Regular pattern within a year
- Cyclical (C): Long-wave fluctuations over years
- Irregular (I): Random, unpredictable variation
Methods to Isolate Trend
- Moving Average: Average over n periods, slides forward
- Least Squares: Regression of Y on time (T) — same as regression above
- Free-hand curve: Visual smoothing (rarely tested numerically)
Additive model: Y = T + S + C + I
Multiplicative model: Y = T × S × C × I
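The moving average method listed above can be sketched in Python as follows. The deposit series is hypothetical and the 3-period window is only an example choice.

```python
def moving_average(series, window):
    """Simple moving average: mean of each run of `window` consecutive values."""
    if window < 1 or window > len(series):
        raise ValueError("window must be between 1 and len(series)")
    return [
        sum(series[i:i + window]) / window
        for i in range(len(series) - window + 1)
    ]


# Hypothetical data: deposits (Rs crore) over five periods
deposits = [10, 12, 14, 16, 18]
trend = moving_average(deposits, 3)

print(trend)
```

A series of n values yields n − window + 1 averages, so a 3-period moving average of five values gives three trend points, each aligned with the middle period of its window.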
Complete Formula Quick Reference
| Topic | Formula | When to Use |
|---|---|---|
| Mean | X̄ = Σx / n | Average of numeric data |
| Weighted Mean | X̄w = Σ(wx) / Σw | When values have different importance |
| Mode (empirical) | Mode = 3Median − 2Mean | When mode is unknown but mean & median are given |
| Standard Deviation | σ = √[Σ(x−X̄)²/n] | Spread of data around mean |
| Coefficient of Variation | CV = (σ/X̄) × 100 | Compare variability across different datasets |
| Pearson’s r | r = Σ(x−X̄)(y−Ȳ) / (n·σx·σy) | Correlation between two continuous variables |
| Spearman’s rₛ | rₛ = 1 − 6Σd²/n(n²−1) | Rank data or ordinal variables |
| Regression b | b = r × (σy/σx) | Slope of regression line Y on X |
| Regression a | a = Ȳ − b·X̄ | Intercept of regression line |
| Addition Rule | P(A∪B) = P(A)+P(B)−P(A∩B) | “OR” probability |
| Conditional Probability | P(A\|B) = P(A∩B) / P(B) | Probability given some condition |
| Bayes’ Theorem | P(A\|B) = P(B\|A)·P(A) / P(B) | Reverse conditional probability |
| Laspeyres Index | L = Σ(P₁Q₀)/Σ(P₀Q₀) × 100 | Base-year weighted price index |
| Paasche Index | P = Σ(P₁Q₁)/Σ(P₀Q₁) × 100 | Current-year weighted price index |
| Fisher’s Index | F = √(L × P) | Ideal index — geometric mean of L and P |
| Standard Error | SE = σ / √n | Sampling variability of the mean |
Frequently Asked Questions — ABM Statistics
How many statistics questions appear in the actual CAIIB ABM exam?
Module A (Statistics) typically contributes 25–30 questions out of 100 in ABM. The exact split varies each cycle, but IIBF consistently weights statistics heavily because it is an objective differentiator between candidates — you either can calculate it or you can’t. Missing Module A completely makes it very difficult to reach 45/100.
Is a calculator allowed in the CAIIB ABM exam?
Yes. IIBF allows a basic scientific calculator in the CBT exam, and an on-screen calculator is also provided. You are expected to know the formulas and the sequence of steps; the calculator handles the arithmetic. Practise the workflow in order: mean first → deviations → squared deviations → variance → SD. Don’t just memorise the steps; understand why each one comes where it does.
Which statistics topics are most frequently repeated in CAIIB ABM?
Based on patterns across recent exam cycles: (1) Standard Deviation and CV calculation — appears every cycle. (2) Regression equation — given r, σx, σy, find Y = a + bX, then estimate Y for a given X. (3) Spearman’s rank correlation. (4) Index numbers — Laspeyres, Paasche, and Fisher. (5) Conditional probability and Bayes. These five appear with high regularity. Master them and you’re covering 20–22 of the ~25–30 Module A questions.
How much time should I spend on Module A during preparation?
10–12 days out of your ABM preparation time — but those days must be calculation-heavy. Read each topic once to understand the formula and its derivation, then immediately start solving problems. Aim for 50–70 practice problems across all Module A topics before your exam. For SD, correlation, and regression specifically, solve 8–10 examples each — repetition is how these become automatic.