Determine Median, Quartiles, Percentiles, and Interquartile Range

Introduction

Understanding measures of central tendency and dispersion is fundamental in statistics, particularly for analyzing data distributions. This article explores how to determine the median, quartiles, percentiles, and interquartile range, essential concepts in the Cambridge IGCSE Mathematics curriculum (US - 0444 - Advanced). Mastery of these topics enables students to interpret and analyze data effectively, fostering critical thinking and problem-solving skills.

Key Concepts

Median

The median is the middle value in a data set when the numbers are arranged in ascending or descending order. It divides the data into two equal halves. For an odd number of observations, the median is the central number. For an even number, it is the average of the two central numbers.

Formula:

For an ordered data set with an odd number of observations n:

$$\text{Median} = \text{Value at position } \frac{n + 1}{2}$$

For an even number of observations:

$$\text{Median} = \frac{\text{Value at position } \frac{n}{2} + \text{Value at position } \left(\frac{n}{2} + 1\right)}{2}$$

Example:

Consider the data set: 3, 7, 8, 5, 12, 14, 21, 13, 18

Arranged in order: 3, 5, 7, 8, 12, 13, 14, 18, 21

Median = 12 (the fifth value in a nine-number data set)
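The odd and even cases above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method; the function name `median` is chosen here for convenience.

```python
def median(values):
    """Return the median of a non-empty list of numbers."""
    ordered = sorted(values)          # the median is defined on ordered data
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty data set is undefined")
    mid = n // 2
    if n % 2 == 1:
        # Odd n: the single middle value, at 1-based position (n + 1) / 2
        return ordered[mid]
    # Even n: average of the values at 1-based positions n/2 and n/2 + 1
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([3, 7, 8, 5, 12, 14, 21, 13, 18]))  # 12
```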

Quartiles

Quartiles divide a ranked data set into four equal parts. The three quartiles (Q1, Q2, Q3) represent the 25th, 50th, and 75th percentiles, respectively.

First Quartile (Q1): The median of the lower half of the data (25th percentile).

Second Quartile (Q2): The median of the data set (50th percentile).

Third Quartile (Q3): The median of the upper half of the data (75th percentile).

Example:

Using the previous data set: 3, 5, 7, 8, 12, 13, 14, 18, 21

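Q1 = 6 (median of the lower half 3, 5, 7, 8, which is (5 + 7)/2)

Q2 = 12 (median of the entire data set)

Q3 = 16 (median of the upper half 13, 14, 18, 21, which is (14 + 18)/2; the median 12 is excluded from both halves)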

Percentiles

Percentiles indicate the relative standing of a value within a data set. The nth percentile is the value below which n percent of the data fall.

Formula to find the percentile rank:

$$P = \left(\frac{b + \frac{c}{2}}{N}\right) \times 100$$

Where:

  • P = Percentile rank
  • b = Number of values below the target value
  • c = Number of values equal to the target value
  • N = Total number of observations
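As a rough sketch, the percentile rank of a value can be computed by counting the values below it and the values equal to it. The code assumes the common convention in which half of the tied values count as lying below the target; the function name `percentile_rank` is hypothetical.

```python
def percentile_rank(values, target):
    """Percentile rank of `target`, counting half of the tied values (assumed convention)."""
    n = len(values)
    if n == 0:
        raise ValueError("percentile rank of an empty data set is undefined")
    below = sum(1 for v in values if v < target)   # b: values below the target
    equal = sum(1 for v in values if v == target)  # c: values equal to the target
    return (below + 0.5 * equal) / n * 100


data = [3, 5, 7, 8, 12, 13, 14, 18, 21]
rank_of_8 = percentile_rank(data, 8)  # b = 3, c = 1, N = 9
```

For the value 8 in this data set, b = 3 and c = 1, so the rank is (3 + 0.5)/9 × 100, or roughly 38.9.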

Example:

Find the 40th percentile in the data set: 3, 5, 7, 8, 12, 13, 14, 18, 21

N = 9

P = 40

Position = $\frac{P}{100} \times (N + 1) = \frac{40}{100} \times (9 + 1) = 4$

The 4th value is 8, so the 40th percentile is 8.
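The position rule used in this example can be sketched in Python as follows. The sketch assumes two things the text does not spell out: when the position is not a whole number, the value is found by linear interpolation between the two neighbouring values, and when the position falls outside the data, the nearest extreme value is returned. The name `percentile_value` is hypothetical.

```python
import math


def percentile_value(values, p):
    """Value at the p-th percentile using position = p/100 * (N + 1)."""
    if not 0 < p < 100:
        raise ValueError("p must be strictly between 0 and 100")
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("percentile of an empty data set is undefined")
    position = p * (n + 1) / 100      # 1-based position in the ordered data
    if position <= 1:                 # assumption: clamp to the smallest value
        return ordered[0]
    if position >= n:                 # assumption: clamp to the largest value
        return ordered[-1]
    lower = math.floor(position)      # 1-based index of the value just below
    fraction = position - lower
    # Linear interpolation between the lower-th and (lower + 1)-th values
    return ordered[lower - 1] + fraction * (ordered[lower] - ordered[lower - 1])


print(percentile_value([3, 5, 7, 8, 12, 13, 14, 18, 21], 40))  # 8.0
```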

Interquartile Range (IQR)

The interquartile range measures the spread of the middle 50% of the data. It is the difference between the third quartile (Q3) and the first quartile (Q1).

Formula:

$$\text{IQR} = Q3 - Q1$$

Example:

Using the quartiles of the data set 3, 5, 7, 8, 12, 13, 14, 18, 21: Q3 = 16 and Q1 = 6

IQR = 16 - 6 = 10

Determining Median, Quartiles, Percentiles, and IQR

To determine these measures, follow these steps:

  1. Organize the data in ascending order.
  2. Find the median (Q2) of the entire data set.
  3. Determine Q1 by finding the median of the lower half.
  4. Determine Q3 by finding the median of the upper half.
  5. Calculate the IQR by subtracting Q1 from Q3.
  6. For percentiles, use the percentile formula to find the desired value.

Example:

Data set: 7, 15, 36, 39, 40, 41

Step 1: Ordered data: 7, 15, 36, 39, 40, 41

Step 2: Median (Q2) = (36 + 39)/2 = 37.5

Step 3: Lower half: 7, 15, 36 → Q1 = 15

Step 4: Upper half: 39, 40, 41 → Q3 = 40

Step 5: IQR = 40 - 15 = 25

Step 6: To find the 90th percentile, Position = 0.9 * (6 + 1) = 6.3

This position lies beyond the 6th (largest) value, 41, and there is no 7th value to interpolate toward, so the 90th percentile is taken as approximately 41.

Advanced Concepts

In-Depth Theoretical Explanations

The median, quartiles, percentiles, and interquartile range describe a data distribution with little influence from outliers. Unlike the mean, which is pulled toward extreme values, the median gives a more representative central location for skewed distributions.

Mathematical Derivation of Quartiles:

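For a data set with an odd number of observations, the median is a data value; it is excluded, and Q1 and Q3 are the medians of the values below and above it. For an even number of observations, the ordered data split into two halves of equal size, and Q1 and Q3 are the medians of those halves.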

The calculation of percentiles can be approached using linear interpolation, especially when the desired percentile falls between two data points.

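Why the IQR Is Robust:

The IQR is less sensitive to outliers than the range. The range depends only on the smallest and largest values, so a single extreme value changes it directly. The IQR depends only on Q1 and Q3, which bound the middle 50% of the data, so an extreme value in the lowest or highest quarter generally leaves it unchanged.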
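A small numerical check illustrates the point. The sketch below compares the range and the IQR of the data set 7, 15, 36, 39, 40, 41 with a modified copy in which the largest value is replaced by 410; that replacement value is hypothetical and chosen only to act as an outlier. Quartiles are found by the median-of-halves method.

```python
from statistics import median


def spread(values):
    """Return (range, IQR), with quartiles found by the median-of-halves method."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # For odd n the median is excluded from both halves
    upper = ordered[half + 1:] if n % 2 else ordered[half:]
    data_range = ordered[-1] - ordered[0]
    iqr = median(upper) - median(lower)
    return data_range, iqr


original = [7, 15, 36, 39, 40, 41]
with_outlier = [7, 15, 36, 39, 40, 410]   # hypothetical outlier replaces 41

print(spread(original))      # (34, 25)
print(spread(with_outlier))  # (403, 25)
```

The range jumps from 34 to 403, while the IQR stays at 25 because Q1 and Q3 are unchanged.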

Complex Problem-Solving

Problem 1: A data set consists of the following ages of participants in a workshop: 22, 27, 29, 31, 35, 38, 40, 42, 45, 48, 50, 52. Calculate the median, Q1, Q3, IQR, and the 85th percentile.

Solution:

Ordered data: 22, 27, 29, 31, 35, 38, 40, 42, 45, 48, 50, 52

N = 12 (even number)

Median (Q2) = (35 + 38)/2 = 36.5

Lower half: 22, 27, 29, 31, 35, 38 → Q1 = (29 + 31)/2 = 30

Upper half: 40, 42, 45, 48, 50, 52 → Q3 = (45 + 48)/2 = 46.5

IQR = 46.5 - 30 = 16.5

85th percentile position = 0.85 * (12 + 1) = 11.05

The 11th value is 50 and the 12th value is 52, so 85th percentile ≈ 50 + 0.05*(52 - 50) = 50 + 0.1 = 50.1

Interdisciplinary Connections

These statistical measures are not confined to mathematics but are extensively used in various fields:

  • Economics: Median income provides insights into the economic status of a population without being skewed by extremely high or low incomes.
  • Medicine: Percentiles are used to assess growth charts in pediatrics, indicating a child's growth relative to peers.
  • Environmental Science: IQR is utilized in analyzing pollutant levels to understand variations and detect anomalies.
  • Education: Median scores help in evaluating student performance distributions, aiding in curriculum adjustments.

Advanced Techniques in Calculations

When dealing with large data sets or grouped data, the calculation of medians, quartiles, and percentiles requires formulas to estimate their values accurately.

Grouped Data: For data presented in frequency tables, the median, quartiles, and percentiles can be found using interpolation within the appropriate class intervals.
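One common way to carry out this interpolation is to locate the class containing the required cumulative frequency and then assume the values are spread evenly within that class:

$$\text{Estimate} = L + \frac{\frac{P}{100} N - F}{f} \times h$$

Here L is the lower boundary of that class, F the cumulative frequency before it, f its frequency, h its width, and N the total frequency. The Python sketch below applies this formula; the function name and the frequency table are hypothetical and serve only as an illustration.

```python
def grouped_percentile(classes, p):
    """Estimate the p-th percentile from (lower, upper, frequency) classes in ascending order."""
    total = sum(freq for _, _, freq in classes)
    if total == 0:
        raise ValueError("frequency table is empty")
    target = p / 100 * total          # cumulative frequency to reach
    cumulative = 0                    # F: cumulative frequency before the current class
    for lower, upper, freq in classes:
        if freq > 0 and cumulative + freq >= target:
            # Interpolate linearly inside the class that contains the target
            return lower + (target - cumulative) / freq * (upper - lower)
        cumulative += freq
    return classes[-1][1]             # fallback: upper boundary of the last class


# Hypothetical frequency table: (lower boundary, upper boundary, frequency)
table = [(0, 10, 5), (10, 20, 8), (20, 30, 12), (30, 40, 5)]
estimated_median = grouped_percentile(table, 50)   # 20 + (15 - 13)/12 * 10
```

With this table, N = 30, so the median corresponds to a cumulative frequency of 15, which falls in the 20 to 30 class; the estimate is about 21.7.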

Software Applications: Statistical software and programming languages like R and Python provide functions to calculate these measures efficiently, especially for extensive datasets.
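As an illustration, Python's standard `statistics` module provides `median` and, from Python 3.8 onward, `quantiles`. The sketch below uses `method="exclusive"`, which places the quartiles at positions based on N + 1. Software packages use several different quartile conventions, so results can differ from a hand calculation depending on the method chosen.

```python
import statistics

data = [3, 5, 7, 8, 12, 13, 14, 18, 21]

med = statistics.median(data)
# n=4 asks for the three cut points that split the data into four parts
q1, q2, q3 = statistics.quantiles(data, n=4, method="exclusive")
iqr = q3 - q1

print(med)         # 12
print(q1, q2, q3)  # 6.0 12.0 16.0
print(iqr)         # 10.0
```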

Handling Skewed Data Distributions

In skewed distributions, the median provides a better central location than the mean. Comparing the distances Q2 - Q1 and Q3 - Q2, where Q2 is the median, indicates the direction and degree of skewness.

Positive Skew: Q3 - Q2 > Q2 - Q1

Negative Skew: Q2 - Q1 > Q3 - Q2

Understanding skewness is essential in fields like finance for risk assessment and in quality control for process improvement.
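The quartile comparison above can be sketched as a short Python function. The function name is hypothetical, and the label "symmetric" for the case where the two distances are equal is an added assumption, since the text only covers the two skewed cases.

```python
def skew_direction(q1, q2, q3):
    """Classify skew by comparing the quartile distances Q3 - Q2 and Q2 - Q1."""
    upper_gap = q3 - q2
    lower_gap = q2 - q1
    if upper_gap > lower_gap:
        return "positive"
    if lower_gap > upper_gap:
        return "negative"
    return "symmetric"   # assumed label when the two distances are equal


# Quartiles of the data set 7, 15, 36, 39, 40, 41: Q1 = 15, Q2 = 37.5, Q3 = 40
print(skew_direction(15, 37.5, 40))  # negative
```

Here Q2 - Q1 = 22.5 is much larger than Q3 - Q2 = 2.5, which points to a negative skew.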

Comparison Table

| Measure | Definition | Use Case |
| --- | --- | --- |
| Median | The middle value of an ordered data set | Identifying the central tendency in skewed distributions |
| Quartiles | Values that divide data into four equal parts | Analyzing the spread and identifying outliers |
| Percentiles | Values below which a certain percent of data falls | Assessing individual performance relative to a group |
| Interquartile Range (IQR) | Difference between Q3 and Q1 | Measuring data variability and identifying outliers |

Summary and Key Takeaways

  • The median provides a robust measure of central tendency in data sets.
  • Quartiles divide data into four equal parts, offering insights into data spread.
  • Percentiles rank individual data points relative to the entire data set.
  • The interquartile range measures variability by focusing on the middle 50% of data.
  • These measures are essential in various interdisciplinary applications, enhancing data analysis and interpretation skills.

Tips

To remember quartile positions, think of Q1 as the end of the first quarter of the data, Q2 as the end of the second quarter (the median), and Q3 as the end of the third quarter. When calculating percentiles, practice using linear interpolation to estimate values that fall between two data points. For the IGCSE exam, be comfortable with manual calculation methods, and use statistical software tools to check your work on large data sets.

Did You Know

Did you know that the term quartile came into statistical use in the late 19th century, in work associated with the English polymath Francis Galton? Quartiles are not only fundamental in statistics but are also used in financial markets to analyze stock performance and in public health to assess the distribution of health indicators across populations.

Common Mistakes

Mistake 1: Confusing median with mean.
Incorrect: Assuming the average of numbers represents the central value.
Correct: Arrange the data and identify the middle value.

Mistake 2: Incorrectly dividing data sets when finding quartiles.
Incorrect: Including the median in both lower and upper halves for an odd number of observations.
Correct: Exclude the median when the number of observations is odd.

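Mistake 3: Misapplying the percentile formulas.
Incorrect: Using the percentile rank formula when the question asks for the value at a given percentile, or the reverse.
Correct: To find the value at the Pth percentile of N ordered observations, use Position = $\frac{P}{100} \times (N + 1)$. To find the percentile rank of a given value, use $P = \left(\frac{b + \frac{c}{2}}{N}\right) \times 100$, where b is the number of values below it and c is the number of values equal to it.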

FAQ

What is the difference between median and mode?
The median is the middle value in an ordered data set, while the mode is the most frequently occurring value. A data set has exactly one median, but it may have several modes or none at all.
How do you find quartiles in a data set with an even number of observations?
First, find the median to divide the data set into lower and upper halves. Then, calculate the median of each half to determine Q1 and Q3, respectively.
Can percentiles be used for grouped data?
Yes, percentiles can be calculated for grouped data using interpolation within the appropriate class interval.
Why is the interquartile range preferred over the range?
The interquartile range (IQR) focuses on the middle 50% of the data, making it less sensitive to outliers and providing a better measure of data variability.
How do skewed distributions affect the median and mean?
In skewed distributions, the median is a better measure of central tendency as it is not influenced by extreme values, whereas the mean can be skewed in the direction of the tail.