DISTRIBUTIONS OF CONTINUOUS DATA

ST101 – DR. ARIC LABARR


A random variable is a numerical description of the outcome of an experiment.

They can be either discrete or continuous.

A discrete random variable may assume either a finite number of values or an infinite sequence of values.

A continuous random variable may assume any numerical value in an interval or collection of intervals.


RANDOM VARIABLES


CONTINUOUS RANDOM VARIABLES

A continuous random variable can assume any value in an interval on the real line or in a collection of intervals on the real line.

For a continuous random variable, the probability of assuming any single exact value is zero, so it is not meaningful to talk about the probability of one particular value.

Instead, we talk about the probability of the random variable assuming a value inside of a given interval.


 

PROBABILITIES ON INTERVALS
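As a rough illustration of probability as area over an interval, the sketch below approximates the area under a density with a midpoint sum. The density f(x) = 2x on [0, 1] and the function names are hypothetical, chosen only to keep the arithmetic easy to check; they do not come from the slides.

```python
def interval_probability(pdf, lower, upper, steps=10_000):
    """Approximate P(lower <= X <= upper) as the area under pdf (midpoint rule)."""
    width = (upper - lower) / steps
    return sum(pdf(lower + (i + 0.5) * width) for i in range(steps)) * width


def example_pdf(x):
    # Hypothetical density for illustration only: f(x) = 2x on [0, 1], 0 elsewhere.
    return 2 * x if 0 <= x <= 1 else 0.0


total_area = interval_probability(example_pdf, 0, 1)        # whole range, close to 1
upper_half = interval_probability(example_pdf, 0.5, 1)      # exact area is 0.75
single_point = interval_probability(example_pdf, 0.5, 0.5)  # zero-width interval

print(total_area, upper_half, single_point)
```

The zero-width interval has zero area, which is why a single exact value carries no probability for a continuous random variable.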

 

 

 


POPULAR CONTINUOUS DISTRIBUTIONS

Uniform

Exponential

Normal


A continuous random variable may assume any numerical value in an interval or collection of intervals.

The probability of a continuous random variable assuming one particular value is not meaningful; instead, we talk about probabilities of intervals.


SUMMARY


UNIFORM DISTRIBUTION

DISTRIBUTIONS OF CONTINUOUS DATA


A random variable follows a uniform distribution whenever the probability is proportional to the interval’s length.

In other words, any two intervals of the same length within the range have the same probability.

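The probability density function for the uniform distribution on the interval from a to b is:

$$f(x) = \begin{cases} \dfrac{1}{b - a} & \text{for } a \le x \le b \\ 0 & \text{elsewhere} \end{cases}$$


UNIFORM PROBABILITY DISTRIBUTION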


Assume that incoming sales calls at a company are uniformly distributed across the years of experience of the sales staff, so that everyone has the same chance of getting a call.

Years of experience range from 2 to 12.


EXAMPLE OF UNIFORM DISTRIBUTION

 


What is the probability a call is answered by an employee with 10 to 12 years of experience?

The probability is the area under the curve between 10 and 12. With experience ranging from 2 to 12, the height of the density is 1/(12 - 2) = 1/10, so:

$$P(10 \le x \le 12) = (12 - 10) \cdot \frac{1}{12 - 2} = \frac{2}{10} = 0.2$$
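The area calculation for this example can be sketched in a few lines of Python. The function name is an assumption made for illustration; it simply multiplies the interval length by the constant density height 1/(b - a).

```python
def uniform_interval_probability(a, b, lower, upper):
    """P(lower <= X <= upper) for X uniformly distributed on [a, b]."""
    # Clip the requested interval to [a, b]; the density is zero outside it.
    lower, upper = max(lower, a), min(upper, b)
    if upper <= lower:
        return 0.0
    return (upper - lower) / (b - a)


# Sales call example: experience ranges from 2 to 12 years.
p_10_to_12 = uniform_interval_probability(2, 12, 10, 12)
print(p_10_to_12)  # 0.2
```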

 

 

 

 

 

 

 


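For a uniform distribution on the interval from a to b:

Expected Value:

$$E(x) = \frac{a + b}{2}$$

Variance:

$$Var(x) = \frac{(b - a)^2}{12}$$

MEASURES ON UNIFORM DISTRIBUTION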


Continuing the sales call example, years of experience are uniformly distributed from 2 to 12.

What is the expected years of experience of a person answering a new sales call?

EXAMPLE OF UNIFORM DISTRIBUTION

 

 

 

 

 


The expected value is the midpoint of the range:

$$E(x) = \frac{2 + 12}{2} = 7$$

The expected experience of a person answering a new sales call is 7 years.
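A minimal sketch of the uniform mean and variance, applied to the 2-to-12-year example. The function names are illustrative assumptions; the formulas are the standard (a + b)/2 and (b - a)^2/12.

```python
import math


def uniform_mean(a, b):
    """Expected value of a uniform distribution on [a, b]."""
    return (a + b) / 2


def uniform_variance(a, b):
    """Variance of a uniform distribution on [a, b]."""
    return (b - a) ** 2 / 12


mean_years = uniform_mean(2, 12)          # 7.0
variance_years = uniform_variance(2, 12)  # 100 / 12, about 8.33
sd_years = math.sqrt(variance_years)      # about 2.89

print(mean_years, round(variance_years, 2), round(sd_years, 2))
```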

 

 

 

 

 

 

 


A random variable follows a uniform distribution whenever the probability is proportional to the interval’s length.

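The probability density function for the uniform distribution on the interval from a to b is:

$$f(x) = \begin{cases} \dfrac{1}{b - a} & \text{for } a \le x \le b \\ 0 & \text{elsewhere} \end{cases}$$


SUMMARY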


NORMAL DISTRIBUTION

DISTRIBUTIONS OF CONTINUOUS DATA


The Normal probability distribution is one of the most common and important distributions for describing a continuous random variable.

The Normal distribution is the foundation of statistical inference:

Hypothesis Testing

Confidence Intervals

Regression Analysis

The Normal distribution also appears in nature and in real-world data.


IMPORTANCE


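The probability density function for the Normal distribution is defined as:

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{(x - \mu)^2}{2\sigma^2}}$$

where μ is the mean, σ is the standard deviation, π ≈ 3.14159, and e ≈ 2.71828.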


PROBABILITY DENSITY FUNCTION
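To make the Normal density concrete, the sketch below evaluates the standard formula by hand and compares it with `statistics.NormalDist.pdf` from the Python standard library (Python 3.8+). The helper name `normal_pdf` is an assumption for illustration.

```python
import math
from statistics import NormalDist


def normal_pdf(x, mu, sigma):
    """Normal density: 1 / (sigma * sqrt(2*pi)) * exp(-(x - mu)^2 / (2 * sigma^2))."""
    coefficient = 1 / (sigma * math.sqrt(2 * math.pi))
    exponent = -((x - mu) ** 2) / (2 * sigma ** 2)
    return coefficient * math.exp(exponent)


height_at_mean = normal_pdf(0, mu=0, sigma=1)         # peak of the standard Normal curve
library_height = NormalDist(mu=0, sigma=1).pdf(0)     # same value from the library

print(round(height_at_mean, 4), round(library_height, 4))
```

The density is highest at the mean and is the same height at equal distances on either side of it.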

 




CHARACTERISTICS OF NORMAL DISTRIBUTION

Values close to the mean are more probable, and values far from the mean are less probable.

The mean can take ANY value.

The standard deviation controls the width of the curve.


The Normal probability distribution is one of the most common and important distributions for describing a continuous random variable.

The Normal distribution is the foundation of statistical inference.

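The Normal distribution has some useful characteristics: values near the mean are the most probable, the mean can take any value, and the standard deviation controls the width.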



SUMMARY


EMPIRICAL RULE

DISTRIBUTIONS OF CONTINUOUS DATA


The probabilities for the Normal random variable are determined by the area under the curve.

The total area under the curve = 1.

Because the Normal distribution is perfectly symmetric around the mean (which is also the median), the area under the curve below the mean equals the area above the mean, and each is 0.5.


PROBABILITIES


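EMPIRICAL RULE

For a Normal distribution:

About 68% of values fall within 1 standard deviation of the mean.

About 95% of values fall within 2 standard deviations of the mean.

About 99.7% of values fall within 3 standard deviations of the mean.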

Assume the previous years of professional experience of new employees at a company follow a Normal distribution with a mean of 7.5 and a standard deviation of 2.5.

What is the probability that a randomly selected new employee has between 5 and 10 years of experience?

EXAMPLE


The values 5 and 10 are each exactly one standard deviation (2.5) away from the mean (7.5), so by the empirical rule the probability is approximately 68%.
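As a quick numerical check of the empirical rule for this example, the sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8+). The helper name `probability_between` is an assumption for illustration.

```python
from statistics import NormalDist

# Years of experience: Normal with mean 7.5 and standard deviation 2.5.
experience = NormalDist(mu=7.5, sigma=2.5)


def probability_between(dist, lower, upper):
    """Area under the curve between lower and upper."""
    return dist.cdf(upper) - dist.cdf(lower)


p_5_to_10 = probability_between(experience, 5, 10)

# Areas within 1, 2, and 3 standard deviations of the mean.
within = {
    k: probability_between(experience, 7.5 - k * 2.5, 7.5 + k * 2.5)
    for k in (1, 2, 3)
}

print(round(p_5_to_10, 4))
print({k: round(p, 4) for k, p in within.items()})
```

The computed areas are close to the rounded 68%, 95%, and 99.7% figures, which is why the rule works for quick estimates.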

 

 

 

 

 

 

 

 


Assume the previous years of professional experience of new employees at a company follow a Normal distribution with a mean of 7.5 and a standard deviation of 2.5.

What is the probability that a randomly selected new employee has between 2.5 and 10 years of experience?

EXAMPLE


The value 2.5 is two standard deviations below the mean, and 10 is one standard deviation above it. By symmetry, the area from 2.5 up to the mean is about 95% / 2 = 47.5%, and the area from the mean up to 10 is about 68% / 2 = 34%, so the probability is approximately 47.5% + 34% = 81.5%.
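The asymmetric interval can be checked the same way. This is a sketch using the standard library's `statistics.NormalDist`; the variable names are illustrative assumptions.

```python
from statistics import NormalDist

experience = NormalDist(mu=7.5, sigma=2.5)

# Split the interval at the mean, mirroring the empirical-rule reasoning.
below_mean = experience.cdf(7.5) - experience.cdf(2.5)   # mean - 2 sd up to the mean
above_mean = experience.cdf(10) - experience.cdf(7.5)    # mean up to mean + 1 sd
p_2_5_to_10 = below_mean + above_mean

print(round(below_mean, 4), round(above_mean, 4), round(p_2_5_to_10, 4))
```

The exact area is slightly above the 81.5% estimate because 95% and 68% are themselves rounded.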

 

 

 

 

 

 

 

 

 

 

 

 

 


The empirical rule (68, 95, 99.7 rule) is good for quick, rough analysis.

It is not suited to exact analysis unless you are only interested in whole-number multiples of the standard deviation.

What about fractions of standard deviations away from the mean?

We need another way to quickly calculate the area under the curve.


SUMMARY


STANDARD SCORES

DISTRIBUTIONS OF CONTINUOUS DATA


A random variable having a Normal distribution with a mean of 0 and a standard deviation of 1 is said to have a standard Normal probability distribution.

All Normal distributions can be converted into standard Normal distributions for ease of computing probabilities under the curve.

Standard Normal probability tables help calculate area under the curve.


CONVERSION OF NORMAL DISTRIBUTIONS


The standard Normal table is an extension of the empirical rule: it gives the area under the standard Normal curve to the left of any z-value, with z specified to two decimal places.


STANDARD NORMAL TABLE

z        .00    .01    .02    .03    .04    .05    .06    .07    .08    .09
...
0.5    .6915  .6950  .6985  .7019  .7054  .7088  .7123  .7157  .7190  .7224
0.6    .7257  .7291  .7324  .7357  .7389  .7422  .7454  .7486  .7517  .7549
0.7    .7580  .7611  .7642  .7673  .7704  .7734  .7764  .7794  .7823  .7852
0.8    .7881  .7910  .7939  .7967  .7995  .8023  .8051  .8078  .8106  .8133
0.9    .8159  .8186  .8212  .8238  .8264  .8289  .8315  .8340  .8365  .8389
...
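The entries of a standard Normal table are values of the cumulative distribution function. The sketch below regenerates one row with `statistics.NormalDist` from the Python standard library; the helper name `table_row` is an assumption for illustration.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1


def table_row(row_z):
    """Left-tail areas for row_z + .00, .01, ..., .09, rounded to 4 decimals."""
    return [
        round(standard_normal.cdf(row_z + hundredths / 100), 4)
        for hundredths in range(10)
    ]


row_0_5 = table_row(0.5)
print(row_0_5)
```

For example, the entry in row 0.5 under column .07 is the area to the left of z = 0.57.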



 

 

 

 


The standard Normal table gives the area to the left of a point. To calculate the area to the right of any point, use the complement rule of probability:

$$P(z > z_0) = 1 - P(z \le z_0)$$


CALCULATING OPPOSITE PROBABILITIES
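A minimal sketch of the left-tail and right-tail calculation, using `statistics.NormalDist` from the Python standard library. The cutoff z = 0.85 is an arbitrary value picked for illustration.

```python
from statistics import NormalDist

standard_normal = NormalDist()

z0 = 0.85  # arbitrary illustrative cutoff
left_area = standard_normal.cdf(z0)   # table lookup: row 0.8, column .05
right_area = 1 - left_area            # complement rule

# By symmetry, the right tail beyond z0 equals the left tail below -z0.
mirrored_left = standard_normal.cdf(-z0)

print(round(left_area, 4), round(right_area, 4), round(mirrored_left, 4))
```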

 

 

 


All Normal distributions can be converted into standard Normal distributions for ease of computing probabilities under the curve. Subtracting the mean centers the distribution at 0, and dividing by the standard deviation rescales its spread to 1.


CONVERSION OF NORMAL DISTRIBUTIONS


The z-score of a value x is the number of standard deviations that x lies from the mean:

$$z = \frac{x - \mu}{\sigma}$$


Z-SCORES
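The sketch below illustrates why the conversion works: the area to the left of x under the original Normal curve matches the area to the left of its z-score under the standard Normal curve. It reuses the employee experience numbers from earlier (mean 7.5, standard deviation 2.5); the function name `z_score` is an assumption for illustration.

```python
from statistics import NormalDist


def z_score(x, mu, sigma):
    """Number of standard deviations that x lies from the mean."""
    return (x - mu) / sigma


mu, sigma = 7.5, 2.5
x = 10

z = z_score(x, mu, sigma)                      # 1.0
area_original = NormalDist(mu, sigma).cdf(x)   # area left of x on the original scale
area_standard = NormalDist().cdf(z)            # area left of z on the standard scale

print(z, round(area_original, 4), round(area_standard, 4))
```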

 


Assume that the daily number of total users follows a Normal distribution. The average daily number of total users is 4,504 with a standard deviation of 1,937. What is the probability that a randomly selected day has more than 6,000 total users?


Z-SCORES BIKE DATA EXAMPLE


First convert 6,000 users to a z-score:

$$z = \frac{6000 - 4504}{1937} \approx 0.77$$

From the standard Normal table, the area to the left of z = 0.77 is .7794. The probability of more than 6,000 users is the area to the right:

$$P(x > 6000) = 1 - .7794 = .2206$$
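The same calculation can be sketched with `statistics.NormalDist` from the Python standard library. The variable names are illustrative assumptions; the mean and standard deviation are the ones stated in the example.

```python
from statistics import NormalDist

mean_users, sd_users = 4504, 1937

# Table-style calculation: round z to two decimals, then take the right tail.
z = (6000 - mean_users) / sd_users
z_table = round(z, 2)                              # 0.77
p_table = 1 - NormalDist().cdf(z_table)            # about 0.2206

# Direct calculation on the original scale, without rounding z.
p_exact = 1 - NormalDist(mean_users, sd_users).cdf(6000)

print(z_table, round(p_table, 4), round(p_exact, 4))
```

The two results differ only slightly, because the table approach rounds z to two decimal places.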

 

 

 

 

 


Assume that the daily number of total users follows a Normal distribution. The average daily number of total users is 4,504 with a standard deviation of 1,937. What number of daily users marks the cutoff for the bottom 10% of days?


Z-SCORES BIKE DATA EXAMPLE


Look in the body of the standard Normal table for the area closest to .1000:

z        .00    .01    .02    .03    .04    .05    .06    .07    .08    .09
...
-1.4   .0808  .0793  .0778  .0764  .0749  .0735  .0721  .0708  .0694  .0681
-1.3   .0968  .0951  .0934  .0918  .0901  .0885  .0869  .0853  .0838  .0823
-1.2   .1151  .1131  .1112  .1093  .1075  .1056  .1038  .1020  .1003  .0985
-1.1   .1357  .1335  .1314  .1292  .1271  .1251  .1230  .1210  .1190  .1170
-1.0   .1587  .1562  .1539  .1515  .1492  .1469  .1446  .1423  .1401  .1379
...

The closest entry is .1003, at z = -1.28. Converting the z-score back to the original scale:

$$x = \mu + z\sigma = 4504 + (-1.28)(1937) \approx 2025$$

Days with fewer than about 2,025 total users fall in the bottom 10%.
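Going from an area back to a value is an inverse lookup. A minimal sketch using `NormalDist.inv_cdf` from the Python standard library (Python 3.8+), with illustrative variable names:

```python
from statistics import NormalDist

mean_users, sd_users = 4504, 1937

# z-value with 10% of the area to its left (the table gives about -1.28).
z_cutoff = NormalDist().inv_cdf(0.10)

# Convert back to the original scale: x = mean + z * sd.
cutoff_users = mean_users + z_cutoff * sd_users

# The same cutoff computed directly on the original distribution.
cutoff_direct = NormalDist(mean_users, sd_users).inv_cdf(0.10)

print(round(z_cutoff, 2), round(cutoff_users), round(cutoff_direct))
```

The result is a few users lower than a table-based answer, since the table only resolves z to two decimal places.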

 

 


A random variable having a Normal distribution with a mean of 0 and a standard deviation of 1 is said to have a standard Normal probability distribution.

All Normal distributions can be converted into standard Normal distributions for ease of computing probabilities under the curve.

Standard Normal probability tables help calculate area under the curve.


SUMMARY


Last modified: Monday, 17 October 2022, 13:11