Applied statistics lecture

optional course for students of the Faculty of Science, accredited Ph.D. course

2019/2020 Autumn semester

Lectures delivered by Ernő Keszei, professor of chemistry, Room No. 148, phone: 372-2500 / ext. 1904, keszei-AT-chem.elte.hu

Course description and recommended reading. Oral exam topics.

Time and location:  Tuesday 14:15 to 15:45, Chemistry Building, 1st floor, Room No. 132

Exam dates: to be discussed; in December and January

Exams are planned to take place in Room No. 148, Chemistry Building

Actual schedule:

September: 5, 12, 19, 24
October: 1, 8, 15, 22
November: 5, 12, 19, 26
December: 3, 10, 17

September 5, Thursday:
Registration period

September 12, Thursday:

1st lecture: Probability theory basics I. - Random experiment, outcome, sample space, event. Postulates (axioms) of probability theory and some properties following from them. Independence and conditional probability. Probability of independent simultaneous events. Interpretations of probability: frequency, classical probability and geometric probability.
Review (in Hungarian) of the book Paradoxes in Probability Theory and Mathematical Statistics.
Random experiment: roll a die or flip a coin.

September 19, Thursday:
2nd lecture:
Probability theory basics II. - Random variables. Simple events concerning discrete and continuous random variables. Sampling distributions. Properties of the probability density function. Relation of the probability distribution function to the probability density function. Calculation of probabilities based on the probability distribution function and on the probability density function. Expectations and their properties. Calculation of the expectation as a linear operation. Some particularly important expectations. (Distribution mean, distribution moments, variance, covariance, correlation coefficient, entropy.) Some important relations for calculating expectations.

September 24, Tuesday: Attention! From this date on, lectures are held on Tuesdays.
3rd lecture: Probability theory basics III. - The covariance matrix and its properties. Relation between independence of random variables and their covariance. Calculation of (normalised) probability density function, if not known. The law of large numbers. Stochastic convergence and random walk. Some important probability distributions. The binomial and hypergeometric distributions. Poisson process: interrelation of the uniform, the Poisson and the exponential distributions.
An Excel worksheet to explore random walk and fluctuations.
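For readers who prefer code to a spreadsheet, a symmetric one-dimensional random walk can be sketched in a few lines of Python. This is only an illustration and not a reproduction of the worksheet: the function names, the step rule (one unit left or right with equal probability) and the seed are assumptions made here. The running mean of the steps also gives a feel for the law of large numbers, since it settles near the expected step of zero as the number of steps grows.

```python
import random

def random_walk(n_steps, seed=None):
    """Return the positions of a symmetric 1-D random walk after each step."""
    rng = random.Random(seed)
    position = 0
    path = []
    for _ in range(n_steps):
        position += rng.choice((-1, 1))  # step left or right with equal probability
        path.append(position)
    return path

def running_mean_of_steps(path):
    """Mean step after each step, i.e. position divided by steps taken so far."""
    return [pos / (i + 1) for i, pos in enumerate(path)]

path = random_walk(10_000, seed=1)
means = running_mean_of_steps(path)
print("final position:", path[-1])
print("mean step after 10 steps:", means[9])
print("mean step after 10000 steps:", means[-1])
```

The position itself fluctuates with a typical magnitude that grows like the square root of the number of steps, while the mean step shrinks toward zero.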

October 1, Tuesday:
4th lecture:
Probability theory basics IV. - The Poisson and exponential distributions. Gamma distribution. Normal distribution.

October 8, Tuesday:
5th lecture:
Introduction to statistical methods. Chi-squared, Student's t, and Fisher's F distribution. Distributions without a maximum. The arcsin distribution.
The aim and methods of statistics. Population vs. sample. Sampling. Estimation and characteristics of estimators. Estimation methods. Histograms. Sample statistics. Sample mean, sample variance and covariance.
Auxiliary material: Problems concerning sampling in a Hungarian election and a U.S. election. A short introductory paper and a deeper analysis of cognitive bias (or "fooling oneself") from Nature.
Homework: histogram construction.
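As a rough orientation for the histogram homework, the sketch below counts observations in equal-width bins and converts the counts to densities. It is not the assignment's prescribed method: the bin rule (equal widths between the sample minimum and maximum), the function names and the simulated normal sample are all assumptions made for illustration.

```python
import random

def histogram(data, n_bins):
    """Count observations in n_bins equal-width bins covering [min, max]."""
    lo, hi = min(data), max(data)
    if hi == lo:
        raise ValueError("all observations are equal; bin width would be zero")
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for x in data:
        # The largest observation would fall just past the last bin, so clamp it.
        k = min(int((x - lo) / width), n_bins - 1)
        counts[k] += 1
    edges = [lo + i * width for i in range(n_bins + 1)]
    return edges, counts

def densities(edges, counts):
    """Relative frequency per unit width, so that the bar areas sum to 1."""
    n = sum(counts)
    return [c / (n * (edges[i + 1] - edges[i])) for i, c in enumerate(counts)]

rng = random.Random(42)
sample = [rng.gauss(0.0, 1.0) for _ in range(1000)]
edges, counts = histogram(sample, 10)
dens = densities(edges, counts)
for left, c in zip(edges, counts):
    print(f"{left:7.2f} {'#' * (c // 10)}")
```

Dividing by the bin width makes the histogram comparable with a probability density function, which is what allows it to be read as an estimate of that function.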

October 15, Tuesday:
6th lecture:
Estimators, estimation and estimates. Expected properties of estimators. Methods of estimation. Histograms. The method of maximum likelihood and a few actual applications. Further examples concerning maximum likelihood.
Auxiliary material: Assignment for histogram construction.
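One textbook application of maximum likelihood, given here only as an illustration (the syllabus does not say which applications the lecture covers), is the rate of an exponential distribution. For observations x_1, ..., x_n the log-likelihood n ln(lambda) - lambda * sum(x_i) is maximised at lambda = n / sum(x_i), the reciprocal of the sample mean. The function name, the true rate and the seed below are assumptions.

```python
import random

def exponential_mle(sample):
    """Maximum likelihood estimate of the rate of an exponential distribution.

    Setting the derivative of n*ln(rate) - rate*sum(x) to zero gives
    rate = n / sum(x).
    """
    return len(sample) / sum(sample)

rng = random.Random(0)
true_rate = 2.0  # assumed value used to simulate data
sample = [rng.expovariate(true_rate) for _ in range(5000)]
estimate = exponential_mle(sample)
print("estimated rate:", estimate)
```

With a few thousand observations the estimate should fall close to the rate used in the simulation, although it is slightly biased for small samples.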

October 22, Tuesday:

7th lecture:
Estimation: The method of least squares. The method of moments. Other estimation methods. Estimation of expectation and variance of functions of random variables.
Confidence intervals. Formulation of confidence in a computable form. Confidence interval for the expectation of a normal distribution with known variance. Confidence interval for the expectation of a normal distribution with unknown variance. Confidence interval for the parameters of a binomial distribution.
Auxiliary material: Assignment for parameter estimation using the method of moments.
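The simplest of the confidence intervals listed above, the one for the expectation of a normal distribution with known variance, can be sketched with the Python standard library. The interval is the sample mean plus or minus z * sigma / sqrt(n), where z is the standard normal quantile for the chosen confidence level. The function name and the example data are assumptions; the unknown-variance case needs Student's t quantiles, which the standard library does not provide, so it is not shown.

```python
import math
from statistics import NormalDist, mean

def mean_ci_known_sigma(sample, sigma, confidence=0.95):
    """Two-sided confidence interval for the mean when sigma is known."""
    n = len(sample)
    # Quantile leaving (1 - confidence) / 2 probability in each tail.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    half_width = z * sigma / math.sqrt(n)
    m = mean(sample)
    return m - half_width, m + half_width

data = [9.8, 10.1, 10.3, 9.9, 10.4]  # made-up measurements
low, high = mean_ci_known_sigma(data, sigma=0.25)
print(f"95% confidence interval: ({low:.3f}, {high:.3f})")
```

Raising the confidence level widens the interval, and quadrupling the sample size halves its width.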

October 29, Tuesday:

Autumn holiday

November 5, Tuesday:
8th lecture:
Confidence interval for the parameters of a binomial distribution. Confidence interval for the variance. Approximate confidence interval for a function of random variables. Confidence interval for the difference of the expectations of two random variables.
Statistical hypothesis testing - general considerations. Null hypothesis and alternative hypothesis. Types of hypotheses. Statistics underlying decision making. Type I and type II errors. Power function of the test.
Auxiliary material:
Common confusions concerning hypothesis testing. Assignment concerning confidence intervals: 1. Confidence of mean and variance. 2. Confidence and comparison of two means.

November 12, Tuesday:
9th lecture:
Statistical hypothesis testing: Statistics underlying decision making. Type I and type II errors. Test on the mean of a normal distribution with known and unknown variance. Test on the parameter p of a binomial distribution ("test on proportions"). Tests between means drawn from two samples in the case of normal distributions and binomial distributions ("tests on differences").
Auxiliary material:
Assignment on the confidence interval of proportion differences; Assignment on testing a mean.
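A test on the mean of a normal distribution with known variance can be sketched as follows. The statistic is z = (sample mean - mu0) / (sigma / sqrt(n)), and the two-sided p-value is the probability that a standard normal variable exceeds |z| in absolute value. The function name and the data are assumptions, and this is not the solution to the assignment.

```python
import math
from statistics import NormalDist, mean

def z_test_mean(sample, mu0, sigma):
    """Two-sided z-test of H0: expectation == mu0, with known sigma."""
    n = len(sample)
    z = (mean(sample) - mu0) / (sigma / math.sqrt(n))
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

data = [10.4, 10.6, 10.5, 10.5]  # made-up measurements
z, p = z_test_mean(data, mu0=10.0, sigma=0.5)
print(f"z = {z:.3f}, p = {p:.4f}")
# Reject H0 at significance level alpha if p < alpha.
```

The significance level alpha is the probability of a type I error; the type II error probability depends on how far the true expectation lies from mu0.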

November 19, Tuesday:
10th lecture:
Tests on matching pairs. Nonparametric tests. The Sign-test; the Mann-Whitney and Wilcoxon (Rank-Sum) test. Tests on several means: One-way ANOVA and two-way ANOVA tests. Multivariate analysis of variance (MANOVA). Homoscedasticity tests (tests on variances). Functional relations between random variables. Testing the correlation coefficient.
Auxiliary material:
A CU Boulder leaflet on MANOVA; Assignment on testing two means. Assignment on 1-way ANOVA, assignment on 2-way ANOVA.

November 26, Tuesday:
11th lecture:
Estimation of parameters describing functions of random variables. The general straight line and the straight line through the origin. Testing the difference between the two cases: the significance test of the intercept. Weighted least squares (LSQ) estimation.
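The two straight-line models named above can be compared with a short unweighted least squares sketch. For the general line y = a + b*x the estimates are b = S_xy / S_xx and a = mean(y) - b * mean(x); for the line through the origin, y = b*x, the slope is sum(x*y) / sum(x*x). The function names and data are assumptions. The significance test of the intercept is not shown because it needs Student's t quantiles.

```python
def fit_line(x, y):
    """Least squares estimates (intercept, slope) for y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def fit_line_through_origin(x, y):
    """Least squares estimate of the slope for y = b*x."""
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)

x = [1.0, 2.0, 3.0, 4.0]
y = [3.0, 5.0, 7.0, 9.0]  # made-up data lying exactly on y = 1 + 2x
a, b = fit_line(x, y)
b0 = fit_line_through_origin(x, y)
print("general line: intercept", a, "slope", b)
print("through origin: slope", b0)
```

When the data have a nonzero intercept, forcing the line through the origin distorts the slope, which is why testing the intercept before dropping it matters.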

December 3, Tuesday:
12th lecture:
Optimisation of weights to give MVU estimation. Implicit regression. Overview of the conditions for the validity of LSQ estimation. Multivariate analysis: overview of multivariate methods.

December 10, Tuesday:
13th lecture:
Multivariate analysis. Overview of multivariate methods.

December 17, Tuesday:
14th lecture:
Supplementary lecture, if needed.