Methods of Applied Statistics lecture

Accredited Ph.D. course

2022/2023 Autumn semester

Lectures delivered by Ernő Keszei, emeritus professor of chemistry, Room No. 148, phone: 372-2500 / ext. 1904, keszei-AT-chem.elte.hu

#### Course description and recommended reading.

Time and location: Tuesday 4:00 to 5:30 PM, Room 132, Chemistry Building

Exam dates: to be discussed; in December and January

Exams are expected to be held in Room No. 148, Chemistry Building

Actual schedule:

September: 12, 19
October: 3, 10, 17, 24, 31
November: 7, 14, 21, 28
December: 5, 12, 19, 26

September 12, Tuesday:
Registration week

September 19, Tuesday:
Negotiation on lecture time

September 26, Tuesday:

1st lecture:
Probability theory basics I. - Random experiment, outcome, sample space, event. Postulates (axioms) of probability theory and some properties following from them. Independence and conditional probability. Probability of independent simultaneous events. Interpretations of probability: frequency, classical probability and geometric probability. Random variables. Simple events concerning discrete and continuous random variables. Sampling distributions. Properties of the probability density function. Relation of the probability distribution function to the probability density function. Calculation of probabilities based on the probability distribution function and on the probability density function.
Textbook pp. 1 to 29.
Suggested literature: Paradoxes in Probability Theory and Mathematical Statistics. Auxiliary material: Probability basics demo.

October 3, Tuesday:
2nd lecture:
Probability theory basics II. - Expectations and their properties. Calculation of the expectation as a linear operation. Some particularly important expectations (distribution mean, distribution moments, variance, covariance, correlation coefficient, entropy). Some important relations for calculating expectations. The covariance matrix and its properties. Relation between independence of random variables and their covariance. Calculation of the (normalised) probability density function when it is not known. The law of large numbers. Stochastic convergence and random walk.
To study: An Excel worksheet to explore random walk and fluctuations. A detailed explanation of the waiting-time paradox. Textbook pp. 22 to 45.
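For those who prefer a script to a worksheet, the following is a minimal Python sketch of a symmetric one-dimensional random walk. It is not part of the course material; the function names `random_walk` and `running_mean_step` are illustrative assumptions. It shows the two behaviours named in the lecture: the mean step tends to zero (law of large numbers), while the position itself keeps fluctuating.

```python
import random


def random_walk(n_steps, seed=None):
    """Return the positions of a symmetric 1-D random walk starting at 0."""
    rng = random.Random(seed)
    positions = [0]
    for _ in range(n_steps):
        # Each step is +1 or -1 with equal probability.
        positions.append(positions[-1] + rng.choice((-1, 1)))
    return positions


def running_mean_step(positions):
    """Mean step size after k steps (position / k) for k = 1..n."""
    return [pos / k for k, pos in enumerate(positions[1:], start=1)]


if __name__ == "__main__":
    walk = random_walk(10_000, seed=1)
    means = running_mean_step(walk)
    print("final position:", walk[-1])
    print("mean step after 10, 100, 10000 steps:", means[9], means[99], means[-1])
```

Running it with different seeds gives different final positions, whereas the mean step after many steps stays close to zero in each run.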

October 10, Tuesday:
3rd lecture:
Probability theory basics III. - Some important probability distributions. The binomial and hypergeometric distributions. Poisson process: interrelation of the uniform, the Poisson and the exponential distributions. Gamma distribution. Normal distribution.
To study:
Textbook pp. 40 to 57.
Homework:
Auxiliary material:
Arcsin distribution: description of use / R-code to run.
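The link between the exponential and the Poisson distributions mentioned in this lecture can be checked numerically. The sketch below is a hedged illustration, not course material: it assumes events separated by independent exponential waiting times and counts how many fall in a fixed window. The names `count_events` and `simulate_counts` and the parameter values are chosen here for illustration only. For a Poisson-distributed count, the sample mean and the sample variance should both be close to rate × window. The role of the uniform distribution is not covered by this sketch.

```python
import random


def count_events(rate, window, rng):
    """Count events in [0, window) when waiting times are exponential(rate)."""
    t = rng.expovariate(rate)
    count = 0
    while t < window:
        count += 1
        t += rng.expovariate(rate)
    return count


def simulate_counts(rate, window, n_windows, seed=None):
    """Repeat the counting experiment in n_windows independent windows."""
    rng = random.Random(seed)
    return [count_events(rate, window, rng) for _ in range(n_windows)]


if __name__ == "__main__":
    counts = simulate_counts(rate=2.0, window=1.0, n_windows=20_000, seed=0)
    mean = sum(counts) / len(counts)
    var = sum((c - mean) ** 2 for c in counts) / (len(counts) - 1)
    # Both values should be near rate * window = 2 for a Poisson count.
    print("sample mean:", mean, "sample variance:", var)
```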

October 17, Tuesday:
4th lecture:
Probability theory basics IV. - Properties and use of the Normal distribution. Construction of a p.d.f. from physical calculations.
Homework:
Mean and variance of a normal distribution.

October 24, Tuesday:
5th lecture:
Introduction to statistical methods. Chi-squared, Student's t, and Fisher's F distribution. Distributions without a maximum. The arcsin distribution.
The aim and methods of statistics. Population vs. sample. Sampling. Estimation and characteristics of estimators. Estimation methods. Histograms. Sample statistics. Sample mean, sample variance and covariance.
Auxiliary material: Problems concerning sampling in a Hungarian election and a U.S. election. A short appetizing paper and a deeper analysis on cognitive bias (or "self fooling") from Nature.
Homework:
Histogram construction.

October 31, Tuesday:

No lecture:
Autumn holiday

November 7, Tuesday:

6th lecture: Estimators, estimation and estimates. Expected properties of estimators. Methods of estimation. Histograms. The method of maximum likelihood and a few actual applications. Further examples concerning maximum likelihood.
Auxiliary material: Assignment for histogram construction.

November 14, Tuesday:

7th lecture:
Estimation: The method of least squares. The method of moments. Other estimation methods. Estimation of expectation and variance of functions of random variables.
Confidence intervals. Formulation of confidence in a computable form. Confidence interval for the expectation of a normal distribution with known variance. Confidence interval for the expectation of a normal distribution with unknown variance. Confidence interval for the parameters of a binomial distribution.
Auxiliary material: Assignment for parameter estimation using the method of moments.
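As an illustration of the known-variance case listed for this lecture, here is a minimal Python sketch of the two-sided confidence interval for the expectation, mean ± z·σ/√n. It is an assumption-laden example rather than the course's own material: the function name `mean_ci_known_sigma`, the sample values and σ = 0.5 are made up for the demonstration. The unknown-variance case would use a Student's t quantile instead of the normal quantile and is not shown.

```python
from math import sqrt
from statistics import NormalDist, fmean


def mean_ci_known_sigma(sample, sigma, confidence=0.95):
    """Two-sided confidence interval for the mean when sigma is known."""
    n = len(sample)
    # Standard normal quantile, e.g. about 1.96 for 95 % confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    centre = fmean(sample)
    half_width = z * sigma / sqrt(n)
    return centre - half_width, centre + half_width


if __name__ == "__main__":
    data = [9.8, 10.1, 10.3, 9.9, 10.4]  # hypothetical measurements
    print(mean_ci_known_sigma(data, sigma=0.5))
```

Raising the confidence level widens the interval, and quadrupling the sample size halves it.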

November 21, Tuesday:
8th lecture:
Confidence interval for the parameters of a binomial distribution. Confidence interval for the variance. Approximate confidence interval for a function of random variables. Confidence interval for the difference of the expectations of two random variables.
Statistical hypothesis testing - general considerations. Null hypothesis and alternative hypothesis. Types of hypotheses. Statistics underlying decision making. Type I and type II errors. Power function of the test.
Auxiliary material:
Common confusions concerning hypothesis testing. Assignment concerning confidence intervals: 1. Confidence of mean and variance. 2. Confidence and comparison of two means.

November 28, Tuesday:
9th lecture:
Statistical hypothesis testing: Null hypothesis and alternative hypothesis. Types of hypotheses. Statistics underlying decision making. Type I and type II errors. Test on the mean of a normal distribution with known and unknown variance. Test on the parameter p of a binomial distribution ("test on proportions"). Tests between means drawn from two samples in the case of normal distributions and binomial distributions ("tests on differences").
Auxiliary material:
Assignment on the confidence interval of proportion differences; Assignment on testing a mean.

December 5, Tuesday:
10th lecture:
Tests on matching pairs. Nonparametric tests. The sign test; the Mann-Whitney and Wilcoxon (rank-sum) test. Tests on several means: One-way ANOVA and two-way ANOVA tests. Multivariate analysis of variance (MANOVA). Homoscedasticity tests (tests on variances). Functional relations between random variables. Testing the correlation coefficient.
Auxiliary material:
A CU Boulder leaflet on MANOVA; Assignment on testing two means. Assignment on 1-way ANOVA, assignment on 2-way ANOVA.

December 12, Tuesday:
11th lecture:
Estimation of parameters describing functions of random variables. The general straight line and the straight line through the origin. Testing the difference between the two cases: the significance test of the intercept. Weighted least squares (LSQ) estimation. Optimisation of weights to give MVU estimation. Implicit regression. Overview of the conditions for the validity of LSQ estimation. Multivariate analysis: a short overview of multivariate methods.
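The two straight-line models named above can be sketched with the closed-form unweighted least squares estimates. This is a minimal illustration under the usual assumption of error-free x values and equal-variance errors in y; the function names `fit_line` and `fit_line_through_origin` are illustrative, and the weighted case and the significance test of the intercept are not shown.

```python
def fit_line(xs, ys):
    """Least squares fit of the general line y = a + b*x; returns (a, b)."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return intercept, slope


def fit_line_through_origin(xs, ys):
    """Least squares fit of y = b*x (no intercept); returns b."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


if __name__ == "__main__":
    xs = [0.0, 1.0, 2.0, 3.0]
    ys = [1.0, 3.0, 5.0, 7.0]  # points lying exactly on y = 1 + 2x
    print(fit_line(xs, ys))
    print(fit_line_through_origin(xs, ys))
```

Fitting both models to the same data and comparing the results is the starting point for deciding whether the intercept is needed.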

December 19, Tuesday:
No lecture