Welcome to the home page for the PsyPag-MSCP-Section Simulation Summer School, which ran from 4th to 30th June 2021. The schedule and embedded videos are below.
We were overwhelmed by the level of interest in both the live sessions and the videos, and would definitely consider running another school (not necessarily in the summer). If you would be interested in delivering a session, please do get in contact at SimSummerSchool[a][t]gmail.com. We would be particularly keen for attendees to deliver sessions, extending what they have learned this year!
A general prerequisite for the summer school is a basic familiarity with R and RStudio. Fortunately, there are numerous videos on YouTube to get you started - see below for a playlist by Andy Field and a video by Dorothy Bishop. Individual sessions may have their own prerequisites, so check those too!
Friday 4th June, Time (UTC+1) 13:00, Duration 2 hours
In this workshop I will introduce the {faux} R package for simulating factorial designs from existing data or data parameters. I will also cover adding continuous predictors to existing data and simulating multiple datasets for simulation studies.
Monday 7th June, Time (UTC+1) 13:00, Duration 1.5 hours
This will be a really basic introduction to random number generation, loops, and functions, which will stand you in good stead for the remainder of the workshops. We will look at simulating independent samples and running t-tests on them thousands of times to get an idea of the distribution of p-values when there is a known effect, and when there is not.
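To make the idea concrete, here is a minimal sketch in Python (the workshop itself uses R). It is not the workshop's code: the effect size, sample size, and function names are illustrative assumptions, and because the Python standard library has no t-test, it swaps in a two-sample z-test with a known standard deviation of 1.

```python
import random
from statistics import NormalDist, fmean


def simulate_p_values(effect, n_per_group=50, n_sims=5000, seed=1):
    """Simulate two independent groups many times and return the p-values.

    Assumption: both groups have a known SD of 1, so a z-test stands in
    for the t-test described in the session abstract.
    """
    rng = random.Random(seed)
    std_normal = NormalDist()
    se = (2 / n_per_group) ** 0.5  # SE of the mean difference when SD = 1
    p_values = []
    for _ in range(n_sims):
        group_a = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        group_b = [rng.gauss(effect, 1.0) for _ in range(n_per_group)]
        z = (fmean(group_b) - fmean(group_a)) / se
        p_values.append(2 * (1 - std_normal.cdf(abs(z))))  # two-sided p
    return p_values


def proportion_significant(p_values, alpha=0.05):
    """Share of simulated p-values below alpha."""
    return sum(p < alpha for p in p_values) / len(p_values)


null_rate = proportion_significant(simulate_p_values(effect=0.0))
power = proportion_significant(simulate_p_values(effect=0.5))
print(f"No effect: {null_rate:.3f} significant; effect of 0.5: {power:.3f}")
```

With no true effect, the p-values are spread roughly evenly between 0 and 1, so about 5% fall below 0.05. With a true effect, they pile up near zero, and the share below 0.05 is an estimate of power.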
Wednesday 9th June, Time (UTC+1) 16:00, Duration 1.5 hours
This workshop will provide a basic introduction to power analyses for regression using simulation in R.
Friday 11th June, Time (UTC+1) 13:00, Duration 2 hours
In this session we will simulate participants’ item-level response data, where the observed response is binary (0 = incorrect, 1 = correct). We will look at this in the context of a multilevel design, where the dataset includes multiple observations of the same participants and the same items. To do this, we will use a logistic multilevel model to describe the response generating process. Building from a simple model of the response generating process, we will look at including single and multiple predictors which change the log odds of observing a correct response (i.e. effects). Random intercept and effect variances, and the covariance between these, will also be explored. We will also look at how we could estimate the statistical power to detect an effect of interest according to our defined model, for a given experimental design. The freely available programs R and RStudio will be used to code and run the simulations. All code for the simulations in this session will be available.
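As a rough illustration of such a response generating process, the sketch below is written in Python (the session itself uses R). It covers only random intercepts for participants and items plus one binary predictor. Random effect variances for slopes and their covariances are left out, and every parameter value and name is a hypothetical choice rather than the session's own.

```python
import math
import random


def inv_logit(x):
    """Convert log odds to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def simulate_responses(n_subj=40, n_item=30, intercept=0.0, effect=0.8,
                       sd_subj=1.0, sd_item=0.5, seed=11):
    """Return rows of (participant, item, condition, correct).

    All parameter values are hypothetical. Each item belongs to one
    condition (0 or 1) and every participant sees every item.
    """
    rng = random.Random(seed)
    subj_intercepts = [rng.gauss(0.0, sd_subj) for _ in range(n_subj)]
    item_intercepts = [rng.gauss(0.0, sd_item) for _ in range(n_item)]
    rows = []
    for s in range(n_subj):
        for i in range(n_item):
            condition = i % 2  # alternate items between the two conditions
            log_odds = (intercept + effect * condition
                        + subj_intercepts[s] + item_intercepts[i])
            correct = 1 if rng.random() < inv_logit(log_odds) else 0
            rows.append((s, i, condition, correct))
    return rows


rows = simulate_responses()
accuracy = {
    c: sum(r[3] for r in rows if r[2] == c) / sum(1 for r in rows if r[2] == c)
    for c in (0, 1)
}
print(accuracy)
```

Fitting a logistic multilevel model to many such simulated datasets, and counting how often the effect is detected, would give a simulation-based power estimate.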
Monday 14th June, Time (UTC+1) 16:00, Duration 1.5 hours
A common design for psychological experiments is a comparison of measures between groups of participants based on their experimental condition. This design is analogous to the ANOVA statistical model. Using the mathematically-defined F-distribution is an opaque way of operationalizing ANOVA designs. Instead, this workshop focuses on simulation-based approaches and will teach attendees how to utilize R and simulation to compare experimental data to the expectations based on candidate hypotheses. Topics covered will include (1) specifying a detailed design model, (2) identifying random sources of variation, (3) identifying and quantifying potential experimental sources of variation, (4) simulating results under candidate models, and (5) drawing conclusions.
Wednesday 16th June, Time (UTC+1) 17:00, Duration 1.5 hours
In this workshop, we will introduce the basic statistical foundation of the Monte Carlo method and discuss how it is used to evaluate properties of estimators (e.g., unbiasedness, efficiency, robustness). An illustrative example will be used to compare the efficiency of the median relative to the mean. I will also discuss the concept of Monte Carlo simulation error, and touch on simulating multivariate normal data for structural equation modeling.
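The median-versus-mean comparison can be sketched as follows, in Python rather than the R used in the workshop. The sample size, number of replications, and the choice of standard normal data are assumptions made for illustration only.

```python
import random
from statistics import fmean, median, pvariance, stdev


def compare_estimators(n=25, n_reps=5000, seed=2021):
    """Draw n_reps normal samples of size n; return their means and medians."""
    rng = random.Random(seed)
    means, medians = [], []
    for _ in range(n_reps):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        means.append(fmean(sample))
        medians.append(median(sample))
    return means, medians


means, medians = compare_estimators()

# Relative efficiency of the median: ratio of the sampling variances.
# Values below 1 mean the median is the noisier estimator for these data.
relative_efficiency = pvariance(means) / pvariance(medians)

# Monte Carlo simulation error of the estimated bias of the mean:
# the SD of the replicated estimates divided by sqrt(number of replications).
mc_error_of_bias = stdev(means) / len(means) ** 0.5

print(f"Relative efficiency of the median: {relative_efficiency:.2f}")
print(f"Estimated bias of the mean: {fmean(means):.4f} "
      f"(Monte Carlo SE {mc_error_of_bias:.4f})")
```

For normal data, the large-sample relative efficiency of the median is 2/π (about 0.64), so the simulated ratio should land in that neighbourhood. Increasing `n_reps` shrinks the Monte Carlo error, which is the point of reporting it.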
Friday 18th June, Time (UTC+1) 13:00, Duration 2 hours
In this workshop, I will cover a more general approach to demonstrate how you can take advantage of simulation to make a more effective pre-registration. I will use an R Markdown template that makes it easy to include a simulated power analysis (covered by other workshops) and planned analysis steps using simulated data. The first hour will be dedicated to a walk-through of creating a pre-registration using simulated data. The second hour will be dedicated to working on your own pre-registration, with the opportunity to troubleshoot.
Monday 21st June, Time (UTC+1) 17:00, Duration
I will discuss two sets of simulations that evaluated algorithms that were in use to give grades to schools. For both, participants will construct the code for the simulations (with guidance from the instructor) and run the simulations. We will discuss whether governments should use simulation of algorithms prior to their implementation if they are not sure how they will perform.
Wednesday 23rd June, Time (UTC+1) 13:00, Duration 2 hours
The conclusions we can draw from data depend on our knowledge about the data generating process. We will simulate data according to an assumed model and discuss the conclusions based on linear regression. With that we will get further insights into Simpson’s and Berkson’s paradoxes.
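Berkson’s paradox is easy to reproduce by simulation. The sketch below is in Python (the workshop uses R), and the selection rule is a hypothetical one chosen for illustration. Two variables are generated independently, yet a regression on a selected subsample shows a clear negative slope.

```python
import random
from statistics import fmean


def ols_slope(x, y):
    """Slope of the least-squares regression of y on x."""
    mx, my = fmean(x), fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


rng = random.Random(42)
n = 20000

# Assumed data generating process: x and y are independent standard normals.
x = [rng.gauss(0.0, 1.0) for _ in range(n)]
y = [rng.gauss(0.0, 1.0) for _ in range(n)]

# Hypothetical selection rule: a case is only observed if x + y exceeds 1.
selected = [(a, b) for a, b in zip(x, y) if a + b > 1.0]
sel_x = [a for a, _ in selected]
sel_y = [b for _, b in selected]

slope_all = ols_slope(x, y)
slope_selected = ols_slope(sel_x, sel_y)

print(f"Slope in the full population: {slope_all:.3f}")
print(f"Slope in the selected sample: {slope_selected:.3f}")
```

The full-population slope is close to zero, as the generating model implies. The negative slope in the selected sample comes entirely from conditioning on the sum of the two variables, which is why knowing the data generating process matters before interpreting a regression.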
Friday 25th June, Time (UTC+1) 17:00, Duration 2 hours
This workshop introduces the NOrmal To Anything (NORTA) method as a general technique to simulate a variety of non-normal distributions where the researcher controls the correlation structure of the data (i.e., the population correlation matrix) as well as the non-normality of the univariate, marginal distributions (both continuous and discrete). It begins with a quick overview of the multivariate normal distribution and how it can be altered to allow the user to either control the skewness and excess kurtosis (for continuous data) or the number of categories and frequency (for discrete data). It aims to connect the NORTA approach to popular techniques such as the 3rd order polynomial transformation and other algorithms which are commonly used in simulation studies in psychology, education and the social sciences.
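The core NORTA steps can be sketched in a few lines: draw correlated normals, map them to uniforms with the normal CDF, then apply the inverse CDF of each target marginal. The Python sketch below is not the workshop's R code. It assumes one exponential (skewed, continuous) marginal and one four-category ordinal marginal, with a hypothetical latent correlation and hypothetical category proportions.

```python
import math
import random
from statistics import NormalDist, fmean


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = fmean(x), fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def norta_sample(rho, n, seed=7):
    """Simulate an exponential and an ordinal variable from correlated normals."""
    rng = random.Random(seed)
    phi = NormalDist().cdf
    cuts = (0.5, 0.8, 0.95)  # hypothetical cumulative category proportions
    skewed, ordinal = [], []
    for _ in range(n):
        # Step 1: bivariate normal with latent correlation rho.
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        # Steps 2 and 3 for the continuous marginal: the exponential(1)
        # inverse CDF is -log(1 - u), and 1 - phi(z1) equals phi(-z1).
        skewed.append(-math.log(phi(-z1)))
        # Steps 2 and 3 for the discrete marginal: cut the uniform into bins.
        u2 = phi(z2)
        ordinal.append(1 + sum(u2 > c for c in cuts))
    return skewed, ordinal


skewed, ordinal = norta_sample(rho=0.6, n=20000)
observed_r = pearson(skewed, ordinal)
print(f"Latent correlation 0.60, observed correlation {observed_r:.2f}")
```

The observed Pearson correlation is smaller than the latent one because the marginal transformations are non-linear. In practice the latent correlation therefore has to be adjusted so that the transformed variables hit the intended population correlation.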
Monday 28th June, Time (UTC+1) 13:00, Duration
Posterior predictive checks are used to evaluate model fit and model assumptions in Bayesian (MCMC-based) analyses. They involve drawing samples from the posterior distribution to generate hypothetical data sets, and then comparing these data sets with the observed data to identify discrepancies between the model’s predictions and the reality of the data. Posterior predictive checks are particularly useful in complex models where the assumptions of the model are not very easy to check. In contrast to posterior predictive checks, prior predictive checks are very useful for clarifying the assumptions behind a choice of priors. Again, hypothetical data sets are generated, and these make it clear what kinds of data the prior distribution does and does not assume.
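A posterior predictive check can be sketched with a deliberately simple model, here in Python rather than R. The observed sequence below is made-up illustrative data. The model assumes independent Bernoulli trials with a uniform Beta(1, 1) prior, so the posterior is sampled directly from its conjugate Beta form instead of by MCMC. The test statistic, the number of switches between 0 and 1, is chosen to probe the independence assumption.

```python
import random


def count_switches(seq):
    """Number of times consecutive values in the sequence differ."""
    return sum(a != b for a, b in zip(seq, seq[1:]))


# Hypothetical observed data: long runs, which independence rarely produces.
observed = [1] * 5 + [0] * 5 + [1] * 5 + [0] * 5
successes = sum(observed)
failures = len(observed) - successes
observed_switches = count_switches(observed)

rng = random.Random(123)
n_draws = 4000
replicated_switches = []
for _ in range(n_draws):
    # Draw a success probability from the Beta posterior (Beta(1, 1) prior).
    theta = rng.betavariate(1 + successes, 1 + failures)
    # Generate a hypothetical data set of the same length under the model.
    y_rep = [1 if rng.random() < theta else 0 for _ in observed]
    replicated_switches.append(count_switches(y_rep))

# Posterior predictive p-value: how often replicated data show as few
# switches as the observed data.
ppp = sum(s <= observed_switches for s in replicated_switches) / n_draws
print(f"Observed switches: {observed_switches}, posterior predictive p = {ppp:.3f}")
```

A very small value signals that the model's independence assumption is at odds with the data. Drawing `theta` from the prior instead of the posterior turns the same loop into a prior predictive check.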
Wednesday 30th June, Time (UTC+1) 14:00, Duration 2 hours
The purpose of this workshop is to demonstrate how to write safe, effective, and intuitive R code for Monte Carlo simulation experiments containing one or more simulation factors. A few of the attractive Monte Carlo simulation coding strategies we will cover are: 1) How to write code which is intuitive to read, write, and debug; 2) How to take advantage of SimDesign’s built-in features for creating flexible and extensible simulations; 3) Computational efficiency; 4) Reproducibility at the macro and micro level; 5) Safe and reliable code execution.