Recently I’ve been thinking a lot about teaching in terms of both content and pedagogy, particularly in data science education and how to achieve specific learning objectives therein (see for example the intro data science course I did recently and my thoughts thereon). I thought I would quickly jot down some of my thoughts and resources, as they could be useful to refer to later.

I’ve been reading ‘An Introduction to Statistical Learning with Applications in R’ (ISLR) by Gareth James, Daniela Witten, Trevor Hastie, and Robert Tibshirani. I also read ‘50 Years of Data Science’ by David Donoho. Both gave interesting perspectives on the concepts of data science. Dot points 2 and 3 on page 8 of ISLR resonated with me in particular, while Donoho gives great historical insight into the relationship between Data Science and Statistics as fields, and as professions that serve perhaps overlapping roles.

One interesting realisation I had while reading ISLR was that in Chapter 2 they mention in passing that the idea that a model with a lower training MSE might be considered a better model to use for prediction is ‘intuitive’. This was fascinating because to me this is the opposite of intuitive, and it helped me remember that my intuition has been trained and modified by many years of study and experience, something the students won’t share (not to mention that this particular intuition is one shared by many experienced statisticians). This reminded me of the importance of formative assessment to determine students’ perceptions and understandings of concepts, rather than assuming them when designing content and pedagogy. Often their needs will be unknown to me unless I ask them first. It is NOT obvious, at least to me, what misconceptions they might naturally lean towards. As such, it is important to discover these.
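
To make the training MSE point concrete, here is a minimal sketch in Python using NumPy. It is a toy example and is not taken from ISLR: the sine-shaped "true" function, the noise level, the sample sizes, the polynomial degrees and the random seed are all arbitrary assumptions chosen for illustration. It fits polynomials of increasing flexibility to a small noisy sample and reports the MSE on both the training data and a separate test sample.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, only for reproducibility


def true_f(x):
    # Assumed "true" relationship; purely illustrative.
    return np.sin(3 * x)


def make_data(n):
    # Draw n points with additive Gaussian noise (sd 0.3 is an assumption).
    x = rng.uniform(-1, 1, n)
    y = true_f(x) + rng.normal(0, 0.3, n)
    return x, y


def mse(y, y_hat):
    # Mean squared error between observed and predicted values.
    return float(np.mean((y - y_hat) ** 2))


x_train, y_train = make_data(30)    # small training sample
x_test, y_test = make_data(1000)    # separate sample the model never sees

degrees = [1, 3, 5, 10, 15]
train_mse, test_mse = {}, {}
for d in degrees:
    # Least-squares polynomial fit of degree d to the training data only.
    model = np.polynomial.Polynomial.fit(x_train, y_train, d)
    train_mse[d] = mse(y_train, model(x_train))
    test_mse[d] = mse(y_test, model(x_test))
    print(f"degree {d:2d}: train MSE {train_mse[d]:.3f}, test MSE {test_mse[d]:.3f}")
```

Because each higher-degree polynomial contains the lower-degree ones as special cases, the training MSE can only go down as flexibility increases. The test MSE is under no such constraint, and in runs like this it typically starts to rise again once the model begins fitting the noise. That gap is the reason choosing a model by its training MSE is not a safe default.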

A couple of resources I’ve come across recently: