Preface
To the Student
If you like math and are interested in learning what statistics is all about, you will likely enjoy and do well in your introductory statistics course. If your prior experiences in math courses have caused you to have an aversion to anything resembling math, I want to convince you to adjust your attitude at the start of your statistics course. Although statistics looks a lot like a math course, I do not take a mathematical approach to teaching statistics in this book. Of course, there are formulas, but they are generally simple and do not require advanced math. Most importantly, the formulas are readily comprehensible if you make the effort to understand them. And that is my real message: Your goal should always be to understand the concepts and procedures. If you do that, you may surprise yourself in your statistics course.
My goals for you can be summarized briefly: (1) I want you to understand tables and graphs and basic measures of central tendency (i.e., mean, median, mode) and dispersion (i.e., range, variance, standard deviation) so that you can make sense of them in the many contexts in which you will encounter them. (2) I want you to understand what a sampling distribution of a statistic is and how it is central to all the procedures known as “inferential statistics.” (3) I want you to understand the purpose and logic of the procedure known as “hypothesis testing.” (4) I want you to understand the concept of the power of a statistical test. (5) I want you to understand the information provided by measures of “effect size.” And (6) I want you to understand the purpose and logic of the procedure known as “confidence intervals.” If you complete your course with a reasonably deep understanding of these six topics, you can consider yourself literate with respect to knowledge of statistics.
Although the book does not take a mathematical approach to statistics instruction, the material is not easy. For most students in the behavioral sciences, statistical thinking is novel, so you will need to work to make sense of the concepts and procedures. The first chapter of the book contains recommendations for how to use the book. Don’t blow by these suggestions! Read them, think about them, put them into practice. If you do, you will learn a great deal in your course. Yes, it will take time, effort, and persistence to follow the recommendations. I make no apologies for that: I want you to get the most out of your course and I am confident you will if you follow the suggestions.
To the Instructor
Teaching Philosophy
My teaching philosophy shaped the textbook in many ways, so I share my perspective briefly because it may help you to understand choices I made in coverage, organization, and presentation.
The Goal Is Understanding
In the book, statistical formulas and concepts are developed verbally, not mathematically. The information synthesized by each formula is explained, and the interpretation of the number that results from the calculation is given. The logic of each major inferential procedure (e.g., hypothesis testing, effect size estimation, confidence intervals, and power) is carefully developed, and each procedure is encountered in multiple contexts (i.e., analysis of frequencies, analysis of means, analysis of variance, correlation, and regression). Chapter-ending exercises and test bank items include elaborated answers that emphasize interpretation as well as execution. “Learning Questions” begin major chapter sections, and “Key Takeaways” conclude chapter sections, and both emphasize concepts and the logic of procedures.
Maintaining a focus on comprehension is difficult with a student population composed of many students who have learned to approach math in rote fashion, where formulas are to be memorized and numbers are the input and output of procedures. Memorization is the enemy of comprehension. In my own teaching, I never require memorization of any content. At exam times, students prepare “cheat sheets” with all the formulas and examples they want to include. This exercise reinforces the message that the goal is understanding, and it has the added benefit of encouraging students to review and pull together the material from the tested section.
Respect Students’ Abilities to Understand Statistical Concepts
You are well aware that many majors in the behavioral sciences are intimidated by statistics. Some authors address students’ fear by trying to “dumb down” the material, emphasizing the execution of procedures over comprehension of statistical concepts and procedures. I believe this approach does students a great disservice: They learn very little of value, and most importantly, it underestimates their capabilities (even though they may not initially be aware of those abilities). My textbook is not a cookbook. Formulas are given and procedures are illustrated, but the exposition attempts to consistently focus more on “why” than “how.” I try to adopt a conversational tone and keep the conversation about making sense of things.
Depth of Understanding Is More Important than Breadth of Coverage
The book covers most topics that are covered in conventional introductory texts. However, I will admit that I have never taught ANOVA and often do not cover some of the inferential procedures in correlation and regression that are covered in the textbook. I teach the first-year graduate course in ANOVA so my decision to exclude coverage of ANOVA in my undergraduate course is most definitely a reflection of my belief that a first course in statistics should not be about how many procedures can be taught in the course. What is the point? Students who continue their study of statistics and use statistics in their work will learn all the procedures they need to know. The great majority of students who will not take another course in statistics will forget all the procedural details. But if they have learned basic descriptive statistics, the role of probability in inference, and the logic of the core inferential procedures, they will be able to function much more effectively in a world that frequently presents them with tables and graphs, means and medians, ranges and standard deviations, polling results, and references to studies that show no difference between Brand X and Brand Y.
My goals for students taking an introductory course are threefold: First, students should know basic descriptive statistics, including how to present and use tables and graphs and the common measures of properties of distributions. Second, students should understand that the process of inferring population characteristics from sample characteristics entails risk because no sample is a perfect reflection of the population from which it was drawn. Therefore, the process of inference requires the ability to compute probabilities. We can compute probabilities because we know the sampling distributions of certain sample statistics under specified conditions. Third, students should come away with a good understanding of the logic and interpretation of the four major inferential procedures in conventional statistical applications; namely, hypothesis testing, confidence intervals, effect size estimation, and power. In my experience, most first-year graduate students have gaps in their understanding of the major inferential procedures, and it is a rare student who has a coherent understanding of the role of probability and sampling distributions in statistical inference. So if you achieve the three broad goals just listed, you have created a very successful course!
Emphasize the Logic of the Major Inferential Procedures
The logic of hypothesis testing, power, confidence intervals, and effect size measures is the same regardless of whether the sample statistic is a mean, a difference between means, a variance, or the frequency of occurrence of some category of observations. The common logic of the core inferential procedures is emphasized across each sample statistic that is treated in the book: analyses of frequency data based on the chi-square distribution; analyses of means based on the normal and t-distributions; analysis of variance based on the F-distribution; and analyses of correlation and regression statistics based on the t-distribution. By repeatedly returning to the core logic of a procedure in new domains, understanding of the logic will become deeper and more elaborated for the student.
Integrate Related Procedures within Domains
A common failing of instruction in statistics is that we tend to “silo” our treatment of statistical procedures. As each new procedure is introduced, it is developed and illustrated in the text . . . then the book moves on to the next procedure. Take the example of procedures for analyzing means. In my experience, students are easily confused about when to apply the normal and when to use t. They also struggle to identify the design of a study described in a problem. It is understandable that students have these confusions because there are many similarities among the procedures. They should have instruction in the important distinctions when identifying what procedure is relevant in a particular situation. And they should have opportunities to practice making those distinctions. Thus, the book includes supplementary “integration and application” chapters. Because of concerns about book length, these optional additional chapters are placed in Appendix B “The Research Study”. These chapters present real studies that ask multiple questions and require application of many of the procedures covered in the corresponding section of the book. The chapters discuss the relevant considerations in selecting a procedure and illustrate full analyses of relatively complex datasets. If you decide not to cover a topic like ANOVA in your course, you may create the room in your course to use (some of) the integration and application chapters to help consolidate the topics you choose to teach.
Organization and Content
The organization of the content into an initial long section on univariate statistics followed by a short section on bivariate statistics is unusual. Most books begin with a few chapters on descriptive statistics, including correlation and regression. Those books also tend to place any coverage of bivariate inferential procedures toward the end of the book. By that point, students have forgotten how to compute correlation and regression statistics and must be reminded. Also, they are left wondering what that correlation and regression stuff was all about as they spend weeks studying univariate inferential procedures. Thus, my decision was to keep the bivariate descriptive and inferential procedures grouped together at the end of the text.
The initial section of the book on descriptive statistics is unusual in a second, minor way. Namely, there is no presentation of standard scores (i.e., Z-scores). Instead, Z-scores are treated in the context of the normal distribution when the standardized normal distribution is introduced.
Another way in which the organization of the book is a bit unusual is that chi-square is not presented as the last chapter of the book. Consistent with the overall organization of chapters into univariate and bivariate procedures, chi-square is the last chapter of the long section on univariate statistics.
Finally, a couple of topics that are covered in many introductory textbooks were omitted from this book. First, many authors choose to cover a variety of nonparametric tests (e.g., Wilcoxon signed ranks), along with chi-square procedures. I have chosen not to do so for both practical and pedagogical reasons. From a practical point of view, these procedures are not very widely used. Should a student have occasion to need to use a nonparametric procedure, the procedures are easily learned and widely available in software packages. From a pedagogical perspective, the nonparametric tests are generally “one off” topics. They are typically quite limited in their applicability. More importantly from my perspective, studying the procedures does not reinforce and elaborate understanding of the logic of the inferential procedures of hypothesis testing, power, effect size estimation, and confidence intervals.
The second omission that may be inconvenient for some instructors concerns coverage of procedures based on the normal distribution. Specifically, only single sample designs are covered for the normal distribution (i.e., hypothesis testing, confidence intervals, Cohen’s d, and power). Procedures for analyzing independent samples designs using the normal are not included in the text. This decision was made because the t-distribution is much more useful in practice than procedures based on the normal. Including coverage of independent samples designs with the normal distribution simply does not seem like the best use of time.
Pedagogical Devices and Ancillary Material
The writing style in the book tries to engage the student by asking questions, directing students to do computations, and adopting a conversational tone that focuses on explaining concepts. In addition, several pedagogical devices are used to help students learn the material.
Key terms are highlighted, and their definitions are readily accessible so that students attend to important vocabulary and concepts.
Major sections of each chapter begin with a set of “Learning Questions” to orient students to their goals in studying the section.
Chapter sections conclude with “Key Takeaways” that summarize the answers to the learning questions. Key takeaway boxes are also used to bullet point steps in important procedures (e.g., hypothesis testing, power computations).
Chapter sections also include a self-study option. These consist of multiple-choice questions designed to test students’ understanding of the key concepts in the section and provide practice with procedures.
At the conclusion of each chapter, “Test Your Understanding” lists all the learning questions from the chapter. This section provides students with an opportunity to review the important ideas and procedures in the chapter. Answers to the questions are not provided with the questions; instead, students are expected to go back to the relevant section of the chapter to review the material because the effort to do so will reinforce learning more effectively than simply reading an immediately available answer.
Links to video clips are included in each chapter. These YouTube videos were selected to elaborate important concepts. They are not essential to the book, but many students will find them useful for clarifying and extending statistical concepts. The one caveat is that there occasionally are differences in terminology and notation. I have included comments to warn the student in those few cases where I am concerned that a video might introduce some confusion. (Obviously, in those cases, I thought the video’s strengths outweighed its flaws.)
Statistics cannot be learned without doing problems. Regular problem sets are a must for students to identify misunderstandings and consolidate their understanding. Every chapter concludes with a set of exercises designed to give practice in executing procedures and to push students to deepen their understanding of concepts. Answers are provided for you. You will find that the answers are generally full explanations rather than the final numerical results of calculations. In many cases, the answers address common errors that students make. Thus, the problems and their answers are an important learning opportunity for students.
Additional, open-ended problems are provided in the test bank. I recognize that scoring of open-ended problems on problem sets can be a challenge if you teach a large course. I am sure you recognize that doing open-ended problems is very beneficial to student learning. A system I use with my problem sets is to score each problem set on a 0, 1, 2, 3 point basis (not each problem, but the entire problem set). A score of 0 is assigned if the problem set is not turned in by a student; a score of 1 is for a submitted problem set that shows little effort; 2 is assigned for a conscientious effort that has many errors; 3 is assigned for a mostly correct problem set (i.e., a B-level effort or better). I strongly encourage students to work together and to seek help from me before the problem set is due. The scoring is lenient to encourage students to do the work in a timely fashion and to give them credit for consistent, conscientious effort. Because full, elaborated answers are provided after the problem sets are submitted, it is not necessary to give detailed feedback on each student’s problem set answers.
A test bank of multiple-choice problems is provided, along with the correct answers.
For those instructors who include SPSS as a component of their instruction, Appendix A contains an introduction to SPSS. In addition, most of the chapters include a section called “Using SPSS” that describes how to use SPSS to analyze a dataset presented in the chapter. These sections conclude with a problem or two that give students an opportunity to practice applying the procedures to a new dataset. The “Using SPSS” sections always follow the “Exercises” at the end of a chapter.
Finally, the text contains optional “integration and application” chapters in Appendices C–F. If you choose to use them, these chapters provide opportunities to make distinctions among closely related and similar procedures. Students struggle to decide whether, for example, to use a procedure based on the normal or t-distribution or the binomial or chi-square distribution. They often struggle to determine whether answering a question requires a test of a hypothesis about a single mean or a difference between two means. They can be confused about whether a study involves independent or matched groups. The integration and application chapters discuss potential sources of confusion and explain what the relevant distinctions are in choosing procedures. Further, the problems selected for illustration are real studies with all the complexities that entails. There are typically multiple analyses involved in completely analyzing a dataset, so the chapters also illustrate the process of carefully distinguishing the different types of questions to be asked about a study and the corresponding analyses appropriate for answering the questions.
A complete set of PowerPoint presentations (PPTs) has been created for every chapter in the book. The PPTs present how I structure my lectures. In my own course, I would introduce new examples of problems to increase the range of examples students see, but the included PPTs stick closely to the examples in the text to avoid ambiguity or confusion over the relationship between the PPTs and the text.
The Homework platform provides practice with procedures, extends students’ understanding to novel situations, and gets students to think more about important concepts.