The Importance of Evaluating Research

Winter 2018

By Carolyn Kost

Finding research and data has never been easier. Need a data point to support an argument or proposition you’re making in a presentation? A quick Google search can bring up a dizzying list of options. Finding quality research and data, however, has never been more difficult. We live in a time in which we are surrounded by fake news and untruths. Mass media articles often aim to titillate, escalate anxiety, or provoke some other emotion in order to appeal to the market. It’s not hard to see how research can be exploited to that end.

But in an era of big data, in which independent schools, like corporations and nonprofits, are seeking to create data-driven cultures and use research to drive decisions, the need for understanding how to evaluate and interpret research and data has never been more important. No matter how big or small a decision might seem, the research that is used to guide and support it must be sound and thorough. 
 

Research in Decision-Making  

When trying to determine, for example, changes to programs or procedures, school heads often rely on their own judgment, implement board-defined decisions, or leave teachers to their own devices. But these approaches are inadequate. Securing faculty buy-in is another decision-making strategy, but it often means that a committee leads discussions about what should occur and ends with a vote to determine the course of action. It is no more appropriate to vote on pedagogical practices in our schools than it is to vote on the validity of mathematical theorems, diagnoses of illness, or the laws of physics. As Karen Beerer writes in an April 2017 eSchool News article, “While consensus and collaborative decision-making is important, it can also be paralyzing to innovation.”

One of the most pressing demands of the Information Revolution in which we live is that staff—and students—be trained to engage in and recognize quality research. Mass media articles often cite anecdotal evidence garnered from the study of a single school or district rather than reporting serious research. It is not enough for senior administrators to occasionally share an article of interest or lead a discussion about the latest education-related bestseller. The fads that capture the headlines should provide an impetus to examine the research (or lack thereof) behind them.

Hundreds of peer-reviewed articles that examine and analyze topics related to the field of education and what we do on a daily basis are published each year. We must foster a culture of professional development and expect that educators, like professionals in every field, keep up with the pedagogical and discipline-specific research. Information can be disseminated and shared through online faculty and departmental platforms, such as a class on the existing LMS or even social media, to facilitate sharing and commenting. A minimum level of participation can be required, such as posting one article every month and at least three substantive comments on others’ posts. Discussing the research in small groups across and within disciplines should be part of faculty meetings as well.

Talking is not enough, however; implementation is key. Staff should test promising, research-based practices in their classrooms by designing their own experiments, forming and testing hypotheses, and sharing their work with colleagues and the broader community of educators.
 

Not All Research Is Good Research

We are “infoxicated” with data, yet the ability to conduct sound research is becoming a lost art. In 2012, C. Glenn Begley, the former head of global cancer research at Amgen, reported that his team was able to reproduce the findings of only six of 53 major cancer studies, putting countless lives at risk and wasting millions of dollars. In 2014, the National Institutes of Health famously announced a widespread crisis in the reproducibility of research, a cornerstone of the scientific method, due in large part to poor experimental design. Knowing how to evaluate a study is key. Here are several points to consider:

The intention of the study. Identify the study’s intended audience: educational decision-makers, parents, scientists, etc., and its sample population. If you are seeking information about various aspects of homework at the secondary level, for example, a study about the effects of homework at the primary or tertiary level is not necessarily applicable. Similarly, a study about the effects of a new homework strategy on inner-city, low-income, Asian-American K–12 students is not necessarily applicable to suburban elite high school students. To better focus the results, use the advanced search option and generate keywords by predicting the terms you are most likely to see on a useful page.

The authorship. Search the qualifications of the authors, but understand that credentials and experience do not automatically imply reliable claims or research design. Sometimes the biography will reveal a specific corporate, political, or ideological disposition.

The sponsor. Search for any sponsorship or funding details, bias, or commercial affiliation. Is the study funded by an entity (or an organization with ties to an entity) that stands to gain in any way from the results? This is crucial, but can be tricky to trace. 

The currency. The nature of science is that new information supplants previous studies. Is the data sufficiently current to make sense relative to the information problem? If the issue is summer slide, it’s fine to draw from a century of research, but if the issue is the impact of interactive e-books on student engagement, the data should be no more than five years old. What are logical date delimiters to use in an advanced search?

The size and selection of the sample. In general, randomized studies are stronger than those that are not, because hand-picked samples genuinely apply only to that sample. A sample of 45,000 students is more reliable than one of 100 because it reduces how much the sample may differ from the general population (the margin of error). What is the composition of the cohort? Is it a sufficiently random (or targeted) sample, cohort, or cross-section? Think about the ways in which the sample could affect the data. While a study based on a single school or district is not authoritative, it can provide an idea of how one might structure a similar experiment or study in one’s own school.
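The margin-of-error point can be made concrete with a quick calculation. The sketch below uses the standard formula for the margin of error of a sample proportion at 95 percent confidence, assuming simple random sampling and the worst-case proportion of 0.5; the two sample sizes echo the ones above:

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Margin of error for a sample proportion at roughly 95% confidence.

    Assumes simple random sampling; p = 0.5 is the worst case.
    """
    return z * math.sqrt(p * (1 - p) / n)

# The two sample sizes mentioned above
print(f"n = 100:    +/- {margin_of_error(100):.1%}")     # roughly +/- 9.8%
print(f"n = 45,000: +/- {margin_of_error(45000):.1%}")   # roughly +/- 0.5%
```

The sharply smaller margin of error is why the larger sample tracks the general population so much more closely.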

The methodology. In general, stronger data is yielded when:
  • An intervention is conducted on one group while a similar group, a control group, has no such intervention.
  • A study follows a group and analyzes data collected prospectively (rather than retrospectively examining the effects of an intervention that has already taken place).
  • A study compares groups during the same period of time, rather than a current group with a historical one. 
Is the method logical? Is there a control group? In a time series, are the baseline, interval, and outcome reliable measures of the efficacy of the intervention? How was the data collected? Did they collect the right data? How did they reduce bias? What modeling strategy was used to analyze the data? What variables are the researchers taking into account? What were they unable to consider or what did they fail to consider? Are there rival explanations? Are the results generalizable? Have the results been replicated? In what ways do the answers to these questions affect the integrity of the outcomes?
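The first bullet above, comparing an intervention group with a control group, can be illustrated with a small worked example. The scores below are invented for illustration, and the comparison uses Welch’s t-statistic, one standard way to compare two group means:

```python
import math
from statistics import mean, stdev

def welch_t(group_a, group_b):
    """Welch's t-statistic: difference in means scaled by its standard error."""
    se = math.sqrt(stdev(group_a) ** 2 / len(group_a) +
                   stdev(group_b) ** 2 / len(group_b))
    return (mean(group_b) - mean(group_a)) / se

# Hypothetical end-of-term scores (made-up data for illustration only)
control      = [72, 68, 75, 70, 74, 69]   # no intervention
intervention = [78, 74, 80, 76, 79, 75]   # new practice applied

t = welch_t(control, intervention)
print(f"mean difference: {mean(intervention) - mean(control):.1f} points, t = {t:.2f}")
```

A t-statistic this large suggests the difference is unlikely to be chance alone, though a real study would also report a p-value, an effect size, and how students were assigned to each group.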

Remember that correlation is not causation. This is a common error in logic to which we have become inured through constant exposure. Malcolm Gladwell’s work, for example, is full of examples of this assumption. Do the researchers presume causation? Just because two events are concurrent (e.g., an increase in both adolescent screen time and depression), we cannot conclude that one is responsible for the other without isolating other variables. Is there enough evidence, and were enough variables considered and controlled, to support the author’s claim?

It’s also important to consider that we tend to seek information that affirms what we believe, desire, or have already found. This is called confirmation bias. Google exacerbates this by tailoring results to what our search history suggests we want to see. Examine related studies and compare the samples, intentions, methodologies, and outcomes. Cherry-picking the research to share without providing contrary views is a sure way to lose the trust of the community.
 

A Research Process in Action

Let’s say that, on the heels of reports on NPR and in The Atlantic, a group of activist parents is pressing to start classes an hour later so that high school students will be “healthier.” A committee is formed to gather research that will be used to determine whether the effort is worth considering.

Before launching into a research study, define as specifically as possible the problem to solve and the questions to answer: 
  • First, what does “healthier” mean, exactly?  Of course, we all want our students to be healthy and perform at their best. Are our students unhealthy? If so, in what ways? How would this intervention contribute to solving the problem, if in fact there is one, and are there other more effective ways to achieve it?
  • Identify the students and families likely to benefit and those likely to be disadvantaged, as well as the mitigating factors on both sides. What impact could this have on the larger school (logistics, transportation, extracurricular activities, student and staff morale, etc.) and its relations with the community (athletic competitions, bus companies)? What criteria will we use to measure success and by when? Is this something we could try out for a month or a term? 
  • In addition to the academic research, find out what other schools have done, how and why, and the consequences, especially unintended, they experienced. Ask online discussion boards and affinity groups, and speak with the people involved and affected, preferably on both sides. But be cautious: Anecdotal evidence is just one data point.
  • Finally, weigh the costs and benefits in terms of finances, operations, community life, values, and morale, and determine whether there is sufficient basis for implementing change, developing a survey, or convening focus groups. When convening such groups, remember to include students, staff, and parents, as well as board members, so they can hear the issues for themselves.
 
To start, do a basic Google search with terms that you are likely to find on a relevant page: late school start times, without quotation marks (which would force an exact-phrase search) and without qualifiers like “advantages” or “disadvantages.” On the first pages of results, the titles will likely indicate a preponderance of evidence favoring later start times to ensure students get more sleep. More sleep is presented as a panacea for tardiness, sleepiness, absences, obesity, and automobile accidents, and as a boost to mental health and academic achievement. Maintain a healthy skepticism.

From a cursory skim of the URLs, it’s easy to see the organizations responsible for the posts: a nonprofit dedicated to the effort, the National Sleep Foundation, RAND, the Centers for Disease Control and Prevention, the American Psychological Association, the American Medical Association, the National Education Association, and common news media outlets like Time and The Washington Post. Keep in mind that expert opinion remains opinion. Reviewing the evidence from the scientific studies cited in these articles is indispensable.

After developing a general sense of the landscape, it’s time to engage in a more serious study of the research. Type “Google Scholar” in the Google search box, and the wardrobe door opens to a veritable academic Narnia. A search for “adolescent sleep” yields academic studies that have been cited hundreds of times, attesting to their influence—though not necessarily to their validity. If a search yields too many results that are not relevant, add an exact phrase, exclude certain terms, or limit dates.
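The refinements in the last sentence can be sketched as concrete queries. The topic terms here are illustrative, but the quotation-mark and minus-sign operators are standard in both Google and Google Scholar:

```
adolescent sleep                             broad starting query
"adolescent sleep" "school start time"       quotation marks force exact phrases
"school start time" -elementary -college     a minus sign excludes off-topic terms
```

To limit dates, use the year links or the custom date range in Google Scholar’s left sidebar.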

Reading the abstracts and conclusions of the studies referenced in the basic and Google Scholar searches is a start, but an insufficient foundation for policymaking. Using the criteria above (intention, authorship, sponsorship, currency, sample, and methodology), evaluate the quality of the studies. Take time to think critically about the methodology. (It’s certainly intriguing to learn that due to biological circadian rhythms, adolescent non-humans and humans cross-culturally are predisposed to becoming sleepy later, but it doesn’t necessarily mandate later school start times.)

It is vital to locate opposing and alternate views. While adding a term such as “con,” “problem,” or “against” may be helpful, it can be more effective to identify the substance of another perspective. In this case, you might wonder if there’s a study of parents (or boarding schools) setting earlier bedtimes to achieve the same results, or at least ensure more sleep. (It turns out there are such studies, and they are fruitful.)

Now that information has been gathered on both sides, it’s time to return to the original set of framing questions and determine whether the evidence supports the proposed intervention as a solution to the original problem. If it does, and the benefits outweigh the costs, implement it; if it doesn’t, don’t. Opinion must not supersede evidence. Consensus should not replace leadership. This is a time of rapid change. We cannot merely stay the course—or succumb to fads. Our schools have the responsibility to provide students with the highest quality education and experience, grounded in evidence-based best practices. That means we have to actively seek out those practices to know what they are.
 

Time to Take the Step

Independent schools’ freedom to innovate, experiment, and create programs to inspire and engage our students is our great strength and one of the reasons parents entrust us with their children’s education. Increased competition means we must use data to support our claims of comparative advantage over other options and be open to innovating based on research, not on seductive, shiny fads. We must demonstrate a high standard of professionalism by keeping up with developments in our field the way other professionals do. We have unprecedented access to information. Let’s use it.
 
 
 
Carolyn Kost

Carolyn Kost is a researcher, writer, educational consultant, and the author of Engage!: Setting the Course for Independent Secondary Schools in the 21st Century.