Autistics/individuals diagnosed with autism spectrum disorder (ASD) commonly display qualitative impairments in social behavior (American Psychiatric Association, 2013). These challenges often result in the use of interventions directly targeting the development of social skills. To date, the effectiveness of several interventions for the development of social skills has been documented within the peer-reviewed literature. These include, but are not limited to, the teaching interaction procedure (e.g., Dotson et al., 2010; Leaf, Oppenheim-Leaf, et al., 2012a), video modeling (e.g., Rudy et al., 2014), discrete-trial teaching (e.g., Garcia-Albea et al., 2014), pivotal response treatment (e.g., Mohammadzaheri et al., 2014), behavioral skills training (e.g., Stewart et al., 2007), and the Cool Versus Not Cool procedure (e.g., Milne et al., 2017).

Although it is most common for these interventions to occur in person, in a clinical or home setting, the COVID-19 pandemic has illustrated the need for effective interventions that can be delivered via telehealth (Cox et al., 2020; LeBlanc et al., 2020). Behavior-analytic research on telehealth-delivered intervention for autistics/individuals diagnosed with ASD is not new (Ferguson et al., 2019), but the sudden move to telehealth for many service providers has accelerated the need for research evaluating the effectiveness of behavior-analytic procedures delivered via telehealth (Cox et al., 2020; LeBlanc et al., 2020). To date, much of this research has focused on training an individual to subsequently implement the intervention in person (see Ferguson et al., 2019, for a review). However, there are some recent, notable examples of the direct application of behavior-analytic interventions delivered via online tools.

For example, Pellegrino and DiGennaro Reed (2020) evaluated an intervention delivered via telehealth that used total task chaining and least-to-most prompting for two adults with intellectual and developmental disabilities. All sessions occurred via VSee, with the two adults participating from their apartments while the experimenter was located in a separate apartment within the same complex. Targeted skills were those that the participants expressed interest in learning (e.g., light cooking, money management). The results of a multiple-probe across-behaviors design indicated the intervention was effective for teaching both participants three self-selected skills. Furthermore, each of the skills maintained, and both participants indicated satisfaction with the procedures and outcomes. In another recent example, Ferguson et al. (2020) evaluated the effectiveness of discrete-trial teaching with instructive feedback delivered via telehealth to teach tact relations to six children diagnosed with ASD. The participants were divided into dyads, and each participant had their own primary (i.e., targeted tact) and secondary (i.e., instructive feedback) targets. All sessions occurred via Zoom, with the interventionist and participants located in different physical locations. The results indicated that all participants acquired primary and secondary responses, and five of six acquired primary and secondary observational responses (i.e., the targets for the other participant in the dyad).

Although the aforementioned (i.e., Ferguson et al., 2020; Pellegrino & DiGennaro Reed, 2020) and past research (e.g., Wacker et al., 2013) is promising for many who are shifting to service delivery via telehealth, the research is limited with respect to social skills interventions delivered directly via telehealth. One intervention approach that may transfer with little effort to a telehealth model is the Cool Versus Not Cool procedure. The Cool Versus Not Cool procedure develops social discriminations through discrimination training via instructor demonstration; role-plays are then used to increase the likelihood the learner will engage in the desired social skill in the terminal environment. More specifically, the Cool Versus Not Cool procedure consists of five components: (a) labeling the targeted social skill (e.g., the “cool” skill), (b) the interventionist modeling the cool (i.e., the desired topography of the behavior) and not cool (i.e., the undesired topography of the behavior) ways to display the social skill, (c) providing the learner with the opportunity to label the model as cool or not cool and why the model was cool or not cool, (d) the learner role-playing the cool way, and (e) providing reinforcement or feedback based on learner responding throughout. It should also be noted that the terms “cool” and “not cool” were selected based on the learners with whom the procedure was originally developed. Those learners were already using those words, and it was thought that using the same words to describe desired and undesired social behaviors would increase the likelihood of generalization and maintenance in the terminal environment (Leaf et al., 2020). As such, interventionists should determine the labels that will be most appropriate based on the learners with whom they provide intervention (e.g., “dope” and “weak,” “good choice” and “bad choice,” or “appropriate” and “inappropriate” may be more appropriate and effective than “cool” and “not cool” for some learners).

The Cool Versus Not Cool procedure has been demonstrated to be effective for teaching a variety of social skills to autistics/individuals diagnosed with ASD (e.g., interrupting, changing the game, greetings, joint attention, changing the conversation, abduction prevention, and eye contact; Leaf, Tsuji, et al., 2012b). The effectiveness of the Cool Versus Not Cool procedure has also been demonstrated in one-to-one settings (e.g., Leaf et al., 2015), small group settings (Au et al., 2016), and large group settings (Milne et al., 2017). In light of the recent increased need for effective telehealth-delivered interventions, the purpose of the present study was to evaluate the effectiveness of the Cool Versus Not Cool procedure conducted via telehealth for three children diagnosed with ASD.

Method

Participants

Three children independently diagnosed with ASD participated in the study. Participant demographic information is provided in Table 1. All participants had a previous history of receiving in-person social skills interventions including the Cool Versus Not Cool procedure. All participants had some experience with direct intervention delivered via telehealth due to the COVID-19 pandemic but had recently transitioned back to in-person intervention. All participants were also currently participating in a social skills group two times a week via telehealth. None of the participants had any previous experience with the use of the Cool Versus Not Cool procedure to target changing the conversation when someone is bored. Informed consent was obtained from each of the participants’ parents prior to participation in the study. The participants were free to leave at any point during the session; however, this never occurred, and participants assented to all sessions.

Table 1 Participant Demographic Information

Interventionist

Julia L. Ferguson, the second author, served as the interventionist for all sessions, with the exception of generalization sessions. She was a 29-year-old White female with an undergraduate degree in applied behavior analysis and a master’s degree in behavior analysis and had begun her studies toward a doctoral degree in applied behavior analysis. She had over 8 years of experience implementing interventions based on the principles and procedures of applied behavior analysis for individuals diagnosed with ASD. This experience also included the use of the methods within this study to teach a variety of social skills to a variety of learners.

Setting

Throughout all conditions, probes, and intervention sessions, the participants and the interventionist were in different locations in Southern California. Winston was located at home for all of his sessions, whereas Nick and Schmidt were located at an isolated location within a private clinic. The interventionist conducted all sessions from her home. All sessions were conducted via the Zoom platform (Zoom Video Communications, Inc.; www.Zoom.us) using various devices with video and audio capabilities. Winston, Nick, and Schmidt used an iPad, and the interventionist used a laptop computer.

Dependent Measures

The participants’ supervisors, who were responsible for training staff, developing curriculum and intervention strategies, and overseeing the participants’ overall progress, were asked to provide a list of social skills that were likely to be included relatively soon within the participants’ regularly scheduled clinical sessions. The lists provided by the supervisors were examined for areas of overlap across participants. Each of the supervisors noted the participants’ challenges with changing the conversation when someone is bored (i.e., identifying possible boredom cues and changing their behavior as a result). Therefore, the main dependent variable for all three participants was changing the conversation when someone is bored. Changing the conversation when someone is bored was divided into seven component steps (see Table 2). Participant engagement in each of the steps was assessed during probe sessions (described later). The mastery criterion was defined as the participant engaging in each of the steps in the outlined order across three consecutive probe sessions. Generalization of changing the conversation when someone is bored was also measured once prior to intervention and once following a participant reaching the mastery criterion. Probe sessions to assess generalization after intervention occurred between 2 and 5 days following a participant reaching the mastery criterion.

Table 2 Components of Changing the Conversation When Someone Is Bored

Probe Sessions

Probe sessions occurred across each condition (i.e., baseline, intervention, generalization, and maintenance). Probe sessions consisted of one opportunity for the participant to demonstrate the targeted skill and lasted an average of 2 min (range 1–4 min). During the intervention condition, probe sessions always preceded intervention sessions with a 1 min break between sessions. Probe sessions were used to assess participant responding in the absence of direct instruction, prompting, or programmed reinforcement.

To begin each probe, the interventionist engaged the participant in a conversation about a preferred topic. These topics were determined by discussing with the participant’s staff and supervisor the participant’s preferred movies, video games, and activities. Following 2 min or a minimum of four exchanges, the interventionist began to engage in nonvocal boredom cues for up to 15 s. These cues consisted of looking away from the screen, looking at her phone or watch, or not responding to the participant. If the participant engaged in the steps for changing the conversation when someone is bored, the interventionist responded in accordance with the task analysis. That is, the interventionist engaged in a vocal/verbal response related to the statement made by the participant. If the participant did not engage in the steps for changing the conversation when someone is bored, the interventionist ended the probe by saying “thanks” and returned the participant to the activity in which they were previously engaged.

General Procedure

Sessions occurred once a day, 2 to 5 days a week, depending on participant and interventionist availability. Intervention sessions lasted an average of 10 min (range 7–17 min). The interventionist sent a link for the Zoom video conference to the participant’s staff member. The staff member began the video conference for the participant and was available during the intervention sessions for role-plays. The staff member did not have any other interactions or functions during research sessions (i.e., they did not function as a shadow or provide any prompts or praise throughout the sessions).

Baseline

The purpose of baseline was to assess participant responding prior to any intervention or programmed reinforcement. Baseline sessions consisted of a probe session (previously described).

Generalization

Generalization of changing the conversation when someone is bored was assessed prior to intervention and after reaching the mastery criterion for all participants. To assess generalization, probe sessions occurred as described previously (i.e., via Zoom) with the exception that the participant’s supervisor served as the conversation partner (i.e., someone different from the interventionist and staff member). Within the clinic, the participant’s supervisor was responsible for training staff, developing curriculum and intervention strategies, and overseeing the participant’s overall progress. Winston’s supervisor was a 33-year-old White female with a bachelor’s degree in psychology and 11 years of experience providing intervention for autistics/individuals diagnosed with ASD. Nick’s supervisor was a 30-year-old White female with a bachelor’s degree in psychology and 7.5 years of experience providing intervention for autistics/individuals diagnosed with ASD. Schmidt’s supervisor was a 39-year-old Korean female with a master’s degree in applied behavior analysis and 10 years of experience providing intervention for autistics/individuals diagnosed with ASD.

Intervention

Intervention consisted of the Cool Versus Not Cool procedure delivered via Zoom video conferences. To begin, the interventionist labeled the targeted skill (e.g., “Today we are going to work on changing the conversation when someone is bored.”). The interventionist then provided a demonstration with the participant’s staff member. The demonstration consisted of the interventionist and the participant’s staff member engaging in a conversation via the Zoom connection. The staff member (who was in the same room as the participant) would then engage in nonvocal boredom cues. The interventionist followed the steps for changing the conversation when someone is bored (i.e., “cool” demonstration) or responded similarly to how the participant was responding to nonvocal boredom cues during probe sessions (i.e., “not cool” demonstration). The interventionist then ended the demonstration and asked the participant to label whether the interventionist responded in the cool or not cool way (e.g., “Was that cool or not cool?”). If the participant responded correctly, the interventionist provided praise and asked the participant to label why the demonstration was cool or not cool (e.g., “That’s right! Why was it cool/not cool?”). If the participant responded incorrectly, the interventionist provided feedback and asked why the demonstration was cool or not cool (e.g., “No, that was actually not cool. Tell me why that was not cool.”). Each session included four demonstrations (two cool and two not cool), the order of which was randomized prior to the session. Following the demonstrations, the interventionist provided the participant with an opportunity to practice. During these role-plays, the interventionist engaged the participant in a conversation about a preferred topic and began to engage in nonvocal boredom cues. If the participant engaged in the steps for changing the conversation when someone is bored, the interventionist provided praise (e.g., “That’s the way! Kiss your brain!”). If the participant did not engage in the steps for changing the conversation when someone is bored, the interventionist ended the role-play and provided corrective feedback (e.g., “You missed it. I was bored, and you didn’t change the conversation.”). This continued until the participant correctly engaged in all the steps for changing the conversation when someone is bored.

Maintenance

Maintenance probes began 7 days following the participant reaching the mastery criterion for changing the conversation when someone is bored (as previously described). All maintenance sessions were conducted by the previously described interventionist. Three maintenance sessions occurred for each participant.

Experimental Design

A nonconcurrent multiple-baseline design (Watson & Workman, 1981) across participants, with a modification to improve experimental control, was used to evaluate the effectiveness of the Cool Versus Not Cool procedure delivered via telehealth on the participants’ changing the conversation when someone is bored. This design was selected because it affords a flexibility when conducting research in applied settings that a concurrent multiple-baseline design does not. This flexibility was even more necessary for conducting research during an ongoing pandemic. Traditionally, within a nonconcurrent multiple-baseline design, baseline phases are predetermined, and participants are randomly assigned to each baseline length as they become available (Watson & Workman, 1981). Similar to previous studies (e.g., Cihon et al., 2019), an additional criterion common within multiple-baseline logic was used in this study in an attempt to strengthen the design: Participants progressed from baseline to intervention only once a stable level of responding was observed during baseline, and, when necessary, we extended baseline sessions for the next participant until intervention effects were observed with the previous participant. Therefore, predetermined baseline phases and random assignment were not used within this study. Experimental control was demonstrated when the intervention resulted in changes in a participant’s behavior without changes in the remaining participants’ behavior during baseline sessions (Baer et al., 1968; Carr, 2005). Furthermore, although the nonconcurrent multiple-baseline design permits participant removal if stable responding is not obtained, no participants were removed from this study for this or any other reason.

Interrater Agreement and Treatment Fidelity

Julia L. Ferguson, the second author (who was the interventionist in the study), also served as the primary rater for all sessions. Matthew Lee, the third author, served as the secondary rater. He held an undergraduate degree in psychology and had 6 months of experience with interventions based in applied behavior analysis for autistics/individuals diagnosed with ASD. Interrater agreement was collected on the primary dependent variable for 37.5% of all sessions across participants and conditions. Agreements were defined as both raters scoring the same response on a step for changing the conversation when someone is bored. Disagreements were defined as one rater scoring one response and the other rater scoring a different response on a step for changing the conversation when someone is bored. Interrater agreement was calculated by dividing the number of agreements by the number of agreements plus disagreements and multiplying by 100. Interrater agreement across all three participants was 100% during baseline, generalization, intervention, and maintenance conditions.
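Expressed as a formula, this is the standard point-by-point agreement calculation just described (a restatement, not an additional analysis):

\[
\text{IOA} = \frac{\text{number of agreements}}{\text{number of agreements} + \text{number of disagreements}} \times 100
\]

For example, if both raters scored all seven steps of the task analysis identically during a probe, IOA for that probe would be 7/(7 + 0) × 100 = 100%.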

The fidelity of the interventionist’s implementation of probes and the Cool Versus Not Cool procedure was also assessed. Matthew Lee (previously described) independently observed 35.7% of probe sessions and 37.8% of intervention sessions across all participants and conditions to score the interventionist’s behavior. Correct interventionist behavior during probes consisted of (a) engaging the participant in a conversation about a preferred topic, (b) engaging in nonvocal boredom cues following 2 min or a minimum of four exchanges, (c) responding in accordance with the task analysis if the participant engaged in the steps of the targeted skill, and (d) ending the probe by saying “thanks” if the participant did not engage in the steps of the targeted skill. Treatment fidelity was calculated by dividing the number of steps the interventionist displayed correctly by the total number of steps and multiplying by 100. Treatment fidelity for probes averaged 100% across all participants and conditions.
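The treatment fidelity calculation described above follows the same logic, restated here as a formula:

\[
\text{Treatment fidelity} = \frac{\text{number of steps implemented correctly}}{\text{total number of steps}} \times 100
\]

For probe sessions, the four steps listed previously constituted the total; an interventionist who implemented all four correctly would thus score 4/4 × 100 = 100%.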

Correct interventionist behavior during intervention sessions consisted of (a) labeling the targeted skill, (b) providing an instruction for the participant to watch the demonstration, (c) providing two cool and two not cool demonstrations, (d) providing the participant with an opportunity to rate whether the demonstration was cool or not cool after each demonstration, (e) providing the consequence that corresponded with the participant’s response after each rating, (f) providing the participant with an opportunity to label why the demonstration was cool or not cool after each demonstration, (g) providing the consequence that corresponded with the participant’s response after each answer, (h) providing the participant with an opportunity to role-play the targeted skill, (i) providing the consequence that corresponded with the participant’s response, and (j) repeating role-plays until the participant engaged in all steps of the targeted skill correctly. Treatment fidelity for intervention sessions averaged 100% across all participants.

Social Validity

To assess social validity, a questionnaire with four questions was sent to the participants’ supervisors and parents at the end of the study. The supervisors and parents were provided with a video from the intervention condition (i.e., without the probe) selected at random for their child/client and were asked to fill out the questionnaire after viewing the video. The questions consisted of the following:

1. How important was it for the/your child to learn the skill the interventionist was teaching?

2. Please rate the degree to which you found the intervention to be acceptable.

3. Please rate the degree to which you found the intervention to be effective.

4. Do you feel that this method of social skills instruction is an acceptable replacement for in-person social skills instruction?

Responses to each of the questions were on a Likert scale from 1 (i.e., not at all important, not at all acceptable, and not at all effective) to 5 (i.e., very important, very acceptable, and very effective). The supervisors and parents then sent the questionnaires back to the researchers anonymously.

Results

Figure 1 displays the results for all three participants across all conditions. During baseline, all three participants engaged in the same number of steps for changing the conversation when someone is bored. All three participants consistently engaged in the first three steps (i.e., facing the screen, maintaining a neutral or positive facial expression, and maintaining a neutral or positive tone) but not any of the remaining steps, which is not surprising given that those first three steps are likely necessary for any conversation-based skill. This responding continued during the assessment of generalization prior to the intervention condition. All three participants reached the mastery criterion (i.e., engaging in each of the steps in the outlined order across three consecutive probe sessions) during the intervention condition. The total intervention time required for Winston, Nick, and Schmidt to reach the mastery criterion was 70, 45, and 59 min, respectively. After reaching the mastery criterion, Winston’s responding during the assessment of generalization returned to baseline levels, whereas Nick and Schmidt continued to engage in all steps correctly during the assessment of generalization. All three participants continued to engage in all steps correctly during maintenance sessions beginning 7 days after they reached the mastery criterion.

Fig. 1 Participant Responding During Probe Sessions

All three supervisors and only one parent returned the social validity questionnaire. When asked how important it was for the child to learn the skill the interventionist was teaching, all three supervisors responded with a 4. When asked to rate the degree to which they found the intervention to be acceptable, all three supervisors responded with a 5. When asked to rate the degree to which they found the intervention to be effective, the supervisors responded with a 4 on average (range 3–5). When asked if they felt that this method of social skills instruction is an acceptable replacement for in-person social skills instruction, all three supervisors responded with a 4. The only parent to return the social validity questionnaire responded with a 5 on all questions. Given the context in which this study occurred (i.e., the COVID-19 pandemic), which may have resulted in increased stress and hardships for the parents and families, the researchers did not persist in requests to return the questionnaire if a parent did not do so after one follow-up.

Discussion

The purpose of the present study was to evaluate the effectiveness of the Cool Versus Not Cool procedure conducted via telehealth to teach three children diagnosed with ASD to change the conversation when someone is bored. All three participants reached the mastery criterion in a relatively low number of sessions (range 4–8). Responding generalized to another adult for two of the three participants, and all three participants maintained correct responding on all steps of the targeted skill on all maintenance probes. Furthermore, responses to the social validity questionnaires indicated the skill was important to teach, the intervention was acceptable and effective, and the telehealth format was an acceptable replacement for in-person intervention for these three participants. These results have implications for clinicians providing intervention for autistics/individuals diagnosed with ASD.

The development of an effective social skills repertoire is a common focus of interventions for autistics/individuals diagnosed with ASD. Recent events have put many clinicians in the position of shifting this intervention to telehealth. Given the limited research on effective social skills interventions delivered via telehealth, this study provides clinicians with a viable option when making this shift. The results indicated that the Cool Versus Not Cool procedure can be used effectively via telehealth under conditions similar to those of this study (e.g., staff available, similar participant demographics, similar social skill). Furthermore, given previous comparisons with other interventions (e.g., Social Stories; Leaf et al., 2016) and a lack of data supporting the use of those other interventions via telehealth, clinicians may opt for the Cool Versus Not Cool procedure over interventions such as Social Stories. However, future research will be necessary to compare these interventions within the same telehealth context to provide empirical evidence for the use of one procedure over the other.

Clinicians should also closely monitor the generalization of social skills that are targeted using the Cool Versus Not Cool procedure via telehealth. Winston’s responding within the teaching context did not generalize to a very similar context. It may be the case that a social skills intervention delivered via telehealth poses limitations that are absent in an in-person intervention. The social context involved in the delivery of interventions via telehealth differs greatly from the social context of in-person delivery. This difference could make developing the desired stimulus control for social behavior difficult. For example, there may be subtleties within in-person interactions that become part of the contingencies but that are not present or possible in a telehealth context. This difference may vary based on the social skill that is targeted, and future research may more efficiently evaluate this by targeting multiple skills within a study.

The results of the social validity questionnaire indicated the intervention was appropriate, effective, and an acceptable replacement for in-person social skills instruction for the respondents of the survey. These results provide further support for clinicians to use the Cool Versus Not Cool procedure when delivering social skills intervention via telehealth. This may be a welcome result with respect to the numerous considerations required when making the move to telehealth (Cox et al., 2020). It should be noted, however, that the low response rate among parents may affect the interpretation of the social validity data. Furthermore, clinicians should remain cautious and take into consideration individual differences, given the limited number of participants, skills, and social validity respondents within this study.

This study was not without limitations that warrant discussion. First, although the interventionist conducted all sessions via Zoom, each of the participants had a staff member present at their location during all sessions who also assisted during the demonstration portion of the Cool Versus Not Cool procedure. This might not be possible in all situations. However, it is common for a parent or caregiver to be present during telehealth sessions, and in these cases a parent could fill the position of the staff member. Second, only one skill was targeted with each of the participants, which limits the possible generality of the results to other social skills. Previous research has demonstrated the effectiveness of the Cool Versus Not Cool procedure for teaching a variety of social skills, and it is possible those results would extend to a telehealth context; future research will be necessary to evaluate whether that is the case. Third, all the participants in this study had a previous history of receiving in-person social skills interventions including the Cool Versus Not Cool procedure, had some experience with direct intervention delivered via telehealth, and had rather well-developed repertoires (see Table 1). As such, the results may not generalize to other autistics/individuals diagnosed with ASD who have less experience with the Cool Versus Not Cool procedure and intervention delivered via telehealth or who have less developed repertoires. Fourth, due to the setting in which the study occurred, a nonconcurrent multiple-baseline design was used. Although nonconcurrent multiple-baseline designs control for threats to internal validity (Harvey et al., 2004; Watson & Workman, 1981) and are common within applied research, a concurrent multiple-baseline design may be desired by future researchers. Finally, no measures of generalization were collected for in-person settings. These measures were not included for two primary reasons. First, due to the COVID-19 pandemic, increasing the number of people the participants contacted in person was deemed potentially harmful. Second, given the increased use of communication technologies among young persons (Manago et al., 2020), targeting social skills for use in virtual environments may be valuable in its own right. Nonetheless, the results should be taken with caution with respect to the generalization of effects to in-person settings.

Despite the limitations, this study contributes to the limited literature on effective social skills interventions delivered via telehealth for autistics/individuals diagnosed with ASD. Given the uncertainty involved in what the intervention context will resemble in the future and the limited number of qualified professionals in some areas, effective, evidence-based interventions delivered via telehealth are more necessary than ever. We hope this study provides clinicians with an effective, evidence-based social skills intervention when shifting to telehealth-based instruction and targeting social skills for use in virtual environments, and that it inspires similar future research to increase the number of treatment options available to interventionists.