Introduction

In the past three decades, much research has focused on the relation between spatial skills and mathematical skills (e.g., Casey et al., 1995; Verdine et al., 2017). This effort has resulted in widely reported links between these two sets of skills, as well as a considerable body of evidence highlighting the benefits of well-developed spatial skills in a wide variety of science, technology, engineering, and mathematics (STEM) fields (Atit et al., 2015; Newcombe, 2010; Uttal et al., 2013). But to understand and perhaps leverage these relations within educational contexts, the magnitude of the relation as well as the moderators and mediators of the relation need to be better understood. The meta-analyses presented here aim to estimate the average magnitude of the reported relationship between spatial skills and mathematical achievement, to identify potential moderators and mediators of this relationship, and to inform the design and execution of future research in this area.

The current meta-analysis is particularly pressing in light of widespread underachievement of US students in STEM subjects (Aud et al., 2012), specifically in mathematics (Organization for Economic Development, 2016). A report from the Programme for International Student Assessment (PISA) suggests that American students’ weaknesses in mathematics are particularly evident when “[r]easoning in a geometric context – requiring authentic reasoning in a planar or spatial geometric context” (Organization for Economic Development, 2012, p. 3). This suggests that problems engaging one’s spatial skills may be particularly difficult for American students. As a concerted research and educational effort rises to address this need to improve students’ mathematical outcomes, it is worth considering the existing wealth of research examining the relation between these two skill sets. A brief overview of relevant research, grouped by common themes, is presented to situate the current meta-analysis within the broader academic and societal context.

Spatial skills and mathematics

Spatial skills enable us to mentally manipulate, organize, reason about, and make sense of spatial relationships in real and imagined spaces (e.g., Newcombe & Shipley, 2015; Uttal et al., 2013). These skills are commonly employed when completing everyday tasks such as assembling furniture or navigating from one location to another. Understanding spatial skills has been a topic of interest in psychology for much of the last 100 years (e.g., Bethell-Fox & Shepard, 1988; Carroll, 1993; Linn & Petersen, 1985; Shepard & Metzler, 1971). Historically, the interest in spatial skills has roots in the study of mechanical aptitude (Cox, 1928; Paterson et al., 1930) and in defining the factors of intelligence (Carroll, 1993; Thurstone & Thurstone, 1941). Examples of spatial skills commonly examined in studies of mathematical understanding include mental rotation (e.g., Geer et al., 2019) and spatial visualization (e.g., Burte et al., 2017).

Many studies examining the relations between spatial skills and mathematical achievement indicate that the two are significantly correlated for students at all educational levels (e.g., Casey et al., 1995; Delgado & Prieto, 2004; Verdine et al., 2017). For example, mental rotation performance relates to mathematical reasoning skills in elementary students (Geer et al., 2019) and secondary students (Delgado & Prieto, 2004), as well as mathematical aptitude in undergraduates (Casey et al., 1995). Furthermore, some studies report that spatial skills are predictive of students’ future mathematics learning even after accounting for indicators of general reasoning skills, such as verbal skills and executive functioning skills (Verdine et al., 2017; Zhang et al., 2014). Yet, the details of the relations between spatial skills, mathematical skills, and factors of general intelligence are not well understood.

In addition to the performance-based evidence of the relation between spatial skills and mathematical achievement, prior research suggests that the positive relation between spatial and mathematical reasoning found in many studies may be based on shared cognitive or neural processes (Gunderson et al., 2012; Mix & Cheng, 2012; see Hawes & Ansari, 2020, for a review). Some researchers assert that number information is mentally represented in spatial formats (Mix, 2019; Mix & Cheng, 2012). For example, work by Gunderson et al. (2012) indicates that quantitative magnitudes are represented in the mind spatially as a mental number line. Further evidence comes from brain-imaging studies showing similar areas of brain activation when individuals process both spatial and numerical information (Hubbard et al., 2005). Lastly, successful interventions often focus on helping students translate mathematical symbols or problem statements into spatial representations, including number lines, diagrams, concrete models, or hand gestures (Novack et al., 2014; Valenzeno et al., 2003).

In line with the success of spatial interventions in supporting students’ mathematical reasoning, some researchers believe that the relation between spatial skills and mathematical skills is perhaps due to differences in problem-solving strategies (Delgado & Prieto, 2004). More specifically, studies show that many mathematical problems can be solved by using spatial visualization and/or analytic strategies, and there may be two different styles: one is based on algorithm memorization and automatic application, the other on the visuospatial representation of the problems. This suggests that perhaps those who have better spatial skills would employ the second style of problem solving (i.e., visuospatial strategies), which is often considered to be more efficient (Delgado & Prieto, 2004).

Domain-general cognitive processes and mathematical achievement

Parallel to the research on spatial skills and mathematical achievement, researchers interested in mathematics learning and outcomes have also focused their efforts on examining the role of domain-general cognitive processes in students’ mathematical achievement and understanding (e.g., Hawes et al., 2019; Raghubar et al., 2010; Taub et al., 2008). Domain-general cognitive processes that have been found to have a strong positive relationship with mathematical achievement include executive functions, such as working memory (Raghubar et al., 2010) and fluid reasoning (Green et al., 2017; Taub et al., 2008). Working memory refers to a mental workspace that is involved in controlling, regulating, and actively maintaining relevant information to accomplish complex cognitive tasks (Miyake & Shah, 1999). Studies have found that working memory is fundamental for mathematics learning and performance in school-aged children (Berg, 2008; McKenzie et al., 2003) and adults (Imbo et al., 2007; Seitz & Schumann-Hengsteler, 2002). For instance, Berg (2008) examined the contribution of working memory in third- to sixth-grade students to performance on a mathematical test of a range of skills, such as single- and multi-digit arithmetic, fractions, and algebra. Working memory contributed unique variance to mathematical performance independent of chronological age, short-term memory, reading, and processing speed. In children with and without significant mathematics difficulties, Swanson and Beebe-Frankenberger (2004) found that working memory predicted solution accuracy on word problems independent of several academic and cognitive variables, such as fluid reasoning, reading skills, and phonological processing, among others. Additionally, Imbo et al. (2007) found that working memory plays a significant role during multi-digit arithmetic problem solving in adults.

Fluid reasoning is the ability to solve novel problems flexibly and deliberately without using previous information. More specifically, it is the ability to analyze novel problems, identify patterns and relationships that underpin these problems, and apply logic (Schneider & McGrew, 2012). In a synthesis of studies investigating the concurrent relationships between cognitive abilities and achievement measures, fluid reasoning was one of the three cognitive abilities that was consistently related to mathematical performance in calculation and problem solving at all age ranges throughout development (the other two were verbal comprehension and processing speed). Additionally, fluid reasoning has been found to predict future mathematical achievement (Green et al., 2017; McGrew & Wendling, 2010). Using a longitudinal cohort sequential design, Green et al. (2017) examined how fluid reasoning measured at three assessment occasions, spaced 1.5 years apart, predicted math outcomes for a group of 69 participants between ages 6 and 21 years across all three assessment occasions. Results revealed that fluid reasoning was the only significant predictor of future mathematical achievement, while age, spatial skills, and vocabulary were not significant predictors.

Green et al.’s (2017) finding that fluid reasoning, but not spatial skills, predicts future mathematical achievement is not surprising given that spatial skills and fluid reasoning skills are reported to be very strongly related (e.g., Fry & Hale, 1996; Green et al., 2017). Moreover, many tests of fluid reasoning skills require processing spatial information, such as Matrix Reasoning and Raven’s Progressive Matrices. Despite the strong correlation between fluid reasoning and spatial skills, there is evidence that suggests that the two skills rely on separable cognitive processes and brain regions (e.g., Halford et al., 1998; Holyoak, 2012; Klingberg, 2006; Krawczyk, 2012; Vendetti & Bunge, 2014). Taken together, these findings indicate that gaining an accurate understanding of the relation between spatial skills and mathematical skills requires accounting for their shared relations with fluid reasoning.

Furthermore, though many studies simultaneously examine the contribution of executive functions and spatial skills to mathematical achievement in their analyses (e.g., Green et al., 2017; Hawes et al., 2019; Verdine et al., 2017), the findings regarding the relations between the three cognitive skill sets have not been consistent. For instance, contrary to the findings of Green et al. (2017), Hawes et al. (2019) found that spatial skills were the strongest predictor of mathematical achievement in children ages 4–11 years after controlling for age, while executive functioning skills (which included measures of working memory) were not a significant unique predictor. Moreover, the patterns of the relations between these three factors remained stable across age and grade (Hawes et al., 2019). Thus, taken together, the differences in findings between these studies indicate that the nuances of the relations between spatial skills, executive functioning skills, and mathematical skills are not well understood.

In a recent theory of intelligence called process overlap theory, Kovacs and Conway (2016) postulate that any task requires a number of domain-specific and domain-general cognitive processes. The domain-specific cognitive tests measuring different cognitive skill sets (e.g., tests of spatial skills, tests of mathematical skills) all tap into a common pool of domain-general executive functions, such as working memory and fluid reasoning (Kovacs & Conway, 2016). In particular with regard to the relations between spatial skills and mathematical achievement, the role of domain-general executive functions has yet to be ascertained. More specifically, it remains unclear whether executive functions, such as working memory and fluid reasoning, mediate the relations between spatial skills and mathematical performance.

In addition to executive functions, another domain-general cognitive factor positively related to mathematical achievement is general intelligence, or g (Taub et al., 2008; Wrigley, 1958). General intelligence includes the ability to think logically and systematically (Embretson, 1995) and is the best individual predictor of achievement across academic domains, including mathematics (e.g., Deary et al., 2007; Jensen, 1998; Stevenson et al., 1976; Taub et al., 2008; Walberg, 1984). For instance, in a 5-year prospective study of more than 70,000 students, Deary et al. (2007) found that general intelligence assessed at 11 years of age explained nearly 60% of the variation on national mathematics tests at 16 years of age. A study by Geary (2011) aimed at identifying cognitive factors and quantitative competencies in first-grade students that predict mathematical achievement in fifth-grade students found that general intelligence, along with processing speed and working memory, predicted fifth-grade mathematical achievement as well as growth in achievement. While these studies solidify the contribution of general intelligence on mathematical achievement and performance, what is unclear is the magnitude of its contribution to the relation between spatial skills and mathematical achievement.

In sum, existing research suggests that spatial skills and mathematical skills may not be directly related. Theories of intelligence indicate that other domain-general cognitive factors, such as working memory, fluid reasoning, and general intelligence, may underlie their relation. Thus, a more accurate understanding of how spatial skills and mathematical skills are related requires examining if and how associated skillsets influence the relation.

The influence of participants’ age or grade-level and gender on the relations between spatial and mathematical skills

Just as the influence of domain-general cognitive processes on the relation between spatial and mathematical skills is unclear, whether the relation between the skill sets varies across development is also unknown. Much prior research suggests that the relation between spatial skills and mathematical skills could vary depending on participants’ age or grade level (Battista, 1990; Stannard et al., 2001; Wolfgang et al., 2001). For instance, in a longitudinal study by Li and Geary (2013), first- to fifth-grade gains in visuospatial memory predicted the end of fifth-grade mathematical achievement. However, visuospatial memory was not related to mathematics in first grade (Li & Geary, 2013). Similarly, in a study by Hanline et al. (2010), block construction scores, an indicator of students’ understanding of spatial relations and geometric knowledge, predicted preschoolers’ scores on the Test of Early Reading Ability at age 8 years and the growth rate on the same test from age 5–8 years. However, there was no significant relation between block construction scores and scores on the Test of Early Math Ability at age 8 or the growth rate for the Test of Early Math Ability from ages 5–8. Likewise, in a cross-sectional study, Mix et al. (2016) examined the relations between spatial and mathematical skills in 854 students from kindergarten, third grade, and sixth grade, and found that mental rotation skills were strongly related to mathematics performance in kindergarten and third-grade students, but were not a significant predictor of mathematics performance in sixth-grade students. Instead, working memory and visual motor integration showed the strongest relations with mathematical performance in sixth-grade students.

While some studies show that the relations between spatial and mathematical skills vary across development, there are some studies that show that the relations remain stable across ages. For example, as previously discussed, Hawes et al. (2019) found that the strong relations between spatial skills and mathematical skills were evident in children from ages 4 to 11 years. Specifically, the relation between spatial skills and mathematical skills remained consistent across the ages. Taken together, these mixed findings underline the lack of understanding in the field about if and how the relations between spatial skills and mathematical skills vary between ages or grades. Identifying if and how the relation between spatial skills and mathematical skills varies across development is critical because it could shed light on when implementing spatial interventions may be most effective in improving students’ mathematical outcomes.

Just as the relations between spatial skills and mathematics outcomes could vary by age or grade level, studies have suggested that gender could also be a moderating factor. Women are under-represented at the highest levels of STEM occupations (Ceci & Williams, 2011; Halpern et al., 2007), especially in math-intensive STEM domains (e.g., computer and information sciences, engineering, and physical and technical sciences; see Wang & Degol, 2017, for a review). Moreover, boys outperform girls on tests of mathematical aptitude, such as on the SAT mathematics subtest (SAT-M; Halpern et al., 2007), especially on the most complex problems (Hyde et al., 1990). Underlying the gender differences in mathematical outcomes are reported differences in men and women’s spatial skills (e.g., Casey et al., 1995; Nuttall et al., 2005), as well as differences in the relations between spatial skills and mathematical skills (e.g., Tartre & Fennema, 1995). For instance, Casey et al. (1995) found that gender differences on the mathematics section of the SAT-M were related to gender differences on a spatial skills measure in high-achieving students. Boys/men performed better than girls/women on both mental rotation and the SAT-M (Casey et al., 1995).

Beyond these gender differences in levels of mathematical or spatial skills, there is also some evidence for gender differences in the strength of the relations between spatial skills and mathematical skills (e.g., Tartre & Fennema, 1995). Tartre and Fennema (1995) examined the relationship between spatial skills, verbal skills, and mathematical achievement in 60 students as they progressed from sixth to 12th grade. Unlike Casey et al. (1995), they found no significant difference between boys’ and girls’ spatial skills, verbal skills, or mathematical achievement. However, they found that the relations between the three factors varied by gender. Spatial skills were a consistent significant predictor of mathematical achievement for girls across the years, but not for boys, whereas verbal skills were a significant predictor of mathematics across the years for boys, but not for girls (Tartre & Fennema, 1995). In contrast, Ganley and Vasilyeva (2011) examined the relations between spatial skills and mathematical performance in middle school students and found that despite similar levels of mathematical performance for boys and girls, spatial skills predicted mathematical performance in boys, but not in girls. As the research findings on the role of gender in the association between spatial skills and mathematical skills are not consistent, a systematic investigation synthesizing prior findings is necessary to elucidate how gender impacts the relationship between these skill sets.

Prior meta-analytic findings on the relations between spatial skills and mathematical skills

With the aim of better understanding the relations between spatial skills and mathematical reasoning skills, a meta-analysis was recently conducted by Xie et al. (2019) summarizing the findings of 73 studies reporting correlations between these two factors. The aims of their study included: (1) determining whether there is a significant association between spatial skills and mathematical skills, and (2) determining whether the domains of spatial skills, mathematical skills, age, and developmental disability status moderated this relationship. In this meta-analysis, Xie et al. (2019) report finding a medium positive association between spatial skills and mathematical skills (r = .27; 95% confidence interval (CI) [0.24, 0.32]), which did not differ by age, developmental disability status, or type of spatial skill. However, they found that mathematical domain moderated the relations between mathematical and spatial skills. Specifically, logical reasoning showed the strongest association with spatial skills in comparison with numerical skills or arithmetic skills (Xie et al., 2019). Important to the research reported here, Xie et al. (2019) did not examine whether domain-general reasoning skills, such as working memory or fluid reasoning skills, mediate the relation between spatial and mathematical skills, or whether the magnitude of the relation between the two factors differs by gender. Thus, the role of domain-general reasoning skills and participants’ gender in the relations between spatial skills and mathematical outcomes is still unknown.

The current study

As discussed above, the nuances of the relationship between spatial skills and mathematical skills remain unclear. Specifically, it is unknown whether the relation differs by gender and how much domain-general reasoning skills may account for the association. Furthermore, although Xie et al. (2019) found that age was not a significant moderator in their meta-analysis, we decided to include the factor in our analyses to see if we could replicate their finding.

In this study, we examined two sets of questions that each required different analytical tools. As such, we chose to conduct two separate meta-analyses that would each address one set of questions. In the first meta-analysis, we employed correlated and hierarchical effects (CHE) meta-regression models (Pustejovsky & Tipton, 2021) to answer the following research questions: (1) What is the magnitude of the relation between spatial skills and mathematical skills? and (2) What is the effect of gender and grade-level on the association between spatial skills and mathematical skills? In the second meta-analysis, we used the meta-analytic structural equation modeling (MASEM) approach to examine how accounting for domain-general reasoning skills (working memory, fluid reasoning, and verbal skills) impacts the relation between spatial skills and mathematical skills (depicted in Fig. 1). Based on Carroll’s (1993) model of cognitive abilities, Wai et al. (2009) posit that cognitive abilities center around three cognitive domains: quantitative/numerical, spatial/pictorial, and verbal/linguistic (or mathematical, spatial, and verbal domains, respectively). The shared variance across these three content domains can be attributed to the higher order construct of general intelligence (g) (Wai et al., 2009). Thus, we included measures of verbal skills in our analyses to account for the variance contributed by g in the relationship between spatial skills and mathematical skills.

Fig. 1

Hypothesized path model between cognitive constructs. Note. An illustration of the hypothesized theoretical relations between spatial skills, mathematical skills, and general reasoning skills to be examined using MASEM analysis. Fluid reasoning, working memory, and verbal skills are hypothesized to be mediators between spatial skills and mathematical skills in this model.

Method

Literature search/information sources

We began with electronic searches of the PsycINFO, ProQuest, and ERIC databases. We searched all available records published from 1997 to the date of the search, 1 May 2018. We limited our search to this 21-year window for three reasons. First, the time frame was wide enough to provide a broad range of studies and to encompass the recent large increase in studies on mathematics and spatial skills. Second, the time frame was narrow enough to allow us to gather most of the relevant published and unpublished data for analysis. Third, educational structures (i.e., schooling and curriculum) remained mostly unchanged during this time frame (Payne, 2008).

We used the following search terms: (Intervention OR training OR practice OR class OR enhancement OR education OR quasi-experimental OR experimental) AND (“spatial skills” OR “spatial ability” OR “spatial cognition” OR “spatial thinking” OR visuospatial) AND (mathematics OR math OR “math skills” OR “math ability” OR “mathematical reasoning”). Where possible, database search filters were employed prior to screening. For example, when searching the ERIC database, the following filters were used: Peer-Reviewed Journal Publications, and Within 21 Years of the Search Date. To obtain unpublished data, we conducted a search in ProQuest Dissertations and Abstracts. When searching ProQuest Dissertations and Abstracts, we limited our search term to within the abstracts due to the expansive nature of dissertations, which significantly reduces the effectiveness of search terms. In addition to an electronic search of databases, we acquired unpublished data through requests in social media posts and emails to various relevant research groups (e.g., Cognitive Development Society, The Spatial Intelligence and Learning Center network listserv). Furthermore, prominent researchers in the field were contacted directly via email for any current or previous unpublished work.
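The Boolean structure of the search string can be sketched as follows. This is only an illustration of how the three term groups combine; the helper names (`or_group`, `build_query`) are hypothetical, and each database has its own query syntax and filter options that this sketch does not model.

```python
# Term groups copied from the search description above.
INTERVENTION_TERMS = [
    "Intervention", "training", "practice", "class", "enhancement",
    "education", "quasi-experimental", "experimental",
]
SPATIAL_TERMS = [
    "spatial skills", "spatial ability", "spatial cognition",
    "spatial thinking", "visuospatial",
]
MATH_TERMS = [
    "mathematics", "math", "math skills", "math ability",
    "mathematical reasoning",
]


def or_group(terms):
    """Join terms with OR, quoting multi-word phrases, and wrap in parentheses."""
    quoted = [f'"{term}"' if " " in term else term for term in terms]
    return "(" + " OR ".join(quoted) + ")"


def build_query(*groups):
    """Require at least one term from every group by joining groups with AND."""
    return " AND ".join(or_group(group) for group in groups)


query = build_query(INTERVENTION_TERMS, SPATIAL_TERMS, MATH_TERMS)
print(query)
```

A record matches only if it contains at least one term from each of the three groups, which is why the first group narrows the results to intervention-style or educational studies.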

Searching the databases using the Boolean phrases and filters mentioned above resulted in the acquisition of 858 articles. All resulting articles were compiled within an Endnote database and duplicates were removed with 760 unique articles remaining. An additional nine articles were acquired through social media and email requests, resulting in a total of 769 articles.

Abstract screening

All studies found through our initial search procedure were compiled into a numbered study catalogue. In this catalogue, each study was assigned a unique identifier, and only the study title and abstract were displayed for evaluation, in line with best practice (Berman & Parker, 2002). As noted by Berman and Parker (2002), “Failure to blind review could lead to biases similar to those in a record review when subjects are selected by investigators who are not blinded to the outcomes of interest” (p. 4). The abstracts for all acquired studies were then evaluated by the first and second authors to establish whether they included at least one measure of spatial skills and at least one measure of mathematical skills. Studies that did not mention a mathematical measure or a spatial measure in the study abstract were excluded from further analysis. Additionally, studies for which the manuscript was not written in English and an English translation could not be acquired were also excluded from further analysis.

To ensure reliability of the abstract coding and selection process, 25% of the abstracts were randomly selected and screened by both reviewers for inclusion of the appropriate measures. Inter-rater reliability was then calculated for this subset of studies using Cohen’s kappa, which indicated a satisfactory level of agreement (κ = 0.77, p < .01; Altman, 1990). After inter-rater reliability was calculated for the abstract coding, any disagreements in study selection were discussed and resolved prior to final designation. The remaining 75% of the studies were randomly assigned to each reviewer for the completion of the abstract-coding process. During the abstract-evaluation phase, 592 studies were removed from further analysis. After the study abstracts were evaluated, manuscripts coded as included were retained for full manuscript review.
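For readers unfamiliar with the statistic, Cohen’s kappa corrects the observed proportion of agreement between two raters for the agreement expected by chance. The sketch below is a minimal illustration; the function name and the include/exclude decisions are hypothetical and are not the screening data from this study.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who labelled the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must label the same non-empty set of items")
    n = len(rater_a)

    # Observed agreement: share of items given identical labels.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: product of the raters' marginal proportions per label.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    labels = counts_a.keys() | counts_b.keys()
    expected = sum(counts_a[label] * counts_b[label] for label in labels) / (n * n)

    if expected == 1.0:
        raise ValueError("Kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1.0 - expected)


# Hypothetical screening decisions for ten abstracts (not the study's data).
reviewer_1 = ["include"] * 4 + ["exclude"] * 6
reviewer_2 = ["include"] * 3 + ["exclude"] * 7
print(f"kappa = {cohens_kappa(reviewer_1, reviewer_2):.2f}")
```

In this made-up example the reviewers agree on nine of ten abstracts, but more than half of that agreement would be expected by chance alone, so kappa is noticeably lower than the raw agreement rate.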

Full manuscript review and data extraction

One hundred and seventy-seven manuscripts were retained for full manuscript review and data extraction, which was again conducted by the first and second authors. Each manuscript was reviewed and the following information was extracted: (1) measure(s) of spatial skills, mathematical skills, working memory, fluid reasoning, and verbal skills administered; (2) bivariate correlations between measures of spatial skills and measures of mathematical skills; (3) bivariate correlations between measures of spatial skills and measures of working memory, fluid reasoning, and verbal skills; (4) bivariate correlations between measures of mathematical skills and measures of working memory, fluid reasoning, and verbal skills; (5) the reported sample size for each correlation; (6) the age or grade-level of the participants; and (7) the proportion of male relative to female participants. If during full manuscript review, a study was found to be missing information about the spatial skills measure(s) administered, the mathematical skills measure(s) administered, or the bivariate correlation(s) between them, the manuscript authors were contacted for the information. If no response was provided, the study was excluded from further data analysis. Measures of working memory, fluid reasoning, or verbal skills were not required to retain the manuscript for further data analysis. A list of the 45 included manuscripts as well as the study characteristics and measure information extracted from each one (except for the bivariate correlations) is provided in Table 1. A summary of the steps carried out in the abstract screening and full manuscript review components of this study, based on the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) flowchart template (Moher et al., 2015), is outlined in Fig. 2.

Table 1 Articles included in the meta-analyses

Fig. 2

The process of article selection for inclusion in analysis. Note. After carrying out this selection process, data from 45 articles were included for analysis
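The counts reported at each stage of the selection process can be checked against one another with simple arithmetic. In the sketch below, the five input values are taken from the text; the variable names are illustrative, and the numbers of duplicates and of manuscripts excluded at full review are derived here by subtraction rather than reported directly.

```python
# Counts stated in the text.
records_from_databases = 858
unique_after_deduplication = 760
records_from_other_sources = 9   # social media and email requests
removed_at_abstract_screening = 592
articles_included = 45

# Quantities derived by subtraction or addition (not reported directly in the text).
duplicates_removed = records_from_databases - unique_after_deduplication
abstracts_screened = unique_after_deduplication + records_from_other_sources
full_manuscripts_reviewed = abstracts_screened - removed_at_abstract_screening
removed_at_full_review = full_manuscripts_reviewed - articles_included

print(f"Duplicates removed:        {duplicates_removed}")
print(f"Abstracts screened:        {abstracts_screened}")
print(f"Full manuscripts reviewed: {full_manuscripts_reviewed}")
print(f"Removed at full review:    {removed_at_full_review}")
```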

Study characteristics and measures of interest

Participant grade

Both the age of the participants and their grade-level as reported within each study were extracted. If studies collected data on participants at multiple time points, age information at each administration of the relevant measures was extracted. As many studies examining mathematical understanding occur in a classroom setting (e.g., Verdine et al., 2017), if participants’ grade-level was reported without age-related data, then age was approximated based on typical age per grade defined by the National Center for Education Statistics (2018). More studies reported on participants’ grade-level than age, so grade-level was used as a moderator of effect size in the analysis.

Percent of male participants

To examine whether gender moderates the relationship between spatial skills and mathematical skills, the percent of male participants was extracted from each study. Few studies reported the correlation between spatial skills and mathematical skills by gender. Thus, we used the percent of male participants as an indirect indicator, testing whether the magnitude of the correlation between spatial and mathematical skills differs as a function of the percent of male participants.
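To illustrate the general idea of using a study-level proportion as a moderator, the sketch below regresses Fisher-z effect sizes on percent male using inverse-variance weights. It is a deliberately simplified, fixed-effect stand-in and is not the CHE meta-regression model used in this study: it ignores between-study heterogeneity and dependence among effect sizes from the same sample. The function name, the approximation of the sampling variance as 1/(n − 3), and all data values are assumptions made for illustration.

```python
import math


def weighted_meta_regression(effects, variances, moderator):
    """Inverse-variance weighted least squares with a single moderator.

    Returns (intercept, slope) for the model: effect = intercept + slope * moderator.
    """
    weights = [1.0 / v for v in variances]
    total_weight = sum(weights)
    x_bar = sum(w * x for w, x in zip(weights, moderator)) / total_weight
    y_bar = sum(w * y for w, y in zip(weights, effects)) / total_weight
    sxx = sum(w * (x - x_bar) ** 2 for w, x in zip(weights, moderator))
    if sxx == 0:
        raise ValueError("The moderator must vary across studies")
    sxy = sum(
        w * (x - x_bar) * (y - y_bar)
        for w, x, y in zip(weights, moderator, effects)
    )
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Hypothetical studies: correlation, sample size, percent male (not real data).
correlations = [0.25, 0.30, 0.35, 0.28]
sample_sizes = [53, 103, 203, 78]
percent_male = [40.0, 50.0, 60.0, 45.0]

z_values = [math.atanh(r) for r in correlations]   # Fisher's z transformation
z_variances = [1.0 / (n - 3) for n in sample_sizes]  # assumed approximation

intercept, slope = weighted_meta_regression(z_values, z_variances, percent_male)
print(f"Change in Fisher z per percentage point male: {slope:.4f}")
```

A slope near zero would suggest that the spatial-mathematics correlation does not vary with the gender composition of the sample, although a study-level proportion cannot establish whether the relation differs between individual boys and girls.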

Spatial skills

Tasks that were considered as measures of spatial skills assessed participants’ skills in visualizing and/or mentally manipulating objects or figures or navigating spaces (Atit, Uttal, & Stieff, 2020b; Newcombe & Shipley, 2015). Common examples of measures of spatial skills include the Mental Rotations Test (Peters et al., 1995; Vandenberg & Kuse, 1978), the Wechsler Preschool and Primary Scale of Intelligence (WPPSI-IV) Block Design subtest (Wechsler, 2012), and the Paper Folding test (Ekstrom et al., 1976).

Mathematical skills

Tasks that were considered measures of mathematical skills assessed participants’ skills in reasoning about “numbers and their operations, interrelations, combinations, generalizations, and abstractions and of space configurations and their structure, measurement, transformations, and generalizations” (Merriam Webster, n.d.). Examples of measures assessing mathematical skills include assessments of arithmetic (e.g., Ackerman & Wolman, 2007), geometry tests (e.g., Bonny & Lourenco, 2015), and the mathematics portion of the Criterion Referenced Competency Test (e.g., Carr et al., 2008).

Working memory capacity

Tasks that were considered measures of working memory capacity assessed participants’ ability to both retain and process information simultaneously (Conway et al., 2003). Examples of working memory measures include Backwards Digit Span (Grégoire & Van der Linden, 1997), Automated Symmetry Span (Oswald et al., 2015), and the Corsi Block Task (Corsi, 1972; Miyake et al., 2001).

Fluid reasoning

Fluid reasoning tasks measured participants’ ability to flexibly and deliberately solve novel problems without using previous information. More specifically, they measured participants’ ability to analyze novel problems, identify patterns and relationships that underpin these problems, and apply logic (Schneider & McGrew, 2012). Common measures of fluid reasoning include the Matrix Reasoning from the Perceptual Reasoning Index in the Wechsler Intelligence Scale for Children, Fourth Edition (WISC-IV; Flanagan & Kaufman, 2004), the Culture Fair test (Cattell, 1971), and the Diagramming Relationships Test (Ekstrom et al., 1976).

Verbal skills

Measures of verbal skills assess participants’ vocabulary, verbal comprehension skills, and/or verbal reasoning capacity, and are strong predictors of achievement in school and the ability to learn in non-school settings (Gottfredson & Deary, 2004). Examples of verbal measures include the WAIS-R Information (Wechsler, 1981), the Woodcock-Johnson III Picture Vocabulary (Woodcock et al., 2001), and the verbal portion of the SAT (SAT-V; College Board, 2020).

Effect size measures

Correlation coefficients representing the bivariate correlation between spatial skills and mathematical skills, spatial skills and a measure of domain-general reasoning skills (i.e., fluid reasoning, working memory capacity, and verbal skills), and mathematical skills and a measure of domain-general reasoning skills were used as measures of effect size for this study. Prior to conducting any analyses, for our first meta-analysis (examining the magnitude of the relation between spatial skills and mathematical skills), we converted all correlation coefficients to the Fisher’s z scale (Borenstein et al., 2005). After the analysis, all the results in Fisher’s z scale were then converted back to correlation coefficients for the ease of interpretation. For our second meta-analysis, a MASEM approach was used to examine the contribution of domain-general reasoning skills to the relations between spatial skills and mathematical skills, and the original bivariate correlation coefficients were utilized.
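The Fisher’s z conversion described above can be sketched as follows. This is a minimal illustration rather than the authors’ analysis code: the function names and the example values (r = .40, n = 103) are hypothetical, and the variance formula 1/(n − 3) is the standard large-sample approximation for Fisher’s z.

```python
import math


def fisher_z(r: float) -> float:
    """Transform a correlation r to the Fisher's z scale (inverse hyperbolic tangent)."""
    return math.atanh(r)


def fisher_z_variance(n: int) -> float:
    """Approximate sampling variance of Fisher's z for a sample of size n.

    Unlike the variance of r itself, this does not depend on the size of the correlation.
    """
    return 1.0 / (n - 3)


def z_to_r(z: float) -> float:
    """Convert a Fisher's z value back to the correlation metric for interpretation."""
    return math.tanh(z)


# Hypothetical primary study: r = .40 observed in a sample of n = 103
r_example, n_example = 0.40, 103
z_example = fisher_z(r_example)
v_example = fisher_z_variance(n_example)
r_back = z_to_r(z_example)
```

Analyses are run on `z_example` with `v_example` as its sampling variance, and pooled results are passed through `z_to_r` before being reported.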

Analysis

Study 1: What is the magnitude of the relation between spatial skills and mathematical skills?

To answer the above research question, we first fit an intercept-only random-effects model and estimated the pooled correlation coefficient between spatial and mathematical skills. Next, we used a meta-regression model to examine whether grade and gender (i.e., proportion of males in the primary study sample) explained further variability in effect sizes across studies. Because the unit of analysis is the study, grade and gender are study-level sample characteristics, not characteristics of individual participants.

Given that there is more than one kind of spatial skill and, thus, more than one kind of spatial test (Linn & Petersen, 1985; Newcombe & Shipley, 2015), and given that there is more than one kind of mathematical domain, many of the studies included in our meta-analyses administered more than one measure of spatial skills or mathematical skills. Therefore, multiple effect sizes between different types of spatial tests and measures of mathematical skills were reported based on the same sample. To account for the dependent structure of effect sizes within studies, robust variance estimation (RVE) with the small-sample correction technique (Hedges et al., 2010; Tipton, 2015) was used. Specifically, we incorporated an extended working model for RVE that models the correlated and hierarchical structure of effect size estimates, called the Correlated and Hierarchical Effects (CHE) working model (Pustejovsky & Tipton, 2021). Many studies included in the synthesis report multiple correlations from the same participants (e.g., the correlation between algebra performance and mental rotation skills and algebra performance and perspective-taking skills) leading to dependencies among correlations nested within individuals and studies. The CHE working model represents our data structure and is described by Pustejovsky and Tipton (2021) as the most broadly applicable model in social science systematic reviews. In applying CHE to the data, we assumed a constant correlation of .8 for the effect sizes within studies.
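The constant within-study correlation of .8 assumed above can be illustrated by building the working sampling covariance matrix for one study. This is a sketch under stated assumptions, not the authors’ code: the function name is hypothetical, and the example study (three correlations from one sample of n = 103, each with Fisher’s z variance 1/(n − 3)) is invented for illustration.

```python
import math


def working_covariance(variances, rho=0.8):
    """Build a within-study sampling covariance matrix.

    Diagonal entries are the sampling variances; off-diagonal entries assume a
    constant correlation rho between the sampling errors of any two effect sizes
    from the same study.
    """
    k = len(variances)
    return [
        [
            variances[i] if i == j else rho * math.sqrt(variances[i] * variances[j])
            for j in range(k)
        ]
        for i in range(k)
    ]


# Hypothetical study reporting three spatial-math correlations from one sample of n = 103
v = [1.0 / (103 - 3)] * 3
V = working_covariance(v, rho=0.8)
```

In the CHE working model, one such block per study feeds the estimation alongside between-study and within-study random effects, and robust standard errors guard against the assumed value of .8 being wrong.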

The random-effects meta-regression models were fit using the CHE working model to estimate the mean correlation between spatial skills and mathematical skills, and to examine whether the correlation between spatial skills and mathematical skills was moderated by grade or gender. The R packages metafor (Viechtbauer, 2010) and clubSandwich (Pustejovsky, 2020) were used for the analyses, and the variance components were estimated using restricted maximum likelihood (REML).

Study 2: How do domain-general reasoning skills influence the relation between spatial skills and mathematical skills?

To examine whether domain-general reasoning skills account for some of the relation between spatial skills and mathematical skills, we hypothesized a path model with verbal skills and fluid reasoning as mediators of the relationship from spatial skills to mathematical skills, as seen in Fig. 3. We had originally planned to include working memory in the path model, but we removed it because too few studies reported correlations with working memory. For example, the nine correlation coefficients involving working memory and verbal skills are derived from only three studies (see Table 2). This adversely affects the degrees of freedom of the pooled correlation coefficients: the degrees of freedom were less than 4, which indicates unreliable estimates when using RVE (Tipton, 2015).

Fig. 3

Path model examining the relations between spatial skills and mathematical skills. Note. This figure shows the paths between spatial skills, mathematical skills, and general reasoning skills (i.e., fluid reasoning and verbal skills) examined using a modified TSSEM approach for MASEM.

Table 2 The number of correlations, the number of studies, and the total average number of participants for each pairing

We used MASEM with a two-stage structural equation modeling (TSSEM) process to estimate the path model (Cheung & Chan, 2005). In the first stage of MASEM, the correlation matrices are pooled across studies. In the second stage, the pooled correlation matrix from the first stage is used to fit the structural equation model. In the first stage, we implemented RVE with the small-sample correction using the CHE working model (Pustejovsky & Tipton, 2021) to pool the correlations in order to incorporate dependent effect-size estimates within study. Our approach differs from a standard TSSEM application, which assumes one correlation matrix is reported per study, a limitation of current MASEM applications (Wilson et al., 2016). Thus, we implemented RVE so that the variance-covariance matrix of the pooled correlation matrix takes into account the dependent structure of correlations within studies. As in the meta-regression analysis, the R packages metafor (Viechtbauer, 2010) and clubSandwich (Pustejovsky, 2020) were used. In the second stage, we fit the path model using the 4 × 4 pooled correlation and variance-covariance matrices. We used the metaSEM R package (Cheung, 2015) to estimate the path model provided in Fig. 3.
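To make the direct and indirect paths in the second stage concrete, the sketch below decomposes a correlation into a direct effect and two indirect effects using standardized regressions on a correlation matrix. It is a simplified illustration only: the correlations are hypothetical (not the values in Table 5), the equation-by-equation solution stands in for the weighted estimation performed by metaSEM, and the stage-1 variance-covariance matrix is ignored.

```python
def solve3(a, b):
    """Solve the 3x3 linear system a @ x = b using Cramer's rule."""
    def det(m):
        return (
            m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0])
        )

    d = det(a)
    solution = []
    for col in range(3):
        m = [row[:] for row in a]
        for i in range(3):
            m[i][col] = b[i]
        solution.append(det(m) / d)
    return solution


# Hypothetical pooled correlations (s = spatial, v = verbal, f = fluid, m = math)
r_sv, r_sf, r_vf = 0.30, 0.40, 0.30
r_sm, r_vm, r_fm = 0.35, 0.30, 0.35

# Paths from spatial skills to each mediator (single-predictor standardized slopes)
a_verbal, a_fluid = r_sv, r_sf

# Math regressed on spatial, verbal, and fluid: solve R_xx @ beta = r_xy
predictor_corr = [
    [1.0, r_sv, r_sf],
    [r_sv, 1.0, r_vf],
    [r_sf, r_vf, 1.0],
]
direct, b_verbal, b_fluid = solve3(predictor_corr, [r_sm, r_vm, r_fm])

# Indirect effects as products of the paths into and out of each mediator
indirect_verbal = a_verbal * b_verbal
indirect_fluid = a_fluid * b_fluid
total = direct + indirect_verbal + indirect_fluid
```

In this simplified setup the direct and indirect effects sum to the spatial-math correlation, which is why a direct path that stays sizable after adding mediators is informative.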

Publication bias and selective reporting

To examine possible publication bias in the data, we implemented a modified version of Egger’s test of funnel plot asymmetry (Egger et al., 1997), known as Egger’s sandwich (Rodgers & Pustejovsky, 2020), that can be used when dependent effect sizes exist in the meta-analysis data. Egger’s sandwich uses the same approach as our meta-analysis (i.e., RVE with the CHE working model). Fisher’s z scale effect sizes were used to reduce any artifactual correlation between the effect size and its variance estimate (i.e., its measure of precision; see Footnote 1). The results suggested that funnel plot asymmetry was not present (p = .642), indicating no evidence of publication bias.

Results

Sampling characteristics

Of the 45 studies examined for the two meta-analyses, 18 were conducted with participants outside of the USA. A summary of the countries from which participants were recruited is provided in Table 3. Regarding the distribution of studies across educational levels, 18 studies were conducted with participants in preschool/primary grades (i.e., preschool to fifth grade), 18 with participants in secondary grades (sixth–12th grades), and one with participants in both preschool/primary and secondary grades. The remaining eight studies were conducted with participants at the post-secondary level.

Table 3 Summary of the study locations for the studies included in the meta-analyses

Descriptive

In total, the data include 568 correlation coefficients across ten types of correlations from 45 studies. Table 2 shows the number of reported correlation coefficient effect sizes and the number of studies by type of correlation. Again, because the unit of analysis is the study, not the individual participant, the number of studies is the relevant sample size in the meta-analysis. Note that for Study 1, we used only the subset of correlations between spatial and mathematical skills.

Study 1

In Study 1, we estimated the pooled correlation between spatial and mathematical skills and examined whether grade and gender moderated this relationship. To estimate the pooled correlation, we fit an intercept-only meta-regression under the CHE working model. The pooled effect size in the correlation coefficient metric was .358 (95% CI [.295, .419]), which is statistically significantly different from zero (on the Fisher’s z scale, \( \hat{\beta} \) = .375, SE = .035, t(43.277) = 10.635, p < .001). The estimated between-study heterogeneity (τ²) was .039 and the estimated within-study heterogeneity (ω²) was .017.
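The link between the Fisher’s z estimate and the reported correlation can be checked with a short back-transformation. The estimate (.375) and standard error (.035) are the values reported above; the critical value of 2.016 is an assumption, taken as an approximate two-sided 95% t quantile for about 43.3 degrees of freedom. Results should land close to the reported .358 and [.295, .419], with small differences expected because the inputs are rounded.

```python
import math

beta_z = 0.375  # pooled estimate on the Fisher's z scale (reported)
se_z = 0.035    # robust standard error on the Fisher's z scale (reported)
t_crit = 2.016  # assumed approximate 97.5th percentile of t with ~43.3 df

# Back-transform the point estimate and the confidence limits to the r metric
pooled_r = math.tanh(beta_z)
ci_lower = math.tanh(beta_z - t_crit * se_z)
ci_upper = math.tanh(beta_z + t_crit * se_z)
```

Because the interval is built on the z scale and then transformed, it is not symmetric around the pooled correlation.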

Next, we conducted a meta-regression using grade and proportion of males as moderator variables. There were three missing values in grade from two studies and five missing values in proportion of males from four studies among 181 effect size estimates. The effect sizes with missing moderators were deleted to fit the meta-regression model. Thus, 173 correlation coefficients from 41 studies were used to examine the effects of grade and proportion of males. The results of the meta-regression model are presented in Table 4. Neither moderator was significantly related to effect size. The degrees of freedom for the test of proportion of males in the sample is less than 4, indicating little information in the data about the relationship between the proportion of males in the sample and the correlation between spatial skills and mathematical skills. The proportion of males in the studies has a mean of 0.496 with a standard deviation of 0.13, indicating little variation among studies in this variable. As discussed by Hedges et al. (2010) as well as Tipton (2015), results from RVE with small degrees of freedom should be interpreted cautiously. In sum, results indicated a positive moderate association between spatial skills and mathematical achievement. However, neither grade nor gender significantly moderated the relationship.

Table 4 Results of a meta-regression examining the effect of age and gender on the relations between mathematics and spatial skills

Study 2

In Study 2, we examined a path model of the relationship between spatial and mathematical skills, with fluid reasoning and verbal skills as mediators. In this set of analyses, the 4 × 4 correlation matrix of these four variables is the unit of analysis.

For the first stage of TSSEM, we pooled correlation matrices across studies and estimated the variance-covariance matrix using the CHE model given the presence of dependent correlations within studies (Pustejovsky & Tipton, 2021). Table 5 provides the pooled correlation matrix and Table 6 provides its corresponding standard error estimates. The pooled correlation coefficients among four variables ranged between .277 (verbal skill and fluid reasoning) and .418 (spatial skill and fluid reasoning). All six correlation coefficients were statistically significant at α = .05 and generally moderate in magnitude.

Table 5 The pooled correlation matrix in the lower triangle and its standard error estimates in the upper triangle
Table 6 Correlation estimates between spatial skills, mathematical skills, and verbal skills

We also examined whether the pooled correlation coefficients differed due to the moderators of grade and gender (i.e., proportion of males in a sample). Neither moderator was significantly related to the magnitude of the correlation coefficients. Thus, we used a single pooled correlation matrix to fit a path model in the second stage.

In the second stage of TSSEM, we fit a path model using the pooled correlation matrix and variance-covariance matrix from the first stage. The total sample size used in this stage was 17,824.80, obtained by averaging sample sizes within studies and summing across studies. The total averaged sample size within study per pooled correlation coefficient is presented in Table 2. The second stage ensures that the parameter estimates take into account the precision of each pooled correlation coefficient, which is based on a different sample size. The hypothesized path model included a direct path from spatial skills to mathematical skills and indirect paths through verbal skills and fluid reasoning (see Fig. 3). We implemented likelihood-based confidence intervals (Neale & Miller, 1997) to test indirect and direct effects. Cheung (2015) suggested RMSEA and SRMR to evaluate the second stage of TSSEM; RMSEA less than or close to 0.06 and SRMR less than or close to 0.08 support good fit (Hu & Bentler, 1999). The approximate goodness-of-fit indices for the model were mixed: the root mean square error of approximation (RMSEA; Browne & Cudeck, 1993) was .043 (95% CI [.031, .056]), within the recommended range, whereas the standardized root mean square residual (SRMR; Hu & Bentler, 1999) was .110, above it. Given the exploratory nature of this analysis, we treat the interpretation of the path coefficients, including direct and indirect effects among the skills, as suggestive. Table 7 shows the results of the path model including direct and indirect effects. All direct and indirect paths were statistically significant (p < .05). The strongest direct path was from spatial skills to fluid reasoning. The direct path from spatial skills to mathematical skills remained significant, even after accounting for the indirect paths that included fluid reasoning or verbal skills.
Fluid reasoning mediated more of the effect from spatial skills to mathematical skills than did verbal skills, and the relation that remained between spatial skills and mathematical skills was larger than both of these indirect relations. Figure 3 shows the path model with the path coefficients.
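The total sample size used in the second stage (averaging within studies, then summing across studies) can be sketched as below. The study labels and sample sizes are hypothetical and the function name is an assumption; the point is only to show why the resulting total need not be a whole number.

```python
from statistics import mean


def total_average_sample_size(sample_sizes_by_study):
    """Average the sample sizes reported within each study, then sum across studies."""
    return sum(mean(sizes) for sizes in sample_sizes_by_study.values())


# Hypothetical studies, each contributing one or more correlations
# computed on possibly different numbers of participants
studies = {
    "study_a": [100, 120],
    "study_b": [250],
    "study_c": [80, 90, 100],
}
total_n = total_average_sample_size(studies)
```

Averaging within a study first keeps a study that reports many correlations from being counted as many independent samples.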

Table 7 The results of the analyzed path model

Discussion

This study investigated the relations between spatial skills and mathematical skills. Furthermore, it examined the effect of gender and age on this association, and the role of domain-general reasoning skills in the relations between these two constructs. Results from synthesizing the reported effect sizes between spatial skills and mathematical skills from 45 studies revealed a positive moderate association between the two skill sets (r = .36, robust standard error = 0.035, τ² = 0.039). However, no significant effect of gender or grade on the association was found.

By implementing the meta-analytic SEM (MASEM) approach, the current study was able to model directional paths among the variables with direct and indirect effects simultaneously. In addition, we utilized RVE with the CHE working model approach (Pustejovsky & Tipton, 2021) to handle dependent effect sizes within studies when estimating the pooled correlation matrix and variance-covariance matrix. Results indicated that fluid reasoning and verbal skills mediated the relationships between spatial skills and mathematical skills, and these indirect effects were small but statistically significant. In addition to the indirect effects, the direct relation from spatial skills to mathematical skills was also statistically significant and larger than the indirect effects.

Consistent with prior research, this meta-analysis confirms that spatial skills are significantly related to mathematical skills (Atit, Power, et al., 2020a; Casey et al., 1995; Mix et al., 2016, 2017; Verdine et al., 2017). More specifically, our study verifies findings of previous studies conducted using experimental, factor analytic, and meta-analytic methods showing that spatial skills and mathematical skills have a direct positive association (e.g., Lombardi et al., 2019; Mix et al., 2016, 2017; Xie et al., 2019).

The magnitude of the relation between spatial skills and mathematical achievement found in our study (r = .36) is meaningful as it is consistent with findings of existing meta-analyses examining the relations between various cognitive skills and mathematical achievement. For instance, Jacob and Parkinson (2015) used meta-analytic techniques to identify the relation between executive functioning skills and mathematical achievement in children aged 2–18 years. Their results showed that executive functioning skills and mathematical achievement are positively and moderately correlated (r = 0.31). Similarly, Peng et al. (2019) used meta-analytic techniques to determine the relation between fluid intelligence and mathematical achievement in individuals aged 3–80 years. Their analyses revealed a positive moderate association between fluid intelligence and mathematical achievement (r = 0.41). Lastly, our findings replicate the findings of the meta-analysis conducted by Xie et al. (2019) who also found a moderate positive relation between spatial skills and mathematical achievement (r = 0.27). These findings suggest that spatial skills are similarly related to students’ mathematical understanding as other cognitive processes, such as executive functioning skills and fluid intelligence. Future research should examine how to leverage these pertinent cognitive skill sets to bolster students’ mathematics understanding and performance.

However, unlike many studies closely examining the relationship between spatial skills and mathematical achievement (e.g., Casey et al., 1997; Li & Geary, 2013; Stannard et al., 2001), we found that the association between these two skill sets was not influenced by participants’ grade or gender. It is well established that there is more than one kind of spatial skill (e.g., Linn & Petersen, 1985; Newcombe & Shipley, 2015) and more than one kind of mathematical domain (American Mathematical Society, 2020). Many of the studies investigating the effects of gender and age/grade on the association between the two constructs focus only on one kind of spatial skill (e.g., Casey et al., 2017; Li & Geary, 2013) and/or one kind of mathematical domain (e.g., Lombardi et al., 2019). In our study, we did not examine whether the relations differed for different kinds of spatial skills or different kinds of mathematical concepts. The meta-analysis conducted by Xie et al. (2019), which did consider domains of spatial skills and mathematical skills in their analyses, did not examine the effect of participants’ age/grade or gender for each subarea. Thus, the nuances of how gender and age/grade influence the relations between spatial skills and mathematical skills have yet to be understood. Future meta-analyses should aim to disentangle how different kinds of spatial skills are related to different kinds of mathematical skills, and identify the role of age/grade and gender in these various associations.

In line with findings on the importance of executive functioning skills in mathematical achievement (e.g., Green et al., 2017; McGrew & Wendling, 2010), our study found that fluid reasoning skills were an important mediator of the relation between spatial and mathematical skills. In a synthesis of studies investigating the concurrent relationships between various cognitive abilities and achievement measures, McGrew and Wendling (2010) found that fluid reasoning was one of the three broad cognitive abilities (fluid reasoning, verbal comprehension, and processing speed) that was consistently related to mathematical performance at all age ranges. Furthermore, in a longitudinal study, Green et al. (2017) found that fluid reasoning, spatial skills, and verbal skills accounted for 90% of the variance in future math achievement in individuals ranging from 5 to 15 years of age, with fluid reasoning skills being the only significant predictor. In our study, the indirect relation from spatial skills to fluid reasoning to mathematical skills was larger in magnitude than the indirect relation from spatial skills to verbal skills to mathematical skills. However, the direct path from spatial skills to mathematical skills was larger than the two indirect paths. These results are consistent with Kovacs and Conway’s (2016) process overlap theory, which posits that domain-general executive processes, such as fluid reasoning, underlie performance on domain-specific cognitive tasks, such as spatial and mathematical measures. There were not enough studies that measured working memory capacity to include the construct in our estimated model. Thus, an area of focus for future research is to ascertain the role of additional executive functioning processes, such as working memory capacity and inhibition (Miyake et al., 2000), in the relations between spatial skills and mathematical performance.

One limitation of this meta-analysis is that while we tried to include all relevant studies in our analysis, some eligible studies may have been missed due to our search strategy. We used generalized terms such as “spatial skills” or “spatial ability” or “mathematics” or “math skills.” However, as many studies focus on a specific kind of spatial skill or mathematical skill, they may have only used such specific terms, such as mental rotation or algebra, in their articles. Therefore, future studies should consider using a broad variety of search terms to reduce the possibility of missing eligible articles. Secondly, the small number of effect sizes limited our ability to examine the relations between different kinds of spatial skills and different kinds of mathematical domains. Thus, more effect sizes between sub-constructs are needed to examine these relationships in future studies.

Another limitation of this research is that the studies included in these meta-analyses were largely from the USA and other Western nations. Only seven studies were conducted in Eastern countries (i.e., China and Russia). The lack of international diversity among participants in the selected studies potentially biases our results, as existing data indicate that mathematical achievement varies greatly between students from Eastern and Western nations (Mullis et al., 2012). Thus, more primary studies examining the relations between spatial skills and mathematical skills need to be conducted in samples across a broader range of countries. Furthermore, the search and inclusion criteria for literature to be examined in future meta-analyses need to be revised, perhaps by expanding the literature search to studies published in languages other than English, so that a broader range of international studies is represented in the analyses.

The findings from these meta-analyses have multiple implications for mathematics education research and practice. First, this research highlights that integrating practices that develop and support students’ spatial skills could benefit students’ mathematical understanding at all educational levels. Many of the efforts aimed at improving students’ mathematical achievement by way of bolstering their spatial skills have been carried out with students at the preschool/primary and secondary levels (e.g., Cheng & Mix, 2014; Hawes et al., 2015; Lowrie et al., 2017; Lowrie et al., 2019; Schmitt et al., 2018). Only a handful of efforts have been made to bolster students’ mathematical outcomes by improving their spatial skills at the postsecondary level (e.g., Sorby, 2007; Sorby et al., 2013). However, our research shows that the relations between spatial skills and mathematical skills do not vary by participants’ grade level. Therefore, future research should examine how to support mathematics instructors at the postsecondary level in developing and supporting their students’ spatial skills.

Second, while our findings underline the importance of developing students’ spatial skills to support their mathematics learning and achievement, efforts should also focus on developing students’ domain-general cognitive processes, such as their verbal skills and fluid reasoning skills. Our research shows that domain-general cognitive processes account for part of the relation between spatial skills and mathematical skills. This suggests that bolstering spatial skills in conjunction with fluid reasoning and verbal skills may provide a greater boost to students’ mathematical achievement than developing spatial skills alone. Thus, future research should aim to identify pedagogical practices focused on developing students’ domain-general reasoning skills as well as their spatial skills, which can be integrated into preschool to postsecondary mathematics curricula.

Conclusions

In conclusion, the current meta-analysis affirms that spatial skills and mathematical skills are positively related and, moreover, there is a direct relationship between the two constructs that did not vary based on grade or gender. Furthermore, we found that other cognitive processes, specifically fluid reasoning skills and verbal skills, mediated the relations between spatial skills and mathematical skills, and yet a direct relation between the two constructs independent of other general cognitive processes remained. These findings indicate that efforts to improve mathematical skills should include bolstering spatial skills but may also consider support for other pertinent cognitive skill sets.