### QPSS: quantitative phenotype scan statistic

Assume a study comprising *n* subjects, each with a continuous outcome of interest *y*_{i} (*i* = 1, …, *n*). Let *m*_{G} denote the total number of rare variants in a genetic region of interest *G*, where “rare” is defined by a specified cutoff value (e.g., MAF < 5%). Variants are sorted by physical position in ascending order. We set a sub-window *W* containing *m*_{W} variants (*m*_{W} < *m*_{G}) within the genetic region *G* (*i*. *e*., *W* ⊆ *G*). Let *n*_{G} be the number of individuals who carry at least one rare variant within the genetic region *G* (*n*_{G} ≤ *N*). Of the *n*_{G} rare variants carriers, \( {n}_{W_{+}}\left(\le {n}_G\right) \) carry a rare variant in the window W. Similarly, \( {n}_{W_{-}}\Big(={n}_G-{n}_{W_{+}} \)) do not carry a rare variant within window W. We can then partition the maximum likelihood estimate of the trait variance \( {\sigma}_W^2 \) among the *n*_{G} rare variant carriers as

$$ {\hat{\sigma}}_W^2=\frac{1}{n_G}\left\{\sum \limits_{i\in \left\{1,\dots, {n}_{W_{+}}\right\}}{\left({y}_i-{\hat{\mu}}_{W_{+}}\right)}^2+\sum \limits_{i\in \left\{1,\dots, {n}_{W_{-}}\right\}}{\left({y}_i-{\hat{\mu}}_{W_{-}}\right)}^2\right\} $$

where

$$ {\hat{\mu}}_{W_{+}}=\frac{1}{n_{W_{+}}}\sum \limits_{i\in \left\{1,\dots, {n}_{W_{+}}\right\}}{y}_i $$

$$ {\hat{\mu}}_{W_{-}}=\frac{1}{n_{W_{-}}}\sum \limits_{i\in \left\{1,\dots, {n}_{W_{-}}\right\}}{y}_i $$

Under the null hypothesis that variants within the sub-window *W* are equally as likely to correlate with the disease outcome that those outside the window (but still within the region of interest *G*), the outcome variance should be similar between the \( {n}_{W_{-}} \) and \( {n}_{W_{+}} \) subjects so that the pooled variance would equal the overall variance, that is,

$$ {\sigma}_W^2={\sigma}_0^2 $$

where \( {\sigma}_0^2 \) is the variance under the null hypothesis and its maximum likelihood estimate is

$$ {\hat{\sigma}}_0^2=\frac{1}{n_G}\sum \limits_{i\in \left\{1,\dots, {n}_G\right\}}{\left({y}_i-{\hat{\mu}}_0\right)}^2 $$

$$ {\hat{\mu}}_0=\frac{1}{n_G}\sum \limits_{i\in \left\{1,\dots, {n}_G\right\}}{y}_i $$

Under the alternative hypothesis, we expect the outcome values of the \( {n}_{W_{+}} \) subjects carrying a rare variant within *W* to be more similar to each other than to the rare variant carrying subjects whose variants are all outside *W*, that is \( {\sigma}_W^2<{\sigma}_0^2 \). Using these definitions, we calculate the following log likelihood ratio test statistic

$$ \ln {\hat{LR}}_W=\frac{n_G}{2}\ln \frac{{\hat{\sigma}}_0^2}{{\hat{\sigma}}_W^2} $$

for the window *W* (see Supplementary Method 1 for full details).

In order to distinguish between risk and protective clusters, we can adjust the log likelihood ratio test statistic based on the estimated effect direction. For instance, Kulldorff et al. suggested the simple indicator function \( I\left({\hat{\mu}}_{W_{+}}>{\hat{\mu}}_{W_{-}}\right) \) for risk clusters with high values of the outcome, which effectively removes from consideration any window *W* where the trait mean among subjects in *W*_{+} is less that that for subjects in *W*_{−}. Similarly, \( I\left({\hat{\mu}}_{W_{+}}<{\hat{\mu}}_{W_{-}}\right) \) can be used for protective clusters with low values of the outcome [15]. To evaluate both positive and negative associations of clusters with the phenotype simultaneously, we can employ the sign function \( \mathit{\operatorname{sgn}}\left({\hat{\mu}}_{W_{+}}-{\hat{\mu}}_{W_{-}}\right) \), that is, an indicator of +1 for \( {\hat{\mu}}_{W_{+}}>{\hat{\mu}}_{W_{-}} \) and −1 for \( {\hat{\mu}}_{W_{+}}<{\hat{\mu}}_{W_{-}} \). The window harboring rare variants associated with a phenotype can be identified as the window that maximizes the log likelihood ratio, i.e.,

$$ \underset{W}{\max}\left|\ln {\hat{LR}}_W\times \mathit{\operatorname{sgn}}\left({\hat{\mu}}_{W_{+}}-{\hat{\mu}}_{W_{-}}\right)\right|. $$

The null distribution of the test statistic is unknown; thus, *p*-values are calculated by a permutation approach [16]. To minimize computation time, we applied a GPD approximation [17] (see Supplementary Method 2 for full details) for estimating p-value of permutation test in the application study.

### Simulation study

#### Genotype data

We generated 10,000 haplotypes of a 1 Mb genomic region from a European population as implemented in the software Cosi2 (https://software.broadinstitute.org/mpg/cosi2/) [18]. We randomly extracted haplotype pairs to create genotypes for sample sizes (n) of 500 and 1000. We then removed common variants (MAF > 0.05) and singletons.

#### Phenotype data

We considered three cluster sizes: small (200 basepairs (bp)) and moderate (500 bp), both of which have consecutively located disease-associated rare variants; and large (2 kb) where 20% of rare variants are disease-related (thus, not consecutive). In each of the clusters, we randomly chose a start position and then generated quantitative phenotypes from the model

$$ {y}_i=\sum \limits_{j=1}^{m_C}{\beta}_j{g}_{ij}+{\varepsilon}_i $$

where *β*_{j} = *c*|log_{10}*MAF*_{j}|, *m*_{C} is the number of variants in the cluster, *g*_{ij} is the additively-coded genotype (*g*_{ij} = 0, 1, 2 according to the minor allele count) for the *i* th individual at the *j* th variant (*j* = 1, …, *m*_{C}), and *ε*_{i} is the error for the *i* th individual generated from a standard normal (i.e., *ε*_{i} ∼ *N*(0, 1)). We set *c* = 0.2, *c* = 0.4 and *c* = 0.6 for the empirical power simulations, and *c* = 0 for the type I error simulations. For each scenario, we generated 1000 simulation replicates.

Specification of the overall genetic region *G* and the subwindows *W* are flexible. To find a window that maximizes the log likelihood ratio, we thus employed a sliding window approach considering several different window sizes. We used windows sizes of 5 k, 2 k, 1 k, and 500 bp, and then slid each of those windows by half its respective size (i.e., by 2.5 k, 1 k, 500, and 250 bp). These scenarios are depicted both graphically and via a table (Fig. 1 and Supplementary Table 1). To provide a better sense of the effect sizes examined, the means of the simulated phenotypes (Supplementary Figure 1) and the estimates of genetic heritability (Supplementary Figure 2) are shown for each scenario.