Abstract

The neutrino mass hierarchy determination ( MHD) is one of the main goals of the major current and future neutrino experiments. The statistical analysis usually proceeds from a standard method, a single-dimensional estimator that shows some drawbacks and concerns, together with a debatable strategy. The drawbacks and considerations of the standard method will be explained through the following three main issues. The first issue corresponds to the limited power of the standard method. The estimator provides us with different results when different simulation procedures were used. Regarding the second issue, when and are drawn in a 2D map, their strong positive correlation manifests as a bidimensional variable, instead of a single-dimensional estimator. The overlapping between the distributions of the two hypotheses leads to an experiment sensitivity reduction. The third issue corresponds to the robustness of the standard method. When the JUNO sensitivity is obtained using different procedures, either with as one-dimensional or as two-dimensional estimator, the experimental sensitivity varies with the different values of the atmospheric mass, the input parameter. We computed the oscillation of with the input parameter values, . The MH significance using the standard method, , strongly depends on the values of the parameter . Consequently, the experiment sensitivity depends on the precision of the atmospheric mass. This evaluation of the standard method confirms the drawbacks.

1. Introduction

Neutrino oscillation is a quantum mechanical phenomenon in which neutrino flavor changes spontaneously to another flavor. According to the standard 3 neutrino paradigm, neutrinos come with three flavors, , , and , and with three , , and mass eigenstates [1]. Although neutrinos were introduced over 80 years ago, their properties remain to a large extent unknown [2]. Some of the –paradigm fundamental parameters are still missing until now like the absolute masses of neutrinos [3], the amount of the possible leptonic charge parity violation (CPV) [4], the Dirac or Majorana neutrino nature [5], and the neutrino mass ordering [6].

Currently, the determination of the neutrino mass ordering using reactor neutrino spectrum is pursued by several experiments and proposals. There are some challenges facing anyone that tries to solve this problem. First, its evaluation from reactor experiments is based on the tiny interference effect between the and oscillations [7]. Second, current analyses require several years of data taking and an extreme energy resolution to achieve anyhow less than 5. Third, the sensitivity may depend on the input values of the oscillation parameters used by the global fits on the oscillation analysis. In particular, the neutrino atmospheric mass may have different values for normal ordering (NH) or inverted one (IH). The answer to the third point depends on the used analysis method. It is mandatory to establish the robustness of all these analyses.

The Jiangmen Underground Neutrino Observatory (JUNO) experiment [8] has been proposed and approved for realization in the south of China, being the mass ordering (MO) evaluation one of its main goals. JUNO will allow to single out one of the missing fundamental information, the neutrino mass hierarchy, in an almost independent way of the other neutrino parameters. In particular, there will be no dependence on the phase of the leptonic CP violation, , no strong dependence on three vs four neutrino pattern, no dependence on , and no dependence on matter effects [8]. The mass hierarchy study can be performed by looking at the vacuum oscillation pattern in medium baseline reactor antineutrino experiments [9]. The JUNO strategy is based on the observation that the contribution to the oscillation probability is represented by fast oscillating terms superimposed to a general oscillation pattern. Their relative size changes according to the two different possibilities, NH or IH, leading to a contribution of opposite sign in the two cases. Therefore, it is possible to discriminate between the two possible mass hierarchies by studying the interference between the two oscillation frequencies driven by and in the reactor antineutrino spectrum [10]. The discrimination power of the experiment is maximized when the oscillation is maximal, and the baseline at JUNO has been chosen in such a way to realize this condition [11]. Since the difference of neutrino oscillation in vacuum for different mass hierarchies is very small, energy resolution is the crucial factor for the success of JUNO. The goal is that the energy resolution reaches at 1 MeV to detect electron neutrino coming from reactor plants.

In the next section, the usual method is recollected and evaluated. In the following sections, the three issues of the standard algorithm are explained. Section 3 includes the first issue, Section 4 explains the second issue, and the third issue is accounted for in Section 5. Further, results are described in Section 6. After that, conclusions are drawn in Section 7. Finally, in the appendix, a technical description of the implementation of the simulations is reported.

2. The Standard Method

For JUNO, can be divided into three parts as indicated:

summarizes the prior knowledge on oscillation parameters. In JUNO, these parameters are and . Then, becomes

While the total normalization of reactor antineutrino flux is in principle degenerate with the inverted beta decay cross section, the fiducial volume, and the weight fraction of free proton, such that they might be combined into a single overall factor, large uncertainties on the shape of the reactor antineutrino flux may be expected. On purpose, that is the reason why the near detector JUNO-TAO is going to be built. However, for the scope of the present study, those uncertainties are not expected to have a large impact. Within this assumption, the contributions to the function can be represented by a single term as where and .

The last term of Equation (1), , represents the statistical fluctuation. When we introduce binning with respect to , it looks like with the summation running over all the energy bins. Here, is the event number for the bin when the hierarchy is NH(IH). is the fitted number of events, calculated as a function of the four model parameters and the normalization factor . All parameters are varied under the NH(IH) constraints of Equation (2) and Equation (3). A different definition of the function based on the Poisson distribution yields a consistent MH sensitivity [8].

In the minimization procedure, all the parameters were initially set to their global best values that are indicated in Table 1. The fitting procedures and the minimization of are done with the TMinuit algorithm (ROOT libraries). The distributions are obtained for four parameters (, , , and ), based on a total of 108357 signal events (Figures 1 and 2).

As reported in [8], the sensitivity can reach in the ideal case of a single reactor and single detector, and considering the spread of reactor cores and uncertainties of the detector response. All these results have been reached using semianalytical simulations, i.e., simulations as used in [8, 13]. Semianalytical simulations are generated by fluctuating the bin content according to Poisson or Gaussian distributions that represent the number of events. In addition, a second fluctuation is added by applying energy smearing in each single energy bin and not in each single event. If the energy resolution smearing per each single event is replaced by smearing for the whole bin, an event balance migration occurs, and the number of events per each single bin becomes uncorrelated with side bins leading to the results reported in [8]. We provided the simulation performed on an event–by–event basis and computed the experimental sensitivity for the JUNO by changing the atmospheric neutrino mass. The distributions are obtained for and , for IH hypothesis and NH hypothesis, respectively (Figures 1 and 2), with infinite and 3% relative energy resolution, respectively.

3. Issue One: The Limited Power of as a Single-Dimensional Estimator

The two discrete hypotheses are not nested; thus, the Wilks theorem is not applicable in this problem when it is based on the defined in Equation (5). As a consequence, does not follow a distribution [14]. The MO significance is usually obtained in terms of the single-dimensional estimator , and its evaluation is based on two distinct hypotheses, NH and IH. For each MO, the best solution is found: the comes from two different best-fit values for the NH model, , and the IH model, : where the two minima are evaluated spanning the uncertainties on the three-neutrino oscillation parameters. The experimental sensitivity to the neutrino mass hierarchy arises from the small phase shift in the oscillation terms depending on the two large mass-squared differences and . JUNO sensitivity can be calculated using the single-dimensional test statistics . The median sensitivity can be obtained using the -test, where is the number of assuming that NH is the true model and is the number of assuming that IH is the true model,

The , , , and are the mean value and standard deviation of the distribution assuming that NH and IH are the true models, respectively. There, an approximation is usually used [8, 1517]: where is the mean value of the distribution. Therefore, Equation (6) becomes

When the analysis is performed on an event–by–event basis and not semianalytical simulations as in [8], the dispersions of the distributions cannot be described by Equation (7) anymore. That strongly affects the statistical significance that drops to less than 2 as indicated in Table 2 for relatively energy resolution and in Table 3 for infinite energy resolution. The reason stays in the convolution of the energy resolution. To check it, the analysis has been also done at an infinite energy resolution to find out whether it is consistent with the latter conclusion (Figure 3).

The investigation of the origin of the approximation has been pursued by looking whether it is still valid in event–by–event simulations as it is in semianalytical simulations. In fact, we found that the dispersion of the two distributions becomes wider than in semianalytical simulations when a finite energy resolution is taken into account. The energy error introduces strong correlations between bins, and it corresponds to an extended systematic error.

The limited power of the manifests itself being controlled by the statistical assumption, i.e., Equation (7). The experimental sensitivity is reduced when the energy systematic error is taken into account, and Equation (7) is no more valid. Specific cases are reported in the following figures and tables, and other details are reported in subsection 6.1.

In other words, it is worth to stress the loss of the Gaussianity of the full process. When the energy uncertainty is considered in an event–by–event simulation, a net migration of events occurs from the upper bin to the lower one when the expected number of events is increasing with the energy. The opposite occurs when the event expectation is decreasing with the energy. That corresponds to a loss of independency of the random variables of the energy bin, and a consequent loss of the Gaussianity. Instead, the simple addition of the energy uncertainty in each bin will keep that independence, mystifying the final results.

Figure 4 is a comparison of the estimator distributions at for NH sample and IH sample. An infinite energy resolution is assumed for the left plot and a 3% relative energy resolution for the right plot.

Figure 5 for NH sample at and IH sample for shows the distributions for a relative 3% and an infinite energy resolution. The JUNO sensitivity is clearly different from that reported in [8].

When only statistical fluctuations are included, the MH sensitivities using -test ( and ) do not exactly equal to the MH sensitivities obtained in the approximated Equation (7) ( and ) as reported in Tables 4 and 5. This observation is consistent with what is obtained at the atmospheric mass, for IH sample and for NH sample for infine energy resolution in Table 6 and for in Table 7. This conclusion will be confirmed for other 18 different values for the atmospheric mass at infinite energy resolution in subsection 6.1.

4. Issue Two: Nonbright Results Using as a Bidimensional Estimator

When and are drawn in a 2D map, their strong positive correlation manifests as a bidimensional estimator. This strong positive correlation leads to overlap between the distributions of the two hypotheses, thus reducing the experiment sensitivity. When we look at as a bidimensional estimator, the experiment sensitivity can be calculated with a -test for two-dimensional test statistic providing the results indicated in Tables 8, 9, and 10.

Using -test for 2D, the MH sensitivity can be calculated as

, where , indicates the mean of the distribution of the A sample, assuming the B hypothesis to be true. expresses the standard derivation of distribution of the A sample assuming that B hypothesis is the true hypothesis. Figures 68 are shown the 2D maps.

5. Issue Three: The Robustness

Robust statistics are the statistics that yield good performance when data is drawn from a wide range of probability distributions that are largely unaffected by outliers or small departures from model assumptions in a given data set [18]. In other words, a robust statistic is resistant to initial deviations with respect to the final results [19].

The main focus of the statistical analysis using the standard method is to calculate neutrino mass hierarchy determination sensitivity, and less attention or none is put about its robustness. Subsection 5.1 will discuss how the standard method using is not able to maintain the robustness while subsection 5.2 will discuss the inability of the to establish the robustness as a bidimensional estimator. This study is done for 20 different data values of the input atmospheric neutrino mass in the range, .

5.1. The Oscillations with

There are trends in our data to confirm that the varies with the input atmospheric neutrino mass . We studied the relation between the values and the value of the input parameter for 20 different values, in the range, , and we computed the corresponding experimental sensitivity for the two cases, with and without including the systematic uncertainties. In particular, since the main systematic error is largely dominated by the energy resolution, when we refer to with/without systematics, we are either taking into account or not the systematic uncertainty due to the energy resolution, which is taken to be 3%. Figure 9 illustrates the variation of as a function of the input atmospheric neutrino mass , in the range of , assuming infinite energy resolution. Figure 10 illustrates the variation of with the input atmospheric neutrino mass , in the range of when the 3% relative energy resolution is included. We performed additional data collection ignoring the systematic uncertainties in order to provide a strong evidence for the result. How the oscillations with reflects on the neutrino mass hierarchy determination sensitivity depends on how the significance will be calculated, for example using Equation (6) or Equation (8).

In case the approximation is not valid, the -test for 1D, sigmaIH, can be used to calculate the neutrino MH sensitivity. As expected, the variation of the estimator will influence neutrino MH sensitivity. Figure 11 confirms the influence on neutrino MH sensitivity in case that only the statistical uncertainties are included and the sensitivity varies from about 4.5 to 7.5. Figure 12 confirms this influence in case that the systematic and statistical uncertainties are included and the sensitivity oscillates from about 0.9 to 1.5.

Assuming that the approximation of Equation (7) is valid at infinite energy resolution, the neutrino mass hierarchy determination sensitivity is expected to have a large variation with the input parameter as confirmed in Figure 13. The sensitivity may vary from about 9.5 to 7.5.

Assuming that the approximation of Equation (7) is still valid at 3% relative energy resolution, the neutrino mass hierarchy determination sensitivity is not robust as confirmed in Figure 14. The sensitivity using Equation (8) varies from a maximum of 4.1 to about 3.2.

5.2. The Robustness

The significance using as bidimensional distribution through Equation (9) varies from 1.3 to 0.9 assuming an infinite energy resolution as shown in Figure 15 and from 0.24 to 0.18 assuming 3% relative energy resolution, as shown in Figure 16.

The oscillation of the experimental sensitivity with the value of the input parameter, the neutrino atmospheric mass difference , implies that the standard method results have a strong dependency on the input parameter value. Whether the approximation is not valid or not, systematic uncertainties included or not, this dependence still holds.

6. Results

In order to present the findings as clear as possible, it is imperative to study the three reported issues of the standard algorithm in the range of the atmospheric mass between and . These issues are categorized into two types depending on which estimator is being used. The first sensitivity category using estimator is reported in subsection 6.1. The second sensitivity category using is reported in subsection 6.2. For each category, a detailed study is provided for 20 different values of the atmospheric mass in the range of , with and without systematic errors. The final results now provide solid evidences about the problematic use of the standard algorithm.

6.1. The Issues of

Here, we report two results. First, our result on the limited power of (issue one) confirming that, when systematic uncertainties are included, the approximated Equation (7) is not acceptable in the range of neutrino atmospheric mass, . We provide the results of 20 different values of the in that range showing the limit of the approximation when including the systematic uncertainties (as confirmed in Figure 17). Although Equation (7) is widely accepted, it suffers from some limitations due to its limitation when systematic uncertainties are included (Figure 17). The limitation manifests itself decreasing the power of the estimator to determine the correct neutrino MH. The reasons behind this limitation are explained in details in Section 3. As a result, the power of this estimator for the MH discrimination is not promising as reported in Table 11. On the contrary, without including the systematic uncertainties, Equation (7) is valid, and the results are very good as reported in Figure 18 and Table 12. Second, the studies about the robustness in the range of show its dependence. This result is directly in line with previous result in Section 5. From these sensitivity tables, (Tables 11 and 12), it is clear that the experimental sensitivity using has a strong dependence on the value of the input atmospheric mass. If the value of the input parameter, input atmospheric mass, is modified, the experimental sensitivity will change according to it. This change is not affected by the systematic uncertainties. It is an intrinsic property of the itself. Table 12 shows the sensitivities using with infinite energy resolution. As can be seen in the table, the experimental sensitivities vary a lot with different values of the neutrino atmospheric mass proving that the robustness of is not well established even at infinite energy resolution. Table 11 provides the sensitivities including the systematic uncertainties: the neutrino mass ordering discrimination varies a lot. The implications of this issue are fully discussed in Section 5.

As mentioned in Section 3, the MH sensitivities using -test, and , do not exactly equal to the MH sensitivities obtained in the approximated Equation (7), and . Table 11 reports this observation for 20 different values for the atmospheric mass at infinite energy resolution providing a solid experimental evidence for overestimation behavior for this approximation.

6.2. The Issues of

Each plot of Figures 19 and 20 proves that has not enough ability to produce high sensitivity to distinguish between the right and wrong ordering of the neutrino using the medium baseline reactor spectrum. From the sensitivity tables (Tables 13 and 14), it is clear that the experimental sensitivity using the estimator has a strong dependence on the value of the neutrino atmospheric mass. If the neutrino atmospheric mass value is modified, the experimental sensitivity will change according to it, even when the systematic uncertainties are not included.

The results about the standard algorithm confirmed the three statistical issues in the range of .

7. Conclusion

Advances in statistical methods may play a decisive role in the discovery reachable at neutrino physics experiments. Evaluating the used statistical methods and updating them is a necessary step in building a robust statistical analysis for answering the open questions in neutrino physics [20]. The statistical issues on the MHD from the reactor experiments have been illustrated, starting from the limited power of the . When the simulation is performed on an event–by–event basis and not on a semianalytical one, the significance drastically drops. In fact, the systematic uncertainties due to the 3% relatively energy resolution cause unbalanced migration effects between events that do not show up when the simulations are not made on an event–by–event basis. To confirm the effect, simulations at infinite energy resolution have also been performed confirming the validation of the assumption of Equation (7) in case of exclusion of the systematic uncertainties. is fully controlled by the statistical assumptions as explained in Section 3. That is the major limit to the approximation, reducing the experimental standard sensitivity that is officially reported. To conclude this first issue, it has been pointed out that the estimator provides us with different results following different simulation procedures. Second, the strong positive correlations between the and when they are drawn in a 2-dimensional map confirms the being a bidimensional estimator. As a second issue, we then conclude that JUNO sensitivity using as bidimensional estimator is not promising as well. Third, the is dominated by the value as described in dx_dm. Then, the MHD significance using depends on the values of the input parameter . That is the reason we were interested in studying the MHD problem by using the standard method at 20 different values of in the range between and .

Appendix

Fitting with TMinuit Class

Toy simulations were based on a single event basis and the expected systematic errors via a Gaussian distribution centered at the expected mean and with the standard deviation of the estimated uncertainty can be added. For JUNO, a global resolution on the energy reconstruction is expected. The oscillation parameters have been taken from the most recent global fits listed in Table 1. The Poisson statistical fluctuation is automatically included.

The fitting procedures and the minimization of are done via the ROOT minimization libraries (the TMinuit algorithm). In the minimization procedure, all the oscillation parameters were fixed to the best-fitting values of [8]. A total of 108357 signal events are processed for each toy simulations. The official version of JUNO Software “J17v1r1” is used. will be often scaled with the number of degrees of freedom, which is clearly equal to the number of fitted data minus the constraints: bin−6. Figures 21 and 22 indicate distributions for 1000 toy JUNO-like simulations generated for NH and IH samples, respectively. The simulations are generated at 20 different values of the atmospheric mass in the range of for NH hypothesis (blue graphs) and for IH hypothesis (red graphs) with six years of exposure and the ten near reactor cores with an infinite energy resolution. Figures 23 and 24 indicate the distributions for 1000 toy JUNO-like simulations generated for NH and IH samples, respectively. The simulations are generated at 20 different values of the atmospheric mass in the range of for NH hypothesis (blue graphs) and for IH hypothesis (red graphs) with six years of exposure and the ten near reactor cores with an energy resolution.

Data Availability

The statistical sample data used to support the findings of this study are included within the article.

Disclosure

An arXiv has previously been published [21]. The arXiv version of the paper is at https://arxiv.org/abs/2310.01814.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

We are particularly grateful to Xuefeng Ding for valuable suggestions on the robustness of . The article is written with the financial contribution from Fondazione Cariparo in the Dipartimento di Fisica e Astronomia “Galileo Galilei,” University of Padova, Italy, and INFN, Padova station, Italy.