Abstract:
In the standard model (SM), CP violation is introduced through a single phase in the CKM matrix. The neutral kaon system provides one of the most precise tests of how accurately the SM describes experimental data such as $\epsilon_K$. Indirect CP violation is parametrized by $\epsilon_{K}$, which can be calculated directly using inputs from lattice QCD. In this calculation, the largest uncertainties come from two sources: $\hat{B}_K$ and $V_{cb}$. We use lattice results for $\hat{B}_K$ and exclusive $V_{cb}$ to obtain the theoretical estimate of $\epsilon_K$, which turns out to be $3.1\sigma$ away from its experimental value. Here, the error is evaluated using the standard error propagation method.
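
As a rough illustration of standard error propagation, the sketch below propagates uncorrelated input errors through a function by finite differences and then expresses a theory-experiment difference in units of the combined error. The helper names `propagate` and `tension`, the assumption of uncorrelated inputs, the quadrature combination of theory and experimental errors, and all numbers are illustrative assumptions, not the inputs or procedure of the analysis itself.

```python
import math

def propagate(f, values, sigmas, rel_step=1e-6):
    """Return (f(values), sigma_f) assuming uncorrelated inputs.

    Uses sigma_f^2 = sum_i (df/dx_i * sigma_i)^2, with the derivatives
    estimated by central finite differences.
    """
    central = f(*values)
    variance = 0.0
    for i, (x, s) in enumerate(zip(values, sigmas)):
        h = rel_step * (abs(x) if x != 0 else 1.0)
        up = list(values)
        up[i] = x + h
        down = list(values)
        down[i] = x - h
        deriv = (f(*up) - f(*down)) / (2.0 * h)
        variance += (deriv * s) ** 2
    return central, math.sqrt(variance)

def tension(theory, sigma_theory, experiment, sigma_experiment):
    """Difference in units of the combined (quadrature) one-sigma error."""
    return abs(theory - experiment) / math.hypot(sigma_theory, sigma_experiment)

# Toy usage with made-up numbers: f(x, y) = x * y, x = 2.0(1), y = 3.0(2).
value, error = propagate(lambda x, y: x * y, [2.0, 3.0], [0.1, 0.2])
```

For this toy product the propagated error is sqrt((3 x 0.1)^2 + (2 x 0.2)^2) = 0.5, which is what the function returns up to finite-difference rounding.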

Abstract:
We report a possible solution to the problem that covariance fitting fails when the data are highly correlated and the covariance matrix has small eigenvalues. As an example, we consider the analysis of highly correlated $B_K$ data based on SU(2) staggered chiral perturbation theory. The essence of the problem is that the fitting function is not accurate enough to fit highly correlated and precise data. When some eigenvalues of the covariance matrix are small, even a tiny error in the fitting function can produce a large chi-square and spoil the fitting procedure. We have applied a number of existing prescriptions, such as the diagonal approximation and the cutoff method. In addition, we present a new method, the eigenmode shift method, which fine-tunes the fitting function while keeping the covariance matrix untouched.
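
The role of small eigenvalues can be seen in a two-point toy example. The sketch below assumes two data points with equal errors `sigma` and correlation `rho`, so that the covariance matrix has eigenvalues sigma^2 (1 + rho) and sigma^2 (1 - rho), and compares the full chi-square with the diagonal approximation and with a cutoff that drops small eigenmodes. The function names and numbers are hypothetical; this is not the $B_K$ data, and it does not implement the eigenmode shift method.

```python
import math

def chi2_modes(residuals, sigma, rho):
    """Per-eigenmode chi-square terms for C = sigma^2 * [[1, rho], [rho, 1]].

    Returns a list of (eigenvalue, contribution) pairs, where each
    contribution is (v . r)^2 / eigenvalue for unit eigenvector v.
    """
    r1, r2 = residuals
    s = 1.0 / math.sqrt(2.0)
    modes = [
        (sigma ** 2 * (1.0 + rho), (s, s)),    # common mode
        (sigma ** 2 * (1.0 - rho), (s, -s)),   # difference mode, small if rho -> 1
    ]
    return [(lam, (v1 * r1 + v2 * r2) ** 2 / lam) for lam, (v1, v2) in modes]

def chi2_full(residuals, sigma, rho):
    """Full covariance chi-square, r^T C^{-1} r, summed over eigenmodes."""
    return sum(term for _, term in chi2_modes(residuals, sigma, rho))

def chi2_diagonal(residuals, sigma):
    """Diagonal approximation: correlations are ignored."""
    return sum((r / sigma) ** 2 for r in residuals)

def chi2_cutoff(residuals, sigma, rho, min_eigenvalue):
    """Cutoff method: eigenmodes below min_eigenvalue are dropped."""
    return sum(term for lam, term in chi2_modes(residuals, sigma, rho)
               if lam >= min_eigenvalue)

# A fit function that misses each point by only a tenth of its error bar,
# but in opposite directions (made-up numbers).
residuals = (0.01, -0.01)
full = chi2_full(residuals, sigma=0.1, rho=0.999)
diagonal = chi2_diagonal(residuals, sigma=0.1)
cut = chi2_cutoff(residuals, sigma=0.1, rho=0.999, min_eigenvalue=1e-4)
```

With these numbers the full chi-square is about 20 for two data points, while the diagonal approximation gives 0.02 and the cutoff removes the offending mode entirely. The mismatch is tiny on the scale of the individual errors but large on the scale of the small eigenvalue.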

Abstract:
GPUs deliver significantly higher performance in single precision than in double precision. Hence, it is important to take maximal advantage of single precision in the conjugate gradient (CG) inverter by using the mixed precision method. We have implemented a mixed precision algorithm in our multi-GPU conjugate gradient solver. The single precision calculation uses half the memory of the double precision calculation, which makes data transfer in memory I/O twice as fast. In addition, floating point calculations are 8 times faster in single precision than in double precision. The overall performance of our CUDA code for CG is 145 gigaflops per GPU (GTX480) when the InfiniBand network communication is excluded. When the InfiniBand communication is included, the overall performance is 36 gigaflops per GPU (GTX480).
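
One common way to organize a mixed precision solver is defect correction (iterative refinement): the true residual is computed in double precision, and the correction is obtained from a CG solve in single precision. The abstract does not say which variant is used, so the sketch below is only an assumed illustration. It runs on the CPU, emulates single-precision storage by rounding through `struct`, and uses a made-up 3x3 symmetric positive-definite matrix in place of the lattice Dirac operator.

```python
import math
import struct

def f32(x):
    """Round a Python float (double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(A, v):
    return [dot(row, v) for row in A]

def cg_single(A, b, rel_tol=1e-4, max_iter=50):
    """CG whose vectors are stored in emulated single precision.

    Arithmetic is done in double and rounded on every store, which mimics
    single-precision storage rather than true single-precision arithmetic.
    """
    b = [f32(v) for v in b]
    x = [0.0] * len(b)
    r = b[:]
    p = r[:]
    rr = dot(r, r)
    target = rel_tol ** 2 * rr
    for _ in range(max_iter):
        if rr <= target:
            break
        Ap = [f32(v) for v in matvec(A, p)]
        alpha = rr / dot(p, Ap)
        x = [f32(xi + alpha * pi) for xi, pi in zip(x, p)]
        r = [f32(ri - alpha * api) for ri, api in zip(r, Ap)]
        rr_new = dot(r, r)
        p = [f32(ri + (rr_new / rr) * pi) for ri, pi in zip(r, p)]
        rr = rr_new
    return x

def mixed_precision_solve(A, b, tol=1e-12, max_outer=20):
    """Defect correction: double-precision residual, single-precision CG."""
    x = [0.0] * len(b)
    b_norm = math.sqrt(dot(b, b))
    for outer in range(max_outer):
        # True residual, evaluated in double precision.
        r = [bi - ai for bi, ai in zip(b, matvec(A, x))]
        if math.sqrt(dot(r, r)) <= tol * b_norm:
            return x, outer
        correction = cg_single(A, r)  # cheap low-precision inner solve
        x = [xi + ci for xi, ci in zip(x, correction)]
    raise RuntimeError("mixed-precision solve did not converge")

# Toy symmetric positive-definite system (hypothetical).
A = [[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]]
b = [1.0, 2.0, 3.0]
solution, outer_iterations = mixed_precision_solve(A, b)
```

The point of the design is that most of the work happens in the low-precision inner solve, while the occasional double-precision residual restores full accuracy: a single inner solve cannot reach a relative residual of 1e-12, but a few outer corrections can.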

Abstract:
We present results for $\varepsilon_K$, the indirect CP violation parameter, calculated in the Standard Model using inputs from lattice QCD: the kaon bag parameter $\hat{B}_K$, and the CKM matrix element $V_{cb}$ from the axial current form factor for the exclusive decay $\bar{B}\to D^*\ell\bar{\nu}$ at zero recoil. In addition, we take the coordinates of the unitarity triangle apex $(\bar{\rho},\bar{\eta})$ from the angle-only fit of the UTfit Collaboration and use $V_{us}$ to fix $\lambda$. To estimate the systematic error, we also use Wolfenstein parameters from CKMfitter and UTfit. With exclusive $V_{cb}$, we find a $3.3(2)\sigma$ difference between the Standard Model value of $\varepsilon_K$ and experiment. We report details of this preliminary result.

Abstract:
We address a frequently asked question about covariance fitting of highly correlated data, such as our $B_K$ data analyzed with SU(2) staggered chiral perturbation theory. The essence of the problem is that we do not have a fitting function accurate enough to fit extremely precise data. When eigenvalues of the covariance matrix are small, even a tiny error in the fitting function yields a large chi-square and spoils the fitting procedure. We have applied a number of existing prescriptions, such as the cut-off method, the modified covariance matrix method, and the Bayesian method. We also propose a new method, the eigenmode shift method, which allows a full covariance fit without modifying the covariance matrix at all. In our case, the eigenmode shift (ES) method and the Bayesian method turn out to be the best prescriptions for the problem. We also provide a pedagogical example of data analysis in which the diagonal approximation and the cut-off method manifestly fail, but the ES method and the Bayesian approach work well.

Abstract:
We report the Standard Model evaluation of the indirect CP violation parameter $\varepsilon_K$ using inputs determined from lattice QCD: the kaon bag parameter $\hat{B}_K$, $\xi_0$, $|V_{us}|$ from the $K_{\ell 3}$ and $K_{\mu 2}$ decays, and $|V_{cb}|$ from the axial current form factor for the exclusive decay $\bar{B} \to D^* \ell \bar{\nu}$ at zero-recoil. The theoretical expression for $\varepsilon_K$ is thoroughly reviewed to give an estimate of the size of the neglected corrections, including long distance effects. The Wolfenstein parametrization $(|V_{cb}|, \lambda, \bar{\rho}, \bar{\eta})$ is adopted for CKM matrix elements which enter through the short distance contribution of the box diagrams. For the central value, we take the Unitarity Triangle apex $(\bar{\rho}, \bar{\eta})$ from the angle-only fit of the UTfit collaboration and use $V_{us}$ as an independent input to fix $\lambda$. We find that the Standard Model prediction of $\varepsilon_K$ with exclusive $V_{cb}$ (lattice QCD results) is lower than the experimental value by $3.4\sigma$. However, with inclusive $V_{cb}$ (results of the heavy quark expansion), there is no gap between the Standard Model prediction of $\varepsilon_K$ and its experimental value. For the calculation of $\varepsilon_K$, we perform the renormalization group running to obtain $\eta_{cc}$ at next-to-next-to-leading-order; we find $\eta_{cc}^\mathrm{NNLO}=1.72(27)$.

Abstract:
We present the Standard Model evaluation of the indirect CP violation parameter $\varepsilon_K$ using inputs determined from lattice QCD together with experiment: $|V_{us}|$, $|V_{cb}|$, $\xi_0$, and $\hat{B}_K$. We use the Wolfenstein parametrization ($|V_{cb}|$, $\lambda$, $\bar{\rho}$, $\bar{\eta}$) for the CKM matrix elements. For the central value, we take the angle-only fit of the UTfit collaboration, and use $|V_{us}|$ from the $K_{\ell 3}$ and $K_{\mu 2}$ decays as an independent input to fix $\lambda$. For the error estimate, we use results of the global unitarity triangle fits from the CKMfitter and UTfit collaborations. We find that the Standard Model (SM) prediction of $\varepsilon_K$ with exclusive $V_{cb}$ (lattice QCD results) is lower than the experimental value by $3.6(2)\sigma$. However, with inclusive $V_{cb}$ (results of the heavy quark expansion), the tension between the SM prediction of $\varepsilon_K$ and its experimental value disappears.

Abstract:
The CKM matrix element $|V_{cb}|$ can be extracted by combining experimentally determined branching fractions for $\bar{B}\to D^{(*)}\ell\bar{\nu}$ decays with form factors from the lattice. While successful, the precision of this approach has been limited by heavy-quark discretization effects. An improved version of the Fermilab action, the Oktay-Kronfeld action, can be used to reduce heavy-quark discretization effects in calculations performed at the physical bottom and charm quark masses. Treating charm and bottom quarks as massive, we are carrying out improvement of the flavor-changing currents through third order in the momentum (HQET) expansion.

Abstract:
We present results for the indirect CP violation parameter $\varepsilon_K$ determined directly from the standard model using lattice QCD to fix the inputs $\hat{B}_K$, $\xi_0$, $|V_{us}|$, and $|V_{cb}|$. We use the FLAG and SWME results for $\hat{B}_K$. We use the RBC-UKQCD result for $\xi_0$ determined using the experimental value of $\varepsilon'/\varepsilon$ and the lattice result of $\mathrm{Im}\,A_2$. To set the Wolfenstein parameter $\lambda$, we use $|V_{us}|$, which is determined from $K_{\ell3}$ and $K_{\mu2}$ decays combined with lattice evaluations of the $K \to \pi \ell \nu$ vector form factor and $f_K$. To set the Wolfenstein parameter $A$, we use the FNAL/MILC results for $|V_{cb}|$, which are determined from the exclusive decay $\bar{B} \to D^* \ell \bar{\nu}$ and the axial form factor at zero recoil. We also use the inclusive $|V_{cb}|$ obtained using the heavy quark expansion based on QCD sum rules and the OPE. We compare the results with those for exclusive $|V_{cb}|$. We find that the standard model prediction of $\varepsilon_K$ with exclusive $|V_{cb}|$ (lattice QCD results) is lower than the experimental value by 3.4$\sigma$. However, we observe no tension in $\varepsilon_K$ determined from inclusive $|V_{cb}|$.
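
For orientation, the two parameter-setting steps above correspond, at leading order in the Wolfenstein expansion, to the textbook relations below. The higher-order definitions actually used in the analysis are not stated here, so this is only a leading-order sketch.

$$
\lambda \simeq |V_{us}|, \qquad A \simeq \frac{|V_{cb}|}{\lambda^{2}}
$$

Since $|V_{cb}|$ enters $\varepsilon_K$ through $A$, and the dominant short-distance term is usually quoted as scaling roughly like $|V_{cb}|^4$, a modest shift between the exclusive and inclusive values of $|V_{cb}|$ can change the prediction appreciably.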

Abstract:
Kepler GTX Titan Black and Kepler Tesla K40 are still the best GPUs for high performance computing, although Maxwell GPUs such as the GTX 980 are available on the market. Hence, we measure the performance of our lattice QCD codes on the Kepler GPUs. We also upgrade our code to use the latest CPS (Columbia Physics System) library along with the most recent QUDA (QCD CUDA) library for lattice QCD. These new libraries improve the performance of our conjugate gradient (CG) inverter so that it runs twice as fast as before. We also investigate the performance of the Xeon Phi 7120P coprocessor. In principle, its computing power is comparable to that of the Kepler GPUs. However, its performance for our CG code is at present significantly inferior to that of the GTX Titan Black GPUs.