=== User: Can you double-check this proof from beginning to end, and make sure it is fully correct? === Please identify any place where there are incorrect arguments. Also suggest fixes as drop-in replacements to the LaTeX. \subsection{Optimal IDS analysis for uniform location}\label{subsec:uniform-IDS-sharp} Recall $\mu_\theta=\Unif[\theta-\tfrac12,\theta+\tfrac12]$. Under IDS we observe $X_1',\dots,X_n'\stackrel{\mathrm{i.i.d.}}{\sim}Q$ with $W_2(Q,\mu_\theta)\le \ep$. Risk is squared error $R(\theta,\hat\theta)=\E_Q[(\hat\theta-\theta)^2]$, and the IDS minimax risk is $\M_I(\ep;n)=\inf_{\hat\theta}\sup_{Q:W_2(Q,\mu_\theta)\le \ep} R(\theta,\hat\theta)$. We introduce a smoothed location family $\{\nu_{\theta,\tau}:\theta\in\R,\; 0<\tau\le \tfrac12\}$ by replacing the uniform's two edges by Gaussian kernel tails. Let $K(x)=(2\pi)^{-1/2}e^{-x^2/2}$ and let $S$ denote its CDF. Define the density \begin{align}\label{eq:nu-density} f(x;\theta,\tau)= \begin{cases} 1 & \text{if } |x-\theta|\le \tfrac12-\tau,\\[3pt] \dfrac{1}{K(0)}\,K\!\left(\dfrac{|x-\theta|-(\tfrac12-\tau)}{2\tau K(0)}\right) & \text{if } |x-\theta|>\tfrac12-\tau. \end{cases} \end{align} By construction $f(\cdot;\theta,\tau)$ is $C^1$, equals unity on the bulk, and is strictly log-concave on the two edge regions $\{|x-\theta|>\tfrac12-\tau\}$, each of which carries $\nu_{\theta,\tau}$-mass $\tau$; the fact that $K'(0)=0$ guarantees $C^1$ matching at the two junctions. We provide an upper bound achieved by a Z-estimator in the smoothed family. \begin{theorem}[Sharp IDS risk]\label{thm:IDS-sharp} Let $X_1',\dots,X_n'$ be i.i.d.\ from $Q$ with $W_2(Q,\mu_\theta)\le \ep$. Define \[ \tau_\star=\ep^{2/3}\wedge \frac12,\qquad \psi_{\tau_\star}(x-\theta)=\partial_\theta \log f(x;\theta,\tau_\star), \] with $f$ as in \eqref{eq:nu-density}. Let $\hat\theta_\ep$ be a solution to the estimating equation \begin{align}\label{eq:M-est} \frac{1}{n}\sum_{i=1}^n \psi_{\tau_\star}(X_i'-t)=0.
\end{align} It follows from our analysis that this solution is unique. There exists a constant $C<\infty$ such that, for all $n\ge 1$ and $\ep\in(0,\tfrac12]$, \begin{align}\label{eq:IDS-sharp-risk} \sup_{Q:W_2(Q,\mu_\theta)\le \ep} \E_Q\big[(\hat\theta_\ep-\theta)^2\big] \;\le\; C\left(\,\frac{\ep^{2/3}}{n}\;\;\vee\;\;\left[\ep^2+\frac{1}{2(n+1)(n+2)}\right]\right). \end{align} Consequently, \begin{align}\label{eq:IDS-sandwich} \M_I(\ep;n)\;\asymp\; \frac{\ep^{2/3}}{n}\ \vee\ \M_C(\ep;n). \end{align} \end{theorem} \paragraph{Discussion.} The estimator \eqref{eq:M-est} is a one-dimensional $M$-estimator with a monotone $C^1$ score; as we will show, existence and uniqueness follow from strict concavity of the smoothed log-likelihood. Tuning uses only $\tau_\star\asymp \ep^{2/3}$. If $\ep$ is unknown, one may plug in an upper bound or use a Lepski grid over $\tau$; the rates are unaffected. The proof uses three ingredients. First, a standard $M$-estimation expansion reduces the risk to a variance term of order $1/\{n I(\tau_\star)\}$ plus a squared bias term governed by the population score under $Q$. Second, a one-dimensional dynamic formulation of $W_2$ controls the deviation of score moments between $Q$ and the reference model $\mu_\theta$ through the $L^2$ size of derivatives of the score; in our construction these derivatives concentrate on the edge regions and have explicit $\tau$-scaling. Third, Lemma~\ref{lem:info-W2} identifies $I(\tau_\star)=\pi/\tau_\star$ and the least-favorable scaling $\tau_\star\asymp \ep^{2/3}$. We next formalize the transport control we will use. \begin{lemma}[Dynamic $W_2$ control of expectation differences]\label{lem:dyn-W2} Let $P,Q$ be Borel probability measures on $\R$ with finite second moments, and let $T$ be the monotone transport pushing $P$ to $Q$. For $t\in[0,1]$ set $X_t=(1-t)X+tT(X)$ with $X\sim P$, and let $P_t$ denote the law of $X_t$.
Then, for any $C^1$ function $\varphi$ such that $\int_0^1 \E_{P_t}[\varphi'(X_t)^2]\,dt<\infty$, \[ \left|\E_Q[\varphi]-\E_P[\varphi]\right| \le \left(\int_0^1 \E_{P_t}[\varphi'(X_t)^2]\,dt\right)^{1/2}\, W_2(P,Q). \] \end{lemma} \begin{proof} By the fundamental theorem of calculus, \[ \E_Q[\varphi]-\E_P[\varphi]=\int_0^1 \frac{d}{dt}\E_{P_t}[\varphi]\,dt =\int_0^1 \E_{P_t}[\varphi'(X_t)\,v_t(X_t)]\,dt, \] where $v_t(X_t):=T(X)-X$ is the velocity along the monotone geodesic; it is a well-defined function of $X_t$ because $x\mapsto(1-t)x+tT(x)$ is strictly increasing for $t\in[0,1)$. Cauchy–Schwarz yields \[ \left|\E_Q[\varphi]-\E_P[\varphi]\right| \le \left(\int_0^1 \E_{P_t}[\varphi'(X_t)^2]\,dt\right)^{1/2}\left(\int_0^1 \E_{P_t}[v_t(X_t)^2]\,dt\right)^{1/2}. \] Finally $\int_0^1 \E_{P_t}[v_t(X_t)^2]\,dt=\E[(T(X)-X)^2]=W_2(P,Q)^2$ in one dimension with monotone $T$. \end{proof} We now bound, uniformly over the IDS ball, the score moments that appear in the $M$-estimation risk. \begin{lemma}[Uniform control of score moments on the $W_2$ ball, benchmarked at $\nu_{\theta,\tau}$]\label{lem:score-moments} Fix $\tau\in(0,\tfrac12]$ and let $\psi_\tau$ be the score of $\nu_{\theta,\tau}$. There exist constants $C_1,C_2,C_3<\infty$ such that for all $Q$ with $W_2(Q,\mu_\theta)\le \ep$, \begin{align} \left|\E_Q[\psi_\tau]-\E_{\nu_{\theta,\tau}}[\psi_\tau]\right| &\le C_1\,\frac{\ep}{\sqrt{\tau}},\label{eq:bias-score}\\ \left|\E_Q[\psi_\tau']-\E_{\nu_{\theta,\tau}}[\psi_\tau']\right| &\le C_2\,\frac{\ep}{\sqrt{\tau}},\label{eq:curv-score}\\ \left|\E_Q[\psi_\tau^2]-\E_{\nu_{\theta,\tau}}[\psi_\tau^2]\right| &\le C_3\,\frac{\ep}{\tau^{3/2}}.\label{eq:var-score} \end{align} In particular, since for the smoothed family the information identity gives $\E_{\nu_{\theta,\tau}}[\psi_\tau']=\E_{\nu_{\theta,\tau}}[\psi_\tau^2]=I(\tau)$ with $I(\tau)=\pi/\tau$ from \eqref{eq:I-tau}, we obtain \[ \E_Q[\psi_\tau']=I(\tau)+O\!\left(\frac{\ep}{\sqrt{\tau}}\right),\qquad \E_Q[\psi_\tau^2]=I(\tau)+O\!\left(\frac{\ep}{\tau^{3/2}}\right).
\] \end{lemma} \begin{proof} Apply Lemma~\ref{lem:dyn-W2} with $P=\nu_{\theta,\tau}$, $Q$ as given, and successively $\varphi=\psi_\tau$, $\varphi=\psi_\tau'$, and $\varphi=\psi_\tau^2$. This yields \[ \left|\E_Q[\varphi]-\E_{\nu_{\theta,\tau}}[\varphi]\right| \le \left(\int_0^1 \E_{P_t}[\varphi'(X_t)^2]\,dt\right)^{1/2}\, W_2(Q,\nu_{\theta,\tau}), \] where $X_t=(1-t)X+tT(X)$ with $X\sim \nu_{\theta,\tau}$ and $T$ the monotone transport from $\nu_{\theta,\tau}$ to $Q$. By the triangle inequality and Lemma~\ref{lem:info-W2}, \[ W_2(Q,\nu_{\theta,\tau}) \;\le\; W_2(Q,\mu_\theta)+W_2(\mu_\theta,\nu_{\theta,\tau}) \;\le\; \ep + c^{1/2}\tau^{3/2}. \] We next bound the $t$–integrals. By construction, $\psi_\tau$ vanishes on the bulk $\{|u|\le \tfrac12-\tau\}$ and is smooth on the two edge regions. Differentiating the explicit tail expression in \eqref{eq:nu-density} and using a change of variables $y=\frac{u-(1/2-\tau)}{2\tau K(0)}$ on each edge, one checks that there are absolute constants $A_1,A_2,A_3$ such that, for all $t\in[0,1]$, \[ \E_{P_t}\!\big[\psi_\tau'(X_t-\theta)^2\big]\le \frac{A_1}{\tau},\qquad \E_{P_t}\!\big[\psi_\tau''(X_t-\theta)^2\big]\le \frac{A_2}{\tau^2},\qquad \E_{P_t}\!\big[((\psi_\tau^2)'(X_t-\theta))^2\big]\le \frac{A_3}{\tau^3}. \] Intuitively, these come from the facts that: (i) all these derivatives vanish on the bulk and are nonzero only on the two edge regions; (ii) the $\nu_{\theta,\tau}$-mass of each edge region equals $\tau$ (the two edges carry total mass $2\tau$); and (iii) the derivatives scale, respectively, like $\tau^{-1/2}$, $\tau^{-1}$, and $\tau^{-3/2}$ in $L^\infty$ over the edge regions generated by the kernel rescaling in \eqref{eq:nu-density}. Integrating over $t\in[0,1]$ preserves these orders. 
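To make fact (ii) above concrete, here is a quick numerical sanity check (not part of the proof; the helper names `K`, `f`, and `integrate` are ours): the density \eqref{eq:nu-density} integrates to one, and each edge region carries mass $\tau$.

```python
import math

def K(x):
    """Standard normal density (2*pi)^(-1/2) * exp(-x^2/2)."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def f(x, theta, tau):
    """Density of nu_{theta,tau}: 1 on the bulk, rescaled Gaussian tail on the edges."""
    u = abs(x - theta)
    if u <= 0.5 - tau:
        return 1.0
    y = (u - (0.5 - tau)) / (2 * tau * K(0))
    return K(y) / K(0)

def integrate(g, a, b, n=100000):
    """Midpoint-rule quadrature of g on [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

theta, tau = 0.0, 0.1
# Total mass, and the mass of the right edge region {x >= theta + 1/2 - tau}.
total = integrate(lambda x: f(x, theta, tau), theta - 6.0, theta + 6.0)
right_edge = integrate(lambda x: f(x, theta, tau), theta + 0.5 - tau, theta + 6.0)
print(total, right_edge)  # total mass ~ 1, edge mass ~ tau
```

With $\tau=0.1$ the printed values are approximately $1.0$ and $0.1$; truncating at $\theta\pm 6$ is harmless since the Gaussian tail is rescaled by $2\tau K(0)\approx 0.08$.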
Putting the pieces together, \[ \left(\int_0^1 \E_{P_t}[\psi_\tau'(X_t-\theta)^2]\,dt\right)^{1/2}\!\!\lesssim \tau^{-1/2},\, \left(\int_0^1 \E_{P_t}[\psi_\tau''(X_t-\theta)^2]\,dt\right)^{1/2}\!\!\lesssim \tau^{-1}, \] and $\left(\int_0^1 \E_{P_t}[((\psi_\tau^2)'(X_t-\theta))^2]\,dt\right)^{1/2}\!\!\lesssim \tau^{-3/2}$. Since $W_2(Q,\nu_{\theta,\tau})\le \ep + c^{1/2}\tau^{3/2}$ and $\tau\le \tfrac12$, the $\ep$ term controls the product for the regimes of interest (at the choice $\tau=\tau_\star=\ep^{2/3}\wedge\tfrac12$ one has $\tau^{3/2}\le\ep$, so $W_2(Q,\nu_{\theta,\tau})\lesssim \ep$); absorbing harmless constants yields the stated bounds \eqref{eq:bias-score}–\eqref{eq:var-score}. Finally, the “in particular” lines follow by inserting $\E_{\nu_{\theta,\tau}}[\psi_\tau']=\E_{\nu_{\theta,\tau}}[\psi_\tau^2]=I(\tau)$ from \eqref{eq:I-tau}. \end{proof} We can now prove Theorem~\ref{thm:IDS-sharp}. \begin{proof}[Proof of Theorem~\ref{thm:IDS-sharp}] Fix $\tau=\tau_\star$ and write $\psi:=\psi_{\tau_\star}$, $I_\star=I(\tau_\star)=\pi/\tau_\star$. Let $m(Q)=-\partial_t \E_Q[\psi(X'-t)]\big|_{t=\theta}=\E_Q[\psi'(X'-\theta)]$ denote the population curvature under $Q$. Next, we claim that \begin{equation}\label{mq} m(Q)=I_\star + O\!\left(\frac{\ep}{\sqrt{\tau_\star}}\right). \end{equation} Indeed, for the location family $\{\nu_{\theta,\tau}:\theta\in\R\}$ with score $s_\theta(x)=\psi_\tau(x-\theta)$, the information identity \( -E_{\nu_{\theta,\tau}}\!\big[\partial_\theta s_\theta(X)\big] \;=\; E_{\nu_{\theta,\tau}}\!\big[s_\theta(X)^2\big] \) holds whenever differentiation under the integral is justified (true here because $f(\cdot;\theta,\tau)$ is $C^1$ and piecewise real analytic on the two edge strips, with bounded score and score derivative). Since $s_\theta(x)$ depends on $\theta$ only via $x-\theta$, we have $\partial_\theta s_\theta(x)=-\psi_\tau'(x-\theta)$, hence \begin{equation}\label{eq:E-nu-psi'-is-I} \E_{\nu_{\theta,\tau}}\!\big[\psi_\tau'(X-\theta)\big]\;=\; \E_{\nu_{\theta,\tau}}\!\big[\psi_\tau(X-\theta)^2\big]\;=\; I(\tau).
\end{equation} \paragraph{Curvature under $Q$.} By Lemma~\ref{lem:score-moments} with $\varphi=\psi_\tau'$ and $P=\nu_{\theta,\tau}$, we have \[ m(Q)\equiv \E_Q[\psi_\tau'(X'-\theta)] =\E_{\nu_{\theta,\tau}}[\psi_\tau']+O\!\left(\frac{\ep}{\sqrt{\tau}}\right) =I(\tau)+O\!\left(\frac{\ep}{\sqrt{\tau}}\right). \] With $\tau=\tau_\star$ this is exactly \eqref{mq}. Since $\tau_\star=\ep^{2/3}$, the error term in \eqref{mq} equals $O(\ep^{2/3})$, hence $m(Q)\ge \tfrac12 I_\star$ for all $\ep\in(0,\ep_0]$ with $\ep_0$ a fixed absolute constant. For larger $\ep$, the other term in \eqref{eq:IDS-sharp-risk} will dominate the risk, so we may assume $m(Q)\ge \tfrac12 I_\star$ without loss of generality below. Standard $M$-estimation analysis yields \begin{align}\label{eq:risk-decomp} \E_Q[(\hat\theta_\ep-\theta)^2] \;\le\; \frac{1}{n}\,\frac{\mathrm{Var}_Q(\psi)}{m(Q)^2}\;+\; \frac{\E_Q[\psi]^2}{m(Q)^2} +\frac{C}{n}\,\frac{(\E_Q[\psi^4])^{1/2}\,\|\psi'\|_\infty^2}{m(Q)^4}. \end{align} See Section~\ref{mearg} for a self-contained argument. In our construction $\psi$ is bounded by $C\tau_\star^{-1/2}$, so $\E_Q[\psi^4]\lesssim \tau_\star^{-2}$. Moreover, $\|\psi'\|_\infty^2 \lesssim \tau_\star^{-2}$ and, since $m(Q)\gtrsim I_\star\asymp \tau_\star^{-1}$, the last term is of order $\frac{\tau_\star}{n}$. By Lemma~\ref{lem:score-moments}, $\mathrm{Var}_Q(\psi)\le \E_Q[\psi^2]\le I_\star + C\ep/\tau_\star^{3/2}$, and $m(Q)=I_\star+O(\ep/\sqrt{\tau_\star})$. Therefore the first term in \eqref{eq:risk-decomp} is bounded by $C/(n I_\star)= C\,\tau_\star/(\pi n)$. The second term is a squared bias term; Lemma~\ref{lem:score-moments} gives $|\E_Q[\psi]|\le C\,\ep/\sqrt{\tau_\star}$ and hence, using $m(Q)\gtrsim I_\star=\pi/\tau_\star$, \[ \frac{\E_Q[\psi]^2}{m(Q)^2}\ \le\ C\,\frac{\ep^2/\tau_\star}{I_\star^2} \ =\ C\,\frac{\ep^2\,\tau_\star}{\pi^2} \ \le\ C\,\ep^2\,\tau_\star \ =\ C\,\ep^{8/3}.
\] In the small-shift regime $\ep\le 1$ this is dominated by the variance term $C/(n I_\star)$ provided $\ep\ll n^{-1/2}$; when $\ep\gtrsim n^{-1/2}$, the CDS baseline $\ep^2$ (and the uniform noncontaminated term $1/[2(n+1)(n+2)]$) already dominates. Finally, the remainder term in \eqref{eq:risk-decomp} is of order at most $\tau_\star/n\asymp 1/(n I_\star)$, so it is absorbed into the constant in front of the variance term. Combining the pieces and recalling $\tau_\star=\ep^{2/3}$, $I_\star=\pi/\tau_\star$, we obtain \[ \E_Q[(\hat\theta_\ep-\theta)^2] \ \le\ C\left(\frac{\tau_\star}{n}\ \vee\ \left[\ep^2+\frac{1}{2(n+1)(n+2)}\right]\right) \ =\ C\left(\frac{\ep^{2/3}}{n}\ \vee\ \M_C(\ep;n)\right), \] uniformly over $Q$ with $W_2(Q,\mu_\theta)\le \ep$, as claimed in \eqref{eq:IDS-sharp-risk}. Then, \eqref{eq:IDS-sandwich} follows by combining this upper bound with the lower bound obtained from Lemma~\ref{lem:info-W2} by taking $\tau=\ep^{2/3}$ in the least-favorable submodel $\{\nu_{\theta,\tau}\}$ and invoking the Cram\'er–Rao bound $1/\{n I(\tau)\}$. This completes the proof. \end{proof} \paragraph{Consequences.} The IDS difficulty is pinned to the cubic edge geometry: the least-favorable $\tau$ satisfies $W_2(\mu_\theta,\nu_{\theta,\tau})\asymp \ep$ and $I(\tau)\asymp \tau^{-1}\asymp \ep^{-2/3}$, hence the sharp rate $\ep^{2/3}/n$. \begin{lemma}[Information and $W_2$ distance for $\nu_{\theta,\tau}$]\label{lem:info-W2} Let $\psi_\tau(x-\theta)=\partial_\theta \log f(x;\theta,\tau)=-\partial_x \log f(x;\theta,\tau)$ denote the score of the location family $\{\nu_{\theta,\tau}\}$. Then, for all $\theta\in\R$ and $\tau\in(0,\tfrac12]$, \begin{align} I(\tau)\equiv \E_{\nu_{\theta,\tau}}[\psi_\tau(X-\theta)^2]=\frac{\pi}{\tau},\label{eq:I-tau}\\ W_2^2\!\left(\mu_\theta,\nu_{\theta,\tau}\right)=c\,\tau^3,\qquad c=\frac{2\left(6-6\sqrt2+\pi\right)}{3\pi}.\label{eq:W2-tau} \end{align} \end{lemma} \begin{proof} The score vanishes on the bulk and is supported in the edge strips.
On $x\ge \theta+\tfrac12-\tau$ one has \[ \psi_\tau(x-\theta)=-\frac{\partial}{\partial x}\log f(x;\theta,\tau) =-\frac{K'\!\left(\frac{x-\theta-(\tfrac12-\tau)}{2\tau K(0)}\right)}{2\tau K(0)\,K\!\left(\frac{x-\theta-(\tfrac12-\tau)}{2\tau K(0)}\right)}, \] with the analogous expression on the left edge by symmetry. Using the substitution $y=(x-\theta-(\tfrac12-\tau))/(2\tau K(0))$ and symmetry of the two edges, \begin{align*} I(\tau) &=2\int_{\theta+1/2-\tau}^\infty \psi_\tau(x-\theta)^2\, f(x;\theta,\tau)\,dx\\ &=\frac{2}{(2\tau K(0))^2 K(0)}\int_{\theta+1/2-\tau}^\infty \frac{K'(y)^2}{K(y)}\,dx =\frac{1}{\tau K(0)^2}\int_0^\infty \frac{K'(y)^2}{K(y)}\,dy =\frac{\pi}{\tau}, \end{align*} where the second equality substitutes $dx=2\tau K(0)\,dy$, and in the last equality we used $K(0)=1/\sqrt{2\pi}$, $K'(y)=-y\,K(y)$, and the Gaussian integral $\int_0^\infty K'(y)^2/K(y)\,dy=\int_0^\infty y^2K(y)\,dy=\tfrac12$, so that $\frac{1}{\tau K(0)^2}\cdot\tfrac12=\frac{\pi}{\tau}$. For $W_2$, work with quantile functions: $F_1^{-1}(q)=\theta-1/2+q$ for $\mu_\theta$, and on $q\in[1-\tau,1]$ a direct inversion of the right-edge CDF yields \[ F_2^{-1}(q)=\theta+\tfrac12-\tau+2\tau K(0)\,S^{-1}\!\left(\frac{q-(1-2\tau)}{2\tau}\right). \] By symmetry, \begin{align*} W_2^2(\mu_\theta,\nu_{\theta,\tau}) &=2\int_{1-\tau}^1\Big(F_1^{-1}(q)-F_2^{-1}(q)\Big)^2\,dq\\ &= 2\int_{1-\tau}^1 \Big((q+\tau-1)-2\tau K(0) S^{-1}\!\big(\tfrac{q-(1-2\tau)}{2\tau}\big)\Big)^2\,dq\\ &=\tau^3\left(\frac{2}{3}+16K(0)^2\int_{0}^\infty u^2 K(u)\,du -16K(0)\int_0^\infty (2S(u)-1)\,u\,K(u)\,du\right). \end{align*} Evaluating the Gaussian integrals gives \eqref{eq:W2-tau} with the stated constant $c$. \end{proof} \subsubsection{M-estimation argument} \label{mearg} Here, we show \eqref{eq:risk-decomp}. Let $G_n(t)=n^{-1}\sum_{i=1}^n\psi(X_i'-t)$, $g_Q(t)=\E_Q[\psi(X'-t)]$, so that $m(Q)=-g_Q'(\theta)=\E_Q[\psi'(X'-\theta)]$. We assume: \begin{enumerate} \item[(A1)] $\psi$ is $C^1$ and nondecreasing (so that $t\mapsto G_n(t)$ is nonincreasing), with bounded derivatives: $\|\psi\|_\infty<\infty$, $\|\psi'\|_\infty<\infty$, $\|\psi''\|_\infty<\infty$.
\item[(A2)] $g_Q$ is differentiable in a neighborhood of $\theta$ and $m(Q)>0$. \end{enumerate} In our model these hold: $\psi$ vanishes on the bulk and is smooth, monotone, and bounded on the two edge strips. By \eqref{mq}, we have $m(Q)\ge \tfrac12 I(\tau_\star)>0$. \medskip \noindent\textbf{Mean-value expansion of the estimating equation.} By definition $\hat\theta_\ep$ solves $G_n(\hat\theta_\ep)=0$. By the mean-value theorem there exists a (random) $\bar t$ on the line segment $[\theta,\hat\theta_\ep]$ such that \[ 0 \;=\; G_n(\hat\theta_\ep) \;=\; G_n(\theta) + (\hat\theta_\ep-\theta)\,\frac{d}{dt}G_n(t)\big|_{t=\bar t} \;=\; G_n(\theta) - (\hat\theta_\ep-\theta)\,m_n(\bar t), \] where we set $m_n(t):=-G_n'(t)=n^{-1}\sum_{i=1}^n \psi'(X_i'-t)$. Hence \begin{equation}\label{eq:basic-mvt} \hat\theta_\ep-\theta \;=\; \frac{G_n(\theta)}{m_n(\bar t)}. \end{equation} Add and subtract expectations in the numerator: \[ G_n(\theta) \;=\; \underbrace{\big(G_n(\theta)-g_Q(\theta)\big)}_{\text{empirical fluctuation}} \;+\; \underbrace{g_Q(\theta)}_{\text{population bias}}, \] and write \begin{equation*} \frac{1}{m_n(\bar t)} \;=\; \frac{1}{m(Q)} \;+\; \Big(\frac{1}{m_n(\bar t)}-\frac{1}{m(Q)}\Big). 
\end{equation*} Multiplying this by $G_n(\theta)$ and substituting into \eqref{eq:basic-mvt} yields the exact decomposition \begin{align} \hat\theta_\ep-\theta &= \frac{1}{m(Q)}\Big(G_n(\theta)-g_Q(\theta)\Big) \;+\; \frac{g_Q(\theta)}{m(Q)} \;+\; \mathrm{Rem}, \\ \mathrm{where}\quad \mathrm{Rem} &:= \Big(G_n(\theta)-g_Q(\theta)\Big)\Big(\frac{1}{m_n(\bar t)}-\frac{1}{m(Q)}\Big) + g_Q(\theta)\Big(\frac{1}{m_n(\bar t)}-\frac{1}{m(Q)}\Big).\label{eq:Rem-def} \end{align} Next, we claim that \begin{equation}\label{eq:Rem-quad-goal} \E_Q[\mathrm{Rem}^2] \ \le\ \frac{C}{m(Q)^4}\,\E_Q\!\left[\Big(G_n-g_Q\Big)^2\big(m_n(\bar t)-m(Q)\big)^2\right] \ +\ \frac{C}{m(Q)^4}\,\big(\E_Q[\psi^4]\big)^{1/2}\cdot \frac{\|\psi'\|_\infty^2}{n}. \end{equation} By the triangle inequality, \begin{align} \mathrm{Rem}^2 &\le 2\Big(G_n(\theta)-g_Q(\theta)\Big)^2\Big(\frac{1}{m_n(\bar t)}-\frac{1}{m(Q)}\Big)^2 + 2\,g_Q(\theta)^2\Big(\frac{1}{m_n(\bar t)}-\frac{1}{m(Q)}\Big)^2.\label{eq:Rem-square-raw} \end{align} Next, for any $a,b>0$, $|a^{-1}-b^{-1}|=|a-b|/(ab)$. Apply this with $a=m_n(\bar t)$, $b=m(Q)$. Define the ``good curvature'' event \( \mathcal A:=\{\,m_n(\bar t)\ge m(Q)/2\,\}. \) On $\mathcal A$, \begin{equation}\label{eq:recip-good} \Big|\frac{1}{m_n(\bar t)}-\frac{1}{m(Q)}\Big| =\frac{|\,m_n(\bar t)-m(Q)\,|}{m_n(\bar t)\,m(Q)} \le \frac{2}{m(Q)^2}\,|\,m_n(\bar t)-m(Q)\,|, \end{equation} and on $\mathcal A^c$ we have deterministically \begin{equation}\label{eq:recip-bad} \Big|\frac{1}{m_n(\bar t)}-\frac{1}{m(Q)}\Big| \le \frac{1}{m_n(\bar t)}+\frac{1}{m(Q)} \le \frac{2}{m(Q)}.
\end{equation} Taking expectations in \eqref{eq:Rem-square-raw} and splitting over $\mathcal A$ and $\mathcal A^c$, we obtain \begin{align} \E_Q[\mathrm{Rem}^2] &\le 2\,\E_Q\!\left[\Big(G_n(\theta)-g_Q(\theta)\Big)^2\Big(\frac{1}{m_n(\bar t)}-\frac{1}{m(Q)}\Big)^2\mathbf 1_{\mathcal A}\right]\nonumber\\ &\quad + 2\,\E_Q\!\left[\Big(G_n(\theta)-g_Q(\theta)\Big)^2\Big(\frac{1}{m_n(\bar t)}-\frac{1}{m(Q)}\Big)^2\mathbf 1_{\mathcal A^c}\right]\nonumber\\ &\quad + 2\,g_Q(\theta)^2\,\E_Q\!\left[\Big(\frac{1}{m_n(\bar t)}-\frac{1}{m(Q)}\Big)^2\mathbf 1_{\mathcal A}\right] + 2\,g_Q(\theta)^2\,\E_Q\!\left[\Big(\frac{1}{m_n(\bar t)}-\frac{1}{m(Q)}\Big)^2\mathbf 1_{\mathcal A^c}\right].\label{eq:four-terms} \end{align} \paragraph{Bounds on the ``good'' part $\mathcal A$.} Use \eqref{eq:recip-good} inside the expectations: \begin{align} 2\,\E_Q\!\left[\Big(G_n-g_Q\Big)^2\Big(\frac{1}{m_n(\bar t)}-\frac{1}{m(Q)}\Big)^2\mathbf 1_{\mathcal A}\right] &\le \frac{8}{m(Q)^4}\,\E_Q\!\left[\Big(G_n-g_Q\Big)^2\big(m_n(\bar t)-m(Q)\big)^2\right],\label{eq:good1}\\ 2\,g_Q(\theta)^2\,\E_Q\!\left[\Big(\frac{1}{m_n(\bar t)}-\frac{1}{m(Q)}\Big)^2\mathbf 1_{\mathcal A}\right] &\le \frac{8\,g_Q(\theta)^2}{m(Q)^4}\,\E_Q\!\left[\big(m_n(\bar t)-m(Q)\big)^2\right].\label{eq:good2} \end{align} \paragraph{Bounds on the ``bad'' part $\mathcal A^c$.} Use \eqref{eq:recip-bad}: \begin{align} 2\,\E_Q\!\left[\Big(G_n-g_Q\Big)^2\Big(\frac{1}{m_n(\bar t)}-\frac{1}{m(Q)}\Big)^2\mathbf 1_{\mathcal A^c}\right] &\le \frac{8}{m(Q)^2}\,\E_Q\!\left[\Big(G_n-g_Q\Big)^2\mathbf 1_{\mathcal A^c}\right],\label{eq:bad1}\\ 2\,g_Q(\theta)^2\,\E_Q\!\left[\Big(\frac{1}{m_n(\bar t)}-\frac{1}{m(Q)}\Big)^2\mathbf 1_{\mathcal A^c}\right] &\le \frac{8\,g_Q(\theta)^2}{m(Q)^2}\,\mathrm{P}_Q(\mathcal A^c).\label{eq:bad2} \end{align} We now control the two factors $\E_Q[(G_n-g_Q)^2\mathbf 1_{\mathcal A^c}]$ and $\mathrm{P}_Q(\mathcal A^c)$.
\medskip \noindent\emph{(a) Bounding $\mathrm{P}_Q(\mathcal A^c)$ via Chebyshev.} By definition $\mathcal A^c=\{\,m_n(\bar t)<m(Q)/2\,\}$, hence \begin{equation}\label{eq:Ac-prob} \mathrm{P}_Q(\mathcal A^c)\;\le\; \mathrm{P}_Q\big(|m_n(\bar t)-m(Q)|\ge m(Q)/2\big)\;\le\; \frac{4}{m(Q)^2}\,\E_Q\big[(m_n(\bar t)-m(Q))^2\big]. \end{equation} Therefore, \begin{equation}\label{eq:bad2-bnd} \eqref{eq:bad2}\ \le\ \frac{32\,g_Q(\theta)^2}{m(Q)^4}\,\E_Q\!\left[(m_n(\bar t)-m(Q))^2\right]. \end{equation} \medskip \noindent\emph{(b) Bounding $\E_Q[(G_n-g_Q)^2\mathbf 1_{\mathcal A^c}]$ via Cauchy--Schwarz and a 4th moment bound.} By Cauchy--Schwarz, \begin{equation}\label{eq:CS-bad1} \E_Q\!\left[\Big(G_n-g_Q\Big)^2\mathbf 1_{\mathcal A^c}\right] \le \Big(\E_Q\big[\big(G_n-g_Q\big)^4\big]\Big)^{1/2}\ \mathrm{P}_Q(\mathcal A^c)^{1/2}. \end{equation} We bound the fourth moment of the empirical mean $G_n(\theta)=n^{-1}\sum_{i=1}^n Y_i$ with $Y_i=\psi(X_i'-\theta)$, which are i.i.d.\ under $Q$, centered at $g_Q(\theta)=\E_Q[Y]$. A standard inequality (e.g., the Marcinkiewicz--Zygmund inequality or a direct expansion) yields \begin{equation}\label{eq:Gn-fourth} \E_Q\!\left[\Big(G_n(\theta)-g_Q(\theta)\Big)^4\right] \ \le\ \frac{C}{n^2}\,\E_Q\!\left[|Y-g_Q(\theta)|^4\right] \ \le\ \frac{C}{n^2}\,\E_Q\!\left[|Y|^4\right] \ \le\ \frac{C}{n^2}\,\E_Q[\psi(X'-\theta)^4], \end{equation} where we used $|g_Q(\theta)|\le(\E_Q[Y^2])^{1/2}$ and then $(a+b)^4\le 8(a^4+b^4)$ to absorb $g_Q(\theta)$ into the constant $C$. Combining \eqref{eq:CS-bad1}, \eqref{eq:Gn-fourth}, and \eqref{eq:Ac-prob} gives \begin{equation}\label{eq:bad1-bnd} \eqref{eq:bad1}\ \le\ \frac{8}{m(Q)^2}\cdot \frac{C^{1/2}}{n}\,\big(\E_Q[\psi^4]\big)^{1/2}\ \cdot\ \frac{2}{m(Q)}\Big(\E_Q[(m_n(\bar t)-m(Q))^2]\Big)^{1/2}.
\end{equation} \paragraph{Assembling the pieces.} Summing \eqref{eq:good1}, \eqref{eq:good2}, \eqref{eq:bad2-bnd}, and \eqref{eq:bad1-bnd} we obtain \begin{align} \E_Q[\mathrm{Rem}^2] &\le \frac{8}{m(Q)^4}\,\E_Q\!\left[\Big(G_n-g_Q\Big)^2\big(m_n(\bar t)-m(Q)\big)^2\right] + \frac{C\,g_Q(\theta)^2}{m(Q)^4}\,\E_Q\!\left[(m_n(\bar t)-m(Q))^2\right]\nonumber\\ &\quad + \frac{C}{m(Q)^3 n}\,\big(\E_Q[\psi^4]\big)^{1/2}\cdot \Big(\E_Q[(m_n(\bar t)-m(Q))^2]\Big)^{1/2}.\label{eq:sum-before-CS} \end{align} By the inequality $ab\le \frac12(a^2+b^2)$ applied to the last term, \begin{align} \frac{C}{m(Q)^3 n}\,\big(\E_Q[\psi^4]\big)^{1/2}\cdot \Big(\E_Q[(m_n-m(Q))^2]\Big)^{1/2} &\le \frac{C}{2m(Q)^4}\,\E_Q[(m_n-m(Q))^2]\ +\ \frac{C}{2m(Q)^2 n^2}\,\E_Q[\psi^4].\label{eq:CS-last} \end{align} Insert \eqref{eq:CS-last} into \eqref{eq:sum-before-CS} and absorb constants: \begin{align} \E_Q[\mathrm{Rem}^2] &\le \frac{C}{m(Q)^4}\,\E_Q\!\left[\Big(G_n-g_Q\Big)^2\big(m_n(\bar t)-m(Q)\big)^2\right]\nonumber\\ &\quad + \frac{C}{m(Q)^4}\,\Big(g_Q(\theta)^2+1\Big)\,\E_Q\!\left[(m_n(\bar t)-m(Q))^2\right] + \frac{C}{m(Q)^2 n^2}\,\E_Q[\psi^4].\label{eq:pre-final} \end{align} \paragraph{Bounding $g_Q(\theta)^2$ and reducing to \eqref{eq:Rem-quad-goal}.} By Cauchy--Schwarz, \[ g_Q(\theta)^2=\big(\E_Q[\psi(X'-\theta)]\big)^2\le \E_Q[\psi(X'-\theta)^2]\le \big(\E_Q[\psi(X'-\theta)^4]\big)^{1/2}. \] Therefore the second line of \eqref{eq:pre-final} is bounded by \[ \frac{C}{m(Q)^4}\,\big(\E_Q[\psi^4]\big)^{1/2}\,\E_Q\!\left[(m_n(\bar t)-m(Q))^2\right]\ +\ \frac{C}{m(Q)^2 n^2}\,\E_Q[\psi^4]. \] Finally, we use the standard variance bound for the empirical derivative at $t=\theta$: \begin{equation}\label{eq:mn-theta-var} \E_Q\!\left[(m_n(\theta)-m(Q))^2\right]\ \le\ \frac{1}{n}\mathrm{Var}_Q\!\big(\psi'(X'-\theta)\big)\ \le\ \frac{\|\psi'\|_\infty^2}{n}. 
\end{equation} Replacing $\bar t$ by $\theta$ only weakens the bound (since $\E[(m_n(\bar t)-m(Q))^2]\le 2\E[(m_n(\theta)-m(Q))^2]+ 2\E[(m_n(\bar t)-m_n(\theta))^2]$, and the second term can be handled by a Lipschitz bound in $t$; any such term proportional to $\E_Q[(\hat\theta_\ep-\theta)^2]$ will be absorbed as explained below). Using \eqref{eq:mn-theta-var} and $(\E_Q[\psi^4])^{1/2}\ge 1$, the second line of \eqref{eq:pre-final} is bounded by \[ \frac{C}{m(Q)^4}\,\big(\E_Q[\psi^4]\big)^{1/2}\cdot \frac{\|\psi'\|_\infty^2}{n}\;+\; \frac{C}{m(Q)^4}\,\E_Q[\psi^4]\cdot \frac{1}{n^2}. \] The last term is dominated by the displayed one because $\E_Q[\psi^4]\le \|\psi\|_\infty^4$ and $n\ge 1$; hence we may keep the cleaner expression. Collecting terms we have shown \eqref{eq:Rem-quad-goal}. \paragraph{Absorbing terms proportional to $\E_Q[(\hat\theta_\ep-\theta)^2]$.} If, in an intermediate step, one keeps the Lipschitz contribution \[ \E_Q\!\left[(m_n(\bar t)-m_n(\theta))^2\right]\ \le\ L^2\,\E_Q\big[(\hat\theta_\ep-\theta)^2\big] \] (with $L=\sup_{x,t}|\partial_t \psi'(x-t)|=\|\psi''\|_\infty$), then the bound \eqref{eq:pre-final} produces an extra addend of the form \( \frac{C\,L^2}{m(Q)^4}\,\E_Q\big[(\hat\theta_\ep-\theta)^2\big]. \) When this remainder bound is plugged back into the main risk decomposition \eqref{eq:risk-decomp}, \[ \E_Q[(\hat\theta_\ep-\theta)^2] \le \frac{1}{n}\frac{\mathrm{Var}_Q(\psi)}{m(Q)^2}+ \frac{\E_Q[\psi]^2}{m(Q)^2} + \E_Q[\mathrm{Rem}^2], \] the term $\frac{C\,L^2}{m(Q)^4}\E_Q[(\hat\theta_\ep-\theta)^2]$ sits on the right-hand side. Using the curvature lower bound $m(Q)\ge c_0 I_\star$ (Lemma~\ref{lem:score-moments}), choose $n$-independent constants so that \( \frac{C\,L^2}{m(Q)^4}\ \le\ \frac12, \) and move $\tfrac12\,\E_Q[(\hat\theta_\ep-\theta)^2]$ to the left-hand side. This is the standard absorption step; it only changes the multiplicative constant in the final risk bound.
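In symbols, the absorption step is the elementary implication below (our shorthand: $r=\E_Q[(\hat\theta_\ep-\theta)^2]$, assumed finite, $a$ the sum of the remaining right-hand-side terms, and $\kappa=C\,L^2/m(Q)^4$):

```latex
r \;\le\; a + \kappa\,r ,\qquad \kappa \le \tfrac12
\quad\Longrightarrow\quad
(1-\kappa)\,r \;\le\; a
\quad\Longrightarrow\quad
r \;\le\; \frac{a}{1-\kappa} \;\le\; 2a .
```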
Thus, for the purpose of \eqref{eq:Rem-quad-goal}, we can replace $\E_Q[(m_n(\bar t)-m(Q))^2]$ by $\E_Q[(m_n(\theta)-m(Q))^2]\le \|\psi'\|_\infty^2/n$, yielding the stated clean form. \paragraph{The remaining mixed term in \eqref{eq:Rem-quad-goal}.} Starting from \eqref{eq:Rem-quad-goal}, the only piece we have not yet bounded is \( \E_Q\!\left[\Big(G_n(\theta)-g_Q(\theta)\Big)^2\big(m_n(\bar t)-m(Q)\big)^2\right]. \) Write $A:=G_n(\theta)-g_Q(\theta)$ and $B(\bar t):=m_n(\bar t)-m(Q)$. Using the Lipschitz property in $t$ implied by $\|\psi''\|_\infty<\infty$ we have \[ |m_n(\bar t)-m_n(\theta)| \le \|\psi''\|_\infty\,|\bar t-\theta|\le\|\psi''\|_\infty\,|\hat\theta_\ep-\theta|, \] and similarly for the population derivative, \[ \big|\E_Q[\psi'(X'-\bar t)]-\E_Q[\psi'(X'-\theta)]\big|\le \|\psi''\|_\infty\,|\hat\theta_\ep-\theta|. \] Hence \[ |B(\bar t)|=\big|m_n(\bar t)-m(Q)\big| \le \big|m_n(\theta)-m(Q)\big| + 2\|\psi''\|_\infty\,|\hat\theta_\ep-\theta|. \] Squaring and using $(x+y)^2\le 2x^2+2y^2$ gives \( B(\bar t)^2 \le 2\big(m_n(\theta)-m(Q)\big)^2 + 8\|\psi''\|_\infty^2\,(\hat\theta_\ep-\theta)^2. \) Therefore \begin{align*} \E_Q\!\left[A^2 B(\bar t)^2\right] &\le 2\,\E_Q\!\left[A^2\big(m_n(\theta)-m(Q)\big)^2\right] + 8\|\psi''\|_\infty^2\,\E_Q\!\left[A^2(\hat\theta_\ep-\theta)^2\right]. \end{align*} We bound the two terms separately. For the first, by Cauchy--Schwarz and the fourth-moment bounds already used in \eqref{eq:Gn-fourth} and the analogous bound for $m_n(\theta)-m(Q)$, \[ \E_Q\!\left[A^2\big(m_n(\theta)-m(Q)\big)^2\right] \le \Big(\E_Q[A^4]\Big)^{1/2}\Big(\E_Q\!\left[\big(m_n(\theta)-m(Q)\big)^4\right]\Big)^{1/2} \le \frac{C}{n^2}\,(\E_Q[\psi^4])^{1/2}\,\|\psi'\|_\infty^2.
\] For the second term we use boundedness of $\psi$: since $|G_n(\theta)|\le \|\psi\|_\infty$ and $|g_Q(\theta)|\le \|\psi\|_\infty$, we have $|A|\le 2\|\psi\|_\infty$, hence \( \E_Q\!\left[A^2(\hat\theta_\ep-\theta)^2\right] \le 4\|\psi\|_\infty^2\,\E_Q\!\left[(\hat\theta_\ep-\theta)^2\right]. \) Combining the last two displays yields \begin{equation}\label{eq:key-mixed} \E_Q\!\left[\Big(G_n-g_Q\Big)^2\big(m_n(\bar t)-m(Q)\big)^2\right] \le \frac{C}{n^2}\,(\E_Q[\psi^4])^{1/2}\,\|\psi'\|_\infty^2 \;+\; C\,\|\psi''\|_\infty^2\,\|\psi\|_\infty^2\,\E_Q\!\left[(\hat\theta_\ep-\theta)^2\right]. \end{equation} \paragraph{Completing the remainder bound.} Insert \eqref{eq:key-mixed} into \eqref{eq:Rem-quad-goal}. We obtain \[ \E_Q[\mathrm{Rem}^2] \le \frac{C}{m(Q)^4}\left\{\frac{(\E_Q[\psi^4])^{1/2}\,\|\psi'\|_\infty^2}{n^2} + \|\psi''\|_\infty^2\,\|\psi\|_\infty^2\,\E_Q[(\hat\theta_\ep-\theta)^2]\right\} + \frac{C}{m(Q)^4}\,(\E_Q[\psi^4])^{1/2}\,\frac{\|\psi'\|_\infty^2}{n}. \] The $n^{-2}$ term is dominated by the $n^{-1}$ term, since $\E_Q[\psi^4]\le \|\psi\|_\infty^4$ and $n\ge1$. %The term proportional to $\E_Q[(\hat\theta_\ep-\theta)^2]$ will be absorbed momentarily. Thus \begin{equation}\label{eq:Rem-final} \E_Q[\mathrm{Rem}^2] \le \frac{C}{m(Q)^4}\,(\E_Q[\psi^4])^{1/2}\,\frac{\|\psi'\|_\infty^2}{n} \;+\; \frac{C\,\|\psi''\|_\infty^2\,\|\psi\|_\infty^2}{m(Q)^4}\,\E_Q[(\hat\theta_\ep-\theta)^2]. \end{equation} \paragraph{Risk decomposition and absorption.} Returning to \eqref{eq:basic-mvt} and its exact decomposition, the standard inequality $(a+b+c)^2\le3(a^2+b^2+c^2)$ yields \[ \E_Q[(\hat\theta_\ep-\theta)^2] \le \frac{C}{m(Q)^2}\,\E_Q\!\left[\big(G_n(\theta)-g_Q(\theta)\big)^2\right] + \frac{g_Q(\theta)^2}{m(Q)^2} + \E_Q[\mathrm{Rem}^2]. 
\] Since $\E_Q[(G_n(\theta)-g_Q(\theta))^2]=\mathrm{Var}_Q(G_n(\theta))=\mathrm{Var}_Q(\psi(X'-\theta))/n$, using \eqref{eq:Rem-final} gives \begin{align*} \E_Q[(\hat\theta_\ep-\theta)^2] &\le \frac{1}{n}\,\frac{\mathrm{Var}_Q(\psi(X'-\theta))}{m(Q)^2} + \frac{g_Q(\theta)^2}{m(Q)^2} + \frac{C}{m(Q)^4}\,(\E_Q[\psi^4])^{1/2}\,\frac{\|\psi'\|_\infty^2}{n}\\ &\qquad + \frac{C\,\|\psi''\|_\infty^2\,\|\psi\|_\infty^2}{m(Q)^4}\,\E_Q[(\hat\theta_\ep-\theta)^2]. \end{align*} By the curvature lower bound $m(Q)\ge \tfrac12 I(\tau_\star)>0$ and the boundedness of $\psi$ and its derivatives, the last coefficient is an $n$-independent constant $<1$ after adjusting $C$. Moving it to the left-hand side (the standard absorption step) alters only constants, yielding \begin{equation}\label{eq:final-risk} \E_Q[(\hat\theta_\ep-\theta)^2] \ \le\ \frac{C}{n}\,\frac{\mathrm{Var}_Q(\psi(X'-\theta))}{m(Q)^2} \ +\ \frac{C\,g_Q(\theta)^2}{m(Q)^2} \ +\ \frac{C}{n}\,\frac{(\E_Q[\psi^4])^{1/2}\,\|\psi'\|_\infty^2}{m(Q)^4}. \end{equation} \qed
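As a numerical cross-check of Lemma~\ref{lem:info-W2} (again outside the proof; the helper names below are ours), one can verify $I(\tau)=\pi/\tau$ and $W_2^2(\mu_\theta,\nu_{\theta,\tau})=c\,\tau^3$ by quadrature. Using $K'(y)=-y\,K(y)$, the information integral reduces to $\int_0^\infty y^2K(y)\,dy=\tfrac12$, and after the substitution $q=1-2\tau+2\tau S(u)$ the $W_2$ claim is equivalent to $4J=c$ for the integral $J$ computed below:

```python
import math

def K(u):
    """Standard normal density."""
    return math.exp(-u * u / 2) / math.sqrt(2 * math.pi)

def S(u):
    """Standard normal CDF."""
    return 0.5 * (1 + math.erf(u / math.sqrt(2)))

def integrate(g, a, b, n=100000):
    """Midpoint-rule quadrature of g on [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

K0 = K(0)

# Fisher information: I(tau) = (tau * K0^2)^(-1) * int_0^inf K'(y)^2 / K(y) dy,
# and K'(y) = -y K(y), so the integrand is y^2 K(y) with integral 1/2.
half = integrate(lambda y: y * y * K(y), 0.0, 10.0)
tau = 0.05
I_tau = half / (tau * K0 ** 2)

# W2 distance: with q = 1 - 2*tau + 2*tau*S(u), the claim W2^2 = c * tau^3
# is equivalent to 4 * J = c for the integral J below.
J = integrate(lambda u: ((2 * S(u) - 1) - 2 * K0 * u) ** 2 * K(u), 0.0, 10.0)
c = 2 * (6 - 6 * math.sqrt(2) + math.pi) / (3 * math.pi)

print(half, I_tau, math.pi / tau, 4 * J, c)
```

The printed values should satisfy $\texttt{half}\approx\tfrac12$, $\texttt{I\_tau}\approx\pi/\tau$, and $4J\approx c$, to quadrature accuracy, supporting the constants in \eqref{eq:I-tau}--\eqref{eq:W2-tau}.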