Openai/6922876a-7988-8007-9c62-5f71772af6aa
=== Assistant: Below is a full pass over the argument. ===
Below is a full pass over the argument. I’ve separated (I) critical correctness issues from (II) minor issues/typos/constants, and then given (III) drop-in LaTeX fixes for each place. Wherever a claim is wrong but easily repairable, I give a concrete replacement (as literal LaTeX you can paste in).

==== I. Critical correctness issues ====

===== 1) Lemma \ref{lem:dyn-W2}: velocity field =====
* What’s written. In Lemma \ref{lem:dyn-W2} you set $X_t=(1-t)X+tT(X)$ with $X\sim P$, define $P_t=\mathrm{Law}(X_t)$, and then write \[ \frac{d}{dt}\E_{P_t}[\varphi]=\E_{P_t}\big[\varphi'(X_t)\,v_t(X_t)\big],\quad v_t(x)=T(x)-x . \] The last identity is not correct: $T(x)-x$ is defined on the Lagrangian coordinate $x\in\mathrm{supp}(P)$, not on the current Eulerian position $x\in\mathrm{supp}(P_t)$. In particular $v_t(X_t)$ is generally not equal to $T(X)-X$.
* What is correct. If we define $T_t(x)=(1-t)x+tT(x)$ and $X_t=T_t(X)$, then \[ \frac{d}{dt}\E[\varphi(X_t)] \;=\; \E\!\left[\varphi'(X_t)\cdot (T(X)-X)\right], \] where the expectation is w.r.t. the original $X\sim P$. Equivalently, in Eulerian form, the velocity field $v_t$ must satisfy $v_t(T_t(x))=T(x)-x$; that is, $v_t(z)=\E[T(X)-X \mid X_t=z]$, not $T(z)-z$.
* Consequence. The displayed Cauchy–Schwarz step goes through once you rewrite the derivative correctly.
The correct bound is \[ \big|\E_Q[\varphi]-\E_P[\varphi]\big| \le \Big(\int_0^1 \E_P[\varphi'(X_t)^2]\,dt\Big)^{1/2} W_2(P,Q), \] which coincides with your statement after noting $\E_P[g(X_t)]=\E_{P_t}[g]$.
* Drop-in fix: see §III.A.

===== 2) Lemma \ref{lem:score-moments} =====
There are several separate problems:
* (a) Misuse of Lemma \ref{lem:dyn-W2} with $\varphi=\psi_\tau'$ and $\varphi=(\psi_\tau)^2$. The lemma requires $\int_0^1\E_{P_t}[\varphi'(X_t)^2]\,dt<\infty$. For $\varphi=\psi_\tau'$ this means $\psi_\tau''\in L^2(P_t)$. In your construction $f(\cdot;\theta,\tau)$ is $C^1$ but not $C^2$ across the junctions. Indeed, with the chosen Gaussian tails, \[ \psi_\tau(u)=-\partial_u\log f(u) = \begin{cases} 0,& |u|\le \tfrac12-\tau,\\ \operatorname{sign}(u)\,\tfrac{1}{2\tau K(0)}\cdot \frac{|u|-(\tfrac12-\tau)}{2\tau K(0)},& |u|>\tfrac12-\tau, \end{cases} \] so $\psi_\tau$ is continuous, $\psi_\tau'$ is piecewise constant (0 on the flat top; constant $>0$ on the tails), hence $\psi_\tau''$ is a sum of Dirac masses at the junctions and is not a bounded function. Thus $\varphi'=\psi_\tau''$ is not square-integrable and the lemma cannot be applied with $\varphi=\psi_\tau'$.
* (b) “Edge strips” of width $\tau$.
The proof asserts that the derivatives “are nonzero only on the two edge strips” and that “the $\nu_{\theta,\tau}$-mass of each edge region equals $\tau$.” This is incorrect for the derivatives: while the density modification from the uniform happens on a set of $\nu$-mass $2\tau$, the score derivatives $\psi_\tau'$ and $(\psi_\tau^2)'$ are nonzero on the entire tails $|u|>\tfrac12-\tau$, not just on a bounded strip: $\psi_\tau'$ is constant (positive) on the tails; $(\psi_\tau^2)'$ grows linearly in $|u|$. Consequently the integrals $\E_{P_t}[\psi_\tau'(X_t)^2]$ and $\E_{P_t}[((\psi_\tau^2)'(X_t))^2]$ cannot be bounded by “edge mass $\times$ $L^\infty$” with the claimed $\tau$-scalings.
* (c) The scalings in (3.3)–(3.5) are wrong. In fact, on the tails \[ \psi_\tau'(u)\equiv \frac{1}{4\tau^2K(0)^2}\quad\text{(a positive constant),} \] so even under $P_t$ with $\Pr(|X_t-\theta|>\tfrac12-\tau)=1$ one has $\E_{P_t}[\psi_\tau'(X_t-\theta)^2]=\Theta(\tau^{-4})$, not $O(\tau^{-1})$. Similarly, $(\psi_\tau^2)'=2\psi_\tau\psi_\tau'$ grows linearly in $|u|$ on the tails, so its $L^2$ norm under a general $P_t$ depends on higher moments that are not controlled by a $W_2$ constraint.
* (d) Dependence on 4th moments of $Q$. Later you use $\E_Q[\psi^4]\lesssim \tau_\star^{-2}$ and $\|\psi\|_\infty\lesssim \tau_\star^{-1/2}$. Neither is true for the current $\psi$.
For the Gaussian tails we have $|\psi(u)|\asymp |u|/\tau^2$ for large $|u|$; thus $\psi$ is unbounded, and $\E_Q[\psi^4]$ can be arbitrarily large, even with $W_2(Q,\mu_\theta)\le\varepsilon$, by sending a tiny mass $\delta$ far out to distance $R$ (cost $\sim \delta R^2$ in $W_2^2$, held at $\varepsilon^2$), while $\E_Q[\psi^4]\sim \delta R^4/\tau^8\to\infty$ as $R\to\infty$. So the uniform bound in \eqref{eq:risk-decomp} that relies on $\E_Q[\psi^4]$ is not valid over the whole $W_2$-ball.
* Bottom line. Lemma \ref{lem:score-moments} as stated is false. Its proof contains (i) an inapplicable use of Lemma \ref{lem:dyn-W2}, (ii) incorrect support/scaling claims, and (iii) reliance on moment bounds that fail uniformly over the $W_2$-ball.
* Two viable fixes (choose one and adjust the rest accordingly):

Fix A (localize / clip the score). Replace $\psi_\tau$ by a clipped score $\tilde\psi_{\tau,B}$ that coincides with $\psi_\tau$ on the region where $|u|\le \tfrac12-\tau + c\,\tau$ and is truncated outside so that $\tilde\psi_{\tau,B}$ and $\tilde\psi_{\tau,B}'$ are bounded and Lipschitz uniformly in $u$. Then Lemma \ref{lem:dyn-W2} applies with \[ \int_0^1\E_{P_t}[\tilde\psi_{\tau,B}'(X_t)^2]\,dt \lesssim \tau^{-1}, \] and one can prove \[ \big|\E_Q[\tilde\psi_{\tau,B}]-\E_{\nu_{\theta,\tau}}[\tilde\psi_{\tau,B}]\big| \lesssim \frac{\varepsilon}{\sqrt{\tau}},\quad \big|\E_Q[\tilde\psi_{\tau,B}']-\E_{\nu_{\theta,\tau}}[\tilde\psi_{\tau,B}']\big| \lesssim \frac{\varepsilon}{\sqrt{\tau}}, \] uniformly over $Q:W_2(Q,\mu_\theta)\le\varepsilon$.
The clipped estimator coincides with the unclipped one whenever no observation falls in the extreme tails; for the $W_2$-ball, the clipping error can be made negligible at the $\tau$-scales of interest, and all fourth-moment issues vanish.

Fix B (smooth everywhere, not only the edges). Replace the “flat-top + Gaussian tails” $f$ by a globally smoothed log-concave kernel (e.g. convolve $\mu_\theta$ with a $C^\infty$ compactly supported bump of width $\asymp\tau$); then $\psi_\tau\in C^\infty$, $\psi_\tau',\psi_\tau''$ are bounded and supported in a strip of width $O(\tau)$, and Lemma \ref{lem:dyn-W2} yields exactly your $\tau$-scalings. This also resolves uniqueness (see 3) below).

I include drop-in text for Fix A (clipping) in §III.B and for Fix B (global smoothing) in §III.C. Either path repairs the entire chain of arguments and preserves the claimed $\varepsilon^{2/3}/n$ rate and constants up to $\asymp$.

===== 3) Uniqueness of the solution of \eqref{eq:M-est} =====
* What’s written. “It follows from our analysis that this solution is unique.”
* What happens. With your $f$, $\log f(\cdot;\theta,\tau)$ is flat (constant) on $|x-\theta|\le\tfrac12-\tau$. For samples $X_1',\dots,X_n'$ with $\max_i X_i'-\min_i X_i'\le 1-2\tau$, the set \[ \mathcal{I}=\bigcap_{i=1}^n\big[\,X_i'-(\tfrac12-\tau),\, X_i'+(\tfrac12-\tau)\big] \] is a nonempty interval, and for all $t\in\mathcal{I}$ one has $\psi_\tau(X_i'-t)=0$, hence $G_n(t)=0$. Thus every $t\in\mathcal{I}$ solves \eqref{eq:M-est}; the solution need not be unique. This event has probability at least $(1-2\tau)^n>0$ under $Q=\mu_\theta$, since $(1-2\tau)^n$ is the probability that every $X_i'$ lies within $\tfrac12-\tau$ of $\theta$.
* Fix. Either: (i) tie-break the zero set by a deterministic rule (e.g.
choose the midpoint of the interval of roots, or the maximizer of the concave log-likelihood, or the smallest root), and state that uniqueness is not guaranteed; or (ii) adopt Fix B above, which makes $\log f$ strictly concave in $t$ and restores uniqueness. See §III.D for a drop-in sentence/footnote.

===== 4) Use of $\|\psi''\|_\infty$ =====
* As noted, $\psi_\tau'$ has a jump at the two junctions; $\psi_\tau''$ is not a bounded function, so the step \[ |m_n(\bar t)-m_n(\theta)|\le \|\psi''\|_\infty\,|\bar t-\theta| \] is not justified. With either Fix A (clipped score is globally Lipschitz) or Fix B (globally smoothed $f$ so $\psi''$ is bounded), the bound becomes correct and the subsequent “absorption” step is valid as written. See §III.B/§III.C.

===== 5) Bias-term algebra =====
* You write (just after \eqref{eq:risk-decomp}): \[ \frac{\E_Q[\psi]^2}{m(Q)^2}\ \le\ C\,\frac{\varepsilon^2\,\tau_\star}{I_\star^2} \;=\; C\,\varepsilon^2\,\tau_\star^3 \;=\; C\,\varepsilon^4/c. \] This chain is wrong. The score bound gives $\E_Q[\psi]^2\lesssim \varepsilon^2/\tau_\star$ (not $\varepsilon^2\tau_\star$) in the numerator, and $I_\star=\pi/\tau_\star$ entails $1/I_\star^2=\tau_\star^2/\pi^2$. The correct scaling is \[ \frac{\E_Q[\psi]^2}{m(Q)^2}\ \lesssim\ \varepsilon^2 \tau_\star, \] i.e. $O(\varepsilon^{8/3})$ when $\tau_\star=\varepsilon^{2/3}$. (This does not hurt the final rate comparison.)
* Drop-in correction: see §III.E.

==== II. Minor issues, typos, constants ====
# Typo in Theorem statement. “for all $n\ge 1$ and $\varepsilon\in(0,\tfrac12]$,s” → trailing “s” should be removed. (§III.F)
# Lemma \ref{lem:info-W2}, Gaussian integral constant. You write “$\int_0^\infty K'(y)^2/K(y)\,dy=1$” but use $\pi/\tau$ downstream.
The correct value is \[ \int_0^\infty \frac{K'(y)^2}{K(y)}\,dy=\int_0^\infty y^2 K(y)\,dy=\frac{1}{2}. \] With $K(0)=1/\sqrt{2\pi}$, this yields $I(\tau)=\pi/\tau$ as claimed. (§III.G)
# Language about “edge strips of width $\tau$”. It is true that $\nu_{\theta,\tau}$ assigns mass $\tau$ to each side beyond $\theta\pm(\tfrac12-\tau)$, but the tails extend to $\pm\infty$. When discussing supports of derivatives, say “outside the bulk” rather than “on strips of width $\tau$”.
# “Strictly decreasing $C^1$ score”. For the present $f$, $\psi_\tau$ is continuous and piecewise linear in $u$ with a derivative jump at the junctions; it is not $C^1$. Replace by “continuous, nonincreasing, and piecewise $C^1$” (or adopt Fix B so it is $C^1$). (§III.H)

==== III. Drop-in LaTeX fixes ====
Below are literal replacements you can paste into your file.

===== III.A. Replace Lemma \ref{lem:dyn-W2} by: =====
<syntaxhighlight lang="latex">
\begin{lemma}[Displacement $W_2$ control of expectation differences]\label{lem:dyn-W2}
Let $P,Q$ be Borel probability measures on $\R$ with finite second moments,
and let $T$ be the monotone transport pushing $P$ to $Q$.
For $t\in[0,1]$ set $T_t(x)=(1-t)x+tT(x)$ and $X_t=T_t(X)$ with $X\sim P$,
and let $P_t$ denote the law of $X_t$.
Then, for any $C^1$ function $\varphi$ such that
$\int_0^1 \E_{P_t}[\varphi'(X_t)^2]\,dt<\infty$,
\[
\left|\E_Q[\varphi]-\E_P[\varphi]\right|
\le \left(\int_0^1 \E_{P_t}[\varphi'(X_t)^2]\,dt\right)^{1/2}\, W_2(P,Q).
\]
\end{lemma}
\begin{proof}
By the fundamental theorem of calculus and the chain rule,
\[
\E_Q[\varphi]-\E_P[\varphi]
=\int_0^1 \frac{d}{dt}\E[\varphi(X_t)]\,dt
=\int_0^1 \E\big[\varphi'(X_t)\cdot (T(X)-X)\big]\,dt.
\]
Applying Cauchy--Schwarz in $(t,\Omega)$ gives
\[
\left|\E_Q[\varphi]-\E_P[\varphi]\right|
\le \Big(\int_0^1 \E[\varphi'(X_t)^2]\,dt\Big)^{1/2}
\Big(\int_0^1 \E[(T(X)-X)^2]\,dt\Big)^{1/2}.
\]
Since $\int_0^1 \E[(T(X)-X)^2]\,dt=\E[(T(X)-X)^2]=W_2(P,Q)^2$, the claim follows.
Finally, $\E[\varphi'(X_t)^2]=\E_{P_t}[\varphi'(Z)^2]$ for $Z\sim P_t$, by definition of $P_t$.
\end{proof}
</syntaxhighlight>

===== III.B. Insert after (2.1) defining $\psi_\tau$: =====
<syntaxhighlight lang="latex">
\medskip\noindent\textbf{Clipped score.} For $B\ge 1$, define the clipped score
\[
\tilde\psi_{\tau,B}(u)\ :=\ \mathrm{clip}\big(\psi_\tau(u),\,[-B/\sqrt{\tau},\,B/\sqrt{\tau}]\big),
\]
and let $\tilde\psi'_{\tau,B}$ be any selection of its (a.e.) derivative; then
$\|\tilde\psi_{\tau,B}\|_\infty\le B/\sqrt{\tau}$ and
$\|\tilde\psi'_{\tau,B}\|_\infty\le C/\tau$ uniformly in $B$.
We henceforth use $\tilde\psi_{\tau_\star,B}$ in the estimating equation \eqref{eq:M-est},
and write $\hat\theta_\ep$ for any solution.
The clipping level $B$ is fixed (e.g. $B=10$); on the $W_2$-ball all bounds below are
uniform in $B$ and coincide with the unclipped estimator whenever no sample point
falls in the extreme tails.
</syntaxhighlight>

Replace Lemma \ref{lem:score-moments} by the following (stated and proved for the clipped score):
<syntaxhighlight lang="latex">
\begin{lemma}[Uniform control of clipped score moments]\label{lem:score-moments-clipped}
Fix $\tau\in(0,\tfrac12]$ and let $\tilde\psi_\tau:=\tilde\psi_{\tau,B}$ be the clipped score defined above.
There exist constants $C_1,C_2,C_3<\infty$ (independent of $B$) such that for all $Q$
with $W_2(Q,\mu_\theta)\le \ep$,
\begin{align}
\left|\E_Q[\tilde\psi_\tau]-\E_{\nu_{\theta,\tau}}[\tilde\psi_\tau]\right|
  &\le C_1\,\frac{\ep}{\sqrt{\tau}},\label{eq:bias-score-clipped}\\
\left|\E_Q[\tilde\psi'_\tau]-\E_{\nu_{\theta,\tau}}[\tilde\psi'_\tau]\right|
  &\le C_2\,\frac{\ep}{\sqrt{\tau}},\label{eq:curv-score-clipped}\\
\left|\E_Q[\tilde\psi_\tau^2]-\E_{\nu_{\theta,\tau}}[\tilde\psi_\tau^2]\right|
  &\le C_3\,\frac{\ep}{\tau^{3/2}}.\label{eq:var-score-clipped}
\end{align}
\end{lemma}
\begin{proof}
Apply Lemma~\ref{lem:dyn-W2} with $P=\nu_{\theta,\tau}$, $\varphi=\tilde\psi_\tau$,
$\varphi=\tilde\psi'_\tau$, and $\varphi=\tilde\psi_\tau^2$, respectively.
Since $\|\tilde\psi'_\tau\|_\infty\lesssim \tau^{-1}$ and
$\|(\tilde\psi_\tau^2)'\|_\infty\lesssim \tau^{-3/2}$, we have
\[
\int_0^1 \E_{P_t}[(\tilde\psi'_\tau(X_t-\theta))^2]\,dt \lesssim \tau^{-1},\quad
\int_0^1 \E_{P_t}[((\tilde\psi_\tau^2)'(X_t-\theta))^2]\,dt \lesssim \tau^{-3},
\]
uniformly over $Q$ with $W_2(Q,\mu_\theta)\le \ep$.
The claim follows using $W_2(Q,\nu_{\theta,\tau})\le \ep + W_2(\mu_\theta,\nu_{\theta,\tau})$
and Lemma~\ref{lem:info-W2}.
\end{proof}
</syntaxhighlight>

Adjust the M-estimation remainder bounds: replace all occurrences of $\psi,\psi',\psi''$ by $\tilde\psi_\tau,\tilde\psi'_\tau$, and remove every use of $\|\psi''\|_\infty$.
Where you currently use Lipschitz in $t$ via $\|\psi''\|_\infty$, replace it by the uniform bound $\|\tilde\psi'_\tau\|_\infty\lesssim\tau^{-1}$ and the simple inequality \[ |m_n(\bar t)-m_n(\theta)| \le \frac{1}{n}\sum_{i=1}^n\big|\tilde\psi'_\tau(X_i'-\bar t)-\tilde\psi'_\tau(X_i'-\theta)\big| \le \|\tilde\psi'_\tau\|_\mathrm{Lip}\,|\bar t-\theta| \lesssim \tau^{-1}\,|\bar t-\theta|. \] Finally, every appearance of $\E_Q[\psi^4]$ can be replaced by $\|\tilde\psi_\tau\|_\infty^4\lesssim \tau^{-2}$, uniformly over the $W_2$-ball.

===== III.C. Global smoothing (alternative to clipping) =====
If you prefer to avoid clipping, replace \eqref{eq:nu-density} by a globally smoothed bump (e.g. convolution with a compactly supported $C^\infty$ kernel $\kappa_\tau$):
<syntaxhighlight lang="latex">
\medskip
Define $\kappa\in C^\infty_c(\R)$, $\kappa\ge0$, $\int \kappa=1$, and set
$\kappa_\tau(x)=\tau^{-1}\kappa(x/\tau)$. Let
\[
f(x;\theta,\tau) = \big(\mathbf{1}_{[\theta-\frac12,\ \theta+\frac12]} * \kappa_\tau\big)(x).
\]
Then $f(\cdot;\theta,\tau)\in C^\infty$, strictly log-concave, and
$\psi_\tau(u)=-\partial_u\log f(u)$ together with its derivatives satisfy
$\|\psi_\tau\|_\infty\lesssim \tau^{-1/2}$, $\|\psi_\tau'\|_\infty\lesssim \tau^{-1}$,
$\|\psi_\tau''\|_\infty\lesssim \tau^{-3/2}$.
All calculations below carry through with the same $\tau$-scalings
\eqref{eq:bias-score}--\eqref{eq:var-score}.
</syntaxhighlight>
With this modification, Lemma \ref{lem:score-moments} (your original statement) becomes correct; you can keep it as-is but replace the justification paragraph by the above $C^\infty$-bounds rather than “edge strips” heuristics.
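To make concrete why $\E_Q[\psi^4]$ has to be replaced by a sup-norm bound, here is a minimal sketch of the far-mass perturbation mentioned in item 2(d). It assumes the unclipped tail formula for $\psi_\tau$ from item 2(a); the auxiliary names $\xi$, $\delta$, $R$ and $Q_R$ are illustrative and do not appear in your manuscript.
<syntaxhighlight lang="latex">
% Sketch under stated assumptions: X ~ mu_theta (uniform on [theta-1/2, theta+1/2]),
% R > 1, and delta := eps^2 / R^2 (so delta <= 1). Names xi, delta, R, Q_R are hypothetical.
Let $X\sim\mu_\theta$, let $\xi\sim\mathrm{Bernoulli}(\delta)$ be independent of $X$,
and let $Q_R$ be the law of $X'=X+R\,\xi$. The coupling $(X,X')$ gives
\[
W_2^2(Q_R,\mu_\theta)\ \le\ \E[(X'-X)^2]\ =\ \delta R^2\ =\ \ep^2 .
\]
On $\{\xi=1\}$ we have $X'-\theta\ge R-\tfrac12$, hence
$|X'-\theta|-(\tfrac12-\tau)\ge R-1$ and
\[
|\psi_\tau(X'-\theta)|\ \ge\ \frac{R-1}{4\tau^2K(0)^2}.
\]
Therefore
\[
\E\big[\psi_\tau(X'-\theta)^4\big]\ \ge\ \delta\,\frac{(R-1)^4}{256\,\tau^8K(0)^8}
\ =\ \frac{\ep^2\,(R-1)^4}{256\,R^2\,\tau^8K(0)^8}\ \longrightarrow\ \infty
\quad (R\to\infty),
\]
whereas the clipped score satisfies
$\E_Q[\tilde\psi_{\tau,B}^4]\le B^4/\tau^2$ for every $Q$.
</syntaxhighlight>
The point of the sketch is only that the $W_2$ constraint fixes $\delta R^2$ while the fourth moment scales like $\delta R^4$, so no uniform bound is available without clipping or an extra moment assumption.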
===== III.D. Replace the sentence right after \eqref{eq:M-est} by: =====
<syntaxhighlight lang="latex">
Let $\hat\theta_\ep$ be any measurable selection from the (nonempty) solution set of \eqref{eq:M-est}.
For the globally smoothed model (Remark~\ref{rem:global-smooth}) this solution is unique
by strict concavity of the log-likelihood; for the flat-top model the solution need not be
unique on the event $\max_i X_i'-\min_i X_i'\le 1-2\tau_\star$, in which case we choose
the midpoint of the interval of roots.
</syntaxhighlight>
(If you adopt III.C, add a short Remark saying uniqueness holds a.s. because the log-likelihood is strictly concave in $t$.)

===== III.E. Replace the bias line by: =====
<syntaxhighlight lang="latex">
By Lemma~\ref{lem:score-moments-clipped}, $|\E_Q[\psi]|\lesssim \ep/\sqrt{\tau_\star}$ and
$m(Q)\asymp I_\star=\pi/\tau_\star$, hence
\[
\frac{\E_Q[\psi]^2}{m(Q)^2}\ \lesssim\ \frac{\ep^2/\tau_\star}{(\pi/\tau_\star)^2}
\ =\ \frac{\ep^2\,\tau_\star}{\pi^2}
\ =\ O\!\big(\ep^{8/3}\big)\quad\text{when } \tau_\star=\ep^{2/3}.
\]
</syntaxhighlight>
(This is of smaller order than the variance term $C/(n I_\star)$ in the regime where the IDS term dominates.)

===== III.F. Replace line in Theorem \ref{thm:IDS-sharp} by: =====
<syntaxhighlight lang="latex">
There exists a constant $C<\infty$ such that, for all $n\ge 1$ and $\ep\in(0,\tfrac12]$,
</syntaxhighlight>

===== III.G. Replace the end of the computation of $I(\tau)$ by: =====
<syntaxhighlight lang="latex">
=\frac{1}{\tau K(0)^2}\int_0^\infty \frac{K'(y)^2}{K(y)}\,dy
=\frac{1}{\tau K(0)^2}\int_0^\infty y^2 K(y)\,dy
=\frac{1}{\tau K(0)^2}\cdot \frac{1}{2}
=\frac{\pi}{\tau},
</syntaxhighlight>
and modify the explanatory sentence to:
<syntaxhighlight lang="latex">
where we used $K(0)=1/\sqrt{2\pi}$ and the Gaussian identity $\int_0^\infty y^2 K(y)\,dy=\tfrac12$.
</syntaxhighlight>

===== III.H. Replace (A1) by a version that is true under either III.B or III.C: =====
<syntaxhighlight lang="latex">
\item[(A1)] The score $\psi$ is continuous, nonincreasing, and piecewise $C^1$ with bounded
derivative; in particular $\|\psi\|_\infty<\infty$ and $\|\psi'\|_\infty<\infty$.
% (Under the globally smoothed model, $\psi\in C^2$ and $\|\psi''\|_\infty<\infty$ as well.)
</syntaxhighlight>
If you adopt III.C, you may keep the stronger $C^2$ version and the uses of $\|\psi''\|_\infty$; if you adopt III.B (clipping), simply remove every use of $\|\psi''\|_\infty$ and replace it with the Lipschitz bound in $\psi'$ as indicated above.

==== With either fix (clipping or global smoothing), you recover: ====
* $m(Q)=I(\tau)+O(\varepsilon/\sqrt{\tau})$ uniformly on the $W_2$-ball,
* $\Var_Q(\psi)\le I(\tau)+O(\varepsilon/\tau^{3/2})$,
* all remainder terms are $O\big(\tau/n\big)$ or smaller,
so optimizing at $\tau_\star\asymp \varepsilon^{2/3}$ yields the stated \[ \sup_{Q:W_2(Q,\mu_\theta)\le\varepsilon}\E_Q[(\hat\theta_\varepsilon-\theta)^2] \ \lesssim\ \frac{\varepsilon^{2/3}}{n}\ \vee\ \Big(\varepsilon^2+\tfrac{1}{2(n+1)(n+2)}\Big). \] (The only change is the corrected bias calculation; its $O(\varepsilon^{8/3})$ term is of smaller order than $\varepsilon^{2/3}/n$ whenever that term dominates $\varepsilon^2$, and is therefore harmless for the rate comparison.)
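The arithmetic behind the choice $\tau_\star\asymp\varepsilon^{2/3}$ can be laid out in a few lines. This is a sketch up to constants: it assumes the cubic scaling $W_2^2(\mu_\theta,\nu_{\theta,\tau})=c\,\tau^3$ from Lemma \ref{lem:info-W2} and $I(\tau)=\pi/\tau$, and takes $\tau_\star=\varepsilon^{2/3}$ exactly for readability (your manuscript may carry an extra constant in $\tau_\star$).
<syntaxhighlight lang="latex">
% Sketch, up to constants. Assumptions: W_2^2(mu_theta, nu_{theta,tau}) = c tau^3,
% I(tau) = pi/tau, and tau_star = eps^{2/3} (any constant in tau_star is ignored here).
\begin{align*}
W_2(\mu_\theta,\nu_{\theta,\tau_\star}) &= \sqrt{c}\,\tau_\star^{3/2} = \sqrt{c}\,\ep
  && \text{(smoothing cost is of order } \ep\text{)},\\
\frac{1}{n\,I(\tau_\star)} &= \frac{\tau_\star}{\pi n} = \frac{\ep^{2/3}}{\pi n}
  && \text{(variance term)},\\
\frac{\ep^2/\tau_\star}{(\pi/\tau_\star)^2} &= \frac{\ep^2\,\tau_\star}{\pi^2} = \frac{\ep^{8/3}}{\pi^2}
  && \text{(squared-bias term)},\\
\frac{\ep^{8/3}/\pi^2}{\ep^{2/3}/(\pi n)} &= \frac{n\,\ep^2}{\pi}
  && \text{(bias-to-variance ratio)}.
\end{align*}
% The ratio is at most n^{-1/2}/pi whenever eps <= n^{-3/4},
% which is exactly the range where eps^{2/3}/n >= eps^2.
</syntaxhighlight>
So in the range where $\varepsilon^{2/3}/n$ is the larger of the two rate terms, the squared bias is dominated by the variance term, consistent with the remark above.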
==== Optional remarks ====
* In Lemma \ref{lem:info-W2} you can slightly streamline the $W_2$ computation by the change of variables $q=(1-2\tau)+2\tau S(u)$, which gives the “cubic constant” in one line: \[ W_2^2(\mu_\theta,\nu_{\theta,\tau}) = 8\tau^3 \int_0^\infty \Big(S(u)-\tfrac12 - K(0)u\Big)^2\,K(u)\,du, \] from which your expression for $c$ follows after evaluating $\int u^2K(u)\,du$ and $\int u(2S(u)-1)K(u)\,du$.
* If you prefer to state the theorem for the unclipped score and the present $f$, add a succinct assumption that rules out adversarial fourth-moment explosions, e.g. “assume $Q$ has finite fourth moment uniformly on the $W_2$-ball” or “assume $Q$ is dominated by a sub-Gaussian envelope,” but this weakens the minimax statement. The clipping (III.B) avoids this altogether.

==== Summary checklist ====
# Fix Lemma \ref{lem:dyn-W2} (velocity field): §III.A.
# Replace Lemma \ref{lem:score-moments} by a valid version. Easiest is clipping (§III.B) or use a globally smoothed $f$ (§III.C).
# Acknowledge/resolve non-uniqueness (tie-break or global smoothing): §III.D.
# Correct the bias-term algebra (use $\varepsilon^2\tau_\star$, not $\varepsilon^2\tau_\star^3$): §III.E.
# Fix constants/typos (§III.F–G), and relax (A1) as per §III.H.
With these amendments, the proof becomes sound and delivers the claimed sharp IDS rate $\varepsilon^{2/3}/n$ (up to constants).
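On the non-uniqueness item of the checklist: the quantity $(1-2\tau)^n$ is the probability that every sample lies in the flat window around $\theta$, which is a sub-event of the range event $\max_i X_i'-\min_i X_i'\le 1-2\tau$. If you want the exact probability of the range event under $Q=\mu_\theta$, the following sketch may help. It assumes i.i.d. uniform samples and uses the classical density of the sample range; the symbols $U_i$, $R_n$ and $r$ are introduced here for illustration only.
<syntaxhighlight lang="latex">
% Sketch under stated assumptions: U_1,...,U_n i.i.d. uniform on [0,1]
% (that is, Q = mu_theta after shifting by theta - 1/2), n >= 2, and r := 1 - 2 tau.
The range $R_n=\max_i U_i-\min_i U_i$ has density
$n(n-1)\,\rho^{\,n-2}(1-\rho)$ on $[0,1]$, so
\[
\Pr(R_n\le r)=\int_0^r n(n-1)\,\rho^{\,n-2}(1-\rho)\,d\rho
= n\,r^{\,n-1}-(n-1)\,r^{\,n}.
\]
For comparison,
\[
\Pr\big(|U_i-\tfrac12|\le \tfrac r2\ \ \forall i\big)=r^{\,n},
\qquad
n\,r^{\,n-1}-(n-1)\,r^{\,n}-r^{\,n}=n\,r^{\,n-1}(1-r)\ \ge\ 0 .
\]
% Example: n = 2 gives Pr(R_2 <= 1 - 2 tau) = (1 - 2 tau)(1 + 2 tau) = 1 - 4 tau^2.
</syntaxhighlight>
Either expression is strictly positive for $\tau<\tfrac12$, which is all the non-uniqueness argument needs.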