
Inverse Tone Mapping · A New Approach

SDR → HDR: Use AI Only as a Light Meter

The previous system used rule-based scene recognition to decide where the light sources are. It was accurate, but it always missed some. This approach changes the premise: use the EXR output of the LTX HDR model as a gain map that describes what should be bright and by how much, then apply that gain only to the pixels of the original SDR image. We keep the AI's judgment without inventing any image detail.

LTX HDR · Linear EXR output | Gain map = LTX luminance ÷ SDR luminance | Original detail preserved | F_APC adaptive chroma | Native 4K · Rec.2020 PQ | No rules · No missed detections

01 · Where we left off

Why change the approach?

The previous region-based system used YOLOE open-vocabulary detection to classify each pixel as emissive, reflective, or ordinary, then applied a different lift to each class. Its content awareness was already fine-grained, but the rules themselves were the weak point:

🕳️

Incomplete rules → missed detections

Prompts, thresholds, and morphology are all hand-authored. Unusual emitters, complex reflections, and unfamiliar materials fall outside the rules and fail to brighten, which is the most common complaint in real use.

🧩

Coupled stages: tune one, disturb them all

Detection, classification, tone mapping, and temporal processing all interact. One bad-looking result may stack four separate failures, which makes parameter tuning a balancing act.

🤖

AI happens to excel at understanding the image

Models like LTX make broad, accurate semantic judgments about what should be bright, which fills exactly the gap the rules leave. The remaining question is how to use that ability without inheriting its side effects.

More fundamentally: rules enumerate a closed world, but real scenes are open-ended. You can never list every combination of light source, material, and reflection. Each uncovered case sends you back to add prompts, tweak thresholds, and patch morphology, like whack-a-mole. The more complete the system tries to be, the more tightly its rules couple, and maintenance cost snowballs. In other words, the bottleneck is not that one rule is too weak but the very idea of enumerating an open world with rules. Rather than patching endlessly, bring in a model that natively understands image content to answer "what should be bright?" That is the motivation for introducing LTX HDR.

Background · About LTX HDR

What is LTX HDR?

LTX HDR is the "light meter" we bring in. It is not a standalone product but an HDR IC-LoRA (In-Context LoRA) extension, released as a community beta, built on the open-source video generation model LTX-Video (Lightricks' DiT video model, known for fast inference). An IC-LoRA anchors generation to a reference image supplied in context; here it learns the SDR → HDR mapping. Given an SDR frame, it "imagines" how that frame should look in HDR and outputs a linear-light EXR (scene-referred, with luminance allowed above 1.0), in effect a reference answer that carries real highlight luminance.

Why is its judgment worth borrowing? Because it has "seen" vast amounts of real imagery. How bright a lamp, a window, or a patch of sky should be is a prior it learned from data, not a threshold someone wrote by hand. That fills exactly the open-world gap that rules leave: for scenes the rules never enumerated, it can usually still give a sensible luminance call. The cost is just as clear: it is, after all, a generative model that repaints the image (see the three cards below).

🧠

Its strength

The model relies on semantic understanding to judge what should be bright and by how much. It covers complex light sources and materials that rules cannot enumerate, with very few missed detections.

🎞️

Its side effects

It is still a generative model: it may rewrite or invent textures and objects, its resolution is limited (about 800p in our tests), and its frame-by-frame judgments can flicker.

💡

How we use it

Take its EXR output only as a luminance reference, extract a gain map from it, and discard every pixel it generated. Keep the judgment; throw away the side effects.

02 · Three routes

Same clip, three routes head to head

There are three ways to convert the same indoor SDR clip to HDR. The central tension fits in one sentence: can we decide what should be bright and still preserve the authentic image detail?

Dimension	Route A · Region detection (rules)	Route B · Direct LTX output	Route C · LTX gain map ✦
What should be bright?	Rule-based detection; prone to misses	AI semantics; accurate and comprehensive	AI semantics; accurate and comprehensive (borrowed from LTX)
Image detail	Original and authentic	Rewritten or invented by the generative model	Original and authentic (gain changes luminance only)
Resolution	Native (4K)	Model-limited (~800p)	Native 4K
Temporal stability	Moderate (detection flicker)	Poor (frame-by-frame generation flicker)	Moderate → good (gain is a low-frequency field, suitable for temporal filtering)
Control / explainability	High	Low (black-box generation)	High (the gain field can be visualized and limited)

Core insight: the two fatal flaws of direct LTX output, rewritten detail and low resolution, both come from letting it generate pixels. We do not need its pixels; we need only its luminance judgment. Extract how bright each area should be, discard the generated texture, and both problems disappear.

03 · How it works

Gain map: borrow only LTX's luminance judgment

A phone HDR photo is essentially an SDR base layer plus a gain map. We run the same idea in reverse and let LTX predict that gain map.

gain map = LTX luminance ÷ SDR luminance (then exposure normalization · confidence gating · guided smoothing · luminance ceiling)

HDR result = original SDR RGB × gain map

LTX decides only where to lift and by how much. Every real pixel, color, and texture comes from the original SDR image, so it is mathematically impossible for invented detail to enter. (Whether saturation is then enhanced is a separate, switchable step; see "Chroma enhancement" below.)
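The two formulas above can be sketched in a few lines of NumPy. This is a minimal illustration, not the pipeline's actual code: it assumes both images are already linear-light RGB and geometrically aligned, uses the Rec.709 luma weights as the luminance definition, and the function names and the `eps` floor are hypothetical.

```python
import numpy as np

# Rec.709 luminance weights for linear RGB (assumed luminance definition).
REC709_LUMA = np.array([0.2126, 0.7152, 0.0722])

def luminance(rgb: np.ndarray) -> np.ndarray:
    """Relative luminance of a linear RGB image with shape (H, W, 3)."""
    return rgb @ REC709_LUMA

def compute_gain_map(ltx_rgb: np.ndarray, sdr_rgb: np.ndarray,
                     eps: float = 1e-4) -> np.ndarray:
    """Per-pixel gain = LTX luminance / SDR luminance, shape (H, W)."""
    # eps (hypothetical) avoids division by zero in pure-black SDR pixels.
    return luminance(ltx_rgb) / np.maximum(luminance(sdr_rgb), eps)

def apply_gain_map(sdr_rgb: np.ndarray, gain: np.ndarray) -> np.ndarray:
    """Scale all three SDR channels by the same gain.

    Because every channel of a pixel gets the same factor, the ratio
    between channels (chromaticity) is unchanged; only luminance moves.
    """
    return sdr_rgb * gain[..., None]
```

Only `gain`, a single-channel field, carries information from LTX into the result; the RGB values that get scaled are always the source's own.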

IN·a · SDR source: 16-bit ingest preserves master detail
IN·b · LTX EXR: linear HDR, read directly with no gamma
Geometric alignment: width match + reverse center crop
Compute gain map: LTX ÷ SDR luminance ratio
Normalize · gate · smooth: remove global exposure / guardrails / guided filtering
Apply to original: upsampled gain × 4K SDR
Chroma enhance: Rec.2020 linear domain · F_APC (tunable dmax, can be switched off)
OUT · HDR10 / JXL: Rec.2020 PQ

The pipeline you can see: one real shot, end to end

The shot is from ASC StEM2 at the 4-second mark (a glowing device held in a dark room). Note that the three photographs are structurally identical: the pixels always come from the source, and only luminance and chroma change. The gain map in the middle is the core of the approach. The AI concentrates the lift on the real light sources (flashlight / green tube) and barely touches the shadows, so it does not "invent" detail out of nothing.

01 · SDR source: dark, limited dynamic range

→

02 · LTX luminance estimate: the model proposes how bright each area should be

→

03 · Constrained gain map: red = most lift, landing only on light sources

→

04 · Full-workflow result: gain applied to the source + F_APC chroma

The photographs show the HDR result tone-mapped back to SDR with a fixed tone map (Reinhard@200 nit), purely for comparison on ordinary screens; on a real HDR display the flashlight peaks at several hundred nits. The gain-map heat map is computed per pixel as the luminance ratio of the result to the source.

Why this works

EXR is linear and scene-referred, so it carries genuine super-white highlight values, which is exactly the answer to "how bright should this be?"
In testing, the LTX-to-SDR midtone luminance ratio is about 1.17, with a structural correlation of 0.86. The structures agree, so the gain map is meaningful.
The gain map is a smooth, low-frequency field; upsampling it and applying it at 4K costs the source none of its detail.

Key guardrails

Exposure normalization: use the median of the midtone ratio to remove LTX's global exposure drift, keeping only the semantic lift.
Luminance-domain ceiling: cap by output nits rather than by gain ratio. This avoids suppressing genuine sources in dark regions, keeps highlights off the clamp, and eliminates highlight color shifts.
Shadow-noise protection: suppress gain in dark regions by default while letting genuine bright sources override, so SDR noise is not amplified.
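The exposure-normalization guardrail can be sketched as follows. This is an illustrative version only: the midtone band (`mid_lo`, `mid_hi`) and the function name are assumptions, not values taken from the pipeline.

```python
import numpy as np

def normalize_exposure(gain: np.ndarray, sdr_lum: np.ndarray,
                       mid_lo: float = 0.1, mid_hi: float = 0.6) -> np.ndarray:
    """Divide out the median midtone gain so the global offset becomes 1.0.

    gain    : raw LTX/SDR luminance ratio, shape (H, W)
    sdr_lum : SDR relative luminance, shape (H, W)
    mid_lo, mid_hi : hypothetical bounds of the "midtone" band
    """
    midtones = (sdr_lum >= mid_lo) & (sdr_lum <= mid_hi)
    if not midtones.any():
        return gain  # nothing to measure; leave the gain unchanged
    global_ratio = float(np.median(gain[midtones]))
    return gain / max(global_ratio, 1e-6)
```

With a measured midtone ratio of about 1.17, midtone pixels end up near gain 1.0 after this step, and only regions that LTX lifted more than the global average keep a gain above 1.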

One level deeper: how the gain is constrained

output ceiling = min(LTX target luminance, 4.5), in multiples of reference white; 4.5 × 203 nit ≈ 914 nit

gain ceiling = output ceiling ÷ SDR luminance (automatically larger in shadows, tighter in highlights)

The key is that the ceiling is defined in output nits, not as a gain ratio. A genuine source sitting in shadow is therefore not pinned to a small multiplier, while highlights stay off the per-channel clamp (the root cause of highlight hue shift). Shadows get a separate keep gate: it suppresses noise by default, but a sufficiently bright LTX target can override it, so a real lamp in a dark background is not crushed along with the noise. The 4.5 × 203 nit in the example is the original configuration. Both the ceiling and the reference white are now parameters: the web tool defaults to a 120 nit reference white × 4.0 headroom = 480 nit peak, a setting calibrated against a professional HDR master (full-film mean luminance 1.16× the master, highlight peaks 0.93×).
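A rough sketch of the ceiling and the keep gate, under stated assumptions: luminances are in multiples of reference white, the gate is a hard threshold (the real gate may well be soft), and `shadow_thresh`, `override_thresh`, and the function names are hypothetical.

```python
import numpy as np

def constrain_gain(gain, sdr_lum, ltx_lum, headroom=4.5,
                   shadow_thresh=0.02, override_thresh=0.5, eps=1e-4):
    """Cap the gain by output luminance and gate it in dark regions."""
    # Ceiling defined on the OUTPUT luminance, not on the gain ratio.
    out_ceiling = np.minimum(ltx_lum, headroom)
    # Dividing by SDR luminance makes the gain cap large in shadows
    # and tight in highlights.
    gain_ceiling = out_ceiling / np.maximum(sdr_lum, eps)
    limited = np.minimum(gain, gain_ceiling)
    # Keep gate: dark pixels get no lift unless LTX says they are bright.
    dark = sdr_lum < shadow_thresh
    override = ltx_lum >= override_thresh
    return np.where(dark & ~override, np.minimum(limited, 1.0), limited)

def peak_nits(reference_white_nits: float, headroom: float) -> float:
    """Peak output in nits for a given reference white and headroom."""
    return reference_white_nits * headroom
```

For the two configurations quoted in the text, `peak_nits(203, 4.5)` gives 913.5 (the "≈ 914 nit" figure) and `peak_nits(120, 4.0)` gives 480.0.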

Deep dive · Chroma enhancement

After luminance, color has to follow

The gain map touches only luminance, which is the foundation of "nothing invented." But that leaves a gap: SDR color is graded within the small Rec.709 gamut, and when it is placed unchanged into an HDR container it looks noticeably plainer than a professionally graded HDR master. Benchmarking our full film frame by frame against the official ASC StEM2 HDR master (ICtCp domain, luminance-normalized saturation), the luminance-only output reaches only 90% of the master's saturation. The pipeline therefore adds an independent chroma-enhancement step after luminance. It is injected in the linear domain, after the Rec.709→Rec.2020 gamut conversion and before PQ encoding. The luminance gain-map path is not changed by a single byte, and the step can be switched off at any time (with it off, the output is bit-identical to the plain output).

per-pixel factor s = 1 + dmax × w, where w = hero color × gamut headroom × luminance coupling × skin protection × highlight rolloff

In ICtCp: (Ct, Cp) × s, I unchanged. More saturation, same hue, no new colors.

Our in-house F_APC (Adaptive Perceptual Chroma) uses five multiplicative gates to decide, per pixel, how much further the chroma can be pushed. The gamut-headroom gate attenuates according to how much distance remains to the BT.2020 / valid-range boundary, so by construction it never clips. The "nothing invented" boundary also stays clear: chroma is scaled along a single direction only, same hue but more saturated.
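The scaling step itself is small once the gates exist. The sketch below assumes the image has already been converted to ICtCp and that the five gate maps (each in [0, 1]) have been computed elsewhere; how F_APC actually derives each gate is not described here, so the gates are simply inputs, and the function name is hypothetical.

```python
import numpy as np

def apply_chroma_factor(ictcp: np.ndarray, gates, dmax: float = 0.4) -> np.ndarray:
    """Scale (Ct, Cp) by s = 1 + dmax * w, leaving I untouched.

    ictcp : array of shape (H, W, 3) holding I, Ct, Cp
    gates : sequence of (H, W) maps in [0, 1]; w is their product
    """
    w = np.prod(np.stack(gates, axis=0), axis=0)   # multiplicative gating
    s = 1.0 + dmax * w                             # per-pixel chroma factor
    out = ictcp.copy()
    out[..., 1:] *= s[..., None]                   # Ct and Cp only
    return out
```

Because Ct and Cp are multiplied by the same positive factor, the hue angle `atan2(Cp, Ct)` is unchanged, and `dmax = 0` returns the input as-is.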

off

No chroma enhancement; after the luminance gain map, the image is mapped to the container and output directly. This is the bit-reproducible baseline, about 10% under the master's saturation.

B · ICtCp ×1.3

A uniform global ×1.3 saturation lift (with highlight rolloff and skin protection). Simple and dependable but content-blind; measured at 1.15× the master over the full film, which is too vivid overall.

F_APC ✦

Adaptive: mid-saturation "hero colors" get the most push, while near-neutral and already-saturated pixels are held back. Brighter pixels are pushed more (the Hunt effect), skin tones move little, highlights roll off toward white, and gamut headroom acts as the backstop. A single dmax knob controls the overall strength.

Calibration: sweep dmax against a professional master

F_APC first shipped with dmax = 1.0, and the full-film benchmark showed an overshoot: saturation reached 1.17× the master. The direction was right; the force was too strong. So we swept dmax on a base with no chroma enhancement, benchmarking each step against the master:

dmax	Saturation / master	Reading
0 (off)	0.90×	Luminance-only output: ~10% under-saturated
0.4 (default ✦)	0.99×	Almost coincides with the master
0.45	1.00×	Equally close; 0.4–0.5 all fall within ±1%
0.6	1.04×	Starting to over-saturate
1.0 (old default)	1.13×	Clear overshoot (1.17× measured in the full-film encode)

Conclusion: the default is dmax = 0.4, which puts saturation within about 1% of the professional master. The web tool exposes it as a slider (0–1, step 0.05), so anyone who wants a plainer or more vivid look can adjust it. The luminance path was calibrated with the same yardstick (full-film mean 1.16× the master, highlight peaks 0.93×), so both luminance and color track professional grading.

An honest finding: chroma enhancement aligns with the taste of professional grading, not with an algorithmic ground truth. The same F_APC (dmax = 1.0), run on the HDRTV1K academic benchmark, actually worsens the color difference ΔE_ITP from 18.5 to 21.1. That ground truth is not an artistic grade, so a generic saturation lift overshoots it, while against the professional master the same lift lands in the right range. Chroma strength has to follow the target grade, which is exactly why dmax is an adjustable parameter rather than a hard-coded constant.

Deep dive · Engineering

From one frame to a shippable full-length pipeline

Turning "it can produce one frame" into "it reliably produces a full clip that can ship" takes a few key pieces of engineering, all holding the same line: touch only luminance, never the original pixels.

Segment seam matching: removing brightness jumps at LTX joints

LTX generates long videos segment by segment, each independently. Even when a single shot is split by length, adjacent segments get different global exposure, and joining them leaves a visible brightness jump at the seam. The fix: on the stitched EXR sequence, multiply each segment by one global scale so that the last frame of one segment and the first frame of the next have continuous luminance (cumulative matching), then apply a frame-count-weighted normalization to remove overall drift. This mends only the seams and leaves the HDR gradation inside each segment alone. Because the correction happens at the EXR source, both delivery paths (Direct LTX and gain map) benefit.

Match on the median luminance of the whole frame: it is strictly scale-equivariant and reflects the visible midtones, so it is not skewed by shadow differences that PQ crushes below visibility.
Compare only the two frames that are actually adjacent at the seam, not a windowed median. This avoids non-boundary drift inside a segment and prevents over-correction.
Both the per-seam scale and the final segment scale are gently clamped (about 0.6–1.6 / 0.8–1.25), so a bad boundary cannot crush midtones or highlights.
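The seam-matching procedure can be sketched from per-frame medians alone. This is one possible reading of the description, not the shipped implementation: it assumes the 0.6–1.6 clamp applies per seam and the 0.8–1.25 clamp to the final scale, and it implements the frame-weighted normalization as a frame-weighted geometric mean (an assumption). All names are hypothetical.

```python
import numpy as np

def seam_scales(tail_medians, head_medians, frame_counts,
                seam_clamp=(0.6, 1.6), final_clamp=(0.8, 1.25)):
    """One global scale per segment so luminance is continuous at each seam.

    tail_medians[k] : median luminance of the LAST frame of segment k
    head_medians[k] : median luminance of the FIRST frame of segment k
    frame_counts[k] : number of frames in segment k
    """
    n = len(frame_counts)
    scales = np.ones(n)
    for k in range(1, n):
        # Ratio that makes the head of segment k match the tail of k-1.
        ratio = tail_medians[k - 1] / max(head_medians[k], 1e-6)
        ratio = float(np.clip(ratio, *seam_clamp))
        scales[k] = scales[k - 1] * ratio          # cumulative matching
    # Remove overall drift: frame-weighted mean of log-scale becomes zero.
    weights = np.asarray(frame_counts, dtype=float)
    log_mean = np.sum(weights * np.log(scales)) / np.sum(weights)
    scales = scales / np.exp(log_mean)
    return np.clip(scales, *final_clamp)
```

Each segment's EXR frames would then be multiplied by its scale. Since the scale is a single number per segment, the gradation inside a segment is untouched.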

Multi-format delivery: the right container for video and stills

Video

HDR10 / HEVC mp4 (hvc1 tag, QuickTime-friendly), plus an optional per-frame HDR JXL sequence (Rec.2020 PQ) for frame extraction.

Stills

UltraHDR JPG: our SDR base plus the gain map, written directly as a phone-style gain-map image, since this approach already has the UltraHDR shape. An HDR AVIF (Rec.2020 PQ, displayed inline in Chrome / Safari) is also produced.

Geometry

The gain map undoes whichever resize mode (stretch / fit / crop) was actually used to feed LTX, and is then upsampled onto the source for pixel-accurate registration.

Both paths (Direct LTX / gain map) share one HDR10 encoder. The whole pipeline is wrapped in a web queue service that runs from SDR to finished clip on a multi-GPU server.

04 · Interactive comparison

Drag the divider: SDR left ↔ HDR right

The same frame (indoor, at 11 seconds). The left side is always the original SDR image; the right side is a switchable HDR method. Drag the vertical divider to compare the same area in SDR and HDR, and use the buttons above to change the HDR method shown on the right.

The HDR image only truly "lights up" in an HDR-capable browser (Chrome / Safari) on an HDR display. In a non-HDR environment you can still check whether detail on the right has been rewritten, which is especially obvious with Direct LTX selected.


LTX gain map (selected): borrows LTX's luminance judgment while using only source pixels. Native 4K, authentic detail, and precise lift, all three at once. As you drag, the right side only becomes brighter; its structure is identical to the SDR on the left.

Another example · Bar scene (2 s 16 f)

Again SDR on the left and a switchable HDR on the right. This time three methods are compared on the same shot: region detection (rules), the LTX model (direct), and the full gain-map workflow (currently the best result, produced in one click by the web conversion tool, including segment seam matching).


LTX gain map · full workflow (best): borrows LTX's luminance judgment while using only source pixels, at native 1920×1080, with segment seam matching. As you drag, the right side only brightens; its structure is identical to the SDR on the left.

05 · Conclusion

Why choose the LTX gain map?

Put the three routes back against the original tension, deciding what should be bright × preserving authentic detail, and only Route C holds both ends:

Keep AI's strength

"What should be bright, and by how much" is handed to LTX's semantic understanding. No hand-written rules and far fewer misses, which directly addresses the previous system's weak point.

Discard AI's side effects

Keep only its luminance judgment and discard the pixels it generated. Detail, color, and resolution all come from the source, so invented detail is mathematically impossible.

Retain engineering control

The gain map can be visualized, limited, and temporally filtered. This continues the previous system's methodology of ground-truth evaluation plus single-frame diagnosis, turning the black box into something inspectable.

In one sentence: direct LTX output means "the AI repaints the image," region detection means "people write rules to brighten it," and the LTX gain map means "the AI acts only as a light meter while the original image brightens itself." That is the solution that is accurate, authentic, and workable at 4K.

06 · Still refining

Known limitations and next steps

Indoor scenes are already stable. The main open problem is frame-to-frame consistency outdoors.

Problem · Local outdoor flicker

Measured highlight flicker averages only 2.4% across the full film, but high-dynamic-range outdoor regions (sky, foliage, water highlights) show local flicker. The root cause is that LTX's frame-by-frame judgment of these high-frequency bright areas jitters, and outdoor shots often pan or dolly, so temporal smoothing at fixed pixel positions becomes misaligned.

Direction · Motion-compensated temporal filtering

The gain map is a low-frequency field, which makes it well suited to temporal processing. First estimate the motion between adjacent frames and warp the previous frame's gain into alignment, then apply a temporal median or EMA; this suppresses flicker strongly without smearing motion. A further step would run LTX only on keyframes and interpolate the in-between frames with optical flow, removing jitter at the source while also saving compute.
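Since this direction is still a plan, the following is only a sketch of what the warp-then-EMA step might look like. It assumes a backward flow field is already available from some optical-flow estimator (not shown), uses nearest-neighbour warping for brevity, and blends in the log-gain domain, which is a design choice made for this example rather than something the text specifies. All names are hypothetical.

```python
import numpy as np

def warp_backward(prev: np.ndarray, flow: np.ndarray) -> np.ndarray:
    """Align the previous gain map to the current frame.

    flow[y, x] = (dx, dy): where current pixel (x, y) was in the previous
    frame, as an offset. Nearest-neighbour lookup, clamped at the borders.
    """
    h, w = prev.shape
    ys, xs = np.mgrid[0:h, 0:w]
    src_x = np.clip(np.rint(xs + flow[..., 0]).astype(int), 0, w - 1)
    src_y = np.clip(np.rint(ys + flow[..., 1]).astype(int), 0, h - 1)
    return prev[src_y, src_x]

def temporal_ema(gain: np.ndarray, prev_filtered: np.ndarray,
                 flow: np.ndarray, alpha: float = 0.3) -> np.ndarray:
    """EMA of the gain in the log domain after motion compensation.

    alpha is the weight of the current frame; gains must be positive.
    """
    aligned = warp_backward(prev_filtered, flow)
    log_out = alpha * np.log(gain) + (1.0 - alpha) * np.log(aligned)
    return np.exp(log_out)
```

With correct flow, a bright source that moves between frames is averaged with its own previous gain rather than with the background, which is why the filter can be strong without blurring motion.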

The methodology stays the same: measure first to locate the problem (is it flicker or a real change, and in which segment?), then apply a targeted temporal solution and verify the suppression with nit/JXL comparisons, rather than tuning parameters blindly.
