(Brief notes) pixelNeRF: Neural Radiance Fields from One or Few Images




These notes only capture what I consider the key points; see the original paper for the details.

1. Highlights

Supports novel view synthesis from a sparse set of views, and even from a single view.

Key Point

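On top of NeRF, the method adds 2D image features from the input views. These features let the network learn a scene prior. This has two benefits: first, better generalization; second, faster convergence.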

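Besides adding image features, the NeRF part is also changed: the view direction is fed in at the very start of the network, because the authors argue that in the multi-view case, view directions could serve as a signal for the relevance and positioning of different views.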
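To make the "image feature" idea concrete, here is a minimal sketch of how a 3D query point could pick up a pixel-aligned feature: project the point into the input image, then bilinearly interpolate the feature map at that location. The pinhole convention (camera looking down +z), the border clamping, and the names `project` and `bilinear_sample` are assumptions for illustration, not taken from the official code.

```python
import math

def project(point_cam, focal, cx, cy):
    """Pinhole projection of a camera-space point (x, y, z) to pixel coords.

    Assumption: the camera looks down +z; real codebases differ in convention.
    """
    x, y, z = point_cam
    return (focal * x / z + cx, focal * y / z + cy)

def bilinear_sample(feat, u, v):
    """Bilinearly interpolate feat[row][col] (a list of channels) at (u, v)."""
    h, w = len(feat), len(feat[0])
    # Assumption: clamp so points that project off-image reuse border features.
    u = min(max(u, 0.0), w - 1.0)
    v = min(max(v, 0.0), h - 1.0)
    c0, r0 = int(math.floor(u)), int(math.floor(v))
    c1, r1 = min(c0 + 1, w - 1), min(r0 + 1, h - 1)
    du, dv = u - c0, v - r0
    out = []
    for k in range(len(feat[0][0])):
        top = (1 - du) * feat[r0][c0][k] + du * feat[r0][c1][k]
        bot = (1 - du) * feat[r1][c0][k] + du * feat[r1][c1][k]
        out.append((1 - dv) * top + dv * bot)
    return out

# Toy 2x2 feature map with a single channel.
feat = [[[0.0], [1.0]],
        [[2.0], [3.0]]]
u, v = project((0.0, 0.0, 2.0), focal=1.0, cx=0.5, cy=0.5)
feature = bilinear_sample(feat, u, v)
```

In this toy case the point projects to the center of the 2x2 grid, so the sampled feature is the mean of the four corner values.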

Coordinate system: viewer-centric 3D reconstruction

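This paper uses a viewer-centered coordinate system. Regarding object-centered coordinate systems, the authors note that although they make learning spatial regularities easier, using a canonical space inhibits prediction performance on unseen object categories and on scenes with multiple objects, where there is no predefined or well-defined canonical pose.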

2. Single-Image pixelNeRF



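For a query point $\mathbf{x}$ with view direction $\mathbf{d}$, the network is queried as

$f(\gamma(\mathbf{x}), \mathbf{d}; \mathbf{W}(\pi(\mathbf{x}))) = (\sigma, \mathbf{c})$

where $\gamma(\cdot)$ is a positional encoding applied to $\mathbf{x}$, $\pi(\mathbf{x})$ is the projection of $\mathbf{x}$ onto the image plane, and $\mathbf{W}$ is the feature volume extracted from the input image.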
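As a rough illustration of the positional encoding γ(·), here is a NeRF-style sketch that maps each coordinate to sines and cosines at exponentially increasing frequencies. The default of 6 frequencies, the π scaling, and the output ordering are assumptions here; implementations differ on these details, and some also append the raw coordinates.

```python
import math

def positional_encoding(x, num_freqs=6):
    """For each frequency 2^k * pi, emit sin then cos of every coordinate."""
    out = []
    for k in range(num_freqs):
        freq = (2.0 ** k) * math.pi
        out.extend(math.sin(freq * coord) for coord in x)
        out.extend(math.cos(freq * coord) for coord in x)
    return out

encoded = positional_encoding((0.0, 0.5, -0.25))
```

A 3D point becomes a vector of length 3 × 2 × `num_freqs`, which gives the MLP access to high-frequency variation in position.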

3. Incorporating Multiple Views

We denote the i-th input image as $\mathbf{I}^{(i)}$ and its associated camera transform from the world space to its view space as $\mathbf{P}^{(i)} = [\mathbf{R}^{(i)} \; \mathbf{t}^{(i)}]$.

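For a new target camera ray, we transform a query point $\mathbf{x}$, with view direction $\mathbf{d}$, into the coordinate system of each input view $i$ with the world-to-camera transform as $\mathbf{x}^{(i)} = \mathbf{P}^{(i)}\mathbf{x}$, $\mathbf{d}^{(i)} = \mathbf{R}^{(i)}\mathbf{d}$.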
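The world-to-view step can be sketched in a few lines. As I read the paper, the point gets the full rigid transform (rotation plus translation), x_view = R x + t, while the direction is only rotated, d_view = R d. The helper names and the example pose below are hypothetical.

```python
def mat_vec(m, v):
    """Multiply a 3x3 matrix (list of rows) by a 3-vector."""
    return [sum(m[r][c] * v[c] for c in range(3)) for r in range(3)]

def world_to_view(R, t, x, d):
    """Apply [R t] to a point and R alone to a direction."""
    x_view = [a + b for a, b in zip(mat_vec(R, x), t)]
    d_view = mat_vec(R, d)  # directions are rotated but never translated
    return x_view, d_view

# Example pose: 90-degree rotation about z, then a shift of 2 along z.
R = [[0.0, -1.0, 0.0],
     [1.0,  0.0, 0.0],
     [0.0,  0.0, 1.0]]
t = [0.0, 0.0, 2.0]
x_view, d_view = world_to_view(R, t, x=[1.0, 0.0, 0.0], d=[0.0, 1.0, 0.0])
```

Running this once per input view gives each view its own copy of the query point and direction, expressed in that view's camera frame.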

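We denote the initial layers of the NeRF network as $f_1$, which process the inputs in each input view's space separately, and the final layers as $f_2$, which process the aggregated views.

Intermediate vector for view $i$:

$\mathbf{V}^{(i)} = f_1\left(\gamma(\mathbf{x}^{(i)}), \mathbf{d}^{(i)}; \mathbf{W}^{(i)}(\pi(\mathbf{x}^{(i)}))\right)$

where $\mathbf{W}^{(i)}$ is the feature volume of the i-th input image, sampled at the projection $\pi(\mathbf{x}^{(i)})$ of the query point onto that image.

The intermediate vectors are then aggregated and passed through the final layers:

$(\sigma, \mathbf{c}) = f_2\left(\psi(\mathbf{V}^{(1)}, \ldots, \mathbf{V}^{(n)})\right)$

where $\psi$ is average pooling.

In the single-view special case, this simplifies to $f = f_2 \circ f_1$ ($f_1$ is applied first, then $f_2$).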
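The multi-view structure (per-view first layers, average pooling, shared final layers) can be sketched with toy stand-ins. Here `f1` and `f2` are hypothetical placeholders for the two halves of the MLP, not real network layers; only the pooling step mirrors the method as described above.

```python
def f1(x_view, d_view, feature):
    """Placeholder for the first layers: simply concatenates its inputs."""
    return list(x_view) + list(d_view) + list(feature)

def average_pool(vectors):
    """psi: element-wise mean over the per-view intermediate vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def f2(v):
    """Placeholder for the final layers: toy density and clamped color."""
    sigma = max(0.0, sum(v))
    color = tuple(min(1.0, max(0.0, c)) for c in v[:3])
    return sigma, color

def query(per_view_inputs):
    """per_view_inputs: list of (x_view, d_view, feature), one per input view."""
    intermediates = [f1(x, d, w) for x, d, w in per_view_inputs]
    return f2(average_pool(intermediates))

views = [([0.0, 0.0, 1.0], [0.0, 0.0, 1.0], [0.25]),
         ([0.0, 0.0, 3.0], [0.0, 0.0, 1.0], [0.75])]
pooled = average_pool([f1(*v) for v in views])
sigma, color = query(views)
```

Because averaging a single vector returns it unchanged, calling `query` with one view is the same as applying `f1` and then `f2`, which matches the single-view special case.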

4. Experiments


Table 3: comparison with SOTA.
Table 3: gains from local features and view directions.