NeRF with VTK

Hello guys.

I want to reconstruct a 3D volume from screenshots of a VTK/ParaView renderer view.
To do this, I need to generate my own transforms.json file, which contains a camera pose (camera-to-world) matrix for each view plus some other parameters.

Here is an example of the JSON file:

{
  "camera_angle_x": 0.7481849417937728,
  "camera_angle_y": 1.2193576119562444,
  "fl_x": 1375.52,
  "fl_y": 1374.49,
  "k1": 0.0578421,
  "k2": -0.0805099,
  "p1": -0.000980296,
  "p2": 0.00015575,
  "cx": 554.558,
  "cy": 965.268,
  "w": 1080.0,
  "h": 1920.0,
  "aabb_scale": 4,
  "frames": [
    {
      "file_path": "images/0001.jpg",
      "sharpness": 31.752987436300323,
      "transform_matrix": [
        [0.8926439112348871, 0.08799600283226543, 0.4420900262071262, 3.168359405609479],
        [0.4464189982715247, -0.03675452191179031, -0.8940689141475064, -5.4794898611466945],
        [-0.062425682580756266, 0.995442519072023, -0.07209178487538156, -0.9791660699008925],
        [0.0, 0.0, 0.0, 1.0]
      ]
    },
    ...
  ]
}

From VTK, I need to generate my own JSON file like the one above.

They said transform_matrix is a camera-to-world matrix.
So I think I need to use the GetCompositeProjectionTransformMatrix() method
and then invert the result (world-to-camera → camera-to-world).
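
In code, that idea looks roughly like this (a sketch; note that GetViewTransformMatrix() returns only the world-to-camera extrinsics, whereas GetCompositeProjectionTransformMatrix() also bakes in the projection):

import numpy as np

def vtk_cam_to_world(camera):
    # camera is a vtkCamera; GetViewTransformMatrix() returns the
    # 4x4 world-to-camera (view) matrix as a vtkMatrix4x4.
    m = camera.GetViewTransformMatrix()
    w2c = np.array([[m.GetElement(i, j) for j in range(4)]
                    for i in range(4)])
    # Invert to get the camera-to-world pose that transforms.json expects.
    return np.linalg.inv(w2c)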

But the NeRF result is weird.
Is there someone who has done the same thing I am trying to do?

If not, please give me some hints … experts…

Actually, I deleted the parameters below and used only the "camera_angle_y" parameter, set with math.radians(camera.GetViewAngle()) (see the sketch after the list):

"fl_x": 1375.52,
"fl_y": 1374.49,
"k1": 0.0578421,
"k2": -0.0805099,
"p1": -0.000980296,
"p2": 0.00015575,
"cx": 554.558,
"cy": 965.268,
"w": 1080.0,
"h": 1920.0,
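
Writing out the stripped-down file then looks roughly like this (a sketch: camera is the vtkCamera, and frames is assumed to be the per-screenshot list built with generate_transform_matrix() from the solution further down):

import json
import math

meta = {
    "camera_angle_y": math.radians(camera.GetViewAngle()),  # VTK reports degrees
    "aabb_scale": 4,
    "frames": frames,  # list of {"file_path": ..., "transform_matrix": ...}
}
with open("transforms.json", "w") as f:
    json.dump(meta, f, indent=2)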

Thank you.

For anyone who is struggling like me:
I found the solution.


import math
import numpy as np

def generate_transform_matrix(pos, rot):
    # pos: camera.GetPosition(); rot: camera.GetOrientation(),
    # Euler angles in degrees (x, y, z).

    def Rx(theta):
        return np.array([[1, 0,              0            ],
                         [0, np.cos(theta), -np.sin(theta)],
                         [0, np.sin(theta),  np.cos(theta)]])

    def Ry(theta):
        return np.array([[ np.cos(theta), 0, np.sin(theta)],
                         [ 0,             1, 0            ],
                         [-np.sin(theta), 0, np.cos(theta)]])

    def Rz(theta):
        return np.array([[np.cos(theta), -np.sin(theta), 0],
                         [np.sin(theta),  np.cos(theta), 0],
                         [0,              0,             1]])

    # Undo VTK's camera orientation (hence the negated angles).
    R = Rz(-math.radians(rot[2])) @ Ry(-math.radians(rot[1])) @ Rx(-math.radians(rot[0]))
    xf_rot = np.eye(4)
    xf_rot[:3, :3] = R

    # Recenter the camera position on the middle of the scene;
    # x_mid, y_mid, z_mid are the scene-center coordinates, defined elsewhere.
    xf_pos = np.eye(4)
    xf_pos[:3, 3] = np.array(pos) - np.array([x_mid, y_mid, z_mid])

    # Axis swaps to convert from VTK's coordinate convention to NeRF's.
    extra_xf = np.array([
        [-1, 0, 0, 0],
        [ 0, 0, 1, 0],
        [ 0, 1, 0, 0],
        [ 0, 0, 0, 1]])

    shift_coords = np.array([
        [0, 0, 1, 0],
        [1, 0, 0, 0],
        [0, 1, 0, 0],
        [0, 0, 0, 1]])

    xf = shift_coords @ extra_xf @ xf_pos
    # Sanity check: the transform must stay rigid (determinant 1, no scaling).
    assert np.abs(np.linalg.det(xf) - 1.0) < 1e-4
    xf = xf @ xf_rot
    return np.float32(xf).tolist()

# camera is the vtkCamera of the renderer
generate_transform_matrix(camera.GetPosition(), camera.GetOrientation())
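
For completeness, here is a rough sketch of how the frames list for the JSON could be assembled, one entry per screenshot (num_views, render_window, and the orbit via Azimuth() are my assumptions; the actual image saving, e.g. with vtkWindowToImageFilter + vtkJPEGWriter, is omitted):

frames = []
num_views = 100  # hypothetical number of screenshots
for i in range(num_views):
    frames.append({
        "file_path": f"images/{i + 1:04d}.jpg",
        "transform_matrix": generate_transform_matrix(
            camera.GetPosition(), camera.GetOrientation()),
    })
    camera.Azimuth(360.0 / num_views)  # step to the next viewpoint
    render_window.Render()             # re-render before the next capture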