Qualitative result on scene description task #23

@jkstyle2

I've been running an evaluation on the unified_3dllm_scene_description dataset with the pretrained generalist checkpoint ll3da-opt-1.3b.pth.

An example result for scene0612_00 is shown below:

"scene0612_00": {

        "pred": [
            "this room is surrounded by walls, each with a different design. there are two beds in the room, 
			one located near a door and another towards the center. there are also two nightstands in the room. 
			on one of the walls, there is a picture hanging on a hook. the room also includes a lamp and a chair. 
			additionally, there are two curtains on one of those walls. additionally, on the floor, there is another picture hanging. 
			on the opposite side, there is food standing beside a table. additionally, a wardrobe dresser is present in the room as well. 
			there is also a door and a window in the room.. the room features a nightlight on another wall."
        ],

        "gt": [
            "this room has a vent located near the center. there is a shower curtain hanging from a rod, 
			along with shower walls surrounding the shower area. a towel is hanging on a towel rack near the shower. 
			the room also contains a wall, a floor, a toilet, and a soap dish. three more towels are scattered throughout the room. 
			there is a toilet paper holder mounted on a wall. a picture is hanging on another wall. the room has a ceiling, 
			and there are two more walls. a light switch is located near the entrance, and there is a doorframe leading to the room. 
			finally, there is a bar attached to the wall and a bathtub next to it.",

            "this room is a bathroom with a vent situated at one corner. in the center of the room, 
			there is a bathtub with a shower curtain and a shower curtain rod attached to the walls. 
			the walls are positioned around the room, enclosing the space. 
			there is a door with a doorframe located at one side of the room, opposite the bathtub. 
			adjacent to the door, there is a light switch on one of the walls. 
			the ceiling is above, covering the entire room. on the opposite side of the bathtub, 
			there is a towel hanging from a bar mounted on the wall. in the corner next to the towel, there is a soap dish. 
			near the soap dish, there is a towel. 
			in another corner of the room, there is a towel and a towel holder. 
			additionally, there is a toilet in the room with a toilet paper holder attached to one side. 
			finally, there is a picture hanging on the wall, completing the decor of the room."
        ],

        "score": {
            "bleu-1": 0.6134234738542225,
            "bleu-2": 0.44446629419655276,
            "bleu-3": 0.2870047490209411,
            "bleu-4": 0.16898075047368497,
            "CiDEr": 0.1474318082229148,
            "rouge": 0.3330959164292497,
            "meteor": 0.18474151255536048
        }

    },

As shown, the predicted description is quite different from the ground-truth annotations: the prediction describes a bedroom, while both references describe a bathroom.
Is this expected for the generalist model, or am I missing a step?
I wonder whether the result would be much better after fine-tuning on the unified_3dllm_scene_description dataset.
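
Note that bleu-1 is fairly high (0.61) even though the described room type is wrong. BLEU-1 is built on clipped unigram precision, so shared function words ("there", "is", "a", "the", "room") count as matches regardless of which objects are named. The snippet below is a minimal sketch of that one component only, assuming plain whitespace tokenization; it omits the brevity penalty, it is not the evaluation code used in this repo, and `clipped_unigram_precision` is a hypothetical helper name.

```python
from collections import Counter
from typing import Sequence


def clipped_unigram_precision(pred: str, refs: Sequence[str]) -> float:
    """Fraction of predicted tokens that also occur in some reference.

    Each token's match count is clipped to its maximum count in any
    single reference, as in BLEU's modified precision.
    Assumption: lowercase + whitespace tokenization (illustrative only).
    """
    pred_counts = Counter(pred.lower().split())
    total = sum(pred_counts.values())
    if total == 0:
        return 0.0

    # For every token, the highest count seen in any one reference.
    max_ref_counts: Counter = Counter()
    for ref in refs:
        for token, count in Counter(ref.lower().split()).items():
            max_ref_counts[token] = max(max_ref_counts[token], count)

    clipped = sum(min(count, max_ref_counts[token])
                  for token, count in pred_counts.items())
    return clipped / total


# Toy sentences (not taken from the dataset): the only content word differs,
# yet 6 of the 7 predicted tokens still match the reference.
score = clipped_unigram_precision(
    "there is a bed in the room",
    ["there is a toilet in the room"],
)
print(round(score, 3))  # 0.857
```

In the toy example the object is wrong but the score is still 6/7, which suggests that CiDEr (0.147 here) or a direct comparison of the mentioned objects may be a more informative signal for this kind of mismatch.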

I also noticed that the annotations in the scene description dataset are quite different from the original 3D-LLM annotations. For instance, the original annotation for the above scene0612_00 is:

["The room is a bathroom with a shower curtain, sink, paper towel dispenser, crate, toilet, soap dispenser, 
bathroom counter, toilet paper, and trash bins. The walls are present, as well as a mirror and a door. 
There is also a light switch and a bar. The room has a window."]},

Could you tell me why they are different and how you processed the annotations for the scene description task?
