I've been running an evaluation on the unified_3dllm_scene_description dataset with the pretrained generalist checkpoint ll3da-opt-1.3b.pth.
An example result for scene0612_00 is shown below:

"scene0612_00": {
"pred": [
"this room is surrounded by walls, each with a different design. there are two beds in the room,
one located near a door and another towards the center. there are also two nightstands in the room.
on one of the walls, there is a picture hanging on a hook. the room also includes a lamp and a chair.
additionally, there are two curtains on one of those walls. additionally, on the floor, there is another picture hanging.
on the opposite side, there is food standing beside a table. additionally, a wardrobe dresser is present in the room as well.
there is also a door and a window in the room.. the room features a nightlight on another wall."
],
"gt": [
"this room has a vent located near the center. there is a shower curtain hanging from a rod,
along with shower walls surrounding the shower area. a towel is hanging on a towel rack near the shower.
the room also contains a wall, a floor, a toilet, and a soap dish. three more towels are scattered throughout the room.
there is a toilet paper holder mounted on a wall. a picture is hanging on another wall. the room has a ceiling,
and there are two more walls. a light switch is located near the entrance, and there is a doorframe leading to the room.
finally, there is a bar attached to the wall and a bathtub next to it.",
"this room is a bathroom with a vent situated at one corner. in the center of the room,
there is a bathtub with a shower curtain and a shower curtain rod attached to the walls.
the walls are positioned around the room, enclosing the space.
there is a door with a doorframe located at one side of the room, opposite the bathtub.
adjacent to the door, there is a light switch on one of the walls.
the ceiling is above, covering the entire room. on the opposite side of the bathtub,
there is a towel hanging from a bar mounted on the wall. in the corner next to the towel, there is a soap dish.
near the soap dish, there is a towel.
in another corner of the room, there is a towel and a towel holder.
additionally, there is a toilet in the room with a toilet paper holder attached to one side.
finally, there is a picture hanging on the wall, completing the decor of the room."
],
"score": {
"bleu-1": 0.6134234738542225,
"bleu-2": 0.44446629419655276,
"bleu-3": 0.2870047490209411,
"bleu-4": 0.16898075047368497,
"CiDEr": 0.1474318082229148,
"rouge": 0.3330959164292497,
"meteor": 0.18474151255536048
}
},
As shown, the pred result is quite different from the gt annotations: the prediction describes a bedroom (beds, nightstands, a wardrobe dresser), while both gt captions describe a bathroom.
Is this an expected result for the generalist model, or am I missing a step?
I wonder whether fine-tuning on the unified_3dllm_scene_description dataset would make it much better.
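For context on why bleu-1 can be as high as 0.61 even though the scene type is wrong, here is a minimal sketch of clipped unigram precision, the core of BLEU-1 without the brevity penalty. It is not the evaluation code used in this repo; `tokenize`, `clipped_unigram_precision`, and the small `STOPWORDS` set are hypothetical names I made up for illustration. Comparing the value with and without function words gives a rough idea of how much of the overlap comes from phrases like "there is a ... in the room" rather than from the objects themselves.

```python
import re
from collections import Counter

# Hypothetical, hand-picked stopword list (an assumption, not from any library).
STOPWORDS = {"a", "an", "the", "is", "are", "there", "in", "on", "of",
             "and", "with", "to", "this", "one", "also", "as", "well"}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def clipped_unigram_precision(pred, refs, drop_stopwords=False):
    """Fraction of predicted tokens that also appear in any reference.

    Each predicted token count is clipped by the maximum count of that
    token in a single reference, as in BLEU-1 (brevity penalty omitted).
    """
    pred_tokens = tokenize(pred)
    ref_token_lists = [tokenize(r) for r in refs]
    if drop_stopwords:
        pred_tokens = [t for t in pred_tokens if t not in STOPWORDS]
        ref_token_lists = [[t for t in r if t not in STOPWORDS]
                           for r in ref_token_lists]

    pred_counts = Counter(pred_tokens)
    max_ref_counts = Counter()
    for ref_tokens in ref_token_lists:
        for tok, cnt in Counter(ref_tokens).items():
            max_ref_counts[tok] = max(max_ref_counts[tok], cnt)

    matched = sum(min(cnt, max_ref_counts[tok])
                  for tok, cnt in pred_counts.items())
    total = sum(pred_counts.values())
    return matched / total if total else 0.0


if __name__ == "__main__":
    # Toy sentences, not the actual captions above.
    pred = "there is a bed in the room"
    refs = ["there is a toilet in the room"]
    print(clipped_unigram_precision(pred, refs))
    print(clipped_unigram_precision(pred, refs, drop_stopwords=True))
```

Running the same function on the full pred and gt strings above (passed in as plain Python strings) should show whether the content words overlap at all once the function words are removed.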
I also noticed that the annotations in the scene description dataset are quite different from the original 3D-LLM annotations. For instance, the original annotation for the above scene0612_00 is:
["The room is a bathroom with a shower curtain, sink, paper towel dispenser, crate, toilet, soap dispenser,
bathroom counter, toilet paper, and trash bins. The walls are present, as well as a mirror and a door.
There is also a light switch and a bar. The room has a window."]},
Could you tell me why they are different and how you processed each annotation for the scene description task?