Description
Assessment name
Find the bounding box for bedrooms and toilets
Image to use
Prompt
Floor-Plan Room Detection Prompt (Text-Driven with Centroid Validation)
Image Geometry (Variables)
image_width = 660
image_height = 1224
All coordinates must be integer pixels within [0,image_width) × [0,image_height). Do not resize, crop, pad, or letterbox the image.
Input Images — Filename Mapping Rules
You will receive multiple aligned images. Use their filenames to understand their purpose:
- "binary_text": Primary source for text. Use this for OCR to find room labels.
- "clahe_gray": Source for text & geometry.
- "binary_walls": Primary source for geometry. Use this to find precise room boundaries.
- "sharpened_gray": Secondary source for geometry.
- "clahe_lab": (Optional) Tertiary source for geometry.
Task (Concise)
Detect only the classes: ["bedroom","toilet"]. The detection process must be driven entirely by text labels. Detections for bedrooms and toilets should be performed independently, allowing their final polygons to overlap.
Core Mandate: Text is a Prerequisite
This is a text-first detection task. The process is governed by one rule:
- First, find the text labels (BEDROOM, TOILET, etc.).
- Second, count them. This count becomes your target.
- Third, find the geometry for only those specific labels.
Do not detect any room or object based on its shape alone.
Iterative Detection Algorithm (Text-First → Propose → Validate → Refine)
High-level idea:
Your goal is to produce one high-quality polygon for each text label. You will propose boundaries from multiple images, validate them using a centroid distance check, reconcile the valid ones, and then iterate up to 3 times to ensure every label is processed.
Detailed Steps
Step 1 — Identify Targets by Finding Text
- Perform OCR on binary_text (priority) and clahe_gray to find all text strings for BED, BEDROOM, TOILET, WC, BATH.
- Count the number of bedroom and toilet labels. This is your required number of detections for each class.
- Create a list of text-candidates, one for each label found, with its label_text, expanded search bbox, class, and confidence_text = 0.60.
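Step 1 can be illustrated with a minimal Python sketch. It assumes OCR has already produced (label_text, bbox) pairs. The keyword-to-class mapping (in particular WC and BATH counting as toilet) and the function names are assumptions for illustration, not part of the prompt. Expanding the search bbox is left out.

```python
import re
from collections import Counter

# Assumed mapping: the prompt lists these strings but does not say how they map to classes.
KEYWORD_CLASS = {
    "BED": "bedroom",
    "BEDROOM": "bedroom",
    "TOILET": "toilet",
    "WC": "toilet",
    "BATH": "toilet",
}


def classify_label(label_text):
    """Return 'bedroom', 'toilet', or None for one OCR string (exact token match)."""
    for token in re.findall(r"[A-Z]+", label_text.upper()):
        if token in KEYWORD_CLASS:
            return KEYWORD_CLASS[token]
    return None


def build_text_candidates(ocr_results):
    """ocr_results: iterable of (label_text, [x_min, y_min, x_max, y_max])."""
    candidates = []
    for label_text, bbox in ocr_results:
        cls = classify_label(label_text)
        if cls is not None:
            candidates.append({
                "label_text": label_text,
                "bbox": bbox,
                "class": cls,
                "confidence_text": 0.60,  # fixed value given in Step 1
            })
    return candidates


def target_counts(candidates):
    """Number of required detections per class."""
    return Counter(c["class"] for c in candidates)
```

Because matching is by exact token, variants such as "W.C." or "BATHROOM" would need extra normalization before they are counted.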
Step 2 — Proposing Geometries from Multiple Image Sources
- For each text-candidate, conduct an independent search on each key geometry image to generate multiple polygon "proposals."
- Proposal A (Primary): In binary_walls, find the best closed contour that encloses the text label's search area. Assign c_geom = 0.70.
- Proposal B (Secondary): In sharpened_gray, find the corresponding contour. Assign c_geom = 0.70.
- Proposal C (Tertiary): In clahe_gray, find the corresponding contour. Assign c_geom = 0.60.
- Proposal D (Optional): In clahe_lab, find the corresponding contour. Assign c_geom = 0.60.
Step 3 — Filtering Proposals via Centroid Validation
- This step is a crucial quality check. You will discard any proposed geometry that is too far from its corresponding text label.
- 1. Establish Anchor: For each text-candidate, determine the coordinate of its text bbox centroid. This is the "anchor point".
- 2. Define Validation Radius: Calculate the maximum allowed distance: radius = 0.10 * min(image_width, image_height).
- 3. Filter Proposals: For each geometry proposal (A, B, C, D) from Step 2, calculate its polygon's centroid.
  - Measure the distance between the anchor point and the proposal's centroid.
  - If this distance is greater than the validation radius, that proposal is considered invalid and must be discarded.
- At the end of this step, you will have a list of validated proposals for each text label.
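The centroid check in Step 3 can be sketched in Python as follows. The function names and the dict-of-polygons input are assumptions for illustration. The polygon centroid uses the standard area-weighted (shoelace) formula, which the prompt does not prescribe.

```python
import math


def bbox_centroid(bbox):
    """Centre of [x_min, y_min, x_max, y_max]; used as the anchor point."""
    x_min, y_min, x_max, y_max = bbox
    return ((x_min + x_max) / 2.0, (y_min + y_max) / 2.0)


def polygon_centroid(points):
    """Area-weighted centroid; falls back to the vertex mean if the area is zero."""
    n = len(points)
    area2 = cx = cy = 0.0
    for i in range(n):
        x0, y0 = points[i]
        x1, y1 = points[(i + 1) % n]
        cross = x0 * y1 - x1 * y0
        area2 += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    if area2 == 0:
        return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)
    return (cx / (3.0 * area2), cy / (3.0 * area2))


def validation_radius(image_width, image_height):
    """radius = 0.10 * min(image_width, image_height), as defined in Step 3."""
    return 0.10 * min(image_width, image_height)


def filter_proposals(text_bbox, proposals, image_width, image_height):
    """Keep proposals whose centroid lies within the radius of the anchor.

    proposals: dict such as {"A": [[x, y], ...], "B": [...]} (assumed layout).
    """
    anchor = bbox_centroid(text_bbox)
    radius = validation_radius(image_width, image_height)
    return {
        name: polygon
        for name, polygon in proposals.items()
        if math.dist(anchor, polygon_centroid(polygon)) <= radius
    }
```

For the 660 × 1224 image used here, the validation radius works out to 66 pixels.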
Step 4 — Geometry Reconciliation of Validated Proposals
- For each text label, reconcile the validated geometry proposals from Step 3 into a single, high-confidence polygon.
- If multiple proposals are valid: Compare them for consistency (conceptual IoU). Use the binary_walls proposal (A) as the base and refine it with details from the others (B, C, D) if they are consistent.
- If only one proposal is valid: Use that single proposal as the final polygon.
- If no proposals are valid: The text label has no valid geometry for this iteration.
- Calculate a merged_confidence score from all validated and consistent evidence (1 - ∏(1 - c_i)).
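The merged_confidence formula is a noisy-OR combination, and a short Python sketch makes the arithmetic concrete. The prompt does not say whether confidence_text belongs in the product alongside the c_geom values, so this sketch simply combines whatever list it is given.

```python
import math


def merged_confidence(confidences):
    """Noisy-OR merge: 1 - prod(1 - c_i). Returns 0.0 for an empty list."""
    return 1.0 - math.prod(1.0 - c for c in confidences)


# Three validated proposals (A, B, C) with the c_geom values from Step 2.
three = merged_confidence([0.70, 0.70, 0.60])  # 1 - 0.3 * 0.3 * 0.4, about 0.964

# A single validated tertiary proposal on its own.
single = merged_confidence([0.60])  # about 0.60
```

Each additional consistent proposal can only raise the score. A single 0.60 proposal merged on its own stays at about 0.60.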
Step 5 — Acceptance Rules
- A merged candidate is ACCEPTED if it has a valid, synthesized polygon from Step 4 and its merged_confidence >= 0.65.
- If a text-candidate has no validated geometry after this step, it is temporarily marked as text-only.
Step 6 — Iterative Refinement to Meet Target Counts
- Compare your count of ACCEPTED objects against the target counts from Step 1.
- If any labels are still text-only, begin the next iteration. Focus only on these remaining labels and retry Steps 2-5, perhaps with a slightly larger search bbox.
- After a maximum of 3 iterations, the process stops.
Step 7 — Final Outline Construction
- For each accepted object:
  - Generate a tight polygon from the result of Step 4. Must be null if text-only.
  - Calculate the centroid. Must be null if text-only.
  - notes: Document the validation process. Example: "Final polygon from binary_walls. 3 of 4 proposals passed centroid validation." For text-only, note: "text-only; no proposals passed centroid validation after 3 iterations."
- Ensure all coordinates are valid integers.
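The coordinate constraint can be checked with a small Python sketch like the one below. The helper names are hypothetical. Rounding and clamping out-of-range values is an assumption: the prompt only requires integer pixels within [0,image_width) × [0,image_height) and does not say how violations should be repaired.

```python
def is_valid_point(point, image_width, image_height):
    """True if both coordinates are ints inside [0, width) x [0, height)."""
    x, y = point
    return (
        isinstance(x, int) and isinstance(y, int)
        and 0 <= x < image_width
        and 0 <= y < image_height
    )


def clamp_polygon(polygon, image_width, image_height):
    """Round each vertex to an integer and clamp it into the image (assumed repair)."""
    return [
        [
            min(max(int(round(x)), 0), image_width - 1),
            min(max(int(round(y)), 0), image_height - 1),
        ]
        for x, y in polygon
    ]


def polygon_bbox(polygon):
    """[x_min, y_min, x_max, y_max] of a polygon, matching the schema's mask.bbox layout."""
    xs = [p[0] for p in polygon]
    ys = [p[1] for p in polygon]
    return [min(xs), min(ys), max(xs), max(ys)]
```

A typical use would be to clamp each final polygon, then confirm that every vertex passes is_valid_point before writing it into the output.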
Step 8 — Return
- Return one single JSON using the schema below.
Output Schema (Exact)
{
"status": "ok",
"image_width": 660,
"image_height": 1224,
"objects": [
{
"id": "obj_0001",
"class": "bedroom",
"label_text": "MASTER BEDROOM",
"polygon": [[x1,y1],[x2,y2],...],
"centroid": [xc,yc],
"confidence": 0.92,
"candidates": [{"class":"bedroom","confidence":0.92},{"class":"toilet","confidence":0.03}],
"mask": {
"format": "rle",
"mask_rle": null,
"bbox": [x_min,y_min,x_max,y_max]
},
"notes": "Final polygon from binary_walls. 3 of 4 proposals passed centroid validation."
}
],
"version": "rooms-v1"
}
Correct answer
Don't know. Visually inspect.
How would you like to be cited? (optional)
No response