
Floor Plan Bounding Box #35

@AmeyV05

Description


Assessment name

Find the bounding box for bedrooms and toilets

Image to use

Image

Prompt

Floor-Plan Room Detection Prompt (Text-Driven with Centroid Validation)

Image Geometry (Variables)

  • image_width = 660
  • image_height = 1224

All coordinates must be integer pixels within [0,image_width) × [0,image_height). Do not resize, crop, pad, or letterbox the image.
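The coordinate constraint above can be checked mechanically. The following is a minimal sketch, not part of the original prompt; the helper name `is_valid_point` is hypothetical, and it assumes the half-open ranges are meant literally (660 and 1224 themselves are out of bounds).

```python
IMAGE_WIDTH = 660
IMAGE_HEIGHT = 1224


def is_valid_point(x, y, width=IMAGE_WIDTH, height=IMAGE_HEIGHT):
    """Return True if (x, y) is an integer pixel in [0, width) x [0, height)."""
    return (
        isinstance(x, int)
        and isinstance(y, int)
        and 0 <= x < width
        and 0 <= y < height
    )
```

Under this reading, the largest valid pixel is (659, 1223).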


Input Images — Filename Mapping Rules

You will receive multiple aligned images. Use their filenames to understand their purpose:

  • "binary_text": Primary source for text. Use this for OCR to find room labels.
  • "clahe_gray": Source for text & geometry.
  • "binary_walls": Primary source for geometry. Use this to find precise room boundaries.
  • "sharpened_gray": Secondary source for geometry.
  • "clahe_lab": (Optional) Tertiary source for geometry.

Task (Concise)

Detect only the classes: ["bedroom","toilet"]. The detection process must be driven entirely by text labels. Detections for bedrooms and toilets should be performed independently, allowing their final polygons to overlap.


Core Mandate: Text is a Prerequisite

This is a text-first detection task. The process is governed by three ordered rules:

  1. First, find the text labels (BEDROOM, TOILET, etc.).
  2. Second, count them. This count becomes your target.
  3. Third, find the geometry for only those specific labels.

Do not detect any room or object based on its shape alone.


Iterative Detection Algorithm (Text-First → Propose → Validate → Refine)

High-level idea:

Your goal is to produce one high-quality polygon for each text label. You will propose boundaries from multiple images, validate them using a centroid distance check, reconcile the valid ones, and then iterate up to 3 times to ensure every label is processed.

Detailed Steps

Step 1 — Identify Targets by Finding Text

  • Perform OCR on binary_text (priority) and clahe_gray to find every occurrence of the label strings BED, BEDROOM, TOILET, WC, and BATH.
  • Count the number of bedroom and toilet labels. This is your required number of detections for each class.
  • Create a list of text-candidates, one for each label found, with its label_text, expanded search bbox, class, and confidence_text = 0.60.
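As an illustration of Step 1, the sketch below turns OCR hits into text-candidates and target counts. Several things here are assumptions rather than part of the prompt: the input format (a list of `(text, (x_min, y_min, x_max, y_max))` tuples), the mapping of WC and BATH to the toilet class, the 40-pixel `margin` for the expanded search bbox, and the helper names `classify_label` and `build_candidates`.

```python
import re
from collections import Counter

# Assumed keyword-to-class mapping; the prompt lists the strings but not the mapping.
BEDROOM_WORDS = {"BED", "BEDROOM"}
TOILET_WORDS = {"TOILET", "WC", "BATH"}


def classify_label(label_text):
    """Map an OCR string to 'bedroom', 'toilet', or None by whole-word match."""
    words = set(re.findall(r"[A-Z]+", label_text.upper()))
    if words & BEDROOM_WORDS:
        return "bedroom"
    if words & TOILET_WORDS:
        return "toilet"
    return None


def build_candidates(ocr_hits, margin=40, width=660, height=1224):
    """Build one text-candidate per recognised label, with a clamped search bbox."""
    candidates = []
    for text, (x0, y0, x1, y1) in ocr_hits:
        cls = classify_label(text)
        if cls is None:
            continue  # not a bedroom/toilet label, so never a detection target
        search_bbox = (
            max(0, x0 - margin),
            max(0, y0 - margin),
            min(width - 1, x1 + margin),
            min(height - 1, y1 + margin),
        )
        candidates.append({
            "label_text": text,
            "text_bbox": (x0, y0, x1, y1),
            "search_bbox": search_bbox,
            "class": cls,
            "confidence_text": 0.60,
        })
    return candidates


def target_counts(candidates):
    """Count candidates per class; these are the required detection counts."""
    return Counter(c["class"] for c in candidates)
```

Whole-word matching means a string such as "BATHROOM" would not match; whether it should is not specified in the prompt.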

Step 2 — Proposing Geometries from Multiple Image Sources

  • For each text-candidate, conduct an independent search on each key geometry image to generate multiple polygon "proposals."
  • Proposal A (Primary): In binary_walls, find the best closed contour that encloses the text label's search area. Assign c_geom = 0.70.
  • Proposal B (Secondary): In sharpened_gray, find the corresponding contour. Assign c_geom = 0.70.
  • Proposal C (Tertiary): In clahe_gray, find the corresponding contour. Assign c_geom = 0.60.
  • Proposal D (Optional): In clahe_lab, find the corresponding contour. Assign c_geom = 0.60.

Step 3 — Filtering Proposals via Centroid Validation

  • This step is a crucial quality check. You will discard any proposed geometry that is too far from its corresponding text label.
  1. Establish Anchor: For each text-candidate, compute the centroid of its text bbox. This is the "anchor point".
  2. Define Validation Radius: Calculate the maximum allowed distance: radius = 0.10 * min(image_width, image_height). For this image, that is 0.10 * 660 = 66 pixels.
  3. Filter Proposals: For each geometry proposal (A, B, C, D) from Step 2, calculate its polygon's centroid.
    • Measure the distance between the anchor point and the proposal's centroid.
    • If this distance is greater than the validation radius, discard the proposal as invalid.
  • At the end of this step, you will have a list of validated proposals for each text label.
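The centroid check in Step 3 could be sketched as follows. This is an illustration under assumptions: the polygon centroid is taken to be the area centroid (shoelace formula), which the prompt does not specify, and the names `polygon_centroid`, `validation_radius`, and `filter_proposals` are hypothetical.

```python
import math


def polygon_centroid(points):
    """Area centroid of a simple polygon via the shoelace formula."""
    area2 = 0.0  # twice the signed area
    cx = cy = 0.0
    n = len(points)
    for i in range(n):
        x0, y0 = points[i]
        x1, y1 = points[(i + 1) % n]
        cross = x0 * y1 - x1 * y0
        area2 += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    if area2 == 0:
        # Degenerate polygon: fall back to the mean of the vertices.
        return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)
    return (cx / (3.0 * area2), cy / (3.0 * area2))


def validation_radius(image_width=660, image_height=1224):
    """Maximum allowed anchor-to-centroid distance."""
    return 0.10 * min(image_width, image_height)


def filter_proposals(anchor, proposals, radius):
    """Keep proposals whose polygon centroid lies within radius of the anchor."""
    kept = []
    for proposal in proposals:
        px, py = polygon_centroid(proposal["polygon"])
        if math.hypot(px - anchor[0], py - anchor[1]) <= radius:
            kept.append(proposal)
    return kept
```

One consequence worth noting: with a 66-pixel radius, a large room whose label sits near one wall can fail validation even when the contour is correct.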

Step 4 — Geometry Reconciliation of Validated Proposals

  • For each text label, reconcile the validated geometry proposals from Step 3 into a single, high-confidence polygon.
  • If multiple proposals are valid: Compare them for consistency (conceptual IoU). Use the binary_walls proposal (A) as the base and refine it with details from the others (B, C, D) if they are consistent.
  • If only one proposal is valid: Use that single proposal as the final polygon.
  • If no proposals are valid: The text label has no valid geometry for this iteration.
  • Calculate a merged_confidence score from all validated and consistent evidence (1 - ∏(1 - c_i)).
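The merged_confidence formula can be worked through in a few lines. This sketch assumes that only the geometry confidences (c_geom) of validated proposals enter the product; the prompt does not say whether confidence_text is also included. The function name `merged_confidence` mirrors the field name, but the implementation is illustrative.

```python
import math


def merged_confidence(confidences):
    """Combine independent evidence scores: 1 - prod(1 - c_i)."""
    return 1.0 - math.prod(1.0 - c for c in confidences)
```

For example, two validated proposals at 0.70 each merge to 1 - 0.30 * 0.30 = 0.91, while a single 0.60 proposal stays at 0.60. Under the stated assumption, a label supported only by one clahe_gray or clahe_lab proposal would therefore fall below the 0.65 acceptance threshold in Step 5.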

Step 5 — Acceptance Rules

  • A merged candidate is ACCEPTED if it has a valid, synthesized polygon from Step 4 and its merged_confidence >= 0.65.
  • If a text-candidate has no validated geometry after this step, it is temporarily marked as text-only.

Step 6 — Iterative Refinement to Meet Target Counts

  • Compare your count of ACCEPTED objects against the target counts from Step 1.
  • If any labels are still text-only, begin the next iteration. Focus only on these remaining labels and retry Steps 2-5, perhaps with a slightly larger search bbox.
  • After a maximum of 3 iterations, the process stops.
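The retry loop in Step 6 might look like the sketch below. It is schematic: `detect_once` is a hypothetical callback standing in for Steps 2-5 (it returns a `(polygon, merged_confidence)` pair or `None`), and the starting margin and 1.25 growth factor are assumed values for the "slightly larger search bbox".

```python
MAX_ITERATIONS = 3
ACCEPT_THRESHOLD = 0.65


def run_iterations(candidates, detect_once, base_margin=40, growth=1.25):
    """Retry unresolved labels up to MAX_ITERATIONS times with a growing margin.

    Returns (accepted, text_only): accepted maps candidate index to its
    (polygon, confidence) result; text_only lists indices never accepted.
    """
    accepted = {}
    pending = list(range(len(candidates)))
    for iteration in range(MAX_ITERATIONS):
        if not pending:
            break
        margin = base_margin * growth ** iteration
        still_pending = []
        for idx in pending:
            result = detect_once(candidates[idx], margin)
            if result is not None and result[1] >= ACCEPT_THRESHOLD:
                accepted[idx] = result
            else:
                still_pending.append(idx)
        pending = still_pending
    return accepted, pending
```

Labels that are accepted in an earlier iteration are not revisited, which matches the instruction to focus only on the remaining text-only labels.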

Step 7 — Final Outline Construction

  • For each object (accepted or text-only):
    • polygon: Generate a tight polygon from the result of Step 4. Must be null if text-only.
    • centroid: Calculate the centroid of that polygon. Must be null if text-only.
    • notes: Document the validation process. Example: "Final polygon from binary_walls. 3 of 4 proposals passed centroid validation." For text-only, note: "text-only; no proposals passed centroid validation after 3 iterations."
  • Ensure all coordinates are valid integers.
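A small sketch of the final coordinate clean-up follows. Rounding to the nearest integer and clamping to the image bounds are assumptions (the prompt only requires valid integers), and the helper names `clamp_polygon` and `polygon_bbox` are hypothetical.

```python
def clamp_polygon(points, width=660, height=1224):
    """Round each vertex to an integer pixel and clamp it into the image."""
    return [
        [
            min(max(int(round(x)), 0), width - 1),
            min(max(int(round(y)), 0), height - 1),
        ]
        for x, y in points
    ]


def polygon_bbox(points):
    """Axis-aligned bounds as [x_min, y_min, x_max, y_max]."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return [min(xs), min(ys), max(xs), max(ys)]
```

The bbox computed this way is in the same order as the `mask.bbox` field of the output schema.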

Step 8 — Return

  • Return one single JSON using the schema below.

Output Schema (Exact)

{
  "status": "ok",
  "image_width": 660,
  "image_height": 1224,
  "objects": [
    {
      "id": "obj_0001",
      "class": "bedroom",
      "label_text": "MASTER BEDROOM",
      "polygon": [[x1,y1],[x2,y2],...],
      "centroid": [xc,yc],
      "confidence": 0.92,
      "candidates": [{"class":"bedroom","confidence":0.92},{"class":"toilet","confidence":0.03}],
      "mask": {
        "format": "rle",
        "mask_rle": null,
        "bbox": [x_min,y_min,x_max,y_max]
      },
      "notes": "Final polygon from binary_walls. 3 of 4 proposals passed centroid validation."
    }
  ],
  "version": "rooms-v1"
}
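A returned JSON can be sanity-checked against this schema with a short script. The sketch below is illustrative only: the function name `check_output` is hypothetical, and it checks just the key names, the two allowed classes, and polygon bounds, not every field.

```python
import json

REQUIRED_TOP = {"status", "image_width", "image_height", "objects", "version"}
REQUIRED_OBJ = {
    "id", "class", "label_text", "polygon", "centroid",
    "confidence", "candidates", "mask", "notes",
}
ALLOWED_CLASSES = {"bedroom", "toilet"}


def check_output(raw):
    """Return a list of human-readable problems found in the JSON string."""
    data = json.loads(raw)
    problems = []
    for key in sorted(REQUIRED_TOP - set(data)):
        problems.append(f"missing top-level key: {key}")
    width = data.get("image_width", 0)
    height = data.get("image_height", 0)
    for obj in data.get("objects", []):
        oid = obj.get("id", "<no id>")
        for key in sorted(REQUIRED_OBJ - set(obj)):
            problems.append(f"{oid}: missing key: {key}")
        if obj.get("class") not in ALLOWED_CLASSES:
            problems.append(f"{oid}: unexpected class {obj.get('class')!r}")
        polygon = obj.get("polygon")
        if polygon is not None:  # null is allowed for text-only objects
            for x, y in polygon:
                if not (0 <= x < width and 0 <= y < height):
                    problems.append(f"{oid}: point out of bounds: [{x}, {y}]")
    return problems
```

An empty list means none of these particular checks failed; it does not confirm that the polygons are geometrically correct.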

Correct answer

Don't know. Visually inspect.

How would you like to be cited? (optional)

No response
