The text label is not correct

I passed an unrelated word ("apple") as a text label for this image, but it still gets detected.

<img width="1024" height="1024" alt="Image" src="https://github.com/user-attachments/assets/248d0e0d-a85a-4f37-affe-6e36e38fd58a" />

```python
import torch
from transformers import AutoModelForZeroShotObjectDetection, AutoProcessor
from transformers.image_utils import load_image


# Prepare processor and model
model_id = "iSEE-Laboratory/llmdet_base"
device = "cuda" if torch.cuda.is_available() else "cpu"
processor = AutoProcessor.from_pretrained(model_id)
model = AutoModelForZeroShotObjectDetection.from_pretrained(model_id).to(device)

# Prepare inputs
image = load_image("missing_object_results/edited/3.jpg") 
text_labels = [["front wheels", "apple"]]
inputs = processor(images=image, text=text_labels, return_tensors="pt").to(device)

# Run inference
with torch.no_grad():
    outputs = model(**inputs)

# Postprocess outputs
results = processor.post_process_grounded_object_detection(
    outputs,
    threshold=0.3,
    target_sizes=[(image.height, image.width)]
)

# Retrieve the first image result
result = results[0]
for box, score, labels in zip(result["boxes"], result["scores"], result["labels"]):
    box = [round(x, 2) for x in box.tolist()]
    print(f"Detected {labels} with confidence {round(score.item(), 3)} at location {box}")

# Visualize results
from PIL import ImageDraw

draw = ImageDraw.Draw(image)
for box, labels in zip(result["boxes"], result["labels"]):
    box = [round(x, 2) for x in box.tolist()]
    draw.rectangle(box, outline="red", width=2)
    draw.text((box[0], box[1]), labels, fill="red")

image
```
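To make the report easier to compare across prompts, the detections could be summarized per label. This is a minimal sketch, not part of `transformers`: `summarize_by_label` is a hypothetical helper name, and it assumes that `result["labels"]` holds strings and that each entry of `result["scores"]` can be converted with `float()`. The numbers in the example are made up for illustration and are not model output.

```python
from collections import defaultdict


def summarize_by_label(labels, scores):
    """Return {label: (count, max_score)} for the detections of one image.

    Hypothetical helper: `labels` is a sequence of strings and `scores`
    a sequence of values that float() accepts (plain floats or 0-d tensors).
    """
    grouped = defaultdict(list)
    for label, score in zip(labels, scores):
        grouped[label].append(float(score))
    return {label: (len(vals), max(vals)) for label, vals in grouped.items()}


# Made-up values for illustration only, not output from the model
example = summarize_by_label(
    ["front wheels", "apple", "front wheels"],
    [0.52, 0.31, 0.44],
)
print(example)
```

With the script above, it would be called as `summarize_by_label(result["labels"], result["scores"])`, which shows at a glance how many boxes were assigned to "apple" and how close their best score is to the `threshold=0.3` cutoff.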

The text label is not correct #72
