[Question]: It takes a long time to execute obdiag check run #1068

@suyestyle

Description

ob version 4.2.5.3

obdiag version 3.6.0

I installed obdiag on four machines, ran tests against the same OceanBase cluster, and observed the following:

1. On two machines, almost every collection item runs very slowly and obdiag eventually hangs. Even after more than 24 hours, no result is returned.

2. On one machine, obdiag is relatively fast at first, but it eventually hangs and never returns a result.

3. On one machine, obdiag is relatively fast overall and eventually returns a result.

This behavior reproduces consistently in my environment.

Regarding the first point: with the OceanBase team's help in analyzing the issue, the cause was determined to be an excessive number of records in the file [~/.ssh/known_hosts] (it contained over 7,700 records). After I cleared this file and re-ran obdiag, I got the result very quickly.
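For anyone hitting the same symptom, here is a minimal sketch of how one might check the size of the file and clear it safely. It is not part of obdiag; the function names `count_known_hosts_entries` and `backup_and_clear` and the `.bak` suffix are my own assumptions for illustration.

```python
import shutil
from pathlib import Path


def count_known_hosts_entries(path):
    """Count non-blank, non-comment lines in a known_hosts-style file."""
    file_path = Path(path).expanduser()
    if not file_path.exists():
        return 0
    with file_path.open("r", encoding="utf-8", errors="replace") as handle:
        return sum(
            1
            for line in handle
            if line.strip() and not line.lstrip().startswith("#")
        )


def backup_and_clear(path):
    """Copy the file to '<name>.bak' next to it, then empty the original.

    The '.bak' suffix is an arbitrary choice for this sketch.
    """
    file_path = Path(path).expanduser()
    backup_path = file_path.with_name(file_path.name + ".bak")
    shutil.copy2(file_path, backup_path)
    file_path.write_text("")
    return backup_path
```

Calling `count_known_hosts_entries("~/.ssh/known_hosts")` reports how many records the file holds, and `backup_and_clear("~/.ssh/known_hosts")` keeps a copy before emptying it. Note that after clearing the file, SSH will treat previously known hosts as new on the next connection.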

Regarding the second point: with the OceanBase team's help in analyzing the logs, the [cluster.observer_port] collection item was found to be very slow. Deleting the file [~/.obdiag/check/tasks/observer/cluster/observer_port.py] skips this collection item.
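A reversible variant of this workaround is to move the task file out of the tasks directory instead of deleting it. The sketch below is only an illustration under that assumption: `disable_task` and the backup directory are hypothetical names, not obdiag features, and I have only confirmed that removing the file from the tasks directory skips the item.

```python
import shutil
from pathlib import Path


def disable_task(task_path, backup_dir):
    """Move a task file out of the tasks directory so it can be restored later.

    Returns the new location, or None if the task file does not exist.
    """
    task = Path(task_path).expanduser()
    if not task.is_file():
        return None
    target_dir = Path(backup_dir).expanduser()
    target_dir.mkdir(parents=True, exist_ok=True)
    destination = target_dir / task.name
    shutil.move(str(task), str(destination))
    return destination
```

For example, `disable_task("~/.obdiag/check/tasks/observer/cluster/observer_port.py", "~/obdiag_disabled_tasks")` moves the file aside (the second path is a made-up location). Moving the file back restores the collection item.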
