HDDS-12356. granular locking framework #8176

sumitagrawl · 2025-03-27T11:20:12Z

What changes were proposed in this pull request?

Adds a granular locking framework for OBS on the existing flow. This PR contains only the framework and its binding code; the respective flows do not call the locks yet.

Locking is added for external requests at the entry point. This lets requests execute at the leader and through the existing flow simultaneously without impacting the cache.

Refer to obs-locking.md for the locking added for OBS requests (HDDS-11898, design doc for leader-side execution).
Refer to leader-execution.md for the step-by-step integration of existing requests (interoperability).

The next PR will include:

integration of the locking framework into the flow
locking for OBS key, bucket, volume, and MPU operations
https://issues.apache.org/jira/browse/HDDS-12386 configuration for lock bucket length and timeout

Parent Jira:
https://issues.apache.org/jira/browse/HDDS-11900

Parent epic:
https://issues.apache.org/jira/browse/HDDS-11897

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-12356

How was this patch tested?

Unit test cases added for the lock.

szetszwo

@sumitagrawl , thanks for the update! The code looks better. Please see the comments/questions inlined.

szetszwo · 2025-03-28T01:08:16Z

hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/lock/OmLockOperation.java

+ * Manage locking of volume, bucket, keys and others.
+ */
+public class OmLockOperation {
+  private static final long LOCK_TIMEOUT_DEFAULT = 10 * 60 * 1000;


Why does it need a 10-minute timeout?

The idea here is:

The Hadoop RPC thread pool ozone.om.grpc.read.thread.num is configured with a default of 10, so it may be blocked if lock requests do not time out.

The number of waiting requests keeps increasing, and so does memory usage.

So IMO, a server should time out requests at an appropriate time. It could be a smaller value, made configurable or chosen by usage context.

OM currently does not have a lock timeout, and we have not seen any problems. For simplicity, let's not fix this non-occurring problem here. If we see the problem, we can add a lock timeout. @kerneltime , what do you think?

Initially, I thought a lock timeout would complicate the code a lot. Now I find that it is actually not very complicated. Thanks @errose28 for pointing it out.
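For reference, a minimal sketch of the bounded wait being discussed, using only JDK locks. `TimedLockSketch` and its methods are hypothetical names for illustration and are not the PR's `OmLockOperation`; the timeout value is supplied by the caller rather than taken from any Ozone configuration.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical helper (not the PR's code): write lock with a bounded wait. */
final class TimedLockSketch {
  private final ReentrantReadWriteLock rwLock = new ReentrantReadWriteLock();
  private final long timeoutMillis;

  TimedLockSketch(long timeoutMillis) {
    this.timeoutMillis = timeoutMillis;
  }

  /** Returns true if the write lock was acquired before the timeout elapsed. */
  boolean tryWriteLock() {
    try {
      return rwLock.writeLock().tryLock(timeoutMillis, TimeUnit.MILLISECONDS);
    } catch (InterruptedException e) {
      // Preserve the interrupt status and report the lock as not acquired.
      Thread.currentThread().interrupt();
      return false;
    }
  }

  /** Releases the write lock; call only after tryWriteLock() returned true. */
  void unlockWrite() {
    rwLock.writeLock().unlock();
  }
}
```

With this shape, a handler thread that cannot get the lock returns after the timeout instead of waiting indefinitely, so the caller can fail the request and free the thread.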

szetszwo · 2025-03-28T01:11:04Z

...op-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/lock/WrappedStripedLock.java

+      Lock lockObj = fileStripedLock.get(key).writeLock();
+      boolean b = lockObj.tryLock(lockTimeout, TimeUnit.MILLISECONDS);


Do not mix getLock and acquireLock together. It makes the code complicated and error-prone.

It first calls get() to obtain the lock object for the name belonging to the bucket, which internally generates a hash and returns the lock.
Next, it calls tryLock() to acquire the lock.
I do not see the problem.

Then, how would you release the lock? Get it again and then release? Why get the same lock twice?
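One way to read the suggestion is to keep lookup and acquisition as separate steps, so the caller holds the lock reference and releases through it without a second lookup. The sketch below assumes a simple array-backed striped lock built from JDK classes; `StripedLockSketch`, `getWriteLock`, and `acquire` are hypothetical names and do not reflect the PR's `WrappedStripedLock`.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical stand-in for a striped lock; names are illustrative only. */
final class StripedLockSketch {
  private final ReadWriteLock[] stripes;

  StripedLockSketch(int stripeCount) {
    stripes = new ReadWriteLock[stripeCount];
    for (int i = 0; i < stripeCount; i++) {
      stripes[i] = new ReentrantReadWriteLock();
    }
  }

  /** Lookup only: maps a key to its stripe without acquiring anything. */
  Lock getWriteLock(String key) {
    int index = Math.floorMod(key.hashCode(), stripes.length);
    return stripes[index].writeLock();
  }

  /** Acquisition only: returns true if the given lock was obtained in time. */
  static boolean acquire(Lock lock, long timeoutMillis) {
    try {
      return lock.tryLock(timeoutMillis, TimeUnit.MILLISECONDS);
    } catch (InterruptedException e) {
      // Preserve the interrupt status and report the lock as not acquired.
      Thread.currentThread().interrupt();
      return false;
    }
  }
}
```

A caller would then call `getWriteLock(key)` once, pass the result to `acquire`, and call `unlock()` on that same reference in a `finally` block.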

szetszwo · 2025-03-28T01:16:07Z

hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/lock/OmLockOperation.java

+    long startTime = Time.monotonicNowNanos();
+    try {
+      OmLockInfo.LockLevel level = lockInfo.getLevel();
+      switch (level) {


When using inheritance, don't use switch-case; see

https://stackoverflow.com/questions/31518236/object-oriented-programming-avoid-switch-case-and-if-else-java

https://softwareengineering.stackexchange.com/questions/347223/switch-vs-polymorphism

Yes, it is better to avoid the switch as described above. I will take this up in the finalized solution.
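As an illustration of the alternative to a switch on the lock level, each level can carry its own behavior so the caller dispatches through a method call. This is a sketch using enum constant-specific methods; `LockLevelSketch`, `resourceName`, and the path format are assumptions for the example and are not taken from `OmLockInfo.LockLevel`.

```java
/** Hypothetical lock levels; each constant implements its own behavior. */
enum LockLevelSketch {
  VOLUME {
    @Override
    String resourceName(String volume, String bucket, String key) {
      return volume;
    }
  },
  BUCKET {
    @Override
    String resourceName(String volume, String bucket, String key) {
      return volume + "/" + bucket;
    }
  },
  KEY {
    @Override
    String resourceName(String volume, String bucket, String key) {
      return volume + "/" + bucket + "/" + key;
    }
  };

  /** Name of the resource to lock at this level (illustrative format). */
  abstract String resourceName(String volume, String bucket, String key);
}
```

The caller then writes `level.resourceName(volume, bucket, key)` and needs no switch; adding a level means adding a constant, and the compiler enforces that it implements the method.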

sumitagrawl · 2025-04-04T05:23:05Z

The code is restructured and a new PR is raised; closing this PR.
#8217

HDDS-12356. granular locking framework

302e9a6

sumitagrawl requested review from errose28, kerneltime, swamirishi and szetszwo March 27, 2025 11:22

sumitagrawl added the om-pre-ratis-execution PRs related to https://issues.apache.org/jira/browse/HDDS-11897 label Mar 27, 2025

sumitagrawl marked this pull request as ready for review March 27, 2025 13:25

szetszwo reviewed Mar 28, 2025

adoroszlai mentioned this pull request Apr 2, 2025

HDDS-12356. granular locking framework for obs #8217

Closed

sumitagrawl closed this Apr 4, 2025
