-
Notifications
You must be signed in to change notification settings - Fork 590
HDDS-12356. granular locking framework #8176
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
szetszwo
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sumitagrawl , thanks for the update! The code looks better. Please see the comments/questions inlined.
| * Manage locking of volume, bucket, keys and others. | ||
| */ | ||
| public class OmLockOperation { | ||
| private static final long LOCK_TIMEOUT_DEFAULT = 10 * 60 * 1000; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why it needs a 10min timeout?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Idea here is,
- Hadoop RPC thread pool
ozone.om.grpc.read.thread.numis configured default 10, so it may be blocked if do not timeout. - number of request waiting keeps increasing and hence memory
So IMO, a server should timeout request at appropriate time. May be lesser value as configurable or context of usages.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OM currently does not have lock timeout. We do not see any problems. Let's don't fix this non-occurring problem here for simplicity reason. If we see the problem, we can add lock timeout. @kerneltime , what do you think?
Initially, I thought lock time would complicate the code a lot. Now, I find out that It is actually not very complicated. Thanks @errose28 for pointing it out.
| Lock lockObj = fileStripedLock.get(key).writeLock(); | ||
| boolean b = lockObj.tryLock(lockTimeout, TimeUnit.MILLISECONDS); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do not mixing getLock and acquireLock together. It makes the code complicated and error prone.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it first get() lock object based on name belonging to the bucket, which internally it generate hash and return.
Next is, tryLock() to lock
I am not getting the problem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then, how would you release the lock? get it again and then release? Why getting the same lock twice?
| long startTime = Time.monotonicNowNanos(); | ||
| try { | ||
| OmLockInfo.LockLevel level = lockInfo.getLevel(); | ||
| switch (level) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When using inheritance, don't use switch-case; see
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, above are better to avoid switch. will take up with finalized solution.
|
Code is restructured and new PR is raised, closing this PR. |
What changes were proposed in this pull request?
Granular locking framework for OBS for existing flow. Its just framework binding code flow but pending to call the lock in respective flow.
locking is added for external request at entry point. This provides execution of request at leader and existing flow simultaneouly without impacting for cache.
locking added for obs request(HDDS-11898. design doc leader side execution)Step-by-step integrationof existing request (interoperability)Next PR will include:
Parent Jira:
https://issues.apache.org/jira/browse/HDDS-11900
Its Parent for Epic;
https://issues.apache.org/jira/browse/HDDS-11897
What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-12356
How was this patch tested?