Notes from the GPFS paper (2002); there is no open source version of GPFS.
GPFS uses a centralized global lock manager in conjunction with local lock managers on each file system node. The global lock manager hands out lock tokens to requesting nodes; holding a token lets a node grant further locks on that object locally, without additional messages.
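A rough way to picture the token protocol (this is my own toy sketch, not GPFS code; `TokenServer` and `LocalLockManager` are made-up names):

```python
class TokenServer:
    """Toy centralized global lock manager: hands out one token per object."""

    def __init__(self):
        self.holders = {}   # object_id -> node currently holding the token

    def acquire_token(self, node, object_id):
        holder = self.holders.get(object_id)
        if holder is not None and holder is not node:
            holder.revoke(object_id)        # ask the conflicting node to give it back
        self.holders[object_id] = node


class LocalLockManager:
    """Toy per-node lock manager: caches tokens and locks locally under them."""

    def __init__(self, name, server):
        self.name = name
        self.server = server
        self.tokens = set()

    def lock(self, object_id):
        # Talk to the token server only on the first access (or after a revoke);
        # repeated locking of the same object stays entirely node-local.
        if object_id not in self.tokens:
            self.server.acquire_token(self, object_id)
            self.tokens.add(object_id)

    def revoke(self, object_id):
        # Called when another node needs the token for a conflicting lock.
        self.tokens.discard(object_id)


server = TokenServer()
node_a, node_b = LocalLockManager("A", server), LocalLockManager("B", server)
node_a.lock("inode:42")     # one message to the token server
node_a.lock("inode:42")     # purely local, token is cached
node_b.lock("inode:42")     # server revokes A's token, then grants it to B
```

The point of the token scheme is that most lock traffic becomes node-local once a token is cached; the central server is only involved when access patterns actually conflict.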
GPFS guarantees single-node-equivalent POSIX semantics for file system operations across the cluster: a read on node A sees either all or none of a concurrent write on node B (read/write atomicity). There is one exception: the access time (atime) in metadata is only updated periodically, because concurrent reads are very common and synchronizing atime across nodes would be very expensive.
The paper claims there are two approaches to achieving the necessary synchronization:
1. Distributed locking: every FS operation acquires an appropriate read or write lock to synchronize with conflicting operations on other nodes.
2. Centralized management: all conflicting operations are forwarded to a designated node, which performs the requested reads or updates (sketched right after this list).
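To make the second option concrete, here is a toy version of centralized management, with a `Metanode` and `ClientNode` of my own invention (GPFS elects the metanode dynamically; that part is omitted):

```python
class Metanode:
    """Designated node that applies all metadata updates for a file."""

    def __init__(self):
        self.metadata = {}   # path -> dict of attributes

    def apply(self, path, update):
        # All conflicting operations are serialized here, so no
        # distributed locking is needed on the metadata itself.
        self.metadata.setdefault(path, {}).update(update)
        return self.metadata[path]


class ClientNode:
    """Any other node: forwards metadata changes instead of locking them."""

    def __init__(self, metanode):
        self.metanode = metanode

    def update_metadata(self, path, **attrs):
        # One forwarded request replaces a lock acquire/release round trip.
        return self.metanode.apply(path, attrs)


# Two nodes updating the same file's metadata without any lock traffic.
meta = Metanode()
a, b = ClientNode(meta), ClientNode(meta)
a.update_metadata("/data/big.file", size=4096)
b.update_metadata("/data/big.file", mtime=1718000000)
print(meta.metadata["/data/big.file"])   # {'size': 4096, 'mtime': 1718000000}
```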
GPFS uses byte-range locking for updates to file data, and dynamically elected "metanodes" for centralized management of file metadata. The argument for using different approaches for data and metadata is this:
(1) When lock conflicts are frequent (e.g., many nodes may access different parts of a file, but all need to update the same metadata), the overhead of distributed locking may exceed the cost of forwarding requests to a central node.
(2) If different nodes operate on different pieces of file data, distributed locking allows greater parallelism.
Granularity also matters: smaller lock granularity means more overhead from frequent lock requests, whereas larger granularity may cause more frequent lock conflicts. Thus byte-range locks are used for file data, and a lock per file for metadata.
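A minimal sketch of the granularity argument, assuming a simple overlap test rather than GPFS's actual byte-range token negotiation:

```python
from typing import List, Tuple

Range = Tuple[int, int]   # (offset, length)

def conflicts(held: List[Range], wanted: Range) -> bool:
    """True if the requested byte range overlaps any already-held range."""
    off, length = wanted
    return any(off < h_off + h_len and h_off < off + length
               for h_off, h_len in held)

held_locks: List[Range] = []

def try_write_lock(r: Range) -> bool:
    if conflicts(held_locks, r):
        return False          # would have to wait or revoke another node's token
    held_locks.append(r)
    return True

# Node A writes the first 1 MiB, node B writes the second MiB: disjoint ranges,
# so both proceed in parallel. A whole-file lock would force them to take turns.
print(try_write_lock((0, 1 << 20)))          # True
print(try_write_lock((1 << 20, 1 << 20)))    # True
print(try_write_lock((512 << 10, 4096)))     # False: overlaps node A's range
```

This is exactly the trade-off the paper describes: fine-grained ranges buy parallelism on data, while for metadata (where everyone would contend on the same object anyway) a single per-file lock or a forwarding metanode is cheaper.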
However, there could be a third approach, a middle solution: Panopticon.