{"id":2120,"date":"2024-11-12T11:29:43","date_gmt":"2024-11-12T11:29:43","guid":{"rendered":"https:\/\/www.nicktailor.com\/?p=2120"},"modified":"2025-12-01T04:48:47","modified_gmt":"2025-12-01T04:48:47","slug":"deploying-lustre-file-system-with-rdma-node-maps-and-acls","status":"publish","type":"post","link":"https:\/\/nicktailor.com\/tech-blog\/deploying-lustre-file-system-with-rdma-node-maps-and-acls\/","title":{"rendered":"Deploying Lustre File System with RDMA, Node Maps, and ACLs"},"content":{"rendered":"\n<p>Lustre is the de facto parallel file system for high-performance computing (HPC) clusters, providing extreme scalability, high throughput, and low-latency access across thousands of nodes. This guide walks through a complete deployment of Lustre using <strong>RDMA over InfiniBand<\/strong> for performance, along with <strong>Node Maps<\/strong> for client access control and <strong>ACLs<\/strong> for fine-grained permissions.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\">1. 
Understanding the Lustre Architecture<\/h2>\n\n\n\n<p>Lustre separates metadata and data services into distinct roles:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>MGS (Management Server)<\/strong> \u2013 Manages Lustre configuration and coordinates cluster services.<\/li>\n\n\n\n<li><strong>MDT (Metadata Target)<\/strong> \u2013 Stores file system metadata (names, permissions, directories).<\/li>\n\n\n\n<li><strong>OST (Object Storage Target)<\/strong> \u2013 Stores file data blocks.<\/li>\n\n\n\n<li><strong>Clients<\/strong> \u2013 Mount and access the Lustre file system for I\/O.<\/li>\n<\/ul>\n\n\n\n<p>The typical architecture looks like this:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>+-------------+        +-------------+\n|   Client 1  |        |   Client 2  |\n| \/mnt\/lustre |        | \/mnt\/lustre |\n+------+------+        +------+------+\n       |                        |\n       +--------o2ib RDMA-------+\n                |\n        +-------+-------+\n        |     OSS\/OST    |\n        |   (Data I\/O)   |\n        +-------+-------+\n                |\n        +-------+-------+\n        |     MGS\/MDT    |\n        |  (Metadata)    |\n        +---------------+\n<\/code><\/pre>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\">2. 
Prerequisites and Environment<\/h2>\n\n\n\n<figure class=\"wp-block-table\"><table class=\"has-fixed-layout\"><thead><tr><th>Component<\/th><th>Requirements<\/th><\/tr><\/thead><tbody><tr><td><strong>OS<\/strong><\/td><td>RHEL \/ Rocky \/ AlmaLinux 8.x or higher<\/td><\/tr><tr><td><strong>Kernel<\/strong><\/td><td>Built with Lustre and OFED RDMA modules<\/td><\/tr><tr><td><strong>Network<\/strong><\/td><td>InfiniBand fabric (Mellanox or compatible)<\/td><\/tr><tr><td><strong>Lustre Version<\/strong><\/td><td>2.14 or later<\/td><\/tr><tr><td><strong>Devices<\/strong><\/td><td>Separate block devices for the MDT and OST(s); clients need only a mount point directory<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\">3. Install Lustre Packages<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\">On MGS, MDS, and OSS Nodes:<\/h3>\n\n\n\n<pre class=\"wp-block-code\"><code>dnf install -y lustre kmod-lustre lustre-osd-ldiskfs\n<\/code><\/pre>\n\n\n\n<h3 class=\"wp-block-heading\">On Client Nodes:<\/h3>\n\n\n\n<pre class=\"wp-block-code\"><code>dnf install -y lustre-client kmod-lustre-client\n<\/code><\/pre>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\">4. Configure InfiniBand and RDMA (o2ib)<\/h2>\n\n\n\n<p>InfiniBand provides the lowest latency for Lustre communication via RDMA. Configure the <code>o2ib<\/code> network type for Lustre.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">1. Install and verify InfiniBand stack<\/h3>\n\n\n\n<pre class=\"wp-block-code\"><code>dnf install -y rdma-core infiniband-diags perftest libibverbs-utils\nsystemctl enable --now rdma\nibstat\n<\/code><\/pre>\n\n\n\n<h3 class=\"wp-block-heading\">2. Configure IB network<\/h3>\n\n\n\n<pre class=\"wp-block-code\"><code>nmcli con add type infiniband ifname ib0 con-name ib0 ip4 10.0.0.1\/24\nnmcli con up ib0\n<\/code><\/pre>\n\n\n\n<h3 class=\"wp-block-heading\">3. 
Verify RDMA link<\/h3>\n\n\n\n<pre class=\"wp-block-code\"><code>ibv_devinfo\n# Start on one node as the server, then run on a second node with the server's address appended\nibv_rc_pingpong -d mlx5_0\n<\/code><\/pre>\n\n\n\n<h3 class=\"wp-block-heading\">4. Configure LNET for o2ib<\/h3>\n\n\n\n<p>Create <code>\/etc\/modprobe.d\/lustre.conf<\/code> with:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>options lnet networks=\"o2ib(ib0)\"\n<\/code><\/pre>\n\n\n\n<p>Then load the module and configure the network:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>modprobe lnet\nlnetctl lnet configure\nlnetctl net add --net o2ib --if ib0\nlnetctl net show\n<\/code><\/pre>\n\n\n\n<p>Expected output:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>net:\n  - net type: o2ib\n    interfaces:\n      0: ib0\n<\/code><\/pre>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\">5. Format and Mount Lustre Targets<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\">Metadata Server (MGS + MDT)<\/h3>\n\n\n\n<pre class=\"wp-block-code\"><code>mkfs.lustre --fsname=lustrefs --mgs --mdt --index=0 \/dev\/sdb\nmkdir -p \/mnt\/mdt\nmount -t lustre \/dev\/sdb \/mnt\/mdt\n<\/code><\/pre>\n\n\n\n<h3 class=\"wp-block-heading\">Object Storage Server (OSS)<\/h3>\n\n\n\n<pre class=\"wp-block-code\"><code>mkfs.lustre --fsname=lustrefs --ost --index=0 --mgsnode=&lt;MGS&gt;@o2ib \/dev\/sdc\nmkdir -p \/mnt\/ost\nmount -t lustre \/dev\/sdc \/mnt\/ost\n<\/code><\/pre>\n\n\n\n<h3 class=\"wp-block-heading\">Client Node<\/h3>\n\n\n\n<pre class=\"wp-block-code\"><code># Create the mount point, then mount over o2ib\nsudo mkdir -p \/mnt\/lustre\nsudo mount -t lustre &lt;MGS&gt;@o2ib:\/lustrefs \/mnt\/lustre\n\n# For example, with the MGS at 172.16.0.10:\nsudo mount -t lustre \\\n  172.16.0.10@o2ib:\/lustrefs \\\n  \/mnt\/lustre\n\n# Example over TCP, without an InfiniBand network\n&#91;root@vbox ~]# mount -t lustre 172.16.0.10@tcp:\/lustre \/mnt\/lustre-client\n&#91;root@vbox ~]# \n&#91;root@vbox ~]# # Verify the mount worked\n&#91;root@vbox ~]# df -h \/mnt\/lustre-client\nFilesystem                Size  Used Avail Use% Mounted on\n172.16.0.10@tcp:\/lustre   12G  2.5M   11G   1% \/mnt\/lustre-client\n&#91;root@vbox ~]# lfs df -h\nUUID                       bytes        Used   Available 
Use% Mounted on\nlustre-MDT0000_UUID         4.5G        1.9M        4.1G   1% \/mnt\/lustre-client&#91;MDT:0]\nlustre-OST0000_UUID         7.5G        1.2M        7.0G   1% \/mnt\/lustre-client&#91;OST:0]\nlustre-OST0001_UUID         3.9G        1.2M        3.7G   1% \/mnt\/lustre-client&#91;OST:1]\nfilesystem_summary:        11.4G        2.4M       10.7G   1% \/mnt\/lustre-client\n<\/code><\/pre>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\">6. Configuring Node Maps (Access Control)<\/h2>\n\n\n\n<p>Node maps group clients by NID (network identifier) range and control how each group's user and group IDs are treated, which lets administrators restrict Lustre client access based on network or host identity. Run the <code>lctl nodemap_*<\/code> commands on the MGS.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">1. View current node maps<\/h3>\n\n\n\n<pre class=\"wp-block-code\"><code>lctl nodemap_info list\n<\/code><\/pre>\n\n\n\n<h3 class=\"wp-block-heading\">2. Create a new node map for trusted clients<\/h3>\n\n\n\n<pre class=\"wp-block-code\"><code>lctl nodemap_add trusted_clients\n<\/code><\/pre>\n\n\n\n<h3 class=\"wp-block-heading\">3. Add allowed network range or host<\/h3>\n\n\n\n<pre class=\"wp-block-code\"><code>lctl nodemap_add_range --name trusted_clients --range 10.0.0.&#91;0-255]@o2ib\n<\/code><\/pre>\n\n\n\n<h3 class=\"wp-block-heading\">4. Enable enforcement<\/h3>\n\n\n\n<pre class=\"wp-block-code\"><code>lctl nodemap_modify --name trusted_clients --property admin --value 1\nlctl nodemap_modify --name trusted_clients --property trusted --value 1\nlctl nodemap_activate 1\n<\/code><\/pre>\n\n\n\n<h3 class=\"wp-block-heading\">5. Restrict default map<\/h3>\n\n\n\n<pre class=\"wp-block-code\"><code>lctl nodemap_modify --name default --property deny_unknown --value 1\n<\/code><\/pre>\n\n\n\n<p>With the nodemap feature active, clients whose NIDs fall outside <code>10.0.0.&#91;0-255]@o2ib<\/code> land in the default node map, where <code>deny_unknown<\/code> refuses access for their unmapped users. In effect, only hosts in <code>10.0.0.0\/24<\/code> can use the Lustre filesystem.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\">7. 
Configuring Access Control Lists (ACLs)<\/h2>\n\n\n\n<p>Lustre supports standard POSIX ACLs for fine-grained directory and file permissions.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">1. Enable ACL support on mount<\/h3>\n\n\n\n<pre class=\"wp-block-code\"><code>mount -t lustre -o acl &lt;MGS&gt;@o2ib:\/lustrefs \/mnt\/lustre\n<\/code><\/pre>\n\n\n\n<h3 class=\"wp-block-heading\">2. Verify ACL support<\/h3>\n\n\n\n<pre class=\"wp-block-code\"><code>mount | grep lustre\n<\/code><\/pre>\n\n\n\n<p>The output should resemble the following (the exact option list varies by Lustre version):<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>&lt;MGS&gt;@o2ib:\/lustrefs on \/mnt\/lustre type lustre (rw,acl)\n<\/code><\/pre>\n\n\n\n<h3 class=\"wp-block-heading\">3. Set ACLs on directories<\/h3>\n\n\n\n<pre class=\"wp-block-code\"><code>setfacl -m u:researcher:rwx \/mnt\/lustre\/projects\nsetfacl -m g:analysts:rx \/mnt\/lustre\/reports\n<\/code><\/pre>\n\n\n\n<h3 class=\"wp-block-heading\">4. View ACLs<\/h3>\n\n\n\n<pre class=\"wp-block-code\"><code>getfacl \/mnt\/lustre\/projects\n<\/code><\/pre>\n\n\n\n<p>Sample output:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code># file: mnt\/lustre\/projects\n# owner: root\n# group: root\nuser::rwx\nuser:researcher:rwx\ngroup::r-x\nmask::rwx\nother::---\n<\/code><\/pre>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\">8. 
Verifying Cluster Health<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\">On all nodes:<\/h3>\n\n\n\n<pre class=\"wp-block-code\"><code>lctl ping &lt;MGS&gt;@o2ib\nlctl dl\nlnetctl net show -v\n<\/code><\/pre>\n\n\n\n<h3 class=\"wp-block-heading\">Check LNet statistics over RDMA:<\/h3>\n\n\n\n<pre class=\"wp-block-code\"><code>lnetctl stats show\n<\/code><\/pre>\n\n\n\n<h3 class=\"wp-block-heading\">Check file system mount from client:<\/h3>\n\n\n\n<pre class=\"wp-block-code\"><code>df -h \/mnt\/lustre\n<\/code><\/pre>\n\n\n\n<h3 class=\"wp-block-heading\">Optional: Check node map enforcement<\/h3>\n\n\n\n<p>Try mounting from a client outside the allowed range; it should fail:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>mount -t lustre &lt;MGS&gt;@o2ib:\/lustrefs \/mnt\/test\nmount.lustre: mount &lt;MGS&gt;@o2ib:\/lustrefs at \/mnt\/test failed: Permission denied\n<\/code><\/pre>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\">9. Common Issues and Troubleshooting<\/h2>\n\n\n\n<figure class=\"wp-block-table\"><table class=\"has-fixed-layout\"><thead><tr><th>Issue<\/th><th>Possible Cause<\/th><th>Resolution<\/th><\/tr><\/thead><tbody><tr><td><code>Mount failed: no route to host<\/code><\/td><td>IB subnet mismatch or LNET not configured<\/td><td>Verify <code>lnetctl net show<\/code> and <code>ping -I ib0<\/code> between nodes.<\/td><\/tr><tr><td><code>Permission denied<\/code><\/td><td>Node map restriction active<\/td><td>Check <code>lctl nodemap_info list<\/code> and ensure the client NID range is allowed.<\/td><\/tr><tr><td><code>Slow performance<\/code><\/td><td>RDMA disabled or fallback to TCP<\/td><td>Verify <code>lctl list_nids<\/code> shows <code>@o2ib<\/code> transport.<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\">10. 
Final Validation Checklist<\/h2>\n\n\n\n<ul class=\"wp-block-list\">\n<li>InfiniBand RDMA verified with <code>ibv_rc_pingpong<\/code><\/li>\n\n\n\n<li>LNET configured for <code>o2ib(ib0)<\/code><\/li>\n\n\n\n<li>MGS, MDT, and OST mounted successfully<\/li>\n\n\n\n<li>Clients connected via <code>@o2ib<\/code><\/li>\n\n\n\n<li>Node maps restricting unauthorized hosts<\/li>\n\n\n\n<li>ACLs correctly enforcing directory-level access<\/li>\n<\/ul>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h3 class=\"wp-block-heading\">Summary<\/h3>\n\n\n\n<p>With RDMA transport, Lustre achieves near line-rate performance while node maps and ACLs enforce robust security and access control. This combination provides a scalable, high-performance, and policy-driven storage environment ideal for AI, HPC, and research workloads.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Lustre is the de facto parallel file system for high-performance computing (HPC) clusters, providing extreme scalability, high throughput, and low-latency access across thousands of nodes. This guide walks through a complete deployment of Lustre using RDMA over InfiniBand for performance, along with Node Maps for client access control and ACLs for fine-grained permissions. 1. 
Understanding the Lustre Architecture Lustre separates<a href=\"https:\/\/nicktailor.com\/tech-blog\/deploying-lustre-file-system-with-rdma-node-maps-and-acls\/\" class=\"read-more\">Read More &#8230;<\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[143],"tags":[],"class_list":["post-2120","post","type-post","status-publish","format-standard","hentry","category-hpc"],"_links":{"self":[{"href":"https:\/\/nicktailor.com\/tech-blog\/wp-json\/wp\/v2\/posts\/2120","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/nicktailor.com\/tech-blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/nicktailor.com\/tech-blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/nicktailor.com\/tech-blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/nicktailor.com\/tech-blog\/wp-json\/wp\/v2\/comments?post=2120"}],"version-history":[{"count":3,"href":"https:\/\/nicktailor.com\/tech-blog\/wp-json\/wp\/v2\/posts\/2120\/revisions"}],"predecessor-version":[{"id":2184,"href":"https:\/\/nicktailor.com\/tech-blog\/wp-json\/wp\/v2\/posts\/2120\/revisions\/2184"}],"wp:attachment":[{"href":"https:\/\/nicktailor.com\/tech-blog\/wp-json\/wp\/v2\/media?parent=2120"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/nicktailor.com\/tech-blog\/wp-json\/wp\/v2\/categories?post=2120"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/nicktailor.com\/tech-blog\/wp-json\/wp\/v2\/tags?post=2120"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}