{"id":2124,"date":"2024-12-13T04:44:43","date_gmt":"2024-12-13T04:44:43","guid":{"rendered":"https:\/\/www.nicktailor.com\/?p=2124"},"modified":"2025-12-10T13:13:01","modified_gmt":"2025-12-10T13:13:01","slug":"how-to-deploy-lustre-with-zfs-backend-rdma-acls-nodemaps-clients","status":"publish","type":"post","link":"https:\/\/nicktailor.com\/tech-blog\/how-to-deploy-lustre-with-zfs-backend-rdma-acls-nodemaps-clients\/","title":{"rendered":"How to Deploy Lustre with ZFS Backend (RDMA, ACLs, Nodemaps, Clients)"},"content":{"rendered":"<p>This step-by-step guide walks you through deploying a production-ready Lustre filesystem backed by ZFS, including RDMA networking, MDT\/OST setup, nodemaps, ACL configuration, and client mounting. This guide assumes:<\/p>\n<ul>\n<li>MGS + MDS on one node<\/li>\n<li>One or more OSS nodes<\/li>\n<li>Clients mounting over RDMA (o2ib)<\/li>\n<li>ZFS as the backend filesystem<\/li>\n<\/ul>\n<hr \/>\n<h2>0. Architecture &amp; Assumptions<\/h2>\n<ul>\n<li>Filesystem name: <strong>lustrefs<\/strong><\/li>\n<li>MGS\/MDS RDMA IP: <strong>172.16.0.10<\/strong><\/li>\n<li>OSS RDMA IP: <strong>172.16.0.20<\/strong><\/li>\n<li>Client RDMA IP: <strong>172.16.0.30<\/strong><\/li>\n<li>RDMA interface: <strong>ib0<\/strong><\/li>\n<li>Network type: <strong>o2ib<\/strong><\/li>\n<\/ul>\n<hr \/>\n<h2>1. 
Management Server Setup (MGS + MDS with ZFS)<\/h2>\n<h3>1.1 Install ZFS and Lustre MDS packages<\/h3>\n<pre><code>sudo apt update\nsudo apt install -y zfsutils-linux\nsudo apt install -y lustre-osd-zfs-mount lustre-utils\n<\/code><\/pre>\n<h3>1.2 Create a ZFS pool for MDT<\/h3>\n<pre><code># Pool properties such as ashift are passed with -o before the pool name\nsudo zpool create -o ashift=12 mdtpool mirror \/dev\/nvme0n1 \/dev\/nvme1n1\nsudo zfs create -o recordsize=4K -o primarycache=metadata mdtpool\/mdt0\n<\/code><\/pre>\n<h3>1.3 Format MDT &amp; enable MGS<\/h3>\n<pre><code>sudo mkfs.lustre \\\n  --fsname=lustrefs \\\n  --mgs \\\n  --mdt \\\n  --index=0 \\\n  --backfstype=zfs mdtpool\/mdt0\n<\/code><\/pre>\n<h3>1.4 Mount MDT<\/h3>\n<pre><code>sudo mkdir -p \/mnt\/mdt0\nsudo mount -t lustre mdtpool\/mdt0 \/mnt\/mdt0\n<\/code><\/pre>\n<hr \/>\n<h2>2. RDMA + LNET Configuration (All Nodes)<\/h2>\n<h3>2.1 Install RDMA core utilities<\/h3>\n<pre><code>sudo apt install -y rdma-core\n<\/code><\/pre>\n<h3>2.2 Bring up the RDMA interface<\/h3>\n<pre><code># Use each node's own RDMA IP; 172.16.0.10 is the MGS\/MDS address\nsudo ip addr add 172.16.0.10\/24 dev ib0\nsudo ip link set ib0 up\n<\/code><\/pre>\n<h3>2.3 Configure LNET to use o2ib<\/h3>\n<p>Create <code>\/etc\/modprobe.d\/lustre.conf<\/code>:<\/p>\n<pre><code>options lnet networks=\"o2ib(ib0)\"\n<\/code><\/pre>\n<h3>2.4 Load and enable LNET<\/h3>\n<pre><code>sudo modprobe lnet\nsudo systemctl enable lnet\nsudo systemctl start lnet\nsudo lctl list_nids\n<\/code><\/pre>\n<hr \/>\n<h2>3. 
OFED \/ Mellanox Optional Performance Tuning<\/h2>\n<p>These settings are optional but recommended for high-performance Lustre deployments using Mellanox or OFED-based InfiniBand hardware.<\/p>\n<h3>3.1 Relevant config locations<\/h3>\n<ul>\n<li><code>\/etc\/infiniband\/*<\/code><\/li>\n<li><code>\/etc\/modprobe.d\/mlx5.conf<\/code><\/li>\n<li><code>\/etc\/security\/limits.d\/rdma.conf<\/code><\/li>\n<li><code>\/etc\/sysctl.conf<\/code> (hugepages, buffers)<\/li>\n<li><code>\/etc\/rdma\/modules\/<\/code><\/li>\n<\/ul>\n<h3>3.2 Increase RDMA MTU (InfiniBand)<\/h3>\n<pre><code># An MTU of 65520 requires IPoIB connected mode\nsudo ip link set ib0 mtu 65520\n<\/code><\/pre>\n<h3>3.3 Increase RDMA network buffers<\/h3>\n<pre><code>echo 262144 | sudo tee \/proc\/sys\/net\/core\/rmem_max\necho 262144 | sudo tee \/proc\/sys\/net\/core\/wmem_max\n<\/code><\/pre>\n<p>These settings improve performance on high-speed links (56 Gb\/s, 100 Gb\/s, HDR100, etc.). Values written under <code>\/proc<\/code> do not survive a reboot; add them to <code>\/etc\/sysctl.conf<\/code> to make them persistent.<\/p>\n<hr \/>\n<h2>4. OSS Node Setup (ZFS + OSTs)<\/h2>\n<h3>4.1 Install ZFS + Lustre OSS components<\/h3>\n<pre><code>sudo apt update\nsudo apt install -y zfsutils-linux lustre-osd-zfs-mount lustre-utils\n<\/code><\/pre>\n<h3>4.2 Create an OST ZFS pool<\/h3>\n<pre><code># Pool properties such as ashift are passed with -o before the pool name\nsudo zpool create -o ashift=12 ostpool raidz2 \\\n    \/dev\/sdc \/dev\/sdd \/dev\/sde \/dev\/sdf\n\nsudo zfs create -o recordsize=1M ostpool\/ost0\n<\/code><\/pre>\n<h3>4.3 Format OST using RDMA to MGS<\/h3>\n<pre><code>sudo mkfs.lustre \\\n  --fsname=lustrefs \\\n  --ost \\\n  --index=0 \\\n  --mgsnode=172.16.0.10@o2ib \\\n  --backfstype=zfs ostpool\/ost0\n<\/code><\/pre>\n<h3>4.4 Mount OST<\/h3>\n<pre><code>sudo mkdir -p \/mnt\/ost0\nsudo mount -t lustre ostpool\/ost0 \/mnt\/ost0\n<\/code><\/pre>\n<hr \/>\n<h2>5. 
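Client Node Setup<\/h2>\n<h3>5.1 Install Lustre client packages<\/h3>\n<pre><code>sudo apt update\nsudo apt install -y lustre-client-modules-$(uname -r) lustre-utils\n<\/code><\/pre>\n<p>If the MGS\/MDS, OSS, and client are set up correctly, mounting from the client should succeed; section 9.1 shows an example mount and its output.<\/p>\n<h3>5.2 Configure RDMA + LNET (same as above)<\/h3>\n<pre><code>sudo ip addr add 172.16.0.30\/24 dev ib0\nsudo ip link set ib0 up\n\necho 'options lnet networks=\"o2ib(ib0)\"' | sudo tee \/etc\/modprobe.d\/lustre.conf\n\nsudo modprobe lnet\nsudo systemctl start lnet\nsudo lctl list_nids\n<\/code><\/pre>\n<hr \/>\n<h2>6. How to Get Lustre Target Names<\/h2>\n<p>Run the <code>lfs<\/code> commands on a client that has the filesystem mounted.<\/p>\n<h3>6.1 List OSTs<\/h3>\n<pre><code>lfs osts\n<\/code><\/pre>\n<h3>6.2 List MDTs<\/h3>\n<pre><code>lfs mdts\n<\/code><\/pre>\n<h3>6.3 List all targets and connections<\/h3>\n<pre><code>lctl dl\n<\/code><\/pre>\n<h3>6.4 Check space and OST availability<\/h3>\n<pre><code>lfs df -h\n<\/code><\/pre>\n<hr \/>\n<h2>7. Nodemap Configuration (Access Control)<\/h2>\n<p>Run these commands on the MGS node. The built-in <code>default<\/code> nodemap already exists and applies to any NID that matches no other nodemap, so NID ranges are added to a named nodemap instead; <code>rdma_clients<\/code> below is only an example name.<\/p>\n<h3>7.1 Create a nodemap with identity mapping<\/h3>\n<pre><code>sudo lctl nodemap_add rdma_clients\n# Identity mapping: trust client UIDs\/GIDs and do not squash root\nsudo lctl nodemap_modify --name rdma_clients --property trusted --value 1\nsudo lctl nodemap_modify --name rdma_clients --property admin --value 1\n<\/code><\/pre>\n<h3>7.2 Restrict access to an RDMA subnet<\/h3>\n<pre><code># NID ranges use bracket notation rather than CIDR; this range also covers the servers\nsudo lctl nodemap_add_range --name rdma_clients --range 172.16.0.[0-255]@o2ib\n# Activate nodemaps only after the ranges are in place\nsudo lctl nodemap_activate 1\n<\/code><\/pre>\n<h3>7.3 Make a subnet read-only (optional)<\/h3>\n<pre><code># readonly_mount is only available in Lustre releases that support this property\nsudo lctl nodemap_modify --name rdma_clients --property readonly_mount --value 1\n<\/code><\/pre>\n<hr \/>\n<h2>8. 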
ACL Configuration (ZFS + Lustre)<\/h2>\n<h3>8.1 Enable ACL support in ZFS (MDT)<\/h3>\n<pre><code>sudo zfs set acltype=posixacl mdtpool\/mdt0\nsudo zfs set xattr=sa mdtpool\/mdt0\nsudo zfs set compression=off mdtpool\/mdt0\n<\/code><\/pre>\n<h3>8.2 Enable ACLs in Lustre<\/h3>\n<pre><code>sudo lctl set_param mdt.*.enable_acls=1\n<\/code><\/pre>\n<h3>8.3 Use ACLs from clients<\/h3>\n<pre><code>sudo setfacl -m u:alice:rwx \/mnt\/lustre\/data\ngetfacl \/mnt\/lustre\/data\n<\/code><\/pre>\n<hr \/>\n<h2>9. Mounting Lustre on Clients (Over RDMA)<\/h2>\n<h3>9.1 Mount command<\/h3>\n<pre><code>sudo mkdir -p \/mnt\/lustre\n\nsudo mount -t lustre \\\n  172.16.0.10@o2ib:\/lustrefs \\\n  \/mnt\/lustre\n\n# Example without an InfiniBand network (LNET over TCP)\n[root@vbox ~]# mount -t lustre 192.168.50.5@tcp:\/lustre \/mnt\/lustre-client\n[root@vbox ~]# # Verify the mount worked\n[root@vbox ~]# df -h \/mnt\/lustre-client\nFilesystem                Size  Used Avail Use% Mounted on\n192.168.50.5@tcp:\/lustre   12G  2.5M   11G   1% \/mnt\/lustre-client\n[root@vbox ~]# lfs df -h\nUUID                       bytes        Used   Available Use% Mounted on\nlustre-MDT0000_UUID         4.5G        1.9M        4.1G   1% \/mnt\/lustre-client[MDT:0]\nlustre-OST0000_UUID         7.5G        1.2M        7.0G   1% \/mnt\/lustre-client[OST:0]\nlustre-OST0001_UUID         3.9G        1.2M        3.7G   1% \/mnt\/lustre-client[OST:1]\nfilesystem_summary:        11.4G        2.4M       10.7G   1% \/mnt\/lustre-client\n<\/code><\/pre>\n<h3>9.2 Verify the mount<\/h3>\n<pre><code>df -h \/mnt\/lustre\nlfs df -h\n<\/code><\/pre>\n<h3>9.3 Persistent fstab entry<\/h3>\n<pre><code>172.16.0.10@o2ib:\/lustrefs  \/mnt\/lustre  lustre  _netdev,defaults  0 0\n<\/code><\/pre>\n<hr \/>\n<h2>10. 
Summary of the Correct Order<\/h2>\n<ol>\n<li>Install ZFS + Lustre on MGS\/MDS<\/li>\n<li>Create MDT ZFS dataset &amp; format MDT+MGS<\/li>\n<li>Configure RDMA + LNET<\/li>\n<li>Apply optional OFED\/Mellanox tuning<\/li>\n<li>Install ZFS + Lustre on OSS, create OSTs<\/li>\n<li>Format and mount OSTs<\/li>\n<li>Install Lustre client packages<\/li>\n<li>Mount client via RDMA<\/li>\n<li>Retrieve target names (OST\/MDT)<\/li>\n<li>Configure nodemaps<\/li>\n<li>Configure ACLs<\/li>\n<\/ol>\n<hr \/>\n<h2>Final Notes<\/h2>\n<p>You now have a complete ZFS-backed Lustre filesystem with RDMA transport, OFED\/Mellanox tuning, ACLs, and nodemaps. This layout delivers high-performance parallel filesystem access and scales cleanly.<\/p>\n\n\n<p>Note: I have also created an Ansible playbook that can deploy this across clients and test everything. It is not currently in a public repo; email me at support@nicktailor.com if you would like to hire me to set it up.<br \/><br \/>\u251c\u2500\u2500 inventory\/<\/p>\n\n\n\n<p>\u2502 &nbsp; \u2514\u2500\u2500 hosts.yml &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;# Inventory file with host definitions<\/p>\n\n\n\n<p>\u251c\u2500\u2500 group_vars\/<\/p>\n\n\n\n<p>\u2502 &nbsp; \u2514\u2500\u2500 all.yml &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;# Global variables<\/p>\n\n\n\n<p>\u251c\u2500\u2500 roles\/<\/p>\n\n\n\n<p>\u2502 &nbsp; \u251c\u2500\u2500 infiniband\/ &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;# InfiniBand\/RDMA setup<\/p>\n\n\n\n<p>\u2502 &nbsp; \u251c\u2500\u2500 zfs\/ &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; # ZFS installation and configuration<\/p>\n\n\n\n<p>\u2502 &nbsp; \u251c\u2500\u2500 lustre_mgs_mds\/ &nbsp; &nbsp; &nbsp; &nbsp;# MGS\/MDS server setup<\/p>\n\n\n\n<p>\u2502 &nbsp; \u251c\u2500\u2500 lustre_oss\/ &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;# OSS server setup<\/p>\n\n\n\n<p>\u2502 &nbsp; \u251c\u2500\u2500 lustre_client\/ &nbsp; &nbsp; &nbsp; &nbsp; # Client 
setup<\/p>\n\n\n\n<p>\u2502 &nbsp; \u251c\u2500\u2500 lustre_nodemaps\/ &nbsp; &nbsp; &nbsp; # Nodemap configuration<\/p>\n\n\n\n<p>\u2502 &nbsp; \u2514\u2500\u2500 lustre_acls\/ &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; # ACL configuration<\/p>\n\n\n\n<p>\u251c\u2500\u2500 site.yml &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; # Main deployment playbook<\/p>\n\n\n\n<p>\u251c\u2500\u2500 test_connectivity.yml &nbsp; &nbsp; &nbsp;# Connectivity testing playbook<\/p>\n\n\n\n<p>\u2514\u2500\u2500 README.md &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;# This file<br \/><br \/><\/p>\n","protected":false},"excerpt":{"rendered":"<p>\u00a0 This step-by-step guide walks you through deploying a production-ready Lustre filesystem backed by ZFS, including RDMA networking, MDT\/OST setup, nodemaps, ACL configuration, and client mounting. This guide assumes: MGS + MDS on one node One or more OSS nodes Clients mounting over RDMA (o2ib) ZFS as the backend filesystem 0. Architecture &amp; Assumptions Filesystem name: lustrefs MGS\/MDS RDMA IP:<a href=\"https:\/\/nicktailor.com\/tech-blog\/how-to-deploy-lustre-with-zfs-backend-rdma-acls-nodemaps-clients\/\" class=\"read-more\">Read More 
&#8230;<\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[143],"tags":[],"class_list":["post-2124","post","type-post","status-publish","format-standard","hentry","category-hpc"],"_links":{"self":[{"href":"https:\/\/nicktailor.com\/tech-blog\/wp-json\/wp\/v2\/posts\/2124","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/nicktailor.com\/tech-blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/nicktailor.com\/tech-blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/nicktailor.com\/tech-blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/nicktailor.com\/tech-blog\/wp-json\/wp\/v2\/comments?post=2124"}],"version-history":[{"count":8,"href":"https:\/\/nicktailor.com\/tech-blog\/wp-json\/wp\/v2\/posts\/2124\/revisions"}],"predecessor-version":[{"id":2137,"href":"https:\/\/nicktailor.com\/tech-blog\/wp-json\/wp\/v2\/posts\/2124\/revisions\/2137"}],"wp:attachment":[{"href":"https:\/\/nicktailor.com\/tech-blog\/wp-json\/wp\/v2\/media?parent=2124"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/nicktailor.com\/tech-blog\/wp-json\/wp\/v2\/categories?post=2124"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/nicktailor.com\/tech-blog\/wp-json\/wp\/v2\/tags?post=2124"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}