MAN page from OpenSuSE gfs2-utils-3.1.9-lp152.3.5.x86_64.rpm


Section: File Formats (5)



gfs2 - GFS2 reference guide



Overview of the GFS2 filesystem



GFS2 is a clustered filesystem, designed for sharing data between multiple nodes connected to a common shared storage device. It can also be used as a local filesystem on a single node, however since the design is aimed at clusters, that will usually result in lower performance than using a filesystem designed specifically for single node use.

GFS2 is a journaling filesystem and one journal is required for each node that will mount the filesystem. The one exception to that is spectator mounts, which are equivalent to mounting a read-only block device and as such can neither recover a journal nor write to the filesystem, so do not require a journal assigned to them.



lockproto=LockProtoName

This specifies which inter-node lock protocol is used by the GFS2 filesystem for this mount, overriding the default lock protocol name stored in the filesystem's on-disk superblock.

The LockProtoName must be one of the supported locking protocols; currently these are lock_nolock and lock_dlm.

The default lock protocol name is written to disk initially when creating the filesystem with mkfs.gfs2(8) via its -p option. It can be changed on-disk by using the gfs2_tool(8) utility's sb proto command.
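For example, the stored default could be switched like this (the device path /dev/vg0/gfs2lv is a placeholder, and the filesystem must be unmounted on all nodes first):

```shell
# Change the lock protocol recorded in the on-disk superblock.
# /dev/vg0/gfs2lv is a hypothetical device path.
gfs2_tool sb /dev/vg0/gfs2lv proto lock_dlm
```

On recent releases the same change can be made with tunegfs2(8), since gfs2_tool(8) is considered obsolete.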

The lockproto mount option should be used only under special circumstances in which you want to temporarily use a different lock protocol without changing the on-disk default. Using the incorrect lock protocol on a cluster filesystem mounted from more than one node will almost certainly result in filesystem corruption.
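As a sketch (device and mount point are placeholders), a temporary single-node mount that overrides the on-disk protocol could look like:

```shell
# Override the stored lock protocol for this mount only; the on-disk
# default is left untouched. Only safe if no other node has the
# filesystem mounted.
mount -t gfs2 -o lockproto=lock_nolock /dev/vg0/gfs2lv /mnt/gfs2
```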

locktable=LockTableName

This specifies the identity of the cluster and of the filesystem for this mount, overriding the default cluster/filesystem identity stored in the filesystem's on-disk superblock. The cluster/filesystem name is recognized globally throughout the cluster, and establishes a unique namespace for the inter-node locking system, enabling the mounting of multiple GFS2 filesystems.

The format of LockTableName is lock-module-specific. For lock_dlm, the format is clustername:fsname. For lock_nolock, the field is ignored.

The default cluster/filesystem name is written to disk initially when creating the filesystem with mkfs.gfs2(8) via its -t option. It can be changed on-disk by using the gfs2_tool(8) utility's sb table command.

The locktable mount option should be used only under special circumstances in which you want to mount the filesystem in a different cluster, or mount it as a different filesystem name, without changing the on-disk default.
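For instance, to mount the filesystem under a different cluster/filesystem name without rewriting the superblock (all names and paths here are placeholders):

```shell
# Temporarily present the filesystem as "newfs" in cluster "newcluster".
mount -t gfs2 -o locktable=newcluster:newfs /dev/vg0/gfs2lv /mnt/gfs2
```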

localflocks

This flag tells GFS2 that it is running as a local (not clustered) filesystem, so it can allow the kernel VFS layer to do all flock and fcntl file locking. When running in cluster mode, these file locks require inter-node locks, and require the support of GFS2. When running locally, better performance is achieved by letting VFS handle the whole job.

This is turned on automatically by the lock_nolock module.

errors=[panic|withdraw]

Setting errors=panic causes GFS2 to oops when encountering an error that would otherwise cause the mount to withdraw or print an assertion warning. The default setting is errors=withdraw. This option should not be used in a production system. It replaces the earlier debug option on kernel versions 2.6.31 and above.
acl

Enables POSIX Access Control List acl(5) support within GFS2.
spectator

Mount this filesystem using a special form of read-only mount. The mount does not use one of the filesystem's journals. The node is unable to recover journals for other nodes.
norecovery

A synonym for spectator.
suiddir

Sets owner of any newly created file or directory to be that of parent directory, if parent directory has S_ISUID permission attribute bit set. Sets S_ISUID in any new directory, if its parent directory's S_ISUID is set. Strips all execution bits on a new file, if parent directory owner is different from owner of process creating the file. Set this option only if you know why you are setting it.
quota=[off/account/on]

Turns quotas on or off for a filesystem. Setting the quotas to be in the "account" state causes the per-UID/GID usage statistics to be correctly maintained by the filesystem; limit and warn values are ignored. The default value is "off".
discard/nodiscard

Causes GFS2 to generate "discard" I/O requests for blocks which have been freed. These can be used by suitable hardware to implement thin-provisioning and similar schemes. This feature is supported in kernel version 2.6.30 and above.
barrier/nobarrier

This option, which defaults to on, causes GFS2 to send I/O barriers when flushing the journal. The option is automatically turned off if the underlying device does not support I/O barriers. We highly recommend the use of I/O barriers with GFS2 at all times unless the block device is designed so that it cannot lose its write cache content (e.g. it's on a UPS, or it doesn't have a write cache).
commit=secs

This is similar to the ext3 commit= option in that it sets the maximum number of seconds between journal commits if there is dirty data in the journal. The default is 60 seconds. This option is only provided in kernel versions 2.6.31 and above.
data=[ordered|writeback]

When data=ordered is set, the user data modified by a transaction is flushed to the disk before the transaction is committed to disk. This should prevent the user from seeing uninitialized blocks in a file after a crash. Data=writeback mode writes the user data to the disk at any time after it's dirtied. This doesn't provide the same consistency guarantee as ordered mode, but it should be slightly faster for some workloads. The default is ordered mode.
meta

This option results in selecting the meta filesystem root rather than the normal filesystem root. This option is normally only used by the GFS2 utility functions. Altering any file on the GFS2 meta filesystem may render the filesystem unusable, so only experts in the GFS2 on-disk layout should use this option.
quota_quantum=secs

This sets the number of seconds for which a change in the quota information may sit on one node before being written to the quota file. This is the preferred way to set this parameter. The value is an integer number of seconds greater than zero. The default is 60 seconds. Shorter settings result in faster updates of the lazy quota information and less likelihood of someone exceeding their quota. Longer settings make filesystem operations involving quotas faster and more efficient.
statfs_quantum=secs

Setting statfs_quantum to 0 is the preferred way to set the slow version of statfs. The default value is 30 secs, which sets the maximum time period before statfs changes will be synced to the master statfs file. This can be adjusted to allow for faster, less accurate statfs values or slower, more accurate values. When set to 0, statfs will always report the true values.
statfs_percent=value

This setting provides a bound on the maximum percentage change in the statfs information on a local basis before it is synced back to the master statfs file, even if the time period has not expired. If the setting of statfs_quantum is 0, then this setting is ignored.
rgrplvb

This flag tells gfs2 to look for information about a resource group's free space and unlinked inodes in its glock lock value block. This keeps gfs2 from having to read in the resource group data from disk, speeding up allocations in some cases. This option was added in the 3.6 Linux kernel. Prior to this kernel, no information was saved to the resource group lvb. Note: To safely turn on this option, all nodes mounting the filesystem must be running at least a 3.6 Linux kernel. If any nodes had previously mounted the filesystem using older kernels, the filesystem must be unmounted on all nodes before it can be mounted with this option enabled. This option does not need to be enabled on all nodes using a filesystem.
loccookie

This flag tells gfs2 to use location-based readdir cookies, instead of its usual filename-hash readdir cookies. The filename-hash cookies are not guaranteed to be unique, and as the number of files in a directory increases, so does the likelihood of a collision. NFS requires readdir cookies to be unique, which can cause problems with very large directories (over 100,000 files). With this flag set, gfs2 will try to give out location-based cookies. Since the cookie is 31 bits, gfs2 will eventually run out of unique cookies, and will fall back to using hash cookies. The maximum number of files that could have unique location cookies, assuming perfectly even hashing and names of 8 or fewer characters, is 1,073,741,824. An average directory should be able to give out well over half a billion location-based cookies. This option was added in the 4.5 Linux kernel. Prior to this kernel, gfs2 did not add directory entries in a way that allowed it to use location-based readdir cookies. Note: To safely turn on this option, all nodes mounting the filesystem must be running at least a 4.5 Linux kernel. If this option is only enabled on some of the nodes mounting a filesystem, the cookies returned by nodes using this option will not be valid on nodes that are not using this option, and vice versa. Finally, when first enabling this option on a filesystem that had been previously mounted without it, you must make sure that there are no outstanding cookies being cached by other software, such as NFS.
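Several of the options above can be combined in a single mount invocation; this sketch uses placeholder device and mount point names, with option values chosen purely for illustration:

```shell
# Account-only quotas, 30-second journal commits, and a 10-second
# statfs sync interval.
mount -t gfs2 -o quota=account,commit=30,statfs_quantum=10 \
      /dev/vg0/gfs2lv /mnt/gfs2
```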



GFS2 doesn't support errors=remount-ro or data=journal. It is not possible to switch support for user and group quotas on and off independently of each other. Some of the error messages are rather cryptic; if you encounter one of these messages check firstly that gfs_controld is running and secondly that you have enough journals on the filesystem for the number of nodes in use.



mount(8) for general mount options, chmod(1) and chmod(2) for access permission flags, acl(5) for access control lists, lvm(8) for volume management, ccs(7) for cluster management, umount(8), initrd(4).

The GFS2 documentation has been split into a number of sections:

gfs2_edit(8)	A GFS2 debug tool (use with caution)
fsck.gfs2(8)	The GFS2 file system checker
gfs2_grow(8)	Growing a GFS2 file system
gfs2_jadd(8)	Adding a journal to a GFS2 file system
mkfs.gfs2(8)	Make a GFS2 file system
gfs2_quota(8)	Manipulate GFS2 disk quotas
gfs2_tool(8)	Tool to manipulate a GFS2 file system (obsolete)
tunegfs2(8)	Tool to manipulate GFS2 superblocks



GFS2 clustering is driven by the dlm, which depends on dlm_controld to provide clustering from userspace. dlm_controld clustering is built on corosync cluster/group membership and messaging.

Follow these steps to manually configure and run gfs2/dlm/corosync.

1. create /etc/corosync/corosync.conf and copy to all nodes

In this sample, replace cluster_name and IP addresses, and add nodes as needed. If using only two nodes, uncomment the two_node line. See corosync.conf(5) for more information.

totem {
        version: 2
        secauth: off
        cluster_name: abc
}
nodelist {
        node {
                ring0_addr:
                nodeid: 1
        }
        node {
                ring0_addr:
                nodeid: 2
        }
        node {
                ring0_addr:
                nodeid: 3
        }
}
quorum {
        provider: corosync_votequorum
#       two_node: 1
}
logging {
        to_syslog: yes
}

2. start corosync on all nodes

systemctl start corosync

Run corosync-quorumtool to verify that all nodes are listed.

3. create /etc/dlm/dlm.conf and copy to all nodes

*To use no fencing, use this line:

enable_fencing=0

*To use no fencing, but exercise fencing functions, use this line:

fence_all /bin/true

The "true" binary will be executed for all nodes and will succeed (exit 0) immediately.

*To use manual fencing, use this line:

fence_all /bin/false

The "false" binary will be executed for all nodes and will fail (exit 1) immediately.

When a node fails, manually run: dlm_tool fence_ack <nodeid>

*To use stonith/pacemaker for fencing, use this line:

fence_all /usr/sbin/dlm_stonith

The "dlm_stonith" binary will be executed for all nodes. If stonith/pacemaker systems are not available, dlm_stonith will fail and this config becomes the equivalent of the previous /bin/false config.

*To use an APC power switch, use these lines:

device  apc /usr/sbin/fence_apc ipaddr= login=admin password=pw
connect apc node=1 port=1
connect apc node=2 port=2
connect apc node=3 port=3

Other network switch based agents are configured similarly.

*To use sanlock/watchdog fencing, use these lines:

device wd /usr/sbin/fence_sanlock path=/dev/fence/leases
connect wd node=1 host_id=1
connect wd node=2 host_id=2
unfence wd

See fence_sanlock(8) for more information.

*For other fencing configurations see dlm.conf(5) man page.

4. start dlm_controld on all nodes

systemctl start dlm

Run "dlm_tool status" to verify that all nodes are listed.

5. if using clvm, start clvmd on all nodes

systemctl start clvmd

6. make new gfs2 file systems

mkfs.gfs2 -p lock_dlm -t cluster_name:fs_name -j num /path/to/storage

The cluster_name must match the name used in step 1 above. The fs_name must be a unique name in the cluster. The -j option is the number of journals to create; there must be one for each node that will mount the fs.
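Continuing the sample configuration from step 1 (cluster_name: abc, three nodes), a concrete invocation might look like this; the volume path and fs_name are placeholders:

```shell
# Three journals for the three nodes in the sample corosync.conf.
mkfs.gfs2 -p lock_dlm -t abc:mygfs2 -j 3 /dev/vg0/gfs2lv
```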

7. mount gfs2 file systems

mount /path/to/storage /mountpoint

Run "dlm_tool ls" to verify the nodes that have each fs mounted.

8. shut down

umount -a -t gfs2
systemctl stop clvmd
systemctl stop dlm
systemctl stop corosync

More setup information:



