vxtunefs(1M)

NAME

vxtunefs - tune a Veritas File System

SYNOPSIS

vxtunefs [-b value] [-ps] [-D suboption] [-f tunefstab] [-o parameter=value] [{mount_point | special}]...

AVAILABILITY

VRTSvxfs

DESCRIPTION

The vxtunefs command sets or prints tunable I/O parameters of mounted file systems. The vxtunefs command can set parameters describing the I/O properties of the underlying device, parameters to indicate when to treat an I/O as direct I/O, or parameters to control the extent allocation policy for the specified file system.

With no options specified, vxtunefs prints the existing VxFS parameters for the specified file systems.

The vxtunefs command works on a list of mount points specified on the command line, or all the mounted file systems listed in the tunefstab file. The default tunefstab file is /etc/vx/tunefstab. You can change the default by setting the VXTUNEFSTAB environment variable. The /etc/vx/tunefstab file is not present by default. To override the default parameters, create this file with the new values.

The vxtunefs command can be run at any time on a mounted file system, and all parameter changes take immediate effect. Parameters specified on the command line override parameters listed in the tunefstab file.

If /etc/vx/tunefstab exists, the VxFS-specific mount command invokes vxtunefs to set device parameters from /etc/vx/tunefstab. The VxFS-specific mount command interacts with VxVM to obtain default values for the tunables, so you need to specify tunables for VxVM devices only to change the defaults.

Only a privileged user can run vxtunefs.

NOTES

The vxtunefs command works with Storage Checkpoints; however, VxFS tunables apply to an entire file system. Therefore, tunables affect not only the primary fileset, but also any Storage Checkpoint filesets within that file system. Veritas does not recommend that you retune a parameter when the system is under heavy load. Any change to a tunable parameter requires freezing the file system, which can slow down or stall some file system activities. As a result, retuning a parameter can take a long time to complete.

Cluster File System Issues

Whether specified by the command line or the tunefstab file, tunable parameters are propagated to all nodes in the cluster.

SFCFS requires more memory to manage distributed operations, so you may want to adjust related tunables for a cluster mounted file system.

Multiple Volume Set Considerations

When using file systems in multiple volume sets, VxFS sets the VxFS tunables based on the geometry of the first component volume (volume 0) in the volume set.

OPTIONS

-D suboption You can specify one of the following suboptions with the -D option:
ckpt_removable=value
  Specifying a value of 1 causes the fsckptadm create and fsckptadm createall commands to create removable Storage Checkpoints by default. Specifying a value of 0 causes the fsckptadm command to create non-removable Storage Checkpoints by default.
print Prints the current values of the parameters.
-f filename Use filename instead of /etc/vx/tunefstab as the file containing tuning parameters. The -f option and the -o option are mutually exclusive.
-o parameter=value
  Specifies parameters for the file systems listed on the command line. The -o option and the -f option are mutually exclusive.
-p Prints the tuning parameters for all the file systems specified on the command line. The -p option and the -s option are mutually exclusive.
-s Sets the new tuning parameters for the Veritas File Systems specified on the command line or in the tunefstab file. The -s option and the -p option are mutually exclusive.

VxFS Tuning Parameters and Guidelines

The values for all the following parameters except fcl_keeptime, fcl_winterval, read_nstream and write_nstream can be specified in bytes, kilobytes, megabytes, gigabytes, terabytes, or sectors (512 bytes) by appending k, K, m, M, g, G, t, T, s, or S. There is no need for a suffix for the value in bytes.
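The suffix rules above can be sketched as a small converter. This is an illustration only, not how vxtunefs parses its arguments: parse_size is a hypothetical helper, and the sketch assumes binary multiples (k = 1024 bytes), which this page does not state explicitly.

```python
# Multipliers for the size suffixes listed above.
# ASSUMPTION: k/m/g/t are binary multiples (1024-based); "s" is a 512-byte sector.
SUFFIX_BYTES = {
    "k": 1024,
    "m": 1024 ** 2,
    "g": 1024 ** 3,
    "t": 1024 ** 4,
    "s": 512,
}


def parse_size(text: str) -> int:
    """Convert a value such as '64k' or '8S' to bytes (hypothetical helper)."""
    text = text.strip()
    if not text:
        raise ValueError("empty value")
    suffix = text[-1].lower()  # suffixes are accepted in either case
    if suffix in SUFFIX_BYTES:
        return int(text[:-1]) * SUFFIX_BYTES[suffix]
    return int(text)  # no suffix: the value is already in bytes
```

For example, under these assumptions parse_size("64k") and parse_size("128s") describe the same number of bytes.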

If the file system is being used with VxVM, it is advisable to let the parameters use the default values based on the volume geometry.

If the file system is being used with a hardware disk array, align the parameters to match the geometry of the logical disk. For disk striping and RAID-5 configurations, set read_pref_io to the stripe unit size or interleave factor and set read_nstream to the number of columns. For disk striping configurations, set write_pref_io and write_nstream to the same values as read_pref_io and read_nstream, but for RAID-5 configurations, set write_pref_io to the full stripe size and set write_nstream to 1.
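The guideline above can be written down as a short sketch. The function name array_tunables and its arguments are hypothetical, and the full stripe size for RAID-5 is taken as an input because this page does not say how to derive it from the column count.

```python
from typing import Dict, Optional


def array_tunables(stripe_unit: int, columns: int,
                   raid5_full_stripe: Optional[int] = None) -> Dict[str, int]:
    """Suggest tunable values from logical-disk geometry (sizes in bytes).

    Pass raid5_full_stripe for RAID-5; leave it as None for plain striping.
    Hypothetical helper that only restates the guideline in the text.
    """
    tunables = {"read_pref_io": stripe_unit, "read_nstream": columns}
    if raid5_full_stripe is None:
        # Striping: write settings match the read settings.
        tunables["write_pref_io"] = stripe_unit
        tunables["write_nstream"] = columns
    else:
        # RAID-5: prefer a single full-stripe write.
        tunables["write_pref_io"] = raid5_full_stripe
        tunables["write_nstream"] = 1
    return tunables
```

The resulting values are what you would pass with -o parameter=value or place in the tunefstab file.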

For an application to do efficient direct I/O or discovered direct I/O, it should issue read requests that are equal to the product of read_nstream and read_pref_io. In general, any multiple or factor of read_nstream multiplied by read_pref_io is a good size for performance. For writing, the same general rule applies to the write_pref_io and write_nstream parameters. When tuning a file system, the best approach is to evaluate the tuning parameters under a real workload.
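As a rough illustration of the sizing rule above, the following sketch checks whether a request size is a whole multiple or an exact factor of the product of the two tunables. The function name is hypothetical, and the check is only a reading of the guideline, not something VxFS enforces.

```python
def is_efficient_request(size: int, pref_io: int, nstream: int) -> bool:
    """True if size is a whole multiple or an exact factor of pref_io * nstream."""
    target = pref_io * nstream
    if size <= 0 or target <= 0:
        return False
    return size % target == 0 or target % size == 0
```

With read_pref_io at 64K and read_nstream at 4, a 256K, 512K, or 128K request passes the check, while a 192K request does not.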

If an application is doing sequential I/O to large files, the application should issue requests larger than the discovered_direct_iosz. This performs the I/O requests as discovered direct I/O requests which are unbuffered like direct I/O, but which do not require synchronous inode updates when extending the file. If the file is too large to fit in the cache, using unbuffered I/O avoids losing useful data out of the cache and lowers CPU overhead.

aligned_write
  Specifies whether writes to the file system are treated as aligned, fixed-width writes of a given size. The specified value is used as the aligned fixed-width size. Specifying a value of 0 disables the aligned fixed-width write functionality.
The default value of aligned_write is 0.
aligned_write is measured in file system blocks.
dalloc_enable
  Enables or disables delayed allocations. You can specify the following values for dalloc_enable:
 
0 Disables delayed allocations
1 Enables delayed allocations
delicache_enable
  Specifies whether performance optimization of inode allocation and reuse during a new file creation is turned on or not. You can specify the following values for delicache_enable:
 
0 Disables delicache optimization
1 Enables delicache optimization
The default value of delicache_enable is 1. However, the performance benefit on a cluster file system is limited compared to a local mount.
discovered_direct_iosz
  Any file I/O requests larger than the discovered_direct_iosz are handled as discovered direct I/O. A discovered direct I/O is unbuffered like direct I/O, but it does not require a synchronous commit of the inode when the file is extended or blocks are allocated. For larger I/O requests, the CPU time for copying the data into the page cache and the cost of using memory to buffer the I/O become more expensive than the cost of doing the disk I/O. For these I/O requests, using discovered direct I/O is more efficient than regular I/O. The default value of this parameter is 256K.
fcl_keeptime
  Specifies the minimum amount of time, in seconds, that the VxFS File Change Log (FCL) keeps records in the log. When the oldest 8K block of FCL records has been kept longer than the value of fcl_keeptime, those records are purged from the FCL file and the extents nearest to the beginning of the FCL file are freed. This process is referred to as "punching a hole." Holes are punched in the FCL file in 8K chunks.
If the fcl_maxalloc parameter is set, records are purged from the FCL file when the amount of space allocated to the FCL file exceeds fcl_maxalloc. This purge occurs even if the elapsed time that the records have been in the log is less than the value of fcl_keeptime. If the file system runs out of space before fcl_keeptime is reached, the FCL file is deactivated.
At least one of the fcl_keeptime and fcl_maxalloc parameters must be set before the File Change Log can be activated. fcl_keeptime operates only on Version 6 or higher disk layout file systems.
fcl_maxalloc
  Specifies the maximum amount of space that can be allocated to the VxFS File Change Log. The FCL file is a sparse file that grows as changes occur in the file system. When the space allocated to the FCL file reaches the fcl_maxalloc value, the oldest FCL records are purged from the FCL file and the extents nearest to the beginning of the FCL file are freed. This process is referred to as "punching a hole." Holes are punched in the FCL file in 8K chunks. If the file system runs out of space before fcl_maxalloc is reached, the FCL file is deactivated.
At least one of the fcl_maxalloc and fcl_keeptime parameters must be set before the File Change Log can be activated. fcl_maxalloc operates only on Version 6 or higher disk layout file systems.
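The interaction between fcl_keeptime and fcl_maxalloc described above can be summarized in a small decision function. This is a sketch of the documented behavior, not VxFS code: the function name is hypothetical, an unset parameter is modeled as None, and the space limit is treated as exceeded only when the allocation is strictly greater than fcl_maxalloc.

```python
from typing import Optional


def should_purge_oldest(age_seconds: int, allocated_bytes: int,
                        fcl_keeptime: Optional[int],
                        fcl_maxalloc: Optional[int]) -> bool:
    """Decide whether the oldest 8K block of FCL records is due for purging.

    ASSUMPTION: None stands for "parameter not set".
    """
    if fcl_maxalloc is not None and allocated_bytes > fcl_maxalloc:
        return True  # over the space limit: purge regardless of record age
    if fcl_keeptime is not None and age_seconds > fcl_keeptime:
        return True  # records have been kept longer than fcl_keeptime
    return False
```

The first branch reflects that the space limit takes effect even when the records are younger than fcl_keeptime.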
fcl_ointerval
  Specifies the time interval in seconds within which subsequent opens of a file do not produce an additional FCL record. This helps to reduce the number of repetitive file-open records logged in the FCL file, especially in the case of frequent accesses through NFS. If the tracking of access information is also enabled, a subsequent file open event within fcl_ointerval might produce a record, if the latter open is by a different user. Similarly, if an inode goes out of cache and returns, or if there is an FCL sync, there might be more than one file open record within the same open interval. The default value is 600 seconds.
fcl_winterval
  Specifies the time, in seconds, that must elapse before the VxFS File Change Log records a data overwrite, data extending write, or data truncate for a file. The ability to limit the number of repetitive FCL records for continuous writes to the same file is important for file system performance and for applications processing the FCL file. fcl_winterval is best set to an interval less than the shortest interval between reads of the FCL file by any application. This way all applications using the FCL file can be assured of finding at least one FCL record for any file experiencing continuous data changes.
fcl_winterval is enforced for all files in the file system. Each file maintains its own time stamps, and the elapsed time between FCL records is per file. This elapsed time can be overridden using the VxFS FCL sync public API (see the vxfs_fcl_sync(3) manual page).
fcl_winterval operates only on Version 6 or higher disk layout file systems. The default value of fcl_winterval is 3600 seconds.
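The per-file interval described above behaves like a simple rate limiter keyed by file. The following toy model is only meant to illustrate that idea; the class and method names are hypothetical, and it does not model FCL sync or any other detail of the real implementation.

```python
class WriteRecordThrottle:
    """Toy model: at most one write record per file per fcl_winterval seconds."""

    def __init__(self, fcl_winterval: int = 3600):
        self.interval = fcl_winterval
        self.last_record = {}  # file id -> time of the last record for that file

    def should_record(self, file_id, now: float) -> bool:
        """Return True if a write at time `now` would produce a new record."""
        last = self.last_record.get(file_id)
        if last is not None and now - last < self.interval:
            return False  # still inside this file's interval
        self.last_record[file_id] = now
        return True
```

Because each file keeps its own time stamp, a write to a second file is recorded even while the first file is still inside its interval.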
fiostats_enable
  Specifies whether the file I/O statistics collection is turned on or not. You can specify the following values for fiostats_enable:
 
0 Disables file I/O statistics collection
1 Enables file I/O statistics collection
The default value of fiostats_enable is 0.
lazy_isize_enable
  Enables or disables a performance optimization in a cluster file system. When one node in a cluster is extending a file, the optimization is to not reflect the updated file size immediately on other nodes. If this tunable is enabled, the file size reported by stat might be stale, but this does not affect other file operations. You can specify the following values for lazy_isize_enable:
 
0 Disables the performance optimization
1 Enables the performance optimization
The default value of lazy_isize_enable is 0.
initial_extent_size
  Changes the default size of the initial extent.
VxFS determines, based on the first write to a new file, the size of the first extent to allocate to the file. Typically the first extent is the smallest power of 2 that is larger than the size of the first write. If that power of 2 is less than 8K, the first extent allocated is 8K. After the initial extent, the file system increases the size of subsequent extents with each allocation. See max_seqio_extent_size.
Because most applications write to files using a buffer size of 8K or less, the increasing extents start doubling from a small initial extent. initial_extent_size changes the default initial extent size to a larger value, so the doubling policy starts from a much larger initial size, and the file system will not allocate a set of small extents at the start of a file.
Use this parameter only on file systems that have a very large average file size. On such file systems, there are fewer extents per file and less fragmentation.
initial_extent_size is measured in file system blocks.
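The default sizing rule described above (before initial_extent_size is changed) can be sketched as follows. The function name is hypothetical, "larger" is read as strictly larger, and the text says this is only what typically happens, so treat the result as an approximation.

```python
def first_extent_bytes(first_write_bytes: int, floor_bytes: int = 8 * 1024) -> int:
    """Smallest power of 2 strictly larger than the first write, but at least 8K."""
    if first_write_bytes < 0:
        raise ValueError("write size cannot be negative")
    # 1 << bit_length() is the smallest power of 2 strictly greater than n.
    power = 1 << first_write_bytes.bit_length()
    return max(power, floor_bytes)
```

Under this reading, a first write of 5000 bytes gets an 8K extent and a first write of 10000 bytes gets a 16K extent.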
inode_aging_count
  Specifies the maximum number of inodes to place on an inode aging list. Inode aging is used in conjunction with file system Storage Checkpoints to allow quick restoration of large, recently deleted files. The aging list is maintained in first-in-first-out (FIFO) order up to the maximum number of inodes specified by inode_aging_count. As newer inodes are placed on the list, older inodes are removed to complete their aging process. For best performance, it is advisable to age only a limited number of larger files before completion of the removal process. The default maximum number of inodes to age is 2048.
inode_aging_size
  Specifies the minimum size to qualify a deleted inode for inode aging. Inode aging is used in conjunction with file system Storage Checkpoints to allow quick restoration of large, recently deleted files. For best performance, age only a limited number of larger files before completion of the removal process. Setting the size too low can push larger file inodes out of the aging queue to make room for newly removed smaller file inodes.
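How inode_aging_count and inode_aging_size work together can be pictured with a bounded FIFO. This is a toy model under stated assumptions, not the VxFS implementation: the class name, the delete method, and the idea of returning the inodes that finish removal are all hypothetical.

```python
from collections import deque


class AgingList:
    """Toy FIFO of deleted inodes, bounded by inode_aging_count."""

    def __init__(self, inode_aging_count: int, inode_aging_size: int):
        self.max_entries = inode_aging_count
        self.min_size = inode_aging_size
        self.entries = deque()

    def delete(self, inode: int, size: int):
        """Record a deleted file; return the inodes whose removal completes now."""
        if size < self.min_size:
            return [inode]  # too small to qualify for aging
        self.entries.append(inode)
        finished = []
        while len(self.entries) > self.max_entries:
            finished.append(self.entries.popleft())  # oldest entry leaves first
        return finished
```

The model shows why a low inode_aging_size is costly: every small file that qualifies takes a slot and pushes an older, possibly larger, file off the list.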
lazy_copyonwrite
  Shared extents must be replaced with newly allocated extents before modification. Under normal circumstances, the data from the shared extent is copied to the new extent before it is inserted into the file. This causes some performance impact due to the extra operations on the disk. Postponing the update of the new extent until normal write processing would flush the new data allows a performance benefit. However, if the system performing the write fails before completing the operation, the data that was previously on the disk at the new location may appear inside the file after file system recovery. You can specify the following values for lazy_copyonwrite:
 
0 Disables lazy_copyonwrite optimization
1 Enables lazy_copyonwrite optimization
The default value of lazy_copyonwrite is 0.
This tunable is not supported on disk layouts prior to Version 8.
max_buf_data_size
  Not available.
max_direct_iosz
  Maximum size of a direct I/O request issued by the file system. If there is a larger I/O request, it is broken up into max_direct_iosz chunks. This parameter defines how much memory an I/O request can lock at once; do not set it to more than 20% of the system’s memory.
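The splitting behavior described above can be illustrated with a short sketch. The function name and the (offset, length) representation are hypothetical; the sketch only shows the arithmetic of breaking a request into pieces no larger than max_direct_iosz.

```python
def split_direct_io(offset: int, length: int, max_direct_iosz: int):
    """Return (offset, length) pieces, each no larger than max_direct_iosz."""
    if max_direct_iosz <= 0:
        raise ValueError("max_direct_iosz must be positive")
    pieces = []
    end = offset + length
    while offset < end:
        size = min(max_direct_iosz, end - offset)
        pieces.append((offset, size))
        offset += size
    return pieces
```

For instance, a 2.5-megabyte request with a 1-megabyte limit becomes two full-size pieces followed by one half-size piece.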
max_diskq
  Specifies the maximum disk queue generated by a single file. If the number of dirty pages in the disk queue exceeds this limit, the file system prevents writing more data to disk until the amount of data decreases. The default value is 1 megabyte.
Although it does not limit the actual disk queue, max_diskq prevents processes that flush data to disk, such as fsync, from making the system unresponsive.
See the write_throttle description for more information on pages and system memory.
max_seqio_extent_size
  Increases or decreases the maximum size of an extent. When the file system is following its default allocation policy for sequential writes to a file, it allocates an initial extent that is large enough for the first write to the file. When additional extents are allocated, the extents are progressively larger because the algorithm tries to double the size of the file with each new extent. Thus, each extent can hold several writes' worth of data. This reduces the total number of extents in anticipation of continued sequential writes. When there are no more writes to the file, unused space is freed for other files to use.
In general, this allocation stops increasing the size of extents at max_seqio_extent_size blocks, which prevents one file from holding too much unused space.
max_seqio_extent_size is measured in file system blocks. The default value for this tunable is 32768 blocks. Setting max_seqio_extent_size to a value less than 2048 blocks automatically resets this tunable to 2048 blocks (minimum value).
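The growth policy and the 2048-block minimum described above can be modeled roughly as follows. Both function names are hypothetical, and the model assumes that "doubling the size of the file" means each new extent equals the current file size until the cap is reached; the real allocator may behave differently in detail.

```python
MIN_MAX_SEQIO_BLOCKS = 2048  # minimum value stated in the text


def effective_max_seqio(requested_blocks: int) -> int:
    """Values below 2048 blocks are reset to 2048 blocks."""
    return max(requested_blocks, MIN_MAX_SEQIO_BLOCKS)


def extent_sequence(initial_blocks: int, count: int,
                    max_seqio_extent_size: int = 32768):
    """Model the sizes (in blocks) of the first `count` extents of a file."""
    if count <= 0:
        return []
    cap = effective_max_seqio(max_seqio_extent_size)
    extents = [initial_blocks]
    while len(extents) < count:
        file_blocks = sum(extents)
        # Try to double the file, but never exceed the cap per extent.
        extents.append(min(file_blocks, cap))
    return extents
```

In this model, a file that starts with an 8-block extent grows through extents of 8, 16, 32, 64, and 128 blocks, and growth flattens once an extent would exceed the cap.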
odm_cache_enable
  Enables or disables caching for database files accessed via ODM. To enable caching, set odm_cache_enable to 1. This enables caching for the file system, but you must also configure which files or I/O types are to be cached by using the odmadm command.
See the odmadm(1) manual page.
odm_cache_enable is synonymous with qio_cache_enable; activating one activates the other.
pdir_enable
  Enables or disables partitioned directories. Specifying a value of 1 enables partitioned directories, while specifying a value of 0 disables partitioned directories. The default value is 0.
The pdir_enable tunable operates only on Version 8 or later disk layout file systems.
pdir_threshold
  Sets the threshold value in terms of directory size in bytes, beyond which the directory will be partitioned if the pdir_enable tunable is set to 1. The default value is 32768. The pdir_threshold tunable operates only on Version 8 or later disk layout file systems.
read_ahead
  In the absence of a specific caching advisory, the default for all VxFS read operations is to perform sequential read ahead. The enhanced read ahead functionality implements an algorithm that allows read aheads to detect more elaborate patterns (such as increasing or decreasing read offsets, or multithreaded file accesses) in addition to simple sequential reads. You can specify the following values for read_ahead:
 
0 Disables read ahead functionality
1 Retains traditional sequential read ahead behavior
2 Enables enhanced read ahead for all reads
By default, read_ahead is set to 1, that is, VxFS detects only sequential patterns.
read_ahead detects patterns on a per-thread basis, up to a maximum of vx_era_nthreads. The default number of threads is 5.
read_nstream
  The number of parallel read requests of size read_pref_io that can be outstanding at one time. The file system uses the product of read_nstream and read_pref_io to determine its read ahead size. The default value for read_nstream is 1.
read_pref_io
  The preferred read request size. The file system uses this in conjunction with the read_nstream value to determine how much data to read ahead. The default value is 64K.
thin_friendly_alloc
  Enables or disables thin friendly allocations. Specifying a value of 1 enables thin friendly allocations, while specifying a value of 0 disables thin friendly allocations. The default value is 1 for thinrclm volumes, and 0 for all other volume types. You must turn on delicache_enable before you can activate this feature. This tunable is not supported for cluster file systems.
write_nstream
  The number of parallel write requests of size write_pref_io that can be outstanding at one time. The file system uses the product of write_nstream and write_pref_io to determine when to do flush behind on writes. The default value for write_nstream is 1.
write_pref_io
  The preferred write request size. The file system uses this in conjunction with the write_nstream value to determine how to do flush behind on writes. The default value is 64K.
write_throttle
  When data is written to a file through buffered writes, the file system updates only the in-memory image of the file, creating what are referred to as dirty pages. Dirty pages are cleaned when the file system later writes the data in these pages to disk. Note that data can be lost if the system crashes before dirty pages are written to disk.
Newer model computer systems typically have more memory. The more physical memory a system has, the more dirty pages the file system can generate before having to write the pages to disk to free up memory. More dirty pages can therefore lead to longer return times for operations that write dirty pages to disk, such as sync and fsync. If your system has a combination of a slow storage device and a large amount of memory, sync operations may take so long to complete that the system appears to be hung.
If your system is exhibiting this behavior, you can change the value of write_throttle. write_throttle lets you lower the number of dirty pages per file that the file system will generate before writing them to disk. After the number of dirty pages for a file reaches the write_throttle threshold, the file system starts flushing pages to disk even if free memory is still available. Depending on the speed of the storage device, user write performance may suffer, but the number of dirty pages is limited, so sync operations will complete much faster.
The default value of write_throttle is zero. The default value places no limit on the number of dirty pages per file. This typically generates a large number of dirty pages, but maintains fast writes. If write_throttle is non-zero, VxFS limits the number of dirty pages per file to write_throttle pages. In some cases, write_throttle may delay write requests. For example, lowering the value of write_throttle may increase the file disk queue to the max_diskq value, delaying user writes until the disk queue decreases. So unless the system has a combination of large physical memory and slow storage devices, it is advisable not to change the value of write_throttle.

FILES

/etc/vx/tunefstab VxFS tuning parameters table.

SEE ALSO

mkfs(1M), vxfsstat(1M), sync(2), vxfs_fcl_sync(3), fsync(3C), tunefstab(4), vxfsio(7), mount(8)

Storage Foundation Administrator’s Guide


VxFS 7.4 vxtunefs(1M)