vxtunefs



NAME

vxtunefs - tune a Veritas File System

SYNOPSIS

vxtunefs {-p | -s [-f tunefstab | -o parameter=value]} {mount_point | special} ...

AVAILABILITY

VRTSvxfs

DESCRIPTION

The vxtunefs command sets or prints tunable I/O parameters of mounted file systems. The vxtunefs command can set parameters describing the I/O properties of the underlying device, parameters to indicate when to treat an I/O as direct I/O, or parameters to control the extent allocation policy for the specified file system.

With no options specified, vxtunefs prints the existing VxFS parameters for the specified file systems.

The vxtunefs command works on a list of mount points specified on the command line, or all the mounted file systems listed in the tunefstab file. The default tunefstab file is /etc/vx/tunefstab. You can change the default using the -f option.

The vxtunefs command can be run at any time on a mounted file system, and all parameter changes take immediate effect. Parameters specified on the command line override parameters listed in the tunefstab file.

If /etc/vx/tunefstab exists, the VxFS-specific mount command invokes vxtunefs to set device parameters from /etc/vx/tunefstab. The VxFS-specific mount command interacts with VxVM to obtain default values for the tunables, so you need to specify tunables for VxVM devices only to change the defaults.

Only a privileged user can run vxtunefs.

NOTES

The vxtunefs command works with Storage Checkpoints; however, VxFS tunables apply to an entire file system. Therefore tunables affect not only the primary fileset, but also any Storage Checkpoint filesets within that file system.

The tunable qio_cache_enable is not supported on HP-UX.

Cluster File System Issues

Whether specified by the command line or the tunefstab file, tunable parameters are propagated to all nodes in the cluster.

CFS requires more memory to manage distributed operations, so you may want to adjust related tunables for a cluster mounted file system.

The max_seqio_extent_size and initial_extent_size parameters are in-memory values that take effect only when the invoking node is the primary, or later becomes the primary, node in the cluster.

Multiple Volume Set Considerations

When using file systems in multiple volume sets, VxFS sets the VxFS tunables based on the geometry of the first component volume (volume 0) in the volume set.

OPTIONS

-f filename

Use filename instead of /etc/vx/tunefstab as the file containing tuning parameters.

-o parameter=value

Specifies parameters for the file systems listed on the command line.

-p

Prints the tuning parameters for all the file systems specified on the command line.

-s

Sets the new tuning parameters for the Veritas File Systems specified on the command line or in the tunefstab file.
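As an illustration of how the options combine, the following sketch builds argument lists that follow the SYNOPSIS above. The helper name build_vxtunefs_args and the mount point /mnt1 are hypothetical, and the sketch assumes a single -o parameter per invocation because that is all the SYNOPSIS shows. It only constructs lists; it does not run vxtunefs.

```python
def build_vxtunefs_args(targets, mode="print", parameter=None, value=None,
                        tunefstab=None):
    """Build an argv list shaped like the SYNOPSIS (hypothetical helper)."""
    if not targets:
        raise ValueError("need at least one mount point or special device")
    if mode == "print":
        # The SYNOPSIS groups -f and -o under -s, so reject them with -p.
        if parameter is not None or tunefstab is not None:
            raise ValueError("-f and -o belong to -s in the SYNOPSIS")
        return ["vxtunefs", "-p", *targets]
    if mode != "set":
        raise ValueError("mode must be 'print' or 'set'")
    if parameter is not None and tunefstab is not None:
        raise ValueError("the SYNOPSIS shows -f and -o as alternatives")
    args = ["vxtunefs", "-s"]
    if tunefstab is not None:
        args += ["-f", tunefstab]
    elif parameter is not None:
        if value is None:
            raise ValueError("a parameter needs a value")
        args += ["-o", f"{parameter}={value}"]
    return args + list(targets)


if __name__ == "__main__":
    print(build_vxtunefs_args(["/mnt1"]))
    print(build_vxtunefs_args(["/mnt1"], mode="set",
                              parameter="read_pref_io", value="64k"))
```

The resulting list could be passed to a process launcher by a privileged user, since only a privileged user can run vxtunefs.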

VxFS Tuning Parameters and Guidelines

The values for all the following parameters except fcl_keeptime, fcl_winterval, read_nstream, write_nstream, and qio_cache_enable can be specified in bytes, kilobytes, megabytes, gigabytes, terabytes, or sectors (1024 bytes) by appending k, K, m, M, g, G, t, T, s, or S. A value given in bytes needs no suffix.
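The suffix rule can be sketched as a small converter. This is an illustration only, not the parser vxtunefs uses: the function name is hypothetical, and it assumes binary multiples (1 kilobyte = 1024 bytes), which this page does not state explicitly. The sector size of 1024 bytes is taken from the text above.

```python
# Bytes per unit. "s" (sector) is 1024 bytes according to this page;
# the binary multiples for k/m/g/t are an assumption.
_SUFFIX_BYTES = {
    "k": 1024,
    "m": 1024 ** 2,
    "g": 1024 ** 3,
    "t": 1024 ** 4,
    "s": 1024,
}


def tunable_value_to_bytes(text):
    """Convert a value such as '256K' or '512' to a byte count."""
    text = text.strip()
    if not text:
        raise ValueError("empty value")
    suffix = text[-1].lower()
    if suffix in _SUFFIX_BYTES:
        number, multiplier = text[:-1], _SUFFIX_BYTES[suffix]
    else:
        # No suffix means the value is already in bytes.
        number, multiplier = text, 1
    if not (number.isascii() and number.isdigit()):
        raise ValueError(f"not a valid size: {text!r}")
    return int(number) * multiplier


if __name__ == "__main__":
    for sample in ("256K", "64k", "512", "2s"):
        print(sample, tunable_value_to_bytes(sample))
```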

If the file system is being used with VxVM, it is advisable to let the parameters use the default values based on the volume geometry.

If the file system is being used with a hardware disk array, align the parameters to match the geometry of the logical disk. For disk striping and RAID-5 configurations, set read_pref_io to the stripe unit size or interleave factor and set read_nstream to be the number of columns. For disk striping configurations, set write_pref_io and write_nstream to the same values as read_pref_io and read_nstream, but for RAID-5 configurations, set write_pref_io to the full stripe size and set write_nstream to 1.
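The guideline for hardware arrays can be written out as a small helper. This is a sketch under stated assumptions: the function name is hypothetical, and the full stripe size for RAID-5 is passed in by the caller rather than computed, because this page does not say how parity columns are counted.

```python
def suggest_io_tunables(stripe_unit, columns, raid5=False,
                        full_stripe_size=None):
    """Return tunable values that follow the striping/RAID-5 guideline."""
    if stripe_unit <= 0 or columns <= 0:
        raise ValueError("stripe_unit and columns must be positive")
    tunables = {"read_pref_io": stripe_unit, "read_nstream": columns}
    if raid5:
        # RAID-5: write whole stripes, one stream at a time.
        if full_stripe_size is None:
            raise ValueError("RAID-5 needs the full stripe size")
        tunables["write_pref_io"] = full_stripe_size
        tunables["write_nstream"] = 1
    else:
        # Plain striping: writes mirror the read settings.
        tunables["write_pref_io"] = stripe_unit
        tunables["write_nstream"] = columns
    return tunables


if __name__ == "__main__":
    print(suggest_io_tunables(64 * 1024, 4))
    print(suggest_io_tunables(64 * 1024, 4, raid5=True,
                              full_stripe_size=192 * 1024))
```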

For an application to do efficient direct I/O or discovered direct I/O, it should issue read requests that are equal to the product of read_nstream and read_pref_io. In general, any multiple or factor of read_nstream multiplied by read_pref_io is a good size for performance. For writing, the same general rule applies to the write_pref_io and write_nstream parameters. When tuning a file system, the best approach is to try out the tuning parameters under a real workload.
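The "multiple or factor" rule can be checked mechanically. The function below is a hypothetical helper for illustration; it tests only the arithmetic rule stated above and says nothing about how a given device actually performs.

```python
def is_efficient_request_size(request_size, pref_io, nstream):
    """True if request_size is a multiple or a factor of pref_io * nstream."""
    product = pref_io * nstream
    if request_size <= 0 or product <= 0:
        raise ValueError("sizes must be positive")
    return request_size % product == 0 or product % request_size == 0


if __name__ == "__main__":
    # Example values: read_pref_io=64K and read_nstream=4 give a 256K product.
    for size in (262144, 524288, 131072, 100000):
        print(size, is_efficient_request_size(size, 65536, 4))
```

The same check applies to writes by passing write_pref_io and write_nstream.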

If an application is doing sequential I/O to large files, the application should issue requests larger than discovered_direct_iosz. The file system then performs these requests as discovered direct I/O, which is unbuffered like direct I/O but does not require synchronous inode updates when extending the file. If the file is too large to fit in the cache, using unbuffered I/O avoids pushing useful data out of the cache and lowers CPU overhead.

discovered_direct_iosz

Any file I/O requests larger than the discovered_direct_iosz are handled as discovered direct I/O. A discovered direct I/O is unbuffered like direct I/O, but it does not require a synchronous commit of the inode when the file is extended or blocks are allocated. For larger I/O requests, the CPU time for copying the data into the buffer cache and the cost of using memory to buffer the I/O become more expensive than the cost of doing the disk I/O. For these I/O requests, using discovered direct I/O is more efficient than regular I/O. The default value of this parameter is 256K.

fcl_holesize

Specifies the hole size (in 8K units) that is punched (or freed) in the FCL file. No hole is punched if the FCL file exceeds fcl_maxalloc bytes and the life of the oldest record has not reached fcl_keeptime seconds.

fcl_keeptime

Specifies the minimum amount of time, in seconds, that the VxFS File Change Log (FCL) keeps records in the log. When the oldest 8K block of FCL records has been kept longer than the value of fcl_keeptime, it is purged from the FCL and the extents nearest to the beginning of the FCL file are freed. This process is referred to as "punching a hole." Holes are punched in the FCL file in 8K chunks.

If the fcl_maxalloc parameter is set, records are purged from the FCL when the amount of space allocated to the FCL exceeds fcl_maxalloc. This purge occurs even if the elapsed time that the records have been in the log is less than the value of fcl_keeptime. If the file system runs out of space before fcl_keeptime is reached, the FCL is deactivated.

Either or both of the fcl_keeptime and fcl_maxalloc parameters must be set before the File Change Log can be activated. fcl_keeptime operates only on Version 6 or higher disk layout file systems.

fcl_maxalloc

Specifies the maximum amount of space that can be allocated to the VxFS File Change Log. The FCL file is a sparse file that grows as changes occur in the file system. When the space allocated to the FCL file reaches the fcl_maxalloc value, the oldest FCL records are purged from the FCL and the extents nearest to the beginning of the FCL file are freed. This process is referred to as "punching a hole." Holes are punched in the FCL file in 8K chunks. If the file system runs out of space before fcl_maxalloc is reached, the FCL is deactivated.

Either or both of the fcl_maxalloc and fcl_keeptime parameters must be set before the File Change Log can be activated. fcl_maxalloc operates only on Version 6 or higher disk layout file systems.

fcl_ointerval

Specifies the time interval in seconds within which subsequent opens of a file do not produce an additional FCL record. This helps to reduce the number of repetitive file-open records logged in the FCL, especially in the case of frequent accesses through NFS. If the tracking of access information is also enabled, a subsequent file open event within fcl_ointerval might produce a record if the later open is by a different user. Similarly, if an inode goes out of cache and returns, or if there is an FCL sync, there might be more than one file open record within the same open interval. The default value is 600 seconds.

fcl_winterval

Specifies the time, in seconds, that must elapse before the VxFS File Change Log records a data overwrite, data extending write, or data truncate for a file. The ability to limit the number of repetitive FCL records for continuous writes to the same file is important for file system performance and for applications processing the FCL. fcl_winterval is best set to an interval less than the shortest interval between reads of the FCL by any application. This way all applications using the FCL can be assured of finding at least one FCL record for any file experiencing continuous data changes.

fcl_winterval is enforced for all files in the file system. Each file maintains its own time stamps, and the elapsed time between FCL records is per file. This elapsed time can be overridden using the VxFS FCL sync public API (see the vxfs_fcl_sync(3) manual page).

fcl_winterval operates only on Version 6 or higher disk layout file systems.
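The per-file behavior of fcl_winterval can be modeled in a few lines. This is a simplified illustration and not the VxFS implementation: the class name, the use of inode numbers as keys, and the 3600-second example interval are all assumptions made for the sketch, and the override through the FCL sync API is not modeled.

```python
class WriteRecordThrottle:
    """Toy model: at most one write record per file per fcl_winterval."""

    def __init__(self, fcl_winterval):
        self.fcl_winterval = fcl_winterval
        self._last_record = {}  # inode number -> time of last record

    def should_record(self, inode, now):
        """Return True if a write at time `now` (seconds) produces a record."""
        last = self._last_record.get(inode)
        if last is None or now - last >= self.fcl_winterval:
            # Each file keeps its own time stamp.
            self._last_record[inode] = now
            return True
        return False


if __name__ == "__main__":
    throttle = WriteRecordThrottle(fcl_winterval=3600)  # example value
    for inode, now in ((7, 0), (7, 100), (8, 100), (7, 3600)):
        print(inode, now, throttle.should_record(inode, now))
```

In this model, an application that reads the FCL less often than once per interval still sees at least one record for a continuously written file, which is the property the guideline above is after.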

hsm_write_prealloc

For a file managed by a hierarchical storage management (HSM) application, hsm_write_prealloc preallocates disk blocks before data is migrated back into the file system. An HSM application usually migrates the data back through a series of writes to the file, each of which allocates a few blocks. By setting hsm_write_prealloc (hsm_write_prealloc=1), a sufficient number of disk blocks will be allocated on the first write to the empty file so that no disk block allocation is required for subsequent writes, which improves the write performance during migration.

The hsm_write_prealloc parameter is implemented outside of the DMAPI specification, and its usage has limitations depending on how the space within an HSM controlled file is managed. It is advisable to use hsm_write_prealloc only when recommended by the HSM application controlling the file system.

initial_extent_size

Changes the default size of the initial extent.

VxFS determines, based on the first write to a new file, the size of the first extent to allocate to the file. Typically the first extent is the smallest power of 2 that is larger than the size of the first write. If that power of 2 is less than 8K, the first extent allocated is 8K. After the initial extent, the file system increases the size of subsequent extents (see max_seqio_extent_size) with each allocation.
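The default sizing rule for the first extent can be sketched as follows. The function name is hypothetical, sizes are in bytes, and the code takes "larger than" literally, so a first write that is exactly a power of 2 rounds up to the next one; this page does not say how that boundary case is actually handled.

```python
def default_first_extent_bytes(first_write_bytes, minimum=8 * 1024):
    """Smallest power of 2 larger than the first write, but at least 8K."""
    if first_write_bytes <= 0:
        raise ValueError("first write size must be positive")
    extent = 1
    while extent <= first_write_bytes:
        extent *= 2
    return max(extent, minimum)


if __name__ == "__main__":
    for size in (3000, 10000, 100000):
        print(size, default_first_extent_bytes(size))
```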

Because most applications write to files using a buffer size of 8K or less, the increasing extents start doubling from a small initial extent. initial_extent_size changes the default initial extent size to a larger value, so the doubling policy starts from a much larger initial size, and the file system does not allocate a set of small extents at the start of a file.

Use this parameter only on file systems that have a very large average file size. On such file systems, there are fewer extents per file and less fragmentation.

initial_extent_size is measured in file system blocks.

inode_aging_count

Specifies the maximum number of inodes to place on an inode aging list. Inode aging is used in conjunction with file system Storage Checkpoints to allow quick restoration of large, recently deleted files. The aging list is maintained in first-in-first-out (FIFO) order up to the maximum number of inodes specified by inode_aging_count. As newer inodes are placed on the list, older inodes are removed to complete their aging process. For best performance, it is advisable to age only a limited number of larger files before completion of the removal process. The default maximum number of inodes to age is 2048.

inode_aging_size

Specifies the minimum size to qualify a deleted inode for inode aging. Inode aging is used in conjunction with file system Storage Checkpoints to allow quick restoration of large, recently deleted files. For best performance, it is advisable to age only a limited number of larger files before completion of the removal process. Setting the size too low can push larger file inodes out of the aging queue to make room for newly removed smaller file inodes.
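The interaction between inode_aging_count and inode_aging_size can be modeled as a bounded FIFO. This is a toy model, not the VxFS data structure: the class and method names are hypothetical, and the size threshold is passed in explicitly because this page gives no default for inode_aging_size.

```python
from collections import deque


class InodeAgingList:
    """Toy FIFO model of inode aging."""

    def __init__(self, inode_aging_count, inode_aging_size):
        self.inode_aging_count = inode_aging_count
        self.inode_aging_size = inode_aging_size
        self._aging = deque()

    def on_delete(self, inode, size_bytes):
        """Return the inodes whose removal completes because of this delete."""
        if size_bytes < self.inode_aging_size:
            # Too small to qualify: removed immediately, never aged.
            return [inode]
        self._aging.append(inode)
        finished = []
        while len(self._aging) > self.inode_aging_count:
            # The oldest aged inode leaves the list first.
            finished.append(self._aging.popleft())
        return finished

    def aging_inodes(self):
        return list(self._aging)


if __name__ == "__main__":
    aging = InodeAgingList(inode_aging_count=2, inode_aging_size=1024)
    for inode, size in ((1, 10), (2, 4096), (3, 4096), (4, 4096)):
        print(inode, aging.on_delete(inode, size), aging.aging_inodes())
```

The model also shows why a low inode_aging_size is risky: every small file that qualifies takes a slot and pushes an older, possibly larger, file off the list.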

max_buf_data_size

Determines the maximum buffer size allocated for file data. The two accepted values are 8K bytes and 64K bytes. The larger value can be beneficial for workloads where large reads/writes are performed sequentially. The smaller value is preferable on workloads where the I/O is random or is done in small chunks. The default value is 8K bytes.

max_direct_iosz

Maximum size of a direct I/O request issued by the file system. If there is a larger I/O request, it is broken up into max_direct_iosz chunks. This parameter defines how much memory an I/O request can lock at once; do not set it to more than 20% of memory.
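The chunking behavior and the 20% guideline can be illustrated with two small helpers. Both names are hypothetical, and the split shown is only the arithmetic of cutting a request into pieces of at most max_direct_iosz bytes; how the file system schedules those pieces is not described on this page.

```python
def split_direct_io(offset, length, max_direct_iosz):
    """Split one request into (offset, size) chunks of at most max_direct_iosz."""
    if max_direct_iosz <= 0:
        raise ValueError("max_direct_iosz must be positive")
    chunks = []
    while length > 0:
        size = min(length, max_direct_iosz)
        chunks.append((offset, size))
        offset += size
        length -= size
    return chunks


def max_direct_iosz_ceiling(total_memory_bytes):
    """Upper bound suggested by the guideline: 20% of memory."""
    return total_memory_bytes // 5


if __name__ == "__main__":
    print(split_direct_io(0, 2_500_000, 1_048_576))
    print(max_direct_iosz_ceiling(8 * 1024 ** 3))
```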

max_diskq

Specifies the maximum disk queue generated by a single file. If the number of dirty buffers in the disk queue exceeds this limit, the file system prevents writing more data to disk until the amount of data decreases. The default value is 1 megabyte.

Although it does not limit the actual disk queue, max_diskq prevents processes that flush data to disk, such as fsync, from making the system unresponsive.

See the write_throttle description for more information on pages and system memory.

max_seqio_extent_size

Increases or decreases the maximum size of an extent. When the file system is following its default allocation policy for sequential writes to a file, it allocates an initial extent that is large enough for the first write to the file. When additional extents are allocated, they are progressively larger (the algorithm tries to double the size of the file with each new extent), so each extent can hold several writes worth of data. This reduces the total number of extents in anticipation of continued sequential writes. When there are no more writes to the file, unused space is freed for other files to use.

In general, this allocation stops increasing the size of extents at 2048 blocks, which prevents one file from holding too much unused space.

max_seqio_extent_size is measured in file system blocks.
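The growth pattern described above can be approximated with a short simulation. This is a deliberately simplified model under stated assumptions: the function name is hypothetical, each new extent is sized to match the blocks already allocated (so the file doubles), and the cap defaults to the 2048 blocks mentioned above. The real allocator also depends on free space and on the actual write pattern, which the model ignores.

```python
def simulate_extent_sizes(initial_blocks, count, max_seqio_extent_size=2048):
    """Return the sizes, in blocks, of the first `count` extents of a file."""
    if initial_blocks <= 0 or count < 0:
        raise ValueError("initial_blocks must be positive, count non-negative")
    sizes = []
    allocated = 0
    for _ in range(count):
        # Doubling the file means the next extent equals what is allocated.
        wanted = initial_blocks if allocated == 0 else allocated
        size = min(wanted, max_seqio_extent_size)
        sizes.append(size)
        allocated += size
    return sizes


if __name__ == "__main__":
    print(simulate_extent_sizes(8, 6))
    print(simulate_extent_sizes(512, 5))
```

Raising max_seqio_extent_size in this model lets the doubling continue longer, which means fewer extents for a large file at the cost of more space held while the file is being written.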

qio_cache_enable

Enables or disables caching on Quick I/O for Database files. The default behavior is to disable caching. To enable caching, set qio_cache_enable to 1.

On systems with large amounts of memory, the database cannot always use all of the memory as a cache. By enabling file system caching as a second level cache, performance can improve.

If the database is performing sequential scans of tables, the scans can run faster by enabling file system caching so the file system performs aggressive read aheads on the files.

read_ahead

In the absence of a specific caching advisory, the default for all VxFS read operations is to perform sequential read ahead. The enhanced read ahead functionality implements an algorithm that allows read aheads to detect more elaborate patterns (such as increasing or decreasing read offsets, or multithreaded file accesses) in addition to simple sequential reads. You can specify the following values for read_ahead:

0

Disables read ahead functionality.

1

Retains traditional sequential read ahead behavior.

2

Enables enhanced read ahead for all reads.

By default, read_ahead is set to 1, that is, VxFS detects only sequential patterns.

read_ahead detects patterns on a per-thread basis, up to a maximum of vx_era_nthreads. The default number of threads is 5; however, you can change the default value by setting the vx_era_nthreads parameter in the system configuration file, /etc/system.

read_nstream

The number of parallel read requests of size read_pref_io that can be outstanding at one time. The file system uses the product of read_nstream and read_pref_io to determine its read ahead size. The default value for read_nstream is 1.

read_pref_io

The preferred read request size. The file system uses this in conjunction with the read_nstream value to determine how much data to read ahead. The default value is 64K.

write_nstream

The number of parallel write requests of size write_pref_io that can be outstanding at one time. The file system uses the product of write_nstream and write_pref_io to determine when to do flush behind on writes. The default value for write_nstream is 1.

write_pref_io

The preferred write request size. The file system uses this in conjunction with the write_nstream value to determine how to do flush behind on writes. The default value is 64K.

write_throttle

When data is written to a file through buffered writes, the file system updates only the in-memory image of the file, creating what are referred to as dirty buffers. Dirty buffers are cleaned when the file system later writes the data in these buffers to disk. (Note that data can be lost if the system crashes before dirty buffers are written to disk.)

Newer model computer systems typically have more memory. The more physical memory a system has, the more dirty buffers the file system can generate before having to write the buffers to disk to free up memory. More dirty buffers can therefore lead to longer return times for operations that write dirty buffers to disk, such as sync and fsync. If your system has a combination of a slow storage device and a large amount of memory, sync operations may take so long to complete that the system appears to hang.

If your system is exhibiting this behavior, you can change the value of write_throttle. write_throttle lets you lower the number of dirty buffers per file that the file system will generate before writing them to disk. After the number of dirty buffers for a file reaches the write_throttle threshold, the file system starts flushing buffers to disk even if free memory is still available. Depending on the speed of the storage device, user write performance may suffer, but the number of dirty buffers is limited, so sync operations will complete much faster.

The default value of write_throttle is zero. The default value places no limit on the number of dirty buffers per file. This typically generates a large number of dirty buffers, but maintains fast writes. If write_throttle is non-zero, VxFS limits the number of dirty buffers per file to write_throttle buffers. In some cases, write_throttle may delay write requests. For example, lowering the value of write_throttle may increase the file disk queue to the max_diskq value, delaying user writes until the disk queue decreases. So unless the system has a combination of large physical memory and slow storage devices, it is advisable not to change the value of write_throttle.

FILES

/etc/vx/tunefstab

VxFS tuning parameters table.

