Date: 2012-03-31
OS: SLES
OS Version: 10 SP3

Symantec FileStore 5.6 RP1 P4 Patch Upgrade README

CONTENTS
I. Overview
II. Upgrade procedure
III. Fixes in the new patch
IV. Known issues
V. New Features

PATCH ID              : N/A
PATCH NAME            : FileStore-sles10_x86_64-patch-5.6RP1P4.tar.gz
BASE PACKAGE NAME     : Symantec FileStore
BASE PACKAGE VERSION  : 5.6
OBSOLETE PATCHES      : N/A
SUPERSEDED PATCHES    : N/A
INCOMPATIBLE PATCHES  : N/A
SUPPORTED OS          : SLES
SUPPORTED OS VERSION  : SLES 10 SP3
CREATION DATE         : 2012-03-31
CATEGORY              : enhancement, performance issue
REBOOT REQUIRED       : Yes
SUPPORTS ROLLBACK     : NO

I. OVERVIEW:
------------
Symantec FileStore provides a scalable clustered storage solution. This document provides release information for the patch.

II. UPGRADE PROCEDURE:
----------------------
After you have installed or synchronized a new Symantec FileStore patch into your cluster, the list of available commands may change. Please log in to the CLI again to access the updated features.

IMPORTANT: Services are unavailable during an upgrade. The actual downtime will be a little longer than the time it takes to reboot the system. To avoid data loss, Symantec recommends that customers stop I/O processing completely during a patch upgrade.

After you apply this patch, you cannot uninstall it.

The 5.6 RP1 P4 patch can only be installed on one of the following versions: 5.6, 5.6 P1, 5.6 P2, 5.6 P3, 5.6 RP1, 5.6 RP1 P1, 5.6 RP1 P2, or 5.6 RP1 P3.

To install the patch:
1. Log in as master:
   su - master
2. Start the patch install:
   upgrade patch install

IMPORTANT: Full upgrade instructions are included in the Symantec FileStore 5.6 Release Notes. Please note the following revisions when you upgrade:

* Symantec recommends that you remove I/O fencing before upgrading any cluster node or exporting your current configuration. Use the Storage> fencing off command first, followed by the Storage> fencing destroy command, to remove I/O fencing. This step is not required, but it is suggested for a clean upgrade.
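
The list of base versions in the upgrade procedure above can be expressed as a simple pre-check. The following is a minimal shell sketch, not part of the FileStore CLI: the function name is_supported_base and the idea of passing the installed version as a plain string are assumptions made for illustration only.

```shell
# Sketch only: succeeds when the given version string is one of the base
# versions this README lists as valid for installing the 5.6 RP1 P4 patch.
# is_supported_base is a hypothetical helper, not a FileStore command.
is_supported_base() {
    case "$1" in
        "5.6"|"5.6 P1"|"5.6 P2"|"5.6 P3")
            return 0 ;;
        "5.6 RP1"|"5.6 RP1 P1"|"5.6 RP1 P2"|"5.6 RP1 P3")
            return 0 ;;
        *)
            return 1 ;;
    esac
}

# Example use with a version string typed in by the administrator.
if is_supported_base "5.6 RP1 P2"; then
    echo "eligible for the 5.6 RP1 P4 patch"
else
    echo "not a supported base version"
fi
```

The check is deliberately an exact string match, so any version not named in this README (including 5.6 RP1 P4 itself) is rejected.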

III. FIXES IN THE NEW PATCH:
----------------------------
Etrack incidents:
2412665, 2632286, 2645201, 2635192, 2609154, 2603230, 2700305, 2697888,
2696509, 2703954, 2363148, 2619197, 2414725, 2704806, 2623088, 2695286,
2607847, 2644523, 2692509, 2697455, 2585532, 2700298, 2701830, 2645581,
2691337, 2431161, 2688830, 2667552, 2615036, 2626621, 2650787, 2565270,
2691918, 2663456, 2661303, 2568999, 2636522, 2706432, 2639323, 2661144,
2654444, 2634158, 2405104, 2626241, 2646819, 2422001, 2615104, 2420331,
2481742, 2611496, 2607364, 2044315, 2609091, 2693895, 2707725, 2100531,
2687622, 2717361, 2700301, 2659917, 2613661, 2703160, 2629548, 2712744,
2711186, 2688851, 2644277

Errors/Problems Fixed:
2414709, 2412665 Enable changing the cluster name without re-installation.
2632284, 2632286 Add vote_sys_files as the default CIFS share option.
2634543, 2645201 CIFS needs support for hidden shares (suffix with $).
2639049, 2635192 Existing segregated share error message seen when attempting to reuse a share name.
2518550, 2609154 FILESTORE[GUI]: SFS_DB log fills up NLM filesystem.
2603348, 2603230 FILESTORE[REPLICATION]: Jobs failed: Failed to mount checkpoint at destination.
2407637, 2700305 FileStore: NFS: offline of NFS group takes a long time when there are a lot of shares (2 seconds per share).
2697881, 2697888 FileStore: NFS: Make NLM one-way udp nat for mac as default.
2697087, 2696509 Filestore::Antivirus:: Node on which patch is getting updated hangs.
2703954 Filestore::DST:: Even if not enough space is available in the pool, the "tier add" command adds the tier with an improper size and also gives the insufficient-space message.
2353265, 2363148 File system is not allowed to grow if usage is above 95%.
2620455, 2619197 [CIFS] Usability improvement on setting data migration option.
2411464, 2414725 SFS5.6P1: Replication issue is related with target cluster.
Target replication log shows that message apply failed to write the on-disk sync record (the marker up to which point messages have been applied).
2706610, 2704806 "Internal configuration database is 90% full" messages display again and again after `CLI> support gui db rescan` completed.
2623548, 2623088 /var/log/sfsfs_event.log does not have antivirus scan job detecting virus info on node_01, but it is detecting virus info on node_02.
2695286 GUI>AntiVirus>LiveUpdate and Quarantine cannot be updated to current by the refresh operation. They should be updated in a timely manner.
2608134, 2607847 GUI> File Systems> File Systems, click-FSname shows FileSystem Details involving Tier Summary; Secondary Tier info is not updated by the refresh button or the `CLI> support gui db rescan` command.
2642218, 2644523 `cli> replication job pause` does not work while `replication job resync` is running. `replication job status` shows "paused" but `ps` shows that `rsync` is still running.
2692509 After PXEboot/install, system does not make a copy of crontab settings for replication scheduled jobs. These settings should be copied.
2706324, 2697455 Antivirus scan/autoprotect does not support *.bz2 files compressed by /usr/bin/bzip2. This should be supported.
2584716, 2585532 cifs service got "Faulted Shares" during antivirus scan. This issue did not occur on the same system running 5.6P1.
2696937, 2700298 Default number of nfsd is too big; primary node (ConIP node) becomes 100% busy very quickly. Default should be a safe/small value like 8.
2651509, 2701830 In cli> cifs share add, the "ip=virtual_ip" option cannot be used on 5.6RP1P2, although the user can use it on 5.6P1. This should be announced to customers before the actual drop.
2593659, 2645581 non_scan.tar.gz built from certain simple/small text files causes "extraction error by Decomposer"; other files seem to get scanned.
2641022, 2691337 Under antivirus autoprotect, rtvscand detects a virus 60 seconds after the nfs write completed. It cannot be called "realtime work".
2491205, 2431161 Support FTP across file systems.
2618306, 2688830 Exclude Samba from supportconfig.
2427859, 2667552 If SNAPSHOT is executed by the schedule simultaneously, no file system executes the snapshot.
2525944, 2615036 Add new command to mount tmpfs to speed up antivirus scan.
2625895, 2626621 CVM ServiceGroup timeout with 4-node cluster.
2638369, 2650787 /var file system usage increases because of the Samba logs.
2572650, 2565270 Combine information collection at installation.
2690606, 2691918 File system utilization is not shown correctly in GUI (does not refresh).
2662857, 2663456 Slow throughput and large LLT packets when accessing a file through the second node.
2662848, 2661303 [Replication] Make replication rsync time unlimited.
2568995, 2568999 [SAV] Provide option to toggle READ and WRITE operation scanning for performance improvement.
2634587, 2636522 Cannot access homedir share if the slave node is powered off.
2217904, 2706432 When scanning a file system with a large number of files for viruses, the process rtvscand occupies a lot of CPU.
2634576, 2639323 The homedirfs default quota function does not take effect.
2590238, 2661144 The user master can log in to the system dir and create a new pool in VxFS.
2654444 When all cluster nodes reboot, one node hangs.
2634561, 2634158 Destroying the file system leads to a reboot of the primary node.
2414804, 2405104 Output of 'fs list' is different from output of 'fs list fsname'.
2626602, 2626241 After installing the new version and importing the disk group, the CVM of one node cannot be brought online.
2647154, 2646819 The data is not the same as the original data after resuming a file transfer from the breakpoint with the ftp function.
2664231, 2422001 When a LUN is mapped with a new id, the LUN status information is different between support and master mode.
2377234, 2615104 When running homedir show, the AD user aa16 shows multiple times.
2425134, 2420331 When the master node faulted, the other nodes cannot send syslog log messages.
2491197, 2481742 Assign multiple VIPs on one NIC.
2612696, 2611496 Antivirus section not available in clish after 5.6RP1P2 and n8k patch upgrade.
2598782, 2607364 ctdb monitor is broken in a single-node cluster.
2044143, 2044315 dump: Storage: clish crash when trying to create a single pool with more than 500 disks.
2609142, 2609091 Executing the command 'storage quota fs setdefault userquota softlimit numspace 100m' fails with error info: SFS quota ERROR V-288-2330. The softlimit numspace must be less than the default hardlimit numspace.
2685348, 2693895 upgrade show takes >1 minute to complete.
2707725 "logrotate: error: samba-winbind duplicate log" comes out periodically. The logrotate setting for samba should be fixed.
2099996, 2100531 Antivirus "Auto protect" does not detect a virus when renaming the virus file to "11". "Auto protect" should detect a virus in any filename.
2703051, 2687622 GUI does not support the new cifs share option, sharename@VIP, and three other issues related to GUI>share>cifs.
2717355, 2717361 GUI, "FileSystem" does not show the "Protocol(CIFS, NFS)" column properly in case of share-add for a sub-directory like /vx/NS1/dir1.
2700301 `CLI> storage quota fs setall groupquota` can't handle a group name involving "-" (hyphen) like "gr-hyphen".
2651874, 2659917 `antivirus set tmpfssize` needs improvements: 1) cli does not show value, 2) PXEboot node does not have tmpfs mount, 3) can change value while realtime scan is running.
2614308, 2613661 `cli> cluster reboot all` makes antivirus go from offline to online automatically. Status should be kept as offline.
2703160 `fsppadm enforce` running on multiple nodes is needed for tier jobs to complete safely during the night.
2630156, 2629548 Customer wants to suppress TCPConnTrack connection new/close event logs because there are about 200,000 such events every day.
2712744 Driver modules igb and ixgbe do not work properly after upgrading to 5.6RP1P2.
2711186 Old Antivirus messages are logged into /var/log/messages repeatedly.
2694244, 2688851 Reporting email can't send antivirus alert messages with Japanese file names; system reports "Invalid or incomplete multibyte or wide character".
2644755, 2644277 Under antivirus autoprotect with multiple large CIFS zip writes running, smbd seems to be frozen until rtvscand completes its scan. If this is by design, the documentation should mention it.

IV. KNOWN ISSUES:
-----------------
Etrack Incident: 2737179
SYMPTOM: Manual scan cannot detect a virus file that has no file extension (for example, 11) when the "File excluded extension list" setting is empty.
DESCRIPTION: This issue occurs only during a manual scan when the "File excluded extension list" setting is empty. The issue has been fixed for Auto-Protect scans, so Symantec suggests that you enable Auto-Protect to avoid this risk. If you do not want to enable Auto-Protect because an Auto-Protect scan consumes some CPU, you can use the workaround below.
RESOLUTION: The workaround is to keep at least one file extension in the "File excluded extension list" setting. You can set an unusual file extension, for example "dummyextension", that no real file is expected to have.

Etrack Incident: 2724702
SYMPTOM: For mapped users with the same user name for both CIFS/NFS shares using "full_acl," you may encounter permissions issues.
DESCRIPTION: If you map users with the same user name using both CIFS and NFS, users using "full_acl" may encounter permissions issues when trying to access directories or files created by NFS users.
RESOLUTION: The workaround is to set "no_full_acl."

Etrack Incident: 2700195
SYMPTOM: Missing information about support of Active Directory Japanese user names/groups from the Symantec FileStore documentation.
DESCRIPTION: Symantec FileStore documentation should include information on support of Active Directory Japanese user names/groups.
RESOLUTION: Symantec FileStore supports Japanese, Korean, and Chinese local user/group names in areas supported by SAMBA.

Etrack Incident: 2722856
SYMPTOM: While shutting down a node, unmounting a file system, or destroying a file system, it takes a long time to complete the operation.
DESCRIPTION: This happens if there are any pending snapshot operations on the file system. The unmount operation hangs until the snapshot operation is completed.
RESOLUTION: This patch introduces a way to postpone the snapshot operations, which helps to unmount the file system, destroy the file system, or shut down quickly. Applying this patch resolves the issue.

Etrack Incident: 2632963
SYMPTOM: The Storage> tier relocate command will not relocate all files, and the Storage> tier remove command requires all its policy files to be removed.
DESCRIPTION: The Storage> tier relocate command skips NDS files. NDS includes named data streams and extended attributes. The Storage> tier remove command is successful only if all the policies related to the tier are removed. The Symantec FileStore man pages have been updated to reflect this.
RESOLUTION: There is no resolution for the NDS relocation issue. Run the Storage> tier policy remove command before running the Storage> tier remove command.

Etrack Incident: 2738026
SYMPTOM: The Antivirus LiveUpdate jar file symbolic link on the master node and on newly PXE booted nodes points to different versions of the jar file.
DESCRIPTION: After upgrade to 5.6RP1P4, rpm of savjlu upgrades to version savjlu-1.0.12-8 on all nodes.
However, because the savjlu package upgrade differs between the master node and newly PXE booted nodes, the symbolic link /opt/Symantec/LiveUpdate/jlu.jar may point to different jar versions (jlu-3.5.1.34.jar or jlu-3.9.1.14.jar).
RESOLUTION: No workaround is needed because antivirus live update works fine with either jlu-3.5.1.34.jar or jlu-3.9.1.14.jar.

Etrack Incident: 2719915
SYMPTOM: numspace and numinodes parameters should be less than 2TB. Symantec FileStore will not allow setting these variables to 2TB or more.
DESCRIPTION: numspace and numinodes parameters should be less than 2TB. Symantec FileStore will not allow setting these variables to 2TB or more. This limitation exists because the quota component supports only 32 bits.
RESOLUTION: In Symantec FileStore 5.7 and onwards, the 2TB limitation is removed. If you need quota support equal to or more than 2TB, please upgrade to Symantec FileStore 5.7 and subsequent 5.7 patch releases.

Etrack Incident: 2645902
SYMPTOM: Replication jobs are failing with error "[SOURCE ERROR] Failed to get response from destination cluster." There are some core dumps generated because of a replication process segmentation fault.
DESCRIPTION: The maximum number of replication jobs is 64, but there are stricter limits on the number of replication jobs that can be running in parallel at the same time. Replication uses a RAM-based file system for storing the transit messages. Each GB of this RAM-based file system can accommodate up to 8 parallel running jobs. The default size of this file system depends upon the amount of physical memory of the node on which replication is running. If the physical memory is less than 5 GB, replication limits its maximum usage for storing messages to 1 GB of memory, which means the user can run up to 8 replication jobs in parallel at the same time.
If the physical memory is between 5 GB and 10 GB, replication limits its maximum usage for storing messages to 2 GB of memory, which means the user can run up to 16 replication jobs in parallel. If the physical memory is greater than 10 GB, replication limits its maximum usage for storing messages to 4 GB of memory, which means the user can run up to 32 jobs in parallel at the same time.

These default values can be changed using the following steps:
1. Stop the replication service.
2. Edit the line "Net FS Size: 1024 MB" in the file "/opt/VRTSnasgw/conf/sfsnet.conf" to set the proper tmpfs size. This step is needed on all nodes of the cluster. For example, to increase the size of tmpfs to 2 GB, edit the line to read "Net FS Size: 2048 MB".
3. The above changes take effect after you start the replication service.

To check the tmpfs size, you can use the following command:
# cat /opt/VRTSnasgw/conf/sfsnet.conf | grep "^Net FS Size:" | awk -F":" '{print $2}'

Note that the above values are based on a worst-case analysis. The typical memory consumption by these messages is very low because as soon as a message is sent, the corresponding memory is immediately freed. Thus, specifying a value for the tmpfs size does not mean that this much memory is actually consumed. The worst case occurs when replication is not able to send and apply messages quickly on the network. If you want to run more jobs than the above limits allow, it is advisable to schedule the jobs at different times.
RESOLUTION: Increase the size of the replication tmpfs depending upon the number of jobs you want to run in parallel.

V. NEW FEATURES:
-----------------
Etrack Incident: 2722268
A new tunable "fullspace" was added to the Storage> fs alert set and unset commands for 5.6 RP1 P4:
Storage> fs alert set numinodes|numspace|fullspace value [fs_name,...]
Storage> fs alert unset numinodes|numspace|fullspace [fs_name,...]

When a file system is 100% full and the user continues to run write I/Os on the file system, performance might slow down dramatically. The NFS clients might get stuck for a long time while waiting for the I/Os, which can cause an I/O hang. When a file system is 100% full and some of the files are being overwritten, small amounts of space can become available for write I/Os. In this scenario, however, the file system might have problems managing the free space, which degrades performance. For a file system to run efficiently, users should always reserve some space for the file system instead of using 100% of the space.

FileStore provides file system full protection. When file system usage exceeds the space alert value (80% by default), a warning message is sent to the users. If the user continues to write I/Os to the file system and file system usage reaches almost 100% (that is, 98%, tunable), all the NFS shares on the file system are automatically changed to read-only to prevent potential issues such as performance degradation when the file system is full. By default, file system full protection is turned off. To activate file system full protection, a user can run the command "Storage> fs alert set fullspace 98" (set full limit to 98%).

Etrack Incident: 2644277
Two new Antivirus> set commands were added for 5.6 RP1 P4:
o Antivirus> set tmpfssize size
o Antivirus> set autoprotect holdonclose [yes|no]

The Antivirus> set tmpfssize size command enables and mounts a tmpfs (temporary file storage facility) for accelerating Symantec AntiVirus for FileStore scans.

o To enable and mount a tmpfs that can take up to size (MB) of memory for accelerating Symantec AntiVirus for FileStore scans, enter the following:
Antivirus> set tmpfssize size
where size is the memory used by the tmpfs. A minimum value of 2048 for size is required. Set size to 0 to unmount and disable the tmpfs.
For example, to enable a tmpfs for Symantec AntiVirus for FileStore scans, enter the following:
Antivirus> set tmpfssize 2048

The Antivirus> set autoprotect holdonclose command determines whether the file close system call is held by the Auto-Protect kernel module until the realtime scan finishes.

o To set whether the file close system call is held by the Auto-Protect kernel module until the realtime Symantec AntiVirus for FileStore scan finishes, enter the following:
Antivirus> set autoprotect holdonclose yes | no
The default value of the holdonclose parameter is yes.

For example, to disable the realtime Symantec AntiVirus for FileStore scan holdonclose parameter, enter the following:
Antivirus> set autoprotect holdonclose no
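
The size rule stated for Antivirus> set tmpfssize (0 disables the tmpfs, any other value must be at least 2048) can be checked before the command is entered. The following is a minimal shell sketch under that reading of the rule. The function name check_tmpfssize and its messages are hypothetical and are not part of the FileStore CLI; the sketch only validates a number and does not run any Antivirus> command.

```shell
# Sketch only: validates a proposed value for "Antivirus> set tmpfssize".
# Rule taken from this README: 0 unmounts and disables the tmpfs;
# otherwise the size (in MB) must be at least 2048.
# check_tmpfssize is a hypothetical helper name.
check_tmpfssize() {
    size=$1
    # Reject empty input and anything that is not a plain non-negative integer.
    case "$size" in
        ''|*[!0-9]*)
            echo "invalid: size must be a non-negative integer"
            return 1 ;;
    esac
    if [ "$size" -eq 0 ]; then
        echo "ok: tmpfs will be unmounted and disabled"
    elif [ "$size" -ge 2048 ]; then
        echo "ok: tmpfs may use up to $size MB"
    else
        echo "invalid: minimum size is 2048"
        return 1
    fi
}

# Example use: validate a value before typing it into the CLI.
check_tmpfssize 4096
```

Keeping the check separate from the CLI call means a rejected value never reaches the cluster.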