Date: 2012-03-31
OS: SLES
OS Version: 10 SP3
Symantec FileStore 
5.6 RP1 P4 Patch Upgrade README

CONTENTS
I.   Overview
II.  Upgrade procedure
III. Fixes in the new patch
IV.  Known issues
V.   New Features

PATCH ID                        : N/A
PATCH NAME                      : FileStore-sles10_x86_64-patch-5.6RP1P4.tar.gz
BASE PACKAGE NAME               : Symantec FileStore
BASE PACKAGE VERSION            : 5.6
OBSOLETE PATCHES                : N/A
SUPERSEDED PATCHES              : N/A
INCOMPATIBLE PATCHES            : N/A
SUPPORTED OS                    : SLES
SUPPORTED OS VERSION            : SLES 10 SP3
CREATION DATE                   : 2012-03-31
CATEGORY                        : enhancement, performance issue
REBOOT REQUIRED                 : Yes
SUPPORTS ROLLBACK               : No


I. OVERVIEW:
------------
Symantec FileStore provides a scalable clustered storage solution. This document provides release information for the patch.


II. UPGRADE PROCEDURE:
----------------------
After you have installed or synchronized a new Symantec FileStore patch into your cluster, the list of available commands may change. Log in to the CLI again to access the updated features.

IMPORTANT: Services are unavailable during an upgrade. The actual downtime is slightly longer than the time it takes to reboot the system. To avoid data loss, Symantec recommends that customers stop I/O processing completely during a patch upgrade.

After you apply this patch, you cannot uninstall it. The 5.6 RP1 P4 patch can be installed only on 5.6, 5.6 P1, 5.6 P2, 5.6 P3, 5.6 RP1, 5.6 RP1 P1, 5.6 RP1 P2, or 5.6 RP1 P3.
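
The compatibility rule above amounts to a simple membership check. The following is a minimal sketch in Python and is not part of FileStore; the function name and the version-string format are assumptions made only for illustration.

```python
# Hypothetical helper, not a FileStore tool: checks whether an installed
# release is one of the base versions listed above for the 5.6 RP1 P4 patch.
SUPPORTED_BASE_VERSIONS = {
    "5.6", "5.6 P1", "5.6 P2", "5.6 P3",
    "5.6 RP1", "5.6 RP1 P1", "5.6 RP1 P2", "5.6 RP1 P3",
}


def can_install_rp1p4(installed_version: str) -> bool:
    """Return True if the patch may be installed on installed_version."""
    # Normalize whitespace and case so "5.6  rp1 p2" matches "5.6 RP1 P2".
    normalized = " ".join(installed_version.upper().split())
    return normalized in SUPPORTED_BASE_VERSIONS
```

For example, can_install_rp1p4("5.6 RP1 P3") is true, while a version string that is not in the list, such as "5.5", is rejected.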


To install the patch:
1. Log in as master:
   su - master
2. Start the patch install:
   upgrade patch install

IMPORTANT: Full upgrade instructions are included in the Symantec FileStore 5.6 Release Notes. Please note the following revision when you upgrade:

* Symantec recommends that you remove I/O fencing before upgrading any cluster node or exporting your current configuration. Use the Storage> fencing off command first followed by the Storage> fencing destroy command to remove I/O fencing. This step is not required, but it is suggested for a clean upgrade. 

III. FIXES IN THE NEW PATCH:
----------------------------

Etrack incidents:
2412665, 2632286, 2645201, 2635192, 2609154, 2603230, 2700305, 2697888, 2696509, 2703954, 2363148, 2619197, 2414725, 2704806, 2623088, 2695286, 2607847, 2644523, 2692509, 2697455, 2585532, 2700298, 2701830, 2645581, 2691337, 2431161, 2688830, 2667552, 2615036, 2626621, 2650787, 2565270, 2691918, 2663456, 2661303, 2568999, 2636522, 2706432, 2639323, 2661144, 2654444, 2634158, 2405104, 2626241, 2646819, 2422001, 2615104, 2420331, 2481742, 2611496, 2607364, 2044315, 2609091, 2693895, 2707725, 2100531, 2687622, 2717361, 2700301, 2659917, 2613661, 2703160, 2629548, 2712744, 2711186, 2688851, 2644277

Errors/Problems Fixed:

2414709, 2412665		Enable changing the cluster name without re-installation.
2632284, 2632286        	Add vote_sys_files as the default CIFS share option
2634543, 2645201        	CIFS needs support for hidden share (suffix with $)
2639049, 2635192        	Existing segregated share error message seen when attempting to reuse a share name
2518550, 2609154        	FILESTORE[GUI]: SFS_DB log fill up NLM filesystem
2603348, 2603230        	FILESTORE[REPLICATION]: Jobs failed: Failed to mount checkpoint at destination
2407637, 2700305        	FileStore: NFS: offline of NFS group takes a long time when there are a lot of shares. 2 seconds per share
2697881, 2697888        	FileStore: NFS: Make NLM one-way udp nat for mac as default
2697087, 2696509        	Filestore::Antivirus:: Node on which patch is getting updated hangs.
2703954           		Filestore::DST:: When not enough space is available in the pool, the "tier add" command adds the tier with an improper size and also displays the insufficient-space message.
2353265, 2363148        	File system is not allowed to grow if usage is above 95%
2620455, 2619197        	[CIFS] usability improvement on setting data migration option
2411464, 2414725        	SFS5.6P1: Replication issue is related with target cluster. Target replication log is showing that message apply failed to write on disk sync record (the marker, up to which point we have applied)
2706610, 2704806        	"Internal configuration database is 90% full" messages display again and again after `CLI> support gui db rescan` completed.
2623548, 2623088        	/var/log/sfsfs_event.log does not have antivirus scan job detecting virus info on node_01, but it is detecting virus info on node_02.
2695286           		GUI>AntiVirus>LiveUpdate and Quarantine cannot be updated to current by refresh operation. They should be updated in a timely manner.
2608134, 2607847        	GUI> File Systems> File Systems, click-FSname shows FileSystem Details involving Tier Summary, Secondary Tier info is not updated by refresh button or `CLI> support gui db rescan` command
2642218, 2644523        	`cli> replication job pause` does not work while `replication job resync` is running. `replication job status` shows "paused", but `ps` shows that `rsync` is still running.
2692509         	  	After PXEboot/install, system does not make a copy of crontab settings for replication scheduled jobs. These settings should be copied.
2706324, 2697455        	antivirus scan/autoprotect does not support *.bz2 files compressed by /usr/bin/bzip2. This should be supported.
2584716, 2585532        	cifs service got "Faulted Shares" during antivirus scan. This issue did not occur on the same system running 5.6P1.
2696937, 2700298        	Default number of nfsd is too big, primary node (ConIP node) becomes 100% busy very quickly. Default should be safe/small value like 8.
2651509, 2701830        	In cli> cifs share add, "ip=virtual_ip" option cannot be used on 5.6RP1P2, user can use it on 5.6P1. This should be announced to customers before actual drop.
2593659, 2645581        	non_scan.tar.gz built from certain simple/small text file causes "extraction error by Decomposer", other files seem to get scanned.
2641022, 2691337        	Under antivirus autoprotect, rtvscand detects a virus 60 seconds after the NFS write completes. This cannot be called "realtime" work.
2491205, 2431161        	Support FTP across file systems
2618306, 2688830        	Exclude Samba from supportconfig
2427859, 2667552        	If snapshots are scheduled to run simultaneously, no file system executes its snapshot.
2525944, 2615036        	Add new command to mount tmpfs to speed up antivirus scan
2625895, 2626621        	CVM ServiceGroup timeout with 4-node cluster
2638369, 2650787		/var file system usage increases because of the Samba logs.
2572650, 2565270        	Combine information collection at installation
2690606, 2691918        	File system utilization is not shown correctly in GUI (does not refresh)
2662857, 2663456        	Slow throughput and large LLT packets when accessing file through second node
2662848, 2661303        	[Replication] Make replication rsync time unlimited
2568995, 2568999        	[SAV] Provide option to toggle READ and WRITE operation scanning for performance improvement.
2634587, 2636522        	Cannot access homedir share if the slave node is powered off
2217904, 2706432        	When scanning a file system with a large number of files for viruses, the rtvscand process consumes a lot of CPU.
2634576, 2639323        	The function of homedirfs default Quota does not take effect.
2590238, 2661144        	The user master can login to system dir and create a new pool in VxFS.
2654444           		When all cluster nodes are rebooted, one node hangs.
2634561, 2634158        	Destroying the file system leads to the primary node reboot.
2414804, 2405104        	The output of 'fs list' differs from the output of 'fs list fsname'
2626602, 2626241        	After importing the disk group following installation of the new version, the CVM of one node cannot be brought online.
2647154, 2646819        	The data differs from the original data after an FTP transfer is resumed from the breakpoint
2664231, 2422001        	After mapping a LUN with a new ID, the LUN status information differs between support and master mode.
2377234, 2615104        	When running homedir show, the AD user aa16 shows multiple times
2425134, 2420331        	When the master node faulted, the other nodes cannot send syslog log messages
2491197, 2481742        	Assign multiple VIPs on one NIC
2612696, 2611496        	antivirus section not available in clish after 5.6RP1P2 and n8k patch upgrade.
2598782, 2607364        	ctdb monitor is broken in a single-node cluster
2044143, 2044315        	dump: Storage: clish crash when trying to create a single pool with more than 500 disks.
2609142, 2609091        	The command 'storage quota fs setdefault userquota softlimit numspace 100m' fails with the error: SFS quota ERROR V-288-2330. The softlimit numspace must be less than the default hardlimit numspace.
2685348, 2693895        	upgrade show takes >1 minute to complete
2707725         		"logrotate: error: samba-winbind duplicate log" appears periodically. The logrotate setting for Samba should be fixed.
2099996, 2100531		Antivirus "Auto protect" does not detect a virus when the virus file is renamed to "11". "Auto protect" should detect a virus under any file name.
2703051, 2687622		GUI does not support the new cifs share option, sharename@VIP, and three other issues related to GUI>share>cifs.
2717355, 2717361		GUI, "FileSystem" does not show the "Protocol(CIFS, NFS)" column properly in case of share-add for a sub-directory like /vx/NS1/dir1.
2700301         		`CLI>storage quota fs setall groupquota` cannot handle group names that contain "-" (hyphen), such as "gr-hyphen".
2651874, 2659917		`antivirus set tmpfssize` needs improvements: 1) the CLI does not show the value, 2) a PXE-booted node does not have the tmpfs mount, 3) the value can be changed while a realtime scan is running.
2614308, 2613661		`cli>cluster reboot all` automatically changes antivirus from offline to online. The status should remain offline.
2703160         		`fsppadm enforce` needs to run on multiple nodes so that tier jobs can complete safely during the night.
2630156, 2629548		Customer wants to suppress TCPConnTrack connection new/close event logs because the customer receives a very large number of messages (about 200,000 events every day).
2712744         		Driver modules igb and ixgbe do not work properly after upgrading to 5.6RP1P2.
2711186         		Old Antivirus messages are logged into /var/log/messages repeatedly.
2694244, 2688851		Reporting email cannot send an antivirus alert message that contains a Japanese file name; the system reports "Invalid or incomplete multibyte or wide character".
2644755, 2644277		Under antivirus autoprotect with multiple large-zip CIFS writes running, smbd appears to be frozen until rtvscand completes its scan. If this is by design, the documentation should mention it.




IV. KNOWN ISSUES:
-----------------
Etrack Incident: 2737179

SYMPTOM:
Manual scan cannot detect a virus file that has no file extension (for example, a file named 11) when the "File excluded extension list" setting is empty.

DESCRIPTION:
This issue occurs only during a manual scan when the "File excluded extension list" setting is empty. The issue has been fixed for Auto-Protect scans, so Symantec suggests that you enable Auto-Protect to avoid this risk. If you do not want to enable Auto-Protect because Auto-Protect scans consume additional CPU, use the workaround below.

RESOLUTION:
The workaround is to keep at least one file extension in the "File excluded extension list" setting. You can set an unusual file extension that no real file is expected to use, for example "dummyextension".

Etrack Incident: 2724702

SYMPTOM: For mapped users with the same user name for both CIFS and NFS shares using "full_acl", you may encounter permissions issues.

DESCRIPTION: If you map users with the same user name using both CIFS and NFS, users using "full_acl" may encounter permissions issues when trying to access directories or files created by NFS users.

RESOLUTION:

The workaround is to set "no_full_acl".

Etrack Incident: 2700195

SYMPTOM:
Missing information about support of Active Directory Japanese user names/groups from the Symantec FileStore documentation.

DESCRIPTION:
Symantec FileStore documentation should include information on support of Active Directory Japanese user names/groups.

RESOLUTION:
Symantec FileStore supports Japanese, Korean, and Chinese local user/group names in areas supported by SAMBA.

Etrack Incident: 2722856

SYMPTOM:
While shutting down a node, unmounting a file system or destroying a file system, it takes a long time to complete the operation.

DESCRIPTION:
While shutting down a node, unmounting a file system or destroying a file system, it takes a long time to complete the operation. This happens if there are any pending snapshot operations on the file system. The unmount operation hangs until the snapshot operation is completed.

RESOLUTION:
This patch introduces a way to postpone snapshot operations, which makes it possible to unmount or destroy the file system, or shut down a node, quickly. Applying this patch resolves the issue.
 
Etrack Incident: 2632963

SYMPTOM:
Storage> tier relocate command will not relocate all files, and the Storage> tier remove command requires all its policy files to be removed.

DESCRIPTION:
The Storage> tier relocate command skips NDS files. NDS includes named data streams and extended attributes. The Storage> tier remove command is successful only if all the policies related to the tier are removed. The Symantec FileStore man pages have been updated to reflect the same.

RESOLUTION:
There is no resolution for the NDS relocation issue. 
Run the Storage> tier policy remove <fs_name> command before running the Storage> tier remove command.

Etrack Incident: 2738026

SYMPTOM: 
The Antivirus LiveUpdate jar file symbolic link on the master node and on newly PXE-booted nodes points to different versions of the jar file.

DESCRIPTION:
After an upgrade to 5.6RP1P4, the savjlu rpm is upgraded to version savjlu-1.0.12-8 on all nodes. However, because the savjlu package upgrade differs between the master node and newly PXE-booted nodes, the symbolic link /opt/Symantec/LiveUpdate/jlu.jar can point to different jar versions (jlu-3.5.1.34.jar or jlu-3.9.1.14.jar).

RESOLUTION:
No workaround is needed because antivirus live update works correctly with either jlu-3.5.1.34.jar or jlu-3.9.1.14.jar.


Etrack Incident: 2719915

SYMPTOM:
The numspace and numinodes parameters must be less than 2TB. Symantec FileStore does not allow setting these parameters to 2TB or more.

DESCRIPTION:
The numspace and numinodes parameters must be less than 2TB. Symantec FileStore does not allow setting these parameters to 2TB or more. This limitation exists because the quota component supports only 32-bit values.

RESOLUTION:
In Symantec FileStore 5.7 and onwards, the 2TB limitation is removed. If you need quota support equal to or more than 2TB, please upgrade to Symantec FileStore 5.7 and subsequent 5.7 patch releases.

Etrack Incident: 2645902

SYMPTOM: 
Replication jobs are failing with error "[SOURCE ERROR] Failed to get response
from destination cluster." There are some core dumps generated because of
a replication process segmentation fault.

DESCRIPTION: 
The maximum number of replication jobs is 64, but there are stricter limits on the number of replication 
jobs that can run in parallel. Replication uses a RAM-based file system for storing the 
transit messages. Each GB of this RAM-based file system can accommodate up to 8 parallel running jobs. The 
default size of this file system depends upon the amount of physical memory of the node on which replication is 
running. If the physical memory is less than 5 GB, replication limits its maximum usage for storing messages to 
1 GB of memory, which means the user can run up to 8 replication jobs in parallel. If the 
physical memory is between 5 GB and 10 GB, replication limits its maximum usage for storing messages to 2 GB of 
memory, which means the user can run up to 16 replication jobs in parallel. If the physical memory is greater 
than 10 GB, replication limits its maximum usage for storing messages to 4 GB of memory, which means the user 
can run up to 32 jobs in parallel.
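
The default limits described above can be summarized in a few lines of Python. This is only an illustrative sketch, not FileStore code: the function name is hypothetical, and treating exactly 5 GB and exactly 10 GB as part of the middle range is an assumption, because the text does not state how the boundaries are handled.

```python
from typing import Tuple

JOBS_PER_GB = 8  # each GB of the RAM-based file system holds up to 8 parallel jobs


def default_replication_limits(physical_memory_gb: float) -> Tuple[int, int]:
    """Return (default message store size in GB, maximum parallel jobs)."""
    if physical_memory_gb < 5:
        store_gb = 1
    elif physical_memory_gb <= 10:  # assumption: both boundaries are inclusive
        store_gb = 2
    else:
        store_gb = 4
    return store_gb, store_gb * JOBS_PER_GB
```

Under these assumptions, a node with 8 GB of physical memory gets a 2 GB message store and can run up to 16 jobs in parallel.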

These default values can be changed using the following steps:

1. Stop the replication service.
2. Edit the line "Net FS Size: 1024 MB" in the file "/opt/VRTSnasgw/conf/sfsnet.conf" to specify the desired tmpfs size.
This step is needed on all nodes of the cluster.
For example, to increase the size of tmpfs to 2 GB, edit the line to read "Net FS Size: 2048 MB".
3. Start the replication service. The changes take effect after the service starts.

To check the tmpfs size, you can use the following command:

# grep "^Net FS Size:" /opt/VRTSnasgw/conf/sfsnet.conf | awk -F":" '{print $2}'

Note that the above values are based on a worst-case analysis. The typical memory consumption by these messages 
is very low because as soon as a message is sent, the corresponding memory is immediately freed. Thus, 
specifying a value for the tmpfs size does not mean that this much memory is actually consumed. The worst case 
occurs when replication is not able to send and apply messages quickly on the network. If you 
want to run more jobs than the limits above allow, schedule the jobs at different times.

RESOLUTION: 
Increase the size of the replication tmpfs according to the number of jobs you want to run
in parallel.
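
As a rough aid for choosing the value, the sketch below derives a "Net FS Size" line from the desired number of parallel jobs, using the figure of 8 jobs per GB and the overall maximum of 64 jobs given in the description. It is a hypothetical helper, not a FileStore utility, and rounding up to a whole GB is an assumption.

```python
import math

JOBS_PER_GB = 8            # from the description: 8 parallel jobs per GB of tmpfs
MAX_REPLICATION_JOBS = 64  # from the description: maximum number of replication jobs


def required_tmpfs_mb(parallel_jobs: int) -> int:
    """Return the tmpfs size in MB needed for the given number of parallel jobs."""
    if not 1 <= parallel_jobs <= MAX_REPLICATION_JOBS:
        raise ValueError("parallel_jobs must be between 1 and 64")
    # Assumption: sizes are rounded up to a whole GB (1024 MB).
    return math.ceil(parallel_jobs / JOBS_PER_GB) * 1024


def net_fs_size_line(parallel_jobs: int) -> str:
    """Format the line in the style used in sfsnet.conf."""
    return f"Net FS Size: {required_tmpfs_mb(parallel_jobs)} MB"
```

For 16 parallel jobs, the helper produces the line "Net FS Size: 2048 MB", which matches the 2 GB example in the steps above.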

V. NEW FEATURES:
-----------------

Etrack Incident: 2722268

A new tunable "fullspace" was added to the Storage> fs alert set and unset commands for 5.6 RP1 P4:

Storage> fs alert set numinodes|numspace|fullspace value [fs_name,...]
Storage> fs alert unset numinodes|numspace|fullspace [fs_name,...]

When a file system is 100% full and the user continues to run write I/Os on the file system, performance might degrade dramatically. The NFS clients might get stuck for a long time while waiting for the I/Os, which can cause an I/O hang.

When a file system is 100% full and some of the files are being overwritten, small amounts of space can become available for write I/Os. In this scenario, however, the file system might have problems managing the free space, which degrades performance.

For a file system to run efficiently, users should always reserve some space for the file system instead of using 100% of the space. FileStore provides a file system full protection function. When file system usage exceeds the space alert value (80% by default), a warning message is sent to the users. If the user continues to write I/Os to the file system and file system usage reaches almost 100% (that is, 98%, tunable), all the NFS shares on the file system are automatically changed to read-only to prevent potential issues such as performance degradation when the file system is full.

By default, file system full protection is turned off. To activate file system full protection, run the command "Storage> fs alert set fullspace 98" (this sets the full limit to 98%).
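
The thresholds described above can be pictured as a small decision function. The Python sketch below only illustrates the described behavior and is not FileStore code: the function name and return strings are hypothetical, and whether each comparison is strict or inclusive is an assumption.

```python
from typing import Optional


def fs_protection_state(usage_percent: float,
                        numspace_alert: float = 80.0,
                        fullspace: Optional[float] = None) -> str:
    """Classify file system usage against the alert and full-protection limits.

    fullspace=None models the default, where full protection is turned off.
    """
    # Assumption: shares become read-only once usage reaches the fullspace limit.
    if fullspace is not None and usage_percent >= fullspace:
        return "read-only"
    # Assumption: the warning is sent once usage exceeds the alert value.
    if usage_percent > numspace_alert:
        return "warning"
    return "ok"
```

With the defaults, 99% usage yields only a warning because protection is off; after "Storage> fs alert set fullspace 98", the same usage would correspond to the NFS shares being switched to read-only.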

Etrack Incident: 2644277

Two new Antivirus> set commands were added for 5.6 RP1 P4:

o Antivirus> set tmpfssize size 
o Antivirus> set autoprotect holdonclose [yes|no]

The Antivirus> set tmpfssize size command enables and mounts a tempfs (temporary file storage facility) for accelerating Symantec AntiVirus for FileStore scans.

o To enable and mount a tempfs that can take up to size (MB) memory for
accelerating Symantec AntiVirus for FileStore scans, enter the following:

Antivirus> set tmpfssize size

where size is the memory used by the tempfs.

A minimum value of 2048 for size is required. Set size to 0 to unmount and
disable the tempfs.

For example, to enable a tempfs for Symantec AntiVirus for FileStore scans,
enter the following:

Antivirus> set tmpfssize 2048
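
The size rules stated above (0 disables the tempfs, otherwise the minimum is 2048) can be sketched as a simple check. This Python fragment is illustrative only; the function name and return values are hypothetical and do not describe how the CLI itself validates its input.

```python
MIN_TMPFS_MB = 2048  # minimum size stated above


def check_tmpfssize(size_mb: int) -> str:
    """Return the action implied by a size value, or raise if it is invalid."""
    if size_mb == 0:
        return "disable"  # 0 unmounts and disables the tempfs
    if size_mb >= MIN_TMPFS_MB:
        return "enable"   # mount a tempfs of up to size_mb MB
    raise ValueError("size must be 0 or at least 2048")
```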

The Antivirus> set autoprotect holdonclose command determines if the file close system call is held by 
the Auto-Protect kernel module until the realtime scan finishes.

To determine if the file close system call is held by the Auto-Protect kernel
module until the realtime Symantec AntiVirus for FileStore scan finishes,
enter the following:

Antivirus> set autoprotect holdonclose yes | no

The default value of the holdonclose parameter is yes.

For example, to disable the realtime Symantec AntiVirus for FileStore scan
holdonclose parameter, enter the following:

Antivirus> set autoprotect holdonclose no