Monday, September 1, 2014

Create, expand, destroy aggregates on NetApp Data ONTAP

Aggregates: Data ONTAP combines one or more RAID groups (of any size) into a pool of disk space on which multiple volumes can be created.
Configuring an optimum RAID group size for an aggregate made up of disks requires a trade-off of factors. You must decide which factor—speed of recovery, assurance against data loss, or maximizing data storage space—is most important for the aggregate that you are configuring.

Note:
1. You cannot reduce the number of disks in an aggregate by removing data disks. The only way to reduce the number of data disks in an aggregate is to copy the data and transfer it to a new aggregate that has fewer data disks.
2. You are advised to keep your RAID groups homogeneous when possible. If needed, you can replace a mismatched disk with a more suitable disk later.
3. At a minimum, you should have at least one matching or appropriate hot spare available for each kind of disk installed in your storage system. However, having two available hot spares for all disks provides the best protection against disk failure.
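For example, before carving up aggregates you can get a quick view of the spares and RAID layout with the following 7-Mode commands (a minimal check; the output format varies by Data ONTAP release):

#list the hot spare disks available to this controller
aggr status -s
#show each aggregate's RAID groups and member disks
aggr status -r
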
Aggregate Creation:
===================
Syntax:
--------
aggr create aggr_name [-f] [-m] [-n] [-t {raid0 | raid4 | raid_dp}] [-r raidsize] [-T disk-type] [-R rpm] [-L] [-B {32 | 64}] disk-list
Scenario:1
----------
Create an aggregate using all the available disks on the controller with the options described below. Also disable automatic snapshots and set the snap reserve to '0' on the aggregate.
Aggregate Name: aggr0, Raid Type: raid_dp, Raid Size: 20, Disk Type: SAS, 64-bit
Step:1 Create aggregate (aggr create requires either a disk count or an explicit disk list; here, 20 disks)
aggr create aggr0 -t raid_dp -r 20 -T SAS -B 64 20
or
If you want to specify a disk list:
aggr create aggr0 -t raid_dp -r 20 -d 7a.1 7a.2 7a.3 7a.4 ... 7a.19
Step:2 Disable automatic aggregate Snapshot copy creation

aggr options aggr0 nosnap on
Step:3 Set the aggregate Snapshot reserve to 0 percent
snap reserve -A aggr0 0
Step:4 Verify
aggr status -r
Aggregate Expansion:
=====================
Syntax:
--------
aggr add aggr_name [-f] [-n] [-g {raid_group_name | new | all}] disk_list
Scenario:1
----------
Add four 500-GB disks to aggr0
Step:1 Add disks to aggregate
aggr add aggr0 4@500
or
aggr add aggr0 -g rg1 -d 7a.22 7a.23 7a.24 7a.25
Step:2 Verify
aggr status -r
After you add storage to an aggregate, run a full reallocation job on each FlexVol volume contained in that aggregate.
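A minimal sketch of that reallocation, assuming a FlexVol named vol1 lives in the expanded aggregate (7-Mode; -f requests a full reallocation rather than an incremental one):

#start a full reallocation scan on the volume
reallocate start -f /vol/vol1
#check the progress of the job
reallocate status /vol/vol1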

Destroy Aggregate:
================

I really, really hope you are performing this action fully aware of the consequences!

Step:1 Offline aggregate
Note: You use the aggr offline command to take an aggregate offline to perform maintenance on the aggregate, move it, or destroy it.

aggr offline aggr0

Step:2 Destroy aggregate
Note: Before you can destroy an aggregate, you must destroy all of the FlexVol volumes contained by that aggregate.

aggr destroy aggr0

Backout: Undestroy the aggregate

aggr undestroy aggr0

Create, expand, destroy NetApp volumes

Volume:  A volume is a logical unit of storage containing a file system image and its associated administrative options. One can create up to 500 volumes on a single filer, depending on the model and Data ONTAP version. I recommend keeping that number low.

Flexible volumes are independent entities carved out of aggregates, and they can be grown or shrunk.

Size: Minimum of 20 MB to Maximum of 16 TB

The 'vol' command suite lets us perform several operations: create, expand, offline, online, destroy, copy, and many more. I am restricting this post to a few of them, along with the 'popular' options that I set when I create a volume.
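Before creating anything, it usually helps to see what already exists on the filer; a quick look (7-Mode):

#list all volumes and their states
vol status
#show used and available space per volume
df -h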



Volume Creation:
=============

Create a thin-provisioned volume named vol1 of size 500 GB on aggr1, with 0% fractional reserve and no snapshots.

Step:1 Check for available space on aggregate

df -Ah

Step:2 Create volume with specified options

vol create vol1 -s none aggr1 500g

#set fractional reserve to 0%

vol options vol1 fractional_reserve 0

#Several default options are applied when a volume is created, depending on the Data ONTAP version; you can verify them as follows:

vol options vol1

Step:3 No snaps

snap sched vol1 0 0 0
snap reserve vol1 0
snap autodelete vol1 on

Step:4 Verify

vol status vol1

Expand Volume:
==============

Expand the volume created above by another 600 GB

Step:1 Check for available space

df -Ah

Step:2 Expand the volume by 600 GB

vol size vol1 +600g

After completion, it shows the new expanded size.
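For example, to confirm the new size after the resize (a minimal check):

#print the current volume size
vol size vol1
#show used and available space for the volume
df -h /vol/vol1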

Offline and destroy Volume:
=====================

Take the above created volume offline and destroy it.

This operation needs to be done with extra care; if you screw up, you are fired!

Step:1 Confirm there is no activity on the target volume from the CLI or OnCommand Unified Manager (OCUM)

stats show -e volume:vol1

#make sure all of the counters show '0'

Step:2 Offline the volume

vol offline vol1

#confirm with your server admins that users are not complaining

Step:3 Destroy the volume

vol destroy vol1


Notes:

Fractional Reserve: If you are wondering what fractional reserve is, let's take a scenario where you have a 600 GB volume holding a 500 GB LUN, and snapshots are turned on.

If you are not monitoring the snapshots and they are allowed to grow unchecked, you will eventually run out of space in the volume, which leads to bad things such as the LUN or volume going offline. Setting the fractional reserve to 100% reserves extra overwrite space equal to the space-reserved LUN (another 500 GB in this example), so writes to the LUN keep succeeding even when snapshots consume the rest of the volume.
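As a rough sketch of adjusting it, on a hypothetical volume named vol_lun holding that LUN (vol_lun is an illustrative name, not a volume from this post):

#reserve overwrite space equal to 100% of the space-reserved LUN
vol options vol_lun fractional_reserve 100
#verify the current setting
vol options vol_lun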

Enhancements to NFS and SMB - NetApp Cluster-mode support


NFSv4:
=====
1. Data ONTAP 8.1 Cluster-Mode introduces support for the NFSv4 protocol specification as well as elements of NFSv4.1.
2. Cluster-Mode continues to fully support NFSv2 and NFSv3, although you should not use NFSv2 with Cluster-Mode.
3. NFSv4 support brings the Data ONTAP 8.1 Cluster-Mode operating system into parity with the Data ONTAP 7.3 operating system.
4. The key feature of NFSv4 is referrals. NFSv4.1 is a minor revision of version 4.0 and an extension of version 4 rather than a modification, so it is fully compliant with the NFSv4 specification. It extends delegations beyond files to directories and symlinks, introduces NFS sessions for enhanced efficiency and reliability, and provides parallel NFS (pNFS).
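To check which NFS versions are currently enabled on a Vserver, something like the following works (assuming a Vserver named vs1; the exact fields displayed vary by release):

cluster1::> vserver nfs show -vserver vs1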



Remote file access:
==============
-It is defined as file access in which a client connected to a logical interface (LIF) on a physical port of one controller accesses a file that is hosted on a different controller in the same cluster.
-Remote file access has traditionally been a performance concern for clients, a concern that the Data ONTAP Cluster-Mode operating system addresses.

Scenario: 
=======
A client is mounted to a data LIF that is hosted on node1 and issues a file operation whose destination is a volume on node4. The request is serviced by the node1 protocol stack; that protocol stack looks up the location of the volume and directs the operation to node4, which hosts the volume. The request traverses the cluster network, and the result is returned to the client along the same path.

With pNFS, when a file is opened by an NFS client, the data LIF on node1 that the client mounted serves as the metadata path, because this is the path used to discover the location of the target volume. If the data is hosted by node1, the operation is handled locally. In this case the local node discovers that the data is on node4, so, in accordance with the pNFS protocol, the client is redirected to a LIF hosted on node4. That request, as well as subsequent requests to the volume, is serviced locally, bypassing the cluster network.

When a volume is moved to an aggregate on a different node, the pNFS client data path is redirected to a data LIF hosted on the destination node.

To enable pNFS:

cluster1::> vserver nfs modify -vserver vs1 -v4.1 enabled -v4.1-pnfs enabled

Note: Clients must also support pNFS; it is supported with RHEL 6.2 and Fedora 14.
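On such a client, the pNFS mount typically looks something like the line below (the LIF IP, export path, and mount point are placeholders; RHEL 6 expresses NFSv4.1 as vers=4 plus minorversion=1):

mount -t nfs -o vers=4,minorversion=1 192.168.1.50:/vol_nfs /mnt/pnfs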

NFSv4 referrals and pNFS do not work together. By keeping data access local to the node, pNFS reduces the amount of traffic that traverses the cluster network. Unlike NFS referrals, pNFS works transparently to the client: it does not require a file system remount to ensure an optimized path, because the network redirect does not happen at mount time, and the file handle is not left stale when a volume is moved to an aggregate on a different node.

SMB 2.0 and SMB 2.1
=================
1. In addition to the SMB 1.0 protocol, the Data ONTAP Cluster-Mode operating system now supports SMB 2.0 and SMB 2.1.
2. SMB 2.0 was a major revision of the SMB 1.0 protocol, including a complete reworking of the packet format. SMB 2.0 also introduced several performance improvements relative to previous versions:
-more efficient network utilization through request compounding, which stacks multiple SMB requests into a single network packet
-larger read and write sizes to exploit faster networks
-file and directory property caching
-durable file handles, which allow an SMB connection to transparently reconnect to the server if a temporary disconnection occurs, such as over a wireless connection
-improved message signing, with easier configuration and interoperability, and HMAC-SHA256 replacing MD5 as the hashing algorithm
3. SMB 2.1 provides important performance enhancements, including:
-the client opportunistic lock (oplock) leasing model
-large maximum transmission unit (MTU) support
-improved energy efficiency for client computers
-support for previous versions of SMB
4. The SMB 2.1 protocol provides several minor enhancements to the SMB 2.0 specification.
5. Data ONTAP 8.1 Cluster-Mode supports most, but not all, of the SMB 2.1 features.
6. The following SMB 2.1 features are not supported:
-large MTU
-resilient handles
-BranchCache
7. Support for SMB 2.1 is automatically enabled when you enable the SMB 2.0 protocol on a virtual server (Vserver).

Use the following command to enable SMB 2.0 on the Vserver:

cluster1::> vserver cifs options modify -vserver vs1 -smb2-enabled true
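To verify the setting afterwards, a check along these lines should work (smb2-enabled is the same field used by the modify command above):

cluster1::> vserver cifs options show -vserver vs1 -fields smb2-enabled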

Features of SMB 2.1 Leases:
=====================
-File and metadata caching
-Reduces bandwidth consumption
-Retention of cached data after a file is closed
-Full caching with multiple handles, as long as those handles are opened on the same client

NetApp SnapVault - "The Data archiving solution"



SV backups can be performed from multiple primary storage systems to one secondary storage system. SV also reduces storage requirements by using thin-replication technology and by interoperating with deduplication, and SV transfers can be scheduled at multiple intervals to improve the RPO.

One can provide read-only access to data stored on SV secondary storage by exporting the SV secondary volume to UNIX clients or sharing it with Windows clients. Users can then restore their own data with simple copy-and-paste operations.
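As a rough sketch, read-only access to the secondary volume used later in this post could be granted like this (7-Mode; the CIFS share name is just an illustration, and share permissions still need to be locked down separately):

#export the secondary volume read-only to NFS clients
filerB> exportfs -p ro /vol/vol_secondary
#create a CIFS share on the secondary volume for Windows clients
filerB> cifs shares -add sv_archive /vol/vol_secondary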



License:
1. Install a license on each primary (sv_ontap_pri) and secondary (sv_ontap_sec) storage system.
2. A SnapMirror license is required for failover to an SV secondary qtree or volume.

Qtrees are the basic unit for SV. Primary qtrees, non-qtree data and even volumes are backed up to qtrees on the SV secondary system.

Initial Backup (baseline): The first time SV backs up a qtree or volume, it backs up all of the data blocks on primary storage, writes the data to the secondary volume, and then creates a Snapshot on the secondary volume.

Scheduled Updates: After the initial backup, SV performs updates, transferring and storing only the data blocks that have changed since the last backup. For every update, SV creates a snapshot copy of the relevant volume.

Archive: SV is used mainly for archival purposes; because multiple Snapshot copies are retained, you effectively keep an archive of every backup that was performed over time.

Ports:
1. For SV backup and restore operations, port 10566 must be open in both directions.
2. For NDMP management, port 10000 must be open on both primary and secondary systems.

Log file: One can find the SV logs in /etc/log/snapmirror.
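To read the log directly from the filer console (rdfile simply prints a file's contents):

rdfile /etc/log/snapmirror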

Throttling: One can enable throttling by setting "options replication.throttle.enable on|off"
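A minimal example, assuming you also want to cap the transfer rate with the related max_kbs options (option names as I recall them; verify with 'options replication' on your release):

#turn replication throttling on
options replication.throttle.enable on
#cap incoming and outgoing replication traffic at roughly 10 MB/s
options replication.throttle.incoming.max_kbs 10240
options replication.throttle.outgoing.max_kbs 10240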

Deduplication:
1. When dedup is enabled on the SV secondary storage system and the secondary volume, the dedup process starts automatically after each SV transfer completes.
2. The deduplication of blocks is initiated when the number of changed blocks represents at least 20% of the number of blocks in the volume.
3. Because dedup is synchronized with the SV schedule, you cannot schedule dedup of an SV secondary volume separately. One can start the process manually from the GUI or CLI (see the sketch after this list), with a maximum of eight concurrent dedup operations.
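A minimal sketch of that manual start on the secondary volume used later in this post (7-Mode sis commands; -s scans the existing data rather than only new writes):

#enable dedup on the secondary volume
filerB> sis on /vol/vol_secondary
#manually start a dedup scan of the existing data
filerB> sis start -s /vol/vol_secondary
#check progress
filerB> sis status /vol/vol_secondary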

Scenario:1

To set up SV by adding licenses, enabling access, creating volumes, creating the schedule, and starting and updating the transfer.

Step:1 Add license
filerA> license add <sv_ontap_pri>
filerB> license add <sv_ontap_sec>

Step:2 Enable SV access
filerA> options snapvault.enable on
filerA> options snapvault.access all

filerB> options snapvault.enable on
filerB> options snapvault.access all

Step:3 Create thin primary and secondary volume by setting several options which I generally use

filerA> vol create vol_primary -s none aggr1 500g
filerA> vol options vol_primary fractional_reserve 0
filerA> snap reserve vol_primary 0
filerA> snap sched vol_primary 0 0 0
filerA> snap autodelete vol_primary on
filerA> snap autodelete vol_primary target_free_space 5
filerA> sis on /vol/vol_primary

filerB> vol create vol_secondary -s none aggr1 500g
filerB> vol options vol_secondary fractional_reserve 0
filerB> vol options vol_secondary nosnap on
filerB> snap reserve vol_secondary 0
filerB> snap sched vol_secondary 0 0 0
filerB> sis on /vol/vol_secondary

Step:4 Create Qtree on primary side (Qtree will be automatically created on secondary)
filerA> qtree create /vol/vol_primary/qt

Step:5 Schedule Snapvault
filerA> snapvault snap sched vol_primary sv_hourly 6@0-23

filerB> snapvault snap sched -x vol_secondary sv_hourly 24@0-23

Step:6 Initialize baseline transfer
filerB> snapvault start -S filerA:/vol/vol_primary/qt /vol/vol_secondary/qt

When you run the above command, qtree "qt" will be created on the secondary volume automatically.

Step:7 Check the status of transfer on primary or secondary storage system

snapvault status -l filerA:/vol/vol_primary/qt

or

snapvault status

Step:8 Update SV secondary
filerB> snapvault update /vol/vol_secondary/qt

Scenario:2

Restore data to original qtree on the primary storage system

filerA> snapvault restore -S filerB:/vol/vol_secondary/qt /vol/vol_primary/qt

Scenario:3

To clean up the obsolete SV relationships

Step:1 Identify the relationships that need to be cleaned up
filerA> snapvault destinations

Step:2 Release secondary destinations.
filerA> snapvault release /vol/vol_primary/qt filerB:/vol/vol_secondary/qt

Step:3 Stop the snapvault services
filerB> snapvault stop -f filerB:/vol/vol_secondary/qt

Step:4 Unschedule updates
filerA> snapvault snap unsched -f vol_primary sv_hourly
filerB> snapvault snap unsched -f vol_secondary sv_hourly

Step:5 Delete SV snapshot copies
filerA> snap list vol_primary
filerA> snap delete vol_primary <snapshot name>
filerB> snap list vol_secondary
filerB> snap delete vol_secondary <snapshot name>

Scenario:4

To restart SV by resynchronizing relationships between primary and secondary.

filerB> snapvault start -r -S filerA:/vol/vol_primary/qt filerB:/vol/vol_secondary/qt