Dell EqualLogic Best Practices Series
Sizing Microsoft Exchange 2010 on Dell EqualLogic PS6100 and PS4100 Series Arrays on VMware vSphere 5
A Dell Technical Whitepaper
Storage Infrastructure and Solutions Engineering
Dell Product Group
May 2012
This document has been archived and will no longer be maintained or updated. For more
information go to the Storage Solutions Technical Documents page on Dell TechCenter
or contact support.
THIS WHITE PAPER IS FOR INFORMATIONAL PURPOSES ONLY, AND MAY CONTAIN TYPOGRAPHICAL
ERRORS AND TECHNICAL INACCURACIES. THE CONTENT IS PROVIDED AS IS, WITHOUT EXPRESS
OR IMPLIED WARRANTIES OF ANY KIND.
© 2012 Dell Inc. All rights reserved. Reproduction of this material in any manner whatsoever without
the express written permission of Dell Inc. is strictly forbidden. For more information, contact Dell.
Dell, the DELL logo, and the DELL badge, PowerConnect, EqualLogic™, PowerEdge™ and
PowerVault™ are trademarks of Dell Inc. Broadcom® is a registered trademark of Broadcom
Corporation. Intel® is a registered trademark of Intel Corporation in the U.S. and other countries.
Microsoft®, Windows®, Windows Server®, and Active Directory® are either trademarks or registered
trademarks of Microsoft Corporation in the United States and/or other countries.
BP1026 Sizing MS Exchange with EqualLogicPS6100 & PS4100 on VMware vSphere 5 i
Table of Contents
1 Introduction
1.1 Purpose and Scope
1.2 Target audience
1.3 Terminology
2 EqualLogic 6100/4100 product overview
3 Microsoft Exchange and storage subsystem
3.1 Configuration factors to consider
3.2 Considerations around Exchange DAG differentials
4 Test topology and architecture overview
4.1 Functional system design
4.2 Physical system configuration
4.3 Storage layout
5 Validate the trend of each factor
5.1 Characterize the impact of user profile workload
5.2 Characterize the mailbox size
5.3 Study the database volumes layout
5.4 Study the DAG number of database copies differential
5.5 Characterize the RAID policy
5.6 Scaling up the number of users
5.7 Scaling out the SAN
5.8 Assess the 6100 and 4100 family models
6 Capacity planning and sizing
6.1 Capacity considerations
6.2 Performance considerations
6.3 Sizing example
7 Best practices recommendations
Appendix A Configuration details
A.1 Hardware components
A.2 Software components
A.3 Network configuration details
A.4 Host hypervisor and virtual machines configuration
A.4.1 Virtual network configuration
A.4.2 Virtual Machines network configuration
Appendix B Microsoft Jetstress considerations
Acknowledgements
This whitepaper was produced by the PG Storage Infrastructure and Solutions team of Dell Inc.
The team that created this whitepaper:
Danilo Feroce, Puneet Dhawan, Suresh Jasrasaria, and Camille Daily
We would like to thank the following Dell team members for providing significant support during
development and review:
Mark Welker
Feedback
We encourage readers of this publication to provide feedback on the quality and usefulness of this
information by sending an email to SISfeedback@Dell.com.
Executive Summary
The demand for storage capacity keeps growing and shows no signs of waning. Disk drives with
increasingly larger capacity become available every year and the cost per gigabyte keeps
declining, to the point that the terabyte is the new unit of storage capacity.
A solution that provisions such large storage capacity, however, is not as simple as it appears.
Applications and storage infrastructures must support such growth with the same or even improved
performance for user data access. End users would not accept any degradation in the usability of their
business tools solely because hundreds to thousands of times more data is being accessed than in the
past.
Microsoft® Exchange Server 2010 is able to address this storage growth challenge through its ability to
implement and manage very large user mailboxes while reducing the storage access footprint due to
significant redesigns in its database storage architecture. The recent introduction of the updated Dell™
EqualLogic™ PS Series family arrays (6100 and 4100) substantially raises the bar of both performance
and capacity, prompting many organizations to investigate the benefits of using them to power their
corporate applications and storage-based services such as Microsoft Exchange Server.
Combining the messaging services and SAN platforms is a winning scenario for end user
expectations, but it requires careful examination by IT professionals when it comes to planning
and deploying a production environment. Understanding the current access footprint of the end users
and services is a key element to properly begin the journey of sizing the right solution. Many additional
variables, originating from either software or hardware, might influence the outcome in a powerful
and surprising way if not weighed in advance.
We evaluated most of the variables surrounding the Exchange 2010 deployment decisions against the
storage and identified some key findings summarized below.
• Exchange database maintenance and service tasks can have a considerable impact on the
overall footprint of storage IOPS. The absolute value of their load is approximately steady;
therefore, when expressed as a percentage, it weighs heavily on environments with a light
load and appears mitigated under heavy workloads.
• Distributing a given set of user mailboxes with a known workload across fewer databases
greatly benefits the overall performance of the storage subsystem. Reproducible tests showed
an application latency improvement between 55% and 70% when shifting from 20 to 2
deployed databases to support the same number of users. Nevertheless, the flexibility of
administration and protection of a large database must be weighed with caution.
• Database availability group (DAG) technology produces a light but consistent increase in
generated IOPS due to the supplementary read activity against database log files. The increase
depends on the number of replica copies: about 2% overhead for the first copy, then 1%
for each subsequent doubling of the number of copies.
• A comparison of traditional RAID policies shows that RAID 10 still performs ahead of the other
implementations: around 5% ahead of its closest competitor, RAID 5, followed by RAID 50 and
then RAID 6. Due to the EqualLogic design, RAID 5 and RAID 6 benefit from a greater number
of active spindles. Performance is nevertheless close enough to persuade decision makers to
base their disk protection choice on capacity or on higher tolerance to failure,
instead of pure performance.
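The first finding can be illustrated with a short sketch. It treats the maintenance load as a fixed number of IOPS and shows how its share of the total shrinks as the user workload grows. The BACKGROUND_IOPS value and the 0.06 and 0.60 IOPS profiles are hypothetical placeholders chosen for illustration; only the 5,000 mailboxes and the 0.18 IOPS per mailbox figure come from the reference configuration of this paper.

```python
def background_share(mailboxes: int, iops_per_mailbox: float, background_iops: float) -> float:
    """Return the fraction of total IOPS consumed by a fixed background load."""
    transactional_iops = mailboxes * iops_per_mailbox
    return background_iops / (transactional_iops + background_iops)


# Hypothetical fixed maintenance load; NOT a measured value from this paper.
BACKGROUND_IOPS = 100.0

# 0.18 is the reference profile; 0.06 and 0.60 are made-up light and heavy profiles.
for per_mailbox in (0.06, 0.18, 0.60):
    share = background_share(5000, per_mailbox, BACKGROUND_IOPS)
    print(f"{per_mailbox:.2f} IOPS/mailbox -> background share {share:.1%}")
```

The same absolute load takes a larger slice of a light workload than of a heavy one, which is the effect described in the finding.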
1 Introduction
Microsoft Exchange Server is a versatile messaging solution that relies heavily on the storage subsystem.
The 2010 version provides access to more data per mailbox with the same or improved efficiency.
The updated EqualLogic PS Series families (6100 and 4100), while maintaining the same building block
approach and easy management, accommodate wider capacity needs and deliver increased
performance.
Virtualization technologies and private clouds allow better use of hardware
investments, new or preferable ways to provide high availability and reliability to IT solutions, and
better control of expenditures.
Virtualization is also often associated with the opportunity to reduce the datacenter
physical footprint. When combined with a blade server architecture, it offers unparalleled
processing density to corporations eager to gauge and control their energy consumption and
ultimately their total cost of ownership (TCO).
A proper combination of these four main pillars (storage, blade servers, virtualization, and the Microsoft
Exchange application) requires a broader analysis to address the additional variables that might
influence the outcome of a solution.
1.1 Purpose and Scope
This white paper presents the results of a study centered on the characterization of I/O patterns of
Microsoft Exchange Server when deployed on a storage subsystem based on EqualLogic SAN and in a
dense server environment built on Dell PowerEdge blade servers. It will guide Exchange and SAN
administrators in understanding their messaging workload and predicting their SAN size requirements. The
scope of this paper is storage sizing for a virtual infrastructure built on VMware® vSphere® in which the
virtual machines connect directly to the SAN; server sizing is left aside.
1.2 Target audience
This paper is primarily intended for IT professionals (IT managers, Solution Architects, Exchange and
Storage Administrators, and System and VMware Engineers) who are involved in defining, planning
and/or implementing Microsoft Exchange Server 2010 infrastructures and would like to investigate the
benefits of using EqualLogic storage. This document assumes the reader is familiar with Microsoft
Exchange functionalities, EqualLogic SAN operation, and VMware system administration.
1.3 Terminology
The following terms will be used throughout this document.
Group: Consists of one or more EqualLogic PS Series arrays connected to an IP network that work
together to provide SAN resources to host servers.
Member: Identifies a single physical EqualLogic array.
Pool: A logical collection to which each member (array) is assigned after being added to a group;
each member contributes its storage space to the entire pool.
Hypervisor: Denotes the software layer in charge of managing the access to the hardware resources,
sitting between the hardware and the operating systems running as ‘guests’.
Virtual Machine: Denotes an operating system implemented on a software representation of hardware
resources (processor, memory, storage, network, etc.). Virtual machines are usually identified as
‘guests’ in relation to the ‘host’ operating system, which executes the processes that allow them to run
on the hardware.
Exchange DAG: Database Availability Group is a pool of networked Exchange mailbox servers that
hosts multiple copies of the same Exchange databases.
B-Tree (balanced tree): a tree data structure where a node can have a variable number of child nodes,
commonly used in databases to maintain data sorted in a hierarchical arrangement. It allows efficient
data access to the pages for insert, delete, and searches.
2 EqualLogic 6100/4100 product overview
Dell™ EqualLogic™ PS6100 Series arrays are designed to meet the performance and availability needs
of application and virtualization environments in medium to large enterprises. These virtualized iSCSI
storage area networks (SANs) combine intelligence and automation with fault tolerance to provide
simplified administration, rapid deployment, enterprise performance and reliability, and seamless
scalability using innovative Fluid Data™ technology.
The Dell EqualLogic PS4100 series is designed to address the needs of remote office/branch office
(RO/BO) and small-to-medium business storage deployments. Remote offices receive the benefits of data
center class IT solutions, such as storage consolidation and virtualization, at an affordable entry point.
PS6100 arrays fit your needs with 2.5” or 3.5” drives in 2U or 4U form factors, all while delivering an
increase in drives per array of up to 50% over the previous generation. With the option of all SSD drives, a
mix of SSD and 10K SAS drives, all 10K SAS drives, or all 15K SAS drives in a single array,
EqualLogic PS6100/4100 series array models offer the flexibility in capacity and performance to best
match application needs.
3 Microsoft Exchange and storage subsystem
Microsoft Exchange Server has a diversified set of components and services that work together
to support the most varied requirements for designing and deploying a
messaging infrastructure within an organization. From a storage perspective, the most relevant role
in an Exchange infrastructure is the mailbox server role, since it ultimately governs the
retrieval, storage, and availability of user data so that the rest of the infrastructure can
route it, present it, and so on.
Appropriate sizing of the mailbox role servers in an organization is the primary best practice to avoid
poor performance, or the administrative overhead of redesigning the deployment layout to adapt to
new or changed user requirements.
We should follow the hierarchy of Microsoft Exchange Server 2010 underlying logical components to
understand the interaction between the Exchange mailbox server role and the storage subsystem. The
access to mailbox databases (or public folder database, when implemented) is the key element that
primarily generates I/O to the storage subsystem. However, while a database is a logical
representation of a collection of user mailboxes, it is at the same time an aggregation of files on the
disk which are accessed and manipulated by a set of Exchange services (i.e. Exchange Information
Store, Exchange Search Indexer, Exchange Replication Service, Microsoft Exchange server) following a
different set of rules.
Database file (*.edb) is the container of user mailbox data. Its content, broken into database pages
(32 KB in size), is primarily read and written in a random fashion as required by the Exchange services
running on the mailbox server role. A database has a 1:1 ratio with its own .edb database file. The
maximum supported database size in Exchange Server 2010 is 16 TB, while the Microsoft guidelines
recommend a maximum database file size of 200 GB in a standalone configuration and 2 TB if the database
participates in a replicated DAG environment.
Transaction Logs (*.log) are the container where all the transactions that occur on the database
(create, modify, delete messages, etc.) are recorded. Each database owns a set of logs, and keeps a 1
to many ratio with them. The logs are written to the disk sequentially, appending the content to the
file. The logs are read only when in a replicated database configuration (DAG) or in the event of a
recovery.
Checkpoint file (*.chk) is a container for metadata indicating when the last flush of data from the
memory cache to the database occurred. It has a very limited size (8 KB) and, although it is repeatedly
accessed, its overall amount of I/O is limited. The database keeps a 1:1 ratio with its own checkpoint
file and positions it in the same folder location as the log files.
Content Index files (catalog folder, multiple file extensions) are flat files, representing the Search
Catalog, built by the Microsoft Search Service. The client applications connected to Exchange Server
benefit from this catalog by performing faster searches based on indexes instead of full scans.
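The database size guidelines quoted above for the .edb file translate into a lower bound on the number of databases. The sketch below is only an illustration of that arithmetic: the helper name min_databases is hypothetical, 2 TB is taken as 2,048 GB, and the calculation ignores database white space, content index files, and growth, all of which a real sizing exercise must include. The 5,000 mailboxes of 512 MB each are borrowed from the reference configuration used later in this paper.

```python
import math

# Recommended maximum database file sizes quoted in the text, in GB.
# Assumption: 2 TB is taken as 2048 GB.
RECOMMENDED_MAX_DB_GB = {"standalone": 200, "dag": 2048}


def min_databases(mailboxes: int, mailbox_size_mb: int, max_db_size_gb: int) -> int:
    """Smallest database count that keeps each .edb file within the given limit."""
    total_gb = mailboxes * mailbox_size_mb / 1024
    return math.ceil(total_gb / max_db_size_gb)


for mode, limit_gb in RECOMMENDED_MAX_DB_GB.items():
    count = min_databases(5000, 512, limit_gb)
    print(f"{mode}: at least {count} databases")
```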
Microsoft Exchange Server uses a proprietary format named Extensible Storage Engine (ESE) to access,
manipulate, and save data to its own mailbox databases. The same format is now employed on the
Exchange Hub Transport server role for queue databases. The ESE technology, previously known as Jet
Database Engine, has evolved through several releases of Exchange Server and has branched
into several Microsoft products since its inception (Microsoft Access, Active
Directory, File Replication Service, WINS server, Certificate Services). The ESE is an Indexed Sequential
Access Method (ISAM) technology, which organizes the database data in B-Tree structures; as
such, the databases are populated with the aim of keeping related data together or adjacent. Because
this does not always happen, such structured databases benefit from external tasks that
reorganize or defragment the database data to restore optimal data
contiguity.
Note: For additional information about the Exchange 2010 Store, refer to the Microsoft documentation
Understanding the Exchange 2010 Store, available at:
http://technet.microsoft.com/en-us/library/bb331958.aspx
To summarize, an Exchange 2010 database itself is subject to a set of tasks that produce storage
access:
• the regular read and write access required to retrieve user mailbox data and store it
(according to the Exchange cache policy);
• the online defragmentation and compacting activities due to the B-Tree optimization;
• the database maintenance, which includes dumpster cleanup, deleted mailboxes purge, and
other activities addressing logical objects support; and
• the database scanning checksum to verify data block integrity (sequential read activity), which
can be set as a background 24x7 activity or in a scheduled time window.
Additionally, Exchange Server offers a specialized offline defragmentation task that can only be
performed manually, taking advantage of the ESEUTIL.EXE command line tool, while the database is
dismounted. The principal goal of this task is to reclaim the empty space left in a database by the
online defragmentation, shrinking the size of the .edb file itself, and thus returning the free space back
to the operating system volume.
Any further discussion around offline defragmentation has been purposely omitted from this paper
as this task falls outside the scope of our study. Furthermore, it is not
recommended to include offline defragmentation in a regular maintenance plan due to the
disruption in the availability of the database, the break in the log chain, and the need for
database re-seeding in a Database Availability Group configuration.
3.1 Configuration factors to consider
After determining the different I/O activities generated against the storage subsystem for a single
database, the factors that influence the overall I/O footprint of an entire Exchange mailbox server role
and the several variables of a deployment were taken into consideration.
Mailbox usage profile denotes the usage characteristics of a mailbox (i.e. send, receive, open or delete
items). It is commonly defined by the number of messages sent and received per day and the average
message size. It is also translated into transactional IOPS per mailbox through considerations made
around the size of the database cache allocated for each mailbox.
Mailbox size quantifies the maximum size to which a mailbox is allowed to grow, enforced by a quota
policy, or more generally the average mailbox size in a corporate messaging environment. It primarily
affects the capacity requirement of a mailbox role server and the planned size of the databases hosted
by the server. Moreover, it influences the IOPS response due to the wider physical disk surface that
must be accessed to retrieve or store the data.
Number of Databases designates the database layout and mailbox distribution across the databases.
Each Exchange database is managed as a single administrative unit and is served by a set of services with
a 1:1 ratio (defragmentation, maintenance, logs generation).
Database and Log placement indicates whether the .edb and log files reside on the same volume or
are deployed on isolated volumes. The historical requirement to split database and logs into different
volumes, or rather physical drives, arose from two different reasons:
• Performance: the different I/O pattern of these two streams of data (random reads/writes
versus sequential writes) and the aim to associate them with the most fitting storage device
(i.e. rotational speed, RAID level).
• Reliability: the simultaneous loss of both the database and logs could jeopardize the
recoverability of user data depending on the backup methodology used.
The Exchange Server 2010 I/O footprint reduction, when compared with former versions, nullifies the
perceived performance issue reported above. In addition, the use of a SAN as a storage subsystem
provides different approaches to the data protection of the Exchange mailbox server role that do not
require the volume split mentioned above (i.e. EqualLogic Smart Copies). For additional references
refer to the white paper mentioned in the notes box at the end of this section.
Mailbox count represents the number of user mailboxes hosted by the mailbox role server. As this
value increases, so do the amount of IOPS, the capacity allocated, and the log generation.
Furthermore, a high number of mailboxes hosted by a single server intensifies the demand for a
highly available solution.
High availability footprint refers to the overhead of having a highly available solution in place with
data replication involved. See the section titled, “Considerations around Exchange DAG differentials”.
Data protection footprint identifies the planned amount of IOPS spent by the solution established to
protect the mailbox database data. Since a customized solution is usually tailored to each distinct
production environment (for example 24x7, or 9 to 5), the impact is greatly variable and depends upon
the degree of parallelism between the regular data access and the data protection access. Because
examining the aspects of protecting Exchange mailbox server role data requires a more
comprehensive explanation, refer to the white paper mentioned in the notes box below.
For additional information about Exchange data protection options with Dell EqualLogic SAN refer
to Best Practices for Enhancing Microsoft Exchange Server 2010 Data Protection and Availability
using Dell EqualLogic Snapshots, available at:
http://www.delltechcenter.com/page/Enhancing+Microsoft+Exchange+Server+2010+Data+Protection+and+Availability+with+EqualLogic+Snapshots
3.2 Considerations around Exchange DAG differentials
A Database Availability Group (DAG) is a pool of up to 16 networked servers that host multiple copies
of the same Exchange database or databases, where only one of the copies is active at a specific point
in time within the group; the other copies are passive and contain data sourced from replicated and
replayed transaction logs. When implemented, it introduces some deviations in the storage access patterns
and in the Exchange memory cache behavior.
Transaction Logs access now combines the conventional sequential write pattern
with an additional sequential read access, required to perform the replication activities.
Log Checkpoint depth refers to the amount of logs written to the disk that contain transactions not
yet flushed to the database file. It is usually set at 20 in a standalone server install, and increased to 100
in a DAG configuration. The outcome of this change is a reduction in the write I/O for the given
database, since the opportunity to combine user activity changes in memory (coalescing), and thus to
reduce I/O, increases.
4 Test topology and architecture overview
The findings presented in this paper are a result of testing conducted on a Microsoft Windows
infrastructure, built on VMware vSphere 5.0 hypervisor, and accessing storage on EqualLogic SAN. We
took advantage of the Microsoft Exchange Jetstress simulation tool in order to simulate the different
workloads specified in section 5.
Our test architecture consisted of a core building block of one Exchange mailbox role server paired
with one storage array. This block was then scaled horizontally to achieve the targeted workload level
by multiplying the number of servers and arrays while firmly retaining the 1:1 ratio between them.
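The building block approach described above can be sketched as a simple calculation. The function name building_blocks and the 12,000-mailbox target are hypothetical and used only for illustration; the 5,000 mailboxes per block reflects the reference configuration of this study, not a general limit.

```python
import math


def building_blocks(total_mailboxes: int, mailboxes_per_block: int) -> dict:
    """Return the server and array counts needed, keeping the 1:1 ratio."""
    blocks = math.ceil(total_mailboxes / mailboxes_per_block)
    # One mailbox role server is paired with one storage array per block.
    return {"servers": blocks, "arrays": blocks}


# Hypothetical target of 12,000 mailboxes with 5,000 mailboxes per block.
print(building_blocks(12000, 5000))
```

Rounding up means the last block may be only partially loaded, which leaves headroom rather than overcommitting a server and array pair.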
4.1 Functional system design
The functional elements of the test infrastructure are shown in Figure 1. Some key elements of the
design were:
• Single Active Directory forest, single domain, single site (not strictly required for the tests),
• Centralized management and monitoring with dedicated resources, and
• Building block design approach for mailbox role servers.
Figure 1 Functional system design diagram
4.2 Physical system configuration
The physical components of the test infrastructure were laid out as shown in Figure 2.
We deployed this solution on Dell blade servers, planning for greater datacenter density and flexibility
of the solution. Some key aspects of the physical deployment were:
• Single M1000e blade enclosure with redundant management controllers (CMC) and fully
populated power supplies configured for redundancy,
• Single EqualLogic iSCSI SAN provisioned alternatively by a single unit of one of the following
array models: PS6100 3.5”, PS6100X, PS6100E, PS4100XV 3.5”, or PS4100E,
• Single EqualLogic iSCSI SAN provided by up to 3 PS6100XV 3.5” arrays for the exercise to scale
the SAN to multiple arrays,
• Dual PowerConnect M6220 Ethernet switches (stacked) to support LAN IP traffic (Fabric A),
• Dual PowerConnect M6348 Ethernet switches (stacked) to support the iSCSI data storage
traffic on the server side (Fabric B),
• Dual PowerConnect 7048R Ethernet switches (stacked) to support the iSCSI data storage
traffic on the SAN side, and
• Link Aggregation Group (LAG) consisting of 4 fiber connections (2 from each switch) between
the Fabric B M6348s and the top of the rack 7048s.
Figure 2 Physical system design diagram
More details of the test configuration setup, including a hardware and software list, SAN array
characteristics, hypervisor and virtual machines relationship, network connections, and blade switch
fabric paths are provided in Appendix A.
4.3 Storage layout
The EqualLogic SAN arrays and the volumes underlying the Exchange databases were set up as follows:
• One EqualLogic group configured alternatively with one unit of each array model previously
listed in the section titled “Physical system configuration”.
• One EqualLogic group configured with one to three array members for the simulation to scale
the SAN to multiple arrays.
• One storage pool defined within the group and including the single member or all the
members of the group for the test case of multiple arrays.
• RAID policy dependent on, and specified as part of, each test case.
• Five volumes created in the pool, unless otherwise specified in the test case. One volume dedicated to
each Exchange mailbox database and its set of log files, with a 1:1 ratio.
Figure 3 reflects the common volume layout implemented on the EqualLogic SAN.
Figure 3 Volume and database/logs layout
5 Validate the trend of each factor
This set of tests was run to evaluate the storage performance trends of an EqualLogic SAN under the
load of a reference Exchange mailbox role server configuration while the deployment factors were
altered.
In order to predict storage needs for current or future implementations, an Exchange administrator
and storage counterpart should be able to address the following questions:
• What if the usage profile of an average mailbox user in our organization changes?
• How much should an average mailbox be allowed to grow?
• Across how many mailbox databases should user mailboxes be distributed?
• What storage impact should be expected when DAG technology is adopted to increase
mailbox availability?
• Does the RAID policy selected become a heavy constraint against Exchange performance?
• Which EqualLogic array model will be a satisfactory fit for the overall footprint of our
mailboxes?
• How many users should be provisioned per each building block unit of deployment?
• Does the SAN scale horizontally following the same pace of growth of the workload?
The reference Exchange mailbox server role configuration used is reported in Table 1. Each test
described in the following paragraphs explicitly reports variations made to this reference baseline.
For details about the simulation tool, Microsoft Jetstress 2010, refer to Appendix B.
Table 1 Reference configuration
Reference configuration: factors under study
  Number of simulated mailboxes/users               5,000 concurrent users
  Mailbox size                                      512 MB
  Number of databases                               5 databases (active)
  Databases size                                    500 GB each
  Mailbox allocation                                1,000 mailboxes per each mailbox database
  Number of database replica copies                 1 (standalone)
  IOPS per mailbox / messages per day per mailbox   0.18 IOPS / 150 messages
  RAID policy                                       RAID 50
  Array model, amount of units, SAN configuration   1x PS6100XV 3.5”, one single pool (default)
Reference configuration: unaltered factors across the test sets
  Volumes                                           One volume for each DB and associated Logs
                                                    Volume size equal to 120% of database size
  Windows Disk/Partition, File System               Basic disk, GPT partition, default alignment
                                                    NTFS, 64 KB allocation unit size
  Background database maintenance                   Enabled
  Test duration                                     2 hours + time required to end DB checksum
Below is a list of metrics and pass/fail criteria recorded during testing. Most of this information is
outlined by the Jetstress tool and the remainder is verified through Dell EqualLogic SAN Headquarters.
Microsoft indications around thresholds for storage validation are reported as well.
Database Reads Average Latency (msec) is the average time spent waiting for a database read
operation. It should be less than 20 milliseconds (random reads according to Microsoft threshold
criteria).
Database Writes Average Latency (msec) is the average time spent waiting for a database write
operation. It should be less than 20 milliseconds (random writes according to Microsoft threshold
criteria).
Logs Writes Average Latency (msec) is the average time spent waiting for a log file write operation.
It should be less than 10 milliseconds (sequential writes according to Microsoft threshold criteria).
Planned Transactional IOPS are the target amount of IOPS for the test (calculated by multiplying the
number of users by the IOPS per mailbox).
Achieved Transactional IOPS are the IOPS actually performed by the storage subsystem to
service the transactional requests. For a test iteration to be considered successful according to
Microsoft Jetstress, the result should be no more than 5% below the planned IOPS.
LOGs IOPS are the IOPS performed against the log files during the transactional test. They are not
directly taken into account as part of the transactional IOPS, but tracked separately instead.
Differential IOPS are the IOPS generated for the DB maintenance and all the remaining activities on
the storage subsystem, calculated as the difference between the IOPS delivered by the EqualLogic SAN
and the previously reported transactional and logs IOPS.
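The pass/fail criteria listed above can be sketched as a small check. This is a minimal illustration, not part of Jetstress: the class and field names are hypothetical, and the 5% tolerance is assumed to mean that achieved IOPS may fall at most 5% below the planned IOPS.

```python
from dataclasses import dataclass
from typing import List

# Thresholds quoted in the metric list above.
DB_LATENCY_LIMIT_MS = 20.0   # database reads and writes
LOG_LATENCY_LIMIT_MS = 10.0  # log writes
IOPS_TOLERANCE = 0.05        # assumption: achieved may be at most 5% below planned


@dataclass
class IterationResult:
    """Hypothetical container for one test iteration (not a Jetstress type)."""
    users: int
    iops_per_mailbox: float
    achieved_iops: float
    db_read_ms: float
    db_write_ms: float
    log_write_ms: float

    @property
    def planned_iops(self) -> float:
        # Planned transactional IOPS = number of users x IOPS per mailbox.
        return self.users * self.iops_per_mailbox

    def failures(self) -> List[str]:
        """Return the list of criteria that were not met (empty means pass)."""
        failed = []
        if self.db_read_ms >= DB_LATENCY_LIMIT_MS:
            failed.append("database read latency")
        if self.db_write_ms >= DB_LATENCY_LIMIT_MS:
            failed.append("database write latency")
        if self.log_write_ms >= LOG_LATENCY_LIMIT_MS:
            failed.append("log write latency")
        if self.achieved_iops < self.planned_iops * (1 - IOPS_TOLERANCE):
            failed.append("achieved transactional IOPS")
        return failed


# Example using the reference profile: 5,000 users at 0.18 IOPS per mailbox.
result = IterationResult(5000, 0.18, 1018, 7.7, 5.2, 1.0)
print(round(result.planned_iops), result.failures())
```

An empty list means every criterion was met; any entries name the metrics that exceeded their threshold.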
5.1 Characterize the impact of user profile workload
The goal of the baseline workload analysis was to establish the storage trend, IOPS ratios, and
relationship when applying the reference user workload and then varying it, while maintaining the
remaining factors. The configuration parameters for the test are shown in Table 2.
Table 2 Test parameters: variable workload
Reference configuration: variable factor
  IOPS per mailbox / messages per day per mailbox   0.06 IOPS / 50 messages
                                                    0.18 IOPS / 150 messages
                                                    0.36 IOPS / 300 messages
Reference configuration: unchanged factors
  Number of simulated mailboxes/users               5,000 concurrent users
  Mailbox size                                      512 MB
  Number of databases                               5 databases (active)
  Databases size                                    500 GB each
  Mailbox allocation                                1,000 mailboxes per each mailbox database
  Number of database replica copies                 1 (standalone)
  RAID policy                                       RAID 50
  Array model, amount of units, SAN configuration   1x PS6100XV 3.5”, one single pool (default)
Exchange messaging service end users retrieve and store their data on the server databases traversing
the Exchange Server services. In turn, the Exchange Information Store service running on the mailbox
server benefits from the database cache to perform its own storage access instructions via the ESE
interface. The database cache is retained in memory, thus the access to and from it is considerably
faster than if performed with the storage subsystem directly. Furthermore, Exchange utilizes internal
algorithms to combine changes to the same blocks of data before flushing them to the disks,
achieving an even lower amount of IO access.
Formerly, the metric adopted for Exchange user activities was based on a light-to-heavy classification
with intermediate degrees of workload. Currently, user profiles are identified by the number of
messages sent and received per mailbox per day, where the average message size is 75 KB, together
with the corresponding amount of memory allocated in the database cache for each mailbox.
The three profiles selected for the test, reported in Table 2, specify an average mailbox workload
progressing from 50 messages to 150 and then to 300 messages per day. The corresponding average
cache allocations for the same mailbox are estimated to be 3 MB, 6 MB, and 18 MB. The estimated
IOPS per mailbox for these profiles are 0.06, 0.18, and 0.36 IOPS.
When applying these figures to our simulated scenario of 5,000 mailboxes, the industry accepted average
user count for a medium enterprise, hosted by a single server, we obtain a total estimated IOPS
progression (transactional only) of 300, 900, and 1,800 IOPS. These estimated IOPS values do not include
the remaining Exchange Server activities that generate IO access. The database cache estimated for
this pool of users would be 15 GB, 30 GB, and 90 GB, again without considering the additional
memory requirements due to other factors.
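The totals quoted above follow from multiplying the per-mailbox figures by the number of mailboxes. The short sketch below reproduces that arithmetic; the function name is illustrative, and decimal units (1 GB = 1,000 MB) are assumed because they match the rounded figures in the text.

```python
# (messages per day, IOPS per mailbox, cache MB per mailbox) as quoted in the text
PROFILES = [(50, 0.06, 3), (150, 0.18, 6), (300, 0.36, 18)]
MAILBOXES = 5000


def profile_totals(mailboxes, iops_per_mailbox, cache_mb_per_mailbox):
    """Return (total transactional IOPS, total database cache in GB)."""
    total_iops = round(mailboxes * iops_per_mailbox)
    # Assumption: decimal units (1 GB = 1,000 MB), which match the text's figures.
    cache_gb = mailboxes * cache_mb_per_mailbox / 1000
    return total_iops, cache_gb


for messages, iops, cache_mb in PROFILES:
    total_iops, cache_gb = profile_totals(MAILBOXES, iops, cache_mb)
    print(f"{messages:>3} msg/day: {total_iops:>5} IOPS, {cache_gb:.0f} GB cache")
```

These are transactional estimates only; as noted above, they exclude other Exchange activities and additional memory requirements.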
The results collected from the Exchange Jetstress simulation of these three workload profiles are
shown in Figure 4.
Figure 4 Storage trend under increasing workload per mailbox with 5,000 users distributed across 5 DBs
Note: When simulating a different workload with Exchange Jetstress, the tool increases and
decreases the generated IOPS via a couple of tuning parameters (threads and sluggish sessions). It
is not always viable to exactly match the planned with the achieved IOPS because sometimes the
number of threads used distributes more requests. This happens regardless of the effort to
configure Exchange Jetstress to perform the exact IOPS planned. The resultant set of latency
values must be regarded as the performance obtained by achieved and not planned IOPS.
If the latency values are to be read as absolute results for the planned load, they should be
scaled from achieved to planned IOPS through a percentage-based equation.
During the workload analysis we observed some behavioral changes worth mentioning:
• The reads/writes ratio changes with the workload. The heavier the load, the more writes per
second were recorded: an even average of 50% reads/50% writes while running the
reference workload profile of 150 messages per day, a 58%/42% ratio for the 50 messages
profile, and a 48%/52% ratio for the 300 messages profile.
• The IOPS differential represented in Figure 4 is almost entirely due to the database
maintenance activities. The amount of IOPS generated to perform the maintenance for our
database was constant. When evaluated as a percentage, the maintenance load had a heavy
29% impact in the lowest workload scenario (0.06 IOPS per mailbox), but decreased to 11% and
then 6% in the remaining simulations, which demanded more transactional IOPS to complete.
• The latency values recorded were well below the Microsoft advised thresholds of 20 msec for
the DB reads and writes and 10 msec for the log writes, even under the heaviest load of the 300
messages per day profile.
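The shrinking share of the maintenance load can be illustrated with a short calculation. The total IOPS values are those reported in Table 4; the constant maintenance figure of about 167 IOPS is an assumption back-calculated from the 29% share of the lightest profile, not a value stated in this paper.

```python
# Total IOPS performed per profile (messages per day -> IOPS), from Table 4.
TOTAL_IOPS = {50: 575, 150: 1530, 300: 2762}

# Assumption: constant maintenance load, back-calculated as roughly 29% of 575 IOPS.
MAINTENANCE_IOPS = 167


def maintenance_share(maintenance_iops, total_iops):
    """Maintenance IOPS as a percentage of the total IOPS performed."""
    return 100.0 * maintenance_iops / total_iops


shares = {
    messages: round(maintenance_share(MAINTENANCE_IOPS, total))
    for messages, total in TOTAL_IOPS.items()
}
print(shares)
```

Under this assumption the same absolute maintenance load accounts for a progressively smaller fraction of the total as the transactional workload grows, in line with the percentages reported above.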
Table 3 shows an example of converting the resultant set of latency values to match the discrepancy
between planned versus achieved IOPS.
Table 3 Test results: example of latency value conversions to match planned versus achieved IOPS
                              50 msg / 0.06 IOPS    150 msg / 0.18 IOPS   300 msg / 0.36 IOPS
IOPS achieved <> planned      299 IOPS [100%]       1018 IOPS [113%]      2194 IOPS [122%]
Read Latency DBs [avg]        6.4 msec              7.7 msec              12.9 msec
Write Latency DBs [avg]       2.4 msec              5.2 msec              8.1 msec
Write latency LOGs [avg]      0.8 msec              1.0 msec              2.5 msec
IOPS equated = planned        300 IOPS [100%]       900 IOPS [100%]       1800 IOPS [100%]
Read Latency DBs equated      unchanged             6.8 msec              10.6 msec
Write Latency DBs equated     unchanged             4.6 msec              6.7 msec
Write latency LOGs equated    unchanged             0.9 msec              2.1 msec
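The paper does not spell out the conversion used for Table 3. A minimal sketch, assuming a simple linear scaling of the measured latency by the ratio of planned to achieved IOPS, reproduces most of the equated values; the function name is illustrative.

```python
def equate_latency(measured_ms, achieved_iops, planned_iops):
    """Scale a measured latency to the planned IOPS level.

    Assumption: latency scales linearly with IOPS over this range. This is a
    simplification used for illustration, not a documented Jetstress formula.
    """
    return measured_ms * planned_iops / achieved_iops


# 150 messages profile: 1018 IOPS achieved versus 900 IOPS planned.
print(round(equate_latency(7.7, 1018, 900), 1))    # DB read latency
print(round(equate_latency(5.2, 1018, 900), 1))    # DB write latency

# 300 messages profile: 2194 IOPS achieved versus 1800 IOPS planned.
print(round(equate_latency(12.9, 2194, 1800), 1))  # DB read latency
```

With the rounded inputs from Table 3, the 300 messages database write latency comes out about 0.1 msec lower than the published 6.7 msec, presumably because the published figures were derived from unrounded measurements.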
Table 4 lists the relative improvement or decline of performance recorded during the increase of
workload. The percentages are calculated against the total IOPS shown in Figure 4, not just the
transactional IOPS. The relevant metrics to evaluate Exchange responsiveness (database or log
latency) must be derived from the entire amount of access performed against the storage subsystem,
as the transactional load is merely a subset of the total.
Table 4 Test results: improvement or decline relationship of performance under workload increase
                              50 msg / 0.06 IOPS    150 msg / 0.18 IOPS   300 msg / 0.36 IOPS
IOPS planned                  100% [300 IOPS]       300% [900 IOPS]       600% [1800 IOPS]
Total IOPS performed          100% [575 IOPS]       266% [1530 IOPS]      480% [2762 IOPS]
Read Latency DBs              100% [6.4 msec]       121% [7.7 msec]       203% [12.9 msec]
Write Latency DBs             100% [2.4 msec]       223% [5.2 msec]       345% [8.1 msec]
Write latency LOGs            100% [0.8 msec]       125% [1.0 msec]       323% [2.5 msec]
The outcomes exhibit a strong trend: the Exchange storage performance indicators
(latencies) report a performance penalty consistently lower than the increase in workload. Multiplying
the workload by three or six did not result in a similar increase in latency. Furthermore, the
write latencies, for both database and logs, grew more than the read latencies, which confirms the trend
of the read/write ratio previously noted (the write percentage increased with the overall
workload). The delivered IOPS, and the resultant latencies, for the second and third tests would show
lower values if they were equated following the approach presented in Table 3.
5.2 Characterize the mailbox size
The goal of the mailbox size variance analysis was to establish the storage trend, IOPS ratios, and
relationship while varying the size of the user mailboxes and maintaining the remaining factors. The
configuration parameters for the test are shown in Table 5.
Table 5 Test parameters: varying mailbox size
Reference configuration: variable factor
  Mailbox size                                      500 MB / 1 GB / 1.5 GB per mailbox
  Databases size                                    500 GB / 1 TB / 1.5 TB each
Reference configuration: unchanged factors
  Number of simulated mailboxes/users               5,000 concurrent users
  Number of databases                               5 databases (active)
  Mailbox allocation                                1,000 mailboxes per each mailbox database
  Number of database replica copies                 1 (standalone)
  IOPS per mailbox / messages per day per mailbox   0.18 IOPS / 150 messages
  RAID policy                                       RAID 50
  Array model, amount of units, SAN configuration   1x PS6100XV 3.5”, one single pool (default)
Exchange Server database size is determined by multiple factors. The main component affecting the
capacity requirements is the average mailbox size per user, known as the mailbox storage quota, where
the term quota is usually associated with enforcing a maximum space or capacity
limit. The database file size can be estimated by multiplying the mailbox storage
quota by the number of mailboxes per database (e.g. 700 mailboxes with a mailbox storage quota of
800 MB equate to a minimum database capacity requirement of 560 GB).
The supplementary factors contributing to the database capacity requirements are the dumpster size,
directly related to each mailbox, and the database empty space (a by-product of mailbox usage and
database maintenance activity). The mailbox dumpster contains the items deleted by the end users,
but not yet purged from the Exchange system according to a set of retention rules (retention window
per mailbox items, per calendar items, single item recovery). As the maximum size of the database is
not set, continuous user activity allocates database pages where they are free, or eventually expands the
database file if new ones are needed. The empty space in the database accumulates as the database
maintenance frees tombstoned objects and the online defragmentation consolidates the user data,
optimizing the B-tree.
For additional information about Exchange mailbox capacity factors, refer to the Microsoft
documentation Understanding Mailbox Database and Log Capacity Factors, available at:
http://technet.microsoft.com/en-us/library/ee832796.aspx
The three mailbox storage quotas selected for the test (and listed in the Table 5) specify an average
mailbox size expanded from 512 MB to 1 GB, and then 1.5 GB. The resultant database file size for our
scenario of 1,000 users per database becomes respectively 512 GB, 1 TB, and 1.5 TB.
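The database sizes above, and the volume sizes derived from the 120% rule in Table 1, can be reproduced with the sketch below. The function names are illustrative, decimal units (1 GB = 1,000 MB) are assumed, and the estimate covers mailbox quota only; it leaves out the dumpster and database empty space.

```python
def database_size_gb(mailboxes_per_db, quota_mb):
    """Minimum database file size: mailboxes per database x mailbox quota."""
    # Assumption: decimal units (1 GB = 1,000 MB), matching the rounded figures.
    return mailboxes_per_db * quota_mb / 1000


def volume_size_gb(db_size_gb, sizing_factor=1.20):
    """Volume size as 120% of the database size (reference configuration)."""
    return db_size_gb * sizing_factor


for quota_mb in (512, 1000, 1500):
    db_gb = database_size_gb(1000, quota_mb)
    print(f"quota {quota_mb} MB: database {db_gb:.0f} GB, "
          f"volume {volume_size_gb(db_gb):.0f} GB")
```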
The results collected from the Exchange Jetstress simulation of these three mailbox sizes are reported in
Figure 5.
Figure 5 Storage trend under different mailbox sizes with 5,000 users distributed across 5 DBs
Table 6 lists the relative improvement or decline of performance recorded during the growth of
mailbox size. The percentages are calculated against the total IOPS (not just transactional IOPS) shown
in Figure 5.