DataMover For Windows

Help Topics

Getting Started

DataMover consists of two screens: the Basic Dialog (BD) and the Advanced Dialog (AD).  Both dialogs require a license, but only one license is needed.  The BD will operate for approximately 30 seconds before informing the user that a valid license is required.  The AD has no trial period; it always requires a valid license.  Why would anyone want to use the BD screen?  It provides a quick and dirty way to maximize bandwidth in half duplex or full duplex operation, and it can be used in conjunction with a bus analyzer to monitor for framing and protocol errors.  What are the downsides of using only the BD screen?  It performs only synchronous I/O with a fixed transfer length and does not perform data integrity checks.  Here's a screenshot of the BD screen:

Image

And here's a picture of the AD screen:

Image

How To Register DataMover:

Open the AD Dialog by selecting a target.

Image

Image

Before DataMover is registered, the About Dialog will display 'Not Registered.'

Image

Select the Advanced -> Features -> Logical menu item to open the AD dialog.

Image

Obtain a valid license from The Moojit, then select License -> Import to open a dialog that allows you to locate the license file provided by The Moojit.

Image

If the Import was successful, this dialog will be displayed indicating that the license has been registered successfully.

Image

To verify your copy of DataMover has been successfully registered, close the AD dialog and open the About dialog.  Notice how the third entry window has changed.

Image

Basic Dialog Operation

  1. READ, WRITE or BOTH Operation - Selects the direction of the I/O.  READ is half duplex from the target to the host.  WRITE is half duplex from the host to the target.  BOTH is full duplex, moving data both to and from the host.
  2. Threads - Number of simultaneous threads.  If BOTH is selected, an equal number of threads is used in each direction.  For example, if threads = 10, ten READ threads and ten WRITE threads will be spawned.
  3. Pattern - BYTE pattern selectable by the user.  Only hexadecimal digits are accepted (0 - 9 and A - F).  Default value is 0xAA.
  4. START/STOP Buttons - START begins I/O, STOP ends I/O.
  5. Menu Selections
  6. Select Logical Drives - Selects the logical targets to perform I/O to.
  7. Select Physical Drives - Selects the physical targets to perform I/O to.
  8. Advanced Patterns (CRPAT, CSPAT, CJTPAT, RESET) - CRPAT, CSPAT and CJTPAT are modified versions of the test patterns defined by the FC-MJSQ specification on www.t11.org.  They differ in two key areas:  1.)  The pattern is repeated until it fills the selected buffer size and 2.)  Valid fibre channel headers are used instead of using these 24 bytes for the data pattern.  As a third party vendor application, DataMover cannot modify fibre channel frame formatting, and if it did, the frame would not be recognizable to another fibre channel device or fabric.  RESET sets the pattern back to 0xAA.
  9. Advanced Features Logical - Opens the Advanced DataMover Dialog in Logical Mode.
  10. Advanced Features Physical - Opens the Advanced DataMover Dialog in Physical Mode.
  11. Status - RUN, STOP or IDLE will be displayed.
  12. Location - Number of targets selected and their string identifiers.

Advanced Dialog Operation

  1. Threads - This value is selected in the basic dialog.
  2. Transfer Size - Range 512B to 4MB.  Transfer size must be less than or equal to buffer size.
  3. Buffer Size - Range 512B to 4MB.  Buffer size must be greater than or equal to transfer size.
  4. Operation - This parameter is selected in the basic dialog.
  5. Pattern - This can be the pattern selected in the basic dialog or the pattern imported by the advanced dialog.
  6. Status - RUN, STOP or FAIL will be displayed.  If error checking is enabled, and one of the threads experiences a failure, FAIL will be displayed.
  7. I/O Depth - The number of I/Os that will be queued; the maximum value is 63.  This control is disabled when the asynchronous check box is not selected.
  8. Extended - If error checking is enabled and a thread experiences a failure, a more detailed description of the failure is displayed here.  An Application event providing the most detail on the error condition is also logged.
  9. Location - Number of targets selected and their string identifiers.
  10. Elapsed Time - DAYS:HOURS:MINUTES:SECONDS
  11. Performance - Displayed in Megabytes per second.
  12. IOPS - Number of successful Input/Output Operations per second.
  13. Target Window
    1. Target - String identifier of target.
    2. Status - INIT, RUN, STOP or FAIL.  INIT will be displayed while the thread is initializing and has not started I/O to the target.  If error checking is enabled and a thread experiences a failure, FAIL will be displayed.
    3. MB/s - Megabytes per second.
    4. IOPS - Number of successful input/output operations per second.
    5. MB/s (READ) - Megabytes per second, read only direction.
    6. IOPS (READ) - Number of successful input/output operations per second, read only direction.
    7. MB/s (WRITE) - Megabytes per second, write only direction.
    8. IOPS (WRITE) - Number of successful input/output operations per second, write only direction.
    9. Error - If error checking is enabled and a thread experiences a failure, a more detailed description of the failure is displayed here.  An Application event providing the most detail on the error condition is also logged.
    10. Operation - This parameter is selected in the basic dialog.
    11. Chunk - Transfer size selected for target.
    12. Buffer - Buffer size selected for target.
    13. Performance - Will display ENABLED or DISABLED.
    14. Error Checking - Will display ENABLED or DISABLED.
    15. I/O Depth - Queue depth per thread associated with asynchronous I/O.
    16. Threads - The number of threads per target.
    17. Direction - I/O direction selected for target (READ, WRITE, or READ/WRITE).
    18. Latency - Average latency value for target, READ and WRITE direction (in milliseconds).
    19. Latency (READ) - Average Latency value for target, READ only direction (in milliseconds).
    20. Latency (WRITE) - Average latency value for target, WRITE only direction (in milliseconds).
  14. Error Checking Check Box - If selected, error checking will be enabled.
  15. Performance Check Box - If selected, performance monitoring will be enabled.
  16. Asynchronous Check Box - If selected, asynchronous I/O will be enabled.
  17. Random LBA - If selected, random LBA selection will be enabled.  The data pattern is maintained for all I/O combinations except Performance.
  18. START/STOP/MODIFY Buttons - START begins I/O.  STOP stops I/O.  MODIFY can be used to vary transfer size, buffer size, asynchronous or synchronous mode, error checking and performance monitoring on a per target basis.  For example, the user would make the appropriate selections in the advanced dialog, highlight a target in the target list window and select MODIFY.  When the user selects START, targets that were not changed with MODIFY get the currently selected advanced dialog settings.
  19. Menu Selections
    1. License Import - Imports the license.  This procedure must be performed prior to registering the product.
    2. License Register - Registers the application with the licensing server.
    3. License UnRegister - Removes the registration from the licensing server.
    4. Options Import Data File - Imports a custom data pattern.
    5. Options Performance Test - Runs automated performance test suite over a range of transfer sizes producing a results report at the end along with the option to save the results to a comma delimited file.  This test takes approximately 1.5 hours to complete.  Please refer to the Performance test section for more details.
    6. Options Quick Performance Test - Runs a quick automated performance test suite for small (512B) and large (1024KB) block only and displays results at the end.  Results can be saved to a text file.  The quick test time may vary between 5 - 7 minutes.
    7. More Performance Tests Quick Walking Test - Performance test available for physical targets only.  Purpose of test is to minimize target cache hits on READ.  This test runs the Earth I/O profile.
    8. More Performance Tests Long Walking Test - Performance test available for physical targets only.  Purpose of test is to minimize target cache hits on READ.  This test runs the Earth I/O profile.
    9. Options Clear Modifications - Clears per target modifications made in physical or logical mode.
    10. Options Target Test Standalone - Selects Standalone Target test mode.
    11. Options Target Test Cluster - Selects Cluster Target test mode.
    12. Options Target Test Random Standalone Test - Selects random Standalone Target test mode.
    13. Options Target Test Random Cluster Test - Selects random Cluster Target test mode.
    14. Options Target Test Clear Target - Disables target mode testing and restores original settings.
    15. Options Preferences - Opens a dialog that allows the user to adjust specific settings related to internet access for registration purposes and application behavior.
    16. Options Custom Tests Random I/O Test - Selects random I/O test mode.
    17. Options Custom Tests Clear Tests - Disables custom test and restores Advanced Dialog settings made prior to custom test selection.
    18. Options Custom Tests Mercury - Available for physical targets only.
    19. Options Custom Tests Venus - Available for physical targets only.
    20. Options Custom Tests Earth - Available for physical targets only.
    21. Options Recall Last Performance Data - Displays results of last Quick or Long performance test that was run.
    22. View Performance History Bytes Write (Red) - Enable or disable Write Performance and IOPS graph data.
    23. View Performance History Bytes Read (Yellow) - Enable or disable Read Performance and IOPS graph data.
    24. View Performance History Bytes Total (Green) - Enable or disable Total Performance and IOPS graph data.
  20. Timeout - I/O timeout is adjustable if performing I/O to a physical target in synchronous mode.  The default setting is 16 seconds.  This is the timeout setting used by the SCSIport or Storport driver.
  21. Error Injection - Error Injection is available when performing I/O on physical targets in synchronous or asynchronous modes.  The default setting is zero seconds.  If any value other than zero is selected, bus resets will be performed on the selected targets periodically.  For example, if the user selects 20 in this dropdown window, a bus reset will be performed once every 20 seconds.  Bus resets test the ability of the driver and firmware to perform error recovery and I/O retries that are transparent to upper level applications.  I/O should continue after a bus reset without data corruption or errors.
  22. Latency - The average latency value that combines both READ and WRITE I/O latency values for all targets. 
  23. Read Latency - The average value of READ I/O latency values for all targets.
  24. Write Latency - The average value of WRITE I/O latency values for all targets.

Memory Considerations

DataMover's memory requirements depend on the settings chosen by the user.  Asynchronous I/O is the worst case configuration for memory usage.  The buffer size, I/O Depth and thread count dictate the amount of memory required.  For example, if the user selects READ and WRITE operation (BOTH), 25 threads, a buffer size of 4MB and an I/O Depth of 63, the memory requirement per target is 50 * 4MB * 63 = 12.3GB (BOTH spawns 25 threads in each direction, for 50 threads in total).  Based on this calculation, it's easy to see that the user must have a good understanding of the memory configuration of the system under test.  Avoid using virtual memory that resides in the system's page file; this will have a negative impact on performance.  Please refer to Microsoft Knowledge Base Article Q283037, which describes large memory support in Windows and how to enable it.
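The arithmetic above can be reproduced with a short script before starting a test.  This is only a sketch of the calculation described in this section; the function name and parameters are hypothetical and are not part of DataMover.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3


def memory_per_target(threads, buffer_bytes, io_depth, both_directions):
    """Estimate worst case (asynchronous) buffer memory for one target.

    Hypothetical helper: mirrors the formula in the text,
    effective threads * buffer size * I/O depth.
    """
    # BOTH spawns the selected thread count in each direction.
    effective_threads = threads * 2 if both_directions else threads
    return effective_threads * buffer_bytes * io_depth


required = memory_per_target(threads=25, buffer_bytes=4 * MIB,
                             io_depth=63, both_directions=True)
print(f"{required / GIB:.1f} GB per target")  # prints "12.3 GB per target"
```

Multiply the result by the number of targets selected to estimate the total, and compare it against the physical memory available on the system under test.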

System Performance Considerations

The following components all have impact on I/O performance.

  • Application performing the I/O
  • Host system design and configuration
  • HBA hardware design
  • HBA maximum queue depth
  • HBA driver
  • HBA firmware
  • Storage Interconnects between Initiator and Target
  • Target cache configuration
  • Target firmware
  • Target hardware design
  • Target maximum queue depth

DataMover has control over only one of these components - the application.  It has been optimized to perform sequential I/O in asynchronous or synchronous mode to fully exercise the HBA hardware, HBA firmware, HBA driver, storage interconnects, and target controller.  DataMover's purpose is not to test every LBA that exists on a logical or physical object residing on the target.  It does not perform I/O on random LBAs because this primarily tests the target controller's ability to handle rapid discontinuities in LBA location.  From the HBA's perspective, the location of the LBA is irrelevant and has no impact on the HBA's scatter/gather capabilities.  Random LBA I/O stresses the I/O manager on the operating system and has a negative impact on performance.

Importing Custom Data Patterns

Custom data pattern files can be created using any hex-binary editor.  The Moojit prefers HHD's Free Hex Editor.  Each file must begin with a 16 byte header that consists of a NULL terminated, variable length ASCII string of up to 16 bytes (this includes the NULL value).  Data files created by The Moojit can be accessed here; please use these as examples when creating your own.
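A pattern file can also be generated with a script instead of a hex editor.  The sketch below builds the 16 byte header described above and appends the pattern bytes.  Several details are assumptions, not confirmed by this documentation: that unused header bytes are padded with NULL values, that the pattern data follows immediately after the header, and the name "WALKING_ONES" and the function name are purely illustrative.  Compare the output against one of the example files before relying on it.

```python
HEADER_SIZE = 16  # from the documentation: 16 byte header


def build_pattern_file(name, payload):
    """Return file contents: NULL terminated ASCII header + pattern bytes.

    Assumption: the header is NULL padded to 16 bytes and the payload
    starts at offset 16.
    """
    encoded = name.encode("ascii")
    # The 16 bytes include the terminating NULL, so the name is at most 15.
    if len(encoded) > HEADER_SIZE - 1:
        raise ValueError("header string must fit in 15 bytes plus NULL")
    header = encoded + b"\x00" * (HEADER_SIZE - len(encoded))
    return header + payload


# Hypothetical 512 byte walking ones pattern: 01 02 04 08 10 20 40 80 ...
pattern = bytes(1 << (i % 8) for i in range(512))
contents = build_pattern_file("WALKING_ONES", pattern)
```

The resulting bytes can be written to disk with `open(path, "wb")` and then imported through Options Import Data File.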

Performance Testing

DataMover offers a variety of performance testing options that cater to the type of device being evaluated.  These include an Extended and a Quick version of each of the three basic I/O profiles offered - CACHE, WALK, and RANDOM.  For example, if you are interested in the performance of your HBA, the CACHE performance profile would be best suited, and if you're interested in characterizing the performance of your external RAID storage device, all three profiles would be necessary.  The basic profiles are discussed below ...

CACHE - This profile is available on both logical and physical targets.  The purpose of this test is to maximize storage cache hits to produce the highest bandwidth and IOPS values possible.

WALK - This profile is available on physical targets only.  The purpose of this test is to maximize pre-fetch cache hits by utilizing a non-repeating sequential disk I/O pattern.

RANDOM - This profile is available on physical targets only.  The purpose of this test is to minimize cache hits by randomly selecting LBA ranges across the entire disk geometry.

... and the following table summarizes behavior and feature selection.

 Type    Logical  Physical  Quick  Extended  Menu Item
 CACHE   YES      YES       NO     YES       Performance Test
 CACHE   YES      YES       YES    NO        Quick Performance Test
 RANDOM  NO       YES       NO     YES       Performance Test (with Random check box selected)
 RANDOM  NO       YES       YES    NO        Quick Performance Test (with Random check box selected)
 WALK    NO       YES       NO     YES       Walking LBA Test
 WALK    NO       YES       YES    NO        Quick Walking LBA Test

The objective of the performance test suite is to record 15 measurements for each transfer size from 512B to 4MB for half duplex READ, half duplex WRITE and full duplex READ/WRITE operations.  DataMover uses a 30 second ramp up time before it begins capturing performance data.  It applies a dynamic I/O profile, changing I/O type, thread count and queue depth to achieve the best results; the combination that produces the best performance number is recorded.  Each of the standard long play performance tests takes more than 15 hours to complete.  After all samples have been taken, the mean and the standard deviation from the mean are calculated for throughput (MB/s), IOPS (IO/s), CPU utilization, and latency (ms) for all transfer sizes.  This information is shown in the Performance Results dialog that is displayed when the test completes.  The user is given the option to save these results to a comma delimited file, which can later be imported into a spreadsheet program such as Excel to generate graphs and data tables.  When the Performance Dialog closes, the original settings selected prior to the Performance test are restored.  The following pictures demonstrate how to enable the Performance test, what the Performance Results dialog looks like, how to save to a comma delimited file, what the raw imported data looks like in Excel, and what the final report can look like in less than 10 minutes.  When a random or walking LBA test is selected, data patterns are not maintained for the READ I/O direction.
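For readers who want to check the reported statistics against their own captures, the mean and standard deviation for one transfer size can be computed as sketched below.  The sample values are invented for illustration, and the choice of the sample standard deviation (`statistics.stdev`) is an assumption; this documentation does not state whether DataMover uses the sample or the population form (`statistics.pstdev`).

```python
import statistics


def summarize(samples):
    """Return (mean, standard deviation) for one metric at one transfer size."""
    # Assumption: sample standard deviation; swap in pstdev if needed.
    return statistics.mean(samples), statistics.stdev(samples)


# Hypothetical throughput captures in MB/s: 15 measurements, as in the suite.
throughput_mb_s = [398.0, 401.5, 400.2, 399.1, 402.3, 400.0, 397.8, 401.1,
                   400.6, 399.4, 400.9, 398.7, 401.8, 399.9, 400.7]

mean, deviation = summarize(throughput_mb_s)
print(f"mean = {mean:.1f} MB/s, standard deviation = {deviation:.2f} MB/s")
```

The same function applies unchanged to the IOPS, CPU utilization and latency columns.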

How To Enable Performance Test ...

Image 

PERFTEST will be displayed in Operation window.

Image

Select START to begin testing ...

Image

When testing is complete, the Performance Dialog will be displayed showing the results ...

Image

Select SAVE to save the results to a comma delimited file that can be imported into a spreadsheet program such as Excel ...

Image

View of raw data after importing into Excel ...

Image

View of data after cleaning it up a bit ...

Image

Example of IOPS chart made from this data ...

Image

Physical vs. Logical I/O Comparison

This chart shows the difference between physical and logical IOPS values.  The system under test and the storage used were the same for both tests.  Logical I/O was performed on a 5 disk striped LUN and physical I/O was performed on the same 5 disks; no other parameters were changed.  In general, physical targets will have higher performance values than logical targets.

Image

Target Testing

Available on physical targets only.  Two modes of target testing are implemented by DM - Standalone and Cluster.  In Standalone mode, a single client performs I/O on a target with a maximum LBA count of N.  Odd numbered threads start at LBA 0 and increment to LBA N, then repeat.  Even numbered threads start at LBA N and decrement to LBA 0, then repeat.  This process is repeated until the test is stopped.  The Standalone test is a one-to-one relationship; only a single DM client may perform I/O to a given target.  In Cluster mode, multiple DM clients perform I/O on the same target, a many-to-one relationship.  The same test paradigm used in Standalone mode is also used in Cluster mode.  In Cluster mode, if DM detects a Reservation Conflict, it will fail the test.  Data integrity checking is mandatory for both Standalone and Cluster modes of operation.  The random standalone and cluster target tests differ from their standard counterparts in one important aspect - they randomly change thread count, I/O type, and queue depth periodically.  The figures below illustrate the basic difference between Standalone and Cluster mode.
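The odd/even thread behavior described above can be pictured with a small generator.  This is only an illustration of the access order, not DataMover's implementation: it assumes threads are numbered from 1 and that each I/O advances by a single LBA, whereas a real transfer covers as many blocks as the transfer size requires.

```python
from itertools import islice


def walking_lbas(thread_number, max_lba):
    """Yield LBAs in the order a target test thread would visit them.

    Assumptions: thread numbering starts at 1; one LBA per I/O.
    """
    if thread_number % 2 == 1:
        # Odd threads: LBA 0 up to LBA N, then repeat.
        sweep = range(0, max_lba + 1)
    else:
        # Even threads: LBA N down to LBA 0, then repeat.
        sweep = range(max_lba, -1, -1)
    while True:
        yield from sweep


# Tiny target with N = 3 so the wraparound is visible.
odd_order = list(islice(walking_lbas(1, 3), 6))   # [0, 1, 2, 3, 0, 1]
even_order = list(islice(walking_lbas(2, 3), 6))  # [3, 2, 1, 0, 3, 2]
```

Because the two groups sweep in opposite directions, they cross in the middle of the target on every pass.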

Image

Image

Quick Performance Test

The Quick Performance Test provides a condensed version of the longer Performance Test by collecting metrics on just two transfer sizes - 512B and 1024KB (small block and large block respectively).  When the test finishes, a window will be displayed showing the calculated mean and standard deviation from the mean for throughput, IOPS, CPU utilization, and latency values.  The contents of the display can be saved to a text file.  Test times will vary between 5 - 7 minutes.  The figure below shows the results dialog that will be displayed at the end of the test.  When a random or walking LBA test is selected, data patterns are not maintained for the READ I/O direction.

Image

For the BOTH data direction, two latency values will be displayed; the leftmost value is for the READ I/O direction, and the rightmost value is for the WRITE I/O direction.  The format is as follows:  Latency in READ / Latency in WRITE.

Asynchronous vs. Synchronous I/O

DataMover can perform both synchronous and asynchronous I/O.  Synchronous I/O allows only one pending I/O exchange per thread, while asynchronous I/O allows multiple pending I/O exchanges per thread.  By default, I/O in DataMover is synchronous:  an I/O operation is called and does not return until the operation completes or times out.  Asynchronous I/O, on the other hand, allows more than one outstanding I/O request to be issued at the same time, saturating the storage device with multiple simultaneous I/O requests.  Asynchronous I/O typically achieves higher throughput and IOPS values for a given transfer size.
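The difference between the two modes comes down to how many requests one thread keeps outstanding.  The sketch below simulates this with Python's asyncio: each simulated I/O is just a short sleep, and a semaphore stands in for the I/O Depth setting.  It does not touch any storage device and does not reflect how DataMover issues I/O on Windows; all names are hypothetical.

```python
import asyncio


async def run_io(total_ios, io_depth, io_seconds=0.001):
    """Issue simulated I/Os and return the peak number outstanding at once."""
    outstanding = 0
    peak = 0
    gate = asyncio.Semaphore(io_depth)  # limits pending requests

    async def one_io():
        nonlocal outstanding, peak
        async with gate:
            outstanding += 1
            peak = max(peak, outstanding)
            await asyncio.sleep(io_seconds)  # stand-in for device latency
            outstanding -= 1

    await asyncio.gather(*(one_io() for _ in range(total_ios)))
    return peak


# Depth 1 behaves like synchronous I/O: one pending exchange at a time.
sync_peak = asyncio.run(run_io(total_ios=32, io_depth=1))
# Depth 8 keeps up to eight exchanges pending, as asynchronous I/O would.
async_peak = asyncio.run(run_io(total_ios=32, io_depth=8))
print(sync_peak, async_peak)  # prints "1 8"
```

With the same per-I/O latency, the deeper queue finishes the same 32 requests in fewer round trips, which is why asynchronous mode tends to report higher throughput and IOPS.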

Bus Analyzer Signatures

Bus signatures are automatically enabled if Error Checking is turned on.  A specific pattern will be written to the target allowing a bus analyzer such as a SCSI or Fibre Channel analyzer to trigger at the point of failure.  The signature is not written for all errors reported by DataMover - only Data Integrity, READ API and WRITE API errors will trigger the signature pattern.  The patterns for API and Data Integrity failures are as follows:

 ASCII                                         HEX
 "MOOJIT GOT CRC ERROR ON LAST I/O         "   4D4F4F4A 49542047 4F542043 52432045 52524F52 204F4E20 4C415354 20492F4F 20202020 20202020
 "MOOJIT GOT API ERROR ON LAST I/O         "   4D4F4F4A 49542047 4F542041 50492045 52524F52 204F4E20 4C415354 20492F4F 20202020 20202020

The bus analyzer trigger should be set to look for this pattern at offset 0x00 in an FCP_DATA frame, payload offset 0x00 in an iSCSI WRITE command, or offset 0x00 in a TCP payload.  There is a special signature for SCSI Status equal to Reservation Conflict, available only when performing synchronous I/O to physical targets.  This signature was created specifically for the Target Mode Cluster test.  The trigger should be set to look for a vendor specific CDB equal to 0xEF (CDB[0] = EFh) in FCP_CMD frames or iSCSI WRITE commands.

In most cases, the first DWORD (4D4F4F4A) should be sufficient to trigger the analyzer.  
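When entering the trigger into an analyzer, it can help to regenerate the DWORDs from the ASCII text rather than retyping them.  The sketch below does this; the helper name is hypothetical, and the 40 byte length with space padding is inferred from the ten DWORDs listed in the HEX column above.

```python
def signature_dwords(message, length=40):
    """Return the signature as uppercase hex DWORDs (4 bytes each).

    Assumption: the message is space padded to 40 bytes, matching the
    ten DWORDs shown in the table.
    """
    raw = message.ljust(length).encode("ascii")
    return [raw[i:i + 4].hex().upper() for i in range(0, length, 4)]


crc_signature = signature_dwords("MOOJIT GOT CRC ERROR ON LAST I/O")
api_signature = signature_dwords("MOOJIT GOT API ERROR ON LAST I/O")

print(" ".join(crc_signature))
print("first DWORD:", crc_signature[0])  # prints "first DWORD: 4D4F4F4A"
```

The two signatures differ only in the third and fourth DWORDs, so triggering on the first DWORD alone catches both failure types.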

Custom Tests

Random I/O Test - This test will randomly change thread count, I/O type, I/O direction, and queue depth every five minutes.  Error checking and performance metrics are always enabled.

The walking LBA tests are meant to provide an improvement over the common IoMeter profile by providing non-overlapping I/O on both READ and WRITE while allowing multiple threads per target.  Depending on the target's design and caching algorithm, the Mercury, Venus and Earth profiles may provide different results.  The Moojit suggests trying all profiles to determine which provides the best results.

Walking LBA Test Mercury (w/ Init) - This test has been designed to produce zero cache hits on READ.  Please note that with initialization enabled, the length of time required to initialize the target with the data pattern selected by the user is dependent on the size of the target and the buffer size selected.

Walking LBA Test Mercury (w/o Init) - This test has been designed to produce zero cache hits on READ.  Please note that with initialization disabled, data patterns selected by the user will not be maintained in the READ I/O direction.

Walking LBA Test Venus (w/ Init) - This test is meant to minimize cache hits on READ.  Please note that with initialization enabled, the length of time required to initialize the target with the data pattern selected by the user is dependent on the size of the target and the buffer size selected.

Walking LBA Test Venus (w/o Init) - This test has been designed to minimize cache hits on READ.  Please note that with initialization disabled, data patterns selected by the user will not be maintained in the READ I/O direction.

Walking LBA Test Earth (w/ Init) - This test has been designed to produce zero cache hits on READ but using a different I/O profile characteristic compared to Mercury.  Please note that with initialization enabled, the length of time required to initialize the target with the data pattern selected by the user is dependent on the size of the target and the buffer size selected.

Walking LBA Test Earth (w/o Init) - This test has been designed to produce zero cache hits on READ but using a different I/O profile characteristic compared to Mercury.  Please note that with initialization disabled, data patterns selected by the user will not be maintained in the READ I/O direction.

IoMeter Profile - This test has been designed to emulate IoMeter's overlapping I/O profile.  The data patterns selected by the user will not be maintained with this profile for the READ I/O direction.

Real Time Plots

DataMover displays Performance and IOPS real time plots.  Each plot displays three sets of data - one for READ (yellow), one for WRITE (red), and TOTAL (green).  By default, only TOTAL is displayed but the user may select the other two via the menu.  The number displayed in the upper left hand corner of each display is the maximum value for the vertical axis (units MiB/s for Performance plot and IO/s for IOPS), and the horizontal axis is time.

Image