# What is the Feature Extraction tool?

The Feature Extraction tool in ActiLife allows users to analyze specific time and frequency domain features of the high-resolution raw data produced by ActiGraph's "w" and "BT" series of devices (GT3X+, ActiSleep+, wGT3X+, wActiSleep+ and wGT3X-BT).

### Overview of High Resolution Raw Data

ActiGraph's "w" series devices are capable of measuring up to +/-6 *g*s in three dimensions; "BT" series devices are capable of measuring up to +/-8 *g*s in three dimensions. The orientation of the device and the corresponding axes are shown below.

When oriented perfectly in each vertical, horizontal or perpendicular direction, the accelerometer will produce a negative *g* value for that axis's output as shown below (the other axes' readings will be zero).

The ActiGraph device samples and stores data for each axis at a rate between 30 and 100Hz, as selected by the user at the time of initialization. The data stored during this time is what we refer to as "raw data." Use ActiLife to download the data from the device; doing so will produce a *.gt3x file. This *.gt3x file is an archive file (zip) that contains several files, including a *.bin file which contains the raw sample data. (**NOTE: This sample data is bit-packed using a special algorithm and is subject to change at any time. Use ActiLife to process *.gt3x files to guarantee the data remains consistent from device to device.**)
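Because a *.gt3x file is a zip archive, its contents can be listed with standard tools. The sketch below uses Python's `zipfile` module only to show which files an archive holds. The helper name `list_gt3x_members` is hypothetical, and the sketch deliberately does not try to decode the bit-packed *.bin data, which should still be processed with ActiLife.

```python
import zipfile


def list_gt3x_members(path):
    """Return the names of the files stored inside a *.gt3x (zip) archive.

    This only inspects the archive directory; it does not decode the
    bit-packed raw samples in the *.bin member.
    """
    with zipfile.ZipFile(path) as archive:
        return archive.namelist()
```

Calling `list_gt3x_members("subject01.gt3x")` (a hypothetical file name) returns a list of member names that can be checked for the expected *.bin file.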

**Extracting Features**

Once data has been collected, the Feature Extraction tool can be used to isolate certain time and/or frequency domain characteristics from the raw data. Use the "Add Dataset(s)" option to select up to 100 *.gt3x files for analysis in the Feature Extraction tool.

**Terms Defined**

- **Array:** An array of raw data samples with width equal to *window size* × *sample rate*. E.g., for a Window Size = 10s and a Sample Rate = 30Hz, the array will contain 300 samples for each 10s period within the file. The array is used to break up the file for feature analysis.
- **Feature:** The desired feature to extract from the selected array.
- **Window Size:** Defines the time interval of the selected array.
- **Sample Rate:** The data sample rate set at time of initialization (30-100Hz in 10Hz increments).
- : These are the arrays, referenced below, that relate to data from the X, Y and Z axes.
- represents any array (X, Y or Z)
- **Array Length:** The length of the array depends on the Window Size and the Sample Rate. E.g., 30Hz Sample Rate and 1s Window Size = array length of 30.
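The relationship between Window Size, Sample Rate and Array Length can be illustrated with a short Python sketch. The function names are hypothetical, and dropping a trailing partial window is an assumption; the source does not say how ActiLife handles leftover samples.

```python
def array_length(window_size_s, sample_rate_hz):
    """Number of samples in one array: window size (s) x sample rate (Hz)."""
    return window_size_s * sample_rate_hz


def split_into_arrays(samples, window_size_s, sample_rate_hz):
    """Break one axis of raw samples into consecutive, non-overlapping arrays.

    Assumption: samples left over after the last full window are dropped.
    """
    n = array_length(window_size_s, sample_rate_hz)
    return [samples[i:i + n] for i in range(0, len(samples) - n + 1, n)]
```

For a 10s Window Size at 30Hz, `array_length(10, 30)` gives 300, matching the example in the definitions above.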

To extract features:

## Step 1: Select arrays from the Time or Frequency domain section

**Available Arrays**

- X = accelerometer X-axis data
- Y = accelerometer Y-axis data
- Z = accelerometer Z-axis data
- Vector Magnitude = square root of the sum of the squares of each axis
- First Order Differential of X = the first order differential of the accelerometer X-axis data
- First Order Differential of Y = the first order differential of the accelerometer Y-axis data
- First Order Differential of Z = the first order differential of the accelerometer Z-axis data
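As a minimal sketch of the two derived arrays, the Python below computes the vector magnitude per sample and a first order differential. Treating the differential as the difference between consecutive samples (so the result is one element shorter than the input) is an assumption; the source does not state how ActiLife handles the first sample or whether it scales by the sample interval.

```python
import math


def vector_magnitude(x, y, z):
    """Square root of the sum of the squares of each axis, per sample."""
    return [math.sqrt(a * a + b * b + c * c) for a, b, c in zip(x, y, z)]


def first_order_differential(values):
    """Difference between consecutive samples (assumed definition)."""
    return [b - a for a, b in zip(values, values[1:])]
```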

## Step 2: Select the features to be extracted

**Time Domain Features:**

- Min = determines the minimum value of the selected array: *min(x)*, *min(y)*, *min(z)*
- Max = determines the maximum value of the selected array: *max(x)*, *max(y)*, *max(z)*

- Mean = determines the mean value of the selected array
- Standard Deviation = determines the standard deviation of the selected array
- Correlation XY = determines the correlation between the accelerometer X & Y axes
- Correlation XZ = determines the correlation between the accelerometer X & Z axes
- Correlation YZ = determines the correlation between the accelerometer Y & Z axes
- Median Crossing = determines the number of zero crossings of the median of the selected array
- Entropy = determines the measure of uncertainty of the selected array. (**coming in ActiLife 6.9**)
- 10th Percentile = determines the 10th percentile rank of the selected array.
- 25th Percentile = determines the 25th percentile rank of the selected array.
- 50th Percentile = determines the 50th percentile rank of the selected array.
- 75th Percentile = determines the 75th percentile rank of the selected array.
- 90th Percentile = determines the 90th percentile rank of the selected array.
- The percentile rank formula is discussed at http://en.wikipedia.org/wiki/Percentile_rank
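The time domain features above can be sketched in plain Python for a single array. This is an illustration under stated assumptions, not ActiLife's implementation: the population standard deviation is used (the source does not say whether the sample or population form applies), correlation is taken to be Pearson's, a median crossing is counted as a sign change of the signal relative to its median, and `percentile_rank` follows the usual textbook percentile rank formula. All function names are hypothetical.

```python
import math
import statistics


def pearson_correlation(a, b):
    """Pearson correlation between two equal-length arrays (assumed method)."""
    mean_a, mean_b = statistics.fmean(a), statistics.fmean(b)
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    denom = math.sqrt(var_a * var_b)
    # A constant array has zero variance, so the correlation is undefined.
    return cov / denom if denom else float("nan")


def median_crossings(values):
    """Count sign changes of (value - median); samples equal to the median are skipped."""
    med = statistics.median(values)
    signs = [1 if v > med else -1 for v in values if v != med]
    return sum(1 for s, t in zip(signs, signs[1:]) if s != t)


def percentile_rank(values, score):
    """Percentile rank of `score`: (count below + half the count equal) / N * 100."""
    below = sum(1 for v in values if v < score)
    equal = sum(1 for v in values if v == score)
    return 100.0 * (below + 0.5 * equal) / len(values)


def time_domain_features(values):
    """Basic per-array features; population standard deviation is an assumption."""
    return {
        "min": min(values),
        "max": max(values),
        "mean": statistics.fmean(values),
        "std": statistics.pstdev(values),
        "median_crossings": median_crossings(values),
    }
```

Correlation XY, XZ and YZ would then be `pearson_correlation` applied to the corresponding pair of axis arrays for the same window.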

**Frequency Domain Features:**

- Dominant Frequency = calculates the frequency content of the selected array which is above the selected cutoff frequency, then determines which frequency has the highest magnitude.
- Dominant Frequency Magnitude = calculates the frequency content of the selected array which is above the selected cutoff frequency, then determines the highest magnitude of all frequencies.
- Entropy = determines the measure of uncertainty of the frequency content of the selected array which is above the selected cutoff frequency. (**coming in ActiLife 6.9**)
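The Dominant Frequency and Dominant Frequency Magnitude features can be sketched with a plain discrete Fourier transform. The Python below is a minimal illustration, not ActiLife's algorithm: the function name, the magnitude scaling (absolute DFT coefficient divided by the array length) and the exclusion of the 0Hz component are all assumptions. Frequencies below `cutoff_hz` are skipped, in line with the Frequency Cutoff option.

```python
import cmath
import math


def dominant_frequency(values, sample_rate_hz, cutoff_hz):
    """Return (frequency, magnitude) of the strongest component at or above cutoff_hz.

    Uses a direct O(n^2) DFT over the positive-frequency bins; bin k
    corresponds to k * sample_rate_hz / n Hz.
    """
    n = len(values)
    best_freq, best_mag = None, 0.0
    for k in range(1, n // 2 + 1):
        freq = k * sample_rate_hz / n
        if freq < cutoff_hz:
            continue  # ignore components below the selected cutoff
        coeff = sum(
            v * cmath.exp(-2j * math.pi * k * i / n) for i, v in enumerate(values)
        )
        mag = abs(coeff) / n  # assumed scaling
        if best_freq is None or mag > best_mag:
            best_freq, best_mag = freq, mag
    return best_freq, best_mag
```

For a 1s window at 30Hz containing a strong 1Hz component and a weaker 5Hz component, a cutoff of 2Hz makes the 5Hz component the dominant one, because the 1Hz bin is ignored.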

### Special Note about Entropy Calculations

ActiLife will support a normalized and an un-normalized entropy calculation for both frequency and time-domain calculations. For an overview of entropy calculation methods for these types of data (and justification for non-normalized outputs), see *Nonparametric Entropy Estimation, an Overview*, J. Beirlant, E. J. Dudewicz, L. Gyorfi, and E. C. van der Meulen (1997), Volume 6, pp. 17–39, at http://jimbeck.caltech.edu/summerlectures/references/Entropy%20estimation.pdf
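To illustrate the difference between a normalized and an un-normalized result, the sketch below computes Shannon entropy over a list of non-negative weights, such as histogram counts of a time domain array or spectral magnitudes above the cutoff. The choice of Shannon entropy, base-2 logarithms, and normalization by the logarithm of the number of bins are assumptions made for illustration; the source does not specify which estimator ActiLife uses.

```python
import math


def shannon_entropy(weights, normalized=False):
    """Shannon entropy (in bits) of non-negative weights treated as a distribution.

    With normalized=True the result is divided by log2(number of bins),
    so it falls between 0 and 1 (assumed normalization).
    """
    total = sum(weights)
    if total <= 0:
        return 0.0
    probs = [w / total for w in weights if w > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    if normalized:
        bins = len(weights)
        return entropy / math.log2(bins) if bins > 1 else 0.0
    return entropy
```

A flat distribution over four bins gives 2 bits un-normalized and 1.0 normalized, while a distribution concentrated in a single bin gives 0 in both cases.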

## Step 3: Select the Window Size

The "Window Size" option selects the time interval that the array should span and can be different for Time and Frequency outputs.

## Step 4: Select the Frequency Cutoff (for Frequency Domain Analytics Only)

The "Frequency Cutoff" option allows the user to select the lowest allowable frequency that can be included in the analysis. Any frequency components below the selected frequency will be ignored during the analysis.

# Viewing Results

After clicking "Calculate" at the bottom, the results can be viewed in ActiLife by clicking the "Results" button to the right of the file(s) as shown below.

In the results example shown below, the Time Domain output is set to a Window Length of 1s while the Frequency Domain output is set to a Window Length of 10s. Note that the Frequency Domain output latches for 10s while the Time Domain output updates every 1s.
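The latching behaviour can be pictured as holding each Frequency Domain value constant across the Time Domain rows that fall inside its window. The Python sketch below aligns the two outputs on a shared timeline. The function name and the row layout are hypothetical and do not reflect ActiLife's actual results table or export format.

```python
def align_outputs(time_values, freq_values, time_window_s=1, freq_window_s=10):
    """Pair each time-domain value with the frequency-domain value covering it.

    Returns rows of (elapsed seconds, time-domain value, latched frequency value).
    Rows beyond the last complete frequency window are omitted.
    """
    rows = []
    for i, t_val in enumerate(time_values):
        elapsed = i * time_window_s
        j = elapsed // freq_window_s  # index of the frequency window in effect
        if j >= len(freq_values):
            break
        rows.append((elapsed, t_val, freq_values[j]))
    return rows
```

With a 1s Time Domain window and a 10s Frequency Domain window, rows 0 through 9 share the first frequency value and the value changes at row 10.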

The same output for all files can be exported in batch to *.csv using the "Export" button at the bottom of the main screen. A single *.csv file will be created for each file. The timestamp given in the *.csv export can be formatted to one-second resolution as desired using Excel.