CUAS: Data Normalization

In the simplest terms, we need to compare apples to apples. To accurately compare data from different sources that all relate to a common event – a UAS flight in this case – we must use common frames of reference. Before we dig into examples, let’s cover what “normalizing” data involves.

A definition to kick us off:

Normalizing – Simple and sophisticated adjustments intended to bring a variety of data into a common frame of reference.

At a bare minimum, all of your data should use the same timezone and reference model for the physical location of all participating systems.

Example: All times are converted to GMT, and all positions are converted to LLA (latitude, longitude, altitude) using WGS 84, with altitude in meters above sea level.

To normalize the source data, you should apply well understood transformations to convert the source format into the desired format. For example:

Convert a local time to UTC in Python (similar functions exist for all major programming languages and for applications such as Excel and MATLAB):
```
import pytz  # third-party package; local_dt is a timezone-aware datetime
utc_dt = local_dt.astimezone(pytz.utc)
```
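When the source timestamps are naive local times and the recording's UTC offset is known, the same conversion can be sketched with only the standard library, and the result expressed as integer nanoseconds since the Unix epoch. The helper names `to_utc` and `to_epoch_ns` are illustrative assumptions, not URSA functions, and the fixed offset ignores daylight-saving transitions, so treat this as a minimal sketch rather than a complete solution.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def to_utc(local_naive: datetime, utc_offset_hours: float) -> datetime:
    """Attach a fixed UTC offset to a naive local time and convert it to UTC."""
    local_tz = timezone(timedelta(hours=utc_offset_hours))
    return local_naive.replace(tzinfo=local_tz).astimezone(timezone.utc)

def to_epoch_ns(aware_dt: datetime) -> int:
    """Integer nanoseconds since the Unix epoch (integer math avoids float rounding)."""
    delta = aware_dt - EPOCH
    whole_seconds = delta.days * 86_400 + delta.seconds
    return whole_seconds * 1_000_000_000 + delta.microseconds * 1_000

# Hypothetical sensor log entry stamped 09:30:00 local time at UTC-5
utc_dt = to_utc(datetime(2021, 3, 1, 9, 30, 0), -5)
print(utc_dt.isoformat())   # 2021-03-01T14:30:00+00:00
print(to_epoch_ns(utc_dt))
```

Keeping the epoch value as an integer matters at nanosecond resolution, because a 64-bit float cannot represent present-day epoch times to the nanosecond.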
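Geodetic coordinates (latitude $\phi$, longitude $\lambda$, height $h$) can be converted into ECEF (Earth-centered, Earth-fixed) coordinates using the following equations, where $a$ and $b$ are the semi-major and semi-minor axes of the reference ellipsoid and $N(\phi) = a / \sqrt{1 - e^{2}\sin^{2}\phi}$, with $e^{2} = 1 - b^{2}/a^{2}$, is the prime vertical radius of curvature. (There are solid implementations of this online and in most major programming languages; you do not need to do it by hand.)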

${\begin{aligned}X&=\left(N(\phi )+h\right)\cos {\phi }\cos {\lambda }\\Y&=\left(N(\phi )+h\right)\cos {\phi }\sin {\lambda }\\Z&=\left({\frac {b^{2}}{a^{2}}}N(\phi )+h\right)\sin {\phi }\end{aligned}}$
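As a minimal sketch of the equations above, the following Python function applies them with the WGS 84 ellipsoid constants. The function and constant names are illustrative assumptions, and $h$ is taken as height above the ellipsoid, which is not the same as altitude above mean sea level. For production use, an established geodesy library is the safer choice.

```python
import math

# WGS 84 defining constants
WGS84_A = 6378137.0                   # semi-major axis a, meters
WGS84_F = 1.0 / 298.257223563         # flattening
WGS84_B = WGS84_A * (1.0 - WGS84_F)   # semi-minor axis b, meters
WGS84_E2 = 1.0 - (WGS84_B ** 2) / (WGS84_A ** 2)  # first eccentricity squared

def geodetic_to_ecef(lat_deg, lon_deg, h_m):
    """Convert latitude/longitude (degrees) and ellipsoidal height (m) to ECEF (m)."""
    phi = math.radians(lat_deg)
    lam = math.radians(lon_deg)
    # Prime vertical radius of curvature N(phi)
    n = WGS84_A / math.sqrt(1.0 - WGS84_E2 * math.sin(phi) ** 2)
    x = (n + h_m) * math.cos(phi) * math.cos(lam)
    y = (n + h_m) * math.cos(phi) * math.sin(lam)
    z = ((WGS84_B ** 2) / (WGS84_A ** 2) * n + h_m) * math.sin(phi)
    return x, y, z

# A point on the equator at the prime meridian lies on the X axis at distance a
print(geodetic_to_ecef(0.0, 0.0, 0.0))
```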

Let’s walk through a chart showing some example source data, the forms it might take, how URSA normalizes it and the results.

| Field name | Examples | Processes Applied | URSA Field Name | URSA Significance | URSA Units | Interpolated? |
| --- | --- | --- | --- | --- | --- | --- |
| timestamp | Date time, time since system start, epoch + different data formats | Taxonomy, Interpolation | timestamp | Time since epoch (Unix) | nanoseconds | Yes |
| timezone | Sensor’s, cloud instance’s, targets, GMT | Taxonomy, Interpolation | timezone | Timezone | GMT | Yes |
| target location | GPS (unknown accuracy), bearing/range (unknown zero) | Taxonomy, Interpolation | vehicle_position_latitude | GPS latitude | WGS 84 | Maybe |
| target location | GPS longitude | | vehicle_position_longitude | GPS longitude | WGS 84 | Maybe |
| altitude | AGL (m or ft), MSL (m or ft), Barometric, Above Launch | Taxonomy, Interpolation | vehicle_position_altitude | Above Ground Level | Meters | Yes |
| Velocity | Velocity (North, East), Velocity (x,y,z), airspeed (ground, true) | Taxonomy, Interpolation | vehicle_velocity_(x\|y\|z), vehicle_velocity_(ground\|air), vehicle_velocity_(down\|east\|north) | Velocity | Meters / Second | Maybe |

Each row shows:

- A representative source field
- The information that field contains, with some example units and reference models
- The process we apply to normalize the field
- The resulting field name for internal or API use
- The significance or reference model of the resulting field
- The units of measure
- Whether the field is interpolated. If we applied a conversion function, the data was interpolated. If we simply changed the units of measure – kilometers to meters, for example – it was not.

It is worth going through the exercise of creating this chart. It will help ensure that you understand your source data, and it will help those who follow you understand what you did to the data to get it into its final form.

You can do all of your normalization using a combination of Excel and simple scripts. However, as with any complex rote task, encoding the process in some formal manner will greatly reduce the effort required and the number of ways errors can creep in.
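One way to encode the process formally is a small declarative mapping from each source field to its normalized name and conversion. The sketch below is only an illustration: the source field names (`alt_ft`, `lat`, `lon`, `time_ms`) and the `normalize_record` helper are hypothetical, while the target names follow the chart above.

```python
FEET_TO_METERS = 0.3048

# Hypothetical spec: source field -> (normalized field name, converter)
NORMALIZATION_SPEC = {
    "alt_ft": ("vehicle_position_altitude", lambda v: float(v) * FEET_TO_METERS),
    "lat": ("vehicle_position_latitude", float),
    "lon": ("vehicle_position_longitude", float),
    "time_ms": ("timestamp", lambda v: int(v) * 1_000_000),  # milliseconds -> nanoseconds
}

def normalize_record(record, spec=NORMALIZATION_SPEC):
    """Rename and convert the fields of one source record according to the spec."""
    normalized = {}
    for source_field, value in record.items():
        if source_field not in spec:
            continue  # fields without a rule are dropped in this sketch
        target_field, convert = spec[source_field]
        normalized[target_field] = convert(value)
    return normalized

raw = {"alt_ft": 100, "lat": "38.9", "lon": "-77.0", "time_ms": 1614609000000, "note": "x"}
print(normalize_record(raw))
```

Because every rule lives in one table, the spec doubles as documentation of what was done to each field, much like the chart.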

By the way, understanding and converting timestamps and timezones is so challenging and important that there are multiple XKCD cartoons about it. Here are two examples: