Coordinate Transformation

In almost all fields of science and engineering, it is essential to identify and manipulate mathematical representations of physical, real-world quantities. Robotics is no exception. Intelligent robots build a "mental model" of themselves and the world as they perceive their environments, and they modify those models when interpreting the past and predicting the future. In an abstract sense, these models are simply collections of numbers and labels, with no explicit meaning to the robot. It is the job of a robot engineer to correctly associate numbers with meanings and, correspondingly, meanings with numbers. (If this sounds like mumbo jumbo at the moment, some concrete examples will be offered soon enough!)

Two of the most important mathematical representations are vectors and matrices from linear algebra. Vectors often represent positions or directions in two or three dimensions of space, but they can also represent other quantities, like sensor measurements. Matrices represent how those representations change, either through an action or through a change in how the numbers are interpreted. We will use both liberally throughout the book, and they appear in almost every subject of robotics. Hence, they must be mastered to get anywhere beyond a superficial understanding of the material.

Coordinate Frames

Vectors extend concepts that are familiar to us from working with real numbers R to other spaces of interest. They also succinctly represent collections of real numbers that have a common meaning like position or direction, or readings from a signal taken at a given time. They make mathematical expressions more compact, which helps us wrap our heads around more difficult concepts.

Most often, the n-dimensional Euclidean space $R^n$ is used, in which a vector is simply a tuple of n real numbers. The "list of numbers" interpretation is the most common way that engineers and computer scientists think of vectors, and that is certainly how they are stored and operated upon. Let us call this the "layman's definition" of vectors. However, it is important to realize that these numbers are just one interpretation of a more abstract essential concept (the underlying physical meaning), and that the numbers will change depending on how they are interpreted, for example under a different choice of reference frame. This section presents common operations in 2D and 3D, followed by a discussion of the importance of separating meaning from representation.

1. 2D coordinate frames

In the "layman's definition", an n-dimensional vector $\mathbf{x}$ is a tuple of real numbers $\mathbf{x}=(x_1,\ldots,x_n)\in R^n$. For now, we will work in $R^2$. We will use boldface notation only temporarily to help distinguish between vectors and real numbers. In the future, the boldface will typically be dropped.
A 2D position P is represented by a 2-element vector $\mathbf{p}=(p_x,p_y)$ that gives its coordinates relative to axis directions X and Y, offset from a position O where the axes cross, called the origin (Fig. 1). We will also represent vectors in column vector form:
$\mathbf{p}=\begin{bmatrix} p_x\\ p_y \end{bmatrix}$
for use in matrix-vector products. The parenthetical and column vector notations are equivalent and interchangeable.
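As a minimal sketch of the "list of numbers" view, the Python snippet below stores a 2D position both as a tuple and in column form, and uses the column form in a matrix-vector product. The helper names `to_column`, `from_column`, and `matmul` are hypothetical, chosen only for illustration, and the coordinate values are arbitrary.

```python
def to_column(p):
    """Convert a tuple (p_x, p_y) into column form [[p_x], [p_y]]."""
    return [[c] for c in p]


def from_column(col):
    """Convert column form [[p_x], [p_y]] back into a tuple (p_x, p_y)."""
    return tuple(row[0] for row in col)


def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [
        [sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
        for i in range(len(A))
    ]


p = (3.0, 1.5)            # parenthetical (tuple) notation
p_col = to_column(p)      # column vector notation

# Multiplying by the identity matrix leaves the coordinates unchanged.
identity = [[1.0, 0.0], [0.0, 1.0]]
assert from_column(matmul(identity, p_col)) == p
```

Both forms carry exactly the same two numbers; the column form is only a convenience for writing products with matrices.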

Figure 1. A point P in the plane (a) has no numerical representation until we define a reference coordinate frame (b), which has origin point O and orthogonal coordinate axes X and Y. Its coordinates $\mathbf{p}=(p_x,p_y)$ are respectively the extents of P along X and Y from the origin (c).

The items O, X, and Y define the coordinate frame in which the coordinates are interpreted. Here O is an arbitrary position in space, and X and Y are orthogonal directions with Y rotated $90^\circ$ counter-clockwise from X. Note that in isolation, a vector of coordinates does not define a position. A physical position is only defined by coordinates in reference to a certain coordinate frame. The frame will often be left implicit, or spoken of as the reference frame of the coordinates.
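To make this concrete, here is a minimal Python sketch in which one physical point receives different coordinates in two different frames. The class `Frame2D` and its method names are hypothetical. Since a computer can only store numbers, the sketch assumes a fixed background frame in which the point, each frame's origin O, and the direction of each frame's X axis are written down. The coordinates are computed as the extents of P along X and Y from the origin, as described above.

```python
import math
from dataclasses import dataclass


@dataclass
class Frame2D:
    """A 2D frame, described relative to an assumed fixed background frame."""
    origin: tuple   # position of O in the background frame
    angle: float    # direction of X, in radians counter-clockwise from background X

    def axes(self):
        c, s = math.cos(self.angle), math.sin(self.angle)
        x_axis = (c, s)
        y_axis = (-s, c)   # X rotated 90 degrees counter-clockwise
        return x_axis, y_axis

    def coordinates_of(self, point):
        """Extents of `point` along X and Y, measured from O."""
        dx = point[0] - self.origin[0]
        dy = point[1] - self.origin[1]
        x_axis, y_axis = self.axes()
        return (dx * x_axis[0] + dy * x_axis[1],
                dx * y_axis[0] + dy * y_axis[1])


P = (2.0, 3.0)   # one physical point, written in the background frame

frame_a = Frame2D(origin=(0.0, 0.0), angle=0.0)
frame_b = Frame2D(origin=(1.0, 1.0), angle=math.pi / 2)

coords_a = frame_a.coordinates_of(P)   # (2.0, 3.0)
coords_b = frame_b.coordinates_of(P)   # approximately (2.0, -1.0)
```

The point never moved, yet its coordinates differ between `frame_a` and `frame_b`. A pair of numbers such as (2, -1) only identifies P once we also say which frame it refers to.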

2. 3D coordinate frames

The situation in 3D space is similar, except that we represent a 3D position P with a 3-element vector $p=(p_x,p_y,p_z)$ that gives its coordinates relative to axes X, Y, and Z, offset from an origin O in 3D space where the axes cross. The parenthetical notation is equivalent to the column vector form:
$p=\begin{bmatrix} p_x\\ p_y\\ p_z \end{bmatrix}$
In 3D the coordinate frame consists of the origin O and the mutually orthogonal axes X, Y, and Z. In this book we will use the right-handed coordinate convention, in which the axes can be envisioned as the first three fingers of the right hand, held at $90^\circ$ angles to one another. The X axis corresponds to the thumb, the Y axis to the index finger, and the Z axis to the middle finger.
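One common way to test handedness numerically uses the cross product, which has not been introduced in the text so far: for mutually orthogonal unit axes, a frame is right-handed exactly when the cross product of X and Y equals Z. The Python sketch below is an illustration under that assumption, with hypothetical helper names `cross` and `is_right_handed`, and it assumes the axes passed in are already orthogonal and of unit length.

```python
def cross(a, b):
    """Cross product of two 3D vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def is_right_handed(x, y, z, tol=1e-9):
    """True if orthonormal axes x, y, z satisfy x cross y = z (within tol)."""
    c = cross(x, y)
    return all(abs(ci - zi) <= tol for ci, zi in zip(c, z))


X = (1.0, 0.0, 0.0)   # thumb
Y = (0.0, 1.0, 0.0)   # index finger
Z = (0.0, 0.0, 1.0)   # middle finger

assert is_right_handed(X, Y, Z)
# Flipping Z gives a left-handed frame.
assert not is_right_handed(X, Y, (0.0, 0.0, -1.0))
```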
