Introduction
This paper covers some of the basic Photogrammetric operations available in RemoteView including geolocation, Orthocalibration, Orthorectification, image registration, and mensuration. Another paper “Advanced Photogrammetry in RemoteView” covers the operations of intersection, resection and block adjustment which are also provided in RemoteView.
What is Photogrammetry?
Photogrammetry, simply put, is the act of making measurements from photographs. When you look at an image and make an estimate about the size of an object in the image, you are performing Photogrammetry. If you were to place a ruler on the image and measure a distance, then estimate the actual distance by scaling the measurement, or if you place a protractor on the image and measure an angle, you are performing Photogrammetry.
The primary function of imagery in geospatial applications is the making of measurements of geographic location, ground distance, heights of objects, areas, perimeters, and volumes. These images are acquired using precisely calibrated airborne and satellite borne sensors. The term “sensor” is used to describe both passive sensors which collect energy radiated by the sun and reflected back to the sensor or absorbed by the earth and re-emitted as thermal energy back to the sensor, and active sensors such as radar and lidar which transmit their own energy and act as both the source and the receptor of the energy. Figure 1 shows an illustration of the imaging system known as “QuickBird”, a commercial satellite with multispectral (visible and near-infrared) capability and sub-meter ground resolution.

QuickBird satellite (Illustration)
Regardless of the sensor type, in order to visualize the data, the sensor must convert the raw radiance data that it receives into an image in which the pixels represent quantized levels of this radiance. The main task of Photogrammetry is to determine the precise ground coordinates to which these pixels correspond.
Sensors which are used in modern geospatial imaging systems contain precise instruments for determining the location and attitude (roll, pitch and yaw) of the platform at the instant, or over the time span, at which an image is collected. These include Global Positioning System (GPS) receivers, star tracking devices which can determine a precise location based on the relative position of a star field observed by the sensor, and Inertial Navigation Systems (INS) which determine the precise orientation angles of the imaging system about all 3 axes using gyroscopes and accelerometers.
In addition, the sensors are precisely calibrated in laboratories to determine their internal characteristics such as camera focal length, lens distortion, offset between the GPS/INS and the sensor, and many other factors which determine the precise relationship between the sensor and the ground at the time each image is obtained.
All of this information can then be encoded into the image metadata or distributed as extra text files with the image so that exploitation tools such as Overwatch’s RemoteView which have the capability to process this information can construct a mathematical “sensor model” for the image.
Using this sensor model, any pixel on the image can be related to a ground position and conversely, any point on the ground can be related to a pixel in the image. The first of these processes, the computation of the ground point from the image point, also requires a source of height information in the single-image case, for reasons explained later. In RemoteView, this source can be from standard DTED files including Shuttle Radar Topography Mission (SRTM) data, or from Lidar files, USGS DEM files, or any other elevation source which can be captured in an image format which RemoteView supports. Having an available source of elevation information of the highest possible accuracy is critical for accurate geolocation. RemoteView supports this requirement by automatically loading elevation data from directories which can be cataloged in the Image Catalog. Each time a new image is opened, the corresponding elevation data of the highest available accuracy is also loaded as a virtual “underlay” for the image.
It is from this mathematical sensor model that all of the measurements taken from an image are performed, either directly or indirectly. The computation of a distance, for example, depends on first computing the ground coordinates of two image points. These ground points are then input to a standard distance equation. Sensor modeling is at the heart of all Photogrammetry and much emphasis in the Photogrammetric literature is devoted to this field. Since so much of Photogrammetry relies on accurate and consistent sensor models, the U.S. Government has invested heavily in developing a set of common sensor models which can be employed by multiple exploitation systems such as RemoteView. The next section will delve more into the details of the art and science of sensor modeling.
Fundamental Applications, Image to Ground and Ground to Image
As mentioned above, one of the main tasks of Photogrammetry is to be able to calculate precise geographic coordinates for each pixel of an image, a process known as “Geo Location”. An image which has the proper metadata to allow this process to be done is said to be “Geo Referenced”. Sometimes the two terms are used interchangeably, but here we will stick to this usage.
In RemoteView, the process of Geo Referencing an image is fully automated. Users are never required to enter sensor model information such as the type of sensor or the parameters of the sensor. This is handled automatically as shown in Figure 2.

Flow of Geo Referencing and Geo Location on a single image in RemoteView
When a user opens an image in RemoteView, the software automatically determines the best method of georeferencing the image based on its metadata, that is, the data in the file header that describes how the image is laid out and its other characteristics. The resulting sensor model is “attached” to the image. Now, when the user moves the mouse, RemoteView “reads” the cursor location and, using the sensor model, converts the pixel coordinates to ground coordinates (Latitude, Longitude and Height, or some other ground coordinate system such as MGRS, depending on what the user has selected in the viewer’s coordinate readout control). RemoteView also calculates statistical estimates of the accuracy of the ground coordinate when the metadata allows it. These values are called the “Circular Error 90%” and the “Linear Error 90%”, abbreviated CE90 and LE90. These two numbers define the radius and the height of an imaginary cylinder located with its base centered on the ground point. In a statistical sense, the true position of the ground point will fall within this cylinder 90% of the time; the other 10% of the time, it lies somewhere outside of the cylinder. The basis of this calculation depends on the imagery, but for images with Rational Polynomial Coefficients, the errors are based mostly on the ERR_RAND and ERR_BIAS tags in the metadata. These are interpreted as 1-sigma random and bias errors respectively, in the plane normal to the look vector, and not including terrain errors. Terrain errors are accounted for separately in the LE90 value in single-image error propagation.
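As a rough illustration of how 1-sigma values relate to 90% bounds, the sketch below assumes normally distributed errors, treats the bias and random terms as per-axis horizontal sigmas, and combines them by root-sum-square. These are assumptions made for the example, not a description of RemoteView's internal error propagation, and the function names are hypothetical.

```python
import math
from statistics import NormalDist

# Factors that scale a 1-sigma error to a 90% bound, assuming normally
# distributed errors (an assumption of this sketch).
CE90_FACTOR = math.sqrt(-2.0 * math.log(1.0 - 0.90))  # about 2.146 (circular case)
LE90_FACTOR = NormalDist().inv_cdf(0.95)               # about 1.645 (linear case)


def ce90_from_rpc_errors(err_bias, err_rand):
    """Estimate CE90 (meters) from 1-sigma bias and random errors (meters)."""
    sigma = math.hypot(err_bias, err_rand)  # root-sum-square combination
    return CE90_FACTOR * sigma


def le90_from_sigma(sigma_height):
    """Estimate LE90 (meters) from a 1-sigma height error (meters)."""
    return LE90_FACTOR * sigma_height
```

The terrain contribution is deliberately left out of the horizontal estimate here, mirroring the statement above that terrain error is handled separately.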
The results of geolocation, including accuracy, are presented in the geo reporting control in the viewer as shown below.

RemoteView georeporting – ground point location, pixel location, RGB values and accuracy
This process of determining a ground point from pixel coordinates is one of the fundamental applications of Photogrammetry. The reverse process is also critical, that is, the computation of image coordinates from a given ground point. In fact, the mathematics of the sensor model make this reverse process more straightforward, since no external data (elevation) is required. A simple example of where this process is used is in the RemoteView Geo-Marker function shown below.

RemoteView Geomarker Application
In this function, the user enters ground point coordinates and RemoteView calculates the correct image pixel location which corresponds to that ground point and moves the user’s view to center on that point. It will also optionally place a marker with a symbol and text at that point. To do this computation, RemoteView uses the sensor model in the reverse sense to that explained above for Geo Location. It converts from ground space to image space (pixel location) and moves the viewer to center on that point.
The Criticality of Elevation Data
On a single image, the process of determining a ground point from image coordinates cannot be done accurately without elevation data as an additional source of information. This is because the image ray connecting the sensor location and the image point has to be intersected with the earth to obtain the location, as shown in the figure below.

The ambiguous single image geo location problem
The direction of the image ray connecting the sensor (roughly the location of the lens in a simple camera), and the image point (pixel) can be constructed, but the length of the vector cannot be determined just from the sensor model information. There needs to be another source of independent terrain information in order to determine the correct intersection point. This is a fundamental and important fact for single-image geo positioning in any system such as RemoteView that needs to calculate accurate ground points. It is for this reason that RemoteView makes elevation data loading and processing extremely easy to use and always attempts to make elevation data available to the geo referencing function.
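The ambiguity described above can be illustrated with a small ray and terrain intersection. This is a minimal sketch in a local east/north/up frame with a hypothetical `terrain_height` callback standing in for an elevation source; a real sensor model works in geodetic coordinates and would be considerably more involved.

```python
def intersect_ray_with_terrain(origin, direction, terrain_height,
                               step=10.0, max_range=2_000_000.0, tol=0.01):
    """Return the (x, y, z) point where a downward-looking ray meets the terrain.

    origin, direction: 3-tuples in a local east/north/up frame (meters);
    direction should be a unit vector so that step and tol are in meters.
    terrain_height: callable (x, y) -> ground height in meters.
    """
    def point_at(t):
        return tuple(o + t * d for o, d in zip(origin, direction))

    def height_above_ground(t):
        x, y, z = point_at(t)
        return z - terrain_height(x, y)

    # March along the ray until it first drops to or below the terrain.
    t_prev, t = 0.0, step
    while height_above_ground(t) > 0.0:
        t_prev, t = t, t + step
        if t > max_range:
            raise ValueError("ray does not reach the terrain within max_range")

    # Bisect between the last sample above ground and the first one below.
    while t - t_prev > tol:
        t_mid = 0.5 * (t_prev + t)
        if height_above_ground(t_mid) > 0.0:
            t_prev = t_mid
        else:
            t = t_mid
    return point_at(0.5 * (t_prev + t))
```

With a sensor at 1000 m looking 45 degrees off nadir, flat terrain at 0 m gives a ground point about 1000 m downrange, while flat terrain at 100 m gives about 900 m. The same pixel maps to different ground positions depending on the elevation used, which is why the elevation source matters.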
The interface for setting up automatic loading of terrain data in RemoteView is shown below.

Setting up RemoteView for automatic elevation data loading and usage
The elevation data can consist of any combination of standard DTED, USGS DEM, Lidar data in GeoTiff format and any other data source in any file format that RemoteView can read. Terrain data locations can also be ingested automatically into the RemoteView Image Catalog, which is a database of image locations and other metadata that is provided with RemoteView.
Upon loading any image into RemoteView, the software will also identify any overlapping elevation data, based on the Elevation setup parameters shown above, and silently load that elevation data as an “underlay” for the image. This data can then be used to determine an accurate ground coordinate for each image pixel. When elevation data has been loaded for an image and is being used for this purpose, this is known as “Ortho Calibration”, or as “Terrain Correction”. In this case, a message as shown below will be shown in the RemoteView viewer.
Notifying the User that terrain data is being used to obtain accurate ground points
Orthorectification and Orthocalibration
The process of using terrain elevation data to perform a geo positioning on a single image as explained above depends on this elevation data being added to the sensor model for that image. That process is called “Orthocalibration”. In Orthocalibration, the image pixels themselves are not warped or resampled in any way. Instead, the sensor model for the image is made to use elevation data internally and to compute ground coordinates.
In contrast to Orthocalibration, there is a similar process known as “Orthorectification”. This process takes a single image or mosaic of images and uses terrain data to remove the relief displacement errors and other sensor errors that occur in images. Terrain displacement is explained in the figure below.

Illustration of the phenomenon of Terrain Relief.
In this figure, two points on the ground, A and B are imaged by the sensor. Point A is located on a hill, with elevation above that of point B. The corresponding image points are ‘a’ and ‘b’. The height difference along with the perspective of the image gives a distorted image of the relative distance between the two points on the image. If the goal of our image is to portray objects at their correct relative position to one another on the ground, we must remove the effect of the terrain and perspective. If we do this, the image points ‘a’ and ‘b’ will then appear as shown below.

Orthorectification
In this figure, the effects of the terrain have been removed by projecting the two ground points straight up along a line perpendicular to the base of the figure, called the “Vertical Datum”. This type of perpendicular projection is also called an “Orthographic” projection, which is where the term Orthorectification comes from. In the process of Orthorectification, each pixel in the image is projected straight up from the vertical datum, which is a user defined ellipsoid model of the earth, to a new location in the orthorectified image. In the final image, each pixel appears as it would if we could somehow view every pixel directly from above. In fact, this perspective is exactly what traditional paper maps give us. That is why we can view a map and envision the “real” relationships between the objects on the map, their position, size and orientation, and the objects on the earth.
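The size of the displacement that orthorectification removes can be estimated with the classical relief displacement relation for a vertical frame photograph, d = r·h/H. This relation is a textbook approximation brought in for illustration; it is not taken from the figures above and does not apply directly to pushbroom or oblique sensors. The function name is hypothetical.

```python
def relief_displacement(radial_distance, object_height, flying_height):
    """Relief displacement on a vertical frame photograph.

    radial_distance: distance on the image from the nadir point to the
        displaced image point (any image unit, e.g. mm or pixels).
    object_height: height of the ground point above the datum (meters).
    flying_height: height of the sensor above the same datum (meters).
    Returns the displacement in the same unit as radial_distance.
    """
    if flying_height <= 0:
        raise ValueError("flying_height must be positive")
    return radial_distance * object_height / flying_height
```

For example, a point 100 m above the datum, imaged 80 mm from the nadir point from a flying height of 2000 m, is displaced by 4 mm on the image. The displacement grows with terrain height and with distance from nadir.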
Orthorectification in RemoteView is performed from the calibration dialog on the Rectify tab. The user specifies a map projection. The system automatically loads elevation data as explained above or the user can explicitly load it through this dialog. The user interface for performing Orthorectification in RemoteView is shown below.

Orthorectification User Interface in RemoteView
Notice that there is also a button on this dialog labeled “Adjust Georeferencing Model” which allows a user to further refine the output orthorectified image (also called an “orthophoto”, “image map”, or just “an ortho”). This part of the process is optional. It allows the user to enter known ground points that they may have obtained from a ground survey, a map, another orthophoto or a control image, and to “tie” the orthophoto to those ground points, thus generating a very accurate product.
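The resampling at the core of orthorectification can be sketched as a loop over a regular ground grid. For each output cell, the code looks up the terrain height, uses the sensor model to find the source pixel, and copies its value. The `ground_to_image` and `terrain_height` callbacks are hypothetical stand-ins for a sensor model and an elevation source, and nearest-neighbour sampling is chosen only to keep the example short.

```python
def orthorectify(image, ground_to_image, terrain_height,
                 x0, y0, spacing, n_rows, n_cols, fill=0):
    """Resample `image` onto a regular ground grid (nearest neighbour).

    image: list of rows of pixel values.
    ground_to_image: callable (x, y, h) -> (row, col), the sensor model.
    terrain_height: callable (x, y) -> height in meters.
    (x0, y0): ground coordinates of the upper-left output cell.
    spacing: output cell size in ground units.
    """
    ortho = []
    for i in range(n_rows):
        y = y0 - i * spacing          # northing decreases down the image
        out_row = []
        for j in range(n_cols):
            x = x0 + j * spacing
            h = terrain_height(x, y)  # elevation drives the pixel lookup
            row, col = ground_to_image(x, y, h)
            r, c = int(round(row)), int(round(col))
            if 0 <= r < len(image) and 0 <= c < len(image[0]):
                out_row.append(image[r][c])
            else:
                out_row.append(fill)  # ground cell not covered by the image
        ortho.append(out_row)
    return ortho
```

Working from the output grid back to the source image, rather than pushing source pixels forward, guarantees that every output cell receives exactly one value.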
Another way of looking at Orthorectification is shown below.
The actual appearance of an orthophoto depends on the map projection which is used to generate it. No map projection can preserve all of the round-earth geometry. Some attempt to preserve area (Equal Area, such as Albers Equal Area), some seek to preserve relative directions (Conformal, such as Lambert Conformal and Mercator), others attempt to preserve distances (Equidistant, such as Azimuthal Equidistant), and still others try to preserve some of each of these. For a typical high-resolution satellite image or aerial image of a small area, such as an Ikonos or QuickBird image, the differences obtained from these different map projections will usually be very small and hard to see. For images of larger areas such as Landsat, Spot or Radarsat, the differences will be very noticeable and the projection must be chosen with care. Accurate orthophoto generation requires reasonably accurate elevation data, but it does not require elevation data at the same resolution as the image itself. For example, for a typical 1-meter Ikonos image, DTED terrain data with roughly 30 meter post spacing (DTED Level 2) will produce an acceptable result in most cases.

OrthoPhoto generation
In the figure above, a grid of lines, each separated by the same distance on the earth and perpendicular to the lines running in the other direction, is shown on the ground and in the orthophoto. On an orthophoto created with an appropriate map projection, the grid lines on the image will appear perpendicular and equally spaced. The rectangles will each be the same area on the image and represent the same area on the ground. Again, we are assuming that the image is a high resolution one of a small area, so that the choice of map projection is not critical.
In areas such as mountainous areas where there is a lot of variation in the terrain height, the effects of terrain displacement can be seen. The most obvious example is a straight road which goes up and down over hills. In the image, it will appear as a curvy line. When the image is orthorectified, the road appears as a straight line.
We can see an example of this in the figure below.

Illustration of the effects of Orthorectification
Here an orthorectified image is displayed with two vector layers overlaid on it in RemoteView. The red vector layer was digitized from the orthophoto. The green vector layer was digitized from the original, non-orthorectified image. As can be seen in the image, the effect of Orthorectification, in terms of shifting of the image pixels from the original image, can be fairly extreme. The road layers shown on the image are displaced by about 10 meters on average.
It is very important to note that it is NOT necessary to Orthorectify your images in order to obtain accurate geo reporting and accurate vectors. It is only necessary that your image be orthocalibrated. That is, elevation data must be loaded with your image so that the correct ground point locations of image points can be computed. In fact, most users do not want even the slight warping of the image pixels caused by Orthorectification. Orthorectification should only be used if you need to display your image in a map projection. It might also be necessary if you are provided with vector data from another system, such as Erdas Imagine or ArcGIS, that was digitized on an orthorectified image or a map. In that case, to get the vector layers to line up exactly with the features on the image, ask the data provider what map projection was used to digitize the data and create your orthophoto in the same map projection.
A Note on Vector Data
The typical and most often used means of transferring vector information to and from RemoteView is by using ESRI shape files. The coordinates stored in a shape file can be 2-dimensional, such as latitude and longitude or map easting and northing, or they can contain a “z value” to represent the height of each point in the shape file. When this z value is properly populated in the shape file, RemoteView uses it to compute the corresponding image point for each vertex in the file and displays the vectors in the correct location. In the case where this value is not populated, RemoteView must make an educated guess as to what height information to use for each vector point. There are three basic possibilities. In the best case, assuming all of the vector features are on the ground, RemoteView will use height information from terrain data which has been loaded with the image. So here is another reason why, whenever possible, RemoteView users should try to have elevation data available for their imagery and to load this elevation data automatically when the images are loaded as explained above. The other alternatives are to use a default height from the sensor model, such as the RPC “Height Offset” value, or, when the sensor model does not provide a default height, to use 0 as the height.
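The order of preference described above can be written as a simple fallback chain. This is a sketch of the logic as the text states it; the function and parameter names are hypothetical and do not correspond to any RemoteView API.

```python
def height_for_vertex(z_value, lon, lat,
                      terrain_lookup=None, model_default_height=None):
    """Pick the height to use when projecting a vector vertex into an image.

    z_value: height stored with the vertex, or None if not populated.
    terrain_lookup: optional callable (lon, lat) -> height, or None when
        no elevation data is loaded; it may return None outside coverage.
    model_default_height: optional default from the sensor model, such as
        an RPC height offset.
    """
    if z_value is not None:
        return z_value                      # populated z value wins
    if terrain_lookup is not None:
        terrain_height = terrain_lookup(lon, lat)
        if terrain_height is not None:
            return terrain_height           # best case: loaded terrain data
    if model_default_height is not None:
        return model_default_height         # sensor model default height
    return 0.0                              # last resort
```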
Typically, vectors digitized from an orthophoto or a map do NOT contain height information. Especially with orthophotos, this may seem non-intuitive: since an orthophoto is created from height data, why don’t the vectors digitized from it have height data in them? The reason is that an orthophoto is essentially georeferenced with a “2-dimensional” georeferencing method such as a map projection. This can cause confusion when users digitize vectors from an orthophoto in another product and then overlay the vectors on an orthocalibrated image in RemoteView. Because the orthophoto was more than likely created using different elevation data than the data used to orthocalibrate the image in RemoteView, there will be a difference in the image locations of the vector points computed from the ground points stored in the file. In this case, as explained above, the user should first Orthorectify their image to the exact map projection of the image from which the vectors were digitized, using the same elevation data if possible, in order to get the best alignment of the vector features on the image.
Sensor Models in RemoteView
RemoteView supports a wide variety of georeferencing models, some of which utilize rigorous sensor models with error propagation and advanced Photogrammetric operations such as block adjustment and precision positioning, others which support some of these operations and others which only perform basic image to ground and ground to image calculations. Another way of partitioning these models is by 3-dimensional models and 2-dimensional models. Three-dimensional models are able to use height data to compute a ground point from a single image point. They are also able to compute an image point directly from a given 3-dimensional ground point. If given two different ground points with the same latitude and longitude, but different elevations, a three-dimensional georeferencing model will produce two different image points.
A summary of the 3-dimensional georeferencing models available in RemoteView is shown in the table below. The advanced Photogrammetric operations (rigorous registration to control images, precision positioning and block bundle adjustment) are explained in Advanced Photogrammetry (another Overwatch white paper).
Operations supported are marked with an X in the columns Basic through ER.

| Model | Types of Imagery | Basic | Ortho | REG | PP | BBA | ER | Comments |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| RPC | NITF, TFRD, DPPDB, Commercial | X | X | X | X | | X | Rational Polynomial Coefficients, approximately 90% of all imagery. |
| Pushbroom | QuickBird | X | X | X | X | X | X | Rigorous pushbroom models. |
| CSM | Tactical and national | X | X | X | X | X | X | USAF Program, “black box” sensor models with public API. |
| RSM | NITF “Smart Images” | X | X | X | X | X | X | Replacement Sensor Model, calculated by MSP |
| Image America | IA TIFF images | X | X | | | | | A proprietary “black box” sensor model. |
| Adjustable RPC | Ikonos | X | X | X | X | X | X | A special case of simple RPC images such as Ikonos. |
Three dimensional georeferencing methods supported in RemoteView
NOTES:
Basic = Image to Ground and Ground to Image
Ortho = Orthocalibration and Orthorectification
PP = Precision Positioning
The 2-dimensional models supported in RemoteView are shown below.
| Model | Types of Imagery | Comments |
| --- | --- | --- |
| Affine | DEM, DTED, GeoTiff, Lidar, ICHIPB | A simple 6-parameter coordinate transformation |
| Polynomial | NITF (I2MAPD), GeoTiff, Radarsat, SPOT | These can be 0th, 1st or 2nd order polynomials. |
| Perspective | GeoTiff, Predator | A 16-parameter perspective transform. |
| EOSAT | Landsat | Map Projections used for Landsat images |
| Map Projection | GeoTiff, RemoteView export | RemoteView supports all of the projections in the USGS projection library. |
Two dimensional georeferencing models supported in RemoteView
The 2-dimensional models support only the basic image to ground and ground to image calculations. They cannot be used to create orthophotos nor do they support Orthocalibration or the other advanced Photogrammetric operations. However, any imagery supported by either the 2 or 3 dimensional models can be used in RemoteView’s “non-rigorous” Photogrammetric operations described below, including image to reference, image to image and image to vector calibration.
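To make the idea of a 2-dimensional model concrete, the following sketch implements the simplest case from the table, a 6-parameter affine transformation, in both directions. The parameter ordering `(a, b, c, d, e, f)` is a convention chosen for this example, not a RemoteView or file format definition.

```python
def affine_image_to_ground(params, col, row):
    """Map pixel (col, row) to ground (x, y) with x = a*col + b*row + c, y = d*col + e*row + f."""
    a, b, c, d, e, f = params
    return a * col + b * row + c, d * col + e * row + f


def affine_ground_to_image(params, x, y):
    """Invert the affine transformation: ground (x, y) back to pixel (col, row)."""
    a, b, c, d, e, f = params
    det = a * e - b * d
    if det == 0:
        raise ValueError("affine transformation is not invertible")
    dx, dy = x - c, y - f
    col = (e * dx - b * dy) / det
    row = (-d * dx + a * dy) / det
    return col, row
```

Neither function takes a height argument. Two ground points with the same horizontal position but different elevations map to the same pixel, which is exactly what distinguishes a 2-dimensional model from the 3-dimensional models described earlier.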
Improving the accuracy of images with RemoteView- Registration, Calibration, etc.
Any image with a georeferencing method can be made “more accurate”, in terms of giving ground points which are closer to the true positions of features. There are several ways to do this in RemoteView. This process is known as “Registration” or “Calibration”; in this paper the two terms are used synonymously. In addition, images that have no georeferencing model at all can be given one by RemoteView. These operations work with any of the georeferencing methods described in the tables above, both the 3-dimensional and 2-dimensional models.
If there is a source of ground control points (GCPs) available, the user can use the Image to Reference Tool from RemoteView shown below.

Registering an image to Ground Control Points
In one scenario, a user might have access to a ground survey consisting of GPS measured ground points along with image chips which show the exact pixel on the image which corresponds to the ground point. In that case, the user simply digitizes those points at the known pixel location and enters in the known ground coordinates. The tool will then “tie” the image to those ground control points.
In another scenario, the user might have highly accurate “control” images, such as DPPDB, NTM, Commercial or other sources of imagery which has a supported 3-D sensor model from the table above. In that case, the user would select the “Multi-Image Intersection” option on the tool and the software will then guide them through the process of identifying appropriate control imagery and extracting ground points using a multi-image intersection process (described in the white paper “Advanced Photogrammetry”). In this case, the user obtains the benefit of rigorous sensor modeling and Photogrammetric processing and does not need a ground control survey. Using this process in conjunction with an image which also has a supported 3-D sensor model is the best way to perform image registration, such as tactical image registration. In this case, the control points are obtained from the intersection process and those control points are used to actually adjust the sensor model of the tactical or other image being registered, a Photogrammetric process known as “Resection”, which is also described in the advanced Photogrammetry white paper. The accuracy information of the image is also updated in this process, providing new, and often much better, estimates of the CE90 and LE90 of the image being registered.
But, as mentioned, this process works with any image, including those that do not have a rigorous sensor model. For such images, the ground control points, either entered from survey data or computed from control images, are used to compute an adjustment to the image’s georeferencing information.
Once an image has been improved by these processes, it can be saved in a format that will capture all of this improved geo referencing information, that is, by saving it to a NITF image and computing RPCs for the image. RemoteView has the capability to compute RPCs from any image that has a 3-dimensional georeferencing model, or from an image that has a 2-dimensional georeferencing model together with elevation data. The RemoteView Image Save tool is used for this purpose. The user should select NITF format and choose “Compute New RPC” on the Format Options.
In this way, the saved NITF image with RPCs can be used in the future and by other users without going through the process of registration again. In essence, the ground control point information is now embedded in the new saved image with RPCs.
In the absence of ground control points or control images with 3-dimensional sensor model support, the user can use other images such as scanned maps, orthophotos including CIB or any imagery which the user is confident has a more accurate georeferencing method than the image they are trying to improve.
One way to do this would be to look at the other image and extract “ground control points” from it and use the tool shown above. But there is a much simpler way, which is by using the image to image calibration tool shown below.

Registering an image to a more accurate image
The image to image registration tool allows the user to place “tie points” on the less accurate image and on the more accurate image. Tie points are simply pixels on both images that represent the same feature on the ground. Examples include any feature that can be identified on both images, such as the intersection of a crosswalk and a street, the corner of a sidewalk, or simply an unidentified “bright pixel” that can be seen in both images. Note that this tool works with a single control image only. This tool provides a quick and robust way to calibrate imagery, including images that have no georeferencing support at all such as JPEGs or Tiffs downloaded from the internet. It features the ability to “auto find” tie points once 2 or more tie points have been manually identified. It supports first and second order polynomial adjustments, requiring a minimum of 4 and 10 points respectively. It also gives the capability to actually “warp” the image to the control image, but this is not necessary in order to obtain more accurate georeferencing. Warping should be used only when it is necessary that features in the image match the size, shape and relationship of features in the control image. Warping can cause extreme pixel modifications that typical users do not want.
The image to image registration tool is also useful when precise image co-registration is required, for example when pan-sharpening a multispectral image using a panchromatic image. In this process, better results are sometimes obtained if the 2 images are first co-registered using the image to image calibration tool and a fairly large number of tie points. If only a small area of the images needs to be co-registered, then the tie points should be located around that area of the images. Also, a 2nd order polynomial may be required in some cases to achieve sufficient co-registration.
The use of “Check points” is also provided with this tool, which allows the user to get an independent “sanity check” on the calibration process. Error values are computed at these check points which represent how well the polynomial adjustment fits at those points. Generally, a good goal is to try to get all of the errors under 1 pixel, but this is not always possible.
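The tie point fit and the check point errors described above can be illustrated with a first-order polynomial estimated by least squares. This is a generic sketch in pure Python, not RemoteView's algorithm. All function names are hypothetical, and the point pairs are assumed to be `((x, y), (u, v))`, with `(x, y)` in the image being calibrated and `(u, v)` in the control image.

```python
import math


def _solve(matrix, rhs):
    """Solve a small linear system by Gauss-Jordan elimination with pivoting."""
    n = len(rhs)
    aug = [list(matrix[i]) + [rhs[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("tie points are degenerate (for example, collinear)")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def fit_first_order(tie_points):
    """Least-squares fit of u = a0 + a1*x + a2*y and v = b0 + b1*x + b2*y."""
    basis = [(1.0, x, y) for (x, y), _ in tie_points]
    normal = [[sum(b[i] * b[j] for b in basis) for j in range(3)]
              for i in range(3)]
    coeffs = []
    for k in range(2):  # k = 0 fits u, k = 1 fits v
        rhs = [sum(b[i] * tp[1][k] for b, tp in zip(basis, tie_points))
               for i in range(3)]
        coeffs.append(_solve(normal, rhs))
    return coeffs


def apply_first_order(coeffs, x, y):
    """Apply the fitted polynomial to a pixel location."""
    (a0, a1, a2), (b0, b1, b2) = coeffs
    return a0 + a1 * x + a2 * y, b0 + b1 * x + b2 * y


def check_point_errors(coeffs, check_points):
    """Distance between predicted and measured location at each check point."""
    errors = []
    for (x, y), (u, v) in check_points:
        pu, pv = apply_first_order(coeffs, x, y)
        errors.append(math.hypot(pu - u, pv - v))
    return errors
```

Check points are kept out of the fit on purpose. Their errors then measure how well the adjustment generalizes, instead of how closely it reproduces the points it was fitted to.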
Mensuration
In the domain of image analysis, making basic ground measurements such as distance, area, volume and perimeter is called “Mensuration”.
For national imagery, RemoteView supports an interface to the U.S. government standard mensuration service called “Ruler”. Ruler provides hundreds of mensuration tools or “Output Functions” for use with this imagery. The tools are tailored by image and sensor type. In the future, the U.S. government will transition Ruler and other software to the Mensuration Support Program (MSP) and RemoteView will support an interface to this as well.
RemoteView also has native mensuration tools for computing: Geodetic Distance, Geodetic Azimuth, Perimeter and Area. All of these computations are based on the best algorithms available in the unclassified domain. Some examples of mensuration are shown in the image below.

Some RemoteView Mensuration examples – Azimuth, Distance and Area
At their heart, each of these operations depends on first obtaining ground points for image points identified by the user, so their accuracy depends first on the accuracy of the image’s georeferencing method. Geodetic computations are then performed using these ground points and standard geodetic formulas. Distance and Azimuth are computed with an algorithm known as the “Vincenty” formula, named for its inventor. Distance is computed along the shortest path over the earth model between the two given points (the geodesic, which is the ellipsoidal counterpart of a great circle path). For perimeters, these distances are simply summed together. Azimuth is defined as the angle from geographic north, measured clockwise from 0 to 360 degrees. Area is computed on the ellipsoidal model of the earth using a concept from mathematics known as “spherical excess”. Polygonal features are divided into a set of triangles, this formula is applied to each triangle, and the results are summed to give the area.
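For readers who want to see the distance and azimuth computation spelled out, here is a compact sketch of Vincenty's inverse formula. It assumes the WGS 84 ellipsoid, which the text does not specify, and it is not RemoteView's implementation. The `perimeter` helper is a hypothetical illustration of summing the edge distances.

```python
import math

WGS84_A = 6378137.0              # semi-major axis in meters (assumed ellipsoid)
WGS84_F = 1.0 / 298.257223563    # flattening


def vincenty_inverse(lat1, lon1, lat2, lon2, max_iter=200, tol=1e-12):
    """Return (distance in meters, forward azimuth in degrees) between two points."""
    a, f = WGS84_A, WGS84_F
    b = (1.0 - f) * a
    u1 = math.atan((1.0 - f) * math.tan(math.radians(lat1)))
    u2 = math.atan((1.0 - f) * math.tan(math.radians(lat2)))
    big_l = math.radians(lon2 - lon1)
    sin_u1, cos_u1 = math.sin(u1), math.cos(u1)
    sin_u2, cos_u2 = math.sin(u2), math.cos(u2)

    lam = big_l
    for _ in range(max_iter):
        sin_lam, cos_lam = math.sin(lam), math.cos(lam)
        sin_sigma = math.hypot(cos_u2 * sin_lam,
                               cos_u1 * sin_u2 - sin_u1 * cos_u2 * cos_lam)
        if sin_sigma == 0.0:
            return 0.0, 0.0  # coincident points
        cos_sigma = sin_u1 * sin_u2 + cos_u1 * cos_u2 * cos_lam
        sigma = math.atan2(sin_sigma, cos_sigma)
        sin_alpha = cos_u1 * cos_u2 * sin_lam / sin_sigma
        cos_sq_alpha = 1.0 - sin_alpha ** 2
        # cos_sq_alpha is zero for a line running along the equator.
        cos_2sm = (cos_sigma - 2.0 * sin_u1 * sin_u2 / cos_sq_alpha
                   if cos_sq_alpha else 0.0)
        c = f / 16.0 * cos_sq_alpha * (4.0 + f * (4.0 - 3.0 * cos_sq_alpha))
        lam_new = big_l + (1.0 - c) * f * sin_alpha * (
            sigma + c * sin_sigma * (
                cos_2sm + c * cos_sigma * (-1.0 + 2.0 * cos_2sm ** 2)))
        converged = abs(lam_new - lam) < tol
        lam = lam_new
        if converged:
            break
    else:
        raise ValueError("iteration did not converge (near-antipodal points)")

    u_sq = cos_sq_alpha * (a * a - b * b) / (b * b)
    big_a = 1.0 + u_sq / 16384.0 * (
        4096.0 + u_sq * (-768.0 + u_sq * (320.0 - 175.0 * u_sq)))
    big_b = u_sq / 1024.0 * (
        256.0 + u_sq * (-128.0 + u_sq * (74.0 - 47.0 * u_sq)))
    delta_sigma = big_b * sin_sigma * (cos_2sm + big_b / 4.0 * (
        cos_sigma * (-1.0 + 2.0 * cos_2sm ** 2)
        - big_b / 6.0 * cos_2sm * (-3.0 + 4.0 * sin_sigma ** 2)
        * (-3.0 + 4.0 * cos_2sm ** 2)))
    distance = b * big_a * (sigma - delta_sigma)
    azimuth = math.degrees(math.atan2(
        cos_u2 * sin_lam, cos_u1 * sin_u2 - sin_u1 * cos_u2 * cos_lam))
    return distance, azimuth % 360.0


def perimeter(vertices):
    """Sum of edge distances of a closed polygon given as (lat, lon) pairs."""
    closed = list(vertices) + [vertices[0]]
    return sum(vincenty_inverse(*p, *q)[0] for p, q in zip(closed, closed[1:]))
```

One degree of longitude along the equator comes out at about 111.32 km with an azimuth of 90 degrees, and one degree of latitude northward from the equator at about 110.57 km with an azimuth of 0 degrees. The difference between the two reflects the flattening of the ellipsoid.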
In addition, RemoteView supports a limited selection of mensuration tools through a new tool (to be available in RemoteView version 2.8) with the internal name “Baby Ruler”. A preliminary concept for a user interface for this new tool is shown below.
RemoteView Mensuration tool
The operations supported by “Baby Ruler” are shown below.
As shown above, the RemoteView mensuration tool provides a running log of information returned by the various functions as well as instructions on the use of each function. The results of a mensuration session can also be saved out to a word processing file if needed.
The Ortho-Mosaic Process in RemoteView
Overwatch has developed and deployed geospatial exploitation software tools in RemoteView capable of very high sustained throughput rates. Originally designed to enable smooth, jitter-free roaming in high-resolution displays, this same RemoteView architecture can sustain very fast creation of ortho-rectified mosaics. File size does not adversely affect the throughput rate. Even terabyte-class image files do not diminish the throughput rates. This capability enables distributed, desktop-based workflow scenarios that can compete with more expensive, centralized processing schemes.
Ortho-Mosaicking is the combination of two processes: Ortho-rectification and Mosaicking.
Ortho-rectification is the process of correcting imagery for distortion using elevation data and camera model information so that the scale variation corresponds to a map projection throughout the image. Image distortion can arise from a number of sources including terrain or feature elevation, collection geometry, and from the sensor itself.
Figure 1 shows a very simplified illustration of the distortion that can arise from elevation. The vertical arrow represents a vertical elevation feature. The sensor takes an image of the arrow and ground plane, and projects the image onto the image plane, which is perpendicular to the sensor field of view center ray. The tail of the arrow is projected to point B on the image plane, while the arrow head is projected to point A on the image plane.

Because the sensor captures the image at an angle to the ground plane, the image will appear distorted. Figure 2 below illustrates this distortion. The arrow appears to be “laid over” on its side. Points A and B illustrate how these points appear on the distorted image. The arrow also appears to be “shortened” from its true length.

Figure 3 illustrates how the image should look if the sensor captured every point in the image from directly overhead. If the sensor truly collected every point from a perpendicular collection angle, points A and B would overlap. The arrow would appear as a point. This kind of transformation is known as “true” ortho-rectification.

Simply put, the process of “true” ortho-rectification is the transformation to get from Figure 2 to Figure 3.
Mosaicking is the process of taking two or more separate images and "stitching" them together into a single image. Figure 4 illustrates an image generated by a mosaic process over four images.

The ortho-mosaic process, logically, combines these two processes. It creates a single image from many images, and corrects the resulting mosaic for distortions.
Users today often need to monitor broad areas and bring other georeferenced data, such as GIS layers, into their analytical environment. Most users are not experts in photogrammetry or pixel manipulation. Their primary skills are in analyzing features and their timelines, and in quickly drawing accurate conclusions. Since areas of interest often span multiple images, analysts need tools that will quickly and effortlessly stitch together multiple images to create the broad area coverage. They also need tools that will take the work out of placing GIS layers in their proper locations on the broad area images. Proper layer alignment is crucial for drawing accurate conclusions.
As illustrated above, when a sensor collects imagery, especially in a non-vertical (oblique) collection geometry, there is some inherent distortion present in the collected imagery. This is especially true when the imaged area is hilly or mountainous. The collected imagery will have a number of distortions. One type of distortion is called layover, and is due to the higher elevation being closer to the sensor. Figure 1 illustrates the layover phenomenon.
Layover can also produce pronounced shifts in a feature’s location. Without ortho-rectification, layover will be present, and GIS layers will not line up. Figure 5 below illustrates the shift in the position of a road vector that layover can produce. In this figure, the red line shows the true road position. The yellow line shows what can happen when a user extracts a road vector from a map or previously ortho-rectified image, and overlays that vector on a non-ortho-rectified image.

Without correcting for terrain effects, the vector will not line up with its true position. In this case, the offset is over 29 meters.
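To give a feel for the size of this effect, a common first-order approximation (flat reference surface, parallel viewing rays) puts the horizontal shift of a point at about its height above the reference surface multiplied by the tangent of the off-nadir viewing angle. The sketch below applies that approximation; the function name and the sample values are illustrative assumptions and are not the values behind Figure 5.

```python
import math

def layover_shift(height_m, off_nadir_deg):
    """Approximate horizontal shift (meters) of a point that sits
    height_m above the reference surface when viewed off_nadir_deg
    away from vertical. Flat-surface, parallel-ray approximation."""
    return height_m * math.tan(math.radians(off_nadir_deg))

# Hypothetical cases: the same 50 m of uncorrected elevation
# viewed at increasingly oblique angles.
for angle in (5.0, 15.0, 30.0):
    print(f"{angle:4.1f} deg off nadir -> {layover_shift(50.0, angle):5.1f} m shift")
```

The shift grows quickly with the obliquity of the collection, which is why vectors that line up on a near-vertical image can miss badly on an oblique one.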
Areas of interest often span significant geographic regions. This is especially true in the analysis of infrastructure such as communications, power and transportation. These infrastructure networks can cover significant portions of entire countries. Creating mosaics of these networks from multiple images can involve hundreds of gigabytes of image data. Analysts cannot afford to wait hours or even minutes for a tool to create a mosaic before the analysis can begin. Even modest areas of interest, such as a city, can require many gigabytes. For example, Figure 6 below represents almost 6 GB of imagery from several commercial imaging satellites.

In RemoteView, the process of removing sensor and terrain-induced distortions in an image is fully automated. Users are never required to enter sensor model information such as the type of sensor or the parameters of the sensor. This is handled automatically as shown in Figure 7.

Figure 7: Flow of Geo Referencing and Geo Location on a single image in RemoteView
When a user opens an image in RemoteView, the software automatically determines the best method of georeferencing the image based on its metadata, which is data in the header of the file that describes how the image is laid out, as well as its other characteristics. The resulting sensor model is “attached” to the image. Now, when the user moves the mouse, RemoteView “reads” the cursor location and, using the sensor model, converts the pixel coordinates to ground coordinates (Latitude, Longitude and Height or some other ground coordinate system such as MGRS, depending on what the user has selected in the viewer’s coordinate readout control). RemoteView also calculates statistical estimates of the accuracy of the ground coordinate when the metadata makes this possible. These values are called the “Circular Error 90%” and the “Linear Error 90%”, abbreviated CE90 and LE90. These two numbers define the radius and the height of an imaginary cylinder with its base centered on the ground point. In a statistical sense, the true ground point will fall within this cylinder 90% of the time. Of course, the other 10% of the time, the true location lies somewhere outside of this cylinder. The basis of this calculation depends on the imagery, but for images with Rational Polynomial Coefficients, the errors are based mostly on the ERR_RAND and ERR_BIAS tags in the metadata. These are interpreted as 1-sigma random and bias errors respectively, in the plane normal to the look vector, and not including terrain errors. Terrain errors are accounted for separately in the LE90 value during single image error propagation.
The results of geolocation, including accuracy, are presented in the geo reporting control in the viewer as shown below.
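One plausible way to turn 1-sigma values into 90% figures is sketched below. It assumes the bias and random terms are independent and can be combined by root-sum-square, that the horizontal error is circular normal, and that the vertical error is normal; the paper does not state RemoteView's exact error propagation, so the function names and the combination rule are assumptions.

```python
import math
from statistics import NormalDist

# Scale factors from 1-sigma to 90% probability.
CE90_FACTOR = math.sqrt(-2.0 * math.log(1.0 - 0.90))  # circular normal, about 2.146
LE90_FACTOR = NormalDist().inv_cdf(0.95)               # two-sided normal, about 1.645

def ce90_from_rpc_errors(err_bias, err_rand):
    """Combine 1-sigma bias and random errors (same units) by
    root-sum-square and scale the result to a 90% circular error."""
    sigma = math.hypot(err_bias, err_rand)
    return CE90_FACTOR * sigma

def le90_from_sigma(sigma_height):
    """Scale a 1-sigma height error to a 90% linear error."""
    return LE90_FACTOR * sigma_height

def inside_error_cylinder(d_horizontal, d_vertical, ce90, le90):
    """True if an offset from the reported ground point lies inside the
    cylinder of radius ce90 and half-height le90 (the vertical extent
    is treated as plus or minus le90, an assumption)."""
    return d_horizontal <= ce90 and abs(d_vertical) <= le90

# Hypothetical tag values and terrain sigma.
ce90 = ce90_from_rpc_errors(err_bias=3.0, err_rand=4.0)
le90 = le90_from_sigma(6.0)
print(f"CE90 = {ce90:.1f}, LE90 = {le90:.1f}")
```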
RemoteView georeporting – ground point location, pixel location, RGB values and accuracy
This process of determining a ground point from pixel coordinates is one of the fundamental applications of Photogrammetry. The reverse process is also critical, that is, the computation of image coordinates from a given ground point. In fact, the mathematics of the sensor model make this reverse process more straightforward, since no external data (elevation) is required. A simple example of where this process is used is in the RemoteView Geo-Marker function shown below.

RemoteView Geomarker Application
In this function, the user enters ground point coordinates and RemoteView calculates the correct image pixel location which corresponds to that ground point and moves the user’s view to center on that point. It will also optionally place a marker with a symbol and text at that point. To do this computation, RemoteView uses the sensor model in the reverse sense to that explained above for Geo Location. It converts from ground space to image space (pixel location) and moves the viewer to center on that point.
Since the mid-1990s, RemoteView engineers have architected solutions that exploit converging COTS hardware technology and quality image metadata. This approach enables even novice users to benefit from the performance increases offered by recent hardware advances as they enter the commercial mainstream. It is a common sense approach to achieving increased performance at reasonable (COTS) costs. That approach continues today, and gives RemoteView an ortho-mosaic capability unique among geospatial exploitation solutions.
As already mentioned, RemoteView relieves the user of the burden of removing sensor and terrain-induced distortions. Users are never required to enter sensor model information such as the type of sensor or the parameters of the sensor. This eliminates the need to understand the details of photogrammetry. Users with average computer skills can correct imagery for distortion so that the scale is consistent with the selected map projection throughout the image.
The RemoteView architecture in conjunction with 64-bit, quad core, dual processing hardware offers new capabilities that enable very fast creation of ortho-mosaics of entire countries at high resolution.
A 64-bit OS running in a multi-core, multi-CPU environment has the necessary processing power and memory address space to address the processing demands that a mosaic of this scale would place on the processing system. This is simply not possible to do in a 32-bit environment.
The RemoteView architecture is able to take full advantage of a 64-bit, multi-CPU environment because of the focus the RemoteView engineers have placed historically on smooth, jitter-free display in dynamic imagery exploitation. In an interactive exploitation environment, data displayed dynamically on a screen must keep pace with a reasonably high refresh rate in order to guarantee fast response times and minimize user fatigue. For some applications, this refresh rate is as high as eighty-five new display frames every second, or about 12 msec per frame. Any pixel processing required to render pixels on the screen must finish within that time to avoid rendering “stalls” (also known as jitter). Jitter occurs when the processing required to render a single frame exceeds the interframe time interval. To avoid stalling the rendering pipeline, the pixel data must be queued or pre-staged into memory in a manner that accounts for I/O and processing latency. This queuing of pixel data requires significant amounts of memory, and large address spaces to support today’s higher-resolution monitors and more complex data sets. By applying the power of a 64-bit OS running in a multi-core, multi-CPU environment to these demands, new capabilities become possible. Some examples of these new capabilities are listed below:
Soft Copy Search capabilities allow users to easily and quickly build large area mosaics from high resolution imagery and interactively search (roam, pan, zoom) and manipulate (contrast, brightness, DRA, etc.) the mosaic image. In order for the roam, pan and zoom to be truly interactive and responsive, the system must “queue up” all the necessary display tiles “next in line” to be pushed to the display. A 32-bit OS soon runs out of memory address space needed to keep track of all the display tiles ready to be pushed to the display in this interactive mode. Users in a 32-bit OS therefore experience a soft limit of about 500 GB on the size of an image mosaic. The soft limit in a 64-bit OS may be on the order of 100 TB. Thus, a 64-bit OS running in a multi-core, multi-CPU environment can create high-resolution mosaics of entire countries.
Besides the need to keep track of all display tiles ready to be pushed to the display, a truly interactive system also has the need to process the display pixels fast enough in order to keep up with the desired refresh rate. An additional advantage of “on-the-fly” processing is that it eliminates the need to store multiple processed copies of essentially the same data set. Only one copy of the native data needs storage space. “On-the-fly” processing creates all other requested processed products on demand.
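The idea of processing only what is displayed, and keeping recently processed tiles queued in memory, can be sketched as follows. This is a toy illustration and not RemoteView's pipeline: the tile reader, the processing stage, the tile size and the cache size are all stand-ins invented for the example.

```python
from functools import lru_cache

REFRESH_HZ = 85
FRAME_BUDGET_MS = 1000.0 / REFRESH_HZ   # time available per display frame

def read_native_tile(row, col):
    """Stand-in for reading one 4x4 tile of native pixels from disk."""
    return [[(row * 31 + col * 17 + r + c) % 256 for c in range(4)] for r in range(4)]

def stretch(tile, gain=2):
    """Stand-in processing stage: a brightness gain clipped to 8 bits."""
    return [[min(255, v * gain) for v in line] for line in tile]

@lru_cache(maxsize=256)
def processed_tile(row, col):
    """Process a tile the first time it is requested, then keep it in memory."""
    return tuple(tuple(line) for line in stretch(read_native_tile(row, col)))

def visible_tiles(first_row, first_col, rows, cols):
    """Return processed tiles for the part of the image currently in view."""
    return [processed_tile(r, c)
            for r in range(first_row, first_row + rows)
            for c in range(first_col, first_col + cols)]

view = visible_tiles(0, 0, 2, 3)      # the initial view touches 6 tiles
print(f"frame budget: {FRAME_BUDGET_MS:.2f} ms, {processed_tile.cache_info()}")
```

Roaming one tile to the right with `visible_tiles(0, 1, 2, 3)` would reuse four cached tiles and process only the two newly exposed ones; only the native data is ever stored.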
Single CPU systems have offered limited ability to handle display pixel processing in an interactive manner. For example, RemoteView has taken advantage of this ability for more than ten years, offering users the ability to perform processing stages like pan-sharpening or ortho-rectification “on-the-fly”. A single core, single CPU system can achieve some performance gain by overlapping I/O and processing, but such a system can support only a limited number of processing stages interactively before the CPU becomes the performance bottleneck.
Multiple CPUs and multiple cores can significantly reduce this bottleneck and improve performance. Multiple processing cores enable many of these stages to take place at the same time. This in turn means more processing stages can be ready for display in the same time in which previously only one stage could be ready. Some examples of concurrent processing stages that multiple CPU/multiple core systems may enable in an interactive (i.e., “on-the-fly”) manner are:
A 64-bit OS running in a multi-core, multi-CPU environment will support interactive mosaics of multi-stage products.
Overwatch has taken the RemoteView pixel processing architecture originally designed to enable smooth, jitter-free roaming in high-resolution displays and applied it to “on-the-fly” exploitation processes, including the Photogrammetric process of generating ortho-mosaicked images.
When installed on a 64-bit, multiple-CPU PC, this architecture can easily scale to terabyte-sized mosaics. With such a configuration it is now feasible to consider creating daily refreshed mosaics of entire countries in a desktop environment, realizing a considerable savings over more expensive, centralized processing schemes.
“Pan Sharpening” is shorthand for “Panchromatic sharpening”. It means using a panchromatic (single band) image to “sharpen” a multispectral image. In this sense, to “sharpen” means to increase the spatial resolution of a multispectral image.
A multispectral image contains a higher degree of spectral resolution than a panchromatic image, while often a panchromatic image will have a higher spatial resolution than a multispectral image. A pan sharpened image represents a sensor fusion between the multispectral and panchromatic images which gives the best of both image types, high spectral resolution AND high spatial resolution. This is the simple why of pan sharpening. Most of this paper is concerned with the how of pan sharpening. First, a review of some fundamental concepts is in order.
A multispectral image is an image that contains more than one spectral band. It is formed by a sensor which is capable of separating light reflected from the earth into discrete spectral bands. A color image is a very simple example of a multispectral image that contains three bands. In this case, the bands correspond to the blue, green and red wavelength bands of the electromagnetic spectrum. The electromagnetic spectrum is the wavelength (or frequency) mapping of electromagnetic energy, as shown below.

The electromagnetic spectrum
The full electromagnetic spectrum covers all forms of radiation, from extremely short-wavelength gamma rays through long-wavelength radio waves. In Remote Sensing imagery, we are limited to radiation that is either reflected or emitted from the earth, and that can also pass through the atmosphere to the sensor. Electro-optical sensors sense solar radiation that originates at the sun and is reflected from the earth in the visible to near-infrared (just to the right of red in the figure above) region. Thermal sensors sense solar radiation that is absorbed by the earth and emitted as longer wavelength thermal radiation in the mid to far infrared regions. Radar sensors provide their own source of energy in the form of microwaves that are bounced off of the earth back to the sensor. A conceptual diagram of a multispectral sensor is shown below.

A Simplified diagram of a multispectral scanner
In this diagram, the incoming radiation is separated into spectral bands using a prism. We have all seen how a prism is able to do this and we have seen the earth’s atmosphere act like a prism when we see rainbows.
In practice, prisms are rarely used in modern sensors. Instead, a diffraction grating, which is a piece of material with many thin grooves carved into it, is used. The grooves cause the light to be reflected and transmitted in different directions depending on wavelength. You can see a rough example of a diffraction grating when you look at a CD and notice the multi-color effect of light reflecting off of it as you tilt it at different angles.
After separating the light into different “bins” based on wavelength ranges, the multispectral sensor forms an image from each of the bins and then combines them into a single image for exploitation.
Multispectral images are designed to take advantage of the different spectral properties of materials on the earth’s surface. The most common example is the detection of healthy vegetation. Since healthy vegetation reflects much more near-infrared light than visible light, a sensor which combines visible and near-infrared bands can be used to distinguish healthy from less healthy vegetation. Typically this is done with one or more vegetation indices such as the Normalized Difference Vegetation Index (NDVI), defined as the difference of the near-infrared and red reflectance divided by the sum of these two values. Some typical spectral signatures of vegetation, soil and water are shown below.

These are only representative spectra. Each type of vegetation, water, soil and other surface type has a different reflectance spectrum, and outside of a laboratory, these also depend on the sun’s position in the sky and the satellite’s position as well.
When there are more bands covering more parts of the electromagnetic spectrum, more materials can be identified using more advanced algorithms such as supervised and unsupervised classification, in addition to the simple but effective band ratio and normalization methods such as the NDVI.
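As a small worked example of the band ratio approach, NDVI is the near-infrared reflectance minus the red reflectance, divided by their sum. The sketch below computes it for a few surface types; the reflectance values are made-up illustrative numbers, not measurements, and the function is not RemoteView's Image Calculator.

```python
def ndvi(red, nir):
    """Normalized Difference Vegetation Index for one pixel.

    red and nir are reflectances in the red and near-infrared bands.
    Returns 0.0 when both are zero to avoid dividing by zero.
    """
    total = nir + red
    if total == 0:
        return 0.0
    return (nir - red) / total

# Illustrative (red, near-infrared) reflectances, assumed for the example.
samples = {
    "healthy vegetation": (0.05, 0.50),
    "bare soil": (0.25, 0.30),
    "water": (0.03, 0.01),
}
for name, (red, nir) in samples.items():
    print(f"{name:20s} NDVI = {ndvi(red, nir):+.2f}")
```

Values near +1 indicate strong near-infrared reflection relative to red, which is characteristic of healthy vegetation, while water tends to give negative values.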
RemoteView has several tools which take advantage of multispectral data including the Image Calculator for performing NDVI and other indices and a robust Multispectral Classification capability which includes both supervised (using training sets) and unsupervised classification. This paper however is focused on the Pan Sharpening tools within RemoteView.
In contrast to the multispectral image, a panchromatic image contains only one wide band of reflectance data. The data is usually representative of a range of bands and wavelengths, such as visible or thermal infrared, that is, it combines many colors so it is “pan” chromatic. A panchromatic image of the visible bands is more or less a combination of red, green and blue data into a single measure of reflectance. Modern multispectral scanners also generally include some radiation at slightly longer wavelengths than red light, called “near infrared” radiation.
Panchromatic images can generally be collected with higher spatial resolution than a multispectral image because the broad spectral range allows smaller detectors to be used while maintaining a high signal to noise ratio.
For example, 4-band multispectral data is available from QuickBird and GeoEye. For each of these, the panchromatic spatial resolution is about four times better than that of the multispectral data. Panchromatic imagery from QuickBird-3 has a spatial resolution of about 0.6 meters at nadir. The same sensor collects the multispectral data at about 2.4 meters resolution. For GeoEye’s Ikonos, the panchromatic and multispectral spatial resolutions are about 1.0 meters and 4.0 meters respectively. Both sensors can collect co registered (explained below) panchromatic and four-band (red, green, blue and near-infrared) multispectral images.
The image below is a QuickBird panchromatic image of Taipei, Taiwan with 0.6 meter ground resolution.

QuickBird panchromatic image, 0.6 meter ground resolution.
A QuickBird multispectral image of the same area at about 2.4 meter resolution is shown below.

QuickBird multispectral image, 2.4 meter ground resolution showing the same areas shown in the panchromatic image above.
The impact of both the fourfold decrease in spatial resolution and the enhanced color information available in the multispectral image is readily apparent in these two images. Using RemoteView’s projective pan sharpening algorithm to sharpen the multispectral image produces the result shown below.

Pan sharpening example based on the two images shown above. Result has 0.6 meter spatial resolution and 4 multispectral bands.
As you can see, pan sharpening can substantially improve the amount of spectral information in a panchromatic image; conversely, it can substantially increase the spatial resolution of a multispectral image.
There have been many studies on pan sharpening and many algorithms have been developed. Some are slightly better than others at preserving either spatial or spectral information, but there is almost always a loss of one or the other or both. All of the methods depend on the panchromatic and the multispectral image being very closely co registered. When images are co registered, you can think of overlaying one on top of the other and examining any pixel in the top image. The pixel directly beneath it in the bottom image should represent the exact same feature on the ground.
Pan sharpening algorithms depend on the input images being co registered because they all perform operations on corresponding pixels in both images. They all do something with the multispectral pixel and the panchromatic pixels to create new pixels. If the images are not co registered, the processing will use the wrong pixels, not the corresponding ones and the result will not look natural.
In practice RemoteView uses the georeferencing information in the images to perform co registration “on the fly”. That is why images used for pan sharpening must be georeferenced; that is, they must have metadata which supports one of the image/ground transformation methods supported in RemoteView. For information on georeferencing, see the Basic Photogrammetry white paper available at this web site. Unlike advanced photogrammetric operations, pan sharpening does not require the images to have rigorous sensor model support or RPC support. Simpler methods such as the ICHIPB or other “4-corner” or map projection methods are also supported.
The software uses the georeferencing information in one image to identify the ground point associated with each pixel, then uses the georeferencing information in the other image in the reverse direction, to convert the ground location to image space in order to locate the corresponding image pixel in the other image.
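For the simplest case, a map projection method, this image to ground to image chain can be sketched with two affine transforms. The six-parameter layout below follows the widely used GDAL-style geotransform ordering; the two transforms, the coordinate values and the function names are assumptions for illustration, and a rigorous sensor model or RPCs would replace the affine functions in practice.

```python
def pixel_to_ground(gt, col, row):
    """Affine image-to-ground transform.

    gt = (x_origin, x_per_col, x_per_row, y_origin, y_per_col, y_per_row)
    """
    x = gt[0] + col * gt[1] + row * gt[2]
    y = gt[3] + col * gt[4] + row * gt[5]
    return x, y

def ground_to_pixel(gt, x, y):
    """Inverse of pixel_to_ground, obtained by inverting the 2x2 affine part."""
    det = gt[1] * gt[5] - gt[2] * gt[4]
    dx, dy = x - gt[0], y - gt[3]
    col = (gt[5] * dx - gt[2] * dy) / det
    row = (-gt[4] * dx + gt[1] * dy) / det
    return col, row

def corresponding_pixel(gt_from, gt_to, col, row):
    """Find the pixel in the second image that views the same ground point."""
    x, y = pixel_to_ground(gt_from, col, row)
    return ground_to_pixel(gt_to, x, y)

# Hypothetical north-up images sharing an upper-left corner:
# 0.6 m panchromatic pixels and 2.4 m multispectral pixels.
PAN_GT = (500000.0, 0.6, 0.0, 2770000.0, 0.0, -0.6)
MS_GT = (500000.0, 2.4, 0.0, 2770000.0, 0.0, -2.4)

ms_col, ms_row = corresponding_pixel(PAN_GT, MS_GT, 400, 800)
print(f"pan pixel (400, 800) -> multispectral pixel ({ms_col:.1f}, {ms_row:.1f})")
```

The result is generally fractional, so the multispectral value at that location has to be interpolated from neighboring pixels before the sharpening arithmetic is applied.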
To a large degree, the results that you obtain from pan sharpening will depend on how well co registered the two images are and how closely their georeferencing methods agree, that is how good their relative georeferencing is.
If the images are not well co registered, you can fix this situation using the RemoteView image to image registration tool. This tool is explained in the white paper Basic Photogrammetry. By manually identifying a small number of “tie points” on the two images, you can achieve much better co registration and pan sharpening results in some cases.
Using this tool, designate the panchromatic image as the reference image and the multispectral image as the image to Calibrate as shown below.

The RemoteView Image To Image Calibration tool. Here 5 “tie points” have been identified on the two images. Push the Calibrate button to perform the co registration of the two images.
There are also special cases where the panchromatic and multispectral images are already exactly co registered. Such is the case for both the DigitalGlobe level 2A product and standard Ikonos panchromatic and multispectral bundled products from GeoEye.
The DigitalGlobe 2A product can be identified from the file name. In the illustration above which shows the image to image calibration tool, the multispectral image file name contains the string “M2A” and the panchromatic image file name contains the string “P2A”. Other image types, including the QuickBird level 1B product, might actually be registered closely enough for successful pan sharpening, and in other cases you might need to perform the image to image calibration steps outlined above.
Note that there is also another registration issue with the multispectral data alone. That is, the individual bands of the image must also be co registered. For nearly all modern sensors, this is not an issue; the bands are precisely co registered because they are collected with the same sensor at the same instant in time. For some older remote sensing data such as SPOT or Landsat, it might also be necessary to perform image to image registration on the individual bands using the technique described above.
To access the RemoteView pan sharpening algorithms, push the “PS” button on the multispectral toolbar, shown below.
The Multispectral Toolbar and the Pan Sharpening Tool button (red circle with arrow)
The dialog shown below will be started.

The Pan Sharpening Dialog. Choose one multispectral and one panchromatic image from the dropdown lists. The images can be chosen in either order. Select a method.
The default method (projective) is suitable when you have well co registered data and a 4-band multispectral image.
This simple dialog begins the process of pan sharpening. In RemoteView, pan sharpening, like most processing algorithms such as Orthorectification or multispectral classification, happens “on the fly”. That is, the sharpening occurs only on the pixels which are actually viewed by the user. As the user roams or zooms to other parts of the image, those pixels are processed at that time.
By not forcing the user to wait until the entire image is pan sharpened, RemoteView allows real-time pan sharpening and other processing, removing the need for the “hour glass” user interface of other software packages. Immediately upon selecting the OK button, the pan sharpened image is produced and is ready for exploitation. The sharpened image can also be saved to a new file or saved as part of a RemoteView folder so that the steps used in creating the sharpening do not have to be repeated.
RemoteView provides five algorithms for pan sharpening, as shown below. By far the best method, in terms of the appearance of the output, is the Projective method used in the example of the Taiwan images above. The other algorithms are summarized below.
| Algorithm Name | Features |
| Projective | |
| Hue, Saturation, Intensity | |
| Hue, Saturation, Intensity | |
| High Pass Filter | |
| Multiplicative | |
Pan Sharpening algorithms available in RemoteView.
The results of performing pan sharpening using the High Pass Filter and Hue, Saturation, Intensity methods are shown in the images below. As you can see, the projective method shown above produces much better results in terms of preserving the color of the multispectral image. The other algorithms all cause a color shift of one type or another from the original multispectral image. The main advantage of the High Pass and Multiplicative methods is that they produce the same number of bands in the output as the original multispectral image contained.

High Pass Filter, sometimes produces slightly better spatial resolution than the other methods. Creates the same number of bands as the original multispectral image.

Hue Saturation Intensity, color shift due to inconsistent spectral ranges of the panchromatic and spectral bands in spectral space. The effect can be somewhat improved using the NIR Adjustment method available in RemoteView. Produces a three-band output (R, G and B).
The projective pan sharpening algorithm in RemoteView takes advantage of the fact that the panchromatic image used covers the exact same spectral range as the multispectral data. The four spectral bands cover the electromagnetic spectrum from the blue through the near-infrared regions. The panchromatic image covers the exact same spectral range. A mathematical model which relates the panchromatic and multispectral images is formed and the parameters of this model are determined by least squares.
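The paper does not give the form of this model, so the sketch below substitutes a generic one for illustration: the panchromatic value is assumed to be a weighted sum of the multispectral bands plus an offset, the weights are found by least squares, and each band is then scaled by the ratio of the real panchromatic image to the synthetic one. The linear model, the ratio step and all function names are assumptions, and the multispectral bands are assumed to be already resampled onto the panchromatic grid.

```python
import numpy as np

def fit_pan_weights(ms, pan):
    """Least-squares fit of pan ~ sum_b w[b] * ms[b] + offset.

    ms has shape (bands, H, W) and pan has shape (H, W).
    Returns (weights, offset).
    """
    bands = ms.shape[0]
    design = np.column_stack([ms.reshape(bands, -1).T, np.ones(pan.size)])
    coeffs, *_ = np.linalg.lstsq(design, pan.ravel(), rcond=None)
    return coeffs[:-1], coeffs[-1]

def ratio_sharpen(ms, pan, weights, offset, eps=1e-6):
    """Scale every band by pan / synthetic pan (a generic ratio method)."""
    synthetic = np.tensordot(weights, ms, axes=1) + offset
    return ms * (pan / np.maximum(synthetic, eps))

# Synthetic demonstration data: four bands and a pan image built from
# known weights, so the fit can be checked against them.
rng = np.random.default_rng(0)
ms = rng.uniform(0.1, 1.0, size=(4, 8, 8))
true_weights = np.array([0.2, 0.3, 0.3, 0.2])
pan = np.tensordot(true_weights, ms, axes=1)

weights, offset = fit_pan_weights(ms, pan)
sharpened = ratio_sharpen(ms, pan, weights, offset)
print("fitted weights:", np.round(weights, 3), "offset:", round(float(offset), 6))
```

Because the output is the multispectral data multiplied by a single per-pixel ratio, the relative band values at each pixel are unchanged, which is one reason ratio-style methods tend to preserve color well when the spectral ranges truly overlap.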
The other methods might be necessary when the input multispectral image does not represent full spectral overlap with the panchromatic image, or when it is not possible to accurately co register the two images. The High Pass Filter and Multiplicative methods also have the advantage of producing images with the same number of bands as the original multispectral image.
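Textbook forms of the High Pass Filter and Multiplicative methods are sketched below to show why the band count is preserved: each output band is computed from the matching input band and the panchromatic image alone. The box filter, its radius and the function names are assumptions chosen for brevity, RemoteView's actual filters are not described in this paper, and the multispectral bands are assumed to be already resampled onto the panchromatic grid.

```python
import numpy as np

def box_blur(img, radius=1):
    """Mean filter with edge replication, used as a simple low-pass filter."""
    size = 2 * radius + 1
    padded = np.pad(img, radius, mode="edge")
    out = np.zeros(img.shape, dtype=float)
    for dy in range(size):
        for dx in range(size):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (size * size)

def high_pass_sharpen(ms, pan, radius=1):
    """Add the high-frequency detail of pan to every multispectral band."""
    detail = pan - box_blur(pan, radius)
    return ms + detail            # (bands, H, W) + (H, W) broadcasts per band

def multiplicative_sharpen(ms, pan):
    """Multiply every multispectral band by the panchromatic image."""
    return ms * pan

rng = np.random.default_rng(1)
ms = rng.uniform(0.1, 1.0, size=(3, 6, 6))
pan = rng.uniform(0.1, 1.0, size=(6, 6))

hpf = high_pass_sharpen(ms, pan)
mult = multiplicative_sharpen(ms, pan)
print(hpf.shape, mult.shape)      # both keep the three input bands
```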
By increasing the spatial resolution of the high spectral resolution multispectral image, many other image processing tasks which are performed on the multispectral image are enhanced. This includes simple visual image interpretation and visual exploitation, as well as product generation and advanced methods such as Orthorectification and ortho mosaic.
This paper has presented the reasons why pan sharpening is a popular feature in RemoteView and other image processing tools, and some of the theory of this technology. There are many additional sources of reference on Pan Sharpening available from Photogrammetric and remote sensing journals. The most widely respected of these include Photogrammetric Engineering and Remote Sensing, IEEE Transactions on Geoscience and Remote Sensing, Journal of Geophysical Research, and International Journal of Remote Sensing. For additional information, please contact Gene Rose at gene.rose@overwatch.com.
Introduction
In RemoteView, both basic and advanced Photogrammetric processes are available. The white paper titled “Basic Photogrammetry in RemoteView” describes the basic processes. This paper discusses one of the more advanced operations: multi-image geo location, a process used for extracting precise coordinates for targeting and other applications. A method for extracting heights of objects from a single image, using tools called Ruler and RemoteView’s Mensuration Toolkit, is also described in that paper. This paper considers the case of multiple images. In general, using multiple images to obtain the heights of objects is more accurate than using one image, but this generalization does not always hold true.
Space Intersection
Multi-Image Geo Location (MIG), as the name implies, is the process of extracting ground point coordinates from measurements taken on two or more images. In the paper Basic Photogrammetry in RemoteView, it was explained that three-dimensional ground coordinates, that is, coordinates with horizontal as well as height information, cannot be obtained from a single image without resorting to an independent source of information in the form of terrain data. The problem is shown below.

The ambiguous single image geo location problem.
The problem is that only the direction of the ray connecting the image pixel and the unknown ground point can be determined from the sensor model information, not its length. The ray has an infinite number of possible intersections with the earth. To find the correct one, we estimate the ground surface using an elevation model which can be comprised of standard DTED data or other elevation sources.
On the other hand, if we have two images, the situation is as shown below.

Ground position from two image locations
Here, a feature, which may be on the ground or not, is viewed in both images. The image point locations of the feature in the two images are used to construct two rays. One end point of the rays is known from the sensor location. The direction of both rays is known from the sensor orientation data.
Now, we are able to mathematically find the intersection of the two rays in space, which is the feature represented by the two image points. The ground coordinates of this intersection point represent a point in space from which we can get the horizontal AND height information.
This process is known as Space Intersection and is the same process used in targeting systems such as iGeoPos, RainDrop and the new Common Geopositioning Services (CGS) intersection service. It is also the process by which most digital terrain data which exists today was created, including most DTED and USGS DEM data.
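The two-ray construction described above can be sketched in a few lines. This is a minimal illustration, not RemoteView's implementation: the function name `intersect_two_rays`, the local Cartesian coordinates, and the sensor positions are all assumptions made for the example. Because measured rays rarely meet exactly, the sketch returns the midpoint of the shortest segment joining the two rays, together with the length of that segment as a rough consistency check.

```python
import math

def sub(a, b):
    return [a[i] - b[i] for i in range(3)]

def dot(a, b):
    return sum(a[i] * b[i] for i in range(3))

def intersect_two_rays(s1, d1, s2, d2):
    """Return (midpoint, gap) for the rays s1 + t1*d1 and s2 + t2*d2.

    s1, s2 are the sensor positions (one known end of each ray) and
    d1, d2 are the ray directions derived from the sensor orientation.
    """
    w = sub(s1, s2)
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b
    if denom <= 1e-12 * a * c:
        raise ValueError("rays are (nearly) parallel; no stable intersection")
    t1 = (b * e - c * d) / denom  # distance parameter along ray 1
    t2 = (a * e - b * d) / denom  # distance parameter along ray 2
    p1 = [s1[i] + t1 * d1[i] for i in range(3)]
    p2 = [s2[i] + t2 * d2[i] for i in range(3)]
    midpoint = [(p1[i] + p2[i]) / 2.0 for i in range(3)]
    return midpoint, math.dist(p1, p2)

# Hypothetical example: two sensors 1000 m apart viewing the same feature.
sensor_1 = [0.0, 0.0, 1000.0]
sensor_2 = [1000.0, 0.0, 1000.0]
feature = [400.0, 200.0, 50.0]
point, gap = intersect_two_rays(
    sensor_1, sub(feature, sensor_1), sensor_2, sub(feature, sensor_2)
)
```

With perfect rays the gap is zero and the midpoint is the feature itself. With real measurements the gap grows as the image measurements or the sensor data become less consistent.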
The fundamental physical principle which forms the basis for many of the advanced algorithms in Photogrammetry is called the collinearity equation. This equation simply states that the ground point, the image point, and the point where the light is focused by the sensor (called the perspective center for optical sensors) all lie on the same straight line. That is, they are all collinear. Mathematically, this condition of collinearity can be written simply as:
G = X*p (1)
That’s it, the collinearity equation. It says that the ground point (G) is something (X) multiplied by the image point (p). The “something” is, of course, where all of the detail comes in. “Something” is different for each sensor type and for each image of a sensor type. But we don’t have to know what is in “X” right now. We just have to know that we can solve this equation. But there is a problem. The image point (p) has two parts, x and y (or row and column), and the ground point G has three parts: either latitude, longitude and height, or X, Y and Z in some other space. We cannot solve for three unknowns with one image point. That is why we cannot determine three-dimensional ground coordinates from a single image, as explained in Basic Photogrammetry. To do that, we needed another source of information, which was terrain data.
Now, however, we have two image points, which means we have four things in “p”: the x and y of both points. Four measurements are more than enough to determine the three unknowns in G, so we can now solve the collinearity equation using the two image points without any other source of information. If you are mathematically inclined, this equation is solved by the method of least squares; the book by Ed Mikhail referenced at the end of this paper is an excellent reference, as are the other Photogrammetry texts referenced there. More detail on the mathematics of this process is given below.
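The counting argument can be written out explicitly. This is only a restatement of the paragraph above; the symbol n (the number of images in which the feature is measured) is introduced here for illustration.

```latex
\begin{aligned}
\text{unknowns} &= 3 \quad (\text{the three coordinates of } G) \\
\text{equations from } n \text{ images} &= 2n \quad (x \text{ and } y \text{ of each image point}) \\
n = 1 &: \quad 2 < 3 \quad \text{(underdetermined; terrain data is needed)} \\
n = 2 &: \quad 4 > 3 \quad \text{(one redundant equation; solved by least squares)}
\end{aligned}
```

Each additional image adds two more equations and no new unknowns, which is why the least squares solution improves as more images are used.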
The collinearity equation is extremely useful in Photogrammetry. Note however that it really only applies to passive optical sensors. There is, in fact, a similar equation for radar sensors. Instead of a ray of light, the radar equation gives the intersection of a cone with a sphere. We are not going to delve into radargrammetry in this paper. This topic will be covered in a future paper.
Here is another way of showing the collinearity equation. This is a picture of the equation above. The geometric interpretation of the equation is that the small vector “p” needs to be multiplied by something to make it longer so that it will equal the big vector “G”.

The basic physics of optical Photogrammetry. The Perspective Center, the Image Point and the Ground Point all lie on the same line. They are “collinear”.
Some Details of the Collinearity Equation
Taking the case of a simple frame camera, the small vector “p” above will be known from the image pixel location in the focal plane of the sensor and the focal length of the sensor. So the three coordinates of this three-dimensional vector are [x, y, -f], where x and y are the focal plane coordinates and f is the focal length (it is –f because the image is formed behind the perspective center, not in front of it). We can get the x and y part from the image coordinates. The sensor model documentation will tell us how to convert that to the focal plane. It will typically be a fairly simple equation. The ‘f’ part, the focal length, we could get from the documents that specify the sensor design. Focal lengths are well known and calibrated to within thousandths of microns before the camera or sensor is installed in a spacecraft or aircraft. After launch, it might have to be recalibrated because of things getting shaken up during the launch.
This small vector “p” is now defined, but it’s in the wrong coordinate system. To get it to be in the same coordinate system as G, we need to know a few things about the sensor position and orientation (how much it is rotated around all three of its axes, commonly called “tilt/tip/swing” or “roll/pitch/yaw”). If we know these things, we can transform the vector p into the coordinate space of G (i.e. the Ground Space). In RemoteView and in most systems, we use an Earth Centered Fixed (ECF) ground space.
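As a rough sketch of this transformation, the code below builds the vector [x, y, -f] and rotates it into a ground frame. Everything specific here is an assumption for illustration: the function names, the roll/pitch/yaw rotation order (R = Rz(yaw) Ry(pitch) Rx(roll)), and the use of a generic Cartesian ground frame in place of ECF. A real sensor model defines its own angle conventions in its documentation.

```python
import math

def rotation_matrix(roll, pitch, yaw):
    """Sensor-to-ground rotation, angles in radians.

    Assumed convention: R = Rz(yaw) * Ry(pitch) * Rx(roll).
    """
    cr, sr = math.cos(roll), math.sin(roll)
    cp, sp = math.cos(pitch), math.sin(pitch)
    cy, sy = math.cos(yaw), math.sin(yaw)
    return [
        [cy * cp, cy * sp * sr - sy * cr, cy * sp * cr + sy * sr],
        [sy * cp, sy * sp * sr + cy * cr, sy * sp * cr - cy * sr],
        [-sp, cp * sr, cp * cr],
    ]

def ground_ray(x, y, f, roll, pitch, yaw):
    """Unit direction, in the ground frame, of the ray through image point (x, y).

    x, y are focal plane coordinates and f is the focal length, all in the
    same units. The image vector is [x, y, -f].
    """
    p = [x, y, -f]
    rot = rotation_matrix(roll, pitch, yaw)
    d = [sum(rot[i][j] * p[j] for j in range(3)) for i in range(3)]
    norm = math.sqrt(sum(c * c for c in d))
    return [c / norm for c in d]

# Hypothetical example: a point 10 mm off-center on a 100 mm focal length
# camera that has been rotated 90 degrees about the vertical axis.
direction = ground_ray(0.010, 0.0, 0.100, 0.0, 0.0, math.radians(90.0))
```

The ground point then lies somewhere along the line that starts at the sensor position and follows this direction, which is exactly the ambiguity that a second image or an elevation model resolves.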
These values of sensor location and orientation must be derived from the image metadata. For a simple frame camera, there will be one value of the sensor position in space and one set of rotation angles for the image. For a pushbroom camera, there will be one set of these values for each line of the image. In fact, there will usually be many more than that, and we can choose the closest set. The location of the sensor in space is called its Ephemeris. The orientation data is called its Attitude. Sometimes the single word Ephemeris is used to refer to both things.
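Choosing the closest set of values for a given pushbroom image line can be sketched as a nearest-time lookup. The names `line_time` and `nearest_sample`, the constant line rate, and the sample spacing below are hypothetical; actual metadata layouts differ by sensor, and some systems interpolate between samples instead of taking the nearest one.

```python
from bisect import bisect_left

def line_time(line, start_time, lines_per_second):
    """Acquisition time of an image line, assuming a constant line rate."""
    return start_time + line / lines_per_second

def nearest_sample(sample_times, t):
    """Index of the ephemeris/attitude sample closest in time to t.

    sample_times must be sorted in ascending order.
    """
    i = bisect_left(sample_times, t)
    if i == 0:
        return 0
    if i == len(sample_times):
        return len(sample_times) - 1
    before, after = sample_times[i - 1], sample_times[i]
    return i if (after - t) < (t - before) else i - 1

# Hypothetical metadata: samples every 0.5 s, image scanned at 1000 lines/s.
sample_times = [0.0, 0.5, 1.0, 1.5]
index_for_line_700 = nearest_sample(sample_times, line_time(700, 0.0, 1000.0))
index_for_line_800 = nearest_sample(sample_times, line_time(800, 0.0, 1000.0))
```

The position and rotation angles stored at the returned index are then used to build the ray for any pixel on that line.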
In modern remote sensing systems, the ephemeris of the aircraft or satellite is precisely determined using onboard GPS, Inertial systems, and other methods such as star field observations (the sensor is turned away from the earth and pointed into space and the locations of known stars are used to locate the satellite using astronomy). Ground based tracking systems add to the location information and make it more precise.
Attitude of remote sensing platforms is usually determined using precisely calibrated and highly accurate gyroscopes, high tech versions of the toy gyroscopes we played with as children (well some of us did anyway). The very latest gyroscopes are extremely small and contain no moving parts, but rather sense rotations of the platform by measuring the phase shift of lasers moving through a ring (“ring laser gyroscopes”). An example of a gyroscope used on satellite imaging systems is shown below.

The Mathematics of Rigorous Photogrammetric stereo feature extraction
The mathematical equations which are used in RemoteView and other stereo feature extraction and targeting systems are shown in the diagram below. If you are not familiar with matrix notation or are not interested in this, you can safely skip this section.

Outline of the least squares solution of space intersection. In this example, there is one target point and 2 images. The process can handle any number of targets and images, but there must be at least 2 images.
The figure above shows the “rigorous” Photogrammetric solution with error propagation used by RemoteView and other stereo extraction and targeting systems. Since the system of equations is not linear, the solution provides corrections to the ground point, and the computations are repeated in an iterative fashion until the changes are considered insignificant.
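The iterative correction loop can be illustrated with a deliberately simplified sketch. It is not the RemoteView algorithm: it assumes a hypothetical frame camera looking straight down with no rotation, weights all measurements equally, and omits error propagation. The names `project`, `jacobian_rows` and `space_intersection` are invented for the example. Each pass linearizes the collinearity equations at the current ground point, solves the 3x3 normal equations for a correction, and stops when the correction is insignificant.

```python
import math

def project(cam, g):
    """Image coordinates of ground point g for a nadir-looking frame camera."""
    (xs, ys, zs), f = cam
    dx, dy, dz = g[0] - xs, g[1] - ys, g[2] - zs
    return (-f * dx / dz, -f * dy / dz)

def jacobian_rows(cam, g):
    """Partial derivatives of (x, y) with respect to the ground coordinates."""
    (xs, ys, zs), f = cam
    dx, dy, dz = g[0] - xs, g[1] - ys, g[2] - zs
    return [[-f / dz, 0.0, f * dx / dz ** 2],
            [0.0, -f / dz, f * dy / dz ** 2]]

def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def solve3(n, t):
    """Solve the 3x3 system n * x = t with Cramer's rule."""
    d = det3(n)
    return [det3([[t[i] if j == k else n[i][j] for j in range(3)]
                  for i in range(3)]) / d for k in range(3)]

def space_intersection(cams, obs, g0, tol=1e-6, max_iter=20):
    """Iteratively refine ground point g0 from image measurements obs."""
    g = list(g0)
    for _ in range(max_iter):
        n = [[0.0] * 3 for _ in range(3)]  # normal matrix J^T J
        t = [0.0] * 3                      # right-hand side J^T r
        for cam, (ox, oy) in zip(cams, obs):
            px, py = project(cam, g)
            for row, res in zip(jacobian_rows(cam, g), (ox - px, oy - py)):
                for i in range(3):
                    t[i] += row[i] * res
                    for j in range(3):
                        n[i][j] += row[i] * row[j]
        delta = solve3(n, t)               # correction to the ground point
        g = [g[i] + delta[i] for i in range(3)]
        if math.sqrt(sum(c * c for c in delta)) < tol:
            break
    return g

# Hypothetical example: two cameras, simulated measurements of a known point.
cams = [((0.0, 0.0, 1000.0), 0.1), ((1000.0, 0.0, 1000.0), 0.1)]
truth = (400.0, 200.0, 50.0)
obs = [project(cam, truth) for cam in cams]
estimate = space_intersection(cams, obs, g0=(500.0, 0.0, 0.0))
```

A rigorous solution would also weight each measurement by its uncertainty and carry the sensor parameter covariances through the adjustment, which is where the error estimates for the ground point come from.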
The error propagation shown above is critical to many applications, such as generating target points for stand-off weapons which are capable of guiding themselves to target locations. It includes both measurement errors, that is, the errors that the user might have made in selecting the image points, and errors in the sensor model parameters, which are used to create the collinearity equations actually being solved here. All sensor model parameters have some amount of uncertainty in them. The ephemeris and attitude discussed above are measured with instruments that have errors in them. These instruments include GPS receivers and gyroscopes for measuring attitude. These errors are available with the sensor model data.
The intersection process takes these measurement errors and propagates them into the error estimate for the ground point. These are then converted into a Circular Error 90% (CE90) and a Linear Error 90% (LE90). These are statistical measures which define a hypothetical cylinder located with its center at the computed ground point. From statistical inference, we can say that about 90% of the correct answers fall within that cylinder. You may ask, what about the other 10%? Statistically speaking, we don’t know anything about the other 10%. It is important both to have the CE90 and LE90 of your data and to use good judgment when you use them. Also, these errors represent the absolute ground point accuracy. If your work requires only relative accuracy, then having poor absolute accuracy might not matter. For example, if you are interested only in the height of an object, from base to top, the absolute error of both points might be 20 meters. But, especially with images taken close together in time and space, the relative accuracy is usually much better than that.
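The conversion from propagated standard deviations to CE90 and LE90 can be sketched as follows. This is a common textbook approximation, not necessarily the formula RemoteView uses: it assumes zero-mean normal errors, uncorrelated horizontal components of similar size, and the function name `ce90_le90` is hypothetical.

```python
import math

# Radius containing 90% of a circular (equal-sigma) 2-D normal error, in sigmas.
CE90_FACTOR = math.sqrt(-2.0 * math.log(0.10))  # about 2.146
# Half-width containing 90% of a 1-D normal error, in sigmas.
LE90_FACTOR = 1.6449

def ce90_le90(sigma_x, sigma_y, sigma_z):
    """Approximate CE90 and LE90 from ground point standard deviations.

    Assumes uncorrelated horizontal errors with similar sigma_x and sigma_y;
    the approximation degrades as the error ellipse becomes elongated.
    """
    sigma_h = math.sqrt((sigma_x ** 2 + sigma_y ** 2) / 2.0)
    return CE90_FACTOR * sigma_h, LE90_FACTOR * sigma_z

# Hypothetical example: 3 m horizontal and 5 m vertical standard deviations.
ce90, le90 = ce90_le90(3.0, 3.0, 5.0)
```

CE90 is the radius of the cylinder described above and LE90 is its half-height, both measured from the computed ground point.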
Sensor Models which provide stereo and multi-image geo location support
In RemoteView, a wide variety of sensor models are used, including (starting at Version 2.8) the Community Sensor Models (CSM) which provide the best available source of information for the parameters required for performing stereo intersection. CSM models include about 15 different tactical and national models which are available to our U.S. Government customers only.
For commercial imagery, rigorous sensor models for QuickBird, Ikonos and Orbview have been developed from documentation made available from the various commercial software companies. These too can be used for stereo feature extraction with error propagation.
Imagery with no physical sensor model but containing Rational Polynomial Coefficients (RPCs) can also be used for stereo feature extraction with error propagation. In this case, the actual sensor model parameters are not known so one is forced to use the given parameters from the RPCs. These parameters define some overall ground accuracies which must then be converted to image space using the other information available in the RPCs.
The full list of sensor models which are supported for this operation and other Photogrammetric operations is shown below.
| Model | Types of Imagery | Operations Supported |  |  |  |  |  | Comments |
|  |  | Basic | Ortho | REG | PP | BBA | ER |  |
| RPC | NITF, TFRD, DPPDB, Commercial | X | X | X | X | X |  | Rational Polynomial Coefficients, approximately 90% of all imagery. |
| Pushbroom | QuickBird, | X | X | X | X | X | X | Rigorous pushbroom models. |
| CSM | Tactical and national | X | X | X | X | X | X | USAF Program, “black box” sensor models with public API. |
| RSM | NITF “Smart Images” | X | X | X | X | X | X | Replacement Sensor Model, calculated by MSP |
| Image America | IA TIFF images | X | X |  |  |  |  |  |