Benchmarking Queries

In addition to the data and benchmarking tool, the queries play a particularly important role in SKYSHARK. We have closely aligned the design of the queries with real-world aviation and airspace surveillance challenges. For the SKYSHARK benchmark, we have formulated a total of 14 queries. These queries progressively increase in complexity and demand more from the system being measured. We have chosen to formulate the queries in SQL or an extension of SQL developed by Apache Calcite. This extension allows for the formulation of stream queries. The goal was to provide a precise and formal description of the data processing to eliminate misunderstandings.

Query Name Focus Latency
1 Identity Test query for implementations Yes
2 JSON Parsing JSON decoding/encoding performance Yes
3 Speed Conversion Arithmetic expressions
Attribute renaming
Attribute rearranging
4 Airframe Tracking Filter with string equals Yes
5 Emergency Squawk Filter wtih boolean expression (integers) Yes
6 Emergency Descent Filter with multiple boolean expressions containing arithmetic expressions Yes
7 Airline Filter Projection of boolean expressions
Floating point comparison
8 Airspace Check Projection of boolean expressions
Floating point comparison
9 Airport Proximity Multiple, complex calculations
Staying within RPU feature limits
10 Airframe Information Join No
11 Airborne Aircraft State Count Aggregation
Time windows
12 OpenSky Latency Aggregation
Tuple count windows with different sized
13 Diversion Airports Join with complex calculations No
14 Flight Schedule Updating Join with computation
Updating RDBMS table

Click here to download query set

Query 1 - Identity

The state JSON string is not parsed and deserialized, only the raw throughput performance of the stream processing framework is tested.

Centered Image
FROM states

Query 2 - JSON Parsing

Every incoming state tuple is decoded, and if output is produced, it is encoded as JSON. Subsequent dataflow graphs will omit the JSON decode and JSON encode steps.

Centered Image
SELECT STREAM id,icao24, time_position, callsign
FROM states

Query 3 - Speed Conversion

Although the metric system is the international standard for most industries, this is not true for aviation, where many imperial units are still in use. For example, an aircraft’s speed is usually measured in knots (kt), the unit of nautical miles per hour. The query convert the aircraft's velocity from meters per second (m/s) to knots. For the conversion, the velocity value is first converted to km/s by multiplication with 3.6. Then, the kilometers are converted to nautical miles. In the past, nautical miles were defined as one minute on the meridian arc, but today the conversion is fixed to 1,852 meters.

Centered Image
SELECT STREAM id,icao24,time_position, last_contact, longitude, latitude, (velocity * 3.6 / 1.852) AS velocity, true_track, vertical_rate, callsign, on_ground,  spi, squawk, baro_altitude, geo_altitude, last_contact
FROM states

Query 4 - Airframe Tracking

ADS-B aircraft tracking information is often used to track the movement of a specific airframe, often private jets with famous owners. This query filters all tuples for a specific icao24 address, only forwarding states for one specific airframe to the data sink.

Centered Image
SELECT STREAM id,icao24, callsign, time_position, longitude, latitude, velocity, true_track, baro_altitude
FROM states
WHERE icao24 like '3cc492'

Query 5 - Emergency Squawk

In an emergency situation, planes can alert air traffic control of an emergency situation by squawking designated emergency codes via their transponder. The code 7500 indicates a hijacking situation, 7600 problems with radio communication, and 7700 a general emergency.

Centered Image
SELECT STREAM id,icao24, callsign, time_position longitude, latitude, baro_altitude
FROM states
WHERE squawk = 7500 OR squawk = 7900 OR squawk = 7700

Query 6 - Emergency Descent

An emergency descent is a rapid, controlled descent that an aircraft may perform in response to certain emergency situations. A prime example is rapid decompression of an aircraft’s cabin, which can be caused by a structural failure, such as a window or door seal breach or damage to the fuselage. In this case, an emergency descent is necessary to bring the aircraft to a lower altitude where passengers and crew can breathe without supplemental oxygen. It usually ends at an altitude of 10,000 ft (3048m), where enough oxygen is available to breathe. Other situations include smoke, fire, or a medical emergency onboard. An emergency descent can be identified by a very high negative vertical speed, sometimes exceeding -7,000 feet per minute (fpm) or 35.56 m/s. As a first condition, the aircraft’s altitude should be higher than 8200ft (2500m). The aircraft must also descent very fast, with a vertical rate exceeding -5,000 fpm (-25.4 m/s). Alternatively, the descent rate must be higher than -2000 fpm (-10.15 m/s) while the transponder squawk code is set to 7600 or 7700.

Centered Image
SELECT STREAM id,icao24, callsign, time_position, longitude, latitude, baro_altitude
FROM states
WHERE baro_altitude > 2500 AND (vertical_rate < -25.4 OR ((squawk = 7600 OR squawk = 7700) AND vertical_rate < -10.15))

Query 7 - Airline Filter

Many airlines have dedicated Flight Operations Centers where they monitor their entire fleet’s status, including aircraft position, weather conditions, maintenance needs, and other relevant data. This query is filtering out all state tuples that do not belong to a particular airline. The ICAO assigns three-letter codes to each airline, which are used in the transponder callsign together with the flight’s flight number. Some example assignments are DLH for Lufthansa, RYR for Ryanair, or QFA for Qantas. The best method of filtering is most likely the icao24 address, as airlines know this information on all their planes. However, we want to focus this query on the RPU’s capability to use string wildcard operations in a boolean expression. Therefore, we check if the callsign attribute string starts with the airline’s ICAO code.

Centered Image
SELECT STREAM id,icao24, callsign, time_position, longitude, latitude, velocity, true_track, baro_altitude
FROM states
WHERE callsign LIKE 'DLH%'

Query 8 - Airspace Check

Airspace is an invisible, three-dimensional area above the surface of the Earth in which aircraft operate, and it is divided into various types. The controlled Delta airspace close to an airport is usually managed by the airport tower, while planes flying at higher altitudes in controlled Charlie or Delta airspace are managed by air traffic controllers in a radar room. Depending on the type of airspace, its lower and upper boundaries may be defined by an altitude above ground level (AGL) or above mean sea level (AMSL). To determine the latitude and longitude, Google Maps can be used.

Centered Image
--upper_lat = 50.7
--lower_lat = 49.2
--left_long = 6.4
--right_long = 10.8
--lower_boundary = 9124.4m (3,000ft AMSL)
--upper_boundary = 3048m (10,000ft AMSL)
SELECT STREAM,states.icao24, states.callsign, states.time_position, (latitude <= upper_lat AND latitude >= lower_lat AND longitude >= left_long AND longitude <= right_long
AND baro_altitude > lower_boundary AND baro_altitude < upper_boundary) AS inside_airspace
FROM states

Query 9 - Airport Proximity

Having knowledge of the planes in the proximity of an airport is critical for the air traffic controller to ensure the safe and efficient operation of the airspace around the airport. Usually, a radar screen would display a map of all known aircraft near the airport. This query calculates the distance from each aircraft’s position to a fixed set of coordinates, representing the airport. Only states representing aircraft that are within a radius of approximately 25 kilometers are emitted to the output. The output state contains all standard variables with the addition of the calculated X and Y coordinates, as well as the calculated distance to the airport.

Centered Image
--airport_latitude = 50.036521
--airport_longitude = 8.561268
SELECT STREAM,position.icao24, position.callsign, position.time_position, position.latitude, position.longitude, position.baro_altitude,
        ((position.airport_latitude_km - (position.state_latitude_km)) * (position.airport_latitude_km - (position.state_latitude_km)))
            + ((airport_longitude_km - (state_longitude_km)) * (airport_longitude_km - (state_longitude_km))) as distance_squared
    SELECT  icao24, callsign, time_position, latitude, longitude, baro_altitude,
        (latitude * 111.139) AS state_latitude_km,
        ((1001.879 / 9) * (0.0129281 * longitude + 1.2) * longitude) AS state_longitude_km,
        (airport_latitude * 111.139) AS airport_latitude_km,
        ((1001.879 / 9) * (0.0129281 * airport_longitude + 1.2) * airport_longitude) AS airport_longitude_km
FROM states
     ) as position
WHERE ((position.airport_latitude_km - (position.state_latitude_km)) * (position.airport_latitude_km - (position.state_latitude_km)))
          + ((airport_longitude_km - (state_longitude_km)) * (airport_longitude_km - (state_longitude_km))) < 625

Query 10 - Airframe Information

Knowing an aircraft’s tail number and type in addition to the icao24 address contained in an ADS-B state can add important context for an air traffic controller or other users. In Data we presented a relational database of airframes collected and published by the OpenSky Network. The query joins every state tuple with its corresponding airframe information, identified by the icao24 address. The focus here is on joining every tuple in a stream table to a relational table.

Centered Image
SELECT STREAM states.*, airframes.registration, airframes.type, airframe.typedesc, airframe.owner
FROM states
JOIN airframes ON states.icao24 = airframes.icao24

Query 11 - Airborn Aircraft State Count

Several groups of people and organizations are interested in the number of airplanes currently in the air, including air traffic control, airlines, researchers, and aviation enthusiasts. The query creates a window for every hour and counts the number of states received, grouped by the icao24 airframe address. The number of result tuples equals the number of airplanes that have communicated with the ground. The focus of this query lies on the ability to create windows in the tuple stream and use it to calculate aggregated counts. The windows are time-based, meaning the number of tuples for each window is not known before it is completed.

Centered Image
SELECT STREAM CEIL(time TO HOURs) as Time, icao24, COUNT(*) as count
FROM states
GROUP BY CEIL(time to HOUR), icao24

Query 12 - OpenSky Latency

The focus of the query lies on the use of windows with different properties and sizes, testing different configurations with each sub-query. We calculate the latency between the state time and the lastcontact timestamps.

Centered Image
SELECT STREAM AVG(states.time_position - states.last_contact) as lc_latency_avg
OVER 1000
FROM states

Query 13 - Diversion Airports

Unexpected situations like severe weather, mechanical issues, or medical emergencies can happen at any point during a flight, which is why pilots need to be aware of airports in their vicinity all the time. This query (Figure 4.15) joins the state tuple stream with airports that are closer than 250km. In addition, the airplane must be airborne (not onground) and the airport type has to be large_airport. Smaller airfields and heliports are not suitable for airliners, making them irrelevant as diversion destinations. The example SQL in Listing 27 is not a complete implementation, but rather just a guideline. The distance calculations are more complex, but identical to the method discussed in Section 4.1.8. This query focuses on complex calculations as join conditions.

Centered Image
SELECT STREAM icao24, callsign, time_position, ident as airport, ((airports_latitude_km - (state_latitude_km)) * (airports_latitude_km - (state_latitude_km)) )
       + ((airports_longitude_km - (states_longitude_km)) * (airports_longitude_km - (states_longitude_km))) as distance_to_airport_squared
    SELECT  states.icao24, states.callsign, states.time_position,
            (states.latitude * 111.139) AS state_latitude_km,
            ((1001.879 / 9)*(0.0129281 * states.longitude + 1.2) * states.longitude) AS states_longitude_km
    FROM states
    WHERE NOT states.on_ground
     ) as state_position
    SELECT  airports.ident,airports.type,
            (airports.latitude * 111.139) as airports_latitude_km,
            ((1001.879 / 9)*(0.0129281 * airports.longitude + 1.2) * airports.longitude) AS airports_longitude_km
    FROM airports
    WHERE airports.type like 'large_airport'
) as airport_position
ON ((airports_latitude_km - (state_latitude_km)) * (airports_latitude_km - (state_latitude_km)) )
       + ((airports_longitude_km - (states_longitude_km)) * (airports_longitude_km - (states_longitude_km))) < 62500

Query 14 - Flight Schedule Updating

In this query, we join the state tuple stream with two relational tables to calculate the current status for each flight, and use this information to update the dynamic fields in the flight schedule table. We calculate the time in seconds the flight will approximately need to its destination and select the current flight phase.

Centered Image

First we find the corresponding schedule for the state's callsign by joining with the flightschedules table. Next, we use the arr_airport attribute to hoin with the airports table, to gather the airports position information. After this information is loaded, the latitude and longitude coordinates for both the airplane and arrical airport are converted from degrees to meters according to the formulas presented in previous queries. The next step is to calculate the distance. Using the distance we now determine the time to destination. Also contained in the script is the logic to determine the current flight phase.

Centered Image
SELECT STREAM icao24, latitude, longitude, velocity, vertical_rate, callsign, on_ground, baro_altitude, airports.latitude, airports.longitude
FROM states
JOIN schedules ON states.callsign = schedules.callsign
JOIN airports ON schedules.destination = airports.ident
distance = abs(state_lat_m - destination_airport_m) + abs(state_long_m - destination_airports_long_m)
time_to_destination = distance / states.velocity
flight_phase = ON_GROUND
    flight_phase = CRUISE
    if (states.vertical_rate > 2) {
        flight_phase = DEPARTURE
    } else {
        flight_phase = CLIMB
    if (states.vertical_rate < 2){
        if (states.baro_altitude < 2000){
            flight_phase = APPROACH
        } else {
            flight_phase = DESCENT
UPDATE schedules
    current_position_lat = states.latitude,
    current_position_long = states.longitude,
    time_to_destination = time_to_destination,
    flight_phase = flight_phase
WHERE schedules.callsign = states.callsign