What’s up? A lot of airplanes!

I lived for several years in the glide path to LaGuardia Airport watching aircraft descend onto Runway 04L. I’ve moved out to Long Island and can still see lots of air traffic overhead, sometimes aligned coming from or going to KJFK or KLGA, and sometimes way overhead making the journey to or from Europe.

All of these aircraft broadcast flight data that we can listen in on and examine to learn the call sign and current track or heading, airspeed, and position. It’s called Automatic Dependent Surveillance – Broadcast (ADS-B) and it all adds up to a lot of Big Data! This demo collects over-the-air digital data from aircraft into a Vertica database, where it’s collated and rendered into maps and charts.

View the map and statistics – 25+ million messages daily from 2500+ aircraft! – at https://www.woodsidelabs.com/whatsup/

That’s interesting, but how does it apply to me?

Are you working with Big Data from sources other than airplanes? This demo shows how Vertica can ingest and analyze large amounts of data from any source. Think how each component can be generalized: the “aircraft” could be an IOT sensor or a financial transaction. The radio feed could be sensor data or FIX messages. The analysis could be patterns in the sensor readings across a geographic region or trends across many quotes and transactions in market data. Vertica can handle many use cases from many different data sources and support decision making ranging from maps and charts of past to present data to machine learning and prediction of future movement. That’s the real power and value of Big Data beyond a clever air traffic demo!

The hardware setup

Signals are received by a FlightAware UHF antenna atop a homemade 15-foot PVC mast. The UHF antenna feeds into a RTL-SDR v3 USB radio that captures the digital radio signal and sends it down a long USB repeater cable to a Raspberry Pi 3B. The Raspberry Pi runs dump1090-fa to convert the radio packets into CSV rows that are streamed into a Vertica database running on a Linux desktop computer.

The software setup

ADS-B messages contain bits of telemetry data. dump1090-fa outputs ADS-B messages as CSV rows of a standard field count, which is straightforward to pull into a database. You could feed these messages to a message queue or broker such as ActiveMQ or Kafka for more fault tolerance and flexibility, but I wrote a one-line streamer loader to push batches of rows to Vertica using NetCat to read the socket, split to chunk the feed into 10K line batches, and vsql to copy from stdin:

nc myraspi 30003 | split -u --filter="/opt/vertica/bin/vsql -U dbadmin -w mypasswd -i -c \"COPY dump1090csv FROM LOCAL STDIN DELIMITER ','\"" --lines=10000 - 1090

For a mapping demo we need MSG types 1,3,4, which are Identification and Category, Airborne Position Message, Airborne Velocity Message. Alas, we need to combine all three messages from a given aircraft to determine where to plot it, what the label should be (if any, as callsign can be empty!), and what direction it should be facing. Fortunately Vertica offers analytic window functions, so we can group by transponder hardware ID (hex_ident) field and pick the most recent callsign, postion, heading values and plot those! (If callsign is empty, I show the transponder hex ID)

The map is an OpenLayer JS with OpenStreetMap data that plots the flight data for aircraft observed in the last 3 minutes with three major airports (KEWR, KJFK, KLGA) marked by red dots and my approximate antenna location. If aircraft appear to be bunched together, it’s because the map updates every 2-3 minutes on a cron job, so aircraft seemingly occupying the same space probably passed that point several minutes apart – this is particularly obvious where aircraft line up to land, since 3-4 aircraft may have landed in the 3-minute analytic window.