
Part 2: Landing Zones — Real Data Ingestion

In Part 1, you used a VALUES clause to generate sample data. That works for learning, but real platforms ingest data from files. In this part, you will upload real CSV files through landing zones and build pipelines that read from them.

Time required: ~15 minutes

Prerequisites: Completed Part 1


What You Will Build

Two Bronze pipelines that ingest real space launch data:

  • mission_log — reads 25 launches from space_launches.csv
  • rocket_registry — reads 11 launch vehicles from launch_vehicles.csv

Both pipelines will read from landing zones — RAT’s mechanism for bringing external files into the platform.


What Are Landing Zones?

A landing zone is a named drop area where you upload files (CSV, Parquet, JSON). Pipelines reference landing zones using the landing_zone() Jinja function, and RAT resolves the path to the actual files in S3.

Landing zones are useful because:

  • They decouple file upload from pipeline execution
  • Files can be uploaded via the portal UI, curl, or S3 API
  • You can upload sample files for preview (so you can test before running)
  • Landing zone uploads can trigger pipelines automatically (Part 7)
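To make the resolution step concrete, here is a minimal Python sketch of what the template expansion might look like. The bucket layout (`s3://rat-landing/...`) and the resolver function are illustrative assumptions, not RAT's actual implementation — RAT resolves the real path internally at execution time.

```python
import re

# Hypothetical bucket layout -- the real S3 path is resolved by RAT at run time.
LANDING_BUCKET = "s3://rat-landing"

def landing_zone(name: str) -> str:
    """Resolve a landing zone name to a (hypothetical) S3 glob over its files."""
    return f"{LANDING_BUCKET}/{name}/*"

def render(template: str) -> str:
    """Tiny stand-in for Jinja: expands {{ landing_zone("...") }} calls only."""
    pattern = r'\{\{\s*landing_zone\(\s*"([^"]+)"\s*\)\s*\}\}'
    return re.sub(pattern, lambda m: landing_zone(m.group(1)), template)

sql = 'SELECT * FROM read_csv_auto(\'{{ landing_zone("launch-data") }}\')'
print(render(sql))
# → SELECT * FROM read_csv_auto('s3://rat-landing/launch-data/*')
```

The point is the decoupling: the pipeline SQL names the landing zone, and the platform substitutes the storage path, so uploads and pipeline code never hard-code S3 locations.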

Create the Launch Data Landing Zone

Click Landing in the left sidebar. This page lists all landing zones. It is empty since you have not created any yet.

Landing Zones Page

Create a new landing zone

Click the + New Landing Zone button. Fill in:

Field   Value
Name    launch-data

Click Create.

Landing zone names are kebab-case identifiers. They are referenced in pipeline SQL as {{ landing_zone("launch-data") }}, which resolves to the S3 path where the files are stored.

Upload the CSV file

You need the space_launches.csv file from the repository. It is located at docs/data/space_launches.csv (25 rows of real space launch data).
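Before uploading, it can be worth sanity-checking the file locally — a schema mismatch is easier to spot here than in a failed pipeline run. This optional sketch uses Python's standard csv module; the inline rows are stand-ins for the real file's contents, but the column names match the ones the pipeline will SELECT.

```python
import csv
import io

# Stand-in for docs/data/space_launches.csv -- two illustrative rows only.
sample = """launch_id,mission_name,launch_date,vehicle,launch_site,country,orbit,outcome,payload_mass_kg,mission_type
L001,Starlink Group 6-1,2023-02-17,Falcon 9,Cape Canaveral,USA,LEO,success,17400,communications
L002,Crew-6,2023-03-02,Falcon 9,Kennedy,USA,LEO,success,12055,crewed
"""

reader = csv.DictReader(io.StringIO(sample))
rows = list(reader)

# The pipeline SELECTs these ten columns; make sure the header provides them.
expected = {"launch_id", "mission_name", "launch_date", "vehicle",
            "launch_site", "country", "orbit", "outcome",
            "payload_mass_kg", "mission_type"}
missing = expected - set(rows[0].keys())
print(f"{len(rows)} data rows, missing columns: {missing or 'none'}")
```

To check the real file, replace `io.StringIO(sample)` with `open("docs/data/space_launches.csv")`.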

Option A: Upload via the portal

Open the landing zone detail page and use the drag-and-drop upload area. Drag space_launches.csv into the upload zone.

Upload via drag-and-drop

Option B: Upload via curl

Terminal
curl -X POST http://localhost:8080/api/v1/landing-zones/launch-data/files \
  -F "file=@docs/data/space_launches.csv"

Upload a sample file

For preview to work, RAT needs a sample file in the _samples/ directory of the landing zone. Upload the same CSV as a sample:

In the landing zone detail page, scroll to the Samples section and upload space_launches.csv there.

Sample files are used by the preview feature. When you press Ctrl+Shift+Enter in the editor, RAT uses the sample files (not the real data) so preview stays fast and does not require a full pipeline run.

Verify the upload

The landing zone detail page should now show your uploaded file with its size and timestamp.

Landing Zone Detail


Create the Mission Log Pipeline

Create the pipeline

Go to Pipelines → + New Pipeline:

Field       Value
Namespace   default
Layer       bronze
Name        mission_log
Type        sql

Write the SQL

pipeline.sql
-- @description: Space launch mission data from landing zone
 
SELECT
    launch_id,
    mission_name,
    launch_date,
    vehicle,
    launch_site,
    country,
    orbit,
    outcome,
    payload_mass_kg,
    mission_type
FROM read_csv_auto('{{ landing_zone("launch-data") }}')

The key here is {{ landing_zone("launch-data") }}. This is a Jinja template function that resolves to the S3 path of your landing zone at execution time. DuckDB’s read_csv_auto function reads the CSV and auto-detects the schema.

The landing_zone() function is one of RAT’s Jinja template helpers. Others include ref() (reference another table), this (current table name), is_incremental(), and watermark_value. You will use these throughout the tutorial.
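As a rough sketch of how those helpers read in context — the table names, the exact argument shape of ref(), and the incremental filter below are illustrative assumptions; ref() and incremental pipelines are covered properly later in the tutorial:

```sql
-- Illustrative only: several Jinja helpers in one hypothetical pipeline.
SELECT m.launch_id, r.manufacturer
FROM {{ ref("mission_log") }} AS m
JOIN {{ ref("rocket_registry") }} AS r
  ON m.vehicle = r.vehicle_name
{% if is_incremental() %}
WHERE m.launch_date > '{{ watermark_value }}'
{% endif %}
```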

Preview

Press Ctrl+Shift+Enter to preview. You should see all 25 rows from the CSV:

launch_id  mission_name        launch_date  vehicle   country  orbit  outcome  payload_mass_kg
L001       Starlink Group 6-1  2023-02-17   Falcon 9  USA      LEO    success  17400
L002       Crew-6              2023-03-02   Falcon 9  USA      LEO    success  12055
…
L025       Crew-9              2024-09-28   Falcon 9  USA      LEO    success  12055

Save, Publish, and Run

  1. Save — Ctrl+S
  2. Publish — click the Publish button
  3. Run — click the Run button (play icon) and confirm

Watch the run logs. You should see:

[INFO]  Starting pipeline: default.bronze.mission_log
[INFO]  Creating branch: run-xyz789
[INFO]  Executing SQL pipeline...
[INFO]  Query produced 25 rows
[INFO]  Writing to Iceberg table: default.bronze.mission_log
[INFO]  Merging branch: run-xyz789 → main
[INFO]  Pipeline completed successfully in 1.8s

Verify in Query Console

Navigate to Query and run:

Query Console
SELECT COUNT(*) AS total, COUNT(DISTINCT country) AS countries
FROM "default"."bronze"."mission_log"

Result: 25 total launches from 5 countries (USA, France, Russia, India, Japan).
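To see the breakdown behind that count, a GROUP BY in the same console works (the per-country totals are not listed in this tutorial, so none are shown here):

```sql
SELECT country, COUNT(*) AS launches
FROM "default"."bronze"."mission_log"
GROUP BY country
ORDER BY launches DESC
```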


Create the Vehicle Catalog Landing Zone

Now repeat the process for the launch vehicles data.

Create the landing zone

Go to Landing → + New Landing Zone:

Field   Value
Name    vehicle-catalog

Upload the CSV

Upload docs/data/launch_vehicles.csv (11 rows) — both as a data file and as a sample.

Drag launch_vehicles.csv into the upload area, then upload it again in the Samples section.


Create the Rocket Registry Pipeline

Create the pipeline

Go to Pipelines → + New Pipeline:

Field       Value
Namespace   default
Layer       bronze
Name        rocket_registry
Type        sql

Write the SQL

pipeline.sql
-- @description: Launch vehicle specifications from vendor catalog
 
SELECT
    vehicle_name,
    manufacturer,
    country,
    height_m,
    diameter_m,
    mass_kg,
    thrust_kn,
    stages,
    first_flight,
    active
FROM read_csv_auto('{{ landing_zone("vehicle-catalog") }}')

Preview, Save, Publish, Run

  1. Preview — verify you see 11 rows of rocket data
  2. Save (Ctrl+S) → Publish → Run

Verify

Query Console
SELECT vehicle_name, manufacturer, country, active
FROM "default"."bronze"."rocket_registry"
ORDER BY vehicle_name

vehicle_name   manufacturer  country  active
Ariane 5       ArianeGroup   France   false
Ariane 6       ArianeGroup   France   true
Falcon 9       SpaceX        USA      true
Falcon Heavy   SpaceX        USA      true
…

Check Your Progress

At this point, you should have:

Pipeline         Layer   Source                        Rows
raw_launches     bronze  VALUES clause (Part 1)        5
mission_log      bronze  Landing zone launch-data      25
rocket_registry  bronze  Landing zone vehicle-catalog  11

Navigate to Explorer to see all three tables in the tree:

default/
  bronze/
    mission_log (25 rows)
    raw_launches (5 rows)
    rocket_registry (11 rows)

You can also check Runs to see the history of all three pipeline executions.


What You Learned

  • Landing zones are named file drop areas for ingesting external data
  • {{ landing_zone("name") }} resolves to the S3 path at execution time
  • DuckDB’s read_csv_auto() handles CSV parsing and schema detection
  • Sample files enable preview without a full pipeline run
  • Files can be uploaded via the portal UI or the REST API (curl)

Next Steps

You now have two Bronze tables with real data — launches and vehicles. In Part 3, you will create a Silver pipeline that joins them together using ref(), and you will see the lineage DAG that visualizes how your pipelines connect.