
Lineage

The lineage endpoint returns the full data lineage graph (DAG) for your platform, showing how pipelines, tables, and landing zones are connected. The portal uses this data to render the interactive DAG visualization.


Endpoints

| Method | Endpoint | Description |
| --- | --- | --- |
| GET | /api/v1/lineage | Get the lineage graph |

Get Lineage Graph

GET /api/v1/lineage

Builds and returns the full lineage graph by:

  1. Listing all pipelines
  2. Reading pipeline SQL/Python files in parallel (bounded to 20 concurrent reads) to extract ref() and landing_zone() dependencies
  3. Batch-fetching latest runs, quality test counts, tables, and landing zones in parallel
  4. Constructing the DAG with nodes and edges
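
Step 2 bounds the number of concurrent file reads. The sketch below illustrates that pattern with a thread pool capped at 20 workers. It is an illustration under assumptions, not the server's code: the function name `read_sources` and the use of local file paths are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

MAX_CONCURRENT_READS = 20  # the bound stated in step 2


def read_sources(paths):
    """Read every file in `paths`, with at most MAX_CONCURRENT_READS in flight.

    Returns a dict mapping each path (as a string) to its text content.
    `read_sources` is a hypothetical helper, not part of the API.
    """
    def read_one(path):
        return str(path), Path(path).read_text(encoding="utf-8")

    # The pool size is what bounds concurrency: extra paths wait in the queue.
    with ThreadPoolExecutor(max_workers=MAX_CONCURRENT_READS) as pool:
        return dict(pool.map(read_one, paths))
```

A fixed pool keeps file-handle and I/O pressure predictable no matter how many pipelines exist.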

Query Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| namespace | string | | Filter to a single namespace (optional) |

Request

curl "http://localhost:8080/api/v1/lineage?namespace=default"
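
The same request can be made from Python using only the standard library. This is a minimal sketch: the base URL is taken from the curl example, while the helper names `lineage_url` and `fetch_lineage` and the 30-second timeout are assumptions made for illustration.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:8080"  # host and port from the curl example


def lineage_url(namespace=None, base_url=BASE_URL):
    """Build the lineage endpoint URL, adding the optional namespace filter."""
    url = f"{base_url}/api/v1/lineage"
    if namespace is not None:
        url += "?" + urllib.parse.urlencode({"namespace": namespace})
    return url


def fetch_lineage(namespace=None, base_url=BASE_URL, timeout=30):
    """GET the lineage graph and decode the JSON body into a dict."""
    with urllib.request.urlopen(lineage_url(namespace, base_url), timeout=timeout) as resp:
        return json.load(resp)
```

Calling `fetch_lineage("default")` against a running instance should return a dict with `nodes` and `edges` keys, as in the response shown next.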

Response — 200 OK

{
  "nodes": [
    {
      "id": "pipeline:default.silver.orders",
      "type": "pipeline",
      "namespace": "default",
      "layer": "silver",
      "name": "orders",
      "latest_run": {
        "id": "run-uuid",
        "status": "success",
        "started_at": "2026-02-12T14:00:00Z",
        "duration_ms": 4500
      },
      "quality": {
        "total": 3,
        "passed": 3,
        "failed": 0,
        "warned": 0
      }
    },
    {
      "id": "table:default.silver.orders",
      "type": "table",
      "namespace": "default",
      "layer": "silver",
      "name": "orders",
      "table_stats": {
        "row_count": 12340,
        "size_bytes": 524288
      }
    },
    {
      "id": "table:default.bronze.raw_orders",
      "type": "table",
      "namespace": "default",
      "layer": "bronze",
      "name": "raw_orders",
      "table_stats": {
        "row_count": 15000,
        "size_bytes": 800000
      }
    },
    {
      "id": "landing:default.raw-uploads",
      "type": "landing_zone",
      "namespace": "default",
      "name": "raw-uploads",
      "landing_info": {
        "file_count": 5
      }
    }
  ],
  "edges": [
    {
      "source": "table:default.bronze.raw_orders",
      "target": "pipeline:default.silver.orders",
      "type": "ref"
    },
    {
      "source": "pipeline:default.silver.orders",
      "target": "table:default.silver.orders",
      "type": "produces"
    },
    {
      "source": "landing:default.raw-uploads",
      "target": "pipeline:default.bronze.ingest",
      "type": "landing_input"
    }
  ]
}

Node Types

| Type | ID Format | Description |
| --- | --- | --- |
| pipeline | pipeline:{namespace}.{layer}.{name} | A data pipeline |
| table | table:{namespace}.{layer}.{name} | An Iceberg table in the data lake |
| landing_zone | landing:{namespace}.{name} | A landing zone for file uploads |
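
Clients sometimes need to split a node ID back into its parts. The sketch below parses the three ID formats listed above. It assumes that namespaces and layers never contain a dot (the reference does not say either way), and `NodeRef` and `parse_node_id` are hypothetical names.

```python
from typing import NamedTuple, Optional


class NodeRef(NamedTuple):
    kind: str             # "pipeline", "table", or "landing_zone"
    namespace: str
    layer: Optional[str]  # None for landing zones, which have no layer
    name: str


def parse_node_id(node_id):
    """Split a lineage node ID into its components.

    Assumes namespace and layer contain no dots; the name may.
    """
    prefix, _, rest = node_id.partition(":")
    if prefix in ("pipeline", "table"):
        namespace, layer, name = rest.split(".", 2)
        return NodeRef(prefix, namespace, layer, name)
    if prefix == "landing":
        namespace, name = rest.split(".", 1)
        return NodeRef("landing_zone", namespace, None, name)
    raise ValueError(f"unrecognized node ID: {node_id!r}")
```

Note that the ID prefix for landing zones is `landing`, while the node's `type` field is `landing_zone`; the parser maps one to the other.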

Pipeline Node Fields

| Field | Type | Description |
| --- | --- | --- |
| id | string | Node identifier |
| type | string | Always pipeline |
| namespace | string | Namespace |
| layer | string | Data layer |
| name | string | Pipeline name |
| latest_run | object or null | Most recent run summary |
| latest_run.id | string | Run UUID |
| latest_run.status | string | Run status |
| latest_run.started_at | string | ISO 8601 start timestamp |
| latest_run.duration_ms | integer | Execution duration in milliseconds |
| quality | object or null | Quality test summary |
| quality.total | integer | Total number of quality tests |
| quality.passed | integer | Number of passing tests |
| quality.failed | integer | Number of failing tests |
| quality.warned | integer | Number of tests with warnings |
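
Because `latest_run` and `quality` can both be null, client code should guard against missing values before reading nested fields. The following sketch flags pipelines that may need attention. It assumes that `"success"` is the only healthy status (the only status value shown in this reference) and that a null `latest_run` means no run has been recorded; `pipelines_needing_attention` is a hypothetical helper.

```python
def pipelines_needing_attention(nodes):
    """Return (node_id, reasons) pairs for pipelines that look unhealthy.

    Assumption: any status other than "success" is treated as a problem.
    """
    flagged = []
    for node in nodes:
        if node.get("type") != "pipeline":
            continue
        run = node.get("latest_run")    # may be None
        quality = node.get("quality")   # may be None
        reasons = []
        if run is None:
            reasons.append("no run recorded")
        elif run.get("status") != "success":
            reasons.append(f"last run status: {run.get('status')}")
        if quality is not None and quality.get("failed", 0) > 0:
            reasons.append(f"{quality['failed']} failing quality test(s)")
        if reasons:
            flagged.append((node["id"], reasons))
    return flagged
```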

Table Node Fields

| Field | Type | Description |
| --- | --- | --- |
| id | string | Node identifier |
| type | string | Always table |
| namespace | string | Namespace |
| layer | string | Data layer |
| name | string | Table name |
| table_stats | object or null | Table statistics |
| table_stats.row_count | integer | Number of rows |
| table_stats.size_bytes | integer | Table size in bytes |

Landing Zone Node Fields

| Field | Type | Description |
| --- | --- | --- |
| id | string | Node identifier |
| type | string | Always landing_zone |
| namespace | string | Namespace |
| name | string | Zone name |
| landing_info | object | Landing zone stats |
| landing_info.file_count | integer | Number of files in the zone |

Edge Types

| Type | Source | Target | Description |
| --- | --- | --- | --- |
| ref | table | pipeline | Table is read by a pipeline (via ref() in SQL/Python) |
| produces | pipeline | table | Pipeline writes to a table (convention: same ns.layer.name) |
| landing_input | landing_zone | pipeline | Landing zone feeds a pipeline (via landing_zone() in SQL/Python) |

Edge Fields

| Field | Type | Description |
| --- | --- | --- |
| source | string | Source node ID |
| target | string | Target node ID |
| type | string | Edge type: ref, produces, or landing_input |
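
Since every edge points from an input to its consumer, walking edges backwards from a node yields everything it depends on. The sketch below does this with a breadth-first search over a decoded response; `upstream` is a hypothetical helper name.

```python
from collections import defaultdict, deque


def upstream(graph, node_id):
    """Return the IDs of all nodes that `node_id` transitively depends on."""
    # Index edges by target so each node's direct inputs are a dict lookup.
    parents = defaultdict(list)
    for edge in graph["edges"]:
        parents[edge["target"]].append(edge["source"])

    seen = set()
    queue = deque([node_id])
    while queue:
        current = queue.popleft()
        for parent in parents.get(current, []):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen
```

With the example response above, `upstream(graph, "table:default.silver.orders")` contains `pipeline:default.silver.orders` and `table:default.bronze.raw_orders`.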

Orphan Nodes

Orphan tables (not produced by any pipeline) and orphan landing zones (not referenced by any pipeline) are included as disconnected nodes in the graph. These appear as isolated nodes in the portal’s DAG visualization.
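
A client can identify orphans directly from a response by applying the two definitions above. This sketch treats a table as orphaned when no `produces` edge targets it, and a landing zone as orphaned when no `landing_input` edge starts from it; `find_orphans` is a hypothetical helper.

```python
def find_orphans(graph):
    """Return (orphan_table_ids, orphan_landing_zone_ids) for a lineage graph."""
    edges = graph["edges"]
    produced = {e["target"] for e in edges if e["type"] == "produces"}
    referenced_zones = {e["source"] for e in edges if e["type"] == "landing_input"}

    orphan_tables = [
        n["id"] for n in graph["nodes"]
        if n["type"] == "table" and n["id"] not in produced
    ]
    orphan_zones = [
        n["id"] for n in graph["nodes"]
        if n["type"] == "landing_zone" and n["id"] not in referenced_zones
    ]
    return orphan_tables, orphan_zones
```

Under this definition a table can be an orphan and still have outgoing `ref` edges: in the example response, `table:default.bronze.raw_orders` has no producer but is read by a pipeline.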

The lineage graph is computed on the fly from the current state of pipelines and their source files. It is not cached: each request reads all pipeline files and rebuilds the graph from scratch. For large deployments, the namespace filter can significantly reduce response time by limiting the scope of the graph.


How Dependencies Are Detected

Dependencies are extracted by parsing pipeline source files:

| Function | Creates Edge | Example |
| --- | --- | --- |
| ref('layer.name') | ref edge from table:{ns}.{layer}.{name} to the pipeline | {{ ref('bronze.raw_orders') }} |
| ref('ns.layer.name') | ref edge from table:{ns}.{layer}.{name} to the pipeline | {{ ref('shared.silver.customers') }} |
| landing_zone('name') | landing_input edge from landing:{ns}.{name} to the pipeline | {{ landing_zone('raw-uploads') }} |
| (implicit) | produces edge from the pipeline to table:{ns}.{layer}.{pipeline_name} | Every pipeline produces a table with the same name |
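
The rules in this table can be approximated with two regular expressions. The sketch below is an illustration of the mapping, not the server's actual parser: the patterns, the handling of quotes and whitespace, and the name `extract_edges` are all assumptions.

```python
import re

# Match ref('...') and landing_zone('...') with single or double quotes.
REF_RE = re.compile(r"""\bref\(\s*['"]([^'"]+)['"]\s*\)""")
LANDING_RE = re.compile(r"""\blanding_zone\(\s*['"]([^'"]+)['"]\s*\)""")


def extract_edges(source, namespace, layer, name):
    """Derive lineage edges for one pipeline from its source text."""
    pipeline_id = f"pipeline:{namespace}.{layer}.{name}"
    edges = []

    for target in REF_RE.findall(source):
        parts = target.split(".")
        if len(parts) == 2:
            # Two-part refs resolve against the pipeline's own namespace.
            parts = [namespace, *parts]
        if len(parts) != 3:
            continue  # skip anything that is not layer.name or ns.layer.name
        edges.append({"source": "table:" + ".".join(parts),
                      "target": pipeline_id, "type": "ref"})

    for zone in LANDING_RE.findall(source):
        edges.append({"source": f"landing:{namespace}.{zone}",
                      "target": pipeline_id, "type": "landing_input"})

    # Implicit edge: every pipeline produces a table with its own name.
    edges.append({"source": pipeline_id,
                  "target": f"table:{namespace}.{layer}.{name}",
                  "type": "produces"})
    return edges
```

A regex approach like this would also match calls inside comments or string literals; whether the real implementation does so is not stated in this reference.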

Cross-namespace references (3-part ref()) create edges that span namespaces, even when the lineage is filtered to a single namespace. The portal renders these as cross-boundary connections.
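
To find these cross-boundary connections in a response, compare the namespace segment of each edge's endpoints. This sketch assumes namespaces contain no dots or colons, so the namespace is the text between the first colon and the first following dot; both helper names are hypothetical.

```python
def namespace_of(node_id):
    """Extract the namespace segment from a node ID such as 'table:ns.layer.name'."""
    return node_id.split(":", 1)[1].split(".", 1)[0]


def cross_namespace_edges(graph):
    """Return the edges whose source and target live in different namespaces."""
    return [
        edge for edge in graph["edges"]
        if namespace_of(edge["source"]) != namespace_of(edge["target"])
    ]
```

When a response is filtered to one namespace, the far endpoint of such an edge may not appear in `nodes`, so clients should not assume every edge endpoint resolves to a node in the same response.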