AWS Data Pipeline

★ Only Publicly Available OpenAPI Document | Analytics | Data Pipelines | AWS Signature v4 (HMAC) | 19 Endpoints | REST

For Agents

Manage AWS Data Pipeline definitions, activate and deactivate pipelines, and coordinate task runners.

Quickstart

Get started with AWS Data Pipeline in minutes using your preferred integration method.

# Add to your MCP client config (Claude Desktop, Cursor, Windsurf)
{
  "jentic": {
    "url": "https://api.jentic.com/mcp",
    "auth": "oauth"
  }
}

# Then ask your agent:
"validate and activate a Data Pipeline"

# → Jentic returns the matching AWS Data Pipeline tools (ValidatePipelineDefinition, ActivatePipeline) with their parameter schemas, and the agent executes them.

Capabilities

What an agent can do with AWS Data Pipeline API.

Create pipelines and put their definitions with PutPipelineDefinition

Activate, deactivate, and delete pipelines to control execution

Validate a pipeline definition before activation

Poll for and report on tasks executed by task runners

List pipelines and describe the status of objects within them

Use for: "I need to activate an existing data pipeline"; "I want to validate a new pipeline definition before activation"; "List all pipelines in this account"; "Describe the status of the objects in a running pipeline"

Not supported: Does not handle modern serverless orchestration or Spark-based ETL - use AWS Data Pipeline only for managing existing legacy pipelines; use Step Functions or Glue for new work.

Jentic publishes the only available OpenAPI specification for AWS Data Pipeline, keeping it validated and agent-ready. AWS Data Pipeline configures and runs data-driven workflows that move and transform data between AWS services and on-premises sources. The API manages pipeline definitions, activations, task runners, and validation. AWS announced Data Pipeline as legacy in 2023 and recommends Step Functions, Glue Workflows, or MWAA for new work, but the API is still available for existing pipelines.

Use Cases

Patterns agents use AWS Data Pipeline API for, with concrete tasks.

★ Maintaining a Legacy ETL Pipeline

Many organizations still operate AWS Data Pipeline workflows that move data between S3, RDS, Redshift, and on-premises sources on a schedule. The API allows operators to put updated definitions, activate or deactivate runs, and inspect status without using the deprecated console.

Call PutPipelineDefinition with the updated pipelineObjects, check that the response is not errored, then call ActivatePipeline to start the next run.
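The put-then-activate sequence can be sketched as follows. This is a minimal illustration, not Jentic's or AWS's implementation: it assumes a client object exposing boto3-style `put_pipeline_definition` and `activate_pipeline` methods, and the helper name `update_and_activate` is hypothetical.

```python
from __future__ import annotations

from typing import Any, Protocol


class PipelineClient(Protocol):
    """Minimal surface this sketch needs (assumed to match a boto3-style client)."""

    def put_pipeline_definition(self, *, pipelineId: str, pipelineObjects: list) -> dict: ...

    def activate_pipeline(self, *, pipelineId: str) -> dict: ...


def update_and_activate(client: PipelineClient, pipeline_id: str,
                        pipeline_objects: list[dict[str, Any]]) -> bool:
    """Upload a definition and activate only if the service accepted it."""
    result = client.put_pipeline_definition(
        pipelineId=pipeline_id, pipelineObjects=pipeline_objects
    )
    # PutPipelineDefinition reports definition problems in the response body,
    # so check the 'errored' flag before activating anything.
    if result.get("errored"):
        return False
    client.activate_pipeline(pipelineId=pipeline_id)
    return True
```

Returning a boolean keeps the caller in charge of what to do with a rejected definition, for example surfacing the validation errors to an operator instead of retrying.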

Pipeline Validation Before Promotion

Before promoting a Data Pipeline definition from staging to production, teams validate that all referenced fields, schedules, and resources are well-formed. ValidatePipelineDefinition returns errors and warnings without activating the pipeline, which prevents broken changes from running.

Call ValidatePipelineDefinition with the candidate pipelineObjects and return the validationErrors and validationWarnings arrays for review.
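One way to turn a validation response into a promotion decision is sketched below. It assumes the response carries `errored`, `validationErrors` (entries with `id` and `errors`) and `validationWarnings` (entries with `id` and `warnings`); the function names are hypothetical.

```python
from __future__ import annotations


def validation_problems(response: dict) -> tuple[list[str], list[str]]:
    """Flatten per-object errors and warnings into 'objectId: message' strings."""
    errors = [
        f"{item.get('id', '?')}: {message}"
        for item in response.get("validationErrors", [])
        for message in item.get("errors", [])
    ]
    warnings = [
        f"{item.get('id', '?')}: {message}"
        for item in response.get("validationWarnings", [])
        for message in item.get("warnings", [])
    ]
    return errors, warnings


def safe_to_promote(response: dict) -> bool:
    """Fail closed: a missing 'errored' flag is treated as a failure."""
    errors, _ = validation_problems(response)
    return response.get("errored", True) is False and not errors
```

Warnings do not block promotion in this sketch; a stricter team could also require the warnings list to be empty.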

Custom Task Runner Coordination

Some teams run Data Pipeline activities on their own EC2 instances or on-premises hosts via the task runner protocol. PollForTask, ReportTaskProgress, and SetTaskStatus let custom task runners pull work, report heartbeat progress, and report success or failure.

Call PollForTask with a workerGroup, execute the returned activity, then call SetTaskStatus with a taskStatus of FINISHED.
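A single iteration of a custom task runner might look like the sketch below. It assumes a client with boto3-style `poll_for_task` and `set_task_status` methods and a caller-supplied `execute` callback; `run_one_task` is a hypothetical name, and progress heartbeats via ReportTaskProgress are left out for brevity.

```python
from typing import Callable, Optional


def run_one_task(client, worker_group: str,
                 execute: Callable[[dict], None]) -> Optional[str]:
    """Poll once, run the task if one was assigned, and report the outcome.

    Returns the reported status, or None when no task was available.
    """
    response = client.poll_for_task(workerGroup=worker_group)
    task = response.get("taskObject")
    if not task or "taskId" not in task:
        return None  # nothing to do; the caller should back off before polling again

    try:
        execute(task)
    except Exception as exc:
        # Report the failure so the service can apply the pipeline's retry policy.
        client.set_task_status(
            taskId=task["taskId"], taskStatus="FAILED", errorMessage=str(exc)
        )
        return "FAILED"

    client.set_task_status(taskId=task["taskId"], taskStatus="FINISHED")
    return "FINISHED"
```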

Agent-Assisted Migration Off Data Pipeline

An AI agent helping a team migrate off Data Pipeline can list every pipeline, fetch each definition for review, and deactivate ones that have been replaced by Step Functions or Glue Workflows. Through Jentic, the agent issues each operation as one structured call.

Call ListPipelines, then for each pipelineId call GetPipelineDefinition and DescribePipelines, and deactivate any pipeline tagged 'replaced=true'.
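The list, describe, and deactivate steps could be sketched as below. The client is assumed to expose boto3-style `list_pipelines`, `describe_pipelines`, and `deactivate_pipeline` methods; the function names are hypothetical, and fetching each definition with GetPipelineDefinition for human review is omitted.

```python
from __future__ import annotations


def list_all_pipeline_ids(client) -> list[str]:
    """Follow ListPipelines pagination until hasMoreResults is false."""
    ids: list[str] = []
    marker = None
    while True:
        kwargs = {"marker": marker} if marker else {}
        page = client.list_pipelines(**kwargs)
        ids.extend(entry["id"] for entry in page.get("pipelineIdList", []))
        marker = page.get("marker")
        if not page.get("hasMoreResults") or not marker:
            return ids


def deactivate_replaced(client, tag_key: str = "replaced",
                        tag_value: str = "true") -> list[str]:
    """Deactivate every pipeline carrying the given tag; return their IDs."""
    deactivated: list[str] = []
    ids = list_all_pipeline_ids(client)
    # DescribePipelines accepts a limited batch of IDs per call (25 here).
    for start in range(0, len(ids), 25):
        described = client.describe_pipelines(pipelineIds=ids[start:start + 25])
        for description in described.get("pipelineDescriptionList", []):
            tags = {t["key"]: t["value"] for t in description.get("tags", [])}
            if tags.get(tag_key) == tag_value:
                client.deactivate_pipeline(pipelineId=description["pipelineId"])
                deactivated.append(description["pipelineId"])
    return deactivated
```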

Key Endpoints

19 endpoints. Jentic publishes the only available OpenAPI specification for AWS Data Pipeline, keeping it validated and agent-ready.

METHOD  PATH                                                    DESCRIPTION
POST    /#X-Amz-Target=DataPipeline.CreatePipeline              Create a new pipeline
POST    /#X-Amz-Target=DataPipeline.PutPipelineDefinition       Set the pipeline definition
POST    /#X-Amz-Target=DataPipeline.ValidatePipelineDefinition  Validate a pipeline definition
POST    /#X-Amz-Target=DataPipeline.ActivatePipeline            Activate a pipeline
POST    /#X-Amz-Target=DataPipeline.DeactivatePipeline          Deactivate a pipeline
POST    /#X-Amz-Target=DataPipeline.DescribePipelines           Describe pipeline state
POST    /#X-Amz-Target=DataPipeline.PollForTask                 Poll for the next task as a task runner
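
Every operation in the list above is a POST to the service root, distinguished only by the X-Amz-Target header. The sketch below shows how such a request could be assembled; it assumes the regional endpoint pattern `datapipeline.<region>.amazonaws.com` and the `application/x-amz-json-1.1` content type, and `build_request` is a hypothetical helper. The result is unsigned, so it would still need SigV4 headers before AWS accepts it.

```python
import json


def build_request(operation: str, payload: dict, region: str = "us-east-1") -> dict:
    """Assemble an unsigned Data Pipeline request as a plain dictionary."""
    return {
        "method": "POST",
        "url": f"https://datapipeline.{region}.amazonaws.com/",
        "headers": {
            "Content-Type": "application/x-amz-json-1.1",
            # The target header selects the operation; the path is always "/".
            "X-Amz-Target": f"DataPipeline.{operation}",
        },
        "body": json.dumps(payload),
    }


request = build_request("ActivatePipeline", {"pipelineId": "df-EXAMPLE"})
```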

Why through Jentic?

Three things that make agents converge on Jentic-routed access.

Credential isolation

AWS SigV4 (HMAC) credentials for AWS Data Pipeline are stored encrypted in the Jentic vault. Agents receive scoped, short-lived access via Jentic's MAXsystem rather than holding the raw AWS access key ID and secret access key in their context.

Intent-based discovery

Agents search Jentic with intents like 'validate and activate a Data Pipeline' and Jentic returns the matching AWS Data Pipeline operation with its input schema, so the agent can call the correct endpoint without browsing the AWS service reference.

Time to first call

Direct integration: 2-4 days for SigV4 request signing, IAM policy setup, and error handling across AWS Data Pipeline operations. Through Jentic: under 1 hour - search by intent, load the schema, execute.

Related APIs

Alternatives and complements available in the Jentic catalogue.

Alternative

AWS Step Functions

Modern serverless workflow orchestration that AWS recommends over Data Pipeline.

Choose Step Functions for new orchestration workloads; keep Data Pipeline only for existing definitions.

Alternative

Amazon Kinesis Analytics

SQL and Apache Flink analytics on streaming AWS data.

Choose Kinesis Analytics for streaming analytics workloads; keep Data Pipeline for existing scheduled batch workflows.

FAQs

Specific to using AWS Data Pipeline API through Jentic.

Why is there no official OpenAPI spec for AWS Data Pipeline?

AWS does not publish an OpenAPI specification. Jentic generates and maintains this spec so that AI agents and developers can call AWS Data Pipeline via structured tooling. It is validated against the live API and kept up to date. Get started at https://app.jentic.com/sign-up.

What authentication does the AWS Data Pipeline API use?

AWS Signature v4 (HMAC) with IAM permissions on datapipeline:* actions plus pass-through permissions for the AWS resources the pipeline uses (S3, EC2, EMR, RDS, Redshift). Jentic stores the AWS credentials in its vault and signs each request.
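
For readers integrating directly, the HMAC part of Signature v4 is a chain of keyed hashes that derives a per-day, per-region, per-service signing key from the secret access key. The sketch below shows only that derivation step, using the Python standard library; the default service name `datapipeline` is an assumption, and building the canonical request and string to sign is not covered.

```python
import hashlib
import hmac


def _hmac_sha256(key: bytes, message: str) -> bytes:
    return hmac.new(key, message.encode("utf-8"), hashlib.sha256).digest()


def derive_signing_key(secret_key: str, date_stamp: str, region: str,
                       service: str = "datapipeline") -> bytes:
    """Derive the SigV4 signing key; date_stamp is formatted YYYYMMDD."""
    k_date = _hmac_sha256(("AWS4" + secret_key).encode("utf-8"), date_stamp)
    k_region = _hmac_sha256(k_date, region)
    k_service = _hmac_sha256(k_region, service)
    return _hmac_sha256(k_service, "aws4_request")
```

Because the derived key is scoped to one date, region, and service, a leaked signing key is far less useful than the long-lived secret it came from.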

Can I activate and deactivate pipelines with the API?

Yes. ActivatePipeline starts the next scheduled execution and DeactivatePipeline halts further runs without deleting the pipeline. DescribePipelines reports current state.

What are the rate limits for the AWS Data Pipeline API?

Per-account, per-region TPS limits apply, and PollForTask is the rate-limiting operation for custom task runner fleets - back off when no task is returned. AWS treats Data Pipeline as legacy, so capacity may be lower than newer services.
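
The back-off advice for idle task runners can be illustrated with capped exponential backoff and full jitter. This is one common approach, not something the service prescribes; the base delay, cap, and function name are assumptions chosen for the example.

```python
import random
from typing import Callable


def backoff_delay(empty_polls: int, base: float = 1.0, cap: float = 60.0,
                  rng: Callable[[], float] = random.random) -> float:
    """Seconds to wait after `empty_polls` consecutive polls returned no task.

    The ceiling doubles with each empty poll up to `cap`; the actual delay is
    a random fraction of that ceiling so a fleet of runners does not poll in
    lockstep.
    """
    ceiling = min(cap, base * (2 ** max(empty_polls, 0)))
    return rng() * ceiling
```

A runner would reset its empty-poll counter to zero as soon as PollForTask returns a task.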

How do I validate a pipeline definition through Jentic?

Search Jentic for 'validate a Data Pipeline definition before activating it', load the ValidatePipelineDefinition schema, and execute it with the pipelineId and PipelineObjects. Errors and warnings come back in structured arrays so you can fail closed before activation.

Should I use AWS Data Pipeline for new workloads?

AWS announced Data Pipeline as legacy in 2023 and recommends AWS Step Functions, AWS Glue Workflows, or Amazon MWAA (managed Airflow) for new work. The API remains supported for existing pipelines.