Text-to-Speech Serverless App — Phase 2: Serverless Backend & Data Storage

November 8, 2023

by Frank Doka


Phase 2 builds the submission pipeline: user text enters through the API, gets persisted, and triggers the processing Lambda asynchronously.

What I Built

API Gateway: a REST endpoint protected by the Cognito User Pool authorizer from Phase 1. Only requests carrying a valid JWT pass through.
Submission Lambda: receives the user's text, language selection, user ID, and timestamp, then writes a new record to DynamoDB with a "processing" status.
DynamoDB table: stores each submission's user ID, original text, target language, processing status, and (later) the S3 URL of the generated audio.
SNS topic: after writing to DynamoDB, the submission Lambda publishes a message to an SNS topic. A separate processing Lambda subscribes to this topic, which decouples submission from synthesis.
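
As a rough illustration of the submission step, here is a minimal Python sketch, not the project's actual code. The attribute names (`user_id`, `submission_id`, `created_at`), the request body fields, the 202 status code, and the `make_handler` helper are all assumptions made for the example. It also assumes a proxy integration, where the Cognito authorizer exposes the verified claims under `requestContext.authorizer.claims`. The table and SNS client are passed in so the logic can be exercised without AWS credentials.

```python
import json
import uuid
from datetime import datetime, timezone


def build_item(user_id, text, language, now=None):
    """Build the record for a new submission (attribute names are hypothetical)."""
    now = now or datetime.now(timezone.utc)
    return {
        "user_id": user_id,
        "submission_id": str(uuid.uuid4()),
        "text": text,
        "language": language,
        "status": "processing",  # updated later by the processing Lambda
        "created_at": now.isoformat(),
    }


def make_handler(table, sns, topic_arn):
    """Return a Lambda handler bound to a DynamoDB Table resource and an SNS client.

    In a deployment these would typically be created once at module load, e.g.
    boto3.resource("dynamodb").Table(...) and boto3.client("sns").
    """

    def handler(event, context):
        # Assumption: the user ID is taken from the JWT "sub" claim that the
        # Cognito authorizer has already verified.
        claims = event["requestContext"]["authorizer"]["claims"]
        body = json.loads(event.get("body") or "{}")
        text = body.get("text", "").strip()
        language = body.get("language", "")
        if not text or not language:
            return {
                "statusCode": 400,
                "body": json.dumps({"error": "text and language are required"}),
            }

        item = build_item(claims["sub"], text, language)
        table.put_item(Item=item)  # persist the submission first
        sns.publish(  # then notify the processing Lambda
            TopicArn=topic_arn,
            Message=json.dumps(
                {"user_id": item["user_id"], "submission_id": item["submission_id"]}
            ),
        )
        # Respond immediately; synthesis happens elsewhere.
        return {
            "statusCode": 202,
            "body": json.dumps(
                {"submission_id": item["submission_id"], "status": item["status"]}
            ),
        }

    return handler
```

Publishing only the identifiers keeps the SNS message small; the processing side can read the full text back from DynamoDB.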

Why SNS for Decoupling

The submission Lambda returns as soon as it has written to DynamoDB and published to SNS, so the user gets a fast response. Translation and speech synthesis happen asynchronously in a separate Lambda triggered by the SNS message, so a slow Polly call never blocks the API response.
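
To make the receiving side concrete, here is a hedged Python sketch of how a processing Lambda might unpack the SNS event. It assumes the message body is JSON with a hypothetical `submission_id` field. The `synthesize` and `mark_done` callbacks are placeholders for the Translate/Polly work and the DynamoDB status update, which belong to Phase 3 and are not shown here.

```python
import json


def extract_submissions(event):
    """Pull the JSON payloads out of an SNS-triggered Lambda event.

    SNS delivers each notification under Records[i]["Sns"]["Message"] as a string.
    """
    return [json.loads(record["Sns"]["Message"]) for record in event.get("Records", [])]


def make_processor(synthesize, mark_done):
    """Return a handler that runs the slow work outside the API request path.

    synthesize(submission) -> audio URL   (hypothetical callback)
    mark_done(submission, audio_url)      (hypothetical callback)
    """

    def handler(event, context):
        processed = []
        for submission in extract_submissions(event):
            audio_url = synthesize(submission)  # may take seconds; the API has already replied
            mark_done(submission, audio_url)
            processed.append(submission["submission_id"])
        return {"processed": processed}

    return handler
```

Because this handler is invoked by SNS rather than by API Gateway, its running time has no effect on the latency the user sees when submitting text.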

What's Next

Phase 3 wires in Amazon Translate and Polly to convert submissions into spoken audio, stored in S3.
