# 2 - Complete End To End Generative AI Project On AWS Using AWS Bedrock And AWS Lambda
Based on Krish Naik's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.
## Briefing
The core build is an end-to-end “blog generator” API on AWS: Postman sends a blog topic to an API Gateway endpoint, an AWS Lambda function uses Amazon Bedrock to generate a 200-word blog with a foundation model (Llama 2 chat), and the resulting text is saved to an S3 bucket as a timestamped .txt file. The workflow ties together API Gateway → Lambda → Bedrock → S3, turning a single HTTP request into a stored generative output.
Implementation starts with Amazon Bedrock model access. In the Bedrock console, the process requires granting model access per region (the walkthrough uses us-east-1). Some models require additional use-case details before access is granted, so the setup includes selecting which foundation models are allowed and saving the changes. The foundation model catalog includes options such as Llama 2, Llama 3, Titan (Amazon), Claude (Anthropic), Command (Cohere), Jurassic (AI21 Labs), Mistral, and Stable Diffusion for image generation—though the blog generation flow later focuses on a Llama 2 chat model.
Next comes AWS Lambda, set up as the compute layer that reacts to API calls. The function is created with Python 3.12 and a 3-minute timeout. Because Bedrock invocation requires a newer boto3 version than the default Lambda runtime bundles, the walkthrough adds a custom Lambda Layer containing an updated boto3 package compatible with Python 3.12 (Python 3.10 and 3.11 are also selected as compatible runtimes during layer creation). The Lambda code uses boto3’s Bedrock Runtime client to call Bedrock’s invoke_model with a model-specific request body.
The prompt format follows Llama 2 chat conventions, wrapping the conversation turn in [INST] … [/INST] markers with “Human” and “Assistant” labels. The request body includes generation controls such as max token length (max_gen_len), temperature, and top_p. In the Lambda handler, the incoming API Gateway event is parsed to extract blog_topic from the request body, then generate_blog_using_bedrock is called to produce the blog text. The code reads the Bedrock response stream, pulls the generated content from the response JSON (the walkthrough references the generation field), and returns a success message.
After generation, the Lambda function writes the blog to S3. It creates a timestamped S3 object key (e.g., blog_output/<time>.txt) using the current time formatted as hours, minutes, and seconds. A dedicated helper function performs the S3 put_object call, storing the generated blog text in the configured bucket.
The integration layer is API Gateway (HTTP API). A POST route (blog generation) is created and integrated with the Lambda function, then deployed to a Dev stage. Testing begins with Postman; initial failures include a 404 “message not found” caused by a request path mismatch (including an extra space), and a Bedrock authorization error caused by the Lambda execution role lacking permission for bedrock:InvokeModel. Fixing the IAM role by attaching broader permissions resolves the invoke_model failure. After successful invocation, CloudWatch logs show the generated output, and the S3 bucket contains the timestamped .txt file with the blog content.
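The walkthrough fixes the authorization error by attaching broad permissions; a least-privilege alternative scopes the execution role to the one model (the ARN below assumes the Llama 2 13B chat model in us-east-1). Note the role also needs s3:PutObject on the target bucket for the S3 write, and CloudWatch Logs permissions for debugging:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "bedrock:InvokeModel",
      "Resource": "arn:aws:bedrock:us-east-1::foundation-model/meta.llama2-13b-chat-v1"
    }
  ]
}
```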
## Cornell Notes
The project builds a complete AWS generative AI pipeline for blog creation. Postman calls an API Gateway POST endpoint with a JSON body containing a blog topic. API Gateway triggers an AWS Lambda function, which invokes an Amazon Bedrock foundation model (Llama 2 chat) using boto3 Bedrock Runtime and a Llama-style prompt. The Lambda handler extracts the generated text from the Bedrock response and saves it to an S3 bucket as a timestamped .txt file. The setup matters because it demonstrates the practical glue—model access, Lambda runtime dependencies (updated boto3 via a Layer), IAM permissions for bedrock:InvokeModel, and API Gateway routing—needed to make generative output reliably accessible through an HTTP API.
- How does a single HTTP request turn into a generated blog stored in S3?
- Why is a custom Lambda Layer needed for boto3 in this setup?
- What must be configured in Bedrock before invoking a foundation model?
- What IAM permission caused the initial Bedrock invocation failure, and how was it resolved?
- How does the prompt structure work for Llama 2 chat in the Lambda code?
- What common testing issues appeared during Postman calls, and what were the fixes?
## Review Questions
- Describe the exact request/response flow from Postman to API Gateway to Lambda to Bedrock to S3.
- What changes are required when the default Lambda boto3 version is insufficient for Bedrock invocation?
- Which IAM permission is necessary for Lambda to call Bedrock invoke_model, and how would you verify it using CloudWatch logs?
## Key Points

1. API Gateway (HTTP API) can expose a POST endpoint that triggers Lambda for generative AI tasks.
2. Bedrock model access must be granted per region before foundation models can be invoked.
3. Lambda should use boto3’s Bedrock Runtime invoke_model with a model-specific request body and prompt format (Llama 2 chat wraps turns in [INST] … [/INST] markers with Human/Assistant labels).
4. A custom Lambda Layer is often required to update boto3 so Bedrock invocation works reliably.
5. Lambda’s execution role must include permission for bedrock:InvokeModel; missing IAM policies cause invocation to fail.
6. Generated text can be persisted by writing to S3 with a timestamped object key using put_object.
7. CloudWatch logs are essential for debugging API routing errors (404) and Bedrock authorization failures.