
#2- Complete End To End Generative AI Project On AWS Using AWS Bedrock And AWS Lambda

Krish Naik · 5 min read

Based on Krish Naik's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.

TL;DR

API Gateway (HTTP API) can expose a POST endpoint that triggers Lambda for generative AI tasks.

Briefing

The core build is an end-to-end “blog generator” API on AWS: Postman sends a blog topic to an API Gateway endpoint, an AWS Lambda function uses Amazon Bedrock to generate a 200-word blog with a foundation model (Llama 2 chat), and the resulting text is saved to an S3 bucket as a timestamped .txt file. The workflow ties together API Gateway → Lambda → Bedrock → S3, turning a single HTTP request into a stored generative output.

Implementation starts with Amazon Bedrock model access. In the Bedrock console, the process requires granting model access per region (the walkthrough uses us-east-1). Some models require additional use-case details before access is granted, so the setup includes selecting which foundation models are allowed and saving the changes. The foundation model catalog includes options such as Llama 2, Llama 3, Titan (Amazon), Claude (Anthropic), Command (Cohere), Jurassic (AI21 Labs), Mistral, and Stable Diffusion for image generation—though the blog generation flow later focuses on a Llama 2 chat model.

Next comes AWS Lambda, set up as the compute layer that reacts to API calls. The function is created with Python 3.12 and a 3-minute timeout. Because Bedrock invocation requires a newer boto3 than the default Lambda runtime provides, the walkthrough adds a custom Lambda Layer containing an updated boto3 package compatible with Python 3.12 (Python 3.10 and 3.11 are also listed as compatible runtimes during layer creation). The Lambda code uses boto3’s Bedrock Runtime client to call Bedrock’s invoke_model with a model-specific request body.

The prompt format follows Llama 2 chat conventions, wrapping the instruction in [INST] … [/INST] markers (described in the walkthrough as the “human” and “assistant” sections). The request body includes generation controls such as max token length, temperature, and top_p. In the Lambda handler, the incoming API Gateway event is parsed to extract blog_topic from the request body, then generate_blog_using_bedrock is called to produce the blog text. The code reads the Bedrock response, pulls the generated content from the response JSON (the walkthrough references the generation field), and returns a success message.
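A minimal sketch of what that generation helper might look like (the 200-word instruction and the generation field mirror the walkthrough; the exact model ID variant and parameter names are assumptions based on Bedrock’s Llama 2 request schema):

```python
import json


def build_request_body(blog_topic: str) -> dict:
    """Llama 2 chat request body: [INST]-wrapped prompt plus generation controls."""
    prompt = f"<s>[INST] write a 200 words blog on the topic {blog_topic} [/INST]"
    return {
        "prompt": prompt,
        "max_gen_len": 512,   # max token length
        "temperature": 0.5,
        "top_p": 0.9,
    }


def generate_blog_using_bedrock(blog_topic: str) -> str:
    """Invoke a Bedrock Llama 2 chat model and return the generated text."""
    import boto3  # imported lazily so build_request_body stays testable offline

    client = boto3.client("bedrock-runtime", region_name="us-east-1")
    response = client.invoke_model(
        modelId="meta.llama2-13b-chat-v1",  # assumed Llama 2 chat variant
        body=json.dumps(build_request_body(blog_topic)),
    )
    payload = json.loads(response["body"].read())
    return payload["generation"]  # field referenced in the walkthrough
```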

After generation, the Lambda function writes the blog to S3. It creates a timestamped S3 object key (e.g., blog_output/<time>.txt) using the current time formatted as hours, minutes, and seconds. A dedicated helper function performs the S3 put_object call, storing the generated blog text in the configured bucket.
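The save step can be sketched as follows (the HHMMSS key format and blog_output prefix follow the walkthrough; the bucket name is a placeholder):

```python
from datetime import datetime


def timestamped_key(now: datetime, prefix: str = "blog_output") -> str:
    """Build an S3 key like blog_output/142305.txt from the current time."""
    return f"{prefix}/{now.strftime('%H%M%S')}.txt"


def save_blog_to_s3(blog_text: str, bucket: str) -> str:
    """Write the generated blog to S3 and return the object key."""
    import boto3  # lazy import keeps timestamped_key testable without AWS

    key = timestamped_key(datetime.now())
    boto3.client("s3").put_object(Bucket=bucket, Key=key, Body=blog_text)
    return key
```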

The integration layer is API Gateway (HTTP API). A POST route (blog generation) is created and integrated with the Lambda function, then deployed to a Dev stage. Testing begins with Postman; initial failures include a 404 “message not found” caused by a request path mismatch (including an extra space), and a Bedrock authorization error caused by the Lambda execution role lacking permission for bedrock:InvokeModel. Fixing the IAM role by attaching broader permissions resolves the invoke_model failure. After successful invocation, CloudWatch logs show the generated output, and the S3 bucket contains the timestamped .txt file with the blog content.

Cornell Notes

The project builds a complete AWS generative AI pipeline for blog creation. Postman calls an API Gateway POST endpoint with a JSON body containing a blog topic. API Gateway triggers an AWS Lambda function, which invokes an Amazon Bedrock foundation model (Llama 2 chat) using boto3 Bedrock Runtime and a Llama-style prompt. The Lambda handler extracts the generated text from the Bedrock response and saves it to an S3 bucket as a timestamped .txt file. The setup matters because it demonstrates the practical glue—model access, Lambda runtime dependencies (updated boto3 via a Layer), IAM permissions for bedrock:InvokeModel, and API Gateway routing—needed to make generative output reliably accessible through an HTTP API.

How does a single HTTP request turn into a generated blog stored in S3?

API Gateway (HTTP API) receives a POST request on the /blog generation route. That request triggers the Lambda function. Lambda parses the event body to read blog_topic, calls Bedrock Runtime invoke_model with a Llama 2 chat prompt and generation parameters, then reads the response JSON to extract the generated text. Finally, Lambda writes the blog to S3 using put_object with a timestamped key under a blog_output prefix.
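The parsing step at the start of that flow can be sketched as a small helper (the blog_topic field name comes from the walkthrough; the event shape is the standard API Gateway proxy payload):

```python
import json


def extract_blog_topic(event: dict) -> str:
    """Read blog_topic from the JSON body of an API Gateway proxy event."""
    body = json.loads(event.get("body") or "{}")
    return body["blog_topic"]
```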

Why is a custom Lambda Layer needed for boto3 in this setup?

The walkthrough notes that Lambda’s default boto3 version can be older and may not support the Bedrock foundation model invocation features needed for this project. To fix that, it creates a Layer by installing boto3 into a local python/ folder, zipping it so the zip contains a python directory, and attaching that Layer to the Lambda function. The layer is configured for Python 3.12 compatibility (and also lists Python 3.10/3.11 during creation).
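The zipping step can be sketched in Python (this assumes you have already run something like `pip install boto3 -t layer/python` locally; folder names are illustrative):

```python
import os
import zipfile


def zip_layer(layer_dir: str, zip_path: str) -> None:
    """Zip layer_dir so every entry sits under python/, as Lambda Layers expect."""
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        python_dir = os.path.join(layer_dir, "python")
        for root, _, files in os.walk(python_dir):
            for name in files:
                full = os.path.join(root, name)
                # arcname relative to layer_dir preserves the leading python/ prefix
                zf.write(full, os.path.relpath(full, layer_dir))
```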

What must be configured in Bedrock before invoking a foundation model?

Model access must be granted in the Bedrock console for the selected region (us-east-1 in the walkthrough). The process includes selecting the foundation models to allow and saving model access. Some models (e.g., certain Anthropic models) may require submitting use-case details before access is granted.

What IAM permission caused the initial Bedrock invocation failure, and how was it resolved?

CloudWatch logs show an authorization error indicating the Lambda role lacked permission for bedrock:InvokeModel (no identity-based policy allowed the invoke model action). The fix was to update the Lambda execution role by attaching a policy that grants broader access (the walkthrough uses administrator access as a demonstration), then retest the same Postman request.
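As a tighter alternative to administrator access, the execution role could instead carry a scoped policy along these lines (a sketch; narrowing the Resource to specific model ARNs is optional):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "bedrock:InvokeModel",
      "Resource": "*"
    }
  ]
}
```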

How does the prompt structure work for Llama 2 chat in the Lambda code?

The request body uses Llama 2 chat formatting, wrapping the instruction in [INST] … [/INST] markers with human and assistant roles. The human section includes instructions like “write a 200 words blog on the topic” followed by the user-provided blog topic. The closing marker ends the instruction and signals where the model should generate the blog as the assistant. Generation settings like max token length, temperature, and top_p are included alongside the prompt.
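As a sketch, the prompt assembly might look like this (the instruction wording follows the walkthrough; the leading <s> token is an assumption from Llama 2’s chat convention):

```python
def build_llama2_prompt(blog_topic: str) -> str:
    # [INST] ... [/INST] wraps the human instruction; the model's
    # continuation after the closing marker plays the assistant role.
    return f"<s>[INST] write a 200 words blog on the topic {blog_topic} [/INST]"
```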

What common testing issues appeared during Postman calls, and what were the fixes?

First, a 404 “message not found” occurred due to an incorrect route path caused by an extra space in the request URL. After correcting the path, the next failure was an authorization error for bedrock:InvokeModel, resolved by updating the Lambda role permissions. After both fixes, the blog generation succeeded and the S3 object appeared.

Review Questions

  1. Describe the exact request/response flow from Postman to API Gateway to Lambda to Bedrock to S3.
  2. What changes are required when the default Lambda boto3 version is insufficient for Bedrock invocation?
  3. Which IAM permission is necessary for Lambda to call Bedrock invoke_model, and how would you verify it using CloudWatch logs?

Key Points

  1. API Gateway (HTTP API) can expose a POST endpoint that triggers Lambda for generative AI tasks.
  2. Bedrock model access must be granted per region before foundation models can be invoked.
  3. Lambda should use boto3 Bedrock Runtime invoke_model with a model-specific request body and prompt format (Llama 2 chat uses [INST] markers with human/assistant sections).
  4. A custom Lambda Layer is often required to update boto3 so Bedrock invocation works reliably.
  5. Lambda’s execution role must include permission for bedrock:InvokeModel; missing IAM policies cause invocation to fail.
  6. Generated text can be persisted by writing to S3 with a timestamped object key using put_object.
  7. CloudWatch logs are essential for debugging API routing errors (404) and Bedrock authorization failures.

Highlights

  • The pipeline turns a Postman request into a stored artifact: API Gateway → Lambda → Bedrock → S3 timestamped .txt file.
  • A practical dependency issue is handled by adding a Lambda Layer with an updated boto3 package for Python 3.12.
  • Two distinct failure modes are resolved: a 404 route mismatch from an extra space, and a bedrock:InvokeModel IAM permission error.

Topics

  • Bedrock
  • Lambda
  • API Gateway
  • S3 Storage
  • Boto3 Layer

Mentioned

  • API
  • IAM
  • S3
  • HTTP API
  • SDK
  • JSON