Generate video summary report using generative AI and serverless on AWS

This repository showcases an automated method for creating comprehensive video summary reports utilizing Amazon Bedrock with the AI21 Labs Jurassic-2 Ultra model. The process involves automating the extraction of images from each frame of video presentations and generating corresponding text summaries. Additionally, it generates a consolidated PDF report that merges each frame's image with its respective text summary.

The resultant PDF report functions as a structured, visual, and textual reference for the video content. By combining images with text summaries, it ensures the meticulous preservation of crucial visual elements such as slides, charts, and diagrams. Moreover, it plays a pivotal role in extracting key points, explanations, and vital information from the video content. This enables users to swiftly review and comprehend the essential aspects of educational presentations without the necessity of watching the entire video, significantly boosting overall efficiency.

Note: For more hands-on labs on generative AI on AWS, please refer to this workshop: Using generative AI on AWS for diverse content types. In this workshop, you will use generative AI on AWS to work with various types of content, including documents, PDFs, video files, audio files, images, CSVs, SQL databases, graph databases, and application logs.

Architecture

The solution comprises the following steps:

  1. The user initiates the process by uploading a training/webinar/presentation video file to an Amazon S3 bucket.
  2. Once the video file is successfully uploaded, it triggers a Lambda function that detects segments within the video using the Amazon Rekognition Segment API.
  3. When Amazon Rekognition reports that the job completed successfully, the completion event triggers the AWS Step Functions workflow.
  4. The AWS Step Functions workflow retrieves the segment details and uses a Distributed Map for parallel processing. The Distributed Map executes child workflows in parallel, and each child workflow performs Steps 5 to 7 for a batch of segments.
  5. Within the child workflow, a Lambda function generates video clips and images for each segment using FFmpeg and stores them in an S3 bucket.
  6. A Lambda function then generates transcripts for the video clips using Amazon Transcribe and places them in the corresponding S3 bucket.
  7. Another Lambda function creates a summary of each transcript using Amazon Bedrock with the AI21 Labs Jurassic-2 Ultra model and places the summary text file in the S3 bucket.
  8. In the main workflow, a Lambda function combines the generated summaries and images for each segment into a PDF document and creates a pre-signed S3 URL. It also updates an Amazon DynamoDB table with the pre-signed S3 URL.
  9. Finally, a Streamlit app displays the pre-signed URL for users to download the compiled PDF document.
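
Steps 4 and 5 above can be sketched in plain Python. This is an illustrative sketch under stated assumptions, not the repository's Lambda code: the function names `batch_segments` and `ffmpeg_clip_command` are hypothetical, each segment is assumed to be a dictionary carrying the `StartTimestampMillis` and `EndTimestampMillis` fields that Amazon Rekognition reports for a segment, and the FFmpeg arguments show one common way to cut a clip rather than the exact command the project runs.

```python
from typing import Dict, List


def batch_segments(segments: List[Dict], max_items_per_batch: int) -> List[List[Dict]]:
    """Split segments into batches, similar to how a Distributed Map
    groups items before handing each batch to a child workflow."""
    if max_items_per_batch < 1:
        raise ValueError("max_items_per_batch must be at least 1")
    return [
        segments[i:i + max_items_per_batch]
        for i in range(0, len(segments), max_items_per_batch)
    ]


def ffmpeg_clip_command(source: str, segment: Dict, output: str) -> List[str]:
    """Build an FFmpeg argument list that cuts one segment out of source.

    Assumption: timestamps are in milliseconds, as in Rekognition segment results.
    """
    start_s = segment["StartTimestampMillis"] / 1000
    duration_s = (segment["EndTimestampMillis"] - segment["StartTimestampMillis"]) / 1000
    return [
        "ffmpeg", "-y",
        "-ss", f"{start_s:.3f}",    # seek to the segment start
        "-i", source,
        "-t", f"{duration_s:.3f}",  # keep only the segment duration
        "-c", "copy",               # copy streams without re-encoding
        output,
    ]


# Made-up segments, one second each, purely for illustration.
example_segments = [
    {"StartTimestampMillis": i * 1000, "EndTimestampMillis": (i + 1) * 1000}
    for i in range(5)
]
example_batches = batch_segments(example_segments, max_items_per_batch=2)
```

The argument list returned by `ffmpeg_clip_command` could be passed to `subprocess.run` in an environment where FFmpeg is installed. The batch size plays the role of the MaxItemsPerBatch deployment parameter.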

Prerequisites

We recommend using AWS Cloud9 to create an environment that gives you access to the AWS CLI and SAM CLI from a bash terminal. AWS Cloud9 is a browser-based IDE that provides a development environment in the cloud. When creating the new environment, choose Amazon Linux 2 as the operating system. Alternatively, you can use a bash terminal in your favorite IDE and configure your AWS credentials there.
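
If you use your own terminal instead of AWS Cloud9, a quick check that both CLIs are on your PATH can save a failed deployment later. The following is a minimal sketch; the helper name `find_missing_tools` is hypothetical, and it only confirms that the `aws` and `sam` executables can be found, not that credentials are configured.

```python
import shutil
from typing import Iterable, List


def find_missing_tools(tools: Iterable[str] = ("aws", "sam")) -> List[str]:
    """Return the names of the tools that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Missing from PATH: " + ", ".join(missing))
    else:
        print("AWS CLI and SAM CLI were both found.")
```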

Deployment

  1. Clone the repo.
git clone https://github.com/aws-samples/video-summarization-serverless.git
  2. Run the following command to prepare the serverless application for deployment to the AWS Cloud. This command creates a .aws-sam directory that structures your application in the format and location that the next step requires.
sam build
  3. Package and deploy the SAM application. The following command starts an interactive prompt; enter the values listed below.
sam deploy --guided
  • Stack Name: video-summarization
  • AWS Region: your current region (e.g. us-west-2, us-east-1)
  • Parameter S3BucketName: leave as default
  • Parameter VideoPrefix: leave as default
  • Parameter VideoProcessingStagingPrefix: leave as default
  • Parameter VideoSummaryFilesPrefix: leave as default
  • Parameter VideoPDFReportFilesPrefix: leave as default
  • Parameter BedrockModelId: leave as default
  • Parameter SummparyReportURLExpiration: leave as default
  • Parameter RekognitionSNSTopicName: leave as default
  • Parameter VideoProcessingWorkflowName: leave as default
  • Parameter PDFFileURLExpiration: leave as default
  • Parameter MaxConcurrency: leave as default
  • Parameter MaxItemsPerBatch: leave as default
  • Parameter WaitTimeForJob: leave as default
  • Confirm changes before deploy: N
  • Allow SAM CLI IAM role creation: leave as default
  • Disable rollback: leave as default
  • Save arguments to configuration file: leave as default
  • SAM configuration file: leave as default
  • SAM configuration environment: leave as default
  4. Verify that the SAM template deployed successfully, then copy the output value for the key VideoBucket. You will use this S3 bucket for testing.
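
As an alternative to copying the VideoBucket output by hand, it can be read from the stack description. The sketch below assumes a response shaped like the one returned by the CloudFormation `describe_stacks` call (a `Stacks` list whose entries carry `Outputs` with `OutputKey` and `OutputValue`); the helper name `get_stack_output` and the sample bucket name are hypothetical.

```python
from typing import Any, Dict, Optional


def get_stack_output(response: Dict[str, Any], output_key: str) -> Optional[str]:
    """Return the value of a named stack output, or None if it is absent."""
    for stack in response.get("Stacks", []):
        for output in stack.get("Outputs", []):
            if output.get("OutputKey") == output_key:
                return output.get("OutputValue")
    return None


# Response trimmed to the fields used here; the bucket name is made up.
sample_response = {
    "Stacks": [
        {
            "StackName": "video-summarization",
            "Outputs": [
                {"OutputKey": "VideoBucket", "OutputValue": "example-video-bucket"},
            ],
        }
    ]
}

video_bucket = get_stack_output(sample_response, "VideoBucket")
print(video_bucket)
```

With boto3 installed and credentials configured, the response would come from `boto3.client("cloudformation").describe_stacks(StackName="video-summarization")` instead of the sample dictionary.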

Note: For detailed steps, please refer to the workshop.

Test

  1. Go to the video-summarization-serverless/test/ directory.

  2. Run the following command to upload the video file to the S3 bucket. Make sure to replace <video-bucket-name> with the bucket name you copied earlier.

aws s3 cp AWS-TechTalk-S3-Lifecycle.mp4 s3://<video-bucket-name>/video-files/AWS-TechTalk-S3-Lifecycle.mp4
  3. After a few minutes, the upload triggers the video-processing-workflow workflow.

  4. Once the workflow has completed, run the Streamlit app to view the summary.

  5. From within the video-summarization/ui/ directory, enter the following command to install the Python modules and packages listed in requirements.txt.

pip install -r requirements.txt
  6. Launch the Streamlit app with the following command.
streamlit run app.py
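
The upload in step 2 can also be scripted. This is a minimal sketch, assuming the default `video-files/` prefix shown in the `aws s3 cp` command above; the helper names `build_video_key` and `upload_video` are hypothetical, and the S3 client is passed in so the key logic can be tried without AWS access. The `upload_file(Filename, Bucket, Key)` call matches the boto3 S3 client method of that name.

```python
import os
from typing import Any


def build_video_key(local_path: str, prefix: str = "video-files/") -> str:
    """Derive the S3 object key for a local video file."""
    filename = os.path.basename(local_path)
    if not filename:
        raise ValueError("local_path must point to a file, got: " + repr(local_path))
    prefix = prefix.strip("/")
    return f"{prefix}/{filename}" if prefix else filename


def upload_video(s3_client: Any, local_path: str, bucket: str,
                 prefix: str = "video-files/") -> str:
    """Upload the file with the given client and return the key that was used."""
    key = build_video_key(local_path, prefix)
    s3_client.upload_file(Filename=local_path, Bucket=bucket, Key=key)
    return key


example_key = build_video_key("test/AWS-TechTalk-S3-Lifecycle.mp4")
print(example_key)
```

In a real run, `s3_client` would be `boto3.client("s3")` and `bucket` the VideoBucket value copied after deployment.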

Note: For detailed steps, please refer to the workshop.

Clean up

  1. Go to the video-summarization/ directory.

  2. Run the following command to empty the video bucket. Make sure to replace <video-bucket-name> with the bucket name you copied earlier.

aws s3 rm s3://<video-bucket-name> --recursive
  3. Run the following command to delete the deployed SAM application.
sam delete --stack-name video-summarization --no-prompts
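
Emptying the bucket can also be done from Python. The sketch below is an assumption-labeled alternative to the `aws s3 rm` command above, not part of the repository: the helper name `empty_bucket` is hypothetical, and the client is passed in as a parameter. It relies on the boto3 S3 client's `list_objects_v2` paginator and `delete_objects` call, issuing one delete request per listed page.

```python
from typing import Any


def empty_bucket(s3_client: Any, bucket: str) -> int:
    """Delete every listed object in the bucket.

    Returns the number of keys submitted for deletion.
    """
    submitted = 0
    paginator = s3_client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket):
        # A page for an empty bucket has no "Contents" entry.
        keys = [{"Key": obj["Key"]} for obj in page.get("Contents", [])]
        if not keys:
            continue
        s3_client.delete_objects(Bucket=bucket, Delete={"Objects": keys})
        submitted += len(keys)
    return submitted
```

In a real run, `s3_client` would be `boto3.client("s3")`. Run it before `sam delete`, in the same order as the steps above.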

Security

See CONTRIBUTING for more information.

License

This library is licensed under the MIT-0 License. See the LICENSE file.

DISCLAIMER

The solution architecture sample code is provided without any guarantees, and we do not recommend using it for production-grade workloads. It is intended as content to build with and learn from. Be sure to read the licensing terms.
