GPT Subtitler

Description:

GPT Subtitler is a web application that translates subtitles using LLM APIs such as OpenAI, Claude, and Gemini. It is built with the Serverless Stack (SST) framework on AWS: the frontend uses React.js, and the backend relies on AWS Lambda functions, DynamoDB, Cognito, and S3. It integrates Stripe for subscription payments and supports token-based pricing, so users can access translation services without needing their own API key.

Users can upload subtitle files, customize translation settings, and download translated subtitles. Key features include customizable translation settings (support for various languages, model selection, temperature adjustment, prompt customization, and few-shot examples), batch processing, and real-time progress tracking.

GPT Subtitler is designed to make subtitle translation faster and more cost-effective, reducing the typical workflow from hours to minutes. The website offers different pricing based on the model used; Claude-Haiku is currently the most cost-efficient, while free models like Gemini-1.5-flash and Gemini-1.5-pro are also available.

Overall, GPT Subtitler offers a scalable, efficient solution for subtitle translation, leveraging state-of-the-art language models and serverless architecture. Future plans include continuous improvements and additional features to enhance user experience and translation quality.

The user interface of the translation settings page in GPT Subtitler.

The translation settings page in the GPT Subtitler web application provides a user-friendly interface for configuring subtitle translation. The left side displays the original and translated subtitle content, while the right side offers various settings. Users can choose to use their account tokens or enter an API key, select source and target languages, specify the starting index, and access advanced settings such as the prompt, additional context, model, batch size, and temperature. The page also includes options to upload a subtitle file, save, delete, translate, stop translation, and download subtitles in different formats. This lets users tailor the translation process to their preferences and requirements.
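To make the roles of the starting index and batch size concrete, here is a minimal sketch of how a list of subtitle lines could be split into batches before translation. It is an illustration under stated assumptions, not the application's actual code: the `TranslationSettings` class, its field names, the default values, and `make_batches` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Iterator, List


@dataclass
class TranslationSettings:
    # Hypothetical field names that mirror the options on the settings page.
    source_language: str
    target_language: str
    start_index: int = 1      # 1-based index of the first subtitle to translate
    batch_size: int = 20      # assumed default, not taken from the application
    temperature: float = 0.3  # assumed default, not taken from the application


def make_batches(lines: List[str], settings: TranslationSettings) -> Iterator[List[str]]:
    """Yield consecutive batches of subtitle lines, beginning at start_index."""
    if settings.start_index < 1 or settings.batch_size < 1:
        raise ValueError("start_index and batch_size must be at least 1")
    # Skip everything before the starting index (converted to 0-based).
    remaining = lines[settings.start_index - 1:]
    for offset in range(0, len(remaining), settings.batch_size):
        yield remaining[offset:offset + settings.batch_size]


if __name__ == "__main__":
    settings = TranslationSettings("en", "fr", start_index=3, batch_size=2)
    for batch in make_batches(["a", "b", "c", "d", "e"], settings):
        print(batch)
```

A starting index greater than 1 is useful for resuming a translation that was stopped partway through, since earlier lines are simply skipped.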

💡 The Inspiration

The idea for GPT Subtitler was born out of my previous project, "Dual Subtitles for Video with GPT-3.5". This Python tool allowed users to transcribe and translate videos, generating dual subtitles using the Whisper model and GPT-3.5. The positive response and the insights gained from this project inspired me to take things to the next level.

I wanted to create a more user-friendly, scalable, and feature-rich platform that could cater to the needs of content creators and language learners. I also wanted to learn more about AWS and serverless frameworks. That's when I discovered this guide (https://sst.dev/guide.html). I followed it to build the foundation of the project, a note-taking website, which taught me how to use AWS and how to properly build the backend of a serverless application.

🛠️ Under the Hood

To bring GPT Subtitler to life, I combined serverless architecture with large language models. The application is built using the Serverless Stack (SST) framework on AWS, with React.js for the frontend and AWS Lambda, DynamoDB, Cognito, and S3 for the backend. This serverless approach allows for seamless scalability and cost-efficiency.

One of the key highlights of GPT Subtitler is its integration with OpenAI's GPT models and Anthropic's Claude models. These state-of-the-art LLMs enable the application to generate accurate and contextually relevant translations. Users can customize the translation settings, such as the target language, model type, and temperature, to fine-tune the output to their specific needs, then sit back as the application works its magic. The real-time progress tracking feature lets users monitor the translation process, and the translated subtitles can be downloaded in the user's preferred format.

Not everyone has an OpenAI or Claude API key. To make GPT Subtitler accessible to everyone, I implemented a token-based pricing model and a subscription service, and integrated Stripe for secure subscription payments and token purchases. Users can choose a plan that suits their needs and budget, making subtitle translation more affordable.

🌟 Features

Translate subtitles from one language to another using OpenAI's GPT models

Customizable translation settings (source language, target language, model, temperature, etc.)

A comprehensive prompt that translates batches of subtitles into the desired target language, which users can also modify

The prompt adheres to guidelines from the deep learning Prompt Engineering course, employing techniques such as few-shot examples, structured JSON output and reflection.

The prompt has been tested extensively and refined iteratively for optimal performance

Daily login rewards and recurring token additions based on subscription tier

API key support for free users with daily usage limits

Detailed token usage logs and transaction history
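The prompt techniques listed above (few-shot examples and structured JSON output) can be sketched roughly as follows. This is only an illustration: the actual prompt is not reproduced in this write-up, so the wording, the `FEW_SHOT` example, and the `build_prompt` and `parse_reply` helpers are all hypothetical. No API call is made; the model reply is passed in as a plain string.

```python
import json
from typing import Dict

# Hypothetical few-shot example; the application's real examples are not shown here.
FEW_SHOT = [
    {
        "input": {"1": "Hello.", "2": "How are you?"},
        "output": {"1": "Bonjour.", "2": "Comment ça va ?"},
    },
]


def build_prompt(batch: Dict[int, str], target_language: str) -> str:
    """Build a prompt that asks for a JSON object mapping index -> translation."""
    parts = [
        f"Translate each subtitle into {target_language}.",
        "Reply with only a JSON object that maps each index to its translation.",
    ]
    for example in FEW_SHOT:
        parts.append("Input: " + json.dumps(example["input"], ensure_ascii=False))
        parts.append("Output: " + json.dumps(example["output"], ensure_ascii=False))
    payload = {str(index): text for index, text in batch.items()}
    parts.append("Input: " + json.dumps(payload, ensure_ascii=False))
    parts.append("Output:")
    return "\n".join(parts)


def parse_reply(reply: str, batch: Dict[int, str]) -> Dict[int, str]:
    """Parse the model's JSON reply and check that every index was translated."""
    data = json.loads(reply)
    translated = {int(key): value for key, value in data.items()}
    missing = sorted(set(batch) - set(translated))
    if missing:
        raise ValueError(f"missing translations for indices {missing}")
    return translated


if __name__ == "__main__":
    batch = {1: "Good morning.", 2: "See you."}
    print(build_prompt(batch, "French"))
    print(parse_reply('{"1": "Bonjour.", "2": "À bientôt."}', batch))
```

Keying the output by subtitle index makes it easy to detect when the model drops or merges lines in a batch, so that batch can be retried instead of silently shifting the subtitles out of alignment.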

🖥️ Tech Stack

React.js for the frontend

Serverless Stack (SST) on AWS for the backend infrastructure

AWS Lambda for serverless functions

AWS DynamoDB for data storage

AWS Cognito for user authentication

AWS S3 for file storage

WebSocket API for real-time communication

Stripe for payment processing

OpenAI API for subtitle translation

👥 Use Case:

GPT Subtitler can significantly speed up the subtitling process. Typically, an experienced subtitler needs about 1 hour to transcribe a 5-minute video and possibly the same amount of time to translate the subtitles (source). In my workflow, as someone with zero subtitling experience, it takes about 3 minutes to transcribe a 40-minute video using Whisper, which generates a subtitle file with ~700 lines. I then upload the file to GPT Subtitler, which takes ~5 minutes to translate it. Using Claude-Haiku, the translation costs only ~$0.10, making it a highly cost-effective solution. That is a total of about 10 minutes of unsupervised work, which leaves me free to do other tasks while the processes run in the background.

After translation finishes, I skim through the subtitle file containing both the original and translated text to identify any mistranslations and correct them in VSCode, which typically takes no more than 5 minutes. Finally, I watch the video with the subtitles to make sure everything reads smoothly. The entire workflow requires only 10 minutes of background run time and 5 minutes of human effort; the rest is just watching the video. In comparison, at the rate above an experienced subtitler would need about 8 hours to transcribe the same video and another 8 hours to translate it. The result would be much better subtitles, but the time difference is substantial.

The quality of the GPT-translated subtitles is adequate. It may not match an experienced subtitler's work, because the model struggles with underlying context and cultural references, but it sometimes outperforms inexperienced subtitlers like myself. Although I can occasionally improve a translation with my own insight, most of the time I could not have come up with a better one as quickly. With this project, anyone can translate content from any language into their native language, or translate their own content into a foreign language. For professional subtitlers, it can also speed up their work.
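The time comparison above can be checked with a few lines of arithmetic. The sketch below uses only the rough figures quoted in the text (1 hour of manual work per 5 minutes of video for each step, about 10 minutes of background run time, and 5 minutes of review) and leaves out the final viewing of the video; the variable names are just for illustration.

```python
VIDEO_MINUTES = 40

# Manual estimate from the text: 1 hour per 5 minutes of video, for each step.
manual_transcribe_hours = VIDEO_MINUTES / 5
manual_translate_hours = manual_transcribe_hours
manual_total_minutes = (manual_transcribe_hours + manual_translate_hours) * 60

# Assisted workflow from the text (the final watch-through is not counted).
background_minutes = 10  # Whisper transcription plus translation, rounded up
review_minutes = 5       # skimming and fixing mistranslations
assisted_total_minutes = background_minutes + review_minutes

speedup = manual_total_minutes / assisted_total_minutes

print(f"Manual:   {manual_total_minutes:.0f} minutes")    # 960 minutes (16 hours)
print(f"Assisted: {assisted_total_minutes} minutes")      # 15 minutes
print(f"Ratio:    about {speedup:.0f}x")                  # about 64x
```

Even counting only the 5 minutes of human effort against 16 hours of manual work, these are rough estimates for one video, so the ratio should be read as an order of magnitude rather than a measured benchmark.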

📊 Impact and Recognition:

Efficiency Highlight: GPT Subtitler dramatically reduces the time and effort required for subtitling, making the process more accessible and efficient for both casual users and professionals. By automating a significant portion of the work, it enables faster content localization and opens up new opportunities for language learning.

Cost-Effective: By significantly reducing the time and effort required for subtitling, GPT Subtitler can help users and businesses save on subtitling costs. As demonstrated in the use case, translating a 40-minute video using Claude-Haiku costs only ~$0.10, making it an affordable solution for those with limited budgets or those who need to subtitle large volumes of content.

Flexibility: GPT Subtitler offers a flexible workflow that allows users to customize the translation process according to their needs. Users can adjust the prompt and few-shot examples, add additional context for the subtitles, refine translations, and integrate the tool into their existing subtitling pipelines.

Educational Value: For language learners, GPT Subtitler provides an opportunity to engage with foreign language content more easily. By generating subtitles in their native language, the tool can help learners improve their comprehension and vocabulary acquisition.

Continuous Improvement: As the underlying language models and technologies advance, GPT Subtitler has the potential to continuously improve its translation quality and efficiency. This ensures that users can benefit from the latest advancements in natural language processing and machine translation.

🔗 Relevant Links:

GPT Subtitler