
Building HackerNews Podcast Generator with Gemini 3, Elevenlabs

This article shows how to build a simple podcast generator that turns Hacker News posts into short audio summaries using a single Tensorlake Application.


TL;DR

The article demonstrates building a podcast generator that converts Hacker News posts into audio summaries using Tensorlake as the execution runtime. The workflow includes web scraping, text cleaning, summarization via Gemini, and audio generation via ElevenLabs—all orchestrated as a single managed application rather than loosely connected scripts.

Overview

I regularly visit Hacker News but lack time to read every article. This project addresses that by collecting content, creating summaries, and converting them to audio for consumption while multitasking. The focus extends beyond individual tools to exploring how to run agent workflows cleanly and reliably as a unified execution model.

Architecture Overview

The system transforms trending links into podcast-style audio summaries through four coordinated stages:

  • Content retrieval and preparation: Links from Hacker News are fetched and converted to clean, readable text
  • Text summarization: Prepared content is sent to a language model for concise summaries suitable for audio
  • Audio generation: Summaries are converted to spoken audio
  • End-to-end execution: All steps run as one coordinated workflow with intermediate results preserved
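
Conceptually (setting aside Tensorlake's orchestration, retries, and state handling, which are covered below), the four stages compose as a simple fetch → summarize → synthesize chain. The stage functions in this sketch are stand-ins for the real Tensorlake functions described later in the article:

```python
def run_pipeline(article_urls, fetch, summarize, synthesize):
    """End-to-end flow: fetch text, summarize it, synthesize audio,
    preserving each intermediate result per article."""
    results = []
    for url in article_urls:
        text = fetch(url)           # content retrieval and preparation
        script = summarize(text)    # text summarization
        audio = synthesize(script)  # audio generation
        results.append({"url": url, "text": text,
                        "script": script, "audio": audio})
    return results
```

In the real application each stage runs as a separate durable function, so intermediate results survive retries rather than living only in local variables.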

Why Tensorlake Runs the Agent

Tensorlake provides execution-focused capabilities suited to this multi-step workflow:

  • Durable function execution: Each step runs as a durable function; if summarization fails due to a transient API error, only that function retries without re-executing prior steps like web scraping
  • Serverless orchestration with dynamic fan out: The scraping stage handles variable numbers of articles discovered at runtime
  • Built-in orchestration and state handling: Tensorlake manages ordering and data flow between steps without custom glue code
  • Native support for large inputs and outputs: The workflow handles outputs of varying sizes—full article text, summary scripts, and MP3 files—without special handling
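
Dynamic fan-out is something Tensorlake provides natively. In plain Python, the pattern it replaces looks roughly like mapping a worker over a list whose length is only known at runtime (illustration only, not the Tensorlake API; `fetch_content` here is a stub):

```python
from concurrent.futures import ThreadPoolExecutor

def fetch_content(url: str) -> str:
    # Stub standing in for the real scraper function.
    return f"content of {url}"

def fan_out(urls: list[str]) -> list[str]:
    """Process a runtime-discovered list of URLs in parallel;
    the number of branches isn't known until the list arrives."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        return list(pool.map(fetch_content, urls))
```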

Building the Podcast Generator

Prerequisites

Before starting, generate API keys for the two external services:

Generate a Gemini API Key

  1. Go to https://aistudio.google.com/
  2. Create an API key
  3. Keep it available for the next step
  4. Save it; you'll add it to the .env file in Step 2

Generate an ElevenLabs API Key

  1. Create an account at https://elevenlabs.io/
  2. Go to https://elevenlabs.io/app/developers/api-keys
  3. Generate an API key
  4. Save it; you'll add it to the .env file in Step 2

Step 1: Set Up a Virtual Environment

Create a new project folder and open it in your editor. Then create and activate a virtual environment.

python -m venv venv

Activate the environment:

On Windows:

venv\Scripts\activate

On macOS or Linux:

source venv/bin/activate

Once activated, your terminal should show that the virtual environment is in use.

Step 2: Install Dependencies

Create a requirements.txt file with the following content:

tensorlake
pydoll-python
streamlit
google-genai
requests
python-dotenv
beautifulsoup4

Install the required dependencies:

pip install -r requirements.txt

This installs Tensorlake, the headless browser dependency, and libraries for Gemini and ElevenLabs integration.

Create the .env file

Create a file named .env in the same directory with the following variable names:

GEMINI_API_KEY=PASTE_YOUR_GEMINI_API_KEY_HERE
ELEVENLABS_API_KEY=PASTE_YOUR_ELEVENLABS_API_KEY_HERE
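
For reference, python-dotenv's load_dotenv() reads this file into the process environment at startup; its core behavior is roughly this stdlib-only sketch (simplified: no quoting or variable-interpolation rules):

```python
import os

def load_env_file(path: str = ".env") -> dict:
    """Parse KEY=VALUE lines from a .env-style file into a dict and
    export them into os.environ (existing variables win)."""
    values = {}
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue  # skip blanks, comments, malformed lines
            key, _, value = line.partition("=")
            values[key.strip()] = value.strip()
            os.environ.setdefault(key.strip(), value.strip())
    return values
```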

Now, create a Python file named podcast_agent.py.

Step 3: Web Scraping with Tensorlake

Web scraping is implemented as a Tensorlake Function within a Tensorlake Application, making it a composable stage in the podcast generation workflow. The pipeline begins with fetch_hackernews_top_articles, which automatically retrieves top Hacker News articles and serves as the entry point.

For each selected article, the crawl function performs a controlled depth-first traversal from the article URL, bounded by configurable max_depth and max_links parameters. The dedicated fetch_content function executes inside a purpose-built scraper image containing Chromium and PyDoll, ensuring reliable rendering of JavaScript-heavy pages.

HTML pages are normalized into clean, readable text, while binary assets like images or PDFs are detected separately with appropriate metadata. Domain boundaries are enforced, and visited URLs are tracked to prevent redundant processing.
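
The real crawler lives in the project repository; as a hypothetical sketch, the traversal logic described above (bounded depth-first search, a domain boundary, and a visited set to prevent redundant processing) can be modeled like this, with an injected `get_links` callback standing in for actual HTTP fetching and rendering:

```python
from urllib.parse import urlparse

def crawl(start_url, get_links, max_depth=2, max_links=10):
    """Depth-first traversal bounded by max_depth and max_links.
    get_links(url) returns outbound links; real code would fetch
    and render the page with the headless browser."""
    root_domain = urlparse(start_url).netloc
    visited, order = set(), []
    stack = [(start_url, 0)]
    while stack and len(order) < max_links:
        url, depth = stack.pop()
        if url in visited or depth > max_depth:
            continue
        if urlparse(url).netloc != root_domain:
            continue  # enforce the domain boundary
        visited.add(url)
        order.append(url)
        for link in get_links(url):
            stack.append((link, depth + 1))
    return order
```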

Step 4: Summarization with Gemini

Once the crawl completes, a Tensorlake function extracts clean, readable text from scraped results. This step normalizes data and prepares it for language model input by removing empty content and consolidating text across pages.
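
One plausible shape for such a cleaning step (hypothetical; the actual function is in the repository) drops empty results and collapses whitespace before joining pages:

```python
def consolidate_text(pages: list[dict]) -> str:
    """Drop empty page results and join text across pages,
    collapsing runs of whitespace within each page."""
    chunks = []
    for page in pages:
        text = (page.get("text") or "").strip()
        if text:
            chunks.append(" ".join(text.split()))
    return "\n\n".join(chunks)
```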

The summarization step uses Gemini through the google-genai client. Cleaned article text is passed to the gemini-2.5-flash model with a prompt designed to generate a concise, podcast-style script. Input size is intentionally limited for compatibility with free or low-tier usage.

Gemini serves strictly as an external inference service, while Tensorlake manages execution, orchestration, and data flow between functions.

@function(secrets=["GEMINI_API_KEY"])
def summarize_with_gemini(clean_text: str) -> str:
    """
    Generate a podcast-style summary.
    """
    from google import genai
    import os
 
    client = genai.Client(api_key=os.environ["GEMINI_API_KEY"])
 
    prompt = f"""
    Create a short podcast-style summary of the following article.
    Keep the tone clear, neutral, and easy to listen to.
 
    Article:
    {clean_text[:6000]}
    """
 
    response = client.models.generate_content(
        model="gemini-2.5-flash",
        contents=prompt
    )
 
    return response.text

Step 5: Audio Generation with ElevenLabs

The final step converts the generated podcast script into audio using ElevenLabs. A dedicated Tensorlake function reads the text content and sends it to the ElevenLabs Text-to-Speech API using a fixed voice ID.

The model used is eleven_v3, configured with stability and similarity settings to produce clear and natural narration. The resulting MP3 audio is returned as a file object, completing the podcast generation pipeline.

@function(secrets=["ELEVENLABS_API_KEY"])
def generate_audio(script_text: str) -> File:
    """
    Convert podcast script text into audio using ElevenLabs TTS.
    """
    import os
    import requests
 
    VOICE_ID = "21m00Tcm4TlvDq8ikWAM"
 
    url = f"https://api.elevenlabs.io/v1/text-to-speech/{VOICE_ID}"
 
    headers = {
        "xi-api-key": os.environ["ELEVENLABS_API_KEY"],
        "Content-Type": "application/json",
        "Accept": "audio/mpeg",
    }
 
    payload = {
        "text": script_text,
        "model_id": "eleven_v3",
        "voice_settings": {
            "stability": 0.5,
            "similarity_boost": 0.5,
        },
    }
 
    response = requests.post(url, json=payload, headers=headers, timeout=120)
 
    if response.status_code != 200:
        raise RuntimeError(
            f"ElevenLabs TTS failed: {response.status_code} {response.text}"
        )
 
    return File(
        content=response.content,
        content_type="audio/mpeg",
    )

The complete implementation is available in the project repository: https://github.com/tensorlakeai/examples/tree/main/podcast-agent

Run the script:

python podcast_agent.py

Creating the UI

With the backend workflow in place, the next step is building the user interface using Streamlit. The UI serves as a lightweight interaction layer for the Tensorlake-powered podcast generation pipeline, automatically selecting the top article from Hacker News and exposing basic configuration options.

The interface focuses on clarity and ease of use. All core processing—crawling, summarization, and audio generation—runs inside the Tensorlake application, while the UI only triggers the workflow and displays final podcast audio with playback and download options.

Installing Streamlit

Install Streamlit in the same virtual environment:

pip install streamlit

Create app.py containing the Streamlit interface (the complete file is in the project repository), then run it:

streamlit run app.py

Deploying to the Tensorlake Cloud

Login

Authenticate with Tensorlake from your terminal:

tensorlake login

Set Secrets

The podcast agent uses external services (Gemini and ElevenLabs), so secrets must be configured.

Option A: Using Tensorlake UI

  1. Go to Agentic Apps → Secrets
  2. Add the following secrets:
    • GEMINI_API_KEY
    • ELEVENLABS_API_KEY
  3. Save the changes

Option B: Using CLI

tensorlake secrets set GEMINI_API_KEY=your_gemini_key
tensorlake secrets set ELEVENLABS_API_KEY=your_elevenlabs_key

Secrets are securely injected into the functions at runtime.

Verify that the secrets are set correctly:

tensorlake secrets list

Export Tensorlake API Key

If required for local testing or automation, export your Tensorlake API key:

export TENSORLAKE_API_KEY=tl_apiKey_xxxxxxxxx

This allows the CLI and local execution helpers to communicate with Tensorlake.

Deploy the Agent

Deploy the application using the same podcast_agent.py file:

tensorlake deploy podcast_agent.py

During deployment:

  • Tensorlake validates the application and functions
  • Container images are built
  • The agent is registered under Agentic Apps

On successful deployment, Tensorlake returns a permanent endpoint for the agent.

Invoke the Agent

Option A: Using Tensorlake UI

  1. Open Agentic Apps → Your App
  2. Click Invoke
  3. Provide input parameters

Option B: Using the API

curl https://api.tensorlake.ai/applications/podcast_agent \
-H "Authorization: Bearer $TENSORLAKE_API_KEY" \
--json '{ "url": "example_string", "max_depth": 3, "max_links": 5}'

After invocation, a Request ID will be generated.

Observe Execution (Graph & Timing)

After invocation, Tensorlake provides a full execution overview, including:

  • Observable function graph
  • Execution timing per function
  • Clear parent → child function relationships

At the end of the workflow, you have:

  • Top Hacker News articles automatically selected and processed
  • Clean, normalized text extracted from each article using the Tensorlake crawler
  • Concise podcast-style summaries generated by Gemini for each article
  • High-quality MP3 audio outputs generated using ElevenLabs
  • Clear visibility into each pipeline stage through Tensorlake's function-level execution model

Key Takeaways

  • This project demonstrates how a structured, end-to-end workflow can transform live web content into podcast-ready audio with minimal manual input
  • Tensorlake acts as the execution backbone, orchestrating article selection, crawling, summarization, and audio generation as discrete, composable functions
  • By separating source selection, scraping, summarization, and voice synthesis into individual Tensorlake functions, the pipeline remains easy to understand, debug, and extend
  • External models such as Gemini and ElevenLabs are used purely for inference, while Tensorlake manages execution boundaries, retries, and function-level isolation
  • The same architectural pattern can be extended to other automated content workflows, such as daily news podcasts, research summaries, or technical briefings

If you want to build similar execution-heavy AI workflows without managing infrastructure, explore what Tensorlake offers for running tools, preparing data, and scaling agent-style applications.

Start with the Tensorlake Applications Quickstart, experiment with the cookbooks, and see how far you can take this pattern with your own use cases.

Written by the Tensorlake Team