Blog Post

5 min read Published: 2025-06-23

In a world where email overload is a daily battle, having an automated assistant to triage your inbox can save hours of unnecessary reading. The Smart Email Summarizer is a Python script designed precisely for this purpose. It connects to your email account, fetches your latest messages, and distills them into clear, concise summaries using AI. This helps you focus on what matters without wading through walls of text.

IMG-1-smart-email-summary-flow

This article breaks down the script’s architecture, shows how each part works, and explores how you can adapt it to your own workflows. Whether you're a Python developer looking to build smarter automations or a productivity-minded professional curious about integrating AI into your day-to-day tasks, this walkthrough aims to answer your key questions and spark practical inspiration.

What the Script Does (and Why It’s Useful)

At its core, the script connects to any IMAP-compatible email server (like Gmail), retrieves a defined number of emails, extracts useful metadata (sender, subject, content), and summarizes the body text using either:

  • A transformer model from Hugging Face (facebook/bart-large-cnn)

  • OpenAI's gpt-3.5-turbo model, called through the OpenAI API

You can configure it to process only unread emails, mark them as read afterward, and optionally save summaries to a file. This allows for:

  • Quick inbox triage without opening every message

  • A fast overview for executive or team assistants before they forward emails

  • A building block that developers can embed in broader workflows

IMG-2-script-architecture

Breakdown of Core Components

1. IMAP Authentication

mail = imaplib.IMAP4_SSL(server)
mail.login(email_user, password)

The script uses Python’s imaplib to open an SSL-encrypted connection to the mail server. It prompts for credentials interactively, reads the password without echoing it, and asks again if authentication fails. App-specific passwords (e.g., from Gmail) work as well.
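For unattended runs, an interactive prompt is not always practical. The sketch below is one possible variation, not part of the script: it looks for the password in an environment variable first and only falls back to the prompt when the variable is missing. The function name resolve_password and the variable name EMAIL_APP_PASSWORD are assumptions made for illustration.

```python
import getpass
import os


def resolve_password(env_var: str = "EMAIL_APP_PASSWORD") -> str:
    """Return the password from env_var if it is set, otherwise prompt without echo."""
    value = os.environ.get(env_var, "").strip()
    if value:
        return value
    # Same prompt text as the script's interactive mode.
    return getpass.getpass("App password or IMAP password: ").strip()
```

The returned value could then be passed to mail.login in place of the prompted password.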

2. Fetching Emails

status, messages = mail.search(None, "UNSEEN")

Once authenticated, it selects the inbox and searches for unread (or all) emails. It limits results to the N most recent, based on user input. The mail.fetch command then retrieves the full content of each email.
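To make the "N most recent" step concrete, here is a minimal sketch of how the ID list returned by mail.search can be trimmed. The helper name latest_ids is hypothetical, and the sketch assumes the server lists message IDs in ascending order, which is typical but worth verifying for your provider.

```python
from typing import List


def latest_ids(search_data: List[bytes], n: int) -> List[bytes]:
    """Return the last n message IDs from an imaplib search() result."""
    if n <= 0 or not search_data or not search_data[0]:
        return []
    # imaplib returns all IDs as a single space-separated bytes string.
    ids = search_data[0].split()
    return ids[-n:]


sample = [b"1 2 3 4 5 6 7"]
print(latest_ids(sample, 3))
```

Guarding against n <= 0 matters here because a slice such as ids[-0:] returns the whole list rather than an empty one.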

3. Parsing Email Content

msg = email.message_from_bytes(raw_email)

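The script uses Python’s built-in email library to decode headers and extract content. For multipart messages it takes the first text/plain part that is not an attachment and falls back to the first text/html part when no plain-text part exists; single-part messages are read directly. HTML bodies are passed to the summarizer as raw markup, so stripping tags first may improve results. Decoding errors are ignored rather than raised, which keeps parsing robust at the cost of occasionally dropping characters.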
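The parsing step can be tried without a mail server by building a message in memory. The sketch below is an illustration under assumptions, not the script's own code: first_part and html_to_text are hypothetical helpers, and the tag stripping uses the standard library's html.parser, which is a simple approach that does not filter out script or style content.

```python
import email
from email.message import EmailMessage
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect the text nodes of an HTML document, ignoring tags."""

    def __init__(self) -> None:
        super().__init__()
        self.chunks = []

    def handle_data(self, data: str) -> None:
        self.chunks.append(data)


def html_to_text(html: str) -> str:
    """Strip tags and collapse whitespace."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(" ".join(parser.chunks).split())


def first_part(msg, content_type: str) -> str:
    """Return the decoded payload of the first non-attachment part of this type."""
    for part in msg.walk():
        if part.get_content_type() != content_type:
            continue
        if "attachment" in str(part.get("Content-Disposition")):
            continue
        payload = part.get_payload(decode=True)
        if payload is None:
            continue
        charset = part.get_content_charset() or "utf-8"
        return payload.decode(charset, errors="ignore")
    return ""


# Build a multipart/alternative message in memory to exercise the helpers.
demo = EmailMessage()
demo["From"] = "alice@example.com"
demo["Subject"] = "Quarterly report"
demo.set_content("Plain body")
demo.add_alternative("<p>HTML <b>body</b></p>", subtype="html")

parsed = email.message_from_bytes(demo.as_bytes())
print(first_part(parsed, "text/plain").strip())
print(html_to_text(first_part(parsed, "text/html")))
```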

4. Summarization Engines

IMG-3-summarization-methods-compare

You can choose between two summarization methods:

  • Transformer-based: Uses Hugging Face’s pipeline with facebook/bart-large-cnn. It runs locally and is free to use after a one-time model download, and it works best on short to moderate-length text.
summary = summarizer(text, max_length=100, min_length=20, do_sample=False)
  • OpenAI GPT-3.5: Requires an API key and sends the email body to OpenAI. It typically copes better with longer or more complex text. The call below uses the interface of the pre-1.0 openai Python SDK.
response = openai.ChatCompletion.create(...)

The script also includes retry logic: if a summarization attempt fails, it tries once more with the same method before reporting the summary as unavailable.
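The retry idea can be sketched independently of any particular model. The version below is a variation rather than the script's approach: it catches exceptions instead of checking for a placeholder string, and summarize_with_retry and flaky_summarizer are hypothetical names used only for this demonstration.

```python
import logging
from typing import Callable


def summarize_with_retry(
    summarize: Callable[[str], str], text: str, attempts: int = 2
) -> str:
    """Call summarize(text), retrying on any exception up to `attempts` times."""
    for attempt in range(1, attempts + 1):
        try:
            return summarize(text)
        except Exception as exc:
            logging.warning("Summarization attempt %d failed: %s", attempt, exc)
    return "[Summary unavailable]"


# Hypothetical summarizer that fails on its first call, then succeeds.
calls = {"count": 0}


def flaky_summarizer(text: str) -> str:
    calls["count"] += 1
    if calls["count"] == 1:
        raise RuntimeError("temporary failure")
    return text[:20]


print(summarize_with_retry(flaky_summarizer, "Meeting moved to Friday at 10am."))
```

Passing the summarizer in as a callable keeps the retry policy separate from the choice of engine, so the same wrapper could serve either method.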

5. Output and Logging

Summaries are printed and optionally written to a user-specified file. Logging statements give visibility into each step: authentication, fetching, parsing, summarization, and error handling.

6. Email Management

Optionally, the script can mark emails as read after summarizing them, using IMAP flags.
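How messages are fetched affects their read state: a plain RFC822 fetch makes the server set the \Seen flag, while BODY.PEEK[] leaves it alone. The sketch below separates fetching from marking so the flag changes only when requested. FakeMailbox is a hypothetical stand-in for imaplib.IMAP4_SSL so the example runs offline; the helper names are also assumptions.

```python
from typing import Any, List, Tuple


def fetch_without_marking(mail: Any, message_id: bytes) -> bytes:
    """Fetch a raw message via BODY.PEEK[], which leaves the Seen flag unset."""
    status, data = mail.fetch(message_id, "(BODY.PEEK[])")
    if status != "OK" or not data or not isinstance(data[0], tuple):
        raise RuntimeError(f"Fetch failed for message {message_id!r}")
    return data[0][1]


def mark_seen(mail: Any, message_id: bytes) -> bool:
    """Add the Seen flag to one message; return True if the server accepted it."""
    status, _ = mail.store(message_id, "+FLAGS", "\\Seen")
    return status == "OK"


class FakeMailbox:
    """Hypothetical stand-in for imaplib.IMAP4_SSL that records calls."""

    def __init__(self) -> None:
        self.calls: List[Tuple[str, ...]] = []

    def fetch(self, message_id: bytes, parts: str):
        self.calls.append(("fetch", parts))
        return "OK", [(b"1 (BODY[] {13}", b"Subject: hi\r\n"), b")"]

    def store(self, message_id: bytes, command: str, flags: str):
        self.calls.append(("store", command, flags))
        return "OK", [b"1 (FLAGS (\\Seen))"]


box = FakeMailbox()
raw = fetch_without_marking(box, b"1")
print(raw, mark_seen(box, b"1"), box.calls)
```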

Answers to Common Questions

How secure is this? The script runs locally and connects to the mail server over SSL/TLS. Your password is read at the prompt and is not written to disk. Email content leaves your machine only if you choose OpenAI, in which case only the email body is sent to the API.

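Will it alter my inbox? It should not, unless you choose to mark emails as read. This relies on fetching messages with BODY.PEEK[]: a plain RFC822 fetch makes the server set the \Seen flag as a side effect, so messages would appear read even if you declined that option.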

Can I customize it? Yes. You can:

  • Switch models

  • Add keyword filters

  • Extend it to post summaries to Slack or a dashboard
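As an illustration of the keyword-filter idea, the sketch below keeps only emails whose subject or body mentions one of a few terms. The function name matches_keywords and the sample data are hypothetical; in the script, a check like this could run before summarization to skip irrelevant messages.

```python
from typing import Iterable, List, Tuple


def matches_keywords(subject: str, body: str, keywords: Iterable[str]) -> bool:
    """Return True if any non-empty keyword occurs in the subject or body."""
    haystack = f"{subject}\n{body}".casefold()
    return any(kw.casefold() in haystack for kw in keywords if kw)


# Hypothetical (subject, body) pairs standing in for parsed emails.
inbox: List[Tuple[str, str]] = [
    ("Invoice #1042 overdue", "Please settle the attached invoice."),
    ("Lunch on Friday?", "Anyone up for noodles?"),
    ("URGENT: server down", "The staging box is unreachable."),
]
keywords = ["invoice", "urgent"]

selected = [subject for subject, body in inbox if matches_keywords(subject, body, keywords)]
print(selected)
```

Matching is case-insensitive and substring-based, which is simple but can over-match; word-boundary matching would be a stricter alternative.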

What are the limitations?

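  • Transformer models have a context limit: facebook/bart-large-cnn accepts roughly 1,024 input tokens, so long emails need to be truncated or split first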

  • OpenAI usage depends on internet connectivity and API limits

  • Complex HTML or encoding issues may occasionally cause missing content
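One common way to work around the context limit is to split a long body into pieces, summarize each piece, and join the results. The sketch below shows only the splitting step. It counts words as a rough proxy for tokens, which is an assumption: real token counts depend on the model's tokenizer, so the 400-word default is a conservative guess rather than a measured value.

```python
from typing import List


def chunk_words(text: str, max_words: int = 400) -> List[str]:
    """Split text into consecutive chunks of at most max_words words each."""
    if max_words <= 0:
        raise ValueError("max_words must be positive")
    words = text.split()
    return [
        " ".join(words[i : i + max_words]) for i in range(0, len(words), max_words)
    ]


long_text = " ".join(f"w{i}" for i in range(1000))
chunks = chunk_words(long_text, 400)
print([len(chunk.split()) for chunk in chunks])
```

Each chunk could then be passed to the summarizer separately, at the cost of losing context that spans chunk boundaries.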

IMG-4-inbox-to-clarity

Why This Matters

This tool reflects a powerful, growing trend: blending traditional scripting with modern AI to solve daily annoyances. Reading and prioritizing emails shouldn’t take hours, and it no longer has to.

For developers, this script is a template for smart automation not just in email but in any workflow involving unstructured text. For productivity-minded professionals, it's a glimpse into how accessible and actionable AI is becoming.

The Smart Email Summarizer isn't just a tool. It is an example of how to think, build, and work smarter with code.

Want to build on this? Consider scheduling it to run daily, turning it into a web app with Flask, or integrating with your company’s communication tools.

Get the Code

You can find the full source code, setup instructions, and contribution guidelines on GitHub: Smart Email Summarizer GitHub Repository

Python-Code

import imaplib
import email
import os
import logging
from email.header import decode_header
from typing import List, Tuple, Optional
from transformers import pipeline
import openai
import getpass

# Configure logging
logging.basicConfig(
    level=logging.INFO, format="%(asctime)s - %(levelname)s - %(message)s"
)

def authenticate_imap(server: str, email_user: str, password: str) -> imaplib.IMAP4_SSL:
    """Authenticate to IMAP server."""
    try:
        mail = imaplib.IMAP4_SSL(server)
        mail.login(email_user, password)
        logging.info("Authenticated successfully.")
        return mail
    except imaplib.IMAP4.error as e:
        logging.error(f"IMAP authentication failed: {e}")
        raise

def fetch_emails(
    mail: imaplib.IMAP4_SSL, n: int = 5, unread_only: bool = True
) -> List[Tuple[bytes, bytes]]:
    """Fetch the N most recent emails, optionally restricted to unread ones."""
    mail.select("inbox")
    if unread_only:
        status, messages = mail.search(None, "UNSEEN")
    else:
        status, messages = mail.search(None, "ALL")
    if status != "OK":
        logging.error("Failed to fetch emails.")
        return []
    email_ids = messages[0].split()
    email_ids = email_ids[-n:]
    fetched = []
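    for eid in email_ids:
        # BODY.PEEK[] returns the full message without setting the \Seen flag;
        # a plain RFC822 fetch would mark each message as read as a side effect.
        status, msg_data = mail.fetch(eid, "(BODY.PEEK[])")
        if status == "OK" and msg_data and isinstance(msg_data[0], tuple):
            fetched.append((eid, msg_data[0][1]))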
    return fetched

def extract_email_content(raw_email: bytes) -> Tuple[str, str, str]:
    """Extract sender, subject, and body from raw email."""
    msg = email.message_from_bytes(raw_email)
    sender = msg.get("From", "")
    # A Subject header can be split into several encoded chunks; decode and join them all.
    subject = ""
    for chunk, encoding in decode_header(msg.get("Subject", "")):
        if isinstance(chunk, bytes):
            chunk = chunk.decode(encoding or "utf-8", errors="ignore")
        subject += chunk
    body = ""
    if msg.is_multipart():
        for part in msg.walk():
            content_type = part.get_content_type()
            content_disposition = str(part.get("Content-Disposition"))
            if content_type == "text/plain" and "attachment" not in content_disposition:
                charset = part.get_content_charset() or "utf-8"
                body = part.get_payload(decode=True).decode(charset, errors="ignore")
                break
        else:
            # Fallback to HTML
            for part in msg.walk():
                if part.get_content_type() == "text/html":
                    charset = part.get_content_charset() or "utf-8"
                    body = part.get_payload(decode=True).decode(
                        charset, errors="ignore"
                    )
                    break
    else:
        content_type = msg.get_content_type()
        if content_type == "text/plain":
            charset = msg.get_content_charset() or "utf-8"
            body = msg.get_payload(decode=True).decode(charset, errors="ignore")
        elif content_type == "text/html":
            charset = msg.get_content_charset() or "utf-8"
            body = msg.get_payload(decode=True).decode(charset, errors="ignore")
    return sender, subject, body

def summarize_text(
    text: str, method: str = "transformer", openai_api_key: Optional[str] = None
) -> str:
    """Summarize text using transformer or OpenAI API."""
    if method == "openai" and openai_api_key:
        openai.api_key = openai_api_key
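        try:
            # openai.ChatCompletion is the interface of the pre-1.0 openai SDK;
            # newer SDK versions expose client.chat.completions.create instead.
            response = openai.ChatCompletion.create(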
                model="gpt-3.5-turbo",
                messages=[
                    {"role": "system", "content": "Summarize the following email:"},
                    {"role": "user", "content": text},
                ],
                max_tokens=100,
            )
            return response.choices[0].message["content"].strip()
        except Exception as e:
            logging.error(f"OpenAI API error: {e}")
            return "[Summary unavailable due to API error]"
    elif method == "transformer":
        try:
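            # Note: this loads the model on every call; cache the pipeline when summarizing many emails.
            summarizer = pipeline("summarization", model="facebook/bart-large-cnn")
            # truncation=True clips inputs that exceed the model's input limit instead of failing.
            summary = summarizer(
                text, max_length=100, min_length=20, do_sample=False, truncation=True
            )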
            return summary[0]["summary_text"]
        except Exception as e:
            logging.error(f"Transformer summarization error: {e}")
            return "[Summary unavailable due to summarization error]"
    else:
        return "[No summarization method available]"




def try_summarize_text(
    text: str, method: str = "transformer", openai_api_key: Optional[str] = None
) -> str:
    """Try to summarize text, retrying once if the first attempt fails."""
    summary = summarize_text(text, method=method, openai_api_key=openai_api_key)
    if summary.startswith("[Summary unavailable"):
        logging.info("First summarization attempt failed, retrying...")
        summary = summarize_text(text, method=method, openai_api_key=openai_api_key)
    return summary




def output_summary(
    sender: str, subject: str, summary: str, save_file: Optional[str] = None
) -> None:
    """Print a summary and optionally append it to save_file."""
    output = f"From: {sender}\nSubject: {subject}\nSummary: {summary}\n{'-'*40}"
    print(output)
    if save_file:
        with open(save_file, "a", encoding="utf-8") as f:
            f.write(output + "\n")

def mark_as_read(mail: imaplib.IMAP4_SSL, email_id: bytes) -> None:
    """Set the \\Seen flag on a message so it shows as read."""
    mail.store(email_id, "+FLAGS", "\\Seen")

def main() -> None:
    """Run the summarizer interactively."""
    print("Smart Email Summarizer (interactive mode)")
    server = input("IMAP server (e.g., imap.gmail.com): ").strip()
    user = input("Email address: ").strip()
    # Loop for password until authentication succeeds
    while True:
        password = getpass.getpass("App password or IMAP password: ").strip()
        try:
            mail = authenticate_imap(server, user, password)
            break
        except imaplib.IMAP4.error:
            print("Authentication failed. Please try again.")
    method = (
        input("Summarization method (transformer/openai) [transformer]: ").strip()
        or "transformer"
    )
    n = input("Number of emails to summarize [5]: ").strip()
    # Reject 0 as well: a slice like email_ids[-0:] would return every message.
    n = int(n) if n.isdigit() and int(n) > 0 else 5
    unread_only = input("Only fetch unread emails? (y/n) [y]: ").strip().lower() != "n"
    mark_read = (
        input("Mark emails as read after summarizing? (y/n) [n]: ").strip().lower()
        == "y"
    )
    save_file = input("File to save summaries (leave blank for none): ").strip() or None
    openai_api_key = None
    if method == "openai":
        # Read the key without echoing it to the terminal.
        openai_api_key = getpass.getpass("OpenAI API key: ").strip()
    try:
        emails = fetch_emails(mail, n=n, unread_only=unread_only)
        if not emails:
            logging.info("No emails found.")
            return
        for eid, raw_email in emails:
            sender, subject, body = extract_email_content(raw_email)
            if not body.strip():
                logging.warning(
                    f"Email from {sender} with subject '{subject}' has no body."
                )
                continue
            summary = try_summarize_text(
                body, method=method, openai_api_key=openai_api_key
            )
            output_summary(sender, subject, summary, save_file=save_file)
            if mark_read:
                mark_as_read(mail, eid)
    finally:
        # Always close the IMAP session, even on an early return or an error.
        mail.logout()

if __name__ == "__main__":
    main()
