The AI agent needs to send emails. Not forward them, not summarize them — actually compose and send them based on context. A customer emails about a delayed order? The agent should check the order status, draft an appropriate response, and send it (or queue it for my approval).
Building this took me three iterations. The first was embarrassingly bad. The second was functional but creepy. The third is what I actually use.
Iteration 1: The Overconfident Email Bot
My first attempt gave the agent full email access with simple instructions: “Monitor the inbox, respond to customer emails based on context.” It worked technically — it read emails, generated responses, and sent them.
The problem: it sent a response to an angry customer complaint that started with “I understand your frustration.” The customer was not frustrated — she was mildly annoyed about a small billing discrepancy. The agent escalated a minor issue into an emotional response that made the customer actually frustrated.
Lesson: AI agents are bad at reading emotional tone in emails. They default to the most dramatic interpretation and respond accordingly.
Iteration 2: The Over-Cautious Email Bot
After the first disaster, I swung too far in the other direction. The agent drafted emails but never sent them — everything went to an approval queue. I had to review and approve every single response.
This created more work than doing it manually. Instead of reading the email and writing a response, I was reading the email, reading the agent’s draft, deciding if the draft was appropriate, editing it 60% of the time, and then approving it. What should have been a time-saver became an extra step.
Lesson: An approval queue for every email defeats the purpose. You need selective automation.
Iteration 3: What Actually Works
The current system categorizes incoming emails and handles each category differently:
Category A: Routine and safe (auto-respond). Meeting confirmations, receipt acknowledgments, simple information requests with clear answers. The agent responds automatically. These make up about 40% of incoming emails and are almost impossible to mess up.
Category B: Standard but nuanced (draft + approve). Customer questions that require checking data, follow-up requests, moderately complex inquiries. The agent drafts a response, attaches relevant context (order status, account details), and puts it in my approval queue. I review and send with one click. Usually no edits needed. About 45% of emails.
Category C: Sensitive (flag only). Complaints, legal mentions, financial disputes, anything from an important contact. The agent flags these for my personal attention and doesn’t draft a response. About 15% of emails.
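The three categories above boil down to a simple routing function. This is a minimal sketch, not the actual implementation — `draft_reply` and `lookup_context` are hypothetical callables standing in for the LLM call and the data layer:

```python
def handle_email(email, category, draft_reply, lookup_context):
    """Route an email by category: auto-send, queue for approval, or flag.

    draft_reply and lookup_context are stand-ins for whatever LLM and
    data infrastructure you use; only the routing logic is the point here.
    """
    if category == "A":
        # Routine and safe: template-only reply, sent automatically
        reply = draft_reply(email, template=True)
        return {"action": "auto_send", "reply": reply}
    if category == "B":
        # Standard but nuanced: attach context, queue for one-click review
        context = lookup_context(email)  # e.g. order status, account details
        reply = draft_reply(email, context=context)
        return {"action": "queue_for_approval", "reply": reply, "context": context}
    # Category C: sensitive — no draft, just a flag for personal attention
    return {"action": "flag_for_human"}
```

The key design choice is that Category C produces no draft at all: a plausible-sounding draft for a sensitive email is a temptation to rubber-stamp it.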
The Classification System
The categorization is based on a set of rules in the agent’s instructions:
– Contains words like “cancel,” “refund,” “lawyer,” “disappointed” → Category C
– From a VIP contact list → Category C
– Simple question with a factual answer → Category A
– Everything else → Category B
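Those rules translate almost line-for-line into code. A sketch, assuming a keyword set and a VIP list like the ones described (the specific addresses and the `has_factual_answer` signal are illustrative, not my exact rules):

```python
# Rules checked in priority order: sensitive keywords and VIP senders
# always win, then the "simple factual question" check, then the default.
SENSITIVE_KEYWORDS = {"cancel", "refund", "lawyer", "disappointed"}
VIP_CONTACTS = {"ceo@bigclient.com"}  # hypothetical VIP contact list

def classify(sender: str, body: str, has_factual_answer: bool) -> str:
    text = body.lower()
    if any(word in text for word in SENSITIVE_KEYWORDS):
        return "C"  # sensitive language: flag only
    if sender.lower() in VIP_CONTACTS:
        return "C"  # important contact: flag only
    if has_factual_answer:
        return "A"  # simple question, clear answer: auto-respond
    return "B"      # everything else: draft + approve
```

Because the rules are ordered, an email from a VIP that also says "refund" lands in Category C either way — the predictability is the point.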
This rules-based approach is more reliable than asking the AI to “decide how important this email is.” AI judgment on email importance is inconsistent. Simple rules are predictable and debuggable.
The Email Templates Approach
For Category A auto-responses, I don’t let the agent write freeform. Instead, I provide response templates with variables:
“Meeting confirmation: Hi [name], confirmed for [date] at [time]. See you then.”
The agent fills in the variables from the email context. This eliminates the risk of the agent saying something unexpected in automatic responses. Boring? Yes. Reliable? Completely.
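One way to implement this is Python's `string.Template`: the agent supplies only the variables, and `safe_substitute` leaves any variable it failed to extract visible as `$name` in the draft instead of raising, so a bad extraction fails loudly rather than silently (the template wording below mirrors the example above):

```python
import string

# The fixed wording lives in the template; the agent only fills variables.
MEETING_CONFIRMATION = string.Template(
    "Hi $name, confirmed for $date at $time. See you then."
)

def fill_template(template: string.Template, fields: dict) -> str:
    # safe_substitute: a missing key stays as a visible $placeholder
    return template.safe_substitute(fields)
```

For example, `fill_template(MEETING_CONFIRMATION, {"name": "Dana", "date": "March 3", "time": "2pm"})` yields `"Hi Dana, confirmed for March 3 at 2pm. See you then."`, while omitting `time` leaves a conspicuous `$time` for a human to catch.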
For Category B drafts, the agent has more freedom but follows structural guidelines: acknowledge the question, provide the relevant information, offer next steps, close professionally. The drafts are consistently good because the structure is constrained even when the content varies.
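One simple way to enforce that structure is to bake it into the drafting prompt itself, so the content varies while the shape does not. The prompt text here is an illustrative assumption, not my exact wording:

```python
# Four-part structure for Category B drafts, encoded in the prompt:
# acknowledge, inform, offer next steps, close.
DRAFT_GUIDELINES = """\
Write a reply with exactly this structure:
1. Acknowledge the customer's question in one sentence.
2. Provide the relevant information: {facts}
3. Offer one concrete next step.
4. Close professionally.
Do not speculate beyond the facts provided."""

def build_draft_prompt(facts: str) -> str:
    # facts: the pre-looked-up context (order status, account details)
    return DRAFT_GUIDELINES.format(facts=facts)
```

Feeding the model pre-verified facts and a numbered structure constrains the two failure modes that matter most: invented details and rambling replies.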
What I Learned About AI and Email
AI is good at: Extracting information from emails (dates, names, requests), looking up relevant data (order status, account history), and generating structurally sound responses.
AI is bad at: Reading emotional subtlety, understanding relationships and politics, knowing when to CC someone, and deciding the right level of formality for a given recipient.
The ideal split: Let the AI handle the information processing (what is this email about? what data is relevant?) and the drafting (write a response with these facts). Keep the human in the loop for tone, judgment, and send/no-send decisions on anything beyond routine correspondence.
The Numbers
Before the email agent: I spent about 90 minutes per day on email.
After the email agent: about 35 minutes per day.
The 55-minute savings comes from: auto-responses handling routine emails (20 minutes saved), faster processing of draft reviews compared to writing from scratch (25 minutes saved), and context pre-loading so I don’t have to look things up manually (10 minutes saved).
That’s 4.5 hours per week recovered. For $0.20/day in API costs, it’s one of the highest-ROI automations I’ve built.
Originally published: December 31, 2025