A RAG system that automatically collects KakaoTalk consultations at tax and accounting firms and lets AI search and summarize them. We built the full pipeline so that each day's consultations stop disappearing and instead ground the next answer.
A tax firm's consultation data accumulates daily, and disappears just as quickly.
Threads completed inside the KakaoTalk Business Center were effectively unsearchable. A similar question arriving next month forced the staff to start from scratch every time, with no shared record of who answered what.
General-purpose tools like ChatGPT couldn't hold tax-specific context. The real need was to bring back 'the answer this firm already gave' — fast.
We pulled consultations into the server in one click, ran them through a live embedding pipeline, and layered a response engine on top that combines multiple search strategies.
A Chrome extension collects threads from the Business Center screen with one click and ships them to our server. Each record flows through SQS → Lambda → Cohere embeddings → pgvector and becomes searchable the moment it lands.
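The SQS → Lambda step can be sketched roughly as follows. This is a minimal illustration, not the production code: `make_handler`, `embed`, `store`, and the message fields `thread_id` and `text` are hypothetical names chosen for the example. In a real deployment `embed` would presumably wrap the Cohere embedding call and `store` would insert into a pgvector-backed table; here both are injected so the sketch runs without any cloud dependencies. The only part taken from the platform itself is the event shape: Lambda receives SQS messages in batches under `Records`, each with the raw message in `body`.

```python
import json
from typing import Callable, Dict, List, Sequence


def make_handler(
    embed: Callable[[List[str]], List[Sequence[float]]],
    store: Callable[[str, str, Sequence[float]], None],
):
    """Build a Lambda-style handler with embedding and storage injected.

    Both callables are assumptions for this sketch; swap in real clients.
    """

    def handler(event: Dict, context=None) -> Dict:
        # SQS delivers messages to Lambda in batches under event["Records"];
        # each record carries the raw message as a string in "body".
        # Assumed (hypothetical) body shape: {"thread_id": ..., "text": ...}
        messages = [json.loads(r["body"]) for r in event.get("Records", [])]
        if not messages:
            return {"stored": 0}

        # Embed the whole batch in one call, then store row by row.
        vectors = embed([m["text"] for m in messages])
        for msg, vec in zip(messages, vectors):
            store(msg["thread_id"], msg["text"], vec)
        return {"stored": len(messages)}

    return handler


if __name__ == "__main__":
    saved = []
    # Stand-in embedding: one dimension holding the text length.
    fake_embed = lambda texts: [[float(len(t))] for t in texts]
    fake_store = lambda tid, text, vec: saved.append((tid, text, list(vec)))

    handler = make_handler(fake_embed, fake_store)
    event = {"Records": [{"body": json.dumps({"thread_id": "t1", "text": "VAT filing"})}]}
    print(handler(event), saved)
```

Injecting the two callables keeps the handler testable offline and makes the batch boundary explicit: one embedding request per SQS batch rather than one per message.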
Retrieval doesn't lean on a single query. Multi-query rewrites the question into several variants, MMR (maximal marginal relevance) diversifies the results, and hybrid keyword-plus-vector search surfaces the answers. As a result, 'similar cases' arrive as the firm's cumulative knowledge rather than a single hit.
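Two of these strategies are easy to show in miniature. The sketch below is illustrative only and makes assumptions the text does not confirm: it merges keyword and vector rankings with reciprocal rank fusion (RRF), which is one common way to build hybrid search but not necessarily the one used here, and it applies MMR with plain cosine similarity. The function names, the constant `k = 60`, and the toy vectors are all chosen for the example. The multi-query step is left out because it needs an LLM call; its per-variant result lists could be merged with the same `rrf_fuse` function.

```python
import math
from typing import Dict, List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def rrf_fuse(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Reciprocal rank fusion: each list adds 1 / (k + rank) per document."""
    scores: Dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


def mmr(
    query_vec: Sequence[float],
    candidates: Dict[str, Sequence[float]],
    top_k: int = 3,
    lam: float = 0.5,
) -> List[str]:
    """Maximal marginal relevance.

    Greedily picks the document maximizing
    lam * sim(query, doc) - (1 - lam) * max sim(doc, already selected).
    """
    remaining = dict(candidates)
    selected: List[str] = []
    while remaining and len(selected) < top_k:

        def score(doc_id: str) -> float:
            relevance = cosine(query_vec, remaining[doc_id])
            redundancy = max(
                (cosine(remaining[doc_id], candidates[s]) for s in selected),
                default=0.0,
            )
            return lam * relevance - (1.0 - lam) * redundancy

        best = max(sorted(remaining), key=score)
        selected.append(best)
        del remaining[best]
    return selected


if __name__ == "__main__":
    # Hybrid: fuse a vector ranking with a keyword ranking.
    print(rrf_fuse([["a", "b", "c"], ["b", "c", "d"]]))

    # MMR: "a" and "b" are near-duplicates, "c" is a different angle.
    docs = {"a": [0.9, 0.1], "b": [0.9, 0.12], "c": [0.5, -0.5]}
    print(mmr([1.0, 0.0], docs, top_k=2, lam=0.5))
    print(mmr([1.0, 0.0], docs, top_k=2, lam=1.0))
```

With `lam=0.5` the second pick skips the near-duplicate `b` in favor of `c`; with `lam=1.0` MMR degenerates to pure relevance ranking and returns both near-duplicates. That trade-off is what lets a result set read as several past cases instead of one case repeated.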
Pulls Business Center consultations into the server with one click — no manual filing.
Ingestion, embedding, and retrieval are all serverless, so cost tracks traffic.
Vector search on top of Postgres. The team can use familiar SQL alongside it.
Cohere for embeddings, GPT for response generation. Multi-query + MMR + hybrid search combined.