
Overview
We designed and launched a chat-based editor for the Captions app that lets anyone edit their videos using natural language.
With simple commands like "Add transitions", "Remove pauses" and "Add B-Roll", our video agent can create post-worthy edits in just a few chats.
As part of this product, we had to create a new design system covering a variety of message states: chats, actions, thinking states, inline media rendering, error states and more.
We borrowed existing patterns from agentic chat interfaces like Cursor and ChatGPT, but paired them with iconography and actions specific to video editing. Because the agent performs a mix of synchronous and asynchronous actions, the user can dismiss the chat and edit the video manually while the agent keeps working in the background.
We quickly learned that the agent was great for repetitive tasks, like fixing typos, adding sound effects every N seconds, and rearranging footage.