For Video objects, the current AI does a great job of capturing the words spoken and converting them into a transcription. However, it really, REALLY struggles with punctuation and capitalization. A typical 3-4 minute video yields a transcription riddled with excess commas, randomly capitalized words like "button," and a near-complete lack of periods.
This requires a tedious read-through and correction pass, which gets much worse with longer videos. The issue is further exacerbated by being forced to edit the transcription inside tiny line-by-line fields, rather than being able to just upload, say, a TXT file that I could easily edit in an outside editor. (Yes, I know I can upload a VTT subtitle file, but apparently that can only be used for subtitles/captions, not transcription...)
Can you improve the AI so it better understands natural speaking and punctuation placement?
Hello!
I am happy to report that we have improved Pinpoint by implementing the most modern API available from our transcription service provider, and the improvement is available to all clients.