What I really want from the new era of machine learning we are supposedly going through is human-quality, self-hosted text-to-speech and speech-to-text, so that I can listen to text ebooks and convert long podcasts and video/audio lecture courses to text, making them easy to search and quote from. Is this it? Everything I have found so far has been either significantly worse than what a human can do or an expensive online service.