You are tuned into TCS podcasts.

Hello and welcome, everybody. Today we're diving into something that I, in particular, am very, very excited about: "Hearing is Believing", a podcast on podcasts. How fantastic is that? Featuring TCS Research. I'm thrilled to be celebrating podcasts on a podcast, and I am not alone in this celebration, because I have with me Niranjan Pedanekar, Principal Scientist, and Dr. Sunil Kumar Kopparapu, Principal Scientist, TCS Research. Now, this podcast is going to be all about how podcasting is an emerging field for research in the audio space, and all about how audio analytics can be applied to it. Both our guests today are the perfect people to talk to, because their experience with AI and data analytics, especially in the field of audio and media, is sure to lend some interesting perspectives. Welcome to the show, Niranjan, and welcome to the show, Sunil.

So I'm going to dive right in because, like I said, you both are the perfect people to be talking to today, and I have a lot of questions. Podcasting is radio's millennial avatar and an incredibly savvy medium. I know this for a fact, being on the other end of things. It's easy to produce, broadcast, and consume. But I'm intrigued about the other side of this entire situation. What are some recent scientific advances that have actually led to it becoming such a viable medium of communication?

If you look at the technology, it's recording technology, audio production software, internet blogging of different kinds, because that led to podcasts: whatever you write in blogs sometimes comes in the form of audio, and there are conversations as well. So if you look at it, the technology has been there for a long time. It's been put together in an interesting manner. I think it was in 2004 that Adam Curry created something called iPodder, and that allowed radio-style broadcasts to be sent to iPods. And then iTunes took over.
They did their own stuff, and they allowed native support in the Mac OS for podcasts. Then Steve Jobs showed, in one of his famous addresses in 2006, how to make a podcast using something called GarageBand, which is provided with the Mac OS. And then people just, I mean, they just took off. By 2013, I think there were about a billion subscribers to Apple Podcasts. And by then you also had Spotify, this, that, and so on. So a lot has been built on top of what already existed. But it's just an incredibly easy way to do this. What has been added technologically is the scale: the scale at which podcasts happen these days and the scale at which things can be relayed to people; the number of people listening to podcasts, the number of podcasts which are there, the content, where it is stored, how it is stored, the various ways of processing that content and making it available to people. You must know, I mean, you must be using that speed button many times: instead of 1x, just put it on 1.5x, put it on 2x, 3x. Let's just test it. So you might have seen all that. And to make it sort of flawless, that's where most of the technology has been.

Very interesting point you just brought out there, which was also the journey, the fact that there's always been a foundation. On my end, as a creative person, I've realized that radio has kind of been the building block for it.

I think, to add to what Niranjan said, while there is no specific advance in technology as such, if I had to choose one or two, to me it would be audio codecs. These are what have enabled more data compression, so that you can pack a lot of data into a small amount of time without sacrificing the audio quality.
I think that has done one good thing: even in geographies which have, let's say, poor data transmission possibilities, because of cost or because of bandwidth, there also you can actually send this kind of data without any sacrifice in the audio quality.

How is some of the state-of-the-art work that we do in the field of audio analytics actually useful for podcasts? How do we take statistics and analytics and apply them to the medium itself?

Podcasts are essentially audio signals, and they are signals which can be analyzed in various ways. You can find a variety of things, such as whether this is music, it's a mixture of music and speech, it's speech, it's two people, four people, five people, who is talking when, and so on and so forth. What recent advances have also allowed is to take this audio and then create transcripts automatically. So you essentially take audio and generate text out of it: what is being said, who is saying it. And that kind of stuff is being used a lot. How does it matter, you see? So there are a variety of things. These days, on the one hand, you have really long podcasts, but at the same time there's an opposing force in the market as generations change. This is the generation of TL;DR, "too long; didn't read", so you have a small summary of whatever has happened. So summarization of podcasts is also an important thing. You cannot say that everyone would like podcasts to be three hours long. If there is a three-hour-long podcast, can I compress it into half an hour? Can I compress it into 10 minutes? Can I still get the right kind of things in those 10 minutes, so that I at least know what's going on in that podcast? So those are the kind of things that you can do when you find out what is going on in the podcast. So that's one problem which I think has been enabled. The other problem which has been enabled is that, as you are able to process speech, can you take that speech and convert it into someone else's voice?
Imagine a podcast with a voice like this, and you can't really go on for a really long time. Can I change it to Amitabh Bachchan's voice, for example, and can I just make it into something which is personalized? So personalization is another factor which is enabled by some of the stuff that is done in audio analysis. You can even place ads at certain points if you want to. You can place ads around it. A variety of things are possible because of the kind of analysis that is possible at this very moment.

Yeah. And I have to say thank you for that, that was lovely. In addition to the information and all of the knowledge that we just dove into, thank you so much for the sneak peek at your theatre background. It really, you know, jumped out. That's wonderful. And both of you touched upon this idea of summarization. And it's true, because with these changing attention spans, it is important that the message of the content still stays intact and reaches its audience as intended. Just taking off on the aspect of time and it being somewhere in the foreseeable future: this is definitely a newer medium as compared to others, at least in the way that it exists today. New things always come with their own challenges. Do either of you have any perspectives on some interesting problems that we're actually exploring in this field?

The one thing that comes to my mind is having what one would call the emotional voice, the empathetic voice. For example, today you have a consumer who is in a certain state of emotion. Now you cannot have a podcast being very chirpy and against his emotion. So probably there are ways in which you can detect the current emotional state of a person and, based on that, taking on what Niranjan said, you could change the emotion of the voice of the podcast. I do not know whether this is bizarre, but maybe I would like to listen to the podcast in my own voice. OK, so each of us could be listening in our own voice.
And the other thing is, today the language of the podcast is sort of fixed, but with the kind of work that we are doing today, speech translation and things like that, maybe a podcast is produced in one language, say English, for whatever reason, and then maybe I would like to listen to it in Marathi or maybe Tamil. And all this can be done technologically very fast, without much thinking. So today the technology is available, or at least it can be managed in such a way that these things can be made possible.

There are a few other problems. One thing that we are really interested in is the grammar of media. For example, a movie has a three-act structure, which was given by Aristotle years ago in his work Poetics: there are three acts, and then you use that to analyze movies. So even if you're using an AI program to do it, you also consider this kind of grammar, the insight that comes from the actual media. It comes from people who make movies. Similarly, what is the grammar of a podcast? Can we think of why people get engaged in a podcast? Can we think of why people get engaged with certain voices? What will make people resonate more with things? Because whatever we said about generating podcasts artificially, you can't consume a podcast artificially. Machines can't sit and listen to podcasts and be happy about them. It will be us who would be consuming them, however they are produced. I mean, in whichever way they are produced, we would be consuming them. And therefore knowing the human part of it, I think, is the most important thing, and it often comes at the very end of any technological advance. You work on the technology, you work on making it scalable, you work on compression algorithms, etc. But the human in the loop, the human at the center of all this, has to be acknowledged at some point in time, and that's where you find grammar.
Let me just take that a step further, because it's interesting that you talked about the grammar of certain formats and actually understanding them. Given our experience working on other media segments and formats, like video, for example, can we actually, you know, borrow from our knowledge of these segments, learn from our understanding of them, and use those to address the problems in the podcast space?

Yes, absolutely, you can do that. And a few pointers to that are that humans like tension to be created and released, tension to be created and released. That's there in music, that's there in drama, that's there in the theater, movies, etc. Everywhere. What is it in a podcast? Can you artificially raise the tension and release it? If someone is just talking like this, and just goes on and on talking like this, and there's nothing that is happening and so on, can you artificially fix it, so that there's a pause created in between, like I'm creating right now? Can I use this kind of pattern of tension and release in podcasts to make them more engaging? So these are the kind of things that we can certainly borrow from other media types.

Now, it's interesting, because no medium is really anything without its audience. So what are some of the areas that we need to improve to enhance user experience? That is my next question.

I think user experience is something that definitely can be enhanced, and there is no limit to what you can do today. You could actually want to listen to a podcast as if you're sitting in a park. What does the park mean? Probably a breeze, or the birds chirping, and the occasional footsteps of joggers in the park, and things like that could bring a freshness to the podcast. You know, the podcast technically could be whatever it is, but you may want to have the sense of experience in that form. Or maybe you're sitting by the river and then trying to actually listen to the podcast.
Maybe the breeze from the river, or the fish which are jumping in the river. You may want to get that thing, and all this happening while you're sitting at your home. OK. The experience is as if you are in the park or by the river, but actually you're sitting at your home. And this, in my opinion, can be very, very wonderful.

Yeah, powerful to allow an immersive experience, right?

Yeah. And also these things are not futuristic or something. There is a lot of current technology that allows us to mix these kinds of audio events realistically into podcasts. So things are there. Now the only thing is, whether these are something that people would want is something you need to test out. You just need to have an immersive podcast versus a non-immersive podcast and see how people respond to that. Finally, as Niranjan said, it is the consumer who decides what he wants to listen to and how he wants to listen.

There is one more point that I would like to touch upon. Can podcasts give this, I mean, similar expressive power to a wide variety of people? Such a platform is not yet there, and it probably might come in the future. Just like TikTok has been around, just like maybe Instagram has been around, where you consume things in a very short span of time. Clubhouse is one thing that I have seen which is slightly close to that: people just get together and chat. These chats could be relayed in the local community; they could be available for future consumption. It's not that they just get together and it just fizzles out after the meeting is over. So those are the kind of things where certain advances could be observed in the future.

Interesting. That's a lot of food for thought, actually, Niranjan: to give power both to the producer as well as the consumer. Yes. And again, we've very conveniently fallen into my next question, which is that podcast audiences are actually growing all the time.
Especially in my personal experience, during the past two years of the global pandemic, the podcast market really saw a boom, right? Considering both of you touched upon the idea of scalability, what does the increase in the size of the listening audience mean for both the research and scientific community?

Yeah, I think scale, of course, is something that goes without saying, which means automatic creation of rich content becomes very important. So how can you create podcasts of interest automatically, without anybody actually trying to create the content? And personalization of the content is, of course, very, very important. I think the material could be the same thing but put in a different form, which is possible through the technology that happens to exist today. And if you look at it, if everybody starts podcasting, then I think search becomes a very, very important aspect. How do you search for the information that you're looking for? Yeah, which most of the social media sites make very easy for you because of the way you like other data. Maybe they make use of that and try to project something. I think these are some things that are open problems that researchers and the scientific community will sort of try and address over a period of time.

So I think, as the extent grows, how do you control what messages are being relayed? I'm not saying you have to control them. But in what you see on Facebook, what you see on WhatsApp, there's plenty of fake news. There is plenty of misinformation around. How do you really tackle that problem when it comes to podcasts? Because I believe that the podcast is a great tool for education, and the more reach it has for education, you can do wonders. There is something called cognitive load, which is less when your video, your visual feed, is giving you something different and your audio feed is giving you something.
That's where podcasts excel. They decrease your cognitive load. You could be doing something else and you could still be listening to an entirely different thing, and, 100%, that is where the education aspect of it comes in. But it also is fraught with stuff such as the messages that we get on WhatsApp, which could be entirely fake, they could be propaganda, they could be unverified, and so on and so forth. All those things could probably be taken care of at an early stage if we go, or wish to go, in that direction, the direction of education, mass education, so to say.

Wow, I mean, both of you have brought so much to the table today. I am genuinely, I'm really excited to just think about everything that you both have spoken about today, because it is such a different perspective, especially for someone like me. I am on the creative side; I do the hosting. But to actually hear all about the thought that can go into this medium, to make it as big as it can be and as useful as it can be, was really lovely. A big thank you to both of you for taking the time and sharing so many of your insights and thoughts about podcasting and the future of podcasting. And what better way for us to celebrate podcasts than this? Thank you so much.