With the advent of internet TV and OTT (over-the-top) media delivery, personalizing user entertainment experiences and monetizing entertainment are becoming top priorities in the entertainment industry. In this chapter, we outline future scenarios in the entertainment industry, and discuss how we are attempting to take on some of these challenges in the Area 66 Entertainment Group at TCS Research. We present glimpses of our research work related to cognitive and affective annotation of entertainment media, and its various applications to personalization of entertainment and monetization through advertising. We also outline the challenges of a future in which artists, machines, and consumers cocreate art.
Media, media everywhere
Humans have loved entertainment since prehistory. We told stories, drew pictures, sang songs, wrote books, and played games in order to keep ourselves stimulated. Today, we live in a world surrounded by entertainment media. Apart from movies, music, TV, and news, we, as a society, seem to immerse ourselves in a deluge of social media. As we spend more time online, we will need more online entertainment, and someone has to provide it! Traditional media businesses would like to do that, but are in competition with user-generated content found on new-age media channels, such as YouTube. The industry needs to find a way to provide personalized, ubiquitous, and monetized entertainment.
Personalized, ubiquitous (and monetized) entertainment
Imagine the following scenarios:
- You are watching the iconic action movie The Matrix for the third time. Since you love the action scenes, on your demand, you are shown the action scenes and nothing else. You pay only for those scenes and not the entire movie.
- You are watching your favorite travel and cooking show featuring the late Anthony Bourdain. You vaguely remember that scene on the beach where he cooked something using coconuts as an ingredient. All you need to do is make a voice query to Alexa—beach and coconuts—and you get the specific scene from that specific episode of Parts Unknown with Anthony Bourdain.
- You are in the middle of a flight, intently watching The Martian, and a rather dull flight announcement chimes, “Twenty minutes to landing.” You would like to watch the remaining hour of the movie, but you only have twenty minutes. So you get to watch a twenty-minute summary of the rest of the movie and do not miss out.
- You are a soccer fan (in particular, a fan of FC Barcelona and Lionel Messi) and are watching the exciting El Clásico match against Real Madrid on TV. When Barcelona scores their first goal, you get an offer from the local pub to watch the next game with them at 50% off on all orders. During halftime, you get the highlights of all the sublime touches by Lionel Messi in the first half.
- It is the end of a hectic day, and you are tired. Your wearable notices your condition, and news on CNN.com that may cause you stress is automatically filtered out. In addition, Spotify plays a light classical melody in the background. Suddenly, the world does not seem as stressful.
All this becomes a possibility using newer business models, such as pay-as-you-go, pay-for-what-you-watch, and contribute-not-pay.
Sounds exciting, does it not?
Annotation is the key!
The world seems to be looking forward to such levels of personalization and ubiquitous presence of entertainment. One way to solve this problem is to annotate the content so that you know what the content is about (metadata annotation), what is going on at a given time (cognitive annotation), and how it might make the viewer feel (affective annotation). Once content carries a number of such annotations, they can be used for various personalization and monetization applications, such as recommending content, allowing rich content selection, or placing appropriate advertisements.
Area 66 is a group in TCS Research that looks at solving problems at the intersection of entertainment, data science, and behavioral science. At Area 66, we aim at building a platform for automated annotation and profiling of various types of entertainment media, such as movies, TV shows, news, music, sports broadcasts, and advertisements. As seen in Figure 1, we aim at providing a number of metadata, cognitive, and affective annotations such as emotion intensity, emotion polarity, character emotion, person, location, activity, memorability, aesthetics, cinematic, scene genre, and textual scene description. We use multimodal data such as audio, video, and text for this purpose. We use a range of image and video processing, audio processing, machine learning (including deep learning), and natural language processing (NLP) techniques to make these annotations.
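To make the idea of annotated content concrete, here is a minimal sketch of how per-scene annotations might be stored and queried, for instance to serve a "beach and coconuts" request. The `SceneAnnotation` class, its field names, the value ranges, and the `find_scenes` helper are all hypothetical illustrations, not the Area 66 platform's actual data model.

```python
from dataclasses import dataclass, field


@dataclass
class SceneAnnotation:
    """One annotated scene. All fields and scales are illustrative assumptions."""
    start_s: float                 # scene start time, in seconds
    end_s: float                   # scene end time, in seconds
    emotion_intensity: float       # assumed scale: 0 (calm) to 1 (intense)
    emotion_polarity: float        # assumed scale: -1 (negative) to 1 (positive)
    tags: set = field(default_factory=set)  # person, location, activity labels
    description: str = ""          # textual scene description


def find_scenes(annotations, query_terms):
    """Return the scenes whose tags contain every query term (case-insensitive)."""
    wanted = {term.lower() for term in query_terms}
    return [a for a in annotations if wanted <= {t.lower() for t in a.tags}]


# A made-up episode with three annotated scenes.
episode = [
    SceneAnnotation(0, 120, 0.3, 0.2, {"market", "street food"}),
    SceneAnnotation(120, 300, 0.5, 0.7, {"beach", "coconuts", "cooking"}),
    SceneAnnotation(300, 420, 0.4, 0.1, {"beach", "interview"}),
]

matches = find_scenes(episode, ["beach", "coconuts"])
```

In this sketch, a voice query would first be reduced to a list of terms; the matching scene's `start_s` and `end_s` then tell a player which segment to show.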
Applications of research
What-When-How ad placement
Currently, we encounter advertisements across entertainment media: during TV telecasts, within YouTube videos, on news sites, or even in Instagram feeds. But do they hamper our entertainment experience? There is a need to provide an ad experience that is in tune with the entertainment experience. For this, we aim at using various annotations on entertainment media to find the right locations for the right ads. For example, if we profile the emotional intensity of a movie, we may be able to locate places where ads can be placed. If we know the emotional polarity, we can control the way ads appear. Ads can either appear in content that is similar in mood, or satisfy a latent need. For instance, happy ads can appear with happy content. If we can locate a desert scene or a meal scene within an episode or a movie, we can place an ad for a soft drink right after. If we can find a scene with a car chase, we can place ads for car insurance deals.
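As a rough illustration of intensity- and polarity-driven placement, the sketch below picks ad slots at lulls in a per-scene emotional intensity profile and then chooses the ad whose mood is closest to the scene's. Treating low-intensity scenes as the best ad locations is an assumption made for this example, not a rule stated by the authors; the function names, the score scales, and the sample data are likewise hypothetical.

```python
def find_ad_slots(intensity, max_slots=2, min_gap=3):
    """Return indices of scenes after which an ad could be placed.

    Assumption: a scene that is a local minimum of emotional intensity
    is a lull, and interrupting there is least disruptive.
    """
    # Interior scenes whose intensity is no higher than either neighbour.
    candidates = [
        i for i in range(1, len(intensity) - 1)
        if intensity[i] <= intensity[i - 1] and intensity[i] <= intensity[i + 1]
    ]
    candidates.sort(key=lambda i: intensity[i])  # calmest lulls first

    chosen = []
    for i in candidates:
        # Keep slots at least `min_gap` scenes apart.
        if all(abs(i - j) >= min_gap for j in chosen):
            chosen.append(i)
        if len(chosen) == max_slots:
            break
    return sorted(chosen)


def pick_ad(scene_polarity, ads):
    """Pick the ad whose polarity score is closest to the scene's polarity."""
    return min(ads, key=lambda name: abs(ads[name] - scene_polarity))


# Made-up per-scene intensity scores (0 = calm, 1 = intense).
profile = [0.2, 0.6, 0.9, 0.3, 0.5, 0.8, 0.95, 0.4, 0.1, 0.7]
slots = find_ad_slots(profile)

# Made-up ad inventory with polarity scores (-1 = somber, 1 = happy).
inventory = {"happy_soda": 0.9, "somber_insurance": -0.4}
ad = pick_ad(0.8, inventory)
```

Matching on mood covers only the "similar in mood" strategy; satisfying a latent need (a soft drink after a desert scene) would instead key on location or activity tags.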
We also look at how ads can be rendered as a part of the aesthetic experience of the entertainment media. For this, we use deep learning and image-processing based techniques so that the ads appear as a part of the program being played.
A slightly different take on this is being able to come up with the ads and branding content automatically. Can we create or synthesize content (ads, promos, or even memes) that promotes a particular product? We aim to find an answer using a combination of natural language branding briefings, automated cross-pollination, style transfer, and generative neural networks.
Repeat or alternate viewing of entertainment media
Most entertainment content can become stale very soon. The media industry is always interested in re-monetizing such stale content. Why would you watch an average or old movie again? Why would you consume old news now? Why would you re-listen to music that has dropped off the charts? We aim at making re-consumption of entertainment content an interesting experience through a repeat content viewing platform. We utilize the aforementioned annotations to augment content with various external resources, such as reviews, tweets, quizzes, cinematic, and articles. We also aim at manipulating content or consuming it in alternative ways. For example, what if Brad Pitt’s face was replaced with yours in Fight Club? What if Jennifer Aniston sounded like you in Friends? What if the story of X-Men was presented in a nonlinear fashion? While people are rewatching content, the platform would also serve as a place for new and personalized advertisements. For example, all billboards in a repeat sports telecast could be auctioned for new advertisements, the price depending on when and where they occur in the sports telecast.
Automated or crowdsourced advertising
Advertising is a business that is closely tied to entertainment. A huge amount of money is invested in creating the right advertisements and placing them in the right places for the right people. But as the opportunities for ad placements and available eyeballs on platforms such as social media increase, there is an increasing need for creating new and widespread advertising material and for automated ways of placing it on these varied platforms.
We intend to provide artificial intelligence (AI) support to create advertisement and marketing content, not only to creative teams but also to crowds. The rise of technologies, such as generative adversarial networks (GANs) and neural style transfer, along with paradigms such as conversational advertising, is bound to bring change to traditional advertising techniques. Furthermore, data privacy regulations such as the General Data Protection Regulation (GDPR) are likely to make personal-information-based advertising more elusive. In such times, advertising might need to rely on the analysis and annotation of the entertainment content that consumers are consuming.
Automated music assessment
We also aim at assisting people in learning art. For example, using existing data on music assessments (and a tie-up with a world leader in music assessments), we aim at enabling machine-assisted assessment of learners’ music performances. Instead of employing an expensive (and sometimes idiosyncratic) human expert to assess one’s music while practicing, we hope to bring in a machine assessor to grade and point out flaws in the playing.
No one knows what the future might bring. That is why it is imperative that we take a shot at predicting it. Here are some possibilities:
- In the era of increased privacy, person-based personalization may change to content-based personalization. You profile a person by knowing more about the content that she consumes, rather than by knowing about her through personally identifiable information.
- Machines might generate entertainment content in the future. But this might replace only lower-end content such as music for promos or editing for soap operas, or short content pieces such as branding ads and memes.
- There is a chance that technologies (such as the scary Deepfake) might be able to produce libraries of actors and scenarios. That, along with ‘cinematic’ style transfer techniques, might allow artists to compose entertainment content rather than create it from scratch.
- There is a need for assisting artists to create art efficiently, while turning consumers into learners of art. This will allow artists to produce higher quality art and entertainment and consumers to produce a long tail of entertainment content. Machines will have to enable this ecosystem.
- The advertising industry has largely been a specialized industry consuming huge marketing and advertising budgets of big enterprises. It is already changing with the advent of the internet, search engines, and social media. In the future, it might get further decentralized in such a way that creativity is also democratized to an extent. A piece of the marketing pie will then be available for all those who are assisted by machines in creativity: artists, media companies, and even consumers themselves!
The possibility of machines, artists, and consumers cocreating entertainment media is exciting. But there are some obvious questions that need to be answered. Would artists have the sense of artistic freedom and uniqueness while being compared to machines that are learning to be creative? For example, would a graphic designer be comfortable with a machine-assisted ad produced by consumers? What would she do to express uniqueness in a world where machines try to catch up?
The intervention of machines in the creative world is hardly going to be a smooth ride. For example, would Martin Scorsese accept a machine-compressed 20-minute capsule of one of his painstakingly made films, like Taxi Driver? Who would own the copyright to machine-produced art and entertainment that was learned from human artistic capabilities? Would copyright laws need to be changed in the future?
Perhaps artists of the future might take the changing world into account. They might produce content in such a way that it is modular, consumption ready, monetizable, and amenable to AI enhancements.