Predicting Success of Bollywood Movies - Lal Bahadur Shastri Institute of Management, Delhi - Lal Bahadur Shastri ...

 
Lal Bahadur Shastri Institute of Management, Delhi

            LBSIM Working Paper Series
                 LBSIM/WP/2020/06

       Predicting Success of
         Bollywood Movies

                    Shivani Bali
                    August,2020
LBSIM Working Papers indicate research in progress by the author(s) and are brought out to elicit
ideas, comments, insights and to encourage debate. The views expressed in LBSIM Working Papers
are those of the author(s) and do not necessarily represent the views of the LBSIM nor its Board of
Governors.
WP/August2020/06

                                           LBSIM Working Paper
                                                Research Cell

                  Predicting Success of Bollywood Movies
                                                 Shivani Bali

Abstract

India’s entertainment economy is growing rapidly, and with half a billion people under the age of twenty-five,
the Media and Entertainment (M&E) companies of the world are taking note. With favourable
demographics and rising disposable incomes, the propensity to spend on leisure and entertainment is growing
faster than the economy itself. The Indian Hindi movie industry, popularly known as Bollywood, has reached
staggering proportions in terms of volume of business (220 billion), employment, movies produced (more than
100 in a year) and reach (more than 100 countries worldwide). The Indian film industry as a whole is the most prolific in
the world, with more than 1,000 films produced every year in more than 20 languages. Almost 90 percent of
revenues are derived from local films, leaving English and foreign films with a minority 10 percent market share.
With 3.3 billion tickets sold annually, India also has the highest number of theatre admissions. With so much at
stake and such uncertain returns, it is of commercial interest to develop a model that can predict
the success of a movie. Movies have been described as experience goods with a very short shelf life, so it is
difficult to forecast the demand for a movie. A number of parameters may influence the success of a
movie, including the time of its release, marketing, lead actors, director, producer, writer and music
director. The present study aims to develop a model based on Multiple Linear Regression that may help
in predicting the success of a movie in advance, thereby reducing some of this uncertainty.

Keywords: Predictive modeling, regression, success

Introduction
The Indian movie industry produces the maximum number of movies per year, higher than
any other country’s movie industry. However, very few movies taste commercial success.
Given the low success rate, models and mechanisms to predict reliably the ranking and / or
box office collections of a movie can help de-risk the business significantly and increase
average returns. Various stakeholders such as actors, financiers and directors can use these
predictors to make more informed decisions. The study is restricted to Bollywood movies
because of time and financial constraints. Some of the questions that can be answered
using prediction models are given below.
1. What are the major factors that determine the success or ranking of a Bollywood movie?
2. Is the budget of an Indian movie a key determinant of rank or success?
3. Do social media ratings drive the success of a movie?

Further, a DVD rental agency or a distribution house could use these predictions to determine
which titles to stock or promote respectively.

Data were taken from reliable Internet sources: the Internet Movie Database (IMDb),
Bollywoodhungama.com ratings, critics’ ratings and social media (Twitter, Facebook etc.).
Data mining and prediction techniques such as multiple linear regression were then
used to devise a model that can predict a Bollywood movie’s key success factors.
   1. Literature review
In Hollywood researchers have conducted several studies on the box office or financial
performance of movies by primarily adopting psychological or economic approaches (Litman
& Ahn, 1998). The psychological approach focused on the individual moviegoer’s decision to
choose a movie over other entertainment options and to select particular movies, whereas the
economic approach was based on mainly the economic and industrial factors that are
determined by the supply side (Litman & Ahn, 1998). Methodologically, psychological
studies primarily used survey methods, whereas the economic approach used secondary data.
In another study, Chang and Ki (2005) noted that little research
had been done on the inherent experience good property of movies, even though this
property is closely related to a movie audience’s decision-making
process.

Jack Valenti, president and CEO of the Motion Picture Association of America, once
mentioned that “…No one can tell you how a movie is going to do in the marketplace …not
until the film opens in darkened theatre and sparks fly up between the screen and the
audience” (Valenti, 1978). Movies, then, are something to be experienced, and most trade
journals and magazines of the motion picture industry support
Valenti’s claim. The experience good property has two aspects. First, movies are an
experience good because individuals choose and use movies solely for the experience and
enjoyment (Hirschman & Holbrook, 1982; Holbrook & Hirschman, 1982). This means that for
moviegoers, the consumption experience is an end in itself (Reddy et al, 1998). Second,
movies are an experience good because individuals do not know what the value of the movie
will be to them until they experience it (Shapiro & Varian, 1999).
Based on work by Litman (1983) and Sochay (1994), creative sphere, scheduling, release
pattern and marketing effort were used as categories of independent variables. However, an
explicit criterion was not suggested for this categorization. Historically, neither the creators
nor the distributors of “cultural products” have used analytics (data, statistics, predictive
modeling) to determine the likely success of their offerings. It is easy to see why most people
see the prediction of taste as an art (Thomas et al 2009). But the balance between art and
science is shifting. Today companies have unprecedented access to data and sophisticated
technology that allows even the best known experts to weigh factors and consider evidence
that was unobtainable just a few years ago. The entertainment landscape as a result is looking
for prominent features for the prediction of consumer taste.
Gaikar et al. (2015) used Twitter data to predict the success of a movie. Bhave et al. (2015)
discussed various factors that can impact the success of a movie. We live in the era of
the internet, in which people increasingly trust the sentiments expressed on social media. Mundra et al.
(2019) performed sentiment analysis on tweets to predict the success of a movie.
Since not much work has been carried out in India in this area, an attempt has been made to
develop a model which can predict the success factors of a movie. Our study differs from earlier work
in that no reported study has predicted Bollywood movie success using the variables we have taken;
moreover, the study is a longitudinal study based on two years, namely 2014 and
2016. The research attempts to review and empirically test predictors that have been adopted
by field practitioners for the purpose of predicting movie success in Bollywood.

    2. Research Methodology

   The objectives of the study were to answer the following research questions.
1. Determine the key variables that help predict the success of a Bollywood movie.
2. Determine whether budget is an important factor for the success of a Bollywood movie.
   Secondary data for two years (2014 and 2016) were collected from reliable sources: IMDb
   ratings, Bollywood Hungama ratings, social media (Twitter ratings) and critics’ ratings.
   The major parameters used in the study were:
        Dependent Variables
        1. Total Revenue (in crores) – This was the most frequently used variable in the literature (e.g.,
           Litman & Ahn, 1998). It is the collection over the entire theatrical run of the
           movie.
        2. First week box office collection – The first week box office is considered to
           be highly correlated with the total domestic box office collection. In this study, this
           variable is adopted to check whether some independent variables affect the two
           dependent variables to different degrees (Thomas et al 2005).
        Independent Variables
        1. Release date – Release date is considered important in India because many box office
           hits have been movies released on holidays or festive occasions. In fact,
           major production houses form pacts to release movies on occasions such as
           Diwali/Eid/Christmas so that their release dates do not clash. This variable is
           therefore included to see how it affects the overall success of a movie. The rationale
           is that a high-attendance-period release (e.g., Christmas/Eid/Diwali) attracts a
           bigger audience, which leads to higher box office performance. In reviewing the
           high-attendance periods, only a few periods, such as summer, were empirically
           supported by previous literature (Litman, 1998; Sochay, 1994; Wyatt, 1991).

       2. Budget of Movie (in crores) – Production costs have been considered an important
          predictor because big budgets manifest as lavish sets, locales, costumes, special
          effects etc. Most previous studies (Basuroy et al, 2003; Litman, 1982; Litman &
          Ahn, 1998; Wyatt, 1991; Lee, 2009) have also supported the importance of budget.
          The production budget data were taken from IMDb. Accurate information
          on production budgets is highly difficult to obtain because it is considered
          confidential. Therefore, some caution is required in using the production budget
          data because they are usually based on press releases by the studios or estimates
          by insiders in the industry.

       3. Genre – Based on previous research, the movies are categorized into seven
          genres: action/adventure, children/family, comedy, drama, horror,
          mystery/suspense, and sci-fi/fantasy. This listing is exhaustive and mutually
          exclusive. To code genre, this study consulted IMDb and Bollywood Hungama,
          which coded genre for all the sample movies. The comedy genre was significant
          in several studies, and the sci-fi/fantasy and horror genres were supported in other
          literature; however, the studies were not unanimous.

       4. Lead Actor / Actress – The effects of actors or star power have received
          considerable attention in the literature (Basuroy, Chatterjee & Ravid, 2003; De Silva,
          1998; Holbrook, 1999; Prag & Casavant, 1994; Wyatt, 1991). Here we have
          categorized actors into 4 categories depending on the number of movies the actor has done,
          that is, whether he is a
              a. Debut(
      6. Sequel – A sequel uses an established brand name to introduce a new product and is
         conceptualized like a brand extension (Keller 1998). The use of an established parent
         brand allows brand extensions to get customer attention and thus reduces
         marketing costs. In this study we have determined whether a movie is a sequel or
         not.

      7. Production House – The production house is an important variable in the Indian film
         industry and has been taken as one of the variables. Production houses like Yash
         Raj Films, Dharma Productions, UTV Motion Pictures and Balaji Films are
         considered some of the most successful production houses. These successful
         houses can also bear the cost of lavish sets and costumes and expensive digital
         manipulations, as they have budgets for all pre-production and post-production
         activities.

      8. Critics’ ratings – As with superstars and production houses, the effect of
          critics’ ratings has been widely tested by previous research (De Silva, 1998;
          Litman & Ahn, 1998; Sochay, 1994; Prag & Casavant, 1994). These are ratings given by
          famous movie critics. Critics assist individuals in making a movie choice,
          understanding the content of the movie, developing an initial opinion of the film,
          and communicating movie information to others (Austin 1983).
      9. IMDb ratings – IMDb is the world’s largest and best-known movie database website, with
          ratings on a 1 – 10 scale.
      10. Bollywood Hungama Ratings – Ratings given to the movies on the Bollywood Hungama
          website (1 – 10 scale).
      11. Social Media – Average of tweets for the movie on a scale of 1 – 10.
      The data for 140 movies on all the above-mentioned factors were collected from
      reliable secondary sources.
  3. Result and Analysis
      The collected data were raw, and a few missing values left the dataset
      incomplete. The data also contained some outliers. To make the data ready for
      analysis, data cleaning and data transformation were performed. Dummy variables
      were created for all the categorical variables present in the data set, as described
      below:
      A. Release Date – A release date dummy was formed, where 1 meant the movie was
         released on a festival or holiday and 0 meant no occasion.
      B. No. of movies the Actor has done – Four categories were made: Debut, Actor,
         Superstar and Female Actor. Since there were four categories, three dummy
         variables were made, namely Debut, Actor and Superstar.
      C. Genre – There were seven genre categories, so six dummy variables were made,
         namely DummyAction, DummyFamily, DummyComedy, DummyDrama,
         DummyHorror and DummyMystery.
      D. Production House – This variable was divided into five categories, where four were
         the famed production houses and the last category was "others". The four dummy
         variables were DummyYashraj, DummyDharma, DummyUtv and DummyBalaji.
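
The dummy coding described in items A to D can be sketched in a few lines of Python. This is a minimal illustration, not the author's actual procedure: the helper name dummy_code, the sample record and the column name DummyReleaseDate are hypothetical, and the omitted (baseline) categories are inferred from the dummy variables listed above.

```python
# Category lists follow the paper; the helper and the sample record are illustrative.
GENRES = ["Action", "Family", "Comedy", "Drama", "Horror", "Mystery", "SciFi"]
HOUSES = ["Yashraj", "Dharma", "Utv", "Balaji", "Others"]


def dummy_code(value, categories, baseline):
    """Return k-1 indicator columns for a k-level categorical variable.

    The baseline level gets no column of its own; it is represented
    by all indicators being zero.
    """
    if value not in categories:
        raise ValueError(f"unknown category: {value!r}")
    return {f"Dummy{c}": int(value == c) for c in categories if c != baseline}


# Hypothetical movie record (not taken from the study's data set).
movie = {"genre": "Drama", "house": "Dharma", "festival_release": True}

row = {"DummyReleaseDate": int(movie["festival_release"])}
row.update(dummy_code(movie["genre"], GENRES, baseline="SciFi"))
row.update(dummy_code(movie["house"], HOUSES, baseline="Others"))
print(row)
```

Dropping one level per variable avoids perfect collinearity with the regression intercept, which is why seven genres yield six dummies and five production-house categories yield four.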

The sampled movies earned an average revenue of ₹ 63.53 crores. The mean first
week box office collection was ₹ 28.92 crores, showing that the first
week accounted for approximately 46% of the total box office revenue. The average
budget of the sampled movies was ₹ 33.60 crores, which is approximately
53% of the average box office revenue. By genre, the sample comprised action (n = 36, 26%),
family (n = 12, 9%), comedy (n = 30, 21%), drama (n = 45, 32%), horror (n = 8, 6%),
mystery (n = 8, 6%) and sci-fi (n = 1, 1%).
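
The shares quoted above follow directly from the reported means and genre counts. The short Python check below reproduces them; the variable names are illustrative, and the shares are ratios of sample means rather than averages of per-movie ratios.

```python
# Means (in crores) and genre counts as reported in the text.
mean_total_revenue = 63.53
mean_first_week = 28.92
mean_budget = 33.60

genre_counts = {
    "action": 36, "family": 12, "comedy": 30, "drama": 45,
    "horror": 8, "mystery": 8, "sci-fi": 1,
}

n_movies = sum(genre_counts.values())

# Ratio of means, expressed as a fraction of mean total revenue.
first_week_share = mean_first_week / mean_total_revenue
budget_share = mean_budget / mean_total_revenue

# Genre composition rounded to whole percentages.
genre_pct = {g: round(100 * c / n_movies) for g, c in genre_counts.items()}

print(n_movies, round(100 * first_week_share), round(100 * budget_share))
print(genre_pct)
```

The genre counts sum to the 140 movies in the sample, and the rounded percentages match those given in the text.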
Regarding the information-source parameters, the average critics’ rating was 4.70 (1
– 10 scale), the average IMDb rating was 5.76 (1 – 10 scale) and the average Bollywood
Hungama rating was 7.02 (1 – 10 scale), considerably higher than the IMDb rating. In
the era of social media, tweets were expected to play a critical role in predicting
the success of a movie. The average tweet rating was 6.18 (1 – 10 scale). Regarding
release time, 12 of the sampled movies (9%) were released during the festival season or
during holidays. Of these 12 movies, 8 earned total revenue of
more than 200 crores. The average values of all quantitative data are presented in
Table 1.

Table 1. Descriptive Analysis

 Variables                              Mean

 First Week Box-Office Collection       28.9179
 Total Revenue (in Crores)              63.53
 Budget (in Crores)                     33.6
 Actor No of Movies                     35.97
 Tweet ratings                          6.0921
 Director No. of Movies                 6.18
 Critics Rating                         4.6939
 IMDb Rating                            5.7598
 Bollywood Hungama Rating               7.02

Regression Analysis: The results of the regression analysis with first week box office collection and
total box office collection as dependent variables are shown in Table 2. Both
models explained a significantly high proportion of variance, with R² values of
0.703 and 0.673 respectively.
Table 2. Summary of the outcome of the regression analysis

 Independent Variable                            First week   Total
 Dummy Release date                              30.027**     78.955**
 Drama                                                        13.698*
 Dharma                                          13.481*
 Budget                                          0.843**      1.638**
 Bollywood Hungama                                            4.258*
 Tweets                                          2.446**      3.499**

 N                                               140          140
 R                                               0.838        0.821
 R²                                              0.703        0.673
 Adjusted R²                                     0.69         0.661
 F                                               78.689       53.593
 Probability > F                                 0.000        0.000
*p
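
As a consistency check on the fit statistics, the adjusted R² can be recomputed from R², the sample size and the number of predictors. The sketch below assumes that each final model contains only the predictors listed for it in Table 2 (four for the first week model, five for the total model); that count is an assumption, since the paper does not state it explicitly.

```python
def adjusted_r2(r2, n, p):
    """Adjusted R-squared for a linear model with n observations and p predictors."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)


N_MOVIES = 140

# Assumed predictor counts: the variables with coefficients in each column of Table 2.
first_week_adj = adjusted_r2(0.703, N_MOVIES, p=4)
total_adj = adjusted_r2(0.673, N_MOVIES, p=5)

print(round(first_week_adj, 3), round(total_adj, 3))
```

Under that assumption the recomputed values agree with the reported adjusted R² of 0.69 and 0.661 after rounding.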
The results suggest that the decision to watch a movie is influenced by Twitter ratings. People appear
to regard Twitter ratings as a more reliable indicator of a movie’s performance
than other rating sources. First week box office success is explained by Twitter
ratings, whereas total box office success is explained by both Twitter ratings and
Bollywood Hungama ratings. As expected, budget has a positive impact on the
success of a movie. For this study, budget includes both movie-making expenses and
promotion expenses. The study also revealed that, among the various production houses,
Dharma Productions has a positive impact on the audience. It is one of the significant
predictors of first week box office success; during the first week, people are willing to
watch a movie under the Dharma Productions banner irrespective of the star cast or genre.
Finally, among all the genres, ‘Drama’ is the choice of the audience: it has a
positive impact on the total box office collection.
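
Because Table 2 does not report the regression intercepts, the coefficients cannot produce an absolute revenue forecast, but they can be used to compare two movies, since the intercept cancels in a difference. The sketch below is illustrative only: the two movie profiles are hypothetical, the function name is an assumption, and the coefficients are read as crores of revenue per unit change in each predictor.

```python
# Coefficients copied from Table 2 (significant predictors only).
FIRST_WEEK_COEFS = {
    "release_date": 30.027, "dharma": 13.481, "budget": 0.843, "tweets": 2.446,
}
TOTAL_COEFS = {
    "release_date": 78.955, "drama": 13.698, "budget": 1.638,
    "bollywood_hungama": 4.258, "tweets": 3.499,
}


def predicted_difference(coefs, movie_a, movie_b):
    """Predicted collection of movie_a minus movie_b, in crores.

    The unreported intercept cancels out, so only differences in the
    predictor values matter. Missing predictors are treated as equal to 0.
    """
    return sum(b * (movie_a.get(k, 0) - movie_b.get(k, 0)) for k, b in coefs.items())


# Hypothetical profiles: budget in crores, tweet rating on the 1 - 10 scale.
ordinary = {"release_date": 0, "budget": 30, "tweets": 6}
festival = {"release_date": 1, "budget": 40, "tweets": 7}

diff_first_week = predicted_difference(FIRST_WEEK_COEFS, festival, ordinary)
diff_total = predicted_difference(TOTAL_COEFS, festival, ordinary)
print(round(diff_first_week, 3), round(diff_total, 3))
```

Under these assumptions, a festival release with a budget 10 crores higher and a tweet rating one point higher is predicted to collect roughly 41 crores more in the first week and roughly 99 crores more in total than the otherwise identical ordinary release.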
   6. Conclusion & Future Scope
This research study proposes models for measuring the success of Bollywood
movies. Two models were proposed: one to predict first week box office collection and one to predict total
box office collection. Regression analysis was performed on the data set. The study revealed
that Release date, Dharma Productions, Budget and Tweets significantly impacted first week
box office success. For total box office collection, the significant variables were Release
date, Drama, Budget, Bollywood Hungama ratings and Tweet ratings. The key contribution
of this research is to give direction to film makers regarding the significant drivers of
a movie’s success. In future work, data for more recent years can be added. Also, advanced
machine learning techniques can be applied to the data set for better predictability of the
results.

References
   1.    Austin, B. A. (1983). A longitudinal test of the taste culture and elitist hypotheses. Journal of Popular
         Film and Television, 11, 157–167.
   2.    Basuroy, S., Chatterjee, S., & Ravid, S. A. (2003). How critical are critical reviews? The box office
         effects of film critics, star power, and budget. Journal of Marketing, 67(4), 103–117.
   3.    Bhave Anand, Kulkarni Himanshu ; Biramane Vinay ; Kosamkar Pranali (2015). Role of different
         factors in predicting movie success Pervasive Computing (ICPC), 2015 International Conference, DOI:
         10.1109/PERVASIVE.2015.7087152, IEEE
   4.    Chang, B. H., & Ki, E. J. (2005). Devising a practical model for predicting theatrical movie success:
         Focusing on the experience good property. Journal of Media Economics, 18(4), 247–269.
   5.    De Silva, I. (1998). Consumer selection of motion pictures. In B. R. Litman (Ed.), The motion picture
         mega-industry (pp. 144–171). Needham Heights, MA: Allyn & Bacon.
   6.    Gaikar D. Damodar and Marakarkandy Bijith, Dasgupta Chandan (2015), Using Twitter data to predict
         the performance of Bollywood movies, Industrial Management & Data Systems, Vol. 115 No. 9, 2015,
         pp. 1604-1621
   7.    Hirschman, E. C., & Holbrook,M. B. (1982). Hedonic consumption: Emerging concepts, methods and
         propositions. Journal of Marketing, 46, 92–101.
   8.    Holbrook,M. B., & Hirschman, E. C. (1982). The experiential aspects of consumption: Consumer
         fantasies, feelings and fun. Journal of Consumer Research, 9, 132–140.
   9.    http://www.hollywoodreporter.com/news/indian-entertainment-biz-revenues-reach-270337 accessed on
         8th July 2016 at 4.53 p.m
   10.   http://www.pwc.in/en_IN/in/assets/pdfs/PwC- Indian-Entertainment-and-Media-Outlook- 2009.pdf.
   11.   Keller, K. L. (1998). Strategic brand management. Upper Saddle River, NJ: Prentice Hall.
   12.   Lee, K. J., & Chang, W. (2009). Bayesian belief network for box office performance: A case study of
         Korean movies. Expert Systems with Applications, 36(1), 280–291.
   13.   Levin, A. M., Levin, I. P., & Heath, C. E. (1997). Movie stars and authors as brand names: Measuring
         brand equity in experiential products. Advances in Consumer Research, 24, 175–181.
14. Litman, B. R. (1982). Decision-making in the film industry: The industry of the TV market. Journal of
    Communication, 32, 33–52.
15. Litman, B. R. (1983). Predicting success of theatrical movies: An empirical study. Journal of Popular
    Culture, 16, 159–175.
16. Litman, B. R. (Ed.). (1998). Motion picture mega-industry. Needham Heights, MA: Allyn & Bacon.
17. Litman, B. R., & Kohl, L. S. (1989). Predicting financial success of motion pictures: The ’80s
    experience. Journal of Media Economics, 2, 35–50.
18. Litman, B. R., & Ahn, H. (1998). Predicting financial success of motion pictures: The early ’90s
    experience. In B. R. Litman (Ed.), Motion picture mega-industry (pp. 172–197). Needham Heights,
    MA: Allyn & Bacon.
19. Mundra S., Dhingra A., Kapur A., Joshi D. (2019) Prediction of a Movie’s Success Using Data
    Mining Techniques. In: Satapathy S., Joshi A. (eds) Information and Communication Technology
    for Intelligent Systems. Smart Innovation, Systems and Technologies, vol 106. Springer,
    Singapore
20. Prag, J., & Casavant, J. (1994). An empirical study of the determinants of revenues and marketing
    expenditures in the motion picture industry. Journal of Cultural Economics, 18, 217–235.
21. Shapiro, C., & Varian, H. R. (1999). Information rules. Boston: Harvard Business School Press.
22. Sochay, S. (1994). Predicting performance of motion pictures. Journal of Media Economics, 7, 1–20.
23. De Vany, A., & Walls, D. (1999). Uncertainty in the movies: Can star power reduce the terror of the box
    office? Journal of Cultural Economics, 23, 285–318.
24. Wyatt, J. (1991). High concept, product differentiation, and the contemporary U.S. film industry. Texas
    Film and Media Study.
LAL BAHADUR SHASTRI INSTITUTE OF MANAGEMENT, DELHI
    PLOT NO. 11/7, SECTOR-11, DWARKA, NEW DELHI-110075
             Ph.: 011-25307700, www.lbsim.ac.in