Introduction
Several decades ago the amount of digital information began to increase with the development of computer technologies. Whereas in the late 1980s only 1% of the world’s information was in digital format, nowadays more than 99% of it is stored in this format (Hilbert, 2015). As a consequence, new computational approaches and analytical tools are required. Until very recently researchers had trouble analyzing large amounts of data because they relied on traditional analytical methods and statistical tools. This approach was not sufficient for analyzing the growing amount of information. Thus, computational social science (CSS) is being developed.
CSS is an emerging field of study at the intersection of several disciplines, including social science, computer science, environmental science and engineering (Cioffi-Revilla, 2014). CSS is intended to process data and run simulations at planetary scale, where up to the whole world population is considered, in order to gain a better understanding of global social dynamics. This makes sense in an increasingly interconnected world, where events occurring in one place can have tremendous consequences on the other side of the globe (Helbing et al, 2012). This new ICT-enabled study of society is a truly interdisciplinary approach, in which social and behavioral scientists, cognitive scientists, agent theorists, computer scientists, mathematicians and physicists cooperate side by side to come up with innovative and theory-grounded models of the target phenomena. Computational social scientists strongly believe that a new era has started in the understanding of the structure and function of our society at different levels.
CSS is a powerful tool for fostering our understanding of the complexities of real socio-economic systems, by building “virtual computational social worlds” that we can analyze, experiment with, feed with and test against empirical data on a hitherto unprecedented scale (Lazer et al, 2009). In the last couple of years, social scientists have started to organize and classify the number, variety, and severity of criticalities, if not pathologies and failures, recurring in complex social systems (Helbing et al, 2012). The analysis of huge data sets obtained, for example, from mobile phone calls, social networks, or commercial activities provides insight into phenomena and processes at the societal level. Investigating people’s electronic footprints has already contributed to understanding the relationship between the structure of society and the intensity of relationships (Onnela et al, 2007) and the way pandemic diseases spread (Balcan et al, 2009), as well as to identifying the main laws of human communication behaviour (Karsai et al, 2011).
There is an increasing realization of the enormous potential of data-driven computational social science. In short, a computational social science is emerging that leverages the capacity to collect and analyze data with unprecedented breadth, depth and scale (Lazer et al, 2009).
Aim & objectives of the study
The aim of the study is to identify the areas and scope of CSS. The study seeks to understand the emerging role and contribution of CSS within the research community as well as among tourism industry practitioners. It collects and analyzes the literature on CSS and on CSS implementation in the tourism industry. Our objective is to gain a broader view of this new field of study and to understand where it is heading. The internet and big data present new opportunities and challenges to researchers and the industry alike. Traditional ways of collecting and analyzing data are slowly being replaced by advanced ICT tools. Machine learning, big data and data visualization are changing research methodologies. The study focuses on identifying which new problems are being solved with the help of computational science tools and technologies. We also want to know why the research community and the tourism industry would integrate CSS into their study and work.
Meanwhile, there seems to be a demand for new sets of knowledge and skills to realize the new possibilities offered by computational science. The study tries to find out whether researchers face a knowledge and skill gap in the use of advanced computational tools. CSS tools and practices will be examined in depth as part of the study. Therefore, the goal of the study is to portray the emerging opportunities and challenges in the field of computational science.
With this study we want to answer the following questions:
- What opportunities and challenges does computational social science bring to academics and the tourism industry?
- Which problems are being solved by computational social science?
Theoretical background
The world has changed for empirical sociologists. It is now dominated by computer scientists who have created new ways of generating and collecting data, developed new analytical and statistical techniques, and provided new ways of visualizing and presenting information. These new data sources and techniques have the potential to transform the way in which social sciences are applied (Brynjolfsson et al, 2011). Although computational social science is a relatively young science, it has already changed the processes of sociological analysis and has had an impact on other areas of science. As a result, it attracts particular attention from researchers. Computational social science is a deeply multidisciplinary field, which includes experts with backgrounds in the social, natural, biological and applied sciences. There are certain differences in modelling approaches, depending on whether the model originates in industry, academia or the public sector (David et al, 2004; Heath et al, 2009).
According to Watts (2013), computational social science is located at “the intersection of the social and computational sciences, an intersection that includes analysis of web-scale observational data, virtual lab-style experiments, and computational modeling”. The social sciences, in turn, comprise five traditional disciplines which investigate human and social dynamics: social psychology, anthropology, economics, political science, and sociology (Cioffi-Revilla, 2010).
As mentioned above, computational social science is a more recent development than the social sciences. Cioffi-Revilla (2010) claims that computational social science is “an instrument-enabled scientific discipline”. This makes computational social science similar to scientific disciplines such as microbiology, radio astronomy, or nanoscience. In these disciplines “it is the instrument of investigation that drives the development of theory and understanding” (Cioffi-Revilla, 2010).
The main methods in wide use today can be classified into five areas (Cioffi-Revilla, 2010):
– Automated information extraction (AIE)
Content analysis, the traditional method of analyzing and coding texts to extract information, has been transformed into the computational analysis of all sorts of media (e.g. images, video, audio) in many fields. AIE is used mainly for the production of events data, which are then analysed through various methodologies.
– Social network analysis (SNA)
SNA investigates social structures through the use of networks and graph theory. It has many computational applications that provide insightful information and knowledge not available through plain observation or through more traditional methods. For example, in the private sector SNA is used for customer interaction analysis, information system development analysis, marketing, and business intelligence needs.
– Geospatial analysis [socio-GIS (geographic information systems) or social GIS]
Initially, GIS were used to obtain spatially referenced data about the social world. Nowadays, GIS are combined with other quantitative techniques to produce unique new insights about spatial patterns that are unavailable through other statistical or mathematical models.
– Complexity modeling
Complexity modeling classifies computational problems in accordance with their difficulty. All of these problems are solved by a computer using an algorithm.
– Social simulation models
As a rule, simulation models meet the standards of “internal validity (the causal relationship) and external validity (whether it can be generalized to a wider popularity)” (Bryman, 2004).
A particularly valuable feature of computational simulation models is “their ability to run current and alternative policies to observe their effects (alternative scenarios), assuming a sufficiently well-developed base model of a given ‘target system.’” (Cioffi-Revilla, 2010).
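To make the social network analysis method listed above more concrete, the following is a minimal Python sketch that computes normalised degree centrality for a small undirected interaction network. The customer names, the edge list and the function name `degree_centrality` are hypothetical and serve only as an illustration; they are not taken from the cited literature, and real studies would normally rely on a dedicated network analysis library.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Return normalised degree centrality for an undirected graph.

    `edges` is an iterable of (node_a, node_b) pairs. The score of a node is
    the number of distinct neighbours divided by (n - 1), where n is the
    number of nodes that appear in at least one edge.
    """
    neighbours = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # self-loops carry no information about interaction
        neighbours[a].add(b)
        neighbours[b].add(a)
    n = len(neighbours)
    if n < 2:
        return {node: 0.0 for node in neighbours}
    return {node: len(adj) / (n - 1) for node, adj in neighbours.items()}


# Hypothetical customer interaction data (assumption, for illustration only).
interactions = [("ana", "ben"), ("ana", "cho"), ("ana", "dee"), ("ben", "cho")]
centrality = degree_centrality(interactions)
most_central = max(centrality, key=centrality.get)
```

In this toy network the customer connected to every other customer receives the highest score, which is the kind of structural information that plain observation of individual interactions would not reveal.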
Computational social science and tourism
Tourism has been ranked as the foremost industry in terms of volume of online transactions (Werthner and Ricci, 2004). For tourism organisations, both private and public, the internet has become one of the most important marketing communication channels (Wang and Fesenmaier, 2006).
Carson (2005) provides a summary of internet applications for tourism organisations and enterprises within an “online architecture” and proposes five important functions of the internet: communication, promotion, product distribution, management and research. This pre-supposes that enterprises would endeavour to learn and use these applications, enter partnerships and make effective use of the internet. Albert and Sanders (2003) talk about the four Ps marketing mix (of product, place, price and promotion) being enhanced by the four Cs of customer solution, cost, convenience and communication, while Newhagen and Rafaeli (1996) show that compared with other distribution and transaction channels the internet contains a truly huge amount of information which can be customised and personalized.
A recent UK survey found that consumers trusted sites with reviews more than professional guides and travel agencies (eMarketer, 2007). Similar research in Germany and Austria showed that online customer ratings have high credibility with consumers (Österreich Werbung, 2007), and a recent study by Gretzel et al. (2007), undertaken with tripadvisor.com users, examined the role and impact of other tourists’ online travel reviews. By April 2007 there were apparently over 70 million blogs, with around 120,000 new blogs created each day (Sifry, 2007), and currently there are around 102 million blogs, with 175,000 new blogs added each day (http://technorati.com). A study undertaken by Compete Inc. found that user-generated content (UGC) influences around US$10 billion p.a. in online travel bookings and that over 20% of consumers rely on UGC when planning trips (Sarks, 2007).
This information is clearly valuable to the tourism industry. The advantage of having access to information and feedback makes users prefer online booking. Studies of consumers’ online behavior have revealed that the decision to acquire a product is strongly influenced by other buyers’ opinions (Bucur, 2014). In the past, one had trouble deciding to book a hotel that was not listed in a guide or recommended by an agency, because of the lack of information. Now the problem is the excess of information. With so many sites providing ratings and feedback, it is impossible to read everything and extremely difficult to find the relevant information needed to form an overall picture. Some sites provide only a rating system (by stars or numbers) or only text reviews, while others provide both a text review and a rating (Kasper & Vela, 2013).
Big Data & Data Mining
Big data, which is regarded as a prominent area of future technology, has already started gaining attention in the tourism industry. Benjelloun et al (2015) confirm that big data is enabling new opportunities for research and analysis in a myriad of domains, including tourism. In the tourism sector, big data makes it possible to extract valuable insight, such as a better understanding of tourists’ behaviour, detection of evolving preferences and needs, forecasting of tourism demand for a destination, and real-time recommendation of hotels, restaurants and activities to tourists according to their preferences. Data mining is the extraction of interesting, previously unknown knowledge from potentially large and noisy datasets. It could be especially valuable to the field of tourism science, where abundant information regarding people’s movements and activities is available, yet untapped (Bermingham and Lee, 2014).
Databases contain important hidden information that can be used for decision making. Different kinds of databases, such as relational, object-oriented, transactional and spatial databases, hold complex datasets (Seerat and Azam, 2012). The rapid growth of databases has created the need for technologies that extract nuggets of knowledge and information intelligently. Major data mining techniques used to extract knowledge and information include generalization, classification, clustering, association rule mining, data visualization, neural networks, fuzzy logic, Bayesian networks, genetic algorithms, decision trees, multi-agent systems, the CRISP-DM model, churn prediction, case-based reasoning and many more (Zhai et al, 2009).
The detection and extraction of opinions from online reviews is part of a new area of research developed in the last decade. Opinion mining, also called sentiment analysis in the scientific literature, studies the determination and classification of opinions or feelings expressed in text through the use of computers. The challenge of this research area is to extract knowledge from unstructured data. Reviews contain opinions expressed in natural language, which is familiar to people but not directly interpretable by computers (Bucur, 2014).
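As a rough illustration of what classifying opinions in unstructured text can look like, the sketch below scores a review with a tiny hand-made word list and a one-word negation rule. The word lists, the function name `review_polarity` and the example sentences are all assumptions made for this sketch; it is a deliberately simple lexicon-based approach and not the method of any of the studies cited here, which typically use machine learning classifiers.

```python
import re

# Hypothetical, deliberately tiny opinion lexicon (assumption for illustration).
POSITIVE = {"good", "great", "clean", "friendly", "excellent"}
NEGATIVE = {"bad", "dirty", "rude", "noisy", "terrible"}
NEGATIONS = {"not", "never", "no"}


def review_polarity(text):
    """Classify a review as 'positive', 'negative' or 'neutral'.

    Each opinion word contributes +1 or -1; the sign is flipped when the
    word is directly preceded by a negation such as 'not'.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    score = 0
    for i, token in enumerate(tokens):
        if token in POSITIVE:
            value = 1
        elif token in NEGATIVE:
            value = -1
        else:
            continue  # words outside the lexicon carry no opinion here
        if i > 0 and tokens[i - 1] in NEGATIONS:
            value = -value
        score += value
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


label_a = review_polarity("The room was clean and the staff were friendly")
label_b = review_polarity("The room was not clean and the street was noisy")
```

Even this toy version shows why the task is hard: sarcasm, domain-specific vocabulary and longer-range negation all fall outside what a fixed word list can capture.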
Khan et al. presented a literature survey of opinion mining. Their study focused on various machine learning algorithms for sentiment classification from unstructured reviews. They discussed various applications of opinion mining, such as search engines, recommendation systems, email filtering, Web ad filtering and question answering systems.
Using social media data, Miah et al (2017) developed a method composed of four techniques to identify and predict tourist behavior and to forecast future and seasonal tourism demand for the purposes of tourism development, management and planning. The four techniques are: (1) textual metadata processing to identify keywords that reflect tourists’ interests when taking photos; (2) geographical data clustering to identify popular locations for each of the identified tourist interests; (3) representative photo identification to identify the photo subjects that most frequently appear for each tourist interest, which provides insight into tourists’ own experiences and interests; and (4) time series modelling to predict future tourism demand and reveal seasonal travel patterns for future planning and decision-making.
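The fourth technique, time series modelling, can be sketched in a very reduced form as follows. The example averages the number of geotagged photos per calendar month over several years and uses that average as a naive seasonal forecast. This is a simple seasonal-average baseline chosen for illustration, not the model used by Miah et al; the photo counts and the function names `seasonal_profile` and `forecast_month` are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def seasonal_profile(observations):
    """Average the observed counts per calendar month.

    `observations` is a list of (year, month, count) tuples, for example the
    number of geotagged photos taken at a destination in that month.
    """
    by_month = defaultdict(list)
    for _year, month, count in observations:
        by_month[month].append(count)
    return {month: mean(counts) for month, counts in by_month.items()}


def forecast_month(observations, month):
    """Naive seasonal forecast: the historical mean for the same month."""
    profile = seasonal_profile(observations)
    if month not in profile:
        raise ValueError(f"no historical data for month {month}")
    return profile[month]


# Hypothetical photo counts (assumption, for illustration only).
photo_counts = [
    (2015, 1, 100), (2015, 7, 300),
    (2016, 1, 120), (2016, 7, 340),
]
profile = seasonal_profile(photo_counts)
peak_month = max(profile, key=profile.get)
july_forecast = forecast_month(photo_counts, 7)
```

A baseline of this kind only reveals the seasonal pattern; trend, holidays and one-off events would require a proper time series model.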
Several big data projects have been launched and have demonstrated the potential of big data in the tourism domain.
Flux Vision is an innovative solution initiated by Orange Labs that analyzes population flows in real time using data from Orange’s mobile network. It converts millions of items of technical information from the mobile network.
The Tourinflux project is funded by the public investment program for the future (PIA). It aims at providing actors of the tourism industry with a set of tools that allow them to handle tourism data, together with an extensive dashboard to visualize and interpret the information available about the territory. With our big data system, we exploit mass data obtained from different websites, mobile applications, geo-localization, social media and connected objects in order to target and recommend the most appropriate tourism offer according to the user profile, to monitor and analyze tourists’ opinions in order to improve the customer experience, and to provide dashboards that are useful for decision-making, and thus to promote intelligent tourism in Morocco and contribute to its economic development.
Methodology
To answer the research questions of this paper, we divide the study into two parts: (1) a direct review of the literature of the last decade in the tourism domain that uses computational social science approaches, and (2) a survey of a non-probability sample of tourism scholars and industry practitioners. The aim of the literature review is to provide a concise overview of the key methodologies used in tourism studies and to reveal whether computational social science is integrated into tourism research. The survey will try to establish how frequently computational social science is used in tourism research, as well as its utility, reliability and effectiveness in tourism research and business development. The sample will encompass two sub-groups. The first sub-group will consist of academics at different levels (e.g. master’s level, Ph.D., associate professors and full professors). The second sub-group will encompass representatives of the tourism and hospitality industry. The questionnaire will include questions about the computational social science methods respondents currently use in their studies and about the challenges and barriers they encounter when applying CSS. It will ask which kinds of data they usually use for CSS (e.g. administrative, commercial, photographs, videos and audio, social media data), which specific types of problems they are trying to solve or which information they are trying to retrieve through CSS, and what skills and collaborations tourism scholars and practitioners would need in order to apply CSS.
Anticipated Findings
From our study we expect to clearly outline the opportunities created by CSS in the field of tourism. The contribution of CSS to acquiring new knowledge and solving various social issues will be made evident. We anticipate identifying the problems faced by academics and the research community when using CSS in quantitative studies. We expect that the findings will allow both academics and industry practitioners to appreciate the true potential of CSS and encourage them to make better use of it in their study and work.
References
- Albert TC, Sanders WB (2003) e-Business marketing. Pearson Education, Upper Saddle River, New Jersey
- Backstrom L, Dwork C, Kleinberg J. Anonymized Social Networks, Hidden Patterns, and Structural Steganography. Proc. 16th Intl. World Wide Web Conference. 2007.
- Benjelloun, F.-Z., Lahcen, A. A. & Belfkih, S. An overview of big data opportunities, applications and tools. In Intelligent Systems and Computer Vision (ISCV), 2015 1–6 (IEEE, 2015).
- Brynjolfsson, Erik, Lorin M. Hitt, and Heekyung Hellen Kim. 2011. “Strength in Numbers: How Does Data-Driven Decisionmaking Affect Firm Performance?”
- B. Seerat and F. Azam “Opinion Mining: Issues and Challenges (A survey)” International Journal of Computer Applications (0975 – 8887) Volume 49 – No. 9, July 2012.
- D. Lazer, A. Pentland, L. Adamic, S. Aral, A. Barabási, D. Brewer, N. Christakis, N. Contractor, J. Fowler, M. Gutmann, T. Jebara, G. King, D. Roy, M.W. Macy, M. Van Alstyne, Science 323, 721 (2009)
- D. Helbing, S. Bishop, R. Conte, P. Lukowicz, J.B. McCarthy, Eur. Phys. J. Special Topics 214, 11 (2012)
- J.P. Onnela, J. Saramäki, J. Hyvönen, G. Szabó, D. Lazer, K. Kaski, J. Kertész, A.L. Barabási, PNAS 104, 7332 (2007)
- D. Balcan, V. Colizza, B. Gonçalves, H. Hu, J.J. Ramasco, A. Vespignani, PNAS USA 106, 21484 (2009)
- M. Karsai, M. Kivelä, R.K. Pan, K. Kaski, J. Kertész, A.L. Barabási, J. Saramäki, Phys. Rev. E 83, 1 (2011)
- eMarketer (2007) UK Online travel: travellers are sharing experiences online, 25 June. http://www.emarketer.com/Article.aspx?id=1005067andsrc=article2_newsltr. Retrieved 17 April 2008
- ‘Flux Vision’ https://www.orangebusiness.com/en/press/flux-vision-the-leading-big-data-solution-among-tourism-professionals.
- Soualah-Alila, F., Coustaty, M., Faucher, C. & Wannous, R. Projet TourinFlux : Apport des Technologies du Web Sémantique pour la Gestion des Données du Tourisme. In 6ème édition du colloque pluridisciplinaire AsTRES : Association Tourisme Recherche et Enseignement Supérieur 12 (2016).
- Miah, S. J., Vu, H. Q., Gammack, J. & McGrath, M. A big data analytics method for tourist behaviour analysis. Inf. Manage. 54, 771–785 (2017).
- Gandomi, A. & Haider, M. Beyond the hype: Big data concepts, methods, and analytics. Int. J. Inf. Manag. 35, 137–144 (2015).
- Gretzel U, Yoo KH, Purifoy M (2007) Online travel review study: role and impact of online travel reviews. Laboratory for Intelligent Systems in Tourism, Texas A & M University. www.tripadvisor.com/pdfs/OnlineTravelReviewReport.pdf. Retrieved 17 April 2008
- Newhagen J, Rafaeli S (1996) Why communication researchers should study the internet: a dialogue. J Comput Mediat Commun 46(1):4–1
- Österreich Werbung (2007) Web 2.0 im internet: Onlinebefragung unter deutschen Österreich-Urlaubern. http://www.austriatourism.com/scms/media.php/8998/2007E_Web20_summary.pdf. Retrieved 14 April 2008
- N. M. Shelke, S. Deshpande and V. Thakre “Survey of Techniques for Opinion Mining” International Journal of Computer Applications (0975 – 8887) Volume 57 – No. 13, November 2012.
- Werthner H, Ricci F (2004) E-commerce and tourism. Commun ACM 47(12):101–105
- Wang Y, Fesenmaier DR (2006) Identifying the success factors of web-based marketing strategy: an investigation of convention and visitors bureaus in the United States. J Travel Res 44(3):239–249
- Z. Zhai, H. Xu, J. Li and P. Jia “Sentiment Classification for Chinese Reviews Based on Key Substring Features” International Conference on Natural Language Processing and Knowledge Engineering (NLPKE 2009), 24-27 Sept. 2009.