A Study on Modernizing Marketing and Sales Potential: A Literature Review

Posted: 19 Feb 2016

Shivika Anand

Educesta Pvt Ltd

Date Written: February 19, 2016

A review of the literature on marketing and sales potential was conducted to determine organisational capabilities in the market. Organisations continually seek to develop a strong brand that sustains growth in the market, and this requires a transformation of current strategies and policies. Leaders play a crucial role in planning strategies for an uncertain environment and market. The literature review demonstrates techniques for modernizing sales and marketing potential: planning and control, building trust in the team, delivering fast results, stimulating the organisation, training, and a deep-rooted organisational culture. The issues of transformation and the challenges related to sales and marketing are also identified in the literature, to present the gaps in marketing that need to be addressed with effective sales practices. Finally, the review provides directions for further research in the field.

Keywords: Marketing, Sales, Modernisation, Transformation, Training, Planning


Shivika Anand (Contact Author)

Educesta Pvt Ltd

722, 10th Main, Jayanagar, Bangalore 560011, India
8861244409 (Phone), 08861244409 (Fax)


National Academies Press: OpenBook

Use of Market Research Panels in Transit (2013)

Chapter Two: Literature Review


This literature review consists of four sections:

• The first section provides a brief history of survey sampling and the theoretical basis for market research analysis, providing context for what became the standard procedures and expectations of market research. This is followed by an overview of some of the issues facing market research today, and how they are impacting the statistical underpinning of market analysis.
• The second section introduces traditional panel surveys and panel survey techniques. A summary of a typical traditional case example, the Puget Sound Transportation Panel Survey, is provided in Appendix A.
• The third section introduces relatively newer concepts of online panel research, with definitions particular to online panel surveys and techniques, issues with online panel research, and the special concerns of market research in the public sector.
• The fourth section circles back to the concerns raised in the first section, and looks at what lies ahead for the market research industry on these issues.

MARKET RESEARCH CONTEXT

A Brief History of Survey Sampling

In a 2011 special edition of the Public Opinion Quarterly, Brick provides an article on the future of survey sampling (Brick 2011). For the purposes of the article, survey sampling is defined as the methodology for identifying a set of observations from a population and making inferences about the population from those observations. Prior to 1934, full enumeration was considered necessary to understanding characteristics of a population; everyone needed to be contacted.

Neyman, in his 1934 article “On the Two Different Aspects of the Representative Method: The Method of Stratified Sample and the Method of Purposive Selection,” planted the seeds that resulted in the overthrow of full enumeration and established the paradigm of survey sampling. The move to telephone surveying in the mid-20th century was another significant change in survey sampling methods. The pressures for timely and cost-efficient estimates were stimulants for change then, and are even more relevant today.

The article by Brick draws from a 1987 article by Frankel and Frankel, “Fifty Years of Survey Sampling in the United States.” In the 1987 article, sampling is described as having two phases: (1) the establishment of the basic methods of probability sampling; and (2) innovations in the basic methods to accommodate new technologies, such as telephone sampling and computerization. Since Frankel and Frankel wrote their article in 1987, the Internet has become another technological advance, requiring innovations in sampling to accommodate the technology. The Brick article notes that 10 years ago, no generally accepted method of sampling from the Internet had been established, and that as of the writing of the article in 2011, this was still the case.

As developed by Neyman in 1934, probability survey sampling became the basis for virtually all sampling theory, with a very specific framework for making inferences. The framework assumes that all units in the population can be identified, that every unit has a probability of being selected, and that the probability can be computed for each unit. Once the sample is selected, the framework assumes that all characteristics can be accurately measured. Non-response factors and the inability to include portions of the population (coverage error) violate the pure assumptions of probability sampling. A variety of techniques has been developed to adjust for non-response and coverage error, such as model-based and design-based sampling methods.

Technological advances that have changed sampling methods include: the introduction of telephone surveying, which eventually replaced face-to-face interviews as the primary mode of household surveying; the shift from landline telephones to cell phones; and the advent of the Internet. Each of these not only affected methods of sampling, it is intertwined with the others, with changes in one leading to new developments in the others.

New Concerns with Traditional Quantitative Research

The science of traditional quantitative market research rests on two fundamental assumptions: (1) the people who are surveyed approximate a random probability sample; and (2) the people who are surveyed are able and willing to answer the questions they are asked (Poynter 2010). In addition to these concerns with the theoretical underpinnings of market research, operational concerns are also putting pressure on traditional methodologies.

Random Probability Sample

The random probability sample is an essential ingredient of traditional market research. Without it, quantitative market

research studies and analysis cannot be conducted. It is this underpinning of sampling theory that allows the calculation of sampling error and the expression of confidence in the results, such as results being ±3% with a confidence interval of 95%. As telephones became standard in every household, random-digit-dial techniques for landline telephones became the foundation of probabilistic sampling, with a solid theoretical basis.

Results for a survey conducted between January and June 2011 by the National Center for Health Statistics found that 31.6% of American homes had only cell phones, and that an additional 16.4% of homes received all or almost all calls on wireless telephones despite also having a landline telephone (Blumberg 2011). Experts disagree, but it has been suggested that if more than 30% of the population has a 0% chance of being selected, then a random probabilistic sample cannot be selected (Brick 2011). The implication is profound: with the incidence of landline phones declining, random-digit-dial telephone surveying, the mainstay of traditional quantitative market research, no longer provides a probabilistic sample. Cell phones are considered unsuitable for random-digit dialing for a variety of reasons, including the possibility of respondents having more than one cell phone, resulting in duplication within the sample; respondent resistance; and legislation that prohibits the practice. Online recruitment is fast and economical, but does not provide a probabilistic sample, as is discussed later in this chapter.

Willingness and Ability to Answer Research Questions

Market researchers started out assuming that people could answer direct questions about their attitudes and behavior. Early on, it became clear that these questions were difficult to answer, so psychometrics and marketing science methodologies were developed to facilitate responses and analysis of results.
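As an aside, the sampling-error arithmetic behind statements quoted earlier in this section, such as results being ±3% at a 95% confidence level, can be sketched in a few lines. This is a minimal sketch assuming simple random sampling and a worst-case proportion of 0.5:

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Half-width of a confidence interval for a proportion under
    simple random sampling; z = 1.96 corresponds to 95% confidence."""
    return z * math.sqrt(p * (1 - p) / n)

# Roughly 1,067 completed surveys give the familiar +/-3% at 95% confidence:
print(round(margin_of_error(1067) * 100, 1))  # -> 3.0
```

Note that this arithmetic is exactly what breaks down when the sample is not probabilistic: for an opt-in online panel the formula can still be computed, but its interpretation no longer holds.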
More recently even these techniques have been challenged, as the industry realizes that respondents are unreliable witnesses about themselves.

Operational Issues

Other problems with the traditional market research process are operational. It is perceived as slow and costly, and increasingly, organizations are relying on techniques that may be providing quick results at the expense of quality. The cost of traditional survey research often is driven up by the decline in use of landline telephones, making it more difficult to obtain a traditional random probabilistic sample. In addition, more people have answering machines to screen calls or otherwise refuse to participate, again making it difficult to achieve a sufficient sample without additional time and expense. If cost is arguably the single most important factor in the search for new survey techniques, the Internet offers a potential solution, even though it does not provide a probabilistic sample.

This shift from probability sampling to non-probability sampling is a paradigm change of the magnitude of the shift from enumeration to probability sampling theory in 1934, which, Brick notes in his article, was spurred by the cost of enumeration. Today researchers find themselves in a similar situation, driven by rising costs away from probabilistic sampling toward non-probabilistic sampling.

The issues raised here may fundamentally change the way all market research is conducted. How the market research industry is responding and what may lie ahead is discussed in the last section of this chapter, “The Future of Market Research.”

TRADITIONAL MARKET RESEARCH PANELS

Panel surveys have been conducted for many years, and have been used in the transportation industry for topics such as travel behavior changes and tracking customer satisfaction. The concepts discussed in this section are applicable to both traditional and online panel research.
Definition of a Traditional Panel

The meaning of a market research panel depends on the context, industry, and time period in which the term is being used. The AMA acknowledges this with the following distinctions:

• True panel: A sample of respondents who are measured repeatedly over time with respect to the same variables.
• Omnibus panel: A sample of respondents who are measured repeatedly over time but on variables that change from measurement to measurement [http://www.marketingpower.com/_layouts/Dictionary.aspx?dLetter=P (accessed Mar. 11, 2012)].

Traditional Panel Survey Techniques

Traditional panel members were recruited through probabilistic sampling techniques so that survey results could be extrapolated to the general population. Developing and maintaining a panel was an expensive proposition, made more difficult by the challenge of keeping track of people and households as they moved and changed phone numbers. Panel survey research was typically used to determine individual travel behavior changes over time, such as to understand the relationship between changes in household characteristics and choice of travel mode. Another use was for “before and after” studies to measure impacts of a change in policy or service; for example, adding a new light rail line or carpool lanes. These studies were often conducted by an MPO for the purpose of developing regional travel demand and forecasting models. Panels were rarely set up and maintained for the purpose of ad hoc, on-call market research (omnibus panels).

Panel data collection is described as “a survey of a group of preselected respondents who agree to be panel members on a continuous basis for a given period of time and provide demographic data, allowing selection of special groups and

permitting the use of surveys to monitor responses over time” (Elmore-Yalch 1998). This maximizes the use of a sample in that the sampling need be done only once, after which the panel is accessible for future research efforts. Panel member attrition and replacement is an element of maintaining the panel, and is discussed elsewhere in this chapter.

The remainder of this section is a summary of An Introduction to Panel Surveys in Transportation Studies (Tourangeau et al. 1997), which provides a solid overview of the basics of traditional panel survey research, especially as applied to travel behavior studies. The report has a four-fold purpose for the development and implementation of travel behavior studies: (1) to highlight the differences between cross-sectional and panel surveys; (2) to discuss the limitations of both cross-sectional and panel surveys; (3) to identify situations where panel surveys are the preferred method; and (4) to provide guidelines for designing a panel survey and maintaining the panel. A panel survey approach is recommended when the purpose of the survey is to develop travel demand models and forecast future demand; to measure and understand trends in behavior; to assess the impact of a change in transportation policies or services; or to collect timely information on emerging travel issues.

Definition of Cross-Sectional and Panel Designs

There are two broad types of surveys, cross-sectional and panel surveys. A cross-sectional survey uses a fresh sample each time, whereas a panel survey samples the same persons (or households) over time. In addition, the questions may be the same or change with each survey. This creates four basic approaches to travel behavior surveys.

One-time cross-sectional surveys provide a “snapshot” of travel behavior at a particular point in time, and show how behavior differs among members of the population, but provide no direct information on how it changes over time. This type of survey makes no attempt to replicate conditions or questions from previous studies, and as a result is not well suited for assessing trends in population behavior.

Repeated cross-sectional surveys measure travel behavior by repeating the same survey on two or more occasions. In addition to repeating the questions, the sampling is conducted in a similar manner to allow comparisons between or among separate survey efforts. Repeated cross-sectional surveys are sometimes referred to as a “longitudinal survey design” because they measure variations in the population over time. A more restrictive definition of a longitudinal survey design is one where survey questions are repeated with the same sample over time.

Longitudinal panel designs collect information on the same set of variables from the same sample members at two or more points in time. Each time the panel is surveyed, it provides what is called a “wave” of data collection. Typically, each wave consists of the same core questions along with some new questions. In a travel behavior survey, the panel provides information on how the travel behavior of each participant evolves in response to changes in the travel environment, household background, or other factors.

Rotating or revolving panel surveys are a combination of repeated and cross-sectional designs, in that they collect panel data on the same sample for a specified number of waves, after which portions of the panel are dropped and replaced with comparable members. The strength of this design is its ability to allow for both short-term panel member analysis and long-term analysis of population and subgroup change. Like repeated cross-sectional designs, rotating panels periodically draw new members from the current population, obtaining similar measurements on them.

Benefits of Panel Designs

The most important benefit of a panel survey is that it directly measures changes at the individual level and can provide repeated measurements over time. This rich source of information on personal and household behavior is essential for determining causal relationships between travel behavior and the factors that influence personal travel decisions, and for developing predictive models for personal travel behavior. This same benefit applies to the ability to measure and understand trends in population behavior.

Panel studies can be especially useful for before-and-after surveys that measure the impacts of transportation policy and service changes on travel behavior, rider attitudes, and safety. For example, a before-and-after study of the implementation of a new rail line (replacing existing bus service) showed that a shift in mode split occurred after the implementation of the new line. Results using a cross-sectional survey showed a shift from auto to train after opening of the rail line, suggesting overall growth in transit use shifting car drivers to rail riders. A panel study measuring individual-specific changes captured a shift from bus to car in addition to the shift from car to rail. This finding fundamentally changed the implications of the cross-sectional study: the new service attracted former car drivers, but also shifted former bus riders into cars.

Additional benefits of the panel approach include statistical efficiency (it requires a smaller sample size); lower cost (it requires fewer surveys); and speed (easy access to the panel allows faster survey implementation than when a fresh sample must be obtained).

Limitations with Panel Designs

Three primary limitations of panel surveys are identified: panel attrition, time-in-sample effects, and seam effects.

1. Panel attrition refers to panel member non-response in later waves of data collection. The Puget Sound Transportation Panel conducted its first wave of surveys in

1989. The fourth round of surveying in 1993 had a participation rate from the original panel members of about 55%, meaning 45% of the panel had left and needed to be replaced.

2. The time-in-sample effect refers to reporting errors or bias as a result of participants remaining in the panel over time. This is also called conditioning, rotation bias, or panel fatigue; and generally refers to respondents reporting fewer trips or fewer purchases in later rounds of a panel survey than in earlier ones.

3. Seam effects are another type of reporting error, and refer to reporting changes at the beginning or ending of the interval between rounds rather than in other times covered by the interview.

Design Issues in Conducting a Panel Survey

There are four design issues that need to be considered in conducting a panel survey: definition of the sample unit; the number and spacing of rounds; method of data collection; and sample size.

1. Most traditional travel surveys conducted by MPOs use households as the sampling unit; however, sampling individuals is another option. When a household is the sampling unit, the panel survey sample can become complicated as household members are born, die, divorce, or mature and move out. For travel surveys, the report suggests using the household as the sampling unit, following initial respondents to new households, and adding any additional household members to the panel.

2. The number and spacing of survey rounds depend on factors such as the rate of changes in travel behavior and the need for up-to-date information. If changes in travel behavior are the result of external factors, such as rapidly increasing gas prices, or if administrative reporting requires monthly or quarterly updates, this may shorten the intervals between survey waves. Panel travel surveys are collected at six-month or annual intervals, balancing the potential for respondent burden with the desire for regular data collection.
The report recommends annual data collection for travel behavior studies.

3. Data collection methods differ in terms of cost, coverage of the population, response rates, and data quality (inconsistent or missing data). In-person data collection is typically the most expensive, but produces the highest percentage of coverage, the highest response rates, and potentially the most accurate data, as the interviewer can assist the respondent. Telephone data collection tends to be the next most expensive methodology, and eliminates the population without a telephone. This used to be a limited concern, since almost all households had a landline phone, but since the report was written the percentage of mobile phone-only households has grown significantly. Data collection by mail is the cheapest of the three traditional modes, but has the lowest response rates and poorest data quality. [Since the report was written, Internet surveying has become another inexpensive alternative method of data collection. Online surveying is covered in other portions of the literature review.] The report recommends using the telephone for data collection in the first wave of a travel behavior panel study and considering less expensive methods for successive waves, if necessary.

4. Selecting the sample size requires specifying the desired level of precision for the survey estimates. The precision level is determined by the requirements for analyzing the goals and objectives of the survey, typically rates of change in travel behavior at the household or sub-regional level. After the level of precision is determined, traditional statistical formulas can be applied to determine the sample size, which is then adjusted for anticipated non-response, attrition, and eligibility rates.
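The sample-size procedure in item 4 (a precision target, a standard formula, then inflation for anticipated losses) can be sketched as follows. The response, eligibility, and retention rates used here are illustrative assumptions, not figures from the report:

```python
import math

def base_sample_size(moe, p=0.5, z=1.96):
    """Completed surveys needed for a target margin of error,
    assuming simple random sampling and a worst-case proportion."""
    return math.ceil(z**2 * p * (1 - p) / moe**2)

def adjusted_sample_size(moe, response_rate, eligibility_rate,
                         waves=1, retention=1.0):
    """Inflate the base size for anticipated non-response, ineligibility,
    and (for a panel) attrition across later waves."""
    n = base_sample_size(moe)
    # To keep n completers through the final wave, wave 1 must start larger:
    wave1_needed = n / (retention ** (waves - 1))
    # Then inflate for contacts who never respond or turn out ineligible:
    return math.ceil(wave1_needed / (response_rate * eligibility_rate))

# e.g., +/-5% precision, 60% response, 90% eligible,
# 4 annual waves with 80% of the panel retained per wave:
print(base_sample_size(0.05))                                    # -> 385
print(adjusted_sample_size(0.05, 0.60, 0.90, waves=4, retention=0.80))  # -> 1393
```

The point of the sketch is the compounding: a modest 385-completer precision target becomes an initial recruitment effort more than three times larger once attrition, non-response, and eligibility are accounted for.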
Issues with Maintaining the Panel

The report points out three issues that need to be considered in terms of maintaining a panel: freshening the sample, maintaining high response rates across waves, and modifying the questionnaires across rounds.

1. “Freshening the sample” is the process of adding new panel members over time to ensure that the sample accurately reflects changes in the population from newly formed households or those who have recently moved to the study area. The longer the panel is continued, the less likely it is to represent the study area. The report suggests that, if a panel continues for more than five years or there is significant in-migration to the study area, a supplemental sample be implemented. Another reason for freshening the sample is to offset attrition, recruiting new panel members comparable to those who drop out and thereby maintaining the panel make-up and sample size for the duration of the panel effort. The report suggests that the initial sample size be large enough to accommodate anticipated attrition in later waves, and that steps be taken to minimize attrition. Replacement of panel members should only be done as a last resort.

2. There are three techniques for maintaining high response rates: tracing people or households who move; maintaining contact with panel members between rounds; and providing incentives for participation. Methods of tracing panel members who move include mailing a letter several months in advance of the next wave requesting updated contact information; and asking the post office to provide new addresses rather than forwarding the mail, to ensure that the contact files get updated. If new contact information is not provided, researchers may attempt a manual search through existing databases. The report suggests that a protocol be developed at the outset of the survey effort to track respondents between waves and reduce attrition.
Another way of reducing attrition is to maintain respondent interest and contact information between waves by sending postcards, holiday greetings, and survey results. Incentives such as small amounts of cash can also be helpful. Cross-sectional surveys have shown that a small prepaid incentive (for example, a $2 bill) is effective in increasing participation rates and reducing attrition. Unfortunately, there was limited research at the time as to the effect of incentives on panel surveys over time. It is noted that non-respondents in one wave may still participate in the next, so that only those who refuse to respond to more than one round of the study would be dropped from the panel.

3. A defining element of a traditional panel survey is the ability to administer the same questions to panel members over time, which is what provides the direct measurement of change that is so valuable to travel behavior studies. Two situations may arise that make it necessary to modify the questionnaire across waves. First, a new issue may arise that can be advantageously posed to the panel. This then becomes a cross-sectional survey, where the data are collected once. If the question is repeated in later waves, it becomes part of the panel effort. Although this is easy, fast, and less expensive than conducting a separate study, it can add to respondent fatigue by making the questionnaire longer. For this reason, it is suggested that new questions be kept to a minimum. The second reason for changing a question is that there is a problem with the question itself (e.g., it is poorly worded, yields unreliable results, or becomes irrelevant). In this instance, it is important to revise the question as soon as possible. The report recommends that a calibration study be done to determine the effect of any core changes.

Weighting the Panel Data

The final section of the report deals with how to weight panel survey data.
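The mechanics described in the following paragraphs can be illustrated with a minimal sketch: a base weight is the inverse of each unit's selection probability, and the weight of non-respondents is then redistributed to respondents within an adjustment class. The probabilities and the single adjustment class here are illustrative assumptions, not the report's appendix procedures:

```python
def design_weight(selection_prob):
    """Base weight: the inverse of a unit's probability of selection."""
    return 1.0 / selection_prob

def nonresponse_adjust(weights, responded):
    """Redistribute non-respondents' weight to respondents within one
    adjustment class, so respondent weights still sum to the class total."""
    total = sum(weights)
    respondent_total = sum(w for w, r in zip(weights, responded) if r)
    factor = total / respondent_total
    return [w * factor if r else 0.0 for w, r in zip(weights, responded)]

# Four households, each drawn with probability 0.01 (so each initially
# represents 100 households); one household drops out of the wave.
base = [design_weight(0.01)] * 4
adjusted = nonresponse_adjust(base, [True, True, True, False])
# The three remaining households now carry the full 400 between them.
```

In practice the adjustment is done separately within subgroups (the adjustment classes), and, as the text notes, repeated after each wave as panel membership changes.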
Weighting is done to account for differences in the probability of being selected, to compensate for differences in response rates across subgroups, and to adjust for random or systematic departures from the composition of the population. Weighting is done at two points: after the initial wave, following the procedures for standard cross-sectional surveys; and then after each subsequent wave, to account for changes in the panel membership. Although weighting is fairly straightforward for the first wave, subsequent waves can be complicated if the sampling unit is a household, as is typical of travel behavior panel studies. Elements that must be taken into account are how to treat households that add or lose members over the course of the panel; and how to define a “responding” or “non-responding” household, for example, whether all survey waves are completed by all household members or only certain household members. It is sometimes necessary to generate different weights for different survey analyses. Detailed guidelines for developing panel survey weights are provided in the report appendices.

ONLINE MARKET RESEARCH PANELS

This section discusses the types of online panels, sampling strategies, and issues and concerns with using the Internet for market research purposes. The current literature reviewed in this synthesis discusses sampling and recruitment for online panels using the Internet, e-mail, or other new technologies, such as quick response (QR) codes scanned by a smartphone. Multi-frame sampling, where a mix of sampling techniques is used for developing the panel, poses additional issues that are only now being explored and disseminated within the market research industry. Because this is an emerging area of research, this literature review does not include multi-frame sampling.
Types of Online Panels

Three types of panels are discussed by Poynter in his 2010 book, The Handbook of Online and Social Media Research: Tools and Techniques for Market Researchers. The first is a traditional panel, typically called a client panel or in-house panel, developed to meet specific criteria and recruited either in-house by the agency or through the assistance of a vendor. The panel can be recruited through a variety of techniques, including telephone; in-person intercepts (on a vehicle, or on the street); through existing agency customer databases; or online, through the agency website or pop-up invitations to join the panel. The critical elements of this type of panel are the definition and control that is exercised by the agency, and the intention for the agency to maintain the panel over time.

An online access panel, also referred to as an access panel or online panel, is developed by independent market research firms and can provide samples for most markets that have a significant volume of research activity. The researcher provides the panel company with the desired sample specification, and then either the researcher provides a link to the online survey, or the panel company scripts and hosts the online survey.

The third type of panel survey is an online research community, also known as a market research online community or MROC, which combines attributes of panel research with elements of a social media community. Although it is sometimes grouped with social media techniques, the online research community has been included here because it meets the definition of “a group of persons selected for the purpose of collecting data for the analysis of some aspect of the group or area.”

In-House Panels

As the name implies, in-house panels are owned by the research department of the agency, and are not purchased from a vendor’s existing panel. The in-house panel is used for market research, not public relations, marketing, or sales; and panel members are aware that they will be contacted for research, insight, and advice. The primary advantages of in-house panels are cost savings, speed of feedback, and control over the panel. Disadvantages include the workload required to manage a panel and the possibility that panel members may become sensitized to the research approaches.

In-house panels can be conducted simply from a list of people and an off-the-shelf survey program using e-mail, with a way to unsubscribe from the panel. For small-budget projects or a low-key exploratory concept, a simple approach may be the most appropriate. More sophisticated panel management may require methods to prevent people from signing up multiple times, the ability to draw sub-samples, protocols for handling and managing incentives, panel member replacement strategies, quotas on survey responses, online focus groups or bulletin board groups, and rules for creating an online panel community.

The more sophisticated the approach, the more advantageous it is to contract with a vendor to run the panel. Using internal staff may make the research more responsive to management needs while saving in consultant fees. A vendor, however, can handle more work without overburdening agency staff, using employees familiar with the latest thinking and best practices. These different strengths often lead to a strong partnership between the vendor and staff.

Traditionally, panel research was done with standard questionnaires, implemented by means of mail or telephone.
New developments in technology and the Internet have made it easy to expand the activities of a panel even further, creating online focus groups, photo projects where panel members take pictures with their cell phones and upload them to an agency website, brainstorming through collaborative systems such as “wiki” sites, and quick “fun polls” that encourage participa- tion, generate panel engagement, and provide almost instant answers to questions of the moment. Tips for using an in-house panel include: 1. Manage the expectations of panel members by letting them know at the outset how many surveys/activities they should expect. 2. Let panel members know you value their participation and that they are making a difference. 3. Recognize that panels will usually be skewed toward members who are knowledgeable about the product or service, and that they may not represent the opinion of the general public. 4. Complement conventional incentives (such as cash) with intrinsic rewards, such as information about upcoming events or new products before it hits the general market. Online Access Panels Online access panels have fundamentally changed how mar- ket research is conducted. An online access panel “is a col- lection of potential respondents who have signed up with a vendor which provides people for market research surveys.” These respondents are aware that they are joining a market research panel, and that they will be receiving invitations to online surveys. The vendor keeps some information on the panel members so that it can draw samples, if requested, but does not share this information with the client. Panel mainte- nance, including the provision of incentives, is the vendor’s responsibility. In selecting a panel vendor, six factors need to be considered: 1. Does the vendor provide only the sample, or will it also host surveys? If so, can the brand image on the survey maintain the agency’s brand, or does it become folded into the vendor’s survey branding? 2. 
What is the quality of the panel? Not all panels are created equal, and the results can vary based on the panel used. ESOMAR formulated "26 Questions" (later, "28 Questions") for agencies to ask vendors in order to understand their procedures and the potential quality of the survey results. The questions can be found at: http://www.esomar.org/index.php/26-questions.html.
3. In looking at vendor costs, caution must be exercised to ensure that price quotes are on similar services so they can be correctly compared.
4. Make sure that the vendor has the capacity to complete the study, including any potential future waves of the study. It is common practice for panel survey vendors to outsource a portion of or even the entire project to another firm if they do not have the resources to complete it as scheduled. Outsourcing to another panel survey firm can result in double-sampling people who are members of both panels. More importantly, because different panels often have varying results, this can lead to confusion as to whether an apparent change is real or a reflection of the panel used.
5. The more data a vendor has on its panel members, the more closely a survey can be targeted to the appropriate respondents. This results in fewer respondents being screened out and a shorter survey with fewer necessary questions.
6. As with any service, it is helpful to have a supportive vendor who is willing to stay late if needed, help clean up errors, and respond quickly to issues and concerns.

After selecting a vendor, it is essential to ensure a good working relationship. This can be facilitated by:

• Clarifying the quote for the project to make sure it includes all work needed;

• Booking the fielding time for the job as soon as the vendor is selected so there is flexibility if dates need to be changed for holidays, computer maintenance, etc.; and
• Developing and agreeing on the timeline, including finalizing the sample specification, scripting the survey or sending the link to the survey, having a soft launch to test the survey, agreeing on the full implementation and end date, and specifying the frequency of communication with the panel company, especially regarding problems that may occur.

Once the survey is in the field, it is important to monitor progress and report any issues immediately to the panel vendor, including problems reaching the target quotas for completed surveys. The sooner action is taken, the easier it will be to rectify the issue. It is advisable to work closely with the vendor supplying the panel to take advantage of its experience with data issues with long surveys and improving the survey experience.

Online Research Communities

Using social media to create online research communities (MROCs) for research purposes is a relatively new field. Research communities have been offered by third-party vendors since about 2000, but did not become widely used until about 2006. Online research communities typically have a few hundred members, and straddle the divide between quantitative and qualitative research. The communities can be short-term, developed for one research question and then dissolved; or can be a long-term resource, allowing research on a wide variety of topics over a period of six months or more.
The benefits of online research communities are that they provide access to the authentic voice of the customer; go beyond the numbers to provide qualitative discussion; provide quick turnaround at a low marginal cost because the sampling and recruitment are already complete; and create an active dialogue with the customers, letting them feel they "make a difference." These communities can either be open to anyone who wishes to join (within the requirements of screening criteria, such as age or geographic location), or closed, in which case panel members are invited to participate. It is important to note that open communities tend to be more about involvement, advocacy, and transparency rather than insight and research.

Incentives are important to maintaining a high level of participation for all types of research panels; however, several issues must be considered when structuring an incentive program. (It should be noted that it is illegal for some public agencies to use incentives.)

The argument for using incentives is that they represent a small payment for the time and contributions of the panel members, and may be necessary to obtain the level of engagement needed to make the community succeed. The type of incentive (cash versus intrinsic rewards) must also be considered. A chance to win a transit pass or seeing the results immediately upon completing an instant poll are examples of incentives. Finally, the agency must decide how to allocate the incentives. Options include giving all members an incentive regardless of participation levels; giving the incentive to members who participate within a specified time frame; offering a chance to win a prize; and awarding a prize to the "best" contribution in a specified time frame. Agencies should avoid starting with a high-value incentive, because lowering the incentive later will strike panel members as the agency taking away a benefit, resulting in a loss of participation.
As with all research techniques, the online community can be developed and maintained either in-house or through a vendor. Online research communities require significant and continuous management. Even if the community is maintained by a vendor, significant input by staff is needed to ensure that the community is addressing issues of concern to the agency.

The advantages of having a research-only community are that it can be much smaller than broader-topic communities, and members may be more open if they know they will not be "sold to" another interest. Opening the community up to other department managers may result in too many surveys and e-mails being sent to members, with research being pushed aside in favor of other topics. Likewise, it is important not to allow community members to usurp the purpose of the research community for their own agendas. Part of managing the community is monitoring and ending any member activity that begins to create an agenda separate from that of the agency, even removing a panel member if necessary.

The steps to and guidelines for setting up an online community include determining:

• What type of community is best (short versus long term, open versus closed, and the number of members);
• The "look and feel" (i.e., makeup) of the community;
• Community tools;
• Methods of recruiting members;
• Terms and conditions (including intellectual property, member expectations, restricted activities, anti-community behavior, privacy and safety, incentive rules, eligibility, data protection and privacy), and the ability to change terms and conditions;
• Methods of moderating and managing communities (moderator function, community plan, dealing with negativity, creating member engagement); and
• Requirements for finding and delivering insights.

The rapid pace of change among social media makes it difficult to project how this type of research activity will be conducted in the future.
Four considerations are identified in Poynter's book:

1. Market research organizations typically do not allow activities that would influence the outcome of the

research. Because interaction and relationships built between community members and the sponsoring community agency may sensitize panel members to organizational issues, MROCs may be declared "not research."
2. Currently, online research communities are used for more qualitative work rather than large-scale quantitative work. The ability to expand online research to larger projects (e.g., international research) will increase its use as a mainstream research tool.
3. Respondent fatigue may set in, resulting in a less engaged community. This may be especially true if panel members belong to more than one community.
4. Alternative (not research-based) methods may be more successful, such as having a very large community that can serve both marketing and research functions, or tapping into other existing communities to conduct research rather than establishing one specific to the organization.

One of the primary concerns with online research communities has been that the relationship with the organization may cause heightened brand awareness and affinity, and that this will lead to a positive bias in research results. However, Austin notes in an article in Quirk's Marketing Research Media (Austin 2012) that while engagement builds a relationship with the company, community members remain candid and critical despite their relationship with the brand. If anything, members became slightly more critical as their tenure lengthened, not less. The article recommends that in moving to a new research paradigm, organizations make two changes from the traditional research approach to take advantage of this finding: trade anonymity for transparency because transparency builds engagement; and trade distance for relationship because relationship creates candor.
Together, the community members "work harder, they share more and they stay engaged in the research longer."

Online Panel Sampling Techniques

A few online panels employ traditional random sampling techniques, such as random-digit-dialing, and then conduct the research online; but the majority of panels are recruited using a non-probability approach online, such as pop-up or web banner ads. The AAPOR Report on Online Panels (Baker 2010) covers both types of panels. This review will cover probability and non-probability sampling techniques as they relate to panels; it also discusses "river sampling," although it is not a panel sampling technique, per se. Lastly, it provides an overview of strategies for adjusting non-probability samples to represent a population.

Probability sampling techniques for online survey research have been slow to be adopted, despite being around for more than 20 years. The recruitment is similar to voluntary, non-probabilistic samples, except that the initial contact is based on probabilistic sampling techniques such as random-digit-dialing, or other techniques for which the population is known. Computers may sometimes be provided to persons with no online access to remove bias that might exist from only including persons or households with Internet access. Once the sample is determined, panels are built and maintained in the same way, regardless of whether they are probability- or non-probability-based. A probability-based sample is more expensive to develop than a non-probabilistic sample. Consequently, systematic replacement or the replacement of panel members lost through attrition is also more costly. The benefit is that a panel can be built that represents the general population and allows analysis of results based on probability theory.

Non-probability and volunteer online panel members are recruited through a variety of techniques, all of which involve self-selection.
The invitations to join a panel can be delivered online (through pop-up or banner advertisements), in magazines, on television, or through any other medium where the target population is likely to see the advertisement. The recruitment entices respondents by offering an incentive, talking about the fun of taking surveys, or other proven techniques. A common practice in the industry for developing online panels is through co-registration agreements. An organization will compile e-mail lists of its website visitors and ask if they would like to receive offers from partner agencies. The e-mail list is then sold to a research panel company. Off-line recruitment strategies include purchasing an organization's customer contact database and asking participants in a telephone survey if they would like to become part of an online panel for future surveys. A technique used for both online and off-line recruitment is to ask existing panel members to refer their friends and relatives, sometimes offering a reward for each new panel member recruited. No two panels are recruited the same way, and the panel research companies carefully guard their methodologies for recruiting panel members.

River sampling is an online technique that uses pop-up surveys, banner ads, or other methods to attract survey respondents when they are needed. In river sampling, the ad presents a survey invitation to site visitors and then directs or "downstreams" them to another, unrelated website to complete the survey. (Using this analogy, a panel would be a pond or reservoir sample.) Knowing on which websites to place the ads is critical to the success of river sampling. This technique is not related to developing a panel, although sometimes the respondent is invited to join a panel at the completion of the survey. There is generally a reward of some kind for completing the survey, such as cash, online merchant gift cards, frequent flyer miles, etc.
This type of sampling may be on the rise as researchers seek larger and more diverse sample pools, and as they try to reach respondents who are surveyed less frequently than those provided through online access panels.

The AAPOR report provides an overview of strategies for adjusting self-selected (non-probability-based) online panels, and reviews complex weighting, quotas, benchmarking, and modeling methodologies for creating a more representative

sample. Complex weighting uses detailed information about the population to balance respondents so that they mirror the population. Quotas, which match key demographics of the respondents with the demographics of the target population, are the most common technique. Benchmarking keeps the sample specifications the same over multiple waves, under the assumption that any changes are the result of changes in the element being measured, regardless of whether the sample is representative of the population. Modeling refers to linking the benchmark results to the real world to model what a survey score of X means in terms of actual outcomes.

When applying statistical significance testing to the panel sample, it is important to recognize that the significance applies to the panel, not to the population. "The error statistics indicate how likely it is that another sample from the same panel will be different, which is a valid and relevant measure of reliability" (Poynter, p. 74). It is not, however, an estimate of the population sampling error, as is commonly understood with traditional random (probabilistic) sampling. Response rates for online access panels have little impact on how representative the research is, but do provide a measure of the quality of the panel.

Issues and Concerns with Online Panel Surveys: AAPOR Report on Online Panels

Online surveys have grown rapidly because of the lower cost, faster turnaround time, and greater reliability in building targeted samples, at the same time that traditional survey research methods are plagued by increasing costs, higher non-response rates, and coverage concerns. The quality of online access panel survey data came into focus in 2006 when the VP of Procter & Gamble's Global Consumer Market Knowledge gave a presentation on the range of problems P&G had faced with online access panel reliability.
It fielded a survey twice with the same panel, two weeks apart, with results that pointed to two different business conclusions. This focused the market research industry's attention on the need to provide understanding, guidance, and research on the topic of online research.

The traditional probabilistic sample, such as random-digit-dialing, is the underpinning of market research. Probabilistic samples are based on the probability of being selected out of a specified population (such as households within the city limits). Based on probability theory, the results can be projected to the population with a statistical level of certainty. Online panel surveys typically use non-probability samples, which are a significant departure from traditional methods.

The AAPOR Report on Online Panels, produced by the AAPOR task force on opt-in online panels, is a seminal work on concerns and issues with online panel (i.e., non-probability sample) survey research. The scope was to "provide key information and recommendations about whether and when opt-in panels might be best utilized and how best to judge their quality" (Baker 2010).

Sampling Error, Coverage Error, and Non-Response Bias

A sample is, by definition, a subset of a population. All surveys, regardless of sampling method, have some level of imprecision owing to variation in the sample. This is known as sampling error. A probabilistic sample is one where sampling theory provides the probability by which each member of the sample is selected from the total population. In traditional sampling methods, such as random-digit-dialing of households within a geographic area, the total population of home telephone numbers is known. With address-based sampling, the total number of addresses in a specific area is known. Thus the total population is known and the probability of selecting any one phone number (or address) is known. This allows the data to be projected to the population as a whole.
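The projection step described above rests on simple probability arithmetic. As a hedged illustration (the sample size and observed share are invented for the example), the 95% margin of error for a proportion estimated from a simple random sample is roughly 1.96 standard errors:

```python
import math

# Illustrative numbers only: suppose 400 of 1,000 randomly sampled
# households (a probability sample) report riding transit weekly.
n = 1000
p_hat = 400 / n

# Standard error of a proportion under simple random sampling.
se = math.sqrt(p_hat * (1 - p_hat) / n)

# 95% confidence interval: point estimate +/- 1.96 standard errors.
margin = 1.96 * se
low, high = p_hat - margin, p_hat + margin
print(f"{p_hat:.2f} +/- {margin:.3f} -> ({low:.3f}, {high:.3f})")
```

No such interval can legitimately be claimed for an opt-in panel, because the selection probabilities are unknown; that is precisely the concern the AAPOR report raises about non-probability samples.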
The difficulty with online sampling is that the population is unknown. Typically an e-mail address is used as the sampling unit (rather than a home telephone, as in the earlier example). The issues with e-mail addresses include duplication problems, in that one person may have more than one e-mail address; and clustering problems, where an e-mail address represents more than one person. As a result, online sampling differs from traditional sampling in three significant ways: (1) the concept of a sampling frame is discarded and the focus is shifted to recruiting as large and diverse a group as possible; (2) instead of a representative sample of all households, a diverse group of persons with the attributes of interest for the panel is recruited; and (3) the panel membership is rarely rotated, with panel members being retained as long as they keep completing surveys. Over time, this can lead to a very different panel membership than the initial profile of the panel.

Coverage error occurs when persons, or groups of persons, have zero chance of being selected to participate in the survey. Lack of access to the Internet creates significant coverage bias. The AAPOR report includes data from 2008 stating that although 85% of the households in the continental United States have some level of Internet service, those without Internet access differ significantly from those who do. Those without access are more than twice as likely to be over the age of 65 as the general population. They are also more likely to be members of a minority group, to have incomes less than $25,000, to have a high school education (or less), to be unemployed, not to own a home, and to live in rural counties or the South Census Region. It can also be noted that having access to the Internet does not necessarily make for active users of the Internet.
In 1970, household telephone coverage estimates of 88% led to the acceptability of using telephone surveys in place of in-person interviewing. Coverage estimates of Internet usage are currently lower than 88%, indicating that it has not yet reached a level where it can be used to represent the general population.

Commercial online access panels are even more problematic, in that a person has to have Internet access, receive an invitation to become a panel member, sign up for the panel,

and then participate in the surveys. Current estimates are that less than 5% of the population has signed up for an online panel, meaning that more than 95% of the population has a 0% chance of being selected.

Non-response bias occurs when some of the persons in the sample choose not to respond to the survey, or to some of the questions within the survey. The report discusses four stages of panel development and how each is affected by non-response bias:

Stage 1: Recruitment of panel members. The previous discussion on coverage error points out issues with Internet access. In addition, there is bias regarding which Internet users are likely to join a panel. The report cites several studies that found online panels are more likely to be comprised of white, active Internet users with high education levels who are considerably more involved in civic and political activities; and who place less importance on religion and traditional gender roles and more importance on environmental issues.

Stage 2: Joining and profiling the respondents. Most panels require a respondent to click through from the online ad to the company's website to register for the panel and complete some profile information, including an e-mail address. An e-mail is sent to the prospective panel member, who must respond in order to join the panel. A study by Alvarez et al. (2003) reported that just over 6% of those who clicked on the banner ad completed all of the steps to become a panel member.

Stage 3: Completing the questionnaire. This is similar to random-digit dialing when a person refuses to participate in the survey or does not meet the eligibility requirements. Online surveys have an additional non-response bias from technical problems that can prevent delivery of the e-mail invitation or completion of the survey itself.
Some panels will oversample groups that are known to have low response rates in order to have a representative sample after data collection is complete. Although this may result in a balanced sample on that particular dimension, it does not ensure that the sample is representative on other points.

Stage 4: Panel maintenance. Attrition can be "normal," when people opt out for whatever reasons; or can be forced, when panel members are automatically dropped from the panel after a set period of time to keep the panel fresh. Many strategies are used to reduce panel attrition, but little research exists on reducing attrition or on determining the most "desirable" attrition rate to balance the costs of adding panel members with the potential concerns of long-term membership, such as panel conditioning.

Measurement Error

Measurement error is defined as the difference between an observed response and the underlying true response. This can be random error, as when a respondent picks an answer other than the true response, without any systematic direction in the choice made. Systematic measurement error, or bias, occurs when the responses are more often skewed in one direction. Much of the literature regarding measurement error is related to the benefits and potential biases of personal interviewers and self-administered surveys, including paper and online surveys. Because this issue is related to data collection methodology for any survey, not specifically to panel surveys, it is beyond the scope of this project and will not be covered in this literature review. However, this is an important issue for all survey efforts, and researchers are encouraged to look at the issues related to both interviewers and self-administered surveys.

One measurement issue directly related to panel surveys is that of panel conditioning.
Repeatedly taking surveys on a particular topic is known to make respondents more aware of that topic, pay more attention to it in their daily lives, and therefore give different responses on future surveys than if they had not been on the panel. The research on panel conditioning with online panels has mixed findings. Some studies have shown a marked bias toward an increased likelihood to purchase; other studies show that this effect can be mitigated by varying topics from survey to survey. Other research studies have shown no difference in attitudinal responses between infrequent and very experienced panel survey members. There are two theories on the effects of taking large numbers of surveys: experienced survey-takers may be more likely to answer in a way that they believe is best for themselves (e.g., it will earn them more incentives, or get them more surveys to complete); alternatively, experienced survey-takers may understand the process better, resulting in more accurate and complete responses. So far, there is no definitive research on how panel members' completing large numbers of surveys affects the accuracy of the survey results.

Sample Adjustments to Reduce Error and Bias

It is agreed by most researchers that online panels are not representative of the general population, and that techniques are needed to correct for this if the results are used. Four techniques have been used to attempt to correct for the known biases with a goal of making the sample representative of the population: sampling to represent a population; modeling; post-survey adjustment; and propensity weighting.

1. The most common form of sampling to represent a certain population is quota sampling, with the quotas often being demographics to match the census. Other elements can be factored in by, for example, balancing members by political affiliation.
There does not appear to be any research on the reliability or validity of this type of sampling applied to panel surveys.
2. Models are frequently used in the physical sciences and in epidemiological studies to reduce error and bias. Online panels are much more complex than epidemiological studies, however, making it more difficult to apply model-based techniques.

3. The most common post-survey adjustment is the weighting of survey data. With probability samples, the difference between the sample and the sampling frame is handled through probability theory principles. Because there is rarely a sampling frame in an online sample, the census and other sources are typically used to adjust the results for under-representation of certain groups of respondents. Work conducted by Dever et al. (2008) found that inclusion of enough variables could eliminate coverage bias, but did not address problems associated with being a non-probability sample.
4. To apply propensity weighting, a second "reference" survey with a probability-based sample is conducted at the same time as the online panel survey, using the same questions. A model is built that can be used to weight future online surveys to better represent the target population. Although this technique can be used successfully, it can also increase other types of error, leading to erroneous conclusions from the resulting data.

The AAPOR report (Baker 2010) provides an extensive discussion of and guidance on applying these techniques. The reader is encouraged to review the report before applying a sampling adjustment technique.

Panel Data Quality

Panel data cleaning is an important step in delivering results from respondents who are real, unique, and engaged in the survey. Three areas of cleaning panel data are discussed: eliminating fraudulent respondents, identifying duplicate respondents, and measuring engagement. Fraudulent respondents are those who sign up for a panel multiple times under false names and lie on the qualifying questionnaire to maximize their chances of participation. Duplicate responses occur when respondents answer the questionnaire more than once from the same invitation or when they are invited to complete the survey more than once because they belong to more than one panel. Measuring engagement is the most controversial technique.
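The first two cleaning areas, duplicate respondents and disengaged "speeders," lend themselves to simple automated checks. A minimal sketch; the record layout and the 30%-of-median threshold are invented for illustration, not taken from the AAPOR report:

```python
# Hypothetical response records; field names are illustrative only.
responses = [
    {"id": "r1", "email": "a@example.com", "seconds": 480},
    {"id": "r2", "email": "b@example.com", "seconds": 45},   # suspiciously fast
    {"id": "r3", "email": "a@example.com", "seconds": 510},  # duplicate respondent
]

# Flag duplicate respondents: more than one completion per e-mail address.
seen, duplicates = set(), []
for rec in responses:
    if rec["email"] in seen:
        duplicates.append(rec["id"])
    seen.add(rec["email"])

# Flag "speeders": completion times far below the median (threshold arbitrary here).
median_time = sorted(r["seconds"] for r in responses)[len(responses) // 2]
speeders = [r["id"] for r in responses if r["seconds"] < 0.3 * median_time]

print(duplicates, speeders)  # ['r3'] ['r2']
```

Production panel software layers many more signals (straightlining in matrix questions, nonsense open-ended answers) on top of checks like these, as the next paragraph describes.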
Four basic cleaning strategies are used to weed out respondents who may not be engaged with completing the survey, but are simply answering to earn the incentives: recognizing respondents with very short survey times (compared with all surveys); identifying respondents who answer all questions in a matrix format (usually scaled questions) the same way; recording an excessive selection of non-substantive answers, such as "don't know"; and noting nonsense answers or identical answers provided for all open-ended questions.

Although there was no research at the time that demonstrated the effects of using cleaned data on the sample or final results, it is generally accepted that negative respondent behavior is detrimental to data quality.

Industry Focus on Quality

The market research industry has been focused on panel data quality, with virtually every national and international association incorporating principles and guidelines for conducting online and panel research. Four key efforts are highlighted in the report:

1. The Council of American Survey Research Organizations (CASRO) revised its Code of Standards and Ethics for Survey Research in 2007 to include specific clauses related to online panels.
2. ESOMAR developed comprehensive guidelines titled "Conducting Market and Opinion Research Using the Internet." This was supplemented by its "26 Questions to Help Research Buyers of Online Samples."
3. The International Organization for Standardization technical committee that developed ISO 20252 (Market, Opinion and Social Research) also developed ISO 26362 (Access Panels in Market, Opinion and Social Research). The standard defines key terms and concepts in an attempt to create a common vocabulary for online panels, and details the specific kinds of information that a research panel is expected to make available to a client at the conclusion of every project.
4.
The Advertising Research Foundation established the Online Research Quality Council, which in turn designed and executed the Foundations of Quality project. Work was in progress as of the writing of the AAPOR report, and as of the writing of this synthesis, results of the effort were just being made public.

Recommendations

The AAPOR Report on Online Panels makes the following recommendations to market researchers who are considering using online access panels:

• A non-probability online panel is appropriate when precise estimates of population values are not required, such as when testing the receptivity to product concepts and features.
• Avoid using non-probability online access panels when the research is to be used to estimate population values. There is no theoretical basis for making projections or estimates from this type of sample.
• The accuracy of self-administered computer surveys is undermined when the sample is a non-probability sample. A random-digit-dial telephone survey is more accurate than an online survey because it is a probability sample, despite the coverage error arising from households without a landline phone.
• It has not yet been demonstrated that weighting the results from online access panel surveys is consistently effective and can be used to adjust for panel bias.

• There are significant differences in the composition and practices of various online access panels, which can affect survey results. Different panels may yield significantly different results on the same questionnaire.

Market Research by the Public Sector

Poynter's book devotes a section to issues specific to public sector research. Although most of the marketing research principles apply equally to the private and public sectors, there are a few areas where the public sector researcher needs to be particularly attentive, because public funds are being used to conduct the research and the results may determine how public funds are expended. Areas for particular attention are identified as: operating in the public eye, "representativity," geographical limitations, social media and the public sector, and ethics.

In the Public Eye

Public sector research is subject to audit, inspection, and reporting in the media. Freedom of information laws ensure that the public has a right to see how public funds are being spent. Poorly conducted research could be brought to light in a public forum, creating public relations problems for a perceived waste of taxpayer money and jeopardizing the ability to conduct future research. As a result, care must be taken to ensure public sector research is conducted to the highest quality and ethical standards.

Representativity

Having a representative sample is always important, but it is of special concern for public agencies. Many public services, such as public transportation, target specific groups which may face multiple challenges. Much of the target population may not have Internet access, and those that do may not be typical of the market segment they are expected to represent. For each study, the researcher must carefully assess whether an online survey is appropriate for that market and research purpose, and whether the sampling and recruitment strategies provide survey results that can be defended in public.
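One common way to probe (and partially correct) representativity is post-stratification weighting against census-style population shares, one of the adjustment techniques reviewed earlier. A minimal sketch; the categories, counts, and population shares are invented for illustration:

```python
from collections import Counter

# Hypothetical panel respondents by age group (invented data).
respondents = ["18-34"] * 50 + ["35-64"] * 40 + ["65+"] * 10

# Invented population shares the sample should mirror (e.g., from the census).
population_share = {"18-34": 0.30, "35-64": 0.50, "65+": 0.20}

# Post-stratification weight = population share / sample share.
counts = Counter(respondents)
n = len(respondents)
weights = {g: population_share[g] / (counts[g] / n) for g in population_share}

for g, w in sorted(weights.items()):
    print(g, round(w, 2))  # 18-34 0.6 / 35-64 1.25 / 65+ 2.0
```

Large weights, such as the 2.0 for the under-represented 65+ group here, signal exactly the coverage problem the text warns about: weighting stretches the respondents a panel did reach, but cannot manufacture the ones it never reached.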
Geographical Limitations

Public agencies have strict geographic boundaries from which a sample population can be drawn. Face-to-face or telephone surveys are often simplified by these restrictions. Surveys using an online access panel, however, can be problematic, as there may not be an adequate sample of persons from the target area. This is further exacerbated when the sample is also required to be representative of the population within a specified geographical area.

Social Media and the Public Sector

There are several ways in which social media are being used for research in the public sector. Online communities engage in a range of activities, including information-sharing, research, community-building, and engagement. Online research communities are typically closed communities, operated by a vendor, with membership by invitation only as part of an overall sampling plan (see the MnDOT case example of an online research community). Twitter, blogs, and public discussions are resources for passive research, using data mining tools to monitor trends in what people are saying about the agency. Although useful information can be elicited from these sources, they do not provide a representative sample and should be considered public comment rather than research.

Social media are often used to reach out to groups that are otherwise hard to reach, such as young adults. Note, however, that using a variety of social media channels, such as Facebook, YouTube, and Twitter, is likely to reach the same people multiple times. If multiple social media channels are used to recruit online survey participants, for example, the researcher must be prepared for the potential duplication of survey responses.

Ethics

There is an expectation that research will be reliable and can be used by a decision-making body in a public forum. First and foremost, the researcher must provide unbiased market research.
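The duplication risk noted above can be reduced with a first-response-wins rule applied before analysis. This sketch assumes each response carries an identifying field (an email address, purely for illustration; a survey platform's own respondent ID would normally serve).

```python
def deduplicate_responses(responses, key="email"):
    """Keep the first response per identifier (case-insensitive), preserving order.

    Returns (unique_responses, number_of_duplicates_dropped).
    """
    seen = set()
    unique = []
    for response in responses:
        ident = response[key].strip().lower()
        if ident not in seen:
            seen.add(ident)
            unique.append(response)
    return unique, len(responses) - len(unique)

# Hypothetical responses recruited through three social media channels;
# the same rider arrives twice under differently cased addresses.
raw = [
    {"email": "rider@example.com", "channel": "facebook"},
    {"email": "other@example.com", "channel": "twitter"},
    {"email": "Rider@Example.com", "channel": "youtube"},
]
clean, dropped = deduplicate_responses(raw)
# clean keeps the Facebook and Twitter responses; dropped == 1.
```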
Often a vendor is used to conduct the research so as to provide a wall between the agency and the research and to avoid the appearance of leading the respondents, or “spinning” the results. The second concern is that quantitative research based on random probability sampling has been the standard method for achieving the level of reliability expected of a public agency. Since online research is typically not drawn from a probabilistic sample, the researcher should recognize the potential lack of statistical reliability inherent in the research design and ensure that decision makers understand the limitations of the data.

FUTURE OF MARKET RESEARCH

Technology has fundamentally changed how society communicates and how it does business. Whereas people used to communicate by means of the telephone at home, cell phones make communication possible virtually anywhere. Cell phone numbers do not represent a physical address; they have become a moving, real-time “personal” address. The Internet provides instant access to information and communication through e-mail, websites, and social media. The smart phone combines mobile communication with the Internet, creating a completely new, technology-based world. Panels can now be developed online, quickly and easily. Household and personal contact information is no longer tied to a home address, but exists outside of the person’s geographic location.

This technology has led to a revolution in market research. Recruiting survey respondents is easier; developing a panel is faster; and surveys are online, resulting in automated survey tabulation and reporting. As a result, recruiting and maintaining research panels is simpler, less expensive, and very attractive to decision makers who want results “now.” But these changes have created a myriad of concerns, primarily related to the use of non-probabilistic sampling practices.

The history of sampling theory provides some insight into what may occur in the future. In 1895, before sampling theory had been developed, Anders Kiaer convinced an international audience that representative samples could be used to represent a population. Morris Hansen of the U.S. Census Bureau greatly expanded the theory and practice of sampling and helped convince the bureau to accept sampling and quality control methods in the 1940 Census. Through the leadership of these two important individuals, the practice was adopted. From there, sampling theory was developed—it followed the practice, rather than the theory creating the practice.

Brick states that data collection costs will continue to put pressure on agencies to use non-probability samples from online recruitment. If this cannot be done within design-based probability sampling theory, he suggests two potential outcomes: a new paradigm that accommodates online surveys is introduced, replacing or supplementing traditional probability sampling; or online surveys using non-probabilistic sampling are restricted to specific applications because of their weak theoretical basis.
One potential solution is the use of multiple-frame sampling to reduce coverage error, a fundamental concern with online panel research. For example, to reach transit riders, an online survey could be placed on the agency website and be supplemented with paper surveys on board vehicles. Statisticians are working on establishing a theoretical basis for conducting sampling using the multiple-frame technique (Brick 2011).

In addition to the changes in survey practice that led to the historical development of sampling theory, two additional factors are cited as creating the paradigm shift from population surveying to representative sampling in 1934. The first was the wealth of scientific development and statistical ideas, not necessarily related to survey sampling, which nevertheless supported the growth and change in methods. The second factor was society’s demand for information on a wide range of topics that made population sampling cumbersome and expensive. This desire for faster, cheaper research drove the development of probability sampling and our current market research paradigm. These characteristics are in place today, almost 80 years after probabilistic sampling made its debut. With the rapid changes in technology and society’s insatiable thirst for more information, more quickly, and for less cost, a new research paradigm with a theoretical foundation to support non-probabilistic online surveying may be on the horizon.
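The multiple-frame idea can be sketched as a simple composite estimator in which the overlap domain (riders reachable by both the online and the on-board survey) blends the two frames' estimates. The mixing weight and all figures below are illustrative assumptions, not values from the synthesis; choosing such weights in a principled way is part of the theoretical work Brick describes.

```python
def dual_frame_estimate(mean_a_overlap, mean_b_overlap, mean_b_only,
                        share_overlap, theta=0.5):
    """Composite estimate of a population mean from two overlapping frames.

    mean_a_overlap: overlap-domain estimate from frame A (e.g., online survey)
    mean_b_overlap: overlap-domain estimate from frame B (e.g., on-board survey)
    mean_b_only:    estimate for riders covered only by frame B
    share_overlap:  population share of the overlap domain
    theta:          mixing weight given to frame A within the overlap domain
    """
    overlap = theta * mean_a_overlap + (1 - theta) * mean_b_overlap
    return share_overlap * overlap + (1 - share_overlap) * mean_b_only

# Hypothetical shares of riders satisfied with service, by frame and domain.
est = dual_frame_estimate(0.62, 0.58, 0.40, share_overlap=0.70)
# 0.70 * (0.5*0.62 + 0.5*0.58) + 0.30 * 0.40 = 0.54
```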

TRB’s Transit Cooperative Research Program (TCRP) Synthesis 105: Use of Market Research Panels in Transit describes the various types of market research panels, identifies issues that researchers should be aware of when engaging in market research and panel surveys, and provides examples of successful market research panel programs.

The report also provides information about common pitfalls to be avoided and successful techniques that may help maximize research dollars without jeopardizing the quality of the data or validity of the results.



Assessing and enhancing the impact potential of marketing articles

  • Theory/Conceptual
  • Open access
  • Published: 02 December 2021
  • Volume 11, pages 407–415 (2021)


  • Elina Jaakkola, ORCID: orcid.org/0000-0003-4654-7573
  • Stephen L. Vargo


Although the impact of marketing is a recognized priority, current academic practices do not fully support this goal. A research manuscript’s likely influence is difficult to evaluate prior to publication, and audiences differ in their understandings of what “impact” means. This article develops a set of criteria for assessing and enhancing a publication’s impact potential. An article is argued to have greater influence if it changes many stakeholders’ understandings or behaviors on a relevant matter; and makes its message accessible by offering simple and clear findings and translating them into actionable implications. These drivers are operationalized as a checklist of criteria for authors, reviewers, and research supervisors who wish to evaluate and enhance a manuscript’s potential impact. This article invites scholars to further develop and promote these criteria and to participate in establishing impact evaluation as an institutionalized practice within marketing academia.


Introduction

For decades, there has been a lingering concern that the impact of marketing is declining, both as a discipline and in the board room (e.g., Clark et al., 2014 ; Lehmann et al., 2011 ; Reibstein et al., 2009 ). In academia, one might argue that “being impactful” has become a mantra: citations are counted for promotion applications; scholars’ reputation is increasingly affected by their h-index; external funding bodies tend to make decisions based on a research project’s expected business or societal impact; and business school accreditation boards and university ranking systems treat impact as a key standard (e.g., Birkinshaw et al., 2016 ). As a result, scholars engage in ongoing discussion of how marketing research might be made more relevant, important, and useful (e.g., Bolton, 2020 ; Kohli & Haenlein, 2021 ; MacInnis et al., 2020 ; Stremersch, 2021 ), so acknowledging that the future of marketing as a science will ultimately be defined by its impact.

At the same time, many scholars have noted that current academic practices fail to optimally support impactful research (e.g., Clark et al., 2014 ; Key et al., 2020 ; Reibstein et al., 2009 ). As the discipline of marketing grows and matures, standards continue to rise for publishing and tenure, prompting a methods focus as opposed to an impact focus (Houston, 2019 ). Researchers tend to remain stuck in disciplinary or philosophical silos which means that they continue to speak to narrow, predefined audiences (MacInnis et al., 2020 ). Many journal editors have called for research that breaks with existing institutions—for example, more conceptual work (e.g., Clark et al., 2014 ; MacInnis, 2011 ; Yadav, 2010 ); more relevant and important research topics (Kohli & Haenlein, 2021 ; Reibstein et al., 2009 ); and more boundary-challenging or interdisciplinary research (Moorman et al., 2018 , 2019 ; Yadav, 2018 ).

However, in a persisting disconnect between the ideal of impact and grassroots practice, both authors and reviewers continue to prioritize methodological rigor over impact potential. While authors often go to great lengths to explicate how their work adds to existing knowledge, the likely influence of their findings on various stakeholders is seldom comprehensively discussed. One possible reason for this is that the future impact of a given article is difficult to evaluate prior to publication. Meanwhile, there are well established quality guidelines for rigor, including validity, reliability and objectivity; and failure to articulate measures to ensure research trustworthiness often results in manuscript rejection. In the absence of any equivalent criteria, evaluating impact remains largely intuitive. This lack of concrete actionable tools for evaluating and articulating the potential impact of a research manuscript hinders its adoption as a guiding norm for academic marketing research.

In this article, we contend that the promotion of more impactful marketing research will depend on developing and institutionalizing a set of criteria for assessing and enhancing a study’s potential impact. Our aim here is to take a first step toward that goal. At present, authors can access advice on how to enhance particular impact-related aspects of a manuscript—for example, by crafting interesting and relevant research questions (e.g., Kohli & Haenlein, 2021 ; Lange & Pfarrer, 2017 ; Shugan, 2003 ) or developing theoretical contributions (e.g., MacInnis,  2011 ; Makadok et al., 2018 )—but there is as yet no comprehensive set of criteria for evaluating the diverse drivers of impact. While one can argue that prestigious journals will not accept an article that makes no significant contribution to the literature, many published articles have little impact, at least in terms of their citation count. In short, while an article’s contribution is undoubtedly one element of impact, the two are not identical.

To be useful, a set of impact criteria should operationalize and integrate the key factors that can be said to drive the influence of a study on various stakeholders, and be observable in a research publication such as a journal manuscript or an academic thesis. As well as highlighting the importance of impact, the articulation of such criteria would support the assessment and enhancement of a research manuscript’s potential impact and would ultimately facilitate the establishment of such measures as an institutionalized practice in marketing academia.

We start by reviewing existing viewpoints on impact in marketing research to define what this means as a goal for academic publication. Next, we identify drivers of impact potential that should inform the development of explicit criteria. We identified those drivers by analyzing research articles and editorials that focus on impact and relevance, as well as by reflecting on our own experiences as editors, reviewers, and authors for various journals. As the article’s main outcome, we specify an integrative set of criteria that can be used to evaluate and enhance the likely impact of articles submitted for publication.

Defining impact as a goal for academic publication

In operationalizing impact, one underlying challenge is the multifaceted nature of that concept and how interpretations of the term differ across users and audiences (Penfield et al., 2014 ). For that reason, scholarly impact must be more clearly defined in order to develop effective ways of inspiring and achieving it (Aguinis et al., 2014 ). Current scholarly and policy research reveals diverse perspectives on the nature of impact, ranging from a publication’s citation count to a research program’s societal benefits (see Aguinis et al., 2014 ; see also Table 1 ). Following Reale et al. ( 2018 , p. 298), we define research impact here as “a change that research outcomes produce upon academic activities, the economy, and society at large.”

As outlined in Table 1 , marketing scholars typically address impact from three complementary perspectives. The first of these focuses on scientific impact in terms of the influence that particular publications, authors, journals, or streams have within marketing, or the influence of marketing research on other business sciences, typically measured by citations (Aguinis et al., 2014 ; Baumgartner & Pieters, 2003 ). The second approach focuses on business impact : how research informs marketing practice (e.g., Roberts et al., 2014 ) or how marketing contributes to firm performance and decision making (e.g., Krasnikov & Jayachandran, 2008 ). Finally, the third perspective focuses on the societal impact of marketing research: the effect of scholarly output on advancing wellbeing (e.g., Blocker et al., 2013 ) or the use of that knowledge for public or societal decision making (e.g., Davis & Ozanne, 2019 ). In that context, possible audiences for marketing research include educators, funding agencies, media, public policymakers, and regulators (Shugan,  2003 ).

As Table 1 illustrates, current perspectives on impact within the discipline of marketing address different domains of change, operationalizing impact as the diffusion and/or influence of academic output among stakeholders within these domains. Accordingly, an academic study’s impact is arguably greater when it engages a broader range of stakeholders and triggers more extensive change in their thinking or actions.

Drivers of impact

Scholars have discussed myriad factors that may affect research impact, not all of which can be assessed at the level of academic publication (Hauser, 2017 ; Sternberg & Gordeeva, 1996 ). These factors range from the promotion of published articles to academic practices pertaining to broader science policy, such as industry-academia collaboration (e.g., Hauser, 2017 ; Kohli & Haenlein, 2021 ; Lindgreen et al., 2020 ). As the goal of this article is to develop guidelines for assessing a paper’s impact potential, we confine our attention here to drivers that are observable within a manuscript. For that reason, some measures such as the dissemination of research findings through seminar presentations or social media (e.g., Lindgreen et al., 2020 ) or efforts to influence managerial and public policy decision making through marketing institutions and consultancy (Bolton, 2020 ) are beyond our scope here.

Based on our reading of the literature, our reflections on articles that have come to influence marketing thought over time, and our own experiences as editors, reviewers, and authors, we propose that there are two key drivers of impact (Fig.  1 ). The first of these is change potential , reflecting the common view of impact as the change that research induces in a range of stakeholders (Morton, 2015 ; Penfield et al., 2014 ; Reale et al., 2018 ; Stremersch, 2021 ). The second driver is accessibility : how effectively a publication communicates its message to the intended audience. Many scholars have noted that insightful ideas may go unnoticed if obscured by complicated communication (e.g., Stremersch et al., 2007 ; Houston, 2019 ; Warren et al.,  2021 ), and our own experiences of the review processes of various journals confirm this view.

figure 1

Drivers of publication impact potential

According to the proposed framework, a publication’s impact potential will be high if it is likely to promote significant change in important stakeholders’ understanding or behavior and if a simple explanation is clearly translated into actionable implications to make the message accessible. These elements are discussed in more detail below.

Change potential

Many authors consider the research question to be the primary driver of impact (Houston, 2019 ; Stremersch, 2021 ), emphasizing that impactful research promotes change in matters that are relevant (e.g. Bolton, 2020 ; Kohli & Haenlein,  2021 ). Scholars have suggested alternative pathways for identifying relevant and important research topics. One approach is to address a topic that is likely to advance theory within a given research domain by solving a puzzle, paradox, or tension that hinders its development or by studying emerging phenomena that require substantial theoretical advances to understand them (Houston, 2019 ; Li et al., 2021 ; Smith, 2003 ; Yadav, 2018 ). Smith ( 2003 ) suggested that impactful research topics might for example initiate new domains of inquiry by introducing novel concepts, resolving inconsistent findings, or intervening in accepted causal models to posit new explanations. Others have argued that relevant research problems often relate to “big questions” such as climate change, which pose challenges for researchers because they tend to be broad and ill-structured, and lack clear criteria or algorithms for solving them (Key et al., 2020 ). Another frequently suggested approach is to identify significant problems faced by marketing stakeholders (Kohli & Haenlein, 2021 ; MacInnis et al., 2020 ; Moorman et al., 2019 ). For example, Zeithaml et al. ( 2020 ) suggested exploring marketing phenomena from the perspective of those most closely involved—firms, consumers, and managers—in order to focus on key issues for marketing practice by capturing relevant issues in real-world settings. Collaboration with practitioners and immersion in industry or related contexts can help researchers to pinpoint important problems that need to be solved (Bolton, 2020 ; Stremersch, 2021 ).

Another dimension of research impact is magnitude of change —that is, how radically the research might change current understanding or behavior (Kohli & Haenlein,  2021 ). According to Stremersch ( 2021 ), impactful research potentially causes someone to act or think differently—for example, by influencing stakeholder decision making, prompting the adoption of different methods, changing opinions or objectives, or promoting a new approach to problem solving (Morton, 2015 ; Shugan, 2003 ). Analyses indicate that impactful research articles often explain existing phenomena better than previous work (Sternberg & Gordeeva, 1996 ) or challenge the disciplinary status quo (Li et al., 2021 ). Tellis ( 2017 ) noted that arguments refuting current knowledge are especially impactful. In a similar vein, Smith ( 2003 ) contended that challenging taken-for-granted practices and assumptions makes a study interesting. More generally, challenging the premises on which a theory, domain, concept, or method relies can induce significant change—for example, by testing a research stream’s key assumptions or probing the external validity of what is considered true (e.g., Hauser, 2017 ; Makadok et al., 2018 ; Smith, 2003 ).

Problematization beyond incremental gap-spotting is another widely recommended method of identifying and formulating research questions that can potentially induce major change within a given field (Alvesson & Sandberg, 2011 ). In the present context, problematization means questioning the assumptions underpinning existing theory in some significant way to identify new and inspiring points of departure for theory development or paradigm shift (Sandberg & Alvesson, 2011 ). In practical terms, this means articulating a compelling complication in something assumed to be true in the field and arguing for the significance of that complication (Lange & Pfarrer, 2017 ). According to Sandberg and Alvesson ( 2011 ), research that merely identifies gaps in the existing literature relies on and even strengthens prevailing assumptions because that gap falls within the confines of existing theory. Research inspired by the existing academic literature in one’s silo is likely to perpetuate ignorance of emergent phenomena pertinent to marketplace stakeholders and academics in other disciplines (MacInnis et al., 2020 ; Yadav, 2018 ). Sandberg and Alvesson ( 2011 ) argued that gap-spotting studies are inherently limited by their aim of adding to the existing literature rather than challenging it; the smaller the addition, the smaller the change induced.

A third issue is breadth of change : the extent of the audience whose understanding or behavior might be affected by the research findings. According to Kohli and Haenlein ( 2021 ), an issue that affects a large number of stakeholders is arguably more important than one that affects only a few; in the present context, potential audiences include managers, public policy makers, consumers, academics, consultants, and societal groups. Smith ( 2003 , p. 319) refers to Zaltman’s suggestion that an interesting idea is one that, if true, would require a large number of people to substantially change their beliefs or behaviors. Bolton ( 2020 ) highlights that consideration of domain(s) and stakeholders that may participate in and benefit from a study is an important part of responsible and impactful research, and provides an elaborate checklist for identifying relevant stakeholders for marketing research.

Accessibility

As the second key driver of impact, accessibility refers to how a publication’s message is communicated. We focus here on three elements of accessibility: simplicity, clarity, and actionability (see Fig.  1 ). Many scholars have emphasized that simplicity is central to the impact of theoretical explanation. According to Tellis ( 2017 , p. 4), “A good theory is a simple explanation of a phenomenon. The best theory is the simplest explanation for a wide set of phenomena”; or as Einstein expressed it, “The grand aim of all science is to cover the greatest number of empirical facts by logical deduction from the smallest number of hypotheses or axioms” (cited in Barnett, 2005 ). The same basic idea is also captured by Occam’s Razor: that the best explanation for a given purpose is the simplest one—or more precisely, “entities [assumptions, foundations] should not be multiplied beyond necessity.”

Contrary to these well-established principles, marketing models and frameworks and associated manuscripts tend toward increasing complexity, arguably for two reasons. First, there is a tendency to draw on multiple theoretical frameworks or models to address the research objective but to inadequately reconcile them—i.e., reduce them to a set of common concepts—resulting in unnecessarily complex “Frankenstein models” (Vargo & Koskela-Huotari, 2020 ). The second problem is perhaps an unintended consequence of increasingly sophisticated methodological approaches such as structural equation modeling, which allow, if not encourage, modelers to incorporate an ever-increasing number of variables. In many cases, these are likely to be related issues.

Clarity is another important element of accessibility and impact. Several authors have noted that, over time, marketing articles have become increasingly complex and difficult to read because of how they use language (e.g., Brown et al., 2005 ; Key et al., 2020 ). In this regard, Warren et al. ( 2021 ) referred to the “curse of knowledge,” as researchers’ familiarity with their own field prompts them to adopt a more abstract, technical, and passive writing style that makes the message more difficult to understand. This in turn hampers impact because a text that obscures insights rather than illuminating them is likely to be ignored (Houston, 2019 ). In short, when the intended message is not understood, change will not occur.

One method for assessing clarity is quantitative assessment of relative sentence and word length, as in the Gunning ( 1952 ) Fog Index and the Flesch ( 1948 ) index of Reading Ease. Noting that the quality of writing improves with brevity, Tellis ( 2017 ) recommends shorter sentences, trimming of redundant phrases, and streamlining of arguments to produce articles that are short, forceful, and idea-packed. However, studies of the correlation between these variables and impact (measured as citation count) have produced mixed results. Warren et al. ( 2021 ) argued that this is because readability is not the same as clarity, which is more adequately captured as use of (1) concrete language, (2) concrete examples, (3) common words, and (4) active voice. They found these qualities to be associated with higher degrees of understanding and impact as measured by citation counts. In short, other things being equal, simpler is better.
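The readability indices mentioned above are easy to approximate. The sketch below implements the Flesch (1948) Reading Ease formula with a crude vowel-group syllable heuristic, so its scores are indicative rather than exact; higher scores mean easier reading.

```python
import re

def count_syllables(word):
    """Rough heuristic: one syllable per run of consecutive vowels."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def flesch_reading_ease(text):
    """Flesch Reading Ease: 206.835 - 1.015*(words/sentence) - 84.6*(syllables/word)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return (206.835
            - 1.015 * len(words) / len(sentences)
            - 84.6 * syllables / len(words))

plain = "We ride the bus. It is fast."
dense = ("Methodological sophistication notwithstanding, representativeness "
         "considerations necessitate comprehensive operationalization.")
# Short common words and short sentences score far higher (easier) than
# the single long, abstract sentence.
```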

Previous research indicates that an accessible writing style is especially important for engaging nonacademic audiences (Stremersch et al., 2007 ). Simple and powerful ideas, straightforward methods, and clear writing can heighten the subject matter’s appeal for relevant stakeholders such as academics in other fields, business practitioners, the media, policy makers, and the general public (MacInnis et al., 2020 ). The language used is also likely to affect uptake of the topic among popular writers (Gonsalves et al., 2021 ). Journalists, consultants, and other professional service providers typically play a brokering role between academia and practice, offering a valuable conduit for disseminating research findings (Roberts et al., 2014 ). However, unclear writing may discourage these important middlemen or may promote a “telephone game” effect, where journalists and consultants misinterpret the research and transmit misleading messages to a wider audience.

Another accessibility-enhancing element is the extent to which research findings are actionable . Researchers can optimize their study’s change potential by translating their findings into concrete guidance, action points, or tools for practitioners or other researchers. Identifying useful research implications to guide future studies can also enhance an article’s impact (Sternberg & Gordeeva, 1996 )—for example, by highlighting novel research questions implicit in the findings (MacInnis, 2011 ) and explaining how the findings could or should be used, and by whom.

Outlining managerial and societal implications is obviously another important means of increasing a study’s practical impact. In a journal article, a practical implications section offers space for translating conceptual findings into a practical format. In their analysis of most-cited business-to-business marketing articles, Baraldi et al. ( 2014 ) found that articles with a dedicated practical implications section offered more actionable implications than those distributing them throughout the article. In the latter case, implications were often abstract, non-normative, and too complicated or trivial, using language that was excessively scientific (Baraldi et al., 2014 ), again confirming the importance of switching language and presentation when addressing managers.

Key et al. (2020, p. 164) argued that rather than creating “parallel journal universes (jargon-academic and translated-practitioner),” rigorous article writing should altogether avoid using impenetrable language that may alienate practitioners. However, scientific concepts are the means through which researchers arrive at findings that should in turn render implications for a wider audience beyond academia. Translation thus refers to more than language, requiring the researcher to explain why and how their study is relevant to a particular phenomenon or stakeholder group, what problems the study findings can solve, and how they might change how people think and behave (cf. Shugan, 2003).

Assessing and enhancing impact potential: Tentative criteria

Table 2 condenses the above drivers into a tentative set of criteria for assessing and enhancing a publication’s impact potential. While this list is not exhaustive, it serves as a point of departure for a more robust evaluation scheme similar to those used to evaluate the trustworthiness of empirical research.

These criteria relate to the identified drivers of enhanced impact for an academic publication. The key driver is change potential, referring to the relevance, magnitude, and breadth of change that the research is likely to trigger. Authors can use the checklist in Table 2 to guide their choice of research topic and to argue for the impact potential of their research. Similarly, reviewers can look for these indicators when evaluating a manuscript. As the interest value of any research is audience-relative (Shugan, 2003), an article should clearly identify its intended audience and the scholarly, business, or societal discussion to which it contributes. The article’s arguments and claims can then be evaluated in relation to existing knowledge within that discussion or domain. To signal high impact potential, a publication should establish convincingly the need for significant changes of practice for a wide audience. The evaluator has some discretion in assessing how these criteria are to be applied; in some cases, the research may be considered impactful when the topic is highly relevant for a small number of key stakeholders. On the other hand, influencing a higher number of stakeholders may not be considered impactful if the change potential relates to a matter of little relevance (cf. Kohli & Haenlein, 2021).

The proposed set of criteria also addresses a publication’s accessibility. We contend that even relevant research may fail to achieve sufficient impact if it does not communicate its message in a simple, clear, and actionable manner. Spelling out the study’s key findings and implications as simply as possible makes the takeaways more accessible for the reader. Authors can convey their key message using hip-pocket takeaways or power expressions. The former is a metaphor for ultimate parsimony; by capturing and condensing the essence of a theory or framework in a conceptual or graphical space that is sufficiently compact to be transported in a metaphorical “hip pocket,” authors make it easier for the reader to understand and adopt. Perhaps the ultimate hip-pocket takeaway is “E = mc²”; examples from marketing include models that capture the theoretical narrative of service-dominant logic in one simple figure (see Fig. 1 in Vargo & Lusch, 2016) or conceptualize the nomological network of market orientation (see Fig. 1 in Kohli & Jaworski, 1990).

“Power expressions” are sentences that crystallize the nature, impact, or relation between constructs, types, categories, or processes, as well as any key finding or argument, by catching the reader’s eye and making it easy to grasp the article’s main points. Many highly cited articles have achieved that status by virtue of one or two especially powerful, usable, and understandable expressions. For example, Brodie et al. (2011) presented five propositions to characterize customer engagement, one of which so powerfully captured the concept’s emergence and nature that it subsequently became the most frequently cited definition of engagement. However, power expressions should not be regarded as a cheap gimmick to attract citations; rather, they curate and capture a study’s main points in a condensed and simple form. Although authors are best placed to crystallize the essence of their research, this work is too often left to the reader.

Authors and reviewers should also pay attention to clarity, as incoherent storytelling, complex language, and academic jargon create noise that can drown out an article’s message, causing the reader to disengage. In addition to the criteria in Table 2, authors can employ more detailed methods to test the clarity of their writing, such as the Writing Clarity Calculator developed by Warren et al. (2021) based on their study published in the Journal of Marketing (see also http://writingclaritycalculator.com/).
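One classic clarity metric of this kind is the Flesch (1948) reading-ease score, which rewards short sentences and short words. As a rough illustration only (this is not the Warren et al. calculator), a naive version can be sketched in a few lines of Python; the syllable counter is a crude vowel-group heuristic, and all function names are our own:

```python
import re

def count_syllables(word):
    # Crude heuristic: count groups of consecutive vowels (treating "y" as a vowel).
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def flesch_reading_ease(text):
    """Flesch (1948): 206.835 - 1.015*(words/sentences) - 84.6*(syllables/words).
    Higher scores indicate easier reading."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+", text)
    if not sentences or not words:
        return 0.0
    syllables = sum(count_syllables(w) for w in words)
    return (206.835
            - 1.015 * (len(words) / len(sentences))
            - 84.6 * (syllables / len(words)))
```

On this metric, a jargon-heavy sentence such as “Notwithstanding considerable methodological heterogeneity, operationalization remains problematic” scores far below plain prose, which is precisely the kind of signal a clarity check gives authors before submission.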

The set of criteria proposed here also highlights the importance of specifying the implications of a piece of research in an actionable manner. In many cases, the results of the study mark the high point of the article, while the implications are addressed in a perfunctory way, sometimes noting only that firms should pay attention to the studied issue. Similarly, research implications are often formulated only as a list of potential research topics, or confined to issues arising from the study’s limitations. This is a missed opportunity to achieve higher impact. The guidelines in Table 2 invite authors and reviewers to ask what new research the findings might inspire and how they might guide the use or development of methods, measures, literature, nomological networks, and research frameworks. As a further step, authors might try to envisage how their findings could be used by researchers beyond their disciplinary silo. Sufficiently abstracted and parsimonious findings could also be accessible to researchers who are unfamiliar with the domain’s concepts and jargon.

This article represents a first step toward developing a set of criteria for evaluating and enhancing a research publication’s impact potential. This endeavor is especially important for social science disciplines like marketing, where research impact is often indirect and difficult to prove (cf. Muhonen & Tellmann, 2021). Our central argument is that a publication will have high impact potential if it is likely to promote significant change in how important stakeholders understand or behave in relation to a relevant matter and if it offers simple and clear findings that can be translated into actionable implications, making its message accessible. By operationalizing these drivers as a set of criteria, we provide a concrete toolbox for assessing and enhancing a manuscript’s impact potential. At present, measures to enhance an article’s impact potential, or to evaluate it during the review process, tend to exist only as tacit knowledge. During the review process, the authors bear the burden of proof and must present compelling arguments in support of their manuscript’s potential to impact external audiences (Shugan, 2003). Equally, while journal reviewers may be able to judge a manuscript’s methodological or conceptual robustness, its future impact is often difficult to evaluate prior to publication. To that extent, the proposed criteria can be of value to authors, reviewers, and research supervisors.

Importantly, the present article does not downplay the importance of rigor but suggests the need for a balancing impact perspective. Sufficient rigor should be considered a hygiene factor for delivering trustworthy results, but not the main selling point of the article. As Yadav (2018) noted, knowledge development approaches in the marketing discipline have become increasingly scripted, constituting a straitjacket that may hamper impactful research. Authors should therefore explain why their chosen methodology is adequate for addressing the target problem and how impact would be undermined by a more rigid approach. Authors should also argue against the narrow view that rigor relates only to the application of sophisticated and complicated quantitative methodologies (cf. Houston, 2019).

In terms of future research effort on the topic of impact, we invite scholars to further develop the proposed criteria. The goal is that in the future, authors will more fully assess and explicate their efforts to improve the impact potential of their work. If marketing academics genuinely believe that impact matters, it should be afforded the same status as methodological rigor in academic publications. We hope that scholars will use, advance, and promote these criteria in helping to establish impact evaluation as an institutionalized practice.

Aguinis, H., Shapiro, D. L., Antonacopoulou, E. P., & Cummings, T. G. (2014). Scholarly impact: A pluralist conceptualization. Academy of Management Learning & Education, 13 (4), 623–639.

Alvesson, M., & Sandberg, J. (2011). Generating research questions through problematization. Academy of Management Review, 36 (2), 247–271.

Antonetti, P., & Maklan, S. (2014). Feelings that make a difference: How guilt and pride convince consumers of the effectiveness of sustainable consumption choices. Journal of Business Ethics, 124 (1), 117–134.

Aspara, J., & Tikkanen, H. (2017). Why do public policy-makers ignore marketing and consumer research? A case study of policy-making for alcohol advertising. Consumption Markets & Culture, 20 (1), 12–34.

Backhaus, K., Lügger, K., & Koch, M. (2011). The structure and evolution of business-to-business marketing: A citation and co-citation analysis. Industrial Marketing Management, 40 (6), 940–951.

Baraldi, E., La Rocca, A., & Perna, A. (2014). Good for science, but which implications for business? An analysis of the managerial implications in high-impact B2B marketing articles published between 2003 and 2012. Journal of Business & Industrial Marketing, 29 (7/8), 574–592.

Barnett, L. (2005). The universe and Dr. Einstein . Dover.

Baumgartner, H., & Pieters, R. (2003). The structural influence of marketing journals: A citation analysis of the discipline and its subareas over time. Journal of Marketing, 67 , 123–139.

Birkinshaw, J., Lecuona, R., & Barwise, P. (2016). The relevance gap in business school research: Which academic papers are cited in managerial bridge journals? Academy of Management Learning & Education, 15 (4), 686–702.

Blocker, C. P., Ruth, J. A., Sridharan, S., Beckwith, C., Ekici, A., Goudie-Hutton, M., & Varman, R. (2013). Understanding poverty and promoting poverty alleviation through transformative consumer research. Journal of Business Research, 66 (8), 1195–1202.

Bolton, R. N. (2020). First steps to creating high impact theory in marketing. AMS Review, 10 (3), 172–178.

Brodie, R. J., Hollebeek, L. D., Jurić, B., & Ilić, A. (2011). Customer engagement: Conceptual domain, fundamental propositions, and implications for research. Journal of Service Research, 14 (3), 252–271.

Brown, S. W., Webster, F. E., Steenkamp, J. B. E., Wilkie, W. L., Sheth, J. N., Sisodia, R. S., & Bauerly, R. J. (2005). Marketing renaissance: Opportunities and imperatives for improving marketing thought, practice, and infrastructure. Journal of Marketing, 69 (4), 1–25.

Clark, T., Key, T. M., Hodis, M., & Rajaratnam, D. (2014). The intellectual ecology of mainstream marketing research: An inquiry into the place of marketing in the family of business disciplines. Journal of the Academy of Marketing Science, 42 (3), 223–241.

Davis, B., & Ozanne, J. L. (2019). Measuring the impact of transformative consumer research: The relational engagement approach as a promising avenue. Journal of Business Research, 100 , 311–318.

Day, G. S. (1992). Marketing’s contribution to the strategy dialogue. Journal of the Academy of Marketing Science, 20 (4), 323–329.

Flesch, R. (1948). A new readability yardstick. Journal of Applied Psychology, 32 (3), 221–233.

Gonsalves, C., Ludwig, S., de Ruyter, K., & Humphreys, A. (2021). Writing for impact in service research. Journal of Service Research, 24 (4), 480–499.

Gunning, R. (1952). The technique of clear writing . Macmillan.

Hauser, J. R. (2017). Phenomena, theory, application, data, and methods all have impact. Journal of the Academy of Marketing Science, 45 (1), 7–9.

Hoffman, D. L., & Holbrook, M. B. (1993). The intellectual structure of consumer research: A bibliometric study of author cocitations in the first 15 years of the Journal of Consumer Research. Journal of Consumer Research, 19 (4), 505–517.

Houston, M. (2019). Four facets of rigor. Journal of the Academy of Marketing Science, 47 , 570–573.

Key, T. M., Clark, T., Ferrell, O. C., Stewart, D. W., & Pitt, L. (2020). Marketing’s theoretical and conceptual value proposition: Opportunities to address marketing’s influence. AMS Review, 10 (3), 151–167.

Kohli, A. K., & Haenlein, M. (2021). Factors affecting the study of important marketing issues: Implications and recommendations. International Journal of Research in Marketing, 38 (1), 1–11.

Kohli, A. K., & Jaworski, B. J. (1990). Market orientation: The construct, research propositions, and managerial implications. Journal of Marketing, 54 (2), 1–18.

Krasnikov, A., & Jayachandran, S. (2008). The relative impact of marketing, research-and-development, and operations capabilities on firm performance. Journal of Marketing, 72 (4), 1–11.

Lange, D., & Pfarrer, M. D. (2017). Editors’ comments: Sense and structure—The core building blocks of an AMR article. Academy of Management Review, 42 (3), 407–416.

Leeflang, P. S., & Wittink, D. R. (2000). Building models for marketing decisions: Past, present and future. International Journal of Research in Marketing, 17 (2–3), 105–126.

Lehmann, D. R., McAlister, L., & Staelin, R. (2011). Sophistication in research in marketing. Journal of Marketing, 75 (4), 155–165.

Li, L. P., Fehrer, J. A., Brodie, R. J., & Juric, B. (2021). Trajectories of influential conceptual articles in service research. Journal of Service Management, 32 (5), 645–672.

Lilien, G. L., Roberts, J. H., & Shankar, V. (2013). Effective marketing science applications: Insights from the ISMS-MSI practice prize finalist papers and projects. Marketing Science, 32 (2), 229–245.

Lindgreen, A., Di Benedetto, C. A., Brodie, R. J., Fehrer, J., & Van der Borgh, M. (2020). How to get great research cited. Industrial Marketing Management, 89 , A1–A7.

MacInnis, D. J. (2011). A framework for conceptual contributions in marketing. Journal of Marketing, 75 (4), 136–154.

MacInnis, D. J., Morwitz, V. G., Botti, S., Hoffman, D. L., Kozinets, R. V., Lehmann, D., & Pechmann, C. (2020). Creating boundary-breaking, marketing-relevant consumer research. Journal of Marketing, 84 (2), 1–23.

Makadok, R., Burton, R., & Barney, J. (2018). A practical guide for making theory contributions in strategic management. Strategic Management Journal, 39 , 1530–1545.

Moorman, C., van Heerde, H. J., Moreau, C. P., & Palmatier, R. W. (2018). JM as a marketplace of ideas. Journal of Marketing, 83 (1), 1–7.

Moorman, C., van Heerde, H. J., Moreau, C. P., & Palmatier, R. W. (2019). Challenging the boundaries of marketing. Journal of Marketing, 83 (5), 1–4.

Morton, S. (2015). Progressing research impact assessment: A ‘contributions’ approach. Research Evaluation, 24 (4), 405–419.

Muhonen, R., & Tellmann, S. (2021). Challenges of reporting societal impacts for research evaluation purposes—case of sociology. In T.C.E. Engels & E. Kulczycki (Eds.), Handbook on research assessment in the social sciences. Edward Elgar Publishing.

Nath, P., & Mahajan, V. (2011). Marketing in the C-suite: A study of chief marketing officer power in firms’ top management teams. Journal of Marketing, 75 (1), 60–77.

Nelson, M. R., Ham, C. D., & Ahn, R. (2017). Knowledge flows between advertising and other disciplines: A social exchange perspective. Journal of Advertising, 46 (2), 309–332.

Penfield, T., Baker, M. J., Scoble, R., & Wykes, M. C. (2014). Assessment, evaluations, and definitions of research impact: A review. Research Evaluation, 23 (1), 21–32.

Reale, E., Avramov, D., Canhial, K., Donovan, C., Flecha, R., Holm, P., & Van Horik, R. (2018). A review of literature on evaluating the scientific, social and political impact of social sciences and humanities research. Research Evaluation, 27 (4), 298–308.

Reibstein, D. J., Day, G., & Wind, J. (2009). Guest editorial: Is marketing academia losing its way? Journal of Marketing, 73 (4), 1–3.

Roberts, J. H., Kayande, U., & Stremersch, S. (2014). From academic research to marketing practice: Exploring the marketing science value chain. International Journal of Research in Marketing, 31 (2), 127–140.

Sandberg, J., & Alvesson, M. (2011). Ways of constructing research questions: Gap-spotting or problematization? Organization, 18 (1), 23–44.

Shugan, S. M. (2003). Defining interesting research problems. Marketing Science, 22 (1), 1–15.

Smith, D. C. (2003). The importance and challenge of being interesting. Journal of the Academy of Marketing Science, 31 (3), 319–322.

Srinivasan, S., & Hanssens, D. M. (2009). Marketing and firm value: Metrics, methods, findings, and future directions. Journal of Marketing Research, 46 (3), 293–312.

Sternberg, R. J., & Gordeeva, T. (1996). The anatomy of impact: What makes an article influential? Psychological Science, 7 (2), 69–75.

Stremersch, S. (2021). Commentary on Kohli & Haenlein: The study of important marketing issues: Reflections. International Journal of Research in Marketing, 38 (1), 12–17.

Stremersch, S., Verniers, I., & Verhoef, P. C. (2007). The quest for citations: Drivers of article impact. Journal of Marketing, 71 (3), 171–193.

Tellis, G. J. (2017). Interesting and impactful research: On phenomena, theory, and writing. Journal of the Academy of Marketing Science, 45 (1), 1–6.

Vargo, S. L., & Koskela-Huotari, K. (2020). Advancing conceptual-only articles in marketing. AMS Review, 10 (1–2), 1–5.

Vargo, S. L., & Lusch, R. F. (2016). Institutions and axioms: An extension and update of service-dominant logic. Journal of the Academy of Marketing Science, 44 (1), 5–23.

Warren, N. L., Farmer, M., Gu, T., & Warren, C. (2021). Marketing ideas: How to write research articles that readers understand and cite. Journal of Marketing, 85 (5), 42–57.

Yadav, M. S. (2018). Making emerging phenomena a research priority. Journal of the Academy of Marketing Science, 46 (3), 361–365.

Yadav, M. S. (2010). The decline of conceptual articles and implications for knowledge development. Journal of Marketing, 74 (1), 1–19.

Zeithaml, V. A., Jaworski, B. J., Kohli, A. K., Tuli, K. R., Ulaga, W., & Zaltman, G. (2020). A theories-in-use approach to building marketing theory. Journal of Marketing, 84 (1), 32–51.

Open Access funding provided by University of Turku (UTU) including Turku University Central Hospital.

Author information

Authors and affiliations

Turku School of Economics, University of Turku, Turku, Finland

Elina Jaakkola

University of Hawai’i, Honolulu, USA

Stephen L. Vargo

Corresponding author

Correspondence to Elina Jaakkola .

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .

About this article

Jaakkola, E., Vargo, S.L. Assessing and enhancing the impact potential of marketing articles. AMS Rev 11, 407–415 (2021). https://doi.org/10.1007/s13162-021-00219-7

Received : 04 October 2021

Accepted : 14 November 2021

Published : 02 December 2021

Issue Date : December 2021

DOI : https://doi.org/10.1007/s13162-021-00219-7

  • Open access
  • Published: 19 September 2024

Machine learning in business and finance: a literature review and research opportunities

  • Hanyao Gao 1,
  • Gang Kou 2,
  • Haiming Liang 1,
  • Hengjie Zhang 3,
  • Xiangrui Chao 1,
  • Cong-Cong Li 5 &
  • Yucheng Dong 1,4

Financial Innovation volume 10, Article number: 86 (2024)

This study provides a comprehensive review of machine learning (ML) applications in the fields of business and finance. First, it introduces the most commonly used ML techniques and explores their diverse applications in marketing, stock analysis, demand forecasting, and energy marketing. In particular, this review critically analyzes over 100 articles and reveals a strong inclination toward deep learning techniques, such as deep neural, convolutional neural, and recurrent neural networks, which have garnered immense popularity in financial contexts owing to their remarkable performance. This review shows that ML techniques, particularly deep learning, demonstrate substantial potential for enhancing business decision-making processes and achieving more accurate and efficient predictions of financial outcomes. In particular, ML techniques exhibit promising research prospects in cryptocurrencies, financial crime detection, and marketing, underscoring the extensive opportunities in these areas. However, some limitations regarding ML applications in the business and finance domains remain, including issues related to linguistic information processes, interpretability, data quality, generalization, and the oversights related to social networks and causal relationships. Thus, addressing these challenges is a promising avenue for future research.

Introduction

The rapid development of information and database technologies, coupled with notable progress in data analysis methods and computer hardware, has led to an exponential increase in the application of ML techniques in various areas, including business and finance (Ghoddusi et al. 2019 ; Gogas and Papadimitriou 2021 ; Chen et al. 2022 ; Hoang and Wiegratz 2022 ; Nazareth and Ramana 2023 ; Ozbayoglu et al. 2020 ; Xiao and Ke 2021 ). The progress in ML techniques in business and finance applications, such as marketing, e-commerce, and energy, has been highly successful, yielding promising results (Athey and Imbens 2019 ). Compared to traditional econometric models, ML techniques can more effectively handle large amounts of structured and unstructured data, enabling rapid decision-making and forecasting. These benefits stem from ML techniques’ ability to avoid making specific assumptions about the functional form, parameter distribution, or variable interactions and instead focus on making accurate predictions about the dependent variables based on other variables.

Exploring scientific databases, such as the Thomson Reuters Web of Science, reveals a significant exponential increase in the utilization of ML in business and finance. Figure 1 illustrates the outcomes of an inquiry into fundamental ML applications in emerging business and financial domains over the past few decades. Numerous studies in this field have applied ML techniques to resolve business and financial problems. Table 1 lists some of their applications. Boughanmi and Ansari (2021) developed a multimodal ML framework that integrates different types of non-parametric data to accommodate diverse effects. Additionally, they combined multimedia data in creative product settings and applied their model to predict the success of musical albums and playlists. Zhu et al. (2021) asserted that accurate demand forecasting is critical for supply chain efficiency, especially for the pharmaceutical supply chain, owing to its unique characteristics. However, a lack of sufficient data has prevented forecasters from pursuing advanced models. Accordingly, they proposed a demand forecasting framework that “borrows” time-series data from many other products and trains the data with advanced ML models. Yan and Ouyang (2018) proposed a time-series prediction model that combines wavelet analysis with a long short-term memory neural network to capture the complex features of financial time series and showed that the combined network achieved better predictive performance. Zhang et al. (2020a, b) employed a Bayesian learning model with a rich dataset to analyze the decision-making behavior of taxi drivers in a large Asian city to understand the key factors that drive the supply side of urban mobility markets.

Figure 1: Trend of articles on applied ML techniques in business and finance (2007–2021)

Several review papers have explored the potential of ML to enhance various domains, including agriculture (Raj et al. 2015; Coble et al. 2018; Kamilaris and Prenafeta-Boldu 2018; Storm et al. 2020), economic analysis (Einav and Levin 2014; Bajari et al. 2015; Grimmer 2015; Nguyen et al. 2020; Nosratabadi et al. 2020), and financial crisis prediction (Lin et al. 2012; Canhoto 2021; Dastile et al. 2020; Nanduri et al. 2020). Kou et al. (2019) conducted a survey encompassing research and methodologies related to the assessment and measurement of financial systemic risk that incorporated various ML techniques, including big data analysis, network analysis, and sentiment analysis. Meng and Khushi (2019) reviewed articles that focused on stock/forex prediction or trading, where reinforcement learning served as the primary ML method. Similarly, Nti et al. (2020) reviewed approximately 122 pertinent studies published in academic journals over an 11-year span, concentrating on the application of ML to stock market prediction.

Despite these valuable contributions, it is worth noting that the existing review papers primarily concentrate on specific issues within the realm of business and finance, such as the financial system and stock market. Consequently, although a substantial body of research exists in this area, a comprehensive and systematic review of the extensive applications of ML in various aspects of business and finance is lacking. In addition, existing review articles do not provide a comprehensive review of common ML techniques utilized in business and finance. To bridge the aforementioned gaps in the literature, we aim to provide an all-encompassing and methodological review of the extensive spectrum of ML applications in the business and finance domains. To begin with, we identify the most commonly utilized ML techniques in the business and finance domains. Then we introduce the fundamental ML concepts and frequently employed techniques and algorithms. Next, we systematically examine the extensive applications of ML in various sub-domains within business and finance, including marketing, stock markets, e-commerce, cryptocurrency, finance, accounting, credit risk management, and energy. We critically analyze the existing research that explores the implementation of ML techniques in business and finance to offer valuable insights to researchers, practitioners, and decision-makers, thereby facilitating better-informed decision-making and driving future research directions in this field.

The remainder of this paper is organized as follows. Section “Keywords, distribution of articles, and common technologies in the application of ML techniques in business and finance” outlines the literature retrieval process and presents the statistical findings from the literature analysis, including an analysis of common application trends and ML techniques. Section “Machine learning: a brief introduction” introduces fundamental concepts and terminology related to ML. Sections “Supervised learning” and “Unsupervised learning” explore in-depth common supervised and unsupervised learning techniques, respectively. Section “Applications of machine learning techniques in business and finance” discusses the most recent applications of ML in business and finance. Section “Critical discussions and future research directions” discusses some limitations of ML in this domain and analyzes future research opportunities. Finally, the “Conclusions” section concludes.

Keywords, distribution of articles, and common technologies in the application of ML techniques in business and finance

The primary focus of this review is to explore the advancements in ML in business- and finance-related fields involving ML applications in various market-related issues, including prices, investments, and customer behaviors. This review employs the following strategies to identify existing literature. Initially, we identify relevant journals known for publishing papers that utilize ML techniques to address business and finance problems, such as the UTD-24. Table 2 lists the keywords used in the literature search. During the search process, we input various combinations of ML keywords and business/finance keywords, such as “support vector machine” and “marketing.” By cross-referencing the selected journals and keywords and thoroughly examining the citations of highly cited papers, we aimed to achieve a comprehensive and unbiased representation of the current literature.

After identifying journals and keywords, we searched for articles in the Thomson Reuters Web of Science and Elsevier Scopus databases using the same set of keywords. Once the collection phase was complete, the filtering process was initiated. Initially, duplicate articles were excluded to ensure that only unique articles remained for further analysis. Subsequently, we carefully reviewed the full text of each article to eliminate irrelevant or inappropriate items and thus ensure that the final selection comprised relevant and meaningful literature.
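The deduplication step described above is mechanical enough to script. The following minimal Python sketch is illustrative only (the field names "doi" and "title" are our own assumptions about the record format, not the authors' actual pipeline); it keys each article on its DOI when available and otherwise on a normalized title:

```python
def normalize_title(title):
    # Lowercase and strip non-alphanumerics so trivially different titles match.
    return "".join(ch for ch in title.lower() if ch.isalnum())

def deduplicate(records):
    """Keep the first occurrence of each article, keyed by DOI or normalized title."""
    seen, unique = set(), []
    for rec in records:
        key = rec.get("doi") or normalize_title(rec["title"])
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique
```

Two records with the same DOI, or with titles differing only in case and punctuation, collapse to one entry; the subsequent full-text relevance screening still has to be done by hand.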

Figure 2 illustrates the process of article selection for the review. In the identification phase, we retrieved 154 articles from the search and identified an additional 37 articles through reference checking. During the second phase, duplicates and inappropriate articles were filtered out, resulting in a total of 68 articles eligible for inclusion in this study. Based on the review of these articles, we categorized them into seven different applications: stock market, marketing, e-commerce, energy marketing, cryptocurrency, accounting, and credit risk management, as depicted in Fig. 3 and Tables 3, 4, 5, 6, 7, 8 and 9. Statistical analyses reveal that ML research in the business and finance domain is predominantly concentrated in the areas of stock market and marketing. The research on e-commerce, cryptocurrency, and energy market applications is nearly equivalent in quantity. Conversely, articles focusing on accounting and credit risk management applications are relatively limited. Figure 4 summarizes the ML techniques employed in the reviewed articles. Deep learning, support vector machines, and decision trees emerged as the most prominent techniques, whereas the application of unsupervised and reinforcement learning methods, such as k-means, was less common.

Figure 2. Flow diagram for article identification and filtering

Figure 3. Number of papers employing ML techniques

Figure 4. Prominent methods applied in the business and finance domains

Machine learning: a brief introduction

This section introduces the basic concepts of ML, including its goals and terminology. We then describe how models are selected and how their performance can be improved.

Goals and terminology

The key objective in various scientific disciplines is to model the relationships between multiple explanatory variables and a set of dependent variables. When a theoretical mathematical model is established, researchers can use it to predict or control desired variables. However, in real-world scenarios, the underlying model is often too complex to be formulated as a closed-form input–output relationship. This complexity has led researchers in the field of ML to focus on developing algorithms (Wu et al. 2008 ; Chao et al. 2018 ). The primary goal of these algorithms is to predict certain variables based on other variables or to classify units using limited information; for example, they can be used to classify handwritten digits based on pixel values. ML techniques can automatically construct computational models that capture the intricate relationships present in available data by maximizing the problem-dependent performance criterion or minimizing the error term, which allows them to establish a robust representation of the underlying relationships.

In the context of ML, the sample used to estimate the parameters is usually referred to as a “training sample,” and the procedure for estimating the parameters is known as “training.” Let N be the sample size, k the number of features, and q the number of all possible outcomes. ML can be classified into two main types: supervised and unsupervised. In supervised learning problems, we know both the features \({\mathbf{X}}_{i} = (x_{i1} ,...,x_{ik} ),\; \, i = 1,2,...,N\) and the outcomes \(Y_{i} = (y_{i1} ,y_{i2} ,...,y_{iq} )\) , where \(y_{ij}\) represents the outcome of \(y_{i}\) in dimension \(j\) . For example, in a recommendation system, the quality of a product can be scored from 1 to 5, so \(q\) equals 5. In unsupervised learning problems, we only observe the features \({\mathbf{X}}_{i}\) (input data) and aim to group them into clusters based on their similarities or patterns.

Cross-validation, overfitting, and regularization

Cross-validation is frequently used for model selection in ML: the technique is applied to each candidate model, and the model with the lowest expected out-of-sample prediction error is selected.
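As a concrete sketch of this selection procedure (our illustration, not drawn from the reviewed papers), k-fold cross-validation below compares polynomial fits of different degrees in NumPy and keeps the degree with the lowest average out-of-sample squared error:

```python
import numpy as np

# Illustrative data: the true relationship is linear with modest noise.
rng = np.random.default_rng(0)
x = rng.uniform(-2, 2, 60)
y = 1.0 + 2.0 * x + rng.normal(0.0, 0.3, 60)

def cv_error(degree, k=5):
    """k-fold cross-validation: fit a polynomial of the given degree on k-1
    folds, measure squared error on the held-out fold, and average."""
    idx = np.arange(len(x))
    errs = []
    for fold in np.array_split(idx, k):
        train = np.setdiff1d(idx, fold)
        coef = np.polyfit(x[train], y[train], degree)
        errs.append(np.mean((y[fold] - np.polyval(coef, x[fold])) ** 2))
    return float(np.mean(errs))

errors = {d: cv_error(d) for d in (1, 5, 12)}
best = min(errors, key=errors.get)  # model with the lowest out-of-sample error
```

Because the data are generated from a linear model, the high-degree polynomial typically shows a much larger cross-validated error, illustrating overfitting.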

The ML literature shows significantly higher concern about overfitting than the standard statistics or econometrics literature. In the ML community, the degrees of freedom are not explicitly considered, and many ML methods involve a large number of parameters, which can potentially lead to negative degrees of freedom.

In ML, limiting overfitting is commonly achieved through regularization, which controls the complexity of a model. As stated by Vapnik ( 2013 ), regularization theory was one of the first signs of intelligent inference. The complexity of a model describes its ability to approximate various functions: as complexity increases, the risk of overfitting also increases, whereas less complex, more heavily regularized models may underfit. In traditional statistical modeling, overfitting is often limited implicitly, by selecting a parsimonious number of variables and specific functional forms. In ML, instead of optimizing the objective function alone, a regularization term that penalizes the complexity of the model is added to it. This approach encourages the model to generalize better and avoids overfitting by promoting simpler and more interpretable solutions.

Here, we provide an example to illustrate how regularization works. Consider the following linear regression model:

\(y_{ij} = \sum\nolimits_{p = 1}^{k} {b_{pj} x_{ip} } + \sigma_{j} ,\quad i = 1,2,...,N,\;j = 1,2,...,q\) (1)

where N is the sample size, k is the numbers of features, and q is the number of all possible outcomes. The variable \(y_{{ij}} (i = 1,2,...,N,\quad j = 1,2,...,q)\) represents the outcome of \(y_{i}\) in the j th dimension. Additionally, \(b_{pj} (p = 1,2,...,k,j = 1,2,...,q)\) represents the coefficient of feature p in the j th dimension. By using vector notations, \({{\varvec{\upsigma}}} = (\sigma_{1} ,...,\sigma_{q} )^{{ \top }}\) , \({\mathbf{b}} = (b_{{11}} ,b_{{21}} ,...,b_{{k1}} ,b_{{12}} ,b_{{22}} ,...,b_{{k2}} ,...,b_{{1q}} ,b_{{2q}} ,...,b_{{kq}} )^{{ \top }}\) and \(Y_{i} = (y_{i1} ,y_{i2} ,...,y_{iq} )\) , we can rewrite Eq. ( 1 ) as follows:

where \({\mathbf{b}}\) is the solution of the regularized problem

\(\mathop {\min }\limits_{{\mathbf{b}}} \sum\nolimits_{i = 1}^{N} {\sum\nolimits_{j = 1}^{q} {\left( {y_{ij} - \sum\nolimits_{p = 1}^{k} {b_{pj} x_{ip} } } \right)^{2} } } + \lambda \left\| {\mathbf{b}} \right\|\)

\(\lambda\) is a penalty parameter that can be selected through out-of-sample cross-validation to optimize the model’s out-of-sample predictive performance.

Supervised learning

This section introduces common supervised learning technologies. Compared to traditional statistics, supervised learning methods exhibit certain desired properties when optimizing predictions in large datasets, such as transaction and financial time series data. In business and finance, supervised learning models have proven to be among the most effective tools for detecting credit card fraud (Lebichot et al. 2021 ). In the following subsections, we briefly describe the commonly used supervised ML methods for business and finance.

Shrinkage methods

The traditional least-squares method often yields complex models with an excessive number of explanatory variables. In particular, when the number of features, k , is large compared to the sample size N , the least-squares estimator, \({\hat{\mathbf{b}}}\) , does not have good predictive properties, even if the conditional mean of the outcome is linear. To address this problem, regularization is typically used to adjust the estimation parameters dynamically and reduce the complexity of the model. The shrinkage method is the most common regularization method and can reduce the values of the parameters to be estimated. Shrinkage methods, such as ridge regression (Hoerl and Kennard 1970 ) and least absolute shrinkage and selection operator (LASSO) (Tibshirani 1996 ), are linear regression models that add a penalty term to the size of the coefficients. This penalty term pushes the coefficients towards zero, effectively shrinking their values. Shrinkage methods can be effectively used to predict continuous outcomes or classification tasks, particularly when dealing with datasets containing numerous explanatory variables.

Compared to the traditional approach that estimates the regression function by least squares,

\(\mathop {\min }\limits_{{\mathbf{b}}} \sum\nolimits_{i = 1}^{N} {\left( {Y_{i} - {\mathbf{X}}_{i} {\mathbf{b}}} \right)^{2} }\)

shrinkage methods add a penalty term that shrinks \({\mathbf{b}}\) toward zero, aiming to minimize the following objective function:

\(\mathop {\min }\limits_{{\mathbf{b}}} \sum\nolimits_{i = 1}^{N} {\left( {Y_{i} - {\mathbf{X}}_{i} {\mathbf{b}}} \right)^{2} } + \lambda \left\| {\mathbf{b}} \right\|_{q}\)

where \(\left\| {\mathbf{b}} \right\|_{q} = \sum\nolimits_{i = 1}^{k} {\left| {b_{i} } \right|^{q} }\) . When \(q = 1\) , this formulation yields the LASSO; when \(q = 2\) , it reduces to ridge regression.
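A minimal sketch of shrinkage in the ridge (\(q = 2\)) case, which has a closed form: the penalty pulls the estimated coefficients toward zero (the data and variable names here are illustrative, not from the reviewed articles):

```python
import numpy as np

rng = np.random.default_rng(1)
N, k = 100, 10
X = rng.normal(size=(N, k))
b_true = np.array([3.0, -2.0] + [0.0] * (k - 2))   # only two features matter
y = X @ b_true + rng.normal(0.0, 0.5, N)

def ridge(X, y, lam):
    """Closed-form ridge estimate: argmin_b ||y - Xb||^2 + lam * ||b||_2^2."""
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

b_ols = ridge(X, y, 0.0)    # lam = 0 recovers ordinary least squares
b_reg = ridge(X, y, 50.0)   # a larger penalty shrinks coefficients toward zero
```

For any positive penalty, the norm of the ridge estimate is smaller than that of the least-squares estimate, which is the shrinkage effect described above.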

Tree-based method

Regression trees (Breiman et al. 1984 ) and random forests (Breiman 2001 ) are effective methods for estimating regression functions with minimal tuning, especially when out-of-sample predictive ability is required. Considering a sample \((x_{i1} ,...,x_{ik} ,Y_{i} )\) for \(i = 1,2,...,N\) , the idea of a regression tree is to split the sample into subsamples, within which the regression function is estimated. The splitting process is sequential and based on whether a feature value \(x_{ij}\) exceeds a threshold \(c\) . Let \(R_{1} (j,c)\) and \(R_{2} (j,c)\) be two sets based on the feature \(j\) and threshold \(c\) , where \(R_{1} (j,c) = \left\{ {{\mathbf{X}}_{i} |x_{ij} \le c} \right\}\) and \(R_{2} (j,c) = \left\{ {{\mathbf{X}}_{i} |x_{ij} > c} \right\}\) . Thus, the dataset \(R\) is divided into two parts, \(R_{1}\) and \(R_{2}\) , based on the chosen feature and threshold.

Let \(c_{1} = \frac{1}{{|R_{1} |}}\sum\nolimits_{{{\mathbf{X}}_{i} \in R_{1} }} {Y_{i} }\) and \(c_{2} = \frac{1}{{|R_{2} |}}\sum\nolimits_{{{\mathbf{X}}_{i} \in R_{2} }} {Y_{i} }\) , where \(| \bullet |\) refers to the cardinality of a set. We can then construct the following optimization model to calculate the errors of the \(R_{1}\) and \(R_{2}\) datasets:

\(\mathop {\min }\limits_{j,c} \left[ {\sum\nolimits_{{{\mathbf{X}}_{i} \in R_{1} (j,c)}} {(Y_{i} - c_{1} )^{2} } + \sum\nolimits_{{{\mathbf{X}}_{i} \in R_{2} (j,c)}} {(Y_{i} - c_{2} )^{2} } } \right]\)

For all \(x_{ij}\) and threshold \(c \in ( - \infty , + \infty )\) , the method finds the optimal feature \(j^{*}\) and threshold \(c^{*}\) that minimizes errors and splits the sample into subsets based on these criteria. By selecting the best feature and threshold, the method obtains the optimal classification of \(R_{1}^{*}\) and \(R_{2}^{*}\) . This process is repeated recursively, leading to further splits that minimize the squared error and improve the overall model performance. However, researchers should be cautious about overfitting, wherein the model fits the training data too closely and fails to generalize well to new data. To address this issue, a penalty term can be added to the objective function to encourage simpler and more regularized models. The coefficients of the model are then selected through cross-validation, optimizing the penalty parameter to achieve the best trade-off between model complexity and predictive performance on new, unseen data. This helps prevent overfitting and ensures that the model's performance is robust and reliable.

Random forests build on the tree algorithm to better estimate the regression function. The approach smooths the regression function by averaging across multiple trees and differs from a single tree in two ways. First, instead of using the original sample, each tree is constructed from a bootstrap sample or a subsample of the data, a technique known as “bagging.” Second, at each stage of building a tree, the splits are not optimized over all possible features (covariates) but rather over a random subset of the features. Consequently, the feature selection varies across splits, which enhances the diversity of the individual trees.
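The single-split search described above can be sketched directly; the example below (an illustration under our own toy data, not the authors' implementation) scans every feature \(j\) and threshold \(c\) for the pair minimizing the summed squared error around the two sub-sample means:

```python
import numpy as np

rng = np.random.default_rng(2)
N, k = 200, 3
X = rng.uniform(0, 1, size=(N, k))
# True structure: the outcome jumps when feature 1 exceeds 0.5.
Y = np.where(X[:, 1] > 0.5, 4.0, 1.0) + rng.normal(0.0, 0.1, N)

def best_split(X, Y):
    """Search all features j and thresholds c for the split minimizing the
    summed squared error of the two resulting sub-sample means."""
    best = (None, None, np.inf)
    for j in range(X.shape[1]):
        for c in np.unique(X[:, j])[:-1]:   # drop the max so both sides are nonempty
            left, right = Y[X[:, j] <= c], Y[X[:, j] > c]
            err = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
            if err < best[2]:
                best = (j, c, err)
    return best

j_star, c_star, _ = best_split(X, Y)  # recovers feature 1 and a threshold near 0.5
```

A full regression tree applies this search recursively within each resulting subsample.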

Deep learning and neural networks

Deep learning and neural networks have been proven to be highly effective in complex settings. However, it is worth noting that the practical implementation of deep learning often demands a considerable amount of tuning compared to other methods, such as decision trees or random forests.

Deep neural networks

As with other supervised learning methods, deep neural networks (DNNs) can be viewed as a straightforward mapping \(y=f(x;\theta )\) from the input feature vector \(x\) to the output vector or scalar \(y\) , which is governed by the unknown parameters \(\theta\) . This mapping typically consists of layers that form chain-like structures. Figure 5 illustrates the structure of the DNN. For a DNN with \(n\) layers, the structure can be represented as

\(f(x) = f^{(n)} (f^{(n - 1)} ( \cdots f^{(1)} (x)))\)

Figure 5. Structure of DNN

In a fully connected DNN, the \(i\) th layer has a structure given by \(h^{(i)} = f^{(i)} (x) = g^{(i)} ({\mathbf{W}}^{(i)} h^{(i - 1)} + {\mathbf{b}}^{(i)} )\) , where \({\mathbf{W}}^{(i)}\) is a matrix of unknown parameters and \({\mathbf{b}}^{\left( i \right)}\) is a vector of bias factors. A typical choice for \(g^{\left( i \right)}\) , called the “activation function,” is a rectified linear unit, a tanh transformation, or a sigmoid function. The 0th layer is \(h^{(0)} = x\) , the input vector. The row dimension of \({\mathbf{b}}^{(i)}\) , or equivalently the row dimension of \({\mathbf{W}}^{(i)}\) , specifies the number of neurons in layer \(i\) . The weight matrices \({\mathbf{W}}^{(i)}\) are learned by minimizing a loss function, which can be the mean squared error for regression tasks or the cross-entropy for classification tasks. In particular, when the DNN has a single layer and \(y\) is a scalar, setting the activation function to linear or logistic recovers linear or logistic regression, respectively.
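The layerwise mapping \(h^{(i)} = g({\mathbf{W}}^{(i)} h^{(i-1)} + {\mathbf{b}}^{(i)})\) can be sketched in NumPy; the layer widths and random weights below are purely illustrative, and training is omitted:

```python
import numpy as np

def relu(z):
    """Rectified linear unit, a common choice for the activation g."""
    return np.maximum(0.0, z)

def forward(x, params):
    """Fully connected forward pass: h^(0) = x, h^(i) = g(W^(i) h^(i-1) + b^(i))."""
    h = x
    for W, b in params:
        h = relu(W @ h + b)
    return h

rng = np.random.default_rng(3)
dims = [4, 8, 8, 1]  # illustrative widths: 4 inputs -> two hidden layers -> 1 output
params = [(rng.normal(size=(dims[i + 1], dims[i])), np.zeros(dims[i + 1]))
          for i in range(len(dims) - 1)]

y = forward(rng.normal(size=4), params)
```

In practice, the parameters would be fitted by minimizing a loss with gradient-based training rather than drawn at random.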

Convolutional neural networks

Although neural networks have many different architectures, the two most classical and relevant are convolutional neural networks (CNNs) and recurrent neural networks (RNNs). A classical CNN structure, which contains three main components—convolutional, pooling, and fully connected layers—is shown in Fig. 6. In contrast to the previously mentioned fully connected structure, in the convolutional layer, each neuron connects with only a small fraction of the neurons from the previous layer, and the filters share the same parameters across locations. Sparse connections and parameter sharing therefore significantly reduce the number of estimated parameters.

Figure 6. Structure of CNN

Different layers play different roles in the training process and are introduced in more detail as follows:

Convolutional layer : This layer comprises a collection of trained filters that are used to extract features from the input data. Assuming that \(X\) is the input and there are \(k\) filters, the output of the convolutional layer can be formulated as follows:

\(y_{j} = f(\omega_{j} * X + b_{j} ),\quad j = 1,2,...,k\)

where \(\omega_{j}\) and \(b_{j}\) denote the weights and bias, respectively; \(f\) represents the activation function; and \(*\) denotes the convolutional operator.

Pooling layer : This layer reduces the features and parameters of the network. The most popular pooling methods are the maximum and average pooling.

CNNs are designed to handle one-dimensional time-series data or images. Intuitively, each convolutional layer can be considered a set of filters that move across images or shift along time sequences. For example, some filters may learn to detect textures, whereas others may identify specific shapes. Each filter generates a feature map, and the subsequent convolutional layer integrates these features to create a more complex structure, resulting in a map of learned features. Suppose that \(S\) is a \(p \times p\) pooling window. Then the average pooling process can be formulated as

\(a_{S} = \frac{1}{N}\sum\nolimits_{{(i,j) \in S}} {x_{ij} }\)

where \(x_{ij}\) is the activation value at location \((i,j)\) , and \(N\) is the total number of activations in the window \(S\) .
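Average pooling as just described can be sketched in a few lines of NumPy; the window size and input feature map are illustrative:

```python
import numpy as np

def average_pool(x, p):
    """Non-overlapping p x p average pooling: each output cell is the mean of
    the N = p * p activations x_ij inside its window S."""
    H, W = x.shape
    windows = x[:H - H % p, :W - W % p].reshape(H // p, p, W // p, p)
    return windows.mean(axis=(1, 3))

x = np.arange(16, dtype=float).reshape(4, 4)  # illustrative 4x4 feature map
pooled = average_pool(x, 2)                   # 2x2 output of window means
```

Max pooling would replace the mean with the maximum over each window.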

Recurrent neural networks

Recurrent neural networks (RNNs) are well suited for processing sequential data, dynamic relations, and long-term dependencies. RNNs, particularly those employing long short-term memory (LSTM) cells, have become popular and have shown significant potential in natural language processing (Schmidhuber 2015). A key feature of this architecture is its ability to maintain past information over time using a cell-state vector. In each time step, new variables are combined with past information in the cell vector, enabling the RNN to learn how to encode information and determine which encoded information should be retained or forgotten. Similar to CNNs, RNNs benefit from parameter sharing, which allows them to detect specific patterns in sequential data.

Figure 7 illustrates the structure of the LSTM network, which contains a memory unit \({C}_{t}\) , a hidden state \({h}_{t}\) , and three types of gates. The index \(t\) refers to the time step. At each step \(t\) , the LSTM combines the input \({x}_{t}\) with the previous hidden state \({h}_{t-1}\) , calculates the activations of all gates, and updates the memory unit and hidden state accordingly.

Figure 7. Structure of LSTM

The computations of an LSTM network are described as follows:

\(f_{t} = \sigma (W_{f} x_{t} + \omega_{f} h_{t - 1} + b_{f} )\)

\(i_{t} = \sigma (W_{i} x_{t} + \omega_{i} h_{t - 1} + b_{i} )\)

\(O_{t} = \sigma (W_{O} x_{t} + \omega_{O} h_{t - 1} + b_{O} )\)

\(C_{t} = f_{t} \circ C_{t - 1} + i_{t} \circ \tanh (W_{C} x_{t} + \omega_{C} h_{t - 1} + b_{C} )\)

\(h_{t} = O_{t} \circ \tanh (C_{t} )\)

where \(W\) denotes the weights of the inputs and \(\omega\) the weights of the previous hidden states; the subscripts \(f,i,{\text{ and }}O\) refer to the forget, input, and output gate vectors, respectively; \(b\) indicates biases; \(\sigma\) is the sigmoid function; and \(\circ\) is an element-wise multiplication.
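A single LSTM step in the standard formulation can be sketched as follows; the parameter names (`Wf`, `Uf`, etc.) and the random initialization are our own illustration, and training is omitted:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, C_prev, P):
    """One LSTM step: the forget (f), input (i), and output (o) gates combine
    x_t with h_{t-1}; the cell state C_t and hidden state h_t are then updated
    with element-wise (Hadamard) products."""
    f = sigmoid(P["Wf"] @ x_t + P["Uf"] @ h_prev + P["bf"])
    i = sigmoid(P["Wi"] @ x_t + P["Ui"] @ h_prev + P["bi"])
    o = sigmoid(P["Wo"] @ x_t + P["Uo"] @ h_prev + P["bo"])
    C_tilde = np.tanh(P["Wc"] @ x_t + P["Uc"] @ h_prev + P["bc"])
    C_t = f * C_prev + i * C_tilde
    h_t = o * np.tanh(C_t)
    return h_t, C_t

rng = np.random.default_rng(4)
n_in, n_h = 3, 5  # illustrative sizes
P = {}
for g in "fioc":
    P["W" + g] = rng.normal(scale=0.1, size=(n_h, n_in))  # input weights
    P["U" + g] = rng.normal(scale=0.1, size=(n_h, n_h))   # hidden-state weights
    P["b" + g] = np.zeros(n_h)                            # biases

h, C = lstm_step(rng.normal(size=n_in), np.zeros(n_h), np.zeros(n_h), P)
```

Processing a sequence amounts to calling `lstm_step` once per time step, carrying `h` and `C` forward.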

Wavelet neural networks

Wavelet neural networks (Zhang and Benveniste 1992 ) use the wavelet function as the activation function, thus combining the advantages of both the wavelet transform and neural networks. The structure of wavelet neural networks is based on backpropagation neural networks, and the transfer function of the hidden layer neuron is the mother wavelet function. For input features \({\mathbf{x}} = (x_{1} ,...,x_{n} )\) , the output of the hidden layer can be expressed as follows:

\(h(j) = h_{j} \left( {\frac{{\sum\nolimits_{i = 1}^{n} {\omega_{ij} x_{i} } - b_{j} }}{{a_{j} }}} \right)\)

where \(h(j)\) is the output value for neuron \(j\) , \(h_{j}\) is the mother wavelet function, \(\omega_{ij}\) is the weight between the input and hidden layers, \(b_{j}\) is the shift factor, and \(a_{j}\) is the stretch factor for \(h_{j}\) .

Support vector machine and kernels

Support vector machines (SVMs) are flexible classification methods (Cortes and Vapnik 1995). Consider a binary classification problem with \(N\) observations \({\mathbf{X}}_{i}\) , each with \(k\) features, and a binary label \(y_{i} \in \{ - 1,1\}\) . A hyperplane \(\{ x \in {\mathbb{R}}^{k} \;{\text{s}}{\text{.t}}{\text{.}}\;w^{{ \top }} x + b = 0\}\) is then defined, which induces the binary classifier \({\text{sgn}} (w^{{ \top }} {\mathbf{X}}_{i} + b)\) . The goal of SVM is to find a hyperplane that separates the observations into the two classes, +1 and -1. Among all separating hyperplanes, SVM selects the one that maximizes the distance to the closest samples. Typically, a small set of samples lies at exactly this minimal distance; these are referred to as “support vectors.”

The above-mentioned process can be written as the following optimization model:

\(\mathop {\min }\limits_{\omega ,b} \;\frac{1}{2}\left\| \omega \right\|^{2} \quad {\text{s}}{\text{.t}}{\text{.}}\;\;Y_{i} (\omega^{{ \top }} {\mathbf{X}}_{i} + b) \ge 1,\quad i = 1,2,...,N\)

To solve the above optimization model, we rewrite it in terms of Lagrangian multipliers as follows:

where \(\alpha_{i}\) is the Lagrangian multiplier of the original restriction \(Y_{i} (\omega^{{ \top }} {\mathbf{X}}_{i} + b) \ge 1\) . The model above is equivalent to

We can obtain the Lagrangian multiplier \({{\varvec{\upalpha}}} = (\alpha_{1} ,...,\alpha_{N} )\) from Model ( 15 ), and then \(\widehat{b}\) can be solved from \(\sum\nolimits_{i = 1}^{N} {\hat{\alpha }_{i} (Y_{i} (\omega^{{ \top }} {\mathbf{X}}_{i} + b) - 1)} = 0\) . Furthermore, we can obtain the classifier:

Traditional SVM assumes linearly separable training samples. However, SVM can also deal with non-linear cases by mapping the original covariates to a new feature space using the function \(\phi ({\mathbf{X}}_{i} )\) and then finding the optimal hyperplane in this transformed feature space; that is, \(f(x_{i} ) = \omega^{{ \top }} \phi (x_{i} ) + b\) . Thus, the optimization problem in the transformed feature space can be formulated as

where \(K({\mathbf{X}}_{i} ,{\mathbf{X}}_{j} ) = \phi ({\mathbf{X}}_{i} )^{{ \top }} \phi ({\mathbf{X}}_{j} )\) . The kernel function \(K( \bullet )\) can be linear, polynomial, or sigmoid. Once the kernel function is determined, we can solve for the value of the Lagrangian multiplier \(\alpha\) . Then \(\widehat{b}\) can be solved from \(\sum\nolimits_{i = 1}^{N} {\hat{\alpha }_{i} (Y_{i} (\omega^{{ \top }} {\mathbf{X}}_{i} + b) - 1)} = 0\) , which allows us to derive the classifier:
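The kernel identity \(K({\mathbf{X}}_{i} ,{\mathbf{X}}_{j} ) = \phi ({\mathbf{X}}_{i} )^{{ \top }} \phi ({\mathbf{X}}_{j} )\) can be checked numerically; the sketch below uses the degree-2 polynomial kernel on two-dimensional inputs, for which the explicit feature map \(\phi\) is known (the inputs are illustrative):

```python
import numpy as np

def phi(x):
    """Explicit feature map for the degree-2 polynomial kernel K(x, z) = (x^T z)^2."""
    x1, x2 = x
    return np.array([x1 * x1, np.sqrt(2.0) * x1 * x2, x2 * x2])

def K(x, z):
    """Kernel evaluation: the same inner product without ever building phi."""
    return float(x @ z) ** 2

x, z = np.array([1.0, 2.0]), np.array([3.0, 4.0])
same = np.isclose(phi(x) @ phi(z), K(x, z))  # True: both equal 121.0
```

This is why a kernelized SVM can operate in a high-dimensional (even infinite-dimensional) feature space while only ever computing kernel values between pairs of observations.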

Bayesian classifier

A Bayesian network is a graphical model that represents the probabilistic relationships among a set of features (Friedman et al. 1997 ). The Bayesian network structure is a directed acyclic graph. Formally, a Bayesian network is a pair \(B = \left\langle {G,\Theta } \right\rangle\) , where \(G\) is a directed acyclic graph whose nodes represent the random variables \(\left( {X_{1} ,...,X_{n} } \right)\) and whose edges represent the dependencies between variables, and \(\Theta\) is the set of parameters that quantify the graph.

Assume that there are \(q\) labels, that is, \({\mathbf{Y}} = \{ c_{1} ,...,c_{q} \}\) ; that \(\lambda_{ij}\) is the loss caused by misclassifying a sample with the true label \(c_{j}\) as \(c_{i}\) ; and that \({\mathbb{X}}\) represents the sample space. Then, based on the posterior probability \(P(c_{i} |{\mathbf{x}})\) , we can calculate the expected loss of classifying sample \({\mathbf{x}}\) into the label \(c_{i}\) as follows:

\(R(c_{i} |{\mathbf{x}}) = \sum\nolimits_{j = 1}^{q} {\lambda_{ij} P(c_{j} |{\mathbf{x}})}\)

Therefore, the aim of the Bayesian classifier is to find a criterion \(h:{\mathbb{X}} \to {\mathbf{Y}}\) that minimizes the total risk

\(R(h) = {\mathbb{E}}_{{\mathbf{x}}} \left[ {R(h({\mathbf{x}})|{\mathbf{x}})} \right]\)

Obviously, for each sample \({\mathbf{x}}\) , if \(h\) minimizes the conditional risk \(R(h({\mathbf{x}})|{\mathbf{x}})\) , then the total risk \(R(h)\) is also minimized. This leads to the Bayes decision rule: to minimize the total risk, we classify each sample into the label that minimizes the conditional risk, namely

\(h^{*} ({\mathbf{x}}) = \mathop {\arg \min }\limits_{{c \in {\mathbf{Y}}}} R(c|{\mathbf{x}})\)

\(h^{*}\) is referred to as the Bayes-optimal classifier, and \(R(h^{*} )\) as the Bayes risk.
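The decision rule can be illustrated numerically; the posterior values and the 0-1 loss matrix below are made up for the sketch:

```python
import numpy as np

# Illustrative posteriors P(c_i | x) for three labels, and a 0-1 loss matrix
# lam[i, j]: cost of predicting c_i when the true label is c_j.
posterior = np.array([0.2, 0.5, 0.3])
lam = 1.0 - np.eye(3)          # misclassification costs 1, correct label costs 0

risk = lam @ posterior         # conditional risk R(c_i | x) for each candidate label
h_star = int(np.argmin(risk))  # Bayes-optimal decision
```

Under 0-1 loss, \(R(c_{i} |{\mathbf{x}}) = 1 - P(c_{i} |{\mathbf{x}})\), so minimizing the conditional risk is the same as picking the label with the largest posterior probability.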

K-nearest neighbor

The K-nearest neighbor (kNN) algorithm is a lazy-learning algorithm because it defers the induction process until classification is required (Wettschereck et al. 1997 ). Lazy-learning algorithms require less computation time during training than eager-learning algorithms, such as decision trees, neural networks, and Bayesian networks, but may require more time during the classification phase.

The kNN algorithm is based on the assumption that instances close to each other in a feature space are likely to have similar properties. If instances with the same classification label are found nearby, an unlabeled instance can be assigned the same class label as its nearest neighbors. kNN locates the k-nearest instances to the unlabeled instance and determines its label by observing the most frequent class label among these neighbors.

The choice of k significantly affects the performance of the kNN algorithm. Let us discuss the performance of kNN when \(k = 1\) . Given a sample \({\mathbf{x}}\) and its nearest sample \({\mathbf{z}}\) , the probability of error can be expressed as follows:

\(P(err) = 1 - \sum\nolimits_{{c \in {\mathbf{Y}}}} {P(c|{\mathbf{x}})P(c|{\mathbf{z}})}\)

Suppose the samples are independent and identically distributed. For any \({\mathbf{x}}\) and any positive number \(\delta\) , there always exists at least one sample \({\mathbf{z}}\) within a distance of \(\delta\) from \({\mathbf{x}}\) . Let \(c^{*} ({\mathbf{x}}) = \mathop {\arg \max }\limits_{{c \in {\mathbf{Y}}}} P(c|{\mathbf{x}})\) be the outcome of the Bayes-optimal classifier. Then we have:

\(P(err) = 1 - \sum\nolimits_{{c \in {\mathbf{Y}}}} {P(c|{\mathbf{x}})P(c|{\mathbf{z}})} \simeq 1 - \sum\nolimits_{{c \in {\mathbf{Y}}}} {P(c|{\mathbf{x}})^{2} } \le 1 - P(c^{*} |{\mathbf{x}})^{2} \le 2\left( {1 - P(c^{*} |{\mathbf{x}})} \right)\)

This bound shows that, despite the simplicity of kNN, its generalization error is no more than twice that of the Bayes-optimal classifier.
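A compact sketch of kNN classification (Euclidean distance, majority vote); the toy training data are illustrative:

```python
import numpy as np

def knn_predict(X_train, y_train, x, k=3):
    """Classify x by majority vote among its k nearest training instances
    (Euclidean distance)."""
    d = np.linalg.norm(X_train - x, axis=1)
    nearest = y_train[np.argsort(d)[:k]]
    labels, counts = np.unique(nearest, return_counts=True)
    return int(labels[np.argmax(counts)])

# Two well-separated toy clusters; a query near the first cluster inherits its label.
X_train = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
                    [3.0, 3.0], [3.1, 2.9], [2.9, 3.1]])
y_train = np.array([0, 0, 0, 1, 1, 1])
label = knn_predict(X_train, y_train, np.array([0.15, 0.1]), k=3)
```

Note that all the distance computations happen at prediction time, which is the "lazy" behavior described above.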

Unsupervised learning

In unsupervised learning, researchers can only access observations without any labeled information, and their primary interest lies in partitioning a sample into subsamples or clusters. Unsupervised learning methods are particularly useful in descriptive tasks because they aim to find relationships in a data structure without measuring the outcomes. Several approaches commonly used in business and finance research fall under the umbrella of unsupervised learning, including k-means clustering and reinforcement learning. Accordingly, unsupervised learning can also support qualitative business and finance research. For example, it can be particularly beneficial in stakeholder analysis, when stakeholders must be mapped and classified according to certain predefined attributes. It can also be useful for customer management: a company can employ an unsupervised ML method to cluster its customers, which can inform its marketing strategy for specific groups and lead to a competitive advantage. This section introduces unsupervised learning technologies that are widely used in business and finance.

K-means clustering

The K-means algorithm aims to find K points in the sample space and assign each sample to the closest of these points. Using an iterative method, the value of each cluster center is updated step-by-step to achieve the best clustering result. When partitioning the feature space into K clusters, the k-means algorithm selects centroids \(b_{1} ,...,b_{k}\) and assigns observations to clusters based on their proximity to them. The algorithm proceeds as follows. First, we begin with the K centroids \(b_{1} ,...,b_{k}\) , which are initially scattered throughout the feature space. Next, given the chosen centroids, each observation is assigned to the cluster that minimizes the distance between the observation and the cluster centroid:

\(c_{i} = \mathop {\arg \min }\limits_{{j \in \{ 1,...,K\} }} \left\| {X_{i} - b_{j} } \right\|\)

Next, we update each centroid by computing the average of \(X_{i}\) across its cluster:

\(b_{j} = \frac{{\sum\nolimits_{i = 1}^{N} {I(c_{i} = j)X_{i} } }}{{\sum\nolimits_{i = 1}^{N} {I(c_{i} = j)} }}\)

where \(I( \bullet )\) is the indicator function. When choosing the number of clusters, K, we must exercise caution because no cross-validation method is available to compare different values.
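The two alternating steps can be sketched directly in NumPy; the initialization here is a simple deterministic choice of the first K points, and the one-dimensional data are illustrative:

```python
import numpy as np

def kmeans(X, K, iters=20):
    """Plain k-means: repeatedly assign each observation to its nearest
    centroid, then recompute each centroid as the mean of its cluster."""
    centroids = X[:K].astype(float).copy()  # simple deterministic initialization
    labels = np.zeros(len(X), dtype=int)
    for _ in range(iters):
        d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = d.argmin(axis=1)           # assignment step
        for j in range(K):
            if np.any(labels == j):
                centroids[j] = X[labels == j].mean(axis=0)  # update step
    return centroids, labels

# Two well-separated one-dimensional clusters
X = np.array([[0.0], [0.1], [0.2], [5.0], [5.1], [5.2]])
centroids, labels = kmeans(X, K=2)
```

On this toy data, the algorithm converges in a couple of iterations to one centroid per cluster; in practice, multiple random restarts are commonly used because the result depends on initialization.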

Reinforcement learning

Reinforcement learning (RL) draws inspiration from the trial-and-error procedure conducted by Thorndike in his 1898 study of cat behavior. Originating from animal learning, RL aims to mimic human behavior by making decisions that maximize profits through interactions with the environment. Mnih et al. ( 2015 ) proposed deep RL by employing a deep Q-network to create an agent that outperformed a professional player in a game and further advanced the field of RL.

In deep RL, the learning algorithm plays an essential role in improving efficiency. These algorithms can be categorized into three types: value-based, policy-based, and model-based RL, as illustrated in Fig.  8 .

Figure 8. Learning algorithm-based reinforcement learning

RL consists of four components—agent, state, action, and reward—with the agent at its core. When an action leads to a profitable state, the agent receives a reward; otherwise, it is discouraged. In RL, an agent is any decision-maker, and everything else is considered the environment. The interactions between the environment and the agent are described by state \(s\) , action \(a\) , and reward \(r\) . At time step \(t\) , the environment is in state \(s_{t}\) and the agent takes action \(a_{t}\) . Consequently, the environment transitions to state \(s_{t + 1}\) and returns reward \(r_{t + 1}\) to the agent.

The agent’s decision is formalized by a policy \(\pi\) , which maps state \(s\) to action \(a\) . The policy is deterministic when the probability of choosing action \(a\) in state \(s\) equals one (i.e., \(\pi (a|s) = p(a|s) = 1\) ); it is stochastic when \(p(a|s) < 1\) . Policy \(\pi\) can be defined as a probability distribution over all actions available in a given state \(s\) :

\(\sum\nolimits_{{a \in \Delta_{\pi } }} {\pi (a|s)} = 1\)

where \(\Delta_{\pi }\) represents all possible actions of \(\pi\) .

In each step, the agent receives an immediate reward \(r_{t + 1}\) until it reaches the final state \(s_{T}\) . However, the immediate reward alone does not ensure long-term profit. To address this, a generalized return value \(R_{t}\) is used at time step \(t\) , defined as:

\(R_{t} = r_{t + 1} + \gamma r_{t + 2} + \gamma^{2} r_{t + 3} + \cdots = \sum\nolimits_{k = 0}^{{T - t - 1}} {\gamma^{k} r_{t + k + 1} }\)

where \(0 \le \gamma \le 1\) . The agents become more farsighted when \(\gamma\) approaches 1, and more shortsighted when it approaches 0.
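The return \(R_{t}\) can be computed by backward accumulation; the reward sequence below is illustrative and shows how \(\gamma\) controls far- versus shortsightedness:

```python
def discounted_return(rewards, gamma):
    """R_t = r_{t+1} + gamma * r_{t+2} + gamma^2 * r_{t+3} + ..., computed by
    backward accumulation over a finite episode."""
    R = 0.0
    for r in reversed(rewards):
        R = r + gamma * R
    return R

rewards = [1.0, 0.0, 0.0, 10.0]                     # a large reward arrives late
myopic = discounted_return(rewards, gamma=0.1)      # ~1.01: mostly ignores the late reward
farsighted = discounted_return(rewards, gamma=0.9)  # ~8.29: values the late reward
```

With \(\gamma\) near 0, the return is dominated by the immediate reward; with \(\gamma\) near 1, the late reward dominates, matching the description above.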

The next step is to define a score function \(V\) to estimate the goodness of a state:

\(V_{\pi } (s) = E_{\pi } [R_{t} |s_{t} = s]\)

Then, we determine the goodness of a state-action pair \((s,a)\) :

\(Q_{\pi } (s,a) = E_{\pi } [R_{t} |s_{t} = s,a_{t} = a]\)

Next, we assess the relative goodness of two policies:

\(\pi \ge \pi^{\prime} \Leftrightarrow V_{\pi } (s) \ge V_{{\pi^{\prime}}} (s){\text{ for all }}s\)

Finally, we can expand \(V_{\pi } (s)\) and \(Q_{\pi } (s,a)\) through \(R_{t}\) to represent the relationship between \(s\) and \(s_{t + 1}\) as

where \(W_{{s \to s^{\prime}|a}} = E[r_{t + 1} |s_{t} = s,a_{t} = a,s_{t + 1} = s^{\prime}]\) . By solving ( 31 ) and ( 32 ), we obtain \(V_{\pi }\) and \(Q_{\pi }\) , respectively.

Restricted Boltzmann machines

As Fig. 9 shows, a restricted Boltzmann machine (RBM) can be considered an undirected neural network with two layers, called the “hidden” and “visible” layers. The hidden layer detects features, whereas the visible layer receives the input data. Given \(n\) visible units \(v\) and \(m\) hidden units \(h\) , the energy function is given by

\(E(v,h) = - \sum\nolimits_{i = 1}^{n} {a_{i} v_{i} } - \sum\nolimits_{j = 1}^{m} {b_{j} h_{j} } - \sum\nolimits_{i = 1}^{n} {\sum\nolimits_{j = 1}^{m} {v_{i} \alpha_{ij} h_{j} } }\)

where \(\alpha_{ij}\) is the weight between visible unit \(i\) and hidden unit \(j\) , and \(a_{i}\) and \(b_{j}\) are the biases for \(v\) and \(h\) , respectively.
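The energy function can be evaluated directly; the weights and unit states below are illustrative:

```python
import numpy as np

def rbm_energy(v, h, W, a, b):
    """Energy of a (visible, hidden) configuration:
    E(v, h) = -a^T v - b^T h - v^T W h."""
    return -(a @ v) - (b @ h) - v @ W @ h

n_v, n_h = 4, 3
W = np.full((n_v, n_h), 0.5)  # weights alpha_ij between visible i and hidden j
a = np.zeros(n_v)             # visible biases
b = np.zeros(n_h)             # hidden biases

E = rbm_energy(np.ones(n_v), np.ones(n_h), W, a, b)  # -(4 * 3 * 0.5) = -6.0
```

Training an RBM (e.g., by contrastive divergence) adjusts \(W\), \(a\), and \(b\) so that low-energy configurations correspond to patterns frequent in the training data.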

Figure 9. Structure of RBM

Applications of machine learning techniques in business and finance

This section reviews the status of ML applications in the following fields: marketing, the stock market, e-commerce, cryptocurrency, accounting, credit risk management, and the energy market.

Marketing

ML is an innovative technology that can potentially improve forecasting models and assist in management decision-making. ML applications can be highly beneficial in the marketing domain because marketing relies heavily on building accurate predictive models from databases. Compared with traditional statistical approaches to forecasting consumer behavior, recently applied ML technology offers several distinctive advantages for data mining with large, noisy databases (Sirignano and Cont 2019 ). An early example of ML in marketing can be found in the work of Zahavi and Levin ( 1997 ), who used neural networks (NNs) to model consumer responses to direct marketing. Compared with the statistical approach, simple forms of NNs are free from assumptions of normality or complete data, making them particularly robust in handling noisy data. Recently, as shown in Table 3 , ML techniques have been predominantly used to study customer behaviors and demands. These applications enable marketers to gain valuable insights and make data-driven decisions to optimize marketing strategies.

Consumer behavior refers to the actions taken by consumers to request, use, and dispose of consumer goods, as well as the decision-making process that precedes and determines these actions. In the context of direct marketing, Cui et al. ( 2006 ) proposed Bayesian networks that learn by evolutionary programming to model consumer responses to direct marketing using a large direct marketing dataset. In the supply chain domain, Melancon et al. ( 2021 ) used gradient-boosted decision trees to predict service-level failures in advance and provide timely alerts to planners for proactive actions. Regarding unsupervised learning in consumer behavior analysis, Dingli et al. ( 2017 ) implemented a CNN and an RBM to predict customer churn. However, they found that their performance was comparable to that of supervised learning when introducing added complexity in specific operations and settings. Overall, ML techniques have demonstrated their potential for understanding and predicting consumer behavior, thereby enabling businesses to make informed decisions and optimize their marketing strategies (Machado and Karray 2022 ; Mao and Chao 2021 ).

Predicting consumer demand plays a critical role in helping enterprises efficiently arrange production and generate profits. Timoshenko and Hauser ( 2019 ) used a CNN to facilitate qualitative analysis by selecting the content for an efficient review. Zhang et al. ( 2020a , b ) used a Bayesian learning model with a rich dataset to analyze the decision-making behavior of taxi drivers in a large Asian city to understand the key factors that drive the supply side of urban mobility markets. Ferreira et al. ( 2016 ) employed ML techniques to estimate historical lost sales and predict future demand for new products. For the application of consumer demand-level prediction, most of the research we reviewed used supervised learning technologies because learning consumer consumption preferences requires historical data of consumers, and only clustering consumers is insufficient to predict their consumption levels.

Stock market

ML applications in the stock market have gained immense popularity, with the majority focusing on financial time series for stock price predictions. Table 4 summarizes the reviewed articles that employed ML methods in stock market studies, including references, research objectives, data sources, applied techniques, and journals. Investing in the stock market can be highly profitable but also entails risk. Therefore, investors always try to determine and estimate stock values before taking any action. Researchers have mostly used ML techniques to predict stock prices (Bennett et al. 2022 ; Moon and Kim 2019 ). However, predicting stock values can be challenging due to the influence of uncontrollable economic and political factors that make it difficult to identify future market trends. Additionally, financial time-series data are often noisy and non-stationary, rendering traditional forecasting methods less reliable for stock value predictions. Researchers have explored ML in sentiment analysis to identify future trends in the stock market (Baba and Sevil 2021 ). Furthermore, other studies have focused on objectives such as algorithmic trading, portfolio management, and S&P 500 index trend prediction using ML techniques (Cuomo et al. 2022 ; Go and Hong 2019 ).

Various ML techniques have been successfully applied for stock price predictions. Fischer and Krauss ( 2018 ) applied LSTM networks to predict the out-of-sample directional movements of the constituent stocks of the S&P 500 from 1992 to 2015, demonstrating that LSTM networks outperform memory-free classification methods. Wu et al. ( 2021 ) applied LASSO, random forest, gradient boosting, and a DNN to cross-sectional return predictions in hedge fund selection and found that ML techniques significantly outperformed four styles of hedge fund research indices in almost all situations. Bao et al. ( 2017 ) fed high-level denoising features into the LSTM to forecast the next day’s closing price. Sabeena and Venkata ( 2019 ) proposed a modified adversarial-network-based framework that integrated a gated recurrent unit and a CNN to acquire data from online financial sites and processed the obtained information using an adversarial network to generate predictions. Song et al. ( 2019 ) used deep learning methods to predict future stock prices. Sohangir et al. ( 2018 ) applied several NN models to stock market opinions posted on StockTwits to determine whether deep learning models could be adapted to improve the performance of sentiment analysis on StockTwits. Bianchi et al. ( 2021 ) showed that extreme trees and NNs provide strong statistical evidence in favor of bond return predictability. Vo et al. ( 2019 ) proposed a deep responsible investment portfolio model containing an LSTM network to predict stock returns. All of these stock price applications use supervised learning techniques and financial time-series data to supervise learning. In contrast, it is challenging to apply unsupervised learning methods, particularly clustering, in this domain (Chullamonthon and Tangamchit 2023 ). However, RL still has certain applications in stock markets. Lei ( 2020 ) combined deep learning and RL models to develop a time-driven, feature-aware joint deep RL model for financial time-series forecasting in algorithmic trading, thus demonstrating the potential of RL in this domain.

Additionally, the evidence suggests that hybrid LSTM methods can outperform other single-supervised ML methods in certain scenarios. Thus, in applying ML to the stock market, researchers have explored the combination of LSTM with different methods to develop hybrid models for improved performance. For instance, Tamura et al. ( 2018 ) used LSTM to predict stock prices and reported that the accuracy test results outperformed those of other models, indicating the effectiveness of the hybrid LSTM approach in stock price prediction.

Researchers have explored various hybrid approaches that combine wavelet transforms and LSTM with other techniques to predict stock prices and financial time series. Bao et al. ( 2017 ) established a new method for predicting stock prices that integrated wavelet transforms, stacked autoencoders, and LSTM. In the first stage, wavelet transforms decompose the stock price time series to eliminate noise. In the second stage, stacked autoencoders create predictive features for the stock price. Finally, LSTM predicts the next day’s closing price from the features of the previous stage. The authors claimed that their model outperformed state-of-the-art models in terms of predictive accuracy and profitability. To address the non-linearity and non-stationary characteristics of financial time series, Yan and Ouyang ( 2018 ) integrated wavelet analysis with LSTM to forecast the daily closing price of the Shanghai Composite Index. Their proposed model outperformed the multilayer perceptron, SVM, and KNN with respect to finding patterns in financial time-series data. Fang et al. ( 2019 ) developed a methodology to predict exchange-traded fund option prices by integrating LSTM with support vector regression (SVR). They used two LSTM-SVR models to model the final transaction price. In the second LSTM-SVR model, the hidden state vectors of the LSTM and the seven factors affecting the option price were used as SVR inputs. Their proposed model outperformed other methods, including LSTM and RF, in predicting option prices.
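The two-stage pattern in these hybrid pipelines — denoise the price series, then turn it into supervised windows for a sequence model — can be sketched without the deep-learning part. Here a simple moving average stands in for the wavelet denoising, and the window builder produces (features, target) pairs that an LSTM, or any regressor, could consume; all names and the toy series are illustrative, not from the cited papers.

```python
import numpy as np

def denoise(series, width=3):
    """Stage 1 stand-in: smooth the price series with a moving average
    (the cited pipelines use a wavelet transform here instead)."""
    kernel = np.ones(width) / width
    return np.convolve(series, kernel, mode="valid")

def make_windows(series, lookback=4):
    """Stage 2 prep: turn the smoothed series into supervised pairs
    (previous `lookback` values -> next value) for a sequence model."""
    X, y = [], []
    for t in range(lookback, len(series)):
        X.append(series[t - lookback:t])
        y.append(series[t])
    return np.array(X), np.array(y)

prices = np.array([100, 102, 101, 105, 107, 106, 110, 112, 111, 115], float)
smooth = denoise(prices)            # length shrinks by width - 1
X, y = make_windows(smooth, lookback=4)
print(X.shape, y.shape)
```

The design point is the separation of concerns: the forecaster never sees raw noisy prices, only the denoised windows, which is what the wavelet-LSTM papers above exploit.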

E-commerce

Online shopping, which allows users to purchase products from companies via the Internet, falls under the umbrella of e-commerce. In today’s rapidly evolving online shopping landscape, companies employ effective methods to recognize their buyers’ purchasing patterns, thereby enhancing their overall client experience. Customer reviews play a crucial role in this process as they are not only utilized by companies to improve their products and services but also by customers to assess the quality of a product and make informed purchase decisions (Da et al. 2022 ). Consequently, the decision-making process is significantly improved through analysis of reviews that provide valuable insights to customers.

Traditionally, enterprises’ e-commerce strategic planning involves assessing the performance of organizational e-commerce adoption behavior at the strategic level. In this context, the decision-making process exhibits typical behavioral characteristics. With regard to organizations’ adoption of technology, it is important to note that the entity adopting the technology is no longer an individual but the organization as a whole. However, technology adoption decisions are still made by people within an organization, and these decisions are influenced by individual cognitive factors (Zha et al. 2021 ). Individuals involved in the decision-making process have their own perspectives, beliefs, and cognitive biases, which can significantly impact an organization’s technology adoption choices and strategies (Li et al. 2019 ; Xu et al. 2021 ). Therefore, the behavioral perspective of technology acceptance provides a new lens for e-commerce strategic planning research. Research on technology acceptance has long been constrained by the limitations of traditional strategic e-commerce planning; with the development of ML, different general models of information technology acceptance behavior are now commonly explored.

Table 5 provides a summary of the aforementioned studies. Cui et al. ( 2021 ) constructed an e-commerce product marketing model based on an SVM to improve the marketing effects of e-commerce products. Pang and Zhang ( 2021 ) built an SVM model to more effectively solve the decision support problem of e-commerce strategic planning. To increase buyers’ trust in product quality and encourage online purchases, Saravanan and Charanya ( 2018 ) designed an algorithm that categorizes products based on several criteria, including reviews and ratings from other users. They proposed a hybrid feature-extraction method using an SVM to classify and separate products based on their features, best product ratings, and positive reviews. Wang et al. ( 2018a , b , c ) employed LSTM to improve the effectiveness and efficiency of mapping customer requirements to design parameters. Their results revealed the superior performance of the RNN over the KNN. Xu et al. ( 2019 ) designed an advanced credit risk evaluation system for e-commerce platforms to minimize the transaction risks associated with buyers and sellers. To this end, they employed a hybrid ML model combining a decision tree and an ANN (DT-ANN) and found that it had high accuracy and outperformed other models, such as logistic regression and dynamic Bayesian networks. Cai et al. ( 2018 ) used deep RL to develop an algorithm addressing the impression-allocation problem on e-commerce websites such as www.taobao.com, www.ebay.com, and www.amazon.com. In this algorithm, buyers are allocated to sellers based on their impressions and strategies to maximize the income of the platform. To do so, they applied a gated recurrent unit, and their findings demonstrated that it outperformed a deep deterministic policy gradient. Wu and Yan ( 2018 ) noted that current production recommender models for e-commerce websites assume that all historical user data are recorded. In practice, however, many platforms fail to capture such data. Consequently, they devised a list-wise DNN to model the temporal online behavior of users and offer recommendations for anonymous users.
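The review-classification theme running through these studies — separating positive from negative product reviews before using them downstream — can be illustrated with a minimal bag-of-words perceptron. The cited works use SVMs and deep networks, so this is a deliberately simplified stand-in, and the vocabulary and training reviews are invented.

```python
# Minimal bag-of-words perceptron for review polarity (+1 / -1).
def featurize(text, vocab):
    words = text.lower().split()
    return [words.count(w) for w in vocab]

def train_perceptron(samples, labels, vocab, epochs=10):
    w = [0.0] * len(vocab)
    b = 0.0
    for _ in range(epochs):
        for x_text, y in zip(samples, labels):   # y in {+1, -1}
            x = featurize(x_text, vocab)
            score = sum(wi * xi for wi, xi in zip(w, x)) + b
            if y * score <= 0:                   # misclassified -> update
                w = [wi + y * xi for wi, xi in zip(w, x)]
                b += y
    return w, b

def predict(text, w, b, vocab):
    x = featurize(text, vocab)
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1

vocab = ["great", "love", "poor", "broken", "works"]
reviews = ["great product love it",
           "works great",
           "poor quality broken on arrival",
           "broken and poor"]
labels = [1, 1, -1, -1]
w, b = train_perceptron(reviews, labels, vocab)
print(predict("love it works", w, b, vocab))  # → 1
```

An SVM replaces the perceptron update with a margin-maximizing objective, but the feature pipeline — text to sparse counts to a linear decision — is the same shape.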

Accounting

In the accounting field, ML techniques are employed to detect fraud and estimate accounting indicators. Most companies’ financial statements reflect accounts or disclosure amounts that require estimation. Accounting estimates are pervasive in financial statements and often significantly impact a company’s financial position and operational results. The evolution of financial reporting frameworks has led to the increased use of fair value measurements, which necessitate estimation. Most financial statement items are based on subjective managerial estimates, and ML has the potential to provide an independent estimate generator (Kou et al. 2021 ).

Chen and Shi ( 2020 ) utilized bagging and boosting ensemble strategies to develop two models: bagged-proportion support vector machines (pSVM) and boosted-pSVMs. Using datasets from LibSVM, they tested their models and demonstrated that ensemble learning strategies significantly enhanced model performance in bankruptcy prediction. Lin et al. ( 2019 ) emphasized the importance of finding the best match between feature selection and classification techniques to improve the prediction performance of bankruptcy prediction models. Their results revealed that using a genetic algorithm as the wrapper-based feature selection method, combined with naïve Bayes and support vector machine classifiers, resulted in remarkable predictive performance. Faris et al. ( 2019 ) investigated a combination of resampling (oversampling) techniques and multiple feature selection methods to improve the accuracy of bankruptcy prediction. According to their findings, employing the oversampling technique and the AdaBoost ensemble method with a reduced error pruning (REP) tree provided reliable and promising results for bankruptcy prediction.
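The bagging strategy behind models like bagged-pSVM — train base learners on bootstrap resamples and aggregate by majority vote — can be sketched with decision stumps standing in for the SVM base learners. The firm features, labels, and data below are invented for illustration.

```python
import random

def train_stump(X, y):
    """Pick the (feature, threshold) split minimising training errors,
    predicting 1 ('bankrupt') above the threshold and 0 below."""
    best = None
    for f in range(len(X[0])):
        for thr in sorted({row[f] for row in X}):
            errs = sum((1 if row[f] > thr else 0) != yi
                       for row, yi in zip(X, y))
            if best is None or errs < best[0]:
                best = (errs, f, thr)
    _, f, thr = best
    return lambda row: 1 if row[f] > thr else 0

def bagged_predict(X, y, query, n_models=25, seed=0):
    """Bagging: majority vote over stumps trained on bootstrap resamples."""
    rng = random.Random(seed)
    votes = 0
    for _ in range(n_models):
        idx = [rng.randrange(len(X)) for _ in range(len(X))]  # bootstrap
        stump = train_stump([X[i] for i in idx], [y[i] for i in idx])
        votes += stump(query)
    return 1 if votes > n_models / 2 else 0

# Toy firm features: [debt ratio, consecutive loss years]; 1 = bankrupt
X = [[0.9, 3], [0.8, 2], [0.85, 4], [0.2, 0], [0.3, 1], [0.25, 0]]
y = [1, 1, 1, 0, 0, 0]
print(bagged_predict(X, y, [0.88, 3]))
```

Boosting differs in that resampling is replaced by reweighting toward previously misclassified firms; the vote-aggregation idea is shared.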

The earlier studies by Perols ( 2011 ) and Perols et al. ( 2017 ) were among the first to predict accounting fraud. Two recent studies by Bao et al. ( 2020 ) and Bertomeu et al. (2020) used various accounting variables to improve the detection of ongoing irregularities. Bao et al. ( 2020 ) employed ensemble learning to develop a fraud-prediction model that demonstrated superior performance compared to the logistic regression and support vector machine models with a financial kernel. Huang et al. ( 2014 ) used Bayesian networks to extract textual opinions, and their findings showed that Bayesian networks outperformed both general and financial dictionary-based approaches. Ding et al. ( 2020 ) used insurance companies’ data on loss reserve estimates and realizations and documented that the loss estimates generated by ML were superior to the actual managerial estimates reported in financial statements in four out of the five insurance lines examined.

Many companies commission accounting firms to handle accounting and bookkeeping and provide them access to transaction data, documentation, and other relevant information. Mapping daily financial transactions into accounts is one of the most common accounting tasks. Therefore, Jorgensen and Igel ( 2021 ) devised ML systems based on random forest to automate the mapping of financial transfers to the appropriate accounts. Their approach achieved an impressive accuracy of 80.50%, outperforming baseline methods that either excluded transaction text or relied on lexical bag-of-words text representations. This success indicates the potential of ML to streamline accounting processes and increase the efficiency of financial transaction mapping. Table 6 summarizes the ML techniques described in the “Accounting” section.
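A toy version of the transaction-mapping task — assigning free-text transaction lines to accounts from labelled history — can be sketched with word-overlap scoring against per-account keyword profiles. The cited system uses a random forest over richer text features; the account names and transaction strings here are invented.

```python
from collections import Counter, defaultdict

def build_profiles(transactions):
    """Learn a word-frequency profile per account from labelled history."""
    profiles = defaultdict(Counter)
    for text, account in transactions:
        profiles[account].update(text.lower().split())
    return profiles

def map_to_account(text, profiles):
    """Assign the account whose profile shares the most word mass."""
    words = Counter(text.lower().split())
    def score(acct):
        return sum(min(words[w], profiles[acct][w]) for w in words)
    return max(profiles, key=score)

history = [
    ("monthly office rent payment", "rent"),
    ("rent for warehouse", "rent"),
    ("laptop purchase for staff", "equipment"),
    ("new office printer purchase", "equipment"),
    ("electricity bill april", "utilities"),
    ("water bill march", "utilities"),
]
profiles = build_profiles(history)
print(map_to_account("printer and laptop purchase", profiles))  # → equipment
```

The baseline comparison in the cited study — models with versus without transaction text — corresponds here to dropping the word profiles entirely, which would leave nothing to discriminate on.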

Credit risk management

The scoring process is an essential part of the credit risk management system used in financial institutions to predict the risk of loan applications because credit scores imply a certain probability of default. Hence, credit scoring models have been widely developed and investigated for credit approval assessment of new applicants. This process uses a statistical model that considers both the application and performance data of a credit or loan applicant to estimate the likelihood of default, which is the most significant factor used by lenders to prioritize applicants in decision-making. Given the substantial volume of decisions involved in the consumer lending business, it is necessary to rely on models and algorithms rather than on human discretion (Bao et al. 2019 ; Husmann et al. 2022 ; Liu et al. 2019 ). Furthermore, such algorithmic decisions are based on “hard” information, such as consumer credit file characteristics collected by credit bureau agencies.

Supervised and unsupervised ML methods are widely used for credit risk management. Supervised ML techniques are used in credit scoring models to determine the relationships between customer features and credit default risk and subsequently predict classifications. Unsupervised techniques, mainly clustering algorithms, are used as data mining techniques to group samples into clusters (Wang et al. 2019 ). Hence, unsupervised learning techniques often complement supervised techniques in credit risk management.

Despite the high accuracy of ML, its predictions are often difficult to explain. However, financial institutions must maintain transparency in their decision-making processes. Fortunately, researchers have shown that rules can be extracted from ML models to mitigate this lack of transparency without compromising accuracy (Baesens et al. 2003 ). Table 7 summarizes the recent applications of ML methods in credit risk management. Liu et al. ( 2022 ) used KNN, SVM, and random forest to predict the default probability of online loan borrowers and compared their prediction performance with that of a logistic model. Khandani et al. ( 2010 ) applied regression trees to construct non-linear, non-parametric forecasting models for consumer credit risk.
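The scoring logic described above — a statistical model turning applicant features into a probability of default — can be sketched with a small logistic regression fitted by gradient descent. This is the classic scorecard baseline against which the cited ML methods are compared; the features and applicant data are invented, and production scorecards involve far more careful feature engineering and validation.

```python
import numpy as np

def fit_logistic(X, y, lr=0.1, steps=2000):
    """Fit P(default) = sigmoid(X w + b) by batch gradient descent."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
        w -= lr * (X.T @ (p - y)) / len(y)
        b -= lr * (p - y).mean()
    return w, b

def default_probability(x, w, b):
    return 1.0 / (1.0 + np.exp(-(x @ w + b)))

# Toy applicants: [debt-to-income, past late payments]; 1 = defaulted
X = np.array([[0.8, 4], [0.7, 3], [0.9, 5], [0.2, 0], [0.3, 1], [0.1, 0]])
y = np.array([1, 1, 1, 0, 0, 0])
w, b = fit_logistic(X, y)
risky = default_probability(np.array([0.85, 4]), w, b)
safe = default_probability(np.array([0.15, 0]), w, b)
print(risky > 0.5, safe < 0.5)  # → True True
```

The transparency point above is visible here: each weight in `w` is directly readable as a per-feature risk contribution, which is exactly what black-box models give up.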

Cryptocurrency

A cryptocurrency is a digital or virtual currency used to securely exchange and transfer assets. Cryptography is used to securely transfer assets, control and regulate the addition of cryptocurrencies, and secure their transactions (Garcia et al. 2014 ); hence, the term “cryptocurrency.” In contrast to standard currencies, which depend on the central banking system, cryptocurrencies are founded on the principle of decentralized control (Zhao 2021 ). Owing to its uncontrolled and untraceable nature, the cryptocurrency market has evolved exponentially over a short period. The growing interest in cryptocurrencies in the fields of economics and finance has drawn the attention of researchers in this domain. However, the applications of cryptocurrencies and associated technologies are not limited to financing. There is a significant body of computer science literature that focuses on the supporting technologies of cryptocurrencies, which can lead to innovative and efficient approaches for handling Bitcoin and other cryptocurrencies, as well as addressing their price volatility and other related technologies (Khedr et al. 2021 ).

Generating an accurate prediction model for such complex problems is challenging. As a result, cryptocurrency price prediction is still in its nascent stages, and further research efforts are required to explore this area. In recent years, ML has become one of the most popular approaches for cryptocurrency price prediction owing to its ability to identify general trends and fluctuations. Table 8 presents a survey of cryptocurrency price prediction research using ML methods. Derbentsev et al. ( 2019 ) presented a short-term forecasting model to predict the cryptocurrency prices of Ripple, Bitcoin, and Ethereum using an ML approach. Greaves and Au ( 2015 ) applied blockchain data to Bitcoin price prediction and employed various ML techniques, including SVM, ANN, and linear and logistic regression. Among the ML classifiers used, the NN classifier with two hidden layers achieved the highest price accuracy of 55%, followed by logistic regression and SVM. The study also analyzed several tree-based models and KNN.

The most recent LSTM networks appear to be more suitable and convenient for handling sequential data such as time series. Lahmiri and Bekiros ( 2019 ) were the first to use LSTM to predict digital currency prices, forecasting the three currencies most widely traded at the time of their study: Bitcoin, Ripple, and Digital Cash. In their study, long memory was used to assess the market efficiency of cryptocurrencies, and the inherent non-linear dynamics, encompassing chaoticity and fractality, were examined to gauge the predictability of digital currencies. Chowdhury et al. ( 2020 ) applied LSTM to the indices and constituents of cryptocurrencies to predict prices. Furthermore, Altan et al. ( 2019 ) built a novel hybrid forecasting model based on LSTM to predict digital currency time series.
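The gate structure that makes LSTM networks suited to sequential price data can be shown with a single LSTM cell stepped over a toy series in plain NumPy. The weights here are random, so this illustrates only the mechanics — input, forget, and output gates updating a persistent cell state — not a trained predictor, and the price series is invented.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h, c, W, U, b):
    """One LSTM step. W: (4H, D) input weights, U: (4H, H) recurrent
    weights, b: (4H,) bias; gate order: input, forget, cell, output."""
    H = h.shape[0]
    z = W @ x + U @ h + b
    i = sigmoid(z[0:H])           # input gate: how much new info to admit
    f = sigmoid(z[H:2*H])         # forget gate: how much memory to keep
    g = np.tanh(z[2*H:3*H])       # candidate cell state
    o = sigmoid(z[3*H:4*H])       # output gate
    c_new = f * c + i * g         # long-term memory update
    h_new = o * np.tanh(c_new)    # exposed hidden state
    return h_new, c_new

rng = np.random.default_rng(0)
D, H = 1, 4                       # 1 input feature (price), 4 hidden units
W = rng.normal(size=(4 * H, D))
U = rng.normal(size=(4 * H, H))
b = np.zeros(4 * H)
h, c = np.zeros(H), np.zeros(H)
for price in [1.0, 1.2, 0.9, 1.1]:    # toy normalised price series
    h, c = lstm_step(np.array([price]), h, c, W, U, b)
print(h.shape)
```

The forget gate `f` is what gives the cell its "long memory": with `f` near 1, information from early prices persists in `c` across many steps, which is the property the cryptocurrency studies above rely on.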

Energy economics

The existing applications of ML techniques in energy economics can be classified into two major categories: energy price prediction and energy demand prediction. Energy prices typically demonstrate complex features, such as non-linearity, lag dependence, and non-stationarity, which present challenges for the application of simple traditional models (Chen et al. 2018 ). Owing to their high flexibility, ML techniques can provide superior prediction performance. In energy demand predictions, lagged values of consumption together with socioeconomic and technological variables, such as GDP per capita, population, and technology trends, are typically utilized. Table 9 presents a summary of these studies. A critical distinction between “price” and “consumption” prediction is that the latter is not subject to market efficiency dynamics: a consumption forecast has little effect on agents’ actual consumption, whereas a price forecast tends to offset itself by creating opportunities for traders to act on the information.

Predicting prices in energy markets is a complicated process because prices are subject to physical constraints on electricity generation and transmission and market power potential (Young et al. 2014 ). Predicting prices using ML techniques is one of the oldest applications in energy economics. In the early 2000s, a wave of studies attempted to forecast electricity prices using conventional ANN techniques. Ding ( 2018 ) combined ensemble empirical mode decomposition and an artificial NN to forecast international crude oil prices. Zhang et al. ( 2020a , b ) employed the LSTM method to forecast day-ahead electricity prices in a deregulated electricity market. They also investigated the intricate dependence structure within the price-forecasting model. Peng et al. ( 2018 ) applied LSTM with a differential evolution algorithm to predict electricity prices. Lago et al. ( 2018 ) first proposed a DNN to improve the predictive accuracy in a local market and then proposed a second model that simultaneously predicts prices from two markets to further improve the forecasting accuracy. Huang and Wang ( 2018 ) proposed a model that combines wavelet NNs with random time-effective functions to improve the prediction accuracy of crude oil price fluctuations.

Understanding the future energy demand and consumption is essential for short- and long-term planning. A wide range of users, including government agencies, local development authorities, financial institutions, and trading institutions, are interested in obtaining realistic forecasts of future consumption portfolios (Lei et al. 2020 ). For demand prediction, Chen et al. ( 2018 ) used ridge regression to combine extreme gradient boosting forest and feedforward deep networks to predict the annual household electricity consumption. Wang et al. ( 2018a , b , c ) first built a model using a self-adaptive multi-verse optimizer to optimize the SVM and then employed it to predict China’s primary energy consumption.
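The demand-side setup described above — predicting consumption from its own lagged values — can be sketched with closed-form ridge regression, which is also the combiner Chen et al. ( 2018 ) used over their ensemble. The consumption series, lag choice, and regularization strength below are synthetic illustrations.

```python
import numpy as np

def ridge_fit(X, y, alpha=1.0):
    """Closed-form ridge solution w = (X'X + alpha I)^(-1) X'y.
    A bias column is appended and, for simplicity, penalised too."""
    Xb = np.hstack([X, np.ones((len(X), 1))])
    return np.linalg.solve(Xb.T @ Xb + alpha * np.eye(Xb.shape[1]),
                           Xb.T @ y)

def ridge_predict(X, w):
    Xb = np.hstack([X, np.ones((len(X), 1))])
    return Xb @ w

# Synthetic annual consumption with a steady trend
demand = np.array([100, 104, 109, 113, 118, 122, 127, 131, 136, 140], float)
X = np.column_stack([demand[1:-1], demand[:-2]])  # lags t-1 and t-2
y = demand[2:]
w = ridge_fit(X, y, alpha=0.1)
pred = ridge_predict(np.array([[140.0, 136.0]]), w)[0]
print(round(pred, 1))
```

The ridge penalty is what keeps the fit stable despite the strong collinearity between the two lags — the same reason it suits combining correlated base forecasts in an ensemble.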

Critical discussions and future research directions

ML techniques have proven valuable in establishing computational models that capture complex relationships with the available data. Consequently, ML has become a useful tool in business and finance. This section critically discusses the existing research and outlines future directions.

Critical discussions

Although ML techniques are widely employed in business and finance, several issues need to be addressed.

Linguistic information is abundant in business and finance, encompassing online commodity comments and investors’ emotional responses in the stock market. Nonetheless, the existing research has predominantly concentrated on processing numerical data. When juxtaposed with numerical information, linguistic data harbor intricate characteristics, notably personalized individual semantics (Li et al. 2022a , b ; Zhang et al. 2021a , b ; Hoang and Wiegratz 2022 ).

The integration of ML into business and finance can lead to interpretability issues. In ML, an interpretable model refers to one in which a human observer can readily comprehend how the model transforms an observation into a prediction (Freitas 2014 ). Typically, decision-makers are hesitant to accept recommendations generated by ML techniques unless they can grasp the reasoning behind them. Unfortunately, the existing research in business and finance, particularly those employing DNNs, has seldom emphasized the interpretability of their models.

Social networks are prevalent in the marketing domain within businesses (Zha et al. 2020 ). For instance, social networks exist among consumers, whose purchasing behavior is influenced by the opinions of trusted peers or friends. However, the existing research that applies ML to marketing has predominantly concentrated on personal customer attributes, such as personality, purchasing power, and preferences (Dong et al. 2021 ). Regrettably, the potential impact of social networks and their influence on customer behavior have been largely overlooked in these studies.

ML techniques typically focus on exploring the statistical relationships between dependent and independent variables and emphasize feature correlations. However, in business and finance applications, causal relationships exist between variables. For instance, consider a study suggesting that girls who have breakfast tend to have lower weights than those who do not, based on which one might conclude that having breakfast aids weight loss. In reality, these two events may only exhibit a correlation rather than causation (Yao et al. 2021 ). Causality plays a significant role in the performance of ML techniques, yet many current business and finance applications fail to account for this crucial factor. Ignoring causality may lead to misleading conclusions and hinder accurate modeling of real-world scenarios. Therefore, incorporating causality into ML methodologies within the business and finance domains is essential for enhancing the reliability and validity of predictive models and decision-making processes.

In the emerging cryptocurrency field, traditional statistical methods are simple to implement and interpret, but they require many unrealistic statistical assumptions, making ML an attractive alternative. Nonetheless, challenges remain in accurately predicting cryptocurrency prices, and most ML techniques in this field require further investigation.

In recent years, rapid growth in digital payments has led to significant shifts in fraud and financial crimes (Canhoto 2021 ; Prusti et al. 2022 ; Wang et al. 2023 ). While some studies have shown the effective use of ML in detecting financial crimes, there remains a limitation in the research dedicated to this area. As highlighted by Pourhabibi et al. ( 2020 ), the complex nature of financial crime detection applications poses challenges in terms of deploying and achieving the desired detection performance levels. These challenges are manifested in two primary aspects. First, ML solutions encounter substantial pressure to deliver real-time responses owing to the constraints of processing data in real time. Second, in addition to inherent data noise, criminals often attempt to introduce deceptive data to obfuscate illicit activities (Pitropakis et al. 2019 ). Regrettably, few studies have investigated the robustness and performance of the underlying algorithmic solutions when confronted with data quality issues.

An important limitation of the current literature on energy and ML is that most works emphasize the computer science perspective, optimizing computational metrics (e.g., the accuracy rate), while financial intuition may be ignored.

Future research directions

Thus, we propose that future research on this topic follow the directions below:

As analyzed above, abundant linguistic information exists in business and finance. Consequently, leveraging natural language processing technology to handle and analyze linguistic data in these domains represents a highly promising research direction.

The amalgamation of theoretical models with ML techniques is an important research topic. Interpretable models can help open the black box of ML-driven analyses, elucidating the underlying reasoning behind the results. Consequently, introducing interpretable models into business and finance applications of ML can yield substantial benefits.

Consumer interactions and behaviors are often intertwined within social networks, making it crucial to incorporate social network dynamics when modeling their influence on consumer behavior. Introducing the social network aspect into ML models has tremendous potential for enhancing marketing strategies and outcomes (Trandafili and Biba 2013 ).

Causality has garnered increasing attention in the field of ML in recent years. Accordingly, we believe it is an intriguing avenue to explore when applying ML to address problems in business and finance.

Further studies need to include all relevant factors affecting market mood and track them over a longer period to understand the anomalous behavior of cryptocurrencies and their prices. We recommend that researchers analyze LSTM variants in future research, such as CNN-LSTM and encoder–decoder LSTM, and compare the results to obtain further insights and improve price prediction. In addition, researchers can apply sentiment analysis to collect social signals, which can be further enhanced by improving the quality of content and using more content sources. Another area of opportunity lies in more specialized models and different types of approaches.

Graph NNs and emerging adaptive solutions provide important opportunities for shaping the future of fraud and financial crime detection owing to their parallel structures. Because of the complexity of digital transaction processing and the ever-changing nature of fraud, robustness should be treated as the primary design goal when applying ML to detect financial crimes. Finally, focusing on real-time responses and data noise issues is necessary to improve the performance of current ML solutions for financial crime detection.

Currently, the application of unsupervised learning methods in different areas, such as marketing and risk management, is limited. Some problems related to marketing and customer management could be analyzed using clustering techniques, such as K-means, to segment clients by different demographic or behavioral characteristics and by their likelihood of default or switching companies. In energy risk management, extreme events can be identified as outliers using principal component analysis or ranking algorithms.
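The K-means segmentation suggested above can be sketched in a few lines of NumPy; the customer features and the two spending segments below are invented for illustration.

```python
import numpy as np

def kmeans(X, k, iters=50, seed=0):
    """Plain K-means: assign points to the nearest centroid, then
    recompute each centroid as the mean of its assigned points."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):          # avoid emptying a cluster
                centroids[j] = X[labels == j].mean(axis=0)
    return labels, centroids

# Toy customers: [monthly spend, purchase frequency]
X = np.array([[900.0, 12], [850, 10], [950, 11],   # heavy users
              [80, 1], [120, 2], [100, 1]])        # light users
labels, centroids = kmeans(X, k=2)
print(labels)
```

In practice, features on different scales (spend vs. frequency) should be standardized first, or the larger-scale feature dominates the distance; the toy data above are separable either way.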

Conclusions

ML techniques have already made notable contributions to business and finance, and their application in these domains is increasing rapidly. This review discusses advancements in ML in business and finance by examining seven research directions: cryptocurrency, marketing, e-commerce, energy economics, the stock market, accounting, and credit risk management. Models such as DNNs, CNNs, RNNs, random forests, and SVMs are highlighted in almost every domain of business and finance. Finally, we analyze some limitations of existing studies and suggest several avenues for future research. This review helps researchers understand the progress of ML applications in business and finance, thereby promoting further developments in these fields.

Availability of data and materials

Not applicable.

Abbreviations

  • ML: Machine learning

  • LSTM: Long short-term memory

  • SVM: Support vector machine

  • RBM: Restricted Boltzmann machine

  • LASSO: Least absolute shrinkage and selection operator

References

Agarwal S (2022) Deep learning-based sentiment analysis: establishing customer dimension as the lifeblood of business management. Glob Bus Rev 23(1):119–136

Ahmadi E, Jasemi M, Monplaisir L, Nabavi MA, Mahmoodi A, Jam PA (2018) New efficient hybrid candlestick technical analysis model for stock market timing on the basis of the support vector machine and heuristic algorithms of imperialist competition and genetic. Expert Syst Appl 94:21–31

Akyildirim E, Goncu A, Sensoy A (2021) Prediction of cryptocurrency returns using machine learning. Ann Oper Res 297(1–2):34

Alobaidi MH, Chebana F, Meguid MA (2018) Robust ensemble learning framework for day-ahead forecasting of household-based energy consumption. Appl Energy 212:997–1012

Altan A, Karasu S, Bekiros S (2019) Digital currency forecasting with chaotic meta-heuristic bio-inspired signal processing techniques. Chaos Solitons Fractals 126:325–336

Athey S, Imbens GW (2019) Machine learning methods that economists should know about. Annu Rev Econ 11:685–725

Baba B, Sevil G (2021) Bayesian analysis of time-varying interactions between stock returns and foreign equity flows. Financ Innov 7(1):51

Baesens B, Setiono R, Mues C, Vanthienen J (2003) Using Neural Network Rule Extraction and Decision Tables for Credit-Risk Evaluation. Manage Sci 49(3):312–329

Bajari P, Nekipelov D, Ryan SP, Yang MY (2015) Machine learning methods for demand estimation. Am Econ Rev 105(5):481–485

Bao W, Yue J, Rao YL (2017) A deep learning framework for financial time series using stacked autoencoders and long-short term memory. PLoS ONE 12(7):24

Bao W, Lianju N, Yue K (2019) Integration of unsupervised and supervised machine learning algorithms for credit risk assessment. Expert Syst Appl 128:301–315

Bao Y, Ke BIN, Li BIN, Yu YJ, Zhang JIE (2020) Detecting accounting fraud in publicly traded U.S. firms using a machine learning approach. J Acc Res 58(1):199–235

Bennett S, Cucuringu M, Reinert G (2022) Lead–lag detection and network clustering for multivariate time series with an application to the US equity market. Mach Learn 111(12):4497–4538

Bianchi D, Buchner M, Tamoni A (2021) Bond risk premiums with machine learning. Rev Financ Stud 34(2):1046–1089

Boughanmi K, Ansari A (2021) Dynamics of musical success: a machine learning approach for multimedia data fusion. J Mark Res 58(6):1034–1057

Breiman L (2001) Random forests. Mach Learn 45(1):5–32

Breiman L, Friedman J, Olshen R, Stone C (1984) Classification and regression trees. Chapman and Hall, Wadsworth

Cai Q, Filos-Ratsikas A, Tang P, Zhang Y (2018) Reinforcement mechanism design for e-commerce. In: Proceedings of the 2018 world wide web conference, pp 1339–1348

Canhoto AI (2021) Leveraging machine learning in the global fight against money laundering and terrorism financing: an affordances perspective. J Bus Res 131:441–452

Article   PubMed   Google Scholar  

Chen KL, Jiang JC, Zheng FD, Chen KJ (2018) A novel data-driven approach for residential electricity consumption prediction based on ensemble learning. Energy 150:49–60

Chao X, Kou G, Li T, Peng Y (2018) Jie Ke versus AlphaGo: a ranking approach using decision making method for large-scale data with incomplete information. Eur J Oper Res 265(1):239–247

Chen Z, Chen W, Shi Y (2020) Ensemble learning with label proportions for bankruptcy prediction. Expert Syst Appl 146:113155

Chen H, Fang X, Fang H (2022) Multi-task prediction method of business process based on BERT and transfer learning. Knowl Based Syst 254:109603

Chen MR, Dautais Y, Huang LG, Ge JD (2017) Data driven credit risk management process: a machine learning approach. Paper presented at the international conference on software and system process Paris, France

Chong E, Han C, Park FC (2017) Deep learning networks for stock market analysis and prediction: methodology, data representations, and case studies. Expert Syst Appl 83:187–205

Chowdhury R, Rahman MA, Rahman MS, Mahdy MRC (2020) An approach to predict and forecast the price of constituents and index of cryptocurrency using machine learning. Physica A 551:17

Chullamonthon P, Tangamchit P (2023) Ensemble of supervised and unsupervised deep neural networks for stock price manipulation detection. Expert Syst Appl 220:119698

Coble KH, Mishra AK, Ferrell S, Griffin T (2018) Big data in agriculture: a challenge for the future. Appl Econ Perspect Policy 40(1):79–96

Cortes C, Vapnik V (1995) Support-vector networks. Mach Learn 20(3):273–297

Cui G, Wong ML, Lui HK (2006) Machine learning for direct marketing response models: Bayesian networks with evolutionary programming. Manag Sci 52(4):597–612

Cui F, Hu HH, Xie Y (2021) An intelligent optimization method of e-commerce product marketing. Neural Comput Appl 33(9):4097–4110

Cuomo S, Gatta F, Giampaolo F, Iorio C, Piccialli F (2022) An unsupervised learning framework for marketneutral portfolio. Expert Syst Appl 192:116308

Da F, Kou G, Peng Y (2022) Deep learning based dual encoder retrieval model for citation recommendation. Technol Forecast Soc 177:121545

Dastile X, Celik T, Potsane M (2020) Statistical and machine learning models in credit scoring: A systematic literature survey. Appl Soft Comput 91:21

Derbentsev V, Datsenko N, Stepanenko O, Bezkorovainyi V (2019) Forecasting cryptocurrency prices time series using machine learning approach. In: SHS web of conferences, vol 65, p 02001

Ding YS (2018) A novel decompose-ensemble methodology with AIC-ANN approach for crude oil forecasting. Energy 154:328–336

Ding KX, Lev B, Peng X, Sun T, Vasarhelyi MA (2020) Machine learning improves accounting estimates: evidence from insurance payments. Rev Acc Stud 25(3):1098–1134

Dingli A, Fournier KS (2017) Financial time series forecasting - a deep learning approach. Int J Mach Learn Comput 7(5):118–122

Dingli A, Marmara V, Fournier NS (2017) Comparison of deep learning algorithms to predict customer churn within a local retail industry. Int J Mach Learn Comput 7(5):128–132

Dong YC, Li Y, He Y, Chen X (2021) Preference-approval structures in group decision making: axiomatic distance and aggregation. Decis Anal 18(4):273–295

Einav L, Levin J (2014) Economics in the age of big data. Science 346(6210):715-+

Fang Y, Chen J, Xue Z (2019) Research on quantitative investment strategies based on deep learning. Algorithms 12(2):35

Faris H, Abukhurma R, Almanaseer W, Saadeh M, Mora AM, Castillo PA, Aljarah I (2019) Improving financial bankruptcy prediction in a highly imbalanced class distribution using oversampling and ensemble learning: A case from the Spanish market. Prog Artif Intell 9:1–23

Ferreira KJ, Lee BHA, Simchi-Levi D (2016) Analytics for an online retailer: demand forecasting and price optimization. Manuf Serv Oper Manag 18(1):69–88

Fischer T, Krauss C (2018) Deep learning with long short-term memory networks for financial market predictions. Eur J Oper Res 270(2):654–669

Freitas AA (2014) Comprehensible classification models: a position paper. SIGKDD Explor Newsl 15(1):1–10

Friedman N, Geiger D, Goldszmidt M (1997) Bayesian network classifiers. Mach Learn 29(2–3):131–163

Garcia D, Tessone CJ, Mavrodiev P, Perony N (2014) The digital traces of bubbles: feedback cycles between socio-economic signals in the bitcoin economy. J R Soc Interface 11(99):20140623

Article   PubMed   PubMed Central   Google Scholar  

Ghoddusi H, Creamer GG, Rafizadeh N (2019) Machine learning in energy economics and finance: a review. Energy Econ 81:709–727

Go YH, Hong JK (2019) Prediction of stock value using pattern matching algorithm based on deep learning. Int J Recent Technol Eng 8:31–35

Gogas P, Papadimitriou T (2021) Machine learning in economics and finance. Comput Econ 57(1):1–4

Goncalves R, Ribeiro VM, Pereira FL, Rocha AP (2019) Deep learning in exchange markets. Inf Econ Policy 47:38–51

Greaves A, Au B (2015) Using the bitcoin transaction graph to predict the price of bitcoin. No Data

Grimmer J (2015) We are all social scientists now: how big data, machine learning, and causal inference work together. PS Polit Sci Polit 48(1):80–83

Gu SH, Kelly B, Xiu DC (2020) Empirical Asset Pricing via Machine Learning. Rev Financ Stud 33(5):2223–2273

Hoang D, Wiegratz K (2022) Machine learning methods in finance: Recent applications and prospects. Eur Financ Manag 29(5):1657–1701

Hoerl AE, Kennard RW (1970) Ridge regression—biased estimation for nonorthogonal problems. Technometrics 12(1):55–000

Huang LL, Wang J (2018) Global crude oil price prediction and synchronization-based accuracy evaluation using random wavelet neural network. Energy 151:875–888

Huang AH, Zang AY, Zheng R (2014) Evidence on the information content of text in analyst reports. Account Rev 89(6):2151–2180

Husmann S, Shivarova A, Steinert R (2022) Company classification using machine learning. Expert Syst Appl 195:116598

Jiang ZY, Liang JJ (2017) Cryptocurrency portfolio management with deep reinforcement learning. In: Paper presented at the intelligent systems conference, London, England

Johari SN, Farid FH, Nasrudin N, Bistamam NL, Shuhaili NS (2018) Predicting Stock Market Index Using Hybrid Intelligence Model. Int J Eng Technol 7:36

Jorgensen RK, Igel C (2021) Machine learning for financial transaction classification across companies using character-level word embeddings of text fields. Intell Syst Account Financ Manag 28(3):159–172

Kamilaris A, Prenafeta-Boldu FX (2018) Deep learning in agriculture: a survey. Comput Electron Agric 147:70–90

Khandani AE, Kim AJ, Lo AW (2010) Consumer credit-risk models via machine-learning algorithms. J Bank Financ 34(11):2767–2787

Khedr AM, Arif I, Raj PVP, El-Bannany M, Alhashmi SM, Sreedharan M (2021) Cryptocurrency price prediction using traditional statistical and machine-learning techniques: a survey. Intell Syst Account Financ Manag 28(1):3–34

Kim JJ, Cha SH, Cho KH, Ryu M (2018) Deep reinforcement learning based multi-agent collaborated network for distributed stock trading. Int J Grid Distrib Comput 11(2):11–20

Kou G, Chao XR, Peng Y, Alsaadi FE, Herrera-Viedma E (2019) Machine learning methods for systemic risk analysis in financial sectors. Technol Econ Dev Eco 25(5):716–742

Kou G, Xu Y, Peng Y, Shen F, Chen Y, Chang K, Kou S (2021) Bankruptcy prediction for SMEs using transactional data and two-stage multiobjective feature selection. Decis Support Syst 140:113429

Ladyzynski P, Zbikowski K, Gawrysiak P (2019) Direct marketing campaigns in retail banking with the use of deep learning and random forests. Expert Syst Appl 134:28–35

Lago J, De Ridder F, Vrancx P, De Schutter B (2018) Forecasting day-ahead electricity prices in Europe: the importance of considering market integration. Appl Energy 211:890–903

Lahmiri S, Bekiros S (2019) Cryptocurrency forecasting with deep learning chaotic neural networks. Chaos Solitons Fractals 118:35–40

Lebichot B, Paldino GM, Siblini W, Guelton LH, Oblé F, Bontempi G (2021) Incremental learning strategies for credit cards fraud detection. Int J Data Sci Anal 12:165–174

Lei ZZ (2020) Research and analysis of deep learning algorithms for investment decision support model in electronic commerce. Electron Commer Res 20(2):275–295

Lei K, Zhang B, Li Y, Yang M, Shen Y (2020) Time-driven feature-aware jointly deep reinforcement learning for financial signal representation and algorithmic trading. Expert Syst Appl 140:14

Li CC, Dong YC, Xu YJ, Chiclana F, Herrera-Viedma E, Herrera F (2019) An overview on managing additive consistency of reciprocal preference relations for consistency-driven decision making and Fusion: Taxonomy and future directions. Inf Fusion 52:143–156

Li CC, Dong YC, Liang H, Pedrycz W, Herrera F (2022a) Data-driven method to learning personalized individual semantics to support linguistic multi-attribute decision making. Omega 111:102642

Li CC, Dong YC, Pedrycz W, Herrera F (2022b) Integrating continual personalized individual semantics learning in consensus reaching in linguistic group decision making. IEEE Trans Syst Man Cybern Syst 52(3):1525–1536

Lima MSM, Eryarsoy E, Delen D (2021) Predicting and explaining pig iron production on charcoal blast furnaces: a machine learning approach. INFORMS J Appl Anal 51(3):213–235

Lin WY, Hu YH, Tsai CF (2012) Machine learning in financial crisis prediction: a survey. IEEE Trans Syst Man Cybern Syst C 42(4):421–436

Lin WC, Lu YH, Tsai CF (2019) Feature selection in single and ensemble learning-based bankruptcy prediction models. Expert Syst 36:e12335

Liu YT, Zhang HJ, Wu YZ, Dong YC (2019) Ranking range based approach to MADM under incomplete context and its application in venture investment evaluation. Technol Econ Dev Eco 25(5):877–899

Liu Y, Yang ML, Wang YD, Li YS, Xiong TC, Li AZ (2022) Applying machine learning algorithms to predict default probability in the online credit market: evidence from China. Int Rev Financ Anal 79:14

Long W, Lu ZC, Cui LX (2019) Deep learning-based feature engineering for stock price movement prediction. Knowl Based Syst 164:163–173

Ma XM, Lv SL (2019) Financial credit risk prediction in internet finance driven by machine learning. Neural Comput Appl 31(12):8359–8367

Machado MR, Karray S (2022) Applying hybrid machine learning algorithms to assess customer risk-adjusted revenue in the financial industry. Electron Commer Res Appl 56:101202

Mao ST, Chao XL (2021) Dynamic joint assortment and pricing optimization with demand learning. Manuf Serv Oper Manag 23(2):525–545

Melancon GG, Grangier P, Prescott-Gagnon E, Sabourin E, Rousseau LM (2021) A machine learning-based system for predicting service-level failures in supply chains. INFORMS J Appl Anal 51(3):200–212

Meng TL, Khushi M (2019) Reinforcement learning in financial markets. Data 4(3):110

Mnih V, Kavukcuoglu K, Silver D, Rusu AA, Veness J, Bellemare MG, Graves A, Riedmiller M, Fidjeland AK, Ostrovski G, Petersen S, Hassabis D (2015) Human-level control through deep reinforcement learning. Nature 518(7540):529–533

Article   ADS   CAS   PubMed   Google Scholar  

Moews B, Herrmann JM, Ibikunle G (2019) Lagged correlation-based deep learning for directional trend change prediction in financial time series. Expert Syst Appl 120:197–206

Moon KS, Kim H (2019) Performance of deep learning in prediction of stock market volatility. Econ Comput Econ Cybern Stud 53(2):77–92

ADS   Google Scholar  

Nanduri J, Jia YT, Oka A, Beaver J, Liu YW (2020) Microsoft uses machine learning and optimization to reduce e-commerce fraud. Informs J Appl Anal 50(1):64–79

Nazareth N, Ramana RYV (2023) Financial applications of machine learning: a literature review. Expert Syst Appl 219:119640

Nguyen TT, Nguyen ND, Nahavandi S (2020) Deep reinforcement learning for multiagent systems: a review of challenges, solutions, and applications. IEEE Trans Cybern 50(9):3826–3839

Nosratabadi S, Mosavi A, Duan P, Ghamisi P, Filip F, Band SS, Reuter U, Gama J, Gandomi AH (2020) Data science in economics: comprehensive review of advanced machine learning and deep learning methods. Mathematics 8(10):1799

Nti IK, Adekoya AF, Weyori BA (2020) A systematic review of fundamental and technical analysis of stock market predictions. Artif Intell Rev 53(4):3007–3057

Ozbayoglu AM, Gudelek MU, Sezer OB (2020) Deep learning for financial applications: a survey. Appl Soft Comput 93:106384

Padilla N, Ascarza E (2021) Overcoming the cold start problem of customer relationship management using a probabilistic machine learning approach. J Mark Res 58(5):981–1006

Pang H, Zhang WK (2021) Decision support model of e-commerce strategic planning enhanced by machine learning. Inf Syst E-Bus Manag 21(1):11

Paolanti M, Romeo L, Martini M, Mancini A, Frontoni E, Zingaretti P (2019) Robotic retail surveying by deep learning visual and textual data. Robot Auton Syst 118:179–188

Peng L, Liu S, Liu R, Wang L (2018) Effective long short-term memory with differential evolution algorithm for electricity price prediction. Energy 162:1301–1314

Perols J (2011) Financial statement fraud detection: An analysis of statistical and machine learning algorithms. Auditing J Pract Th 30:19–50

Perols JL, Bowen RM, Zimmermann C, Samba B (2017) Finding needles in a haystack: using data analytics to improve fraud prediction. Acc Rev 92(2):221–245

Pfeiffer J, Pfeiffer T, Meissner M, Weiss E (2020) Eye-tracking-based classification of information search behavior using machine learning: evidence from experiments in physical shops and virtual reality shopping environments. Inf Syst Res 31(3):675–691

Pitropakis N, Panaousis E, Giannetsos T, Anastasiadis E, Loukas G (2019) A taxonomy and survey of attacks against machine learning. Comput Sci Rev 34:100199

Pourhabibi T, Ong KL, Kam BH, Boo YL (2020) Fraud detection: a systematic literature review of graph-based anomaly detection approaches. Decis Support Syst 133:113303

Prusti D, Behera RK, Rath SK (2022) Hybridizing graph-based Gaussian mixture model with machine learning for classification of fraudulent transactions. Comput Intell 38(6):2134–2160

Rafieian O, Yoganarasimhan H (2021) Targeting and privacy in mobile advertising. Mark Sci 40(2):193–218

Raj MP, Swaminarayan PR, Saini JR, Parmar DK (2015) Applications of pattern recognition algorithms in agriculture: a review. Int J Adv Netw Appl 6(5):2495–2502

Sabeena J, Venkata SRP (2019) A modified deep learning enthused adversarial network model to predict financial fluctuations in stock market. Int J Eng Adv Technol 8:2996–3000

Saravanan V, Charanya SK (2018) E-Commerce Product Classification using Lexical Based Hybrid Feature Extraction and SVM. Int J Innov Technol Explor Eng 9(1):1885–1891

Schmidhuber J (2015) Deep learning in neural networks: An overview. Neural Networks 61:85–117

Simester D, Timoshenko A, Zoumpoulis SI (2020) Targeting prospective customers: robustness of machine-learning methods to typical data challenges. Manag Sci 66(6):2495–2522

Singh R, Srivastava S (2017) Stock prediction using deep learning. Multimed Tools Appl 76(18):18569–18584

Sirignano J, Cont R (2019) Universal features of price formation in financial markets: perspectives from deep learning. Quant Financ 19(9):1449–1459

Sohangir S, Wang DD, Pomeranets A, Khoshgoftaar TM (2018) Big data: deep learning for financial sentiment analysis. J Big Data 5(1):25

Song Y, Lee JW, Lee J (2019) A study on novel filtering and relationship between input-features and target-vectors in a deep learning model for stock price prediction. Appl Intell 49(3):897–911

Storm H, Baylis K, Heckelei T (2020) Machine learning in agricultural and applied economics. Eur Rev Agric Econ 47(3):849–892

Tamura K, Uenoyama K, Iitsuka S, Matsuo Y (2018) Model for evaluation of stock values by ensemble model using deep learning. Trans Jpn Soc Artif Intell 2018:33

Tashiro D, Matsushima H, Izumi K, Sakaji H (2019) Encoding of high-frequency order information and prediction of short-term stock price by deep learning. Quant Financ 19(9):1499–1506

Tibshirani R (1996) Regression shrinkage and selection via the Lasso. J R Stat Soc Ser B Stat Methodol 58(1):267–288

Timoshenko A, Hauser JR (2019) Identifying customer needs from user-generated content. Mark Sci 38(1):1–20

Trandafili E, Biba M (2013) A review of machine learning and data mining approaches for business applications in social networks. Int J E Bus Res (IJEBR) 9(1):36–53

Valencia F, Gomez-Espinosa A, Valdes-Aguirre B (2019) Price movement prediction of cryptocurrencies using sentiment analysis and machine learning. Entropy 21(6):12

Vapnik V (2013) The nature of statistical learning theory. Springer, Berlin

Vo NNY, He X, Liu S, Xu, G (2019) Deep learning for decision making and the optimization of socially responsible investments and portfolio. Decis Support Syst 124:113097. https://doi.org/10.1016/j.dss.2019.113097

Wang XY, Luo DK, Zhao X, Sun Z (2018b) Estimates of energy consumption in China using a self-adaptive multi-verse optimizer-based support vector machine with rolling cross-validation. Energy 152:539–548

Wang Y, Mo DY, Tseng MM (2018c) Mapping customer needs to design parameters in the front end of product design by applying deep learning. CIRP Ann 67(1):145–148

Wang B, Ning LJ, Kong Y (2019) Integration of unsupervised and supervised machine learning algorithms for credit risk assessment. Expert Syst Appl 128:301–315

Wang WY, Li WZ, Zhang N, Liu KC (2020) Portfolio formation with preselection using deep learning from long-term financial data. Expert Syst Appl 143:17

Wang C, Zhu H, Hu R, Li R, Jiang C (2023) LongArms: fraud prediction in online lending services using sparse knowledge graph. IEEE Trans Big Data 9(2):758–772

Wang Q, Li BB, Singh PV (2018) Copycats vs. original mobile apps: a machine learning copycat-detection method and empirical analysis. Inf Syst Res 29(2):273–291

Weng B, Lu L, Wang X, Megahed FM, Martinez W (2018) Predicting short-term stock prices using ensemble methods and online data sources. Expert Syst Appl 112:258–273

Wettschereck D, Aha DW, Mohri T (1997) A review and empirical evaluation of feature weighting methods for a class of lazy learning algorithms. Artif Intell Rev 11(1–5):273–314

Wu WB, Chen JQ, Yang ZB, Tindall ML (2021) A cross-sectional machine learning approach for hedge fund return prediction and selection. Manage Sci 67(7):4577–4601

Wu X, Kumar V, Ross Quinlan J, Ghosh J, Yang Q, Motoda H, McLachlan GJ, Ng A, Liu B, Yu PS, Zhou ZH, Steinbach M, Hand DJ, Steinberg D (2008) Top 10 algorithms in data mining. Knowl Inform Syst 14:1–37

Wu C, Yan M (2018) Session-aware Information Embedding for E-commerce Product Recommendation. In: Paper presented at the Proceedings of the 2017 ACM on Conference on Information and Knowledge Management, Singapore, Singapore

Xiao F, Ke J (2021) Pricing, management and decision-making of financial markets with artificial intelligence: introduction to the issue. Financ Innov 7(1):85

Article   MathSciNet   PubMed   PubMed Central   Google Scholar  

Xu YZ, Zhang JL, Hua Y, Wang LY (2019) Dynamic credit risk evaluation method for e-commerce sellers based on a hybrid artificial intelligence model. Sustainability 11:5521

Xu WJ, Chen X, Dong YC, Chiclana F (2021) Impact of decision rules and non-cooperative behaviors on minimum consensus cost in group decision making. Group Decis Negot 30(6):1239–1260

Yan HJ, Ouyang HB (2018) Financial time series prediction based on deep learning. Wirel Pers Commun 102(2):683–700

Yao LY, Chu ZX, Li S, Li YL, Gao J, Zhang AD (2021) A survey on causal inference. ACM Trans Knowl Discov Data 15(5):1–46

Yoganarasimhan H (2020) Search personalization using machine learning. Manag Sci 66(3):1045–1070

Young D, Poletti S, Browne O (2014) Can agent-based models forecast spot prices in electricity markets? Evidence from the New Zealand electricity market. Energy Econ 45:419–434

Zahavi JN, Levin I (1997) Applying neural computing to target marketing. J Direct Mark 11(4):76–93

Zha QB, Kou G, Zhang HJ, Liang HM, Chen X, Li CC, Dong YC (2020) Opinion dynamics in finance and business: a literature review and research opportunities. Financ Innov 6(1):44

Zha QB, Dong YC, Zhang HJ, Chiclana F, Herrera-Viedma E (2021) A personalized feedback mechanism based on bounded confidence learning to support consensus reaching in group decision making. IEEE Trans Syst Man Cybern Syst 51(6):3900–3910

Zhang QG, Benveniste A (1992) Wavelet networks. IEEE Trans Neural Netw 3(6):889–898

Article   CAS   PubMed   Google Scholar  

Zhang C, Li R, Shi H, Li FR (2020a) Deep learning for day-ahead electricity price forecasting. IET Smart Grid 3(4):462–469

Zhang YJ, Li BB, Krishnan R (2020b) Learning Individual behavior using sensor data: the case of global positioning system traces and taxi drivers. Inf Syst Res 31(4):1301–1321

Zhang B, Tan RH, Lin CJ (2021a) Forecasting of e-commerce transaction volume using a hybrid of extreme learning machine and improved moth-flame optimization algorithm. Appl Intell 51(2):952–965

Zhang HJ, Li CC, Liu YT, Dong YC (2021b) Modelling personalized individual semantics and consensus in comparative linguistic expression preference relations with self-confidence: An optimization-based approach. IEEE Trans Fuzzy Syst 29:627–640

Zhao L (2021) The function and impact of cryptocurrency and data technology in the context of financial technology: introduction to the issue. Financ Innov 7(1):84

Zhu XD, Ninh A, Zhao H, Liu ZM (2021) Demand forecasting with supply-chain information and machine learning: evidence in the pharmaceutical industry. Prod Oper Manag 30(9):3231–3252

Download references

Acknowledgements

This work was supported by the grant (No. 72271171) from the National Natural Science Foundation of China, the grant (No. sksy12021-02) from Sichuan University, the National Outstanding Youth Science Fund Project of the National Natural Science Foundation of China (71725001), and the Open Project of Xiangjiang Laboratory (No. 22XJ03028).

Author information

Authors and Affiliations

Business School, Sichuan University, Chengdu, 610065, China

Hanyao Gao, Haiming Liang, Xiangrui Chao & Yucheng Dong

School of Business Administration, Faculty of Business Administration, Southwestern University of Finance and Economics, Chengdu, 611130, China

Business School, Hohai University, Nanjing, 211100, China

Hengjie Zhang

Xiangjiang Laboratory, Changsha, 410205, China

Yucheng Dong

School of Economics and Management, Southwest Jiaotong University, Chengdu, 610031, China

Cong-Cong Li


Contributions

HG, GK and YD contributed to the conception and writing of this paper and to discussions of its organization. HL and HZ contributed to improving the text of the manuscript. HG and HL contributed to the methodology. XC and CL contributed to the literature collection. All authors read and approved the final manuscript.

Corresponding authors

Correspondence to Gang Kou or Yucheng Dong.

Ethics declarations

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .


About this article

Cite this article

Gao, H., Kou, G., Liang, H. et al. Machine learning in business and finance: a literature review and research opportunities. Financ Innov 10, 86 (2024). https://doi.org/10.1186/s40854-024-00629-z

Received: 10 June 2022

Accepted: 7 February 2024

Published: 19 September 2024

DOI: https://doi.org/10.1186/s40854-024-00629-z



Dynamic Impact of Digital Inclusive Finance and Financial Market Development on Forests and Timber in China: Economic and Social Perspective


1. Introduction
2. Literature Review
2.1. Digitalization of Finance and Forestry
2.2. Financial Development and Forestry
3. Evolution of China's Forestry Reforms on China's Forest Sector
4. Materials and Methods
4.1. Empirical Modeling
4.2. Empirical Methods
4.3. Cross-Dependence Testing
4.4. Panel Unit Root
4.5. Method of Moments Quantile Regression Model
5. Results and Discussion
5.1. Forest, Timber, and Digital Inclusive Finance
5.2. Forest, Timber, and Financial Development
6. Conclusions
Author Contributions
Data Availability Statement
Conflicts of Interest

Shanghai | Henan | Xinjiang | Inner Mongolia
Tianjin | Qinghai | Shaanxi
Beijing | Hebei | Guangdong
Ningxia | Liaoning | Jiangxi
Jiangsu | Zhejiang | Hunan
Hainan | Gansu | Guangxi
Shandong | Hubei | Tibet
Chongqing | Guizhou | Sichuan
Shanxi | Jilin | Heilongjiang
Anhui | Fujian | Yunnan
  • Rode, J.; Pinzon, A.; Stabile, M.C.; Pirker, J.; Bauch, S.; Iribarrem, A.; Wittmer, H. Why ‘blended finance’could help transitions to sustainable landscapes: Lessons from the Unlocking Forest Finance project. Ecosyst. Serv. 2019 , 37 , 100917. [ Google Scholar ] [ CrossRef ]
  • Zada, M.; Yukun, C.; Zada, S. Effect of financial management practices on the development of small-to-medium size forest enterprises: Insight from Pakistan. GeoJournal 2021 , 86 , 1073–1088. [ Google Scholar ] [ CrossRef ]
  • Asif, M.; Khan, K.B.; Anser, M.K.; Nassani, A.A.; Abro, M.M.Q.; Zaman, K. Dynamic interaction between financial development and natural resources: Evaluating the ‘Resource curse’ hypothesis. Resour. Policy 2020 , 65 , 101566. [ Google Scholar ] [ CrossRef ]
  • Ke, S.; Qiao, D.; Zhang, X.; Feng, Q. Changes of China’s forestry and forest products industry over the past 40 years and challenges lying ahead. For. Policy Econ. 2021 , 123 , 102352. [ Google Scholar ] [ CrossRef ]
  • Kan, S.; Chen, B.; Han, M.; Hayat, T.; Alsulami, H.; Chen, G. China’s forest land use change in the globalized world economy: Foreign trade and unequal household consumption. Land Use Policy 2021 , 103 , 105324. [ Google Scholar ] [ CrossRef ]
  • Jiang, Y.; Su, H. The Status, Trend, and Global Position of China’s Forestry Industry: An Anatomy Based on the Global Value Chain Paradigm. Forests 2023 , 14 , 2040. [ Google Scholar ] [ CrossRef ]
  • Zhu, Z.; Shen, Y.; Ning, K.; Zhu, Z.; Li, B.; Yaoqi, Z. Non-timber Forest Products for Poverty Allevation and Environment: The Development and Drive Forces of Hickory Production in China. In Non-Wood Forest Products of Asia: Knowledge, Conservation and Livelihood ; Springer International Publishing: Cham, Switzerland, 2022; pp. 253–266. [ Google Scholar ]
  • Nambiar, E.S. Tamm Review: Re-imagining forestry and wood business: Pathways to rural development, poverty alleviation and climate change. For. Ecol. Manag. 2019 , 448 , 160–173. [ Google Scholar ] [ CrossRef ]
  • Qiao, D.; Yuan, W.T.; Ke, S.F. China’s Natural Forest Protection Program: Evolution, impact and challenges. Int. For. Rev. 2021 , 23 , 338–350. [ Google Scholar ] [ CrossRef ]
  • Tang, Y.; Shao, Q.; Liu, J.; Zhang, H.; Yang, F.; Cao, W.; Gong, G. Did ecological restoration hit its mark? Monitoring and assessing ecological changes in the Grain for Green Program region using multi-source satellite images. Remote Sens. 2019 , 11 , 358. [ Google Scholar ] [ CrossRef ]
  • Zhang, K.; Artati, Y.; Putzel, L.; Xie, C.; Hogarth, N.J.; Wang, J.N.; Wang, J. China’s Conversion of Cropland to Forest Program as a national PES scheme: Institutional structure, voluntarism and conditionality of PES. Int. For. Rev. 2017 , 19 , 24–36. [ Google Scholar ] [ CrossRef ]
  • Xiao, Q.; Wang, Y.; Liao, H.; Han, G.; Liu, Y. The impact of digital inclusive finance on agricultural green total factor productivity: A study based on China’s provinces. Sustainability 2023 , 15 , 1192. [ Google Scholar ] [ CrossRef ]
  • Liu, D.; Li, Y.; You, J.; Balezentis, T.; Shen, Z. Digital inclusive finance and green total factor productivity growth in rural areas. J. Clean. Prod. 2023 , 418 , 138159. [ Google Scholar ] [ CrossRef ]
  • Beck, T.; Levine, R.; Loayza, N. Finance and the Sources of Growth. J. Financ. Econ. 2000 , 58 , 261–300. [ Google Scholar ] [ CrossRef ]
  • Herdianti, A.R. Business Ecosystems to Provide Incentives and Opportunities for Sustainable and Resilient Livelihoods in Forest Landscapes. Doctoral Dissertation, University of British Columbia, Kelowna, Canada, 2022. Available online: https://open.library.ubc.ca/media/stream/pdf/24/1.0412916/4 (accessed on 1 June 2024).
  • Yasmeen, R.; Yao, X.; Padda, I.U.H.; Shah, W.U.H.; Jie, W. Exploring the role of solar energy and foreign direct investment for clean environment: Evidence from top 10 solar energy consuming countries. Renew. Energy 2022 , 185 , 147–158. [ Google Scholar ] [ CrossRef ]
  • Pesaran, M.H. General diagnostic tests for cross section dependence in panels. Empir. Econ. 2021 , 60 , 13–50. [ Google Scholar ] [ CrossRef ]
  • Pesaran, M.H. Testing weak cross-sectional dependence in large panels. Econom. Rev. 2015 , 34 , 1089–1117. [ Google Scholar ] [ CrossRef ]
  • Hasanov, F.J.; Khan, Z.; Hussain, M.; Tufail, M. Theoretical framework for the carbon emissions effects of technological progress and renewable energy consumption. Sustain. Dev. 2021 , 29 , 810–822. [ Google Scholar ] [ CrossRef ]
  • Pesaran, M.H. A simple panel unit root test in the presence of cross-section dependence. J. Appl. Econom. 2007 , 22 , 265–312. [ Google Scholar ] [ CrossRef ]
  • Machado, J.A.F.; Santos Silva, J.M.C. Quantiles via moments. J. Econom. 2019 , 213 , 145–173. [ Google Scholar ] [ CrossRef ]
  • Binder, M.; Coad, A. From Average Joe’s happiness to Miserable Jane and Cheerful John: Using quantile regressions to analyze the full subjective well-being distribution. J. Econ. Behav. Organ. 2011 , 79 , 275–290. [ Google Scholar ] [ CrossRef ]
  • Koenker, R.; Bassett, G. Regression Quantiles. Econometrica 1978 , 46 , 33. [ Google Scholar ] [ CrossRef ]
  • Koenker, R. Quantile regression for longitudinal data. J. Multivar. Anal. 2004 , 91 , 74–89. [ Google Scholar ] [ CrossRef ]
  • Firpo, S.; Galvao, A.F.; Pinto, C.; Poirier, A.; Sanroman, G. GMM quantile regression. J. Econom. 2022 , 230 , 432–452. [ Google Scholar ] [ CrossRef ]
  • Pazarbasioglu, C.; Mora, A.G.; Uttamchandani, M.; Natarajan, H.; Feyen, E.; Saal, M. Digital financial services. World Bank 2020 , 54 , 1–54. [ Google Scholar ]
  • Wanyonyi, K.; Ngaba, D. Digital financial services and financial performance of savings and credit cooperative societies in Kakamega County, Kenya. Int. J. Curr. Asp. Financ. Bank. Account. 2021 , 3 , 9–20. [ Google Scholar ] [ CrossRef ]
  • Cadman, T.; Sarker, T.; Muttaqin, Z.; Nurfatriani, F.; Salminah, M.; Maraseni, T. The role of fiscal instruments in encouraging the private sector and smallholders to reduce emissions from deforestation and forest degradation: Evidence from Indonesia. For. Policy Econ. 2019 , 108 , 101913. [ Google Scholar ] [ CrossRef ]
  • Wang, G.; Innes, J.L.; Lei, J.; Dai, S.; Wu, S.W. China’s forestry reforms. Science 2007 , 318 , 1556–1557. [ Google Scholar ] [ CrossRef ]
  • Guariguata, M.R.; García-Fernández, C.; Nasi, R.; Sheil, D.; Herrero-Jáuregui, C.; Cronkleton, P.; Ingram, V. Timber and non-timber forest product extraction and management in the tropics: Towards compatibility? Non-Timber For. Prod. Glob. Context 2011 , 7 , 171–188. [ Google Scholar ]
  • Hou, J.; Yin, R.; Wu, W. Intensifying forest management in China: What does it mean, why, and how? For. Policy Econ. 2019 , 98 , 82–89. [ Google Scholar ] [ CrossRef ]
  • Ke, S.; Zhang, Z.; Wang, Y. China’s forest carbon sinks and mitigation potential from carbon sequestration trading perspective. Ecol. Indic. 2023 , 148 , 110054. [ Google Scholar ] [ CrossRef ]
  • Teketay, D.; Lemenih, M.; Bekele, T.; Yemshaw, Y.; Feleke, S.; Tadesse, W.; Moges, Y.; Hunde, T.; Nigussie, D. Forest resources and challenges of sustainable forest management and conservation in Ethiopia. In Degraded Forests in Eastern Africa ; Routledge: Oxfordshire, UK, 2010; pp. 19–63. [ Google Scholar ]
  • Marchi, E.; Chung, W.; Visser, R.; Abbas, D.; Nordfjell, T.; Mederski, P.S.; Brink, M.; Laschi, A. Sustainable Forest Operations (SFO): A new paradigm in a changing world and climate. Sci. Total Environ. 2018 , 634 , 1385–1397. [ Google Scholar ] [ CrossRef ]
  • Kocak, E.; Cavusoglu, M. Is there a conservation relationship between tourism, economic output, and forest areas? Conserv. Sci. Pract. 2024 , 6 , e13171. [ Google Scholar ] [ CrossRef ]
  • Chen, W.; Xu, D.; Liu, J. The forest resources input–output model: An application in China. Ecol. Indic. 2015 , 51 , 87–97. [ Google Scholar ] [ CrossRef ]
  • Zhang, D. China’s forest expansion in the last three plus decades: Why and how? For. Policy Econ. 2019 , 98 , 75–81. [ Google Scholar ] [ CrossRef ]
  • Zhu, A.L.; Weins, N.; Lu, J.; Harlan, T.; Qian, J.; Seleguim, F.B. China’s nature-based solutions in the Global South: Evidence from Asia, Africa, and Latin America. Glob. Environ. Chang. 2024 , 86 , 102842. [ Google Scholar ] [ CrossRef ]
  • Gbadebo, O.V. A Review of the Impact of Afforestation on the Socio-Economic Development of Nigeria. FUTY J. Environ. 2022 , 16 , 34–43. [ Google Scholar ]
  • Akomaning, Y.O.; Darkwah, S.A.; Živělová, I.; Hlaváčková, P. Achieving sustainable development goals in Ghana: The contribution of non-timber forest products towards economic development in the Eastern region. Land 2023 , 12 , 635. [ Google Scholar ] [ CrossRef ]
  • Häggström, C.; Lindroos, O. Human, technology, organization and environment–a human factors perspective on performance in forest harvesting. Int. J. For. Eng. 2016 , 27 , 67–78. [ Google Scholar ] [ CrossRef ]


[Table: Descriptive statistics (341 observations per variable): forestry output value (100 million); timber output (10,000 cubic meters); Peking University Digital Financial Inclusion Index of China (PKU-DFIIC), plus its breadth and depth sub-indices; two financial market development scores (financial marketization degree; credit fund distribution market); forest area (10,000 hectares); afforestation area (hectares); completed investment in forestry (10 thousand); and high-tech expenditures on scientific research activities. The mean, standard deviation, minimum, and maximum columns are garbled in the extraction and omitted here.]

[Table: Pesaran cross-sectional dependence (CD) tests: every CD statistic is significant at p = 0.000 (average joint T = 11.00), confirming cross-sectional dependence across provinces. Row variable labels were lost in extraction.]

[Table: CIPS panel unit-root tests, with and without trend: most variables are non-stationary in levels but stationary in first differences at conventional significance levels. Row variable labels were lost in extraction.]

[Table: Variance ratio tests: Forest_O 6.207 (p = 0.0000); Timber_O 6.8084 (p = 0.0000).]

[Tables: Method of Moments Quantile Regression (MMQR) estimates for the forest-output and timber-output models: location, scale, and 10th/50th/95th-quantile coefficients with standard errors in parentheses; N = 341 in every specification. Row variable labels were lost in extraction.]
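As a reading aid for the cross-sectional dependence tests reported above: the Pesaran CD statistic is the scaled sum of all pairwise residual correlations, and is approximately standard normal when the panel units are independent. A minimal numpy sketch, purely illustrative (the function name and the simulated panels are assumptions, not the authors' code or data):

```python
import numpy as np

def pesaran_cd(resid):
    """Pesaran CD statistic for a (T x N) panel of residuals.

    CD = sqrt(2T / (N(N-1))) * sum of pairwise correlations rho_ij (i < j);
    approximately N(0, 1) under cross-sectional independence.
    """
    T, N = resid.shape
    corr = np.corrcoef(resid, rowvar=False)   # N x N pairwise correlation matrix
    i, j = np.triu_indices(N, k=1)            # upper-triangle pairs i < j
    return np.sqrt(2.0 * T / (N * (N - 1))) * corr[i, j].sum()

# Illustrative panels: one independent, one driven by a common factor
rng = np.random.default_rng(1)
independent = rng.normal(size=(50, 10))
factor = rng.normal(size=(50, 1))
dependent = factor + 0.2 * rng.normal(size=(50, 10))

cd_indep = pesaran_cd(independent)   # near 0: no cross-sectional dependence
cd_dep = pesaran_cd(dependent)       # large, as in the highly significant tests above
```

A large CD statistic (as in every row of the table) is what motivates using estimators and unit-root tests that are robust to cross-sectional dependence, such as CIPS.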

Yasmeen, R.; Fu, G.H. Dynamic Impact of Digital Inclusive Finance and Financial Market Development on Forests and Timber in China: Economic and Social Perspective. Forests 2024, 15, 1655. https://doi.org/10.3390/f15091655



COMMENTS

  1. Market innovation: A literature review and new research directions

    A more careful analysis of the conceptualizations in Table 1 helps us discern several recurring themes that reflect the three central elements of market innovation. First, most conceptualizations employ a structural notion of market. For example, they refer to product-market structures (Darroch & Miles, 2011), exchange structures (Giesler, 2012), market norms and market representations ...

  2. Market innovation: A literature review and new research directions

    In this article, we reviewed the market innovation literature, identified six research clusters, discussed their interrelations, identified major shifts in the literature, and proposed new ...

  3. PDF Nber Working Paper Series Market Potential and Global Growth Over the

    the long run by: (1) developing a theoretically-derived measure of market potential appropriate for historical use rather than relying on "data-as-given" narratives as in Figure 1, allowing us to relate globalization and growth in a more disciplined way; (2) collecting a new data set on ...

  4. A Comprehensive Literature Review on Marketing Strategies ...

    A marketing strategy is a company's overarching plan for connecting with potential customers and persuading them to purchase its products or services. A marketing plan typically includes the value proposition of the company, key brand messages, data on target customer demographics, and other significant elements.

  5. A Study on Modernizing Marketing and Sales Potential: A Literature Review

    Abstract. A review of the literature on marketing and sales potential was made to determine organisational capabilities in the market. Organisations always try to develop a strong brand which sustains growth in the market. This requires a transformation in current strategies and policies. Leaders play a crucial role in planning strategies according ...

  6. Literature review as a research methodology: An overview and guidelines

    As mentioned previously, there are a number of existing guidelines for literature reviews. Depending on the methodology needed to achieve the purpose of the review, all types can be helpful and appropriate to reach a specific goal (for examples, please see Table 1). These approaches can be qualitative, quantitative, or have a mixed design depending on the phase of the review.

  7. Competitive pricing on online markets: a literature review

    Setting prices relative to competitors, i.e., competitive pricing, is a classical marketing problem which has been studied extensively before the emergence of e-commerce (Talluri and van Ryzin 2004; Vives 2001). Although literature on online pricing has been reviewed in the past (Ratchford 2009), interrelations between pricing and competition were rarely considered systematically (Li ...

  8. Innovative growth: the role of market power and negative selection

    1. Introduction. This study examines innovations and the performance of low- and high-market-share firms, and it incorporates a broad range of intangible capital (IC) as innovation inputs, as described in Crepon, Duguet, and Mairesse's (1998) (CDM) framework. The current literature contains a research gap resulting in the inability to fully explain why low-technology rather than ...

  9. Artificial intelligence in marketing: A systematic literature review

    By way of a systematic literature review (SLR), the article evaluates 57 qualifying publications in the context of AI-powered marketing and qualitatively and quantitatively ranks them based on their coverage, impact, relevance, and contributed guidance, and elucidates the findings across various sectors, research contexts, and scenarios.

  10. Marketing Literature Review

    Marketing Literature Review MYRON LEONARD, Editor Western Carolina University This section is based on a selection of article abstracts from a comprehensive business literature database. Marketing-related abstracts from over 125 journals (both academic and trade) are reviewed by JM staff. Descriptors for each entry are assigned by JM staff.

  11. (PDF) A Literature Review on Digital Marketing: The Evolution of a

    as 'the Digital Transformation' of marketing, widely accepted and investigated by both practitioners and academics. Digital advertisements, e-commerce, mobile services, just to name a few ...

  12. Marketing of Management: Share Market, Potential Market, Competitor

    Abstract: Literature review research on marketing management: market share, potential market, competitor business, and promotion is a scientific literature article in the scope of marketing management science. The purpose of this literature research is to build a hypothesis

  13. Capacity Market Mechanism Analyses: a Literature Review

    Purpose of Review This paper focuses on providing an overview of research into different capacity market mechanisms. Beginning with the idea of the energy-only market and the resulting potential concerns of the missing money problem, this survey overviews a variety of studies of different capacity mechanisms, considering issues such as market power, risk attitudes, uncertainty, and pricing ...

  14. A systematic literature review: digital marketing and its impact on

    A systematic literature review has been conducted on digital marketing, and its implementation in SMEs. The impact of digital marketing on SMEs performance is observed over the past 12 years through the resources which are undertaken for the study, namely, Science Direct, Scopus, Springer, IEEE Explorer, ACM Digital Library, Engineering Village ...

  15. (PDF) A Literature Review on Digital Marketing Strategies and Its

    A Literature Review on Digital Marketing Strategies and Its Impact on Online Business Sellers During the COVID-19 Crisis ... Because of the potential market share gains that social media marketing ...

  16. Chapter Two

    Without it, quantitative market ... This literature review consists of four sections: • The first section provides a brief history of survey sampling and the theoretical basis for market research analysis, providing context for what became the standard procedures and expectations of market research. ... One potential solution is the use of ...

  17. Review of Marketing Research

    Review of Marketing Research All Books; Recent Chapters; All books in this series (21 titles) The Vulnerable Consumer, Volume 21. Artificial Intelligence in Marketing, Volume 20. Measurement in Marketing, Volume 19. Marketing Accountability for Marketing and Non-marketing Outcomes, Volume 18 ...

  18. The untapped potential of B2B advertising: A literature review and

    1. Introduction. In recent years, business-to-business (B2B) marketers have adopted and experimented with a wide range of advertising and marketing communications tools in an effort to meet the challenges of ambitious growth objectives and the mounting pressure to reach performance goals (Lilien, 2016). This shift is due to a recognition that personal selling strategies are important (Hutt ...

  19. Assessing and enhancing the impact potential of marketing articles

    Although the impact of marketing is a recognized priority, current academic practices do not fully support this goal. A research manuscript's likely influence is difficult to evaluate prior to publication, and audiences differ in their understandings of what "impact" means. This article develops a set of criteria for assessing and enhancing a publication's impact potential. An article ...

  20. Machine learning in business and finance: a literature review and

    This study provides a comprehensive review of machine learning (ML) applications in the fields of business and finance. First, it introduces the most commonly used ML techniques and explores their diverse applications in marketing, stock analysis, demand forecasting, and energy marketing. In particular, this review critically analyzes over 100 articles and reveals a strong inclination toward ...

  21. Potential of Big Data for Marketing: A Literature Review

    of BD allow businesses to make well-informed decisions, improve supply chain operations, and strengthen brand attachment. Although the academic literature on BD in marketing is continuously ...

  22. Dynamic Impact of Digital Inclusive Finance and Financial Market ...

    This study investigates how digital inclusive finance, financial development, and technology influenced forest and timber outputs across 31 provinces in China from 2011 to 2021. The findings, derived from panel quantile regression analysis, indicate that digital inclusive finance significantly enhances forest economic output, particularly in regions with lower economic activity, by improving ...
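The panel quantile regression mentioned in this abstract estimates how a covariate's effect differs across points of the outcome distribution (e.g., low-output vs. high-output provinces), rather than only at the mean. A minimal, self-contained sketch of linear quantile regression, minimizing the pinball loss by subgradient descent; this is an illustration of the general technique only, not the study's MMQR estimator, and the function name and simulated data are hypothetical:

```python
import numpy as np

def fit_quantile(x, y, tau, iters=30000):
    """Linear quantile regression: minimize the pinball loss by subgradient descent.

    Returns [intercept, slope] for the tau-th conditional quantile of y given x.
    """
    X = np.column_stack([np.ones_like(x), x])
    beta = np.zeros(2)
    best, best_loss = beta.copy(), np.inf
    for t in range(iters):
        u = y - X @ beta
        loss = np.mean(np.where(u >= 0, tau * u, (tau - 1) * u))  # pinball loss
        if loss < best_loss:
            best, best_loss = beta.copy(), loss                   # keep best iterate
        # pinball subgradient: weight tau on positive residuals, tau-1 on negative
        g = -X.T @ np.where(u >= 0, tau, tau - 1) / len(y)
        beta = beta - (0.5 / np.sqrt(t + 1.0)) * g                # diminishing step
    return best

# Hypothetical data: true median relationship y = 2.0 + 1.5 * x plus noise
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 4.0, 800)
y = 2.0 + 1.5 * x + rng.normal(0.0, 1.0, 800)

b10 = fit_quantile(x, y, 0.10)   # 10th-quantile line (lower intercept)
b50 = fit_quantile(x, y, 0.50)   # median regression line
b90 = fit_quantile(x, y, 0.90)   # 90th-quantile line (higher intercept)
```

Reading the fitted lines at several quantiles side by side is what lets a study like this one report that a regressor matters more in low-output regions (10th quantile) than in high-output ones (95th quantile).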