Research methodology vs. research methods
The research methodology, or design, is the overall strategy and rationale you use to carry out the research, whereas research methods are the specific tools and processes you use to gather and understand the data you need to test your hypothesis.
To further understand research methodology, let’s explore some examples of research methodology:
a. Qualitative research methodology example: A study exploring the impact of author branding on author popularity might utilize in-depth interviews to gather personal experiences and perspectives.
b. Quantitative research methodology example: A research project investigating the effects of a book promotion technique on book sales could employ a statistical analysis of profit margins and sales before and after the implementation of the method.
c. Mixed-Methods research methodology example: A study examining the relationship between social media use and academic performance might combine both qualitative and quantitative approaches. It could include surveys to quantitatively assess the frequency of social media usage and its correlation with grades, alongside focus groups or interviews to qualitatively explore students’ perceptions and experiences regarding how social media affects their study habits and academic engagement.
These examples highlight the meaning of methodology in research and how it guides the research process, from data collection to analysis, ensuring the study’s objectives are met efficiently.
When it comes to writing your study, the methodology in research papers or a dissertation plays a pivotal role. A well-crafted methodology section of a research paper or thesis not only enhances the credibility of your research but also provides a roadmap for others to replicate or build upon your work.
Wondering how to write the research methodology section? Follow these steps to create a strong methods chapter:
At the start of a research paper, you would have provided the background of your research and stated your hypothesis or research problem. In this section, you will elaborate on your research strategy.
Begin by restating your research question and proceed to explain what type of research you opted for to test it. Depending on your research, here are some questions you can consider:
a. Did you use qualitative or quantitative data to test the hypothesis?
b. Did you perform an experiment where you collected data or are you writing a dissertation that is descriptive/theoretical without data collection?
c. Did you use primary data that you collected or analyze secondary research data or existing data as part of your study?
These questions will help you establish the rationale for your study on a broader level, which you will follow by elaborating on the specific methods you used to collect and understand your data.
Now that you have told your reader what type of research you’ve undertaken for the dissertation, it’s time to dig into specifics. State what specific methods you used and explain the conditions and variables involved. Explain what the theoretical framework behind the method was, what samples you used for testing it, and what tools and materials you used to collect the data.
Once you have explained the data collection process, explain how you analyzed and studied the data. Here, your focus is simply to explain the methods of analysis rather than the results of the study.
Here are some questions you can answer at this stage:
a. What tools or software did you use to analyze your results?
b. What parameters or variables did you consider while understanding and studying the data you’ve collected?
c. Was your analysis based on a theoretical framework?
Your mode of analysis will change depending on whether you used a quantitative or qualitative research methodology in your study. If you’re working within the hard sciences or physical sciences, you are likely to use a quantitative research methodology (relying on numbers and hard data). If you’re doing a qualitative study in the social sciences or humanities, your analysis may rely on understanding language and socio-political contexts around your topic. This is why it’s important to establish what kind of study you’re undertaking at the outset.
Now that you have gone through your research process in detail, you’ll also have to make a case for it. Justify your choice of methodology and methods, explaining why it is the best choice for your research question. This is especially important if you have chosen an unconventional approach or you’ve simply chosen to study an existing research problem from a different perspective. Compare it with other methodologies, especially ones attempted by previous researchers, and discuss what contributions using your methodology makes.
No matter how thorough a methodology is, it doesn’t come without its hurdles. This is a natural part of scientific research that is important to document so that your peers and future researchers are aware of it. Writing in a research paper about this aspect of your research process also tells your evaluator that you have actively worked to overcome the pitfalls that came your way and you have refined the research process.
1. Remember who you are writing for. Keeping sight of the reader/evaluator will help you know what to elaborate on and what information they are already likely to have. You’re condensing months’ worth of research into just a few pages, so you should omit basic definitions and information about general phenomena people already know.
2. Do not give an overly elaborate explanation of every single condition in your study.
3. Skip details and findings irrelevant to the results.
4. Cite references that back your claim and choice of methodology.
5. Consistently emphasize the relationship between your research question and the methodology you adopted to study it.
To sum it up, what is methodology in research? It’s the blueprint of your research, essential for ensuring that your study is systematic, rigorous, and credible. Whether your focus is on qualitative research methodology, quantitative research methodology, or a combination of both, understanding and clearly defining your methodology is key to the success of your research.
Once you write the research methodology and complete writing the entire research paper, the next step is to edit your paper.
Detailed Walkthrough + Free Methodology Chapter Template
If you’re working on a dissertation or thesis and are looking for an example of a research methodology chapter, you’ve come to the right place.
In this video, we walk you through a research methodology from a dissertation that earned full distinction, step by step. We start off by discussing the core components of a research methodology by unpacking our free methodology chapter template. We then progress to the sample research methodology to show how these concepts are applied in an actual dissertation, thesis or research project.
Research methodology example: Frequently asked questions
Is the sample research methodology real?
Yes. The chapter example is an extract from a Master’s-level dissertation for an MBA program. A few minor edits have been made to protect the privacy of the sponsoring organisation, but these have no material impact on the research methodology.
As we discuss in the video, every research methodology will be different, depending on the research aims, objectives and research questions. Therefore, you’ll need to tailor your research methodology to suit your specific context.
You can learn more about the basics of writing a research methodology chapter here.
The best place to find more examples of methodology chapters would be within dissertation/thesis databases. These databases include dissertations, theses and research projects that have successfully passed the assessment criteria for the respective university, meaning that you have at least some sort of quality assurance.
The Open Access Thesis Database (OATD) is a good starting point.
You can access our free methodology chapter template here.
Yes. There is no cost for the template and you are free to use it as you wish.
This page is brought to you by the OWL at Purdue University. When printing this page, you must include the entire legal notice.
Copyright ©1995-2018 by The Writing Lab & The OWL at Purdue and Purdue University. All rights reserved. This material may not be published, reproduced, broadcast, rewritten, or redistributed without permission. Use of this site constitutes acceptance of our terms and conditions of fair use.
Note: This page reflects the latest version of the APA Publication Manual (i.e., APA 7), which was released in October 2019. The equivalent resource for the older APA 6 style can be found here.
Note: The APA Publication Manual, 7th Edition specifies different formatting conventions for student and professional papers (i.e., papers written for credit in a course and papers intended for scholarly publication). These differences mostly extend to the title page and running head. Crucially, citation practices do not differ between the two styles of paper.
However, for your convenience, we have provided two versions of our APA 7 sample paper below: one in student style and one in professional style.
Note: For accessibility purposes, we have used "Track Changes" to make comments along the margins of these samples. Those authored by [AF] denote explanations of formatting and [AWC] denote directions for writing and citing in APA 7.
APA 7 professional paper:
This paper should be used only as an example of a research paper write-up. Horizontal rules signify the top and bottom edges of pages. For sample references which are not included with this paper, you should consult the Publication Manual of the American Psychological Association, 4th Edition.
This paper is provided only to give you an idea of what a research paper might look like. You are not allowed to copy any of the text of this paper in writing your own report.
Because word processor copies of papers don’t translate well into web pages, you should note that an actual paper should be formatted according to the formatting rules for your context. Note especially three ways in which this sample paper departs from the formatting you should follow. First, except for the title page, the running header should appear in the upper right corner of every page with the page number below it. Second, paragraphs and text should be double spaced and the start of each paragraph should be indented. Third, the horizontal lines in this sample only indicate a mandatory page break and should not be used in your paper.
Running Head: SUPPORTED EMPLOYMENT
This paper describes the psychosocial effects of a program of supported employment (SE) for persons with severe mental illness. The SE program involves extended individualized supported employment for clients through a Mobile Job Support Worker (MJSW) who maintains contact with the client after job placement and supports the client in a variety of ways. A 50% simple random sample was taken of all persons who entered the Thresholds Agency between 3/1/93 and 2/28/95 and who met study criteria. The resulting 484 cases were randomly assigned to either the SE condition (treatment group) or the usual protocol (control group) which consisted of life skills training and employment in an in-house sheltered workshop setting. All participants were measured at intake and at 3 months after beginning employment, on two measures of psychological functioning (the BPRS and GAS) and two measures of self esteem (RSE and ESE). Significant treatment effects were found on all four measures, but they were in the opposite direction from what was hypothesized. Instead of functioning better and having more self esteem, persons in SE had lower functioning levels and lower self esteem. The most likely explanation is that people who work in low-paying service jobs in real world settings generally do not like them and experience significant job stress, whether they have severe mental illness or not. The implications for theory in psychosocial rehabilitation are considered.
Over the past quarter century a shift has occurred from traditional institution-based models of care for persons with severe mental illness (SMI) to more individualized community-based treatments. Along with this, there has been a significant shift in thought about the potential for persons with SMI to be “rehabilitated” toward lifestyles that more closely approximate those of persons without such illness. A central issue is the ability of a person to hold a regular full-time job for a sustained period of time. There have been several attempts to develop novel and radical models for program interventions designed to assist persons with SMI to sustain full-time employment while living in the community. The most promising of these have emerged from the tradition of psychiatric rehabilitation with its emphases on individual consumer goal setting, skills training, job preparation and employment support (Cook, Jonikas and Solomon, 1992). These are relatively new and field evaluations are rare or have only recently been initiated (Cook and Razzano, 1992; Cook, 1992). Most of the early attempts to evaluate such programs have naturally focused almost exclusively on employment outcomes. However, theory suggests that sustained employment and living in the community may have important therapeutic benefits in addition to the obvious economic ones. To date, there have been no formal studies of the effects of psychiatric rehabilitation programs on key illness-related outcomes. To address this issue, this study seeks to examine the effects of a new program of supported employment on psychosocial outcomes for persons with SMI.
Over the past several decades, the theory of vocational rehabilitation has experienced two major stages of evolution. Original models of vocational rehabilitation were based on the idea of sheltered workshop employment. Clients were paid a piece rate and worked only with other individuals who were disabled. Sheltered workshops tended to be “end points” for persons with severe and profound mental retardation since few ever moved from sheltered to competitive employment (Woest, Klein & Atkins, 1986). Controlled studies of sheltered workshop performance of persons with mental illness suggested only minimal success (Griffiths, 1974) and other research indicated that persons with mental illness earned lower wages, presented more behavior problems, and showed poorer workshop attendance than workers with other disabilities (Whitehead, 1977; Ciardiello, 1981).
In the 1980s, a new model of services called Supported Employment (SE) was proposed as less expensive and more normalizing for persons undergoing rehabilitation (Wehman, 1985). The SE model emphasizes first locating a job in an integrated setting for minimum wage or above, and then placing the person on the job and providing the training and support services needed to remain employed (Wehman, 1985). Services such as individualized job development, one-on-one job coaching, advocacy with co-workers and employers, and “fading” support were found to be effective in maintaining employment for individuals with severe and profound mental retardation (Revell, Wehman & Arnold, 1984). The idea that this model could be generalized to persons with all types of severe disabilities, including severe mental illness, became commonly accepted (Chadsey-Rusch & Rusch, 1986).
One of the more notable SE programs was developed at Thresholds, the site for the present study, which created a new staff position called the mobile job support worker (MJSW) and removed the common six month time limit for many placements. MJSWs provide ongoing, mobile support and intervention at or near the work site, even for jobs with high degrees of independence (Cook & Hoffschmidt, 1993). Time limits for many placements were removed so that clients could stay on as permanent employees if they and their employers wished. The suspension of time limits on job placements, along with MJSW support, became the basis of SE services delivered at Thresholds.
There are two key psychosocial outcome constructs of interest in this study. The first is the overall psychological functioning of the person with SMI. This would include the specification of severity of cognitive and affective symptomatology as well as the overall level of psychological functioning. The second is the level of self-reported self esteem of the person. This was measured both generally and with specific reference to employment.
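The key hypothesis of this study is:
H0: A program of supported employment will result in either no change or negative effects on psychological functioning and self esteem.
which will be tested against the alternative:
HA: A program of supported employment will lead to positive effects on psychological functioning and self esteem.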
The population of interest for this study is all adults with SMI residing in the U.S. in the early 1990s. The population that is accessible to this study consists of all persons who were clients of the Thresholds Agency in Chicago, Illinois between the dates of March 1, 1993 and February 28, 1995 who met the following criteria: 1) a history of severe mental illness (e.g. either schizophrenia, severe depression or manic-depression); 2) a willingness to achieve paid employment; 3) their primary diagnosis must not include chronic alcoholism or hard drug use; and 4) they must be 18 years of age or older. The sampling frame was obtained from records of the agency. Because of the large number of clients who pass through the agency each year (e.g. approximately 500 who meet the criteria) a simple random sample of 50% was chosen for inclusion in the study. This resulted in a sample size of 484 persons over the two-year course of the study.
On average, study participants were 30 years old and high school graduates (average education level = 13 years). The majority of participants (70%) were male. Most had never married (85%), few (2%) were currently married, and the remainder had been formerly married (13%). Just over half (51%) were African American, with the remainder Caucasian (43%) or other minority groups (6%). In terms of illness history, the members in the sample averaged 4 prior psychiatric hospitalizations and spent a lifetime average of 9 months as patients in psychiatric hospitals. The primary diagnoses were schizophrenia (42%) and severe chronic depression (37%). Participants had spent an average of almost two and one-half years (29 months) at the longest job they ever held.
While the study sample cannot be considered representative of the original population of interest, generalizability was not a primary goal – the major purpose of this study was to determine whether a specific SE program could work in an accessible context. Any effects of SE evident in this study can be generalized to urban psychiatric agencies that are similar to Thresholds, have a similar clientele, and implement a similar program.
All but one of the measures used in this study are well-known instruments in the research literature on psychosocial functioning. All of the instruments were administered as part of a structured interview that an evaluation social worker had with study participants at regular intervals.
Two measures of psychological functioning were used. The Brief Psychiatric Rating Scale (BPRS) (Overall and Gorham, 1962) is an 18-item scale that measures perceived severity of symptoms ranging from “somatic concern” and “anxiety” to “depressive mood” and “disorientation.” Ratings are given on a 0-to-6 Likert-type response scale where 0=“not present” and 6=“extremely severe” and the scale score is simply the sum of the 18 items. The Global Assessment Scale (GAS) (Endicott et al., 1976) is a single 1-to-100 rating on a scale where each ten-point increment has a detailed description of functioning (higher scores indicate better functioning). For instance, one would give a rating between 91-100 if the person showed “no symptoms, superior functioning…” and a value between 1-10 if the person “needs constant supervision…”
Two measures of self esteem were used. The first is the Rosenberg Self Esteem (RSE) Scale (Rosenberg, 1965), a 10-item scale rated on a 6-point response format where 1=“strongly disagree” and 6=“strongly agree” and there is no neutral point. The total score is simply the sum across the ten items, with five of the items being reversals. The second measure was developed explicitly for this study and was designed to measure the Employment Self Esteem (ESE) of a person with SMI. This is a 10-item scale that uses a 4-point response format where 1=“strongly disagree” and 4=“strongly agree” and there is no neutral point. The final ten items were selected from a pool of 97 original candidate items, based upon high item-total score correlations and a judgment of face validity by a panel of three psychologists. This instrument was deliberately kept simple – a shorter response scale and no reversal items – because of the difficulties associated with measuring a population with SMI. The entire instrument is provided in Appendix A.
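The summed scoring described for the RSE and ESE can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the paper does not say which five RSE items are reversals, so the positions in RSE_REVERSED are hypothetical, the function name is invented for this example, and the responses are made up.

```python
def score_likert_scale(responses, reversed_items, low, high):
    """Sum item responses after reverse-coding the items in reversed_items.

    responses      -- one respondent's answers, in item order
    reversed_items -- set of zero-based item positions that are reversals
    low, high      -- endpoints of the response format (e.g. 1 and 6)
    """
    total = 0
    for index, value in enumerate(responses):
        if not low <= value <= high:
            raise ValueError(f"item {index} has out-of-range response {value}")
        # Reverse-coding mirrors a response around the middle of the scale,
        # so on a 1-to-6 format a 1 becomes a 6, a 2 becomes a 5, and so on.
        total += (low + high - value) if index in reversed_items else value
    return total


# Hypothetical positions: the source only says five of the ten items are reversals.
RSE_REVERSED = {1, 3, 5, 7, 9}

rse_responses = [6, 1, 5, 2, 6, 1, 5, 2, 6, 1]  # made-up answers, 1-to-6 format
rse_total = score_likert_scale(rse_responses, RSE_REVERSED, low=1, high=6)

ese_responses = [3, 4, 2, 3, 4, 4, 3, 2, 3, 4]  # made-up answers, 1-to-4 format
ese_total = score_likert_scale(ese_responses, set(), low=1, high=4)  # no reversals

print(rse_total, ese_total)
```

With ten items, RSE totals scored this way fall between 10 and 60 and ESE totals between 10 and 40.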
All four of the measures evidenced strong reliability and validity. Internal consistency reliability estimates using Cronbach’s alpha ranged from .76 for ESE to .88 for RSE. Test-retest reliabilities were nearly as high, ranging from .72 for ESE to .83 for the BPRS. Convergent validity was evidenced by the correlations within construct. For the two psychological functioning scales the correlation was .68 while for the self esteem measures it was somewhat lower at .57. Discriminant validity was examined by looking at the cross-construct correlations which ranged from .18 (BPRS-ESE) to .41 (GAS-RSE).
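For readers unfamiliar with Cronbach’s alpha, the following minimal sketch shows how the statistic is commonly computed from item-level responses. The paper does not describe its software or data layout, so the function name, the input format and the sample responses are all assumptions made for illustration.

```python
from statistics import pvariance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    alpha = k / (k - 1) * (1 - sum of item variances / variance of total scores)
    """
    k = len(item_scores[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Transpose so that each tuple holds every respondent's score on one item.
    items = list(zip(*item_scores))
    item_variance_sum = sum(pvariance(item) for item in items)
    total_variance = pvariance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)


# Made-up responses from five people on a three-item scale (not study data).
responses = [
    [2, 3, 3],
    [4, 4, 5],
    [1, 2, 2],
    [5, 5, 4],
    [3, 3, 4],
]
alpha = cronbach_alpha(responses)
print(round(alpha, 2))
```

Population variances are used throughout; sample variances would give the same alpha, because the same correction factor appears in the numerator and the denominator of the ratio.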
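A pretest-posttest two-group randomized experimental design was used in this study. In notational form, the design can be depicted as:
R O X O
R O O
where R indicates that the groups were randomly assigned, O denotes the four measures (BPRS, GAS, RSE and ESE), and X denotes supported employment.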
The comparison group received the standard Thresholds protocol which emphasized in-house training in life skills and employment in an in-house sheltered workshop. All participants were measured at intake (pretest) and at three months after intake (posttest).
This type of randomized experimental design is generally strong in internal validity. It rules out threats of history, maturation, testing, instrumentation, mortality and selection interactions. Its primary weaknesses are in the potential for treatment-related mortality (i.e. a type of selection-mortality) and for problems that result from the reactions of participants and administrators to knowledge of the varying experimental conditions. In this study, the drop-out rate was 4% (N=9) for the control group and 5% (N=13) in the treatment group. Because these rates are low and are approximately equal in each group, it is not plausible that there is differential mortality. There is a possibility that there were some deleterious effects due to participant knowledge of the other group’s existence (e.g. compensatory rivalry, resentful demoralization). Staff were debriefed at several points throughout the study and were explicitly asked about such issues. There were no reports of any apparent negative feelings from the participants in this regard. Nor is it plausible that staff might have equalized conditions between the two groups. Staff were given extensive training and were monitored throughout the course of the study. Overall, this study can be considered strong with respect to internal validity.
Between 3/1/93 and 2/28/95 each person admitted to Thresholds who met the study inclusion criteria was immediately assigned a random number that gave them a 50/50 chance of being selected into the study sample. For those selected, the purpose of the study was explained, including the nature of the two treatments, and the need for and use of random assignment. Participants were assured confidentiality and were given an opportunity to decline to participate in the study. Only 7 people (out of 491) refused to participate. At intake, each selected sample member was assigned a random number giving them a 50/50 chance of being assigned to either the Supported Employment condition or the standard in-agency sheltered workshop. In addition, all study participants were given the four measures at intake.
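The two-stage randomization described here (a 50/50 chance of selection into the sample, then a 50/50 chance of assignment to a condition) could be sketched as follows. This is an illustration, not the study’s actual procedure: the function name, the client identifiers and the seed are hypothetical, and the consent step is omitted.

```python
import random


def select_and_assign(client_ids, seed=None):
    """Give each eligible client a 50/50 chance of selection, then of condition."""
    rng = random.Random(seed)  # a seed makes the illustration repeatable
    assignments = {}
    for client_id in client_ids:
        # Stage 1: simple random selection into the study sample.
        if rng.random() >= 0.5:
            continue
        # Stage 2: random assignment of each selected client to a condition.
        if rng.random() < 0.5:
            assignments[client_id] = "supported_employment"
        else:
            assignments[client_id] = "sheltered_workshop"
    return assignments


# Hypothetical client identifiers, for illustration only.
assignments = select_and_assign(range(1, 21), seed=42)
for client_id, condition in sorted(assignments.items()):
    print(client_id, condition)
```

Independent coin flips like these do not guarantee equal group sizes. The paper reports exactly 242 participants per condition, so the procedure actually used may have included a balancing step that is not described.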
All participants spent the initial two weeks in the program in training and orientation. This consisted of life skill training (e.g. handling money, getting around, cooking and nutrition) and job preparation (employee roles, coping strategies). At the end of that period, each participant was assigned to a job site – at the agency sheltered workshop for those in the control condition, and to an outside employer if in the Supported Employment group. Control participants were expected to work full-time at the sheltered workshop for a three-month period, at which point they were posttested and given an opportunity to obtain outside employment (either Supported Employment or not). The Supported Employment participants were each assigned a case worker – called a Mobile Job Support Worker (MJSW) – who met with the person at the job site two times per week for an hour each time. The MJSW could provide any support or assistance deemed necessary to help the person cope with job stress, including counseling or working beside the person for short periods of time. In addition, the MJSW was always accessible by cellular telephone, and could be called by the participant or the employer at any time. At the end of three months, each participant was post-tested and given the option of staying with their current job (with or without Supported Employment) or moving to the sheltered workshop.
There were 484 participants in the final sample for this study, 242 in each treatment. There were 9 drop-outs from the control group and 13 from the treatment group, leaving a total of 233 and 229 in each group respectively from whom both pretest and posttest were obtained. Due to unexpected difficulties in coping with job stress, 19 Supported Employment participants had to be transferred into the sheltered workshop prior to the posttest. In all 19 cases, no one was transferred prior to week 6 of employment, and 15 were transferred after week 8. In all analyses, these cases were included with the Supported Employment group (intent-to-treat analysis) yielding treatment effect estimates that are likely to be conservative.
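The attrition figures can be checked with simple arithmetic. The short sketch below uses only the counts reported in the text; the variable names are chosen for this example.

```python
assigned_per_group = 242
dropouts = {"control": 9, "treatment": 13}

attrition_summary = {}
for group, lost in dropouts.items():
    completers = assigned_per_group - lost        # cases with pretest and posttest
    rate = round(100 * lost / assigned_per_group, 1)  # drop-out rate in percent
    attrition_summary[group] = (completers, rate)

print(attrition_summary)
# control: 233 completers, 3.7% drop-out; treatment: 229 completers, 5.4% drop-out
```

These rates round to the 4% and 5% quoted in the design discussion.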
The major results for the four outcome measures are shown in Figure 1.
Insert Figure 1 about here
It is immediately apparent that in all four cases the null hypothesis cannot be rejected: contrary to expectations, Supported Employment cases did significantly worse on all four outcomes than did control participants.
The mean gains, standard deviations, sample sizes and t-values (t-test for differences in average gain) are shown for the four outcome measures in Table 1.
Insert Table 1 about here
The results in the table confirm the impressions in the figures. Note that all t-values are negative except for the BPRS where high scores indicate greater severity of illness. For all four outcomes, the t-values were statistically significant (p<.05).
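A t-test for the difference in average gain can be sketched as below. The paper does not state whether a pooled-variance or unequal-variance test was used, so the pooled form here is an assumption, and the scores are made up rather than taken from the study.

```python
from math import sqrt
from statistics import mean, variance


def gain_scores(pretest, posttest):
    """Posttest minus pretest for each participant."""
    return [post - pre for pre, post in zip(pretest, posttest)]


def pooled_t(treatment_gains, control_gains):
    """Two-sample t statistic for the difference in mean gain (pooled variance)."""
    n1, n2 = len(treatment_gains), len(control_gains)
    pooled_variance = (
        (n1 - 1) * variance(treatment_gains) + (n2 - 1) * variance(control_gains)
    ) / (n1 + n2 - 2)
    standard_error = sqrt(pooled_variance * (1 / n1 + 1 / n2))
    return (mean(treatment_gains) - mean(control_gains)) / standard_error


# Made-up scores on a measure where higher is better (not study data).
treatment = gain_scores(pretest=[10, 12, 11, 13], posttest=[9, 10, 10, 12])
control = gain_scores(pretest=[10, 11, 12, 13], posttest=[11, 13, 13, 14])

t_value = pooled_t(treatment, control)
print(round(t_value, 2))
```

A negative t-value in this setup means the treatment group gained less than the control group. For a measure such as the BPRS, where higher scores indicate greater severity, the same pattern of worse outcomes would produce a positive t-value.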
The results of this study were clearly contrary to initial expectations. The alternative hypothesis suggested that SE participants would show improved psychological functioning and self esteem after three months of employment. Exactly the reverse happened – SE participants showed significantly worse psychological functioning and self esteem.
There are two major possible explanations for this outcome pattern. First, it seems reasonable that there might be a delayed positive or “boomerang” effect of employment outside of a sheltered setting. SE cases may have to go through an initial difficult period of adjustment (longer than three months) before positive effects become apparent. This “you have to get worse before you get better” theory is commonly held in other treatment-contexts like drug addiction and alcoholism. But a second explanation seems more plausible – that people working full-time jobs in real-world settings are almost certainly going to be under greater stress and experience more negative outcomes than those who work in the relatively safe confines of an in-agency sheltered workshop. Put more succinctly, the lesson here might very well be that work is hard. Sheltered workshops are generally very nurturing work environments where virtually all employees share similar illness histories and where expectations about productivity are relatively low. In contrast, getting a job at a local hamburger shop or as a shipping clerk puts the person in contact with co-workers who may not be sympathetic to their histories or forgiving with respect to low productivity. This second explanation seems even more plausible in the wake of informal debriefing sessions held as focus groups with the staff and selected research participants. It was clear in the discussion that SE persons experienced significantly higher job stress levels and more negative consequences. However, most of them also felt that the experience was a good one overall and that even their “normal” co-workers “hated their jobs” most of the time.
One lesson we might take from this study is that much of our contemporary theory in psychiatric rehabilitation is naive at best and, in some cases, may be seriously misleading. Theory led us to believe that outside work was a “good” thing that would naturally lead to “good” outcomes like increased psychological functioning and self esteem. But for most people (SMI or not) work is at best tolerable, especially for the types of low-paying service jobs available to study participants. While people with SMI may not function as well or have high self esteem, we should balance this with the desire they may have to “be like other people” including struggling with the vagaries of life and work that others struggle with.
Future research in this area needs to address the theoretical assumptions about employment outcomes for persons with SMI. It is especially important that attempts to replicate this study also try to measure how SE participants feel about the decision to work, even if traditional outcome indicators suffer. It may very well be that negative outcomes on traditional indicators can be associated with a “positive” impact for the participants and for the society as a whole.
Chadsey-Rusch, J. and Rusch, F.R. (1986). The ecology of the workplace. In J. Chadsey-Rusch, C. Haney-Maxwell, L. A. Phelps and F. R. Rusch (Eds.), School-to-Work Transition Issues and Models. (pp. 59-94), Champaign IL: Transition Institute at Illinois.
Ciardiello, J.A. (1981). Job placement success of schizophrenic clients in sheltered workshop programs. Vocational Evaluation and Work Adjustment Bulletin, 14, 125-128, 140.
Cook, J.A. (1992). Job ending among youth and adults with severe mental illness. Journal of Mental Health Administration, 19(2), 158-169.
Cook, J.A. & Hoffschmidt, S. (1993). Psychosocial rehabilitation programming: A comprehensive model for the 1990’s. In R.W. Flexer and P. Solomon (Eds.), Social and Community Support for People with Severe Mental Disabilities: Service Integration in Rehabilitation and Mental Health. Andover, MA: Andover Publishing.
Cook, J.A., Jonikas, J., & Solomon, M. (1992). Models of vocational rehabilitation for youth and adults with severe mental illness. American Rehabilitation, 18, 3, 6-32.
Cook, J.A. & Razzano, L. (1992). Natural vocational supports for persons with severe mental illness: Thresholds Supported Competitive Employment Program, in L. Stein (ed.), New Directions for Mental Health Services, San Francisco: Jossey-Bass, 56, 23-41.
Endicott, J.R., Spitzer, J.L. Fleiss, J.L. and Cohen, J. (1976). The Global Assessment Scale: A procedure for measuring overall severity of psychiatric disturbance. Archives of General Psychiatry, 33, 766-771.
Griffiths, R.D. (1974). Rehabilitation of chronic psychotic patients. Psychological Medicine, 4, 316-325.
Overall, J. E. and Gorham, D. R. (1962). The Brief Psychiatric Rating Scale. Psychological Reports, 10, 799-812.
Rosenberg, M. (1965). Society and Adolescent Self Image. Princeton, NJ, Princeton University Press.
Wehman, P. (1985). Supported competitive employment for persons with severe disabilities. In P. McCarthy, J. Everson, S. Monn & M. Barcus (Eds.), School-to-Work Transition for Youth with Severe Disabilities, (pp. 167-182), Richmond VA: Virginia Commonwealth University.
Whitehead, C.W. (1977). Sheltered Workshop Study: A Nationwide Report on Sheltered Workshops and their Employment of Handicapped Individuals. (Workshop Survey, Volume 1), U.S. Department of Labor Service Publication. Washington, DC: U.S. Government Printing Office.
Woest, J., Klein, M. and Atkins, B.J. (1986). An overview of supported employment strategies. Journal of Rehabilitation Administration, 10(4), 130-135.
| | | Pretest | Posttest | Gain |
|---|---|---|---|---|
| Treatment | Mean | 59 | 43 | -16 |
| | sd | 25.2 | 24.3 | 24.75 |
| | N | 229 | 229 | 229 |
| Control | Mean | 61 | 63 | 2 |
| | sd | 26.7 | 22.1 | 24.4 |
| | N | 233 | 233 | 233 |
| | t = | -7.87075 | p<.05 | |

| | | Pretest | Posttest | Gain |
|---|---|---|---|---|
| Treatment | Mean | 27 | 16 | -11 |
| | sd | 19.3 | 21.2 | 20.25 |
| | N | 229 | 229 | 229 |
| Control | Mean | 25 | 24 | -1 |
| | sd | 18.6 | 20.3 | 19.45 |
| | N | 233 | 233 | 233 |
| | t = | -5.41191 | p<.05 | |
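The reported t values can be checked from the gain-score summaries alone. The sketch below assumes the statistic is the difference in mean gains divided by an unpooled standard error, which is an inference from the numbers rather than something the tables state; the function name is illustrative.

```python
import math

def gain_t_statistic(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """t statistic for the difference in mean gain between two groups.

    Assumes an unpooled standard error: sqrt(sd_t^2 / n_t + sd_c^2 / n_c).
    """
    standard_error = math.sqrt(sd_t ** 2 / n_t + sd_c ** 2 / n_c)
    return (mean_t - mean_c) / standard_error

# Gain-score means, standard deviations and Ns copied from the two tables above.
t_first = gain_t_statistic(-16, 24.75, 229, 2, 24.4, 233)
t_second = gain_t_statistic(-11, 20.25, 229, -1, 19.45, 233)
print(t_first, t_second)
```

Under that assumption the two results come out at roughly -7.87 and -5.41, in line with the values printed in the tables.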
Figure 1. Pretest and posttest means for treatment (SE) and control groups for the four outcome measures.
Please rate how strongly you agree or disagree with each of the following statements.
| Research Method | Quantitative | Qualitative | Mixed Methods |
|---|---|---|---|
| Purpose | To measure and quantify variables | To understand the meaning and complexity of phenomena | To integrate both quantitative and qualitative approaches |
| Nature | Typically focused on testing hypotheses and determining cause-and-effect relationships | Typically exploratory and focused on understanding the subjective experiences and perspectives of participants | Can be either, depending on the research design |
| Data collection | Usually involves standardized measures or surveys administered to large samples | Often involves in-depth interviews, observations, or analysis of texts or other forms of data | Usually involves a combination of quantitative and qualitative methods |
| Data analysis | Typically involves statistical analysis to identify patterns and relationships in the data | Typically involves thematic analysis or other qualitative methods to identify themes and patterns in the data | Usually involves both quantitative and qualitative analysis |
| Strengths | Can provide precise, objective data that can be generalized to a larger population | Can provide rich, detailed data that can help understand complex phenomena in depth | Can combine the strengths of both quantitative and qualitative approaches |
| Limitations | May not capture the full complexity of phenomena, and may be limited by the quality of the measures used | May be subjective and may not be generalizable to larger populations | Can be time-consuming and resource-intensive, and may require specialized skills |
| Examples | Surveys, experiments, correlational studies | Interviews, focus groups, ethnography | Explanatory sequential design, convergent parallel design, exploratory sequential design |
Examples of Research Methods are as follows:
Qualitative Research Example:
A researcher wants to study the experience of cancer patients during their treatment. They conduct in-depth interviews with patients to gather data on their emotional state, coping mechanisms, and support systems.
Quantitative Research Example:
A company wants to determine the effectiveness of a new advertisement campaign. They survey a large group of people, asking them to rate their awareness of the product and their likelihood of purchasing it.
Mixed Research Example:
A university wants to evaluate the effectiveness of a new teaching method in improving student performance. They collect both quantitative data (such as test scores) and qualitative data (such as feedback from students and teachers) to get a complete picture of the impact of the new method.
Research methods are used in various fields to investigate, analyze, and answer research questions. Here are some examples of how research methods are applied in different fields:
Research methods serve several purposes, including:
Research methods are used when you need to gather information or data to answer a question or to gain insights into a particular phenomenon.
Here are some situations when research methods may be appropriate:
Research methods provide several advantages, including:
Predicting future outcomes of patients is essential to clinical practice, with many prediction models published each year. Empirical evidence suggests that published studies often have severe methodological limitations, which undermine their usefulness. This article presents a step-by-step guide to help researchers develop and evaluate a clinical prediction model. The guide covers best practices in defining the aim and users, selecting data sources, addressing missing data, exploring alternative modelling options, and assessing model performance. The steps are illustrated using an example from relapsing-remitting multiple sclerosis. Comprehensive R code is also provided.
Clinical prediction models aim to forecast future health outcomes given a set of baseline predictors to facilitate medical decision making and improve people’s health outcomes. 1 Prediction models are becoming increasingly popular, with many new ones published each year. For example, a review of prediction models identified 263 prediction models in obstetrics alone 2 ; another review found 606 models related to covid-19. 3 Interest in predicting health outcomes has been heightened by the increasing availability of big data, 4 which has also led to the uptake of machine learning methods for prognostic research in medicine. 5 6
Several resources are available to support prognostic research. The PROGRESS (prognosis research strategy) framework provides detailed guidance on different types of prognostic research. 7 8 9 The TRIPOD (transparent reporting of a multivariable prediction model for individual prognosis or diagnosis) statement gives recommendations for reporting and has recently been extended to address prediction model research in clustered datasets. 10 11 12 13 14 PROBAST (prediction model risk-of-bias assessment tool) provides a structured way to assess the risk of bias in a prediction modelling study. 15 Several papers further outline good practices and provide software code. 16 17 18
Despite these resources, published prediction modelling studies often have severe methodological limitations. For instance, a review of prediction models for cardiovascular disease identified 363 models 19 ; the authors concluded that “the usefulness of most of the models remains unclear owing to methodological shortcomings, incomplete presentation, and lack of external validation and model impact studies.” Another review of 308 prediction models in psychiatry found that most were at high risk of bias. 20 Many biases well known in clinical and epidemiological research also apply to prediction model studies, including inconsistent definitions and measurements of predictors and outcomes or lack of blinding. Some biases are particularly pertinent to prediction modelling; for example, overfitting—estimating many model parameters from few data points—can lead to overestimating the model's performance. 15
This article provides a step-by-step guide for researchers interested in clinical prediction modelling. Based on a scoping review of the literature and discussions in our group, we identified 13 steps. We aim to provide an overview to help numerically minded clinicians, clinical epidemiologists, and statisticians navigate the field. We introduce key concepts and provide references to further reading for each step. We discuss issues related to model inception, provide practical recommendations about selecting predictors, outline sample size considerations, cover aspects of model development, such as handling missing data and assessing performance, and discuss methods for evaluating the model’s clinical usefulness. The concepts we describe and the steps we propose largely apply to statistical and machine learning models. An appendix with code in R accompanies the paper. Although several issues discussed here are also relevant to diagnostic research 21 (which is related but has subtle differences with prediction modelling) and models on predicting treatment effects, 22 23 our focus is primarily on methods for predicting a future health outcome. We illustrate the proposed procedure using an example of a prediction model for relapse in relapsing-remitting multiple sclerosis. The glossary in table 1 summarises the essential concepts and terms used.
Glossary of key terms and concepts used in prediction modelling
Many prediction models are published each year, but they often have methodological shortcomings that limit their internal validity and applicability. A 13 step guide has been developed to help healthcare professionals and researchers develop and validate prediction models, avoiding common pitfalls.
In the first step, the objective of the prediction model should be defined, including the target population, the outcome to be predicted, the healthcare setting where the model will be used, the intended users, and the decisions the model will inform.
Prediction modelling requires a collaborative and interdisciplinary effort within a team that ideally includes clinicians with content expertise, methodologists, users, and people with lived experiences.
Common pitfalls include inappropriate categorising of continuous outcomes or predictors, data driven cut-off points, univariable selection methods, overfitting, and lack of attention to missing data and a sound assessment of performance and clinical benefit.
Defining aims.
We should start by clearly defining the purpose of the envisaged prediction model. In particular, it is important to clearly determine the following:
The target population—for whom should the model predict? For example, people with HIV in South Africa; people with a history of diabetes; postmenopausal women in western Europe.
The health outcome of interest—what is the endpoint that needs to be predicted? For example, AIDS, overall survival, progression free survival, a particular adverse event.
The healthcare setting—how will the model be used? For example, the model might be used in primary care or be implemented in a clinical decision support system in tertiary care.
The user—who is going to use the model? For example, primary care physicians, secondary care physicians, patients, researchers.
The clinical decisions that the model will inform—how will model predictions be used in the clinical decision making process? For example, a model might be used to identify patients for further diagnostic investigation, to decide on treatment strategies, or to inform a range of personal decisions. 24
Answers to these questions should guide the subsequent steps; they will inform various issues, such as what predictors to include in the model, what data to use for developing and validating the model, and how to assess its clinical usefulness.
When developing a prediction model for clinical use, assembling a group with expertise in the specific medical field, the statistical methodology, and the source data is highly advisable. Including users—that is, clinicians who might use the model and people with lived experiences—is also beneficial. Depending on the model's complexity, it might be necessary to involve software developers at later stages of the project, for example, to develop a web application that lets users make predictions.
Identifying relevant published prediction models and studies on important risk factors is crucial and can be achieved through a scoping review. Discussing the review's findings with clinicians will help us to understand established predictors and the limitations of existing models. The literature review might also provide information on interactions between predictors, nonlinear associations between predictors and outcomes, reasons for missing data, and the expected distribution of predictors in the target population. In some situations, performing a systematic review might be helpful. Specific guidance on systematic reviews of prediction models has been published. 25 26 27
A study protocol should guide subsequent steps. The protocol can be made publicly available in an open access journal or as a preprint in an online repository (eg, www.medrxiv.org or https://osf.io/ ). In addition to the steps discussed here, the TRIPOD statement 10 14 and the PROBAST tool 15 might be helpful resources when writing the protocol.
Depending on the specific field, the literature review might show that relevant prediction models already exist. Suppose an existing model has a low risk of bias (according to PROBAST 15 ) and applies to the research question. In that case, assessing its validity for the intended setting might be more appropriate than developing a new model. This approach is known as external validation ( table 1 ). Depending on the validation results, we might decide to update and adapt the model to the population and setting of intended use. Common strategies for updating a prediction model include recalibration (eg, adjustment of the intercept term in a regression model), revision (ie, re-estimation of some model parameters), and extension (ie, addition of new predictors). 28 29 Although updating strategies have mainly been described for regression models, they can also be applied to machine learning. For example, a random forest model was used to predict whether patients with stroke would experience full recovery within 90 days of the event. When tested on an external dataset, the model needed recalibration, which was performed by fitting logistic regression models to the predictions from the random forest. 30 Prediction models for imaging data are often developed by fine tuning previously trained neural networks using a process known as transfer learning. 31
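As an illustration of the simplest updating strategy, recalibration of the intercept, the sketch below shifts the intercept of an existing logistic model so that the mean predicted risk in the new setting equals the observed event rate. The linear predictors and outcomes are made-up values, the function names are hypothetical, and bisection is used here only to keep the example dependency-free; in practice one would typically fit a logistic regression with the original linear predictor as an offset.

```python
import math

def expit(z):
    """Inverse logit: maps a linear predictor to a risk between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def recalibrate_intercept(linear_predictors, outcomes, tol=1e-10):
    """Return the intercept shift that equates mean predicted risk and observed rate.

    Solved by bisection; the mean predicted risk increases monotonically with the shift.
    """
    observed_rate = sum(outcomes) / len(outcomes)
    low, high = -20.0, 20.0
    while high - low > tol:
        mid = (low + high) / 2.0
        mean_risk = sum(expit(lp + mid) for lp in linear_predictors) / len(linear_predictors)
        if mean_risk < observed_rate:
            low = mid   # predictions still too low: shift upwards
        else:
            high = mid  # predictions too high: shift downwards
    return (low + high) / 2.0

# Hypothetical validation data: linear predictors from the existing model and observed outcomes.
linear_predictors = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5]
outcomes = [0, 0, 0, 0, 0, 0, 1, 1]

shift = recalibrate_intercept(linear_predictors, outcomes)
updated_risks = [expit(lp + shift) for lp in linear_predictors]
print(shift, sum(updated_risks) / len(updated_risks))
```

In this toy example the original model overpredicts risk, so the estimated shift is negative. Revision and extension would require re-estimating or adding coefficients and are not covered by this sketch.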
Further guidance on external validation and model updating is available elsewhere, 32 33 34 35 36 including sample size considerations for external validation. 37 In the following steps, we focus on developing a new model; we briefly revisit external validation in step 9.
An outcome can be defined and measured in many ways. For example, postoperative mortality can be measured as a binary outcome at 30 days, at 60 days, or using survival time. Using time-to-event instead of binary variables is good practice; a prediction model for time-to-event can better handle people who were followed up for a limited time and did not experience the outcome of interest. Moreover, time-to-event data provide richer information (eg, the survival probability at any time point) than a binary outcome at one time point only. Similarly, we can analyse a continuous health outcome using a continuous scale or after dichotomising or categorising. For example, a continuous depression score at week 8 after starting drug treatment could be dichotomised as remission or non-remission. Categorising a continuous outcome leads to loss of information. 38 39 40 Moreover, the selection of thresholds for categorisation is often arbitrary, lacking biological justification. In some cases, thresholds are chosen after exploring various cut-off points and opting for those that fit the data best or yield statistically significant results. This data driven approach could lead to reduced performance in new data. 38
Candidate predictors.
We should identify potential predictors based on the literature review and expert knowledge (step 1). Like the outcomes of interest, they should ideally be objectively defined and measured using an established, reliable method. Understanding the biological pathways that might underpin associations between predictors and the outcome is key. Predictors with proven or suspected causal relationships with the outcome should be prioritised for inclusion; this approach might increase the model's generalisability. On the other hand, the absence of a causal relationship should not a priori exclude potential predictors. Predictors not causally related to the outcome but strongly associated with it might still contribute to model performance, although they might generalise less well to different settings than causal factors. Further, we must include only baseline predictors; that is, information available when making a prognosis. Dichotomising or categorising continuous predictors reduces information and diminishes statistical power and should be avoided. 41 42 Similarly to categorising outcomes, we advise against making data driven, post hoc decisions after testing several categorisation thresholds for predictors. In other words, we should not choose the categories of a continuous predictor based solely on the associated model performance.
It is crucial to consider the model's intended use (defined in step 1) and the availability of data. What variables are routinely measured in clinical practice and are available in the database? What are the costs and practical issues related to their measurement, including the degree of invasiveness? 43 For example, the veterans ageing cohort study index (VACS index 2.0) predicts all cause mortality in people with HIV. 44 However, some of its predictors, such as the liver fibrosis index (FIB-4), will not be available in routine practice in many settings with a high prevalence of HIV infection. Similarly, a systematic review of prognostic models for multiple sclerosis found that 44 of 75 models (59%) included predictors unlikely to be measured in primary care or standard hospital settings. 45
Data collection.
Ideally, prediction models are developed using individual participant data from prospective cohort studies designed for this purpose. 1 In practice, developing prediction models using existing data from cohort studies or other data not collected explicitly for this purpose is much more common. Data from randomised clinical trials can also be used. The quality of trial data will generally be high, but models could have limited generalisability because trial participants might not represent the patients seen in clinical practice. For example, a study found only about 20% of people who have schizophrenia spectrum disorders would be eligible for inclusion in a typical randomised clinical trial. Patients who are ineligible had a higher risk of hospital admission with psychosis than those who are eligible. 46 Therefore, a prediction model based on trial data might underestimate the real world risk of hospital admissions. Registry data offer a simple, low cost alternative; their main advantage is the relatively large sample size and representativeness. However, drawbacks relate to data limitations such as inadequate data on relevant predictors or outcomes, and variability in the timing of measurements. 47
Before fitting the model, addressing potential misclassification or measurement errors in predictors and outcomes is crucial. This involves considering the nature of the variables collected and the methods used for measurement or classification. For example, predictors such as physical activity or dietary intake are prone to various sources of measurement error. 48 The extent of these errors can vary across settings, for example, because of differences in the measurement method used. This means that the model's predictive performance and potential usefulness could be reduced. 49 If the risk of measurement error is considered high, we might consider alternative outcome measures or exclude less important, imprecisely measured predictors from the list created in step 4. In particular, if systematic errors in the dataset do not mirror those encountered in clinical practice, the model’s calibration might be poor. While methods for correcting measurement errors have been proposed, they typically require additional data and assumptions. 49
After examining their distribution in the dataset, excluding predictors with limited variation is advisable because they will contribute little. For example, if the ages range from 25 to 45 years and the outcomes are not expected to change much within this range, we should remove age from the list of predictors. Similarly, a binary predictor might be present in only a few people. In such cases, we might consider removing it from the model unless there is previous evidence that this is a strong predictor. 47 More complications arise when a variable with low prevalence is known to have meaningful predictive value. For example, a rare genetic mutation could be strongly associated with the outcome. The mutation could be omitted from the model because its effect is difficult to estimate accurately. Alternatively, the few people with the mutation could be excluded, making the model applicable only to people without it. 47 Another issue is incomplete data on predictors and outcomes for some participants. Depending on the prevalence of missing data, we might want to modify the outcome or exclude certain candidate predictors. For example, we might omit a predictor with many missing values, especially if there is little evidence of its predictive power and imputing the missing data is challenging (step 7); that is, when the missing values cannot be reliably predicted using the observed data. Conversely, if the missing information can be imputed, we might decide to retain the variable, particularly when there is existing evidence that the predictor is important.
General considerations about sample size.
A very simple model or a model based on covariates that are not associated with the outcome will perform poorly in the data used to develop it and in new data; this scenario is called underfitting. Conversely, a model with too many predictors developed in a small dataset (overfitting) could perform well in this particular dataset but fail to predict accurately in new data. In practice, overfitting is more common than underfitting because datasets are often small and have few events, and there is the temptation to create models with the best (apparent) performance. Therefore, we must ensure the data are sufficient to develop a robust model that includes the relevant predictors.
Riley and colleagues 50 provide helpful guidance and code 51 52 on sample size calculations. Users need to specify the overall risk (for binary outcomes) or mean outcome value (for continuous outcomes) in the target population, the number of model parameters, and a measure of expected model performance (eg, the coefficient of determination, R²). Note that the number of parameters can be larger than the number of predictors. For example, we need two parameters when using a restricted cubic spline with three knots to model a nonlinear association of age with the outcome. The sample size calculated this way is the minimum for a standard statistical model. The sample size must be several times larger if we want to use machine learning models. 53 Sample size calculations for such models are considerably more complex and might require simulations. 54
Suppose the sample size is fixed or based on an existing study, as is often the case. Then, we should perform sample size calculations to identify the maximum number of parameters we can include in the model. A structured way to guide model development can be summarised as follows:
Calculate the maximum number of parameters that can be included in the model given the available sample size.
Use the available parameters sequentially by including predictors from the list, starting from the ones that are perceived to be more important. 55
Note that additional parameters will be needed for including nonlinear terms or interactions among the predictors in the list.
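The first item in this list can be sketched numerically. The example below assumes one commonly cited criterion from the sample size guidance, namely that the expected shrinkage factor S should be at least 0.9, giving n = P / ((S - 1) ln(1 - R²cs / S)), where P is the number of parameters and R²cs is the anticipated Cox-Snell R². This is only one of several criteria, the formula is stated here as our understanding rather than taken from the text above, and the function names are hypothetical; the published code should be preferred for real calculations.

```python
import math

def min_sample_size(n_parameters, r2_cs, shrinkage=0.9):
    """Minimum n so that the expected shrinkage is at least `shrinkage` (assumed criterion)."""
    per_participant = (shrinkage - 1.0) * math.log(1.0 - r2_cs / shrinkage)
    return math.ceil(n_parameters / per_participant)

def max_parameters(n_participants, r2_cs, shrinkage=0.9):
    """Largest number of parameters that a fixed sample size supports under the same criterion."""
    per_participant = (shrinkage - 1.0) * math.log(1.0 - r2_cs / shrinkage)
    return math.floor(n_participants * per_participant)

# Hypothetical planning values: anticipated Cox-Snell R² of 0.1.
print(min_sample_size(20, 0.1))    # sample size needed for 20 parameters
print(max_parameters(1000, 0.1))   # parameter budget for 1000 participants
```

With a fixed dataset, the parameter budget returned by `max_parameters` is then spent in order of perceived importance, remembering that nonlinear terms and interactions consume additional parameters.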
General considerations on missing data.
After removing predictors or outcomes with many missing values, as outlined in step 5, we might still need to address missing values in the retained data. Relying only on complete cases for model development—that is, participants with data for all variables—can dramatically reduce the sample size. To mitigate the loss of valuable information during model development and evaluation, researchers should consider imputing missing data.
Multiple imputation is the approach usually recommended to handle missing data during model development, and appropriately accounts for missing data uncertainty. 56 Several versions of the original dataset are created, each with missing values imputed using an imputation model. The imputation model should be the same (in terms of predictors included, their transformations and interactions) as the final model we will use to make predictions. Additionally, the imputation model might involve auxiliary variables associated with missing data, which can enhance the effectiveness of the imputations. Once we have created the imputed datasets, we must decide whether to include participants with imputed outcomes in the model development. If no auxiliary variables were used in the imputations, people with imputed outcomes can be removed, and the model can be developed based only on people with observed outcomes. 57 However, if imputation incorporates auxiliary variables, including those with imputed outcomes in the model development is advisable. 58 A simpler alternative to multiple imputation is single imputation when each missing value is imputed only once using a regression model. Sisk and colleagues showed that single imputation can perform well, although multiple imputation tends to be more consistent and stable. 59
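To make the idea of several imputed versions of one dataset concrete, the following simplified sketch imputes a single partly missing variable from one fully observed variable by regression plus random noise, repeated m times. The data, variable roles, and function names are hypothetical. Proper multiple imputation also draws the regression parameters themselves to reflect their uncertainty, which this sketch omits, so established software should be used in practice.

```python
import random
import statistics

def fit_line(xs, ys):
    """Least squares line of ys on xs; returns intercept, slope and residual SD."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    intercept = mean_y - slope * mean_x
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    resid_sd = (sum(r * r for r in residuals) / (len(xs) - 2)) ** 0.5
    return intercept, slope, resid_sd

def multiply_impute(rows, m=5, seed=1):
    """Create m completed copies of rows, filling None with prediction plus random noise."""
    rng = random.Random(seed)
    complete = [(x, v) for x, v in rows if v is not None]
    intercept, slope, resid_sd = fit_line([x for x, _ in complete], [v for _, v in complete])
    datasets = []
    for _ in range(m):
        datasets.append([
            (x, v if v is not None else intercept + slope * x + rng.gauss(0.0, resid_sd))
            for x, v in rows
        ])
    return datasets

# Hypothetical data: (age, laboratory value), with two laboratory values missing.
rows = [(30, 4.1), (35, 4.6), (40, None), (45, 5.2), (50, 5.9), (55, None), (60, 6.8)]
datasets = multiply_impute(rows, m=5)
for completed in datasets:
    print(completed[2], completed[5])
```

Observed values are identical in every completed dataset, whereas the imputed values differ from copy to copy; that variation is what carries the uncertainty about the missing data into later analyses.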
In step 4, we made the point that a model should include predictors that will be available in practice. However, we might want to make the model available even when some predictors are missing, for example, when using the model in a lower level of care. For example, the QRisk3 tool for predicting cardiovascular disease can be used even if the general practitioner does not enter information on blood pressure variability (the standard deviation of repeated readings). 60 When anticipating missing data during use in clinical practice, we can impute data during the development and implementation phases. In this case, single imputation can be used during model development and model use. 59
Imputation methods are not a panacea and might fail, typically when the tendency of the outcome to be missing correlates with the outcome itself. For example, patients receiving a new treatment might be more likely to miss follow-up visits if the treatment was successful, leading to missing data. Developing a prediction model in such cases requires additional modelling efforts 61 that are beyond the scope of this tutorial.
Modelling strategies.
The strategies for model development should be specified in the protocol (step 5). Linear regression for continuous outcomes, logistic regression for binary outcomes, and Cox or simple parametric models for survival outcomes are the usual starting points in modelling. If the sample size is large enough (see step 6), models can include nonlinear terms for continuous predictors or interactions between predictors. More advanced modelling strategies, such as machine learning models (eg, random forests, support vector machines, boosting methods, neural networks, etc), can also be used. 62 63 These strategies might add value if there are strong nonlinearities and interactions between predictors, although they are not immune to biases. 64 As discussed under step 10, a final strategy needs to be selected if several modelling strategies are explored.
When predicting binary or time-to-event outcomes, we should consider whether there are relevant competing events. This situation occurs when several possible outcomes exist, but a person can only experience one event. For example, when predicting death from breast cancer, death from another cause is a competing event. In this case, and especially whenever competing events are common, we should use a competing risks model for the analysis, such as a cause specific Cox regression model. 65 A simpler approach would be to analyse a composite outcome.
We advise against univariable selection methods—that is, methods that test each predictor separately and retain only statistically significant predictors. These methods do not consider the association between predictors and could lead to loss of valuable information. 55 66 Stepwise methods for variable selection (eg, forward, backwards, or bidirectional variable selection) are commonly used. Again, they are not recommended because they might lead to bias in estimation and worse predictive performance. 55 67 68 If variable selection is desirable—for instance, to simplify the implementation of the model by further reducing the number of predetermined predictors—more suitable methods can be used as described below.
Adding penalty terms to the model (a procedure called penalisation, regularisation, or shrinkage; see table 1 ) is recommended to control the complexity of the model and prevent overfitting. 69 70 71 Penalisation methods such as ridge, LASSO (least absolute shrinkage and selection operator), and elastic net generally lead to smaller absolute values of the coefficients—that is, they shrink coefficients towards zero—compared with maximum likelihood estimation. 72 LASSO and elastic net can be used for variable selection (similar to the methods described above). These models might exclude some predictors by setting their coefficients to zero, leading to a more interpretable and simpler model. Machine learning methods typically also have penalisation embedded. Penalisation is closely related to the bias-variance trade-off depicted in figure 1 , and is a method aiming to bring the model closer to the sweet spot of the bias-variance trade-off curve, where model performance in new data is maximised (note that the figure does not include a description of the double descent phenomenon). 73 Although penalisation methods have advantages, they do not solve all the problems associated with small sample sizes. While these methods typically are superior to standard estimation techniques, they can be unstable in small datasets. Moreover, their application does not ensure improved predictive performance. 74 75
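The shrinkage effect can be seen in the simplest possible case: one centred predictor, a centred continuous outcome, and no intercept, where ridge and LASSO estimates have closed forms. The data and penalty values below are made up for illustration, and the closed forms hold only for this single-predictor setting with the objectives stated in the comments.

```python
import math

def ols_coefficient(sxy, sxx):
    """Unpenalised least squares slope."""
    return sxy / sxx

def ridge_coefficient(sxy, sxx, lam):
    """Minimises sum((y - b*x)^2) + lam * b^2; the slope shrinks but never reaches zero."""
    return sxy / (sxx + lam)

def lasso_coefficient(sxy, sxx, lam):
    """Minimises 0.5 * sum((y - b*x)^2) + lam * |b|; soft thresholding can give exactly zero."""
    magnitude = max(abs(sxy) - lam, 0.0)
    return math.copysign(magnitude, sxy) / sxx

# Hypothetical centred data.
x = [-2.0, -1.0, 0.0, 1.0, 2.0]
y = [-2.1, -0.9, 0.1, 1.0, 1.9]
sxy = sum(a * b for a, b in zip(x, y))
sxx = sum(a * a for a in x)

print(ols_coefficient(sxy, sxx))
print(ridge_coefficient(sxy, sxx, lam=5.0))
print(lasso_coefficient(sxy, sxx, lam=5.0))
print(lasso_coefficient(sxy, sxx, lam=12.0))
```

Both penalties pull the coefficient towards zero relative to the unpenalised estimate, and a sufficiently large LASSO penalty removes the predictor altogether, which is how LASSO performs variable selection.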
Upper panel: graphical illustration of bias-variance trade-off. The training set is used to develop a model; the testing set is used to test it. A simple, underfitting model leads to high prediction error in training and testing sets. By increasing model complexity, the training set error can be lowered to zero. However, the testing set error (which needs to be minimised) only reduces to a point and then increases as complexity increases. The ideal model complexity is one that minimises the testing set error. An overfitting model might appear to perform well in the training set but might still be worthless—ie, overfitting leads to optimism. Lower three panels: fictional example of three prediction models (lines) developed using a dataset (points). x, y: single continuous predictor and outcome, respectively. The underfitting model has large training error and will also have large testing error; the overfitting model performs perfectly in the development set (ie, zero training error) but will perform poorly in new data (large testing error). The ideal model complexity will perform better than the other two in new data
If multiple imputation was used, we must apply each modelling strategy to every imputed dataset. Consequently, if there are m imputed datasets, m different models will be developed for each modelling strategy. When predicting outcomes, these m models need to be combined. There are two methods to achieve this. The first method uses Rubin’s rule, 76 which is suitable for simple regression models. The estimated parameters from the m models are averaged, resulting in a final set of parameters, which can then be used to predict the outcome for a new person. However, this method is not straightforward for model selection strategies (eg, LASSO) because the m fitted models might have selected different sets of parameters. As a result, combining them becomes more complex. 77 78 Rubin’s rule might not apply to machine learning methods because the m models could have different architectures. Another method for combining the m models is to use them to make predictions for the new person and then average these m predictions, 79 a procedure conceptually similar to stacking in machine learning.
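The two ways of combining the m models can be sketched as follows for a logistic regression model. This is a minimal Python illustration under stated assumptions: the coefficients, the predictor names, and the new person's values are hypothetical, and only the pooled point estimates are shown (not the pooled variances).

```python
import math


def predict_risk(coefficients, predictors):
    """Logistic model prediction; coefficients[0] is the intercept."""
    linear_predictor = coefficients[0] + sum(
        b * x for b, x in zip(coefficients[1:], predictors)
    )
    return 1.0 / (1.0 + math.exp(-linear_predictor))


def pool_coefficients(models):
    """Method 1: average each coefficient across the m models."""
    m = len(models)
    return [sum(model[j] for model in models) / m for j in range(len(models[0]))]


def average_predictions(models, predictors):
    """Method 2: predict with each of the m models, then average the predictions."""
    return sum(predict_risk(c, predictors) for c in models) / len(models)


# Hypothetical coefficients (intercept, treatment, age) from m = 3 imputed datasets.
models = [
    [-2.0, 0.50, 0.03],
    [-1.8, 0.45, 0.02],
    [-2.2, 0.55, 0.04],
]
new_person = [1.0, 60.0]  # treated, 60 years old

risk_pooled = predict_risk(pool_coefficients(models), new_person)
risk_averaged = average_predictions(models, new_person)
print(round(risk_pooled, 3), round(risk_averaged, 3))
```

The two results are close but not identical, because the logistic transformation is nonlinear. The second method only needs each model to produce a prediction, which is why it remains usable when the m models have selected different predictors or have different architectures.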
General concepts in assessing model performance.
We assess the predictive performance of the modelling strategies explored in step 8. Specifically, we contrast predictions with observed outcomes for people in a dataset to calculate performance measures. For continuous outcomes like blood pressure this is straightforward: observed outcomes can be directly compared with predictions because they are on the same scale. When dealing with binary or survival outcomes, the situation becomes more complex. In these cases, prediction models might give the probability of an event occurring for each individual while observed outcomes are binary (event or no event) or involve time-to-event data with censoring. Consequently, more advanced methods are required.
Prediction performance has two dimensions, and it is essential to assess them both, particularly for binary and survival outcomes (see glossary in table 1 ).
Discrimination—for continuous outcomes, discrimination refers to the model’s ability to distinguish between patients with different outcomes: good discrimination means that patients with higher predicted values also had higher observed outcome values. For binary outcomes, good discrimination means that the model separates people at high risk from those at low risk. For time-to-event outcomes, discrimination refers to the ability of the model to rank patients according to their survival; that is, patients predicted to survive longer survived longer.
Calibration relates to the agreement between observed and predicted outcome values. 80 81 For continuous outcomes, good calibration means that predicted values do not systematically overestimate or underestimate observed values. For binary and survival outcomes, good calibration means the model does not overestimate or underestimate risks.
Discrimination and calibration are essential when evaluating prediction models. A model can have good discrimination by accurately distinguishing between risk levels, but still have poor calibration owing to a mismatch between predicted and observed probabilities. Moreover, a well calibrated model might have poor discrimination. Thus, a robust prediction model should have good discrimination and calibration. Box 1 outlines measures for assessing model performance.
Continuous outcomes.
Predicted and observed outcomes can be compared through mean bias, mean squared error, and the coefficient of determination, $R^2$, to measure overall performance (ie, combining calibration and discrimination). For discrimination alone, rank correlation statistics between predictions and observations can be used, although this seldom occurs in practice. For calibration, results can be visualised in a scatterplot and an observed versus predicted line fitted. For a perfectly calibrated model, this line is on the diagonal; for an overfit (underfit) model, the calibration line is above (below) the diagonal. A smooth calibration line can assess calibration locally (ie, it can indicate areas where the model underestimates or overestimates the outcome). Smooth calibration lines can be obtained by fitting, for example, restricted cubic splines or a locally estimated scatterplot smoothing line (LOESS) of the predicted versus the observed outcomes.
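As a minimal sketch of these measures for a continuous outcome, the Python functions below compute mean bias, mean squared error, the coefficient of determination, and a straight observed versus predicted calibration line. The function names and the example values are illustrative assumptions; smooth calibration lines (splines or LOESS) are not shown.

```python
def mean_bias(observed, predicted):
    """Average of (predicted - observed); positive values mean overestimation."""
    return sum(p - o for o, p in zip(observed, predicted)) / len(observed)


def mean_squared_error(observed, predicted):
    """Average squared difference between predictions and observations."""
    return sum((p - o) ** 2 for o, p in zip(observed, predicted)) / len(observed)


def r_squared(observed, predicted):
    """1 - (residual sum of squares) / (total sum of squares of the observations)."""
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    sst = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - sse / sst


def calibration_line(observed, predicted):
    """Least squares line of observed (vertical) on predicted (horizontal).

    Returns (intercept, slope); perfect calibration gives (0, 1).
    """
    n = len(observed)
    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n
    spp = sum((p - mean_p) ** 2 for p in predicted)
    spo = sum((p - mean_p) * (o - mean_o) for o, p in zip(observed, predicted))
    slope = spo / spp
    return mean_o - slope * mean_p, slope


# Made-up systolic blood pressure values (mm Hg).
observed = [120.0, 135.0, 150.0, 110.0, 142.0]
predicted = [118.0, 140.0, 145.0, 115.0, 138.0]

print("mean bias:", mean_bias(observed, predicted))
print("MSE:", mean_squared_error(observed, predicted))
print("R squared:", r_squared(observed, predicted))
print("calibration intercept, slope:", calibration_line(observed, predicted))
```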
For binary outcomes, discrimination can be assessed using the area under the receiver operating characteristic curve (AUC). Mean calibration (calibration in the large, see table 1 ) can be determined by comparing mean observed versus mean predicted event rates. A logistic regression model can be fit to the observed outcome using the log odds of the event from the prediction model as the sole independent variable, and then the intercept and slope can be evaluated. Additionally, a calibration curve can be created: participants are grouped according to their predicted probabilities, the mean predicted probability and the proportion of events are calculated for each group, and the two are compared in a scatterplot; a smooth calibration curve (eg, using splines) can be drawn to assess calibration locally. The Brier score measures overall performance and is calculated as the mean squared difference between predicted probabilities and actual outcomes. Many additional measures can be used, for example, the F score, sensitivity, and specificity.
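Three of these measures can be sketched in a few lines of Python. The example below is a minimal illustration, assuming outcomes coded as 0 or 1 and made-up predicted risks; the function names are ours. The AUC is computed directly from its interpretation as the proportion of (event, non-event) pairs in which the person with the event received the higher predicted risk, with ties counted as one half.

```python
def auc(outcomes, risks):
    """Proportion of (event, non-event) pairs ranked correctly; ties count 0.5."""
    events = [r for y, r in zip(outcomes, risks) if y == 1]
    non_events = [r for y, r in zip(outcomes, risks) if y == 0]
    score = 0.0
    for e in events:
        for n in non_events:
            if e > n:
                score += 1.0
            elif e == n:
                score += 0.5
    return score / (len(events) * len(non_events))


def brier_score(outcomes, risks):
    """Mean squared difference between predicted probabilities and outcomes."""
    return sum((r - y) ** 2 for y, r in zip(outcomes, risks)) / len(outcomes)


def calibration_in_the_large(outcomes, risks):
    """Return (mean observed event rate, mean predicted risk)."""
    n = len(outcomes)
    return sum(outcomes) / n, sum(risks) / n


# Made-up outcomes (1 = event) and predicted risks for six people.
outcomes = [0, 0, 1, 0, 1, 1]
risks = [0.10, 0.30, 0.35, 0.40, 0.70, 0.80]

print("AUC:", auc(outcomes, risks))
print("Brier score:", brier_score(outcomes, risks))
print("observed vs predicted event rate:", calibration_in_the_large(outcomes, risks))
```

The pairwise AUC computation is quadratic in the sample size, which is acceptable for illustration but slow for large datasets; rank-based implementations are preferable there.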
For survival outcomes, if the focus is on a specific time point, discrimination can be assessed as for binary outcomes (fixed time point discrimination). 18 However, censoring of follow-up times complicates this assessment. Uno and colleagues' inverse probability of censoring weights method can account for censoring. 82 Also, discrimination can be assessed across all time points using Harrell's c statistic. 83 Uno's c statistic can be expanded to a global measure, across all time points. 84 Calibration can be assessed for a fixed time point by comparing the average predicted survival from the model with the observed survival, estimated while accounting for censoring; this can be obtained from a Kaplan-Meier curve by looking at the specific time point (calibration in the large at a fixed time). The Kaplan-Meier curve can be compared with the mean predicted survival across all times. More details can be found elsewhere. 18 Smooth calibration curves can also be used to assess performance of the model across the full range of predicted risks, while additional calibration metrics have also been proposed. 85 86 Similar measures can be used for competing events, with some adjustments. 16
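For illustration, a simplified version of Harrell's c statistic can be written in a few lines of Python. This sketch rests on several assumptions: a higher risk score is taken to mean shorter expected survival, a pair of patients is considered usable only if the patient with the shorter follow-up time had the event, and pairs with identical follow-up times are skipped. The data are made up, and established software handles ties and censoring more carefully.

```python
def harrell_c(times, events, risk_scores):
    """Simplified Harrell's c statistic for right-censored data.

    times: follow-up times; events: 1 = event observed, 0 = censored;
    risk_scores: model output, higher = higher predicted risk.
    """
    concordant = 0.0
    usable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # Usable pair: i has the shorter time and i's event was observed.
            if times[i] < times[j] and events[i] == 1:
                usable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    return concordant / usable


# Made-up follow-up data for five patients.
times = [2.0, 4.0, 5.0, 7.0, 9.0]
events = [1, 0, 1, 1, 0]
risk_scores = [0.9, 0.4, 0.3, 0.7, 0.2]

print("Harrell's c:", harrell_c(times, events, risk_scores))
```

In this toy example, six of the seven usable pairs are ranked correctly. A value of 0.5 corresponds to random ranking and 1 to perfect ranking.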
What data should we use to assess the performance of a prediction model? The simplest approach is to use the same dataset as for model development; this approach will return the so-called apparent model performance (apparent validation). However, this strategy might overestimate the model’s performance ( fig 1 ); that is, it might lead to erroneous (optimistic) assessments. Optimism is an important issue in prediction modelling and is particularly relevant when sample sizes are small and models complex. Therefore, assessing model performance using a more adequate validation procedure is crucial. Proper validation is essential in determining a prediction model’s generalisability—that is, its reproducibility and transportability. 33 47 Reproducibility refers to the model’s ability to produce accurate predictions in new patients from the same population. Transportability is the ability to produce accurate predictions in new patients drawn from a different but related population. Below, we describe different approaches to model validation.
Internal validation focuses on reproducibility and specifically aims to ensure that assessments of model performance using the development dataset are honest, meaning optimism does not influence them. In an internal validation procedure, we use data on the same patient population as the one used to develop the model and try to assess model performance while avoiding optimism. Validation must follow all steps of model development, including variable selection.
The simplest method is the split sample approach where the dataset is randomly split into two parts (eg, 70% training and 30% testing). However, this method is problematic because it wastes data and decreases statistical power. 55 87 When applied to a small dataset, it might create two datasets that are inadequate for both model development and evaluation. Conversely, for large datasets it offers little benefit because the risk of overfitting is low. Further, it might encourage researchers to repeat the procedure until they obtain satisfactory results. 88 Another approach is to split the data according to the calendar time of patient enrolment. For example, we might develop the model using data from an earlier period and test it in patients enrolled later. This procedure (temporal validation) 35 89 might inform us about possible time trends in model performance. However, the time point used for splitting the data will generally be arbitrary and older data might not reflect current patient characteristics or health care. Therefore, this approach is not recommended for the development phase. 88
A better method is k-fold cross validation. In this approach, we divide the data randomly into k (usually 10) subsets (folds). The model is built using k−1 of these folds and evaluated on the remaining one fold. This process is repeated, cycling through all the folds so that each can be the testing set. The model's performance is measured in each cycle, and the k estimates are then combined and summarised to get a final performance measure. Bootstrapping is another method, 90 which can be used to calculate optimism and optimism corrected performance measures for any model. Box 2 outlines the procedure. 47 Bootstrapping generally leads to more stable and less biased results, 93 and is therefore recommended for internal validation. 47 However, implementation of k-fold cross validation and bootstrapping can be computationally demanding when multiple imputation of missing data is needed. 88
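The k-fold procedure can be sketched as follows. This is a minimal Python illustration, assuming a deliberately simple model (a straight line with one predictor), mean squared error as the performance measure, and simulated data; all names are ours. With a real modelling strategy, every development step (including any variable selection) would be repeated inside the loop.

```python
import random


def fit_line(x, y):
    """Least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def k_fold_mse(x, y, k=10, seed=0):
    """Cross validated mean squared error, averaged over the k folds."""
    indices = list(range(len(x)))
    random.Random(seed).shuffle(indices)          # random allocation to folds
    folds = [indices[i::k] for i in range(k)]
    fold_errors = []
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in indices if i not in held_out]
        # Develop the model on the k-1 training folds only.
        intercept, slope = fit_line(
            [x[i] for i in train_idx], [y[i] for i in train_idx]
        )
        # Evaluate it on the fold that was left out.
        errors = [(y[i] - (intercept + slope * x[i])) ** 2 for i in test_idx]
        fold_errors.append(sum(errors) / len(errors))
    return sum(fold_errors) / k


# Simulated data: true line 2 + 0.5x with standard normal noise.
rng = random.Random(42)
x = [rng.uniform(0.0, 10.0) for _ in range(100)]
y = [2.0 + 0.5 * a + rng.gauss(0.0, 1.0) for a in x]

cv_mse = k_fold_mse(x, y, k=10, seed=1)
print("10-fold cross validated MSE:", round(cv_mse, 3))
```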
Use bootstrapping to correct apparent performance and obtain optimism corrected measures for any model M and any performance measure as follows.
Select a measure X (eg, $R^2$, mean squared error, AUC (area under the receiver operating characteristic curve)) and calculate apparent performance ($X_0$) of model M in the original sample.
Create many (at least $N_B=100$) bootstrap samples with the same size as the original dataset by drawing patients from the study population with replacement. Replacement means that some individuals might be included several times in a bootstrap sample, while others might not appear at all.
In each bootstrap sample $i$ ($i=1, 2, \ldots, N_B$) construct model $M_i$ by exactly reiterating all steps of developing M, ie, including variable selection methods (if any were used). Determine the apparent performance $X_i$ of model $M_i$ in sample $i$.
Apply $M_i$ to the original sample and calculate performance, $X_i^*$. This performance will generally be worse than $X_i$ owing to optimism. Calculate optimism for measure X, sample $i$, as $O_i^X = X_i - X_i^*$.
Average the $N_B$ different values of $O_i^X$ to estimate optimism, $O^X$.
Calculate the optimism corrected value of X as $X_{\text{corrected}} = X_0 - O^X$.
More advanced versions of bootstrapping (eg, the 0.632+ bootstrap 91 ) require slightly different procedures. 92 In practice, we often need to combine bootstrapping with multiple imputation. Ideally, we should first bootstrap and then impute. 92 However, this strategy might be computationally difficult. Instead, we can first impute, then bootstrap, obtain optimism corrected performance measures from each imputed dataset, and finally pool these.
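The basic bootstrap procedure of box 2 can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the "model" is a straight line with a single predictor, the performance measure is the coefficient of determination, the data are simulated, and no multiple imputation is involved. All names are ours. With a real modelling strategy, the function that fits the model would repeat every development step, including any variable selection.

```python
import random


def fit_line(x, y):
    """Least squares fit; returns (intercept, slope)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def r_squared(x, y, model):
    """Coefficient of determination of a fitted model in the data (x, y)."""
    intercept, slope = model
    mean_y = sum(y) / len(y)
    sse = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    sst = sum((b - mean_y) ** 2 for b in y)
    return 1.0 - sse / sst


def optimism_corrected_r2(x, y, n_boot=200, seed=0):
    """Return (apparent performance, estimated optimism, corrected performance)."""
    rng = random.Random(seed)
    n = len(x)
    apparent = r_squared(x, y, fit_line(x, y))             # X_0
    optimism_values = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]         # draw with replacement
        boot_x = [x[i] for i in idx]
        boot_y = [y[i] for i in idx]
        model_i = fit_line(boot_x, boot_y)                 # M_i
        perf_boot = r_squared(boot_x, boot_y, model_i)     # X_i
        perf_orig = r_squared(x, y, model_i)               # X_i*
        optimism_values.append(perf_boot - perf_orig)      # O_i
    optimism = sum(optimism_values) / n_boot               # O
    return apparent, optimism, apparent - optimism


# Simulated small dataset (30 people), where optimism is expected to matter.
rng = random.Random(7)
x = [rng.uniform(0.0, 10.0) for _ in range(30)]
y = [1.0 + 0.3 * a + rng.gauss(0.0, 1.5) for a in x]

apparent, optimism, corrected = optimism_corrected_r2(x, y, n_boot=200, seed=1)
print("apparent:", round(apparent, 3))
print("optimism:", round(optimism, 3))
print("corrected:", round(corrected, 3))
```

The estimated optimism is typically positive, so the corrected value is usually lower than the apparent one; the gap tends to widen as the sample gets smaller or the model more complex.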
Another method of assessing whether a model’s predictions are likely to be reliable or not is by checking the model’s stability. Model instability means that small changes in the development dataset lead to large changes in the resulting model structure (important differences in estimates of model parameters, included predictors, etc), leading to important changes in predictions and model performance. Riley and Collins described how to assess the stability of clinical prediction models during the model development phase using a bootstrap approach. 94 The model building procedure is repeated in several bootstrap samples to create numerous models. Predictions from these models are then compared with the original model predictions to investigate possible instability.
An alternative approach is internal-external validation, a form of leave-one-cluster-out cross validation. This method involves partitioning the data into clusters based on a specific variable (eg, different studies, hospitals, general practices, countries) and then iteratively using one cluster as the test set while training the model on the remaining clusters. 95 96 Like in k-fold cross validation, this process is repeated for each cluster, and the performance results are summarised at the end. In contrast to k-fold cross validation, internal-external validation can provide valuable insights into how well the model generalises to new settings and populations because it accounts for heterogeneity across different clusters. For example, prediction models for patients with HIV were developed based on data from treatment programmes in Côte d’Ivoire, South Africa, and Malawi and validated using leave-one-country-out cross validation. 97
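A minimal Python sketch of this procedure is shown below, assuming a straight-line model with one predictor, mean squared error as the performance measure, and three hypothetical hospitals (A, B, and C) as clusters. The data are constructed so that hospital C has a higher baseline outcome than the other two, which a model developed without hospital C cannot capture.

```python
def fit_line(x, y):
    """Least squares fit; returns (intercept, slope)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def internal_external_mse(x, y, clusters):
    """Hold out each cluster in turn; return {cluster: MSE in that cluster}."""
    results = {}
    for held_out in sorted(set(clusters)):
        train = [i for i, c in enumerate(clusters) if c != held_out]
        test = [i for i, c in enumerate(clusters) if c == held_out]
        # Develop the model without the held-out cluster.
        intercept, slope = fit_line([x[i] for i in train], [y[i] for i in train])
        # Evaluate it in the held-out cluster only.
        errors = [(y[i] - (intercept + slope * x[i])) ** 2 for i in test]
        results[held_out] = sum(errors) / len(errors)
    return results


# Hypothetical data: hospitals A and B follow y = 1 + 2x, hospital C follows y = 3 + 2x.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
clusters = ["A", "A", "A", "B", "B", "B", "C", "C", "C"]
y = [3.0, 5.0, 7.0, 9.0, 11.0, 13.0, 17.0, 19.0, 21.0]

for cluster, mse in internal_external_mse(x, y, clusters).items():
    print("held-out hospital", cluster, "MSE:", round(mse, 3))
```

Reporting performance per held-out cluster, rather than only an overall average, is what reveals heterogeneity: here the model trained on hospitals A and B systematically underestimates outcomes in hospital C.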
Note here that although all internal and internal-external validation methods include some form of data splitting, the final model should be developed using data from all patients. This strategy contrasts with the external validation method outlined below.
External validation requires testing the model on a new set of patients—that is, those not used for model development. 36 Assuming that the model has shown good internal validity, external validation studies are the next step in determining a model’s transportability before considering its implementation in clinical practice. The more numerous and diverse the settings in which the model is externally validated, the more likely it will generalise to a new setting. An external validation study could indicate that a model requires updating before being used in a new setting. A common scenario is when a model’s discrimination is adequate in new settings and fairly stable over time, but calibration is suboptimal across settings or deteriorates over time (calibration drift). 98 For example, EuroSCORE is a model developed in 1999 for predicting mortality in hospital for patients undergoing cardiac surgery. 99 Using data from 2001 to 2011, EuroSCORE was shown to consistently overestimate mortality and its calibration deteriorated over time. 100 In such situations, model updating (step 2) might be required.
The inclusion of external validation in model development is a topic of debate, with certain journals mandating it for publication. 88 100 One successful external validation, however, does not establish transportability to many other settings, and such a requirement might lead to the selective reporting of validation data. 100 Therefore, our view (echoing recent recommendations 88 ) is that external validation studies should be kept separate from model development rather than required at the time a model is developed. External validation studies are ideally performed by independent investigators who were not involved in the original model development. 101 For guidance on methods for external validation, see references cited in step 2.
Now it is time to choose the final model based on the internal and internal-external validation performance metrics (and possibly on stability assessments). If different modelling strategies perform similarly, we might want to select the simpler model (related to Occam’s razor principle 102 ). For example, logistic regression performed similarly to optimised machine learning models for discriminating between type 1 and type 2 diabetes in young adults. 103 In this case, we would prefer the regression model because it is simpler and easier to communicate and use.
A prediction model might strongly discriminate and be well calibrated, but its value depends on how we intend to use it in clinical practice. While an accurate prediction model can be valuable in counselling patients on likely outcomes, determining its utility in guiding decisions is less straightforward. Decision analysis methods can be used to assess whether a prediction model should be used in practice by incorporating and quantifying its clinical impact, considering the anticipated benefits, risks, and costs. 104 For example, the National Institute for Health and Care Excellence (NICE) in the UK recommends cholesterol lowering treatment if the predicted 10 year risk of myocardial infarction or stroke is 10% or higher (the cut-off threshold probability) based on the QRISK3 risk calculator. 60 105 The assumption is that the benefit of treating one patient who would experience a cardiovascular event over 10 years outweighs the harms and costs incurred by treating another nine people who will not benefit. In other words, the harm associated with not treating the one patient who would develop the event is assumed to be nine times greater than the consequences of treating a patient who does not need it.
Net benefit brings the benefits and harms of a decision strategy (eg, to decide for or against treatment based on a prediction model) on the same scale so they can be compared. 104 We can compute the net benefit of using the model at a particular cut-off threshold (eg, 10% risk for the case of QRISK3 risk calculator). The net benefit is calculated as the expected percentage of true positives minus the expected percentage of false positives, the latter multiplied by a weight determined by the chosen cut-off threshold (the odds of the threshold probability). We obtain the decision curve by plotting the model's net benefit across a range of cut-off thresholds deemed clinically relevant. 106 107 We can compare the benefit of making decisions based on the model with alternative strategies, such as treating everyone or no one. We can also compare different models. The choice of decision threshold can be subjective, and the range of sensible thresholds will depend on the settings, conditions, available diagnostic tests or treatments, and patient preferences. The lower the threshold, the more unnecessary tests or interventions we are willing to accept. Of note, a decision curve analysis might indicate that a model is not useful in practice despite its excellent predictive ability.
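The net benefit calculation can be sketched in Python as follows. This is a minimal illustration with made-up outcomes and predicted risks; the function names are ours. At threshold probability $p_t$, the net benefit of the model is the proportion of true positives minus the proportion of false positives multiplied by the odds $p_t/(1-p_t)$; at a 10% threshold this weight is 1/9, in line with the QRISK3 example above. Treating no one has a net benefit of zero by definition.

```python
def net_benefit_model(outcomes, risks, threshold):
    """Net benefit of treating people whose predicted risk is >= threshold."""
    n = len(outcomes)
    true_pos = sum(1 for y, r in zip(outcomes, risks) if r >= threshold and y == 1)
    false_pos = sum(1 for y, r in zip(outcomes, risks) if r >= threshold and y == 0)
    weight = threshold / (1.0 - threshold)   # odds of the threshold probability
    return true_pos / n - weight * false_pos / n


def net_benefit_treat_all(outcomes, threshold):
    """Net benefit of treating everyone, regardless of predicted risk."""
    prevalence = sum(outcomes) / len(outcomes)
    weight = threshold / (1.0 - threshold)
    return prevalence - weight * (1.0 - prevalence)


# Made-up data for ten people (1 = event occurred).
outcomes = [0, 0, 1, 0, 1, 1, 0, 0, 0, 1]
risks = [0.05, 0.08, 0.15, 0.20, 0.30, 0.60, 0.02, 0.12, 0.07, 0.25]

# Points of a decision curve over a range of thresholds.
for threshold in [0.05, 0.10, 0.15, 0.20, 0.30]:
    print(
        threshold,
        round(net_benefit_model(outcomes, risks, threshold), 3),
        round(net_benefit_treat_all(outcomes, threshold), 3),
    )
```

Plotting the two columns against the threshold, together with the horizontal line at zero for treating no one, gives the decision curve. The model is useful at thresholds where its net benefit exceeds both alternatives.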
There are several pitfalls in the interpretation of decision curves. 24 Most importantly, the decision curve cannot determine at what threshold probability the model should be used. Moreover, because the model’s predictive performance influences the decision curve, the decision curve can be affected by optimism. Therefore, a model’s good predictive performance (in internal validation and after correction for optimism) should be established before evaluating its clinical usefulness through a decision curve. Additionally, the curve can be obtained using a cross validation approach. 108 Vickers and colleagues provide a helpful step-by-step guide to interpreting decision curve analysis, and a website with a software tutorial and other resources. 107 The multiple sclerosis example below includes a decision curve analysis.
In prediction modelling, the primary focus is typically not on evaluating the importance of individual predictors; rather, the goal is to optimise the model’s overall predictive performance. Nevertheless, identifying influential predictors might be of interest, for example, when evaluating the potential inclusion of a new biomarker as a routine measurement. Also, some predictors might be modifiable, raising the possibility that they could play a part in prevention if their association with the outcome is causal. Therefore, as an additional, optional step, researchers might want to assess the predictive capacity of the included predictors.
Looking at estimated coefficients in (generalised) linear regression models is a simple way to assess the importance of different predictors. However, when the assumptions of linear regression are not met, for example, when there is collinearity, these estimates might be unreliable. Note that multicollinearity does not threaten a model's predictive performance, only the interpretation of the coefficients. Another method to assess the importance of a predictor, also applicable to machine learning models, is to fit the model with and without this predictor and note the reduction in model performance; omitting more important predictors will lead to a larger reduction in performance. More advanced methods include the permutation importance algorithm 109 and SHAP (Shapley additive explanations) 110 ; we do not discuss these here.
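Although the permutation approach is not discussed in detail here, its basic idea is simple enough to sketch. The Python example below is a simplified illustration and not the published algorithm: it assumes an already fitted model (here a hypothetical fixed linear function of two predictors), mean squared error as the performance measure, and made-up data. The values of one predictor are shuffled across people, breaking its link with the outcome, and the resulting increase in prediction error is recorded.

```python
import random


def mean_squared_error(model, rows, outcomes):
    """Mean squared error of the model's predictions for the given rows."""
    return sum((y - model(r)) ** 2 for r, y in zip(rows, outcomes)) / len(rows)


def permutation_importance(model, rows, outcomes, column, n_repeats=20, seed=0):
    """Average increase in MSE after shuffling one predictor column."""
    rng = random.Random(seed)
    baseline = mean_squared_error(model, rows, outcomes)
    increases = []
    for _ in range(n_repeats):
        shuffled = [r[column] for r in rows]
        rng.shuffle(shuffled)
        # Rebuild the rows with the shuffled values in the chosen column.
        permuted = [
            r[:column] + [v] + r[column + 1:] for r, v in zip(rows, shuffled)
        ]
        increases.append(mean_squared_error(model, permuted, outcomes) - baseline)
    return sum(increases) / n_repeats


def model(row):
    """Hypothetical fitted model: strong effect of predictor 0, weak of predictor 1."""
    return 1.0 + 2.0 * row[0] + 0.1 * row[1]


# Made-up data in which the outcomes equal the model predictions (baseline MSE of 0).
second_predictor = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0, 5.0, 3.0]
rows = [[float(i), v] for i, v in enumerate(second_predictor)]
outcomes = [model(r) for r in rows]

importance_0 = permutation_importance(model, rows, outcomes, column=0)
importance_1 = permutation_importance(model, rows, outcomes, column=1)
print("predictor 0:", round(importance_0, 3))
print("predictor 1:", round(importance_1, 3))
```

Shuffling the predictor with the larger effect degrades performance far more than shuffling the weak one. As with coefficients, such importance values describe the model's reliance on a predictor and do not by themselves indicate a causal effect.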
Regardless of the method we choose to assess predictor importance, we should be careful in our interpretations; associations seen in data might not reflect causal relationships (eg, see the “Table 2 fallacy” 111 ). A thorough causal inference analysis is needed to establish causal associations between predictors and outcomes. 112
Congratulations to us! We have developed a clinical prediction model! Now, it is time to write the paper and describe the process and results in detail. The TRIPOD reporting guideline and checklist 10 14 (or, for clustered datasets, TRIPOD cluster 13 ) should be used to ensure all important aspects are covered in the paper. If possible, the article should report the full model equation to allow reproducibility and independent external validation studies. Software code and, ideally, data should be made freely available. Further, we must ensure the model is accessible to the users we defined in step 1. Although this should be self-evident, in practice, there is often no way to use published models to make an actual prediction; for example, Reeve and colleagues found that 52% of published models for multiple sclerosis could not be used in practice because no model coefficients, tools, or instructions were provided. 45
The advantages and disadvantages of different approaches for making the model available to users, including score systems, graphical score charts, nomograms, and websites and smartphone applications have been reviewed elsewhere. 113 Simpler approaches are easier to use, for example, on ward rounds, but might require model simplification by removing some predictors or categorising continuous variables. Online calculators where users input predictor values (eg, a web application using Shiny in R) 114 can be based on the whole model without information loss. However, if publicly accessible, calculators might be misused by people for whom they are not intended, or if the model fails to show any clinical value (eg, in a subsequent external validation). Generally, the presentation and implementation should always be discussed with the users to match their needs (defined in step 1).
Multiple sclerosis is a chronic inflammatory disorder of the central nervous system with a highly variable clinical course. 115 Relapsing-remitting multiple sclerosis (RRMS), the most common form, is characterised by attacks of worsening neurological function (relapses) followed by periods of partial or complete recovery (remissions). 116 117 118 These fluctuations pose a major challenge in managing the disease. A predictive tool could inform treatment decisions. Below, we describe the development of a prediction model for RRMS. 119 We briefly outline the procedures followed in the context of our step-by-step guide. Details of the original analysis and results are provided elsewhere. 119
The aim was to predict relapse within two years in patients with RRMS. Such a prediction can help treatment decisions; if the risk of relapsing is high, patients might consider intensifying treatment, for example, by taking more active disease modifying drugs, which might however have a higher risk of serious adverse events, or considering stem cell transplantation. A multidisciplinary team comprising clinicians, patients, epidemiologists, and statisticians was formed. A literature review identified several potential predictors for relapse in RRMS. Additionally, the review showed limitations of existing prediction models, including lack of internal validation, inadequate handling of missing data, and lack of assessment of clinical utility (step 1). These deficiencies compromised the reliability and applicability of existing models in clinical settings. Based on the review, it was decided to pursue the development of a new model, instead of updating an existing one (step 2). The authors chose the (binary) occurrence of at least one relapse within a two year period for people with RRMS (step 3) as the outcome measure.
The following predictors were used based on the literature review and expert opinion: age, expanded disability status scale score, previous treatment for multiple sclerosis, months since last relapse, sex, disease duration, number of previous relapses, and number of gadolinium enhanced lesions. The selection aimed to include relevant predictors while excluding those that are difficult to measure in clinical practice (step 4). The model was developed using data from the Swiss Multiple Sclerosis Cohort, 120 a prospective cohort study that closely monitors patients with RRMS. Data included a total of 1752 observations from 935 patients followed up every two years, with 302 events observed (step 5). Sample size calculations 50 indicated a minimum sample of 2082 patients, which is larger than the available sample, raising concerns about possible overfitting issues (step 6). Multiple imputations were used to impute missing covariate data. The authors expected no missing data when using the model in practice (step 7).
A Bayesian logistic mixed effects prediction model was developed, which accounted for several observations within patients. Regression coefficients were penalised through a Laplace prior distribution to address possible overfitting (step 8). Model calibration was examined in a calibration plot ( fig 2 , upper panel), and discrimination was assessed using the AUC (area under the receiver operating characteristic curve). Both assessments were corrected for optimism through a bootstrap validation procedure (described in box 2 ), with 500 bootstrap samples created for each imputed dataset. The optimism corrected calibration slope was 0.91, and the optimism corrected AUC was 0.65; this value corresponds to low to moderate discriminatory ability, comparable to or exceeding previous RRMS models (steps 9 and 10). A decision curve analysis was performed to assess the clinical utility of the model ( fig 2 , lower panel). The analysis indicated that deciding to intensify or not intensify the treatment using information from the model is preferable to simpler strategies (do not intensify treatment, and intensify treatment for all) for thresholds between 15% and 30%. Therefore, the model is useful to guide decisions in practice only if we value the avoidance of relapse 3.3–6.6 times more than the risks and inconveniences of more intensive treatments (step 11). Among the included predictors, younger age, higher expanded disability status scale scores, and shorter durations since the last relapse were associated with higher odds of experiencing a relapse in the next two years according to the estimated regression coefficients. However, none of the predictors were modifiable factors (step 12). The model was implemented in a freely available R-shiny 114 web application, where patients, doctors, and decision makers can estimate the probability of experiencing at least one relapse within the next two years (https://cinema.ispm.unibe.ch/shinies/rrms/).
To enable reproducibility, all code was made publicly available at https://github.com/htx-r/Reproduce-results-from-papers/tree/master/PrognosticModelRRMS (step 13).
Results from a model predicting the probability of a patient with relapsing-remitting multiple sclerosis experiencing a relapse in the next two years. Figures adapted from Chalkou et al. 119 Upper panel: calibration plot. Solid blue line shows calibration using a LOESS (locally estimated scatterplot smoothing line), and shaded area shows 95% confidence intervals. Dotted blue line corresponds to perfect calibration. Maximum predicted probability was around 60% for this example. The model is well calibrated for predicted probabilities lower than 35%. Lower panel: decision curve analysis comparing net benefit of three strategies deciding on whether to intensify treatment in patients with relapsing-remitting multiple sclerosis (from no treatment to first line treatment, or from first line to second line treatment, etc). The strategies are to continue current treatment (do not intensify), to intensify treatment for all, or to intensify treatment according to predictions from model considering probability of experiencing a relapse in next two years—ie, if predicted probability is higher than a threshold (shown on x axis), then the treatment can be intensified
Our appendix is available online at https://github.com/esm-ispm-unibe-ch/R-guide-to-prediction-modelling , where we provide R code covering many aspects of the development of prediction models. The code uses simulated datasets and describes the case of continuous, binary, time-to-event, and competing risk outcomes. The code covers the following aspects: sample size calculations, multiple imputation, modelling nonlinear associations, assessing apparent model performance, performing internal validation using bootstrap, internal-external validation, and decision curve analysis. Readers should note that the appendix does not cover all possible modelling methods, models, and performance measures that can be used. Moreover, parts of the code are based on previous publications. 16 18 Additional code is provided elsewhere, for example, by Zhou and colleagues. 17
This tutorial provides a step-by-step guide to developing and validating clinical prediction models. We stress that this is not a complete and exhaustive guide, and it does not aim to replace existing resources. Our intention is to introduce essential aspects of clinical prediction modelling. Figure 3 provides an overview of the proposed steps.
Graphical overview of 13 proposed steps for developing a clinical prediction model. TRIPOD=transparent reporting of a multivariable prediction model for individual prognosis or diagnosis
In principle, most steps we have described apply to traditional statistical and machine learning approaches, 14 with some exceptions. For example, the structure of a machine learning model is often defined during model development and so will not be known a priori. Consequently, using the final model for multiple imputations, as we discussed in step 7, might not be possible. Further, bootstrapping, which we recommended as the method of choice for internal validation, might not be computationally feasible for some machine learning approaches. Moreover, some machine learning approaches might require additional development steps to ensure calibration. 94 121 122
We trust that our presentation of the key concepts and discussion of topics relevant to the development of clinical prediction models will help researchers to choose the most sensible approach for the problem at hand. Moreover, the paper will hopefully increase awareness among researchers of the need to work in diverse teams, including clinical experts, methodologists, and future model users. Similar to guidance on transparent reporting of research, adopting methodological guidance to improve the quality and relevance of clinical research is a responsibility shared by investigators, reviewers, journals, and funders. 123
Contributors: OE conceived the idea of the project and wrote the first draft of the manuscript. KC performed the analysis of the real example in relapsing-remitting multiple sclerosis. MS and OE prepared the online supplement. ME and GS contributed concepts and revised the manuscript. All authors contributed to the final manuscript. OE is the guarantor of the article. ME and GS contributed equally to the manuscript as last authors. The corresponding author attests that all listed authors meet authorship criteria and that no others meeting the criteria have been omitted.
Funding: OE and MS were supported by the Swiss National Science Foundation (SNSF Ambizione grant 180083). ME was supported by special project funding from the SNSF (grant 32FP30-189498) and funding from the National Institutes of Health (5U01-AI069924-05, R01 AI152772-01). KC and GS were supported by the HTx project, funded by the European Union's Horizon 2020 research and innovation programme, 825162. The funders had no role in considering the study design or in the collection, analysis, interpretation of data, writing of the report, or decision to submit the article for publication.
Competing interests: All authors have completed the ICMJE uniform disclosure form at www.icmje.org/disclosure-of-interest/ and declare support from the Swiss National Science Foundation, National Institutes of Health, and European Union's Horizon 2020 research and innovation programme for the submitted work; no financial relationships with any organisations that might have an interest in the submitted work in the previous three years; no other relationships or activities that could appear to have influenced the submitted work.
Provenance and peer review: Not commissioned; externally peer reviewed.
This is an Open Access article distributed in accordance with the terms of the Creative Commons Attribution (CC BY 4.0) license, which permits others to distribute, remix, adapt and build upon this work, for commercial use, provided the original work is properly cited. See: http://creativecommons.org/licenses/by/4.0/ .
Jingyu Jiang, Hanchao Wang.
Compliant bistable mechanisms are specialized mechanisms that have specific self-locking characteristics in two positions. They are widely used in aerospace, micro-electromechanical systems, and high-precision manufacturing. The coupling of kinematic and elastomechanical behaviors in compliant mechanisms, known as kinetostatics, increases the difficulty of synthesizing them. Currently, most research relies on optimization approaches to find compliant mechanisms that meet motion requirements. To address this challenge, this paper proposes a geometric synthesis method for compliant bistable mechanisms to solve the rigid guidance problem. The pole similarity transformation characteristics of planar beams and the static equilibrium characteristic of bistable mechanisms at stable positions are utilized to decouple the kinematic synthesis and static analysis. The proposed method introduces a task-driven synthesis process, where the critical structural parameters in compliant mechanisms are determined based on the desired guidance positions of motion tasks. This approach eliminates the need for a tedious and time-consuming iterative optimization process. The resulting bistable mechanisms have two stable positions that correspond to the desired guidance positions of the motion task. To illustrate the effectiveness of the geometric synthesis method, a two-position problem of a compliant bistable mechanism is provided as an example.
Jiang, J., Lin, S., and Wang, H.: Task-driven geometric synthesis method of a bistable compliant mechanism for the rigid guidance problem, Mech. Sci., 15, 515–529, https://doi.org/10.5194/ms-15-515-2024, 2024.
The synthesis of mechanisms has long been a topic of study in the field of rigid mechanisms. One classic problem in this area is the rigid guidance problem, which has been extensively researched for hundreds of years. Various efficient and accurate synthesis methods, such as the geometric method, analytical method, and atlas method, have been proposed to address this problem (McCarthy and Soh, 2010). However, when it comes to compliant mechanisms, the situation is different. The motion transmission in compliant mechanisms primarily relies on the deformation of compliant components (usually planar beams) under external forces. This characteristic makes it difficult to carry out kinematic design and static analysis independently, which poses significant challenges in the synthesis of compliant mechanisms (Howell et al., 2013; Lobontiu, 2002).
If the length of the compliant components is similar to the length of the rigid components, the geometric nonlinearity caused by large deformation must be considered (Kimball and Tsai, 2002). At present, many methods have been developed to analyze the large deflection of planar beams, such as the elliptic integral method, the pseudo-rigid-body model (PRBM), and the beam constraint model (BCM). The elliptic integral method is a classic solution for high-precision large deformation problems of planar beams. This method originated from the elastica problem and was introduced into the analysis of compliant mechanisms to solve the large deflection problem under different tip loads (Shoup and McLarnan, 1971). Zhang and Chen (2013) extended this method and provided a comprehensive solution of elliptic integrals for large deflection problems, which can solve the problem of multiple inflection points in compliant beam deformation. Holst et al. (2011) and others improved the accuracy of the elliptic integral method by introducing axial deflection and applying it to fixed-guided beams. Although the final results of the elliptic integral method need to be obtained through elliptic integral tables, as an analytical solution for large deflection problems, the method provides the most accurate results for compliant beam deformation. Based on the elliptic integral method, Wang and Xu (2017) conducted an analysis of the kinetostatics of an XY micro-positioning stage with negative stiffness. Based on the results of the elliptic integral method, Midha et al. (2000) proposed a pseudo-rigid-body model (PRBM), which approximates compliant beams as a rigid link mechanism with torsion springs and decouples the kinematics and static analysis of planar beams. Howell and Midha (1994) created a synthesis approach based on PRBM, which provides a practical means for analyzing and designing compliant mechanisms.
PRBM simplifies the geometric nonlinearity problem of compliant beams down to a very intuitive rigid mechanism model but at the cost of reducing the accuracy of motion analysis. Furthermore, several improved models, including PRBM with axial springs (Saxena and Kramer, 1998), PRBM with variable parameters (Dado, 2001), 2R PRBM (Yu et al., 2012), 3R PRBM (Su, 2009; Lin et al., 2021), and 5R PRBM (Yu and Zhu, 2017), were proposed and applied in the design of compliant mechanisms with large deflection. Various compliant mechanisms with special characteristics, including compliant beams with inflection points (Zhu and Yu, 2017), compliant beams with contact (Jin et al., 2020), three-dimensional (3D) compliant beam deformation (Chase et al., 2011), and initially curved compliant beam deformation (Kalpathy Venkiteswaran and Su, 2017), can be analyzed and designed using PRBM. Another widely used method for modeling compliant mechanisms is the beam constraint model (BCM). BCM, proposed by Awtar et al. (2006), provides a closed-form model of a planar beam within an intermediate deformation range. Ma and Chen (2015) proposed a chain-beam constraint model (CBCM) to solve large deformation problems based on the BCM. CBCM can obtain the displacement at each node on the planar beam, making it more suitable for general compliant mechanism design problems. In addition, energy-minimization-based kinetostatic solutions are also used in the design of compliant mechanisms. For example, Turkkan and Su (2017), Turkkan et al. (2018), and Jiang et al. (2023) have all proposed design methods for compliant mechanisms based on the principle of minimum potential energy combined with optimization methods. Chen et al. (2017) also proposed a design method for compliant mechanisms based on the Crotti–Engesser theorem.
Bistable mechanisms are a type of compliant mechanism with special energy characteristics. Within their range of motion, there are positions or deformed states with local minima of strain energy, which are referred to as the stable positions or stable equilibrium positions of the mechanism. The mechanism can remain in a stable equilibrium position without relying on external forces and can return to the stable equilibrium position after being disturbed by external forces. This self-sustaining characteristic of compliant bistable mechanisms makes them highly valuable in specific rigid guidance problems. Currently, the most common design method for bistable mechanisms is to utilize the buckling characteristic of planar beams. Sönmez and Tutum (2008) and Zhao et al. (2008) established models of bistable mechanisms with hinged and fixed connections at both ends of a buckled beam. To avoid higher-order buckling states during deformation, Qiu et al. (2004), Hussein et al. (2019), Hussein et al. (2020), and Haddab et al. (2018) proposed buckling models of curved beams and used them to create linear bistable mechanisms. In order to provide more adjustable parameters for the design of bistable mechanisms, scholars such as Parkinson et al. (2000), Chen et al. (2021), Todd et al. (2010), and Tran and Wang (2017) proposed multi-segment planar beam bistable mechanisms. Another way to obtain the desired mechanical performance of a bistable mechanism is to use planar beams with special shapes obtained through topology optimization (Chen et al., 2019) or other optimization methods (Chi et al., 2019). Building upon this, to address the issue of axial stiffness reduction after buckling of planar beams, Nathan and Howell (2003), Wilcox and Howell (2005), Han et al. (2017), and others proposed bending-torsion planar beam configurations for designing planar bistable mechanisms with linear motion. Additionally, Jiang et al. (2024) proposed a synthesis method of series-based bistable compliant mechanisms for the rigid-body guidance problem based on the geometrical similarity transformation. Sargent et al. (2020) proposed an origami-based bistable mechanism for use in medical support systems. Huang et al. (2020) designed a special linear bistable mechanism that needs only one actuator to switch between two stable positions.
As mentioned above, a large number of compliant mechanism design methods have been proposed and successfully applied to the design of various compliant mechanisms. Most of these methods still start from the analysis of mechanisms and find the optimal mechanisms that meet the motion task requirements through numerical optimization, especially in the field of bistable mechanism design. It is still difficult to achieve both accuracy and efficiency in large-deformation compliant mechanism design, and it is even more difficult to adjust the structural parameters of the mechanism purposefully based on motion tasks. Therefore, this paper proposes a synthesis method for the compliant bistable mechanism based on the pole similarity transformation. This method utilizes the special properties in the geometric transformation process of planar beams and the static equilibrium characteristics of stable positions in bistable compliant mechanisms to directly select and determine the structural parameters in compliant mechanisms according to the given motion tasks. The rigid guidance problem for compliant mechanisms with two stable positions is solved through this method.
This paper is organized as follows: Sect. 2 presents the basic theories involved in this paper, including the deformation behavior of planar beams, the solution of the poles of planar beams, and the similarity transformation characteristics of planar beams. Section 3 introduces the synthesis method for two-position bistable mechanisms, including the description of motion tasks, the solution of the mechanism's structural parameters, and the general process of bistable mechanism synthesis. Sections 4 and 5 provide a specific synthesis case and validate the design results through simulations and experiments. In Sect. 6, we discuss the experimental results and propose future research directions.
Planar beams are the primary elements in compliant mechanisms that transmit motion and force. This study initially determines the deformation behavior of planar beams. The motion of the beam’s tip is then described using the pole and pole angle. Lastly, the study presents the similarity transformation characteristics of the pole, which establishes the relationship between the structural parameters of planar beams and the motion of the beams' tips.
The deformation of the planar beam in this paper is based on the Bernoulli–Euler beam theory, in which the relationship between the sectional bending moment and the beam curvature of the planar beam is as follows:
where M b represents the bending moment of the cross section, d θ / d s represents the angular deformation rate (curvature) along the beam, E represents Young’s modulus of the material, and I represents the moment of inertia of the beam. The curvature, κ , can be further calculated by the deformation of a flexible beam as follows:
As shown in Fig. 1 , for a planar beam subjected to concentrated loads and bending moments at the beam's tip, the sectional bending moment on the beam can be calculated using loads and the coordinates of the deformed beam as follows:
where F x = F cos ( φ ) represents the component of the load in the x direction and F y = F sin ( φ ) represents the component of the load in the y direction. φ is the angle between the load F and the x direction. ( a , b ) represents the coordinates of the end of the beam after deformation. In addition, as shown in Fig. 1 , an equivalent action line of force, ℓ F , can be found, which is at distance d e from the end of the beam, where d e = M e / F .
Figure 1 The deformation and bending moment diagram of the planar beam under the loads.
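As a rough numerical cross-check of the beam model above, the following sketch integrates the Bernoulli–Euler relation for a cantilever carrying a tip force and a tip moment. It uses a fourth-order Runge–Kutta scheme with a secant shooting iteration on the root curvature, which is a swapped-in numerical technique and not the analytical solution used in this paper. The function names, the counterclockwise-positive sign convention, and the normalized values E I = 1 and L = 1 are assumptions made for illustration.

```python
import math

def integrate_beam(kappa0, F, phi, EI, L, n=400):
    """Integrate (theta, kappa, x, y) along the arc length from the clamped end to the tip."""
    def deriv(state):
        theta, kappa, _x, _y = state
        # d(theta)/ds = kappa; EI * d(kappa)/ds = -F sin(phi - theta) (assumed sign convention)
        return (kappa, -F / EI * math.sin(phi - theta), math.cos(theta), math.sin(theta))

    h = L / n
    state = (0.0, kappa0, 0.0, 0.0)  # clamped at the origin, tangent along x
    for _ in range(n):
        k1 = deriv(state)
        k2 = deriv(tuple(s + 0.5 * h * k for s, k in zip(state, k1)))
        k3 = deriv(tuple(s + 0.5 * h * k for s, k in zip(state, k2)))
        k4 = deriv(tuple(s + h * k for s, k in zip(state, k3)))
        state = tuple(s + h / 6.0 * (p + 2 * q + 2 * r + w)
                      for s, p, q, r, w in zip(state, k1, k2, k3, k4))
    return state

def solve_tip(F, phi, M_e, EI, L):
    """Shoot on the root curvature until the tip curvature equals M_e / EI."""
    target = M_e / EI

    def residual(k0):
        return integrate_beam(k0, F, phi, EI, L)[1] - target

    k_prev, k_cur = 0.0, (M_e + F * L * math.sin(phi)) / EI  # small-deflection guess
    g_prev, g_cur = residual(k_prev), residual(k_cur)
    for _ in range(50):
        if abs(g_cur) < 1e-12 or g_cur == g_prev:
            break
        k_prev, k_cur, g_prev = (k_cur,
                                 k_cur - g_cur * (k_cur - k_prev) / (g_cur - g_prev),
                                 g_cur)
        g_cur = residual(k_cur)
    theta_e, _kappa_e, a, b = integrate_beam(k_cur, F, phi, EI, L)
    return a, b, theta_e

# Transverse tip force with F L^2 / (E I) = 1 and no tip moment (illustrative values).
a, b, theta_e = solve_tip(F=1.0, phi=math.pi / 2, M_e=0.0, EI=1.0, L=1.0)
```

For this normalized load the sketch gives a tip deflection b of roughly 0.30 L, in line with classical large-deflection results for an end-loaded cantilever, and for very small loads it reduces to the linear value F L^3 / (3 E I).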
By substituting Eq. ( 3 ) into Eq. ( 1 ) and differentiating both sides of the equation, we obtain the following:
Since d x / d s = cos θ and d y / d s = sin θ , Eq. ( 4 ) can be further simplified as follows:
After applying the chain rule of differentiation, the left side of Eq. ( 5 ) can be written as follows:
By substituting d θ / d s = κ into Eq. ( 6 ), we obtain
By substituting Eq. ( 7 ) into Eq. ( 5 ) and integrating both sides of the equation, we obtain the following:
At θ = θ e , we can establish the boundary condition – that is, κ e = M e / ( E I ) ; therefore, we can obtain the following:
By substituting Eq. ( 9 ) into Eq. ( 8 ), we obtain the following:
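For reference, the chain of steps described above can be written out as follows. This is a sketch rebuilt from the surrounding definitions, assuming counterclockwise-positive moments and tip loads F_x, F_y, and M_e acting at (a, b); the signs may differ from the authors' convention, and C is an integration constant introduced here.

```latex
\begin{align*}
M_b &= EI\,\frac{d\theta}{ds} = EI\,\kappa, \\
M_b &= M_e + F_y\,(a - x) - F_x\,(b - y), \\
EI\,\frac{d^2\theta}{ds^2} &= -F_y\,\frac{dx}{ds} + F_x\,\frac{dy}{ds}
  = -F_y\cos\theta + F_x\sin\theta, \\
\frac{d^2\theta}{ds^2} &= \frac{d\kappa}{d\theta}\,\frac{d\theta}{ds}
  = \kappa\,\frac{d\kappa}{d\theta}, \\
\frac{EI}{2}\,\kappa^2 &= -F_y\sin\theta - F_x\cos\theta + C, \\
C &= \frac{M_e^2}{2EI} + F_y\sin\theta_e + F_x\cos\theta_e, \\
\kappa^2 &= \frac{M_e^2}{(EI)^2}
  + \frac{2}{EI}\Bigl[F_y\,(\sin\theta_e - \sin\theta) + F_x\,(\cos\theta_e - \cos\theta)\Bigr].
\end{align*}
```

With F_x = F cos φ and F_y = F sin φ, the bracket equals F [cos(φ − θ_e) − cos(φ − θ)], and for a purely transverse load it reduces to the classical form κ² = (2F / EI)(sin θ_e − sin θ).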
We define the first term of Eq. ( 11 ) as the load ratio, η , of the planar beam as follows:
When φ ∈ [ 0 , π ] , with α 2 = F L 2 sin φ / ( E I ) , Eq. ( 10 ) can be rewritten as
By separating variables and integrating Eq. ( 13 ), we can obtain the relationship between α and the rotational angle, θ e , as follows:
When the angle of loads, φ ∈ [ 0 , π ] ; the rotational angle, θ e ; and the load ratio, η , are provided, the load of the planar beam, F , can be calculated as follows:
The coordinates of the beam's tip can be calculated as follows:
Similarly, when the angle of loads is φ ∈ ( π , 2 π ] , with α 2 = - F L 2 sin φ / ( E I ) , we have
As shown in Fig. 2 , the relative positional relationship between the beam's tip positions, B 1 and B 2 , can be described by pole P and its corresponding rotation angle, ϑ . According to the definition of the pole, the position coordinates of pole P and the rotation angle, ϑ , can be calculated by the following formulas:
where B 1 = L + i 0 represents the tip position of the planar beam in its natural state and B 2 = a + i b represents the tip position of the planar beam carrying the tip loads F and M e . Expanding this formula yields the coordinates of the pole as follows:
Moment M O of the planar beam at frame O is
where M e = √( 2 F η E I sin φ ) . The position of the equivalent load line, ℓ F , can be determined by its intersection point with the x axis, R , as follows:
The distance, d P , between pole P and the equivalent load line, ℓ F can be calculated by the vector product of the vector R P and equivalent load line, ℓ F , as follows:
Figure 2 The pole of the deformed planar beam.
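The pole of the tip displacement can be illustrated with complex numbers. The sketch below treats the pole as the fixed point of the planar displacement that carries B 1 = L + i 0 to B 2 = a + i b while rotating the tip by ϑ, and it assumes that ϑ equals the tip rotation θ e. The function name `tip_pole` and the sample values are hypothetical.

```python
import cmath
import math

def tip_pole(L, a, b, theta_e):
    """Pole of the displacement from B1 = L + 0i to B2 = a + ib with rotation theta_e."""
    B1 = complex(L, 0.0)
    B2 = complex(a, b)
    rot = cmath.exp(1j * theta_e)
    if abs(1 - rot) < 1e-12:
        raise ValueError("pure translation: the pole lies at infinity")
    # Fixed point P of z -> B2 + rot * (z - B1), i.e. P (1 - rot) = B2 - rot * B1
    return (B2 - rot * B1) / (1 - rot)

# Illustrative check: a tip turned by 90 degrees about the point (0.5, 0.5)
P = tip_pole(L=1.0, a=1.0, b=1.0, theta_e=math.pi / 2)
```

Writing the displacement as a single rotation about P is what allows the later transformations to act on the pole rather than on the beam shape directly.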
Pole P of the planar beam completely describes the relative position of the beam's tip. For any planar beam, the rotation direction of the beam's tip, the position of the pole, the distance between the pole and equivalent load line, and the rotation angle of the equivalent load line can be adjusted by the similarity transformation.
Planar beams can change the rotation direction by mirroring the load along the x axis. As shown in Fig. 3 a, to change the rotation direction of the planar beam, the equivalent force line of the planar beam, ℓ F ′ , needs to be flipped along the x axis. In this scenario, the load angle is φ ′ = - φ , the pole position is ( x P ′ , y P ′ ) = ( x P , - y P ) , and the pole angle is ϑ ′ = - ϑ .
Figure 3 The similarity transformation of deformed planar beam: (a) mirror transformation, (b) translation transformation, (c) rotation transformation, and (d) scale transformation.
Planar beams can adjust the pole position by translating the frame, O . As shown in Fig. 3 b, when pole P of the beam is translated along vector τ , its tip position, ( a ′ , b ′ ) ; equivalent force line, ℓ F ′ ; and frame, O ′ , also undergo the same translation motion. After translation, the magnitude of the load, F ′ ; the angle of the equivalent force line, φ ′ ; and the distance from the pole to the equivalent force line, d P ′ , all remain unchanged.
Planar beams can adjust the angle of the equivalent force line by rotating the frame O . As shown in Fig. 3 c, when the frame O rotates around the pole by an angle, γ , its tip position, ( a ′ , b ′ ) , and the equivalent force line, ℓ F ′ , also undergo the same rotational motion. After rotation, the magnitude of load F ′ and the distance from the pole to the equivalent force line d P ′ remain unchanged. The angle between the equivalent force line and the positive direction of the x axis changes to φ ′ = φ + γ .
The planar beam can adjust the distance from the pole to the equivalent force line by proportionally changing the beam's length, L . As shown in Fig. 3 d, when the beam's length is scaled by proportion μ , according to Eqs. ( 16 ), ( 15 ), and ( 19 ), the scaled beam's tip position is ( a ′ , b ′ ) = ( μ a , μ b ) , the scaled pole position is ( x P ′ , y P ′ ) = ( μ x P , μ y P ) , and the scaled load is F ′ = F / μ 2 . On this basis, by substituting the scaled parameters ( a ′ , b ′ ) , ( x P ′ , y P ′ ) , and F ′ into Eqs. ( 20 ), ( 21 ), and ( 22 ), it can be concluded that the scaled distance from the pole to the equivalent force line is d P ′ = μ d P .
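The four transformations can be sketched on complex-valued points to check the stated invariants: the distance from the pole to the equivalent force line is unchanged by mirroring, translation, and rotation, and is multiplied by μ under scaling, while the load changes to F / μ 2. The helper names and the sample pole, line point, and angles below are hypothetical.

```python
import cmath
import math

def mirror_x(z):
    """Mirror a point about the x axis."""
    return z.conjugate()

def translate(z, tau):
    """Translate a point by the vector tau."""
    return z + tau

def rotate_about(z, center, gamma):
    """Rotate a point about `center` by the angle gamma (radians)."""
    return center + (z - center) * cmath.exp(1j * gamma)

def scale(z, mu):
    """Scale a point about the origin (the frame O) by the factor mu."""
    return mu * z

def scaled_load(F, mu):
    """Load after scaling the beam length by mu, as stated in the text: F' = F / mu^2."""
    return F / mu ** 2

def point_line_distance(p, q, angle):
    """Distance from point p to the line through q with direction `angle` (radians)."""
    d = cmath.exp(1j * angle)
    w = p - q
    return abs(w.real * d.imag - w.imag * d.real)

# Hypothetical pole P, point R on the force line, and load angle phi
P, R, phi = 1 + 2j, 3 + 0j, math.radians(60)
d_P = point_line_distance(P, R, phi)
d_scaled = point_line_distance(scale(P, 2.0), scale(R, 2.0), phi)  # equals 2 * d_P
```

Because only scaling changes d P, the scale factor can be chosen from the required distance alone, and the remaining transformations then position and orient the beam without disturbing it.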
Through the analysis of the deformation behavior of the planar beam, the load required for achieving a given beam's tip rotation is determined. The relative positional relationship between the two positions of the beam's tip is described using the pole. Through the pole similarity transformation, the pole position and the equivalent force line of the beam can be flexibly adjusted within the motion plane. In order to ensure that the rigid components of the compliant mechanism are in a stable state at specified positions, it is necessary to arrange the planar beams of the compliant mechanism into suitable positions. This paper utilizes the characteristics of the similarity transformation to adjust the pole position of the planar beams, ultimately establishing a synthesis method for the rigid-body guidance problem of compliant bistable mechanisms.
Figure 4 shows the two stable positions of the bistable mechanism to be designed in this paper, which consists of three planar beams and two rigid components. In the natural state, the flexible beams are located at O a B 1 a , O b B 1 b , and O c B 1 c , respectively. The three beams are connected to the rigid components B 1 a D 1 a C 1 a O c and B 1 b D 1 b C 1 b B 1 c . O a B 2 a , O b B 2 b , and O c ′ B 2 c ′ represent the deformation states of the three planar beams when B 2 a D 2 a C 2 a O c ′ and B 2 b D 2 b C 2 b B 2 c ′ are the rigid components at the second stable position. In this paper, the two given positions of the first rigid component are D 1 a C 1 a and D 2 a C 2 a , while the two given positions of the second rigid component are D 1 b C 1 b and D 2 b C 2 b , respectively. The motion task of the rigid guidance problem can be described by the poles. As shown in Fig. 5 , the pole of the first component, P a , can be calculated by
while the pole of the second component, P b , can be calculated by
Figure 4 The illustration of the stable positions of the compliant bistable mechanism.
Figure 5 The pole of the two different positions in the rigid-body guidance problem.
3.2.1 The poles of planar beams in the designed compliant mechanism
In the compliant mechanism, two planar beams, O a B 1 a and O b B 1 b , are connected to the frame on one tip and to the rigid components on the other tip, so their poles should be consistent with the poles of the motion task, P a ( ϑ a ) and P b ( ϑ b ) . The two tips of the third planar beam, O c B 1 c , are connected to two rigid components, respectively; thus, there are two poles, P c and P c ′ , that correspond to two stable equilibrium positions. As shown in Fig. 6 , pole P c ′ can be viewed as the result of rotating P c around P a by ϑ a or as the result of rotating P c around P b by ϑ b . Therefore, P c and P c ′ are symmetric in relation to the line P a P b . When we know the poles of two rigid components, we connect the poles P a and P b , rotate P a P b around P a by ϑ a , and then rotate P a P b around P b by ϑ b . The intersection of the two lines is the first pole of the third beam, P c . Similarly, when P a P b is rotated around P a and P b by − ϑ a and − ϑ b , respectively, the intersection is the second pole of the third beam, P c ′ . The pole of the third beam describes the relative motion between the two rigid components, and its rotation angle follows from the properties of the pole triangle, i.e., ϑ c = - ϑ a + ϑ b .
Figure 6 The pole of the three planar beams of the compliant mechanism in the motion generation problem.
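The two poles of the third beam can be computed directly from the property stated above, namely that rotating P c about P a by ϑ a and about P b by ϑ b must give the same point P c ′. The sketch below solves that condition with complex numbers instead of the graphical line construction; the function name and the sample poles and angles are hypothetical.

```python
import cmath
import math

def third_beam_poles(Pa, Pb, theta_a, theta_b):
    """Return (Pc, Pc_prime, theta_c) for the third beam from the task poles and angles."""
    ra = cmath.exp(1j * theta_a)
    rb = cmath.exp(1j * theta_b)
    if abs(ra - rb) < 1e-12:
        raise ValueError("equal rotations: the relative pole lies at infinity")
    # Solve Pa + ra * (Pc - Pa) = Pb + rb * (Pc - Pb) for Pc
    Pc = (Pb * (1 - rb) - Pa * (1 - ra)) / (ra - rb)
    Pc_prime = Pa + ra * (Pc - Pa)
    theta_c = -theta_a + theta_b  # relative rotation given in the text
    return Pc, Pc_prime, theta_c

# Hypothetical task: poles on the x axis, rotations of 30 and -40 degrees
Pc, Pc_prime, theta_c = third_beam_poles(0j, 4 + 0j, math.radians(30), math.radians(-40))
```

In this example P a and P b lie on the x axis, so P c ′ comes out as the mirror image of P c about that axis, which matches the symmetry about the line P a P b noted above.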
After determining the poles of each planar beam, it is necessary to determine the dimensional parameters and installation positions of the planar beams in the compliant mechanism. The process is as follows:
Determination of the load balance line. The first stable position of the bistable compliant mechanism is the natural state of the mechanism, in which the planar beams are not subjected to external forces. When the compliant mechanism is in the second stable position, the equilibrium of the mechanism is achieved through the interaction forces between the beams. Therefore, in order to ensure that the compliant mechanism maintains balance in the second stable position, the equivalent force lines of the three compliant beams, ℓ F k , need to coincide, and the coinciding position is the load balance line, ℓ B .
As shown in Fig. 7 , due to the opposite rotation directions of the two planar beams connected to the frame, the directions of their loads, F a and F b , are also opposite. In order to ensure the balance of forces on the rigid components, load F c on the third beam has the same direction as F a . The position and angle of the load balance line can be chosen arbitrarily, provided that the load balance line, ℓ B , intersects the two edges P a P c ′ and P b P c ′ of the pole triangle P a P b P c ′ . After the load balance line is selected, the load directions of each beam in the x O y coordinate system are already determined: the directions of F a and F c are ϕ a = ϕ c = ϕ , and the direction of F b is ϕ b = ϕ - π . Based on this, the distances d PB a , d PB b , and d PB c between the poles of the planar beams and the load balance line are also determined.
The initial solution of planar beams. The rotation angle of the beam's tip, θ e k , is determined by the pole angle of each beam. To avoid negative curvature, a mirror transformation is required for the planar beam with negative pole rotation. Taking Fig. 6 as an example, beams b and c need to be reversed using a mirror transformation. The rotation angle of the planar beams can be calculated by
The elastic modulus, E , of the planar beam needs to be determined based on the selected material, and the section height, h 0 k ; section width, b 0 k ; beam length, L 0 k ; load ratio, η 0 k ; and load angle, φ 0 k , of each planar beam can be arbitrarily chosen. After parameter selection, as shown in Fig. 8 , the position of the pole, P 0 k , the load magnitude, F 0 k , and the distance from the equivalent force line to the pole of the planar beam, d 0 k , can be calculated according to the procedures in Sect. 2.1 and 2.2. The calculated result is the initial solution of the planar beam.
The similarity transformation of planar beam. To meet the requirements of the motion task, pole P 0 k and the equivalent force line of the planar beams, ℓ F k , need to be transformed to the suitable positions ( P k and ℓ B ) in the compliant mechanism. For each planar beam, four steps of the pole similarity transformation are required.
Translation transformation. Pole P 0 k of the planar beam k is moved to the origin, and the translation vector τ 1 k = - P 0 k , where k ∈ { a , b , c } .
Scale transformation. In order to ensure that d P k = d PB k , planar beam k is subjected to a scale transformation with a scaling factor, μ k = d PB k / d 0 k , where k ∈ { a , b , c } . After the scaling transformation, the load magnitude of the planar beam also changes to F 1 k = F 0 k / ( μ k ) 2 .
Rotation transformation. In order to make ℓ F k parallel to ℓ B , planar beam k needs to undergo a rotation transformation with an angle of γ k = ϕ k - φ 0 k , where k ∈ { a , b , c } .
Translation transformation. The rotated planar beam, k , needs to be translated back to the pole of the compliant mechanism, and the translation vector is τ 2 k = P k , where k ∈ { a , b , c } .
Determination of the beams' width. After the similarity transformation, the load position and motion of the planar beam already satisfy the requirements of the compliant bistable mechanism. In order to ensure that the static equilibrium condition is satisfied at the second stable position, it is necessary to adjust the width of the planar beam b k to unify the load magnitude, F k . According to the scaled load, F 1 k , of each planar beam k , the equilibrium load at the second stable position of the compliant mechanism, F m , is selected. According to Eq. ( 15 ), the tip load of the planar beam is proportional to the inertia moment, I . For the rectangular cross section, the inertia moment, I , is proportional to the width of the cross section. Therefore, the cross-section width of each planar beam k is adjusted in proportion to the load magnitude, F 1 k , and equilibrium load F m – that is, b k = v k b 0 k , where v k = F m / F 1 k .
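The transformation steps and the width adjustment above can be strung together in a short sketch. It composes translation by − P 0 k, scaling by μ k, rotation by γ k, and translation to P k into one map on complex-valued points, then rescales the load and the width. The function names and all numerical inputs are hypothetical placeholders, not values from the tables of this paper.

```python
import cmath
import math

def place_beam(points, P0, d0, phi0, F0, Pk, d_PB, phi_k):
    """Move an initial beam solution so its pole sits at Pk and its force line matches the balance line."""
    mu = d_PB / d0            # scale factor so that the pole-to-line distance becomes d_PB
    gamma = phi_k - phi0      # rotation that makes the force line parallel to the balance line
    spin = cmath.exp(1j * gamma)
    # translate pole to origin -> scale -> rotate -> translate to the target pole
    placed = [Pk + spin * (mu * (z - P0)) for z in points]
    F1 = F0 / mu ** 2         # load after scaling
    return placed, mu, gamma, F1

def adjusted_width(b0, F1, Fm):
    """Width that brings the scaled load F1 to the equilibrium load Fm (load proportional to width)."""
    return (Fm / F1) * b0

# Hypothetical initial solution: frame at 0, tip at 1, pole at 0.5 + 0.5i
points = [0j, 1 + 0j, 0.5 + 0.5j]
placed, mu, gamma, F1 = place_beam(points, P0=0.5 + 0.5j, d0=0.2, phi0=math.radians(60),
                                   F0=2.0, Pk=3 + 1j, d_PB=0.6, phi_k=math.radians(194.04))
b_new = adjusted_width(b0=0.1, F1=F1, Fm=0.5)
```

Because every step is a similarity transformation about the pole, the deformed beam shape is only moved, turned, and resized; the width change then tunes the load magnitude without altering the geometry.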
Figure 7 The load balance line of the compliant bistable mechanism.
Figure 8 The initial solutions of the three planar beams in the compliant bistable mechanism.
Based on the pole similarity transformation, the synthesis process of the bistable compliant mechanism is shown in Fig. 9 . First, the poles and corresponding pole angles of the rigid components and planar beams in the compliant mechanism are determined based on the given positions of the motion task. Second, the load balance line and the magnitude of the equilibrium load for the compliant mechanism are selected. Then, the initial solution of the planar beams is obtained based on the geometric and mechanical features, and the compliant mechanism solution that satisfies the motion task requirements and static equilibrium conditions is obtained through similarity transformation. Finally, the relevant parameters of the planar beams and rigid components are output for the specific design of the compliant mechanism. After defining the input and output of the program, this synthesis process can be automatically completed using MATLAB software.
Figure 9 General synthesis process of compliant bistable mechanism.
This section takes the planar two-position rigid-body guidance mechanism as an example and designs a bistable compliant mechanism based on the proposed synthesis process. The guidance positions of the rigid components are D 1 a C 1 a , D 2 a C 2 a and D 1 b C 1 b , D 2 b C 2 b . Table 1 shows the motion task parameters.
Table 1 Parameters of motion task.
According to the proposed synthesis process, the geometric features of the motion task and the compliant mechanism are first extracted. Poles P a and P b are determined according to Eqs. ( 23 ) and ( 24 ). Poles P c and P c ′ are obtained by rotation and pole angle of ϑ c = - ϑ a + ϑ b . The specific results are shown in Table 2 .
Table 2 Geometric features of the motion task and the compliant mechanism.
The position of the load balance line is selected according to the position of the poles. The load balance line passes through points (0.5 cm, 2.5 cm) and (8.5 cm, 4.5 cm), and ϕ = 194.04°. The magnitude of the equilibrium load is F m = 0.5 N. The distances from poles P a , P b , and P c ′ to the load balance line, ℓ B , are determined to be d PB a = 0.606 cm, d PB b = 0.606 cm, and d PB c = 0.666 cm, respectively. The material selected for the planar beams is 65Mn spring steel, with an elastic modulus of E = 210 000 MPa.
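The reported direction of the load balance line can be checked from its two given points. The sketch below assumes that ϕ is measured counterclockwise from the positive x axis and that the line is directed from (8.5 cm, 4.5 cm) towards (0.5 cm, 2.5 cm), which is the orientation that reproduces the stated value; the helper names are hypothetical, and the pole coordinates are not used here.

```python
import math

def line_direction_deg(p_from, p_to):
    """Direction of the line from p_from to p_to, in degrees within [0, 360)."""
    angle = math.degrees(math.atan2(p_to[1] - p_from[1], p_to[0] - p_from[0]))
    return angle % 360.0

def point_line_distance(p, q1, q2):
    """Distance from point p to the line through q1 and q2 (same length unit as the inputs)."""
    dx, dy = q2[0] - q1[0], q2[1] - q1[1]
    cross = dx * (p[1] - q1[1]) - dy * (p[0] - q1[0])
    return abs(cross) / math.hypot(dx, dy)

q1, q2 = (0.5, 2.5), (8.5, 4.5)      # points on the load balance line, in cm
phi_deg = line_direction_deg(q2, q1)  # about 194.04 degrees
```

The same `point_line_distance` helper gives the distances d PB k once the pole coordinates are known.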
The initial parameters of the planar beams need to be selected, including the initial length, L 0 k = 1 cm; initial width, b 0 k = 0.1 cm; and initial height, h 0 k = 0.005 cm. The load ratio, η 0 k , and angle of the equivalent force line, φ 0 k , are shown in Table 3 . Based on these parameters, the initial solutions of the planar beams are determined. As shown in Table 4 , the similarity transformation parameters of the three beams are calculated according to Sect. 3.2.2 . After transforming the planar beams to the corresponding pole positions, all parameters of the compliant mechanism can be determined. The specific results can be found in Table 5 and Fig. 10 .
Table 3 Parameters for the initial solution of compliant mechanism.
Table 4 Similarity transformation parameters of compliant mechanism.
Table 5 Parameters of final solution of compliant mechanism.
Figure 10 The synthesis results of compliant bistable mechanism and their comparison with finite-element analysis (FEA).
The synthesis process of the bistable compliant mechanism for the rigid-body guidance problem is now complete. The designed compliant mechanism is modeled in finite-element analysis (FEA) software to verify the motion accuracy and bistable characteristics. As shown in Fig. 11 a, the models of the planar beams are imported into the FEA software, and rigid constraints are established between the beams to simulate the connection of rigid components in the compliant mechanism. Two reference points are established at D 1 a and D 1 b , and they are fixed to the tips of slender beams a and b . A rotation constraint is applied at reference point D 1 a , and the first rigid component is driven to rotate counterclockwise by Δ β a = 35°, so that the mechanism reaches and passes the second stable position. The deformation of the mechanism at the initial position and the second stable position is shown in Fig. 11 b. The strain energy of the mechanism and the second rigid component's motion during the movement are shown in Fig. 12 . It can be observed that in the simulation, the strain energy of the mechanism reaches a local minimum when the rotation angle of the first rigid component reaches 30°. At this point, the rotation angle of the second rigid component is − 40°, and the coordinates of reference points D 1 a and D 1 b are (1.87 cm, 2.50 cm) and (8.23 cm, 4.64 cm), respectively, indicating that the designed mechanism meets the requirement of stable positions in the motion task. The beam shapes at the second stable position in the simulation are shown in Fig. 10 .
Figure 11 The FEA results regarding the strain energy of the compliant mechanism.
Figure 12 The FEA results regarding the strain energy of the compliant mechanism and second rigid component's motion.
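The criterion used above, a local minimum of strain energy marking a stable position, can be checked on any sampled energy curve. The sketch below uses a synthetic quartic energy function chosen only so that its interior minimum falls at 30°; it is not the FEA data of Fig. 12, and the function names are hypothetical.

```python
def interior_local_minima(angles, energies):
    """Return the angles at which a sampled energy curve has a strict
    local minimum, ignoring the two end points."""
    minima = []
    for i in range(1, len(energies) - 1):
        if energies[i] < energies[i - 1] and energies[i] < energies[i + 1]:
            minima.append(angles[i])
    return minima


def synthetic_energy(theta):
    # Synthetic double-well curve (arbitrary units), NOT simulation output.
    # Its derivative is theta * (theta - 25) * (theta - 30).
    return theta**4 / 4 - 55 * theta**3 / 3 + 375 * theta**2


angles = [float(a) for a in range(36)]          # 0, 1, ..., 35 degrees
energies = [synthetic_energy(a) for a in angles]
stable_angles = interior_local_minima(angles, energies)
```

The initial position is an end point of the sampled range, so it is not reported by this interior test; only the second stable position is.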
The prototype is manufactured according to the synthesis results. As shown in Fig. 13 , the frame and rigid components in the mechanism are manufactured through 3D printing. The planar beams are made of spring steel (65Mn). The planar beams and rigid components are securely fastened together using bolted connections. Three yellow markers are added to the frame and rigid components using 3D printing. The second stable position of the prototype is shown in Fig. 13 . The red markers in the figure represent the results from the finite-element analysis at the second stable position, and it can be observed that the deformation of the planar beams in the prototype is consistent with the simulation results.
Figure 13 The prototype and the comparison of the beams' shape at the second stable position with FEA results.
We use a monocular ranging algorithm to determine the stable positions of the rigid components in the prototype. As shown in Fig. 14 , image-processing techniques are employed to identify the markers on the frame and rigid components. The three markers on the frame are used to determine the origin and orientation of the coordinate system for the prototype. The positions of the markers on the rigid components are then calculated through coordinate transformation. The pole angles and the locations of the poles are calculated from the markers. Any two markers on the same rigid component form a directed line segment. From the positional relationship of this directed line segment between the two stable positions, a pole and its corresponding pole angle can be determined. The results are presented in Table 6 . The maximum error for the location of the pole is 3.82 %, and the maximum error for the pole angle is 4.92 %.
Figure 14 The marker detection results of the two stable positions.
Table 6 The pole angles and location of poles of the prototype.
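The pole construction described above follows from standard planar kinematics: the pole angle is the change in direction of the directed segment, and the pole is the fixed point of the resulting rotation. The following is a minimal sketch of that calculation, assuming marker coordinates are already expressed in the frame coordinate system; the function name and the sample coordinates are hypothetical and are not taken from Table 6.

```python
import cmath
import math


def pole_from_segments(a1, b1, a2, b2):
    """Pole and pole angle of the planar displacement that carries the
    directed segment a1->b1 (first position) to a2->b2 (second position).

    Points are (x, y) tuples. Returns ((px, py), angle_in_degrees).
    """
    za1, zb1, za2, zb2 = (complex(*p) for p in (a1, b1, a2, b2))
    # Pole angle: rotation of the segment direction between the positions.
    theta = cmath.phase((zb2 - za2) / (zb1 - za1))
    rot = cmath.exp(1j * theta)
    if abs(1 - rot) < 1e-12:
        raise ValueError("pure translation: the pole lies at infinity")
    # Fixed point of z -> rot * (z - pole) + pole, solved using marker a.
    pole = (za2 - rot * za1) / (1 - rot)
    return (pole.real, pole.imag), math.degrees(theta)


# Hypothetical markers: segment rotated 90 degrees about the point (1, 1).
pole, pole_angle = pole_from_segments((2, 1), (3, 1), (1, 2), (1, 3))
```

With measured markers the segment length will differ slightly between the two images, so the pole computed from marker a and the one computed from marker b will not coincide exactly; averaging them is one simple option.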
Due to the self-holding characteristic of the bistable mechanism, the driving force or torque of the mechanism has distinctive features. Over the motion from the first to the second stable position, there are three states with zero driving force, which correspond to the positions of minimum and maximum strain energy. To measure the driving torque of the prototype, an experimental platform is set up as shown in Fig. 15 . The first rigid component is rotated by a servo motor in position mode, and the driving torque is recorded using a torque sensor. The experiment is repeated three times, and the comparison between the experimental results and the simulation is shown in Fig. 16 . The results of the three experiments show good consistency. Throughout the motion process, the driving torque has three intersections with the x axis, verifying the bistable characteristic of the mechanism. The first intersection is located at the initial position with a driving angle of Δ β a = 0°, and the third intersection is located at the second stable position with a driving angle of approximately Δ β a = 30°, which is consistent with the design goal. The second intersection is located at a driving angle of Δ β a = 26°, which is greater than the simulated result of 25°. This difference is mainly caused by manufacturing and assembly errors in the prototype.
Figure 15 The platform of the driving torque experiment.
Figure 16 The comparison of the driving torque between experiments and simulation.
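The intersections of the torque curve with the x axis can be located from sampled data by looking for sign changes and interpolating linearly between neighbouring samples. The sketch below does this on a synthetic cubic torque curve whose zeros are placed at 0°, 25° and 30° to mimic the simulated values quoted above; the curve, the sampling step and the function name are assumptions, not the measured data of Fig. 16.

```python
def zero_crossings(angles, torques):
    """Return the angles at which a sampled torque curve crosses zero.

    Exact zeros at sample points are reported directly; sign changes
    between neighbouring samples are located by linear interpolation.
    """
    roots = []
    for i, (a, t) in enumerate(zip(angles, torques)):
        if t == 0.0:
            roots.append(a)
        elif i + 1 < len(angles) and t * torques[i + 1] < 0.0:
            a_next, t_next = angles[i + 1], torques[i + 1]
            roots.append(a - t * (a_next - a) / (t_next - t))
    return roots


def synthetic_torque(theta):
    # Synthetic curve in arbitrary units, NOT experimental data.
    return theta * (theta - 25.0) * (theta - 30.0)


angles = [0.4 * i for i in range(89)]            # 0 to 35.2 degrees
torques = [synthetic_torque(a) for a in angles]
crossings = zero_crossings(angles, torques)
```

Real torque-sensor signals are noisy near zero, so some smoothing before the sign-change test would usually be needed to avoid spurious crossings.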
Poles are a geometric tool that can accurately describe multiple planar positions, and they can reveal the relationship between guidance mechanisms and given design tasks. Based on the similarity transformation of poles, this paper proposes a novel synthesis method for compliant bistable mechanisms. The synthesis example and results of simulation and prototype experiments have demonstrated the effectiveness of the proposed method. Regarding the proposed design method, future research can focus on the following two aspects:
Study on the energy and mechanical characteristics of the intermediate states in bistable mechanisms. The proposed method can design the stable positions of the mechanism based on the motion task. However, the maximum strain energy and maximum driving force between the two stable positions of the mechanism also have important research value. How to incorporate the calculations of these intermediate characteristics into the overall process of mechanism synthesis will be a key issue for future research.
Research on integrated manufacturing methods for mechanisms. One important advantage of compliant mechanisms is their ability to be manufactured in an integrated manner, eliminating the assembly process. In the proposed method, the planar beams have different widths, which requires us to manufacture and assemble the components separately. The assembly process not only limits the size of the prototype but also introduces significant assembly errors. Exploring miniaturized integrated manufacturing methods is also one of the important research topics for future studies.
This paper proposes a novel geometrical approach to bistable compliant mechanism synthesis based on the similarity transformation of poles. The study demonstrates the feasibility of decoupling the kinematic design and static analysis processes in the synthesis of bistable compliant mechanisms. At the method level, a general synthesis process for bistable compliant mechanisms is provided, simplifying the iterative process in the design of compliant mechanisms and offering an efficient synthesis tool for general compliant bistable mechanisms. In addition, this study illustrates the synthesis approach with an example, and a prototype was made.
All the code and data used in this paper can be obtained from the corresponding author upon request.
JJ and SL proposed the methodology. HW took part in the discussion of the paper.
The contact author has declared that none of the authors has any competing interests.
Publisher’s note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors.
The authors would like to thank anonymous reviewers for their valuable comments and suggestions that enabled them to revise the paper.
This research has been supported by the Natural Science Foundation of Fujian Province (grant no. 2022J05246).
This paper was edited by Engin Tanık and reviewed by two anonymous referees.
Awtar, S., Slocum, A. H., and Sevincer, E.: Characteristics of Beam-Based Flexure Modules, J. Mech. Design, 129, 625–639, 2006. a
Chase Jr., R. P., Todd, R. H., Howell, L. L., and Magleby, S. P.: A 3-D chain algorithm with pseudo-rigid-body model elements, Mech. Based Des. Struc., 39, 142–156, 2011. a
Chen, G., Ma, F., Bai, R., Magleby, S. P., and Howell, L. L.: A Framework for Energy-Based Kinetostatic Modeling of Compliant Mechanisms, in: ASME 2017 International Design Engineering Technical Conferences and Computers and Information in Engineering Conference, Cleveland, Ohio, USA, 6–9 August 2017, 5A, p. V05AT08A021, https://doi.org/10.1115/DETC2017-68205 , 2017. a
Chen, J.-q., Hao, Y.-x., and Zhang, W.: STATIC and SNAP-through Behaviors of trapezoidal BI-stable Laminates, in: 2020 15th Symposium on Piezoelectrcity, Acoustic Waves and Device Applications (SPAWDA), Zhengzhou, Henan Province, China, 16–19 April 2021, IEEE, 650–658, https://doi.org/10.1109/SPAWDA51471.2021.9445512 , 2021. a
Chen, Q., Zhang, X., Zhang, H., Zhu, B., and Chen, B.: Topology optimization of bistable mechanisms with maximized differences between switching forces in forward and backward direction, Mech. Mach. Theory, 139, 131–143, 2019. a
Chi, I. T., Tien Hoang, N., Chang, P.-L., Ngoc Dang Khoa, T., and Wang, D.-A.: Design of a bistable mechanism with B-spline profiled beam for versatile switching forces, Sensor. Actuat. A-Phys., 294, 173–184, https://doi.org/10.1016/j.sna.2019.05.028 , 2019. a
Dado, M. H.: Variable parametric pseudo-rigid-body model for large-deflection beams with end loads, Int. J. Nonlin. Mech., 36, 1123–1133, 2001. a
Haddab, Y., Aiche, G., Hussein, H., Salem, M. B., Lutz, P., Rubbert, L., and Renaud, P.: Mechanical Bistable Structures for Microrobotics and Mesorobotics from Microfabrication to Additive Manufacturing, in: 2018 International Conference on Manipulation, Automation and Robotics at Small Scales (MARSS), Nagoya, Japan, 4–8 July 2018, IEEE, 1–6, https://doi.org/10.1109/MARSS.2018.8481186 , 2018. a
Han, Q., Jin, K., Chen, G., and Shao, X.: A novel fully compliant tensural-compresural bistable mechanism, Sensor. Actuat. A-Phys., 268, 72–82, 2017. a
Holst, G. L., Teichert, G. H., and Jensen, B. D.: Modeling and experiments of buckling modes and deflection of fixed-guided beams in compliant mechanisms, J. Mech. Design, 133, 051002, https://doi.org/10.1115/1.4003922 , 2011. a
Howell, L. L. and Midha, A.: A Method for the Design of Compliant Mechanisms With Small-Length Flexural Pivots, J. Mech. Design, 116, 280–290, https://doi.org/10.1115/1.2919359 , 1994. a
Howell, L. L., Magleby, S. P., and Olsen, B. M.: Handbook of Compliant Mechanisms, John Wiley & Sons, ISBN: 9781119953456, https://doi.org/10.1002/9781118516485 , 2013. a
Huang, S.-W., Lin, F.-C., and Yang, Y.-J.: A novel single-actuator bistable microdevice with a moment-driven mechanism, Sensor. Actuat. A-Phys., 310, 111934, https://doi.org/10.1016/j.sna.2020.111934 , 2020. a
Hussein, H., Le Moal, P., Younes, R., Bourbon, G., Haddab, Y., and Lutz, P.: On the design of a preshaped curved beam bistable mechanism, Mech. Mach. Theory, 131, 204–217, https://doi.org/10.1016/j.mechmachtheory.2018.09.024 , 2019. a
Hussein, H., Khan, F., and Younis, M. I.: A symmetrical bistable mechanism from combination of pre-shaped microbeams, Sensor. Actuat. A-Phys., 306, 111961, https://doi.org/10.1016/j.sna.2020.111961 , 2020. a
Jiang, J., Lin, S., Wang, H., and Modler, N.: Modeling Method for Static Large Deflection Problem of Curved Planar Beams in Compliant Mechanisms Based on a Novel Governing Equation, J. Mech. Robot., 16, 031014, https://doi.org/10.1115/1.4062916 , 2023. a
Jiang, J., Lin, S., Wang, H., and Modler, N.: The synthesis method of series-based bistable compliant mechanisms for rigid-body guidance problem based on geometrical similarity transformation of pole maps, J. Mech. Design, 146, 103301, https://doi.org/10.1115/1.4065023 , 2024. a
Jin, M., Zhu, B., Mo, J., Yang, Z., Zhang, X., and Howell, L. L.: A CPRBM-based method for large-deflection analysis of contact-aided compliant mechanisms considering beam-to-beam contacts, Mech. Mach. Theory, 145, 103700, https://doi.org/10.1016/j.mechmachtheory.2019.103700 , 2020. a
Kalpathy Venkiteswaran, V. and Su, H.-J.: Pseudo-Rigid-Body Models of Initially-Curved and Straight Beams for Designing Compliant Mechanisms, in: Proceedings of the ASME 2017 International Design Engineering Technical Conferences and Computers and Information in Engineering Conference, Cleveland, Ohio, USA, 6–9 August 2017, ASME, 5A, v05AT08A006, https://doi.org/10.1115/DETC2017-67431 , 2017. a
Kimball, C. and Tsai, L.-W.: Modeling of Flexural Beams Subjected to Arbitrary End Loads, J. Mech. Design, 124, 223–235, 2002. a
Lin, S., Zhang, Y., Wang, H., Jiang, J., and Modler, N.: Geometric synthesis method of compliant mechanism based on similarity transformation of pole maps, Mech. Sci., 12, 375–391, https://doi.org/10.5194/ms-12-375-2021 , 2021. a
Lobontiu, N.: Compliant mechanisms: design of flexure hinges, CRC Press, https://doi.org/10.1201/9781420040272 , ISBN: 9780429121654, 2002. a
Ma, F. and Chen, G.: Modeling Large Planar Deflections of Flexible Beams in Compliant Mechanisms Using Chained Beam-Constraint-Modell, J. Mec. Robot., 8, 021018, https://doi.org/10.1115/1.4031028 , 2015. a
McCarthy, J. M. and Soh, G. S.: Geometric design of linkages, Vol. 11, Springer Science & Business Media, https://doi.org/10.1007/978-1-4419-7892-9 , ISBN: 978-1-4419-7891-2, 2010. a
Midha, A., Howell, L. L., and Norton, T. W.: Limit positions of compliant mechanisms using the pseudo-rigid-body model concept, Mech. Mach. Theory, 35, 99–115, 2000. a
Nathan, D. and Howell, L.: A self-retracting fully compliant bistable micromechanism, J. Microelectromech. S., 12, 273–280, 2003. a
Parkinson, M. B., Jensen, B. D., and Roach, G. M.: Optimization-Based Design of a Fully-Compliant Bistable Micromechanism, in: Proceedings of the ASME 2000 International Design Engineering Technical Conferences and Computers and Information in Engineering Conference, 26th Biennial Mechanisms and Robotics Conference. Baltimore, Maryland, USA, 10–13 September 2000, ASME, 7A, 635–641, https://doi.org/10.1115/DETC2000/MECH-14119 , 2000. a
Qiu, J., Lang, J., and Slocum, A.: A curved-beam bistable mechanism, J. Microelectromech. S., 13, 137–146, https://doi.org/10.1109/JMEMS.2004.825308 , 2004. a
Sargent, B., Butler, J., Seymour, K., Bailey, D., Jensen, B., Magleby, S., and Howell, L.: An Origami-Based Medical Support System to Mitigate Flexible Shaft Buckling, J. Mech. Robot., 12, 041005, https://doi.org/10.1115/1.4045846 , 2020. a
Saxena, A. and Kramer, S. N.: A Simple and Accurate Method for Determining Large Deflections in Compliant Mechanisms Subjected to End Forces and Moments, J. Mech. Design, 120, 392–400, 1998. a
Shoup, T. E. and McLarnan, C. W.: On the Use of the Undulating Elastica for the Analysis of Flexible Link Mechanisms, J. Eng. Ind., 93, 263–267, 1971. a
Sönmez, U. and Tutum, C. C.: A Compliant Bistable Mechanism Design Incorporating Elastica Buckling Beam Theory and Pseudo-Rigid-Body Model, J. Mech. Design, 130, 042304, https://doi.org/10.1115/1.2839009 , 2008. a
Su, H.-J.: A Pseudo-Rigid-Body 3R Model for Determining Large Deflection of Cantilever Beams Subject to Tip Loads, J. Mech. Robot., 1, 021008, https://doi.org/10.1115/1.3046148 , 2009. a
Todd, B., Jensen, B. D., Schultz, S. M., and Hawkins, A. R.: Design and testing of a thin-flexure bistable mechanism suitable for stamping from metal sheets, J. Mech. Design, 132, 071011, https://doi.org/10.1115/1.4001876 , 2010. a
Tran, N. D. K. and Wang, D.-A.: Design of a crab-like bistable mechanism for nearly equal switching forces in forward and backward directions, Mech. Mach. Theory, 115, 114–129, 2017. a
Turkkan, O. A. and Su, H.-J.: A general and efficient multiple segment method for kinetostatic analysis of planar compliant mechanisms, Mech. Mach. Theory, 112, 205–217, https://doi.org/10.1016/j.mechmachtheory.2017.02.010 , 2017. a
Turkkan, O. A., Venkiteswaran, V. K., and Su, H.-J.: Rapid conceptual design and analysis of spatial flexure mechanisms, Mech. Mach. Theory, 121, 650–668, https://doi.org/10.1016/j.mechmachtheory.2017.11.025 , 2018. a
Wang, P. and Xu, Q.: Design of a flexure-based constant-force XY precision positioning stage, Mech. Mach. Theory, 108, 1–13, https://doi.org/10.1016/j.mechmachtheory.2016.10.007 , 2017. a
Wilcox, D. L. and Howell, L. L.: Double-tensural bistable mechanisms (DTBM) with on-chip actuation and spring-like post-bistable behavior, in: ASME 2005 International Design Engineering Technical Conferences and Computers and Information in Engineering Conference, Long Beach, California, USA, 24–28 September 2005, American Society of Mechanical Engineers, https://doi.org/10.1115/DETC2005-84697 , ISBN: 0-7918-4744-6, 2005. a
Yu, Y.-Q. and Zhu, S.-K.: 5R pseudo-rigid-body model for inflection beams in compliant mechanisms, Mech. Mach. Theory, 116, 501–512, 2017. a
Yu, Y.-Q., Feng, Z.-L., and Xu, Q.-P.: A pseudo-rigid-body 2R model of flexural beam in compliant mechanisms, Mech. Mach. Theory, 55, 18–33, 2012. a
Zhang, A. and Chen, G.: A Comprehensive Elliptic Integral Solution to the Large Deflection Problems of Thin Beams in Compliant Mechanisms, J. Mech. Robot., 5, 021006, https://doi.org/10.1115/1.4023558 , 2013. a
Zhao, J., Jia, J., He, X., and Wang, H.: Post-buckling and Snap-Through Behavior of Inclined Slender Beams, J. Appl. Mech., 75, 041020, https://doi.org/10.1115/1.2870953 , 2008. a
Zhu, S.-K. and Yu, Y.-Q.: Pseudo-Rigid-Body Model for the Flexural Beam With an Inflection Point in Compliant Mechanisms, J. Mech. Robot., 9, 031005, https://doi.org/10.1115/1.4035986 , 2017. a
BMC Health Services Research volume 24, Article number: 1006 (2024)
Stroke is a leading cause of mortality and disability. In higher-income countries, mortality and disability have been reduced with advances in stroke care and early access to rehabilitation services. However, access to such services and the subsequent impact on stroke outcomes in the Philippines, a low- and middle-income country (LMIC), are unclear. Understanding gaps in service delivery and underpinning research from acute to chronic stages post-stroke will allow future targeting of resources.
This scoping review aimed to map available literature on stroke services in the Philippines, based on Arksey and O’Malley’s five-stage-process.
A targeted strategy was used to search relevant databases (focused: MEDLINE (Ovid), EMBASE (Ovid), Cumulative Index to Nursing and Allied Health Literature (CINAHL), PsycINFO (EBSCO); broad-based: Scopus; review-based: Cochrane Library, International Prospective Register of Systematic Reviews (PROSPERO), JBI (formerly Joanna Briggs Institute)) as well as grey literature (Open Grey, Google Scholar). The searches were conducted between December 2022 and January 2023 and repeated in December 2023. Literature describing adults with stroke in the Philippines and stroke services that aimed to maximize well-being, participation and function was searched. Studies were selected if they included one or more of: (a) patient numbers and stroke characteristics; (b) staff numbers, qualifications and role; (c) service resources (e.g., access to a rehabilitation unit); (d) cost of services and methods of payment; (e) content of stroke care; (f) duration of stroke care/rehabilitation and interventions undertaken; (g) outcome measures used in clinical practice.
A total of 70 papers were included. Articles were assessed, and data were extracted and classified according to structure-, process- or outcome-related information. Advances in stroke services have been developed, including stroke-ready hospitals providing early access to acute care such as thrombectomy and thrombolysis, and early referral to rehabilitation coupled with rehabilitation guidelines. Gaps exist in stroke services structure (e.g., low number of neurologists and neuroimaging, lack of stroke protocols and pathways, inequity of stroke care across urban and rural locations), processes (e.g., delayed arrival to hospital, lack of stroke training among health workers, low awareness of stroke among public and non-stroke care workers, inequitable access to rehabilitation both hospital and community) and outcomes (e.g., low government insurance coverage resulting in high out-of-pocket expenses, limited data on caregiver burden, absence of unified national stroke registry to determine prevalence, incidence and burden of stroke). Potential solutions were identified, such as increasing stroke knowledge and awareness, use of mobile stroke units, TeleMedicine, TeleRehab, improving access to rehabilitation, upgrading PhilHealth and a unified national long-term stroke registry representing the real situation across urban and rural areas.
This scoping review describes the existing evidence base relating to structure, processes and outcomes of stroke services for adults within the Philippines. Developments in stroke services have been identified; however, a wide gap exists between the availability of stroke services and the high burden of stroke in the Philippines. Strategies are critical to address the identified gaps as a precursor to improving stroke outcomes and reducing burden. Potential solutions identified within the review will require healthcare government and policymakers to focus on stroke awareness programs, primary and secondary stroke prevention, establishing and monitoring stroke protocols and pathways, a sustainable national stroke registry, and improving access to and availability of rehabilitation in both hospital and community settings.
Stroke services in the Philippines are inequitable, for example, urban versus rural due to the geography of the Philippines, location of acute stroke ready hospitals and stroke rehabilitation units, limited transport options, and low government healthcare insurance coverage resulting in high out-of-pocket costs for stroke survivors and their families.
The Philippines has a higher incidence of stroke in younger adults than other LMICs, which impacts the available workforce and the country’s economy. There is a lack of data on community stroke rehabilitation provision, the content and intensity of stroke rehabilitation being delivered, the role and knowledge/skills of those delivering stroke rehabilitation, the unmet needs of stroke survivors, and caregiver burden and strain.
A wide gap exists between the availability of stroke services and the high burden of stroke. The impact of this is unclear due to the lack of a compulsory national stroke registry and because data on community or home-based stroke services are not captured or published.
This review provides a broad overview of existing evidence-base of stroke services in the Philippines. It provides a catalyst for a) healthcare government to address stroke inequities and burden; b) development of future evidence-based interventions such as community-based rehabilitation; c) task-shifting e.g., training non-neurologists, barangay workers and caregivers; d) use of digital technologies and innovations e.g., stroke TeleRehab, TeleMedicine, mobile stroke units.
In the Philippines, stroke is the second leading cause of death, with a prevalence of 0.9% equating to 87,402 deaths per annum [ 1 , 2 ]. Approximately 500,000 Filipinos will be affected by stroke, with an estimated US$350 million to $1.2 billion needed to meet the cost of medical care [ 1 ]. As healthcare is largely private, the cost is borne out-of-pocket by patients and their families. This poses a major obstacle for the lower socio-demographic groups in the country.
Research on the implementation of locally and regionally adapted stroke services and cost-effective secondary prevention programs in the Philippines has been cited as a priority [ 3 , 4 ]. Prior to developing, implementing, and evaluating future context-specific acute stroke management services and community-based models of rehabilitation, it was important to map out the available literature on stroke services and characteristics of stroke in the Philippines.
The scoping review followed a predefined protocol and established methodology [ 5 ] and is reported according to the Preferred Reporting Items for Systematic Review and Meta-Analyses Extension for Scoping Reviews Guidelines (PRISMA-ScR) [ 6 , 7 ]. Healthcare quality is described according to three aspects, namely structures, processes, and outcomes, following the Donabedian model [ 8 , 9 ]. The review is based on Arksey and O'Malley’s five-stage framework [ 5 ].
Stage 1: The research question:
What stroke services are available for adults within the Philippines? The objective was to systematically scope the literature to describe the availability, structure, processes, and outcome of stroke services for adults within the Philippines.
Stage 2: Identifying relevant studies:
The following databases were searched. Focused: MEDLINE, EMBASE, Cumulative Index to Nursing and Allied Health Literature (CINAHL), PsycINFO; broad-based: Scopus; review-based: Cochrane Library, Prospero, JBI (formerly Joanna Briggs Institute); Grey literature: Herdin, North Grey, Grey matters, MedRxiv, NIHR health technology assessment, Department of Health Philippines, The Kings Fund, Ethos, Carrot2. Additionally, reference lists of full text included studies were searched.
The targeted search strategy, developed in consultation with an information scientist, was adapted for each database (see supplemental data). Search terms were peer reviewed using the PRESS (Peer Review of Electronic Search Strategies) checklist [ 10 ].
The key search concepts from the Population, Concept and Context (PCC) framework were adults aged ≥ 18 years with a stroke living in the Philippines ( population ), stroke services aiming to maximize well-being, participation and function following a stroke ( concept ) and stroke services from acute to chronic including those involving healthcare professionals, non-healthcare related personnel or family or friends ( context ). Search tools such as medical subject headings (MeSH) and truncation to narrow or expand searches were used. Single and combined search terms were included (see supplemental data). The search was initially conducted over two weeks in December 2022 and re-run in December 2023.
Studies were selected if they described stroke care in the Philippines in terms of one or more of the following: (a) patient numbers and stroke characteristics (b) staff numbers, qualifications and role (c) service resources (e.g., number of beds/access to a rehabilitation unit, equipment used) (d) cost of services and methods of payment (UHC, Insurance, private) (e) content of stroke care (f) duration of stroke care (hours of personnel contact e.g., Therapy hours per day); interventions undertaken (g) outcome measures used in clinical practice.
Additional criteria:
Context: all environments (home, hospital, outpatients, clinic, academic institute).
Date limits: published from 2002 onwards. This is based on the Philippines Community Rehabilitation Guidelines published in 2009, which would suggest that papers earlier than 2002 may not reflect current practice [ 11 ].
Qualitative and quantitative studies including grey literature.
Language: reported in English or Filipino only.
Publication status: no limit because the level of rigor was not assessed.
Type of study: no limit which included conference abstracts, as the level of rigor was not assessed.
Studies were excluded if they were in non-stroke populations or the full text article could not be obtained. Conference abstracts were excluded if there were insufficient data about methods and results.
Searches of databases were performed by one researcher (JM) and searches of grey literature were performed by one researcher (AO). All retrieved articles were uploaded into Endnote X9 software™, and duplicates identified and removed before transferring them to Rayyan [ 12 ] for screening.
Stage 3: study selection
Titles and abstracts were screened against the eligibility criteria. Two pairs of researchers independently screened titles and abstracts (databases: JM and AL; grey literature: AO and LF). Where a discrepancy existed for title and abstract screening, the study was automatically included for full text review and discussed among reviewers.
Two reviewers (JM and AL) undertook full-text screening of the selected studies. Discrepancies were resolved through consensus discussions; none required a third reviewer. Reasons for exclusion were documented according to pre-determined eligibility criteria. References of included full text articles were screened by each reviewer independently and identified articles were subjected to the same screening process as per the PRISMA-ScR checklist (Fig. 1 ).
Fig. 1 PRISMA-ScR flow diagram
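The title and abstract screening rule described above (two independent decisions, with any disagreement sent forward to full-text review and flagged for discussion) can be written as a short routine. This is an illustrative sketch only: the review used Rayyan for screening, and the record identifiers and function name below are hypothetical.

```python
def title_abstract_screen(votes):
    """Combine two independent include/exclude decisions per record.

    votes maps a record id to (reviewer_1_includes, reviewer_2_includes).
    A record goes to full-text review if either reviewer includes it;
    disagreements are also listed so they can be discussed.
    """
    to_full_text, discrepancies = [], []
    for record_id, (first, second) in votes.items():
        if first != second:
            discrepancies.append(record_id)
        if first or second:
            to_full_text.append(record_id)
    return to_full_text, discrepancies


# Hypothetical decisions for three records.
example_votes = {
    "rec-001": (True, True),
    "rec-002": (True, False),
    "rec-003": (False, False),
}
full_text, to_discuss = title_abstract_screen(example_votes)
```

Sending every disagreement forward favours sensitivity over workload at the first stage, which suits a scoping review whose aim is to map the available literature.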
Stage 4: Charting the data
Two reviewers independently extracted the data using a piloted, customized and standardized data extraction form including (1) Structure: financial (e.g., costs, insurance, government funding), resources (structure and number of stroke facilities), staff (number, profession/specialism, qualifications etc.), stroke characteristics; (2) Process: duration of care, content of stroke care within acute, secondary care, community, outcome measures used; (3) Outcome: survival, function, patient satisfaction, cost (admission and interventions); and (4) year of publication, geographical location (including if Philippines only or multiple international locations) and type of evidence (e.g., policy, review, observational, experimental, clinical guidelines). Critical appraisal of included studies was not undertaken because the purpose of the review was to map available evidence on stroke services available within the Philippines.
Stage 5: Collating, summarising and reporting the results
The search identified 351 records from databases and registers. A total of 70 records are included and reasons for non-inclusion are summarized in Fig. 1 .
The characteristics of included studies are shown in Supplementary Material Table 1. Of the 70 included studies, 36 were observational, with most being based on a retrospective review of case notes ( n = 31), two were audits, eight were surveys or questionnaires, four were consensus opinion and/or guideline development, three were randomized controlled trials (RCTs) or feasibility RCTs, one was a systematic review, two were policy and guidelines, 11 were narrative reviews or opinion pieces, two were case series or reports and one was an experimental study.
Of the 70 studies, 32 (45.7%) were based in a single tertiary hospital site. There were only three papers based in the community (4.3%). Papers that were opinion pieces or reviews were classified as having a national focus. Of the 22 papers classified as having a national focus, 10 (45.5%) were narrative reviews/ opinion pieces (Table 1 ).
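The counts and percentages reported above can be re-derived with a few lines of arithmetic. The sketch below copies the counts from the text; the dictionary keys and the helper name are illustrative labels rather than categories defined by the review.

```python
def pct(n, total):
    """Percentage of n in total, rounded to one decimal place."""
    return round(100 * n / total, 1)


# Study designs as listed in the text (labels are shorthand).
design_counts = {
    "observational": 36,
    "audit": 2,
    "survey_or_questionnaire": 8,
    "consensus_or_guideline_development": 4,
    "rct_or_feasibility_rct": 3,
    "systematic_review": 1,
    "policy_and_guidelines": 2,
    "narrative_review_or_opinion": 11,
    "case_series_or_report": 2,
    "experimental": 1,
}
total_studies = sum(design_counts.values())

single_tertiary_site = pct(32, total_studies)   # single tertiary hospital
community_based = pct(3, total_studies)         # community setting
national_narrative = pct(10, 22)                # narrative share of national papers
```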
The primary focus of the research studies (excluding the 11 narrative reviews and 2 policy documents) was classified as describing structure ( n = 8, 14%), process ( n = 21, 36.8%) or outcomes ( n = 29, 49.2%). The structure of acute care was described in seven of the eight structure studies ( n = 7/8, 87.5%), whilst neurosurgery structures were described in one (12.5%). Acute care processes were described in 11 of the 21 process studies ( n = 11/21, 52.3%), whilst rehabilitation processes were described in six (28.6%), with three primarily describing outcome measurement (14.3%). The primary focus of the outcome studies was stroke characteristics (25 out of 28 papers, 89.2%) in terms of number of strokes (prevalence), mortality or severity of stroke. Measures of stroke quality of life were not reported. Healthcare professional knowledge was described in two studies ( n = 2/28, 7.1%), whilst risk factors for stroke were described in one study ( n = 1/28, 3.6%). Carer burden was described in one study ( n = 1/28, 3.6%).
A summary of the findings is presented in Table 2 .
This scoping review describes the available literature on stroke services within the Philippines across the lifespan of an adult (> 18 years) with a stroke. The review has identified gaps in information about structures, processes and outcomes as well as deficits in provision of stroke services and processes as recommended by WHO. These included a low number of specialist clinicians including neurologists, neuro-radiographers and neurosurgeons. The high prevalence of stroke suggests attention and resources need to focus on primary and secondary prevention. Awareness of stroke is low, especially in terms of what a stroke is, the signs/symptoms and how to minimize risk of stroke [ 25 ]. Barriers exist, such as lack of healthcare resources, maldistribution of health facilities, inadequate training on stroke treatment among health care workers, poor stroke awareness, insufficient government support and limited health insurance coverage [ 22 ].
The scoping review also highlighted areas where further work is needed, for example, descriptions and research into the frequency, intensity, and content of rehabilitation services especially in the community setting and the outcome measures used to monitor recovery and impairment. PARM published stroke rehabilitation clinical practice guidelines in 2012, which incorporated an innovative approach to contextualize Western clinical practice guidelines for stroke care to the Philippines [ 42 ]. Unfortunately, availability and equitable access to evidence-based rehabilitation for people with stroke in the Philippines pose significant challenges because of multiple factors impacting the country (e.g., geographical, social, personal, environmental, educational, economic, workforce) [ 25 , 40 , 43 ].
The number of stroke survivors with disability has not been reported previously; thus, the extent and burden of stroke from the acute to the chronic phase is unknown. The recent introduction of a national stroke registry across public and private facilities may provide some of these data [82]. The project started in 2021 and captures data on people hospitalized for transient ischemic attack or stroke in the Philippines. National stroke registries have been identified as a pragmatic solution to reduce the global burden of stroke [83] through surveillance of the incidence, prevalence, and outcomes (e.g., death, disability) of stroke, the quality of stroke care, and the prevalence of risk factors. For the Philippine government to know the full impact and burden of stroke nationally, identify areas for improvement and make meaningful changes for the benefit of Filipinos, the registry would need to be compulsory for all public and private facilities and include out-of-hospital data. This will require information technology, trained workforces for data capture, monitoring and sharing, as well as governance and funding [83].
This scoping review has generated a better understanding of the published evidence focusing on availability of stroke services in the Philippines, as well as the existing gaps, through the lens of Donabedian’s Structure, Process and Outcome framework. The findings have helped to inform a wider investigation of current stroke service utilization conducted using survey and interview methods with stroke survivors, carers and key stakeholders in the Philippines, and to drive forward local, regional and national policy and service changes.
This scoping review describes the existing evidence base relating to structure, processes and outcomes of stroke services for adults within the Philippines. The review revealed limited information in certain areas, such as the impact of stroke on functional ability, participation in everyday life, and quality of life; the content and intensity of rehabilitation in both hospital and community settings; and the outcome measures used to evaluate clinical practice. Developments in stroke services have been identified; however, a wide gap exists between the availability of stroke services and the high burden of stroke in the Philippines. Strategies to address the identified gaps are critical as a precursor to improving stroke outcomes and reducing burden. Potential solutions identified within the review will require a comprehensive approach from healthcare policymakers to focus on stroke awareness programs; primary and secondary prevention; establishing and monitoring stroke protocols and pathways; implementing a compulsory national stroke registry; using TeleRehab, TeleMedicine and mobile stroke units; and improving access to and availability of both hospital- and community-based stroke rehabilitation. Furthermore, changes in PhilHealth coverage and universal credit are needed to minimize catastrophic out-of-pocket costs.
Although a comprehensive search was undertaken, data were taken from a limited number of located published studies on stroke in the Philippines. This, together with data from databases and grey literature, may not reflect the current state of stroke services in the country.
Not applicable.
No datasets were generated or analysed during the current study.
Navarro JC, Baroque AC, Lokin JK, Venketasubramanian N. The real stroke burden in the Philippines. Int J Stroke. 2014;9(5):640–1.
The Stroke Society of the Philippines. Philippines: stroke 2024. Available from: https://www.strokesocietyphilippines.org/philippines-stroke/#:~:text=Stroke%20is%20the%20Philippines'%20second,or%2014.12%25%20of%20total%20deaths.
Banaag MS, Dayrit MM, Mendoza RU. Health Inequity in the Philippines. In: Batabyal A, Higano Y, Nijkamp P (eds). Disease, Human Health, and Regional Growth and Development in Asia. New Frontiers in Regional Science: Asian Perspectives, vol 38. Singapore: Springer; 2019.
Hodge A, Firth S, Bermejo R, Zeck W, Jimenez-Soto E. Utilisation of health services and the poor: deconstructing wealth-based differences in facility-based delivery in the Philippines. BMC Public Health. 2016;16:1–12.
Arksey H, O’Malley L. Scoping studies: towards a methodological framework. Int J Soc Res Methodol. 2005;8:19–32.
Tricco AC, Lillie E, Zarin W, O’Brien KK, Colquhoun H, Levac D, et al. PRISMA extension for scoping reviews (PRISMA-ScR): checklist and explanation. Ann Intern Med. 2018;169(7):467–73.
Levac D, Colquhoun H, O’Brien KK. Scoping studies: advancing the methodology. Implement Sci. 2010;5:69.
Donabedian A. The quality of care. How can it be assessed? JAMA. 1988;260(12):1743–8.
McDonald KM, Sundaram V, Bravata DM, Lewis R, Lin N, Kraft SA, et al. Closing the quality gap: a critical analysis of quality improvement strategies. Tech Rev. 2007;7(9).
McGowan J, Sampson M, Salzwedel DM, Cogo E, Foerster V, Lefebvre C. PRESS peer review of electronic search strategies: 2015 guideline statement. J Clin Epidemiol. 2016;75:40–6.
McGlade B, Mendoza VE. Philippines CBR manual: an inclusive development strategy. Philippines: CBM-CBR Coordinating office; 2009.
Ouzzani M, Hammady H, Fedorowicz Z, et al. Rayyan—a web and mobile app for systematic reviews. Syst Rev. 2016;5(210). https://doi.org/10.1186/s13643-016-0384-4 .
Baliguas B. Adherence to the clinical practice guidelines of the stroke society of the Philippines in the management of ischemic stroke in young adults admitted in 3 tertiary hospitals in Bacolod City, Philippines from May to October 2010. Neurology. 2018;90(15).
Barcelon EA, Moll MAKDN, Serondo DJ, Collantes MEV. Validation of the Filipino version of national institute of health stroke scale. Clinical Neurology. 2016;56:S379.
Baticulon RE, Lucena LLN, Gimenez MLA, Sabalza MN, Soriano JA. The Neurosurgical Workforce of the Philippines. Neurosurgery. 2024;94(1):202–11. https://doi.org/10.1227/neu.0000000000002630 .
Berroya RM. Incidence of symptomatic intracerebral hemorrhage after thrombolysis for acute ischemic stroke at St. Luke’s Medical Center-Global City from January 2010 to February 2017. J Neurol Sci. 2017;381:398–9.
Carcel C, Espiritu-Picar R. Circadian variation of ischemic and hemorrhagic strokes in adults at a tertiary hospital: a retrospective study. J Neurol Sci. 2009;285:S174.
Cayco CS, Gorgon EJR, Lazaro RT. Proprioceptive neuromuscular facilitation to improve motor outcomes in older adults with chronic stroke. Neurosciences (Riyadh). 2019;24(1):53–60.
Co COC, Yu JRT, Macrohon-Valdez MC, Laxamana LC, De Guzman VPE, Berroya-Moreno RMM, et al. Acute stroke care algorithm in a private tertiary hospital in the Philippines during the COVID-19 pandemic: a third world country experience. J Stroke Cerebrovasc Dis. 2020;29(9):105059.
Co COC, Yu JRT, Laxamana LC, David-Ona DIA. Intravenous thrombolysis for stroke in a COVID-19 positive Filipino patient, a case report. J Clin Neurosci. 2020;77:234–6.
Collantes ME. Evaluation of change in stroke care in the Philippines using RES-Q data. Eur Stroke J. 2019;4:318.
Collantes ME. Improving stroke systems of care in lmic: Philippines. Int J Stroke. 2021;16(2):4.
Collantes ME, Navarro J, Belen A, Gan R. Stroke systems of care in the Philippines: addressing gaps and developing strategies. Front Neurol. 2022;13:1046351.
Collantes MEV, Yves Miel H, Zuñiga Uezono DR. Incidence and prevalence of stroke and its risk factors in the Philippines: a systematic review. Acta Medica Philippina. 2022;56:26–34.
Collantes MV, Zuniga YH, Granada CN, Uezono DR, De Castillo LC, Enriquez CG, et al. Current state of stroke care in the Philippines. Front Neurol. 2021;12:665086.
Constantino GA, Soliven JA. Points of in-hospital delays in thrombolytic therapy among patients with acute ischemic stroke: a single center 5-year retrospective study. Neurology. 2020;94(15). https://doi.org/10.1212/WNL.94.15_supplement.2901 .
Constantino GAA, Señga MMA, Soliven JAR, Jocson VED. Emerging Utility of Endovascular Thrombectomy in the Philippines: A Single-center Clinical Experience. Acta Med Philipp [Internet]. 2023;57(5). Available from: https://actamedicaphilippina.upm.edu.ph/index.php/acta/article/view/5113 . [cited 2024 Aug 21].
Dans AL, Punzalan FE, Villaruz MV. National Nutrition and Health Survey (NNHeS): atherosclerosis-related diseases and risk factors. Philipp J Intern Med. 2005;43:103–15.
De Castillo LL, Collantes ME. Thrombolysis for stroke at the Philippine general hospital: a descriptive analysis. Cerebrovasc Dis. 2019;48:54.
de Castillo LLC, Diestro JDB, Tuazon CAM, Sy MCC, Añonuevo JC, San Jose MCZ. J Stroke Cerebrovasc Dis. 2021;30(7):105831.
Delfino JPM, Carandang-Chacon CA. Comparison of acute ischemic stroke care quality before and during the COVID-19 pandemic in a private tertiary hospital in metro Manila, Philippines. Neurol Asia. 2023;28(1):13–7.
Department of Health. Department of Health Administrative Order 2011-0003. 2011. [Accessed online: 12/2022, from the Philippine Department of Health].
Department of Health. The national policy framework on the prevention, control and management of acute stroke in the Philippines. 2020.
Diestro JDB, Omar AT, Sarmiento RJC, Enriquez CAG, Castillo LLC, Ho BL, et al. Cost of hospitalization for stroke in a low-middle-income country: Findings from a public tertiary hospital in the Philippines. Int J Stroke. 2021;16(1):39–42.
Duenas M, Ranoa G, Benjamin VS. Assessment of post-stroke caregivers’ burden through the modified caregivers strain index (MCSI) in a tertiary center in the Philippines: a cross-sectional study. Cerebrovasc Dis. 2019;48:56–7.
Duya JE, Hernandez K, San Jose MC. The evolving clinical and echocardiographic profile of patients admitted for acute cardioembolic stroke at a Tertiary Hospital in the Philippines. J Hong Kong Coll Cardiol. 2019;27(1):58.
Espiritu AI, San Jose MCZ. A call for a stroke referral network between primary care and stroke-ready hospitals in the Philippines: a narrative review. Neurologist. 2021;26(6):253–60.
Gambito ED, Gonzalez-Suarez CB, Grimmer KA, Valdecañas CM, Dizon JM, Beredo ME, et al. Updating contextualized clinical practice guidelines on stroke rehabilitation and low back pain management using a novel assessment framework that standardizes decisions. BMC Res Notes. 2015;8:643.
Gelisanga MA, Gorgon EJ. Upright motor control test: interrater reliability, retest reliability, and concurrent validity in adults with subacute stroke. Eur Stroke J. 2017;2(1):357–8.
Gonzalez-Suarez C, Grimmer K, Alipio I, Anota-Canencia EG, Santos-Carpio ML, Dizon JM, et al. Stroke rehabilitation in the Philippines: an audit study. Disabil CBR Inclusive Develop. 2015;26(3):44–67.
Gonzalez-Suarez CB, Grimmer K, Cabrera JTC, Alipio IP, Anota-Canencia EGG, Santos-Carpio MLP, et al. Predictors of medical complications in stroke patients confined in hospitals with rehabilitation facilities: a Filipino audit of practice. Neurology Asia. 2018;23(3):199–208.
Gonzalez-Suarez CB, Grimmer-Somers K, Margarita Dizon J, King E, Lorenzo S, Valdecanas C, et al. Contextualizing Western guidelines for stroke and low back pain to a developing country (Philippines): an innovative approach to putting evidence into practice efficiently. J Healthc Leadersh. 2012;4:141–56.
Gonzalez-Suarez CB, Margarita J, Dizon R, Grimmer K, Estrada MS, Uyehara ED, et al. Implementation of recommendations from the Philippine Academy of Rehabilitation Medicine's Stroke Rehabilitation Guideline: a plan of action. Clin Audit. 2013;5:77–89.
Ignacio KHD, Diestro JDB, Medrano JMM, Salabi SKU, Logronio AJ, Factor SJV, et al. Depression and anxiety after stroke in young adult Filipinos. J Stroke Cerebrovasc Dis. 2022;31(2):106232.
Inting K, Canete MT. Ischemic stroke subtypes: a comparison between causative and phenotypic classifications in a tertiary hospital in the Philippines. Int J Stroke. 2021;16(2):28.
Jaca PKM, Chacon CAC, Alvarez RM. Clinical characteristics of cerebrovascular disease with COVID-19: a single-center study in Manila, Philippines. Neurology Asia. 2021;26(1):15–25.
Jamora RDG, Corral EV, Ang MA, Epifania M, Collantes V, Gan R. Stroke recurrence among Filipino patients taking aspirin for first-ever non-cardioembolic ischemic stroke. Neurol Clin Neurosci. 2017;5:1–5.
Jamora RDG, Prado MB Jr, Anlacan VMM, Sy MCC, Espiritu AI. Incidence and risk factors for stroke in patients with COVID-19 in the Philippines: an analysis of 10,881 cases. J Stroke Cerebrovasc Dis. 2022;31(11).
Juangco DN, Mariano GS. Endovascular therapy for acute ischemic stroke: a review of cases and outcomes from a primary stroke center (a 5-year retrospective study). Cerebrovasc Dis. 2016;41:54.
Leochico CFD, Austria EMV, Gelisanga MAP, Ignacio SD, Mojica JAP. Home-based telerehabilitation for community-dwelling persons with stroke during the COVID-19 pandemic: a pilot study. J Rehabil Med. 2023;55:jrm4405.
Loo KW, Gan SH. Burden of stroke in the Philippines. Int J Stroke. 2013;8(2):131–4.
Mansouri A, Ku JC, Khu KJ, Mahmud MR, Sedney C, Ammar A, et al. Exploratory analysis into reasonable timeframes for the provision of neurosurgical care in low- and middle-income countries. World Neurosurg. 2018;117:e679–91.
Mendoza RA. The clinical profile and treatment outcome of acute ischemic stroke patients who underwent thrombolysis with recombinant tissue plasminogen activator therapy, Philippine experience: a retrospective study. J Neurol Sci. 2009;285:S85–6.
Navarro J. Prevalence of stroke: a community survey. Philipp J Neurol. 2005;9(2):11–5.
Navarro JC, Venketasubramanian N. Stroke burden and services in the Philippines. Cerebrovasc Dis Extra. 2021;11(2):52–4.
Navarro JC, Baroque AC 2nd, Lokin JK. Stroke education in the Philippines. Int J Stroke. 2013;8 Suppl A100:114–5.
Navarro JC, Chen CL, Lee CF, Gan HH, Lao AY, Baroque AC, et al. Durability of the beneficial effect of MLC601 (NeuroAiD™) on functional recovery among stroke patients from the Philippines in the CHIMES and CHIMES-E studies. Int J Stroke. 2017;12(3):285–91.
Navarro JC, Escabillas C, Aquino A, Macrohon C, Belen A, Abbariao M, et al. Stroke units in the Philippines: an observational study. Int J Stroke. 2021;16(7):849–54.
Navarro JC, San Jose MC, Collantes E, Macrohon-Valdez MC, Roxas A, Hivadan J, et al. Stroke thrombolysis in the Philippines. Neurol Asia. 2018;23(2):115.
Ng JC, Churojana A, Pongpech S, Vu LD, Sadikin C, Mahadevan J, et al. Current state of acute stroke care in Southeast Asian countries. Interv Neuroradiol. 2019;25(3):291–6.
Ocampo FF, De Leon-Gacrama FRG, Cuanang JR, Navarro JC. Profile of stroke mimics in a tertiary medical center in the Philippines. Neurol Asia. 2021;26(1):35–9.
Pascua R, Hiyadan JH. Outcome of decompressive hemicraniectomy without evacuation of hematoma in supratentorial intracerebral hemorrhage in a tertiary government hospital in the Philippines: a retrospective study. Eur Stroke J. 2023;8(2):586.
Prado M, Jamora RD, Charmaine Sy M, Anlacan M, Espiritu A. Determinants and Outcomes of Cerebrovascular Disease in Patients with COVID19 in the Philippines: An Analysis of 10881 Cases. Neurology. 2022;98(18). https://doi.org/10.1212/WNL.98.18_supplement.2076 .
Qua CV, Tiqui V, Villatima NE, Perales DJ, Rubio SM, Santos ER, et al. A predictive assessment of early neurological deterioration among Filipino acute ischemic stroke patients utilizing hematological, lipid profile, and metabolic parameters in a tertiary hospital in Pampanga, Philippines. Cerebrovasc Dis. 2022;51:101.
Que DL, Cuanang J, San Jose MC. Clinical profile, management and outcomes of patients with cerebral venous thrombosis in a tertiary hospital in the Philippines. Int J Stroke. 2020;15(1):511.
Quiles LEP, Diamante PAB, Pascual JLV. Impact of the COVID-19 pandemic in the acute stroke admissions and outcomes in a Philippine Tertiary Hospital. Cerebrovasc Dis Extra. 2022;12(2):76–84.
Roxas AA. The RIFASAF project: a case-control study on risk factors for stroke among Filipinos. Philippine J Neurol. 2002;6(1):1–7.
Roxas AAC, Carabal-Handumon J. Knowledge and perceptions among the barangay health workers in Plaridel, Misamis Occidental. Philipp J Neurol. 2002;6(1):44.
Sasikumar S, Bengzon Diestro JD. Global & community health: acute ischemic stroke in Toronto and Manila: bridging the gap. Neurology. 2020;95(13):604–6.
Senga MM, Reyes JPB. Cerebral venous thrombosis in a single center tertiary hospital of a South East Asian country (CVSTS study)-a retrospective study on the clinical profiles of patients with cerebral venous thrombosis. Neurology. 2019;92(15). https://doi.org/10.1212/WNL.92.15_supplement.P5.3-011 .
Sese LVC, Guillermo MCL. Strengthening stroke prevention and awareness in the Philippines: a conceptual framework. Front Neurol. 2023;14:1258821.
Suwanwela NC, Chen CLH, Lee CF, Young SH, Tay SS, Umapathi T, et al. Effect of combined treatment with MLC601 (NeuroAiD™) and rehabilitation on post-stroke recovery: the CHIMES and CHIMES-E studies. Cerebrovasc Dis. 2018;46(1–2):82–8.
Talamera AF, Franco DS. Validation study of Siriraj stroke score in Southern Philippines. Cerebrovasc Dis. 2011;32:9.
Tan A, Navarro J. Outcomes and quality of care outcome of patients with primary intracerebral hemorrhage in a single center in the Philippines. Int J Stroke. 2014;9:269.
Tangcuangco NC, Bitanga ES, Roxas AA, Pascual JL, Saniel E, Reyes JP, et al. Intravenous recombinant tissue plasminogen activator (IV-rtPA) use in acute ischemic stroke in a private tertiary hospital: a Philippine setting. Int J Stroke. 2010;5:107.
Tsang ACO, Yang IH, Orru E, Nguyen QA, Pamatmat RV, Medhi G, et al. Overview of endovascular thrombectomy accessibility gap for acute ischemic stroke in Asia: a multi-national survey. Int J Stroke. 2020;15(5):516–20.
Vatanagul J, Cantero-Auguis C. Awareness on acute stroke management among family medicine and internal medicine residents in Metro Cebu, Philippines. J Neurol Sci. 2015;357:e418–9.
Vatanagul J, Rulona IA. The incidence of post-stroke depression in a tertiary hospital in Cebu City, Philippines. J Neurol Sci. 2015;357:e419.
Vatanagul JAS, Rulona IA, Belonguel NJ. Cerebral venous thrombosis (CVST): study of four Filipino patients and literature review. Cerebrovasc Dis. 2013;36:81.
Venketasubramanian N, Yoon BW, Pandian J, Navarro JC. Stroke epidemiology in south, east, and south-east Asia: a review. J Stroke. 2017;19(3):286–94.
Yu RF, San Jose MC, Manzanilla BM, Oris MY, Gan R. Sources and reasons for delays in the care of acute stroke patients. J Neurol Sci. 2002;199(1–2):49–54.
Philippine Neurological Association One Database - Stroke DsSMG. Multicentre collection of uniform data on patients hospitalised for transient ischaemic attack or stroke in the Philippines: the Philippine Neurological Association One Database-Stroke (PNA1DB-Stroke) protocol. BMJ Open. 2022;12(5):54.
Feigin VL, Owolabi MO, Group WSOLNCSC. Pragmatic solutions to reduce the global burden of stroke: a world stroke organization-lancet neurology commission. Lancet Neurol. 2023;22(12):1160–206.
We acknowledge the TULAY collaborators: Dr Roy Francis Navea, Dr Myrna Estrada, Dr Elda Grace Anota, Dr Maria Mercedes Barba, Dr June Ann De Vera, Dr Maria Elena Tan, Dr Sarah Buckingham and Professor Fiona Jones. We are grateful to Lance de Jesus and Dr Annah Teves, Research Assistants on the TULAY project, for their contribution to some of the data extraction.
This research was funded by the NIHR Global Health Policy and Systems Research Programme (Award ID: NIHR150244) in association with UK aid from the UK Government to support global health research. The views expressed in this publication are those of the authors and not necessarily those of the NIHR or the UK’s Department of Health and Social Care.
Authors and affiliations.
Faculty of Health, Intercity Place, University of Plymouth, Plymouth, Devon, PL4 6AB, UK
Angela Logan, Bridie Kent, Aira Ong & Jonathan Marsden
Royal Devon University Healthcare NHS Foundation Trust, William Wright House, Barrack Road, Exeter, Devon, EX2 5DW, UK
Angela Logan
De La Salle University-Evelyn D. Ang Institute of Biomedical Engineering and Health Technologies, 2401 Taft Avenue, Malate, Manila, 1004, Philippines
Lorraine Faeldon
The University of Plymouth Centre for Innovations in Health and Social Care: A JBI Centre of Excellence, Faculty of Health, Intercity Place, University of Plymouth, Plymouth, Devon, PL4 6AB, UK
Bridie Kent
Conceptualisation, methodology and setting search terms, AL, LF, AO, JM, BK. Searches and screening, AL, JM, LF, AO. Data extraction, AL, LF, AO, JM, LdJ, AT. Original draft preparation, AL, JM. All authors provided substantive intellectual and editorial revisions and approved the final manuscript.
Correspondence to Angela Logan.
Ethics approval and consent to participate, consent for publication, competing interests.
The authors declare no competing interests.
Publisher’s note.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions.
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/ .
Cite this article.
Logan, A., Faeldon, L., Kent, B. et al. A scoping review of stroke services within the Philippines. BMC Health Serv Res 24 , 1006 (2024). https://doi.org/10.1186/s12913-024-11334-z
Received : 20 March 2024
Accepted : 22 July 2024
Published : 30 August 2024
DOI : https://doi.org/10.1186/s12913-024-11334-z
ISSN: 1472-6963
With the development and application of machine learning, significant advances have been made in landslide susceptibility mapping. However, owing to the difficulty of field landslide investigations, current landslide susceptibility mapping usually suffers from insufficient landslide samples (positive samples) and unreliable non-landslide samples (negative samples). Taking Lianghe County in Yunnan Province, China, as an example, this paper evaluates the effectiveness of three oversampling models in generating positive landslide samples: Conditional Tabular Generative Adversarial Networks (CTGAN), Generative Adversarial Networks (GAN), and the traditional Synthetic Minority Oversampling Technique (SMOTE). Three machine learning classifiers, a 1D Convolutional Neural Network-Long Short-Term Memory network (1D CNN-LSTM), Random Forest (RF), and Gradient Boosting Decision Tree (GBDT), are used for landslide susceptibility assessment. We also devise a screening method for non-landslide data (negative samples) that uses a self-trained support vector machine within a semi-supervised framework. The results show that training on the dataset after negative sample screening significantly improves the AUC values of the 1D CNN-LSTM, RF, and GBDT models, from (0.778, 0.869, 0.849) to (0.837, 0.936, 0.877). Compared with the original training set, the prediction accuracy of all three machine learning models improves after training on data augmented by the CTGAN, GAN, and SMOTE models. The RF model, augmented with 200 positive samples generated by CTGAN, achieves the highest prediction accuracy in the study (AUC = 0.962). The 1D CNN-LSTM model achieves its highest prediction accuracy (AUC = 0.953) when augmented with 200 positive samples from GAN. Similarly, the GBDT model reaches its highest prediction accuracy (AUC = 0.928) when augmented with 200 positive samples created by SMOTE. In addition, the spatial distribution of the data indicates that the data generated by the generative adversarial models exhibit higher diversity and can be used for landslide susceptibility assessment.
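To make the oversampling idea concrete, the sketch below illustrates the core step of SMOTE as described by Chawla et al.: a synthetic positive sample is placed at a random point on the line segment between a minority sample and one of its nearest minority neighbours. It is a simplified, standard-library illustration and not the authors' implementation; the function name `smote_like`, the two-feature toy data and the parameter values are all assumptions made for the example.

```python
import random
from math import dist


def smote_like(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by SMOTE-style interpolation.

    minority: list of equal-length numeric tuples (positive samples).
    k: number of nearest minority neighbours to choose from.
    Hypothetical helper; a simplified sketch of the SMOTE interpolation step.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: dist(base, p))[:k]
        other = rng.choice(neighbours)
        t = rng.random()  # interpolation weight in [0, 1)
        synthetic.append(tuple(b + t * (o - b) for b, o in zip(base, other)))
    return synthetic


# Toy positive samples with two made-up, normalised conditioning factors.
landslides = [(0.10, 0.20), (0.20, 0.10), (0.15, 0.25), (0.30, 0.30)]
augmented = landslides + smote_like(landslides, n_new=6)
print(len(augmented))
```

Because every synthetic point lies between two existing positives, this kind of interpolation cannot leave the region already covered by the minority class, which is one plausible reason generative models such as GAN and CTGAN can produce more diverse samples.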
Van der Maaten L, Hinton G (2008) Visualizing data using t-SNE. J Mach Learn Res 9(11)
Wang J, Jaboyedoff M, Chen G, Luo X, Derron M-H, Hu Q, Fei L, Prajapati G, Choanji T, Luo S (2024) Landslide susceptibility prediction and mapping using the LD-BiLSTM model in seismically active mountainous regions. Landslides 21(1):17–34. https://doi.org/10.1007/s10346-023-02141-4
Wen L, Li Y, Zhao W, Cao W, Zhang H (2023) Predicting the deformation behaviour of concrete face rockfill dams by combining support vector machine and AdaBoost ensemble algorithm. Comput Geotech 161:105611. https://doi.org/10.1016/j.compgeo.2023.105611
Wu D, Shang M, Luo X, Xu J, Yan H, Deng W, Wang G (2018) Self-training semi-supervised classification based on density peaks of data. Neurocomputing 275:180–191. https://doi.org/10.1016/j.neucom.2017.05.072
Xie Y, Wan Q, Xie H, Xu Y, Wang T, Wang S, Lei B (2023) Fundus image-label pairs synthesis and retinopathy screening via GANs with class-imbalanced semi-supervised learning. IEEE Trans Med Imaging 42(9):2714–2725. https://doi.org/10.1109/TMI.2023.3263216
Xu L, Skoularidou M, Cuesta-Infante A, Veeramachaneni K (2019) Modeling tabular data using conditional gan. Advances in Neural Information Processing Systems 32
Yang C, Liu L-L, Huang F, Huang L, Wang X-M (2023) Machine learning-based landslide susceptibility assessment with optimized ratio of landslide to non-landslide samples. Gondwana Res 123:198–216. https://doi.org/10.1016/j.gr.2022.05.012
Yao J, Qin S, Qiao S, Liu X, Zhang L, Chen J (2022) Application of a two-step sampling strategy based on deep neural network for landslide susceptibility mapping. Bull Eng Geol Env 81(4):148. https://doi.org/10.1007/s10064-022-02615-0
Yi Y, Zhang W, Xu X, Zhang Z, Wu X (2022) Evaluation of neural network models for landslide susceptibility assessment. International Journal of Digital Earth 15(1):934–953. https://doi.org/10.1080/17538947.2022.2062467
Yuan R, Chen J (2023) A novel method based on deep learning model for national-scale landslide hazard assessment. Landslides 20(11):2379–2403. https://doi.org/10.1007/s10346-023-02101-y
Zhang H, Song Y, Xu S, He Y, Li Z, Yu X, Liang Y, Wu W, Wang Y (2022) Combining a class-weighted algorithm and machine learning models in landslide susceptibility mapping: a case study of Wanzhou section of the three gorges reservoir, China. Comput Geosci 158:104966. https://doi.org/10.1016/j.cageo.2021.104966
Zhang Y, Ayyub BM, Gong W, Tang H (2023) Risk assessment of roadway networks exposed to landslides in mountainous regions—a case study in Fengjie County. China Landslides 20(7):1419–1431. https://doi.org/10.1007/s10346-023-02045-3
Zhao L, Wu X, Niu R, Wang Y, Zhang K (2020) Using the rotation and random forest models of ensemble learning to predict landslide susceptibility. Geomat Nat Haz Risk 11(1):1542–1564. https://doi.org/10.1080/19475705.2020.1803421
Download references
This work was supported by the National Natural Science Foundation of China (Grant Nos. 12072102, 12102129), the Six talent peaks project in Jiangsu Province, and the Program to Cultivate Middle-aged and Young Science Leaders of Colleges and Universities of Jiangsu Province, China.
Authors and affiliations.
Geotechnical Research Institute, Hohai University, Nanjing, Jiangsu, 210098, China
Yuhang Jiang, Wei Wang & Yajun Cao
Key Laboratory of Ministry of Education for Geomechanics and Embankment Engineering, Hohai University, Nanjing, Jiangsu, 210098, China
School of Earth Science and Engineering, Hohai University, Nanjing, Jiangsu, 211100, China
Department of Civil and Environmental Engineering, University of Waterloo, 200 University Avenue West, Waterloo, ON N2L 3G1, Canada
Wei-Chau Xie
Correspondence to Wei Wang .
Competing interests.
The authors declare no competing interests.
Jiang, Y., Wang, W., Zou, L. et al. Investigating landslide data balancing for susceptibility mapping using generative and machine learning models. Landslides (2024). https://doi.org/10.1007/s10346-024-02352-3
Received : 10 April 2024
Accepted : 09 August 2024
Published : 05 September 2024
DOI : https://doi.org/10.1007/s10346-024-02352-3
A research paper is a piece of academic writing that provides analysis, interpretation, and argument based on in-depth independent research.
Research papers are similar to academic essays, but they are usually longer and more detailed assignments, designed to assess not only your writing skills but also your skills in scholarly research. Writing a research paper requires you to demonstrate a strong knowledge of your topic, engage with a variety of sources, and make an original contribution to the debate.
This step-by-step guide takes you through the entire writing process, from understanding your assignment to proofreading your final draft.
Understand the assignment, choose a research paper topic, conduct preliminary research, develop a thesis statement, create a research paper outline, write a first draft of the research paper, write the introduction, write a compelling body of text, write the conclusion, the second draft, the revision process, research paper checklist, free lecture slides.
Completing a research paper successfully means accomplishing the specific tasks set out for you. Before you start, make sure you thoroughly understand the assignment task sheet:
Carefully consider your timeframe and word limit: be realistic, and plan enough time to research, write, and edit.
There are many ways to generate an idea for a research paper, from brainstorming with pen and paper to talking it through with a fellow student or professor.
You can try free writing, which involves taking a broad topic and writing continuously for two or three minutes to identify absolutely anything relevant that could be interesting.
You can also gain inspiration from other research. The discussion or recommendations sections of research papers often include ideas for other specific topics that require further examination.
Once you have a broad subject area, narrow it down to choose a topic that interests you, meets the criteria of your assignment, and is possible to research. Aim for ideas that are both original and specific:
Note any discussions that seem important to the topic, and try to find an issue that you can focus your paper around. Use a variety of sources, including journals, books, and reliable websites, to ensure you do not miss anything glaring.
Do not only verify the ideas you have in mind, but look for sources that contradict your point of view.
In this stage, you might find it helpful to formulate some research questions to help guide you. To write research questions, try to finish the following sentence: “I want to know how/what/why…”
A thesis statement is a statement of your central argument — it establishes the purpose and position of your paper. If you started with a research question, the thesis statement should answer it. It should also show what evidence and reasoning you’ll use to support that answer.
The thesis statement should be concise, contentious, and coherent. That means it should briefly summarize your argument in a sentence or two, make a claim that requires further evidence or analysis, and make a coherent point that relates to every part of the paper.
You will probably revise and refine the thesis statement as you do more research, but it can serve as a guide throughout the writing process. Every paragraph should aim to support and develop this central claim.
A research paper outline is essentially a list of the key topics, arguments, and evidence you want to include, divided into sections with headings so that you know roughly what the paper will look like before you start writing.
A structure outline can help make the writing process much more efficient, so it’s worth dedicating some time to create one.
Your first draft won’t be perfect — you can polish later on. Your priorities at this stage are as follows:
You do not need to start by writing the introduction. Begin where it feels most natural for you — some prefer to finish the most difficult sections first, while others choose to start with the easiest part. If you created an outline, use it as a map while you work.
Do not delete large sections of text. If you begin to dislike something you have written or find it doesn’t quite fit, move it to a different document, but don’t lose it completely — you never know if it might come in useful later.
Paragraphs are the basic building blocks of research papers. Each one should focus on a single claim or idea that helps to establish the overall argument or purpose of the paper.
George Orwell’s 1946 essay “Politics and the English Language” has had an enduring impact on thought about the relationship between politics and language. This impact is particularly obvious in light of the various critical review articles that have recently referenced the essay. For example, consider Mark Falcoff’s 2009 article in The National Review Online, “The Perversion of Language; or, Orwell Revisited,” in which he analyzes several common words (“activist,” “civil-rights leader,” “diversity,” and more). Falcoff’s close analysis of the ambiguity built into political language intentionally mirrors Orwell’s own point-by-point analysis of the political language of his day. Even 63 years after its publication, Orwell’s essay is emulated by contemporary thinkers.
It’s also important to keep track of citations at this stage to avoid accidental plagiarism. Each time you use a source, make sure to take note of where the information came from.
The research paper introduction should address three questions: What, why, and how? After finishing the introduction, the reader should know what the paper is about, why it is worth reading, and how you’ll build your arguments.
What? Be specific about the topic of the paper, introduce the background, and define key terms or concepts.
Why? This is the most important, but also the most difficult, part of the introduction. Try to provide brief answers to the following questions: What new material or insight are you offering? What important issues does your essay help define or answer?
How? To let the reader know what to expect from the rest of the paper, the introduction should include a “map” of what will be discussed, briefly presenting the key elements of the paper in chronological order.
The major struggle faced by most writers is how to organize the information presented in the paper, which is one reason an outline is so useful. However, remember that the outline is only a guide and, when writing, you can be flexible with the order in which the information and arguments are presented.
One way to stay on track is to use your thesis statement and topic sentences. Check:
Be aware of paragraphs that seem to cover the same things. If two paragraphs discuss something similar, they must approach that topic in different ways. Aim to create smooth transitions between sentences, paragraphs, and sections.
The research paper conclusion is designed to help your reader out of the paper’s argument, giving them a sense of finality.
Trace the course of the paper, emphasizing how it all comes together to prove your thesis statement. Give the paper a sense of finality by making sure the reader understands how you’ve settled the issues raised in the introduction.
You might also discuss the more general consequences of the argument, outline what the paper offers to future students of the topic, and suggest any questions the paper’s argument raises but cannot or does not try to answer.
You should not:
There are four main considerations when it comes to the second draft.
The goal during the revision and proofreading process is to ensure you have completed all the necessary tasks and that the paper is as well-articulated as possible. You can speed up the proofreading process by using the AI proofreader.
Check the content of each paragraph, making sure that:
Next, think about sentence structure, grammatical errors, and formatting. Check that you have correctly used transition words and phrases to show the connections between your ideas. Look for typos, cut unnecessary words, and check for consistency in aspects such as heading formatting and spellings.
Finally, you need to make sure your paper is correctly formatted according to the rules of the citation style you are using. For example, you might need to include an MLA heading or create an APA title page.
I have followed all instructions in the assignment sheet.
My introduction presents my topic in an engaging way and provides necessary background information.
My introduction presents a clear, focused research problem and/or thesis statement.
My paper is logically organized using paragraphs and (if relevant) section headings.
Each paragraph is clearly focused on one central idea, expressed in a clear topic sentence.
Each paragraph is relevant to my research problem or thesis statement.
I have used appropriate transitions to clarify the connections between sections, paragraphs, and sentences.
My conclusion provides a concise answer to the research question or emphasizes how the thesis has been supported.
My conclusion shows how my research has contributed to knowledge or understanding of my topic.
My conclusion does not present any new points or information essential to my argument.
I have provided an in-text citation every time I refer to ideas or information from a source.
I have included a reference list at the end of my paper, consistently formatted according to a specific citation style.
I have thoroughly revised my paper and addressed any feedback from my professor or supervisor.
I have followed all formatting guidelines (page numbers, headers, spacing, etc.).
Wednesday, September 11, 2024 12pm to 1:15pm
About this Event
AMERICAN POLITICS & PUBLIC POLICY WORKSHOP Abstract: When do modern difference-in-differences (DID)-style methods work for empirical political science? Scholars exploit the staggered roll-out of policies like election regulation, civil service reform, and healthcare across places to estimate causal effects - often using the two-way fixed effects (TWFE) estimator. However, recent literature has highlighted the TWFE estimator's bias in the presence of heterogeneous treatment effects and tendency to make "forbidden comparisons" between treated units. In response, scholars have increasingly turned to modern DID estimators that promise greater robustness to real-world data problems. This paper asks how well these modern methods work for the empirical settings and sample sizes commonly used in political science, with the U.S. states as the running example. In particular, it provides a simulation study of the performance of seven DID methods under either constant or heterogeneous effects, in an N=50 setting that mimics the American federalism natural experiment. I find that many modern methods (1) produce confidence intervals that do not include the true average effect at the specified rate and (2) are underpowered. I show that many cases of coverage problems with modern DID estimators can be addressed using the block bootstrap to estimate standard errors. However, I also show that even where identification and estimation are straightforward, the fifty-state sample poses a power problem without large average effect sizes - at least 0.5 standard deviations. I illustrate the challenges of DID research with the fifty-state panel in the case of estimating the effects of strict voter identification laws on voter turnout. Amanda Weiss is a Ph.D. candidate in political science, also getting an MA in statistics. She works on political methodology and American politics, often with a policy orientation. 
The first strand of her research takes up challenges in observational causal inference about policy effects. The second strand takes up challenges in experimental political behavior about policy attitudes and affective states. Open to the Yale community only. Please visit this link to subscribe and receive regular announcements: csap.yale.edu/american-politics-public-policy-workshop .
Scientific Reports volume 14, Article number: 20355 (2024)
To address the low accuracy of fault diagnosis for oil-immersed transformers, together with poor state perception and poor real-time collaboration during diagnosis feedback, a transformer fault diagnosis method that integrates digital twins is proposed. First, the fault samples are balanced through Iterative Nearest Neighbor Oversampling (INNOS). Second, nine-dimensional ratio features are extracted, and the correlation between dissolved gases in oil and fault types is established. Then, sparse principal component analysis (SPCA) is used for feature fusion and dimensionality reduction. Finally, the Aquila Optimizer (AO) is introduced to optimize the parameters of the Kernel Extreme Learning Machine (KELM), establishing the optimal AO-KELM diagnosis model. The final fault diagnosis accuracy reaches 98.1013%. Combined with the transformer digital twin model, real-time interaction mapping between the physical entity and the virtual space is achieved, enabling online diagnosis of transformer faults. Experimental results show that the proposed method has high diagnostic accuracy and strong stability, providing a reference for the intelligent operation and maintenance of transformers.
Introduction.
The transformer is the hub of the power system, and its health status directly affects the stability and reliability of the system's operation. Precise management of transformer health is therefore essential to the stable and secure operation of the power grid 1 .
At present, Dissolved Gas Analysis (DGA) is widely used to monitor and identify faults in oil-immersed transformers 2 , 3 , mainly through the IEC three-ratio method 4 , the Rogers four-ratio method 5 , and the Duval triangle method 6 . Although these approaches are simple to apply, they represent fault characteristics only coarsely and their coding boundaries are blurred, which leads to low fault-recognition accuracy 7 . With the rapid advancement of artificial intelligence, researchers have combined machine learning with DGA and achieved notable results in transformer fault detection. Reference 8 optimizes the support vector machine parameters with an improved scalar search algorithm, increasing both the convergence speed and the diagnostic accuracy of the method. Reference 9 proposes an SE-ELM diagnostic method and validates it on several datasets. Reference 10 improves the particle swarm optimization algorithm by dynamically adjusting the inertia weights and acceleration factors, and uses it to iteratively optimize the parameters of XGBoost, improving the model's classification performance. In addition, Convolutional Neural Networks 11 , 12 , Long Short-Term Memory networks 13 , 14 , 15 , LightGBM 16 , and the Capsule Network 17 are widely used.
With the advancement of big data and the Internet of Things (IoT) technologies, the Digital Twin (DT) 18 technology has paved a new path for enhancing the efficiency of equipment health management. The core concept is to construct a holographic virtual twin model in the digital realm, utilizing advanced technologies such as intelligent sensing and data transmission, which accurately, comprehensively, and in real-time reflect the evolution of physical devices, achieving intelligent control over entities 19 , 20 , 21 . This technology has been extensively utilized in various sectors including aerospace, manufacturing, and healthcare.
Scholars both domestically and internationally have carried out extensive research on transformer fault diagnosis. Reference 22 proposes a method for constructing a twin model driven jointly by data and models for 10 kV oil-immersed transformers, which synchronizes the actual operating condition of the transformer with the digital twin center. Reference 23 builds a digital twin fault diagnosis model from the mechanism model and data model of the transformer, using five characteristic gases extracted from DGA data as the input feature vector of a CNN. The experiments show that the resulting 1D-CNN model responds rapidly, trains quickly, and achieves high accuracy, which validates the model. Reference 24 constructs a digital-twin-based fault diagnosis model that accounts for the structural and operational characteristics of transformers. By optimizing the smoothing factor δ of a probabilistic neural network with a differential evolution algorithm, it reaches a diagnostic accuracy of 96.7% and enables accurate monitoring of the transformer's actual operating state. Reference 25 statistically analyzes the operating data and state information of power transformers, proposes a framework for a state evaluation and fault detection system based on GCA-CNN, and verifies on 2000 real data cases that the model achieves higher accuracy in both evaluation and detection. Reference 26 establishes a high-fidelity simulation model of transformers that accurately simulates winding currents and the temperatures of different components, and that can be used to identify early faults.
However, the studies above rely on a single data source, either dissolved gas in oil or vibration signals, as the basis for fault diagnosis, whereas many factors affect transformer faults. In the future, multi-source data could be combined for a more comprehensive judgment.
In light of the above context, this paper proposes a fault diagnosis method for oil-immersed transformers that integrates a digital twin model. The main contributions of the paper are divided into several parts. Part 1 mainly elaborates on the research background of the paper and the future research direction. Part 2 establishes a transformer digital twin framework, based on geometric, physical, behavioral, and rule models, to achieve interaction mapping between the virtual entity and the physical entity. Part 3 introduces the methods used in the paper, providing theoretical support for the establishment of an accurate and efficient fault diagnosis model. Part 4 addresses the issue of imbalanced small sample data that can easily lead to misjudgment of minority class samples, deeply explores the correlation between dissolved gases in oil and fault types, and eliminates the 'dimensionality catastrophe' problem, using instance data to obtain diagnostic results. Part 5 discusses and analyzes different sampling methods, different features, and different diagnostic models. Part 6 summarizes the entire paper.
Transformer digital twin framework.
This article takes a 400 kV oil-immersed transformer as the research object and builds its digital twin using digital twin technology. The constructed digital twin framework mainly comprises the physical space, the twin body, the twin data layer, and the application service layer 27 , as shown in Fig. 1.
Fig. 1 Transformer digital twin framework.
In the process of building a digital twin, the geometric model is the foundation for creating the digital twin model. Three-dimensional software such as UG and SolidWorks are used to comprehensively describe the solid model in terms of geometric dimensions, material properties, and assembly relationships. Based on prior knowledge, physical properties, and operating mechanisms, the geometric model is analyzed and tested for magnetic field, structure, and other modeling aspects, fully reflecting the intrinsic nature and operating mechanism of the transformer. Heterogeneous data from multiple sources, such as dissolved gas in oil and acoustic vibration signals, are collected using state-aware devices. Artificial intelligence algorithms integrated in the behavior model are used for processing and analysis. The derived data generated from simulation calculations are fed back to the mechanism model in real-time. At the same time, simulation data, state-aware data, as well as transformer's full life cycle process data, maintenance records, and computed derived data collectively form the twin database. Through data communication protocols and interfaces, real-time updates and interactive control between the physical entity and the digital twin are achieved, enabling visual description, real-time monitoring, analysis, diagnosis, and intelligent decision-making for the physical transformer. This provides new ideas for improving the safety and reliable operation of power transmission and transformation equipment.
The present work builds on the five-dimensional model proposed by Tao Fei of Beihang University 28 to create a digital twin for transformers, as expressed in Eq. (1).
where: PE denotes the physical entity of the transformer, VE represents the virtual entity, SS signifies data, algorithms and models of the digital twin, DD stands for the twinning data of the transformer, and CN symbolizes the interaction and communication among the various components.
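Read together with the definitions above, Eq. (1) has the form of a five-element tuple. A sketch of that form follows; the left-hand symbol \(M_{DT}\) is an assumed name that does not appear in the surrounding text:

\[
M_{DT} = \left( PE,\ VE,\ SS,\ DD,\ CN \right)
\]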
PE is the physical transformer, an assembly of components including the core, windings, tap-changer, and cooling equipment. It is the object sensed by contact or non-contact state-sensing devices, and it is the objectively existing entity with which the twin interacts.
The SS represents the process of integrating data and models generated by the digital twin transformer system, thereby facilitating comprehensive monitoring of entities, diagnostic analysis of equipment failures, and predictive maintenance.
VE represents the twin model in virtual space and provides the foundation for virtual-to-real mapping. Its composition is given by Eq. (2):
where: Gv represents the geometric model, which uses 3D modeling software to create a comprehensive description of the geometric features of physical entities; Pv represents the physical model, which describes the physical properties and operating mechanisms of electrical equipment; Bv represents the behavior model, which is built by integrating artificial intelligence algorithms; Rv represents the rule model, which mainly includes expert experience and rule inference based on processed historical data for optimization and deduction.
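From these definitions, Eq. (2) can be read as a four-element tuple. The form below is a sketch in which Gv, Pv, Bv, and Rv are typeset with subscripts; that typesetting is an assumption:

\[
VE = \left( G_{v},\ P_{v},\ B_{v},\ R_{v} \right)
\]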
DD represents twin data, which dynamically stores relevant data of PE/VE/SS, and is an important prerequisite for ensuring intelligent operation and maintenance of transformers. The specific representation is shown in formula ( 3 ):
where: Dp refers to the dynamic factor data collected through the state-aware device; Dv refers to the running parameters in the virtual model; Ds mainly refers to the functional and business service data; Dk includes expert experience, industry rules in the transformer field, and usage guidelines; Df refers to derived data produced by the integration, transformation, and interactive fusion of the data above.
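By the same reading, Eq. (3) collects the five data categories into one tuple. The form below is a sketch with assumed subscript typesetting for Dp, Dv, Ds, Dk, and Df:

\[
DD = \left( D_{p},\ D_{v},\ D_{s},\ D_{k},\ D_{f} \right)
\]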
CN represents the data connection part, which is crucial for ensuring the interaction and updating of the elements in the digital twin model. Through data interfaces, communication protocols, etc., efficient transmission and utilization of data in the digital twin system can be achieved, enabling seamless communication and connectivity among different parts of the model. The interactive relationships of the five dimensions in the digital twin model are shown in Fig. 2 .
Fig. 2 Transformer digital twin five-dimensional model connection relationship.
Iterative nearest neighbor oversampling algorithm.
The Iterative Nearest Neighbor Oversampling (INNOS) algorithm 29 is a sampling method designed to tackle class imbalance. Its core idea is to take the labeled samples of a given class as reference points, traverse all k labeled data points of that class, and search for the nearest unlabeled instance to each of them, repeating until the dataset is balanced or close to balanced. The specific steps are as follows:
Let the number of samples under each label in the dataset be \(r = \left\{ r_{1}, r_{2}, \cdots, r_{j}, \cdots, r_{a} \right\}\), where \(r_{j} \left( j = 1,2, \cdots, a \right)\) denotes the number of samples in category j. The imbalance degree of the sample set is defined by the standard deviation \({\text{var}}\left( r \right)\), which measures the dispersion of the class sizes in the dataset, as shown in Eq. (4):
where: \(\bar{r} = \frac{1}{a}\sum\limits_{j = 1}^{a} r_{j}\).
Following a greedy search strategy, the candidate sample for each category is identified as detailed in formula (5):
where: \(x_{j}\) represents the labeled data in category j. If \(x_{\max k}\) lies on the classification boundary, discard it and select the next nearest neighbor. Then label the selected sample as category j, remove it from the unlabeled data set \(X_{U}\), add it to the labeled data set \(X_{L}\), and set \(r_{j} = r_{j} + 1\). Recalculate the imbalance degree, and stop iterating once the preset value is reached.
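The steps above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the authors' implementation: Eq. (4) is taken to be the population standard deviation of the class counts, the nearest-neighbor search uses Euclidean distance, and the classification-boundary check is omitted. The names `imbalance_degree` and `innos_sketch` are hypothetical.

```python
import math

def imbalance_degree(class_counts):
    """Population standard deviation of the per-class counts r_1..r_a (assumed form of Eq. 4)."""
    a = len(class_counts)
    mean = sum(class_counts) / a  # r-bar = (1/a) * sum(r_j)
    return math.sqrt(sum((r - mean) ** 2 for r in class_counts) / a)

def innos_sketch(labeled, unlabeled, target):
    """Greedy loop: give the smallest class its nearest unlabeled point until balanced.

    labeled: dict mapping class label -> list of points (tuples)
    unlabeled: list of points (tuples)
    target: stop once the imbalance degree is no larger than this value
    """
    labeled = {c: list(points) for c, points in labeled.items()}
    pool = list(unlabeled)
    while pool and imbalance_degree([len(p) for p in labeled.values()]) > target:
        minority = min(labeled, key=lambda c: len(labeled[c]))
        # Nearest unlabeled point to any labeled point of the minority class.
        best = min(pool, key=lambda u: min(math.dist(u, x) for x in labeled[minority]))
        pool.remove(best)                # remove from X_U
        labeled[minority].append(best)   # add to X_L, so r_j grows by 1
    return labeled

labeled = {1: [(0, 0), (0, 1), (1, 0), (1, 1)], 2: [(5, 5)]}
unlabeled = [(5, 6), (6, 5), (4, 5), (0.5, 0.5)]
result = innos_sketch(labeled, unlabeled, target=0.5)
print({c: len(points) for c, points in result.items()})
```

The loop stops either when the imbalance degree reaches the target or when the unlabeled pool is exhausted, so it always terminates.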
The Kernel Extreme Learning Machine (KELM) 30 is based on a single hidden layer feedforward neural network. It introduces a kernel function on top of the ELM algorithm, which maps low-dimensional data to a high-dimensional feature space, resulting in a model with stronger generalization and robustness. The specific steps are as follows:
Assume N samples \(\left\{ \left( x_{i}, t_{i} \right) \right\}_{i = 1}^{N}\) are given, where \(x_{i} = \left[ x_{i1}, x_{i2}, \cdots, x_{in} \right]^{T} \in R^{n}\) and \(t_{i} = \left[ t_{i1}, t_{i2}, \cdots, t_{im} \right]^{T} \in R^{m}\) denote the input vector and the target output vector of the model, respectively. For a single-hidden-layer network with L hidden nodes and activation function \(g\left( x \right)\), the ELM model is given by Eq. (6):
where: \(\beta_{j} = \left[ \beta_{j1}, \beta_{j2}, \cdots, \beta_{jL} \right]^{T} \left( j = 1,2, \cdots, L \right)\) denotes the output weight connecting the jth hidden layer node with the output layer nodes. \(H = \left\{ h_{ij} \right\}\left( i = 1,2, \cdots, N; j = 1,2, \cdots, L \right)\) is the output matrix of the hidden layer: the jth column of H is the output of the jth hidden layer node for the inputs \(x_{1}, x_{2}, \cdots, x_{N}\), and the ith row of H is the hidden layer output vector for \(x_{i}\).
The output weights are obtained by the least squares method, as shown in formula (7):
In the formula, \(H{\prime}\) represents the generalized inverse matrix of the hidden layer output matrix H.
Introducing a kernel function avoids the problems caused by randomly generated input weights and bias values. The KELM output weights are given by formula (8):
The KELM output function is given by formula (9):
When \(h\left( x \right)\) remains unknown, the kernel function matrix is represented by formula ( 10 ):
In the equation, \(K\left( {x_{i} ,x_{j} } \right)\) denotes the kernel function, given by:
The KELM model's output function expression is delineated in formula ( 12 ):
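As a rough illustration of the procedure, the sketch below implements the commonly used closed form of KELM, in which the output weights solve \((I/C + \Omega)\beta = T\) with \(\Omega_{ij} = K(x_{i}, x_{j})\), and a new sample is scored by \([K(x, x_{1}), \cdots, K(x, x_{N})]\beta\). The Gaussian RBF kernel, the parameter names `C` and `gamma`, and all function names are assumptions made for this example; the kernel actually used in the paper is the one defined by its own formula.

```python
import math

def rbf(u, v, gamma):
    """Gaussian RBF kernel (assumed kernel choice for this sketch)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(u, v)))

def solve(A, B):
    """Solve A X = B by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [list(A[i]) + list(B[i]) for i in range(n)]  # augmented matrix [A | B]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [v / p for v in M[col]]
        for r in range(n):
            if r != col:
                f = M[r][col]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[col])]
    return [row[n:] for row in M]

def kelm_fit(X, T, C=100.0, gamma=1.0):
    """Return beta = (I/C + Omega)^-1 T, where Omega is the kernel matrix."""
    n = len(X)
    K = [[rbf(X[i], X[j], gamma) for j in range(n)] for i in range(n)]
    for i in range(n):
        K[i][i] += 1.0 / C  # regularization term I/C
    return solve(K, T)

def kelm_predict(X_train, beta, x, gamma=1.0):
    """Score x against every training sample and return the arg-max class index."""
    k = [rbf(x, xi, gamma) for xi in X_train]
    n_classes = len(beta[0])
    scores = [sum(k[i] * beta[i][c] for i in range(len(k))) for c in range(n_classes)]
    return scores.index(max(scores))

# Two well-separated toy clusters with one-hot targets.
X_train = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (5.0, 5.0), (5.0, 6.0), (6.0, 5.0)]
T_train = [[1.0, 0.0]] * 3 + [[0.0, 1.0]] * 3
beta = kelm_fit(X_train, T_train)
print(kelm_predict(X_train, beta, (0.5, 0.5)), kelm_predict(X_train, beta, (5.5, 5.5)))
```

Because the hidden layer mapping never appears explicitly, the only quantities to tune are the regularization parameter and the kernel parameter, which is what makes KELM a natural target for a parameter optimizer.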
Sparse principal component analysis (SPCA) 31 builds on principal component analysis (PCA) by adding a LASSO penalty term, which makes the loading matrix sparse. It recasts PCA as a regression-type optimization problem and solves for the optimal regression coefficient matrix. Compared with traditional PCA, SPCA handles sparsity in high-dimensional data effectively and yields more interpretable results.
The SPCA algorithm has two parts: the first computes the principal components via PCA; the second adds the LASSO penalty term to make the solution sparse. The specific steps are as follows:
Given an \(n \times m\) data matrix X, its eigendecomposition after normalization is given by formula (13):
In the equation, \(\Lambda \in R^{m \times m}\) is a diagonal matrix of eigenvalues arranged in descending order, and the accompanying \(m \times m\) unitary matrix has the loading vectors as its columns.
Select the first k columns of the loading matrix, \(P \in R^{m \times k}\), and compute the score matrix T, as shown in Eq. (14):
Projecting T back onto the space of X yields a new matrix \(\hat{X}\) that carries the information of the corresponding principal components; its difference from X is denoted E, as shown in formulas (15) and (16):
Solving SPCA starts from the PCA model. Combining formulas (15) and (16) yields expression (17):
The principal components should be as close to the original data as possible, that is, E should be minimized. The principal components are therefore obtained from formula (18):
In the equation, \(\hat{P}\) is the value of the loading matrix P that attains the minimum.
The loading vectors obtained by PCA are in general all non-zero; a sparse solution is obtained by adding the LASSO penalty term, which also mitigates overfitting in PCA. The sparse principal components are solved by formula (19):
In this equation, A and B are the \(m \times k\) matrices to be solved for, with B being the coefficient matrix of the regression formulation; \(\hat{A}\) and \(\hat{B}\) are the values of A and B that attain the minimum, subject to \(b_{j} \propto P_{j}\); \(\lambda\) and \(\lambda_{1,j}\) are the penalty coefficients, with \(\lambda > 0\). The adjusted variance is given by formula (20):
In the equation, the diagonal matrix holds the explained variances, and \(\hat{P}\) is the loading matrix obtained from the sparse coefficients. The contribution rate of the model is given by formula (21):
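To see why a LASSO term drives some loadings exactly to zero, it helps to look at the soft-thresholding operator, which is the closed-form minimizer of the one-dimensional problem \(\frac{1}{2}(b - z)^{2} + \lambda |b|\). The sketch below is only an illustration of that mechanism, not the SPCA solver used in the paper; the loading values and the threshold are made up for the example.

```python
def soft_threshold(z, lam):
    """Minimizer of 0.5 * (b - z)**2 + lam * abs(b) with respect to b."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0  # small coefficients are set exactly to zero

# Hypothetical dense loading vector, shrunk with a hypothetical penalty of 0.1.
dense_loadings = [0.62, -0.05, 0.48, 0.03, -0.41]
sparse_loadings = [soft_threshold(v, 0.1) for v in dense_loadings]
print(sparse_loadings)
```

Entries whose magnitude is below the penalty vanish, while the remaining entries are shrunk toward zero, which is the trade-off between sparsity and explained variance that SPCA makes.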
This article addresses transformer fault diagnosis under class imbalance in small sample sets and aims at real-time, accurate diagnosis by establishing a diagnostic model and a defined diagnostic process. The diagnostic process is illustrated in Fig. 3 . The AO-KELM model serves as the diagnostic model, and the process integrates offline model training with online fault identification.
Transformer fault diagnosis model based on optimized kernel extreme learning machine.
⑴ Train the model offline
The offline model training stage is described in three parts: data preprocessing, feature extraction, and model recognition.
Step 1: preprocessing comprises INNOS oversampling and normalization. The collected DGA samples are augmented with INNOS to expand the minority classes and are then normalized.
Step 2: feature extraction comprises the construction of ratio features and SPCA-based fusion and dimensionality reduction. First, multidimensional discriminant features are constructed to capture the correlation between the ratios of dissolved gas contents in oil and the fault type. Then SPCA is applied for feature fusion to obtain the optimal principal components, removing redundant information, and the data are divided proportionally into a training set, a validation set, and a test set.
Step 3: model identification comprises training and validating the model. The AO algorithm optimizes the regularization parameter C and the kernel function of the KELM model, and the accuracy is checked on the validation set at each iteration. If the difference between consecutive training rounds is below 5%, training proceeds; otherwise the model is retrained until the condition is met. The result is the optimal AO-KELM diagnostic model.
⑵ Online fault diagnosis
Samples collected in real time are normalized, multidimensional features are constructed with the unencoded ratio method, and after projection onto the optimal principal components the features are fed directly into the optimal diagnostic model, giving rapid recognition of transformer faults. Although offline model training is computationally expensive, it needs to be done only once; online recognition and diagnosis then run continuously as real-time monitoring data arrive.
Data source and normalization processing.
Transformer faults are aggravated by thermal and electrochemical action, which decomposes the internal insulating materials and dissolves various hydrocarbon gases in the insulating oil. The characteristics of the gases dissolved in the oil differ with the fault type, and research has shown that faults can be diagnosed and classified using DGA techniques 32 . Consequently, the contents of five dissolved gases are used as the basis for transformer fault diagnosis in this article.
The article uses 337 monitoring samples from a power supply company and divides the transformer operating status into six categories: normal, medium–low temperature overheating, high-temperature overheating, high-energy discharge, low-energy discharge, and partial discharge, labeled 1 through 6. Each sample is described by five characteristic gases: H2, CH4, C2H4, C2H6, and C2H2; the number of samples in each category is given in Table 1 . The majority of samples are normal, making up 35.63% of the total, while low-energy discharge and partial discharge account for 5.55% and 9.78% respectively, with a maximum imbalance ratio of 5.1:1. With such imbalanced data, minority class samples are easily misidentified as normal, which reduces recognition accuracy. Therefore, this paper uses the INNOS algorithm to augment the minority class samples and balance the categories.
To show how the data differ before and after sampling, principal component analysis is applied to the sample data before and after sampling, and the first two principal components are used to visualize each category, as illustrated in Fig. 4 . Fig. 4 shows that the distribution trends of the various fault types are the same before and after INNOS sampling, which supports the viability of the INNOS sampling approach.
Scatter plot of INNOS samples.
Because the contents of the different dissolved gases differ greatly in magnitude, each gas content is first normalized, as shown in Eq. (22):
In the equation: \(x_{i}\) and \(x_{i}^{*}\) are the feature values before and after normalization; \(x_{i\max}\) and \(x_{i\min}\) are the maximum and minimum of the original values.
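A minimal sketch of this step, assuming the normalization is standard min-max scaling, \(x_{i}^{*} = (x_{i} - x_{i\min}) / (x_{i\max} - x_{i\min})\), which is what the symbol definitions suggest. The function name and the sample values are hypothetical.

```python
def min_max_normalize(values):
    """Scale one gas-content feature to [0, 1] using its own minimum and maximum."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant feature carries no information; map it to zeros to avoid dividing by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical H2 contents for three samples.
h2 = [10.0, 20.0, 30.0]
print(min_max_normalize(h2))
```

In an online setting the minimum and maximum would normally be taken from the training data and reused for new samples, so that offline and online features share the same scale.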
The unencoded ratio method 33 is one of the widely used techniques: the percentage ratio of key gases to either the total gas or the total hydrocarbon content reflects the relationship between characteristic gases and fault types. For instance, the ratio of a single gas to the total hydrocarbon content is a clearer indicator for distinguishing fault types; the contents of C2H4 and CH4 can effectively separate partial discharge from discharge with overheating; and the percentage of C2H2 can indicate whether a transformer has experienced a thermal fault. Drawing on the relevant literature, this paper establishes a nine-dimensional candidate ratio feature set for transformer fault diagnosis 31 , as listed in Table 2 , where THC = CH4 + C2H4 + C2H6 + C2H2 and ALL = H2 + CH4 + C2H4 + C2H6 + C2H2.
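The sketch below shows how such ratio features can be computed from the five gas contents. THC and ALL follow the definitions in the text; the four ratios are illustrative picks only, since the nine ratios of Table 2 are not reproduced here. The function name and the gas values are hypothetical.

```python
def ratio_features(h2, ch4, c2h4, c2h6, c2h2):
    """Build example unencoded ratio features from the five dissolved gas contents."""
    thc = ch4 + c2h4 + c2h6 + c2h2   # total hydrocarbons, as defined in the text
    all_gas = h2 + thc               # all five gases, as defined in the text

    def safe_div(a, b):
        # Guard against samples in which the denominator gases are all zero.
        return a / b if b else 0.0

    return {
        "THC": thc,
        "ALL": all_gas,
        # Illustrative ratios; the paper's actual nine features are listed in its Table 2.
        "H2/ALL": safe_div(h2, all_gas),
        "CH4/THC": safe_div(ch4, thc),
        "C2H4/THC": safe_div(c2h4, thc),
        "C2H2/THC": safe_div(c2h2, thc),
    }

features = ratio_features(h2=10.0, ch4=20.0, c2h4=30.0, c2h6=25.0, c2h2=15.0)
print(features)
```

Ratios are scale-free, so two samples with the same gas proportions but different absolute contents map to the same features.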
To avoid redundancy in the fault-related feature information and to improve the efficiency and accuracy of the diagnostic model, SPCA is used to fuse the constructed ratio features. The cumulative explained variance contribution rate of the principal components is shown in Fig. 5 . Fig. 5 shows that the cumulative variance contribution rate of the first five principal components reaches 90.4419%, meaning that the first five principal components carry more than 90% of the information expressed by all the principal components. These five principal components are therefore selected as inputs to the transformer fault diagnosis model.
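The selection rule described here, keeping the smallest number of components whose cumulative contribution rate passes a threshold, can be sketched as follows. The per-component ratios are hypothetical values chosen for the example and are not the paper's figures.

```python
def components_for_threshold(explained_ratios, threshold=0.90):
    """Return (k, cumulative) for the smallest k whose cumulative ratio reaches the threshold."""
    total = 0.0
    for k, ratio in enumerate(explained_ratios, start=1):
        total += ratio
        if total >= threshold:
            return k, total
    # Threshold never reached: keep every component.
    return len(explained_ratios), total

# Hypothetical contribution rates for nine components, sorted in descending order.
ratios = [0.40, 0.20, 0.15, 0.10, 0.06, 0.04, 0.03, 0.01, 0.01]
k, cumulative = components_for_threshold(ratios)
print(k, round(cumulative, 4))
```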
Cumulative variance contribution rate.
The fused features extracted by SPCA are divided into training, test, and validation sets in a 6:2:2 ratio. The regularization parameter C in KELM determines the learning capacity and diagnostic accuracy of the model; in this paper the AO optimization algorithm, introduced in the literature 34 , 35 , is used to optimize C, resulting in a diagnostic model based on SPCA-AO-KELM. Figure 6 shows the confusion matrix of the transformer fault diagnosis. Of the 158 samples in the test set, 155 were correctly diagnosed, an overall accuracy of 98.1013%. The accuracy for normal, high-temperature overheating, and low-energy discharge is 100%, while one misjudgment each occurred in medium–low temperature overheating, high-energy discharge, and partial discharge.
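A small sketch of a 6:2:2 split together with the accuracy arithmetic quoted above. The shuffling, the fixed seed, and the function name are assumptions made for illustration; the paper does not state how its split was drawn.

```python
import random

def split_622(samples, seed=0):
    """Shuffle and split samples into training, test, and validation sets in a 6:2:2 ratio."""
    items = list(samples)
    random.Random(seed).shuffle(items)  # fixed seed so the split is reproducible
    n = len(items)
    n_train = n * 6 // 10
    n_test = n * 2 // 10
    train = items[:n_train]
    test = items[n_train:n_train + n_test]
    validation = items[n_train + n_test:]  # remainder goes to validation
    return train, test, validation

train, test, validation = split_622(range(100))
print(len(train), len(test), len(validation))

# Overall accuracy reported for the test set: 155 correct out of 158.
accuracy = 155 / 158
print(f"{accuracy:.4%}")
```

The last two lines reproduce the quoted figure: 155/158 rounds to 98.1013%.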
Transformer fault diagnosis results.
However, diagnostic accuracy alone cannot fully evaluate the effect of rare fault classes on classification performance 36 , 37 . This paper therefore introduces evaluation metrics derived from the confusion matrix, using precision (P), recall (R), and F1-score as the core of the evaluation system. Precision measures how reliably the model identifies each fault type, recall measures the model's sensitivity in recognizing each fault type, and the F1-score, which combines precision and recall, reflects classification performance under sample imbalance; the specific formulas are given in the cited literature. The precision, recall, and F1-score computed from the confusion matrix in Fig. 6 are 0.9816, 0.9825, and 0.9820 respectively, further confirming the model's high and stable fault detection accuracy.
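For readers who want to reproduce such metrics, the sketch below computes precision, recall, and F1 from a confusion matrix. It assumes macro averaging over classes and takes F1 as the harmonic mean of the averaged precision and recall; the paper defers its exact formulas to the cited literature, so this is one plausible reading. The 2×2 matrix is a made-up example.

```python
def macro_metrics(cm):
    """cm[i][j] is the number of samples of true class i predicted as class j."""
    n = len(cm)
    precisions, recalls = [], []
    for c in range(n):
        tp = cm[c][c]
        predicted = sum(cm[r][c] for r in range(n))  # column sum: predicted as class c
        actual = sum(cm[c])                          # row sum: truly class c
        precisions.append(tp / predicted if predicted else 0.0)
        recalls.append(tp / actual if actual else 0.0)
    p = sum(precisions) / n
    r = sum(recalls) / n
    f1 = 2 * p * r / (p + r) if (p + r) else 0.0
    return p, r, f1

# Hypothetical two-class confusion matrix.
p, r, f1 = macro_metrics([[8, 2], [1, 9]])
print(round(p, 4), round(r, 4), round(f1, 4))

# Consistency check on the values quoted in the text.
reported_f1 = 2 * 0.9816 * 0.9825 / (0.9816 + 0.9825)
print(round(reported_f1, 4))
```

The reported triple is consistent with this reading: the harmonic mean of 0.9816 and 0.9825 is about 0.9820.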
Comparison and analysis of different sampling methods.
To verify that the new samples synthesized by INNOS improve the accuracy of transformer fault diagnosis, this paper compares INNOS with the unbalanced data set and with the random oversampling, SMOTE, and ADASYN oversampling algorithms; the diagnostic results are shown in Fig. 7 . The red dots in the figure represent the test set samples that are correctly classified, the circles represent the samples of the true class, and the scattered dots represent the samples that are misclassified as other classes. The more scattered points there are, the higher the misclassification rate. In Fig. 7 d, the diagnostic accuracy on the original unbalanced data set is only 88.4058%, indicating that the imbalance among fault categories leaves the diagnostic model insufficiently trained, so minority class samples are easily misclassified as majority class samples. After the data set is balanced with the different sampling methods, the misclassification rate decreases. The sampling method used in this paper improves the diagnostic accuracy by 7.7967%, 2.5316%, and 1.8987% compared to ADASYN, SMOTE, and random oversampling, respectively, indicating that the INNOS sampling method can effectively address the low diagnostic accuracy caused by data imbalance.
Diagnostic results under different sampling methods.
To demonstrate the effectiveness of the SPCA feature fusion method, this study analyzes it from two perspectives: qualitative observation and quantitative analysis. First, PCA, KPCA, and SPCA were used to extract features from the constructed ratio features. The cumulative variance contribution rate threshold was set at 90%, and the resulting principal component information is detailed in Table 3 . SPCA introduces a LASSO penalty term on top of PCA to constrain some loadings to zero, which costs some variance contribution rate. The table shows that the contribution rate of the SPCA principal components is slightly lower than that of PCA and KPCA, while redundant information in the ratio features is effectively removed, providing a sound data foundation for subsequent classification and recognition.
Furthermore, for the above feature extraction methods, quantitative calculations were performed. The fused features extracted by the 9-dimensional joint feature, PCA, KPCA, and SPCA were input into the diagnostic model for comparative analysis, as shown in Fig. 8 . From Fig. 8 a–d, it can be observed that the diagnostic accuracy is significantly improved after feature extraction. Figure 8 a has a higher accuracy compared to Fig. 8 b and c, which validates the superiority of the SPCA feature extraction method.
Diagnostic outcomes under various characteristics.
To explore the diagnostic performance of the models, three diagnostic models, ELM, KELM, and AO-ELM, were constructed for horizontal comparison. The diagnostic results are shown in Table 4 . From the perspective of a single model, the introduction of a kernel function improved the diagnostic accuracy and evaluation indicators of ELM. From the perspective of optimization algorithms, the diagnostic capability of fault recognition was effectively improved after parameter optimization using the AO algorithm.
On the other hand, the fused features are also fed into the POA-SVM model proposed in literature 38 , the SSA-ELM model proposed in literature 39 , and the PSO-BiLSTM model proposed in literature 40 for longitudinal comparison. To reduce the influence of chance, each model is evaluated with ten-fold cross-validation, with results given in Table 5 . Table 5 shows that, with identical input features, AO-KELM improves the average accuracy by 3.23% and 2.64% over POA-SVM and SSA-ELM, respectively, and by 1.8% over PSO-BiLSTM. This indicates the stability of the AO-KELM model and its strong classification capability.
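Ten-fold cross-validation partitions the samples into ten folds and uses each fold once for testing while the other nine are used for training. The index generator below is a minimal sketch of that partitioning; it does not shuffle or stratify, which a real experiment on imbalanced classes would likely want, and the function name is hypothetical.

```python
def k_fold_indices(n_samples, k=10):
    """Return a list of (train_indices, test_indices) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    # Spread the remainder over the first folds so fold sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        folds.append((train_idx, test_idx))
        start += size
    return folds

folds = k_fold_indices(25, k=10)
print([len(test_idx) for _, test_idx in folds])
```

Averaging a model's accuracy over the ten test folds gives the kind of mean accuracy that is compared across models.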
The paper introduces an oil-immersed transformer fault diagnosis method that integrates digital twin models, providing validation through case studies, leading to the conclusions below:
A twin mechanism model is built from geometric, physical, rule, and behavior models. Real-time data drive the fusion of the data and mechanism models, completing the real-time mapping between physical and virtual entities, and visualization technology presents the twin in multiple dimensions, enabling intelligent diagnosis, health monitoring, and optimized decision-making for the transformer entity.
A transformer fault diagnosis model based on an optimized kernel extreme learning machine is proposed. It addresses the misjudgment of minority class samples caused by unbalanced small samples, effectively extracts fused features, establishes the optimal AO-KELM classifier, and achieves an accuracy of 98.1013%. Comparison with different diagnostic models verifies the classification performance and stability of the proposed method.
The datasets generated and/or analysed during the current study are not publicly available due [REASON WHY DATA ARE NOT PUBLIC] but are available from the corresponding author on reasonable request. E-mail: [email protected].
Tightiz, L. et al. An intelligent system based on optimized ANFIS and association rules for power transformer fault diagnosis. ISA Trans. 103 , 63–74 (2020).
Zhang, Y. et al. Fault diagnosis of transformer using artificial intelligence: A review. Front. Energy Res. 10 , 1006474 (2022).
Wani, S. A. et al. Advances in DGA based condition monitoring of transformers: A review. Renew. Sustain. Energy Rev. 149 , 111347 (2021).
Malik, H. & Mishra, S. Application of gene expression programming (GEP) in power transformers fault diagnosis using DGA. IEEE Trans. Ind. Appl. 52 (6), 4556–4565 (2016).
Lin, J., Ma, J. & Zhu, J. Hierarchical federated learning for power transformer fault diagnosis. IEEE Trans. Instrum. Meas. 71 , 1–11 (2022).
Duval, M. A review of faults detectable by gas-in-oil analysis in transformers. IEEE Electr. Insul. Mag. 18 (3), 8–17 (2002).
Li, P. & Hu, G. M. Transformer fault diagnosis based on data enhanced one-dimensional improved convolutional neural network. Power Syst. Technol. 47 (07), 2957–2967 (2023).
Zhou, X. H. et al. Transformer fault diagnosis based on SVM optimized by the improved bald eagle search algorithm. Power Syst. Prot. Control 51 (08), 118–126 (2023).
Chen, H. C., Zhang, Y. & Chen, M. Transformer dissolved gas analysis for highly-imbalanced dataset using multi-class sequential ensembled ELM. IEEE Trans. Dielectr. Electr. Insulat. https://doi.org/10.1109/TDEI.2023.3280436 (2023).
Gong, Z. W. Y. et al. Fault diagnosis method of transformer based on improved particle swarm optimization XGBoost. High Volt. Appar. 59 (08), 61–69 (2023).
Xu, H. R. & Wang, Z. Y. Condition evaluation and fault diagnosis of power transformer based on GAN-CNN. J. Electrotechnol. Electr. Eng. Manag. 6 (3), 8–16 (2023).
Wang, Z. & Xu, H. GCA-CNN based transformer digital twin model construction and fault diagnosis and condition evaluation analysis. Acad. J. Comput. Inf. Sci. 6 (6), 100–107 (2023).
Wang, L., Littler, T. & Liu, X. Dynamic incipient fault forecasting for power transformers using an LSTM model. IEEE Trans. Dielectr. Electr. Insulat. https://doi.org/10.1109/TDEI.2023.3253463 (2023).
Ding, Y. et al. A novel time–frequency Transformer based on self-attention mechanism and its application in fault diagnosis of rolling bearings. Mech. Syst. Signal Process. 168 , 108616 (2022).
Zheng, Q. et al. A real-time transformer discharge pattern recognition method based on CNN-LSTM driven by few-shot learning. Electr. Power Syst. Res. 219 , 109241 (2023).
Yan, P. et al. Transformer fault diagnosis research based on LIF technology and IAO optimization of LightGBM. Anal. Methods 15 (3), 261–274 (2023).
Yang, D. C. et al. Fault diagnosis of transformer based on capsule network. High Volt. Eng. 47 (02), 415–425 (2021).
Grieves, M. & Vickers, J. Digital twin: Mitigating unpredictable, undesirable emergent behavior in complex systems. In Transdisciplinary Perspectives on Complex Systems (eds Kahlen, F.-J. et al.) 85–113 (Springer International Publishing, 2017).
Bai, X. Z. et al. Selection method of feature derived from dissolved gas in oil for transformers fault diagnosis. High Volt. Eng. 49 (09), 3873–3886 (2023).
Liu, Y. P. et al. Application prospect and key technology of digital twin in power transmission and transformation equipment. High Volt. Eng. 48 (05), 1621–1633 (2022).
Yang, F. et al. Application and implementation method of digital twin in electric equipment. High Volt. Eng. 47 (05), 1505–1521 (2021).
Jiang, L. et al. Research on transformer fault diagnosis method based on digital twin. J. Syst. Simulat. https://doi.org/10.16182/j.issn1004731x.joss.23-1402 (2024).
Yan, Z. J. & Yang, Y. F. Fault diagnosis of transformers based on CNN and digital twin. Comput. Digit. Eng. 51 (11), 2758–2762 (2023).
Wang, Y. & Zhang, T. H. Fault diagnosis of transformers based on optimal probabilistic neural network based on digital twin. Mod. Mach. Tool Autom. Manuf. Techn. 11 , 20–23 (2020).
Moutis, P. & Alizadeh-Mousavi, O. Digital twin of distribution power transformer for real-time monitoring of medium voltage from low voltage measurements. IEEE Trans. Power Deliv. 36 (4), 1952–1963 (2020).
Zhang, L. J. et al. Study on electrothermal characteristics of oil-immersed power transformers in early stage of interturn faults. Proc. CSEE 43 (15), 6124–6136 (2023).
Tao, F. et al. Five-dimension digital twin model and its ten applications. Comput. Integr. Manuf. Syst. 25 (01), 1–18 (2019).
Li, S. W. et al. Application of data feature selection and classification in mechanical fault diagnosis. J. Vibrat. Shock 39 (02), 218–222 (2020).
Han, X. et al. A novel power transformer fault diagnosis model based on Harris-Hawks-optimization algorithm optimized kernel extreme learning machine. J. Electr. Eng. Technol. 17 (3), 1993–2001 (2022).
Kong, D. M. et al. Research on oil identification method based on three-dimensional fluorescence spectroscopy combined with sparse principal component analysis and support vector machine. Spectroscopy Spectral Anal. 41 (11), 3474–3479 (2021).
Kim, S. W. et al. New methods of DGA diagnosis using IEC TC 10 and related databases part l: Application of gas-ratio combinations. IEEE Trans. Dielectr. Electr. Insulat. 20 (2), 685–690 (2013).
Guo, R. Y., Peng, M. M. & Cao, Z. Q. Fault diagnosis of power transformer based on SE-DenseNet. Adv. Technol. Electr. Eng. Energy 40 (01), 61–69 (2021).
Wang, K. et al. New features derived from dissolved gas Analysis for fault diagnosis of power transformers. Proc. CSEE 36 (23), 6570–6578+6625 (2016).
Li, G. L. et al. Thermal error model of spindle for precision CNC machine tool based on AO-CNN. J. Xi’an Jiaotong Univ. 56 (08), 51–61 (2022).
Zhang, C. S. et al. Improved aquila optimization based on multi-strategy integration. Acta Electron. Sin. 51 (05), 1245–1255 (2023).
Wang, Y. et al. Transformer fault diagnosis fused with synthetic minority over-sampling balanced multi-classification data based on improved extreme learning machine. Power Syst. Technol. 47 (09), 3799–3807 (2023).
Tang, J. et al. Oversampling and cost-sensitive algorithm for transformer fault diagnosis with unbalanced samples. High Volt. Apparatus 59 (06), 93–102 (2023).
Liu, D. D. et al. POA-SVM transformer fault diagnosis based on ADASYN balanced data set. Power Syst. Clean Energy 39 (08), 36–44 (2023).
Wang, Y. et al. Transformer DGA fault diagnosis method based on DBN-SSAELM. Power Syst. Prot. Control 51 (04), 32–42 (2023).
Fan, Q. C., Yu, F. & Xuan, M. Power transformer fault diagnosis based on optimized Bi-LSTM model. Comput. Simul. 39 (11), 136–140 (2022).
Project supported by Jilin Provincial Development and Reform Commission innovation capacity construction fund (2020C022-6).
Authors and affiliations.
Hangzhou Electric Power Equipment Manufacturing Co. Ltd Yuhang Qunli Complete Sets Electricity Manufacturing Branch Electric, Hangzhou, 311000, China
Haiyan Yao, Qiang Guo & Yufeng Miao
Hangzhou Electric Power Equipment Manufacturing Co. Ltd., Hangzhou, 311000, China
Northeast Electric Power University School of Mechanic Engineering, Jilin, 132012, China
Haiyan Yao designed the experiments and contributed materials/analysis tools; Xin Zhang analyzed the data and its visualization; Qiang Guo and Yufeng Miao guided the data analysis; Shan Guan wrote the paper. All authors have reviewed the manuscript.
Correspondence to Shan Guan .
Competing interests.
The authors declare no competing interests.
Publisher's note.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/ .
Cite this article.
Yao, H., Zhang, X., Guo, Q. et al. Fault diagnosis method for oil-immersed transformers integrated digital twin model. Sci Rep 14 , 20355 (2024). https://doi.org/10.1038/s41598-024-71107-w
Received : 20 May 2024
Accepted : 26 August 2024
Published : 02 September 2024
DOI : https://doi.org/10.1038/s41598-024-71107-w
COMMENTS
The main heading of "Methods" should be centered, boldfaced, and capitalized. Subheadings within this section are left-aligned, boldfaced, and in title case. You can also add lower level headings within these subsections, as long as they follow APA heading styles. To structure your methods section, you can use the subheadings of ...
The methods section is a fundamental section of any paper since it typically discusses the 'what', 'how', 'which', and 'why' of the study, which is necessary to arrive at the final conclusions. In a research article, the introduction, which serves to set the foundation for comprehending the background and results is usually ...
The methods section should describe what was done to answer the research question, describe how it was done, justify the experimental design, and explain how the results were analyzed. Scientific writing is direct and orderly. Therefore, the methods section structure should: describe the materials used in the study, explain how the materials ...
Your Methods Section contextualizes the results of your study, giving editors, reviewers and readers alike the information they need to understand and interpret your work. Your methods are key to establishing the credibility of your study, along with your data and the results themselves. A complete methods section should provide enough detail for a skilled researcher to replicate your process ...
The methods section of a research paper describes the procedures, participants, and materials used in an experiment. Learn more about how to write a method section. ... For example: "An examiner interviewed children individually at their school in one session that lasted 20 minutes on average. The examiner explained to each child that he or she ...
The methodology section of your paper describes how your research was conducted. This information allows readers to check whether your approach is accurate and dependable. A good methodology can help increase the reader's trust in your findings. First, we will define and differentiate quantitative and qualitative research.
3. Follow the order of the results: To improve the readability and flow of your manuscript, match the order of specific methods to the order of the results that were achieved using those methods.
4. Use subheadings: Dividing the Methods section in terms of the experiments helps the reader to follow the section better.
For example, you need to ensure that you have a large enough sample size to be able to generalize and make recommendations based upon the findings. ... "How to Write the Methods Section of a Research Paper." Respiratory Care 49 (October 2004):1229-1232; Lunenburg, Frederick C. Writing a Successful Thesis or Dissertation: Tips and Strategies ...
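One common way to make "large enough" concrete, when the goal is to estimate a proportion from a survey, is Cochran's formula, n0 = z^2 * p * (1 - p) / e^2, optionally followed by a finite-population correction. The sketch below illustrates that formula only; it is not a procedure taken from the sources cited above, and the function name and parameters are hypothetical.

```python
import math
from statistics import NormalDist


def sample_size_for_proportion(margin_of_error, confidence=0.95,
                               expected_proportion=0.5, population_size=None):
    """Cochran's sample-size formula for estimating a proportion.

    Illustrative helper (hypothetical name): returns the smallest whole
    number of respondents that meets the requested margin of error.
    """
    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    p = expected_proportion
    n0 = (z ** 2) * p * (1 - p) / (margin_of_error ** 2)
    if population_size is not None:
        # Finite-population correction for small, known populations.
        n0 = n0 / (1 + (n0 - 1) / population_size)
    return math.ceil(n0)


print(sample_size_for_proportion(0.05))                        # 385
print(sample_size_for_proportion(0.05, population_size=1000))  # 278
```

Using an expected proportion of 0.5 is the conservative choice, since it maximizes p * (1 - p) and therefore the required sample size.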
Research Methodology Example. An Example of Research Methodology could be the following: ... The research methodology is an important section of any research paper or thesis, as it describes the methods and procedures that will be used to conduct the research. It should include details about the research design, data collection methods, data ...
In this first subsection, you will need to identify the participants of your experiment or study. You should include: how many people took part and how many were assigned to the experimental condition; how they were selected for participation; and any relevant demographic information (e.g., age, sex, ethnicity). You'll also need to address ...
Research methods are specific procedures for collecting and analyzing data. Developing your research methods is an integral part of your research design. When planning your methods, there are two key decisions you will make. First, decide how you will collect data. Your methods depend on what type of data you need to answer your research question:
What Is a Research Methodology? | Steps & Tips. Published on August 25, 2022 by Shona McCombes and Tegan George. Revised on November 20, 2023. Your research methodology discusses and explains the data collection and analysis methods you used in your research. A key part of your thesis, dissertation, or research paper, the methodology chapter explains what you did and how you did it, allowing ...
Provide the rationale behind your chosen approach. Let your readers know, based on logic and reason, why you chose these research methodologies. You also have to build strong arguments for why your chosen research method is the best way to achieve the desired outcome. 3. Explain your mechanism.
This part of the APA methods section should cover the percentage of respondents who participated in the research and how they were chosen. You also need to state how participants were compensated and the ethical standards followed. Example. Transgender male students from London were invited to participate in a study.
Passive voice is often considered the standard for research papers, but it is completely fine to mix passive and active voice, even in the method section, to make your text as clear and concise as possible. Use the simple past tense to describe what you did, and the present tense when you refer to diagrams or tables.
1. Qualitative research methodology. Qualitative research methodology is aimed at understanding concepts, thoughts, or experiences. This approach is descriptive and is often utilized to gather in-depth insights into people's attitudes, behaviors, or cultures. Qualitative research methodology involves methods like interviews, focus groups, and ...
Research Methodology Example. Detailed Walkthrough + Free Methodology Chapter Template. If you're working on a dissertation or thesis and are looking for an example of a research methodology chapter, you've come to the right place. In this video, we walk you through a research methodology from a dissertation that earned full distinction ...
An annotated Method section and other empirical research paper resources are available here. What is the purpose of the Method section in an empirical research paper? The Method section (also sometimes called Methods, Materials and Methods, or Research Design and Methods) describes the data collection and analysis procedures for a research project.
Media Files: APA Sample Student Paper, APA Sample Professional Paper. Note: The APA Publication Manual, 7th Edition specifies different formatting conventions for student and professional papers (i.e., papers written for credit in a course and papers intended for scholarly publication).
Sample Paper. This paper should be used only as an example of a research paper write-up. Horizontal rules signify the top and bottom edges of pages. For sample references which are not included with this paper, you should consult the Publication Manual of the American Psychological Association, 4th Edition. This paper is provided only to give ...
Quantitative research methods are used to collect and analyze numerical data. This type of research is useful when the objective is to test a hypothesis, determine cause-and-effect relationships, and measure the prevalence of certain phenomena. Quantitative research methods include surveys, experiments, and secondary data analysis.
For example, a review of prediction models identified 263 prediction models in obstetrics alone [2]; another review found 606 models related to covid-19 [3]. Interest in predicting health outcomes has been heightened by the increasing availability of big data [4], which has also led to the uptake of machine learning methods for prognostic research in ...
When you write a thesis, dissertation, or research paper, you will likely have to conduct a literature review to situate your research within existing knowledge. The literature review gives you a chance to: Demonstrate your familiarity with the topic and its scholarly context; Develop a theoretical framework and methodology for your research
Research methodology should accommodate diversity in the salience of reasons for migration, as something that requires explanation. In a survey of Polish migrants in Western Europe, Luthra, Platt, and Salamońska (2018) astutely included "just because" as a response option to the question "why did you move?" It ...
The synthesis example and results of simulation and prototype experiments have demonstrated the effectiveness of the proposed method. Regarding the proposed design method, future research can focus on the following two aspects: 1. Study on the energy and mechanical characteristics of the intermediate states in bistable mechanisms. The proposed ...
The scoping review followed a predefined protocol, established methodology [] and is reported according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses Extension for Scoping Reviews (PRISMA-ScR) guidelines [6, 7]. Healthcare quality will be described according to the following three aspects: structures, processes, and outcomes, following the Donabedian model [8, 9]. The ...
The method first selects a base point from the minority-class samples and finds its k nearest neighboring samples using the K-Nearest Neighbors (KNN) algorithm. For each neighbor found, a new sample is generated by interpolating between that neighbor and the base sample according to the following equation:
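The equation itself is not reproduced in this excerpt. As a minimal sketch, assuming the standard SMOTE-style interpolation x_new = x_base + lambda * (x_neighbor - x_base) with lambda drawn uniformly from [0, 1], the steps described could look like the following. The function names, the Euclidean distance, and the uniform draw are assumptions for illustration, not the paper's confirmed method.

```python
import math
import random


def k_nearest(base, samples, k):
    """Return the k samples closest to base (Euclidean), excluding base itself."""
    others = [s for s in samples if s is not base]
    return sorted(others, key=lambda s: math.dist(base, s))[:k]


def interpolate(base, neighbor, lam):
    """Point on the segment from base to neighbor; lam in [0, 1]."""
    return [b + lam * (n - b) for b, n in zip(base, neighbor)]


def oversample_once(minority, k=3, rng=None):
    """Pick one base point and create one synthetic sample per neighbor."""
    rng = rng or random.Random(0)
    base = rng.choice(minority)
    return [interpolate(base, nb, rng.random())
            for nb in k_nearest(base, minority, k)]


# Toy minority-class samples (made-up values for illustration).
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [5.0, 5.0]]
synthetic = oversample_once(minority, k=2)
print(len(synthetic))  # 2
```

Because each synthetic point lies on a line segment between two existing minority samples, it stays inside the region those samples already occupy.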
Develop a thesis statement. Create a research paper outline. Write a first draft of the research paper. Write the introduction. Write a compelling body of text. Write the conclusion. The second draft. The revision process. Research paper checklist.
This paper asks how well these modern methods work for the empirical settings and sample sizes commonly used in political science, with the U.S. states as the running example. In particular, it provides a simulation study of the performance of seven DID methods under either constant or heterogeneous effects, in an N=50 setting that mimics the ...
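To make the idea of such a simulation concrete, here is a minimal sketch of the basic two-group, two-period difference-in-differences (DID) estimator applied to simulated data with 50 units and a constant treatment effect. It is not one of the seven methods the paper evaluates and does not reproduce its design; the data-generating process, function names, and parameter values are all assumptions for illustration.

```python
import random


def simulate_panel(n_units=50, effect=2.0, trend=1.0, noise_sd=1.0, seed=0):
    """Simulate (treated, pre, post) outcomes for each unit.

    Hypothetical setup: the first half of the units is treated in the
    post period, every unit shares the same time trend, and the
    treatment effect is constant across treated units.
    """
    rng = random.Random(seed)
    rows = []
    for unit in range(n_units):
        treated = unit < n_units // 2
        unit_level = rng.gauss(0.0, 1.0)  # time-invariant unit effect
        pre = unit_level + rng.gauss(0.0, noise_sd)
        post = (unit_level + trend + (effect if treated else 0.0)
                + rng.gauss(0.0, noise_sd))
        rows.append((treated, pre, post))
    return rows


def did_estimate(rows):
    """Mean change among treated units minus mean change among controls."""
    def mean_change(flag):
        changes = [post - pre for treated, pre, post in rows if treated == flag]
        return sum(changes) / len(changes)
    return mean_change(True) - mean_change(False)


# With noise the estimate varies around the true effect of 2.0.
print(did_estimate(simulate_panel()))
```

Differencing within each unit removes the unit-level effect, and subtracting the control group's change removes the shared trend, which is why the estimator recovers the effect when trends are parallel.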
The sampling method used in this paper improves the diagnostic accuracy by 7.7967%, 2.5316%, and 1.8987% compared to ADASYN, SMOTE, and random oversampling, respectively, indicating that the INNOS ...