

JMIR Hum Factors. 2022 Jul-Sep;9(3).

Defining Recommendations to Guide User Interface Design: Multimethod Approach

1 Digital Media and Interaction Research Centre (DigiMedia), Department of Communication and Art, University of Aveiro, Aveiro, Portugal

2 Center for Health Technology and Services Research, University of Aveiro, Aveiro, Portugal

Ana Martins

3 Center for Health Technology and Services Research, School of Health Sciences, University of Aveiro, Aveiro, Portugal

Ana Almeida

Telmo Silva, Óscar Ribeiro

4 Center for Health Technology and Services Research, Department of Education and Psychology, University of Aveiro, Aveiro, Portugal

Gonçalo Santinha

5 Governance, Competitiveness and Public Policies, Department of Social, Political and Territorial Sciences, University of Aveiro, Aveiro, Portugal

Nelson Rocha

6 Institute of Electronics and Informatics Engineering of Aveiro, Department of Medical Sciences, University of Aveiro, Aveiro, Portugal

Anabela G Silva

Associated Data

  • Mapping of the categories proposed by the experts and the final categories (Multimedia Appendix 1).
  • Final list of 69 generic recommendations proposed (Multimedia Appendix 2).

Abstract

Background

For the development of digital solutions, different aspects of user interface design must be taken into consideration. Different technologies, interaction paradigms, user characteristics and needs, and interface design components are some of the aspects that designers and developers should pay attention to when designing a solution. Many user interface design recommendations for different digital solutions and user profiles are found in the literature, but these recommendations have numerous similarities, contradictions, and different levels of detail. A detailed critical analysis is needed that compares, evaluates, and validates existing recommendations and allows the definition of a practical set of recommendations.

Objective

This study aimed to analyze and synthesize existing user interface design recommendations and propose a practical set of recommendations that guide the development of different technologies.

Methods

Based on previous studies, a set of recommendations on user interface design was generated following 4 steps: (1) interview with user interface design experts; (2) analysis of the experts’ feedback and drafting of a set of recommendations; (3) reanalysis of the shorter list of recommendations by a group of experts; and (4) refining and finalizing the list.

Results

The findings allowed us to define a set of 174 recommendations divided into 12 categories, according to usability principles, and organized into 2 levels of hierarchy: generic (69 recommendations) and specific (105 recommendations).

Conclusions

This study shows that user interface design recommendations can be divided according to usability principles and organized into levels of detail. Moreover, this study reveals that some recommendations, as they address different technologies and interaction paradigms, need further work.

Introduction

In the context of digital solutions, user interface design consists of correctly defining the interface elements so that the tasks and interactions that users will perform are easy to understand [ 1 ]. Therefore, a good user interface design must allow users to easily interact with the digital solution to perform the necessary tasks in a natural way [ 2 ]. Considering that a digital solution is used by an individual with specific characteristics in a particular context [ 3 - 7 ], when developing a digital solution, designers must pay attention to a high number of components of user interface design, such as color [ 8 ], typography [ 1 ], navigation and search [ 9 ], input controls, and informational components [ 10 ].

Digital solutions and their interfaces must be accessible to all audiences and aimed at universal use in an era of increasingly heterogeneous users [ 3 , 4 , 11 - 17 ]. Therefore, designers should also be aware of broad and complex issues such as context-oriented design, user requirements, and adaptable and adaptive interactive behaviors [ 5 ]. The universal approach to user interface design follows heuristics and principles systematized by different authors over the years [ 18 - 20 ], but these are generic guidelines, and examples of how they can be operationalized in practice are scarce.

The literature presents many user interface design recommendations for varied digital solutions and users [ 21 - 25 ]. However, the absence of a detailed critical analysis that compares, evaluates, and validates existing recommendations has likely contributed to an increasing number of similar recommendations [ 12 , 26 - 29 ]. Although existing recommendations refer to specific technologies, forms of interaction, or end users, the content of some recommendations is generic and applicable to varied technologies and users, such as “always create good contrast between text and page background” [ 30 ]; “color contrast of background and front content should be visible” [ 23 ]; “leave space between links and buttons” [ 30 ]; and “allow a reasonable spacing between buttons” [ 31 ]. These illustrative examples highlight the need to aggregate, analyze, and validate existing recommendations on user interface design. Accordingly, this study aimed to synthesize existing guidelines into a practical set of recommendations that could be used to guide user interface design for different technologies. This is important because it contributes to the standardization of good practices and will conceivably allow for better interface design achieved at earlier stages of product development.

In a previous work, 244 interface recommendations were identified, and they formed the basis for this study [ 32 ]. The identification of the 244 recommendations combined multiple sources: (1) our previous work [ 33 ], (2) a purposive search of the Scopus database, and (3) inputs provided by experts in the field of interface design. The references identified through all 3 steps were extracted into an Excel (Microsoft) database with a total of 1210 recommendations. We screened this database for duplicated recommendations. During this analysis, very generic recommendations were also deleted, and recommendations addressing similar content were merged. The resulting database, with 194 recommendations, was analyzed by 10 experts in user interface design recruited among SHAPES (Smart and Health Ageing through People Engaging in Supportive Systems) [ 34 ] project partners, who added another 62 recommendations, resulting in 256 recommendations. A further analysis identified 12 duplicates, which were deleted, resulting in a final list of 244 recommendations. The large number of recommendations was deemed impractical, and further action was necessary. Building on this prior research, a set of recommendations on user interface design was generated following 4 steps: (1) interview with user interface design experts, (2) analysis of the experts’ feedback and drafting of a set of recommendations, (3) reanalysis of the shorter list of recommendations by a group of experts, and (4) refining and finalizing the list. Each of these steps is detailed below, and the whole process is illustrated in Figure 1 .

Figure 1. Steps of analysis of the user interface design recommendations.

Methods

Interview With User Interface Design Experts

An Excel file with the 244 user interface design recommendations was sent to external experts in the field of user interface design. For an individual to be considered an expert, they had to meet at least 1 of the following criteria: (1) have designed user interfaces for at least 2 projects/digital solutions or (2) have participated in the evaluation of user interfaces for at least 2 projects/digital products.

An invitation email was sent to experts explaining the objectives of the study, along with a supporting document with the 244 recommendations. They were asked to analyze the recommendations and report on (1) repetition/relevance, (2) wording, (3) organization, and (4) any other aspect they felt relevant. They were given approximately 4 weeks to analyze the 244 recommendations and send their written analysis and comments back to us. Subsequently, they were asked to attend a Zoom (Zoom Video Communications) meeting aimed at clarifying their views and discussing potential contradictory comments. The written comments (sent in advance by the experts) were used to prepare a PowerPoint (Microsoft) presentation where recommendations and respective comments (anonymized) from all experts were synthesized. This presentation was used to guide the Zoom meeting discussion. To maximize the efficiency of the discussion, recommendations without any comments and those that received similar comments from different experts were not included in the presentation. For recommendations with contradictory comments, the facilitator led a discussion and tried to reach a consensus. For recommendations with comments from a single expert, the facilitator asked for the opinion of other experts. The Zoom meeting was facilitated by one of the authors (AIM) and assisted by another author (CD), who took notes. The facilitator encouraged the discussion and exchange of opinions from all experts participating in each meeting. The Zoom meetings were recorded, and the experts’ arguments were transcribed and analyzed using content analysis by 2 authors (AIM and AGS) with experience in qualitative data analysis. Written comments sent by the experts as well as comments and relevant notes made during the meeting were transposed into a single file and subjected to content analysis. After transcription, the notes were independently read by both the aforementioned authors and grouped into themes, categories, and subcategories with similar meaning [ 35 ]. Themes, categories, and subcategories were then compared, and a consensus was reached by discussion.

Analyzing Experts’ Feedback and Drafting a Set of Recommendations

The authors of this manuscript (internal experts), including individuals with expertise in content analysis and in user interface design and usability, participated in a series of 6 meetings that were approximately 2 to 3 hours in duration each. These meetings, which took place between January and April 2021 and were held online, aimed to analyze the comments made by external experts in the previous step. Based on the experts’ comments, each recommendation was either (1) not changed (if no comments were made by the experts), (2) deleted, (3) merged with other complementary recommendations, (4) rewritten, or (5) broken up into more than 1 recommendation. The decisions were based on the number of experts making the same suggestion, alignment with existing guidelines, and coherence with previous decisions for similar recommendations. In addition, based on external experts’ suggestions, the recommendations were organized as follows: (1) hierarchical levels according to level of detail and interdependency, (2) usability principles, and (3) type of technology and interaction paradigm.

Reanalyzing the Shorter List of Recommendations

To further validate decisions made by the internal panel and explore the possibility of reducing the number of recommendations, the set of recommendations resulting from the previous step (and its respective organization according to hierarchical levels and principles) was reanalyzed by an additional external panel of experts. Once again, to be considered an expert, individuals had to meet the previously identified criteria for experts (have designed user interfaces for at least 2 projects/digital products or have participated in the evaluation of user interfaces for at least 2 projects/digital products). An online individual interview was conducted in May 2021 with each expert by one of the authors (CD). Experts had to answer 3 questions about each of the recommendations: (1) Do you consider this recommendation useful? (Yes/No); (2) Do you consider this recommendation mandatory? (Yes/No); and (3) Do you have any observation/comment on any recommendations or on its organization? The first question was used to determine the inclusion or exclusion of recommendations, and the second one was used to inform on the priority of recommendations through the possibility of having 2 sets of recommendations: 1 mandatory and 1 optional. The third question aimed to elicit general comments on both individual recommendations and their organization. Consensus on the first 2 questions was defined as 70% or more of the experts signaling a recommendation with “Yes” and less than 15% of experts signaling the same recommendation with “No.” Qualitative data from the third question was independently analyzed by 2 authors (CD and AGS) using content analysis, as previously described.
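The consensus rule above (at least 70% of the experts answering “Yes” and fewer than 15% answering “No”) can be expressed as a small predicate. This is an illustrative sketch of ours, not code from the study; in particular, how abstentions or missing answers are handled is our assumption, since with strictly binary votes the two thresholds would collapse into one.

```python
def reaches_consensus(votes):
    """Consensus rule from the study: >=70% "Yes" and <15% "No".

    `votes` holds one answer per expert; treating anything other than
    "Yes"/"No" (e.g. an abstention) as counting toward neither
    threshold is an assumption on our part.
    """
    n = len(votes)
    if n == 0:
        return False
    yes_share = votes.count("Yes") / n
    no_share = votes.count("No") / n
    return yes_share >= 0.70 and no_share < 0.15

# 12 of 14 experts say "Yes", 1 says "No", 1 abstains -> consensus
print(reaches_consensus(["Yes"] * 12 + ["No"] + ["Abstain"]))  # True
# 10 "Yes" (71%) clears the first threshold, but 3 "No" (21%) fails the second
print(reaches_consensus(["Yes"] * 10 + ["No"] * 3 + ["Abstain"]))  # False
```

Both thresholds must hold at once, which is why a recommendation with broad support can still fail on a minority of explicit “No” votes.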

Refining and Finalizing the List of Recommendations

The internal panel of experts (the authors of this study) had an online meeting in which the findings of the previous step were presented and discussed and in which amendments to the existing list were agreed on to generate the final list of user interface design recommendations.

Ethical Considerations

This study focused on the analysis of previously published research and recommendations; therefore, ethical approval was considered unnecessary.

Results

A total of 9 experts participated in this step of the study: 5 females and 4 males with a mean age of 39.1 (SD 4.3) years. The participants were user interface designers (n=3, 33%) and user interface researchers (n=6, 67%) who had a background in design (n=6, 67%), communication and technology sciences (n=2, 22%), or computer engineering (n=1, 11%). A total of 3 meetings, each with 1 to 3 participants, were conducted, with a mean duration of 2 hours. Of the 244 recommendations, 166 (68%) were commented on by at least 1 expert.

Regarding the analysis of the interviews and written information sent by the experts, it was possible to aggregate the comments into 2 main themes: (1) wording and content of recommendations and (2) organization of recommendations. The first theme was divided into 5 categories: (1) not changed (if no comments were made by the experts); (2) deletion of recommendations (because they were not useful or were contradictory); (3) merging of recommendations (to address complementary aspects of user interface design); (4) rewriting of recommendations (for clarity and coherence); and (5) splitting recommendations into more than 1 (because they included different aspects of user interface design). Of the 244 recommendations, external experts suggested that 108 should be merged (usually pairs of recommendations, but sometimes more than 2), 29 should be rewritten, 4 should be split into more than 1, and 44 should be deleted. Among the recommendations, 78 received no comment. For 19 recommendations, it was not possible to reach consensus in the interview phase.

The second theme (organization of the recommendations) was divided into 2 categories: (1) hierarchization of recommendations and (2) grouping of recommendations. This last category was subdivided into 2 subcategories: (1) grouping of recommendations according to usability principles and (2) grouping of recommendations according to whether they apply to digital solutions in general or to specific digital solutions/interaction paradigms. Examples of quotations that support these categories and subcategories are presented in Table 1 . Regarding the grouping of recommendations according to usability principles, the categories proposed by 5 experts ( Table 1 ) were reorganized and merged into 12 categories: feedback, recognition, flexibility, customization, consistency, errors, help, accessibility, navigation, privacy, visual component, and emotional component. The mapping of the categories proposed by the experts and the 12 categories (named principles) are presented in Multimedia Appendix 1 .

Categories and subcategories of the theme “organization of recommendations,” quotations supporting the categories, and number of experts that made comments in each category/subcategory.

Categories | Subcategories | Citations (examples) | Experts, n (%)
Hierarchization | N/A (a) | [E (b) 6, male] | 4 (44)
Grouping of recommendations | Design principles | Of the 9 experts, 5 suggested categories for grouping recommendations. [E2, female] | 7 (78)
Grouping of recommendations | Generic vs specific to technology/interaction paradigms | Of the 9 experts, 3 suggested categories for grouping recommendations. | 3 (33)

a N/A: not applicable.

b E: expert.

Analysis of Experts’ Feedback and Reanalysis of the Recommendations

Based on the external experts’ comments, the recommendations were reanalyzed. Of the 244 recommendations, 61 (25%) were deleted because they were duplicated or redundant, 48 (19.7%) were merged with other complementary recommendations, 62 (25.4%) were rewritten for clarification and language standardization, 14 (5.7%) were split into 2 or more recommendations, and 59 (24.2%) were not changed. This resulted in a preliminary list of 175 recommendations. Table 2 compares the external experts’ recommendations and the internal experts’ final decision.

Comparison of external experts’ recommendations and internal experts’ decisions.

Type of action | External experts’ recommendations (N=263) (a), n (%) | Internal experts’ decision (N=244), n (%)
Deleted | 44 (16.7) | 61 (25.0)
Merged | 108 (41.1) | 48 (19.7)
Rewritten | 29 (11.0) | 62 (25.4)
Split | 4 (1.5) | 14 (5.7)
Not changed | 78 (29.7) | 59 (24.2)

a Consensus was not possible for 19 recommendations.
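The percentages in Table 2 follow directly from the raw counts over the 244 analyzed recommendations; a quick sketch (ours, not from the study) reproduces the internal experts’ column:

```python
from collections import Counter

# Internal experts' decisions over the 244 recommendations (Table 2)
internal = Counter({"Deleted": 61, "Merged": 48, "Rewritten": 62,
                    "Split": 14, "Not changed": 59})
total = sum(internal.values())
shares = {action: round(100 * n / total, 1) for action, n in internal.items()}
print(total)   # 244
print(shares)  # {'Deleted': 25.0, 'Merged': 19.7, 'Rewritten': 25.4, 'Split': 5.7, 'Not changed': 24.2}
```

The same computation over the external experts’ counts (44, 108, 29, 4, 78) uses N=263 as the denominator, since the 19 recommendations without consensus are excluded.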

The 175 recommendations were then categorized into 12 mutually exclusive principles (feedback, recognition, flexibility, customization, consistency, errors, help, accessibility, navigation, privacy, visual, and emotional) and within each principle, organized into 2 levels of hierarchy according to the specificity/level of detail.

Of the 175 recommendations, 70 were categorized as level 1 (generic recommendations that apply to all digital solutions), and 105 were categorized as level 2, each linked to 1 level 1 recommendation and subdivided by type of digital solution/interaction paradigm. The recommendations of both levels are linked, as level 2 recommendations detail how level 1 recommendations can be implemented. For example, the level 1 recommendation that “the system should be used efficiently and with a minimum of fatigue” is linked to a set of level 2 recommendations targeted at specific interaction paradigms, such as feet interaction and robotics: (1) “In feet interaction, the system should minimize repetitive actions and sustained effort, using reasonable operating forces and allowing the user to maintain a neutral body position,” and (2) “In robotics, the system should have an appropriate weight, allowing the person to move the robot easily (this can be achieved by using back drivable hardware).” Table 3 shows the distribution of the 175 recommendations.

Distribution of recommendations by level and category.

Category | Level 1 (N=70), n | Level 2 (N=105), n | Total (N=175), n
Feedback | 6 | 5 | 11
Recognition | 5 | 12 | 17
Flexibility | 6 | 10 | 16
Customization | 7 | 6 | 13
Consistency | 2 | 2 | 4
Errors | 5 | 7 | 12
Help | 3 | 2 | 5
Accessibility | 8 | 23 | 31
Navigation | 6 | 6 | 12
Privacy | 3 | 5 | 8
Visual component | 16 | 22 | 38
Emotional component | 3 | 5 | 8
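The two-level organization described above (generic level 1 recommendations with paradigm-specific level 2 recommendations nested under them) can be sketched as a simple data model. This is an illustrative sketch only; the class and field names are our own, not part of the study.

```python
from __future__ import annotations
from dataclasses import dataclass, field

@dataclass
class Recommendation:
    text: str
    principle: str                # one of the 12 principles, e.g. "accessibility"
    mandatory: bool = True        # non-mandatory recommendations use "may" in their wording
    paradigm: str | None = None   # set only for level 2, e.g. "robotics", "feet interaction"
    children: list[Recommendation] = field(default_factory=list)  # level 2 under a level 1

# The fatigue example from the text, encoded in this model
fatigue = Recommendation(
    text="The system should be used efficiently and with a minimum of fatigue",
    principle="accessibility",
    children=[
        Recommendation(
            text="Minimize repetitive actions and sustained effort, using "
                 "reasonable operating forces and a neutral body position",
            principle="accessibility",
            paradigm="feet interaction",
        ),
        Recommendation(
            text="Use an appropriate weight so the user can move the robot easily",
            principle="accessibility",
            paradigm="robotics",
        ),
    ],
)

# Level 2 recommendations detail how their level 1 parent is implemented
print(len(fatigue.children))                      # 2
print(sorted(c.paradigm for c in fatigue.children))
```

Representing the hierarchy explicitly like this makes the asymmetries discussed later easy to inspect, for example by counting children per parent or per paradigm.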

Reanalysis of the Shorter List of Recommendations by Experts

A total of 14 experts (8 females and 6 males) with a mean age of 35 (SD 8.8) years provided feedback on the recommendations. Experts were user interface designers (n=6, 43%) and user interface researchers (n=8, 57%) who had a background in design (n=8, 57%) or communication and technology sciences (n=6, 43%). The interviews lasted up to 2 hours each.

All 175 recommendations reached consensus for the usefulness question. However, for question 2 (Do you consider this recommendation mandatory?), there was consensus that 54 (77%) level 1 recommendations were mandatory. The remaining 16 (23%) level 1 recommendations were considered by 5 (36%) to 9 (64%) experts as not mandatory. For the 105 level 2 recommendations, there was consensus that 91 (87%) were mandatory, and the remaining 14 were not considered mandatory by 5 (36%) to 9 (64%) experts.

Experts’ comments were aggregated into 5 main themes: (1) deletion or recategorizing of recommendations, (2) consistency, (3) contradiction, (4) asymmetry, and (5) uncertainty. It was suggested that 1 recommendation be deleted (“The system should be free from errors”), and another moved from the visual component category to the emotional component category. No other suggestions were made regarding the structure of the recommendations. There were comments related to consistency, particularly regarding the need to use either British or American spelling throughout all recommendations and to consistently refer to “users” instead of “persons” or “individuals.” The remaining comments applied mostly to level 2 recommendations, for which experts identified contradictory recommendations (eg, accessibility: “In robotics, the system should meet the person’s needs, be slow, safe and reliable, small, easy to use, and have an appearance not too human-like, not patronizing or stigmatizing” vs emotional: “In robotics, the system should indicate aliveness by showing some autonomous behavior, facial expressions, hand/head gestures to motivate engagement, as well as changing vocal patterns and pace to show different emotions”). Experts also commented on the asymmetry across the number of level 2 recommendations linked to level 1 recommendations and on the asymmetry regarding the number of recommendations per type of technology and interaction paradigm. In addition, experts were uncertain about the accuracy of the measures indicated in the recommendations (eg, visual: “In robotics, the system graphical user interface and button elements should be sufficiently big in size, so they can be easily seen and used, about ~20 mm in the case of touch screen buttons” vs visual: “In feet interaction, the system should consider an appropriate interaction radius of 20 cm for tap, 30 cm at the front, and 25 cm at the back for kick”).

Based on the experts’ comments and issues raised in the previous step, the term “users” was adopted throughout the recommendations, 1 recommendation was removed, and 1 was moved from the visual component to the emotional component. In addition, all level 1 recommendations for which no consensus was reached on whether they were mandatory were considered not mandatory (identified by using the word “may” in the recommendation). The internal panel also recognized that level 2 recommendations cannot be used to guide user interface design in their current state and that further work is needed. Therefore, a final list of 69 generic recommendations is proposed ( Multimedia Appendix 2 ).

Discussion

Principal Findings

To the best of our knowledge, this is the first study that attempted to analyze and synthesize existing recommendations on user interface design. This was a complex task that generated a high number of interdependent recommendations that could be organized into hierarchical levels of interdependency and grouped according to usability principles. Level 1 recommendations are generic and can be used to inform the user interface design of different types of technology and interaction paradigms. Meanwhile, level 2 recommendations are more detailed and therefore apply to specific types of technology and interaction paradigms. Furthermore, the level of detail of the level 2 recommendations, and the absence of evidence that they had been validated, raised doubts about their validity.

The external experts’ suggestions formed the basis for the internal experts’ (our) analysis. However, there is a discrepancy between the analyses of the 2 panels of experts in terms of the number of recommendations that should be deleted, merged, rewritten, split, or not changed. This occurred because, when analyzing the recommendations, the internal panel found additional repeated or overly generic recommendations to delete beyond those already identified by the external panel. It is likely that these were missed due to the high number of recommendations, which made the analysis a time-consuming and complex task. Furthermore, changing 1 recommendation in line with external experts’ suggestions often required changing other recommendations for coherence and consistency, resulting in a higher number of rewritten recommendations. In addition, there was a lack of consensus among external experts, leaving the final decision to the internal experts (us), further contributing to discrepancies.

Regarding the organization of the recommendations, the division into 2 hierarchical levels based on the specificity/level of detail resulted from the external experts’ feedback and aimed at making the consultation of the list of recommendations easier. This type of hierarchization in levels of detail was also used in previous studies aimed at synthesizing existing guidelines [ 23 , 36 ].

The recommendations were grouped into 12 categories, which closely relate to existing usability principles (feedback, recognition, flexibility, customization, consistency, errors, help, accessibility, navigation, and privacy [ 18 , 37 - 39 ]). Usability principles are defined as broad “rules of thumb” or design guidelines that describe features of systems to guide the design of digital solutions [ 18 , 40 ]. Additionally, they are oriented to improve user interaction [ 3 ] and impact the quality of the digital solution interface [ 41 ]. Therefore, having the recommendations organized in a way that maps these principles facilitates practical use of the recommendations proposed herein, as these usability principles are familiar to designers and are well established, well known, and accepted in the literature [ 23 , 42 ].

The results showed an asymmetry in the number of recommendations categorized into each of the 12 usability principles (eg, for level 1, consistency has 2 recommendations while the visual component has 16 recommendations). This discrepancy suggests that some areas of user interface design, such as the visual component, might be better detailed, more complex, or more valued in the literature, but it can also suggest that the initial search was not comprehensive enough, as it included a limited number of databases [ 32 ]. Nevertheless, the heterogeneity between categories does not diminish their relevance, as it is the set of recommendations as a whole that informs the user interface design of a digital solution.

The number of level 2 recommendations aggregated under each level 1 recommendation is also uneven. Most of the level 2 recommendations that resulted from this study concern web and mobile technologies because their utilization is widespread among the population [ 43 ] and they are therefore more likely to have design recommendations reported in the literature [ 23 , 31 , 44 , 45 ]. On the other hand, emerging technologies like robotics and interaction paradigms (eg, gestures, voice, and feet) represent new challenges for researchers, and recommendations are still being formulated, resulting in a lower number of published specific recommendations [ 46 - 49 ]. Moreover, the level 2 recommendations raised doubts among experts, namely regarding (1) the lack of consensus on whether they were mandatory or not, (2) apparent contradictions between recommendations, and (3) uncertainty regarding the accuracy of some recommendations, particularly very specific ones (eg, the recommendations on the size of the buttons in millimeters). These aspects suggest that level 2 recommendations need further validation in future studies. No data were found on how the authors of the recommendations arrived at this level of detail and how the exact recommendation might vary depending on the target users [ 50 , 51 ], the type of technology [ 49 ], the interaction paradigm [ 46 ], and the context of use [ 52 ]. Validation of the level 2 recommendations might be performed by gathering experts’ consensus on the adequacy of recommendations by type of technology/interaction paradigm and involving real users to test whether specific user interfaces that fulfill the recommendations improve usability and user experience [ 50 , 53 ].

We believe that level 1 recommendations apply to different users, contexts, and technologies/interaction paradigms and that the necessary level of specificity will be given by level 2 recommendations, which can be further operationalized into more detailed recommendations (eg, creating level 3 recommendations under level 2 recommendations). For example, recommendation 1 from the recognition category states that “the system should consider the context of use, using phrases, words, and concepts that are familiar to the users and grounded in real conventions, delivering an experience that matches the system and the real world,” which is an example of applicability to different contexts such as health or education. Similarly, recommendation 1 from the flexibility category states that “the system should support both inexperienced and experienced users, be easy to learn, and to remember, even after an inactive period,” also showing adaptability to different types of users. Nevertheless, the level of importance of each level 1 recommendation might vary. For example, recommendation 6 of the flexibility category, which states that “the system may make users feel confident to operate and take appropriate action if something unexpected happens,” was not considered mandatory by the panel of external experts. However, one might argue that it should be considered mandatory in the field of health, where the feeling of being in control and acting immediately if something unexpected happens is of utmost importance. Therefore, both level 1 and level 2 recommendations require further validation across different types of technology and interaction paradigms, as well as for different target users and contexts of use. Also required are investigations to determine whether their use results in better digital solutions and, particularly in the health care field, whether it increases adherence to and the effectiveness of interventions.

In summary, although this study constitutes an attempt toward a more standardized approach in the field of user interface design, the set of recommendations presented herein should not be seen as final but rather as guides that should be critically appraised by designers according to the context, type of technology, type of interaction, and the end users for whom the digital solution is intended.

Strengths and Limitations

The strength of this proposed set of recommendations is that it was developed from multiple sources and multiple rounds of expert feedback. However, although several experts were involved in different steps of the study, it cannot be guaranteed that their views are representative of the broader community of user interface design experts. Another limitation is that the initial search for recommendations might not have been comprehensive enough; nevertheless, the external experts were given the opportunity to add recommendations to the list, and none suggested additions. The list of level 2 recommendations is a work in progress that should be further discussed and revised according to the technology and interaction paradigm. Finally, some types of technologies and interaction paradigms are not represented in the recommendations (eg, virtual reality), and it would be important to develop specific recommendations for these in the future.

Acknowledgments

This work was supported by the SHAPES (Smart and Health Ageing through People Engaging in Supportive Systems) project funded by the Horizon 2020 Framework Programme of the European Union for Research and Innovation (grant agreement 857159 - SHAPES – H2020 – SC1-FA-DTS – 2018-2020).

Abbreviations

SHAPES: Smart and Health Ageing through People Engaging in Supportive Systems

Multimedia Appendix 1

Multimedia Appendix 2

Conflicts of Interest: None declared.

MauroNewMedia

Pulse UX Blog

Theory, Analysis and Reviews on UX User Experience Research and Design


User Interface Design and UX Design: 80+ Important Research Papers Covering Peer-Reviewed and Informal Studies

Charles Mauro CHFP

Important peer-reviewed and informally published recent research on user interface design and user experience (UX) design.

For the benefit of clients and colleagues, we have culled a list of approximately 70 curated recent research publications dealing with user interface design, UX design, and e-commerce optimization.

In our opinion, these publications represent some of the best formal research thinking on UI and UX design. These papers are also among the most widely downloaded and cited formal research on UI/UX design. We have referenced many of these studies in our work at MauroNewMedia.


Paywalls: As you will note in reviewing the following links and abstracts, most of the serious research on UI/UX design and optimization is located behind paywalls controlled by major publishers. However, in the end, good data is well worth the investment. Many links and other cited references are, of course, free.

Important disclaimer: We do not receive any form of compensation for citing any of the following content. Either Charles L. Mauro CHFP or Paul Thurman MBA has personally reviewed all papers and links in this list. Some of these references were utilized in the recent NYTECH UX talk given by Paul Thurman MBA titled Critical New UX Design Optimization Research.

In addition to historical research papers, we frequently receive requests from colleagues, clients, and journalists for recommended reading lists on topics covering our expertise in UX design, usability research, and human factors engineering. These requests prompted us to pull from our research library (yes, we still have real books) 30+ books which our professional staff felt should be considered primary conceptual literature for anyone well-read in the theory and practice of UX design and research. Please follow the link for PulseUX's compilation of the 30+ Best UX Design and Research Books of All Time.

Title: The influence of hedonic and utilitarian motivations on user engagement: The case of online shopping experiences

Abstract User experience seeks to promote rich, engaging interactions between users and systems. In order for this experience to unfold, the user must be motivated to initiate an interaction with the technology. This study explored hedonic and utilitarian motivations in the context of user engagement with online shopping. Factor analysis was performed to identify a parsimonious set of factors from the Hedonic and Utilitarian Shopping Motivation Scale and the User Engagement Scale based on responses from 802 shoppers. Multiple linear regression was used to test hypotheses with hedonic and utilitarian motivations (Idea, Social, Adventure/Gratification, Value and Achievement Shopping) and attributes of user engagement (Aesthetics, Focused Attention, Perceived Usability, and Endurability). Results demonstrate the salience of Adventure/Gratification Shopping and Achievement Shopping Motivations to specific variables of user engagement in the e-commerce environment and provide considerations for the inclusion of different types of motivation into models of engaging user experiences. Abstract Copyright © 2010 Elsevier B.V. All rights reserved.
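The regression step described in this abstract (motivation factor scores predicting an engagement attribute) can be sketched in a few lines of ordinary least squares. The factor names, coefficients, and data below are synthetic illustrations, not the study's data:

```python
# Illustrative sketch only: regress a user-engagement attribute on two
# motivation factor scores with ordinary least squares (numpy).
import numpy as np

rng = np.random.default_rng(0)
n = 200
adventure = rng.normal(size=n)     # hypothetical factor scores
achievement = rng.normal(size=n)
# Simulate an engagement outcome with known coefficients plus noise.
engagement = 0.6 * adventure + 0.3 * achievement + rng.normal(scale=0.5, size=n)

# Design matrix: intercept column plus the two predictors.
X = np.column_stack([np.ones(n), adventure, achievement])
beta, *_ = np.linalg.lstsq(X, engagement, rcond=None)
print(beta)  # estimates should land near [0.0, 0.6, 0.3]
```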

Title: New Support for Marketing Analytics

Abstract Consumer surveys and myriad other forms of research have long been the grist for marketing decisions at large companies. But many firms have been reluctant to embrace the high-tech approach to data gathering and number crunching that falls under the rubric of marketing analytics, which uses advanced techniques to transform the tracking of promotional efforts, customer preferences, and industry developments into sophisticated branding and advertising campaigns. Fueled in part by Tom Peters and Robert Waterman’s seminal 1982 book In Search of Excellence , which coined the phrase “paralysis through analysis,” skepticism about the approach remains widespread, even in the face of a number of positive research results over the years. This new study, involving Fortune 1000 companies, offers yet more ammunition for supporters of marketing analytics. Abstract Copyright © 2013 Booz & Company Inc. All rights reserved.

Title: Video game values: Human-computer interaction and games

Abstract Current human–computer interaction (HCI) research into video games rarely considers how they are different from other forms of software. This leads to research that, while useful concerning standard issues of interface design, does not address the nature of video games as games specifically. Unlike most software, video games are not made to support external, user-defined tasks, but instead define their own activities for players to engage in. We argue that video games contain systems of values which players perceive and adopt, and which shape the play of the game. A focus on video game values promotes a holistic view of video games as software, media, and as games specifically, which leads to a genuine video game HCI. Abstract Copyright © 2006 Elsevier B.V. All rights reserved.

Title: When fingers do the talking: a study of text messaging

Abstract SMS or text messaging is an area of growth in the communications field. The studies described below consisted of a questionnaire and a diary study. The questionnaire was designed to examine texting activities in 565 users of the mobile phone. The diary study was carried out by 24 subjects over a period of 2 weeks. The findings suggest that text messaging is being used by a wide range of people for all kinds of activities and that for some people it is the preferred means of communication. These studies should prove interesting for those examining the use and impact of SMS. Abstract Copyright © 2004 Elsevier B.V. All rights reserved.

Title: Understanding factors affecting trust in and satisfaction with mobile banking in Korea: A modified DeLone and McLean’s model perspective

Abstract As mobile technology has developed, mobile banking has become accepted as part of daily life. Although many studies have been conducted to assess users’ satisfaction with mobile applications, none has focused on the ways in which the three quality factors associated with mobile banking – system quality, information quality and interface design quality – affect consumers’ trust and satisfaction. Our proposed research model, based on DeLone and McLean’s model, assesses how these three external quality factors can impact satisfaction and trust. We collected 276 valid questionnaires from mobile banking customers, then analyzed them using structural equation modeling. Our results show that system quality and information quality significantly influence customers’ trust and satisfaction, and that interface design quality does not. We present herein implications and suggestions for further research. Abstract Copyright © 2009 Elsevier B.V. All rights reserved.


Title: What is beautiful is usable

Abstract An experiment was conducted to test the relationships between users’ perceptions of a computerized system’s beauty and usability. The experiment used a computerized application as a surrogate for an Automated Teller Machine (ATM). Perceptions were elicited before and after the participants used the system. Pre-experimental measures indicate strong correlations between system’s perceived aesthetics and perceived usability. Post-experimental measures indicated that the strong correlation remained intact. A multivariate analysis of covariance revealed that the degree of system’s aesthetics affected the post-use perceptions of both aesthetics and usability, whereas the degree of actual usability had no such effect. The results resemble those found by social psychologists regarding the effect of physical attractiveness on the valuation of other personality attributes. The findings stress the importance of studying the aesthetic aspect of human–computer interaction (HCI) design and its relationships to other design dimensions. Abstract Copyright © 2000 Elsevier Science B.V. All rights reserved.

Title: UX Curve: A method for evaluating long-term user experience

Abstract The goal of user experience design in industry is to improve customer satisfaction and loyalty through the utility, ease of use, and pleasure provided in the interaction with a product. So far, user experience studies have mostly focused on short-term evaluations and consequently on aspects relating to the initial adoption of new product designs. Nevertheless, the relationship between the user and the product evolves over long periods of time and the relevance of prolonged use for market success has been recently highlighted. In this paper, we argue for the cost-effective elicitation of longitudinal user experience data. We propose a method called the “UX Curve” which aims at assisting users in retrospectively reporting how and why their experience with a product has changed over time. The usefulness of the UX Curve method was assessed in a qualitative study with 20 mobile phone users. In particular, we investigated how users’ specific memories of their experiences with their mobile phones guide their behavior and their willingness to recommend the product to others. The results suggest that the UX Curve method enables users and researchers to determine the quality of long-term user experience and the influences that improve user experience over time or cause it to deteriorate. The method provided rich qualitative data and we found that an improving trend of perceived attractiveness of mobile phones was related to user satisfaction and willingness to recommend their phone to friends. This highlights that sustaining perceived attractiveness can be a differentiating factor in the user acceptance of personal interactive products such as mobile phones. The study suggests that the proposed method can be used as a straightforward tool for understanding the reasons why user experience improves or worsens in long-term product use and how these reasons relate to customer loyalty. Abstract Copyright 2011 British Informatics Society Limited. 
Published by Elsevier B.V. All rights reserved.

Title: Heuristic evaluation: Comparing ways of finding and reporting usability problems

Abstract Research on heuristic evaluation in recent years has focused on improving its effectiveness and efficiency with respect to user testing. The aim of this paper is to refine a research agenda for comparing and contrasting evaluation methods. To reach this goal, a framework is presented to evaluate the effectiveness of different types of support for structured usability problem reporting. This paper reports on an empirical study of this framework that compares two sets of heuristics, Nielsen’s heuristics and the cognitive principles of Gerhardt-Powals, and two media of reporting a usability problem, i.e. either using a web tool or paper. The study found that there were no significant differences between any of the four groups in effectiveness, efficiency and inter-evaluator reliability. A more significant contribution of this research is that the framework used for the experiments proved successful and should be reusable by other researchers because of its thorough structure. Abstract Copyright © 2006 Elsevier B.V. All rights reserved.

Title: Socio-technical systems: From design methods to systems engineering

Abstract It is widely acknowledged that adopting a socio-technical approach to system development leads to systems that are more acceptable to end users and deliver better value to stakeholders. Despite this, such approaches are not widely practised. We analyse the reasons for this, highlighting some of the problems with the better known socio-technical design methods. Based on this analysis we propose a new pragmatic framework for socio-technical systems engineering (STSE) which builds on the (largely independent) research of groups investigating work design, information systems, computer-supported cooperative work, and cognitive systems engineering. STSE bridges the traditional gap between organisational change and system development using two main types of activity: sensitisation and awareness; and constructive engagement. From the framework, we identify an initial set of interdisciplinary research problems that address how to apply socio-technical approaches in a cost-effective way, and how to facilitate the integration of STSE with existing systems and software engineering approaches. Abstract Copyright © 2010 Elsevier B.V. All rights reserved.

Title: Five reasons for scenario-based design

Abstract Scenarios of human–computer interaction help us to understand and to create computer systems and applications as artifacts of human activity—as things to learn from, as tools to use in one’s work, as media for interacting with other people. Scenario-based design of information technology addresses five technical challenges: scenarios evoke reflection in the content of design work, helping developers coordinate design action and reflection. Scenarios are at once concrete and flexible, helping developers manage the fluidity of design situations. Scenarios afford multiple views of an interaction, diverse kinds and amounts of detailing, helping developers manage the many consequences entailed by any given design move. Scenarios can also be abstracted and categorized, helping designers to recognize, capture and reuse generalizations and to address the challenge that technical knowledge often lags the needs of technical design. Finally, scenarios promote work-oriented communication among stakeholders, helping to make design activities more accessible to the great variety of expertise that can contribute to design, and addressing the challenge that external constraints designers and clients face often distract attention from the needs and concerns of the people who will use the technology. Abstract Copyright © 2000 Elsevier Science B.V. All rights reserved.

Title: Needs, affect, and interactive products – Facets of user experience

Abstract Subsumed under the umbrella of User Experience (UX), practitioners and academics of Human–Computer Interaction look for ways to broaden their understanding of what constitutes “pleasurable experiences” with technology. The present study considered the fulfilment of universal psychological needs, such as competence, relatedness, popularity, stimulation, meaning, security, or autonomy, to be the major source of positive experience with interactive technologies. To explore this, we collected over 500 positive experiences with interactive products (e.g., mobile phones, computers). As expected, we found a clear relationship between need fulfilment and positive affect, with stimulation, relatedness, competence and popularity being especially salient needs. Experiences could be further categorized by the primary need they fulfil, with apparent qualitative differences among some of the categories in terms of the emotions involved. Need fulfilment was clearly linked to hedonic quality perceptions, but not as strongly to pragmatic quality (i.e., perceived usability), which supports the notion of hedonic quality as “motivator” and pragmatic quality as “hygiene factor.” Whether hedonic quality ratings reflected need fulfilment depended on the belief that the product was responsible for the experience (i.e., attribution). Abstract Copyright © 2010 Elsevier B.V. All rights reserved.

Title: The role of social presence in establishing loyalty in e-Service environments

Abstract Compared to offline shopping, the online shopping experience may be viewed as lacking human warmth and sociability as it is more impersonal, anonymous, automated and generally devoid of face-to-face interactions. Thus, understanding how to create customer loyalty in online environments (e-Loyalty) is a complex process. In this paper a model for e-Loyalty is proposed and used to examine how varied conditions of social presence in a B2C e-Services context influence e-Loyalty and its antecedents of perceived usefulness, trust and enjoyment. This model is examined through an empirical study involving 185 subjects using structural equation modeling techniques. Further analysis is conducted to reveal gender differences concerning hedonic elements in the model on e-Loyalty. Abstract Copyright © 2006 Elsevier B.V. All rights reserved.

Title: A framework for evaluating the usability of mobile phones based on multi-level, hierarchical model of usability factors

Abstract As a mobile phone has various advanced functionalities or features, usability issues are increasingly challenging. Due to the particular characteristics of a mobile phone, typical usability evaluation methods and heuristics, most of which are relevant to a software system, might not effectively be applied to a mobile phone. Another point to consider is that usability evaluation activities should help designers find usability problems easily and produce better design solutions. To support usability practitioners of the mobile phone industry, we propose a framework for evaluating the usability of a mobile phone, based on a multi-level, hierarchical model of usability factors, in an analytic way. The model was developed on the basis of a set of collected usability problems and our previous study on a conceptual framework for identifying usability impact factors. It has multi-abstraction levels, each of which considers the usability of a mobile phone from a particular perspective. As there are goal-means relationships between adjacent levels, a range of usability issues can be interpreted in a holistic as well as diagnostic way. Another advantage is that it supports two different types of evaluation approaches: task-based and interface-based. To support both evaluation approaches, we developed four sets of checklists, each of which is concerned, respectively, with task-based evaluation and three different interface types: Logical User Interface (LUI), Physical User Interface (PUI) and Graphical User Interface (GUI). The proposed framework specifies an approach to quantifying usability so that several usability aspects are collectively measured to give a single score with the use of the checklists. A small case study was conducted in order to examine the applicability of the framework and to identify the aspects of the framework to be improved. It showed that it could be a useful tool for evaluating the usability of a mobile phone. 
Based on the case study, we improved the framework in order that usability practitioners can use it more easily and consistently. Abstract Copyright © 2011 British Informatics Society Limited. Published by Elsevier B.V. All rights reserved.

Title: Understanding the most satisfying and unsatisfying user experiences: Emotions, psychological needs, and context

Abstract The aim of this research was to study the structure of the most satisfying and unsatisfying user experiences in terms of experienced emotions, psychological needs, and contextual factors. 45 university students wrote descriptions of their most satisfying and unsatisfying recent user experiences and analyzed those experiences using the Positive and Negative Affect Schedule (PANAS) method for experienced emotions, a questionnaire probing the salience of 10 psychological needs, and a self-made set of rating scales for analyzing context. The results suggested that it was possible to capture variations in user experiences in terms of experienced emotions, fulfillment of psychological needs, and context effectively by using psychometric rating scales. The results for emotional experiences showed significant differences in 16 out of 20 PANAS emotions between the most satisfying and unsatisfying experiences. The results for psychological needs indicated that feelings of autonomy and competence emerged as highly salient in the most satisfying experiences and missing in the unsatisfying experiences. High self-esteem was also notably salient in the most satisfying experiences. The qualitative results indicated that most of the participants’ free-form qualitative descriptions, especially for the most unsatisfying user experiences, gave important information about the pragmatic aspects of the interaction, but often omitted information about hedonic and social aspects of user experience. Abstract Copyright © 2011 British Informatics Society Limited. Published by Elsevier B.V. All rights reserved.

Title: The Usability Metric for User Experience

Abstract The Usability Metric for User Experience (UMUX) is a four-item Likert scale used for the subjective assessment of an application’s perceived usability. It is designed to provide results similar to those obtained with the 10-item System Usability Scale, and is organized around the ISO 9241-11 definition of usability. A pilot version was assembled from candidate items, which was then tested alongside the System Usability Scale during usability testing. It was shown that the two scales correlate well, are reliable, and both align on one underlying usability factor. In addition, the Usability Metric for User Experience is compact enough to serve as a usability module in a broader user experience metric. Abstract Copyright © 2010 Elsevier B.V. All rights reserved.
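For readers who want to compute it, the commonly reported UMUX scoring (four 7-point items, odd items positively worded and even items negatively worded, rescaled to 0-100) can be sketched as follows; verify the item polarity against the original paper before relying on it:

```python
def umux_score(responses):
    """Score a four-item UMUX questionnaire answered on a 1-7 scale.

    Odd-numbered items are positively worded (contribution = response - 1);
    even-numbered items are negatively worded (contribution = 7 - response).
    The summed contributions (0-24) are rescaled to a 0-100 score.
    """
    if len(responses) != 4 or not all(1 <= r <= 7 for r in responses):
        raise ValueError("expected four responses in the range 1-7")
    positive = sum(r - 1 for r in responses[0::2])  # items 1 and 3
    negative = sum(7 - r for r in responses[1::2])  # items 2 and 4
    return (positive + negative) * 100 / 24

print(umux_score([7, 1, 7, 1]))  # best possible answers -> 100.0
```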


Title: User acceptance of mobile Internet: Implication for convergence technologies

Abstract Using the Technology Acceptance Model as a conceptual framework and a method of structural equation modeling, this study analyzes the consumer attitude toward Wi-Bro drawing data from 515 consumers. Individuals’ responses to questions about whether they use/accept Wi-Bro were collected and combined with various factors modified from the Technology Acceptance Model.

The results of this study show that users’ perceptions are significantly associated with their motivation to use Wi-Bro. Specifically, perceived quality and perceived availability are found to have a significant effect on users’ extrinsic and intrinsic motivation. These new factors are found to be Wi-Bro-specific factors, acting as enhancing factors to attitudes and intention. Abstract Copyright © 2007 Elsevier B.V. All rights reserved.

Title: Understanding purchasing behaviors in a virtual economy: Consumer behavior involving virtual currency in Web 2.0 communities

Abstract This study analyzes consumer purchasing behavior in Web 2.0, expanding the technology acceptance model (TAM), focusing on which variables influence the intention to transact with virtual currency. Individuals’ responses to questions about attitude and intention to transact in Web 2.0 were collected and analyzed with various factors modified from the TAM. The results of the proposed model show that subjective norm is a key behavioral antecedent to using virtual currency. In the extended model, the moderating effects of subjective norm on the relations among the variables were found to be significant. The new set of variables is virtual environment-specific, acting as factors enhancing attitudes and behavioral intentions in Web 2.0 transactions. Abstract Copyright © 2008 Elsevier B.V. All rights reserved.

Title: Fundamentals of physiological computing

Abstract This review paper is concerned with the development of physiological computing systems that employ real-time measures of psychophysiology to communicate the psychological state of the user to an adaptive system. It is argued that physiological computing has enormous potential to innovate human–computer interaction by extending the communication bandwidth to enable the development of ‘smart’ technology. This paper focuses on six fundamental issues for physiological computing systems through a review and synthesis of existing literature, these are (1) the complexity of the psychophysiological inference, (2) validating the psychophysiological inference, (3) representing the psychological state of the user, (4) designing explicit and implicit system interventions, (5) defining the biocybernetic loop that controls system adaptation, and (6) ethical implications. The paper concludes that physiological computing provides opportunities to innovate HCI but complex methodological/conceptual issues must be fully tackled during the research and development phase if this nascent technology is to achieve its potential. Abstract Copyright © 2008 Elsevier B.V. All rights reserved.

Title: Modelling user experience with web sites: Usability, hedonic value, beauty and goodness

Abstract Recent research into user experience has identified the need for a theoretical model to build cumulative knowledge in research addressing how the overall quality or ‘goodness’ of an interactive product is formed. An experiment tested and extended Hassenzahl’s model of aesthetic experience. The study used a 2 × 2 × (2) experimental design with three factors: principles of screen design, principles for organizing information on a web page and experience of using a web site. Dependent variables included hedonic perceptions and evaluations of a web site as well as measures of task performance, navigation behaviour and mental effort. Measures, except Beauty, were sensitive to manipulation of web design. Beauty was influenced by hedonic attributes (identification and stimulation), but Goodness by both hedonic and pragmatic (user-perceived usability) attributes as well as task performance and mental effort. Hedonic quality was more stable with experience of web-site use than pragmatic quality and Beauty was more stable than Goodness. Abstract Copyright © 2008 Elsevier B.V. All rights reserved.

Title: Sample Size In Usability Studies

Abstract Usability studies are a cornerstone activity for developing usable products. Their effectiveness depends on sample size, and determining sample size has been a research issue in usability engineering for the past 30 years. In 2010, Hwang and Salvendy reported a meta study on the effectiveness of usability evaluation, concluding that a sample size of 10±2 is sufficient for discovering 80% of usability problems (not five, as suggested earlier by Nielsen in 2000). Here, I show the Hwang and Salvendy study ignored fundamental mathematical properties of the problem, severely limiting the validity of the 10±2 rule, then look to reframe the issue of effectiveness and sample-size estimation to the practices and requirements commonly encountered in industrial-scale usability studies. Abstract Copyright © 2013 ACM, Inc.

Title: An experimental study of learner perceptions of the interactivity of web-based instruction

Abstract An effectively designed interaction mechanism creates a shortcut for human–computer interaction. Most studies in this area have concluded that the higher the level of interactivity, the better, especially regarding interactive websites applied in the fields of business and education. Previous studies have also suggested that designs with a higher level of interactivity result in higher learner evaluations of websites. However, little research has examined learner perceptions as they interact with web-based instruction (WBI) systems in a situation with limited time. To assist learners in acquiring knowledge quickly, the interactivity design must make the web learning environment easier to use by reducing the complexity of the interface. The aim of the present study is to explore learner perceptions of three WBI systems with different interaction levels under time limitations. This study was therefore designed to provide a new framework to design systems with different degrees of interactivity, and to examine learners’ perceptions of these interaction elements. Three WBI systems were developed with different degrees of interactivity from high to low, and a between-subject experiment was conducted with 45 subjects. The results of the experiment indicate that a higher level of interactivity does not necessarily guarantee a higher perception of interactivity in a short-term learning situation. Therefore, the instructors must pay attention to modifying or selecting appropriate interactive elements that are more suitable for various learning stages. The findings provide insights for designers to adopt different degrees of interactivity in their designs that will best fulfill various learners’ needs. Abstract Copyright © 2011 British Informatics Society Limited. Published by Elsevier B.V. All rights reserved.
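The debate in the "Sample Size In Usability Studies" entry above turns on the standard problem-discovery model, in which the expected proportion of problems found by n participants is 1 - (1 - p)^n for a per-participant detection probability p. A minimal sketch (the p values below are illustrative, not from the paper):

```python
import math

def discovery_rate(p: float, n: int) -> float:
    """Expected share of usability problems found by n participants,
    assuming each problem is detected by each participant with probability p."""
    return 1 - (1 - p) ** n

def participants_needed(p: float, target: float) -> int:
    """Smallest n whose expected discovery rate reaches the target share."""
    return math.ceil(math.log(1 - target) / math.log(1 - p))

# With the oft-cited p = 0.31, five users already find ~84% of problems;
# subtler problems (lower p) push n toward the 10 +/- 2 range.
print(round(discovery_rate(0.31, 5), 2))   # 0.84
print(participants_needed(0.15, 0.80))     # 10
```

The sensitivity of n to p is exactly the kind of mathematical property the entry argues a fixed 10±2 rule glosses over.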


Title: Age differences in the perception of social presence in the use of 3D virtual world for social interaction

Abstract 3D virtual worlds are becoming increasingly popular as tool for social interaction, with the potential of augmenting the user’s perception of physical and social presence. Thus, this technology could be of great benefit to older people, providing home-bound older users with access to social, educational and recreational resources. However, so far there have been few studies looking into how older people engage with virtual worlds, as most research in this area focuses on younger users. In this study, an online experiment was conducted with 30 older and 30 younger users to investigate age differences in the perception of presence in the use of virtual worlds for social interaction. Overall, we found that factors such as navigation and prior experience with text messaging tools played a key role in older people’s perception of presence. Both physical and social presence was found to be linked to the quality of social interaction for users of both age groups. In addition, older people displayed proxemic behavior which was more similar to proxemic behavior in the physical world when compared to younger users. Abstract Copyright © 2012 British Informatics Society Limited. Published by Elsevier B.V. All rights reserved.

Title: Human error and information systems failure: the case of the London ambulance service computer-aided despatch system project

Abstract Human error and systems failure have been two constructs that have become linked in many contexts. In this paper we particularly focus on the issue of failure in relation to that group of software systems known as information systems. We first review the extant theoretical and empirical work on this topic. Then we discuss one particular well-known case — that of the London ambulance service computer-aided despatch system (Lascad) project — and use it as a particularly cogent example of the features of information systems failure. We maintain that the tendency to analyse information systems failure solely from a technological standpoint is limiting, that the nature of information systems failure is multi-faceted, and hence cannot be adequately understood purely in terms of the immediate problems of systems construction. Our purpose is also to use the generic material on IS failure and the specific details of this particular case study to critique the issues of safety, criticality, human error and risk in relation to systems not currently well considered in relation to these areas. Abstract Copyright © 1999 Elsevier B.V. All rights reserved.

research article on user interface

Title: Feminist HCI meets facebook: Performativity and social networking sites

Abstract In this paper, I reflect on a specific product of interaction design, social networking sites. The goals of this paper are twofold. The first is to bring a feminist reflexivity to HCI, drawing on the work of Judith Butler and her concepts of performativity, citationality, and interpellation. Her approach is, I argue, highly relevant to issues of identity and self-representation on social networking sites, and to the co-constitution of the subject and technology. A critical, feminist HCI must ask how social media and other HCI institutions, practices, and discourses are part of the processes by which sociotechnical configurations are constructed. My second goal is to examine the implications of such an approach by applying it to social networking sites (SNSs), drawing on the empirical research literature on SNSs, to show how SNS structures and policies help shape the subject and hide the contingency of subject categories. Abstract Copyright © 2011 British Informatics Society Limited. Published by Elsevier B.V. All rights reserved.

Title: A survey of methods for data fusion and system adaptation using autonomic nervous system responses in physiological computing

Abstract Physiological computing represents a mode of human–computer interaction where the computer monitors, analyzes and responds to the user’s psychophysiological activity in real-time. Within the field, autonomic nervous system responses have been studied extensively since they can be measured quickly and unobtrusively. However, despite a vast body of literature available on the subject, there is still no universally accepted set of rules that would translate physiological data to psychological states. This paper surveys the work performed on data fusion and system adaptation using autonomic nervous system responses in psychophysiology and physiological computing during the last ten years. First, five prerequisites for data fusion are examined: psychological model selection, training set preparation, feature extraction, normalization and dimension reduction. Then, different methods for either classification or estimation of psychological states from the extracted features are presented and compared. Finally, implementations of system adaptation are reviewed: changing the system that the user is interacting with in response to cognitive or affective information inferred from autonomic nervous system responses. The paper is aimed primarily at psychologists and computer scientists who have already recorded autonomic nervous system responses and now need to create algorithms to determine the subject’s psychological state. Abstract Copyright © 2012 British Informatics Society Limited. Published by Elsevier B.V. All rights reserved.
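The processing chain this survey describes (feature extraction, normalization, then classification of psychological states) can be sketched in a few lines. The sketch below is illustrative only and is not taken from any of the surveyed systems: the feature names, data values, min-max scaling, and the choice of a k-nearest-neighbor vote are all invented assumptions.

```python
import math

def normalize(rows):
    """Min-max scale each feature column to [0, 1]."""
    lo = [min(col) for col in zip(*rows)]
    hi = [max(col) for col in zip(*rows)]
    return [[(v - l) / (h - l) if h > l else 0.0
             for v, l, h in zip(row, lo, hi)] for row in rows]

def knn_predict(train_x, train_y, query, k=3):
    """Classify `query` by majority vote among the k nearest training samples."""
    nearest = sorted((math.dist(x, query), y)
                     for x, y in zip(train_x, train_y))[:k]
    votes = [y for _, y in nearest]
    return max(set(votes), key=votes.count)

# Invented features per time window: (mean heart rate, skin-conductance level)
features = [[62, 0.1], [65, 0.2], [90, 0.8], [95, 0.9], [70, 0.3], [88, 0.7]]
labels = ["calm", "calm", "stressed", "stressed", "calm", "stressed"]

scaled = normalize(features + [[92, 0.85]])  # scale training data and query together
print(knn_predict(scaled[:-1], labels, scaled[-1]))  # -> stressed
```

Normalization keeps the heart-rate and skin-conductance features on a comparable scale before Euclidean distances are computed; any of the classifiers discussed in the survey could replace the k-NN vote.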

Title: Positive mood induction procedures for virtual environments designed for elderly people

Abstract Positive emotions have a significant influence on mental and physical health. Their role in the elderly’s wellbeing has been established in numerous studies. It is therefore worthwhile to explore ways in which elderly people can increase the number of positive experiences in their daily lives. This paper describes two Virtual Environments (VEs) that were used as mood induction procedures (MIPs) for this population. In addition, the VEs’ efficacy at increasing joy and relaxation in elderly users is analyzed. The VEs contain exercises for generating positive-autobiographic memories, mindfulness and slow breathing rhythms. The total sample comprised 18 participants over 55 years old who used the VEs on two occasions. Twelve of them used the joy environment, while 16 used the relaxation environment. Moods before and after each session were assessed using Visual Analogical Scales. After using both VEs, results indicated significant increases in joy and relaxation and significant decreases in sadness and anxiety. The participants also indicated low levels of difficulty of use and high levels of satisfaction and sense of presence. Hence, the VEs demonstrate their usefulness at promoting positive affects and enhancing the wellbeing of elderly people. Abstract Copyright © 2012 British Informatics Society Limited. Published by Elsevier B.V. All rights reserved.

Title: The effects of trust, security and privacy in social networking: A security-based approach to understand the pattern of adoption

Abstract Social network services (SNS) focus on building online communities of people who share interests and/or activities, or who are interested in exploring the interests and activities of others. This study examines security, trust, and privacy concerns with regard to social networking Websites among consumers using both reliable scales and measures. It proposes an SNS acceptance model by integrating cognitive as well as affective attitudes as primary influencing factors, which are driven by underlying beliefs, perceived security, perceived privacy, trust, attitude, and intention. Results from a survey of SNS users validate that the proposed theoretical model explains and predicts user acceptance of SNS substantially well. The model shows excellent measurement properties and establishes perceived privacy and perceived security of SNS as distinct constructs. The finding also reveals that perceived security moderates the effect of perceived privacy on trust. Based on the results of this study, practical implications for marketing strategies in SNS markets and theoretical implications are recommended accordingly. Abstract Copyright © 2010 Elsevier B.V. All rights reserved.

Title: Usability testing: what have we overlooked?

Abstract For more than a decade, the number of usability test participants has been a major theme of debate among usability practitioners and researchers keen to improve usability test performance. This paper provides evidence suggesting that the focus be shifted to task coverage instead. Our analysis of data from the nine commercial usability test teams participating in the CUE-4 study revealed no significant correlation between the number of test users and either the percentage of problems found or the percentage of new problems, but the correlations of both variables with the number of user tasks used by each team were significant. The role of participant recruitment in usability test performance and future research directions are discussed. Abstract Copyright © 2013 ACM, Inc.

Title: Predicting online grocery buying intention: a comparison of the theory of reasoned action and the theory of planned behavior

Abstract This paper tests the ability of two consumer theories—the theory of reasoned action and the theory of planned behavior—in predicting consumer online grocery buying intention. In addition, a comparison of the two theories is conducted. Data were collected from two web-based surveys of Danish (n = 1222) and Swedish (n = 1038) consumers using self-administered questionnaires. These results suggest that the theory of planned behavior (with the inclusion of a path from subjective norm to attitude) provides the best fit to the data and explains the highest proportion of variation in online grocery buying intention. Abstract Copyright © 2013 Elsevier B.V. All rights reserved.

Title: Decomposition and crossover effects in the theory of planned behavior: A study of consumer adoption intentions

Abstract The Theory of Planned Behavior, an extension of the well-known Theory of Reasoned Action, is proposed as a model to predict consumer adoption intention. Three variations of the Theory of Planned Behavior are examined and compared to the Theory of Reasoned Action. The appropriateness of each model is assessed with data from a consumer setting. Structural equation modelling using maximum likelihood estimation for the four models revealed that the traditional forms of the Theory of Reasoned Action and the Theory of Planned Behavior fit the data adequately. Decomposing the belief structures and allowing for crossover effects in the Theory of Planned Behavior resulted in improvements in model prediction. The application of each model to theory development and management intervention is explored. Abstract Copyright © 1995 Elsevier B.V. All rights reserved.

Title: Knowledge and the Prediction of Behavior: The Role of Information Accuracy in the Theory of Planned Behavior

Abstract The results of the present research question the common assumption that being well informed is a prerequisite for effective action to produce desired outcomes. In Study 1 (N = 79), environmental knowledge had no effect on energy conservation, and in Study 2 (N = 79), alcohol knowledge was unrelated to drinking behavior. Such disappointing correlations may result from an inappropriate focus on accuracy of information at the expense of its relevance to and support for the behavior. Study 3 (N = 85) obtained a positive correlation between knowledge and pro-Muslim behavior, but Study 4 (N = 89) confirmed the proposition that this correlation arose because responses on the knowledge test reflected underlying attitudes. Study 4 also showed that the correlation could become positive or negative by appropriate selection of questions for the knowledge test. The theory of planned behavior (Ajzen, 1991), with its focus on specific actions, predicted intentions and behavior in all four studies. Abstract Copyright © 2013 Informa plc

research article on user interface

Link: http://www.businessinsider.com/ron-johnson-apple-store-j-c-penney-2011-11

People come to the Apple Store for the experience — and they’re willing to pay a premium for that. There are lots of components to that experience, but maybe the most important — and this is something that can translate to any retailer — is that the staff isn’t focused on selling stuff, it’s focused on building relationships and trying to make people’s lives better. Abstract Copyright © 2013 Business Insider, Inc. All rights reserved.

Title: Naturalizing aesthetics: Brain areas for aesthetic appraisal across sensory modalities

Abstract We present here the most comprehensive analysis to date of neuroaesthetic processing by reporting the results of voxel-based meta-analyses of 93 neuroimaging studies of positive-valence aesthetic appraisal across four sensory modalities. The results demonstrate that the most concordant area of activation across all four modalities is the right anterior insula, an area typically associated with visceral perception, especially of negative valence (disgust, pain, etc.). We argue that aesthetic processing is, at its core, the appraisal of the valence of perceived objects. This appraisal is in no way limited to artworks but is instead applicable to all types of perceived objects. Therefore, one way to naturalize aesthetics is to argue that such a system evolved first for the appraisal of objects of survival advantage, such as food sources, and was later co-opted in humans for the experience of artworks for the satisfaction of social needs. Abstract Copyright © 2011 Elsevier Inc. All rights reserved.

Link: http://www.scientificamerican.com/article.cfm?id=the-neuroscience-of-beauty

Studies from neuroscience and evolutionary biology challenge this separation of art from non-art. Human neuroimaging studies have convincingly shown that the brain areas involved in aesthetic responses to artworks overlap with those that mediate the appraisal of objects of evolutionary importance, such as the desirability of foods or the attractiveness of potential mates. Hence, it is unlikely that there are brain systems specific to the appreciation of artworks; instead there are general aesthetic systems that determine how appealing an object is, be that a piece of cake or a piece of music. Abstract © 2013 Scientific American, a Division of Nature America, Inc.

Link: http://blogs.scientificamerican.com/symbiartic/2011/10/03/need-proof-that-were-visual-beings/

This video offers proof that humans are visual beings. Abstract © 2013 Scientific American, a Division of Nature America, Inc.

Link: http://hbr.org/web/slideshows/five-charts-that-changed-business/1-slide

Once in a while, a chart so deftly captures an important strategic insight that it becomes an iconic part of management thinking and a tool that shows up in MBA classrooms and corporate boardrooms for years to come. As HBR prepares for its 90th anniversary in 2012, its editors have combed the magazine archives and other sources to select five charts that changed the shape of strategy. Abstract Copyright © 2013 Harvard Business School Publishing. All rights reserved.

Link: http://www.strategy-business.com/article/04412

It is a widely accepted and rarely challenged tenet of marketing that companies can sustain competitive advantage only through “new and improved” product differentiation based on unique features and benefits. What a mistake. By paying attention to what consumers really want, companies can attract new customers and create a distinctive brand. Abstract © 2013 Booz & Company Inc. All rights reserved.

Link: http://www.economist.com/node/17723028

If you can have everything in 57 varieties, making decisions becomes hard work. Many of these options have improved life immeasurably in the rich world, and to a lesser extent in poorer parts. They are testimony to human ingenuity and innovation. Free choice is the basis on which markets work, driving competition and generating economic growth. It is the cornerstone of liberal democracy. The 20th century bears the scars of too many failed experiments in which people had no choice. But amid all the dizzying possibilities, a nagging question lurks: is so much extra choice unambiguously a good thing? Abstract Copyright © The Economist Newspaper Limited 2013. All rights reserved.

Link: http://e.businessinsider.com/public/1099804

Mobile apps are becoming more important to people, not less important, according to this chart plucked from a big presentation on the internet. It’s an interesting trend because it shows how mobile behavior differs from traditional desktop computing behavior when it comes to the web. Abstract Copyright © 2013 Business Insider, Inc. All rights reserved.

Link: http://blogs.scientificamerican.com/scicurious-brain/2012/07/30/you-want-that-well-i-want-it-too-the-neuroscience-of-mimetic-desire/

Mimetic desire is more than jealously wanting something because someone else has it. Rather, it’s about valuing something because someone else values it . And it’s pretty easy to transmit the value. Just writing about Person A’s activities and habits and showing it to Person B will make Person B start to think Person A must have seen something good about the Toyota Camry…maybe his next car…

But what is behind this contagion of desires? Abstract © 2013 Scientific American, a Division of Nature America, Inc.

research article on user interface

Link: http://www.united-academics.org/magazine/27212/visual-memory-blindness/

A well-known phenomenon in psychology is ‘inattentional blindness’. In fact, you might know it from experience: people tend to fail to see things in their visual field when they have to focus on a task. Until now, it was thought that a cluttered visual field was required to cause the effect. Recent research shows that the effect is present in many more situations. Abstract Copyright © 2012 United Academics. All rights reserved.

Link: http://www.businessinsider.com/18-24-texting-2011-9

Chart of the Day: According to the Pew Internet project , people in the 18-24 year-old range are sending and receiving 110 texts per day on average. The median number of texts sent/received by that group is 50 per day. Abstract Copyright © 2013 Business Insider, Inc. All rights reserved.

Link: http://www.businessinsider.com/chart-of-the-day-facebook-time-2011-9

Chart of the Day: A new report on social media from Nielsen shows U.S. users spent 53.5 billion minutes on Facebook in May, which is more time than was spent on the next four biggest sites. Abstract Copyright © 2013 Business Insider, Inc. All rights reserved.

Link: http://www.scientificamerican.com/article.cfm?id=your-brain-on-facebook

A recent study showed that certain brain areas expand in people who have greater numbers of friends on Facebook . There was a problem, though. The study, in Proceedings of the Royal Society B , was unable to resolve the question of whether “friending” plumps up the brain areas or whether people with a type of robustness in brain physiology are just natural social butterflies. But with the help of a few monkeys in England, teenagers everywhere may now have more ammunition to use against parents. Abstract © 2013 Scientific American, a Division of Nature America, Inc.

Link: http://iwc.oxfordjournals.org/content/26/3/196.abstract.html?etoc

Although advances in technology now enable people to communicate ‘anytime, anyplace’, it is not clear how citizens can be motivated to actually do so. This paper evaluates the impact of three principles of psychological empowerment, namely perceived self-efficacy, sense of community and causal importance, on public transport passengers’ motivation to report issues and complaints while on the move. A week-long study with 65 participants revealed that self-efficacy and causal importance increased participation in short bursts and increased perceptions of service quality over longer periods. Finally, we discuss the implications of these findings for citizen participation projects and reflect on design opportunities for mobile technologies that motivate citizen participation. Abstract 2013 Oxford University Press.

Link: http://iwc.oxfordjournals.org/content/26/3/208.abstract.html?etoc

This review paper argues that users of personal information management systems have three particularly pressing requirements, for which current systems do not fully cater: (i) To combat information overload, as the volume of information increases. (ii) To ease context switching, in particular, for users who face frequent interrupts in their work. (iii) To be supported in information integration, across a variety of applications. To meet these requirements, four broad technological approaches should be adopted in an incremental fashion: (i) The deployment of a unified file system to manage all information objects, including files, emails and webpage URLs. (ii) The use of tags to categorize information; implemented in a way which is backward-compatible with existing hierarchical file systems. (iii) The use of context to aid information retrieval; built upon existing file and tagging systems rather than creating a parallel context management system. (iv) The deployment of semantic technologies, coupled with the harvesting of all useful metadata. Abstract 2013 Oxford University Press.
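As a toy illustration of the tagging requirement above (tags that cut across files, emails and webpage URLs), the sketch below keeps a single inverted index from tags to object identifiers, so heterogeneous information objects can be categorized and retrieved uniformly. All names and identifiers are invented; the review states the requirement, not this implementation.

```python
from collections import defaultdict

class TagIndex:
    """Minimal tag-based lookup over heterogeneous information objects."""

    def __init__(self):
        self._by_tag = defaultdict(set)  # tag -> set of object identifiers

    def add(self, obj_id, tags):
        """Register an object (file path, email id, URL, ...) under each tag."""
        for tag in tags:
            self._by_tag[tag].add(obj_id)

    def find(self, *tags):
        """Return the objects carrying every requested tag."""
        sets = [self._by_tag[t] for t in tags]
        return set.intersection(*sets) if sets else set()

# Invented examples mixing a file, an email, and a webpage URL
idx = TagIndex()
idx.add("report.pdf", ["project-x", "2014"])
idx.add("mail:123", ["project-x", "budget"])
idx.add("http://example.org/spec", ["project-x", "2014"])
print(sorted(idx.find("project-x", "2014")))
```

Because retrieval is a plain set intersection over the inverted index, adding context or semantic metadata (the review’s points (iii) and (iv)) would amount to adding further tag-like dimensions rather than a parallel system.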

Link: http://iwc.oxfordjournals.org/content/26/3/238.abstract.html?etoc

Projective techniques are used in psychology and consumer research to provide information about individuals’ motivations, thoughts and feelings. This paper reviews the use of projective techniques in marketing research and user experience (UX) research and discusses their potential role in understanding users, their needs and values, and evaluating UX in practical product development contexts. A projective technique called sentence completion is evaluated through three case studies. Sentence completion produces qualitative data about users’ views in a structured form. The results are less time-consuming to analyze than interview results. Compared with quantitative methods such as AttrakDiff, the results are more time-consuming to analyze, but more information is retrieved on negative feelings. The results show that sentence completion is useful in understanding users’ perceptions and that the technique can be used to complement other methods. Sentence completion can also be used online to reach wider user groups. Abstract 2013 Oxford University Press.

Link: http://iwc.oxfordjournals.org/content/26/3/256.abstract.html?etoc

Cognitive load (CL) is experienced during critical tasks, and emotional states may be induced either by the task itself or by extraneous experiences. Emotions irrelevant to the working memory representation may interfere with the processing of relevant tasks and can influence task performance and behavior, making the accurate detection of CL from nonverbal information challenging. This paper investigates automatic CL detection from facial features, physiology and task performance under affective interference. Data were collected from participants (n = 20) solving mental arithmetic tasks with emotional stimuli in the background, and a combined classifier was used for detecting CL levels. Results indicate that the face modality for CL detection was more accurate under affective interference, whereas physiology and task performance were more accurate without the affective interference. Multimodal fusion improved detection accuracies, but it was less accurate under affective interference. More specifically, the accuracy decreased with an increasing intensity of emotional arousal. Abstract 2013 Oxford University Press.
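One common way to combine evidence from several modalities, as a multimodal fusion step like the one this abstract mentions, is late fusion: averaging the per-modality class probabilities, optionally with weights. The sketch below is a generic illustration under invented probabilities and equal weights; the paper’s actual combined classifier may work differently.

```python
def fuse(modality_probs, weights=None):
    """Late fusion: weighted average of per-modality class probabilities."""
    if weights is None:
        weights = [1.0] * len(modality_probs)
    total = sum(weights)
    classes = modality_probs[0].keys()
    return {c: sum(w * p[c] for w, p in zip(weights, modality_probs)) / total
            for c in classes}

# Invented class probabilities for "low" vs. "high" cognitive load
face = {"low": 0.30, "high": 0.70}
physiology = {"low": 0.65, "high": 0.35}
performance = {"low": 0.60, "high": 0.40}

fused = fuse([face, physiology, performance])
print(max(fused, key=fused.get))  # -> low
```

Weighting the modalities (e.g. down-weighting physiology under affective interference, per the paper’s findings) only requires passing a `weights` list; the fused scores remain a valid probability distribution when the inputs are.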

Link: http://iwc.oxfordjournals.org/content/26/3/269.abstract.html?etoc

In the field of virtual reality (VR), many efforts have been made to analyze presence, the sense of being in the virtual world. However, it is only recently that functional magnetic resonance imaging (fMRI) has been used to study presence during an automatic navigation through a virtual environment. In the present work, our aim was to use fMRI to study the sense of presence during a VR-free navigation task, in comparison with visualization of photographs and videos (automatic navigations through the same environment). The main goal was to analyze the usefulness of fMRI for this purpose, evaluating whether, in this context, the interaction between the subject and the environment is performed naturally, hiding the role of technology in the experience. We monitored 14 right-handed healthy females aged between 19 and 25 years. Frontal, parietal and occipital regions showed their involvement during free virtual navigation. Moreover, activation in the dorsolateral prefrontal cortex was also shown to be negatively correlated to sense of presence and the postcentral parietal cortex and insula showed a parametric increased activation according to the condition-related sense of presence, which suggests that stimulus attention and self-awareness processes related to the insula may be linked to the sense of presence. Abstract 2013 Oxford University Press.

Link: http://iwc.oxfordjournals.org/content/26/3/285.abstract.html?etoc

Unlike visual stimuli, little attention has been paid to auditory stimuli in terms of emotion prediction with physiological signals. This paper aimed to investigate whether auditory stimuli can be as effective an elicitor as visual stimuli for emotion prediction using physiological channels. For this purpose, a well-controlled experiment was designed, in which standardized visual and auditory stimuli were systematically selected and presented to participants to induce various emotions spontaneously in a laboratory setting. Numerous physiological signals, including facial electromyogram, electroencephalography, skin conductivity and respiration data, were recorded when participants were exposed to the stimulus presentation. Two data mining methods, namely decision rules and k-nearest neighbor based on the rough set technique, were applied to construct emotion prediction models based on the features extracted from the physiological data. Experimental results demonstrated that auditory stimuli were as effective as visual stimuli in eliciting emotions in terms of systematic physiological reactivity. This was evidenced by the best prediction accuracy quantified by the F1 measure (visual: 76.2% vs. auditory: 76.1%) among six emotion categories (excited, happy, neutral, sad, fearful and disgusted). Furthermore, we also constructed culture-specific (Chinese vs. Indian) prediction models. The results showed that model prediction accuracy was not significantly different between culture-specific models. Finally, the implications of affective auditory stimuli in human–computer interaction, limitations of the study and suggestions for further research are discussed. Abstract 2013 Oxford University Press.
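The F1 measure this abstract reports is the harmonic mean of precision and recall per class, averaged over the emotion categories. A minimal sketch of that macro-averaged computation, using invented labels over three of the six categories rather than the study’s data, is:

```python
def macro_f1(y_true, y_pred):
    """Macro-averaged F1: the unweighted mean of per-class F1 scores."""
    classes = sorted(set(y_true) | set(y_pred))
    f1s = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    return sum(f1s) / len(f1s)

# Invented predictions over three of the paper's six emotion categories
truth = ["happy", "sad", "neutral", "happy", "sad", "neutral"]
pred = ["happy", "sad", "happy", "happy", "neutral", "neutral"]
print(round(macro_f1(truth, pred), 3))  # -> 0.656
```

Macro-averaging gives every emotion category equal weight regardless of how often it occurs, which matters when some emotions are induced less reliably than others.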

Link: http://www.sciencedirect.com/science/article/pii/S0160289614000087

The deliberate practice view has generated a great deal of scientific and popular interest in expert performance. At the same time, empirical evidence now indicates that deliberate practice, while certainly important, is not as important as Ericsson and colleagues have argued it is. In particular, we (Hambrick, Oswald, Altmann, Meinz, Gobet, & Campitelli, 2014) found that individual differences in accumulated amount of deliberate practice accounted for about one-third of the reliable variance in performance in chess and music, leaving the majority of the reliable variance unexplained and potentially explainable by other factors. Ericsson’s (2014) defense of the deliberate practice view, though vigorous, is undercut by contradictions, oversights, and errors in his arguments and criticisms, several of which we describe here. We reiterate that the task now is to develop and rigorously test falsifiable theories of expert performance that take into account as many potentially relevant constructs as possible. Abstract © 2014 Elsevier Inc.

Link: http://techcrunch.com/2013/02/05/amazon-to-launch-virtual-currency-amazon-coins-in-its-appstore-in-may/

Amazon has just announced a new virtual currency for Kindle Fire owners to use on in-app purchases, app purchases, etc. in the Amazon Appstore. Abstract © 2013 AOL Inc. All rights reserved.

Link: http://onlinelibrary.wiley.com/doi/10.1002/smj.2284/abstract

This paper studies how business models can be designed to tap effectively into open innovation labor markets with heterogeneously motivated workers. Using data on open source software, we show that motivations are diverse, and demonstrate how managers can strategically influence the flow of code contributions and their impact on project performance. Unlike previous literature using survey data, we exploit the observed pattern of project membership and code contributions—the “revealed preference” of developers—to infer the motivations driving their decision to contribute. Developers strongly sort along key dimensions of the business model chosen by project managers, especially the degree of openness of the project license. The results indicate an important role for intrinsic motivation, reputation, and labor market signaling, and a more limited role for reciprocity. Abstract 2014 John Wiley & Sons, Ltd.

Link: http://iwc.oxfordjournals.org/content/early/2014/05/09/iwc.iwu016.abstract.html?papetoc

Wizard of Oz (WOZ) is a well-established method for simulating the functionality and user experience of future systems. Using a human wizard to mimic certain operations of a potential system is particularly useful in situations where extensive engineering effort would otherwise be needed to explore the design possibilities offered by such operations. The WOZ method has been widely used in connection with speech and language technologies, but advances in sensor technology and pattern recognition as well as new application areas such as human–robot interaction have made it increasingly relevant to the design of a wider range of interactive systems. In such cases, achieving acceptable performance at the user interface level often hinges on resource-intensive improvements such as domain tuning, which are better done once the overall design is relatively stable. Although WOZ is recognized as a valuable prototyping technique, surprisingly little effort has been put into exploring it from a methodological point of view. Starting from a survey of the literature, this paper presents a systematic investigation and analysis of the design space for WOZ for language technology applications, and proposes a generic architecture for tool support that supports the integration of components for speech recognition and synthesis as well as for machine translation. This architecture is instantiated in WebWOZ—a new web-based open-source WOZ prototyping platform. The viability of generic support is explored empirically through a series of evaluations. Researchers from a variety of backgrounds were able to create experiments, independent of their previous experience with WOZ. The approach was further validated through a number of real experiments, which also helped to identify a number of possibilities for additional support, and flagged potential issues relating to consistency in wizard performance. Abstract 2014 Oxford University Press

Link: http://www.thinkwithgoogle.com/insights/library/studies/the-new-multi-screen-world-study/

updated on 5/13

Title: Developing elements of user experience for mobile phones and services: survey, interview, and observation approaches

Abstract The term user experience (UX) encompasses the concepts of usability and affective engineering. However, UX has not been defined clearly. In this study, a literature survey, user interview and indirect observation were conducted to develop definitions of UX and its elements. A literature survey investigated 127 articles that were considered to be helpful to define the concept of UX. An in-depth interview targeted 14 hands-on workers in the Korean mobile phone industry. An indirect observation captured daily experiences of eight end-users with mobile phones. This study collected various views on UX from academia, industry, and end-users using these three approaches. As a result, this article proposes definitions of UX and its elements: usability, affect, and user value. These results are expected to help design products or services with greater levels of UX. Abstract Copyright 2011 Wiley Periodicals, Inc.

Title: Why different people prefer different systems for different tasks: An activity perspective on technology adoption in a dynamic user environment

Abstract In a contemporary user environment, there are often multiple information systems available for a certain type of task. Based on the premises of Activity Theory, this study examines how user characteristics, system experiences, and task situations influence an individual’s preferences among different systems in terms of user readiness to interact with each. It hypothesizes that system experiences directly shape specific user readiness at the within-subject level, user characteristics and task situations make differences in general user readiness at the between-subject level, and task situations also affect specific user readiness through the mediation of system experiences. An empirical study was conducted, and the results supported the hypothesized relationships. The findings provide insights on how to enhance technology adoption by tailoring system development and management to various task contexts and different user groups. Abstract Copyright 2011 ASIS&T

Title: A review of factors influencing user satisfaction in information retrieval

Abstract The authors investigate factors influencing user satisfaction in information retrieval. It is evident from this study that user satisfaction is a subjective variable, which can be influenced by several factors such as system effectiveness, user effectiveness, user effort, and user characteristics and expectations. Therefore, information retrieval evaluators should consider all these factors in obtaining user satisfaction and in using it as a criterion of system effectiveness. Previous studies have conflicting conclusions on the relationship between user satisfaction and system effectiveness; this study has substantiated these findings and supports using user satisfaction as a criterion of system effectiveness. Abstract Copyright 2010 ASIS&T

Title: The development and evaluation of a survey to measure user engagement

Abstract Facilitating engaging user experiences is essential in the design of interactive systems. To accomplish this, it is necessary to understand the composition of this construct and how to evaluate it. Building on previous work that posited a theory of engagement and identified a core set of attributes that operationalized this construct, we constructed and evaluated a multidimensional scale to measure user engagement. In this paper we describe the development of the scale, as well as two large-scale studies (N=440 and N=802) that were undertaken to assess its reliability and validity in online shopping environments. In the first we used Reliability Analysis and Exploratory Factor Analysis to identify six attributes of engagement: Perceived Usability, Aesthetics, Focused Attention, Felt Involvement, Novelty, and Endurability. In the second we tested the validity of and relationships among those attributes using Structural Equation Modeling. The result of this research is a multidimensional scale that may be used to test the engagement of software applications. In addition, findings indicate that attributes of engagement are highly intertwined, a complex interplay of user-system interaction variables. Notably, Perceived Usability played a mediating role in the relationship between Endurability and Novelty, Aesthetics, Felt Involvement, and Focused Attention. Abstract Copyright 2009 ASIS&T
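For readers who want to try the reliability step themselves, the internal-consistency check used in scale development of this kind can be sketched with a Cronbach's alpha computation (the item set and Likert responses below are invented for illustration, not data from the study):

```python
import numpy as np

def cronbach_alpha(responses: np.ndarray) -> float:
    """Internal-consistency reliability for an n_respondents x n_items matrix."""
    n_items = responses.shape[1]
    item_vars = responses.var(axis=0, ddof=1)       # variance of each item
    total_var = responses.sum(axis=1).var(ddof=1)   # variance of the summed scale
    return (n_items / (n_items - 1)) * (1 - item_vars.sum() / total_var)

# Invented 5-point Likert responses for a hypothetical 4-item subscale.
responses = np.array([
    [5, 4, 5, 4],
    [3, 3, 2, 3],
    [4, 4, 4, 5],
    [2, 1, 2, 2],
    [5, 5, 4, 4],
])
print(round(cronbach_alpha(responses), 2))   # → 0.95
```

Values above roughly 0.7 are conventionally read as acceptable reliability for a subscale; the actual study used full Reliability Analysis plus factor analysis, which this sketch does not reproduce.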

Title: Exploring user engagement in online news interactions

Abstract This paper describes a qualitative study of online news reading and browsing. Thirty people participated in a quasi-experimental study in which they were asked to browse a news website and select three stories to discuss at a social gathering. Semi-structured interviews were conducted post-task to understand participants’ perceptions of what makes online news reading and browsing engaging or non-engaging. Findings are presented within the experience-based framework of user engagement and demonstrate the complexity of users’ interactions with information content and systems in online news environments. This study extends the model of user engagement and contributes new insights into users’ experiences in casual-leisure settings, such as online news, which has implications for other information domains. Abstract Copyright 2011 by American Society for Information Science and Technology

Abstract This chapter of The Fabric of Mobile Services: Software Paradigms and Business Demands contains sections titled: New Services and User Experience, User-Centered Simplicity and Experience, Methodologies for Simplicity and User Experience, and Case Studies: Simplifying Paradigms Abstract Copyright 2009 John Wiley & Sons, Inc.

Title: The Right Angle: Visual Portrayal of Products Affects Observers’ Impressions of Owners

Abstract Consumer products have long been known to influence observers’ impressions of product owners. The angle at which products are visually portrayed in advertisements, however, may be an overlooked factor in these effects. We hypothesize and find that portrayals of the same product from different viewpoints can prime different associations that color impressions of product and owner in parallel ways. In Study 1, automobiles were rated higher on status- and power-related traits (e.g., dominant , powerful ) when portrayed head-on versus in side profile, an effect found for sport utility vehicles (SUVs)—a category with a reputation for dominance—but not sedans. In Study 2, these portrayal-based associations influenced the impressions formed about the product’s owner: a target person was rated higher on status- and power-related traits when his SUV was portrayed head-on versus in side profile. These results suggest that the influence of visual portrayal extends beyond general evaluations of products to affect more specific impressions of products and owners alike, and highlight that primed traits are likely to influence impressions when compatible with other knowledge about the target. Abstract Copyright 2012 Wiley Periodicals, Inc

Title: The Counterfeit Self: The Deceptive Costs of Faking It

Abstract Although people buy counterfeit products to signal positive traits, we show that wearing counterfeit products makes individuals feel less authentic and increases their likelihood of both behaving dishonestly and judging others as unethical. In four experiments, participants wore purportedly fake or authentically branded sunglasses. Those wearing fake sunglasses cheated more across multiple tasks than did participants wearing authentic sunglasses, both when they believed they had a preference for counterfeits (Experiment 1a) and when they were randomly assigned to wear them (Experiment 1b). Experiment 2 shows that the effects of wearing counterfeit sunglasses extend beyond the self, influencing judgments of other people’s unethical behavior. Experiment 3 demonstrates that the feelings of inauthenticity that wearing fake products engenders—what we term the counterfeit self—mediate the impact of counterfeits on unethical behavior. Finally, we show that people do not predict the impact of counterfeits on ethicality; thus, the costs of counterfeits are deceptive. Abstract Copyright 2010 Francesca Gino, Michael I. Norton, and Dan Ariely

Link: http://iwc.oxfordjournals.org/content/26/5/389.full.html?etoc

Menus are a key mechanism for organizing different commands in graphical user interfaces. Nowadays, low-cost devices that allow different interaction techniques in remote interfaces have become widespread. Nevertheless, their corresponding menus are direct adaptations of traditional ones. As a consequence, they are inaccurate and slow, and also cause fatigue. In this paper, we design, implement and evaluate a menu selection technique for remote interfaces, the Body Menu. This technique permits whole-body interaction and is specifically designed to take advantage of the sense of proprioception. The Body Menu attaches virtual menu items to different parts of the body and selects them when users reach these zones with their hands. We used the Microsoft Kinect to implement this system. Additionally, we compared it with the most representative menus, studied the best number of body parts to use and analyzed how children interact with it. Abstract © 2013 Oxford University Publishing.
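The core selection logic of a body-attached menu like this can be sketched in a few lines (the zone names, joint coordinates and activation radius are invented; a real system would stream 3D joint positions from the Kinect SDK rather than use fixed 2D points):

```python
from dataclasses import dataclass
from math import dist
from typing import Optional

@dataclass
class BodyZone:
    item: str                     # menu item attached to this body part
    center: tuple                 # (x, y) joint position, metres
    radius: float                 # activation radius around the joint

def select_item(hand: tuple, zones: list) -> Optional[str]:
    """Return the menu item whose zone contains the hand, nearest centre first."""
    hits = [z for z in zones if dist(hand, z.center) <= z.radius]
    if not hits:
        return None
    return min(hits, key=lambda z: dist(hand, z.center)).item

zones = [
    BodyZone("Copy",  center=(-0.2, 1.4), radius=0.12),   # left shoulder
    BodyZone("Paste", center=( 0.2, 1.4), radius=0.12),   # right shoulder
    BodyZone("Undo",  center=( 0.0, 1.0), radius=0.12),   # torso
]
print(select_item((0.21, 1.38), zones))   # hand at right shoulder → Paste
```

Proprioception does the rest: because users know where their own shoulders and torso are without looking, the zones can be selected without visual search.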

Link: http://iwc.oxfordjournals.org/content/26/5/403.full.html?etoc

We present the evaluation of an interactive audio map system that enables blind and partially sighted users to explore and navigate city maps from the safety of their home using simulated 3D audio and synthetic speech alone. We begin with a review of existing literature in the areas of spatial knowledge and wayfinding, auditory displays and auditory map systems, before describing how this research builds on and differentiates itself from this body of work. One key requirement was the ability to quantify the effectiveness of the audio map, so we describe the design and implementation of the evaluation, which took the form of a game downloaded by participants to their own computers. The results demonstrate that participants (blind, partially sighted and sighted) have acquired detailed spatial knowledge and also that the availability of positional audio cues significantly improves wayfinding performance. Abstract © 2013 Oxford University Publishing.

Link: http://iwc.oxfordjournals.org/content/26/5/417.full.html?etoc

Delegation is the practice of sharing authority with another individual to enable them to complete a specific task as a proxy. Practices to permit delegation can range from formal to informal arrangements and can involve spontaneous yet finely balanced notions of trust between people. This paper argues that delegation is a ubiquitous yet unsupported feature of socio-technical computer systems and that this lack of support illustrates a particular neglect of the everyday financial practices of the more vulnerable people in society. Our contribution is to provide a first exploration of the domain of person-to-person delegation in digital payments, a particularly pressing context. We first report qualitative data collected across several studies concerning banking practices of individuals over 80 years of age. We then use analytical techniques centred upon identification of stakeholders, their concerns and interactions, to characterize the delegation practices we observed. We propose a Concerns Matrix as a suitable representation to capture conflicts in the needs of individuals in such complex socio-technical systems, and finally propose a putative design response in the form of a Helper Card. Abstract © 2013 Oxford University Publishing.

Link: Why We Love Beautiful Things

Great design, the management expert Gary Hamel once said, is like Justice Potter Stewart’s famous definition of pornography — you know it when you see it. You want it, too: brain scan studies reveal that the sight of an attractive product can trigger the part of the motor cerebellum that governs hand movement. Instinctively, we reach out for attractive things; beauty literally moves us. © 2013 New York Times

Link: http://www.bris.ac.uk/news/2013/9478.html

A new study has analysed tens of thousands of articles available to readers of online news and created a model to find out ‘what makes people click’. The aim of the study was to model the reading preferences of the audiences of 14 online news outlets using machine learning techniques. The models, describing the appeal of an article to each audience, were expressed as linear functions of word frequencies. The models compared articles that became “most popular” on a given day in a given outlet with articles that did not. The research identified the most attractive keywords, as well as the least attractive ones, and explained the choices readers made. Abstract © 2013 University of Bristol.
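The approach described above, a linear appeal function over word frequencies that separates "most popular" articles from the rest, can be sketched as follows (the vocabulary, counts and popularity labels are invented; the study's actual corpus and learning method are far larger and more sophisticated):

```python
import numpy as np

# Rows are articles, columns are word frequencies over a tiny invented vocabulary.
vocab = ["shark", "weather", "policy"]
X = np.array([[2, 1, 0],
              [0, 1, 1],
              [3, 0, 1],
              [0, 2, 0]], dtype=float)
# 1.0 = the article became "most popular" that day (here driven by "shark").
y = np.array([1.0, 0.0, 1.0, 0.0])

# Fit a linear appeal function w such that X @ w ≈ y.
w, *_ = np.linalg.lstsq(X, y, rcond=None)

# Largest weights ≈ most attractive keywords; smallest ≈ least attractive.
ranked = sorted(zip(vocab, w), key=lambda t: -t[1])
print([word for word, _ in ranked])   # → ['shark', 'weather', 'policy']
```

The weight vector is directly interpretable: each coefficient estimates how much one extra occurrence of a word contributes to an article's appeal for that audience.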

Title: Pointing and Selecting with Facial Activity

Abstract The aim of this paper was to evaluate the use of three facial actions (i.e. frowning, raising the eyebrows, and smiling) in selecting objects on a computer screen when gaze was used for pointing. Dwell time is the most commonly used selection technique in gaze-based interaction, and thus, a dwell time of 400 ms was used as a reference selection technique. A wireless, head-mounted prototype device that carried out eye tracking and contactless, capacitive measurement of facial actions was used for the interaction task. Participants (N=16) performed point-and-select tasks with three pointing distances (i.e. 60, 120 and 240 mm) and three target sizes (i.e. 25, 30 and 40 mm). Task completion times, pointing errors and throughput values based on Fitts’ law were used to compare the selection techniques. The participants also rated the techniques with subjective rating scales. The results showed that the different techniques performed equally well in many respects. However, throughput values varied from 8.38 bits/s (raising the eyebrows) to 15.33 bits/s (smiling) and were comparable to or, in the case of smiling, better than in earlier research with similar interaction techniques. The dwell time was found to be the least accurate selection technique in terms of the magnitudes of point-and-select errors. The smiling technique was rated as more accurate to use than the frowning or eyebrow-raising techniques. The results give further support for methods that combine facial behavior with eye tracking when interacting with technology.

Abstract Copyright 2014 Outi Tuisku, Ville Rantanen, Oleg Špakov, Veikko Surakka and Jukka Lekkala
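The throughput figures quoted in this abstract come from Fitts' law. A minimal sketch of the standard Shannon-formulation computation, using distances and target sizes from the study but an invented movement time (the paper reports throughput directly, not the raw times used here):

```python
from math import log2

def index_of_difficulty(distance_mm: float, width_mm: float) -> float:
    """Shannon formulation: ID = log2(D / W + 1), in bits."""
    return log2(distance_mm / width_mm + 1)

def throughput(distance_mm: float, width_mm: float, movement_time_s: float) -> float:
    """Throughput in bits/s: index of difficulty over mean movement time."""
    return index_of_difficulty(distance_mm, width_mm) / movement_time_s

# 240 mm distance and 40 mm target come from the study conditions;
# the 0.5 s movement time is an invented placeholder.
print(round(index_of_difficulty(240, 40), 2))   # → 2.81
print(round(throughput(120, 30, 0.5), 2))
```

Because throughput folds task difficulty and speed into one number, it lets techniques as different as smiling and dwell time be compared on a common scale.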

Title: Modeling Traditional Literacy, Internet Skills and Internet Usage: An Empirical Study

Abstract This paper focuses on the relationships among traditional literacy (reading, writing and understanding text), medium-related Internet skills (consisting of operational and formal skills), content-related Internet skills (consisting of information and strategic skills) and Internet usage types (information- and career-directed Internet use and entertainment use). We conducted a large-scale survey that resulted in a dataset of 1008 respondents. The results reveal the following: (i) traditional literacy has a direct effect on formal and information Internet skills and an indirect effect on strategic Internet skills and (ii) differences in types of Internet usage are indirectly determined by traditional literacy and directly affected by Internet skills, such that higher levels of strategic Internet skills result in more information- and career-directed Internet use. Traditional literacy is a pre-condition for the employment of Internet skills, and Internet skills should not be considered an easy means of disrupting historically grounded inequalities caused by differences in traditional literacy.

Abstract Copyright 2014 A.J.A.M. van Deursen and J.A.G.M. van Dijk

Title: Life Is Too Short to RTFM: How Users Relate to Documentation and Excess Features in Consumer Products

Abstract This paper addresses two common problems that users of various products and interfaces encounter—over-featured interfaces and product documentation. Over-featured interfaces are seen as a problem as they can confuse and over-complicate everyday interactions. Researchers also often claim that users do not read product documentation, although they are often exhorted to ‘RTFM’ (read the field manual). We conducted two sets of studies with users which looked at the issues of both manuals and excess features with common domestic and personal products. The quantitative set was a series of questionnaires administered to 170 people over 7 years. The qualitative set consisted of two 6-month longitudinal studies based on diaries and interviews with a total of 15 participants. We found that manuals are not read by the majority of people, and most do not use all the features of the products that they own and use regularly. Men are more likely to do both than women, and younger people are less likely to use manuals than middle-aged and older ones. More educated people are also less likely to read manuals. Over-featuring and being forced to consult manuals also appears to cause negative emotional experiences. Implications of these findings are discussed.

Abstract Copyright 2014 Alethea L. Blackler, Rafael Gomez, Vesna Popovic and M. Helen Thompson

Title: Effect of Age on Human–Computer Interface Control Via Neck Electromyography

Abstract The purpose of this study was to determine the effect of age on visuomotor tracking using submental and anterior neck surface electromyography (sEMG) to assess feasibility of computer control via neck musculature, which allows people with little remaining motor function to interact with computers. Thirty-two healthy adults participated: 16 younger adults aged 18–29 years and 16 older adults aged 69–85 years. Participants modulated sEMG to achieve targets presented at different amplitudes using real-time visual feedback. Root mean squared (RMS) error was used to quantify tracking performance. RMS error was increased for older adults relative to younger adults. Older adults demonstrated more RMS error than younger adults as a function of increasing target amplitude. The differential effects of age found on static tracking performance in anterior neck musculature suggest more difficult translation of human–computer interfaces controlled using anterior neck musculature for static tasks to older populations.

Abstract Copyright 2014 Gabrielle L. Hands and Cara E. Stepp
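The RMS error used here to quantify tracking performance is simple to compute. A sketch with invented target and response signals (the study's actual signals were continuous sEMG amplitudes, not five discrete samples):

```python
import numpy as np

def rms_error(target: np.ndarray, response: np.ndarray) -> float:
    """Root mean squared tracking error between target and produced signal."""
    return float(np.sqrt(np.mean((target - response) ** 2)))

target = np.array([10.0, 20.0, 30.0, 20.0, 10.0])    # invented target amplitudes
response = np.array([12.0, 18.0, 33.0, 20.0, 9.0])   # invented participant output
print(round(rms_error(target, response), 2))   # → 1.9
```

RMS error penalizes large deviations more than small ones, which is why it is a common summary statistic for visuomotor tracking tasks like this one.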

Title: Should I Stay or Should I Go? Improving Event Recommendation in the Social Web

Abstract This paper focuses on the recommendation of events in the Social Web, and addresses the problem of finding if, and to which extent, certain features, which are peculiar to events, are relevant in predicting the users’ interests and should thereby be taken into account in recommendation. We consider, in particular, three ‘additional’ features that are usually shown to users within social networking environments: reachability from the user location, the reputation of the event in the community and the participation of the user’s friends. Our study is aimed at evaluating whether adding this information to the description of the event type and topic, and including in the user profile the information on the relevance of these factors, can improve our capability to predict the user’s interest. We approached the problem by carrying out two surveys with users, who were asked to express their interest in a number of events. We then trained, by means of linear regression, a scoring function defined as a linear combination of the different factors, whose goal was to predict the user scores. We repeated this experiment under different hypotheses on the additional factors, in order to assess their relevance by comparing the predictive capabilities of the resulting functions. The compared results of our experiments show that additional factors, if properly weighted, can improve the prediction accuracy with an error reduction of 4.1%. The best results were obtained by combining content-based factors and additional factors in a proportion of ∼10:4.

Abstract Copyright 2014 Federica Cena, Silvia Likavec, Ilaria Lombardi and Claudia Picardi
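The scoring function described, a linear combination of content-based and additional factors weighted roughly 10:4, might look like the sketch below (the factor names, values and normalization are invented for illustration; the paper trained its weights by linear regression on survey data):

```python
def event_score(content: dict, additional: dict,
                content_weight: float = 10.0,
                additional_weight: float = 4.0) -> float:
    """Weighted linear combination of factor scores, each assumed in [0, 1]."""
    total = content_weight + additional_weight
    content_part = sum(content.values()) / len(content)
    additional_part = sum(additional.values()) / len(additional)
    return (content_weight * content_part
            + additional_weight * additional_part) / total

score = event_score(
    content={"topic_match": 0.9, "type_match": 0.7},
    additional={"reachability": 0.8, "reputation": 0.6, "friends_attending": 0.5},
)
print(round(score, 3))   # → 0.752
```

With this weighting, the event's type and topic still dominate the prediction, but reachability, reputation and friends' participation can tip the ranking between otherwise similar events.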

Title: “I Need to Be Explicit: You’re Wrong”: Impact of Face Threats on Social Evaluations in Online Instructional Communication

Abstract Online instructional communication, as found in ask-an-expert forums, e-learning discussion boards or online help desks, creates situations that threaten the recipient’s face. This study analyzed the evaluation of face-threatening acts with a 1×3 design. An online forum thread confronted a layperson with an expert who either (a) addressed the layperson’s misconceptions directly and frankly, (b) mitigated face threats through explicit hints about the need to be direct or (c) communicated politely and indirectly. College students read these dialogues and assessed the expert communicator’s facework, recipient orientation, credibility and likability. Results showed that polite experts were evaluated most positively; explicit hints did not improve perceptions of face-threatening acts. This implies that users of instructional forums prefer communicators to be polite even when face threats are necessary. We discuss practical implications for different online instruction contexts and make suggestions for further research.

Abstract Copyright 2014 Regina Jucks, Lena Päuler and Benjamin Brummernhenrich

Title: The Potential of a Text-Based Interface as a Design Medium: An Experiment in a Computer Animation Environment

Abstract Since the birth of the concept of direct manipulation, the graphical user interface has been the dominant means of controlling digital objects. In this research, we hypothesize that the benefits of a text-based interface involve multiple tradeoffs, and we explore the potential of text as a medium of design from three perspectives: (i) the perceived level of control of the designed object, (ii) a tool for realizing creative ideas and (iii) an effective form for a highly learnable user interface. Our experiment in a computer animation environment shows that (i) participants did feel a high level of control of characters, (ii) creativity was both restricted and facilitated depending on the task and (iii) natural language expedited the learning of a new interface language. Our research provides experimental proof of the effect of a text-based interface and offers guidelines for the design of future computer-aided design applications.

Abstract Copyright 2014 Sangwon Lee and Jin Yan

Title: Framing a Set: Understanding the Curatorial Character of Personal Digital Bibliographies

Abstract We articulate a model of curatorship that emphasizes framing the character of the curated set as the focus of curatorial activity. This curatorial character is structured through the articulation, via mechanisms of selection, description and arrangement, of coherent classificatory principles. We describe the latest stage of a continuing project to examine the curatorial character of personal digital bibliographies, such as Pinterest boards, Flickr galleries and GoodReads shelves, and to support the design of such curatorially expressive personal collections. In the study reported here, 24 participants created personal bibliographies using either a structured design process, with explicit tasks for selecting, describing and arranging collection items, or an unstructured process that did not separate these activities. Our findings lead to a more complex understanding of personal collections as curatorial, expressive artifacts. We explore the role of cohesion as a quality that facilitates expression of the curatorial frame, and we find that when designers read source materials as a part of a set, they are more likely to write cohesive collections. Our findings also suggest that the curatorial act involves both the definition of abstract classificatory principles and their instantiation in a specific material environment. We describe various framing devices that facilitate these reading and writing activities, and we suggest design directions for supporting curatorial reading and writing tasks.

Abstract Copyright 2014 Melanie Feinberg, Ramona Broussard and Eryn Whitworth

Title: Identifying Problems Associated with Focus and Context Awareness in 3D Modelling Tasks

Abstract Creating complex 3D models is a challenging process. One of the main reasons for this is that 3D models are usually created using software developed for conventional 2D displays which lack true depth perspective, and therefore do not support correct perception of spatial placement and depth-ordering of displayed content. As a result, modellers often have to deal with many overlapping components of 3D models (e.g. vertices, edges, faces, etc.) on a 2D display surface. This in turn causes them to have difficulties in distinguishing distances, maintaining position and orientation awareness, etc. To better understand the nature of these problems, which can collectively be defined as ‘focus and context awareness’ problems, we have conducted a pilot study with a group of novice 3D modellers, and a series of interviews with a group of professional 3D modellers. This article presents these two studies, and their findings, which have resulted in identifying a set of focus and context awareness problems that modellers face in creating 3D models using conventional modelling software. The article also provides a review of potential solutions to these problems in the related literature.

Abstract Copyright 2014 Masood Masoodian, Azmi bin Mohd Yusof and Bill Rogers

Abstract The goal of user experience design in industry is to improve customer satisfaction and loyalty through the utility, ease of use, and pleasure provided in the interaction with a product. So far, user experience studies have mostly focused on short-term evaluations and consequently on aspects relating to the initial adoption of new product designs. Nevertheless, the relationship between the user and the product evolves over long periods of time and the relevance of prolonged use for market success has been recently highlighted. In this paper, we argue for the cost-effective elicitation of longitudinal user experience data. We propose a method called the “UX Curve” which aims at assisting users in retrospectively reporting how and why their experience with a product has changed over time. The usefulness of the UX Curve method was assessed in a qualitative study with 20 mobile phone users. In particular, we investigated how users’ specific memories of their experiences with their mobile phones guide their behavior and their willingness to recommend the product to others. The results suggest that the UX Curve method enables users and researchers to determine the quality of long-term user experience and the influences that improve user experience over time or cause it to deteriorate. The method provided rich qualitative data and we found that an improving trend of perceived attractiveness of mobile phones was related to user satisfaction and willingness to recommend their phone to friends. This highlights that sustaining perceived attractiveness can be a differentiating factor in the user acceptance of personal interactive products such as mobile phones. The study suggests that the proposed method can be used as a straightforward tool for understanding the reasons why user experience improves or worsens in long-term product use and how these reasons relate to customer loyalty.

Abstract Copyright 2011 Sari Kujala, Virpi Roto, Kaisa Väänänen-Vainio-Mattila, Evangelos Karapanos and Arto Sinnelä

Title: Researching Young Children’s Everyday Uses of Technology in the Family Home

Abstract Studies of the everyday uses of technology in family homes have tended to overlook the role of children and, in particular, young children. A study that was framed by an ecocultural approach focusing on children’s play and learning with toys and technologies is used to illustrate some of the methodological challenges of conducting research with young children in the home. This theoretical framework enabled us to identify and develop a range of methods that illuminated the home’s unique mix of inhabitants, learning opportunities and resources and to investigate parents’ ethnotheories, or cultural beliefs, that gave rise to the complex of practices, values and attitudes and their intersections with technology and support for learning in the home. This resulted in a better understanding of the role of technology in the lives of these 3- and 4-year-old children.

Abstract Copyright 2014 Lydia Plowman

Title: Measuring web usability using item response theory: Principles, features and opportunities

Abstract Usability is considered a critical issue on the web that determines either the success or the failure of a company. Thus, the evaluation of usability has gained substantial attention. However, most current tools for usability evaluation have some limitations, such as excessive generality and a lack of reliability and validity. The present work proposes the construction of a tool to measure usability in e-commerce websites using item response theory (IRT). While usability issues have only been considered in theoretical or empirical contexts, in this study, we discuss them from a mathematical point of view using IRT. In particular, we develop a standardised scale to measure usability in e-commerce websites. This study opens a new field of research in the ergonomics of interfaces with respect to the development of scales using IRT.

Abstract Copyright 2011 Rafael Tezza, Antonio Cezar Bornia and Dalton Francisco de Andrade
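A usability scale built with item response theory typically rests on a logistic item model. Here is a sketch of the common two-parameter logistic (2PL) form, with invented item parameters (the abstract does not specify which IRT model the authors used):

```python
from math import exp

def p_endorse(theta: float, a: float, b: float) -> float:
    """2PL model: probability that a respondent endorses an item.

    theta -- latent trait level (e.g. a website's perceived usability)
    a     -- item discrimination (slope)
    b     -- item difficulty (location on the latent scale)
    """
    return 1.0 / (1.0 + exp(-a * (theta - b)))

# A site exactly at an item's difficulty clears it with probability 0.5.
print(p_endorse(theta=0.0, a=1.2, b=0.0))   # → 0.5
# Higher-usability sites are more likely to clear the same item.
print(round(p_endorse(theta=1.5, a=1.2, b=0.0), 2))
```

The appeal of IRT over a raw sum score is exactly these item parameters: each usability item carries its own difficulty and discrimination, so the resulting scale is standardized rather than ad hoc.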

Title: Everything Science Knows Right Now About Standing Desks

Abstract If it wasn’t already clear through common sense, it’s become painfully clear through science that sitting all day is terrible for your health. What’s especially alarming about this evidence is that extra physical activity doesn’t seem to offset the costs of what researchers call “prolonged sedentary time.” Just as jogging and tomato juice don’t make up for a night of smoking and drinking, a little evening exercise doesn’t erase the physical damage done by a full work day at your desk.

In response some people have turned to active desks—be it a standing workspace or even a treadmill desk—but the research on this recent trend has been too scattered to draw clear conclusions on its benefits (and potential drawbacks). At least until now. A trio of Canada-based researchers has analyzed the strongest 23 active desk studies to draw some conclusions on how standing and treadmill desks impact both physiological health and psychological performance. Abstract Copyright 2015 Eric Jaffe

Send Us Your Research References: If you have interesting and relevant research references, post them as a comment below for possible inclusion in next year’s updated list.

Other Content from PulseUX: Here are two other widely read and quoted long-form posts you may find interesting.


Angry Birds UX: Why Angry Birds is so successful and popular: a cognitive teardown of the user experience (1.5 million page views). https://live-mauro-usability-science.pantheonsite.io/blog/why-angry-birds-is-so-successful-a-cognitive-teardown-of-the-user-experience/


Apple v. Samsung: Impact and Implications for Product Design, User Interface Design (UX), Software Development and the Future of High-Technology Consumer Products https://live-mauro-usability-science.pantheonsite.io/blog/apple-v-samsung-implications-for-product-design-user-interface-ux-design-software-development-and-the-future-of-high-technology-consumer-products/

Charles L Mauro CHFP President / Founder MauroNewMedia

Find out more about Charles L Mauro Find out more about MauroNewMedia Follow Pulse>UX on Twitter @PulseUX

Subscribe to email updates



Open Access

Peer-reviewed

Research Article

The influence of user interface design on task performance and situation awareness in a 3-player diner's dilemma game

Authors: Tingwei Jiang and Huicong Fang

Affiliation: School of Psychology and Cognitive Science, East China Normal University, Shanghai, China

Roles: Data curation, Formal analysis, Methodology, Software, Validation, Visualization, Writing – original draft; Conceptualization, Writing – review & editing

* E-mail: [email protected]

  • Published: March 17, 2020
  • https://doi.org/10.1371/journal.pone.0230387


To understand the influence of user interface on task performance and situation awareness, three levels of user interface were designed based on the three-level situation awareness model for the 3-player diner’s dilemma game. The 3-player diner's dilemma is a multiplayer version of the prisoner's dilemma, in which participants play games with two computer players and try to achieve high scores. A total of 117 participants were divided into 3 groups to participate in the experiment. Their task performance (the dining points) was recorded and their situation awareness scores were measured with the Situation Awareness Global Assessment Technique. The results showed that (1) the level-3 interface effectively improved the task performance and situation awareness scores, while the level-2 interface failed to improve them; (2) the practice effect did exist in all three conditions; and (3) the levels of user interface had no effect on the task learning process, implying that the learning rules remained consistent across different conditions.

Citation: Jiang T, Fang H (2020) The influence of user interface design on task performance and situation awareness in a 3-player diner's dilemma game. PLoS ONE 15(3): e0230387. https://doi.org/10.1371/journal.pone.0230387

Editor: Valerio Capraro, Middlesex University, UNITED KINGDOM

Received: August 29, 2019; Accepted: February 28, 2020; Published: March 17, 2020

Copyright: © 2020 Jiang, Fang. This is an open access article distributed under the terms of the Creative Commons Attribution License , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: All the relevant data are within the paper and its Supporting Information files.

Funding: The author(s) received no specific funding for this work.

Competing interests: The authors have declared that no competing interests exist.

Introduction

With the development and popularization of network and smart devices, people have become familiar with user interfaces (UIs). In the process of human-computer interaction, the interface plays a vital role [ 1 ]. A productive and stimulating interface helps ensure high-quality interactions. Therefore, it is necessary to explain the influence of UI design on human behavior and to set standards to measure the characteristics of interfaces.

In this study, the 3-player diner’s dilemma game was employed to explore the impact of UI design on task performance and situation awareness (SA). The 3-player diner's dilemma is a multi-player version of the prisoner's dilemma, which has been widely applied in research in the economics, psychology, and political science fields [ 2 ]. Based on the SA theory and the diner's dilemma game, we hoped to explore the role of UI design in task performance and SA.

Information availability in UI design

The UI is the medium of human-computer interaction: people need the information it provides to complete a variety of tasks, so information display plays an important role. However, according to Lucas and Nielsen, it is difficult to design a graphical interface because many variables affect the development of a computer-based information system [ 3 ]. What information, and how much of it, should be presented on the interface has long been an intriguing question.

A naïve approach would be to put as much information as possible into the UI; however, this is untenable and “less is more” has been proved to be true in many cases. In some decision studies, researchers found that presenting more information would make participants feel more confident and lead to lower accuracy in decision making [ 4 ]. Todd et al. noted that sometimes people's decisions rely on limited information [ 5 ]. The same can be found in UI research. For example, Davies et al. studied the influence of menus on task performance during command learning in a word processing application. The results showed that the group without menus performed better than the group with menus [ 6 ]. Xuan et al. used a train simulation driving game to explore the relationship between UI information and task performance, and the results also showed that more information did not mean better performance [ 7 ].

Therefore, the amount of information and how it is presented in the UI should be carefully considered. According to Tufte, “attractive displays of statistical information… display an accessible complexity of detail” [ 8 ]. In other words, a good information display should contain task-relevant information and be presented in a reasonable way that does not overwhelm the users. This sentiment was echoed by human factors psychologists such as Sweller [ 9 , 10 ] and was a component of the ISO standards for interface design.

Even under the guidance of these principles, the results still depend on the circumstances. Laura et al. studied information availability in a simulated command and control environment; the results indicated that increasing the volume of information, even when it was accurate and task-relevant, was not necessarily beneficial to decision-making performance [ 11 ]. Davidsson and Alm's research on driving information showed that drivers' information needs are complex: people usually need different information in different contexts [ 12 ]. Therefore, we argue that the design of a UI should reflect human cognitive processes and fully consider cognitive processing limits and capabilities. There is some evidence to support this assumption, such as the study by Dina et al., which found that when users could customize their UIs, errors were reduced and user acceptance improved [ 13 ]. In summary, to achieve a well-designed UI, we should not only study the task itself, but also take a deep look at the cognitive processes of the human operators.

SA and SAGAT

One theory that facilitates the understanding of humans’ cognitive processes and has been effectively applied to interface design is the SA theory. Since the SA theory values both explanation and prediction, it has been widely applied in interface research to validate the effectiveness of UI design (e.g., [ 11 , 14 , 15 ]).

The concept of SA originated in research on fighter pilots in the 1990s (e.g., [ 16 ]) to explain the psychological processes of pilots in complex and dynamic environments. Then it was extended to other scenarios [ 17 ]. In such studies, the involvement of SA and the related approaches brought considerable benefits such as high efficiency and error reduction [ 18 ]. Therefore, exploring the characteristics of interfaces based on SA theory is a feasible approach.

There are three distinct perspectives on SA: (i) individual SA, (ii) team SA, and (iii) system SA. Individual SA means “knowing what is going on around you” [ 18 ]. SA is not an entity that can be directly touched or observed, but a construct describing a cognitive “black box”, so there is no unified definition or measurement for it. One of the most famous individual SA models is the three-level model proposed by Endsley [ 19 ]. Endsley defined SA as the individual’s perception of the elements of their environment, the comprehension of their meaning, and the projection of their future states within specific time and space constraints. More specifically, the three levels are as follows:

Perception (Lv. 1): the simple awareness of task-related elements (objects, events, people, etc.) and their present states (locations, conditions, modes, actions, etc.) in the surrounding environment.

Comprehension (Lv. 2): Integrating elements from Lv. 1 through understanding their past states and how they impact goals or objectives.

Projection (Lv. 3): Integrating Lv. 1 and Lv. 2 information and using it to project future actions and states of the elements in the environment.

The Situation Awareness Global Assessment Technique (SAGAT), which was proposed based on the above definition, is a popular method of assessing individual SA through probe techniques [ 20 ]. Researchers compile situation-related questions on the three levels of perception, comprehension, and projection, and insert the test into the task: the task is suspended, the participants cannot view any other information, and they must answer from memory. After the test is over, the task continues. The reliability and validity of the SAGAT have been confirmed in many experimental studies [ 14 , 21 – 23 ].

The 3-player diner’s dilemma and interface design

In the 3-player diner’s dilemma, three players enter a restaurant and agree that each will order a dish, with the final cost shared equally. The restaurant offers two dishes: a hot dog at a low price and a lobster at a high price. The more expensive dish has higher quality, but the hot dog has a higher quality-to-cost ratio. Here, we call this ratio dining points (DPs). Players take part in multiple rounds of the game, with the goal of maximizing their total DPs. In our experiment, the hot dog had a quality of 200 and a cost of 10, giving 20 DPs, while the lobster had a quality of 400 and a cost of 30, giving 13.33 DPs. Since the final cost was shared by all three players, there were six possible outcomes for each player (see Fig 1 ).

Fig 1. In our study, there were six possible outcomes for each player.

https://doi.org/10.1371/journal.pone.0230387.g001

Looking at Fig 1 , we can see that although the hot dog had higher DPs, if the human player chose lobster and the other two computer players still chose the hot dog, the player could get more DPs from the loss of the other two players (see lines 1 and 2). However, if one of the computer players chose the lobster, the human player’s benefit would disappear or they would make a loss (see lines 3 and 4). In addition, if the other two computer players chose lobster and the human player still chose the hot dog, then their loss would be even greater (see lines 5 and 6).

Therefore, in order to get more DPs, the players need to observe how the other two behave. In the simplest case, if the other two players always chose the hot dog, the lobster would be the best choice (DPs = 24 > 20). Similarly, if the other two players always chose lobster, the lobster would also be the optimal solution (DPs = 13.3 > 8.6). So, when would the hot dog be the optimal solution? The answer is when the other two players were playing TFT (tit-for-tat): whenever the human player tried to get more DPs by selecting the lobster, the two computer players would immediately choose the lobster to counterattack, reducing the human player’s total DPs.
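The payoff arithmetic above can be checked with a short script. This is an illustrative sketch, not the study's code (the experiment was implemented in MATLAB); the function and dish labels are our own, with the quality and cost values taken from the text and an equal three-way cost split as described.

```python
# Dining points (DPs) in the 3-player diner's dilemma, per the values in the
# text: hot dog quality 200 / cost 10, lobster quality 400 / cost 30.
DISHES = {
    "hot dog": {"quality": 200, "cost": 10},
    "lobster": {"quality": 400, "cost": 30},
}

def dining_points(my_dish, other_dishes):
    """DP = quality of my own dish / my equal share of the total bill."""
    total_cost = DISHES[my_dish]["cost"] + sum(DISHES[d]["cost"] for d in other_dishes)
    return DISHES[my_dish]["quality"] / (total_cost / 3)

# Reproduce the scenarios discussed in the text:
print(round(dining_points("lobster", ["hot dog", "hot dog"]), 2))  # 24.0 (> 20 when all choose hot dog)
print(round(dining_points("lobster", ["lobster", "lobster"]), 2))  # 13.33
print(round(dining_points("hot dog", ["lobster", "lobster"]), 2))  # 8.57 (the worst case, ~8.6)
```

Running the three calls reproduces the 24 > 20 and 13.3 > 8.6 comparisons from the paragraph above.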

In summary, the players needed to understand the mathematical rules behind the situation and analyze the selection habits of the other players to develop their own strategies. The strategic depth of the game allowed us to design different interfaces to influence the task performance and SA of the participants. At the perception level (Lv. 1), the interface would present the basic task-related elements and allow the human player to understand their meaning. Specifically, all of the player's choices in the last round, as well as the player’s DPs, would be presented. All of this information was necessary for the player to complete the task. At the comprehension level (Lv. 2), as mentioned above, the tendencies of the other two players were vital to the task performance. Therefore, the information in UI-2 needed to reflect the tendencies of the other two players, especially whether they had been playing TFT. At the projection level (Lv. 3), the player needed to be able to use the information to predict future outcomes. The elements in UI-3 had to help the participants understand the possible outcomes of the different options. Because the three levels of SA should not be isolated, the UI-3 needed to contain all of the elements of the subordinate interfaces. So, we designed three interfaces for the game.

Previous studies

As far as we know, a total of three studies have used a similar experimental paradigm to explore the impact of UI on SA and task performance in the 3-player diner’s dilemma [ 21 – 23 ]. All of the previous studies were based on the three-level model of SA, and three levels of interface were designed: UI-1 involved the level of perception, UI-2 the level of comprehension, and UI-3 the level of projection. Participants played the game against two computer players.

Yun et al. [ 23 ] originally conducted the experiment to explore the relationship among SA, trust, and interface types, laying the foundation for subsequent research. The results showed that when using UI-1, the participants had a higher tendency to cooperate, and in the context of encouraging cooperation, the self-reported trust scores and proportion of cooperation responses were positively correlated. However, as a first try, this study unavoidably had some limitations. First, SA was not measured directly. In the study, the interface types and levels of SA were made equivalent, but the interface types did not necessarily reflect the SA. Second, the computer strategy was not described in detail. Finally, the interface was relatively simple.

Therefore, in the study by Onal et al. [ 21 ], some important improvements were made. First, the researchers employed the SAGAT to obtain objective SA scores, making the SA and interface types no longer equivalent. Second, they tried to quantify the computer strategy so that the relationship between it and the behavior of the participants could be clearly recorded. Third, the interface design became more sophisticated. As a result, the study showed that the interface did significantly affect task performance and a significant positive correlation between SA and task performance was also detected.

In the latest research, Schaffer et al. [ 22 ] refined the study by increasing the number of computer strategies (from 5 to 12), enlarging the sample (from 95 to 901 participants), and adding advanced statistical methods such as path analysis to explain the results. In general, the conclusions were similar to those of Onal et al. [ 21 ]. The impact of the interface on SA was confirmed, with SA scores increasing along with the interface level. The interface also affected task performance: in the context of encouraging cooperation, performance tended to improve with increasing interface level.

In conclusion, past studies explained the relationship among interface, SA, and cooperation, but they still had some shortcomings. First, they did not introduce the time dimension, so they could not explain whether (and how) task performance and SA would change with practice. That is to say, the above studies only considered "novices" in the diner’s dilemma. In fact, the “novice-expert” comparison is an important research issue for individual SA [ 17 ]. Also, Gonzalez et al. [ 14 ] have already shown that SA scores and task performance increased with practice in a water purification plant task. In their study, only the levels of perception and comprehension increased with practice, while the third level, projection, did not. Gonzalez et al. argued that this might be linked to the difficulty of the task. The establishment of a mental model has a positive effect on SA [ 24 ], and the essence of the practice effect might be the building of a mental model that distinguishes novices from experts [ 17 ]. So, it was necessary to probe the practice effect in different tasks to verify this assumption. Second, the computer strategies of the past studies still left room for refinement. They set computer strategies through two types of parameter, distinguishing between cooperation and defection situations, but changed only one of the parameters while fixing the other, which was only a one-dimensional manipulation. Therefore, our study extended it to a two-dimensional plane, making the computer strategy more similar to human behavior.

Based on the previous studies and literature, we proposed the following hypotheses:

  • H1: SA is positively correlated with task performance.
  • H2: As more components become involved in the interface, task performance will improve.
  • H3: The practice effect will exist in all three interface conditions.

Methods

The East China Normal University Ethics Committee approved this study. Our whole research process followed the ethical guidelines for research involving human subjects, and every participant provided written informed consent.

Experimental design

This study adopted a mixed 3 (between-subjects: interface level) × 4 (within-subjects: block) design to examine the effects of interface and practice on DPs and SA scores. In the experiment, the participants were asked to compete with two computer players that used a preset strategy, and they were encouraged to earn as many DPs as possible. The task was divided into 4 blocks of 50 trials each, for a total of 200 trials. The program tallied the DPs every 50 trials and then reset them to zero to start a new block. The experiment contained three levels of UI, and each participant completed the task under only one of them.

Computer strategy.

In the experiment, the two computer players used the TFT strategy proposed by previous studies [ 21 – 23 ]. Under TFT, the computer based its next decision on the participant’s previous choice: if the participant chose the hot dog, the computer would also choose the hot dog in the next trial, and likewise for the lobster. However, there was a 10% chance that the computer would deviate and make a different choice from the participant. There were two reasons for this setting: first, it ensured that the game had a clear optimal strategy, so the participants had the chance to learn; second, the random deviations made the computer players’ behavior variable and more natural, thus improving the ecological validity of the experiment.
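The noisy TFT rule fits in a few lines of code. The sketch below is illustrative only (the study's implementation was in MATLAB and is not shown); the function name, choice labels, and seeding are our own assumptions.

```python
import random

def tft_choice(participant_last_choice, noise=0.10, rng=random):
    """Noisy tit-for-tat: copy the participant's previous choice,
    but deviate with probability `noise` (10% in the study)."""
    other = "lobster" if participant_last_choice == "hot dog" else "hot dog"
    return other if rng.random() < noise else participant_last_choice

# Over many trials the computer should mirror the participant ~90% of the time:
rng = random.Random(0)
copies = sum(tft_choice("hot dog", rng=rng) == "hot dog" for _ in range(10_000))
print(copies / 10_000)  # ≈ 0.9
```

The explicit `rng` parameter just makes the deviation rate easy to verify with a fixed seed.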

Blocks and trials.

In order to verify the practice effect, the task was divided into 4 blocks, 50 trials per block, for a total of 200 trials. At the end of each block, the DP results (including all previous blocks) were presented to the participants and reset to 0 after confirmation (see Fig 2 ).

Fig 2. In this study, the task was divided into 4 blocks, 50 trials per block. Dining points were set to zero at the end of each block.

https://doi.org/10.1371/journal.pone.0230387.g002

The effect of the interface level was another focus of the experiment. Three levels were designed based on SA theory (see Fig 3 ).

Fig 3. This figure shows the UIs employed in the study. The solid green line box is the level-1 UI, the blue dashed box is the level-2 UI, and the red line box is the level-3 UI.

https://doi.org/10.1371/journal.pone.0230387.g003

Level-1 UI: The green solid line box shown in Fig 3 was a simplified UI (level-1 UI), in which only the basic buttons, the players’ selection for each trial, the DPs obtained in each trial, and the sum of the DPs were presented. At this level, participants could not directly view the selection tendency of the computer players.

Level-2 UI: The blue dashed box shown in Fig 3 was the level-2 UI. Compared with the level-1 UI, a history panel was added. In the history panel, the participants could view the selection information of all players in each trial, the DPs obtained in each trial, and the frequency count of the computer players mimicking the choices of the participants.

Level-3 UI: The red line frame part of Fig 3 was the level-3 UI. Compared with the level-2 UI, a prediction panel was added. Through the prediction panel, the participants could adjust the parameters and the program would calculate the expected DPs according to the preset formula to help the participants make decisions.

SAGAT measurement.

The SAGAT was employed in this study to measure the SA scores. The test was inserted at the 25th trial of every block, during which the participant could not view the main interface. According to the task situation, eight questions were put forward: questions 1–3 corresponded to the level of perception, 4–5 to comprehension, and 6–8 to projection. All were four-alternative single-choice questions, scored 1 point for a correct choice and 0 points for a wrong one, so the highest SA score for each test was 8 points. The eight questions are shown in Table 1 .
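The scoring scheme can be sketched as follows. The grouping of questions by SA level follows the text; the function, answer keys, and response format are hypothetical placeholders, not the study's materials.

```python
# SAGAT scoring sketch: eight single-choice probes, 1 point each,
# grouped by level as in the study (Q1-3 perception, Q4-5 comprehension,
# Q6-8 projection). Max score per test = 8.
LEVELS = {"perception": [1, 2, 3], "comprehension": [4, 5], "projection": [6, 7, 8]}

def score_sagat(responses, answer_key):
    """Return (total score 0-8, per-level breakdown)."""
    correct = {q for q in answer_key if responses.get(q) == answer_key[q]}
    per_level = {lvl: sum(q in correct for q in qs) for lvl, qs in LEVELS.items()}
    return len(correct), per_level

# A participant answering every (placeholder) question correctly:
example_key = {q: "A" for q in range(1, 9)}
print(score_sagat({q: "A" for q in range(1, 9)}, example_key))
# → (8, {'perception': 3, 'comprehension': 2, 'projection': 3})
```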

Table 1. The eight SAGAT questions.

https://doi.org/10.1371/journal.pone.0230387.t001

The program was run on a MacBook Pro 13 with an external 25-inch 2560×1440 display. In the experiment, the participants used an external display and a Bluetooth mouse. Both the GUI and the background code were written in MATLAB 2019a.

Participants

Initially, we recruited 90 undergraduates. Then, during the revision of the manuscript, we recruited another 27 undergraduates to improve the statistical power. Finally, a total of 117 undergraduates were recruited and formally informed, including 82 females and 35 males. The 117 volunteers (between 18 and 26 years old) were randomly assigned to 1 of the 3 interface conditions. Each group contained 39 participants. In the three interface conditions, the number of males was 11 (UI-1), 12 (UI-2), and 12 (UI-3).

Results

A repeated measures analysis of variance (ANOVA) was employed to analyze the DPs. The results showed that the main effect of the block was significant: F(3, 342) = 48.924, p < 0.001, ηp² = 0.300. The main effect of the interface condition was also significant: F(2, 114) = 6.183, p = 0.003, ηp² = 0.098. However, the interaction between block and interface condition was not significant: F(6, 342) = 0.872, p = 0.515, ηp² = 0.015. Post-hoc multiple comparisons with Bonferroni correction indicated that (1) there were significant differences between block 1 and block 2 (p < 0.001), block 1 and block 3 (p < 0.001), block 1 and block 4 (p < 0.001), block 2 and block 3 (p = 0.01), block 2 and block 4 (p < 0.001), and block 3 and block 4 (p = 0.002); and (2) the DPs in UI-3 were significantly higher than in UI-1 (p = 0.005) and UI-2 (p = 0.016), with no significant difference between UI-1 and UI-2 (p = 1.000) (see Fig 4 ).
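The partial eta squared values reported with these F statistics can be recovered from F and its degrees of freedom via the standard identity ηp² = F·df1 / (F·df1 + df2). The helper below is ours, shown only to make the reported effect sizes easy to verify; it is not code from the study.

```python
def partial_eta_squared(F, df_effect, df_error):
    """Partial eta squared from an F statistic: F*df1 / (F*df1 + df2)."""
    return F * df_effect / (F * df_effect + df_error)

# Reproduce the effect sizes reported for the DP analysis:
print(f"{partial_eta_squared(48.924, 3, 342):.3f}")  # 0.300 (main effect of block)
print(f"{partial_eta_squared(6.183, 2, 114):.3f}")   # 0.098 (main effect of interface)
```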

Fig 4. The red box on the left of the figure shows the post-hoc comparison results for the blocks and the blue box on the right shows the post-hoc comparison results for the UIs. (* p < 0.05, ** p < 0.01, *** p < 0.001; the error bars denote 2 SDs.)

https://doi.org/10.1371/journal.pone.0230387.g004

Similarly, a repeated measures ANOVA was used to analyze the SA scores. The results showed that the main effect of the block was significant: F(3, 342) = 16.681, p < 0.001, ηp² = 0.128. The main effect of the interface condition was also significant: F(2, 114) = 3.985, p = 0.021, ηp² = 0.065. However, the interaction between block and interface condition was not significant: F(6, 342) = 0.631, p = 0.705, ηp² = 0.011. Post-hoc multiple comparisons with Bonferroni correction indicated that (1) the SA scores differed significantly between block 1 and block 2 (p = 0.027), block 1 and block 3 (p < 0.001), block 1 and block 4 (p < 0.001), and block 2 and block 4 (p = 0.001), with no significant differences between block 2 and block 3 (p = 0.228) or block 3 and block 4 (p = 0.289); and (2) the SA scores of UI-3 were significantly higher than those of UI-2 (p = 0.017), while there were no significant differences between UI-1 and UI-3 (p = 0.664) or UI-1 and UI-2 (p = 0.346) (see Fig 5 ).

Fig 5.

https://doi.org/10.1371/journal.pone.0230387.g005

Pearson’s correlation

In theory, both the SA scores and the DPs reflect the participants' understanding of the task, so there should be a positive correlation between the two variables. The results confirmed a significant positive correlation, with a Pearson’s correlation coefficient of r = 0.543, p < 0.001 (see Fig 6 ).

Fig 6. There was a significant positive correlation between the SA scores and the DPs (r = 0.543), indicating that the SA scores could reflect the participants' understanding of the task.

https://doi.org/10.1371/journal.pone.0230387.g006

Discussion

In our study, we examined the influence of interface design on task performance and SA, and also took the practice effect into consideration. There were two different dependent variables, in which the DPs reflected the task performance, while the SA scores reflected the understanding of the task. As mentioned above, the task in our study was not complicated in operation, so we had supposed that there would be a positive correlation between the two dependent variables (Hypothesis 1). The results supported the hypothesis, as there was a significant positive correlation ( r = 0.543) between the DPs and the SA scores. High SA scores might have helped the participants emphasize the long-term gains over short-term gains (choosing lobster to gain more DPs in a single round), thus increasing the total DPs. On the other hand, the results also showed that the SAGAT had good reliability and validity, which could reflect the participants' understanding of the task.
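As a reminder of what the reported statistic measures, Pearson's r can be computed from scratch as the covariance divided by the product of the standard deviations. The sketch below uses made-up SA and DP values, not the study's data; it is only meant to make the r = 0.543 result concrete.

```python
import statistics

def pearson_r(x, y):
    """Pearson correlation: covariance over the product of (population) SDs."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)
    return cov / (statistics.pstdev(x) * statistics.pstdev(y))

sa = [3, 5, 4, 6, 7, 2, 8]                    # hypothetical SAGAT totals (0-8)
dps = [820, 910, 905, 980, 1010, 760, 1050]   # hypothetical total DPs
print(round(pearson_r(sa, dps), 2))  # ≈ 0.99 for this toy data
```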

The second hypothesis was that as more components became involved in the interface, the task performance would improve. In our interface design, UI-3 was the most complex level, which covered the compositions of the subordinate levels. Therefore, we speculated that UI-3 would lead to the best task performance and SA scores. The results verified this point. It was found that there was a significant positive effect of UI-3 on both the DPs and the SA scores. Therefore, the design of the prediction panel did enhance the participants' comprehension of the task situation. It not only helped the participants understand the rules more clearly, but also facilitated their more accurate and finer future planning. These results were in line with our hypothesis.

Taking a deeper view on this, the advantage of UI-3 might have come from the high integration of information. A good interface should not only integrate information but also embrace simplicity [ 25 ]. There was evidence that integrating all kinds of sub-UIs into a single UI was helpful for improving SA [ 26 ]. According to Durso et al., both working memory and mental models are relevant to SA [ 17 ]. By providing a graphical depiction of the diners’ historical decisions, UI-3 served as a working memory assist, which reduced the unnecessary cognitive load of the participants. Meanwhile, the prediction panel made good use of the information provided by the history panel, prompting the participants to build a correct mental model of the situation. The participants could understand the task through a single interface. So, under this condition, the participants performed better in both DPs and the SA test.

However, when using UI-2, there was no significant improvement in task performance and SA scores compared with UI-1. So, contrary to the effect of the prediction panel of UI-3, the history panel in UI-2, which corresponded to the comprehension level of SA theory, was not ideal. The history panel failed to improve the participants’ DPs or SA scores. Also, in the SAGAT, the SA scores of UI-2 were slightly lower than those of UI-1. The history panel also failed to help the participants answer the two questions that were relevant to the comprehension level. This result did not fit our hypothesis, but might reflect that more information was not always beneficial [ 7 , 11 ]. Looking back at UI-2, the frequency count on the history panel did not have an intuitive effect in the absence of a predictive panel, which might have made the participants feel puzzled. This also suggested that simply providing additional information did not necessarily improve SA and task performance.

The third hypothesis was that the practice effect would exist in all three interface conditions. The results showed that it did exist in all three conditions; not only the DPs, but also the SA scores significantly increased with practice. Also, the levels of UI had no effect on the task learning process, which implied that the learning rules remained consistent across the different conditions. According to Endsley, it usually takes a long time to build SA [ 27 ]. However, in a simple task, SA can also change through short-term training [ 28 ]. The task of this experiment belonged to the latter category: it was a relatively simple simulation scenario, so the SA scores could change significantly over the four blocks. However, the improvement of the DPs and the SA scores was not smooth. The SA scores tended to be stable in blocks 2, 3, and 4, while the changes in DPs were significant from block to block. These results implied that the participants had established a stable mental model of the task. In addition, no interaction between block and UI level was found in the experiment. Given that the assignment of the participants was completely random, it could be inferred that the pre-task instructions and practice were the main reasons why the UI-3 group had the best performance. Throughout the subsequent task, this advantage was maintained, neither expanding nor shrinking. This might mean that the UI level did not have a substantial impact on the efficiency of task learning, and that the design of the UI did not change the underlying learning process.

Limitations

First, the main purpose of our study was to verify the theoretical basis for interface design. So, we adopted the 3-player diner's dilemma as the experimental task, which was a simulated situation. Moreover, we chose only situations where the computer played TFT without considering more complex computer strategies, such as computer players dynamically adjusting their preferences in response to the participants’ choices. Given that, whether our results could be extended to other tasks or even to the real world still remains to be proven.

Another limitation is that our interface design did not exclude all of the irrelevant variables. UI-3 was interactive, while UI-1 and UI-2 were not. Previous studies confirmed that the controllability of the system was beneficial to understanding the situation and improving task performance and satisfaction [ 13 , 29 ]. Thus, it is the interaction itself that might help the participants to have better control over the UI and deepen their understanding of the task. To eliminate this possibility, in the future, we would like to change the parameter adjustment in the prediction panel in UI-3 from manual input to automatic input to test this assumption.

Finally, the SAGAT in our study used only eight questions, which could have limited the reliability and validity of the test. Specifically, even if the participants did not know the correct answer, guessing alone might have produced a fairly high score. In addition, the observed effect size for SA was small. Therefore, in future experiments, we might add more questions and refine the test to improve its reliability.

Conclusions

Three interfaces based on the three-level SA model were designed to explore the role of interface design in task performance, SA scores, and the learning process in a simulated situation. We found that: (1) the level-3 interface effectively improved the task performance and SA scores, while the level-2 interface failed to improve them; (2) the practice effect did exist in all three conditions; and (3) the levels of UI had no effect on the task learning process, which implied that the learning rules remained consistent across the different conditions.

Supporting information

S1 Table. Descriptive statistical results of the main variables of the experiment.

https://doi.org/10.1371/journal.pone.0230387.s001

S1 Appendix. Mixed linear model [ 30 – 31 ].

https://doi.org/10.1371/journal.pone.0230387.s002

S1 Raw dataset.

https://doi.org/10.1371/journal.pone.0230387.s003

  • 2. Jervis R, Jervis R. The Illogic of American Nuclear Strategy. Temple University School of Law; 1984.
  • 10. Sweller J. Cognitive load theory. Psychology of learning and motivation. Elsevier; 2011. pp. 37–76.
  • 15. Nwiabu N, Allison I, Holt P, Lowit P, Oyeneyin B. User interface design for situation-aware decision support systems. 2012 IEEE International Multi-Disciplinary Conference on Cognitive Methods in Situation Awareness and Decision Support. 2012. pp. 332–339.
  • 19. Endsley MR. Situation awareness global assessment technique (SAGAT). Aerospace and Electronics Conference, 1988 NAECON 1988, Proceedings of the IEEE 1988 National. IEEE; 1988. pp. 789–795.
  • 21. Onal E, Schaffer J, O’Donovan J, Marusich L, Yu MS, Gonzalez C, et al. Decision-making in abstract trust games: A user interface perspective. 2014 IEEE International Inter-disciplinary Conference on Cognitive Methods in Situation Awareness & Decision Support. 2014. pp. 21–27.
  • 23. Yun T, Jones R, Marusich L, O’Donovan J, Gonzalez C, Höllerer T. Trust and Situation Awareness in a 3-Player Diner’s Dilemma game. 2013 IEEE International Multi-disciplinary Conference on Cognitive Methods in Situation Awareness & Decision Support. 2013. pp. 9–15.
  • 29. Knijnenburg BP, Bostandjiev S, O’Donovan J, Kobsa A. Inspectability and control in social recommenders. Acm Conference on Recommender Systems. 2012. p. 43.

Nielsen Norman Group

Usability 101: Introduction to Usability


January 3, 2012


This is the article to give to your boss or anyone else who doesn't have much time, but needs to know the basic usability facts.

In This Article:

What (definition of usability), why usability is important, how to improve usability, when to work on usability, and where to test.

Usability is a quality attribute that assesses how easy user interfaces are to use. The word "usability" also refers to methods for improving ease-of-use during the design process.

Usability is defined by 5 quality components:

  • Learnability: How easy is it for users to accomplish basic tasks the first time they encounter the design?
  • Efficiency: Once users have learned the design, how quickly can they perform tasks?
  • Memorability: When users return to the design after a period of not using it, how easily can they reestablish proficiency?
  • Errors: How many errors do users make, how severe are these errors, and how easily can they recover from the errors?
  • Satisfaction: How pleasant is it to use the design?

There are many other important quality attributes. A key one is utility, which refers to the design's functionality: Does it do what users need?

Usability and utility are equally important and together determine whether something is useful: It matters little that something is easy if it's not what you want. It's also no good if the system can hypothetically do what you want, but you can't make it happen because the user interface is too difficult. To study a design's utility, you can use the same user research methods that improve usability.

  • Definition of Utility = whether it provides the features you need.
  • Definition of Usability = how easy & pleasant these features are to use.
  • Definition of Useful = usability + utility.

On the Web, usability is a necessary condition for survival. If a website is difficult to use, people leave. If the homepage fails to clearly state what a company offers and what users can do on the site, people leave. If users get lost on a website, they leave. If a website's information is hard to read or doesn't answer users' key questions, they leave. Note a pattern here? There's no such thing as a user reading a website manual or otherwise spending much time trying to figure out an interface. There are plenty of other websites available; leaving is the first line of defense when users encounter a difficulty.

The first law of ecommerce is that if users cannot find the product, they cannot buy it either.

For intranets, usability is a matter of employee productivity. Time users waste being lost on your intranet or pondering difficult instructions is money you waste by paying them to be at work without getting work done.

Current best practices call for spending about 10% of a design project's budget on usability. On average, this will more than double a website's desired quality metrics (yielding an improvement score of 2.6) and slightly less than double an intranet's quality metrics. For software and physical products, the improvements are typically smaller — but still substantial — when you emphasize usability in the design process.

For internal design projects, think of doubling usability as cutting training budgets in half and doubling the number of transactions employees perform per hour. For external designs, think of doubling sales, doubling the number of registered users or customer leads, or doubling whatever other KPI (key performance indicator) motivated your design project.

There are many methods for studying usability, but the most basic and useful is user testing, which has 3 components:

  • Get hold of some representative users, such as customers for an ecommerce site or employees for an intranet (in the latter case, they should work outside your department).
  • Ask the users to perform representative tasks with the design.
  • Observe what the users do, where they succeed, and where they have difficulties with the user interface. Shut up and let the users do the talking.

It's important to test users individually and let them solve any problems on their own. If you help them or direct their attention to any particular part of the screen, you have contaminated the test results.

To identify a design's most important usability problems, testing 5 users is typically enough. Rather than run a big, expensive study, it's a better use of resources to run many small tests and revise the design between each one so you can fix the usability flaws as you identify them. Iterative design is the best way to increase the quality of user experience. The more versions and interface ideas you test with users, the better.
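The rationale behind small iterative tests can be sketched with the problem-discovery model from Nielsen's research: assuming each tested user independently uncovers a fraction L of the usability problems (about 31% in the underlying data), n users are expected to find 1 - (1 - L)^n of them. A minimal sketch:

```python
def problems_found(n_users, discovery_rate=0.31):
    """Expected share of usability problems uncovered by n independent users."""
    return 1 - (1 - discovery_rate) ** n_users
```

With the default rate, five users already surface about 84% of the problems, and each further user adds progressively less; that diminishing return is the arithmetic behind running several small tests with redesigns in between rather than one large study.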

User testing is different from focus groups, which are a poor way of evaluating design usability. Focus groups have a place in market research, but to evaluate interaction designs you must closely observe individual users as they perform tasks with the user interface. Listening to what people say is misleading; you have to watch what they actually do.

Usability plays a role in each stage of the design process. The resulting need for multiple studies is one reason I recommend making individual studies fast and cheap. Here are the main steps:

  • Before starting the new design, test the old design to identify the good parts that you should keep or emphasize, and the bad parts that give users trouble.
  • Unless you're working on an intranet, test your competitors' designs to get cheap data on a range of alternative interfaces that have similar features to your own. (If you work on an intranet, read the intranet design annual to learn from other designs.)
  • Conduct a field study to see how users behave in their natural habitat.
  • Make paper prototypes of one or more new design ideas and test them. The less time you invest in these design ideas the better, because you'll need to change them all based on the test results.
  • Refine the design ideas that test best through multiple iterations, gradually moving from low-fidelity prototyping to high-fidelity representations that run on the computer. Test each iteration.
  • Inspect the design relative to established usability guidelines, whether from your own earlier studies or from published research.
  • Once you decide on and implement the final design, test it again. Subtle usability problems always creep in during implementation.

Don't defer user testing until you have a fully implemented design. If you do, it will be impossible to fix the vast majority of the critical usability problems that the test uncovers. Many of these problems are likely to be structural, and fixing them would require major rearchitecting.

The only way to a high-quality user experience is to start user testing early in the design process and to keep testing every step of the way.

If you run at least one user study per week, it's worth building a dedicated usability laboratory. For most companies, however, it's fine to conduct tests in a conference room or an office — as long as you can close the door to keep out distractions. What matters is that you get hold of real users and sit with them while they use the design. A notepad is the only equipment you need.



Will senior adults accept being cognitively assessed by a conversational agent? A user-interaction pilot study

  • Open access
  • Published: 15 June 2024

Moisés R. Pacheco-Lorenzo (ORCID: 0000-0002-0424-8850), Luis E. Anido-Rifón, Manuel J. Fernández-Iglesias & Sonia M. Valladares-Rodríguez

Background: early detection of dementia and Mild Cognitive Impairment (MCI) is of utmost importance nowadays, and smart conversational agents are becoming increasingly capable. DigiMoCA, an Alexa-based voice application for the screening of MCI, was developed and tested. Objective: to evaluate the acceptability and usability of DigiMoCA, considering the perception of end-users and of cognitive assessment administrators, through standard evaluation questionnaires. Method: a sample of 46 individuals and 24 evaluators participated in this study. End-users were fairly heterogeneous in demographic and neuro-psychological characteristics. Evaluators were mostly health and social care professionals, relatively well balanced in terms of gender, career background and years of experience. Results: end-user acceptability ratings were generally positive (above 3 on a 5-point scale for all dimensions) and improved significantly after interaction with DigiMoCA. Administrators also rated the usability of DigiMoCA positively, with an average score of 5.86/7 and high internal consistency ( \(\alpha \) = 0.95). Conclusion: although there is still room for improvement in terms of user satisfaction and the voice interface, DigiMoCA is perceived as an acceptable, accessible and usable cognitive screening tool, both by the individuals being tested and by test administrators.

Graphical abstract (figure not reproduced)

1 Introduction

In a world with an ever-increasing human lifespan, the quality of life of senior adults is becoming ever more relevant. According to the WHO [ 1 ], the population over the age of 60 will increase by 34% between 2020 and 2030, and with it the prevalence of neuro-psychiatric disorders, particularly dementia, which have a profound impact on people's well-being and on their social and economic circumstances.

Mild Cognitive Impairment (MCI) is the transition stage between healthy aging and dementia, characterized by subtle cognitive deficits that do not meet the criteria for diagnosis of a major neuro-cognitive disorder (DSM-V) [ 2 ]. These difficulties can manifest in areas such as memory, attention, language, orientation or decision making. Detecting MCI in its early stages is therefore beneficial for preventing the progression of the disease and, in certain cases, for slowing down some of its symptoms. However, in most cases cognitive deficits are detected only when the symptoms are already evident and the underlying neurological disorder has been present for some time [ 3 ], meaning the disease has already progressed. The traditional screening method for early detection of cognitive impairment involves clinically validated gold-standard tests that assess a person's cognitive state.

The inception of these tests traces back to the second half of the 20th century. One of the first widely used screening tools was the Mini-Mental State Examination (MMSE), published by Folstein [ 4 ] in 1975; it includes items on orientation, concentration, attention, verbal memory, naming and visuospatial skills. In the 1980s, the Alzheimer's Disease Assessment Scale-Cognitive Subscale (ADAS-Cog) was developed [ 5 ], comprising seven items: word recall, naming, commands, constructional praxis, ideational praxis, orientation and word recognition.

One of the limitations of these evaluation instruments is that they are dementia-oriented, particularly towards Alzheimer's disease. Therefore, other screening tools were later created, e.g., the Montreal Cognitive Assessment (MoCA) [ 6 ] test, which has a 90% sensitivity for MCI detection (the MMSE is not sensitive to MCI). Its telephone version (T-MoCA) [ 7 , 8 ] is also validated and correlates strongly with MoCA, with a Pearson coefficient of 0.74.

The fact that MoCA is oriented at MCI detection makes it suitable as a screening tool for an early diagnosis.
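As a sketch of how such a test-to-test correlation is quantified, Pearson's r can be computed over paired scores as below. The score pairs in the example are invented for illustration; the validation studies report r = 0.74 on their own samples.

```python
def pearson_r(xs, ys):
    """Pearson correlation coefficient for two paired samples."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

# Hypothetical paired MoCA / T-MoCA scores for five participants
moca = [26, 22, 18, 27, 20]
t_moca = [17, 14, 11, 18, 13]
```

A coefficient near 1 indicates that the telephone version ranks participants almost identically to the full test.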

In this context, Information and Communication Technologies (ICT) could be a valuable tool for the early detection of MCI cases in a reliable and efficient way, and smart conversational agents are a disruptive technology with the potential to help detect neuro-psychiatric disorders in early stages [ 9 , 10 ]. Note that the penetration of these technologies among senior adults is not as high as in other age groups, which makes acceptability studies of such tools even more relevant.

Previous research demonstrated that it is possible to implement a voice-based version of a gold standard test for cognitive assessment using conversational agents [ 11 ]. More specifically, DigiMoCA, an Alexa voice application based on T-MoCA, was developed and tested with actual elderly people using a smart speaker.

DigiMoCA makes use of Alexa's voice recognition and natural language processing services, and persistently stores and retrieves session data in DynamoDB (Amazon's NoSQL database service). Additionally, DigiMoCA uses prosodic annotations to adapt the speech rate to the user, and collects the response time for each item using a statistical estimation of round-trip times. This information is subsequently used to enhance DigiMoCA's CI screening performance. DigiMoCA was evaluated using the Paradigm for Dialogue System Evaluation (PARADISE), yielding a confusion matrix with a Kappa coefficient \(\kappa = 0.901\) . This means DigiMoCA understands the user approximately 90% of the time, which is equivalent to "almost perfect" [ 12 ] in terms of task completion performance.
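Cohen's kappa corrects raw agreement for the agreement expected by chance. A minimal sketch of computing it from a confusion matrix follows; the example matrix is hypothetical, as only the resulting kappa of 0.901 is reported for DigiMoCA.

```python
def cohens_kappa(matrix):
    """Cohen's kappa for a square confusion matrix given as a list of rows."""
    total = sum(sum(row) for row in matrix)
    # Observed agreement: mass on the diagonal
    observed = sum(matrix[i][i] for i in range(len(matrix))) / total
    # Chance agreement: product of row and column marginals
    expected = sum(
        (sum(matrix[i]) / total) * (sum(row[i] for row in matrix) / total)
        for i in range(len(matrix))
    )
    return (observed - expected) / (1 - expected)
```

A kappa above 0.8 is conventionally read as "almost perfect" agreement, which is how the reported 0.901 is interpreted.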

The main objective of this work is to analyze the acceptability and usability of DigiMoCA through a user interaction pilot study [ 13 ]. For this, the perception of senior end-users as well as administrators was collected by means of standard evaluation questionnaires, and the outcomes were analyzed using standard statistical procedures.

Thus, the research question posed is:

Is the screening tool DigiMoCA acceptable and usable for the cognitive evaluation of senior adults, both by them and their evaluators?

Section 2 describes the sample of participants, the study design and the data analysis carried out; Section 3 presents and discusses the findings of the study, from the point of view of both the senior end-users and the administrators; finally, Section 4 summarizes the results of this research.

2 Material and methods

This user-interaction study included 46 senior end-users and 24 sector-related professionals. According to previous relevant works [ 14 , 15 ], the number of participants in a pilot study should account for: (1) the parameters to be estimated; (2) the involvement of at least 30 participants; and (3) a minimum confidence level of 80%. The present study meets all three criteria.
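Criterion (3) can be illustrated with a quick normal-approximation check using the mean age reported later in this section (78.61 ± 6.75, n = 46). Treating the 80% requirement as a two-sided confidence level for the mean is an assumption of this sketch.

```python
from statistics import NormalDist

def confidence_interval(mean, sd, n, level=0.80):
    """Two-sided normal-approximation confidence interval for a sample mean."""
    z = NormalDist().inv_cdf((1 + level) / 2)  # ~1.28 for an 80% level
    half_width = z * sd / n ** 0.5
    return (mean - half_width, mean + half_width)

lo, hi = confidence_interval(78.61, 6.75, 46)  # roughly (77.3, 79.9) years
```

With 46 participants the interval is narrow (about ±1.3 years around the mean), comfortably within the pilot-study criterion.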

Senior end-users participated through two associations: Parque Castrelos Daycare Center (PCDC) and the Association of Relatives of Alzheimer's Patients (AFAGA), both located in the city of Vigo (Spain). Before the start of each study, applications were submitted to the Research Ethics Committee of Pontevedra-Vigo-Ourense, containing: (1) the objectives of the study, main and secondary; (2) the proposed methodology, i.e. the tests and questionnaires to administer, inclusion and exclusion criteria, the recruiting procedure within the association, sample size and structure, and a detailed schedule; (3) security concerns and how to address them (anonymization and encryption); (4) ethical and legal aspects, particularly regarding data privacy; and finally, (5) a copy of the informed consent to be signed in advance by all participants. The applications for AFAGA and PCDC were approved by the corresponding dictums, with registration codes 2021/213 and 2023/115 respectively.

Inclusion criteria for senior participants consisted mainly of being over the age of 65 and not having an advanced state of dementia, any other psychological pathology, or any auditory/vocal disability. Table 1 summarizes the demographic characteristics of the end-user participants, classified by cognitive group. The mean age was 78.61 ± 6.75 years, with 65% of participants being female. The number of individuals is fairly evenly distributed across groups. For cognitive state classification, we used the Global Deterioration Scale (GDS) [ 16 ], a widely utilized scale that describes the stage of cognitive impairment, with higher GDS scores indicating greater deterioration. For additional information, we also show the results of the T-MoCA evaluation (16.25 ± 3.28 for healthy users (HC), 16.25 ± 3.28 for users with MCI and 16.25 ± 3.28 for users with dementia (AD)), as well as the Memory Failures of Everyday (MFE) questionnaire [ 17 ] and the Instrumental Activities of Daily Living (IADL) scale [ 18 ].

Administrator participants, on the other hand, were affiliated with several institutions, namely the Unit of Psychogerontology at the University of Santiago de Compostela, the Galicia Sur Health Research Institute, the Multimedia Technology Group at the University of Vigo, and also AFAGA and PCDC. Table 2 presents the information about these participants, who are predominantly from the health field. The sample is 58.33% female, mostly middle-aged, and fairly evenly distributed among different backgrounds. Seniority also varies, ranging from fewer than 5 years of experience (29.17%) to more than 20 (20.83%).

2.1 Study design

The study was organized into 3 sessions: in the first, the T-MoCA, MFE and IADL questionnaires were administered; in the second, at least two weeks later, DigiMoCA was administered. Finally, after two or more weeks, a second administration of DigiMoCA was carried out in the third session.

Before the first and after the second conversation with the agent, participants were asked to answer a Technology Acceptance Model (TAM) [ 20 ] questionnaire, which covers how users come to accept a technological system. In order to determine the acceptability of the conversational agent by participants, the designed TAM questionnaire addressed 3 dimensions:

Perceived usefulness (PU) . It measures whether a participant finds the smart speaker useful, both as a general concept, and specifically during the cognitive assessment sessions.

Perceived ease-of-use (PEoU) . It measures whether the conversation with the speaker was comfortable and straightforward for the user, purely in terms of communication.

Perceived satisfaction (PS) . It measures whether the user enjoyed the utilization of the speaker, and whether they prefer it to a human counterpart (i.e., another person conducting T-MoCA as an interviewer).

The resulting questionnaire consisted of a 5-point Likert rating scale composed of 6 items, 2 for each main dimension (1 meaning strongly negative/disagree, 5 strongly positive/agree, 3 neutral). For reference, the TAM questionnaire used is available in Section 1 , translated to English.

In addition to studying how end-users interacted with DigiMoCA, another study was conducted to gather the opinions of cognitive evaluation administrators on its usability and user-friendliness. These were individuals who were either responsible for administering cognitive assessment tools to older adults, or who had expertise related to application development and voice assistants. A 7-point Likert scale questionnaire based on the Post-Study System Usability Questionnaire (PSSUQ) [ 21 ] was used (1 meaning strongly disagree, 7 strongly agree, 4 neutral). The English translation of the PSSUQ questionnaire used is available in Section 2 .

The PSSUQ-based questionnaire was designed in order to evaluate 3 usability dimensions:

System usefulness: measures the ease of use and convenience. In the designed version, it includes the average scores of items 1 to 8.

Information quality: measures the usefulness of the information and messages provided by the application. It includes the average scores of items 9 to 14.

Interface quality: measures the friendliness and functionality of the system’s user interface. It includes the average scores of items 15 to 17 of the questionnaire.

Overall: measures overall usability, computed as the average of the scores of all items (1 to 18 in our case).
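The dimension scores above can be sketched as a small scoring helper (a minimal illustration only; the item groupings follow the text, while the function and variable names are ours):

```python
from statistics import mean

def pssuq_scores(items):
    """items: list of 18 ratings on the 7-point Likert scale, item 1 first."""
    return {
        "SYSUSE": mean(items[0:8]),       # items 1-8
        "INFOQUAL": mean(items[8:14]),    # items 9-14
        "INTERQUAL": mean(items[14:17]),  # items 15-17
        "OVERALL": mean(items),           # all items, 1-18
    }
```

For example, with every item rated 6, all four dimension scores come out as 6.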

2.2 Data analysis

The following statistical instruments were used to assess acceptability:

Fundamental statistics: mean, standard deviation and percentages.

Cronbach’s Alpha ( \(\alpha \) ) [ 22 ] to estimate the reliability, and specifically the internal consistency, of the responses. It is widely used in psychological test construction and interpretation, and it seeks to measure how closely test items are related to one another, i.e., whether they measure the same construct. When test items are closely related to each other, Cronbach’s alpha will be closer to 1; if they are not, it will be closer to 0. In this study, we use this metric to evaluate the internal consistency of the responses to the TAM (end-user centered) and PSSUQ (administrator centered) questionnaires. It is computed as follows:

\(\alpha = \frac{k}{k-1}\left(1 - \frac{\sum _{i=1}^{k} \sigma _i^2}{\sigma _x^2}\right)\)

where:

k is the number of items/questions included.

\(\sigma _i^2\) is the variance of each item across all responses.

\(\sigma _x^2\) is the total variance, including all items.

According to Gliem [ 23 ], a good interpretation of the value of Cronbach’s alpha regarding internal consistency is: \(\alpha > 0.9\) means “excellent”; \(\alpha > 0.8\) means “good”; \(\alpha > 0.7\) means “acceptable”; \(\alpha > 0.6\) means “questionable”; and anything below 0.6 is considered an indicator of low internal consistency.
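As an illustration, the formula and Gliem’s thresholds can be sketched in a few lines of Python (the response matrix here is synthetic; in the study the analysis was carried out with standard data analysis libraries):

```python
from statistics import variance  # sample variance (ddof = 1)

def cronbach_alpha(responses):
    """responses: list of per-respondent lists, one Likert rating per item."""
    k = len(responses[0])
    items = list(zip(*responses))                          # per-item columns
    sum_item_var = sum(variance(col) for col in items)     # sum of sigma_i^2
    total_var = variance([sum(row) for row in responses])  # sigma_x^2
    return (k / (k - 1)) * (1 - sum_item_var / total_var)

def gliem_label(alpha):
    # Interpretation thresholds from Gliem [23]
    for cutoff, label in [(0.9, "excellent"), (0.8, "good"),
                          (0.7, "acceptable"), (0.6, "questionable")]:
        if alpha > cutoff:
            return label
    return "low internal consistency"
```

As a sanity check, two perfectly correlated items yield \(\alpha \) = 1: `cronbach_alpha([[1, 1], [2, 2], [3, 3]])` returns 1.0.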

Student’s t-tests [ 24 ] were used to compare pre-pilot and post-pilot questionnaires, giving insight into the evolution of participants’ acceptability perception during the administration. Statistical significance was measured by means of p-values.

Cohen’s d [ 25 ]: measures the effect size of t-tests, and is computed as the standardized mean difference between two groups (in this case, pre-pilot and post-pilot), i.e., the difference between the means divided by the square root of the average of both variances:

\(d = \frac{\bar{x}_{post} - \bar{x}_{pre}}{\sqrt{(\sigma _{pre}^2 + \sigma _{post}^2)/2}}\)

Based on Tellez’s analysis [ 26 ], the interpretation of Cohen’s d is as follows: \(d < 0.2\) is a “trivial effect”; \(0.2< d < 0.5\) is a “small effect”; \(0.5< d < 0.8\) is a “medium effect”; and \(d > 0.8\) means a “large effect”.
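A stdlib-only sketch of this pre/post comparison: the paired t statistic, Cohen’s d in the pooled-variance form described above, and Tellez’s effect-size labels. The ratings below are synthetic, and the p-value step (from the t distribution) is omitted; in the study it was obtained with standard statistical packages.

```python
import math
from statistics import mean, stdev, variance

def paired_t(pre, post):
    """Student's t statistic for paired samples (pre vs. post)."""
    diffs = [b - a for a, b in zip(pre, post)]
    return mean(diffs) / (stdev(diffs) / math.sqrt(len(diffs)))

def cohens_d(pre, post):
    """Standardized mean difference, pooled-variance form."""
    pooled_sd = math.sqrt((variance(pre) + variance(post)) / 2)
    return (mean(post) - mean(pre)) / pooled_sd

def effect_label(d):
    # Interpretation thresholds from Tellez [26]
    d = abs(d)
    if d < 0.2: return "trivial effect"
    if d < 0.5: return "small effect"
    if d < 0.8: return "medium effect"
    return "large effect"
```

For instance, pre-ratings [1, 2, 3] against post-ratings [2, 3, 4] give d = 1.0, a “large effect” by these thresholds.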

Statistical analysis was performed using the Google Sheets online tool, as well as Google Colab with Jupyter notebooks written in Python. Several commonly-used data analysis libraries were used (e.g., NumPy, Pandas, Pingouin).

3 Results and discussion

This section presents and analyzes the main results obtained regarding the usability and acceptability of DigiMoCA, both from the end-users’ perspective (sample of n = 46) as well as the administrators’ (n = 24).

3.1 User interaction from senior end-users

As explained in Section 2 , users completed the TAM questionnaire before and after the administration of DigiMoCA. The questionnaire included two sections, each covering the 3 dimensions with 6 questions: one focused on technology in general, and the other on DigiMoCA and conversational agents.

Table 3 presents the results of TAM’s 3-dimensional scale, taken from the post evaluation, regarding DigiMoCA’s section. Most relevant results are:

Perceived usefulness: a value of 3.87 ± 0.92 was obtained including all groups, with the highest rating within the MCI group (4.11 ± 0.92) and the lowest from the HC group (3.42 ± 0.93). Regarding the internal consistency of the answers, a value of \(\alpha \) = 0.63 was obtained, with the most internally consistent group being HC ( \(\alpha \) = 0.76) and the least being MCI ( \(\alpha \) = 0.42).

Perceived ease of use: a value of 3.98 ± 0.96 was obtained including all groups. Once again, the highest mean value was found in the MCI group (4.14 ± 0.99), whereas the lowest rating was also obtained within the HC group (3.83 ± 0.96). In terms of internal consistency, an overall value of \(\alpha \) = 0.73 was obtained, with the HC group being the most internally consistent ( \(\alpha \) = 0.96) and MCI the least ( \(\alpha \) = 0.28).

Perceived satisfaction: including all groups we observe a value of 3.27 ± 1.21, in this case with the best rating coming from the AD group (3.47 ± 1.16) and the worst from the HC group (3.00 ± 1.22). Regarding the internal consistency, a value of \(\alpha \) = 0.41 was obtained, with the most internally consistent group again being HC ( \(\alpha \) = 0.56) and the least consistent being MCI ( \(\alpha \) = 0.25).

Overall, we consider these results to be rather positive: none of the ratings drop below 3 (out of 5) on average, either considering the overall sample or any particular group/sub-sample. This means that regardless of the level of cognitive deterioration, the users find DigiMoCA useful, easy to use and satisfactory.

Regarding the internal consistency, however, it is only “acceptable” for one of the dimensions (PEoU), with a worryingly low value for the PS dimension. We believe this inconsistency is caused by the disparity of results obtained from the two questions regarding PS: the first asks whether participants “liked to use DigiMoCA”, and the second whether they would rather “use DigiMoCA instead of T-MoCA”. We observe that the answers to the second part (i.e., after interacting with the agent) are considerably lower than to the first, perhaps due to the comparison between a human-robot interaction and a human-human interaction (which is usually strongly preferred by this demographic group).

Additionally, we can observe a tendency for the MCI group to give the highest ratings but with lowest internal consistency, whereas the HC group usually gives the lowest ratings but with highest internal consistency. One possible explanation for this behavior is that cognitive impairment can interfere with consistent reasoning; it is also likely that users with MCI had more trouble understanding the full implications of the questions posed, giving less consistent answers. Certainly, it is reasonable to believe that healthy users are generally more sensitive to the intrusiveness of these evaluations, hence the slightly lower ratings.

Tables 4 and 5 present the results of the perception variation between pre-administration and post-administration of DigiMoCA. Table 4 contains the results regarding the section about technology in general, while Table 5 contains the results of the section about conversational agents. Again, data is classified by TAM dimensions (rows), including the results for each individual question (“.1" and “.2" for each dimension). We also have the results obtained classified by cognitive group (columns): HC, MCI, AD and the whole sample.

The main objective of this analysis is to determine whether users’ acceptability perception changes significantly after the administration of DigiMoCA. For this, we performed a Student’s t-test with pre and post questions, and obtained three metrics: percentage change between the averages, Cohen’s d and the statistical significance p . The following paragraphs address the main findings of this process.

Regarding the technology section, there is a percentage increase in all items of the first two dimensions: +6.17% for PU.1 (d = 0.33), +3.05% for PU.2 (d = 0.11), +5.26% for PEoU.1 (d = 0.17) and +9.00% for PEoU.2 (d = 0.44). However, there is only one item (PEoU.2) that exhibits a significant change (p = 0.010). Both items from the PS dimension remain essentially unchanged. Therefore, generally speaking, we can establish that the administration does not significantly change the acceptability of technology in senior adults, but we do observe a non-significant positive change in both PU and PEoU items. Furthermore, if we look at the sample sub-groups independently, we can also observe a positive non-significant change in the vast majority of items, only one of them being significant (PEoU.2 for AD group with +17.08% change; d = 0.84, p = 0.007).

With respect to the conversational agents section, the acceptability shows a more noticeable improvement among most items, three of them being statistically significant, and we also find the first item with a “large effect” size: PU.1 with +59.14% (d = 1.06, p < 0.001), PEoU.2 with +13.71% (d = 0.65, p = 0.005), PS.1 with +12.22% (d = 0.61, p = 0.005). Note also that the PS.2 item has a significant decrease of -24.24% (d = 0.95, p < 0.001), but we do not think this particular item is a good representative of the PS dimension since, as stated previously, the pre and post questions are different, and thus it should be taken with a grain of salt. If we look at sample sub-groups independently, we can notice that none of the significant changes are in the HC group, while most are concentrated in the MCI group: +85.84% (d = 1.29, p < 0.001) for PU.1, +21.61% (d = 1.04, p = 0.007) for PEoU.2, and +16.90% (d = 1.14, p = 0.003) for PS.1. Within the AD group, only PU.1 is statistically significant (+58.59%, d = 1.02, p = 0.013).

In light of the results discussed, it seems reasonable to affirm that the acceptability of conversational agents by senior adults improves significantly after interaction with DigiMoCA. To support this, we found that at least one item exhibits a statistically significant (p < 0.05) positive change in all 3 dimensions, and if we discard item PS.2, which as pointed out above is probably not accurate, all items show an increase in acceptability across all groups.

3.2 Usability perception of DigiMoCA from administrators

In addition to the end-user interaction study, an additional study was carried out in order to measure the usability perception of DigiMoCA from cognitive assessment administrators and professionals. For this, we employed the PSSUQ questionnaire, with items rated on a 7-point Likert scale, which is widely used to measure users’ perceived satisfaction with a software system. Table 6 summarizes the results, categorized by gender, field of occupation and years of experience:

Overall usability (OVERALL): we obtain a 5.86 ± 1.24 mean value for all participants and all items. The mean rating does not excessively change based on gender or career experience, although the average rating for participants in the technological field was slightly higher (6.26 ± 0.94). The internal consistency obtained was “excellent" ( \(\alpha \) = 0.95) overall, with some slight differences based on gender ( \(\alpha \) being 0.88 for males and 0.97 for females), field of expertise ( \(\alpha \) = 0.96 for health field, 0.90 for technological field) and experience ( \(\alpha \) = 0.91 for administrators with 10+ years of experience, 0.97 for the ones with less than 10).

System usefulness (SYSUSE): including items 1 to 8, we obtain a mean value of 5.96 ± 1.14 for all participants. Again the mean rating is not considerably affected by gender or career experience, but we do obtain a slightly higher value of 6.36 ± 0.94 for participants in the technological field. As for the internal consistency of the answers, we get an “excellent" \(\alpha \) = 0.91 for the whole sample, although it does drop to just “good" for the male group ( \(\alpha \) = 0.85) and the most experienced participants ( \(\alpha \) = 0.88). The lowest internal consistency is found within the technological field, with an “acceptable" \(\alpha \) = 0.76.

Information quality (INFOQUAL): the mean value obtained from items 9 to 14 was 5.74 ± 1.44 overall. Once again, the highest differences found were based on the field of expertise: the technological field group had the highest mean value of 6.17 ± 1.22, while the lowest value was obtained from the health field group (5.63 ± 1.48). The overall internal consistency was \(\alpha \) = 0.90, and we do find differences between the demographic groups: higher consistency for females ( \(\alpha \) = 0.96) than males ( \(\alpha \) = 0.74); higher consistency for the health field group ( \(\alpha \) = 0.91) than the technological group ( \(\alpha \) = 0.79); and higher consistency for the least experienced individuals ( \(\alpha \) = 0.93) than the most experienced ( \(\alpha \) = 0.84).

Interface quality (INTERQUAL): including items 15 to 17, the overall mean rating was 5.81 ± 1.11. For this dimension the mean value for the technological field group was the highest (6.11 ± 0.65), and the mean value for the least experienced group was the lowest (5.71 ± 1.23). As for the internal consistency, this dimension had the lowest overall value, an “acceptable” \(\alpha \) = 0.77. Again we find considerable differences between demographic groups: higher \(\alpha \) = 0.88 for females than males ( \(\alpha \) = 0.34), higher \(\alpha \) = 0.80 for the health field than the technological field ( \(\alpha \) = 0.27) and higher \(\alpha \) = 0.90 for the less experienced group than the people with 10+ years of experience ( \(\alpha \) = 0.42). This is the only dimension where the internal consistency drops below an “acceptable” level, probably due to the small number of items it considers (only three).

In light of the presented results, we observe that the overall usability perception is generally positive, slightly under 6 out of 7 points, and never drops below 5 for any of its dimensions, even if considering specific demographic groups based on gender, career field and experience.

We do observe a pattern between the groups: females provide slightly lower ratings than males, but with a higher internal consistency. The exact same happens between the health field group (i.e., slightly lower ratings and higher consistency) and the technological field group, as well as between the most experienced group and the least experienced one. The fact that this pattern repeats across groups is expected, and is probably due to the fact that the groups overlap: more males than females work in the tech field, and the males happen to be younger on average than the females (34.9 vs. 40.14 years old, cf. Table  2 ), hence the difference found between seniority groups. Furthermore, we noticed that participants from the medical field made more comments suggesting improvement areas than participants from the technical field, particularly regarding the user interface.

As to why this pattern occurs, we believe it is justified, since DigiMoCA is inherently a technological and disruptive screening tool. It is therefore to be expected that professionals from the technological field are keener to use it, and generally more interested in it and curious about how it works. Conversely, it also makes sense that professionals from the health field are more “skeptical” and less interested, since the health field is generally more stable and less prone to disruptive changes [ 27 ], and certainly more people-oriented than tool-oriented.

Finally, the fact that the information- and interface-related items obtain a slightly lower rating across all groups is understandable, as one of the main drawbacks of a voice-only communication channel is the restriction it places on the user interface, which lacks visual interaction. This suggests that the PSSUQ questionnaire should be adapted in this context for new ICT tools based on conversational agents, where questions about the user interface either need reformulation or should simply be excluded.

4 Conclusion

In this paper, a user-interaction pilot study analyzing the usability and acceptability of DigiMoCA, a digital Alexa-based cognitive impairment screening tool built on T-MoCA, is discussed, both from the end-users’ and administrators’ perspectives.

In the case of end-users, a TAM questionnaire was utilized, administered both before and after DigiMoCA. Overall, the results show that users accept DigiMoCA, giving it a 3+ score in all three of TAM’s dimensions, meaning that they perceive it as useful, easy to use and satisfactory. The perceived ease of use was particularly positive and internally consistent, with a mean score of 3.98. Additionally, the pre vs. post analysis shows that, while the acceptability of technology in general does not change significantly after the administration of DigiMoCA, the perceived acceptability of conversational agents specifically improves significantly. All three dimensions have an item with a statistically significant positive change. Moreover, the vast majority of non-significant changes were also positive.

In the case of test administrators, a PSSUQ questionnaire was used. Its results show that DigiMoCA is considered usable (mean score 5.86) very consistently ( \(\alpha \) = 0.95), with a score of 5+ out of 7 for all the dimensions and demographic groups. System usefulness was rated consistently higher than information and interface quality, and we find the biggest demographic differences between the health field group and the technological field group.

The sample size is one of the main limitations of the study. To estimate an ideal sample size, we first obtained an estimate of the prevalence of AD in Spain (10.285%) Footnote 1 . Then, for the required confidence level of 95%, n = 142 participants per study group would be needed, which is far from the sample size achieved so far.
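For reference, the figure of n = 142 follows from the usual sample-size formula for estimating a proportion, n = z²·p·(1−p)/e², using the stated prevalence (10.285%), z = 1.96 for a 95% confidence level, and an assumed 5% margin of error (the margin is our assumption; the text only states the confidence level):

```python
import math

def required_n(prevalence, z=1.96, margin=0.05):
    """Minimum sample size to estimate a proportion within +/- margin."""
    return math.ceil(z**2 * prevalence * (1 - prevalence) / margin**2)

n = required_n(0.10285)  # -> 142 participants per study group
```

The same formula with an 80% confidence level (z ≈ 1.28) yields the much smaller pilot-study requirement discussed in Section 2.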

Future lines of work include further characterizing the sample and carrying out a study of acceptability and usability by technological training of the participants, including their relationship with technology throughout their lives. Additionally, it would be worthwhile to analyse more objective metrics, such as participants’ response times, which could enrich the study of DigiMoCA.

Ongoing work addresses improving the perceived satisfaction of using DigiMoCA by making it friendlier, while also improving its interface and the information provided to the user, compensating for the limitations of voice-only interaction. As these aspects improve and user interaction with conversational agents is perceived as ever closer to interaction with human administrators, the distinctive affordability and accessibility of smart assistant-based tests can effectively set them apart as a powerful screening technology.


Data Availability

All data supporting the findings of this study are available within the paper and its Supplementary Files, particularly the user responses to the usability and acceptability questionnaires.

Source: Clinical Practice Guideline on Comprehensive Care for People with Alzheimer’s Disease and other dementias https://portal.guiasalud.es/wp-content/uploads/2018/12/GPC_484_Alzheimer_AIA-QS_resum.pdf

WHO (2023) Un decade of healthy ageing: Plan of action. https://cdn.who.int/media/docs/default-source/decade-of-healthy-ageing/decade-proposal-final-apr2020-en.pdf?sfvrsn=b4b75ebc_28

APA (2013) Diagnostic and statistical manual of mental disorders, 5th Edn. https://doi.org/10.1176/appi.books.9780890425596

Kowalska M, Owecki M, Prendecki M, Wize K, Nowakowska J, Kozubski W, Lianeri M, Dorszewska J (2017) Aging and neurological diseases. In: Senescence, IntechOpen, Ch. 5. https://doi.org/10.5772/intechopen.69499

Gallegos M, Morgan M, Cervigni M, Martino P, Murray J, Calandra M, Razumovskiy A, Caycho-Rodríguez T, Arias Gallegos W (2022) 45 years of the Mini-Mental State Examination (MMSE): A perspective from Ibero-America. Dementia & Neuropsychologia. https://doi.org/10.1590/1980-5764-dn-2021-0097

Kueper J, Speechley M, Montero-Odasso M (2018) The Alzheimer’s Disease Assessment Scale–Cognitive Subscale (ADAS-Cog): Modifications and responsiveness in pre-dementia populations. A narrative review. Journal of Alzheimer’s Disease. https://doi.org/10.3233/JAD-170991

Nasreddine Z, Phillips N, Bédirian V, Charbonneau S, Whitehead V, Collin I, Cummings J, Chertkow H (2005) The Montreal Cognitive Assessment, MoCA: A brief screening tool for mild cognitive impairment. J Am Geriatr Soc 53:695–9. https://doi.org/10.1111/j.1532-5415.2005.53221.x


Katz M, Wang C, Nester C, Derby C, Zimmerman M, Lipton R, Sliwinski M, Rabin L (2021) T-MoCA: A valid phone screen for cognitive impairment in diverse community samples. Alzheimer’s & Dementia: Diagnosis, Assessment & Disease Monitoring 13. https://doi.org/10.1002/dad2.12144

Nasreddine ZS (2021) MoCA test: Validation of a five-minute telephone version. Alzheimer’s & Dementia 17. https://doi.org/10.1002/alz.057817

Pacheco-Lorenzo MR, Valladares-Rodríguez SM, Anido-Rifón LE, Fernández-Iglesias MJ (2021) Smart conversational agents for the detection of neuropsychiatric disorders: A systematic review. Journal of Biomedical Informatics 113. https://doi.org/10.1016/j.jbi.2020.103632

Otero-González I, Pacheco-Lorenzo MR, Fernández-Iglesias MJ, Anido-Rifón LE (2024) Conversational agents for depression screening: A systematic review. International Journal of Medical Informatics. https://doi.org/10.1016/j.ijmedinf.2023.105272

Pacheco-Lorenzo M, Fernández-Iglesias MJ, Valladares-Rodriguez S, Anido-Rifón LE (2023) Implementing scripted conversations by means of smart assistants. Software: Practice and Experience 53. https://doi.org/10.1002/spe.3182

Landis JR, Koch GG (1977) The measurement of observer agreement for categorical data. Biometrics. https://doi.org/10.2307/2529310

Valladares-Rodriguez S, Fernández-Iglesias MJ, Anido-Rifón L, Facal D, Rivas-Costa C, Pérez-Rodríguez R (2019) Touchscreen games to detect cognitive impairment in senior adults. a user-interaction pilot study, International Journal of Medical Informatics 127. https://doi.org/10.1016/j.ijmedinf.2019.04.012

Lancaster GA, Dodd S, Williamson PR (2004) Design and analysis of pilot studies: recommendations for good practice. Journal of Evaluation in Clinical Practice. https://doi.org/10.1111/j.2002.384.doc.x

Cocks K, Torgerson DJ (2013) Sample size calculations for pilot randomized trials: a confidence interval approach. Journal of Clinical Epidemiology. https://doi.org/10.1016/j.jclinepi.2012.09.002

Reisberg B, Torossian C, Shulman M, Monteiro I, Boksay I, Golomb J, Benarous F, Ulysse A, Oo T, Vedvyas A, Rao J, Marsh K, Kluger A, Sangha J, Hassan M, Alshalabi M, Arain F, Sh N, Buj M, Shao Y (2018) Two year outcomes, cognitive and behavioral markers of decline in healthy, cognitively normal older persons with global deterioration scale stage 2 (subjective cognitive decline with impairment). Journal of Alzheimer’s disease: JAD. https://doi.org/10.3233/JAD-180341

Montejo P, Peña M, Sueiro M (2012) The Memory Failures of Everyday Questionnaire (MFE): Internal consistency and reliability. The Spanish Journal of Psychology. https://doi.org/10.5209/rev_SJOP.2012.v15.n2.38888

Graf C (2008) The Lawton Instrumental Activities of Daily Living (IADL) Scale. American Journal of Nursing. https://doi.org/10.1097/01.NAJ.0000314810.46029.74

CSIC (2023) Un perfil de las personas mayores en españa 2023. https://envejecimientoenred.csic.es/wp-content/uploads/2023/10/enred-indicadoresbasicos2023.pdf

Abu Rbeian AH, Owda A, Owda M (2022) A technology acceptance model survey of the metaverse prospects, AI. https://doi.org/10.3390/ai3020018

Lewis JR (1992) Psychometric evaluation of the Post-Study System Usability Questionnaire: The PSSUQ. Proceedings of the Human Factors Society Annual Meeting. https://doi.org/10.1177/154193129203601617

Cronbach LJ (1951) Coefficient alpha and the internal structure of tests. Psychometrika. https://doi.org/10.1007/BF02310555

Gliem JA, Gliem RR (2003) Calculating, interpreting, and reporting cronbach’s alpha reliability coefficient for likert-type scales. https://hdl.handle.net/1805/344

Mishra P, Singh U, Pandey CM, Mishra P, Pandey G (2019) Application of Student’s t-test, analysis of variance, and covariance. Annals of Cardiac Anaesthesia. https://doi.org/10.4103/aca.ACA_94_19

Thalheimer W, Cook S (2002)How to calculate effect sizes from published research: A simplified methodology. Work-Learning Research. https://api.semanticscholar.org/CorpusID:145490810

Tellez A, Garcia Cadena C, Corral-Verdugo V (2015) Effect size, confidence intervals and statistical power in psychological research. Psychology in Russia: State of the Art. https://doi.org/10.11621/pir.2015.0303

Nadarzynski T, Miles O, Cowie A, Ridge D (2019) Acceptability of artificial intelligence (ai)-led chatbot services in healthcare: A mixed-methods study. Digital Health. https://doi.org/10.1177/2055207619871808


Acknowledgements

We acknowledge the contributions and support of author’s colleague Noelia Lago, as well as the staff at AFAGA (Miriam Fortes and Maxi Rodríguez) and Centro de Día Parque Castrelos (Ángeles Álvarez), and all of the participants of this study, without whom this work would not be possible.

Open Access funding provided thanks to the CRUE-CSIC agreement with Springer Nature. Funding for open access charge: CISUG/Universidade de Vigo. This work has been partially funded by Ministerio de Ciencia e Innovación, project SAPIENS- Services and applications for a healthy aging [grant PID2020-115137RB-I00 funded by MCIN/AEI/10.13039/501100011033] and by the Ministry of Science, Innovation and Universities [grant FPU19/01981] (Formación de Profesorado Universitario).

Author information

Authors and Affiliations

atlanTTic, University of Vigo, 36310, Vigo, Spain

Moisés R. Pacheco-Lorenzo, Luis E. Anido-Rifón & Manuel J. Fernández-Iglesias

Department of Electronics and Computing, USC, 15782, Santiago de Compostela, Santiago de Compostela, Spain

Sonia M. Valladares-Rodríguez


Contributions

Moisés R. Pacheco-Lorenzo : administration of questionnaires, statistical analysis and writing.

Sonia Valladares-Rodriguez : statistical analysis and writing.

Manuel J. Fernández-Iglesias : supervision, writing, review and editing.

Luis E. Anido-Rifón : supervision, writing, review and editing.

Corresponding author

Correspondence to Moisés R. Pacheco-Lorenzo .

Additional information

Publisher's note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix A: TAM questionnaire


Appendix B: PSSUQ questionnaire


Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .

Reprints and permissions

About this article

Pacheco-Lorenzo, M.R., Anido-Rifón, L.E., Fernández-Iglesias, M.J. et al. Will senior adults accept being cognitively assessed by a conversational agent? a user-interaction pilot study. Appl Intell (2024). https://doi.org/10.1007/s10489-024-05558-z


Accepted : 23 May 2024

Published : 15 June 2024

DOI : https://doi.org/10.1007/s10489-024-05558-z








Punchoojit, L.; Hongwarittorrn, N. (Thammasat University). Usability Studies on Mobile User Interface Design Patterns: A Systematic Literature Review. Wiley, November 2017; 2017(16):1–22.



Published on 18.6.2024 in Vol 26 (2024)

Developing a Chatbot to Support Individuals With Neurodevelopmental Disorders: Tutorial

Authors of this article:


  • Ashwani Singla 1, MSc
  • Ritvik Khanna 1, BSc
  • Manpreet Kaur 1, MEng
  • Karen Kelm 1, MA
  • Osmar Zaiane 1, MSc, PhD
  • Cory Scott Rosenfelt 1, BSc
  • Truong An Bui 1, PhD
  • Navid Rezaei 1, PhD
  • David Nicholas 1, PhD
  • Marek Z Reformat 1, MSc, PhD
  • Annette Majnemer 2, OT, PhD
  • Tatiana Ogourtsova 2, OT, PhD
  • Francois Bolduc 1, MD, PhD

1 Department of Pediatrics, University of Alberta, Edmonton, AB, Canada

2 School of Physical & Occupational Therapy, McGill University, Montreal, QC, Canada

Corresponding Author:

Francois Bolduc, MD, PhD

Department of Pediatrics

University of Alberta

11315 87th Avenue

Edmonton, AB, T6G 2E1

Phone: 1 780 492 9713

Email: [email protected]

Families of individuals with neurodevelopmental disabilities or differences (NDDs) often struggle to find reliable health information on the web. NDDs encompass various conditions affecting up to 14% of children in high-income countries, and most individuals present with complex phenotypes and related conditions. It is challenging for their families to develop health literacy solely by searching for information on the internet. While in-person coaching can enhance care, it is available to only a minority of those with NDDs. Chatbots, or computer programs that simulate conversation, have emerged in the commercial sector as useful tools for answering questions, but their use in health care remains limited. To address this challenge, the researchers developed a chatbot named CAMI (Coaching Assistant for Medical/Health Information), in collaboration with individuals with lived experience, to provide information about trusted resources covering core knowledge and services relevant to families of individuals with NDDs. The developers used the Django framework (Django Software Foundation) and a knowledge graph depicting the key entities in NDDs and their relationships, which allows the chatbot to suggest web resources related to user queries. To identify NDD domain–specific entities in user input, a standard source (the Unified Medical Language System) was combined with additional entities identified by health professionals and collaborators. Although most entities were identified in the text, some were not captured in the system and therefore went undetected. Nonetheless, the chatbot was able to provide resources addressing most user queries related to NDDs.
The researchers found that enriching the vocabulary with synonyms and lay-language terms for specific subdomains enhanced entity detection. Using a data set covering numerous individuals with NDDs, they developed a knowledge graph that established meaningful connections between entities, allowing the chatbot to present related symptoms, diagnoses, and resources. To the researchers’ knowledge, CAMI is the first chatbot to provide resources related to NDDs. This work highlights the importance of engaging end users to supplement standard generic ontologies with named entities for language recognition. It also demonstrates that complex medical and health-related information can be integrated using knowledge graphs that leverage existing large data sets. This has multiple implications: generalizability to other health domains and a reduced need for expert input, optimizing experts’ contributions while keeping health care professionals in the loop. The researchers’ work also shows how the health and computer science domains need to collaborate to achieve the granularity needed to make chatbots truly useful and impactful.

Introduction

Knowledge exchange in the medical domain presents multiple challenges, including accessibility, readability, and accuracy. Chatbots, or computer programs that simulate conversations, can help answer users’ or caregivers’ questions [ 1 , 2 ]. Moreover, a chatbot offers several advantageous features for medical needs: flexibility (service providers available 24/7 and from any location [ 3 ]), speed (rapid delivery of a large number of resources), privacy (confidential access for users), engagement (an appealing user interface that fosters interaction), and trustworthiness (information developed by professionals, ensuring reliability). Chatbots have already been developed for diagnosing heart disease [ 4 , 5 ], providing counseling for mental health [ 6 ], improving patient monitoring and medical services [ 7 ], and preventing eating disorders [ 8 ]. Some chatbots have integrated coaching elements to support youth with weight management and prediabetes symptoms [ 9 , 10 ], young adults with depression and anxiety [ 3 , 11 ], people with obesity and emotional eating issues, adults wishing to improve wellness [ 4 , 5 ], and young adults with a high level of stress [ 6 ]. User trust remains a key challenge faced by medical and health-related chatbots [ 12 ].

Chatbots powered by advances in natural language processing (NLP) such as large language models (LLMs; eg, ChatGPT [ 7 ]) have shown how chatbots can revolutionize the way information is shared and accessed.

Nonetheless, developing a chatbot for the medical domain is challenging, especially when targeting complex medical conditions such as neurodevelopmental disorders (NDDs). NDDs represent a diverse group of conditions affecting development, including conditions such as attention-deficit/hyperactivity disorder (ADHD), intellectual disability, autism spectrum disorder, cerebral palsy, and learning difficulty. Together, NDDs affect up to 14% of children [ 8 ] and have major implications not only for the individuals themselves but also for their families and society [ 13 ]. It is increasingly recognized that individuals with 1 NDD diagnosis (eg, autism spectrum disorder) often also present with features of ADHD or learning difficulty. Moreover, several associated conditions (often referred to as comorbidities) [ 14 , 15 ] can also significantly affect the clinical presentation of individuals with NDDs and their needs in terms of health, social participation, and education; for instance, sleep disorders, gastrointestinal symptoms, anxiety, depression, or even seizures are more commonly found in individuals with NDDs than in the general population. NDDs are also increasingly linked to genes. While this has paved the way for personalized medicine, it has also led to silos where parents have access to information only through associations created for their genes of interest, resulting in very limited information in terms of management for many rare conditions. Furthermore, NDDs are chronic conditions with changing manifestations and needs over the course of an individual’s lifespan.

While LLM technology is evolving quickly, here, we discuss key steps to be considered when developing a chatbot for the medical domain and present how our team applied these steps to our use case of developing a chatbot for medical information related to NDDs named CAMI (Coaching Assistant for Medical/Health Information).

Consulting End Users About Their Needs to Target the Needs of Potential Users

It is important to consult directly with individuals with lived experience. While this is now common practice in industry, it remains challenging to reach out to individuals with medical issues.

Step 1: We Recommend Developing an Advisory Group That Includes Representative Individuals

In our case, we developed a national advisory group that included 9 caregivers of individuals with NDDs (we advertised for the advisory group position through associations and partners involved in NDD research in Canada). The caregivers were predominantly female (8/9, 89%), and their ages ranged from 32 to 51 years. In terms of marital status, of the 9 participants, 7 (78%) were married or had common-law spouses, and 2 (22%) were divorced or separated. They had diverse levels of education: bachelor’s degree (7/9, 78%), master’s degree (1/9, 11%), and PhD (1/9, 11%). Of the 9 participants, 6 (67%) were employed full time, while 3 (33%) worked part time. The participants had various occupations: special education teacher (1/9, 11%), planning and programming officer (1/9, 11%), senior IT consultant (1/9, 11%), community development coordinator (1/9, 11%), academic professor (2/9, 22%), research assistant (1/9, 11%), psychologist (1/9, 11%), and manufacturer (1/9, 11%). Of the 9 individuals, 8 (89%) were White, and 1 (11%) identified as a First Nations person. All participants were biological parents.

The advisory group will not only provide direct feedback on the project but will also, through their personal and professional networks, recruit participants for testing the ideas or chatbot prototype.

Step 2: Ethical Considerations

It is important to prepare in advance a complete set of questions to be used in semistructured interviews with individuals with lived experience. The advisory group can provide key insights into the development of the consent forms, study protocol, and interview material. For developers from industry, it is key to collaborate with clinicians, association leaders, or other allied health professionals who are not only able to provide feedback but can also help engage participants in the project.

In our case, the project was approved by the research ethics board at the University of Alberta (Pro00081113). Informed consent was obtained from participants in interviews. The participants were compensated for their work as per ethics regulation.

Step 3: Consultations

We used semistructured interviews ( Multimedia Appendix 1 ) adapted from user interface evaluation resources [ 12 , 16 - 18 ] to ask advisory group members about their current needs, their use of web-based platforms to gather information, and the current barriers to access to information. All interviews were video recorded after receiving consent from the participants.

Our interviews allowed us to identify overlapping themes using thematic analysis [ 19 ]. We identified patterns and themes among the different user groups and built a plan for how to represent the data. The advisory group members (who were individuals with lived experience) suggested that the chatbot should provide rapid access to information:

I like the way that the resources just pop up in the side window as the user is answering the questions.
You could be asking me questions all day long but I’m not sure how close it can come to identifying the biggest stressors and challenges for our family. According to those responses they could target the info resources accordingly. If the chatbot doesn’t get to the bottom of our challenges fast enough, as a parent, I would likely find myself turning to Google more quickly to find what I’m looking for.

Step 4: Knowledge Mapping

It is important to identify the potential differences in the conceptual framework associated with medical conditions. These differences in mental representations, known as mental models, can lead to blind spots that may prevent the chatbot from being widely useful [ 20 ].

Developing a Database of Resources for the Chatbot

Step 1: Gathering Information

Providing trusted, exhaustive, and actionable information is key for a chatbot meant to provide medical information. It is important to partner with individuals with lived experience, associations (patients, parents, and professionals) relevant to the condition, and health authorities to obtain such information. The information about the relevant web pages, books, or other formats needs to be stored centrally and visible to others so that it can be peer reviewed. This can be done using shared sheets or websites.

In our case, we developed a database of NDD resources by leveraging existing databases (Alberta Children’s Hospital, AIDE Canada, and InformAlberta [ 21 , 22 ]) as well as by conducting a nationwide consultation with individuals with lived experience from across Canada, who submitted 1422 resources. AIDE Canada and InformAlberta shared their resource databases with the research team, and these were incorporated into the main CAMI database. The same data set, or a superset of it, is used on the general websites of the aforementioned organizations. The individuals with lived experience used Google Sheets to add known or newly found resources with the appropriate data labels. These data included the web page URL, type of resource, language, age group, location, eligibility criteria, and some important keywords. All resources were collected and annotated manually by group members. Submitted resources covered several topics such as health, education, and support programs and were grouped under the categories core knowledge, educational, financial support, and services.
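The data labels listed above can be sketched as a simple record type. This is an illustrative sketch only: the field names and example values below are assumptions for clarity, not the actual spreadsheet columns or category vocabulary used by the CAMI team.

```python
from dataclasses import dataclass, field

@dataclass
class Resource:
    """Hypothetical record mirroring the data labels described in the text:
    URL, type of resource, language, age group, location, eligibility
    criteria, keywords, and category."""
    url: str
    resource_type: str                # e.g., "web page", "book", "video"
    language: str                     # e.g., "en", "fr"
    age_group: str                    # e.g., "0-5", "6-12", "adult"
    location: str                     # province or "Canada-wide"
    eligibility: str                  # free-text eligibility criteria
    keywords: list[str] = field(default_factory=list)
    category: str = "core knowledge"  # or "educational", "financial support", "services"

# Example entry (hypothetical URL and values)
r = Resource(
    url="https://example.org/adhd-parenting",
    resource_type="web page",
    language="en",
    age_group="6-12",
    location="Alberta",
    eligibility="open to all",
    keywords=["ADHD", "parenting"],
)
```

Keeping each label as an explicit field makes the shared sheet easy to validate and lets later pipeline stages (annotation, graph loading) consume records uniformly.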

A key feature brought up by the families of individuals we consulted was the need for an in-depth, trusted, and diverse set of resources. Therefore, we collaborated with several organizations involved in providing NDD treatment as well as with individuals with lived experience. At the start, we encountered challenges in obtaining resources for diverse NDD subtypes and covering all regions of Canada because we had connected only with organizations that tend to have better coverage in urban areas. We found that forming a network of individuals with lived experience was very helpful and allowed us to achieve broader coverage in identifying resources ( Figure 1 ; Textbox 1 ).


Textbox 1. Province and number of resources

  • Ontario: 11,819
  • Alberta: 8986
  • British Columbia: 7330
  • Saskatchewan: 7285
  • Quebec: 6640
  • Manitoba: 6218
  • New Brunswick: 6095
  • Yukon: 4951
  • Nunavut: 4508
  • Newfoundland and Labrador: 4085

Step 2: Data Annotation

It is important to annotate the resources included in the data set in a meaningful way so that the information can be retrieved in response to related queries by future chatbot users. Again, discussions with potential users are key in capturing topics of interest and making sure that they are properly covered in the database.

We developed an automated annotation tool [ 23 ] using an NLP pipeline that uses a combination of named entity recognition, topic modeling, and a text classification model to annotate the resources. All resource annotations, along with their weights, were stored in the Neo4j graph database (Neo4j, Inc) in the form of a weighted knowledge graph, as shown in Figure 2 of the paper by Costello et al [ 23 ]. Ordered weighted aggregation operators are used to rank the resources returned in response to a user query.
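To illustrate the ranking step, the following is a minimal sketch of ordered weighted aggregation (OWA): per-criterion match scores for a resource are sorted in descending order and combined with a fixed weight vector. The weight values and score names here are hypothetical, not the exact parameters used in CAMI.

```python
# Illustrative OWA ranking; weights and criteria are assumed examples.

def owa_score(scores, weights):
    """Sort the per-criterion scores in descending order, then take the
    weighted sum against a fixed weight vector (classic OWA)."""
    ordered = sorted(scores, reverse=True)
    return sum(w * s for w, s in zip(weights, ordered))

def rank_resources(resources, weights):
    """Return resources sorted by their OWA score, best first."""
    return sorted(resources,
                  key=lambda r: owa_score(r["scores"], weights),
                  reverse=True)

resources = [
    {"title": "A", "scores": [0.9, 0.2, 0.1]},  # eg, topic/entity/location match
    {"title": "B", "scores": [0.5, 0.5, 0.5]},
]
weights = [0.5, 0.3, 0.2]  # emphasize the strongest criterion

ranked = rank_resources(resources, weights)
```

Because the weights emphasize the strongest match, resource A (one very strong criterion) outranks resource B (uniformly mediocre) in this toy example.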

Alternatively, one could use LLMs to extract keywords from the individual web pages. This would include selecting a suitable prompt and providing the web page content extracted from the URL. By leveraging LLMs such as ChatGPT and Google Gemini, among others, the content can be analyzed to identify and filter out important keywords from the web page.
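A minimal sketch of this LLM-based alternative is shown below. Only the prompt construction and response parsing are concrete; `call_llm` is a placeholder for whichever client (ChatGPT, Gemini, or another) is available, and the prompt wording is an assumption.

```python
# Sketch of LLM-based keyword extraction; `call_llm` is a hypothetical
# stand-in for a real LLM client call (str -> str).

def build_keyword_prompt(page_text, max_keywords=10):
    return (
        f"Extract up to {max_keywords} keywords that describe this "
        "neurodevelopmental disability resource page. Return a "
        "comma-separated list only.\n\n"
        f"Page content:\n{page_text[:4000]}"  # truncate long pages
    )

def extract_keywords(page_text, call_llm):
    reply = call_llm(build_keyword_prompt(page_text))
    return [k.strip() for k in reply.split(",") if k.strip()]

# Example with a stub in place of a real LLM call:
fake_llm = lambda prompt: "autism, services, Edmonton"
keywords = extract_keywords("Autism services in Edmonton ...", fake_llm)
```

Injecting the LLM call as a function also makes the extraction step easy to test without network access.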


Step 3: Database Formatting

The information is stored in the database in a format that retains the dependencies and relations among different data. The data are divided into multiple tables or classes such that the data are modular and can be used independently.

We used the Neo4j database to store the database models, and the database includes several nodes, relationships, and properties. The web page content was initially stored in the MongoDB database (MongoDB, Inc) and was further annotated into multiple properties and relationships. These properties are is_about , is_associated_with , is_located_in , occurred_together , and so on. When querying the database, these properties and corresponding relationships are extracted to filter out the resources from the database and rank them in accordance with the defined parameters.
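As a simplified stand-in for querying the graph database, the following sketch models resources with is_about and is_located_in relationships as plain Python sets and filters and ranks them by relationship overlap. The relationship names follow the text; the matching logic itself is an illustrative simplification, not the production Cypher queries.

```python
# Toy in-memory analogue of the graph query: filter by location, rank by
# how many "is_about" targets the query matches. Data are made up.

resources = [
    {"title": "Autism services Edmonton",
     "is_about": {"autism", "services"}, "is_located_in": {"edmonton"}},
    {"title": "ADHD basics",
     "is_about": {"adhd"}, "is_located_in": set()},
]

def query(graph, about, located_in=None):
    hits = []
    for r in graph:
        if located_in and located_in not in r["is_located_in"]:
            continue  # location filter
        score = len(r["is_about"] & about)  # relationship overlap
        if score:
            hits.append((score, r["title"]))
    return [title for _, title in sorted(hits, reverse=True)]

results = query(resources, {"autism"}, "edmonton")
```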

Designing and Evaluating the Chatbot User Interface Design

A crucial aspect of the chatbot relates to its user interface, which requires it to be functional, easy to use, and attractive so that it appeals to users [ 6 - 8 , 13 ]. Of particular importance is the landing page because it represents the first impression of the application and should connect with the target audience [ 7 ] by reflecting the user’s [ 6 ] as well as the chatbot’s goals [ 14 ]. Important components of the page design include the logo, fonts, colors, and layout [ 15 ] ( Multimedia Appendices 2 - 7 ).

Step 1: Consultation Regarding the Interface

To streamline the process of multiple iterations, mock-up representations of the chatbot can be used in interviews with individuals with lived experience. This is important to increase the cost-effectiveness and reduce the time required for coding each version. In our case, the mock-ups, or the design wireframes of the chatbot, were shared with the participants (individuals with lived experience) by research team members on a Zoom call (Zoom Video Communications, Inc), and their feedback was recorded on Google Docs during these calls. We usually included both content experts and computer scientists to uncover gaps. This cycle should be repeated until a consensus is reached among the target user base, and the design is finalized [ 24 ]. In the design process, we used Figma (Figma Inc) [ 25 ], a collaborative design tool that allows users to work on various designs while also allowing others to review them and provide comments. Optimal user interface design uses an agile methodology that includes iterative design, implementation, and testing [ 26 ].

We started with the home page of the chatbot ( Figure 3 ). We found that families of individuals with NDDs favored a simple design with a clear indication of the purpose of the chatbot. We also included a tutorial video about the purpose and intended use of the chatbot. The families also recommended the inclusion of the names of the institutions involved in the development of the chatbot as a mark of trusted information.

The home page was designed using Gestalt principles [ 27 ], and all closely related content was kept on the same page. During the initial interviews, participants indicated that the home page content was not sufficiently self-explanatory, clear, or trustworthy. This feedback was taken into account, and the required changes were made to showcase the authenticity of the chatbot.


Step 2: Conversation Design

The conversation (or generic flow), which represents the exchange between the user and the chatbot, is a very important aspect of usability. Most chatbots include an introductory text explaining the purpose of the chatbot. Beyond that, the conversation can be more or less prestructured.

An introductory video can help the user rapidly understand the aim of the chatbot as well as how best to ask questions. Importantly, the video needs to be short enough for the user to stay engaged with the chatbot.

In our case, the conversation (chat) design went through several iterations based on input from caregivers of individuals with NDDs ( Figure 4 ). In the initial design, users were confused when the chatbot was not able to understand them: they had to either start the chat again or follow along with the conversation. In the final design, users can inform the chatbot if its understanding is incorrect and enter their query again. Users also have more flexibility in responding to certain questions by clicking “I am not sure” if they are not comfortable answering them. In certain cases, we also provided radio buttons based on participant feedback:

Instead of having the user type yes or no every time, could a selection not appear using radio buttons, one for yes and one for no? It would be easier for the user.

We developed a video explaining the aim and scope of the chatbot to avoid users asking questions outside of our domain of expertise. The video is available on the web [ 28 ].


Step 3: Output Representation

Another major focus was on how to present the results of the query.

In our case, web resources identified by CAMI are presented to the user ( Figure 5 ). This part underwent significant modification to provide information to users that would influence their decision to continue browsing or not. Users have major time constraints, and they would lose interest quickly if they are unclear about the quality of the resource. The resource card evolved from showing a title and rating to showing a static image of the website, the type of resource, tags describing the resource, and some key functions such as sharing or saving the resource (which was identified by users as key in being able to access the chatbot from multiple environments, eg, work, commute, and home). All static images are captured as screenshots and saved in the database. The families reported that in addition to being more appealing, the images would help to build trust and identify the authenticity of a site if they could see its home page. Moreover, users mentioned that they would be able to recognize and remember the site more effectively if it displayed an image. We also took advantage of the tags to allow the user to converse with or direct the chatbot dynamically. Indeed, when the user clicks on a radio button, say, “parent” or “autism,” the chatbot would respond by providing resources related to this tag. This not only allows the user to start with a query but also enables them to navigate to other topics of interest that they may not have initially considered.

Next, we investigated how many resources to display during a single interaction. The families’ feedback showed consensus at 3. Resources are presented in sets of 3 with page numbers at the bottom showing the user that there are more resources for them to look at in the future ( Figure 6 ).


Step 4: User Engagement and Coaching Strategies

While websites have used several ways to encourage users to remain on their platform, it is important to consider the vulnerable nature of patients or their families as well as the potential detrimental effect of prolonged use. At the same time, encouraging the user to reconnect with the platform will allow repeated use and coaching.

Health coaching and care coordination [ 29 , 30 ] have been shown to improve health outcomes for individuals with NDDs, but they are not offered to most families because of cost, a lack of specialized health care professionals [ 31 ], access barriers due to geography, or a lack of integration into the health system (eg, insurance and social factors).

In our case, we consulted with health experts in NDDs and identified key coaching aspects that could be provided to users. We also asked users whether they were interested in being sent tips or related resources via SMS text messaging or email ( Figure 7 ).


Optimizing User Input Understanding in Domain-Specific Terms

Understanding user input is an important factor in providing responses to user queries. NLP is a domain of artificial intelligence focused on developing computer programs that are able to read textual data, analyze the content, and extract meaningful information [ 32 ]. Advanced deep network NLP algorithms, such as named entity recognition and relation extraction, have been repurposed to identify diseases, symptoms, and drugs from user input [ 33 - 35 ]. Chatbots use NLP to process user text and respond to the text. While the underlying processes are different, chatbots aim to behave like people who listen and respond to their conversational partners. Day-to-day language is difficult for a computer to understand, but, using NLP, the computer is able to break up user text into a set of attribute-value pairs; for example, if the user types “I want to know if there are any services available for Global Developmental Delay in Edmonton,” the NLP pipeline would break up the text into subparts by extracting keywords: { Services, Global Developmental Delay, Edmonton }.
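The attribute-value breakup described above can be illustrated with a deliberately naive dictionary lookup. The real pipeline uses trained named entity recognition and classification models rather than exact string matching; the vocabulary entries here are examples only.

```python
# Naive illustration of turning a query into attribute-value pairs.
# Real systems use NER models, not exact matching; VOCAB is made up.

VOCAB = {
    "services": ("type", "Services"),
    "global developmental delay": ("condition", "Global Developmental Delay"),
    "edmonton": ("location", "Edmonton"),
}

def extract_pairs(query):
    q = query.lower()
    return {attr: value
            for term, (attr, value) in VOCAB.items()
            if term in q}

pairs = extract_pairs(
    "I want to know if there are any services available for "
    "Global Developmental Delay in Edmonton")
```

For the example query from the text, this yields the pairs {type: Services, condition: Global Developmental Delay, location: Edmonton}.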

We leveraged an annotation NLP pipeline [ 23 ] to extract structured information from the user free-text query. Every query submitted by the user is divided into multiple medical-related domains such as Human Phenotype Ontology (HPO) for medical terms and Alliance of Information & Referral Systems (now known as Inform USA) for service-related terms, as well as symptom-specific terms (eg, challenging behavior), geographic location, and the age of the individual.

We found that formatting the query into multiple domains helped us identify the resources with more accuracy compared to performing a search using keywords. We experimented with certain queries from the test group and identified the related conditions for common queries. Textbox 2 shows the extraction of a single query to a list of multiple medical database keywords that relate to each other when filtering the resources.

In CAMI, certain keywords are hardcoded so that the recommendations can be domain-specific and adhere closely to user requirements. If keywords such as “knowledge” or “information” are included in the initial user query, the user will not be asked about their geographic location because location is not relevant; for instance, general knowledge regarding autism is likely consistent in Canada, the United States, or Germany, but services or policies will differ based on location. By contrast, if the keyword “services” is detected in the user query, and the location is not extracted from the query itself, the user will be prompted to provide their location (city or province) such that the recommended services can be actionable.
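The hardcoded-keyword rule described above can be sketched as a small decision function. The keyword lists are illustrative; CAMI's actual lists may be larger.

```python
# Sketch of the location-prompt rule: knowledge queries skip the prompt,
# service queries trigger it when no location was extracted.
# Keyword sets are illustrative examples, not CAMI's exact lists.

KNOWLEDGE_KEYWORDS = {"knowledge", "information"}
SERVICE_KEYWORDS = {"service", "services"}

def needs_location_prompt(query_keywords, extracted_location):
    kws = {k.lower() for k in query_keywords}
    if kws & KNOWLEDGE_KEYWORDS:
        return False  # general knowledge is location-independent
    if kws & SERVICE_KEYWORDS and not extracted_location:
        return True   # services must be locally actionable
    return False
```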

Term and value

  • HPO-DDD: “attention-deficit/hyperactivity disorder”
  • UMLS: “mental suffering”
  • ERICterm: “sons”
  • cb_category: “behavioral concerns”
  • Ngrams: “son, suffering, adhd”
  • searchedCategory: “core knowledge”
  • Location: “not found/not required”
  • Age: “not found”

Chatbot Framework Coding

There are several tools available to create chatbots with diverse functionalities, and their use varies with each use case. Wit.ai [ 36 ], Rasa [ 37 ], and Dialogflow [ 38 ] are some of the popular frameworks used to design rule-based chatbots.

We used the Django library in Python to create the backend for CAMI. It follows the model-view-template framework and an app-based architecture, in which all classes are called apps and can work independently as well as in conjunction with other apps in the system ecosystem [ 36 ]. Each app includes a model file, a views file, a test file, and an init file. The schema was defined for the database table in the models.py file, where each column name is declared with its definition, which includes the data type, constraints, and default values. All changes to the database model generate a migration file, which is auto-created SQL code that creates or updates the database’s internal schema [ 37 , 38 ]. View files include a set of functions with decorators that define the type of HTTP request served [ 39 ]. Django (Django Software Foundation) was selected over comparable frameworks such as Flask because Django’s authentication module streamlined the development of the registration and log-in functionalities through auto-encryption, password verification, and authorization [ 34 , 35 ].

Once the information is extracted from the user query, it is sent to the matching engine. The matching engine incorporates the logic to find the similarity between the given input data dictionary and the list of resources from the database. The resources are annotated in the same way as the query to maintain uniformity between the query and the list of resources considered for matching after the resources are filtered by the type of resource requested ( Figure 8 ).
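As a rough sketch of such matching, the following computes a per-field overlap between the query's annotation dictionary and a resource's annotations, averaged over the fields the query actually filled. The field names mirror the annotation format shown earlier; the scoring itself is a simplification of the real matching engine, not its actual formula.

```python
# Illustrative query-resource similarity: per-field set overlap,
# averaged over non-empty query fields. A simplification, not CAMI's
# production scoring.

def similarity(query_ann, resource_ann):
    scores = []
    for field, wanted in query_ann.items():
        if not wanted:
            continue  # empty query fields do not constrain the match
        have = set(resource_ann.get(field, []))
        scores.append(len(set(wanted) & have) / len(set(wanted)))
    return sum(scores) / len(scores) if scores else 0.0

query_ann = {"cb_category": ["Sleep issues"], "AGE": ["Child"], "UMLS": []}
res_a = {"cb_category": ["Sleep issues"], "AGE": ["Child"]}
res_b = {"cb_category": ["Sleep issues"], "AGE": ["Adult"]}
```

Here res_a matches both constrained fields (similarity 1.0) while res_b matches only the category (0.5), so res_a would be ranked first.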

We optimized the system for faster throughput so that results are displayed more quickly. This was done by increasing the heap size and page cache in the Neo4j configuration file. The average response time for query analysis is approximately 2 seconds, but this depends on the length of the sentence. For shorter sentences, the output is returned in less time ( Figure 9 ).
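For reference, such tuning is done through settings of the following kind in neo4j.conf (Neo4j 4.x setting names; the sizes shown are placeholders and should be chosen based on the available memory, not values from our deployment):

```
dbms.memory.heap.initial_size=2g
dbms.memory.heap.max_size=4g
dbms.memory.pagecache.size=2g
```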

Similarly, the time required to obtain the recommendations for the queries ( Figure 9 ) changes with the size of the payload. The engine checks all web pages and ranks them on the basis of the topic, entity, and location ( Figure 10 ).

In developing the chatbot, it is important to follow the CONSORT (Consolidated Standards of Reporting Trials) guidelines [ 39 ]. CONSORT guidelines are usually meant for reporting clinical trials, ensuring transparency and accuracy in the reporting of trial methods and results, but the same principles can be applied to software development as well. The guidelines include the following: (1) mention the names, credentials, and affiliations of the developers, sponsors, and owners; (2) describe the history and development process of the application and previous formative evaluations; (3) report revisions and updates; (4) provide information on quality assurance methods to ensure the accuracy and quality of the information provided; (5) ensure replicability by publishing the source code; (6) provide the URL of the application; however, because the intervention is likely to change or disappear over the years, make sure that the intervention is archived; and (7) describe how participants accessed the application, in what setting or context, and if they had to pay for access.

In our case, we applied the CONSORT guidelines as outlined in Textbox 3 .


Adhering to the CONSORT guidelines

  • We provided the following information about the developers and authors: Ashwani Singla (backend), Ritvik Khanna (front-end), Manpreet Kaur (NLP [natural language processing]), and Francois Bolduc (conceptual and content expert).
  • CAMI (Coaching Assistant for Medical/Health Information) went through multiple major changes throughout the project timeline as well as numerous ongoing minor changes that were tackled on the way to building the Epic (project milestone) or the goal of the system. All code changes are committed to the GitHub repository with separate branches for front-end, backend, and NLP. Any modification to the code will go to the respective branch, and the integration developer will perform integration testing to merge the code. All modifications to the chatbot are subject to feedback from the testing group and the principal investigators of the project.
  • CAMI went through 3 major releases and 403 commits on the GitHub repository [ 40 ]. The most recent stable version for the backend functionality was released on April 16, 2023, and some changes were made on the front-end progressively. Development was frozen for all versions during the testing phase so that the same system could be tested by multiple people. There were significant changes to version 3.0 with regard to using the knowledge graph to find the connection between the entities and share the information with users.
  • During testing, the chatbot's quality was evaluated based on 3 main components: user understanding of the chatbot's functionality, user satisfaction with the overall quality of resources, and user confidence in sharing information on the platform. These metrics helped the development team to enhance the quality of the resources by placing more relevant resources on the first page and others on the subsequent pages. Similarly, the question flow changes are a result of the confidence shown by users, which was evaluated based on the user being willing to answer at least 80% of the questions asked by the chatbot.
  • CAMI is deployed on the University of Alberta’s MedIT server with the URL [ 41 ]. The default page corresponds to the home page or landing page. The backend end points can be accessed using the same URL but with different subroutings.
  • The testing method varied based on the version of the chatbot. In the first and second test phases, the participants were on a video call on Zoom (Zoom Video Communications, Inc), and the screen view was shared with them to enable them to view the website. Participants were granted remote control access to the screen to test the system, although some participants guided the proctor to click certain options. In the third version of testing, the URL was shared with the participants so that they could use a web browser to test the system. Participants could report any bugs they found. Finally, the testing group completed a feedback form.
  • The chatbot is currently freely accessible.

Establishing a Conversation Flow for the Chatbot

In the case of medical conditions, especially complex ones such as NDDs, patients present with variable or multiple issues that can all be of interest. Therefore, it is important to include a knowledge of such associated concepts to provide comprehensive information. In some cases, the core disease, say, autism or advanced stage cancer, may not be treatable, but associated conditions (eg, anxiety or constipation) may have interventions available to the patients. Concepts (or entities) and their relationships are often stored in knowledge graphs. A knowledge graph consists of a graphical and structured representation of information: terms or concepts can be stored as nodes that are connected to one another through edges that define different relationships [ 42 ]. Knowledge graphs are already being used to store information about disorders [ 8 , 13 , 14 ] and could therefore be leveraged by chatbots.

Using a knowledge graph in a health assistant or chatbot allows the question-answering system to provide (1) information regarding, for instance, disease symptoms [ 43 , 44 ]; (2) internet-based diagnosis and risk assessment [ 45 ]; (3) personal lifestyle interventions for serious conditions [ 46 - 48 ]; and (4) prediagnosis and triaging using the patient’s symptoms and medical history [ 49 , 50 ].

We developed a knowledge graph representing key entities observed in individuals with NDDs as well as their interrelation by leveraging the largest data set of individuals with NDDs, the Deciphering Developmental Disorders (DDD) database [ 51 ]. The DDD database includes phenotypic and genotypic information along with the ages of 13,424 patients with severe and undiagnosed NDDs [ 30 ]. The DDD database labels phenotypes using the HPO. The knowledge graph contains 4181 HPO phenotypes for which we calculated the co-occurrence counts among all patient profiles and stored them in the knowledge graph using the “is_associated_with” relationship. The knowledge graph contains 357,514 “is_associated_with” relations. As all resources are already annotated with HPO phenotypes, the chatbot suggests resources to the user for other phenotypes associated with the queried phenotype.
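The co-occurrence counting behind the “is_associated_with” edges can be sketched as follows, with toy phenotype profiles standing in for the DDD patient data (the HPO codes and counts here are illustrative, not figures from the DDD database).

```python
# Sketch of building "is_associated_with" edge weights from patient
# phenotype profiles; profiles and codes are toy examples.
from collections import Counter
from itertools import combinations

profiles = [
    {"HP:0001250", "HP:0000717"},                 # toy patient 1
    {"HP:0001250", "HP:0000717", "HP:0002360"},   # toy patient 2
    {"HP:0000717"},                               # toy patient 3
]

cooccur = Counter()
for phenotypes in profiles:
    # Count each unordered phenotype pair once per patient profile.
    for a, b in combinations(sorted(phenotypes), 2):
        cooccur[(a, b)] += 1  # edge weight for is_associated_with
```

Each counter entry then becomes a weighted edge in the knowledge graph, so the chatbot can surface the phenotypes most frequently co-occurring with the one queried.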

Leveraging the DDD database allowed us to identify the concepts that co-occurred in individuals with NDDs ( Figure 2 ). Using CAMI, we are able to identify the relationships between different nodes (medical entities related to NDDs). This is possible because the large number of individuals in the DDD database makes differences in co-occurrence rates salient.

The tips or suggestions are currently based on established concepts from the coaching literature developed for families or caregivers of individuals with NDDs ( Figure 11 ); for instance, suggesting that the user set up objectives, which is a key step and a common coaching strategy when dealing with individuals with complex conditions such as NDDs. The chatbot will provide a tip, which, if the user wishes, can be sent to them via email or SMS text message.

This allowed CAMI to provide the user with the names of conditions related to their initial query. It then offered the user the opportunity to see resources for these related entities ( Figure 11 ).

The entities extracted from the query will be passed to the knowledge graph by calling the Neo4j database’s object instance, and the resources are matched [ 26 ]; for example, a query such as “My child has sleep issues” will be parsed and subcategorized into multiple key-value pairs. The aforementioned sample will return the result in the following format:

{HPO-DDD: [], UMLS: [Problem], EricTerm: [Sleep], AIRS: [], cb_category: [Sleep issues], location: [], AGE: Child, ngrams: [child, sleep, issue], relatedConditions: [], searchedCategory: [core knowledge]}

The keywords in the query, including “sleep,” “child,” and “issues,” are mapped with other medical data sets and categorized in the format shown.


Evaluating Chatbot Outputs

Step 1: Internal Testing

It is important, especially for the medical domain, to assess the output of the chatbot internally with a content expert before conducting further testing. This process, known as red teaming , is carried out to make sure that the chatbot does not provide inappropriate results.

We started by reviewing the chatbot output with 3 caregivers of individuals with NDDs as well as the resources identified by the chatbot to assess their quality. Overall, these individuals with lived experience appreciated the resource quality; most of the suggestions were from trusted websites and did not include any advertisements or false information. The chatbot was also able to recognize, for instance, the location and age and provide different recommendations integrating this information ( Table 1 ).

Although query 1 and query 2 in Table 1 are similar, the top resource recommendation differs. When the age is included in the query, the system goes through a different pipeline and categorizes or ranks the resources on the basis of age as well. More specific resources will be ranked higher, and very general resources will be ranked lower. Similarly, if the location is added, then another ranking filter will be appended to the recommendations, and the list is sorted with 2 filters.

Query and title of web page | Reference

“Siblings of ADHD Children: Family Dynamics”[ ]

“ADHD Behavior: Expert Discipline Skills”[ ]

“Expert Answers to Common Questions About ADHD”[ ]

“Being Strength-Minded: An Introduction To Growth Mindset - Foothills Academy”[ ]

“Baby Registry Tips: Baby Clothes | Cando Kiddo”[ ]

“Alberta Child Health Benefit | Alberta.ca”[ ]

“Services — Qi Creative Inc.”[ ]

Step 2: Large-Scale Testing

Next, it is important to conduct a broader assessment of the chatbot output with a larger number of queries and evaluators. This is more challenging because patients or families have limited time. Similarly, pilot testing by clinicians is difficult to carry out.

We collaborated with undergraduate students who received training about NDDs and subsequently were asked to evaluate the resources provided by the chatbot. A total of 17 students signed up for the evaluation, but 3 (18%) dropped out, leaving 14 (82%) to evaluate the recommendations. The 14 students were divided into 3 subgroups, and each subgroup evaluated the recommendations individually for the full testing set. The evaluation sheet was divided into 4 tabs: behavioral concerns , support groups or programs , cognitive development , and other topics . Multiple user queries (from the interviews or feedback) were grouped together into each tab, and up to 50 recommendations were provided for each query. In a single subgroup, each member checked 1 query and all recommendations and answered the questions in the evaluation form ( Multimedia Appendix 8 ). Thus, each query and its recommendations were reviewed by 3 individuals, and a majority decision (2 out of 3) was considered the final label for the evaluation.
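The 2-of-3 majority decision used for the final label can be expressed as a one-line helper (the label strings are illustrative):

```python
# Majority (2 of 3) decision for the final relevance label of a
# recommendation; label values are illustrative.

def majority_label(labels):
    """labels: exactly three judgments, eg, 'relevant'/'not relevant'."""
    assert len(labels) == 3
    return max(set(labels), key=labels.count)

final = majority_label(["relevant", "not relevant", "relevant"])
```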

After the evaluation, the queries were subcategorized into NDD topics such as autism, ADHD, sleep concerns, and so on, and were analyzed further ( Figure 12 ). The analysis included the relevance scores for the top 10, top 15, and top 50 recommendations for each topic. The results varied for each topic; for ADHD, the relevance scores decreased from the top 10 recommendations to the top 15 recommendations but subsequently increased when considering the top 50 results. For autism, the relevance score gradually decreased as the considered recommendations increased, which shows that the initial recommendations are considered better and more relevant than the later recommendations. Upon analysis, we also found that the data in the database and the mapping keywords play an important role in the data ranking.

The ranking relies on multiple factors, such as location, category, Unified Medical Language System (UMLS) terms, Education Resources Information Center (ERIC) terms, type of resource required, and so on. Although we had approximately 11,000 resources that include the term ADHD or attention-deficit/hyperactivity disorder, the filters and ranking algorithms need a weighted mechanism to decide which resource to rank first. Regarding autism, 4169 resources in the CAMI database include the term, and CAMI performed better in terms of ranking, which shows that having more resources does not necessarily equate to better recommendations. We kept all weights at the same scale for our research, but varying the weights could improve the ranking.


While chatbots are emerging as a key aspect of the customer service workflow for several businesses, their use in the medical domain remains limited. Although several features of chatbots, such as 24/7 availability, accessibility from remote areas, and privacy, promise a bright future for the use of chatbots in health care, significant challenges remain; for instance, users expect chatbots to be able to process their query. This relies on NLP, and while there are extensive ontologies for general domains, an understanding of the medical domain, especially when specific to a subfield such as NDDs, remains limited. The vocabulary used by lay users requires the development of synonyms for most medical terms typically used by web resources that provide either core knowledge or specialized services. We found that, in addition to using the UMLS, involving medical experts as well as individuals with lived experience can greatly improve on this. This proved to be engaging for families of individuals with NDDs because they felt involved in the development of the chatbot. We would therefore encourage developers to form, as we did, advisory groups that include individuals with lived experience.

Another important aspect to consider when developing a medical domain chatbot is the level of complexity of most medical conditions. This may be related to sex differences, age-dependent differential manifestations, or associated conditions (comorbid conditions) that may influence the clinical presentation and the needs of the user; for example, an individual with an NDD is more at risk of developing seizures. This is important information for the user to be aware of when asking about, for instance, change in behavior. However, this information may not be easily available to computer scientists developing the chatbot. Even general practitioners may not consider this information shared knowledge. Interviewing domain-specific medical experts is challenging due to access or time constraints. Therefore, we developed an alternative approach by using a large data set of information and integrating the data into a knowledge graph. This allowed us to identify coassociated conditions and provide the chatbot with knowledge of the topic.

In addition to these more conceptual aspects, several practical points need to be considered when developing a chatbot in the medical domain. Possibly the most important concerns how to manage a highly multidisciplinary team (medical, social science, and computer science professionals as well as individuals with lived experience) with very limited overlap in background knowledge. First, we noted the importance of involving potential users (caregivers) in identifying the needs of users, the features to be included, and the user interface best suited for efficient and useful mobilization of knowledge. We found that it was very important to bring together individuals who share a common interest in the topic to surmount the challenges related to differences in language and background. Indeed, we noted that an agile-inspired approach was extremely important because caregivers could better identify needs and refine approaches as they were presented with multiple iterations of the chatbot [ 59 ]. The agile approach has gained popularity because it focuses more on the potential users and the results, which enables projects to swiftly adapt to new requirements or changes as they arise.

Second, we identified several technical points that made knowledge mobilization particularly challenging for a chatbot when using automation, as opposed to manual human curation. We observed from all our user experience testing that, to be useful, the chatbot needed a wide array of resources from which to make recommendations (as was pointed out during the initial steps with the advisory group). However, we found it difficult to identify a labeled set of such resources that could be used by the chatbot. Furthermore, we noted that, for the most part, resources were labeled manually by each individual group focused on a specific health-related entity. This made it difficult for the chatbot to identify the relevant resources using a common set of vocabulary. Related to this is the difficulty for the chatbot to provide resources according to a rank most useful to users; for instance, some websites (eg, services) may have more content and therefore might include more keywords of interest than other sites with less content (which includes a list of telephone numbers for a service). At the moment, it is still unclear how simpler sites, which may be more impactful to the user, could be added.

Finally, developing a chatbot that uses a coaching approach to engage with the user and build a long-term relationship proved to be difficult. We incorporated several elements to foster engagement (customization, allowing resources to be shared, and providing expert tips), but recreating the relationship between a caregiver and a human coach remains a complex task that will require further research in mood analysis, personalization based on deeper understanding of a user’s inner mental model (which can vary significantly between users), and an awareness of the course of the disease.

Limitations

Individuals with NDDs can present with different clinical features based on age and sex. Unfortunately, there is currently no database containing such longitudinal data. In the future, user interactions, such as resource ratings, combined with data on individuals with NDDs and cluster analysis approaches, could be used to recommend age-, sex-, or phenotypic profile–specific resources [ 60 - 63 ].
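As a minimal sketch of this idea, ratings could be aggregated within coarse user profiles before a fuller cluster analysis is applied. The profile fields and data shape below are assumptions for illustration only, not the project's actual schema:

```python
from collections import defaultdict

def profile_top_resources(interactions, top_n=2):
    """Average resource ratings within each (age_band, sex) profile and
    return the best-rated resources per profile. A stand-in for a fuller
    cluster-analysis pipeline; field names are illustrative."""
    ratings = defaultdict(lambda: defaultdict(list))
    for row in interactions:
        profile = (row["age_band"], row["sex"])
        ratings[profile][row["resource"]].append(row["rating"])

    recommendations = {}
    for profile, by_resource in ratings.items():
        mean = {res: sum(r) / len(r) for res, r in by_resource.items()}
        recommendations[profile] = sorted(mean, key=mean.get, reverse=True)[:top_n]
    return recommendations
```

A clustering step (eg, k-means over phenotypic features) could replace the fixed (age_band, sex) key to derive profiles from the data instead of predefining them.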

The conversation between the chatbot and the user is still scripted to some degree, although the user can branch out by navigating to the related topics suggested by the chatbot or simply clicking on the tags presented with the resources returned for their query. In the future, we would like to expand on this with a question-generation system that uses the knowledge graph of entities related to NDDs to ask users questions and thereby provide more specific recommendations [ 64 ]. Such a system could allow parents to access the information they need without having to answer too many questions.
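One simple realization of knowledge graph–driven question generation is to turn each outgoing edge of an entity into a templated follow-up question. The graph fragment, relation names, and templates below are hypothetical examples, not the project's actual knowledge graph:

```python
# Hypothetical fragment of an NDD knowledge graph: entity -> relation -> targets.
KNOWLEDGE_GRAPH = {
    "ADHD": {
        "has_symptom": ["inattention", "hyperactivity"],
        "has_support": ["behavioural therapy", "school accommodations"],
    },
}

# One question template per relation type.
TEMPLATES = {
    "has_symptom": "Is {entity} mainly showing {options}?",
    "has_support": "Are you looking for {options} related to {entity}?",
}

def generate_questions(entity):
    """Turn each outgoing edge of an entity into a follow-up question
    the chatbot could ask to narrow down a recommendation."""
    questions = []
    for relation, targets in KNOWLEDGE_GRAPH.get(entity, {}).items():
        options = " or ".join(targets)
        questions.append(TEMPLATES[relation].format(entity=entity, options=options))
    return questions
```

The user's answer would then select a subgraph, and the same procedure could be repeated on the chosen entity to drill down further.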

From a technical point of view, querying the database imposes a significant load, resulting in prolonged wait times that may degrade the user experience. Choosing a more suitable database or adding indexes for the database queries can expedite data processing, but this would require substantial changes to the database model and extraction queries [ 65 ]. In addition, regularly checking websites for annotation updates is crucial to ensure their continued operation and content consistency. Some websites do not follow coding standards, and it is very difficult to spot abandoned or corrupted web pages and ensure that they do not appear in the suggestions. We could potentially develop a script that runs automatically on a regular schedule to find and remove pages that no longer exist or have broken links. A website's status can be confirmed either by checking the response obtained with the Python Requests module or by using the Linux ping command [ 66 , 67 ]. However, the authenticity of the website needs to be reverified after each data update to ensure that false information is not disseminated.
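Such a periodic sweep could be sketched with the Requests module as follows; this is an illustrative snippet rather than the deployed script, and real resource URLs are assumed:

```python
import requests

def is_alive(url, timeout=5):
    """Return True when the site responds with a non-error status.
    A HEAD request keeps the periodic sweep lightweight; some servers
    reject HEAD, so we fall back to GET before giving up."""
    try:
        resp = requests.head(url, timeout=timeout, allow_redirects=True)
        if resp.status_code >= 400:
            resp = requests.get(url, timeout=timeout)
        return resp.status_code < 400
    except requests.RequestException:
        # Connection errors, timeouts, and malformed URLs all count as dead.
        return False

def prune_dead_links(urls):
    # Keep only resources that still resolve; this could run on a
    # schedule (eg, a nightly cron job) before content is re-verified.
    return [u for u in urls if is_alive(u)]
```

A status check alone does not validate content, so as noted above, a separate authenticity review would still be needed after each update.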

Finally, in its current version, CAMI is only available in English; making it accessible to newcomers to the country, immigrants, and people who are more comfortable in another language would allow the chatbot to reach, and provide resources to, a larger audience [ 68 , 69 ].

Future Plan

We would like to highlight the importance, for our project, of creating a large data set of longitudinal data that tracks patient information over time through the chatbot. This would allow the system to monitor patient behavior and proactively offer solutions or resources at an early stage. This type of information will also be crucial for other artificial intelligence tools aimed at predicting outcomes and assessing intervention impact.

Another key requirement for chatbots operating in the medical domain will be a precise NLP system, which necessitates creating a lexicon of domain-specific, medically relevant terms alongside the layperson language that the language models must recognize. Along the same lines, a truly dynamic system with automatic question generation will make the chatbot smarter in its conversational manner: beyond the user asking general questions, the specificity of interactions will be shaped by the responses and queries the user types. Better questions will lead to better, more specific responses and better recommendations [ 70 - 72 ].
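One lightweight form such a lexicon could take is a mapping from layperson phrasing to the clinical terms an entity linker or ontology lookup would recognize. The entries below are invented examples for illustration, not an actual medical vocabulary:

```python
# Illustrative lexicon: layperson phrase -> clinical term. A production
# lexicon would be curated against standard vocabularies (eg, UMLS- or
# HPO-style term sets) rather than hand-written.
LAY_TO_CLINICAL = {
    "trouble focusing": "inattention",
    "meltdowns": "emotional dysregulation",
    "late to talk": "speech delay",
}

def canonicalize(user_text):
    """Replace layperson phrases with clinical terms before the text
    reaches intent classification or entity linking."""
    text = user_text.lower()
    for lay, clinical in LAY_TO_CLINICAL.items():
        text = text.replace(lay, clinical)
    return text
```

Normalizing input this way lets a single set of trained intents cover both clinical and everyday phrasings of the same concept.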

Authors' Contributions

FB conceptualized the study, obtained funding, and managed the project. OZ, AM, and DN assisted with the conceptualization, navigation, and development of the chatbot. MZR and OZ assisted with the technical validation of the project. AS developed and designed the architecture and backend of the chatbot and cowrote the paper. RK developed the user interface of the chatbot and cowrote the paper. MK assisted with the knowledge graph and matching engine of the chatbot and cowrote the paper. MZR assisted with the validation of the knowledge graph. CR assisted with interviews with individuals with lived experience as well as managed the resource identification team, user data validation, and interviews for requirement gathering and analysis. KK assisted with the parent advisory group and provided feedback about the chatbot from the point of view of an individual with lived experience. NR assisted with the development of the text classification model that was used in natural language processing. AM also assisted with feedback on the paper. TAB assisted with the development of the CAMI (Coaching Assistant for Medical/Health Information) logo as well as other visual aspects of the chatbot. TO assisted with the development of the parent advisory group and initial interviews.

Conflicts of Interest

None declared.

Semistructured interview consent and questions.

Coaching Assistant for Medical/Health Information home page (top portion).

Coaching Assistant for Medical/Health Information home page (bottom portion).

Coaching Assistant for Medical/Health Information sign-up page.

Coaching Assistant for Medical/Health Information chat interface.

Coaching Assistant for Medical/Health Information resource card.

Coaching Assistant for Medical/Health Information chat radio buttons that enable users to explore related conditions.

Coaching Assistant for Medical/Health Information questions for recommendation evaluation.

  • Følstad A, Araujo T, Papadopoulos S, Law EL, Granmo OC, Luger E, et al. Chatbot Research and Design. Cham, Switzerland. Springer; 2020.
  • Grassi L, Recchiuto CT, Sgorbissa A. Knowledge-grounded dialogue flow management for social robots and conversational agents. Int J Soc Robot. 2022;14(5):1273-1293. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Youper homepage. Youper. URL: https://www.youper.ai/ [accessed 2023-03-07]
  • Welltory homepage. Welltory. URL: https://app.welltory.com/landing/main/ [accessed 2023-03-07]
  • Meditation and sleep made simple - Headspace. Headspace. URL: https://www.headspace.com/ [accessed 2023-03-07]
  • MindShift® CBT App. Anxiety Canada. URL: https://www.anxietycanada.com/resources/mindshift-cbt/ [accessed 2023-03-07]
  • ChatGPT 3.5. ChatGPT. URL: https://chat.openai.com/ [accessed 2024-05-21]
  • Child functional characteristics explain child and family outcomes better than diagnosis: population-based study of children with autism or other neurodevelopmental disorders/disabilities. Statistics Canada. Jun 15, 2016. URL: https://www150.statcan.gc.ca/n1/pub/82-003-x/2016006/article/14635-eng.htm [accessed 2023-03-06]
  • mySugr Global - Make diabetes suck less. mySugr. URL: https://www.mysugr.com/en/ [accessed 2023-03-07]
  • GlucoseZone® homepage. GlucoseZone. URL: https://glucosezone.com/home [accessed 2023-03-07]
  • Woebot health homepage. Woebot Health. URL: https://woebothealth.com/ [accessed 2023-03-07]
  • Shevat A. Designing Bots: Creating Conversational Experiences. Sebastopol, CA. O'Reilly Media; 2017.
  • Ansari NJ, Dhongade RK, Lad PS, Borade A, Yg S, Yadav V, et al. Study of parental perceptions on health and social needs of children with neuro-developmental disability and it's impact on the family. J Clin Diagn Res. Dec 2016;10(12):SC16-SC20. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Wang CH, Lee TY, Hui KC, Chung MH. Mental disorders and medical comorbidities: association rule mining approach. Perspect Psychiatr Care. Jul 2019;55(3):517-526. [ CrossRef ] [ Medline ]
  • Dewey D. What is comorbidity and why does it matter in neurodevelopmental disorders? Curr Dev Disord Rep. Sep 22, 2018;5(4):235-242. [ CrossRef ]
  • Levy J. UX Strategy: How to Devise Innovative Digital Products that People Want. Sebastopol, CA. O'Reilly Media; 2015.
  • Krug S. Don't Make Me Think, Revisited: A Common Sense Approach to Web Usability. Indianapolis, IN. New Riders; 2014.
  • Unger R, Chandler C. A Project Guide to UX Design: For User Experience Designers in the Field or in the Making. London, UK. Pearson Education; 2009.
  • Delve HL, Limpaecher A. How to do thematic analysis. Delve. 2020. URL: https://delvetool.com/blog/thematicanalysis [accessed 2024-03-01]
  • Holtrop JS, Scherer LD, Matlock DD, Glasgow RE, Green LA. The importance of mental models in implementation science. Front Public Health. 2021;9:680316. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Default. AIDE Canada. URL: https://library.aidecanada.ca/ [accessed 2023-05-10]
  • InformAlberta.ca. InformAlberta. URL: https://informalberta.ca/public/common/index_Search.do [accessed 2023-03-09]
  • Costello J, Kaur M, Reformat MZ, Bolduc FV. Leveraging knowledge graphs and natural language processing for automated web resource labeling and knowledge mobilization in neurodevelopmental disorders: development and usability study. J Med Internet Res. Apr 17, 2023;25:e45268. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Rohrer C. When to use which user-experience research methods. Nielsen Norman Group. Jul 17, 2022. URL: https://www.nngroup.com/articles/which-ux-research-methods/ [accessed 2022-10-03]
  • Figma: the collaborative interface design tool. Figma. URL: https://www.figma.com/ [accessed 2024-02-27]
  • Sharma S, Sarkar D, Gupta D. Agile processes and methodologies: a conceptual study. Int J Comput Sci Eng. May 2012;4(5). [ FREE Full text ]
  • Todorovic D. Gestalt principles. Scholarpedia. 2008;3(12):5345. [ CrossRef ]
  • cami3.0. GitHub. URL: https://github.com/ashwani227/cami3.0/blob/master/frontend/src/videos/CAMI.mp4 [accessed 2024-05-21]
  • Majnemer A, O'Donnell M, Ogourtsova T, Kasaai B, Ballantyne M, Cohen E, et al. BRIGHT coaching: a randomized controlled trial on the effectiveness of a developmental coach system to empower families of children with emerging developmental delay. Front Pediatr. 2019;7:332. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Ogourtsova T, O'Donnell M, De Souza Silva W, Majnemer A. Health coaching for parents of children with developmental disabilities: a systematic review. Dev Med Child Neurol. Nov 2019;61(11):1259-1265. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Ghanouni P, Quirke S, Blok J, Casey A. Independent living in adults with autism spectrum disorder: stakeholders' perspectives and experiences. Res Dev Disabil. Dec 2021;119:104085. [ CrossRef ] [ Medline ]
  • Chowdhary KR. Natural language processing. In: Fundamentals of Artificial Intelligence. New Delhi, India. Springer; 2020.
  • Lai T, Ji H, Zhai C, Tran QH. Joint biomedical entity and relation extraction with knowledge-enhanced collective inference. In: Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing. 2021. Presented at: ACL-IJCNLP 2021; August 1-6, 2021; Online. [ CrossRef ]
  • El-allaly ED, Sarrouti M, En-Nahnahi N, Ouatik El Alaoui S. MTTLADE: a multi-task transfer learning-based method for adverse drug events extraction. Inf Process Manage. May 2021;58(3):102473. [ CrossRef ]
  • Zhang Y, Lin H, Yang Z, Wang J, Zhang S, Sun Y, et al. A hybrid model based on neural networks for biomedical relation extraction. J Biomed Inform. May 2018;81:83-92. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Wit.ai homepage. Wit.ai. URL: https://wit.ai/ [accessed 2024-02-27]
  • Conversational AI platform. Rasa. URL: https://rasa.com/ [accessed 2024-02-27]
  • Dialogflow. Google Cloud. URL: https://cloud.google.com/dialogflow [accessed 2024-02-27]
  • Falci SG, Marques LS. CONSORT: when and how to use it. Dental Press J Orthod. 2015;20(3):13-15. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • cami3.0. GitHub. URL: https://github.com/ashwani227/cami3.0 [accessed 2024-05-21]
  • Canadian Network for Personalized Interventions in Intellectual Disability. University of Alberta. URL: https://cami.med.ualberta.ca [accessed 2024-05-21]
  • Fensel D, Şimşek U, Angele K, Huaman E, Kärle E, Panasiuk O, et al. Introduction: what is a knowledge graph? In: Knowledge Graphs: Methodology, Tools and Selected Use Cases. Cham, Switzerland. Springer; 2020.
  • Zou X. A survey on application of knowledge graph. J Phys Conf Ser. Mar 2020;1487(1):012016. [ CrossRef ]
  • Wang X, Luo Z, He R, Shao Y. Novel medical question and answer system: graph convolutional neural network based with knowledge graph optimization. Expert Syst Appl. Oct 2023;227:120211. [ CrossRef ]
  • Li L, Wang P, Yan J, Wang Y, Li S, Jiang J, et al. Real-world data medical knowledge graph: construction and applications. Artif Intell Med. Mar 2020;103:101817. [ CrossRef ] [ Medline ]
  • Chen Y, Guo Y, Fan Q, Zhang Q, Dong Y. Health-aware food recommendation based on knowledge graph and multi-task learning. Foods. May 22, 2023;12(10):2079. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Seneviratne O, Harris J, Chen CH, McGuinness DL. Personal health knowledge graph for clinically relevant diet recommendations. arXiv. Preprint posted online October 19, 2021. [ FREE Full text ]
  • Chen Y, Sinha B, Ye F, Tang T, Wu R, He M, et al. Prostate cancer management with lifestyle intervention: from knowledge graph to Chatbot. Clin Transl Dis. Feb 20, 2022;2(1):e29. [ CrossRef ]
  • Liu Z, Li X, You Z, Yang T, Fan W, Yu P. Medical triage chatbot diagnosis improvement via multi-relational hyperbolic graph neural network. In: Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR ’21). 2021. Presented at: SIGIR '21; July 11-15, 2021; Virtual Event, Canada. [ CrossRef ]
  • Ma Y, Chen G, Yan W, Xu B, Qi J. Artificial intelligence chronic disease management system based on medical resource perception. In: Sun X, Zhang X, Xia Z, Bertino E, editors. Artificial Intelligence and Security. Cham, Switzerland. Springer; 2021.
  • Deciphering Developmental Disorders (DDD) project - Homepage. Deciphering Developmental Disorders. URL: https://www.ddduk.org/ [accessed 2023-05-29]
  • ADDitude asked: parent-to-parent. ADDitude. URL: https://www.additudemag.com/additude-asked-parent-to-parent/ [accessed 2024-05-23]
  • Loftin A. How to encourage good behavior in children with ADHD. ADDitude. May 10, 2021. URL: https://www.additudemag.com/adhd-behavior-expert-discipline-skills/ [accessed 2024-05-23]
  • Jones C, Brady C, Hirschfeld RM, Barkley R. Expert answers to common questions about ADHD. ADDitude. Nov 7, 2018. URL: https://www.additudemag.com/expert-answers-adhd-myths-facts/ [accessed 2024-05-23]
  • Yee A. Being strength-minded: an introduction to growth mindset. Foothills Academy. Nov 7, 2018. URL: https://www.foothillsacademy.org/community/articles/growth-mindset [accessed 2024-05-23]
  • Baby registry tips: baby clothes. CanDo Kiddo. URL: https://www.candokiddo.com/news/baby-registry-tips-baby-clothes [accessed 2024-05-23]
  • Alberta child health benefit. Government of Alberta. URL: https://www.alberta.ca/alberta-child-health-benefit [accessed 2024-05-23]
  • Qi Creative homepage. Qi Creative. URL: https://www.qicreative.com/services [accessed 2024-05-23]
  • Keita B. What are the reasons behind agile popularity? Invensis. URL: https://www.invensislearning.com/blog/reasons-behind-agile-popularity/ [accessed 2022-10-26]
  • Testa D, Jourde-Chiche N, Mancini J, Varriale P, Radoszycki L, Chiche L. Unsupervised clustering analysis of data from an online community to identify lupus patient profiles with regards to treatment preferences. Lupus. Oct 2021;30(11):1837-1843. [ CrossRef ] [ Medline ]
  • Najafabadi MK, Mahrin MN, Chuprat S, Sarkan HM. Improving the accuracy of collaborative filtering recommendations using clustering and association rules mining on implicit data. Comput Hum Behav. Feb 2017;67:113-128. [ CrossRef ]
  • Vitolo M, Proietti M, Shantsila A, Boriani G, Lip GY. Clinical phenotype classification of atrial fibrillation patients using cluster analysis and associations with trial-adjudicated outcomes. Biomedicines. Jul 20, 2021;9(7):843. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Zou J, Luo JF, Shen Y, Cai JF, Guan JL. Cluster analysis of phenotypes of patients with Behçet's syndrome: a large cohort study from a referral center in China. Arthritis Res Ther. Jan 30, 2021;23(1):45. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Zeng J, Nakano YI. Exploiting a large-scale knowledge graph for question generation in food preference interview systems. In: Proceedings of the Companion Proceedings of the 25th International Conference on Intelligent User Interfaces. 2020. Presented at: IUI '20 Companion; March 17-20, 2020; Cagliari, Italy. [ CrossRef ]
  • Fortis AE. Indexing research papers in open access databases. arXiv. Preprint posted online May 28, 2009. [ FREE Full text ]
  • Matotek D, Turnbull J, Lieverdink P. Pro Linux System Administration: Learn to Build Systems for Your Business Using Free and Open Source Software. New York, NY. Apress; 2017.
  • Chandra RV, Varanasi BS. Python Requests Essentials. Mumbai, India. Packt Publishing Pvt Ltd; 2015.
  • O'Hare S. 7 reasons why having a multi-language site benefits your business. Weglot. URL: https://weglot.com/blog/reasons-why-a-multi-language-site-benefits-your-business/ [accessed 2022-10-31]
  • Fowler W. 9 reasons a multi-language website benefits your business. TechSpective. Sep 19, 2018. URL: https://techspective.net/2018/09/19/9-reasons-a-multi-language-website-benefits-your-business/ [accessed 2022-10-31]
  • Hapke H, Howard C, Lane H. Natural Language Processing in Action: Understanding, Analyzing, and Generating Text with Python. New York, NY. Manning Publications; 2019.
  • Ni P, Okhrati R, Guan S, Chang V. Knowledge graph and deep learning-based text-to-GQL model for intelligent medical consultation chatbot. Inf Syst Front. Jul 06, 2022:1-19. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Weeks R, Cooper L, Sangha P, Sedoc J, White S, Toledo A, et al. Chatbot-delivered COVID-19 vaccine communication message preferences of young adults and public health workers in urban American communities: qualitative study. J Med Internet Res. Jul 06, 2022;24(7):e38418. [ FREE Full text ] [ CrossRef ] [ Medline ]

Abbreviations

ADHD: attention-deficit/hyperactivity disorder
CAMI: Coaching Assistant for Medical/Health Information
CONSORT: Consolidated Standards of Reporting Trials
DDD: Deciphering Developmental Disorders
ERIC: Education Resources Information Center
HPO: Human Phenotype Ontology
LLM: large language model
NDD: neurodevelopmental disability/difference
NLP: natural language processing
UMLS: Unified Medical Language System

Edited by A Mavragani; submitted 22.06.23; peer-reviewed by M Chatzimina; comments to author 21.07.23; revised version received 27.07.23; accepted 19.04.24; published 18.06.24.

©Ashwani Singla, Ritvik Khanna, Manpreet Kaur, Karen Kelm, Osmar Zaiane, Cory Scott Rosenfelt, Truong An Bui, Navid Rezaei, David Nicholas, Marek Z Reformat, Annette Majnemer, Tatiana Ogourtsova, Francois Bolduc. Originally published in the Journal of Medical Internet Research (https://www.jmir.org), 18.06.2024.

This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research (ISSN 1438-8871), is properly cited. The complete bibliographic information, a link to the original publication on https://www.jmir.org/, as well as this copyright and license information must be included.

research article on user interface

Introducing Microsoft 365 Copilot – your copilot for work

Mar 16, 2023 | Jared Spataro - CVP, AI at Work

  • Share on Facebook (opens new window)
  • Share on Twitter (opens new window)
  • Share on LinkedIn (opens new window)

Screenshot Microsoft 365 Copilot

Humans are hard-wired to dream, to create, to innovate. Each of us seeks to do work that gives us purpose — to write a great novel, to make a discovery, to build strong communities, to care for the sick. The urge to connect to the core of our work lives in all of us. But today, we spend too much time consumed by the drudgery of work on tasks that zap our time, creativity and energy. To reconnect to the soul of our work, we don’t just need a better way of doing the same things. We need a whole new way to work.

Today, we are bringing the power of next-generation AI to work. Introducing Microsoft 365 Copilot — your copilot for work . It combines the power of large language models (LLMs) with your data in the Microsoft Graph and the Microsoft 365 apps to turn your words into the most powerful productivity tool on the planet.

“Today marks the next major step in the evolution of how we interact with computing, which will fundamentally change the way we work and unlock a new wave of productivity growth,” said Satya Nadella, Chairman and CEO, Microsoft. “With our new copilot for work, we’re giving people more agency and making technology more accessible through the most universal interface — natural language.”

Copilot is integrated into Microsoft 365 in two ways. It works alongside you, embedded in the Microsoft 365 apps you use every day — Word, Excel, PowerPoint, Outlook, Teams and more — to unleash creativity, unlock productivity and uplevel skills. Today we’re also announcing an entirely new experience: Business Chat . Business Chat works across the LLM, the Microsoft 365 apps, and your data — your calendar, emails, chats, documents, meetings and contacts — to do things you’ve never been able to do before. You can give it natural language prompts like “Tell my team how we updated the product strategy,” and it will generate a status update based on the morning’s meetings, emails and chat threads.

With Copilot, you’re always in control. You decide what to keep, modify or discard. Now, you can be more creative in Word, more analytical in Excel, more expressive in PowerPoint, more productive in Outlook and more collaborative in Teams.

Microsoft 365 Copilot transforms work in three ways:

Unleash creativity. With Copilot in Word, you can jump-start the creative process so you never start with a blank slate again. Copilot gives you a first draft to edit and iterate on — saving hours in writing, sourcing, and editing time. Sometimes Copilot will be right, other times usefully wrong — but it will always put you further ahead. You’re always in control as the author, driving your unique ideas forward, prompting Copilot to shorten, rewrite or give feedback. Copilot in PowerPoint helps you create beautiful presentations with a simple prompt, adding relevant content from a document you made last week or last year. And with Copilot in Excel, you can analyze trends and create professional-looking data visualizations in seconds.

Unlock productivity. We all want to focus on the 20% of our work that really matters, but 80% of our time is consumed with busywork that bogs us down. Copilot lightens the load. From summarizing long email threads to quickly drafting suggested replies, Copilot in Outlook helps you clear your inbox in minutes, not hours. And every meeting is a productive meeting with Copilot in Teams. It can summarize key discussion points — including who said what and where people are aligned and where they disagree — and suggest action items, all in real time during a meeting. And with Copilot in Power Platform, anyone can automate repetitive tasks, create chatbots and go from idea to working app in minutes.

GitHub data shows that Copilot promises to unlock productivity for everyone. Among developers who use GitHub Copilot, 88% say they are more productive, 74% say that they can focus on more satisfying work, and 77% say it helps them spend less time searching for information or examples.

But Copilot doesn’t just supercharge individual productivity. It creates a new knowledge model for every organization — harnessing the massive reservoir of data and insights that lies largely inaccessible and untapped today. Business Chat works across all your business data and apps to surface the information and insights you need from a sea of data — so knowledge flows freely across the organization, saving you valuable time searching for answers. You will be able to access Business Chat from Microsoft 365.com, from Bing when you’re signed in with your work account, or from Teams.

Uplevel skills. Copilot makes you better at what you’re good at and lets you quickly master what you’ve yet to learn. The average person uses only a handful of commands — such as “animate a slide” or “insert a table” — from the thousands available across Microsoft 365. Now, all that rich functionality is unlocked using just natural language. And this is only the beginning.

Copilot will fundamentally change how people work with AI and how AI works with people. As with any new pattern of work, there’s a learning curve — but those who embrace this new way of working will quickly gain an edge.

Screenshot Microsoft 365 Copilot

The Copilot System: Enterprise-ready AI

Microsoft is uniquely positioned to deliver enterprise-ready AI with the Copilot System . Copilot is more than OpenAI’s ChatGPT embedded into Microsoft 365. It’s a sophisticated processing and orchestration engine working behind the scenes to combine the power of LLMs, including GPT-4, with the Microsoft 365 apps and your business data in the Microsoft Graph — now accessible to everyone through natural language.

Grounded in your business data. AI-powered LLMs are trained on a large but limited corpus of data. The key to unlocking productivity in business lies in connecting LLMs to your business data — in a secure, compliant, privacy-preserving way. Microsoft 365 Copilot has real-time access to both your content and context in the Microsoft Graph. This means it generates answers anchored in your business content — your documents, emails, calendar, chats, meetings, contacts and other business data — and combines them with your working context — the meeting you’re in now, the email exchanges you’ve had on a topic, the chat conversations you had last week — to deliver accurate, relevant, contextual responses.

Built on Microsoft’s comprehensive approach to security, compliance and privacy. Copilot is integrated into Microsoft 365 and automatically inherits all your company’s valuable security, compliance, and privacy policies and processes. Two-factor authentication, compliance boundaries, privacy protections, and more make Copilot the AI solution you can trust.

Architected to protect tenant, group and individual data. We know data leakage is a concern for customers. Copilot LLMs are not trained on your tenant data or your prompts. Within your tenant, our time-tested permissioning model ensures that data won’t leak across user groups. And on an individual level, Copilot presents only data you can access using the same technology that we’ve been using for years to secure customer data.

Integrated into the apps millions use every day. Microsoft 365 Copilot is integrated in the productivity apps millions of people use and rely on every day for work and life — Word, Excel, PowerPoint, Outlook, Teams and more. An intuitive and consistent user experience ensures it looks, feels and behaves the same way in Teams as it does in Outlook, with a shared design language for prompts, refinements and commands.

Designed to learn new skills.  Microsoft 365 Copilot’s foundational skills are a game changer for productivity: It can already create, summarize, analyze, collaborate and automate using your specific business content and context. But it doesn’t stop there. Copilot knows how to command apps (e.g., “animate this slide”) and work across apps, translating a Word document into a PowerPoint presentation. And Copilot is designed to learn new skills. For example, with Viva Sales, Copilot can learn how to connect to CRM systems of record to pull customer data — like interaction and order histories — into communications. As Copilot learns about new domains and processes, it will be able to perform even more sophisticated tasks and queries.

Committed to building responsibly

At Microsoft, we are guided by our AI principles and Responsible AI Standard and decades of research on AI, grounding and privacy-preserving machine learning. A multidisciplinary team of researchers, engineers and policy experts reviews our AI systems for potential harms and mitigations — refining training data, filtering to limit harmful content, query- and result-blocking sensitive topics, and applying Microsoft technologies like InterpretML and Fairlearn to help detect and correct data bias. We make it clear how the system makes decisions by noting limitations, linking to sources, and prompting users to review, fact-check and adjust content based on subject-matter expertise.

Moving boldly as we learn  

In the months ahead, we’re bringing Copilot to all our productivity apps—Word, Excel, PowerPoint, Outlook, Teams, Viva, Power Platform, and more. We’ll share more on pricing and licensing soon. Earlier this month we announced Dynamics 365 Copilot as the world’s first AI Copilot in both CRM and ERP to bring the next-generation AI to every line of business.

Everyone deserves to find purpose and meaning in their work — and Microsoft 365 Copilot can help. To serve the unmet needs of our customers, we must move quickly and responsibly, learning as we go. We’re testing Copilot with a small group of customers to get feedback and improve our models as we scale, and we will expand to more soon.

Learn more on the Microsoft 365 blog and visit WorkLab to get expert insights on how AI will create a brighter future of work for everyone.

And for all the blogs, videos and assets related to today’s announcements, please visit our microsite .

Tags: AI , Microsoft 365 , Microsoft 365 Copilot

  • Check us out on RSS

research article on user interface

Cart

  • SUGGESTED TOPICS
  • The Magazine
  • Newsletters
  • Managing Yourself
  • Managing Teams
  • Work-life Balance
  • The Big Idea
  • Data & Visuals
  • Reading Lists
  • Case Selections
  • HBR Learning
  • Topic Feeds
  • Account Settings
  • Email Preferences

Research: How Remote Work Impacts Women at Different Stages of Their Careers

  • Natalia Emanuel,
  • Emma Harrington,
  • Amanda Pallais

research article on user interface

Data on software engineers at a Fortune 500 company revealed that junior and senior women saw contrasting costs and benefits.

While much has been said about the potential benefits of remote work for women, recent research examines how working from home affects the professional development of female software engineers at a Fortune 500 company, revealing that its impact varies by career stage. Junior women engineers benefit significantly from in-person mentorship, receiving 40% more feedback when sitting near colleagues, while senior women face reduced productivity due to increased mentoring duties. Male engineers also benefit from proximity, but less so. The authors suggest that recognizing and rewarding mentorship efforts could mitigate these disparities, ensuring junior women receive adequate support remotely and senior women are properly compensated for their mentoring contributions.

Since the pandemic began, work from home (WFH) has at times been pitched as a means of supporting women in the workplace. This argument often focuses on WFH’s potential to help women juggle the demands of their jobs with the demands of their families. However, WFH’s impact on women’s professional development may vary over their careers. In our research, we explored how WFH impacts young women as they try to get a foothold in their careers and how it affects the often-invisible mentorship work done by more senior women.

research article on user interface

  • Natalia Emanuel serves as a research economist at the New York Fed.
  • Emma Harrington is a professor at the University of Virginia.
  • Amanda Pallais is a professor at Harvard University.

