
JAMIA Open. 2021 Apr; 4(2): ooab042.

Automatic gender detection in Twitter profiles for health-related cohort studies

Yuan-Chi Yang

1 Department of Biomedical Informatics, School of Medicine, Emory University, Atlanta, Georgia, USA

Mohammed Ali Al-Garadi

1 Department of Biomedical Informatics, School of Medicine, Emory University, Atlanta, Georgia, USA

Jennifer S Love

2 Department of Emergency Medicine, School of Medicine, Oregon Health & Science University, Portland, Oregon, USA

Jeanmarie Perrone

3 Department of Emergency Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, USA

Abeed Sarker

1 Department of Biomedical Informatics, School of Medicine, Emory University, Atlanta, Georgia, USA

4 Department of Biomedical Engineering, Georgia Institute of Technology and Emory University, Atlanta, Georgia, USA

Received 2021 January 21; Revised 2021 April 27; Accepted 2021 May 4.

Supplementary Materials

ooab042_Supplementary_Data.

GUID: 8CD60598-0B9E-443F-ADC0-AD50F32C777E

Data Availability Statement

The data underlying this article cannot be shared publicly due to the Twitter API's terms of use and privacy concerns. The data will be shared on reasonable request to the corresponding author.

Abstract

Objective

Biomedical research involving social media data is gradually moving from population-level to targeted, cohort-level data analysis. Though crucial for biomedical studies, social media users' demographic information (eg, gender) is often not explicitly known from profiles. Here, we present an automatic gender classification system for social media, and we illustrate how gender information can be incorporated into a social media-based health-related study.

Materials and Methods

We used a large Twitter dataset composed of public, gender-labeled users (Dataset-1) for training and evaluating the gender detection pipeline. We experimented with machine learning algorithms including support vector machines (SVMs) and deep-learning models, and public packages including M3. We considered users' data including profile and tweets for classification. We also developed a meta-classifier ensemble that strategically uses the predicted scores from the classifiers. We then applied the best-performing pipeline to Twitter users who have self-reported nonmedical use of prescription medications (Dataset-2) to assess the system's utility.

Results and Discussion

We collected 67 181 and 176 683 users for Dataset-1 and Dataset-2, respectively. A meta-classifier involving SVM and M3 performed the best (Dataset-1 accuracy: 94.4% [95% confidence interval: 94.0–94.8%]; Dataset-2: 94.4% [95% confidence interval: 92.0–96.6%]). Including automatically classified information in the analyses of Dataset-2 revealed gender-specific trends: proportions of females closely resemble data from the National Survey on Drug Use and Health (NSDUH) 2018 (tranquilizers: 0.50 vs 0.50; stimulants: 0.50 vs 0.45), and the opioid overdose-related Emergency Department visits from the Nationwide Emergency Department Sample (pain relievers: 0.38 vs 0.37).

Keywords: natural language processing, machine learning, Twitter, user profiling, gender detection, toxicovigilance

INTRODUCTION

Social media data are increasingly being used for health-related research.1,2 Users often discuss personal experiences or opinions regarding a variety of health topics, such as health services or medications.1–3 Such information can be categorized, aggregated, and analyzed to obtain population-level insights,4–8 at low cost and in close to real time. It has thus been used as a resource for population health tasks such as influenza surveillance, pharmacovigilance, and toxicovigilance.9–11 While early research mostly attempted to conduct observational studies on entire populations (eg, Twitter users discussing influenza),12 some recent studies have been moving to targeted cohorts (eg, pregnant women,13 people in certain geo-locations,14 cancer patients,15 and people suffering from mental health issues16–19). Demographic information about such cohorts can help researchers investigate what roles demographics play in a given study, understand if social media is biased toward specific cohorts, and explicitly address these biases.20,21 Due to the importance of explicitly considering biological sex or gender in health research, funding agencies, including the National Institutes of Health, have emphasized the necessity to describe sex/gender data of the cohorts included in research studies (eg, through inclusion of women).22 This, however, presents a challenge for social media-based studies because the demographic information of the users is often not explicitly known.

One solution is to infer the demographic information from the users' metadata. In the past two decades, researchers have developed various automatic methods for characterizing users. Taking gender detection on Twitter as an example, researchers have investigated classification schemes based on the users' (screen) names, profile descriptions, tweets, profile colors, and even images, with machine learning algorithms such as support vector machines (SVMs), Naive Bayes, Decision Trees, Deep Neural Networks, and Bidirectional Encoder Representations from Transformers (BERT).23–33 Some have made their pipelines publicly available, and these have since been applied to social media mining tasks. For example, Sap et al26 released a lexicon for gender and age detection, which was applied in mental health research.16–18 Knowles et al28 released a package named Demographer to infer gender based on users' first names; it was later employed to infer gender in studies of influenza vaccination34 and mental health.19 Wang et al31 also released a multimodal deep learning system (M3) to infer gender based on users' profile information, including pictures, (screen) names, and descriptions. Though these existing pipelines can be directly applied to biomedical tasks, there is still room for improvement, particularly for Twitter data. First, none of these pipelines used all four of the users' textual attributes: names, screen names, descriptions, and tweets. This is a missed opportunity, and there is thus the possibility to further improve upon these models by developing a pipeline capable of incorporating these four attributes or more. Second, these experiments have not been validated on the same data, making it impossible to perform direct comparisons of their performances. Third, to the best of our knowledge, these pipelines were developed based on general users, but have not been tested on gender-labeled, domain-specific datasets. Benchmarking the performance variations due to domain change can inform researchers about the applicability of these pipelines to their specific tasks.

In this work, we aimed to develop a high-accuracy, automatic gender classification system and evaluated its performance and utility on a domain-specific dataset. In the following sections, we first describe our experiments with various unimodal and multimodal strategies and existing pipelines, and compare their performances on a unified platform. We then discuss the benchmarking of the best strategies on our domain-specific (toxicovigilance) dataset, consisting of a Twitter cohort of self-reported nonmedical consumers of prescription medications (PMs). The benchmarking involves evaluating performance scores on an annotated subset. To illustrate the utility of this pipeline, we applied the best-performing approach to compare the inferred gender proportions of a Twitter cohort with traditional, trusted sources.35,36 The source code for the gender detection experiments described will be made open source (https://bitbucket.org/sarkerlab/gender-detection-for-public).

LAY SUMMARY

To perform biomedical research using social media data on a targeted cohort, the users' demographic information (eg, gender) is typically required. However, this information is often not explicitly known from the user profile. One solution is to infer it from the user's public data via natural language processing and machine-learning techniques. In this work, we focused on estimating a user's gender and developed a highly accurate pipeline. We then applied the pipeline to a toxicovigilance cohort of Twitter users who have self-reported misuse of prescription medications (PMs), including tranquilizers, stimulants, and opioids. We found that the pipeline performs with high accuracy on this dataset. Additionally, the inferred gender proportions of those users are consistent with traditional surveys, including the National Survey on Drug Use and Health 2018 by the Substance Abuse and Mental Health Services Administration and the estimated overdose-related Emergency Department visits in 2016 from the Nationwide Emergency Department Sample. The results support that social media data can be harnessed as a complementary source to traditional surveys and can be used to understand the demographics of PM misuse in the United States. Our gender detection pipeline will be made publicly available to ensure transparency and support community-driven development.

MATERIALS AND METHODS

This study was approved by the Emory University institutional review board (IRB00114235).

Gender detection pipeline development

Data collection

We collected gender-labeled datasets for general Twitter users, released by previous work.25,33 The data from Liu and Ruths25 consist of 12 681 users with binary annotations obtained via crowdsourcing through Amazon Mechanical Turk.37 Each instance was coded by three annotators, and a label was accepted only if all three annotators agreed. The data from Volkova et al33 consist of 1 000 000 tweets, randomly sampled from the data in Burger et al,23 which were labeled using users' self-specified genders on Facebook or MySpace profiles linked to their Twitter accounts. Both datasets provide the users' IDs and gender labels. Our focus is to develop the informatics infrastructure to detect gender as Twitter users self-identify themselves on the social media platform, and we consider the two annotation methods to fall within this definition. We combined the two datasets and extracted users' publicly available data using the Twitter API, including profile metadata, such as handle names, descriptions, and profile colors, as well as the users' timelines (only English tweets were collected, while retweets were excluded; users who had no original English tweets were dropped). We called this dataset Dataset-1 and split it into training (60%), validation (20%), and test (20%) sets for pipeline development.
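
The 60/20/20 split described above can be sketched as follows. This is a minimal illustration with toy data and an assumed fixed random seed, not the authors' actual partitioning code:

```python
# Sketch: a 60/20/20 train/validation/test split of gender-labeled users.
# The input format (a list of (user_id, label) pairs) and the seed are assumptions.
import random

def split_dataset(users, seed=42):
    """Shuffle and split users into 60% training, 20% validation, 20% test."""
    users = list(users)
    random.Random(seed).shuffle(users)
    n = len(users)
    n_train = int(n * 0.6)
    n_val = int(n * 0.2)
    train = users[:n_train]
    val = users[n_train:n_train + n_val]
    test = users[n_train + n_val:]
    return train, val, test

train, val, test = split_dataset([(i, i % 2) for i in range(100)])
print(len(train), len(val), len(test))  # 60 20 20
```

Fixing the seed makes the split reproducible across runs, which matters when several classifiers are compared on the same validation and test sets.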

Classification

We first developed classifiers based on single attributes (ie, unimodal), including names and screen names, descriptions, tweets, and profile colors. We then experimented with building meta-classifiers based on the predicted scores from these classifiers (ie, multimodal). The flowchart in Figure 1 illustrates our processing pipeline. In the experiments, we considered machine learning algorithms including SVMs,38,39 Random Forest (RF),40 Bi-directional Long Short-Term Memory (BLSTM),41,42 and BERT,43,44 as well as existing resources including the lexica released by Sap et al,26 the Demographer system by Knowles et al,28 and the M3 system (without profile picture) by Wang et al.31 Below we briefly outline each experiment, with further details in Supplementary Table S1.

Figure 1. Gender classification pipeline, from user profile to gender label.

Name and screen name

We applied the Demographer28 package (DG) to the users' names. DG attempts to identify gender using character n-grams of a user's first name, trained using the list of given names from US Social Security data. Similar to DG, we trained an SVM classifier for screen names using character n-grams.
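
A character n-gram SVM of the kind described above can be sketched with scikit-learn. The n-gram range, regularization value, and the toy training examples are our assumptions for illustration, not the study's configuration or data:

```python
# Sketch of a character n-gram SVM for screen-name gender classification.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Toy examples; the actual system was trained on tens of thousands of labeled users.
screen_names = ["emma_jones91", "mike_the_dev", "sarah.k", "john_smith"]
labels = ["F", "M", "F", "M"]

clf = make_pipeline(
    # analyzer="char" extracts character n-grams instead of word tokens.
    CountVectorizer(analyzer="char", ngram_range=(1, 3), lowercase=True),
    LinearSVC(C=1.0),
)
clf.fit(screen_names, labels)
print(clf.predict(["anna_b"]))
```

Character n-grams are well suited to screen names because they capture sub-word cues (eg, common name endings) even when the string is not a dictionary word.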

Description

To classify gender using a user's description, we experimented with SVM, BLSTM, and BERT, approaches suited for free-text data. BERT is a transformer-based model that produces contextual vector representations of words and achieves state-of-the-art performance on many tasks.43,45 Many models with similar architectures have since been implemented and released.46,47

Each description was pre-processed by lowercasing and anonymizing URLs and user names. For SVM, the features are the normalized term frequencies of unigrams. For BLSTM and BERT, each word or character sequence was replaced with a dense vector, and the vectors were fed into the algorithms for training.
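
The pre-processing and the unigram term-frequency SVM can be sketched as below. The regex patterns, placeholder tokens, and toy descriptions are assumptions, not the authors' exact code:

```python
# Sketch of description pre-processing plus a normalized term-frequency SVM.
import re
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

def preprocess(description):
    """Lowercase and anonymize URLs and @-mentions, as described in the text."""
    text = description.lower()
    text = re.sub(r"https?://\S+", "<url>", text)
    text = re.sub(r"@\w+", "<user>", text)
    return text

docs = [preprocess(d) for d in [
    "Mom of two. Coffee lover. https://example.com",
    "Software engineer, gamer, @someteam fan",
]]
labels = ["F", "M"]

# use_idf=False yields normalized term frequencies rather than TF-IDF weights.
clf = make_pipeline(TfidfVectorizer(use_idf=False), LinearSVC())
clf.fit(docs, labels)
print(preprocess("Check @bob at https://x.co"))  # check <user> at <url>
```

Anonymizing URLs and mentions keeps the feature space focused on the wording of the description itself rather than on account-specific strings.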

Tweets

Focusing on users who have a substantial number of tweets, we selected users in the training data with at least 100 tweets and merged all collected tweets as the training texts for experiments on SVMs. The pre-processing is the same as that for the SVM classifier using descriptions. The regularization parameter was optimized according to the validation accuracy.

Colors

We utilized five features associated with profile colors: background color, link color, sidebar border color, sidebar fill color, and text color. Each profile color is represented using RGB values, each ranging from 0 to 255. We collapsed each value into 4 groups, yielding 64 groups for each color. We then experimented with SVM and RF.
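
The color quantization described above (each RGB channel collapsed into 4 groups, giving 4 × 4 × 4 = 64 groups per color) can be sketched as follows. The function names and the hex-string input format are ours:

```python
# Sketch: map a profile color to one of 64 groups by quantizing each RGB channel.
def quantize_channel(value):
    """Map a 0-255 channel value into one of 4 groups (0-3)."""
    return min(value // 64, 3)

def color_group(hex_color):
    """Map a hex color like 'C0DEED' to one of 64 groups (0-63)."""
    r, g, b = (int(hex_color[i:i + 2], 16) for i in (0, 2, 4))
    return 16 * quantize_channel(r) + 4 * quantize_channel(g) + quantize_channel(b)

# Twitter's historical default background color falls in the brightest group:
print(color_group("C0DEED"))  # 63
```

Collapsing each channel to 4 levels turns a sparse 16.7-million-value space into 64 coarse categories that a classifier can learn from with limited data.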

Meta-classifier

We experimented with building SVM models on the predicted scores from four different combinations of the classifiers:

  • meta-1: SVM on tweets and M3.

  • meta-2: SVM on tweets, M3, Demographer on name, and BERT on description.

  • meta-3: SVM on tweets, M3, and SVM on colors.

  • meta-4: DG on names, SVM on screen names, BERT on description, and SVM on tweets.
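
The stacking idea behind these meta-classifiers, an SVM trained on the component classifiers' predicted scores, can be sketched as below. The scores here are synthetic stand-ins; in the study they come from the trained base models (eg, the tweets-SVM and M3 for meta-1):

```python
# Sketch of a meta-classifier: an SVM over base-classifier scores (synthetic data).
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n = 200
labels = rng.integers(0, 2, n)  # toy binary gender labels

# Simulated base-classifier scores, noisily correlated with the label:
tweet_svm_score = labels + rng.normal(0, 0.6, n)
m3_score = labels + rng.normal(0, 0.5, n)
X = np.column_stack([tweet_svm_score, m3_score])

meta = SVC(kernel="linear").fit(X, labels)
print(f"meta-classifier training accuracy: {meta.score(X, labels):.2f}")
```

Because the meta-classifier only sees low-dimensional score vectors, it is cheap to train and lets each base model contribute its own modality without a joint end-to-end architecture.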

Classification performance evaluation

The classification performance evaluation is based on class-specific precision, recall, and F1 score, as well as accuracy (male and female combined). These metrics are defined as follows:

precision = (number of true positive instances) / (number of positive instances)

recall = (number of true positive instances) / (number of relevant instances)

F1 score = 2 / (1/precision + 1/recall)

accuracy = (number of correctly classified instances) / (number of instances)

where the F1 score is the harmonic mean of precision and recall. We also calculate the area under the receiver operating characteristic curve (AUROC). The receiver operating characteristic curve presents the relationship between the true positive rate and the false positive rate under different thresholds, and the AUROC provides a measure of performance. The range of AUROC is from 0 to 1, with 1 being the best.
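
The metrics above are all available in scikit-learn; a small worked example with made-up labels and scores (not the study's data):

```python
# Sketch: computing precision, recall, F1, accuracy, and AUROC with scikit-learn.
from sklearn.metrics import (accuracy_score, f1_score, precision_score,
                             recall_score, roc_auc_score)

y_true = [1, 1, 1, 0, 0, 0, 1, 0]              # reference labels
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]              # hard predictions
y_score = [0.9, 0.8, 0.4, 0.2, 0.3, 0.6, 0.7, 0.1]  # scores for AUROC

print("precision:", precision_score(y_true, y_pred))  # 0.75
print("recall:   ", recall_score(y_true, y_pred))     # 0.75
print("F1:       ", f1_score(y_true, y_pred))         # 0.75
print("accuracy: ", accuracy_score(y_true, y_pred))   # 0.75
print("AUROC:    ", roc_auc_score(y_true, y_score))   # 0.9375
```

Note that AUROC is computed from the continuous scores, not the thresholded predictions, which is why it can differ from accuracy on the same data.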

Coverage

Some users have missing profile information, such as name or description, or use non-English characters in the name field. This may make inference using the specific information impossible. Therefore, for each classifier, we show the percentage of users whose genders can be inferred from the relevant profile information (as "coverage"), while the performance is evaluated using this subset of users.

Application on Toxicovigilance dataset

Data collection

To conduct toxicovigilance research using social media, we had collected publicly available English tweets mentioning over 20 PMs that have the potential for nonmedical use or misuse. The lists of PMs can be found in Supplementary Table S2. In our prior work, we developed annotation guidelines with our domain expert (JP) and annotated a subset consisting of 16 433 tweets.48 A brief description of the annotation guideline and example tweets are given in Supplementary Tables S3 and S4, respectively. Based on this set, we then developed automatic classification schemes to detect if the tweets describe self-reported nonmedical use (referred to as "misuse tweets" in the following).49 In this work, we used this classifier to classify a dataset collected from March 6, 2018 to January 14, 2020 and extracted the users' publicly available data. We referred to this set as Dataset-2. We also grouped users whose misuse tweets could be geo-located in the United States as a subset (Dataset-2-US).50

Since Dataset-2 did not have manual binary annotations, we relied on a secondary source, the self-identified gender information on linked public Facebook profiles, to identify a user's gender whenever possible. These users make up the test set of Dataset-2 for benchmarking.

Classification performance

We applied the best-performing classification strategies to the test set of Dataset-2 to evaluate their performances. This not only serves as a benchmark of how those pipelines perform on the toxicovigilance dataset (Dataset-2) but also provides a measure of the transferability of our pipelines across research problems.

Gender distribution inference

To assess the utility of our cohort characterization pipeline on a health surveillance-related task, we applied the best-performing classification pipeline to Dataset-2 (and Dataset-2-US) and analyzed the gender distributions of the users who had self-reported misuse/abuse of one of the three abuse-prone PM categories: stimulants (eg, Adderall®), which can increase alertness, attention, and energy and are generally prescribed to treat Attention Deficit Hyperactivity Disorder; tranquilizers (eg, alprazolam/Xanax®), which slow brain activity and are mostly used to treat anxiety; and pain relievers (eg, oxycodone/OxyContin®), specifically those containing opioids.35,51 We then compared the distributions with metrics from the 2018 NSDUH,35 as well as the overdose-related Emergency Department visits (EDV) in 2016 from the Nationwide Emergency Department Sample (NEDS).36 The details of the calculation are given in the Supplementary Materials. We performed Pearson's chi-squared test for contingency tables to determine if the differences in the proportions of females inferred from the different sources (Twitter vs survey data) are statistically significant, defined as P-value < 0.05.
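
The chi-squared comparison can be sketched with SciPy. The counts below are invented for illustration and are not the study's data:

```python
# Sketch: Pearson's chi-squared test comparing female/male proportions
# from two sources, using a 2x2 contingency table of made-up counts.
from scipy.stats import chi2_contingency

# Rows: source (eg, Twitter-inferred vs survey); columns: (female, male) counts.
table = [[510, 490],
         [490, 510]]
chi2, p_value, dof, expected = chi2_contingency(table)
print(f"chi2 = {chi2:.3f}, p = {p_value:.3f}")
# A p-value >= 0.05 would indicate no statistically significant difference
# between the two sources' gender proportions.
```

The test operates on counts, not proportions, so the inferred proportions must be converted back to user counts per source before building the table.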

RESULTS

Gender detection pipeline development

Data Collection (Dataset-1)

In total, we were able to retrieve the user data of 67 181 users, consisting of 35 812 (53.3%) females (F) and 31 369 (46.7%) males (M), which is close to the distribution estimated by Burger et al23 and Heil and Piskorski52 (55% female and 45% male) but deviates from the distribution estimated by Liu and Ruths25 (65% female and 35% male). The distribution is presented in Table 1.

Table 1.

Data distributions for the training, validation, and test sets from Dataset-1

Dataset F M Total
Training (Dataset-1) 21 521 18 788 40 309
Validation (Dataset-1) 7133 6303 13 436
Test (Dataset-1) 7158 6278 13 436
Total (Dataset-1) 35 812 31 369 67 181

Classification

The performance (F1 score, accuracy, and AUROC) for each classifier and meta-classifier is presented in Table 2, while the precisions and recalls are presented in Supplementary Table S5. We now highlight the main findings.

Table 2.

Test results (on Dataset-1) for classifiers and meta-classifiers

Feature/method  F1 score, F (95% CI) (0.XXX)  F1 score, M (95% CI) (0.XXX)  Coverage (%)  Accuracy (95% CI) (%)  AUROC
Proper name/DG 802 (795–810) 802 (795–809) 98.one 80.ii (79.5–80.9) 0.878
Screen name/SVM 748 (740–756) 719 (710–728) 100.0 73.iv (727–742) 0.817
Description/SVM 728 (719–736) 693 (683–703) 88.9 71.one (seventy.three–72.0) 0.796
Description/BLSTM 724 (716–733) 665 (655–675) 88.9 69.vii (68.9–70.6) 0.781
Description/BERT 790 (782–797) 757 (748–766) 88.9 77.4 (76.7–78.two) 0.873
Tweets/SVM 893 (888–898) 879 (872–885) 100.0 88.six (88.1–89.2) 0.933
Tweets/Lexicon 874 (868–880) 856 (849–862) 100.0 86.v (86.0–87.1) 0.917
Contour/M3 903 (897–908) 898 (893–903) 100.0 90.0 (89.v–ninety.5) 0.968
Colors/SVM 671 (662–682) 649 (640–659) 100.0 66.1 (65.3–66.nine) 0.712
Colors/RF 660 (651–669) 640 (630–649) 100.0 65.0 (64.2–65.8) 0.692
Meta-1 947 (944–951) 940 (936–944) 100.0 94.4 (94.0–94.viii) 0.965
Meta-two 947 (943–951) 939 (935–944) 100.0 94.3 (93.9–94.seven) 0.971
Meta-3 948 (944–952) 941 (937–945) 100.0 94.5 (94.ane–94.9) 0.966
Meta-4 930 (925–934) 920 (915–925) 100.0 92.5 (92.1–92.9) 0.953

The best-performing classification schemes were the meta-classifiers using predicted scores from M3 and SVM on tweets (meta-1, 2, 3), with accuracies around 94.4%. The second-best scheme was meta-4, with an accuracy of 92.5%. These all performed better than the existing pipelines, including the lexicon (86.5%), Demographer (80.2%), and M3 (90.0%), and the other unimodal classifiers.

Application on toxicovigilance dataset

Data Collection (Dataset-2)

We were able to retrieve past data from 176 683 users for Dataset-2 (63 306 users for Dataset-2-US). Less than 0.3% of the users (412) had publicly available gender data from linked Facebook profile pages. One hundred fifty-five of the 412 users in this subset were female (37.6%), while 257 users were male (62.4%).

Classification performance

The performances of the pipelines on the test set of Dataset-2 are shown in Table 3 (precisions and recalls are in Supplementary Table S6). The best-performing pipeline was meta-1 (accuracy 94.4%). Besides M3 and meta-1, all the classifiers experienced performance drops, possibly due to domain change. Here, we left out meta-2 and meta-3 because meta-1 provides comparable performance while being simpler. We also note that the accuracy of meta-1 is 95.8% (95% confidence interval 93.3–98.3%) when restricted to users whose misuse tweets could be geo-located in the United States (239 users: 103 females and 136 males).

Table 3.

Test results (on Dataset-2, for users who have revealed gender information on Facebook) for classifiers and meta-classifiers

Feature/method  F1 score, F (95% CI) (0.XXX)  F1 score, M (95% CI) (0.XXX)  Coverage (%)  Accuracy (95% CI) (%)  AUROC
Name/DG 717 (655–773) 833 (796–867) 94.9 79.0 (74.9–82.9) 0.844
Screen name/SVM 692 (634–745) 776 (732–816) 100.0 74.0 (69.7–78.2) 0.838
Description/BERT 674 (616–727) 715 (663–762) 94.9 69.6 (65.0–74.2) 0.839
Tweets/SVM 821 (772–865) 894 (864–921) 100.0 86.7 (83.3–89.8) 0.916
Tweets/Lexicon 770 (717–818) 846 (810–879) 100.0 81.6 (77.7–85.2) 0.889
Profile/M3 894 (855–928) 936 (913–956) 100.0 92.0 (89.3–94.4) 0.974
Meta-1 927 (894–954) 955 (935–972) 100.0 94.4 (92.0–96.6) 0.964
Meta-4 885 (846–919) 926 (902–949) 100.0 91.0 (88.1–93.7) 0.955

Gender distribution inference

We applied meta-1 to all the users and analyzed the gender distributions for the users who have self-reported abuse/misuse of tranquilizers, stimulants, or pain relievers (opioids). In Table 4, we report the number of users for each category, and the percentages of males and females, inferred through the classification results (meta-1), and as reported by NSDUH 2018.35

Table iv.

Gender distributions for selected medication categories (inferred by the classifier/NSDUH 2018/overdose EDV 2016)

Medication category  Number of users (geo-located in the US)  Percentage of male/female: inferred (geo-located in the US)  NSDUH 2018  Overdose EDV 2016
Tranquilizers 62 471 (20 863) 0.499/0.501 (0.490/0.510) 0.499/0.501
Stimulants 93 598 (36 323) 0.504/0.496a (0.514/0.486a) 0.551/0.449
Pain relievers 38 299 (12 077) 0.621/0.379a (0.635/0.365a) 0.518/0.482b 0.630/0.370

Although the users in Dataset-2-US are only roughly one-third of all users in Dataset-2, the gender proportions are close to each other. For tranquilizer and stimulant users, the gender proportions inferred from Twitter are very close to the comparator from NSDUH 2018 (with no statistically significant difference for tranquilizer users). In contrast, the gender proportions of pain reliever users are quite different from the comparator from NSDUH 2018, but much closer to the overdose EDV from NEDS.36 This suggests that Twitter data could be an indicator of the gender distribution of opioid overdoses and might provide complementary information to better understand the discrepancies between these two traditional data sources.

DISCUSSION

Model performance and improvement

Meta-1 performs with high accuracy consistently across Dataset-1 (94.4%) and Dataset-2 (94.4%), better than all the existing pipelines and other classifiers on this platform. This shows that building the gender detection pipeline on the four prominent textual features (name, screen name, description, and tweets) can improve performance over existing approaches. Also, except for meta-1 and M3, all classifiers performed worse on the domain-specific data. This illustrates the importance of benchmarking existing machine learning systems on the targeted cohorts, in order to evaluate their applicability to the desired tasks. It also indicates that multimodal strategies could enhance the robustness of the system against unseen data, which is thus desirable when building similar user-characterization pipelines.

Moving forward, there are two directions to further improve the pipeline: inclusion of the targeted cohort in the training data, and experimenting with additional classification algorithms/architectures. For example, incorporating multiple features in one system, similar to the M3 system,31 might further improve the performance. We chose our architecture based on model simplicity and development efficiency. We note that, though potentially complex and time-consuming, it is possible to design and train a model that learns from all the user's attributes simultaneously and performs well, in contrast to our architecture, which learns this information through transformed knowledge: the predicted scores. We leave this investigation to future work.

Potential pipeline utility

Given that our pipeline performs well across domains and shows promising results on the external task (eg, inferring gender proportions), we believe that this pipeline is well suited for application to medical/health tasks harnessing Twitter data. This pipeline can be used to infer the gender proportions in targeted cohorts and potentially help investigate gender disparities in health topics of interest. For example, social media has been shown to be a potentially excellent resource for conducting large-scale mental health surveillance,19,53,54 and our methods can be used to derive gender-specific insights from such surveillance tasks. Tasks commonly performed using social media data, such as sentiment analyses regarding targeted topics, may also benefit from the gender-specific characterizations enabled by our system.55,56 Combined with other recently developed methods, such as geolocation-based characterization of social media chatter,14,50 our methods can provide unique insights over a given population of social media users.

Toxicovigilance

Our post-classification analyses of the PM cohort illustrated the utility of automatic gender classification on social media data. The closeness of the gender proportions of tranquilizer and stimulant misusers from Twitter and those from NSDUH 2018 validates the effectiveness of applying social media mining for toxicovigilance.10,57,58 The inferred gender proportion of pain reliever users, though different from NSDUH 2018, is almost identical to that of the overdose EDV according to the NEDS. This association between self-reports of drug use on Twitter and overdose EDV rates is consistent with our past research,14 in which we identified significant associations between opioid misuse reports on Twitter and overdose deaths over specific geolocations (eg, counties and sub-states). Social media provides the opportunity to combine multiple types of data, including past tweets, social connections, and geolocation. All the information combined can provide geolocation-, gender-, and time-specific trends to extract insights, for example on gender inequalities in medical treatment regarding substance use disorder.59–63 It could also potentially be used to test hypotheses such as the association between mental health issues and PM misuse.64,65 The development of sophisticated models for social media mining may provide broad insights about how nonmedical users of pain relievers become victims of overdose over time, and may even serve as an early warning system.57,58,66–68 Furthermore, the surveillance can be done close to real time, a great improvement over the turnaround time for curating overdose statistics and conducting the NSDUH, which may make timely public health intervention possible.69 For example, the system can provide the trends and statistics to local health departments and hospitals for better training for PM misuse prevention and treatment, and highlight cohorts at higher risk.57,70 Note that we do not envision that social media data analytics can replace the traditional resources, but we know from the current state of research that it provides excellent complementary information, and the opportunity to provide information/intervention beyond the traditional health services.

Limitations

Our pipeline may inherit the biases and errors introduced by the data and resources used in the pipeline development, leading to significant limitations. The lack of information related to the biases (eg, race, primary language, or location) limits the performance and our ability to address them. For instance, the users in Dataset-1, though having at least one English tweet, may not be representative of US Twitter users (eg, by racial distribution). Our pipeline may inherit this undetected bias. Also, using Demographer28 might introduce bias toward the racial majority. Though its effect on the test performance might be detected, we are not able to remedy such biases when the racial coding is absent. Biases might also be introduced during annotation. For example, Dataset-1 and the test set of Dataset-2 may be biased toward those whose gender identities are public. Therefore, though the evaluation provides a measure of the pipeline's performance against human estimation, it may not be accurate for users whose genders are hard to identify.

Also, merging the two individually labeled datasets when constructing Dataset-1, though essential for obtaining acceptable training power and generalizability, could affect annotation quality by introducing inconsistency. Though the annotation methods adopted in Liu and Ruths25 and Burger et al23 both fall within our definition (gender identified on social media), we caution that these methods are different and are not perfect. For example, some users might use different gender identities on different platforms.

Crucially, limited by the annotations, our methods are only applicable to populations with binary gender identities. While this covers the majority of the population, our methods do not work for non-binary gender minorities, a community that has been shown to be especially vulnerable from a public health perspective.71–74 Despite this limitation, our proposed system not only serves as an important stepping stone for future work by establishing a strong performance for the simplistic binary classification, but already allows us to investigate the inequalities that women experience in medical treatment (eg, for substance use disorder).59–63 Including the non-binary population in our model would require collecting data from this population using coding schemes tailored to the differences within the population. Obtaining comprehensive demographic information could also help extend our methods to include non-binary users. We are currently in the early stages of exploring how to best address these issues.

There are also significant limitations associated with the analysis of nonmedical PM users. First, not all people living in the United States use English primarily on social media. Limited by our infrastructure, we currently are unable to capture Twitter users who use languages other than English, but extending to other languages, specifically Spanish, is a planned future direction.75–77 Second, Twitter users might choose not to include geo data in tweets, which makes geolocation impossible. For example, Dredze et al estimated that less than 25% of public tweets could be geolocated by their system. We caution that, because of this low proportion, it is not clear whether tweets geolocated in the United States can represent US tweets well. For Dataset-2, we found that roughly 40% of the users' misuse tweets could be geolocated, about 84% of which were located in the United States, and the gender proportions inferred using Dataset-2 and those using Dataset-2-US are very similar. Though this suggests that they might represent similar populations, they may still not be representative of all US Twitter users. Third, the detection of misuse tweets is based on a classification pipeline, so the inference is also limited by this NLP pipeline's performance.49 Fourth, the data are limited to Twitter users that are accessible via the Twitter API, and should not be considered a random sample of the US population.
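The proportion arithmetic above can be sketched in a minimal snippet; the input fractions are the approximate figures quoted in the text (roughly 40% geolocatable, about 84% of those in the United States), not exact study values:

```python
def us_locatable_fraction(geolocated: float, us_of_geolocated: float) -> float:
    """Fraction of all tweets that are both geolocatable and located in the US.

    Multiplies the share of tweets that can be geolocated at all by the
    share of those geolocated tweets that fall within the United States.
    """
    return geolocated * us_of_geolocated

# Approximate figures from the text: ~40% geolocated, ~84% of those in the US,
# so only about a third of all misuse tweets can be placed inside the US.
print(f"{us_locatable_fraction(0.40, 0.84):.1%}")
```

This kind of back-of-the-envelope check makes explicit why US-geolocated tweets cover only a minority of the cohort, which motivates the caution about representativeness.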

Ethics

Though we limit this work to observational research on publicly available data and adhere to the Twitter API's terms of use, there is still concern over Twitter users' protection and their perceptions.78–81 To avoid potential harm to users, we only study and report the aggregated information; no individual user's data will be released. We will also make the NLP pipeline publicly available (without the data) to ensure reproducibility and transparency to researchers and Twitter users, and to support community-driven development. Only the scripts for the gender detection pipeline and our best-performing pipeline will be made available with this manuscript.

CONCLUSIONS

As the focus of social media-based health research moves from population-level to cohort-level studies, incorporating user demographic data is becoming more important. In this work, we developed a gender detection pipeline and evaluated its performance on a general dataset and a domain-specific dataset. Our proposed pipeline shows high accuracy even when applied to a health-specific dataset. We further showed that the pipeline can be used to infer the nonmedical PM users' gender distributions, which are consistent with the statistics reported by NSDUH 2018 (stimulants and tranquilizers) and by NEDS (overdose EDVs due to opioids). With the much-needed growing attention on explicitly incorporating demographic information, such as gender and race/ethnicity, in research, it is crucial to be able to conduct aggregated gender-specific analyses of health-related social media data. Our pipeline is readily usable by social media researchers who need to infer users' demographics from their data. We note that, besides gender, other demographic information, such as race or age, is also important for research, and developing pipelines for these user characterization tasks and evaluating them on domain-specific datasets are part of our planned future work.

FUNDING

Research reported in this publication was supported by the National Institute on Drug Abuse (NIDA) of the National Institutes of Health (NIH) under award number R01DA046619. The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIH.

AUTHOR CONTRIBUTIONS

YY conducted and directed the machine learning experiments, evaluations, and data analyses, with assistance from MAA and AS. AS provided supervision for the full study. JSL and JP provided toxicology domain expertise for interpreting the results. YY drafted the manuscript and all authors contributed to the final manuscript.

SUPPLEMENTARY MATERIAL

Supplementary material is available at Journal of the American Medical Informatics Association online.

Supplementary Material

ooab042_Supplementary_Data

ACKNOWLEDGEMENTS

The authors thank the National Institutes of Health and the National Institute on Drug Abuse for their support.

CONFLICT OF INTEREST STATEMENT

None declared.

DATA AVAILABILITY

The data underlying this article cannot be shared publicly due to the Twitter API's terms of use and privacy concerns. The data will be shared on reasonable request to the corresponding author.

REFERENCES

1. Grajales FJ III, Sheps S, Ho K, Novak-Lauscher H, Eysenbach G. Social media: a review and tutorial of applications in medicine and health care. J Med Internet Res 2014; 16 (2): e13. [PMC free article] [PubMed] [Google Scholar]

2. Moorhead SA, Hazlett DE, Harrison L, Carroll JK, Irwin A, Hoving C. A new dimension of health care: systematic review of the uses, benefits, and limitations of social media for health communication. J Med Internet Res 2013; 15 (4): e85. [PMC free article] [PubMed] [Google Scholar]

4. Yang Y-C, Al-Garadi MA, Bremer W, Zhu JM, Grande D, Sarker A. Developing an automatic system for classifying chatter about health services on Twitter: case study for Medicaid. J Med Internet Res 2021; 23 (5): e26616. [PMC free article] [PubMed] [Google Scholar]

5. Glover M, Khalilzadeh O, Choy G, Prabhakar AM, Pandharipande PV, Gazelle GS. Hospital evaluations by social media: a comparative analysis of Facebook ratings among performance outliers. J Gen Intern Med 2015; 30 (10): 1440–6. [PMC free article] [PubMed] [Google Scholar]

6. Campbell L, Li Y. Are Facebook user ratings associated with hospital cost, quality and patient satisfaction? A cross-sectional analysis of hospitals in New York State. BMJ Qual Saf 2018; 27 (2): 119–29. [PubMed] [Google Scholar]

7. Hefele JG, Li Y, Campbell L, Barooah A, Wang J. Nursing home Facebook reviews: who has them, and how do they relate to other measures of quality and experience? BMJ Qual Saf 2018; 27 (2): 130–9. [PubMed] [Google Scholar]

8. Ranard BL, Werner RM, Antanavicius T, et al. Yelp reviews of hospital care can supplement and inform traditional surveys of the patient experience of care. Health Affairs 2016; 35 (4): 697–705. [PMC free article] [PubMed] [Google Scholar]

9. Broniatowski DA, Paul MJ, Dredze M. National and local influenza surveillance through Twitter: an analysis of the 2012-2013 influenza epidemic. PLoS One 2013; 8 (12): e83672. [PMC free article] [PubMed] [Google Scholar]

10. Sarker A, O'Connor K, Ginn R, et al. Social media mining for toxicovigilance: automatic monitoring of prescription medication abuse from Twitter. Drug Saf 2016; 39 (3): 231–40. [PMC free article] [PubMed] [Google Scholar]

11. O'Connor K, Pimpalkhute P, Nikfarjam A, Ginn R, Smith KL, Gonzalez G. Pharmacovigilance on Twitter? Mining tweets for adverse drug reactions. In: AMIA Annual Symposium Proceedings. Vol. 2014. American Medical Informatics Association; 2014: 924. [PMC free article] [PubMed]

12. Mowery J. Twitter influenza surveillance: quantifying seasonal misdiagnosis patterns and their impact on surveillance estimates. Online J Public Health Inform 2016; 8 (3): e198. [PMC free article] [PubMed] [Google Scholar]

13. Sarker A, Chandrashekar P, Magge A, Cai H, Klein A, Gonzalez G. Discovering cohorts of pregnant women from social media for safety surveillance and analysis. J Med Internet Res 2017; 19 (10): e361. [PMC free article] [PubMed] [Google Scholar]

14. Sarker A, Gonzalez-Hernandez G, Ruan Y, Perrone J. Machine learning and natural language processing for geolocation-centric monitoring and characterization of opioid-related social media chatter. JAMA Netw Open 2019; 2 (11): e1914672. [PMC free article] [PubMed] [Google Scholar]

15. Al-Garadi MA, Yang Y-C, Lakamana S, et al. Automatic breast cancer survivor detection from social media for studying latent factors affecting treatment success. In: Michalowski M, Moskovitch R, eds. Artificial Intelligence in Medicine. AIME 2020. Lecture Notes in Computer Science. Vol. 12299. Cham: Springer; 2020. 10.1007/978-3-030-59137-3_10 [CrossRef]

16. Coppersmith G, Leary R, Crutchley P, Fine A. Natural language processing of social media as screening for suicide risk. Biomed Inform Insights 2018; 10: 1178222618792860. [PMC free article] [PubMed] [Google Scholar]

17. Mowery DL, Park YA, Bryan C, Conway M. Towards automatically classifying depressive symptoms from Twitter data for population health. In: Proceedings of the Workshop on Computational Modeling of People's Opinions, Personality, and Emotions in Social Media (PEOPLES); The COLING 2016 Organizing Committee; 2016: 182–91. [Google Scholar]

18. Coppersmith G, Dredze M, Harman C, Hollingshead K. From ADHD to SAD: analyzing the language of mental health on Twitter through self-reported diagnoses. In: Proceedings of the 2nd Workshop on Computational Linguistics and Clinical Psychology: From Linguistic Signal to Clinical Reality; Association for Computational Linguistics; 2015: 1–10. [Google Scholar]

19. Amir S, Dredze M, Ayers JW. Mental health surveillance over social media with digital cohorts. In: Proceedings of the Sixth Workshop on Computational Linguistics and Clinical Psychology; Association for Computational Linguistics; 2019: 114–20. [Google Scholar]

20. Cesare N, Grant C, Nguyen Q, Lee H, Nsoesie EO. How well can machine learning predict demographics of social media users? arXiv preprint arXiv:170201807. 2017.

21. Cesare N, Grant C, Hawkins JB, Brownstein JS, Nsoesie EO. Demographics in social media data for public health research: does it matter? Bloomberg Data for Good Exchange Conference. New York; 2017.

23. Burger JD, Henderson J, Kim G, Zarrella G. Discriminating gender on Twitter. In: Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing; Association for Computational Linguistics; 2011: 1301–9. [Google Scholar]

24. Alowibdi JS, Buy UA, Yu P. Language independent gender classification on Twitter. In: Proceedings of the 2013 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM '13). New York, NY, USA: Association for Computing Machinery; 739–43. DOI: 10.1145/2492517.2492632 [CrossRef]

25. Liu W, Ruths D. What's in a name? Using first names as features for gender inference in Twitter. In: 2013 AAAI Spring Symposium Series. Association for the Advancement of Artificial Intelligence; 2013.

26. Sap M, Park G, Eichstaedt J, et al. Developing age and gender predictive lexica over social media. In: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP); Association for Computational Linguistics; 2014: 1146–51. [Google Scholar]

27. Merler M, Cao L, Smith JR. You are what you tweet…pic! Gender prediction based on semantic analysis of social media images. In: 2015 IEEE International Conference on Multimedia and Expo (ICME); IEEE; 2015: 1–6. [Google Scholar]

28. Knowles R, Carroll J, Dredze M. Demographer: extremely simple name demographics. In: Proceedings of the First Workshop on NLP and Computational Social Science; Association for Computational Linguistics; 2016: 108–13. [Google Scholar]

29. Bsir B, Zrigui M. Bidirectional LSTM for author gender identification. In: Nguyen N, Pimenidis E, Khan Z, Trawiński B, eds. Computational Collective Intelligence. ICCCI 2018. Lecture Notes in Computer Science. Vol. 11055. Cham: Springer. 10.1007/978-3-319-98443-8_36 [CrossRef]

30. Vicente M, Batista F, Carvalho JP. Gender detection of Twitter users based on multiple data sources. In: Kóczy L, Medina-Moreno J, Ramírez-Poussa E, eds. Interactions Between Computational Intelligence and Mathematics Part 2. Studies in Computational Intelligence. Vol. 794. Cham: Springer. 10.1007/978-3-030-01632-6_3 [CrossRef] [Google Scholar]

31. Wang Z, Hale S, Adelani DI, et al. Demographic inference and representative population estimates from multilingual social media data. In: The World Wide Web Conference (WWW '19). Association for Computing Machinery; New York, NY, USA; 2056–2067. DOI: 10.1145/3308558.3313684 [CrossRef]

32. Zhang C, Abdul-Mageed M. BERT-based Arabic social media author profiling. In: Mehta P, Rosso P, Majumder P, Mitra M, eds. Working Notes of the Forum for Information Retrieval Evaluation (FIRE 2019). CEUR Workshop Proceedings. CEUR-WS.org; December 12–15, 2019; Kolkata, India. [Google Scholar]

33. Volkova S, Wilson T, Yarowsky D. Exploring demographic language variations to improve multilingual sentiment analysis in social media. In: Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics; 2013: 1815–27. [Google Scholar]

34. Huang X, Smith MC, Paul MJ, et al. Examining patterns of influenza vaccination in social media. In: AAAI Workshops. Association for the Advancement of Artificial Intelligence; 2017.

35. Substance Abuse and Mental Health Services Administration. Results from the 2018 National Survey on Drug Use and Health: detailed tables. Rockville, MD: Center for Behavioral Health Statistics and Quality, Substance Abuse and Mental Health Services Administration; 2019. https://www.samhsa.gov/data/. [Google Scholar]

36. Centers for Disease Control and Prevention. 2019 Annual Surveillance Report of Drug-Related Risks and Outcomes — United States. Surveillance Special Report. Centers for Disease Control and Prevention, U.S. Department of Health and Human Services. Published November 1, 2019.

38. Chang CC, Lin CJ. LIBSVM: a library for support vector machines. ACM Trans Intell Syst Technol 2011; 2 (3): 1–27. [Google Scholar]

39. Platt J. Probabilistic outputs for support vector machines and comparisons to regularized likelihood methods. Adv Large Margin Classifiers 1999; 10 (3): 61–74. [Google Scholar]

40. Ho TK. Random decision forests. In: Proceedings of 3rd International Conference on Document Analysis and Recognition. Vol. 1. IEEE; 1995: 278–82. [Google Scholar]

41. Hochreiter S, Schmidhuber J. Long short-term memory. Neural Comput 1997; 9 (8): 1735–80. [PubMed] [Google Scholar]

42. Schuster M, Paliwal KK. Bidirectional recurrent neural networks. IEEE Trans Signal Process 1997; 45 (11): 2673–81. [Google Scholar]

43. Devlin J, Chang M-W, Lee K, Toutanova K. BERT: pre-training of deep bidirectional transformers for language understanding. In: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Vol. 1 (Long and Short Papers); Association for Computational Linguistics; 2019: 4171–86. [Google Scholar]

44. Liu Y, Ott M, Goyal N, et al. RoBERTa: a robustly optimized BERT pretraining approach. arXiv preprint arXiv:190711692. 2019.

45. Vaswani A, Shazeer N, Parmar N, et al. Attention is all you need. In: Advances in Neural Information Processing Systems; Curran Associates, Inc; 2017.

46. Radford A, Wu J, Child R, Luan D, Amodei D, Sutskever I. Language models are unsupervised multitask learners. OpenAI Blog 2019; 1 (8): 9. [Google Scholar]

47. Conneau A, Lample G. Cross-lingual language model pretraining. In: Advances in Neural Information Processing Systems; Curran Associates, Inc; 2019.

48. O'Connor K, Sarker A, Perrone J, Hernandez GG. Promoting reproducible research for characterizing nonmedical use of medications through data annotation: description of a Twitter corpus and guidelines. J Med Internet Res 2020; 22 (2): e15861. [PMC free article] [PubMed] [Google Scholar]

49. Al-Garadi MA, Yang Y-C, Cai H, et al. Text classification models for the automatic detection of nonmedical prescription medication use from social media. BMC Med Inform Decis Mak 2021; 21 (1): 27. [PMC free article] [PubMed] [Google Scholar]

50. Dredze M, Paul MJ, Bergsma S, Tran H. Carmen: a Twitter geolocation system with applications to public health. In: AAAI Workshop on Expanding the Boundaries of Health Informatics Using AI (HIAI). Vol. 23. Citeseer; 2013: 45. [Google Scholar]

51. National Institute on Drug Abuse. Research report series. Prescription Drugs: Abuse and Addiction; 2001. [Google Scholar]

52. Heil B, Piskorski M. New Twitter research: men follow men and nobody tweets. Harv Bus Rev 2009; 1: 2009. [Google Scholar]

53. Birnbaum ML, Ernala SK, Rizvi AF, De Choudhury M, Kane JM. A collaborative approach to identifying social media markers of schizophrenia by employing machine learning and clinical appraisals. J Med Internet Res 2017; 19 (8): e289. [PMC free article] [PubMed] [Google Scholar]

54. Reece AG, Reagan AJ, Lix KL, Dodds PS, Danforth CM, Langer EJ. Forecasting the onset and course of mental illness with Twitter data. Sci Rep 2017; 7 (1): 1–11. [PMC free article] [PubMed] [Google Scholar]

55. Zunic A, Corcoran P, Spasic I. Sentiment analysis in health and well-being: systematic review. JMIR Med Inform 2020; 8 (1): e16023. [PMC free article] [PubMed] [Google Scholar]

56. Gohil S, Vuik S, Darzi A. Sentiment analysis of health care Tweets: review of the methods used. JMIR Public Health Surveill 2018; 4 (2): e43. [PMC free article] [PubMed] [Google Scholar]

57. Chary M, Genes N, McKenzie A, Manini AF. Leveraging social networks for toxicovigilance. J Med Toxicol 2013; 9 (2): 184–91. [PMC free article] [PubMed] [Google Scholar]

58. Sarker A, DeRoos A, Perrone J. Mining social media for prescription medication abuse monitoring: a review and proposal for a data-centric framework. J Am Med Inform Assoc 2020; 27 (2): 315–29. [PMC free article] [PubMed] [Google Scholar]

59. McHugh RK, Votaw VR, Sugarman DE, Greenfield SF. Sex and gender differences in substance use disorders. Clin Psychol Rev 2018; 66: 12–23. [PMC free article] [PubMed] [Google Scholar]

60. Manuel JI, Lee J. Gender differences in discharge dispositions of emergency department visits involving drug misuse and abuse—2004-2011. Subst Abuse Treat Prev Policy 2017; 12 (1): 1–12. [PMC free article] [PubMed] [Google Scholar]

61. Ryoo H-J, Choo EK. Gender differences in emergency department visits and detox referrals for illicit and nonmedical use of opioids. WestJEM 2016; 17 (3): 295–301. [PMC free article] [PubMed] [Google Scholar]

62. Beaudoin FL, Baird J, Liu T, Merchant RC. Sex differences in substance use among adult emergency department patients: prevalence, severity, and need for intervention. Acad Emerg Med 2015; 22 (11): 1307–15. [PMC free article] [PubMed] [Google Scholar]

63. Choo EK, Douriez C, Green T. Gender and prescription opioid misuse in the emergency department. Acad Emerg Med 2014; 21 (12): 1493–8. [PMC free article] [PubMed] [Google Scholar]

64. Hawkins EH. A tale of two systems: co-occurring mental health and substance abuse disorders treatment for adolescents. Annu Rev Psychol 2009; 60: 197–227. [PubMed] [Google Scholar]

65. Unger JB, Kipke MD, Simon TR, Montgomery SB, Johnson CJ. Homeless youths and young adults in Los Angeles: prevalence of mental health problems and the relationship between mental health and substance abuse disorders. Am J Community Psychol 1997; 25 (3): 371–94. [PubMed] [Google Scholar]

66. Kenne DR, Hamilton M, Birmingham L, Oglesby WH, Fischbein RL, Delahanty DL. Perceptions of harm and reasons for misuse of prescription opioid drugs and reasons for not seeking treatment for physical or emotional pain among a sample of college students. Subst Use Misuse 2017; 52 (1): 92–9. [PubMed] [Google Scholar]

67. Boys A, Marsden J, Strang J. Understanding reasons for drug use amongst young people: a functional perspective. Health Educ Res 2001; 16 (4): 457–69. [PubMed] [Google Scholar]

68. Stewart SH, Karp J, Pihl RO, Peterson RA. Anxiety sensitivity and self-reported reasons for drug use. J Subst Abuse 1997; 9: 223–40. [PubMed] [Google Scholar]

69. Cao B, Gupta S, Wang J, et al. Social media interventions to promote HIV testing, linkage, adherence, and retention: systematic review and meta-analysis. J Med Internet Res 2017; 19 (11): e394. [PMC free article] [PubMed] [Google Scholar]

70. Sloboda Z. Changing patterns of "drug abuse" in the United States: connecting findings from macro- and microepidemiologic studies. Subst Use Misuse 2002; 37 (8–10): 1229–51. [PubMed] [Google Scholar]

71. Meerwijk EL, Sevelius JM. Transgender population size in the United States: a meta-regression of population-based probability samples. Am J Public Health 2017; 107 (2): e1–e8. [PMC free article] [PubMed] [Google Scholar]

72. Mayer KH, Bradford JB, Makadon HJ, Stall R, Goldhammer H, Landers S. Sexual and gender minority health: what we know and what needs to be done. Am J Public Health 2008; 98 (6): 989–95. [PMC free article] [PubMed] [Google Scholar]

73. Streed CG, McCarthy EP, Haas JS. Association between gender minority status and self-reported physical and mental health in the US. JAMA Intern Med 2017; 177 (8): 1210–2. [PMC free article] [PubMed] [Google Scholar]

74. Reisner SL, Greytak EA, Parsons JT, Ybarra ML. Gender minority social stress in adolescence: disparities in adolescent bullying and substance use by gender identity. J Sex Res 2015; 52 (3): 243–56. [PMC free article] [PubMed] [Google Scholar]

75. Soares F, Villegas M, Gonzalez-Agirre A, Krallinger M, Armengol-Estapé J. Medical word embeddings for Spanish: development and evaluation. In: Proceedings of the 2nd Clinical Natural Language Processing Workshop; Association for Computational Linguistics; 2019: 124–33. [Google Scholar]

76. Segura-Bedmar I, Martínez P, Revert R, Moreno-Schneider J. Exploring Spanish health social media for detecting drug effects. BMC Med Inform Decis Mak 2015; 15 (Suppl 6): 1–9. 10.1186/1472-6947-15-S2-S6. [PMC free article] [PubMed] [CrossRef] [Google Scholar]

77. Cook BL, Progovac AM, Chen P, Mullin B, Hou S, Baca-Garcia E. Novel use of natural language processing (NLP) to predict suicidal ideation and psychiatric symptoms in a text-based mental health intervention in Madrid. Comput Math Methods Med 2016; 2016: 1–8. [PMC free article] [PubMed] [Google Scholar]

78. Williams ML, Burnap P, Sloan L. Towards an ethical framework for publishing Twitter data in social research: taking into account users' views, online context and algorithmic estimation. Sociology 2017; 51 (6): 1149–68. [PMC free article] [PubMed] [Google Scholar]

79. Mello MM, Wang CJ. Ethics and governance for digital disease surveillance. Science 2020; 368 (6494): 951–4. [PubMed] [Google Scholar]

80. Klingwort J, Schnell R. Critical limitations of digital epidemiology. Surv Res Methods 2020; 14 (2): 95–101. 10.18148/srm/2020.v14i2.7726 [CrossRef]

81. Morgan HM. Research note: surveillance in contemporary health and social care: friend or foe? Surveill Soc 2014; 12 (4): 594–6. [Google Scholar]


Articles from JAMIA Open are provided here courtesy of Oxford University Press

