Bei Yu
Associate Professor
School of Information Studies
Syracuse University

320 Hinds Hall
Syracuse, NY 13244
Email: byu AT syr DOT edu


Associate Professor, School of Information Studies, Syracuse University (2015-present)
Assistant Professor, School of Information Studies, Syracuse University (2009-2015)
Advisor on the Information Representation and Retrieval Concentration, Linguistics Studies Program, Syracuse University (since 2009)

Research interests

Text classification, feature selection, sentiment classification and opinion mining, text mining for social science research and digital humanities

Education and Training

Postdoctoral Fellow, Kellogg School of Management, Northwestern University (2006-2009). Supervisor: Daniel Diermeier
Ph.D. 2006 Library and Information Science, University of Illinois at Urbana-Champaign. Advisors: John Unsworth and Linda Smith
M.S. 1999 Computer Science, Institute of Computing Technology, Chinese Academy of Sciences
B.S. 1996 Computer Science, University of Science and Technology of China


Edited proceedings

Yu, B. and Jiang, M. (eds.) (2009). Proceedings of the 1st International CIKM Workshop on Topic-Sentiment Analysis for Mass Opinion, ACM, New York. [Summary]

Refereed journal articles

Yu, B. (2014). Language and gender in Congressional speech. Literary and Linguistic Computing, 29(1): 118-132. PDF doi:10.1093/llc/fqs073
Yu, B., Willis, M., Sun, P., and Wang, J. (2013). Crowdsourcing Participatory Evaluation of Medical Pictograms. Journal of Medical Internet Research, 15(6): e108. PDF doi:10.2196/jmir.2513
Kwok, L. and Yu, B. (2013). Spreading social media messages on Facebook: An analysis of the restaurant industry. Cornell Hospitality Quarterly, 54(1): 84-94. PDF doi:10.1177/1938965512458360
Diermeier, D., Godbout, J-F., Yu, B. and Kaufmann, S. (2012). Language and ideology in Congress. British Journal of Political Science, 42(1): 31-55. DATA PDF doi:10.1017/S0007123411000160
Yu, B., Kaufmann, S., and Diermeier, D. (2008). Classifying party affiliation from political speech. Journal of Information Technology and Politics, 5(1): 33-48. PDF doi:10.1080/19331680802149608
Yu, B. (2008). An evaluation of text classification methods for literary study. Literary and Linguistic Computing, 23(3): 327-343. PDF doi:10.1093/llc/fqn015

Refereed conference proceedings


Yu, B. and Fanelli, D. (2014). Classifying negative findings in biomedical publications. Proceedings of ACL 2014 Workshop on Biomedical Natural Language Processing, 19-23. PDF


Yu, B. (2013). Automated Citation Sentiment Analysis: What Can We Learn From Biomedical Researchers. Proceedings of the American Society for Information Science and Technology, 50(1): 1-9. PDF doi:10.1002/meet.14505001084


Yu, B. (2012). Function Words for Chinese Authorship Attribution. Proceedings of the NAACL-HLT 2012 Workshop on Computational Linguistics for Literature, Montreal, Canada, June 8th, 2012, 45-53. PDF
Wang, J. and Yu, B. (2012). Collecting Representative Pictures for Words: A Human Computation Approach based on Draw Something Game. Proceedings of the AAAI Human Computation Workshop, Toronto, Canada, July 23rd, 2012, 157-158. PDF


Yu, B., Chen, M., and Kwok, L. (2011). Toward Predicting Popularity of Social Marketing Messages. Proceedings of the 2011 International Conference on Social Computing, Behavioral-Cultural Modeling and Prediction (SBP'11), University of Maryland, College Park, MD, March 29-31, 2011, Lecture Notes in Computer Science 6589, Springer, 2011, 317-324, ISBN 978-3-642-19655-3. PDF    doi: 10.1007/978-3-642-19656-0_44
Yu, B. and Kwok, L. (2011). Classifying Business Marketing Messages on Facebook. Proceedings of the SIGIR 2011 Workshop on Internet Advertisement, July 28, 2011. PDF
Wang, J. and Yu, B. (2011). Labeling images with queries: A recall-based image retrieval game approach. Proceedings of the SIGIR 2011 Workshop on Crowd-sourcing for Information Retrieval, July 28, 2011.   (Best Paper Award)
Hu, X. and Yu, B. (2011). Exploring The Relationship Between Mood and Creativity in Rock Lyrics. Proceedings of the 12th International Society for Music Information Retrieval Conference (ISMIR'11), Miami, Florida, October 24-28, 2011, 789-794, ISBN 978-0-615-54865-4. PDF
Yu, B. (2011). The Emotional World of Health Online Communities. (Poster) iConference 2011, Seattle, WA, February 8-11. PDF     doi: 10.1145/1940761.1940914


Yu, B. and Diermeier, D. (2010). A longitudinal study of language and ideology in Congress. The 68th National Conference of Midwest Political Science Association, Chicago, IL, April 2010. PDF
Wang, J. and Yu, B. (2010). Sentence recall game: a novel tool for collecting data to discover language usage patterns. Proceedings of the 16th ACM SIGKDD Workshop on Human Computation, Washington, D.C., July 25th 2010, pp. 56-59. PDF doi:10.1145/1837885.1837904
Yu, B. and Ku, M. (2010). Collecting legacy corpora from social science research for text mining evaluation. (Poster) ASIST 2010 Annual Meeting, Pittsburgh, PA, October 22-27, 2010 PDF
Chen, M., Yu, B., and Liu, X. (2010). Building Folk UMLS: An Approach to Finding Meaning of Folk Terms in Medical Domain. iConference 2010 Poster PDF


Yu, B., Kaufmann, S., and Diermeier, D. (2008). Exploring the characteristics of opinion expressions for political opinion classification. Proceedings of the 9th Annual International Conference on Digital Government Research, Montreal, Canada, May 2008, pp. 82-91. PDF
Shao, H., Yu, B., and Nadeau, J. (2008). Strangeness-based feature weighting and classification of gene expression profiles. Proceedings of the 23rd Annual ACM Symposium on Applied Computing Bioinformatics Track (SAC'08), Fortaleza, Ceara, Brazil, March 2008, pp. 1292-1296. PDF


Diermeier, D., Francois, J., Yu, B. and Kaufmann, S. (2007). Language and ideology in Congress. The 65th National Conference of Midwest Political Science Association, Chicago, IL, April 2007.
Yu, B. and Unsworth, J. (2007). An evaluation of text classification methods for literary study. Digital Humanities, Champaign, IL, June 2007


Plaisant, C., Rose, J., Yu, B., Auvil, L., Kirschenbaum, M., Smith, M., Clement, T. and Lord, G. (2006). Exploring erotics in Emily Dickinson's correspondence with text mining and visual interfaces. Proceedings of the 6th ACM/IEEE Joint Conference on Digital Libraries (JCDL'06), Chapel Hill, NC, June 2006, pp. 141-150 (Vannevar Bush Best Paper Candidate). PDF
Yu, B. and Unsworth, J. (2006). Toward discovering potential data mining applications in literary criticism. Digital Humanities, Paris, France, July 2006


Yu, B., Mei, Q., and Zhai, C. (2005). English usage comparison between native and non-native English speakers in academic writing. The 7th Joint International Conference of the Association for Computers and the Humanities and the Association for Literary and Linguistic Computing (ACH/ALLC), University of Victoria, Canada, June 2005


Zhai, C., Velivelli, A., and Yu, B. (2004) A cross-collection mixture model for comparative text mining. Proceedings of the 10th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Seattle, WA, August 2004, pp. 743-748. PDF


Wang, J., Yu, B., and Gasser, L. (2002) Concept tree based ordering for shaded similarity matrix. Proceedings of the 2nd IEEE International Conference on Data Mining, Maebashi City, Japan, December 2002, pp. 697-700. PDF

Book chapter

Godbout, J-F and Yu, B. (2009). Speech and legislative extremism in the U.S. Senate. In L. Imbeau (Ed.),"Do They Walk Like They Talk?" Speech and Action in Policy Processes. Chapter 11, pp. 185-206, Springer (Studies in Public Choice Series): New York.

Invited Talks

Predicting gender and ideology from political speeches. NxtMedia Conference. Trondheim, Norway, October
Language, ideology, and gender in Congressional Speech: a computational approach. NTNU, VTT, and NxtMedia Seminar on Recommendation Technology. Trondheim, Norway, October, 2013
Computational thinking in text mining for social science research. Center for Advanced Study in the Behavioral Sciences Workshop on Tracking, Transcribing, and Tagging Government: Building Digital Records for Computational Social Science, Stanford University, June, 2010
Language and ideology in Congress. Information Science Colloquium, Cornell University, October, 2009
Towards automatic polarity classification of mass opinion. iForum at School of Information, University of Texas at Austin, February, 2009

Professional Activities

Conference organization

Co-chair of the First International Workshop on Topic-Sentiment Analysis for Mass Opinion Measurement (affiliated with the 18th ACM Conference on Information and Knowledge Management), Hong Kong, 2009

Program committee members

Annual Meeting of the Association for Computational Linguistics (ACL 2014, 2013)
International Conference on Social Computing, Behavioral-Cultural Modeling, and Prediction (SBP 2014, 2013, 2012)
Workshop on Computational Linguistics for Literature (CLFL 2014, 2013)
International Workshop on News Recommendation and Analytics (NRA 2014)
International World Wide Web conference (WWW 2011)
International Joint Conference on Natural Language Processing (IJCNLP 2013, 2011)
International Conference on Computational Linguistics (COLING 2010)
Conference on Empirical Methods in Natural Language Processing (EMNLP 2010)
Workshop on Opinion Mining and Business Intelligence (OMBI 2010)
International Workshop on Topic Feature Discovery and Opinion Mining (TFDOM 2010)

Conference Reviewers

Medicine 2.0 Conference (2014, 2013)
First International Workshop on Emotion and Sentiment in Social and Expressive Media (2013)
iConference (2011, 2010)
Digital Humanities Conference (2011, 2010, 2009, 2008)

Journal Reviewers

ACM Transactions on Information Systems (ACM TOIS)
Canadian Journal of Information and Library Science (CJILS)
Electronic Library Journal (ELJ)
Journal of Data and Knowledge Engineering (DKE)
Journal of Decision Support Systems (DSS)
Journal of Information Processing and Management (IP&M)
Journal of the American Society for Information Science and Technology (JASIST)
Journal of Information Technology and Politics (JITP)
Journal of Medical Internet Research (JMIR)
Literary and Linguistic Computing (LLC)

last updated: 9/10/2014