Bei Yu

Assistant Professor
School of Information Studies
Syracuse University

320 Hinds Hall
Syracuse, NY 13244
Email: byu AT syr DOT edu

Research interests

Text Mining, Opinion Mining, Digital Humanities

Education

Ph.D. 2006 Library and Information Science, University of Illinois at Urbana-Champaign
M.S. 1999 Computer Science, Institute of Computing Technology, Chinese Academy of Sciences
B.S. 1996 Computer Science, University of Science and Technology of China

Publications

Godbout, J-F and Yu, B. (2009). Speech and legislative extremism in the U.S. Senate. In L. Imbeau (Ed.), "Do They Walk Like They Talk?" Speech and Action in Policy Processes, pp. 185-205. Springer (Studies in Public Choice Series): New York.

Yu, B., Kaufmann, S. and Diermeier, D. (2008) Classifying party affiliation from political speech. Journal of Information Technology and Politics, 5(1), pp. 33-48.

Yu, B. (2008) An evaluation of text classification methods for literary study. Literary and Linguistic Computing 23(3):327-343.

Yu, B., Kaufmann, S. and Diermeier, D. (2008) Exploring the characteristics of opinion expressions for political opinion classification. Proceedings of the 9th Annual International Conference on Digital Government Research (dg.o 2008), Montreal, Canada, May 2008, pp. 82-91.

Shao, H., Yu, B. and Nadeau, J. (2008) Strangeness-based feature weighting and classification of gene expression profiles. Proceedings of the 23rd Annual ACM Symposium on Applied Computing, Bioinformatics Track, pp. 1292-1296.

Plaisant, C., Rose, J., Yu, B., Auvil, L., Kirschenbaum, M., Smith, M., Clement, T. and Lord, G. (2006) Exploring erotics in Emily Dickinson's correspondence with text mining and visual interfaces. Proceedings of the 6th ACM/IEEE Joint Conference on Digital Libraries (JCDL 2006), Chapel Hill, NC, June 2006, pp. 141-150 (Vannevar Bush Best Paper Candidate).

Zhai, C., Velivelli, A., and Yu, B. (2004) A cross-collection mixture model for comparative text mining. Proceedings of the 10th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 2004), Seattle, WA, August 2004, pp. 743-748.

Wang, J., Yu, B., and Gasser, L. (2002) Concept tree based ordering for shaded similarity matrix. Proceedings of the 2nd IEEE International Conference on Data Mining (ICDM 2002), Maebashi City, Japan, December 2002, pp. 697-700.

Conference Presentations

Lee, T. and Yu, B. (2008). Finding "patients like me" on the Internet. Proposal presented at the planning meeting for the Johns Hopkins Summer Research Workshop on Machine Learning and Language Engineering. Baltimore, MD, November, 2008

Diermeier, D., Francois, J., Yu, B. and Kaufmann, S. (2007). Language and ideology in Congress. Midwest Political Science Association Conference, Chicago, IL, April 2007

Yu, B. and Unsworth, J. (2007). An evaluation of text classification methods for literary study. Digital Humanities, Champaign, IL, June 2007

Yu, B. and Unsworth, J. (2006). Toward discovering potential data mining applications in literary criticism. Digital Humanities, Paris, France, July 2006

Yu, B., Mei, Q., and Zhai, C. (2005). English usage comparison between native and non-native English speakers in academic writing. The 7th Joint International Conference of the Association for Computers and the Humanities and the Association for Literary and Linguistic Computing (ACH/ALLC), University of Victoria, Canada, June 2005

Downie, J. S., Unsworth, J., Yu, B., Tcheng, D., Rockwell, G., and Ramsay, S. (2005). A revolutionary approach to humanities computing?: Tools development and the D2K data-mining framework. Panel discussion at ACH/ALLC, University of Victoria, Canada, June, 2005

Yu, B. (2004). Writing style discrimination between native and non-native English writers. The International Federation of Classification Societies (IFCS), Chicago, IL, August, 2004

Yu, B. (2004). Comparative literary style mining between native and non-native English writers. Poster presentation at ASIST 2004 Annual Meeting, Providence, Rhode Island, November, 2004

Yu, B. and Searsmith, D. (2003). Genre-based in-document content type classification. The 13th ASIST SIGCR Workshop, Long Beach, CA, October 2003

Invited Talks

Text classification for political science research. Second Chicago Colloquium on Digital Humanities and Computer Science, Evanston, IL, October, 2007. With Daniel Diermeier and Stefan Kaufmann

Mass opinion: extraction, classification, and measurement. Northwestern Institute on Complex Systems (NICO), February, 2007. With Daniel Diermeier and Stefan Kaufmann

Teaching Experience

Co-instructor, Northwestern University, spring 2009
"Mining for meaning" pro-seminar in the Department of Linguistics

Guest lecturer, Northwestern University, summer 2008
Invited lecture on Chinese higher education for graduate course "Topics in Higher Education"

Guest lecturer, GSLIS, UIUC, fall 2003
Invited lecture on website design for undergraduate course "Information Technology and Organizations"

Teaching assistant, GSLIS, UIUC, spring 2001
LEEP course "Interfaces to Information Systems"

Research Experience

Text mining
Postdoctoral fellow, 2006-2009
Kellogg School of Management, Northwestern University
Lead research projects on (1) language and ideology, (2) political opinion expression and voting behavior, and (3) corporate opinion retrieval and classification. Recruit and train student research assistants.

Research assistant, 2004-2006
Graduate School of Library and Information Science, University of Illinois at Urbana-Champaign
The Nora Digital Humanities Project
Design and implement text classification and feature analysis methods for literary text analysis and visualization

Research assistant, 2002-2004
Automated Learning Group, National Center for Supercomputing Applications
Research on opinion classification and text summarization methods; develop an online job information retrieval system to help disability specialists match job descriptions with the skills of disabled people

Research assistant, 2000-2001
Graduate School of Library and Information Science, University of Illinois at Urbana-Champaign
The Biological Information Browsing Environment project
Research on automatic extraction of floral descriptions

Database management
Graduate assistant, 1999-2000
Graduate School of Library and Information Science, University of Illinois at Urbana-Champaign
Design and implement the GSLIS people and course database and its web browser access

Research assistant, 1997-1999
CAD Lab, Institute of Computing Technology, Chinese Academy of Sciences
Oracle database administrator

Professional Activities

Co-chair and reviewer for the CIKM Topic-Sentiment Analysis Workshop (TSA'2009)
Reviewer for the Digital Humanities Conferences (2008, 2009, 2010)
Reviewer for the Canadian Journal of Information and Library Science (CJILS)

Languages

English, Chinese

last updated: 09/24/2009