PSYCH 290: Natural Language Processing in the Social Sciences (SOC 281, SYMSYS 195T)
Digital communications (including social media) are the largest data sets of our time, and most of them are text. Social scientists need to be able to digest small and big data sets alike, process them, and extract psychological insight. This applied, project-focused course introduces students to a Python codebase developed to facilitate text analysis in the social sciences (see dlatk.wwbp.org). The goal is to practice these methods in guided tutorials and project-based work so that students can apply them to their own research contexts and be prepared to write up the results for publication. The course will provide best practices, as well as access to and familiarity with a Linux-based server environment for processing text, including the extraction of words, phrases, and topics and the application of psychological dictionaries. We will also practice the use of machine learning on text data for psychological assessment, and the further statistical analysis of language variables in R. The course has no computer science prerequisites. Familiarity with Python, SSH, and basic Linux is helpful but not required; these will be introduced briefly in the course, as will SQL (databases) and Jupyter notebooks. Understanding regression, basic familiarity with R, and the ability to wrangle your data into spreadsheet form are expected. For more information, see psych290.stanford.edu, where you can access the Google Form to apply for the class.
Terms: Spr | Units: 3 | Instructors: Eichstaedt, J. (PI); Lim (Chun Hui), C. (TA)
SOC 281: Natural Language Processing in the Social Sciences (PSYCH 290, SYMSYS 195T)
Digital communications (including social media) are the largest data sets of our time, and most of them are text. Social scientists need to be able to digest small and big data sets alike, process them, and extract psychological insight. This applied, project-focused course introduces students to a Python codebase developed to facilitate text analysis in the social sciences (see dlatk.wwbp.org). The goal is to practice these methods in guided tutorials and project-based work so that students can apply them to their own research contexts and be prepared to write up the results for publication. The course will provide best practices, as well as access to and familiarity with a Linux-based server environment for processing text, including the extraction of words, phrases, and topics and the application of psychological dictionaries. We will also practice the use of machine learning on text data for psychological assessment, and the further statistical analysis of language variables in R. The course has no computer science prerequisites. Familiarity with Python, SSH, and basic Linux is helpful but not required; these will be introduced briefly in the course, as will SQL (databases) and Jupyter notebooks. Understanding regression, basic familiarity with R, and the ability to wrangle your data into spreadsheet form are expected. For more information, see psych290.stanford.edu, where you can access the Google Form to apply for the class.
Last offered: Winter 2023
SYMSYS 195T: Natural Language Processing in the Social Sciences (PSYCH 290, SOC 281)
Digital communications (including social media) are the largest data sets of our time, and most of them are text. Social scientists need to be able to digest small and big data sets alike, process them, and extract psychological insight. This applied, project-focused course introduces students to a Python codebase developed to facilitate text analysis in the social sciences (see dlatk.wwbp.org). The goal is to practice these methods in guided tutorials and project-based work so that students can apply them to their own research contexts and be prepared to write up the results for publication. The course will provide best practices, as well as access to and familiarity with a Linux-based server environment for processing text, including the extraction of words, phrases, and topics and the application of psychological dictionaries. We will also practice the use of machine learning on text data for psychological assessment, and the further statistical analysis of language variables in R. The course has no computer science prerequisites. Familiarity with Python, SSH, and basic Linux is helpful but not required; these will be introduced briefly in the course, as will SQL (databases) and Jupyter notebooks. Understanding regression, basic familiarity with R, and the ability to wrangle your data into spreadsheet form are expected. For more information, see psych290.stanford.edu, where you can access the Google Form to apply for the class.
Last offered: Winter 2023