LIN 373N

Machine Learning Toolbox for Text Analysis

The University of Texas at Austin
Fall 2022
Instructor: Jessy Li

Syllabus

Contact information

Course materials

Course overview and objectives

Technology that automatically analyzes text has made amazing strides. It lets us automatically translate from Chinese to English, summarize what people on Twitter think about a current political topic, or find clues about who wrote a classic piece of literature. Machine learning, software that can learn from experience, plays a central role in this technology. This course provides an overview of basic statistical methods for machine learning and natural language processing. It is a very hands-on course in which we will use the Python programming language.

The bulk of the course focuses on machine learning methods and applying them to analyze data, much of which is textual. The later portion of the course shifts to surveying several tasks in natural language processing and to class projects, which are a major component of the course. These projects will allow you to pursue your own interests (and conduct new research in so doing!).

Topics of this course include:

Acknowledgement: I thank Byron Wallace for sharing his syllabus, materials, and experiences from his course Applied Data Mining for the initial development of this course.

Flags

This course carries the Quantitative Reasoning flag. Quantitative Reasoning courses are designed to equip you with skills that are necessary for understanding the types of quantitative arguments you will regularly encounter in your adult and professional life. You should therefore expect a substantial portion of your grade to come from your use of quantitative skills to analyze real-world problems.

This course also carries the Independent Inquiry flag. Independent Inquiry courses are designed to engage you in the process of inquiry over the course of a semester, providing you with the opportunity for independent investigation of a question, problem, or project related to your major. You should therefore expect a substantial portion of your grade to come from the independent investigation and presentation of your own work.

Course requirements and grading policy

Grade Percentage
A >= 93%
A- >= 90%
B+ >= 87%
B >= 83%
B- >= 80%
C+ >= 77%
C >= 73%
C- >= 70%
D+ >= 67%
D >= 63%
D- >= 60%
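
Since this course uses Python, the cutoffs above can be read as a simple lookup. The following is only an illustrative sketch, not an official grade calculator: the names GRADE_CUTOFFS and letter_grade are hypothetical, the percentage is compared exactly as given (no rounding is assumed), and because the table lists no grade below 60%, the function returns None in that case rather than guessing.

```python
# Cutoffs copied from the grade table above, ordered from highest to lowest.
GRADE_CUTOFFS = [
    ("A", 93), ("A-", 90),
    ("B+", 87), ("B", 83), ("B-", 80),
    ("C+", 77), ("C", 73), ("C-", 70),
    ("D+", 67), ("D", 63), ("D-", 60),
]


def letter_grade(percentage):
    """Return the first letter grade whose cutoff the percentage meets."""
    for letter, cutoff in GRADE_CUTOFFS:
        if percentage >= cutoff:
            return letter
    # Assumption: the table gives no entry below 60%, so nothing is returned.
    return None


if __name__ == "__main__":
    for score in (95, 89.9, 73, 60):
        print(score, letter_grade(score))
```

Because the list is ordered from the highest cutoff down, the first match is always the best grade the percentage qualifies for.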

Extension policy

Academic dishonesty policy

You are encouraged to discuss assignments with classmates, but all coding and written work must be your own. Students caught cheating will automatically fail the course. If in doubt, ask the instructor.

Notice about students with disabilities

The University of Texas at Austin provides upon request appropriate academic accommodations for qualified students with disabilities. Please contact the Division of Diversity and Community Engagement, Services for Students with Disabilities, 512-471-6259.

Notice about missed work due to religious holy days

A student who misses an examination, work assignment, or other project due to the observance of a religious holy day will be given an opportunity to complete the work missed within a reasonable time after the absence, provided that he or she has properly notified the instructor. It is the policy of the University of Texas at Austin that the student must notify the instructor at least fourteen days prior to the classes scheduled on dates he or she will be absent to observe a religious holy day. For religious holy days that fall within the first two weeks of the semester, the notice should be given on the first day of the semester. The student will not be penalized for these excused absences, but the instructor may appropriately respond if the student fails to complete satisfactorily the missed assignment or examination within a reasonable time after the excused absence.

Senate Bill 212 and Title IX Reporting Requirements

Under Senate Bill 212 (SB 212), the professor and TAs for this course are required to report for further investigation any information concerning incidents of sexual harassment, sexual assault, dating violence, and stalking committed by or against a UT student or employee. Federal law and university policy also require reporting incidents of sex- and gender-based discrimination and sexual misconduct (collectively known as Title IX incidents). This means we cannot keep confidential information about any such incidents that you share with us. If you need to talk with someone who can maintain confidentiality, please contact University Health Services (512-471-4955 or 512-475-6877) or the UT Counseling and Mental Health Center (512-471-3515 or 512-471-2255). We strongly urge you to make use of these services for any needed support and to report any Title IX incidents to the Title IX Office.

Sharing of Course Materials is Prohibited

No materials used in this class, including, but not limited to, lecture hand-outs, videos, assessments (quizzes, exams, papers, projects, homework assignments), in-class materials, review sheets, and additional problem sets, may be shared online or with anyone outside of the class unless you have my explicit, written permission. Unauthorized sharing of materials promotes cheating. It is a violation of the University’s Student Honor Code and an act of academic dishonesty. I am well aware of the sites used for sharing materials, and any materials found online that are associated with you, or any suspected unauthorized sharing of materials, will be reported to Student Conduct and Academic Integrity in the Office of the Dean of Students. These reports can result in sanctions, including failure in the course.

Schedule

The schedule is tentative and subject to change.

Project

Please refer to the grading policy for a high-level overview of the project and its requirements.

Topic suggestions

Detailed requirements