Text mining is an interdisciplinary area that primarily combines advances in Natural Language Processing (NLP), Information Retrieval (IR), and Machine Learning (ML) to help computers understand written human language, and thus transform information from free text into structured knowledge. The volume of textual data has been growing rapidly. For instance, over 100,000 articles on COVID-19 are tracked in LitCovid (https://www.ncbi.nlm.nih.gov/research/coronavirus/), and the collection grows at a rate of 10,000 articles per month. Text mining techniques can be applied to respond to the pandemic. For example, automatically extracting the symptoms, diseases, and drugs mentioned in the articles would assist the diagnosis and treatment management of COVID-19 patients. More generally, the amount of text in popular databases is at the million or trillion scale. Such scale necessitates the development of text mining tools to facilitate data curation and knowledge discovery.
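To make the extraction example concrete, the sketch below shows one very simple approach, dictionary lookup, for finding symptoms, diseases, and drugs in a sentence. It is only an illustration under stated assumptions: the lexicon, the function name `extract_mentions`, and the sample sentence are all hypothetical, and practical systems typically rely on curated vocabularies or machine learning models rather than a hand-written list.

```python
import re

# Hypothetical mini-lexicon for illustration only; a real system would
# draw on a curated biomedical vocabulary.
LEXICON = {
    "symptom": ["fever", "cough", "shortness of breath"],
    "disease": ["COVID-19", "pneumonia"],
    "drug": ["remdesivir", "dexamethasone"],
}

def extract_mentions(text):
    """Return sorted (start offset, category, matched text) tuples."""
    mentions = []
    for category, terms in LEXICON.items():
        for term in terms:
            # Lookarounds prevent matches inside longer words
            # (e.g. "cough" inside "coughing").
            pattern = r"(?<!\w)" + re.escape(term) + r"(?!\w)"
            for m in re.finditer(pattern, text, flags=re.IGNORECASE):
                mentions.append((m.start(), category, m.group(0)))
    return sorted(mentions)

# Made-up sample sentence, not taken from any article.
sentence = ("Patients with COVID-19 often present with fever and cough; "
            "some received remdesivir.")
for start, category, matched in extract_mentions(sentence):
    print(f"{start:3d}  {category:8s} {matched}")
```

Dictionary lookup is easy to understand but misses synonyms, misspellings, and terms outside the list, which is one reason machine learning approaches such as named entity recognition are commonly used instead.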
This course will introduce participants to a comprehensive set of text mining topics, tools, and techniques. It will cover three primary components: (1) the basics of Python and its related packages, (2) an overview of the text mining pipeline and its techniques, and (3) an introduction to machine learning and the development of text mining applications using machine learning. Each component will include hands-on exercises and case studies for practice.
At the end of the course a learner should be able to:
Write basic Python code and use Python packages such as pandas, NumPy, and scikit-learn (sklearn) for textual data analysis
Understand text mining pipelines and develop text mining methods for text processing
Understand machine learning concepts and develop text mining applications (such as text classification and named entity recognition) using machine learning techniques
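As a rough illustration of what the first steps of a text mining pipeline can look like, the sketch below tokenizes a few sentences, removes stopwords, and counts term frequencies using only the Python standard library. The stopword list, function names, and sample documents are assumptions made for this example and are not course material; in practice, packages such as pandas and scikit-learn provide more complete tooling for these steps.

```python
import re
from collections import Counter

# Tiny illustrative stopword list; real pipelines use much larger ones.
STOPWORDS = {"the", "of", "and", "in", "a", "to", "with", "is", "are"}

def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens,
    keeping hyphenated terms such as "covid-19" together."""
    return re.findall(r"[a-z0-9]+(?:-[a-z0-9]+)*", text.lower())

def term_frequencies(documents):
    """Count how often each non-stopword token occurs across documents."""
    counts = Counter()
    for doc in documents:
        counts.update(t for t in tokenize(doc) if t not in STOPWORDS)
    return counts

# Made-up sample documents for demonstration.
docs = [
    "Fever and cough are common symptoms of COVID-19.",
    "Treatment of COVID-19 pneumonia with dexamethasone.",
]
freqs = term_frequencies(docs)
print(freqs.most_common(3))
```

Term counts like these are a common starting point for features in downstream tasks such as text classification.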
Who should attend?
This course is designed for learners and interested individuals with little or no experience in text mining or machine learning.
Prior exposure to programming languages is highly recommended but not required. We will spend the first day on Python fundamentals. Learners without experience in programming languages are expected to practice and grasp the basic syntax of Python. Basic computer skills are required.
General Training Rate:
Discounted Training Rate:
$1,075.00 - NIH Community (Trainees, Contractors, Employees, Tenants working at one of the NIH campuses)
$1,195.00 - Academia, US Government (Non-NIH), US Military
Although no grades are given for courses, each participant will receive Continuing Education Units (CEUs) based on the number of contact hours. One CEU is equal to ten contact hours. Upon completion of this course each participant will receive a certificate, showing completion of the workshop and 2.8 CEUs.
100% tuition refund for registrations cancelled 14 or more calendar days prior to the start of the workshop.
50% tuition refund for registrations cancelled 4 to 13 calendar days prior to the start of the workshop.
No refund will be issued for registrations cancelled 3 or fewer calendar days prior to the start of the workshop.
All cancellations must be received in writing via email to Ms. Carline Coote at firstname.lastname@example.org.
Cancellations received after 4:00 pm (ET) on business days, or received on non-business days, are time-stamped as received on the following business day.
All refund payments will be processed by the start of the initial workshop.