Unlocking NLP Success: Essential Linguistics for Better Text Analysis 🚀
Join Ashley Zimmerman at PyData Vermont 2025 to explore core linguistics and data science principles that enhance your ability to extract meaningful insights from text data. Don't miss this deep dive into the fundamentals of NLP!

PyData
65 views • Nov 14, 2025

About this video
www.pydata.org
We will discuss fundamental linguistics and data science concepts that underpin the ability to extract signal from text. This talk brings theoretical context to general data science and NLP approaches. Topics will include the linguistic grounding of large language models (LLMs), basic NLP methods, and common pitfalls in textual analysis. We will also present some tools developed by our lab that can act as powerful lenses for textual data. Some examples we will use to approach these topics include: word frequency and distributions, Zipf’s law, the Distributional Hypothesis, allotaxonometry, sentiment, time series, and scale.
Takeaways from this talk will be theoretical background and tools that support a holistic approach to extracting signal from text, empowering attendees to engage critically with NLP applications in the wild and to deploy NLP approaches responsibly and creatively.
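As a rough illustration of two of the topics listed above, word frequency and Zipf's law, here is a minimal Python sketch. It is not taken from the talk or from the speaker's lab tools. The function name `rank_frequency`, the simple regex tokenizer, and the sample sentence are all assumptions made for this example. Zipf's law says that a word's frequency is roughly inversely proportional to its frequency rank, so rank times count should stay roughly constant across a large corpus. A toy sentence is far too small to show that pattern and only demonstrates the mechanics.

```python
import re
from collections import Counter


def rank_frequency(text):
    """Return (rank, word, count) tuples, most frequent word first.

    Hypothetical helper for illustration; tokenization is a naive
    lowercase regex split, not a linguistically informed tokenizer.
    """
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    return [
        (rank, word, count)
        for rank, (word, count) in enumerate(counts.most_common(), start=1)
    ]


# Toy sample (an assumption); real analyses need a much larger corpus.
sample = "the cat sat on the mat and the dog sat on the log"
table = rank_frequency(sample)

# Under Zipf's law, rank * count is roughly constant in large corpora.
for rank, word, count in table[:3]:
    print(rank, word, count, rank * count)
```

On a real corpus, plotting count against rank on log-log axes is a common way to check how closely the distribution follows a straight line.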
PyData is an educational program of NumFOCUS, a 501(c)(3) non-profit organization in the United States. PyData provides a forum for the international community of users and developers of data analysis tools to share ideas and learn from each other. The global PyData network promotes discussion of best practices, new approaches, and emerging technologies for data management, processing, analytics, and visualization. PyData communities approach data science using many languages, including (but not limited to) Python, Julia, and R.
PyData conferences aim to be accessible and community-driven, with novice to advanced level presentations. PyData tutorials and talks bring attendees the latest project features along with cutting-edge use cases.
00:00 Welcome!
00:10 Help us add time stamps or captions to this video! See the description for details.
Want to help add timestamps to our YouTube videos to help with discoverability? Find out more here: https://github.com/numfocus/YouTubeVideoTimestamps
Video Information
Views: 65
Likes: 1
Duration: 46:23
Published: Nov 14, 2025