Text Analytics Spells “Big Savings”

Text analytics and natural language processing are extremely powerful concepts that are increasingly within organizations’ grasp. Many of the concepts for mining text to extract new information have existed since the mid-1980s, but with the rise of the data scientist the barrier to entry has been dramatically lowered. Before we talk about how text analytics might be useful to your organization, let’s establish a quick baseline of understanding.

What is Text Analytics?

Text analytics is roughly synonymous with text mining and text data mining. Technically it is not related to biblio-wizardry or vocabu-sorcery, but I’d still like to think there’s some magic left in the world. The whole idea behind text analytics is taking a body of text and extracting valuable, discrete, or new information. Think about your business, then think about how much of a paper trail there is: e-mails, contracts, invoices, industry publications, etc. Most organizations have an absolute mountain of text information that is likely providing little value right now beyond its original intended purpose.

(See more about turning data into insights: /data-activation-when-your-data-hands-you-lemons/)

What about Natural Language Processing?

Natural language processing is a subset of text analytics that deals with aspects of language such as identifying parts of speech, disambiguation, sentiment analysis, and the other vagaries of human language that computers will soon be better at understanding than we are. I’m afraid, though, that no amount of context clues can help me understand modern slang (https://thoughtcatalog.com/january-nelson/2018/09/millennial-slang/). I used to be cool, but now I’m just a data geek.

Text Analytics and Machine Learning

As you’d expect in the new frontier of data jiggery, there are quite a few different approaches to text analytics. Some of the more interesting approaches use machine learning to train a model on an existing corpus of text and apply that model to related text. Perhaps we’re looking to extract entities by identifying law firm names in a body of legal documents. Maybe we’re trying to measure a customer’s sentiment during a customer service call by identifying speech patterns and word choice. Maybe we’re trying to determine whether two historical works were actually written by the same author, or whether they’ve just been attributed to the same person. These are exciting use cases, and I doubt you have to think hard before you come up with something applicable to your own organization.

A Real World Example

UDig is working with an association that publishes scholarly articles. Their ask is to improve their ability to use the abstracts of the works to automatically match new content with specific peer reviewers. A high-level explanation of our approach to tackling the challenge follows.

First, we take the massive corpus of abstracts and do some simple pre-processing. We do things like remove stop words (“the”, “and”, etc.) and stem words (e.g., change “monitoring” to “monitor”). Next, we calculate a metric called TF-IDF. TF-IDF (which stands for “term frequency–inverse document frequency”) essentially counts the appearances of a particular word in a document and then penalizes the “score” for the word if it appears in many different documents. For example, the word “the” (if it weren’t already removed by our stop word elimination) would appear quite frequently in a single document, but because it appears in every document, it gets penalized to count for nothing. Conversely, if one article happens to be about “biblio-wizardry”, and only two other documents contain the term “biblio-wizardry”, we can start to assume those texts might be related, particularly as we assess other common terms across the documents.
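To make the steps above a little more concrete, here is a minimal sketch in plain Python. It is not the code from the project: the stop-word list, the crude `naive_stem` helper, and the sample abstracts are all made up for illustration, and the weighting shown (term count divided by document length, times the log of total documents over documents containing the term) is just one common TF-IDF variant.

```python
import math
from collections import Counter

# Tiny illustrative stop-word list; a real project would use a fuller one.
STOP_WORDS = {"the", "and", "a", "of", "in", "to", "is", "for", "on"}


def naive_stem(word):
    """Crude suffix stripping, standing in for a real stemmer (assumption)."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(text):
    """Lowercase, split on whitespace, strip punctuation, drop stop words, stem."""
    tokens = [t.strip(".,;:!?()\"'") for t in text.lower().split()]
    return [naive_stem(t) for t in tokens if t and t not in STOP_WORDS]


def tf_idf(docs):
    """Return one {term: score} dict per document.

    tf  = occurrences of the term in the document / terms in the document
    idf = log(number of documents / number of documents containing the term)
    """
    tokenized = [preprocess(d) for d in docs]
    n_docs = len(tokenized)
    # Count each term once per document it appears in.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        scores.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores


# Hypothetical abstracts, invented for this example.
abstracts = [
    "Monitoring the spells of biblio-wizardry",
    "Monitoring the markets",
    "Monitoring the weather",
]
scores = tf_idf(abstracts)
```

In this toy corpus “monitor” shows up in every abstract, so its inverse document frequency is log(1) and its score drops to zero, while “biblio-wizardry” appears in only one abstract and keeps a positive score.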

In this case, ranking scholarly articles using TF-IDF gives us a pretty good idea of when two documents are related and when they have little to do with each other. From there, we can take these terms and marry them up with peer reviewers. If we discover that one person has a penchant for reviewing articles about “biblio-wizardry” but never touches the (frankly more profane) “vocabu-sorcery”, we know how to route new abstracts as they come in by applying the same technique.
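One way to picture the routing step is sketched below. It assumes each abstract has already been turned into a `{term: weight}` dictionary by a TF-IDF step, builds a reviewer profile by summing the vectors of the articles that person has reviewed, and compares a new abstract to each profile with cosine similarity. Cosine similarity is my choice for the illustration rather than something the project description specifies, and the reviewer names and weights are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty vector is treated as unrelated to everything
    return dot / (norm_a * norm_b)


def build_profile(reviewed_vectors):
    """Sum the term weights of every article a reviewer has handled."""
    profile = {}
    for vec in reviewed_vectors:
        for term, weight in vec.items():
            profile[term] = profile.get(term, 0.0) + weight
    return profile


def rank_reviewers(new_vector, profiles):
    """Return (reviewer, similarity) pairs, best match first."""
    ranked = [(name, cosine(new_vector, p)) for name, p in profiles.items()]
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


# Hypothetical weights, as if produced by an earlier TF-IDF step.
profiles = {
    "reviewer_a": build_profile([{"biblio-wizardry": 0.4, "spell": 0.2}]),
    "reviewer_b": build_profile([{"vocabu-sorcery": 0.5, "grammar": 0.1}]),
}
new_abstract = {"biblio-wizardry": 0.3, "grammar": 0.05}
ranking = rank_reviewers(new_abstract, profiles)
```

Here the new abstract shares its heaviest term with `reviewer_a`, so that reviewer lands at the top of `ranking`. In practice you would likely want a similarity threshold as well, so that an abstract nobody is a good fit for gets flagged rather than routed to the least-bad match.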

How achievable is this?

The possibilities for text analytics are endless. While it can be challenging to extract the information and no two text analytics projects look the same, I believe there is an absolute treasure trove of value to be discovered. From automating discrete data identification to gaining a more holistic view of your customers, text analytics is worth investigating.

 

 
