Keyword Extraction

How to find the most important topics in customer feedback with AI


Customer feedback is valuable, but only if it is read and understood. Companies that see feedback channels as more than a hygiene factor in the dialogue with their customers need intelligent tools that can distil, from hundreds of thousands of short, individually worded texts, what is actionable for the brand and its management.

Keyword Extraction is one of these intelligent tools for better understanding large volumes of text data. It is an automated process for identifying the most relevant topics and expressions in texts. As the volume of customer feedback grows, artificial intelligence becomes indispensable for continuing to understand customers and target groups without a matching increase in effort. If you instead want to assign texts to predefined categories, for example to route support tickets or emails, Text Classification is probably the more suitable technique. Read our article on how Text Classification speeds up document processing in companies and what role Machine Learning plays in this.


Read in this article how you can use Keyword Extraction to get an overview of large amounts of customer feedback, convert potential customers and improve communication with your customers.

  1. Introduction to Keyword Extraction
    - What is Keyword Extraction?
    - How to understand customer feedback quickly and easily?
  2. How it works
  3. Use Cases & Applications
    - Product Testing
    - Customer Feedback
    - Spot Testing
  4. Keyword Extraction Tools & Resources

Introduction to Keyword Extraction

What is Keyword Extraction?

Keyword Extraction, also known as Keyword Detection or Keyword Analysis, condenses a text to its relevant topics and expressions. Texts are summarised in such a way that the central messages and how often they occur can be easily understood. With Deep Learning and Natural Language Processing, very large quantities of everyday language from surveys, social media comments, reviews or customer feedback can be analysed in a short time.

How to understand customer feedback quickly and easily?

There are various methods for analysing text data using artificial intelligence. What they have in common is that they reduce the time and effort required. In keyword extraction, the focus is on understanding the content of feedback. What topics do my customers talk about? Which topics are becoming relevant right now? What associations do my target groups have?

How it works


Keyword extraction prepares unstructured text in such a way that the content is fully understandable and finding relevant topics is easy. The method can be applied to text data such as customer feedback, surveys, social media comments, reviews or communication including chat conversations and e-mails.

Keyword extraction makes it possible to define actions and workflows based on content triggers. Customer enquiries can thus be channelled to the relevant specialist areas on the basis of topics, for example, or more urgent enquiries can be prioritised. This significantly reduces the risk of customers churning, enables companies to react more quickly to relevant enquiries and improves the user experience in the long term. Furthermore, keyword extraction enables a better understanding of customer situations, experiences and opinions, which is the basis for better business decisions.
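The idea of content triggers can be illustrated with a minimal sketch. The team names, trigger keywords and urgency terms below are purely hypothetical, and the matching is a simple keyword lookup rather than a trained model, so treat it as an illustration of the workflow, not as a production routing system.

```python
import re

# Hypothetical mapping from teams to trigger keywords (illustrative only).
ROUTING_RULES = {
    "billing": {"invoice", "refund", "payment"},
    "logistics": {"delivery", "shipping", "parcel"},
    "support": {"login", "password", "error"},
}
# Hypothetical terms that mark an enquiry as urgent.
URGENT_TERMS = {"urgent", "immediately", "cancel"}


def route_enquiry(text):
    """Return (team, is_urgent) for one enquiry based on keyword triggers."""
    tokens = set(re.findall(r"[a-z]+", text.lower()))
    team = "general"  # fallback when no trigger keyword matches
    for candidate, triggers in ROUTING_RULES.items():
        if tokens & triggers:
            team = candidate
            break
    return team, bool(tokens & URGENT_TERMS)


print(route_enquiry("My delivery is late, please respond immediately!"))
print(route_enquiry("Thanks for the great service"))
```

In practice the trigger keywords would come from the extraction step itself rather than from a hand-written list.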

Through the use of Machine Learning and Natural Language Processing, Keyword Extraction is automated, reducing effort and increasing speed regardless of the volume of data.

Several methods for automated keyword extraction can be distinguished. Approaches range from simple word counting to sophisticated deep learning methods that keep improving as they learn from new data. Which method is appropriate should be considered for each use case. Thanks to the technological progress of recent years, even sophisticated deep learning models are now affordable and, through zero-shot learning, can be applied immediately, which makes them suitable for a large number of use cases.
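The simplest end of this range, plain word counting, can be sketched in a few lines of Python. The stop word list and the sample comments are assumptions made for illustration. This is a frequency baseline only and does not represent the deep learning models mentioned above.

```python
import re
from collections import Counter

# Small illustrative stop word list (an assumption; real lists are much longer).
STOP_WORDS = {
    "the", "a", "an", "and", "or", "but", "is", "are", "was", "were",
    "it", "this", "that", "to", "of", "in", "on", "for", "with", "very",
    "i", "my", "too",
}


def extract_keywords(texts, top_n=5):
    """Return the top_n most frequent non-stop-words across all texts."""
    counts = Counter()
    for text in texts:
        tokens = re.findall(r"[a-z']+", text.lower())
        counts.update(t for t in tokens if t not in STOP_WORDS)
    return counts.most_common(top_n)


# Made-up sample comments.
feedback = [
    "The sound is great and the bass is rich.",
    "Great sound, but the volume is too weak.",
    "Bass is rich, sound is clear.",
]
top_keywords = extract_keywords(feedback, top_n=3)
print(top_keywords)
```

A counting baseline like this treats "sound" and "audio" as unrelated words, which is one reason learned models that group related terms are often preferred.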

In the following section you will learn about the different methods for keyword extraction. The focus is on the Machine Learning and Deep Learning based models, which Cauliflower also applies.


Use Cases & Applications


Whether you are a product manager who wants to measure products against consumer opinions, a marketer planning the next campaign, or part of a CX team that receives hundreds of pieces of feedback every day through various channels: Keyword Extraction can help you get an overview of all texts, understand the most important topics and search for specific phrases.

By using Keyword Extraction, teams become more efficient and can concentrate on their work, making actionable deductions and bringing the results into the organisation. Spare your team redundant tasks and empower them with interesting and aggregated insights from unstructured and granular text data.

Below you will find some common applications of Keyword Extraction:

  1. Product Testing
  2. Customer Feedback
  3. Spot Testing

Product Testing

People give their feedback on products via platforms like Amazon but also within online shops. In addition, certain target groups can be asked specifically to rate their own products or product concepts. The use of keyword extraction is helpful here.

Imagine you are a product manager planning a new set of products to be sold in your own online shop and your own stores. The assortment has already been defined. To get a better picture of the competition in the segment, you have selected reference products on Amazon. Scraping the Amazon reviews creates a database for a detailed analysis. Keyword Extraction then makes it possible to find out which topics play a role in the assortment, which topics have the highest relevance and how users describe these topics in concrete terms. The results are an ideal basis for designing the new product set.

“Angel Audio” is a fictional manufacturer of premium hi-fi audio speakers and headphones. Angel Audio is planning a new set consisting of headphones and portable speakers. By analysing Amazon reviews of competitor products, Angel Audio gets a good overview of the relevant topics. The results of the keyword extraction show the top topics and their share in the data: sound, volume and bass. The associations make the image of the products more concrete: sound: good / bass: rich / volume: weak
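How topic shares and associations of this kind could be derived is sketched below. The reviews, the topic list and the descriptor list are invented for the fictional Angel Audio example, and the clause splitting is a deliberately naive heuristic. A real system would learn topics and associations from the data instead of using fixed word lists.

```python
import re
from collections import Counter, defaultdict

# Hypothetical topic and descriptor vocabularies for the fictional example.
TOPICS = {"sound", "volume", "bass"}
DESCRIPTORS = {"good", "rich", "weak", "clear", "poor"}

# Made-up reviews.
reviews = [
    "Good sound, rich bass",
    "The sound is good but the volume is weak",
    "Rich bass, weak volume",
    "Good sound",
]


def topic_shares_and_associations(reviews):
    """Return (share of reviews per topic, most frequent descriptor per topic)."""
    mentions = Counter()
    associations = defaultdict(Counter)
    for review in reviews:
        seen = set()
        # Naive clause split so a descriptor is only linked to a topic in the same clause.
        for clause in re.split(r",|\.|\bbut\b|\band\b", review.lower()):
            tokens = set(re.findall(r"[a-z]+", clause))
            for topic in tokens & TOPICS:
                seen.add(topic)
                associations[topic].update(tokens & DESCRIPTORS)
        mentions.update(seen)  # count each topic at most once per review
    shares = {topic: mentions[topic] / len(reviews) for topic in mentions}
    top_association = {
        topic: counts.most_common(1)[0][0]
        for topic, counts in associations.items()
        if counts
    }
    return shares, top_association


shares, top_association = topic_shares_and_associations(reviews)
print(shares)
print(top_association)
```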

In order to test new product concepts or your own products, it is best to carry out a product test with a targeted random sample. With questions such as likeability, purchase probability or brand fit, you can get an assessment of your products. If you include indicators such as brand awareness, consideration and purchase of your own brand, you can differentiate in the analysis between buyers, potential customers and those who only know your brand.

The brand X is likeable.
  • Totally agree
  • Partly agree
  • I do not agree or disagree
  • Partly disagree
  • Totally disagree
How likely is it that you would buy this product for X€?
  • Very likely
  • Likely
  • Neither likely nor unlikely
  • Unlikely
  • Very unlikely
In your opinion, which products do not fit the X brand at all?
How well do you know the X brand?
  • I know this brand very well
  • I know this brand well
  • I know this brand a little
  • I know this brand by name only
  • I do not know this brand
Which of these suppliers are suitable for you when buying [Product Category]?

Open-ended questions are suitable for getting precise and individual feedback on individual products. With a spontaneous association question, you can better understand what strikes potential buyers. By using likes and dislikes, it is possible to work out very specific plus and minus aspects of products. Is it the cut, the material, the colour, the print or simply the lack of need? It is also possible to ask about these facets with closed questions. But only with open questions do you get the concrete reasons for an evaluation in the individuals' own words and thus concrete derivations for the adaptation of products or specifications for the development of new products as well as topics that you may not have noticed at all. 

To get even more precise feedback, you can use the Cauliflower Surveybot within surveys. It lets you ask a specific follow-up question about a specific topic in response to a respondent's answer.

The open questions generate valuable but unstructured feedback in the form of comments. Keyword extraction is well suited to understanding what respondents like and dislike about the products. By clustering terms that are related in content, you can see at a glance which topics are rated particularly negatively and what works especially well. A ranking of the most frequent keywords within the likes and dislikes provides a selection of relevant topics. Exploring individual keywords then reveals the concrete feedback behind them.

To go one step further, the feedback collected on products can be correlated with past sales figures. By training a specific deep learning model, predictions about product success can then be made based on the product evaluations. Read this post to find out how Cauliflower uses the method at Tchibo to find 80% of flop products during product conception.


Customer Feedback

The ongoing digitisation enables the collection of feedback at all touchpoints of the customer journey. Whether on an iPad in the shop, in a pop-up in the online shop or by email after a contact, the possible touchpoints for collecting feedback are almost endless. The occasions are just as unlimited: after the delivery of a product, after a support call, on a website with a new design, after the migration of a platform or after a shopping basket has been emptied. The result is precise feedback on real experiences: an El Dorado for experience measurement that was unimaginable a few years ago.

The methods for collecting feedback are manifold. They range from the classic Net Promoter Score (NPS), one of the most popular ways to collect customer feedback and measure customer loyalty, through school grades or star ratings, to differentiated evaluation of individual facets (friendliness, comprehensibility, competence, quality, waiting time etc.). What all these methods have in common is that the values are easy to aggregate into an average, to break down by groups such as locations, teams or sources, and to compare with benchmarks.

Net Promoter Score

The NPS, for which there are many industry-specific benchmarks, is particularly suitable for this purpose. From a simple question answered on a scale of 0 to 10, three customer types are derived: promoters, passives and detractors.

How likely are you to recommend X to a friend or colleague?

Calculation of Net Promoter Score
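In the standard NPS definition, respondents who answer 9 or 10 count as promoters, 7 or 8 as passives and 0 to 6 as detractors. The score is the percentage of promoters minus the percentage of detractors. The following sketch applies that definition to a made-up list of ratings.

```python
def net_promoter_score(ratings):
    """Compute the NPS (range -100 to 100) from a list of 0-10 ratings."""
    if not ratings:
        raise ValueError("at least one rating is required")
    promoters = sum(1 for r in ratings if r >= 9)   # ratings 9-10
    detractors = sum(1 for r in ratings if r <= 6)  # ratings 0-6
    # Passives (7-8) only enter through the total number of responses.
    return 100 * (promoters - detractors) / len(ratings)


ratings = [10, 9, 9, 8, 7, 7, 6, 3, 10, 5]  # made-up answers
nps = net_promoter_score(ratings)
print(nps)  # 4 promoters, 3 detractors, 10 responses -> 10.0
```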

What is missing from these indicators is the rationale behind the assessment. Therefore, a free text field is often added, in which respondents can give general feedback or justify their previous rating.

What comes to your mind spontaneously when you think of the new website?
What can we improve about our ordering process?
Tell us why you gave us a 7.

These texts are a valuable treasure trove of data. No other source provides insights that are so authentic and situational at the same time. Insights that follow an intrinsic motivation of consumers to give feedback on a specific experience at a specific moment. Especially verbal feedback, spoken or written, is so rich in nuances that it allows conclusions to be drawn about specific aspects, their value and context. It is information that has the potential to fill gaps that open up when we talk about brand experience. These messages in the feedback concretise the customer's image and concerns and do so in the consumer's own language instead of abstract marketing speak.

Keyword Extraction allows us to extract the relevant content from the wealth of feedback in order to address the most pressing problems, win back customers or compare different designs. By combining Keyword Extraction with ratings of the delivery process, you can deduce which delivery centres are performing better and what the reasons are from the customer's perspective. If a new website design is not as well received as the old one, the exact reasons for the poorer performance can be derived from the feedback. Because Keyword Extraction is automated, acute problems become visible immediately and important decisions can be made without losing time.

“The food was delicious but the delivery time was too long”
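Combining ratings with keywords, as described for the delivery centres, could look roughly like the sketch below. The centre names, ratings, comments, stop words and the threshold for a "low" rating are all assumptions made for illustration, and keywords are found by simple counting rather than by a trained model.

```python
import re
from collections import Counter, defaultdict
from statistics import mean

# Hypothetical records: (delivery centre, rating from 1 to 5, free-text comment).
feedback = [
    ("North", 5, "Fast delivery, friendly driver"),
    ("North", 4, "Delivery was fast"),
    ("South", 2, "The food was delicious but the delivery time was too long"),
    ("South", 1, "Long delivery time, driver was late"),
]
STOP_WORDS = {"the", "was", "but", "and", "too", "a"}


def summarise_by_centre(records, low_rating=2):
    """Return the mean rating and top keywords from low-rated comments per centre."""
    ratings = defaultdict(list)
    complaints = defaultdict(Counter)
    for centre, rating, comment in records:
        ratings[centre].append(rating)
        if rating <= low_rating:  # only low-rated comments explain poor performance
            tokens = re.findall(r"[a-z]+", comment.lower())
            complaints[centre].update(t for t in tokens if t not in STOP_WORDS)
    return {
        centre: {
            "mean_rating": mean(values),
            "top_complaint_keywords": [w for w, _ in complaints[centre].most_common(3)],
        }
        for centre, values in ratings.items()
    }


summary = summarise_by_centre(feedback)
print(summary)
```

In this made-up data, the lower-rated centre surfaces "delivery", "time" and "long" as its most frequent keywords in low-rated comments, which points to the reason behind the rating rather than just the rating itself.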


Spot Testing

Understanding whether a spot is well received and conveys the key messages is crucial. Relevant feedback ranges from the visibility of the new product and the brand perception to the likeability of the testimonial, the music or the editing. It is the interplay of these criteria that makes a spot successful.

Since the costs for placing commercials are high, it makes sense to test a web spot or, even better, different versions of a new spot before placing it in different channels. An online survey of specific target groups that are to be reached with the commercial is suitable for this purpose. In order to be able to put the respondents' evaluations into the right context, in addition to the most important socio-demographic indicators such as gender, age and state, attitudes or buying behaviour in the category concerned as well as the use of the most important brands should also be collected.

Online surveys can be used to measure a commercial in a number of ways. In the following we present four categories for measuring TV commercials: (1) engagement, (2) brand impact, (3) evaluation and (4) associations.


Measuring engagement is about understanding how involved respondents are with the spot being shown. Do the viewers feel strongly involved, or do they remain emotionally untouched by the spot? There are different methods to measure this involvement. In neuromarketing, brain activity during the spot is measured with an EEG to capture unconscious reactions. Another method is eye and emotion tracking. By measuring physiological reactions, conclusions can be drawn about the emotional evaluation of the stimulus shown (TV spot). These methods require an appropriate test environment with expert supervision. A less complex approach is to measure digitally whether web spots are watched to the end or stopped prematurely. In addition to the spot to be tested, other current reference spots should be tested for comparison.


Brand Impact

To ensure that a spot not only appeals but also works for a specific brand, it is important to measure the brand impact. Advertising for an entire product category is not very efficient. On the other hand, a spot with too large a spillover effect on competitors is even damaging. To measure brand impact, a distinction can be made between an unaided and an aided method. For the unaided procedure, an open question is asked about the advertised brand.


For which brand have you just seen a TV spot?


With the aided procedure, respondents are asked about a selection of predefined brands. Care should be taken to randomise the order of the brands between respondents in order to exclude sequence effects. In the aided procedure, answer categories can also be predefined to make the results more precise: sure, maybe, definitely not.
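Randomising the brand order per respondent can be sketched as follows. The brand names are placeholders, and seeding the random generator with a respondent ID is one possible design choice that keeps each respondent's order reproducible. Survey tools usually offer this as a built-in option.

```python
import random

BRANDS = ["Brand A", "Brand B", "Brand C", "Brand D"]  # placeholder names


def brand_order_for(respondent_id, brands=BRANDS):
    """Return a shuffled copy of the brand list for one respondent."""
    rng = random.Random(respondent_id)  # seeded, so the order is reproducible per respondent
    shuffled = list(brands)             # copy, so the original list stays untouched
    rng.shuffle(shuffled)
    return shuffled


for respondent_id in (1, 2, 3):
    print(respondent_id, brand_order_for(respondent_id))
```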

In addition to these methods, other questions are suitable, such as the question about concrete advertising content for individual brands or the assignment of individual frames from advertising to brands.


Evaluation and Associations

To check whether the spot addresses the right values, the targeted perceptions must be evaluated. A distinction should be made between emotional and content-related dimensions. 


Emotional dimension. The affective facets of the spot are recorded with questions about the emotional reactions. Is the spot funny, dramatic, energetic, sad, appealing or disturbing?  Does the spot fit my brand? Does the spot stand out from the competition?


Content dimension. Questions about the understanding of the content check whether the message of the commercial is conveyed. Does the spot convey the information and content that is being addressed? Are the advertised products or services perceived and understood?


Of course, it is conceivable to collect statements with a Likert scale in order to obtain specific feedback on individual dimensions. In order to obtain direct feedback in the respondents' own words, the collection of open text fields is suitable. It makes sense to collect a content summary from the respondent. This makes it clear what sticks in the view of the audience. In addition, it is also advisable to ask specifically for likes and dislikes:


What appeals to you most about this TV commercial?


What appeals to you least about this TV commercial?


Associations. Open questions are also suitable for measuring emotional reactions to commercials. Especially when recording feelings, prefabricated answer options or very specific statements are suggestive and thus influence the answers. In order to really understand what a TV spot triggers, an unprompted question about the respondents' thoughts is suitable:


What associations, thoughts or feelings come to mind after seeing this TV spot? 


There are several options for answering this question. Either 3 - 10 text fields are provided in which respondents can enter short associations. Alternatively, you can provide a larger text field to receive feedback in continuous text. Sometimes it can be exciting to motivate the respondents to "let their thoughts run free".


Open-ended questions are suitable for getting precise and individual feedback on individual spots. On the basis of likes and dislikes, very specific plus and minus points of a TV spot can be worked out. Is it the editing, the music, the actors or simply the story? You can also ask about these facets with closed questions. But only with open questions do you get the concrete reasons for an evaluation in the respondents' own words, and thus concrete starting points for adapting spots, guidelines for the conception of new spots and topics that you may not have noticed at all.


The Surveybot can also be used within Spot Testing to obtain even more precise feedback with a specially trained chatbot. This allows you to ask a specific question on a specific topic in response to a respondent's answer.


As useful as open-ended questions are for gathering opinions, they also entail elaborate evaluation processes. The manual evaluation of open-ended responses requires time, structured code plans and trained personnel. To understand what is in open-ended feedback, AI-based keyword extraction can be used. Deep Learning or Machine Learning models trained to understand language can automatically cluster content-related terms in a short time. This makes it possible to see at a glance what works particularly well or poorly in a spot. Associations with TV commercials become quantifiable because each topic can be given an exact share of the mentions. Experts in the fields of data science and natural language processing are needed to develop such models. But even without such experts, it is possible to benefit from the latest developments in the field of artificial intelligence. By using keyword extraction software, your own data can be analysed with little effort and at low cost.


Keyword Extraction Tools & Resources

Now that you know about keyword extraction, you are ready to take your first steps. There are several ways to do this. You can develop an entire custom system from scratch that runs within your system. This approach has the advantage of being a perfect fit for your needs and infrastructure. But this approach requires a lot of effort. 

In the field of Machine Learning and Natural Language Processing, there is a great open source community that continuously develops and publishes libraries. If you have the corresponding background in Data Science, you can find some exciting papers here. Join our Natural Language Processing Meetup to exchange ideas with other Data Scientists in the field of Natural Language Processing and to follow exciting talks.

If you don't have a background in programming or the right team to build a solution from scratch, there are two convenient alternatives that let you get started right away: use keyword extraction tools to analyse your own data and share the insights in your organisation, or connect keyword extraction APIs to integrate existing algorithms into your own solutions. Both options require little or no code, cost much less than a custom system and are scalable.

Keyword Extraction Tools

To choose the right tool, you should be clear about the use cases. All tools differ in scope and have advantages as well as disadvantages. We recommend that you try out a tool. Register for free at Cauliflower to get started with Keyword Extraction. 

  1. Cauliflower
  2. DiscoverText
  3. Mozenda
  4. PrediCX
  5. WordStat
  6. MonkeyLearn


Cauliflower

Focus: Companies of any size that want to analyse multilingual text data (surveys, reviews, customer feedback, news articles, social media etc.) with keyword extraction and sentiment analysis in intuitive graphics.

Cauliflower is an easy-to-use SaaS platform for instant text analytics. You can get started with keyword extraction right away, and thanks to Cauliflower's proprietary methodology, you can analyse data from multiple sources without having to train or customise a model. 

The use of Natural Language Processing and deep learning combines scalability to large numbers of data sets with an analysis quality comparable to that of human analysts. This lets you significantly reduce the effort in your company at an affordable price. With automatically generated, engaging dashboards, you can easily spread the insights throughout your organisation.

In addition, an API is also provided to link Cauliflower's various analytics and models to applications in your infrastructure. With easy documentation, you can start implementing in just a few steps.


DiscoverText

Focus: Fortune 1000 companies, academics and government agencies that need analytics solutions that streamline the process of capturing, storing and collaborating on natural language text data.

DiscoverText is a web-based collaborative text analytics system that allows academics, companies, and governments to schedule fetches from Twitter, import from SurveyMonkey, email, or a spreadsheet, to search, filter, cluster, human-code and machine-classify text. Texifter's CoderRank patent focuses on algorithmic techniques analogous to Google's PageRank for web search, but tailored to improve large scale, collaborative text analytics through enhanced machine-learning and the gamification of crowd source coding. (Source:


Mozenda

Focus: Individuals and teams within companies of all sizes. Mozenda is trusted by thousands of businesses and over 30% of the Global Fortune 500 companies.

Mozenda is a web scraping software and service, trusted by one third of the Fortune 500 and with five-star rated customer support. Mozenda provides: 1) cloud-hosted software, 2) on-premise software and 3) data services. With over 15 years of experience, Mozenda enables you to automate web data extraction from any website. (Source:


PrediCX

Focus: Companies who want to get more insight from their customer feedback to improve their customer journey, reduce costs and improve their call center operations.

PrediCX offers automatic classification of incoming customer data, whatever the channel. Understand what your customers are really saying, in near real time. Get early warnings of issues, fast track any complaints or urgent enquiries and get your customer channel right first time. CSat improved by up to 20%, FCR Improved by 50%, AHT reduced by 35% and Customer Churn reduced by 18%. (Source:


WordStat

Focus: Designed for academic institutions, government businesses, NGOs, and researchers using text mining and qualitative data analysis to find themes, trends, and topics in unstructured text data.

WordStat is software that helps people analyze very large amounts of written documents, e.g. customer surveys, political speeches, academic papers, emails or Twitter feeds. The software helps people find common topics, themes and hidden meanings in unstructured text data. A hotel chain may have 100,000 customer comments about their rooms, food service or outdoor facilities. The software helps search those responses and generate common themes or topics to target where they need to improve. (Source:


MonkeyLearn

Focus: Small and medium companies that need to turn text into actionable data. Users range from marketers to salespeople, customer support teams, data analysts, developers, among others.

MonkeyLearn is an AI platform that allows you to analyze text with Machine Learning to automate business workflows and save hours of manual data processing. Customers like Clearbit, Segment and Drift use MonkeyLearn to classify and extract actionable data from raw texts like emails, chats, web pages, documents, tweets and more. MonkeyLearn can be easily integrated via integrations like Google Sheets, Zapier, Zendesk or Rapidminer (no coding required) or via its API and SDKs. (Source:

Start with Keyword Extraction

Keyword extraction can be used to structure large amounts of data so that what is relevant becomes visible. With keyword extraction, customer feedback can be understood, processes can be automated and capacities can be saved and used for more important tasks. With this article, you have learned the basics of keyword extraction and been introduced to methods from the fields of computational linguistics and data science. With the use cases from the areas of product testing, customer feedback and spot testing, you have learned about possible applications in the areas of customer support, marketing and market research.

Now you can use Keyword Extraction to make better business decisions based on valuable insights. Getting started with Cauliflower is easy. Register now or request a personal demo from one of our consultants. Find out how you can use Keyword Extraction in your projects and stay one step ahead of the competition.

