Intelligent Conversations: Text Mining over Intercom data to Generate Customer Support Insights

Eleni Markou

For quite a while I have been wondering how we could possibly use the customer support data we get from sources like Intercom to improve our customer experience. Monthly reports and aggregated statistics can tell a lot about the team’s performance and what we can improve. But still, a huge piece of stored information is left out:

What are our customers talking about when they message us? What language are they using? What questions are they asking? What do they expect from our service?

To answer these questions, we need to look into the actual conversations that our customers are having with us over Intercom. Even though these data are unstructured, we can still use data science techniques such as text mining and clustering to gain insights.

I want to point out that I was very lucky to have the input and feedback of our friends at Hellas Direct. Hellas Direct is one of Blendo’s first customers, moving data from various data sources, one of which is Intercom.

Since 2012, Hellas Direct has been one of the fastest growing car insurance companies in Greece, majorly disrupting the insurance market and the way car insurance companies service their clients.

But before moving on to analyzing our options, I need to mention that none of what follows is a ready-to-use solution. The project is still ongoing and I just want to share with you my thoughts and some initial results I came up with by applying a few methodologies. I would be more than happy to discuss my work with you, so do not hesitate to leave comments or contact me directly 🙂

Back to cool data stuff… For our problem, I have identified a few potential approaches:

  1. Identify similar conversations using text mining techniques

You may be wondering, why should we care if a customer’s conversation is similar to some other customer’s conversation?  Of course we should care.

A customer will usually interact with us when they have a specific question. Questions that have appeared once are very likely to come up again, so being able to take a question and query our FAQ database for a good answer will save us a lot of time and effort. Think of it as a real-time suggestion engine to be used by our customer support when answering a question. Wouldn’t that be great?

  2. Clustering conversations by subject

Is this going to be of any help? Of course, it will. This way we will be able to identify the most common topics in our conversations. Furthermore, by clustering our customers together based on the conversations they are having with us, we can look into their demographics and gain better insights.

Is our website hard to understand by a particular audience? Are we using confusing language for one of our offers?  Do we need to change our messaging when addressing a certain demographic?

Task #1: Conversation similarity

Given a conversation, we would like to retrieve others similar to this one. The steps performed as part of this text mining process are the following:

  • Step 1 – Shingling: Each conversation is broken down into its structural elements (shingles). In our case, each shingle contains k consecutive words.
  • Step 2 – Minhashing: For improved performance, compact representations of the shingle sets (signatures) are extracted in a way that preserves similarity.
  • Step 3 – LSH: Signatures are used to map documents into buckets so that similar documents are more likely to land in the same bucket.
  • Step 4 – Document Comparison: Similarity is computed based on the assumption that only documents from the same bucket are likely to be similar to each other. Hence, calculations are performed only for these candidate pairs of near neighbors.
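The four steps above can be sketched in a few lines of plain Python. This is only a minimal illustration under my own assumptions, not the implementation used in the project: the function names, the choice of `blake2b` as the hash family, 64 hash functions, 16 bands and k = 3 are all illustrative.

```python
import hashlib
from collections import defaultdict
from itertools import combinations


def shingles(text, k=3):
    """Step 1: return the set of k-word shingles of a text."""
    words = text.lower().split()
    if len(words) < k:
        return {" ".join(words)} if words else set()
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}


def _hash(seed, shingle):
    """One member of a seeded hash family (64-bit output)."""
    digest = hashlib.blake2b(
        shingle.encode("utf-8"),
        digest_size=8,
        salt=seed.to_bytes(8, "little"),
    ).digest()
    return int.from_bytes(digest, "little")


def minhash_signature(shingle_set, num_hashes=64):
    """Step 2: keep the minimum hash value per hash function.

    Assumes a non-empty shingle set.
    """
    return [min(_hash(seed, s) for s in shingle_set) for seed in range(num_hashes)]


def lsh_candidate_pairs(signatures, bands=16):
    """Step 3: documents that agree on every row of at least one band
    fall into the same bucket and become candidate pairs."""
    rows = len(next(iter(signatures.values()))) // bands
    candidates = set()
    for b in range(bands):
        buckets = defaultdict(list)
        for doc_id, sig in signatures.items():
            buckets[tuple(sig[b * rows:(b + 1) * rows])].append(doc_id)
        for ids in buckets.values():
            candidates.update(combinations(sorted(ids), 2))
    return candidates


docs = {
    "a": "i want to ask about my green card please",
    "b": "i want to ask about my green card please",
    "c": "how can i pay with the payment code today",
}
sigs = {doc_id: minhash_signature(shingles(text)) for doc_id, text in docs.items()}
pairs = lsh_candidate_pairs(sigs)  # Step 4 would compare only these pairs
```

More bands with fewer rows each make the bucketing more permissive (more candidate pairs, fewer misses), while fewer and longer bands make it stricter.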

A commonly used metric for document comparison is Jaccard similarity, i.e. the ratio of the number of components shared by two conversations (intersection) to the total number of distinct components in both (union). The Jaccard similarity can be computed using either shingles or signatures, since the main principle of minhashing is that the fraction of positions in which two signatures agree is, in expectation, equal to the Jaccard similarity of the underlying shingle sets.
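As a small worked example of the metric, here is a sketch that computes Jaccard similarity directly on word shingles, together with the signature-based estimate. The function names and the choice of k = 2 are assumptions made purely for illustration.

```python
def word_shingles(text, k=2):
    """Set of k-word shingles, each stored as a tuple of words."""
    words = text.lower().split()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """|A ∩ B| / |A ∪ B|; defined here as 0.0 when both sets are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def signature_similarity(sig_a, sig_b):
    """Estimate of Jaccard similarity: share of positions where two
    equally long minhash signatures agree."""
    matches = sum(1 for x, y in zip(sig_a, sig_b) if x == y)
    return matches / len(sig_a)


query = word_shingles("i want a green card")
neighbor = word_shingles("i want a green card please")
similarity = jaccard(query, neighbor)  # 4 shared shingles out of 5 distinct
```

Here the two texts share 4 of their 5 distinct shingles, so the similarity is 4/5.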


Procedure Diagram

Gathering and cleaning of data

So, that’s the algorithm we are going to utilize. Before continuing with the actual implementation we need to gather some data and represent them in a way convenient for the analysis we are conducting.

As mentioned before, the data we chose to work with come from Intercom and consist of insurance-related conversations. Accessing them is a piece of cake when using Blendo, so collecting the Intercom data wasn’t cumbersome at all.

Regarding the amount of data needed, there is no restriction. Even with a fairly small amount of data we can still calculate the distances and retrieve neighboring conversations. Yet it’s recommended to gather as much data as possible in order to capture all underlying patterns.

The next step of data preparation includes cleaning and some extra transformations. More specifically:

  1. We kept only conversations initiated by customers.
  2. We removed punctuation, special characters, HTML tags etc.
  3. We reconstructed the conversation’s body, as in many cases it was broken into multiple columns.

Finally, only the conversation’s id and its body were kept for further analysis.
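The preparation steps above could look roughly like the sketch below. The row layout (`id`, `author_type`, `body_parts`) is hypothetical and does not reflect the actual schema of the synced Intercom tables, and lowercasing is an extra assumption of mine.

```python
import html
import re

TAG_RE = re.compile(r"<[^>]+>")       # HTML tags
NON_WORD_RE = re.compile(r"[^\w\s]")  # punctuation and special characters


def clean_body(parts):
    """Rebuild a body split across several columns, then strip markup
    and punctuation."""
    text = " ".join(p for p in parts if p)
    text = html.unescape(TAG_RE.sub(" ", text))
    text = NON_WORD_RE.sub(" ", text)
    return " ".join(text.lower().split())


def prepare(rows):
    """Keep customer-initiated conversations as {id: cleaned body}.

    The field names used here are hypothetical.
    """
    return {
        row["id"]: clean_body(row["body_parts"])
        for row in rows
        if row["author_type"] == "user"
    }


rows = [
    {"id": 1, "author_type": "user",
     "body_parts": ["<p>Hello,", None, "I want a green card!</p>"]},
    {"id": 2, "author_type": "admin", "body_parts": ["Hi there"]},
]
prepared = prepare(rows)
```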


After completing the development of the algorithm, we ran many tests to assess the quality of the results and whether they were aligned with our human understanding.

For that reason, we selected a large number of query conversations and retrieved their nearest neighbors based on Jaccard similarity. To our delight, the model did seem able to select relevant conversations and correctly identify the main subject of each one.

A few indicative examples are the following:

Example 1
  • Query conversation: I pay 138 for the same coverage and the same amount of time
  • Neighbor conversation retrieved: I would like to ask about the coverages I have because they opened the car and stole the radio

Example 2
  • Query conversation: I’m a member of Aegean miles and bonus and I would like to redeem Aegean’s offer for my car’s insurance which starts in May How can this be done
  • Neighbor conversation retrieved: I want to claim the offer for the Aegian’s 5000 miles but I do not want to buy today because the car plates are deposited in the tax office So I will secure the car as soon as I get them back  what do I have to do

The two examples confirm the above-mentioned observations. Specifically, in the first example, both conversations, i.e. the query and the response, ask for information regarding certain insurance coverages. In the second example, both customers are interested in a certain discount offered by the company. It is interesting that despite the typo in the response conversation (“Aegian” instead of “Aegean”) the algorithm manages to correctly locate the right keywords in the text.

Task #2: Conversation clustering

The other proposed approach involves clustering the Intercom conversations. Text clustering aims to create internally coherent clusters that are distinct from each other, and is widely used for:

  1. Grouping of similar documents
  2. Analysis of customer feedback
  3. Discovery of meaningful implicit subjects across multiple documents.

Additional data cleaning

In addition to previously mentioned preprocessing steps we also included:

  1. Removal of punctuation and frequent words. For instance, common words such as articles, linking words, conjunctions and greetings may not be very informative regarding the topic of a conversation. The same applies to punctuation marks.
  2. Removal of very rare words. It can be assumed that infrequent words may be the result of typos or spelling mistakes, so we removed those that appeared fewer than 2 times in the whole dataset.
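A minimal sketch of these two filters follows. The stop-word list is a tiny illustrative one that I made up for the example, not the list actually used; the threshold of 2 occurrences is the one mentioned above.

```python
from collections import Counter

# Illustrative stop words only; a real list would be much longer
# and language-specific.
STOP_WORDS = {"the", "a", "an", "and", "or", "to", "at", "i",
              "hello", "hi", "please"}


def filter_tokens(docs, stop_words=STOP_WORDS, min_count=2):
    """Drop stop words and words seen fewer than min_count times
    across the whole dataset."""
    tokenized = [doc.lower().split() for doc in docs]
    counts = Counter(word for tokens in tokenized for word in tokens)
    return [
        [w for w in tokens if w not in stop_words and counts[w] >= min_count]
        for tokens in tokenized
    ]


docs = [
    "hello i want a green card",
    "the green card please",
    "payment codee",
]
filtered = filter_tokens(docs)
```

Note that a frequency threshold also removes legitimate but rare words, so it trades some recall for a cleaner vocabulary.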

Since clustering implies the measurement of some kind of distance so that Intercom conversations in close distance will be assigned to the same cluster, we need to determine an appropriate metric. In our case cosine distance was selected over others like Euclidean, as more appropriate for textual input.
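To make the metric concrete, here is a small sketch of cosine distance on raw term counts. Representing a conversation by its term counts is an assumption for illustration; the weighting actually used is not stated here.

```python
import math
from collections import Counter


def cosine_distance(tokens_a, tokens_b):
    """1 - cosine similarity of two term-count vectors."""
    a, b = Counter(tokens_a), Counter(tokens_b)
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    if norm == 0:
        return 1.0  # treat an empty document as maximally distant
    return 1.0 - dot / norm


short = ["green", "card"]
long = ["green", "card", "green", "card"]
unrelated = ["payment", "code"]

same_topic = cosine_distance(short, long)       # ~0: same word proportions
different = cosine_distance(short, unrelated)   # 1.0: no shared words
```

One commonly cited reason for preferring it over Euclidean distance on text is visible here: a message and a longer message with the same word proportions end up at distance (approximately) zero, regardless of length.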


After experimenting with the number of clusters, we selected k-means clustering with 6 clusters. For each of the produced clusters, the top 6 words closest to the cluster’s center according to cosine similarity are the following:

Cluster 0 words: green,  card,  want,  at,  please,  mail

  • Cluster 0 includes conversations regarding inquiries for issuing a green card, a special gas card that certifies vehicles for low emissions.

Cluster 1 words: payment,  pay,  wanted,  code,  codes, the

  • Cluster 1 includes conversations regarding payments and payment codes.

Cluster 2 words: the,  from,  refer,  friend,  doing,  renewal

  • Cluster 2 refers to a certain promotional action called “Refer a Friend”

Cluster 3 words: motorcycle,  happening,  happen,  following,  questions, question

  • Cluster 3 refers to motorcycle insurance

Cluster 4 words:  insurances,  at,  am,  at,  doing, want

  • Cluster 4 includes general conversations regarding insurance

Cluster 5 words:  offer,  how,  one,  again,  came,  deposit

  • Cluster 5 includes conversations regarding offers and discounts

From the results, it seems that the majority of the clusters represent some specific topic. For example, cluster 0 refers exclusively to questions and issues regarding green cards while cluster 2 to a specific promotional action called “Refer a Friend”. In contrast, there are also clusters that appear to have more general topics such as cluster 4.
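For readers who want to experiment, the clustering and the “top words per cluster” listing can be sketched as below. To be clear about the assumptions: this is a spherical k-means variant (k-means on unit-length vectors with cosine similarity) written in plain Python, with raw term counts as features and a deterministic farthest-first initialization. It is an illustration, not the code that produced the clusters above.

```python
import math
from collections import Counter, defaultdict


def unit_vector(tokens):
    """Term-count vector scaled to unit length (sparse dict)."""
    counts = Counter(tokens)
    norm = math.sqrt(sum(v * v for v in counts.values()))
    return {w: v / norm for w, v in counts.items()} if norm else {}


def dot(u, v):
    """Dot product; equals cosine similarity for unit vectors."""
    if len(u) > len(v):
        u, v = v, u
    return sum(x * v.get(w, 0.0) for w, x in u.items())


def mean_direction(vectors):
    """Normalized sum of vectors: the cluster center on the unit sphere."""
    total = defaultdict(float)
    for vec in vectors:
        for w, x in vec.items():
            total[w] += x
    norm = math.sqrt(sum(x * x for x in total.values()))
    return {w: x / norm for w, x in total.items()} if norm else {}


def farthest_first(vectors, k):
    """Deterministic init: each new center is the vector least similar
    to the centers chosen so far."""
    centers = [vectors[0]]
    while len(centers) < k:
        centers.append(min(vectors, key=lambda v: max(dot(v, c) for c in centers)))
    return centers


def spherical_kmeans(token_lists, k, iterations=20):
    vectors = [unit_vector(tokens) for tokens in token_lists]
    centers = farthest_first(vectors, k)
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assign each document to the most similar center
        labels = [max(range(k), key=lambda c: dot(v, centers[c])) for v in vectors]
        # Recompute each center from its members
        for c in range(k):
            members = [v for v, label in zip(vectors, labels) if label == c]
            if members:
                centers[c] = mean_direction(members)
    return labels, centers


def top_words(center, n=6):
    """Words with the highest weight in a center, i.e. the words whose
    one-hot vectors have the highest cosine similarity to it."""
    return [w for w, _ in sorted(center.items(), key=lambda item: -item[1])[:n]]


token_lists = [
    ["green", "card"],
    ["payment", "code"],
    ["green", "card", "mail"],
    ["payment", "code", "pay"],
]
labels, centers = spherical_kmeans(token_lists, k=2)
```

On this toy input the two “green card” messages end up in one cluster and the two payment messages in the other.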

Yet it is evident that extra cleaning is needed, as in many cases stop words dominated the produced clusters. Lemmatizing would also be useful, since different grammatical forms of the same word appeared multiple times in the results. In general, this is quite an easy task, since you don’t have to do it “by hand” as we did; libraries such as NLTK support a large variety of languages.

To get a visual representation on a 2-dimensional plane, as shown below, we used the first two components:

Resulting Clusters

Use cases

So far so good, but you may wonder what we can do with the results and how these models could be incorporated into something bigger that would help you optimize your customer support. The answer is not that straightforward, as it depends on the type of your business and your business goals, but there are a few interesting proposals that we came up with.

The conversation similarity computation, for example, can lead to the specification of predefined answers that can reduce the first response time to customers. Going even further, we can build a chatbot, probably in the form of a response recommendation engine, to help Intercom operators in handling incoming inquiries.

In some business cases, it can also be used as part of fraudulent claim detection based on feature derivation from other similar reports.

Furthermore, clustering can optimize the customer support process, e.g. depending on the topic detected the question can be assigned automatically to a specific person.

Even as a standalone project, it can prove helpful in cases where the incoming inquiries are numerous, as grouping them into broader categories can simplify the whole process a lot.
