In today’s digitalized world, data insights into the customer’s spending habits and personal finance management have become an essential part of digital banking. However, the process of categorizing transactions can be challenging, particularly when it comes to merchant category code (MCC) based categorization. In this study, we have recognized such challenges and limitations and we bring you a deeper insight as well as solutions to the issues addressed.
MCC is short for Merchant Category Code, a four-digit classification code that defines the type of products or services provided by the merchant. The history of merchant classification codes goes back to the 1980s. The modern form of the MCC classification system was created in 2003 by the International Organization for Standardization (ISO).
Payment providers implemented this classification system for transaction categorization (the most widely known are implementations by Visa and MasterCard). The implementations by the different providers are fundamentally the same, differing only in the details.
In addition to categorizing payments, MCCs were created to identify risky transactions, determine interchange fees, and file tax returns (in the US).
At Dateio, when we started to develop TapiX, a payment enrichment API that is partially responsible for payment categorization, we thought that categorization would be easy to solve thanks to existing MCCs. However, when working with MCCs, we faced several obstacles that make it difficult to use them directly to classify payments and merchants. We soon realized that it would not be so simple and that MCCs do not in fact map cleanly onto categories.
While MCC codes can serve as a useful starting point, other additional factors and some sorting and reorganization of MCCs need to be included in the categorization process. With over a thousand different codes, grouping them into several broader units (categories) that will better reflect types of customer expenses and their spending habits has been necessary.
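The grouping idea can be sketched as a simple lookup table. This is only an illustration: the category names and the assignments below are assumptions made for the example, not the actual TapiX mapping, and codes that are too ambiguous to map on their own deliberately return no category.

```python
from typing import Optional

# Illustrative grouping only: category names and assignments are assumptions.
MCC_TO_CATEGORY = {
    "5411": "Groceries",    # Grocery Stores and Supermarkets
    "5441": "Groceries",    # Candy, Nut, Confectionery Stores
    "5814": "Restaurants",  # Fast food restaurants
}

# Codes treated here as too ambiguous to categorize from the MCC alone.
AMBIGUOUS_MCCS = {"5399", "5999", "7399"}


def category_for_mcc(mcc: str) -> Optional[str]:
    """Return the broad category for an MCC, or None when the code alone is not enough."""
    if mcc in AMBIGUOUS_MCCS:
        return None
    return MCC_TO_CATEGORY.get(mcc)
```

Returning `None` instead of a default category keeps the unmapped and ambiguous codes visible, so other signals can decide where those transactions belong.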
When comparing strictly MCC-based categorization with more complex categorization, which takes multiple factors into account, we discovered that only 63 % of transactions were correctly categorized based on their MCC code. So, using only MCC codes, more than every third transaction ends up in the wrong bucket or in no bucket at all, which prevents effective use in personal finance management (PFM) and does not provide the desired added value for the clients. In our experience, a 95 % categorization success rate is both ideal and achievable for providing clients with high-quality data insights.
Why do MCCs have such a low success rate? Let’s look at the biggest challenges associated with MCCs, how to overcome them, and how to get more than 95% of the data reliably categorized.
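A success rate of this kind can be computed by comparing the MCC-derived category of each transaction with a reference category. The sketch below assumes the reference labels already exist; the function name and the toy sample are invented for illustration and are not Dateio data.

```python
def mcc_only_accuracy(pairs):
    """Fraction of transactions whose MCC-based category equals the reference category.

    `pairs` is a sequence of (mcc_category, reference_category) tuples; an
    mcc_category of None means the MCC put the transaction in no bucket at all.
    """
    pairs = list(pairs)
    if not pairs:
        return 0.0
    correct = sum(
        1 for mcc_cat, ref_cat in pairs
        if mcc_cat is not None and mcc_cat == ref_cat
    )
    return correct / len(pairs)


# Toy sample, invented for illustration only.
sample = [
    ("Groceries", "Groceries"),             # correct
    ("Consumer Goods", "Shopping Online"),  # wrong bucket
    (None, "Restaurants"),                  # no bucket at all
    ("Restaurants", "Restaurants"),         # correct
]
print(f"{mcc_only_accuracy(sample):.0%}")  # prints 50%
```

Both failure modes named in the text, a wrong bucket and no bucket, count against the rate in this sketch.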
Some MCCs cover several industries and they cannot be assigned to just one category from the consumers’ perspective.
Sometimes it is clear that a code belongs to 2 (or 3) different categories just from the name of the code or its description by the payment providers. For example, code 5045: Computers and Computer Peripheral Equipment and Software belongs (according to the Quick Reference Booklet by MasterCard) to computer software, hardware, or related equipment which matches various consumer categories.
Several MCC codes are highly ambiguous and are spread across such a wide range of industries that they contribute almost nothing to categorization. This applies to codes such as 5399 – Miscellaneous General Merchandise, 5999 – Miscellaneous & Specialty Retail Stores, and 7399 – Business Services.
Let’s have a look at the code 5999. According to the Visa Merchant Data Standards Manual, this MCC should be used only when a merchant cannot be classified with another, more specific MCC. This (according to Visa and MasterCard) includes stores selling the following special goods: ammunition, maps, fireworks, picture frames, monuments, etc.
Most of the stores mentioned for this MCC by payment providers in the manuals should fall under the Consumer Goods category. And the reality?
As can be seen in Tables 1 and 2, most of the transactions with this MCC belong to the category Shopping Online. The Consumer Goods category is only in fifth place. The most common merchant types are e-shops selling basically everything and payment gateways. Amazon plays a big role here: in our database, almost 54% of all transactions with MCC 5999 belong to Amazon.
Another category of MCCs not suitable for direct use in categorization are codes that do not say WHAT type of products/services were provided, but HOW the services were offered/HOW the customer pays for the services.
Good examples are these codes:
Subscriptions are becoming a common form of payment these days with customers paying regularly for media of all kinds: newspapers, e-books, music, streaming services, etc. Subscriptions are also standard for games, educational courses, sports clubs, or fitness centers.
On the other hand, a lot of MCCs are very specific. Most of them are related to transportation and travel. Many airlines, car rentals, hotel chains, and casinos have their own MCC – ISO has reserved the range from 3000 to 3999 for them (Table 4).
Is it true that the more specific the code is, the more accurate and reliable it is? Yes, in general they are reliable. One simply needs to go through all of them and decide where to map them. But still, one must be careful and validate the decisions.
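Detecting the reserved range is a one-line check. The helper below is a minimal sketch; the function name is an assumption, and it only tells you that a code is merchant-specific, not which category it should be mapped to.

```python
def is_merchant_specific_mcc(mcc: str) -> bool:
    """True if the code lies in the 3000-3999 range reserved for individual merchants."""
    return len(mcc) == 4 and mcc.isdecimal() and 3000 <= int(mcc) <= 3999
```

Codes flagged this way still need a manual mapping decision and validation, as described above.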
Some stores with unusual products have their own MCC – for example, Orthopedic Goods – Artificial Limb Stores (5976), and Rubber Stamp Store (5974). These codes may be too detailed for the purposes of banks and their clients. For such codes in particular, it is important to group them into larger units.
One might be surprised that for some common industries (e.g. cafés) no code fits at all. Merchant Costa Coffee is a case in point (Table 6). Who would say, based purely on these MCCs, that it’s a coffee shop?
According to MasterCard’s Quick Reference Booklet the MCC should reflect the merchant’s primary business. In the case of cafés, the primary service is the preparation of coffee drinks – not baking, cooking, or serving alcoholic beverages (as highlighted in Table 6).
On the other hand, several overlapping and even duplicate codes can be found. Overlaps can be seen, for example, in codes relating to furniture or computers. And last but not least, some codes (e.g., 4821: Telegraph Services) are becoming redundant nowadays.
A complication may be the existence of country-specific codes. For instance, purely for Spain, there are some duplicate codes to the global standard variants – e.g., Spain-specific code 1465 is an exact duplicate of the global code 5441 – Candy, Nut, Confectionery Stores.
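One way to handle such duplicates is to normalize country-specific codes to their global equivalents before any categorization takes place. The sketch below contains only the one pair named above; the table name, the function name, and the use of two-letter country codes are assumptions for the example.

```python
# (country, country-specific MCC) -> global MCC.
# Only the Spanish example from the text is listed; a real table would be larger.
COUNTRY_SPECIFIC_TO_GLOBAL = {
    ("ES", "1465"): "5441",  # Candy, Nut, Confectionery Stores
}


def normalize_mcc(mcc: str, country: str) -> str:
    """Map a country-specific MCC to its global duplicate; leave other codes unchanged."""
    return COUNTRY_SPECIFIC_TO_GLOBAL.get((country.upper(), mcc), mcc)
```

Keying the table on the country as well as the code avoids rewriting a code that happens to have a different meaning elsewhere.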
All the above characteristics make the MCC classification system quite confusing and are one of the reasons for poorly chosen MCCs for transactions. However, it is not only the imperfections of the system that are to blame. The merchants themselves are also behind the unreliability of MCCs.
Even in cases where none of the above problems occur, MCCs can be misleading. It can be an unintentional error on the part of the merchant, but it can also be a purposeful substitution of the MCC.
The MCC is given to the merchant when they start accepting cards as a payment option. The MCC could match the merchant’s focus at that time. But typically, merchants evolve and expand their services as time goes on.
Setting up a matching MCC is usually not a business priority for merchants. So merchants tend to stick to the MCC which was given to them initially.
On the other hand, there are situations where the merchant is very interested in their MCC. Merchants may want to avoid codes associated with certain industries, especially codes from a publicly known list of high-risk MCCs. This is because a high-risk code means higher interchange rates (paid by the merchant to the cardholder’s bank for each payment). In some countries and with some banks, the merchant may even risk payment rejection.
The mismatches between a merchant’s MCC and its actual category and field of operation can be absurd and one might even say twisted, as can be seen in the examples in Table 7.
Interesting situations occur with chains and franchises. It is common for one merchant, especially one that operates a chain of stores, to have different MCCs between locations. Within a single merchant, MCCs can vary on several levels.
First, a single merchant with multiple departments may (but need not) have terminals with different MCCs. Further, branches are largely independent in their choice of MCCs, leading to inconsistent MCCs for a single merchant within a region, country, or continent, up to the entire world. Unsurprisingly, country-specific codes also play a large role in inconsistencies.
We’ve chosen McDonald’s in Spain as an example of inconsistencies of MCCs at the level of one country (Table 8). And inconsistencies at the global level are shown with the example of IKEA (Table 9).
For the Spanish McDonald’s there are 3 country-specific codes that are not used in other countries. Yet, as we can see in Table 8, transactions from McDonald’s Spanish branches also have other MCCs – notably the MCC typical of fast food restaurants (5814), but also some wrong codes (5411 and 5813).
For IKEA (Table 9), the second most frequent code is 5999. This is an example of an incorrect code choice. As we already mentioned, the 5999 code should not be used if there is a more specific code for the industry. And for furniture retailers, there are plenty of MCCs. Perhaps too many, which can lead to inconsistent MCC selections within a merchant. IKEA operates large stores with several departments, so we also see the MCC for department stores (5200) in the table. As mentioned above, each department may have its own terminal, which may also explain the more specific MCCs (5021, 5713) found among IKEA transactions.
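Inconsistencies like these can be surfaced by counting the MCCs seen for each identified merchant. The following is a minimal sketch that assumes transactions are already linked to a merchant; the function names and the toy data are invented for illustration.

```python
from collections import Counter, defaultdict


def mcc_counts_by_merchant(transactions):
    """Count how often each MCC occurs per merchant.

    `transactions` is an iterable of (merchant, mcc) pairs.
    """
    counts = defaultdict(Counter)
    for merchant, mcc in transactions:
        counts[merchant][mcc] += 1
    return counts


def inconsistent_merchants(transactions):
    """Return {merchant: MCCs ordered by frequency} for merchants with more than one MCC."""
    return {
        merchant: [mcc for mcc, _ in counter.most_common()]
        for merchant, counter in mcc_counts_by_merchant(transactions).items()
        if len(counter) > 1
    }


# Invented toy data; the merchant names are placeholders.
toy = [
    ("Merchant A", "5814"), ("Merchant A", "5814"), ("Merchant A", "5411"),
    ("Merchant B", "5411"),
]
print(inconsistent_merchants(toy))  # {'Merchant A': ['5814', '5411']}
```

The frequency ordering makes it easy to see which code dominates for a merchant and which ones are outliers worth reviewing.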
MCCs are useful in categorizing payments, but we need to be aware of the risks and limits of categorization based purely on them. MCCs allow us to very quickly get an idea of the type of products or services provided by a merchant. But it is important to keep in mind the risk that more than one in three transactions categorized purely by MCC ends up miscategorized (Figure 1).
A better approach to get a more accurate categorization is to consider other information than just the MCC associated with the transaction. Other transaction-specific characteristics (especially merchant descriptions) can be useful. Ideally, the merchant is identified first. Other important sources of information are the websites and social media pages of merchants. Artificial intelligence, such as machine-learning algorithms, can also help in categorization.
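The merchant-first idea can be sketched as a lookup that only falls back to the MCC when the merchant is not recognized. This is a deliberately simplified illustration, not the TapiX implementation: the substring matching, the lookup tables, the category names, and the sample descriptions are all assumptions.

```python
def categorize(description, mcc, merchant_categories, mcc_categories):
    """Prefer the category of an identified merchant; fall back to the MCC."""
    text = description.casefold()
    for merchant_name, category in merchant_categories.items():
        if merchant_name.casefold() in text:
            return category
    return mcc_categories.get(mcc, "Uncategorized")


# Hypothetical lookup tables; names and categories are assumptions.
MERCHANT_CATEGORIES = {"costa coffee": "Cafes"}
MCC_CATEGORIES = {"5411": "Groceries", "5814": "Restaurants"}

# The merchant description wins over the MCC when the merchant is known.
print(categorize("COSTA COFFEE 0042", "5814", MERCHANT_CATEGORIES, MCC_CATEGORIES))
```

A production system would replace the substring match with proper merchant identification, but the order of precedence stays the same: merchant first, MCC as a fallback.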
When categorizing payments, a certain level of work with MCCs is unavoidable. It is, therefore, necessary to know how to deal with them. Mainly, the MCC codes need to be sorted and grouped into several broad boxes – the categories, because the MCCs themselves (there are over a thousand of them) do not directly correspond to customer spending behavior.
However, there are situations where a system of pre-defined ‘boxes’ (categories) becomes binding and rigid. What about merchants that operate in multiple industries? These merchants are in between categories – one category is not enough to categorize such a merchant. How did we approach this at TapiX?
Our categorization system consists of 25 categories. In addition, we use tags. We have over 500 tags, spread over several layers. Some tags are more general, while others are more specific – in any case, all tags describe the merchant’s business in more detail than categories.
Payment labeling adds flexibility, but also complexity, to our system. We are free to combine tags independently of the category and thus capture the merchant’s services or products in more depth. The system also works for transactions that do not have MCCs at all (see Table 10 for examples).
We regularly revise the tag list. We can react quickly to the trends of the times and create new tags after careful consideration. In comparison, MCC lists are revised by payment providers only after 5 years.
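A category-plus-tags label could be modeled as a small record in which the MCC is optional. The structure below is a hypothetical sketch; the class, field names, and tag values are assumptions and do not reflect the actual TapiX data model.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class Label:
    """One category, any number of tags, and an MCC that may be missing."""
    category: str
    tags: FrozenSet[str] = frozenset()
    mcc: Optional[str] = None


def has_tag(label: Label, tag: str) -> bool:
    """Check whether a label carries a given tag, regardless of its category."""
    return tag in label.tags


# Invented examples; tag names are placeholders.
with_mcc = Label("Groceries", frozenset({"supermarket", "bakery"}), mcc="5411")
without_mcc = Label("Groceries", frozenset({"farmers market"}))
```

Because tags live next to the category rather than inside it, the same tag can be attached to merchants in different categories, and a label stays valid when no MCC is available.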
Let’s demonstrate the high diversity of our tags. In our Groceries category, over three-quarters of transactions (78 %) have MCC 5411 – Grocery Stores and Supermarkets. Our tags associated with the Groceries category and MCC 5411 are shown in the following diagram.
We wrap up the article with examples of merchants that are difficult to grasp purely through categories, and where our tags provide more detail about the merchant than its MCC.
TapiX is an API-based service with data coverage of 220 000+ merchants globally and 99.99% accuracy. Insert your own inputs and see the quality of the enriched data for yourself in the demo, or let us know if you are interested in more information.