Mapping key/value pairs in semi-structured data in Python

I had to merge several word clouds into one larger word cloud based on all the keywords and their counts. I only had access to the text files containing each keyword and its count, not a list with all the keywords.

To do a recount and build the new list, I needed a list in which each keyword appeared exactly as many times as its count.

Luckily, the structure was the same for all pairs, and since there was only one token (the keyword) in each pair, it could be done in one loop.
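A minimal sketch of such a single loop is shown here. It assumes each line holds one keyword and its count separated by whitespace (for example "python 3"); the real file layout may differ, and the function name expand_pairs and the sample data are made up for illustration.

```python
def expand_pairs(lines):
    """Repeat each keyword as many times as its count says."""
    keywords = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Assumed format: "<keyword> <count>", one single-token keyword per line.
        keyword, count = line.split()
        keywords.extend([keyword] * int(count))
    return keywords


# Hypothetical sample lines, as they might come from two of the text files.
sample = ["python 3", "data 2", "python 1"]
expanded = expand_pairs(sample)
```

The resulting list can then be joined into one string or recounted, depending on what the word cloud tool expects as input.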

An alternative way to solve this problem would be to build a "dict" of tokens and add to the count each time a matching token is found.
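The dict alternative could look roughly like this. Again, the "keyword count" line format, the function name sum_counts and the sample data are assumptions made for the sake of the example.

```python
def sum_counts(lines):
    """Add up the counts per keyword across all input lines."""
    totals = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Assumed format: "<keyword> <count>", one single-token keyword per line.
        keyword, count = line.split()
        # Start at 0 for a new keyword, otherwise add to the running total.
        totals[keyword] = totals.get(keyword, 0) + int(count)
    return totals


# Hypothetical sample lines, as they might come from two of the text files.
sample_lines = ["python 3", "data 2", "python 1"]
totals = sum_counts(sample_lines)
```

This keeps only one entry per keyword, which uses less memory than a fully expanded list when the counts are large.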

You can see the code below.

