Sorting Words by Frequency in Python

I’ll be honest, it took nearly three hours and about four iterations before I finally figured out how to get this working. What I needed to do was grab the N words with the highest frequency values from a list of words loaded by configparser.

I looked at using a dictionary first. Dictionaries in Python keep their insertion order (as of Python 3.7) but are not sorted by value, and at the time I didn't see a reasonably simple way to sort one by value without a lambda expression.
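
For reference, a dictionary can be ranked by value without a lambda by handing sorted() the dictionary's own get method, or operator.itemgetter, as the key. A minimal sketch, where the `missed` dictionary is made-up data standing in for the real MissedWords section:

```python
from operator import itemgetter

# Hypothetical word -> frequency data (not taken from the real .ini file).
missed = {"their": 9, "because": 10, "friend": 2}

# Iterating a dict yields its keys; key=missed.get ranks each key by its value.
words_low_to_high = sorted(missed, key=missed.get)

# items() yields (word, frequency) pairs; itemgetter(1) sorts on the frequency.
pairs_high_to_low = sorted(missed.items(), key=itemgetter(1), reverse=True)

print(words_low_to_high)   # ['friend', 'their', 'because']
print(pairs_high_to_low)   # [('because', 10), ('their', 9), ('friend', 2)]
```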

I also looked at namedtuples, a very cool Python feature, but I would have to load the classes into a list to iterate over them. It quickly became too complicated to keep track of everything.

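What I came up with was just using a list of strings, since a list can easily be sorted. Calling sorted() on a list returns a new list in ascending order: numerical for numbers, lexicographic (character by character) for strings. So if the strings in my list begin with a number, I get back a list sorted from least frequency to greatest, as long as every frequency has the same number of digits. Strings are compared as text, so '10:word' sorts before '9:word'; comparing the frequencies as integers avoids that.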
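
Here is a quick sketch of how sorted() treats number-prefixed strings, and how a key function that parses the number changes the result. The sample entries and the `frequency` helper are illustrative, not part of the original script:

```python
# Made-up 'Frequency:Word' entries for illustration.
entries = ["9:their", "10:because", "2:friend"]

# Plain string sorting compares character by character, so "10" lands before "2".
as_text = sorted(entries)

def frequency(entry):
    """Return the integer before the colon in a 'Frequency:Word' string."""
    return int(entry.split(":")[0])

# Sorting on the parsed integer gives true numeric order.
as_numbers = sorted(entries, key=frequency)

print(as_text)     # ['10:because', '2:friend', '9:their']
print(as_numbers)  # ['2:friend', '9:their', '10:because']
```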

Iterating through the dictionary provided by configparser, I can simply compare frequency values and, once the list is full, drop the first item in the sorted list and replace it with the higher-frequency word.


def topMissedWords(self, max_count=5):
    """Returns a list with items 'Frequency:Word' as strings from the student .ini
    file MissedWords section, sorted from lowest to highest frequency"""
    top_words = [(0, "none")]                                 # Initiate the list with a (frequency, word) placeholder
    missed_words = self.stats["MissedWords"].keys()           # Load the missed words (the option names in the section)

    for eachWord in missed_words:                             # Iterate through the words
        top_words.sort()                                      # Keep the list sorted; tuples compare the integer frequency first
        value = int(self.stats.get("MissedWords", eachWord))  # Get the word frequency
        low = top_words[0][0]                                 # Get the lowest frequency in the list
        if len(top_words) < max_count:                        # Still room, so the word goes in regardless
            top_words.append((value, eachWord))               # Add the new word to the list
        elif value >= low:                                    # List is full: compare the word to the lowest entry
            del top_words[0]                                  # Delete the word with the lowest frequency to keep the requested length
            top_words.append((value, eachWord))               # Finally, add the new word to the list

    return [f"{value}:{word}" for value, word in sorted(top_words)]
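
As an alternative to maintaining the list by hand, the standard library's heapq.nlargest can pick the top N entries in one call. This is only a sketch under assumptions: `SAMPLE_INI` is a made-up stand-in for the student .ini file, and `top_missed_words` is a hypothetical standalone function rather than the method above.

```python
import configparser
import heapq

# Hypothetical stand-in for the student .ini file.
SAMPLE_INI = """
[MissedWords]
their = 9
because = 10
friend = 2
weird = 4
"""

def top_missed_words(stats, max_count=5):
    """Return up to max_count 'Frequency:Word' strings, highest frequency first."""
    # configparser stores every value as a string, so convert the counts to int.
    pairs = ((int(count), word) for word, count in stats["MissedWords"].items())
    # nlargest compares the tuples by frequency first, then by word on ties.
    return [f"{count}:{word}" for count, word in heapq.nlargest(max_count, pairs)]

stats = configparser.ConfigParser()
stats.read_string(SAMPLE_INI)
top_three = top_missed_words(stats, max_count=3)
print(top_three)  # ['10:because', '9:their', '4:weird']
```

Note that this version returns the words from highest frequency to lowest and needs no '0:none' placeholder, so an empty section simply gives back an empty list.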