Codewars Leaderboard & Archiving

How Codewars started pulling me in...and I started automating it!

As part of #100Devs, we had reached the point where we were expected to do one Codewars problem a day - initially just in the 8kyu fundamentals track, but once I got a taste, I was enthralled.

My first disappointment was that the leaderboard on the homepage reported my rank within my clan as #101. I thought that was fine, since I could make the few extra clicks to view the larger one...except it didn't exist! A larger list of allies did exist, but aside from it being a different perspective on mostly the same data, it was missing the one feature that would make it a true leaderboard: position numbers.

It was also missing the user who was viewing the list, but the lack of position numbers was the primary obstacle to seeing one's relative ranking.

Userscript

Of course I started on a Userscript to solve this issue, at first just polling the page every few seconds, finding the table, and prefixing a #N cell to each row. As this was the least performant approach, I quickly migrated to a new MutationObserver() that detects rows being appended to the table, then runs my table-processing code on it.

Last was the fact that the current user was not visible in the list. I found the position to insert myself at by comparing the current row's honor, the user's honor, and the next row's honor, then duplicated an existing row with the current user's info templated in.
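A minimal sketch of that insertion logic, kept free of DOM code so it can run anywhere. The LeaderboardRow shape and both function names are assumptions made for illustration; the real script works on table rows, and how it breaks ties is not stated.

```typescript
// Hypothetical row shape; the real userscript reads these values from table cells.
interface LeaderboardRow {
  username: string;
  honor: number;
}

// Returns the index at which the current user belongs so that the list
// stays sorted by honor, highest first. On a tie the user is placed after
// the existing rows with the same honor (an arbitrary choice for this sketch).
function findInsertIndex(rows: LeaderboardRow[], userHonor: number): number {
  const index = rows.findIndex((row) => row.honor < userHonor);
  return index === -1 ? rows.length : index;
}

// Inserts the user and attaches 1-based position numbers to every row.
function withPositions(
  rows: LeaderboardRow[],
  user: LeaderboardRow,
): Array<LeaderboardRow & { position: number }> {
  const index = findInsertIndex(rows, user.honor);
  const merged = [...rows.slice(0, index), user, ...rows.slice(index)];
  return merged.map((row, i) => ({ ...row, position: i + 1 }));
}
```

The input is assumed to already be sorted by honor, which matches how the allies list is described above.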

As Codewars does its own history management, I did still have to do some polling - but I did so via another new MutationObserver(), this one attached to the entire document. It checks whether the URL has changed and, if the new URL is a valid one, runs the code that attaches the table's MutationObserver().
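The URL check can be sketched as a small handler that is independent of the browser. Everything named here (createUrlChangeHandler, its parameters, and attachTableObserver and the URL pattern in the wiring comment) is hypothetical; the actual userscript is not shown in this post.

```typescript
// Builds a callback suitable for passing to `new MutationObserver(...)`.
// It remembers the last URL it saw and only reacts when the URL changes.
function createUrlChangeHandler(
  getUrl: () => string,
  isLeaderboardUrl: (url: string) => boolean,
  onEnter: (url: string) => void,
): () => void {
  let lastUrl: string | null = null;
  return () => {
    const url = getUrl();
    if (url === lastUrl) return; // DOM changed, but we are still on the same page
    lastUrl = url;
    if (isLeaderboardUrl(url)) onEnter(url);
  };
}

// Browser wiring, shown as a comment because it needs a DOM to run.
// The URL pattern and attachTableObserver are assumptions:
//
//   const handler = createUrlChangeHandler(
//     () => location.href,
//     (url) => /\/users\/[^/]+\/allies/.test(url),
//     attachTableObserver,
//   );
//   new MutationObserver(handler).observe(document, { childList: true, subtree: true });
//   handler(); // handle the page the script was loaded on
```

Keeping the comparison in a plain function means the document-wide observer does almost no work on the many mutations that do not involve navigation.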

Clan Archival

While seeing the current state of the leaderboard was enjoyable, it's the data over time that's much more interesting. So I started off with a script that would - using the user's login token - download, parse, and save the entire Allies tab of Codewars.

Thankfully this data is delivered to the client via repeated requests to:

https://www.codewars.com/users/USERNAME/allies?page=PAGE

So I kept requesting and parsing pages until one came back with no rows, then sorted the parsed results by honor and saved them into a JSON file.
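The fetch-until-empty loop could look roughly like this. The Ally shape, the fetchPage callback (which would request one allies page with the login token and parse its rows), the starting page number of 1, and the maxPages safety limit are all assumptions for the sake of a runnable sketch.

```typescript
// Hypothetical parsed row from the allies list.
interface Ally {
  username: string;
  honor: number;
}

// Hypothetical: requests and parses one page, resolving to [] past the last page.
type FetchAlliesPage = (page: number) => Promise<Ally[]>;

// Requests pages in order until one comes back empty, then sorts by honor,
// highest first. maxPages guards against looping forever if the stop
// condition is never met.
async function fetchAllAllies(
  fetchPage: FetchAlliesPage,
  maxPages = 1000,
): Promise<Ally[]> {
  const all: Ally[] = [];
  for (let page = 1; page <= maxPages; page++) {
    const rows = await fetchPage(page);
    if (rows.length === 0) break;
    all.push(...rows);
  }
  return all.sort((a, b) => b.honor - a.honor);
}
```

The result can then be written out with JSON.stringify, one file per run.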


I quickly copied this code to my server and set up a cronjob to run it every three hours, and shortly after saw the data piling up. It would take some time both to collect enough data to do anything interesting with, and for me to find something interesting in the data, but that's for another blog!

Passing 1,000 honor triggered a bug in my code for parsing my own honor: everyone else's honor was listed as an unformatted number, yet your own honor was formatted with commas. Unfortunately I lost a number of days of my personal honor values because of it.
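The original parsing code is not shown, so this is only a guess at the shape of a fix. It illustrates why a comma-formatted value is easy to get wrong: parseInt("1,234", 10) stops at the comma and returns 1. The function name parseHonor is hypothetical.

```typescript
// Parses an honor value that may or may not contain thousands separators,
// e.g. "987" or "1,234". Throws instead of silently returning a wrong number.
function parseHonor(text: string): number {
  const cleaned = text.replace(/[,\s]/g, "");
  if (!/^\d+$/.test(cleaned)) {
    throw new Error(`Unexpected honor value: "${text}"`);
  }
  return Number.parseInt(cleaned, 10);
}
```

Throwing on unexpected input means a format change shows up as a failed run rather than as quietly corrupted history.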

Solution Downloader

Another thing expected of us #100Devers was to eventually push our Codewars solutions to GitHub daily, and while this could be done manually, of course I would never pass up an opportunity to automate such a tedious task...or at least the downloading portion of this task!

I started by copying everything from the Clan Archival script, and began making modifications primarily to the parsing, as the fetching was nearly identical aside from a slight URL change. It was slightly more complicated, as each completed kata included not only solutions in every language, but every solution submitted in each of those languages. Not yet sure what kind of filtering I'd want to do, I parsed them all, settling on this data structure:

interface CompletedKata {
  slug: string;
  title: string;
  rank: string;
  solutions: Solution[];
}

interface Solution {
  content: string;
  language: string;
  when: Date;
}

Formatting

While I already had in mind the ability to have multiple solution format output types, I decided to go with two:

This wasn't particularly difficult; it was mostly a matter of getting things textually accurate - from the filenames to the markdown.


Eventually I would create another format type:

This one creates a file for each day & language the solutions were marked for.
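The per-day, per-language grouping described above can be sketched as a single pass over the parsed solutions. The key format ("YYYY-MM-DD.language"), the use of UTC dates, and the function name are assumptions; the post does not say how the actual files are named.

```typescript
// Same shape as the Solution interface shown earlier in this post.
interface Solution {
  content: string;
  language: string;
  when: Date;
}

// Buckets solutions by calendar day (UTC) and language.
// Each key would correspond to one output file in this sketch.
function groupByDayAndLanguage(
  solutions: Solution[],
): Record<string, Solution[]> {
  const groups: Record<string, Solution[]> = {};
  for (const solution of solutions) {
    const day = solution.when.toISOString().slice(0, 10); // "YYYY-MM-DD"
    const key = `${day}.${solution.language}`;
    if (!groups[key]) groups[key] = [];
    groups[key].push(solution);
  }
  return groups;
}
```

Using UTC keeps the grouping stable regardless of where the script runs; a local-time grouping would need the day computed differently.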


Finally I would create the last format type, the one compatible with my current DailyProblem repository. It is similar to the Kata Slug format, but uses my README.md templates and includes the tests in the solution code.

Kata Metadata

Katas have a lot of information available to download. Initially I only cared about the solutions, but for my repo I would need more - at minimum the description, and if possible the test code.

As this information existed only on the page of each Kata, fetching it would mean a large number of requests, so I went with a cache-then-fetch approach and started off by writing the parser. Thankfully the test code sat unformatted in a <code> element, and the description markdown existed within the initialization JSON in a <script> near the bottom of the page.

With this added to the CompletedKata as a Record<string, { description: string; testCode: string }>, I only needed to pull the info from the Kata when I needed it, which is exactly what I did when creating the README.md and solution code files.
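A cache-then-fetch lookup along those lines might look like the following. The KataMetadata name, the fetchMetadata callback (which would request the kata page and run the parser), and keeping the cache as an in-memory object keyed by slug are assumptions; persisting the cache between runs is left out.

```typescript
// Hypothetical shape for the parsed page data.
interface KataMetadata {
  description: string;
  testCode: string;
}

// Hypothetical: downloads one kata page and parses out its metadata.
type FetchKataMetadata = (slug: string) => Promise<KataMetadata>;

// Returns a lookup function that checks the cache first and only calls
// fetchMetadata for slugs it has not seen before.
function createMetadataLookup(
  fetchMetadata: FetchKataMetadata,
  cache: Record<string, KataMetadata> = {},
): (slug: string) => Promise<KataMetadata> {
  return async (slug) => {
    if (Object.prototype.hasOwnProperty.call(cache, slug)) {
      return cache[slug];
    }
    const fresh = await fetchMetadata(slug);
    cache[slug] = fresh;
    return fresh;
  };
}
```

Passing in an existing cache object (for example, one loaded from a JSON file) lets repeated runs skip katas that were already downloaded.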