In the Hackerrank problem titled Challenges, a strategic approach is necessary to obtain the correct solution. When framed properly, a complicated query can be reduced to only a few strategic lines of code. In this article we compare a well formed query to a poorly formed one.
Problem statement:
/* Print: hacker_id, name, and count(challenges) created by each student. Order by: total challenges descending then sort the result by hacker_id then, exclude those students if the count < maximum number of challenges */
Two correct solutions are given below. The first is well formed, easy to understand, and takes only a few lines of code.
Query # 1
The problem statement requires discarding records for hackers whom contributed an equal number of total challenges, when the total contributed is less than the maximum.
Because of this, it will be useful to first produce a single column sub-select query that obtains count(challenges_id) all other hacker_id. Then, we can reference this list to check if a hacker’s total challenges is not in the list (unique), or is in the list (not-unique)
The other sub-select in this query is relatively simple and produces a single value for the maximum number of total challenges from all hackers.
Finally, the outer-most select statement is also very simple.
Psuedo-code for query #1:
SELECT <fields>, COUNT(challenge_id) as challenge_count HAVING challenge_count = max_challenge_count OR challenge_count NOT IN <list_of_challenge_counts_by_other_hackers>
Actual code for query #1:
SELECT c.hacker_id, h.name, COUNT(c.challenge_id) AS cnt FROM Hackers AS h JOIN Challenges AS c ON h.hacker_id = c.hacker_id GROUP BY c.hacker_id, h.name HAVING cnt = ( SELECT COUNT(c1.challenge_id) FROM Challenges AS c1 GROUP BY c1.hacker_id ORDER BY COUNT(*) DESC LIMIT 1 ) OR cnt NOT IN ( SELECT COUNT(c2.challenge_id) FROM Challenges AS c2 GROUP BY c2.hacker_id HAVING c2.hacker_id <> c.hacker_id ) ORDER BY cnt DESC, c.hacker_id;
Query # 2
The second query is a brute-force approach that does not take advantage of any strategic principles.
Psuedo-code for query #2:
SELECT <fields>, total_challenges FROM hackers JOIN <hacker_id, total_challenges> ON hacker_id JOIN <total_challenges, hacker_count> ON hacker_id JOIN <max_challenges> WHERE total_challenges = max_challenges OR hacker_count = 1
The lack of strategy lead to duplicate code, shown in green and orange highlighted text below. Encapsulating the code by using other techniques will not address the larger problem here.
Actual code for query #2:
select h.hacker_id, h.name, totals.total_challenges/*, summ_group.hacker_count, summ_group_2.max_challenges*/from hackers h join ( select hacker_id as hacker_id, count(challenge_id) as total_challenges from challenges group by hacker_id ) totals on h.hacker_id = totals.hacker_id join ( select summ.total_challenges as total_challenges, count(summ.hacker_id)as hacker_count from( select h.hacker_id as hacker_id, h.name as name, t.total_challenges as total_challenges from hackers h join ( select hacker_id as hacker_id, count(challenge_id) as total_challenges from challenges group by hacker_id ) t on h.hacker_id = t.hacker_id ) summ group by summ.total_challenges ) summ_group on summ_group.total_challenges = totals.total_challenges join( select max(summ.total_challenges) as max_challenges from( select h.hacker_id as hacker_id, h.name as name, t.total_challenges as total_challenges from hackers h join ( select hacker_id as hacker_id, count(challenge_id) as total_challenges from challenges group by hacker_id ) t on h.hacker_id = t.hacker_id ) summ ) summ_group_2 where summ_group_2.max_challenges = summ_group.total_challenges or summ_group.hacker_count = 1 order by totals.total_challenges desc, h.hacker_id
This hackerrank problem is a great exercise for developing SQL skills and identifying opportunities for SQL query simplification! Indirect approaches can lead to vastly simplified queries including the use of single-value sub-selects and single-column sub-selects.
Please leave a comment below with your thoughts on this topic and any simplifications you have found for this hackerrank challenge problem.

