Back to WCA Statistics

All Results — paginated search over every WCA result

Lookup

Flattens the WCA dump's single + average results into one wca_results_top table (~11M rows, no cap) so the client can filter by event / single vs average / country / year / month / person or comp name and paginate millions of rows deep in under 300 ms.

The hard part is deep pagination: ORDER BY value LIMIT/OFFSET with OFFSET 1,000,000 would naïvely join three tables and then discard the first million rows — 10 s+. The route uses a derived-table + late-join on the id PK pattern with an INCLUDE (id) index so the inner subquery is Index-Only and the outer only enriches 100 rows.

By the numbers

~11 M
Rows
One row each for single and average
6
Main indexes
wrt_main / country / wca_id / comp_id / year / comp_lookup
< 300 ms
At OFFSET 1 M
late-join + INCLUDE(id) + VACUUM → Heap Fetches = 0
200
Page size cap
MAX_SIZE prevents huge pulls from starving the server

Data source

CI streams results per event_id, emitting 0 / 1 / 2 TSV rows per source row (one single if best > 0, one average if average > 0); the trailing round_type_id / format_id / record_tag cols feed the /comp page fast-path — no cap anywhere. load.sql does DROP + CREATE + COPY then VACUUM (ANALYZE) so the visibility map is clean; without that, Index Only Scan re-fetches the heap and deep OFFSET regresses.

Deep-page late-join — inner subquery rides INCLUDE(id), outer joins via id PK
sql
SELECT q.value, q.wca_id, t.person_country_id, co.iso2,
       t.comp_id, c.name AS comp_name, t.comp_date, t.attempts, p.name
FROM (
  SELECT t.id, t.value, t.wca_id
  FROM wca_results_top t
  WHERE t.event_id = ? AND t.is_avg = ?
  ORDER BY t.value ASC, t.wca_id ASC
  LIMIT ? OFFSET ?
) q
JOIN wca_results_top t ON t.id = q.id
JOIN wca_persons p ON p.wca_id = t.wca_id
LEFT JOIN wca_countries co ON co.id = t.person_country_id
LEFT JOIN wca_competitions c ON c.id = t.comp_id
ORDER BY q.value ASC, q.wca_id ASC;

Algorithm / pipeline

1
Base filter: event + type
event_id validated against VALID_EVENTS (21 incl. discontinued); type is single or average only; 333mbf rejects average. Both values form the leading prefix of wrt_main.
2
Optional filters: country / year / month
country accepts 2-char ISO2, resolved via wca_countries.id. year hits the generated column comp_year (STORED EXTRACT(YEAR FROM comp_date)::SMALLINT); month is EXTRACT(MONTH FROM comp_date) as a post-filter. year rides the wrt_year index.
3
Free-text q: ILIKE two tables
q ILIKEs both wca_persons.name and wca_competitions.name with LIMIT 200 each. Their ids feed (wca_id IN ... OR comp_id IN ...) in the main predicate. Both empty → short-circuit total: 0.
4
Inner derived query stays Index-Only
Inner query only SELECT id, value, wca_id — all three columns live inside wrt_main INCLUDE (id). ORDER BY value, wca_id LIMIT ? OFFSET ? walks the index linearly; OFFSET 1 M still ~250 ms with Heap Fetches: 0 (assuming a recent VACUUM).
5
Outer joins via id PK, enrich, count
JOIN wca_results_top t ON t.id = q.id rides the PK and only resolves 100 rows; three lookup joins add name / comp_name / iso2. A separate COUNT(*) with the same WHERE produces total.

Caveats & edges

Related stats & links