Codility AI Capabilities | Assess Real Engineering Work in an AI-First World

Codility AI Capabilities

Assess real work in an AI-first world

Engineers use AI every day. Your assessments should reflect that reality, with full visibility into how candidates and employees collaborate with AI, and reviewable evidence your team can trust.

The challenge

Screening at scale when every candidate has AI


You are sending thousands of assessments, but AI-generated submissions look identical to hand-written code. Your scoring models cannot differentiate senior engineers from candidates who copy-pasted a prompt. Manual review of every submission is not an option.

Recruiters review every CV by hand

“We’re missing better candidates because their CV isn’t strong, but they perform well.” Manual filtering at the top of the funnel misses the engineers worth pulling forward.

AI tools push every candidate to 90%+

When correctness scoring tops out for everyone, the funnel collapses. Hiring managers default back to CV signal, which defeats the purpose of skill-based hiring.

Heavy proctoring burden, unsustainable at scale

“Heavy proctoring burden, unsustainable as we scale.” At enterprise volume, manually reviewing every flagged session breaks down. Engineering teams pulled in to verify integrity end up doing work that a defensible record should handle.

Before and after

From blind scoring to reviewable signal


Status quo | With Codility Screen
Correctness score with no insight into how it was produced | Score plus code evolution timeline, AI interaction log, and similarity analysis
Ban AI and get fake signal, or allow AI and lose auditability | Enable or disable AI per assessment, with every candidate interaction captured as reviewable AI activity
Senior engineers manually verify every promising submission | AI-generated follow-up questions probe candidate understanding automatically
Candidates receive a pass/fail with no feedback | AI-generated coaching feedback helps candidates improve, even after a low score
Top-of-funnel scores cluster at 90%+ with no differentiation | Similarity analysis and AI interaction logs reveal who understands their code and who relied on generation

AI capabilities in Screen


AI in the assessment

How candidates interact with AI during their assessment, and how AI generates feedback and follow-up signal for reviewers.

Cody: In-Assessment AI Assistant

Trained on Codility’s content library, Cody helps candidates clarify tasks and iterate on their approach. It is guardrailed: candidates can explore ideas and get guidance, but Cody will not generate a full solution. Every prompt and response is logged as reviewable AI activity. Cody does not affect scoring.

Available

AI Copilot in VS Code

A multi-model AI Copilot inside real VS Code dev containers: tab completion, inline suggestions, multi-file edits, and chat-based assistance. All prompts, accepted suggestions, and rejected completions are captured for playback.

Preview

AI Follow-Up Questions

After submission, the system generates contextual questions about the candidate’s approach, trade-offs, and edge cases. Responses are captured for reviewer analysis.

Preview

AI Feedback for Candidates

Growth-oriented feedback identifies positive patterns alongside areas for improvement. Even low-scoring candidates receive actionable coaching on structure, naming, and approach.

Preview

AI Readiness: Tech Roles

130+ tasks covering machine learning, NLP, model training, and AI tool collaboration. Assess whether engineers can build with AI: prompt engineering, output verification, debugging, and iterating on model outputs.

Available

AI Task Creation via MCP

Build assessment tasks directly from your IDE using the Model Context Protocol (MCP). The MCP server is live and powering customer task publishing across hiring and skills assessments.

Available

AI Readiness for Business Tasks

Work simulations for non-technical roles where AI is part of the job: customer support, success, sales, product management, marketing, data analysis, and finance. Candidates use Cody to produce real outputs. Reviewers see prompt quality, output evaluation, and judgment.

Preview

Integrity signals

How Codility surfaces reviewable evidence about candidate behaviour during assessments.

Integrity Risk

A calculated integrity score per screening session that combines behavioural, identity, and plagiarism signals into one reviewable indicator. Reviewers see what drove the score and act on it directly.

Available
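
As a rough illustration of how several signals could roll up into one reviewable indicator, here is a minimal Python sketch. It is not Codility's scoring model: the weights, the 0-to-1 scale, and the names `WEIGHTS`, `integrity_risk`, and `top_driver` are all assumptions made for this example. It shows only the general idea of a combined score that still exposes what drove it.

```python
from dataclasses import dataclass

# Hypothetical weights, for illustration only (not Codility's actual model).
WEIGHTS = {"behavioural": 0.4, "identity": 0.3, "plagiarism": 0.3}


@dataclass
class IntegrityResult:
    score: float          # 0.0 (no risk) to 1.0 (high risk)
    contributions: dict   # how much each signal added to the score


def integrity_risk(signals: dict) -> IntegrityResult:
    """Combine per-signal risk values (each expected in [0, 1]) into one score."""
    contributions = {}
    for name, weight in WEIGHTS.items():
        # Missing signals count as 0; out-of-range values are clamped.
        value = min(max(signals.get(name, 0.0), 0.0), 1.0)
        contributions[name] = weight * value
    return IntegrityResult(score=sum(contributions.values()),
                           contributions=contributions)


def top_driver(result: IntegrityResult) -> str:
    """Return the signal that contributed most, so a reviewer can see why."""
    return max(result.contributions, key=result.contributions.get)
```

Keeping the per-signal contributions next to the total is what makes the score reviewable: a person can check the largest driver before acting on the number.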

Similarity Check

Detects code pasted from external sources and submissions that match known AI output patterns. Pairs with the AI interaction log to differentiate authored work from generated output.

Available

Pattern Detection

Analyses typing cadence, paste patterns, and editing behaviour to detect when solutions were retyped from another device or screen. Surfaces reviewable evidence for your team to act on.

Preview

Cheating apps detection

Detects unauthorized applications running alongside the assessment, including tools designed to be invisible to screen sharing. Detected apps appear in the integrity widget and on the candidate timeline for reviewer playback.

Preview

What customers are saying

Signal from the field


“Everyone’s using AI and it feels sometimes unfair to disregard candidates the access. But then we can monitor it and ensure they’re not using it for the whole thing.”
Demi, Engineering Leader, Consumer Analytics
“The QA team currently spends the first part of follow-up interviews testing a candidate’s understanding of what they completed in Codility.”
Sonia, Talent Acquisition Manager, E-Commerce

The challenge

Live interviews need to reflect how engineers work


Your engineers use VS Code, containers, and AI copilots every day. But your live interviews still run in a stripped-down browser editor with no tooling. The result: you are assessing how candidates perform in an artificial environment instead of how they build real software.

Generic live coding tools weren’t built for AI

Browser editors test syntax recall. They cannot run containers, install packages, or capture how a candidate uses AI mid-session. The interviewer ends up guessing what was generated.

Ad hoc interviews across hiring managers

“Interviews are not necessarily run the same way.” Without a shared environment and one rubric, two interviewers running the same role produce different signal. The decision drifts to gut feel.

Senior engineers want their hours back

Live interviews eat senior engineering time. Without a real environment that captures what the candidate actually did, interviewers re-test in follow-on sessions. The cost compounds.

Before and after

From artificial test to real work simulation


Status quo | With Codility Interview
Browser-based editor with no terminal, no packages, no tooling | Full VS Code environment with dev containers, sidecar services, and extensions
Ban AI and assess an unrealistic workflow | Enable AI Copilot with full interaction capture: prompts, completions, and acceptance patterns
Interviewer guesses what was AI-generated | Code timeline shows every edit, AI suggestion, and candidate decision in playback
Interviewer notes are the only record | Full transcript, code evolution, and structured evaluation form for every session

AI capabilities in Interview


AI in the assessment

How candidates interact with AI during live interviews, and how interviewers gain visibility into that collaboration.

Cody: In-Assessment AI Assistant

Trained on Codility’s content library, Cody helps candidates clarify tasks and iterate on their approach during live sessions. It is guardrailed: candidates can explore ideas, but Cody will not generate a full solution. Every interaction is logged as reviewable AI activity and visible to interviewers in real time.

Available

AI Copilot in VS Code

A multi-model AI Copilot with tab completion, inline suggestions, multi-file edits, and chat. Interviewers see exactly how candidates use AI: what they prompted, what they accepted, and what they rejected.

Preview

AI Readiness: Tech Roles

Interview tasks designed to evaluate AI collaboration skills: prompt engineering, output verification, debugging AI-generated code, and iterating on model outputs.

Available

AI Task Creation via MCP

Build interview tasks directly from your IDE using the Model Context Protocol (MCP). The MCP server is live and powering customer task publishing across hiring and skills assessments.

Available

AI Readiness for Business Tasks

Live interview tasks for non-technical roles where AI collaboration is part of the work. Interviewers see prompt quality, output evaluation, and judgment in real time.

Preview

Integrity signals

How Codility surfaces reviewable evidence about candidate behaviour during live sessions.

Pattern Detection

Analyses typing cadence, paste patterns, and editing behaviour to detect when solutions were retyped from another device or screen. Surfaces evidence for the interviewer to review.

Preview

Cheating apps detection

Detects unauthorized applications running alongside the interview session, including tools designed to be invisible to screen sharing. Detected apps appear in the integrity widget and on the candidate timeline for interviewer playback.

Preview

What customers are saying

Signal from the field


“If I was in an interview with someone and I could just see them put everything into AI, I may feel disengaged and question that candidate’s capabilities.”
Demi, Engineering Leader, Consumer Analytics
“Within 12 months I think all of the vendors in the market are going to have very similar AI copilot integration. I think the real differentiation is: what are the insights?”
Engineering leaders, European Fintech

The challenge

You invested in AI tools. Can your workforce use them?


Your organization is spending millions on Copilot, ChatGPT, and internal AI tools. But you have no objective measurement of whether employees can actually use them effectively. Self-reporting and adoption metrics tell you who logged in. They say nothing about who built something valuable.

We can’t prove ROI on AI investment

Leadership asks for data. We have license counts and login rates. The board wants evidence that training moved the needle on actual AI proficiency.

No shared definition of good AI skills

Prompting quality? Output evaluation? Multi-step collaboration? Without a framework, every team defines readiness differently. The benchmark drifts by department.

We need our team to meet the benchmark we hire to

“We need to know our existing teams meet the benchmark of the candidates we already have here.” Internal mobility, project staffing, and upskilling run on manager opinion today. There is no objective way to check.

Before and after

From training spend to measured capability


Status quo | With Codility Skills Intelligence
AI training completion rates with no skill validation | Before-and-after assessments that prove capability improvement by role
Manager reviews drive internal mobility decisions | Objective skills data identifies who is ready for new roles
“AI readiness” lives in slide decks with no way to validate it | 130+ tasks across ML, NLP, model training, and AI collaboration with deterministic scoring
Employees take assessments and never hear back | AI-generated feedback turns every assessment into a coaching moment
License counts prove adoption without measuring proficiency | Skills Intelligence maps AI capability across the entire engineering org
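
To make the before-and-after comparison by role concrete, the arithmetic can be sketched in a few lines of Python. This is an illustrative assumption about how such a comparison might be computed, not a description of how Skills Intelligence reports it. The function name `improvement_by_role` and the record shape `(role, score_before, score_after)` are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def improvement_by_role(records):
    """Average score change per role.

    records: iterable of (role, score_before, score_after) tuples,
    one per employee who took both the pre- and post-training assessment.
    """
    deltas = defaultdict(list)
    for role, before, after in records:
        deltas[role].append(after - before)
    # Mean improvement per role; a value near zero means training did not move scores.
    return {role: mean(values) for role, values in deltas.items()}
```

A per-role average is the simplest summary. In practice a team would likely also look at sample sizes and the spread of results before drawing conclusions.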

AI capabilities in Skills Intelligence


AI in the assessment

How employees interact with AI during skills assessments, and how AI generates actionable feedback and proficiency scoring.

AI Feedback for Employees

The feature Skills Intelligence customers ask about most. After every assessment, employees receive coaching-oriented feedback that identifies strengths, surfaces specific improvement areas, and provides actionable guidance on structure, naming, and approach. This shifts the perception of assessments from “testing” to “development,” driving higher engagement and repeat participation across the org.

Preview

Cody: In-Assessment AI Assistant

Trained on Codility’s content library, Cody helps employees clarify tasks and iterate on their approach. It is guardrailed: employees can explore ideas and get guidance, but Cody will not generate a full solution. Logged interactions reveal prompt quality, iteration patterns, and output evaluation.

Available

AI Copilot in VS Code

A multi-model AI Copilot in dev containers. Captures how employees leverage AI tools they use daily: tab completion, multi-file edits, chat-based assistance. Provides signal on real-world AI collaboration patterns.

Preview

AI Readiness: Tech Roles

130+ tasks for ML engineers, data scientists, MLOps, and AI integrators. Assess real AI capability with deterministic scoring that the board can trust.

Available

AI Readiness for Business Tasks

Work simulations for non-technical roles: customer support, success, sales, product management, marketing, data analysis, and finance. Employees use Cody to produce real outputs, and reviewers see prompt quality, output evaluation, and judgment. Soft-launched on codility.ai while validation pilots run.

Preview

Scoring 2.0

Adds maintainability metrics to skills proficiency scoring: code structure, naming conventions, modularity. Maps actual engineering quality across the full spectrum of code craftsmanship.

Available

AI Task Creation via MCP

Build assessment tasks directly from your IDE using the Model Context Protocol (MCP). The MCP server is live and powering customer task publishing across hiring and skills assessments.

Available

Integrity signals

How Codility maintains trust across internal assessments.

Similarity Check

Identifies code pasted from external sources and submissions that match known AI output patterns. Reinforces the integrity of pre- and post-training capability measurements.

Available

Cheating apps detection

Detects unauthorized applications running alongside the assessment, including tools designed to be invisible to screen sharing. Detected apps appear in the integrity widget and on the candidate timeline for reviewer playback.

Preview

What customers are saying

Signal from the field


“We are all in on AI. There’s no going back at this point. I would put us in that 5% category of companies who can demonstrate ROI. There’s not a tremendous amount when you really look under the covers.”
Mike, Executive, Global Financial Services
“That’s a burning question for me right now: how do we test, how do we know capability in AI prompting, how do we do that?”
Mel, Tech Academy Leader, Financial Services

Why Codility

What sets Codility apart


Controlled AI collaboration with reviewable activity

Cody does not affect scoring. Enable or disable AI per assessment, with every candidate interaction captured as reviewable AI activity. More granular controls are on the roadmap. No decisions are outsourced to opaque models.

Real IDE that mirrors daily engineering work

VS Code dev containers with optional sidecar services and a multi-model AI Copilot. Candidates and employees work in the environment they know, with the AI tools they use daily, while every interaction is captured for review.

Assessment science with a maintainability lens

I/O psychologist-led validation through the Engineering Skills Model, validated by engineering leaders. Scoring 2.0 adds 25+ maintainability metrics that differentiate candidates in the 90 to 100 range where AI inflates results.

Enterprise trust and regulatory alignment

EU data storage in Frankfurt, SOC 2 Type II, ISO 27001, GDPR, and WCAG 2.1 AA compliance. Human-in-the-loop philosophy aligns with EU AI Act requirements. Codility does not use customer or candidate data to train AI models.

Availability

AI capabilities across products


Capability | Screen | Interview | Skills Intelligence
Cody: AI Assistant | Available | Available | Available
AI Copilot in VS Code | Preview | Preview | Preview
AI Follow-Up Questions | Preview | n/a | n/a
AI Feedback | Preview | n/a | Preview
AI Readiness: Tech Roles | Available | Available | Available
AI Readiness for Business Tasks | Preview | Preview | Preview
Integrity Risk | Available | n/a | n/a
Pattern Detection | Preview | Preview | n/a
Cheating apps detection | Preview | Preview | Preview
Similarity Check | Available | n/a | Available
Scoring 2.0 | n/a | n/a | Available
AI Task Creation via MCP | Available | Available | Available