How HireSignal Scores a Profile
Complete transparency on every dimension, weight, confidence band, and bias check. If you can see inside the model, you can trust the output.
Why we publish this: HireSignal is not an AI wrapper. The core scoring model is deterministic — same GitHub data always produces the same score. LLaMA 3.1 enriches the report with narrative and interview questions, but it doesn't set the number. Publishing the model is evidence of that. Wrappers don't have documented, explainable models.
The Scoring Formula
- Each dimension has a defined maximum score (see table below).
- Role weights are multipliers — they shift relative emphasis, not the raw dimension score.
- The final score is always 0–100, regardless of role preset.
- Scores are reproducible: given identical GitHub API data, the score is always identical.
- LLM narrative is generated after scoring and does not affect the number.
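Concretely, the rules above amount to a weighted average over the dimension maxima listed below. The following is a minimal sketch, not HireSignal's published code: the `final_score` function name, the way weights default to 1.0, and the rounding are all assumptions; only the dimension maxima and the 0–100 guarantee come from this document.

```python
# Dimension maxima as listed in the table below (sum = 145).
DIMENSION_MAX = {
    "profile": 10, "repo_quality": 25, "community": 10,
    "consistency": 20, "breadth": 15, "social": 5,
    "commit_quality": 20, "recency": 10, "pr_review": 10,
    "oss": 10, "docs": 10,
}

def final_score(raw: dict, weights: dict) -> float:
    """Normalise weighted raw points to a 0-100 score.

    raw[d] must lie in [0, DIMENSION_MAX[d]].
    weights are role multipliers (1.0 = neutral emphasis), so they
    shift relative emphasis without changing any raw dimension score.
    """
    num = sum(weights.get(d, 1.0) * raw[d] for d in DIMENSION_MAX)
    den = sum(weights.get(d, 1.0) * m for d, m in DIMENSION_MAX.items())
    return round(100 * num / den, 1)
```

Because the same weights appear in the numerator and denominator, a perfect profile scores 100 under any preset, which is why the final score stays 0–100 regardless of role.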
The 11 Dimensions
Total maximum raw score: 145 points, normalised to 0–100 by the weighted formula described above.
Profile Completeness
Max 10 points
Signals
- Display name set (1 pt)
- Bio text present (1 pt)
- Location set (1 pt)
- Blog / personal site linked (1 pt)
- Email or contact visible (1 pt)
- Hireable flag (0.5 pt)
- Twitter / social linked (0.5 pt)
- Avatar is non-default (1 pt)
- Organisation membership (1 pt)
- Pro account (0.5 pt)
Confidence
Very high — all fields are directly observable from the GitHub API response.
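Most of these signals can be read straight off the public GitHub REST `users` payload. As an illustration only: the field names below are real GitHub API fields, but the function and the omission of the avatar, organisation, and Pro checks (which need extra requests) are simplifications.

```python
# Point values mirror the signal list above; fields are from the
# GitHub REST API "Get a user" response.
FIELDS = [
    ("name", 1.0), ("bio", 1.0), ("location", 1.0),
    ("blog", 1.0), ("email", 1.0),
    ("hireable", 0.5), ("twitter_username", 0.5),
]

def profile_completeness(user: dict) -> float:
    """Sum points for each populated profile field, capped at 10."""
    pts = sum(w for field, w in FIELDS if user.get(field))
    # Avatar, organisation membership, and Pro checks need extra
    # API calls and are omitted from this sketch.
    return min(pts, 10.0)
```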
Repository Quality
Max 25 points
Signals
- Repo has a description (up to 5 repos scored × 2 pts each)
- Repo has topics / tags (1 pt per repo, max 5)
- Repo has an open-source licence (1 pt per repo, max 5)
- Repo is not a fork (quality filter, no points)
- README.md detected via topics heuristic (1 pt per repo)
- Bonus for more than 10 original repos (signals initiative)
Confidence
High — repo metadata is public; occasional private repos may be missed.
Community Impact
Max 10 points
Signals
- Total stargazers across original repos (log-scaled, max 5 pts)
- Total forks of original repos (log-scaled, max 3 pts)
- External OSS event contributions (max 2 pts)
Confidence
Moderate — stars are visible but can be gamed via star-farming. Flagged when detected.
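The log scaling keeps viral repos from dominating the dimension. A sketch, assuming base-10 scaling with an illustrative 1.5 multiplier (the document specifies only "log-scaled, max 5 pts"):

```python
import math

def star_points(total_stars: int, cap: float = 5.0) -> float:
    """Log-scale stargazer totals so 10 stars and 10,000 stars do not
    differ by three orders of magnitude in score. The base and the
    1.5 multiplier are assumptions for illustration."""
    if total_stars <= 0:
        return 0.0
    return min(cap, math.log10(total_stars + 1) * 1.5)
```

Under this scaling a repo needs on the order of a few thousand stars to hit the 5-point cap, so the dimension saturates rather than rewarding virality without bound.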
Contribution Consistency
Max 20 points
Signals
- Active days in the last 90 days (GitHub Events API, max 8 pts)
- Contribution streak length (max 4 pts)
- Event type diversity — pushes, PRs, reviews, issues (max 4 pts)
- Consistency across weekdays vs. single-burst sessions (max 4 pts)
Confidence
Moderate — GitHub Events API only returns the last 300 events / 90 days. Long-tenure corporate engineers with private activity will score lower here.
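The active-days signal might be computed along these lines. The linear ramp to full points at 72 active days (80% of the 90-day window) is an assumption, as is the function name:

```python
from datetime import date, timedelta

def active_day_points(event_dates: list[date], today: date,
                      window: int = 90, cap: float = 8.0) -> float:
    """Count distinct active days inside the trailing window and
    scale linearly to the 8-point cap. The 72-day full-credit
    threshold is illustrative, not HireSignal's documented value."""
    cutoff = today - timedelta(days=window)
    active = {d for d in event_dates if d > cutoff}
    return min(cap, len(active) * cap / 72)
```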
Technical Breadth
Max 15 points
Signals
- Unique language count across original repos (max 6 pts)
- Topic / domain diversity (frontend, backend, ML, DevOps) (max 4 pts)
- Multi-language repos (polyglot indicators) (max 3 pts)
- Rare / specialised language bonus (Rust, Go, Erlang) (2 pts)
Confidence
High — language data is per-repo and reliably reported by GitHub.
Social Proof
Max 5 points
Signals
- Follower count (log-scaled, max 3 pts)
- Following / follower ratio sanity check (max 1 pt)
- Public gists (max 1 pt)
Confidence
Low — follower counts are lagging signals and easily inflated. This dimension has the lowest weight in all role presets.
Commit Quality
Max 20 points
Signals
- Samples 50–100 commits across the top 5 original repos
- Conventional Commit format (feat/fix/chore/docs) (up to 6 pts)
- Average commit message length ≥ 30 chars (up to 4 pts)
- Atomic commits — one logical change per commit (up to 5 pts)
- Absence of noise commits ('wip', 'asdf', 'test') (up to 5 pts)
Confidence
High when ≥ 30 commits are sampled; low for accounts with fewer than 10 visible commits.
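The message-level checks above can be approximated with simple pattern matching. The point scaling and the noise word list below are illustrative; only the "up to N pts" ceilings come from the signal list (atomicity needs diff analysis and is omitted here).

```python
import re

# Conventional Commit prefixes; the type list is a common superset of
# the feat/fix/chore/docs examples given above.
CONVENTIONAL = re.compile(r"^(feat|fix|chore|docs|refactor|test|perf)(\(.+\))?!?: ")
NOISE = {"wip", "asdf", "test", "fix", "update", "."}

def commit_quality(messages: list[str]) -> dict:
    """Score a sample of commit messages against the heuristics above.
    Returns per-signal points; scaling factors are assumptions."""
    n = len(messages) or 1
    conventional = sum(bool(CONVENTIONAL.match(m)) for m in messages) / n
    long_enough = sum(len(m) >= 30 for m in messages) / n
    noisy = sum(m.strip().lower() in NOISE for m in messages) / n
    return {
        "conventional_pts": round(6 * conventional, 2),
        "length_pts": round(4 * long_enough, 2),
        "noise_free_pts": round(5 * (1 - noisy), 2),
    }
```

Note that a bare "fix" counts as noise while "fix: handle empty input" counts as a Conventional Commit, which matches the intent of the two separate signals.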
Recency
Max 10 points
Signals
- Last push date (days since last commit, decays over time)
- Active repos in the last 6 months (max 4 pts)
- Push events in the last 30 days (max 3 pts)
- Issue / PR activity in the last 30 days (max 3 pts)
Confidence
High — timestamps are reliable. Note: engineers between jobs will naturally score lower here.
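One plausible shape for the last-push decay is exponential. The 60-day half-life below is an assumption; the document says only that the signal "decays over time".

```python
def recency_decay(days_since_push: int, half_life: float = 60.0) -> float:
    """Return a 0-1 multiplier for the last-push signal that halves
    every `half_life` days. The half-life value is illustrative."""
    return 0.5 ** (days_since_push / half_life)
```

An exponential curve penalises a two-week gap only mildly while an 18-month gap approaches zero, which fits the stated caveat about engineers between jobs.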
PR Review Quality
Max 10 points
Signals
- PullRequestReviewEvent count in the last 90 days (max 5 pts)
- External PR reviews (reviewing others' work, not your own repos) (max 3 pts)
- Review diversity — reviewing in multiple repos (max 2 pts)
Confidence
Low to moderate — PR Review events are under-represented in the public Events API. Strong corporate contributors will have invisible review activity.
OSS Contributions
Max 10 points
Signals
- PushEvents or IssueEvents on repos owned by others (last 90 days)
- Non-trivial contribution: multi-line commits or issue engagement (bonus)
- Diversity of external projects contributed to (max 3 pts)
Confidence
Moderate — public OSS events are visible; corporate open source (behind VPNs or enterprise GitHub) is invisible.
Documentation Quality
Max 10 points
Signals
- README detected in top repos (heuristic via topics + description) (max 4 pts)
- Personal website or portfolio linked (max 2 pts)
- Blog / writing linked (max 2 pts)
- Description text quality (length, punctuation, professionalism) (max 2 pts)
Confidence
Moderate — README presence is inferred; full content is not fetched to stay within API rate limits on the free tier.
Confidence Scoring Model
Every report includes an overall data confidence score (30–98%) and a per-dimension confidence band (high / moderate / low). These are shown as coloured pills on each scoring bar so recruiters never mistake a low-data estimate for a high-confidence fact.
| Band | Threshold | Example profile |
|---|---|---|
| High confidence | ≥ 80 | 1,247 commits · 4 years of activity |
| Moderate confidence | 50–79 | Some data but limited events |
| Low confidence | < 50 | Only 3 repos, mostly forks |
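The band boundaries translate directly into code; only the function name here is illustrative.

```python
def confidence_band(score: float) -> str:
    """Map the 30-98% data-confidence score to its pill band,
    using the thresholds stated above."""
    if score >= 80:
        return "high"
    if score >= 50:
        return "moderate"
    return "low"
```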
The hardest case: Senior engineers at BigCo. Staff engineers at Google, Meta, or any company with private GitHub Enterprise will show minimal public activity — they're working in private repos all day. When confidence is low and the inferred experience level is Senior/Staff, HireSignal automatically upgrades a NO_HIRE to MAYBE and shows a data completeness warning.
Role Presets & Weight Adjustments
Role presets apply weight multipliers to dimensions without changing the raw scores. The final 0–100 score reflects relative emphasis — a backend engineer is judged primarily on commit quality, not social proof.
Backend Engineer
Emphasises commit quality, contribution consistency, and technical breadth.
Frontend / Full-Stack
Balances breadth with repo quality and documentation — UI engineers often have polished public work.
ML / AI Engineer
Heavy weight on commit quality and recency — ML work is often in Jupyter Notebooks; we account for that.
DevOps / Platform
Values consistency and PR review quality — infra engineers often review more than they push.
OSS Contributor
Maximises community impact and OSS contribution signals.
Balanced (Default)
Equal-weight baseline. Recommended for general screening.
Enterprise customers can define custom weight multipliers per role. Pro customers can override the preset on any individual analysis.
Bias & Fairness Checks
HireSignal is designed for NYC Local Law 144 compliance and EU AI Act Article 22. The following automated checks run on every analysis and are recorded in the audit log.
| Flag | Trigger condition | Mitigation action |
|---|---|---|
| low-data-bias | < 5 original repos OR hireConfidence < 50% | Upgrade NO_HIRE → MAYBE for Senior/Staff engineers. Show data completeness warning banner. |
| recency-bias | Account inactive in the last 6 months | Surface warning: candidate may be employed, on leave, or working in private repos. |
| popularity-bias | Community Impact score is the highest-weighted dimension AND total score > 80 | Warn: high star counts may reflect trending projects rather than engineering skill. |
| ai-inflation | AI usage likelihood ≥ 40% (detected via commit pattern analysis) | Surface AI usage flag with tailored interview probe questions. Score is not adjusted — human judgment required. |
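The low-data-bias row, for example, reduces to a small rule. The verdict and level strings below are illustrative placeholders; the trigger condition and mitigation are taken verbatim from the table.

```python
def apply_low_data_mitigation(verdict: str, level: str,
                              original_repos: int,
                              confidence: float) -> tuple[str, bool]:
    """Soften NO_HIRE to MAYBE for Senior/Staff profiles when public
    data is thin, per the low-data-bias row above. Returns the
    (possibly adjusted) verdict and whether to show the warning banner."""
    low_data = original_repos < 5 or confidence < 50
    if low_data and verdict == "NO_HIRE" and level in {"SENIOR", "STAFF"}:
        return "MAYBE", True
    return verdict, low_data
```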
The Role of AI in HireSignal
LLaMA 3.1 8B runs self-hosted on HireSignal infrastructure. It is used for three things:
- Plain-English narrative summary of the candidate's profile (does not affect the score)
- 5 tailored interview questions generated from the candidate's actual stack and signals
- Red flag and standout factor identification in natural language
Zero data leaves your trust boundary. No candidate data is sent to OpenAI, Anthropic, Google, or any third-party LLM API. The LLaMA model runs in HireSignal's private inference cluster. Enterprise customers can optionally deploy the model on their own infrastructure.
The Outcome Feedback Loop
When a recruiter marks a candidate as hired, HireSignal sends a 90-day and 180-day check-in asking for a performance rating (output quality, team fit, retention risk, would hire again). These ratings are stored against the original score.
After 50+ outcomes, the dashboard shows which HireSignal score bands correlate with high performers for that specific recruiter's hiring patterns. After 100+ outcomes, Enterprise customers can request auto-suggested weight recalibrations.
This dataset is the product's primary moat. No base model + prompt can replicate recruiter-specific outcome data. After 10,000 hires across users, the correlation dataset becomes a proprietary asset that defines HireSignal's accuracy advantage.
Questions about the model? We're happy to walk through any dimension in detail.