How HireSignal Scores a Profile
Complete transparency on every dimension, weight, confidence band, and bias check. If you can see inside the model, you can trust the output.
Why we publish this: HireSignal is not an AI wrapper. The core scoring model is deterministic — same GitHub data always produces the same score. LLaMA 3.1 enriches the report with narrative and interview questions, but it doesn't set the number. Publishing the model is evidence of that. Wrappers don't have documented, explainable models.
The Scoring Formula
- Each dimension has a defined maximum score (see table below).
- Role weights are multipliers — they shift relative emphasis, not the raw dimension score.
- The final score is always 0–100, regardless of role preset.
- Scores are reproducible: given identical GitHub API data, the score is always identical.
- LLM narrative is generated after scoring and does not affect the number.
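Concretely, the rules above amount to a weighted average over the dimension maxima listed below. The following is a minimal sketch, not HireSignal's published code: the `final_score` function name, the way weights default to 1.0, and the rounding are all assumptions; only the dimension maxima and the 0–100 guarantee come from this document.

```python
# Dimension maxima as listed in the table below (sum = 145).
DIMENSION_MAX = {
    "profile": 10, "repo_quality": 25, "community": 10,
    "consistency": 20, "breadth": 15, "social": 5,
    "commit_quality": 20, "recency": 10, "pr_review": 10,
    "oss": 10, "docs": 10,
}

def final_score(raw: dict, weights: dict) -> float:
    """Normalise weighted raw points to a 0-100 score.

    raw[d] must lie in [0, DIMENSION_MAX[d]].
    weights are role multipliers (1.0 = neutral emphasis), so they
    shift relative emphasis without changing any raw dimension score.
    """
    num = sum(weights.get(d, 1.0) * raw[d] for d in DIMENSION_MAX)
    den = sum(weights.get(d, 1.0) * m for d, m in DIMENSION_MAX.items())
    return round(100 * num / den, 1)
```

Because the same weights appear in the numerator and denominator, a perfect profile scores 100 under any preset, which is why the final score stays 0–100 regardless of role.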
The 11 Dimensions
Total maximum raw score: 145 points, normalised to 0–100 by the weighted formula described above.
Profile Completeness
Max 10 points
Signals
- Display name set (1 pt)
- Bio text present (1 pt)
- Location set (1 pt)
- Blog / personal site linked (1 pt)
- Email or contact visible (1 pt)
- Hireable flag (0.5 pt)
- Twitter / social linked (0.5 pt)
- Avatar is non-default (1 pt)
- Organisation membership (1 pt)
- Pro account (0.5 pt)
Confidence
Very high — all fields are directly observable from the GitHub API response.
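Most of these signals can be read straight off the public GitHub REST `users` payload. As an illustration only: the field names below are real GitHub API fields, but the function and the omission of the avatar, organisation, and Pro checks (which need extra requests) are simplifications.

```python
# Point values mirror the signal list above; fields are from the
# GitHub REST API "Get a user" response.
FIELDS = [
    ("name", 1.0), ("bio", 1.0), ("location", 1.0),
    ("blog", 1.0), ("email", 1.0),
    ("hireable", 0.5), ("twitter_username", 0.5),
]

def profile_completeness(user: dict) -> float:
    """Sum points for each populated profile field, capped at 10."""
    pts = sum(w for field, w in FIELDS if user.get(field))
    # Avatar, organisation membership, and Pro checks need extra
    # API calls and are omitted from this sketch.
    return min(pts, 10.0)
```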
Repository Quality
Max 25 points
Signals
- Repo has a description (up to 5 repos scored × 2 pts each)
- Repo has topics / tags (1 pt per repo, max 5)
- Repo has an open-source licence (1 pt per repo, max 5)
- Repo is not a fork (quality filter, no points)
- README.md detected via topics heuristic (1 pt per repo)
- Bonus for more than 10 original repos (signals initiative)
Confidence
High — repo metadata is public; occasional private repos may be missed.
Community Impact
Max 10 points
Signals
- Total stargazers across original repos (log-scaled, max 5 pts)
- Total forks of original repos (log-scaled, max 3 pts)
- External OSS event contributions (max 2 pts)
Confidence
Moderate — stars are visible but can be gamed via star-farming. Flagged when detected.
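The log scaling keeps viral repos from dominating the dimension. A sketch, assuming base-10 scaling with an illustrative 1.5 multiplier (the document specifies only "log-scaled, max 5 pts"):

```python
import math

def star_points(total_stars: int, cap: float = 5.0) -> float:
    """Log-scale stargazer totals so 10 stars and 10,000 stars do not
    differ by three orders of magnitude in score. The base and the
    1.5 multiplier are assumptions for illustration."""
    if total_stars <= 0:
        return 0.0
    return min(cap, math.log10(total_stars + 1) * 1.5)
```

Under this scaling a repo needs on the order of a few thousand stars to hit the 5-point cap, so the dimension saturates rather than rewarding virality without bound.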
Contribution Consistency
Max 20 points
Signals
- Active days in the last 90 days (GitHub Events API, max 8 pts)
- Contribution streak length (max 4 pts)
- Event type diversity — pushes, PRs, reviews, issues (max 4 pts)
- Consistency across weekdays vs. single-burst sessions (max 4 pts)
Confidence
Moderate — GitHub Events API only returns the last 300 events / 90 days. Long-tenure corporate engineers with private activity will score lower here.
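The active-days signal might be computed along these lines. The linear ramp to full points at 72 active days (80% of the 90-day window) is an assumption, as is the function name:

```python
from datetime import date, timedelta

def active_day_points(event_dates: list[date], today: date,
                      window: int = 90, cap: float = 8.0) -> float:
    """Count distinct active days inside the trailing window and
    scale linearly to the 8-point cap. The 72-day full-credit
    threshold is illustrative, not HireSignal's documented value."""
    cutoff = today - timedelta(days=window)
    active = {d for d in event_dates if d > cutoff}
    return min(cap, len(active) * cap / 72)
```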
Technical Breadth
Max 15 points
Signals
- Unique language count across original repos (max 6 pts)
- Topic / domain diversity (frontend, backend, ML, DevOps) (max 4 pts)
- Multi-language repos (polyglot indicators) (max 3 pts)
- Rare / specialised language bonus (Rust, Go, Erlang) (2 pts)
Confidence
High — language data is per-repo and reliably reported by GitHub.
Social Proof
Max 5 points
Signals
- Follower count (log-scaled, max 3 pts)
- Following / follower ratio sanity check (max 1 pt)
- Public gists (max 1 pt)
Confidence
Low — follower counts are lagging signals and easily inflated. This dimension has the lowest weight in all role presets.
Commit Quality
Max 20 points
Signals
- Samples 50–100 commits across the top 5 original repos
- Conventional Commit format (feat/fix/chore/docs) (up to 6 pts)
- Average commit message length ≥ 30 chars (up to 4 pts)
- Atomic commits — one logical change per commit (up to 5 pts)
- Absence of noise commits ('wip', 'asdf', 'test') (up to 5 pts)
Confidence
High when ≥ 30 commits are sampled; low for accounts with fewer than 10 visible commits.
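The message-level checks above can be approximated with simple pattern matching. The point scaling and the noise word list below are illustrative; only the "up to N pts" ceilings come from the signal list (atomicity needs diff analysis and is omitted here).

```python
import re

# Conventional Commit prefixes; the type list is a common superset of
# the feat/fix/chore/docs examples given above.
CONVENTIONAL = re.compile(r"^(feat|fix|chore|docs|refactor|test|perf)(\(.+\))?!?: ")
NOISE = {"wip", "asdf", "test", "fix", "update", "."}

def commit_quality(messages: list[str]) -> dict:
    """Score a sample of commit messages against the heuristics above.
    Returns per-signal points; scaling factors are assumptions."""
    n = len(messages) or 1
    conventional = sum(bool(CONVENTIONAL.match(m)) for m in messages) / n
    long_enough = sum(len(m) >= 30 for m in messages) / n
    noisy = sum(m.strip().lower() in NOISE for m in messages) / n
    return {
        "conventional_pts": round(6 * conventional, 2),
        "length_pts": round(4 * long_enough, 2),
        "noise_free_pts": round(5 * (1 - noisy), 2),
    }
```

Note that a bare "fix" counts as noise while "fix: handle empty input" counts as a Conventional Commit, which matches the intent of the two separate signals.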
Recency
Max 10 points
Signals
- Last push date (days since last commit, decays over time)
- Active repos in the last 6 months (max 4 pts)
- Push events in the last 30 days (max 3 pts)
- Issue / PR activity in the last 30 days (max 3 pts)
Confidence
High — timestamps are reliable. Note: engineers between jobs will naturally score lower here.
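One plausible shape for the last-push decay is exponential. The 60-day half-life below is an assumption; the document says only that the signal "decays over time".

```python
def recency_decay(days_since_push: int, half_life: float = 60.0) -> float:
    """Return a 0-1 multiplier for the last-push signal that halves
    every `half_life` days. The half-life value is illustrative."""
    return 0.5 ** (days_since_push / half_life)
```

An exponential curve penalises a two-week gap only mildly while an 18-month gap approaches zero, which fits the stated caveat about engineers between jobs.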
PR Review Quality
Max 10 points
Signals
- PullRequestReviewEvent count in the last 90 days (max 5 pts)
- External PR reviews (reviewing others' work, not your own repos) (max 3 pts)
- Review diversity — reviewing in multiple repos (max 2 pts)
Confidence
Low to moderate — PR Review events are under-represented in the public Events API. Strong corporate contributors will have invisible review activity.
OSS Contributions
Max 10 points
Signals
- PushEvents or IssueEvents on repos owned by others (last 90 days)
- Non-trivial contribution: multi-line commits or issue engagement (bonus)
- Diversity of external projects contributed to (max 3 pts)
Confidence
Moderate — public OSS events are visible; corporate open source (behind VPNs or enterprise GitHub) is invisible.
Documentation Quality
Max 10 points
Signals
- README detected in top repos (heuristic via topics + description) (max 4 pts)
- Personal website or portfolio linked (max 2 pts)
- Blog / writing linked (max 2 pts)
- Description text quality (length, punctuation, professionalism) (max 2 pts)
Confidence
Moderate — README presence is inferred; full content is not fetched to stay within API rate limits on the free tier.
Confidence Scoring Model
Every report includes an overall data confidence score (30–98%) and a per-dimension confidence band (high / moderate / low). These are shown as coloured pills on each scoring bar so recruiters never mistake a low-data estimate for a high-confidence fact.
| Band | Threshold | Example profile |
|---|---|---|
| High confidence | ≥ 80 | 1,247 commits · 4 years of activity |
| Moderate confidence | 50–79 | Some data but limited events |
| Low confidence | < 50 | Only 3 repos, mostly forks |
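The band boundaries translate directly into code; only the function name here is illustrative.

```python
def confidence_band(score: float) -> str:
    """Map the 30-98% data-confidence score to its pill band,
    using the thresholds stated above."""
    if score >= 80:
        return "high"
    if score >= 50:
        return "moderate"
    return "low"
```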
The hardest case: Senior engineers at BigCo. Staff engineers at Google, Meta, or any company with private GitHub Enterprise will show minimal public activity — they're working in private repos all day. When confidence is low and the inferred experience level is Senior/Staff, HireSignal automatically upgrades a NO_HIRE to MAYBE and shows a data completeness warning.
Role Presets & Weight Adjustments
Role presets apply weight multipliers to dimensions without changing the raw scores. The final 0–100 score reflects relative emphasis — a backend engineer is judged primarily on commit quality, not social proof.
Backend Engineer
Emphasises commit quality, contribution consistency, and technical breadth.
Frontend / Full-Stack
Balances breadth with repo quality and documentation — UI engineers often have polished public work.
ML / AI Engineer
Heavy weight on commit quality and recency — ML work is often in Jupyter Notebooks; we account for that.
DevOps / Platform
Values consistency and PR review quality — infra engineers often review more than they push.
OSS Contributor
Maximises community impact and OSS contribution signals.
Balanced (Default)
Equal-weight baseline. Recommended for general screening.
Enterprise customers can define custom weight multipliers per role. Pro customers can override the preset on any individual analysis.
Bias & Fairness Checks
HireSignal is designed for NYC Local Law 144 compliance and EU AI Act Article 22. The following automated checks run on every analysis and are recorded in the audit log.
| Flag | Trigger condition | Mitigation action |
|---|---|---|
| low-data-bias | < 5 original repos OR hireConfidence < 50% | Upgrade NO_HIRE → MAYBE for Senior/Staff engineers. Show data completeness warning banner. |
| recency-bias | Account inactive in the last 6 months | Surface warning: candidate may be employed, on leave, or working in private repos. |
| popularity-bias | Community Impact score is the highest-weighted dimension AND total score > 80 | Warn: high star counts may reflect trending projects rather than engineering skill. |
| ai-inflation | AI usage likelihood ≥ 40% (detected via commit pattern analysis) | Surface AI usage flag with tailored interview probe questions. Score is not adjusted — human judgment required. |
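The low-data-bias row, for example, reduces to a small rule. The verdict and level strings below are illustrative placeholders; the trigger condition and mitigation are taken verbatim from the table.

```python
def apply_low_data_mitigation(verdict: str, level: str,
                              original_repos: int,
                              confidence: float) -> tuple[str, bool]:
    """Soften NO_HIRE to MAYBE for Senior/Staff profiles when public
    data is thin, per the low-data-bias row above. Returns the
    (possibly adjusted) verdict and whether to show the warning banner."""
    low_data = original_repos < 5 or confidence < 50
    if low_data and verdict == "NO_HIRE" and level in {"SENIOR", "STAFF"}:
        return "MAYBE", True
    return verdict, low_data
```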
The Role of AI in HireSignal
LLaMA 3.1 8B runs self-hosted on HireSignal infrastructure. It is used for three things:
- Plain-English narrative summary of the candidate's profile (does not affect the score)
- 5 tailored interview questions generated from the candidate's actual stack and signals
- Red flag and standout factor identification in natural language
Zero data leaves your trust boundary. No candidate data is sent to OpenAI, Anthropic, Google, or any third-party LLM API. The LLaMA model runs in HireSignal's private inference cluster. Enterprise customers can optionally deploy the model on their own infrastructure.
The Outcome Feedback Loop
When a recruiter marks a candidate as hired, HireSignal sends a 90-day and 180-day check-in asking for a performance rating (output quality, team fit, retention risk, would hire again). These ratings are stored against the original score.
After 50+ outcomes, the dashboard shows which HireSignal score bands correlate with high performers for that specific recruiter's hiring patterns. After 100+ outcomes, Enterprise customers can request auto-suggested weight recalibrations.
This dataset is the product's primary moat. No base model + prompt can replicate recruiter-specific outcome data. After 10,000 hires across users, the correlation dataset becomes a proprietary asset that defines HireSignal's accuracy advantage.
Questions about the model? We're happy to walk through any dimension in detail.