Overview
The GitHub Webset provides enriched profile data for every active GitHub user, combining raw GitHub data with additional fields we’ve linked or normalized: full name parsing, personal emails, and LinkedIn profile mappings.
Dataset Size: ~50M+ GitHub users
Refresh Rate: Monthly
LinkedIn Mappings: ~5M profiles
What’s Included
The GitHub Webset enriches standard GitHub profile data with:
Personal Information
- Name parsing - Full name, first, middle, and last name extracted from various sources
- Personal emails - Email addresses beyond what’s publicly visible on GitHub
- Location data - Structured location information (city, state, country, continent)
Professional Data
- LinkedIn mapping - Connected LinkedIn profiles for ~5M GitHub users
- Work information - Current company, position, and work history (when available)
- Contact details - Work emails, school emails, and other contact methods
GitHub Activity
- Repositories - All public repos with metadata (stars, forks, topics, languages)
- Commits - Recent commit activity across repositories
- Stars - Repositories starred by the user
- Issues - Issues created or commented on
- Social graph - Followers and following accounts
Common Use Cases
Recruiting & Talent Sourcing
Find developers with specific skills, technologies, or open source experience. Use must_have=personal_email to only get profiles with contact information.
Sales & Marketing
Identify decision-makers and technical leads at target companies. Enrich existing leads with GitHub activity and tech stack insights.
Lead Enrichment
Enhance your CRM data with comprehensive GitHub profiles, contribution history, and verified contact information.
Developer Research
Analyze developer communities, technology trends, and open source ecosystems at scale.
API Reference
For complete API documentation including endpoints, parameters, authentication, and response schemas:
View GitHub Webset API Reference
Complete request/response documentation, authentication details, and code examples
Quick Start Example
Here’s a minimal example to get started:
Key Features
LinkedIn Addon
Add addons=linkedin to include comprehensive LinkedIn profile data in the response. This provides professional history, education, skills, and more for users with linked profiles.
Must-Have Fields
Use the must_have parameter to only pay for profiles that meet your requirements:
- must_have=personal_email - Only return profiles with personal emails
- must_have=personal_email,repos - Must have email OR repositories
- must_have=linkedin - Only return profiles with LinkedIn mappings
If a profile does not meet the requirements, an empty object {} is returned and you are not charged.
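As a rough sketch of how the must_have and addons parameters might be combined in a request, the helper below builds a lookup URL and checks for the empty {} response. The base URL, the username-in-path layout, and the function names are assumptions made for illustration; take the real endpoint and authentication from the API reference.

```python
from urllib.parse import urlencode

# Hypothetical base URL and path layout; take the real ones from the API reference.
BASE_URL = "https://api.example.com/github/users"


def build_lookup_url(username, must_have=(), addons=()):
    """Build a profile lookup URL with optional must_have and addons filters."""
    params = {}
    if must_have:
        params["must_have"] = ",".join(must_have)
    if addons:
        params["addons"] = ",".join(addons)
    # Keep commas unescaped so the query keeps the must_have=a,b form.
    query = urlencode(params, safe=",")
    return f"{BASE_URL}/{username}" + (f"?{query}" if query else "")


def has_profile(response_body):
    """Return False for the empty object {} that signals an unmet must_have filter."""
    return bool(response_body)


url = build_lookup_url("octocat", must_have=["personal_email"], addons=["linkedin"])
print(url)
print(has_profile({}))
```

Checking for the empty object before further processing keeps unmatched profiles out of downstream code without treating them as errors.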
Rich Activity Data
Every profile includes detailed GitHub activity:
- Repositories with topics, languages, stars, and forks
- Commit history with author info and repository context
- Stars showing interests and technologies
- Issues demonstrating community engagement
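To illustrate how repository metadata could feed a tech stack summary, here is a minimal sketch that tallies languages and stars for one profile. The field names (repos, language, stars) are hypothetical and chosen only for this example; the actual response schema is in the API reference.

```python
from collections import Counter


def summarize_repos(profile):
    """Tally repository languages and total stars for one profile.

    Assumes a hypothetical shape: profile["repos"] is a list of dicts
    with optional "language" and "stars" keys.
    """
    repos = profile.get("repos") or []
    # Skip repositories with no detected language.
    languages = Counter(r["language"] for r in repos if r.get("language"))
    # Treat a missing or null star count as zero.
    total_stars = sum(r.get("stars") or 0 for r in repos)
    return {"languages": languages.most_common(), "total_stars": total_stars}


sample = {
    "repos": [
        {"name": "a", "language": "Python", "stars": 10},
        {"name": "b", "language": "Go", "stars": 3},
        {"name": "c", "language": "Python", "stars": None},
    ]
}
summary = summarize_repos(sample)
print(summary)
```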
Best Practices
Data freshness: GitHub data is refreshed monthly. The last_updated field indicates when the profile was last enriched.
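One way to act on last_updated is to flag profiles older than a chosen threshold before relying on them. The sketch below assumes last_updated is an ISO 8601 string, which this page does not confirm; is_stale and the 31-day default are illustrative choices, not part of the API.

```python
from datetime import datetime, timedelta, timezone


def is_stale(profile, max_age_days=31, now=None):
    """Return True when last_updated is missing or older than max_age_days.

    Assumes last_updated is an ISO 8601 string; the real format may differ.
    """
    raw = profile.get("last_updated")
    if not raw:
        return True
    # fromisoformat() rejects a trailing "Z" before Python 3.11, so normalize it.
    updated = datetime.fromisoformat(raw.replace("Z", "+00:00"))
    if updated.tzinfo is None:
        updated = updated.replace(tzinfo=timezone.utc)  # treat naive values as UTC
    now = now or datetime.now(timezone.utc)
    return now - updated > timedelta(days=max_age_days)


reference = datetime(2024, 6, 1, tzinfo=timezone.utc)
print(is_stale({"last_updated": "2024-05-15T00:00:00Z"}, now=reference))
print(is_stale({"last_updated": "2024-03-01T00:00:00+00:00"}, now=reference))
```

Passing now explicitly keeps the check deterministic in tests; in normal use it can be omitted.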