Overview

The GitHub Webset provides enriched profile data for every active GitHub user. It combines raw GitHub data with additional fields we’ve linked or normalized: parsed full names, personal emails, and LinkedIn profile mappings.
Dataset Size: 50M+ GitHub users
Refresh Rate: Monthly
LinkedIn Mappings: ~5M profiles

What’s Included

The GitHub Webset enriches standard GitHub profile data with:

Personal Information

  • Name parsing - Full, first, middle, and last names extracted from various sources
  • Personal emails - Email addresses beyond what’s publicly visible on GitHub
  • Location data - Structured location information (city, state, country, continent)

Professional Data

  • LinkedIn mapping - Connected LinkedIn profiles for ~5M GitHub users
  • Work information - Current company, position, and work history (when available)
  • Contact details - Work emails, school emails, and other contact methods

GitHub Activity

  • Repositories - All public repos with metadata (stars, forks, topics, languages)
  • Commits - Recent commit activity across repositories
  • Stars - Repositories starred by the user
  • Issues - Issues created or commented on
  • Social graph - Followers and following accounts

Common Use Cases

Recruiting & Talent Sourcing

Find developers with specific skills, technologies, or open source experience. Use must_have=personal_email to return only profiles that include a personal email.

Sales & Marketing

Identify decision-makers and technical leads at target companies. Enrich existing leads with GitHub activity and tech stack insights.

Lead Enrichment

Enhance your CRM data with comprehensive GitHub profiles, contribution history, and verified contact information.

Developer Research

Analyze developer communities, technology trends, and open source ecosystems at scale.

API Reference

For complete API documentation including endpoints, parameters, authentication, and response schemas:

View GitHub Webset API Reference

Complete request/response documentation, authentication details, and code examples

Quick Start Example

Here’s a minimal example to get started:
curl -X GET "https://api.peoplecontext.com/v1/webset/github/person?github=torvalds&addons=linkedin" \
  -H "Authorization: Bearer YOUR_API_KEY"
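
The same request can be issued from Python. The following is a minimal sketch using only the standard library, not an official client: the helper names build_person_url and fetch_person are illustrative, and it assumes the endpoint returns a JSON body like the one shown under Response Structure.

```python
import json
import urllib.parse
import urllib.request

# Endpoint taken from the curl example above.
BASE_URL = "https://api.peoplecontext.com/v1/webset/github/person"


def build_person_url(github, addons=(), must_have=()):
    """Build the lookup URL; list-valued parameters are comma-joined."""
    params = {"github": github}
    if addons:
        params["addons"] = ",".join(addons)
    if must_have:
        params["must_have"] = ",".join(must_have)
    # safe="," keeps commas literal, e.g. must_have=personal_email,repos
    return BASE_URL + "?" + urllib.parse.urlencode(params, safe=",")


def fetch_person(api_key, github, addons=(), must_have=()):
    """Send the GET request with a Bearer token and decode the JSON body."""
    request = urllib.request.Request(
        build_person_url(github, addons, must_have),
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling fetch_person("YOUR_API_KEY", "torvalds", addons=["linkedin"]) would send the same request as the curl command.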

Key Features

LinkedIn Addon

Add addons=linkedin to include comprehensive LinkedIn profile data in the response. This provides professional history, education, skills, and more for users with linked profiles.

Must-Have Fields

Use the must_have parameter to only pay for profiles that meet your requirements:
  • must_have=personal_email - Only return profiles with personal emails
  • must_have=personal_email,repos - Only return profiles with a personal email OR repositories
  • must_have=linkedin - Only return profiles with LinkedIn mappings
When a profile doesn’t meet your requirements, an empty object {} is returned and you are not charged.
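
Because a non-matching profile comes back as an empty object, client code needs to tell {} apart from a real profile before using it. The sketch below shows one way to do that for a batch of lookups; split_matches is a hypothetical helper, and the input is assumed to be a dict mapping each username to its already-decoded JSON response.

```python
def split_matches(responses):
    """Partition {username: response} into matched profiles and skipped usernames."""
    matched = {}
    skipped = []
    for username, profile in responses.items():
        if profile:
            # Non-empty object: the profile met the must_have requirements.
            matched[username] = profile
        else:
            # Empty object {}: requirements not met, so no profile data to use.
            skipped.append(username)
    return matched, skipped


# Example with hand-written responses (not real API output).
responses = {
    "torvalds": {"username": "torvalds", "full_name": "Linus Torvalds"},
    "someone-without-email": {},
}
matched, skipped = split_matches(responses)
print(sorted(matched), skipped)
```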

Rich Activity Data

Every profile includes detailed GitHub activity:
  • Repositories with topics, languages, stars, and forks
  • Commit history with author info and repository context
  • Stars showing interests and technologies
  • Issues demonstrating community engagement

Best Practices

Optimize costs with must_have: Only pay for profiles that meet your requirements by specifying required fields like must_have=personal_email.
Rate limiting: API requests are subject to rate limits based on your plan. Contact support if you need higher limits.
Data freshness: GitHub data is refreshed monthly. The last_updated field indicates when the profile was last enriched.
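
One common way to stay within rate limits is to retry with exponential backoff. The sketch below is a generic pattern, not documented behavior of this API: it assumes rate-limited requests fail with HTTP status 429, and call_with_backoff, max_attempts, and base_delay are illustrative names and defaults.

```python
import time
import urllib.error


def call_with_backoff(request_fn, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call request_fn, retrying on HTTP 429 with a doubling delay."""
    for attempt in range(max_attempts):
        try:
            return request_fn()
        except urllib.error.HTTPError as err:
            # Assumption: the API signals rate limiting with status 429.
            # Any other error, or the final attempt, is re-raised unchanged.
            if err.code != 429 or attempt == max_attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)  # waits 1s, 2s, 4s, ...
```

Here request_fn is any zero-argument callable that performs one request with urllib, for example a lambda wrapping your own lookup function. The sleep parameter exists so the delay can be replaced in tests.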

Response Structure

Here’s what a typical response looks like (trimmed for clarity):
{
  "first_name": "Linus",
  "last_name": "Torvalds",
  "full_name": "Linus Torvalds",
  "personal_email": "[email protected]",
  "url": "https://github.com/torvalds",
  "username": "torvalds",
  "id": 1024025,
  "bio": "Creator of Linux and Git",
  "location": "Portland, OR",
  "company": "@linuxfoundation",
  "repos": [
    {
      "full_name": "torvalds/linux",
      "language": "C",
      "stargazers_count": 150000,
      "description": "Linux kernel source tree"
    }
  ],
  "linkedin": {
    "full_name": "Linus Torvalds",
    "headline": "Creator of Linux",
    "current_position": {
      "company": { "name": "Linux Foundation" }
    }
  }
}
For the complete schema, see the API Reference.
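
Nested fields such as linkedin are only present for profiles with a mapping, so it helps to read them defensively. The sketch below uses only the field names visible in the sample response above; the helper names linkedin_company and top_repo are illustrative, and treating missing keys as absent data is an assumption about how optional fields appear.

```python
def linkedin_company(profile):
    """Return the current LinkedIn company name, or None if any level is missing."""
    node = profile.get("linkedin") or {}
    for key in ("current_position", "company"):
        node = node.get(key) or {}
    return node.get("name")


def top_repo(profile):
    """Return the repository with the most stars, or None if there are no repos."""
    repos = profile.get("repos") or []
    return max(repos, key=lambda repo: repo.get("stargazers_count", 0), default=None)


# Trimmed copy of the sample response above.
profile = {
    "username": "torvalds",
    "repos": [{"full_name": "torvalds/linux", "stargazers_count": 150000}],
    "linkedin": {"current_position": {"company": {"name": "Linux Foundation"}}},
}
print(linkedin_company(profile))       # Linux Foundation
print(top_repo(profile)["full_name"])  # torvalds/linux
```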