How we handle your data
Paxel analyzes your coding agent sessions to score developer behavior across five axes.
File bodies stay local: the Docker container reads transcripts and (unless --no-repo)
mounts your repo read-only for on-device analysis. What gets uploaded to YC: scores,
behavioral summaries, session metadata (including file paths), git commit metadata,
per-commit numstat, redacted decision records, and pipeline telemetry.
The binding privacy policy is at /privacy. This page is a technical companion that describes the Paxel client's data-handling practices in detail.
Last updated: April 2026
What the upload script collects
AI coding agent transcripts
Claude Code: JSONL files from ~/.claude/projects/.
The Docker container reads these locally to generate narratives and extract behavioral signals.
Only the processed output leaves your machine: per-session narratives (~2,000 characters each),
bounded tool-use summaries (session_events —
file paths + truncated command text + action types, capped at ~3,000 events per session
with text fields shortened within each event),
a 200-character excerpt of your first prompt for each session,
user-highlight excerpts (representative quotes drawn from your prompts, capped at 10,000 characters per session),
steering traces (counts and timestamps of course-corrections),
and dispatch metadata for subagent tasks (task descriptions ≤200 characters,
passed through the credential redactor).
Raw conversation history, full prompts, full agent responses, and full tool outputs
do not leave your machine.
Cursor IDE: extracted from ~/Library/Application Support/Cursor (macOS) or the equivalent on Linux/Windows, and merged into the same pipeline.
Codex CLI: JSONL files from ~/.codex/sessions, uploaded and analyzed alongside Claude Code sessions.
Cursor IDE and Codex CLI sessions are restricted to sessions whose git remote matches the selected project, unless you pass --all.
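The length caps described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Paxel's code: the function names are hypothetical, and the choice to cut the final highlight short (rather than drop it) when the budget runs out is an assumption.

```python
FIRST_PROMPT_CHARS = 200          # first-prompt excerpt length from the text above
HIGHLIGHT_BUDGET_CHARS = 10_000   # per-session cap on user-highlight excerpts
MAX_EVENTS = 3_000                # approximate per-session cap on session_events


def first_prompt_excerpt(prompt: str) -> str:
    """Keep only the leading characters of a session's first prompt."""
    return prompt[:FIRST_PROMPT_CHARS]


def cap_highlights(quotes: list[str], budget: int = HIGHLIGHT_BUDGET_CHARS) -> list[str]:
    """Keep quotes in order until the per-session character budget is spent."""
    kept: list[str] = []
    used = 0
    for quote in quotes:
        remaining = budget - used
        if remaining <= 0:
            break
        piece = quote[:remaining]  # assumption: the last quote is cut, not dropped
        kept.append(piece)
        used += len(piece)
    return kept


def cap_events(events: list[dict], limit: int = MAX_EVENTS) -> list[dict]:
    """Drop any events beyond the per-session limit."""
    return events[:limit]
```

Applying the caps at extraction time means the bounded values are the only ones that ever reach the upload payload.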
Git repository data (unless --no-repo)
Your source code, file contents, and diffs stay on your machine. The Docker container mounts your repo read-only for on-device code quality analysis; only aggregate metrics (file counts, language ratios, complexity scores) are uploaded.
The following git metadata is uploaded to the server: per-author numstat totals
(insertions/deletions per commit), velocity signals (commits/day, LOC/day, active days),
commit metadata for up to ~1,000 recent commits
(sha, short sha, author name, author email, date, subject —
no diffs, no file contents),
and your git remote URL.
Use --no-repo to skip all repo analysis entirely.
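As a rough illustration of collecting commit metadata without diffs or file contents, the sketch below parses the output of a `git log` call that requests only the six fields listed above. The format string and the helper name are assumptions, not the Paxel client's actual invocation; the dictionary keys mirror the example payload at the end of this page.

```python
FIELD_SEP = "\x1f"  # ASCII unit separator, unlikely to occur in commit subjects

# Full sha, short sha, author name, author email, ISO 8601 date, subject.
LOG_FORMAT = "%H%x1f%h%x1f%an%x1f%ae%x1f%aI%x1f%s"

# Hypothetical invocation; no -p or --stat, so no diff content is requested.
GIT_LOG_ARGS = ["git", "log", "-n", "1000", f"--pretty=format:{LOG_FORMAT}"]

KEYS = ("sha", "short", "author", "email", "date", "subject")


def parse_git_log(output: str) -> list[dict]:
    """Turn `git log` output in LOG_FORMAT into a list of metadata dicts."""
    commits = []
    for line in output.split("\n"):
        if not line:
            continue
        parts = line.split(FIELD_SEP, len(KEYS) - 1)
        if len(parts) != len(KEYS):
            continue  # skip malformed lines rather than guess at fields
        commits.append(dict(zip(KEYS, parts)))
    return commits
```

To try it against a real repository, pass `GIT_LOG_ARGS` to `subprocess.run(..., capture_output=True, text=True)` and feed the resulting `stdout` to `parse_git_log`.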
Metadata sidecar
Git remote URLs, local directory paths, and PR links extracted from transcripts.
What the script removes before sending
Paxel applies these redaction patterns to all content destined for upload (decision text, session events, narratives, first-prompt excerpts, user-highlight quotes, and commit subjects) inside the Docker container, before any data leaves your machine — both at the source as each field is extracted and again, fail-closed, at the upload boundary. The same patterns also run server-side on persisted LLM payloads as defense-in-depth:
| Pattern | What it catches |
|---|---|
| sk-ant-* | Anthropic API keys |
| sk-* | OpenAI API keys |
| sk_live_*, rk_live_* | Stripe secret keys |
| AKIA* | AWS access keys |
| AC*, SK* | Twilio Account / API-key SIDs |
| gh[pousr]_* | GitHub tokens (PATs, OAuth, fine-grained) |
| xoxb-*, xapp-* | Slack tokens |
| hf_* | HuggingFace tokens |
| npm_* | npm tokens |
| pypi-* | PyPI tokens |
| yk_* | YC / Paxel API tokens |
| AIza* | Google API keys |
| 1//0* | Google OAuth refresh tokens |
| AccountKey=* | Azure storage keys |
| eyJ*.eyJ*.* | JSON Web Tokens (JWTs) |
| Bearer * | Bearer authorization tokens |
| -----BEGIN * PRIVATE KEY----- | PEM private keys (RSA/EC/OpenSSH/PKCS#8) |
| postgres://, redis://, … with credentials | Database connection strings |
| API_KEY=, SECRET_KEY=, etc. | Environment variable assignments |
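To make the two-stage approach concrete, here is a minimal sketch of pattern-based redaction followed by a fail-closed check at the upload boundary. It covers only a handful of the patterns in the table, and the regular expressions, placeholder format, and function names are assumptions made for illustration; Paxel's actual expressions are not reproduced here.

```python
import re

# Illustrative subset. Order matters: the more specific sk-ant- prefix
# must be tried before the generic sk- prefix.
PATTERNS = [
    ("anthropic_key", re.compile(r"\bsk-ant-[A-Za-z0-9_\-]+")),
    ("openai_key", re.compile(r"\bsk-[A-Za-z0-9_\-]+")),
    ("aws_access_key", re.compile(r"\bAKIA[0-9A-Z]{16}\b")),
    ("github_token", re.compile(r"\bgh[pousr]_[A-Za-z0-9]+")),
    ("bearer", re.compile(r"\bBearer\s+[A-Za-z0-9._~+/=\-]+")),
]


def redact(text: str) -> str:
    """Replace every match with a labelled placeholder (source-side pass)."""
    for name, pattern in PATTERNS:
        text = pattern.sub(f"[REDACTED:{name}]", text)
    return text


def assert_clean(text: str) -> None:
    """Fail closed: refuse to continue if anything still matches a pattern."""
    for name, pattern in PATTERNS:
        if pattern.search(text):
            raise ValueError(f"unredacted {name} found; refusing to upload")
```

Running `assert_clean` on the final payload catches any field that skipped the source-side `redact` pass, which is the point of checking twice.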
Excluded from code quality analysis
The on-device code quality analyzer skips:
node_modules
vendor
.git
build
dist
tmp
log
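A directory walk that honors this skip list might look like the sketch below. It assumes the names are matched at any depth in the tree, which the list above does not state explicitly, and the function name is hypothetical.

```python
import os

EXCLUDED_DIRS = {"node_modules", "vendor", ".git", "build", "dist", "tmp", "log"}


def iter_source_files(root: str):
    """Yield file paths under root, never descending into excluded directories."""
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune in place so os.walk does not enter the excluded directories.
        dirnames[:] = [d for d in dirnames if d not in EXCLUDED_DIRS]
        for name in filenames:
            yield os.path.join(dirpath, name)
```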
How to limit what you send
| Flag | Effect |
|---|---|
| --no-repo | Skip repo mounting and all git analysis. Only transcripts are analyzed and uploaded. |
| --since 2m | Only include sessions from the last 2 months (supports days, weeks, months). |
| --project NAME | Select a specific project by repository name instead of all projects. |
| --no-sentry | Disable client-side error reporting to Sentry for this run. |
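For intuition about how a value such as `2m` could translate into a cutoff date, here is a small sketch. Only the `m` suffix appears above; the `d` and `w` suffixes, the 30-day month, and the function name are assumptions rather than documented behavior.

```python
import re
from datetime import datetime, timedelta, timezone

# Assumed unit letters and lengths; only "m" is shown in the table above.
UNIT_DAYS = {"d": 1, "w": 7, "m": 30}


def since_cutoff(spec: str, now: datetime) -> datetime:
    """Return the earliest session start time admitted by a --since value."""
    match = re.fullmatch(r"(\d+)([dwm])", spec)
    if match is None:
        raise ValueError(f"unrecognized --since value: {spec!r}")
    count, unit = int(match.group(1)), match.group(2)
    return now - timedelta(days=count * UNIT_DAYS[unit])
```

Sessions that started before the returned timestamp would be left out of the run.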
Account data
Email. Used for magic link authentication. Tokens are SHA256-hashed at rest, expire after 15 minutes, and are single-use.
Session cookie. _paxel_session, 1-week expiry. HttpOnly, Secure, SameSite=Lax.
API tokens. Used for Docker client authentication. SHA256-hashed at rest. Admins can revoke tokens and set usage limits.
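The token properties above (hashed at rest, 15-minute expiry, single use) can be illustrated with a short sketch. The record layout and function names are hypothetical, and this is not the server's implementation; it only shows why storing a SHA-256 digest is enough to verify a token without keeping the token itself.

```python
import hashlib
import hmac
import secrets
from datetime import datetime, timedelta, timezone

TOKEN_TTL = timedelta(minutes=15)  # magic-link lifetime from the text above


def issue_token(now: datetime) -> tuple[str, dict]:
    """Create a token; only its SHA-256 digest goes into the stored record."""
    raw = secrets.token_urlsafe(32)
    record = {
        "digest": hashlib.sha256(raw.encode()).hexdigest(),
        "expires_at": now + TOKEN_TTL,
        "used": False,
    }
    return raw, record


def redeem(raw: str, record: dict, now: datetime) -> bool:
    """Accept a token once, and only before it expires."""
    candidate = hashlib.sha256(raw.encode()).hexdigest()
    if not hmac.compare_digest(candidate, record["digest"]):
        return False
    if record["used"] or now >= record["expires_at"]:
        return False
    record["used"] = True
    return True
```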
What we generate from your data
- LLM narratives. High-level behavioral summaries of each session, generated by Anthropic's Claude or OpenAI's GPT models (Paxel's analysis models, routed through a proxy).
- Behavior scores. Numeric scores across 5 axes: Execution Leverage, Steering, Engineering Quality, Product Thinking, Planning.
- Decision patterns. Structured records of how you directed the AI during coding sessions.
- Most-questionable-prompts surface. For each upload, an LLM picks up to 3 of your decision-text user directives that read as vague or scoped-too-loosely and writes a short one-line reason per pick. Stored as {prompt, reason} pairs on the upload row and shown on your profile page under "Your most questionable prompts". Subject to the same admin-access and retention rules as your other upload data.
- Subagent dispatch metadata. When you run Claude Code's Task or Agent tool to spawn a subagent, we record counts of dispatches, returns, the subagent's run_in_background flag, and a short dispatch description (≤200 chars, passed through the same redactor as decision text to strip code identifiers and file paths). The raw dispatch prompt is never uploaded; only the first 12 hex characters of its SHA-1 hash are, used as a fallback identifier for matching subagent sessions to their parent when the parent session was filtered out upstream.
- Evidence excerpts. Transcript and commit excerpts with vector embeddings, used for search and analysis.
- Episode groupings. Sessions grouped into coherent work episodes.
- Commit group analysis. Git diffs grouped and reviewed by LLM for code quality signals.
- LLM call logs. Every LLM call Paxel makes, whether on behalf of your upload (via the client proxy) or for internal system tasks, is recorded in Postgres. Prompts, responses, and row metadata (model, token counts, cost, HMAC nonce, timestamps) all live in the same table. The retention schedule is described under "Data retention and deletion".
- Upload error messages. When an upload fails, we store the normalized exception message and — for LLM proxy errors whose response body was not our usual JSON envelope (e.g. an infrastructure-layer "Forbidden" response) — a scrubbed preview of that response body. Total stored length is capped at 500 characters. Scrubbing removes Anthropic/OpenAI API keys, GitHub tokens, Bearer tokens, YC tokens, JWTs, email addresses, and IPv4 addresses before storage. These error messages are visible to you on your results page and to YC admins.
Local cache on your machine
The Paxel upload script keeps two kinds of working data on your machine. The LLM-result cache (which avoids re-billing identical prompts across runs) lives in a Docker-managed named volume (paxel-cache-<your-uid>) — reachable only through your Docker daemon, not a file in your home directory; clear it any time with --clean. When an upload fails after three retries, a stashed copy of the upload payload is written to ~/.paxel/data/pending-uploads/<id>.json.gz. The stash contains the same narratives, scores, decisions, session events, steering traces, and dispatch metadata that would have been sent to our server, plus a SHA-256 fingerprint of the API token used to create it — never the raw token. Stash files are mode 0600 inside a mode-0700 directory. Your next upload automatically replays a stashed upload and exits; a subsequent run proceeds with a fresh analysis. Stashes older than 14 days, or that cannot be replayed (token changed, server rejected, endpoint changed), are moved to a pending-uploads/failed/ quarantine subdirectory alongside a short .error.json marker that records the quarantine reason and — for server-rejected stashes — the first 500 characters of the server's response body. To remove all pending and quarantined stashes, re-run the upload command with the --clear-pending flag.
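The stash-file properties described above (gzip-compressed JSON, mode 0600 inside a mode-0700 directory, a SHA-256 fingerprint instead of the raw token) can be sketched as follows. The function name and the `token_fingerprint` field name are hypothetical; this is an illustration for POSIX systems, not the client's actual code.

```python
import gzip
import hashlib
import json
import os
from pathlib import Path


def write_stash(directory: Path, stash_id: str, payload: dict, api_token: str) -> Path:
    """Write a gzip-compressed JSON stash readable only by the current user."""
    directory.mkdir(mode=0o700, parents=True, exist_ok=True)
    os.chmod(directory, 0o700)  # enforce the mode even if the directory existed

    # Store a fingerprint of the token, never the token itself.
    body = dict(payload, token_fingerprint=hashlib.sha256(api_token.encode()).hexdigest())

    path = directory / f"{stash_id}.json.gz"
    # Create the file with restrictive permissions from the start.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "wb") as raw, gzip.GzipFile(fileobj=raw, mode="wb") as gz:
        gz.write(json.dumps(body).encode("utf-8"))
    os.chmod(path, 0o600)  # also tighten a file that already existed
    return path
```

On the next run, a replay step could recompute the fingerprint from the current token and compare it with the stored one to detect the "token changed" case mentioned above.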
Third-party services
| Service | Data shared and purpose |
|---|---|
| Anthropic Claude API | Transcript text sent for behavioral analysis. Requests are routed through both Anthropic's API and Microsoft Foundry. Processes condensed conversation excerpts (not source code files). |
| OpenAI GPT API | Transcript text sent for behavioral analysis. Requests are routed through both OpenAI's API and Microsoft Foundry. Processes condensed conversation excerpts (not source code files). |
| Google AI Studio (Gemini) | Code evidence and transcript excerpts sent for vector embedding generation. Used for semantic search in the chat feature. No data is stored by Google. |
| Mailgun | Email delivery only (magic link authentication). We send your email address; Mailgun delivers the message. |
| Google Fonts | Loaded browser-side. Exposes your IP address and user agent to Google. |
| Amazon Web Services | Application hosting (ECS Fargate, us-west-2 Oregon). Manages web server, background workers, database (RDS PostgreSQL), Redis (ElastiCache), and file storage (S3). |
| Cloudflare | Proxies both paxel.ycombinator.com (the main app: login, results, admin, the curl \| bash upload script, and the CLI device-auth flow) and paxel-llm.ycombinator.com (the LLM proxy that handles per-call requests from the Docker client to the configured LLM providers: Anthropic, OpenAI, or Microsoft Foundry). TLS terminates at Cloudflare for both, so it sees plaintext request and response bodies in transit, including LLM prompts and model responses (Claude, GPT, or provider-routed) on the proxy domain, and login/results/admin traffic on the main domain. Several user-private values are passed in URL query strings or paths and therefore appear in Cloudflare's request logs alongside other URL metadata: CLI device-auth codes (?code=, 8 chars, valid for 30 minutes), personalized upload-script API tokens (?token= on /upload.sh; the value lasts as long as the API token does, typically 90 days), and magic-link login tokens (in the URL path on /auth/verify/:token, single-use and valid for 15 minutes). Cloudflare applies its WAF and bot-detection rules, and logs request metadata (IPs, paths, status codes, security events) per Cloudflare's standard logging. Cloudflare's standard request logs do not include body content. Used for DDoS mitigation and WAF protection. |
| Sentry (server-side) | Error tracking in production only. On LLM-pipeline errors, we attach your upload slug, your account email (so we can proactively reach out about a failure), and — when the error is tied to a specific transcript session — that session's identifier. For LLM proxy errors whose response body was not our usual JSON envelope, a scrubbed preview of the response body is attached as event extra (same 500-character cap and same redaction list as the first-party storage above: Anthropic/OpenAI keys, GitHub tokens, Bearer tokens, YC tokens, JWTs, emails, IPv4 addresses). When an upload's /api/v1/results submission is rejected by server-side nonce verification (the anti-tampering check that compares each submitted nonce to its matching proxy log), we also fire a low-severity event with your API token's database id, your user id, the stable reason code (e.g. no_nonces_matched), the upload's idempotency key, and the counters that drove the decision (submitted/matched/mismatched counts and up to five truncated request-id identifiers from the unmatched sample). No nonce values, prompts, transcripts, or response content are attached. These are the same identifiers we would use to look you up in our own admin UI; they are not shared with third parties. |
| Sentry (client-side) | Exception class, message, and stack traces sent from the Docker container on your machine when the pipeline errors. Before sending, the client redacts home paths, API tokens, email addresses, git remote URLs, database connection strings (postgres/redis/mongodb/mysql/mssql/amqp URLs with embedded credentials), and long strings; strips OS/device/runtime context and local variables; and drops HTTP request bodies so prompts and transcript content never leave your machine via error telemetry. The upload slug and the failing transcript session's identifier are attached as tags so we can correlate a reported error to your upload and — if needed — email you to retry. No API tokens are attached. For LLM proxy errors where the response body was not our usual JSON envelope (e.g. an infrastructure-layer "Forbidden" response), a scrubbed preview of the response body — capped at 500 characters, with Anthropic/OpenAI keys, Bearer tokens, YC tokens, JWTs, emails, and IPv4 addresses redacted — is attached as event extra so we can diagnose the failure without asking you to share logs. The response body's content-type and a boolean "preview-present" flag are attached as low-cardinality tags. When the final upload POST to /api/v1/results is rejected (any 4xx) or fails after all retries (network, timeout, or 5xx), we also fire an event with the HTTP status, the server's "error" field (if the response was JSON), the idempotency key for this upload attempt, and a preview of the response body truncated to 1,000 characters. This response body is a structured Rails JSON error (e.g. {"error": "Verification failed: ..."}) or, for edge cases like a CDN intermediate page, a short HTML string; it never contains your upload payload, prompts, transcripts, or nonce values. The DSN is baked into published images so telemetry is on by default. Disable per-run with --no-sentry. |
Who can see your results
You. Results pages require login. Only the account that uploaded the data can view the results.
YC admins. Employees with @ycombinator.com email addresses have admin access to all uploads.
Chat conversations. Stored server-side and tied to your upload. Same access rules apply.
Security
- HTTPS with HSTS enforced on all connections
- Secure, HttpOnly, SameSite cookies
- API tokens and magic link tokens SHA256-hashed at rest
- Rate limiting: 100 requests/hour per IP, 500 LLM calls/day per token
- Sensitive parameters filtered from server logs
- CORS restricted to application origin
- Anti-gaming: 5-layer verification system (HMAC nonces, score re-derivation, anomaly detection)
What we do not collect
- No source code files, working-tree snapshots, or git archive tarballs
- No per-commit diffs or patch content
- No raw transcript JSONL — we upload narratives and bounded-length session events, not the original conversation history
- No analytics services (no Google Analytics, Mixpanel, Segment, Hotjar)
- No tracking pixels or beacons
- No browser fingerprinting
- No cookies beyond the session cookie
- No third-party ad or marketing trackers
- localStorage is used client-side only (chat UI state)
Data retention and deletion
Upload data (scores, narratives, metrics) is stored indefinitely. When an upload is deleted, its associated data is cascade-deleted: projects, sessions, episodes, decisions, evidence chunks, chat conversations, and internal LLM calls.
LLM call payloads (prompts and responses) are stored on Postgres and are nulled out on a tiered schedule by an hourly job: proxy payloads (from client-pipeline LLM calls) are eligible for deletion 48 hours after successful verification, or 14 days if a submission never lands; internal system-call payloads are eligible at 30 days. Actual deletion runs once per hour, so the effective window is up to one additional hour. Payloads flagged for active fraud or legal investigation are retained until the hold expires or is explicitly released by an admin; a newly-banned API token holds its payloads for 180 days past the ban expiry set at the time of the ban.
Metadata rows (request id, HMAC nonce of the response, API token id, model, token counts, cost, timestamps, rejection reason) are retained indefinitely on Postgres for fraud review and anti-replay detection. Client IP addresses on proxy logs are retained for 90 days, then cleared — unless the row is under an active fraud, legal, or ban-linked hold, in which case the IP is preserved alongside the payload until the hold expires or is released.
Erasure. Proxy logs are scoped to your API token, not to an individual upload, so deleting an upload does not cascade to them. To erase your data ahead of the scheduled retention windows, email us at the address below. On request we immediately purge every LLM call's content — prompts, responses, system prompts, the stored client IP, and the cold-storage payloads — and delete your uploaded reports and your shared score projection. The content-free metadata described above (model, token counts, cost, timestamps) is retained: it no longer holds your content and it underpins our cost and abuse accounting. If you ask us to delete your account, we additionally anonymize the account itself — your email, profile, and git identity are removed and your API tokens are revoked — so nothing that identifies you remains.
Contact
Questions about privacy or data handling: oss@ycombinator.com
Example: what gets uploaded to YC
Anonymized example of the JSON payload sent from the Docker container. No source code, no file contents, no raw transcripts.
{
  "episode_scores": [
    {
      "title": "Authentication refactor",
      "scores": { "throughput": 7.5, "steering": 8.0, "eng_quality": 7.0, "product_thinking": 6.5, "planning": 7.0 },
      "confidence": "high"
    }
  ],
  "session_summaries": {
    "abc123": "Developer refactored auth middleware across 3 sessions. Started with a clear plan, tested edge cases, caught a regression early..."
  },
  "session_metadata": [
    { "session_id": "abc123", "started_at": "2026-04-15T09:12:03Z", "duration_minutes": 47, "project": "acme" }
  ],
  "git_remote": "git@github.com:acme/service.git",
  "recent_commits": [
    { "sha": "a1b2c3def4567890", "short": "a1b2c3d", "author": "Avery Builder", "email": "avery@acme.example", "date": "2026-04-15T09:12:03Z", "subject": "Refactor auth middleware" }
  ],
  "decisions": [
    { "title": "Chose JWT over session cookies", "rationale": "Rejected session cookies because the app is stateless. [identifier] handles refresh; [path] stores the decision log." }
  ],
  "git_metrics": {
    "velocity": { "loc_per_day": 3200, "commits_per_day": 12 },
    "total_commits": 47,
    "loc_stats": { "test_ratio": 1.2 }
  },
  "client_telemetry": {
    "pipeline_duration_s": 612,
    "llm_stats": { "total_calls": 84, "total_cost_cents": 31 },
    "system_info": { "ruby_version": "3.4.8", "rails_env": "production", "pid": 4321 }
  },
  "nonces": ["req_001:a1b2c3d4", "req_002:e5f6g7h8"]
}