How it works
One code generator handles both Bot API types and Mini App types. It scrapes two official Telegram HTML pages and produces TypeScript types and Markdown docs from each.
Shared Pipeline
Both pages go through the same pipeline:
core.telegram.org/bots/api core.telegram.org/bots/webapps | | fetch & cache fetch & cache (input/api.html) (input/webapp.html) | | v v DocPage WebAppPage | | ┌──────┴──────┐ | v v v ExtractedType ExtractedMethod ExtractedWebApp | | | | v v v v types.ts api.ts webapp.ts (types) | | └──────┬──────┘ v docs/ (markdown)- Fetch —
PageProviderServicedownloads HTML fromcore.telegram.organd caches it locally ininput/. - Parse — HTML is parsed into a DOM tree using
node-html-parser. Each entity is located by its<a class="anchor">tag. - Extract — the extraction pipeline walks the DOM and deterministically converts HTML into typed structures.
- Generate TypeScript —
ts-morphemits.tsfiles. - Generate Markdown — extracted data is converted into browsable
.mdfiles with cross-linked types.
# generate everythingpnpm gen
# or individuallypnpm gen:bot:apipnpm gen:webapp:apiBot API Extraction
The Bot API documentation (core.telegram.org/bots/api) is a single long HTML page with a regular structure. The extractor (DocPage) exploits this regularity.
Page Layout
<h3>Getting updates</h3> <!-- group name --> <h4><a class="anchor" name="getUpdates"/>getUpdates</h4> <p>...</p> <!-- description --> <table>...</table> <!-- field definitions --> <h4><a class="anchor" name="update"/>Update</h4> <p>...</p> <table>...</table><h3>sets the current section group (e.g. “Getting updates”, “Available types”)<h4>is an individual entity — a type or a method
Type vs Method
There is no HTML attribute that distinguishes a type from a method. Telegram consistently uses:
- Uppercase first letter → Type (e.g.
User,Chat,Message) - Lowercase first letter → Method (e.g.
sendMessage,getUpdates)
Entity Extraction
Starting from each <h4>, the extractor walks siblings to collect:
<p>paragraphs — description and (for methods) the return type sentence<table>— field/parameter definitions<ul>— union type members →ChatMemberOwner | ChatMemberAdministrator | ...
Return type is detected by patterns like "On success", "Returns ", or "is returned".
4-column tables (methods) have an explicit Required column. 3-column tables (types) mark optional fields with "Optional." in the description.
Type Mapping
Telegram pseudo-types are mapped to TypeScript:
| Telegram | TypeScript |
|---|---|
String | string |
Integer, Int | number |
Float | number |
Boolean, True, False | boolean |
Array of X | X[] |
X or Y | X | Y |
InputFile | { file_content: Uint8Array, file_name: string } |
Enum Extraction
Some String fields represent enums. Descriptions with hints like "one of", "must be", "always" trigger string literal union extraction:
emoji: "dice" | "darts" | "bowling" | "basketball" | "football" | "slot_machine"Mini App Extraction
The Mini Apps documentation (core.telegram.org/bots/webapps) uses the same HTML structure with <h4> headings, but has key differences that WebAppPage handles.
The Function Pseudo-Type
The Mini Apps docs use Function as a type for callable fields. The extractor detects this and converts fields ending with () into function signatures.
Manual Overrides
Since the Mini Apps docs don’t specify parameter types for most functions, webapp methods rely heavily on manual type overrides in typeOverrides to produce correct TypeScript signatures.
Override Mechanisms
Both Bot API and Mini App extraction support three override mechanisms for cases that can’t be inferred from the HTML:
typeOverrides— per-entity, per-field type replacementstypeAliasOverrides— for types that have no<table>or<ul>(e.g.InputFile)returnTypeOverrides— for methods whose return type is ambiguous (e.g.sendMediaGroup→Message[])
Module Map
codegen/├── main.ts entry point├── runtime.ts Effect runtimes (DI wiring)├── types.ts shared type aliases│├── scrape/│ ├── type-system.ts type representation & conversion│ ├── entity.ts single-entity extraction from HTML│ ├── page.ts DocPage & WebAppPage — HTML page models│ └── entities.ts batch extraction — walks full page│└── service/ ├── page-provider.ts fetches & caches HTML ├── code-writers.ts ts-morph TypeScript generation └── markdown.ts markdown docs generation