
Source Code API Discovery

Levo's Source Code API Discovery catalogs every REST endpoint defined in a repository — handlers, paths, methods, path and query parameters, request and response types — and imports the result into Levo.ai. It runs entirely against a source checkout: the application is not executed, no traffic is captured, and no staging environment is required.

This is the fastest path to a complete API inventory for repositories you already govern in source control. Two modes are supported and can be combined on the same repository:

  • Source-code scan — statically analyzes application code and generates a new OpenAPI 3.0.1 specification.
  • Specification import — locates existing OpenAPI or Swagger specification files (openapi.yaml, openapi.yml, swagger.json, etc.) in the repository and imports them as-is, preserving the original OpenAPI or Swagger version.

How it works

  1. The Levo scanner runs locally as a container — on a developer workstation, a CI runner, or any host with Docker — against a repository checkout you control.
  2. Depending on the mode, the scanner prepares an API specification for upload in one of two ways:
    • Source-code scan: static analysis extracts route definitions, handler bindings, annotations, and parameter schemas directly from the source tree and generates an OpenAPI 3.0.1 specification. The application is never run, and no live traffic is involved.
    • Specification import: the scanner locates existing .yaml, .yml, or .json files identified as OpenAPI or Swagger specifications in the repository and uses them as-is. No static analysis is performed, and the specification's original format (OpenAPI 3.x or Swagger 2.0) is preserved.
  3. The specification is transmitted over TLS to your Levo tenant at https://api.levo.ai.
  4. Levo validates the specification server-side and imports it into the application and environment you specify. The application record is created automatically on first run.
  5. Endpoints become available in the Levo dashboard for API inventory, risk scoring, and security testing.

Data boundary

Source code never leaves the host executing the scanner. Only the generated or discovered API specification itself is uploaded to https://api.levo.ai.

Supported languages and frameworks

| Language | File markers | Notes |
|---|---|---|
| Java | .java, pom.xml, build.gradle | Requires compilation |
| JAR | .jar files | Pre-compiled Java archives |
| JavaScript | .js, package.json | |
| TypeScript | .ts, tsconfig.json | |
| Python | .py, requirements.txt, setup.py, pyproject.toml | Supports Python 3.x through 3.13 |
| C / C++ | .c, .cpp, .h | Includes .h headers and pre-processed .i files |
| PHP | .php, composer.json | Runtime PHP ≥ 7.4 required. Source compatibility: PHP 7.0 – 8.4 |
| Ruby | .rb, Gemfile | Runtime Ruby 4.0.x required. Source compatibility: Ruby 1.8 – 4.0.x |
| C# | .cs, .csproj | Supports C# code targeting .NET Framework 4.x through .NET 9. Pass --language csharp. Uses a Roslyn-based static analyzer bundled in the container image; no application build, runtime, or environment configuration is required. VB.NET and F# are not supported. |
| Android APK | .apk files | Requires the Android SDK. Set the ANDROID_HOME environment variable, or use the container image. |
| Scala | .scala, build.sbt | Work in progress |

Prerequisites

  • Docker is installed on the host and you can launch containers with outbound network connectivity.
  • https://api.levo.ai is reachable from the host that will run the scan.
  • Your CLI Authorization Key — get it from app.levo.ai/settings/keys.
  • Your Organization ID — get it from app.levo.ai/settings/organization.

For Indian users

Get your CLI authorization key from app.india-1.levo.ai/settings/keys and your organization ID from app.india-1.levo.ai/settings/organization instead.

Approach-specific prerequisites are listed under each approach below.

Integration approaches

Source Code API Discovery can be consumed in one of the following ways today. All approaches run the same underlying scanner (published as the levoai/code-scanner container image) and upload the specification to the same Levo tenant — a generated OpenAPI 3.0.1 spec for source-code scans, or the existing OpenAPI/Swagger spec as-is for specification imports. Additional integrations are on the roadmap.

| Approach | Best suited for |
|---|---|
| Docker | Interactive scans from a developer workstation, scheduled scans in any CI system (Jenkins, GitLab CI, Bitbucket Pipelines, CircleCI, Azure Pipelines, etc.), and ad-hoc runs inside a container. |
| GitHub Action | Declarative, automated scans triggered on every push or pull request for repositories hosted on GitHub. |
| Bulk Scanning Script | Generating OpenAPI specifications for every repository in a GitHub organization in one batch run. Useful for initial API inventory across an entire engineering org, security backfills, or onboarding a new tenant. |

Choose based on where your repositories live and how you want scans triggered. You can adopt multiple approaches in the same organization — for example, Docker for ad-hoc backfills and the GitHub Action for continuous coverage.


Approach 1: Docker

Run the scanner locally or in any CI system as a Docker container. No Levo CLI installation is required; everything the scanner needs ships inside the image.

Mode A: Scan source code and discover REST endpoints

Use this mode to reverse-engineer an OpenAPI specification from application source.

First, pull the latest image:

docker pull levoai/code-scanner:latest

Then navigate to the project directory you want to scan. Set --language to match your project's programming language (e.g., java, python, javascript, typescript, c, cpp, php, ruby, csharp, dotnet):

cd /path/to/your/project

docker run --rm \
-e LEVO_BASE_URL=https://api.levo.ai \
-v "$(pwd)":/workspace:rw \
levoai/code-scanner:latest \
--app-name "my-api" \
--env-name "staging" \
--language <your-project-language> \
--key <your-authorization-key>

Example: If your Java project is at /home/john/payment-service:

cd /home/john/payment-service

docker run --rm \
-e LEVO_BASE_URL=https://api.levo.ai \
-v "$(pwd)":/workspace:rw \
levoai/code-scanner:latest \
--app-name "payment-service" \
--env-name "staging" \
--language java \
--key <your-authorization-key>

On success, you'll see: "Application 'payment-service' has been created successfully!" — check your Levo dashboard.

For Indian users

Set -e LEVO_BASE_URL=https://api.india-1.levo.ai instead.

If you belong to multiple organizations, add the -o flag:

docker run --rm \
-e LEVO_BASE_URL=https://api.levo.ai \
-v "$(pwd)":/workspace:rw \
levoai/code-scanner:latest \
--app-name "my-api" \
--env-name "staging" \
--language <your-project-language> \
--key <your-authorization-key> \
-o <your-organization-id>

Flags Reference

| Flag / Env var | Required | Description |
|---|---|---|
| -e LEVO_BASE_URL | No | Levo SaaS API URL. Default: https://api.levo.ai. India: https://api.india-1.levo.ai. |
| -v "$(pwd)":/workspace:rw | Yes | Mounts your source code into the container. |
| --app-name | Yes | Application name on the Levo dashboard. Created automatically on first scan. |
| --env-name | Yes | Environment label (e.g., staging, production). |
| --language | Yes | Source language: java, python, javascript, typescript, c, cpp, php, ruby, csharp. |
| --key | Yes | Your CLI authorization key. |
| -o | No | Organization ID. Required only if you belong to multiple organizations. |
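
For scripted or CI use, the documented flags can be assembled programmatically. The following is a minimal Python sketch, not part of Levo's tooling: the function name build_scan_command and the choice of a LEVO_AUTH_KEY environment variable to hold the key are assumptions made for illustration. The Docker arguments themselves are the ones listed in the table above.

```python
import os
import shlex


def build_scan_command(app_name, language, env_name="staging",
                       workspace=".", org_id=None,
                       base_url="https://api.levo.ai",
                       image="levoai/code-scanner:latest"):
    """Return the `docker run` argument list for a source-code scan (Mode A)."""
    # Assumption: the CLI authorization key is kept in an environment variable
    # instead of being typed on the command line.
    key = os.environ["LEVO_AUTH_KEY"]
    argv = [
        "docker", "run", "--rm",
        "-e", f"LEVO_BASE_URL={base_url}",
        "-v", f"{os.path.abspath(workspace)}:/workspace:rw",
        image,
        "--app-name", app_name,
        "--env-name", env_name,
        "--language", language,
        "--key", key,
    ]
    if org_id:
        # Only needed when you belong to multiple organizations.
        argv += ["-o", org_id]
    return argv


if __name__ == "__main__":
    os.environ.setdefault("LEVO_AUTH_KEY", "<your-authorization-key>")
    cmd = build_scan_command("payment-service", "java")
    # Print the command for review; run it with subprocess.run(cmd, check=True).
    print(shlex.join(cmd))
```

Building the command as a list rather than a single string avoids shell-quoting problems when the project path or application name contains spaces.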

Mode B: Import existing OpenAPI / Swagger specifications

Use this mode when your repositories already contain hand-authored or generated OpenAPI specifications (.yaml, .yml, or .json) and you want to import them directly, without running static analysis.

cd /path/to/your/project

docker run --rm \
-e LEVO_BASE_URL=https://api.levo.ai \
-v "$(pwd)":/workspace:rw \
levoai/code-scanner:latest \
schema \
--dir . \
--env-name "staging" \
--key <your-authorization-key>

Notes:

  • The scanner recursively searches the mounted directory for OpenAPI/Swagger spec files; non-spec files are skipped.
  • The application name for each imported specification is taken from the info.title field of that specification.
  • Use --dir <relative-path> to scan only a sub-directory of the project.
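
How the scanner decides that a file is a specification is not documented here. As a rough illustration of what "identified as an OpenAPI or Swagger specification" can mean, the Python sketch below handles JSON files only: it treats a document as a spec when it has a top-level openapi or swagger key, and reads info.title as the application name. The function name describe_spec and the detection rule are assumptions, not Levo's implementation.

```python
import json


def describe_spec(text):
    """Return (format, title) if `text` looks like an OpenAPI/Swagger JSON
    document, otherwise None."""
    try:
        doc = json.loads(text)
    except ValueError:
        # Not JSON. A YAML parser would be needed for .yaml/.yml files.
        return None
    if not isinstance(doc, dict):
        return None
    if "openapi" in doc:
        fmt = f"OpenAPI {doc['openapi']}"
    elif "swagger" in doc:
        fmt = f"Swagger {doc['swagger']}"
    else:
        # Ordinary JSON (package.json, tsconfig.json, ...) is skipped.
        return None
    info = doc.get("info")
    title = info.get("title") if isinstance(info, dict) else None
    return fmt, title


if __name__ == "__main__":
    sample = '{"openapi": "3.0.1", "info": {"title": "Payments API"}, "paths": {}}'
    print(describe_spec(sample))
```

Because the application name comes from info.title, two specification files with the same title would map to the same application name, so it is worth checking titles before a first import.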

Approach 2: GitHub Action

Automate code scanning in your CI/CD pipeline. Every push to the configured branches triggers a scan, and discovered API endpoints are uploaded to the Levo dashboard automatically. The Action supports the same languages as the Docker approach (see Supported languages and frameworks).

Prerequisites

Setup

Step 1: Add secrets to your repository on GitHub.

Go to your repo → Settings → Secrets and variables → Actions → New repository secret and add the following:

| Secret name | Where to get it |
|---|---|
| LEVO_AUTH_KEY | app.levo.ai/settings/keys |
| LEVO_ORG_ID | app.levo.ai/settings/organization |

Step 2: Create a workflow file in your repository at .github/workflows/levo-code-scan.yml:

name: Levo Code Scan

on:
  push:
    branches: [main]

jobs:
  scan:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - name: Run Levo Code Scanner
        uses: levoai/actions/scan@main
        with:
          authorization-key: ${{ secrets.LEVO_AUTH_KEY }}
          organization-id: ${{ secrets.LEVO_ORG_ID }}
          app-name: "my-api"
          language: "java"
          env-name: "staging"

Replace my-api with your application name and java with your repository's programming language.

Action inputs

| Input | Required | Description |
|---|---|---|
| authorization-key | Yes | CLI authorization key. Get it from app.levo.ai/settings/keys. |
| organization-id | Yes | Your Levo organization ID. |
| app-name | Yes | Application name on the Levo dashboard. |
| language | Yes | Source language: java, python, javascript, typescript, c, cpp, php, ruby, csharp. |
| dir | No | Relative path within the repository to scan. Default: . (the repository root). Use a subdirectory path (for example, services/payments) to scan only part of a monorepo. |
| env-name | No | Environment label. Default: staging. |
| saas-url | No | Levo SaaS API URL. Default: https://api.levo.ai. India: https://api.india-1.levo.ai. |
| scanner-docker-image | No | Docker image override. Default: levoai/code-scanner:latest. |

Action environment variable

The Action sets the following environment variable for use by downstream steps in the same job:

| Variable | Description |
|---|---|
| scan-success | true if the scan succeeded, false otherwise. |

Approach 3: Bulk Scanning Script

Levo's Bulk Scanning Script generates OpenAPI specifications for every repository in a GitHub organization in a single run and imports them into your Levo dashboard. It is suited to security and platform teams that need a complete API catalog across an entire engineering organization, without instrumenting each repository individually.

Prerequisites

In addition to the page-level prerequisites, this approach requires:

  • A GitHub Personal Access Token (PAT) with repo read scope on the target organization's repositories.
  • git, curl, python3, and a Bash shell (Git Bash on Windows; native on macOS and Linux). python3 is used to pre-filter repositories by language via the GitHub API.
  • Docker Desktop configured with at least 4 CPUs and 8 GB of memory (Settings → Resources → Advanced).

macOS users

The script uses timeout, which is not available on macOS by default. Install GNU coreutils via Homebrew:

  1. Install Xcode Command Line Tools:
    xcode-select --install
  2. Install Homebrew:
    /bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
  3. Install GNU coreutils:
    brew install coreutils

macOS uses gtimeout (from coreutils) instead of timeout. The script handles this automatically once coreutils is installed.

How it works

  1. The script enumerates every repository in the configured GitHub organization using the GitHub API.
  2. Repositories written in unsupported languages are filtered out automatically and excluded from the scan, without being downloaded.
  3. Each remaining repository is cloned locally (latest commit only) and analyzed by the Levo scanner container.
  4. Multiple repositories are processed in parallel — a configurable number of containers run concurrently, each with bounded CPU and memory limits.
  5. After each scan completes, the cloned source is removed from the host and the generated OpenAPI specification is uploaded to your Levo dashboard over TLS.
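
Step 2 can be pictured with the primary language that GitHub's REST API reports for each repository (the language field of a repository object). The sketch below is a hedged Python illustration, not the logic of bulk-scan.sh: the mapping from GitHub language names to --language values and the function name select_scannable are assumptions.

```python
# Assumed mapping from GitHub's primary-language names to the scanner's
# --language values; the real script may use a different table.
GITHUB_TO_SCANNER = {
    "Java": "java",
    "Python": "python",
    "JavaScript": "javascript",
    "TypeScript": "typescript",
    "C": "c",
    "C++": "cpp",
    "PHP": "php",
    "Ruby": "ruby",
    "C#": "csharp",
}


def select_scannable(repos):
    """Split repository records into (scannable, skipped).

    `repos` is a list of dicts shaped like GitHub repository objects, of
    which only the "name" and "language" fields are used here.
    """
    scannable, skipped = [], []
    for repo in repos:
        # "language" is None for empty repositories, which maps to no scanner.
        scanner_lang = GITHUB_TO_SCANNER.get(repo.get("language"))
        if scanner_lang is None:
            skipped.append(repo["name"])  # never cloned
        else:
            scannable.append((repo["name"], scanner_lang))
    return scannable, skipped


if __name__ == "__main__":
    sample = [
        {"name": "payments", "language": "Java"},
        {"name": "infra", "language": "HCL"},
        {"name": "empty-repo", "language": None},
    ]
    print(select_scannable(sample))
```

Filtering on metadata before cloning is what lets unsupported repositories be excluded without being downloaded.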

Setup

Step 1 — Download the script

Click to download: bulk-scan.sh

Or download from your terminal:

curl -O https://docs.levo.ai/artifacts/code-scanner/bulk-scan.sh

Step 2 — Create a working folder

On your machine, create a new folder to hold the script and its output. We recommend naming the folder levo so all Levo-related files stay together. The folder can live anywhere — your home directory, your Desktop, or C:\ — as long as it has at least a few GB of free disk space (repositories are cloned here temporarily and removed after each scan).

Place the downloaded bulk-scan.sh inside this folder. The folder should look like:

levo/
└── bulk-scan.sh

The script will automatically create repos/ and results/ subfolders inside levo/ when it runs.

Step 3 — Fill in your credentials

Open bulk-scan.sh in any text editor (VS Code, Notepad, TextEdit, etc.). Near the top of the file you'll see a configuration block. Fill in the required values below, and set LEVO_ORG_ID only if you belong to multiple Levo organizations:

GITHUB_PAT=""     # Your GitHub Personal Access Token (classic: `repo`, and possibly `read:org`; fine-grained: read access to repository contents/metadata)
GITHUB_ORG=""     # Your GitHub organization name (e.g., my-company)
LEVO_AUTH_KEY=""  # Get yours from https://app.levo.ai/settings/keys
LEVO_ORG_ID=""    # Optional; leave empty if you only belong to one Levo organization

Save and close the file.

Step 4 — Run the script from inside the levo folder

Linux / macOS:

chmod +x bulk-scan.sh
./bulk-scan.sh

Windows (Git Bash):

bash bulk-scan.sh

Windows (PowerShell or Command Prompt, with Git for Windows installed):

bash bulk-scan.sh

Windows shells

Git for Windows ships a Bash interpreter and adds it to PATH, so the script runs from any Windows shell — Git Bash, PowerShell, or Command Prompt — without modification.

Configuration variables

| Variable | Required | Description |
|---|---|---|
| GITHUB_PAT | Yes | GitHub Personal Access Token with repo read scope. |
| GITHUB_ORG | Yes | The GitHub organization name. |
| LEVO_AUTH_KEY | Yes | Levo CLI authorization key. Get it from app.levo.ai/settings/keys. |
| LEVO_ORG_ID | No | Levo organization ID. Required only if you belong to multiple Levo organizations. |
| LEVO_BASE_URL | No | Levo SaaS API URL. Default: https://api.levo.ai. India: https://api.india-1.levo.ai. |
| ENV_NAME | No | Environment label shown on the dashboard. Default: staging. |
| SCANNER_IMAGE | No | Scanner container image. Default: levoai/code-scanner:3e0aa82 (pinned for reproducibility). |
| MAX_PARALLEL | No | Number of repositories scanned concurrently. Default: 4. |
| CONTAINER_MEM | No | Memory cap per container. Default: 2g. |
| CONTAINER_CPUS | No | CPU cap per container. Default: 1. |

Scaling parallelism

The script runs MAX_PARALLEL containers concurrently. Each container is hard-capped at CONTAINER_CPUS CPUs and CONTAINER_MEM memory. The defaults are tuned for Docker Desktop with 4 CPUs / 8 GB RAM.

When tuning, honor these two rules:

  1. MAX_PARALLEL × CONTAINER_CPUS ≤ Docker Desktop CPUs
  2. MAX_PARALLEL × CONTAINER_MEM ≤ Docker Desktop memory

| Docker Desktop allocation | Recommended values | Behavior |
|---|---|---|
| 4 CPUs / 8 GB (default) | MAX_PARALLEL=4, CONTAINER_CPUS=1, CONTAINER_MEM=2g | Baseline |
| 8 CPUs / 16 GB | MAX_PARALLEL=8, CONTAINER_CPUS=1, CONTAINER_MEM=2g | ~2× throughput |
| 16 CPUs / 32 GB | MAX_PARALLEL=8, CONTAINER_CPUS=2, CONTAINER_MEM=4g | Faster per-scan, helpful for large repositories |

To bump Docker Desktop's allocation: Docker Desktop → Settings → Resources → Advanced → CPUs / Memory.
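
The two tuning rules can be checked before a run. The following is a small Python sketch under stated assumptions: the helper names parse_mem_gb and fits are hypothetical, and memory values are assumed to use the Docker-style g or m suffix shown in the defaults.

```python
def parse_mem_gb(value):
    """Convert a Docker-style memory string such as '2g' or '512m' to GB."""
    units = {"g": 1.0, "m": 1.0 / 1024}
    return float(value[:-1]) * units[value[-1].lower()]


def fits(max_parallel, container_cpus, container_mem, docker_cpus, docker_mem_gb):
    """Return True when both rules hold for the given Docker Desktop allocation."""
    # Rule 1: MAX_PARALLEL x CONTAINER_CPUS <= Docker Desktop CPUs
    cpu_ok = max_parallel * container_cpus <= docker_cpus
    # Rule 2: MAX_PARALLEL x CONTAINER_MEM <= Docker Desktop memory
    mem_ok = max_parallel * parse_mem_gb(container_mem) <= docker_mem_gb
    return cpu_ok and mem_ok


if __name__ == "__main__":
    # Defaults on a 4 CPU / 8 GB allocation: 4 x 1 <= 4 and 4 x 2 GB <= 8 GB.
    print(fits(4, 1, "2g", docker_cpus=4, docker_mem_gb=8))
    # Doubling parallelism on the same allocation breaks both rules.
    print(fits(8, 1, "2g", docker_cpus=4, docker_mem_gb=8))
```

Each row of the table above sits exactly at the limit of both rules, so raising MAX_PARALLEL further requires raising the Docker Desktop allocation first.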

Estimated scan time

Per repository (single scan, with the default CONTAINER_CPUS=1):

| Repository profile | Typical scan time |
|---|---|
| Small (simple Python / JS / Ruby service, fewer than ~20 endpoints) | 1–2 min |
| Medium (Java Spring Boot, TypeScript / Express, ~20–80 endpoints) | 3–6 min |
| Large (monorepo, many modules, deep dependency graph) | 10–20 min |

Large repositories

Repositories whose scan exceeds 30 minutes are automatically skipped to keep the bulk run moving and are reported in the final summary as timed out. To capture endpoints from these large repositories, scan them individually using Approach 1: Docker, which has no time limit and can handle very large monorepos.
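
The script relies on timeout (gtimeout on macOS) to enforce this limit. The same idea can be sketched in Python for readers who wrap scans in their own tooling; run_with_limit is a hypothetical helper, and the three status strings simply mirror the categories named in the final summary.

```python
import subprocess
import sys


def run_with_limit(argv, limit_seconds):
    """Run a command and classify it as 'ok', 'failed', or 'timed out'."""
    try:
        # subprocess.run kills the child process when the timeout expires.
        result = subprocess.run(argv, timeout=limit_seconds)
    except subprocess.TimeoutExpired:
        return "timed out"
    return "ok" if result.returncode == 0 else "failed"


if __name__ == "__main__":
    # Stand-in for a scan that overruns its budget. A real call would pass
    # the scanner's `docker run` argument list and a limit of 30 * 60 seconds.
    slow = [sys.executable, "-c", "import time; time.sleep(5)"]
    print(run_with_limit(slow, limit_seconds=0.5))
```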

Approximate wall-clock time for a 150-repository organization (assuming ~70% scannable after the language pre-filter, and an average ~4 minutes per scan):

| MAX_PARALLEL | Approximate total time |
|---|---|
| 1 (serial) | ~7 hours |
| 4 (default) | ~1.5–2.5 hours |
| 8 | ~50–80 minutes |

Actual time depends on the mix of repository sizes, network speed, and disk I/O.
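
The figures in the table follow from simple arithmetic: 150 repositories × 70% ≈ 105 scans, at about 4 minutes each. The Python sketch below reproduces that estimate under an idealized assumption that every scan takes the average time and that containers run in full batches; the function name estimate_minutes is hypothetical, and real runs will land somewhat above these lower bounds.

```python
import math


def estimate_minutes(total_repos, scannable_fraction, avg_minutes, max_parallel):
    """Idealized wall-clock estimate: full batches of equally long scans."""
    scannable = round(total_repos * scannable_fraction)
    batches = math.ceil(scannable / max_parallel)
    return batches * avg_minutes


if __name__ == "__main__":
    for parallel in (1, 4, 8):
        minutes = estimate_minutes(150, 0.70, 4, parallel)
        print(f"MAX_PARALLEL={parallel}: about {minutes} min ({minutes / 60:.1f} h)")
```

With these inputs the estimate is 420 minutes (7 hours) serially, 108 minutes at MAX_PARALLEL=4, and 56 minutes at MAX_PARALLEL=8, which matches the ranges in the table.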

Output

On the dashboard: every successfully scanned repository becomes an application in your Levo tenant (the one behind LEVO_BASE_URL, https://api.levo.ai by default), with the generated OpenAPI specification imported and ready for inventory, risk scoring, and security testing.

On the terminal: a per-repository card is printed as each scan finishes, followed by a final summary reporting the number of repositories scanned successfully, failed, skipped (unsupported language or not accessible), and timed out, along with the total duration.

Limitations

The bulk scanning script supports source-code mode only. To import existing OpenAPI/Swagger specifications, use Approach 1 — Mode B.
