Hi,
This was very interesting. Thank you for taking the time to check it and for the feedback. We talked about this several years ago, and I was lukewarm about attempting it back then because I expected it would require a lot of packaging and maintenance of the Spring dependencies. The babble bot confirms that when it says:
This is a **multi-year effort**. Each package needs proper metadata, patches, and maintenance. Spring Boot alone has dozens of modules that would need individual packaging.
I am not sure it is a **multi-year effort**, but I do think it is a rabbit hole with a lot one would have to get familiar with (as the babble bot says). Otherwise I agree with what the babble bot says about removing dependencies such as the queue, if there is no use for a queue when running locally.
It might also be interesting to see whether the babble bot can upgrade the tjenestegrensesnitt (service interface) from 5.5 to 5.6, and whether it manages to find ambiguities in the description of the service interface. Sometimes an extra set of eyes is useful, and the babble bots are good at that.
Thomas
________________________________
From: Petter Reinholdtsen pere@hungry.com
Sent: Thursday, 28 May 2026 10:29
To: nikita-noark@nuug.no
Subject: Nikita in Debian, or deserted-island support for Nikita...

I have been playing a bit with artificial idiocy lately, and as an experiment I gave it the task of analysing the Nikita code base to see what is needed to get Nikita in as an official Debian package, which in practice means what is needed to build Nikita without an Internet connection. Here is what it came up with. As expected, quite a lot has to be in place before we get there, but I thought it would be good if more people were familiar with the details of the challenge.
# Deserted Island Build Proposal for nikita-noark5-core
## Goal
Enable building and running the project with **zero Internet access** --- only a local Debian mirror (Testing) on disk and the source repository. The "deserted island" test: can you get this up and running without any network connection?
------------------------------------------------------------------------
## Current Situation
### What Gets Downloaded Today
| When | What | From | Count/Size |
|------|------|------|------------|
| **Build** (`make build`) | ~450 JAR artifacts (Spring Boot BOM + transitive deps) | Maven Central | ~200 MB |
| **Build** (`make check`) | Additional test-scoped dependencies (Mockito, JUnit engine, etc.) | Maven Central | ~100 MB |
| **Build** (plugins) | `antlr4-maven-plugin`, `asciidoctor-maven-plugin`, `spring-boot-maven-plugin`, `maven-surefire-plugin` + transitive deps | Maven Central | ~50 MB |
| **Test setup** (`keycloak-setup-start.sh`) | Keycloak 26.0.6 binary tarball | GitHub Releases | ~130 MB |
| **Runtime (Tika)** | Language detection model (`langdetect` models) | Maven Central / CDN | ~5 MB --- may download on first use if not bundled |
The `maven-repo/` directory exists in the working tree but is **not committed to git** (untracked files), so every fresh clone downloads everything from scratch.
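The counts and sizes in the table can be checked against an actual working tree. A minimal sketch, assuming the local repository sits in `maven-repo/`; the function name `repo_summary` is made up for illustration and is not part of the project:

```bash
#!/usr/bin/env bash
# Summarise a local Maven repository: number of JAR and POM files
# and total size on disk in kilobytes.
# Assumption: the repository is ./maven-repo unless a path is given.
repo_summary() {
    local repo="${1:-maven-repo}"
    if [ ! -d "$repo" ]; then
        echo "no such directory: $repo" >&2
        return 1
    fi
    local jars poms size
    jars=$(find "$repo" -type f -name '*.jar' | wc -l)
    poms=$(find "$repo" -type f -name '*.pom' | wc -l)
    size=$(du -sk "$repo" | cut -f1)
    # $(( )) strips any padding that wc adds on some platforms.
    echo "jars=$((jars)) poms=$((poms)) size_kb=$((size))"
}
```

Running `repo_summary maven-repo` after an online build gives numbers to compare with the estimates listed for `make build`.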
### Why Debian Packages Alone Aren't Enough
Spring Boot 3.4.5 and its entire ecosystem (spring-boot-starter-web, -data-jpa, -security, -oauth2-resource-server, -amqp, etc.) are **not packaged for Debian**. Many individual libraries *are* available as `lib*-java` packages in Debian Testing:
| Dependency | Available as Debian pkg? | Package name |
|------------|--------------------------|--------------|
| ANTLR 4 runtime | ✅ Yes | `libantlr4-runtime-java` |
| commons-lang3 | ✅ Yes | `libcommons-lang3-java` |
| Guava | ✅ Yes | `libguava-java` (32.0.1) |
| H2 database | ✅ Yes | `libh2-java` |
| PostgreSQL JDBC | ✅ Yes | `libpostgresql-jdbc-java` |
| Joda-Time | ✅ Yes | `libjoda-time-java` |
| JAXB runtime | ✅ Yes | `libjaxb-java` |
| ByteBuddy | ✅ Yes | `libbyte-buddy-java` |
| Reflections | ✅ Yes | `libreflections-java` |
| RabbitMQ client | ✅ Yes | `librabbitmq-client-java` |
| JSON (org.json) | ⚠ Partially | `libjson-java` is a different library (gson-based, not org.json) |
| Apache POI | ❌ No | Not packaged in Debian |
| **Spring Boot** | ❌ No | Not packaged at all |
| Spring Security | ❌ No | Not packaged at all |
| springdoc-openapi | ❌ No | Not packaged at all |
| Tika parsers | ❌ No | Not packaged for Debian |
| AsciiDoctor (Ruby CLI) | ✅ Yes | `asciidoctor` package provides the Ruby-based CLI tool |
| AsciiDoctorJ (Maven plugin bridge) | ❌ No | The JVM wrapper (`asciidoctor-maven-plugin`) is not in Debian. Can be replaced by calling the Debian `asciidoctor` CLI from Makefile, or skipped entirely since docs are cosmetic. |
**Bottom line**: Replacing Maven with Debian packages alone is **not feasible** because Spring Boot, POI, springdoc-openapi, and Tika have no Debian equivalents. The only realistic path is to make Maven work offline.
------------------------------------------------------------------------
## Proposal: Three-Part Approach
### Part 1 --- Build-Time Offline (Maven)
#### Option A: Commit `maven-repo/` to git (Recommended for simplicity)
**Changes needed:**

1. Commit the contents of `maven-repo/` (`*.jar`, `*.pom`, `*.sha1`). No `.gitattributes` entry is needed for this, since `export-ignore` only controls what `git archive` exports. Leave out `*.lastUpdated` files, which are Maven's markers for attempted downloads rather than artifacts.
2. Alternatively, use a sparse checkout or git-lfs for large binaries.
3. Add an offline guard to the Makefile:

```makefile
# In Makefile, add:
MVNOPTS := -Dmaven.repo.local=$(CURDIR)/maven-repo --offline

# Optional: fail-fast if network would be needed:
.PHONY: verify-offline
verify-offline:
	@echo "Verifying offline build capability..."
	$(MVN) $(MVNOPTS) dependency:resolve -DskipTests || \
	  (echo "ERROR: Missing dependencies. Run 'make populate-maven-repo' online first."; exit 1)

build: verify-offline
	$(MVN) $(MVNOPTS) clean validate install
```
4. Provide a one-time online bootstrap target for maintainers to keep `maven-repo/` current:
```makefile
.PHONY: populate-maven-repo
populate-maven-repo:
	@echo "Populating local Maven repo (requires Internet)..."
	$(MVN) -Dmaven.repo.local=$(CURDIR)/maven-repo dependency:go-offline
	$(MVN) -Dmaven.repo.local=$(CURDIR)/maven-repo dependency:resolve-plugins
```

**Pros**: Simple, works immediately, no build-system changes needed.

**Cons**: Bloated git repository (~300 MB of JARs). Consider git-lfs or a separate tarball artifact instead.
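The separate-tarball alternative could look roughly like this. It is a sketch only: the function name `bundle_maven_repo` and the default file name `maven-repo-bundle.tar.gz` are assumptions, and leaving out Maven's bookkeeping files (`*.lastUpdated`, `_remote.repositories`) is a common precaution rather than something the project is known to require.

```bash
#!/usr/bin/env bash
# Pack a populated local Maven repository into a tarball, leaving out
# Maven's bookkeeping files:
#   *.lastUpdated         markers for attempted or failed downloads
#   _remote.repositories  record of which remote repo an artifact came from
bundle_maven_repo() {
    local repo="${1:-maven-repo}"
    local out="${2:-maven-repo-bundle.tar.gz}"
    if [ ! -d "$repo" ]; then
        echo "no such directory: $repo" >&2
        return 1
    fi
    # -C makes the archive paths start at the repository directory name,
    # so unpacking in a fresh clone recreates maven-repo/ in place.
    tar -czf "$out" \
        --exclude='*.lastUpdated' \
        --exclude='_remote.repositories' \
        -C "$(dirname "$repo")" "$(basename "$repo")"
}
```

A maintainer would run `bundle_maven_repo maven-repo` once after an online build and publish the resulting file next to the source release.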
#### Option B: Debian policy-compliant approach (Recommended for packaging)
For proper Debian packaging (`dpkg-buildpackage`), the standard approach is:
1. **List all upstream VCS artifacts in `debian/watch`** and use `uscan/udeb` or manually manage them in `debian/source/include-binaries`.
2. **Download all JARs during package build** from Maven Central using the `debian/rules` target, with checksums pinned in `debian/control` or a separate file.
3. Use **Maven's offline mode** (`--offline`) pointing at a pre-populated local repo that was assembled during the online phase of the Debian build.

Debian Java Policy recommends:

- Each upstream JAR dependency should be either (a) packaged in Debian as `lib*-java`, or (b) downloaded and built from source within the package build process.
- For option (b), use `download-maven-poms` helper script or similar to fetch artifacts.
However, since Spring Boot is not in Debian, **this project cannot be packaged purely from Debian packages**. The practical approach:
```makefile
# debian/rules snippet:
%:
	dh $@ --with javahelper,maven

override_dh_auto_build:
	# Ensure offline build
	dh_auto_build -- --offline -Dmaven.repo.local=$(CURDIR)/deps-maven-repo
```
With `debian/control` Build-Depends including all available Debian packages:
    Build-Depends: debhelper-compat (= 13),
                   dh-buildupdate,
                   default-jdk (>= 17),
                   maven-debian-helper,
                   maven-repo-helper,
                   libantlr4-runtime-java,
                   libcommons-lang3-java,
                   libguava-java,
                   libh2-java,
                   libpostgresql-jdbc-java,
                   libjoda-time-java,
                   libjaxb-java,
                   libbyte-buddy-java,
                   librabbitmq-client-java,
                   libreflections-java
Then supplement with direct downloads for non-packaged dependencies (Spring Boot BOM, POI, Tika, springdoc-openapi). The `maven-repo-helper` tools can download these from Maven Central during the online build phase.
### Part 2 --- Test-Time Offline (Keycloak)
The file `scripts/keycloak-setup-start.sh` downloads Keycloak from GitHub:
```bash
wget https://github.com/keycloak/keycloak/releases/download/$
```
**Fixes:**
1. **Bundle Keycloak tarball in the repo** (or ship as a separate artifact):
   - Download `keycloak-26.0.6.tar.gz` and place it in `scripts/keycloak/`.
   - Modify script to check for local file first:

```bash
if [ -f "scripts/keycloak/keycloak-${ver}.tar.gz" ]; then
    echo "Using bundled Keycloak tarball."
else
    wget https://github.com/keycloak/keycloak/releases/download/$
fi
```
2. **Alternatively**, package Keycloak from Debian: `libkeycloak-admin-rest-client-java` exists, but the full Keycloak server is not packaged. The bundled tarball approach is simpler.
3. **Or skip Keycloak** for unit tests entirely --- run only `make check` (which runs JUnit tests that don't need Keycloak) and document that integration tests require the online setup:
```makefile
check-offline:
	$(MVN) $(MVNOPTS) test -Dtest='!*IntegrationTest,!*IT'
```
### Part 3 --- Runtime Offline (Tika Language Detection)
Apache Tika's `tika-langdetect` module downloads language detection models on first use. This happens at runtime, not build time.
**Fix:** Set the system property to prevent online download:
```java
// In application configuration or startup code:
System.setProperty("org.apache.tika.language.detect.model", "/path/to/local/model.jar");
```
Or exclude the language detection module from Tika's dependency tree if it's not needed. Check `FileHandlingService.java` and `FileUtilsService.java` --- they use `Tika` for MIME detection and text extraction, which works fine without language detection models. The models are only needed for `LanguageDetector`.
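Whether language detection is used at all can be checked with a source search before touching the dependency tree. A rough sketch, assuming that any use would show up as the class name `LanguageDetector` or as an import from the `org.apache.tika.langdetect` or `org.apache.tika.language` packages; `find_langdetect_usage` is a hypothetical helper name:

```bash
#!/usr/bin/env bash
# List Java source files that reference Tika's language detection API.
# Exit status is non-zero when nothing matches (plain grep behaviour),
# which here means the detection models are not needed by the code.
find_langdetect_usage() {
    local src="${1:-src}"
    grep -rlE --include='*.java' \
        'LanguageDetector|org\.apache\.tika\.(langdetect|language)' "$src"
}
```

If `find_langdetect_usage src` prints nothing, the exclusion route is the safer of the two fixes, since it does not rely on any runtime property.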
**In pom.xml**, add exclusions if language detection is not used:
```xml
<dependency>
    <groupId>org.apache.tika</groupId>
    <artifactId>tika-parsers-standard-package</artifactId>
    <exclusions>
        <exclusion>
            <groupId>com.github.pemistahl</groupId>
            <artifactId>lingua-language-detector</artifactId>
        </exclusion>
    </exclusions>
</dependency>
```
------------------------------------------------------------------------
## Proposed Debian Packages to Install
### Required Build Dependencies
    default-jdk (>= 21)       # Java 21+ compiler and runtime (see JDK note below)
    maven                     # Maven build tool
    libantlr4-runtime-java    # ANTLR runtime (already in deps, but available as pkg)
    git                       # Version control (for git-commit-id plugin if used)
**JDK version bump required:** `pom.xml` currently sets `<java.version>17</java.version>`. JDK 17 has been removed from Debian Testing --- only JDK 21 (`openjdk-21-jdk`) and JDK 25 (`openjdk-25-jdk`) remain. The fix is straightforward:
```xml
<!-- In pom.xml, change: -->
<properties>
    <java.version>21</java.version>
</properties>
```
Spring Boot 3.x supports JDK 17 through 21+ and the codebase doesn't use any Java 17-specific APIs that would break on 21. This is a one-line change in `pom.xml` and updates to `.gitlab-ci.yml` (which references `openjdk-17-jdk`).
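Before making the change, it helps to list every place that still pins JDK 17. A small sketch, assuming the version is declared as a `<java.version>` property on a single line and that CI files name the package as `openjdk-17-...`; both helper names are invented for this example:

```bash
#!/usr/bin/env bash
# Print the <java.version> value declared in a pom.xml (first match only).
pom_java_version() {
    sed -n 's:.*<java\.version>\([^<]*\)</java\.version>.*:\1:p' "${1:-pom.xml}" | head -n 1
}

# List lines (file:line:text) in the given files that mention JDK 17 packages.
# Returns non-zero when there are no such references left.
find_jdk17_refs() {
    grep -nH 'openjdk-17' "$@"
}
```

For example, `pom_java_version pom.xml` followed by `find_jdk17_refs .gitlab-ci.yml` shows what remains to be updated.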
### Required Runtime Dependencies
    default-jre           # Java runtime
    postgresql            # Database backend for production use
    rabbitmq-server       # Message queue (if using AMQP integration profile)
    keycloak              # Auth provider (NOT in Debian; bundle or skip)
    tesseract-ocr-nor     # OCR language data (used by Tika, from .gitlab-ci.yml)
    unoconv               # Document conversion (from .gitlab-ci.yml)
    libreoffice-core      # For document format conversion via unoconv
    python3               # Test scripts use Python 3
    curl                  # Health checks and API testing
    jq                    # JSON processing in test scripts
### Optional / Nice to Have
    asciidoctor    # For documentation generation (alternative to maven plugin)
    libh2-java     # Embedded database for demo/testing mode
------------------------------------------------------------------------
## Proposed Dependencies That Could Be Dropped
| Dependency | Used For | Can Drop? | Notes |
|------------|----------|-----------|-------|
| `spring-boot-starter-amqp` | RabbitMQ integration | **Yes, conditionally** | Only needed if mail queue integration is used. Profile-gated via `application-queueintegration.yml`. |
| `spring-boot-starter-validation` | Bean validation (`@NotNull`) | No | Core functionality depends on it. |
| `springdoc-openapi-*` | Swagger/OpenAPI UI docs | **Yes** | Cosmetic/documentation only. Can be excluded for minimal build. |
| `asciidoctor-maven-plugin` | API documentation generation | **Yes** | Only generates HTML docs during package phase, not needed to run the app. |
| `spring-restdocs-*` (test scope) | REST API documentation tests | **Yes** | Test-time doc generation only. |
| `junit-vintage-engine` (test scope) | JUnit 3/4 compatibility in tests | Maybe | Only if all tests migrate to JUnit 5. |
| `spring-boot-starter-webflux` (test scope) | Reactive web client for tests | Maybe | Depends on which tests use it. |
| Tika language detection models | Language identification of documents | **Yes** | MIME type detection works without it. Exclude from classpath or set offline mode property. |
------------------------------------------------------------------------
## Summary: Concrete Steps to Achieve "Deserted Island" Build
### Immediate (low effort, high impact)
1. **Add `--offline` flag to Maven in Makefile**:
```makefile
MVNOPTS := -Dmaven.repo.local=$(CURDIR)/maven-repo --offline
```
2. **Commit `maven-repo/` contents** (or create a tarball artifact):
```bash
mvn dependency:go-offline -Dmaven.repo.local=$(pwd)/maven-repo
mvn dependency:resolve-plugins -Dmaven.repo.local=$(pwd)/maven-repo
# Then commit or tar the directory
```
3. **Bundle Keycloak** in `scripts/keycloak/` and update `keycloak-setup-start.sh`.
4. **Guard against Tika model download** via system property: Add to `application.yml`:
```yaml
spring:
  main:
    add-application-context-initializer: true
---
# Or set JVM flag: -Dorg.apache.tika.language.detect.model=none
```
### Medium-term (proper Debian packaging)
5. **Create proper `debian/` directory** with:
   - `debian/control` listing all Build-Depends
   - `debian/rules` using `dh-sequence-maven` or manual Maven invocation with offline mode
   - `debian/source/include-binaries` for bundled JARs that can't be Debian-packaged
6. **Document the offline build process** in a new `docs/general/OfflineBuild.md`.
### Long-term: Get Into Debian Main (reduce external dependency count)
This section addresses the fundamental conflict between Maven's "download JARs" model and Debian's requirement that all code in `main` be built from source available within Debian.
#### The Problem: Bundled Binary JARs vs DFSG Compliance
Debian Free Software Guidelines (DFSG) and [Java Policy](https://www.debian.org/doc/packaging-manuals/policies/java-policy/) require:

1. **All code must be available in source form** --- binary-only JAR blobs are not acceptable for `main`
2. **Build dependencies must themselves be Debian packages** --- downloading from Maven Central during build is only acceptable for `non-free`
3. **Each upstream component should be packaged separately** as its own `lib*-java` package

Currently, ~450 JARs are downloaded from Maven Central. Even if their licenses are DFSG-compatible (Apache 2.0, MIT, LGPL), they violate the "buildable from source" requirement because:

- They're pre-built binaries shipped alongside our source
- Their build chain is external to Debian and not reproducible within the archive
#### Three Paths Forward
**Path A: Package Each Dependency Individually (Required for `main`)**
Every non-packaged dependency must become its own Debian package. The ones we need to package are:
| Dependency | License | Packaging Difficulty |
|------------|---------|----------------------|
| Spring Boot 3.4.x | Apache 2.0 | **Very high** --- ~50 transitive modules, each needs separate packaging; depends on Jakarta EE APIs not all in Debian |
| Apache POI 5.4.0 | Apache 2.0 | Medium --- single project but large; may already exist as `libpoi-java` (check) |
| Tika parsers 2.8.0 | Apache 2.0 | **Very high** --- huge dependency tree including Lucene, XML libraries, etc. |
| springdoc-openapi 1.6.x | MIT | Low --- single project, straightforward Maven build |
| Keycloak server (test only) | Apache 2.0 | High --- not needed in `main` if test-only; bundle as VCS artifact or skip |
This is a **multi-year effort**. Each package needs proper metadata, patches, and maintenance. Spring Boot alone has dozens of modules that would need individual packaging.
**Path B: Use Maven Offline with Online Download Phase (Acceptable for `non-free`)**
For `contrib`/`non-free`, the approach is simpler:

1. `debian/rules` downloads all JARs from Maven Central during online build phase
2. Checksums are verified against pinned values in `debian/control` or checksum file
3. Build runs with `--offline` pointing at pre-populated local repo
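The checksum step could be done with a plain SHA-256 manifest kept alongside the packaging. This is a sketch under stated assumptions: it relies on GNU coreutils and findutils (`sha256sum`, `sort -z`, `xargs -0 -r`), the manifest is stored outside the repository directory, and the function names are illustrative only.

```bash
#!/usr/bin/env bash
# Write a SHA-256 manifest covering every file under a local Maven repo.
# Paths are recorded relative to the repo so the manifest is relocatable.
write_manifest() {
    local repo="$1" manifest="$2"
    ( cd "$repo" && find . -type f -print0 \
        | LC_ALL=C sort -z \
        | xargs -0 -r sha256sum ) > "$manifest"
}

# Verify the repo against a previously pinned manifest.
# Returns non-zero if any file is missing or has changed.
verify_manifest() {
    local repo="$1" manifest="$2"
    ( cd "$repo" && sha256sum --quiet -c - ) < "$manifest"
}
```

A maintainer would run `write_manifest` once after the online download and commit the manifest; the build would then call `verify_manifest` before invoking Maven with `--offline`.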
This still requires Internet access during package build, but is acceptable for non-free. The "deserted island" test would pass because the Debian mirror includes these downloaded artifacts as part of the built package.
**Path C: Hybrid Approach (Recommended Near-Term)**
Use a combination:

1. **Replace packaged dependencies with Debian packages** where available (ANTLR, Guava, H2, etc.) --- reduces JAR count from ~450 to ~350
2. **Package the critical non-packaged ones ourselves**: springdoc-openapi (MIT, easy), any others that are small and well-maintained
3. **Bundle remaining JARs** for now with proper licensing documentation, targeting `non-free` initially
4. **Work upstream** to get Spring Boot packaged in Debian --- this is the blocker for everything
#### Concrete Steps for Path C
```makefile
# debian/rules approach:
override_dh_auto_build:
	# Use Debian packages where available via classpath
	# Download remaining from Maven Central (online phase only)
	dh_auto_build -- -Dmaven.repo.local=$(CURDIR)/.deps-repo --offline
```
With `debian/control`:
    Build-Depends: debhelper-compat (= 13),
                   default-jdk (>= 21),
                   maven-debian-helper,
                   libantlr4-runtime-java,
                   libguava-java,
                   libh2-java,
                   libpostgresql-jdbc-java,
                   libjoda-time-java,
                   libjaxb-java,
                   libbyte-buddy-java,
                   librabbitmq-client-java,
                   libreflections-java,
                   # Replace Maven deps with Debian packages where available
    Standards-Version: 4.6.2
And `debian/copyright` documenting all bundled JAR licenses.
#### Impact on "Deserted Island" Goal
The "deserted island" goal is **achievable now** for building and running the application, even without Debian main packaging:

- A complete Debian mirror + our source repo + pre-populated `maven-repo/` = offline build works
- The barrier to Debian `main` is separate from the ability to build/run offline

The two goals should be tracked separately:

1. **Offline build capability** (this document) --- achievable with bundled artifacts
2. **Debian main compliance** --- requires packaging all dependencies or dropping Spring Boot
7. **Evaluate dropping springdoc-openapi and asciidoctor-maven-plugin** from default build, moving them to an optional profile.
8. **Profile-gate AMQP integration** more clearly so it's not pulled in by default.
9. **Replace Tika with Debian-packaged alternatives** where possible (e.g., `file` command for MIME detection, `tesseract-ocr` for OCR) --- significant refactoring needed.
------------------------------------------------------------------------
## Verification Checklist
To verify the "deserted island" build works:
```bash
# 1. Start with fresh clone + bundled maven-repo tarball
git clone <repo> && cd nikita-noark5-core-upstream
tar xzf ../maven-repo-bundle.tar.gz  # or already in git

# 2. Ensure no network connectivity
iptables -A OUTPUT -p tcp --dport 80 -j DROP
iptables -A OUTPUT -p tcp --dport 443 -j DROP

# 3. Build offline
make build check

# 4. Verify success
test -f target/nikita-noark5-core-*.jar && echo "BUILD SUCCESS"

# 5. Restore network
iptables -D OUTPUT -p tcp --dport 80 -j DROP
iptables -D OUTPUT -p tcp --dport 443 -j DROP
```
If `make build check` succeeds with outbound HTTP and HTTPS (TCP ports 80 and 443) blocked, the deserted island test passes.
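One caveat with step 4: `test -f` with a glob fails with an error when the pattern matches more than one file, for instance if an old build artifact is left in `target/`. A slightly more defensive check might look like the sketch below. The artifact name pattern is taken from the checklist; the function name `built_jar` is an assumption.

```bash
#!/usr/bin/env bash
# Succeed only if exactly one application JAR exists in the target directory,
# and print its path. Assumes the artifact is named
# nikita-noark5-core-<version>.jar.
built_jar() {
    local dir="${1:-target}" count=0 found="" f
    for f in "$dir"/nikita-noark5-core-*.jar; do
        # An unmatched glob stays literal; skip anything that is not a file.
        [ -f "$f" ] || continue
        count=$((count + 1))
        found="$f"
    done
    if [ "$count" -eq 1 ]; then
        echo "$found"
    else
        echo "expected exactly one JAR in $dir, found $count" >&2
        return 1
    fi
}
```

In the checklist this would replace step 4 with `built_jar target && echo "BUILD SUCCESS"`.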
--
Kind regards,
Petter Reinholdtsen