diff --git a/content/blog/2025-06-25-expanding-catalog.mdx b/content/blog/2025-06-25-expanding-catalog.mdx
new file mode 100644
index 00000000..8c66b0f2
--- /dev/null
+++ b/content/blog/2025-06-25-expanding-catalog.mdx
@@ -0,0 +1,93 @@
+---
+title: "GSoC'25: Expanding the Unikraft Software Support Ecosystem"
+description: |
+  This GSoC project aims to expand Unikraft’s software support by adding new applications to the application catalog via binary-compatibility mode.
+publishedDate: 2025-06-25
+image: /images/gsoc25.jpeg
+authors:
+- Prasoon Kumar
+tags:
+- gsoc
+- gsoc25
+- catalog
+- sqlite
+- mosquitto
+- prometheus
+- consul
+---
+
+## Project Overview
+
+Unikraft makes it easy to run existing applications in two ways:
+1. **Musl libc support**
+
+2. [**Binary-compatibility mode**](https://unikraft.org/docs/concepts/compatibility):
+Unikraft can run unmodified Linux ELF binaries by using an ELF loader app and a system-call shim layer.
+When a binary-compatible unikernel starts, the `app-elfloader` parses and maps the ELF segments into memory, then transfers control to the application entry point.
+Whenever the application invokes a Linux system call, the syscall shim intercepts the call number and routes it to the corresponding Unikraft handler.
+Unimplemented calls return `ENOSYS`, which many applications can handle or fake because Unikraft follows Linux’s x86_64 ABI closely.
+This helps when running on hypervisors like KVM, QEMU, or Firecracker.
+
+## Adding an application in binary-compatibility mode
+
+The full procedure is detailed in the Unikraft docs (see [Adding Applications to the Catalog](https://unikraft.org/docs/contributing/adding-to-the-app-catalog)); here are a few additional findings:
+
+1. **Find supported syscalls**
+   - Search for syscall macros in the Unikraft source (see [Syscall Shim Layer](https://unikraft.org/docs/internals/syscall-shim)).
+   - Or look for `UK_PROVIDED_SYSCALLS-` entries in the `Makefile.uk` files of Unikraft libraries.
+2. **Discover the application’s syscalls**
+   - Run the application under `strace` inside a container using `strace -o {destination_file} -f {application_binary}`.
+   - Save the trace output and extract the unique syscall list using `awk '{split($2, a, "("); print a[1]}' strace_out.txt | sort | uniq`.
+   - Compare the result against Unikraft’s set of supported syscalls to spot any missing calls.
+3. **Prepare a dynamic ELF**
+   - If the existing Docker image has only a statically linked binary, rebuild the application in an `alpine` container to produce a `PIE (ET_DYN)` ELF suitable for Unikraft’s loader.
+
+## Current Progress
+
+Initially, I selected a subset of 14 applications and listed the system calls they use that are not yet supported in Unikraft.
+
+I tried to add the following applications to the [catalog](https://github.com/unikraft/catalog) in binary-compatibility mode, and I was able to add several of them.
+I'm also investigating and documenting why some applications can't be added in bincompat mode yet:
+
+1. **sqlite:3.44**
+   - A small, serverless SQL database engine that stores data in a single file, perfect for edge and cloud applications needing fast, local storage without a separate server.
+   - Runs successfully once `CONFIG_LIBPOSIX_PROCESS_SIGNAL` is enabled in the Kraftfile.
+   - I opened a PR before GSoC, but later discovered that the POSIX process signal option was required.
+   - Added GitHub Actions workflow YAML in the catalog for automatic builds.
+   - Pull Request: [catalog#163](https://github.com/unikraft/catalog/pull/163).
+
+2. **mosquitto:2.0.21**
+   - A lightweight MQTT message broker used in microservice systems, enabling efficient, real-time publish / subscribe communication.
+   - Unikraft currently lacks multi-user support, so the user must be hard-coded as root in `mosquitto.conf`.
+   - Added GitHub Actions workflow YAML in the catalog for automatic builds.
+   - Pull Request: [catalog#207](https://github.com/unikraft/catalog/pull/207).
+
+3. **prometheus:2.53.4**
+   - A monitoring and alerting system for cloud-native environments, collecting and querying metrics to help operators observe and maintain distributed services.
+   - Uses file-backed `mmap(..., MAP_SHARED, ...)`, which is not yet supported in Unikraft.
+
+4. **consul:1.21.1**
+   - A service discovery and configuration platform that enables dynamic registration, health checks, and key / value storage for microservice orchestration.
+   - Calls `socket(AF_NETLINK, ...)`, which Unikraft does not support yet.
+
+## Next Steps
+
+For the next three weeks, I plan to continue expanding the catalog by adding more high-priority applications.
+My immediate targets are:
+- **etcd**: A distributed key-value store for service discovery and coordination in cloud-native environments.
+- **InfluxDB**: A time-series database optimized for real-time monitoring and analytics.
+- **Vault**: A tool for securely managing secrets and encryption keys.
+
+I will follow the binary-compatibility workflow described [above](/blog/2025-06-25-expanding-catalog#adding-an-application-in-binary-compatibility-mode).
+
+## Acknowledgement
+
+I would like to thank my mentors, [Razvan Virtan](https://github.com/razvanvirtan) and [Razvan Deaconescu](https://github.com/razvand), for their guidance throughout this project.
+I’m also grateful to the entire Unikraft community for being so welcoming and helpful.
+
+## About Me
+
+I am a first-year MTech student at IIT Bombay.
+I have experience across many areas of computer science, including operating systems, virtualization, Web3, web development, and algorithms.
+In my free time, I love solving challenging math and algorithmic problems.
+- Socials: [LinkedIn](https://www.linkedin.com/in/prasoon054/), [Twitter](https://x.com/prasoon054), [GitHub](https://github.com/prasoon054)