Containerizing CF


Project Quarks

March 2020

The following is a compilation of thoughts and technical knowledge I have gathered since the inception of Project Quarks back in 2018. The idea behind Project Quarks is to containerize CF (Cloud Foundry) following a container-native application approach. If you are familiar with CF or Kubernetes, the following might be of interest to you.

Concepts

Kubernetes is the tool of choice in the cloud industry for deploying platforms; see k8s.

CF is a PaaS focused on the developer experience rather than on the underlying infrastructure. Its main selling point is the cf push experience (see CF for an explanation).

SCF is a release engineering repository that allows you to deploy a containerized CF on Kubernetes. SCF was created by SUSE and is the predecessor of Quarks.

Quarks (also known as cf-operator) is a set of Kubernetes controllers and CRDs that operate based on an instance of a BOSHDeployment CRD. Quarks transforms a cf-deployment manifest into Kubernetes resources. These Kubernetes resources represent a CF instance.

KubeCF is a package that bundles CF into a helm chart; in other words, it is like a package manager for deploying CF on Kubernetes.

BOSH, Kubernetes and CF

In the CF world, BOSH has so far been the tool of choice for the deployment and lifecycle management of CF. BOSH, a powerful tool for the orchestration of CF, has enabled the standardization of the CF deployment by defining a common way of packaging and exposing each component. The caveat, however, is that BOSH is heavily VM (Virtual Machine) biased.

With the rise of application containers, and Kubernetes being the platform of choice in the industry, there was a big question for CF: how to deploy it on Kubernetes? While there were different attempts to achieve this, Project Quarks is, to my knowledge, the predominant option for several reasons:

  • A strong background on containerizing CF (e.g. SCF)
  • BOSH manifest parity (it originally started as a BOSH incubation project)
  • Highly extensible (e.g. it can integrate with CF components that are Kubernetes native)
  • Provides you with a certified CF to deploy on your production environments

One cannot speak about Project Quarks without understanding the background and main ideas that led to it, which point back to SCF. I personally spent a significant amount of time working with the SCF implementation, and from my experience it is worth explaining some pieces of it.

SCF

SCF is a release engineering repository, consisting of YAML files and two binaries (fissile & configgin), where operators (engineers) can describe a BOSH manifest (to some extent) that is then transformed into helm charts and docker images.

The SCF helm charts (see assets) consist of two charts: one is meant to be deployed in the uaa namespace, and the other one in the cf namespace (I think SUSE calls it the scf namespace). The cf namespace hosts all control plane components (e.g. capi, diego, routing, loggregator, nats). The purpose of this separation is to support multiple CF clients within a single uaa instance.

It is relevant to know that, at this point, SCF provides you with a production-ready CF, which is used by different companies offering CF on top of Kubernetes.

Now that we have a more structured idea of what SCF is, we can focus on understanding some of the caveats of its implementation. These can be seen as lessons learned in the journey of containerizing CF, caveats that Project Quarks leverages.

Some of the SCF implementation caveats are:

—YAMLs and Immutable Configuration—

While I like YAMLs and had previous experience finding ways to override them, SCF didn't provide an out-of-the-box way to modify them (e.g. BOSH ops-files support). One requires some level of ingenuity to come up with a proper mechanism to override YAMLs consistently. At the same time, pieces of the YAML configuration are consumed by the containers at run-time but are only configurable at build-time. This makes it almost impossible to make modifications once your containers are running, forcing you to re-build if you want to modify something.
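
For context, a BOSH ops-file is a small YAML document listing operations to apply on top of a base manifest. A minimal sketch (the path is illustrative; it depends on your manifest):

    # Scale the api instance group to three instances
    - type: replace
      path: /instance_groups/name=api/instances
      value: 3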

—Build Time vs Run Time—

Build-time is focused on the generation of helm charts and docker images based on YAMLs. Run-time is focused on running your containers, while triggering a set of processes during initialization, e.g. rendering BOSH ERB templates via configgin.

The build-time cycle is too long (more than one hour), leading to painful experiences for any misconfiguration spotted at run-time. For an individual error one hour isn't too much, but when you deal with YAML files of more than 4000 lines, you are prone to errors.

—Lifecycle Management—

SCF was all about building and deploying, but it did not act on your CF instance once it was running. In other words, any modification to a Kubernetes resource generated from the SCF helm charts will be overridden on a restart/deletion of that resource. The only way of persisting changes is helm, so there is no automated lifecycle management for CF.

—Dependencies management—

While configgin and fissile are great tools (parts of their algorithms were ported to or reused in Quarks), building images and helm charts implies keeping configgin (inside the stemcell) and fissile in sync, raising the need to know which versions to use for each specific SCF release.

Quarks and KubeCF

Over the last years, SUSE, IBM and SAP have been developing both Quarks and KubeCF as an evolution of the previous SCF implementation. Quarks has already shipped a v3.2.0 stable release, while KubeCF has a v1.0.0 release. It is important to highlight the strong dependency between them.

Quarks is a set of Kubernetes controllers and CRDs distributed across different GitHub repositories (see cf-operator and quarks-job). Without going into much detail, the main takeaway from the Quarks mechanics is that the entry point triggering the whole conversion of a CF manifest into Kubernetes resources is the definition of a BOSHDeployment CRD instance (see an example here). Another important takeaway is that the Quarks helm chart will only install the Quarks controllers in your cluster; with this chart alone, you will not get a running CF.

KubeCF can be used to create the above Kubernetes CRD instance alongside all other Kubernetes resources referenced in the same instance. KubeCF generates a CF base manifest and a set of ops-files inside proper Kubernetes resources that the Quarks controllers can understand (e.g. configmaps and secrets). In the above example, spec.manifest defines the base YAML, and spec.ops a list of ops-files to interpolate on top (sounds familiar, right?).
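
A minimal sketch of such a BOSHDeployment instance; the apiVersion and exact fields vary between Quarks releases, and the resource names here are made up:

    apiVersion: quarks.cloudfoundry.org/v1alpha1
    kind: BOSHDeployment
    metadata:
      name: kubecf
    spec:
      manifest:            # base cf-deployment manifest
        name: cf-manifest  # configmap holding the YAML
        type: configmap
      ops:                 # ops-files interpolated on top
        - name: custom-ops # secret holding an ops-file
          type: secret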

Note: the above is a very simple explanation of the BOSHDeployment controller mechanics; there are more controllers and different CRD instances to define, but this is handled automatically on-the-fly by the Quarks controllers. Most importantly, you do not need to worry about this: KubeCF abstracts it from you and exposes all configurable parts through its values.yaml file. Also, KubeCF will install the Quarks helm chart for you.

Bonus: if you dislike helm, even v3 (e.g. because of old Tiller traumas), you can render all KubeCF templates and apply them directly with kubectl.
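
A sketch of that workflow, assuming a local copy of the KubeCF chart (the chart path and release name are made up):

    # Render the chart locally and pipe the resulting manifests into the cluster
    helm template kubecf ./kubecf --values values.yaml | kubectl apply -f -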

Quarks Implementation Highlights

Project Quarks delivered its first v1.0 release at the end of last year. Here is a list of some of the main implementation highlights of Quarks:

  • Supports the same BOSH standardization mechanism, a BOSH manifest (see cf-deployment), as the way to define CF. Quarks started as a BOSH incubation project. Having the same YAML syntax opens a better path for future migrations to CF on Kubernetes.

  • Support for BOSH ops-files. This is a great feature, especially because, depending on your infrastructure provider, a lot of the base configuration requires modifications or the addition of customized features.

  • Building docker images. Contrary to SCF, you just need a single fissile subcommand to build the docker images per component. See the KubeCF docs. If the KubeCF images are not enough for you, you can set up a pipeline that builds all images per cf-deployment version, based on your customized stemcell.

  • Uses state-of-the-art Kubernetes concepts. While not everything requires Kubernetes controllers, the complexity of CF on Kubernetes and all of the knowledge gathered by platform operators throughout years of managing CF fitted perfectly into a controller pattern approach. Therefore, Project Quarks is composed of several controllers and CRDs that manage the lifecycle of a CF instance on Kubernetes.

  • The Quarks controllers live in independent repositories. While this is not a finalized process, the idea is to contribute these controllers back to the Kubernetes community. An example of this is quarks-job, which was originally inside the cf-operator repository but was later isolated into its own repository.

  • Supports integration with CF components that are Kubernetes native. This is a great feature because many component teams inside the CF community are favoring Kubernetes native implementations over BOSH releases. The project implemented the so-called Quarks Links or Entanglements (see the entanglements docs), allowing you to integrate with components that are not based on BOSH releases. You can see other ongoing efforts related to this topic, like the eirini release integration.

  • Build-times and run-times. Build-time is just about building your CF component images and pushing them to a registry, nothing else. All of the component configuration happens at run-time, so you no longer need to worry about a misconfiguration forcing you to repeat the whole build/run cycle.

  • KubeCF is the de facto release engineering repository for installing CF via Quarks. The installation boils down to running two separate helm chart installations (the quarks and kubecf charts). KubeCF allows you to override the default setup, either via the values.yaml of the KubeCF chart or by providing a new set of ops-files; see the values sketch after this list.

  • One operator, many CF instances. The quarks operator was designed to operate multiple CF instances at the same time. You can deploy the quarks operator followed by multiple KubeCF charts, in different namespaces.

  • Deployment times are significantly reduced. A good consequence of running all BOSH components with bpm is a smaller footprint for the whole CF. With the current KubeCF charts, you can get a running CF in 30 minutes or less (using the default KubeCF docker images).

  • Deployment modes. One of the best features of KubeCF is that you can take the helm charts and deploy them locally on minikube or kind. This is great for any type of test because it enables you to control everything from your local machine, without the need for external infrastructure.

  • No nested containers. Project Quarks allows you to switch the application scheduler (diego vs eirini) via a feature flag in the helm charts; see the eirini enable flag.
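
To illustrate the override mechanism mentioned in the KubeCF item above, a hypothetical values.yaml snippet for the KubeCF chart could look as follows; the exact keys depend on the KubeCF release, so verify them against the values.yaml shipped with the chart:

    # Hypothetical KubeCF overrides; key names are assumptions
    system_domain: cf.example.com
    features:
      eirini:
        enabled: true   # swap diego for the eirini scheduler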

The future

Moving Cloud Foundry to Kubernetes has been a highly discussed topic over the last several years (elephant time), but only a few have really given it a try and succeeded with a maintainable/scalable approach.

Technologies like Eirini have played an important role in the large-scale implementation of a containerized CF, fitting together with technologies like Quarks (at the right time), providing a more container-native implementation and reducing the footprint of the whole CF architecture.

How the Kubernetes landscape will evolve is uncertain; it is too wide, and everyone can build their own tooling, leading to a lack of standardization when it comes to building complex cloud native applications. Containerizing Cloud Foundry is not the exception. Additionally, the Quarks implementation is highly extensible (e.g. quarks links); together with KubeCF, these two tools provide a solid landscape for all CF components to integrate via a container-native implementation approach.

Deploying CF in k8s with Quarks

This post is not intended to illustrate the steps to run CF with the above technologies. However, if you're interested in trying it out, see the official documentation from KubeCF; Stark & Wayne also offers a very good tutorial.
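
For orientation, the two-chart installation mentioned earlier boils down to something like the following; the chart references, namespaces and flags are assumptions, so take the exact commands from the KubeCF docs:

    # Install the Quarks controllers (cf-operator chart), watching
    # the namespace where KubeCF will be deployed
    kubectl create namespace cf-operator
    helm install cf-operator ./cf-operator \
      --namespace cf-operator \
      --set "global.operator.watchNamespace=kubecf"

    # Install KubeCF itself, which creates the BOSHDeployment instance
    kubectl create namespace kubecf
    helm install kubecf ./kubecf \
      --namespace kubecf \
      --values values.yaml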
