Category Archives: ops

Use both ‘withParam’ and ‘when’ in Argo Workflows (on Kubernetes)

In Argo, we can use ‘withParam’ to create loop logic:

But in my YAML, it also use when in Argo:

When the NEED_RUN is 0, the Argo will report error since it can’t find the {{steps.generate.outputs.result}}. Seems the YAML parser of Argo will try to parse withParam before… Read more »

Some tips about Argo Workflows (on Kubernetes)

Using Argo to execute workflows last week, I met some problems and also find the solutions. 1. Can’t parse “outputs” By submitting this YAML file:

I met the error:

Why the Argo could’t recognize the “steps.generate.outputs.result”? Because only “source” could have a default “output”, not “args”. So the… Read more »

Some problems when using GCP

After I launched a compute engine with container, it report error: gcr.io/xx/xx-xx/feature:yy Feb 03 00:12:28 xx-d19b201 konlet-startup[4664]: {“errorDetail”:{“message”:”failed to register layer: Error processing tar file(exit status 1): write /xxx/2020-01-16/base_cmd/part-00191-2e99af0e-1615-42af-9c60-910f9a9e6a17-c000.snappy.parquet: no space left on device”},”error”:”failed to register layer: Error processing tar file(exit status 1): write /xxx/2020-01-16/base_cmd/part-00191-2e99af0e-1615-42af-9c60-910f9a9e6a17-c000.snappy.parquet: no space left on device”}… Read more »

Problem about installing Kubeflow

Try to install Kubeflow by following this guide. But when I run

it reports

It did cost me some time to find the solution. So let’s try to make it short: Download file https://raw.githubusercontent.com/kubeflow/manifests/v0.7-branch/kfdef/kfctl_k8s_istio.0.7.1.yaml, and find some of its bottom lines:

Download the https://github.com/kubeflow/manifests/archive/v0.7-branch.tar.gz, untar it, and… Read more »

Directly deploy containers on GCP VM instance

We can directly deploy containers into VM instance of Google Compute Engine, instead of launching a heavy Kubernetes cluster. The command looks like:

To add enviroment variables to this container, we just need to add an argument:

To let the container run command for us, we need to… Read more »

A tip about Terraform

      No Comments on A tip about Terraform

Terraform is a interesting (in my opinion) tool to implement Infrastructure-as-Code. When I first used it to write production script at yesterday, I met a error report:

After a while of searching on Google, I got the cause: it can’t find my AWS credential in my computer. Actually I… Read more »

Use docker as normal user

      No Comments on Use docker as normal user

I have used docker for more than 4 years, although not in product environment. Until last week, my colleague told that docker can be used as non-root user. The document is here. I just need to

So easy.

Problems about using DistCp on Hadoop

After installing all Hadoop environment, I used DistCp to copy large files in distributed cluster. But it report error:

Seems it can’t even find the basic MapReduce class. Then I checked CLASSPATH for Hadoop:

Pretty strange, the HADOOP_CLASSPATH contains ‘mapreduce’ directories. It supposed to be able to find… Read more »