Groovy can be used as the configuration language for Jenkins workflows. Although I can hardly make head or tail of Groovy, its syntax is not hard to learn.
How to iterate a list
To export a bunch of tables to CSV format, we could use
[
    "table1",
    "table2",
    "table3",
].each { table_name ->
    sh "export ${table_name} to ${table_name}.csv"
}
Get the output of a shell command
We can run shell commands in the Groovy file and then capture their output.
all_files = sh(
    script: 'ls -lh',
    returnStdout: true
).trim()
echo "All files: ${all_files}"
Recently I dug out my USBasp tool and a few AVR microcontrollers to enjoy programming in C again. Unexpectedly, the old ATTINY2313V and ATmega88V wouldn’t work with my USBasp (maybe their fuses have already been set for an external crystal, but I don’t have one at hand). The only two chips that worked were the ATmega16A and ATmega16L. At least I could still have some fun with them.
Later, while glancing over the documents for Atmel’s newer ATTINY series, I found out that the ATTINY13A only has 1 KB of flash to store a program. My waterfall-light example compiled to a 2 KB hex file. Does that mean my program won’t fit into the ATTINY13A? How can I get the real flash footprint of the hex file?
Here is one solution, using avr-size:
$ avr-size main.hex
   text    data     bss     dec     hex filename
      0     756       0     756     2f4 main.hex
Only 756 bytes (text + data) will be used in flash, so the ATTINY13A should be okay.
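Out of curiosity, the footprint check can also be scripted. A small sketch in Python that parses the two output lines above and computes text + data:

```python
# Parse the tabular output of avr-size and compute the flash footprint.
avr_size_output = """\
   text    data     bss     dec     hex filename
      0     756       0     756     2f4 main.hex"""

header, values = [line.split() for line in avr_size_output.splitlines()]
row = dict(zip(header, values))
flash_used = int(row["text"]) + int(row["data"])  # bytes programmed into flash

print(flash_used)          # 756
print(flash_used <= 1024)  # True: fits into the ATTINY13A's 1 KB flash
```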
Now, what should I do if I want to reduce the size of the binary file compiled from my code? Here is the guide from Atmel.
First, I changed the type of the variable “mode” from “unsigned int” to “unsigned char”, which brought the binary down to 726 bytes. Then I changed all internal functions to “static” (I guess this removes some unused symbols for external linking), reducing the binary size to 668 bytes.
—— 2021.07.13 ——
Furthermore, when I used the “-mrelax” option of gcc-avr for linker relaxation, the binary size shrank to 656 bytes.
As the menu above shows, Vertex AI tries to cover all the common steps of building and running a machine learning model.
1. Dataset
For my experiment, I just created a Dataset by loading a file from GCS. Unfortunately, the loading process supports only CSV files as tabular data, so I had to convert my big PARQUET file into CSV format first (really inconvenient).
Strange error
But after I created a training job using the built-in XGBoost container, it reported a strange error: there is an invalid column. But what’s the name of that column? The GUI didn’t show it. I finally found out that it was a column with an empty name. It seems Vertex AI can’t even process a table containing a column with an empty name.
2. AutoML
After manually removing the column with the empty name and selecting AutoML for my tabular data, the training went successfully. The final regression L1 loss is 0.237, just the same result as my own LightGBM model.
3. Custom Package
By following this document, I created a custom Python package for training my XGBoost model. The self-brewed package uses an environment variable to get the Dataset from GCS. The final L1 loss is slightly worse than LightGBM’s.
Frankly speaking, I haven’t seen any advantage of Vertex AI over our home-brewed Argo/K8S training framework. And in the Vertex AI training process, specific errors like OOM (Out Of Memory) are hard to discover.
Normally, to upgrade a Google Kubernetes Engine cluster, we need to upgrade the master first, and then the node_pools. For convenience, I just clicked the “UPGRADE AVAILABLE” button in the “Release Channel” section under the “DETAILS” tab of the cluster GUI.
After about 5-10 minutes, I started using commands to upgrade all the node_pools in this cluster:
for pool in highcpu-2 highcpu-4 highcpu-8; do
    echo y | gcloud container clusters upgrade my-cluster \
        --node-pool="${pool}" --zone=my-zone --project=my-project
done
Since we have quite a bunch of node_pools, this upgrade process took about half an hour to finish. Even after the script ended, two node_pools still showed an “Error” status. The detail of the error is:
Insufficient quota to satisfy the request: waiting on IG: instance https://www.googleapis.com/compute/v1/projects/my-project/zones/us-central1-a/instances/my-zone-highcpu-2-e435c7e2-xelk is still CREATING. Last attempt error: [QUOTA_EXCEEDED] Instance ‘my-zone-highcpu-2-e435c7e2-xelk’ creation failed: Quota ‘CPUS’ exceeded. Limit: 1200.0 in region …
It seems the upgrade process needs to create extra new nodes with the new version first and only then delete the old nodes, hence it requires extra CPUs and caused us to exceed the CPU quota.
Don’t worry. We just need to rerun the upgrade for the two node_pools with the “Error” status, and eventually the whole cluster was upgraded to 1.19.9-gke.1900.
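The quota arithmetic can be sketched with a toy calculation (all pool sizes below are hypothetical, assuming the pool being upgraded temporarily runs one extra surge node):

```python
CPU_QUOTA = 1200  # the regional 'CPUS' limit from the error message

def peak_cpus(pools, upgrading, surge_nodes=1):
    """Peak CPU usage while 'upgrading' temporarily runs surge_nodes extra nodes."""
    steady = sum(count * cpus for count, cpus in pools.values())
    _, cpus = pools[upgrading]
    return steady + surge_nodes * cpus

# Hypothetical node_pools: name -> (node count, CPUs per node)
pools = {"highcpu-2": (100, 2), "highcpu-4": (150, 4), "highcpu-8": (50, 8)}

print(peak_cpus(pools, "highcpu-8"))              # 1208
print(peak_cpus(pools, "highcpu-8") > CPU_QUOTA)  # True: the surge node trips the quota
```

With a steady-state usage already at the limit, even one surge node pushes the region over quota, which matches the QUOTA_EXCEEDED error above.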
If you accidentally truncate a table in BigQuery, you can try this article to recover the data. However, I found out that the “bq cp project:dataset.table@-36000 project:dataset.table” method could not work in my situation. The only working solution is “FOR SYSTEM_TIME AS OF”:
CREATE TABLE `mydataset.newtable` AS
SELECT *
FROM `mydataset.mytable`
FOR SYSTEM_TIME AS OF TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 2 HOUR);
and then “bq cp project:mydataset.newtable project:mydataset.mytable“
import numpy as np
np.random.seed(202105)
rand = np.random.rand()
# business logic code using 'rand'
Then I added another np.random.rand() call at the head of the code, and this time the output of the code became quite different.
The reason is simple: “rand” always got 0.9 in the previous executions, but since it becomes the second rand() call instead of the first one, it now gets 0.06 (unfortunately). 0.06 is far less than 0.9, so the output data is totally different.
I think the solution is also simple: don’t make your program too dependent on a particular random number, or keep the generated number within a range (like 0.6~0.9 in my case).
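The shift is easy to reproduce with the same seed (a minimal sketch; the exact values depend on the seed, so I only compare them):

```python
import numpy as np

def first_business_draw(extra_call: bool) -> float:
    # Re-seed, optionally consume one value first, then return the draw
    # that the business logic would actually see.
    np.random.seed(202105)
    if extra_call:
        np.random.rand()  # the extra call added at the head of the code
    return np.random.rand()

print(first_business_draw(False))  # the value the old code always depended on
print(first_business_draw(True))   # shifted: a completely different value
```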
A program that used CUDA for GPU computing reported an error about memory:
terminate called after throwing an instance of 'std::runtime_error'
what(): [CUDA] an illegal memory access was encountered LightGBM/src/treelearner/cuda_tree_learner.cpp 239
For a common C++ program, we use gdb for debugging. For a CUDA program, we should use cuda-gdb. Make sure to compile the CUDA code with the -g flag (and -G for device-side debug info) and then run:
/usr/local/cuda-11.0/bin/cuda-gdb python3
(cuda-gdb) run test.py
After a while, we can see the exact position of the memory corruption in the code:
CUDA Exception: Warp Illegal Address
The exception was triggered at PC 0x1668b2f0 (histogram_16_64_256.cu:182)
Thread 1 "python3" received signal CUDA_EXCEPTION_14, Warp Illegal Address.
[Switching focus to CUDA kernel 0, grid 10, block (2163,0,0), thread (0,0,0), device 0, sm 0, warp 3, lane 0]
0x000000001668b380 in LightGBM::histogram16<<<(7360,1,1),(16,1,1)>>> () at LightGBM/src/treelearner/kernels/histogram_16_64_256.cu:185
185 feature = (feature >> ((ind & 1) << 2)) & 0xf;
I have just finished a piece of work migrating a Spark job to BigQuery, or more precisely: migrating Python code to SQL. It’s tedious work, but it improves performance significantly: from a 4-hour runtime on PySpark to half an hour on BigQuery (honors belong to BigQuery!).
Here are a few notes from the migration, or just SQL skills:
1. To create or overwrite a temporary table:
CREATE OR REPLACE TEMP TABLE `my_temp_tbl` AS ...
2. Select all columns from a table except some specific ones, with BigQuery’s star modifier: SELECT * EXCEPT (col_a, col_b) FROM `my_tbl`
4. Using the OFFSET clause with LIMIT is terribly slow when the table is very big. The best solution for me is to use “bq extract” to export the data to GCS as parquet files, and then fetch each part of these files with a program.
5. Parquet files may use column names that contain a hyphen, like “last-year” or “real-name”, but BigQuery only supports column names with underscores, like “last_year” and “real_name”. So “bq load” will automatically convert the column name “last-year” in the parquet file to “last_year” in the BigQuery table.
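The renaming can be approximated with a simple substitution (my own sketch of what “bq load” does here, not the official algorithm):

```python
def to_bigquery_column(name: str) -> str:
    # Approximation of bq load's renaming: hyphens become underscores,
    # since BigQuery column names don't allow '-'.
    return name.replace("-", "_")

parquet_columns = ["last-year", "real-name", "user_id"]
print([to_bigquery_column(c) for c in parquet_columns])
# ['last_year', 'real_name', 'user_id']
```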