Skip to main content

Setting Clickhouse column data warehouse at Google Cloud Compute Engine VM

I didn't have a Google Cloud account associated with my email, so I signed up for one. It needs a valid Credit Card and mobile number to check if you are human.

On successful sign up I get 300$ to spend within 3 months.

Creating a free forever Google Cloud Compute Engine VM

As per Google Cloud documentation you can have 1 non-preemptible e2-micro VM instance (1GB 2vCPU, 30GB Disk, etc.) per month free forever in some regions with some restrictions.

I wanted the following stuff in my VM before I can install Clickhouse on to that:

  1. Ubuntu 20.x LTS
  2. SSH access from my machine

Enabling SSH-based access to Google Compute Engine VM

Step 1

Created an ssh private and public key on my mac using the following command

ssh-keygen -t rsa -f ~/.ssh/gcloud-ssh-key -C mrityunjay -b 2048

Step 2

Copied the public key from the console using the following command:

cat ~/.ssh/gcloud-ssh-key.pub

output

ssh-rsa <Gibrish :)> mrityunjay

Step 3

I went to Google Cloud Console > Compute Engine > Metadata > SSH Keys Section.

I clicked on Edit > Add Item and pasted the previously copied public SSH key and saved the item.

Newly added SSH keys started appearing in the list.

Creating a new VM

Creating a new VM is a breeze with easy-to-use options on the screen in the Console.

I made sure the following stuff:

  1. The selected region is us-west-1.
  2. Changed Disk to 30 GB general purpose.
  3. Choose Ubuntu 20 LTS.
  4. Paste Google Cloud SSH Public Key.

VM successfully created!

Accessing my VM

Command

ssh -i ~/.ssh/gcloud-ssh-key mrityunjay@xx.yyy.xxxx.xx

And I am within the Ubuntu 20.x Machine.

Setting up the ClickHouse

I found a great article on DigitalOcean documentation for setting up ClickHouse on Ubuntu 20.x LTS.

Most of my setup follows instructions from the article. God bless the author.

Adding Yandex managed ClickHouse APT repository


sudo apt-key adv --keyserver keyserver.ubuntu.com --recv E0C56BD4


 echo "deb http://repo.yandex.ru/clickhouse/deb/stable/ main/" | sudo tee /etc/apt/sources.list.d/clickhouse.list

sudo apt update

Installing clickhouse-server and clickhouse-client packages


sudo apt install clickhouse-server clickhouse-client


I provided a secure password for the default user of ClickHouse.

Setup done!

ClickHouse Administration

Starting ClickHouse on Ubuntu 20.X LTS


sudo service clickhouse-server start

Checking status for the ClickHouse


sudo service clickhouse-server status


Connecting to ClickHouse


clickhouse-client --password

On success, I was seeing a shell that invited me to try some commands.

ClickHouse is in the heart like an RDBMS with some twist.

Executing few commands

Creating a new database

create database test;

Output

Query id: 9dc02359-02da-480e-b538-96d63d1155ad

Ok.

0 rows in set. Elapsed: 0.005 sec.

Using the database

use test

Creating a new table

CREATE TABLE test1
(
    `id` UInt64,
    `name` String
)
ENGINE = MergeTree
PRIMARY KEY id
ORDER BY id

Output


Query id: dd2cd818-780b-4c63-9c8f-695a83121719

Ok.

0 rows in set. Elapsed: 0.010 ```

Displaying all tables


show tables

Output

Query id: 2adb7b7e-3f53-40cf-8b2d-263c27a487a0

┌─name──┐
│ test1 │
└───────┘

1 rows in set. Elapsed: 0.013 sec. 


Okay so for this tree in my garden I can stop now by trying out commands of ClickHouse.

Accessing ClickHouse from other Machine

To access ClickHouse from another machine I made the following changes in the ClickHouse config to allow listening from all interfaces.

sudo nano /etc/clickhouse-server/config.xml

And uncomment <!-- <listen_host>::</listen_host> -->.

Restart server


sudo service clickhouse-server restart

There are two ports on which ClickHouse listen 8123 (HTTP) and 9000 (Internal)

I enabled them in Google Cloud Compute Engine Firewall options.

And I am good to go with accessing ClickHouse DB from another system.

Allowing MySQL Connector

ClickHouse supports MySQL wire protocol. It can be enabled by mysql_port setting in the configuration file:


 <!-- Compatibility with MySQL protocol.

 ClickHouse will pretend to be MySQL for applications connecting to this port.

 -->
<mysql_port>9004</mysql_port>

Uncomment or add the above line in /etc/clickhouse-server/config.xml the file. Restart the server after saving.

Allowing PostgreSQL Connector


 <!-- Compatibility with PostgreSQL protocol.

 ClickHouse will pretend to be PostgreSQL for applications connecting to this port.

 -->

 <postgresql_port>9005</postgresql_port>

Uncomment or add the above line in /etc/clickhouse-server/config.xml the file. Restart the server after saving.

Comments

Popular posts from this blog

Working with request header in Jersey (JAX-RS) guide

In the  previous post , we talked about, how to get parameters and their values from the request query string. In this guide learn how to get request header values in Jersey (JAX-RS) based application. We had tested or used the following tools and technologies in this project: Jersey (v 2.21) Gradle Build System (v 2.9) Spring Boot (v 1.3) Java (v 1.8) Eclipse IDE This is a part of  Jersey (JAX-RS) Restful Web Services Development Guides series. Please read Jersey + Spring Boot getting started guide . Gradle Build File We are using Gradle for our build and dependency management (Using Maven rather than Gradle is a very trivial task). File: build.gradle buildscript { ext { springBootVersion = '1.3.0.RELEASE' } repositories { mavenCentral() } dependencies { classpath("org.springframework.boot:spring-boot-gradle-plugin:${springBootVersion}") } } apply plugin: 'java' apply plugin: 'eclipse' a

Ajax Cross Domain Resource Access Using jQuery

Some time back in our project we faced a problem while making an Ajax call using jQuery. Chrome Browser console had given some weird error message like below when we try to access one of our web pages: When we try to access the same web page in the Firefox browser, it doesn't give any error in the console but some parsing error occurred. In our case, we were accessing XML as an Ajax request resource. I was curious to check if the non-XML cross-domain resource was successfully loading or not. But finally, I realized that it is not going through. jersey-spring-boot-quick-starter-guide In our Ajax call, requesting domain was not the same as the requested URL domain. $.ajax({ url: "https://10.11.2.171:81/xxxxxx/xxxxxxx.xml" , type : "get" , success: function (response) { alert( "Load was performed." ); }, error : function (xhr, status) {

FastAPI first shot

Setup on my Mac (Macbook Pro 15 inch Retina, Mid 2014) Prerequisite Python 3.6+ (I used 3.7.x. I recently reinstalled OS after cleaning up disk, where stock Python 2.7 was available. I installed Pyenv and then used it to install 3.7.x). I already had a git repo initialized at Github for this project. I checked that out. I use this approach to keep all the source code safe or at a specific place 😀. I set the Python version in .python-version file. I also initialize the virtual environment using pyenv in venv folder. I started the virtual environment. FastAPI specific dependencies setup Now I started with basic pip commands to install dependency for the project. I saved dependencies in requirements.txt  the file. Minimal viable code to spin an API Server FastAPI is as cool as NodeJS or Go Lang (?) to demonstrate the ability to spin an API endpoint up and running in no time. I had the same feeling for the Flask too, which was also super cool. app/main.py: from typing i