<![CDATA[A Gema's Ramble.]]>https://gemawardian.com/https://gemawardian.com/favicon.pngA Gema's Ramble.https://gemawardian.com/Ghost 5.96Wed, 30 Oct 2024 09:53:16 GMT60<![CDATA[Fixing Docker Layer Push issue on GitLab Container Registry]]>https://gemawardian.com/fixing-docker-layer-push-issue-on-gitlab-container-registry/6721b67612d6d927b1b90304Wed, 30 Oct 2024 09:47:58 GMT

As a DevOps Engineer, my main responsibility is to manage deployments, particularly publishing images to the GitLab Container Registry. Occasionally, the deployment of an image layer fails for unknown reasons, particularly with larger-sized images.

I have run into this issue both on GitLab.com and on self-hosted instances, and troubleshooting it can be hard, especially when the error log is vague and the push keeps retrying itself.

One of the errors encountered when pushing an image

Most common error messages encountered on pushing to the registry:

  • received unexpected HTTP status: 500 Internal Server Error
  • unknown: Client Closed Request
  • http: proxy error: context canceled
  • write: connection reset by peer

This guide will help you troubleshoot and fix deployment issues, especially on stuck or failed layer pushes on the GitLab container registry platform.

Troubleshooting #1: GitLab Nginx Upload Limit

When running GitLab on-premises, the default built-in nginx upload limit is too small to handle large uploads and needs to be increased. Open the GitLab configuration file and increase the client_max_body_size value as needed.

# /etc/gitlab/gitlab.rb
nginx['client_max_body_size'] = '256m'

After increasing the limit, run the reconfiguration script and restart the GitLab server.

gitlab-ctl reconfigure
gitlab-ctl restart

Troubleshooting #2: GitLab Authentication Token Timeout

Simply put, if an image upload takes a considerable amount of time due to computational or bandwidth limitations, the registry token may expire midway through the upload process, causing it to fail.

This value cannot be changed on GitLab.com, and you may need to either self-host your docker registry or optimize the image size further. Fortunately, in a self-hosted GitLab installation, you can access this page and change the token limit.

# with your GitLab admin account
https://your_gitlab_domain/admin/application_settings/ci_cd
Auth token duration limit, changed from the default value of 5 minutes to 30 minutes.

Troubleshooting #3: Reverse Proxy Timeout

If you are running a reverse proxy in front of GitLab or the container registry, you need to increase its timeouts, especially the read/write timeouts, to handle longer HTTP connections.

If your GitLab registry log file contains the error "client disconnected during blob PATCH", increasing the proxy timeout could help solve the issue.

On Nginx, increase the connection timeout by adding the value below to your configuration:

# Nginx configuration
http {
    ...
    keepalive_timeout 65;
    send_timeout 30;
    client_body_timeout 600s;
    client_header_timeout 600s;
    ...
}

# Virtual host config
location / {
    proxy_pass http://backendserver;
    proxy_connect_timeout 60s;
    proxy_read_timeout 600s;
    proxy_send_timeout 600s;
}

Increase these values as needed; for troubleshooting purposes, set the timeouts to a high value.

For Traefik:

[entryPoints.websecure]
    address = ":443"
    [entryPoints.websecure.transport]
      [entryPoints.websecure.transport.respondingTimeouts]
        idleTimeout = 600
        writeTimeout = 600
        readTimeout = 600

Add the entry point timeouts; the example above shows them on the HTTPS (websecure) entry point.

Troubleshooting #4: Decrease Docker Concurrent Uploads

By default, docker push uploads five layers concurrently. If your registry and runner bandwidth is limited and slow, decreasing the number of concurrent layer uploads could help with the issue.

Open your daemon config file, which should be at /etc/docker/daemon.json (or check the documentation). Update the concurrent upload/download values as needed, and restart the Docker daemon.

{
    "max-concurrent-uploads": 2,
    "max-concurrent-downloads": 2
}
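After editing the file, validate the JSON and restart the Docker daemon so the new limits take effect; a minimal sketch, assuming systemd manages the Docker service:

# confirm the file is valid JSON before restarting
python3 -m json.tool /etc/docker/daemon.json

# restart the daemon to apply the new limits
sudo systemctl restart docker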

Troubleshooting #5: Insufficient Server Capacity

Ultimately, if you have exhausted all troubleshooting options, it may be that the server's performance or capacity is reaching its limits and an upgrade is necessary, particularly for a high-activity GitLab server.

Begin by upgrading the Disk and Network performance, as these are crucial for fast and efficient registry operations, or consider separating the registry server from GitLab itself.

GitLab container registry administration | GitLab
GitLab product documentation.

Conclusion

Managing GitLab and a Docker registry can be hard, but hopefully this guide will help you troubleshoot and fix image push issues more easily. An efficient and stable deployment will make your time as a DevOps engineer easier.

]]>
<![CDATA[Removing a Stale Node in Self-hosted Streamed Netdata]]>https://gemawardian.com/removing-a-stale-node-in-self-hosted-streamed-netdata/670cd7bc3b36bf0ddc90740aWed, 16 Oct 2024 03:30:31 GMT

Netdata is a great platform for monitoring your infrastructure. It is lightweight, feature-complete, and easy to set up and use. I have used it for personal and work use for a long time and love it.

One day, I removed an unused server with netdata installed in a parent-child streaming configuration. I was stuck with a single node that could not be removed from the UI.

An unused, stale node in Netdata

In Netdata Cloud, there's an option to remove a stale node from the UI by accessing the Space settings page, and clicking the remove node button on the stale nodes.

Node setting page, only visible on Netdata Cloud

Unfortunately, due to cost and compliance requirements, we decided on a self-hosted Netdata configuration with one Netdata parent node and multiple child nodes installed on an internal network with limited external access.

Method #1: netdatacli remove-stale-node

This method was introduced in the latest version of Netdata, allowing the removal of a stale node with one single command. To use this feature you need to know your Netdata node ID by accessing the UI and clicking the Node Information tab.


After clicking the node information (i) button, a panel should open on the right. Click the "View node info in JSON" button. A notification stating "JSON copied to clipboard" will appear, and you can paste the JSON into a text editor.

Netdata node JSON value

On your Netdata parent server, execute the command below:

netdatacli remove-stale-node [your_node_id]
Response of remove-stale-node command

After running the command and restarting the Netdata service, the stale node should be removed from the UI; however, if you were unable to delete the node using this method due to an older version or an unexpected issue, you can proceed with the second method.

Method #2: Remove the node via the internal database

This method requires accessing the internal SQLite database in the Netdata cache folder; open the metadata file and list the hosts with the commands below.

# access the internal database
sudo sqlite3 /var/cache/netdata/netdata-meta.db

# get host list from metadata
SELECT quote(host_id), hostname FROM host;
Host list data from SQLite

After running the query, you will see your connected node list, including the internal host_id value, which you must save. To begin removing the stale node, execute the query below in the SQLite prompt.

delete from host where host_id = X'A13A087E55D511EE8769577D69007174';
delete from host_label where host_id = X'A13A087E55D511EE8769577D69007174';
delete from host_info where host_id = X'A13A087E55D511EE8769577D69007174';
delete from node_instance where host_id = X'A13A087E55D511EE8769577D69007174';
delete from chart where host_id = X'A13A087E55D511EE8769577D69007174';

Queries for node removal; in this case, the node (staging-2) will be removed
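Before running the deletes, you can cross-check that the blob you are about to remove matches the node ID shown in the UI. As far as I can tell, the blob is simply the same UUID with the dashes stripped and the hex uppercased, but treat that mapping as an assumption; a quick shell sketch:

# hypothetical node ID copied from the UI JSON; prints A13A087E55D511EE8769577D69007174
echo 'a13a087e-55d5-11ee-8769-577d69007174' | tr -d '-' | tr 'a-f' 'A-F'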

sudo service netdata restart

Restart Netdata service after removing node from meta database

That's it; you've successfully removed the stale node from the parent server, and perhaps this advice will be useful if you're running the same configuration as I am.

]]>
<![CDATA[Your Guide to Troubleshooting Slow Linux Servers]]>https://gemawardian.com/your-guide-to-troubleshooting-slow-linux-servers/6480400ba5ebe2242a7d38efWed, 09 Oct 2024 07:26:58 GMT

You know the dreaded "Slow website, please fix." message that occasionally arrives with no further information from the users. While it's easy to blame the user's network connectivity or device performance, we need to know whether something is wrong with the servers.

In this guide, we will walk through the troubleshooting steps for server performance or stability issues, even with little to no information provided by the user or client.

Try accessing it yourself.

The easiest way to confirm a slow-server complaint is to try accessing the site yourself; it helps to know whether the server is unresponsive for everyone or only for certain users.

Sometimes the slowness only affects certain functionality on your site, such as logging in or accessing the dashboard, while the front page still loads relatively quickly. Especially if page caching is enabled, this helps confirm that the issue comes from accessing or processing uncached content or data.

Also, if SSH is enabled on your server, try logging in and checking how responsive the shell is, or whether you can log in at all.

The ssh command includes a verbose debug mode to help you diagnose timeout issues, and it is a helpful first step for debugging server problems.

$ ssh -vvv username@hostname

The magical power of rebooting.

Sometimes the easiest fix is the best solution. Rebooting solves a lot of problems with minimal effort, and it is also a useful diagnostic step: if the performance problems persist after you reboot the server, you know the cause is more than a transient glitch and you can keep digging.

Check the hardware usage.

Compute, memory, storage, and networking are the backbone of a working server, and one or more of them could be part of the issue affecting your server.

  • A slow or overwhelmed CPU could cause a process to wait a long time before finishing a task.
  • Full memory could cause a process to be killed before finishing a task or, if swap is enabled, slow the task down through slow disk I/O.
  • Slow or overwhelmed storage use could impact database read/write performance, and slow down data access considerably.
  • Significant network latency or slow throughput could cause a considerable delay on the end user and impact overall responsiveness.

Start checking usage with top.

To monitor CPU and memory usage, we will use top, a utility that is included by default in most Linux distros.

Screenshot of top, running on Ubuntu-based server.

At a glance, a few things stand out after executing top:

  • Server uptime
  • Load average in 1, 5, and 15 minutes
  • Number of tasks running including idle and zombie processes
  • CPU usage in percentage for user, system, nice, idle, waiting i/o, hardware interrupts, software interrupts, and steal
  • Memory and swap usage
  • List of currently running tasks and processes

Understanding CPU usage.

  • User (us) shows the CPU time consumed by user-space programs (like web servers and databases). Most of the time, high utilization is completely typical; however, if user usage is high for an extended time, your application is likely being slowed down by the CPU.
  • System (sy) use refers to the CPU time allotted to the kernel when user processes ask it to do a task, such as allocating memory or starting a child process. This amount should be as low as possible, and if it spikes frequently during a certain period, you might want to evaluate how many processes are spawning and how memory and storage are being used to see whether the CPU might need to allocate more resources.
  • Nice (ni) refers to how much CPU time is spent running user processes that have been niced. If this value is high because your production application runs at a lower priority, you may need to raise its priority or run it at the default priority.
  • Idle (id) is simply the time the CPU spends idling; when there is little load, this value should be high.
  • Iowait (wa) corresponds to time spent waiting on input and output, such as reading or writing data to storage. When the value is high, it indicates that the CPU is waiting for the disk to complete its task. If the server still uses a conventional hard drive, you may need to improve disk IOPS or move to faster storage such as an SSD.
  • Hardware and Software Interrupts (hi & si) show how much time the processor has spent servicing interrupts. Hardware interrupts are physical interrupts sent to the CPU from various peripherals like disks and network interfaces. Software interrupts come from processes running on the system. High hardware interrupts could be caused by faulty hardware or processes that cause a lot of software interrupts.
  • Steal (st) utilization is the time the CPU waits for the hypervisor to complete serving another virtual CPU. When the steal value is too high, which only pertains to virtual machines, the host system that is running the hypervisor is overloaded. Check the other virtual machines that are currently using the hypervisor, and/or move your virtual machine to a different host, if at all possible.

Memory

After the CPU, the next obvious thing to troubleshoot is memory use; full memory will make your server unresponsive or even inaccessible.

free -h

Execute the command above to view usage for both memory and swap.

Screenshot of memory and swap usage

If you look at the screenshot above, you may think that we only have around 212 MB left of our 2 GB of memory, but this is quite normal; what we need to watch are the buff/cache and available columns.

First, the buff/cache column shows memory used by kernel buffers and the page cache when the kernel performs I/O operations on disk; the higher the value, the more files and metadata are cached in RAM.

Second, the available column shows how much memory can be used by applications without resorting to swap; if available memory is low, applications and the server could slow down considerably.

Memory (Swap)

While it's tempting to disable and remove swap, in production swap is useful as a protection from OOM issues.

Swap means applications are using the disk as an alternative to RAM, which is much slower. When swap usage increases, performance suffers, so this value needs to be kept as low as possible; make sure server memory is adequate for your applications.

There's a useful command that you can use to check which applications are using swap.

for file in /proc/*/status; do awk '/VmSwap|Name/{printf $2 " " $3}END{ print ""}' $file; done | sort -k 2 -n -r | less
Application swap usage

The screenshot shows that MariaDB swap usage is quite high, and databases that swap are especially prone to timeouts and poor performance. Ensure your important applications have adequate memory, especially in a high-transaction environment.

Storage

Storage performance is paramount; slow storage can cause a lot of issues, including freezes, lag, and long load times. In troubleshooting storage problems, the first thing to check is how much storage is used. Sometimes the main culprit of a slow or unstable server is the system or an app being unable to create a new file because storage is full.

error: No space left on device

The usual culprit of storage issues.

That message could mean multiple things: a full storage partition, insufficient inodes, or disk corruption. To check how much storage is used, the simple thing is to execute the command below:

df -h

df stands for disk free, and one of the commands you should remember as a sysadmin.

Screenshot of df, showing 70% of disk usage.

In most Linux installations, especially on VMs, you only need to look at how much the root partition uses; in this case, it's using 70% of the total available storage with 18GB left. When it's full, the server still works, but you cannot create new files or even use simple things like tab autocompletion in bash.
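When the root partition is nearly full, du can point at the directories eating the space; a minimal sketch (adjust the path as needed):

# show the largest top-level directories on the root filesystem, biggest last
sudo du -xh --max-depth=1 / 2>/dev/null | sort -h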

The next topic is inodes. An inode, or index node, is the filesystem structure that stores a file's metadata, and every file and folder has one. On most filesystems the number of inodes is fixed when the filesystem is created, so if your server generates a lot of small files over time, it may eventually use up all the inodes.

If you find that disk space is still sufficient but you are still encountering a "no space left" error, it's best to check whether the inodes are full.

df -ih

Command for checking inodes, just add -i to the df command arguments

Inode usage, showing 3% usage on the root filesystem

After determining that storage and inode usage are fine, the next thing to check is which applications are stressing the disk and causing heavy I/O load, using iotop.

iotop
Screenshot of iotop

With iotop, you can see which processes are currently reading and writing to the disk, with the total disk read/write shown at the top of the column list. When applications are using storage heavily, it could be the cause of slow performance, especially when the server is still running on a spinning hard disk.

Network Latency

The last hardware topic we are going to discuss is networking; a slow or congested network can create a bad experience for the end user, especially if the server is exposed to the internet.

There are a lot of ways to check network performance; the first and best-known step is to ping to and from the server to determine network latency.

# ping to server ip address
ping 192.168.0.133

# ping from the server to global internet or other local server
ping google.com
Screenshot of ping command on google.com

The value to watch is the response time; as shown in the screenshot above, it stays at a consistently good 29-30ms. This value should be as low and consistent as possible; large spikes or inconsistent response times point to network instability.

The next useful tool for diagnosing slow network response times is the traceroute command; it shows more granular detail on the path that traffic takes to its destination.

traceroute google.com
Screenshot of traceroute command on google.com

Network Utilization

We can monitor network bandwidth usage with the iftop utility, which counts the network packets passing through a network interface. With this tool, you can check active connections and current network utilization.
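Note that iftop (and the iotop used earlier) are usually not preinstalled; on Debian/Ubuntu-based systems, something like this installs them:

sudo apt install iotop iftop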

Screenshot of iftop command tool

Wrapping up

Being an administrator, particularly one working on servers, is exhausting and requires a lot of knowledge and hands-on experience. Troubleshooting is part of the job, and it isn't always fun and can be frustrating, but I hope these tips and pointers help you solve problems and improve your server performance.

]]>
<![CDATA[Improving PHP performance with OPcache]]>https://gemawardian.com/improving-php-performance-with-opcache/63918eaf0bfb6f91b52eae47Wed, 08 Feb 2023 06:35:45 GMT

OPcache is a PHP extension that allows opcode caching and optimization by storing the precompiled script bytecode in memory, eliminating the need to read the code from disk and improving performance.

From PHP version 5.5 and beyond, the extension is included by default; in this article, I'll explain how to install and set up the extension to enhance PHP performance.

Installation

In this guide, I am using Ubuntu 22.04 and following the LEMP stack guide I posted here:

A Comprehensive Guide to setting up LEMP Stack (Linux, Nginx, MariaDB, and PHP) in Ubuntu 20.04 LTS.
There’s a lot of guides for setting up the LAMP/LEMP stack on the web, and most of them are scattered and sometimes only provide the default config without any optimization. Here I am going to tell you how to install and configure the software that is production-ready and optimized.

Before we install the opcache, please make sure your repositories are already updated by executing the following command:

sudo apt update && sudo apt upgrade

By default, OPcache should already be installed along with the PHP package, but to make sure, you can execute one of the commands below to mark the package as explicitly installed.

# for default installation
sudo apt install php-opcache

# for specific php version (7.4 and 8.1)
sudo apt install php7.4-opcache
sudo apt install php8.1-opcache

You can check the PHP version and whether OPcache is loaded by executing the php -v command:

nerdv2@nerdLaptop:~$ php -v
PHP 8.1.2-1ubuntu2.10 (cli) (built: Jan 16 2023 15:19:49) (NTS)
Copyright (c) The PHP Group
Zend Engine v4.1.2, Copyright (c) Zend Technologies
    with Zend OPcache v8.1.2-1ubuntu2.10, Copyright (c), by Zend Technologies

Configuring OPCache

To configure and enable opcache, we need to modify the default settings within the php.ini configuration file which you could find in the path below.

# for apache
/etc/php/7.4/apache2/php.ini

# for fpm and nginx
/etc/php/7.4/fpm/php.ini
PHP.ini file location

When accessing the file, modify the opcache configuration value to match the recommended configuration for production as explained in the official documentation.

opcache.enable=1
opcache.memory_consumption=128
opcache.interned_strings_buffer=8
opcache.max_accelerated_files=4000
opcache.revalidate_freq=60
opcache.fast_shutdown=1
opcache.save_comments=0

After modifying the configuration file, restart php to apply the configuration

# For apache:
sudo service apache2 restart

# For fpm and nginx (change the version to match your install):
sudo service php7.4-fpm restart
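To confirm the extension is loaded and to see the active directives without writing a phpinfo() page, the CLI can print the extension's info block (note that the CLI and FPM read separate php.ini files, so this reflects the CLI configuration):

php --ri "Zend OPcache"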

Benchmarking Result

The benchmark was conducted on a work project site using the siege load-testing tool with the following command, and it ran for approximately 10 minutes.

siege -v -c 25 -r 10 https://target_url/
Siege benchmark results

As you can see, I reduced the average response time of my site by 40% by using opcache.

Although your results may differ based on the kind of projects, server specifications, and other dependencies you have, as you can see from the result above, it's still worthwhile to give it a try.

Managing OPcache

I advise using the CacheTool command-line tool to help manage OPcache. You can install it using the commands below, adjusting the version to one compatible with your system.

CacheTool compatibility matrix.
curl -sLO https://github.com/gordalina/cachetool/releases/download/7.0.0/cachetool.phar
chmod +x cachetool.phar
sudo mv cachetool.phar /usr/local/bin/cachetool

Test opcache, by executing the command:

sudo cachetool opcache:status

The most useful cachetool feature is the ability to flush the OPcache contents without restarting the PHP server.

sudo cachetool opcache:reset

Conclusion

After following this guide, we have shown that OPcache can improve your PHP web performance, which will make your users more satisfied.

]]>
<![CDATA[How to setup MongoDB Replica Set to Improve Redundancy and Availability]]>Implementing a replication strategy is one technique to increase the availability of your website, as services and websites are frequently unavailable due to a single point of failure brought on by crashes, OOM circumstances, etc.

MongoDB is no exception; many users, including mine, use this document-oriented database. I've

]]>
https://gemawardian.com/introduction-to-mongodb-replication-and-replica-set/637741bb52439ab6c3e46ed1Thu, 24 Nov 2022 03:45:07 GMT

Implementing a replication strategy is one technique to increase the availability of your website, as services and websites are frequently unavailable due to a single point of failure brought on by crashes, OOM circumstances, etc.

MongoDB is no exception; many users, including me, rely on this document-oriented database. I've often observed many services running on a single MongoDB instance, so chaos ensues when it fails.

Thankfully, one of the best things about MongoDB is how easy it is to set up replication, and in this article, I'll guide you through setting up and managing a MongoDB replica set.

To start, we need to plan how many MongoDB instances are needed for the replica set. In most basic setups and deployments, 3 members are sufficient to provide enough redundancy against network or system failures.

A replication strategy that we are going to deploy (Courtesy of MongoDB)

In my configuration, I ended up installing MongoDB on 3 different VMs with the following network configuration:

  • mongo-instance-1: 192.168.20.1
  • mongo-instance-2: 192.168.20.2
  • mongo-instance-3: 192.168.20.3

I am not including the installation process, but you could easily install each MongoDB by following the guide here:

Install MongoDB Community Edition on Ubuntu

Network Preparation

Since each instance of MongoDB needs to communicate with the others, we need to change the default network IP binding. To do this, enter the value below in your mongod config file:

# If the bindIp are listening to localhost, change the value
# to your hostname or IPs, but to make thing easier 
# we are going to set the IP binding to listen on all
net:
  port: 27017
  bindIp: 0.0.0.0

Don't forget to restart the MongoDB services.

sudo service mongod restart

One of the things MongoDB recommends when deploying a replica set is using hostnames for each instance, to prevent configuration issues when a host IP needs to change. This can be accomplished by adding the entries to the /etc/hosts file:

ubuntu@mongo-instance-1:~$ cat /etc/hosts
127.0.0.1 localhost
192.168.20.1 mongo-instance-1
192.168.20.2 mongo-instance-2
192.168.20.3 mongo-instance-3
Don't forget to add the values to all of the instances.

Make sure every server can connect to the others and that the required port, which defaults to 27017, is open between them.

# connect to each instance with the mongo shell to check connectivity
ubuntu@mongo-instance-1:~$ mongo --host mongo-instance-2
ubuntu@mongo-instance-1:~$ mongo --host mongo-instance-3

ubuntu@mongo-instance-2:~$ mongo --host mongo-instance-1
ubuntu@mongo-instance-2:~$ mongo --host mongo-instance-3

ubuntu@mongo-instance-3:~$ mongo --host mongo-instance-1
ubuntu@mongo-instance-3:~$ mongo --host mongo-instance-2

You might need to verify the hosts file configuration, the MongoDB service status, the instance network settings, or the firewall configuration if one of the commands doesn't work.
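If a firewall turns out to be the problem, allowing the MongoDB port between the instances usually solves it; a minimal sketch using ufw, assuming the 192.168.20.0/24 network from this guide:

# run on each instance; restrict the source to the replica set subnet
sudo ufw allow from 192.168.20.0/24 to any port 27017 proto tcp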

Generate Keyfile for Authentication

We'll create a keyfile and set it up to enable internal authentication between the MongoDB servers.

First, generate a Keyfile using the following commands:

ubuntu@mongo-instance-1:~$ openssl rand -base64 756 | sudo tee /etc/mongo-keyfile > /dev/null
ubuntu@mongo-instance-1:~$ sudo chown mongodb:mongodb /etc/mongo-keyfile
ubuntu@mongo-instance-1:~$ sudo chmod 400 /etc/mongo-keyfile

Next, copy the generated keyfile to each instance, and make sure the path, ownership, and file permissions are the same as on mongo-instance-1.
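How you copy it is up to you; here is a minimal sketch over SSH, assuming the ubuntu user has passwordless sudo on each instance (repeat for mongo-instance-3):

sudo cat /etc/mongo-keyfile | ssh ubuntu@mongo-instance-2 'sudo tee /etc/mongo-keyfile > /dev/null && sudo chown mongodb:mongodb /etc/mongo-keyfile && sudo chmod 400 /etc/mongo-keyfile'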

After copying the keyfile, edit each instance's MongoDB configuration to enable keyfile authentication:

security:
  keyFile: /etc/mongo-keyfile

Enabling the Replica Set

Now that the networking and authentication are already set up, we need to add the following line within the configuration to enable the replica set.

# rename the replSetName to your desired replica set name.
replication:
  replSetName: main-replica

After setting up the replica set name, restart each MongoDB service to enable the configuration.

Next, we need to initiate the Replica Set by accessing the mongo command line utility and executing the following command:

rs.initiate(
  {
    _id : "main-replica",
    members: [
      { _id : 0, host : "mongo-instance-1:27017" },
      { _id : 1, host : "mongo-instance-2:27017" },
      { _id : 2, host : "mongo-instance-3:27017" }
    ]
  }
)
Execute the command within the mongo command line utility

After executing the command, your prompt should be looking like this:

mongo-instance-1:PRIMARY>

And connecting to another MongoDB instance will result in also looking like this:

mongo-instance-2:SECONDARY>

Depending on the outcome of the election, a different PRIMARY or SECONDARY instance may be used, but all subsequent actions must be carried out within the PRIMARY instance.
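If you are ever unsure which member is currently the primary, rs.status() reports each member's state; run this from the mongo shell on any member:

rs.status().members.forEach(function (m) { print(m.name, m.stateStr) })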

Creating the Administrator User

When deploying the database to production, we must enable access control, which requires creating an administrator user first.

Please keep in mind that after you create the administrator user, localhost authentication exceptions are no longer available, and you must enter a username and password each time you connect to the database.

Run the queries below in the PRIMARY instance, and don't forget to set the password:

admin = db.getSiblingDB("admin")
admin.createUser(
  {
    user: "dba",
    pwd: 'put-your-password-here',
    roles: [ { role: "userAdminAnyDatabase", db: "admin" } ]
  }
)

Try using the command line to log in after creating the account:

mongo -u dba -p put-your-password-here

If you are successfully logged in, then congrats, authentication is configured correctly.
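For applications, you will typically connect with a URI that lists all members and names the replica set, so the driver can follow failover automatically; a sketch using the hostnames and user from this guide (in practice you would create a less-privileged application user rather than reusing the dba account):

mongodb://dba:put-your-password-here@mongo-instance-1:27017,mongo-instance-2:27017,mongo-instance-3:27017/admin?replicaSet=main-replica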

Testing the Replica Set

After setting it up, you can view the replica set configuration by using the following command:

rs.conf()

It should return a response like the following:

mongo-instance-1:PRIMARY> rs.conf()
{
	"_id" : "main-replica",
	"version" : 1,
	"term" : 16,
	"members" : [
		{
			"_id" : 0,
			"host" : "mongo-instance-1:27017",
			"arbiterOnly" : false,
			"buildIndexes" : true,
			"hidden" : false,
			"priority" : 1,
			"tags" : {
				
			},
			"secondaryDelaySecs" : NumberLong(0),
			"votes" : 1
		},
		{
			"_id" : 1,
			"host" : "mongo-instance-2:27017",
			"arbiterOnly" : false,
			"buildIndexes" : true,
			"hidden" : false,
			"priority" : 1,
			"tags" : {
				
			},
			"secondaryDelaySecs" : NumberLong(0),
			"votes" : 1
		},
		{
			"_id" : 2,
			"host" : "mongo-instance-3:27017",
			"arbiterOnly" : false,
			"buildIndexes" : true,
			"hidden" : false,
			"priority" : 1,
			"tags" : {
				
			},
			"secondaryDelaySecs" : NumberLong(0),
			"votes" : 1
		}
	],
	"protocolVersion" : NumberLong(1),
	"writeConcernMajorityJournalDefault" : true,
	"settings" : {
		"chainingAllowed" : true,
		"heartbeatIntervalMillis" : 2000,
		"heartbeatTimeoutSecs" : 10,
		"electionTimeoutMillis" : 10000,
		"catchUpTimeoutMillis" : -1,
		"catchUpTakeoverDelayMillis" : 30000,
		"getLastErrorModes" : {
			
		},
		"getLastErrorDefaults" : {
			"w" : 1,
			"wtimeout" : 0
		},
		"replicaSetId" : ObjectId("6306e97e468ec52991273d45")
	}
}

If you receive a response like this, congratulations, you have successfully configured a replica set.

]]>
<![CDATA[Experimenting with Encrypted Client Hello (ECH)]]>One day I was reading the news and come across an article that China has blocked all ESNI-related traffic because it serves as an effective tool for censorship circumvention it made me intrigued because of the way it was blocked.

Normally, their censorship relies on filters like Transparent Proxies, DPI,

]]>
https://gemawardian.com/experimenting-with-encrypted-client-hello-ech/6360ea1aee3b62f72b2afbf6Tue, 08 Nov 2022 08:03:21 GMT

One day I was reading the news and came across an article reporting that China had blocked all ESNI-related traffic because it serves as an effective tool for censorship circumvention. What intrigued me was the way it was blocked.

Normally, their censorship relies on filters like transparent proxies, DPI, and similar techniques, but this case is more severe because it simply blackholes all traffic using TLSv1.3 with ESNI, which means the protocol works as intended by making it hard to snoop on the SNI value.

China now blocking ESNI-enabled TLS 1.3 connections, say Great-Firewall-watchers
And needs a very blunt instrument to do the job, because the protocol works as planned

I live in Indonesia, where the government similarly imposed censorship using techniques like reading SNI values, keyword matching, and deep packet inspection.
Although I can see some of the justifications for censoring, I'm always uneasy about the fact that your ISP and the government can see what you're currently browsing.

The dreaded Internet Positif webpage.

Although I almost always use VPNs, they occasionally make my work more difficult, either due to slow speeds or overzealous firewalls that block connections to almost all VPN servers.

This brings us back to ECH, which takes the place of Encrypted SNI as a solution to plug some of the privacy gaps in TLS and SNI.

Browser Support

Mozilla champions ECH along with Cloudflare and others, so ECH has been well integrated into Firefox since version 85.

Encrypted Client Hello: the future of ESNI in Firefox – Mozilla Security Blog
Background Two years ago, we announced experimental support for the privacy-protecting Encrypted Server Name Indication (ESNI) extension in Firefox Nightly. The Server Name Indication (SNI) TLS extension enables server and ...

It is supported in Chrome, Edge, and other Chromium-based browsers starting with version 105; however, the feature is still only partially supported in most of them, so your mileage may vary.

1091403 - chromium - An open-source project to help move the web forward. - Monorail
Chromium bug feature tracker for ECH
You can now Enable Encrypted Client Hello (Encrypted SNI or ESNI/ECH) in Microsoft Edge
How to enable Encrypted Client Hello (ECH) in Microsoft Edge version 105 and above. Right-click on desktop shortcut of Edge browser, select properties and add this at the end of the target: --enable-features=EncryptedClientHello so that it will look like this: (there is a space before --) prefe…

Setting up ECH

Since Firefox was the browser I used on a regular basis, I'll be using it in this tutorial.

Before we begin, we'll use this convenient page created by Tolerant Networks as a baseline to verify whether ECH is enabled.

https://defo.ie/ech-check.php
The ECH check page, before enabling ECH

As you can see, the browser wasn't attempting to encrypt the ClientHello message.

To fix that, we'll make sure to enable DNS-over-HTTPS (DoH) first by accessing the settings menu.

Settings -> General -> Network Settings


We will set the provider to Cloudflare, as they are one of the major operators of ECH-enabled servers and networks.

Next, to enable ECH we need to change two config variables by accessing the about:config page.

network.dns.echconfig.enabled = true
network.dns.use_https_rr_as_altsvc = true

After changing both values, I restarted my browser to make sure the configuration was applied and re-checked the ECH check page.

The ECH check page, after enabling ECH

Voila! The browser is now attempting to encrypt the SNI value.
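You can also check from the command line whether a domain publishes an ECH configuration, since the ECH keys are distributed through DNS HTTPS records (the same records the use_https_rr_as_altsvc setting relies on); a quick sketch, using a host known to publish one (newer versions of dig understand the HTTPS type, older ones need TYPE65):

# look for an "ech=..." field in the output
dig +short crypto.cloudflare.com HTTPS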

One of the pages I tried accessing is The Pirate Bay, which, for those who don't know, mostly hosts pirated content and has been blocked and censored in Indonesia.

Not blocked anymore.

Conclusion

Although it isn't a privacy cure-all, ECH helps address many problems with TLS in an age where governments leverage its weaknesses to spy on people, gather user data, and impose censorship.

Although it won't soon replace my VPN, it is one of the more effective solutions for enhancing user privacy, and I can't wait for it to be used more widely.

Read More about ECH

TLS Encrypted Client Hello
This document describes a mechanism in Transport Layer Security (TLS) for encrypting a ClientHello message under a server public key. Discussion Venues This note is to be removed before publishing as an RFC. Source for this draft and an issue tracker can be found at https://github.com/tlswg/draft-ie…
]]>
<![CDATA[Setup Monitoring with Prometheus, Grafana, and node_exporter.]]>Server uptime and stability are pillars of a successful infrastructure design, and downtime is always painful if you spend time working on IT infrastructure. Monitoring is one way to prevent anomalies and help keep your servers running optimally.

There are many options to monitor your servers, ranging from one-script setup-ready

]]>
https://gemawardian.com/setting-up-monitoring-with-prometheus-grafana-and-node_exporter/62c6ac2dc93684a6471f45edFri, 09 Sep 2022 18:15:36 GMT

Server uptime and stability are pillars of a successful infrastructure design, and downtime is always painful if you spend time working on IT infrastructure. Monitoring is one way to prevent anomalies and help keep your servers running optimally.

There are many options for monitoring your servers, ranging from one-script, ready-to-run solutions like Netdata to more complex options dedicated to monitoring your web servers and databases, or even file-integrity monitoring with something like OSSEC.

But, for now, we will set up a monitoring stack consisting of Prometheus, Grafana, and node_exporter.

We chose this combination because it is among the most widely supported and popular monitoring stacks, and I use it daily at work to monitor hundreds of endpoints.

Thinking Ahead

In this tutorial, we are going to set up monitoring for 2 nodes (a web server and a database), resulting in a setup that looks like this:

Diagram of the monitoring setup: Prometheus and Grafana on the monitoring node, scraping node_exporter on the web server and database nodes.

I'll keep this tutorial as production-ready as possible, and we will use Ubuntu 22.04 LTS for every node.

Setting up Prometheus

After connecting to your monitoring instance, we need to get the package for installation.
Use the following command below, or check the Prometheus website for the latest version.

wget https://github.com/prometheus/prometheus/releases/download/v2.37.0/prometheus-2.37.0.linux-amd64.tar.gz

After downloading the package, extract the packages using the following command:

tar xvf prometheus*.tar.gz

Navigate to the extracted folder, and move the executable to the bin directory with the following command:

cd prometheus*/
sudo mv prometheus promtool /usr/local/bin/

Validate the installed version by executing prometheus --version and the result should look like this:

Output of prometheus --version

To run Prometheus as a service, we will first create a dedicated system user and group for it:

sudo groupadd --system prometheus
sudo useradd -s /sbin/nologin --system -g prometheus prometheus

Next, we will create folders for Prometheus to store its configuration and time-series data, move the bundled files into place, and hand ownership to the new user (the chown has to come after the user exists):

sudo mkdir /etc/prometheus /var/lib/prometheus/
sudo mv prometheus.yml /etc/prometheus/prometheus.yml
sudo mv consoles/ console_libraries/ /etc/prometheus/
sudo chown -R prometheus:prometheus /etc/prometheus /var/lib/prometheus/

With the user and directories in place, let's create a new systemd service:

sudo nano /etc/systemd/system/prometheus.service

Put the following code inside the file:

[Unit]
Description=Prometheus
Documentation=https://prometheus.io/docs/introduction/overview/
Wants=network-online.target
After=network-online.target
[Service]
User=prometheus
Group=prometheus
Type=simple
ExecStart=/usr/local/bin/prometheus \
--config.file /etc/prometheus/prometheus.yml \
--storage.tsdb.path /var/lib/prometheus/ \
--web.console.templates=/etc/prometheus/consoles \
--web.console.libraries=/etc/prometheus/console_libraries

[Install]
WantedBy=multi-user.target

Reload systemd and enable the service:

sudo systemctl daemon-reload
sudo systemctl enable --now prometheus

Validate the service is running by executing the following command:

sudo service prometheus status
Output of the Prometheus service status

If the output shows that the service is active, congrats, you have set up Prometheus, and now we'll move on to the next step.

Setting up Grafana

We start installing Grafana by adding the repository's GPG signing key, executing the following command:

wget -q -O - https://packages.grafana.com/gpg.key | sudo apt-key add -

After that, we add the following stable repositories:

echo "deb https://packages.grafana.com/oss/deb stable main" | sudo tee -a /etc/apt/sources.list.d/grafana.list

Update the repository and start installing Grafana:

sudo apt update && sudo apt install grafana

Enable the service at startup and start it:

sudo systemctl enable --now grafana-server.service

Setting up node_exporter

Follow this part of the guide on each instance you want to monitor; we start by downloading the required release:

wget https://github.com/prometheus/node_exporter/releases/download/v1.3.1/node_exporter-1.3.1.linux-amd64.tar.gz

Extract the downloaded file:

tar -xvf node_exporter-1.3.1.linux-amd64.tar.gz && cd node_exporter-1.3.1.linux-amd64

Move the binary into place to install it (the release binary is already executable):

sudo mv node_exporter /usr/bin/

Create an empty configuration file for us to configure later:

sudo touch /etc/node_exporter.conf

Create a specific user for node_exporter to run as:

sudo groupadd --system node_exporter
sudo useradd -s /sbin/nologin --system -g node_exporter node_exporter

Create a new service file for node_exporter:

sudo nano /etc/systemd/system/node-exporter.service

Put the following code inside the file:

[Unit]
Description=Node Exporter

[Service]
User=node_exporter
Group=node_exporter
EnvironmentFile=/etc/node_exporter.conf
ExecStart=/usr/bin/node_exporter $OPTIONS

[Install]
WantedBy=multi-user.target

Enable the service:

sudo systemctl enable --now node-exporter

Check that the service started; you can then access the metrics endpoint at:

http://your_server_name_or_ip:9100/metrics
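From the node itself, a quick curl confirms that the exporter is serving metrics before you wire it into Prometheus:

curl -s http://localhost:9100/metrics | head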

Configuring Prometheus with node_exporter

On your monitoring server, we are going to point Prometheus at the node_exporter targets. First, open the configuration file:

sudo nano /etc/prometheus/prometheus.yml

Add the following code to the configuration, and don't forget to replace the webserver and database IP address:

scrape_configs:
  - job_name: "prometheus"

    static_configs:
      - targets: ["localhost:9090"]

  - job_name: "node"
    static_configs:
      - targets: ['webserver_ip:9100', 'database_ip:9100']
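Before restarting, you can sanity-check the edited file with promtool, which was installed alongside Prometheus earlier:

promtool check config /etc/prometheus/prometheus.yml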

Restart the Prometheus instance:

sudo service prometheus restart

After restarting it, you can check whether the node exporter targets are up and healthy by accessing:

http://monitoring_instance_ip:9090/targets

Setting up Grafana dashboard

After setting up the entire monitoring stack and its configuration, we finalize our setup by installing a dashboard to monitor your server at a glance.

First, let's login to Grafana by accessing the dashboard:

http://monitoring_instance_ip:3000/

Login with its default credential:

username: admin
password: admin

After logging in, you may be prompted to change the default password; set a new password and continue to the next step.

In the panel go to Configuration -> Data Sources and add a new Prometheus source:

Adding Prometheus as a data source

Continue by importing the Node Exporter Full dashboard created by Ricardo F; we import it by accessing the Dashboard -> + Import page.


Enter 1860 as the dashboard ID and click Load.

After importing the dashboard, you should be redirected to a shiny new panel showing your infrastructure metrics.


Ta-da!
Congrats, you've learned how to set up a Prometheus monitoring stack.

]]>
<![CDATA[How to setup catch-all wildcard subdomain in Nginx]]>Let's say you have a problem. You are serving a SaaS-based software with each client needing to have their own subdomain setup. The easiest way is to point all of their DNS to the servers, right?

But how about your VirtualHost setup? Do I need to set up

]]>
https://gemawardian.com/wildcard-subdomain-setup-in-nginx/624a760a24d3ec47e92a80a2Thu, 14 Apr 2022 06:49:56 GMT

Let's say you have a problem. You are serving a SaaS-based software with each client needing to have their own subdomain setup. The easiest way is to point all of their DNS to the servers, right?

But how about your VirtualHost setup? Do I need to set up each subdomain's virtual host every time I add a new subdomain?

The answer is absolutely not. We can always use a catch-all wildcard configuration, and in this article, I will show you how to set it up.

The Typical Setup

server {
        listen 80;
        listen [::]:80;
        server_name domain.com www.domain.com;

        root /var/www/domain.com/htdocs/;
        index index.html index.php;

        location / {
                try_files $uri $uri/ /index.php$is_args$args;
        }

        location ~ \.php$ {
                include snippets/fastcgi-php.conf;
                fastcgi_param SERVER_NAME domain.com;
                fastcgi_pass 127.0.0.1:9000;
        }

        access_log /var/log/nginx/access.log;
        error_log /var/log/nginx/error.log;
}
Typical nginx+php virtual host configuration

The code above shows a typical Nginx with PHP configuration. We will focus on a few variables to change, with the final result looking like this:

a.domain.com -> /var/www/domain.com/a/
b.domain.com -> /var/www/domain.com/b/

First, we need to change the server_name variable to accept wildcard subdomains. To do this, change the value with:

server_name	~^(?<subdomain>.*)\.domain\.com$;

With that, Nginx now accepts every subdomain request within this virtual host configuration.
Next, change which directory the document root points to by using the $subdomain variable:

root /var/www/domain.com/$subdomain/;

When using PHP, I sometimes want to read the domain name from the $_SERVER['SERVER_NAME'] superglobal, so I use fastcgi_param to forward it to the application.

To do that, we add a new variable and change the fastcgi_param value.

set $server_name_full $subdomain.domain.com;
fastcgi_param SERVER_NAME $server_name_full;

In the end, your configuration should look like this:

server {
        listen 80;
        listen [::]:80;
        server_name	~^(?<subdomain>.*)\.domain\.com$;
        set $server_name_full $subdomain.domain.com;

        root /var/www/domain.com/$subdomain/;
        index index.html index.php;

        location / {
                try_files $uri $uri/ /index.php$is_args$args;
        }

        location ~ \.php$ {
                include snippets/fastcgi-php.conf;
                fastcgi_param SERVER_NAME $server_name_full;
                fastcgi_pass 127.0.0.1:9000;
        }

        access_log /var/log/nginx/access.log;
        error_log /var/log/nginx/error.log;
}
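Before reloading Nginx, it's worth validating the config and then testing the catch-all behaviour with an arbitrary Host header; a quick sketch (a.domain.com is just a placeholder):

sudo nginx -t && sudo service nginx reload

# request the server locally while pretending to be a.domain.com
curl -H "Host: a.domain.com" http://127.0.0.1/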

That's all you need to get started with setting up the catch-all subdomain config.
Feel free to give feedback or ask for help in the comments!

]]>
<![CDATA[Optimizing APT updates with apt-mirror-updater]]>The story started when I diagnosed my friend's laptop running Ubuntu, and running apt update is getting really slow even with my high-speed internet just getting around 100kbps updating the APT sources.

It turns out the default repository that the laptop uses, is farther than our current location

]]>
https://gemawardian.com/optimizing-apt-updates-with-apt-mirror-updater/6234be75711f7e8bba80f2edSat, 26 Mar 2022 17:09:48 GMT

The story started when I diagnosed my friend's laptop running Ubuntu: running apt update was getting really slow, pulling only around 100kbps while updating the APT sources even on my high-speed internet.

It turns out the default repository the laptop uses is far from our current location, which is Jakarta.

When using Ubuntu on the desktop, you could always use the built-in Select Best Server option within the Software & Updates settings page.

Select the best download server dialog box.

Choosing that option solves the bandwidth issue, but what about doing it from a terminal?
Maybe you are running a server or IoT devices, or you just want to automate the fastest-mirror selection?

One way to solve that is changing the /etc/apt/sources.list manually, but there must be a better solution.

And it seems I found one good choice, using the apt-mirror-updater package provided by Peter Odding.

If you are running a recent Ubuntu version and pip3 is already installed, skip this step.

sudo apt install python3-pip

Otherwise, you could continue installing apt-mirror-updater by running this command.

sudo pip3 install apt-mirror-updater

Then run the following command to start finding the best mirror server.

sudo apt-mirror-updater --auto-change-mirror

After executing the command above, it will start looking for the fastest repository; in my case, it found one on idcloudhost's servers.

That's it, and it only takes around 10 minutes to update Ubuntu from a fresh install.

]]>
<![CDATA[A Comprehensive Guide to setting up LEMP Stack (Linux, Nginx, MariaDB, and PHP) in Ubuntu 20.04 LTS.]]>https://gemawardian.com/a-comprehensive-guide-to-lamp-stack/618a8d50c229fa2f9f49fbf8Sat, 25 Dec 2021 21:21:10 GMT

There are a lot of guides for setting up the LAMP/LEMP stack on the web, and most of them are scattered and sometimes only provide the default config without any optimization. Here I am going to show you how to install and configure the software so that it is production-ready and optimized for daily use.

Let's start with what you need.

  1. A fairly sized VM, or bare-metal installation for good measure.
  2. Freshly installed Ubuntu OS and please, please use LTS releases for your production servers.
  3. Experience in working with command line stuff.
  4. Patience.

In this tutorial, I am going to use a standard cloud VM from Vultr with these specs:
2vCPU, 4GB RAM and 80GB SSD
You could try and set up your Vultr account by clicking here.

Setup the required repositories.

# Enable secure APT downloads
sudo apt install software-properties-common dirmngr apt-transport-https
# For nginx
sudo add-apt-repository ppa:ondrej/nginx-mainline

# For PHP
sudo LC_ALL=C.UTF-8 add-apt-repository ppa:ondrej/php

# For MariaDB
sudo apt-key adv --fetch-keys 'https://mariadb.org/mariadb_release_signing_key.asc'
sudo add-apt-repository 'deb [arch=amd64,arm64,ppc64el] https://mirror.djvg.sg/mariadb/repo/10.6/ubuntu focal main'

As you may notice above, I'm using the excellent PPAs provided by Ondřej Surý.
His Nginx build is already compiled with TLS 1.3 and HTTP/2 support, and his PHP repositories are kept up to date with the latest versions and a complete set of extensions.

For the MariaDB installation, I am using an official mirror based in Singapore; if you want to switch to a mirror closer to you, you can check here and replace the URL.

Update your system.

After setting up the repositories, you should update your system by using the standard:

sudo apt update && sudo apt -y upgrade

Setup all the software.

sudo apt install nginx curl git zip unzip wget mariadb-server mariadb-client haveged

The command above will install a few things:
1. Nginx - the web server we are going to use.
2. curl, git, zip, unzip, wget - utilities required by many PHP extensions and dependencies.
3. MariaDB - the database we are going to use.
4. Haveged - an entropy daemon that helps with cryptographic randomness, especially in low-entropy scenarios like a virtual machine.

Next, we are going to set up PHP and choose your flavor 7.4 or 8.0:

# For PHP 7.4

sudo apt install php7.4-common php7.4-cli php7.4-curl php7.4-fpm php7.4-gd php7.4-gmp php7.4-intl php7.4-json php7.4-mbstring php7.4-mysql php7.4-opcache php7.4-readline php7.4-xml php7.4-zip

# For PHP8
sudo apt install php8.0-common php8.0-cli php8.0-curl php8.0-fpm php8.0-gd php8.0-gmp php8.0-intl php8.0-json php8.0-mbstring php8.0-mysql php8.0-opcache php8.0-readline php8.0-xml php8.0-zip

You can add or remove extensions as you choose; the list above is what I encounter daily when setting up applications for clients.
Follow this link to check the available extension packages.

Initial MariaDB setup.

After installing the packages from the previous step, the best practice is always to run the included security hardening script.

Execute the script:

sudo mysql_secure_installation

Then, follow these steps:

  • Ignore the "Enter current password for root" prompt by pressing Enter. We won't use the root account, because by default root uses socket authentication, while most PHP applications can only connect to the database using password authentication.
  • On the "Set root password?" question just type N and enter.
  • From there, just type Y on the rest of the question to securely implement the best-practice changes.

Next, we are going to create a new administrative account for use with the application.

sudo mariadb
CREATE USER 'administrator'@'localhost' IDENTIFIED BY 'password';
GRANT ALL ON *.* TO 'administrator'@'localhost' WITH GRANT OPTION;

Be sure to change the username and password above, next flush the session privileges.

FLUSH PRIVILEGES;

After you finish that, type exit to return to the terminal, and try the new account by typing:

mysql -u administrator -p

You will be prompted to enter your password, and if you log in successfully, then congrats, MariaDB is now installed and configured.

Configuring Kernel Variables.

It may sound scary, but if you want to achieve high web concurrency you should try to tune the kernel variable at /etc/sysctl.conf

These scripts are based on a few references (which you should read if you are a performance freak!):
- https://www.nginx.com/blog/tuning-nginx/
- https://www.brendangregg.com/blog/2015-03-03/performance-tuning-linux-instances-on-ec2.html
- https://russ.garrett.co.uk/2009/01/01/linux-kernel-tuning/
- https://fasterdata.es.net/host-tuning/linux/

#
# /etc/sysctl.conf - Configuration file for setting system variables
# See /etc/sysctl.d/ for additional system variables.
# See sysctl.conf (5) for information.
#

### KERNEL TUNING ###

# Increase size of file handles and inode cache
fs.file-max = 2097152

# Do less swapping / Virtual memory
vm.swappiness = 10
vm.dirty_ratio = 60
vm.dirty_background_ratio = 2

# Sets the time before the kernel considers migrating a process to another core
kernel.sched_migration_cost_ns = 5000000

# Group tasks by TTY
#kernel.sched_autogroup_enabled = 0

### GENERAL NETWORK SECURITY OPTIONS ###

# Number of times SYNACKs for passive TCP connection.
net.ipv4.tcp_synack_retries = 2

# Allowed local port range
net.ipv4.ip_local_port_range = 2000 65535

# Protect against tcp time-wait assassination hazards, drop RST packets for sockets in the time-wait state
net.ipv4.tcp_rfc1337 = 1

# Helps protect against SYN flood attacks. Only kicks in when net.ipv4.tcp_max_syn_backlog is reached
net.ipv4.tcp_syncookies = 1

# Specify how many seconds to wait for a final FIN packet before the socket is forcibly closed
net.ipv4.tcp_fin_timeout = 10

# With the following settings, your application will detect dead TCP connections after 120 seconds (60s + 10s + 10s + 10s + 10s + 10s + 10s)
net.ipv4.tcp_keepalive_time = 60
net.ipv4.tcp_keepalive_intvl = 10
net.ipv4.tcp_keepalive_probes = 6

### TUNING NETWORK PERFORMANCE ###

# https://www.ibm.com/support/knowledgecenter/en/SSQPD3_2.6.0/com.ibm.wllm.doc/UDPSocketBuffers.html
# On the Linux platform Tx ring buffer overruns can occur when transmission rates approach 1Gbps and the default send socket buffer is greater than 65536.
# It is recommended to set the net.core.wmem_default kernel parameter to no larger than 65536 bytes.
# Transmitting applications can configure the send socket buffer size for InfiniBand, UDP, or TCP protocols independently in a transmit instance.

# Default and Maximum Socket Receive Buffer
net.core.rmem_default = 8388608
net.core.rmem_max = 16777216

# Default and Maximum Socket Send Buffer
net.core.wmem_default = 65536
net.core.wmem_max = 16777216

# Increase the maximum amount of option memory buffers
net.core.optmem_max = 65536

# Increase number of incoming connections
net.core.somaxconn = 4096

# Increase number of incoming connections backlog
net.core.netdev_max_backlog = 100000

# Maximum number of microseconds in one NAPI polling cycle.
# Polling will exit when either netdev_budget_usecs have elapsed during the poll cycle or the number of packets processed reaches netdev_budget.
net.core.netdev_budget = 60000
net.core.netdev_budget_usecs = 6000

# Increase the tcp read and write-buffer-space allocatable
net.ipv4.tcp_rmem = 4096 1048576 2097152
net.ipv4.tcp_wmem = 4096 65536 16777216

# Increase the minimum udp read and write buffer space (default = 4096)
net.ipv4.udp_rmem_min = 8192
net.ipv4.udp_wmem_min = 8192

# Increase the maximum total buffer-space allocatable
# This is measured in units of pages (4096 bytes)
#net.ipv4.tcp_mem = 786432 1048576 26777216
#net.ipv4.udp_mem = 65536 131072 262144

# Make room for more TIME_WAIT sockets due to more clients,
# and allow them to be reused if we run out of sockets
net.ipv4.tcp_max_syn_backlog = 30000
net.ipv4.tcp_max_tw_buckets = 2000000
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_slow_start_after_idle = 0

/etc/sysctl.conf

After changing the values in /etc/sysctl.conf, you can apply them with the following command:

sudo sysctl -p
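To confirm the new values are active, you can read a few of them back; any of the keys from the file above will work:

# print the current value of selected kernel parameters
sysctl vm.swappiness net.core.somaxconn net.ipv4.tcp_fin_timeout net.ipv4.tcp_tw_reuse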

Configure file-max limit for Nginx.

By default, the open-file limit (ulimit -n) for the Nginx process is 1024.

We can change that by modifying the /etc/security/limits.conf file and adding these lines:

* soft nofile 65535
* hard nofile 65535

Update /etc/default/nginx and add this value

ULIMIT="-n 65535"

On a fresh installation, the Nginx service isn't enabled automatically at startup; you can enable and start it by executing the commands below.

sudo systemctl enable nginx
sudo systemctl start nginx

Update /etc/systemd/system/nginx.service or /lib/systemd/system/nginx.service by adding LimitNOFILE=65535 under the [Service] block, then reload systemd and restart Nginx:

sudo systemctl daemon-reload
sudo service nginx restart

After changing the values, you can verify that the new limit is applied by running this command:

# grab the PID of the Nginx master process and read its open-file limit
grep "open files" /proc/$(pgrep -o nginx)/limits | awk '{print $4}'

# The output should be: 65535

Tuning Nginx

Next, let's configure the /etc/nginx/nginx.conf file to start optimizing the default settings based on the configuration below.

user www-data;
worker_processes 2;
worker_rlimit_nofile 65535;
pid /run/nginx.pid;
include /etc/nginx/modules-enabled/*.conf;

events {
	worker_connections 4096;
	use epoll;
	multi_accept on;
}

http {

	##
	# Basic Settings
	##

	sendfile on;
	tcp_nopush on;
	tcp_nodelay on;
	types_hash_max_size 2048;
	server_tokens off;

	# server_names_hash_bucket_size 64;
	# server_name_in_redirect off;

	##
	# Timeout Settings
	##

	client_body_timeout 12;
	client_header_timeout 12;
	keepalive_timeout 15;
	send_timeout 10;

	##
	# Buffers Optimization
	##
    
	client_body_buffer_size 10K;
	client_header_buffer_size 1k;
	client_max_body_size 8m;
	large_client_header_buffers 2 1k;

	include /etc/nginx/mime.types;
	default_type application/octet-stream;

	##
	# SSL Settings
	##

	ssl_protocols TLSv1.2 TLSv1.3; # Dropping SSLv3, ref: POODLE
	ssl_prefer_server_ciphers on;

	##
	# Logging Settings
	##

	access_log /var/log/nginx/access.log;
	error_log /var/log/nginx/error.log;

	##
	# Gzip Settings
	##

	gzip on;
	gzip_disable "MSIE [1-6]\.";

	# gzip_vary on;
	gzip_proxied expired no-cache no-store private auth;
	gzip_comp_level 2;
	gzip_min_length 1000;
	# gzip_buffers 16 8k;
	# gzip_http_version 1.1;
	gzip_types text/plain text/css application/json application/javascript text/xml application/xml application/xml+rss text/javascript;

	##
	# Virtual Host Configs
	##

	include /etc/nginx/conf.d/*.conf;
	include /etc/nginx/sites-enabled/*;
}

nginx.conf configuration

If you compare this with the default config, it changes a few things:

  1. The worker_processes value is changed from auto to 2; as a common practice, this value should match the number of CPU cores you have.
  2. worker_rlimit_nofile corresponds to the open-file limit we configured previously.
  3. worker_connections is raised from 768 to 4096; keep in mind that the maximum number of concurrent connections is determined like this:
    (Max clients = worker_connections * worker_processes)
    For my baseline configuration, that leaves an 8k limit, which I believe is a good starting point to check whether it should be increased or decreased depending on the system load and config.
  4. multi_accept lets each worker process accept as many new connections as possible at once; this setting works together with the worker_connections value we set.
  5. use epoll selects a more efficient connection-processing method available on newer Linux kernels, and it's worth switching to.
  6. tcp_nodelay disables buffering of data sends; if you send frequent small bursts of data, this setting is good to enable.
  7. The timeout settings are decreased from the defaults so long-standing requests get terminated if the application hangs or a script misbehaves.
    Keep in mind that these values should correspond to how long your scripts and application are allowed to run; don't take them at face value.
  8. Added buffer optimization settings to cap request sizes, which also helps mitigate some denial-of-service attacks. Keep in mind that client_max_body_size also determines your upload file size limit.
  9. Enabled gzip compression for static resources, so you don't have to send the full file size whenever serving content.
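After editing nginx.conf, it's worth validating the syntax before reloading so a typo doesn't take the server down; this is a standard check, not specific to this guide:

# test the configuration, then reload the workers without dropping connections
sudo nginx -t && sudo systemctl reload nginx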

Setup Nginx Virtual Hosts

After you finish the main Nginx configuration, next we're going to set up your website's virtual hosts. By default, Nginx ships with one server block configured to serve the directory /var/www/html; by creating new virtual host configurations, you can easily manage and host multiple sites on a single server.

First, create a root web directory and set user permission for your_domain as follows:

sudo mkdir /var/www/your_domain/ 
sudo mkdir /var/www/your_domain/htdocs/
sudo mkdir /var/www/your_domain/logs
sudo chown -R $USER:$USER /var/www/your_domain

Next, create a new file in /etc/nginx/sites-available. You can name it after the domain, for example /etc/nginx/sites-available/your_domain, and fill it with the config below.

server {
        listen 80;
        listen [::]:80;
        server_name your_domain www.your_domain;

        root /var/www/your_domain/htdocs/;
        index index.html index.php;

        location / {
                try_files $uri $uri/ /index.php$is_args$args;
        }

        location ~ \.php$ {
                include snippets/fastcgi-php.conf;
                # add param to handle retrieving domain name within php
                fastcgi_param SERVER_NAME your_domain;
                fastcgi_pass 127.0.0.1:9000;
        }

        # deny access to hidden files such as .htaccess
        location ~ /\. {
                deny all;
        }

        # If file is an asset then enable cache, set expires and break
        location ~* \.(ico|css|js|gif|jpe?g|png)(\?[0-9]+)?$ {
                expires max;
                break;
        }

        access_log /var/www/your_domain/logs/access.log;
        error_log /var/www/your_domain/logs/error.log;
}

Nginx virtual host configuration.

Enable your new configuration.

sudo ln -sf /etc/nginx/sites-available/your_domain /etc/nginx/sites-enabled/your_domain

sudo service nginx restart
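With the symlink in place, you can do a quick smoke test from the server itself; the Host header below assumes the your_domain placeholder used throughout this guide. Expect a 403 or 404 until you add content and configure PHP-FPM, but any HTTP response confirms the virtual host is being picked up:

# request the site locally, overriding the Host header to match server_name
curl -I -H "Host: your_domain" http://127.0.0.1/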

Configure PHP-FPM.

After you finish configuring Nginx, we are going to configure PHP-FPM.
Open php-fpm.conf at /etc/php/7.4/fpm/php-fpm.conf (change 7.4 to 8.0 if you are using PHP 8) and set the following variables:

emergency_restart_threshold = 10
emergency_restart_interval = 1m
process_control_timeout = 10s

Don't forget to remove the leading ";" to uncomment and enable these variables.

These settings tell PHP-FPM that if the configured number of child processes fail within one minute, the master process forces a restart. It's useful for handling memory leaks within a process.

For the /etc/php/7.4/fpm/pool.d/www.conf:

listen = 127.0.0.1:9000
pm = static
pm.max_children = 16
pm.max_requests = 2000

I found that using a TCP socket for the listener, rather than the default Unix socket, gives more flexibility and avoids compatibility issues, for example when I want to use a custom SERVER_NAME value or the built-in FPM status monitoring.

You may also notice I changed the process management from dynamic to static; for the reasoning behind pm static, I refer you to this wonderful article by Hayden James:
https://haydenjames.io/php-fpm-tuning-using-pm-static-max-performance/
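Since pm is now static, it helps to restart PHP-FPM and then check how much memory each worker actually uses before settling on pm.max_children; a rough sketch (the process name assumes PHP 7.4, adjust for 8.0):

# apply the new pool settings
sudo systemctl restart php7.4-fpm

# estimate the average resident memory (in MB) per PHP-FPM worker;
# dividing your available RAM by this number gives a sanity check for pm.max_children
ps --no-headers -o rss -C php-fpm7.4 | awk '{sum+=$1; n++} END {if (n) printf "%.1f MB average over %d processes\n", sum/n/1024, n}'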

Configure php.ini.

You could safely skip this part if you think the default configuration is enough, but there are a few things that you should know when setting up your application.

The default path for the file is /etc/php/7.4/fpm/php.ini

  • display_errors
    This directive controls whether errors are displayed on screen during script execution; in production it should be turned off to avoid exposing unexpected vulnerabilities.
  • error_reporting
    This directive sets the error reporting level, accepting constants such as E_ALL, E_NOTICE, E_STRICT, and E_DEPRECATED.
    For example, set it to E_ALL to report every type of error.
  • file_uploads
    As the name says, this directive enables or disables HTTP file uploads; if your site doesn't need file upload functionality, it's safe to disable.
  • upload_max_filesize
    If you enable file uploads, this directive controls the maximum size of an uploaded file; the default value is 2MB.
  • post_max_size
    This setting controls the maximum size of a POST request; if you enable file uploads, it needs to be higher than the upload_max_filesize value.
  • memory_limit
    This sets the maximum amount of memory a script is allowed to allocate and use. Tune it to what your application actually needs, because if you set it too high, poorly written or buggy scripts could consume all the memory your server has.
  • max_execution_time
    This directive sets the maximum amount of time scripts are allowed to run; the default is 30 seconds. As with memory_limit, tune this value as needed to limit the damage from buggy scripts.
  • max_input_time
    This directive allows you to set the maximum amount of time a script is allowed to parse incoming form data from a GET or POST.
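Remember that the CLI and FPM use separate php.ini files, so edit the FPM one and restart PHP-FPM afterwards. A quick way to review the current values without opening the whole file (the path assumes PHP 7.4):

# show the effective values of the directives discussed above in PHP-FPM's php.ini
grep -E '^(display_errors|error_reporting|file_uploads|upload_max_filesize|post_max_size|memory_limit|max_execution_time|max_input_time)[[:space:]]*=' /etc/php/7.4/fpm/php.ini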

Validate Installation

After thoroughly following all the installation steps, now let's test and validate our installation to make sure everything works okay.

  • Create a PHP info test file at /var/www/your_domain/htdocs/info.php
    with the following content:
<?php
phpinfo();
  • Access your test file by visiting your public website domain or IP address
    http://server_domain_or_ip/info.php
  • Your site display output should be like this:
phpinfo page.
  • If the output looks the same, then Nginx and PHP are successfully installed.
    Now remove the info.php page to prevent unauthorized disclosure.
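If you prefer to do this from the terminal, here's a minimal sketch using the htdocs path created earlier; remove the file as soon as you're done checking:

# create the phpinfo test page
echo '<?php phpinfo();' > /var/www/your_domain/htdocs/info.php

# visit http://server_domain_or_ip/info.php, then clean up
rm /var/www/your_domain/htdocs/info.php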

Conclusion

With that last step, you now know how to set up a LEMP stack with a good baseline configuration for production use. There are always good next steps after installation, like configuring SSL with Let's Encrypt or uploading your own SSL certificate, enabling the PHP opcache, tweaking the MariaDB configuration, and more.

But that's for the next time.
For now, pat yourself on the back and treat yourself to some coffee or tea!

]]>
<![CDATA[Guide to deploying VueJS or any NodeJS project via SSH using Github Actions]]>https://gemawardian.com/deploying-vuejs-via-ssh-using-github-pages/618a3e6be4a3e0181c5d8113Tue, 09 Nov 2021 10:32:00 GMTOne of the things that people working on any JavaScript framework worry about is how to deploy it. There's a lot of options ranging from copying the production build to the servers or using something all-in-one like Cloudflare Pages or AWS Amplify.

But sometimes you want something as simple as a standard rsync to a virtual machine via SSH every time you push to a git repository. Today I want to show you how to build an automated deployment using GitHub Actions and VueJS (or anything else that's similar...)

Setup your secrets.

After you create your repository and push your code, open the Settings tab and go to the Secrets options. From there you can add new repository secrets; create one each for your SSH key, server IP, user, and port. If you need to generate the key pair, see the sketch after this list.

After that you should have 4 secrets; I named them:
1. PRODUCTION_SSH_KEY
2. PRODUCTION_SSH_USER
3. PRODUCTION_SSH_IP
4. PRODUCTION_SSH_PORT
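If you don't have a dedicated deploy key yet, you can generate one and install it on the server; the key path, comment, user, and host below are placeholders:

# generate a passphrase-less key pair for the workflow (keep the private key out of the repo)
ssh-keygen -t ed25519 -f ~/.ssh/github_deploy -N "" -C "github-actions-deploy"

# install the public key on the target server for the deployment user
ssh-copy-id -i ~/.ssh/github_deploy.pub -p 22 your_user@your_server_ip

# print the private key so you can paste it into the PRODUCTION_SSH_KEY secret
cat ~/.ssh/github_deploy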

Create your Actions workflow.

Then go to the Actions tab and click the "Skip this and set up a workflow yourself ->" button. From there, remove the default content GitHub placed and rename the workflow for good measure; I renamed it from main.yml to deploy-production.yml.

Replace the default content with the code below, and change the /home/vuejsapp/ path to the one you choose.

name: Deploy Production

on: 
  push:
    branches: [ main ]

jobs:
  deploy:
    name: Deployment
    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v1
      - uses: actions/setup-node@master
      - run: npm install && npm run build
      - name: Install SSH Key
        uses: shimataro/ssh-key-action@v2
        with:
          key: ${{ secrets.PRODUCTION_SSH_KEY }}
          known_hosts: 'placeholder-value'
      - name: Adding Known Hosts
        run: ssh-keyscan -p ${{ secrets.PRODUCTION_SSH_PORT }} -H ${{ secrets.PRODUCTION_SSH_IP }} >> ~/.ssh/known_hosts
      - name: Deploy with rsync
        run: rsync -avz -e "ssh -p ${{ secrets.PRODUCTION_SSH_PORT }}" ./dist/ ${{ secrets.PRODUCTION_SSH_USER }}@${{ secrets.PRODUCTION_SSH_IP }}:/home/vuejsapp/

Let's break down the code above.

on: 
  push:
    branches: [ main ]

This part means the workflow will run whenever there is a push to the main branch. You can change this as you see fit, for example if you deploy from separate branches for development, staging, and production.

jobs:
  deploy:
    name: Deployment
    runs-on: ubuntu-latest

When GitHub runs this deployment, you can specify which OS image the GitHub Actions runner uses; note that this value is for the Actions runner, not your target server's OS.

    steps:
      - uses: actions/checkout@v1
      - uses: actions/setup-node@master
      - run: npm install && npm run build
      - name: Install SSH Key
        uses: shimataro/ssh-key-action@v2
        with:
          key: ${{ secrets.PRODUCTION_SSH_KEY }}
          known_hosts: 'placeholder-value'
      - name: Adding Known Hosts
        run: ssh-keyscan -p ${{ secrets.PRODUCTION_SSH_PORT }} -H ${{ secrets.PRODUCTION_SSH_IP }} >> ~/.ssh/known_hosts
      - name: Deploy with rsync
        run: rsync -avz -e "ssh -p ${{ secrets.PRODUCTION_SSH_PORT }}" ./dist/ ${{ secrets.PRODUCTION_SSH_USER }}@${{ secrets.PRODUCTION_SSH_IP }}:/home/vuejsapp/

This is the crucial part: building the application code and deploying it to the server. It runs through these steps:

  1. Check out the project onto the GitHub Actions runner.
  2. Set up Node.js; you can specify which version to use by following this example.
  3. Install the dependencies and run the build script.
  4. Set up the SSH key, making sure to add the known_hosts value to avoid a manual confirmation prompt.
  5. Deploy the finished build to the server using rsync; the SSH user, host, and port are taken from the secrets configured earlier. You can sanity-check the transfer with the dry-run sketch below.
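Before pushing, you can verify the SSH access and target path manually from your own machine with a dry-run rsync; the port, key, user, and host below are placeholders matching the secrets:

# -n performs a dry run: nothing is copied, but you can see what would be transferred
rsync -avzn -e "ssh -p 22 -i ~/.ssh/github_deploy" ./dist/ your_user@your_server_ip:/home/vuejsapp/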

After you finish, save and commit the workflow file.

Test and check.

After you commit the action workflow, you should see a green checkmark if your deployment succeeded, a yellow dot if it's still running, and a red cross if it failed.
You can check the deployment details by clicking the icon and selecting Details.

In the end, it should look like this.

That's it!
You've successfully created an automated deployment to your server. I hope this guide helps you start deploying your JavaScript projects and shows how GitHub Actions can help you achieve that.

]]>
<![CDATA[DevOps: Recommended Readings.]]>https://gemawardian.com/devops-recommended-readings/6188f0e621d055982c13807fTue, 09 Nov 2021 08:52:17 GMTI love reading, and if you are just starting out in this field or still figuring out what you want to do with your tech career, I figure this small list could help clear some doubts or maybe inspire you to implement or optimize things in your organization.

The Phoenix Project book cover.

The Phoenix Project - Gene Kim, Kevin Behr, and George Spafford

Yes, this book gets mentioned a lot when people ask for the best novel about working in IT, especially on the management side. It is full of case studies, and sometimes while reading it the stories feel like my own personal diary of my day-to-day work.

Accelerate: Building and Scaling High-Performing Technology Organizations - Nicole Forsgren, Jez Humble, and Gene Kim

Honestly, this book isn't that revolutionary if you have already been working in the field for a while, but if you are just starting out it is a good introduction to DevOps that is short and accessible enough to digest.

The Twelve-Factor App - Adam Wiggins

Whether at a big company or a small startup, this guide to software development is what I use in practice. I consider it a list of best practices for building software for the cloud.

Building Secure & Reliable Systems - Heather Adkins, Betsy Beyer, Paul Blankinship, Piotr Lewandowski, Ana Oprea & Adam Stubblefield

Written by people working at Google, this book is a great reference point for evaluating whether what I build is up to standard from a security point of view. It is full of valuable advice and relevant information; just maybe don't try to read the whole thing in one sitting. It's a long book that could be much shorter, in my opinion 🤣.

That's it!
I hope this list is useful for starting to learn about DevOps. I am sure there are more good book recommendations out there, but these are my picks for starting out.
Good luck and have fun!

]]>