
Take your data home

We like to use services that let us connect with many people and share our ideas and data with them. Technologies with a broad network are more appealing, as they enable us to reach people we would never have addressed otherwise.
However, most of these services run on a freemium basis, and we pay for them by watching advertisements and by tolerating that our information is analyzed, on the one hand to make those ads more 'tailored' and on the other to feed algorithms that search for patterns.

It would be valuable if these providers brought transparency into this and allowed us to reject the projects we don't agree with. Until that happens, hosting contacts, calendars, or anything else in a private cloud might be worth thinking about. You will find links to the scripts at the end of this post.

Hardware

A cloud that is available only occasionally, say from 9 AM to 9 PM, would limit us to specific time zones and therefore can't be seen as an optimal solution. The same holds true for running the service around the clock on a modern PC, which draws far more power than the task needs. So we have to find a good compromise between processing power and energy consumption, with the goal of running everything smoothly. In addition, the system should support containerization, at least if you plan to follow the instructions below.
We think the HC1 from Hardkernel suits that compromise quite well. A bundle with UART support (a serial terminal interface) costs about $80 to $100 without an SSD. The board can be used with both Debian and Arch Linux and comes with an octa-core Exynos 5, 2 GB of LPDDR3 and a SATA 3 port. Hardkernel states that the HC1 draws roughly 1 to 2 A in most cases and recommends a 5V/4A power adapter. Depending on your usage, this results in a yearly charge of $10 to $55 if we assume $0.29 per kWh. It's also worth looking at the hardware review forum of Armbian to find a board that may suit you better in terms of additional requirements.
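The yearly charge can be checked with a few lines of Python. This is only a rough sketch: it assumes a constant load at 5 V for the whole year and ignores losses in the power adapter, and the function name yearly_cost is made up for this example.

```python
HOURS_PER_YEAR = 24 * 365  # the board is assumed to run around the clock


def yearly_cost(volts: float, amps: float, price_per_kwh: float) -> float:
    """Return the yearly electricity cost in dollars for a constant load."""
    watts = volts * amps
    kwh_per_year = watts * HOURS_PER_YEAR / 1000
    return kwh_per_year * price_per_kwh


if __name__ == "__main__":
    # 1 A and 2 A are the typical draw, 4 A is the limit of the adapter.
    for amps in (1, 2, 4):
        print(f"{amps} A at 5 V: ${yearly_cost(5, amps, 0.29):.2f} per year")
```

With these assumptions the result is about $13 for 1 A, $25 for 2 A and $51 for the full 4 A, which is in line with the range given above.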

Access

Next, your cloud should be accessible from everywhere. It's preferable to use a meaningful hostname instead of trying to remember your public, changing IP address, which can be achieved with a dynamic DNS provider. It's less error-prone to stick to one of the dynamic DNS providers that your router can be configured for, as this is most likely the easiest and best-supported way to deal with changing IP addresses and DNS lookup behavior.
The incoming traffic then has to be forwarded to your hardware, so don't forget to activate port forwarding for ports 80 and 443.
It turns out that not every router supports NAT loopback or has it enabled by default, which means that you might not be able to establish a connection using the DNS name if you are connected to the same network. If in doubt, use your mobile phone network to verify that everything is set up properly. 
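If you prefer a script over opening a browser, the check could look roughly like the following sketch. It only tests whether a TCP connection to ports 80 and 443 can be opened and says nothing about certificates or the service behind them. The function name port_is_reachable is made up for this example, and the hostname is whatever dynamic DNS name you pass on the command line.

```python
import socket
import sys


def port_is_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and failed DNS lookups.
        return False


if __name__ == "__main__" and len(sys.argv) > 1:
    hostname = sys.argv[1]  # your own dynamic DNS name
    for port in (80, 443):
        state = "reachable" if port_is_reachable(hostname, port) else "not reachable"
        print(f"{hostname}:{port} is {state}")
```

Run it from a machine outside your home network, for example a laptop tethered to your phone. Without NAT loopback, the same script may report "not reachable" from inside the network even though the setup is fine.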

Security

This gives us a good starting point to get our data back, but one thing is still missing: encryption. You may have already heard about Let's Encrypt. They issue free SSL/TLS certificates, which is far better than creating a self-signed certificate, as you don't get bothered with "do you really want to proceed" browser warnings.
To integrate their service, we have to find software that is capable of speaking the ACME protocol, and this is where Traefik comes into play.
Traefik is an HTTP reverse proxy that monitors the state of Docker containers and routes incoming traffic to the participating, matching containers. In addition, it has a module capable of speaking ACME. However, please take the following caveats into account if you plan to use Traefik with Docker Compose, and read this article if you would like to play around with Traefik.

  • Decide whether you want to run the Compose services as a stack or as local containers. Traefik monitors either swarm services or regular containers, never both, so its mode should match your overall Compose strategy.
  • Add labels to the deploy section when running in swarm mode; otherwise, use the regular labels section.
  • Define the network name explicitly (supported since v3.5 of the Compose file format). If no network is specified, Compose creates one with a _default suffix, and any network that is neither external nor explicitly named gets the project name as a prefix. This is important when it comes to the traefik.docker.network label.
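The naming rule from the last point can be sketched in Python. This is a simplified model of how Compose derives network names, not code taken from Compose or Traefik. The function effective_network_name and the project name "cloud" are hypothetical.

```python
from typing import Optional


def effective_network_name(project: str, key: str = "default",
                           name: Optional[str] = None,
                           external: bool = False) -> str:
    """Return the Docker network name that a Compose network entry ends up with.

    project  -- Compose project name (by default derived from the directory)
    key      -- key of the entry under "networks"; "default" if none is defined
    name     -- explicit "name" attribute, if set
    external -- True if the network is created outside of Compose
    """
    if name is not None:
        return name  # an explicit name is used as-is
    if external:
        return key   # external networks are looked up without a prefix
    return f"{project}_{key}"


if __name__ == "__main__":
    # Value that the traefik.docker.network label would have to carry.
    for kwargs in ({}, {"key": "web"}, {"key": "web", "name": "proxy"}):
        network = effective_network_name("cloud", **kwargs)
        print({"traefik.docker.network": network})
```

The label has to carry the final name, so a network declared as web in a project called cloud must be referenced as cloud_web unless it is named explicitly.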

Code

You will find the complete examples and further instructions on GitHub, including an Apache-based solution and one for Nginx. All of them use Nextcloud and are based on armhf in order to avoid redundant code definitions. Choose the one you prefer by defining the appropriate environment variables or by using one of the other mechanisms. Both should be interchangeable, so please open a ticket if they are not. The difference between Apache and Nginx is the way they handle PHP, which is php-fpm in the case of Nginx. We could have provided the same configuration for Apache, but that would just lead to an additional variant with little benefit over the others. You may also want to have a look at DAVdroid in order to make your data available on a cell phone.

Conclusion

In the end, we pay $100 for the board, another $100 (worst case) for electricity and an additional $100 for a nice SSD, if we take a two-year warranty period into account. So hosting and sharing your own data costs about $0.41 a day ($300 spread over 730 days). No ads, no screening. Once you have everything set up, you can think about adding more services like a VPN or a voice server. One last hint: make a backup from time to time :)
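To adapt the daily figure to your own prices, a small sketch like this one might help. The function name cost_per_day is made up, and a year is simply counted as 365 days.

```python
def cost_per_day(board: float, electricity: float, ssd: float, years: int) -> float:
    """Spread the total cost in dollars evenly over the given number of years."""
    total = board + electricity + ssd
    return total / (years * 365)


if __name__ == "__main__":
    # Numbers from the article: $100 each, over a two-year warranty period.
    print(f"${cost_per_day(100, 100, 100, 2):.2f} per day")
```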

The photograph for this article was taken by Jeremy Bishop

