Shorts are Short and Funny, or How We Prepared the Site for the Habra Effect

by admin

Namba 0: Introduction

One day, while thoughtfully leafing through bashorg, I suddenly realized that most quotes are quite long, and people are very often simply too lazy to read them. In this age of universal haste, reading them takes too much time. Thus a new format was born: Shorts (from the English word "short"). A Short is a brief joke of one or two sentences, strictly limited to 255 characters (just to be sure).
Right around then I wanted to learn programming and was looking for a simple task. I built the site quickly, in a couple of evenings, told my coworkers and a couple of friends on IRC about it, and was about to forget about it when I discovered that within two days the site had been visited by 500 people, almost a third of whom subscribed to the RSS feed. It became clear that people liked the concept. After a little polishing of the site's appearance, I decided to show it to the Habr community. Shorts: short and funny. Please give them a warm welcome.

P.S. This is not exactly a startup (the project is still non-commercial, and "startup" is too strong a word for my handiwork), and the habits I acquired during my years on this site would not let me write a non-technical article. That is why under the cut you will find an entertaining story about how we prepared Shorts for the Habra Effect.

Namba 1: Resources

Any web project, and especially a startup, sooner or later faces a sudden surge of traffic for one reason or another: an article published on a major resource (like Habr), a press release, an advertising campaign, an unexpected mention in the news, and so on. Often (as in the last example) the spike in traffic happens SUDDENLY™.
This leads to an important conclusion. You can, of course, put a resource with 100k visitors a day on an entry-level VPS for 200 rubles. It will be a great source of pride among your geek friends, but it is likely to bring the site down at the most crucial moment. In general, it is good when a production web system runs at no more than 10% of its capacity in normal mode. That lets it withstand a surge in traffic. For all these reasons, even a site as "light" as Shorts is better placed on a server with a decent power reserve.
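The headroom argument can be put into numbers. The sketch below is only an illustration: the function name `surge_headroom` and the sample utilization figures are assumptions made for this example, not measurements from Shorts, and it assumes load grows linearly with traffic.

```python
def surge_headroom(baseline_utilization: float) -> float:
    """Return how many times normal traffic can grow before capacity runs out.

    Simplifying assumption: resource usage scales linearly with traffic.
    """
    if not 0 < baseline_utilization <= 1:
        raise ValueError("baseline_utilization must be in the range (0, 1]")
    return 1.0 / baseline_utilization


if __name__ == "__main__":
    # Compare a lightly loaded server with heavily loaded ones.
    for utilization in (0.10, 0.50, 0.90):
        factor = surge_headroom(utilization)
        print(f"{utilization:.0%} busy in normal mode: room for a {factor:.1f}x surge")
```

A server that is 10% busy on a normal day has room for roughly a tenfold spike; one that is already 90% busy has almost none.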

Namba 2: FrontEnd and Backend

One of the first commandments of web optimization is to separate static and dynamic content. We will do this with a standard solution: Nginx on the frontend, Apache listening on 127.0.0.1. Nginx serves the static content itself, and if it sees that a request is for dynamic content, it passes the request "inside" to Apache:

server {
    listen 10.0.121.124:80;
    server_name shortiki.com www.shortiki.com;
    # Unify the domain
    if ($host = 'www.shortiki.com') {
        rewrite ^/(.*)$ http://shortiki.com/$1 permanent;
    }
    access_log /var/log/vhosts/nginx-shortiki.com-access.log main;
    # Static files are served by nginx; let browsers come back for them once a month. And we don't need logs of image requests.
    location ~ ^.+\.(html|jpg|jpeg|gif|png|ico|css|js)$ {
        root /usr/home/vhosts/shortiki;
        expires 30d;
        access_log off;
    }
    location / {
        proxy_pass http://127.0.0.1:8081;
        # In our case you could do some tricky caching here, but that's a topic for a separate article
    }
}

Namba 3: MPM

Today Apache can run under two main MPMs (Multi-Processing Modules, which define how the web server manages threads and child processes): prefork and worker.
Prefork works like this: many processes, one thread per process, and the processes handle requests. Prefork is stable, but it consumes more memory and is slower than worker.
Worker differs in that it runs many processes with many threads each, and the threads handle requests. It is much faster than its competitor and uses less memory, but requests are no longer fully isolated in separate processes, which can cause problems on sites with sessions, registrations, and other important data. Since prefork is the more popular of the two, we will discuss optimization using it as the example. In my case, however, worker makes more sense for Shorts.

Namba 4: Accept Filters

An accept filter (in this case) is a kernel module that buffers incoming connections and passes a request to the web server only once a complete, valid HTTP request has been received.
In my case the server runs FreeBSD, so we load the module:
# kldload accf_http
Make the module load each time the system boots:
echo 'accf_http_load="YES"' >> /boot/loader.conf
Configure the servers. For Apache:
AcceptFilter http httpready
And for nginx:
listen 10.0.121.124:80 default sndbuf=16k rcvbuf=8k accept_filter=httpready;
Then restart the web servers:
/usr/local/etc/rc.d/nginx reload
/usr/local/etc/rc.d/apache22 restart

Namba 5: Apache tuning

Apache tuning traditionally starts with the most important recommendation: first of all, disable unnecessary modules. Once those are gone, we start tweaking the main settings:
MaxClients is the maximum number of simultaneous connections the server will serve. If MaxClients = 300, then with 301 simultaneous requests the last one is queued until one of the processes frees up to serve it. The main resource limiting MaxClients is RAM: 300 Apache child processes have to fit into memory at the same time. It is common practice to calculate MaxClients from the amount of free memory:
MaxClients = FreeMemory / Apache Process Size
You can see the size of an Apache child process in the RSS column of ps (top reports the same figure as RES).
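As a worked example of the MaxClients formula, here is a minimal sketch. The memory figures are made up for illustration, and the `reserve_mb` parameter is an assumption added here (a safety margin for the OS, MySQL, and memcached); it is not part of the formula in the text.

```python
def max_clients(free_memory_mb: float, process_rss_mb: float, reserve_mb: float = 0.0) -> int:
    """Estimate MaxClients as FreeMemory / Apache process size, rounded down.

    reserve_mb is a hypothetical safety margin subtracted from free memory
    before dividing; set it to 0 to get the plain formula.
    """
    if process_rss_mb <= 0:
        raise ValueError("process_rss_mb must be positive")
    usable_mb = max(free_memory_mb - reserve_mb, 0.0)
    return int(usable_mb // process_rss_mb)


if __name__ == "__main__":
    # Illustrative numbers only: 2 GB free, 20 MB per Apache child process.
    print("plain formula:", max_clients(2048, 20))
    print("with a 512 MB reserve:", max_clients(2048, 20, reserve_mb=512))
```

Rounding down matters: one process too many is enough to push the server into swap under full load.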
Disabling AllowOverride:

<Directory />
AllowOverride none
</Directory>

The thing is, if you leave it on, Apache has to query the file system on every request to check whether a .htaccess file exists.
Turn off ExtendedStatus (it adds 1 or 2 system calls per request):
ExtendedStatus Off
Add FollowSymLinks to the web directory; otherwise Apache checks on every request whether the file or any directory in its path is a symlink:

<Directory />
Options FollowSymLinks
</Directory>

Reduce the timeout:
Timeout 10
Add some data compression (saves a lot of traffic):

<Location />
AddOutputFilterByType DEFLATE text/html text/plain text/css text/xml application/x-javascript
</Location>

MinSpareServers and MaxSpareServers specify how many child processes to keep idle. For example, if four processes are currently serving requests and MinSpareServers is 2, Apache will start two more processes that sit idle waiting for requests. The point is that creating a new process is relatively expensive, so these settings exist mainly to keep the server from constantly creating and killing processes.
MinSpareServers 2
MaxSpareServers 8

With these settings Apache always keeps at least 2 idle processes waiting for requests, and if more than 8 processes are idle it kills the excess to free resources.
Ahead of the Habra Effect you can raise both values:
MinSpareServers 8
MaxSpareServers 32
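The spare-server rule can be written as a small decision function. This is a deliberately simplified model: the name `adjust_pool` is hypothetical, and real Apache ramps the pool up gradually and caps it at MaxClients instead of correcting it in one step.

```python
def adjust_pool(idle: int, min_spare: int, max_spare: int) -> int:
    """Return the change in process count needed to satisfy the spare limits.

    Positive result: spawn that many processes.
    Negative result: kill that many idle processes.
    Zero: the pool is within limits, leave it alone.
    """
    if min_spare > max_spare:
        raise ValueError("min_spare must not exceed max_spare")
    if idle < min_spare:
        return min_spare - idle   # too few idle workers: spawn the difference
    if idle > max_spare:
        return max_spare - idle   # too many idle workers: kill the excess
    return 0


if __name__ == "__main__":
    # Walk a few idle counts through the pre-surge settings (8 and 32).
    for idle in (0, 8, 20, 40):
        print(f"idle={idle}: change={adjust_pool(idle, 8, 32)}")
```

The wider the gap between the two limits, the less often the pool changes size, which is exactly the churn these settings are meant to avoid.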

StartServers is the number of child processes Apache creates on startup. If StartServers is less than MinSpareServers, Apache will spawn processes until it reaches MinSpareServers. The right value depends on the site's initial load and on the available resources; in our case:
StartServers 8
MaxRequestsPerChild specifies how many requests a process handles before it is killed and replaced by a new one. This limits the effect of memory leaks. If your code is well written and you have no such problems, you can safely increase this value, but setting it higher than 10000 is not recommended: killing a process once every 10000 requests costs little and is good prevention.
MaxRequestsPerChild 3000
KeepAlive lets a client send several requests over one TCP connection instead of opening a new connection for each request. It is important to keep KeepAliveTimeout low: with a large number of requests, many processes would otherwise spend their time waiting on open connections, and Apache would have to start more processes.
KeepAlive On
KeepAliveTimeout 5
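To see why a low KeepAliveTimeout matters, here is a back-of-the-envelope estimate based on Little's law (occupied workers = arrival rate × time each connection holds a worker). The function name and the traffic figures are assumptions for illustration, and the model takes the worst case, in which every connection stays idle for the full timeout after its last request.

```python
def busy_processes(new_connections_per_sec: float,
                   request_time_sec: float,
                   keepalive_timeout_sec: float) -> float:
    """Estimate how many Apache processes are occupied at steady state.

    Little's law: L = lambda * W, where W is the time one connection
    holds a process (request handling plus idle keep-alive wait).
    """
    if min(new_connections_per_sec, request_time_sec, keepalive_timeout_sec) < 0:
        raise ValueError("all arguments must be non-negative")
    hold_time_sec = request_time_sec + keepalive_timeout_sec
    return new_connections_per_sec * hold_time_sec


if __name__ == "__main__":
    # Illustrative load: 50 new connections per second, 50 ms per request.
    for timeout in (0, 5, 15):
        estimate = busy_processes(50, 0.05, timeout)
        print(f"KeepAliveTimeout {timeout}: about {estimate:.1f} busy processes")
```

Under these assumptions the timeout, not the actual request handling, dominates the number of occupied processes, so it should be compared against MaxClients when choosing a value.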

Namba 6: PHP tuning

In our particular case, PHP tuning comes down to installing the PHP memcache module and the memcached daemon itself, since the other parameters will not noticeably affect performance. On FreeBSD this is simple:
cd /usr/ports/databases/pecl-memcache
make install clean
cd /usr/ports/databases/memcached
make install clean

Namba 7: Adding Memkesh

Almost all users who come to the site will arrive via the home page, so it is the most important page to optimize. The home page runs a single SQL query, which fetches the 20 most recent shorts from the database:

SELECT sid, sdate, stext, srating FROM quotes ORDER BY id ASC LIMIT $shortik_first, $shortiks_main

There is no point in hitting the database every time the page is opened, so with a light touch this query turns into:

// Establish a connection to Memcached
$mem = new Memcache();
$mem->connect('localhost', 11211);
// Try the cache first; get() returns false when the key is missing
$quotesonpage = $mem->get('quotes');
if ($quotesonpage === false) {
    // Cache miss: take the data from MySQL...
    $connect = @mysql_connect($server, $user, $pass) or die('Could not connect: ' . mysql_error());
    @mysql_select_db("ShoDB");
    @mysql_set_charset('utf8', $connect);
    $query = "SELECT sid, sdate, stext, srating FROM quotes ORDER BY id ASC LIMIT $shortik_first, $shortiks_main";
    $get_smain = mysql_query($query) or die('Cannot execute query: ' . mysql_error());
    $quotesonpage = array();
    while ($shortik = mysql_fetch_assoc($get_smain)) {
        $quotesonpage[] = $shortik;
    }
    $quotesonpage = array_reverse($quotesonpage);
    // ...and add it to memcache with a lifetime of half an hour (1800 seconds).
    $mem->set('quotes', $quotesonpage, MEMCACHE_COMPRESSED, 1800);
}

Thus, with any number of users, the load of the main page on MySQL is equal to one query per half hour. Of course, fresh shorts will appear on the site with a delay of half an hour, but that’s okay.
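The pattern used here (check the cache, fall back to the database on a miss, store the result with a lifetime) is commonly called cache-aside. The sketch below shows it in isolation. `TTLCache` is a hypothetical in-memory stand-in for memcached, and `get_or_load` and the loader callback are names invented for this example; none of this is the site's actual code.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TTLCache:
    """Tiny in-memory stand-in for memcached (illustration only)."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._store: Dict[str, Tuple[float, Any]] = {}

    def get(self, key: str) -> Any:
        """Return the cached value, or None if it is missing or expired."""
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._store[key]  # expired: drop it and report a miss
            return None
        return value

    def set(self, key: str, value: Any, ttl: float) -> None:
        """Store a value that expires ttl seconds from now."""
        self._store[key] = (self._clock() + ttl, value)


def get_or_load(cache: TTLCache, key: str, loader: Callable[[], Any], ttl: float) -> Any:
    """Cache-aside: return the cached value, calling loader() only on a miss.

    A value of None is treated as a miss, so loaders should not return None.
    """
    value = cache.get(key)
    if value is None:
        value = loader()          # the expensive step, e.g. the SQL query
        cache.set(key, value, ttl)
    return value


if __name__ == "__main__":
    cache = TTLCache()
    fetch = lambda: ["first short", "second short"]  # stands in for the database
    print(get_or_load(cache, "quotes", fetch, ttl=1800))
    print(get_or_load(cache, "quotes", fetch, ttl=1800))  # served from the cache
```

However many visitors arrive within the lifetime, the loader runs once; the price is that new data can be up to one lifetime stale.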

Namba 8: Welcome 🙂

Welcome to Shorts! I hope the site will be received as warmly as it was by the small circle of people who visited in the first days of its life.
As a reminder, the original purpose of this project was to sharpen my programming skills, so don't shoot the pianist (that is, the programmer); he writes as best he can 🙂